
The Intraclass Correlation: What is it and why do we care?
David J. Pasta, Technology Assessment Group, San Francisco, CA

Abstract

The intraclass correlation coefficient (ICC) is a measure of reproducibility that should be used in many common situations where the ordinary (Pearson product-moment) correlation coefficient is used inappropriately. The ICC is defined and its history and motivation described. Alternative methods of calculating the ICC are presented in detail. The ICC is related to other measures of agreement, such as the concordance correlation coefficient.

Introduction

I like to think that the intraclass correlation came about this way. The famous statistician and geneticist R. A. Fisher was calculating the correlation coefficient for related subjects, such as siblings. This was easy to do as long as he had two siblings from each family, and there was a natural order (for example, you could always put the older sibling first). The simple Pearson product-moment correlation coefficient, the one we are all familiar with, would do nicely. The "x" variable would be "older sibling" and the "y" variable would be "younger sibling" and you could calculate the correlation between the siblings on whatever outcome measure you were studying.

But what about twins? Well, the statistician R. A. Fisher could tell you that twins occur only about 1 in 90 births and so it won't make much difference which order you list the two twins. But the geneticist R. A. Fisher knew that studies comparing identical (monozygotic) twins and fraternal (dizygotic) twins were very important in the study of heritability. What if you had a whole study of twins? One approach would be to correlate using the grand mean and grand standard deviation.

The general (but balanced) situation for which the intraclass correlation coefficient is appropriate is given in (1). There are k members of each of n "classes" of subjects. For the twins, k=2 (two twins) in each of n families. You can calculate the grand mean and grand standard deviation and use those to calculate a correlation coefficient as shown in (2).
Sample Data

(1) $x_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, k$: n "classes", k members in each.

(2) For k=2:

$$\bar{x} = \frac{1}{2n}\sum_{i=1}^{n}(x_{i1} + x_{i2}), \qquad s^2 = \frac{1}{2n}\sum_{i=1}^{n}\left[(x_{i1}-\bar{x})^2 + (x_{i2}-\bar{x})^2\right], \qquad r = \frac{1}{n s^2}\sum_{i=1}^{n}(x_{i1}-\bar{x})(x_{i2}-\bar{x})$$

Note that a divisor of n rather than n-1 is used both for calculating the variance and for calculating the correlation coefficient. This is in keeping with the practices of the time.

Alternatively, an approach would be to enter the data for each pair twice, with each member of the pair switching positions. That gives the same answer (using n instead of n-1 helps here). What if there are more than two members of each class? For k=3, for example, you could enter the data for the 6 possible ordered pairs. In general, you would need to enter k(k-1) pairs for each class. Harris introduced the computational method (3) for calculating a coefficient r.

(3)

$$\bar{x} = \frac{1}{kn}\sum_{i=1}^{n}\sum_{j=1}^{k} x_{ij}, \qquad s^2 = \frac{1}{kn}\sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij}-\bar{x})^2, \qquad k\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2 = n s^2\,[1+(k-1)r]$$

Note that this definition of r, which has a (nonnegative) sum of squares on the left hand side, implies that 1+(k-1)r is nonnegative, which in turn implies r is at least -1/(k-1). That is, unlike an ordinary (Pearson) correlation coefficient, the intraclass correlation coefficient (for that is what we have just defined) cannot be "very" negative. When k=2, the "limitation" is that r is at least -1, but for k=3 r must be at least -1/2 and for k=4 r must be at least -1/3 and so on. In practice, though, intraclass correlation coefficients are expected to be positive and substantial.
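The equivalence between entering every ordered within-class pair and Harris's computational method in (3) can be checked numerically. The following is a minimal sketch, assuming balanced data held as a list of equal-length lists; the function names `icc_double_entry` and `icc_harris` are illustrative and do not come from the paper.

```python
from itertools import permutations


def icc_double_entry(classes):
    """Correlate all k(k-1) ordered pairs per class, using divisor N (not N-1)."""
    pairs = [(a, b) for members in classes for a, b in permutations(members, 2)]
    count = len(pairs)
    grand_mean = sum(a for a, _ in pairs) / count
    variance = sum((a - grand_mean) ** 2 for a, _ in pairs) / count
    covariance = sum((a - grand_mean) * (b - grand_mean) for a, b in pairs) / count
    return covariance / variance


def icc_harris(classes):
    """Solve k * sum((class mean - grand mean)^2) = n * s^2 * [1 + (k-1) r] for r."""
    n, k = len(classes), len(classes[0])
    values = [x for members in classes for x in members]
    grand_mean = sum(values) / (n * k)
    s2 = sum((x - grand_mean) ** 2 for x in values) / (n * k)
    between = k * sum((sum(members) / k - grand_mean) ** 2 for members in classes)
    return (between / (n * s2) - 1) / (k - 1)


if __name__ == "__main__":
    twins = [[1.0, 2.0], [3.0, 5.0], [6.0, 6.0], [8.0, 7.0]]
    print(icc_double_entry(twins), icc_harris(twins))
```

Both functions use the grand mean and the divisor-N variance, which is why they agree; with an N-1 divisor the two routes would differ slightly.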

Note also that an intraclass correlation coefficient of 1 implies that

(4)

$$k\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2 = n s^2 (k), \qquad \text{i.e.} \qquad s^2 = \frac{1}{n}\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2,$$

which occurs if and only if all the individuals in a class take on exactly the same value.

It is worth noting that when the true (population) value of the intraclass correlation coefficient is near 0, the accuracy of an estimate is about that of k(k-1)/2 observations for an ordinary correlation. When the value is near 1, the accuracy of an estimate is like that for n pairs, and for a value near .5 the accuracy is no better than 9/2 pairs.

Now it is possible to write this down as an analysis of variance. The intraclass correlation coefficient predates Fisher's invention of analysis of variance, however, so we are using a modern approach to better understand the underpinnings of the intraclass correlation coefficient. The Basic ANOVA for ICC table is in Figure 1 below.

Suppose the total is made up of two independent normally distributed components, one common to the class with variance $\sigma_B^2$ and the other unique to the individual with variance $\sigma_E^2$. Then the variance of individual scores is $\sigma^2 = \sigma_B^2 + \sigma_E^2$. Further, the covariance between two members of a class is

$$\mathrm{Cov}(x_{ij}, x_{ij'}) = \mathrm{Cov}(B_i + E_{ij},\, B_i + E_{ij'}) = \mathrm{Cov}(B_i, B_i) = \sigma_B^2.$$

Thus the correlation within classes is

$$\rho = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_E^2}.$$

$$
\begin{array}{lllll}
\text{Source} & \text{df} & \text{Sum of squares} & \text{Equated to} & \text{Expected value} \\
\text{Between} & n-1 & k\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2 & n s^2[1+(k-1)r] & (n-1)(k\sigma_B^2+\sigma_E^2) \\
\text{Within} & n(k-1) & \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij}-\bar{x}_{i\cdot})^2 & n s^2 (k-1)(1-r) & n(k-1)\sigma_E^2 \\
\text{Total} & kn-1 & \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij}-\bar{x})^2 & n s^2 k &
\end{array}
$$

Figure 1. Basic ANOVA for ICC
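As a worked step, the "Within" entry of Figure 1 can be obtained by subtracting the "Between" line from the "Total" line. This sketch assumes, as in (3), that $s^2$ is the grand variance with divisor $kn$ and that the between-class sum of squares equals $n s^2[1+(k-1)r]$.

```latex
\begin{aligned}
\sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij}-\bar{x}_{i\cdot})^2
  &= \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij}-\bar{x})^2
     - k\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2 \\
  &= n k s^2 - n s^2\,[1+(k-1)r] \\
  &= n s^2\,(k-1)(1-r).
\end{aligned}
```

The first line is the usual partition of the total sum of squares into within-class and between-class parts; the cross term vanishes because deviations from a class mean sum to zero within each class.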

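The class means have a part with variance $\sigma_B^2$ and a part that is the mean of k values each with variance $\sigma_E^2$, so in expectation

$$k\sum_{i=1}^{n}(\bar{x}_{i\cdot}-\bar{x})^2 = k(n-1)\left(\sigma_B^2 + \frac{\sigma_E^2}{k}\right) = (n-1)(k\sigma_B^2+\sigma_E^2) = (n-1)(\sigma_B^2+\sigma_E^2)(k\rho+1-\rho) = (n-1)\,\sigma^2\,(1+(k-1)\rho).$$

Within class we get

$$n(k-1)\,\sigma_E^2 = n(k-1)\,\sigma^2\,(1-\rho).$$

Bartko (1966) gave a derivation of the ICC for k scores for n persons. It is based on a one-way random effects analysis of variance.

$$x_{ij} = \mu + P_i + e_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, k$$

$$P_i \sim N(0, \sigma_P^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \qquad \text{so} \qquad x_{ij} \sim N(\mu,\ \sigma_P^2+\sigma_e^2)$$

$$
\begin{array}{llll}
\text{Source} & \text{df} & \text{MS} & E(\text{MS}) \\
\text{(Between) Persons} & n-1 & \text{MSP} & k\sigma_P^2+\sigma_e^2 \\
\text{(Within) Error} & n(k-1) & \text{MSE} & \sigma_e^2 \\
\text{Total} & kn-1 & &
\end{array}
$$

Figure 2. One-Way Random Effects ANOVA

Now

$$\mathrm{Cov}(x_{ij}, x_{ij'}) = E\,(x_{ij}-\mu)(x_{ij'}-\mu) = E\,(P_i+e_{ij})(P_i+e_{ij'}) = \sigma_P^2.$$

So

$$\rho = \frac{\sigma_P^2}{\sigma_P^2+\sigma_e^2}$$

and

$$r = \frac{(\text{MSP}-\text{MSE})/k}{(\text{MSP}-\text{MSE})/k + \text{MSE}} = \frac{\text{MSP}-\text{MSE}}{\text{MSP}+(k-1)\,\text{MSE}}$$

is an estimate. Note that F = MSP / MSE tests $\sigma_P^2 = 0 \iff \rho = 0$.

The ANOVA controlling for multiple raters is somewhat complicated, but instructive. There are two formulations, depending on whether raters are random or fixed effects.

TWO-WAY RANDOM EFFECTS (OR TWO-WAY MIXED MODEL)

$$x_{ij} = \mu + P_i + r_j + (pr)_{ij} + e_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, k$$

$$P_i \sim N(0, \sigma_P^2), \qquad r_j \sim N(0, \sigma_r^2)\ \text{(or fixed values)}, \qquad (pr)_{ij} \sim N(0, \sigma_q^2), \qquad e_{ij} \sim N(0, \sigma_e^2)$$

The ANOVA for this is shown in Figure 3. Now

$$\rho = \frac{\mathrm{Cov}(x_{ij}, x_{ij'})}{\sqrt{\mathrm{var}(x_{ij})\,\mathrm{var}(x_{ij'})}} = \frac{\sigma_P^2}{\sigma_P^2+\sigma_r^2+\sigma_q^2+\sigma_e^2} \qquad (\text{or, when the raters are fixed, } [\ldots])$$

and

$$r = \frac{(\text{MSP}-\text{MSE})/k}{(\text{MSP}-\text{MSE})/k + (\text{MSR}-\text{MSE})/n + \text{MSE}} = \frac{\text{MSP}-\text{MSE}}{\text{MSP}+(k-1)\,\text{MSE}+k(\text{MSR}-\text{MSE})/n}$$

$$\left(\text{or}\quad \frac{\text{MSP}-\text{MSE}+\sigma_q^2}{\text{MSP}+(k-1)\,\text{MSE}+\sigma_q^2}\right)$$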
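The estimators above can be computed directly from an n-by-k table of scores. The sketch below assumes `table[i][j]` holds the score for person i from rater j with no missing cells; the function names are illustrative, and `sigma_q2` in `icc_fixed_raters` is a value the analyst must supply, since the data alone do not separate interaction from error.

```python
def mean_squares(table):
    """Return MSP, MSR and the one-way and two-way error mean squares."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_persons = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(table, row_means) for x in row)
    return {
        "n": n,
        "k": k,
        "msp": ss_persons / (n - 1),
        "msr": ss_raters / (k - 1),
        # One-way model: everything within persons counts as error.
        "mse_one_way": ss_within / (n * (k - 1)),
        # Two-way model: rater differences are removed from the error term.
        "mse_two_way": (ss_within - ss_raters) / ((n - 1) * (k - 1)),
    }


def icc_one_way(table):
    """r = (MSP - MSE) / (MSP + (k-1) MSE)."""
    ms = mean_squares(table)
    msp, mse, k = ms["msp"], ms["mse_one_way"], ms["k"]
    return (msp - mse) / (msp + (k - 1) * mse)


def icc_two_way_random(table):
    """r = (MSP - MSE) / (MSP + (k-1) MSE + k (MSR - MSE) / n)."""
    ms = mean_squares(table)
    msp, msr, mse = ms["msp"], ms["msr"], ms["mse_two_way"]
    n, k = ms["n"], ms["k"]
    return (msp - mse) / (msp + (k - 1) * mse + k * (msr - mse) / n)


def icc_fixed_raters(msp, mse, k, sigma_q2=0.0):
    """Fixed-rater version with an assumed interaction variance sigma_q2."""
    return (msp - mse + sigma_q2) / (msp + (k - 1) * mse + sigma_q2)


if __name__ == "__main__":
    shifted = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]  # rater 2 scores one point higher
    print(icc_one_way(shifted), icc_two_way_random(shifted))
```

In the example, the second rater is consistently one point higher, so the two-way error mean square is zero while the one-way version treats the rater shift as error.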

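$$
\begin{array}{lllll}
\text{Source} & \text{df} & \text{MS} & E(\text{MS})\ \text{Random Effects} & E(\text{MS})\ \text{Mixed Model} \\
\text{Persons} & n-1 & \text{MSP} & \sigma_e^2+\sigma_q^2+k\sigma_P^2 & \sigma_e^2+\sigma_q^2+k\sigma_P^2 \\
\text{Raters} & k-1 & \text{MSR} & \sigma_e^2+\sigma_q^2+n\sigma_r^2 & \sigma_e^2+\sigma_q^2+\dfrac{n\sum_j r_j^2}{k-1} \\
\text{Error} & (n-1)(k-1) & \text{MSE} & \sigma_e^2+\sigma_q^2 & \sigma_e^2+\sigma_q^2 \\
\text{Total} & kn-1 & & &
\end{array}
$$

Figure 3. Two-Way Random Effects or Mixed Model ANOVA

The last term in the denominator of the random effects version, k(MSR - MSE)/n, is the $\sigma_r^2$ part, which is often ignored. Note it is easy to test $\sigma_P^2 = 0$ and $\sigma_r^2 = 0$ for the random effects model.

Now this gives two alternative definitions/estimates of ICC, depending on whether you assume raters are a random effect or a fixed effect. The random effect version may be better if the raters are considered typical of the type of raters to be used in the study. The fixed effect version may be better if the raters studied are the actual raters used in the study. Then it is necessary to have an estimate of $\sigma_q^2$, the interaction between persons and raters. One can simply assume $\sigma_q^2$ is zero, which gives a conservative estimate of the ICC (essentially you are assuming all of the MSE is error, $\sigma_e^2$). Or you can assume all of the MSE is interaction $\sigma_q^2$, which means $\sigma_e^2$ is zero. This gives a higher estimate for the ICC. In other words, for fixed effects $0 \le \sigma_q^2 \le \text{MSE}$, so

$$\frac{\text{MSP}-\text{MSE}}{\text{MSP}+(k-1)\,\text{MSE}} \;\le\; r \;\le\; \frac{\text{MSP}}{\text{MSP}+k\,\text{MSE}}.$$

Reliability for k=2

Deyo, Diehr, Patrick (1991) provide a very accessible explanation of reliability and its estimation:

$$x_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, 2$$

$$\bar{x}_{\cdot j} = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \qquad s_j^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_{ij}-\bar{x}_{\cdot j})^2$$

$$\bar{x}_c = \bar{x}_{\cdot 1} - \bar{x}_{\cdot 2} \quad \text{(backwards)}$$

$$
\begin{aligned}
s_c^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\left((x_{i1}-x_{i2})-\bar{x}_c\right)^2 \\
      &= \frac{1}{n-1}\sum_{i=1}^{n}\left((x_{i1}-\bar{x}_{\cdot 1})-(x_{i2}-\bar{x}_{\cdot 2})\right)^2 \\
      &= \frac{1}{n-1}\sum_{i=1}^{n}(x_{i1}-\bar{x}_{\cdot 1})^2
       - \frac{2}{n-1}\sum_{i=1}^{n}(x_{i1}-\bar{x}_{\cdot 1})(x_{i2}-\bar{x}_{\cdot 2})
       + \frac{1}{n-1}\sum_{i=1}^{n}(x_{i2}-\bar{x}_{\cdot 2})^2
\end{aligned}
$$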
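The expansion of the change-score variance into the two rating variances and their covariance can be verified on any paired data set. This is a minimal sketch using only the standard library; `change_score_summary` and the key names are hypothetical, and the covariance is computed by hand with the same n-1 divisor as the variances.

```python
from statistics import mean, variance


def change_score_summary(x1, x2):
    """Summaries of two ratings per person, all with divisor n-1."""
    n = len(x1)
    m1, m2 = mean(x1), mean(x2)
    # Sample covariance between the first and second ratings.
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    diffs = [a - b for a, b in zip(x1, x2)]  # first minus second rating
    return {
        "mean_change": mean(diffs),
        "var_change": variance(diffs),
        "s1_sq": variance(x1),
        "s2_sq": variance(x2),
        "s12": s12,
    }


if __name__ == "__main__":
    summary = change_score_summary([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 6.0])
    expanded = summary["s1_sq"] + summary["s2_sq"] - 2 * summary["s12"]
    print(summary["var_change"], expanded)
```

The two printed numbers should agree up to rounding, which is the identity used to rewrite the covariance in terms of quantities that a paired t test already reports.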

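$$
\begin{array}{lll}
\text{Source} & \text{df} & \text{Sum of squares} \\
\text{Persons} & n-1 & \\
\text{(Occasion) Raters} & 1 & \\
\text{(Residual) Error} & n-1 & \\
\text{Total} & 2n-1 & (n-1)(s_1^2+s_2^2) + \dfrac{n}{2}\,\bar{x}_c^2
\end{array}
$$

Figure 4. Two-Way Random Effects ANOVA for k=2

The ANOVA for this is shown in Figure 4 above.

$$\text{paired } t = \frac{\bar{x}_c}{s_c/\sqrt{n}}$$

Now note

$$\bar{x}_{\cdot 1} - \bar{x}_{\cdot\cdot} = \bar{x}_{\cdot 1} - \frac{\bar{x}_{\cdot 1}+\bar{x}_{\cdot 2}}{2} = \frac{\bar{x}_{\cdot 1}-\bar{x}_{\cdot 2}}{2}$$

and similarly for $\bar{x}_{\cdot 2} - \bar{x}_{\cdot\cdot}$, so

$$\sum_{j=1}^{2}(\bar{x}_{\cdot j}-\bar{x}_{\cdot\cdot})^2 = 2\left(\frac{\bar{x}_c}{2}\right)^2 = \frac{\bar{x}_c^2}{2}.$$

A related coefficient is the Concordance Correlation Coefficient (CCC), which measures agreement with the 45° line:

$$\text{CCC} = [\ldots]$$

And for reference, the formula for the ordinary correlation in this notation is:

$$\text{Pearson } r = \frac{s_{12}}{s_1 s_2} = \frac{s_1^2+s_2^2-s_c^2}{2\,s_1 s_2}$$

Summary

$$\text{ICC} = \frac{\text{MSP}-\text{MSE}}{\text{MSP}+\text{MSE}+2(\text{MSR}-\text{MSE})/n}$$

Notice how similar the ICC and CCC are; the ICC is never larger than the CCC and is only equal when all subjects have exactly the same change, so that $s_c^2$ is zero.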
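For two ratings, the mean squares in the summary ICC formula can be written in terms of the quantities a paired t test already produces. The expressions MSP = s1² + s2² - s_c²/2, MSE = s_c²/2 and MSR = n·x̄_c²/2 used below are derived here from the two-way layout with k=2 and are not quoted from the paper, so treat this as a sketch under that assumption; `icc_two_ratings` is a hypothetical name.

```python
from statistics import mean, variance


def icc_two_ratings(x1, x2):
    """Two-way random effects ICC for k=2 from paired summary statistics."""
    n = len(x1)
    s1_sq, s2_sq = variance(x1), variance(x2)  # divisor n-1
    diffs = [a - b for a, b in zip(x1, x2)]
    mean_change, sc_sq = mean(diffs), variance(diffs)
    # Mean squares expressed through the summary statistics (derived, see lead-in).
    msp = s1_sq + s2_sq - sc_sq / 2
    mse = sc_sq / 2
    msr = n * mean_change ** 2 / 2
    return (msp - mse) / (msp + mse + 2 * (msr - mse) / n)


if __name__ == "__main__":
    first = [1.0, 2.0, 3.0, 4.0]
    second = [2.0, 2.0, 4.0, 6.0]
    print(icc_two_ratings(first, second))
```

A systematic shift between the two ratings enters only through MSR, so two raters who differ by a constant still lose some agreement, unlike the ordinary Pearson correlation, which ignores the shift.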

Conclusion

The intraclass correlation coefficient measures reproducibility. It is especially appropriate when all the ratings are equally valid and there is no particular ordering to the ratings. In the presence of a natural ordering of the ratings (e.g. test-retest, several independent judges), the two-way ANOVA version of the ICC should be used. (The literature includes errors in this respect.) If raters are not to be treated as a random effect, a separate estimate or assumptions about person-rater interactions are needed. For two ratings (including the test-retest situation), the ICC is not hard to calculate from the means and standard deviations of $x_{i1}$, $x_{i2}$, and $(x_{i1} - x_{i2})$, which are naturally calculated in the context of a paired t test. The ICC is closely related to the Concordance Correlation Coefficient, CCC. Ordinary test-retest correlations, even with supplementary t-tests for systematic mean differences, are not good measures of reproducibility and should be avoided.

References: General

Kramer MS, Feinstein AR (1981), Clinical biostatistics: LIV. The biostatistics of concordance, Clin Pharmacol Ther, 29:111-123.
Deyo RA, Diehr P, Patrick DL (1991), Reproducibility and responsiveness of health status measures: statistics and strategies for evaluation, Controlled Clinical Trials, 12:142S-158S.

References: Intraclass Correlation

Fisher RA (1944), Statistical Methods for Research Workers, 9th ed., Edinburgh: Oliver and Boyd.
Bartko JJ (1966), The intraclass correlation coefficient as a measure of reliability, Psychol Rep, 19:3-11.

Author

David Pasta
Technology Assessment Group
409 Second Street, Suite 01
San Francisco, CA 94107
(415) 495-8966 x18
(415) 495-8969 FAX