The Intraclass Correlation: What is it and why do we care?

David J. Pasta, Technology Assessment Group, San Francisco, CA

Abstract

The intraclass correlation coefficient (ICC) is a measure of reproducibility that should be used in many common situations where the ordinary (Pearson product-moment) correlation coefficient is used inappropriately. The ICC is defined and its history and motivation described. Alternative methods of calculating the ICC are presented in detail. The ICC is related to other measures of agreement, such as the concordance correlation coefficient.

Introduction

I like to think that the intraclass correlation came about this way. The famous statistician and geneticist R. A. Fisher was calculating the correlation coefficient for related subjects, such as siblings. This was easy to do as long as he had two siblings from each family, and there was a natural order (for example, you could always put the older sibling first). The simple Pearson product-moment correlation coefficient, the one we are all familiar with, would do nicely. The "x" variable would be "older sibling" and the "y" variable would be "younger sibling" and you could calculate the correlation between the siblings on whatever outcome measure you were studying.

But what about twins? Well, the statistician R. A. Fisher could tell you that twins occur only about 1 in 90 births and so it won't make much difference which order you list the two twins. But the geneticist R. A. Fisher knew that studies comparing identical (monozygotic) twins and fraternal (dizygotic) twins were very important in the study of heritability. What if you had a whole study of twins? One approach would be to correlate using the grand mean and grand standard deviation.

The general (but balanced) situation for which the intraclass correlation coefficient is appropriate is given in (1). There are k members of each of n "classes" of subjects. For the twins, k=2 (two twins) in each of n families. You can calculate the grand mean and grand standard deviation and use those to calculate a correlation coefficient as shown in (2).
Sample data: n "classes", k members in each.

$$x_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, k \tag{1}$$

For k=2:

$$\bar{x} = \frac{1}{2n} \sum_{i=1}^{n} (x_{i1} + x_{i2}), \qquad
s^2 = \frac{1}{2n} \sum_{i=1}^{n} \left[ (x_{i1} - \bar{x})^2 + (x_{i2} - \bar{x})^2 \right], \qquad
r = \frac{1}{n s^2} \sum_{i=1}^{n} (x_{i1} - \bar{x})(x_{i2} - \bar{x}) \tag{2}$$

Note that a divisor of n rather than n-1 is used both for calculating the variance and for calculating the correlation coefficient. This is in keeping with the practices of the time.

Alternatively, an approach would be to enter the data for each pair twice, with each member of the pair switching positions. That gives the same answer (using n instead of n-1 helps here). What if there are more than two members of each class? For k=3, for example, you could enter the data for the 6 possible ordered pairs. In general, you would need to enter k(k-1) pairs for each class. Harris introduced the computational method (3) for calculating a coefficient r.

$$\bar{x} = \frac{1}{nk} \sum_{i=1}^{n} \sum_{j=1}^{k} x_{ij}, \qquad
s^2 = \frac{1}{nk} \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x})^2, \qquad
k \sum_{i=1}^{n} (\bar{x}_{i\cdot} - \bar{x})^2 = n s^2 \left[ 1 + (k-1) r \right] \tag{3}$$

Here $\bar{x}_{i\cdot}$ is the mean of the k members of class i. Note that this definition of r, which has a (nonnegative) sum of squares on the left hand side, implies that 1+(k-1)r is nonnegative, which in turn implies r is at least -1/(k-1). That is, unlike an ordinary (Pearson) correlation coefficient, the intraclass correlation coefficient (for that is what we have just defined) cannot be "very" negative. When k=2, the "limitation" is that r is at least -1, but for k=3 r must be at least -1/2 and for k=4 r must be at least -1/3 and so on. In practice, though, intraclass correlation coefficients are expected to be positive and substantial.
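The agreement between entering every ordered pair and the Harris computation can be checked numerically. The following is a minimal Python sketch and is not part of the original paper; the function names `icc_double_entry` and `icc_harris` and the use of NumPy are assumptions made for illustration. Each function takes an n-by-k array with one row per class. The first computes the ordinary Pearson correlation over all k(k-1) ordered within-class pairs; the second solves k Σ (x̄_i. - x̄)² = n s² [1 + (k-1) r] for r, with s² computed using the divisor nk.

```python
from itertools import permutations

import numpy as np


def icc_double_entry(x):
    """Pearson correlation over all k(k-1) ordered pairs within each class."""
    x = np.asarray(x, dtype=float)
    first, second = [], []
    for row in x:
        # permutations works by position, so tied values are handled correctly
        for a, b in permutations(row, 2):
            first.append(a)
            second.append(b)
    return float(np.corrcoef(first, second)[0, 1])


def icc_harris(x):
    """Solve k * sum_i (mean_i - grand)^2 = n * s2 * (1 + (k - 1) r) for r."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    s2 = ((x - grand) ** 2).mean()  # divisor n*k, not n*k - 1
    between = k * ((x.mean(axis=1) - grand) ** 2).sum()
    return float((between / (n * s2) - 1.0) / (k - 1))


if __name__ == "__main__":
    # Hypothetical data: 4 classes (rows) with 3 members each (columns).
    data = [[10, 12, 11], [20, 19, 23], [15, 15, 14], [30, 27, 29]]
    print(icc_double_entry(data), icc_harris(data))
```

The two functions should agree to rounding error on any balanced data set, because every observation appears exactly k-1 times in each position of the ordered pairs, so both columns share the grand mean and the grand variance.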
Note also that an intraclass correlation coefficient of 1 implies that

$$k \sum_{i=1}^{n} (\bar{x}_{i\cdot} - \bar{x})^2 = n s^2 (k), \quad \text{i.e.} \quad s^2 = \frac{1}{n} \sum_{i=1}^{n} (\bar{x}_{i\cdot} - \bar{x})^2 \tag{4}$$

which occurs if and only if all the individuals in a class take on exactly the same value.

It is worth noting that when the true (population) value of the intraclass correlation coefficient is near 0, the accuracy of an estimate is about that of nk(k-1)/2 observations for an ordinary correlation. When the value is near 1, the accuracy of an estimate is like that for a number of pairs on the order of n, and for a value near .5 the accuracy is no better than 9n/2 pairs.

Now it is possible to write this down as an analysis of variance. The intraclass correlation coefficient predates Fisher's invention of analysis of variance, however, so we are using a modern approach to better understand the underpinnings of the intraclass correlation coefficient. The Basic ANOVA for ICC table is in Figure 1 below.

Suppose the total is made up of two independent normally distributed components, one common to the class with variance $\sigma_B^2$ and the other unique to the individual with variance $\sigma_E^2$. Then the variance of individual scores is $\sigma^2 = \sigma_B^2 + \sigma_E^2$. Further, the covariance between two members of a class is

$$\mathrm{Cov}(x_{ij}, x_{ij'}) = \mathrm{Cov}(B_i + E_{ij},\, B_i + E_{ij'}) = \mathrm{Cov}(B_i, B_i) = \sigma_B^2 .$$

Thus the correlation within classes is

$$\rho = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_E^2} .$$

| Source | df | Sum of squares | In terms of $s^2$ and $r$ | Expected value |
|---|---|---|---|---|
| Between | $n-1$ | $k \sum_{i=1}^{n} (\bar{x}_{i\cdot} - \bar{x})^2$ | $n s^2 [1 + (k-1) r]$ | $(n-1)(k\sigma_B^2 + \sigma_E^2)$ |
| Within | $n(k-1)$ | $\sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_{i\cdot})^2$ | $n s^2 (k-1)(1-r)$ | $n(k-1)\sigma_E^2$ |
| Total | $nk-1$ | $\sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x})^2$ | $n s^2 k$ | |

Figure 1. Basic ANOVA for ICC
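The class means have a part with variance $\sigma_B^2$ and a part that is the mean of k values each with variance $\sigma_E^2$, so in expectation

$$k \sum_{i=1}^{n} (\bar{x}_{i\cdot} - \bar{x})^2 = k(n-1)\left(\sigma_B^2 + \frac{\sigma_E^2}{k}\right) = (n-1)(k\sigma_B^2 + \sigma_E^2) = (n-1)(\sigma_B^2 + \sigma_E^2)(k\rho + 1 - \rho) = (n-1)\sigma^2 \left(1 + (k-1)\rho\right) .$$

Within class we get

$$n(k-1)\sigma_E^2 = n(k-1)\sigma^2 (1 - \rho) .$$

Bartko (1966) gave a derivation of the ICC for k scores for n persons. It is based on a one-way random effects analysis of variance.

$$x_{ij} = \mu + p_i + e_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, k$$

$$p_i \sim N(0, \sigma_p^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \qquad \text{so} \quad x_{ij} \sim N(\mu, \sigma_p^2 + \sigma_e^2)$$

| Source | df | MS | E(MS) |
|---|---|---|---|
| (Between) Persons | $n-1$ | MSP | $\sigma_e^2 + k\sigma_p^2$ |
| (Within) Error | $n(k-1)$ | MSE | $\sigma_e^2$ |
| Total | $nk-1$ | | |

Figure 2. One-Way Random Effects ANOVA

Now

$$\mathrm{Cov}(x_{ij}, x_{ij'}) = E\,(x_{ij} - \mu)(x_{ij'} - \mu) = E\,(p_i + e_{ij})(p_i + e_{ij'}) = \sigma_p^2 .$$

So

$$\rho = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_e^2}$$

and

$$r = \frac{(MSP - MSE)/k}{(MSP - MSE)/k + MSE} = \frac{MSP - MSE}{MSP + (k-1)\,MSE}$$

is an estimate. Note F = MSP / MSE tests $\sigma_p^2 = 0 \Leftrightarrow \rho = 0$.

The ANOVA controlling for multiple raters is somewhat complicated, but instructive. There are two formulations, depending on whether raters are random or fixed effects.

TWO-WAY RANDOM EFFECTS (OR TWO-WAY MIXED MODEL)

$$x_{ij} = \mu + p_i + r_j + (pr)_{ij} + e_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, k$$

$$p_i \sim N(0, \sigma_p^2), \qquad r_j \sim N(0, \sigma_r^2) \ \text{(or fixed values)}, \qquad (pr)_{ij} \sim N(0, \sigma_q^2), \qquad e_{ij} \sim N(0, \sigma_e^2)$$

The ANOVA for this is shown in Figure 3. Now, for the random effects model,

$$\rho = \frac{\mathrm{Cov}(x_{ij}, x_{ij'})}{\sqrt{\mathrm{var}(x_{ij})\,\mathrm{var}(x_{ij'})}} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_r^2 + \sigma_q^2 + \sigma_e^2}$$

and

$$r = \frac{(MSP - MSE)/k}{(MSP - MSE)/k + (MSR - MSE)/n + MSE} = \frac{MSP - MSE}{MSP + (k-1)\,MSE + k\,(MSR - MSE)/n}$$

or, for the mixed model,

$$r = \frac{MSP - MSE + \sigma_q^2}{MSP + (k-1)\,MSE + \sigma_q^2} .$$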
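The one-way and two-way random effects estimates can be computed directly from the mean squares of a balanced n-by-k table. The following is a minimal Python sketch and is not part of the original paper; the function names and the use of NumPy are assumptions made for illustration. Rows are persons and columns are raters. MSP is the between-persons mean square, MSR the between-raters mean square, and MSE the error mean square, which is the within-person mean square for the one-way model and the residual mean square for the two-way model.

```python
import numpy as np


def anova_mean_squares(x):
    """Return (MSP, MSR, two-way MSE, one-way MSE) for a balanced n-by-k table."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_persons = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_raters = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    msp = ss_persons / (n - 1)
    msr = ss_raters / (k - 1)
    # Two-way residual: what is left after removing persons and raters.
    mse_two_way = (ss_total - ss_persons - ss_raters) / ((n - 1) * (k - 1))
    # One-way within-person error: rater differences are counted as error.
    mse_one_way = (ss_total - ss_persons) / (n * (k - 1))
    return msp, msr, mse_two_way, mse_one_way


def icc_one_way(x):
    """r = (MSP - MSE) / (MSP + (k - 1) MSE), with the one-way MSE."""
    n, k = np.shape(x)
    msp, _, _, mse = anova_mean_squares(x)
    return float((msp - mse) / (msp + (k - 1) * mse))


def icc_two_way_random(x):
    """r = (MSP - MSE) / (MSP + (k - 1) MSE + k (MSR - MSE) / n)."""
    n, k = np.shape(x)
    msp, msr, mse, _ = anova_mean_squares(x)
    return float((msp - mse) / (msp + (k - 1) * mse + k * (msr - mse) / n))


if __name__ == "__main__":
    # Hypothetical ratings: 3 persons, 2 raters, rater 2 always one unit higher.
    ratings = [[1, 2], [3, 4], [5, 6]]
    print(icc_one_way(ratings), icc_two_way_random(ratings))
```

In the hypothetical example the second rater is always exactly one unit higher, so the residual mean square is zero and the only disagreement is the systematic rater difference.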
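| Source | df | MS | E(MS) Random Effects | E(MS) Mixed Model |
|---|---|---|---|---|
| Persons | $n-1$ | MSP | $\sigma_e^2 + \sigma_q^2 + k\sigma_p^2$ | $\sigma_e^2 + \sigma_q^2 + k\sigma_p^2$ |
| Raters | $k-1$ | MSR | $\sigma_e^2 + \sigma_q^2 + n\sigma_r^2$ | $\sigma_e^2 + \sigma_q^2 + n \sum_j r_j^2 / (k-1)$ |
| Error | $(n-1)(k-1)$ | MSE | $\sigma_e^2 + \sigma_q^2$ | $\sigma_e^2 + \sigma_q^2$ |
| Total | $nk-1$ | | | |

Figure 3. Two-Way Random Effects or Mixed Model ANOVA

The last term in the denominator of the random effects version, $k\,(MSR - MSE)/n$, is the $\sigma_r^2$ part, which is often ignored. Note it is easy to test $\sigma_p^2 = 0$ and $\sigma_r^2 = 0$ for the random effects model.

Now this gives two alternative definitions/estimates of ICC, depending on whether you assume raters are a random effect or a fixed effect. The random effect version may be better if the raters are considered typical of the type of raters to be used in the study. The fixed effect version may be better if the raters studied are the actual raters used in the study. Then it is necessary to have an estimate of $\sigma_q^2$, the interaction between persons and raters. One can simply assume $\sigma_q^2$ is zero, which gives a conservative estimate of the ICC (essentially you are assuming all of the MSE is error, $\sigma_e^2$). Or you can assume all of the MSE is interaction $\sigma_q^2$, which means $\sigma_e^2$ is zero. This gives a higher estimate for the ICC. In other words, for fixed effects $0 \le \sigma_q^2 \le MSE$, so

$$\frac{MSP - MSE}{MSP + (k-1)\,MSE} \;\le\; r \;\le\; \frac{MSP}{MSP + k\,MSE} .$$

Reliability for k=2

Deyo, Diehr, Patrick (1991) provide a very accessible explanation of reliability and its estimation:

$$x_{ij}, \qquad i = 1, \ldots, n, \quad j = 1, 2$$

$$\bar{x}_{\cdot j} = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \qquad s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_{\cdot j})^2$$

$$\bar{x}_c = \bar{x}_{\cdot 1} - \bar{x}_{\cdot 2} \quad \text{(backwards)}$$

$$s_c^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( (x_{i1} - x_{i2}) - \bar{x}_c \right)^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( (x_{i1} - \bar{x}_{\cdot 1}) - (x_{i2} - \bar{x}_{\cdot 2}) \right)^2$$

$$= \frac{1}{n-1} \sum_{i=1}^{n} (x_{i1} - \bar{x}_{\cdot 1})^2 - \frac{2}{n-1} \sum_{i=1}^{n} (x_{i1} - \bar{x}_{\cdot 1})(x_{i2} - \bar{x}_{\cdot 2}) + \frac{1}{n-1} \sum_{i=1}^{n} (x_{i2} - \bar{x}_{\cdot 2})^2 = s_1^2 - 2 s_{12} + s_2^2$$

where $s_{12}$ denotes the sample covariance of the two ratings.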
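| Source | df | Sum of squares |
|---|---|---|
| Persons | $n-1$ | $(n-1)(s_1^2 + s_2^2 - s_c^2/2)$ |
| Raters (Occasion) | $1$ | $n \bar{x}_c^2 / 2$ |
| Error (Residual) | $n-1$ | $(n-1)\, s_c^2 / 2$ |
| Total | $2n-1$ | $(n-1)(s_1^2 + s_2^2) + n \bar{x}_c^2 / 2$ |

Figure 4. Two-Way Random Effects ANOVA for k=2

The ANOVA for this is shown in Figure 4 above. In this notation,

$$\text{paired } t = \frac{\bar{x}_c}{s_c / \sqrt{n}}, \qquad \text{Pearson } r = \frac{s_{12}}{s_1 s_2} = \frac{s_1^2 + s_2^2 - s_c^2}{2 s_1 s_2} .$$

Now note

$$\bar{x}_{\cdot 1} - \bar{x}_{\cdot\cdot} = \bar{x}_{\cdot 1} - \frac{\bar{x}_{\cdot 1} + \bar{x}_{\cdot 2}}{2} = \frac{\bar{x}_{\cdot 1} - \bar{x}_{\cdot 2}}{2} = \frac{\bar{x}_c}{2}$$

and similarly for $\bar{x}_{\cdot 2} - \bar{x}_{\cdot\cdot}$, so

$$n \sum_{j=1}^{2} (\bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot})^2 = 2n \left( \frac{\bar{x}_c}{2} \right)^2 = \frac{n \bar{x}_c^2}{2} .$$

Summary

$$ICC = \frac{MSP - MSE}{MSP + MSE + 2\,(MSR - MSE)/n} = \frac{s_1^2 + s_2^2 - s_c^2}{s_1^2 + s_2^2 + \bar{x}_c^2 - s_c^2 / n}$$

A related coefficient is the Concordance Correlation Coefficient (CCC), which measures agreement with the 45° line. Written with the same n-1 divisors,

$$CCC = \frac{2 s_{12}}{s_1^2 + s_2^2 + \bar{x}_c^2} = \frac{s_1^2 + s_2^2 - s_c^2}{s_1^2 + s_2^2 + \bar{x}_c^2} .$$

And for reference, the formula for the ordinary correlation in this notation is:

$$r = \frac{s_1^2 + s_2^2 - s_c^2}{2 s_1 s_2} .$$

Notice how similar the ICC and CCC are. In this form they share a numerator and their denominators differ only by the term $s_c^2 / n$, so the ICC is never closer to zero than the CCC, and the two are equal only when all subjects have exactly the same change, so that $s_c^2$ is zero.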
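For two ratings, all of these quantities follow from the two sample variances, the variance of the differences, and the mean difference. The following is a minimal Python sketch and is not part of the original paper; the function name `two_rating_summary` and the use of NumPy are assumptions made for illustration, and the CCC is written with n-1 divisors (an assumption, since the CCC is often defined with divisor n). Here s1², s2², and sc² are the sample variances of the first rating, the second rating, and their difference, and xc is the mean difference.

```python
import numpy as np


def two_rating_summary(x1, x2):
    """Paired t, Pearson r, CCC, and two-way random ICC for two ratings."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = x1.size
    diff = x1 - x2                      # "backwards" change: first minus second
    s1_sq = x1.var(ddof=1)
    s2_sq = x2.var(ddof=1)
    sc_sq = diff.var(ddof=1)
    xc = diff.mean()
    twice_cov = s1_sq + s2_sq - sc_sq   # equals 2 * sample covariance
    return {
        # paired t is undefined when every difference is identical (sc_sq == 0)
        "paired_t": float(xc / np.sqrt(sc_sq / n)) if sc_sq > 0 else float("nan"),
        "pearson_r": float(twice_cov / (2.0 * np.sqrt(s1_sq * s2_sq))),
        "ccc": float(twice_cov / (s1_sq + s2_sq + xc ** 2)),
        "icc": float(twice_cov / (s1_sq + s2_sq + xc ** 2 - sc_sq / n)),
    }


if __name__ == "__main__":
    # Hypothetical test-retest scores for four subjects.
    first = [1, 3, 5, 6]
    second = [2, 3, 7, 6]
    for name, value in two_rating_summary(first, second).items():
        print(name, value)
```

The design choice here is to work only from summary statistics that a paired t test already produces, so no ANOVA routine is needed when there are exactly two ratings per subject.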
Conclusion

The intraclass correlation coefficient measures reproducibility. It is especially appropriate when all the ratings are equally valid and there is no particular ordering to the ratings. In the presence of a natural ordering of the ratings (e.g. test-retest, several independent judges), the two-way ANOVA version of the ICC should be used. (The literature includes errors in this respect.) If raters are not to be treated as a random effect, a separate estimate or assumptions about person-rater interactions are needed. For two ratings (including the test-retest situation), the ICC is not hard to calculate from the means and standard deviations of $x_1$, $x_2$, and $(x_1 - x_2)$, which are naturally calculated in the context of a paired t test. The ICC is closely related to the Concordance Correlation Coefficient, CCC. Ordinary test-retest correlations, even with supplementary t-tests for systematic mean differences, are not good measures of reproducibility and should be avoided.

References: General

Kramer MS, Feinstein AR (1981), Clinical biostatistics: LIV. The biostatistics of concordance, Clin Pharmacol Ther, 29:111-123.

Deyo RA, Diehr P, Patrick DL (1991), Reproducibility and responsiveness of health status measures: statistics and strategies for evaluation, Controlled Clinical Trials, 12:142S-158S.

References: Intraclass Correlation

Fisher RA (1944), Statistical Methods for Research Workers, 9th ed., Edinburgh: Oliver and Boyd.

Bartko JJ (1966), The intraclass correlation coefficient as a measure of reliability, Psychol Rep, 19:3-11.

Author

David Pasta
Technology Assessment Group
409 Second Street, Suite 01
San Francisco, CA 94107
(415) 495-8966 x18
(415) 495-8969 FAX