Intraclass and Interclass Correlation Coefficients with Application


Developments in Statistics and Methodology
A. Ferligoj and A. Kramberger (Editors)
Metodološki zvezki, 9, Ljubljana: FDV 1993

Lada Smoljanović¹

Abstract

In the paper a review and a comparison of estimates and tests of intraclass and interclass correlation coefficients is presented.

1 Introduction

Rosner (1982) pointed out that the basic unit for statistical analysis in ophthalmological studies is the eye rather than the person. Sometimes one eye is used as the treated eye and the other as the control. In this case standard methods of estimation and hypothesis testing are valid. But if the purpose is to compare two different types of people on some finding in an ocular examination, such as a comparison of intraocular pressures in persons in different age groups, the values from the two eyes of a person are highly correlated and the standard methods are not valid. One appropriate method for such a design is a nested mixed effects analysis of variance (ANOVA), with appropriate extensions to multivariate methods in ophthalmology and application to other paired-data situations. The methods have implications for otolaryngological data and dental data, as well as for the analysis of familial data. An assessment of the degree of resemblance among family members with respect to some biological or psychological attribute, such as weight, differences in the skin, arm length or blood pressure, is of particular interest to geneticists.

1.1 Intraclass correlation model in ophthalmologic studies

Suppose we have $g$ groups of persons and we wish to compare them with respect to some ocular finding $Y$ (for example intraocular pressure). This setting is presented by Rosner (1982).
The model used is given by

$$Y_{ijk} = \mu + \alpha_i + \beta_{ij} + e_{ijk}, \qquad i = 1,\dots,g;\ j = 1,\dots,P_i;\ k = 1,\dots,N_{ij},$$

where there are $P_i$ persons in the $i$th group, $i = 1,\dots,g$; $P = \sum_i P_i$; each person contributes $N_{ij}$ eyes, $N_{ij} = 1$ or $2$; $\beta_{ij} \sim N(0, \sigma_\beta^2)$ is the effect of persons within groups and $e_{ijk} \sim N(0, \sigma^2)$ is the effect of eyes within persons, both considered to be random effects; $\mu$ and $\{\alpha_i\}$ are constants.

¹ Faculty of Sport, University of Zagreb, Horvaćanski zavoj 15, 41000 Zagreb, Croatia

A difficulty arises from the unbalanced nature of the design, in which different persons contribute either one or two eyes to the analysis. A reasonable approximate method for this problem is given by Searle (1971: 367). The procedure is to test the hypothesis $H_0$: all $\alpha_i$ are equal versus $H_1$: some of the $\alpha_i$ are unequal. The test statistic is

$$\Lambda = \mathrm{MSG}/\mathrm{MSP},$$

which follows an $F_{g-1,\,P-g}$ distribution under $H_0$, where MSG is the mean square between groups, MSP is the mean square between persons within groups and MSE is the mean square between eyes within persons. The appropriate intraclass correlation is estimated by

$$\rho^* = \frac{\sigma_\beta^{*2}}{\sigma_\beta^{*2} + \sigma^{*2}}, \qquad \sigma_\beta^{*2} = \max\{0,\ (\mathrm{MSP} - \mathrm{MSE})/N\}, \qquad \sigma^{*2} = \mathrm{MSE}.$$

If we assume instead that two eyes from the same person are independent random variables, we have an ordinary one-way ANOVA model.

1.2 Constant R method

Rosner (1982) assumed a model in which $Y_{ijk} = 1$ if the $k$th eye of the $j$th person in the $i$th group is affected, and $0$ otherwise, $i = 1,\dots,g$; $j = 1,\dots,P_i$; $k = 1, 2$, with

$$P(Y_{ijk} = 1) = \lambda_i, \qquad P(Y_{ijk} = 1 \mid Y_{ij(3-k)} = 1) = R\lambda_i,$$

for some positive constant $R$. The constant $R$ is a measure of dependence between the two eyes of the same person. The primary aim is to test $H_0: \lambda_1 = \lambda_2 = \dots = \lambda_g = \lambda$ versus $H_1: \lambda_r \neq \lambda_s$ for at least one pair $(r, s)$. Rosner estimated "the effective number of eyes per person" under this model by

$$e^* = \frac{2\lambda^*(1 - \lambda^*)}{\lambda^*(1 - \lambda^*) + (R^* - 1)\lambda^{*2}}.$$

Let $P_{ij}$ denote the number of persons in the $i$th group with $j$ affected eyes, $i = 1,\dots,g$; $j = 0, 1, 2$, and let $N$ be the total number of persons. Then

$$\lambda^* = \frac{\sum_i (P_{i1} + 2P_{i2})}{2N}, \qquad R^* = \frac{4N \sum_i P_{i2}}{\left(\sum_i P_{i1} + 2\sum_i P_{i2}\right)^2}.$$

An appropriate test is given by

$$T = \sum_{i=1}^{g} \frac{e^* P_i (\lambda_i^* - \lambda^*)^2}{\lambda^*(1 - \lambda^*)},$$

where $\lambda_i^* = (P_{i1} + 2P_{i2})/(2P_i)$ is the estimate for the $i$th group.
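The constant R quantities above can be computed directly from the counts $P_{i0}, P_{i1}, P_{i2}$. The following is a minimal sketch, not Rosner's own implementation: it assumes every person contributes two eyes, it takes the group estimate as $\lambda_i^* = (P_{i1} + 2P_{i2})/(2P_i)$, and it normalises $T$ by $\lambda^*(1-\lambda^*)$. The function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConstantRResult:
    lam: float  # pooled per-eye prevalence, lambda*
    R: float    # estimated dependence constant, R*
    e: float    # effective number of eyes per person, e*
    T: float    # between-group test statistic


def constant_r_test(counts):
    """counts: one (P_i0, P_i1, P_i2) tuple per group, i.e. the number of
    persons with 0, 1 and 2 affected eyes (assumes two eyes per person)."""
    n_persons = sum(p0 + p1 + p2 for p0, p1, p2 in counts)
    affected_eyes = sum(p1 + 2 * p2 for _, p1, p2 in counts)
    both_affected = sum(p2 for _, _, p2 in counts)

    lam = affected_eyes / (2 * n_persons)
    R = 4 * n_persons * both_affected / affected_eyes ** 2
    e = 2 * lam * (1 - lam) / (lam * (1 - lam) + (R - 1) * lam ** 2)

    T = 0.0
    for p0, p1, p2 in counts:
        P_i = p0 + p1 + p2
        lam_i = (p1 + 2 * p2) / (2 * P_i)
        T += e * P_i * (lam_i - lam) ** 2
    T /= lam * (1 - lam)
    return ConstantRResult(lam, R, e, T)


# Hypothetical counts for two groups of 100 persons each.
result = constant_r_test([(80, 10, 10), (60, 20, 20)])
```

When the two eyes are independent ($R^* = 1$) the effective number of eyes is 2, and it falls towards 1 as the dependence between eyes grows, which is the sense in which $e^*$ discounts the second eye.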
Rosner also presented extensions of these methods, such as a multiple regression method. He related the value of a normally distributed outcome variable $Y_{ij}$ for the $j$th subunit of the $i$th primary unit ($i = 1,\dots,n$; $j = 1,\dots,t_i$) to the values of independent variables $X_{ij1},\dots,X_{ijK}$, where $X_{ijk}$ denotes the $k$th independent variable for the $j$th subunit of the $i$th primary unit:

$$Y_{ij} = \beta_0 + \sum_{k=1}^{K} \beta_k X_{ijk} + e_{ij},$$

where $\mathrm{var}(e_{ij}) = \sigma^2$ and $\rho(e_{ij}, e_{il}) = \rho$; $i = 1,\dots,n$; $j \neq l = 1,\dots,t_i$. The aim is to test the hypothesis $H_0: \beta_k = 0$, all other $\beta_l \neq 0$, versus $H_1$: all $\beta_l \neq 0$.

2 Estimation of the degree of resemblance among family members

One of the aims of the analysis of familial data is to estimate the resemblance among the members of a family. Usually we face two distinct problems: (i) to estimate the resemblance among the children themselves, which is the problem of estimation of intraclass correlation; and (ii) to estimate the resemblance among parents and their children, which is the problem of estimation of interclass correlation.

3 Estimation of intraclass correlation

We denote the score of the $j$th offspring of the $i$th family by $X_{ij}$, $i = 1,\dots,k$; $j = 1,\dots,n_i$, where $k$ is the total number of families studied, $j$ is the position in the family, $n_i$ is the number of offspring in the $i$th family and $N = \sum_i n_i$ is the total number of observations. We assume that the number of offspring is not necessarily the same in each family, since that is the situation most frequently encountered in practice.

3.1 Method of components of variance

Donner (1979) suggested using the analysis of variance model

$$X_{ij} = \mu + a_i + e_{ij},$$

where $\mu$ is the grand mean of all observations in the population, the family effects $\{a_i\}$ are identically distributed with mean 0 and variance $\sigma_A^2$, the residual errors $\{e_{ij}\}$ are identically distributed with mean 0 and variance $\sigma_e^2$, and $\{a_i\}$, $\{e_{ij}\}$ are completely independent. The variance of $X_{ij}$ is given by $\sigma_X^2 = \sigma_A^2 + \sigma_e^2$. The proportion of the variation that lies among groups is known as the intraclass correlation,

$$\rho_{cc} = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_e^2}.$$

Since the family size $n_i$ differs among families, no single value of $n_i$ is appropriate in the formula. We therefore use an average family size; this is not simply $\bar{n}$, the arithmetic mean of the $n_i$'s, but

$$n_0 = \bar{n} - \frac{1}{(k-1)N} \sum_{i=1}^{k} (n_i - \bar{n})^2,$$

which is an average usually close to, but always less than, $\bar{n}$, unless the family sizes are equal, in which case $n_0 = \bar{n}$.
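As a check on the adjusted average family size $n_0$, expanding the sum of squares gives an equivalent form that is often easier to compute. The worked numbers below use hypothetical family sizes chosen only for illustration.

```latex
\begin{align*}
\sum_{i=1}^{k}(n_i-\bar n)^2 &= \sum_{i=1}^{k} n_i^2 - \frac{N^2}{k},
  \qquad \bar n = \frac{N}{k},\\[4pt]
n_0 &= \bar n - \frac{1}{(k-1)N}\Bigl(\sum_{i=1}^{k} n_i^2 - \frac{N^2}{k}\Bigr)
     = \frac{1}{k-1}\Bigl(N - \frac{1}{N}\sum_{i=1}^{k} n_i^2\Bigr).
\end{align*}
% Hypothetical example: k = 3 families of sizes (2, 3, 4), so N = 9 and \bar n = 3.
\[
\sum_{i}(n_i-\bar n)^2 = 2, \qquad
n_0 = 3 - \frac{2}{2\cdot 9} = \frac{26}{9} \approx 2.89 < \bar n = 3 .
\]
```

The second form gives the same value, $(9 - 29/9)/2 = 26/9$, and makes it clear that $n_0 = \bar{n}$ exactly when all $n_i$ are equal.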
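Since unbiased estimates of $\sigma_e^2$ and $\sigma_A^2$ are given by $S_W^2 = \mathrm{MSW}$ and $S_A^2 = (\mathrm{MSA} - \mathrm{MSW})/n_0$, the analysis of variance estimator of the intraclass correlation coefficient is

$$r_A = \frac{S_A^2}{S_A^2 + S_W^2} = \frac{\mathrm{MSA} - \mathrm{MSW}}{\mathrm{MSA} - \mathrm{MSW} + n_0\,\mathrm{MSW}} = \frac{F - 1}{n_0 + F - 1},$$

where $F = \mathrm{MSA}/\mathrm{MSW}$, MSA is the mean square among families and MSW is the mean square within families.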
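The analysis of variance estimator can be sketched in a few lines. This is an illustrative implementation under stated assumptions rather than code from the paper: `anova_icc` is a hypothetical name, the input is a list of per-family score lists, and at least two families, more observations than families, and a non-zero within-family mean square are assumed.

```python
def anova_icc(families):
    """Analysis of variance intraclass correlation for unequal family sizes.

    families: list of lists, one inner list of offspring scores per family.
    Returns (r_A, n_0, F).
    """
    k = len(families)
    sizes = [len(f) for f in families]
    N = sum(sizes)

    grand_mean = sum(sum(f) for f in families) / N
    family_means = [sum(f) / len(f) for f in families]

    # Sums of squares among and within families.
    ssa = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, family_means))
    ssw = sum((x - m) ** 2 for f, m in zip(families, family_means) for x in f)

    msa = ssa / (k - 1)
    msw = ssw / (N - k)  # assumed to be non-zero

    # Adjusted average family size n_0.
    n_bar = N / k
    n0 = n_bar - sum((n - n_bar) ** 2 for n in sizes) / ((k - 1) * N)

    F = msa / msw
    r_a = (F - 1) / (n0 + F - 1)
    return r_a, n0, F


# Hypothetical scores for three families of sizes 2, 3 and 4.
r_a, n0, F = anova_icc([[1.0, 2.0], [4.0, 5.0, 6.0], [8.0, 9.0, 10.0, 11.0]])
```

Note that $r_A$ is negative whenever $F < 1$, that is, whenever the among-family mean square is smaller than the within-family mean square.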

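3.2 Common correlation model, also called random effects model

The other model is the "common correlation model", in which the observations $X_{ij}$ are assumed to be distributed about the same mean $\mu$ and with the same variance $\sigma^2$, in such a way that $X_{ij}$ and $X_{il}$ in the same class have a common correlation $\rho$.

3.3 Maximum likelihood estimator

Suppose we have $X_i = (X_{i1},\dots,X_{in_i})'$ with $X_i \sim N(\mu_i, \Sigma_i)$, $i = 1,\dots,k$, where $\mu_i = (\mu,\dots,\mu)'$ and $\Sigma_i$ has diagonal elements $\sigma^2$ and off-diagonal elements $\rho\sigma^2$, $j, l = 1,\dots,n_i$, $j \neq l$. The likelihood is

$$L(X_1,\dots,X_k \mid \mu, \sigma^2, \rho) = (2\pi)^{-N/2} \prod_{i=1}^{k} |\Sigma_i|^{-1/2} \exp\left\{-\frac{1}{2} \sum_{i=1}^{k} (X_i - \mu_i)' \Sigma_i^{-1} (X_i - \mu_i)\right\},$$

and, with $\mu$ and $\sigma^2$ replaced by their maximum likelihood estimates for a given $\rho$,

$$-2 \ln L = N\left(1 + \ln \sigma^2 + \ln(2\pi)\right) + (N - k)\ln(1 - \rho) + \sum_{i=1}^{k} \ln W_i,$$

where $W_i = 1 + (n_i - 1)\rho$. This expression may be numerically minimized with respect to $\rho$ to yield the maximum likelihood estimate $r_M$. It is interesting to note that for the case $n_i = n$, $i = 1,\dots,k$, we have $r_M = r_P$, Pearson's correlation coefficient computed over all pairs within families, which is expressed by

$$r_P = \frac{1}{N(n-1)S_X^2} \sum_{i=1}^{k} \sum_{j=1}^{n} \sum_{\substack{l=1 \\ l \neq j}}^{n} (X_{ij} - \bar{X})(X_{il} - \bar{X}),$$

where $N = kn$, and $\bar{X}$ and $S_X^2$ are the mean and variance of all observations.

Where previous research (for example, on blood pressure among siblings) indicates that $\rho_{cc} < 0.5$ (a small value), and in cases where nothing is known about the intraclass correlation coefficient, we use the "common correlation" model, i.e. $r_A$. Under the assumption that $\rho_{cc}$ has a large value (0.8) we use the maximum likelihood estimator.

4 Estimation of interclass correlation coefficients

Let us take, for example, a sample of measurements from $k$ families, with $X_{i0}, X_{i1},\dots,X_{in_i}$ representing the measurements from the $i$th family, where $X_{i0}$ is the mother's score (the parent's score in general) and $X_{i1},\dots,X_{in_i}$ are the scores of her $n_i$ children (sibs). Suppose we have

$$X_i = (X_{i0}, X_{i1},\dots,X_{in_i})' \sim N(\mu_i, \Sigma_i), \qquad i = 1,\dots,k,$$

with $\mu_i = (\mu_m, \mu_c,\dots,\mu_c)'$ and

$$\mathrm{var}(X_{i0}) = \sigma_m^2, \quad \mathrm{cov}(X_{i0}, X_{ij}) = \rho_{mc}\sigma_m\sigma_c, \quad \mathrm{var}(X_{ij}) = \sigma_c^2, \quad \mathrm{cov}(X_{ij}, X_{il}) = \rho_{cc}\sigma_c^2,$$

for $j, l = 1,\dots,n_i$; $j \neq l$. The intraclass correlation is denoted by $\rho_{cc}$ and the mother-sib interclass correlation is denoted by $\rho_{mc}$.

4.1 Pairwise estimator

This method pairs each mother's score with each of her children's scores and considers the collection of all such pairs over all families. If we treat each of the pairs $(X_{i0}, X_{ij})$ as an independent observation from a bivariate normal distribution $N(\mu, \Sigma)$, where

$$\mu = (\mu_m, \mu_c), \qquad \Sigma = \begin{pmatrix} \sigma_m^2 & \rho_{mc}\sigma_m\sigma_c \\ \rho_{mc}\sigma_m\sigma_c & \sigma_c^2 \end{pmatrix},$$

then

$$\hat{\rho}_{mc} = \frac{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m) \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_c)}{\sqrt{\sum_{i=1}^{k} n_i (X_{i0} - \bar{X}_m)^2 \, \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_c)^2}},$$

where

$$\bar{X}_m = \frac{\sum_{i=1}^{k} n_i X_{i0}}{\sum_{i=1}^{k} n_i}, \qquad \bar{X}_c = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} X_{ij}}{\sum_{i=1}^{k} n_i}.$$

The pairs are in fact dependent, for two reasons: the same mother appears in as many pairs as she has children, and there is positive correlation among children within the same family, i.e. $\rho_{cc} > 0$.

4.2 Sib-mean estimator

Falconer (1960) recommends pairing each mother's score $X_{i0}$ with the mean of her children's scores, i.e. $\bar{X}_{ic} = \sum_{j=1}^{n_i} X_{ij}/n_i$:

$$\hat{\rho}_m = \frac{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)(\bar{X}_{ic} - \bar{X}_c)}{\sqrt{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)^2 \, \sum_{i=1}^{k} (\bar{X}_{ic} - \bar{X}_c)^2}},$$

where here $\bar{X}_m = \sum_{i=1}^{k} X_{i0}/k$ and $\bar{X}_c = \sum_{i=1}^{k} \bar{X}_{ic}/k$. This estimator depends on the family sizes $n_i$.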
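The pairwise and sib-mean estimators differ only in how the children's scores are paired with the mother's score, which a short sketch makes concrete. The function names and the data layout (a list of mother scores plus a parallel list of per-family child score lists) are assumptions made for illustration, not part of the paper.

```python
import math


def pairwise_estimator(mothers, sibs):
    """Mother-sib correlation over all (mother, child) pairs.

    mothers: list of mother scores X_i0; sibs: list of lists of child scores.
    """
    sizes = [len(s) for s in sibs]
    total = sum(sizes)
    # Each mother is weighted by her number of children.
    x_m = sum(n * m for n, m in zip(sizes, mothers)) / total
    x_c = sum(sum(s) for s in sibs) / total

    num = sum((m - x_m) * sum(x - x_c for x in s) for m, s in zip(mothers, sibs))
    ss_m = sum(n * (m - x_m) ** 2 for n, m in zip(sizes, mothers))
    ss_c = sum((x - x_c) ** 2 for s in sibs for x in s)
    return num / math.sqrt(ss_m * ss_c)


def sib_mean_estimator(mothers, sibs):
    """Correlation between each mother's score and her children's mean score."""
    k = len(mothers)
    sib_means = [sum(s) / len(s) for s in sibs]
    x_m = sum(mothers) / k
    x_c = sum(sib_means) / k

    num = sum((m - x_m) * (c - x_c) for m, c in zip(mothers, sib_means))
    ss_m = sum((m - x_m) ** 2 for m in mothers)
    ss_c = sum((c - x_c) ** 2 for c in sib_means)
    return num / math.sqrt(ss_m * ss_c)


# Hypothetical scores for three families with 2, 1 and 3 children.
mothers = [1.0, 2.0, 3.0]
sibs = [[1.0, 2.0], [2.0], [3.0, 5.0, 4.0]]
rho_pairwise = pairwise_estimator(mothers, sibs)
rho_sib_mean = sib_mean_estimator(mothers, sibs)
```

When every family has exactly one child, both functions reduce to the ordinary Pearson correlation between mother and child scores; they diverge once family sizes exceed one, because averaging the children reduces the variance of the child-side variable.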

4.3 Random-sib estimator

Biron (1977) selected one child randomly from those families having two or more children and computed the parent-child correlation on the basis of the resulting subsample only:

$$\hat{\rho}_r = \frac{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)(X_{ij}^* - \bar{X}_c^*)}{\sqrt{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)^2 \, \sum_{i=1}^{k} (X_{ij}^* - \bar{X}_c^*)^2}},$$

where $X_{ij}^*$ denotes a random sibling from the $i$th family and $\bar{X}_c^* = \sum_{i=1}^{k} X_{ij}^*/k$.

4.4 Ensemble estimator

One solution to this problem (Rosner, Donner 1977) is to compute the expected value of $\hat{\rho}_r$ over all possible selections of members over all families. This solution would be unwieldy, since it would require the computation of $\prod_{i=1}^{k} n_i$ distinct correlations. Using some approximations, however, we have

$$\hat{\rho}_e = \frac{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)(\bar{X}_{ic} - \bar{X}_c)}{\sqrt{\sum_{i=1}^{k} (X_{i0} - \bar{X}_m)^2 \left[ \left(1 - \frac{1}{k}\right) \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{(X_{ij} - \bar{X}_{ic})^2}{n_i} + \sum_{i=1}^{k} (\bar{X}_{ic} - \bar{X}_c)^2 \right]}}.$$

5 Discussion

Donner and Rosner (1977) compared the pairwise, sib-mean, random-sib and ensemble estimators using, as the criterion of comparison, their mean square errors obtained from Monte Carlo simulation with 100 iterations. The pairwise estimator has a lower mean square error than either the sib-mean or the random-sib estimator. The pairwise estimator is more effective than the ensemble estimator at low values of $\rho_{cc}$, and the ensemble estimator is more effective at high values of $\rho_{cc}$. For intermediate values of $\rho_{cc}$, the two estimators perform about equally well.

These methods can also be applied to other types of paired data, as in matched studies with a variable matching ratio, where one has a continuous outcome variable and wishes to control for other confounding variables while maintaining the matching.

References

[1] Donner A. (1978): The number of families required for detecting the familial aggregation of a continuous attribute. American Journal of Epidemiology, 108, 425-471.

[2] Donner A. (1979): The use of correlation and regression in the analysis of familial resemblance. American Journal of Epidemiology, 110, 335-342.

[3] Donner A. and Koval J.J. (1980): The estimation of intraclass correlation in the analysis of familial data. Biometrics, 36, 19-25.

[4] Falconer D.S. (1960): Introduction to Quantitative Genetics. London: Longman Groups.

[5] Rosner B., Donner A., and Hennekens C.H. (1977): Estimation of interclass correlations from familial data. Applied Statistics, 26, 179-187.

[6] Rosner B. and Donner A. (1979): Significance testing of interclass correlations from familial data. Biometrics, 35, 461-471.

[7] Rosner B. (1982): Statistical methods in ophthalmology: An adjustment for the intraclass correlation between eyes. Biometrics, 38, 105-114.

[8] Rosner B. (1984): Multivariate methods in ophthalmology with application to other paired-data situations. Biometrics, 40, 1025-1035.

[9] Elston R.C. (1975): On the correlation between correlations. Biometrika, 62, 133-140.