Random marginal agreement coefficients: rethinking the adjustment for chance when measuring agreement


Biostatistics (2005), 6, 1, pp doi: /biostatistics/kxh027

MICHAEL P. FAY

National Institute of Allergy and Infectious Diseases, 6700B Rockledge Dr. MSC 7609, Bethesda, MD , USA

SUMMARY

Agreement coefficients quantify how well a set of instruments agree in measuring some response on a population of interest. Many standard agreement coefficients (e.g. kappa for nominal, weighted kappa for ordinal, and the concordance correlation coefficient (CCC) for continuous responses) may indicate increasing agreement as the marginal distributions of the two instruments become more different even as the true cost of disagreement stays the same or increases. This problem has been described for the kappa coefficients; here we describe it for the CCC. We propose a solution for all types of responses in the form of random marginal agreement coefficients (RMACs), which use a different adjustment for chance than the standard agreement coefficients. Standard agreement coefficients model chance agreement using expected agreement between two independent random variables each distributed according to the marginal distribution of one of the instruments. RMACs adjust for chance by modeling two independent readings both from the mixture distribution that averages the two marginal distributions. In other words, both independent readings represent first a random choice of instrument, then a random draw from the marginal distribution of the chosen instrument. The advantage of the resulting RMAC is that differences between the two marginal distributions will not induce greater apparent agreement. As with the standard agreement coefficients, the RMACs do not require any assumptions about the bivariate distribution of the random variables associated with the two instruments.
We describe the RMAC for nominal, ordinal and continuous data, and show through the delta method how to approximate the variances of some important special cases.

Keywords: Concordance correlation coefficient; Kappa; Random marginal agreement coefficient; Reliability; Weighted kappa.

Biostatistics Vol. 6 No. 1 © Oxford University Press 2005; all rights reserved.

1. INTRODUCTION

When two instruments are believed to measure the same values, it is often desired to have a single coefficient that measures how well the two instruments agree. We consider coefficients that apply to categorical responses (e.g. two health professionals both classifying patients into $k$ possibly ordered categories of disease) or to more continuous-like responses (e.g. two assays both measuring concentration of a specific antibody in blood samples). Let $X$ and $Y$ be the random variables associated with the responses measured on some population of interest by the two instruments. Then $X$ and $Y$ are either scalar valued (corresponding to continuous responses or discrete responses with known scores), or vector valued with each element zero except one (corresponding to categorical responses). Let $F_{XY}$ be the joint distribution of $X$ and $Y$. We wish to summarize the distribution $F_{XY}$ with a single scalar coefficient which represents how well $X$ and $Y$ agree. We denote these population agreement coefficients by $A$ and their sample values by $\hat{A}$.

In this paper we consider only nonparametric agreement coefficients, where $A$ requires no assumptions about $F_{XY}$. By defining the agreement problem this way, we exclude many useful parametric models used for measuring agreement which require some assumptions about $F_{XY}$. For example, log linear models can describe agreement with nominal data (Tanner and Young, 1985) and ordinal data (Agresti, 1988). For continuous data, the intraclass correlation is defined under an additive model which induces a structure on $F_{XY}$ (see e.g. Shrout and Fleiss, 1979). Carrasco and Jover (2003) show that under the usual additive model assumptions, the intraclass correlation is equivalent to the concordance correlation coefficient (CCC) of Lin (1989). For binary responses, the intraclass kappa (Bloch and Kraemer, 1989) assumes equivalent marginal distributions. Although we show later that the sample intraclass kappa (equivalent to Scott's (1955) estimator) is a good estimator of the RMAC applied to nominal data, an important difference between the population intraclass kappa and the associated RMAC is that the population RMAC makes no assumptions about the bivariate distribution, $F_{XY}$.

Agreement coefficients which do not require assumptions about $F_{XY}$ are the CCC for continuous data (Lin, 1989), and Cohen's kappa or weighted kappa for nominal data or ordinal data (see e.g. Fleiss et al., 2003). We call these standard agreement coefficients (e.g. kappa, CCC) fixed marginal agreement coefficients (FMACs), in order to contrast them with the random marginal agreement coefficients (RMACs) we propose.
In Section 2 we review how the FMACs adjust for chance, and we propose a different adjustment producing the RMACs. The terms fixed and random apply to how the marginal distributions are used in the chance calculation, and this terminology should not be confused with that of Lin et al. (2002), who talk about whether one of the instruments has values that may be fixed or random. An important property of the RMACs is that increasing differences in the marginal distributions cannot increase the adjustment for chance, and consequently cannot increase the agreement coefficient, as can happen with the FMACs. We define both the FMACs and the RMACs using general cost functions similar to King and Chinchilli (2001), who generalized only the FMACs.

We spend the bulk of this paper (Sections 2-4) comparing population agreement coefficients, discussing the usefulness of different ways of summarizing $F_{XY}$ into a single number. In Section 3 we discuss the RMAC applied to categorical data. The RMAC counterpart to weighted kappa is also discussed in Section 3, and the RMAC counterpart to the concordance correlation coefficient is discussed in Section 4. Also in Section 4 we give an interpretation of a transformation of the RMAC with squared difference cost as the proportion of variance of the response from a randomly chosen instrument attributable to instrument disagreement. We offer estimators and confidence intervals of these coefficients in Section 5 and end with a discussion.

2. FIXED MARGINAL VERSUS RANDOM MARGINAL AGREEMENT COEFFICIENTS

Let $c(x, y)$ be the cost of disagreement when $X = x$ and $Y = y$, which equals zero when $x = y$ and is non-negative otherwise, and $c(x, y) = c(y, x)$ for all $x, y$. Agreement coefficients for categorical data can equivalently be represented using positive weights for agreement (see Section 3). Let the expected cost given $F_{XY}$ be called the true cost and be written $E_{F_{XY}}\{c(X, Y)\}$.
To give interpretability to the true cost, we first scale it by some chance cost, then transform the scaled value to equal 1 at perfect agreement and 0 when true cost equals chance cost. Write the chance cost in general form as $E_{F_U} E_{F_V}\{c(U, V)\}$, where $U$ and $V$ are independent random variables defined later. Then the agreement coefficients discussed in this paper are all of the form

$$A = 1 - \frac{E_{F_{XY}}\{c(X, Y)\}}{E_{F_U} E_{F_V}\{c(U, V)\}}. \tag{2.1}$$
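As a minimal sketch of the general form (2.1), the following Python computes an agreement coefficient from a discrete joint distribution, a cost function, and two chance distributions for $U$ and $V$. The function names, the dictionary representation, and the toy joint distribution are illustrative assumptions and not part of the paper. The two choices of chance distribution shown are the ones described in the Summary: the two marginals, and the equal mixture of the two marginals.

```python
from collections import defaultdict

def expected_cost(joint, cost):
    """True cost E{c(X, Y)} for a joint pmf given as {(x, y): probability}."""
    return sum(p * cost(x, y) for (x, y), p in joint.items())

def chance_cost(f_u, f_v, cost):
    """Chance cost E{c(U, V)} for independent U ~ f_u and V ~ f_v."""
    return sum(pu * pv * cost(u, v)
               for u, pu in f_u.items() for v, pv in f_v.items())

def agreement(joint, cost, f_u, f_v):
    """General form (2.1): 1 - true cost / chance cost."""
    return 1.0 - expected_cost(joint, cost) / chance_cost(f_u, f_v, cost)

def marginals(joint):
    """Marginal pmfs of X and Y from the joint pmf."""
    f_x, f_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        f_x[x] += p
        f_y[y] += p
    return dict(f_x), dict(f_y)

def mixture(f_x, f_y):
    """Equal mixture 0.5 * F_X + 0.5 * F_Y of the two marginals."""
    support = set(f_x) | set(f_y)
    return {z: 0.5 * f_x.get(z, 0.0) + 0.5 * f_y.get(z, 0.0) for z in support}

def squared_difference(x, y):
    return (x - y) ** 2

# Hypothetical joint distribution in which Y tends to exceed X.
joint = {(0, 0): 0.4, (0, 1): 0.3, (1, 1): 0.3}
f_x, f_y = marginals(joint)
f_z = mixture(f_x, f_y)

a_fixed = agreement(joint, squared_difference, f_x, f_y)   # chance from the two marginals
a_random = agreement(joint, squared_difference, f_z, f_z)  # chance from the mixture
```

In this toy case the true cost is 0.3 under both choices; only the chance cost in the denominator changes (0.54 with the two marginals, 0.495 with the mixture).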

In FMACs (e.g. kappa, CCC), we model the chance cost by fixing the distribution for the first random variable to be the marginal distribution of the first instrument, and similarly for the second random variable, giving

$$A_F(c) = 1 - \frac{E_{F_{XY}}\{c(X, Y)\}}{E_{F_X} E_{F_Y}\{c(X, Y)\}},$$

where $F_X$ and $F_Y$ are the marginal distributions for $X$ and $Y$ respectively. The problem with FMACs is that increasing differences between $F_X$ and $F_Y$ while holding the true cost constant can cause larger values for chance cost, which implies better agreement for $A_F(c)$. This problem has been widely studied for nominal data (see e.g. Byrt et al., 1993), but not studied for continuous data. Examples are presented in Sections 3 and 4.

Our solution to the above problem is the RMACs, denoted $A_R(c)$. The RMACs let $U$ and $V$ of equation (2.1) be independent responses from the same distribution, $F_Z = 0.5F_X + 0.5F_Y$, i.e.

$$A_R(c) = 1 - \frac{E_{F_{XY}}\{c(X, Y)\}}{E_{F_{Z_1}} E_{F_{Z_2}}\{c(Z_1, Z_2)\}}.$$

For the RMAC, we model disagreement by chance by first randomly choosing an instrument and then randomly drawing from the marginal of that instrument. Thus, differences between the marginal distributions cannot affect RMACs.

For practical applications, we can apply Zwick's (1988) recommendation for nominal data to all types of responses: when exploring agreement, first test for differences between the marginal distributions $F_X$ and $F_Y$, then if there are no significant differences use the sample RMAC (for nominal responses this is Scott's (1955) estimator). Thus, even if there was low power to detect marginal differences, the subsequent RMAC can detect the effect of the marginal differences on the true cost more strongly than the FMAC, since larger marginal differences do not induce greater chance cost adjustments.

3. CATEGORICAL RESPONSES FOR $k \times k$ TABLES

In this section $X$ and $Y$ both represent categorical responses with $k$ possible responses.
Let $e_j$ be a $k \times 1$ vector of zeros except with a 1 in the $j$th row, so that the sample space for both $X$ and $Y$ is $\{e_1, \ldots, e_k\}$. Let $\pi_{ab} = \Pr[X = e_a, Y = e_b]$, and let a dot in place of an index denote summation over that index (e.g. $\pi_{a\cdot} = \sum_{j=1}^{k} \pi_{aj}$). In this notation,

$$A_F(c) = 1 - \frac{\sum_{i=1}^{k}\sum_{j=1}^{k} c_{ij}\pi_{ij}}{\sum_{i=1}^{k}\sum_{j=1}^{k} c_{ij}\pi_{i\cdot}\pi_{\cdot j}},$$

where $c_{ij} = c(e_i, e_j)$, and

$$A_R(c) = 1 - \frac{\sum_{i=1}^{k}\sum_{j=1}^{k} c_{ij}\pi_{ij}}{\sum_{i=1}^{k}\sum_{j=1}^{k} c_{ij}\,(0.5\pi_{i\cdot} + 0.5\pi_{\cdot i})(0.5\pi_{j\cdot} + 0.5\pi_{\cdot j})}.$$

We can write both $A_F(c)$ and $A_R(c)$ in terms of positive weights for agreement. Since scaling the cost by a constant does not change the value of either $A_F(c)$ or $A_R(c)$, we use a scaled version of the $c_{ij}$, say $c^*_{ij}$, such that $\max_{i,j} c^*_{ij} = 1$. Then $w_{ij} \equiv 1 - c^*_{ij}$ equals 1 for perfect agreement and $0 \le w_{ij} \le 1$ for all $i \ne j$, and

$$A_F(c) = 1 - \frac{\sum_{i=1}^{k}\sum_{j=1}^{k} (1 - w_{ij})\pi_{ij}}{\sum_{i=1}^{k}\sum_{j=1}^{k} (1 - w_{ij})\pi_{i\cdot}\pi_{\cdot j}} = 1 - \frac{1 - \sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\pi_{ij}}{1 - \sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\pi_{i\cdot}\pi_{\cdot j}} = \frac{\Pi_o - \Pi_e}{1 - \Pi_e},$$

where $\Pi_o = \sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\pi_{ij}$ and $\Pi_e = \sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\pi_{i\cdot}\pi_{\cdot j}$. This is the standard form for weighted kappa. In this kappa form $A_R(c)$ is

$$A_R(c) = \frac{\Pi_o - \Pi_z}{1 - \Pi_z}, \tag{3.1}$$

where $\Pi_z = \sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\,(0.5\pi_{i\cdot} + 0.5\pi_{\cdot i})(0.5\pi_{j\cdot} + 0.5\pi_{\cdot j})$.

Consider three common cost functions for categorical data: nominal cost ($n$), squared difference cost ($d$), and absolute value of the difference cost ($a$). The usual kappa is $A_F(n)$, the FMAC using the nominal cost function, where $n(x, y) = 0$ if $x = y$ and 1 otherwise. In terms of weights the nominal cost is $c_{ij} = 0$ (i.e. $w_{ij} = 1$) if $i = j$ and $c_{ij} = 1$ (i.e. $w_{ij} = 0$) if $i \ne j$. Then $\Pi_o$ represents the probability of perfect agreement, and $\Pi_e$ represents the probability of perfect agreement by chance under the fixed marginal model.

For ordinal responses, the values of the most common cost functions when $x = e_i$ and $y = e_j$ are either $d(x, y) = c_{ij} = (i - j)^2$ (i.e. $w_{ij} = 1 - (i - j)^2/(k - 1)^2$) or $a(x, y) = c_{ij} = |i - j|$ (i.e. $w_{ij} = 1 - |i - j|/(k - 1)$) (see Fleiss et al., 2003). The associated FMACs are denoted $A_F(d)$ and $A_F(a)$, respectively. Another way to represent ordered scores is to let the sample space for $X$ and $Y$ consist of $k$ ordered (scalar) scores, $s_1 < s_2 < \cdots < s_k$. Then letting $s_i = i$ we get $A_F(d)$ or $A_F(a)$ by now defining $d(x, y) = (x - y)^2$ and $a(x, y) = |x - y|$. The RMAC notation is analogous.

Table 1. Multiple Sclerosis Diagnoses (Westlund and Kurkland, 1953). 1a: Original data; 1b: Modified data. Each panel cross-classifies Neurologist 1 by Neurologist 2, with totals. Categories: 1 = certain MS, 2 = probable MS, 3 = possible MS (50:50 odds), and 4 = doubtful, unlikely, or definitely not MS.

In Table 1a we present data previously used in the agreement literature, the independent classification by two neurologists of 149 patients into four categories: 1 = certain multiple sclerosis (MS), 2 = probable MS, 3 = possible MS (50:50 odds), and 4 = doubtful, unlikely, or definitely not MS.
Suppose we define the $\pi_{ab}$ values by the proportions from Table 1a, then $A_F(n) =$ and $A_R(n) =$ . Now modify the data to get Table 1b by supposing that the 10 patients that were rated 3 by Neurologist 1 and 1 by Neurologist 2 were instead rated 1 by Neurologist 1 and 3 by Neurologist 2. Again defining the $\pi_{ab}$ values by the proportions, the values of the agreement coefficients are $A_F(n) =$ and $A_R(n) =$ for the modified table. The FMAC shows better agreement for Table 1a over Table 1b, despite the fact that the modified Table 1b has more closely matching marginals and identical diagonal values (exact matches) to Table 1a. In contrast, the RMAC shows identical values for both tables.

A similar phenomenon occurs when using the ordinal cost functions, $d$ and $a$. The FMACs show better agreement for Table 1a despite the fact that Table 1b has the same diagonal values and more closely matched marginals (Table 1a, $A_F(d) =$ ; Table 1b, $A_F(d) = 0.503$; Table 1a, $A_F(a) =$ ; Table 1b, $A_F(a) = 0.355$). In contrast, the RMACs show identical agreement between the two tables (both tables, $A_R(d) = 0.497$; both tables, $A_R(a) = 0.348$).
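The effect described for Tables 1a and 1b can be reproduced on a small hypothetical table. The sketch below evaluates the kappa forms of $A_F(c)$ and $A_R(c)$ for a table of counts; the function names and the 3 x 3 counts are invented for illustration and are not the Westlund and Kurkland data. As in the modification that produces Table 1b, 10 pairs are moved from one off-diagonal cell to its mirror image across the diagonal.

```python
def kappa_form(table, weight, random_marginal):
    """Kappa-form agreement coefficient for a k x k table of counts.

    weight(i, j) is the agreement weight w_ij (1 on the diagonal, symmetric).
    random_marginal=False gives the FMAC, True gives the RMAC of (3.1).
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    pi = [[table[i][j] / total for j in range(k)] for i in range(k)]
    row = [sum(pi[i]) for i in range(k)]                        # pi_{i.}
    col = [sum(pi[i][j] for i in range(k)) for j in range(k)]   # pi_{.j}
    if random_marginal:
        avg = [0.5 * row[i] + 0.5 * col[i] for i in range(k)]
        left, right = avg, avg
    else:
        left, right = row, col
    observed = sum(weight(i, j) * pi[i][j] for i in range(k) for j in range(k))
    chance = sum(weight(i, j) * left[i] * right[j]
                 for i in range(k) for j in range(k))
    return (observed - chance) / (1.0 - chance)

def nominal_weight(i, j):
    return 1.0 if i == j else 0.0

# Hypothetical counts (rows: rater 1, columns: rater 2).
table_a = [[20, 2, 1],
           [5, 15, 3],
           [10, 4, 12]]
# Same table with the 10 pairs in cell (3, 1) moved to cell (1, 3).
table_b = [[20, 2, 11],
           [5, 15, 3],
           [0, 4, 12]]

fmac_a = kappa_form(table_a, nominal_weight, random_marginal=False)
fmac_b = kappa_form(table_b, nominal_weight, random_marginal=False)
rmac_a = kappa_form(table_a, nominal_weight, random_marginal=True)
rmac_b = kappa_form(table_b, nominal_weight, random_marginal=True)
```

Moving counts across the diagonal leaves both the diagonal and the averaged marginals unchanged, so the two RMAC values coincide, whereas the two FMAC values differ because the product of row and column marginals changes.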

4. CONTINUOUS RESPONSES

4.1 Comparison of RMAC to concordance correlation coefficient

Because of historical precedent, simplifications, and some nice properties, we focus on the squared difference cost function (where $c(x, y)$ is $d(x, y) = (x - y)^2$) for continuous responses. Other cost functions (e.g. $c(x, y) = a(x, y) = |x - y|$) may be used, but are not discussed in this section. For continuous responses $A_F(d)$ gives the CCC (Lin, 1989),

$$A_F(d) = 1 - \frac{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2 - 2\rho\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2} = \frac{2\rho\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2},$$

where $\mu_x$ ($\mu_y$) and $\sigma_x^2$ ($\sigma_y^2$) are the means and variances associated with $F_X$ ($F_Y$), and $\rho = \mathrm{Corr}(X, Y)$. Following Lin (1989) we can write this in terms of three parameters,

$$A_F(d) = \frac{2\rho}{v + 1/v + u^2},$$

where $v = \sigma_x/\sigma_y$ and $u = (\mu_x - \mu_y)/\sqrt{\sigma_x\sigma_y}$.

To calculate $A_R(d)$, first note that $E_{F_{Z_1}} E_{F_{Z_2}}(Z_1 - Z_2)^2 = 2\mathrm{Var}(Z)$, where as before $Z_1$ and $Z_2$ are independent and $F_Z = 0.5F_X + 0.5F_Y$. This gives

$$2\mathrm{Var}(Z) = 2\left[E_Z(Z^2) - \{E_Z(Z)\}^2\right] = 2\left[\tfrac{1}{2}E_X(X^2) + \tfrac{1}{2}E_Y(Y^2) - \left(\frac{\mu_x + \mu_y}{2}\right)^2\right] = \sigma_x^2 + \sigma_y^2 + \tfrac{1}{2}(\mu_x - \mu_y)^2,$$

and $A_R(d)$ in terms of $u$, $v$ and $\rho$ is

$$A_R(d) = \frac{2\rho - \tfrac{1}{2}u^2}{v + 1/v + \tfrac{1}{2}u^2}. \tag{4.1}$$

When $u = 0$ then $A_F(d) = A_R(d)$. To compare the two agreement measures more generally we plot each agreement measure versus $u$, fixing $v = 1$, with lines representing different values of $\rho$. In Figure 1a we see that the CCC ($A_F(d)$) approaches 0 as $u$ gets large, while Figure 1b shows that $A_R(d)$ approaches $-1$ in the same situations. With fixed negative correlation and increasing standardized mean difference, the CCC increases (implying better agreement), while $A_R(d)$ decreases.

To show the problem consider two multivariate normal distributions both with $\sigma_x^2 = \sigma_y^2 = 1$ and $\rho = -0.1$. In the first distribution, the means are equal, $\mu_x = \mu_y = 0$, while in the second the means differ, $\mu_x = -2$ and $\mu_y = 2$.
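The two coefficients for these two distributions can be evaluated directly from the three-parameter forms of $A_F(d)$ and $A_R(d)$. The helper names below are hypothetical, and the signs used ($\rho = -0.1$, means $-2$ and $2$) are an assumption about the example; only the difference between the means enters the formulas.

```python
def ccc(u, v, rho):
    """A_F(d), the concordance correlation coefficient, in terms of u, v, rho."""
    return 2.0 * rho / (v + 1.0 / v + u ** 2)

def rmac_squared(u, v, rho):
    """A_R(d) from equation (4.1)."""
    return (2.0 * rho - 0.5 * u ** 2) / (v + 1.0 / v + 0.5 * u ** 2)

def standardize(mu_x, mu_y, sd_x, sd_y):
    """Return (u, v): standardized mean difference and ratio of standard deviations."""
    u = (mu_x - mu_y) / (sd_x * sd_y) ** 0.5
    v = sd_x / sd_y
    return u, v

rho = -0.1  # assumed sign of the correlation

# First distribution: equal means.
u1, v1 = standardize(0.0, 0.0, 1.0, 1.0)
first = (ccc(u1, v1, rho), rmac_squared(u1, v1, rho))

# Second distribution: means differ by 4 standard deviations.
u2, v2 = standardize(-2.0, 2.0, 1.0, 1.0)
second = (ccc(u2, v2, rho), rmac_squared(u2, v2, rho))

print(first)   # both coefficients equal -0.1 when u = 0
print(second)  # CCC about -0.011, RMAC -0.82
```

With the mean difference introduced, the CCC moves toward zero while the RMAC drops sharply, which is the contrast the example is meant to show.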
Clearly the second distribution represents worse agreement between $X$ and $Y$, but only $A_R(d)$ shows this (first distribution, $A_F(d) = A_R(d) = -0.1$; second distribution, $A_F(d) = -0.01$, $A_R(d) = -0.82$).

Fig. 1. (a) Concordance Correlation Coefficient, $A_F(d)$; (b) RMAC with Squared Difference Cost, $A_R(d)$. Each panel plots the coefficient against $u$, with lines for $\rho = 1, 0.5, 0, -0.5, -1$.

4.2 Interpretation as partition of variance

For the RMAC with continuous responses we can interpret $\{1 - A_R(d)\}/2$ as the proportion of variance of an arbitrary instrument's response attributable to disagreement between the instruments. To see this, let $R$ be a Bernoulli random variable with parameter 0.5. Then $Z = RX + (1 - R)Y$ represents a random choice between $X$ and $Y$, and the distribution of $Z$ is $F_Z$ as previously defined. The variance of $Z$ can be partitioned into

$$\mathrm{Var}(Z) = \mathrm{Var}(U) + \tfrac{1}{4}E\left\{(X - Y)^2\right\},$$

where here $U = 0.5X + 0.5Y$. Thus,

$$\frac{1 - A_R(d)}{2} = \frac{\tfrac{1}{4}E_{F_{XY}}\left\{(X - Y)^2\right\}}{\mathrm{Var}(Z)}$$

can be interpreted as the proportion of the variance of $Z$ attributable to disagreement between instruments. The value of $\{1 - A_R(d)\}/2$ is close to zero (i.e. $A_R(d)$ is close to one) when the expected squared difference between the responses from the two instruments is small compared to the variance of the average response of the two instruments; and the value is close to one (i.e. $A_R(d)$ is close to minus one) when the expected squared difference is much larger than the variance of the average.

5. ESTIMATION AND INFERENCES

5.1 General case

We can use the bootstrap to derive simple estimators (see e.g. Efron and Tibshirani, 1993). Let the data be paired responses, $(x_1, y_1), \ldots, (x_n, y_n)$. The ideal bootstrap estimators are

$$\hat{A}_F(c) = 1 - \frac{n^{-1}\sum_{i=1}^{n} c(x_i, y_i)}{n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n} c(x_i, y_j)}$$

for the FMAC, and

$$\hat{A}_R(c) = 1 - \frac{n^{-1}\sum_{i=1}^{n} c(x_i, y_i)}{(2n)^{-2}\sum_{i=1}^{2n}\sum_{j=1}^{2n} c(z_i, z_j)}$$

for the RMAC, where $z = [x, y] = [x_1, \ldots, x_n, y_1, \ldots, y_n]$. For categorical data these estimators are equivalent to replacing the $\pi_{ij}$ values in the expression for $A_F(c)$ or $A_R(c)$ with the sample proportions. Similarly we can write the bootstrap for continuous and ordinal data by replacing $F_{XY}$, $F_X$, and $F_Y$ with their respective empirical distributions. For scalar data, we can write $A_F(d)$ or $A_R(d)$ in terms of $E(X)$, $E(Y)$, $\mathrm{Var}(X)$, $\mathrm{Var}(Y)$, and $\mathrm{Corr}(X, Y)$ (see Section 4), so we simply replace those values with their usual bootstrap estimators. Alternatively, we could use unbiased sample variance and covariance estimators. For inferences on $A_R(c)$ or $A_F(c)$, we can apply the bias corrected and accelerated (BC$_a$) bootstrap confidence intervals (see e.g. Efron and Tibshirani, 1993).

5.2 Special case: categorical responses

An asymptotic variance expression for $\hat{A}_F(c)$ has been derived (see e.g. Fleiss et al., 2003); here we give a variance estimator for $\hat{A}_R(c)$ using the kappa-form weights. Fisher's z-transformation gives

$$\beta = \tanh^{-1}[A_R(c)] = \frac{1}{2}\log\left(\frac{1 + A_R(c)}{1 - A_R(c)}\right).$$

In Section 1 of the supplementary material we derive the delta method variance estimate for $\hat\beta$,

$$\hat\sigma^2_{\hat\beta} = \frac{1}{n}\left\{\sum_{a=1}^{k}\sum_{b=1}^{k}\hat\pi_{ab}\hat{D}_{ab}^2 - \left(\sum_{a=1}^{k}\sum_{b=1}^{k}\hat\pi_{ab}\hat{D}_{ab}\right)^2\right\},$$

where

$$\hat{D}_{ab} = \frac{2w_{ab} - (\bar{w}_{a\cdot} + \bar{w}_{\cdot a} + \bar{w}_{b\cdot} + \bar{w}_{\cdot b})}{4(1 + \hat\Pi_o - 2\hat\Pi_z)} + \frac{w_{ab}}{2(1 - \hat\Pi_o)},$$

$$\bar{w}_{\cdot a} = \sum_{i=1}^{k} w_{ia}\hat\pi_{i\cdot} \quad\text{and}\quad \bar{w}_{a\cdot} = \sum_{j=1}^{k} w_{aj}\hat\pi_{\cdot j},$$

and any value topped with a hat denotes replacing all $\pi_{ij}$ with $\hat\pi_{ij}$ in its definition, where $\hat\pi_{ij}$ is $n^{-1}$ times the number of pairs with $x = e_i$ and $y = e_j$. The $100(1 - \alpha)$ percent confidence limits for $A_R(c)$ are $\tanh\left(\hat\beta \pm \Phi^{-1}(1 - \alpha/2)\,\hat\sigma_{\hat\beta}\right)$, where $\Phi^{-1}(p)$ is the $p$th quantile of the standard normal distribution.

We performed simulations on five distributions for $F_{XY}$, three with $k = 2$, one with $k = 4$ and one with $k = 5$.
We used the nominal cost function in every case, and when $k = 4$ or 5 we additionally used the absolute difference and squared difference cost functions. For each distribution/cost function combination, we simulated with $n = 20$ and $n = 50$, and with $c(x, y) = d(x, y)$ and $k = 4$ or 5 we additionally did $n = 200$. There were a total of 20 simulations. For each simulation we did 1000 replications, and for the BC$_a$ we used 1000 bootstrap resamples.
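The plug-in estimators of Section 5.1, which these simulations evaluate, can be sketched directly from their definitions. The function names and the small data set below are hypothetical; the double sums are written out naively, so this form is only practical for modest $n$.

```python
def fmac_hat(x, y, cost):
    """Plug-in FMAC: chance cost averages c(x_i, y_j) over all n^2 pairs."""
    n = len(x)
    true_cost = sum(cost(a, b) for a, b in zip(x, y)) / n
    chance = sum(cost(a, b) for a in x for b in y) / n ** 2
    return 1.0 - true_cost / chance

def rmac_hat(x, y, cost):
    """Plug-in RMAC: chance cost averages c(z_i, z_j) over the pooled z = [x, y]."""
    n = len(x)
    z = list(x) + list(y)
    true_cost = sum(cost(a, b) for a, b in zip(x, y)) / n
    chance = sum(cost(a, b) for a in z for b in z) / (2 * n) ** 2
    return 1.0 - true_cost / chance

def nominal_cost(a, b):
    return 0.0 if a == b else 1.0

# Hypothetical paired nominal ratings with different marginals for x and y.
x = ["a", "a", "a", "a", "b", "b"]
y = ["a", "a", "b", "b", "b", "b"]

a_f_hat = fmac_hat(x, y, nominal_cost)  # Cohen's kappa for these data
a_r_hat = rmac_hat(x, y, nominal_cost)  # Scott's estimator for these data
```

For these data the true cost is 1/3 in both cases, the fixed marginal chance cost is 20/36 and the random marginal chance cost is 1/2, giving 0.4 and 1/3 respectively.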

In every case the estimators of $A_R(c)$ appear slightly biased downward, with all simulated means within 0.05 of the true value. Both the delta method intervals and the BC$_a$ intervals give reasonably adequate coverage, with the BC$_a$ intervals preferred when $k = 4$ or 5. For $k = 2$ the simulated 95% coverage for the delta method was 94-95% and for the BC$_a$ method was 95-96% except one case of 89.5%. For the cases with $k = 4$ and 5 the coverage was 94% or greater in 10/14 cases for the BC$_a$ method but only 4/14 for the delta method. Note that even in the cases with $k \ge 4$, which have quite a few cells with very low probability of response, the coverage for both methods was generally over 90%. Details are given in Section 2 of the supplementary material.

5.3 Special case: continuous using squared difference cost

To derive confidence intervals for $A_R(d)$ we follow a similar strategy to Lin (1989). Fisher's z-transformation gives

$$\xi = \frac{1}{2}\log\left(\frac{1 + A_R(d)}{1 - A_R(d)}\right) = \frac{1}{2}\log\left(\frac{\sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}}{\sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} + (\mu_x - \mu_y)^2}\right).$$

To estimate $\xi$ we use unbiased estimators of the numerator and denominator of the ratio inside the logarithm, to obtain

$$\hat\xi = \frac{1}{2}\log\left\{\frac{S_x^2 + S_y^2 + 2S_{xy}}{\left(\frac{n-1}{n}\right)S_x^2 + \left(\frac{n-1}{n}\right)S_y^2 - 2\left(\frac{n-1}{n}\right)S_{xy} + (\bar{X} - \bar{Y})^2}\right\},$$

where $\bar{X}$ and $\bar{Y}$ are means, and $S_x^2 = (n - 1)^{-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, $S_y^2 = (n - 1)^{-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$, and $S_{xy} = (n - 1)^{-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})$. Then using the delta method we show in Section 3 of the supplementary material that under the assumption of normal responses an asymptotic estimator of the variance is

$$\hat\sigma^2_{\hat\xi} = \frac{(\bar{X} - \bar{Y})^4\,(S_x^2 + S_y^2 + 2S_{xy}) + 2(\bar{X} - \bar{Y})^2\,(S_x^4 + S_y^4 + 6S_x^2S_y^2 - 8S_{xy}^2) + 8\,(S_x^2 + S_y^2 - 2S_{xy})(S_x^2S_y^2 - S_{xy}^2)}{2n\left(S_x^2 + S_y^2 - 2S_{xy} + (\bar{X} - \bar{Y})^2\right)^2\left(S_x^2 + S_y^2 + 2S_{xy}\right)}.$$

Through simulations (and similar to the results of Lin, 1989) we show that we get better coverage if we use $\tilde\sigma^2_{\hat\xi} = n\hat\sigma^2_{\hat\xi}/(n - 2)$ to calculate confidence intervals. The $100(1 - \alpha)$ percent confidence limits for $A_R(d)$ are $\tanh\left(\hat\xi \pm \Phi^{-1}(1 - \alpha/2)\,\tilde\sigma_{\hat\xi}\right)$.
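The interval of this subsection can be sketched as follows, using only the Python standard library. The function name and return layout are hypothetical; the computation follows the expressions for $\hat\xi$, $\hat\sigma^2_{\hat\xi}$ and the $n/(n-2)$ adjustment given above, and it inherits their assumption of normal responses.

```python
from math import log, sqrt, tanh
from statistics import NormalDist, mean

def rmac_squared_interval(x, y, alpha=0.05):
    """Return (tanh(xi_hat), (lower, upper), unadjusted variance of xi_hat)."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    # Unbiased sample variances and covariance.
    sxx = sum((a - xbar) ** 2 for a in x) / (n - 1)
    syy = sum((b - ybar) ** 2 for b in y) / (n - 1)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
    d2 = (xbar - ybar) ** 2
    plus = sxx + syy + 2.0 * sxy    # estimates Var(X + Y)
    minus = sxx + syy - 2.0 * sxy   # estimates Var(X - Y)
    xi_hat = 0.5 * log(plus / ((n - 1) / n * minus + d2))
    numerator = (d2 ** 2 * plus
                 + 2.0 * d2 * (sxx ** 2 + syy ** 2 + 6.0 * sxx * syy - 8.0 * sxy ** 2)
                 + 8.0 * minus * (sxx * syy - sxy ** 2))
    variance = numerator / (2.0 * n * (minus + d2) ** 2 * plus)
    adjusted = n * variance / (n - 2)  # small-sample adjustment from the text
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sqrt(adjusted)
    limits = (tanh(xi_hat - half_width), tanh(xi_hat + half_width))
    return tanh(xi_hat), limits, variance

# Hypothetical paired measurements with equal sample means.
estimate, (lower, upper), variance = rmac_squared_interval([1, 2, 3, 4], [2, 1, 4, 3])
```

Because the limits are formed on the z scale and then back-transformed with tanh, they always lie inside $(-1, 1)$ and are not symmetric about the point estimate.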
We performed 18 simulations on different normal distributions with replications each using 1000 bootstrap resamples. The simulated bias estimates were all less than . The simulated 95% coverage for the delta method intervals using $\tilde\sigma_{\hat\xi}$ were all above 92% with 15/18 between 94-95%. The BC$_a$ simulated coverage was worse, with coverage mostly around 91-93%. The coverage for the BC$_a$ intervals may improve with more bootstrap replications. Details are given in Section 4 of the supplementary material.

6. DISCUSSION

We have proposed that RMACs should be used in order to stop differences between marginal distributions from inducing greater agreement. The RMACs do not address other common criticisms of agreement coefficients. Firstly, as with FMACs, when comparing two agreement coefficients, it is necessary to realize the dependence of the RMAC on the form of the average marginal distribution $F_Z$ (Byrt et al., 1993). Secondly (and relatedly), as with FMACs, the RMAC depends on the heterogeneity of the population; for example, in the continuous case if the range of responses is large, then it is much easier to obtain higher agreement coefficients than if the range of responses is small (Atkinson and Nevill, 1997; Lin and Chinchilli, 1997). For binary data, one can see this effect when the data are nearly homogeneous (i.e. if the probability of responding in one category is close to one): then both the FMAC and RMAC will have large chance agreement (low chance cost) and generally lower agreement coefficients. Thirdly, in the nominal case with more than two categories of response, both the FMAC and the RMAC may be misleading. One may have high agreement yet all the categories but one may be indistinguishable from each other (Kraemer et al., 2002). Finally, since it is only one measure, the RMAC cannot describe all aspects of the bivariate distribution $F_{XY}$ that are of interest in agreement studies (for other measures see Lin et al., 2002).

Although the sample RMAC for nominal data is equivalent to Scott's (1955) estimator, we have made no assumptions on the equality of the marginal distributions. This apparent assumption of Scott may have led to a preference for Cohen's kappa over Scott's estimator. For example, Fleiss (1975) says the kappa is preferred to Scott's estimator because it does not make an unwarranted assumption about the marginal proportions. In fact, in our presentation we have emphasized that neither the FMAC (estimated by kappa for nominal data) nor the RMAC (estimated by Scott's estimator for nominal data) makes any assumptions about the marginal distributions.
In this paper we have argued for the use of RMAC over the use of FMAC, but there may be some cases when the FMAC is preferred. Consider two raters classifying observations into sets with no clear boundaries, so that there is no intrinsic meaning to the classification. For example, suppose raters were classifying people as being in poor health, fair health, or good health. Because the categories are fuzzy, there is no correct distribution for the study population, and the marginal for each rater just denotes that rater's preferences. The FMAC could be interpreted as measuring agreement given the preferences (i.e. marginal distributions) of the raters. Then if more disparate marginals induce greater agreement in the FMAC, we accept that interpretation because the induced agreement should be greater since it was achieved despite the larger difference in marginals.

The kappa coefficient and the CCC have been generalized and extended to handle multiple raters, stratified data, and testing of agreement coefficients (Banerjee et al., 1999; King and Chinchilli, 2001). The RMAC should be able to be extended in similar ways, and that work is left to future research.

ACKNOWLEDGMENTS

I thank Dean Follmann, Ji Hyun Le, and Martha Nason for comments and discussions on drafts of this paper.

REFERENCES

AGRESTI, A. (1988). A model for agreement between ratings on an ordinal scale. Biometrics 44,

ATKINSON, G. AND NEVILL, A. (1997). Comment on the use of concordance correlation to assess the agreement between two variables. Biometrics 53,

BANERJEE, M., CAPOZZOLI, M., MCSWEENEY, L. AND SINHA, D. (1999). Beyond kappa: a review of interrater agreement measures. Canadian Journal of Statistics 27,

BLOCH, D. A. AND KRAEMER, H. C. (1989). 2 × 2 kappa coefficients: measures of agreement or association. Biometrics 45,

BYRT, T., BISHOP, J. AND CARLIN, J. B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology 46,

CARRASCO, J. L. AND JOVER, L. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics 59,

EFRON, B. AND TIBSHIRANI, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.

FLEISS, J. L. (1975). Measuring agreement between two judges on the presence or absence of a trait. Biometrics 31,

FLEISS, J. L., LEVIN, B. AND PAIK, M. C. (2003). Statistical Methods for Rates and Proportions, 3rd edn. New York: Wiley.

KING, T. S. AND CHINCHILLI, V. M. (2001). A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20,

KRAEMER, H. C., PERIYAKOIL, V. S. AND NODA, A. (2002). Kappa coefficients in medical research. Statistics in Medicine 21,

LIN, L. I. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, (Correction: 2000, pp ).

LIN, L. I. AND CHINCHILLI, V. (1997). Rejoinder to the letter to the editor from Atkinson and Nevill. Biometrics 53,

LIN, L., HEDAYAT, A. S., SINHA, B. AND YANG, M. (2002). Statistical methods in assessing agreement: models, issues, and tools. Journal of the American Statistical Association 97,

SCOTT, W. A. (1955). Reliability of content analysis: the case of nominal scale coding. Public Opinion Quarterly 19,

SHROUT, P. E. AND FLEISS, J. L. (1979). Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin 86,

TANNER, M. A. AND YOUNG, M. A. (1985). Modeling agreement among raters. Journal of the American Statistical Association 80,

WESTLUND, K. B. AND KURKLAND, L. T. (1953). Studies in multiple sclerosis in Winnipeg, Manitoba and New Orleans, Louisiana. American Journal of Hygiene 57,

ZWICK, R. (1988). Another look at interrater agreement. Psychological Bulletin 103,

[Received 13 July 2004; revised 29 September 2004; accepted for publication 1 October 2004]


Coefficients of agreement for fixed observers

Coefficients of agreement for fixed observers Statistical Methods in Medical Research 2006; 15: 255 271 Coefficients of agreement for fixed observers Michael Haber Department of Biostatistics, Rollins School of Public Health, Emory University, Atlanta,

More information

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our

More information

Variance Estimation of the Survey-Weighted Kappa Measure of Agreement

Variance Estimation of the Survey-Weighted Kappa Measure of Agreement NSDUH Reliability Study (2006) Cohen s kappa Variance Estimation Acknowledg e ments Variance Estimation of the Survey-Weighted Kappa Measure of Agreement Moshe Feder 1 1 Genomics and Statistical Genetics

More information

MULTIVARIATE CONCORDANCE CORRELATION COEFFICIENT

MULTIVARIATE CONCORDANCE CORRELATION COEFFICIENT The Pennsylvania State University The Graduate School Department of Statistics MULTIVARIATE CONCORDANCE CORRELATION COEFFICIENT A Dissertation in Statistics by Sasiprapa Hiriote c 2009 Sasiprapa Hiriote

More information

Describing Stratified Multiple Responses for Sparse Data

Describing Stratified Multiple Responses for Sparse Data Describing Stratified Multiple Responses for Sparse Data Ivy Liu School of Mathematical and Computing Sciences Victoria University Wellington, New Zealand June 28, 2004 SUMMARY Surveys often contain qualitative

More information

The exact bootstrap method shown on the example of the mean and variance estimation

The exact bootstrap method shown on the example of the mean and variance estimation Comput Stat (2013) 28:1061 1077 DOI 10.1007/s00180-012-0350-0 ORIGINAL PAPER The exact bootstrap method shown on the example of the mean and variance estimation Joanna Kisielinska Received: 21 May 2011

More information

The Nonparametric Bootstrap

The Nonparametric Bootstrap The Nonparametric Bootstrap The nonparametric bootstrap may involve inferences about a parameter, but we use a nonparametric procedure in approximating the parametric distribution using the ECDF. We use

More information

Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods

Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods Chapter 4 Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods 4.1 Introduction It is now explicable that ridge regression estimator (here we take ordinary ridge estimator (ORE)

More information

Research Article Fixed-Effects Modeling of Cohen s Weighted Kappa for Bivariate Multinomial Data: A Perspective of Generalized Inverse

Research Article Fixed-Effects Modeling of Cohen s Weighted Kappa for Bivariate Multinomial Data: A Perspective of Generalized Inverse Probability and Statistics Volume 2011, Article ID 603856, 14 pages doi:101155/2011/603856 Research Article Fixed-Effects Modeling of Cohen s Weighted Kappa for Bivariate Multinomial Data: A Perspective

More information

Correlation analysis. Contents

Correlation analysis. Contents Correlation analysis Contents 1 Correlation analysis 2 1.1 Distribution function and independence of random variables.......... 2 1.2 Measures of statistical links between two random variables...........

More information

On Certain Indices for Ordinal Data with Unequally Weighted Classes

On Certain Indices for Ordinal Data with Unequally Weighted Classes Quality & Quantity (2005) 39:515 536 Springer 2005 DOI 10.1007/s11135-005-1611-6 On Certain Indices for Ordinal Data with Unequally Weighted Classes M. PERAKIS, P. E. MARAVELAKIS, S. PSARAKIS, E. XEKALAKI

More information

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Biometrical Journal 42 (2000) 1, 59±69 Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Kung-Jong Lui

More information

The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University Model Selection, Estimation, and Bootstrap Smoothing Bradley Efron Stanford University Estimation After Model Selection Usually: (a) look at data (b) choose model (linear, quad, cubic...?) (c) fit estimates

More information

Assessing agreement with multiple raters on correlated kappa statistics

Assessing agreement with multiple raters on correlated kappa statistics Biometrical Journal 52 (2010) 61, zzz zzz / DOI: 10.1002/bimj.200100000 Assessing agreement with multiple raters on correlated kappa statistics Hongyuan Cao,1, Pranab K. Sen 2, Anne F. Peery 3, and Evan

More information

Sample Size Formulas for Estimating Intraclass Correlation Coefficients in Reliability Studies with Binary Outcomes

Sample Size Formulas for Estimating Intraclass Correlation Coefficients in Reliability Studies with Binary Outcomes Western University Scholarship@Western Electronic Thesis and Dissertation Repository September 2016 Sample Size Formulas for Estimating Intraclass Correlation Coefficients in Reliability Studies with Binary

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Characterizing Forecast Uncertainty Prediction Intervals. The estimated AR (and VAR) models generate point forecasts of y t+s, y ˆ

Characterizing Forecast Uncertainty Prediction Intervals. The estimated AR (and VAR) models generate point forecasts of y t+s, y ˆ Characterizing Forecast Uncertainty Prediction Intervals The estimated AR (and VAR) models generate point forecasts of y t+s, y ˆ t + s, t. Under our assumptions the point forecasts are asymtotically unbiased

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,

More information

A Tolerance Interval Approach for Assessment of Agreement in Method Comparison Studies with Repeated Measurements

A Tolerance Interval Approach for Assessment of Agreement in Method Comparison Studies with Repeated Measurements A Tolerance Interval Approach for Assessment of Agreement in Method Comparison Studies with Repeated Measurements Pankaj K. Choudhary 1 Department of Mathematical Sciences, University of Texas at Dallas

More information

Confidence Estimation Methods for Neural Networks: A Practical Comparison

Confidence Estimation Methods for Neural Networks: A Practical Comparison , 6-8 000, Confidence Estimation Methods for : A Practical Comparison G. Papadopoulos, P.J. Edwards, A.F. Murray Department of Electronics and Electrical Engineering, University of Edinburgh Abstract.

More information

Kappa Coefficients for Circular Classifications

Kappa Coefficients for Circular Classifications Journal of Classification 33:507-522 (2016) DOI: 10.1007/s00357-016-9217-3 Kappa Coefficients for Circular Classifications Matthijs J. Warrens University of Groningen, The Netherlands Bunga C. Pratiwi

More information

Intraclass Correlations in One-Factor Studies

Intraclass Correlations in One-Factor Studies CHAPTER Intraclass Correlations in One-Factor Studies OBJECTIVE The objective of this chapter is to present methods and techniques for calculating the intraclass correlation coefficient and associated

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Package agree. R topics documented: July 7, Title Various Methods for Measuring Agreement Version Author Dai Feng

Package agree. R topics documented: July 7, Title Various Methods for Measuring Agreement Version Author Dai Feng Title Various Methods for Measuring Agreement Version 0.5-0 Author Dai Feng Package agree July 7, 2016 Bland-Altman plot and scatter plot with identity line for visualization and point and interval estimates

More information

18 Bivariate normal distribution I

18 Bivariate normal distribution I 8 Bivariate normal distribution I 8 Example Imagine firing arrows at a target Hopefully they will fall close to the target centre As we fire more arrows we find a high density near the centre and fewer

More information

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study Science Journal of Applied Mathematics and Statistics 2014; 2(1): 20-25 Published online February 20, 2014 (http://www.sciencepublishinggroup.com/j/sjams) doi: 10.11648/j.sjams.20140201.13 Robust covariance

More information

Bootstrap, Jackknife and other resampling methods

Bootstrap, Jackknife and other resampling methods Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)

More information

MULTIVARIATE DISTRIBUTIONS

MULTIVARIATE DISTRIBUTIONS Chapter 9 MULTIVARIATE DISTRIBUTIONS John Wishart (1898-1956) British statistician. Wishart was an assistant to Pearson at University College and to Fisher at Rothamsted. In 1928 he derived the distribution

More information

Lecture 01: Introduction

Lecture 01: Introduction Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction

More information

Describing Contingency tables

Describing Contingency tables Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds

More information

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance

Covariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance Covariance Lecture 0: Covariance / Correlation & General Bivariate Normal Sta30 / Mth 30 We have previously discussed Covariance in relation to the variance of the sum of two random variables Review Lecture

More information

UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

More information

Categorical Data Analysis Chapter 3

Categorical Data Analysis Chapter 3 Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Libraries 1997-9th Annual Conference Proceedings ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Eleanor F. Allan Follow this and additional works at: http://newprairiepress.org/agstatconference

More information

Ayfer E. Yilmaz 1*, Serpil Aktas 2. Abstract

Ayfer E. Yilmaz 1*, Serpil Aktas 2. Abstract 89 Kuwait J. Sci. Ridit 45 (1) and pp exponential 89-99, 2018type scores for estimating the kappa statistic Ayfer E. Yilmaz 1*, Serpil Aktas 2 1 Dept. of Statistics, Faculty of Science, Hacettepe University,

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES

ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES ON SMALL SAMPLE PROPERTIES OF PERMUTATION TESTS: INDEPENDENCE BETWEEN TWO SAMPLES Hisashi Tanizaki Graduate School of Economics, Kobe University, Kobe 657-8501, Japan e-mail: tanizaki@kobe-u.ac.jp Abstract:

More information

Inter-Rater Agreement

Inter-Rater Agreement Engineering Statistics (EGC 630) Dec., 008 http://core.ecu.edu/psyc/wuenschk/spss.htm Degree of agreement/disagreement among raters Inter-Rater Agreement Psychologists commonly measure various characteristics

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Journal of Modern Applied Statistical Methods Volume 4 Issue Article 8 --5 Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Sudhir R. Paul University of

More information

The concord Package. August 20, 2006

The concord Package. August 20, 2006 The concord Package August 20, 2006 Version 1.4-6 Date 2006-08-15 Title Concordance and reliability Author , Ian Fellows Maintainer Measures

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

An Overview of Methods in the Analysis of Dependent Ordered Categorical Data: Assumptions and Implications

An Overview of Methods in the Analysis of Dependent Ordered Categorical Data: Assumptions and Implications WORKING PAPER SERIES WORKING PAPER NO 7, 2008 Swedish Business School at Örebro An Overview of Methods in the Analysis of Dependent Ordered Categorical Data: Assumptions and Implications By Hans Högberg

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

STAT Section 2.1: Basic Inference. Basic Definitions

STAT Section 2.1: Basic Inference. Basic Definitions STAT 518 --- Section 2.1: Basic Inference Basic Definitions Population: The collection of all the individuals of interest. This collection may be or even. Sample: A collection of elements of the population.

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial

More information

SAMPLE SIZE AND OPTIMAL DESIGNS FOR RELIABILITY STUDIES

SAMPLE SIZE AND OPTIMAL DESIGNS FOR RELIABILITY STUDIES STATISTICS IN MEDICINE, VOL. 17, 101 110 (1998) SAMPLE SIZE AND OPTIMAL DESIGNS FOR RELIABILITY STUDIES S. D. WALTER, * M. ELIASZIW AND A. DONNER Department of Clinical Epidemiology and Biostatistics,

More information

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM)

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Alexandr Malusek Division of Radiological Sciences Department of Medical and Health Sciences Linköping University 2014-04-15

More information

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test

More information

Lecture 9: Classification, LDA

Lecture 9: Classification, LDA Lecture 9: Classification, LDA Reading: Chapter 4 STATS 202: Data mining and analysis October 13, 2017 1 / 21 Review: Main strategy in Chapter 4 Find an estimate ˆP (Y X). Then, given an input x 0, we

More information

A Bivariate Weibull Regression Model

A Bivariate Weibull Regression Model c Heldermann Verlag Economic Quality Control ISSN 0940-5151 Vol 20 (2005), No. 1, 1 A Bivariate Weibull Regression Model David D. Hanagal Abstract: In this paper, we propose a new bivariate Weibull regression

More information

Reports of the Institute of Biostatistics

Reports of the Institute of Biostatistics Reports of the Institute of Biostatistics No 02 / 2008 Leibniz University of Hannover Natural Sciences Faculty Title: Properties of confidence intervals for the comparison of small binomial proportions

More information

Web-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach

Web-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach 1 Web-based Supplementary Material for Dependence Calibration in Conditional Copulas: A Nonparametric Approach Elif F. Acar, Radu V. Craiu, and Fang Yao Web Appendix A: Technical Details The score and

More information

COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS

COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS M. Gasparini and J. Eisele 2 Politecnico di Torino, Torino, Italy; mauro.gasparini@polito.it

More information

Lecture 9: Classification, LDA

Lecture 9: Classification, LDA Lecture 9: Classification, LDA Reading: Chapter 4 STATS 202: Data mining and analysis October 13, 2017 1 / 21 Review: Main strategy in Chapter 4 Find an estimate ˆP (Y X). Then, given an input x 0, we

More information

A bias improved estimator of the concordance correlation coefficient

A bias improved estimator of the concordance correlation coefficient The 22 nd Annual Meeting in Mathematics (AMM 217) Department of Mathematics, Faculty of Science Chiang Mai University, Chiang Mai, Thailand A bias improved estimator of the concordance correlation coefficient

More information

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X F, where F (x) = Pr(X x) is a distribution function, and we wish to estimate

More information

Math 180B Problem Set 3

Math 180B Problem Set 3 Math 180B Problem Set 3 Problem 1. (Exercise 3.1.2) Solution. By the definition of conditional probabilities we have Pr{X 2 = 1, X 3 = 1 X 1 = 0} = Pr{X 3 = 1 X 2 = 1, X 1 = 0} Pr{X 2 = 1 X 1 = 0} = P

More information

Tolerance limits for a ratio of normal random variables

Tolerance limits for a ratio of normal random variables Tolerance limits for a ratio of normal random variables Lanju Zhang 1, Thomas Mathew 2, Harry Yang 1, K. Krishnamoorthy 3 and Iksung Cho 1 1 Department of Biostatistics MedImmune, Inc. One MedImmune Way,

More information

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx INDEPENDENCE, COVARIANCE AND CORRELATION Independence: Intuitive idea of "Y is independent of X": The distribution of Y doesn't depend on the value of X. In terms of the conditional pdf's: "f(y x doesn't

More information

Rank parameters for Bland Altman plots

Rank parameters for Bland Altman plots Rank parameters for Bland Altman plots Roger B. Newson May 2, 8 Introduction Bland Altman plots were introduced by Altman and Bland (983)[] and popularized by Bland and Altman (986)[2]. Given N bivariate

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

Estimation and sample size calculations for correlated binary error rates of biometric identification devices

Estimation and sample size calculations for correlated binary error rates of biometric identification devices Estimation and sample size calculations for correlated binary error rates of biometric identification devices Michael E. Schuckers,11 Valentine Hall, Department of Mathematics Saint Lawrence University,

More information

PROBABILITY THEORY REVIEW

PROBABILITY THEORY REVIEW PROBABILITY THEORY REVIEW CMPUT 466/551 Martha White Fall, 2017 REMINDERS Assignment 1 is due on September 28 Thought questions 1 are due on September 21 Chapters 1-4, about 40 pages If you are printing,

More information

Tests for Assessment of Agreement Using Probability Criteria

Tests for Assessment of Agreement Using Probability Criteria Tests for Assessment of Agreement Using Probability Criteria Pankaj K. Choudhary Department of Mathematical Sciences, University of Texas at Dallas Richardson, TX 75083-0688; pankaj@utdallas.edu H. N.

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

BIOL 4605/7220 CH 20.1 Correlation

BIOL 4605/7220 CH 20.1 Correlation BIOL 4605/70 CH 0. Correlation GPT Lectures Cailin Xu November 9, 0 GLM: correlation Regression ANOVA Only one dependent variable GLM ANCOVA Multivariate analysis Multiple dependent variables (Correlation)

More information

Final Exam. Economics 835: Econometrics. Fall 2010

Final Exam. Economics 835: Econometrics. Fall 2010 Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each

More information

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical

More information

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach The 8th Tartu Conference on MULTIVARIATE STATISTICS, The 6th Conference on MULTIVARIATE DISTRIBUTIONS with Fixed Marginals Modelling Dropouts by Conditional Distribution, a Copula-Based Approach Ene Käärik

More information

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS Ivy Liu and Dong Q. Wang School of Mathematics, Statistics and Computer Science Victoria University of Wellington New Zealand Corresponding

More information

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career. Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences

More information

Outline. Confidence intervals More parametric tests More bootstrap and randomization tests. Cohen Empirical Methods CS650

Outline. Confidence intervals More parametric tests More bootstrap and randomization tests. Cohen Empirical Methods CS650 Outline Confidence intervals More parametric tests More bootstrap and randomization tests Parameter Estimation Collect a sample to estimate the value of a population parameter. Example: estimate mean age

More information