DISCUSSION PAPER


INSTITUT DE STATISTIQUE, BIOSTATISTIQUE ET SCIENCES ACTUARIELLES (ISBA)
UNIVERSITÉ CATHOLIQUE DE LOUVAIN

DISCUSSION PAPER 2012/39

Hyperbolic confidence bands of errors-in-variables regression lines applied to method comparison studies

FRANCQ, B. and GOVAERTS, B.

Submission: Journal de la Société Française de Statistique

Hyperbolic confidence bands of errors-in-variables regression lines applied to method comparison studies

Bernard G. Francq 1 and Bernadette B. Govaerts 2

Abstract: This paper focuses on the confidence bands of errors-in-variables regression lines applied in the context of the comparison of measurement methods, with or without replicated data. In method comparison studies, the goal is to prove that two measurement methods are equivalent. In that case, there is no analytical bias between them and they must provide the same results on average, notwithstanding the measurement errors. Usually, there is no gold standard and each measurement method is compared to the other one, and vice versa, with a linear regression analysis. If there is no bias, the results should not be far from the identity line Y = X, with a slope (β) equal to 1 and an intercept (α) equal to 0. A joint-CI is ideally used to test this joint hypothesis, and the measurement errors on both axes must be taken into account in the regression analysis by applying a linear errors-in-variables regression. As the very well-known OLS regression assumes that there is no error on the X-axis, it provides a biased estimated line. The DR (Deming Regression) and the BLS (Bivariate Least Square regression) are more general regressions which take into account the precision ratio (λ) of the two measurement methods. DR and BLS provide consistent estimated lines, which coincide in the case of homoscedasticity. Joint-CIs with the shape of an ellipse in the (β, α) plane computed with such regressions (DR or BLS) already exist in the literature, but we propose to transform these joint-CIs into hyperbolic confidence bands for the line in the (X, Y) plane, which are easier to display and interpret. Four methodologies are proposed based on previous papers and their properties and advantages are discussed. The proposed confidence bands are mathematically equivalent to the ellipses, but a detailed comparison of the coverage probabilities is provided with simulations and real data. When the error variances are known, the obtained coverage probabilities are very close to each other, but the joint-CIs computed with maximum likelihood (ML) or the method of moments provide slightly better coverage probabilities. Under unknown and heteroscedastic error variances, the ML coverage probabilities drop drastically while the BLS provides better coverage probabilities.

Keywords: confidence bands, errors-in-variables regression, method comparison study, equivalence of measurement methods, Deming regression, bivariate least squares

AMS 2000 subject classifications: , ,

1. Introduction

The need of industries to quickly assess the quality of products leads to the development and improvement of new measurement methods, sometimes faster, easier to handle, less expensive or more accurate than the reference method. These alternative methods should ideally lead to results comparable to those obtained by a standard method [35], in such a way that there is no bias between the two methods or that these measurement methods can be interchangeable.

1 Université Catholique de Louvain, Institut de Statistique, Biostatistique et Sciences Actuarielles, Belgium. bernard.g.francq@uclouvain.be
2 Université Catholique de Louvain, Institut de Statistique, Biostatistique et Sciences Actuarielles, Belgium. bernadette.govaerts@uclouvain.be

There are different approaches proposed in the literature to deal with method comparison studies. Firstly, the best known and most widely used is certainly the approach proposed by Bland and Altman, which focuses directly on the differences between the results of the two measurement methods [1, 4, 5]. Secondly, the approach based on regression analysis (or linear functional relationship [18]) is also widely applied and focuses on the regression parameter estimates and their confidence intervals (CI) [28]. Each approach has its own advantages and disadvantages, the first being more intuitive for the user and the second providing a stronger statistical basis. This paper focuses on the regression approach. Practically, to test statistical equivalence between two measurement methods, a certain characteristic of a series of samples (or subjects) is measured by the two methods over the experimental domain of interest. The pairs of measures taken by the reference method and the alternative one are modelled by a regression line and the parameter estimates are used to test the equivalence. Indeed, an intercept significantly different from zero indicates a systematic analytical bias between the methods, and a slope significantly different from one indicates a proportional bias [28]. To achieve this correctly, it is essential to take into account the errors on both axes and the heteroscedasticity if necessary [28]. Various types of regressions exist to deal with this problem [31]. In this paper, we base the equivalence test on the joint-CI and propose to transform the classical ellipse computed with the bivariate distribution of the parameters into hyperbolic confidence bands for the regression line, with replicated or unreplicated data. We don't consider confidence bands of other shapes [25, 24], like two straight lines around the estimated line, as the hyperbolic curve is the only correct way to fit the distribution of estimated lines. We'll first review, explain and compare the joint-CI and the confidence bands of the very well-known OLS regression. Then, we'll review four different approaches to compute a joint-CI with an ellipse and we'll transform these joint-CIs into confidence bands. Non-parametric and bootstrap confidence bands for errors-in-variables regressions are available in the literature [6] but, to our knowledge, the confidence bands given in this paper are novel. The systolic blood pressure data published by Bland and Altman [5] are used to illustrate these techniques.

2. The model and the goal of method equivalence testing

2.1. What is method equivalence and which data are needed to test it?

In the systolic blood pressure data [5], simultaneous measurements were performed using a sphygmomanometer and a semi-automatic blood pressure monitor in order to study whether these two devices are equivalent or not. The literature does not provide a clear and unique definition of the equivalence concept. The statistical equivalence approach will, as a priority, test whether the two devices are equivalent notwithstanding the measurement errors, or whether there is a bias between the two devices. Additionally, a statistical test can be performed to compare the accuracies of the two methods. The practical equivalence approach will not focus on statistical parameters (bias and variance), but will consider two methods equivalent when one device can be substituted for the other without affecting the decision taken from the measurement result.
In this paper, we'll focus mainly on the statistical question and test whether there is a bias between the two devices. The most classical design in method comparison studies consists of measuring each sample (or subject) once by both devices. Unfortunately, with unreplicated data, it is not possible to estimate

the variances of the measurement errors if needed, and a design with replicated data is more suitable. If the accuracy of a reference measurement method (gold standard) is known, each sample (or subject) can, more simply, be measured only once by this method and several times by the new method. In the systolic blood pressure data, the design is quite simple: three sets of readings were made in quick succession by both devices on each patient.

2.2. The general model

To compare two measurement methods, we suppose that a parameter of interest is measured on $N$ samples ($i = 1, 2, \dots, N$) or subjects by both methods [3, 11, 26]:

$$X_{ij} = \xi_i + \tau_{ij}, \qquad Y_{ij} = \eta_i + \nu_{ij} \qquad (1)$$

The $X_{ij}$ ($j = 1, 2, \dots, n_{X_i}$) and $Y_{ij}$ ($j = 1, 2, \dots, n_{Y_i}$) are the repeated measures for sample $i$ by methods X and Y respectively; $n_{X_i}$ and $n_{Y_i}$ are the numbers of repeated measures of sample $i$ by each method. $\xi_i$ and $\eta_i$ are the true but unobservable values of the parameter of interest for both methods, which we'll assume to be linked by a linear relation [3, 11, 26]:

$$\eta_i = \alpha + \beta \xi_i \qquad (2)$$

Let's define $\bar X_i$ and $\bar Y_i$, the means of the repeated measures for a given sample:

$$\bar X_i = \frac{1}{n_{X_i}} \sum_{j=1}^{n_{X_i}} X_{ij} \quad \text{and} \quad \bar Y_i = \frac{1}{n_{Y_i}} \sum_{j=1}^{n_{Y_i}} Y_{ij} \qquad (3)$$

$\tau_{ij}$ and $\nu_{ij}$ are the measurement errors, which are assumed to be independent and normally distributed (with constant variances in the case of homoscedasticity):

$$\begin{pmatrix} \tau_{ij} \\ \nu_{ij} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2_{\tau_i} & 0 \\ 0 & \sigma^2_{\nu_i} \end{pmatrix} \right) \qquad (4)$$

So the means of the repeated measures are also normally distributed:

$$\begin{pmatrix} \bar X_i \\ \bar Y_i \end{pmatrix} \sim N\left( \begin{pmatrix} \xi_i \\ \eta_i \end{pmatrix}, \begin{pmatrix} \sigma^2_{\tau_i}/n_{X_i} & 0 \\ 0 & \sigma^2_{\nu_i}/n_{Y_i} \end{pmatrix} \right) \qquad (5)$$

When the variances $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are unknown, they can be estimated, if repeated measures are available, by $S^2_{\tau_i}$ and $S^2_{\nu_i}$ respectively:

$$S^2_{\tau_i} = \frac{1}{n_{X_i}-1} \sum_{j=1}^{n_{X_i}} (X_{ij} - \bar X_i)^2 \quad \text{and} \quad S^2_{\nu_i} = \frac{1}{n_{Y_i}-1} \sum_{j=1}^{n_{Y_i}} (Y_{ij} - \bar Y_i)^2 \qquad (6)$$

In this paper, we'll also use the notations:

$$\bar X = \frac{1}{N}\sum_{i=1}^{N} \bar X_i, \quad \bar Y = \frac{1}{N}\sum_{i=1}^{N} \bar Y_i, \quad S_{xx} = \sum_{i=1}^{N} (\bar X_i - \bar X)^2, \quad S_{yy} = \sum_{i=1}^{N} (\bar Y_i - \bar Y)^2 \quad \text{and} \quad S_{xy} = \sum_{i=1}^{N} (\bar X_i - \bar X)(\bar Y_i - \bar Y).$$
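As an aside for readers who wish to reproduce these quantities numerically, the short Python sketch below computes the per-sample means (3), the within-sample variances (6) and the sums $S_{xx}$, $S_{yy}$ and $S_{xy}$ from replicated data. The function name and the data layout (one array of replicates per sample) are our own illustrative choices, not notation from the paper.

```python
import numpy as np

def summary_stats(X, Y):
    """Summary statistics of Section 2.2 for replicated data.

    X, Y : lists of 1-D arrays; X[i] holds the n_Xi replicates of sample i
    by method X, and Y[i] the n_Yi replicates by method Y.
    """
    Xbar = np.array([x.mean() for x in X])           # per-sample means (3)
    Ybar = np.array([y.mean() for y in Y])
    S2_tau = np.array([x.var(ddof=1) for x in X])    # within-sample variances (6)
    S2_nu = np.array([y.var(ddof=1) for y in Y])
    Sxx = np.sum((Xbar - Xbar.mean()) ** 2)
    Syy = np.sum((Ybar - Ybar.mean()) ** 2)
    Sxy = np.sum((Xbar - Xbar.mean()) * (Ybar - Ybar.mean()))
    return Xbar, Ybar, S2_tau, S2_nu, Sxx, Syy, Sxy
```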

2.3. The homoscedastic model

In the homoscedastic model, we consider that both measurement methods have a constant accuracy over the domain of interest, i.e., $\sigma^2_{\tau_i} = \sigma^2_\tau$ and $\sigma^2_{\nu_i} = \sigma^2_\nu$ $\forall i$. Moreover, as we'll regress the mean measures $\bar Y_i$ on $\bar X_i$, we also assume the numbers of replicates to be constant ($n_{X_i} = n_X$ and $n_{Y_i} = n_Y$ $\forall i$), to prevent the model from becoming heteroscedastic even when the accuracies of the measurement methods are constant. In this case of homoscedasticity, the variances $S^2_{\tau_i}$ and $S^2_{\nu_i}$ are estimates of $\sigma^2_\tau$ and $\sigma^2_\nu$, and we can define global estimates of $\sigma^2_\tau$ and $\sigma^2_\nu$, respectively $S^2_\tau$ and $S^2_\nu$:

$$S^2_\tau = \frac{\sum_{i=1}^{N} (n_{X_i}-1) S^2_{\tau_i}}{\sum_{i=1}^{N} n_{X_i} - N} \quad \text{and} \quad S^2_\nu = \frac{\sum_{i=1}^{N} (n_{Y_i}-1) S^2_{\nu_i}}{\sum_{i=1}^{N} n_{Y_i} - N}$$

Hence, with a constant number of repeated measures ($n_{X_i} = n_X$ and $n_{Y_i} = n_Y$ $\forall i$):

$$S^2_\tau = \frac{1}{N}\sum_{i=1}^{N} S^2_{\tau_i} \quad \text{and} \quad S^2_\nu = \frac{1}{N}\sum_{i=1}^{N} S^2_{\nu_i} \qquad (7)$$

2.4. How to test the equivalence?

If the two measurement methods are equivalent, they should give the same results for a given sample, notwithstanding the measurement errors. In the model notations, method equivalence then means $\xi_i = \eta_i$ $\forall i$ [28, 33]. In practice, due to the measurement errors, these parameters are unobservable and the equivalence test will be based on the following regression model:

$$\bar Y_i = \alpha + \beta \bar X_i + \varepsilon_i \quad \text{with} \quad \varepsilon_i \sim N(0, \sigma^2_{\varepsilon_i}) \quad \text{and} \quad \sigma^2_{\varepsilon_i} = \frac{\sigma^2_{\nu_i}}{n_{Y_i}} + \beta^2 \frac{\sigma^2_{\tau_i}}{n_{X_i}}, \qquad (8)$$

where $\alpha$, the intercept, and $\beta$, the slope, are estimated respectively by $\hat\alpha$ and $\hat\beta$. This regression model is applied to the averages of the repeated measures because individual measures cannot be paired. Many practitioners wonder which measurement method to put on the X-axis and which on the Y-axis. Actually, the variables X and Y play similar roles [2] and, if the regression line is estimated adequately (by taking the errors on both axes into account), the coordinate system should not matter [34]. The estimated parameters $\hat\alpha$ and $\hat\beta$ provide the information needed to assess equivalence. Indeed, an intercept significantly different from 0 means that there is a constant bias between the two measurement methods, and a slope significantly different from 1 means that there is a proportional bias between them [28]. So, the following two-sided hypotheses will be used to test method equivalence:

$$H_0^\alpha: \alpha = 0, \ H_1^\alpha: \alpha \neq 0 \quad \text{and} \quad H_0^\beta: \beta = 1, \ H_1^\beta: \beta \neq 1 \qquad (9)$$

The null hypothesis $H_0^\alpha: \alpha = 0$ is rejected if 0 isn't included inside a confidence interval (CI) for $\alpha$, and the null hypothesis $H_0^\beta: \beta = 1$ is rejected if 1 isn't included inside a CI for $\beta$. The two measurement methods can be called average equivalent if the hypotheses $H_0^\beta: \beta = 1$ and $H_0^\alpha:$

$\alpha = 0$ are not rejected [33]. However, these tests can also be applied jointly by checking whether the point (1, 0) is included or not in a joint-CI for the regression coefficients $\theta = (\alpha, \beta)'$, which is, classically, a confidence ellipse. This paper focuses on this joint hypothesis:

$$H_0: \theta = (0, 1)' \quad \text{and} \quad H_1: \theta \neq (0, 1)' \qquad (10)$$

and proposes simultaneous confidence bands (CB) for the regression line $Y = \alpha + \beta X = x'\theta$ over $X \in (-\infty, \infty)$, where $x = (1, X)'$, which are equivalent to the confidence ellipses. The hypothesis $H_0: \theta = (0, 1)'$ is rejected if the line $Y = X$ intercepts the CB, and not rejected if the line lies entirely inside the CB.

3. The OLS regression

This section reviews results on the classical estimation of a regression line by OLS when X is observed without errors. These will be crucial for the developments in further sections on errors-in-variables models. In particular, the not so well-known concept of confidence bands (CB) around the regression line is reviewed. It is shown that it is equivalent to the joint confidence interval on the regression parameters, and that both can be used for equivalence testing.

3.1. Ordinary Least Squares (OLS) regression estimators

The easiest way to estimate the parameters $\alpha$ and $\beta$ of model (8) in the case of homoscedasticity is to use the very well-known technique of Ordinary Least Squares [13, 17]: OLS. The OLS regression minimizes the criterion $C_{OLS}$, the sum of squared vertical distances (residuals) between each point and the line, as shown in Figure 1:

$$C_{OLS} = \sum_{i=1}^{N} (\bar Y_i - \hat\alpha - \hat\beta \bar X_i)^2,$$

and the corresponding parameter estimators are given by the following formulas:

$$\hat\beta_{OLS} = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad \hat\alpha_{OLS} = \bar Y - \hat\beta_{OLS} \bar X \qquad (11)$$

Unfortunately, the OLS minimization criterion doesn't take the errors in the independent variable into account [7]. In other words, OLS assumes that there is no error given by the measurement method on the X-axis, i.e., the $\tau_{ij}$ are supposed to be zero or negligible in (1). The corresponding estimates are therefore biased [7] and the bias is related to the variance ratio $\sigma^2_\tau / \sigma^2_\xi$ [10]. For large sample sizes, the estimated parameters provided by OLS converge to [10, 16]:

$$\hat\beta_{OLS} \to \frac{1}{1 + (\sigma^2_\tau/n_X)/\sigma^2_\xi}\, \beta, \qquad \hat\alpha_{OLS} \to \alpha + \frac{(\sigma^2_\tau/n_X)/\sigma^2_\xi}{1 + (\sigma^2_\tau/n_X)/\sigma^2_\xi}\, \bar X \beta.$$
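The attenuation of the OLS slope is easy to visualise by simulation. The following sketch (our own illustration, with arbitrarily chosen variances and $n_X = n_Y = 1$) regresses Y on an error-contaminated X and compares the average fitted slope with the limit above.

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha, beta = 50, 0.0, 1.0
sigma2_tau, sigma2_nu = 2.0, 1.0        # arbitrary illustration values

slopes = []
for _ in range(5000):
    xi = rng.uniform(10, 20, N)                          # true values
    X = xi + rng.normal(0.0, np.sqrt(sigma2_tau), N)     # X observed with error
    Y = alpha + beta * xi + rng.normal(0.0, np.sqrt(sigma2_nu), N)
    slopes.append(np.polyfit(X, Y, 1)[0])                # OLS slope of Y on X

sigma2_xi = (20 - 10) ** 2 / 12                          # variance of U(10, 20)
print(np.mean(slopes))                                   # close to the attenuated value
print(beta / (1 + sigma2_tau / sigma2_xi))               # theoretical limit (n_X = 1)
```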

FIGURE 1. Illustration of the minimization criteria of the OLS and DR-BLS regressions

3.2. Confidence intervals computed from OLS estimators

Supposing that $\sigma^2_\tau = 0$, separate $100(1-\gamma)\%$ CIs for $\beta$ and $\alpha$ are symmetric around $\hat\beta_{OLS}$ and $\hat\alpha_{OLS}$ respectively and are computed as [8]:

$$CI(\beta_{OLS}): \hat\beta_{OLS} \pm t_{1-\gamma/2,\, N-2}\, S_{\hat\beta_{OLS}} \quad \text{with} \quad S^2_{\hat\beta_{OLS}} = \frac{S^2_{OLS}}{S_{xx}} \qquad (12)$$

where

$$S^2_{OLS} = \frac{1}{N-2} \sum_{i=1}^{N} (\bar Y_i - \hat\alpha_{OLS} - \hat\beta_{OLS} \bar X_i)^2,$$

$$CI(\alpha_{OLS}): \hat\alpha_{OLS} \pm t_{1-\gamma/2,\, N-2}\, S_{\hat\alpha_{OLS}} \quad \text{with} \quad S^2_{\hat\alpha_{OLS}} = S^2_{OLS} \left( \frac{1}{N} + \frac{\bar X^2}{S_{xx}} \right), \qquad (13)$$

where $t_{1-\gamma/2,\, N-2}$ is the $100(1-\gamma/2)\%$ percentile of a t-distribution with $N-2$ df. Additionally, the covariance between the slope and the intercept, $S_{\hat\alpha\hat\beta_{OLS}}$, is given by:

$$S_{\hat\alpha\hat\beta_{OLS}} = -\bar X \frac{S^2_{OLS}}{S_{xx}} = -\bar X S^2_{\hat\beta_{OLS}}$$

The joint-CI for $\alpha$ and $\beta$ is given by:

$$\begin{pmatrix} \hat\alpha_{OLS} - \alpha & \hat\beta_{OLS} - \beta \end{pmatrix} \begin{pmatrix} S^2_{\hat\alpha_{OLS}} & S_{\hat\alpha\hat\beta_{OLS}} \\ S_{\hat\alpha\hat\beta_{OLS}} & S^2_{\hat\beta_{OLS}} \end{pmatrix}^{-1} \begin{pmatrix} \hat\alpha_{OLS} - \alpha \\ \hat\beta_{OLS} - \beta \end{pmatrix} \leq 2 F_{1-\gamma,\, 2,\, N-2}, \qquad (14)$$

where $F_{1-\gamma,\, 2,\, N-2}$ is the $100(1-\gamma)\%$ percentile of the F-distribution with 2 and $N-2$ degrees of freedom. This joint-CI is an ellipse centered on the estimated parameters $\hat\theta = (\hat\alpha_{OLS}, \hat\beta_{OLS})$. When $|S_{\hat\alpha\hat\beta_{OLS}}|$ increases, the ellipse becomes narrower and collapses to a line; on the other hand, when $S_{\hat\alpha\hat\beta_{OLS}} \to 0$, the ellipse becomes wider until its major and minor axes become parallel to the

$\beta$-axis and $\alpha$-axis. These CIs (separate and joint) are exact under the assumptions of OLS, i.e., no errors on the X-axis ($\sigma^2_\tau = 0$) and normality of the $\varepsilon_i$. Finally, the equivalence between two measurement methods is rejected if:

$$\begin{pmatrix} \hat\alpha_{OLS} - 0 & \hat\beta_{OLS} - 1 \end{pmatrix} \begin{pmatrix} S^2_{\hat\alpha_{OLS}} & S_{\hat\alpha\hat\beta_{OLS}} \\ S_{\hat\alpha\hat\beta_{OLS}} & S^2_{\hat\beta_{OLS}} \end{pmatrix}^{-1} \begin{pmatrix} \hat\alpha_{OLS} - 0 \\ \hat\beta_{OLS} - 1 \end{pmatrix} > 2 F_{1-\gamma,\, 2,\, N-2}.$$

3.3. The confidence bands for the OLS line

The confidence band for the OLS line is a concept not very well known by non-statisticians, despite the many papers published on it in the past. Indeed, the confidence band for a line is often not available in classical software. The confidence bands are the CI for the regression line and should not be confused with the CI for the conditional mean $E(\hat Y_0 \mid X_0)$, given by the following well-known formula [8]:

$$\hat Y_0 \pm t_{1-\gamma/2,\, N-2}\, S_{\hat Y_0}, \quad \text{where} \quad \hat Y_0 = \hat\alpha_{OLS} + \hat\beta_{OLS} X_0$$

$$\text{and} \quad S^2_{\hat Y_0} = S^2_{\hat\alpha_{OLS}} + X_0^2 S^2_{\hat\beta_{OLS}} + 2 X_0 S_{\hat\alpha\hat\beta_{OLS}} = S^2_{OLS} \left( \frac{1}{N} + \frac{(X_0 - \bar X)^2}{S_{xx}} \right). \qquad (15)$$

This formula is computed point by point but can, of course, be displayed on a graph for all possible values of $X_0$, and it has a hyperbolic shape. The CI for the line $Y = \alpha + \beta X$ over $X \in (-\infty, \infty)$ (and not for a given $X_0$) relies on the bivariate dimension of a line, as we have to take the uncertainties of both parameters ($\alpha$ and $\beta$) into account jointly. The confidence band (CB) of the line $Y = \alpha + \beta X$ under the OLS assumptions is given by the following formula [22]:

$$(\hat\alpha_{OLS} + \hat\beta_{OLS} X) \pm \sqrt{2 F_{1-\gamma,\, 2,\, N-2}}\, \sqrt{S^2_{OLS} \left( \frac{1}{N} + \frac{(X - \bar X)^2}{S_{xx}} \right)}. \qquad (16)$$

Both formulas are similar, but an F-distribution is used for the confidence bands, based on the work of Working and Hotelling [36].

3.4. Comparison of the joint-CI and the confidence bands

Actually, the joint-CI on the regression coefficients and the confidence bands of the regression line are mathematically equivalent. In other words, to test the equivalence between two measurement methods, we can check whether the point $(\beta = 1, \alpha = 0)$ lies inside the ellipse, or check whether the identity line ($Y = X$) lies inside the confidence bands. Figure 2 displays a simulated data set ($N = 10$, $X_1 = 1, \dots, X_{10} = 10$, $Y_i = X_i + \varepsilon_i$ with $\varepsilon_i \sim N(0, 1)$) on the right, with the OLS line and the 95% CB. We can notice that the identity line (the dashed line) lies inside the CB, which means that the equivalence isn't rejected. On the left, the 95% joint-CI on $(\beta, \alpha)$ is displayed as an ellipse and the equivalence point ($\beta = 1$, $\alpha = 0$) lies inside the ellipse. More generally, if a given point $(\beta, \alpha)$ lies inside the ellipse, the corresponding line ($Y = \alpha + \beta X$) lies inside the CB and

FIGURE 2. Mathematical equivalence between the joint-CI on $(\beta, \alpha)$ (ellipse on the left) and the CB (hyperbolic band on the right) computed around the line

vice-versa. On the other hand, if a given point $(\beta, \alpha)$ lies outside the ellipse, the corresponding line intercepts the CB. For a given point $(\beta, \alpha)$ on the edge of the ellipse, the corresponding line is tangent to the CB. If we consider all the lines which lie inside the CB, it's easy to understand that the two extreme slopes correspond to the oblique asymptotes of the hyperbolic CB (red lines in Figure 2, right). These two extreme slopes correspond to the domain of the ellipse (red arrow in Figure 2, left) and it's easy to check that these slopes are:

$$\hat\beta_{OLS} \pm \sqrt{2 F_{1-\gamma,\, 2,\, N-2}}\, S_{\hat\beta_{OLS}}. \qquad (17)$$

From the CB, the slopes of the oblique asymptotes can be easily computed:

$$\lim_{X \to \pm\infty} \frac{(\hat\alpha_{OLS} + \hat\beta_{OLS} X) \pm \sqrt{2 F_{1-\gamma,\, 2,\, N-2}}\, \sqrt{S^2_{OLS} \left( \dfrac{1}{N} + \dfrac{(X - \bar X)^2}{S_{xx}} \right)}}{X} = \hat\beta_{OLS} \pm \sqrt{2 F_{1-\gamma,\, 2,\, N-2}}\, S_{\hat\beta_{OLS}}.$$

By analogy, it's easy to check that the image of the ellipse gives the two extreme intercepts (green arrow in Figure 2, left):

$$\hat\alpha_{OLS} \pm \sqrt{2 F_{1-\gamma,\, 2,\, N-2}}\, S_{\hat\alpha_{OLS}}, \qquad (18)$$

which correspond to the two intercepts of the CB (when $X = 0$). Finally, for any slope $\beta_0$ in the domain of the ellipse (17), there exist two lines tangent to the hyperbolic CB with slope $\beta_0$ and intercepts $\alpha_0^1$ and $\alpha_0^2$, which correspond on the ellipse to the two points $\alpha_0^1$ and $\alpha_0^2$ where the vertical line $\beta = \beta_0$ intercepts the ellipse (details not given). This justifies that the joint-CI (ellipse) and the CB (hyperbola) are mathematically equivalent.
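As a numerical check of this equivalence, the sketch below (our own code, not from the paper) applies the joint-CI test (14) and the CB test (16) to simulated data like those of Figure 2; the band is scanned on a wide but finite grid, so the second test is approximate.

```python
import numpy as np
from scipy.stats import f

def ols_equivalence_tests(X, Y, gamma=0.05):
    """Joint-CI test (14) and CB test (16) of (alpha, beta) = (0, 1)."""
    N = len(X)
    Sxx = np.sum((X - X.mean()) ** 2)
    beta = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
    alpha = Y.mean() - beta * X.mean()
    s2 = np.sum((Y - alpha - beta * X) ** 2) / (N - 2)
    # Covariance matrix of (alpha_hat, beta_hat)
    Sigma = s2 * np.array([[1 / N + X.mean() ** 2 / Sxx, -X.mean() / Sxx],
                           [-X.mean() / Sxx, 1 / Sxx]])
    c = 2 * f.ppf(1 - gamma, 2, N - 2)
    d = np.array([alpha - 0.0, beta - 1.0])
    ellipse_reject = d @ np.linalg.solve(Sigma, d) > c
    # Does the identity line Y = X leave the hyperbolic band anywhere?
    xg = np.linspace(X.min() - 100, X.max() + 100, 4001)
    half = np.sqrt(c * s2 * (1 / N + (xg - X.mean()) ** 2 / Sxx))
    band_reject = np.any(np.abs((alpha + beta * xg) - xg) > half)
    return ellipse_reject, band_reject

rng = np.random.default_rng(1)
X = np.arange(1.0, 11.0)
print(ols_equivalence_tests(X, X + rng.normal(0, 1, 10)))  # both results agree
```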

4. The errors-in-variables regressions under homoscedasticity

4.1. The estimators of the errors-in-variables regressions

This section presents the formulas of the regression estimators of the model $\bar Y_i = \alpha + \beta \bar X_i + \varepsilon_i$ when $\bar X_i$ is observed with error and the errors are assumed to be homoscedastic. We consider four methodologies published previously in the literature. The formulas are written with the notations presented in Section 2. Under homoscedasticity, these four methodologies provide identical estimates of the regression line, but the associated variance-covariance matrices of the parameters are different and lead to different CIs.

4.1.1. The Deming regression (DR) estimators

To take the errors on both axes into account, let's first define the ratio between the two error variances:

$$\lambda_{XY} = \frac{\sigma^2_\nu / n_Y}{\sigma^2_\tau / n_X} \qquad (19)$$

$\lambda_{XY}$ is the ratio of the error variance on the Y-axis to the error variance on the X-axis. Usually, the problem of replicated data isn't discussed in detail in the literature and a precision ratio $\lambda$ is defined by $\lambda = \sigma^2_\nu / \sigma^2_\tau$ [33] (or inversely [19, 20, 21, 22]). The Deming regression (DR) is the Maximum Likelihood (ML) solution of model (1) when $\lambda_{XY}$ is known [11]. In practice, if $\lambda_{XY}$ is unknown because $\sigma^2_\tau$ or/and $\sigma^2_\nu$ is/are unknown, it can be estimated with replicated data by replacing $\sigma^2_\tau$ and $\sigma^2_\nu$ with $S^2_\tau$ and $S^2_\nu$ respectively. We'll denote the estimator of $\lambda$ by $\hat\lambda = S^2_\nu / S^2_\tau$ (and that of $\lambda_{XY}$ by $\hat\lambda_{XY}$). The DR minimizes the criterion $C_{DR}$, the sum of squared oblique distances between each point and the line [22, 33], as shown in Figure 1; the direction is related to $\lambda_{XY}$, its slope being $-\lambda_{XY}/\hat\beta$ [33]. Specifically,

$$C_{DR} = \sum_{i=1}^{N} \left[ \left( \bar X_i - \frac{\bar Y_i + \lambda_{XY} \bar X_i / \hat\beta - \hat\alpha}{\hat\beta + \lambda_{XY}/\hat\beta} \right)^2 + \left( \bar Y_i - \hat\alpha - \hat\beta\, \frac{\bar Y_i + \lambda_{XY} \bar X_i / \hat\beta - \hat\alpha}{\hat\beta + \lambda_{XY}/\hat\beta} \right)^2 \right]$$

The ML estimators are given by:

$$\hat\beta_{DR} = \frac{S_{yy} - \lambda_{XY} S_{xx} + \sqrt{(S_{yy} - \lambda_{XY} S_{xx})^2 + 4 \lambda_{XY} S_{xy}^2}}{2 S_{xy}} \quad \text{and} \quad \hat\alpha_{DR} = \bar Y - \hat\beta_{DR} \bar X \qquad (20)$$

The assumption of the DR is the constancy of $\lambda_{XY}$. This assumption is fulfilled when we suppose homoscedasticity and a balanced design (i.e., $n_{X_i}$ and $n_{Y_i}$ are constant).

4.1.2. The Galea-Rojas et al. procedure

Galea-Rojas et al. [12] propose a regression model based on a paper previously published by Ripley and Thompson [30], where maximum likelihood is applied to take into account the errors

and heteroscedasticity on both axes. We present in this section the formulas in the case of homoscedasticity and with replicated data. The estimators of the parameters are:

$$\hat\beta_{GR} = \frac{\sum_{i=1}^{N} W_{GR}\, \hat x_i (\bar Y_i - \bar Y)}{\sum_{i=1}^{N} W_{GR}\, \hat x_i (\bar X_i - \bar X)} = \frac{\sum_{i=1}^{N} \hat x_i (\bar Y_i - \bar Y)}{\sum_{i=1}^{N} \hat x_i (\bar X_i - \bar X)} \quad \text{and} \quad \hat\alpha_{GR} = \bar Y - \hat\beta_{GR} \bar X \qquad (21)$$

where

$$W_{GR} = \frac{1}{\dfrac{\sigma^2_\nu}{n_Y} + \hat\beta_{GR}^2 \dfrac{\sigma^2_\tau}{n_X}} \quad \text{and} \quad \hat x_i = \frac{\dfrac{\sigma^2_\nu}{n_Y} \bar X_i + \hat\beta_{GR} \dfrac{\sigma^2_\tau}{n_X} (\bar Y_i - \hat\alpha_{GR})}{\dfrac{\sigma^2_\nu}{n_Y} + \hat\beta_{GR}^2 \dfrac{\sigma^2_\tau}{n_X}}$$

$\hat x_i$ is the abscissa of the projection of the $i$th point onto the line in the oblique direction defined in the previous section. It can be proven that $\hat\beta_{GR} = \hat\beta_{DR}$ and $\hat\alpha_{GR} = \hat\alpha_{DR}$ (as the GR procedure is more general than the DR). In practice, if $\sigma^2_\tau$ and $\sigma^2_\nu$ are unknown, they can be estimated with replicated data and replaced by $S^2_\tau$ and $S^2_\nu$ respectively.

4.1.3. The Bivariate Least Square regression: BLS

Bivariate Least Square regression, BLS, is a generic name, but this article refers to the papers published first by Lisý et al. [23] and later by other authors [28, 29, 9, 32]. BLS can take into account errors and heteroscedasticity on both axes and is usually written in matrix notation [23, 28, 29, 9, 32]. We present here its formulas in the case of homoscedasticity and with replicated data. The BLS minimizes the criterion $C_{BLS}$:

$$C_{BLS} = \frac{1}{W_{BLS}} \sum_{i=1}^{N} (\bar Y_i - \hat\alpha - \hat\beta \bar X_i)^2 = (N - 2)\, s^2_{BLS} \quad \text{with} \quad W_{BLS} = \sigma^2_\varepsilon = \frac{\sigma^2_\nu}{n_Y} + \hat\beta^2 \frac{\sigma^2_\tau}{n_X}$$

Practically, the estimates of the parameters (the vector $b$) are computed by iterating the following formula:

$$R b = g \qquad (22)$$

where

$$R = \frac{1}{W_{BLS}} \begin{pmatrix} N & \sum_{i=1}^{N} \bar X_i \\ \sum_{i=1}^{N} \bar X_i & \sum_{i=1}^{N} \bar X_i^2 \end{pmatrix}, \quad b = \begin{pmatrix} \hat\alpha_{BLS} \\ \hat\beta_{BLS} \end{pmatrix}$$

and

$$g = \frac{1}{W_{BLS}} \begin{pmatrix} \sum_{i=1}^{N} \bar Y_i \\ \sum_{i=1}^{N} \left[ \bar X_i \bar Y_i + \hat\beta_{BLS} \dfrac{\sigma^2_\tau}{n_X} \dfrac{(\bar Y_i - \hat\alpha_{BLS} - \hat\beta_{BLS} \bar X_i)^2}{W_{BLS}} \right] \end{pmatrix} \qquad (23)$$

If $\sigma^2_\tau$ and $\sigma^2_\nu$ are unknown, they can be estimated with replicated data and replaced by $S^2_\tau$ and $S^2_\nu$. Moreover, even if the parameters $\sigma^2_\tau$ and $\sigma^2_\nu$ appear separately in the formula, the solution $b$ only depends on the ratio $\lambda_{XY}$, and it can be proven (after some algebraic manipulations) that under homoscedasticity $\hat\beta_{BLS} = \hat\beta_{DR} = \hat\beta_{GR}$ and $\hat\alpha_{BLS} = \hat\alpha_{DR} = \hat\alpha_{GR}$.
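The closed-form estimator (20) is a one-liner in code. The sketch below assumes $\lambda_{XY}$ is supplied (known, or estimated by $\hat\lambda_{XY}$); the function name is ours.

```python
import numpy as np

def deming(Xbar, Ybar, lam_xy):
    """Deming regression intercept and slope, equation (20).

    Xbar, Ybar : per-sample means; lam_xy : (sigma2_nu/n_Y) / (sigma2_tau/n_X).
    """
    Sxx = np.sum((Xbar - Xbar.mean()) ** 2)
    Syy = np.sum((Ybar - Ybar.mean()) ** 2)
    Sxy = np.sum((Xbar - Xbar.mean()) * (Ybar - Ybar.mean()))
    beta = (Syy - lam_xy * Sxx
            + np.sqrt((Syy - lam_xy * Sxx) ** 2 + 4 * lam_xy * Sxy ** 2)) / (2 * Sxy)
    alpha = Ybar.mean() - beta * Xbar.mean()
    return alpha, beta
```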

4.1.4. The Mandel procedure

Mandel developed a regression in the context of inter-laboratory studies [27] that can take into account the correlation between the error terms (as he regressed the results of a given laboratory on the averages of the results of all laboratories). In the context of measurement methods, the errors $\tau_{ij}$ and $\nu_{ij}$ are uncorrelated, and it can be shown that, if the correlation term is set to zero, Mandel's regression is exactly equivalent to the DR. Mandel's procedure consists in transforming the $(X, Y)$ data into $(U_i = \bar X_i + k \bar Y_i,\ V_i = \bar Y_i - \hat\beta_{Mandel} \bar X_i)$ data such that $U$ has a very small error. The OLS regression is then applied to the $(U, V)$ data and transformed back into the $(X, Y)$ axes to finally get a regression line which takes into account the errors on both axes and their correlation. By using $\lambda_{XY}$ instead of $\lambda$ in Mandel's procedure (with uncorrelated errors), the formulas are:

$$\hat\beta_{Mandel} = \frac{S_{xy} + k S_{yy}}{S_{xx} + k S_{xy}} \quad \text{and} \quad \hat\alpha_{Mandel} = \bar Y - \hat\beta_{Mandel} \bar X \qquad (24)$$

$$\text{with} \quad k = \frac{\hat\beta_{Mandel}}{\lambda_{XY}}$$

$\hat\beta_{Mandel}$ can be computed by iterations or by solving a second-degree equation, and it is easy to check that $\hat\beta_{Mandel} = \hat\beta_{DR} = \hat\beta_{GR} = \hat\beta_{BLS}$ and $\hat\alpha_{Mandel} = \hat\alpha_{DR} = \hat\alpha_{GR} = \hat\alpha_{BLS}$. By analogy, we'll also use the notations:

$$\bar U = \frac{1}{N}\sum_{i=1}^{N} U_i, \quad \bar V = \frac{1}{N}\sum_{i=1}^{N} V_i, \quad S_{uu} = \sum_{i=1}^{N} (U_i - \bar U)^2, \quad S_{vv} = \sum_{i=1}^{N} (V_i - \bar V)^2 \quad \text{and} \quad S_{uv} = \sum_{i=1}^{N} (U_i - \bar U)(V_i - \bar V).$$

4.2. The joint-CI and confidence bands of the errors-in-variables regressions

This section considers four methodologies to compute the approximate covariance matrix $\hat\Sigma$ of the estimates $\hat\theta = (\hat\alpha, \hat\beta)'$ of the errors-in-variables regression coefficients $\theta$ presented in Section 4.1, in order to compute the joint-CI for $\theta$ or the CB for the regression line. The joint-CIs are all confidence ellipses centered on the same point (as the estimators given in the previous section are identical) but with different shapes, and they are all approximate. The confidence ellipse for $\theta$ can be computed with the following formula:

$$(\hat\theta - \theta)'\, \hat\Sigma^{-1}\, (\hat\theta - \theta) < c \qquad (25)$$

where the critical constant $c$ is chosen suitably, depending on whether $\chi^2_{1-\gamma,\, 2}$ (the $100(1-\gamma)\%$ percentile of a $\chi^2$ distribution with 2 df) or $2 F_{1-\gamma,\, 2,\, N-2}$ is used as the approximate distribution. This confidence ellipse can be represented equivalently as a CB:

$$x'\theta \in x'\hat\theta \pm \sqrt{c\, x' \hat\Sigma x} \qquad (26)$$
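Formulas (25) and (26) translate directly into a generic helper: whichever of the four variance-covariance matrices of the next subsections is plugged in, the same code produces the hyperbolic band and the equivalence test. This is our own sketch; the grid-based test of the identity line is approximate.

```python
import numpy as np

def confidence_band(theta_hat, Sigma_hat, c, xgrid):
    """Hyperbolic CB (26): x'theta_hat +/- sqrt(c * x' Sigma_hat x), x = (1, X)'."""
    alpha_hat, beta_hat = theta_hat
    half = np.sqrt(c * (Sigma_hat[0, 0]
                        + 2 * xgrid * Sigma_hat[0, 1]
                        + xgrid ** 2 * Sigma_hat[1, 1]))
    centre = alpha_hat + beta_hat * xgrid
    return centre - half, centre + half

def equivalence_not_rejected(theta_hat, Sigma_hat, c, xgrid):
    """H0: theta = (0, 1)' is not rejected if Y = X stays inside the CB."""
    lower, upper = confidence_band(theta_hat, Sigma_hat, c, xgrid)
    return bool(np.all((xgrid >= lower) & (xgrid <= upper)))
```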

4.2.1. The variance-covariance matrix of the estimators provided by DR

Gillard and Iles [14, 15] propose to compute the variance-covariance matrix of the estimators by the method of moments. When the ratio of error variances $\lambda_{XY}$ is assumed to be known, the variances and covariance of the estimators can be computed with the following formulas (which have been modified to take the replicated data into account):

$$S^2_{\hat\beta_{DR}} = \frac{S_{xx} S_{yy} - S_{xy}^2}{N S_{xy}^2}, \quad S^2_{\hat\alpha_{DR}} = \bar X^2 S^2_{\hat\beta_{DR}} + \frac{\hat\beta_{DR}^2\, \sigma^2_\tau / n_X + \sigma^2_\nu / n_Y}{N} \quad \text{and} \quad S_{\hat\alpha\hat\beta_{DR}} = -\bar X S^2_{\hat\beta_{DR}} \qquad (27)$$

If needed, the variances $\sigma^2_\tau$ and $\sigma^2_\nu$ can be replaced by $S^2_\tau$ and $S^2_\nu$. Based on the asymptotic normal distribution of the parameters $\hat\beta_{DR}$ and $\hat\alpha_{DR}$ (obtained by ML), we propose to use an F-distribution with $c = 2 F_{1-\gamma,\, 2,\, N-2}$ to prevent the joint-CI or CB from being too narrow for small sample sizes.

4.2.2. The variance-covariance matrix of the estimators provided by Galea-Rojas et al.

Galea-Rojas et al. [12] provide the asymptotic variance-covariance matrix of the parameters derived by ML. In the case of homoscedasticity and with replicated data, the asymptotic variance-covariance matrix of the parameters is given by:

$$\Sigma_{GR} = W_n^{-1} V_n W_n^{-1} / N, \qquad (28)$$

where

$$W_n = \frac{W_{GR}}{N} \begin{pmatrix} N & \sum_{i=1}^{N} \xi_i \\ \sum_{i=1}^{N} \xi_i & \sum_{i=1}^{N} \xi_i^2 \end{pmatrix} \quad \text{and} \quad V_n = W_n + \begin{pmatrix} 0 & 0 \\ 0 & k_{GR} \end{pmatrix}$$

with

$$k_{GR} = W_{GR} - C \quad \text{and} \quad C = \frac{1}{\sigma^2_\tau / n_X + \beta_{GR}^2\, \sigma^2_\nu / n_Y}$$

In practice, to get a consistent estimator of the variance-covariance matrix, $\beta_{GR}^2$ can be replaced by $\hat\beta_{GR}^2$, $\xi_i$ by $\hat x_i$ and $\xi_i^2$ by $\hat x_i^2 - 1/C$ [12]. From the Wald statistic given in the literature [12], the critical constant is $c = \chi^2_{1-\gamma,\, 2}$. The variances of the parameters can also be computed with the following equivalent formulas [12]:

$$\sigma^2_{\hat\beta_{GR}} = \frac{1}{SS_W} \left( 1 + \frac{N k_{GR}}{SS_W} \right), \quad \sigma^2_{\hat\alpha_{GR}} = \frac{1}{N W_{GR}} + \bar\xi^2\, \sigma^2_{\hat\beta_{GR}} \quad \text{and} \quad \sigma_{\hat\alpha\hat\beta_{GR}} = -\bar\xi\, \sigma^2_{\hat\beta_{GR}} \qquad (29)$$

with $SS_W = W_{GR} \sum_{i=1}^{N} (\xi_i - \bar\xi)^2$ and $\bar\xi = \frac{1}{N}\sum_{i=1}^{N} \xi_i$. In practice, $\sigma^2_{\hat\beta_{GR}}$, $\sigma^2_{\hat\alpha_{GR}}$ and $\sigma_{\hat\alpha\hat\beta_{GR}}$ can be estimated by replacing $\bar\xi$ by $\bar X$, and $SS_W$ by $W_{GR} \sum_{i=1}^{N} (\hat x_i^2 - C^{-1} - 2 \hat x_i \bar X + \bar X^2)$.
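A minimal sketch of formulas (27) follows, reusing the deming helper above; the sign of the covariance term is our assumption by analogy with the OLS case, so treat it as such.

```python
import numpy as np
from scipy.stats import f

def dr_covariance(Xbar, Ybar, beta_dr, s2_tau, s2_nu, n_x, n_y, gamma=0.05):
    """Variance-covariance matrix of (alpha_DR, beta_DR), formulas (27),
    and the critical constant c = 2 F(1-gamma; 2, N-2)."""
    N = len(Xbar)
    Sxx = np.sum((Xbar - Xbar.mean()) ** 2)
    Syy = np.sum((Ybar - Ybar.mean()) ** 2)
    Sxy = np.sum((Xbar - Xbar.mean()) * (Ybar - Ybar.mean()))
    var_b = (Sxx * Syy - Sxy ** 2) / (N * Sxy ** 2)
    var_a = Xbar.mean() ** 2 * var_b + (beta_dr ** 2 * s2_tau / n_x + s2_nu / n_y) / N
    cov_ab = -Xbar.mean() * var_b           # sign assumed by analogy with OLS
    Sigma = np.array([[var_a, cov_ab], [cov_ab, var_b]])
    return Sigma, 2 * f.ppf(1 - gamma, 2, N - 2)
```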

4.2.3. The variance-covariance matrix of the estimators provided by BLS

Riu and Rius [32] propose the following variance-covariance matrix for the BLS parameters:

$$\hat\Sigma_{BLS} = s^2_{BLS}\, R^{-1} \qquad (30)$$

or, equivalently:

$$S^2_{\hat\beta_{BLS}} = \frac{N W_{BLS}\, s^2_{BLS}}{N \sum_{i=1}^{N} \bar X_i^2 - \left( \sum_{i=1}^{N} \bar X_i \right)^2}, \quad S^2_{\hat\alpha_{BLS}} = \frac{W_{BLS}\, s^2_{BLS} \sum_{i=1}^{N} \bar X_i^2}{N \sum_{i=1}^{N} \bar X_i^2 - \left( \sum_{i=1}^{N} \bar X_i \right)^2} \quad \text{and} \quad S_{\hat\alpha\hat\beta_{BLS}} = -\bar X S^2_{\hat\beta_{BLS}} \qquad (31)$$

The critical constant $c$ is given by $2 F_{1-\gamma,\, 2,\, N-2}$ [12]. The disadvantages of this approximate ellipse (the theoretical background of this joint-CI isn't rigorous) can be found in the literature [12].

4.2.4. The variance-covariance matrix of the estimators provided by Mandel

With Mandel's procedure, the OLS technique is applied in the $(U, V)$ axes [27] and the variances of the parameters are computed with the formulas given in Section 3.2. The variances of the parameters $\hat\beta_{Mandel}$ and $\hat\alpha_{Mandel}$ are derived by the general formula for the propagation of errors from the reconversion to the $(X, Y)$ scales:

$$S^2_{\hat\beta_{Mandel}} = (1 + k \hat\beta_{Mandel})^2\, \frac{S^2_{e_{Mandel}}}{S_{uu}} \quad \text{and} \quad S^2_{\hat\alpha_{Mandel}} = \left( \frac{1}{N} + \frac{\bar X^2 (1 + k \hat\beta_{Mandel})^2}{S_{uu}} \right) S^2_{e_{Mandel}} \qquad (32)$$

$$\text{with} \quad S^2_{e_{Mandel}} = \frac{S_{vv}}{N - 2}$$

The covariance and the joint-CI are not provided by Mandel, but the covariance can also easily be computed by the formula for the propagation of errors:

$$S_{\hat\alpha\hat\beta_{Mandel}} = -\bar X S^2_{\hat\beta_{Mandel}} \qquad (33)$$

We propose to use an F-distribution with $c = 2 F_{1-\gamma,\, 2,\, N-2}$, as Mandel's procedure is based on the OLS technique.

4.3. Coverage probabilities of the joint confidence intervals or confidence bands

In order to compare the coverage probabilities of the joint-CIs or CBs provided by the four methodologies presented in the previous sections, we simulated samples with $N = 10, 20, 50$, with unreplicated data ($n_X = n_Y = 1$, $\lambda = \lambda_{XY}$ known) and under equivalence ($\alpha = 0$, $\beta = 1$, $\eta_i = \xi_i$) for the values of $\lambda_{XY}$ given in Table 1 (with $\sigma^2_\nu$ from 0.1 to 2 and $\sigma^2_\tau$ inversely from 2 to 0.1, providing 13 values of $\lambda$ from 0.05 to 20). Different values of $\eta_i$ were drawn randomly from a uniform distribution U(10, 20) for each simulated sample. The corresponding joint-CIs or CBs were computed for each simulated sample and the coverage probabilities (at a nominal level of 95%) computed per value of $\lambda_{XY}$. We also simulated replicated data to allow the estimation of $\sigma^2_\nu$ and $\sigma^2_\tau$: with an equal number of replicates ($n_X = n_Y = 2$ and $\lambda = \lambda_{XY}$, or $n_X = n_Y = 4$ and

TABLE 1. Values of $\sigma^2_\nu$ and $\sigma^2_\tau$ for the simulations with $n_X = n_Y = 1$ and the corresponding values of $\lambda$ and $\lambda_{XY}$

$\lambda = \lambda_{XY}$) and with an unequal number of replicates ($n_X = 4$, $n_Y = 2$ and $\lambda_{XY} = 2\lambda$), as shown in Table 2 (Appendix), in such a way that the values of $\lambda_{XY}$ are identical to those of the unreplicated case. To study the effect of the sample size in more detail, we also ran simulations with $\lambda_{XY} = 0.33, 1, 3$, and $N$ from 10 to 100 (10, 12, 14, 16, 20, 30, 50, 75, 100), with or without replicated data, as shown in Table 3 (Appendix). The coverage probabilities are displayed for $\lambda_{XY}$ known in Figure 3, with respect to $\lambda_{XY}$ (left, on a logarithmic scale) and to $N$ (right). Figures 4, 5 and 13 display the coverage probabilities corresponding to replicated data and estimated $\lambda_{XY}$. When the variances are known, all the coverage probabilities are between 93% and 96%, and closer to 95% when $\lambda_{XY} > 1$ for the GR, BLS and Mandel procedures. That's the reason why we recommend choosing the coordinate system such that $\lambda_{XY}$ is higher than 1 in practice. When $N$ increases, the GR and DR get closer to 95%, while the BLS and Mandel procedures provide slightly lower coverage probabilities. When the variances are estimated with replicated data, the coverage probabilities provided by the DR are slightly lower but are the closest to 95%. On the other hand, the coverage probabilities provided by the GR procedure drop because of the uncertainties on the estimated variances, but these coverage probabilities increase with $N$. For instance, when $n_X = n_Y = 2$ with $\lambda_{XY} = 1$, the GR outperforms the BLS for $N > 30$ (approximately). The BLS and Mandel methodologies provide similar coverage probabilities with known or unknown variances. In practice, the accuracy of one measurement method is rarely three times higher than that of the other, such that $\lambda_{XY}$ often lies between about 0.33 and 3. In this interval, the DR and GR are always slightly more suitable with known variances, while the DR is always more suitable with unknown variances.
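The simulation loop itself is short. The sketch below (our own simplified code, for the DR joint-CI only, with unreplicated data and $\lambda_{XY}$ known) reuses the deming and dr_covariance helpers from the previous sections; one run reproduces a single cell of the study.

```python
import numpy as np

rng = np.random.default_rng(2)
N, s2_tau, s2_nu = 20, 1.0, 1.0          # one illustrative cell (lambda_XY = 1)
lam_xy = (s2_nu / 1) / (s2_tau / 1)      # n_X = n_Y = 1

hits, n_sim = 0, 10000
for _ in range(n_sim):
    eta = rng.uniform(10, 20, N)                      # true values, alpha = 0, beta = 1
    X = eta + rng.normal(0, np.sqrt(s2_tau), N)
    Y = eta + rng.normal(0, np.sqrt(s2_nu), N)
    a, b = deming(X, Y, lam_xy)                       # sketch from Section 4.1.1
    Sigma, c = dr_covariance(X, Y, b, s2_tau, s2_nu, 1, 1)   # sketch from Section 4.2.1
    d = np.array([a - 0.0, b - 1.0])
    hits += d @ np.linalg.solve(Sigma, d) <= c        # is (0, 1)' inside the ellipse?

print(hits / n_sim)   # empirical coverage, to compare with the nominal 0.95
```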

FIGURE 3. Coverage probabilities of the joint-CI or CB related to $\lambda_{XY}$ on a logarithmic scale with $N$ = 10, 20, 50 (left) and related to $N$ for $\lambda_{XY}$ = 0.33, 1, 3 (right), ($n_X = n_Y = 1$)

FIGURE 4. Coverage probabilities of the joint-CI or CB related to $\lambda_{XY}$ on a logarithmic scale with $N$ = 10, 20, 50 (left) and related to $N$ for $\lambda_{XY}$ = 0.33, 1, 3 (right), ($n_X = n_Y = 2$)

FIGURE 5. Coverage probabilities of the joint-CI or CB related to $\lambda_{XY}$ on a logarithmic scale with $N$ = 10, 20, 50 (left) and related to $N$ for $\lambda_{XY}$ = 0.33, 1, 3 (right), ($n_X = n_Y = 4$)

5. The errors-in-variables regressions under heteroscedasticity

This section presents the formulas of the regression estimators of model (8) under heteroscedasticity, when the variances $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are known or estimated with replicates. Mandel's regression and the DR are no longer considered, as they are not suitable in the case of heteroscedasticity. The GR procedure and the BLS can take the heteroscedasticity into account and are identical for estimating the regression line, but the variance-covariance matrices of the parameters are computed differently. The covariance matrix $\hat\Sigma$ of the estimates $\hat\theta = (\hat\alpha, \hat\beta)'$ given in this section can be plugged into formulas (25) and (26) to get, respectively, the corresponding joint-CI (confidence ellipse) or CB (with the same critical constant $c$ given in Section 4.2).

5.1. The Galea-Rojas et al. procedure

Based on a paper published previously by Ripley and Thompson [30], the ML parameter estimators are:

$$\hat\beta_{GR} = \frac{\sum_{i=1}^{N} W_{i_{GR}}\, \hat x_i (\bar Y_i - \bar Y_W)}{\sum_{i=1}^{N} W_{i_{GR}}\, \hat x_i (\bar X_i - \bar X_W)} \quad \text{and} \quad \hat\alpha_{GR} = \bar Y_W - \hat\beta_{GR} \bar X_W, \qquad (34)$$

where

$$W_{i_{GR}} = \frac{1}{\dfrac{\sigma^2_{\nu_i}}{n_{Y_i}} + \hat\beta_{GR}^2 \dfrac{\sigma^2_{\tau_i}}{n_{X_i}}}, \quad \hat x_i = \frac{\dfrac{\sigma^2_{\nu_i}}{n_{Y_i}} \bar X_i + \hat\beta_{GR} \dfrac{\sigma^2_{\tau_i}}{n_{X_i}} (\bar Y_i - \hat\alpha_{GR})}{\dfrac{\sigma^2_{\nu_i}}{n_{Y_i}} + \hat\beta_{GR}^2 \dfrac{\sigma^2_{\tau_i}}{n_{X_i}}},$$

$$\bar X_W = \frac{\sum_{i=1}^{N} W_{i_{GR}} \bar X_i}{\sum_{i=1}^{N} W_{i_{GR}}} \quad \text{and} \quad \bar Y_W = \frac{\sum_{i=1}^{N} W_{i_{GR}} \bar Y_i}{\sum_{i=1}^{N} W_{i_{GR}}}$$

In practice, if $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are unknown, they can be estimated with replicated data and replaced by $S^2_{\tau_i}$ and $S^2_{\nu_i}$. The asymptotic variance-covariance matrix of the parameters derived by ML is given by Galea-Rojas et al. [12]:

$$\Sigma_{GR} = W_n^{-1} V_n W_n^{-1} / N, \qquad (35)$$

where

$$W_n = \frac{1}{N} \begin{pmatrix} \sum_{i=1}^{N} W_{i_{GR}} & \sum_{i=1}^{N} W_{i_{GR}} \xi_i \\ \sum_{i=1}^{N} W_{i_{GR}} \xi_i & \sum_{i=1}^{N} W_{i_{GR}} \xi_i^2 \end{pmatrix} \quad \text{and} \quad V_n = W_n + \begin{pmatrix} 0 & 0 \\ 0 & k_{GR} \end{pmatrix}$$

with

$$k_{GR} = \frac{1}{N} \sum_{i=1}^{N} \left( W_{i_{GR}} - C_i \right) \quad \text{and} \quad C_i = \frac{1}{\sigma^2_{\tau_i}/n_{X_i} + \beta_{GR}^2\, \sigma^2_{\nu_i}/n_{Y_i}}$$

In practice, to get a consistent estimator of the variance-covariance matrix, $\beta_{GR}^2$ can be replaced by $\hat\beta_{GR}^2$, $\xi_i$ by $\hat x_i$ and $\xi_i^2$ by $\hat x_i^2 - 1/C_i$ [12]. The variances of the parameters can also be computed with the following equivalent formulas [12]:

$$\sigma^2_{\hat\beta_{GR}} = \frac{1}{SS_W} \left( 1 + \frac{N k_{GR}}{SS_W} \right), \quad \sigma^2_{\hat\alpha_{GR}} = \frac{1}{\sum_{i=1}^{N} W_{i_{GR}}} + \bar\xi_W^2\, \sigma^2_{\hat\beta_{GR}} \quad \text{and} \quad \sigma_{\hat\alpha\hat\beta_{GR}} = -\bar\xi_W\, \sigma^2_{\hat\beta_{GR}} \qquad (36)$$

with $SS_W = \sum_{i=1}^{N} W_{i_{GR}} (\xi_i - \bar\xi_W)^2$ and $\bar\xi_W = \frac{\sum_{i=1}^{N} W_{i_{GR}} \xi_i}{\sum_{i=1}^{N} W_{i_{GR}}}$. In practice, $\sigma^2_{\hat\beta_{GR}}$, $\sigma^2_{\hat\alpha_{GR}}$ and $\sigma_{\hat\alpha\hat\beta_{GR}}$ can be estimated by replacing $\bar\xi_W$ by $\bar X_W$, and $SS_W$ by $\sum_{i=1}^{N} W_{i_{GR}} (\hat x_i^2 - C_i^{-1} - 2 \hat x_i \bar X_W + \bar X_W^2)$.

5.2. The Bivariate Least Square regression: BLS

The BLS regression line minimizes the criterion $C_{BLS}$ [29, 9, 32]:

$$C_{BLS} = \sum_{i=1}^{N} \frac{1}{W_{i_{BLS}}} (\bar Y_i - \hat\alpha - \hat\beta \bar X_i)^2 = (N - 2)\, s^2_{BLS} \quad \text{with} \quad W_{i_{BLS}} = \sigma^2_{\varepsilon_i} = \frac{\sigma^2_{\nu_i}}{n_{Y_i}} + \hat\beta^2 \frac{\sigma^2_{\tau_i}}{n_{X_i}}$$

The estimates of the parameters are computed by iterating the following formula:

$$R b = g \qquad (37)$$

$$\begin{pmatrix} \sum_{i=1}^{N} \dfrac{1}{W_{i_{BLS}}} & \sum_{i=1}^{N} \dfrac{\bar X_i}{W_{i_{BLS}}} \\ \sum_{i=1}^{N} \dfrac{\bar X_i}{W_{i_{BLS}}} & \sum_{i=1}^{N} \dfrac{\bar X_i^2}{W_{i_{BLS}}} \end{pmatrix} \begin{pmatrix} \hat\alpha_{BLS} \\ \hat\beta_{BLS} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{N} \dfrac{\bar Y_i}{W_{i_{BLS}}} \\ \sum_{i=1}^{N} \left[ \dfrac{\bar X_i \bar Y_i}{W_{i_{BLS}}} + \hat\beta_{BLS} \dfrac{\sigma^2_{\tau_i}}{n_{X_i}} \dfrac{(\bar Y_i - \hat\alpha_{BLS} - \hat\beta_{BLS} \bar X_i)^2}{W_{i_{BLS}}^2} \right] \end{pmatrix} \qquad (38)$$
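A minimal iterative sketch of (37)-(38) is given below; the OLS starting values, the convergence tolerance and the function name are our own illustrative choices.

```python
import numpy as np

def bls(Xbar, Ybar, s2_tau_i, s2_nu_i, n_x, n_y, tol=1e-10, max_iter=200):
    """Heteroscedastic BLS estimates of (alpha, beta) by iterating R b = g."""
    b, a = np.polyfit(Xbar, Ybar, 1)                 # OLS slope and intercept as start
    for _ in range(max_iter):
        W = s2_nu_i / n_y + b ** 2 * s2_tau_i / n_x  # weights W_iBLS
        r = Ybar - a - b * Xbar                      # current residuals
        R = np.array([[np.sum(1 / W), np.sum(Xbar / W)],
                      [np.sum(Xbar / W), np.sum(Xbar ** 2 / W)]])
        g = np.array([np.sum(Ybar / W),
                      np.sum(Xbar * Ybar / W
                             + b * (s2_tau_i / n_x) * r ** 2 / W ** 2)])
        a_new, b_new = np.linalg.solve(R, g)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b
```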

In practice, if $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are unknown, they can be estimated with replicated data and replaced by $S^2_{\tau_i}$ and $S^2_{\nu_i}$. It can be proven (after some algebraic manipulations) that $\hat\beta_{BLS} = \hat\beta_{GR}$ and $\hat\alpha_{BLS} = \hat\alpha_{GR}$ under heteroscedasticity. The variance-covariance matrix of the parameters provided by Riu and Rius [32] is given by:

$$\hat\Sigma_{BLS} = s^2_{BLS}\, R^{-1} \qquad (39)$$

or can be computed with the following formulas:

$$S^2_{\hat\beta_{BLS}} = \frac{s^2_{BLS} \sum_{i=1}^{N} \dfrac{1}{W_{i_{BLS}}}}{D}, \quad S^2_{\hat\alpha_{BLS}} = \frac{s^2_{BLS} \sum_{i=1}^{N} \dfrac{\bar X_i^2}{W_{i_{BLS}}}}{D} \quad \text{and} \quad S_{\hat\alpha\hat\beta_{BLS}} = -\frac{s^2_{BLS} \sum_{i=1}^{N} \dfrac{\bar X_i}{W_{i_{BLS}}}}{D} \qquad (40)$$

$$\text{with} \quad D = \sum_{i=1}^{N} \frac{1}{W_{i_{BLS}}} \sum_{i=1}^{N} \frac{\bar X_i^2}{W_{i_{BLS}}} - \left( \sum_{i=1}^{N} \frac{\bar X_i}{W_{i_{BLS}}} \right)^2$$

5.3. Coverage probabilities of the joint confidence intervals or confidence bands

The coverage probabilities of the joint-CIs provided by GR and BLS have already been compared in the literature, but only in the case of known variances [12]. Since, in practice, the variances are never known but always estimated, we propose to study the coverage probabilities in more detail with known and unknown variances. We simulated samples with $N$ = 10, 12, 15, 20, 30, 50, 75, 100, with replicated data ($n_X = n_Y = 5$) under equivalence ($\alpha = 0$, $\beta = 1$, $\eta_i = \xi_i$). The values of $\xi_i$ were drawn randomly from a uniform distribution U(10, 20) for each simulated sample and the coverage probabilities were computed at a nominal level of 95%. We first chose the variances according to the following functions: $\sigma^2_{\tau_i} = 0.45 \xi_i - 4$ and $\sigma^2_{\nu_i} = 0.45 \xi_i - 4$, where the variances $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ increase with $\xi_i$ and are equal for a given $\xi_i$ (it is common in practice that the precision of a measurement method decreases when $\xi_i$ increases), and such that $\sigma^2_{\tau_i} = \sigma^2_{\nu_i} = 0.5$ when $\xi_i = 10$ and $\sigma^2_{\tau_i} = \sigma^2_{\nu_i} = 5$ when $\xi_i = 20$. Secondly, the variances were chosen independently and randomly ("random heteroscedasticity") from a uniform distribution U(0.5, 5). To study the effect of the replicates in more detail when the variances are estimated, simulations were also launched with $N = 20$ and $n_X = n_Y$ = 2, 4, 6, 8, 10, 12, 15, 20. Figure 6 displays the coverage probabilities, which are close to 95% when the variances are known, but closer to 95% for GR. The coverage probabilities collapse drastically when the variances have to be estimated (the number of replicates, $n_X = n_Y = 5$, is too low to estimate a variance) and drop more for GR, especially when $N$ increases or with random heteroscedasticity. Obviously, when the number of replicates increases, the coverage probabilities increase, but BLS still outperforms GR even for $n_X = n_Y = 20$. Actually, the uncertainties on the estimated variances are not taken into account by the BLS or GR methodologies, and work is in progress to estimate such regression lines while additionally taking the variance uncertainties into account.

6. Applications

6.1. The systolic blood pressure data under homoscedasticity

In the Bland and Altman systolic blood pressure data [5], simultaneous measurements were made by two observers (denoted J and R) using a sphygmomanometer and by a semi-automatic blood

FIGURE 6. Coverage probabilities of the joint-CI or CB related to $N$ on the X-axis (top and middle) or related to $n_X = n_Y$ (bottom) with heteroscedasticity and variances known (top) or unknown and estimated (middle and bottom), simulated with a function (left) or randomly (right)

pressure monitor (denoted S). We'll consider only the measurements made by observer J with the manual device and the measurements made by the semi-automatic monitor S. In other words, the systolic blood pressure was measured three times per patient (85 patients) by the semi-automatic device S and three times per patient by observer J with the manual device. If we decide to put the mean measures given by S on the Y-axis and those given by J on the X-axis (for reasons explained below), we have: $\bar X_1 = \dots$, ..., $\bar X_{85} = \dots$ and $\bar Y_1 = \dots$, ..., $\bar Y_{85} = 124$. The variances $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are unknown but can be estimated with $S^2_{\tau_i}$ and $S^2_{\nu_i}$: $S^2_{\tau_1} = \dots$, ..., $S^2_{\tau_{85}} = \dots$ and $S^2_{\nu_1} = 9.333$, ..., $S^2_{\nu_{85}} = 13$. If we calculate the minimum and the maximum of the estimated variances $S^2_{\tau_i}$ and $S^2_{\nu_i}$, we get: $\mathrm{Min}(S^2_{\tau_i}) = 1.333$, $\mathrm{Max}(S^2_{\tau_i}) = \dots$ and $\mathrm{Min}(S^2_{\nu_i}) = 1.333$, $\mathrm{Max}(S^2_{\nu_i}) = \dots$. It is clear that the variances are significantly different from one patient to another. Nevertheless, as this section supposes homoscedasticity, and for didactic purposes, we compute the global estimates $S^2_\tau$ and $S^2_\nu$: $S^2_\tau = \dots$ and $S^2_\nu = \dots$. $\lambda$ ($= \lambda_{XY}$ as $n_X = n_Y$) is estimated by $\hat\lambda = \dots$. We can observe that $\hat\lambda$ is higher than 1. Actually, that's the reason why we chose to put the mean measures given by S on the Y-axis and those given by J on the X-axis. Indeed, we explained in Section 4.3 that the coverage probabilities are closer to the nominal level when $\lambda_{XY} > 1$. We also have: $\bar X = \dots$, $\bar Y = \dots$, $S_{xx} = \dots$, $S_{yy} = \dots$, $S_{xy} = \dots$, $R^2 = 0.67$. For the different regression lines, we get: $\hat\beta_{DR} = \hat\beta_{GR} = \hat\beta_{BLS} = \hat\beta_{Mandel} = \dots$ and $\hat\alpha_{DR} = \hat\alpha_{GR} = \hat\alpha_{BLS} = \hat\alpha_{Mandel} = \dots$. Figure 7 displays the scatterplot of the data with standard errors of the mean ($S_\tau / \sqrt{n_X}$ and $S_\nu / \sqrt{n_Y}$). It's obvious that some points are outliers, but they will not be removed, for didactic purposes. The 95% CB are displayed for the four methodologies (Figure 8, right) and one can see that the equivalence line ($Y = X$) isn't inside the CB, which means that the manual device and the semi-automatic one are not equivalent: the null hypothesis is rejected; the estimated line is significantly different from the equivalence line. The CB provided by Mandel and BLS are very close to each other (nearly superimposed on the graph) and close to the DR one when we move away from $\bar X$, while the GR one is the narrowest. In fact, there are outliers with respect to the bivariate distribution of the points $(\bar X_i, \bar Y_i)$. When the distance from an outlier to the line increases, $S_{xy}$ increases and the variance-covariance matrix computed by the DR is modified, as it is related to $S_{xy}$ in formulas (27). The variance-covariance matrix computed by the BLS is related to the sum of the weighted residuals (31), and obviously this sum increases when a point moves away from the line. On the other hand, the variance-covariance matrix computed by the GR is not related to $S_{xy}$ or the residuals, but to $W_{GR}$ and the $\hat x_i$, which are not modified by such outliers. This can also be noticed in Figure 8 (left), where the equivalence point ($\beta = 1$ and $\alpha = 0$) is outside all the ellipses and the one provided by GR is the smallest. The minor axes of the GR and DR ellipses are equal and, by analogy, the widths of their hyperbolic CB are equal at $(\bar X, \bar Y)$.
The Mandel and BLS ellipses are nearly 9 times larger than the GR ellipse, and the DR ellipse nearly 3 times larger than the GR ellipse.

6.2. The systolic blood pressure data under heteroscedasticity

If we suppose heteroscedasticity in the systolic blood pressure data, the variances $\sigma^2_{\tau_i}$ and $\sigma^2_{\nu_i}$ are unknown but can be estimated with $S^2_{\tau_i}$ and $S^2_{\nu_i}$ from the replicates for each patient and both devices

FIGURE 7. Scatterplot of the systolic blood pressure data under homoscedasticity with the regression line and standard errors of the mean

FIGURE 8. Confidence ellipses (left) computed by the four methodologies and the corresponding CB (right)

FIGURE 9. Scatterplot of the systolic blood pressure data with the regression lines and standard errors of the means under heteroscedasticity

($n_{X_i} = n_X = n_{Y_i} = n_Y = 3$ $\forall i$). The estimated GR and BLS regression lines are: $\hat\beta_{GR} = \hat\beta_{BLS} = \dots$ and $\hat\alpha_{GR} = \hat\alpha_{BLS} = \dots$. Figure 9 displays the scatterplot of the data with local standard errors of the means. The estimated line is close to the one estimated under the homoscedasticity assumption (the parameters are very close to each other). The 95% CB provided by the BLS and GR under heteroscedasticity are displayed in Figure 10 (right), together with the ones under homoscedasticity (for comparison). It can be noticed that the CB are narrower under heteroscedasticity. Actually, some of the outliers have lower weights and are less taken into account in the heteroscedastic analysis, which leads to lower variances of the parameters. As in the previous section, the equivalence line ($Y = X$) isn't inside the CB: the equivalence between the two devices is rejected. This is confirmed in Figure 10 (left), where the equivalence point ($\beta = 1$ and $\alpha = 0$) is outside all the ellipses and the ones provided by the GR are the narrowest (under homoscedasticity and heteroscedasticity).

6.3. The arsenate ion in natural river water under heteroscedasticity

In the arsenate ion in natural river water data [30], 30 pairs of measures are provided by 2 methods: firstly, continuous selective reduction and atomic absorption spectrometry; secondly, non-selective reduction, cold trapping and atomic emission spectrometry. The mean measures $(\bar X_i, \bar Y_i)$ with their standard errors of the mean are given and analysed in the literature [12, 30]. Figure 11 shows the scatterplot of the data and Figure 12 shows the ellipses of the parameters (left) and the CB (right). We can observe that lower concentrations are frequent and that higher concentrations are rarer (logarithmic scales could be tested). The errors increase with the concentration for both devices (as in our simulations in Section 5.3). The estimated line is close to the equivalence line (Figure 11) and lies inside the hyperbolic CB (Figure 12, right) provided by BLS or GR. Obviously, the equivalence point is inside both ellipses (Figure 12, left). Both CB and both ellipses are very

FIGURE 10. Systolic blood pressure data: confidence ellipses provided by BLS and GR with or without heteroscedasticity (left) and the corresponding CB under heteroscedasticity (right)

close to each other, contrary to the systolic blood pressure example. This is because there are no outliers in this example (in the bivariate distribution of the $\bar X_i$ and $\bar Y_i$). Ideally, the data should be log-transformed, but we didn't find the detailed data in the literature. Moreover, these data were analysed in the literature without such a transformation [12, 30].

7. Conclusions

Under homoscedasticity, four methodologies (DR, GR, BLS and Mandel) have been compared to estimate a regression line which takes the errors on both axes into account. They are applied to assess the equivalence between two measurement methods on the basis of a confidence ellipse for the parameter estimates or, equivalently, a confidence band for the regression line. These four methodologies give identical estimates of the regression line. However, the variance-covariance matrices of the parameter estimates are computed differently. The confidence ellipses and the CB are therefore dissimilar. The coverage probabilities are close to each other and close to the nominal level when the error variances are known, especially for $\lambda_{XY} > 1$. However, the variance-covariance matrices of the parameter estimates derived by the method of moments for the DR and by maximum likelihood for the GR provide slightly better coverage probabilities. Unfortunately, when the error variances are unknown, the coverage probabilities of the GR drop, while they are similar for the other procedures. To tackle this problem, the sample size or the number of replicates has to be increased to recover the coverage probabilities of the GR. Moreover, the confidence ellipse or CB computed by the GR is the smallest or narrowest in the presence of outliers; work is still in progress to assess the influence of outliers on the covariance matrix of the parameters. Under heteroscedasticity, the GR and BLS provide the same estimated line. The coverage probabilities of the joint-CI or CB are very close to each other, even if the GR slightly outperforms the BLS. On the other hand, when the variances are unknown and estimated with replicates, the coverage

FIGURE 11. Scatterplot of the arsenate ion in natural river water data with the regression line and standard errors of the means

FIGURE 12. Arsenate ion data: confidence ellipses provided by BLS and GR with heteroscedasticity and the corresponding CB

probabilities of the GR collapse and the BLS outperforms the GR. Actually, the uncertainties on the estimated variances are not taken into account, and work is still in progress to tackle this problem. To summarize, we recommend the DR or GR under homoscedasticity (with a high number of replicates or a large sample size for the GR when the error variances have to be estimated) and the BLS or GR under heteroscedasticity, while being careful to assess the impact of outliers if needed. Finally, the CB presented in this paper are easier to compute and easier to interpret than an ellipse, and they can be displayed directly in the $(X, Y)$ space, while an ellipse is displayed in the $(\beta, \alpha)$ space.

Appendix A: Tables

TABLE 2. Values of $\sigma^2_\nu$ and $\sigma^2_\tau$ for the simulations with replicated data ($n_X = n_Y = 2$; $n_X = n_Y = 4$; $n_X = 4$, $n_Y = 2$) and the corresponding values of $\lambda$ and $\lambda_{XY}$

TABLE 3. Values of $\sigma^2_\nu$ and $\sigma^2_\tau$ for the simulations with or without replicated data ($n_X = n_Y = 1$; $n_X = n_Y = 2$; $n_X = 4$, $n_Y = 2$; $n_X = n_Y = 4$) and the corresponding values of $\lambda$ and $\lambda_{XY}$

Thanks

Support from the IAP Research Network P7/06 of the Belgian State (Belgian Science Policy) is gratefully acknowledged.


More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

The Slow Convergence of OLS Estimators of α, β and Portfolio. β and Portfolio Weights under Long Memory Stochastic Volatility

The Slow Convergence of OLS Estimators of α, β and Portfolio. β and Portfolio Weights under Long Memory Stochastic Volatility The Slow Convergence of OLS Estimators of α, β and Portfolio Weights under Long Memory Stochastic Volatility New York University Stern School of Business June 21, 2018 Introduction Bivariate long memory

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com

Simple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com 12 Simple Linear Regression Material from Devore s book (Ed 8), and Cengagebrain.com The Simple Linear Regression Model The simplest deterministic mathematical relationship between two variables x and

More information

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Chapter 10 Regression Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Scatter Diagrams A graph in which pairs of points, (x, y), are

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable, Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.

More information

STAT 4385 Topic 03: Simple Linear Regression

STAT 4385 Topic 03: Simple Linear Regression STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Lecture 19 Multiple (Linear) Regression

Lecture 19 Multiple (Linear) Regression Lecture 19 Multiple (Linear) Regression Thais Paiva STA 111 - Summer 2013 Term II August 1, 2013 1 / 30 Thais Paiva STA 111 - Summer 2013 Term II Lecture 19, 08/01/2013 Lecture Plan 1 Multiple regression

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Estadística II Chapter 4: Simple linear regression

Estadística II Chapter 4: Simple linear regression Estadística II Chapter 4: Simple linear regression Chapter 4. Simple linear regression Contents Objectives of the analysis. Model specification. Least Square Estimators (LSE): construction and properties

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

ISQS 5349 Final Exam, Spring 2017.

ISQS 5349 Final Exam, Spring 2017. ISQS 5349 Final Exam, Spring 7. Instructions: Put all answers on paper other than this exam. If you do not have paper, some will be provided to you. The exam is OPEN BOOKS, OPEN NOTES, but NO ELECTRONIC

More information

Robustness and Distribution Assumptions

Robustness and Distribution Assumptions Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

Inference about the Indirect Effect: a Likelihood Approach

Inference about the Indirect Effect: a Likelihood Approach Discussion Paper: 2014/10 Inference about the Indirect Effect: a Likelihood Approach Noud P.A. van Giersbergen www.ase.uva.nl/uva-econometrics Amsterdam School of Economics Department of Economics & Econometrics

More information

You can compute the maximum likelihood estimate for the correlation

You can compute the maximum likelihood estimate for the correlation Stat 50 Solutions Comments on Assignment Spring 005. (a) _ 37.6 X = 6.5 5.8 97.84 Σ = 9.70 4.9 9.70 75.05 7.80 4.9 7.80 4.96 (b) 08.7 0 S = Σ = 03 9 6.58 03 305.6 30.89 6.58 30.89 5.5 (c) You can compute

More information

The Simple Regression Model. Part II. The Simple Regression Model

The Simple Regression Model. Part II. The Simple Regression Model Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square

More information

STAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern.

STAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern. STAT 01 Assignment NAME Spring 00 Reading Assignment: Written Assignment: Chapter, and Sections 6.1-6.3 in Johnson & Wichern. Due Monday, February 1, in class. You should be able to do the first four problems

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

Correlation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0

Correlation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0 Correlation Bivariate normal densities with ρ 0 Example: Obesity index and blood pressure of n people randomly chosen from a population Two-dimensional / bivariate normal density with correlation 0 Correlation?

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Important note: Transcripts are not substitutes for textbook assignments. 1

Important note: Transcripts are not substitutes for textbook assignments. 1 In this lesson we will cover correlation and regression, two really common statistical analyses for quantitative (or continuous) data. Specially we will review how to organize the data, the importance

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014 ECO 312 Fall 2013 Chris Sims Regression January 12, 2014 c 2014 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License What

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

Linear regression and correlation

Linear regression and correlation Faculty of Health Sciences Linear regression and correlation Statistics for experimental medical researchers 2018 Julie Forman, Christian Pipper & Claus Ekstrøm Department of Biostatistics, University

More information

Advanced Quantitative Methods: maximum likelihood

Advanced Quantitative Methods: maximum likelihood Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y

More information

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y Regression and correlation Correlation & Regression, I 9.07 4/1/004 Involve bivariate, paired data, X & Y Height & weight measured for the same individual IQ & exam scores for each individual Height of

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Specification Errors, Measurement Errors, Confounding

Specification Errors, Measurement Errors, Confounding Specification Errors, Measurement Errors, Confounding Kerby Shedden Department of Statistics, University of Michigan October 10, 2018 1 / 32 An unobserved covariate Suppose we have a data generating model

More information

Ma 3/103: Lecture 24 Linear Regression I: Estimation

Ma 3/103: Lecture 24 Linear Regression I: Estimation Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the

More information

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Junfeng Shang Bowling Green State University, USA Abstract In the mixed modeling framework, Monte Carlo simulation

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information