Paired comparisons

Kory Jordan
6 years ago
To compare two methods, A and B, one can collect a sample of $n$ pairs of observations. Pair $i$ provides two measurements, $Y_{Ai}$ and $Y_{Bi}$, one for each method. For example:

To compare the reaction of patients to two different stimuli, we can measure the reaction of each patient to both stimuli.

To compare the performance of two sorting algorithms, we can apply both to the same data sets.

We assume that

$$Y_{Ai} = \mu_A + p_i + Z_{Ai}, \quad i = 1, \dots, n,$$
$$Y_{Bi} = \mu_B + p_i + Z_{Bi}, \quad i = 1, \dots, n,$$

where $p_i$ is the effect of pair $i$, shared by both measurements.

The difference $\delta_\mu = \mu_A - \mu_B$ is of main interest. We define

$$Y_i = Y_{Ai} - Y_{Bi} = \delta_\mu + Z_i, \quad Z_i = Z_{Ai} - Z_{Bi}, \quad i = 1, \dots, n.$$

Using the pair differences has several advantages:

The pair effects $p_i$ cancel, so there is no need to model them.

There is no need to model $Z_{Ai}$ and $Z_{Bi}$ separately. It is enough to assume that the $Z_i$ are independent and have zero mean.

By taking the differences, we get one sample, $Y_1, \dots, Y_n$, instead of two.

To estimate $\delta_\mu$, we can use the sample mean: $\hat{\delta}_\mu = \bar{Y}$. This is an unbiased estimator of $\delta_\mu$:

$$E(\hat{\delta}_\mu) = E\left(\frac{1}{n}\sum_{i=1}^{n}(\delta_\mu + Z_i)\right) = \frac{1}{n}\sum_{i=1}^{n}\left(\delta_\mu + E(Z_i)\right) = \delta_\mu.$$
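The cancellation of the pair effects can be checked with a small simulation. The sketch below is only illustrative: the normal distributions, the standard deviations (5 for $p_i$, 1 for the errors), and the means $\mu_A = 3$, $\mu_B = 2$ are assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics


def simulate_pairs(n, mu_a, mu_b, rng):
    """Generate n pairs (y_A, y_B) that share a pair effect p_i."""
    pairs = []
    for _ in range(n):
        p = rng.gauss(0.0, 5.0)               # pair effect (assumed large on purpose)
        y_a = mu_a + p + rng.gauss(0.0, 1.0)  # Y_Ai = mu_A + p_i + Z_Ai
        y_b = mu_b + p + rng.gauss(0.0, 1.0)  # Y_Bi = mu_B + p_i + Z_Bi
        pairs.append((y_a, y_b))
    return pairs


rng = random.Random(0)
pairs = simulate_pairs(2000, mu_a=3.0, mu_b=2.0, rng=rng)

diffs = [a - b for a, b in pairs]        # Y_i = Y_Ai - Y_Bi
delta_hat = statistics.mean(diffs)       # estimate of delta_mu = mu_A - mu_B
sd_diff = statistics.stdev(diffs)        # spread of the differences
sd_a = statistics.stdev([a for a, _ in pairs])  # spread of method A alone

print(f"delta_hat = {delta_hat:.3f}")
print(f"sd of differences = {sd_diff:.3f}, sd of Y_A = {sd_a:.3f}")
```

Under these assumptions the differences vary much less than the raw measurements, because $p_i$ appears in both $Y_{Ai}$ and $Y_{Bi}$ and drops out of $Y_i$.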

Even if $Z_{Ai}$ and $Z_{Bi}$ are not normal random variables, the difference $Z_i = Z_{Ai} - Z_{Bi}$ can have an approximately normal distribution.

Example 1: We measure the reaction times, $Y_{Ai}$ and $Y_{Bi}$, to two different stimuli for $n = 50$ patients. The distributions of $Y_{Ai}$ and $Y_{Bi}$ are very skewed, so the normality assumption is not appropriate for them. The distribution of $Y_i = Y_{Ai} - Y_{Bi}$ is closer to normal.

[Figure: three histograms titled "reaction time 1", "reaction time 2", and "reaction time difference"; horizontal axis in seconds, vertical axis frequency.]
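A quick way to see why differencing helps is to compare sample skewness before and after taking differences. The sketch below assumes exponentially distributed reaction times, which is a modelling choice made only for illustration; the slides do not specify the distribution. Skewness measures asymmetry only, so this is a partial check of normality rather than a full one.

```python
import random
import statistics


def sample_skewness(xs):
    """Third standardized moment: near 0 for a symmetric sample."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


rng = random.Random(1)
n = 5000

# Assumed model: both reaction times are exponential (strongly right-skewed).
y_a = [rng.expovariate(1.0) for _ in range(n)]
y_b = [rng.expovariate(1.0) for _ in range(n)]
y = [a - b for a, b in zip(y_a, y_b)]  # pair differences

skew_a = sample_skewness(y_a)
skew_y = sample_skewness(y)

print(f"skewness of Y_A:         {skew_a:.2f}")
print(f"skewness of differences: {skew_y:.2f}")
```

The difference of two identically distributed variables is symmetric about zero, so its skewness should be close to zero even when each variable on its own is heavily skewed.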

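Example 1 (contd): We measure the reaction times, $Y_{Ai}$ and $Y_{Bi}$, to two different stimuli for $n = 50$ patients. For the observed pair differences, $y_1, \dots, y_{50}$, we compute the sample mean $\bar{y} = 1.41$ and the sample standard deviation $s_y$. Under the normality assumption,

$$T = \frac{\bar{Y} - \delta_\mu}{S_y / \sqrt{50}} \sim t_{49},$$

and the 95% CI for $\delta_\mu$ is

$$\left(\bar{y} - t_{0.975,49}\,\frac{s_y}{\sqrt{50}},\; \bar{y} + t_{0.975,49}\,\frac{s_y}{\sqrt{50}}\right) = (0.13, 2.69).$$

If $H_0: \delta_\mu = 0$ and $H_a: \delta_\mu \neq 0$, then $H_0$ is rejected at the 5% significance level, because zero lies outside the interval.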
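The interval formula can be sketched in a few lines. Two inputs here are assumptions: the quantile $t_{0.975,49} \approx 2.0096$ is taken from a standard t-table, and the standard deviation `S_Y_PLACEHOLDER = 4.5` is a placeholder chosen for illustration, not a value taken from the slides.

```python
import math


def t_interval(ybar, s, n, t_quantile):
    """Two-sided CI: ybar -/+ t_quantile * s / sqrt(n)."""
    half_width = t_quantile * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width


T_975_49 = 2.0096      # tabulated 0.975 quantile of t with 49 degrees of freedom
S_Y_PLACEHOLDER = 4.5  # ASSUMED sample standard deviation (illustration only)

lo, hi = t_interval(ybar=1.41, s=S_Y_PLACEHOLDER, n=50, t_quantile=T_975_49)

# A two-sided 5% test of H0: delta_mu = 0 rejects when 0 is outside the CI.
reject_h0 = not (lo <= 0.0 <= hi)

print(f"95% CI: ({lo:.2f}, {hi:.2f}), reject H0: {reject_h0}")
```

The last step uses the duality between the two-sided test and the confidence interval: rejecting at the 5% level is equivalent to zero falling outside the 95% interval.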

Example 2 (exercise 9.16 from the textbook). We compare the average hit rate (a measure of prediction accuracy) for two classification methods.

[Table: average hit rate on Data sets 1 to 4, with rows Method 1 ($y_A$), Method 2 ($y_B$), and Difference ($y$).]

For the analysis of the data, we first assume that we have two independent normal samples:

$$Y_{A1}, Y_{A2}, Y_{A3}, Y_{A4} \;\text{i.i.d.}\; N(\mu_A, \sigma^2), \qquad Y_{B1}, Y_{B2}, Y_{B3}, Y_{B4} \;\text{i.i.d.}\; N(\mu_B, \sigma^2).$$

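Let $H_0: \delta_{A,B} = \mu_A - \mu_B = 0$ and $H_a: \delta_{A,B} > 0$. From the data we compute $\hat{\delta}_{A,B} = \bar{y}_A - \bar{y}_B$ and the sample variances $s_A^2$ and $s_B^2$. The pooled sample variance is

$$s_P^2 = \frac{(4-1)s_A^2 + (4-1)s_B^2}{(4-1) + (4-1)}.$$

To construct the 95% CI for $\delta_{A,B}$, we use the distribution of the test statistic:

$$T = \frac{\hat{\delta}_{A,B} - \delta_{A,B}}{s_P\sqrt{C}} \sim t_\nu.$$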

CLICKER QUESTION 1

To construct the 95% CI for $\delta_{A,B}$, we use the distribution of the test statistic:

$$T = \frac{\hat{\delta}_{A,B} - \delta_{A,B}}{s_P\sqrt{C}} \sim t_\nu.$$

Find $C$ and $\nu$ if $n_A = n_B = 4$ in the exercise.

A. $C = 1/(n_A + n_B - 2) = 1/6$ and $\nu = n_A + n_B = 8$
B. $C = 1/(n_A + n_B - 2) = 1/6$ and $\nu = n_A + n_B - 2 = 6$
C. $C = 1/n_A + 1/n_B = 0.5$ and $\nu = n_A + n_B = 8$
D. $C = 1/n_A + 1/n_B = 0.5$ and $\nu = n_A + n_B - 2 = 6$
E. I give up

We have $H_0: \delta_{A,B} = 0$ and $H_a: \delta_{A,B} > 0$. Let

$$T = \frac{\bar{Y}_A - \bar{Y}_B - \delta_{A,B}}{S_P\sqrt{0.5}}.$$

The P-value of the test is

$$p = \Pr(T > t \mid H_0: \delta_{A,B} = 0) = \Pr(T > t \mid T \sim t_6),$$

where $t = 4.09$ is the observed value of $T$ under $H_0$. The resulting P-value is below 0.01, so we reject $H_0$ at the 1% significance level.

The one-sided 95% CI for $\delta_{A,B}$ is

$$\left(\hat{\delta}_{A,B} - t_{0.95,6}\,\sqrt{0.5}\,s_P,\; \infty\right) = (0.012, \infty).$$

Zero falls outside this interval.
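The pooled two-sample statistic can be sketched as follows. The hit-rate values from the exercise are not reproduced here, so the two lists below are made-up numbers used only to exercise the formula; the function name `pooled_t` is likewise hypothetical.

```python
import math
import statistics


def pooled_t(sample_a, sample_b, delta0=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    s2_a = statistics.variance(sample_a)  # sample variance of A (divisor n_a - 1)
    s2_b = statistics.variance(sample_b)  # sample variance of B (divisor n_b - 1)
    # Pooled variance: weighted average of the two sample variances.
    s2_p = ((n_a - 1) * s2_a + (n_b - 1) * s2_b) / ((n_a - 1) + (n_b - 1))
    c = 1.0 / n_a + 1.0 / n_b             # equals 0.5 when n_a = n_b = 4
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = (diff - delta0) / math.sqrt(s2_p * c)
    return t, n_a + n_b - 2


# Made-up data, NOT the textbook values.
method_a = [3.0, 5.0, 7.0, 6.0]
method_b = [2.0, 3.0, 4.0, 4.0]

t_obs, df = pooled_t(method_a, method_b)
print(f"t = {t_obs:.3f} on {df} degrees of freedom")
```

The P-value would then be the upper tail probability of a $t_6$ distribution at `t_obs`, which needs a t-table or a statistics library and is left out of this sketch.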

CLICKER QUESTION 2

We have $H_0: \delta_{A,B} = 0$ and $H_a: \delta_{A,B} > 0$. We found the P-value $p = \Pr(T > 4.09 \mid T \sim t_6)$. Find the P-value when testing $H_0: \delta_{A,B} = 0$ against $H_a: \delta_{A,B} \neq 0$.

A.
B.
C.
D.
E. 0.003

This analysis assumes that the two samples, A and B, are independent, which is not the case! Both methods are applied to the same data sets, so we can use the paired differences $Y_1, Y_2, Y_3, Y_4$, where $Y_i = Y_{Ai} - Y_{Bi}$, to remove the dependence between the two samples.

We now assume that $Y_1, Y_2, Y_3, Y_4$ are i.i.d. $N(\delta_\mu, \sigma^2)$. From the data we compute $\hat{\delta}_\mu = \bar{y}$ and the sample variance $s^2$. Under the normality assumption,

$$T = \frac{\bar{Y} - \delta_\mu}{S/\sqrt{n}} \sim t_{n-1},$$

where the sample size is $n = 4$. The observed value of $T$ under $H_0$ is $t = \bar{y}/(s/\sqrt{4})$.
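The paired statistic only needs the differences. The sketch below uses made-up measurements (not the textbook hit rates) and a hypothetical helper name `paired_t`.

```python
import math
import statistics


def paired_t(sample_a, sample_b, delta0=0.0):
    """One-sample t statistic on the pair differences, with n - 1 df."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]  # y_i = y_Ai - y_Bi
    n = len(diffs)
    ybar = statistics.mean(diffs)
    s = statistics.stdev(diffs)            # sample standard deviation of the y_i
    t = (ybar - delta0) / (s / math.sqrt(n))
    return t, n - 1


# Made-up data, NOT the textbook values.
method_a = [3.0, 5.0, 7.0, 6.0]
method_b = [2.0, 3.0, 4.0, 4.0]

t_obs, df = paired_t(method_a, method_b)
print(f"t = {t_obs:.3f} on {df} degrees of freedom")
```

Pairing costs degrees of freedom ($n - 1 = 3$ instead of $n_A + n_B - 2 = 6$), but it removes the variation between data sets from the denominator, which can make the statistic considerably larger when both methods respond to the same data sets in a similar way.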

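The P-value $p = \Pr(T > t \mid T \sim t_3)$ is below 0.02, so we reject the null hypothesis at the 2% significance level. The one-sided 95% CI for $\delta_\mu$ is

$$\left(\bar{y} - t_{0.95,3}\,\frac{s}{2},\; \infty\right) = (0.0083, \infty).$$

The power of the test when $\delta_\mu = 0.01$ is

$$P(0.01) = \Pr\left(T_0 = \frac{\bar{Y}}{S/\sqrt{4}} > t_{0.95,3} \;\middle|\; T_0 \sim t^{NC}_{3,\kappa}\right) = 0.36,$$

where $t^{NC}_{3,\kappa}$ is the noncentral t-distribution with three degrees of freedom and noncentrality parameter $\kappa = 0.01\sqrt{4}/\sigma$, estimated by $\hat{\kappa} = 0.01\sqrt{4}/s = 1.55$.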
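The power can also be approximated by simulation, without a noncentral t routine. The sketch below assumes normal differences and takes $\kappa = 1.55$ as given; since the distribution of $T_0$ depends on $\delta_\mu$ and $\sigma$ only through $\kappa$, it sets $\sigma = 1$ and $\delta_\mu = \kappa/\sqrt{n}$. The quantile $t_{0.95,3} \approx 2.3534$ is a tabulated value, and the helper name `simulated_power` is hypothetical.

```python
import math
import random

T_95_3 = 2.3534  # tabulated 0.95 quantile of t with 3 degrees of freedom


def simulated_power(kappa, n=4, reps=20000, seed=0):
    """Monte Carlo estimate of Pr(T0 > t_{0.95, n-1}) for noncentrality kappa."""
    rng = random.Random(seed)
    delta = kappa / math.sqrt(n)  # mean of each difference when sigma = 1
    rejections = 0
    for _ in range(reps):
        y = [rng.gauss(delta, 1.0) for _ in range(n)]
        ybar = sum(y) / n
        s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
        t0 = ybar / (s / math.sqrt(n))  # test statistic under H0: delta_mu = 0
        if t0 > T_95_3:
            rejections += 1
    return rejections / reps


power = simulated_power(1.55)
print(f"estimated power: {power:.3f}")
```

If the inputs match the slide, the estimate should land in the neighbourhood of the reported 0.36, up to Monte Carlo error. With only four pairs the test has low power against an effect of this size.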
