Nonparametric Tests
Mathematics 47: Lecture 25
Dan Sloughter, Furman University
April 20, 2006

The sign test

Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a continuous distribution with median $m$. Suppose we wish to test

\[ H_0 : m = m_0 \]
\[ H_A : m > m_0. \]

Let $Y$ be the number of values of $X_i - m_0$, $i = 1, 2, \ldots, n$, which are positive.

Note: if $H_0$ is true, $Y$ is binomial with $n$ trials and probability of success $p = \frac{1}{2}$. Moreover, large values of $Y$ provide evidence against $H_0$.

The sign test (cont'd)

For an observed value $y$ of $Y$, the p-value is

\[ P(Y \ge y \mid m = m_0) = \sum_{i=y}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^{i} \left(\frac{1}{2}\right)^{n-i} = \frac{1}{2^n} \sum_{i=y}^{n} \binom{n}{i}. \]

If the alternative hypothesis is $H_A : m < m_0$, then small values of $Y$ provide evidence against $H_0$, with p-value

\[ P(Y \le y \mid m = m_0) = \frac{1}{2^n} \sum_{i=0}^{y} \binom{n}{i}. \]

For $H_A : m \ne m_0$, the p-value is twice the p-value of the appropriate one-sided test.

We call this test the sign test.
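
The three p-values above can be sketched in Python with the standard library alone. The function name `sign_test_p_value` and its `alternative` argument are illustrative assumptions, not part of the lecture. Two further choices are assumptions as well: observations equal to $m_0$ are dropped (as the closing notes of this lecture prescribe), and the two-sided value doubles the smaller tail and is capped at 1, which is a common convention rather than something the slide states.

```python
from math import comb


def sign_test_p_value(xs, m0, alternative="greater"):
    """Sign test p-value for H0: median == m0 (illustrative helper)."""
    # Observations equal to m0 carry no sign information and are dropped.
    diffs = [x - m0 for x in xs if x != m0]
    n = len(diffs)
    y = sum(d > 0 for d in diffs)  # number of positive differences

    # Under H0, Y is binomial(n, 1/2), so each outcome has weight C(n, i) / 2^n.
    upper = sum(comb(n, i) for i in range(y, n + 1)) / 2**n  # P(Y >= y)
    lower = sum(comb(n, i) for i in range(0, y + 1)) / 2**n  # P(Y <= y)

    if alternative == "greater":  # H_A: m > m0
        return upper
    if alternative == "less":  # H_A: m < m0
        return lower
    # H_A: m != m0, twice the smaller one-sided p-value, capped at 1.
    return min(1.0, 2 * min(upper, lower))


toy = [1, 2, 3, 4]  # made-up data, all above m0 = 0
print(sign_test_p_value(toy, 0))               # 1/16
print(sign_test_p_value(toy, 0, "two-sided"))  # 2/16
```

With four observations all above $m_0$, the only outcome at least as extreme is $Y = 4$, so the one-sided p-value is $1/2^4$.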

Example

Recall: A study of 20 Shoshoni beaded rectangles yielded the following width-to-length ratios:

0.693 0.662 0.690 0.606 0.570 0.749 0.672 0.628 0.609 0.844
0.654 0.615 0.668 0.601 0.576 0.670 0.606 0.611 0.553 0.933

Suppose $m$ is the median of the distribution of width-to-length ratios of Shoshoni beaded rectangles. Suppose we wish to test

\[ H_0 : m = 0.618 \]
\[ H_A : m \ne 0.618. \]

The number of observations which exceed 0.618 is 11.

Example (cont'd)

So the p-value for the sign test is

\[ 2 \cdot \frac{1}{2^{20}} \sum_{i=11}^{20} \binom{20}{i} = 0.823803. \]

Note: this was computed in R with

> 2*(1-pbinom(10,20,.5))

Hence, viewed in this way, the data provide no evidence against $H_0$.

How might the outlier 0.933 have affected our tests?
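
As a cross-check on the R computation, the same p-value can be reproduced directly from the data in Python. This is only a sketch; the variable names are illustrative.

```python
from math import comb

ratios = [
    0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628, 0.609, 0.844,
    0.654, 0.615, 0.668, 0.601, 0.576, 0.670, 0.606, 0.611, 0.553, 0.933,
]
m0 = 0.618

n = len(ratios)
y = sum(x > m0 for x in ratios)  # observations exceeding 0.618

# Two-sided p-value: twice the upper tail P(Y >= y) of a binomial(n, 1/2).
upper_tail = sum(comb(n, i) for i in range(y, n + 1)) / 2**n
p_value = 2 * upper_tail

print(y, round(p_value, 6))  # prints: 11 0.823803
```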

Wilcoxon signed-rank test

Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous distribution which is symmetric about its median $m$. Suppose we wish to test

\[ H_0 : m = m_0 \]
\[ H_A : m > m_0. \]

For $i = 1, 2, \ldots, n$, let $D_i = X_i - m_0$, rank the $|D_i|$ from smallest to largest, and let $R(D_i)$ be the rank of $|D_i|$.

Let

\[ R_- = \sum_{D_i < 0} R(D_i). \]

Then we should reject $H_0$ for small values of $R_-$.

Wilcoxon signed-rank test (cont'd)

To test the alternative $H_A : m < m_0$, let

\[ R_+ = \sum_{D_i > 0} R(D_i) \]

and reject $H_0$ for small values of $R_+$.

For $H_A : m \ne m_0$, we double the appropriate one-sided p-value.

Note:

\[ R_+ + R_- = \sum_{i=1}^{n} R(D_i) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \]

so

\[ R_- = \frac{n(n+1)}{2} - R_+. \]

It is possible to find the null distribution of $R_+$, or $R_-$, using the fact that each of the $2^n$ sequences of signs of the $D_i$ is equally likely. Table VI contains lower-tail probabilities for this distribution for $4 \le n \le 15$.
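
The enumeration argument can be made concrete for a small sample size. The sketch below lists all $2^n$ equally likely sign sequences for $n = 4$, sums the ranks carrying a positive sign, and accumulates lower-tail probabilities. The function name `signed_rank_null_counts` is an illustrative assumption, and the printed values are not copied from Table VI.

```python
from collections import Counter
from itertools import product


def signed_rank_null_counts(n):
    """Count, for each r, the sign sequences whose positive ranks sum to r."""
    counts = Counter()
    for signs in product((0, 1), repeat=n):  # 1 marks a positive D_i
        r_plus = sum(k for k, s in zip(range(1, n + 1), signs) if s)
        counts[r_plus] += 1
    return counts


n = 4
counts = signed_rank_null_counts(n)

# Lower-tail probabilities P(R_+ <= r) under H0; each sequence has weight 1/2^n.
lower_tail = {}
cumulative = 0
for r in range(n * (n + 1) // 2 + 1):
    cumulative += counts[r]
    lower_tail[r] = cumulative / 2**n
    print(r, lower_tail[r])
```

Brute-force enumeration doubles in cost with each added observation, so it is only practical for small $n$. By symmetry the same table serves for $R_-$.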

Wilcoxon signed-rank test (cont'd)

If, for $k = 1, 2, \ldots, n$, we let

\[ I_k = \begin{cases} 1, & \text{if, for some } i, \ D_i \text{ has rank } k \text{ and } D_i > 0, \\ 0, & \text{otherwise,} \end{cases} \]

then

\[ R_+ = \sum_{k=1}^{n} k I_k. \]

Assuming $H_0$ is true, the $I_k$ are independent, and each $I_k$ is Bernoulli with probability of success $p = \frac{1}{2}$.

Hence, if $H_0$ is true,

\[ E[R_+] = \sum_{k=1}^{n} k E[I_k] = \frac{1}{2} \sum_{k=1}^{n} k = \frac{n(n+1)}{4} \]

and

\[ \operatorname{var}[R_+] = \sum_{k=1}^{n} k^2 \operatorname{var}[I_k] = \frac{1}{4} \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{24}. \]
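
These two formulas can be checked exactly for a small case by averaging over all $2^n$ sign sequences with rational arithmetic. The following is a sketch with $n = 6$ chosen arbitrarily; `exact_moments` is an illustrative name.

```python
from fractions import Fraction
from itertools import product


def exact_moments(n):
    """Exact null mean and variance of R_+ by enumerating all sign sequences."""
    values = [
        sum(k for k, s in zip(range(1, n + 1), signs) if s)
        for signs in product((0, 1), repeat=n)  # 1 marks a positive D_i
    ]
    total = len(values)  # 2^n equally likely sequences
    mean = Fraction(sum(values), total)
    variance = Fraction(sum(v * v for v in values), total) - mean**2
    return mean, variance


n = 6
mean, variance = exact_moments(n)
print(mean, Fraction(n * (n + 1), 4))                     # both 21/2
print(variance, Fraction(n * (n + 1) * (2 * n + 1), 24))  # both 91/4
```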

Wilcoxon signed-rank test (cont'd)

It may be shown that for $n > 15$ the null distributions of $R_+$ and $R_-$ are reasonably well approximated by a normal distribution with mean

\[ \frac{n(n+1)}{4} \]

and variance

\[ \frac{n(n+1)(2n+1)}{24}. \]

Example

Suppose $m$ is the median of the distribution of the Shoshoni beaded rectangle data and we wish to test

\[ H_0 : m = 0.618 \]
\[ H_A : m \ne 0.618. \]

The observed values $d_i$, $i = 1, 2, \ldots, 20$, are

 0.075  0.044  0.072 -0.012 -0.048  0.131  0.054  0.010 -0.009  0.226
 0.036 -0.003  0.050 -0.017 -0.042  0.052 -0.012 -0.007 -0.065  0.315

Example (cont'd)

Putting these in order of magnitude, with ranks in brackets, we have:

-0.003 [1]   -0.007 [2]   -0.009 [3]    0.010 [4]   -0.012 [5]
-0.012 [6]   -0.017 [7]    0.036 [8]   -0.042 [9]    0.044 [10]
-0.048 [11]   0.050 [12]   0.052 [13]   0.054 [14]  -0.065 [15]
 0.072 [16]   0.075 [17]   0.131 [18]   0.226 [19]   0.315 [20]

Hence

\[ r_+ = 4 + 8 + 10 + 12 + 13 + 14 + 16 + 17 + 18 + 19 + 20 = 151 \]

and

\[ r_- = 1 + 2 + 3 + 5 + 6 + 7 + 9 + 11 + 15 = 59. \]

The p-value is then

\[ 2 P(R_- \le 59 \mid m = 0.618) = 2(0.04484749) = 0.08969498, \]

where the R command

> psignrank(59,20)

was used to compute $P(R_- \le 59 \mid m = 0.618)$.
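
The rank sums and the exact p-value can be reproduced in Python. In this sketch the differences are held as integer thousandths to avoid floating-point ties, and the lower-tail probability comes from counting subsets of $\{1, \ldots, n\}$ by their sum. This is a standard counting approach and is not claimed to be how `psignrank` is implemented. The helper name `signed_rank_cdf` is an illustrative assumption.

```python
ratios = [
    0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628, 0.609, 0.844,
    0.654, 0.615, 0.668, 0.601, 0.576, 0.670, 0.606, 0.611, 0.553, 0.933,
]

# Differences from m0 = 0.618, in integer thousandths.
d = [round(1000 * x) - 618 for x in ratios]
n = len(d)

# Rank by magnitude. The two tied values (-0.012) are both negative, so the
# order in which they receive ranks 5 and 6 does not change either rank sum.
order = sorted(range(n), key=lambda i: abs(d[i]))
rank = {i: position + 1 for position, i in enumerate(order)}
r_plus = sum(rank[i] for i in range(n) if d[i] > 0)
r_minus = sum(rank[i] for i in range(n) if d[i] < 0)


def signed_rank_cdf(r, n):
    """P(R <= r) under H0, where R is a signed-rank sum for sample size n."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)  # counts[s] = subsets of {1..n} with sum s
    counts[0] = 1
    for k in range(1, n + 1):
        for s in range(max_sum, k - 1, -1):  # descending, so k is used once
            counts[s] += counts[s - k]
    return sum(counts[: r + 1]) / 2**n


p_value = 2 * signed_rank_cdf(min(r_plus, r_minus), n)
print(r_plus, r_minus, round(p_value, 8))  # prints: 151 59 0.08969498
```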

Example

To approximate this value using the normal approximation, we first find

\[ E[R_-] = \frac{(20)(21)}{4} = 105 \]

and

\[ \operatorname{var}[R_-] = \frac{(20)(21)(41)}{24} = 717.5. \]

Then

\[ z = \frac{59 - 105}{\sqrt{717.5}} = -1.717. \]

Hence an approximate p-value is given by

\[ 2\Phi(-1.717) = 0.08597917. \]
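
The same arithmetic can be sketched in Python, writing the standard normal distribution function $\Phi$ in terms of `math.erf`. Here $z$ is rounded to three decimals before it is passed to $\Phi$, to mirror the value $-1.717$ used above; using the unrounded $z$ changes the result slightly in the fifth decimal place.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


n, r_minus = 20, 59
mean = n * (n + 1) / 4                     # 105.0
variance = n * (n + 1) * (2 * n + 1) / 24  # 717.5

z = (r_minus - mean) / sqrt(variance)
z_rounded = round(z, 3)                    # -1.717, as in the text
p_approx = 2 * normal_cdf(z_rounded)

print(mean, variance, z_rounded, p_approx)
```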

Example (cont'd)

The R command

> wilcox.test(x,mu=.618)

will perform the test, computing the exact p-value if the sample size is less than 50 and there are no ties in the data, and using a normal approximation for the p-value otherwise.

By default, the latter approximation uses a continuity correction, and so will differ slightly from the approximation we computed above.
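
One common form of continuity correction moves the observed statistic half a unit toward its null mean before standardizing. The sketch below applies that form to $r_- = 59$ with $n = 20$. It is an assumption that this matches R's internal computation in every detail; in particular, R may also adjust the variance when ties are present, which this sketch does not do.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


n, r_minus = 20, 59
mean = n * (n + 1) / 4
sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)

z_plain = (r_minus - mean) / sd
# r_minus lies below the mean, so the correction shifts it up by 0.5.
z_corrected = (r_minus + 0.5 - mean) / sd

p_plain = 2 * normal_cdf(z_plain)
p_corrected = 2 * normal_cdf(z_corrected)
print(p_plain, p_corrected)
```

The corrected value is the larger of the two, since the shift moves the statistic closer to its null mean.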

Notes on the sign and signed-rank tests

In both the sign test and the signed-rank test, we discard any observations $x$ for which $x - m_0 = 0$.

In the signed-rank test, we average the ranks of any ties. The significance level is not affected greatly if the number of ties is small.

The tests are applicable to discrete distributions provided $P(X < m) = P(X > m)$, where $X$ is a random variable with the given distribution.
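
The first two notes can be sketched as a small Python helper that drops zero differences and assigns averaged ranks to tied magnitudes. The function name `signed_ranks` and the toy data are illustrative assumptions. Integer data are used so that ties compare exactly; with floating-point data, ties may need a tolerance or rescaling to integers.

```python
def signed_ranks(xs, m0):
    """Return (difference, averaged rank) pairs, dropping zero differences."""
    d = [x - m0 for x in xs if x != m0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all equal magnitudes.
        end = start
        while end + 1 < len(order) and abs(d[order[end + 1]]) == abs(d[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return list(zip(d, ranks))


pairs = signed_ranks([3, 5, 7, 8, 2], m0=5)  # made-up data; the 5 is discarded
r_plus = sum(r for diff, r in pairs if diff > 0)
r_minus = sum(r for diff, r in pairs if diff < 0)
print(pairs)            # [(-2, 1.5), (2, 1.5), (3, 3.5), (-3, 3.5)]
print(r_plus, r_minus)  # 5.0 5.0
```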