Version 1: Equality of Distributions. 3. F (x) and G(x) represent the distribution functions corresponding to the Xs and Y s, respectively.

Irma George
5 years ago

4 Two-Sample Methods

4.1 The (Mann-Whitney) Wilcoxon Rank Sum Test

Version 1: Equality of Distributions

Assumptions: Given two independent random samples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_m$:

1. The measurement scale is at least ordinal.
2. The response variable of interest is continuous.
3. $F(x)$ and $G(x)$ represent the distribution functions corresponding to the Xs and Ys, respectively.

Hypotheses:

(A) Two-sided: $H_0: F(x) = G(x)$ for all $x$ vs $H_1: F(x) \neq G(x)$ for some $x$
(B) One-sided: $H_0: F(x) = G(x)$ for all $x$ vs $H_1: F(x) > G(x)$ for some $x$
(C) One-sided: $H_0: F(x) = G(x)$ for all $x$ vs $H_1: F(x) < G(x)$ for some $x$

The alternative hypothesis $H_1$ in (B) implies that X tends to be less than Y, and $H_1$ in (C) implies that Y tends to be less than X.

Rationale: Suppose that ranks are assigned to the combined sample $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$. If the distributions of X and Y are the same, then we would expect the averages of the ranks associated with the Xs and with the Ys to be similar. However, if the distributions of X and Y are different, and the smaller (larger) ranks tend to be associated with X and the larger (smaller) ranks with Y, then we would expect the rank averages associated with the Xs and Ys to differ systematically. Thus, if the average (or sum) of the ranks associated with the Xs or Ys is too large or too small, we have evidence that $H_0: F(x) = G(x)$ should be rejected.

Method: For a given $\alpha$:

- Assign ranks to the combined sample $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$.
- Let $W = \sum_{i=1}^{n} R(X_i)$, the sum of the ranks assigned to the Xs.
- Under $H_0$, the sampling distribution of W is determined from the randomization distribution of the $n + m$ ranks partitioned into two groups of size n and m. For each randomization, W is calculated. The distribution of W is then determined.
- Let $w_p$ be the $p$th quantile of W. That is, $w_p$ is the largest value of the test statistic W such that $P(W \le w_p) \le p$.
- In the table of critical values, selected lower and upper quantiles $w_p$ are given for various common values of p and for small n and m.

Decision Rule

(A) For the two-sided test $H_1: F(x) \neq G(x)$, reject $H_0$ if $W < w_L$ or $W > w_U$, where $w_L$ and $w_U$ are the lower and upper $\alpha/2$ tabled critical values. Otherwise, fail to reject $H_0$.
(B) For the one-sided test $H_1: F(x) > G(x)$, reject $H_0$ if $W < w_L$ (the lower $\alpha$ tabled critical value). Otherwise, fail to reject $H_0$.
(C) For the one-sided test $H_1: F(x) < G(x)$, reject $H_0$ if $W > w_U$ (the upper $\alpha$ tabled critical value). Otherwise, fail to reject $H_0$.

Table of Critical Values for W for small m and n

Appendix: Wilcoxon Rank-Sum Table. Probabilities relate to the distribution of $W_A$, the rank sum for group A when $H_0: A = B$ is true. The tabulated value for the lower tail is the largest value of $w_A$ for which $\mathrm{pr}(W_A \le w_A) \le$ prob. The tabulated value for the upper tail is the smallest value of $w_A$ for which $\mathrm{pr}(W_A \ge w_A) \le$ prob.

(Table columns: $n_A$, $n_B$, lower-tail critical values by prob, upper-tail critical values by prob. The table entries are not reproduced here.)

Example of the Distribution of W under $H_0$ when n = 4, m = 4

The following table lists the $\binom{8}{4} = 70$ possible assignments of ranks $(R_1, R_2, R_3, R_4)$ and the corresponding rank sum W.

(Table columns: $R_1$, $R_2$, $R_3$, $R_4$, W. The 70 rows are not reproduced here.)

The following table describes the distribution of W under $H_0$ when n = 4, m = 4. The cumulative probabilities are then used to calculate an exact p-value for a given value of W.

Distribution of W with n = 4 and m = 4

(Table columns: $w_L$, $w_U$, Frequency, $P(W = w_L) = P(W = w_U)$, $P(W \le w_L) = P(W \ge w_U)$. The table entries are not reproduced here.)
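The randomization distribution described above can be rebuilt by brute force. The following is a minimal sketch in R, not the notes' own code: it enumerates all 70 ways of assigning four of the eight ranks to the X sample and tabulates the rank sum. The names `assignments`, `null_dist`, and `p_lower` are illustrative, and the observed value W = 11 in the last step is a hypothetical input.

```r
# Enumerate every way to give 4 of the 8 ranks to the X sample.
n <- 4
m <- 4
assignments <- combn(n + m, n)   # 4 x 70 matrix; each column is one assignment
W <- colSums(assignments)        # rank sum W for each assignment

# Tabulate the exact null distribution of W.
freq <- table(W)
null_dist <- data.frame(
  w       = as.integer(names(freq)),
  freq    = as.vector(freq),
  prob    = as.vector(freq) / length(W),
  cumprob = cumsum(as.vector(freq)) / length(W)
)
print(null_dist)

# Exact lower-tail p-value for a hypothetical observed rank sum W = 11.
p_lower <- mean(W <= 11)
print(p_lower)
```

Because the distribution is symmetric about n(n + m + 1)/2 = 18, each lower-tail probability equals the matching upper-tail probability, which is why the table pairs $w_L$ with $w_U$.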

Version 2: Shift in Location

In addition to Assumptions 1-3 in Version 1, assume that the distributions of X and Y differ only with respect to a shift in location, that is:

(4) $G(x) = F(x + \Delta)$ for all $x$

This assumption states that the two distributions are identical in shape but are centered at different values.

Hypotheses: By including assumption (4), the MWW-test can be used to compare means or medians. For means ($\theta_x = \mu_X$, $\theta_y = \mu_Y$) or for medians ($\theta_x = M_x$, $\theta_y = M_y$):

(A) Two-sided: $H_0: \theta_x = \theta_y$ vs $H_1: \theta_x \neq \theta_y$.
(B) One-sided: $H_0: \theta_x = \theta_y$ vs $H_1: \theta_x < \theta_y$.
(C) One-sided: $H_0: \theta_x = \theta_y$ vs $H_1: \theta_x > \theta_y$.

Rationale: If the distributions of X and Y differ only with respect to a shift in location, then $\Delta = \mu_X - \mu_Y$ and $\Delta = M_x - M_y$. Consider the following cases:

(i) If $\Delta > 0$, then the Xs will tend to be larger than the Ys, so the ranks of the Xs will tend to be larger than the ranks of the Ys.
(ii) If $\Delta < 0$, then the Xs will tend to be smaller than the Ys, so the ranks of the Xs will tend to be smaller than the ranks of the Ys.
(iii) If $\Delta = 0$, then the distributions of X and Y are identical, so the ranks of the Xs will tend to be distributed similarly to the ranks of the Ys.

Method and Decision Rule: Follow the same Method and Decision Rule as for Version 1.

Large Sample Approximation: When either $n_1$ or $n_2$ or both are large and there are no ties, the Central Limit Theorem for ranks applies. Here $n_1$ and $n_2$ are the two sample sizes and W is the rank sum of the sample of size $n_1$. That is,

$$z = \frac{W - n_1(n_1 + n_2 + 1)/2 + c}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}} = \frac{W - E(W) + c}{\sqrt{\mathrm{Var}(W)}} \xrightarrow{d} N(0, 1).$$

A value of z can be used to approximate p-values. If no continuity correction is used, then $c = 0$. If a continuity correction is used, then $c = -0.5$ if $W - E(W) > 0$ or $c = +0.5$ if $W - E(W) < 0$, so that the correction always moves W half a unit toward $E(W)$. Both SAS and R use this correction.
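The large-sample formula can be checked numerically. The sketch below is illustrative only: `rank_sum_z` is a hypothetical helper name, and the input W = 153 is the rank sum of the 11 Huntington's disease values in this section's glucose example, computed with midranks. Those data contain ties, and the tie adjustment to the variance is ignored here, so the result will differ slightly from what SAS or R report.

```r
# z statistic for the rank sum W of the first sample (size n1), assuming no ties.
rank_sum_z <- function(W, n1, n2, correct = TRUE) {
  EW   <- n1 * (n1 + n2 + 1) / 2        # E(W) under H0
  VarW <- n1 * n2 * (n1 + n2 + 1) / 12  # Var(W) under H0, no tie adjustment
  # Continuity correction: move W half a unit toward E(W).
  c_adj <- if (correct) -0.5 * sign(W - EW) else 0
  (W - EW + c_adj) / sqrt(VarW)
}

z <- rank_sum_z(W = 153, n1 = 11, n2 = 10)
p_two_sided <- 2 * pnorm(-abs(z))       # approximate two-sided p-value
print(c(z = z, p = p_two_sided))
```

Writing the correction as `-0.5 * sign(W - EW)` gives c = 0 when W equals its expectation, which keeps z at exactly zero in that case.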

If there are ties, see page 94 of Daniel for the adjustment to W. Later, we will see a Randomization Test as an alternative approach that does not need a special formula to handle ties (Higgins p. 41-42).

Example: In a study of 11 patients with Huntington's disease and a control group of 10 subjects, the response (milligrams percent) to a 5-hour oral glucose treatment was studied. Perform a Mann-Whitney (Wilcoxon) Rank Sum Test for a two-sided alternative comparing medians.

(Table columns: Control (X), Rank, HD patient (Y), Rank. The table entries are not reproduced here; the glucose values are listed in the R code for this example.)
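The ranking step for this example can be sketched in R as follows. The glucose values are the ones listed in the R code for this example; the variable names `combined`, `ranked`, and `rank_sums` are illustrative, and average ranks for ties are an assumption that matches the "Average scores were used for ties" note in the SAS output.

```r
HDsubject <- c(85, 89, 86, 91, 77, 93, 100, 82, 92, 86, 86)
Control   <- c(83, 73, 65, 65, 90, 77, 78, 97, 85, 75)

# Rank the combined sample; rank() gives tied values their average rank.
combined <- c(Control, HDsubject)
group    <- rep(c("Control", "HD"), times = c(length(Control), length(HDsubject)))
ranks    <- rank(combined)

# Show each value with its group and rank, in increasing order.
ranked <- data.frame(value = combined, group = group, rank = ranks)
print(ranked[order(ranked$value), ])

# Rank sum for each group.
rank_sums <- tapply(ranks, group, sum)
print(rank_sums)
```

The two rank sums must add to N(N + 1)/2 = 231 for N = 21 observations, which is a quick check on the ranking.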

R code for Mann-Whitney (Wilcoxon) Rank Sum Test

    # Mann-Whitney (Wilcoxon) Rank Sum Test
    HDsubject <- c(85,89,86,91,77,93,100,82,92,86,86)
    Control <- c(83,73,65,65,90,77,78,97,85,75)

    # Two-sided alternative with confidence interval for median difference
    wilcox.test(HDsubject, Control, conf.int=TRUE)

    # One-sided alternative: Median 1 > Median 2
    wilcox.test(HDsubject, Control, alternative="greater")

    # One-sided alternative: Median 1 < Median 2
    wilcox.test(HDsubject, Control, alternative="less")

R output for Mann-Whitney (Wilcoxon) Rank Sum Test

Warning: R does not report the rank sum W, but an equivalent U statistic, where $U = \min(U_1, U_2)$ with $U_1 = W_1 - n_1(n_1 + 1)/2$ and $U_2 = W_2 - n_2(n_2 + 1)/2$, and $W_1$ and $W_2$ are the rank sums of the two samples. The value R labels W is $U_1$ for the first sample passed to wilcox.test. U is referred to as the Mann-Whitney U-statistic. It is also mathematically equivalent to either the number of pairs $(X_i, Y_j)$ for which $X_i < Y_j$ or the number of pairs $(X_i, Y_j)$ for which $Y_j < X_i$ (whichever is smaller). See Siegel p. 120 for details. If there are any tied values ($X_i = Y_j$ for some i, j), then U can be adjusted by assigning 1/2 to each tied pair.

    Wilcoxon rank sum test with continuity correction
    data: HDsubject and Control
    W = 87, p-value =          <-- p-value
    alternative hypothesis: true location shift is not equal to 0
    95 percent confidence interval:
    sample estimates:
    difference in location

    Wilcoxon rank sum test with continuity correction
    data: HDsubject and Control
    W = 87, p-value =
    alternative hypothesis: true location shift is greater than 0

    Wilcoxon rank sum test with continuity correction
    data: HDsubject and Control
    W = 87, p-value =
    alternative hypothesis: true location shift is less than 0
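The pair-counting description of U can be verified directly. This is a sketch under the stated tie rule (1/2 for each tied pair); the names `U_HD`, `U_Control`, and `U_from_ranks` are illustrative and not part of the original notes.

```r
HDsubject <- c(85, 89, 86, 91, 77, 93, 100, 82, 92, 86, 86)
Control   <- c(83, 73, 65, 65, 90, 77, 78, 97, 85, 75)

# Compare every HD value with every control value (11 x 10 matrix of differences).
diffs <- outer(HDsubject, Control, "-")

# Count pairs in each direction, giving 1/2 to each tied pair.
U_HD      <- sum(diffs > 0) + 0.5 * sum(diffs == 0)  # pairs with HD > Control
U_Control <- sum(diffs < 0) + 0.5 * sum(diffs == 0)  # pairs with HD < Control

# The same U for the HD sample from its rank sum: U = W - n1(n1 + 1)/2.
n1 <- length(HDsubject)
r  <- rank(c(HDsubject, Control))
U_from_ranks <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

print(c(U_HD = U_HD, U_Control = U_Control, U_from_ranks = U_from_ranks))
```

The two counts always add to the number of pairs, here 11 x 10 = 110, so either one determines the other.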

SAS output for Mann-Whitney (Wilcoxon) Rank Sum Test

    Mann-Whitney (Wilcoxon) Rank Sum Test
    The NPAR1WAY Procedure

    Wilcoxon Scores (Rank Sums) for Variable glucose
    Classified by Variable treatmnt

    treatmnt    N    Sum of Scores    Expected Under H0    Std Dev Under H0    Mean Score
    HD subj
    Control
    Average scores were used for ties.

    Wilcoxon Two-Sample Test
    Statistic (S)
    Normal Approximation
      Z
      One-Sided Pr < Z
      Two-Sided Pr > |Z|
    t Approximation
      One-Sided Pr < Z
      Two-Sided Pr > |Z|
    Exact Test
      One-Sided Pr <= S
      Two-Sided Pr >= |S - Mean|
    Z includes a continuity correction of 0.5.

    Hodges-Lehmann Estimation
    Location Shift (Control - HD subj)
    Type    95% Confidence Limits    Interval Midpoint    Asymptotic Standard Error
    Asymptotic (Moses)
    Exact

(The numeric entries of the SAS output are not reproduced here.)

SAS code for Mann-Whitney (Wilcoxon) Rank Sum Test

    DM 'LOG;CLEAR;OUT;CLEAR;';
    ODS LISTING;
    OPTIONS LS=72 PS=54 NONUMBER NODATE;

    data in;
      do i = 1 to 11; treatmnt = 'HD subj'; input glucose @@; output; end;
      do i = 1 to 10; treatmnt = 'Control'; input glucose @@; output; end;
    lines;
    85 89 86 91 77 93 100 82 92 86 86
    83 73 65 65 90 77 78 97 85 75
    ;

    PROC NPAR1WAY DATA=in WILCOXON;
      CLASS treatmnt;
      VAR glucose;
      EXACT WILCOXON HL;
      TITLE 'Mann-Whitney (Wilcoxon) Rank Sum Test';
    RUN;

4.1.1 Hodges-Lehmann Confidence Intervals for the Shift Parameter

The following procedure generates an approximate $100(1 - \alpha)\%$ confidence interval for $\Delta$.

- Calculate all mn paired differences $U_{ij} = X_i - Y_j$.
- Arrange the $U_{ij}$ in increasing order.
- Use the table of critical values for W to find $k = w_{\alpha/2} - n(n + 1)/2$.
- Let L and U be the $k$th smallest and $k$th largest $U_{ij}$ values.
- The approximate $100(1 - \alpha)\%$ confidence interval for the shift parameter $\Delta$ is (L, U). Thus, it is also a confidence interval for $\mu_X - \mu_Y$ and $M_x - M_y$.

This is the procedure used by SAS and R. In the previous example, the output from SAS indicated the 95% confidence interval was (-17, -1) while R indicated the 95% confidence interval was (1, 17) (milligrams percent). You need to keep track of the direction of the calculated differences (e.g., Control - HD Patient in SAS or HD Patient - Control in R).

4.2 The Median Test for Differences in Population Medians

Assumptions: Given two independent random samples $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$:

- The measurement scale is at least ordinal.
- The variable of interest is continuous.
- The two populations have the same shape.

Hypotheses about medians: $H_0: M_x = M_y$ vs $H_1: M_x \neq M_y$ or $H_1: M_x < M_y$ or $H_1: M_x > M_y$

Rationale: If the two populations have the same median M:

- For a random X or Y, $P(X > M) = P(Y > M) = 0.5$.
- Assuming $H_0$ is true, we estimate the common median M by computing the sample median $\hat{M}$ of the combined sample $X_1, X_2, \ldots, X_m, Y_1, Y_2, \ldots, Y_n$ of $N = m + n$ observations.
- We would expect about half of the $X_1, X_2, \ldots, X_m$ sample and half of the $Y_1, Y_2, \ldots, Y_n$ sample to be above (or below) $\hat{M}$.

Method: For a given $\alpha$: Compute $\hat{M}$ and summarize the results in a 2 x 2 table:

    Relationship to                      Sample
    Sample Median         Xs       Ys       Total
    Above                 A        B        A + B
    Equal or Below        C        D        C + D
    Total                 m        n        N
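The 2 x 2 table for the median test can be built in a few lines of R. This sketch uses the glucose values from this section's Huntington's disease example, with the HD patients as the X sample (m = 11) and the controls as the Y sample (n = 10); the names `M_hat` and `tab` are illustrative.

```r
x <- c(85, 89, 86, 91, 77, 93, 100, 82, 92, 86, 86)  # HD patients (X), m = 11
y <- c(83, 73, 65, 65, 90, 77, 78, 97, 85, 75)       # controls (Y), n = 10

# Sample median of the combined sample of N = m + n observations.
M_hat <- median(c(x, y))

# Counts above versus equal to or below the combined median.
A  <- sum(x > M_hat)
B  <- sum(y > M_hat)
C  <- sum(x <= M_hat)
D  <- sum(y <= M_hat)

# matrix() fills column by column: first column (A, C), second column (B, D).
tab <- matrix(c(A, C, B, D), nrow = 2,
              dimnames = list(c("Above", "Equal or Below"), c("Xs", "Ys")))
print(addmargins(tab))
```

Values equal to the combined median are placed in the "Equal or Below" row, following the table layout above.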

Under $H_0$, the sampling distribution of A is hypergeometric. That is, for a sample of size A + B from a population of size N comprised of two types of items (m of Type X and n = N - m of Type Y), the probability of observing exactly A of Type X and B of Type Y is

$$P(A, B \mid m, n) = \frac{\binom{m}{A}\binom{n}{B}}{\binom{N}{A+B}}$$

These probabilities can be determined using SAS or R or can be approximated by the normal distribution when N is large.

For a two-sided alternative, the exact p-value is the sum of the extreme tail probabilities from hypergeometric distributions:

$$p\text{-value} = P(X \le \min(A, B)) + P(X \ge \max(C, D))$$

For a one-sided alternative, use the appropriate single extreme tail probability to find the exact p-value.

An approximate p-value can be generated from a normal approximation with z-statistic:

$$z = \frac{(A/m) - (B/n)}{\sqrt{\hat{p}(1 - \hat{p})(1/m + 1/n)}} \quad \text{where } \hat{p} = (A + B)/N.$$

Decision Rule using p-values: If p-value $\le \alpha$, reject $H_0$. Otherwise, fail to reject $H_0$.

Example: In a study of 11 patients with Huntington's disease and a control group of 10 subjects, the response (milligrams percent) to a 5-hour oral glucose treatment was studied. Perform a Median Test for a two-sided alternative.

(Table columns: HD patient (X), Control (Y). The table entries are not reproduced here; the glucose values are listed in the R code for this example.)
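The hypergeometric probabilities and the z-statistic can be sketched in R as follows. The counts A = 8 and B = 2 are computed from the example's glucose values (8 HD values and 2 control values lie above the combined median of 85) rather than quoted from the notes, and only a single upper tail is shown; the names `pmf`, `p_upper`, and `p_hat` are illustrative.

```r
m <- 11          # HD patients (X sample)
n <- 10          # controls (Y sample)
N <- m + n
A <- 8           # X values above the combined median
B <- 2           # Y values above the combined median

# Hypergeometric pmf: probability that a of the A + B above-median values are Xs.
a_vals <- max(0, A + B - n):min(m, A + B)
pmf    <- dhyper(a_vals, m, n, A + B)

# Single upper-tail probability P(A >= 8).
p_upper <- phyper(A - 1, m, n, A + B, lower.tail = FALSE)

# Normal-approximation z-statistic.
p_hat <- (A + B) / N
z     <- (A / m - B / n) / sqrt(p_hat * (1 - p_hat) * (1 / m + 1 / n))

print(c(p_upper = p_upper, z = z))
```

Here `dhyper(x, m, n, k)` takes the number of Type X items, the number of Type Y items, and the number drawn, so the call matches the formula with k = A + B.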

R output from generating a hypergeometric distribution

    > # Hypergeometric Distribution for A,B row
          xh  pdfh  cdfh  rev_cdfh
     [1,]
     [2,]
     [3,]                           <--
     [4,]
     [5,]
     [6,]
     [7,]
     [8,]
     [9,]
    [10,]
    [11,]

    > # Hypergeometric Distribution C,D row
          xh  pdfh  cdfh  rev_cdfh
     [1,]
     [2,]
     [3,]
     [4,]
     [5,]
     [6,]
     [7,]
     [8,]
     [9,]                           <--
    [10,]
    [11,]
    [12,]

(The numeric entries of the R output are not reproduced here.)

R code for generating a hypergeometric distribution

    Nh <- 21
    AB <- 10

    # Hypergeometric Distribution for A,B row
    mh <- 10
    xh <- (0:mh)
    pdfh <- dhyper(xh, mh, Nh-mh, AB)
    pdfh <- round(pdfh, 10)
    cdfh <- phyper(xh, mh, Nh-mh, AB)
    cdfh <- round(cdfh, 10)
    rev_cdfh <- phyper(AB-xh, mh, Nh-mh, AB)
    rev_cdfh <- round(rev_cdfh, 10)
    hypergeo <- cbind(xh, pdfh, cdfh, rev_cdfh)
    hypergeo

    # Hypergeometric Distribution C,D row
    nh <- 11
    xh <- (0:nh)
    pdfh <- dhyper(xh, nh, Nh-nh, AB)
    pdfh <- round(pdfh, 10)
    cdfh <- phyper(xh, nh, Nh-nh, AB)
    cdfh <- round(cdfh, 10)
    rev_cdfh <- phyper(AB-xh, nh, Nh-nh, AB)
    rev_cdfh <- round(rev_cdfh, 10)
    hypergeo <- cbind(xh, pdfh, cdfh, rev_cdfh)
    hypergeo

SAS output for Median Test to compare two medians

    Median Test to Compare Two Medians
    The NPAR1WAY Procedure

    Median Scores (Number of Points Above Median) for Variable glucose
    Classified by Variable treatmnt

    treatmnt    N    Sum of Scores    Expected Under H0    Std Dev Under H0    Mean Score
    HD subj
    Control
    Average scores were used for ties.

    Median Two-Sample Test
    Statistic (S)
    Normal Approximation
      Z
      One-Sided Pr < Z
      Two-Sided Pr > |Z|
    Exact Test
      One-Sided Pr <= S
      Two-Sided Pr >= |S - Mean|

(The numeric entries of the SAS output are not reproduced here.)

SAS code for Median Test to compare two medians

    DM 'LOG;CLEAR;OUT;CLEAR;';
    OPTIONS LS=72 PS=54 NONUMBER NODATE;
    ODS LISTING;

    data in;
      do i = 1 to 11; treatmnt = 'HD subj'; input glucose @@; output; end;
      do i = 1 to 10; treatmnt = 'Control'; input glucose @@; output; end;
    lines;
    85 89 86 91 77 93 100 82 92 86 86
    83 73 65 65 90 77 78 97 85 75
    ;

    PROC NPAR1WAY DATA=in MEDIAN;
      CLASS treatmnt;
      VAR glucose;
      EXACT MEDIAN;
      TITLE 'Median Test to Compare Two Medians';
    RUN;

R code for Median Test to compare two medians

    # Two-Sample Median Test
    library(modeltools)
    library(coin)

    # x are the HD subjects
    x <- c(85,89,86,91,77,93,100,82,92,86,86)
    # y are the control subjects
    y <- c(83,73,65,65,90,77,78,97,85,75)

    xy <- c(x,y)
    group <- c(rep(1,length(x)),rep(2,length(y)))
    group <- as.factor(group)
    two_sample <- data.frame(cbind(xy,group))
    # two_sample

    # Two-sided alternative with confidence interval for median difference
    median_test(xy~group, conf.int=TRUE, conf.level=.95, distribution="exact")

    # One-sided alternatives
    median_test(xy~group, conf.int=TRUE, distribution="exact", alternative="less")
    median_test(xy~group, conf.int=TRUE, distribution="exact", alternative="greater")

R output for Median Test to compare two medians

    > # Two-Sample Median Test
    > # Two-sided alternative with confidence interval for median difference
    Exact Median Test
    Z = , p-value =
    alternative hypothesis: true mu is not equal to 0
    95 percent confidence interval:
     1 19
    sample estimates:
    difference in location
                       0.5

    > # One-sided alternatives
    Exact Median Test
    Z = , p-value =
    alternative hypothesis: true mu is less than 0
    95 percent confidence interval:
     -Inf 19
    sample estimates:
    difference in location
                       0.5

    Exact Median Test
    Z = , p-value =
    alternative hypothesis: true mu is greater than 0
    95 percent confidence interval:
     1 Inf
    sample estimates:
    difference in location
