Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health


Measurement What are the 4 levels of measurement discussed? 1. Nominal or Classificatory Scale: gender, ethnic background. 2. Ordinal or Ranking Scale: hardness of rocks, beauty, military ranks. 3. Interval Scale: temperature in Celsius or Fahrenheit. 4. Ratio Scale: speed, height, mass or weight.

Parametric Assumptions The observations must be independent. The observations must be drawn from normally distributed populations. These populations must have the same variances.

Introduction The theory upon which the two-sample t-test is based requires that the two sampled populations be normal and have equal variances. Many other common statistical procedures have similar assumptions.

Introduction A large body of statistical methods is available comprising procedures that do not require estimation of the population mean and variance and do not state hypotheses about parameters. These testing procedures are termed nonparametric tests.

Introduction Nonparametric tests may be applied in any situation where we would be justified in employing a parametric test, such as the two-sample t-test, as well as in instances when the assumptions of the latter are untenable.

Introduction If either the parametric or nonparametric approach is applicable, then the former will always be more powerful than the latter.

Why use nonparametric statistics? Very small samples (< 20 replicates) carry a high probability of violating the assumption of normality, which leads to spurious Type I (false alarm) errors. Outliers more often lead to spurious Type I errors in parametric statistics. Nonparametric statistics reduce data to ordinal ranks, which reduces the impact or leverage of outliers.

Error Type I error: false alarm for a bogus effect. Reject the null hypothesis when it is really true. Type II error: miss a real effect. Fail to reject the null hypothesis when it is really false. Type III error ;) Laziness, incompetence, or willful ignorance of the truth.

Nonparametric Assumptions Observations are independent. The variable under study has underlying continuity.

Nonparametric Methods There is at least one nonparametric equivalent for each general type of parametric test. These tests fall into several categories: tests of differences between groups (independent samples), tests of differences between variables (dependent samples), and tests of relationships between variables.

Nonparametric Methods Sign Test; Wilcoxon Signed-Rank Test; Mann-Whitney-Wilcoxon Test; Kruskal-Wallis Test; Rank Correlation. Adapted from John S. Loucks, St. Edward's University.

Sign Test A common application of the sign test involves using a sample of n potential customers to identify a preference for one of two brands of a product. The objective is to determine whether there is a difference in preference between the two items being compared.

Sign Test To record the preference data, we use a plus sign if the individual prefers one brand and a minus sign if the individual prefers the other brand. Because the data are recorded as plus and minus signs, this test is called the sign test.

Example: Hand Cream Test Sign Test: Large-Sample Case o As part of a market research study, a sample of 36 consumers was asked to try two brands of hand cream and indicate a preference. o Do the data shown below indicate a significant difference in the consumer preferences for the two brands?

Example: Hand Cream Test 18 preferred L'Occitane (+ sign recorded). 12 preferred Bath & Body (- sign recorded). 6 had no preference. The analysis is based on a sample size of 18 + 12 = 30. Hypotheses H0: No preference for one brand over the other exists. Ha: A preference for one brand over the other exists.

Example: Hand Cream Test Rejection Rule Using a 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96. Test Statistic Under H0 the number of + signs has mean 0.5(30) = 15 and standard deviation sqrt(0.25(30)) = 2.74, so z = (18-15)/2.74 = 3/2.74 = 1.095. Conclusion Do not reject H0. There is insufficient evidence in the sample to conclude that a difference in preference exists for the two brands of hand cream. Fewer than 10 or more than 20 individuals would have to have a preference for a particular brand in order for us to reject H0.
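The large-sample sign test above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the normal approximation used on the slide; the function name `sign_test_z` is illustrative and not part of the original material.

```python
import math

def sign_test_z(n_plus, n_minus, p=0.5):
    """Large-sample sign test: normal approximation to the count of + signs."""
    n = n_plus + n_minus               # "no preference" responses are excluded
    mean = n * p                       # expected number of + signs under H0
    sd = math.sqrt(n * p * (1 - p))    # standard deviation under H0
    return (n_plus - mean) / sd

# Hand cream example: 18 plus signs, 12 minus signs, 6 ties dropped
z = sign_test_z(18, 12)
reject = abs(z) > 1.96                 # two-sided test at the 0.05 level

print(round(z, 3))   # 1.095
print(reject)        # False
```

Because |z| stays below 1.96, the sketch reaches the same conclusion as the slide: H0 is not rejected.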

Wilcoxon Signed-Rank Test The methodology of the parametric matched-sample analysis requires interval data and the assumption that the population of differences between the pairs of observations is normally distributed. If the assumption of normally distributed differences is not appropriate, the Wilcoxon signed-rank test can be used.

Wilcoxon Signed-Rank Test Preliminary Steps of the Test 1. Compute the differences between the paired observations. 2. Discard any differences of zero. 3. Rank the absolute values of the differences from lowest to highest; tied differences are assigned the average ranking of their positions. 4. Give the ranks the sign of the original difference in the data. 5. Sum the signed ranks. Next, determine whether the sum is significantly different from zero.

Example: Express Deliveries Wilcoxon Signed-Rank Test A huge animal hospital has decided to select one of two express delivery services. To test the delivery times of the two services, the Vet sends two reports to a sample of 10 district animal clinics, with one report carried by one service and the other report carried by the second service. Do the data (delivery times in hours) indicate a difference in the two services?

Example: Express Deliveries
District clinic   Overnight (hrs)   NiteFlite (hrs)
Seattle           32                25
Los Angeles       30                24
Boston            19                15
Cleveland         16                15
New York          15                13
Houston           18                15
Atlanta           14                15
St. Louis         10                8
Milwaukee         7                 9
Denver            16                11

Example: Express Deliveries
District clinic   Difference   Rank of |Diff.|   Signed rank
Seattle            7           10                +10
Los Angeles        6           9                 +9
Boston             4           7                 +7
Cleveland          1           1.5               +1.5
New York           2           4                 +4
Houston            3           6                 +6
Atlanta           -1           1.5               -1.5
St. Louis          2           4                 +4
Milwaukee         -2           4                 -4
Denver             5           8                 +8
Sum of signed ranks                              +44
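The signed ranks in this table follow directly from the preliminary steps of the test. The sketch below is one way to compute them in plain Python; the helper `average_ranks` is a hypothetical name introduced for illustration, and the delivery times are the ones given in the example.

```python
def average_ranks(values):
    """Rank values from lowest to highest; ties share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Delivery times in hours, in the order the clinics are listed
overnight = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
niteflite = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

diffs = [a - b for a, b in zip(overnight, niteflite) if a != b]  # drop zero differences
ranks = average_ranks([abs(d) for d in diffs])
signed = [r if d > 0 else -r for r, d in zip(ranks, diffs)]
T = sum(signed)

print(signed)  # [10.0, 9.0, 7.0, 1.5, 4.0, 6.0, -1.5, 4.0, -4.0, 8.0]
print(T)       # 44.0
```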

Example: Express Deliveries Hypotheses H0: The delivery times of the two services are the same; neither offers faster service than the other. Ha: Delivery times differ between the two services; recommend the one with the smaller times.

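Example: Express Deliveries Rejection Rule Using a 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96. Test Statistic $z = (T - \mu_T)/\sigma_T = (44 - 0)/19.62 = 2.24$, where $\mu_T = 0$ and $\sigma_T = \sqrt{n(n+1)(2n+1)/6} = \sqrt{(10)(11)(21)/6} = 19.62$ for the n = 10 pairs. Conclusion Reject H0. There is sufficient evidence in the sample to conclude that a difference exists in the delivery times provided by the two services. Because the differences (Overnight minus NiteFlite) are mostly positive, recommend using the NiteFlite service.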
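The test statistic for the express delivery example can be checked numerically. This is a small sketch assuming the normal approximation with mean 0 and standard deviation sqrt(n(n+1)(2n+1)/6) for the sum of signed ranks; the function name `signed_rank_z` is illustrative.

```python
import math

def signed_rank_z(T, n):
    """z statistic for the sum of signed ranks T from n nonzero differences."""
    sigma_T = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)  # standard deviation of T under H0
    return (T - 0) / sigma_T                            # mean of T under H0 is zero

z = signed_rank_z(44, 10)
reject = abs(z) > 1.96          # two-sided test at the 0.05 level

print(round(z, 2))   # 2.24
print(reject)        # True
```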

Kruskal-Wallis Test The Mann-Whitney-Wilcoxon (MWW) test can be used to test whether two populations are identical. The MWW test has been extended by Kruskal and Wallis for cases of three or more populations. The Kruskal-Wallis test can be used with ordinal, interval, or ratio data. It does not require the assumption of normally distributed populations. The hypotheses are: H0: All populations are identical. Ha: Not all populations are identical.

Mann-Whitney U Test

Two-sample rank test Although nonparametric procedures have been proposed for testing differences between the dispersion, or variability, of two populations, none has achieved widespread acceptance.

Differences between independent groups Two samples: compare the mean value for some variable of interest. Parametric test: t-test for independent samples. Nonparametric tests: Wald-Wolfowitz runs test, Mann-Whitney U test, Kolmogorov-Smirnov two-sample test.

Mann-Whitney U Test For this test, as for many other nonparametric procedures, the actual measurements are not employed; the ranks of the measurements are used instead. The data may be ranked either from the highest to the lowest or from the lowest to the highest values.

Mann-Whitney U Test Nonparametric alternative to the two-sample t-test. Actual measurements are not used; the ranks of the measurements are used. Data can be ranked from highest to lowest or from lowest to highest values. Calculate the Mann-Whitney U statistic (for one-sided tests): $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$

Mann-Whitney U Test Calculate the Mann-Whitney U statistic (two-sided): $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$ and $U' = n_1 n_2 - U$, where $n_1$ and $n_2$ are the numbers of observations in samples one and two, and $R_1$ is the sum of the ranks of the observations in sample one.

Mann-Whitney U Test Calculate the Mann-Whitney U statistic (two-sided): $U' = n_2 n_1 + \frac{n_2(n_2+1)}{2} - R_2$ and $U = n_1 n_2 - U'$, where $n_1$ and $n_2$ are the numbers of observations in samples one and two, and $R_2$ is the sum of the ranks of the observations in sample two.

Example of Mann-Whitney U test Two-tailed null hypothesis that there is no difference between the heights of male and female students. H0: Male and female students are the same height. Ha: Male and female students are not the same height.

Example 1
Heights of males (cm)   Rank   Heights of females (cm)   Rank
193                     1      175                       7
188                     2      173                       8
185                     3      168                       10
183                     4      165                       11
180                     5      163                       12
178                     6
170                     9
n1 = 7, R1 = 30                n2 = 5, R2 = 48
$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = (7)(5) + \frac{(7)(8)}{2} - 30 = 35 + 28 - 30 = 33$
$U' = n_1 n_2 - U = (7)(5) - 33 = 2$
Critical value: $U_{0.05(2),7,5} = U_{0.05(2),5,7} = 30$. As 33 > 30, H0 is rejected. $0.01 < P(U \ge 33 \text{ or } U' \le 2) < 0.02$

Calculation for z-statistics $E(U) = n_1 n_2/2 = (7)(5)/2 = 17.5$; $S(U) = \sqrt{n_1 n_2 (n_1 + n_2 + 1)/12} = \sqrt{(7)(5)(7+5+1)/12} = 6.16$; $z = [U' - E(U)]/S(U) = (2 - 17.5)/6.16 = -2.516$
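The whole of Example 1 can be traced in code, from ranking the heights to the z-statistic. This is a sketch under the assumptions of the slides: ranks run from the highest to the lowest value, the data contain no ties, and the normal approximation uses E(U) = n1 n2 / 2 and S(U) = sqrt(n1 n2 (n1 + n2 + 1) / 12). Variable names are illustrative.

```python
import math

males = [193, 188, 185, 183, 180, 178, 170]      # heights in cm
females = [175, 173, 168, 165, 163]

# Rank the pooled data from highest (rank 1) to lowest; no ties in this data set
pooled = sorted(males + females, reverse=True)
rank = {h: i + 1 for i, h in enumerate(pooled)}

n1, n2 = len(males), len(females)
R1 = sum(rank[h] for h in males)                 # rank sum of sample one
R2 = sum(rank[h] for h in females)               # rank sum of sample two

U = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U_prime = n1 * n2 - U

mean_U = n1 * n2 / 2
sd_U = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (U_prime - mean_U) / sd_U

print(R1, R2)        # 30 48
print(U, U_prime)    # 33.0 2.0
print(round(z, 2))   # -2.52
```

The unrounded z is about -2.517; the slide's -2.516 comes from rounding S(U) to 6.16 before dividing.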

Rejection Rule Using a 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96. Conclusion Reject H0. There is a significant difference between the heights of male and female students.

Example of Mann-Whitney U test H0: The performance of students is the same under the two teaching assistants. Ha: Students do not perform equally well under the two teaching assistants. α = 0.05

Example 2
Teaching Assistant A grades: A, A, A, A-, B, B, C+, C+, C, C, C- (n1 = 11, R1 = ?)
Teaching Assistant B grades: A, A, B+, B+, B, B-, C, C, C-, D, D, D, D, D- (n2 = 14, R2 = ?)
The rank of each grade is to be determined.

Example 2
Teaching Assistant A          Teaching Assistant B
Grade   Rank of grade         Grade   Rank of grade
A       3                     A       3
A       3                     A       3
A       3                     B+      7.5
A-      6                     B+      7.5
B       10                    B       10
B       10                    B-      12
C+      13.5                  C       16.5
C+      13.5                  C       16.5
C       16.5                  C-      19.5
C       16.5                  D       22.5
C-      19.5                  D       22.5
                              D       22.5
                              D       22.5
                              D-      25
n1 = 11, R1 = 114.5           n2 = 14, R2 = 210.5
$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = (11)(14) + \frac{(11)(12)}{2} - 114.5 = 154 + 66 - 114.5 = 105.5$
$U' = n_1 n_2 - U = (11)(14) - 105.5 = 48.5$
Critical value: $U_{0.05(2),11,14} = 114$. As 105.5 < 114, do not reject H0. $0.10 < P(U \ge 105.5 \text{ or } U' \le 48.5) < 0.20$

Calculation for z-statistics $E(U) = n_1 n_2/2 = (11)(14)/2 = 77$; $S(U) = \sqrt{n_1 n_2 (n_1 + n_2 + 1)/12} = \sqrt{(11)(14)(26)/12} = 18.27$; $z = [U' - E(U)]/S(U) = (48.5 - 77)/18.27 = -1.56$
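Example 2 differs from Example 1 in that the grades contain many ties, so each grade receives the average rank of its tied positions. The sketch below reproduces the rank sums, U, U', and z. The list `GRADE_ORDER` and the helper `tied_ranks` are assumptions introduced here for illustration, and, like the slide, the sketch applies no tie correction to S(U).

```python
import math

# Assumed ordering of letter grades from best to worst, used only for sorting
GRADE_ORDER = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "D-"]

ta_a = ["A", "A", "A", "A-", "B", "B", "C+", "C+", "C", "C", "C-"]
ta_b = ["A", "A", "B+", "B+", "B", "B-", "C", "C", "C-",
        "D", "D", "D", "D", "D-"]

def tied_ranks(grades):
    """Map each grade to the average rank of its tied positions (best grade first)."""
    pooled = sorted(grades, key=GRADE_ORDER.index)
    positions = {}
    for pos, g in enumerate(pooled, start=1):
        positions.setdefault(g, []).append(pos)
    return {g: sum(p) / len(p) for g, p in positions.items()}

rank = tied_ranks(ta_a + ta_b)
n1, n2 = len(ta_a), len(ta_b)
R1 = sum(rank[g] for g in ta_a)
R2 = sum(rank[g] for g in ta_b)

U = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U_prime = n1 * n2 - U

# Normal approximation without a tie correction, as in the slides
z = (U_prime - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(R1, R2)        # 114.5 210.5
print(U, U_prime)    # 105.5 48.5
print(round(z, 2))   # -1.56
```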

Rejection Rule Using a 0.05 level of significance, reject H0 if z < -1.96 or z > 1.96. Conclusion Do not reject H0. There is insufficient evidence to conclude that the performance of students differs between the two teaching assistants.

Differences between independent groups Multiple groups. Parametric: analysis of variance (ANOVA/MANOVA). Nonparametric: Kruskal-Wallis analysis of ranks, median test.

Differences between dependent groups Compare two variables measured in the same sample. Parametric: t-test for dependent samples. Nonparametric: sign test, Wilcoxon's matched pairs test. If more than two variables are measured in the same sample. Parametric: repeated measures ANOVA. Nonparametric: Friedman's two-way analysis of variance, Cochran Q.

Relationships between variables Parametric: correlation coefficient. Nonparametric: Spearman R, Kendall tau, coefficient gamma. Two variables of interest are categorical: chi-square, phi coefficient, Fisher exact test. Also: Kendall coefficient of concordance.

Summary Table of Statistical Tests
Columns (sample characteristics): 1 Sample | 2 Sample, Independent | 2 Sample, Dependent | K Sample (i.e., >2), Independent | K Sample, Dependent | Correlation
Categorical or Nominal: χ² or binomial | χ² | McNemar's χ² | χ² | Cochran's Q | (none listed)
Rank or Ordinal: (none listed) | Mann-Whitney U | Wilcoxon Matched Pairs Signed Ranks | Kruskal-Wallis H | Friedman's ANOVA | Spearman's rho
Parametric (Interval & Ratio): z test or t test | t test between groups | t test within groups | 1-way ANOVA between groups | 1-way ANOVA (within or repeated measures) | Pearson's r
Parametric (Interval & Ratio), additional: Factorial (2-way) ANOVA

Advantages of Nonparametric Tests Probability statements obtained from most nonparametric statistics are exact probabilities, regardless of the shape of the population distribution from which the random sample was drawn. If sample sizes as small as N = 6 are used, there is no alternative to using a nonparametric test.

Advantages of Nonparametric Tests They can treat samples made up of observations from several different populations. They can treat data that are inherently in ranks, as well as data whose seemingly numerical scores have only the strength of ranks. They are available to treat data that are classificatory. They are easier to learn and apply than parametric tests.

Criticisms of Nonparametric Procedures Losing precision/wasteful of data; low power; false sense of security; lack of software; testing distributions only; higher-order interactions not dealt with.

A good tree will bear good fruits