Non-parametric methods

Eastern Mediterranean University, Faculty of Medicine
Biostatistics course: Non-parametric methods
March 4 & 7, 2016
Instructor: Dr. Nimet İlke Akçay (ilke.cetin@emu.edu.tr)

Learning Objectives
1. Distinguish parametric and nonparametric test methods
2. Explain commonly used nonparametric test methods
3. Perform hypothesis tests using nonparametric procedures

Nonparametric Statistics: Introduction to nonparametric methods
Parametric tests (t-test, z-test, etc.):
- involve estimating population parameters such as the mean
- are based on assumptions of normality or known variances (stringent assumptions)
What if the data do not follow a Normal distribution?
Non-parametric tests:
- were developed for situations where no (or fewer) assumptions have to be made
- are also called distribution-free tests
- still have assumptions, but they are less stringent
- can be applied to normally distributed data, but parametric tests have greater power if their assumptions are met

Parametric vs. Nonparametric test procedures
Parametric
- Involve population parameters (example: population mean)
- Require interval or ratio scale, i.e. whole numbers or fractions (example: height in inches: 72, 60.5, 54.7)
- Have stringent assumptions (example: Normal distribution)
- Examples: z-test, t-test, F-test
Nonparametric
- Do not involve population parameters (example: probability distributions, independence)
- Data measured on any scale: ratio or interval; ordinal (example: good-better-best); nominal (example: male-female)
- Have no stringent assumptions
- Examples: Sign test, Mann-Whitney U test, Wilcoxon test

Advantages and Disadvantages of Non-parametric tests
Advantages
- Used with all measurement scales
- Analysis possible for ranked or categorical data
- Distribution-free: may be used when the form of the sampled population is unknown
- Easy to compute
- Make fewer assumptions
- No need to involve population parameters
- Results may be as exact as parametric procedures
Disadvantages
- May waste information if the data and assumptions permit using parametric procedures (example: converting data from ratio to ordinal scale, e.g. height values -> short-average-tall)
- Difficult to compute by hand for large samples
- Tables not widely available

Determination: Parametric or Non-parametric?

Hypothesis Testing Procedures
Type of design | Parametric test | Nonparametric test
One sample | One-sample t-test | Sign test
Two independent samples | Independent-samples t-test | Wilcoxon rank sum test, Mann-Whitney U test
Two paired samples | Paired-samples t-test | Wilcoxon signed ranks test
... | ... | ...

One sample hypothesis testing design: Sign Test

Sign Test
- Used as an alternative to the one-sample t-test when the normality assumption is not met.
- The only assumption is that the distribution of the underlying variable (data) is continuous.
- The test focuses on the median rather than the mean.
- The test is based on signs: pluses and minuses.
- The test can be used for one sample as well as for two samples.

Test Statistic in Sign Test
The test statistic for the sign test is either the observed number of plus signs or the observed number of minus signs. The nature of the alternative hypothesis determines which of these test statistics is appropriate. In a given test, any one of the following alternative hypotheses is possible:
HA: P(+) > P(-) (one-sided alternative)
HA: P(+) < P(-) (one-sided alternative)
HA: P(+) ≠ P(-) (two-sided alternative)

Test Statistic in Sign Test
- If the alternative hypothesis is HA: P(+) > P(-), a sufficiently small number of minus signs causes rejection of H0. The test statistic is the number of minus signs.
- If the alternative hypothesis is HA: P(+) < P(-), a sufficiently small number of plus signs causes rejection of H0. The test statistic is the number of plus signs.
- If the alternative hypothesis is HA: P(+) ≠ P(-), either a sufficiently small number of plus signs or a sufficiently small number of minus signs causes rejection of H0. We may take as the test statistic the less frequently occurring sign.
Calculation of the test statistic's probability, with q = 1 - p:
$$P(X \le k \mid n, p) = \sum_{x=0}^{k} \binom{n}{x} p^{x} q^{n-x}$$
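The cumulative binomial probability above can be evaluated directly instead of being read from a table. The following is a minimal Python sketch; the function name `binom_cdf` is chosen here for illustration and does not come from the slides, and p = 0.5 is assumed as the default because plus and minus signs are equally likely under H0.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p).

    Sums C(n, x) * p**x * q**(n - x) for x = 0..k, with q = 1 - p.
    """
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q ** (n - x) for x in range(k + 1))


# Illustrative call: probability of at most 1 occurrence of a sign
# among n = 7 observations when both signs are equally likely.
print(binom_cdf(1, 7))  # 0.0625
```

For a two-sided alternative, one common convention (not stated in the slides) is to double this one-tailed probability and cap the result at 1.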

Sign Test Example
Seven patients were asked to rate the services in a private hospital on a 5-point scale (1 = terrible, ..., 5 = excellent). The obtained ratings are given in the table. At the α = 0.05 level, is there evidence that the median rating is less than 3?
Patient | Rating | Sign
1 | 2 | -
2 | 5 | +
3 | 3 | +
4 | 4 | +
5 | 1 | -
6 | 4 | +
7 | 5 | +

Sign Test Example (cont'd)
Hypotheses: H0: Median = 3; Ha: Median < 3
Significance level: α = 0.05
Test statistic: S = 2 (two of the ratings 2, 5, 3, 4, 1, 4, 5, namely 2 and 1, are below the hypothesized median of 3)
P-value: P(x ≥ 2) = 1 - P(x ≤ 1) = 0.9375 (binomial table, n = 7, p = 0.50), or calculate from the formula
Decision: Do not reject H0 at α = 0.05
Conclusion: There is no evidence that the median is less than 3
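The p-value in this example can be checked with a few lines of Python. This is only a sketch of the calculation as the example presents it: the helper name `upper_tail` is made up for illustration, and the rating equal to 3 is kept in the sample and not counted as a minus, matching the sign column of the example (many textbooks drop such ties instead).

```python
from math import comb

ratings = [2, 5, 3, 4, 1, 4, 5]  # ratings from the example
median_0 = 3                     # hypothesized median under H0
alpha = 0.05

n = len(ratings)
# S = number of ratings strictly below the hypothesized median (minus signs).
s = sum(r < median_0 for r in ratings)


def upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


p_value = upper_tail(s, n)   # P(x >= 2) with n = 7, p = 0.5
reject = p_value < alpha

print(s, p_value, reject)  # 2 0.9375 False
```

Since 0.9375 is far above 0.05, the sketch reaches the same decision as the example: H0 is not rejected.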

Comparing Two Populations: Independent Samples Wilcoxon Rank Sum test

Wilcoxon Rank Sum test
- Tests two independent population probability distributions
- Corresponds to the t-test for two independent means
- Assumptions: independent, random samples; populations are continuous
- Can use the normal approximation if n_i ≥ 30
2011 Pearson Education, Inc

Wilcoxon Rank Sum Test: Independent Samples
Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively.
One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2 [or D1 is shifted to the left of D2]
Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2).

Wilcoxon Rank Sum Test: Independent Samples
Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively.
Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the left or to the right of D2
Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2). We will denote this rank sum as T.
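The slides name the rank sums T1 and T2 without showing how they are obtained. The Python sketch below follows the usual procedure as an assumption: pool both samples, rank the pooled values from smallest to largest (tied values share the average of their ranks), and add up the ranks belonging to each sample. The function name `rank_sums` and the two samples are made up for illustration and are not data from the course.

```python
def rank_sums(sample1, sample2):
    """Return (T1, T2), the rank sums of each sample in the pooled ordering."""
    pooled = sorted(list(sample1) + list(sample2))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold the same value; their ranks are i+1..j,
        # so the shared (average) rank is (i + 1 + j) / 2.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    t1 = sum(rank_of[v] for v in sample1)
    t2 = sum(rank_of[v] for v in sample2)
    return t1, t2


# Hypothetical measurements, for illustration only.
group1 = [71, 82, 77, 92, 88]   # n1 = 5
group2 = [85, 82, 94, 97]       # n2 = 4

t1, t2 = rank_sums(group1, group2)
# Rule from the slide: use the rank sum of the smaller sample as T.
t = t1 if len(group1) < len(group2) else t2
print(t1, t2, t)  # 19.5 25.5 25.5
```

A quick sanity check is that T1 + T2 equals n(n + 1)/2 for n = n1 + n2, here 9 × 10 / 2 = 45. The value of T would then be compared with tabulated critical values, which this sketch does not include.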

Conditions Required for Valid Wilcoxon Rank Sum Test 1. The two samples are random and independent. 2. The two probability distributions from which the samples are drawn are continuous.

Comparing Two Populations: Paired Differences Experiment Wilcoxon Signed Ranks test

Wilcoxon Signed Rank Test
- Tests probability distributions of two related populations
- Corresponds to the t-test for dependent (paired) means
- Assumptions: random samples; both populations are continuous
- Can use the normal approximation if n ≥ 30

Wilcoxon Signed Rank Test for a Paired Difference Experiment
Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively.
One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2 [or D1 is shifted to the left of D2]

Wilcoxon Signed Rank Test for a Paired Difference Experiment
Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively.
Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the left or to the right of D2

Conditions Required for a Valid Signed Rank Test 1. The sample of differences is randomly selected from the population of differences. 2. The probability distribution from which the sample of paired differences is drawn is continuous.
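The slides do not spell out how the signed rank statistic is computed. As a sketch of the usual procedure (an assumption, not taken from the slides): compute the paired differences, discard zero differences, rank the absolute differences with average ranks for ties, then sum the ranks of the positive differences and of the negative differences separately. The function name `signed_rank_sums` and the paired values below are invented for illustration.

```python
def signed_rank_sums(x1, x2):
    """Return (T_plus, T_minus) for paired samples x1 and x2."""
    # Paired differences; zero differences are dropped (a common convention).
    diffs = [a - b for a, b in zip(x1, x2) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        # Tied absolute differences share the average of ranks i+1..j.
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2
        i = j
    t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return t_plus, t_minus


# Hypothetical paired measurements, for illustration only.
first = [12, 15, 9, 14, 10, 13]
second = [10, 15, 12, 9, 9, 11]

t_plus, t_minus = signed_rank_sums(first, second)
print(t_plus, t_minus)  # 11.0 4.0
```

With n non-zero differences, T+ and T- should add up to n(n + 1)/2, here 5 × 6 / 2 = 15. Which of the two sums serves as the test statistic depends on the alternative hypothesis, and the critical value comes from a signed rank table that is not reproduced here.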