Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series


1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1

2 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks Test for Matched Pairs 13-4 Wilcoxon Rank-Sum Test for Two Independent Samples 13-5 Kruskal-Wallis Test 13-6 Rank Correlation 13-7 Runs Test for Randomness Slide 2

3 Section 13-1 Overview Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 3

4 Overview Definitions Parametric tests have requirements about the nature or shape of the populations involved. Nonparametric tests do not require that samples come from populations with normal distributions or have any other particular distributions. Consequently, nonparametric tests are called distribution-free tests. Slide 4

5 Advantages of Nonparametric Methods 1. Nonparametric methods can be applied to a wide variety of situations because they do not have the more rigid requirements of the corresponding parametric methods. In particular, nonparametric methods do not require normally distributed populations. 2. Unlike parametric methods, nonparametric methods can often be applied to categorical data, such as the genders of survey respondents. 3. Nonparametric methods usually involve simpler computations than the corresponding parametric methods and are therefore easier to understand and apply. Slide 5

6 Disadvantages of Nonparametric Methods 1. Nonparametric methods tend to waste information because exact numerical data are often reduced to a qualitative form. 2. Nonparametric tests are not as efficient as parametric tests, so with a nonparametric test we generally need stronger evidence (such as a larger sample or greater differences) before we reject a null hypothesis. Slide 6

7 Efficiency of Nonparametric Methods Slide 7

8 Definitions Data are sorted when they are arranged according to some criterion, such as smallest to the largest or best to worst. A rank is a number assigned to an individual sample item according to its order in the sorted list. The first item is assigned a rank of 1, the second is assigned a rank of 2, and so on. Slide 8

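9 Handling Ties in Ranks Find the mean of the ranks involved and assign this mean rank to each of the tied items. [Table with columns Sorted Data, Preliminary Ranking, and Rank. In the table's example, one group of tied items receives a mean rank of 3 and another receives a mean rank of 7.5.] Slide 9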
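The tie-handling rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `mean_ranks` and the sample data are assumptions, with the data chosen so that the tied groups receive the mean ranks of 3 and 7.5 mentioned on the slide.

```python
def mean_ranks(values):
    """Return 1-based ranks for `values`; tied items share the mean of their ranks."""
    # Indices of the items in sorted order (smallest value first).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` to cover every item tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Preliminary ranks pos+1 .. end+1 are replaced by their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Illustrative data (assumed): the three 5s share preliminary ranks 2, 3, 4,
# and the two 12s share preliminary ranks 7 and 8.
example = [4, 5, 5, 5, 10, 11, 12, 12]
print(mean_ranks(example))  # [1.0, 3.0, 3.0, 3.0, 5.0, 6.0, 7.5, 7.5]
```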

10 Section 13-2 Sign Test Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 10

11 Key Concept The main objective of this section is to understand the sign test procedure, which involves converting data values to plus and minus signs, then testing for disproportionately more of either sign. Slide 11

12 Definition Sign Test The sign test is a nonparametric (distribution free) test that uses plus and minus signs to test different claims, including: 1) Claims involving matched pairs of sample data; 2) Claims involving nominal data; 3) Claims about the median of a single population. Slide 12

13 Basic Concept of the Sign Test The basic idea underlying the sign test is to analyze the frequencies of the plus and minus signs to determine whether they are significantly different. Slide 13

14 Figure 13-1 Sign Test Procedure Slide 14

15 Figure 13-1 Sign Test Procedure Slide 15

16 Figure 13-1 Sign Test Procedure Slide 16

17 Requirements 1. The sample data have been randomly selected. 2. There is no requirement that the sample data come from a population with a particular distribution, such as a normal distribution. Slide 17

18 Notation for Sign Test x = the number of times the less frequent sign occurs n = the total number of positive and negative signs combined Slide 18

19 Test Statistic For n ≤ 25: x (the number of times the less frequent sign occurs). For n > 25: $z = \frac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}}$ Critical values: For n ≤ 25, critical x values are in Table A-7. For n > 25, critical z values are in Table A-2. Slide 19
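As a rough sketch of how this test statistic can be computed, the Python function below takes the counts of positive and negative signs and returns x, n, and (when n > 25) the z value from the formula on this slide. The function name `sign_test_statistic` is an assumption made for illustration; the sign counts come from the examples worked in this section.

```python
import math


def sign_test_statistic(n_plus, n_minus):
    """Return (x, n, z) for the sign test; z is None when n <= 25."""
    x = min(n_plus, n_minus)   # number of times the less frequent sign occurs
    n = n_plus + n_minus       # ties (zeros) must already be excluded
    if n <= 25:
        return x, n, None      # compare x with the critical value from Table A-7
    z = ((x + 0.5) - n / 2) / (math.sqrt(n) / 2)
    return x, n, z


# Corn yields: 4 positive and 7 negative signs, so n = 11 and x is used directly.
print(sign_test_statistic(4, 7))           # (4, 11, None)

# Gender selection: 295 girls (+) and 30 boys (-).
x, n, z = sign_test_statistic(295, 30)
print(x, n, round(z, 2))                   # 30 325 -14.64

# Body temperatures: 23 above (+) and 68 below (-) 98.6 F, after discarding ties.
print(round(sign_test_statistic(23, 68)[2], 2))  # -4.61
```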

20 Claims Involving Matched Pairs When using the sign test with data that are matched pairs, we convert the raw data to plus and minus signs as follows: 1. Subtract each value of the second variable from the corresponding value of the first variable. 2. Record only the sign of the difference found in step 1. Exclude ties: that is, any matched pairs in which both values are equal. Slide 20

21 Key Concept Underlying This Use of the Sign Test If the two sets of data have equal medians, the number of positive signs should be approximately equal to the number of negative signs. Slide 21

22 Example: Yields of Corn from Different Seeds Use the data in Table 13-3 with a 0.05 significance level to test the claim that there is no difference between the yields from the regular and kiln-dried seed. Slide 22

23 Example: Yields of Corn from Different Seeds Use the data in Table 13-3 with a 0.05 significance level to test the claim that there is no difference between the yields from the regular and kiln-dried seed. H 0 : The median of the differences is equal to 0. H 1 : The median of the differences is not equal to 0. α = 0.05 x = minimum(7, 4) = 4 (From Table 13-3, there are 7 negative signs and 4 positive signs.) Critical value = 1 (From Table A-7 where n = 11 and α = 0.05) Slide 23

24 Example: Yields of Corn from Different Seeds Use the data in Table 13-3 with a 0.05 significance level to test the claim that there is no difference between the yields from the regular and kiln-dried seed. H 0 : The median of the differences is equal to 0. H 1 : The median of the differences is not equal to 0. With a test statistic of x = 4 and a critical value of 1, we fail to reject the null hypothesis of no difference. There is not sufficient evidence to warrant rejection of the claim that the median of the differences is equal to 0. Slide 24
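Table A-7 is not reproduced in this transcript. One way to cross-check the conclusion is an exact binomial calculation: under the null hypothesis, each sign is equally likely, so the count of the less frequent sign can be modeled as Binomial(n, 0.5). The sketch below is an alternative check, not the table-lookup procedure used on the slides, and the function name `sign_test_p_value` is an assumption.

```python
from math import comb


def sign_test_p_value(x, n):
    """Two-tailed exact p-value: 2 * P(X <= x) for X ~ Binomial(n, 0.5), capped at 1."""
    lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Corn yields: x = 4 with n = 11.
print(round(sign_test_p_value(4, 11), 4))  # 0.5488
```

A p-value of about 0.55 is far above 0.05, which agrees with failing to reject the null hypothesis. With n = 11, the same calculation gives a p-value below 0.05 for x = 1 and above 0.05 for x = 2, consistent with the critical value of 1 quoted from Table A-7.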

25 Claims Involving Nominal Data The nature of nominal data limits the calculations that are possible, but we can identify the proportion of the sample data that belong to a particular category. Then we can test claims about the corresponding population proportion p. Slide 25

26 Example: Gender Selection Of the 325 babies born to parents using the XSORT method of gender selection, 295 were girls. Use the sign test and a 0.05 significance level to test the claim that this method of gender selection has no effect. The procedures are for cases in which n > 25. Note that the only requirement is that the sample data are randomly selected. H 0 : p = 0.5 (the proportion of girls is 0.5) H 1 : p ≠ 0.5 Slide 26

27 Example: Gender Selection Of the 325 babies born to parents using the XSORT method of gender selection, 295 were girls. Use the sign test and a 0.05 significance level to test the claim that this method of gender selection has no effect. Denoting girls by the positive sign (+) and boys by the negative sign (−), we have 295 positive signs and 30 negative signs. Test statistic x = minimum(295, 30) = 30. The test involves two tails. Slide 27

28 Example: Gender Selection Of the 325 babies born to parents using the XSORT method of gender selection, 295 were girls. Use the sign test and a 0.05 significance level to test the claim that this method of gender selection has no effect. $z = \frac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \frac{(30 + 0.5) - \frac{325}{2}}{\frac{\sqrt{325}}{2}} \approx -14.64$ Slide 28

29 Example: Gender Selection Of the 325 babies born to parents using the XSORT method of gender selection, 295 were girls. Use the sign test and a 0.05 significance level to test the claim that this method of gender selection has no effect. With α = 0.05 in a two-tailed test, the critical values are z = ±1.96. The test statistic z = −14.64 is less than −1.96. We reject the null hypothesis that p = 0.5. There is sufficient evidence to warrant rejection of the claim that the method of gender selection has no effect. Slide 29

30 Example: Gender Selection Of the 325 babies born to parents using the XSORT method of gender selection, 295 were girls. Use the sign test and a 0.05 significance level to test the claim that this method of gender selection has no effect. Figure 13.2 Slide 30

31 Claims About the Median of a Single Population The negative and positive signs are based on the claimed value of the median. Slide 31

32 Example: Body Temperature Use the temperatures for 12:00 A.M. on Day 2 in Data Set 2 in Appendix B. Use the sign test to test the claim that the median is less than 98.6 F. There are 68 subjects with temperatures below 98.6 F, 23 subjects with temperatures above 98.6 F, and 15 subjects with temperatures equal to 98.6 F. H 0 : Median is equal to 98.6 F. H 1 : Median is less than 98.6 F. Since the claim is that the median is less than 98.6 F, the test involves only the left tail. Slide 32

33 Example: Body Temperature Use the temperatures for 12:00 A.M. on Day 2 in Data Set 2 in Appendix B. Use the sign test to test the claim that the median is less than 98.6 F. Discard the 15 zeros. Use (−) to denote the 68 temperatures below 98.6 F, and use (+) to denote the 23 temperatures above 98.6 F. So n = 91 and x = 23. Slide 33

34 Example: Body Temperature Use the temperatures for 12:00 A.M. on Day 2 in Data Set 2 in Appendix B. Use the sign test to test the claim that the median is less than 98.6 F. $z = \frac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \frac{(23 + 0.5) - \frac{91}{2}}{\frac{\sqrt{91}}{2}} \approx -4.61$ Slide 34

35 Example: Body Temperature Use the temperatures for 12:00 A.M. on Day 2 in Data Set 2 in Appendix B. Use the sign test to test the claim that the median is less than 98.6 F. We use Table A-2 to get the critical z value of −1.645. The test statistic of z = −4.61 falls into the critical region. We reject the null hypothesis. We support the claim that the median body temperature of healthy adults is less than 98.6 F. Slide 35

36 Example: Body Temperature Use the temperatures for 12:00 A.M. on Day 2 in Data Set 2 in Appendix B. Use the sign test to test the claim that the median is less than 98.6 F. Figure 13.3 Slide 36

37 Recap In this section we have discussed: Sign tests where data are assigned plus or minus signs and then tested to see if the number of plus and minus signs is equal. Sign tests can be performed on claims involving: Matched pairs Nominal data The median of a single population Slide 37

38 Section 13-3 Wilcoxon Signed-Ranks Test for Matched Pairs Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 38

39 Key Concept The Wilcoxon signed-ranks test uses ranks of sample data consisting of matched pairs. This test is used with a null hypothesis that the population of differences from the matched pairs has a median equal to zero. Slide 39

40 Definition The Wilcoxon signed-ranks test is a nonparametric test that uses ranks of sample data consisting of matched pairs. It is used to test the null hypothesis that the population of differences has a median of zero. H 0 : The matched pairs have differences that come from a population with a median equal to zero. H 1 : The matched pairs have differences that come from a population with a nonzero median. Slide 40

41 Wilcoxon Signed-Ranks Test Requirements 1. The data consist of matched pairs that have been randomly selected. 2. The population of differences (found from the pairs of data) has a distribution that is approximately symmetric, meaning that the left half of its histogram is roughly a mirror image of its right half. (There is no requirement that the data have a normal distribution.) Slide 41

42 Notation T = the smaller of the following two sums: 1. The sum of the absolute values of the negative ranks of the nonzero differences d 2. The sum of the positive ranks of the nonzero differences d Slide 42

43 Test Statistic for the Wilcoxon Signed-Ranks Test for Matched Pairs For n ≤ 30, the test statistic is T. For n > 30, the test statistic is $z = \frac{T - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$ Slide 43

44 Critical Values for the Wilcoxon Signed-Ranks Test for Matched Pairs For n ≤ 30, the critical T value is found in Table A-8. For n > 30, the critical z values are found in Table A-2. Slide 44

45 Procedure for Finding the Value of the Test Statistic Step 1: For each pair of data, find the difference d by subtracting the second value from the first. Keep the signs, but discard any pairs for which d = 0. Step 2: Ignore the signs of the differences, then sort the differences from lowest to highest and replace the differences by the corresponding rank value. When differences have the same numerical value, assign to them the mean of the ranks involved in the tie. Step 3: Attach to each rank the sign of the difference from which it came. That is, insert those signs that were ignored in Step 2. Step 4: Find the sum of the absolute values of the negative ranks. Also find the sum of the positive ranks. Slide 45

46 Procedure for Finding the Value of the Test Statistic Step 5: Let T be the smaller of the two sums found in Step 4. Either sum could be used, but for a simplified procedure we arbitrarily select the smaller of the two sums. Step 6: Let n be the number of pairs of data for which the difference d is not 0. Step 7: Determine the test statistic and critical values based on the sample size, as shown above. Step 8: When forming the conclusion, reject the null hypothesis if the sample data lead to a test statistic that is in the critical region - that is, the test statistic is less than or equal to the critical value(s). Otherwise, fail to reject the null hypothesis. Slide 46
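Steps 1 through 7 can be sketched in Python as follows. This is an illustrative implementation under stated assumptions: the function name `wilcoxon_signed_rank` and the sample data are hypothetical (they are not the Table 13-4 yields), and Step 8 is left to the reader because the critical values come from Table A-8 or Table A-2.

```python
import math


def wilcoxon_signed_rank(first, second):
    """Return (T, n, z) following Steps 1-7; z is None when n <= 30."""
    # Step 1: differences (first - second), discarding pairs with d = 0.
    d = [a - b for a, b in zip(first, second) if a != b]
    n = len(d)  # Step 6: number of nonzero differences

    # Step 2: rank the absolute differences, using mean ranks for ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(d[order[end + 1]]) == abs(d[order[pos]]):
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end + 2) / 2  # mean of ranks pos+1 .. end+1
        pos = end + 1

    # Steps 3-4: sum the ranks separately by the sign of the difference.
    negative_sum = sum(r for r, di in zip(ranks, d) if di < 0)
    positive_sum = sum(r for r, di in zip(ranks, d) if di > 0)

    # Step 5: T is the smaller of the two sums.
    T = min(negative_sum, positive_sum)

    # Step 7: for n > 30 convert T to a z score.
    z = None
    if n > 30:
        z = (T - n * (n + 1) / 4) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return T, n, z


# Hypothetical matched pairs; the second pair is a tie and is discarded.
first = [10, 12, 9, 15, 11, 14]
second = [12, 12, 8, 11, 14, 9]
print(wilcoxon_signed_rank(first, second))  # (5.0, 5, None)
```

A quick sanity check on any result: the two rank sums must add up to n(n + 1)/2, which is 15 for the five nonzero differences here (5 + 10).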

47 Example: Does the Type of Seed Affect Corn Growth? Use the data in Table 13-4 with the Wilcoxon signed-ranks test and 0.05 significance level to test the claim that there is no difference between the yields from the regular and kiln-dried seed. Slide 47

48 Example: Does the Type of Seed Affect Corn Growth? Use the data in Table 13-4 with the Wilcoxon signed-ranks test and 0.05 significance level to test the claim that there is no difference between the yields from the regular and kiln-dried seed. H 0 : There is no difference between the yields from the regular and kiln-dried seed. H 1 : There is a difference between the yields from the regular and kiln-dried seed. Slide 48

49 Example: Does the Type of Seed Affect Corn Growth? The differences in row three of the table are found by computing (yield from regular seed) − (yield from kiln-dried seed). The ranks of differences in row four of the table are found by ranking the absolute differences, handling ties by assigning the mean of the ranks. The signed ranks in row five of the table are found by attaching the sign of the differences to the ranks. Slide 49

50 Example: Does the Type of Seed Affect Corn Growth? Calculate the Test Statistic Step 1: In Table 13-4, the row of differences is obtained by computing this difference for each pair of data: d = (yield from regular seed) − (yield from kiln-dried seed). Step 2: Ignoring their signs, we rank the absolute differences from lowest to highest. Step 3: The bottom row of Table 13-4 is created by attaching to each rank the sign of the corresponding difference. Slide 50

51 Example: Does the Type of Seed Affect Corn Growth? Calculate the Test Statistic Step 3 (cont.): If there really is no difference between the yields from the two types of seed (as in the null hypothesis), we expect the sum of the positive ranks to be approximately equal to the sum of the absolute values of the negative ranks. Step 4: We now find the sum of the absolute values of the negative ranks, and we also find the sum of the positive ranks. Slide 51

52 Example: Does the Type of Seed Affect Corn Growth? Calculate the Test Statistic Step 4 (cont.): Sum of absolute values of negative ranks: 51. Sum of positive ranks: 15. Step 5: Letting T be the smaller of the two sums found in Step 4, we find that T = 15. Step 6: Letting n be the number of pairs of data for which the difference d is not 0, we have n = 11. Slide 52

53 Example: Does the Type of Seed Affect Corn Growth? Calculate the Test Statistic Step 7: Because n = 11, we have n ≤ 30, so we use a test statistic of T = 15. From Table A-8, the critical value is T = 11 (using n = 11 and α = 0.05 in two tails). Step 8: The test statistic T = 15 is not less than or equal to the critical value of 11, so we fail to reject the null hypothesis. It appears that there is no difference between yields from regular seed and kiln-dried seed. Slide 53

54 Recap In this section we have discussed: The Wilcoxon signed-ranks test which uses matched pairs. The hypothesis is that the matched pairs have differences that come from a population with a median equal to zero. Slide 54

55 Section 13-4 Wilcoxon Rank-Sum Test for Two Independent Samples Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 55

56 Key Concept The Wilcoxon signed-ranks test (Section 13-3) involves matched pairs of data. The Wilcoxon rank-sum test of this section involves two independent samples that are not related or somehow matched or paired. Slide 56

57 Definition The Wilcoxon rank-sum test is a nonparametric test that uses ranks of sample data from two independent populations. It is used to test the null hypothesis that the two independent samples come from populations with equal medians. H 0 : The two samples come from populations with equal medians. H 1 : The two samples come from populations with different medians. Slide 57

58 Basic Concept If two samples are drawn from identical populations and the individual values are all ranked as one combined collection of values, then the high and low ranks should fall evenly between the two samples. Slide 58

59 Requirements 1. There are two independent samples of randomly selected data. 2. Each of the two samples has more than 10 values. 3. There is no requirement that the two populations have a normal distribution or any other particular distribution. Slide 59

60 Notation for the Wilcoxon Rank-Sum Test $n_1$ = size of Sample 1; $n_2$ = size of Sample 2; $R_1$ = sum of ranks for Sample 1; $R_2$ = sum of ranks for Sample 2; $R$ = same as $R_1$ (sum of ranks for Sample 1); $\mu_R$ = mean of the sample R values that is expected when the two populations have equal medians; $\sigma_R$ = standard deviation of the sample R values that is expected with two populations having equal medians. Slide 60

61 Test Statistic for the Wilcoxon Rank-Sum Test $z = \frac{R - \mu_R}{\sigma_R}$, where $\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2}$ and $\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$. Here $n_1$ = size of the sample from which the rank sum R is found, $n_2$ = size of the other sample, and R = sum of ranks of the sample with size $n_1$. Slide 61

62 Critical Values for the Wilcoxon Rank-Sum Test Critical values can be found in Table A-2 (because the test statistic is based on the normal distribution). Slide 62

63 Procedure for Finding the Value of the Test Statistic 1. Temporarily combine the two samples into one big sample, then replace each sample value with its rank. 2. Find the sum of the ranks for either one of the two samples. 3. Calculate the value of the z test statistic as shown in the previous slide, where either sample can be used as Sample 1. Slide 63

64 Example: BMI of Men and Women The data in Table 13-5 are from Data Set 1 in Appendix B and use only the first 13 sample values for men and the first 12 sample values for women. The numbers in parentheses are their ranks, beginning with a rank of 1 assigned to the lowest value. R 1 and R 2 at the bottom denote the sums of ranks. Slide 64

65 Example: BMI of Men and Women Use the data in Table 13-5 with the Wilcoxon rank-sum test and a 0.05 significance level to test the claim that the median BMI of men is equal to the median BMI of women. The requirements of having two independent and random samples and each having more than 10 values are met. H 0 : Men and women have BMI values with equal medians H 1 : Men and women have BMI values with medians that are not equal Slide 65

66 Example: BMI of Men and Women Use the data in Table 13-5 with the Wilcoxon rank-sum test and a 0.05 significance level to test the claim that the median BMI of men is equal to the median BMI of women. Procedures. 1. Rank all 25 BMI measurements combined. This is done in Table 13-5. 2. Find the sum of the ranks of either one of the samples. For men the sum of ranks is R = 187. Slide 66

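67 Example: BMI of Men and Women Procedures (cont.). 3. Calculate the value of the z test statistic. $\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{13(13 + 12 + 1)}{2} = 169$; $\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(13)(12)(13 + 12 + 1)}{12}} \approx 18.385$; $z = \frac{R - \mu_R}{\sigma_R} = \frac{187 - 169}{18.385} \approx 0.98$ Slide 67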
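The rank-sum calculation can be sketched in Python as shown below. The function names `rank_sum` and `rank_sum_z` are assumptions made for illustration. Because the Table 13-5 values are not reproduced in this transcript, the z calculation starts from the rank sum R = 187 and the sample sizes 13 and 12 reported in the example, and the ranking helper is demonstrated on small hypothetical samples.

```python
import math


def rank_sum(sample1, sample2):
    """Steps 1-2: rank the combined samples (mean ranks for ties) and sum Sample 1's ranks."""
    combined = sorted(sample1 + sample2)
    rank_of = {}
    for v in set(combined):
        first_rank = combined.index(v) + 1   # rank of the first occurrence
        count = combined.count(v)            # number of tied items
        rank_of[v] = first_rank + (count - 1) / 2
    return sum(rank_of[v] for v in sample1)


def rank_sum_z(R, n1, n2):
    """Step 3: return (z, mu_R, sigma_R) for rank sum R of the sample with size n1."""
    mu_R = n1 * (n1 + n2 + 1) / 2
    sigma_R = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (R - mu_R) / sigma_R, mu_R, sigma_R


# BMI example: R = 187 for the 13 men, with 12 women in the other sample.
z, mu_R, sigma_R = rank_sum_z(187, 13, 12)
print(mu_R, round(sigma_R, 3), round(z, 2))  # 169.0 18.385 0.98

# Hypothetical small samples, only to show the ranking step with a tie.
print(rank_sum([1, 3, 5], [2, 3, 6]))        # 9.5
```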

68 Example: BMI of Men and Women Use the data in Table 13-5 with the Wilcoxon rank-sum test and a 0.05 significance level to test the claim that the median BMI of men is equal to the median BMI of women. A large positive value of z would indicate that the higher ranks are found disproportionately in Sample 1, and a large negative value of z would indicate that Sample 1 had a disproportionate share of lower ranks. Slide 68

69 Example: BMI of Men and Women Use the data in Table 13-5 with the Wilcoxon rank-sum test and a 0.05 significance level to test the claim that the median BMI of men is equal to the median BMI of women. We have a two-tailed test (with α = 0.05), so the critical values are 1.96 and −1.96. The test statistic of 0.98 does not fall within the critical region, so we fail to reject the null hypothesis that men and women have BMI values with equal medians. It appears that BMI values of men and women are basically the same. Slide 69

70 Example: BMI of Men and Women The preceding example used only 13 of the 40 sample BMI values for men listed in Data Set 1 in Appendix B, and it used only 12 of the 40 BMI values for women. Do the results change if we use all 40 sample values for both men and women? The null and alternative hypotheses are the same. Slide 70

71 Example: BMI of Men and Women In the Minitab display below ETA1 and ETA2 denote the medians of the first and second samples, respectively. The rank sum for men is W = The P-value is (or after adjustment for ties). Minitab Slide 71

72 Example: BMI of Men and Women Because the P-value is greater than α = 0.05, we fail to reject the null hypothesis. There is not sufficient evidence to warrant rejection of the claim that men and women have BMI values with equal medians. Minitab Slide 72

73 Recap In this section we have discussed: The Wilcoxon Rank-Sum Test for Two Independent Samples. It is used to test the null hypothesis that the two independent samples come from populations with equal medians. Slide 73


75 Section 13-5 Kruskal-Wallis Test Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 74

76 Key Concept This section introduces the Kruskal-Wallis test, which uses ranks of data from three or more independent samples to test the null hypothesis that the samples come from populations with equal medians. Slide 75

77 Kruskal-Wallis Test Definition. The Kruskal-Wallis test (also called the H test) is a nonparametric test that uses ranks of sample data from three or more independent populations. It is used to test the null hypothesis that the independent samples come from populations with equal medians. H 0 : The samples come from populations with equal medians. H 1 : The samples come from populations with medians that are not all equal. Slide 76

78 Kruskal-Wallis Test We compute the test statistic H, which has a distribution that can be approximated by the chi-square (χ²) distribution as long as each sample has at least 5 observations. When we use the chi-square distribution in this context, the number of degrees of freedom is k − 1, where k is the number of samples. Slide 77

79 Kruskal-Wallis Test Requirements 1. We have at least three independent samples, all of which are randomly selected. 2. Each sample has at least 5 observations. 3. There is no requirement that the populations have a normal distribution or any other particular distribution. Slide 78

80 Kruskal-Wallis Test Notation N = total number of observations in all samples combined; k = number of samples; $R_1$ = sum of ranks for Sample 1; $n_1$ = number of observations in Sample 1. For Sample 2, the sum of ranks is $R_2$ and the number of observations is $n_2$, and similar notation is used for the other samples. Slide 79

81 Kruskal-Wallis Test Test Statistic $H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right) - 3(N+1)$ Critical Values 1. Test is right-tailed. 2. df = k − 1 (Because the test statistic H can be approximated by the χ² distribution, use Table A-4.) Slide 80

82 Procedure for Finding the Value of the Test Statistic H 1. Temporarily combine all samples into one big sample and assign a rank to each sample value. 2. For each sample, find the sum of the ranks and find the sample size. 3. Calculate H by using the results of Step 2 and the notation and test statistic given on the preceding slide. Slide 81

83 Procedure for Finding the Value of the Test Statistic H The test statistic H is basically a measure of the variance of the rank sums $R_1, R_2, \ldots, R_k$. If the ranks are distributed evenly among the sample groups, then H should be a relatively small number. If the samples are very different, then the ranks will be excessively low in some groups and high in others, with the net effect that H will be large. Slide 82

84 Example: Effects of Treatments on Poplar Tree Weights Table 13-6 lists weights of poplar trees given different treatments. (Numbers in parentheses are ranks.) Slide 83

85 Example: Effects of Treatments on Poplar Tree Weights Use the data in Table 13-6 with the Kruskal-Wallis test to test the claim that the four samples come from populations with equal medians. Are requirements met? There are three or more independent and random samples. Each sample size is 5. (Requirement is at least 5.) H 0 : The populations of poplar tree weights from the four treatments have equal medians. H 1 : The four population medians are not all equal. Slide 84

86 Example: Effects of Treatments on Poplar Tree Weights Use the data in Table 13-6 with the Kruskal-Wallis test to test the claim that the four samples come from populations with equal medians. The following statistics come from Table 13-6: n 1 = 5, n 2 = 5, n 3 = 5, n 4 = 5 N = 20 R 1 = 45, R 2 = 37.5, R 3 = 42.5, R 4 = 85 Slide 85

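87 Example: Effects of Treatments on Poplar Tree Weights Use the data in Table 13-6 with the Kruskal-Wallis test to test the claim that the four samples come from populations with equal medians. Evaluate the test statistic. $H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k}\right) - 3(N+1) = \frac{12}{20(20+1)}\left(\frac{45^2}{5} + \frac{37.5^2}{5} + \frac{42.5^2}{5} + \frac{85^2}{5}\right) - 3(20+1) \approx 8.214$ Slide 86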
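The H statistic is straightforward to compute once the rank sums and sample sizes are known. The sketch below uses the rank sums and sample sizes listed for the poplar tree example; the function name `kruskal_wallis_h` is an assumption, and the comparison value 7.815 is the chi-square critical value for 3 degrees of freedom at α = 0.05.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Compute H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)."""
    N = sum(sizes)  # total number of observations in all samples combined
    total = sum(R ** 2 / n for R, n in zip(rank_sums, sizes))
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)


# Poplar tree example: four treatments with five trees each.
rank_sums = [45, 37.5, 42.5, 85]
sizes = [5, 5, 5, 5]

H = kruskal_wallis_h(rank_sums, sizes)
print(round(H, 3))   # 8.214
print(H > 7.815)     # True: H falls in the right-tailed critical region
```

As a consistency check, the rank sums should add up to N(N + 1)/2, which is 210 for N = 20, and 45 + 37.5 + 42.5 + 85 = 210.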

88 Example: Effects of Treatments on Poplar Tree Weights Use the data in Table 13-6 with the Kruskal-Wallis test to test the claim that the four samples come from populations with equal medians. Find the critical value. Because each sample has at least five observations, the distribution of H is approximately a chi-square distribution. df = k − 1 = 4 − 1 = 3, α = 0.05. From Table A-4 the critical value = 7.815. Slide 87

89 Example: Effects of Treatments on Poplar Tree Weights Use the data in Table 13-6 with the Kruskal-Wallis test to test the claim that the four samples come from populations with equal medians. The test statistic is in the critical region, so we reject the null hypothesis of equal medians. At least one of the medians appears to be different from the others. Slide 88

90 Recap In this section we have discussed the Kruskal-Wallis test, the nonparametric counterpart of one-way ANOVA. It tests the null hypothesis that three or more populations have equal medians. The populations do not have to be normally distributed. Slide 89

91 Section 13-6 Rank Correlation Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 90

92 Key Concept This section describes the nonparametric method of rank correlation, which uses paired data to test for an association between two variables. In Chapter 10 we used paired sample data to compute values for the linear correlation coefficient r, but in this section we use ranks as the basis for computing the rank correlation coefficient $r_s$. Slide 91

93 Rank Correlation Definition The rank correlation test (or Spearman's rank correlation test) is a nonparametric test that uses ranks of sample data consisting of matched pairs. It is used to test for an association between two variables, so the null and alternative hypotheses are as follows (where ρ s denotes the rank correlation coefficient for the entire population): H 0 : ρ s = 0 (There is no correlation between the two variables.) H 1 : ρ s ≠ 0 (There is a correlation between the two variables.) Slide 92

94 Advantages Rank correlation has these advantages over the parametric methods discussed in Chapter 10: 1. The nonparametric method of rank correlation can be used in a wider variety of circumstances than the parametric method of linear correlation. With rank correlation, we can analyze paired data that are ranks or can be converted to ranks. 2. Rank correlation can be used to detect some (not all) relationships that are not linear. Slide 93

95 Disadvantages A disadvantage of rank correlation is its efficiency rating of 0.91, as described in Section 13-1. This efficiency rating shows that with all other circumstances being equal, the nonparametric approach of rank correlation requires 100 pairs of sample data to achieve the same results as only 91 pairs of sample observations analyzed through parametric methods, assuming that the stricter requirements of the parametric approach are met. Slide 94

96 Figure 13-4 Rank Correlation for Testing H 0 : ρ s = 0 Slide 95

97 Figure 13-4 Rank Correlation for Testing H 0 : ρ s = 0 Slide 96

98 Requirements 1. The sample paired data have been randomly selected. 2. Unlike the parametric methods of Section 10-2, there is no requirement that the sample pairs of data have a bivariate normal distribution. There is no requirement of a normal distribution for any population. Slide 97

99 Notation r s = rank correlation coefficient for sample paired data (r s is a sample statistic) ρ s = rank correlation coefficient for all the population data (ρ s is a population parameter) n = number of pairs of data d = difference between ranks for the two values within a pair Slide 98

100 Rank Correlation Test Statistic No ties: After converting the data in each sample to ranks, if there are no ties among ranks for either variable, the exact value of the test statistic can be calculated using this formula: $r_s = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)}$ Ties: After converting the data in each sample to ranks, if either variable has ties among its ranks, the exact value of the test statistic $r_s$ can be found by using Formula 10-1 with the ranks: $r_s = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{n(\Sigma x^2) - (\Sigma x)^2}\,\sqrt{n(\Sigma y^2) - (\Sigma y)^2}}$ Slide 99

101 Rank Correlation Critical values: If n ≤ 30, critical values are found in Table A-9. If n > 30, use Formula 13-1: $r_s = \frac{\pm z}{\sqrt{n - 1}}$, where the value of z corresponds to the significance level. (For example, if α = 0.05, z = 1.96.) Slide 100
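The no-ties formula and Formula 13-1 can be sketched in Python as follows. The function names are assumptions made for illustration; the inputs (Σd² = 24 with n = 8, Σd² = 6 with n = 9, and n = 40 with z = 1.96) are the values used in the college-ranking and pinball examples of this section.

```python
import math


def spearman_no_ties(sum_d_squared, n):
    """Rank correlation when neither variable has tied ranks."""
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


def large_sample_critical(z, n):
    """Formula 13-1: positive critical value of r_s for n > 30 (use +/- this value)."""
    return z / math.sqrt(n - 1)


# College rankings: sum of d^2 = 24 with n = 8 pairs.
print(round(spearman_no_ties(24, 8), 3))           # 0.714

# Pinball scores: sum of d^2 = 6 with n = 9 pairs.
print(round(spearman_no_ties(6, 9), 3))            # 0.95

# Large-sample case: n = 40 colleges at alpha = 0.05 (z = 1.96).
print(round(large_sample_critical(1.96, 40), 3))   # 0.314
```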

102 Example: Rankings of Colleges Use the data in Table 13-7 to determine if there is a correlation between the student rankings and the rankings of the magazine. Slide 101

103 Example: Rankings of Colleges Use the data in Table 13-7 to determine if there is a correlation between the student rankings and the rankings of the magazine. H 0 : ρ s = 0 H 1 : ρ s ≠ 0 Since neither variable has ties in the ranks: $r_s = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)} = 1 - \frac{6(24)}{8(8^2 - 1)} = 1 - \frac{144}{504} \approx 0.714$ Slide 102

104 Example: Rankings of Colleges Use the data in Table 13-7 to determine if there is a correlation between the student rankings and the rankings of the magazine. H 0 : ρ s = 0 H 1 : ρ s ≠ 0 From Table A-9 the critical values are ±0.738. Because the test statistic of $r_s$ = 0.714 does not exceed the critical value, we fail to reject the null hypothesis. There is not sufficient evidence to support a claim of a correlation between the rankings of the students and the magazine. Slide 103

105 Example: Rankings of Colleges Large Sample Case Assume that the preceding example is expanded to include a total of 40 colleges and that the test statistic $r_s$ has been computed for this larger sample. If the significance level is α = 0.05, what do you conclude about the correlation? Since n = 40 exceeds 30, we find the critical values from Formula 13-1: $r_s = \pm\dfrac{z}{\sqrt{n - 1}} = \pm\dfrac{1.96}{\sqrt{40 - 1}} = \pm 0.314$ Slide 104

106 Example: Rankings of Colleges Large Sample Case Assume that the preceding example is expanded to include a total of 40 colleges and that the test statistic $r_s$ has been computed for this larger sample. If the significance level is α = 0.05, what do you conclude about the correlation? The test statistic $r_s$ does not exceed the critical value of 0.314, so we fail to reject the null hypothesis. There is not sufficient evidence to support the claim of a correlation between the rankings of the students and the magazine. Slide 105
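The large-sample critical value from Formula 13-1 can be checked with a few lines of Python. The helper names below are hypothetical, and the test statistics passed to `reject_null` are made-up values used only to show the decision rule.

```python
from math import sqrt

def large_sample_critical_value(n, z):
    """Formula 13-1: positive critical value of r_s, z / sqrt(n - 1), for n > 30."""
    if n <= 30:
        raise ValueError("Formula 13-1 applies only when n > 30; use Table A-9 otherwise.")
    return z / sqrt(n - 1)

def reject_null(r_s, critical):
    """Two-tailed decision: reject H0 (rho_s = 0) when |r_s| exceeds the critical value."""
    return abs(r_s) > critical

critical = large_sample_critical_value(40, 1.96)
print(round(critical, 3))            # 0.314, matching the slide
print(reject_null(0.20, critical))   # False (hypothetical r_s)
print(reject_null(-0.50, critical))  # True (hypothetical r_s)
```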

107 Example: Detecting a Nonlinear Pattern The data in Table 13-8 are the numbers of games played and the last scores (in millions) of a Raiders of the Lost Ark pinball game. We expect that there should be an association between the number of games played and the pinball score. $H_0: \rho_s = 0$ $H_1: \rho_s \ne 0$ Slide 106

108 Example: Detecting a Nonlinear Pattern There are no ties among ranks of either list. $r_s = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)} = 1 - \dfrac{6(6)}{9(9^2 - 1)} = 1 - \dfrac{36}{720} = 0.950$ Slide 107

109 Example: Detecting a Nonlinear Pattern Since n = 9 is less than 30, use Table A-9. Critical values are ±0.700. The sample statistic of $r_s = 0.950$ exceeds 0.700, so we conclude that there is sufficient evidence to reject the null hypothesis of no correlation. There appears to be a correlation between the number of games played and the score. Slide 108

110 Example: Detecting a Nonlinear Pattern If the preceding example is done using the methods of Chapter 10, the linear correlation coefficient r leads to the conclusion that there is not enough evidence to support the claim of a significant linear correlation, whereas the rank correlation test found that there was enough evidence. The Excel scatter diagram shows that there is a nonlinear relationship that the parametric method would not have detected. Slide 109
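A small Python sketch can show why rank correlation picks up a monotonic but nonlinear pattern more strongly than linear correlation does. The data below are hypothetical (scores that double with each game) and are not the Table 13-8 values; the helper names are also made up for this illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Linear correlation coefficient (Formula 10-1)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

def ranks_no_ties(values):
    """Ranks 1..n for a list that contains no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: a perfectly increasing but strongly curved relationship.
games = [1, 2, 3, 4, 5, 6, 7, 8, 9]
scores = [2 ** g for g in games]

r = pearson_r(games, scores)                                   # linear correlation
r_s = pearson_r(ranks_no_ties(games), ranks_no_ties(scores))   # rank correlation

print(round(r, 2))    # about 0.82
print(round(r_s, 2))  # 1.0
```

Because the ranks of the two lists match exactly, $r_s$ equals 1 even though the raw values do not fall on a straight line.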

111 Recap In this section we have discussed: Rank correlation which is the non-parametric equivalent of testing for correlation described in Chapter 10. It uses ranks of matched pairs to test for association. Sometimes rank correlation can detect nonlinear correlation that the parametric test will not recognize. Slide 110

112 Section 13-7 Runs Test for Randomness Created by Erin Hodgess, Houston, Texas Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN Slide 111

113 Key Concept This section introduces the runs test for randomness, which can be used to determine whether the sample data in a sequence are in a random order. This test is based on sample data that have two characteristics, and it analyzes runs of those characteristics to determine whether the runs appear to result from some random process, or whether the runs suggest that the order of the data is not random. Slide 112

114 Runs Test for Randomness Definitions A run is a sequence of data having the same characteristic; the sequence is preceded and followed by data with a different characteristic or by no data at all. The runs test uses the number of runs in a sequence of sample data to test for randomness in the order of the data. Slide 113

115 Fundamental Principles of the Runs Test Reject randomness if the number of runs is very low or very high. Example: The sequence of genders FFFFFMMMMM is not random because it has only 2 runs, so the number of runs is very low. Example: The sequence of genders FMFMFMFMFM is not random because there are 10 runs, which is very high. It is important to note that the runs test for randomness is based on the order in which the data occur; it is not based on the frequency of the data. Slide 114
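Counting runs is easy to sketch in Python. The function name `count_runs` is a hypothetical helper; it relies on `itertools.groupby`, which groups consecutive equal items.

```python
from itertools import groupby

def count_runs(sequence):
    """Number of runs: maximal blocks of consecutive identical items."""
    return sum(1 for _ in groupby(sequence))

print(count_runs("FFFFFMMMMM"))  # 2  (very few runs)
print(count_runs("FMFMFMFMFM"))  # 10 (very many runs)
```

Both sequences contain five F's and five M's, which illustrates that the test depends on order rather than on frequency.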

116 Figure 13-5 Procedure for Runs Test for Randomness Slide 115

117 Figure 13-5 Procedure for Runs Test for Randomness Slide 116

118 Requirements 1. The sample data are arranged according to some ordering scheme, such as the order in which the sample values were obtained. 2. Each data value can be categorized into one of two separate categories (such as male/female). Slide 117

119 Notation $n_1$ = number of elements in the sequence that have one particular characteristic (The characteristic chosen for $n_1$ is arbitrary.); $n_2$ = number of elements in the sequence that have the other characteristic; $G$ = number of runs Slide 118

120 Runs Test for Randomness For Small Samples ($n_1 \le 20$ and $n_2 \le 20$) and α = 0.05: Test Statistic Test statistic is the number of runs G. Critical Values Critical values are found in Table A-10. Slide 119

121 Runs Test for Randomness For Small Samples ($n_1 \le 20$ and $n_2 \le 20$) and α = 0.05: Decision criteria Reject randomness if the number of runs G is: less than or equal to the smaller critical value found in Table A-10, or greater than or equal to the larger critical value found in Table A-10. Slide 120


123 Runs Test for Randomness For Large Samples ($n_1 > 20$ or $n_2 > 20$) or α ≠ 0.05: Test Statistic $z = \dfrac{G - \mu_G}{\sigma_G}$ where $\mu_G = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1$ and $\sigma_G = \sqrt{\dfrac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$ Slide 121
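The z test statistic for the runs test can be sketched in Python as below. The function name `runs_z` and the counts in the usage line are hypothetical; the formulas for the mean and standard deviation of G are written out in the comments.

```python
from math import sqrt

def runs_z(n1, n2, G):
    """Return (mu_G, sigma_G, z) for the runs test z statistic.

    mu_G    = 2*n1*n2 / (n1 + n2) + 1
    sigma_G = sqrt( (2*n1*n2) * (2*n1*n2 - n1 - n2)
                    / ((n1 + n2)**2 * (n1 + n2 - 1)) )
    z       = (G - mu_G) / sigma_G
    """
    total = n1 + n2
    mu_G = 2 * n1 * n2 / total + 1
    sigma_G = sqrt((2 * n1 * n2) * (2 * n1 * n2 - n1 - n2) / (total ** 2 * (total - 1)))
    z = (G - mu_G) / sigma_G
    return mu_G, sigma_G, z

# Hypothetical counts: 25 items of each kind arranged in 26 runs.
mu_G, sigma_G, z = runs_z(25, 25, 26)
print(mu_G, round(sigma_G, 3), z)  # mu_G is 26.0, so z is 0.0
```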

124 Runs Test for Randomness For Large Samples ($n_1 > 20$ or $n_2 > 20$) or α ≠ 0.05: Critical Values Critical values of z: Use Table A-2. Slide 122

125 Example: Small Sample Genders of Bears Listed below are the genders of the first 10 bears from Data Set 6 in Appendix B. Use a 0.05 significance level to test for randomness in the sequence of genders. M M M M F F M M F F Separate the runs as shown below. M M M M (1st run) | F F (2nd run) | M M (3rd run) | F F (4th run) Slide 123

126 Example: Small Sample Genders of Bears M M M M (1st run) | F F (2nd run) | M M (3rd run) | F F (4th run) $n_1$ = total number of males = 6, $n_2$ = total number of females = 4, G = number of runs = 4. Because $n_1 \le 20$ and $n_2 \le 20$ and α = 0.05, the test statistic is G = 4. Slide 124

127 Example: Small Sample Genders of Bears M M M M (1st run) | F F (2nd run) | M M (3rd run) | F F (4th run) From Table A-10, the critical values are 2 and 9. Because G = 4 is not less than or equal to 2, nor is it greater than or equal to 9, we do not reject randomness. It appears the sequence of genders is random. Slide 125
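The bears example can be reproduced with a short Python sketch. The variable names are hypothetical; the critical values 2 and 9 are simply the Table A-10 values quoted on the slide, not computed here.

```python
from itertools import groupby

genders = "MMMMFFMMFF"  # first 10 bears, as listed on the slide

n1 = genders.count("M")               # number of males
n2 = genders.count("F")               # number of females
G = sum(1 for _ in groupby(genders))  # number of runs

# Critical values quoted from Table A-10 on the slide (n1 = 6, n2 = 4, alpha = 0.05).
lower_critical, upper_critical = 2, 9

reject_randomness = G <= lower_critical or G >= upper_critical
print(n1, n2, G, reject_randomness)  # 6 4 4 False
```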

128 Example: Large Sample Boston Rainfall on Mondays Refer to the rainfall amounts for Boston as listed in Data Set 10 in Appendix B. Is there sufficient evidence to support the claim that rain on Mondays is not random? D D D D R D R D D R D D R D D D R D D R R R D D D D R D R D R R R D R D D D R D D D R D R D D R D D D R $H_0$: The sequence is random. $H_1$: The sequence is not random. $n_1$ = number of Ds = 33, $n_2$ = number of Rs = 19, G = number of runs = 30 Slide 126

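129 Example: Large Sample Boston Rainfall on Mondays Since $n_1 > 20$, we must calculate z using the formulas: $\mu_G = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1 = \dfrac{2(33)(19)}{33 + 19} + 1 = 25.115$ $\sigma_G = \sqrt{\dfrac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\dfrac{(2)(33)(19)[2(33)(19) - 33 - 19]}{(33 + 19)^2 (33 + 19 - 1)}} = 3.306$ $z = \dfrac{G - \mu_G}{\sigma_G} = \dfrac{30 - 25.115}{3.306} = 1.48$ Slide 127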

130 Example: Large Sample Boston Rainfall on Mondays The critical values of z are found in Table A-2. The test statistic of z = 1.48 does not fall within the critical region, so we fail to reject the null hypothesis of randomness. The given sequence does appear to be random. Slide 128
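The whole Boston rainfall calculation can be sketched end to end in Python. The variable names are hypothetical, and the critical value 1.96 is an assumption (a two-tailed test at α = 0.05); the slide text as transcribed does not state the critical values.

```python
from itertools import groupby
from math import sqrt

# Monday rainfall sequence from the slide: D = dry, R = rain.
sequence = (
    "D D D D R D R D D R D D R D D D R D D R R R D D D D R D R D R R R "
    "D R D D D R D D D R D R D D R D D D R"
).split()

n1 = sequence.count("D")
n2 = sequence.count("R")
G = sum(1 for _ in groupby(sequence))  # number of runs

total = n1 + n2
mu_G = 2 * n1 * n2 / total + 1
sigma_G = sqrt((2 * n1 * n2) * (2 * n1 * n2 - n1 - n2) / (total ** 2 * (total - 1)))
z = (G - mu_G) / sigma_G

# Assumption: two-tailed test at alpha = 0.05, so the critical values are -1.96 and 1.96.
z_critical = 1.96
reject_randomness = abs(z) > z_critical

print(n1, n2, G)                   # 33 19 30
print(round(z, 2))                 # 1.48
print(reject_randomness)           # False
```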

131 Recap In this section we have discussed: The runs test for randomness which can be used to determine whether the sample data in a sequence are in a random order. We reject randomness if the number of runs is very low or very high. Slide 129


GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Chap The McGraw-Hill Companies, Inc. All rights reserved. 11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview

More information

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data 999 Prentice-Hall, Inc. Chap. 9 - Chapter Topics Comparing Two Independent Samples: Z Test for the Difference

More information

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data 1999 Prentice-Hall, Inc. Chap. 10-1 Chapter Topics The Completely Randomized Model: One-Factor

More information

NON-PARAMETRIC STATISTICS * (http://www.statsoft.com)

NON-PARAMETRIC STATISTICS * (http://www.statsoft.com) NON-PARAMETRIC STATISTICS * (http://www.statsoft.com) 1. GENERAL PURPOSE 1.1 Brief review of the idea of significance testing To understand the idea of non-parametric statistics (the term non-parametric

More information

This is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same!

This is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same! Two sample tests (part II): What to do if your data are not distributed normally: Option 1: if your sample size is large enough, don't worry - go ahead and use a t-test (the CLT will take care of non-normal

More information

4/22/2010. Test 3 Review ANOVA

4/22/2010. Test 3 Review ANOVA Test 3 Review ANOVA 1 School recruiter wants to examine if there are difference between students at different class ranks in their reported intensity of school spirit. What is the factor? How many levels

More information

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong Statistics Primer ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong 1 Quick Overview of Statistics 2 Descriptive vs. Inferential Statistics Descriptive Statistics: summarize and describe data

More information

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables. Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:

More information

Formulas and Tables. for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. ˆp E p ˆp E Proportion

Formulas and Tables. for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. ˆp E p ˆp E Proportion Formulas and Tables for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. Ch. 3: Descriptive Statistics x Sf. x x Sf Mean S(x 2 x) 2 s Å n 2 1 n(sx 2 ) 2 (Sx)

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

Chapter 22. Comparing Two Proportions. Bin Zou STAT 141 University of Alberta Winter / 15

Chapter 22. Comparing Two Proportions. Bin Zou STAT 141 University of Alberta Winter / 15 Chapter 22 Comparing Two Proportions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 15 Introduction In Ch.19 and Ch.20, we studied confidence interval and test for proportions,

More information

Lecture 14: ANOVA and the F-test

Lecture 14: ANOVA and the F-test Lecture 14: ANOVA and the F-test S. Massa, Department of Statistics, University of Oxford 3 February 2016 Example Consider a study of 983 individuals and examine the relationship between duration of breastfeeding

More information

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics SEVERAL μs AND MEDIANS: MORE ISSUES Business Statistics CONTENTS Post-hoc analysis ANOVA for 2 groups The equal variances assumption The Kruskal-Wallis test Old exam question Further study POST-HOC ANALYSIS

More information

A3. Statistical Inference Hypothesis Testing for General Population Parameters

A3. Statistical Inference Hypothesis Testing for General Population Parameters Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,

More information

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of

More information

Module 9: Nonparametric Statistics Statistics (OA3102)

Module 9: Nonparametric Statistics Statistics (OA3102) Module 9: Nonparametric Statistics Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 15.1-15.6 Revision: 3-12 1 Goals for this Lecture

More information

My data doesn t look like that..

My data doesn t look like that.. Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing

More information

1; (f) H 0 : = 55 db, H 1 : < 55.

1; (f) H 0 : = 55 db, H 1 : < 55. Reference: Chapter 8 of J. L. Devore s 8 th Edition By S. Maghsoodloo TESTING a STATISTICAL HYPOTHESIS A statistical hypothesis is an assumption about the frequency function(s) (i.e., pmf or pdf) of one

More information

Problem Set 4 - Solutions

Problem Set 4 - Solutions Problem Set 4 - Solutions Econ-310, Spring 004 8. a. If we wish to test the research hypothesis that the mean GHQ score for all unemployed men exceeds 10, we test: H 0 : µ 10 H a : µ > 10 This is a one-tailed

More information

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data ST4241 Design and Analysis of Clinical Trials Lecture 7: Non-parametric tests for PDG data Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 2, 2016 Outline Non-parametric

More information

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

= 1 i. normal approximation to χ 2 df > df

= 1 i. normal approximation to χ 2 df > df χ tests 1) 1 categorical variable χ test for goodness-of-fit ) categorical variables χ test for independence (association, contingency) 3) categorical variables McNemar's test for change χ df k (O i 1

More information

Statistics Handbook. All statistical tables were computed by the author.

Statistics Handbook. All statistical tables were computed by the author. Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

Intro to Parametric & Nonparametric Statistics

Intro to Parametric & Nonparametric Statistics Kinds of variable The classics & some others Intro to Parametric & Nonparametric Statistics Kinds of variables & why we care Kinds & definitions of nonparametric statistics Where parametric stats come

More information

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests Overall Overview INFOWO Statistics lecture S3: Hypothesis testing Peter de Waal Department of Information and Computing Sciences Faculty of Science, Universiteit Utrecht 1 Descriptive statistics 2 Scores

More information