Statistical Tests for Computational Intelligence Research and Human Subjective Tests

Size: px
Start display at page:

Download "Statistical Tests for Computational Intelligence Research and Human Subjective Tests"

Transcription

1 tatistical Tests for Computational Intelligence Research and Human ubjective Tests lides are downloadable from Hideyuki TAKAGI Kyushu University, Japan ver. March 6, 015 ver. July 15, 013 ver. July 11, 013 ver. April 3, 013 Contents groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test + cheffé's method of comparison for Human ubjective Tests

2 How to how ignificance? Just compare averages visually? It is not scientific. fitness proposed EC1 conventional EC generations fitness conventional EC proposed EC generations Fig. XX Average convergence curves of n times of trial runs. How to how ignificance? ound design concept: exiting sound made by conventional IEC sound made by proposed IEC1 sound made by proposed IEC Which method is good to make exiting sound? How to show it?

3 You cannot show the superiority of your method without statistical tests. Papers without statistics tests may be rejected. My method is significantly! statistical test distribution Which Test hould We Use? groups n groups (n > ) Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test

4 Which Test hould We Use? groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test n-th generation sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test n-th generation Friedman test Which Test hould we Use? groups n groups (n > ) distribution Parametric Test (normality) Non-parametric Test (no normality) un un un t -test t -test Mann-Whitney U-test sign test Wilcoxon signed-ranks test ANOVA (Analysis of Variance) Normality Test Anderson-Darling test D'Agostino-Pearson test Kolmogorov-mirnov test hapiro-wilk test one-way Jarque Bera test two-way one-way ANOVA two-way ANOVA Kruskal-Wallis test Friedman test

5 Which Test hould We Use? groups n groups (n > ) distribution Parametric Test (normality) Non-parametric Test (no normality) un un un un t -test group A group B t 4.3 -test Mann-Whitney U-test sign test Wilcoxon signed-ranks test ANOVA (Analysis of Variance) initial # one-way ANOVA conven tional proposed two-way ANOVA one-way Kruskal-Wallis test two-way Friedman test Which Test hould We Use? groups n groups (n > ) distribution Parametric Test (normality) un un un t -test A group B group t -test ANOVA (Analysis of Variance) initial # one-way ANOVA GA proposed two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test

6 distribution Which Test hould We Use? Q1: Which tests are more sensitive, those for un or? groups n groups (n > ) A1: tatistical tests for because of more information. Parametric Test (normality) un un un t -test A group B group t -test ANOVA (Analysis of Variance) initial # one-way ANOVA GA proposed two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test distribution Which Test hould We Use? Q: How should you design your experimental conditions groups to use statistical ntests groups for (n > ) and reduce the # of trial runs? A: Use the same initialized for the set of (method A, method B) at each trial run. Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA significant? Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks n-th generation test one-way two-way Kruskal-Wallis test Friedman test

7 distribution Which Test hould we Use? Q3: Which statistical tests are sensitive, parametric tests or non-parametric ones and why? groups n groups (n > ) A3: Parametric tests which can use information of assumed distribution. Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test t-test groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test

8 t-test How to how ignificance? significant? n-th generation t-test Test this difference with assuming no difference. (null hypothesis) A B significant difference? Conditions to use t-tests: (1) normality () equal variances (not essential though )

9 t-test F-Test A B 1 When 10 (p > 0.05), we assume 14 that 9 there is no significant difference between σ A and σ B Test this difference with assuming no difference. (null hypothesis) Normality Test significant Anderson-Darling test difference? D'Agostino-Pearson test Kolmogorov-mirnov test hapiro-wilk test Jarque Bera test Conditions to use t-tests: (1) normality () equal variances (not essential though ) t-test Excel (3 bits version only?) has t-tests and ANOVA in Data Analysis Tools. You must install its add-in. (File -> option -> add-in, and set its add-in.)

10 t-test (1) t-test: Pairs two sample for means significant? This is a case when each pair of two methods with the same initial condition. n-th generation () t-test: Two-sample assuming equal variances (3) t-test: Two-sample assuming unequal variances: Welch's t-test t-test sample A B t-test: Paired Two ample for Means Variable 1 Variable Mean Variance Observations Pearson Correlation Hypothesized Mean Difference 0 df 9 t tat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail

11 t-test sample A B A 3. > B t-test: Paired Two ample for Means When p-value is less than 0.01 or 0.05, we assume that there is significant difference with the level Variable of significance 1 Variable of (p < 0.01) or (p < 0.05). Mean Variance Observations % Pearson.5% Correlation % A BHypothesized A < B Mean When A>B never 0happens, Difference you may use a one-tail test. df 9 t tat P(T<=t) one-tail t Critical one-tail P(T<=t) two-tail t Critical two-tail t-test (1) t-test: Pairs two sample for means () t-test: Two-sample assuming equal variances Difference between two groups is significant (p < 0.01). We cannot say that there is a significant difference between two group.

12 distribution ANOVA: Analysis of Variance groups n groups (n > ) Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test ANOVA: Analysis of Variance significant? n-th generation

13 ANOVA: Analysis of Variance 1. Analysis of more than two groups.. Normality and equal variance are required. Excel has ANOVA in Data Analysis Tools. A B C C A B ANOVA: Analysis of Variance 1. Analysis of more than two groups.. Normality and equal variance are required. A B C Excel has ANOVA in Data Analysis Tools. Check it using the Bartlett test. C A B three t-tests = one ANOVA Three times of t-test with (p<0.05) equivalent one ANOVA (p<0.14). 1-(1-0.05) 3 = 0.14

14 ANOVA: Analysis of Variance n-th generation When are independent, use one-way ANOVA (single factor ANOVA). When correspond each other, use two-way ANOVA (two-factor ANOVA). ANOVA: Analysis of Variance Q1: What are "single factor" and "two factors"? A1: A column factor (e.g. three groups) and a sample factor (e.g. initialized condition). When are independent, use one-way ANOVA column (single factor factor ANOVA). When correspond each other, use two-way ANOVA column (two-factor factor ANOVA). sample factor

15 ANOVA: Analysis of Variance one-factor (one-way) ANOVA column factor two-factor (two-way) ANOVA column factor group A group B group C sample factor initial group A group B group C condition # # # # # # # # We cannot say that three groups are significantly different. (p=0.089) There are significant difference somewhere among three groups. (p<0.05) ANOVA: Analysis of Variance Output of the one-way ANOVA ource of Variation df M F P-value F crit Between Groups E Within Groups Total Output of the two-way ANOVA ource of Variation df M F P-value F crit ample Columns Interaction Within Total When (p-value < 0.01 or 0.05), there is(are) significant difference somewhere among groups. ignificant difference among ample (e.g. initial conditions) cannot be found (p > 0.05). ignificant difference can be found somewhere among Columns (e.g. three methods) (p < 0.01). We need not care an interaction effect between two factors (e.g. initial condition vs. methods) (p > 0.05). ample factor Column factor A B C

16 ANOVA: Analysis of Variance Q1: Where is significant among A, B, and C? A1: Apply multiple comparisons between all pairs among columns. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett method, Williams method, Tukey method, Nemenyi test, Tukey-Kramer method, Games/Howell method, Duncan's new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) ource of Variation df M F P-value F crit ample Columns Interaction Within Total significant? ample factor Column factor A B C distribution Non-Parametric Tests groups n groups (n > ) Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA If normality and equal variances are not guaranteed, use non-parametric tests. two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test

17 Mann-Whitney U-test groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test Mann-Whitney U-test (Wilcoxon-Mann-Whitney test, two sample Wilcoxon test) 1. Comparison of two groups.. Data have no normality. 3. There are no corresponding between two groups.? no normality? n-th generation

18 Mann-Whitney U-test (Wilcoxon-Mann-Whitney test, two sample Wilcoxon test) 1. Calculate a U value U = = 9 U' = 11 (U + U' = n 1 n ) when two values are the same, ( count as 0.5. ) Mann-Whitney U-test (cont.) (Wilcoxon-Mann-Whitney test, two sample Wilcoxon test). ee a U-test table. Use the smaller value of U or U'. When n 1 0 and n 0, see a Mann-Whitney test table. (where n 1 and n are the # of of two groups.) Otherwise, since U follows the below normal distribution roughly, n1n n1n ( n1 n 1) N U, U N, 1 U U normalize U as z and check a standard normal distribution table U nn 1 U U n1 n 1 1 ( with the z, where and. n n Use an Excel function to calculate the p-value for the z-value: p-value = 1 - NORM..DIT( z ) 1)

19 Examples: Mann-Whitney U-test (Wilcoxon-Mann-Whitney test, two sample Wilcoxon test) Ex.1 0 Ex Ex U = 9 U' = 11 U = 1 U' = 13 U = 3.5 U' = 1.5 (p < 0.05) n n ー (p < 0.01) n n ー 4 ー ー Exercise: Mann-Whitney U-test (Wilcoxon-Mann-Whitney test, two sample Wilcoxon test) (p < 0.05) n 1 n U = 9.5 U' = ー ince U' > 5, (p > 0.05): ( significance is not found) (p < 0.01) n n ーーーー 4 ーー

20 ign Test groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test ign Test (1)ign Test significance test between the # of winnings and losses ()Wilcoxon's igned Ranks Test significance test using both the # of winnings and losses and the level of winnings/losses of groups # of winnings and losses the level of winnings/losses

21 ign Test 1. Calculate the # of winnings and losses by comparing runs with the same initial.. Check a sign test table to show significance of two methods. n-th generation ign Test Fig.3 in Y. Pei and H. Takagi, "Fourier analysis of the fitness landscape for evolutionary search acceleration," IEEE Congress on Evolutionary Computation (CEC), pp.1-7, Brisbane, Australia (June 10-15, 01). The (+,-) marks show whether our proposed methods converge significantly or poorer than normal DE, respectively, (p 0.05). Fig. in the same paper. Generations F1: DE_N vs. DE_LR F1: DE_N vs. DE_L F1: DE_N vs. DE_FR_GLB_nD F1: DE_N vs. DE_FR_LOC_nD F1: DE_N vs. DE_FR_GLB_1D F1: DE_N vs. DE_FR_LOC_1D F: DE_N vs. DE_LR F: DE_N vs. DE_L F: DE_N vs. DE_FR_GLB_nD F: DE_N vs. DE_FR_LOC_nD F: DE_N vs. DE_FR_GLB_1D F: DE_N vs. DE_FR_LOC_1D F3: DE_N vs. DE_LR F3: DE_N vs. DE_L F3: DE_N vs. DE_FR_GLB_nD F3: DE_N vs. DE_FR_LOC_nD F3: DE_N vs. DE_FR_GLB_1D F3: DE_N vs. DE_FR_LOC_1D + + F4: DE_N vs. DE_LR F4: DE_N vs. DE_L F4: DE_N vs. DE_FR_GLB_nD F4: DE_N vs. DE_FR_LOC_nD F4: DE_N vs. DE_FR_GLB_1D F4: DE_N vs. DE_FR_LOC_1D F5: DE_N vs. DE_LR F5: DE_N vs. DE_L F5: DE_N vs. DE_FR_GLB_nD F5: DE_N vs. DE_FR_LOC_nD ++ F5: DE_N vs. DE_FR_GLB_1D +++ F5: DE_N vs. DE_FR_LOC_1D + F6: DE_N vs. DE_LR F6: DE_N vs. DE_L F6: DE_N vs. DE_FR_GLB_nD F6: DE_N vs. DE_FR_LOC_nD F6: DE_N vs. DE_FR_GLB_1D F6: DE_N vs. DE_FR_LOC_1D F7: DE_N vs. DE_LR + F7: DE_N vs. DE_L F7: DE_N vs. DE_FR_GLB_nD F7: DE_N vs. DE_FR_LOC_nD + F7: DE_N vs. DE_FR_GLB_1D + F7: DE_N vs. DE_FR_LOC_1D F8: DE_N vs. DE_LR F8: DE_N vs. DE_L F8: DE_N vs. DE_FR_GLB_nD F8: DE_N vs. DE_FR_LOC_nD + F8: DE_N vs. DE_FR_GLB_1D F8: DE_N vs. DE_FR_LOC_1D

22 Task Example Whether performances of pattern recognition methods A and B are significantly different? n 1 cases: Both methods succeeded. n cases: Method A succeeded, and method B failed. n 3 cases: Method A failed, and method B succeeded. n 4 cases: Both methods failed. level of significance % % level of significance ign Test % % How to check? 1. et N = n + n 3.. Check the right table with the N. 3. If min(n, n 3 ) is smaller than the number for the N, we can say that there is significant difference with the significant risk level of XX. Exercise Whether there is significant difference for n = 1 and n 3 = 8? ANWER: Check the right table with N = 40. As n is bigger than 11 and smaller than 13, we can say that there is a significant difference between two with (p < 0.05) but cannot say so with (p < 0.01). ign Test level of significance % % Let's think about the case of N = 17. To say that n 1 and n are significantly different, (n 1 vs. n ) = (17 vs. 0), (16 vs. 1), or (15 vs. ) (p < 0.01) or (n 1 vs. n ) = (14 vs. 3) or (13 vs. 4) (p < 0.05)

23 Exercise: ign Test level of significance % % Check the significance of: 16 vs vs. 1 9 vs vs. 5 distribution Wilcoxon igned-ranks Test groups n groups (n > ) Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Wilcoxon signed-ranks test one-way two-way Kruskal-Wallis test Friedman test

24 Wilcoxon igned-ranks Test Q: When a sign test could not show significance, how to do? A: Try the Wilcoxon signed-ranks test. It is more sensitive than a simple sign test due to more information use. n-th generation Wilcoxon igned-ranks Test (1)ign Test significance test between the # of winnings and losses ()Wilcoxon's igned Ranks Test significance test using both the # of winnings and losses and the level of winnings/losses of groups # of winnings and losses the level of winnings/losses

25 Wilcoxon igned-ranks Test Example: (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs n 8 (step 6) Wilcoxon test table (step 5) T # of ( tep4) 3 (step 6) n = 8 T = 3 T=3 3 (n=8, p<0.05), then difference between systems A and B is significant. T=3 > 0 (n=8, p<0.01), then we cannot say there is a significant difference. When n > 5 As T follows the below normal distribution roughly, n( n 1) n( n 1)(n 1) N T, T N, 4 4 normalize T as the below and check a standard normal distribution table with the z; see T and in the above equation. T T z T T Wilcoxon Test Table: significance point of T one-tail p < 0.05 p < two-tail p < 0.05 p < 0.01 n =

26 Wilcoxon igned-ranks Test (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs Tip # Tip # Tip # Tips: 1. When d = 0, ignore the.. When there are the same ranks of d, give average ranks. 1 Give the average rank 6.5 = ( )/ Exercise 1: Wilcoxon igned-ranks Test (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs n = (step 5) T (step 6) Wilcoxon test table T = # of ( tep4)

27 Exercise 1: Wilcoxon igned-ranks Test (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs n = 8 (step 6) Wilcoxon test table (step 5) T # of ( tep4) T = As T(=) < 3, there is a significant difference between A and B (p<0.05). But, as 0 < T(=), we cannot say so with the significance level of (p<0.01). Exercise : Wilcoxon igned-ranks Test (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs n = (step 6) Wilcoxon test table (step 5) T # of ( tep4)

28 Exercise : Wilcoxon igned-ranks Test (step 1) (step ) (step 3) (step 4) v (system A) v (system B) difference d rank of d add sign to the ranks rank of fewer # of signs (No need to care the case of d = 0.) n = 8 (no count for d = 0.) (step 6) Wilcoxon test table As T > 3, we cannot say that there is a significant difference between A and B. (step 5) T # of ( tep4) T = 4 Exercise 3: Wilcoxon igned-ranks Test Explain how to apply this test to test whether two groups are significantly different at the below generation? n-th generation

29 Kruskal-Wallis Test groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test Kruskal-Wallis test two-way Wilcoxon signed-ranks test Friedman test Kruskal-Wallis Test 1. Comparison of more than two groups.. Data have no normality. 3. There are no corresponding among groups.??? no normality n-th generation

30 Kruskal-Wallis Test Let's use ranks of Kruskal-Wallis Test N: total # of k: # of groups n i : # of of group i R i : sum of ranks of group i R 1 = 38 R = 69 R 3 = 46 How to Test 1. Rank all.. Calculate N, k, n i and R i. 3. Calculate statistical value H. 1 H N( N 1) i1 R n 3( N 1) 4. If k = 3 and N 17, compare the H with a significant point in a Kruskal-Wallis test table. Otherwise, assume that H follows the χ distribution and test the H using a χ distribution table of (k-1) degrees of freedom k i i

31 Example: Kruskal-Wallis Test N = n 1 +n +n 3 = 17 k = 3 groups (n 1, n, n 3 ) = (6, 5, 6) (R 1, R, R 3 ) = (38, 69, 46) 1 H N( N 1) 1 17(17 = k i1 R n i 38*38 1) 6 i 3( N 1) 69* *46 6 3(17 1) ince significant points of (p<0.05) and (p<0.01) for (n 1, n, n 3 ) = (6, 5, 6) are and 8.14, respectively, there are significant difference(s) somewhere among three groups (p<0.05) significance point of (p<0.05) 8.14 significance point of (p<0.01) Kruskal-Wallis Test Table (for k = 3 and N 17) n 1 n n 3 p < 0.05 p < 0.01 n 1 n n 3 p < 0.05 p < Example: Kruskal-Wallis Test N = n 1 +n +n 3 = 17 k = 3 groups (n 1, n, n 3 ) = (6, 5, 6) (R 1, R, R 3 ) = (38, 69, 46) Q1: Where is significant among A, B, and C? k 1 Ri H A1: Apply multiple 3( Ncomparisons 1) between all pairs N( among N 1) i1columns. ni (Fisher's PLD 3 method, cheffé method, Bonferroni-Dunn test, Dunnett method, 1 38*38 69*69 46*46 Williams Ri method, Tukey method, Nemenyi 3(17 test, 1) Tukey-Kramer method, 17(17 Games/Howell 1) i1 ni 6 method, 5 Duncan's 6 new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) = ince significant points of (p<0.05) and (p<0.01) for (n 1, n, n 3 ) = (6, 5, 6) are and 8.14, respectively, there are significant difference(s) somewhere among three groups (p<0.05) significance point of (p<0.05) 8.14 significance point of (p<0.01) Kruskal-Wallis Test Table (for k = 3 and N 17) n 1 n n 3 p < 0.05 p < 0.01 n 1 n n 3 p < 0.05 p <

32 Exercise: Kruskal-Wallis Test R 1 = R = R 3 = N = n 1 +n +n 3 = 13 samples k = 3 groups (n 1, n, n 3 ) = ( 5, 4, 4) (R 1, R, R 3 ) = (4, 44, 3) 1 H N( N 1) = 6.7 k i1 significance point of (p<0.05) R n i i 3( N 1) There is/are significant difference(s) somewhere among three groups (p<0.05) significance point of (p<0.01) Friedman Test groups n groups (n > ) distribution Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test sign test one-way Kruskal-Wallis test Wilcoxon signed-ranks test Friedman test

33 Friedman Test When (1) more than two groups, () have correspondence (not independent), but (3) the conditions of two-way ANOVA are not satisfied, Let' use ranks of and Friedman test. (ex.) Comparison of recognition rates. benchmark tasks methods a b c d A B C D methods a b c d Friedman Test tep 1: Make a ranking table. tep : um ranks of the factor that you want to test. benchmark tasks method a b c d A B C D Σ # of methods (k = 4) # of (n = 4) tep 3: Calculate the Friedman test value, χ r. k 1 r Ri 3n( k 1) nk( k 1) i1 where (k, n) are the # of levels of factors 1 and. methods a b c d ranking among methods tep 4: If k =3 or 4, compare χ r with a significant point in a Friedman test table. Otherwise, use a χ table of (k-1) degrees of freedom.

34 Example: Friedman Test tep 1: Make a ranking table. tep : um ranks of the factor that you want to test. benchmark tasks # of methods (k = 4) tep 3: Calculate the Friedman test value, χ r. k 1 r Ri 3n( k 1) nk( k 1) method a b c d A B C D Σ i *4*5 4*4*5 8.1 tep 4: ince significant point for (k,n) = (4,4) is7.80, there is/are significant difference(s) somewhere among four methods, a, b, c, and d (p<0.05) significance point of (p<0.05) significance point of (p<0.01) # of (n = 4) k n p<0.05 p< Friedman test table Example: Friedman Test tep 1: Make a ranking table. tep : um ranks of the factor that you want to test. benchmark method tasks a b c d A B C D Σ Q1: Where is significant among a, b, c, or d? A1: Apply multiple comparisons between all pairs among columns. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett Friedman test table. method, Williams method, # of methods Tukey method, (k = 4) Nemenyi test, Tukey-Kramer tep 3: method, Calculate Games/Howell the Friedman method, test Duncan's value, χnew r. multiple range test, tudent-newman-keuls k 1 method, etc. Each has different characteristics.) r Ri 3n( k 1) nk( k 1) i *4*5 4*4*5 8.1 tep 4: ince significant point for (k,n) = (4,4) is7.80, there is/are significant difference(s) somewhere among four methods, a, b, c, and d (p<0.05) significance point of (p<0.05) significance point of (p<0.01) # of (n = 4) k n p<0.05 p<

35 Multiple Comparisons When there is significant difference among groups, multiple comparison is used to know which group is significantly difference from others. Example 4C = 6 times of pair comparisons with (p < 0.05) 1 - (1-0.05) 6 = significance level 6.5%! Multiple Comparisons When there is significant difference among groups, multiple comparison is used to know which group is significantly difference from others. olution is to apply multiple pair comparisons with more strict significance level. Example 4C = 6 times of pair comparisons with (p < 0.05) 1 - (1-0.05) 6 = significance level 6.5%!

36 Multiple Comparisons -- Bobferroni method -- When pair comparisons are applied m times, let's use a significance level of p / m C = 6 times of pair comparisons with (p < ) 6 Features: (1) imple. () Rather strict, i.e. showing significances is rather hard. Multiple Comparisons -- Holm method -- Corrected Bonferroni method to detect significances easily. Example pair comparisons p-value corrected p-value eqn. corrected p-value vs. vs. vs. vs. vs. vs = p-value* = p-value* = p-value* = p-value* = p-value* = p-value*

37 cheffé's Method of Paired Comparison groups n groups (n > ) distribution normality (parametric) t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA sign test one-way kruskal-wallis test no normality (non-parametric) Wilcoxon igned-ranks Test two-way Friedman test + cheffé's method of comparison for Human ubjective Tests cheffé's Method of Paired Comparison lighting design of 3-D CG Corridor W K L Verenda Wall room layout planning design B Target ystem Evolutionary Computation room lighting design by optimizing LED assignments subjective evaluations Interactive Evolutionary Computation IEC image enhancement processing Can you hear me??? hearing-aid fitting MEM design measuring mental scale geological simulation

38 cheffé's Method of Paired Comparison ANOVA based on n C comparisons for n objects. even ANOVA even even significance check using a yardstick cheffé's Method of Paired Comparison Original method and three modified methods All subjects must evaluate all pairs. no yes order effect yes no original ( 原法, 195) Haga's variation ( 芳賀の変法 ) Ura's variation ( 浦の変法, 1956) Nakaya's variation ( 中屋の変法, 1970) Order Effect (1) and then () and then may result different evaluation.

39 cheffé's Method of Paired Comparison 1. Ask N human subjects to evaluate t objects in 3, 5 or 7 grades.. Assign [-1, +1], [-, +] or [-3, +3] for these grades. 3. Then, start calculation (see other material). Questionnaire Total row even even even strap for a mobile phone invitation to a dinner Paired comparisons for t=3 objects. O1 O O3 O4 O5 O6 A1 - A A1 - A A - A ix subjects (N = 6) Application Example: What is the best present to be her/his boy/girl friend? [ITUATION] He/he is my longing. I want to be her/his boy/girl friend before we graduate from our university. To get over my one-way love, I decided to present something of about 3,000 JPY and express my heart. I show you 5 C pairs of presents. Please compare each pair and mark your relative evaluation in five levels. tea /coffee stuffed animal fountain pen Ex. Q.

40 I think effective. Results of cheffé's Method of Paired Comparison (Nakaya's variation) What is the best present to be her/his boy/girl friend? less effective (significant difference) present from a male I will catch her heart by dinner more effective less effective present from a female How about tea leave or a stuffed anima? more effective Reality is... less effective I hesitate to accept it as we have not gone about with him more effective less effective Eat! Eat! Eat! more effective cheffé's Method of Paired Comparison Modified methods by Ura and Nakaya Original method and three modified methods All subjects must evaluate all pairs. no yes order effect yes no original ( 原法, 195) Haga's variation ( 芳賀の変法 ) Ura's variation ( 浦の変法, 1956) Nakaya's variation ( 中屋の変法, 1970)

41 cheffé's Method of Paired Comparison Modified method by Ura Pairwise comparisons for objects which are effected by display order (order effect). even even even even even even cheffé's Method of Paired Comparison Modified method by Ura Ask N human subjects to evaluate t C pairs for t objects in 3, 5 or 7 grades and assign [-1, +1], [-, +] or [-3, +3], respectively. even even even even even even

42 cheffé's Method of Paired Comparison Modified method by Ura tep 1: Make comparison table of each human subject. even A A A A A A A A 3 A A1 A A3 A4 3 A A 4 A A A A 4 A 4 cheffé's Method of Paired Comparison Modified method by Ura tep 1: Make comparison table of each human subject. ubject O 1 x ijl : evaluation value when the l-th human subject compares the i-th object with the j-th object. ubject O ubject O 3

43 cheffé's Method of Paired Comparison Modified method by Ura tep : Make a table summing all subjects' and calculate the average evaluations for all objects. Average of four objects ˆ i 1 tn ( x i x i ) where t: # of object (4) N: # of human subjects (3) A 4 A 3 A A ˆ 4 ˆ 3 ˆ ˆ1 cheffé's Method of Paired Comparison Modified method by Ura tep 3: Make a ANOVA table. ( B) T 1 tn 1 t ( x ( x 1 ( x N i ji 1 x Nt( t 1) ( B) i l i i 1 t( t 1) T l i ji i x x ij ijl il x i ) x x l ( B) ji il ) ) ( B) unbiased variance = /f where, ( B),,, ( B), F = and f degree of freedom. unbiased variance unbiased variance of for F tests., T

44 x x N x x t x x tn i i j ji ij l i il l i B i i i ) ( ) ( 1 ) ( 1 ) ( 1 l i i j ijl T B B T i l B x x t t x t Nt ) ( ) ( ) ( 1) ( 1 1) ( 1 cheffé's Method of Paired Comparison Modified method by Ura ANOVA table A 4 A 3 A A 1 ˆ1 ˆ ˆ3 ˆ 4 There are significant difference among A 1 -A 4 cheffé's Method of Paired Comparison Modified method by Ura ANOVA table.

45 cheffé's Method of Paired Comparison Modified method by Ura tep 4: Apply multiple comparisons. Q1: Where is significant among A 1, A, and A 3? A1: Apply multiple comparisons between all pairs. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett method, Williams method, Tukey method, Nemenyi test, Tukey-Kramer method, Games/Howell method, Duncan's new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) cheffé's Method of Paired Comparison Modified method by Ura tep 4: Apply multiple comparisons between all pairs and find which distance is significant. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett method, Williams method, Tukey method, Nemenyi test, Tukey-Kramer method, Games/Howell method, Duncan's new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) Example of a simple multiple comparison. Calculate a studentized yardstick When a difference of average > a studentized yardstick, the distance is significant. A 4 A 3 A A ˆ 4 ˆ 3 ˆ ˆ 1

46 cheffé's Method of Paired Comparison Modified method by Ura tep 4: Example of a simple multiple comparisons. Y q ( t, f ) ˆ / tn (studentized yardstick) where ( ˆ, t, N ) are an unbiased variance of ε, the # of objects, and the #of human subjects; q ( t, f ) is a studentized range obtained is a statistical test table for t, the degree of freedom of ε ( f ), and the significant level of φ; see these variables in an ANOVA table. When (t, f) = (4,1), studentized yardsticks for significance levels of 5% and 1% are: (ee q 0.05 (4,1) in the next slide.) tudentized yardstick q ( 0.05 t, f ) f t

47 cheffé's Method of Paired Comparison Modified method by Ura tep 4: Example of a simple multiple comparisons. cheffé's Method of Paired Comparison Modified methods by Ura and Nakaya Original method and three modified methods All subjects must evaluate all pairs. no yes order effect yes no original ( 原法, 195) Haga's variation ( 芳賀の変法 ) Ura's variation ( 浦の変法, 1956) Nakaya's variation ( 中屋の変法, 1970)

48 cheffé's Method of Paired Comparison Modified method by Nakaya Pairwise comparisons for objects that can be compared without order effect. even even even cheffé's Method of Paired Comparison Modified method by Nakaya 1. Ask N human subjects to evaluate t objects in 3, 5 or 7 grades.. Assign [-1, +1], [-, +] or [-3, +3] for these grades, respectively. 3. Then, start calculation (see other material). Questionnaire even even even Paired comparisons for t=3 objects. ix human subjects (N = 6) O 1 O O 3 O 4 O 5 O 6 A1 - A A1 - A A - A

49 cheffé's Method of Paired Comparison Modified method by Nakaya tep 1: Make comparison table of each human subject. x ijl : evaluation value when the l-th human subject compares the i-th object with the j-th object. cheffé's Method of Paired Comparison Modified method by Nakaya tep : Make a table summing all subjects' and calculate the average evaluations for all objects. Average of four objects ˆ i 1 tn xi where t: # of object (3) N: # of human subjects (6)

50 tep 3: Make a ANOVA table. 1 xi.. tn ( B) 1 t i 1 x tn i ANOVA table. cheffé's Method of Paired Comparison Modified method by Nakaya l i i.. x i. l F ( B) 1 t T l i x i. l ( B) Unbiased variance Unbariased variance of There are significant difference among A 1 -A 3 cheffé's Method of Paired Comparison Modified method by Nakaya tep 4: Apply multiple comparisons. Q1: Where is significant among A 1, A, and A 3? A1: Apply multiple comparisons between all pairs among columns. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett method, Williams method, Tukey method, Nemenyi test, Tukey-Kramer method, Games/Howell method, Duncan's new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) ANOVA table.

51 cheffé's Method of Paired Comparison Modified method by Nakaya tep 4: Apply multiple comparisons between all pairs and find which distance is significant. (Fisher's PLD method, cheffé method, Bonferroni-Dunn test, Dunnett method, Williams method, Tukey method, Nemenyi test, Tukey-Kramer method, Games/Howell method, Duncan's new multiple range test, tudent-newman-keuls method, etc. Each has different characteristics.) Example of a simple multiple comparison. Calculate a studentized yardstick When a difference of average > a studentized yardstick, the distance is significant. cheffé's Method of Paired Comparison Modified method by Nakaya tep 4: Example of a simple multiple comparisons. Y q ( t, f ) ˆ / tn (studentized yardstick) where ( ˆ, t, N ) are an unbiased variance of ε, the # of objects, and the #of human subjects; q ( t, f ) is a studentized range obtained is a statistical test table for t, the degree of freedom of ε ( f ), and the significant level of φ; see these variables in an ANOVA table. Y / (ee q 0.05 (3,5) in the next slide.) Y /

52 tudentized yardstick q ( 0.05 t, f ) f t distribution UMMARY 1. We overview which statistical test we should use for which case. groups n groups (n > ) Parametric Test (normality) un un t -test t -test ANOVA (Analysis of Variance) one-way ANOVA two-way ANOVA Non-parametric Test (no normality) un Mann-Whitney U-test one-way Kruskal-Wallis test sign test two-way Wilcoxon signed-ranks test Friedman test + cheffé's method of comparison for Human ubjective Tests. We can appeal the effectiveness of our experiments with correct use of statistical tests.

Contents. 2 groups n groups (n > 2) (independent) unpaired. paired t -test. one-way ANOVA ANOVA. (related) paired. two-way ANOVA.

Contents. 2 groups n groups (n > 2) (independent) unpaired. paired t -test. one-way ANOVA ANOVA. (related) paired. two-way ANOVA. Statstcal ests for Computatonal Intellgence Research and Human Subjectve ests Sldes are downloadable from http://www.desgn.kyushu-u.ac.jp/~takag Hdeyuk AKAGI Kyushu Unversty, Japan http://www.desgn.kyushu-u.ac.jp/~takag/

More information

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures

More information

2. RELATIONSHIP BETWEEN A QUALITATIVE AND A QUANTITATIVE VARIABLE

2. RELATIONSHIP BETWEEN A QUALITATIVE AND A QUANTITATIVE VARIABLE 7/09/06. RELATIONHIP BETWEEN A QUALITATIVE AND A QUANTITATIVE VARIABLE Design and Data Analysis in Psychology II usana anduvete Chaves alvadorchacón Moscoso. INTRODUCTION You may examine gender differences

More information

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely

More information

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics SEVERAL μs AND MEDIANS: MORE ISSUES Business Statistics CONTENTS Post-hoc analysis ANOVA for 2 groups The equal variances assumption The Kruskal-Wallis test Old exam question Further study POST-HOC ANALYSIS

More information

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and

More information

Lec 1: An Introduction to ANOVA

Lec 1: An Introduction to ANOVA Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC CHI SQUARE ANALYSIS I N T R O D U C T I O N T O N O N - P A R A M E T R I C A N A L Y S E S HYPOTHESIS TESTS SO FAR We ve discussed One-sample t-test Dependent Sample t-tests Independent Samples t-tests

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality

More information

Non-parametric tests, part A:

Non-parametric tests, part A: Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are

More information

Non-parametric (Distribution-free) approaches p188 CN

Non-parametric (Distribution-free) approaches p188 CN Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14

More information

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr. Department of Economics Business Statistics Chapter 1 Chi-square test of independence & Analysis of Variance ECON 509 Dr. Mohammad Zainal Chapter Goals After completing this chapter, you should be able

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 1 1-1 Basic Business Statistics 11 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Basic Business Statistics, 11e 009 Prentice-Hall, Inc. Chap 1-1 Learning Objectives In this chapter,

More information

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data

ST4241 Design and Analysis of Clinical Trials Lecture 7: N. Lecture 7: Non-parametric tests for PDG data ST4241 Design and Analysis of Clinical Trials Lecture 7: Non-parametric tests for PDG data Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 2, 2016 Outline Non-parametric

More information

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes 2 Quick review: Normal distribution Y N(µ, σ 2 ), f Y (y) = 1 2πσ 2 (y µ)2 e 2σ 2 E[Y ] =

More information

Data Analysis: Agonistic Display in Betta splendens I. Betta splendens Research: Parametric or Non-parametric Data?

Data Analysis: Agonistic Display in Betta splendens I. Betta splendens Research: Parametric or Non-parametric Data? Data Analysis: Agonistic Display in Betta splendens By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics Nonparametric or Distribution-free statistics: used when data are ordinal (i.e., rankings) used when ratio/interval data are not normally distributed (data are converted to ranks)

More information

Agonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data?

Agonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data? Agonistic Display in Betta splendens: Data Analysis By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether

More information

Non-parametric methods

Non-parametric methods Eastern Mediterranean University Faculty of Medicine Biostatistics course Non-parametric methods March 4&7, 2016 Instructor: Dr. Nimet İlke Akçay (ilke.cetin@emu.edu.tr) Learning Objectives 1. Distinguish

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Rank-Based Methods. Lukas Meier

Rank-Based Methods. Lukas Meier Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data

More information

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large Z Test Comparing a group mean to a hypothesis T test (about 1 mean) T test (about 2 means) Comparing mean to sample mean. Similar means = will have same response to treatment Two unknown means are different

More information

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health Nonparametric statistic methods Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health Measurement What are the 4 levels of measurement discussed? 1. Nominal or Classificatory Scale Gender,

More information

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution

More information

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance Chapter 8 Student Lecture Notes 8-1 Department of Economics Business Statistics Chapter 1 Chi-square test of independence & Analysis of Variance ECON 509 Dr. Mohammad Zainal Chapter Goals After completing

More information

Multiple Comparison Procedures Cohen Chapter 13. For EDUC/PSY 6600

Multiple Comparison Procedures Cohen Chapter 13. For EDUC/PSY 6600 Multiple Comparison Procedures Cohen Chapter 13 For EDUC/PSY 6600 1 We have to go to the deductions and the inferences, said Lestrade, winking at me. I find it hard enough to tackle facts, Holmes, without

More information

MATH Notebook 3 Spring 2018

MATH Notebook 3 Spring 2018 MATH448001 Notebook 3 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 3 MATH448001 Notebook 3 3 3.1 One Way Layout........................................

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

ST4241 Design and Analysis of Clinical Trials Lecture 9: N. Lecture 9: Non-parametric procedures for CRBD

ST4241 Design and Analysis of Clinical Trials Lecture 9: N. Lecture 9: Non-parametric procedures for CRBD ST21 Design and Analysis of Clinical Trials Lecture 9: Non-parametric procedures for CRBD Department of Statistics & Applied Probability 8:00-10:00 am, Friday, September 9, 2016 Outline Nonparametric tests

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

Multiple t Tests. Introduction to Analysis of Variance. Experiments with More than 2 Conditions

Multiple t Tests. Introduction to Analysis of Variance. Experiments with More than 2 Conditions Introduction to Analysis of Variance 1 Experiments with More than 2 Conditions Often the research that psychologists perform has more conditions than just the control and experimental conditions You might

More information

Analysis of Variance

Analysis of Variance Analysis of Variance Blood coagulation time T avg A 62 60 63 59 61 B 63 67 71 64 65 66 66 C 68 66 71 67 68 68 68 D 56 62 60 61 63 64 63 59 61 64 Blood coagulation time A B C D Combined 56 57 58 59 60 61

More information

Chapter 7 Comparison of two independent samples

Chapter 7 Comparison of two independent samples Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information

Dr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines)

Dr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines) Dr. Maddah ENMG 617 EM Statistics 10/12/12 Nonparametric Statistics (Chapter 16, Hines) Introduction Most of the hypothesis testing presented so far assumes normally distributed data. These approaches

More information

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

More information

Analysis of variance

Analysis of variance Analysis of variance Tron Anders Moger 3.0.007 Comparing more than two groups Up to now we have studied situations with One observation per subject One group Two groups Two or more observations per subject

More information

CE3502. Environmental Measurements, Monitoring & Data Analysis. ANOVA: Analysis of. T-tests: Excel options

CE3502. Environmental Measurements, Monitoring & Data Analysis. ANOVA: Analysis of. T-tests: Excel options CE350. Environmental Measurements, Monitoring & Data Analysis ANOVA: Analysis of Variance T-tests: Excel options Paired t-tests tests (use s diff, ν =n=n x y ); Unpaired, variance equal (use s pool, ν

More information

Nonparametric Location Tests: k-sample

Nonparametric Location Tests: k-sample Nonparametric Location Tests: k-sample Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)

More information

= 1 i. normal approximation to χ 2 df > df

= 1 i. normal approximation to χ 2 df > df χ tests 1) 1 categorical variable χ test for goodness-of-fit ) categorical variables χ test for independence (association, contingency) 3) categorical variables McNemar's test for change χ df k (O i 1

More information

Module 9: Nonparametric Statistics Statistics (OA3102)

Module 9: Nonparametric Statistics Statistics (OA3102) Module 9: Nonparametric Statistics Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 15.1-15.6 Revision: 3-12 1 Goals for this Lecture

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

9 One-Way Analysis of Variance

9 One-Way Analysis of Variance 9 One-Way Analysis of Variance SW Chapter 11 - all sections except 6. The one-way analysis of variance (ANOVA) is a generalization of the two sample t test to k 2 groups. Assume that the populations of

More information

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F. Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

Selection should be based on the desired biological interpretation!

Selection should be based on the desired biological interpretation! Statistical tools to compare levels of parasitism Jen_ Reiczigel,, Lajos Rózsa Hungary What to compare? The prevalence? The mean intensity? The median intensity? Or something else? And which statistical

More information

Introduction. Chapter 8

Introduction. Chapter 8 Chapter 8 Introduction In general, a researcher wants to compare one treatment against another. The analysis of variance (ANOVA) is a general test for comparing treatment means. When the null hypothesis

More information

Statistics Handbook. All statistical tables were computed by the author.

Statistics Handbook. All statistical tables were computed by the author. Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance

More information

Lecture 14: ANOVA and the F-test

Lecture 14: ANOVA and the F-test Lecture 14: ANOVA and the F-test S. Massa, Department of Statistics, University of Oxford 3 February 2016 Example Consider a study of 983 individuals and examine the relationship between duration of breastfeeding

More information

What is a Hypothesis?

What is a Hypothesis? What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population mean Example: The mean monthly cell phone bill in this city is μ = $42 population proportion Example:

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

This is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same!

This is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same! Two sample tests (part II): What to do if your data are not distributed normally: Option 1: if your sample size is large enough, don't worry - go ahead and use a t-test (the CLT will take care of non-normal

More information

Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA)

Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) BSTT523 Pagano & Gauvreau Chapter 13 1 Nonparametric Statistics Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) In particular, data

More information

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks

More information

3. Nonparametric methods

3. Nonparametric methods 3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Analysis of Variance (ANOVA) Two types of ANOVA tests: Independent measures and Repeated measures Comparing 2 means: X 1 = 20 t - test X 2 = 30 How can we Compare 3 means?: X 1 = 20 X 2 = 30 X 3 = 35 ANOVA

More information

Lecture 28 Chi-Square Analysis

Lecture 28 Chi-Square Analysis Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not

More information

1 One-way Analysis of Variance

1 One-way Analysis of Variance 1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment

More information

Comparing the means of more than two groups

Comparing the means of more than two groups Comparing the means of more than two groups Chapter 15 Analysis of variance (ANOVA) Like a t-test, but can compare more than two groups Asks whether any of two or more means is different from any other.

More information

Textbook Examples of. SPSS Procedure

Textbook Examples of. SPSS Procedure Textbook s of IBM SPSS Procedures Each SPSS procedure listed below has its own section in the textbook. These sections include a purpose statement that describes the statistical test, identification of

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

A posteriori multiple comparison tests

A posteriori multiple comparison tests A posteriori multiple comparison tests 11/15/16 1 Recall the Lakes experiment Source of variation SS DF MS F P Lakes 58.000 2 29.400 8.243 0.006 Error 42.800 12 3.567 Total 101.600 14 The ANOVA tells us

More information

4.1. Introduction: Comparing Means

4.1. Introduction: Comparing Means 4. Analysis of Variance (ANOVA) 4.1. Introduction: Comparing Means Consider the problem of testing H 0 : µ 1 = µ 2 against H 1 : µ 1 µ 2 in two independent samples of two different populations of possibly

More information

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Chapter 12. Analysis of variance

Chapter 12. Analysis of variance Serik Sagitov, Chalmers and GU, January 9, 016 Chapter 1. Analysis of variance Chapter 11: I = samples independent samples paired samples Chapter 1: I 3 samples of equal size J one-way layout two-way layout

More information

Chapter 10: Analysis of variance (ANOVA)

Chapter 10: Analysis of variance (ANOVA) Chapter 10: Analysis of variance (ANOVA) ANOVA (Analysis of variance) is a collection of techniques for dealing with more general experiments than the previous one-sample or two-sample tests. We first

More information

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data 1999 Prentice-Hall, Inc. Chap. 10-1 Chapter Topics The Completely Randomized Model: One-Factor

More information

Inferences About the Difference Between Two Means

Inferences About the Difference Between Two Means 7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent

More information

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

More information

Group comparison test for independent samples

Group comparison test for independent samples Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences between means. Supposing that: samples come from normal populations

More information

Statistics for EES Factorial analysis of variance

Statistics for EES Factorial analysis of variance Statistics for EES Factorial analysis of variance Dirk Metzler June 12, 2015 Contents 1 ANOVA and F -Test 1 2 Pairwise comparisons and multiple testing 6 3 Non-parametric: The Kruskal-Wallis Test 9 1 ANOVA

More information

http://www.statsoft.it/out.php?loc=http://www.statsoft.com/textbook/ Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences

More information

(Foundation of Medical Statistics)

(Foundation of Medical Statistics) (Foundation of Medical Statistics) ( ) 4. ANOVA and the multiple comparisons 26/10/2018 Math and Stat in Medical Sciences Basic Statistics 26/10/2018 1 / 27 Analysis of variance (ANOVA) Consider more than

More information

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I. Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample

More information

Analysis of variance

Analysis of variance Analysis of variance 1 Method If the null hypothesis is true, then the populations are the same: they are normal, and they have the same mean and the same variance. We will estimate the numerical value

More information

PSYC 331 STATISTICS FOR PSYCHOLOGISTS

PSYC 331 STATISTICS FOR PSYCHOLOGISTS PSYC 331 STATISTICS FOR PSYCHOLOGISTS Session 4 A PARAMETRIC STATISTICAL TEST FOR MORE THAN TWO POPULATIONS Lecturer: Dr. Paul Narh Doku, Dept of Psychology, UG Contact Information: pndoku@ug.edu.gh College

More information

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials.

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials. One-Way ANOVA Summary The One-Way ANOVA procedure is designed to construct a statistical model describing the impact of a single categorical factor X on a dependent variable Y. Tests are run to determine

More information

Week 14 Comparing k(> 2) Populations

Week 14 Comparing k(> 2) Populations Week 14 Comparing k(> 2) Populations Week 14 Objectives Methods associated with testing for the equality of k(> 2) means or proportions are presented. Post-testing concepts and analysis are introduced.

More information

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science Statistiek II John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa Dept of Information Science j.nerbonne@rug.nl February 13, 2014 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated

More information

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests: One sided tests So far all of our tests have been two sided. While this may be a bit easier to understand, this is often not the best way to do a hypothesis test. One simple thing that we can do to get

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included

More information

Chapter 18 Resampling and Nonparametric Approaches To Data

Chapter 18 Resampling and Nonparametric Approaches To Data Chapter 18 Resampling and Nonparametric Approaches To Data 18.1 Inferences in children s story summaries (McConaughy, 1980): a. Analysis using Wilcoxon s rank-sum test: Younger Children Older Children

More information

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is Stat 501 Solutions and Comments on Exam 1 Spring 005-4 0-4 1. (a) (5 points) Y ~ N, -1-4 34 (b) (5 points) X (X,X ) = (5,8) ~ N ( 11.5, 0.9375 ) 3 1 (c) (10 points, for each part) (i), (ii), and (v) are

More information

Biostatistics 270 Kruskal-Wallis Test 1. Kruskal-Wallis Test

Biostatistics 270 Kruskal-Wallis Test 1. Kruskal-Wallis Test Biostatistics 270 Kruskal-Wallis Test 1 ORIGIN 1 Kruskal-Wallis Test The Kruskal-Wallis is a non-parametric analog to the One-Way ANOVA F-Test of means. It is useful when the k samples appear not to come

More information

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0 Analysis of variance (ANOVA) ANOVA Comparing the means of more than two groups Like a t-test, but can compare more than two groups Asks whether any of two or more means is different from any other. In

More information

Comparison of Two Samples

Comparison of Two Samples 2 Comparison of Two Samples 2.1 Introduction Problems of comparing two samples arise frequently in medicine, sociology, agriculture, engineering, and marketing. The data may have been generated by observation

More information

Physics 509: Non-Parametric Statistics and Correlation Testing

Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information

Solutions exercises of Chapter 7

Solutions exercises of Chapter 7 Solutions exercises of Chapter 7 Exercise 1 a. These are paired samples: each pair of half plates will have about the same level of corrosion, so the result of polishing by the two brands of polish are

More information

Independent Samples ANOVA

Independent Samples ANOVA Independent Samples ANOVA In this example students were randomly assigned to one of three mnemonics (techniques for improving memory) rehearsal (the control group; simply repeat the words), visual imagery

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Statistics: revision

Statistics: revision NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers

More information

STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test.

STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. Rebecca Barter March 30, 2015 Mann-Whitney Test Mann-Whitney Test Recall that the Mann-Whitney

More information

Chapter Seven: Multi-Sample Methods 1/52

Chapter Seven: Multi-Sample Methods 1/52 Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze

More information

H0: Tested by k-grp ANOVA

H0: Tested by k-grp ANOVA Analyses of K-Group Designs : Omnibus F, Pairwise Comparisons & Trend Analyses ANOVA for multiple condition designs Pairwise comparisons and RH Testing Alpha inflation & Correction LSD & HSD procedures

More information