Who's afraid of the effect size?


Procedia Economics and Finance 20 (2015) 665-669

7th International Conference on Globalization of Higher Education in Economics and Business Administration, GEBA 2013

Who's afraid of the effect size?

Ciprian Turturean*

Alexandru Ioan Cuza University of Iasi, Carol I Ave., 700505, Iasi, Romania

Abstract

The effect size is a relatively new topic of discussion (no more than 35 years old), especially in the field of psychology. It is quantified by a class of descriptive statistical indicators based on Cohen's d coefficient. The effect size brings additional information to the inferential decision to accept or reject the null hypothesis, which is why there is a wide discussion under the name Null Hypothesis Significance Testing (NHST). For this reason the American Psychological Association (APA) recommends, in chapter 1.01, Designing and Reporting Research, of the 5th edition of its manual, that all published statistical reports also include the effect size.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Faculty of Economics and Business Administration, Alexandru Ioan Cuza University of Iasi.

Keywords: effect size, indicators, comparing means, effect size association indexes, effect size coefficients

1. Introduction - What's the effect size?

Statistical tests comparing the central levels of two statistical distributions answer the question "Are there significant differences between the two treatments?", but they fail to give us information on the magnitude of the difference. What is more, if we work with two different pairs of samples, the estimated variances and central levels will most probably differ even if the sample sizes and the populations of origin are the same. The new questions that arise are: "How big is the magnitude of the differences between two treatments?", "How can we quantify the magnitude of the differences between two treatments so that it is comparable from one test to another?" and, therefore, "What is the effect size?"

* Corresponding author. Tel.: +4-075-197-401. E-mail address: ciprian.turturean@uaic.ro

"Effect Size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect. Unlike significance tests, these indices are independent of sample size. ES measures are the common currency of meta-analysis studies that summarize the findings from a specific area of research." (Lee A. Becker [1])

or

"Effect size is a quantitative reflection of the magnitude of some phenomenon that is used for the purpose of addressing a question of interest." (Kelley & Preacher [2])

Definitions of effect size abound in the literature, but many of them cannot capture the complexity of the effect size's dimensions. Therefore the effect size formulas take many forms, according to the nature of the analyzed phenomena. In essence, the discussion of the effect size started from a closer analysis of the t statistic for two independent samples drawn from two populations with the same variance:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \quad (1) $$

in which the underlying problem is the square-root term involving the sample sizes (n1 and n2). If n1 and n2 are sufficiently large, the calculated t value will most probably be larger than the theoretical t, which leads to rejecting the null hypothesis in most cases. Therefore it is necessary to calculate a descriptive value which does not depend on n1 and n2, but which reflects the magnitude of the difference between the two compared means. The corresponding effect size for mean differences tested by the t statistic of formula (1) is:

$$ ES = t\sqrt{\frac{n_1+n_2}{n_1 n_2}} = \frac{\bar{x}_1-\bar{x}_2}{s} \quad (2) $$

2. Effect size indicators

There is a wide diversity of indicators used to measure the effect size. Effect size (ES) indicators enable comparisons between the sizes of effects. The most common forms of expression of ES indicators are correlation coefficients and standardized mean differences. ES indicators can be classified:

1. by the number of compared groups: the difference between two groups; the difference between more than two groups.
2. by the measure used to quantify the ES: as a standardized difference between two means; as the correlation between the independent-variable classification and the individual scores on the dependent variable, called the ES correlation (Rosnow & Rosenthal [3]).

A. The most common ES coefficients used in practice for comparing two means (t test) are:

1. Cohen's d original coefficient (Cohen [4])

$$ d = \frac{\bar{x}_1-\bar{x}_2}{\sigma_{1/2}} \quad (3) $$

where x̄1 and x̄2 are the means of the two compared populations and σ_{1/2} is the standard deviation of either one of them, when the hypothesis of homoskedasticity is satisfied and the samples are large.

2. Cohen's d practical coefficient (Rosnow & Rosenthal [3])

$$ d = \frac{\bar{x}_1-\bar{x}_2}{\sigma_{pooled}} \quad (4) $$

Initially (Cohen [4]) σ_pooled was calculated as the mean of the two variances corresponding to the two compared groups, when the hypothesis of homoskedasticity is satisfied and the samples are large:

$$ \sigma_{pooled} = \sqrt{\frac{\sigma_1^2+\sigma_2^2}{2}} \quad (5) $$
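
To make the point of formulas (1)-(5) concrete, here is a minimal Python sketch (not part of the original paper; the simulated data and function names are my own) that computes the two-sample t statistic of formula (1) and Cohen's d with the pooled standard deviation of formula (5), estimated from the samples, for increasing sample sizes drawn from the same pair of populations:

```python
import math
import random

def cohen_d(x1, x2):
    """Cohen's d, formula (4), with the simple pooled SD of formula (5)."""
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    var1 = sum((v - m1) ** 2 for v in x1) / (len(x1) - 1)
    var2 = sum((v - m2) ** 2 for v in x2) / (len(x2) - 1)
    sd_pooled = math.sqrt((var1 + var2) / 2)        # formula (5)
    return (m1 - m2) / sd_pooled                    # formula (4)

def t_statistic(x1, x2):
    """Equal-variance two-sample t statistic, formula (1)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    var1 = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    var2 = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    s = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))

random.seed(1)
for n in (30, 300, 3000):
    treated = [random.gauss(0.5, 1.0) for _ in range(n)]   # true mean difference 0.5
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(t_statistic(treated, control), 2), round(cohen_d(treated, control), 2))
# t grows roughly with the square root of n, while d only fluctuates around
# the true standardized difference of 0.5 - the motivation behind formula (2).
```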

There are some differences between Cohen's d and Hedges' g coefficients. Firstly, Cohen's d is a descriptive measure, while Hedges' g is an inferential measure. Secondly, Cohen used the population parameter σ to express his coefficient, while Hedges used its unbiased estimator, s. In practice Hedges' g is of greater importance than Cohen's d. The relation between Cohen's d and Hedges' g is given by the formula

$$ d = g\sqrt{\frac{N}{df}} \quad (6) $$

where N is the aggregate size of the observed groups (e.g., n1 + n2) and df is the number of degrees of freedom of the pooled variance (e.g., n1 + n2 - 2). Hedges' g proposes to use, instead of the parameter σ_pooled, its estimate s_pooled (Hartung, Knapp & Sinha [5]) or, better, the unbiased estimate s'_pooled:

$$ s'_{pooled} = \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}} \quad (7) $$

where n1 and n2 are the sizes and s1 and s2 the unbiased standard deviations of the compared samples, when the hypothesis of homoskedasticity is satisfied.

3. Cohen's d and Hedges' g coefficients expressed in terms of t

The formula of Cohen's d coefficient in terms of t is (Rosenthal & Rosnow [6]):

$$ d = \frac{t\,(n_1+n_2)}{\sqrt{n_1+n_2-2}\,\sqrt{n_1 n_2}} \quad (8) $$

The formula of Hedges' g coefficient in terms of t was already presented in formula (2) (Rosenthal & Rosnow [6]):

$$ g = t\sqrt{\frac{n_1+n_2}{n_1 n_2}} \quad (9) $$

4. Cohen's d and Hedges' g coefficients expressed in terms of r, the ES correlation

$$ d = \frac{2r}{\sqrt{1-r^2}} \quad (10) $$

$$ g = \frac{2r\sqrt{n_1+n_2-2}}{\sqrt{(1-r^2)(n_1+n_2)}} \quad (11) $$

5. The ES correlation r is the point-biserial correlation between a dichotomous variable and an (at least) interval-scaled variable:

$$ r = \frac{\mathrm{cov}(indep, dichot)}{\sigma_{indep}\,\sigma_{dichot}} \quad (12) $$

r can also be expressed in terms of t:

$$ r = \frac{t}{\sqrt{t^2+df}} \quad (13) $$

where df = n1 + n2 - 2. In fact the sample correlation coefficient is a biased statistic (Fisher [7]). The unbiased estimate of the population correlation is given by the adjusted r (McGrath & Meyer [8]):

$$ r_{adj} = \sqrt{1-\frac{(1-r^2)(n-1)}{n-2}} $$

where n is the sample size.
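
Because formulas (6)-(13) are algebraic conversions among t, d, g and r, they can be checked numerically. The sketch below is illustrative only (the function names are my own, not the paper's) and assumes the equal-variance two-independent-samples setting used above:

```python
import math

def d_from_t(t, n1, n2):
    """Cohen's d from t, formula (8)."""
    return t * (n1 + n2) / (math.sqrt(n1 + n2 - 2) * math.sqrt(n1 * n2))

def g_from_t(t, n1, n2):
    """Hedges' g from t, formulas (2)/(9)."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def d_from_g(g, n1, n2):
    """Relation between d and g, formula (6), with df = n1 + n2 - 2."""
    return g * math.sqrt((n1 + n2) / (n1 + n2 - 2))

def d_from_r(r):
    """Cohen's d from the ES correlation, formula (10)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def r_from_t(t, n1, n2):
    """ES correlation from t, formula (13), with df = n1 + n2 - 2."""
    return t / math.sqrt(t ** 2 + n1 + n2 - 2)

def r_adjusted(r, n):
    """Bias-adjusted correlation (McGrath & Meyer [8])."""
    return math.sqrt(1 - (1 - r ** 2) * (n - 1) / (n - 2))

# Example: t = 2.5 observed for two groups of 30 observations each.
t, n1, n2 = 2.5, 30, 30
g = g_from_t(t, n1, n2)
r = r_from_t(t, n1, n2)
print(round(d_from_t(t, n1, n2), 3),
      round(d_from_g(g, n1, n2), 3),
      round(d_from_r(r), 3))        # the three routes to d agree (equal group sizes)
print(round(g, 3), round(r, 3), round(r_adjusted(r, n1 + n2), 3))
```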

B. The class of measures used for multiple comparisons (ANOVA) are named association indexes. Some common effect size formulas from this class are:

1. Eta-squared (η²) and its corrections, epsilon-squared (ε²) and omega-squared (ω²). The sample eta-squared, i.e. R-squared (R²) (Pearson [9]), is a biased estimator of the proportion of the explained variation (explained sum of squares) in the total variation (total sum of squares):

$$ \eta^2 = R^2 = \frac{ESS}{TSS} \quad (14) $$

$$ TSS = ESS + RSS \quad (15) $$

From relations (14) and (15), and based on the expression of the F statistic,

$$ F = \frac{MSE}{MSR} = \frac{ESS/\nu_E}{RSS/\nu_R} \quad (16) $$

where MSE and MSR are the explained and residual mean squares and ν_E and ν_R are the corresponding degrees of freedom, it is possible to obtain an expression of eta-squared in terms of F:

$$ \eta^2 = R^2 = \frac{F\,\nu_E}{F\,\nu_E + \nu_R} \quad (17) $$

For multiple comparisons, just like the sample correlation coefficient, R², the estimator of eta-squared, is a biased statistic (Fisher [7]). Two corrections of eta-squared have been suggested in the literature: epsilon-squared and omega-squared (Olejnik & Algina [10]).

Epsilon-squared (Kelley [11]) corrects the numerator of eta-squared by subtracting from the explained sum of squares the residual/error mean square multiplied by the explained degrees of freedom:

$$ \varepsilon^2 = \frac{ESS - \nu_E\,RME}{TSS} = 1-(1-R^2)\,\frac{N-1}{\nu_R} \quad (18) $$

where ν_R is the number of degrees of freedom of the residuals, RME is the residual/error mean square, and ESS and TSS are the explained and total sums of squares.

Omega-squared (Hays [12]) corrects epsilon-squared by adding the residual/error mean square to its denominator:

$$ \omega^2 = \frac{ESS - \nu_E\,RME}{TSS + RME} \quad (19) $$

The interpretations for this class of measures are presented in Table 1.

Table 1: Interpretation of d, r and r²/R²/ω̂² (Cohen [4]; Kotrlik & Williams [13]; Kirk [14])

  Cohen's standard   d                 r                   r²/R²/ω̂²
  Large ES           0.5 < d           0.243 < r           0.059 < r²/R²/ω̂²
  Medium ES          0.2 < d ≤ 0.5     0.1 < r ≤ 0.243     0.01 < r²/R²/ω̂² ≤ 0.059
  Small ES           d ≤ 0.2           r ≤ 0.1             r²/R²/ω̂² ≤ 0.01
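
A minimal sketch of formulas (14)-(19) for a one-way ANOVA follows (illustrative only; the group data are made up and the function name is my own). It also shows that the two corrections shrink the biased eta-squared estimate:

```python
def association_indexes(groups):
    """Return (eta2, epsilon2, omega2) for a list of samples, one per group."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Explained (between-groups) and residual (within-groups) sums of squares.
    ess = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    rss = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    tss = ess + rss                       # formula (15)

    nu_e = len(groups) - 1                # explained degrees of freedom
    nu_r = n_total - len(groups)          # residual degrees of freedom
    rme = rss / nu_r                      # residual/error mean square

    eta2 = ess / tss                               # formula (14)
    epsilon2 = (ess - nu_e * rme) / tss            # formula (18)
    omega2 = (ess - nu_e * rme) / (tss + rme)      # formula (19)
    return eta2, epsilon2, omega2

groups = [
    [4.1, 5.0, 3.8, 4.6, 5.2],
    [5.9, 6.3, 5.5, 6.1, 5.8],
    [4.9, 5.1, 5.4, 4.7, 5.3],
]
eta2, epsilon2, omega2 = association_indexes(groups)
print(round(eta2, 3), round(epsilon2, 3), round(omega2, 3))
# As expected, eta2 >= epsilon2 >= omega2: the corrections shrink the biased
# R-squared estimate, as discussed around formulas (18)-(19).
```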

3. In conclusion, who is afraid of the effect size?

The effect size measures and Null Hypothesis Significance Testing (NHST) aim at different goals. The link between the effect size and the NHST decision is relative and is given by the sample size. No matter how large or small the effect size is, with a sufficiently large or small sample we can make the values of the test statistics big or small enough to obtain the desired result of the test. The sample size relativizes the NHST. In conclusion, the effect size adds a new dimension to hypothesis testing. What to do now? The solution is to limit/standardize the sample size so that the NHST and the effect size do not become relative. Choosing the sample size is an important aspect of any statistical research. Cohen & Cohen [15], Kraemer & Thiemann [16], Cohen [4], Green [17] and many others have been concerned with this issue. We conclude this paper with a brief enunciation of a set of rules for choosing the sample size depending on the type of comparisons made (Popa [18]):

A. Testing the differences between means

1. The size of the sampling groups for testing differences between means should be at least 30 for each group. For example, if we have a between-subjects (BS) experimental design with 3x3 treatments, we must use a sample of at least 3x3x30 = 270 subjects. In the case of a within-subjects (WS) design with 3x3 treatments the sample size will be at least 30.
2. The minimum recommended group size when comparing a small number of groups is greater than when the comparison is made between several groups.

B. Testing the level of association between variables

1. The sample size for the study of a multiple correlation with k independent variables must be at least N = 50 + 8k. For example, when we study the correlation with 5 independent variables we must work with a sample of at least N = 50 + 8x5 = 90 subjects.
2. The sample size for studying a multiple regression with k independent variables must be at least N = 104 + k. Moreover, if we have to study a regression with more than 5 predictors we must ensure that there are at least 10 subjects per predictor or, even better, at least 30 subjects per predictor.
3. For the Chi-square test, increases in the sample size do not have a negative impact on the NHST, but it is still recommended that the sample size be at least 20 subjects and the size of each group at least 5 subjects.
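
These rules of thumb can be collected into a small helper; the sketch below is illustrative only (the function name and its interface are mine, not from Popa [18]):

```python
def min_sample_size(design, groups=1, predictors=0):
    """Rough minimum sample size for the designs discussed above.

    design: "between", "within", "correlation", "regression" or "chi2".
    groups: number of treatment cells (between/within designs).
    predictors: number of independent variables (correlation/regression).
    """
    if design == "between":          # rule A.1: 30 subjects per cell
        return 30 * groups
    if design == "within":           # rule A.1: 30 subjects in total
        return 30
    if design == "correlation":      # rule B.1: N = 50 + 8k
        return 50 + 8 * predictors
    if design == "regression":       # rule B.2: N = 104 + k, and >= 10 per predictor
        return max(104 + predictors, 10 * predictors)
    if design == "chi2":             # rule B.3: at least 20 subjects overall
        return 20
    raise ValueError(f"unknown design: {design}")

print(min_sample_size("between", groups=9))          # 3x3 BS design -> 270
print(min_sample_size("correlation", predictors=5))  # 50 + 8*5 -> 90
print(min_sample_size("regression", predictors=5))   # 104 + 5 -> 109
```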
References

[1] Becker, L. A. Effect size (ES), 2000, http://www.bwgriffin.com/gsu/courses/edur9131/content/effectsizebecker.p
[2] Kelley, K. & Preacher, K. J. On effect size, Psychological Methods, 2012, Vol. 17 (2), 137-152.
[3] Rosnow, R. L. & Rosenthal, R. Computing contrasts, effect sizes, and counternulls on other people's published data: General procedures for research consumers, Psychological Methods, 1996, Vol. 1 (4), 331-340.
[4] Cohen, J. Statistical power analysis for the behavioral sciences, 1988 (2nd Ed.), Hillsdale, NJ: Lawrence Erlbaum Associates.
[5] Hartung, J., Knapp, G. & Sinha, B. K. Statistical meta-analysis with applications, 2008, Hoboken, NJ: Wiley.
[6] Rosenthal, R. & Rosnow, R. L. Essentials of behavioral research: Methods and data analysis, 1991 (2nd Ed.), New York: McGraw-Hill.
[7] Fisher, R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population, Biometrika, 1915, Vol. 10 (4), 507-521.
[8] McGrath, R. E. & Meyer, G. J. When effect sizes disagree: The case of r and d, Psychological Methods, 2006, Vol. 11 (4), 386-401.
[9] Pearson, K. On the general theory of skew correlations and nonlinear regression, Mathematical contributions to the theory of evolution: XIV, Draper's Company Research Memoirs, Biometric Series II, London: Dulau.
[10] Olejnik, S. & Algina, J. Measures of effect size for comparative studies: Applications, interpretations, and limitations, Contemporary Educational Psychology, 2000, Vol. 25 (3), 241-286.
[11] Kelley, T. L. An unbiased correlation ratio measure, Proceedings of the National Academy of Sciences, Vol. 21, 554-559.
[12] Hays, W. L. Statistics for psychologists, New York: Holt, Rinehart & Winston, 1963.
[13] Kotrlik, J. W. & Williams, H. A. The incorporation of effect size in information technology, learning, and performance research, 2003, Vol. 21 (1).
[14] Kirk, R. E. Practical significance: A concept whose time has come, Educational and Psychological Measurement, 1996, Vol. 56 (5), 746-759.
[15] Cohen, J. & Cohen, P. Applied multiple regression/correlation analysis for the behavioral sciences, 1975, Hillsdale, NJ: Erlbaum.
[16] Kraemer, H. C. & Thiemann, S. How many subjects? Statistical power analysis in research, 1987, Newbury Park, CA: Sage.
[17] Green, S. B. How many subjects does it take to do a regression analysis?, Multivariate Behavioral Research, 1991, Vol. 26 (3), 499-510.
[18] Popa, M. Statistica pentru psihologie, 2008, Ed. Polirom, Iasi.