Who's afraid of the effect size?


Procedia Economics and Finance 20 (2015) 665-669

7th International Conference on Globalization of Higher Education in Economics and Business Administration, GEBA 2013

Who's afraid of the effect size?

Ciprian Turturean*

Alexandru Ioan Cuza University of Iasi, Carol I Ave., 700505, Iasi, Romania

Abstract

The effect size is a relatively new topic of discussion (no more than 35 years old), especially in the field of psychology. It is quantified by a class of descriptive statistical indicators based on Cohen's d coefficient. The effect size brings additional information to the inferential decision to accept or reject the null hypothesis, which is why there is a wide discussion under the name Null Hypothesis Significance Testing (NHST). For this reason the American Psychological Association (APA) recommends, in chapter 1.01, Designing and Reporting Research, of the 5th edition of its manual, that all published statistical reports also include the effect size.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Faculty of Economics and Business Administration, Alexandru Ioan Cuza University of Iasi.

Keywords: effect size, indicators, comparing means, effect size association indexes, effect size coefficients

1. Introduction - What's the effect size?

Statistical tests comparing the central levels of two statistical distributions answer the question "Are there significant differences between the two treatments?", but they fail to give us information on the magnitude of the difference. What is more, if we work with two different pairs of samples, the estimated variances and central levels will most probably differ even if the sample sizes and the populations of origin are the same. The new questions that arise are: "How big is the magnitude of the differences between two treatments?", "How can we quantify the magnitude of the differences between two treatments so that it is comparable from one test to another?" and, therefore, "What is the effect size?"

* Corresponding author. Tel.: +4-075-197-401. E-mail address: ciprian.turturean@uaic.ro

"Effect Size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect. Unlike significance tests, these indices are independent of sample size. ES measures are the common currency of meta-analysis studies that summarize the findings from a specific area of research." (Lee A. Becker [1])

or

"Effect size is a quantitative reflection of the magnitude of some phenomenon that is used for the purpose of addressing a question of interest." (Kelley & Preacher [2])

Definitions of effect size abound in the literature, but many of them cannot capture the complexity of the effect size's dimensions. Therefore the effect size formulas take many forms, according to the nature of the analyzed phenomena. In essence, the discussion of the effect size started from a closer analysis of the t statistic for two independent samples drawn from two populations with the same variance:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \quad (1) $$

in which the underlying problem is the square-root term involving the sample sizes (n1 and n2). If n1 and n2 are sufficiently large, the calculated t value will most probably be larger than the theoretical t, which leads to rejecting the null hypothesis in most cases. Therefore it is necessary to calculate a descriptive value which does not depend on n1 and n2, but which reflects the magnitude of the difference between the two compared means. The corresponding effect size for mean differences tested by the t statistic of formula (1) is:

$$ ES = t\sqrt{\frac{n_1+n_2}{n_1 n_2}} = \frac{\bar{x}_1-\bar{x}_2}{s} \quad (2) $$

2. Effect size indicators

There is a wide diversity of indicators used to measure the effect size. Effect size (ES) indicators enable comparisons between the sizes of effects. The most common forms of expression of ES indicators are correlation coefficients and standardized mean differences. ES indicators can be classified:

1. by the number of compared groups: the difference between two groups; the difference between more than two groups.
2. by the measure used to quantify the ES: as a standardized difference between two means; as the correlation between the independent-variable classification and the individual scores on the dependent variable, called the ES correlation (Rosnow & Rosenthal [3]).

A. The most common ES coefficients used in practice for comparing two means (t test) are:

1. Cohen's d original coefficient (Cohen [4])

$$ d = \frac{\bar{x}_1-\bar{x}_2}{\sigma_{1/2}} \quad (3) $$

where x̄1 and x̄2 are the means of the two compared populations and σ_{1/2} is the standard deviation of either one of them, when the hypothesis of homoskedasticity is satisfied and the samples are large.

2. Cohen's d practical coefficient (Rosnow & Rosenthal [3])

$$ d = \frac{\bar{x}_1-\bar{x}_2}{\sigma_{pooled}} \quad (4) $$

Initially (Cohen [4]) σ_pooled was calculated as the mean of the two variances corresponding to the two compared groups, when the hypothesis of homoskedasticity is satisfied and the samples are large:

$$ \sigma_{pooled} = \sqrt{\frac{\sigma_1^2+\sigma_2^2}{2}} \quad (5) $$
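
To make the point of formulas (1)-(5) concrete, here is a minimal Python sketch (not part of the original paper; the simulated data and function names are my own) that computes the two-sample t statistic of formula (1) and Cohen's d with the pooled standard deviation of formula (5), estimated from the samples, for increasing sample sizes drawn from the same pair of populations:

```python
import math
import random

def cohen_d(x1, x2):
    """Cohen's d, formula (4), with the simple pooled SD of formula (5)."""
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    var1 = sum((v - m1) ** 2 for v in x1) / (len(x1) - 1)
    var2 = sum((v - m2) ** 2 for v in x2) / (len(x2) - 1)
    sd_pooled = math.sqrt((var1 + var2) / 2)        # formula (5)
    return (m1 - m2) / sd_pooled                    # formula (4)

def t_statistic(x1, x2):
    """Equal-variance two-sample t statistic, formula (1)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    var1 = sum((v - m1) ** 2 for v in x1) / (n1 - 1)
    var2 = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    s = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))

random.seed(1)
for n in (30, 300, 3000):
    treated = [random.gauss(0.5, 1.0) for _ in range(n)]   # true mean difference 0.5
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(t_statistic(treated, control), 2), round(cohen_d(treated, control), 2))
# t grows roughly with the square root of n, while d only fluctuates around
# the true standardized difference of 0.5 - the motivation behind formula (2).
```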

There are some differences between Cohen's d and Hedges' g coefficients. Firstly, Cohen's d is a descriptive measure, while Hedges' g is an inferential measure. Secondly, Cohen used the population parameter σ to express his coefficient, while Hedges used its unbiased estimator, s. In practice Hedges' g is of greater importance than Cohen's d. The relation between Cohen's d and Hedges' g is given by the formula

$$ d = g\sqrt{\frac{N}{df}} \quad (6) $$

where N is the aggregate size of the observed groups (e.g., n1 + n2) and df is the number of degrees of freedom of the pooled variance (e.g., n1 + n2 - 2). Hedges' g proposes to use, instead of the parameter σ_pooled, its estimate s_pooled (Hartung, Knapp & Sinha [5]) or, better, the unbiased estimate s'_pooled:

$$ s'_{pooled} = \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}} \quad (7) $$

where n1 and n2 are the sizes and s1 and s2 the unbiased standard deviations of the compared samples, when the hypothesis of homoskedasticity is satisfied.

3. Cohen's d and Hedges' g coefficients expressed in terms of t

The formula of Cohen's d coefficient in terms of t is (Rosenthal & Rosnow [6]):

$$ d = \frac{t\,(n_1+n_2)}{\sqrt{n_1+n_2-2}\,\sqrt{n_1 n_2}} \quad (8) $$

The formula of Hedges' g coefficient in terms of t was already presented in formula (2) (Rosenthal & Rosnow [6]):

$$ g = t\sqrt{\frac{n_1+n_2}{n_1 n_2}} \quad (9) $$

4. Cohen's d and Hedges' g coefficients expressed in terms of r, the ES correlation

$$ d = \frac{2r}{\sqrt{1-r^2}} \quad (10) $$

$$ g = \frac{2r\sqrt{n_1+n_2-2}}{\sqrt{(1-r^2)(n_1+n_2)}} \quad (11) $$

5. The ES correlation r is the point-biserial correlation between a dichotomous variable and an (at least) interval-scaled variable:

$$ r = \frac{\mathrm{cov}(indep, dichot)}{\sigma_{indep}\,\sigma_{dichot}} \quad (12) $$

r can also be expressed in terms of t:

$$ r = \frac{t}{\sqrt{t^2+df}} \quad (13) $$

where df = n1 + n2 - 2. In fact the sample correlation coefficient is a biased statistic (Fisher [7]). The unbiased estimate of the population correlation is given by the adjusted r (McGrath & Meyer [8]):

$$ r_{adj} = \sqrt{1-\frac{(1-r^2)(n-1)}{n-2}} $$

where n is the sample size.
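
Because formulas (6)-(13) are algebraic conversions among t, d, g and r, they can be checked numerically. The sketch below is illustrative only (the function names are my own, not the paper's) and assumes the equal-variance two-independent-samples setting used above:

```python
import math

def d_from_t(t, n1, n2):
    """Cohen's d from t, formula (8)."""
    return t * (n1 + n2) / (math.sqrt(n1 + n2 - 2) * math.sqrt(n1 * n2))

def g_from_t(t, n1, n2):
    """Hedges' g from t, formulas (2)/(9)."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def d_from_g(g, n1, n2):
    """Relation between d and g, formula (6), with df = n1 + n2 - 2."""
    return g * math.sqrt((n1 + n2) / (n1 + n2 - 2))

def d_from_r(r):
    """Cohen's d from the ES correlation, formula (10)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def r_from_t(t, n1, n2):
    """ES correlation from t, formula (13), with df = n1 + n2 - 2."""
    return t / math.sqrt(t ** 2 + n1 + n2 - 2)

def r_adjusted(r, n):
    """Bias-adjusted correlation (McGrath & Meyer [8])."""
    return math.sqrt(1 - (1 - r ** 2) * (n - 1) / (n - 2))

# Example: t = 2.5 observed for two groups of 30 observations each.
t, n1, n2 = 2.5, 30, 30
g = g_from_t(t, n1, n2)
r = r_from_t(t, n1, n2)
print(round(d_from_t(t, n1, n2), 3),
      round(d_from_g(g, n1, n2), 3),
      round(d_from_r(r), 3))        # the three routes to d agree (equal group sizes)
print(round(g, 3), round(r, 3), round(r_adjusted(r, n1 + n2), 3))
```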

B. The class of measures used for multiple comparisons (ANOVA) are named association indexes. Some common effect size formulas from this class are:

1. Eta-squared (η²) and its corrections, epsilon-squared (ε²) and omega-squared (ω²). The sample eta-squared, i.e. R-squared (R²) (Pearson [9]), is a biased estimator of the proportion of the explained variation (explained sum of squares) in the total variation (total sum of squares):

$$ \eta^2 = R^2 = \frac{ESS}{TSS} \quad (14) $$

$$ TSS = ESS + RSS \quad (15) $$

From relations (14) and (15), and based on the expression of the F statistic,

$$ F = \frac{MSE}{MSR} = \frac{ESS/\nu_E}{RSS/\nu_R} \quad (16) $$

where MSE and MSR are the explained and residual mean squares and ν_E and ν_R are the corresponding degrees of freedom, it is possible to obtain an expression of eta-squared in terms of F:

$$ \eta^2 = R^2 = \frac{F\,\nu_E}{F\,\nu_E + \nu_R} \quad (17) $$

For multiple comparisons, just like the sample correlation coefficient, R², the estimator of eta-squared, is a biased statistic (Fisher [7]). Two corrections of eta-squared have been suggested in the literature: epsilon-squared and omega-squared (Olejnik & Algina [10]).

Epsilon-squared (Kelley [11]) corrects the numerator of eta-squared by subtracting from the explained sum of squares the residual/error mean square multiplied by the explained degrees of freedom:

$$ \varepsilon^2 = \frac{ESS - \nu_E\,RME}{TSS} = 1-(1-R^2)\,\frac{N-1}{\nu_R} \quad (18) $$

where ν_R is the number of degrees of freedom of the residuals, RME is the residual/error mean square, and ESS and TSS are the explained and total sums of squares.

Omega-squared (Hays [12]) corrects epsilon-squared by adding the residual/error mean square to its denominator:

$$ \omega^2 = \frac{ESS - \nu_E\,RME}{TSS + RME} \quad (19) $$

The interpretations for this class of measures are presented in Table 1.

Table 1: Interpretation of d, r and r²/R²/ω̂² (Cohen [4]; Kotrlik & Williams [13]; Kirk [14])

  Cohen's standard   d                 r                   r²/R²/ω̂²
  Large ES           0.5 < d           0.243 < r           0.059 < r²/R²/ω̂²
  Medium ES          0.2 < d ≤ 0.5     0.1 < r ≤ 0.243     0.01 < r²/R²/ω̂² ≤ 0.059
  Small ES           d ≤ 0.2           r ≤ 0.1             r²/R²/ω̂² ≤ 0.01
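
A minimal sketch of formulas (14)-(19) for a one-way ANOVA follows (illustrative only; the group data are made up and the function name is my own). It also shows that the two corrections shrink the biased eta-squared estimate:

```python
def association_indexes(groups):
    """Return (eta2, epsilon2, omega2) for a list of samples, one per group."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Explained (between-groups) and residual (within-groups) sums of squares.
    ess = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    rss = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    tss = ess + rss                       # formula (15)

    nu_e = len(groups) - 1                # explained degrees of freedom
    nu_r = n_total - len(groups)          # residual degrees of freedom
    rme = rss / nu_r                      # residual/error mean square

    eta2 = ess / tss                               # formula (14)
    epsilon2 = (ess - nu_e * rme) / tss            # formula (18)
    omega2 = (ess - nu_e * rme) / (tss + rme)      # formula (19)
    return eta2, epsilon2, omega2

groups = [
    [4.1, 5.0, 3.8, 4.6, 5.2],
    [5.9, 6.3, 5.5, 6.1, 5.8],
    [4.9, 5.1, 5.4, 4.7, 5.3],
]
eta2, epsilon2, omega2 = association_indexes(groups)
print(round(eta2, 3), round(epsilon2, 3), round(omega2, 3))
# As expected, eta2 >= epsilon2 >= omega2: the corrections shrink the biased
# R-squared estimate, as discussed around formulas (18)-(19).
```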

3. In conclusion, who is afraid of the effect size?

The effect size measures and Null Hypothesis Significance Testing (NHST) aim at different goals. The link between the effect size and the NHST decision is relative and is given by the sample size. No matter how large or small the effect size is, with a sufficiently large or small sample we can make the values of the test statistics big or small enough to obtain the desired result of the test. The sample size relativizes the NHST. In conclusion, the effect size adds a new dimension to hypothesis testing. What to do now? The solution is to limit/standardize the sample size so that the NHST and the effect size do not become relative. Choosing the sample size is an important aspect of any statistical research. Cohen & Cohen [15], Kraemer & Thiemann [16], Cohen [4], Green [17] and many others have been concerned with this issue. We conclude this paper with a brief enunciation of a set of rules for choosing the sample size depending on the type of comparisons made (Popa [18]):

A. Testing the differences between means

1. The size of the sampling groups for testing differences between means should be at least 30 for each group. For example, if we have a between-subjects (BS) experimental design with 3x3 treatments, we must use a sample of at least 3x3x30 = 270 subjects. In the case of a within-subjects (WS) design with 3x3 treatments the sample size will be at least 30.
2. The minimum recommended group size when comparing a small number of groups is greater than when the comparison is made between several groups.

B. Testing the level of association between variables

1. The sample size for the study of a multiple correlation with k independent variables must be at least N = 50 + 8k. For example, when we study the correlation with 5 independent variables we must work with a sample of at least N = 50 + 8x5 = 90 subjects.
2. The sample size for studying a multiple regression with k independent variables must be at least N = 104 + k. Moreover, if we have to study a regression with more than 5 predictors we must ensure that there are at least 10 subjects per predictor or, even better, at least 30 subjects per predictor.
3. For the Chi-square test, increases in the sample size do not have a negative impact on the NHST, but it is still recommended that the sample size be at least 20 subjects and the size of each group at least 5 subjects.
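
These rules of thumb can be collected into a small helper; the sketch below is illustrative only (the function name and its interface are mine, not from Popa [18]):

```python
def min_sample_size(design, groups=1, predictors=0):
    """Rough minimum sample size for the designs discussed above.

    design: "between", "within", "correlation", "regression" or "chi2".
    groups: number of treatment cells (between/within designs).
    predictors: number of independent variables (correlation/regression).
    """
    if design == "between":          # rule A.1: 30 subjects per cell
        return 30 * groups
    if design == "within":           # rule A.1: 30 subjects in total
        return 30
    if design == "correlation":      # rule B.1: N = 50 + 8k
        return 50 + 8 * predictors
    if design == "regression":       # rule B.2: N = 104 + k, and >= 10 per predictor
        return max(104 + predictors, 10 * predictors)
    if design == "chi2":             # rule B.3: at least 20 subjects overall
        return 20
    raise ValueError(f"unknown design: {design}")

print(min_sample_size("between", groups=9))          # 3x3 BS design -> 270
print(min_sample_size("correlation", predictors=5))  # 50 + 8*5 -> 90
print(min_sample_size("regression", predictors=5))   # 104 + 5 -> 109
```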
References

[1] Becker, L. A. Effect size (ES), 2000, http://www.bwgriffin.com/gsu/courses/edur9131/content/effectsizebecker.p
[2] Kelley, K. & Preacher, K. J. On effect size, Psychological Methods, 2012, Vol. 17 (2), 137-152.
[3] Rosnow, R. L. & Rosenthal, R. Computing contrasts, effect sizes, and counternulls on other people's published data: General procedures for research consumers, Psychological Methods, 1996, Vol. 1 (4), 331-340.
[4] Cohen, J. Statistical power analysis for the behavioral sciences, 1988 (2nd Ed.), Hillsdale, NJ: Lawrence Erlbaum Associates.
[5] Hartung, J., Knapp, G. & Sinha, B. K. Statistical meta-analysis with applications, 2008, Hoboken, NJ: Wiley.
[6] Rosenthal, R. & Rosnow, R. L. Essentials of behavioral research: Methods and data analysis, 1991 (2nd Ed.), New York: McGraw-Hill.
[7] Fisher, R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population, Biometrika, 1915, Vol. 10 (4), 507-521.
[8] McGrath, R. E. & Meyer, G. J. When effect sizes disagree: The case of r and d, Psychological Methods, 2006, Vol. 11 (4), 386-401.
[9] Pearson, K. On the general theory of skew correlations and nonlinear regression, Mathematical contributions to the theory of evolution: XIV, Draper's Company Research Memoirs, Biometric Series II, London: Dulau.
[10] Olejnik, S. & Algina, J. Measures of effect size for comparative studies: Applications, interpretations, and limitations, Contemporary Educational Psychology, 2000, Vol. 25 (3), 241-286.
[11] Kelley, T. L. An unbiased correlation ratio measure, Proceedings of the National Academy of Sciences, Vol. 21, 554-559.
[12] Hays, W. L. Statistics for psychologists, New York: Holt, Rinehart & Winston, 1963.
[13] Kotrlik, J. W. & Williams, H. A. The incorporation of effect size in information technology, learning, and performance research, 2003, Vol. 21 (1).
[14] Kirk, R. E. Practical significance: A concept whose time has come, Educational and Psychological Measurement, 1996, Vol. 56 (5), 746-759.
[15] Cohen, J. & Cohen, P. Applied multiple regression/correlation analysis for the behavioral sciences, 1975, Hillsdale, NJ: Erlbaum.
[16] Kraemer, H. C. & Thiemann, S. How many subjects? Statistical power analysis in research, 1987, Newbury Park, CA: Sage.
[17] Green, S. B. How many subjects does it take to do a regression analysis?, Multivariate Behavioral Research, 1991, Vol. 26 (3), 499-510.
[18] Popa, M. Statistica pentru psihologie, 2008, Ed. Polirom, Iasi.