Calculating Effect-Sizes. David B. Wilson, PhD George Mason University

The Heart and Soul of Meta-analysis: The Effect Size
- Meta-analysis shifts focus from statistical significance to the direction and magnitude of effect
- Key to this is the effect size
- It is the dependent variable of meta-analysis
- Encodes research findings on a numerical scale
- Different types of effect sizes for different research situations
- Each type may have multiple methods of computation

Overview
- Main types of effect sizes
- Logic of the standardized mean difference
- Methods of computing the standardized mean difference
- Logic of the odds-ratio and risk-ratio
- Methods of computing the odds-ratio
- Adjustments, such as for baseline differences
- Issues related to the variance estimate

Most Common Effect Size Indexes
- Standardized mean difference (d or g)
  - Group contrast (e.g., treatment versus control)
  - Inherently continuous outcome construct
- Odds-ratio and risk-ratio (OR and RR)
  - Group contrast (e.g., treatment versus control)
  - Inherently dichotomous (binary) outcome construct
- Correlation coefficient (r)
  - Two inherently continuous constructs

Necessary Characteristics of Effect Sizes
- Numerical values produced must be comparable across studies
- Must be able to compute its standard error
- Must not be a direct function of sample size

The Standardized Mean Difference
- Compares the means of two groups
- Standardizes this difference
$$ d = \frac{\bar{X}_1 - \bar{X}_2}{s_{pooled}} $$
$$ s_{pooled} = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} $$
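As a minimal sketch of the two formulas on this slide, the following computes the pooled standard deviation and then d. The function names and the example numbers are illustrative assumptions, not taken from the slides.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def smd(mean1, mean2, n1, s1, n2, s2):
    """Standardized mean difference: mean difference over pooled SD."""
    return (mean1 - mean2) / pooled_sd(n1, s1, n2, s2)

# Hypothetical study: treatment mean 105, control mean 100, SD 10, n = 25 each
d = smd(105.0, 100.0, 25, 10.0, 25, 10.0)
print(round(d, 3))
```

With equal standard deviations the pooled value equals the common SD, so the example reduces to a 5-point difference divided by 10.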

Visual Example of d-type Effect Size
[Figure: overlapping frequency distributions for the control and treatment groups, d = .50; x-axis: z score; y-axis: frequency]

Small Sample Size Bias Adjustment
- d is slightly upwardly biased when based on small samples. This can be fixed with the adjustment below:
$$ d' = \left[ 1 - \frac{3}{4N - 9} \right] d $$
- It is common practice to apply this adjustment whenever using d.
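A short sketch of the adjustment, assuming N is the total sample size (n1 + n2), which the slide does not state explicitly. The function name is hypothetical.

```python
def small_sample_adjust(d, n_total):
    """Apply the small-sample bias adjustment: d' = [1 - 3 / (4N - 9)] * d."""
    return (1 - 3 / (4 * n_total - 9)) * d

# Hypothetical example: d = 0.5 from a study with 50 participants in total
g = small_sample_adjust(0.5, 50)
print(round(g, 4))
```

The correction shrinks d slightly toward zero, and the shrinkage fades as N grows.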

Standard Error and Variance of d
The standard error (and variance) of d is mostly a function of sample size:
$$ se_d = \sqrt{\frac{n_1 + n_2}{n_1 n_2} + \frac{d^2}{2 (n_1 + n_2)}} $$
$$ v_d = se_d^2 $$
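The standard error formula can be sketched as follows. The function name and example values are assumptions for illustration.

```python
import math

def se_of_d(d, n1, n2):
    """Standard error of d from the effect size and the two group sizes."""
    return math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))

# Hypothetical example: d = 0.5 with 25 participants per group
se = se_of_d(0.5, 25, 25)
variance = se**2
print(round(se, 4), round(variance, 4))
```

Here the sample-size term contributes 0.08 and the d-dependent term only 0.0025, which illustrates why the variance is mostly a function of sample size.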

Methods of Computing d
- Lots of methods of computing d
- Goal is to reproduce what you would get with means, standard deviations, and sample sizes
- Some methods are straightforward

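Computing d from a t-test
Formula for d:
$$ d = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}} $$
Formula for t:
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}} $$
Therefore:
$$ d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}} $$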
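The conversion from an independent-groups t-value is a one-liner. This sketch uses a hypothetical function name and made-up inputs.

```python
import math

def d_from_t(t, n1, n2):
    """Convert an independent-samples t-value into d."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

# Hypothetical example: t = 2.5 with 25 participants per group
d = d_from_t(2.5, 25, 25)
print(round(d, 4))
```

The sign of d follows the sign of t, so it is worth checking which group the study subtracted from which.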

Computing d from a p-value from a t-test
- An exact p-value from a t-test can be converted into a t-value, assuming you have the degrees of freedom
- Then proceed as above
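One way to sketch the p-to-t step with only the Python standard library is to integrate the t density numerically and bisect for the t-value whose two-tailed area equals p. This is an illustrative approach, not the slides' method; in practice a statistics package's inverse t CDF would normally be used. It assumes a two-tailed p-value, and the direction of the effect has to come from the study because p carries no sign.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_from_p(p_two_tailed, df):
    """Positive t-value whose two-tailed p-value equals p_two_tailed."""
    lo, hi = 0.0, 1.0
    while upper_tail(hi, df) > p_two_tailed / 2:  # widen the bracket
        hi *= 2
    for _ in range(100):  # bisection
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > p_two_tailed / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: p = .05 (two-tailed), n1 = n2 = 20, so df = 38
n1, n2 = 20, 20
t = t_from_p(0.05, n1 + n2 - 2)
d = t * math.sqrt((n1 + n2) / (n1 * n2))  # then proceed as for a t-test
print(round(t, 3), round(d, 3))
```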

Computing d from dichotomous outcomes
- Some studies may report outcome dichotomously
- d can be estimated from these data
- Several methods
  - Logit
  - Cox logit
  - Probit
  - Arcsine

Logit method of computing d
- Logistic distribution similar to the normal distribution
- Logged odds-ratios conform to the logistic distribution
- Standard deviation of the logistic distribution is
$$ s_{logistic} = \frac{\pi}{\sqrt{3}} = 1.814 $$
- Thus, we can rescale a logged OR into d
$$ d = \frac{\ln(OR)}{1.814} $$
- Monte Carlo simulations suggest that this performs fairly well; some advantages to the Cox logit method (divide by 1.65)
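A brief sketch of both rescalings, using a hypothetical odds-ratio of 2.25. The function names are illustrative.

```python
import math

LOGISTIC_SD = math.pi / math.sqrt(3)  # about 1.814

def d_logit(odds_ratio):
    """Logit method: rescale the logged odds-ratio by the logistic SD."""
    return math.log(odds_ratio) / LOGISTIC_SD

def d_cox(odds_ratio):
    """Cox logit method: divide the logged odds-ratio by 1.65."""
    return math.log(odds_ratio) / 1.65

print(round(d_logit(2.25), 3), round(d_cox(2.25), 3))
```

Because 1.65 is smaller than 1.814, the Cox version always gives a d slightly further from zero for the same odds-ratio.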

More complicated estimates of d
- Can think of d as having two parts
  - Numerator: mean difference
  - Denominator: pooled standard deviation
- Trick is to get reasonable estimates of each
- Often have one but not the other

Estimates of numerator of d
- Covariate adjusted means
- Difference in gain score means
- Unstandardized regression coefficient for treatment dummy

Estimates of denominator of d
- Gain score standard deviation
$$ s_{pooled} = \frac{s_{gain}}{\sqrt{2 (1 - r)}} $$
$$ r = \frac{s_1^2 t^2 + s_2^2 t^2 - (\bar{X}_1 - \bar{X}_2)^2 n}{2 s_1 s_2 t^2} \quad (1) $$
- Overall standard deviation for the outcome
- Standard error
$$ s = se \sqrt{n - 1} $$
- One-way ANOVA
$$ s_{pooled} = \sqrt{MS_w} $$
$$ s_{pooled} = \sqrt{\frac{MS_B}{F}} $$
- Be careful: don't just use any standard deviation lying around
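The gain-score and ANOVA routes to the denominator can be sketched as below. The reading of equation (1) is an assumption: s1 and s2 are taken as the two standard deviations, the means as the corresponding means, n as the number of pairs, and t as a paired (dependent) t-test. All function names are hypothetical.

```python
import math

def r_from_paired_t(s1, s2, mean1, mean2, n, t):
    """Correlation implied by a paired t-test, per equation (1)."""
    return (s1**2 * t**2 + s2**2 * t**2 - (mean1 - mean2)**2 * n) / (
        2 * s1 * s2 * t**2)

def sd_from_gain(s_gain, r):
    """Raw-score SD recovered from the gain-score SD and correlation r."""
    return s_gain / math.sqrt(2 * (1 - r))

def sd_from_ms_within(ms_within):
    """Pooled SD from the within-groups mean square of a one-way ANOVA."""
    return math.sqrt(ms_within)

def sd_from_f(ms_between, f):
    """Pooled SD from the between-groups mean square and F (MS_w = MS_B / F)."""
    return math.sqrt(ms_between / f)

# Hypothetical paired data: SDs of 10, means 105 and 100, 25 pairs, t = 2.5
r = r_from_paired_t(10.0, 10.0, 105.0, 100.0, 25, 2.5)
print(r, sd_from_gain(10.0, r))
```

In this made-up case the implied correlation is 0.5, so the gain-score SD and the raw-score SD coincide; with a higher correlation the gain-score SD would understate the raw-score SD and inflate d if used directly.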

See Online Effect Size Calculator
- On CEBCP Website: http://gunston.gmu.edu/cebcp/effectsizecalculator/index.html
- On Campbell Website: http://www.campbellcollaboration.org/resources/effect_size_input.php

Logic of the Odds-Ratio and Risk-Ratio
- Odds-ratio is the ratio of odds
  - An odds is the probability of failure over the probability of success
- Risk-ratio is the ratio of risks
  - A risk is simply the probability of failure
- An odds-ratio or risk-ratio of 1 indicates no difference between the groups
- Odds-ratios are more difficult to interpret than risk-ratios
- Risk-ratios are sensitive to base event rate; odds-ratios are not

Logic of the Odds-Ratio and Risk-Ratio
- Odds-ratios and risk-ratios contrast two groups on a dichotomous outcome
- I.e., a 2 by 2 frequency table
                 Outcome
  Condition    Success   Failure
  Treatment       a         b
  Control         c         d
where a, b, c, and d are cell frequencies

Computing the Odds-Ratio and Risk-Ratio
Odds-ratio is computed as:
$$ OR = \frac{ad}{bc} $$
Risk-ratio is computed as:
$$ RR = \frac{a / (a + b)}{c / (c + d)} = \frac{p_1}{p_2} $$

Computing the Odds-Ratio and Risk-Ratio
The odds-ratio can also be computed from the proportion successful in each group (p). Odds-ratio is computed as:
$$ OR = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} $$
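A small sketch showing that the cell-frequency and proportion forms of the odds-ratio agree, using a hypothetical 2 by 2 table (a and b are the treatment successes and failures, c and d the control successes and failures). Function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds-ratio from the four cell frequencies."""
    return (a * d) / (b * c)

def risk_ratio(a, b, c, d):
    """Ratio of the two groups' proportions, p1 / p2."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio_from_props(p1, p2):
    """Odds-ratio from each group's proportion."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Hypothetical table: treatment 30 successes / 20 failures, control 20 / 30
a, b, c, d = 30, 20, 20, 30
print(odds_ratio(a, b, c, d), risk_ratio(a, b, c, d))
print(odds_ratio_from_props(a / (a + b), c / (c + d)))
```

A zero cell makes the odds-ratio undefined in this form; handling that case is outside this sketch.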

Computing the Standard Error and Variance of the Odds-Ratio and Risk-Ratio
Standard error of the logged odds-ratio:
$$ se_{\ln(OR)} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} $$
Standard error of the logged risk-ratio:
$$ se_{\ln(RR)} = \sqrt{\frac{1 - p_1}{n_1 p_1} + \frac{1 - p_2}{n_2 p_2}} $$
Inverse variance weight:
$$ w = \frac{1}{se^2} $$
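These standard errors and the inverse variance weight can be sketched as follows, with each group contributing its own proportion and sample size to the risk-ratio formula. The function names and the example table are assumptions.

```python
import math

def se_ln_or(a, b, c, d):
    """Standard error of the logged odds-ratio from the cell frequencies."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

def se_ln_rr(p1, n1, p2, n2):
    """Standard error of the logged risk-ratio from proportions and group sizes."""
    return math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))

def inverse_variance_weight(se):
    """Weight used when averaging effect sizes across studies."""
    return 1 / se**2

# Hypothetical table: treatment 30 / 20, control 20 / 30
se_or = se_ln_or(30, 20, 20, 30)
se_rr = se_ln_rr(0.6, 50, 0.4, 50)
w_or = inverse_variance_weight(se_or)
print(round(se_or, 4), round(se_rr, 4), round(w_or, 2))
```

Both standard errors are on the log scale, so the weights apply to ln(OR) and ln(RR), not to the raw ratios.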

Computing the Odds-Ratio from Other Data
- Only so many ways you can report binary data
- Makes computing odds-ratios and risk-ratios relatively easy
- Generally have
  - Full two-by-two table
  - Percent of events (successes/failures) in each group
  - Proportion of events (successes/failures) in each group
  - Actual odds-ratio or risk-ratio
  - Convert d-type effect size
  - Chi-square

Computing the Odds-Ratio from a χ²
- Odds-ratio can be computed from a χ² if you have the overall event rate for the sample (total number of successes or failures) and the sample size in each group
- Computations reproduce the two-by-two table
- See online calculator or Lipsey/Wilson Practical Meta-analysis Appendix B
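One way the table reconstruction might work is sketched below. It assumes an uncorrected Pearson χ² with one degree of freedom, for which χ² = N(ad − bc)² / (n1 n2 m1 m2), where m1 and m2 are the total successes and failures. This is an illustration, not necessarily the procedure in the online calculator or Appendix B, and the direction of the effect must be supplied because χ² is unsigned.

```python
import math

def table_from_chi2(chi2, n1, n2, total_success, treatment_better=True):
    """Rebuild cell frequencies (a, b, c, d) from a 1-df Pearson chi-square.

    Assumes no continuity correction. For a 2x2 table,
    a*d - b*c = a*N - n1*m1, which lets us solve for a.
    """
    n = n1 + n2
    total_failure = n - total_success
    gap = math.sqrt(chi2 * n1 * n2 * total_success * total_failure / n)
    sign = 1 if treatment_better else -1
    a = (n1 * total_success + sign * gap) / n  # treatment successes
    b = n1 - a                                 # treatment failures
    c = total_success - a                      # control successes
    d = n2 - c                                 # control failures
    return a, b, c, d

# Hypothetical report: chi-square = 4.0, 50 per group, 50 successes overall
a, b, c, d = table_from_chi2(4.0, 50, 50, 50)
odds_ratio = (a * d) / (b * c)
print(a, b, c, d, odds_ratio)
```

Rounded χ² values will generally give non-integer cells, which is fine for computing the odds-ratio.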

Adjustments to Effect Size
- Hunter and Schmidt psychometric meta-analysis
  - Measurement unreliability, invalidity
  - Range restriction
  - Artificial dichotomization
  - Hunter and Schmidt adjustments generally not done in a Campbell or Cochrane review
- Pre-test of outcome
  - For quasi-experimental designs, may remove some bias
  - Must still use post-test standard deviation in denominator
  - Adds additional error (variance estimate should be bigger)
  - Sensitivity analysis recommended

Issues Related to Variance Estimate
- Method of computing effect size may affect variance estimate
- For example:
  - d computed from dichotomous data
  - numerator of d is adjusted for baseline or other covariates
- Study data may involve clusters
- Possible solution: rescale standard error from correct model provided in study (if available)
$$ se_d = \frac{d}{z} $$
where z is either a z or t test associated with the treatment effect from a complex statistical model.
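The rescaling can be sketched in a couple of lines. Taking absolute values so that the standard error stays positive is an assumption added here, and the function name is hypothetical.

```python
def rescaled_se(d, z):
    """Standard error of d implied by the study's own test statistic."""
    return abs(d) / abs(z)

# Hypothetical example: d = 0.5, with z = 2.5 reported from a multilevel model
se = rescaled_se(0.5, 2.5)
weight = 1 / se**2
print(se, weight)
```

Because z comes from the model that handled clustering or covariates, the resulting standard error inherits that model's precision rather than the simple sample-size formula's.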

Questions?