Journal of Educational and Behavioral Statistics, 1983, Vol. 8, No. 2, pp. 93-101. DOI: 10.3102/10769986008002093

Journal of Educational Statistics, Summer 1983, Volume 8, Number 2, pp. 93-101

THEORY OF ESTIMATION AND TESTING OF EFFECT SIZES: USE IN META-ANALYSIS

HELENA CHMURA KRAEMER
Stanford University

Key words: Meta-analysis, Research Synthesis, Effect Sizes

ABSTRACT. Approximations to the distribution of a common form of effect size are presented. Single-sample tests, confidence interval formulation, tests of homogeneity, and pooling procedures are based on these approximations. Caveats are presented concerning statistical procedures as applied to the sample effect sizes commonly used in meta-analysis.

When policy decisions are based on research, such decisions are generally based not on one research study but on the consensus of many, and not on statistical significance alone but on an evaluation of practical significance (i.e., cost-effectiveness) as well. In recent years there has been growing emphasis on systematic syntheses of the results of research studies, meta-analysis (Glass, McGaw, & Smith, 1981), based on the use of quantitative measures indicating practical as well as statistical significance: effect sizes. The application of such methods has, in many ways, preceded the development of the sound mathematical theory necessary to support it. Consequently, there is much controversy and doubt about the validity of results based on meta-analysis.

The focus here is on the most frequently used sample effect size,

d = (x̄_E − x̄_C)/S,

where x̄_E is the mean response of n_E subjects in an experimental group, x̄_C is that of n_C subjects in a separate control group, and S² is a sample variance. Generally, it is assumed that

x_Ei ~ N(μ_E, σ²), i = 1, 2, ..., n_E,
x_Ci ~ N(μ_C, σ²), i = 1, 2, ..., n_C.

Further, S² is an estimate of σ², independent of x̄_E and x̄_C, with νS²/σ² distributed as chi-square with ν degrees of freedom. Under these assumptions (Winer, 1971),

[Np(1 − p)]^{1/2} d ~ t′_ν([Np(1 − p)]^{1/2} δ),

where N = n_E + n_C is the total sample size; p = n_E/N reflects balance; δ = (μ_E − μ_C)/σ is the population effect size; and t′_ν(λ) represents a noncentral t distribution with ν degrees of freedom and noncentrality parameter λ. If S² is the pooled within-group variance, ν = N − 2. If S² is based on the control group alone, ν = n_C − 1.
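
As a quick numerical check of this distributional result (an illustration added here, not part of the original article), the following Python sketch simulates two normal groups, computes d with the pooled within-group variance, and compares quantiles of [Np(1 − p)]^{1/2} d with those of the stated noncentral t distribution. The group sizes, δ, and seed are arbitrary example choices.

```python
# Simulation sketch (not from the paper): checks that [N p (1 - p)]^(1/2) d
# behaves like a noncentral t variate with nu = N - 2 degrees of freedom and
# noncentrality [N p (1 - p)]^(1/2) delta when S^2 is the pooled variance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_e, n_c, delta, reps = 15, 10, 0.5, 100_000
N, p = n_e + n_c, n_e / (n_e + n_c)
nu = N - 2
scale = np.sqrt(N * p * (1 - p))                  # [N p (1 - p)]^(1/2)

x_e = rng.normal(delta, 1.0, size=(reps, n_e))    # experimental group, sigma = 1
x_c = rng.normal(0.0, 1.0, size=(reps, n_c))      # control group
s2 = ((n_e - 1) * x_e.var(axis=1, ddof=1) +
      (n_c - 1) * x_c.var(axis=1, ddof=1)) / nu   # pooled within-group variance
d = (x_e.mean(axis=1) - x_c.mean(axis=1)) / np.sqrt(s2)

# Simulated versus theoretical quantiles of scale * d.
probs = [0.05, 0.25, 0.5, 0.75, 0.95]
print(np.round(np.quantile(scale * d, probs), 3))
print(np.round(stats.nct.ppf(probs, nu, scale * delta), 3))  # should agree closely
```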

The mathematical form of the noncentral t distribution is known and has been tabled (Resnikoff & Lieberman, 1957). Use of either the exact form or the tables as a basis for statistical analysis of effect sizes is extremely difficult. However, when n_E and n_C both exceed about 10, accurate approximations to the noncentral t distribution are available and provide an accessible approach to applications. When N is small, when group sizes are disparate, or when S² is injudiciously chosen, the distribution theory is highly sensitive to departures from the assumptions (Scheffé, 1959), interpretation of δ becomes problematic (Kraemer & Andrews, 1982), and the approximation procedures are of doubtful accuracy. For all these reasons, attention is here restricted to the situation in which N > 20, .4 ≤ p ≤ .6, and S² is the pooled within-group variance (ν = N − 2).

To test the null hypothesis H_0: δ = 0 versus A: δ > 0, one would reject H_0 at the α level of significance if

[Np(1 − p)]^{1/2} d > t_{ν,α},

where t_{ν,α} is the upper α-level critical value of the t distribution with ν degrees of freedom. Since computation of an effect size is usually undertaken when one has an a priori reason to believe δ > 0, this particular test is of minimal importance. More useful are procedures, in any single study, to:

1. test the null hypothesis H_0: δ ≤ δ_0 versus A: δ > δ_0, and compute the power of this test;
2. compute a confidence interval for δ;

or, in meta-analysis, to:

3. test the homogeneity of δ_1, δ_2, ..., δ_m;
4. obtain a pooled estimate of δ and compute a confidence interval for δ on the basis of a set of sample effect sizes d_1, d_2, ..., d_m; and
5. examine statistically those factors producing heterogeneity of effect sizes.

Approximation to the Distribution of Sample Effect Size

One approximation with little mathematical justification (cf. Hedges, 1982a) is

[Np(1 − p)]^{1/2}(d − δ) ~ N(0, 1).
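
The shortcomings of this crude normal approximation are detailed next. As a preliminary, here is a small numerical comparison (added for illustration; the values of N, δ, and the cutoff D are arbitrary) of its tail probability against the exact noncentral t probability for a modest balanced study.

```python
# Illustrative check (not from the paper): Pr{d >= D} under the exact
# noncentral t result versus the crude approximation
# [Np(1-p)]^(1/2) (d - delta) ~ N(0, 1).  N = 20, delta = 0.5, D = 1.0 are examples.
import math
from scipy import stats

n_e = n_c = 10
N, p = n_e + n_c, 0.5
nu = N - 2
scale = math.sqrt(N * p * (1 - p))                    # [N p (1 - p)]^(1/2)
delta, D = 0.5, 1.0

exact = stats.nct.sf(scale * D, nu, scale * delta)    # exact Pr{d >= D}
crude = stats.norm.sf(scale * (D - delta))            # normal approximation
print(f"exact {exact:.4f}   normal approximation {crude:.4f}")
```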

Yet this approximation underlies many of the statistical procedures commonly used in meta-analysis (e.g., Hsu, 1980). It should be noted that:

1. E(d) = (ν/2)^{1/2} [Γ((ν − 1)/2)/Γ(ν/2)] δ. Hence d always overestimates δ (Hedges, 1981).

2. For large sample sizes, var(d) ≈ 1/[Np(1 − p)] + δ²/(2ν) (Hedges, 1981). Thus, not only does the approximation consistently underestimate the variance, but the variance is not independent of δ. For this reason, applications to sample effect sizes of test procedures that assume homoscedasticity, such as t tests, analysis of variance, or linear regression, are of questionable validity.

3. The distribution of d is both skewed and heavy tailed, and is here approximated by a normal distribution, which is neither.

Better approximations are the Johnson-Welch procedure (1940) and the Kraemer-Paik procedure (1979). The Johnson-Welch procedure describes a normal approximation to the noncentral t distribution which, applied to effect size, yields

Pr{d ≥ D} ≈ Pr{z ≥ [Np(1 − p)]^{1/2}(D − δ)/[1 + D²/(2f)]^{1/2}},

where f = ν/[Np(1 − p)] (f ≈ 4). This procedure justifies tests such as those proposed by Hedges (1982a, 1982b, 1982c). More accurate for small noncentrality parameters, and more useful in this context, is the Kraemer-Paik procedure (1979). Applied to effect sizes, this procedure indicates that if

r = d/(d² + f)^{1/2} and ρ = δ/(δ² + f)^{1/2},

then u(r, ρ) = (r − ρ)/(1 − rρ) is approximately distributed according to the null distribution of the product moment correlation coefficient; that is,

ν^{1/2} u/(1 − u²)^{1/2} ~ t_ν.
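
Both of these better approximations are easy to evaluate numerically. The sketch below (added here; helper names and the example δ and D are illustrative) computes Pr{d ≥ D} by the Johnson-Welch formula just given and by the Kraemer-Paik route through r, ρ, and u(r, ρ), and compares them with the exact noncentral t probability.

```python
# Sketch (added): Pr{d >= D} via the Johnson-Welch and Kraemer-Paik
# approximations versus the exact noncentral t probability.
# Example: balanced study with N = 20, delta = 0.5, D = 1.0.
import math
from scipy import stats

n_e = n_c = 10
N, p = n_e + n_c, 0.5
nu = N - 2
npq = N * p * (1 - p)
f = nu / npq                      # about 4 in the balanced, pooled-variance case
delta, D = 0.5, 1.0

exact = stats.nct.sf(math.sqrt(npq) * D, nu, math.sqrt(npq) * delta)

# Johnson-Welch normal approximation.
jw = stats.norm.sf(math.sqrt(npq) * (D - delta) / math.sqrt(1 + D * D / (2 * f)))

# Kraemer-Paik: map d and delta to the correlation scale, form u(r, rho),
# and use nu^(1/2) u / (1 - u^2)^(1/2) ~ t_nu.
R = D / math.sqrt(D * D + f)
rho = delta / math.sqrt(delta * delta + f)
u = (R - rho) / (1 - R * rho)
kp = stats.t.sf(math.sqrt(nu) * u / math.sqrt(1 - u * u), nu)

print(f"exact {exact:.4f}   Johnson-Welch {jw:.4f}   Kraemer-Paik {kp:.4f}")
```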

Thus percentile points of u(r, ρ), say C_{ν,P}, where

Pr{u(r, ρ) > C_{ν,P}} = P,

are tabled (Fisher & Yates, 1957). Alternatively, since

C_{ν,P} = t_{ν,P}/(t²_{ν,P} + ν)^{1/2},

percentile points of u(r, ρ) may be computed from tables of the t distribution. Furthermore, since u(r, ρ) is distributed as is the product moment correlation coefficient, Fisher's z transformation, that is,

z(u) = (1/2) ln[(1 + u)/(1 − u)] = tanh⁻¹(u),

is both a variance-stabilizing and a normalizing transformation. Since u(r, ρ) = (r − ρ)/(1 − rρ), it follows that z(u) = z(r) − z(ρ), and

z(r) ~ N(z(ρ), (ν − 1)⁻¹).

Computing percentile points is somewhat less accurate and more tedious using this transformation than using C_{ν,P} directly, but there are many other applications in which having a variance independent of the mean is crucial.

Single Sample Tests

Consider the null hypothesis H_0: δ ≤ δ_0 versus A: δ > δ_0. One would reject the null hypothesis at the α level of significance if

u(r, ρ_0) ≥ C_{ν,α}, where ρ_0 = δ_0/(δ_0² + f)^{1/2};

that is, if

r ≥ (C_{ν,α} + ρ_0)/(1 + C_{ν,α} ρ_0),

or, in terms of d,

d ≥ f^{1/2}(C_{ν,α} + ρ_0)/[(1 − C²_{ν,α})(1 − ρ_0²)]^{1/2}.

The power of this test at δ_1 > δ_0 can easily be estimated, for

Power(δ_1) = Pr{r ≥ (C_{ν,α} + ρ_0)/(1 + C_{ν,α} ρ_0) | ρ = ρ_1}
= Pr{u(r, ρ_1) ≥ u[(C_{ν,α} + ρ_0)/(1 + C_{ν,α} ρ_0), ρ_1] | ρ = ρ_1}
= Pr{u(r, ρ_1) ≥ (C_{ν,α} − Δ)/(1 − C_{ν,α} Δ) | ρ = ρ_1},

where Δ = (ρ_1 − ρ_0)/(1 − ρ_1 ρ_0). Thus, since ν^{1/2} u(r, ρ_1)/[1 − u²(r, ρ_1)]^{1/2} ~ t_ν,

Power(δ_1) = Pr{t_ν ≥ ν^{1/2}(C_{ν,α} − Δ)/[(1 − C²_{ν,α})(1 − Δ²)]^{1/2}}.
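
The rejection rule and the power formula above translate directly into code. The following sketch (added for illustration; the function names and the example δ_0, δ_1, and N are ours) computes the critical value of d and the approximate power, taking f = 4 as in the balanced, pooled-variance case.

```python
# Sketch (added) of the one-sided test of H0: delta <= delta0 and its power,
# Power(delta1) = Pr{ t_nu >= nu^(1/2) (C - Delta) / [(1 - C^2)(1 - Delta^2)]^(1/2) },
# with Delta = (rho1 - rho0) / (1 - rho1 rho0) and f taken as 4.
import math
from scipy import stats

def rho_of(delta: float, f: float = 4.0) -> float:
    return delta / math.sqrt(delta * delta + f)

def critical_d(nu: int, alpha: float, delta0: float, f: float = 4.0) -> float:
    """Smallest d that rejects H0: delta <= delta0 at level alpha."""
    t = stats.t.ppf(1 - alpha, nu)
    C = t / math.sqrt(t * t + nu)                 # C_{nu,alpha}
    rho0 = rho_of(delta0, f)
    return math.sqrt(f) * (C + rho0) / math.sqrt((1 - C * C) * (1 - rho0 * rho0))

def power(nu: int, alpha: float, delta0: float, delta1: float, f: float = 4.0) -> float:
    t = stats.t.ppf(1 - alpha, nu)
    C = t / math.sqrt(t * t + nu)
    rho0, rho1 = rho_of(delta0, f), rho_of(delta1, f)
    Delta = (rho1 - rho0) / (1 - rho1 * rho0)
    arg = math.sqrt(nu) * (C - Delta) / math.sqrt((1 - C * C) * (1 - Delta * Delta))
    return stats.t.sf(arg, nu)

print(round(critical_d(18, 0.05, 0.0), 3))   # rejection point for delta0 = 0, N = 20
print(round(power(18, 0.05, 0.0, 0.8), 3))   # power to detect delta1 = .8 (Delta = .371)
```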

Table I presents values of Δ as a function of ν and P. One notes that to detect a separation between effect sizes yielding Δ = .4, one needs approximately a sample size of 60 for 95 percent power, about 50 for 90 percent power, about 35 for 80 percent power, and so forth. For a sample size of 20 (ν = 18) one has less than an even chance of detecting such a separation. The import of this observation is clear only if one realizes how the metric Δ reflects separation between effect sizes. In Table II are presented values of Δ for paired values of δ_0 < δ_1.

TABLE I
Table of Δ to Achieve a Power of P with ν Degrees of Freedom (N = ν + 2)

  P      ν = 18    20      25      30      40      60     100
  50%     .378    .360    .323    .296    .258    .211    .150
  60%     .429    .409    .368    .338    .295    .242    .172
  70%     .481    .459    .415    .381    .333    .275    .196
  80%     .537    .514    .467    .430    .378    .313    .244
  90%     .609    .584    .534    .495    .436    .363    .262
  95%     .662    .637    .585    .544    .483    .404    .292

TABLE II
Δ as a Function of δ_0, δ_1 (f = 4)

  δ_0 \ δ_1   0     .2     .4     .6     .8    1.0    1.2    1.4    1.6    1.8    2.0
   0          0   .100   .196   .287   .371   .447   .514   .573   .625   .669   .707
   .2               0    .099   .193   .282   .364   .437   .503   .560   .610   .654
   .4                      0    .097   .189   .275   .354   .425   .488   .544   .593
   .6                             0    .094   .183   .267   .343   .411   .472   .527
   .8                                    0    .091   .177   .257   .330   .396   .455
  1.0                                           0    .087   .170   .246   .316   .380
  1.2                                                  0    .084   .162   .236   .303
  1.4                                                         0    .080   .155   .225
  1.6                                                                0    .076   .148
  1.8                                                                       0    .072
  2.0                                                                              0
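
The entries of Table II follow directly from the definition of Δ. A small sketch (added here; f = 4 as assumed throughout) reproduces a couple of them:

```python
# Sketch (added): entries of Table II from Delta = (rho1 - rho0)/(1 - rho1 rho0),
# with rho_i = delta_i / (delta_i^2 + 4)^(1/2).
import math

def Delta(delta0: float, delta1: float, f: float = 4.0) -> float:
    rho0 = delta0 / math.sqrt(delta0 ** 2 + f)
    rho1 = delta1 / math.sqrt(delta1 ** 2 + f)
    return (rho1 - rho0) / (1 - rho1 * rho0)

print(round(Delta(0.0, 0.8), 3))   # .371, the null-versus-large-effect entry discussed below
print(round(Delta(0.4, 1.0), 3))   # .275, the entry for delta0 = .4, delta1 = 1.0
```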

A separation between effect sizes δ_0 = 0 and δ_1 = .8, that is, between a null effect size and one quite large compared with those reported in the literature, yields Δ = .371. Consequently, with a sample size of 20, one would have less than an even chance of discriminating no effect from a large effect. Once again, from yet another viewpoint, there is serious difficulty in using effect sizes when sample sizes are small.

Confidence Interval for δ

Since u(r, ρ) has a distribution independent of ρ, one- or two-sided confidence intervals for ρ, and hence for

δ = f^{1/2} ρ/(1 − ρ²)^{1/2},

are readily obtained. For example, with r = d/(d² + f)^{1/2},

Pr{u(r, ρ) ≤ C_{ν,α}} ≈ 1 − α.

Thus

Pr{ρ ≥ (r − C)/(1 − rC)} ≈ 1 − α, where C = C_{ν,α},

or

Pr{δ ≥ f^{1/2}(r − C)/[(1 − r²)(1 − C²)]^{1/2}} ≈ 1 − α.

It can readily be verified that for small N, confidence intervals for δ will be very wide. For example, if one observed d = 1.0 (r = .45) for N = 20 (ν = 18, C_{ν,.05} = .38), the one-tailed 95 percent confidence interval for δ will be

δ ≥ 2(.45 − .38)/(.80 × .86)^{1/2} = .17.
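
The one-sided lower bound above is easy to compute in general. A minimal sketch (added; the function name is ours) reproduces the d = 1.0, N = 20 example:

```python
# Sketch (added): one-sided lower confidence bound for delta,
# delta >= f^(1/2) (r - C) / [(1 - r^2)(1 - C^2)]^(1/2), with C = C_{nu,alpha}.
import math
from scipy import stats

def lower_bound(d: float, N: int, alpha: float = 0.05, f: float = 4.0) -> float:
    nu = N - 2
    r = d / math.sqrt(d * d + f)
    t = stats.t.ppf(1 - alpha, nu)
    C = t / math.sqrt(t * t + nu)                 # C_{nu,alpha}
    return math.sqrt(f) * (r - C) / math.sqrt((1 - r * r) * (1 - C * C))

print(round(lower_bound(1.0, 20), 2))   # about .17: a very wide one-sided 95% interval
```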

A Test of Homogeneity/Pooling Effect Sizes

If d_1, d_2, ..., d_m are independent sample effect sizes (i.e., based on different samples), one might wish to test whether they all estimate the same effect size, that is, H_0: δ_1 = δ_2 = ··· = δ_m (cf. Hedges, 1982a, 1982b). Because it is assumed that f ≈ 4 for all included effect sizes, this null hypothesis is equivalent to H_0: ρ_1 = ρ_2 = ··· = ρ_m, where ρ_i = δ_i/(δ_i² + 4)^{1/2}. This, then, becomes a test of homogeneity of correlation coefficients. One estimates the common ρ, under the null hypothesis, by ρ̂, where

z(ρ̂) = Σ_i (ν_i − 1) z(r_i) / Σ_i (ν_i − 1).

The test statistic is

X² = Σ_i (ν_i − 1) [z(r_i) − z(ρ̂)]²,

which, under H_0, has approximately a chi-square distribution with (m − 1) degrees of freedom (Kraemer, 1975, 1979). Furthermore, the pooled estimate of δ is then δ̂, where

δ̂ = 2ρ̂/(1 − ρ̂²)^{1/2}.

Then z(ρ̂) ~ N(z(ρ̄), 1/Σ_i (ν_i − 1)), where ρ̄ = Σ_i ρ_i/m. On this basis, tests or confidence intervals for ρ, and hence for δ, are readily formulated. Note that δ̂ is not a weighted average of d_1, d_2, ..., d_m and that δ̂ itself is not normally distributed. These are important points to consider in meta-analysis, because pooling effect sizes is usually implemented by using a weighted average of d_1, ..., d_m, say

d_ω = Σ_i ω_i d_i / Σ_i ω_i,

where the ω_i are positive weights based on sample sizes. This statistic is asymptotically normally distributed if the sample size underlying each d_i is large or if m, the number of studies, is large, but the number of subjects per study and the number of different research studies are rarely large enough to warrant using asymptotic theory. If all the studies are small, d_ω is biased, even asymptotically as the number of studies increases. However, as Hedges (1981) points out, one might replace each d_i by an unbiased estimate of δ. Even then, the variance of d_i is not independent of the unknown δ. Obtaining valid confidence intervals or tests based on d_ω under these circumstances is indeed a problem, particularly when sample sizes are small.
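
A compact sketch (added here) of the homogeneity test and pooled estimate described above, working on the Fisher-z scale with weights (ν_i − 1); the effect sizes and degrees of freedom in the example call are made up for illustration.

```python
# Sketch (added): chi-square homogeneity test of effect sizes and pooled
# estimate delta-hat via r_i = d_i / (d_i^2 + 4)^(1/2) on the Fisher-z scale.
import math
from scipy import stats

def homogeneity_and_pool(d, nu):
    """d: sample effect sizes; nu: corresponding degrees of freedom (N_i - 2)."""
    r = [di / math.sqrt(di * di + 4.0) for di in d]
    z = [math.atanh(ri) for ri in r]
    w = [nui - 1.0 for nui in nu]
    z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)   # z(rho-hat)
    x2 = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))
    p_value = stats.chi2.sf(x2, len(d) - 1)                  # chi-square, m - 1 df
    rho_hat = math.tanh(z_bar)
    delta_hat = 2.0 * rho_hat / math.sqrt(1.0 - rho_hat ** 2)
    return x2, p_value, delta_hat

x2, p, delta_hat = homogeneity_and_pool([0.3, 0.5, 0.9, 0.4], [28, 18, 38, 23])
print(round(x2, 2), round(p, 3), round(delta_hat, 3))
```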

General Applications

There are many other statistical questions related to the use of sample effect sizes in meta-analysis. One might compile effect sizes for each of g interventions and wish to compare these using t tests or analysis of variance. One might examine characteristics of the studies yielding different effect sizes (size of class, intensity or duration of intervention, length of follow-up, etc.) and wish to assess the influence of such factors on effect size using multiple linear regression. All such evaluations are questionable when applied to sample effect sizes directly (cf. Hedges, 1982c). The above statistical considerations, however, suggest a strategy for implementing such procedures, namely:

(1) Studies with group size less than 10, or which are seriously unbalanced (p < .4 or p > .6), should be set aside. Such studies may be valid for purposes of testing, but estimation of effect sizes from such studies is, for the reasons detailed above, problematic.

(2) Only one effect size per study can be used, to ensure independence.

(3) All remaining effect sizes are transformed as follows:

r_i = d_i/(d_i² + 4)^{1/2}, z_i = z(r_i).

Analytic procedures are then applied not to d_i but to z_i. The effect of this transformation is to attenuate size. When d_i is small, there is little change; that is, z_i ≈ d_i. When d_i = .80, for example, z_i = .78. When d_i = 2.0, z_i = 1.8. When d_i = 10.0, z_i = 4.63. The sample effect size d_i has a skewed distribution with heavy tails; z_i has approximately a normal distribution with mean z(ρ), where ρ = δ/(δ² + 4)^{1/2}, and variance equal to (ν − 1)⁻¹.

Hedges and Olkin (1981) suggest a somewhat different variance-stabilizing transformation, one essentially based on the Johnson-Welch approximation to the distribution of d. They suggest (in the balanced case) using

z_H(d) = 2^{1/2} sinh⁻¹(d/(2·2^{1/2})), with z_H(d) ~ N(z_H(δ), 1/N).

Here we suggest using

z_K(d) = tanh⁻¹(d/(d² + 4)^{1/2}), with z_K(d) ~ N(z_K(δ), 1/(N − 3)).

In Table III are presented values of z_K and z_H for effect sizes d = 0 to 2.0. For the typical range of effect sizes, only for relatively small sample sizes will results based on the two approaches differ. In any one study, because effect sizes are generally small, use of either transformation rather than d will make little difference.

TABLE III
Comparison of Two Variance-Stabilizing Transformations for d: z_K, z_H

   d      z_K     z_H
   0      0       0
   .2     .100    .100
   .4     .199    .199
   .6     .296    .298
   .8     .390    .395
  1.0     .481    .490
  1.2     .569    .583
  1.4     .653    .674
  1.6     .733    .763
  1.8     .809    .848
  2.0     .881    .931
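
Both transformations in Table III are simple to compute. A short sketch (added here) of the two, which can be checked against the tabled values:

```python
# Sketch (added): the two variance-stabilizing transformations of Table III,
# z_K(d) = atanh(d / (d^2 + 4)^(1/2)) and z_H(d) = 2^(1/2) asinh(d / (2 * 2^(1/2))).
import math

def z_k(d: float) -> float:
    return math.atanh(d / math.sqrt(d * d + 4.0))

def z_h(d: float) -> float:
    return math.sqrt(2.0) * math.asinh(d / (2.0 * math.sqrt(2.0)))

for d in [0.2, 0.8, 1.4, 2.0]:
    print(f"d = {d:.1f}   z_K = {z_k(d):.3f}   z_H = {z_h(d):.3f}")
```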

However, the relatively small errors incurred by using d rather than z_K or z_H, because they tend to be in the same direction, cumulate over the studies in a meta-analysis and do not cancel each other out. As a result, there may be a major impact on what inferences are drawn from a meta-analysis and, ultimately, on what recommendations are based thereon.

References

Fisher, R. A., & Yates, F. Statistical tables. London: Oliver and Boyd, 1957.
Glass, G. V. Primary, secondary, and meta-analysis of research. Educational Researcher, 1976, 5, 3-8.
Glass, G. V., McGaw, B., & Smith, M. L. Meta-analysis in social research. Beverly Hills: Sage, 1981.
Hedges, L. V. Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 1981, 6(2), 107-128.
Hedges, L. V. Estimation and testing for differences in effect size: Comment on Hsu. Psychological Bulletin, 1982, 91, 391-393. (a)
Hedges, L. V. Estimation of effect size from a series of independent experiments. Psychological Bulletin, 1982, 92, 490-499. (b)
Hedges, L. V. Fitting categorical models to effect sizes from a series of experiments. Journal of Educational Statistics, 1982, 7, 119-137. (c)
Hedges, L. V., & Olkin, I. Clustering estimates of effect magnitude from independent studies (Technical Report No. 173). Stanford, Calif.: Stanford University, April 1981.
Hsu, L. M. Tests of differences in p-levels as tests for differences in effect size. Psychological Bulletin, 1980, 88, 705-708.
Johnson, N. L., & Welch, B. L. Applications of the noncentral t-distribution. Biometrika, 1940, 31, 362-389.
Kraemer, H. C. On estimation and hypothesis testing problems for correlation coefficients. Psychometrika, 1975, 40(4), 473-485.
Kraemer, H. C. Tests of homogeneity of independent correlation coefficients. Psychometrika, 1979, 44(3), 329-335.
Kraemer, H. C., & Andrews, G. A non-parametric technique for meta-analysis effect size calculation. Psychological Bulletin, 1982, 91(2), 404-412.
Kraemer, H. C., & Paik, M. A central t approximation to the noncentral t-distribution. Technometrics, 1979, 21(3), 357-360.
Resnikoff, G. J., & Lieberman, G. J. Tables of the noncentral t-distribution. Stanford, Calif.: Stanford University Press, 1957.
Scheffé, H. The analysis of variance. New York: John Wiley & Sons, 1959.
Winer, B. J. Statistical principles in experimental design (2nd ed.). New York: McGraw-Hill, 1971.

Author

KRAEMER, HELENA CHMURA. Associate Professor of Biostatistics in Psychiatry, Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, California 94306. Specializations: Statistical applications in bio-behavioral areas, particularly correlation techniques.