An evaluation of homogeneity tests in meta-analyses in pain using simulations of individual patient data


Pain 85 (2000) 415-424

David J. Gavaghan a,*, R. Andrew Moore b, Henry J. McQuay b

a Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, UK
b Pain Research, Nuffield Department of Anaesthetics, The Churchill, Oxford Radcliffe Hospital, Oxford OX3 7LJ, UK

Received 22 January 1999; received in revised form 1 October 1999; accepted 23 November 1999

Abstract

In this paper we consider the validity and power of some commonly used statistics for assessing the degree of homogeneity between trials in a meta-analysis. We show, using simulated individual patient data typical of that occurring in randomized controlled trials in pain, that the most commonly used statistics do not give the expected levels of statistical significance (i.e. the proportion of trials giving a significant result is not equal to the proportion expected due to random chance) when used with truly homogeneous data. In addition, all such statistics are shown to have extremely low power to detect true heterogeneity even when that heterogeneity is very large. Since, in most practical situations, failure to detect heterogeneity does not allow us to say with any helpful degree of certainty that the data is truly homogeneous, we advocate the quantitative combination of results only where the trials contained in a meta-analysis can be shown to be clinically homogeneous. We propose as a definition of clinical homogeneity that all trials have (i) fixed and clearly defined inclusion criteria and (ii) fixed and clearly defined outcomes or outcome measures. In pain relief, for example, the first of these would be satisfied by all patients having moderate or severe pain, whilst the second would be satisfied by using at least 50% pain relief as the successful outcome measure.

© 2000 International Association for the Study of Pain. Published by Elsevier Science B.V. All rights reserved.
Keywords: Homogeneity; Heterogeneity; Meta-analysis; Pain relief

1. Introduction

There has been a lengthy debate within the medical literature about the effects of heterogeneity in meta-analyses of randomized controlled trials involving dichotomous outcome measures (DerSimonian and Laird, 1986; Thompson and Pocock, 1986). This debate has mainly concentrated upon what should be done if heterogeneity is detected, but little discussion has taken place as to which is the most appropriate statistical test to use when attempting to assess heterogeneity in a meta-analysis. As a result the statistics which are routinely used are those which authors find easiest to calculate, with the most commonly used being that described by Yusuf et al. (1985) (now often described as the 'Peto method'), which is the standard technique in the Cochrane Library. Other fairly common methods are those due to Woolf (1955) and DerSimonian and Laird (1986). [1]

In contrast, in the statistical literature there exists a very wide literature on the most appropriate way to assess heterogeneity in a series of 2 × 2 tables, which is an equivalent problem, although it is usually considered within the framework of the analysis of sub-strata in retrospective studies of the relationship between disease incidence and exposure to a suspected risk factor (Zelen, 1971; Halperin et al., 1977). Two excellent review papers by Paul and Donner (1989, 1992) describe and compare ten homogeneity statistics, [2] six of which require iterative methods and only one of which, the Woolf statistic, overlaps with those which are commonly used in meta-analysis. Of the other two common statistics, the DerSimonian and Laird statistic is based on risk differences, which is inappropriate in the context of retrospective studies, whilst the Peto statistic is actually equivalent to a test proposed by Zelen (1971) which was subsequently shown to be invalid by Halperin et al. (1977) except for the special case where all tables have a common odds ratio of 1. [3]

In this paper we therefore make use of simulation methods to compare the commonly used homogeneity statistics with those recommended in the statistical literature, in the context of meta-analyses of randomized controlled trials in pain relief. We will extend this analysis to other applications elsewhere. We conclude that none of the standard methods used routinely in meta-analyses for assessing homogeneity give the appropriate levels of significance and all have very low power to detect true heterogeneity. We recommend an alternative non-iterative test statistic, first suggested by Breslow and Day (1980), based on the Mantel-Haenszel estimator (Mantel and Haenszel, 1959) of the odds ratio, although we show that this too lacks power to detect true heterogeneity.

* Corresponding author. Tel.: ; fax: ; e-mail address: gavaghan@comlab.ox.ac.uk (D.J. Gavaghan)
[1] A citation search on BIDS on each of these three papers yielded 1110 citations for the Yusuf et al. (1985) paper, 977 citations for the Woolf (1955) paper and 583 citations for the DerSimonian and Laird (1986) paper.
[2] We will use the term homogeneity statistics since we are testing whether the null hypothesis of homogeneity holds; if it does not, then we infer that there is evidence of heterogeneity between the trials.

2. Methods

Table 1
The standard 2 × 2 table for each trial

             Success    Failure        Total
Treatment    $r_i$      $n_i - r_i$    $n_i$
Control      $s_i$      $m_i - s_i$    $m_i$
Total        $t_i$      $N_i - t_i$    $N_i$

Throughout this paper we will consider homogeneity tests for dichotomous outcome measures. A typical meta-analysis will be assumed to consist of $k$ trials. In each of the $k$ trials we will assume that we have numbers, $n_i$, $m_i$, assigned to each of the treatment and control groups (in the simulations we will assume that $n_i = m_i$, i.e. we have a perfectly balanced design, which is what we usually aim to achieve in randomized controlled trials in pain relief). In the $i$th trial we will assume that there are $r_i$ successes in the treatment group and $s_i$ successes in the control group (success might mean a patient improves upon treatment; for example, in pain relief we will usually take 'success' to mean at least 50% pain relief (McQuay and Moore, 1998), although in cancer studies success is usually replaced by number of deaths, etc.).
For each trial this can be written as a 2 × 2 table, as shown in Table 1, where $t_i$ is the total number of successes and $N_i = n_i + m_i$ is the total number of patients in trial $i$, $i = 1, \ldots, k$. The problem of assessing homogeneity in a meta-analysis of $k$ trials is therefore identical to assessing the homogeneity of $k$ independent 2 × 2 tables.

[3] As we will show it also works adequately for the small effect sizes for which it was originally introduced by Yusuf et al. (1985), but is inappropriate for use with effect sizes typically seen in pain studies. It is worth noting that the associated estimate of the odds ratio proposed by Yusuf et al. (1985) is also severely biased for large effects (Greenland and Salvan, 1990) and is therefore also inappropriate for use in meta-analysis of pain studies. Instead we would recommend the use of the Mantel-Haenszel estimator of the log-odds ratio (Mantel and Haenszel, 1959), or perhaps more usefully from a clinical perspective reporting the effect size in terms of numbers-needed-to-treat or NNTs (McQuay and Moore, 1997).

We will consider five test statistics for assessing homogeneity: the three described in Section 1 (the Peto statistic (denoted by $Q_P$), the Woolf statistic ($Q_W$) and the DerSimonian and Laird statistic ($Q_{DL}$)), together with the score test based on the conditional maximum likelihood estimator of the assumed common odds ratio ($Q_{mle}$) (Cox, 1972; Liang and Self, 1985; Paul and Donner, 1989, 1992) and the Breslow-Day score statistic based on the Mantel-Haenszel estimator of the assumed common odds ratio ($Q_{BD}$). Brief details of the way in which each of these statistics is calculated are given in Appendix A and full details can be found in the original references.

2.1. Simulations

In order to test the efficacy of the above statistics in detecting heterogeneity we have performed two sets of simulations.
The first considers the case of truly homogeneous data with fixed underlying treatment effect (which we term the experimental event rate or EER) and control effect (the control event rate or CER). This allows us to determine whether the five tests give the correct level of statistical significance for truly homogeneous data. Since both event rates are fixed, the effect size is homogeneous in both the log-odds scale and the risk difference scale, and since we also use perfectly balanced designs in each trial (i.e. equal group sizes) we can conclude that the comparisons we make between the performance of the statistics in our simulations are fair. The second set of simulations considers the case of heterogeneous data by allowing the underlying event rates to vary randomly. This allows us to assess the power of the five tests to detect increasing levels of heterogeneity in the data.

We attempt to make our simulations mimic as closely as possible the data likely to occur in meta-analyses in pain studies. We therefore use in all cases a CER value of 0.2 and EER values ranging from 0.2 (i.e. no effect of treatment) up to 0.7 (an extremely powerful analgesic) (McQuay and Moore, 1998). For each pair of values of CER and EER, we then simulate meta-analyses with particular numbers of trials, $k$, in each meta-analysis (we consider the cases $k = 5$, 10, 20 and 50). The number of patients, $n_i$, in each group of a particular trial is assumed to follow a lognormal distribution with mean 50 (SD 25) (again typical of RCTs in pain relief; McQuay and Moore, 1998) as described in Appendix B. In each of the $k$ trials within each of the simulated meta-analyses individual patient data are then generated so that (i) for fixed effects they have the same underlying values of the CER and EER as all other trials in the meta-analysis and (ii) for random effects they have underlying values of the CER of 0.2 and the EER of 0.5, but both are allowed to vary randomly about these underlying means.
This variation in the random effects model is calculated via a random perturbation on the log-odds scale, as described in Appendix B. The assignment of an individual as a success or failure simply depends on whether a random number generated from a uniform distribution on [0,1] is less than or greater than the underlying event rate for that group. Once the data have been generated, the five homogeneity statistics are calculated and the proportion which give statistically significant results are counted and used to create the graphs given in Section 3. The simulation algorithm is given in Appendix B.

The most commonly chosen level for statistical significance in homogeneity tests is $P = 0.1$ or 10% and we will therefore use this level of significance throughout this paper (we will refer to this as the 'nominal significance' level). This implies that in truly homogeneous trials, we would expect each of our homogeneity tests to give a statistically 'significant' result (against homogeneity) in about 10% (or about 1000) of the simulated meta-analyses, purely due to random chance. For truly heterogeneous data, we would expect the tests to detect heterogeneity more frequently than in 10% of cases, depending on the degree of heterogeneity present.

3. Results

The results of the simulations for fixed effects are given in Figs. 1 and 2 and those for random effects are given in Figs. 3 and 4. In Fig. 1 we show the percentage of the simulated meta-analyses which give a statistically significant result at the 10% ($P = 0.1$) level for meta-analyses containing 5, 10, 20 and 50 trials using a fixed CER of 0.2 and fixed EER values ranging from 0.2 to 0.7 in increments of 0.05 (so that the data is truly homogeneous). It can be seen that only two of the five statistics, $Q_{BD}$ and $Q_{mle}$, come close to maintaining the nominal significance level of 10%.
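As an illustration of the calculate-and-test step used throughout these simulations, the following is a minimal Python sketch of one of the five statistics, the Breslow-Day statistic (formulae as in Appendix A.5), together with its comparison against a $\chi^2$ distribution at the 10% level. It is not the authors' code: the function names, the `(r, n, s, m)` tuple layout, the use of $k - 1$ degrees of freedom for $Q_{BD}$ and the skipping of degenerate tables are all assumptions made for the example, and it presumes no trial has a zero cell that would make the Mantel-Haenszel estimate undefined.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from Q(1, z) = exp(-z) (even df) or Q(1/2, z) = erfc(sqrt(z)) (odd df),
    # then apply the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio; tables is a list of (r, n, s, m) as in Table 1."""
    num = sum(r * (m - s) / (n + m) for r, n, s, m in tables)
    den = sum(s * (n - r) / (n + m) for r, n, s, m in tables)
    return num / den


def breslow_day(tables):
    """Q_BD of Appendix A.5 (illustrative; degenerate tables are skipped)."""
    psi = mantel_haenszel_or(tables)
    q = 0.0
    for r, n, s, m in tables:
        t = r + s
        lo, hi = max(0, t - m), min(t, n)
        if lo == hi:
            continue  # the margins fix the table, so it carries no information
        # e (m - t + e) = psi (t - e)(n - e) rearranged as a e^2 + b e + c = 0
        a, b, c = 1.0 - psi, m - t + psi * (n + t), -psi * n * t
        if abs(a) < 1e-12:
            e = -c / b
        else:
            disc = math.sqrt(b * b - 4.0 * a * c)
            roots = ((-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a))
            e = next(x for x in roots if lo - 1e-9 <= x <= hi + 1e-9)
        v = 1.0 / (1.0 / e + 1.0 / (t - e) + 1.0 / (n - e) + 1.0 / (m - t + e))
        q += (r - e) ** 2 / v
    return q


def rejects_homogeneity(tables, alpha=0.10):
    """True if Q_BD is significant at level alpha on k - 1 degrees of freedom."""
    return chi2_sf(breslow_day(tables), len(tables) - 1) < alpha
```

Counting how often `rejects_homogeneity` returns `True` over many simulated meta-analyses gives the kind of percentage plotted in Figs. 1 to 4. The stdlib-only $\chi^2$ tail is a design choice for portability; a statistics library would normally be used instead.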
The statistic which performs worst is the Peto statistic, which as expected gives accurate values only for very small treatment effects, but gives gross under-estimates for larger effects. This is because this statistic is identical to that proposed by Zelen (1971) and was shown by Halperin et al. (1977) to follow a $\chi^2$ distribution only when there is no effect of treatment compared to control (so that the odds ratio is exactly 1). This statistic should not therefore be used in RCTs in pain research, where typical effect sizes are large. Of the other two commonly used statistics, $Q_W$ tends to give too low a percentage of trials with statistically significant heterogeneity, whilst $Q_{DL}$ tends to give too high a percentage. The degree of under- and over-estimation increases markedly with the number of trials in each meta-analysis; this is again to be expected since all such statistics follow a $\chi^2$ distribution only asymptotically, where 'asymptotically' means the number of trials remains fixed, but the number of patients in each group of each trial becomes large (Paul and Donner, 1989). We would therefore expect that both $Q_W$ and $Q_{DL}$ would give closer to nominal levels of significance with larger group sizes. This is confirmed by the further simulations shown in Fig. 2, where we have used lognormally distributed group sizes with mean 200 (SD 50) for the cases of 20 and 50 trials in each meta-analysis; both $Q_W$ and $Q_{DL}$ give much closer to the nominal 10% significance level, $Q_{BD}$ continues to be very accurate, whilst $Q_P$ again gives very poor results except for very small treatment effects (we do not give values for $Q_{mle}$ in Fig. 2 since the iterative procedure necessary takes an inordinate amount of CPU time for such large group sizes).

Fig. 1. Effect of increasing the effect size on the performance of each homogeneity test. The underlying CER was fixed at 0.2 and the group sizes were lognormally distributed with mean 50 (SD 25). The nominal level of significance in each $\chi^2$ test was 10% ($P = 0.1$).

Fig. 2. As in Fig. 1, but using group sizes which are lognormally distributed with mean 200 (SD 50).

Fig. 3. Effect of increasing the size of the random effect (see Appendix B) to test the power of each of the homogeneity tests. The underlying CER and EER were 0.2 and 0.5, respectively. The group sizes were again lognormally distributed with mean 50 (SD 25). The nominal level of significance in each $\chi^2$ test was 10% ($P = 0.1$). The results for $Q_{BD}$ and $Q_{mle}$ are almost superimposed in all cases.

Fig. 4. The power of the homogeneity tests as a function of the experimental event rate for random effects with SDs $\sigma_{re} = 0.2$ and 0.3 (see Appendix B for definitions of random effects), with 20 trials per meta-analysis and lognormally distributed group sizes with mean 50 (SD 25). The underlying CER was 0.2 in all cases.

Figs. 1 and 2 considered simulated data where the true underlying effect sizes were fixed for all trials, i.e. the simulated meta-analyses were all truly homogeneous. In Fig. 3 we consider what happens when the underlying effect sizes are heterogeneous. To do this we choose underlying values of the CER of 0.2 and of the EER of 0.5, which are typical of an average analgesic (McQuay and Moore, 1998). We then allow both event rates to vary randomly about the underlying value, with the fluctuations generated in the log-odds scale, for the reasons given in Appendix B. Also given in Table 2 and Fig. 5 in Appendix B are the means and standard deviations of both the underlying perturbed event rates and the observed event rates generated by our algorithm. For small values of $\sigma_{re}$ (the standard deviation of the random error in the log-odds scale) up to about 0.15, we generate a small random effect and the standard deviations of the observed event rates increase only slightly due to the random effect (see Table 2 in Appendix B). However, as $\sigma_{re}$ continues to increase the observed standard deviations become much larger than would be expected due to binomial variation alone, so that the observed EERs within any particular meta-analysis might cover the complete range of values of known analgesics, as illustrated in the lower panels of Fig. 5 in Appendix B. All trials simulated in Fig. 3 use lognormally distributed group sizes with approximate mean 50 (SD 25) (as in Fig. 1).

It can be seen from the results presented in Fig. 3 that all of the statistics have very low power to detect random effects, unless those effects are large. Whilst the DerSimonian and Laird statistic, $Q_{DL}$, appears to have the greater power, this is probably just a consequence of the over-estimation given by this statistic with fixed effects as shown in Figs. 1 and 2 and similarly for the under-estimation given with the Woolf statistic, $Q_W$, and the Peto statistic, $Q_P$.
The maximum likelihood statistic, $Q_{mle}$, and the Breslow-Day statistic, $Q_{BD}$, again give very similar results, reinforcing our recommendation of the Breslow-Day statistic. For all of the tests, the power is a function of the number of trials included in the meta-analysis (this is in agreement with the recent results of Hardy and Thompson (1998) who have shown that the power of homogeneity statistics is a function of the total information available within the meta-analysis). For meta-analyses with numbers of trials between 10 and 20 (typical in pain relief), the Breslow-Day statistic rejects the null hypothesis of homogeneity in 30-40% of simulations with a random effect with $\sigma_{re} = 0.2$ and only in 70-90% of the simulations with a large random effect with $\sigma_{re} = 0.4$. The implications of these results for meta-analyses in pain relief are explored in Section 4.

We also considered the effect of the size of the treatment effect on the power of the homogeneity tests and found, as expected, that it had little influence. We give two examples of such a simulation in Fig. 4, which shows the power of the homogeneity tests as a function of experimental event rate for random effects with SDs $\sigma_{re} = 0.2$ and 0.3, with 20 trials per meta-analysis and lognormally distributed group sizes with mean 50 (SD 25). It is clear that the power of all of the homogeneity tests (except the Peto statistic, $Q_P$) remains roughly constant as the treatment effect increases. The slight decreases at the end of the range are likely to be due to the skewing of the distributions of the perturbed event rates in transforming from the log-odds scale back to the probability scale (as described in Appendix B).

Table 2
The calculated mean and SD of the perturbed EER, $p_i$, and CER, $q_i$, together with the mean and SD of the 'observed' EER, $\hat{p}_i$, and CER, $\hat{q}_i$, used to generate the data in Fig. 3 for the case of 20 trials per meta-analysis (a)
Columns: $\sigma_{re}$; perturbed values (mean $p_i$, SD $p_i$, mean $q_i$, SD $q_i$); observed values (mean $\hat{p}_i$, SD $\hat{p}_i$, mean $\hat{q}_i$, SD $\hat{q}_i$)
(a) The underlying CER value was 0.2 and the EER 0.5 in all simulations.

Fig. 5. Histograms of the observed event rates, $\hat{p}_i$ and $\hat{q}_i$, used to generate the data in Fig. 3 for the case of 20 trials per meta-analysis for $\sigma_{re} = 0.0$ (i.e. the variation is simply that obtained for binomial distributions (with varying group sizes) with fixed event rates) and $\sigma_{re} = 0.1$, 0.3 and 0.5. The underlying values of the CER and EER are 0.2 and 0.5 in all cases.

4. Discussion

As we mentioned briefly above, all statistical tests of homogeneity depend upon the assumption of the asymptotic normality of whatever measurement of effect size we are using; in this paper this is either the risk difference or the log of the odds ratio (an excellent discussion of the derivation of homogeneity statistics is given in Chapter 10 of Fleiss (1981)). As we have shown in Fig. 1, the only non-iterative statistic which gives nominal significance for truly homogeneous data and for group sizes typical of pain studies is the Breslow-Day statistic and we would therefore recommend this statistic for routine use in meta-analysis of pain studies. Fig. 1 also shows that the DerSimonian and Laird statistic over-estimates the degree of heterogeneity and the Woolf statistic under-estimates it; it seems likely from the results shown in Fig. 2, which use comparatively large group sizes, that this is due to the assumption of asymptotic normality being poor for these statistics when used with the smaller group sizes of Fig. 1. We have shown that the Peto statistic gives nominal significance only when there is no treatment effect, but gives reasonable accuracy (at least for small to medium meta-analyses) for effect sizes up to odds ratios of around 1.5-2 (risk differences of about 0.15) and is therefore unsuitable for use in pain studies, but will give good results in the types of studies for which it was originally introduced, i.e. those with small effect sizes.

In Figs. 3 and 4 we considered the effects of including random effects to simulate heterogeneity between the trials and showed that all of the statistics have very low power to detect such heterogeneity. This has been reported before in the context of heterogeneity in several 2 × 2 tables (Jones et al., 1989), [4] as well as in the context of meta-analyses by Hardy and Thompson (1998). [5] These results suggest that in practice homogeneity tests are of very limited use; in a typical pain study (10-20 trials per meta-analysis) which has a strong degree of heterogeneity (SD 0.25 in Fig. 3) the statistical tests are (at best) equally likely to reject or accept the null hypothesis of homogeneity. It is only with extremely heterogeneous data and large numbers of trials in the meta-analysis that the power of the tests is sufficient to detect the heterogeneity most of the time. So, in most practical situations, failure to detect heterogeneity does not allow us to say with any helpful degree of certainty that the data is truly homogeneous.

[4] This paper also gives an interesting explanation of why homogeneity tests have such low power.
This leads us into the debate as to whether we should use fixed effects analyses to estimate treatment effects, or random effects analyses, which is a topic which has already received much attention in the literature (see, for example, DerSimonian and Laird, 1986; Greenland and Salvan, 1990; Hardy and Thompson, 1998). Since in practical situations it will not be possible to demonstrate with any degree of certainty that a set of trials are statistically homogeneous, we would advocate the quantitative combination of results only where the trials contained in a meta-analysis can be shown to be clinically homogeneous. We would propose as a definition of clinical homogeneity that all trials have (i) fixed and clearly defined inclusion criteria and (ii) fixed and clearly defined outcomes or outcome measures. In pain relief, for example, the first of these would be satisfied by all patients having moderate or severe pain, whilst the second would be satisfied by using at least 50% pain relief as the successful outcome measure (Edwards et al., 1999). Provided that the trials are considered to be clinically homogeneous, then we would advocate following the advice of Fleiss (1981, p. 164), who suggests that statistical homogeneity should be tested only at a very conservative level (such as $P = 0.01$). A similarly attractive argument has been put forward by Greenland and Salvan (1990, p. 252), who argue that the choice between fixed and random effects modelling is secondary to the exploration of 'clinically important' inter-study differences; where such differences do exist then it is more important to attempt to model and perhaps explain the inter-study differences rather than to attempt to pool the disparate study results in a single summary estimate.
Another balanced and informative discussion of how to deal with heterogeneity is given in the paper by Thompson and Pocock (1986), who suggest that 'quantitative conclusions [of meta-analyses] ... must take into account the practical relevance of the individual studies and the clinical heterogeneity between them'.

[5] These authors start from the assumption that the effect measure is normally distributed and then simulate data from an appropriate normal distribution. They do not therefore observe the effect size and group size variation in the performance of the homogeneity statistics that we have demonstrated.

Acknowledgements

We would like to thank Dr Richard Stevens for his very helpful advice on the statistical aspects of this paper and an anonymous referee for his comments which have greatly improved the revised draft. We are grateful to the following organizations for their financial support which has enabled us to undertake this research: the Medical Research Council for a Career Development Fellowship (D.J.G.), the European Union Biomed 2 contract BMH4 CT (HJM) and the NHS Research and Development Health Technology Assessment Programme 94/11/4.

Appendix A. Homogeneity statistics

A.1. The Peto statistic

In the paper of Yusuf et al. (1985), the problem is framed in terms of the observed, $O_i$, and expected, $E_i$, numbers of successes in the treatment group of trial $i$ and its variance $V_i$ (where, in the notation of Table 1, $O_i = r_i$, $E_i = n_i t_i / N_i$ and $V_i = E_i (1 - n_i/N_i)(N_i - t_i)/(N_i - 1)$). Making use of an approximation to the conditional maximum likelihood estimate of the common log-odds ratio these authors derive a 'natural approximate chi-square test for heterogeneity' given by

$$Q_P = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{V_i} - \frac{\left[\sum_{i=1}^{k} (O_i - E_i)\right]^2}{\sum_{i=1}^{k} V_i} \tag{1}$$

with degrees of freedom one less than the number of non-zero variances (and so is usually equal to $k - 1$).

A.2. The Woolf statistic

In the notation of Table 1, the Woolf homogeneity statistic (Woolf, 1955) is given in terms of the natural logarithm of the individual estimates of the odds ratio, $\psi_i$, from each trial. Letting $y_i = \ln \psi_i$, the Woolf homogeneity statistic is given by

$$Q_W = \sum_{i=1}^{k} w_i y_i^2 - \frac{\left(\sum_{i=1}^{k} w_i y_i\right)^2}{\sum_{i=1}^{k} w_i} \tag{2}$$

where $w_i = \left[1/r_i + 1/(n_i - r_i) + 1/s_i + 1/(m_i - s_i)\right]^{-1}$ is the reciprocal of the approximate variance of the log-odds ratio. $Q_W$ is again taken to follow approximately a $\chi^2$ distribution with $k - 1$ degrees of freedom.

A.3. The DerSimonian and Laird statistic

The DerSimonian and Laird (1986) statistic is the only one of the five which is not based on the common odds ratio and instead is based on estimates of the risk difference. Defining $r_{Ti} = r_i/n_i$ and $r_{Ci} = s_i/m_i$ to be the proportions of successful patients in the treatment and control groups in trial $i$ and $y_i = r_{Ti} - r_{Ci}$, the DerSimonian and Laird homogeneity statistic is defined as

$$Q_{DL} = \sum_{i=1}^{k} w_i (y_i - \bar{y})^2 \tag{3}$$

where $\bar{y} = \sum_i w_i y_i / \sum_i w_i$ is the weighted mean risk difference, $w_i = 1/s_i^2$ and $s_i^2 = r_{Ti}(1 - r_{Ti})/n_i + r_{Ci}(1 - r_{Ci})/m_i$ is an estimate of the sampling variance in the $i$th study. $Q_{DL}$ is again taken to follow approximately a $\chi^2$ distribution with $k - 1$ degrees of freedom. DerSimonian and Laird also considered two further test statistics: the first was similar to $Q_{DL}$ described above but with equal weights for each trial and was shown to give poor results; the second was based on the natural logarithm of the relative odds and is equivalent to the Woolf statistic.

A.4. The conditional maximum likelihood score statistic

This is by far the most complex of the test statistics considered and is unlikely to be adopted for routine use. It is included only for comparison with the four non-iterative statistics. The conditional likelihood is the conditional distribution of the observed data assuming that all marginal totals are fixed (Cox, 1972; Breslow and Day, 1980) and is expressed in terms of the assumed common odds ratio $\psi$ and the observed number of successes, $r_i$, in each of the treatment groups as

$$l(r_i; \psi) = \prod_{i=1}^{k} \left\{ \frac{\binom{n_i}{r_i} \binom{m_i}{t_i - r_i} \psi^{r_i}}{\sum_{u} \binom{n_i}{u} \binom{m_i}{t_i - u} \psi^{u}} \right\} \tag{4}$$

The maximum of this expression as a function of $\psi$ can be found (for example by the Newton-Raphson method) to give the conditional maximum likelihood estimate of the odds ratio $\hat{\psi}_c$.
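To make the iterative step concrete, the sketch below maximizes the conditional likelihood of this section and evaluates the score statistic that Liang and Self (1985) build on the resulting estimate. It is a hedged illustration rather than the authors' implementation: bisection on $\ln\psi$ is used in place of the Newton-Raphson method mentioned in the text (the score $\sum_i (r_i - E[r_i \mid t_i; \psi])$ is monotone in $\psi$, which makes bisection simple and safe), and the function names, the search bounds and the `(r, n, s, m)` tuple layout are assumptions.

```python
import math


def conditional_moments(n, m, t, psi):
    """Exact mean and variance of r given t for a common odds ratio psi."""
    support = range(max(0, t - m), min(t, n) + 1)
    log_w = [math.log(math.comb(n, u)) + math.log(math.comb(m, t - u))
             + u * math.log(psi) for u in support]
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]  # rescaled weights avoid overflow
    total = sum(w)
    mean = sum(u * wi for u, wi in zip(support, w)) / total
    var = sum((u - mean) ** 2 * wi for u, wi in zip(support, w)) / total
    return mean, var


def conditional_mle(tables, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve sum_i (r_i - E[r_i | t_i; psi]) = 0 by bisection on log(psi).

    The bounds lo and hi on log(psi) are illustrative; data with zero cells
    can push the estimate to a boundary.
    """
    def score(log_psi):
        psi = math.exp(log_psi)
        return sum(r - conditional_moments(n, m, r + s, psi)[0]
                   for r, n, s, m in tables)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:  # fitted means too small, so psi must increase
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


def q_mle(tables):
    """Return (psi_hat, Q_mle) using the exact conditional mean and variance."""
    psi = conditional_mle(tables)
    q = 0.0
    for r, n, s, m in tables:
        mean, var = conditional_moments(n, m, r + s, psi)
        if var > 0.0:
            q += (r - mean) ** 2 / var
    return psi, q
```

For example, `q_mle([(15, 20, 5, 20), (5, 20, 15, 20)])` describes two trials with opposite effects, for which the common odds ratio estimate is 1 by symmetry. The cost of summing over the whole support of each table is what makes this statistic slow for large group sizes.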
Liang and Self (1985) describe a homogeneity statistic based on $\hat{\psi}_c$ given by

$$Q_{mle} = \sum_{i=1}^{k} \frac{\left[r_i - E(r_i \mid t_i; \hat{\psi}_c)\right]^2}{\mathrm{var}(r_i \mid t_i; \hat{\psi}_c)} \tag{5}$$

where $E(r_i \mid t_i; \hat{\psi}_c)$ and $\mathrm{var}(r_i \mid t_i; \hat{\psi}_c)$ are the exact mean and variance of $r_i$. Again this is approximately distributed as a $\chi^2$ random variable with $k - 1$ degrees of freedom.

A.5. The Breslow-Day test statistic

Breslow and Day (1980) (see Paul and Donner, 1992) proposed a homogeneity statistic based on the Mantel-Haenszel estimator (Mantel and Haenszel, 1959) of the odds ratio, $\psi_{MH}$ ($= \sum_{i=1}^{k} [r_i (m_i - s_i)/N_i] \,/\, \sum_{i=1}^{k} [s_i (n_i - r_i)/N_i]$ in the notation of Table 1), which is defined as

$$Q_{BD} = \sum_{i=1}^{k} \frac{\left[r_i - e_i(\psi_{MH})\right]^2}{v_i(\psi_{MH})} \tag{6}$$

where $e_i(\psi_{MH})$ is the expected value of $r_i$ given $\psi_{MH}$, and $v_i(\psi_{MH})$ is an estimator of the variance of $r_i$ given the value of $\psi_{MH}$ and conditional on the value of $t_i$ (see Paul and Donner, 1992). Each value of $e_i$ can be found by solving the quadratic equation

$$\frac{e_i (m_i - t_i + e_i)}{(t_i - e_i)(n_i - e_i)} = \psi_{MH} \tag{7}$$

and taking the unique root in the interval $\max(0, t_i - m_i) \le e_i \le \min(t_i, n_i)$ (Fleiss, 1981). $v_i$ is then obtained as

$$v_i = \left[ \frac{1}{e_i} + \frac{1}{t_i - e_i} + \frac{1}{n_i - e_i} + \frac{1}{m_i - t_i + e_i} \right]^{-1} \tag{8}$$

Although this looks complex, in practice it involves finding the root of $k$ quadratic equations, followed by the usual summation to obtain the test statistic, and so is computationally inexpensive and would be easy to implement on a standard spreadsheet.

Appendix B. The simulation algorithm

B.1. Generation of group sizes

The group sizes for each of the trials in the meta-analysis are generated from a lognormal distribution (to ensure non-negativity) with mean $N$ (SD $\sigma_N$). If a random variable $X$ is normally distributed with mean $\mu_X$ and variance $\sigma_X^2$, then $Y = \exp(X)$ has a lognormal distribution. It can be shown (see, for example, Dudewicz and Mishra, 1988) that the mean of $Y$ is $\mu_Y = \exp(\mu_X + \sigma_X^2/2)$ and the variance of $Y$ is $\sigma_Y^2 = \exp(2\mu_X + \sigma_X^2)\,[\exp(\sigma_X^2) - 1]$. The group sizes for all simulations were therefore generated by first simulating a normal random variable, $x$ say, then obtaining the group size from $n = \mathrm{nint}[\exp(x)]$, where 'nint' is the nearest integer value. In practice (except for values shown in Fig. 2) we used values of $\mu_X = 3.8$ and $\sigma_X = 0.48$ to obtain a lognormal distribution with approximate mean $N = 50$ (SD $\sigma_N = 25$). For Fig. 2 we used values of $\mu_X = 5.25$ and $\sigma_X = 0.25$ to obtain a lognormal distribution with approximate mean $N = 200$ (SD $\sigma_N = 50$).

B.2. Generation of the EER for the random effects model

In the following simulation algorithm, the underlying CER is taken to be $q$ and the underlying EER is $p$. If we are considering fixed effects, then we simply choose a value of the CER, $q$ (this is taken to be 0.20 in all simulations), and of the EER, $p$ (this is chosen in the range 0.2 up to 0.7 for fixed effect simulations), and we then follow the algorithm given below for this particular pair $(p, q)$ for simulations. If we are considering random effects, then we cannot simply allow the CER and EER to vary randomly about some mean with, say, a normally distributed random error, since we could then obtain values of $p$ which are outside the range [0,1] and are therefore unrealistic. We therefore consider perturbing instead in the log-odds scale. We do this by first choosing underlying values of $p$ and $q$ (we use $p = 0.5$ and $q = 0.2$ for all simulations in Fig. 3) and calculate from these the underlying log-odds $y_p = \ln[p/(1 - p)]$ and $y_q = \ln[q/(1 - q)]$, which can take values over the whole real line and are asymptotically normally distributed. We then simulate normally distributed random errors, $\epsilon_p$, $\epsilon_q$, with mean 0 (SD $\sigma_{re}$) (we take $\sigma_{re}$ to vary between 0.05 and 0.5 in Figs. 3 and 4) and add these to $y_p$ and $y_q$. We then invert this process to obtain the perturbed values of $p_i$ and $q_i$ for trial $i$ from $p_i = \exp(y_p + \epsilon_{pi}) / [1 + \exp(y_p + \epsilon_{pi})]$ and $q_i = \exp(y_q + \epsilon_{qi}) / [1 + \exp(y_q + \epsilon_{qi})]$.

To check the range of values that this process generates, we calculated the mean and standard deviation of the 'perturbed' $p_i$ and $q_i$ values generated in each simulation of meta-analyses. We also calculated the mean and standard deviation of the 'observed' event rates, $\hat{p}_i = n_{ei}/n_i$ and $\hat{q}_i = n_{ci}/n_i$, where $n_{ei}$ and $n_{ci}$ are the observed numbers of experimental and control events in trial $i$ which has $n_i$ patients in each group.
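The generation of group sizes (B.1) and of perturbed event rates (B.2) can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' program: the function names are hypothetical, `round` stands in for 'nint', the `max(1, ...)` guard against an empty group is an added assumption, and the patient-level draw follows the uniform-random-number rule given in Section 2.

```python
import math
import random

MU_X, SIGMA_X = 3.8, 0.48  # Appendix B.1 values for mean ~50 (SD ~25)


def lognormal_moments(mu, sigma):
    """Mean and SD of exp(X) when X is normal with mean mu and SD sigma."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, sd


def group_size(rng, mu=MU_X, sigma=SIGMA_X):
    """Lognormal group size rounded to the nearest integer (at least 1)."""
    return max(1, round(math.exp(rng.gauss(mu, sigma))))


def perturb(rate, sigma_re, rng):
    """Perturb an event rate on the log-odds scale and map it back to (0, 1)."""
    y = math.log(rate / (1.0 - rate)) + rng.gauss(0.0, sigma_re)
    return math.exp(y) / (1.0 + math.exp(y))


def simulate_trial(p, q, sigma_re, rng):
    """One balanced trial, returned as (r, n, s, m) in the notation of Table 1."""
    n = group_size(rng)
    p_i, q_i = perturb(p, sigma_re, rng), perturb(q, sigma_re, rng)
    r = sum(rng.random() < p_i for _ in range(n))  # experimental events
    s = sum(rng.random() < q_i for _ in range(n))  # control events
    return r, n, s, n


rng = random.Random(2000)  # arbitrary seed, for reproducibility only
trials = [simulate_trial(0.5, 0.2, 0.3, rng) for _ in range(20)]
observed_eer = [r / n for r, n, s, m in trials]
```

Setting `sigma_re=0.0` recovers the fixed effects case, since the perturbation then leaves the underlying rates unchanged. `lognormal_moments(3.8, 0.48)` gives a mean of about 50.2 and an SD of about 25.5, consistent with the 'approximate mean 50 (SD 25)' quoted in B.1.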
Examples of the resulting means and standard deviations are given for the case $n = 20$ in Table 2 for each value of $\sigma_{re}$ used in Fig. 3, with underlying CER $q = 0.2$ and EER $p = 0.5$. These values allow us to quantify the effect that increases in $\sigma_{re}$ have on the distribution of the observed event rates, so that a value of $\sigma_{re} = 0.15$ increases the standard deviation of the observed experimental event rates by only about 10% (a small random effect), whilst values of $\sigma_{re} = 0.3$ and 0.5 increase the standard deviation of the observed experimental event rates by 37 and 78%, respectively. Note that the mean value for $q_i$ is increasingly biased above the underlying CER value of 0.2 with increasing random effect size. This is due to the non-linear nature of the transformation from the perturbed log-odds scale back to the probability scale, which skews the distribution away from 0 in the probability scale. This also accounts for the slightly lower percentage changes in the standard deviation of the observed control event rates for a given value of $\sigma_{re}$ in Table 2.

In Fig. 5 we also give histograms of the observed event rates, $\hat{p}_i$ and $\hat{q}_i$, in each of the ( ) trials which were simulated to generate Fig. 3 for each value of $\sigma_{re}$. $\sigma_{re} = 0$ is simply a binomial variation (with random, lognormally varying group sizes) and it is clear that a small random effect of $\sigma_{re} = 0.1$ makes little impression on the observed distributions, but for $\sigma_{re} = 0.3$ and 0.5 there is a very marked effect on the observed distributions and we might expect that an effective homogeneity test would
allow us to detect random effects of this order with high power.

B.3. Generation of the data for each simulation

In our simulations, each meta-analysis consists of $k$ trials, with $n_i$ patients in each group (i.e. we use perfectly balanced designs). The simulation algorithm is then as follows:

I. For each of the $k$ trials in the meta-analysis:
   1. Generate a random number $n_i$ (lognormally distributed), which is the number of patients in each of the control and experimental groups.
   2. If we are considering random effects, generate the perturbed event rates, $p_i$, $q_i$, for each group as described above. If we are considering fixed effects then set $p_i = p$, $q_i = q$, the fixed underlying event rates.
   3. For each of the $n_i$ patients in the control group generate a random number, $r$ say, uniformly distributed between 0 and 1. If $r < q_i$ then add 1 to the number of control events. This will result in a simulated value of the total number of control events, $n_{ci}$, and an observed control event rate of $n_{ci}/n_i = \hat{q}_i$.
   4. Repeat 3 for the experimental group (so now use $r < p_i$) to obtain the number of experimental events, $n_{ei}$, and the observed experimental event rate of $n_{ei}/n_i = \hat{p}_i$.
II. Using the data from the $k$ trials simulated in I, calculate each of the five homogeneity statistics.
III. Repeat steps I and II times and count the number of simulated trials in which each of the five homogeneity tests detects statistically significant heterogeneity (at the 10% level).

References

Breslow NE, Day NE. The analysis of case-control studies (chapter 4), Statistical methods in cancer research, 1, Lyons: International Agency for Research on Cancer,
Cox DR. Analysis of binary data, London: Methuen,
Dersimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–188.
Dudewicz EJ, Mishra SN. Modern mathematical statistics, New York: Wiley, 1988.
Edwards JE, Oldham A, Smith L, Carroll D, Wiffen PJ, McQuay HJ, Moore RA. Oral aspirin in post-operative pain. A quantitative systematic review. Pain 1999;81:289–297.
Fleiss JL. Statistical methods for rates and proportions (chapter 10), 2nd ed. New York: Wiley, 1981.
Greenland S, Salvan A. Bias in the one-step method for pooling study results. Stats Med 1990;9:247–252.
Halperin M, Ware JH, Byar DP, Mantel N, Brown CC, Kozial J, Gail M, Green SB. Testing for interaction in an I × J × K table. Biometrika 1977;64:271–275.
Hardy RJ, Thompson SG. Detecting and describing heterogeneity in meta-analysis. Stats Med 1998;17:844–856.
Jones MP, O'Gorman TW, Lemke JH, Woolson RF. Monte-Carlo investigation of homogeneity tests of the odds ratio under various sample size configurations. Biometrics 1989;45:171–181.
Liang KY, Self SG. Tests for homogeneity of odds ratio when the data are sparse. Biometrika 1985;72:353–358.
Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959;22:719–748.
McQuay HJ, Moore RA. Using numerical results for systematic reviews in clinical practice. Ann Intern Med 1997;126:712–720.
McQuay HJ, Moore RA. An evidence-based resource for pain relief, Oxford: Oxford University Press,
Paul SR, Donner A. A comparison of tests of homogeneity of odds ratios in K 2 × 2 tables. Stats Med 1989;8:1455–1468.
Paul SR, Donner A. Small sample performance of tests of homogeneity of odds ratios in K 2 × 2 tables. Stats Med 1992;11:159–165.
Thompson SG, Pocock SJ. Can meta-analysis be trusted? Lancet 1986;338:1127–1130.
Woolf B. On estimating the relation between blood group and disease. Ann Hum Genet 1955;19:251–253.
Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomised trials. Prog Cardiovasc Dis 1985;5:335–371.
Zelen M. The analysis of several 2 × 2 tables. Biometrika 1971;58:129–137.
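As a supplementary illustration of Appendix B.3, the fixed-effects simulation loop can be combined with the Breslow–Day statistic $Q_{BD}$ from Appendix A in a short Python sketch. Several things here are assumptions rather than details taken from the paper: all function names are hypothetical; $r_i$ is taken to be the number of experimental events and $t_i$ the total number of events in trial $i$; the Mantel–Haenszel odds ratio is computed in its usual form; trials whose variance is undefined are skipped; and the critical value is supplied by the caller rather than computed.

```python
import math
import random


def expected_events(psi, n, m, t):
    """Root of e(m - t + e) = psi (t - e)(n - e) in [max(0, t - m), min(t, n)]."""
    a = 1.0 - psi
    b = (m - t) + psi * (t + n)
    c = -psi * t * n
    s = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    if b >= 0:
        # Cancellation-free form of the root; also covers psi == 1 (a == 0).
        return -2.0 * c / (b + s) if b + s > 0 else 0.0
    return (s - b) / (2.0 * a)  # b < 0 only occurs when psi < 1, so a > 0


def simulate_trials(k, p, q, rng, mu_x=3.8, sigma_x=0.48):
    """Steps I.1 to I.4 for fixed effects: k balanced trials as (n_e, n_c, n)."""
    trials = []
    for _ in range(k):
        n = max(1, round(math.exp(rng.gauss(mu_x, sigma_x))))  # guard is an assumption
        n_c = sum(rng.random() < q for _ in range(n))  # control events
        n_e = sum(rng.random() < p for _ in range(n))  # experimental events
        trials.append((n_e, n_c, n))
    return trials


def breslow_day(trials):
    """Q_BD for balanced trials, using the Mantel-Haenszel odds ratio."""
    num = sum(n_e * (n - n_c) / (2 * n) for n_e, n_c, n in trials)
    den = sum((n - n_e) * n_c / (2 * n) for n_e, n_c, n in trials)
    psi_mh = num / den  # the degenerate case den == 0 is not handled in this sketch
    q_bd = 0.0
    for n_e, n_c, n in trials:
        t = n_e + n_c
        e = expected_events(psi_mh, n, n, t)
        cells = (e, t - e, n - e, n - t + e)
        if min(cells) <= 0:
            continue  # assumption: skip trials whose variance is undefined
        v = 1.0 / sum(1.0 / cell for cell in cells)
        q_bd += (n_e - e) ** 2 / v
    return q_bd


def rejection_rate(n_sim, k, p, q, critical_value, rng):
    """Steps II and III: proportion of simulated meta-analyses with Q_BD above the cut-off."""
    hits = sum(
        breslow_day(simulate_trials(k, p, q, rng)) > critical_value
        for _ in range(n_sim)
    )
    return hits / n_sim


if __name__ == "__main__":
    rng = random.Random(415)
    # Assumption: Q_BD is referred to chi-squared on k - 1 degrees of freedom;
    # 14.684 is the upper 10% point for 9 degrees of freedom (k = 10).
    print(rejection_rate(200, 10, 0.5, 0.2, 14.684, rng))
```

For truly homogeneous (fixed-effects) data, a well-behaved test would give a rejection rate near the nominal 10%, which is the property the simulations above are designed to check. Extending the sketch to random effects would only require replacing the fixed `p` and `q` inside `simulate_trials` with rates perturbed on the log-odds scale for each trial.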

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC Mantel-Haenszel Test Statistics for Correlated Binary Data by Jie Zhang and Dennis D. Boos Department of Statistics, North Carolina State University Raleigh, NC 27695-8203 tel: (919) 515-1918 fax: (919)

More information

Weighted tests of homogeneity for testing the number of components in a mixture

Weighted tests of homogeneity for testing the number of components in a mixture Computational Statistics & Data Analysis 41 (2003) 367 378 www.elsevier.com/locate/csda Weighted tests of homogeneity for testing the number of components in a mixture Edward Susko Department of Mathematics

More information

An accurate test for homogeneity of odds ratios based on Cochran s Q-statistic

An accurate test for homogeneity of odds ratios based on Cochran s Q-statistic Kulinskaya and Dollinger TECHNICAL ADVANCE An accurate test for homogeneity of odds ratios based on Cochran s Q-statistic Elena Kulinskaya 1* and Michael B Dollinger 2 * Correspondence: e.kulinskaya@uea.ac.uk

More information

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Biometrical Journal 42 (2000) 1, 59±69 Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Kung-Jong Lui

More information

Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)

Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure) Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse Meta-Analysis Today s outline P-value

More information

A simulation study comparing properties of heterogeneity measures in meta-analyses

A simulation study comparing properties of heterogeneity measures in meta-analyses STATISTICS IN MEDICINE Statist. Med. 2006; 25:4321 4333 Published online 21 September 2006 in Wiley InterScience (www.interscience.wiley.com).2692 A simulation study comparing properties of heterogeneity

More information

A note on R 2 measures for Poisson and logistic regression models when both models are applicable

A note on R 2 measures for Poisson and logistic regression models when both models are applicable Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer

More information

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO

More information

Exact unconditional tests for a 2 2 matched-pairs design

Exact unconditional tests for a 2 2 matched-pairs design Statistical Methods in Medical Research 2003; 12: 91^108 Exact unconditional tests for a 2 2 matched-pairs design RL Berger Statistics Department, North Carolina State University, Raleigh, NC, USA and

More information

Meta-analysis. 21 May Per Kragh Andersen, Biostatistics, Dept. Public Health

Meta-analysis. 21 May Per Kragh Andersen, Biostatistics, Dept. Public Health Meta-analysis 21 May 2014 www.biostat.ku.dk/~pka Per Kragh Andersen, Biostatistics, Dept. Public Health pka@biostat.ku.dk 1 Meta-analysis Background: each single study cannot stand alone. This leads to

More information

Asymptotic efficiency of general noniterative estimators of common relative risk

Asymptotic efficiency of general noniterative estimators of common relative risk Biometrika (1981), 68, 2, pp. 526-30 525 Printed in Great Britain Asymptotic efficiency of general noniterative estimators of common relative risk BY MARKKU NTJRMINEN Department of Epidemiology and Biometry,

More information

The effect of nonzero second-order interaction on combined estimators of the odds ratio

The effect of nonzero second-order interaction on combined estimators of the odds ratio Biometrika (1978), 65, 1, pp. 191-0 Printed in Great Britain The effect of nonzero second-order interaction on combined estimators of the odds ratio BY SONJA M. MCKINLAY Department of Mathematics, Boston

More information

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS Ivy Liu and Dong Q. Wang School of Mathematics, Statistics and Computer Science Victoria University of Wellington New Zealand Corresponding

More information

Suppose that we are concerned about the effects of smoking. How could we deal with this?

Suppose that we are concerned about the effects of smoking. How could we deal with this? Suppose that we want to study the relationship between coffee drinking and heart attacks in adult males under 55. In particular, we want to know if there is an association between coffee drinking and heart

More information

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

Modified Large Sample Confidence Intervals for Poisson Distributions: Ratio, Weighted Average and Product of Means

Modified Large Sample Confidence Intervals for Poisson Distributions: Ratio, Weighted Average and Product of Means Modified Large Sample Confidence Intervals for Poisson Distributions: Ratio, Weighted Average and Product of Means K. KRISHNAMOORTHY a, JIE PENG b AND DAN ZHANG a a Department of Mathematics, University

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018 Sample Size Re-estimation in Clinical Trials: Dealing with those unknowns Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj University of Kyoto,

More information

Heterogeneity issues in the meta-analysis of cluster randomization trials.

Heterogeneity issues in the meta-analysis of cluster randomization trials. Western University Scholarship@Western Electronic Thesis and Dissertation Repository June 2012 Heterogeneity issues in the meta-analysis of cluster randomization trials. Shun Fu Chen The University of

More information

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen Harvard University Harvard University Biostatistics Working Paper Series Year 2014 Paper 175 A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome Eric Tchetgen Tchetgen

More information

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Libraries 1997-9th Annual Conference Proceedings ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Eleanor F. Allan Follow this and additional works at: http://newprairiepress.org/agstatconference

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

A new strategy for meta-analysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston

A new strategy for meta-analysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston A new strategy for meta-analysis of continuous covariates in observational studies with IPD Willi Sauerbrei & Patrick Royston Overview Motivation Continuous variables functional form Fractional polynomials

More information

Statistics and Probability Letters. Using randomization tests to preserve type I error with response adaptive and covariate adaptive randomization

Statistics and Probability Letters. Using randomization tests to preserve type I error with response adaptive and covariate adaptive randomization Statistics and Probability Letters ( ) Contents lists available at ScienceDirect Statistics and Probability Letters journal homepage: wwwelseviercom/locate/stapro Using randomization tests to preserve

More information

Describing Stratified Multiple Responses for Sparse Data

Describing Stratified Multiple Responses for Sparse Data Describing Stratified Multiple Responses for Sparse Data Ivy Liu School of Mathematical and Computing Sciences Victoria University Wellington, New Zealand June 28, 2004 SUMMARY Surveys often contain qualitative

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Calculating Effect-Sizes. David B. Wilson, PhD George Mason University

Calculating Effect-Sizes. David B. Wilson, PhD George Mason University Calculating Effect-Sizes David B. Wilson, PhD George Mason University The Heart and Soul of Meta-analysis: The Effect Size Meta-analysis shifts focus from statistical significance to the direction and

More information

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF pp: --2 (col.fig.: Nil) STATISTICS IN MEDICINE Statist. Med. 2004; 2:000 000 (DOI: 0.002/sim.8) PROD. TYPE: COM ED: Chandra PAGN: Vidya -- SCAN: Nil Simple improved condence intervals for comparing matched

More information

Confidence Intervals. Contents. Technical Guide

Confidence Intervals. Contents. Technical Guide Technical Guide Confidence Intervals Contents Introduction Software options Directory of methods 3 Appendix 1 Byar s method 6 Appendix χ exact method 7 Appendix 3 Wilson Score method 8 Appendix 4 Dobson

More information

Beta-binomial model for meta-analysis of odds ratios

Beta-binomial model for meta-analysis of odds ratios Research Article Received 9 June 2016, Accepted 3 January 2017 Published online in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.7233 Beta-binomial model for meta-analysis of odds ratios

More information

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( )

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( ) COMSTA 28 pp: -2 (col.fig.: nil) PROD. TYPE: COM ED: JS PAGN: Usha.N -- SCAN: Bindu Computational Statistics & Data Analysis ( ) www.elsevier.com/locate/csda Transformation approaches for the construction

More information

Reports of the Institute of Biostatistics

Reports of the Institute of Biostatistics Reports of the Institute of Biostatistics No 02 / 2008 Leibniz University of Hannover Natural Sciences Faculty Title: Properties of confidence intervals for the comparison of small binomial proportions

More information

Introduction to Statistical Data Analysis Lecture 4: Sampling

Introduction to Statistical Data Analysis Lecture 4: Sampling Introduction to Statistical Data Analysis Lecture 4: Sampling James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis 1 / 30 Introduction

More information

Asymptotic equivalence of paired Hotelling test and conditional logistic regression

Asymptotic equivalence of paired Hotelling test and conditional logistic regression Asymptotic equivalence of paired Hotelling test and conditional logistic regression Félix Balazard 1,2 arxiv:1610.06774v1 [math.st] 21 Oct 2016 Abstract 1 Sorbonne Universités, UPMC Univ Paris 06, CNRS

More information

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Journal of Modern Applied Statistical Methods Volume 4 Issue Article 8 --5 Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Sudhir R. Paul University of

More information

A simulation study for comparing testing statistics in response-adaptive randomization

A simulation study for comparing testing statistics in response-adaptive randomization RESEARCH ARTICLE Open Access A simulation study for comparing testing statistics in response-adaptive randomization Xuemin Gu 1, J Jack Lee 2* Abstract Background: Response-adaptive randomizations are

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Measuring agreement in method comparison studies

Measuring agreement in method comparison studies Statistical Methods in Medical Research 1999; 8: 135±160 Measuring agreement in method comparison studies J Martin Bland Department of Public Health Sciences, St George's Hospital Medical School, London,

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES

TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES Sankhyā : The Indian Journal of Statistics 2001, Volume 63, Series B, Pt. 3, pp 298-310 TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES By JOACHIM HARTUNG and

More information

E509A: Principle of Biostatistics. GY Zou

E509A: Principle of Biostatistics. GY Zou E509A: Principle of Biostatistics (Effect measures ) GY Zou gzou@robarts.ca We have discussed inference procedures for 2 2 tables in the context of comparing two groups. Yes No Group 1 a b n 1 Group 2

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Strati cation in Multivariate Modeling

Strati cation in Multivariate Modeling Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen

More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Andrew A. Neath Department of Mathematics and Statistics; Southern Illinois University Edwardsville; Edwardsville, IL,

More information

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power I: Binary Outcomes James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power Principles: Sample size calculations are an essential part of study design Consider

More information

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK Advanced Statistical

More information

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

Statistical Methods in Clinical Trials Categorical Data

Statistical Methods in Clinical Trials Categorical Data Statistical Methods in Clinical Trials Categorical Data Types of Data quantitative Continuous Blood pressure Time to event Categorical sex qualitative Discrete No of relapses Ordered Categorical Pain level

More information

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data

Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725±732, 2001 Non-parametric Tests for the Comparison of Point Processes Based

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Sample Size Determination

Sample Size Determination Sample Size Determination 018 The number of subjects in a clinical study should always be large enough to provide a reliable answer to the question(s addressed. The sample size is usually determined by

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

Additive and multiplicative models for the joint effect of two risk factors

Additive and multiplicative models for the joint effect of two risk factors Biostatistics (2005), 6, 1,pp. 1 9 doi: 10.1093/biostatistics/kxh024 Additive and multiplicative models for the joint effect of two risk factors A. BERRINGTON DE GONZÁLEZ Cancer Research UK Epidemiology

More information

Florida State University Libraries

Florida State University Libraries Florida State University Libraries Electronic Theses, Treatises and Dissertations The Graduate School 2011 Individual Patient-Level Data Meta- Analysis: A Comparison of Methods for the Diverse Populations

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

More information


