Additive and multiplicative models for the joint effect of two risk factors
Biostatistics (2005), 6, 1, pp. 1–9
doi: /biostatistics/kxh024
Biostatistics Vol. 6 No. 1 © Oxford University Press 2005; all rights reserved.

A. BERRINGTON DE GONZÁLEZ
Cancer Research UK Epidemiology Unit, University of Oxford, Gibson Building, Radcliffe Infirmary, Oxford, OX2 6HE, UK
amy.berrington@cancer.org.uk

D. R. COX
Nuffield College, University of Oxford, OX1 1NF, UK
david.cox@nuffield.ox.ac.uk

To whom correspondence should be addressed.

SUMMARY

Simple tests are given for consistency of the data with additive and with multiplicative effects of two risk factors on a binary outcome. A combination of the procedures will show whether data are consistent with neither, one or both of the models of no additive or no multiplicative interaction. Implications for the size of the study needed to detect differences between the models are also addressed. Because of the simple form of the test statistics, combination of evidence from different studies or strata is straightforward. Illustration of how the method could be extended to data from a 2 × R × C table is also given.

Keywords: Case-control studies; Cohort studies; Interaction; Multiplicative; Additive.

1. INTRODUCTION

In its statistical meaning, interaction of two risk factors requires departure from additivity in their effect on outcome. We concentrate on two binary risk factors with outcome variable the occurrence or non-occurrence of a rare condition and with their interaction as the primary focus of interest. Let $\theta_{ij}$ denote the probability of occurrence when the two risk factors are at levels $i, j$, where $i, j = 0, 1$. For convenience, we take (0, 0) as a baseline condition for some of the discussion, although this special choice has no impact on the conclusions. Two different representations of the additivity of effect are

$$\theta_{10} = \theta_{00} + \alpha_A, \quad \theta_{01} = \theta_{00} + \beta_A, \quad \theta_{11} = \theta_{00} + \alpha_A + \beta_A \qquad (1)$$

and

$$\log\theta_{10} = \log\theta_{00} + \alpha_M, \quad \log\theta_{01} = \log\theta_{00} + \beta_M, \quad \log\theta_{11} = \log\theta_{00} + \alpha_M + \beta_M. \qquad (2)$$

Equation (2) can equivalently be written

$$\theta_{10} = \theta_{00}\lambda_M, \quad \theta_{01} = \theta_{00}\psi_M, \quad \theta_{11} = \theta_{00}\lambda_M\psi_M. \qquad (3)$$
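As a small numerical illustration of how models (1) and (3) differ, the sketch below computes the joint-exposure probability $\theta_{11}$ implied by each model from $\theta_{00}$, $\theta_{10}$ and $\theta_{01}$. The function names and the probabilities used are hypothetical and are not taken from the paper.

```python
def joint_risk_additive(theta00, theta10, theta01):
    """theta11 implied by model (1), i.e. no additive interaction."""
    # theta11 = theta00 + alpha_A + beta_A, with alpha_A = theta10 - theta00
    # and beta_A = theta01 - theta00.
    return theta10 + theta01 - theta00


def joint_risk_multiplicative(theta00, theta10, theta01):
    """theta11 implied by model (3), i.e. no multiplicative interaction."""
    # theta11 = theta00 * lambda_M * psi_M, with lambda_M = theta10 / theta00
    # and psi_M = theta01 / theta00.
    return theta10 * theta01 / theta00


# Hypothetical rare-outcome probabilities (illustration only).
theta00, theta10, theta01 = 0.001, 0.002, 0.003
print("additive model:      ", joint_risk_additive(theta00, theta10, theta01))
print("multiplicative model:", joint_risk_multiplicative(theta00, theta10, theta01))
```

With relative risks of 2 and 3 for the single exposures, the two models imply joint relative risks of 4 and 6 respectively, which is the kind of difference the tests in the paper are designed to detect.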
Models (1) and (2) respectively define no additive interaction, $H_{A0}$, and no multiplicative interaction, $H_{M0}$. Both (1) and (2) are used in the epidemiological and other literature. Additive models may have a direct public health interpretation in that for a large population of individuals the difference in the numbers of positive outcomes for, say, $i = 1, j = 0$ as compared with the numbers had the individuals been in the baseline state $i = j = 0$ is proportional to $\alpha_A$. Advantages of the multiplicative form are that comparisons are summarized in simple ratios, often not very different from unity, which, moreover, are often relatively stable across populations.

From a formal point of view (1) could be generalized to

$$g(\theta_{10}) = g(\theta_{00}) + \alpha_G, \quad g(\theta_{01}) = g(\theta_{00}) + \beta_G, \quad g(\theta_{11}) = g(\theta_{00}) + \alpha_G + \beta_G, \qquad (4)$$

where $g(\theta)$ is a suitable monotonic function of $\theta$, for example a power. To be scientifically fruitful, however, the function $g(\theta)$ would have to be reasonably easily interpreted and this restricts the choice appreciably. In the present paper we consider only the forms (1) and (2). For a more general discussion of the statistical aspects of interaction see Cox (1984).

2. ANALYSIS OF EMPIRICAL DATA

For empirical data, the forms (1) and (2) may need to be compared. The data may be consistent with none, one or both of the models. There are various ways in which this issue can be tackled. One is to calculate a Bayes factor aiming to be an effective likelihood ratio for the model comparison. A second (Aranda-Ordaz, 1981) is to embed the models in a family characterized by a parameter, $\eta$, say, to estimate the value of $\eta$ and to check for consistency with the values corresponding to (1) and (2).
The third approach, and the one adopted here, is to provide two tests of significance, one for $H_{A0}$, sensitive for departures in the direction of the multiplicative interaction model, and the other for $H_{M0}$, sensitive for departures in the direction of the additive interaction model. There result two p-values from which one can assess the consistency with both, just one, or neither model. We regard this as conceptually the simplest and the most readily interpreted approach.

3. SENSITIVITY

A question of general interest concerns the amount of data likely to be needed to distinguish between $H_{A0}$ and $H_{M0}$. This requires study of the power of the associated tests. Formulation of power requirements demands several inevitably arbitrary choices and therefore approximate calculation of power is entirely adequate for most purposes. For this we use the following simplifying result.

Suppose that $T$ is a test statistic for the null hypothesis $H_0$ which, under $H_0$, is approximately normally distributed with zero mean and variance $\sigma_0^2/n$, where $n$ is a sample size. Suppose also that under the alternative hypothesis of interest $T$ is distributed with median approximately $\mu$. In fact we assume typically that $T$ is approximately symmetrically distributed with mean $\mu$. Then power of 50 per cent is approximately achieved for a one-sided test at level of significance $\epsilon$ if

$$\mu = k_\epsilon \sigma_0/\sqrt{n},$$

that is if

$$n = k_\epsilon^2 \sigma_0^2/\mu^2. \qquad (5)$$

Here $k_\epsilon$ is the upper $\epsilon$ point of the standard normal distribution.
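Equation (5) can be evaluated directly. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` for the normal quantile; the function names and the numerical inputs are illustrative only. The optional `power` argument applies the usual replacement of $k_\epsilon$ by $k_\epsilon + k_\beta$ for power $1 - \beta$, which reduces to (5) when the power is 50 per cent.

```python
from statistics import NormalDist


def upper_point(eps):
    """Upper eps point k_eps of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - eps)


def sample_size(mu, sigma0, eps, power=0.5):
    """Approximate n from equation (5): n = k^2 * sigma0^2 / mu^2.

    mu     : mean of T under the alternative
    sigma0 : T has variance sigma0^2 / n under the null hypothesis
    eps    : one-sided significance level
    power  : 0.5 reproduces (5); for power 1 - beta, k_beta is added to k_eps
    """
    k = upper_point(eps) + upper_point(1.0 - power)  # second term is 0 at power 0.5
    return (k * sigma0 / mu) ** 2


# Hypothetical inputs, for illustration only.
print(sample_size(mu=0.1, sigma0=1.0, eps=0.025))
print(sample_size(mu=0.1, sigma0=1.0, eps=0.025, power=0.8))
```

The second call shows how much larger the sample must be when 80 per cent rather than 50 per cent power is demanded.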
If, for example for comparison with other investigations, it is unavoidable to use power $1 - \beta$, then $k_\epsilon$ should be replaced by $k_\epsilon + k_\beta$; the extra approximation involved is that the variance of the statistic under the alternative differs little from that under the null hypothesis. The requirement of 50 per cent power will be used, however, throughout this paper as it is likely to be adequate for most purposes.

4. COHORT STUDIES

4.1 Additive model

In a cohort study of two risk factors for a disease, such as a gene and an environmental exposure, if there are $r_{ij}$ deaths out of $n_{ij}$ individuals ($i, j = 0, 1$), then the estimated risk is $\hat\rho_{ij} = r_{ij}/n_{ij}$ with approximately $\mathrm{var}(\hat\rho_{ij}) = \rho_{ij}/n_{ij}$ and $\mathrm{var}(\log\hat\rho_{ij}) = 1/(n_{ij}\rho_{ij})$ for rare conditions, i.e. small $\rho_{ij}$. We test the hypothesis that the effects are additive, i.e. there is no evidence of additive interaction between the two risk factors, using

$$T_A = \frac{\hat\rho_{11} - \hat\rho_{10} - \hat\rho_{01} + \hat\rho_{00}}{\sqrt{\sum(\hat\rho_{ij}/n_{ij})}}. \qquad (6)$$

In general

$$E(T_A) \approx \frac{\rho_{11} - \rho_{10} - \rho_{01} + \rho_{00}}{\sqrt{\sum \rho_{ij}/n_{ij}}} \qquad (7)$$

and there will be approximately 50% power where $E(T_A)$ is equal to $k_\epsilon$, the upper $\epsilon$ point of the standard normal distribution. If $p_{ij}$ is the probability of being exposed to levels $i$ and $j$ of the two risk factors then $n_{ij} = np_{ij}$ and this implies

$$\rho_{11} - \rho_{10} - \rho_{01} + \rho_{00} = k_\epsilon \sqrt{\sum(\rho_{ij}/p_{ij})}\,/\sqrt{n}. \qquad (8)$$

If the data were actually generated from a multiplicative model without interaction then, taking (0, 0) as a reference level, we can write this multiplicative model in the form

$$\rho_{00} = \rho_0, \quad \rho_{01} = \rho_0\lambda, \quad \rho_{10} = \rho_0\psi, \quad \rho_{11} = \rho_0\lambda\psi. \qquad (9)$$

Now suppose we want to know the expected number of deaths needed in the baseline group in order to detect this form of departure from an additive model. If we define this number as $r_0^M = np_{00}\rho_0$, the condition for 50% power becomes

$$\sqrt{n\rho_0}\,(\lambda - 1)(\psi - 1) = k_\epsilon \sqrt{1/p_{00} + \lambda/p_{01} + \psi/p_{10} + \lambda\psi/p_{11}} \qquad (10)$$

so that

$$r_0^M = \frac{k_\epsilon^2}{(\lambda - 1)^2(\psi - 1)^2}\left(1 + \frac{\lambda p_{00}}{p_{01}} + \frac{\psi p_{00}}{p_{10}} + \frac{\lambda\psi p_{00}}{p_{11}}\right). \qquad (11)$$

For example, if the exposure probabilities are all equal ($p_{00} = p_{01} = p_{10} = p_{11}$), the relative risks associated with each exposure are both equal to two ($\lambda = \psi = 2$) and $k_\epsilon = 2$, then

$$r_0^M = 36, \qquad (12)$$

i.e. approximately 36 deaths would be required in the baseline (unexposed) group to achieve 50% power.
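The worked value in (12) can be checked numerically. The sketch below evaluates equation (11); the function name and the dictionary layout for the exposure probabilities $p_{ij}$ are choices made for this illustration, not part of the paper.

```python
def baseline_deaths_vs_additive(lam, psi, p, k_eps=2.0):
    """Expected baseline deaths r0^M from equation (11).

    lam, psi : relative risks in the multiplicative model (9),
               rho_01 = rho_0 * lam and rho_10 = rho_0 * psi
    p        : dict mapping (i, j) to the exposure probability p_ij
    k_eps    : upper eps point of the standard normal distribution
    """
    p00, p01, p10, p11 = p[0, 0], p[0, 1], p[1, 0], p[1, 1]
    bracket = 1.0 + lam * p00 / p01 + psi * p00 / p10 + lam * psi * p00 / p11
    return k_eps ** 2 * bracket / ((lam - 1.0) ** 2 * (psi - 1.0) ** 2)


# Equal exposure probabilities, lambda = psi = 2, k_eps = 2, as in (12).
equal_p = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(baseline_deaths_vs_additive(2.0, 2.0, equal_p))  # 36.0, as in (12)
```

Changing `p` to unequal exposure probabilities shows how quickly the requirement grows when one exposure category is rare.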
Alternatively, we may prefer to know what total number of deaths would be required in order to be able to detect this form of departure from the additive no interaction model. If the expected number of deaths in total is $t^M$ then our requirement for 50% power is

$$t^M = \frac{k_\epsilon^2}{(\lambda - 1)^2(\psi - 1)^2}\left(\frac{1}{p_{00}} + \frac{\lambda}{p_{01}} + \frac{\psi}{p_{10}} + \frac{\lambda\psi}{p_{11}}\right)(p_{00} + \lambda p_{01} + \psi p_{10} + \lambda\psi p_{11}). \qquad (13)$$

Note that in the symmetric case where $p_{ij}$ = constant and $\lambda = \psi$,

$$r_0^M = k_\epsilon^2\,\frac{(\lambda + 1)^2}{(\lambda - 1)^4}, \qquad (14)$$

$$t^M = k_\epsilon^2\,\frac{(\lambda + 1)^4}{(\lambda - 1)^4}. \qquad (15)$$

4.2 Multiplicative model

Now suppose we test consistency with the multiplicative model without interaction by the statistic

$$T_M = \frac{\log\hat\rho_{11} - \log\hat\rho_{10} - \log\hat\rho_{01} + \log\hat\rho_{00}}{\sqrt{1/(n_{00}\rho_{00}) + 1/(n_{01}\rho_{01}) + 1/(n_{10}\rho_{10}) + 1/(n_{11}\rho_{11})}} \qquad (16)$$

with evidence of departure in the direction of the additive model if $T_M < -k_\epsilon$. Again 50% power is achieved when

$$-k_\epsilon = \frac{\sqrt{n}\,\log\{(\rho_{11}\rho_{00})/(\rho_{01}\rho_{10})\}}{\sqrt{1/(p_{00}\rho_{00}) + 1/(p_{01}\rho_{01}) + 1/(p_{10}\rho_{10}) + 1/(p_{11}\rho_{11})}}. \qquad (17)$$

We write an additive model without interaction, with (0, 0) as baseline,

$$\rho_{00} = \rho_0, \quad \rho_{01} = \rho_0(1 + \xi), \quad \rho_{10} = \rho_0(1 + \eta), \quad \rho_{11} = \rho_0(1 + \xi + \eta). \qquad (18)$$

Then, with $r_0^A = np_{00}\rho_{00}$,

$$r_0^A = k_\epsilon^2\left\{1 + \frac{p_{00}}{p_{01}(1 + \xi)} + \frac{p_{00}}{p_{10}(1 + \eta)} + \frac{p_{00}}{p_{11}(1 + \xi + \eta)}\right\}\left[\log\frac{1 + \xi + \eta}{(1 + \xi)(1 + \eta)}\right]^{-2}. \qquad (19)$$

With $p_{00} = p_{01} = p_{10} = p_{11}$, $k_\epsilon = 2$, $\xi = \eta = 1$, this gives

$$r_0^A = 110. \qquad (20)$$

The expected numbers of deaths needed in each category of exposure under the additive and multiplicative models without interaction are shown in Table 1. Note that in the symmetric case, $p_{ij}$ = constant and $\xi = \eta$,

$$r_0^A = 2k_\epsilon^2\,\frac{\xi^2 + 4\xi + 2}{(1 + \xi)(1 + 2\xi)}\left[\log\frac{1 + 2\xi}{(1 + \xi)^2}\right]^{-2} \qquad (21)$$

and that the total number of expected deaths is $4(1 + \xi)\,r_0^A$.
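Equations (19) and (21) can be evaluated and cross-checked against each other. The sketch below is illustrative only: the function names and the dictionary layout for $p_{ij}$ are assumptions of this example. For the setting of (20) it returns roughly 113, which corresponds to the published 110 when values are given to two working digits.

```python
import math


def baseline_deaths_vs_multiplicative(xi, eta, p, k_eps=2.0):
    """Expected baseline deaths r0^A from equation (19).

    xi, eta : excess relative risks in the additive model (18),
              rho_01 = rho_0 * (1 + xi) and rho_10 = rho_0 * (1 + eta)
    p       : dict mapping (i, j) to the exposure probability p_ij
    """
    p00, p01, p10, p11 = p[0, 0], p[0, 1], p[1, 0], p[1, 1]
    bracket = (1.0 + p00 / (p01 * (1.0 + xi)) + p00 / (p10 * (1.0 + eta))
               + p00 / (p11 * (1.0 + xi + eta)))
    log_term = math.log((1.0 + xi + eta) / ((1.0 + xi) * (1.0 + eta)))
    return k_eps ** 2 * bracket / log_term ** 2


def baseline_deaths_symmetric(xi, k_eps=2.0):
    """Symmetric special case, equation (21): equal p_ij and xi = eta."""
    ratio = (xi ** 2 + 4.0 * xi + 2.0) / ((1.0 + xi) * (1.0 + 2.0 * xi))
    log_term = math.log((1.0 + 2.0 * xi) / (1.0 + xi) ** 2)
    return 2.0 * k_eps ** 2 * ratio / log_term ** 2


equal_p = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
r0_a = baseline_deaths_vs_multiplicative(1.0, 1.0, equal_p)
print(r0_a)                            # general formula (19)
print(baseline_deaths_symmetric(1.0))  # closed form (21), same value
print(4.0 * (1.0 + 1.0) * r0_a)        # total expected deaths, 4(1 + xi) r0^A
```

Comparing this with the 36 baseline deaths of (12) shows that, in this symmetric setting, detecting departure from the multiplicative model needs about three times as many baseline deaths as detecting departure from the additive model.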
Tables 2 and 3 show examples of how the sample sizes to detect departures from an additive model without interaction in the multiplicative direction and a multiplicative model without interaction in the additive direction vary when the relative risk $\lambda = 2$ whilst $\psi$ is allowed to vary from 1.5 to 4 and the probability of being exposed to both risk factors, $p_{11}$, is allowed to vary from 0.05 to 0.3 whilst the other exposure probabilities are all equal ($p_{00} = p_{01} = p_{10}$).
Table 1. Expected number of deaths needed to detect departure from additive model in multiplicative direction ($r_0^M$) and from a multiplicative model in an additive direction ($r_0^A$)
[Table body not preserved in this transcription: entries for $r_0^M$ and $r_0^A$, rows indexed by $i$, columns by $j$.]

Table 2. Sample size required in the baseline group of a cohort study to detect departure from multiplicative model in the direction of an additive model (a)
[Table body not preserved in this transcription: entries indexed by $\psi$ and $p_{11}$.]
(a) In this and subsequent tables values are given to two working digits.

Table 3. Sample size required in the baseline group of a cohort study to detect departure from an additive model in the direction of a multiplicative model
[Table body not preserved in this transcription: entries indexed by $\psi = 1 + \xi$ and $p_{11}$.]

5. CASE-CONTROL STUDIES

5.1 Multiplicative model

A relatively minor change in the argument deals with (unmatched) case-control studies. Consider a single case-control study with one binary exposure with frequency $m_{rs}$; $r = 0$ (control), $r = 1$ (case); $s = 0$ (exposure −), $s = 1$ (exposure +). Then the log relative risk is $\hat\theta = \log\{(m_{11}m_{00})/(m_{01}m_{10})\}$ with asymptotically

$$\mathrm{var}(\hat\theta) = \sum 1/m_{rs} = 4/\tilde m, \qquad (22)$$

where $\tilde m$ is the harmonic mean frequency. The estimate of the relative risk $\hat\phi = e^{\hat\theta}$ has asymptotic variance

$$\mathrm{var}(\hat\phi) = 4\phi^2/\tilde m. \qquad (23)$$

Now suppose we have two exposures and let $m_{rij}$ ($r = 0$ (control), $r = 1$ (case)) be the frequency in exposure category $(i, j)$ for $i, j = 0, 1$. Write $\tilde m_{ij} = 2/(1/m_{0ij} + 1/m_{1ij})$ for the relevant harmonic mean frequency. Then with $\hat\gamma_{ij} = \log(m_{1ij}/m_{0ij})$, $\mathrm{var}(\hat\gamma_{ij}) = 2/\tilde m_{ij}$, consistency with a multiplicative no interaction model is tested by

$$T_M = \frac{\hat\gamma_{11} - \hat\gamma_{01} - \hat\gamma_{10} + \hat\gamma_{00}}{\sqrt{\sum 2/\tilde m_{ij}}} \qquad (24)$$

with evidence of departure in the direction of additivity if $T_M < -k_\epsilon$. Under a form additive for relative risk (without interaction), and arbitrarily taking (0, 0) as baseline, we can write

$$\gamma_{01} = \gamma_{00} + \log(1 + \alpha_{01}), \quad \gamma_{10} = \gamma_{00} + \log(1 + \alpha_{10}), \quad \gamma_{11} = \gamma_{00} + \log(1 + \alpha_{10} + \alpha_{01}), \qquad (25)$$

so that 50% power is achieved when

$$\log\left\{\frac{(1 + \alpha_{01})(1 + \alpha_{10})}{1 + \alpha_{01} + \alpha_{10}}\right\} = k_\epsilon \sqrt{\sum 2/\tilde m_{ij}}. \qquad (26)$$

Write $q_{ij} = \tilde m_{ij}/\sum \tilde m_{kl}$, so that $\sum q_{ij} = 1$ and $q_{ij}$ is the proportion of individuals in the risk category $(i, j)$, with cases and controls combined via a harmonic mean. We write $a = 2\sum \tilde m_{ij}/n$ where $n$ is the total number of individuals. In general $a \le 1$, with equality when numbers of cases and controls are almost the same cell by cell. Then the required $n$ is given by

$$n^A = 4k_\epsilon^2 a^{-1} \sum(1/q_{ij}) \left[\log\left\{\frac{(1 + \alpha_{01})(1 + \alpha_{10})}{1 + \alpha_{01} + \alpha_{10}}\right\}\right]^{-2}. \qquad (27)$$

Note that $\sum 1/q_{ij} \ge 16$.

5.2 Additive model

Consistency with an additive no interaction model can be tested by dividing $\hat\phi_{11} - \hat\phi_{10} - \hat\phi_{01} + \hat\phi_{00}$ by its estimated standard error, where $\hat\phi_{ij}$ is the estimated risk in exposure category $(i, j)$ relative to baseline (0, 0). The numerator is, however, proportional to the simpler statistic $m_{111}/m_{011} - m_{110}/m_{010} - m_{101}/m_{001} + m_{100}/m_{000}$, leading to the test statistic

$$T_A = \frac{m_{111}/m_{011} - m_{110}/m_{010} - m_{101}/m_{001} + m_{100}/m_{000}}{\hat\phi_{00}\sqrt{\sum(2\hat\phi_{ij}^2/\tilde m_{ij})}} = \frac{\hat\phi_{11} - \hat\phi_{10} - \hat\phi_{01} + \hat\phi_{00}}{\sqrt{\sum(2\hat\phi_{ij}^2/\tilde m_{ij})}}. \qquad (28)$$

Under a multiplicative model $\phi_{10} = \phi_{00}(1 + \beta_{10})$, $\phi_{01} = \phi_{00}(1 + \beta_{01})$, $\phi_{11} = \phi_{00}(1 + \beta_{10})(1 + \beta_{01})$. Then 50% power is achieved when

$$\phi_{00}^2\,\beta_{01}^2\,\beta_{10}^2 = k_\epsilon^2 \sum 2\phi_{ij}^2/\tilde m_{ij} \qquad (29)$$

and the total number of individuals is

$$n^M = \frac{4k_\epsilon^2}{a\,\beta_{10}^2\,\beta_{01}^2}\left\{\frac{1}{q_{00}} + \frac{(1 + \beta_{10})^2}{q_{10}} + \frac{(1 + \beta_{01})^2}{q_{01}} + \frac{(1 + \beta_{10})^2(1 + \beta_{01})^2}{q_{11}}\right\}. \qquad (30)$$

Note that in $n^A$, for given $n$, $q_{ij} = 1/4$ is optimal; in $n^M$ this is not quite the case but the main point is that a small $q_{ij}$ lowers sensitivity greatly, as is to be expected. There may not be control over this in design, however.
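The case-control sample sizes (27) and (30) can be computed in the same way as the cohort ones. The sketch below is a hedged illustration: the function names, the dictionary layout for $q_{ij}$ and the default values are assumptions of this example, and the inputs are hypothetical.

```python
import math


def n_vs_multiplicative(alpha01, alpha10, q, a=1.0, k_eps=2.0):
    """Total sample size n^A from equation (27).

    alpha01, alpha10 : excess relative risks in the additive form (25)
    q                : dict mapping (i, j) to the proportion q_ij (summing to 1)
    a                : 2 * sum of harmonic-mean frequencies / n, at most 1
    """
    sum_inv_q = sum(1.0 / q_ij for q_ij in q.values())
    log_term = math.log((1.0 + alpha01) * (1.0 + alpha10)
                        / (1.0 + alpha01 + alpha10))
    return 4.0 * k_eps ** 2 * sum_inv_q / (a * log_term ** 2)


def n_vs_additive(beta10, beta01, q, a=1.0, k_eps=2.0):
    """Total sample size n^M from equation (30)."""
    bracket = (1.0 / q[0, 0]
               + (1.0 + beta10) ** 2 / q[1, 0]
               + (1.0 + beta01) ** 2 / q[0, 1]
               + (1.0 + beta10) ** 2 * (1.0 + beta01) ** 2 / q[1, 1])
    return 4.0 * k_eps ** 2 * bracket / (a * beta10 ** 2 * beta01 ** 2)


# Hypothetical symmetric design: q_ij = 1/4, a = 1, k_eps = 2.
equal_q = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(n_vs_multiplicative(1.0, 1.0, equal_q))
print(n_vs_additive(1.0, 1.0, equal_q))
```

Reducing `q[1, 1]` while keeping the other three proportions equal illustrates the remark above that a small $q_{ij}$ lowers sensitivity greatly.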
Table 4. Sample size required for a case-control study to detect departure from a multiplicative model in the direction of an additive model
[Table body not preserved in this transcription: entries indexed by $\beta$ and $q_{11}$.]

Table 5. Sample size required in a case-control study to detect departure from an additive model in the direction of a multiplicative model
[Table body not preserved in this transcription: entries indexed by $\beta$ and $q_{11}$.]

In the symmetrical cases, $q_{ij} = 1/4$, $\alpha_{10} = \alpha_{01} = \alpha$ and $\beta_{01} = \beta_{10} = \beta$, equations (27) and (30) with $\sum 1/q_{ij} = 16$ reduce to

$$n^A = 64k_\epsilon^2 a^{-1}\left[\log\frac{(1 + \alpha)^2}{1 + 2\alpha}\right]^{-2}, \qquad (31)$$

$$n^M = \frac{16k_\epsilon^2}{a\beta^4}\left(1 + 2(1 + \beta)^2 + (1 + \beta)^4\right). \qquad (32)$$

Tables 4 and 5 show the required sample sizes for a case-control study to detect departures from a multiplicative and additive model for interaction, respectively. The odds ratio $\alpha$ is 2 whereas the odds ratio $\beta$ varies from 1.5 to 4. For these examples we have assumed that $a = 1$ and that $q_{00} = q_{01} = q_{10}$ whilst $q_{11}$ is allowed to vary from 0.05 to […].

6. EXAMPLE AND DISCUSSION

Znaor et al. (2003) investigated whether there was evidence of interaction between chewing tobacco and alcohol consumption with respect to the risk of oral cancer in a case-control study of Indian men. We reproduce the data for those men who did not smoke tobacco and calculate the crude odds ratios in a two-by-four table (see Table 6). The observed odds ratio for the joint effect of the two risk factors (44.1) was considerably greater than expected under an additive model without interaction (16.7) and slightly greater than expected under a multiplicative model without interaction (39.3). Here $T_A = 2.5$ suggests there is evidence of significant departure from the additive model in the multiplicative direction, but $T_M = 0.3$ confirms that there is no evidence of departure from the multiplicative model in an additive direction.

We have discussed only the simplified case of a single set of data. Because of the simple form of the test statistics, combination of evidence from independent studies or strata is straightforward.
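The two statistics used in this example, (24) and (28), can be computed from a two-by-four table of case and control counts along the following lines. This is a sketch only: the function name and the counts are hypothetical (they are not the Znaor et al. data, which are not preserved here), and $\hat\phi_{ij}$ is taken as the odds in cell $(i, j)$ divided by the baseline odds, so that $\hat\phi_{00} = 1$.

```python
import math


def interaction_tests(cases, controls):
    """Return (T_M, T_A) for a case-control study with two binary exposures.

    cases, controls : dicts mapping (i, j) to the counts m_1ij and m_0ij
    T_M follows equation (24); T_A follows the second form of equation (28).
    """
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    odds = {c: cases[c] / controls[c] for c in cells}
    # Harmonic mean frequency of cases and controls in each cell.
    hmean = {c: 2.0 / (1.0 / cases[c] + 1.0 / controls[c]) for c in cells}

    # Multiplicative no-interaction test: contrast of log odds.
    gamma = {c: math.log(odds[c]) for c in cells}
    contrast_m = gamma[1, 1] - gamma[0, 1] - gamma[1, 0] + gamma[0, 0]
    t_m = contrast_m / math.sqrt(sum(2.0 / hmean[c] for c in cells))

    # Additive no-interaction test: contrast of odds ratios relative to (0, 0).
    phi = {c: odds[c] / odds[0, 0] for c in cells}
    contrast_a = phi[1, 1] - phi[1, 0] - phi[0, 1] + phi[0, 0]
    t_a = contrast_a / math.sqrt(sum(2.0 * phi[c] ** 2 / hmean[c] for c in cells))
    return t_m, t_a


# Hypothetical counts whose odds ratios (2, 3 and 6) are exactly multiplicative.
cases = {(0, 0): 100, (0, 1): 200, (1, 0): 300, (1, 1): 600}
controls = {(0, 0): 100, (0, 1): 100, (1, 0): 100, (1, 1): 100}
t_m, t_a = interaction_tests(cases, controls)
print("T_M =", t_m)  # essentially zero: no departure from the multiplicative model
print("T_A =", t_a)  # positive: departure from additivity in the multiplicative direction
```

A value of $T_A$ above $k_\epsilon$ points to departure from the additive model in the multiplicative direction, while a value of $T_M$ below $-k_\epsilon$ points to departure from the multiplicative model in the additive direction.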
An important example of such a situation would be the one where adjustment for confounders was necessary. If the adjustments had been made by logistic regression then the variance of the test statistic would be somewhat greater than the Poisson variance and if, for example, the adjusted log relative risk is $\hat\theta_{ij}$ then the statistic to test for multiplicative interaction in a case-control study becomes

$$T_M = \frac{\hat\theta_{11} - \hat\theta_{01} - \hat\theta_{10} + \hat\theta_{00}}{\sqrt{\sum \mathrm{var}(\hat\theta_{ij})}}. \qquad (33)$$

Table 6. Estimated odds ratios from Znaor et al. (2003)
Columns: Chewing tobacco; Alcohol; Cases; Controls; Odds ratio; var[ln(OR)]
Rows (Chewing tobacco, Alcohol): (No, No); (No, Yes); (Yes, No); (Yes, Yes)
[Table values not preserved in this transcription.]

Table 7. Odds ratios adjusted for age, centre and education level from Znaor et al. (2003)
Columns: Chewing tobacco; Alcohol; Odds ratio*; var[ln(OR*)]
Rows (Chewing tobacco, Alcohol): (No, No), odds ratio 1; (No, Yes); (Yes, No); (Yes, Yes)
[Remaining table values not preserved in this transcription.]

The odds ratios actually published by Znaor et al. had been adjusted for age, centre and education level. These adjustments reduced the odds ratios for the effect of chewing tobacco and increased their standard errors (see Table 7). Therefore, when the tests for interaction are conducted on the adjusted data there is no evidence of departure from the multiplicative or the additive models without interaction ($T_M = 0.03$ and $T_A = 1.05$). Inclusion of adjustments in sample size calculations could be made by assuming that the adjustment increases the variance by a constant $c$ across all strata, in which case the sample size estimates are increased by a factor of $1 + c$.

Finally, extension of the method to the situation of interaction in a 2 × R × C table could be approached by extracting a single degree of freedom for an initial test. This would be more sensitive than an examination of independence across the R × C contingency table (Yates, 1948). For example, in Znaor et al. there were actually two levels of chewing: with and without tobacco.
Whether the increase in risk with increasing level of chewing differed between ever and never alcohol drinkers (2 × 2 × 3) could be examined by assigning the levels of chewing (never, without tobacco and with tobacco) the scores −3, 1, 2; then a test statistic for departure from the multiplicative model in the additive direction would be

$$T_M = \frac{(2\hat\theta_{13} + \hat\theta_{12} - 3\hat\theta_{11}) - (2\hat\theta_{03} + \hat\theta_{02} - 3\hat\theta_{01})}{\sqrt{4\,\mathrm{var}(\hat\theta_{13}) + \mathrm{var}(\hat\theta_{12}) + 9\,\mathrm{var}(\hat\theta_{11}) + 4\,\mathrm{var}(\hat\theta_{03}) + \mathrm{var}(\hat\theta_{02}) + 9\,\mathrm{var}(\hat\theta_{01})}} \qquad (34)$$

with evidence of departure in the direction of additivity if $T_M < -k_\epsilon$. Again these calculations could include adjustments if necessary with the use of the same strategy as described above for the 2 × 2 × 2 table.

REFERENCES

ARANDA-ORDAZ, F. J. (1981). On two families of transformations to additivity for binary response data. Biometrika 68.

BOTTO, L. D. AND KHOURY, M. J. (2001). Commentary: facing the challenge of gene-environment interaction: the two-by-four table and beyond. American Journal of Epidemiology 153.

COX, D. R. (1984). Interaction. International Statistical Review 52.

SIEMIATYCKI, J. AND THOMAS, D. C. (1981). Biological models and statistical interactions: an example from multistage carcinogenesis. International Journal of Epidemiology 10.

YATES, F. (1948). The analysis of contingency tables. Biometrika 35.

ZNAOR, A., BRENNAN, P., GAJALAKSHMI, V., MATHEW, A., SHANTA, V., VARGHESE, C. AND BOFFETTA, P. (2003). Independent and combined effects of tobacco smoking, chewing and alcohol drinking on the risk of oral, pharyngeal and esophageal cancers in Indian men. International Journal of Cancer 105.

[Received January 15, 2004; first revision June 21, 2004; second revision July 15, 2004; accepted for publication August 19, 2004]
Power and Sample Size Calculations with the Additive Hazards Model
Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationLecture 25. Ingo Ruczinski. November 24, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University
Lecture 25 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University November 24, 2015 1 2 3 4 5 6 7 8 9 10 11 1 Hypothesis s of homgeneity 2 Estimating risk
More informationFAILURE-TIME WITH DELAYED ONSET
REVSTAT Statistical Journal Volume 13 Number 3 November 2015 227 231 FAILURE-TIME WITH DELAYED ONSET Authors: Man Yu Wong Department of Mathematics Hong Kong University of Science and Technology Hong Kong
More informationUnbiased estimation of exposure odds ratios in complete records logistic regression
Unbiased estimation of exposure odds ratios in complete records logistic regression Jonathan Bartlett London School of Hygiene and Tropical Medicine www.missingdata.org.uk Centre for Statistical Methodology
More informationMAS3301 / MAS8311 Biostatistics Part II: Survival
MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the
More informationGeneral Regression Model
Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical
More informationTests for Two Correlated Proportions in a Matched Case- Control Design
Chapter 155 Tests for Two Correlated Proportions in a Matched Case- Control Design Introduction A 2-by-M case-control study investigates a risk factor relevant to the development of a disease. A population
More informationPart IV Statistics in Epidemiology
Part IV Statistics in Epidemiology There are many good statistical textbooks on the market, and we refer readers to some of these textbooks when they need statistical techniques to analyze data or to interpret
More informationStat 642, Lecture notes for 04/12/05 96
Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal
More informationPerson-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data
Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time
More informationProbability and Probability Distributions. Dr. Mohammed Alahmed
Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about
More informationConfounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning
Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK Advanced Statistical
More informationLab 8. Matched Case Control Studies
Lab 8 Matched Case Control Studies Control of Confounding Technique for the control of confounding: At the design stage: Matching During the analysis of the results: Post-stratification analysis Advantage
More informationThe identification of synergism in the sufficient-component cause framework
* Title Page Original Article The identification of synergism in the sufficient-component cause framework Tyler J. VanderWeele Department of Health Studies, University of Chicago James M. Robins Departments
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More informationLecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio
Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK March 3-5,
More informationBIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More informationFaculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics
Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial
More informationTest of Association between Two Ordinal Variables while Adjusting for Covariates
Test of Association between Two Ordinal Variables while Adjusting for Covariates Chun Li, Bryan Shepherd Department of Biostatistics Vanderbilt University May 13, 2009 Examples Amblyopia http://www.medindia.net/
More informationTests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X
Chapter 157 Tests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X Introduction This procedure calculates the power and sample size necessary in a matched case-control study designed
More informationIgnoring the matching variables in cohort studies - when is it valid, and why?
Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association
More informationOne-stage dose-response meta-analysis
One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and
More informationForecasting with the age-period-cohort model and the extended chain-ladder model
Forecasting with the age-period-cohort model and the extended chain-ladder model By D. KUANG Department of Statistics, University of Oxford, Oxford OX1 3TG, U.K. di.kuang@some.ox.ac.uk B. Nielsen Nuffield
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationThe identi cation of synergism in the su cient-component cause framework
The identi cation of synergism in the su cient-component cause framework By TYLER J. VANDEREELE Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637
More informationDescribing Contingency tables
Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds
More informationHarvard University. Harvard University Biostatistics Working Paper Series
Harvard University Harvard University Biostatistics Working Paper Series Year 2015 Paper 192 Negative Outcome Control for Unobserved Confounding Under a Cox Proportional Hazards Model Eric J. Tchetgen
More informationThe distinction between a biologic interaction or synergism
ORIGINAL ARTICLE The Identification of Synergism in the Sufficient-Component-Cause Framework Tyler J. VanderWeele,* and James M. Robins Abstract: Various concepts of interaction are reconsidered in light
More informationTESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN
Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO
More informationDoes low participation in cohort studies induce bias? Additional material
Does low participation in cohort studies induce bias? Additional material Content: Page 1: A heuristic proof of the formula for the asymptotic standard error Page 2-3: A description of the simulation study
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationBiostatistics Advanced Methods in Biostatistics IV
Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu 1 / 35 Tip + Paper Tip Meet with seminar speakers. When you go on
More informationStatistics in medicine
Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial
More informationPairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion
Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial
More informationST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios
ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories
More informationSurvival Analysis I (CHL5209H)
Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really
More informationPrevious lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.
Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative
More informationPubH 7405: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION
PubH 745: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION Let Y be the Dependent Variable Y taking on values and, and: π Pr(Y) Y is said to have the Bernouilli distribution (Binomial with n ).
More informationThe miss rate for the analysis of gene expression data
Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,
More informationJournal of Biostatistics and Epidemiology
Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department
More informationLecture 2: Poisson and logistic regression
Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction
More informationSelection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models
Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an
Missing covariate data in matched case-control studies: Do the usual paradigms apply?
Missing covariate data in matched case-control studies: Do the usual paradigms apply? Bryan Langholz USC Department of Preventive Medicine Joint work with Mulugeta Gebregziabher Larry Goldstein Mark Huberman
ij i j m ij n ij m ij n i j Suppose we denote the row variable by X and the column variable by Y ; We can then re-write the above expression as
page1 Loglinear Models Loglinear models are a way to describe association and interaction patterns among categorical variables. They are commonly used to model cell counts in contingency tables. These
STATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 × 2 tables, statistical independence is equivalent
Contingency Tables Part One 1
Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview
Identification of the age-period-cohort model and the extended chain ladder model
Identification of the age-period-cohort model and the extended chain ladder model By D. KUANG Department of Statistics, University of Oxford, Oxford OX TG, U.K. di.kuang@some.ox.ac.uk B. Nielsen Nuffield
STAT 5500/6500 Conditional Logistic Regression for Matched Pairs
STAT 5500/6500 Conditional Logistic Regression for Matched Pairs The data for the tutorial came from support.sas.com, The LOGISTIC Procedure: Conditional Logistic Regression for Matched Pairs Data :: SAS/STAT(R)
Lecture 5: Poisson and logistic regression
Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction
Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018
Confounding, Non-Collapsibility, Precision, and Power Statistics 211 - Statistical Methods II Presented February 27, 2018 Dan Gillen Department of Statistics University of California, Irvine Discussion.1 Various definitions of
STAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
Equivalence of random-effects and conditional likelihoods for matched case-control studies
Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and
Marginal Screening and Post-Selection Inference
Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2
Sensitivity analysis and distributional assumptions
Sensitivity analysis and distributional assumptions Tyler J. VanderWeele Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA vanderweele@uchicago.edu
Basic Medical Statistics Course
Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable
Chapter 2: Describing Contingency Tables - II
Chapter 2: Describing Contingency Tables - II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
Applied Epidemiologic Analysis
Patricia Cohen, Ph.D. Henian Chen, M.D., Ph. D. Teaching Assistants Julie Kranick Chelsea Morroni Sylvia Taylor Judith Weissman Lecture 13 Interactional questions and analyses Goals: To understand how
Lecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continuous data: comparing means Analysis of variance Binary data: comparing proportions
Bayesian Hierarchical Models
Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology
High-Throughput Sequencing Course
High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an
STAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation
Categorical Data Analysis Chapter 3
Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameters Consider the odds ratio in the 2x2 table,
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 × 2 tables, statistical independence is equivalent to a population
Chapter 2: Describing Contingency Tables - I
Chapter 2: Describing Contingency Tables - I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]
Categorical data analysis Chapter 5
Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases
We know from STAT.1030 that the relevant test statistic for equality of proportions is:
2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which
MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES
MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by
PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007
Cohort study s formulations PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Sandrine Dudoit Division of Biostatistics Department of Statistics University of California, Berkeley www.stat.berkeley.edu/~srine
On the relation between initial value and slope
Biostatistics (2005), 6, 3, pp. 395–403 doi:10.1093/biostatistics/kxi017 Advance Access publication on April 14, 2005 On the relation between initial value and slope K. BYTH NHMRC Clinical Trials Centre,
Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection
Biometrical Journal 42 (2000) 1, 59–69 Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Kung-Jong Lui
Computational Systems Biology: Biology X
Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,
A Model for Correlated Paired Comparison Data
Working Paper Series, N. 15, December 2010 A Model for Correlated Paired Comparison Data Manuela Cattelan Department of Statistical Sciences University of Padua Italy Cristiano Varin Department of Statistics
Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle
Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health
Repeated ordinal measurements: a generalised estimating equation approach
Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related
The STS Surgeon Composite Technical Appendix
The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative mortality and major complication rates were estimated using a bivariate random-effects logistic
DATA-ADAPTIVE VARIABLE SELECTION FOR
DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department
Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May
Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May 5-7 2008 Peter Schlattmann Institut für Biometrie und Klinische Epidemiologie
Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina
Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,
Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis
Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,
MAS3301 / MAS8311 Biostatistics Part II: Survival
MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 8 Parametric models 8.1 Introduction In the last few sections (the KM
Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.
Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,
Measures of Association and Variance Estimation
Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35
A note on R² measures for Poisson and logistic regression models when both models are applicable
Journal of Clinical Epidemiology 54 (2001) 99–103 A note on R² measures for Poisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer
Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina
Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area
Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X's
Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X's Introduction Logistic regression expresses the relationship between a binary response variable and one or
Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test)
Chapter 861 Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test) Introduction Logistic regression expresses the relationship between a binary response variable and one or more
Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.
Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we
Estimating direct effects in cohort and case-control studies
Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research
BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke
BIOL 51A - Biostatistics 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? [Figure: FEV (l) by Smoke (No/Yes)] Box Plot a.k.a. box-and-whisker diagram or candlestick chart
A stationarity test on Markov chain models based on marginal distribution
Universiti Tunku Abdul Rahman, Kuala Lumpur, Malaysia 646 A stationarity test on Markov chain models based on marginal distribution Mahboobeh Zangeneh Sirdari 1, M. Ataharul Islam 2, and Norhashidah Awang
STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14. Your Name:
STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14 Your Name: Please make sure to specify all of your notations in each problem GOOD LUCK! 1 Problem# 1. Consider the following model, y i =
Effect Modification and Interaction
By Sander Greenland Keywords: antagonism, causal coaction, effect-measure modification, effect modification, heterogeneity of effect, interaction, synergism Abstract: This article discusses definitions
Chapter 4. Parametric Approach. 4.1 Introduction
Chapter 4 Parametric Approach 4.1 Introduction The missing data problem is a classical problem that has not yet been solved satisfactorily. This problem includes those situations where the dependent
11 November 2011 Department of Biostatistics, University of Copenhagen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.
Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copenhagen 11 November 2011