Additive and multiplicative models for the joint effect of two risk factors

Biostatistics (2005), 6, 1, pp. 1-9
doi: /biostatistics/kxh024
Biostatistics Vol. 6 No. 1 © Oxford University Press 2005; all rights reserved.

A. BERRINGTON DE GONZÁLEZ
Cancer Research UK Epidemiology Unit, University of Oxford, Gibson Building, Radcliffe Infirmary, Oxford, OX2 6HE, UK
amy.berrington@cancer.org.uk

D. R. COX
Nuffield College, University of Oxford, OX1 1NF, UK
david.cox@nuffield.ox.ac.uk

To whom correspondence should be addressed.

SUMMARY

Simple tests are given for consistency of the data with additive and with multiplicative effects of two risk factors on a binary outcome. A combination of the procedures will show whether data are consistent with neither, one or both of the models of no additive or no multiplicative interaction. Implications for the size of the study needed to detect differences between the models are also addressed. Because of the simple form of the test statistics, combination of evidence from different studies or strata is straightforward. Illustration of how the method could be extended to data from a 2 × R × C table is also given.

Keywords: Case-control studies; Cohort studies; Interaction; Multiplicative; Additive.

1. INTRODUCTION

In its statistical meaning, interaction of two risk factors requires departure from additivity in their effect on outcome. We concentrate on two binary risk factors, with the occurrence or non-occurrence of a rare condition as outcome variable and with their interaction as the primary focus of interest. Let $\theta_{ij}$ denote the probability of occurrence when the two risk factors are at levels $i, j$, where $i, j = 0, 1$. For convenience, we take $(0, 0)$ as a baseline condition for some of the discussion, although this special choice has no impact on the conclusions. Two different representations of the additivity of effect are

$$\theta_{10} = \theta_{00} + \alpha_A, \quad \theta_{01} = \theta_{00} + \beta_A, \quad \theta_{11} = \theta_{00} + \alpha_A + \beta_A \tag{1}$$

and

$$\log\theta_{10} = \log\theta_{00} + \alpha_M, \quad \log\theta_{01} = \log\theta_{00} + \beta_M, \quad \log\theta_{11} = \log\theta_{00} + \alpha_M + \beta_M. \tag{2}$$

Equation (2) can equivalently be written

$$\theta_{10} = \theta_{00}\lambda_M, \quad \theta_{01} = \theta_{00}\psi_M, \quad \theta_{11} = \theta_{00}\lambda_M\psi_M. \tag{3}$$
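As a small illustration of how (1) and (3) differ, the sketch below computes the joint risk $\theta_{11}$ that each model implies from the baseline and single-factor risks. The function name and the numerical risks are hypothetical and are not taken from the paper.

```python
def expected_joint_risk(theta00, theta10, theta01):
    """Return theta11 implied by no additive and by no multiplicative interaction."""
    # Model (1): the two effects add on the risk scale.
    additive = theta10 + theta01 - theta00
    # Model (3): the two effects multiply, i.e. add on the log scale as in (2).
    multiplicative = theta10 * theta01 / theta00
    return additive, multiplicative


# Hypothetical risks: baseline 1%, single exposures 2% and 3%.
additive, multiplicative = expected_joint_risk(0.01, 0.02, 0.03)
print(f"additive: {additive:.3f}, multiplicative: {multiplicative:.3f}")
```

With relative risks of 2 and 3 for the single exposures, the joint relative risk is 4 under the additive form and 6 under the multiplicative form, which is the kind of gap the tests in the paper are designed to detect.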

Models (1) and (2) respectively define no additive interaction, $H_{A0}$, and no multiplicative interaction, $H_{M0}$. Both (1) and (2) are used in the epidemiological and other literature. Additive models may have a direct public health interpretation in that, for a large population of individuals, the difference in the numbers of positive outcomes for, say, $i = 1, j = 0$ as compared with the numbers had the individuals been in the baseline state $i = j = 0$ is proportional to $\alpha_A$. Advantages of the multiplicative form are that comparisons are summarized in simple ratios, often not very different from unity, which, moreover, are often relatively stable across populations.

From a formal point of view (1) could be generalized to

$$g(\theta_{10}) = g(\theta_{00}) + \alpha_G, \quad g(\theta_{01}) = g(\theta_{00}) + \beta_G, \quad g(\theta_{11}) = g(\theta_{00}) + \alpha_G + \beta_G, \tag{4}$$

where $g(\theta)$ is a suitable monotonic function of $\theta$, for example a power. To be scientifically fruitful, however, the function $g(\theta)$ would have to be reasonably easily interpreted and this restricts the choice appreciably. In the present paper we consider only the forms (1) and (2). For a more general discussion of the statistical aspects of interaction see Cox (1984).

2. ANALYSIS OF EMPIRICAL DATA

For empirical data, the forms (1) and (2) may need to be compared. The data may be consistent with none, one or both of the models. There are various ways in which this issue can be tackled. One is to calculate a Bayes factor aiming to be an effective likelihood ratio for the model comparison. A second (Aranda-Ordaz, 1981) is to embed the models in a family characterized by a parameter, $\eta$ say, to estimate the value of $\eta$ and to check for consistency with the values corresponding to (1) and (2).

The third approach, and the one adopted here, is to provide two tests of significance, one for $H_{A0}$, sensitive to departures in the direction of the multiplicative interaction model, and the other for $H_{M0}$, sensitive to departures in the direction of the additive interaction model. The result is two p-values from which one can assess the consistency with both, just one, or neither model. We regard this as conceptually the simplest and the most readily interpreted approach.

3. SENSITIVITY

A question of general interest concerns the amount of data likely to be needed to distinguish between $H_{A0}$ and $H_{M0}$. This requires study of the power of the associated tests. Formulation of power requirements demands several inevitably arbitrary choices and therefore approximate calculation of power is entirely adequate for most purposes. For this we use the following simplifying result. Suppose that $T$ is a test statistic for the null hypothesis $H_0$ which, under $H_0$, is approximately normally distributed with zero mean and variance $\sigma_0^2/n$, where $n$ is a sample size. Suppose also that under the alternative hypothesis of interest $T$ is distributed with median approximately $\mu$. In fact we assume typically that $T$ is approximately symmetrically distributed with mean $\mu$. Then power of 50 per cent is approximately achieved for a one-sided test at level of significance $\epsilon$ if

$$\mu = k_\epsilon\sigma_0/\sqrt{n},$$

that is if

$$n = k_\epsilon^2\sigma_0^2/\mu^2. \tag{5}$$

Here $k_\epsilon$ is the upper $\epsilon$ point of the standard normal distribution.
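The 50 per cent power rule in (5) can be sketched in a few lines. This is a minimal illustration, assuming hypothetical function names and input values; $k_\epsilon$ is obtained from the standard library's normal distribution.

```python
from statistics import NormalDist


def upper_point(eps):
    """Upper eps point k_eps of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - eps)


def n_for_half_power(mu, sigma0, eps):
    """Sample size from (5): n = k_eps^2 * sigma0^2 / mu^2."""
    k_eps = upper_point(eps)
    return (k_eps * sigma0 / mu) ** 2


# Hypothetical inputs: mean 0.5 under the alternative, sigma0 = 1, one-sided level 0.025.
n = n_for_half_power(mu=0.5, sigma0=1.0, eps=0.025)
print(round(n, 1))
```

Halving $\mu$ quadruples the required $n$, which is why small departures between the two models are expensive to detect.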

If, for example for comparison with other investigations, it is unavoidable to use power $1 - \beta$, then $k_\epsilon$ should be replaced by $k_\epsilon + k_\beta$; the extra approximation involved is that the variance of the statistic under the alternative differs little from that under the null hypothesis. The requirement of 50 per cent power will be used, however, throughout this paper as it is likely to be adequate for most purposes.

4. COHORT STUDIES

4.1 Additive model

In a cohort study of two risk factors for a disease, such as a gene and an environmental exposure, if there are $r_{ij}$ deaths out of $n_{ij}$ individuals $(i, j = 0, 1)$, then the estimated risk is $\hat\rho_{ij} = r_{ij}/n_{ij}$ with approximately $\mathrm{var}(\hat\rho_{ij}) = \rho_{ij}/n_{ij}$ and $\mathrm{var}(\log\hat\rho_{ij}) = 1/(n_{ij}\rho_{ij})$ for rare conditions, i.e. small $\rho_{ij}$. We test the hypothesis that the effects are additive, i.e. that there is no evidence of additive interaction between the two risk factors, using

$$T_A = \frac{\hat\rho_{11} - \hat\rho_{10} - \hat\rho_{01} + \hat\rho_{00}}{\sqrt{\sum_{i,j}\hat\rho_{ij}/n_{ij}}}. \tag{6}$$

In general

$$E(T_A) \approx \frac{\rho_{11} - \rho_{10} - \rho_{01} + \rho_{00}}{\sqrt{\sum_{i,j}\rho_{ij}/n_{ij}}} \tag{7}$$

and there will be approximately 50% power where $E(T_A)$ is equal to $k_\epsilon$, the upper $\epsilon$ point of the standard normal distribution. If $p_{ij}$ is the probability of being exposed to levels $i$ and $j$ of the two risk factors then $n_{ij} = np_{ij}$ and this implies

$$\rho_{11} - \rho_{10} - \rho_{01} + \rho_{00} = k_\epsilon\sqrt{\sum_{i,j}\rho_{ij}/p_{ij}}\Big/\sqrt{n}. \tag{8}$$

If the data were actually generated from a multiplicative model without interaction then, taking $(0, 0)$ as a reference level, we can write this multiplicative model in the form

$$\rho_{00} = \rho_0, \quad \rho_{01} = \rho_0\lambda, \quad \rho_{10} = \rho_0\psi, \quad \rho_{11} = \rho_0\lambda\psi. \tag{9}$$

Now suppose we want to know the expected number of deaths needed in the baseline group in order to detect this form of departure from an additive model. If we define this number as $r_0^M = np_{00}\rho_0$, the condition for 50% power becomes

$$\sqrt{n\rho_0}\,(\lambda - 1)(\psi - 1) = k_\epsilon\sqrt{1/p_{00} + \lambda/p_{01} + \psi/p_{10} + \lambda\psi/p_{11}} \tag{10}$$

so that

$$r_0^M = \frac{k_\epsilon^2}{(\lambda - 1)^2(\psi - 1)^2}\left(1 + \frac{\lambda p_{00}}{p_{01}} + \frac{\psi p_{00}}{p_{10}} + \frac{\lambda\psi p_{00}}{p_{11}}\right). \tag{11}$$

For example, if the exposure probabilities are all equal ($p_{00} = p_{01} = p_{10} = p_{11}$), the relative risks associated with each exposure are both equal to two ($\lambda = \psi = 2$) and $k_\epsilon = 2$, then

$$r_0^M = 36, \tag{12}$$

i.e. approximately 36 deaths would be required in the baseline (unexposed) group to achieve 50% power.
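Formula (11) is easy to check numerically. The following is a minimal sketch with a hypothetical function name; it reproduces the worked example in (12) for equal exposure probabilities, $\lambda = \psi = 2$ and $k_\epsilon = 2$.

```python
def baseline_deaths_mult_alt(lam, psi, p00, p01, p10, p11, k_eps=2.0):
    """Expected baseline deaths r_0^M from (11).

    This is the number needed for 50% power when the additive model is
    tested and the data actually follow the multiplicative model (9).
    """
    bracket = 1.0 + lam * p00 / p01 + psi * p00 / p10 + lam * psi * p00 / p11
    return k_eps ** 2 * bracket / ((lam - 1.0) ** 2 * (psi - 1.0) ** 2)


r0 = baseline_deaths_mult_alt(lam=2.0, psi=2.0, p00=0.25, p01=0.25, p10=0.25, p11=0.25)
print(r0)  # 36.0, matching (12)
```

Because of the factor $(\lambda - 1)^2(\psi - 1)^2$ in the denominator, the required number grows very quickly as either relative risk approaches one.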

Alternatively, we may prefer to know what total number of deaths would be required in order to be able to detect this form of departure from the additive no interaction model. If the expected number of deaths in total is $t^M$ then our requirement for 50% power is

$$t^M = \frac{k_\epsilon^2}{(\lambda - 1)^2(\psi - 1)^2}\left(\frac{1}{p_{00}} + \frac{\lambda}{p_{01}} + \frac{\psi}{p_{10}} + \frac{\lambda\psi}{p_{11}}\right)(p_{00} + \lambda p_{01} + \psi p_{10} + \lambda\psi p_{11}). \tag{13}$$

Note that in the symmetric case where $p_{ij}$ = constant and $\lambda = \psi$,

$$r_0^M = k_\epsilon^2\frac{(\lambda + 1)^2}{(\lambda - 1)^4}, \tag{14}$$

$$t^M = k_\epsilon^2\frac{(\lambda + 1)^4}{(\lambda - 1)^4}. \tag{15}$$

4.2 Multiplicative model

Now suppose we test consistency with the multiplicative model without interaction by the statistic

$$T_M = \frac{\log\hat\rho_{11} - \log\hat\rho_{10} - \log\hat\rho_{01} + \log\hat\rho_{00}}{\sqrt{1/(n_{00}\rho_{00}) + 1/(n_{01}\rho_{01}) + 1/(n_{10}\rho_{10}) + 1/(n_{11}\rho_{11})}} \tag{16}$$

with evidence of departure in the direction of the additive model if $T_M < -k_\epsilon$. Again 50% power is achieved when

$$\frac{\sqrt{n}\,\log\{(\rho_{11}\rho_{00})/(\rho_{01}\rho_{10})\}}{\sqrt{1/(p_{00}\rho_{00}) + 1/(p_{01}\rho_{01}) + 1/(p_{10}\rho_{10}) + 1/(p_{11}\rho_{11})}} = -k_\epsilon. \tag{17}$$

We write an additive model without interaction, with $(0, 0)$ as baseline,

$$\rho_{00} = \rho_0, \quad \rho_{01} = \rho_0(1 + \xi), \quad \rho_{10} = \rho_0(1 + \eta), \quad \rho_{11} = \rho_0(1 + \xi + \eta). \tag{18}$$

Then, with $r_0^A = np_{00}\rho_{00}$,

$$r_0^A = k_\epsilon^2\left\{1 + \frac{p_{00}}{p_{01}(1 + \xi)} + \frac{p_{00}}{p_{10}(1 + \eta)} + \frac{p_{00}}{p_{11}(1 + \xi + \eta)}\right\}\left[\log\frac{1 + \xi + \eta}{(1 + \xi)(1 + \eta)}\right]^{-2}. \tag{19}$$

With $p_{00} = p_{01} = p_{10} = p_{11}$, $k_\epsilon = 2$, $\xi = \eta = 1$, this gives

$$r_0^A = 110. \tag{20}$$

The expected numbers of deaths needed in each category of exposure under the additive and multiplicative models without interaction are shown in Table 1. Note that in the symmetric case, $p_{ij}$ = constant and $\xi = \eta$,

$$r_0^A = \frac{2k_\epsilon^2(\xi^2 + 4\xi + 2)}{(1 + \xi)(1 + 2\xi)}\left[\log\frac{1 + 2\xi}{(1 + \xi)^2}\right]^{-2} \tag{21}$$

and that the total number of expected deaths is $4(1 + \xi)\,r_0^A$.

Tables 2 and 3 show examples of how the sample sizes to detect departures from an additive model without interaction in the multiplicative direction and a multiplicative model without interaction in the additive direction vary when the relative risk $\lambda = 2$ whilst $\psi$ is allowed to vary from 1.5 to 4 and the probability of being exposed to both risk factors, $p_{11}$, is allowed to vary from 0.05 to 0.3 whilst the other exposure probabilities are all equal ($p_{00} = p_{01} = p_{10}$).
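The baseline requirement in (19) can be evaluated in the same way. This is a sketch under the paper's stated example (equal exposure probabilities, $k_\epsilon = 2$, $\xi = \eta = 1$); the function name is hypothetical.

```python
import math


def baseline_deaths_additive_alt(xi, eta, p00, p01, p10, p11, k_eps=2.0):
    """Expected baseline deaths r_0^A from (19).

    This is the number needed for 50% power when the multiplicative model
    is tested and the data actually follow the additive model (18).
    """
    bracket = (1.0
               + p00 / (p01 * (1.0 + xi))
               + p00 / (p10 * (1.0 + eta))
               + p00 / (p11 * (1.0 + xi + eta)))
    log_term = math.log((1.0 + xi + eta) / ((1.0 + xi) * (1.0 + eta)))
    return k_eps ** 2 * bracket / log_term ** 2


r0 = baseline_deaths_additive_alt(xi=1.0, eta=1.0, p00=0.25, p01=0.25, p10=0.25, p11=0.25)
print(round(r0))                  # about 113, i.e. 110 to two working digits as in (20)
print(round(4 * (1 + 1.0) * r0))  # total expected deaths in the symmetric case
```

Compared with the 36 baseline deaths in (12), roughly three times as many are needed here, so detecting additivity against a multiplicative null is the harder of the two problems in this example.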

Table 1. Expected number of deaths needed to detect departure from additive model in multiplicative direction ($r_0^M$) and from a multiplicative model in an additive direction ($r_0^A$). Columns: $r_0^M$ and $r_0^A$, each cross-classified by $i$ and $j$.

Table 2. Sample size required in the baseline group of a cohort study to detect departure from multiplicative model in the direction of an additive model (tabulated by $\psi$ and $p_{11}$). Note: in this and subsequent tables values are given to two working digits.

Table 3. Sample size required in the baseline group of a cohort study to detect departure from an additive model in the direction of a multiplicative model (tabulated by $\psi = 1 + \xi$ and $p_{11}$).

5. CASE-CONTROL STUDIES

5.1 Multiplicative model

A relatively minor change in the argument deals with (unmatched) case-control studies. Consider a single case-control study with one binary exposure with frequency $m_{rs}$; $r = 0$ (control), $r = 1$ (case); $s = 0$ (exposure −), $s = 1$ (exposure +). Then the log relative risk is $\hat\theta = \log\{(m_{11}m_{00})/(m_{01}m_{10})\}$ with asymptotically

$$\mathrm{var}(\hat\theta) = \sum_{r,s} 1/m_{rs} = 4/\tilde{m}, \tag{22}$$

where $\tilde{m}$ is the harmonic mean frequency. The estimate of the relative risk $\hat\phi = e^{\hat\theta}$ has asymptotic variance

$$\mathrm{var}(\hat\phi) = 4\phi^2/\tilde{m}. \tag{23}$$

Now suppose we have two exposures and let $m_{rij}$ ($r = 0$ (control), $r = 1$ (case)) be the frequency in exposure category $(i, j)$ for $i, j = 0, 1$. Write $\tilde{m}_{ij} = 2/(1/m_{0ij} + 1/m_{1ij})$ for the relevant harmonic mean frequency. Then with $\hat\gamma_{ij} = \log(m_{1ij}/m_{0ij})$, $\mathrm{var}(\hat\gamma_{ij}) = 2/\tilde{m}_{ij}$, consistency with a multiplicative no interaction model is tested by

$$T_M = \frac{\hat\gamma_{11} - \hat\gamma_{01} - \hat\gamma_{10} + \hat\gamma_{00}}{\sqrt{\sum_{i,j} 2/\tilde{m}_{ij}}} \tag{24}$$

with evidence of departure in the direction of additivity if $T_M < -k_\epsilon$. Under a form additive for relative risk (without interaction), and arbitrarily taking $(0, 0)$ as baseline, we can write

$$\gamma_{01} = \gamma_{00} + \log(1 + \alpha_{01}), \quad \gamma_{10} = \gamma_{00} + \log(1 + \alpha_{10}), \quad \gamma_{11} = \gamma_{00} + \log(1 + \alpha_{10} + \alpha_{01}), \tag{25}$$

so that 50% power is achieved when

$$\log\left\{\frac{(1 + \alpha_{01})(1 + \alpha_{10})}{1 + \alpha_{01} + \alpha_{10}}\right\} = k_\epsilon\sqrt{\sum_{i,j} 2/\tilde{m}_{ij}}. \tag{26}$$

Write $q_{ij} = \tilde{m}_{ij}/\sum_{k,l}\tilde{m}_{kl}$, so that $\sum q_{ij} = 1$ and $q_{ij}$ is the proportion of individuals in the risk category $(i, j)$, with cases and controls combined via a harmonic mean. We write $a = 2\sum\tilde{m}_{ij}/n$ where $n$ is the total number of individuals. In general $a \le 1$, with equality when numbers of cases and controls are almost the same cell by cell. Then the required $n$ is given by

$$n_A = 4k_\epsilon^2 a^{-1}\sum_{i,j} 1/q_{ij}\left[\log\left\{\frac{(1 + \alpha_{01})(1 + \alpha_{10})}{1 + \alpha_{01} + \alpha_{10}}\right\}\right]^{-2}. \tag{27}$$

Note that $\sum 1/q_{ij} \ge 16$.

5.2 Additive model

Consistency with an additive no interaction model can be tested by dividing $\hat\phi_{11} - \hat\phi_{10} - \hat\phi_{01} + \hat\phi_{00}$ by its estimated standard error, where $\hat\phi_{ij}$ is the estimated risk in exposure category $(i, j)$ relative to baseline $(0, 0)$. The numerator is, however, proportional to the simpler statistic $m_{111}/m_{011} - m_{110}/m_{010} - m_{101}/m_{001} + m_{100}/m_{000}$, leading to the test statistic

$$T_A = \frac{m_{111}/m_{011} - m_{110}/m_{010} - m_{101}/m_{001} + m_{100}/m_{000}}{\hat\phi_{00}\sqrt{\sum_{i,j} 2\hat\phi_{ij}^2/\tilde{m}_{ij}}} = \frac{\hat\phi_{11} - \hat\phi_{10} - \hat\phi_{01} + \hat\phi_{00}}{\sqrt{\sum_{i,j} 2\hat\phi_{ij}^2/\tilde{m}_{ij}}}. \tag{28}$$

Under a multiplicative model $\phi_{10} = \phi_{00}(1 + \beta_{10})$, $\phi_{01} = \phi_{00}(1 + \beta_{01})$, $\phi_{11} = \phi_{00}(1 + \beta_{10})(1 + \beta_{01})$. Then 50% power is achieved when

$$\phi_{00}^2\beta_{01}^2\beta_{10}^2 = k_\epsilon^2\sum_{i,j} 2\phi_{ij}^2/\tilde{m}_{ij} \tag{29}$$

and the total number of individuals is

$$n_M = \frac{4k_\epsilon^2}{a\beta_{10}^2\beta_{01}^2}\left\{\frac{1}{q_{00}} + \frac{(1 + \beta_{10})^2}{q_{10}} + \frac{(1 + \beta_{01})^2}{q_{01}} + \frac{(1 + \beta_{10})^2(1 + \beta_{01})^2}{q_{11}}\right\}. \tag{30}$$

Note that in $n_A$, for given $n$, $q_{ij} = 1/4$ is optimal; in $n_M$ this is not quite the case but the main point is that a small $q_{ij}$ lowers sensitivity greatly, as is to be expected. There may not be control over this in design, however.
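To see how the category proportions $q_{ij}$ drive the case-control sample sizes, (27) and (30) can be sketched as below. The function names, the dictionary layout for $q_{ij}$ and the numerical inputs are assumptions made for illustration only.

```python
import math


def n_additive_alt(alpha01, alpha10, q, a=1.0, k_eps=2.0):
    """Total n from (27): multiplicative model tested, additive form (25) true."""
    inv_q = sum(1.0 / q[cell] for cell in ("00", "01", "10", "11"))
    log_term = math.log((1 + alpha01) * (1 + alpha10) / (1 + alpha01 + alpha10))
    return 4 * k_eps ** 2 * inv_q / (a * log_term ** 2)


def n_mult_alt(beta10, beta01, q, a=1.0, k_eps=2.0):
    """Total n from (30): additive model tested, multiplicative form true."""
    bracket = (1.0 / q["00"]
               + (1 + beta10) ** 2 / q["10"]
               + (1 + beta01) ** 2 / q["01"]
               + (1 + beta10) ** 2 * (1 + beta01) ** 2 / q["11"])
    return 4 * k_eps ** 2 * bracket / (a * beta10 ** 2 * beta01 ** 2)


# Hypothetical designs: balanced categories versus a rare doubly exposed category.
balanced = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
rare_11 = {"00": 0.95 / 3, "01": 0.95 / 3, "10": 0.95 / 3, "11": 0.05}

for label, q in (("balanced", balanced), ("rare q11", rare_11)):
    print(label, round(n_additive_alt(1.0, 1.0, q)), round(n_mult_alt(1.0, 1.0, q)))
```

In the balanced design $\sum 1/q_{ij}$ attains its minimum of 16, and shrinking $q_{11}$ to 0.05 inflates both requirements, which is the loss of sensitivity noted in the text.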

Table 4. Sample size required for a case-control study to detect departure from a multiplicative model in the direction of an additive model (tabulated by $\beta$ and $q_{11}$).

Table 5. Sample size required in a case-control study to detect departure from an additive model in the direction of a multiplicative model (tabulated by $\beta$ and $q_{11}$).

In the symmetrical cases, $q_{ij} = 1/4$, $\alpha_{10} = \alpha_{01} = \alpha$ and $\beta_{01} = \beta_{10} = \beta$, so that $\sum 1/q_{ij} = 16$ and (27) and (30) reduce to

$$n_A = 64k_\epsilon^2 a^{-1}\left[\log\frac{(1 + \alpha)^2}{1 + 2\alpha}\right]^{-2}, \tag{31}$$

$$n_M = \frac{16k_\epsilon^2}{a\beta^4}\left(1 + 2(1 + \beta)^2 + (1 + \beta)^4\right). \tag{32}$$

Tables 4 and 5 show the required sample sizes for a case-control study to detect departures from a multiplicative and additive model for interaction, respectively. The odds ratio $\alpha$ is 2 whereas the odds ratio $\beta$ varies from 1.5 to 4. For these examples we have assumed that $a = 1$ and that $q_{00} = q_{01} = q_{10}$ whilst $q_{11}$ is allowed to vary upward from 0.05.

6. EXAMPLE AND DISCUSSION

Znaor et al. (2003) investigated whether there was evidence of interaction between chewing tobacco and alcohol consumption with respect to the risk of oral cancer in a case-control study of Indian men. We reproduce the data for those men who did not smoke tobacco and calculate the crude odds ratios in a two-by-four table (see Table 6). The observed odds ratio for the joint effect of the two risk factors (44.1) was considerably greater than expected under an additive model without interaction (16.7) and slightly greater than expected under a multiplicative model without interaction (39.3). Here $T_A = 2.5$ suggests there is evidence of significant departure from the additive model in the multiplicative direction, but $T_M = 0.3$ confirms that there is no evidence of departure from the multiplicative model in an additive direction.

Table 6. Estimated odds ratios from Znaor et al. (2003). Columns: Chewing tobacco, Alcohol, Cases, Controls, Odds ratio, var[ln(OR)]. Rows: No/No, No/Yes, Yes/No, Yes/Yes.

Table 7. Odds ratios adjusted for age, centre and education level from Znaor et al. (2003). Columns: Chewing tobacco, Alcohol, Odds ratio*, var[ln(OR*)]. Rows: No/No, No/Yes, Yes/No, Yes/Yes.

We have discussed only the simplified case of a single set of data. Because of the simple form of the test statistics, combination of evidence from independent studies or strata is straightforward.

An important example of such a situation would be the one where adjustment for confounders was necessary. If the adjustments had been made by logistic regression then the variance of the test statistic would be somewhat greater than the Poisson variance and if, for example, the adjusted log relative risk is $\hat\theta_{ij}$ then the statistic to test for multiplicative interaction in a case-control study becomes

$$T_M = \frac{\hat\theta_{11} - \hat\theta_{01} - \hat\theta_{10} + \hat\theta_{00}}{\sqrt{\sum_{i,j}\mathrm{var}(\hat\theta_{ij})}}. \tag{33}$$

The odds ratios actually published by Znaor et al. had been adjusted for age, centre and education level. These adjustments reduced the odds ratios for the effect of chewing tobacco and increased their standard errors (see Table 7). Therefore, when the tests for interaction are conducted on the adjusted data there is no evidence of departure from the multiplicative or the additive models without interaction ($T_M = 0.03$ and $T_A = 1.05$). Inclusion of adjustments in sample size calculations could be made by assuming that the adjustment increases the variance by a constant proportion $c$ across all strata, so that the sample size estimates are multiplied by $1 + c$.

Finally, extension of the method to the situation of interaction in a 2 × R × C table could be approached by extracting a single degree of freedom for an initial test. This would be more sensitive than an examination of independence across the R × C contingency table (Yates, 1948). For example, in Znaor et al. there were actually two levels of chewing: with and without tobacco.
Whether the increase in risk with increasing level of chewing differed between ever and never alcohol drinkers (a 2 × 2 × 3 table) could be examined by assigning the levels of chewing (never, without tobacco and with tobacco) the scores −3, 1, 2; then a test statistic for departure from the multiplicative model in the additive direction would be

$$T_M = \frac{(2\hat\theta_{13} + \hat\theta_{12} - 3\hat\theta_{11}) - (2\hat\theta_{03} + \hat\theta_{02} - 3\hat\theta_{01})}{\sqrt{4\,\mathrm{var}(\hat\theta_{13}) + \mathrm{var}(\hat\theta_{12}) + 9\,\mathrm{var}(\hat\theta_{11}) + 4\,\mathrm{var}(\hat\theta_{03}) + \mathrm{var}(\hat\theta_{02}) + 9\,\mathrm{var}(\hat\theta_{01})}} \tag{34}$$

with evidence of departure in the direction of additivity if $T_M < -k_\epsilon$. Again these calculations could include adjustments if necessary with the use of the same strategy as described above for the 2 × 2 × 2 table.

REFERENCES

ARANDA-ORDAZ, F. J. (1981). On two families of transformations to additivity for binary response data. Biometrika 68.

BOTTO, L. D. AND KHOURY, M. J. (2001). Commentary: facing the challenge of gene-environment interaction: the two-by-four table and beyond. American Journal of Epidemiology 153.

COX, D. R. (1984). Interaction. International Statistical Review 52.

SIEMIATYCKI, J. AND THOMAS, D. C. (1981). Biological models and statistical interactions: an example from multistage carcinogenesis. International Journal of Epidemiology 10.

YATES, F. (1948). The analysis of contingency tables. Biometrika 35.

ZNAOR, A., BRENNAN, P., GAJALAKSHMI, V., MATHEW, A., SHANTA, V., VARGHESE, C. AND BOFFETTA, P. (2003). Independent and combined effects of tobacco smoking, chewing and alcohol drinking on the risk of oral, pharyngeal and esophageal cancers in Indian men. International Journal of Cancer 105.

[Received January 15, 2004; first revision June 21, 2004; second revision July 15, 2004; accepted for publication 19 August, 2004]


More information

The miss rate for the analysis of gene expression data

The miss rate for the analysis of gene expression data Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an

More information

Missing covariate data in matched case-control studies: Do the usual paradigms apply?

Missing covariate data in matched case-control studies: Do the usual paradigms apply? Missing covariate data in matched case-control studies: Do the usual paradigms apply? Bryan Langholz USC Department of Preventive Medicine Joint work with Mulugeta Gebregziabher Larry Goldstein Mark Huberman

More information

ij i j m ij n ij m ij n i j Suppose we denote the row variable by X and the column variable by Y ; We can then re-write the above expression as

ij i j m ij n ij m ij n i j Suppose we denote the row variable by X and the column variable by Y ; We can then re-write the above expression as page1 Loglinear Models Loglinear models are a way to describe association and interaction patterns among categorical variables. They are commonly used to model cell counts in contingency tables. These

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Contingency Tables Part One 1

Contingency Tables Part One 1 Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview

More information

Identification of the age-period-cohort model and the extended chain ladder model

Identification of the age-period-cohort model and the extended chain ladder model Identification of the age-period-cohort model and the extended chain ladder model By D. KUANG Department of Statistics, University of Oxford, Oxford OX TG, U.K. di.kuang@some.ox.ac.uk B. Nielsen Nuffield

More information

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs STAT 5500/6500 Conditional Logistic Regression for Matched Pairs The data for the tutorial came from support.sas.com, The LOGISTIC Procedure: Conditional Logistic Regression for Matched Pairs Data :: SAS/STAT(R)

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018

Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018 , Non-, Precision, and Power Statistics 211 - Statistical Methods II Presented February 27, 2018 Dan Gillen Department of Statistics University of California, Irvine Discussion.1 Various definitions of

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Equivalence of random-effects and conditional likelihoods for matched case-control studies

Equivalence of random-effects and conditional likelihoods for matched case-control studies Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Sensitivity analysis and distributional assumptions

Sensitivity analysis and distributional assumptions Sensitivity analysis and distributional assumptions Tyler J. VanderWeele Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA vanderweele@uchicago.edu

More information

Basic Medical Statistics Course

Basic Medical Statistics Course Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable

More information

Chapter 2: Describing Contingency Tables - II

Chapter 2: Describing Contingency Tables - II : Describing Contingency Tables - II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

Applied Epidemiologic Analysis

Applied Epidemiologic Analysis Patricia Cohen, Ph.D. Henian Chen, M.D., Ph. D. Teaching Assistants Julie Kranick Chelsea Morroni Sylvia Taylor Judith Weissman Lecture 13 Interactional questions and analyses Goals: To understand how

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

Categorical Data Analysis Chapter 3

Categorical Data Analysis Chapter 3 Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Chapter 2: Describing Contingency Tables - I

Chapter 2: Describing Contingency Tables - I : Describing Contingency Tables - I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

We know from STAT.1030 that the relevant test statistic for equality of proportions is: 2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which

More information

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by

More information

PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007

PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Cohort study s formulations PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Srine Dudoit Division of Biostatistics Department of Statistics University of California, Berkeley www.stat.berkeley.edu/~srine

More information

On the relation between initial value and slope

On the relation between initial value and slope Biostatistics (2005), 6, 3, pp. 395 403 doi:10.1093/biostatistics/kxi017 Advance Access publication on April 14, 2005 On the relation between initial value and slope K. BYTH NHMRC Clinical Trials Centre,

More information

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Biometrical Journal 42 (2000) 1, 59±69 Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Kung-Jong Lui

More information

Computational Systems Biology: Biology X

Computational Systems Biology: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,

More information

A Model for Correlated Paired Comparison Data

A Model for Correlated Paired Comparison Data Working Paper Series, N. 15, December 2010 A Model for Correlated Paired Comparison Data Manuela Cattelan Department of Statistical Sciences University of Padua Italy Cristiano Varin Department of Statistics

More information

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health

More information

Repeated ordinal measurements: a generalised estimating equation approach

Repeated ordinal measurements: a generalised estimating equation approach Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related

More information

The STS Surgeon Composite Technical Appendix

The STS Surgeon Composite Technical Appendix The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic

More information

DATA-ADAPTIVE VARIABLE SELECTION FOR

DATA-ADAPTIVE VARIABLE SELECTION FOR DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department

More information

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May 5-7 2008 Peter Schlattmann Institut für Biometrie und Klinische Epidemiologie

More information

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM

More information

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto. Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,

More information

Measures of Association and Variance Estimation

Measures of Association and Variance Estimation Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35

More information

A note on R 2 measures for Poisson and logistic regression models when both models are applicable

A note on R 2 measures for Poisson and logistic regression models when both models are applicable Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area

More information

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Introduction Logistic regression expresses the relationship between a binary response variable and one or

More information

Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test)

Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test) Chapter 861 Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test) Introduction Logistic regression expresses the relationship between a binary response variable and one or more

More information

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we

More information

Estimating direct effects in cohort and case-control studies

Estimating direct effects in cohort and case-control studies Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

A stationarity test on Markov chain models based on marginal distribution

A stationarity test on Markov chain models based on marginal distribution Universiti Tunku Abdul Rahman, Kuala Lumpur, Malaysia 646 A stationarity test on Markov chain models based on marginal distribution Mahboobeh Zangeneh Sirdari 1, M. Ataharul Islam 2, and Norhashidah Awang

More information

STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14. Your Name:

STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14. Your Name: STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14 Your Name: Please make sure to specify all of your notations in each problem GOOD LUCK! 1 Problem# 1. Consider the following model, y i =

More information

Effect Modification and Interaction

Effect Modification and Interaction By Sander Greenland Keywords: antagonism, causal coaction, effect-measure modification, effect modification, heterogeneity of effect, interaction, synergism Abstract: This article discusses definitions

More information

Chapter 4. Parametric Approach. 4.1 Introduction

Chapter 4. Parametric Approach. 4.1 Introduction Chapter 4 Parametric Approach 4.1 Introduction The missing data problem is already a classical problem that has not been yet solved satisfactorily. This problem includes those situations where the dependent

More information

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information