Collated responses from R-help on confidence intervals for risk ratios

Similar documents
Federated analyses. technical, statistical and human challenges

Lecture 12: Effect modification, and confounding in logistic regression

Epidemiology Wonders of Biostatistics Chapter 13 - Effect Measures. John Koval

BIOS 312: Precision of Statistical Inference

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Statistics in medicine

Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

Practical Considerations Surrounding Normality

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn

Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Generalized linear models

Poisson regression: Further topics

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

Math 70 Homework 8 Michael Downs. and, given n iid observations, has log-likelihood function: k i n (2) = 1 n

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S

Logistic Regression - problem 6.14

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

9 Generalized Linear Models

One-stage dose-response meta-analysis

Correlation and Simple Linear Regression

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Chapter 22: Log-linear regression for Poisson counts

Linear Regression Models P8111

Case-control studies

BOOTSTRAPPING WITH MODELS FOR COUNT DATA

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Standard RE Model. Critique of the RE Model 2016/08/12

Introduction to logistic regression

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Survival analysis in R

Case-control studies C&H 16

Modeling Overdispersion

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

LOGISTIC REGRESSION Joseph M. Hilbe

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection

Reports of the Institute of Biostatistics

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials.

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Regression Models for Risks(Proportions) and Rates. Proportions. E.g. [Changes in] Sex Ratio: Canadian Births in last 60 years

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Generalized Linear Models

Power and Sample Size Calculations with the Additive Hazards Model

In Defence of Score Intervals for Proportions and their Differences

Categorical data analysis Chapter 5

An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent.

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Chapter 5: Generalized Linear Models

Does low participation in cohort studies induce bias? Additional material

Logistic regression: Miscellaneous topics

Statistical Modelling with Stata: Binary Outcomes

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

Notes for week 4 (part 2)

Testing Independence

Explanatory variables are: weight, width of shell, color (medium light, medium, medium dark, dark), and condition of spine.

Poisson regression 1/15

Exercises R For Simulations Columbia University EPIC 2015 (with answers)

Counting Statistics and Error Propagation!

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Single-level Models for Binary Responses

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression

Stat Lecture 20. Last class we introduced the covariance and correlation between two jointly distributed random variables.

Topic 15: Simple Hypotheses

A Reliable Constrained Method for Identity Link Poisson Regression

Interpreting Regression

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Answer Key for STAT 200B HW No. 8

Turning a research question into a statistical question.

Supplementary materials for:

Inference for Binomial Parameters

Logistic Regression Models to Integrate Actuarial and Psychological Risk Factors For predicting 5- and 10-Year Sexual and Violent Recidivism Rates

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

Lecture 8. Poisson models for counts

Semiparametric Regression

STA6938-Logistic Regression Model

Meta-analysis of epidemiological dose-response studies

Poisson Regression. Ryan Godwin. ECON University of Manitoba

Sociology 362 Data Exercise 6 Logistic Regression 2

MODELING COUNT DATA Joseph M. Hilbe

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Lecture 3.1 Basic Logistic LDA

Introduction to General and Generalized Linear Models

Prediction of Bike Rental using Model Reuse Strategy

Maximum-Likelihood Estimation: Basic Ideas

Multivariate Dose-Response Meta-Analysis: the dosresmeta R Package

Categorical Data Analysis Chapter 3

Survival analysis in R

Confidence Intervals for the Interaction Odds Ratio in Logistic Regression with Two Binary X s

Transcription:

Collated responses from R-help on confidence intervals for risk ratios Michael E Dewey November, 2006 Introduction This document arose out of a problem assessing a confidence interval for the risk ratio quoted by authors of a paper I was reviewing. As an example which is close to theirs but made slightly simpler we suppose we have two groups each of size 500 and in one group we observed 8 events and in the other group zero events giving us an observed risk ratio of. Notation We assume we have r i events in group i out of n i trials (i = 1, 2). We have a proportion p i = r i n i in each group. Methods proposed Wald method We use the usual formulae for the log risk ratio and its variance but add a small constant to the numerator r i. Section of Epidemiology, Division of Psychological Medicine, PO 60, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK. mailto:m.dewey@iop.kcl.ac.uk Printed November 22, 2006 at 10:52 1

Figure 1: Lower limit of confidence interval for the risk ratio for a range of values of delta f u n c t i o n s f o r Wald method rr <- function (x1,n1,x2,n2, conflev, delta = 0.5) { x1. d <- x1 + delta x2. d <- x2 + delta lrr <- log (( x1.d/n1) / (x2.d/n2 )) se.lrr <- sqrt (1/x1.d - 1/n1 + 1/x2.d - 1/n2) mult <- abs ( qnorm ((1 - conflev )/ 2)) res <- list ( lrr = lrr, se.lrr = se.lrr, low. lim = lrr - mult * se.lrr, high. lim = lrr + mult * se.lrr ) res wrapper <- function (x) { exp (rr (8, 500, 0, 500, 0.95, x)$ low. lim ) plot ( wrapper, 0.1, 0.9, xlab = " delta ", ylab = " Lower limit of ci") Example 1: R code for Wald method RR = log ( r 1+δ n 1 ) ( r 2+δ n 2 ) (1) The usual choice for δ is 0.5 although there does not seem to be any theory behind this. var(rr) = 1 r 1 + δ 1 n 1 + 1 r 2 + δ 1 n 2 (2) In fact the outcome is quite sensitive to the choice of δ as can be seen from Page 2

f u n c t i o n f o r t r a n s f o r m i n g odds r a t i o rr. from.or <- function (p1, or) { or/((1 - p1) + p1 * or) rr. from.or (8/500, fisher. test ( matrix (c(8, 492, 0, 500), ncol = 2, byrow = TRUE ))$ conf. int [1]) Example 2: R code for the OR to RR transformation Figure 1. Example 1 shows R code for calculating the confidence interval and for producing the plot in Figure 1. Score test based intervals Code has been published for generating confidence intervals by inverting a score test. It is available from http://web.stat.ufl.edu/~aa/cda/r/two_ sample/r2/ Using the interval for the odds ratio We know (Zhang and Yu, 1998) that RR = OR (1 p 1 ) + p 1 OR (3) where p 1 is the rate in the unexposed group (called p 0 in Zhang and Yu (1998)). So having estimated the lower limit of the interval for the odds ratio (using fisher.test or one of a variety of other R packages) we can transform it into a risk ratio. Example 2 shows code for doing this for the example. McNutt et al. (2003) suggest that this method is not in general a good thing and may be biased. Using the profile likelihood We can fit a model to the table using Poisson regression and then use profile likelihood methods to find the value of the relative risk at which the log likelihood differs from that for the full model by 3.84 (for 95% confidence interval). Figure 2 shows a plot of the deviance against values for the log Page 3

Figure 2: Deviance and log relative risk f u n c t i o n s f o r l i k e l i h o o d based method prof <- function ( delta ) { v e c t o r i s e d to e n a b l e p l o t t i n g n. delta <- length ( delta ) dev <- numeric (n. delta ) for (i in 1:n. delta ) { dev [i] <- glm (r ~ offset ( treat * delta [i] + log (n)), family = poisson, data = adverse ) $ deviance dev prof. uniroot <- function ( delta ) { prof ( delta ) - qchisq (0.95, 1) adverse <- data. frame (r = c(8,0), n = c (500,500), treat = c (1,0)) uniroot ( prof. uniroot, interval = c ( -2,2)) Example 3: R code for the likelihood method relative risk. The critical value is obtained around 1.3 Example 3 shows code for evaluating the likelihood and for plotting Figure 2. It also contains code for finding the point at which the deviance differs by the prescribed amount from the deviance for the saturated model (in this case zero). Conclusions Table 1 summarises the values obtained using the various methods. Page 4

Method Estimate 95%ci Wald 17 0.98 293.7 Score test 2.09 Transform odds ratio 1.70 Profile likelihood 3.69 Table 1: Comparison of confidence intervals using the proposed methods It seems strange to use the Wald method as it is sensitive to the choice of δ and also fails to include the sample estimate of relative risk. General problems with the transformation of the odds ratio suggest another method might be better. The choice seems therefore to lie between the score test based method and that based on likelihood. To do Work out why the values are so different (incompatible assumptions about the other end of the ci?) Work out how to provide starting values so confint from MASS works for the likelihood method. Acknowledgments Thanks to David Duffy, Spencer Graves, Jukka Jokinen, Terry Therneau, and Wolfgan Viechtbauer. The views and calculations presented here are my own and any errors are my responsibility not theirs. References L-A McNutt, C Wu, X Xue, and J P Hafner. Estimating the relative risk in cohort studies and clinical trials of common outcomes. American Journal of Epidemiology, 157:940 943, 2003. J Zhang and K F Yu. What s the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. Journal of the American Medical Association, 280:1690 1691, 1998. Page 5