Meta-analysis of epidemiological dose-response studies

Similar documents
One-stage dose-response meta-analysis

Multivariate Dose-Response Meta-Analysis: the dosresmeta R Package

Journal of Statistical Software

Correlation and Simple Linear Regression

Working with Stata Inference on the mean

Point-wise averaging approach in dose-response meta-analysis of aggregated data

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Homework Solutions Applied Logistic Regression

The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error

Confidence intervals for the variance component of random-effects linear models

Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling

Case-control studies

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning

Simulating complex survival data

General Linear Model (Chapter 4)

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

multilevel modeling: concepts, applications and interpretations

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt

The simulation extrapolation method for fitting generalized linear models with additive measurement error

A procedure to tabulate and plot results after flexible modeling of a quantitative covariate

UNIVERSITY OF TORONTO Faculty of Arts and Science

Instantaneous geometric rates via Generalized Linear Models

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

STATISTICS 110/201 PRACTICE FINAL EXAM

especially with continuous

Hypothesis Testing for Var-Cov Components

Essential of Simple regression

Empirical Application of Simple Regression (Chapter 2)

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Mediation Analysis: OLS vs. SUR vs. 3SLS Note by Hubert Gatignon July 7, 2013, updated November 15, 2013

Problem Set 10: Panel Data

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -

Lecture 12: Effect modification, and confounding in logistic regression

Jeffrey M. Wooldridge Michigan State University

Heteroskedasticity Example

Sociology 362 Data Exercise 6 Logistic Regression 2

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like.

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm

Lab 11 - Heteroskedasticity

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Binary Dependent Variables

Statistical Modelling in Stata 5: Linear Models

An R # Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM

Understanding the multinomial-poisson transformation

options description set confidence level; default is level(95) maximum number of iterations post estimation results

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

8 Analysis of Covariance

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Graduate Econometrics Lecture 4: Heteroskedasticity

Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons

sociology 362 regression

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Unit 2 Regression and Correlation Practice Problems. SOLUTIONS Version STATA

M statistic commands: interpoint distance distribution analysis with Stata

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Review of Statistics 101

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

LDA Midterm Due: 02/21/2005

Known unknowns : using multiple imputation to fill in the blanks for missing data

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data

Meta-analysis. 21 May Per Kragh Andersen, Biostatistics, Dept. Public Health

Lecture 5: Poisson and logistic regression

Handout 11: Measurement Error

Trends in Human Development Index of European Union

Lecture 3 Linear random intercept models

Lecture 10: Introduction to Logistic Regression

Introduction to logistic regression

Generalized linear models

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Practice: Basic Linear-Interactive Model

Statistical Modelling with Stata: Binary Outcomes

Econometrics Midterm Examination Answers

Week 3: Simple Linear Regression

Lecture 3.1 Basic Logistic LDA

Meta-Analysis in Stata, 2nd edition p.158 Exercise Silgay et al. (2004)

CAMPBELL COLLABORATION

Logit estimates Number of obs = 5054 Wald chi2(1) = 2.70 Prob > chi2 = Log pseudolikelihood = Pseudo R2 =

Course Econometrics I

Longitudinal Modeling with Logistic Regression

Calculating Odds Ratios from Probabillities

Microeconometrics (PhD) Problem set 2: Dynamic Panel Data Solutions

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Lecture 2: Poisson and logistic regression

Lecture#12. Instrumental variables regression Causal parameters III

Correlation and simple linear regression S5

Lecture 4: Generalized Linear Mixed Models

Transcription:

Meta-analysis of epidemiological dose-response studies Nicola Orsini 2nd Italian Stata Users Group meeting October 10-11, 2005 Institute Environmental Medicine, Karolinska Institutet Rino Bellocco Dept. Medical Epidemiology and Biostatistics, Karolinska Institutet Sander Greenland Dept. Epidemiology, UCLA School of Public Health 1

Outline Motivating example - Case-control and Incidence Rate Data The statistical model and estimation method How to fit the variance-covariance matrix Analysis of multiple studies Modeling sources of heterogeneity 2

Meta-analysis Larsson S.C., Orsini N., Wolk A., Milk, milk products and lactose intake and ovarian cancer risk: A meta-analysis of epidemiological studies, Int J Cancer, 2005. 6 Case-control studies 3 Cohort studies. use http://nicolaorsini.altervista.org/2ism/ovcancer, clear 3

. list if id < 4 id == 9, clean id author year study adjrr lb ub dose case n 1. 1 Engle 1991 CC 1 1 1 0 15 50 2. 1 Engle 1991 CC.9.4 2.2 5.5 21 56 3. 1 Engle 1991 CC 1.3.6 2.9 11.5 35 54 4. 1 Engle 1991 CC.9.4 2 18 16 52 5. 2 Risch 1994 CC 1 1 1 0 97 232 6. 2 Risch 1994 CC 1.04.71 1.53 9 107 250 7. 2 Risch 1994 CC.86.58 1.28 16 102 243 8. 2 Risch 1994 CC 1.07.72 1.59 24 143 284 9. 3 Webb 1998 CC 1 1 1 0 128 292 10. 3 Webb 1998 CC 1.01.71 1.43 11 133 297 11. 3 Webb 1998 CC 1.06.74 1.51 16 134 296 12. 3 Webb 1998 CC 1.4.98 2 24 177 328 13. 3 Webb 1998 CC.97.67 1.41 38 149 317 34. 9 Larsson 2005 IR 1 1 1 5.9 54 227238 35. 9 Larsson 2005 IR 1.3.9 1.88 12.6 68 219977 36. 9 Larsson 2005 IR 1.23.86 1.76 18.1 74 222101 37. 9 Larsson 2005 IR 1.48 1.05 2.09 27.7 92 225412 4

0 1 2 3 4 Milk intake (grams/day) CC smoothed CC IR smoothed IR log(adjusted RR).6.4.2 0.2.4 5

Fixed-effects Dose-Response Model y = Xβ + ǫ where y is a n 1 vector of beta coefficients (log odds ratios, log rate ratios, log risk ratios) X is a n p fixed-effects design matrix (no intercept). x i1 is assumed to be the exposure variable, where i = 1, 2,..., n identifies non-reference exposure levels β is a p 1 vector of unknown coefficients ǫ is a n 1 vector of random errors, such that ǫ N(0, Σ) 6

Generalized Least Squares Suppose for now that the variance-covariance matrix of the error Σ is known. This method involves minimizing (y Xβ) Σ 1 (y Xβ) with respect to β. The resulting estimator β of the regression coefficients β is β = (X Σ 1 X) 1 X Σ 1 y and the estimated covariance matrix V of β is V = Cov( β) = (X Σ 1 X) 1 7

Variance-Covariance Matrix Σ = σ 11.... σ i1 σ ij.... σ n1... σ nj... σ nn In Weighted Least Square (WLS) the off-diagonal elements of Σ are set to zeros (y are independent). In Generalized Least Squares (GLS) the off-diagonal elements Σ may not be zeros (y are dependent). 8

Statistical problems using WLS Because the relative risks are estimated using a common referent group they are not independent. The WLS method would lead to Inefficiency of the slope estimator Inconsistency of the variance estimator In a meta-analysis of summarized dose-response data underestimation of the variance of the slope leads to overestimation of the weight. 9

How to calculate the variances The diagonal element σ ij of Σ, with i = j, and simply denoted by σ i, the variance of the beta coefficient y i, is calculated from the normal-theory-based confidence limits σ i = [(log(u b ) log(l b ))/(2 z α/2 )] 2 where u b and l b are, respectively, the upper and lower bounds of the reported exp(y i ), z α/2 denotes the (1 α/2)-level standard normal deviate (e.g. use 1.96 for 95% confidence interval). 10

Information required to estimate covariances As described by Greenland and Longnecker (1992), for each exposure levels, i = 1,2,..., n, we need to know the number of cases and, according to the type of study number of controls in Case-Control (CC) Data number of person-time in Incidence-Rate (IR) Data number of non-cases in Cumulative Incidence (CI) Data 11

How to calculate the covariances in CC Exposure levels x 01 x 11... x i1... x n1 Total Cases A 0 A 1... A i... A n M 1 = n i=0 A i Controls B 0 B 1... B i... B n M 0 = n i=0 B i Total N 0 N 1... N i... N n M 1 + M 0 1. Fit cell counts to the interior of the 2 (n + 1) summary table (which has margin M 1 and N i ), such that (A i B 0 )/(A 0 B i ) = exp(y i ) 12

2. Estimate the asymptotic correlation, r ij, by r ij = s 0 /(s i s j ) 1/2 where s 0 = (1/A 0 + 1/B 0 ) and s i = (1/A i + 1/B i + 1/A 0 + 1/B 0 ). 3. Estimate the off-diagonal elements, σ ij, of the asymptotic covariance matrix Σ by σ ij = r ij (σ i σ j ) 1/2 where σ i and σ j are the variances of y i and y j.

How to calculate the covariances in IR Exposure levels x 01 x 11... x i1... x n1 Total Cases A 0 A 1... A i... A n M 1 = n i=0 A i Person-time N 0 N 1... N i... N n M 0 = n i=0 N i 1. Fit cell counts such that (A i N 0 )/(A 0 N i ) = exp(y i ) 2. Estimate the correlations r ij = s 0 /(s i s j ) 1/2 where s 0 = (1/A 0 ) and s i = (1/A i + 1/A 0 ) 3. Estimate the covariances σ ij = r ij (σ i σ j ) 1/2 13

Heterogeneity The analysis of the estimated residual vector ǫ = y X β is useful to evaluate how close reported and fitted beta coefficients are at each exposure level. A statistic for the goodness of fit of the model is Q = (y X β) Σ 1 (y X β) where Q has approximately, under the null hypothesis, a χ 2 distribution with n p degrees of freedom. 14

Example: WLS trend for a single study. vwls logrr dose if id == 9 & logrr!= 0, sd(se) nocons Variance-weighted least-squares regression Number of obs = 3 Goodness-of-fit chi2(2) = 0.27 Model chi2(1) = 7.95 Prob > chi2 = 0.8728 Prob > chi2 = 0.0048 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose.1423799.0505126 2.82 0.005.043377.2413827 15

Example: GLS trend for a single study. glst logrr dose if id == 9, se(se) cov(n case) ir Generalized least-squares regression Number of obs = 3 Goodness-of-fit chi2(2) = 0.56 Model chi2(1) = 4.49 Prob > chi2 = 0.7553 Prob > chi2 = 0.0340 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose.1309131.0617632 2.12 0.034.0098594.2519669. mat list e(sigma) symmetric e(sigma)[3,3] c1 c2 c3 r1.03531387 r2.0189355.03337611 r3.01899974.01828257.03083846 16

Meta-analysis of multiple studies with fixed-effects models Let s define the matrices y k and X k, respectively, the n k 1 response vector and the n k p covariates matrix for the k th study, with k = 1,2,..., S. The number of non-reference exposure levels n k for the k th study might varies among the S studies. Let s pool the data by appending the matrices y k and X k underneath each other, y = y 1. y k. y S X = X 1. X k. X S 17

The outcome variable y the of dose-response model will be a T 1 vector, with T = S k=1 n k ; and the linear predictor X will be a T p matrix. Let Σ be a symmetric T T block-diagonal matrix, Σ = Σ 1.... 0. Σ k... 0... 0... Σ S where Σ k is the n k n k estimated covariance matrix for the k th study.

Example: Trend for multiple studies. glst logrr dose, se(se) cov(n case) pfirst(id study) Generalized least-squares regression Number of obs = 28 Goodness-of-fit chi2(27) = 40.25 Model chi2(1) = 1.11 Prob > chi2 = 0.0486 Prob > chi2 = 0.2925 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose.0254944.0242201 1.05 0.293 -.0219761.0729648 Overall, there is no evidence of association between milk intake and risk of ovarian cancer. However, the goodness-of-fit test (Q=40.25, p = 0.0486) suggests that we should take into account potential sources of heterogeneity. 18

Example: Trend estimate for case-control studies. glst logrr dose if study == 1, se(se) cov(n case) pfirst(id study) Generalized least-squares regression Number of obs = 18 Goodness-of-fit chi2(17) = 24.02 Model chi2(1) = 1.22 Prob > chi2 = 0.1190 Prob > chi2 = 0.2699 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose -.0340478.0308599-1.10 0.270 -.094532.0264365 No association between milk intake and risk of ovarian cancer was found among 6 case-control studies. 19

Example: Trend estimate for cohort studies. glst logrr dose if study == 2, se(se) cov(n case) pfirst(id study) Generalized least-squares regression Number of obs = 10 Goodness-of-fit chi2(9) = 6.54 Model chi2(1) = 9.58 Prob > chi2 = 0.6852 Prob > chi2 = 0.0020 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose.1209988.0390836 3.10 0.002.0443964.1976012 A positive association between milk intake and risk of ovarian cancer was found among 3 cohort studies. 20

Modeling sources of heterogeneity. gen types = study == 2. gen dosextypes = dose*types. glst logrr dose dosextypes, se(se) cov(n case) pfirst(id study) Generalized least-squares regression Number of obs = 28 Goodness-of-fit chi2(26) = 30.55 Model chi2(2) = 10.80 Prob > chi2 = 0.2453 Prob > chi2 = 0.0045 logrr Coef. Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- dose -.0340478.0308599-1.10 0.270 -.094532.0264365 dosextypes.1550465.0497982 3.11 0.002.0574439.2526492 A systematic difference in slopes related to study design might results, for instance, from the existence of recall bias in the casecontrol studies that would not be present in the cohort studies. 21

Interpretation of the slopes (trend). lincom dose + dosextypes*0, eform ( 1) dose = 0 logrr exp(b) Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- (1).9665254.0298269-1.10 0.270.9097986 1.026789. lincom dose + dosextypes*1, eform ( 1) dose + dosextypes = 0 logrr exp(b) Std. Err. z P> z [95% Conf. Interval] -------------+---------------------------------------------------------------- (1) 1.128624.0441106 3.10 0.002 1.045397 1.218476 22

log(adjusted RR).6.4.2 0.2.4 0 1 2 3 4 Milk intake (grams/day) CC fitted CC IR fitted IR 23

Conclusions The findings of case-control studies do not provide evidence of positive associations between dairy food and lactose intakes with risk of ovarian cancer. In contrast, the 3 cohort studies are consistent and show significant positive associations between intakes of total dairy foods, low-fat milk, and lactose and risk of ovarian cancer. The summary estimate of the relative risk for a daily increase of 10 g/day in lactose intake (the approximate amount in 1 glass of milk) was 1.13 (95% CI = 1.05-1.22) for cohort studies. 24

About the command The command glst is written for Stata 9. It uses in-line Mata functions, the new matrix programming language (help mata) for the Iterative fitting algorithm (Newton s method) to get Σ Generalized Least Squares estimator Confidence bounds of the covariances Σ 25

Download To install the glst command and run the do-file with the examples, type at the Stata command line. do http://nicolaorsini.altervista.org/2ism/glst exs.do glst is downloadable from Nicola s website. net from http://nicolaorsini.altervista.org/stata or from Statistical Software Components (SSC) archive. ssc install glst 26

References S. Greenland and M. P. Longnecker, Methods for trend estimation from summarized dose-reponse data, with applications to meta-analysis, AJE, 135, 1301-1309, 1992 A. Berrington and D. R. Cox, Generalized least squares for the synthesis of correlated information, Biostatistics, 4, 423-431, 2003 J. Q. Shi and J. B. Copas, Meta-analysis for trend estimation, Statistics in Medicine, 23, 3-19, 2004 N. Orsini, R.Bellocco and S. Greenland, Generalized Least Squares for trend estimation of summarized dose-response data, submitted, 2005 27