Lecture 2: Poisson and logistic regression

Similar documents
Lecture 5: Poisson and logistic regression

Lecture 4: Generalized Linear Mixed Models

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 10: Introduction to Logistic Regression

Homework Solutions Applied Logistic Regression

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Correlation and regression

8 Nominal and Ordinal Logistic Regression

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Sociology 362 Data Exercise 6 Logistic Regression 2

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

4. MA(2) +drift: y t = µ + ɛ t + θ 1 ɛ t 1 + θ 2 ɛ t 2. Mean: where θ(l) = 1 + θ 1 L + θ 2 L 2. Therefore,

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

Case-control studies

Statistical Modelling with Stata: Binary Outcomes

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

Generalized logit models for nominal multinomial responses. Local odds ratios

Example 7b: Generalized Models for Ordinal Longitudinal Data using SAS GLIMMIX, STATA MEOLOGIT, and MPLUS (last proportional odds model only)

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio

Logistic Regression. Building, Interpreting and Assessing the Goodness-of-fit for a logistic regression model

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

One-stage dose-response meta-analysis

Stat 642, Lecture notes for 04/12/05 96

Latent class analysis and finite mixture models with Stata

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Introduction to logistic regression

LOGISTIC REGRESSION Joseph M. Hilbe

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Unit 5 Logistic Regression

Consider Table 1 (Note connection to start-stop process).

Unit 5 Logistic Regression

Unit 5 Logistic Regression

Linear Regression Models P8111

Beyond GLM and likelihood

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Binary Dependent Variables

Outline. 1 Association in tables. 2 Logistic regression. 3 Extensions to multinomial, ordinal regression. sociology. 4 Multinomial logistic regression

ZERO INFLATED POISSON REGRESSION

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

multilevel modeling: concepts, applications and interpretations

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

Lecture 3.1 Basic Logistic LDA

More Statistics tutorial at Logistic Regression and the new:

What is Latent Class Analysis. Tarani Chandola

Single-level Models for Binary Responses

Exercise 7.4 [16 points]

Part IV Statistics in Epidemiology

Measures of Fit from AR(p)

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Chapter 22: Log-linear regression for Poisson counts

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News:

Chapter 20: Logistic regression for binary response variables

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt

STA6938-Logistic Regression Model

Basic Medical Statistics Course

Group Comparisons: Differences in Composition Versus Differences in Models and Effects

MAS3301 / MAS8311 Biostatistics Part II: Survival

A Journey to Latent Class Analysis (LCA)

Models for Count and Binary Data. Poisson and Logistic GWR Models. 24/07/2008 GWR Workshop 1

Generalized linear models

STAT 7030: Categorical Data Analysis

Understanding the multinomial-poisson transformation

Testing methodology. It often the case that we try to determine the form of the model on the basis of data

Introducing Generalized Linear Models: Logistic Regression

especially with continuous

Multinomial Logistic Regression Models

Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Chapter 4 Regression Models

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Addition to PGLR Chap 6

Sections 4.1, 4.2, 4.3

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

Categorical and Zero Inflated Growth Models

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

Logistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ

Regression modeling for categorical data. Part II : Model selection and prediction

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

Discrete Choice Modeling

Modelling Binary Outcomes 21/11/2017

9 Generalized Linear Models

β j = coefficient of x j in the model; β = ( β1, β2,

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Confounding, mediation and colliding

Transcription:

Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014

introduction to Poisson regression application to the BELCAP study introduction to logistic regression confounding and effect modification comparing of different generalized regression models meta-analysis of BCG vaccine against tuberculosis

introduction to Poisson regression the Poisson distribution count data may follow such a distribution, at least approximately Examples: number of deaths, of diseased cases, of hospital admissions and so on... Y Po(µ): where µ > 0 P(Y = y) = µ y exp( µ)/y!

introduction to Poisson regression 0.5 0.4 P(Y=y) 0.3 0.2 0.1 0 1 2 3 4 y 5 6 7 8 9 10

introduction to Poisson regression but why not use a linear regression model? for a Poisson distribution we have E(Y ) = Var(Y ). This violates the constancy of variance assumption (for the conventional regression model) a conventional regression model assumes we are dealing with a normal distribution for the response Y, but the Poisson distribution may not look very normal the conventional regression model may give negative predicted means (negative counts are impossible!)

introduction to Poisson regression the Poisson regression model log E(Y i ) = log µ i = α + βx i the RHS of the above is called the linear predictor Y i Po(µ i ) this model is the log-linear model

introduction to Poisson regression the Poisson regression model can be written equivalently as log E(Y i ) = log µ i = α + βx i µ i = exp[α + βx i ] Hence it is clear that any fitted log-linear model will always give non-negative fitted values!

introduction to Poisson regression an interesting interpretation in the Poisson regression model suppose x represents a binary variable (yes/no, treatment present/not present) { 1 if person is in intervention group x = 0 otherwise log E(Y ) = log µ = α + βx x = 0: log µ no intervention = α + βx = α x = 1: log µ intervention = α + βx = α + β hence log µ intervention log µ no intervention = β

introduction to Poisson regression an interesting interpretation in the Poisson regression model hence or log µ intervention log µ no intervention = β µ intervention µ no intervention = exp(β) the coefficient exp(β) corresponds to the risk ratio comparing the mean risk in the treatment group to the mean risk in the control group

introduction to Poisson regression Poisson regression model for several covariates log E(Y i ) = α + β 1 x 1i + + β p x pi where x 1i,, x pi are the covariates of interest testing the effect of covariate x j is done by the size of the estimate ˆβ j of β j t j = ˆβ j s.e.( ˆβ j ) if t j > 1.96 covariate effect is significant

introduction to Poisson regression estimation of model parameters consider the likelihood (the probability for the observed data) L = n i=1 for model with p covariates: µ y i i exp( µ i )/y i! log µ i = α + β 1 x i1 + β 2 x i2 +... + β p x ip finding parameter estimates by maximizing the likelihood L (or equivalently the log-likelihood log L) guiding principle: choosing the parameters that make the observed data the most likely

application to the BELCAP study The simple regression model for BELCAP with Y = DMFSe: log E(DMFSe i ) = α + β 1 OHE i + β 2 ALL 2i + β 4 ESD i + β 5 MW i + β 6 OHY i + β 7 DMFSb i { 1 if child i is in intervention OHE OHE i = 0 otherwise { 1 if child i is in intervention ALL ALL i = 0 otherwise

application to the BELCAP study analysis of BELCAP study using the Poisson regression model including the DMFS at baseline covariate ˆβj s.e.( ˆβ j ) t j P-value OHE -0.7043014 0.0366375-6.74 0.000 ALL -0.5729402 0.0355591-8.97 0.000 ESD -0.8227017 0.0418510-3.84 0.000 MW -0.6617572 0.0334654-8.16 0.000 OHY -0.7351562 0.0402084-5.63 0.000 DMFSb 1.082113 0.0027412 31.15 0.000

introduction to logistic regression Introduction to logistic regression Binary Outcome Y Y = { 1, Person diseased 0, Person healthy Probability that Outcome Y = 1 Pr(Y = 1) = p is probability for Y = 1

introduction to logistic regression Odds odds = p 1 p p = odds odds + 1 Examples p = 1/2 odds = 1 p = 1/4 odds = 1/3 p = 3/4 odds = 3/1 = 3

introduction to logistic regression Odds Ratio OR = odds( in exposure ) odds( in non-exposure ) = p 1/(1 p 1 ) p 0 /(1 p 0 ) Properties of odds ratio 0 < OR < OR = 1(p 1 = p 0 ) is reference value

introduction to logistic regression Examples risk = risk = { p 1 = 1/4 p 0 = 1/8 { p 1 = 1/100 p 0 = 1/1000 effect measure = eff. meas. = Fundamental Theorem of Epidemiology p 0 small OR RR { OR = p 1 /(1 p 1 ) p 0 /(1 p 0 ) = 1/3 1/7 = 2.33 RR = p 1 p 0 = 2 { OR = 1/99 1/999 = 10.09 RR = p 1 p 0 = 10 benefit: OR is interpretable as RR which is easier to deal with

introduction to logistic regression A simple example: Radiation Exposure and Tumor Development cases non-cases E 52 2820 2872 NE 6 5043 5049 odds and OR odds for disease given exposure (in detail): 52/2872 2820/2872 = 52/2820 odds for disease given non-exposure (in detail): 6/5049 5043/5049 = 6/5043

introduction to logistic regression A simple example: Radiation Exposure and Tumor Development cases non-cases E 52 2820 2872 NE 6 5043 5049 OR odds ratio for disease (in detail): OR = 52/2820 6/5043 = 52 5043 6 2820 = 15.49 or, log OR = log 15.49 = 2.74 for comparison RR = 52/2872 6/5049 = 15.24

introduction to logistic regression Logistic regression model for this simple situation where log p x 1 p x = α + βx p x = Pr(Y = 1 x) { 1, if exposure present x = 0, if exposure not present log px 1 p x is called the logit link that connects p x with the linear predictor

introduction to logistic regression benefits of the logistic regression model is feasible since whereas log p x = p x 1 p x = α + βx exp(α + βx) (0, 1) 1 + exp(α + βx) p x = α + βx is not feasible

introduction to logistic regression Interpretation of parameters α and β log p x 1 p x = α + βx now p 0 x = 0 : log 1 p 0 = α (1) p 1 x = 1 : log 1 p 1 = α + β (2) p 1 (2) (1) = log log = α + β α = β 1 p 1 1 p }{{ 0 } log p 1 1 p 1 p 0 1 p 0 =log OR p 0 log OR = β OR = e β

confounding and effect modification A simple illustration example cases non-cases E 60 1100 1160 NE 1501 3100 4601 OR odds ratio: OR = 60 3100 1501 1100 = 0.1126

confounding and effect modification stratified: Stratum 1: Stratum 2: cases non-cases E 50 100 150 NE 1500 3000 4500 OR = 50 3000 100 1500 = 1 cases non-cases E 10 1000 1010 NE 1 100 101 OR = 10 100 1000 1 = 1

confounding and effect modification +------------------+ Y E S freq ------------------ 1. 1 1 0 50 2. 0 1 0 100 3. 1 0 0 1500 4. 0 0 0 3000 5. 1 1 1 10 6. 0 1 1 1000 7. 1 0 1 1 8. 0 0 1 100 +------------------+

confounding and effect modification The logistic regression model for simple confounding where log p x 1 p x = α + βe + γs x = (E, S) is the covariate combination of exposure E and stratum S

confounding and effect modification in detail for stratum 1 now log p x 1 p x = α + βe + γs p 0,0 E = 0, S = 0 : log 1 p 0,0 = α (3) p 1,0 E = 1, S = 0 : log 1 p 1,0 = α + β (4) (4) (3) = log OR 1 = α + β α = β log OR = β OR = e β the log-odds ratio in the first stratum is β

confounding and effect modification in detail for stratum 2: now: log p x 1 p x = α + βe + γs p 0,1 E = 0, S = 1 : log 1 p 0,1 = α + γ (5) p 1,1 E = 1, S = 1 : log 1 p 1,1 = α + β + γ (6) (6) (5) = log OR 2 = α + β + γ α γ = β the log-odds ratio in the second stratum is β

confounding and effect modification important property of the confounding model: assumes the identical exposure effect in each stratum!

confounding and effect modification (crude analysis) Logistic regression Log likelihood = -3141.5658 ----------------------------------------------------------- Y Odds Ratio Std. Err. [95% Conf. Interval] -------------+--------------------------------------------- E.1126522.0153479.0862522.1471326 ----------------------------------------------------------- (adjusted for confounder) Logistic regression Log likelihood = -3021.5026 ----------------------------------------------------------- Y Odds Ratio Std. Err. [95% Conf. Interval] -------------+--------------------------------------------- E 1.1736619.7115062 1.405469 S.02.0068109.0102603.0389853 -----------------------------------------------------------

confounding and effect modification A simple illustration example: passive smoking and lung cancer cases non-cases E 52 121 173 NE 54 150 204 OR odds ratio: OR = 52 150 54 121 = 1.19

confounding and effect modification stratified: Stratum 1 (females): Stratum 2 (males): cases non-cases E 41 102 143 NE 26 71 97 OR = 41 71 26 102 = 1.10 cases non-cases E 11 19 30 NE 28 79 107 OR = 11 79 19 28 = 1.63

confounding and effect modification interpretation: effect changes from one stratum to the next stratum!

confounding and effect modification The logistic regression model for effect modification where log p x = α + βe + γs + (βγ) E S 1 p x }{{} effect modif. par. x = (E, S) is the covariate combination of exposure E and stratum S

confounding and effect modification in detail for stratum 1 now log p x 1 p x = α + βe + γs + (βγ)e S p 0,0 E = 0, S = 0 : log 1 p 0,0 = α (7) p 1,0 E = 1, S = 0 : log 1 p 1,0 = α + β (8) (8) (7) = log OR 1 = α + β α = β log OR = β OR = e β the log-odds ratio in the first stratum is β

confounding and effect modification in detail for stratum 2: log p x 1 p x = α + βe + γs + (βγ)e S now: p 0,1 E = 0, S = 1 : log 1 p 0,1 = α + γ (9) p 1,1 E = 1, S = 1 : log 1 p 1,1 = α + β + γ + (βγ) (10) (10) (9) = log OR 2 = α + β + γ + (βγ) α γ = β + (βγ) log OR = β OR = e β+(βγ) the log-odds ratio in the second stratum is β + (βγ)

confounding and effect modification important property of the effect modification model: effect modification model allows for different effects in the strata!

confounding and effect modification Data from passive smoking and LC example are as follows: +-----------------------+ Y E S ES freq ----------------------- 1. 1 1 0 0 41 2. 0 1 0 0 102 3. 1 0 0 0 26 4. 0 0 0 0 71 5. 1 1 1 1 11 ----------------------- 6. 0 1 1 1 19 7. 1 0 1 0 28 8. 0 0 1 0 79 +-----------------------+

confounding and effect modification CRUDE EFFECT MODEL Logistic regression Log likelihood = -223.66016 -------------------------------------------------------- Y Coef. Std. Err. z P> z -------------+------------------------------------------ E.1771044.2295221 0.77 0.440 _cons -1.021651.1586984-6.44 0.000 --------------------------------------------------------

confounding and effect modification CONFOUNDING MODEL Logistic regression Log likelihood = -223.56934 ------------------------------------------------------- Y Coef. Std. Err. z P> z -------------+----------------------------------------- E.2158667.2472221 0.87 0.383 S.1093603.2563249 0.43 0.670 _cons -1.079714.2101705-5.14 0.000 -------------------------------------------------------

confounding and effect modification EFFECT MODIFICATION MODEL Logistic regression Log likelihood = -223.2886 ------------------------------------------------------ Y Coef. Std. Err. z P> z -------------+---------------------------------------- E.0931826.2945169 0.32 0.752 S -.03266.3176768-0.10 0.918 ES.397517.5278763 0.75 0.451 _cons -1.004583.2292292-4.38 0.000 -------------------------------------------------------

confounding and effect modification interpretation of crude effects model: log OR = 0.1771 OR = e 0.1771 = 1.19 interpretation of confounding model: log OR = 0.2159 OR = e 0.2159 = 1.24

confounding and effect modification interpretation of effect modification model: stratum 1: stratum 2: log OR 1 = 0.0932 OR 1 = e 0.0932 = 1.10 log OR 2 = 0.0932 + 0.3975 OR 2 = e 0.0932+0.3975 = 1.63

comparing of different generalized regression models Model evaluation in logistic regression: the likelihood approach: L = n i=1 is called the likelihood for models log p xi 1 p xi = p y i x i (1 p xi ) 1 y i { α + βe i + γs i + (βγ)e i S i, (M 1 ) α + βe i + γs i, (M 0 ) where M 1 is the effect modification model and M 0 is the confounding model

comparing of different generalized regression models Model evaluation in logistic regression using the likelihood ratio: let L(M 1 ) and L(M 0 ) be the likelihood for models M 1 and M 0 then LRT = 2 log L(M 1 ) 2 log L(M 0 ) = 2 log L(M 1) L(M 0 ) is called the likelihood ratio for models M 1 and M 0 and has a chi-square distribution with 1 df under M 0

comparing of different generalized regression models illustration for passive smoking and LC example: model log-likelihood LRT crude -223.66016 - homogeneity -223.56934 0.1816 effect modification -223.2886 0.5615 note: for valid comparison on chi-square scale: models must be nested

comparing of different generalized regression models Model evaluation in more general: consider the likelihood L = n i=1 p y i x i (1 p xi ) 1 y i for a general model with p covariates: example: log p xi 1 p xi = α + β 1 x i1 + β 2 x i2 +... + β p x ip (M 0 ) log p xi 1 p xi = α + β 1 AGE i + β 2 SEX i + β 3 ETS i

comparing of different generalized regression models Model evaluation in more general: example: log p xi 1 p xi = α + β 1 AGE i + β 2 SEX i + β 3 ETS i where these covariates can be mixed: quantitative, continuous such as AGE categorical binary (use 1/0 coding) such as SEX non-binary ordered or unordered categorical such as ETS (none, moderate, large)

comparing of different generalized regression models Model evaluation in more general: consider the likelihood L = n i=1 p y i x i (1 p xi ) 1 y i for model with additional k covariates: log p xi 1 p xi = α + β 1 x i1 + β 2 x i2 +... + β p x ip +β p+1 x i,p+1 +... + β k+p x i,k+p (M 1 )

comparing of different generalized regression models Model evaluation in more general for our example: log p xi 1 p xi = α + β 1 AGE i + β 2 SEX i + β 3 ETS i +β 4 RADON i + β 5 AGE-HOUSE i

comparing of different generalized regression models Model evaluation using the likelihood ratio: again let L(M 1 ) and L(M 0 ) be the likelihood for models M 1 and M 0 then the likelihood ratio LRT = 2 log L(M 1 ) 2 log L(M 0 ) = 2 log L(M 1) L(M 0 ) has a chi-square distribution with k df under M 0

comparing of different generalized regression models Model evaluation for our example: then { M 0 : α + β 1 AGE i + β 2 SEX i + β 3 ETS i M 1 :...M 0... + β 4 RADON i + β 5 AGE-HOUSE i LRT = 2 log L(M 1) L(M 0 ) has under model M 0 a chi-square distribution with 2 df

comparing of different generalized regression models model evaluation for model assessment we will use criteria that compromise between model fit and model complexity Akaike information criterion Bayesian Information criterion AIC = 2 log L + 2k BIC = 2 log L + k log n where k is the number of parameters in the model and n is the number of clustered observations we seek a model for which AIC and/or BIC are small

meta-analysis of BCG vaccine against tuberculosis Meta-Analysis Meta-Analysis is a methodology for investigating the study results from several, independent studies with the purpose of an integrative analysis Meta-Analysis on BCG vaccine against tuberculosis Colditz et al. 1974, JAMA provide a meta-analysis to examine the efficacy of BCG vaccine against tuberculosis

meta-analysis of BCG vaccine against tuberculosis Data on the meta-analysis of BCG and TB the data contain the following details 13 studies each study contains: TB cases for BCG intervention number at risk for BCG intervention TB cases for control number at risk for control also two covariates are given: year of study and latitude expressed in degrees from equator latitude represents the variation in rainfall, humidity and environmental mycobacteria suspected of producing immunity against TB

meta-analysis of BCG vaccine against tuberculosis intervention control study year latitude TB cases total TB cases total 1 1933 55 6 306 29 303 2 1935 52 4 123 11 139 3 1935 52 180 1541 372 1451 4 1937 42 17 1716 65 1665 5 1941 42 3 231 11 220 6 1947 33 5 2498 3 2341 7 1949 18 186 50634 141 27338 8 1950 53 62 13598 248 12867 9 1950 13 33 5069 47 5808 10 1950 33 27 16913 29 17854 11 1965 18 8 2545 10 629 12 1965 27 29 7499 45 7277 13 1968 13 505 88391 499 88391

meta-analysis of BCG vaccine against tuberculosis Data analysis on the meta-analysis of BCG and TB these kind of data can be analyzed by taking TB case as disease occurrence response intervention as exposure study as confounder

meta-analysis of BCG vaccine against tuberculosis Log likelihood = -15191.497 Pseudo R2 = 0.0050 TB_Case Odds Ratio Std. Err. z P> z [95% Conf. Interval] Intervention.6116562.024562-12.24 0.000.5653613.6617421 _cons.0091641.0002369-181.51 0.000.0087114.0096404. estat ic, n(13) Model Obs ll(null) ll(model) df AIC BIC. 13-15267.81-15191.5 2 30386.99 30388.12 Note: N=13 used in calculating BIC

meta-analysis of BCG vaccine against tuberculosis Logistic regression Number of obs = 357347 LR chi2(2) = 1239.45 Prob > chi2 = 0.0000 Log likelihood = -14648.082 Pseudo R2 = 0.0406 TB_Case Odds Ratio Std. Err. z P> z [95% Conf. Interval] Latitude 1.043716.00126 35.44 0.000 1.04125 1.046189 Intervention.6253014.0251677-11.67 0.000.577869.6766271 _cons.0031643.0001403-129.85 0.000.002901.0034515. estat ic, n(13) Model Obs ll(null) ll(model) df AIC BIC. 13-15267.81-14648.08 3 29302.16 29303.86 Note: N=13 used in calculating BIC

meta-analysis of BCG vaccine against tuberculosis Logistic regression Number of obs = 357347 LR chi2(3) = 1402.30 Prob > chi2 = 0.0000 Log likelihood = -14566.659 Pseudo R2 = 0.0459 TB_Case Odds Ratio Std. Err. z P> z [95% Conf. Interval] Latitude 1.029997.0016409 18.55 0.000 1.026786 1.033219 Intervention.6041037.0243883-12.48 0.000.5581456.6538459 Year.9666536.0025419-12.90 0.000.9616844.9716485 _cons.0300164.0053119-19.81 0.000.021219.0424611. estat ic, n(13) Model Obs ll(null) ll(model) df AIC BIC. 13-15267.81-14566.66 4 29141.32 29143.58 Note: N=13 used in calculating BIC.

meta-analysis of BCG vaccine against tuberculosis model evaluation model log L AIC BIC intervention -15191.50 30386.99 30388.12 + latitude -14648.08 29302.16 29303.86 + year -14566.66 29141.32 29143.58