Case-control studies C&H 16

Size: px
Start display at page:

Download "Case-control studies C&H 16"

Transcription

1 Case-control studies C&H 6 Bendix Carstensen Steno Diabetes Center & Department of Biostatistics, University of Copenhagen bxc@steno.dk PhD-course in Epidemiology, Department of Biostatistics, Tuesday 3 January 207 Relationship between follow up studies and case control studies In a cohort study, the relationship between exposure and disease incidence is investigated by following the entire cohort and measuring the rate of occurrence of new cases in the different exposure groups. The follow up allows the investigator to register those subjects who develop the disease during the study period and to identify those who remain free of the disease. Case-control studies (C&H 6) 2/ 59 Case-control study In a case-control study the subjects who develop the disease (the cases) are registered by some other mechanism than follow-up, and a group of healthy subjects (the controls) is used to represent the subjects who do not develop the disease. Case-control studies (C&H 6) 3/ 59

2 Rationale behind case-control studies In a follow-up study, rates among exposed and non-exposed are estimated by: D Y D 0 Y 0 and hence the rate ratio by: D Y / D0 Y 0 = D D 0 / Y Y 0 Case-control studies (C&H 6) 4/ 59 In a case-control study we use the same cases, but select controls to represent the distribution of risk time between exposed and unexposed: H H 0 Y Y 0 Therefore the rate ratio is estimated by: D D 0 / H H 0 Controls represent risk time, not disease-free persons. Case-control studies (C&H 6) 5/ 59 Choice of controls (I) Failures Healthy study period The period over which failures are registered as cases is called the study period. A group of subjects who remain healthy over the study period is chosen to represent the healthy part of the source population. but this is an oversimplification... Case-control studies (C&H 6) 6/ 59

3 What about censoring and late entry? Failures Healthy Censored Late entry study period Choosing controls which remains healthy throughout takes no account of censoring or late entry. Instead, choose controls who are in the study and healthy, at the times the cases are registered. Case-control studies (C&H 6) 7/ 59 Choice of controls (II) Failures Healthy Censored Late entry study period This is called incidence density sampling. Subjects can be chosen as controls more than once, and a subject who is chosen as a control can later become a case. Equivalent to sampling observation time from vertical bands drawn to enclose each case. Case-control studies (C&H 6) 8/ 59 Most common way of choosing controls. Case-control probability tree Exposure p p Failure π E π π 0 E 0 π 0 Selection 0.97 F S F S 0.99 Case (D ) Control (H ) Case (D 0 ) Control (H 0 ) Probability pπ 0.97 p( π ) 0.0 ( p)π ( p)( π 0 ) 0.0 Case-control studies (C&H 6) 9/ 59

4 Retrospective analysis of case-control studies Retrospective: Compare the distribution of exposure between cases and controls. The proportion of cases who smoke compared to controls The mean age of cases compared to controls Looks at the study backwards. Only works properly for binary explanatory variables. Case-control studies (C&H 6) 0/ 59 The retrospective argument Selection Failure Exposure Probability E p π 0.97 F (Cases) E 0 ( p) π E p ( π ) 0.0 S (Controls) E 0 ( p) ( π 0 ) 0.0 Not in study Note: Parameters in the previous tree not on these branches. Case-control studies (C&H 6) / 59 Odds of exposure for cases and controls: Ω cas = Ω ctr = p π 0.97 ( p) π = p p π π 0 p ( π ) 0.0 ( p) ( π 0 ) 0.0 = p p π π 0 Odds-ratio for exposure between cases and controls: Ω cas = π / π = OR(disease) population Ω ctr π 0 π 0 Case-control studies (C&H 6) 2/ 59

5 Prospective analysis of case-control studies Compare the case/control ratio between exposed and non-exposed subjects or more general: How does case-control ratio vary with exposure? The point is that in the study it varies in the same way as in the population. Case-control studies (C&H 6) 3/ 59 The prospective argument Selection Exposure Failure Probability π p E π π 0 p E 0 π 0 Not in study F S F S p π 0.97 p ( π ) 0.0 ( p) π ( p) ( π 0 ) 0.0 Case-control studies (C&H 6) 4/ 59 Odds of disease = P {Case given inclusion} P {Control given inclusion} ω = ω 0 = p π 0.97 p ( π ) 0.0 = π π ( p) π ( p) ( π 0 ) 0.0 = π 0 π 0 OR = ω = π / π0 = OR(disease) population ω 0 π π 0 Case-control studies (C&H 6) 5/ 59

6 What is the case-control ratio? D = 0.97 H 0.0 π ( s,cas = π ) π s,ctr π D 0 = 0.97 H π ( 0 s0,cas = π ) 0 π 0 s 0,ctr π 0 D /H D 0 /H 0 = π /( π ) π 0 /( π 0 ) = OR population but only if the sampling fractions are identical: s,cas = s 0,cas and s,ctr = s 0,ctr. Case-control studies (C&H 6) 6/ 59 Log-likelihood for case-control studies Log-Likelihood (conditional on being included) is a binomial likelihood with odds-parameters ω 0 and ω D 0 log(ω 0 ) N 0 log(+ω 0 )+D log(ω ) N log(+ω ) where N 0 = D 0 + H 0 and N = D + H. Exposed: D cases, H controls Unexposed: D 0 cases, H 0 controls Case-control studies (C&H 6) 7/ 59 Odds-ratio (θ) is the ratio of the odds ω to ω 0, so: ( ) ω log(θ) = log = log(ω ) log(ω 0 ) ω 0 Estimates of log(ω ) and log(ω 0 ) are just the empirical odds: ( ) ( ) D D0 log and log H H 0 Case-control studies (C&H 6) 8/ 59

7 The standard errors of the odds are estimated by: D + H and D 0 + H 0 Exposed and unexposed form two independent bodies of data (they are sampled independently), so the estimate of log(θ) [= log(or)] is: ( ) ( ) D D0 log log, H with s.e. ( log(or) ) = H 0 D + H + D 0 + H 0 Case-control studies (C&H 6) 9/ 59 Confidence interval for OR First a confidence interval for log(or): log(or) ± D H D 0 H 0 Take the exponential: ( OR exp ) D H D 0 H }{{ 0 } error factor Case-control studies (C&H 6) 20/ 59 BCG vaccination and leprosy Does BCG vaccination in early childhood protect against leprosy? New cases of leprosy were examined for presence or absence of the BCG scar. During the same period, a 00% survey of the population of this area, which included examination for BCG scar, had been carried out. The tabulated data refer only to subjects under 35, because vaccination was not widely available when older persons were children. Case-control studies (C&H 6) 2/ 59

8 Exercise I BCG scar Leprosy cases Population survey Present Absent Estimate the odds of BCG vaccination for leprosy cases and for the controls. Estimate the odds ratio and hence the extent of protection against leprosy afforded by vaccination. Give a 95% c.i. for the OR. Use SAS for this: Exercise from the notes. Case-control studies (C&H 6) 22/ 59 Solution to I OR = D /H D 0 /H 0 = 0/ /34594 = = 0.48 s.e.(log[or]) = = D + H + D 0 + H = 0.27 The 95% limits for the odds-ratio are: OR exp( ) = = (0.37, 0.6) Case-control studies (C&H 6) 23/ 59 Exercise II BCG scar Leprosy cases Population controls Present Absent The table shows the results of a computer-simulated study which picked 000 controls at random. What is the odds ratio estimate in this study? Give a 95% c.i. for the OR. Use SAS for this: Exercise from the notes. Case-control studies (C&H 6) 24/ 59

9 Solution to II OR = D /H D 0 /H 0 = 0/554 59/446 = = 0.5 s.e.(log[or]) = = D H D 0 H = 0.42 The 95% limits for the odds-ratio are: OR exp( ) = = (0.39, 0.68) Case-control studies (C&H 6) 25/ 59 More levels of exposure (William Guy) Physical exertion at work of 659 outpatients: 34 pulmonary consumption, 38 other diseases. Level of Pulmonary Other Case/ OR exertion in consumption diseases control relative occupation (Cases) (Controls) ratio to (3) Little (0) Varied () More (2) Great (3) The relationship of case-control ratios is what matters. Case-control studies (C&H 6) 26/ 59 The retro/prospective argument Retrospective: Four possible outcomes (little/varied/more/great), Prospective: Two possible outcomes (case/control), but a large number of comparisons (between any two exposure levels). But the probability model is still a binary model, and the argument for the analysis is still the same as before. Prospective argument applicable in deriving a logistic regression model for case-control studies. Case-control studies (C&H 6) 27/ 59

10 Odds-ratio and rate ratio If the disease probability, π, in the study period is small: π = cumulative risik cumulative rate = λt For small π, π, so: OR = π /( π ) π 0 /( π 0 ) π π 0 λ λ 0 = RR π small OR estimate of RR. Case-control studies (C&H 6) 28/ 59 Important assumption behind rate ratio interpretation The entire study base must have been available throughout: no censorings. no delayed entries. This will clearly not always be the case, but it may be achieved in carefully designed studies. Case-control studies (C&H 6) 29/ 59 Avoiding censoring and delayed entry Can be achieved simultaneously with small π by incidence density sampling: Subdivide calendar time in small time bands. New case-control study in each time band. Only one case in each time band. No delayed entry or censoring. If the fraction of exposed does not vary much over time, all the small studies can be analysed together as one. This is effectively matching on calendar time. Case-control studies (C&H 6) 30/ 59

11 The rare disease assumption Necessary to make the approximation: π /( π ) π 0 /( π 0 ) π π 0 This is more appropriately termed: The short study duration assumption each of the small studies we imagine as components of the entire study should be sufficiently short in relation to disease occurrence, so that the π (disease probability) if small. Case-control studies (C&H 6) 3/ 59 Nested case-control studies Study base = large cohort Expensive to get covariate information for all persons. (expensive analyses, tracing of histories,... ) Covariate information only for cases and time matched controls: To each case, choose one or more (usually 5) controls from the risk set. Case-control studies (C&H 6) 32/ 59 How many controls per case? The standard deviation of log(or): Equal number of cases and controls: D + H + D 0 + H 0 = = D D D 0 D 0 ( + ) ( + ) D D 0 Case-control studies (C&H 6) 33/ 59

12 Twice as many controls as cases: D + H + D 0 + H 0 = = D 2D D 0 2D 0 ( + ) ( + /2) D D 0 m times as many cases as controls: ( = + ) ( + /m) D H D 0 H 0 D D 0 Case-control studies (C&H 6) 34/ 59 How many controls per case? The standard deviation of the log[or] is + m times larger in a case-control study, compared to the corresponding cohort-study. Therefore, 5 controls per case is normally sufficient. (Only relevant if controls are cheap compared to cases). But if cases and controls cost the same and are available the most efficient is to have the same number of cases and controls. Case-control studies (C&H 6) 35/ 59 SAS-intro Bendix Carstensen Steno Diabetes Center & Department of Biostatistics, University of Copenhagen bxc@steno.dk PhD-course in Epidemiology, Department of Biostatistics, Tuesday 3 January, 207

13 SAS Display manager (programming): program, log, output windows reproducible easy to document SAS ANALYST menu-oriented interface writes and runs programs for you no learning by heart, no syntax errors not every thing is included it is heavy to use in the long run SAS-intro () 37/ 59 Data set example: Blood pressure and obesity OBESE: weight/ideal weight BP: systolic blood pressure OBS SEX OBESE BP male male male male female female SAS-intro () 38/ 59 Data Data are in the text file BP.TXT located at and contains the following variables: SEX: Character variable ($) OBESE: weight/ideal weight BP: systolic blood pressure 3 variables and 02 observations SAS-intro () 39/ 59

14 Printing in SAS We read the file bp.txt directly from www and skip the first line containing variable names (firstobs=2). data bp; filename bpfile url ; infile bpfile firstobs=2; input sex $ obese bp; run; proc print data=bp; var sex obese bp; run; A temporary data set bp which only exists within the current program. (Permanent data sets may be saved but we will not use this feature in this course.) SAS-intro () 40/ 59 SAS programming data-step: data bp; ( reading ) ; ( data manipulations ) ; run; proc-step: proc xx data=bp ; ( procedure statments ) ; run; NB: No data manipulations after run; only if we make a new data-step. better to revise the first data-step. SAS-intro () 4/ 59 Example data bp; filename bpfile url " infile bpfile firstobs=2; input sex obese bp; run; data bp; set bp; if bp<25 then highbp=0; if bp>=25 then highbp=; /* an alternative way of creating the new variable highbp is: highbp = (bp>=25); */ run; proc freq data=bp; tables sex * highbp ; run; SAS-intro () 42/ 59

15 Example, simplfied data bp; filename bpfile url ; infile bpfile firstobs=2; input sex obese bp; if bp < 25 then highbp=0; if bp >= 25 then highbp=; /* an alternative way of creating the new variable highbp is: highbp = (bp>=25); */ run; proc freq data=bp; tables sex * highbp ; run; SAS-intro () 43/ 59 Typing of programs is done in the Program Editor window: Works like all other text editors: arrow keys, backspace, delete etc. When the program is submitted (click on Submit or press F3), the results are in the Log-window: Here you can see how things went: how many observations you have, how many variables you have if there were any errors which pages were written by which procedures SAS-intro () 44/ 59 Output-window (perhaps): In this window you will find the results (if there are any) Graph-window (which we won t use on this course) Here plots are stored in order SAS-intro () 45/ 59

16 Making life simpler You can move between the windows by clicking Windows in the command bar, or use that: F5 is editor window, F6 is log window, F7 is output window. SAS-intro () 46/ 59 Modifications in the program When the program has been executed and you want to make changes: Go back to the Program-window The Log- Output- and Graph-windows cumulate, that is output is stored consecutively Clear by choosing Clear under Edit (or press Ctrl-E - for erase ) Don t print! Remember to save the the program from time to time before SAS crashes! SAS-intro () 47/ 59 Simple statistical models Proportions and rates Bendix Carstensen Steno Diabetes Center & Department of Biostatistics, University of Copenhagen bxc@steno.dk PhD-course in Epidemiology, Department of Biostatistics, Tuesday 3 January, 207

17 A single proportion The log-likelihood for π, the proportion dead, if we observe 4 deaths out of 0: l(π) = 4log(π) + 6log( π) The log-likelihood for ω, the odds of dying, if we observe 4 deaths and 6 non-deaths: l(π) = 4log(ω) 0log( + ω) Simple statistical models (Proportions and rates) 49/ 59 Programs General purpose programs for estimating in the binomial and Poisson distribution: SAS: proc genmod R: glm Stata: glm Here we primarily look at SAS. Simple statistical models (Proportions and rates) 50/ 59 Estimating odds: genmod data p ; input x n ; datalines ; 4 0 ; proc genmod data= p ; model x/n = / dist=bin link=logit ; estimate "4 versus 6" intercept / exp ; Standard Wald 95% Confidence Wald Parameter DF Estimate Error Limits Chi-Square Pr > Chi Intercept Scale Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Confidence Limits Square Pr > ChiSq 4 versus Exp(4 versus 6) Simple statistical models (Proportions and rates) 5/ 59

18 Estimating a proportion: genmod The only difference from estimation of odds is the link= argument, which is changed to log (instead of logit): proc genmod data= p ; model x/n = / dist=bin link=log ; estimate "4 out of 0" intercept / exp ; Standard Wald 95% Confidence Wald Parameter DF Estimate Error Limits Chi-Square Pr > Chi Intercept Scale Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Confidence Limits Square Pr > ChiSq 4 out of Exp(4 out of 0) Simple statistical models (Proportions and rates) 52/ 59 A single proportion: individual records data bissau; filename bisfile url " infile bisfile firstobs=2; input id fuptime dead bcg dtp age agem; run; title "Estimate odds - Bissau" ; proc genmod data=bissau descending ; model dead = / dist=bin link=logit ; estimate "odds of dying" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Confidence Limits Square Pr > ChiSq odds of dying <.000 Exp(odds of dying) Simple statistical models (Proportions and rates) 53/ 59 A single proportion: individual records title "Estimate proportion - Bissau" ; proc genmod data=bissau descending ; model dead = / dist=bin link=log ; estimate "prob of dying" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Confidence Limits Square Pr > ChiSq prob of dying <.000 Exp(prob of dying) Simple statistical models (Proportions and rates) 54/ 59

19 Likelihood for a single rate Recall the log-likelihood for a single rate, λ based on D events during Y person years: Dlog(λ) λy This is also the log-likelihood for a Poisson variate D with mean µ = λy. Therefor we can use a program for the Posson distribution to estimate rates, except we must remove the Y from the mean. Poisson distribution usually use the log-mean: log(µ) = log(λ) + log(y ) log(y ) extracted via the offset argument. Simple statistical models (Proportions and rates) 55/ 59 A single rate data r ; input d y ; ly = log(y) ; my = log(y/000) ; datalines ; ; title "Estimate a rate per year" ; proc genmod data= r ; model d = / dist=poisson link=log offset=ly ; estimate "30 during per year" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Confidence Limits Square 30 during per year Exp(30 during per year) Simple statistical models (Proportions and rates) 56/ 59 A single rate: Scaling Remember the data step statement: my = log(y/000) ; title "Estimate a rate per 000 year" ; proc genmod data= r ; model d = / dist=poisson link=log offset=my ; estimate "30 during per 000 years" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Label Estimate Error Alpha Confidence Limits 30 during per 000 years Exp(30 during per 000 years) Simple statistical models (Proportions and rates) 57/ 59

20 A single rate: individual records data bissau ; set bissau ; ld = log(fuptime) ; ly = log(fuptime/36525) ; title "Estimate a rate per day" ; proc genmod data=bissau ; model dead = / dist=poisson link=log offset=ld ; estimate "mortality rate - per day" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Chi- Label Estimate Error Alpha Confidence Limits Square mortality rate - per day Exp(mortality rate - per day) Simple statistical models (Proportions and rates) 58/ 59 Single rate individual records, scaling Remember the data step statement: ly = log(fuptime/36525) ; title "Estimate a rate per year" ; proc genmod data=bissau ; model dead = / dist=poisson link=log offset=ly ; estimate "mortality rate - per 00 years" intercept / exp ; Contrast Estimate Results L Beta Standard L Beta Label Estimate Error Alpha Confidence Limits mortality rate - per 00 years Exp(mortality rate - per 00 years) Simple statistical models (Proportions and rates) 59/ 59

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information

Case-control studies

Case-control studies Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark b@bxc.dk http://bendixcarstensen.com Department of Biostatistics, University of Copenhagen, 8 November

More information

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

More information

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S Logistic regression analysis Birthe Lykke Thomsen H. Lundbeck A/S 1 Response with only two categories Example Odds ratio and risk ratio Quantitative explanatory variable More than one variable Logistic

More information

Cohen s s Kappa and Log-linear Models

Cohen s s Kappa and Log-linear Models Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance

More information

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials.

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials. The GENMOD Procedure MODEL Statement MODEL response = < effects > < /options > ; MODEL events/trials = < effects > < /options > ; You can specify the response in the form of a single variable or in the

More information

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response) Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV

More information

Epidemiology Wonders of Biostatistics Chapter 13 - Effect Measures. John Koval

Epidemiology Wonders of Biostatistics Chapter 13 - Effect Measures. John Koval Epidemiology 9509 Wonders of Biostatistics Chapter 13 - Effect Measures John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being covered 1. risk factors 2. risk

More information

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024

More information

Tests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X

Tests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X Chapter 157 Tests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X Introduction This procedure calculates the power and sample size necessary in a matched case-control study designed

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Q30b Moyale Observed counts. The FREQ Procedure. Table 1 of type by response. Controlling for site=moyale. Improved (1+2) Same (3) Group only

Q30b Moyale Observed counts. The FREQ Procedure. Table 1 of type by response. Controlling for site=moyale. Improved (1+2) Same (3) Group only Moyale Observed counts 12:28 Thursday, December 01, 2011 1 The FREQ Procedure Table 1 of by Controlling for site=moyale Row Pct Improved (1+2) Same () Worsened (4+5) Group only 16 51.61 1.2 14 45.16 1

More information

Sections 4.1, 4.2, 4.3

Sections 4.1, 4.2, 4.3 Sections 4.1, 4.2, 4.3 Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1/ 32 Chapter 4: Introduction to Generalized Linear Models Generalized linear

More information

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Marginal models: based on the consequences of dependence on estimating model parameters.

More information

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester Modelling Rates Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 05/12/2017 Modelling Rates Can model prevalence (proportion) with logistic regression Cannot model incidence in

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Chapter 4: Generalized Linear Models-I

Chapter 4: Generalized Linear Models-I : Generalized Linear Models-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

COMPLEMENTARY LOG-LOG MODEL

COMPLEMENTARY LOG-LOG MODEL COMPLEMENTARY LOG-LOG MODEL Under the assumption of binary response, there are two alternatives to logit model: probit model and complementary-log-log model. They all follow the same form π ( x) =Φ ( α

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Chapter 5 Formulas Distribution Formula Characteristics n. π is the probability Function. x trial and n is the. where x = 0, 1, 2, number of trials

Chapter 5 Formulas Distribution Formula Characteristics n. π is the probability Function. x trial and n is the. where x = 0, 1, 2, number of trials SPSS Program Notes Biostatistics: A Guide to Design, Analysis, and Discovery Second Edition by Ronald N. Forthofer, Eun Sul Lee, Mike Hernandez Chapter 5: Probability Distributions Chapter 5 Formulas Distribution

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Chapter 5: Logistic Regression-I

Chapter 5: Logistic Regression-I : Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

PASS Sample Size Software. Poisson Regression

PASS Sample Size Software. Poisson Regression Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis

More information

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about

More information

Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

More information

STAT 705: Analysis of Contingency Tables

STAT 705: Analysis of Contingency Tables STAT 705: Analysis of Contingency Tables Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Analysis of Contingency Tables 1 / 45 Outline of Part I: models and parameters Basic

More information

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time

More information

Tests for Two Correlated Proportions in a Matched Case- Control Design

Tests for Two Correlated Proportions in a Matched Case- Control Design Chapter 155 Tests for Two Correlated Proportions in a Matched Case- Control Design Introduction A 2-by-M case-control study investigates a risk factor relevant to the development of a disease. A population

More information

Section Poisson Regression

Section Poisson Regression Section 14.13 Poisson Regression Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 26 Poisson regression Regular regression data {(x i, Y i )} n i=1,

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Simple logistic regression

Simple logistic regression Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a

More information

PubHlth Intermediate Biostatistics Spring 2015 Exam 2 (Units 3, 4 & 5) Study Guide

PubHlth Intermediate Biostatistics Spring 2015 Exam 2 (Units 3, 4 & 5) Study Guide PubHlth 640 - Intermediate Biostatistics Spring 2015 Exam 2 (Units 3, 4 & 5) Study Guide Unit 3 (Discrete Distributions) Take care to know how to do the following! Learning Objective See: 1. Write down

More information

Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X

Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X Chapter 864 Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X Introduction Logistic regression expresses the relationship between a binary response variable and one or more

More information

Chapter 20: Logistic regression for binary response variables

Chapter 20: Logistic regression for binary response variables Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried

More information

Calculating Odds Ratios from Probabillities

Calculating Odds Ratios from Probabillities Arizona State University From the SelectedWorks of Joseph M Hilbe November 2, 2016 Calculating Odds Ratios from Probabillities Joseph M Hilbe Available at: https://works.bepress.com/joseph_hilbe/76/ Calculating

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

Epidemiology Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures. John Koval

Epidemiology Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures. John Koval Epidemiology 9509 Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being covered

More information

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game.

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game. EdPsych/Psych/Soc 589 C.J. Anderson Homework 5: Answer Key 1. Probelm 3.18 (page 96 of Agresti). (a) Y assume Poisson random variable. Plausible Model: E(y) = µt. The expected number of arrests arrests

More information

Appendix: Computer Programs for Logistic Regression

Appendix: Computer Programs for Logistic Regression Appendix: Computer Programs for Logistic Regression In this appendix, we provide examples of computer programs to carry out unconditional logistic regression, conditional logistic regression, polytomous

More information

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS 776 1 12 IPW and MSM IP weighting and marginal structural models ( 12) Outline 12.1 The causal question 12.2 Estimating IP weights via modeling

More information

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK Advanced Statistical

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 Work all problems. 60 points are needed to pass at the Masters Level and 75 to pass at the

More information

Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test)

Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test) Chapter 861 Tests for the Odds Ratio in Logistic Regression with One Binary X (Wald Test) Introduction Logistic regression expresses the relationship between a binary response variable and one or more

More information

A comparison of 5 software implementations of mediation analysis

A comparison of 5 software implementations of mediation analysis Faculty of Health Sciences A comparison of 5 software implementations of mediation analysis Liis Starkopf, Thomas A. Gerds, Theis Lange Section of Biostatistics, University of Copenhagen Illustrative example

More information

Analysis of Count Data A Business Perspective. George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013

Analysis of Count Data A Business Perspective. George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013 Analysis of Count Data A Business Perspective George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013 Overview Count data Methods Conclusions 2 Count data Count data Anything with

More information

Unit 3. Discrete Distributions

Unit 3. Discrete Distributions BIOSTATS 640 - Spring 2016 3. Discrete Distributions Page 1 of 52 Unit 3. Discrete Distributions Chance favors only those who know how to court her - Charles Nicolle In many research settings, the outcome

More information

Beyond GLM and likelihood

Beyond GLM and likelihood Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence

More information

(c) Interpret the estimated effect of temperature on the odds of thermal distress.

(c) Interpret the estimated effect of temperature on the odds of thermal distress. STA 4504/5503 Sample questions for exam 2 1. For the 23 space shuttle flights that occurred before the Challenger mission in 1986, Table 1 shows the temperature ( F) at the time of the flight and whether

More information

Multinomial Logistic Regression Models

Multinomial Logistic Regression Models Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio

Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio Lecture 3: Measures of effect: Risk Difference Attributable Fraction Risk Ratio and Odds Ratio Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK March 3-5,

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial

More information

Models for Binary Outcomes

Models for Binary Outcomes Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.

More information

Chapter 6 Part 4. Confidence Intervals

Chapter 6 Part 4. Confidence Intervals Chapter 6 Part 4 Confidence Intervals October 1, 008 Goal: To clearly understand the link between probability distributions and confidence intervals. Skills: Be able to calculate (1 - α)% confidence interval

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation

More information

Chapter 1. Modeling Basics

Chapter 1. Modeling Basics Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical

More information

Chapter Six: Two Independent Samples Methods 1/51

Chapter Six: Two Independent Samples Methods 1/51 Chapter Six: Two Independent Samples Methods 1/51 6.3 Methods Related To Differences Between Proportions 2/51 Test For A Difference Between Proportions:Introduction Suppose a sampling distribution were

More information

BIOSTATS Intermediate Biostatistics Spring 2017 Exam 2 (Units 3, 4 & 5) Practice Problems SOLUTIONS

BIOSTATS Intermediate Biostatistics Spring 2017 Exam 2 (Units 3, 4 & 5) Practice Problems SOLUTIONS BIOSTATS 640 - Intermediate Biostatistics Spring 2017 Exam 2 (Units 3, 4 & 5) Practice Problems SOLUTIONS Practice Question 1 Both the Binomial and Poisson distributions have been used to model the quantal

More information

Sociology 362 Data Exercise 6 Logistic Regression 2

Sociology 362 Data Exercise 6 Logistic Regression 2 Sociology 362 Data Exercise 6 Logistic Regression 2 The questions below refer to the data and output beginning on the next page. Although the raw data are given there, you do not have to do any Stata runs

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

More information

Regression Models for Risks(Proportions) and Rates. Proportions. E.g. [Changes in] Sex Ratio: Canadian Births in last 60 years

Regression Models for Risks(Proportions) and Rates. Proportions. E.g. [Changes in] Sex Ratio: Canadian Births in last 60 years Regression Models for Risks(Proportions) and Rates Proportions E.g. [Changes in] Sex Ratio: Canadian Births in last 60 years Parameter of Interest: (male)... just above 0.5 or (male)/ (female)... "ODDS"...

More information

n y π y (1 π) n y +ylogπ +(n y)log(1 π).

n y π y (1 π) n y +ylogπ +(n y)log(1 π). Tests for a binomial probability π Let Y bin(n,π). The likelihood is L(π) = n y π y (1 π) n y and the log-likelihood is L(π) = log n y +ylogπ +(n y)log(1 π). So L (π) = y π n y 1 π. 1 Solving for π gives

More information

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Today s Class: Review of 3 parts of a generalized model Models for discrete count or continuous skewed outcomes Models for two-part

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Unit 5 Logistic Regression Practice Problems

Unit 5 Logistic Regression Practice Problems Unit 5 Logistic Regression Practice Problems SOLUTIONS R Users Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004. Exercises

More information

Inference for Binomial Parameters

Inference for Binomial Parameters Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression 22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then

More information

PubH 7405: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION

PubH 7405: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION PubH 745: REGRESSION ANALYSIS INTRODUCTION TO LOGISTIC REGRESSION Let Y be the Dependent Variable Y taking on values and, and: π Pr(Y) Y is said to have the Bernouilli distribution (Binomial with n ).

More information

Chapter 4 Part 3. Sections Poisson Distribution October 2, 2008

Chapter 4 Part 3. Sections Poisson Distribution October 2, 2008 Chapter 4 Part 3 Sections 4.10-4.12 Poisson Distribution October 2, 2008 Goal: To develop an understanding of discrete distributions by considering the binomial (last lecture) and the Poisson distributions.

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - Stata Users

Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - Stata Users BIOSTATS 640 Spring 2017 Review of Introductory Biostatistics STATA solutions Page 1 of 16 Unit 1 Review of BIOSTATS 540 Practice Problems SOLUTIONS - Stata Users #1. The following table lists length of

More information

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

Stat 587: Key points and formulae Week 15

Stat 587: Key points and formulae Week 15 Odds ratios to compare two proportions: Difference, p 1 p 2, has issues when applied to many populations Vit. C: P[cold Placebo] = 0.82, P[cold Vit. C] = 0.74, Estimated diff. is 8% What if a year or place

More information

Standardization methods have been used in epidemiology. Marginal Structural Models as a Tool for Standardization ORIGINAL ARTICLE

Standardization methods have been used in epidemiology. Marginal Structural Models as a Tool for Standardization ORIGINAL ARTICLE ORIGINAL ARTICLE Marginal Structural Models as a Tool for Standardization Tosiya Sato and Yutaka Matsuyama Abstract: In this article, we show the general relation between standardization methods and marginal

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Introduction Logistic regression expresses the relationship between a binary response variable and one or

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 12: Logistic regression (v1) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 30 Regression methods for binary outcomes 2 / 30 Binary outcomes For the duration of this

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Meta-Analysis in Stata, 2nd edition p.158 Exercise Silgay et al. (2004)

Meta-Analysis in Stata, 2nd edition p.158 Exercise Silgay et al. (2004) Stata LightStone Stata 14 Funnel StataPress Meta-Analysis in Stata, 2nd edition p.153 Harbord et al. metabias metabias Steichen (1998) Begg Egger Stata Stata metabias Begg Egger Harbord Peters ado metafunnel

More information

DISPLAYING THE POISSON REGRESSION ANALYSIS

DISPLAYING THE POISSON REGRESSION ANALYSIS Chapter 17 Poisson Regression Chapter Table of Contents DISPLAYING THE POISSON REGRESSION ANALYSIS...264 ModelInformation...269 SummaryofFit...269 AnalysisofDeviance...269 TypeIII(Wald)Tests...269 MODIFYING

More information

Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

More information

Multinomial Regression Models

Multinomial Regression Models Multinomial Regression Models Objectives: Multinomial distribution and likelihood Ordinal data: Cumulative link models (POM). Ordinal data: Continuation models (CRM). 84 Heagerty, Bio/Stat 571 Models for

More information

Measures of Association and Variance Estimation

Measures of Association and Variance Estimation Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35

More information

Introduction to logistic regression

Introduction to logistic regression Introduction to logistic regression Tuan V. Nguyen Professor and NHMRC Senior Research Fellow Garvan Institute of Medical Research University of New South Wales Sydney, Australia What we are going to learn

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information