Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014


All models are approximations!
The best model does not exist!
Complicated models need a lot of data: lower your ambitions or get more data.
If you do not like uncertain conclusions, then study probability theory!
If you decide which analysis to present based on the observed relationship between the outcome and other variables, then you will in general get invalid estimates, confidence intervals and p-values!!

Regression models
Research Seminars - Department of Public Health
Morten Frydenberg, Section for Biostatistics, Aarhus Univ, Denmark

- Some general statements/comments on statistical models.
- Regression models: definition; examples (top of the iceberg)
- A strategy?
- Interaction/effect modification.
- Modelling continuous variable. - HS

We will not discuss which information/variables should be included in a model, as we and you(?) know that this is a purely subject-matter problem that cannot be solved by statistical techniques or algorithms.

Any statistical analysis should be preceded by a long process clarifying the subject-matter problem and the design/sampling procedure.
In this process you should: Read, Ask, Draw, Think, Question, Learn and Discuss.
- Sometimes this will include reanalysing old data.
The result of this process should be that you know which variables and effect modifiers should be in your model. (And which of these you do not have!)

We believe that the above clarification of the subject-matter problem cannot be made without some insight into the statistical models/methods/designs that have been used and could be used.
We believe that knowing the syntax in Stata/SAS is not equivalent to having insight into a model/method.
We believe that to understand some (not all) statistical models/methods/designs you need a background in mathematics and statistics.
We know that 99.99% of erroneous statistical analyses are caused by the person and not the software. Many of these are due to limited insight into the assumptions behind, and the properties of, the model/method.

We believe that statistical tests and p-values are seldom of real interest and that they much too often lead to misleading conclusions.
We believe that, at the moment, the best summary of a statistical analysis is estimates of the relevant associations with confidence intervals. The discussion of these should focus on how the lower and upper confidence limits relate to the subject-matter problem.
Whether or not 0 (or 1) is included in the interval is, like the p-value, seldom of any interest!
We believe in the miracle of asymptotics!

We know that a statistical analysis will not give the final answer to any question. Mainly because we use statistical analysis on problems with a random (or chaotic) component, so the results will also be random. But also because a statistical model is an approximation.
We and you want estimates, confidence intervals and p-values we can trust!

We will not discuss
- models with several random components (e.g. clusters)
- models with random coefficients
- models with latent variables
- models involving propensity scores or IPW
- the difference between marginal and conditional effects
- the many types of causal models.
Working with any of the above requires an understanding of the standard models.

What is a regression model?
A model that models the relationship between an outcome, y, and a set of explanatory variables, x:

$$y \;\text{"="}\; f(x;\theta) \;\text{"+"}\; e(\sigma)$$

Here $f(x;\theta)$ is the systematic part and $e(\sigma)$ is the random part; $\theta$ and $\sigma$ are unknown parameters.
Often the systematic part is the expectation:

$$E(Y) = f(x;\theta), \qquad Y = E(Y) \;\text{"+"}\; e(\sigma)$$

What is a linear regression model?
A large class of models can be specified as

$$E(Y \mid x_1, \dots, x_p) = f\Big(\beta_0 + \sum_{j=1}^{p} \beta_j x_j\Big), \qquad Y = E(Y \mid x_1, \dots, x_p) \;\text{"+"}\; e(\sigma)$$

Or (almost) equivalently

$$h\big(E(Y \mid x_1, \dots, x_p)\big) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j, \qquad Y = E(Y \mid x_1, \dots, x_p) \;\text{"+"}\; e(\sigma)$$

Some standard models

Normal regression. Y: any value, $-\infty$ to $\infty$.

$$E(Y \mid x_1, \dots, x_p) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \quad \text{and} \quad e \sim N(0, \sigma^2)$$

Logistic regression. Y: 0 or 1.

$$\log \frac{\Pr(Y = 1 \mid x_1, \dots, x_p)}{\Pr(Y = 0 \mid x_1, \dots, x_p)} = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \quad \text{and no extra random variation}$$

Poisson regression. Y: a non-negative integer: 0, 1, 2, ...

$$\log \text{rate}(x_1, \dots, x_p) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \quad \text{and} \quad Y \sim \text{Poisson}(\text{rate} \cdot T), \; T \text{ known}$$

Logistic and Poisson regression as multiplicative models

Logistic regression. Y: 0 or 1.

$$\text{odds}(x_1, \dots, x_p) = \text{odds}(\text{ref}) \cdot OR_1^{x_1} \cdot OR_2^{x_2} \cdots OR_p^{x_p}$$

with $\text{odds}(\text{ref}) = \exp(\beta_0)$ and $OR_j = \exp(\beta_j)$, and no extra random variation.

Poisson regression. Y: any non-negative integer: 0, 1, 2, ...

$$\text{rate}(x_1, \dots, x_p) = \text{rate}(\text{ref}) \cdot IRR_1^{x_1} \cdot IRR_2^{x_2} \cdots IRR_p^{x_p}$$

with $\text{rate}(\text{ref}) = \exp(\beta_0)$ and $IRR_j = \exp(\beta_j)$, and $Y \sim \text{Poisson}(\text{rate} \cdot T)$, T known.
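The equivalence between the additive form on the log-odds scale and the multiplicative form on the odds scale can be checked numerically. The following is a minimal sketch; the coefficient values and the covariate vector are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients on the log-odds scale (illustration only, not from the slides).
beta0 = -2.0
betas = [0.5, -0.3]


def odds_additive(x):
    """Odds computed from the linear predictor: exp(beta0 + sum_j beta_j * x_j)."""
    return math.exp(beta0 + sum(b * xj for b, xj in zip(betas, x)))


def odds_multiplicative(x):
    """The same odds written as odds(ref) times the product of OR_j ** x_j."""
    result = math.exp(beta0)           # odds(ref) = exp(beta0)
    for b, xj in zip(betas, x):
        result *= math.exp(b) ** xj    # OR_j = exp(beta_j)
    return result


x = [1.0, 2.0]  # hypothetical covariate values
print(odds_additive(x), odds_multiplicative(x))
```

The two functions agree up to floating-point rounding. Replacing "odds" by "rate" and "OR" by "IRR" gives the corresponding check for the Poisson model.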

The linear structure $\beta_0 + \sum_{j=1}^{p} \beta_j x_j$

This assumed structure implies that the difference, on the relevant scale, between two persons,
- A with covariates $x_1^A, x_2^A, \dots, x_p^A$
- B with covariates $x_1^B, x_2^B, \dots, x_p^B$
is

$$\sum_{j=1}^{p} \beta_j \, \Delta x_j \quad \text{where} \quad \Delta x_j = x_j^A - x_j^B$$

The contribution from each x is:
- Added
- Proportional to the difference in x
- Independent of the other x's (no effect modification)

The linear structure $\beta_0 + \sum_{j=1}^{p} \beta_j x_j$

Note that:
- Some x's can be indicator variables, i.e. 0/1 variables, indicating that the person belongs to a specific group: males or 25<BMI<30.
- Some x's can relate to the same data/information:
  - $x_2, x_3, x_4$ might correspond to 18<BMI<25, 25<BMI<30 and 30<BMI
  - $x_2, x_3, x_4$ might correspond to Age, Age$^2$, Age$^3$
  - $x_2, x_3, x_4$ might correspond to a cubic spline of Age
- Some x's can be interaction terms like Male*BMI.
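As an illustration of the linear structure, the sketch below codes BMI into three 0/1 indicators and compares two persons on the linear scale. The coefficients, the handling of the group boundaries and the choice of BMI up to 18 as the reference group are assumptions made for this example only.

```python
def bmi_indicators(bmi):
    """0/1 indicators for 18<BMI<=25, 25<BMI<=30 and 30<BMI (boundary handling is an assumption)."""
    return [int(18 < bmi <= 25), int(25 < bmi <= 30), int(bmi > 30)]


def linear_predictor(beta0, betas, x):
    """beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * v for b, v in zip(betas, x))


# Hypothetical coefficients: male, then the three BMI groups.
beta0 = -1.0
betas = [0.4, 0.1, 0.3, 0.6]

person_a = [1] + bmi_indicators(27.0)  # a man with BMI 27
person_b = [0] + bmi_indicators(22.0)  # a woman with BMI 22

# Difference between the two persons on the linear scale ...
diff = linear_predictor(beta0, betas, person_a) - linear_predictor(beta0, betas, person_b)
# ... equals the sum of beta_j times the covariate differences; the intercept cancels.
by_terms = sum(b * (a - c) for b, a, c in zip(betas, person_a, person_b))
print(diff, by_terms)
```

Each covariate contributes its own term, proportional to the difference in that covariate and unaffected by the values of the others.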

Three models for binary outcome

OR-model (logistic regression):

$$\log \frac{\Pr(Y = 1 \mid x_1, \dots, x_p)}{\Pr(Y = 0 \mid x_1, \dots, x_p)} = \beta_0 + \sum_{j=1}^{p} \beta_j x_j$$

RR-model:

$$\log \Pr(Y = 1 \mid x_1, \dots, x_p) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j$$

RD-model:

$$\Pr(Y = 1 \mid x_1, \dots, x_p) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j$$

A fourth model for binary data
The probit model: a threshold/latent trait model

$$\Phi^{-1}\big(\Pr(Y = 1 \mid x_1, \dots, x_p)\big) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j$$

$$\Pr(Y = 1 \mid x_1, \dots, x_p) = \Phi\Big(\beta_0 + \sum_{j=1}^{p} \beta_j x_j\Big) = \Pr\Big(\text{standard normal} \le \beta_0 + \sum_{j=1}^{p} \beta_j x_j\Big)$$

$\Phi$: the distribution function for the standard normal.
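The four models differ only in how the linear predictor is mapped to a probability. The sketch below writes out the four inverse links; the function names are hypothetical, and the standard normal distribution function is computed from `math.erf`.

```python
import math


def std_normal_cdf(z):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_logit(eta):
    """OR-model: the linear predictor eta is the log odds."""
    return 1.0 / (1.0 + math.exp(-eta))


def prob_log(eta):
    """RR-model: eta is the log probability (can exceed 1 for eta > 0)."""
    return math.exp(eta)


def prob_identity(eta):
    """RD-model: eta is the probability itself (can fall outside [0, 1])."""
    return eta


def prob_probit(eta):
    """Probit model: eta is a quantile of the standard normal."""
    return std_normal_cdf(eta)


for eta in (-2.0, 0.0, 0.5):
    print(eta, prob_logit(eta), prob_probit(eta), prob_log(eta), prob_identity(eta))
```

Only the logit and probit mappings stay inside the interval from 0 to 1 for every value of the linear predictor.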

The probit model: a threshold/latent trait model

As a personal threshold model:

$$Y = 1 \;\text{iff}\; Z \le \beta_0 + \sum_{j=1}^{p} \beta_j x_j, \qquad Z \sim N(0, 1)$$

As a personal latent trait model:

$$Y = 1 \;\text{iff}\; L \ge 0, \qquad L \sim N\Big(\beta_0 + \sum_{j=1}^{p} \beta_j x_j, \, 1\Big)$$

The regression coefficients can be interpreted as differences in the threshold or in the expected latent trait.
Note: The logistic regression model is also a threshold/latent trait model. (Just using the logistic distribution instead of the normal distribution.)

[Figure: Scale factor 1.702]
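The scale factor 1.702 is the usual constant that makes the logistic distribution function nearly coincide with the standard normal one. A minimal numerical check, assuming a grid from -6 to 6 in steps of 0.01 (the grid is an arbitrary choice for this sketch):

```python
import math


def std_normal_cdf(z):
    """Distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z):
    """Distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


SCALE = 1.702
grid = [k / 100.0 for k in range(-600, 601)]

# Largest absolute gap with and without rescaling the argument of the logistic.
max_gap_scaled = max(abs(logistic_cdf(SCALE * z) - std_normal_cdf(z)) for z in grid)
max_gap_unscaled = max(abs(logistic_cdf(z) - std_normal_cdf(z)) for z in grid)
print(max_gap_scaled, max_gap_unscaled)
```

With the rescaling the two curves differ by less than 0.01 everywhere on the grid, which is why coefficients from a logit model are roughly 1.702 times those from a probit model fitted to the same data.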

Choosing between (simple) models for binary outcome.

| Model/link | Measure of association | Limitation | Will work |
| logit | Odds ratio or latent mean difference | None | Always |
| probit | Latent mean difference | None | Always |
| log | Relative risk | Large RR does not make sense | Not if the probability is large |
| identity | Risk difference | Numerically large RD does not make sense | Not if the probability is large or small |

Approximations:
- Logit and probit models are in general close.
- Logit and log models are close if the event is rare.

The argument "my event is frequent => the OR differs from the RR => I have to use an RR-model" is not valid!!
Relative risk does not make sense if the event is frequent!

The exception: the normal regression

$$Y = \beta_0 + \sum_{j=1}^{p} \beta_j x_j + e \quad \text{and} \quad e \sim N(0, \sigma^2)$$

The normal regression is the only standard model where:
1. There is an additional parameter, $\sigma$, quantifying the random variation not explained by the systematic part.
2. Inference (confidence intervals and p-values) is exact, i.e. we do not have to rely on asymptotics.
3. Residuals and leverage have a clear interpretation and can be used in validation of the model.
4. Most of the validation is done by diagnostic plots.
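A small numerical illustration of why the logit and log models are close for rare events, and why a relative risk stops making sense for frequent events. The probabilities below are hypothetical.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def relative_risk(p_exposed, p_reference):
    return p_exposed / p_reference


def odds_ratio(p_exposed, p_reference):
    return odds(p_exposed) / odds(p_reference)


# Rare event: RR and OR are nearly identical.
rr_rare = relative_risk(0.02, 0.01)
or_rare = odds_ratio(0.02, 0.01)

# Frequent event: the same RR of 2 corresponds to a much larger OR.
rr_freq = relative_risk(0.8, 0.4)
or_freq = odds_ratio(0.8, 0.4)

# With a reference probability of 0.6, an RR of 2 would imply a "probability" above 1.
impossible_risk = 0.6 * 2.0

print(rr_rare, or_rare, rr_freq, or_freq, impossible_risk)
```

In the rare case the odds ratio is about 2.02 against a relative risk of 2; in the frequent case the odds ratio is 6. The last line shows the limitation noted in the table: a large RR cannot be applied when the reference probability is large.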

Time-to-event data (Y=Time)
How does the time, T, to a specific event depend on x?
For a start, nothing new. We could model
- T or log(T) by a normal regression model
or a more complicated model:
- T by a Weibull regression model
- T by a Gompertz regression model

Note: modelling log(T) as a regression implies an accelerated failure time model.

$$\log(T) \approx \beta_0 + \sum_{j=1}^{p} \beta_j x_j \quad \Longleftrightarrow \quad T \approx T_0 \cdot \gamma_1^{x_1} \cdot \gamma_2^{x_2} \cdots \gamma_p^{x_p}, \qquad T_0 = \exp(\beta_0), \; \gamma_j = \exp(\beta_j)$$

Example: $x_1$ is female (vs male) and $\gamma_1 = 1.1$. T is on average 10% higher for females.

Time-to-event data (Y=Time)
How does the time, T, to a specific event depend on x?
But often the data are right censored, i.e. for some data points we only know that T>t, but not the actual value!
This is not a problem, as this could (easily) be incorporated in the models above.
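The accelerated failure time reading of a regression on log(T) can be sketched as follows. The intercept is hypothetical; the factor 1.1 for females is the value used in the example above.

```python
import math

# A coefficient on the log-time scale corresponds to a multiplicative factor on time.
gamma_female = 1.1                    # factor from the slide's example
beta_female = math.log(gamma_female)  # the matching coefficient for log(T)
beta0 = 2.0                           # hypothetical intercept, so T0 = exp(2.0)


def typical_time(female):
    """exp of the linear predictor for log(T); 'female' is a 0/1 indicator."""
    return math.exp(beta0 + beta_female * female)


ratio = typical_time(1) / typical_time(0)
print(ratio)  # the time scale is stretched by the factor gamma_female
```

The ratio does not depend on the intercept, so every other covariate pattern is stretched by the same factor.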

Time-to-event data (Y=Time)
A completely different way to look at the problem is not to model T directly but to model T via the rate/hazard:

$$\lambda(t) = \lim_{dt \to 0} \frac{\Pr(t < T \le t + dt \mid T > t)}{dt}, \qquad \Pr(T > t) = \exp\Big(-\int_0^t \lambda(u)\,du\Big)$$

Many such models are of the forms:

$$\lambda(t) = \lambda_0(t) + g\Big(\sum_{j=1}^{p} \beta_j x_j\Big) \qquad \text{or} \qquad \lambda(t) = \lambda_0(t) \cdot g\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

A subgroup is the proportional hazards models:

$$\lambda(t) = \lambda_0(t) \cdot \exp\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

Time-to-event data - Proportional hazards models

$$\lambda(t) = \lambda_0(t) \cdot \exp\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

Here $\lambda_0(t)$ can be modelled as:
- Piecewise constant = Poisson regression
- Parametric: Exponential, Weibull, Gompertz
- Non-parametric (it can be of any form): Cox's proportional hazards model
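The relation between the hazard and the survival probability, Pr(T > t) = exp(-integral of the hazard from 0 to t), is easy to work through for a piecewise constant baseline hazard. The sketch below uses hypothetical cut points and rates, and shows that under proportional hazards the survival function is the baseline survival raised to the power of the hazard ratio.

```python
import math


def cumulative_hazard(t, cut_points, rates):
    """Integral from 0 to t of a piecewise constant hazard.

    rates[k] applies from cut_points[k] up to cut_points[k + 1];
    the last rate applies from the last cut point onwards.
    """
    total = 0.0
    for k, rate in enumerate(rates):
        start = cut_points[k]
        end = cut_points[k + 1] if k + 1 < len(cut_points) else math.inf
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total


def survival(t, cut_points, rates, hazard_ratio=1.0):
    """Pr(T > t) when the hazard is hazard_ratio times the baseline hazard."""
    return math.exp(-hazard_ratio * cumulative_hazard(t, cut_points, rates))


# Hypothetical baseline hazard: 0.1 on [0, 1), 0.2 on [1, 3), 0.05 from 3 onwards.
cuts = [0.0, 1.0, 3.0]
rates = [0.1, 0.2, 0.05]

s_baseline = survival(2.0, cuts, rates)
s_exposed = survival(2.0, cuts, rates, hazard_ratio=2.0)
print(s_baseline, s_exposed, s_baseline ** 2)
```

A hazard ratio of 2 therefore does not halve the survival probability; it squares it.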

Time-to-event data - Cox's proportional hazards model

$$\lambda(t) = \lambda_0(t) \cdot \exp\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

No constraints on $\lambda_0(t)$: a semi-parametric model.
The focus is not on when the events happened (i.e. T) but on the hazard ratios.
The time is only used to find the order of the events.
But note: different time scales, such as age or calendar time, will define completely different models.
Deciding on the time scale is a key point when applying the Cox model.

Time-to-event data - Cox's proportional hazards model

$$\lambda(t) = \lambda_0(t) \cdot \exp\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

The model has a lot of virtues:
- Right censoring is easily handled.
- You can do a lot of interesting modelling with it.
- It is relatively easy to extend it to time-varying covariates or time-varying hazard ratios.
- The best is probably that it has generated many Danish PhDs in theoretical statistics.
But:
- The assumption of a constant hazard ratio over time (the proportionality assumption) is often not valid or relevant.
- Hazards and hazard ratios can be difficult to understand!

Time-to-event data - Cox's proportional hazards model

$$\lambda(t) = \lambda_0(t) \cdot \exp\Big(\sum_{j=1}^{p} \beta_j x_j\Big)$$

Why is it so popular?
- It has some nice (interesting) mathematical properties.
- It is available.
- It has been around for many years.
- My supervisor used it in his PhD.
- "I would like to analyse 5 year survival (by relative risk), but I do not have five years of follow-up for everybody, so I have to estimate hazard ratios instead (and accept the assumption behind the Cox model)."
Now: Use the pseudo value approach!!!! (See Research Seminars on Competing Risk (May 6th 2014) Parner)

Specifying a statistical model can often be broken into three parts:
1. The type of model: normal regression, logistic, probit, Cox proportional hazards model...
2. Which variables should be in the model
3. How these should be put in the model
The first choice should be based on the outcome, the design/sampling procedure and the measure of association.

The miracle of asymptotics
Given that the statistical model is true, that we have a large data set, and that we apply the method of maximum likelihood estimation, then:
- The estimates will be unbiased.
- 95% confidence intervals will have 95% coverage probabilities.
- P-values will have a uniform distribution if the hypothesis is true, implying that the risk of a type 1 error is 5%.

Some models are based on theory and used repeatedly
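The coverage claim can be illustrated by simulation. The sketch below is not from the seminar: it assumes the simplest possible maximum likelihood setting, a single binomial proportion with a Wald confidence interval, and compares the coverage for a small and a large sample. The true proportion, sample sizes, number of simulations and seed are arbitrary choices.

```python
import math
import random


def wald_interval(events, n, z=1.96):
    """Approximate 95% interval based on the maximum likelihood estimate events / n."""
    p_hat = events / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(p_true, n, n_sim, seed=2014):
    """Fraction of simulated data sets whose interval contains the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        events = sum(rng.random() < p_true for _ in range(n))
        lower, upper = wald_interval(events, n)
        hits += lower <= p_true <= upper
    return hits / n_sim


coverage_small = coverage(p_true=0.3, n=10, n_sim=2000)
coverage_large = coverage(p_true=0.3, n=500, n_sim=2000)
print(coverage_small, coverage_large)
```

With only 10 observations the nominal 95% interval covers the true value clearly less often than 95% of the time; with 500 observations the coverage is close to the nominal level. The miracle needs a large data set to happen.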

Some models are validated

Some models are not reused or validated
"In the Cox regression analyses, a number of potential confounders was used to adjust the effect of BMI on fetal death. These confounders were chosen a priori and included age, parity, height, socio-occupational status, smoking, coffee consumption, and alcohol consumption, because these covariates have been considered in previous studies of stillbirth. Finally, physical exercise was included because it has been suggested that the association between obesity and stillbirth may be confounded by this variable."

Which model should I use?
You can often divide the explanatory variables into groups:
1: Variables of primary interest - main exposure.
2: Variables of less interest - variables you want to adjust for.
A good model will try to introduce the first group in an interpretable/simple way into the model. - You want to know how they work.
The second type of variables can be introduced any way you like. It can be very complicated, you do not care, as long as they do the job, that is, adjust sufficiently.

A general strategy
Clarify the purpose: Read, Ask, Draw, Think, Question, Learn and Discuss.
1. Decide on the outcome
2. Prioritized list of explanatory variables
3. Design and data collection
4. The maximum complexity (N/15 or #events/15)
5. Explore the explanatory variables
6. Prioritized list of interactions/effect modifications
7. Allocating the parameters
8. Representing blocks
9. Choosing scales, cut-points and knots
10. Choosing scales, cut-points and knots for interactions
(Without using the outcome.)
Fit the model.
Check for serious errors.
Write the paper!

Interaction/effect-measure-modification
It is important to remember that interaction in a statistical model is a mathematical concept. Whether or not it corresponds to something that you would call interaction in the real world is another question.
Interaction/no interaction always refers to a specific model:

$$(A) \quad \text{logit}\,\Pr(Y = 1 \mid x_1, x_2, x_3) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age}$$

$$(B) \quad \Pr(Y = 1 \mid x_1, x_2, x_3) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age}$$

The assumption of no interaction between $x_1$ and $x_2$ is:
(A) the OR associated with a 1 kg/m$^2$ difference in BMI is the same for men and women (and vice versa).
(B) the RD associated with a 1 kg/m$^2$ difference in BMI is the same for men and women (and vice versa).

Interaction/effect-measure-modification

$$\text{logit}\,\Pr(Y = 1 \mid x_1, x_2, x_3) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age}$$

Remember every model is an approximation: they are wrong!
It is impossible (for me) to imagine a model where BMI has exactly the same effect for men and women.
In my world a model with interaction between sex and BMI will always be a better approximation to the real world than a model without the interaction!
So if you have enough data then any interaction between any two variables will be statistically significant!
For a specific model: if an interaction is not statistically significant, then it is because you do not have enough data!!
If you want to include all significant interactions then you had better not have a large data set.

Interaction/effect-measure-modification

$$\text{logit}\,\Pr(Y = 1 \mid x_1, x_2, x_3) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age}$$

We are not interested in statistically significant interactions!
We are interested in important/relevant interactions in connection with the subject-matter problem.
As with other effects, interactions:
1. Should be modelled based on the subject-matter problem,
2. Reported by an estimate with CI,
3. And discussed based on the subject-matter problem.
Note: 2 and 3 are only relevant if the interaction is the focus of the problem.

Interaction/effect-measure-modification
It is my experience that understanding and working with interactions is very difficult for non-statisticians!
This can lead the researcher to:
- Ignore interactions/effect modifiers, not based on subject-matter considerations, but because it is too complicated.
- Fall back on a statistical-significance view of relevance and not the subject-matter point of view.
Of course, if the subject-matter considerations and the type of model indicate that you need to incorporate interactions in your model, then you have to do it!
But this requires that you know how to do it and how to interpret the parameters and the estimates!

Interaction/effect-measure-modification: technicalities

$$\text{logit}\,\Pr(Y = 1) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age} + \beta_5 \, \text{BMI} \cdot \text{Male}$$

$$OR_{\text{BMI}}^{\text{Female}} = \exp(\beta_1), \qquad OR_{\text{BMI}}^{\text{Male}} = \exp(\beta_1 + \beta_5) = OR_{\text{BMI}}^{\text{Female}} \cdot \exp(\beta_5)$$

So the measure of effect-measure-modification is the ratio

$$ROR = \frac{OR_{\text{BMI}}^{\text{Male}}}{OR_{\text{BMI}}^{\text{Female}}} = \exp(\beta_5)$$

If this interaction (ratio) is 1.10, then the effect of a 1 kg/m$^2$ difference in BMI is 10% higher among men than among women.
When we measure effect by odds ratios!!! And adjusted (linearly) for age.

Interaction/effect-measure-modification: technicalities

$$\text{logit}\,\Pr(Y = 1) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{Male} + \beta_3 \text{Age} + \beta_5 \, \text{BMI} \cdot \text{Male}$$

$$OR_{\text{BMI}}^{\text{Female}} = \exp(\beta_1), \qquad OR_{\text{BMI}}^{\text{Male}} = \exp(\beta_1 + \beta_5) = OR_{\text{BMI}}^{\text{Female}} \cdot \exp(\beta_5)$$

Note: this is not the effect of BMI; it is the effect of BMI among men!

$$OR_{\text{male vs female}}^{\text{BMI}=0} = \exp(\beta_2), \qquad OR_{\text{male vs female}}^{\text{BMI}=1} = \exp(\beta_2 + \beta_5) = OR_{\text{male vs female}}^{\text{BMI}=0} \cdot ROR$$

Note: this is not the effect of Sex; it is the effect of Sex among persons with BMI=0!
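The odds ratios in a logistic model with a BMI by Male interaction can be computed directly from the coefficients. The sketch below uses hypothetical coefficient values; beta1, beta2 and beta5 follow the numbering used above.

```python
import math

# Hypothetical coefficients (illustration only).
beta1 = 0.10            # BMI
beta2 = -0.50           # Male
beta5 = math.log(1.10)  # BMI * Male interaction, chosen so that ROR = 1.10


def bmi_odds_ratio(male):
    """OR for a 1 kg/m^2 difference in BMI, among women (male=0) or men (male=1)."""
    return math.exp(beta1 + beta5 * male)


def sex_odds_ratio(bmi):
    """OR for male vs female among persons with the given BMI."""
    return math.exp(beta2 + beta5 * bmi)


ror = bmi_odds_ratio(1) / bmi_odds_ratio(0)
print(bmi_odds_ratio(0), bmi_odds_ratio(1), ror)
print(sex_odds_ratio(0), sex_odds_ratio(1), sex_odds_ratio(1) / sex_odds_ratio(0))
```

The same ratio, exp(beta5), appears in both directions: it is the ratio between the BMI odds ratios for men and women, and also the factor by which the male vs female odds ratio changes per 1 kg/m^2 of BMI.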

Always have sensible reference values for the variables

$$\text{logit}\,\Pr(Y = 1) = \beta_0 + \beta_1 (\text{BMI} - 24) + \beta_2 \text{Male} + \beta_3 \text{Age} + \beta_5 \, (\text{BMI} - 24) \cdot \text{Male}$$

$$OR_{\text{male vs female}}^{\text{BMI}=24} = \exp(\beta_2), \qquad OR_{\text{male vs female}}^{\text{BMI}=24+1} = \exp(\beta_2 + \beta_5) = OR_{\text{male vs female}}^{\text{BMI}=24} \cdot ROR$$

Note: this is not the effect of Sex; it is the effect of Sex among persons with BMI=24!

Fitted model:

$$\text{logit}\,\Pr(Y = 1) = -3.33 + 0.056\,\text{Age} + 0.122\,(\text{BMI} - 24) - 0.115\,\text{Male} + 0.025\,(\text{BMI} - 24) \cdot \text{Male}$$
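Centring BMI at a reference value does not change the fitted model; it only changes what the intercept and the Male coefficient mean. A minimal sketch with hypothetical coefficients (b0, b1, b2, b3, b5 for the uncentred model) shows the reparametrisation:

```python
def logit_uncentred(coef, bmi, male, age):
    """Linear predictor with BMI entered as it is."""
    b0, b1, b2, b3, b5 = coef
    return b0 + b1 * bmi + b2 * male + b3 * age + b5 * bmi * male


def recentre(coef, ref):
    """Coefficients of the same model when BMI is replaced by (BMI - ref)."""
    b0, b1, b2, b3, b5 = coef
    return (b0 + b1 * ref, b1, b2 + b5 * ref, b3, b5)


def logit_centred(coef, bmi, male, age, ref):
    """Linear predictor with BMI entered as (BMI - ref)."""
    c0, c1, c2, c3, c5 = coef
    return c0 + c1 * (bmi - ref) + c2 * male + c3 * age + c5 * (bmi - ref) * male


# Hypothetical uncentred coefficients (illustration only).
uncentred = (-3.0, 0.12, -0.70, 0.05, 0.025)
centred = recentre(uncentred, ref=24.0)

same = logit_uncentred(uncentred, 30.0, 1, 50.0)
also_same = logit_centred(centred, 30.0, 1, 50.0, ref=24.0)
print(same, also_same)
print(uncentred[2], centred[2])  # log OR, male vs female, at BMI 0 and at BMI 24
```

The predictions are identical, but the Male coefficient moves from the log odds ratio at BMI 0, a value nobody has, to the log odds ratio at BMI 24.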

$$OR_{\text{BMI}}^{\text{Female}} = 1.13, \qquad OR_{\text{male vs female}}^{\text{BMI}=24} = 0.89, \qquad ROR = 1.026$$

$$OR_{\text{BMI}}^{\text{Male}} = 1.13 \cdot 1.026 = 1.16$$

$$OR_{\text{male vs female}}^{\text{BMI}=35} = 0.89 \cdot 1.026^{11} = 1.18$$
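The derived odds ratios follow from the three reported quantities by simple arithmetic. The sketch below assumes the reported values are an odds ratio of 1.13 per 1 kg/m^2 of BMI among women, an odds ratio of 0.89 for male vs female at BMI 24, and a ratio of odds ratios of 1.026; the exponent is the distance from the reference value, 35 - 24 = 11.

```python
# Reported (rounded) estimates, as read from the slide.
or_bmi_female = 1.13   # per 1 kg/m^2 of BMI, among women
or_sex_at_24 = 0.89    # male vs female at the reference BMI of 24
ror = 1.026            # ratio of odds ratios, exp(beta5)

# BMI effect among men: the female odds ratio times the ROR.
or_bmi_male = or_bmi_female * ror

# Sex effect at BMI 35: one factor of ROR for each kg/m^2 above the reference.
bmi_steps = 35 - 24
or_sex_at_35 = or_sex_at_24 * ror ** bmi_steps

print(round(or_bmi_male, 2), round(or_sex_at_35, 2))
```

A ratio of odds ratios that looks negligible per unit of BMI is enough to move the male vs female odds ratio from below 1 at BMI 24 to above 1 at BMI 35.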