A comparison of 5 software implementations of mediation analysis
1 Faculty of Health Sciences
A comparison of 5 software implementations of mediation analysis
Liis Starkopf, Thomas A. Gerds, Theis Lange
Section of Biostatistics, University of Copenhagen
2 Illustrative example
Examine pathways between physical fitness and post-compulsory education.
- 1084 students from all public elementary schools in Aalborg
- Health examination (2010): high or low physical fitness
- Followed by grades and post-compulsory education (2014)
3 Mediation question
Causal diagram with four nodes: age, ethnicity, parental income and education (C); physical fitness (A); academic achievement (M); post-compulsory education (Y).
How much of the effect of physical fitness on attendance in post-compulsory education is mediated through academic achievement?
4 Objectives
- How to do mediation analysis in practice?
  - Implement mediation analysis on the illustrative data using the SAS mediation macro and the R package medflex
- What software solutions are available? What can they do? What are the differences?
  - Comparison of five estimation methods and their software solutions
  - Simulation study
5 Counterfactual framework
General definition:
- $M(a)$: the mediator that would have been observed if, possibly contrary to fact, the exposure A had been set to $a$.
- $Y(a, M(a^*))$: the outcome that would have been observed if, possibly contrary to fact, the exposure A had been set to $a$ and the mediator M had been set to the value it would have taken if A had been set to $a^*$.
6 Counterfactual framework
Example:
- $M(0)$: the grade point average that would have been observed if, possibly contrary to fact, the level of physical fitness had been set to low (A = 0).
- $Y(1, M(0))$: the attendance in post-compulsory education that would have been observed if, possibly contrary to fact, the level of physical fitness had been set to high (A = 1) and the grade point average had been set to the value it would have taken if the level of physical fitness had been set to low, $M(0)$.
7 Marginal natural effects
For some link function $g$, the marginal total effect decomposes into a natural direct and a natural indirect effect:
$g\{E[Y(a, M(a))]\} - g\{E[Y(a^*, M(a^*))]\}$   (marginal total effect)
$= g\{E[Y(a, M(a^*))]\} - g\{E[Y(a^*, M(a^*))]\}$   (marginal natural direct effect)
$+ \; g\{E[Y(a, M(a))]\} - g\{E[Y(a, M(a^*))]\}$   (marginal natural indirect effect)
8 Identifiability conditions
- No uncontrolled confounding for the exposure-outcome, exposure-mediator or mediator-outcome relations:
  - $Y(a, m) \perp\!\!\!\perp A \mid C$ for all levels of $a$ and $m$,
  - $M(a) \perp\!\!\!\perp A \mid C$ for all levels of $a$,
  - $Y(a, m) \perp\!\!\!\perp M \mid A = a, C$ for all levels of $a$ and $m$.
- No intertwined causal pathways: $Y(a, m) \perp\!\!\!\perp M(a^*) \mid C$ for all levels of $a$, $a^*$ and $m$.
- Positivity: $f(m \mid A, C) > 0$ w.p.1 for each $m$.
- Consistency: if $A = a$, then $M(a) = M$ w.p.1; if $A = a$ and $M = m$, then $Y(a, m) = Y$ w.p.1.
9 Analytic formulas of natural effects
Regression model for the outcome Y:
$g_Y\{E[Y \mid A = a, M = m, C = c]\} = \theta_0 + \theta_1 a + \theta_2 m + \theta_3 a m + \theta_4^T c$
Regression model for the mediator M:
$g_M\{E[M \mid A = a, C = c]\} = \beta_0 + \beta_1 a + \beta_2^T c$
- Under the identifiability conditions, closed-form expressions for the natural direct and indirect effects can be derived as combinations of $\beta$ and $\theta$
- Valeri and VanderWeele [1] implemented the analytic formulas in the SAS/SPSS mediation macros
[1] Linda Valeri and Tyler J. VanderWeele. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18(2):137.
10 SAS/SPSS mediation macro
- Conditional natural effects at a fixed level of C, on the scale of the linear predictor ($g_Y$)
- Separate formulas for each combination of the mediator and outcome models
- Binary outcome:
  - For a logistic regression model, the formulas hold only if the outcome is rare
  - Alternatively, a log-linear model has to be used
11 Illustrative example
Not attending post-compulsory education is a rare event, $P(Y = 0) = 0.08$.
Logistic regression model for Y:
$\mathrm{logit}\{P(Y = 0 \mid A = a, M = m, C = c)\} = \theta_0 + \theta_1 a + \theta_2 m + \theta_4^T c$
Linear regression model for M:
$E[M \mid A = a, C = c] = \beta_0 + \beta_1 a + \beta_2^T c$
Resulting formulas:
$\log\{OR^{NDE}(a = 1, a^* = 0)\} = \theta_1$
$\log\{OR^{NIE}(a = 1, a^* = 0)\} = \beta_1 \theta_2$
$\log\{OR^{TE}(a = 1, a^* = 0)\} = \theta_1 + \beta_1 \theta_2$
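The resulting formulas above can be turned into odds ratios by simple arithmetic. The following is a minimal Python sketch of that step, not the SAS macro itself; the function name `natural_effect_odds_ratios` and all coefficient values are hypothetical and chosen only for illustration. It assumes the setting of this slide: a logistic outcome model with a rare outcome, a linear mediator model, and no exposure-mediator interaction.

```python
import math


def natural_effect_odds_ratios(theta1, theta2, beta1, a=1.0, a_star=0.0):
    """Odds ratios for the natural direct, indirect and total effects.

    theta1: exposure coefficient in the logistic outcome model
    theta2: mediator coefficient in the logistic outcome model
    beta1:  exposure coefficient in the linear mediator model
    Assumes a rare outcome and no exposure-mediator interaction.
    """
    delta = a - a_star
    log_or_nde = theta1 * delta
    log_or_nie = beta1 * theta2 * delta
    return {
        "NDE": math.exp(log_or_nde),
        "NIE": math.exp(log_or_nie),
        "TE": math.exp(log_or_nde + log_or_nie),
    }


# Hypothetical coefficients, NOT the estimates from the slides.
ors = natural_effect_odds_ratios(theta1=-0.5, theta2=-0.8, beta1=0.5)
for name, value in ors.items():
    print(f"OR {name} = {value:.3f}")
```

Because the effects add on the log-odds scale, the total-effect odds ratio is the product of the direct and indirect odds ratios.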
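12 Implementation
    /* Load the mediation macro and import the data */
    %INC "MEDIATION.sas";
    PROC IMPORT DATAFILE="d1.csv" OUT=d1 DBMS=csv;
    RUN;
    /* Logistic outcome model, linear mediator model, no A x M interaction */
    %MEDIATION(data=d1, yvar=attend, avar=fitness, mvar=gpa,
               cvar=ethni age14 age15 income1 income2 income3 educ1 educ2 educ3,
               a0=0, a1=1, m=0, nc=4, c=,
               yreg=logistic, mreg=linear, interaction=false)
    RUN;
Output: listing of the data set d1 with columns Obs, fitness, gpa, attend, age, income, educ, age14, age15, income1, income2, income3, educ1, educ2, educ3 (data values not preserved).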
13 Regression for mediator M
In the %MEDIATION call from slide 12, mvar=gpa names the mediator and mreg=linear requests a linear regression of M on A and C.
14 Regression for mediator M
SAS linear regression output (fit statistics: Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var) with a parameter estimates table (columns: Variable, DF, Parameter Estimate, Standard Error, t Value, Pr > |t|) for Intercept, fitness, ethni, the age, income and educ indicators. The numeric values are not preserved; the surviving p-values are all <.0001.
15 Regression for outcome Y
In the %MEDIATION call from slide 12, yvar=attend names the outcome and yreg=logistic requests a logistic regression of Y on A, M and C.
16 Regression for outcome Y
SAS output from the LOGISTIC procedure (18:27 Thursday, August 11): analysis of maximum likelihood estimates (columns: Parameter, DF, Estimate, Standard Error, Wald Chi-Square, Pr > ChiSq) and odds ratio estimates (point estimate with 95% Wald confidence limits) for Intercept, fitness, gpa, ethni and the age, income and educ indicators. The numeric values are not preserved.
17 Exposure-mediator interaction
In the %MEDIATION call from slide 12, interaction=false fits the outcome model without an exposure-mediator (A x M) interaction term.
18 Other options
Remaining arguments of the %MEDIATION call from slide 12:
- a0: baseline level of exposure (unexposed)
- a1: new exposure level
- c: fixed value for C at which conditional effects are computed
- nc: number of baseline covariates
- m: fixed value for M at which the controlled direct effect is computed
19 Estimates of natural and controlled direct effects
SAS output (18:27 Thursday, August 11): table with columns Obs, Effect, Estimate, p_value, 95% CI lower, 95% CI upper and rows cde=nde, nie, total effect, annotated with OR NDE, OR NIE and OR TE. The numeric values are not preserved.
20 Interpretation
OR NIE = (value not preserved): changing the grade point average from the value that would have been observed at the low level of physical fitness, M(0), to the value that would have been observed at the high level of physical fitness, M(1), while keeping physical fitness at the high level (A = 1), increases the odds of attending post-compulsory education by 1 / OR NIE = (value not preserved) times.
21 Natural effect models
Lange et al. [2] and Vansteelandt et al. [3] suggested using so-called natural effect models:
$g_Y\{E[Y(a, M(a^*)) \mid C = c]\} = \theta_0 + \theta_1 a + \theta_2 a^* + \theta_3^T c$
- Conditional natural effects at the observed values of C, given on the scale of the linear predictor $g_Y$
- $\theta_1 (a - a^*)$: natural direct effect
- $\theta_2 (a - a^*)$: natural indirect effect
- Implemented in the R package medflex
[2] Theis Lange, Stijn Vansteelandt, and Maarten Bekaert. A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology, 176(3).
[3] Stijn Vansteelandt, Maarten Bekaert, and Theis Lange. Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods, 1(1).
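22 Estimation of natural effect models
At first glance, it seems that fitting natural effect models requires data on the nested counterfactuals $Y(a, M(a^*))$.
Expanded data table (columns: $i$, $A_i$, $a$, $a^*$, $M_i$, $M_i(a^*)$, $Y_i$, $Y_i(a, M_i(a^*))$): each subject contributes two rows with $M_i(a^*) = M_i$. In one row the nested counterfactual equals the observed outcome $Y_i$; in the other row it is unknown (?).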
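23 Estimation of natural effect models
Vansteelandt et al. suggested imputing the missing counterfactuals $Y(a, M(a^*))$.
Expanded data table (columns: $i$, $A_i$, $a$, $a^*$, $M_i$, $M_i(a^*)$, $Y_i$, $Y_i(a, M_i(a^*))$): each subject contributes two rows with $M_i(a^*) = M_i$. In one row the nested counterfactual equals the observed outcome $Y_i$; in the other row it is imputed by $\hat{E}[Y_i \mid A = a, M_i, C_i]$.
Imputation model:
$g_Y\{E[Y \mid A = a, M = m, C = c]\} = \beta_0 + \beta_1 a + \beta_2 m + \beta_3 c$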
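The expansion and imputation step described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the medflex implementation: it assumes a binary exposure, a single covariate, a logistic imputation model whose coefficients are already known, and the helper names `expit` and `expand_and_impute` are hypothetical.

```python
import math


def expit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def expand_and_impute(rows, beta):
    """Duplicate each subject and fill in the nested counterfactual.

    rows: list of dicts with keys A (0/1), M, C, Y
    beta: (b0, b1, b2, b3) of a logistic imputation model
          logit E[Y | A=a, M=m, C=c] = b0 + b1*a + b2*m + b3*c
    In every row a* equals the observed exposure, so M(a*) is the observed M.
    """
    b0, b1, b2, b3 = beta
    expanded = []
    for i, r in enumerate(rows):
        for a in (r["A"], 1 - r["A"]):
            if a == r["A"]:
                # a equals the observed exposure: the outcome is observed
                y_nested = float(r["Y"])
            else:
                # a differs from the observed exposure: impute the outcome
                y_nested = expit(b0 + b1 * a + b2 * r["M"] + b3 * r["C"])
            expanded.append({"id": i, "a": a, "a_star": r["A"],
                             "M": r["M"], "C": r["C"], "Y_nested": y_nested})
    return expanded


# Two made-up subjects and made-up imputation coefficients.
subjects = [{"A": 1, "M": 2.0, "C": 0.0, "Y": 1},
            {"A": 0, "M": -1.0, "C": 1.0, "Y": 0}]
expanded = expand_and_impute(subjects, beta=(0.0, 1.0, 0.5, -0.5))
for row in expanded:
    print(row)
```

The natural effect model is then fitted on the expanded data with `Y_nested` as the response and `a` and `a_star` as separate predictors.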
24 Illustrative example
Logistic regression model as the natural effect model:
$\mathrm{logit}\{P(Y(a, M(a^*)) = 0 \mid C = c)\} = \theta_0 + \theta_1 a + \theta_2 a^* + \theta_3^T c$
Logistic regression model for Y (imputation model):
$\mathrm{logit}\{P(Y = 0 \mid A = a, M = m, C = c)\} = \beta_0 + \beta_1 a + \beta_2 m + \beta_3^T c$
Conditional natural effects as odds ratios, given the observed level of the covariates C:
$\log\{OR^{NDE}(a = 1, a^* = 0)\} = \theta_1$
$\log\{OR^{NIE}(a = 1, a^* = 0)\} = \theta_2$
$\log\{OR^{TE}(a = 1, a^* = 0)\} = \theta_1 + \theta_2$
25 Expanding the data
R> library(medflex)
R> d1 <- read.csv("d.csv")
R> # Imputation model: exposure first, then mediator, then covariates
R> Yimp <- glm(attend ~ fitness + gpa + age + ethni + income + educ,
+            family = "binomial", data = d1)
R> impdata <- neImpute(Yimp)
26 Expanding the data
The imputation model can also be specified directly in neImpute:
R> library(medflex)
R> d1 <- read.csv("d.csv")
R> impdata <- neImpute(attend ~ fitness + gpa + age + ethni + income + educ,
+                     data = d1, family = "binomial")
27 Fitting the natural effect model
R> Yfit <- neModel(formula = attend ~ fitness0 + fitness1 + age + ethni + income + educ,
+                 expData = impdata, se = "robust", family = "binomial")
29 Other arguments
Remaining arguments of the neModel call from slide 27:
- expData: the expanded and imputed data set
- se: type of standard errors ("robust": based on the delta method; "bootstrap": based on 1000 bootstrap samples)
30 Estimates of the natural effects
R> summary(Yfit)
Output: natural effect model with robust standard errors based on the sandwich estimator. Exposure: fitness. Mediator(s): gpa. The parameter estimates table (columns: Estimate, Std. Error, z value, Pr(>|z|); rows: (Intercept), fitness0, fitness1, ethni and the age, income and educ levels) has lost its numeric values.
OR NDE = exp{ 0.513} and OR NIE = exp{ 0.432} (resulting values not preserved).
31 Estimate of the total effect
R> summary(neEffdecomp(Yfit))
Output: effect decomposition on the scale of the linear predictor, with standard errors based on the sandwich estimator, conditional on ethni, age, income and educ, with x* = 0. The table (columns: Estimate, Std. Error, z value, Pr(>|z|); rows: natural direct effect, natural indirect effect, total effect; univariate p-values reported) has lost its numeric values.
OR TE = exp{ 0.946} (resulting value not preserved).
32 Interpretation
OR NIE = (value not preserved): changing the grade point average from the value that would have been observed at the low level of physical fitness, M(0), to the value that would have been observed at the high level of physical fitness, M(1), while keeping physical fitness at the high level (A = 1), increases the odds of attending post-compulsory education by 1 / OR NIE = (value not preserved) times.
33 Other estimation methods considered in the paper
- Weighting approach in the R package medflex
- Approach based on Monte Carlo approximations, implemented in the R package mediation
- Inverse odds ratio weighted (IORW) estimation of natural effects, with R code examples
34 Comparison
Methods compared: SAS/SPSS macro, medflex (W), medflex (I), mediation, IORW.
Parameters of interest
- Marginal or conditional:
  - SAS/SPSS: conditional at a fixed level of C
  - medflex (W), medflex (I), IORW: conditional at the observed level of C
  - mediation: marginal
- Scale:
  - SAS/SPSS, medflex (W), medflex (I), IORW: corresponds to g
  - mediation: always a difference, i.e. g = identity
Modelling
- Required models:
  - SAS/SPSS: M | A, C and Y | A, M, C
  - medflex (W): M | A, C and Y(a, M(a*)) | C
  - medflex (I): Y | A, M, C and Y(a, M(a*)) | C
  - mediation: M | A, C and Y | A, M, C
  - IORW: A | M, C and Y | A, C
- Interactions: A x M for all five methods; A x C for four of the five (column alignment not recoverable)
Type of variables
- Exposure: continuous, binary and polytomous for all five methods
- Mediator: continuous and binary for all five methods; count and polytomous for four of the five; multidimensional for three; failure time for two (column alignment not recoverable)
- Outcome: continuous, binary and count for all five methods; failure time for three; polytomous for two (column alignment not recoverable)
35 Simulation study
2000 simulated data sets with 200 observations each.
Set-up, with ε ~ t(df = 10) (some operators and coefficients are missing from the transcript):
- P(C = 1) = 0.7
- P(A = 1 | C = c) = Φ( 0.3c)
- M = A 0.7C + ε
- P(Y = 1 | A = a, M = m, C = c) = Φ( a + 0.2m + 0.2c)
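A data-generating process of this shape (binary C, probit model for A, linear model with t-distributed errors for M, probit model for Y) could be sketched as follows. This is only an illustration in Python: the transcript has lost several coefficients and signs, so every entry of `PLACEHOLDER_COEF` is a hypothetical value, and the names `t_variate` and `simulate` are introduced here rather than taken from the paper.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def t_variate(rng, df):
    """Draw from Student's t as Z / sqrt(chi2_df / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def simulate(n, rng, coef):
    """Return n tuples (c, a, m, y) from the probit/linear set-up."""
    data = []
    for _ in range(n):
        c = 1 if rng.random() < 0.7 else 0
        a = 1 if rng.random() < PHI(coef["a0"] + coef["ac"] * c) else 0
        m = coef["m0"] + coef["ma"] * a + coef["mc"] * c + t_variate(rng, 10)
        p_y = PHI(coef["y0"] + coef["ya"] * a + coef["ym"] * m + coef["yc"] * c)
        y = 1 if rng.random() < p_y else 0
        data.append((c, a, m, y))
    return data


# Placeholder coefficients only: intercepts, signs and some slopes are
# NOT recoverable from the slide and are assumed here.
PLACEHOLDER_COEF = {"a0": 0.0, "ac": 0.3,
                    "m0": 0.0, "ma": 1.0, "mc": -0.7,
                    "y0": 0.0, "ya": 0.5, "ym": 0.2, "yc": 0.2}

sample = simulate(200, random.Random(1), PLACEHOLDER_COEF)
print(sample[:3])
```

Passing a seeded `random.Random` instance keeps each simulated data set reproducible, which is convenient when the same 2000 data sets are analysed by several estimation methods.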
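36 Simulation study
True natural effects: estimated from a simulated data set with 100,000 observations.
Relative bias:
$\frac{1}{2000} \sum_{i=1}^{2000} \frac{\widehat{NDE}_i - NDE_{true}}{NDE_{true}}$
Relative RMSE:
$\sqrt{\frac{1}{2000} \sum_{i=1}^{2000} \left( \frac{\widehat{NDE}_i - NDE_{true}}{NDE_{true}} \right)^2}$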
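The two performance measures are straightforward to compute from a vector of estimates. A minimal Python sketch follows; the function names and the three example estimates are made up for illustration, and the averages are taken over however many simulated data sets are supplied.

```python
import math


def relative_bias(estimates, truth):
    """Mean of (estimate - truth) / truth over all simulated data sets."""
    return sum((e - truth) / truth for e in estimates) / len(estimates)


def relative_rmse(estimates, truth):
    """Root of the mean squared relative error over all simulated data sets."""
    mean_sq = sum(((e - truth) / truth) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mean_sq)


# Made-up estimates of a natural direct effect whose true value is 1.0.
nde_hat = [0.9, 1.1, 1.2]
print(relative_bias(nde_hat, 1.0))
print(relative_rmse(nde_hat, 1.0))
```

The same two functions apply unchanged to the indirect and total effects.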
37 Results for direct effect
Table with columns Method, Rel.Bias, Rel.RMSE, Cov.P (%) and rows SAS macro, medflex (I), medflex (W), R mediation, IORW. The numeric values are not preserved.
38 Results for indirect effect
Table with columns Method, Rel.Bias, Rel.RMSE, Cov.P (%) and rows SAS macro, medflex (I), medflex (W), R mediation, IORW. The numeric values are not preserved.
39 Results for total effect
Table with columns Method, Rel.Bias, Rel.RMSE, Cov.P (%) and rows SAS macro, medflex (I), medflex (W), R mediation, IORW. The numeric values are not preserved.
40 Results depending on sample size
Figure: relative bias and relative RMSE plotted against sample size (n) for medflex (W), medflex (I), R mediation, IORW and SAS.
41 Conclusions
- All estimation methods perform well in this particular setting
- IORW estimation seems to have larger relative bias and relative RMSE; this needs further investigation
- The choice of estimation method depends on the parameter of interest and on software preference
- Mediation analysis can be applied fairly easily in most standard software packages
- Our paper gives guidance and examples on how to apply mediation analysis with 5 different software solutions
- If you run into problems, you are very welcome to contact us!
42 Thank you!
Department of Biostatistics University of Copenhagen
Comparison of five software solutions to mediation analysis Liis Starkopf Mikkel Porsborg Andersen Thomas Alexander Gerds Christian Torp-Pedersen Theis Lange Research Report 17/01 Department of Biostatistics
Multivariate Behavioral Research ISSN: 0027-3171 (Print) 1532-7906 (Online) Journal homepage: http://www.tandfonline.com/loi/hmbr20 Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond
More informationLongitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois
Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control
More informationRandom Intercept Models
Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work
More informationSTAT 3A03 Applied Regression With SAS Fall 2017
STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.
More informationHelp! Statistics! Mediation Analysis
Help! Statistics! Lunch time lectures Help! Statistics! Mediation Analysis What? Frequently used statistical methods and questions in a manageable timeframe for all researchers at the UMCG. No knowledge
More informationCourse Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model
Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationGEE for Longitudinal Data - Chapter 8
GEE for Longitudinal Data - Chapter 8 GEE: generalized estimating equations (Liang & Zeger, 1986; Zeger & Liang, 1986) extension of GLM to longitudinal data analysis using quasi-likelihood estimation method
More informationMULTINOMIAL LOGISTIC REGRESSION
MULTINOMIAL LOGISTIC REGRESSION Model graphically: Variable Y is a dependent variable, variables X, Z, W are called regressors. Multinomial logistic regression is a generalization of the binary logistic
More informationSection 9c. Propensity scores. Controlling for bias & confounding in observational studies
Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a
More informationLecture 11 Multiple Linear Regression
Lecture 11 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 11-1 Topic Overview Review: Multiple Linear Regression (MLR) Computer Science Case Study 11-2 Multiple Regression
More informationChapter 1 Linear Regression with One Predictor
STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationFaculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics
Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA
More informationBinary Dependent Variables
Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome
More informationSensitivity analysis and distributional assumptions
Sensitivity analysis and distributional assumptions Tyler J. VanderWeele Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA vanderweele@uchicago.edu
More informationSTAT 7030: Categorical Data Analysis
STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012
More informationAdvanced Quantitative Data Analysis
Chapter 24 Advanced Quantitative Data Analysis Daniel Muijs Doing Regression Analysis in SPSS When we want to do regression analysis in SPSS, we have to go through the following steps: 1 As usual, we choose
More informationHandout 1: Predicting GPA from SAT
Handout 1: Predicting GPA from SAT appsrv01.srv.cquest.utoronto.ca> appsrv01.srv.cquest.utoronto.ca> ls Desktop grades.data grades.sas oldstuff sasuser.800 appsrv01.srv.cquest.utoronto.ca> cat grades.data
More informationStat 642, Lecture notes for 04/12/05 96
Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal
More informationSEM REX B KLINE CONCORDIA D. MODERATION, MEDIATION
ADVANCED SEM REX B KLINE CONCORDIA D1 D. MODERATION, MEDIATION X 1 DY Y DM 1 M D2 topics moderation mmr mpa D3 topics cpm mod. mediation med. moderation D4 topics cma cause mediator most general D5 MMR
More informationAssessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis
Assessing In/Direct Effects: from Structural Equation Models to Causal Mediation Analysis Part 1: Vanessa Didelez with help from Ryan M Andrews Leibniz Institute for Prevention Research & Epidemiology
More informationSAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1
CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to
More informationA Longitudinal Look at Longitudinal Mediation Models
A Longitudinal Look at Longitudinal Mediation Models David P. MacKinnon, Arizona State University Causal Mediation Analysis Ghent, Belgium University of Ghent January 8-9, 03 Introduction Assumptions Unique
More informationlme4 Luke Chang Last Revised July 16, Fitting Linear Mixed Models with a Varying Intercept
lme4 Luke Chang Last Revised July 16, 2010 1 Using lme4 1.1 Fitting Linear Mixed Models with a Varying Intercept We will now work through the same Ultimatum Game example from the regression section and
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationMcGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper
Student Name: ID: McGill University Faculty of Science Department of Mathematics and Statistics Statistics Part A Comprehensive Exam Methodology Paper Date: Friday, May 13, 2016 Time: 13:00 17:00 Instructions
More informationBIOS 625 Fall 2015 Homework Set 3 Solutions
BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's
More informationIntroduction to SAS proc mixed
Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationPrerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3
University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.
More informationPaper: ST-161. Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop UMBC, Baltimore, MD
Paper: ST-161 Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop Institute @ UMBC, Baltimore, MD ABSTRACT SAS has many tools that can be used for data analysis. From Freqs
More information