A comparison of 5 software implementations of mediation analysis

1 Faculty of Health Sciences
A comparison of 5 software implementations of mediation analysis
Liis Starkopf, Thomas A. Gerds, Theis Lange
Section of Biostatistics, University of Copenhagen

2 Illustrative example
Examine pathways between physical fitness and post-compulsory education.
1084 students from all public elementary schools in Aalborg.
Health examination (2010): high vs. low physical fitness, followed by grades and post-compulsory education (2014).

3 Mediation question
Baseline covariates (C): age, ethnicity, parental income and education
Exposure (A): physical fitness
Mediator (M): academic achievement
Outcome (Y): post-compulsory education
How much of the effect of physical fitness on attendance in post-compulsory education is mediated through academic achievement?

4 Objectives
How to do mediation analysis in practice?
  Implement mediation analysis on the illustrative data using
    the SAS mediation macro
    the R package medflex
What software solutions are available? What can they do? What are the differences?
  Comparison of five estimation methods and their software solutions
  Simulation study

5 Counterfactual framework
General definition:
$M(a)$: the mediator that would have been observed if, possibly contrary to fact, the exposure $A$ had been set to $a$.
$Y(a, M(a^*))$: the outcome that would have been observed if, possibly contrary to fact, the exposure $A$ had been set to $a$ and the mediator $M$ had been set to the value it would have taken if $A$ had been set to $a^*$.

6 Counterfactual framework
Example:
$M(0)$: the grade point average that would have been observed if, possibly contrary to fact, the level of physical fitness had been set to low ($A = 0$).
$Y(1, M(0))$: the attendance in post-compulsory education that would have been observed if, possibly contrary to fact, the level of physical fitness had been set to high ($A = 1$) and the grade point average had been set to the value it would have taken if the level of physical fitness had been set to low ($M(0)$).

7 Marginal natural effects
For some link function $g$:
$$\underbrace{g\{E[Y(a, M(a))]\} - g\{E[Y(a^*, M(a^*))]\}}_{\text{marginal total effect}} = \underbrace{g\{E[Y(a, M(a^*))]\} - g\{E[Y(a^*, M(a^*))]\}}_{\text{marginal natural direct effect}} + \underbrace{g\{E[Y(a, M(a))]\} - g\{E[Y(a, M(a^*))]\}}_{\text{marginal natural indirect effect}}$$
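The decomposition can be checked numerically in a toy setting. The sketch below is only an illustration under stated assumptions: a binary exposure, a binary mediator, no covariates, the identity link, and made-up probabilities. The names `nested_mean` and `decompose` are hypothetical and do not come from any of the software packages discussed in the slides.

```python
# Toy illustration of the total = direct + indirect decomposition
# (identity link, binary mediator, no covariates; all numbers are made up).

# Hypothetical P(M = 1 | A = a)
mediator_prob = {0: 0.3, 1: 0.6}

# Hypothetical E[Y | A = a, M = m], keyed by (a, m)
outcome_mean = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.50}


def nested_mean(a, a_star):
    """E[Y(a, M(a*))] = sum over m of E[Y | A=a, M=m] * P(M=m | A=a*)."""
    p1 = mediator_prob[a_star]
    return outcome_mean[(a, 1)] * p1 + outcome_mean[(a, 0)] * (1.0 - p1)


def decompose(a, a_star):
    """Return total, natural direct and natural indirect effects."""
    total = nested_mean(a, a) - nested_mean(a_star, a_star)
    direct = nested_mean(a, a_star) - nested_mean(a_star, a_star)
    indirect = nested_mean(a, a) - nested_mean(a, a_star)
    return {"total": total, "direct": direct, "indirect": indirect}


effects = decompose(a=1, a_star=0)
print(effects)
```

With other link functions the same identity holds on the scale of $g$, because the middle term $g\{E[Y(a, M(a^*))]\}$ is added and subtracted.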

8 Identifiability conditions
No uncontrolled confounding for the exposure-outcome, exposure-mediator or mediator-outcome relations:
$$Y(a, m) \perp\!\!\!\perp A \mid C \quad \text{for all levels of } a \text{ and } m,$$
$$M(a) \perp\!\!\!\perp A \mid C \quad \text{for all levels of } a,$$
$$Y(a, m) \perp\!\!\!\perp M \mid A = a, C \quad \text{for all levels of } a \text{ and } m.$$
No intertwined causal pathways:
$$Y(a, m) \perp\!\!\!\perp M(a^*) \mid C \quad \text{for all levels of } a, a^* \text{ and } m.$$
Positivity:
$$f(m \mid A, C) > 0 \text{ w.p.1 for each } m.$$
Consistency:
if $A = a$, then $M(a) = M$ w.p.1;
if $A = a$ and $M = m$, then $Y(a, m) = Y$ w.p.1.

9 Analytic formulas of natural effects
Regression model for the outcome $Y$:
$$g_Y\{E[Y \mid A = a, M = m, C = c]\} = \theta_0 + \theta_1 a + \theta_2 m + \theta_3 a m + \theta_4^T c$$
Regression model for the mediator $M$:
$$g_M\{E[M \mid A = a, C = c]\} = \beta_0 + \beta_1 a + \beta_2^T c$$
Under the identifiability conditions, closed-form expressions for the natural direct and indirect effects can be derived as combinations of $\beta$ and $\theta$.
Valeri and VanderWeele¹ implemented the analytic formulas in the SAS/SPSS mediation macros.
¹ Linda Valeri and Tyler J. VanderWeele. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18(2):137.

10 SAS/SPSS mediation macro
Conditional natural effects at a fixed level of $C$, on the scale of the linear predictor ($g_Y$).
Separate formulas for each combination of the mediator and outcome models.
Binary outcome:
  For a logistic regression model, the formulas hold if the outcome is rare.
  Alternatively, a log-linear model has to be used.

11 Illustrative example
Not attending post-compulsory education is a rare event, $P(Y = 0) = 0.08$.
Logistic regression model for $Y$:
$$\operatorname{logit}\{P(Y = 0 \mid A = a, M = m, C = c)\} = \theta_0 + \theta_1 a + \theta_2 m + \theta_4^T c$$
Linear regression model for $M$:
$$E[M \mid A = a, C = c] = \beta_0 + \beta_1 a + \beta_2^T c$$
Resulting formulas:
$$\log\{OR^{NDE}(a = 1, a^* = 0)\} = \theta_1$$
$$\log\{OR^{NIE}(a = 1, a^* = 0)\} = \beta_1 \theta_2$$
$$\log\{OR^{TE}(a = 1, a^* = 0)\} = \theta_1 + \beta_1 \theta_2$$
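As a small worked example of these formulas, the following sketch turns regression coefficients into odds ratios for $a = 1$, $a^* = 0$. The coefficient values are hypothetical placeholders, not the estimates from the illustrative data, and the function name `odds_ratios_no_interaction` is made up for this illustration.

```python
import math


def odds_ratios_no_interaction(theta1, theta2, beta1):
    """Odds ratios for a = 1 versus a* = 0 in the setting of the slide:
    logistic outcome model without exposure-mediator interaction,
    linear mediator model, rare outcome.

    theta1: exposure coefficient in the outcome model
    theta2: mediator coefficient in the outcome model
    beta1:  exposure coefficient in the mediator model
    """
    log_or_nde = theta1
    log_or_nie = beta1 * theta2
    log_or_te = log_or_nde + log_or_nie
    return {
        "OR_NDE": math.exp(log_or_nde),
        "OR_NIE": math.exp(log_or_nie),
        "OR_TE": math.exp(log_or_te),
    }


# Hypothetical coefficients, chosen only to make the example concrete.
ors = odds_ratios_no_interaction(theta1=-0.5, theta2=-0.4, beta1=1.0)
print(ors)
```

Because the effects add on the log-odds scale, the total-effect odds ratio is the product of the direct and indirect odds ratios.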

12 Implementation
    %INC "MEDIATION.sas";
    PROC IMPORT DATAFILE="d1.csv" OUT=d1 DBMS=csv;
    RUN;
    %MEDIATION(data=d1, yvar=attend, avar=fitness, mvar=gpa,
               cvar=ethni age14 age15 income1 income2 income3 educ1 educ2 educ3,
               a0=0, a1=1, m=0, nc=4, c=,
               yreg=logistic, mreg=linear, interaction=false)
    RUN;
SAS output: a listing of the data set with columns Obs, fitness, gpa, attend, age, income, educ, age14, age15, income1, income2, income3, educ1, educ2 and educ3 (the data values were lost in transcription).

13 Regression for mediator M
In the %MEDIATION call, the mediator model is specified by mvar=gpa and mreg=linear.

14 Regression for mediator M
SAS output for the linear regression of gpa: fit statistics (Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var) and a Parameter Estimates table with columns Variable, DF, Parameter Estimate, Standard Error, t Value and Pr > |t|, and rows Intercept, fitness, ethni, age14, age15, income1, income2, income3, educ1, educ2 and educ3. The numeric values were lost in transcription; the surviving p-value entries (<.0001) belong to Intercept, fitness, age15, income3, educ1 and educ2, with a truncated entry for educ3.

15 Regression for outcome Y
In the %MEDIATION call, the outcome model is specified by yvar=attend and yreg=logistic.

16 Regression for outcome Y
SAS output from the LOGISTIC procedure: an Analysis of Maximum Likelihood Estimates table with columns Parameter, DF, Estimate, Standard Error, Wald Chi-Square and Pr > ChiSq, and rows Intercept, fitness, gpa, ethni, age14, age15, income1, income2, income3, educ1, educ2 and educ3; followed by Odds Ratio Estimates (point estimates with 95% Wald confidence limits). The numeric values were lost in transcription.

17 Exposure-mediator interaction
In the %MEDIATION call, interaction=false specifies an outcome model without an exposure-mediator interaction term.

18 Other options
Further arguments of the %MEDIATION call:
a0: baseline level of exposure (unexposed)
a1: new exposure level
c: fixed value of C at which conditional effects are computed
nc: number of baseline covariates
m: fixed value of M at which the controlled direct effect is computed

19 Estimates of natural and controlled direct effects
SAS output table with columns Obs, Effect, Estimate, p_value, _95 CI_lower and _95 CI_upper, and rows cde=nde, nie and total effect. The slide also reports OR NDE, OR NIE and OR TE. The numeric values were lost in transcription.

20 Interpretation
OR NIE = (value lost in transcription)
Changing the grade point average from the value that would have been observed at the low level of physical fitness ($M(0)$) to the value that would have been observed at the high level of physical fitness ($M(1)$), while actually keeping physical fitness at the high level ($A = 1$), increases the odds of attending post-compulsory education by 1/(value lost) = (value lost) times.

21 Natural effect models
Lange² and Vansteelandt³ suggested using so-called natural effect models:
$$g_Y\{E[Y(a, M(a^*)) \mid C = c]\} = \theta_0 + \theta_1 a + \theta_2 a^* + \theta_3^T c$$
Conditional natural effects at the observed values of $C$, given on the scale of the linear predictor $g_Y$:
$\theta_1 (a - a^*)$: natural direct effect
$\theta_2 (a - a^*)$: natural indirect effect
Implemented in the R package medflex.
² Theis Lange, Stijn Vansteelandt, and Maarten Bekaert. A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology, 176(3).
³ Stijn Vansteelandt, Maarten Bekaert, and Theis Lange. Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods, 1(1):13-158.

22 Estimation of natural effect models
At first glance, it seems that fitting natural effect models requires data on the nested counterfactuals $Y(a, M(a^*))$.
Expanded data set (the entries of the columns $A_i$, $a$ and $a^*$ were lost in transcription):
    i    M_i    M_i(a*)    Y_i    Y_i(a, M_i(a*))
    1    M_1    M_1        Y_1    Y_1
    1    M_1    M_1        Y_1    ?
    2    M_2    M_2        Y_2    Y_2
    2    M_2    M_2        Y_2    ?

23 Estimation of natural effect models
Vansteelandt et al. suggested imputing the missing counterfactuals $Y(a, M(a^*))$.
Expanded and imputed data set (the entries of the columns $A_i$, $a$ and $a^*$ were lost in transcription):
    i    M_i    M_i(a*)    Y_i    Y_i(a, M_i(a*))
    1    M_1    M_1        Y_1    Y_1
    1    M_1    M_1        Y_1    Ê[Y_1 | A = a, M_1, C_1]
    2    M_2    M_2        Y_2    Y_2
    2    M_2    M_2        Y_2    Ê[Y_2 | A = a, M_2, C_2]
Imputation model:
$$g_Y\{E[Y \mid A = a, M = m, C = c]\} = \beta_0 + \beta_1 a + \beta_2 m + \beta_3^T c$$
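The expansion step can be sketched in a few lines. This is only an illustration of the table layout above, assuming a binary exposure coded 0/1; it is not the medflex implementation, and the function name `expand_for_imputation` and the record keys are hypothetical. Each subject is duplicated: $a^*$ is held at the observed exposure, so $M_i(a^*) = M_i$ in both copies, while $a$ equals the observed exposure in the first copy (nested counterfactual observed) and the opposite level in the second copy (nested counterfactual missing, to be imputed from the outcome model).

```python
def expand_for_imputation(records):
    """Duplicate each subject for a binary exposure A coded 0/1.

    records: list of dicts with keys "A", "M", "Y".
    Returns one row per (subject, a) pair with a* fixed at the observed A.
    "Y_nested" is the observed Y when a == A and None when it still
    has to be imputed.
    """
    expanded = []
    for i, rec in enumerate(records, start=1):
        for a in (rec["A"], 1 - rec["A"]):
            observed = a == rec["A"]
            expanded.append({
                "i": i,
                "a": a,
                "a_star": rec["A"],
                "M": rec["M"],
                "Y": rec["Y"],
                "Y_nested": rec["Y"] if observed else None,
            })
    return expanded


# Two made-up subjects, purely for illustration.
toy = [{"A": 1, "M": 7.5, "Y": 1}, {"A": 0, "M": 5.0, "Y": 0}]
for row in expand_for_imputation(toy):
    print(row)
```

The rows with `Y_nested` equal to `None` correspond to the question marks in the first table; the imputation step would replace them with fitted values from the imputation model evaluated at exposure level `a`, the subject's own mediator value and covariates.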

24 Illustrative example
Logistic regression model as the natural effect model:
$$\operatorname{logit}\{P(Y(a, M(a^*)) = 0 \mid C = c)\} = \theta_0 + \theta_1 a + \theta_2 a^* + \theta_3^T c$$
Logistic regression model for $Y$ (imputation model):
$$\operatorname{logit}\{P(Y = 0 \mid A = a, M = m, C = c)\} = \beta_0 + \beta_1 a + \beta_2 m + \beta_3^T c$$
Conditional natural effects as odds ratios, given the observed level of the covariates $C$:
$$\log\{OR^{NDE}(a = 1, a^* = 0)\} = \theta_1$$
$$\log\{OR^{NIE}(a = 1, a^* = 0)\} = \theta_2$$
$$\log\{OR^{TE}(a = 1, a^* = 0)\} = \theta_1 + \theta_2$$

25 Expanding the data
    R> library(medflex)
    R> d1 <- read.csv("d.csv")
    R> # Imputation model for the outcome: exposure first, then mediator, then covariates
    R> Yimp <- glm(attend ~ fitness + gpa + age + ethni + income + educ,
    +              family = "binomial", data = d1)
    R> # Expand the data and impute the nested counterfactuals
    R> impdata <- neImpute(Yimp)

26 Expanding the data
The same step with the model formula passed directly to neImpute:
    R> library(medflex)
    R> d1 <- read.csv("d.csv")
    R> impdata <- neImpute(attend ~ fitness + gpa + age + ethni + income + educ,
    +                      data = d1, family = "binomial")

27 Fitting the natural effect model
    R> # fitness0 and fitness1 are the auxiliary exposure variables in the expanded data
    R> Yfit <- neModel(attend ~ fitness0 + fitness1 + age + ethni + income + educ,
    +                  expData = impdata, se = "robust", family = "binomial")

29 Other arguments
Further arguments of the neModel call:
expData: the expanded and imputed data set
se: type of standard errors ("robust": based on the Delta method; "bootstrap": based on 1000 bootstrap samples)

30 Estimates of the natural effects
    R> summary(Yfit)
    Natural effect model
    with robust standard errors based on the sandwich estimator
    ---
    Exposure: fitness
    Mediator(s): gpa
    ---
Parameter estimates: table with columns Estimate, Std. Error, z value and Pr(>|z|), and rows (Intercept), fitness0, fitness1, ethni and the levels of age, income and educ. The numeric values were lost in transcription; the surviving significance marks are ** for fitness1, * for ethni and ** for the first educ level.
OR NDE = exp{0.513} = (value lost)
OR NIE = exp{0.432} = (value lost)
The exponents are shown as transcribed; a leading minus sign may have been lost.

31 Estimate of the total effect
    R> summary(neEffdecomp(Yfit))
    Effect decomposition on the scale of the linear predictor
    with standard errors based on the sandwich estimator
    ---
    conditional on: ethni, age, income, educ
    with x* = 0, x = (value lost)
    ---
Table with columns Estimate, Std. Error, z value and Pr(>|z|), and rows natural direct effect, natural indirect effect (**) and total effect (*); univariate p-values are reported. The numeric values were lost in transcription.
OR TE = exp{0.946} = (value lost)
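On the scale of the linear predictor the total effect is the sum of the natural direct and indirect effects, so the three exponents printed on the slides can be checked against each other. The sketch below uses the magnitudes as they appear in the transcript (0.513, 0.432 and 0.946); any signs lost in transcription are an open question, so the resulting odds ratios are illustrative only and should not be read as the reported results.

```python
import math

# Magnitudes as printed in the transcript; signs may have been lost.
log_or_nde = 0.513
log_or_nie = 0.432
log_or_te_reported = 0.946

# Additivity on the log-odds scale: total = direct + indirect.
log_or_te_sum = log_or_nde + log_or_nie
rounding_gap = abs(log_or_te_sum - log_or_te_reported)

# Equivalent statement on the odds-ratio scale: OR_TE = OR_NDE * OR_NIE.
or_product = math.exp(log_or_nde) * math.exp(log_or_nie)

print(f"sum of exponents: {log_or_te_sum:.3f}, reported: {log_or_te_reported:.3f}")
print(f"gap attributable to rounding: {rounding_gap:.3f}")
print(f"product of odds ratios: {or_product:.3f}")
```

The gap between the summed and the reported exponent is within what rounding each estimate to three decimals would produce.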

32 Interpretation
OR NIE = (value lost in transcription)
Changing the grade point average from the value that would have been observed at the low level of physical fitness ($M(0)$) to the value that would have been observed at the high level of physical fitness ($M(1)$), while actually keeping physical fitness at the high level ($A = 1$), increases the odds of attending post-compulsory education by 1/(value lost) = (value lost) times.

33 Other estimation methods considered in the paper
Weighting approach in the R package medflex
Approach based on Monte Carlo approximations, implemented in the R package mediation
Inverse odds ratio weighted (IORW) estimation of natural effects, with R code examples

34 Comparison
Methods compared: SAS/SPSS macro, medflex (W), medflex (I), mediation, IORW.

Parameters of interest
  Marginal or conditional:
    SAS/SPSS: conditional, at a fixed level of C
    medflex (W): conditional, at the observed level of C
    medflex (I): conditional, at the observed level of C
    mediation: marginal
    IORW: conditional, at the observed level of C
  Scale:
    SAS/SPSS, medflex (W), medflex (I), IORW: corresponds to g
    mediation: always a difference, i.e. g = identity

Modelling
  Required models:
    SAS/SPSS: M | A, C and Y | A, M, C
    medflex (W): M | A, C and Y(a, M(a*)) | C
    medflex (I): Y | A, M, C and Y(a, M(a*)) | C
    mediation: M | A, C and Y | A, M, C
    IORW: A | M, C and Y | A, C
  Interactions: A × M is listed for all five methods; A × C is listed for four methods (the column assignment was lost in transcription).

Type of variables
  Exposure: continuous, binary and polytomous for all five methods.
  Mediator: continuous and binary for all five methods; count and polytomous for four methods each; the remaining entries are "multidimensional" (three times) and "failure time" (twice). The column assignment of the entries that do not cover all five methods was lost in transcription.
  Outcome: continuous, binary and count for all five methods; the last row reads, in the order transcribed: failure time, polytomous, polytomous, failure time, failure time.

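35 Simulation study
2000 simulated data sets with 200 observations each.
Set-up (intercepts, some coefficients and signs were lost in transcription and are shown as "..."):
$$P(C = 1) = 0.7$$
$$P(A = 1 \mid C = c) = \Phi(\ldots\, 0.3c)$$
$$M = \ldots\, A \,\ldots\, 0.7C + \varepsilon$$
$$P(Y = 1 \mid A = a, M = m, C = c) = \Phi(\ldots\, a + 0.2m + 0.2c)$$
with $\varepsilon \sim t(df = 10)$.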
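A data-generating mechanism of this shape can be sketched as follows. Several numbers in the set-up did not survive transcription, so the intercepts, the coefficient of the exposure in the mediator and outcome models, and the signs of the 0.3 and 0.7 terms below are hypothetical placeholders; only the structure (binary C and A with probit models, continuous M with t-distributed error, binary Y with a probit model) follows the slide. The function names are made up for this illustration.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def draw_t(rng, df):
    """Draw from a t distribution as Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def simulate(n, rng,
             alpha=(0.0, 0.3),             # A model: intercept (assumed), C coefficient
             mu=(0.0, 1.0, -0.7),          # M model: intercept, A coefficient (assumed), C coefficient (sign assumed)
             gamma=(0.0, 0.5, 0.2, 0.2)):  # Y model: intercept, A coefficient (assumed), M and C coefficients
    """Simulate n observations (c, a, m, y) from the sketched mechanism."""
    data = []
    for _ in range(n):
        c = 1 if rng.random() < 0.7 else 0
        a = 1 if rng.random() < PHI(alpha[0] + alpha[1] * c) else 0
        m = mu[0] + mu[1] * a + mu[2] * c + draw_t(rng, 10)
        p_y = PHI(gamma[0] + gamma[1] * a + gamma[2] * m + gamma[3] * c)
        y = 1 if rng.random() < p_y else 0
        data.append((c, a, m, y))
    return data


rng = random.Random(2017)
sample = simulate(200, rng)
print(sample[:3])
```

Repeating the call 2000 times with fresh draws would give the collection of data sets on which each estimation method is run.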

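36 Simulation study
True natural effects: estimated from a simulated data set with 100,000 observations.
Relative bias:
$$\frac{1}{2000} \sum_{i=1}^{2000} \frac{\widehat{NDE}_i - NDE_{true}}{NDE_{true}}$$
Relative RMSE:
$$\sqrt{\frac{1}{2000} \sum_{i=1}^{2000} \left( \frac{\widehat{NDE}_i - NDE_{true}}{NDE_{true}} \right)^2}$$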
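The two performance measures can be computed as below. This is a minimal sketch that assumes the usual definitions, with an average over the simulated data sets and a square root for the RMSE, since those parts of the formulas are not legible in the transcript. The function names are hypothetical.

```python
import math


def relative_bias(estimates, truth):
    """Mean of (estimate - truth) / truth over the simulated data sets."""
    return sum((est - truth) / truth for est in estimates) / len(estimates)


def relative_rmse(estimates, truth):
    """Square root of the mean squared relative error."""
    mean_sq = sum(((est - truth) / truth) ** 2 for est in estimates) / len(estimates)
    return math.sqrt(mean_sq)


# Made-up estimates of a natural direct effect whose true value is 1.0.
toy_estimates = [0.9, 1.1, 1.0, 1.2, 0.8]
print(relative_bias(toy_estimates, 1.0))
print(relative_rmse(toy_estimates, 1.0))
```

The same functions apply unchanged to the natural indirect effect and the total effect by passing the corresponding estimates and true value.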

37 Results for direct effect
Table with columns Method, Rel.Bias, Rel.RMSE and Cov.P (%), and rows SAS macro, medflex (I), medflex (W), R mediation and IORW. The numeric values were lost in transcription.

38 Results for indirect effect
Table with columns Method, Rel.Bias, Rel.RMSE and Cov.P (%), and rows SAS macro, medflex (I), medflex (W), R mediation and IORW. The numeric values were lost in transcription.

39 Results for total effect
Table with columns Method, Rel.Bias, Rel.RMSE and Cov.P (%), and rows SAS macro, medflex (I), medflex (W), R mediation and IORW. The numeric values were lost in transcription.

40 Results depending on sample size
Figure: relative bias and relative RMSE plotted against sample size (n) for medflex (W), medflex (I), R mediation, IORW and SAS.

41 Conclusions
All estimation methods perform well in this particular setting.
IORW estimation seems to have larger relative bias and relative RMSE; this needs further investigation.
The choice of estimation method depends on the parameter of interest and on software preference.
Mediation analysis can be applied fairly easily in most standard software packages.
Our paper will give guidance and examples on how to apply mediation analysis with 5 different software solutions.
If you run into problems, you are very welcome to contact us!

42 Thank you!

Department of Biostatistics University of Copenhagen

Department of Biostatistics University of Copenhagen Comparison of five software solutions to mediation analysis Liis Starkopf Mikkel Porsborg Andersen Thomas Alexander Gerds Christian Torp-Pedersen Theis Lange Research Report 17/01 Department of Biostatistics

More information

Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula.

Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula. FACULTY OF PSYCHOLOGY AND EDUCATIONAL SCIENCES Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula. Modern Modeling Methods (M 3 ) Conference Beatrijs Moerkerke

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016 Mediation analyses Advanced Psychometrics Methods in Cognitive Aging Research Workshop June 6, 2016 1 / 40 1 2 3 4 5 2 / 40 Goals for today Motivate mediation analysis Survey rapidly developing field in

More information

Statistical Methods for Causal Mediation Analysis

Statistical Methods for Causal Mediation Analysis Statistical Methods for Causal Mediation Analysis The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Citation Accessed Citable

More information

Observational Studies 4 (2018) Submitted 12/17; Published 6/18

Observational Studies 4 (2018) Submitted 12/17; Published 6/18 Observational Studies 4 (2018) 193-216 Submitted 12/17; Published 6/18 Comparing logistic and log-binomial models for causal mediation analyses of binary mediators and rare binary outcomes: evidence to

More information

Estimating direct effects in cohort and case-control studies

Estimating direct effects in cohort and case-control studies Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research

More information

Causal mediation analysis: Multiple mediators

Causal mediation analysis: Multiple mediators Causal mediation analysis: ultiple mediators Trang Quynh guyen Seminar on Statistical ethods for ental Health Research Johns Hopkins Bloomberg School of Public Health 330.805.01 term 4 session 4 - ay 5,

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele

A Unification of Mediation and Interaction. A 4-Way Decomposition. Tyler J. VanderWeele Original Article A Unification of Mediation and Interaction A 4-Way Decomposition Tyler J. VanderWeele Abstract: The overall effect of an exposure on an outcome, in the presence of a mediator with which

More information

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

Causal inference in epidemiological practice

Causal inference in epidemiological practice Causal inference in epidemiological practice Willem van der Wal Biostatistics, Julius Center UMC Utrecht June 5, 2 Overview Introduction to causal inference Marginal causal effects Estimating marginal

More information

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS 776 1 12 IPW and MSM IP weighting and marginal structural models ( 12) Outline 12.1 The causal question 12.2 Estimating IP weights via modeling

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Mediation Analysis: A Practitioner s Guide

Mediation Analysis: A Practitioner s Guide ANNUAL REVIEWS Further Click here to view this article's online features: Download figures as PPT slides Navigate linked references Download citations Explore related articles Search keywords Mediation

More information

Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations)

Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations) Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations) Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology

More information

Causal Mediation Analysis Short Course

Causal Mediation Analysis Short Course Causal Mediation Analysis Short Course Linda Valeri McLean Hospital and Harvard Medical School May 14, 2018 Plan of the Course 1. Concepts of Mediation 2. Regression Approaches 3. Sensitivity Analyses

More information

SPECIAL TOPICS IN REGRESSION ANALYSIS

SPECIAL TOPICS IN REGRESSION ANALYSIS 1 SPECIAL TOPICS IN REGRESSION ANALYSIS Representing Nominal Scales in Regression Analysis There are several ways in which a set of G qualitative distinctions on some variable of interest can be represented

More information

13.1 Causal effects with continuous mediator and. predictors in their equations. The definitions for the direct, total indirect,

13.1 Causal effects with continuous mediator and. predictors in their equations. The definitions for the direct, total indirect, 13 Appendix 13.1 Causal effects with continuous mediator and continuous outcome Consider the model of Section 3, y i = β 0 + β 1 m i + β 2 x i + β 3 x i m i + β 4 c i + ɛ 1i, (49) m i = γ 0 + γ 1 x i +

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Strategy of Bayesian Propensity. Score Estimation Approach. in Observational Study

Strategy of Bayesian Propensity. Score Estimation Approach. in Observational Study Theoretical Mathematics & Applications, vol.2, no.3, 2012, 75-86 ISSN: 1792-9687 (print), 1792-9709 (online) Scienpress Ltd, 2012 Strategy of Bayesian Propensity Score Estimation Approach in Observational

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate)

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate) Statistics 5100 Exam 2 (Practice) Directions: Be sure to answer every question, and do not spend too much time on any part of any question. Be concise with all your responses. Partial SAS output and statistical

More information

Causal Mechanisms Short Course Part II:

Causal Mechanisms Short Course Part II: Causal Mechanisms Short Course Part II: Analyzing Mechanisms with Experimental and Observational Data Teppei Yamamoto Massachusetts Institute of Technology March 24, 2012 Frontiers in the Analysis of Causal

More information

Causal mediation analysis: Definition of effects and common identification assumptions

Causal mediation analysis: Definition of effects and common identification assumptions Causal mediation analysis: Definition of effects and common identification assumptions Trang Quynh Nguyen Seminar on Statistical Methods for Mental Health Research Johns Hopkins Bloomberg School of Public

More information

Moderation & Mediation in Regression. Pui-Wa Lei, Ph.D Professor of Education Department of Educational Psychology, Counseling, and Special Education

Moderation & Mediation in Regression. Pui-Wa Lei, Ph.D Professor of Education Department of Educational Psychology, Counseling, and Special Education Moderation & Mediation in Regression Pui-Wa Lei, Ph.D Professor of Education Department of Educational Psychology, Counseling, and Special Education Introduction Mediation and moderation are used to understand

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures

Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures Ruth Keogh, Stijn Vansteelandt, Rhian Daniel Department of Medical Statistics London

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division

More information

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Guanglei Hong University of Chicago, 5736 S. Woodlawn Ave., Chicago, IL 60637 Abstract Decomposing a total causal

More information

Telescope Matching: A Flexible Approach to Estimating Direct Effects

Telescope Matching: A Flexible Approach to Estimating Direct Effects Telescope Matching: A Flexible Approach to Estimating Direct Effects Matthew Blackwell and Anton Strezhnev International Methods Colloquium October 12, 2018 direct effect direct effect effect of treatment

More information

Case-control studies C&H 16

Case-control studies C&H 16 Case-control studies C&H 6 Bendix Carstensen Steno Diabetes Center & Department of Biostatistics, University of Copenhagen bxc@steno.dk http://bendixcarstensen.com PhD-course in Epidemiology, Department

More information

In Class Review Exercises Vartanian: SW 540

In Class Review Exercises Vartanian: SW 540 In Class Review Exercises Vartanian: SW 540 1. Given the following output from an OLS model looking at income, what is the slope and intercept for those who are black and those who are not black? b SE

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Q30b Moyale Observed counts. The FREQ Procedure. Table 1 of type by response. Controlling for site=moyale. Improved (1+2) Same (3) Group only
