Causal inference in biomedical sciences: causal models involving genotypes. Mendelian randomization genes as Instrumental Variables

Krista Fischer
Estonian Genome Center, University of Tartu, Estonia
36th Finnish Summer School on Probability Theory and Statistics

Outline:
- Causal models for observational data
- Instrumental variables estimation and Mendelian randomization
- A general association structure with one genotype and two phenotypes
- References

Mendelian randomization: genes as instrumental variables

Most of the exposures of interest in chronic disease epidemiology cannot be randomized. Sometimes, however, nature will randomize for us: there is a SNP (single nucleotide polymorphism, a DNA marker) that affects the exposure of interest, but not directly the outcome.

Example: a SNP that is associated with the enzyme involved in alcohol metabolism, genetic lactose intolerance, etc.

However, the crucial assumption that the SNP cannot affect the outcome in any other way than through the exposure cannot be tested statistically!

A causal graph with exposure X, outcome Y, confounder U and an instrument Z:

[Figure: causal graph; the arrow Z → X is labelled δ and the arrow X → Y is labelled β]

Simple regression will yield a biased estimate of the causal effect of X on Y, as the graph implies:

$$Y = \alpha_y + \beta X + U + \epsilon, \qquad E(\epsilon \mid X, U) = 0,$$

so

$$E(Y \mid X) = \alpha_y + \beta X + E(U \mid X).$$

Thus the coefficient of X in this regression will also depend on the association between X and U.

How can Z help? If $E(X \mid Z) = \alpha_x + \delta Z$, we get

$$E(Y \mid Z) = \alpha_y + \beta E(X \mid Z) + E(U \mid Z) = \alpha_y + \beta(\alpha_x + \delta Z) + E(U \mid Z) = \alpha_y^{*} + \beta\delta Z,$$

where $\alpha_y^{*}$ collects the constant terms: as Z is not associated with U, $E(U \mid Z)$ does not depend on Z. As δ and βδ are estimable, β also becomes estimable.

1. Regress X on Z, obtain an estimate $\hat{\delta}$
2. Regress Y on Z, obtain an estimate $\widehat{\delta\beta}$
3. Obtain $\hat{\beta} = \widehat{\delta\beta} / \hat{\delta}$
4. Valid if Z is not associated with U and does not have any effect on Y (other than mediated by X)
5. Standard error estimation: use the sandwich estimator, implemented for instance in R, library(sem), function tsls().
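Steps 1 to 3 can be tried on simulated data. The following is only a sketch in base R: the sample size, the allele frequency, the coefficients `beta` and `delta`, and all variable names are assumptions chosen for illustration, not values from the lecture. It compares the naive regression of Y on X with the ratio estimate.

```r
# Sketch of the ratio IV estimator on simulated data.
# All numeric settings below are illustrative assumptions.
set.seed(1)
n     <- 50000
beta  <- 0.3   # assumed true causal effect of X on Y
delta <- 0.5   # assumed effect of the instrument Z on X

z <- rbinom(n, size = 2, prob = 0.4)  # SNP coded as number of risk alleles
u <- rnorm(n)                         # unobserved confounder
x <- 1 + delta * z + u + rnorm(n)     # exposure depends on Z and U
y <- 2 + beta * x + u + rnorm(n)      # outcome depends on X and U, not directly on Z

# Naive regression of Y on X: the coefficient absorbs the X-U association
beta_naive <- unname(coef(lm(y ~ x))["x"])

# Step 1: regress X on Z; step 2: regress Y on Z; step 3: take the ratio
delta_hat     <- unname(coef(lm(x ~ z))["z"])
deltabeta_hat <- unname(coef(lm(y ~ z))["z"])
beta_iv       <- deltabeta_hat / delta_hat

print(round(c(true = beta, naive = beta_naive, iv = beta_iv), 3))
```

Under these assumptions the naive coefficient should sit well above `beta`, while the ratio estimate should be close to it. The sketch gives no standard error; for that, a two-stage least squares routine such as the tsls() function mentioned in step 5 would be the natural choice.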

Mendelian randomization example

FTO genotype, BMI and blood glucose level (related to Type 2 Diabetes risk; Estonian Biobank, n=3635, aged 45+)

- Average difference in blood glucose level (Glc, mmol/l) per BMI unit is estimated as […] (SE=0.005)
- Average BMI difference per FTO risk allele is estimated as 0.50 (SE=0.09)
- Average difference in Glc level per FTO risk allele is estimated as 0.13 (SE=0.04)
- Instrumental variable estimate of the mean Glc difference per BMI unit is […] (SE=0.078)

IV estimation in R (using library(sem)):

    > summary(tsls(Glc~bmi, ~fto, data=fen), digits=2)
    2SLS Estimates
    Model Formula: Glc ~ bmi
    Instruments: ~fto
    Residuals:
       Min. 1st Qu. Median Mean 3rd Qu. Max.
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)
    bmi                                              **

IV estimation: can untestable assumptions be tested?

    > summary(lm(Glc~bmi+fto, data=fen))
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                              <2e-16 ***
    bmi                                      <2e-16 ***
    fto

For Type 2 Diabetes:

    > summary(glm(t2d~bmi+fto, data=fen, family=binomial))
    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)                              <2e-16 ***
    bmi                                      <2e-16 ***
    fto                                             *

Does FTO have a direct effect on Glc or T2D? A significant FTO effect would not be a proof here (nor does non-significance prove the opposite)! (WHY?)

A general association structure with one genotype and two phenotypes

[Figure: causal graph with genotype G, phenotypes X and Y and confounder U; the five arrows carry the coefficients $\beta_{gx}$, $\beta_{gy}$, $\beta_{xy}$, $\beta_{ux}$ and $\beta_{uy}$]

If $\beta_{gy} \neq 0$, the genotype G is said to have a pleiotropic effect on variables X and Y.

One genotype and two phenotypes

Note that if one fits a linear regression model for Y, with G as the only covariate, one estimates:

$$E(Y \mid G) = E(\text{const} + \beta_{xy} X + \beta_{gy} G + \beta_{uy} U \mid G) = \text{const} + \beta_{xy}(\beta_{gx} G) + \beta_{gy} G = \text{const} + (\beta_{gx}\beta_{xy} + \beta_{gy}) G$$

So when one uses the MR approach here (incorrectly assuming no direct effect of G on Y), one estimates:

$$\frac{\beta_{gx}\beta_{xy} + \beta_{gy}}{\beta_{gx}} = \beta_{xy} + \frac{\beta_{gy}}{\beta_{gx}}$$

Can we test pleiotropy? A naïve approach would be to fit a linear regression model for Y, with both X and G as covariates. But in this case we estimate:

$$E(Y \mid X, G) = \text{const} + \beta_{gy} G + \beta_{xy} X + \beta_{uy} E(U \mid X, G).$$

As it is possible to show that (assuming standardized variables)

$$E(U \mid X, G) = \text{const} + \frac{\beta_{ux}}{1 - \beta_{gx}^{2}} (X - \beta_{gx} G),$$

we get

$$E(Y \mid X, G) = \text{const} + \left[ \beta_{xy} + \frac{\beta_{ux}\beta_{uy}}{1 - \beta_{gx}^{2}} \right] X + \left[ \beta_{gy} - \frac{\beta_{gx}\beta_{ux}\beta_{uy}}{1 - \beta_{gx}^{2}} \right] G.$$
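The two results above can be checked numerically. The sketch below, in base R, makes several assumptions: the path coefficients are hypothetical, the genotype is replaced by a standardized normal score rather than an allele count, and the residual variance of X is chosen so that Var(X) = 1, which the formulas need for G, U and X. It compares the MR ratio and the coefficients from regressing Y on X and G with the expressions derived above.

```r
# Sketch: pleiotropy bias in the MR ratio and in the naive adjusted regression.
# All path coefficients are hypothetical values chosen for illustration.
set.seed(2)
n <- 50000
b_gx <- 0.3; b_ux <- 0.5; b_xy <- 0.4; b_gy <- 0.2; b_uy <- 0.5

g <- rnorm(n)  # standardized genotype score (continuous stand-in for a SNP)
u <- rnorm(n)  # standardized unobserved confounder
# Residual variance chosen so that Var(X) = 1
x <- b_gx * g + b_ux * u + rnorm(n, sd = sqrt(1 - b_gx^2 - b_ux^2))
y <- b_xy * x + b_gy * g + b_uy * u + rnorm(n)

# MR ratio estimate, (incorrectly) assuming no direct effect of G on Y
mr_est      <- unname(coef(lm(y ~ g))["g"] / coef(lm(x ~ g))["g"])
mr_expected <- b_xy + b_gy / b_gx

# Naive pleiotropy check: regress Y on both X and G
fit_xg      <- coef(lm(y ~ x + g))
bias        <- b_ux * b_uy / (1 - b_gx^2)
expected_xg <- c(x = b_xy + bias, g = b_gy - b_gx * bias)

print(round(rbind(estimated = c(mr = mr_est, fit_xg[c("x", "g")]),
                  expected  = c(mr = mr_expected, expected_xg)), 3))
```

In this setup neither the coefficient of G in the adjusted model nor the MR ratio recovers `b_gy` or `b_xy`. Both instead track the biased expressions, which is why the adjusted regression cannot serve as a test of pleiotropy.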

One genotype and two phenotypes: linear models for Y

What do we estimate by fitting different linear models for Y?

Estimable coefficients:

    Covariates   Coefficient of X                                                 Coefficient of G
    X, G, U      $\beta_{xy}$                                                     $\beta_{gy}$
    G            -                                                                $\beta_{gy} + \beta_{gx}\beta_{xy}$
    X            $\beta_{xy} + \beta_{gx}\beta_{gy} + \beta_{ux}\beta_{uy}$       -
    X, G         $\beta_{xy} + \frac{\beta_{ux}\beta_{uy}}{1 - \beta_{gx}^{2}}$   $\beta_{gy} - \frac{\beta_{gx}\beta_{ux}\beta_{uy}}{1 - \beta_{gx}^{2}}$

Some references

An excellent overview of Mendelian randomization: Sheehan, N., Didelez, V., Burton, P., Tobin, M., Mendelian Randomization and Causal Inference in Observational Epidemiology, PLoS Med, August; 5(8).

A recent review on causality in genetics: Vansteelandt, S., Lange, C., Causation and causal inference for genetic effects. Human Genet.

Example: FTO genotype (G), BMI (X) and other outcomes Y (data: Estonian Biobank)

N=12,740. Effect of FTO on BMI, from E(X | G): 0.08 (se=0.01).

Entries are estimate (se); estimates that did not survive in the transcript are shown as […].

    Outcome Y   Scaled BMI on Y:          FTO on Y:            BMI-adjusted effect of FTO on Y:
                coef. of X in E(Y | X)    coef. in E(Y | G)    coef. of G in E(Y | X, G)
    SBP         0.40 (0.008)              […] (0.013)          […] (0.011)
    HDL         […] (0.011)               […] (0.016)          […] (0.016)
    TG          […] (0.002)               […] (0.016)          […] (0.015)
    T2D*        1.02 (0.030)              […] (0.041)          […] (0.040)

    *logistic regression

Direct effect?

A simple simulated example

N=50000, all non-zero coefficients are highly significant.

[Table: true parameters compared with estimated parameters; columns for the coefficient of X, the coefficient of G and the MR estimate (X instrumented by G)]

MultiPhen analysis idea (O'Reilly et al., PLoS One 2012)

The idea: with correlated phenotypes, use the genotype as the outcome and the phenotypes as covariates (proportional odds regression). Thus in our setting, regress G on X and Y.

However, if $E(Y \mid X, G) = g_1(X) + g_2(G)$ (for some $g_1$ and $g_2$), then, regardless of causal mechanism, $E(G \mid X, Y) = h_1(X) + h_2(Y)$ (for some $h_1$ and $h_2$).

MultiPhen is a useful tool for detecting associations with correlated phenotypes, but NOT for causal parameter estimates.

A simple simulated example

N=50000, all non-zero coefficients are highly significant.

[Table: as above, with additional MultiPhen columns]

Mendelian randomization: more on assumptions

The causal effect is defined via potential outcomes:

$$E(Y - Y_0 \mid G, X) = \beta (X - X_0)$$

$Y_0$: potential exposure-free outcome (if $X_0 = 0$) or outcome at a potential baseline exposure level.

This assumes the same effect of X at each level of G: no exposure effect heterogeneity.

One way to understand this assumption is via principal stratification, which is easily understood in the context of noncompliance analysis of randomized trials.

Classical vs Mendelian Randomization

A Randomized Clinical Trial (RCT): R (random assignment: treatment or control) → X (received treatment) → Y (outcome), with unobserved confounders U. The association between R and Y is unconfounded and present only when the X-Y association is present.

Mendelian Randomization (MR): G (genotype) → X (exposure phenotype) → Y (outcome phenotype), with unobserved confounders U. The association between G and Y is unconfounded and present only when the X-Y association is present.

Estimating Complier Average Causal Effect (CACE): assumptions

As R has no direct effect on Y, there is:
- No assignment effect in never takers
- No assignment effect in always takers
- The estimated causal effect is only valid for compliers

Outcome probabilities: $p_{is} = P(Y = 1 \mid R = i, \text{Stratum} = s)$, with R the assigned treatment.
- Always takers: $p_{1A} = p_{0A}$
- Compliers: $p_{1C} \neq p_{0C}$
- Never takers: $p_{1N} = p_{0N}$

Principal stratification and Mendelian randomization (ignoring heterozygotes; genotypes A/A and T/T)

- Always takers: overweight regardless of their genotype
- Compliers: overweight when having risk alleles of the FTO genotype, normal weight otherwise
- Never takers: normal weight even when having the FTO risk genotype

Do the principal strata exist?

- There is a proven causal association between FTO genotype and overweight status
- This means there must exist individuals whose overweight is caused by their FTO risk alleles
- So there also exist individuals who have normal weight only because they do not have FTO risk alleles
- Any differences in the T2D risk between people with different genotypes can only come from this stratum of compliers

T2D, overweight and FTO example

- The estimated effect is valid in the stratum of compliers (estimated as 10% of the individuals)
- Extending this to other principal strata involves assumptions on no exposure effect heterogeneity

Summary on causal analysis in genomics data

- Association is not causality: an old truth, but one that still bears repeating when analyzing omics data
- In most cases, causal inference relies on statistically untestable assumptions
- The assumptions should be verified based on external knowledge (biology)
- There are no forbidden models, but it is important to understand the interpretation of model parameters given realistic assumptions
- There are always unobserved confounders between health phenotypes!
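The complier argument from the principal stratification slides can be illustrated with a small simulation. This is a sketch in base R under stated assumptions: the stratum proportions (10% compliers, echoing the FTO example), the baseline risks and the common risk difference of 0.15 are hypothetical, and `r` stands for either a randomized assignment or a binary genotype group. The outcome difference between assignment groups is divided by the exposure difference, which estimates the complier proportion.

```r
# Sketch: complier average causal effect (CACE) with always/never takers.
# Stratum proportions, baseline risks and the effect size are assumptions.
set.seed(3)
n <- 200000
stratum <- sample(c("always", "complier", "never"), n, replace = TRUE,
                  prob = c(0.2, 0.1, 0.7))
r <- rbinom(n, 1, 0.5)  # random assignment (or binary genotype group)

# Received exposure: only compliers follow their assignment
x <- ifelse(stratum == "always", 1L, ifelse(stratum == "never", 0L, r))

# Binary outcome: baseline risk differs by stratum (this is the confounding);
# exposure adds the same risk difference of 0.15 in every stratum
base_risk <- c(always = 0.30, complier = 0.20, never = 0.10)
p <- base_risk[stratum] + 0.15 * x
y <- rbinom(n, 1, p)

itt_y <- mean(y[r == 1]) - mean(y[r == 0])  # effect of assignment on the outcome
itt_x <- mean(x[r == 1]) - mean(x[r == 0])  # estimated proportion of compliers
cace  <- itt_y / itt_x                      # ratio (Wald) estimate of the CACE

naive <- mean(y[x == 1]) - mean(y[x == 0])  # confounded by stratum membership

print(round(c(compliers = itt_x, cace = cace, naive = naive), 3))
```

With only about 10% compliers the ratio has a small denominator, so the estimate is noisy even at this sample size. That small denominator is one reason weak genetic instruments call for large cohorts.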


More information

Marginal Structural Cox Model for Survival Data with Treatment-Confounder Feedback

Marginal Structural Cox Model for Survival Data with Treatment-Confounder Feedback University of South Carolina Scholar Commons Theses and Dissertations 2017 Marginal Structural Cox Model for Survival Data with Treatment-Confounder Feedback Yanan Zhang University of South Carolina Follow

More information

Introduction to Causal Bayesian Inference Chris Holmes University of Oxford

Introduction to Causal Bayesian Inference Chris Holmes University of Oxford Causal Bayesian Inference 1 Introduction to Causal Bayesian Inference Chris Holmes University of Oxford Causal Bayesian Inference 2 Objectives of Course To introduce concepts and methods for Causal Inference

More information

University of Bristol - Explore Bristol Research

University of Bristol - Explore Bristol Research Bowden, J., Del Greco M, F., Minelli, C., Davey Smith, G., Sheehan, N., & Thompson, J. (2017). A framework for the investigation of pleiotropy in twosample summary data Mendelian randomization. Statistics

More information

Asymptotic distribution of the largest eigenvalue with application to genetic data

Asymptotic distribution of the largest eigenvalue with application to genetic data Asymptotic distribution of the largest eigenvalue with application to genetic data Chong Wu University of Minnesota September 30, 2016 T32 Journal Club Chong Wu 1 / 25 Table of Contents 1 Background Gene-gene

More information

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by

More information

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Kosuke Imai First Draft: January 19, 2007 This Draft: August 24, 2007 Abstract Zhang and Rubin 2003) derives

More information

A Decision Theoretic Approach to Causality

A Decision Theoretic Approach to Causality A Decision Theoretic Approach to Causality Vanessa Didelez School of Mathematics University of Bristol (based on joint work with Philip Dawid) Bordeaux, June 2011 Based on: Dawid & Didelez (2010). Identifying

More information

SNP Association Studies with Case-Parent Trios

SNP Association Studies with Case-Parent Trios SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature

More information

Part IV Statistics in Epidemiology

Part IV Statistics in Epidemiology Part IV Statistics in Epidemiology There are many good statistical textbooks on the market, and we refer readers to some of these textbooks when they need statistical techniques to analyze data or to interpret

More information

Missing Covariate Data in Matched Case-Control Studies

Missing Covariate Data in Matched Case-Control Studies Missing Covariate Data in Matched Case-Control Studies Department of Statistics North Carolina State University Paul Rathouz Dept. of Health Studies U. of Chicago prathouz@health.bsd.uchicago.edu with

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department

More information

Causal Effect Estimation Under Linear and Log- Linear Structural Nested Mean Models in the Presence of Unmeasured Confounding

Causal Effect Estimation Under Linear and Log- Linear Structural Nested Mean Models in the Presence of Unmeasured Confounding University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations Summer 8-13-2010 Causal Effect Estimation Under Linear and Log- Linear Structural Nested Mean Models in the Presence of

More information

Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization

Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization arxiv:1606.03729v1 [stat.me] 12 Jun 2016 Stephen Burgess 1, Jack Bowden 2 Frank Dudbridge

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Noncompliance in Randomized Experiments Often we cannot force subjects to take specific treatments Units

More information

Causal Inference with Counterfactuals

Causal Inference with Counterfactuals Causal Inference with Counterfactuals Robin Evans robin.evans@stats.ox.ac.uk Hilary 2014 1 Introduction What does it mean to say that a (possibly random) variable X is a cause of the random variable Y?

More information

p(d g A,g B )p(g B ), g B

p(d g A,g B )p(g B ), g B Supplementary Note Marginal effects for two-locus models Here we derive the marginal effect size of the three models given in Figure 1 of the main text. For each model we assume the two loci (A and B)

More information

arxiv: v1 [stat.me] 3 Feb 2016

arxiv: v1 [stat.me] 3 Feb 2016 Principal stratification analysis using principal scores Peng Ding and Jiannan Lu arxiv:602.096v [stat.me] 3 Feb 206 Abstract Practitioners are interested in not only the average causal effect of the treatment

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Gary King GaryKing.org April 13, 2014 1 c Copyright 2014 Gary King, All Rights Reserved. Gary King ()

More information

Two-Sample Instrumental Variable Analyses using Heterogeneous Samples

Two-Sample Instrumental Variable Analyses using Heterogeneous Samples Two-Sample Instrumental Variable Analyses using Heterogeneous Samples Qingyuan Zhao, Jingshu Wang, Wes Spiller Jack Bowden, and Dylan S. Small Department of Statistics, The Wharton School, University of

More information

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, 2016 Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) Our team! 2 Avi Feller (Berkeley) Jane Furey (Abt Associates)

More information

Régression en grande dimension et épistasie par blocs pour les études d association

Régression en grande dimension et épistasie par blocs pour les études d association Régression en grande dimension et épistasie par blocs pour les études d association V. Stanislas, C. Dalmasso, C. Ambroise Laboratoire de Mathématiques et Modélisation d Évry "Statistique et Génome" 1

More information

Case-control studies

Case-control studies Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark b@bxc.dk http://bendixcarstensen.com Department of Biostatistics, University of Copenhagen, 8 November

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2014 Paper 174 Control Function Assisted IPW Estimation with a Secondary Outcome in Case-Control Studies Tamar Sofer Marilyn

More information

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA August 5, 0 Abstract In clinical

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Cross-Sectional Regression after Factor Analysis: Two Applications

Cross-Sectional Regression after Factor Analysis: Two Applications al Regression after Factor Analysis: Two Applications Joint work with Jingshu, Trevor, Art; Yang Song (GSB) May 7, 2016 Overview 1 2 3 4 1 / 27 Outline 1 2 3 4 2 / 27 Data matrix Y R n p Panel data. Transposable

More information

Bounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small

Bounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small Bounds on Causal Effects in Three-Arm Trials with Non-compliance Jing Cheng Dylan Small Department of Biostatistics and Department of Statistics University of Pennsylvania June 20, 2005 A Three-Arm Randomized

More information

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

1 Preliminary Variance component test in GLM Mediation Analysis... 3

1 Preliminary Variance component test in GLM Mediation Analysis... 3 Honglang Wang Depart. of Stat. & Prob. wangho16@msu.edu Omics Data Integration Statistical Genetics/Genomics Journal Club Summary and discussion of Joint Analysis of SNP and Gene Expression Data in Genetic

More information