Statistical inference in Mendelian randomization: From genetic association to epidemiological causation

Size: px
Start display at page:

Download "Statistical inference in Mendelian randomization: From genetic association to epidemiological causation"

Transcription

1 Statistical inference in Mendelian randomization: From genetic association to epidemiological causation Department of Statistics, The Wharton School, University of Pennsylvania March 1st, UMN Based on joint work with Jingshu Wang, Dylan Small (Penn). Jack Bowden, Gibran Hemani (University of Bristol).

2 Epidemiology = Why certain people are getting ill? Central problem: Causal inference from observational data. Methodological challenge: Correlation does not imply causation. Conventional solution: Conditioning on confounding variables. Matched case-control studies. Cohort studies. Problem: Biased if there is unmeasured confounder. 1/44

3 Motivating example Question What is the causal effect of Body Mass Index (BMI) on Systolic Blood Pressure (SBP)? Exposure variable X : BMI. Outcome variable Y : SBP. Confounders C: age, gender, lifestyle (food, smoking,... )... Fundamental problem Where is the end of this list? 2/44

4 MR: unbiased estimation without enumerating C Mendelian randomization (MR) is an alternative design of observational studies. Instead, MR uses additional instrumental variables Z. Not necessary to enumerate C. Brief history of MR Katan (1986): first proposal. Davey Smith and Ebrahim (2003): gaining attention. Hernán and Robins (2006): An Epidemiologist s dream? Pickrell et al. (2016): Surging interest in human genetics. Publications per year: only 5 in 2003, almost 400 in /44

5 This talk: MR using summary data Individual-level data (restricted access) Exposure variable X : BMI; Outcome variable Y : SBP; Instrumental variables Z 1,..., Z p : Genetic variants (SNPs). Summary-level data (public) Dataset BMI-FEM BMI-MAL SBP-UKBB Source GIANT (female) GIANT (male) UKBB Sample size GWAS lm(x Z j ) lm(x Z j ) lm(y Z j ) Coefficient ˆγ Selection j ˆΓj Std. Err. σ Xj σ Yj Step 1 Use BMI-FEM to select significant (p-value ) and uncorrelated SNPs (p = 25). Step 2 Use BMI-MAL to obtain (ˆγ j, σ Xj ), j = 1,..., p. Step 3 Use SBP-UKBB to obtain (ˆΓ j, σ Yj ), j = 1,..., p. 4/44

6 Statistical problem Model: Errors-in-variables regression Observe mutually independent random variables: ˆγ j N(γ j, σ 2 Xj ) and ˆΓ j N(Γ j, σ 2 Yj ). For some real number β 0, Γ j β 0 γ j for almost all j. β 0 can be regarded as the causal effect of X on Y. 5/44

7 Example SNP effect on SBP SNP effect on BMI Method Simple Overdispersion + Robust loss Figure: Scatter plot of ˆΓ j versus ˆγ j in the BMI-SBP example (p = 25). The goal is to fit a straight line across the origin. Observation: effects are tiny but many are significant. 6/44

8 Reasonable model? Key assumptions 1 Measurement error is normal. 2 (ˆγ 1,..., ˆγ p ) (ˆΓ 1,..., ˆΓ p ). 3 ˆγ j ˆγ k and ˆΓ j ˆΓ k if j k. 4 (Implicitly) γ 0. Otherwise no information for β 0. 5 ˆγ j is unbiased: E[ˆγ j ] = γ j. Pre-processing guarantees these assumptions 1 Large sample size CLT. 2 Two-sample MR: ˆγ and ˆΓ are computed from independent samples (GIANT males and UKBB). 3 Selecting uncorrelated and significant SNPs using another independent dataset (GIANT females). 7/44

9 MR estimates causal effect Two remaining questions 1 Why should the model Γ j β 0 γ j be linear? 2 Why is β 0 causal? Answer: MR is using SNPs as instrumental variables (IV). Heuristic Consider the following linear model: p X = γ j Z j + η X C + E X, j=1 Y = β 0 X + η Y C + E Y = p j=1 (β 0 γ j }{{} Γ j )Z j +... Key assumption: Z (C, E X, E Y ), i.e. Z are valid instruments. 8/44

10 Background: What is an IV? 2 C 1 Z X Y 3 Figure: Causal effect of X and Y is confounded by C. Z is an instrumental variable. Core IV assumptions 1 Z is associated with the exposure (X ). 2 Z is independent of the unmeasured confounder (C). 3 Z cannot have any direct effect on the outcome (Y ). 9/44

11 MR = using genetic variants as IVs 2 C 1 Z X Y 3 Examine the core IV assumptions for MR 1 Criterion 1 is not a problem. 2 Criterion 2 almost comes free due to Mendel s Second Law of random assortment. A minor concern is population stratification 3 Criterion 3 is not clear. There is a wide-spread phenomenon in genetics called pleiotropy, a.k.a. multi-function of genes. 10/44

12 Linear model (for summary data) is very robust Surprisingly, Γ j β 0 γ j is true if the individual-level model for (X, Y, Z, C) are non-linear. Heuristic Each SNP corresponds to a very small randomized trial. Maybe the relationship between Γ and γ is nonlinear, but it is locally linear near γ = 0. Can formalize this argument (see the paper) and show: β 0 lim x 0 E[Y (X + x) Y (X )]. x Y (x) is the counterfactual or potential outcome. The approximate linear model is still true even if the ˆΓ j are computed from glm(y Z j ). 11/44

13 Recap Two-sample summary-data MR (ˆγ j, ˆΓ j, σxj 2, σ2 Yj )p j=1 Genetic associations Inference = β 0 Epidemiological causation Statistical model 1 Observe mutually independent random variables: ˆγ j N(γ j, σ 2 Xj ) and ˆΓ j N(Γ j, σ 2 Yj ). 2 For some real number β 0, Γ j β 0 γ j for almost all j. Three concrete models for α j = Γ j β 0 γ j 1 No direct effect: α j 0. 2 Random effects model: α j N(0, τ0 2 ). 3 Contamination model: α j (1 ɛ)n(0, τ0 2 ) + ɛg. 12/44

14 Outline 1 2 : No direct effect 3 : Random direct effects 4 : Random direct effects with outliers 5 More examples 6 13/44

15 Model (No direct effect) The linear model Γ j = β 0 γ j is true for every j = 1,..., p. Log-likelihood of the data (up to additive constant): l(β, γ 1,..., γ p ) = 1 [ p (ˆγ j γ j ) Profile likelihood j=1 l(β) = max γ l(β, γ) = 1 2 σ 2 Xj p j=1 Profile score (PS): ψ(β) = (d/dβ)l(β). p (ˆΓ j γ j β) 2 ]. j=1 (ˆΓ j βˆγ j ) 2 σxj 2 β2 + σyj 2. Estimator: ˆβPS = arg max l(β) solves ψ(β) = 0. β σ 2 Yj 15/44

16 Some theory Assumption σxj 2 = O(1/n) = σ2 Yj. γ 2 is bounded (even if p ). Heuristic p Linear structural model: X = γ j Z j + η X C + E X. j=1 γ 2 is like the variance of X explained by Z. Theorem (Consistency) Under, ˆβ PS p β0 if p/n /44

17 Some theory (continued) Theorem Suppose n, and further p is fixed or p but γ 3 / γ 2 0, then V 2 V1 ( ˆβ PS β 0 ) d N(0, 1), where V 1 = p j=1 γ 2 j σ2 Yj + Γ2 j σ2 Xj + σ2 Xj σ2 Yj (σ 2 Xj β2 0 + σ2 Yj )2, V 2 = p j=1 γ 2 j σ2 Yj + Γ2 j σ2 Xj (σ 2 Xj β2 0 + σ2 Yj )2. V 1 and V 2 can be consistently estimated from data. Necessity IV selection: adding Z p+1 hurts if γ p+1 = 0. But adding even very weak SNPs usually reduces variance. 17/44

18 Diagnostic plots Residual Q-Q plot Under model 1, standardized residuals follow N(0, 1) if ˆβ = β 0 : ˆt j = Influence of a single SNP Asymptotic expansion: ˆΓ j ˆβˆγ j. ˆβ 2 σxj 2 + σ2 Yj ˆβ PS = 1 + o p(1) V 2 p j=1 (ˆΓ j β 0ˆγ j )(ˆΓ j σ 2 Xj β 0 + ˆγ j σ 2 Yj ) (σ 2 Xj β2 0 + σ2 Yj )2. In practice, we can also use leave-one-out. 18/44

19 Example (continued) Same three datasets, selected 160 uncorrelated SNPs that have p-values 10 4 in BMI-FEM. Diagnostic plots: 10 rs Standardized residual 5 0 Leave one out estimate rs Quantile of standard normal Instrument strength (F statistic) Clear overdispersion! 19/44

20 The most likely explanation of the lack of fit is pleiotropy (direct effects). Heuristic (Direct effect of Z j is α j ) p X = γ j Z j + η X C + E X, j=1 Y = β 0 X + p α j Z j + η Y C + E Y = j=1 p j=1 (β 0 γ j + α j }{{} Γ j )Z j +... Model (Random direct effects) Assume α j = Γ j β 0 γ j i.i.d. N(0, τ 2 0 ) for all j. τ 2 0 should be small (scales like 1/p). 21/44

21 Connection to population genetics We assumed the normal random effects model because of The observed overdiserpsion. Statistical convenience. This is consistent (or not inconsistent) with genetics: Fisher s (1918) infitestimal model. New perspective on pleiotropy: 1 1 Boyle, E. et al. (2017). An expanded view of complex traits: from polygenic to omnigenic. Cell 169, p /44

22 Back to statistics: Failure of profile likelihood The profile likelihood of is given by l(β, τ 2 ) = 1 2 Easy to verify p j=1 (ˆΓ j βˆγ j ) 2 σ 2 Xj β2 + σ 2 Yj + τ 2 + log(σ2 Yj + τ 2 ), [ ] E β l(β 0, τ0 2 ) = 0. But the other score function is biased: τ 2 l(β, τ 2 ) = 1 2 p j=1 (ˆΓ j βˆγ j ) 2 (σ 2 Xj β2 + σ 2 Yj + τ 2 ) 2 1 σ 2 Yj + τ 2. This is not too surprising as we are profiling out p nuisance parameters γ 1,, γ p (the Neyman-Scott problem). 23/44

23 Adjusted profile score There exist many ways to modify the profile likelihood. We will take the approach of McCullagh and Tibshirani (1990). Adjusted profile score (APS) ψ 1 (β, τ 2 ) = β l(β, τ 2 ), p [ ψ 2 (β, τ 2 ) = j=1 σ 2 Xj (ˆΓ j βˆγ j ) 2 (σ 2 Xj β2 + σ 2 Yj + τ 2 ) 2 1 σ 2 Xj β2 + σ 2 Yj +τ 2 ]. Trivial root: β ± or τ 2. Let ˆβ APS be the non-trivial finite solution. 24/44

24 Some theory Theorem Assume that is true, (β 0, pτ 2 0 ) is in a bounded set B, p and p/n 2 0. Then 1 With probability going to 1 there exists a solution in B. p 2 All solutions in B are consistent: ˆβ APE β0 and pˆτ APS 2 pτ0 2 p 0. Can obtain similar asymptotic normality assuming p/n λ (0, ) (see the paper). 25/44

25 Example (continued) Same 160 SNPs. Standardized residual rs Leave one out estimate rs Quantile of standard normal Instrument strength (F statistic) A clear outlier: rs Slightly underdispersed. 26/44

26 Model (Random direct effects with outliers) α j (1 ɛ)n(0, τ 2 0 ) + ɛg for some unknown G. Recall the expansion: ˆβ PS = 1 + o p(1) V 2 p j=1 (ˆΓ j β 0ˆγ j )(ˆΓ j σ 2 Xj β 0 + ˆγ j σ 2 Yj ) (σ 2 Xj β2 0 + σ2 Yj )2. Problem: a single SNP can have unbounded influence (same for APS). Our solution: robustify the adjusted profile score. 28/44

27 A method robust to outliers Standardized residual: t j (β, τ 2 ) = ˆΓ j βˆγ j σ 2 Xj β2 + σ 2 Yj + τ 2. Robust adjusted profile score (RAPS) ψ (ρ) 1 (β, τ 2 ) = ψ (ρ) 2 (β, τ 2 ) = p ρ (t j ) β t j, j=1 p j=1 σ 2 Xj t j ρ (t j ) δ σxj 2 β2 + σyj 2 + τ 2. ρ is robust loss, δ = E[T ρ (T )] for T N(0, 1). Reduces to solution 2 (APS) when ρ(t) = t 2 /2. 29/44

28 Some theory General theory is very difficult because we are solving nonlinear equations. Theorem (Local identifiability) In, E[ψ (ρ) (β 0, τ 2 0 )] = 0 and E[ ψ (ρ) ] has full rank. Asymptotic normality can be established assuming consistency and additional technical conditions (see the paper). 30/44

29 Example (continued) Same 160 SNPs. Huber s loss function. Standardized residual rs Leave one out estimate rs Quantile of standard normal Instrument strength (F statistic) Influence of the outlier is limited. 31/44

30 Example (continued) Same 160 SNPs. Tukey s biweight function. Standardized residual rs Leave one out estimate rs Quantile of standard normal Instrument strength (F statistic) Influence of the outlier is essentially 0. 32/44

31 Comparison of all the methods Method Estimate Standard error Profile score (PS) Adjusted PS (APS) Robust APS (RAPS, Huber) Robust APS (RAPS, Tukey) Inverse variance weighted (IVW) Weighted median MR Egger Existing meta-analysis methods for MR 1 IVW: weighted average of ˆΓ j /ˆγ j. 2 Weighted median of ˆΓ j /ˆγ j. 3 MR Egger: weighted least squares of ˆΓ j against ˆγ j. They all ignore measurement error in ˆγ j and weak IV bias. 33/44

32 Simulation setting Create simulated datasets to mimic the BMI-SBP example with known β 0. ind. ˆγ j N(γ j, σxj 2 ) where (γ j, σxj 2 ) come from the real data. Γ j are generated in 6 different ways (β 0 = 0.5): : Γ j = γ j β 0 (i.e. α j 0); : α j i.i.d. N(0, τ 2 0 ), where τ 0 = 2 (1/p) p σ Yj ; j=1 : Same as 2 but add 5 τ 0 to α 1. i.i.d. Heavy-tailed α: α j τ 0 Lap(1). Heteroskedastic α: Var(α j ) γj 2 and normal. Many outliers: Add 5 τ 0 to 10% randomly selected α j. 35/44

33 Simulation results 1 (p = 25) IVW Egger W. Median method PS APS RAPS β^ Black: boxplot of point estimate over 1000 realizations. Red: quartile using median standard error. 36/44

34 Simulation results 1 (p = 160) IVW Egger W. Median method PS APS RAPS β^ Black: boxplot of point estimate over 1000 realizations. Red: quartile using median standard error. 37/44

35 Another real data example: BMI-BMI BMI-GIANT: full dataset from the GIANT consortium (i.e. combining BMI-FEM and BMI-MAL), used to select SNPs. BMI-UKBB-1: half of the UKBB data, used as the exposure. BMI-UKBB-2: another half of UKBB data, used as the outcome. Because exposure and outcome, γ j Γ j. So true β 0 = 1 and there is no direct effect. 38/44

36 BMI-BMI results (GIANT, UKBB-1, UKBB-2) p sel # SNPs Mean F IVW W. Median W. Mode 1e (0.026) (0.039) (0.042) 1e (0.024) (0.039) (0.044) 1e (0.024) (0.036) (0.041) 1e (0.022) (0.034) (0.038) 1e (0.019) (0.033) (0.039) 1e (0.017) (0.031) (0.035) 1e (0.015) (0.027) (0.231) 1e (0.014) (0.023) (7.130) p sel # SNPs Median F Egger PS RAPS 1e (0.055) (0.023) (0.026) 1e (0.050) (0.023) (0.025) 1e (0.048) (0.021) (0.025) 1e (0.043) (0.019) (0.023) 1e (0.036) (0.018) (0.020) 1e (0.031) (0.017) (0.018) 1e (0.027) (0.016) (0.016) 1e (0.022) (0.015) (0.015) 39/44

37 Selection bias (UKBB-1, UKBB-1, UKBB-2) p sel # SNPs Mean F IVW W. Median W. Mode 1e (0.02) 0.83 (0.025) (0.046) 1e (0.017) 0.8 (0.022) (0.053) 1e (0.016) (0.019) (0.058) 1e (0.015) (0.019) (0.079) 1e (0.013) (0.016) (0.12) 1e (0.012) (0.015) (0.122) 1e (0.011) 0.57 (0.014) (0.096) 1e (0.01) (0.013) (0.093) p sel # SNPs Median F Egger PS RAPS 1e (0.051) (0.015) (0.021) 1e (0.046) (0.014) (0.018) 1e (0.043) (0.012) (0.016) 1e (0.041) (0.011) (0.016) 1e (0.037) (0.01) (0.015) 1e (0.033) (0.009) 0.66 (0.014) 1e (0.03) (0.008) (0.013) 1e (0.025) (0.008) (0.012) 40/44

38 This talk Paper: Statistical inference in two-sample Mendelian randomization using robust adjusted profile score. arxiv: Software: R package mr.raps is currently on CRAN. Can be directly called from the TwoSampleMR platform ( 42/44

39 Additional references I Overview of MR (Epidemiology): Davey Smith, G. and Ebrahim, S. (2003). Mendelian randomization : can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology. Davey Smith, G. and Hemani, G. (2014). Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human Molecular Genetics. Overview of MR (more Stats): Hernán, M. and Robins, J. (2006). Instruments for causal inference: an epidemiologist s dream?. Epidemiology. Didelez, V. and Sheehan N. (2007). Mendelian randomization as an instrumental variable approach to causal inference. Statistical Methods in Medical Research. Human genetics: Gamazon, E. et al. (2015). A gene-based association method for mapping traits using reference transcriptome data. Nature Genetics. Pickrell, J. et al. (2016) Detection and interpretation of shared genetic influences on 42 human traits. Nature Genetics. 43/44

40 Additional references II Robust MR methods (individual-level data): Kang, H. et al. (2016). Instrumental Variables Estimation with Some Invalid Instruments and its Application to Mendelian Randomization. JASA. Tchetgen Tchetgen, E. et al. (2017). The GENIUS approach to robust Mendelian randomization inference. arxiv: Robust MR methods (summary-level data): Bowden, J. et al. (2015). Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. International Journal of Epidemiology. Bowden, J. et al. (2016). Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genetic Epidemiology. Pleiotropy: Solovieff, N. et al. (2013). Pleiotropy in complex traits: challenges and strategies. Nature Reviews Genetics. Boyle, E. et al. (2017). An expanded view of complex traits: from polygenic to omnigenic. Cell. 44/44

Mendelian randomization: From genetic association to epidemiological causation

Mendelian randomization: From genetic association to epidemiological causation Mendelian randomization: From genetic association to epidemiological causation Qingyuan Zhao Department of Statistics, The Wharton School, University of Pennsylvania April 24, 2018 2 C (Confounder) 1 Z

More information

Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score

Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score Qingyuan Zhao, Jingshu Wang, Jack Bowden, and Dylan S. Small Department of Statistics, The Wharton

More information

arxiv: v2 [stat.ap] 7 Feb 2018

arxiv: v2 [stat.ap] 7 Feb 2018 STATISTICAL INFERENCE IN TWO-SAMPLE SUMMARY-DATA MENDELIAN RANDOMIZATION USING ROBUST ADJUSTED PROFILE SCORE arxiv:1801.09652v2 [stat.ap] 7 Feb 2018 By Qingyuan Zhao, Jingshu Wang, Gibran Hemani, Jack

More information

Instrumental variables & Mendelian randomization

Instrumental variables & Mendelian randomization Instrumental variables & Mendelian randomization Qingyuan Zhao Department of Statistics, The Wharton School, University of Pennsylvania May 9, 2018 @ JHU 2 C (Confounder) 1 Z (Gene) X (HDL) Y (Heart disease)

More information

Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization

Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization arxiv:1606.03729v1 [stat.me] 12 Jun 2016 Stephen Burgess 1, Jack Bowden 2 Frank Dudbridge

More information

A Comparison of Robust Methods for Mendelian Randomization Using Multiple Genetic Variants

A Comparison of Robust Methods for Mendelian Randomization Using Multiple Genetic Variants 8 A Comparison of Robust Methods for Mendelian Randomization Using Multiple Genetic Variants Yanchun Bao ISER, University of Essex Paul Clarke ISER, University of Essex Melissa C Smart ISER, University

More information

Mendelian randomization (MR)

Mendelian randomization (MR) Mendelian randomization (MR) Use inherited genetic variants to infer causal relationship of an exposure and a disease outcome. 1 Concepts of MR and Instrumental variable (IV) methods motivation, assumptions,

More information

Two-Sample Instrumental Variable Analyses using Heterogeneous Samples

Two-Sample Instrumental Variable Analyses using Heterogeneous Samples Two-Sample Instrumental Variable Analyses using Heterogeneous Samples Qingyuan Zhao, Jingshu Wang, Wes Spiller Jack Bowden, and Dylan S. Small Department of Statistics, The Wharton School, University of

More information

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen Harvard University Harvard University Biostatistics Working Paper Series Year 2014 Paper 175 A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome Eric Tchetgen Tchetgen

More information

Confounder Adjustment in Multiple Hypothesis Testing

Confounder Adjustment in Multiple Hypothesis Testing in Multiple Hypothesis Testing Department of Statistics, Stanford University January 28, 2016 Slides are available at http://web.stanford.edu/~qyzhao/. Collaborators Jingshu Wang Trevor Hastie Art Owen

More information

Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy

Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy Received: 20 October 2016 Revised: 15 August 2017 Accepted: 23 August 2017 DOI: 10.1002/sim.7492 RESEARCH ARTICLE Extending the MR-Egger method for multivariable Mendelian randomization to correct for

More information

Causal inference in biomedical sciences: causal models involving genotypes. Mendelian randomization genes as Instrumental Variables

Causal inference in biomedical sciences: causal models involving genotypes. Mendelian randomization genes as Instrumental Variables Causal inference in biomedical sciences: causal models involving genotypes Causal models for observational data Instrumental variables estimation and Mendelian randomization Krista Fischer Estonian Genome

More information

Recent Challenges for Mendelian Randomisation Analyses

Recent Challenges for Mendelian Randomisation Analyses Recent Challenges for Mendelian Randomisation Analyses Vanessa Didelez Leibniz Institute for Prevention Research and Epidemiology & Department of Mathematics University of Bremen, Germany CRM Montreal,

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

University of Bristol - Explore Bristol Research

University of Bristol - Explore Bristol Research Bowden, J., Del Greco M, F., Minelli, C., Davey Smith, G., Sheehan, N., & Thompson, J. (2017). A framework for the investigation of pleiotropy in twosample summary data Mendelian randomization. Statistics

More information

Estimating direct effects in cohort and case-control studies

Estimating direct effects in cohort and case-control studies Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Causality and Experiments

Causality and Experiments Causality and Experiments Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania April 13, 2009 Michael R. Roberts Causality and Experiments 1/15 Motivation Introduction

More information

Cross-Sectional Regression after Factor Analysis: Two Applications

Cross-Sectional Regression after Factor Analysis: Two Applications al Regression after Factor Analysis: Two Applications Joint work with Jingshu, Trevor, Art; Yang Song (GSB) May 7, 2016 Overview 1 2 3 4 1 / 27 Outline 1 2 3 4 2 / 27 Data matrix Y R n p Panel data. Transposable

More information

Lecture 14 October 13

Lecture 14 October 13 STAT 383C: Statistical Modeling I Fall 2015 Lecture 14 October 13 Lecturer: Purnamrita Sarkar Scribe: Some one Disclaimer: These scribe notes have been slightly proofread and may have typos etc. Note:

More information

Empirical Bayes Moderation of Asymptotically Linear Parameters

Empirical Bayes Moderation of Asymptotically Linear Parameters Empirical Bayes Moderation of Asymptotically Linear Parameters Nima Hejazi Division of Biostatistics University of California, Berkeley stat.berkeley.edu/~nhejazi nimahejazi.org twitter/@nshejazi github/nhejazi

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy

Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy arxiv:1708.00272v1 [stat.me] 1 Aug 2017 Jessica MB Rees 1 Angela Wood 1 Stephen

More information

Causal exposure effect on a time-to-event response using an IV.

Causal exposure effect on a time-to-event response using an IV. Faculty of Health Sciences Causal exposure effect on a time-to-event response using an IV. Torben Martinussen 1 Stijn Vansteelandt 2 Eric Tchetgen 3 1 Department of Biostatistics University of Copenhagen

More information

Varieties of Count Data

Varieties of Count Data CHAPTER 1 Varieties of Count Data SOME POINTS OF DISCUSSION What are counts? What are count data? What is a linear statistical model? What is the relationship between a probability distribution function

More information

Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)

Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure) Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse Meta-Analysis Today s outline P-value

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

6.3 How the Associational Criterion Fails

6.3 How the Associational Criterion Fails 6.3. HOW THE ASSOCIATIONAL CRITERION FAILS 271 is randomized. We recall that this probability can be calculated from a causal model M either directly, by simulating the intervention do( = x), or (if P

More information

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Journal of Machine Learning Research 17 (2016) 1-35 Submitted 9/14; Revised 2/16; Published 2/16 Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Zijian Guo Department

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

Missing covariate data in matched case-control studies: Do the usual paradigms apply?

Missing covariate data in matched case-control studies: Do the usual paradigms apply? Missing covariate data in matched case-control studies: Do the usual paradigms apply? Bryan Langholz USC Department of Preventive Medicine Joint work with Mulugeta Gebregziabher Larry Goldstein Mark Huberman

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Duke University Medical, Dept. of Biostatistics Joint work

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal

More information

Extending the results of clinical trials using data from a target population

Extending the results of clinical trials using data from a target population Extending the results of clinical trials using data from a target population Issa Dahabreh Center for Evidence-Based Medicine, Brown School of Public Health Disclaimer Partly supported through PCORI Methods

More information

Empirical Bayes Moderation of Asymptotically Linear Parameters

Empirical Bayes Moderation of Asymptotically Linear Parameters Empirical Bayes Moderation of Asymptotically Linear Parameters Nima Hejazi Division of Biostatistics University of California, Berkeley stat.berkeley.edu/~nhejazi nimahejazi.org twitter/@nshejazi github/nhejazi

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square

Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square Yuxin Chen Electrical Engineering, Princeton University Coauthors Pragya Sur Stanford Statistics Emmanuel

More information

Mendelian randomization as an instrumental variable approach to causal inference

Mendelian randomization as an instrumental variable approach to causal inference Statistical Methods in Medical Research 2007; 16: 309 330 Mendelian randomization as an instrumental variable approach to causal inference Vanessa Didelez Departments of Statistical Science, University

More information

Causal Mechanisms Short Course Part II:

Causal Mechanisms Short Course Part II: Causal Mechanisms Short Course Part II: Analyzing Mechanisms with Experimental and Observational Data Teppei Yamamoto Massachusetts Institute of Technology March 24, 2012 Frontiers in the Analysis of Causal

More information

Proportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power

Proportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power Proportional Variance Explained by QTL and Statistical Power Partitioning the Genetic Variance We previously focused on obtaining variance components of a quantitative trait to determine the proportion

More information

6. Assessing studies based on multiple regression

6. Assessing studies based on multiple regression 6. Assessing studies based on multiple regression Questions of this section: What makes a study using multiple regression (un)reliable? When does multiple regression provide a useful estimate of the causal

More information

Inference for High Dimensional Robust Regression

Inference for High Dimensional Robust Regression Department of Statistics UC Berkeley Stanford-Berkeley Joint Colloquium, 2015 Table of Contents 1 Background 2 Main Results 3 OLS: A Motivating Example Table of Contents 1 Background 2 Main Results 3 OLS:

More information

Specification Errors, Measurement Errors, Confounding

Specification Errors, Measurement Errors, Confounding Specification Errors, Measurement Errors, Confounding Kerby Shedden Department of Statistics, University of Michigan October 10, 2018 1 / 32 An unobserved covariate Suppose we have a data generating model

More information

The GENIUS Approach to Robust Mendelian Randomization Inference

The GENIUS Approach to Robust Mendelian Randomization Inference Original Article The GENIUS Approach to Robust Mendelian Randomization Inference Eric J. Tchetgen Tchetgen, BaoLuo Sun, Stefan Walter Department of Biostatistics,Harvard University arxiv:1709.07779v5 [stat.me]

More information

On the Choice of Parameterisation and Priors for the Bayesian Analyses of Mendelian Randomisation Studies.

On the Choice of Parameterisation and Priors for the Bayesian Analyses of Mendelian Randomisation Studies. On the Choice of Parameterisation and Priors for the Bayesian Analyses of Mendelian Randomisation Studies. E. M. Jones 1, J. R. Thompson 1, V. Didelez, and N. A. Sheehan 1 1 Department of Health Sciences,

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x

More information

Generalised method of moments estimation of structural mean models

Generalised method of moments estimation of structural mean models Generalised method of moments estimation of structural mean models Tom Palmer 1,2 Roger Harbord 2 Paul Clarke 3 Frank Windmeijer 3,4,5 1. MRC Centre for Causal Analyses in Translational Epidemiology 2.

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Selecting (In)Valid Instruments for Instrumental Variables Estimation

Selecting (In)Valid Instruments for Instrumental Variables Estimation Selecting InValid Instruments for Instrumental Variables Estimation Frank Windmeijer a,e, Helmut Farbmacher b, Neil Davies c,e George Davey Smith c,e, Ian White d a Department of Economics University of

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression

Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression 1/37 The linear regression model Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression Ken Rice Department of Biostatistics University of Washington 2/37 The linear regression model

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Estimating the Marginal Odds Ratio in Observational Studies

Estimating the Marginal Odds Ratio in Observational Studies Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson Bayesian variable selection via penalized credible regions Brian Reich, NC State Joint work with Howard Bondell and Ander Wilson Brian Reich, NCSU Penalized credible regions 1 Motivation big p, small n

More information

Confidence Intervals for Low-dimensional Parameters with High-dimensional Data

Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Cun-Hui Zhang and Stephanie S. Zhang Rutgers University and Columbia University September 14, 2012 Outline Introduction Methodology

More information

Generalised method of moments estimation of structural mean models

Generalised method of moments estimation of structural mean models Generalised method of moments estimation of structural mean models Tom Palmer 1,2 Roger Harbord 2 Paul Clarke 3 Frank Windmeijer 3,4,5 1. MRC Centre for Causal Analyses in Translational Epidemiology 2.

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

Median Cross-Validation

Median Cross-Validation Median Cross-Validation Chi-Wai Yu 1, and Bertrand Clarke 2 1 Department of Mathematics Hong Kong University of Science and Technology 2 Department of Medicine University of Miami IISA 2011 Outline Motivational

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Probabilistic Index Models

Probabilistic Index Models Probabilistic Index Models Jan De Neve Department of Data Analysis Ghent University M3 Storrs, Conneticut, USA May 23, 2017 Jan.DeNeve@UGent.be 1 / 37 Introduction 2 / 37 Introduction to Probabilistic

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Sociology 6Z03 Review I

Sociology 6Z03 Review I Sociology 6Z03 Review I John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review I Fall 2016 1 / 19 Outline: Review I Introduction Displaying Distributions Describing

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017 Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping

More information

Statistical Data Analysis

Statistical Data Analysis DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the

More information

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) Following is an outline of the major topics covered by the AP Statistics Examination. The ordering here is intended to define the

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

Rerandomization to Balance Covariates

Rerandomization to Balance Covariates Rerandomization to Balance Covariates Kari Lock Morgan Department of Statistics Penn State University Joint work with Don Rubin University of Minnesota Biostatistics 4/27/16 The Gold Standard Randomized

More information

TGDR: An Introduction

TGDR: An Introduction TGDR: An Introduction Julian Wolfson Student Seminar March 28, 2007 1 Variable Selection 2 Penalization, Solution Paths and TGDR 3 Applying TGDR 4 Extensions 5 Final Thoughts Some motivating examples We

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Causal Discovery by Computer

Causal Discovery by Computer Causal Discovery by Computer Clark Glymour Carnegie Mellon University 1 Outline 1. A century of mistakes about causation and discovery: 1. Fisher 2. Yule 3. Spearman/Thurstone 2. Search for causes is statistical

More information

Causal Inference Basics

Causal Inference Basics Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,

More information

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Spatial Regression 3. Review - OLS and 2SLS Luc Anselin http://spatial.uchicago.edu OLS estimation (recap) non-spatial regression diagnostics endogeneity - IV and 2SLS OLS Estimation (recap) Linear Regression

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

IV-estimators of the causal odds ratio for a continuous exposure in prospective and retrospective designs

IV-estimators of the causal odds ratio for a continuous exposure in prospective and retrospective designs IV-estimators of the causal odds ratio for a continuous exposure in prospective and retrospective designs JACK BOWDEN (corresponding author) MRC Biostatistics Unit, Institute of Public Health, Robinson

More information

Selective Inference for Effect Modification

Selective Inference for Effect Modification Inference for Modification (Joint work with Dylan Small and Ashkan Ertefaie) Department of Statistics, University of Pennsylvania May 24, ACIC 2017 Manuscript and slides are available at http://www-stat.wharton.upenn.edu/~qyzhao/.

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

ROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY

ROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY ROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY G.L. Shevlyakov, P.O. Smirnov St. Petersburg State Polytechnic University St.Petersburg, RUSSIA E-mail: Georgy.Shevlyakov@gmail.com

More information

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations Basics of Experimental Design Review of Statistics And Experimental Design Scientists study relation between variables In the context of experiments these variables are called independent and dependent

More information

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13)

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) 1. Weighted Least Squares (textbook 11.1) Recall regression model Y = β 0 + β 1 X 1 +... + β p 1 X p 1 + ε in matrix form: (Ch. 5,

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007) Double Robustness Bang and Robins (2005) Kang and Schafer (2007) Set-Up Assume throughout that treatment assignment is ignorable given covariates (similar to assumption that data are missing at random

More information

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information