Advanced Statistical Methods for Observational Studies LECTURE 01

1 Advanced Statistical Methods for Observational Studies LECTURE 01

2 introduction

3 this class
- Website
- Expectations
- Questions

4 observational studies
The world of observational studies is kind of hard to get into because it grew up in several distinct, but overlapping, disciplines:
- Epidemiology
- Demography
- Economics (econometrics)
- Political Science
- Sociology
- Biostatistics
- Statistics
- Psychology (psychometrics)
- Computer Science

5 a small bit about me
I do causal inference:
- Observational studies of: NICU, surgical interventions, and smoking cessation.
- Randomized studies: six interventions here at Stanford to improve educational outcomes, two large trials of a sexual assault prevention program.

6 an aside
You can call me Mike. If you want to use my last name, Baiocchi, totally feel free to. If you say it this way I'll definitely know you're talking to me: bye-oh-key

7 potential outcomes framework Design of Observational Studies: section 2.2

8 causal inference
Our goal is to figure out what the change in the outcome will be for a person if we change from the control to the treatment:
$Y_i^{t=1} - Y_i^{t=0} = \Delta_i$

9 causal inference
$Y_i^{t=1} - Y_i^{t=0}$
Fundamental problem of causality: We cannot observe both $Y_i^{t=1}$ and $Y_i^{t=0}$ at the same time.

10 notation
DOS notation:
- $r_C$ = response when the control is applied
- $r_T$ = response when the treatment is applied
- $r_{Ti}$ = response for observational unit i when the treatment is applied
- $Z_i$ = assignment to treatment for unit i
- $R_i = Z_i r_{Ti} + (1 - Z_i) r_{Ci}$ = the observed response for unit i

11-12 a table
person i | $r_{Ci}$ | $r_{Ti}$ | $Z_i$ | $R_i$ | $\Delta_i$
Sometimes called: The Science Table
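The columns of the table can be illustrated with a small sketch. The four rows below are made-up numbers, not data from the lecture, and the helper names `observed_response` and `unit_effect` are hypothetical. The point is only that the observed response follows the rule $R_i = Z_i r_{Ti} + (1 - Z_i) r_{Ci}$, while $\Delta_i$ needs both potential outcomes.

```python
# Hypothetical potential outcomes for four units (illustrative numbers only).
# Each row: (r_C, r_T, Z) = response under control, under treatment, assignment.
science_table = [
    (3.0, 5.0, 1),
    (2.0, 2.5, 0),
    (4.0, 3.0, 1),
    (1.0, 4.0, 0),
]


def observed_response(r_c, r_t, z):
    """R_i = Z_i * r_Ti + (1 - Z_i) * r_Ci: only one potential outcome is seen."""
    return z * r_t + (1 - z) * r_c


def unit_effect(r_c, r_t):
    """Delta_i = r_Ti - r_Ci; computable here only because the data are made up."""
    return r_t - r_c


print("i  r_C  r_T  Z  R    Delta")
for i, (r_c, r_t, z) in enumerate(science_table, start=1):
    r_obs = observed_response(r_c, r_t, z)
    delta = unit_effect(r_c, r_t)
    print(f"{i}  {r_c:<4} {r_t:<4} {z}  {r_obs:<4} {delta:+.1f}")

# What an analyst would actually get to see: one response per unit.
observed = [observed_response(r_c, r_t, z) for r_c, r_t, z in science_table]
```

In real data the `r_C` or `r_T` entry that was not assigned is missing, so the `Delta` column cannot be filled in row by row.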

13 potential outcomes framework The potential outcomes framework is meant to clarify the challenge: We only get to observe half of the data we actually want. How do we get at the unobserved half?

14 study design vs. inference Don Rubin: For objective causal inference, design trumps analysis

15 study design vs. inference
90% of statistics classes are about inference. Why?
- It's useful, getting you those confidence intervals and p-values.
- The math is pretty cool.
- It feels hard.
- Because many of us don't really know much about the real world

16 design: RANDOMIZATION AND SAMPLING

17 where does the data come from?
We design trials.
- Assign groups that are similar at baseline
- Examine counterfactuals
We also design sampling schemes.
- Representative groups
- Understand population from subsets of those populations
Both use elements of control and randomness

18 an example: randomization
Want to study a pill.
Design the study | Inference
Uniform randomization | t-test
Matched pairs randomization | Matched-pairs t-test
Crossover design | Repeated measures model
Cluster-randomized | Generalized linear mixed model
But maybe all of those could be GLMM.

19 an example: sampling
Want to study an election.
Design the study | Inference
Simple random sample | t-test
Stratified sampling | Inverse probability weighting
Snowball sampling | Generalized linear mixed model
But maybe all of those could be GLMM.

20 different beliefs about where data come from
- RCT and sampling
- Structural equation modeling: $y_i = \beta_0 + \beta_1 x_{1,i} + \cdots + \beta_p x_{p,i} + \varepsilon_i$
If you want to be disabused of SEM spend some time reading

21 where data come from
If you'd like to be abused by SEM please see

22 inference

23 picking inference
Inference requires assumptions.
Linear regression:
- Linearity and additivity
- Independent errors
- Homoskedasticity
- Normality of errors
Permutation test:
- Known assignment mechanism to T or C
Fancier methods tend to have more assumptions and thus leave you open to more lines of attack. These attacks can be obviated by careful preparation during the design phase.
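To make the contrast concrete, here is a minimal sketch of an exact permutation test whose only assumption is the assignment mechanism. It assumes a completely randomized design with a fixed number of treated units. The function name `permutation_test` and the six data values are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def permutation_test(responses, assignments):
    """Exact two-sided permutation test for a difference in means.

    Assumes a completely randomized design in which every way of choosing
    the same number of treated units was equally likely.
    """
    n = len(responses)
    n_treated = sum(assignments)

    def mean_difference(treated_idx):
        treated = [responses[i] for i in treated_idx]
        control = [responses[i] for i in range(n) if i not in treated_idx]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed_idx = {i for i, z in enumerate(assignments) if z == 1}
    observed = mean_difference(observed_idx)

    # Enumerate every assignment the design could have produced.
    stats = [mean_difference(set(idx)) for idx in combinations(range(n), n_treated)]
    # Small tolerance so ties are not lost to floating-point rounding.
    extreme = sum(1 for s in stats if abs(s) >= abs(observed) - 1e-12)
    return observed, extreme / len(stats)


# Hypothetical data: 3 treated and 3 control units.
responses = [12.0, 11.0, 13.0, 7.0, 8.0, 6.0]
assignments = [1, 1, 1, 0, 0, 0]
observed_diff, p_value = permutation_test(responses, assignments)
print(observed_diff, p_value)
```

With six units there are only 20 possible assignments, so the smallest attainable two-sided p-value here is 2/20. Full enumeration becomes impractical for larger studies, where a random sample of assignments is a common substitute.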

24 picking inference
Use the simplest method that gets the job done. If you want to accomplish more, collect more data or do additional analyses. ("If you have to use something more complicated than a t-test then someone messed up.")
The fewer assumptions there are, the easier it will be to:
- perform a sensitivity analysis
- build an argument to beat back the haters

25 picking inference Another option: Proof by intimidation This paper presents a breakthrough in rhetorical logic, a promising field of science, of great values to those writing research proposals. It provides new, and utterly convincing tools for closing embarrassing gaps in your reasoning, without having to resort to brute-force methods such as actually thinking about the problem in the first place. The Craske-Trump Theorem Conjecture will allow researchers in any field to use the technique of Proof by Intimidation fully. - Michael Wilkinson (Annals of Improbable Research 2000)

26 observational study design: NEONATAL INTENSIVE CARE UNITS

27 Application: Regionalization Hospitals vary in their ability to care for premature infants. The American Academy of Pediatrics recognizes levels: 1, 2, 3A, 3B, 3C, 3D and Regional Centers. Regionalization of care refers to a policy that suggests or requires that high-risk mothers deliver at hospitals with greater levels of capabilities.

28-29 Outcome (figure)

30 The data
Every baby delivered in a 10+ year period:
- California
- Pennsylvania
- Missouri
Mothers' information:
- ICD9 codes: delivery, post-delivery complications, some pre-delivery
- Some SES information
- Zip code of residence
Birth/death certificates
Census information:
- PA and MO have zip code level
- CA will have block group
Pre-delivery severity?

33 naïve model for observational studies Design of Observational Studies: section

34 the variation comes from somewhere
The model starts with thinking about the assignment mechanism:
$\pi_i = \Pr(Z_i = 1 \mid r_{Ti}, r_{Ci}, x_i, u_i)$
where $\pi_i$ will likely vary from person to person, $x_i$ are the observed covariates and $u_i$ are the unobserved covariates.
Consider the vector of all treatment assignments, $Z$. If units are assigned independently of one another, its distribution can be written as:
$\Pr(Z = z \mid r_T, r_C, x, u) = \pi_1^{z_1} (1 - \pi_1)^{1 - z_1} \cdots \pi_n^{z_n} (1 - \pi_n)^{1 - z_n}$
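The product form of the assignment probability is easy to evaluate once the $\pi_i$ are given. The sketch below assumes independent assignment across units and uses made-up values of $\pi_i$; in an observational study these are unknown. The function name `assignment_probability` is hypothetical.

```python
from math import prod


def assignment_probability(z, pi):
    """Pr(Z = z) = prod_i pi_i^{z_i} (1 - pi_i)^{1 - z_i}, assuming independent units."""
    if len(z) != len(pi):
        raise ValueError("z and pi must have the same length")
    return prod(p if zi == 1 else 1.0 - p for zi, p in zip(z, pi))


# Hypothetical per-unit treatment probabilities (in practice these are unknown).
pi = [0.5, 0.2, 0.9]
print(assignment_probability([1, 0, 1], pi))
```

Summing this quantity over all $2^n$ possible vectors $z$ gives 1, which is a quick sanity check that it defines a probability distribution.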

35 motivation for matching
Imagine we can find two subjects, k and l, such that $Z_k + Z_l = 1$ but $\pi_k = \pi_l$.
Recall the model $\pi_i = \Pr(Z_i = 1 \mid r_{Ti}, r_{Ci}, x_i, u_i)$: we can only match on a subset of these.
If we were in the RCT framework then we could create these trivially by doing a matched pairs randomization and assigning k and l to the same pair.
We're building a story about the randomization.
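In the RCT setting, a matched pairs randomization can be sketched in a few lines. This is an illustrative sketch only: the unit labels, the function name `matched_pairs_randomization`, and the use of a seeded generator are assumptions, not part of the lecture.

```python
import random


def matched_pairs_randomization(pairs, seed=None):
    """Flip a fair coin within each pair: one unit is treated, the other is control."""
    rng = random.Random(seed)
    assignment = {}
    for k, l in pairs:
        z_k = rng.randint(0, 1)
        assignment[k] = z_k
        assignment[l] = 1 - z_k  # guarantees Z_k + Z_l = 1
    return assignment


pairs = [("a", "b"), ("c", "d"), ("e", "f")]  # hypothetical unit labels
assignment = matched_pairs_randomization(pairs, seed=0)
print(assignment)
```

By construction each pair has exactly one treated unit and both members had the same chance of being that unit, which is the structure matching tries to recover from observational data.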

36-38 Excess Travel Time (figure)

39 McClellan, McNeil & Newhouse; "Does more intensive treatment of acute myocardial infarction reduce mortality?" JAMA. 272(11), September 1994

40 motivation for matching
Recall the model $\pi_i = \Pr(Z_i = 1 \mid r_{Ti}, r_{Ci}, x_i, u_i)$: we can only match on a subset of these.
- $r_{Ti}$, $r_{Ci}$ are whether baby lives or dies.
- $Z_i$ is whether baby is delivered at low or high level NICU.
- $x_i$ are the 40+ covariates of mom and baby, as well as where they live.
- $u_i$ are all the unmeasured features that influenced the baby, mom and the process of how they ended up delivering at the hospital they did.

43 motivation: natural experiments
Perhaps the $\pi_i$ behave in ways that will allow us to draw inferences in ways that are similar to experiments. Loosely speaking, this will tend to happen when the assignment mechanism is haphazard, not using prognostically relevant covariates to sort the units into levels of the treatment.
What happens if we can find matches such that $Z_k + Z_l = 1$ and $\pi_k = \pi_l$?

44-46 motivation: natural experiments
What happens if we can find matches such that $\pi_k = \pi_l$?
$\Pr(Z_k = z_k, Z_l = z_l \mid r_{Tk}, r_{Ck}, x_k, u_k, r_{Tl}, r_{Cl}, x_l, u_l)$
$= \pi_k^{z_k} (1 - \pi_k)^{1 - z_k} \, \pi_l^{z_l} (1 - \pi_l)^{1 - z_l}$
$= \pi_k^{z_k + z_l} (1 - \pi_k)^{(1 - z_k) + (1 - z_l)}$
What if we design our study such that $Z_l + Z_k = 1$?
$\Pr(Z_k = 1, Z_l = 0 \mid Z_l + Z_k = 1)$
$= \frac{\Pr(Z_k = 1, Z_l = 0)}{\Pr(Z_k = 1, Z_l = 0) + \Pr(Z_k = 0, Z_l = 1)}$
$= \frac{\pi_k^{1 + 0} (1 - \pi_k)^{(1 - 1) + (1 - 0)}}{\pi_k^{1 + 0} (1 - \pi_k)^{(1 - 1) + (1 - 0)} + \pi_k^{0 + 1} (1 - \pi_k)^{(1 - 0) + (1 - 1)}}$
$= \frac{1}{2}$
IF we can do this then we get to use the same tools developed for RCTs!
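The result can be checked numerically. The sketch below computes $\Pr(Z_k = 1, Z_l = 0 \mid Z_k + Z_l = 1)$ with exact fractions, assuming the two units are assigned independently. The function names and the specific probabilities are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def pair_probability(z_k, z_l, pi_k, pi_l):
    """Pr(Z_k = z_k, Z_l = z_l) for two independently assigned units."""
    p_k = pi_k if z_k == 1 else 1 - pi_k
    p_l = pi_l if z_l == 1 else 1 - pi_l
    return p_k * p_l


def prob_k_treated_given_one_treated(pi_k, pi_l):
    """Pr(Z_k = 1, Z_l = 0 | Z_k + Z_l = 1)."""
    numerator = pair_probability(1, 0, pi_k, pi_l)
    denominator = numerator + pair_probability(0, 1, pi_k, pi_l)
    return numerator / denominator


# Equal probabilities: the answer is 1/2 whatever the shared value is.
for pi in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    print(pi, prob_k_treated_given_one_treated(pi, pi))

# Unequal probabilities (hypothetical values): the pair is no longer a fair coin.
print(prob_k_treated_given_one_treated(Fraction(3, 10), Fraction(6, 10)))
```

With $\pi_k = 3/10$ and $\pi_l = 6/10$ the conditional probability is 2/9 rather than 1/2, which is why the equality $\pi_k = \pi_l$ carries the argument.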

47 the naïve model
Those that look alike (in our data set) are alike:
$\pi_i = \Pr(Z_i = 1 \mid r_{Ti}, r_{Ci}, x_i, u_i) = \Pr(Z_i = 1 \mid x_i)$
and with $0 < \pi_i < 1$ for all $i = 1, 2, \ldots, n$
$\Pr(Z = z \mid r_T, r_C, x, u) = \prod_{i=1}^{n} \pi_i^{z_i} (1 - \pi_i)^{1 - z_i}$
We could make this model true by randomly assigning, potentially using biased coins based on the observed covariates.

48 at the core: the propensity score
This quantity has a name: $e(x_i) = \Pr(Z_i = 1 \mid x_i)$. Often, $e(x_i)$ is referred to as the propensity score.
- It's the probability of assignment to treatment.
- It links the features of an individual unit to its assignment to the treatment.
- It's a description of the assignment mechanism.
- It's not magic.
But it does have several cool features that are not immediately obvious:
- Propensity: facilitates randomization (Fisher)
- Score: short for balancing score, which facilitates balance (Mill)
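When the observed covariate is discrete, one simple way to estimate $e(x)$ is the fraction of treated units among those sharing the same $x$. The sketch below does exactly that on made-up data. The function name `estimate_propensity` and the data are hypothetical, and with many covariates a fitted model (for example logistic regression) is the more usual choice, which is an assumption here rather than something the slides specify.

```python
from collections import defaultdict


def estimate_propensity(covariates, assignments):
    """Estimate e(x) = Pr(Z = 1 | x) as the treated fraction among units sharing x."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for x, z in zip(covariates, assignments):
        total[x] += 1
        treated[x] += z
    return {x: treated[x] / total[x] for x in total}


# Hypothetical data: x is a single discrete covariate, z is treatment received.
covariates = ["low", "low", "low", "low", "high", "high", "high", "high"]
assignments = [0, 0, 0, 1, 1, 1, 0, 1]
e_hat = estimate_propensity(covariates, assignments)
print(e_hat)
```

Units with the same estimated score can then be compared as if a biased coin with that probability had assigned them, but only to the extent that the naïve model holds.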

49 wishes
If the ideal match and the naïve model were true then we could just match on the observed covariates and analyze data using traditional techniques for RCTs. Unfortunately, it's extraordinarily unlikely that strongly ignorable treatment assignment actually holds. It can be more plausible or less plausible given different situations, but it can never be proven.

50 takeaways

51 takeaways
- The potential outcomes framework helps organize our thinking on counterfactuals
- Design comes in two flavors (actually, three, but the third one is not very healthy)
- In prospective studies, design is an obvious consideration and one that MUST be passed through in order to obtain data
- In retrospective studies, design is a less obvious consideration but one that MUST be passed through, unfortunately without much attention paid

52 fin. CHECK OUT THE WEBSITE.

Advanced Statistical Methods for Observational Studies L E C T U R E 0 1

Advanced Statistical Methods for Observational Studies L E C T U R E 0 1 Advanced Statistical Methods for Observational Studies L E C T U R E 0 1 introduction this class Website Expectations Questions observational studies The world of observational studies is kind of hard

More information

Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants

Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Near/Far Matching Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Joint research: Mike Baiocchi, Dylan Small, Scott Lorch and Paul Rosenbaum What this talk

More information

Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants

Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Near/Far Matching Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Joint research: Mike Baiocchi, Dylan Small, Scott Lorch and Paul Rosenbaum Classic set

More information

Advanced Statistical Methods for Observational Studies L E C T U R E 0 6

Advanced Statistical Methods for Observational Studies L E C T U R E 0 6 Advanced Statistical Methods for Observational Studies L E C T U R E 0 6 class management Problem set 1 is posted Questions? design thus far We re off to a bad start. 1 2 1 2 1 2 2 2 1 1 1 2 1 1 2 2 2

More information

1 Impact Evaluation: Randomized Controlled Trial (RCT)

1 Impact Evaluation: Randomized Controlled Trial (RCT) Introductory Applied Econometrics EEP/IAS 118 Fall 2013 Daley Kutzman Section #12 11-20-13 Warm-Up Consider the two panel data regressions below, where i indexes individuals and t indexes time in months:

More information

Causal Inference. Prediction and causation are very different. Typical questions are:

Causal Inference. Prediction and causation are very different. Typical questions are: Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict Y after observing X = x Causation: Predict Y after setting X = x. Causation involves predicting

More information

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016

Mediation analyses. Advanced Psychometrics Methods in Cognitive Aging Research Workshop. June 6, 2016 Mediation analyses Advanced Psychometrics Methods in Cognitive Aging Research Workshop June 6, 2016 1 / 40 1 2 3 4 5 2 / 40 Goals for today Motivate mediation analysis Survey rapidly developing field in

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Uncertainty. Michael Peters December 27, 2013

Uncertainty. Michael Peters December 27, 2013 Uncertainty Michael Peters December 27, 20 Lotteries In many problems in economics, people are forced to make decisions without knowing exactly what the consequences will be. For example, when you buy

More information

An introduction to biostatistics: part 1

An introduction to biostatistics: part 1 An introduction to biostatistics: part 1 Cavan Reilly September 6, 2017 Table of contents Introduction to data analysis Uncertainty Probability Conditional probability Random variables Discrete random

More information

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015 Fall 2015 Population versus Sample Population: data for every possible relevant case Sample: a subset of cases that is drawn from an underlying population Inference Parameters and Statistics A parameter

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Technical Track Session I: Causal Inference

Technical Track Session I: Causal Inference Impact Evaluation Technical Track Session I: Causal Inference Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

CPSC 320 Sample Solution, Reductions and Resident Matching: A Residentectomy

CPSC 320 Sample Solution, Reductions and Resident Matching: A Residentectomy CPSC 320 Sample Solution, Reductions and Resident Matching: A Residentectomy August 25, 2017 A group of residents each needs a residency in some hospital. A group of hospitals each need some number (one

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28 Separate vs. paired samples Despite the fact that paired samples usually offer

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

1 Probabilities. 1.1 Basics 1 PROBABILITIES

1 Probabilities. 1.1 Basics 1 PROBABILITIES 1 PROBABILITIES 1 Probabilities Probability is a tricky word usually meaning the likelyhood of something occuring or how frequent something is. Obviously, if something happens frequently, then its probability

More information

Differences-in- Differences. November 10 Clair

Differences-in- Differences. November 10 Clair Differences-in- Differences November 10 Clair The Big Picture What is this class really about, anyway? The Big Picture What is this class really about, anyway? Causality The Big Picture What is this class

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 8. Roy Model, IV with essential heterogeneity, MTE Lecture 8. Roy Model, IV with essential heterogeneity, MTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Heterogeneity When we talk about heterogeneity, usually we mean heterogeneity

More information

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

Modeling Log Data from an Intelligent Tutor Experiment

Modeling Log Data from an Intelligent Tutor Experiment Modeling Log Data from an Intelligent Tutor Experiment Adam Sales 1 joint work with John Pane & Asa Wilks College of Education University of Texas, Austin RAND Corporation Pittsburgh, PA & Santa Monica,

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

PSC 504: Instrumental Variables

PSC 504: Instrumental Variables PSC 504: Instrumental Variables Matthew Blackwell 3/28/2013 Instrumental Variables and Structural Equation Modeling Setup e basic idea behind instrumental variables is that we have a treatment with unmeasured

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D.

Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D. Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D. David Kaplan Department of Educational Psychology The General Theme

More information

Experiments and Quasi-Experiments

Experiments and Quasi-Experiments Experiments and Quasi-Experiments (SW Chapter 13) Outline 1. Potential Outcomes, Causal Effects, and Idealized Experiments 2. Threats to Validity of Experiments 3. Application: The Tennessee STAR Experiment

More information

CIS 2033 Lecture 5, Fall

CIS 2033 Lecture 5, Fall CIS 2033 Lecture 5, Fall 2016 1 Instructor: David Dobor September 13, 2016 1 Supplemental reading from Dekking s textbook: Chapter2, 3. We mentioned at the beginning of this class that calculus was a prerequisite

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

Causal inference in multilevel data structures:

Causal inference in multilevel data structures: Causal inference in multilevel data structures: Discussion of papers by Li and Imai Jennifer Hill May 19 th, 2008 Li paper Strengths Area that needs attention! With regard to propensity score strategies

More information

Propensity Score Matching

Propensity Score Matching Methods James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 Methods 1 Introduction 2 3 4 Introduction Why Match? 5 Definition Methods and In

More information

Modeling Mediation: Causes, Markers, and Mechanisms

Modeling Mediation: Causes, Markers, and Mechanisms Modeling Mediation: Causes, Markers, and Mechanisms Stephen W. Raudenbush University of Chicago Address at the Society for Resesarch on Educational Effectiveness,Washington, DC, March 3, 2011. Many thanks

More information

P (E) = P (A 1 )P (A 2 )... P (A n ).

P (E) = P (A 1 )P (A 2 )... P (A n ). Lecture 9: Conditional probability II: breaking complex events into smaller events, methods to solve probability problems, Bayes rule, law of total probability, Bayes theorem Discrete Structures II (Summer

More information

Econometric Causality

Econometric Causality Econometric (2008) International Statistical Review, 76(1):1-27 James J. Heckman Spencer/INET Conference University of Chicago Econometric The econometric approach to causality develops explicit models

More information

Comments on Best Quasi- Experimental Practice

Comments on Best Quasi- Experimental Practice Comments on Best Quasi- Experimental Practice Larry V. Hedges Northwestern University Presented at the Future of Implementation Evaluation, National Science Foundation, Arlington, VA, October 28, 2013

More information

Chapter 3. Estimation of p. 3.1 Point and Interval Estimates of p

Chapter 3. Estimation of p. 3.1 Point and Interval Estimates of p Chapter 3 Estimation of p 3.1 Point and Interval Estimates of p Suppose that we have Bernoulli Trials (BT). So far, in every example I have told you the (numerical) value of p. In science, usually the

More information

Chapter 7: Hypothesis testing

Chapter 7: Hypothesis testing Chapter 7: Hypothesis testing Hypothesis testing is typically done based on the cumulative hazard function. Here we ll use the Nelson-Aalen estimate of the cumulative hazard. The survival function is used

More information

Causal Inference Basics

Causal Inference Basics Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Experiments and Quasi-Experiments (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Why study experiments? Ideal randomized controlled experiments

More information

Instrumental variables estimation in the Cox Proportional Hazard regression model

Instrumental variables estimation in the Cox Proportional Hazard regression model Instrumental variables estimation in the Cox Proportional Hazard regression model James O Malley, Ph.D. Department of Biomedical Data Science The Dartmouth Institute for Health Policy and Clinical Practice

More information

University of Pennsylvania and The Children s Hospital of Philadelphia

University of Pennsylvania and The Children s Hospital of Philadelphia Submitted to the Annals of Applied Statistics arxiv: arxiv:0000.0000 ESTIMATION OF CAUSAL EFFECTS USING INSTRUMENTAL VARIABLES WITH NONIGNORABLE MISSING COVARIATES: APPLICATION TO EFFECT OF TYPE OF DELIVERY

More information

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers: Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 27, 2007 Kosuke Imai (Princeton University) Matching

More information

Lecture 5. 1 Review (Pairwise Independence and Derandomization)

Lecture 5. 1 Review (Pairwise Independence and Derandomization) 6.842 Randomness and Computation September 20, 2017 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Tom Kolokotrones 1 Review (Pairwise Independence and Derandomization) As we discussed last time, we can

More information

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

Comparative effectiveness of dynamic treatment regimes

Comparative effectiveness of dynamic treatment regimes Comparative effectiveness of dynamic treatment regimes An application of the parametric g- formula Miguel Hernán Departments of Epidemiology and Biostatistics Harvard School of Public Health www.hsph.harvard.edu/causal

More information

Studies. Frank E Harrell Jr. NIH AHRQ Methodologic Challenges in CER 2 December 2010

Studies. Frank E Harrell Jr. NIH AHRQ Methodologic Challenges in CER 2 December 2010 Frank E Harrell Jr Department of Biostatistics Vanderbilt University School of Medice NIH AHRQ Methodologic Challenges CER 2 December 2010 Outle 1 2 3 4 What characterizes situations when causal ference

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

Analysis of propensity score approaches in difference-in-differences designs
