Research Design: Causal inference and counterfactuals


1 Research Design: Causal inference and counterfactuals University College Dublin 8 March 2013


3 Outline


5 Inference In regression analysis we look at the relationship between one or more independent variables and a dependent variable. Statistical inference asks how likely it is that we would observe this relationship under the null hypothesis of no relationship (frequentist), or how much we should update our beliefs about this relationship given the new evidence (Bayesian). A different question is whether we can conclude that the independent variable is a cause of the dependent one. (Note: references in these slides can be found in the associated handout.)

6 Association association ≠ causation

7 Association Given that, say, X and Y are correlated (associated), there are still many possible causal patterns at play.

8-15 Many possible patterns (eight figure-only slides; the figures were not preserved in this transcription)

16 Inference Generally, to make causal inferences from your analysis, you need assumptions beyond those already made for associational or predictive inference.

17 Outline


19 Fundamental problem Imagine there are two kinds of people: one group, $T = 1$, that has a college degree, and another group, $T = 0$, that does not. We want to measure whether a college degree leads to a higher salary, $Y$. What we would like to know, for any individual $i$, is the difference in salary between having a college degree and not having one: $Y_i^{T_i=1} - Y_i^{T_i=0}$. However, for every individual $i$ we observe either $Y_i^{T_i=1}$ or $Y_i^{T_i=0}$, never both: they either have the degree or they don't.

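20 We wish...
Respondent | Degree | $Y_i^{T_i=0}$ | $Y_i^{T_i=1}$ | effect
(Cell values not preserved in this transcription. Degree status of respondents 1 to 8: Yes, Yes, No, No, Yes, Yes, No, Yes.)

21 We wish... we have...
Respondent | Degree | $Y_i^{T_i=0}$ | $Y_i^{T_i=1}$ | effect
1 | Yes | - | [lost] | -
2 | Yes | - | [lost] | -
3 | No | 90 | - | -
4 | No | 87 | - | -
5 | Yes | - | [lost] | -
6 | Yes | - | [lost] | -
7 | No | 92 | - | -
8 | Yes | - | 109 | -
(- : not observed; [lost] : observed value not preserved in this transcription)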


23 Potential outcomes
Potential outcome $= \begin{cases} Y_{1i} & \text{if } T_i = 1 \\ Y_{0i} & \text{if } T_i = 0 \end{cases}$
E.g., $Y_{1i}$ is the salary individual $i$ would have with a college degree, irrespective of whether (s)he actually has one. The observed outcome is
$Y_i = Y_{0i} + (Y_{1i} - Y_{0i}) T_i = Y_{0i} + \delta T_i$,
where $\delta = Y_{1i} - Y_{0i}$ is the causal effect. (Angrist & Pischke 2009, 13-14)
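The switching equation on this slide can be illustrated with a small sketch. The four individuals and their salary figures are hypothetical, chosen only for illustration, and the helper name `observed_outcome` is not from the slides. In real data only one of the two potential outcomes is available per person.

```python
# Hypothetical potential outcomes for four individuals (illustrative numbers,
# not taken from the slides). In real data only one of y0, y1 is ever seen.
people = [
    {"T": 1, "y0": 80, "y1": 100},
    {"T": 0, "y0": 90, "y1": 105},
    {"T": 1, "y0": 85, "y1": 110},
    {"T": 0, "y0": 87, "y1": 95},
]


def observed_outcome(person):
    """Switching equation: Y_i = Y_0i + (Y_1i - Y_0i) * T_i."""
    delta = person["y1"] - person["y0"]  # individual causal effect
    return person["y0"] + delta * person["T"]


observed = [observed_outcome(p) for p in people]
effects = [p["y1"] - p["y0"] for p in people]

print(observed)  # [100, 90, 110, 87]
print(effects)   # [20, 15, 25, 8]
```

The `effects` list can only be computed here because the toy data contain both potential outcomes; with observed data it is exactly the column that stays empty.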


25 Average treatment effect Because it is impossible to observe individual treatment effects, we usually turn to the average treatment effect:
$E[\delta] = E[Y_{1i} - Y_{0i}] = E[Y_{1i}] - E[Y_{0i}]$,
which we could naively estimate with
$\hat{\delta} = E[Y_{1i} \mid T_i = 1] - E[Y_{0i} \mid T_i = 0]$.
This assumes that $E[Y_{1i} \mid T_i = 1]$ reflects the salary with a college degree for everyone, irrespective of whether they actually got one, and that $E[Y_{0i} \mid T_i = 0]$ reflects the salary without a college degree for everyone, irrespective of whether they actually got one.
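A minimal sketch of how the naive estimate can differ from the true average treatment effect. The `(T, Y0, Y1)` triples are hypothetical numbers in which people with a degree would have earned more even without one; both potential outcomes are visible only because this is a toy example.

```python
# Hypothetical (T, Y0, Y1) triples; illustrative assumptions, not slide data.
units = [
    (1, 100, 120),
    (1, 110, 130),
    (0, 80, 90),
    (0, 90, 100),
]


def mean(values):
    return sum(values) / len(values)


# True average treatment effect: E[Y1 - Y0] over everyone.
true_ate = mean([y1 - y0 for _, y0, y1 in units])

# Naive estimate: E[Y1 | T = 1] - E[Y0 | T = 0], using observed outcomes only.
naive = (mean([y1 for t, _, y1 in units if t == 1])
         - mean([y0 for t, y0, _ in units if t == 0]))

print(true_ate)  # 15.0
print(naive)     # 40.0
```

In this constructed case the naive comparison overstates the effect because the treated group starts from a higher baseline salary.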


28 Counterfactual causality By making such assumptions when looking at the ATE, we are making a counterfactual argument: we are making assumptions about what $Y_{1i}$ would have been had $i$ had a college degree. This implies that we cannot measure a causal effect, only estimate it. To understand when the ATE assumptions are reasonable, we need to look at the effect of covariates: other variables that relate to $Y$, which we will denote by $X$.

29 Treatment effect: abbreviations
ATE: Average Treatment Effect, $E[\delta] = E[Y_{1i} - Y_{0i}]$
ATT: ATE for the Treated, $E[\delta_T] = E[Y_{1i} - Y_{0i} \mid T_i = 1]$
ATC: ATE for the Control (untreated), $E[\delta_C] = E[Y_{1i} - Y_{0i} \mid T_i = 0]$
PATE: Population ATE, $E[Y_{1i} - Y_{0i}]$
SATE: Sample ATE, $E_n[Y_{1i} - Y_{0i}]$
LATE: Local ATE, $E[Y_{1i} - Y_{0i} \mid X_i = x]$
CATE: Conditional ATE, $E[Y_{1i} - Y_{0i} \mid X_i = x]$
Analogously we have LATT, PATT, SATC, etcetera. Note that ATE, ATT and ATC implicitly refer to population values. $E_n[\cdot]$ is the sample mean, i.e. $E_n[x] = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, where the $E_n$ notation allows for the formulation of conditional and counterfactual means.


32 Bias in causal inference Using the shorthand $E_{01} = E[Y_{0i} \mid T_i = 1]$, etc., and taking $\pi$ as the population proportion that received the treatment, the average treatment effect is
$E[\delta] = \pi E[\delta \mid T_i = 1] + (1 - \pi) E[\delta \mid T_i = 0] = \pi (E_{11} - E_{01}) + (1 - \pi)(E_{10} - E_{00})$,
and the naive comparison can be decomposed into
$(E_{11} - E_{00}) = E[\delta] + (E_{01} - E_{00}) + (1 - \pi)\{(E_{11} - E_{01}) - (E_{10} - E_{00})\}$,
where
$(E_{11} - E_{00})$: observed difference in outcomes
$E[\delta]$: average treatment effect
$(E_{01} - E_{00})$: selection bias
$(1 - \pi)\{(E_{11} - E_{01}) - (E_{10} - E_{00})\}$: differential treatment effect bias
(Morgan & Winship 2007)
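The decomposition can be checked numerically. The sketch below plugs in hypothetical conditional means (the four `E..` values and `pi` are illustrative assumptions) and confirms that the three components add up to the observed difference.

```python
# Hypothetical conditional means E_ab = E[Y_a | T = b] (illustrative numbers).
E11, E01 = 125.0, 105.0   # treated group: mean Y1 and mean Y0
E10, E00 = 95.0, 85.0     # control group: mean Y1 and mean Y0
pi = 0.5                  # share of the population that is treated

att = E11 - E01                      # effect among the treated
atc = E10 - E00                      # effect among the controls
ate = pi * att + (1 - pi) * atc      # average treatment effect

observed_difference = E11 - E00
selection_bias = E01 - E00
differential_effect_bias = (1 - pi) * (att - atc)

print(ate, selection_bias, differential_effect_bias)  # 15.0 20.0 5.0
print(observed_difference)                            # 40.0
```

Here the observed difference of 40 consists of a true average effect of 15, a baseline difference of 20 between the groups, and 5 that arises because the treated benefit more from treatment than the controls would.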

33 SUTVA The stable unit treatment value assumption: "SUTVA is simply the a priori assumption that the value of Y for unit i when exposed to treatment t will be the same no matter what mechanism is used to assign treatment t to unit i and no matter what treatments the other units receive." (Rubin (1986: 961), as cited in Morgan & Winship (2007: 37))

34 Outline


36 When studying the effect of, say, T on Y by examining the statistical association between the two variables, we need to ascertain that the observed association is not caused by a third variable, say, X. We say that T and Y are confounded when there is a third variable X that influences both T and Y; such a variable is then called a confounder of T and Y. (Pearl 2000: )

37 Another way of saying this is that if $E(Y \mid T, X) \neq E(Y \mid T)$ and $E(T \mid X) \neq E(T)$, X is a confounder of the effect of T on Y. (Lee 2005: 44)

38 If healthier patients take a drug and sicker patients do not, we can find an association between drug and recovery even when the drug does not work. If sicker patients take a drug and healthier patients do not, we might not find an association between drug and recovery even when the drug works. association ≠ causation. The first example is also called a spurious effect (not to be confused with spurious regression).
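The first drug example can be made concrete with a small table of counts. All numbers below are hypothetical: healthier patients are more likely to take the drug, recovery depends only on health, and the drug itself does nothing. The function names are illustrative.

```python
# Hypothetical patient counts: (health, took_drug) -> (patients, recovered).
# Recovery depends only on health (80% vs 30%); the drug does nothing here.
counts = {
    ("healthy", True): (400, 320),
    ("healthy", False): (100, 80),
    ("sick", True): (100, 30),
    ("sick", False): (400, 120),
}


def recovery_rate(cells):
    patients = sum(n for n, _ in cells)
    recovered = sum(r for _, r in cells)
    return recovered / patients


def drug_difference(health=None):
    """Recovery rate with the drug minus without, optionally within a stratum."""
    treated = [v for (h, d), v in counts.items() if d and health in (None, h)]
    untreated = [v for (h, d), v in counts.items() if not d and health in (None, h)]
    return recovery_rate(treated) - recovery_rate(untreated)


print(round(drug_difference(), 2))           # 0.3
print(round(drug_difference("healthy"), 2))  # 0.0
print(round(drug_difference("sick"), 2))     # 0.0
```

Ignoring health, drug takers recover 30 percentage points more often; within each health stratum the difference vanishes, which is what a spurious effect looks like.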

39 Endogeneity The situation where $\mathrm{cor}(x, \varepsilon) \neq 0$ is called endogeneity. Endogeneity has three main causes: measurement error in X; simultaneity or reverse causation; omitted variables.


41 Note that confounding is a causal concept, not an associational one! X has to have a causal effect on T and X has to have a causal effect on Y for there to be an issue. (Pearl 2000)

42 Outline

43 X affects both T and Y = control (Lee 2005: 43-48)

44 Do control This is the typical case of a confounding factor, and hence should be eliminated through controlling.


46 X affects both T and Y = control
T affects Y, which in turn affects X = do not control
(Lee 2005: 43-48)


49 Don't control In this case, X is an effect of Y. By controlling for X, you can severely underestimate the effect of T on Y. Imagine that a college degree leads to a better income, which in turn leads to a nicer car. Controlling for the price of the car when estimating the effect of a college degree on income might cancel out the effect.
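A simulation sketch of the car example, under assumed numbers: a degree (T) raises income (Y) by 10 units, and car price (X) is income plus noise. The data-generating values, the seed, and the helper names are illustrative assumptions; the controlled coefficient is obtained by residualising both T and Y on X (the Frisch-Waugh approach) so that only the standard library is needed.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals from the OLS regression of y on x (with intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def partial_slope(x, y, z):
    """Coefficient on x in the regression of y on x and z."""
    return slope(residuals(z, x), residuals(z, y))


rng = random.Random(42)
n = 20000
T = [rng.randint(0, 1) for _ in range(n)]       # college degree
Y = [10.0 * t + rng.gauss(0, 5) for t in T]     # income; true effect is 10
X = [y + rng.gauss(0, 5) for y in Y]            # car price, caused by income

unadjusted = slope(T, Y)             # roughly 10
controlled = partial_slope(T, Y, X)  # roughly 5 under these assumptions

print(round(unadjusted, 2), round(controlled, 2))
```

With equal noise variances in income and car price, controlling for the car absorbs about half of the degree effect; the attenuation would be stronger if car price tracked income more closely.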

50 X affects both T and Y = control
T affects Y, which in turn affects X = do not control
T affects X, which in turn affects Y = do not control...
(Lee 2005: 43-48)


53 Don't control To get the overall effect of T on Y, you want to include the effect through X. E.g., if you want to know the effect of changing the policy regarding smoking in pubs on the amount of smoking in general, you do not care through what mechanism this happens (peer pressure, laziness, etc.), only about the overall effect.

54 X affects both T and Y = control
T affects Y, which in turn affects X = do not control
T affects X, which in turn affects Y = do not control, unless you explicitly want only the direct effect
(Lee 2005: 43-48)


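57 Maybe control Remember the following equation: $\tilde{\beta} = \beta + \phi\gamma$, where $\tilde{\beta}$ is the coefficient on T when X is left out and $\beta$ the coefficient on T when X is controlled for. Sometimes you are interested in $\beta$ (so control), sometimes in $\tilde{\beta}$ (so don't control).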
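The equation on this slide can be checked on simulated data. The sketch below reads it as (coefficient on T without X) = (coefficient on T with X) + φγ, and assumes that φ is the coefficient on X in the outcome regression and γ the slope of X on T; the slides do not say which symbol is which, and all numbers, the seed, and the helper names are illustrative.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals from the OLS regression of y on x (with intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def partial_slope(x, y, z):
    """Coefficient on x in the regression of y on x and z."""
    return slope(residuals(z, x), residuals(z, y))


rng = random.Random(1)
n = 5000
T = [rng.randint(0, 1) for _ in range(n)]
X = [1.5 * t + rng.gauss(0, 1) for t in T]                        # T affects X
Y = [2.0 * t + 3.0 * x + rng.gauss(0, 1) for t, x in zip(T, X)]   # T and X affect Y

beta_tilde = slope(T, Y)         # total effect, X not controlled (about 6.5)
beta = partial_slope(T, Y, X)    # direct effect, X controlled (about 2)
phi = partial_slope(X, Y, T)     # effect of X on Y given T (about 3)
gamma = slope(T, X)              # effect of T on X (about 1.5)

print(round(beta_tilde, 2), round(beta + phi * gamma, 2))
```

For least-squares estimates the two printed values agree up to rounding, because the relationship holds exactly in the sample and not only on average.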


61 Maybe control Example: A scholarship for poorer students might help them to get a college degree, which in turn might help them to earn more money later in life. Having a scholarship on your CV, however, might also further your career, independent of the effect of having a college degree. To see the overall effect of the scholarship, don't control for having a college degree. To see the effect of having a scholarship independent of the effect of getting a college degree, do control for college degree.

62 X affects both T and Y = control
T affects Y, which in turn affects X = do not control
T affects X, which in turn affects Y = do not control, unless you explicitly want only the direct effect
X affects Y, but not T, nor the effect of T on Y
(Lee 2005: 43-48)


64 Maybe control When X affects Y, but not T, there is no confounding issue and the estimates for the effect of T on Y should not be affected by the inclusion of X. However, including X in the model can still improve efficiency. (Gelman & Hill 2007: 177)
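The efficiency point can be illustrated with a small Monte Carlo sketch. It assumes X is independent of T and strongly predicts Y; the sample size, number of replications, coefficients, and helper names are illustrative choices.

```python
import random
import statistics


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals from the OLS regression of y on x (with intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


def partial_slope(x, y, z):
    """Coefficient on x in the regression of y on x and z."""
    return slope(residuals(z, x), residuals(z, y))


def one_study(rng, n=200):
    """Simulate one study; return the T estimate without and with X."""
    T = [rng.randint(0, 1) for _ in range(n)]
    X = [rng.gauss(0, 1) for _ in range(n)]   # affects Y, unrelated to T
    Y = [2.0 * t + 3.0 * x + rng.gauss(0, 1) for t, x in zip(T, X)]
    return slope(T, Y), partial_slope(T, Y, X)


rng = random.Random(7)
studies = [one_study(rng) for _ in range(300)]
unadjusted = [s[0] for s in studies]
adjusted = [s[1] for s in studies]

# Both estimators centre on the true effect of 2; the adjusted one varies less.
print(round(mean(unadjusted), 2), round(statistics.stdev(unadjusted), 2))
print(round(mean(adjusted), 2), round(statistics.stdev(adjusted), 2))
```

Across replications both sets of estimates average close to the true effect, while the spread of the adjusted estimates is clearly smaller, which is the efficiency gain the slide refers to.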

65 X affects both T and Y = control
T affects Y, which in turn affects X = do not control
T affects X, which in turn affects Y = do not control, unless you explicitly want only the direct effect
X affects Y, but not T, nor the effect of T on Y
X affects Y, not T, but it does affect the effect of T on Y (interaction)
(Lee 2005: 43-48)


67 Maybe control Here, including the interaction in your model can highlight how the effect differs across groups. Note that this affects the interpretation, but the estimate of the overall ATE is not affected by controlling for X.
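A minimal numeric sketch of the interaction case. The cell means are hypothetical, and the four cells are assumed to be equally large, so that T is unrelated to X.

```python
# Hypothetical mean outcomes by group X and treatment T (illustrative numbers).
# Each of the four cells is assumed to hold the same number of units.
cell_means = {
    (0, 0): 50.0, (0, 1): 55.0,   # group X = 0: effect of T is 5
    (1, 0): 60.0, (1, 1): 75.0,   # group X = 1: effect of T is 15
}


def effect_in_group(x):
    return cell_means[(x, 1)] - cell_means[(x, 0)]


# Overall difference in means between treated and untreated, ignoring X.
overall = ((cell_means[(0, 1)] + cell_means[(1, 1)]) / 2
           - (cell_means[(0, 0)] + cell_means[(1, 0)]) / 2)

# Average of the group-specific effects (groups are equally large).
averaged = (effect_in_group(0) + effect_in_group(1)) / 2

print(effect_in_group(0), effect_in_group(1))  # 5.0 15.0
print(overall, averaged)                       # 10.0 10.0
```

Splitting by X shows that the effect is three times as large in one group, while the overall average effect is the same whether or not X is taken into account.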

68 Outline

69 The ideal experiment To avoid any effect of covariates, the ideal is to randomly select participants for your research from the overall population (which enables inference to the population) and to randomly assign the treatment to these participants (which enables causal inference).
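Why random assignment enables causal inference can be sketched with a simulation. The setup is an assumption for illustration: an unobserved "ability" raises salary, the true effect of treatment is 5 for everyone, and treatment is assigned either by self-selection on ability or by a coin flip.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def difference_in_means(T, Y):
    treated = [y for t, y in zip(T, Y) if t == 1]
    control = [y for t, y in zip(T, Y) if t == 0]
    return mean(treated) - mean(control)


rng = random.Random(2013)
n = 20000
TRUE_EFFECT = 5.0

ability = [rng.gauss(0, 1) for _ in range(n)]            # unobserved covariate
y0 = [50 + 10 * a + rng.gauss(0, 5) for a in ability]    # outcome without treatment
y1 = [y + TRUE_EFFECT for y in y0]                       # outcome with treatment

# Self-selection: people with higher ability are more likely to be treated.
T_selected = [1 if a + rng.gauss(0, 1) > 0 else 0 for a in ability]
# Random assignment: a coin flip, unrelated to ability.
T_random = [rng.randint(0, 1) for _ in range(n)]


def observe(T):
    """Observed outcome: y1 for the treated, y0 for the untreated."""
    return [b if t == 1 else a for t, a, b in zip(T, y0, y1)]


est_selected = difference_in_means(T_selected, observe(T_selected))
est_random = difference_in_means(T_random, observe(T_random))

print(round(est_selected, 1))  # far above 5: includes selection bias
print(round(est_random, 1))    # close to the true effect of 5
```

Under self-selection the difference in means mixes the treatment effect with the ability gap between the groups; under random assignment the groups are comparable on ability, so the same difference in means recovers the effect.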

70 Experiment, field experiment, natural experiment, blocking, matching, regression, etc.


72 Kitchen sink A typical approach in the quantitative social sciences is to collect a number of different theories or hypotheses, add them all as variables to a regression, and see who wins. This is the "kitchen sink" approach (or "garbage can" approach). If anything, the above discussion should have made clear that to draw causal inferences, a clear distinction between treatment and covariates is crucial. In other words: focus your research! (Note that the "garbage can" phrase has also been used to argue against ignoring nonlinearities (Achen 2005), as opposed to careless specification of the causal effect.)

73 Kitchen sink Another way of putting the issue is that the above is all about trying to study the effect of a cause (treatment), rather than the cause of an effect. The latter is perhaps ill-defined and runs into the infinite regress of causation. (See Gerring (2001, 2012) for an extensive discussion of Y-centered and X-centered research.) (Gelman & Hill 2007: 187)

74 Causal diagrams The preceding examples underline how important it is to always draw out the causal diagram and to consider carefully how you select cases and controls when making causal inferences.


76 Equifinality Equifinality refers to the situation where a particular outcome might come about through different causal paths. No particular path might show a strong association between independent and dependent variable. Should this be seen as problematic for the counterfactual causal inference framework?

77 Outline

78 KKV's 5 rules
1. Construct falsifiable theories
2. Build theories that are internally consistent
3. Select dependent variables carefully: no endogenous relationship; ensure variation in the dependent variable
4. Maximize concreteness
5. State theories in as encompassing ways as feasible
(King, Keohane & Verba 1994: )

79 Gerring's criteria: clarity, manipulability, separation, independence (priority), impact, mechanism


83 Discussion points
How do causal inference and prediction relate? ("The proof of the pudding is in the eating"?)
How does the theory thus far translate to qualitative research? Does it?
Are all causal inferences counterfactual?
How does this all relate to causal mechanisms in the sense of Hedström & Swedberg?

Advanced Quantitative Methods: Causal inference

Advanced Quantitative Methods: Causal inference Advanced Quantitative Methods: Johan A. Elkink University College Dublin 2 March 2017 1 2 3 1 2 3 Inference In regression analysis we look at the relationship between (a set of) independent variable(s)

More information

Notes on causal effects

Notes on causal effects Notes on causal effects Johan A. Elkink March 4, 2013 1 Decomposing bias terms Deriving Eq. 2.12 in Morgan and Winship (2007: 46): { Y1i if T Potential outcome = i = 1 Y 0i if T i = 0 Using shortcut E

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

Quantitative Economics for the Evaluation of the European Policy

Quantitative Economics for the Evaluation of the European Policy Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti Davide Fiaschi Angela Parenti 1 25th of September, 2017 1 ireneb@ec.unipi.it, davide.fiaschi@unipi.it,

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

Empirical approaches in public economics

Empirical approaches in public economics Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental

More information

Predicting the Treatment Status

Predicting the Treatment Status Predicting the Treatment Status Nikolay Doudchenko 1 Introduction Many studies in social sciences deal with treatment effect models. 1 Usually there is a treatment variable which determines whether a particular

More information

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Gary King GaryKing.org April 13, 2014 1 c Copyright 2014 Gary King, All Rights Reserved. Gary King ()

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

The Causal Inference Problem and the Rubin Causal Model

The Causal Inference Problem and the Rubin Causal Model The Causal Inference Problem and the Rubin Causal Model Lecture 2 Rebecca B. Morton NYU Exp Class Lectures R B Morton (NYU) EPS Lecture 2 Exp Class Lectures 1 / 23 Variables in Modeling the E ects of a

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

Technical Track Session I:

Technical Track Session I: Impact Evaluation Technical Track Session I: Click to edit Master title style Causal Inference Damien de Walque Amman, Jordan March 8-12, 2009 Click to edit Master subtitle style Human Development Human

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Causal Inference. Prediction and causation are very different. Typical questions are:

Causal Inference. Prediction and causation are very different. Typical questions are: Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict Y after observing X = x Causation: Predict Y after setting X = x. Causation involves predicting

More information

The returns to schooling, ability bias, and regression

The returns to schooling, ability bias, and regression The returns to schooling, ability bias, and regression Jörn-Steffen Pischke LSE October 4, 2016 Pischke (LSE) Griliches 1977 October 4, 2016 1 / 44 Counterfactual outcomes Scholing for individual i is

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Econometric Causality

Econometric Causality Econometric (2008) International Statistical Review, 76(1):1-27 James J. Heckman Spencer/INET Conference University of Chicago Econometric The econometric approach to causality develops explicit models

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard

More information

Simple Regression Model. January 24, 2011

Simple Regression Model. January 24, 2011 Simple Regression Model January 24, 2011 Outline Descriptive Analysis Causal Estimation Forecasting Regression Model We are actually going to derive the linear regression model in 3 very different ways

More information

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS ANALYTIC COMPARISON of Pearl and Rubin CAUSAL FRAMEWORKS Content Page Part I. General Considerations Chapter 1. What is the question? 16 Introduction 16 1. Randomization 17 1.1 An Example of Randomization

More information

Differences-in- Differences. November 10 Clair

Differences-in- Differences. November 10 Clair Differences-in- Differences November 10 Clair The Big Picture What is this class really about, anyway? The Big Picture What is this class really about, anyway? Causality The Big Picture What is this class

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

Causality and Experiments

Causality and Experiments Causality and Experiments Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania April 13, 2009 Michael R. Roberts Causality and Experiments 1/15 Motivation Introduction

More information

Causal Analysis in Social Research

Causal Analysis in Social Research Causal Analysis in Social Research Walter R Davis National Institute of Applied Statistics Research Australia University of Wollongong Frontiers in Social Statistics Metholodogy 8 February 2017 Walter

More information

Sampling Distributions

Sampling Distributions Sampling Distributions Sampling Distribution of the Mean & Hypothesis Testing Remember sampling? Sampling Part 1 of definition Selecting a subset of the population to create a sample Generally random sampling

More information

Potential Outcomes and Causal Inference

Potential Outcomes and Causal Inference Potential Outcomes and Causal Inference PUBL0050 Week 1 Jack Blumenau Department of Political Science UCL 1 / 47 Lecture outline Causality and Causal Inference Course Outline and Logistics The Potential

More information

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha January 18, 2010 A2 This appendix has six parts: 1. Proof that ab = c d

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Causal inference in multilevel data structures:

Causal inference in multilevel data structures: Causal inference in multilevel data structures: Discussion of papers by Li and Imai Jennifer Hill May 19 th, 2008 Li paper Strengths Area that needs attention! With regard to propensity score strategies

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Controlling for Time Invariant Heterogeneity

Controlling for Time Invariant Heterogeneity Controlling for Time Invariant Heterogeneity Yona Rubinstein July 2016 Yona Rubinstein (LSE) Controlling for Time Invariant Heterogeneity 07/16 1 / 19 Observables and Unobservables Confounding Factors

More information

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Section 1 : Introduction to the Potential Outcomes Framework. Andrew Bertoli 4 September 2013

Section 1 : Introduction to the Potential Outcomes Framework. Andrew Bertoli 4 September 2013 Section 1 : Introduction to the Potential Outcomes Framework Andrew Bertoli 4 September 2013 Roadmap 1. Preview 2. Helpful Tips 3. Potential Outcomes Framework 4. Experiments vs. Observational Studies

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Simultaneous quations and Two-Stage Least Squares So far, we have studied examples where the causal relationship is quite clear: the value of the

More information

Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018

Lecture Discussion. Confounding, Non-Collapsibility, Precision, and Power Statistics Statistical Methods II. Presented February 27, 2018 , Non-, Precision, and Power Statistics 211 - Statistical Methods II Presented February 27, 2018 Dan Gillen Department of Statistics University of California, Irvine Discussion.1 Various definitions of

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

Job Training Partnership Act (JTPA)

Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments). Frank Venmans. Example of a randomized experiment: Job Training

PSC 504: Dynamic Causal Inference

Matthew Blackwell, 4/8/2013. The problem: Let's go back to a problem that we faced earlier, which is how to estimate causal effects with treatments that vary over time. We could

PSC 504: Instrumental Variables

Matthew Blackwell, 3/28/2013. Instrumental Variables and Structural Equation Modeling. Setup: The basic idea behind instrumental variables is that we have a treatment with unmeasured

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Kosuke Imai, Department of Politics, Princeton University, April 27, 2007

For more information about how to cite these materials visit

Author(s): Kerby Shedden, Ph.D., 2010. License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

What Causality Is (stats for mathematicians)

Andrew Critch, UC Berkeley, August 31, 2011. Introduction. Foreword: The value of examples. With any hard question, it helps to start with simple, concrete versions

The Economics of European Regions: Theory, Empirics, and Policy

Dipartimento di Economia e Management. Davide Fiaschi (davide.fiaschi@unipi.it) and Angela Parenti (aparenti@ec.unipi.it).

The Simple Linear Regression Model

Lesson 3. Ryan Safner, Department of Economics, Hood College. ECON 480 - Econometrics, Fall 2017. Bivariate

Introduction to bivariate analysis

When one measurement is made on each observation, univariate analysis is applied. If more than one measurement is made on each observation, multivariate analysis is applied.

1 Impact Evaluation: Randomized Controlled Trial (RCT)

Introductory Applied Econometrics, EEP/IAS 118, Fall 2013. Daley Kutzman. Section #12, 11-20-13. Warm-Up: Consider the two panel data regressions below, where i indexes individuals and t indexes time in months:

Front-Door Adjustment

Ethan Fosse, Princeton University, Fall 2016. 1 Preliminaries 2 Examples of Mechanisms in Sociology 3 Bias Formulas

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Goals for Course: To enable researchers to conduct careful quantitative analyses with existing VA (and non-VA)

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling

Mark Lunt, Arthritis Research UK Epidemiology Unit, University of Manchester, 14/11/2017. This Week: Categorical Variables. Categorical

A Decision Theoretic Approach to Causality

Vanessa Didelez, School of Mathematics, University of Bristol (based on joint work with Philip Dawid). Bordeaux, June 2011. Based on: Dawid & Didelez (2010). Identifying

Seminar how does one know if their approach/perspective is appropriate (in terms of being a student, not a professional)

Seminar 5, 10.00-11.00. Lieberson and Horwich (2008) argue that it is necessary to address and evaluate alternative causal explanations as a way of reaching consensus about the superiority of one or another

LECTURE 2: SIMPLE REGRESSION I

Introducing Simple Regression. Simple regression = regression with 2 variables. y: dependent variable, explained variable, response variable

Review session, Gov 2000

Overview: Random Variables and Probability; Univariate Statistics; Bivariate Statistics; Multivariate Statistics; Causal Inference

Teaching Causal Inference in Undergraduate Econometrics

October 24, 2012. Abstract: This paper argues that the current way in which the undergraduate introductory econometrics course is taught is neither

Business Statistics. Lecture 10: Correlation and Linear Regression

Scatterplot: A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the form

HYPOTHESIS TESTING. Hypothesis Testing

MBA 605 Business Analytics. Don Conant, PhD. Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

Describing Associations, Covariance, Correlation, and Causality. Interest Rates and Inflation. Data & Scatter Diagram. Lecture 4

Lecture Reading: Chapter 6 & SW11 (Readings in portal). Interest Rates and Inflation: At the heart of Canada's monetary policy framework

EMERGING MARKETS - Lecture 2: Methodology refresher

Maria Perrotta, April 4, 2013. SITE: http://www.hhs.se/site/pages/default.aspx. My contact: maria.perrotta@hhs.se. Aim of this class: There are many different

Introduction to Statistical Inference

Kosuke Imai, Princeton University, January 31, 2010. What is Statistics? Statistics

Sensitivity checks for the local average treatment effect

Martin Huber, March 13, 2014. University of St. Gallen, Dept. of Economics. Abstract: The nonparametric identification of the local average treatment

Combining Difference-in-difference and Matching for Panel Data Analysis

Weihua An, Departments of Sociology and Statistics, Indiana University, July 28, 2016. Research Interests: Network Analysis, Social

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Regression Discontinuity Design (RDD): Introduction (1). The design is a quasi-experimental design with the defining characteristic

The subjective worlds of Frequentist and Bayesian statistics

Chapter 2. 2.1 The deterministic nature of random coin throwing. Suppose that, in an idealised world, the ultimate fate of a thrown coin heads

On the Use of Linear Fixed Effects Regression Models for Causal Inference

Kosuke Imai, Department of Politics, Princeton University. Joint work with In Song Kim. Atlantic Causal Inference Conference, Johns Hopkins

Lecture: Difference-in-Difference (DID)

Motivation. Q: How to show a new medicine is effective? Naive answer: Give the new medicine to some patients (treatment group), and see what happens. This before-and-after

The problem of causality in microeconometrics.

Andrea Ichino, University of Bologna and CEPR, June 11, 2007. Contents: 1 The Problem of Causality. 1.1 A formal framework to think about causality

Instrumental Variables

Yona Rubinstein (LSE), July 2016. The Limitation of Panel Data: So far we learned how to account for selection on time invariant

The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest

Edward Vytlacil, Yale University. Renmin University, Department

Dynamics in Social Networks and Causality

Web Science & Technologies, University of Koblenz Landau, Germany. JProf. Dr. University Koblenz Landau, GESIS Leibniz Institute for the Social Sciences. Last Time:

Technical Track Session I: Causal Inference

Impact Evaluation. Human Development Network, Middle East and North Africa Region, World Bank Institute, Spanish Impact Evaluation Fund

Lecture 1: Probability Fundamentals

IB Paper 7: Probability and Statistics. Carl Edward Rasmussen, Department of Engineering, University of Cambridge, January 22nd, 2008

Causality and research design: experiments versus observation. Harry Ganzeboom Research Skills #3 November

Harry Ganzeboom, Research Skills #3, November 10 2008. Experiments: All scientific questions are implicitly explanatory questions and ask for causal

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Jeff Wooldridge, Cemmap Lectures, UCL, June 2009. 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses

Steven Bergner, Chris Demwell. Lecture notes for Cmpt 882 Machine Learning, February 19, 2004. Abstract: In these notes, a

Sociology 593 Exam 2 Answer Key March 28, 2002

I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why. A variable is called CATHOLIC. This probably

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Christopher Taber, Department of Economics, University of Wisconsin-Madison, September 6, 2017. Notation: First a word on notation. I like to use i subscripts on random variables to be clear

Regression - Modeling a response

We often wish to construct a model to: explain the association between two or more variables; predict the outcome of a variable given values of other variables. Regression

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded

1 Background. Latent confounder is common in social and behavioral science in which most of cases the selection mechanism

Difference-in-Differences Methods

Teppei Yamamoto, Keio University. Introduction to Causal Inference, Spring 2016. 1 Introduction: A Motivating Example 2 Identification 3 Estimation and Inference 4 Diagnostics

Experiments and Quasi-Experiments

(SW Chapter 13) Outline: 1. Potential Outcomes, Causal Effects, and Idealized Experiments 2. Threats to Validity of Experiments 3. Application: The Tennessee STAR Experiment

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Throughout, we consider the simplest case of a linear outcome equation, and homogeneous effects: $y = \beta x + \epsilon$ (1), where y is some outcome, x is an explanatory variable, and

Regression and Stats Primer

D. Alex Hughes, dhughes@ucsd.edu, March 1, 2012. Why Statistics? Theory, Hypotheses & Inference Research Design and Data-Gathering Mechanics of OLS Regression Mechanics Assumptions Practical Regression Interpretation

STAT Exam Jam Solutions. Contents

Contents: 1 First Day. Question 1: PDFs, CDFs, and Finding E(X), V(X). Question 2: Bayesian Inference. Question 3: Binomial to Normal

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Collaborators: A. James O'Malley, Tor Tosteson, Therese Stukel. Overview: 1. Instrumental variable

Discussion: Can We Get More Out of Experiments?

Kosuke Imai, Princeton University, September 4, 2010. APSA 2010 (Washington D.C.). Keele, McConnaughy, and White. Question:

G-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14) BIOS G-Estimation

BIOS 776. Outline: 14.1 The causal question revisited 14.2 Exchangeability revisited

Development. ECON 8830 Anant Nyshadham

Projections & Regressions. Linear Projections: If we have many potentially related (jointly distributed) variables: Outcome of interest Y; Explanatory variable of interest

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Kosuke Imai, Department of Politics, Princeton University, November 13, 2013. So far, we have essentially assumed

A little (more) about me.

Comparative Effectiveness Research Methods Training, Module 2: Research Designs. J. Michael Oakes, PhD, Associate Professor, Division of Epidemiology, University of Minnesota, oakes007@umn.edu

ECON Introductory Econometrics. Lecture 17: Experiments

ECON4150 - Introductory Econometrics. Lecture 17: Experiments. Monique de Haan (moniqued@econ.uio.no). Stock and Watson Chapter 13. Lecture outline: Why study experiments? The potential outcome framework.

Finding Instrumental Variables: Identification Strategies. Amine Ouazad Ass. Professor of Economics

Outline: 1. Before/After 2. Difference-in-difference estimation 3. Regression Discontinuity Design. BEFORE/AFTER

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil

James J. Heckman, Justin L. Tobias, Edward Vytlacil. Nuffield College, Oxford, August, 2005. 1 Introduction: This paper uses a latent variable