Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants

1 Near/Far Matching Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Joint research: Mike Baiocchi, Dylan Small, Scott Lorch and Paul Rosenbaum

2 What this talk is about We've developed an IV method that feels a lot like propensity score matching. Under some conditions, we can design studies with stronger instruments.

3 An Encouragement Design

4 References Instrumental Variables: Angrist, Imbens, Rubin: Identification of causal effects using instrumental variables (with Discussion). JASA 91, (1996) Encouragement Design: Holland, P.W.: Causal inference, path analysis, and recursive structural equations models. Sociological Methodology 18, (1988)

5 Note on Terminology Encouragement = Instrumental in selection

6 What's the difference: Strong vs. Weak

7-10 Example: Weaker IV [figure; numeric labels not recoverable]

11-13 Example: Stronger IV [figure; numeric labels not recoverable]

14 Example: Weaker IV [figure; numeric labels not recoverable]

15 Key Take-Aways People worry about weak instruments: they are easily biased, they produce large (sometimes HUGE) confidence intervals, and, if you're not careful, the confidence intervals you get can be inappropriate. Angrist & Krueger (1991) Does compulsory school attendance affect schooling and earnings? Near/far matching can create studies with stronger instruments. Near/far matching feels a lot like a randomized study with noncompliance.

16 Neonatal Intensive Care Units

17 Application: Regionalization Hospitals vary in their ability to care for premature infants. The American Academy of Pediatrics recognizes levels: 1, 2, 3A, 3B, 3C, 3D and Regional Centers. Regionalization of care refers to a policy that suggests or requires that high-risk mothers deliver at hospitals with greater levels of capabilities.

18 Application: Regionalization

19 Application: Regionalization L H

25 The task at hand Regionalization is complex Focus on estimating the difference in death rates

28-29 Outcome [figure]

30-31 The data Every baby delivered in a 10+ year period: California, Pennsylvania, Missouri. Mothers' information: ICD9 codes (delivery, post-delivery complications, some pre-delivery); some SES information; zip code of residence. Birth/death certificates. Census information: PA and MO have zip code level; CA will have block group. Pre-delivery severity?

32 Summary of Problem Want to quantify effect of level of NICU on rate of death Observational data Selection bias Some selection variables are unobserved

33 Instrument: Excess Travel Time L H

34 Instrument: Excess Travel Time L H Excess Travel Time

37 Instrument: Excess Travel Time L H McClellan, McNeil & Newhouse: "Does more intensive treatment of acute myocardial infarction reduce mortality?" JAMA. 272(11), September 1994

38 Fewer Pairs at Greater Distances

39 Our method a quick sketch Use the idea of block design / pair matching to control observed variation. Use the idea of instrumental variables/encouragement to control unobserved variation.

40 Our method: 1st step Summarize discrepancies in subjects' covariates. We used the Mahalanobis distance, with $S$ the covariance matrix of the covariates:

$$D_M(x_1, x_2) = (x_1 - x_2)^{\top} S^{-1} (x_1 - x_2)$$

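41 Our method: 1st step

$$D_x = \begin{pmatrix} d_{11} & d_{12} & d_{13} & \cdots & d_{1n} \\ d_{21} & d_{22} & & & \vdots \\ d_{31} & & \ddots & & \\ \vdots & & & & \\ d_{n1} & \cdots & & & d_{nn} \end{pmatrix}$$

$d_{ij}$ = Mahalanobis distance between preemies $i$ and $j$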
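The first step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `mahalanobis_matrix` and the toy covariates (birth weight, gestational age) are hypothetical, and the pseudo-inverse is a convenience choice that the slides do not mention.

```python
import numpy as np

def mahalanobis_matrix(X):
    """Return the n x n matrix with entries
    d_ij = (x_i - x_j)' S^{-1} (x_i - x_j), the form written on the slide."""
    X = np.asarray(X, dtype=float)
    S = np.atleast_2d(np.cov(X, rowvar=False))   # sample covariance of the covariates
    S_inv = np.linalg.pinv(S)                    # pseudo-inverse tolerates a singular S
    diff = X[:, None, :] - X[None, :, :]         # shape (n, n, p): all pairwise differences
    return np.einsum("ijp,pq,ijq->ij", diff, S_inv, diff)

# Hypothetical toy data: one row per preemie, columns = (birth weight in g, gestational age in weeks)
X = np.array([[1500.0, 30.0],
              [1550.0, 31.0],
              [2400.0, 35.0],
              [900.0, 27.0]])
D_x = mahalanobis_matrix(X)
print(D_x.round(2))
```

Because the covariance matrix rescales and decorrelates the covariates, two babies that look close in raw units are not necessarily close in this distance.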

42 Our method: 2nd step Create a penalty for pairing preemies with similar instrument values (e.g., calipers)

43-45 Our method: 2nd step

$$D_x = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & & \vdots \\ \vdots & & \ddots & \\ d_{n1} & \cdots & & d_{nn} \end{pmatrix}, \qquad C_z = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & & \vdots \\ \vdots & & \ddots & \\ c_{n1} & \cdots & & c_{nn} \end{pmatrix}$$

46 Instrument: Excess Travel Time H H Selection is potentially biased!

48 Instrument: Excess Travel Time H H Selection largely due to the instrument!

50-53 Our method: 2nd step

$$D_x + C_z = D$$

Diff Covariates (near) + Diff Encouragement (far) = Discrepancy Matrix (barrier to being paired), where $D_x = (d_{ij})$ and $C_z = (c_{ij})$ are the $n \times n$ matrices of covariate distances and instrument penalties.
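A minimal sketch of the second step follows. The slides do not give the exact penalty, so the form used here (a weight times the shortfall below a minimum instrument gap) is an assumption, as are the names `instrument_penalty`, `add_matrices`, `min_gap` and `weight` and all of the toy numbers.

```python
def instrument_penalty(z, min_gap, weight=1000.0):
    """C_z: penalize pairs whose instrument values differ by less than min_gap.
    The penalty grows with the shortfall; pairs at least min_gap apart cost nothing."""
    n = len(z)
    return [[weight * max(0.0, min_gap - abs(z[i] - z[j])) if i != j else 0.0
             for j in range(n)]
            for i in range(n)]

def add_matrices(A, B):
    """Entrywise sum, used for D = D_x + C_z."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

# Hypothetical covariate distances for four preemies (symmetric, zero diagonal)
D_x = [[0.0, 1.0, 3.0, 2.0],
       [1.0, 0.0, 2.0, 4.0],
       [3.0, 2.0, 0.0, 1.0],
       [2.0, 4.0, 1.0, 0.0]]
z = [5.0, 10.0, 40.0, 45.0]           # hypothetical excess travel times in minutes
C_z = instrument_penalty(z, min_gap=25.0)
D = add_matrices(D_x, C_z)
print(D[0][1], D[0][2])               # near in instrument vs. far in instrument
```

Babies 0 and 1 have the most similar covariates, yet their travel times differ by only 5 minutes, so the penalty makes that pair unattractive; babies 0 and 2 differ by 35 minutes and keep their plain covariate distance.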

54 Our method: 3rd step Something has got to give: As we force separation in the instrument, it will be more difficult to find preemies with similar covariates. Allow some subjects to be removed from the study design by matching to sinks.

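55 Our method: 3rd step Let $k$ = number of sinks. Then augment the matrix like so:

$$\begin{pmatrix} D & 0 \\ 0^{\top} & \infty \end{pmatrix}$$

$D$ = $n \times n$ discrepancy matrix, after the first two steps; $0$ = $n \times k$ matrix, with all entries $0$; $\infty$ = $k \times k$ matrix, with all entries $\infty$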
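The sink construction can be sketched as below. The exhaustive search is a stand-in chosen only to keep the example self-contained; it is feasible for a handful of subjects, whereas a real study would need a polynomial-time optimal nonbipartite matching routine. The names `add_sinks` and `best_pairing` and the toy matrix are hypothetical.

```python
import math

def add_sinks(D, k):
    """Augment an n x n discrepancy matrix with k sinks:
    zero cost between any subject and any sink, infinite cost between two sinks."""
    n = len(D)
    top = [list(row) + [0.0] * k for row in D]
    bottom = [[0.0] * n + [math.inf] * k for _ in range(k)]
    return top + bottom

def best_pairing(D):
    """Exhaustive minimum-total-discrepancy pairing (tiny, even-sized problems only)."""
    best = (math.inf, [])

    def search(unmatched, cost, pairs):
        nonlocal best
        if cost >= best[0]:          # prune: cannot beat the best pairing found so far
            return
        if not unmatched:
            best = (cost, pairs)
            return
        i, rest = unmatched[0], unmatched[1:]
        for idx, j in enumerate(rest):
            search(rest[:idx] + rest[idx + 1:], cost + D[i][j], pairs + [(i, j)])

    search(list(range(len(D))), 0.0, [])
    return best

# Hypothetical discrepancy matrix for four preemies; subject 3 is hard to match
D = [[0.0, 1.0, 9.0, 50.0],
     [1.0, 0.0, 8.0, 50.0],
     [9.0, 8.0, 0.0, 50.0],
     [50.0, 50.0, 50.0, 0.0]]
n, k = len(D), 2
cost, pairs = best_pairing(add_sinks(D, k))
kept = [(i, j) for i, j in pairs if i < n and j < n]   # pairs that stay in the study
print(cost, kept)
```

With two sinks, exactly two subjects leave the design, and the matching keeps the one pair that is cheapest to form.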

56 Two matched comparisons, one stronger and one weaker

57 The two matches Two matches 1) No sinks / no forced separation 2) 50% of babies matched to sinks / 25min separation

58-59 The two matches: Variables Excess Travel Time (i.e., Encouragement/Instrument); Pregnancy and birth variables; Mother variables; Mother's health insurance; Mother's neighborhood; Rare congenital anomalies; Year; Missing indicators. In total: 45 covariates

60 The two matches 1. Weak Instrument a) No sinks (99,174 pairs) b) No forced separation 2. Strong Instrument a) 50% of babies matched to sinks (49,587 pairs) b) 25min separation

61-62 Covariate balance in the two matches (numeric entries did not survive transcription)

Weaker Instrument: no sinks, number of pairs 99,174. Stronger Instrument: 50% of babies matched to sinks, number of pairs 49,587.

Columns, reported for each match: Encouraged Mean, Unencouraged Mean, Δ/sd.

Variable type: variables
Magnitude of encouragement: Excess travel time to high-level NICU (minutes)
Pregnancy and birth: Birthweight (grams); Gestational age (weeks); Gestational diabetes, 1/0; Prenatal care (month); Single birth, 1/0; Parity
Mother: Mother's education (scale); Mother's age; White, 1/0; Black, 1/0; Asian, 1/0; Other race, 1/0; Race missing, 1/0
Mother's neighborhood (zip code/census): Income ($1,000); Home value ($1,000); Has high school degree (fr); Has college degree (fr); Rent (fr); Below poverty (fr)

63 Application: Two matches (numeric entries did not survive transcription)

Weaker Instrument: no sinks, number of pairs 99,174. Stronger Instrument: 50% of babies matched to sinks, number of pairs 49,587.

Columns, reported for each match: Encouraged Mean, Unencouraged Mean, Δ/sd.

Variable type: variables
Magnitude of encouragement: Excess travel time to high-level NICU (minutes)
Delivery at a high-level NICU ($D_{ij}$): High-level NICU, 1/0
Infant mortality ($R_{ij}$): Dead, 1/0

64 Application to the Study of Perinatal Care

65-66 Application: Estimating λ Pop death rate: ~1.90%

Inference about the effect ratio, λ, under the assumption of random assignment of excess travel time within pairs matched for covariates

                       Weaker Instrument    Stronger Instrument
Pairs of two babies    99,174               49,587
Point estimate         0.92%                0.90%
95% CI                 (0.36%, 1.48%)       (0.57%, 1.23%)
Length of 95% CI       1.12%                0.66%

Bound, Jaeger & Baker (1995), Problems With Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable Is Weak, JASA, 90

67 General Method: Quantifying Departures from Random Assignment

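68 Sensitivity Analysis: Framework Up to this point we have assumed $\Pr(Z = z \mid \mathcal{F}, \mathcal{Z}) = 1/|\Omega|$ for each $z \in \Omega$. Now we will allow $\Pr(Z_{ij} = 1 \mid \mathcal{F}) = \pi_{ij}$. Consider matched pair $i$; then

$$\frac{1}{\Gamma} \le \frac{\pi_{ij}\,(1 - \pi_{ij'})}{\pi_{ij'}\,(1 - \pi_{ij})} \le \Gamma \quad \text{for all } i, j, j' \text{ with } x_{ij} = x_{ij'}.$$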

69 Sensitivity Analysis: Numerical Examples Γ = 1.25 corresponds to an unobserved covariate that doubles the odds of death and doubles the odds of treatment. Γ = 1.08 corresponds to one that doubles the odds of death and increases the odds of treatment by 25%.
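Both numerical examples agree with the usual two-parameter amplification of Γ, namely Γ = (ΛΔ + 1)/(Λ + Δ), where Λ multiplies the odds of treatment and Δ multiplies the odds of the outcome. The slide does not state this formula, so its use here is an assumption; the function name is hypothetical.

```python
def gamma_from_amplification(odds_treatment, odds_outcome):
    """Gamma implied by an unobserved covariate that multiplies the odds of
    treatment by odds_treatment and the odds of the outcome by odds_outcome."""
    return (odds_treatment * odds_outcome + 1.0) / (odds_treatment + odds_outcome)

# Doubling the odds of death and doubling the odds of treatment
print(round(gamma_from_amplification(2.0, 2.0), 2))    # 1.25
# Doubling the odds of death and a 25% increase in the odds of treatment
print(round(gamma_from_amplification(1.25, 2.0), 2))   # 1.08
```

Read this way, a small Γ still describes a fairly strong unobserved covariate, which helps when judging the sensitivity values reported for the two matches.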

70 Application to the Study of Regionalization of Perinatal Care

71-72 Application: Sensitivity Analysis Pop death rate: ~1.90%

Inference about the effect ratio, λ, with sensitivity analysis.

                       Weaker Instrument    Stronger Instrument
Pairs of two babies    99,174               49,587
Point estimate         0.92%                0.90%
95% CI                 (0.36%, 1.48%)       (0.57%, 1.23%)
Length of 95% CI       1.12%                0.66%
Sensitivity (Γ)        1.07                 >1.22

Small & Rosenbaum (2008), War and Wages: The Strength of Instrumental Variables and Their Sensitivity to Unobserved Biases, JASA, 103

73 What changes when an instrument is strengthened?

74 What changes? 1. Smaller study looks less like population Fewer black mothers Fewer renters That is, less urban

75-76 [Covariate balance table, repeated from slides 61-62]

77 What changes? 1. Smaller study looks less like population Fewer black mothers Fewer renters That is, less urban 2. Compliers change Larger study: 14 minutes difference Smaller study: 34 minutes difference

78 Why is it OK to throw out data?

79 Stronger instrument by design In some sense we already know this tradeoff: Larger studies with poor design and low compliance Vs. Smaller studies with good design and high levels of compliance

80 Stronger Instruments by Design

81 Our method Advantages Other researchers worry about weak instruments. Now we do something about it! More compliance leads to better estimates Less bias Tighter confidence intervals Sensitivity analysis Easier to explain Mimics an experimental design Avoids MLE (i.e., parametric assumptions)

82 The End More questions?

83 The End Baiocchi, Small, Lorch & Rosenbaum: Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants JASA. Dec 2010, Vol. 105, No. 492:

84 Pop death rate: ~1.90% Application: Estimating λ Inference about the effect ratio, λ, under the assumption of random assignment of excess travel time within pairs matched for covariates

85 Notation: Treatment Effects, Treatment Assignments

86 Notation Indices: $i$ denotes which pair; there are $I$ matched pairs, thus $i = 1, \ldots, I$. $j$ denotes which subject within the matched pair; thus $j = 1, 2$. Covariates: observed $x_{ij}$; unobserved $u_{ij}$.

87 Notation Matching on observed covariates: $x_{i1} = x_{i2}$ for all pairs $i$. But it may be that $u_{i1} \neq u_{i2}$.

88 Notation Instrument/Encouragement: $Z_{ij} = 1$ if subject $j$ in the $i$th pair was encouraged; $Z_{ij} = 0$ if subject $j$ in the $i$th pair was unencouraged. Note that, within a matched pair, $Z_{i1} + Z_{i2} = 1$. Potential outcomes framework (Neyman 1923; Rubin 1974)

89 Instrument Design [diagram linking Instrument (Z), Dose (D), and Response (R)]

90 Notation Dose $(d_{Tij}, d_{Cij})$: can't observe $d_{Tij} - d_{Cij}$; can observe $D_{ij} = Z_{ij} d_{Tij} + (1 - Z_{ij}) d_{Cij}$. Response $(r_{Tij}, r_{Cij})$: can't observe $r_{Tij} - r_{Cij}$; can observe $R_{ij} = Z_{ij} r_{Tij} + (1 - Z_{ij}) r_{Cij}$.

91 Notation Let $\mathcal{F} = \{(d_{Tij}, d_{Cij}, r_{Tij}, r_{Cij}, x_{ij}, u_{ij}),\ i = 1, \ldots, I,\ j = 1, 2\}$. Let $Z = (Z_{11}, Z_{12}, \ldots, Z_{I2})$.

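92 Notation Let $\mathcal{Z}$ be the event that $Z \in \Omega$. Let $\Omega$ be the set containing the $|\Omega| = 2^I$ possible values $z$ of $Z$. Then, in a randomized experiment: $\Pr(Z = z \mid \mathcal{F}, \mathcal{Z}) = 1/|\Omega|$ for each $z \in \Omega$.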

93 Effect Ratios

94 The Effect Ratio

$$\lambda = \frac{\sum_{i=1}^{I}\sum_{j=1}^{2} (r_{Tij} - r_{Cij})}{\sum_{i=1}^{I}\sum_{j=1}^{2} (d_{Tij} - d_{Cij})}$$

λ is a parameter of the $2I$ subjects, fixed under $\mathcal{F}$. λ is not directly observable. Under Fisher's sharp null hypothesis, $H_0: r_{Tij} = r_{Cij}$ for all $i, j$, it follows that λ = 0.

95 The Effect Ratio

$$\lambda = \frac{\sum_{i=1}^{I}\sum_{j=1}^{2} (r_{Tij} - r_{Cij})}{\sum_{i=1}^{I}\sum_{j=1}^{2} (d_{Tij} - d_{Cij})}$$

$\sum_{i=1}^{I}\sum_{j=1}^{2} (r_{Tij} - r_{Cij})$ is the effect of the encouragement on the response. $\sum_{i=1}^{I}\sum_{j=1}^{2} (d_{Tij} - d_{Cij})$ is the effect of the encouragement on the dose. λ is the ratio of these effects. If λ = 1/100, then for every 100 discouraged by distance from delivering at a high level NICU there is one additional infant death.
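A toy calculation may make the definition concrete. It pretends that both potential doses and both potential responses are known for every subject, which is never the case in practice; the function name and the numbers are hypothetical.

```python
def effect_ratio(subjects):
    """subjects: iterable of (d_T, d_C, r_T, r_C) potential doses and responses.
    Returns the sum of response effects divided by the sum of dose effects."""
    numerator = sum(r_T - r_C for d_T, d_C, r_T, r_C in subjects)
    denominator = sum(d_T - d_C for d_T, d_C, r_T, r_C in subjects)
    return numerator / denominator

# Hypothetical potential outcomes for 2 pairs (4 subjects)
subjects = [
    (1, 0, 1, 0),   # encouragement changes the dose and the response
    (1, 0, 0, 0),   # encouragement changes the dose only
    (0, 0, 0, 0),   # encouragement changes nothing
    (1, 1, 1, 1),   # dose and response are the same either way
]
print(effect_ratio(subjects))   # 1 response changed per 2 doses changed
```

Subjects whose dose does not respond to encouragement add nothing to the denominator, which is why a weak instrument leaves λ poorly determined.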

96 Inference about an Effect Ratio in a Randomized Experiment

97 Inference: Composite Null Consider $H_0^{(\lambda)}: \lambda = \lambda_0$. This is a composite null, because no assumptions are made on the distribution of $\mathcal{F}$. Need to consider the supremum over null hypotheses of the probability of rejection.

98 Models have incredible power Models force you to clarify your thinking. By writing down a model, you take a stand on how the world works. If the model is correct, there are very powerful mathematical machines you can deploy to get precise answers.

99 Models have incredible power If the model is wrong, then the power (in the statistical sense) of your inference is not credible.

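100 Inference: Test Statistic Consider the following test statistic:

$$T(\lambda_0) = \frac{1}{I}\sum_{i=1}^{I}\left\{\sum_{j=1}^{2} Z_{ij}(R_{ij} - \lambda_0 D_{ij}) - \sum_{j=1}^{2} (1 - Z_{ij})(R_{ij} - \lambda_0 D_{ij})\right\} = \frac{1}{I}\sum_{i=1}^{I} V_i(\lambda_0)$$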

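101 Inference: Unobserved to Observed Note that in $T(\lambda_0)$: if $Z_{ij} = 1$, then $R_{ij} - \lambda_0 D_{ij} = r_{Tij} - \lambda_0 d_{Tij}$, and if $Z_{ij} = 0$, then $R_{ij} - \lambda_0 D_{ij} = r_{Cij} - \lambda_0 d_{Cij}$. Thus we can write:

$$V_i(\lambda_0) = \sum_{j=1}^{2} Z_{ij}(r_{Tij} - \lambda_0 d_{Tij}) - \sum_{j=1}^{2} (1 - Z_{ij})(r_{Cij} - \lambda_0 d_{Cij})$$

For the variance of the test statistic:

$$S^2(\lambda_0) = \frac{1}{I(I-1)}\sum_{i=1}^{I}\left\{V_i(\lambda_0) - T(\lambda_0)\right\}^2$$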

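102 Inference: t-stat! Then, for each $k > 0$,

$$\limsup_{I \to \infty} \Pr\left\{\frac{T_I(\lambda)}{S_I(\lambda)} \ge k \,\Big|\, \mathcal{F}_I, \mathcal{Z}_I\right\} \le \Phi(-k), \qquad \limsup_{I \to \infty} \Pr\left\{\frac{T_I(\lambda)}{S_I(\lambda)} \le -k \,\Big|\, \mathcal{F}_I, \mathcal{Z}_I\right\} \le \Phi(-k).$$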
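The test statistic and its inversion can be sketched as follows. The data layout (each pair as two `(Z, R, D)` tuples), the function names and the toy pairs are assumptions made for illustration; a real analysis would also scan a grid of λ0 values to trace out the full confidence set.

```python
import math

def signed_sums(pairs):
    """Per-pair encouraged-minus-unencouraged differences in response R and dose D."""
    out = []
    for pair in pairs:
        diff_R = sum(R if Z == 1 else -R for Z, R, D in pair)
        diff_D = sum(D if Z == 1 else -D for Z, R, D in pair)
        out.append((diff_R, diff_D))
    return out

def t_and_s(pairs, lam0):
    """Return T(lam0) and S(lam0), using V_i(lam0) = diff_R_i - lam0 * diff_D_i."""
    V = [diff_R - lam0 * diff_D for diff_R, diff_D in signed_sums(pairs)]
    I = len(V)
    T = sum(V) / I
    S = math.sqrt(sum((v - T) ** 2 for v in V) / (I * (I - 1)))
    return T, S

def point_estimate(pairs):
    """The lam0 that solves T(lam0) = 0."""
    sums = signed_sums(pairs)
    return sum(r for r, _ in sums) / sum(d for _, d in sums)

def not_rejected(pairs, lam0, k=1.96):
    """True if lam0 stays in the two-sided confidence set |T/S| <= k."""
    T, S = t_and_s(pairs, lam0)
    return abs(T) <= k * S

# Hypothetical matched pairs; each subject is (Z, R, D) and Z sums to 1 within a pair
pairs = [
    ((1, 0, 1), (0, 1, 0)),
    ((1, 0, 1), (0, 0, 0)),
    ((0, 0, 0), (1, 0, 1)),
    ((1, 0, 0), (0, 0, 0)),
    ((0, 1, 1), (1, 0, 1)),
    ((1, 1, 1), (0, 0, 0)),
]
lam_hat = point_estimate(pairs)
print(lam_hat, t_and_s(pairs, lam_hat)[0], not_rejected(pairs, lam_hat))
```

The confidence interval is the set of λ0 that `not_rejected` keeps, so no parametric outcome model is needed.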

103 Visualizing our method

104-124 [Figure sequence; axes: Encouragement and Distance in Covariates]

104 Our method: two criteria for matching
105 Our method: near in covariates
106-107 Our method: far in covariates
108 Our method: good pair
109 Our method: ok pair
110 Our method: strength of instrument
111 Our method: strength of instrument (Weak IV)
112 Our method: strength of instrument (Medium IV)
113 Our method: strength of instrument (Strong IV)
114 Our method: good pair (Strong IV)
115-117 Our method: near in instrument (Strong IV)
118 Our method: near in instrument, Possible Bias (Strong IV)
119 Our method: near in instrument (Strong IV)
120-121 Our method: far in instrument (Strong IV)
122 Our method: far in instrument, Stronger Encouragement (Strong IV)
123-124 Find the experiment (Strong IV)

125-129 The framework [diagram linking Encouragement (Z), Treatment (T), and Response (R), with outcome labels "D" and "A"]

130-132 Clarifying who we're talking about Compliance classes by Encouragement (Z) and Treatment (T): Complier, Never Taker, Always Taker, Defier [table layout not recoverable]
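The four classes follow the standard definitions in terms of the two potential treatments. A small sketch, assuming `t_enc` is the treatment a subject would take if encouraged and `t_unenc` the treatment if not encouraged (both names are hypothetical):

```python
COMPLIANCE_CLASS = {
    (1, 0): "Complier",       # takes treatment only when encouraged
    (0, 0): "Never Taker",    # never takes treatment
    (1, 1): "Always Taker",   # always takes treatment
    (0, 1): "Defier",         # does the opposite of the encouragement
}

def compliance_class(t_enc, t_unenc):
    """Class implied by the pair of potential treatments (t_enc, t_unenc)."""
    return COMPLIANCE_CLASS[(t_enc, t_unenc)]

def possible_classes(Z, T):
    """Classes consistent with one observed (encouragement, treatment) pair.
    Only one potential treatment is observed, so two classes always remain."""
    return sorted(name for (t_enc, t_unenc), name in COMPLIANCE_CLASS.items()
                  if (t_enc if Z == 1 else t_unenc) == T)

print(compliance_class(1, 0))
print(possible_classes(1, 1))
```

The second function shows why the class of any single baby cannot be read off the data: an encouraged, treated baby may be a complier or an always taker.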

133 Building intuition for implementation

134-135 Pre-matching (… marks a value lost in transcription)

Type                      Variable                       High NICU   Low NICU   sd       Δ/sd
Outcome                   Mortality                      2.26%       1.25%      13.33%   0.08
Instrument                Difference in Travel Time      …           …          …        …
Treatment                 % attending high level NICU    100.0%      0.0%       49.7%    2.01
Preemie covariates        Birth weight                   2,…         …          …        …
                          Gestational age                …           …          …        …
% of preemies with type   GI                             0.9%        0.6%       8.7%     0.04
of congenital disorders   GU                             0.9%        0.8%       9.0%     0.01
                          CNS                            0.9%        0.4%       8.3%     0.05
                          Pulmonary                      0.8%        0.7%       8.8%     0.01
                          Cardio                         1.4%        0.7%       10.5%    0.06
                          Skeletal                       0.7%        0.9%       9.0%     …
                          Skin                           0.0%        0.0%       0.0%     0.00
                          Chromosomes                    0.4%        0.3%       6.3%     0.02
                          Other_Anomaly                  0.8%        0.1%       7.0%     0.09
                          Gestational_DiabetesM          4.9%        4.3%       21.0%    0.03
Mother covariates         Mother's education             …           …          …        …
                          Insurance - Fee for service    24.0%       24.5%      42.8%    …
                          Insurance - HMO                32.3%       27.8%      46.0%    0.10
                          Insurance - Government         23.5%       24.2%      42.6%    …
                          Insurance - Other              16.8%       21.4%      39.1%    …
                          Uninsured                      2.2%        1.6%       13.7%    0.04
                          Prenatal care                  …           …          …        …
                          Single birth (y/n)             79.0%       86.1%      38.3%    …
                          Parity                         …           …          …        …
                          Mother's age                   …           …          …        …
Census level covariates   Median income                  41,…        …          …        …
                          Median home value              97,…        …          …        …
                          % completed high school        79.9%       80.0%      9.7%     …
                          % completed college            22.2%       19.4%      13.1%    0.21
                          % renting                      31.4%       27.9%      12.8%    0.28
                          % below poverty line           13.4%       11.8%      9.9%     0.16
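The Δ/sd column can be checked by hand, assuming it is the difference between the two group means divided by the sd listed in the same row (the slide does not define it, so this reading is an assumption, and the function name is hypothetical).

```python
def standardized_difference(mean_high, mean_low, sd):
    """Difference in group means, in units of the listed standard deviation."""
    return (mean_high - mean_low) / sd

# Values taken from the pre-matching table, in percentage points
print(round(standardized_difference(2.26, 1.25, 13.33), 2))   # Mortality: 0.08
print(round(standardized_difference(100.0, 0.0, 49.7), 2))    # % attending high level NICU: 2.01
print(round(standardized_difference(22.2, 19.4, 13.1), 2))    # % completed college: 0.21
```

All three values reproduce the table entries, which supports this reading of the column.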

136 Covariates across the instrument (… marks a value lost in transcription)

Variable                       1st Quartile   2nd Quartile   3rd Quartile   4th Quartile   max(Δ/sd)
Mortality                      1.93%          2.08%          1.47%          1.74%          0.05
Difference in Travel Time      …              …              …              …              (3.19)
% attending high level NICU    81.1%          69.8%          49.9%          21.6%          1.20
Birth weight                   2,…            …              …              …              …
Gestational age                …              …              …              …              …

137 Post-matching Matched pairs: 49,587 (… marks a value lost in transcription)

Type                 Variable                       Encouraged Mean   Unencouraged Mean   sd       Δ/sd
Outcome              Mortality                      1.54%             1.94%               12.86%   …
Instrument           Difference in Travel Time      …                 …                   …        …
Treatment            % attending high level NICU    68.6%             25.4%               49.7%    0.87
Preemie covariates   Birth weight                   2,…               …                   …        …
                     Gestational age                …                 …                   …        …

138 Result

                       Weaker Instrument    Stronger Instrument
Pairs of two babies    99,174               49,587
Point estimate         0.92%                0.90%
95% CI                 (0.36%, 1.48%)       (0.57%, 1.23%)
Length of 95% CI       1.12%                0.66%
Sensitivity (Γ)        1.07                 >1.22

139 Final thoughts Near/far matching deals with binary outcomes; 2SLS can lead to logical absurdities when outcomes are binary. Increase the strength of the instrument: stronger instruments lead to more robust results. Sensitivity analysis for this method is available. This technique has the potential to be a philosopher's stone.

140 Why not use 2SLS? Linear probability models (LPMs) have trouble when your parameter values are up against the edge of parameter space

141-142 Trouble with time [figure: Quality of Care against Time for High level NICU and Low level NICU, with the treatment effect marked]

143 Modeling hospital choice

144-145 Instrument: Probabilities Empirical distributions?

146 Instrument: Probabilities Near Neighbors?

147-148 Instrument: Probabilities Conditional logistic model / Bayesian hierarchical modeling


More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S. P. Johnelle Sparks and Corey S. Sparks 1 Introduction Infant

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead

More information

Introduction to Panel Data Analysis

Introduction to Panel Data Analysis Introduction to Panel Data Analysis Youngki Shin Department of Economics Email: yshin29@uwo.ca Statistics and Data Series at Western November 21, 2012 1 / 40 Motivation More observations mean more information.

More information

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING ESTIMATION OF TREATMENT EFFECTS VIA MATCHING AAEC 56 INSTRUCTOR: KLAUS MOELTNER Textbooks: R scripts: Wooldridge (00), Ch.; Greene (0), Ch.9; Angrist and Pischke (00), Ch. 3 mod5s3 General Approach The

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Ramesh Yapalparvi It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And also sent out via email. 30% weekly homework.

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

150C Causal Inference

150C Causal Inference 150C Causal Inference Instrumental Variables: Modern Perspective with Heterogeneous Treatment Effects Jonathan Mummolo May 22, 2017 Jonathan Mummolo 150C Causal Inference May 22, 2017 1 / 26 Two Views

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Noncompliance in Randomized Experiments

Noncompliance in Randomized Experiments Noncompliance in Randomized Experiments Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Noncompliance in Experiments Stat186/Gov2002 Fall 2018 1 / 15 Encouragement

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Y : X M Y a=0 = Y a a m = Y a cum (a) : Y a = Y a=0 + cum (a) an unknown parameter. = 0, Y a = Y a=0 = Y for all subjects Rank

More information

Gov 2002: 5. Matching

Gov 2002: 5. Matching Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders

More information

Instrumental Variables in Action: Sometimes You get What You Need

Instrumental Variables in Action: Sometimes You get What You Need Instrumental Variables in Action: Sometimes You get What You Need Joshua D. Angrist MIT and NBER May 2011 Introduction Our Causal Framework A dummy causal variable of interest, i, is called a treatment,

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And

More information

Exact Nonparametric Inference for a Binary. Endogenous Regressor

Exact Nonparametric Inference for a Binary. Endogenous Regressor Exact Nonparametric Inference for a Binary Endogenous Regressor Brigham R. Frandsen December 5, 2013 Abstract This paper describes a randomization-based estimation and inference procedure for the distribution

More information

Dynamics in Social Networks and Causality

Dynamics in Social Networks and Causality Web Science & Technologies University of Koblenz Landau, Germany Dynamics in Social Networks and Causality JProf. Dr. University Koblenz Landau GESIS Leibniz Institute for the Social Sciences Last Time:

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Noncompliance in Randomized Experiments Often we cannot force subjects to take specific treatments Units

More information

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences Random Variables Christopher Adolph Department of Political Science and Center for Statistics and the Social Sciences University

More information

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling

More information

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling

More information

Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham

Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham Last name (family name): First name (given name):

More information

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Journal of Machine Learning Research 17 (2016) 1-35 Submitted 9/14; Revised 2/16; Published 2/16 Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Zijian Guo Department

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

PhD/MA Econometrics Examination January 2012 PART A

PhD/MA Econometrics Examination January 2012 PART A PhD/MA Econometrics Examination January 2012 PART A ANSWER ANY TWO QUESTIONS IN THIS SECTION NOTE: (1) The indicator function has the properties: (2) Question 1 Let, [defined as if using the indicator

More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors. EDF 7405 Advanced Quantitative Methods in Educational Research Data are available on IQ of the child and seven potential predictors. Four are medical variables available at the birth of the child: Birthweight

More information

University of Michigan School of Public Health

University of Michigan School of Public Health University of Michigan School of Public Health The University of Michigan Department of Biostatistics Working Paper Series Year 2016 Paper 121 A Weighted Instrumental Variable Estimator to Control for

More information

Quantitative Economics for the Evaluation of the European Policy

Quantitative Economics for the Evaluation of the European Policy Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti Davide Fiaschi Angela Parenti 1 25th of September, 2017 1 ireneb@ec.unipi.it, davide.fiaschi@unipi.it,

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Using Post Outcome Measurement Information in Censoring by Death Problems

Using Post Outcome Measurement Information in Censoring by Death Problems Using Post Outcome Measurement Information in Censoring by Death Problems Fan Yang University of Chicago, Chicago, USA. Dylan S. Small University of Pennsylvania, Philadelphia, USA. Summary. Many clinical

More information

What s New in Econometrics. Lecture 1

What s New in Econometrics. Lecture 1 What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Simultaneous quations and Two-Stage Least Squares So far, we have studied examples where the causal relationship is quite clear: the value of the

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

An Alternative Assumption to Identify LATE in Regression Discontinuity Design An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify

More information

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

Linear Probability Model

Linear Probability Model Linear Probability Model Note on required packages: The following code requires the packages sandwich and lmtest to estimate regression error variance that may change with the explanatory variables. If

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Efforts to Improve Quality of Care Stephen Jones, PhD Bio-statistical Research

More information

Job Training Partnership Act (JTPA)

Job Training Partnership Act (JTPA) Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans Example of a randomized experiment: Job Training

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2009 Paper 250 A Machine-Learning Algorithm for Estimating and Ranking the Impact of Environmental Risk

More information

THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL

THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL Statistica Sinica 10(2000), 517-544 THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL Mark E. Glickman and Sharon-Lise T. Normand Boston University and Harvard Medical School Abstract:

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

Variable selection and machine learning methods in causal inference

Variable selection and machine learning methods in causal inference Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of

More information

4.8 Instrumental Variables

4.8 Instrumental Variables 4.8. INSTRUMENTAL VARIABLES 35 4.8 Instrumental Variables A major complication that is emphasized in microeconometrics is the possibility of inconsistent parameter estimation due to endogenous regressors.

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

Covariate Balancing Propensity Score for General Treatment Regimes

Covariate Balancing Propensity Score for General Treatment Regimes Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian

More information

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators 1 2 Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE Hüseyin Taştan 1 1 Yıldız Technical University Department of Economics These presentation notes are based on Introductory

More information

Learning Representations for Counterfactual Inference. Fredrik Johansson 1, Uri Shalit 2, David Sontag 2

Learning Representations for Counterfactual Inference. Fredrik Johansson 1, Uri Shalit 2, David Sontag 2 Learning Representations for Counterfactual Inference Fredrik Johansson 1, Uri Shalit 2, David Sontag 2 1 2 Counterfactual inference Patient Anna comes in with hypertension. She is 50 years old, Asian

More information

ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS

ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS Background Independent observations: Short review of well-known facts Comparison of two groups continuous response Control group:

More information

Chapter 11. Correlation and Regression

Chapter 11. Correlation and Regression Chapter 11. Correlation and Regression The word correlation is used in everyday life to denote some form of association. We might say that we have noticed a correlation between foggy days and attacks of

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures Andrea Ichino (European University Institute and CEPR) February 28, 2006 Abstract This course

More information