Near/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants

1 Near/Far Matching Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Joint research: Mike Baiocchi, Dylan Small, Scott Lorch and Paul Rosenbaum

2 Classic set up

3-5 Simple Regression (diagram: Y, T, and in the final build ε)

6 Regression with problems (diagram: Y, T, ε, U)

7 Regression with serious problems (diagram: Y, T, ε, U)

8-11 IV techniques: the idea (diagram: Y, T, ε, Z, U)

12-13 IV techniques: the idea (diagram): Y = BMI, T = Exercise, Z = Free gym membership, U = Lethargy, with error term ε
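
To make the BMI example concrete, here is a minimal simulation sketch. Everything numeric in it is an assumption made for illustration (the effect size of -2.0, the coefficients, the sample size); the slides give no data-generating model. It contrasts a naive comparison of exercisers with non-exercisers against the simple ratio (Wald) form of the IV estimate, using the randomized free gym membership as Z.

```python
import random

def simulate(n=50000, seed=0):
    """Draw (z, t, y) triples from a hypothetical version of the BMI example."""
    rng = random.Random(seed)
    true_effect = -2.0  # assumed effect of exercise on BMI, for illustration only
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)               # U: lethargy, never observed
        z = 1 if rng.random() < 0.5 else 0    # Z: free gym membership, randomized
        # T: exercise is encouraged by Z and discouraged by lethargy
        t = 1 if 0.8 * z - u + rng.gauss(0.0, 1.0) > 0 else 0
        # Y: BMI depends on exercise and, separately, on lethargy
        y = 27.0 + true_effect * t + 1.5 * u + rng.gauss(0.0, 1.0)
        rows.append((z, t, y))
    return rows, true_effect

def mean(values):
    return sum(values) / len(values)

def naive_difference(rows):
    """Compare exercisers with non-exercisers, ignoring U."""
    return (mean([y for _, t, y in rows if t == 1])
            - mean([y for _, t, y in rows if t == 0]))

def wald_estimate(rows):
    """Ratio of the effect of Z on Y to the effect of Z on T."""
    y_gap = (mean([y for z, _, y in rows if z == 1])
             - mean([y for z, _, y in rows if z == 0]))
    t_gap = (mean([t for z, t, _ in rows if z == 1])
             - mean([t for z, t, _ in rows if z == 0]))
    return y_gap / t_gap

rows, true_effect = simulate()
naive = naive_difference(rows)
iv = wald_estimate(rows)
print(f"true {true_effect:.2f}, naive {naive:.2f}, IV {iv:.2f}")
```

In this sketch the naive comparison is pulled away from the assumed effect because lethargy drives both exercise and BMI, while the ratio estimate uses only the variation in exercise induced by Z.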

14 An Encouragement Design

15 References
Instrumental Variables: Angrist, Imbens, Rubin: Identification of causal effects using instrumental variables (with Discussion). JASA 91, (1996)
Encouragement Design: Holland, P.W.: Causal inference, path analysis, and recursive structural equations models. Sociological Methodology 18, (1988)

16 Note on Terminology Instrumental (Merriam-Webster): a) serving as a crucial means, agent, or tool b) of, relating to, or done with an instrument or tool

18 Note on Terminology Encouragement = Instrumental in selection

19 What's the difference: Strong vs. Weak

20-27 Example: Weak IV (slides 20-23 and 27) and Strong IV (slides 24-26) (figures)

29 Key Take-Aways
People worry about weak instruments. They are easily biased. They provide large (sometimes HUGE) confidence intervals. If you're not careful, you can get inappropriate confidence intervals.
Angrist & Krueger (1991), Does compulsory school attendance affect schooling and earnings?
Near/far matching can create studies with stronger instruments.
Near/far matching feels a lot like a randomized study with noncompliance.

30 Neonatal Intensive Care Units

31 Application: Regionalization Hospitals vary in their ability to care for premature infants. The American Academy of Pediatrics recognizes levels: 1, 2, 3A, 3B, 3C, 3D and Regional Centers. Regionalization of care refers to a policy that suggests or requires that high-risk mothers deliver at hospitals with greater levels of capabilities.

32-40 Application: Regionalization (map figure: two hospitals, each marked H)

41 The task at hand Regionalization is complex Focus on estimating the difference in death rates

44-45 Outcome (figure)

46-47 The data
Every baby delivered in a 10+ year period in California, Pennsylvania and Missouri.
Mothers' information: ICD9 codes (delivery, post-delivery complications, some pre-delivery); some SES information; zip code of residence.
Birth/death certificates.
Census information: PA and MO have zip code level; CA will have block group.
Pre-delivery severity?

48 Summary of Problem
Want to quantify the effect of level of NICU on rate of death.
Observational data: selection bias; some selection variables are unobserved.

49-53 Instrument: Excess Travel Time (map figure: two hospitals, each marked H, with the excess travel time labeled)

McClellan, McNeil & Newhouse, "Does more intensive treatment of acute myocardial infarction reduce mortality?" JAMA 272(11): , September 1994

54 Fewer Pairs at Greater Distances

55 Our method a quick sketch Use the idea of block design / pair matching to control observed variation. Use the idea of instrumental variables/encouragement to control unobserved variation.

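56 Our method: 1st step
Summarize discrepancies in subjects' covariates. We used Mahalanobis distance:

$$D_M(x_1, x_2) = (x_1 - x_2)^\top S^{-1} (x_1 - x_2)$$

57 Our method: 1st step

$$D_x = \begin{pmatrix} d_{11} & d_{12} & d_{13} & \cdots & d_{1n} \\ d_{21} & d_{22} & & & \vdots \\ d_{31} & & \ddots & & \\ \vdots & & & & \\ d_{n1} & \cdots & & & d_{nn} \end{pmatrix}$$

where $d_{ij}$ is the Mahalanobis distance between preemies $i$ and $j$.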
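
As a rough illustration of the first step, the sketch below builds the matrix of pairwise distances for two covariates using only the standard library. It assumes that S is the sample covariance matrix of the covariates, and the five "preemies" are hypothetical values, not study data. The function returns the quadratic form exactly as written on the slide (no square root).

```python
def sample_covariance(points):
    """2x2 sample covariance matrix S of two covariates."""
    n = len(points)
    m0 = sum(p[0] for p in points) / n
    m1 = sum(p[1] for p in points) / n
    s00 = sum((p[0] - m0) ** 2 for p in points) / (n - 1)
    s11 = sum((p[1] - m1) ** 2 for p in points) / (n - 1)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in points) / (n - 1)
    return [[s00, s01], [s01, s11]]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def mahalanobis(x1, x2, s_inv):
    """(x1 - x2)' S^{-1} (x1 - x2), the quadratic form written on the slide."""
    a = x1[0] - x2[0]
    b = x1[1] - x2[1]
    return (a * (s_inv[0][0] * a + s_inv[0][1] * b)
            + b * (s_inv[1][0] * a + s_inv[1][1] * b))

def distance_matrix(points):
    """n x n matrix D_x with entries d_ij."""
    s_inv = inverse_2x2(sample_covariance(points))
    return [[mahalanobis(p, q, s_inv) for q in points] for p in points]

# Hypothetical preemies: (birth weight in grams, gestational age in weeks)
preemies = [(1200, 28), (1250, 29), (2400, 35), (2300, 34), (1800, 31)]
d_x = distance_matrix(preemies)
for row in d_x:
    print(["%.2f" % value for value in row])
```

With 45 covariates, as in the study, a general matrix inverse would be needed in place of the 2x2 formula.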

58 Our method: 2nd step
Create a penalty for preemies with similar instrument values (e.g., calipers).

59-61 Our method: 2nd step

$$D_x = \begin{pmatrix} d_{11} & d_{12} & d_{13} & \cdots & d_{1n} \\ d_{21} & d_{22} & & & \vdots \\ d_{31} & & \ddots & & \\ \vdots & & & & \\ d_{n1} & \cdots & & & d_{nn} \end{pmatrix}, \qquad C_z = \begin{pmatrix} c_{11} & c_{12} & c_{13} & \cdots & c_{1n} \\ c_{21} & c_{22} & & & \vdots \\ c_{31} & & \ddots & & \\ \vdots & & & & \\ c_{n1} & \cdots & & & c_{nn} \end{pmatrix}$$

62-63 Instrument: Excess Travel Time (map figure: two hospitals, each marked H). Selection is potentially biased!

64-65 Instrument: Excess Travel Time (map figure: two hospitals, each marked H). Selection largely due to the instrument!

66-69 Our method: 2nd step
With $D_x = (d_{ij})$ and $C_z = (c_{ij})$ the $n \times n$ matrices of covariate distances and instrument penalties:

$$\underbrace{D_x}_{\text{Diff Covariates (near)}} + \underbrace{C_z}_{\text{Diff Encouragement (far)}} = \underbrace{D}_{\text{Discrepancy Matrix (barrier to being paired)}}$$
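
A minimal sketch of the second step follows. The slides say only that a penalty is created for preemies with similar instrument values "(e.g., calipers)", so the hard caliper below, the penalty size of 1000, and the toy numbers are all assumptions made for illustration, not the authors' penalty function.

```python
def instrument_penalty(z_i, z_j, min_separation, penalty):
    """Hypothetical caliper: charge `penalty` when instrument values are too close."""
    return penalty if abs(z_i - z_j) < min_separation else 0.0

def penalty_matrix(z, min_separation, penalty):
    """C_z: pairwise penalties; the diagonal is left at 0 because it is never used."""
    n = len(z)
    return [[0.0 if i == j
             else instrument_penalty(z[i], z[j], min_separation, penalty)
             for j in range(n)] for i in range(n)]

def add_matrices(a, b):
    """Entrywise sum, giving D = D_x + C_z."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Toy covariate distances and excess travel times (minutes) for three preemies
d_x = [[0.0, 0.5, 2.0],
       [0.5, 0.0, 1.0],
       [2.0, 1.0, 0.0]]
excess_travel_time = [0.0, 10.0, 40.0]

c_z = penalty_matrix(excess_travel_time, min_separation=25.0, penalty=1000.0)
d = add_matrices(d_x, c_z)
print(d)
```

Preemies 0 and 1 are close in covariates but only 10 minutes apart in the instrument, so the penalty makes them an unattractive pair; a pair is cheap only when it is near in covariates and far in the instrument.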

70 Our method: 3rd step
Something has got to give: as we force separation in the instrument, it will be more difficult to find preemies with similar covariates. Allow some subjects to be removed from the study design by matching to sinks.

71 Our method: 3rd step
Let $k$ = number of sinks. Then augment the matrix like so:

$$\begin{pmatrix} D & \mathbf{0} \\ \mathbf{0}^\top & \boldsymbol{\infty} \end{pmatrix}$$

where $D$ is the $n \times n$ discrepancy matrix after the first two steps, $\mathbf{0}$ is an $n \times k$ matrix with all entries 0, and $\boldsymbol{\infty}$ is a $k \times k$ matrix with all entries $\infty$.
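
The sink construction can be sketched on a toy problem. The slides do not say which matching algorithm is used, so the exhaustive search below is a stand-in that is only feasible for a handful of units; the distances are made up. Sinks cost 0 to pair with any subject and infinity to pair with each other, so each sink removes exactly one subject.

```python
import math

def augment_with_sinks(d, k):
    """Append k sinks: distance 0 subject-to-sink, infinite sink-to-sink."""
    n = len(d)
    size = n + k
    aug = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            aug[i][j] = d[i][j]
    for i in range(n, size):
        for j in range(n, size):
            aug[i][j] = math.inf
    return aug

def best_pairing(dist, units=None):
    """Exhaustive minimum-total-distance pairing of all units (tiny inputs only)."""
    if units is None:
        units = tuple(range(len(dist)))
    if not units:
        return 0.0, []
    first, rest = units[0], units[1:]
    best_cost, best_pairs = math.inf, None
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        cost, pairs = best_pairing(dist, remaining)
        total = dist[first][partner] + cost
        if total < best_cost:
            best_cost, best_pairs = total, [(first, partner)] + pairs
    return best_cost, best_pairs

# Toy discrepancy matrix for four subjects
d = [[0, 1, 9, 9],
     [1, 0, 9, 2],
     [9, 9, 0, 8],
     [9, 2, 8, 0]]

cost_all, pairs_all = best_pairing(d)                    # no sinks
cost_sink, pairs_sink = best_pairing(augment_with_sinks(d, 2))
kept = [(i, j) for i, j in pairs_sink if i < len(d) and j < len(d)]
print(cost_all, pairs_all)
print(cost_sink, kept)
```

Without sinks every subject must be paired, so the poorly matched subjects 2 and 3 are forced together; with two sinks they are dropped and only the close pair remains.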

72 Two matched comparisons, one stronger and one weaker

73 The two matches
1) No sinks / no forced separation
2) 50% of babies matched to sinks / 25 min separation

74-75 The two matches: Variables
Excess Travel Time (i.e., Encouragement/Instrument)
Pregnancy and birth variables
Mother variables
Mother's health insurance
Mother's neighborhood
Rare congenital anomalies
Year
Missing indicators
In total: 45 covariates

76 The two matches
1. Weak Instrument: a) No sinks (99,174 pairs); b) No forced separation
2. Strong Instrument: a) 50% of babies matched to sinks (49,587 pairs); b) 25 min separation

77-78 Covariate balance in the two matches
Weaker Instrument: no sinks; number of pairs: 99,174. Stronger Instrument: 50% of babies matched to sinks; number of pairs: 49,587.
Columns, for each match: Encouraged Mean, Unencouraged Mean, Δ/sd (table values not recoverable).
Magnitude of encouragement: Excess travel time to high-level NICU (minutes)
Pregnancy and birth: Birthweight (grams); Gestational age (weeks); Gestational diabetes, 1/0; Prenatal care (month); Single birth, 1/0; Parity
Mother: Mother's education (scale); Mother's age; White, 1/0; Black, 1/0; Asian, 1/0; Other race, 1/0; Race missing, 1/0
Mother's neighborhood (zip code/census): Income ($1,000); Home value ($1,000); Has high school degree (fr); Has college degree (fr); Rent (fr); Below poverty (fr)

79 Application: Two matches
Weaker Instrument: no sinks; number of pairs: 99,174. Stronger Instrument: 50% of babies matched to sinks; number of pairs: 49,587.
Columns, for each match: Encouraged Mean, Unencouraged Mean, Δ/sd (table values not recoverable).
Magnitude of encouragement: Excess travel time to high-level NICU (minutes)
Delivery at a high-level NICU ($D_{ij}$): High-level NICU, 1/0
Infant mortality ($R_{ij}$): Dead, 1/0

80 Notation: Treatment Effects, Treatment Assignments

81 WARNING It's about to get mathy.

82 Notation
Indices: $i$ denotes which pair; there are $I$ matched pairs, thus $i = 1, \ldots, I$. $j$ denotes which subject within the matched pair, thus $j = 1, 2$.
Covariates: observed $x_{ij}$; unobserved $u_{ij}$.

83 Notation
Matching on observed covariates: $x_{i1} = x_{i2}$ for all pairs $i$. But it may be that $u_{i1} \neq u_{i2}$.

84 Notation
Instrument/Encouragement: $Z_{ij} = 1$ if subject $j$ in the $i$th pair was encouraged; $Z_{ij} = 0$ if subject $j$ in the $i$th pair was unencouraged. Note that, within a matched pair, $Z_{i1} + Z_{i2} = 1$.
Potential outcomes framework (Neyman 1923; Rubin 1974)

85 Encouragement Design (tree diagram: Encouragement (Z) branches to Treatment (T), and each Treatment branch to an Outcome (Y))

86 Instrument Design (the same tree with the labels Instrument (Z), Dose (D), Response (R))

87 Notation
Dose $(d_{Tij}, d_{Cij})$. Can't observe: $d_{Tij} - d_{Cij}$. Can observe: $D_{ij} = Z_{ij} d_{Tij} + (1 - Z_{ij}) d_{Cij}$.
Response $(r_{Tij}, r_{Cij})$. Can't observe: $r_{Tij} - r_{Cij}$. Can observe: $R_{ij} = Z_{ij} r_{Tij} + (1 - Z_{ij}) r_{Cij}$.

88 Instrument Design (tree diagram: Instrument (Z), Dose (D), Response (R))

89-94 Instrument Design (tree diagrams with potential doses $D_T$, $D_C$ and potential responses $R_T$, $R_C$, highlighting in turn: Complier, Always-Taker, Never-Taker, Defier, Complier)

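95 Notation
Let $\mathcal{F} = \{(d_{Tij}, d_{Cij}, r_{Tij}, r_{Cij}, x_{ij}, u_{ij}),\ i = 1, \ldots, I,\ j = 1, 2\}$.
Let $\mathbf{Z} = (Z_{11}, Z_{12}, \ldots, Z_{I2})^\top$.

96 Notation
Let $\Omega$ be the set containing the $|\Omega| = 2^I$ possible values $\mathbf{z}$ of $\mathbf{Z}$, and let $\mathcal{Z}$ be the event that $\mathbf{Z} \in \Omega$. Then, in a randomized experiment:

$$\Pr(\mathbf{Z} = \mathbf{z} \mid \mathcal{F}, \mathcal{Z}) = 1/|\Omega| \quad \text{for each } \mathbf{z} \in \Omega$$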

97 Effect Ratios

98-99 The Effect Ratio

$$\lambda = \frac{\sum_{i=1}^{I}\sum_{j=1}^{2}(r_{Tij} - r_{Cij})}{\sum_{i=1}^{I}\sum_{j=1}^{2}(d_{Tij} - d_{Cij})}$$

$\lambda$ is a parameter of the $2I$ subjects, fixed under $\mathcal{F}$. $\lambda$ is not directly observable. Under Fisher's sharp null hypothesis, $H_0: r_{Tij} = r_{Cij}$ for all $i, j$, it follows that $\lambda = 0$.
$\sum_{i=1}^{I}\sum_{j=1}^{2}(r_{Tij} - r_{Cij})$ is the effect of encouragement on response. $\sum_{i=1}^{I}\sum_{j=1}^{2}(d_{Tij} - d_{Cij})$ is the effect of the encouragement on the dose. $\lambda$ is the ratio of these effects.
If $\lambda = 1/100$ then for every 100 discouraged by distance from delivering at a high-level NICU there is one additional infant death.
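
A toy calculation may help fix the definition. The four subjects below are hypothetical and list both potential doses and both potential responses, which is exactly what cannot be done with real data; the sketch only shows how λ is assembled from those quantities and how the observed (D, R) depend on Z.

```python
def effect_ratio(subjects):
    """Effect of encouragement on response divided by its effect on dose."""
    response_effect = sum(s["r_T"] - s["r_C"] for s in subjects)
    dose_effect = sum(s["d_T"] - s["d_C"] for s in subjects)
    return response_effect / dose_effect

def observe(subject, z):
    """Observed dose and response: D = Z d_T + (1 - Z) d_C, R = Z r_T + (1 - Z) r_C."""
    dose = z * subject["d_T"] + (1 - z) * subject["d_C"]
    response = z * subject["r_T"] + (1 - z) * subject["r_C"]
    return dose, response

# Hypothetical potential doses and responses for 2 pairs (4 subjects)
subjects = [
    {"d_T": 1, "d_C": 0, "r_T": 0, "r_C": 1},  # dose and response both change
    {"d_T": 1, "d_C": 0, "r_T": 0, "r_C": 0},  # dose changes, response does not
    {"d_T": 1, "d_C": 1, "r_T": 0, "r_C": 0},  # dose unaffected by encouragement
    {"d_T": 0, "d_C": 0, "r_T": 1, "r_C": 1},  # dose unaffected by encouragement
]
lam = effect_ratio(subjects)

# Under the sharp null every subject has r_T equal to r_C, so the ratio is 0
null_subjects = [dict(s, r_T=s["r_C"]) for s in subjects]
lam_null = effect_ratio(null_subjects)
print(lam, lam_null, observe(subjects[0], 1), observe(subjects[0], 0))
```

Only one of each subject's two (dose, response) pairs is ever observed, which is why λ has to be inferred rather than computed.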

100 Inference about an Effect Ratio in a Randomized Experiment

101 Inference: Composite Null
Consider $H_0^{(\lambda)}: \lambda = \lambda_0$. This is a composite null, because no assumptions are made on the distribution of $\mathcal{F}$. We need to consider the supremum over null hypotheses of the probability of rejection.

102 Models have incredible power Models force you to clarify your thinking. By writing down a model, you take a stand on how the world works. If the model is correct, there are very powerful mathematical machines you can deploy to get precise answers.

103 Models have incredible power If the model is wrong, then the power (in the statistical sense) of your inference is not credible.

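104 Inference: Test Statistic
Consider the following test statistic:

$$T(\lambda_0) = \frac{1}{I}\sum_{i=1}^{I}\left\{\sum_{j=1}^{2} Z_{ij}(R_{ij} - \lambda_0 D_{ij}) - \sum_{j=1}^{2}(1 - Z_{ij})(R_{ij} - \lambda_0 D_{ij})\right\} = \frac{1}{I}\sum_{i=1}^{I} V_i(\lambda_0)$$

105 Inference: Unobserved to Observed
Note that in $T(\lambda_0)$: if $Z_{ij} = 1$, then $R_{ij} - \lambda_0 D_{ij} = r_{Tij} - \lambda_0 d_{Tij}$, and if $Z_{ij} = 0$, then $R_{ij} - \lambda_0 D_{ij} = r_{Cij} - \lambda_0 d_{Cij}$. Thus we can write:

$$V_i(\lambda_0) = \sum_{j=1}^{2} Z_{ij}(r_{Tij} - \lambda_0 d_{Tij}) - \sum_{j=1}^{2}(1 - Z_{ij})(r_{Cij} - \lambda_0 d_{Cij})$$

For the variance of the test statistic:

$$S^2(\lambda_0) = \frac{1}{I(I-1)}\sum_{i=1}^{I}\{V_i(\lambda_0) - T(\lambda_0)\}^2$$

106 Inference: t-stat!
Then, for each $k > 0$,

$$\limsup_{I \to \infty} \Pr\left\{\frac{T_I(\lambda_0)}{S_I(\lambda_0)} \ge k \,\Big|\, \mathcal{F}_I, \mathcal{Z}_I\right\} \le \Phi(-k), \qquad \limsup_{I \to \infty} \Pr\left\{\frac{T_I(\lambda_0)}{S_I(\lambda_0)} \le -k \,\Big|\, \mathcal{F}_I, \mathcal{Z}_I\right\} \le \Phi(-k).$$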
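
The statistics above are simple to compute from matched pairs. The sketch below uses simulated pairs (the data-generating probabilities are assumptions, not study values) and obtains a 95% interval by keeping every λ0 on a grid for which |T(λ0)| ≤ 1.96 S(λ0). The grid search is a convenience of this sketch; the slides do not say how the authors inverted the test.

```python
import random

def make_pairs(n_pairs=2000, seed=1):
    """Hypothetical pairs: ((R, D) of encouraged subject, (R, D) of unencouraged)."""
    rng = random.Random(seed)

    def subject(p_dose):
        d = 1 if rng.random() < p_dose else 0
        r = 1 if rng.random() < 0.05 - 0.02 * d else 0
        return (r, d)

    return [(subject(0.7), subject(0.3)) for _ in range(n_pairs)]

def t_and_s(pairs, lam0):
    """T(lam0) and S(lam0) from the pair differences V_i(lam0)."""
    v = [(re - lam0 * de) - (ru - lam0 * du) for (re, de), (ru, du) in pairs]
    n = len(v)
    t = sum(v) / n
    s2 = sum((vi - t) ** 2 for vi in v) / (n * (n - 1))
    return t, s2 ** 0.5

def point_estimate(pairs):
    """The value of lam0 that solves T(lam0) = 0."""
    num = sum(re - ru for (re, _), (ru, _) in pairs)
    den = sum(de - du for (_, de), (_, du) in pairs)
    return num / den

def confidence_interval(pairs, grid, crit=1.96):
    """Range of grid values not rejected by the two-sided test."""
    kept = []
    for lam0 in grid:
        t, s = t_and_s(pairs, lam0)
        if abs(t) <= crit * s:
            kept.append(lam0)
    return min(kept), max(kept)

pairs = make_pairs()
lam_hat = point_estimate(pairs)
grid = [lam_hat + (k - 200) * 0.0005 for k in range(401)]
lo, hi = confidence_interval(pairs, grid)
print(f"estimate {lam_hat:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")
```

With a very weak instrument the set of non-rejected values can be long or even unbounded, in which case the grid endpoints would be reported and the grid should be widened.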

107 Application to the Study of Perinatal Care

108-109 Application: Estimating λ
Pop death rate: ~1.90%
Inference about the effect ratio, λ, under the assumption of random assignment of excess travel time within pairs matched for covariates

                    Weaker Instrument             Stronger Instrument
                    99,174 Pairs of Two Babies    49,587 Pairs of Two Babies
Point Estimate      0.92%                         0.90%
95% CI              (0.36%, 1.48%)                (0.57%, 1.23%)
Length of 95% CI    1.12%                         0.66%

Bound, Jaeger & Baker (1995), Problems With Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable Is Weak, JASA, 90,

110 General Method: Quantifying Departures from Random Assignment

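111 Sensitivity Analysis: Framework
Up to this point we have assumed $\Pr(\mathbf{Z} = \mathbf{z} \mid \mathcal{F}, \mathcal{Z}) = 1/|\Omega|$ for each $\mathbf{z} \in \Omega$. Now we will allow $\Pr(Z_{ij} = 1 \mid \mathcal{F}) = \pi_{ij}$. Consider matched pair $i$; then

$$\frac{1}{\Gamma} \le \frac{\pi_{ij}(1 - \pi_{ij'})}{\pi_{ij'}(1 - \pi_{ij})} \le \Gamma \quad \text{for all } i, j, j' \text{ with } x_{ij} = x_{ij'}.$$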
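
One way to read this bound, sketched here under the additional assumption (standard in this style of sensitivity analysis, but not stated on the slide) that the two subjects in a pair are assigned independently before conditioning on $Z_{i1} + Z_{i2} = 1$: the odds-ratio bound becomes a bound on the probability that the first subject in the pair is the encouraged one.

```latex
\begin{align*}
\Pr(Z_{i1} = 1 \mid Z_{i1} + Z_{i2} = 1)
  &= \frac{\pi_{i1}(1 - \pi_{i2})}{\pi_{i1}(1 - \pi_{i2}) + \pi_{i2}(1 - \pi_{i1})}
   = \frac{\omega_i}{1 + \omega_i},
  \qquad \omega_i = \frac{\pi_{i1}(1 - \pi_{i2})}{\pi_{i2}(1 - \pi_{i1})}, \\
\frac{1}{\Gamma} \le \omega_i \le \Gamma
  &\;\Longrightarrow\;
  \frac{1}{1 + \Gamma} \le \Pr(Z_{i1} = 1 \mid Z_{i1} + Z_{i2} = 1) \le \frac{\Gamma}{1 + \Gamma}.
\end{align*}
```

Setting $\Gamma = 1$ gives probability $1/2$ for each subject, which is the randomized experiment; larger $\Gamma$ allows larger departures from it.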

112 Sensitivity Analysis: Numerical Examples
Gamma of 1.25: doubling of the odds of death, doubling of the odds of treatment.
Gamma of 1.08: doubling of the odds of death, increase of 25% of the odds of treatment.
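
The two numerical examples are consistent with the "amplification" of Γ into two parameters, Γ = (ΛΔ + 1)/(Λ + Δ), where Λ multiplies the odds of treatment assignment and Δ multiplies the odds of the outcome. The slide does not name this formula, so treating it as the source of the numbers is an assumption; the sketch below just checks the arithmetic.

```python
def amplified_gamma(assignment_odds, outcome_odds):
    """Gamma implied by an unobserved covariate that multiplies the odds of
    treatment by `assignment_odds` and the odds of the outcome by `outcome_odds`."""
    return (assignment_odds * outcome_odds + 1.0) / (assignment_odds + outcome_odds)

# Doubling the odds of death and doubling the odds of treatment
gamma_a = amplified_gamma(2.0, 2.0)

# Doubling the odds of death and a 25% increase in the odds of treatment
gamma_b = amplified_gamma(1.25, 2.0)

print(round(gamma_a, 2), round(gamma_b, 2))
```

Read this way, a seemingly small Γ such as 1.08 still corresponds to an unobserved covariate with a sizeable association with the outcome.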

113 Application to the Study of Regionalization of Perinatal Care

114-115 Application: Sensitivity Analysis
Pop death rate: ~1.90%
Inference about the effect ratio, λ, with sensitivity analysis.

                    Weaker Instrument             Stronger Instrument
                    99,174 Pairs of Two Babies    49,587 Pairs of Two Babies
Point Estimate      0.92%                         0.90%
95% CI              (0.36%, 1.48%)                (0.57%, 1.23%)
Length of 95% CI    1.12%                         0.66%
Sensitivity (Γ)     1.07                          >1.22

Small & Rosenbaum (2008), War and Wages: The Strength of Instrumental Variables and Their Sensitivity to Unobserved Biases, JASA, 103,

116 What changes when an instrument is strengthened?

117 What changes? 1. Smaller study looks less like population Fewer black mothers Fewer renters That is, less urban

118-119 Covariate balance table for the weaker and stronger instrument matches (repeated from slides 77-78)

120 What changes? 1. Smaller study looks less like population Fewer black mothers Fewer renters That is, less urban 2. Compliers change Larger study: 14 minutes difference Smaller study: 34 minutes difference

121 Why is it OK to throw out data?

122 Stronger instrument by design In some sense we already know this tradeoff: Larger studies with poor design and low compliance Vs. Smaller studies with good design and high levels of compliance

123 Stronger Instruments by Design

124 Our method Advantages Other researchers worry about weak instruments. Now we do something about it! More compliance leads to better estimates Less bias Tighter confidence intervals Sensitivity analysis Easier to explain Mimics an experimental design Avoids MLE (i.e., parametric assumptions)

125 The End More questions?

126 The End Baiocchi, Small, Lorch & Rosenbaum: Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants JASA. Dec 2010, Vol. 105, No. 492:

128 Visualizing our method

129-149 Our method (scatterplot sequence with axes Encouragement and Distance in Covariates). Panel titles, in order: two criteria for matching; near in covariates; far in covariates; good pair; ok pair; strength of instrument (Weak IV, Medium IV, Strong IV); good pair; near in instrument (Possible Bias); far in instrument (Stronger Encouragement); Find the experiment.

150-154 The framework (tree diagram: Encouragement (Z) branches to Treatment (T), and each Treatment branch to a Response (R), with outcomes labeled D and A)

155-157 Clarifying who we're talking about (table of compliance classes by Encouragement (Z) and Treatment (T): Complier, Never Taker, Always Taker, Defier; cell values not recoverable)

158 Building intuition for implementation

159-160 Pre-matching
Rows without numbers, and rows without a final Δ/sd entry, lost those values in transcription.

Variable                        Type        High NICU   Low NICU   sd       Δ/sd
Mortality                       Outcome     2.26%       1.25%      13.33%   0.08
Difference in Travel Time       Instrument
% attending high level NICU     Treatment   100.0%      0.0%       49.7%    2.01

Preemie covariates
Birth weight
Gestational age

% of preemies with type of congenital disorders
GI                                          0.9%        0.6%       8.7%     0.04
GU                                          0.9%        0.8%       9.0%     0.01
CNS                                         0.9%        0.4%       8.3%     0.05
Pulmonary                                   0.8%        0.7%       8.8%     0.01
Cardio                                      1.4%        0.7%       10.5%    0.06
Skeletal                                    0.7%        0.9%       9.0%
Skin                                        0.0%        0.0%       0.0%     0.00
Chromosomes                                 0.4%        0.3%       6.3%     0.02
Other_Anomaly                               0.8%        0.1%       7.0%     0.09
Gestational_DiabetesM                       4.9%        4.3%       21.0%    0.03

Mother covariates
Mother's education
Insurance - Fee for service                 24.0%       24.5%      42.8%
Insurance - HMO                             32.3%       27.8%      46.0%    0.10
Insurance - Government                      23.5%       24.2%      42.6%
Insurance - Other                           16.8%       21.4%      39.1%
Uninsured                                   2.2%        1.6%       13.7%    0.04
Prenatal care
Single birth (y/n)                          79.0%       86.1%      38.3%
Parity
Mother's age

Census level covariates
Median income
Median home value
% completed high school                     79.9%       80.0%      9.7%
% completed college                         22.2%       19.4%      13.1%    0.21
% renting                                   31.4%       27.9%      12.8%    0.28
% below poverty line                        13.4%       11.8%      9.9%     0.16

161 Covariates across the instrument

Variable                        1st Quartile   2nd Quartile   3rd Quartile   4th Quartile   max(Δ/sd)
Mortality                       1.93%          2.08%          1.47%          1.74%          0.05
Difference in Travel Time (3.19)
% attending high level NICU     81.1%          69.8%          49.9%          21.6%          1.20
Birth weight
Gestational age

162 Post-matching
Matched Pairs: 49,587

Variable                        Type        Encouraged Mean   Unencouraged Mean   sd       Δ/sd
Mortality                       Outcome     1.54%             1.94%               12.86%
Difference in Travel Time       Instrument
% attending high level NICU     Treatment   68.6%             25.4%               49.7%    0.87

Preemie covariates
Birth weight
Gestational age

163 Result

                    Weaker Instrument             Stronger Instrument
                    99,174 Pairs of Two Babies    49,587 Pairs of Two Babies
Point Estimate      0.92%                         0.90%
95% CI              (0.36%, 1.48%)                (0.57%, 1.23%)
Length of 95% CI    1.12%                         0.66%
Sensitivity (Γ)     1.07                          >1.22

164 Final thoughts
Near/Far deals with binary outcomes. 2SLS can lead to logical absurdities when outcomes are binary.
Increase the strength of the instrument. Stronger instruments lead to more robust results.
Sensitivity analysis for this method is available.
This technique has the potential to be a Philosopher's stone.

165 Why not use 2SLS? Linear probability models (LPMs) have trouble when your parameter values are up against the edge of parameter space
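
A small illustration of the edge-of-parameter-space problem: when an outcome such as death is rare, a straight line fitted to a 0/1 outcome can produce "probabilities" below 0. The sketch below fits an ordinary least squares line (a plain linear probability model, not full 2SLS) to ten made-up observations; the covariate x and the outcomes are hypothetical.

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

# Hypothetical data: deaths occur only at the lowest values of x
x = list(range(10))
died = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

intercept, slope = ols_line(x, died)
fitted = [intercept + slope * xi for xi in x]
print(["%.3f" % p for p in fitted])
```

The fitted values at the largest x are negative, which cannot be probabilities of death.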

166-167 Trouble with time (figure with axes Quality of Care and Time, curves for High level NICU and Low level NICU, Treatment effect marked)

168 Modeling hospital choice

169-170 Instrument: Probabilities. Empirical distributions?

171 Instrument: Probabilities. Near Neighbors?

172-173 Instrument: Probabilities. Conditional logistic model / Bayesian hierarchical modeling (map figure with markers H, H and M)


More information

Sensitivity checks for the local average treatment effect

Sensitivity checks for the local average treatment effect Sensitivity checks for the local average treatment effect Martin Huber March 13, 2014 University of St. Gallen, Dept. of Economics Abstract: The nonparametric identification of the local average treatment

More information

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation Kosuke Imai Princeton University Gary King Clayton Nall Harvard

More information

150C Causal Inference

150C Causal Inference 150C Causal Inference Instrumental Variables: Modern Perspective with Heterogeneous Treatment Effects Jonathan Mummolo May 22, 2017 Jonathan Mummolo 150C Causal Inference May 22, 2017 1 / 26 Two Views

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Ramesh Yapalparvi It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And also sent out via email. 30% weekly homework.

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S. P. Johnelle Sparks and Corey S. Sparks 1 Introduction Infant

More information

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives Kai Liu 1, Antonio Dalla-Zuanna 2 1 University of Cambridge 2 Norwegian School of Economics June 19, 2018 Introduction

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Introduction to Panel Data Analysis

Introduction to Panel Data Analysis Introduction to Panel Data Analysis Youngki Shin Department of Economics Email: yshin29@uwo.ca Statistics and Data Series at Western November 21, 2012 1 / 40 Motivation More observations mean more information.

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Noncompliance in Randomized Experiments

Noncompliance in Randomized Experiments Noncompliance in Randomized Experiments Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Noncompliance in Experiments Stat186/Gov2002 Fall 2018 1 / 15 Encouragement

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. It is Time for Homework! ( ω `) First homework + data will be posted on the website, under the homework tab. And

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Noncompliance in Randomized Experiments Often we cannot force subjects to take specific treatments Units

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Dynamics in Social Networks and Causality

Dynamics in Social Networks and Causality Web Science & Technologies University of Koblenz Landau, Germany Dynamics in Social Networks and Causality JProf. Dr. University Koblenz Landau GESIS Leibniz Institute for the Social Sciences Last Time:

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Simultaneous quations and Two-Stage Least Squares So far, we have studied examples where the causal relationship is quite clear: the value of the

More information

Comparing Means from Two-Sample

Comparing Means from Two-Sample Comparing Means from Two-Sample Kwonsang Lee University of Pennsylvania kwonlee@wharton.upenn.edu April 3, 2015 Kwonsang Lee STAT111 April 3, 2015 1 / 22 Inference from One-Sample We have two options to

More information

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead

More information

Exact Nonparametric Inference for a Binary. Endogenous Regressor

Exact Nonparametric Inference for a Binary. Endogenous Regressor Exact Nonparametric Inference for a Binary Endogenous Regressor Brigham R. Frandsen December 5, 2013 Abstract This paper describes a randomization-based estimation and inference procedure for the distribution

More information

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors. EDF 7405 Advanced Quantitative Methods in Educational Research Data are available on IQ of the child and seven potential predictors. Four are medical variables available at the birth of the child: Birthweight

More information

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models

Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Journal of Machine Learning Research 17 (2016) 1-35 Submitted 9/14; Revised 2/16; Published 2/16 Control Function Instrumental Variable Estimation of Nonlinear Causal Effect Models Zijian Guo Department

More information

PROPENSITY SCORE MATCHING. Walter Leite

PROPENSITY SCORE MATCHING. Walter Leite PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working

More information

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Y : X M Y a=0 = Y a a m = Y a cum (a) : Y a = Y a=0 + cum (a) an unknown parameter. = 0, Y a = Y a=0 = Y for all subjects Rank

More information

Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham

Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham Last name (family name): First name (given name):

More information

Quantitative Economics for the Evaluation of the European Policy

Quantitative Economics for the Evaluation of the European Policy Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti Davide Fiaschi Angela Parenti 1 25th of September, 2017 1 ireneb@ec.unipi.it, davide.fiaschi@unipi.it,

More information

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING ESTIMATION OF TREATMENT EFFECTS VIA MATCHING AAEC 56 INSTRUCTOR: KLAUS MOELTNER Textbooks: R scripts: Wooldridge (00), Ch.; Greene (0), Ch.9; Angrist and Pischke (00), Ch. 3 mod5s3 General Approach The

More information

Instrumental Variables in Action: Sometimes You get What You Need

Instrumental Variables in Action: Sometimes You get What You Need Instrumental Variables in Action: Sometimes You get What You Need Joshua D. Angrist MIT and NBER May 2011 Introduction Our Causal Framework A dummy causal variable of interest, i, is called a treatment,

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to

More information

Learning Representations for Counterfactual Inference. Fredrik Johansson 1, Uri Shalit 2, David Sontag 2

Learning Representations for Counterfactual Inference. Fredrik Johansson 1, Uri Shalit 2, David Sontag 2 Learning Representations for Counterfactual Inference Fredrik Johansson 1, Uri Shalit 2, David Sontag 2 1 2 Counterfactual inference Patient Anna comes in with hypertension. She is 50 years old, Asian

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

What s New in Econometrics. Lecture 1

What s New in Econometrics. Lecture 1 What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and

More information

4.8 Instrumental Variables

4.8 Instrumental Variables 4.8. INSTRUMENTAL VARIABLES 35 4.8 Instrumental Variables A major complication that is emphasized in microeconometrics is the possibility of inconsistent parameter estimation due to endogenous regressors.

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Yona Rubinstein July 2016 Yona Rubinstein (LSE) Instrumental Variables 07/16 1 / 31 The Limitation of Panel Data So far we learned how to account for selection on time invariant

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

An Alternative Assumption to Identify LATE in Regression Discontinuity Design An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify

More information

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling

More information

University of Michigan School of Public Health

University of Michigan School of Public Health University of Michigan School of Public Health The University of Michigan Department of Biostatistics Working Paper Series Year 2016 Paper 121 A Weighted Instrumental Variable Estimator to Control for

More information

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling

More information

Chapter 11. Correlation and Regression

Chapter 11. Correlation and Regression Chapter 11. Correlation and Regression The word correlation is used in everyday life to denote some form of association. We might say that we have noticed a correlation between foggy days and attacks of

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Econometrics II R. Mora Department of Economics Universidad Carlos III de Madrid Master in Industrial Organization and Markets Outline 1 2 3 OLS y = β 0 + β 1 x + u, cov(x, u) =

More information

Problem Set # 1. Master in Business and Quantitative Methods

Problem Set # 1. Master in Business and Quantitative Methods Problem Set # 1 Master in Business and Quantitative Methods Contents 0.1 Problems on endogeneity of the regressors........... 2 0.2 Lab exercises on endogeneity of the regressors......... 4 1 0.1 Problems

More information

Linear Probability Model

Linear Probability Model Linear Probability Model Note on required packages: The following code requires the packages sandwich and lmtest to estimate regression error variance that may change with the explanatory variables. If

More information

Recitation Notes 5. Konrad Menzel. October 13, 2006

Recitation Notes 5. Konrad Menzel. October 13, 2006 ecitation otes 5 Konrad Menzel October 13, 2006 1 Instrumental Variables (continued) 11 Omitted Variables and the Wald Estimator Consider a Wald estimator for the Angrist (1991) approach to estimating

More information

Recitation Notes 6. Konrad Menzel. October 22, 2006

Recitation Notes 6. Konrad Menzel. October 22, 2006 Recitation Notes 6 Konrad Menzel October, 006 Random Coefficient Models. Motivation In the empirical literature on education and earnings, the main object of interest is the human capital earnings function

More information

What s New in Econometrics. Lecture 13

What s New in Econometrics. Lecture 13 What s New in Econometrics Lecture 13 Weak Instruments and Many Instruments Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Motivation 3. Weak Instruments 4. Many Weak) Instruments

More information

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS Donald B. Rubin Harvard University 1 Oxford Street, 7th Floor Cambridge, MA 02138 USA Tel: 617-495-5496; Fax: 617-496-8057 email: rubin@stat.harvard.edu

More information

Gov 2002: 5. Matching

Gov 2002: 5. Matching Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders

More information

THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL

THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL Statistica Sinica 10(2000), 517-544 THE DERIVATION OF A LATENT THRESHOLD INSTRUMENTAL VARIABLES MODEL Mark E. Glickman and Sharon-Lise T. Normand Boston University and Harvard Medical School Abstract:

More information

EXAMINATION: QUANTITATIVE EMPIRICAL METHODS. Yale University. Department of Political Science

EXAMINATION: QUANTITATIVE EMPIRICAL METHODS. Yale University. Department of Political Science EXAMINATION: QUANTITATIVE EMPIRICAL METHODS Yale University Department of Political Science January 2014 You have seven hours (and fifteen minutes) to complete the exam. You can use the points assigned

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Many natural processes can be fit to a Poisson distribution

Many natural processes can be fit to a Poisson distribution BE.104 Spring Biostatistics: Poisson Analyses and Power J. L. Sherley Outline 1) Poisson analyses 2) Power What is a Poisson process? Rare events Values are observational (yes or no) Random distributed

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

PhD/MA Econometrics Examination January 2012 PART A

PhD/MA Econometrics Examination January 2012 PART A PhD/MA Econometrics Examination January 2012 PART A ANSWER ANY TWO QUESTIONS IN THIS SECTION NOTE: (1) The indicator function has the properties: (2) Question 1 Let, [defined as if using the indicator

More information

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

LECTURE 5. Introduction to Econometrics. Hypothesis testing

LECTURE 5. Introduction to Econometrics. Hypothesis testing LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will

More information

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators 1 2 Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE Hüseyin Taştan 1 1 Yıldız Technical University Department of Economics These presentation notes are based on Introductory

More information

ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS

ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS ANALYSIS OF CORRELATED DATA SAMPLING FROM CLUSTERS CLUSTER-RANDOMIZED TRIALS Background Independent observations: Short review of well-known facts Comparison of two groups continuous response Control group:

More information

Using Post Outcome Measurement Information in Censoring by Death Problems

Using Post Outcome Measurement Information in Censoring by Death Problems Using Post Outcome Measurement Information in Censoring by Death Problems Fan Yang University of Chicago, Chicago, USA. Dylan S. Small University of Pennsylvania, Philadelphia, USA. Summary. Many clinical

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information