Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives

Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives
Kai Liu (University of Cambridge), Antonio Dalla-Zuanna (Norwegian School of Economics)
June 19, 2018

Introduction
- Public job training programs raise two questions:
  - whether they are effective in promoting skill accumulation for disadvantaged individuals
  - whether a greater return on public spending could be had elsewhere
- Existing evidence points to low or very modest returns from public job training (Barnow and Smith, 2015).

Introduction
- Most of the literature focuses on addressing endogenous (initial) selection into training
- Credible estimates are now available thanks to experimental design
- The National Job Corps Study: random assignment into
  - a treatment group (given an offer to participate)
  - a control group (excluded from participation)
- Use randomization as an instrument for participation to get the causal effect of participation (Schochet et al., 2008)
  - this is the average treatment effect among compliers (LATE)

Introduction
- Even with experimental design, additional selection issues complicate the evaluation problem:
  - Dynamic selection on when to quit
  - Selection into alternative training/educational programs by:
    1. Non-participants
    2. Participants after having completed some training
- Yet these factors may be important to understand
  - heterogeneity in the returns to training programs
  - how different training programs interact (e.g. complementarity in program returns)
  - cost-benefit analysis
  - (ultimately) optimal program design/targeting

Our paper: What We Do
- We begin by building a non-parametric potential outcome framework in a dynamic and sequential choice setting, allowing for
  - flexible dynamic selection in program participation, AND
  - flexible (unordered) choice of multiple alternatives in each period
- With experimental variation in initial program participation, we show that
  - LATE is a mixture of a number of different subLATEs (each for a specific type of complier)
  - subLATEs and their population shares are not identified without strong assumptions
- Knowledge of the shares of the subLATEs is essential for cost-benefit analysis

Our paper: What We Do
- We then estimate a (semi-parametric) dynamic selection model using data and experimental variation from the National JC Study
- Using the estimated dynamic selection model, we quantify:
  1. the different subLATEs and their proportions in the population
  2. the ATE and selection patterns in each potential period
  3. how JC and alternative training programs interact
     - (dynamic) complementarity in program returns
     - intertemporal substitution in program choices

Introduction: Relation to existing research
Our paper builds on the following literatures:
1) Program evaluation
  - Choice substitution: Heckman, Hohmann and Smith (2000); Kline and Walters (2016)
  - Dropouts and endogenous duration: Ham and LaLonde (1996); Heckman, Smith, Taber (1998)
  - Dynamic treatment effect: e.g., Taber (2000); Heckman and Navarro (2007); Heckman, Humphries and Veramendi (2016)
2) Returns to job training: Heckman, LaLonde and Smith (Handbook, 1999); Barnow and Smith (2015)

Plan of the talk
1. Data and the National Job Corps Study
2. Potential outcomes and choices in a dynamic setting
3. Dynamic selection model of program participation
4. Estimation Results (Preliminary)

The Job Corps Program
- Job Corps (JC) is the largest vocationally focused education and training program for
  1. disadvantaged (means-tested)
  2. youths (16-24 years old)
- Features of JC:
  - Center-based, and most centers (87%) are residential
  - Personalized vocational training, academic education (providing a GED certificate) and other services
  - After training is completed, JC provides placement services to find a job or pursue additional education

Data and the National Job Corps Study
- The National JC Study (NJCS) was conducted in the mid-1990s to experimentally evaluate the program
  - 73% of the treatment group enrolled in JC
  - mean duration is 8 months
  - the control group was excluded from JC for 3 years (only 1.4% did not obey this rule)
- 4 surveys were conducted during the NJCS: baseline (at assignment), 6, 12, 48 months
- Our sample: all individuals who answered the last survey (around 80%) and reported their duration in JC (dropped 3%)
- Main outcome variable: average weekly earnings during the 16th quarter after random assignment (zero or missing earnings: 25%)

Data and the National Job Corps Study
- Defining decision periods (S):
  - 0 (randomization)
  - 1 (1-3 months in JC)
  - 2 (4-6 months in JC)
  - 3 (7+ months in JC)
- Defining multiple choices (T):
  - No training: no additional training after JC
  - Alternative training: enrolled in any program other than JC
    - GED program (40%)
    - high school (29%)
    - vocational/technical/trade school (40%)
    - college (21%)

Proportions in Each Treatment Status (%)

JC duration (months)      No training   Alternative   Total   Ratio (a/n)
Control group (Z=0)
  0                       29.3          66.3          95.6    2.26
Treatment group (Z=1)
  0                        8.5          19.6          28.1    2.30
  1-3                      8.3          11.8          20.2    1.42
  4-6                      5.6           7.9          13.5    1.41
  7+                      16.7          21.5          38.2    1.29
  Sum                     39.2          60.8         100.0    1.55
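The Ratio (a/n) column can be recomputed from the two share columns. The following is a minimal sketch, not the authors' code; the variable names are illustrative, and small discrepancies from the reported ratios are expected because the published shares are rounded to one decimal.

```python
# Shares (percent) copied from the table above:
# (group, JC duration) -> (no training, alternative, reported ratio a/n)
table = {
    ("Z=0", "0"): (29.3, 66.3, 2.26),
    ("Z=1", "0"): (8.5, 19.6, 2.30),
    ("Z=1", "1-3"): (8.3, 11.8, 1.42),
    ("Z=1", "4-6"): (5.6, 7.9, 1.41),
    ("Z=1", "7+"): (16.7, 21.5, 1.29),
    ("Z=1", "Sum"): (39.2, 60.8, 1.55),
}


def ratio_a_over_n(no_training, alternative):
    """Ratio of the alternative-training share to the no-training share."""
    return alternative / no_training


# Largest gap between the recomputed and the reported ratio across rows.
max_gap = max(
    abs(ratio_a_over_n(n, a) - reported) for n, a, reported in table.values()
)

for (group, duration), (n, a, reported) in table.items():
    print(f"{group}, duration {duration}: recomputed {ratio_a_over_n(n, a):.3f}, reported {reported}")
```

The ratio falls with time spent in JC in the treatment group, which is the pattern the table is meant to highlight.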

ITT Decomposition

                       (1)         (2)              (3)         (4)
                       Earnings    Prob. Employed   Earnings    Prob. Employed
z                      21.1***     2.7***
                       (4.2)       (0.9)
z*n in Period 0                                     7.2         1.1
                                                    (9.4)       (2.2)
z*a in Period 0                                     21.1***     1.4
                                                    (7.5)       (1.5)
z*n after Period 1                                  -2.4        -2.0
                                                    (9.4)       (2.3)
z*a after Period 1                                  -1.6        -1.3
                                                    (8.5)       (1.9)
z*n after Period 2                                  20.7*       2.1
                                                    (12.0)      (2.6)
z*a after Period 2                                  21.7**      2.0
                                                    (10.7)      (2.2)
z*n after Period 3                                  35.9***     5.8***
                                                    (7.3)       (1.6)
z*a after Period 3                                  36.5***     6.6***
                                                    (7.0)       (1.4)
N                      10,792      10,537           10,792      10,537

Choice Structure: Dynamic Case
- Potential duration of JC: t=0 (0 months), t=1 (1-3 months), t=2 (4-6 months) and t=3 (7+ months)
- Sequential choice: JC can only start in period 0 for the Z=1 group, but a or n can be chosen in any period
- Once an individual opts out of JC: no recall
- Choices available in each period:

             Z = 0     Z = 1
  Period 0   a, n      n, a, jc
  Period 1             n, a, jc
  Period 2             n, a, jc
  Period 3             n, a

Potential Outcome
- Potential outcome in the multiperiod setting:
  $$Y_i = Y_i^{0,n} + \sum_{S} D_i(S,a)\,\bigl(Y_i^{S,a} - Y_i^{0,n}\bigr) + \sum_{S} D_i(S,n)\,\bigl(Y_i^{S,n} - Y_i^{0,n}\bigr)$$
- $Y_i^{S,T}$ is earnings for individual $i$ enrolled in JC for $S$ periods and who chooses treatment $T$ after $S$
- $D_i = (S, T)$ identifies the treatment choice for individual $i$; $D_i(S,T)$ defines an indicator function for each possible choice: $D_i(S,T) = \mathbf{1}[D_i = (S,T)]$
- Linking the realized and the potential treatment:
  $$D_i(S,T) = D_i^0(S,T) + \bigl(D_i^1(S,T) - D_i^0(S,T)\bigr) Z_i$$

Identifying Assumptions
In what follows we assume that:
1. $Z_i$ has no direct effect on $Y_i^{S,T}$, $\forall S, T$ (exclusion)
2. $Z_i$ is independent of $Y_i^{S,T}$ and $D_i^Z$, $\forall S, T$ (random assignment)
3. $\forall i$: $D_i^0 \neq D_i^1 \Rightarrow D_i^1 = (S, T),\ S > 0$
   - Compliers must choose some JC when Z = 1
   - stable rank of next-best alternatives in period 0
In addition, we assume there are no always-takers (data driven).

Compliance types: Dynamic Case
- Compliers: those choosing JC for at least one period
- In this 3-period case there are 12 types of compliers
  - 6 types of compliers are those who select $D_i^0 = (0, t)$ and $D_i^1 = (s, t)$, $\forall s, t$ [e.g. $\pi_{Ca,1a} = P(D_i^0 = (0,a),\ D_i^1 = (1,a))$]
  - the other 6 types select $D_i^0 = (0, t)$ and $D_i^1 = (s, v)$, $\forall s, t, v \neq t$ [e.g. $\pi_{Ca,3n} = P(D_i^0 = (0,a),\ D_i^1 = (3,n))$]
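The count of 12 complier types follows from enumerating every pair of potential choices with no JC under Z = 0 and some JC under Z = 1. A minimal sketch of that enumeration (the function and constant names are illustrative, not from the talk):

```python
from itertools import product

JC_DURATIONS = (1, 2, 3)   # periods spent in JC when offered (S > 0)
ALTERNATIVES = ("a", "n")  # alternative training or no training


def complier_types():
    """List all (D0, D1) pairs with D0 = (0, t) and D1 = (s, v), s > 0."""
    return [
        ((0, t), (s, v))
        for t, s, v in product(ALTERNATIVES, JC_DURATIONS, ALTERNATIVES)
    ]


types = complier_types()
# Same post-JC choice as under Z = 0 (v == t) versus a switch (v != t).
same_choice = [c for c in types if c[0][1] == c[1][1]]
switchers = [c for c in types if c[0][1] != c[1][1]]

print(len(types), len(same_choice), len(switchers))
```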

Compliance types: Dynamic Case
- Never-takers are of two types:
  - Those who choose a: $\pi_{Aa} = P(D_i^0 = (0,a),\ D_i^1 = (0,a))$
  - Those who choose n: $\pi_{An} = P(D_i^0 = (0,n),\ D_i^1 = (0,n))$

Interpreting LATE: 3 Periods Case
- Suppose we ignore the duration of JC and do not distinguish alternatives (as in Schochet et al. (2008))
- Single choice: D can be either 0 (no JC) or 1 (some JC)
- LATE is estimated using the Wald estimator:
  $$\text{LATE} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{P(D_i^1 = 1) - P(D_i^0 = 1)}$$
- We now interpret this LATE parameter
  - under a dynamic sequential choice structure with multiple alternatives
  - and contrast it with a static model with multiple alternatives
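As a rough illustration of the Wald estimator with a binary participation indicator, the sketch below divides the difference in mean outcomes by the difference in participation rates across assignment groups. The function name and the toy data are hypothetical and are not taken from the Job Corps sample.

```python
def wald_late(y, d, z):
    """Wald estimator: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])."""
    def mean(values):
        return sum(values) / len(values)

    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]

    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("The instrument does not shift participation.")
    return (mean(y1) - mean(y0)) / first_stage


# Hypothetical toy data: 8 individuals, half offered the program.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 0]          # no always-takers in the control group
y = [30.0, 20.0, 40.0, 10.0, 10.0, 20.0, 10.0, 20.0]

late = wald_late(y, d, z)
print(late)
```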

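Interpreting LATE: 3 Periods Case
- Define $\pi_C$ as
  $$\pi_C \equiv \sum_{s=1}^{3} \pi_{Ca,sa} + \sum_{s=1}^{3} \pi_{Cn,sn} + \sum_{s=1}^{3} \pi_{Ca,sn} + \sum_{s=1}^{3} \pi_{Cn,sa}$$
- The Wald estimator can be decomposed into
  $$\sum_{s=1}^{3} \frac{\pi_{Ca,sa}}{\pi_C}\, E\bigl[Y_i^{s,a} - Y_i^{0,a} \mid D_i^1(s,a) = 1,\ D_i^0(0,a) = 1\bigr]$$
  $$+ \sum_{s=1}^{3} \frac{\pi_{Cn,sn}}{\pi_C}\, E\bigl[Y_i^{s,n} - Y_i^{0,n} \mid D_i^1(s,n) = 1,\ D_i^0(0,n) = 1\bigr]$$
  $$+ \sum_{s=1}^{3} \frac{\pi_{Cn,sa}}{\pi_C}\, E\bigl[Y_i^{s,a} - Y_i^{0,n} \mid D_i^1(s,a) = 1,\ D_i^0(0,n) = 1\bigr]$$
  $$+ \sum_{s=1}^{3} \frac{\pi_{Ca,sn}}{\pi_C}\, E\bigl[Y_i^{s,n} - Y_i^{0,a} \mid D_i^1(s,n) = 1,\ D_i^0(0,a) = 1\bigr]$$
  where the first subscript of each share gives the choice under Z = 0 and the second the choice under Z = 1
- This implies that there are 12 subLATEs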

Comparing with the Static Case
- Consider the static case where individuals make mutually exclusive choices in one period:

    Z = 0     Z = 1
    a, n      n, a, jc

- Compliers are of two types:
  - $\pi_{Ca} = P(D_i^0 = a,\ D_i^1 = jc)$
  - $\pi_{Cn} = P(D_i^0 = n,\ D_i^1 = jc)$
- LATE can be decomposed into two subLATEs (Kline and Walters 2016; Kirkeboen, Mogstad, Leuven 2016):
  $$\frac{\pi_{Cn}}{\pi_{Cn} + \pi_{Ca}}\, E\bigl[Y_i^{jc} - Y_i^{n} \mid D_i^1 = jc,\ D_i^0 = n\bigr] + \frac{\pi_{Ca}}{\pi_{Cn} + \pi_{Ca}}\, E\bigl[Y_i^{jc} - Y_i^{a} \mid D_i^1 = jc,\ D_i^0 = a\bigr]$$

Static vs. Dynamic Case
- The dynamic case provides additional parameters of interest:
  - (dynamic) complementarity in program returns, e.g. $Y^{3,n} - Y^{0,n} < Y^{3,a} - Y^{0,a}$ or $Y^{0,a} - Y^{0,n} < Y^{3,a} - Y^{3,n}$
  - effect of program duration (e.g. $Y^{S,n} - Y^{0,n}$, $\forall S$)
  - intertemporal choice substitution: $D^0 = a$, $D^1 = (S, n)$, $S > 0$ (the static model tends to over-predict program substitution)
  - intertemporal complementarity between JC and the alternative: $D^0 = n$, $D^1 = (S, a)$, $S > 0$ (ruled out in the static case)
- In the dynamic potential outcome framework with a single instrument, we cannot non-parametrically identify the subLATEs
  - static model: identification is possible (with assumptions) using Z interacted with covariates as additional IVs (see Interacted IV Results - Static Case)

Relevance of Separate Effect Estimation for Different Compliers
- Knowing the shares of the different types of compliers is essential to conduct a cost-benefit analysis
  - if the cost of JC changes depending on the period spent in JC
  - because we also need to take into account the costs of the alternative treatments
  (see Cost-Benefit Analysis)

Dynamic Selection Model
- For each potential duration $s$ ($s \in [0, 3]$), utilities from each potential choice:
  $$U_{is}^{n} = 0 \quad (1)$$
  $$U_{is}^{a} = \beta_s^{a} X_i + \theta_{ia} + u_{is}^{a} \quad (2)$$
  $$U_{is}^{j} = \beta_s^{j} X_i^{j} + \theta_{ij} + u_{is}^{j} \quad (3)$$
- $\beta_s^a$ and $\beta_s^j$ vary flexibly with potential JC duration
  - state dependence in JC participation (may vary by X)
  - experience from JC makes the alternative program more attractive
- Permanent unobserved factors follow a bivariate normal distribution:
  $$(\theta_{ia}, \theta_{ij}) \sim BN(0, 0, \sigma_a, \sigma_j, \rho) \quad (4)$$
- $u_{is}^a$, $u_{is}^j$ are i.i.d. (variances normalized to 1)
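To make the sequential structure of equations (1)-(4) concrete, the sketch below simulates one individual's path: in each period the highest-utility option is chosen, JC is available only to the Z = 1 group and only while the individual is still enrolled, and leaving JC is final. All parameter values are hypothetical, the covariate is a single scalar, and the transitory shocks are assumed standard normal (the slides only state that they are i.i.d. with unit variance).

```python
import math
import random

# Hypothetical parameter values, for illustration only.
SIGMA_A, SIGMA_J, RHO = 1.0, 1.0, 0.3
BETA_A = [0.2, 0.1, 0.1, 0.3]  # beta^a_s for s = 0..3
BETA_J = [0.5, 0.4, 0.3]       # beta^j_s for s = 0..2 (no JC option at s = 3)


def draw_theta(rng):
    """Draw (theta_a, theta_j) from a bivariate normal with correlation RHO."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    theta_a = SIGMA_A * z1
    theta_j = SIGMA_J * (RHO * z1 + math.sqrt(1.0 - RHO ** 2) * z2)
    return theta_a, theta_j


def simulate_choice(x, z, rng):
    """Return the terminal state (S, T): S periods in JC, then T in {'a', 'n'}."""
    theta_a, theta_j = draw_theta(rng)
    s = 0
    while True:
        # U^n is normalized to zero; assumed standard normal shocks.
        utilities = {"n": 0.0, "a": BETA_A[s] * x + theta_a + rng.gauss(0.0, 1.0)}
        if z == 1 and s < 3:
            utilities["jc"] = BETA_J[s] * x + theta_j + rng.gauss(0.0, 1.0)
        choice = max(utilities, key=utilities.get)
        if choice != "jc":
            return s, choice  # no recall: the path ends once JC is left
        s += 1


rng = random.Random(0)
treated = [simulate_choice(x=1.0, z=1, rng=rng) for _ in range(1000)]
controls = [simulate_choice(x=1.0, z=0, rng=rng) for _ in range(1000)]
print(sorted(set(treated)))
```

Because the same draw of the permanent factors enters every period, individuals who stay longer in JC are a selected group, which is the dynamic selection the model is built to capture.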

Dynamic Selection Model
- Parameterize potential outcomes:
  $$Y_i^{S,T} = \alpha^{S,T} + \beta_w^{S,T} X_i + \gamma_a^{S,T} \theta_{ia} + \gamma_j^{S,T} \theta_{ij} + \varepsilon_i^{S,T} \quad (5)$$
- $\gamma_a^{S,T}$ and $\gamma_j^{S,T}$ capture how potential outcomes vary with unobserved factors
  - $H_0: \gamma_a^{S,T} = 0,\ \gamma_j^{S,T} = 0,\ \forall S, T$ (no selection)
  - $H_0: \gamma_a^{S,a} = \gamma_a^{S,n},\ \forall S$ (no selection into a on gains)
  - $H_0: \gamma_j^{S,T} = \gamma_j^{S',T}$ (no dynamic selection into JC on gains)
- $\alpha^{S,T}$ is the ATE for the group with $X_i = 0$
- $\varepsilon_i^{S,T}$ are i.i.d. (measurement errors)

Dynamic Selection Model: Identification
- Exclusion restrictions in the first period: $Z$ and $Z \times X$
  $$E(Y_i \mid X_i = x, Z_i = 1, D_i = n, s = 0) - E(Y_i \mid X_i = x, Z_i = 0, D_i = n, s = 0)$$
  $$= \gamma_a^{0,n} \lambda_a(x, 1, n, 0) + \gamma_j^{0,n} \lambda_j(x, 1, n, 0) - \gamma_a^{0,n} \lambda_a(x, 0, n, 0)$$
  $$\lambda_a(x, z, n, 0) = G_a^0\bigl(P(D_i = n \mid X_i = x, Z_i = z, s = 0)\bigr)$$
- Subsequent period:
  $$E(Y_i \mid X_i = x, Z_i = 1, D_i = n, s = 1) = \alpha^{1,n} + \beta_w^{1,n} x + \gamma_a^{1,n} \lambda_a(x, 1, n, 1) + \gamma_j^{1,n} \lambda_j(x, 1, n, 1)$$
  $$\lambda_a(x, z, n, 1) = G_a^1\bigl(P(D_i = j \mid X_i = x, Z_i = z, s = 0),\ P(D_i = n \mid X_i = x, Z_i = z, s = 1)\bigr)$$
- $Z$ and $Z \times X$ shift $\lambda_a(x, z, n, 1)$ via $P(D_i = j \mid X_i = x, Z_i = z, s = 0)$ but have no direct effect on $Y$.

Dynamic Selection Model: Estimation
- The conditional likelihood function of individuals with Z = 0 is
  $$L^{(1)}(S = 0, T = k, Y \mid \theta_{ia}, \theta_{ij}) = P\bigl(U_{i0}^{k} > U_{i0}^{-k} \mid \theta_{ia}, \theta_{ij}\bigr)\, h(Y \mid T = k, \theta_{ia}, \theta_{ij}) \quad (6)$$
  where $k \in \{a, n\}$ and $-k$ denotes all the remaining choices other than $k$.
- The conditional likelihood function of individuals with Z = 1 is
  $$L^{(2)}(S = s, T = k, Y \mid \theta_{ia}, \theta_{ij}) = P\bigl(U_{i0}^{j} > \max(U_{i0}^{-j}), \ldots, U_{is}^{j} > \max(U_{is}^{-j}),\ U_{is+1}^{k} > \max(U_{is+1}^{-k}) \mid \theta_{ia}, \theta_{ij}\bigr)\, h(Y \mid S = s, T = k, \theta_{ia}, \theta_{ij}) \quad (7)$$

Dynamic Selection Model: Estimation
- To form the likelihood contribution for the individual, we need to average out over all possible individual types:
  $$L^{(m)}(S = s, T = k, Y) = \iint L^{(m)}(S = s, T = k, Y \mid \theta_{ia}, \theta_{ij})\, f(\theta_{ia}, \theta_{ij})\, d\theta_{ia}\, d\theta_{ij}, \quad m \in \{1, 2\}$$
  - choice probabilities are simulated using the GHK simulator
- The complete likelihood function consists of products over workers:
  $$L = \prod_{i \in B_1} L^{(1)}(T, Y) \prod_{i \in B_2} L^{(2)}(S, T, Y) \quad (8)$$
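The idea of averaging out the permanent unobserved factor can be illustrated in the simplest case: a control-group individual (Z = 0) choosing between a and n in period 0. The sketch below uses plain Monte Carlo draws rather than the GHK simulator named on the slide, assumes standard normal transitory shocks, and compares the simulated probability with the closed form that holds under those assumptions. The names `index` and `sigma_a` and their values are hypothetical.

```python
import math
import random


def normal_cdf(v):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def prob_a_given_theta(index, theta_a):
    """P(U^a > U^n = 0 | theta_a) when the shock u^a is standard normal."""
    return normal_cdf(index + theta_a)


def prob_a_simulated(index, sigma_a, n_draws=20000, seed=0):
    """Average the conditional probability over draws of theta_a ~ N(0, sigma_a^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += prob_a_given_theta(index, rng.gauss(0.0, sigma_a))
    return total / n_draws


def prob_a_closed_form(index, sigma_a):
    """theta_a + u^a is N(0, 1 + sigma_a^2), so the integral has a closed form."""
    return normal_cdf(index / math.sqrt(1.0 + sigma_a ** 2))


simulated = prob_a_simulated(index=0.3, sigma_a=1.0)
exact = prob_a_closed_form(index=0.3, sigma_a=1.0)
print(simulated, exact)
```

With sequences of choices over several periods, as in equation (7), no such closed form is available, which is presumably why a simulator is used in the talk.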

Estimation Results: ATE for Age 16-19
Note 1: Test for program complementarity, $H_0: Y^{3,n} - Y^{0,n} = Y^{3,a} - Y^{0,a}$; rejected with p-value = 0.05.

Estimation Results: Selection on Unobservables

                   S=0               S=1               S=2               S=3
                   a        n        a        n        a        n        a        n
$\gamma_a^{S,T}$   0.56     -0.79    0.25     -0.70    -0.54    -0.65    -0.03    -0.58
                   (0.11)   (0.12)   (0.99)   (0.34)   (0.51)   (0.27)   (0.71)   (0.78)
$\gamma_j^{S,T}$   -0.13    0.12     -0.49    -0.05    0.47     0.00     -0.38    0.63
                   (0.08)   (0.09)   (0.35)   (0.74)   (0.39)   (0.81)   (0.43)   (0.41)

- We can reject the following null hypotheses:
  - no selection ($H_0: \gamma_a^{S,T} = 0,\ \gamma_j^{S,T} = 0,\ \forall S, T$)
  - no selection on gains (a) ($H_0: \gamma_a^{S,a} = \gamma_a^{S,n},\ \forall S$)
- We cannot reject:
  - no dynamic selection on gains (jc) ($H_0: \gamma_j^{S,T} = \gamma_j^{S',T},\ \forall S, T$)

Goodness of Fit: Proportions in Each Treatment Status (%)

                          Actual                        Predicted
JC duration (months)      No training   Alternative     No training   Alternative
Control group
  0                       30.6          69.4            29.4          70.6
Treatment group
  0                        8.2          19.2             8.2          19.0
  1-3                      7.8          11.2             7.7          12.1
  4-6                      5.5           7.7             5.4           7.8
  7+                      17.4          21.5            16.6          23.1
  Sum                     39.2          60.8            37.9          62.1

Goodness of Fit: Wages (w/o measurement errors)
[Figure: density of log wage, actual vs. predicted. Panels: All individuals; Control group; Treatment group.]

Estimation Results: subLATEs

$D^0$     $D^1$      Estimates   Shares (%)
SubLATEs (dynamic)
a         (1, a)     -0.21       10.1
n         (1, a)      0.87        2.1
n         (1, n)      0.01        5.7
a         (1, n)      0.14        2.1
a         (2, a)     -0.10        6.7
n         (2, a)      0.93        1.2
n         (2, n)      0.11        3.5
a         (2, n)      0.21        1.9
a         (3, a)     -0.15       21.2
n         (3, a)      1.32        2.3
n         (3, n)      0.20        6.9
a         (3, n)      0.55        9.9
SubLATEs (static)
n         j           0.36       21.7
a         j           0.00       51.9
Overall LATE          0.11       73.6
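Since LATE is a share-weighted mixture of the subLATEs, the overall LATE in the last row can be cross-checked from the rows above it. The sketch below does this for both the dynamic and the static decomposition, using the estimates and shares exactly as reported; the variable and function names are illustrative.

```python
# (D0, D1, estimate, share in percent), copied from the table above.
dynamic = [
    ("a", (1, "a"), -0.21, 10.1), ("n", (1, "a"), 0.87, 2.1),
    ("n", (1, "n"), 0.01, 5.7),   ("a", (1, "n"), 0.14, 2.1),
    ("a", (2, "a"), -0.10, 6.7),  ("n", (2, "a"), 0.93, 1.2),
    ("n", (2, "n"), 0.11, 3.5),   ("a", (2, "n"), 0.21, 1.9),
    ("a", (3, "a"), -0.15, 21.2), ("n", (3, "a"), 1.32, 2.3),
    ("n", (3, "n"), 0.20, 6.9),   ("a", (3, "n"), 0.55, 9.9),
]
static = [("n", "j", 0.36, 21.7), ("a", "j", 0.00, 51.9)]


def weighted_late(rows):
    """Return (share-weighted mean of the subLATEs, total complier share)."""
    total_share = sum(share for _, _, _, share in rows)
    weighted_sum = sum(estimate * share for _, _, estimate, share in rows)
    return weighted_sum / total_share, total_share


late_dynamic, share_dynamic = weighted_late(dynamic)
late_static, share_static = weighted_late(static)
print(round(late_dynamic, 2), round(share_dynamic, 1))
print(round(late_static, 2), round(share_static, 1))
```

Both decompositions aggregate to the same complier share and to an overall LATE of about 0.11, as they should, since they split the same complier population in different ways.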

Estimation Results: Understanding Intertemporal Substitution
- In the static framework, $\pi_{aj} = P(D_i^0 = a,\ D_i^1 = jc) = 0.52$
  - implying a great degree of choice substitution between a and JC
  - while ruling out any switch from n to a (identifying assumption)
- By comparison, the dynamic framework implies
  - much less choice substitution: $\pi_{a,1n} + \pi_{a,2n} + \pi_{a,3n} = 0.14 < 0.52$
  - intertemporal complementarity between JC and a: $\pi_{n,1a} + \pi_{n,2a} + \pi_{n,3a} = 0.06$
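The two sums quoted here can be reproduced from the shares of the "switching" complier types reported with the subLATE estimates earlier in the talk. A short sketch of that arithmetic, with illustrative names:

```python
# Shares (percent) of complier types whose post-JC choice differs from
# their choice under Z = 0, as reported with the subLATE estimates.
switching_shares = {
    ("a", (1, "n")): 2.1, ("a", (2, "n")): 1.9, ("a", (3, "n")): 9.9,
    ("n", (1, "a")): 2.1, ("n", (2, "a")): 1.2, ("n", (3, "a")): 2.3,
}


def total_share(shares, d0, t_after_jc):
    """Sum shares (as proportions) with D0 = d0 and post-JC choice t_after_jc."""
    return sum(
        share for (start, (_, end)), share in shares.items()
        if start == d0 and end == t_after_jc
    ) / 100.0


# D0 = a, D1 = (S, n): JC substitutes for the alternative program.
substitution = total_share(switching_shares, "a", "n")
# D0 = n, D1 = (S, a): JC is followed by the alternative program.
complementarity = total_share(switching_shares, "n", "a")
print(round(substitution, 2), round(complementarity, 2))
```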

Estimation Results: Understanding Differences Between Age Groups
- The effect of JC is significantly larger for the 20-24 age group (Schochet et al. (2008)): $\text{LATE}_{young} = 0.09$, $\text{LATE}_{old} = 0.17$
- This age difference can be due to two factors:
  - Different treatment effects (via $\beta_w$, the age parameter in potential wages)
  - Different patterns of sorting into different compliance groups
- Using the estimated model, we simulate a counterfactual LATE by shutting down the age difference in potential wages ($\beta_w = 0$)
  - the counterfactual $\text{LATE}_{old} = 0.07$
  - all the age difference in LATE is due to the age difference in treatment effects

Conclusion
- In this paper, we
  - decomposed LATE into various subLATEs by extending the potential outcome framework to a dynamic setting
  - showed non-identification of the subLATEs
  - imposed parametric assumptions to infer the subLATEs and the ATE
- We learnt that
  - there is large heterogeneity in program returns
  - incorporating dynamics in program evaluation seems fruitful: dynamic complementarity + intertemporal substitution
- Our framework is potentially useful
  - to a large literature in development using encouragement designs
  - for thinking about optimal program design/targeting

Interacted IV Results - Static Case

                      Job Corps    Job Corps and Alternative Training
                      (1)          (2)
jc                    26.2***      218.7***
                      (6.0)        (51.5)
a                                  276.1***
                                   (72.4)
F-statistic  jc       1,608.3      35.0
             a                     8.7
Overid. p-value       0.001        0.753
N                     10,586       10,586

- interact the JC offer with observable covariates (mother's education, age, race, first language, enrollment in welfare programs)
- assumption: subLATEs do NOT differ across covariate groups (Kline and Walters, 2016)
- the instruments need to be relevant in separately identifying

Cost-Benefit Analysis
- The benefit from JC can be summarized by the increase in net lifetime earnings for participants:
  $$B = (1 - \tau)\, E[Y_i]$$
  where $\tau$ is the tax rate
- The cost is the sum of the cost of JC and the cost of the alternative training (which may be a complement or a substitute to JC), net of the increase in tax revenue
- In a 3-period setting it is reasonable to assume that the longer a person is enrolled in JC, the higher the cost of JC:
  $$C = \sum_{t = n, a} \sum_{s=1}^{3} \phi_{jc}^{s}\, P(D_i^1(s, t)) + \phi_a\, P(D_i(S, a)) - \tau\, E[Y_i]$$
  where $\phi_{jc}^{s}$ is the cost of JC for $s$ periods, $\phi_a$ is the cost of the alternative training and $P(D_i(S, a))$ is the probability of enrolling in a in every period, both for treatment and controls, with $S \in \{0, 1, 2, 3\}$

Cost-Benefit Analysis: Program Expansion
- Consider an expansion of the program which increases the proportion of individuals who are randomly offered JC treatment. Call this proportion $\delta$
- The benefit of expanding the program depends on the overall LATE and on the proportion of individuals who enroll in JC for at least one period (similar result in Kline and Walters, 2016):
  $$\frac{\partial B}{\partial \delta} = (1 - \tau)\, \text{LATE} \cdot P(D_i^1(S > 0, T))$$

Cost-Benefit Analysis: Program Expansion
- The change in costs due to an expansion of JC depends on the proportions of the different types of compliers:
  $$\frac{\partial C}{\partial \delta} = \sum_{t = n, a} \sum_{s=1}^{3} \phi_{jc}^{s}\, P(D_i^1(s, t)) - \sum_{s=1}^{3} \phi_a\, P(D_i^1(s, n),\ D_i^0(0, a)) + \sum_{s=1}^{3} \phi_a\, P(D_i^1(s, a),\ D_i^0(0, n)) - \tau\, \text{LATE} \cdot P(D_i^1(S > 0, T))$$
  - the first term is the increase in marginal costs due to more people enrolling in JC
  - the second term is the savings due to individuals who enroll in a if Z = 0 and in JC if Z = 1, and who don't enroll in a after JC
  - the third term is the increase in marginal costs due to individuals who enroll in n if Z = 0 and who enroll in a after JC
  - the last term is the increase in tax revenues
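The marginal benefit and the four-term marginal cost can be sketched as two small functions that take complier-type shares as inputs. This is an illustrative reading of the expressions for program expansion, not the authors' implementation: the function names are invented, and the costs, tax rate, LATE and shares in the example are hypothetical numbers chosen to make the arithmetic easy to follow. It relies on the no-always-takers assumption, so that everyone with some JC under Z = 1 is a complier.

```python
def marginal_benefit(shares, tau, late):
    """(1 - tau) * LATE * P(some JC when offered)."""
    return (1.0 - tau) * late * sum(shares.values())


def marginal_cost(shares, phi_jc, phi_a, tau, late):
    """Four terms: JC costs, savings on a, extra costs of a, extra tax revenue.

    shares maps (D0, (s, t)) to the population share of that complier type,
    phi_jc maps JC duration s to its cost, phi_a is the cost of a.
    """
    jc_cost = sum(phi_jc[s] * p for (_, (s, _)), p in shares.items())
    savings_on_a = sum(
        phi_a * p for (d0, (_, t)), p in shares.items() if d0 == "a" and t == "n"
    )
    extra_cost_of_a = sum(
        phi_a * p for (d0, (_, t)), p in shares.items() if d0 == "n" and t == "a"
    )
    tax_revenue = tau * late * sum(shares.values())
    return jc_cost - savings_on_a + extra_cost_of_a - tax_revenue


# Hypothetical inputs with only two complier types.
example_shares = {("a", (1, "n")): 0.1, ("n", (1, "a")): 0.2}
cost = marginal_cost(example_shares, phi_jc={1: 10.0}, phi_a=5.0, tau=0.2, late=2.0)
benefit = marginal_benefit(example_shares, tau=0.2, late=2.0)
print(cost, benefit)
```

In this toy example the JC cost is 3.0, savings on a are 0.5, extra costs of a are 1.0 and extra tax revenue is 0.12, so the net marginal cost is 3.38 against a marginal benefit of 0.48. The point of the sketch is that the second and third terms require the shares of specific complier types, which the overall LATE alone does not provide.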