Testing for Rank Invariance or Similarity in Program Evaluation: The Effect of Training on Earnings Revisited

Testing for Rank Invariance or Similarity in Program Evaluation: The Effect of Training on Earnings Revisited Yingying Dong and Shu Shen UC Irvine and UC Davis Sept 2015 @ Chicago 1 / 37 Dong, Shen Testing for Rank Invariance or Similarity in Program Evaluation

Introduction. Motivation: QTE/LQTE Literature

In program evaluations, applied researchers care about treatment effect heterogeneity and often look at distributional/quantile effects of treatments.

In quantile treatment effect (QTE) models, rank invariance or rank similarity is required either
- for identification: e.g., the IVQR model of Chernozhukov and Hansen (2005, 2006, 2008), Chernozhukov, Imbens, and Newey (2007), Horowitz and Lee (2007); or
- for interpretation: e.g., the LQTE framework (Abadie, Angrist and Imbens, 2002). Also Frolich and Melly (2013), Firpo (2007), and Imbens and Newey (2009).

This paper
- studies the assumption of (unconditional) rank invariance and rank similarity;
- provides identification of the distribution of individuals' (unconditional) potential ranks conditional on covariates;
- proposes nonparametric tests that are applicable to both exogenous and endogenous treatments.

Introduction. Motivation: Program Evaluation Applications

The Star Project: the effect of attending a small class ($T$) in grade K on student outcome ($Y$, grade K test score).

[Figure: The Star Project score distributions. Horizontal axis: total score; vertical axis: probability. Series: regular class, regular class with aid, small class; QTE.]

Introduction. Motivation: Program Evaluation Applications

JTPA (Job Training Partnership Act): the effect of job training ($T$) on individual earnings ($Y$). Treatment is randomly assigned ($Z$), with about a 60% compliance rate.

[Figure, two panels: "Potential Earnings Distributions, Female" and "Potential Earnings Distributions Among Compliers, Male". Horizontal axis: earnings; vertical axis: probability. Series: control, treatment; LQTE.]

Introduction. Definition of Rank Invariance

- $Y_0$ and $Y_1$ are the potential outcomes under no treatment and under treatment, respectively.
- $U_t = F_t(Y_t) \sim U(0, 1)$ is the rank of the potential outcome $Y_t$. $U_0$ and $U_1$ are unconditional and are never observed at the same time.
- Rank invariance is the condition that $U_0 = U_1$.
- Example: $Y_t = g_t(X, V)$, where $Y_t$ is a test score, $X$ is observed characteristics such as gender and race, and $V$ is ability. If $(X, V): \Omega \to \mathcal{W}$, so that $U_t(\omega) = F_t(g_t(X(\omega), V(\omega)))$, then rank invariance says that $U_0(\omega) = U_1(\omega)$ for all $\omega \in \Omega$.
- Let $q_t(\tau) = F^{-1}_{Y_t}(\tau)$ and $QTE(\tau) = q_1(\tau) - q_0(\tau)$. Rank invariance implies that $QTE(\tau)$ is the individual treatment effect for anyone who is at quantile $\tau$.
- Rank invariance is restrictive: it does not allow for random slippages in potential ranks (e.g., caused by luck).

Introduction. Rank Similarity

- Suppose $Y_t = g_t(X, V, S_t)$, where $X$ (gender, race) and $V$ (ability) determine the common rank level, and $S_t$ (luck) is a random shock responsible for the random slippages. $S_t$ is realized after a treatment is assigned.
- Rank similarity is the condition that $U_0 \mid (X = x, V = v) \sim U_1 \mid (X = x, V = v)$ for all $(x, v) \in \mathcal{W}$, that is, the two potential ranks have the same conditional distribution.
- If $(X, V): \Omega \to \mathcal{W}$, then rank similarity says that $U_0(\omega) \sim U_1(\omega)$ for all $\omega \in \Omega$.

Introduction. Implications of Rank Similarity

Lemma 1. Rank similarity implies that:

1. The distributions of observables and unobservables at the same rank are the same across treatment states. That is, given rank similarity,
$$F_{X,V \mid U_0}(x, v \mid \tau) = F_{X,V \mid U_1}(x, v \mid \tau) \quad \text{for all } \tau \in (0, 1),\ (x, v) \in \mathcal{W}.$$
2. For any individual, her average treatment effect is a weighted average of the unconditional QTEs, where the weights are the individual's probabilities of being at different quantiles. That is, given rank similarity,
$$E[Y_1 - Y_0 \mid X = x, V = v] = \int_0^1 QTE(\tau)\, dF_{U \mid X,V}(\tau \mid x, v) \quad \text{for all } (x, v) \in \mathcal{W}.$$
3. (Main Testable Implication) Treatment should not affect the distribution of ranks among observationally equivalent individuals. That is, given rank similarity,
$$F_{U_0 \mid X}(\tau \mid x) = F_{U_1 \mid X}(\tau \mid x) \quad \text{for all } \tau \in (0, 1),\ x \in \mathcal{X}.$$

Identification. Exogenous Treatment. Identification: the Exogenous Treatment Case

If $T$ is exogenous, identification of $F_{U_1 \mid X}(\tau \mid x) - F_{U_0 \mid X}(\tau \mid x)$ for $\tau \in (0, 1)$ and $x \in \mathcal{X}$ is trivial:
$$\begin{aligned}
F_{U_1 \mid X}(\tau \mid x) - F_{U_0 \mid X}(\tau \mid x)
&= E[1(U_1 \le \tau) \mid X = x] - E[1(U_0 \le \tau) \mid X = x] \\
&= E[1(Y_1 \le q_1(\tau)) \mid X = x] - E[1(Y_0 \le q_0(\tau)) \mid X = x] \\
&= E[1(Y \le q_1(\tau)) \mid X = x, T = 1] - E[1(Y \le q_0(\tau)) \mid X = x, T = 0],
\end{aligned}$$
where the marginal quantiles $q_1(\tau)$ and $q_0(\tau)$ are directly identified from the sub-samples with $T = 1$ and $T = 0$, respectively.
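The last line of the identification argument above suggests a direct sample analogue. The following is a minimal sketch under an exogenous treatment, not the authors' implementation: the function names `empirical_quantile` and `rank_cdf_gap` and the toy data are hypothetical, and the quantile rule (smallest order statistic whose empirical CDF reaches $\tau$) is one assumed convention among several.

```python
import math


def empirical_quantile(values, tau):
    """Return the smallest observed value whose empirical CDF is at least tau."""
    ordered = sorted(values)
    index = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[index]


def rank_cdf_gap(y, t, x, tau, x_value):
    """Sample analogue of F_{U1|X}(tau|x) - F_{U0|X}(tau|x) for exogenous T."""
    # Marginal quantiles come from the full treated and control subsamples.
    q1 = empirical_quantile([yi for yi, ti in zip(y, t) if ti == 1], tau)
    q0 = empirical_quantile([yi for yi, ti in zip(y, t) if ti == 0], tau)
    # Rank indicators are then averaged within the covariate cell X = x_value.
    treated = [yi <= q1 for yi, ti, xi in zip(y, t, x) if ti == 1 and xi == x_value]
    control = [yi <= q0 for yi, ti, xi in zip(y, t, x) if ti == 0 and xi == x_value]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Toy data (hypothetical): treatment shifts every outcome by 10, so ranks are unchanged.
y = [1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16]
t = [0] * 6 + [1] * 6
x = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
gap_similar = rank_cdf_gap(y, t, x, 0.5, 0)

# Same outcomes, but the treated units swap covariate cells, so ranks move.
x_reversed = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
gap_reversed = rank_cdf_gap(y, t, x_reversed, 0.5, 0)
```

In the first toy sample the gap is zero, as rank similarity requires; in the second the low-covariate cell moves from the bottom to the top of the distribution under treatment, so the gap is nonzero.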

Identification. Endogenous Treatment. Identification: the Endogenous Treatment Case

If $T$ is endogenous, let $Z \in \{0, 1\}$ be an IV, and let $T_z$ for $z = 0, 1$ be the potential treatment status.

We are interested in testing for rank similarity among compliers ($T_1 > T_0$):
$$F_{U_1 \mid C,X}(\tau \mid x) = F_{U_0 \mid C,X}(\tau \mid x) \quad \text{for all } \tau \in (0, 1) \text{ and } x \in \mathcal{X}_C,$$
where $\mathcal{X}_C = \{x \in \mathcal{X} : \Pr[T_1 > T_0 \mid X = x] > 0\}$.

Assumption 1. Let $(Y_t, T_t, X, Z)$, $t = 0, 1$, be random variables mapped from the common probability space $(\Omega, \mathcal{F}, P)$. The following conditions hold jointly with probability one.
1. Independence: $(Y_0, Y_1, T_0, T_1) \perp Z \mid X$.
2. First stage: $E(T_1) \neq E(T_0)$.
3. Monotonicity: $\Pr(T_1 \ge T_0) = 1$.
4. Nontrivial assignment: $0 < \Pr(Z = 1 \mid X = x) < 1$ for all $x \in \mathcal{X}$.

Identification. Endogenous Treatment. Identification: the Endogenous Treatment Case

Theorem 1. Let $I(\tau) \equiv 1\big(Y \le T q_{1 \mid C}(\tau) + (1 - T) q_{0 \mid C}(\tau)\big)$. Given Assumption 1, for all $\tau \in (0, 1)$, $x \in \mathcal{X}_C$, and $t = 0, 1$, $F_{U_t \mid C,X}(\tau \mid x)$ is identified and is given by
$$F_{U_t \mid C,X}(\tau \mid x) = \frac{E[I(\tau) 1(T = t) \mid Z = 1, X = x] - E[I(\tau) 1(T = t) \mid Z = 0, X = x]}{E[1(T = t) \mid Z = 1, X = x] - E[1(T = t) \mid Z = 0, X = x]}. \qquad (1)$$
$F_{U_1 \mid C,X}(\cdot \mid x) = F_{U_0 \mid C,X}(\cdot \mid x)$ for $x \in \mathcal{X}_C$ if and only if, for all $\tau \in (0, 1)$ and $x \in \mathcal{X}$,
$$E[I(\tau) \mid Z = 1, X = x] = E[I(\tau) \mid Z = 0, X = x]. \qquad (2)$$

Note: $I(\tau)$ is a rank indicator.

- Notice the change from $x \in \mathcal{X}_C$ to $x \in \mathcal{X}$ in the theorem. This is because Equation (2) holds trivially for $x \notin \mathcal{X}_C$.
- Use the identification result of Equation (2) to test $H_0: F_{U_1 \mid C,X}(\cdot \mid x) = F_{U_0 \mid C,X}(\cdot \mid x)$.
- Use the identification result of Equation (1) to estimate $F_{U_1 \mid C,X}(\cdot \mid x) - F_{U_0 \mid C,X}(\cdot \mid x)$.

Identification. Endogenous Treatment. Mean Test

Theorem 1 also gives identification of specific features of the potential rank distribution, such as the mean.

- Rank similarity implies $F_{U_1 \mid C,X}(\tau \mid x) = F_{U_0 \mid C,X}(\tau \mid x)$, which further implies $E[U_1 \mid C, X = x] = E[U_0 \mid C, X = x]$.
- $E[U_1 \mid C, X = x] = E[U_0 \mid C, X = x]$ holds if and only if $E[U \mid Z = 1, X = x] = E[U \mid Z = 0, X = x]$, where
$$U \equiv T U_1 + (1 - T) U_0 = \int_0^1 1\big( (T q_{1 \mid C}(\tau) + (1 - T) q_{0 \mid C}(\tau)) < Y \big)\, d\tau = 1 - \int_0^1 I(\tau)\, d\tau.$$
- $U$ is identified because $I(\tau)$ is identified.
- $E[U_1 \mid C, X = x] - E[U_0 \mid C, X = x]$ represents the average rank change for each subpopulation.
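The integral representation of $U$ on this slide can be checked in one step. The sketch below assumes that the complier outcome distribution functions, written here as $F_{t \mid C}$ (notation introduced only for this illustration), are continuous and strictly increasing, so that $q_{t \mid C}(\tau) < y$ is equivalent to $\tau < F_{t \mid C}(y)$.

```latex
\begin{aligned}
\text{For } T = t:\quad
\int_0^1 1\{ q_{t \mid C}(\tau) < Y \}\, d\tau
  &= \int_0^1 1\{ \tau < F_{t \mid C}(Y) \}\, d\tau
   = F_{t \mid C}(Y) = U_t, \\
\int_0^1 I(\tau)\, d\tau
  &= \int_0^1 1\{ Y \le q_{t \mid C}(\tau) \}\, d\tau
   = 1 - F_{t \mid C}(Y) = 1 - U_t .
\end{aligned}
```

Adding the two lines gives $U = 1 - \int_0^1 I(\tau)\, d\tau$, which is why identification of $I(\tau)$ carries over to $U$.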

Identification. Endogenous Treatment. Star Project

[Figure: Star Project rank distributions of total score. Top row: small classes vs. regular classes; bottom row: regular class with aid vs. regular classes. Panels in each row: rank distributions by race (nonwhite, white) and by gender (boy, girl). Horizontal axis: rank of total score; vertical axis: probability.]

Identification. Endogenous Treatment. Empirical Example: JTPA

[Figure: JTPA potential rank distributions. Panels: female, by education; female, by employment last year; male, by education; male, by employment last year. Horizontal axis: rank (0 to 100); vertical axis: probability. Series: <HS and HS, treatment and control; <13 weeks and >=13 weeks, treatment and control.]

Identification. Endogenous Treatment. Null Hypothesis and Test Statistic

Let $\mathcal{X} = \{x_1, x_2, \ldots, x_J\}$ and $\Omega = \{\tau_1, \tau_2, \ldots, \tau_K\}$.

$$H_0: m^0_j(\tau_k) = m^1_j(\tau_k) \quad \text{for } j = 1, \ldots, J - 1 \text{ and } k = 1, \ldots, K,$$
where, for $z = 0, 1$,
$$m^z_j(\tau_k) \equiv E\big[ 1\big( Y \le T q_{1 \mid C}(\tau_k) + (1 - T) q_{0 \mid C}(\tau_k) \big) \mid Z = z, X = x_j \big].$$

The estimator is
$$\hat m^z_j(\tau_k) = \frac{1}{n^z_j} \sum_{Z_i = z,\, X_i = x_j} 1\big( Y_i \le T_i \hat q_{1 \mid C}(\tau_k) + (1 - T_i) \hat q_{0 \mid C}(\tau_k) \big),$$
with
$$\big( \hat q_{0 \mid C}(\tau_k), \hat q_{1 \mid C}(\tau_k) \big) = \arg\min_{q_0, q_1} \frac{1}{n} \sum_{i=1}^{n} \rho_{\tau_k}\big( Y_i - q_0 (1 - T_i) - q_1 T_i \big)\, \hat\omega_i,$$
where
$$\hat\omega_i = \left( \frac{Z_i}{\hat\pi(X_i)} - \frac{1 - Z_i}{1 - \hat\pi(X_i)} \right) (2 T_i - 1),$$
$\hat\pi(x)$ is a consistent estimator of $\pi(x) = \Pr(Z = 1 \mid X = x)$, and $n^z_j = \sum_{i=1}^{n} 1(Z_i = z, X_i = x_j)$.

Wald-type test:
$$W \equiv n\, (\hat m^1 - \hat m^0)^\top \hat V^{-1} (\hat m^1 - \hat m^0).$$
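To make the building blocks of the test statistic concrete, here is a small sketch of the observation weight and the cell means of the rank indicator. It assumes the complier quantiles at one $\tau$ have already been estimated and are passed in as plain numbers; the weighted quantile regression itself and the bootstrap for $\hat V$ are not shown. The function names and the toy data are hypothetical.

```python
def iv_weight(z, t, pi_x):
    """Weight (Z/pi - (1 - Z)/(1 - pi)) * (2T - 1) for one observation."""
    return (z / pi_x - (1 - z) / (1 - pi_x)) * (2 * t - 1)


def cell_rank_means(y, t, z, x, q0, q1):
    """Average the rank indicator I(tau) within each (Z, X) cell.

    q0 and q1 are plug-in complier quantiles at a single tau, taken as given.
    """
    totals, counts = {}, {}
    for yi, ti, zi, xi in zip(y, t, z, x):
        # Treated units are compared with q1, untreated units with q0.
        threshold = q1 if ti == 1 else q0
        key = (zi, xi)
        totals[key] = totals.get(key, 0) + (1 if yi <= threshold else 0)
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in counts}


# Toy sample (hypothetical values) with two covariate cells, "a" and "b".
y = [1, 5, 2, 8, 3, 9, 4, 7]
t = [0, 1, 0, 1, 0, 0, 1, 1]
z = [0, 1, 0, 1, 1, 0, 1, 0]
x = ["a", "a", "a", "a", "b", "b", "b", "b"]
means = cell_rank_means(y, t, z, x, q0=2.5, q1=7.5)

# Gap m^1_j - m^0_j in each covariate cell; these gaps enter the Wald statistic.
gaps = {xj: means[(1, xj)] - means[(0, xj)] for xj in ("a", "b")}
```

Stacking these gaps over cells and quantiles gives the vector $\hat m^1 - \hat m^0$ that the Wald-type statistic weights by the inverse variance estimate.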

Identification. Endogenous Treatment. Assumption for Asymptotic Properties

Assumption 3.
1. i.i.d. data: the data $(Y_i, T_i, Z_i, X_i)$ for $i = 1, \ldots, n$ are a random sample of size $n$ from $(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, \ldots, \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed with positive density in a neighborhood of $q_{0 \mid C}(\tau)$ and $q_{1 \mid C}(\tau)$ in the subpopulation of compliers.
3. For all $j = 1, \ldots, J$, $\hat\pi(x_j)$ is consistent, or $\hat\pi(x_j) \to_p \pi(x_j)$.
4. Let $f_{Y \mid T,Z,X}$ be the conditional density of $Y$ given $T$, $Z$ and $X$. For all $t, z = 0, 1$, $j = 1, \ldots, J$ and $\tau \in \Omega$, $f_{Y \mid T,Z,X}(y \mid t, z, x_j)$ has a bounded first derivative with respect to $y$ in a neighborhood of $q_{t \mid C}(\tau)$. Let $f_{Y \mid X}(y \mid x)$ be the conditional density of $Y$ given $X$. For all $\tau \in \Omega$ and $j = 1, \ldots, J$, $f_{Y \mid X}(\cdot \mid x_j)$ is positive and bounded in a neighborhood of $q_{t \mid C}(\tau)$.

Identification. Endogenous Treatment. Asymptotics

Let $\hat m^z$, $z = 0, 1$, be the $K(J - 1)$-dimensional vector whose $(K(j - 1) + k)$-th element is $\hat m^z_j(\tau_k)$.

Theorem 2. Given Assumptions 1 and 3,
$$\sqrt{n}\, \big( \hat m^1 - \hat m^0 - (m^1 - m^0) \big) \Rightarrow N(0, V),$$
where $V$ is the $K(J - 1) \times K(J - 1)$ asymptotic variance-covariance matrix. The $(K(j - 1) + k,\ K(j' - 1) + k')$-th element of $V$ is equal to
$$E\big[ \big( \phi^1_j(\tau_k) - \phi^0_j(\tau_k) \big) \big( \phi^1_{j'}(\tau_{k'}) - \phi^0_{j'}(\tau_{k'}) \big) \big],$$
with
$$\begin{aligned}
\phi^z_j(\tau_k) \equiv \phi^z_j(\tau_k; Y, T, Z, X)
&= \frac{I(\tau_k) - m^z_j(\tau_k)}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
&\quad - \frac{f_{Y \mid T,Z,X}(q_{0 \mid C}(\tau_k) \mid 0, z, x_j)\, (1 - p_{T \mid Z,X}(z, x_j))}{P_c\, f_{0 \mid C}(q_{0 \mid C}(\tau_k))}\, \psi_0(Y, T, Z, X) \\
&\quad - \frac{f_{Y \mid T,Z,X}(q_{1 \mid C}(\tau_k) \mid 1, z, x_j)\, p_{T \mid Z,X}(z, x_j)}{P_c\, f_{1 \mid C}(q_{1 \mid C}(\tau_k))}\, \psi_1(Y, T, Z, X),
\end{aligned}$$
where $\psi_0(Y, T, Z, X)$ and $\psi_1(Y, T, Z, X)$ are defined in the proof of Theorem 7 in Frolich and Melly (2007), and restated in the proof of this theorem in the Appendix.

Identification. Endogenous Treatment. Asymptotic Properties of the Test

Recall the Wald-type test statistic
$$W \equiv n\, (\hat m^1 - \hat m^0)^\top \hat V^{-1} (\hat m^1 - \hat m^0) \Rightarrow \chi^2(K(J - 1)) \quad \text{under the null.}$$

- Bootstrap $\hat V$.
- If $q_{0 \mid C}(\tau_k)$ and $q_{1 \mid C}(\tau_k)$ were known, $\phi^z_j(\tau_k; Y, T, Z, X)$ would reduce to
$$\frac{I(\tau_k) - m^z_j(\tau_k)}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j),$$
and the $(K(j - 1) + k,\ K(j' - 1) + k')$-th element of $V$ would be equal to
$$\sum_{z = 0, 1} \big[ m^z_j(\tau_k \wedge \tau_{k'}) - m^z_j(\tau_k)\, m^z_j(\tau_{k'}) \big] \ \text{if } j = j', \quad \text{and } 0 \text{ if } j \neq j'.$$
- If $J$ is very large, then the first-stage estimation error may be ignored and one can construct $\hat V$ by the analytic formula. This is discussed in the extensions, where $J \to \infty$ or $X$ includes continuous variables.
- The critical value $c_\alpha$ is the $(1 - \alpha) \times 100$-th percentile of the $\chi^2(K(J - 1))$ distribution.
- The test is consistent for the null hypothesis $H_0$.
- Once again, the test does NOT test the unobservable part (e.g., $V$ or ability) of the rank invariance assumption.

Identification. Endogenous Treatment. The Mean Rank Similarity Test

Let $m^z_j = E[U \mid Z = z, X = x_j]$ for $z = 0, 1$.
$$H_{0,mean}: m^0_j = m^1_j \quad \text{for all } j = 1, \ldots, J - 1.$$

Let $\{\tau_s\}_{s=1}^{S}$ be $S$ random draws from $U(0, 1)$. $\hat U_i \equiv T_i \hat U_{1i} + (1 - T_i) \hat U_{0i}$, for $i = 1, \ldots, n$, can be estimated by
$$\hat U_i = \frac{1}{S} \sum_{s=1}^{S} 1\big( (T_i \hat q_{1 \mid C}(\tau_s) + (1 - T_i) \hat q_{0 \mid C}(\tau_s)) \le Y_i \big),$$
and $m^z_j$ can then be estimated by
$$\tilde m^z_j = \frac{1}{n^z_j} \sum_{Z_i = z,\, X_i = x_j} \hat U_i.$$
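The Monte Carlo step for the estimated rank can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the quantile functions `q0` and `q1` are hypothetical stand-ins for the estimated complier quantile functions (uniform outcomes without treatment, shifted up by one unit with treatment), and the number of draws and the seed are arbitrary.

```python
import random


def estimated_rank(y_i, t_i, q0, q1, draws):
    """Share of draws tau_s whose group-specific quantile lies at or below y_i."""
    hits = 0
    for tau in draws:
        quantile = q1(tau) if t_i == 1 else q0(tau)
        hits += 1 if quantile <= y_i else 0
    return hits / len(draws)


# S random draws from U(0, 1); S = 20000 and the seed are arbitrary choices.
rng = random.Random(0)
draws = [rng.random() for _ in range(20000)]


def q0(tau):
    """Hypothetical untreated quantile function: Uniform(0, 1) outcomes."""
    return tau


def q1(tau):
    """Hypothetical treated quantile function: the same outcomes shifted by 1."""
    return tau + 1.0


rank_control = estimated_rank(0.3, 0, q0, q1, draws)   # close to 0.3
rank_treated = estimated_rank(1.3, 1, q0, q1, draws)   # also close to 0.3
```

Averaging these estimated ranks within each $(Z, X)$ cell gives the cell means used in the mean rank test.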

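Identification. Endogenous Treatment. The Mean Rank Similarity Test

Corollary 3. Suppose Assumptions 1 and 3 hold for $\Omega = (0, 1)$. Under the null hypothesis where $m^1 = m^0$, when $S \to \infty$ and $n \to \infty$,
$$\sqrt{n}\, (\tilde m^1 - \tilde m^0) \Rightarrow N(0, V_{mean}),$$
where $V_{mean}$ is the $(J - 1) \times (J - 1)$ asymptotic variance-covariance matrix. The $(j, j')$-th element of $V_{mean}$ is
$$E\left[ \left( \int_0^1 \phi^1_j(\tau)\, d\tau - \int_0^1 \phi^0_j(\tau)\, d\tau \right) \left( \int_0^1 \phi^1_{j'}(\tau)\, d\tau - \int_0^1 \phi^0_{j'}(\tau)\, d\tau \right) \right],$$
where
$$\begin{aligned}
\int_0^1 \phi^z_j(\tau)\, d\tau
&= \frac{U - m^z_j}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
&\quad - \int_0^1 \frac{f_{Y \mid T,Z,X}(q_{0 \mid C}(\tau) \mid 0, z, x_j)\, (1 - p_{T \mid Z,X}(z, x_j))}{f_{0 \mid C}(q_{0 \mid C}(\tau))\, P_c}\, \psi_0(Y, T, Z, X)\, d\tau \\
&\quad - \int_0^1 \frac{f_{Y \mid T,Z,X}(q_{1 \mid C}(\tau) \mid 1, z, x_j)\, p_{T \mid Z,X}(z, x_j)}{f_{1 \mid C}(q_{1 \mid C}(\tau))\, P_c}\, \psi_1(Y, T, Z, X)\, d\tau.
\end{aligned}$$

A Wald-type test statistic is then
$$W_{mean} \equiv n\, (\tilde m^1 - \tilde m^0)^\top \tilde V^{-1} (\tilde m^1 - \tilde m^0) \Rightarrow \chi^2(J - 1)$$
as $N \to \infty$, $J \to \infty$, $N/J \to \infty$.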

Simulation and JTPA. Simulation. Simulation: DGPs

DGP:
- $Y_0 = X + V + S_0$,
- $Y_1 = X + V + (1 - bXV) + S_1$,
- $Y = Y_1 T + Y_0 (1 - T)$,
- $\Pr(X = 0.4j) = 1/5$ for $j = 1, \ldots, 5$,
- $V, S_0, S_1 \sim N(0, 1)$ and $b = 0, 2$.

Exogenous treatment: $\Pr(T = t) = \tfrac{1}{2}$, $t = 0, 1$.

Endogenous treatment: $\Pr(Z = z) = \tfrac{1}{2}$, $z = 0, 1$, and $T = 1\big( 0.15 (Y_1 - Y_0) + Z - 0.5 > 0 \big)$.

Rank similarity holds when $b = 0$ but not when $b \neq 0$.
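The data-generating process on this slide is simple enough to simulate directly. The sketch below makes two assumptions that should be read as such: the treated outcome is taken to be $Y_1 = X + V + (1 - bXV) + S_1$, and $V$, $S_0$, $S_1$ are drawn independently of each other and of $X$ and $Z$. The function name `simulate` and the returned field names are hypothetical.

```python
import random


def simulate(n, b, endogenous, seed=0):
    """Draw n observations from the slide's DGP (assumed reading, see above)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 0.4 * rng.randint(1, 5)           # Pr(X = 0.4 j) = 1/5 for j = 1..5
        v, s0, s1 = (rng.gauss(0, 1) for _ in range(3))
        y0 = x + v + s0
        y1 = x + v + (1 - b * x * v) + s1     # assumed form of the treated outcome
        z = rng.randint(0, 1)                 # Pr(Z = 1) = 1/2 (unused if exogenous)
        if endogenous:
            # Selection on gains: take-up depends on Y1 - Y0 and the instrument.
            t = 1 if 0.15 * (y1 - y0) + z - 0.5 > 0 else 0
        else:
            t = rng.randint(0, 1)             # Pr(T = 1) = 1/2
        y = y1 * t + y0 * (1 - t)             # observed outcome
        rows.append({"y": y, "t": t, "z": z, "x": x, "y0": y0, "y1": y1})
    return rows


sample = simulate(200, b=2, endogenous=True, seed=1)
```

The potential outcomes are kept in each row only so that the simulated ranks can be inspected; an applied test would use just `y`, `t`, `z` and `x`.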

Simulation and JTPA. Simulation. Illustration of DGPs: Exogenous Treatment

[Figure: Conditional distributions of potential ranks. Panels: b=0, b=1, b=2, b=3. Horizontal axis: potential rank; vertical axis: conditional CDF. Series in each panel: Y(1) and Y(0) for X = 0.4, 0.8, 1.2, 1.6, 2.0.]

Simulation and JTPA. Simulation. Simulation Results: Exogenous Treatment

Rejection rates by sample size N:

| b | Test | N = 500 | 1000 | 1500 | 2000 | 2500 |
|---|---|---|---|---|---|---|
| 0 | Test 1: Ω = {0.5} | 0.034 | 0.039 | 0.051 | 0.040 | 0.053 |
| 0 | Test 2: Ω = {0.2, 0.3, 0.4} | 0.013 | 0.013 | 0.025 | 0.021 | 0.023 |
| 0 | Test 3: Ω = {0.5, 0.6, 0.7, 0.8} | 0.014 | 0.014 | 0.023 | 0.023 | 0.018 |
| 0 | Test 4: Ω = {0.2, 0.3, ..., 0.8} | 0.006 | 0.010 | 0.013 | 0.013 | 0.013 |
| 0 | Test 5: Mean Test | 0.051 | 0.044 | 0.048 | 0.041 | 0.067 |
| 1 | Test 1: Ω = {0.5} | 0.047 | 0.059 | 0.101 | 0.120 | 0.146 |
| 1 | Test 2: Ω = {0.2, 0.3, 0.4} | 0.018 | 0.038 | 0.065 | 0.099 | 0.152 |
| 1 | Test 3: Ω = {0.5, 0.6, 0.7, 0.8} | 0.022 | 0.050 | 0.134 | 0.148 | 0.260 |
| 1 | Test 4: Ω = {0.2, 0.3, ..., 0.8} | 0.009 | 0.044 | 0.102 | 0.144 | 0.283 |
| 1 | Test 5: Mean Test | 0.063 | 0.092 | 0.144 | 0.176 | 0.251 |
| 2 | Test 1: Ω = {0.5} | 0.074 | 0.150 | 0.232 | 0.303 | 0.388 |
| 2 | Test 2: Ω = {0.2, 0.3, 0.4} | 0.269 | 0.776 | 0.968 | 0.994 | 1.000 |
| 2 | Test 3: Ω = {0.5, 0.6, 0.7, 0.8} | 0.151 | 0.581 | 0.857 | 0.962 | 0.991 |
| 2 | Test 4: Ω = {0.2, 0.3, ..., 0.8} | 0.287 | 0.910 | 0.996 | 1.000 | 1.000 |
| 2 | Test 5: Mean Test | 0.103 | 0.213 | 0.278 | 0.424 | 0.500 |
| 3 | Test 1: Ω = {0.5} | 0.143 | 0.335 | 0.512 | 0.640 | 0.800 |
| 3 | Test 2: Ω = {0.2, 0.3, 0.4} | 0.817 | 0.999 | 1.000 | 1.000 | 1.000 |
| 3 | Test 3: Ω = {0.5, 0.6, 0.7, 0.8} | 0.306 | 0.880 | 0.992 | 1.000 | 1.000 |
| 3 | Test 4: Ω = {0.2, 0.3, ..., 0.8} | 0.836 | 0.999 | 1.000 | 1.000 | 1.000 |
| 3 | Test 5: Mean Test | 0.340 | 0.659 | 0.853 | 0.941 | 0.971 |

[Figure, two panels: rejection rate against b (0 to 3) at sample size 1000, and rejection rate against sample size (500 to 2500) at b = 2. Series: Distributional Tests 1 to 4 and the Mean Test.]

Simulation and JTPA. Simulation. Simulation Results: Endogenous Treatment

Rejection rates by sample size N:

| b | Test | N = 500 | 1000 | 1500 | 2000 | 2500 |
|---|---|---|---|---|---|---|
| 0 | Ω = {0.5} | 0.025 | 0.036 | 0.041 | 0.038 | 0.057 |
| 0 | Ω = {0.2, 0.3, 0.4} | 0.012 | 0.012 | 0.018 | 0.017 | 0.025 |
| 0 | Ω = {0.5, 0.6, 0.7, 0.8} | 0.006 | 0.013 | 0.016 | 0.022 | 0.015 |
| 0 | Ω = {0.2, 0.3, ..., 0.8} | 0.002 | 0.010 | 0.006 | 0.010 | 0.008 |
| 0 | Mean Test | 0.054 | 0.050 | 0.051 | 0.045 | 0.057 |
| 1 | Ω = {0.5} | 0.036 | 0.040 | 0.064 | 0.073 | 0.085 |
| 1 | Ω = {0.2, 0.3, 0.4} | 0.013 | 0.022 | 0.040 | 0.065 | 0.107 |
| 1 | Ω = {0.5, 0.6, 0.7, 0.8} | 0.006 | 0.016 | 0.034 | 0.047 | 0.074 |
| 1 | Ω = {0.2, 0.3, ..., 0.8} | 0.003 | 0.015 | 0.029 | 0.057 | 0.094 |
| 1 | Mean Test | 0.037 | 0.050 | 0.054 | 0.063 | 0.068 |
| 2 | Ω = {0.5} | 0.084 | 0.242 | 0.379 | 0.522 | 0.615 |
| 2 | Ω = {0.2, 0.3, 0.4} | 0.170 | 0.589 | 0.870 | 0.965 | 0.993 |
| 2 | Ω = {0.5, 0.6, 0.7, 0.8} | 0.021 | 0.150 | 0.340 | 0.600 | 0.764 |
| 2 | Ω = {0.2, 0.3, ..., 0.8} | 0.053 | 0.431 | 0.823 | 0.960 | 0.993 |
| 2 | Mean Test | 0.152 | 0.322 | 0.481 | 0.622 | 0.709 |
| 3 | Ω = {0.5} | 0.113 | 0.293 | 0.441 | 0.617 | 0.700 |
| 3 | Ω = {0.2, 0.3, 0.4} | 0.284 | 0.783 | 0.975 | 1.000 | 1.000 |
| 3 | Ω = {0.5, 0.6, 0.7, 0.8} | 0.020 | 0.198 | 0.450 | 0.704 | 0.865 |
| 3 | Ω = {0.2, 0.3, ..., 0.8} | 0.093 | 0.634 | 0.949 | 1.000 | 1.000 |
| 3 | Mean Test | 0.191 | 0.441 | 0.602 | 0.772 | 0.843 |

[Figure, two panels: rejection rate against b (0 to 3) at sample size 1000, and rejection rate against sample size (500 to 2500) at b = 2. Series: Distributional Tests 1 to 4 and the Mean Test.]

Empirical Examples: Testing Results. Star. Testing: the Exogenous Treatment Case Conclusions

Table: Star Project: Test Results

| Treatment type | | Total Test Score | Birthday (1st-31st) |
|---|---|---|---|
| Small Class vs. Regular Class | Test Stat | 232 | 21.23 |
| | P-value | 0.033 | 0.439 |
| Aid Class vs. Regular Class | Test Stat | 266 | 13.25 |
| | P-value | 0.001 | 0.905 |

- Both treatments (small class and regular class with aid) improve the rank distribution of the disadvantaged (boy, nonwhite).
- Assigning a teaching aid to the regular class systematically changes students' ranks. Researchers may want to reconsider the practice of using both regular classes with and without aid as the control group in analysis.

Empirical Examples: Testing Results. Star. Empirical Example: JTPA

- $Y$ = 30-month earnings following assignment
- $T$ = receiving training services
- $Z$ = random assignment indicator
- $X$ = black, Hispanic, HS or GED, married, worked at least 13 weeks the year before, AFDC receipt (for women only), and 5 age category dummies (Abadie, Angrist and Imbens, 2002).

Empirical Examples: Testing Results. JTPA. JTPA: First-Stage Unconditional LQTE Estimation

Table: First-stage estimates of unconditional QTEs of training on trainee earnings

| Quantile | Female Y0 | Female QTE (s.e.) | Male Y0 | Male QTE (s.e.) |
|---|---|---|---|---|
| 0.15 | 195 | 291 (341.88) | 1,462 | 249 (713.36) |
| 0.20 | 723 | 714 (358.31)* | 2,733 | 390 (723.01) |
| 0.25 | 1,458 | 1,200 (372.08)*** | 4,434 | 489 (746.85) |
| 0.30 | 2,463 | 1,380 (399.21)*** | 6,993 | 340 (891.74) |
| 0.35 | 3,784 | 1,705 (497.01)*** | 8,836 | 594 (1,042.40) |
| 0.40 | 5,271 | 1,974 (669.75)*** | 11,010 | 723 (1,104.63) |
| 0.45 | 6,726 | 2,451 (766.25)*** | 13,104 | 1,069 (1,144.28) |
| 0.50 | 8,685 | 2,436 (829.29)*** | 15,374 | 1,291 (1,234.59) |
| 0.55 | 11,007 | 2,089 (877.56)** | 17,357 | 2,239 (1,295.79)* |
| 0.60 | 12,618 | 2,729 (886.96)*** | 20,409 | 2,118 (1,418.40) |
| 0.65 | 14,682 | 2,943 (920.45)*** | 23,342 | 2,319 (1,557.00) |
| 0.70 | 16,971 | 2,772 (1,027.14)*** | 27,169 | 1,780 (1,606.66) |
| 0.75 | 20,252 | 2,106 (1,152.35)* | 30,439 | 2,408 (1,641.47) |
| 0.80 | 23,064 | 2,331 (1,149.71)** | 34,620 | 2,800 (1,701.90)* |
| 0.85 | 26,735 | 1,762 (1,179.91) | 39,233 | 3,955 (1,886.98)** |

Note: Standard errors are in parentheses. All estimates control for covariates including dummies for black, Hispanic, high-school graduates (including GED holders), marital status, whether the applicant worked at least 12 weeks in the 12 months preceding random assignment, and AFDC receipt (for women only), as well as 5 age group dummies. * significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level.

Empirical Examples: Testing Results. JTPA. JTPA: Joint Test

Table: Rank similarity test jointly at all quantiles

| | Female I (1) | Female I (2) | Female II (1) | Female II (2) | Male I (1) | Male I (2) | Male II (1) | Male II (2) |
|---|---|---|---|---|---|---|---|---|
| Panel A: Dependent Var. Earnings | | | | | | | | |
| χ² | 7,652.1 | 7,763.8 | 1,197.2 | 1,177.8 | 2,780.7 | 2,719.0 | 886.1 | 876.8 |
| (p-value) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| d.f. | 1,544 | 1,544 | 723 | 723 | 1,218 | 1,218 | 570 | 570 |
| Panel B: Falsification test (Dependent Var. Age) | | | | | | | | |
| χ² | 478.8 | 471.9 | 252.0 | 259.9 | 209.3 | 203.5 | 124.7 | 123.0 |
| (p-value) | (0.926) | (0.953) | (0.366) | (0.245) | (1.000) | (1.000) | (0.977) | (0.982) |
| d.f. | 525 | 525 | 245 | 245 | 338 | 338 | 158 | 158 |

Note: Results are based on the Chi-squared test in Theorem 2. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values are in parentheses. Columns I report a joint test at 15 equally spaced quantiles from 0.15 to 0.85; Columns II report a joint test at 7 equally spaced quantiles from 0.20 to 0.80. (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not. X values with fewer than 5 observations when either Z = 0 or Z = 1 are not used in the test, to ensure the common support assumption.

Empirical Examples: Testing Results. JTPA. JTPA: Test Individual Quantiles

Table: Rank similarity test at individual quantiles. Entries are χ² statistics with p-values in parentheses. Panel A: Dependent Var. Earnings. Panel B: Falsification test (Dependent Var. Age).

| Quantile | Panel A: Female | Panel A: Male | Panel B: Female | Panel B: Male |
|---|---|---|---|---|
| 0.15 | 134.4 (0.012) | 103.8 (0.045) | 43.9 (0.144) | 19.4 (0.561) |
| 0.20 | 143.0 (0.004) | 113.3 (0.010) | 37.9 (0.340) | 22.1 (0.391) |
| 0.25 | 126.2 (0.060) | 107.8 (0.025) | 26.0 (0.863) | 13.9 (0.907) |
| 0.30 | 131.9 (0.034) | 104.7 (0.039) | 26.9 (0.834) | 15.0 (0.861) |
| 0.35 | 147.2 (0.003) | 95.8 (0.142) | 22.1 (0.956) | 17.9 (0.712) |
| 0.40 | 118.3 (0.160) | 88.6 (0.291) | 31.1 (0.659) | 23.2 (0.447) |
| 0.45 | 107.5 (0.387) | 110.7 (0.019) | 32.1 (0.611) | 22.4 (0.497) |
| 0.50 | 110.9 (0.304) | 113.6 (0.012) | 32.3 (0.599) | 19.2 (0.692) |
| 0.55 | 112.6 (0.266) | 110.9 (0.019) | 30.8 (0.673) | 19.6 (0.664) |
| 0.60 | 112.1 (0.276) | 112.3 (0.015) | 32.7 (0.581) | 22.3 (0.503) |
| 0.65 | 121.7 (0.113) | 105.0 (0.044) | 29.4 (0.734) | 18.4 (0.735) |
| 0.70 | 108.0 (0.375) | 106.1 (0.038) | 36.7 (0.388) | 24.0 (0.402) |
| 0.75 | 130.4 (0.035) | 109.7 (0.018) | 45.4 (0.112) | 16.5 (0.831) |
| 0.80 | 118.4 (0.128) | 116.5 (0.005) | 47.7 (0.074) | 17.1 (0.802) |
| 0.85 | 92.3 (0.697) | 118.7 (0.002) | 44.7 (0.125) | 18.7 (0.716) |

Note: Results are based on the Chi-squared test in Theorem 2. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values are in parentheses. Covariates are controlled for in the first-stage unconditional QTE estimation. X values with fewer than 5 observations when either Z = 1 or Z = 0 are not used in the test, to ensure the common support assumption.

Empirical Examples: Testing Results. JTPA. JTPA: Test the Mean Rank

Table: Rank similarity test for the mean rank only

| | Female (1) | Female (2) | Male (1) | Male (2) |
|---|---|---|---|---|
| Panel A: Dependent Var. Earnings | | | | |
| χ² (p-value) | 123.1 (0.098) | 123.1 (0.098) | 115.2 (0.009) | 115.2 (0.009) |
| d.f. | 104 | 104 | 82 | 82 |
| Panel B: Falsification test (Dependent Var. Age) | | | | |
| χ² (p-value) | 30.6 (0.683) | 30.6 (0.683) | 18.4 (0.736) | 18.4 (0.736) |
| d.f. | 35 | 35 | 23 | 23 |

Note: Results are based on the Chi-squared test for the mean ranks only. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values are in parentheses. (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not. X values with fewer than 5 observations when either Z = 1 or Z = 0 are not used in the test, to ensure the common support assumption.

Conclusion:
- Training causes some individuals to systematically change their ranks in the earnings distribution.
- One should be cautious in equating the distributional impacts of training with the true effects on individual trainees.
- Results largely agree with Heckman, Smith and Clements (1997): "perfect positive dependence across potential outcome distributions... not credible."

Extensions
Extension I: Covariates with Large Support

Assume that $J \to \infty$ as $n \to \infty$.

Assumption 4
1. i.i.d. data: the data $\{Y_i, T_i, Z_i, X_i\}$ for $i = 1, \dots, n$ is a random sample of size $n$ of $(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, \dots, \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed with positive density in a neighborhood of $q_{0|C}(\tau)$ and $q_{1|C}(\tau)$ in the subpopulation of compliers.
3. Let $n_j = \sum_{i=1}^{n} 1(X_i = x_j)$. Then $n_j \asymp n/J$ uniformly over $j$, i.e., there exist $0 < c \le C < \infty$ such that $c\,n/J \le n_j \le C\,n/J$ for all $j = 1, \dots, J$.
4. $\hat\pi(x_j)$ is uniformly consistent, or $\sup_{j=1,\dots,J} |\hat\pi(x_j) - \pi(x_j)| \to_p 0$ as $n, J \to \infty$ and $n/J \to \infty$.
5. For all $t, z = 0, 1$, $j = 1, \dots, J$ and $\tau \in \Omega$, $f_{Y|T,Z,X}(\cdot \mid t, z, x_j)$ is bounded in a neighborhood of $q_{t|C}(\tau)$. For all $\tau \in \Omega$ and $j = 1, \dots, J$, $f_{Y|X}(\cdot \mid x_j)$ is positive and bounded in a neighborhood of $q_{t|C}(\tau)$.

Extension I: Covariates with Large Support (continued)

Let $\hat m_j^z = (\hat m_j^z(\tau_1), \dots, \hat m_j^z(\tau_K))'$ and $m_j^z = (m_j^z(\tau_1), \dots, m_j^z(\tau_K))'$ be $K \times 1$ vectors.

Corollary 4
Given Assumptions 1 and 4, we have
\[
\sqrt{\frac{n_j^1 n_j^0}{n_j^1 + n_j^0}} \left( \left(\hat m_j^1 - \hat m_j^0\right) - \left(m_j^1 - m_j^0\right) \right) \Rightarrow Z_j \sim N(0, V_j),
\]
where $Z_j$ for $j = 1, \dots, J$ follow independent multivariate normal distributions; the $(k, k')$-th element of the $K \times K$ variance-covariance matrix $V_j$ is
\[
V_{j;k,k'} = \pi(x_j)\, m_j^1(\tau_k \wedge \tau_{k'}) \left(1 - m_j^1(\tau_k \vee \tau_{k'})\right) + (1 - \pi(x_j))\, m_j^0(\tau_k \wedge \tau_{k'}) \left(1 - m_j^0(\tau_k \vee \tau_{k'})\right).
\]
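
The product form of each term in $V_j$ can be motivated by a short covariance calculation. The sketch below assumes that $m_j^z(\tau) = E[I(\tau) \mid Z = z, X = x_j]$ for a rank indicator $I(\tau)$ that is nondecreasing in $\tau$, so that $I(\tau_k) I(\tau_{k'}) = I(\tau_k \wedge \tau_{k'})$; reading the two arguments as the minimum and the maximum of $\tau_k$ and $\tau_{k'}$ is part of that assumption.
\[
\begin{aligned}
\mathrm{Cov}\left(I(\tau_k), I(\tau_{k'}) \mid Z = z, X = x_j\right)
&= E\left[I(\tau_k) I(\tau_{k'}) \mid Z = z, X = x_j\right] - m_j^z(\tau_k)\, m_j^z(\tau_{k'}) \\
&= m_j^z(\tau_k \wedge \tau_{k'}) - m_j^z(\tau_k \wedge \tau_{k'})\, m_j^z(\tau_k \vee \tau_{k'}) \\
&= m_j^z(\tau_k \wedge \tau_{k'}) \left(1 - m_j^z(\tau_k \vee \tau_{k'})\right).
\end{aligned}
\]
For $k = k'$ this reduces to the usual Bernoulli variance $m_j^z(\tau_k)(1 - m_j^z(\tau_k))$.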

Extension I: Covariates with Large Support (continued)

For each $j = 1, \dots, J$, define the Wald-type statistic
\[
w_j = \frac{n_j^1 n_j^0}{n_j^1 + n_j^0} \left(\hat m_j^1 - \hat m_j^0\right)' \hat V_j^{-1} \left(\hat m_j^1 - \hat m_j^0\right),
\]
where $\hat V_j$ is a consistent estimator of $V_j$. The $(k, k')$-th element of $\hat V_j$ is
\[
\hat V_{j;k,k'} = \frac{n_j^0}{n_j^0 + n_j^1}\, \hat m_j^1(\tau_k \wedge \tau_{k'}) \left(1 - \hat m_j^1(\tau_k \vee \tau_{k'})\right) + \frac{n_j^1}{n_j^0 + n_j^1}\, \hat m_j^0(\tau_k \wedge \tau_{k'}) \left(1 - \hat m_j^0(\tau_k \vee \tau_{k'})\right).
\]
The test statistic is then
\[
W_{\text{large}J} = \frac{\sum_{j=1}^{J-1} w_j - K(J-1)}{\sqrt{2K(J-1)}} \Rightarrow N(0, 1).
\]
The one-sided decision rule of the test is to reject the null hypothesis $H_0$ if $W_{\text{large}J} > c_\alpha$.
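
The large-J statistic can be sketched in a few lines of standard-library Python. This is a hedged illustration, not the authors' implementation: all function names are hypothetical, the inputs (estimated mean ranks and cell counts per X value) are assumed to be available already, and the two arguments in each variance term are taken to be the smaller and the larger of the two quantile indices.

```python
import math

def v_hat(m1, m0, n1, n0, taus):
    """Estimated K x K variance matrix for one cell j."""
    K = len(taus)
    V = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(K):
            lo, hi = (k, l) if taus[k] <= taus[l] else (l, k)
            V[k][l] = (n0 / (n0 + n1)) * m1[lo] * (1 - m1[hi]) \
                    + (n1 / (n0 + n1)) * m0[lo] * (1 - m0[hi])
    return V

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for cc in range(c, n + 1):
                M[r][cc] -= f * M[c][cc]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][cc] * x[cc] for cc in range(r + 1, n))) / M[r][r]
    return x

def wald_cell(m1, m0, n1, n0, taus):
    """Wald-type statistic w_j for one cell."""
    d = [a - b for a, b in zip(m1, m0)]
    x = solve(v_hat(m1, m0, n1, n0, taus), d)  # V^{-1} d without explicit inverse
    return (n1 * n0 / (n1 + n0)) * sum(di * xi for di, xi in zip(d, x))

def w_large_j(cells, taus):
    """Standardized sum of w_j over the supplied cells (the J - 1 cells in the sum)."""
    K, n_cells = len(taus), len(cells)
    total = sum(wald_cell(m1, m0, n1, n0, taus) for (m1, m0, n1, n0) in cells)
    return (total - K * n_cells) / math.sqrt(2 * K * n_cells)

# Toy inputs (made up for illustration): one quantile index, two identical cells.
cell = ([0.6], [0.4], 100, 100)
stat = w_large_j([cell, cell], taus=[0.5])
```

The resulting value would be compared with the critical value $c_\alpha$; the estimated variance matrix is singular if any estimated mean rank equals 0 or 1, which this sketch does not guard against.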

Extension II: Continuous Covariates

Let $m_k^z(x) = E[I(\tau_k) \mid Z = z, X = x]$ for $z = 0, 1$.

We are interested in testing $H_0: m_k^1(x) = m_k^0(x)$ for all $x \in \mathcal{X}$ and $k = 1, \dots, K$.

Apply Chernozhukov, Lee and Rosen (2013) and form a Kolmogorov-Smirnov type test statistic:
\[
KS = \sup_{k,x} \frac{\left|\hat m_k^1(x) - \hat m_k^0(x)\right|}{s_k(x)},
\]
where $\hat m_k^z(x)$ is the local linear estimator and $s_k(x)$ is the standard error of $\hat m_k^1(x) - \hat m_k^0(x)$.

Construct the critical value $c_\alpha$ by multiplier bootstrap. Let $\hat m_k^*(x)$ be a multiplier process such that
\[
\hat m_k^*(x) = \frac{\sum_{Z_i = 1} \eta_i \hat\epsilon_{k,i} K_{h_1}(X_i - x)}{\sum_{Z_i = 1} K_{h_1}(X_i - x)} - \frac{\sum_{Z_i = 0} \eta_i \hat\epsilon_{k,i} K_{h_0}(X_i - x)}{\sum_{Z_i = 0} K_{h_0}(X_i - x)},
\]
where $\{\eta_i\}_{i=1}^{N}$ is simulated from i.i.d. $N(0, 1)$ and independent of the data, and
\[
\hat\epsilon_{k,i} = 1\left(Y_i \le \hat q_{1|C}(\tau_k) T_i + \hat q_{0|C}(\tau_k)(1 - T_i)\right) - \hat m_k^1(x_i) Z_i - \hat m_k^0(x_i)(1 - Z_i).
\]
$c_\alpha$ is the $(1 - \alpha) \times 100\%$ percentile of the simulated process $\sup_{k,x} |\hat m_k^*(x)| / s_k(x)$.

Reject the null if $KS > c_\alpha$.
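
The multiplier-bootstrap critical value can be sketched as below, using only the Python standard library. Several things here are assumptions made for illustration rather than details from the slides: the Gaussian kernel, the finite evaluation grid that stands in for the supremum over x, the function names, and the premise that the residuals and the standard errors $s_k(x)$ have already been computed elsewhere.

```python
import math
import random

def gauss_kernel(u, h):
    """Gaussian kernel K_h(u) with bandwidth h (kernel choice is an assumption)."""
    return math.exp(-0.5 * (u / h) ** 2) / h

def multiplier_sup(x, z, eps, grid, s, h1, h0, eta):
    """sup over k and grid points of |m*_k(x0)| / s_k(x0) for one draw of eta.

    eps[k][i]: residual for quantile index k and observation i
    s[k][g]:   standard error for quantile index k at grid point g
    """
    sup = 0.0
    for k in range(len(eps)):
        for g, x0 in enumerate(grid):
            num = {0: 0.0, 1: 0.0}
            den = {0: 0.0, 1: 0.0}
            for i in range(len(x)):
                w = gauss_kernel(x[i] - x0, h1 if z[i] == 1 else h0)
                num[z[i]] += eta[i] * eps[k][i] * w
                den[z[i]] += w
            # Difference of kernel-weighted averages across the Z = 1 and Z = 0 arms.
            m_star = num[1] / den[1] - num[0] / den[0]
            sup = max(sup, abs(m_star) / s[k][g])
    return sup

def critical_value(x, z, eps, grid, s, h1, h0, alpha=0.05, draws=200, seed=0):
    """(1 - alpha) percentile of the simulated sup statistic."""
    rng = random.Random(seed)
    sups = sorted(
        multiplier_sup(x, z, eps, grid, s, h1, h0,
                       [rng.gauss(0.0, 1.0) for _ in x])
        for _ in range(draws)
    )
    return sups[min(draws - 1, math.ceil((1 - alpha) * draws) - 1)]
```

The null would then be rejected when the observed KS statistic exceeds the returned value. A finer grid and more draws bring the simulated supremum closer to the one in the formula, at a cost that grows linearly in both.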

Extension III: Testing for Conditional Rank Invariance or Similarity

Two main modifications are required:
1. First, estimate conditional quantiles, conditioning on some covariates $X_1$ of interest.
2. Second, use additional covariates $X_2$, other than the conditioning covariates in the first step, to perform the test.

This is feasible only when the conditioning set for the conditional quantiles is small.

E.g., we estimate quantiles of potential earnings and perform the tests separately for male and female trainees, so the tests are essentially rank similarity tests for ranks conditional on gender.

Conclusion

- We propose nonparametric tests for rank invariance or rank similarity, assumptions that are popular in program evaluation and in various QTE models.
- The tests explore whether the distribution of potential ranks (or features of it) remains the same among observationally equivalent individuals.
- Simulations show good size and power of the proposed tests in small samples.
- Empirical application to the JTPA training program:
  - Training causes some individuals to systematically change their ranks in the distribution of earnings.
  - Program effects are more complicated than suggested by standard QTEs.
  - One should be cautious in equating program impacts on the distribution of earnings with those on individual trainees.

(Incomplete) Reference List

- QTE models: Abadie, Angrist and Imbens (2002), Chesher (2003, 2005), Chernozhukov and Hansen (2005, 2006, 2008), Firpo (2007), Firpo, Fortin and Lemieux (2007), Chernozhukov, Imbens and Newey (2007), Horowitz and Lee (2007), Imbens and Newey (2009), Rothe (2010), Frölich and Melly (2013), Powell (2013), Yu (2014), etc.
- Other works on rank invariance/similarity testing:
  - Frandsen and Lefgren (2015): a parametric test for rank similarity, testing the equality of mean ranks with or without treatment.
  - Yu (2015): a test for rank invariance, assuming unconfoundedness.
- JTPA: Abadie, Angrist and Imbens (2002), Chernozhukov and Hansen (2008), Orr et al. (1996), Heckman, Smith and Clements (1997), etc.
- STAR: Krueger (1999), Krueger and Whitmore (2001), Chetty et al. (2010), Jackson and Page (2013), etc.

Thank you!