ArCo: An Artificial Counterfactual Approach for High-Dimensional Data

Size: px

Start display at page:

Download "ArCo: An Artificial Counterfactual Approach for High-Dimensional Data"

Merryl Cameron
6 years ago
Views:

1 ArCo: An Artificial Counterfactual Approach for High-Dimensional Data Carlos V. de Carvalho BCB and PUC-Rio Marcelo C. Medeiros PUC-Rio Ricardo Masini São Paulo School of Economics BIG DATA, MACHINE LEARNING AND THE MACROECONOMY Norges Bank, Oslo, 2-3 October 2017

2 Overview of the Method Observe aggregated time series data from t = 1 to T time

3 Overview of the Method Intervention occurs at t = T time T 0

4 Overview of the Method No clear controls. Observed variables from untreated peers time T 0

5 Overview of the Method Counterfactual estimation in-sample (before intervention) time T 0

6 Overview of the Method Counterfactual extrapolation (after the intervention) time T 0

7 One-page Summary Artificial Counterfactual (ArCo): alternative framework to measure the impact of an intervention in aggregate data when a control group is not readily available.

8 One-page Summary Artificial Counterfactual (ArCo): alternative framework to measure the impact of an intervention in aggregate data when a control group is not readily available. Generalization of the Synthetic Control (SC) method of Abadie and Gardeazabal (2003, AER) and the panel approach of Hsiao, Ching and Wan (2012, JAE).

9 One-page Summary Artificial Counterfactual (ArCo): alternative framework to measure the impact of an intervention in aggregate data when a control group is not readily available. Generalization of the Synthetic Control (SC) method of Abadie and Gardeazabal (2003, AER) and the panel approach of Hsiao, Ching and Wan (2012, JAE). Our contribution: estimator with ❶ asymptotic theory, ❷ rigorous inference, ❸ high-dimensionality, and ❹ new tests such as detection of the time of the intervention, contamination effects and effects on multiple moments.

10 One-page Summary Artificial Counterfactual (ArCo): alternative framework to measure the impact of an intervention in aggregate data when a control group is not readily available. Generalization of the Synthetic Control (SC) method of Abadie and Gardeazabal (2003, AER) and the panel approach of Hsiao, Ching and Wan (2012, JAE). Our contribution: estimator with ❶ asymptotic theory, ❷ rigorous inference, ❸ high-dimensionality, and ❹ new tests such as detection of the time of the intervention, contamination effects and effects on multiple moments. Key hypothesis: ❶ Interventions takes place in variables that can be modelled as trend-stationary processes. ❷ The intervention affects only the unit of interest

11 One-page Summary Artificial Counterfactual (ArCo): alternative framework to measure the impact of an intervention in aggregate data when a control group is not readily available. Generalization of the Synthetic Control (SC) method of Abadie and Gardeazabal (2003, AER) and the panel approach of Hsiao, Ching and Wan (2012, JAE). Our contribution: estimator with ❶ asymptotic theory, ❷ rigorous inference, ❸ high-dimensionality, and ❹ new tests such as detection of the time of the intervention, contamination effects and effects on multiple moments. Key hypothesis: ❶ Interventions takes place in variables that can be modelled as trend-stationary processes. ❷ The intervention affects only the unit of interest

12 The road map 1. The setup 2. The counterfactual estimation 3. Estimator properties 4. Inference 5. Extensions 6. Simulations 7. Empirical example: Nota Fiscal Paulista 8. Research agenda 9. Concluding remarks

13 Setup Observe q i > 0 variables for i = 1,... n units for t = 1,... T periods (Panel structure): z it = (z 1 it,..., zq i it ).

14 Setup Observe q i > 0 variables for i = 1,... n units for t = 1,... T periods (Panel structure): z it = (z 1 it,..., zq i it ). Unit 1 is exposed to the intervention at known T 0.

15 Setup Observe q i > 0 variables for i = 1,... n units for t = 1,... T periods (Panel structure): z it = (z 1 it,..., zq i it ). Unit 1 is exposed to the intervention at known T 0. The remanning n 1 units z 0t (z 2t,..., z nt ) are an untreated potential control group (donor pool).

16 Setup Observe q i > 0 variables for i = 1,... n units for t = 1,... T periods (Panel structure): z it = (z 1 it,..., zq i it ). Unit 1 is exposed to the intervention at known T 0. The remanning n 1 units z 0t (z 2t,..., z nt ) are an untreated potential control group (donor pool). Potential Outcome notation: z 1t = d t z (1) 1t + (1 d t )z (0) 1t ; d t = z 0t = z (0) 0t where z (1) it z (0) it { 1 if t T 0 0 otherwise is potential outcome under the intervention and the potential outcome with no intervention.

17 Setup Effects on functions of z 1t : h (z 1t ) : R q 1 R q { v p y t = h (z 1t ) e.g. h(v t ) = t v t y t = d t y (1) t + (1 d t )y (0) t

18 Setup Effects on functions of z 1t : h (z 1t ) : R q 1 R q { v p y t = h (z 1t ) e.g. h(v t ) = t v t y t = d t y (1) t + (1 d t )y (0) t Hypothesis of interest: y (1) t = δ t + y (0) t, t = T 0..., T, H 0 : T = 1 T T T [ ] y (1) t y (0) t = 0. }{{} δ t t=t 0

19 Setup Effects on functions of z 1t : h (z 1t ) : R q 1 R q { v p y t = h (z 1t ) e.g. h(v t ) = t v t y t = d t y (1) t + (1 d t )y (0) t Hypothesis of interest: y (1) t = δ t + y (0) t, t = T 0..., T, H 0 : T = 1 T T T [ ] y (1) t y (0) t = 0. }{{} δ t t=t 0 We do not observe the counterfactual y (0) t. Therefore, we construct an estimate ŷ (0) t such that: δ t y (1) t ŷ (0) t for t = T 0,..., T

20 Counterfactual Estimation How should we construct ŷ (0) t?

21 Counterfactual Estimation How should we construct ŷ (0) t? Ideally (in the MSE sense) we would set ŷ (0) t = E(y (0) t z 0t, z 0t 1,..., z 0t p )

22 Counterfactual Estimation How should we construct ŷ (0) t? Ideally (in the MSE sense) we would set ŷ (0) t = E(y (0) t z 0t, z 0t 1,..., z 0t p ) In practice, we choose a (parametric) specification (could be linear or not). Let x t = (z 0t, z 0t 1,..., z 0t p ) and such that E(ν t ) = 0 and y (0) t = M(x t ) + ν t, ŷ (0) t = M(x t ).

23 ArCo estimator The Artificial Counterfactual (ArCo) estimator is then simply given by T = 1 T T T t=t 0 δt, where δ t y t ŷ (0) t, for t = T 0,..., T.

24 ArCo estimator The Artificial Counterfactual (ArCo) estimator is then simply given by T = 1 T T T t=t 0 δt, where δ t y t ŷ (0) t, for t = T 0,..., T. ArCo estimator is a two-step estimator: 1. First step: estimation of M with the pre-intervention sample; 2. Second step: extrapolate M with actual data for x t and compute T.

25 ArCo estimator Estimation sample y t Δ T y t = M(x t )

26 ArCo and the literature Hsiao, Ching and Wan (2012, JAE) Two-step method where M(x t) is a linear and scalar function of a small set of variables from the peers. Correct specification (M is the conditional expectation). Selection of peers by information criteria. Date of the intervention known.

27 ArCo and the literature Hsiao, Ching and Wan (2012, JAE) Two-step method where M(x t) is a linear and scalar function of a small set of variables from the peers. Correct specification (M is the conditional expectation). Selection of peers by information criteria. Date of the intervention known. Differences-in-Differences (DiD) Number of treated units must grow. Parallel trends hypothesis. Similar control group.

28 ArCo and the literature Hsiao, Ching and Wan (2012, JAE) Two-step method where M(x t) is a linear and scalar function of a small set of variables from the peers. Correct specification (M is the conditional expectation). Selection of peers by information criteria. Date of the intervention known. Differences-in-Differences (DiD) Number of treated units must grow. Parallel trends hypothesis. Similar control group. Gobillon and Magnac (2016, REStat) Generalize the above authors by explicitly considering a factor model. interactive fixed effects with strictly exogenous regressors. Asymptotics both on the cross-section and time dimensions.

29 ArCo and the literature Abadie and Gardeazabal (2003, AER) Convex combination of peers. The weights are estimated using time averages of the observed variables. No time-series dynamics.

30 ArCo and the literature Abadie and Gardeazabal (2003, AER) Convex combination of peers. The weights are estimated using time averages of the observed variables. No time-series dynamics. Pesaran et al. (2004, JBES); Dees et al. (2007, JAE) Counterfactual based on variables that belong to the treated unit. No donor pool. Their key assumption is that a subset of variables of the treated unit is invariant to the intervention.

31 ArCo and the literature Abadie and Gardeazabal (2003, AER) Convex combination of peers. The weights are estimated using time averages of the observed variables. No time-series dynamics. Pesaran et al. (2004, JBES); Dees et al. (2007, JAE) Counterfactual based on variables that belong to the treated unit. No donor pool. Their key assumption is that a subset of variables of the treated unit is invariant to the intervention. Angrist, Jordà and Kuersteiner (2016, JBES) Information only on the treated unit and no donor pool is available.

32 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t =

33 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated.

34 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated. Examples of interventions (treatments): Natural disasters: Belasen and Polachek (2008, AER P&P), Cavallo, Galiani, Noy, and Pantano (2013, ReStat), Fujiki and Hsiao (2015, JoE),...

35 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated. Examples of interventions (treatments): Natural disasters: Belasen and Polachek (2008, AER P&P), Cavallo, Galiani, Noy, and Pantano (2013, ReStat), Fujiki and Hsiao (2015, JoE),... Region specific policies (laws): Hsiao, Ching, and Wan (2012, JAE), Abadie, Diamond, and Hainmueller (AJPS, 2015), Gobillon and Magnac (ReStat, 2016),...

36 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated. Examples of interventions (treatments): Natural disasters: Belasen and Polachek (2008, AER P&P), Cavallo, Galiani, Noy, and Pantano (2013, ReStat), Fujiki and Hsiao (2015, JoE),... Region specific policies (laws): Hsiao, Ching, and Wan (2012, JAE), Abadie, Diamond, and Hainmueller (AJPS, 2015), Gobillon and Magnac (ReStat, 2016),... New government or political regime: Grier and Maynard (2013, JEBO)

37 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated.

38 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated. Examples of interventions (treatments): Abadie and Diamond (2004, AER) and Abadie, Diamond, and Hainmueller (2010, JASA) have, respectively, 1712 and 1280 Google cites as measured on August 2, 2017.

39 ArCo estimator Key Assumption Independence Let z 0t = (z 2t,..., z nt ) denotes the vector of all the observable variables for the untreated units. Then, d s, for all t, s. z 0t = The independence condition donors are untreated. Examples of interventions (treatments): Abadie and Diamond (2004, AER) and Abadie, Diamond, and Hainmueller (2010, JASA) have, respectively, 1712 and 1280 Google cites as measured on August 2, Arguably the most important innovation in the evaluation literature in the last fifteen years is the synthetic control method. Athey and Imbens (2016)

40 Counterfactual Estimation Why do we expect the method to work?

41 Counterfactual Estimation Why do we expect the method to work? Possible data generating mechanism (in reduced form): z (0) it = µ i + Ψ ij ε it j j=0 ε it = Λ i f t + η it where f t (f 1) (0, Q) is a vector of common unobserved factors. Λ i (q i f ) are matrices of factor loadings; η it (q i 1) (0, R i ) is idiosyncratic error term.

42 Counterfactual Estimation Why do we expect the method to work? Possible data generating mechanism (in reduced form): z (0) it = µ i + Ψ ij ε it j j=0 ε it = Λ i f t + η it where f t (f 1) (0, Q) is a vector of common unobserved factors. Λ i (q i f ) are matrices of factor loadings; η it (q i 1) (0, R i ) is idiosyncratic error term. Quality of the pool of donors comes from the factor structure.

43 Counterfactual estimation Least Absolute Shrinkage and Selection Operator (LASSO) Set X t = (1, x t ) R d and consider M(x t ) linear: y (0) t = α + β x t + ν t = θ X t + ν t.

44 Counterfactual estimation Least Absolute Shrinkage and Selection Operator (LASSO) Set X t = (1, x t ) R d and consider M(x t ) linear: y (0) t Estimation: θ = arg min T 0 1 t=1 = α + β x t + ν t = θ X t + ν t. ( y (0) t ) 2 d θ X t + ς β j. j=1

45 Counterfactual estimation Least Absolute Shrinkage and Selection Operator (LASSO) Set X t = (1, x t ) R d and consider M(x t ) linear: y (0) t Estimation: θ = arg min T 0 1 t=1 = α + β x t + ν t = θ X t + ν t. ( y (0) t ) 2 d θ X t + ς β j. Why LASSO? Avoid overfitting. Large dataset compared to the sample size. Automatic model selection. j=1

46 Counterfactual estimation LASSO Catalog of hypotheses Design Let Σ 1 T1 T 1 t=1 E(x tx t ) and S 0 = {i : θ 0,1 0} (set of non-zero parameters). There exists a constant ψ 0 > 0 such that θ[s 0 ] 2 1 θσθs 0 ψ0 2, for all θ[s c 0 ] 1 3 θ[s 0 ] 1.

47 Counterfactual estimation LASSO Catalog of hypotheses Design Let Σ 1 T1 T 1 t=1 E(x tx t ) and S 0 = {i : θ 0,1 0} (set of non-zero parameters). There exists a constant ψ 0 > 0 such that θ[s 0 ] 2 1 θσθs 0 ψ0 2, for all θ[s c 0 ] 1 3 θ[s 0 ] 1. Compatibility condition of Bülhmann and van der Geer (2011).

48 Counterfactual estimation LASSO Catalog of hypotheses Design Let Σ 1 T1 T 1 t=1 E(x tx t ) and S 0 = {i : θ 0,1 0} (set of non-zero parameters). There exists a constant ψ 0 > 0 such that θ[s 0 ] 2 1 θσθs 0 ψ0 2, for all θ[s c 0 ] 1 3 θ[s 0 ] 1. Compatibility condition of Bülhmann and van der Geer (2011). Similar to the restriction of the smallest eigenvalue of Σ.

49 Counterfactual estimation LASSO Catalog of hypotheses Design Let Σ 1 T1 T 1 t=1 E(x tx t ) and S 0 = {i : θ 0,1 0} (set of non-zero parameters). There exists a constant ψ 0 > 0 such that θ[s 0 ] 2 1 θσθs 0 ψ0 2, for all θ[s c 0 ] 1 3 θ[s 0 ] 1. Compatibility condition of Bülhmann and van der Geer (2011). Similar to the restriction of the smallest eigenvalue of Σ. Important for prediction consistency and l 1 -consistency of the LASSO.

50 Counterfactual estimation LASSO Catalog of hypotheses Heterogeneity and dynamics Let w t (ν t, x t ), then: (a) {w t } is strong mixing with α(m) = exp( cm) for some c c > 0 (b) E w it 2γ+δ c γ for some γ > 2 and δ > 0 for all 1 i d, 1 t T and T 1, (c) E(νt 2 ) ϵ > 0, for all 1 t T and T 1.

51 Counterfactual estimation LASSO Catalog of hypotheses Heterogeneity and dynamics Let w t (ν t, x t ), then: (a) {w t } is strong mixing with α(m) = exp( cm) for some c c > 0 (b) E w it 2γ+δ c γ for some γ > 2 and δ > 0 for all 1 i d, 1 t T and T 1, (c) E(νt 2 ) ϵ > 0, for all 1 t T and T 1. w t is an α-mixing process with exponential decay.

52 Counterfactual estimation LASSO Catalog of hypotheses Heterogeneity and dynamics Let w t (ν t, x t ), then: (a) {w t } is strong mixing with α(m) = exp( cm) for some c c > 0 (b) E w it 2γ+δ c γ for some γ > 2 and δ > 0 for all 1 i d, 1 t T and T 1, (c) E(νt 2 ) ϵ > 0, for all 1 t T and T 1. w t is an α-mixing process with exponential decay. Part (b) bounds uniformly some higher moments Law of Large Numbers.

53 Counterfactual estimation LASSO Catalog of hypotheses Heterogeneity and dynamics Let w t (ν t, x t ), then: (a) {w t } is strong mixing with α(m) = exp( cm) for some c c > 0 (b) E w it 2γ+δ c γ for some γ > 2 and δ > 0 for all 1 i d, 1 t T and T 1, (c) E(νt 2 ) ϵ > 0, for all 1 t T and T 1. w t is an α-mixing process with exponential decay. Part (b) bounds uniformly some higher moments Law of Large Numbers. Part (c) Sufficient condition for the Central Limit Theorem.

54 Counterfactual estimation LASSO Catalog of hypotheses Regularity ( ) (a) ς = O d 1/γ T (b) s 0 d 2/γ T = o(1)

55 Counterfactual estimation LASSO Catalog of hypotheses Regularity ( ) (a) ς = O d 1/γ T (b) s 0 d 2/γ T = o(1) Part (a) puts discipline on the growth rate of the regularization parameter.

56 Counterfactual estimation LASSO Catalog of hypotheses Regularity ( ) (a) ς = O d 1/γ T (b) s 0 d 2/γ T = o(1) Part (a) puts discipline on the growth rate of the regularization parameter. Part (b) bounds the number of (total/relevant) parameters.

57 Counterfactual estimation LASSO Catalog of hypotheses Regularity ( ) (a) ς = O d 1/γ T (b) s 0 d 2/γ T = o(1) Part (a) puts discipline on the growth rate of the regularization parameter. Part (b) bounds the number of (total/relevant) parameters. Both conditions can be relaxed if normality is assumed.

58 Counterfactual estimation LASSO Results Consistency and Asymptotic Normality Let M be the model defined as before, whose parameters are estimated by LASSO, then under previous assumptions and as T : sup [ 1/2 sup P P TΩ P P a R q T ( ] T T ) a Φ(a) 0, where Ω T is defined in the previous proposition and Φ( ) is the cumulative distribution function of a zero-mean normal random vector with identity covariance matrix. The inequality is defined element-wise.

59 Counterfactual estimation LASSO Results Uniform Confidence Interval Let Ω T be a consistent estimator for Ω T uniformly in P P. Under the same conditions as before: [ I α j,t ± ω ] j Φ 1 (1 α/2) T for each j = 1,..., q, where ω j = [ Ω] jj and Φ 1 ( ) is the quantile function of a standard normal distribution. I α is uniformly valid (honest) in the sense that for a given ϵ > 0, there exists a T ϵ such that for all T > T ϵ : sup P P ( j,t I α ) (1 α) < ϵ. P P

60 Counterfactual estimation LASSO Results Uniform Hypothesis Test Let Ω T be a consistent estimator for Ω T uniformly in P P. Under the same conditions as before, for a given ϵ > 0, there exists a T ϵ such that for all T > T ϵ : sup P P (W T c α ) (1 α) < ϵ, P P where W T T T T, P(χ 2 q c α ) = 1 α and χ 2 q is a chi-square distributed random variable with q degrees of freedom. Ω 1 T

61 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent.

62 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions.

63 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions. 3. Test for the untreated peers and /or treated unit.

64 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions. 3. Test for the untreated peers and /or treated unit. 4. The framework is extended to conduct joint test (e.g., affecting the mean and the variance simultaneously)

65 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions. 3. Test for the untreated peers and /or treated unit. 4. The framework is extended to conduct joint test (e.g., affecting the mean and the variance simultaneously) 5. R package: Fonseca, Masini, Medeiros, and Vasconcelos (2017a)

66 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions. 3. Test for the untreated peers and /or treated unit. 4. The framework is extended to conduct joint test (e.g., affecting the mean and the variance simultaneously) 5. R package: Fonseca, Masini, Medeiros, and Vasconcelos (2017a) 6. Carvalho, Masini and Medeiros (2017): Integrated data

67 Additional Results 1. Tests under unknown intervention date. Results are not altered when the intervention date is estimated. λ is super-consistent. 2. Multiple interventions. 3. Test for the untreated peers and /or treated unit. 4. The framework is extended to conduct joint test (e.g., affecting the mean and the variance simultaneously) 5. R package: Fonseca, Masini, Medeiros, and Vasconcelos (2017a) 6. Carvalho, Masini and Medeiros (2017): Integrated data 7. Fonseca, Masini, Medeiros, and Vasconcelos (2017b): Random Forests and other machine learning techniques.

68 Monte Carlo results Data Generating Process (DGP) Consider the following model for i {1,..., n} and t 1: where: ε it = Λ i f t + η it, z (0) it = ρa i z (0) it 1 + ε it, f t = [1, (t/t) φ, v t ], z it R q, ρ [0, 1), φ > 0, A i (q q) is a diagonal matrix with diagonal elements strictly between 1 and 1, iid N(0, 1), η it N(0, rf 2I nq), and Λ i is a (q 3) matrix of factor loadings. v t iid

69 Monte Carlo results Rejection Rates under the Null (1/2) Bias Var ŝ 0 α = Innovation Distribution T = 100, d = 100, s 0 = 5, φ = 0, ρ = 0 Normal χ 2 (1) t-stud(3) Mixed-Normal Sample Size normal dist., d = 100, s 0 = 5, φ = 0, ρ = 0 T = Number of Total Covariates normal dist., T = 100, s 0 = 5, φ = 0, ρ = 0 d =

70 Monte Carlo results Rejection Rates under the Null (2/2) Bias Var ŝ 0 α = Number of Relevant (non-zero) Covariates normal dist., T = 100, d = 100, φ = 0, ρ = 0 s 0 = Determinist Trend (t/t) φ normal dist., T = 100, d = 100, s 0 = 5, ρ = 0 φ = Serial Correlation normal dist., T = 100, d = 100, s 0 = 5, φ = 0 ρ =

71 Monte Carlo results Rejection rates under the alternative α = Step Intervention δ t = c σ 1 1{t T 0 } c = Linear Increasing δ t = c σ 1 t T 0 +1 T T {t T 0} c = T t+1 Linear Decreasing δ t = c σ 1 T T {t T 0} c =

72 Monte Carlo results Horse race (1/2) BA SC DiD* DiD GM* GM ArCo* ArCo No Time Trend (φ = 0) and No Serial Correlation (ρ = 0) Bias Var MSE No Time Trend (φ = 0) Bias Var MSE Common Linear Time Trend (φ = 1) Bias Var MSE Idiosyncratic Linear Time Trend (φ = 1) Bias Var MSE

73 Monte Carlo results Horse race (2/2) BA SC DiD* DiD GM* GM ArCo* ArCo Common Quadratic Time Trend (φ = 2) Bias Var MSE Idiosyncratic Quadratic Time Trend (φ = 2) Bias Var MSE

74 Kernel Normal Monte Carlo results No trend and no serial correlation BA SC DiD* DiD GM* GM ArCo* ArCo

75 Kernel Normal Monte Carlo results No trend BA SC DiD* DiD GM* GM ArCo* ArCo

76 Kernel Normal Monte Carlo results Common linear trend BA SC DiD* DiD GM* GM ArCo* ArCo

77 Kernel Normal Monte Carlo results Heterogeneous linear trend BA SC DiD* DiD GM* GM ArCo* ArCo

78 Kernel Normal Monte Carlo results Common quadratic trend BA SC DiD* DiD GM* GM ArCo* ArCo

79 Kernel Normal Monte Carlo results Heterogeneous quadratic trend BA SC DiD* DiD GM* GM ArCo* ArCo

80 Application In October 2007, the state government of São Paulo in Brazil implemented a anti tax evasion scheme called Nota Fiscal Paulista (NFP).

81 Application In October 2007, the state government of São Paulo in Brazil implemented a anti tax evasion scheme called Nota Fiscal Paulista (NFP). The NFP consists of a tax rebate from a state tax named ICMS (tax on circulation of products and services).

82 Application In October 2007, the state government of São Paulo in Brazil implemented a anti tax evasion scheme called Nota Fiscal Paulista (NFP). The NFP consists of a tax rebate from a state tax named ICMS (tax on circulation of products and services). Incentive to the consumer to ask for sales receipts.

83 Application In October 2007, the state government of São Paulo in Brazil implemented a anti tax evasion scheme called Nota Fiscal Paulista (NFP). The NFP consists of a tax rebate from a state tax named ICMS (tax on circulation of products and services). Incentive to the consumer to ask for sales receipts. Additionally, the registered sales receipts give the consumer the right to participate in monthly lotteries promoted by the government.

84 Application Under the premisses that: 1. a certain degree of tax evasion was occurring before the intervention, 2. the sellers has some degree of market power and 3. the penalty for tax-evasion is large enough to alter the seller behaviour, one is expected to see an upwards movements in prices due to an increase in marginal cost. Hence, we would like to investigate whether the NFP had an impact on consumer prices in São Paulo.

85 Application The NFP was not implemented throughout the sectors in the economy at once.

86 Application The NFP was not implemented throughout the sectors in the economy at once. The first sector were restaurants, followed by bakeries, bar and other food service retailers.

87 Application The NFP was not implemented throughout the sectors in the economy at once. The first sector were restaurants, followed by bakeries, bar and other food service retailers. The sample then consists of monthly inflation (food outside home) index for 10 metropolitan areas including São Paulo from Jan-05 to Sep-09.

88 Application 1200 Dec 07 Jan 08 Feb 08 Mar 08 Apr 08 May 08 Jun 08 Jul 08 Aug 08 Sep 08 Oct 08 Nov 08 Dec 08 Jan 09 Feb 09 Mar 09 Apr 09 May 09 Jun 09 Jul 09 Aug 09 Sep 09 # of participants (millions) Distributed Value (millions R$)

89 Application Panel (a): ArCo Estimates (1) (2) (3) (4) (5) (6) (7) (0.1704) (0.1486) (0.1432) (0.1480) (0.2010) (0.1600) (0.1539) Inflation Yes No No No Yes Yes Yes GDP No Yes No No Yes Yes Yes Retail Sales No No Yes No No Yes Yes Credit No No No Yes No No Yes R-squared d s T T DiD (0.1466) GM (0.1234) Panel (b): Alternative Estimates (1) (2) (3) (4) (5) (6) (0.1456) (0.1467) (0.1556) (0.1457) (0.1466) (0.1243) (0.1246) (0.1227) (0.1228) GDP Yes No No Yes Yes No Retail Sales No Yes No Yes Yes No Credit No No Yes No Yes No

90 Application ArCo estimates: CPI inflation (food outside home) Jan-05 Jan-06 Jan-07 Jan-08 Jan-09 Pre-intervention fit Counterfactual Actual index

91 Application ArCo estimates: CPI (food outside home) Jan-05 Jan-06 Jan-07 Jan-08 Jan-09 Pre-intervention fit Counterfactual Actual index

92 Concluding remarks Important to note that the ArCo is more than just another structural break estimator. By controlling for common shocks that might have occurred in all units after the intervention, it provides a effective methodology to isolate the effect of the intervention of interest. Providing compelling evidence for the causality of the structural break observed in the unit of interest It is almost a model-free methodology. It implies no structure in the stationary process (no # of lags to be determined nor # MA terms)

TEXTO PARA DISCUSSÃO. No ARCO: an artificial counterfactual approach for high-dimensional panel time-series data

TEXTO PARA DISCUSSÃO. No ARCO: an artificial counterfactual approach for high-dimensional panel time-series data EXO PARA DISCUSSÃO No. 653 ARCO: an artificial counterfactual approach for high-dimensional panel time-series data Carlos C. Carvalho Ricardo Masini Marcelo C. Medeiros DEPARAMENO DE ECONOMIA www.econ.puc-rio.br