One Economist's Perspective on Some Important Estimation Issues

Jere R. Behrman, W.R. Kenan Jr. Professor of Economics & Sociology, University of Pennsylvania
SRCD Seattle Preconference on Interventions for Children and Youth in Low- and Middle-Income Countries: New Opportunities and Challenges for Developmental Science, 17 April 2013

I. Life-Cycle Framework

Behavioral choices make it challenging to estimate the impact of one variable (e.g., schooling, an ECD program) on another.
Measurement problems include persistent unobserved factors ("endowments" such as genetics), likely intergenerationally correlated.
Dynamics include lagged effects but also forward-looking behaviors (and other unobservables) and dynamic complementarities or substitution.
Heterogeneities may be critical (unobservables, parameters).
Parameters, not variance decomposition, are of primary interest.
Generalizability from sample and from context?

II. Unobserved Variable Bias, Endogeneity, Simultaneity

Interested in estimating the impact of nutritional status, HAZ ($H$), on cognitive skills ($C$) in a linear approximation:

(1) $C_t = a_0 + a_1 H_t + a_2 X + a_3 U^f + a_4 U^v_t + u_t$,

where subscripts refer to the time period and
$C_t$ = cognitive skills in the $t$th period,
$H_t$ = height-for-age z score (HAZ) in the $t$th period,
$X$ = other observed characteristics assumed fixed (e.g., sex, age),
$U^f$ = unobserved fixed characteristics (e.g., innate ability, innate health),
$U^v_t$ = unobserved time-varying characteristics (e.g., cognitive skills accumulated through interacting with others in the same context),
$u_t$ = random disturbance term (e.g., chance learning encounters),
$a_i$ = parameters to be estimated.

Some characteristics of relation (1):
(1) For an impact estimate, the coefficient $a_1$ is of interest, NOT explained variance (which could be small if the variance of $H$ is low or the variance of $u$ is high).
(2) Variables (upper-case letters) may all be vectors, but for simplicity they are treated here as if each has just one element.
(3) For simplicity it is assumed that there is only one observed choice variable ($H_t$) on the right side, though in more extensive specifications there may be others (e.g., home stimulation, pre-school attendance, mother's time with child).

(4) If ordinary least squares (OLS) multivariate regression estimates are made of relation (1), the effective disturbance term is $a_3 U^f + a_4 U^v_t + u_t$.
(5) Except for simultaneity (below), the same framework holds if $C$ is for period $t + n$ or if $H$ is lagged.
(6) The important question is whether $H_t$ is correlated with the effective disturbance term ($a_3 U^f + a_4 U^v_t + u_t$); if it is, the estimate of $a_1$ is biased because $H_t$ proxies in part for the effective disturbance term, not only for the effect of $H_t$ alone.

Suppose $H_t$ is determined by the linear relation:

(2) $H_t = b_0 + b_1 C_t + b_2 X + b_2 Z + b_3 U^f + b_4 U^v_t + v_t$,

where variables are as defined above except:
$Z$ represents observed variables that affect $H_t$ but not $C_t$ directly (e.g., the price of nutrients, which does not directly affect cognitive skills in relation (1)),
the random term is $v_t$ (not correlated with $u_t$ in relation (1)),
the parameters are $b$'s instead of $a$'s (the coefficients on $X$ and $Z$ carry the same label $b_2$ but are distinct parameters).

In this relation $H_t$ is determined in part by current cognitive skills. That is, relations (1) and (2) simultaneously determine cognitive skills and HAZ.

Relation (2) makes clear that there is an estimation problem in using OLS for relation (1): in relation (2) $H_t$ depends on $U^f$ and $U^v_t$, both of which are in the effective disturbance term ($a_3 U^f + a_4 U^v_t + u_t$) of relation (1). The OLS estimate of $a_1$ is biased because $H_t$ in relation (1) represents in part the correlated part of these unobserved variables, and there will be a correlation because the same unobserved variables affect $H_t$ in relation (2) ("unobserved variable bias"). Note that this can occur even if $b_1 = 0$, so that cognitive skills do NOT determine HAZ.
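The unobserved variable bias described above can be illustrated with a small simulation. This is a minimal sketch under assumed values, not an estimate from any data: the parameter values, the unit-variance normal draws, and the simplifications (no $X$, no $U^v_t$, $b_1 = 0$) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical parameter values (assumptions for illustration only).
a0, a1, a3 = 0.0, 0.5, 1.0   # relation (1): effects of H and U^f on C
b0, b3 = 0.0, 1.0            # relation (2) with b1 = 0: U^f also drives H

U_f = rng.normal(size=n)     # unobserved fixed endowment
v = rng.normal(size=n)       # disturbance in relation (2)
u = rng.normal(size=n)       # disturbance in relation (1)

H = b0 + b3 * U_f + v
C = a0 + a1 * H + a3 * U_f + u

# OLS of C on H with an intercept; U_f is omitted because it is unobserved.
D = np.column_stack([np.ones(n), H])
a1_ols = np.linalg.lstsq(D, C, rcond=None)[0][1]

# Large-sample value of the OLS slope: a1 + a3 * cov(H, U_f) / var(H).
# With unit variances, cov(H, U_f) = b3 and var(H) = b3**2 + 1.
a1_plim = a1 + a3 * b3 / (b3**2 + 1.0)

print(f"true a1 = {a1}, OLS estimate = {a1_ols:.3f}, large-sample value = {a1_plim:.3f}")
```

Under these assumed values the OLS slope settles near the large-sample value rather than near the true $a_1$, even though cognitive skills play no role in determining $H$.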

If $C_t$ and $H_t$ are simultaneously determined, so that $b_1$ is nonzero, relation (1) can be substituted into relation (2) and the result solved for $H_t$ (assuming $b_1 a_1 \neq 1$):

(3) $H_t = \frac{b_0 + b_1 a_0}{1 - b_1 a_1} + \frac{b_2 + b_1 a_2}{1 - b_1 a_1} X + \frac{b_2}{1 - b_1 a_1} Z + \frac{b_3 + b_1 a_3}{1 - b_1 a_1} U^f + \frac{b_4 + b_1 a_4}{1 - b_1 a_1} U^v_t + \frac{b_1 u_t + v_t}{1 - b_1 a_1}$

or

(4) $H_t = c_0 + c_2 X + c_2 Z + c_3 U^f + c_4 U^v_t + z_t$,

where the definitions of the $c$'s and $z_t$ should be clear from comparing relations (3) and (4). Again, there is a clear estimation problem in using OLS for relation (1), because in relations (3) and (4) $H_t$ depends on $U^f$ and $U^v_t$, both of which are in the effective disturbance term ($a_3 U^f + a_4 U^v_t + u_t$) of relation (1). With $b_1$ nonzero, $z_t$ also contains $u_t$, so $H_t$ is correlated with the random disturbance in relation (1) as well.
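The substitution can be checked numerically. The sketch below generates $H_t$ from the solved (reduced-form) relation with every term divided by $(1 - b_1 a_1)$, builds $C_t$ from relation (1), and then confirms that relation (2) holds for the generated data. All parameter values are hypothetical, $X$ and $U^v_t$ are dropped for brevity, and `bz` is an illustrative name for the coefficient on $Z$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical parameter values (assumptions for illustration only).
a0, a1, a3 = 0.0, 0.5, 1.0            # relation (1)
b0, b1, bz, b3 = 0.0, 0.4, 1.0, 1.0   # relation (2); bz is the coefficient on Z

Z, U_f, u, v = (rng.normal(size=n) for _ in range(4))

d = 1.0 - b1 * a1                      # must be nonzero for a solution to exist

# Reduced form: relation (1) substituted into relation (2), solved for H.
H = (b0 + b1 * a0 + bz * Z + (b3 + b1 * a3) * U_f + b1 * u + v) / d
C = a0 + a1 * H + a3 * U_f + u         # relation (1)

# Relation (2) should reproduce H up to floating-point error.
gap = np.max(np.abs(H - (b0 + b1 * C + bz * Z + b3 * U_f + v)))

# Z is independent of the unobservables, so regressing H on Z recovers bz / d.
slope_Z = np.cov(H, Z)[0, 1] / np.var(Z, ddof=1)

# Simultaneity: H is correlated with u, with covariance b1 * var(u) / d.
cov_H_u = np.cov(H, u)[0, 1]

print(f"max gap in relation (2): {gap:.2e}")
print(f"slope of H on Z: {slope_Z:.3f} (bz / d = {bz / d:.3f})")
print(f"cov(H, u): {cov_H_u:.3f} (b1 / d = {b1 / d:.3f})")
```

The nonzero covariance between $H$ and $u$ is the simultaneity channel; it comes on top of the correlation running through $U^f$.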

What can be done to eliminate this bias or, equivalently, to eliminate the correlation between $H_t$ and the effective disturbance term ($a_3 U^f + a_4 U^v_t + u_t$) in relation (1)?
(1) Randomized controlled trial (RCT), so that $H_t$ is randomly distributed across individuals and is NOT the outcome of individual behaviors as in relation (2). RCTs have many attractions, but conducting one is often not feasible. (May provide $Z$.)

(2) Natural experiment in which some people benefited from options that improved HAZ and others did not, e.g., policy or weather changes that affected people differentially depending on age or geography or both. (May provide $Z$.)
(3) Fixed effects (FE) estimates, in which a dichotomous ("dummy") variable is added for each individual to control for ALL fixed individual characteristics, including $U^f$, thus removing $U^f$ from the compound disturbance term in relation (1). Multiple observations are needed at the level for which FE are used. Related to first-differencing and sibling (twins) estimates. Estimates are still biased if unobserved time-varying factors enter both relation (1) and relation (2).

(4) Instrumental variable (IV) estimates: the correlation between $H_t$ and the disturbance term ($a_3 U^f + a_4 U^v_t + u_t$) in relation (1) is broken by replacing $H_t$ with the predicted value of $H$ based on relation (4) (the prediction depends on $X$ and $Z$, but not on $U^f$ and $U^v_t$). This is a two-stage process (two-stage least squares, 2SLS): in the first stage relation (4) is estimated on the observed $X$ and $Z$, and in the second stage relation (1) is estimated using the predicted value of $H$ rather than the actual value. For $Z$ to be a good instrument it must (i) predict $H$ well and (ii) NOT be included in relation (1) (or, equivalently, not be correlated with the disturbance term ($a_3 U^f + a_4 U^v_t + u_t$)). Standard statistical tests are available. The number of instruments must be >= the number of right-side behavioral variables. Perhaps in combination with FE estimates (IV-FE). Local average treatment effects (LATE) (e.g., minimum schooling laws).

(5) Regression discontinuity (compare those right above and right below eligibility cutoffs).
(6) Propensity score matching (compare based on similar propensities to receive treatment given observed variables).
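Two of the remedies in the list above, fixed effects and 2SLS, can be sketched on simulated data. This is a minimal illustration under assumed values, not a recommended estimation routine: the two-period panel, the parameter values, and the absence of $X$ and $U^v_t$ are all assumptions. The 2SLS step is done by hand to mirror the two-stage description; second-stage standard errors computed this way would not be valid.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 20_000, 2                  # individuals and periods (assumed)
a1, a3 = 0.5, 1.0                 # hypothetical parameters of relation (1)

U_f = rng.normal(size=(N, 1))     # unobserved, fixed over time
Z = rng.normal(size=(N, T))       # instrument: affects H but not C directly
v = rng.normal(size=(N, T))
u = rng.normal(size=(N, T))

H = Z + U_f + v                   # relation (2) with b1 = 0
C = a1 * H + a3 * U_f + u         # relation (1)

def slope(y, x):
    """OLS slope of y on x, with an intercept."""
    D = np.column_stack([np.ones(x.size), x.ravel()])
    return np.linalg.lstsq(D, y.ravel(), rcond=None)[0][1]

# Pooled OLS: biased because H is correlated with U_f.
ols = slope(C, H)

# Fixed effects: subtracting individual means removes U_f.
H_within = H - H.mean(axis=1, keepdims=True)
C_within = C - C.mean(axis=1, keepdims=True)
fe = slope(C_within, H_within)

# 2SLS: first stage predicts H from Z; second stage uses the prediction.
D1 = np.column_stack([np.ones(N * T), Z.ravel()])
H_hat = D1 @ np.linalg.lstsq(D1, H.ravel(), rcond=None)[0]
iv = slope(C, H_hat)

print(f"true a1 = {a1}: OLS = {ols:.3f}, FE = {fe:.3f}, 2SLS = {iv:.3f}")
```

If a time-varying unobservable were added to both relations, the FE estimate would no longer recover $a_1$, while the 2SLS estimate would as long as $Z$ stayed uncorrelated with it.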

(1') $S_c = a_0 + a_1 S_m + a_2 X + a_3 U^f + a_4 U^v + u$,

where $S_c$ is child schooling, $S_m$ is mother's schooling, $U^f$ is family endowments for the mother, and $U^v$ is individual-specific endowments for the mother. Probable bias because $S_m$ is correlated with unobserved endowments (parallel relation for $S_m$).
Adult sisters (mothers) sibling estimates: Behrman, J. R. and B. L. Wolfe (1987). "Investments in Schooling in Two Generations in Pre-Revolutionary Nicaragua: The Roles of Family Background and School Supply." Journal of Development Economics 27(1-2) (October): 395-420.
Adult sisters identical twins estimates: Behrman, J. R. and M. R. Rosenzweig (2002). "Does Increasing Women's Schooling Raise the Schooling of the Next Generation?" American Economic Review 92(1): 323-334.
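The logic of sibling and twins estimates can be sketched with within-pair differences. The simulation below is illustrative only: the parameter values are hypothetical, sisters are assumed to share only $U^f$, and identical twins are assumed to share both $U^f$ and $U^v$ completely, which is a strong simplification.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pairs = 50_000
a1, a3, a4 = 0.2, 1.0, 1.0   # hypothetical parameters of the schooling relation

def simulate(identical):
    """Simulate pairs of mothers (S_m) and their children's schooling (S_c)."""
    U_f = rng.normal(size=(n_pairs, 1))   # family endowment, shared within pair
    if identical:
        # Assumption: identical twins also share the individual endowment.
        U_v = np.repeat(rng.normal(size=(n_pairs, 1)), 2, axis=1)
    else:
        U_v = rng.normal(size=(n_pairs, 2))
    S_m = U_f + U_v + rng.normal(size=(n_pairs, 2))
    S_c = a1 * S_m + a3 * U_f + a4 * U_v + rng.normal(size=(n_pairs, 2))
    return S_m, S_c

def ols_slope(y, x):
    """OLS slope of y on x with an intercept, pooling all individuals."""
    x, y = x.ravel(), y.ravel()
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def pair_difference_slope(S_m, S_c):
    """Slope from regressing within-pair differences in S_c on those in S_m."""
    d_m = S_m[:, 0] - S_m[:, 1]
    d_c = S_c[:, 0] - S_c[:, 1]
    return (d_m @ d_c) / (d_m @ d_m)

S_m, S_c = simulate(identical=False)
ols = ols_slope(S_c, S_m)
sisters = pair_difference_slope(S_m, S_c)
twins = pair_difference_slope(*simulate(identical=True))

print(f"true a1 = {a1}: OLS = {ols:.3f}, sisters = {sisters:.3f}, twins = {twins:.3f}")
```

In this setup differencing between sisters removes $U^f$ but leaves the part of the bias that runs through $U^v$; only the twins differences remove both.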

[Figure: Mothers' (left) and Fathers' (right) Additional Grade of Schooling and Child Schooling (vertical axis). Panels: Effect $S_m$ and Effect $S_f$. Bar labels: OLS, OLS w $S_m$, Twins, Twins w $S_m$; OLS, OLS w $S_f$, Twins, Twins w $S_f$.]

(1') $Y = a_0 + a_1 BW + a_2 X + a_3 U^f + a_4 U^v + u$,

where $Y$ is a child outcome, $BW$ is birth weight, $U^f$ is family endowments for the child, and $U^v$ is individual-specific endowments for the child. Probable bias because $BW$ is correlated with unobserved endowments (parallel relation for $BW$).
Adult sisters identical twins estimates: Behrman, J. R. and M. R. Rosenzweig (2004). "Returns to Birthweight." The Review of Economics and Statistics 86(2): 586-601.

[Figure: Impact of Higher Birth Weight with (Twn) and without (OLS) Control for Endowments. Panels: Schooling, ln Wage, Height, Overwt; each compares OLS and Twn estimates.]

(1') $R = a_0 + a_1 S + a_2 Stunt + a_3 X + a_4 U^f + a_5 U^v + u$,

where $R$ is adult reading comprehension, $S$ is completed grades of schooling, $Stunt$ is stunted when of preschool age, $U^f$ is family endowments for the child, and $U^v$ is individual-specific endowments for the child. Probable bias because $S$ and $Stunt$ are correlated with unobserved endowments (parallel relation for both).
Instrumental variables estimates for Guatemala: Behrman, JR, J Hoddinott, JA Maluccio, E Soler-Hampejsek, EL Behrman, R Martorell, A Quisumbing, M Ramirez and AD Stein (2008). "What Determines Adult Skills? Impacts of Pre-School, School-Years and Post-School Experiences in Guatemala." Washington, DC: IFPRI Discussion Paper No. 826.

[Figure: SDs in Reading Comprehension Score per Grade of Schooling or per Pre-School Child Not Stunted.]

III. Related Other Thoughts

III-A. Treatment effects ($T$) added:

(1) $C_t = a_0 + a_1 H_t + a_2 X + a_3 U^f + a_4 U^v_t + a_5 T_t + u_t$

Related issue if treatment is an endogenous choice (i.e., behavioral choices of $T$). Therefore: (i) RCTs, (ii) natural experiments, (iii) FE, (iv) IV, (v) regression discontinuity, (vi) propensity score matching.

III-B. Selected samples (e.g., those attending preschool programs) (high value of $u_t$).

[Figure: Preschool enrolment by region and income: less than 20% for poorer income quintiles.]

III-C. Life-cycle Implications for Impact Evaluation

Present Discounted Values in U.S. $ of Seven Major Benefits of Moving an Infant Out of Low Birth Weight Status in a Poor Developing Country (Alderman, H. and J. R. Behrman (2006). "Reducing the Incidence of Low Birth Weight in Low-Income Countries Has Substantial Economic Benefits." World Bank Research Observer 21(1): 25-48.)

| Benefit | 3 percent annual discount rate | 5 percent annual discount rate | 10 percent annual discount rate |
|---|---|---|---|
| 1. Reduced infant mortality | $95 | $99 | $89 |
| 2. Reduced neonatal care | $42 | $42 | $42 |
| 3. Reduced costs of infant and child illness | $36 | $35 | $34 |
| 4. Productivity gain from reduced stunting | $152 | $85 | $25 |
| 5. Productivity gain from increased cognitive ability | $367 | $205 | $60 |
| 6. Reduced costs of chronic diseases | $49 | $15 | $1 |
| 7. Intergenerational benefits | $92 | $35 | $6 |
| Total | $832 | $510 | $257 |
| Share of total at 5 percent discount rate (percent) | 163% | 100% | 50% |

Note: The 5 percent discount rate is the base case estimate.
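The pattern in the table, where immediate benefits such as reduced neonatal care stay at $42 at every discount rate while late-life benefits such as reduced chronic disease costs fall from $49 to $1, follows from the present discounted value formula $PDV = \sum_t B_t / (1 + r)^t$. The sketch below applies it to two hypothetical $100 benefits, one received immediately and one received 40 years later; these benefit streams are assumptions for illustration and are not taken from the cited study. The last lines recompute the "share of total" row from the table's totals.

```python
def pdv(benefits, rate):
    """Present discounted value of (years_from_now, amount) pairs."""
    return sum(amount / (1.0 + rate) ** year for year, amount in benefits)

# Hypothetical benefit streams (assumptions, not figures from the study).
early = [(0, 100.0)]    # received immediately
late = [(40, 100.0)]    # received 40 years from now

for rate in (0.03, 0.05, 0.10):
    print(f"rate {rate:.0%}: early = {pdv(early, rate):.2f}, late = {pdv(late, rate):.2f}")

# Totals from the table, expressed relative to the 5 percent base case.
totals = {0.03: 832, 0.05: 510, 0.10: 257}
shares = {rate: round(100 * total / totals[0.05]) for rate, total in totals.items()}
print(shares)
```

The immediate benefit is unaffected by the discount rate, while the value of the delayed benefit shrinks sharply as the rate rises, which is why the ranking of benefits can change with the rate chosen.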

III-D. Resource Costs, Private and Public, as Important as Benefits

Benefit-Cost Ratios for Reducing Low Birth Weight

| Intervention | Benefits/Costs |
|---|---|
| Treatment for women with asymptomatic bacterial infections | 0.6-4.9 |
| Treatment for women with presumptive sexually transmitted diseases | 1.3-10.7 |
| Drugs for pregnant women with poor obstetric history | 4.1-35.2 |

Source: Behrman, JR, H Alderman and J Hoddinott (2004), "Hunger and Malnutrition," in Bjorn Lomborg, ed., Global Crises, Global Solutions. Cambridge, UK: Cambridge University Press, 363-420.

III-E. Context Matters, and Risks of Facile Claims of "Best Practice" (e.g., J-PAL Website)

Guess for which country interventions to increase primary schooling in the last decade are likely to be least effective?

| Country | Net Primary School Enrollments in 2000 |
|---|---|
| Madagascar | 68% |
| Kenya | 65% |
| India | 79% |
| Mexico | 97% |

III-F. Structural Models

Estimation of underlying preference relations, production functions, and other constraints.
Learn about, for example, dynamic complementarities (e.g., Cunha and Heckman).
Can perform counterfactual experiments (e.g., Todd, P. E. and K. I. Wolpin (2006). "Using a Social Experiment to Validate a Dynamic Behavioral Model of Child Schooling and Fertility: Assessing the Impact of a School Subsidy Program in Mexico." American Economic Review 96(5): 1384-1417.)