Stata module for decomposing goodness of fit according to Shapley and Owen values


rego
Stata module for decomposing goodness of fit according to Shapley and Owen values
Frank Huettner and Marco Sunder
Department of Economics, University of Leipzig, Germany
Presentation at the UK Stata Users Group meeting, London, 13/14 Sept. 2012

Motivation
Challenges in interpreting regression results by the analyst (the quick and dirty way)
- Sign econometrics
- Confusion of statistical significance and economic significance
- Interaction terms
Challenges to readers
- Units of measurement
- Omitted coefficients

Decompose R²

Motivation
Imagine regressor variables as players
- Cooperative game theory: games with transferable utility
- Regressors may form coalitions: how to distribute the gains from cooperation?
- Shapley (1953) and Owen (1977) values as means to decompose goodness of fit
- Axioms under which these values are unique solutions
Hierarchical partitioning (Lindeman et al. 1980, Chevan & Sutherland 1991)
Previous implementations:
- Stata: shapley (Kolenikov 2000)
- R: relaimpo package (Grömping 2006)

Illustrative calculation of the Shapley Value
Assume the full model is...
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
where y = log wage, x_1 = intelligence, x_2 = schooling, x_3 = experience

Worths in 2^k (sub-)models
[Figure: R² values of the submodels: 0.18, 0.54, 0.48, 0.30, 0.24, 0.12, 0.18]

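Marginal contributions (MC)
[Figure: the six orderings in which regressors are removed from the full model, in three columns labelled by the regressor that is removed first, annotated with R² values and marginal contributions: 0.30 0.18 0.12 | 0.30 0.12 0.12 | 0.18 0.54 0.18 | 0.18 0.54 0.24 | 0.48 0.12 0.12 | 0.48 0.24 0.24]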

$\mathrm{Sh}(x_j) = \frac{1}{k!} \sum_{\text{all permutations}} \mathrm{MC}$
Sh( ) = 1/6 (0.12 + 0.12 + 0.18 + 0.18 + 0.12 + 0.24) = 0.16
Sh( ) = 1/6 (0.42 + 0.42 + 0.36 + 0.24 + 0.36 + 0.24) = 0.34
Sh( ) = 1/6 (0.18 + 0.18 + 0.18 + 0.30 + 0.24 + 0.24) = 0.22
Σ = 0.72
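The averaging over permutations can be sketched in a few lines of Python. This is only an illustration, not the rego code: the labels a, b, c are hypothetical placeholders for the three regressors, and the worths (submodel R² values) are an assumed assignment that is consistent with the marginal contributions listed above. The sketch adds regressors one at a time rather than removing them, which yields the same marginal contributions.

```python
from itertools import permutations

# Hypothetical labels a, b, c; assumed worths (R² of each submodel) chosen to
# be consistent with the marginal contributions shown on the slide.
WORTH = {
    frozenset(): 0.0,
    frozenset("a"): 0.12,
    frozenset("b"): 0.24,
    frozenset("c"): 0.18,
    frozenset("ab"): 0.48,
    frozenset("ac"): 0.30,
    frozenset("bc"): 0.54,
    frozenset("abc"): 0.72,
}


def shapley(players, worth):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        included = frozenset()
        for p in order:
            # Gain in worth when p joins the regressors already included.
            totals[p] += worth[included | {p}] - worth[included]
            included = included | {p}
    return {p: totals[p] / len(orderings) for p in players}


values = shapley("abc", WORTH)
for name, val in values.items():
    print(name, round(val, 3))
print("sum", round(sum(values.values()), 3))
```

Under these assumed worths, the three values add up to the R² of the full model, which is the efficiency property.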

Axioms
1) Efficiency: GOF of full model is decomposed among the regressors
2) Monotonicity: Increase in R² must not decrease the value
3) Equal treatment: Perfect substitutes (in terms of GOF) receive the same value
The Shapley Value is the only value that satisfies these properties (Young 1985).

A priori grouped regressors
- Analyst may believe that some regressor variables belong together, e.g.: polynomial terms, region dummies
- Regressor variables are partitioned: $G = \{G_1, \dots, G_\ell, \dots\}$
- When thinking about marginal contributions of variables from $G_\ell$, variables of $G_q$, $q \neq \ell$, must be completely absent or present
- This limits the set of admissible model permutations: Owen Value
E.g.: nature, nurture

4 permutations respect G
[Figure: removal orderings annotated with submodel R² values: 0.30 0.18 | 0.30 0.12 | 0.54 0.18 | 0.54 0.24 | 0.48 0.12 | 0.48 0.24]
Ow( ) = 0.33
Ow( ) = 0.165
Ow( ) = 0.225
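The restriction to admissible permutations can be sketched as follows. This is a minimal illustration, not the rego code. The labels a, b, c and the worths are assumptions carried over from the illustrative example, and the partition {b}, {a, c} is assumed because it leaves four admissible orderings and reproduces the three Owen values above.

```python
from itertools import permutations

# Hypothetical labels and assumed worths (R² of each submodel).
WORTH = {
    frozenset(): 0.0,
    frozenset("a"): 0.12,
    frozenset("b"): 0.24,
    frozenset("c"): 0.18,
    frozenset("ab"): 0.48,
    frozenset("ac"): 0.30,
    frozenset("bc"): 0.54,
    frozenset("abc"): 0.72,
}


def owen(groups, worth):
    """Average marginal contributions over orderings that keep groups together."""
    players = [p for g in groups for p in g]
    group_of = {p: i for i, g in enumerate(groups) for p in g}
    totals = {p: 0.0 for p in players}
    n_admissible = 0
    for order in permutations(players):
        # An ordering respects the partition if every group forms one block,
        # i.e. the group label changes exactly len(groups) - 1 times.
        labels = [group_of[p] for p in order]
        changes = sum(1 for x, y in zip(labels, labels[1:]) if x != y)
        if changes != len(groups) - 1:
            continue
        n_admissible += 1
        included = frozenset()
        for p in order:
            totals[p] += worth[included | {p}] - worth[included]
            included = included | {p}
    return {p: totals[p] / n_admissible for p in players}


# Assumed partition: b on its own, a and c grouped together.
values = owen([("b",), ("a", "c")], WORTH)
for name, val in values.items():
    print(name, round(val, 4))
```

With k regressors in one big group the function reduces to the Shapley value, since every ordering is then admissible.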

Axioms
1) Efficiency*
2) Monotonicity*
3) Equal treatment of players
4) Equal treatment of groups
The Owen Value is the only value that satisfies these properties (Khmelnitskaya and Yanovskaya 2007).

Stata implementation
- With grouping, large models become possible
- User decides for which groups to calculate Owen values
- R² calculated from covariance structure of the data
- Syntax uses \ to indicate group boundaries in varlist
- Computation in Mata
- Bootstrapping option
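One way to read "R² calculated from covariance structure" is that the R² of any submodel can be obtained from the covariance matrix alone, without re-running a regression on the raw data: for a subset S of regressors, R² = c_S' C_SS⁻¹ c_S / var(y), where C_SS holds the covariances among the regressors in S and c_S their covariances with y. The Python sketch below illustrates that identity on made-up toy data. It is an assumption about the general idea, not the module's Mata code, and all names in it are hypothetical.

```python
def covariance(u, v):
    """Sample covariance of two equally long lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def r_squared(cov, names, y, subset):
    """R² of the OLS regression of y on `subset`, using covariances only."""
    if not subset:
        return 0.0
    idx = [names.index(v) for v in subset]
    iy = names.index(y)
    C = [[cov[i][j] for j in idx] for i in idx]   # covariances among regressors
    c = [cov[i][iy] for i in idx]                 # covariances with y
    beta = solve(C, c)                            # OLS slope coefficients
    return sum(b * ci for b, ci in zip(beta, c)) / cov[iy][iy]


# Made-up toy data, for illustration only.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "y": [1.2, 1.9, 3.4, 3.8, 5.3, 5.9],
}
names = list(data)
cov = [[covariance(data[a], data[b]) for b in names] for a in names]
for subset in ([], ["x1"], ["x2"], ["x1", "x2"]):
    print(subset, round(r_squared(cov, names, "y", subset), 4))
```

The covariance matrix is computed once, and each of the 2^k submodel worths then only requires solving a small linear system.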

Wage regression: German male employees
Table 1. OLS regression results with decomposition of R² (in %)

                                          R² decomposition (%)
Group  Regressor                Coef.     Owen     Group
1      SCT                      0.789      3.0      33.2
       SCT × EDUC               0.048      8.3
       EDUC                     0.103     21.9
2      EXPER                    0.025      7.0      11.0
       (EXPER)²/100             0.041      4.0
3      TENURE                   0.017      9.3      14.3
       (TENURE)²/100            0.029      5.0
4      MARRIED                  0.084      5.0       5.0
5      Firm size (3 dummies)                        14.7
6      Industry (6 dummies)                          5.5
7      Region (14 dummies)                          16.2

Observations: 850
Full model R²: 0.501
Remark: Statistical significance at the 10% / 5% / 1% level refers to individual variables (t-test) or groups of dummy variables (F-test), based on the heteroscedasticity-robust covariance matrix.

Wage regression: German male employees
[Figure: absolute contribution to R² (axis from 0 to 0.2) for the groups 1 EDU / SCT, 2 EXPER polynomial, 3 TENURE polynomial, 4 MARRIED, 5 Firm size dummies, 6 Industry dummies, 7 Region dummies]
Fig 1. Decomposition results for groups, with 90% bootstrap confidence intervals, based on 5000 bootstrap replications.

Wage regression: German male employees
[Figure: absolute contribution to R² (axis from 0 to 0.14) for SCT, SCT × EDUC, EDUC, EXPER, (EXPER)²/100, TENURE, (TENURE)²/100]
Fig 2. Owen value decomposition results for human capital variables, with 90% bootstrap confidence intervals, based on 5000 bootstrap replications.

Limits
- Only OLS with R² decomposition at this time
- Does not yet accept factor variables or weights...
Possible extensions?
- Decomposition of other measures (e.g., AIC)
- More levels of aggregation
- Essential regressors / fixed effects model (econ)

Thank you!
www.uni-leipzig.de/~rego
Huettner & Sunder (2012), Electronic Journal of Statistics