Now what do I do with this function?

Similar documents
Estimating and Interpreting Effects for Nonlinear and Nonparametric Models

Day 4A Nonparametrics

Day 3B Nonparametrics and Bootstrap

Dealing With and Understanding Endogeneity

ECON 594: Lecture #6

Generalized method of moments estimation in Stata 11

A Journey to Latent Class Analysis (LCA)

i (x i x) 2 1 N i x i(y i y) Var(x) = P (x 1 x) Var(x)

Econ 582 Nonparametric Regression

Latent class analysis and finite mixture models with Stata

Generalized linear models

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised January 20, 2018

raise Coef. Std. Err. z P> z [95% Conf. Interval]

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

Tables and Figures. This draft, July 2, 2007

Description Remarks and examples Also see

Lab 10 - Binary Variables

Binary Dependent Variables

Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins-

Multivariate probit regression using simulated maximum likelihood

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

41903: Introduction to Nonparametrics

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Description Remarks and examples Also see

Binary Outcomes. Objectives. Demonstrate the limitations of the Linear Probability Model (LPM) for binary outcomes

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data

Estimation of multivalued treatment effects under conditional independence

Nonparametric Methods

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

12 - Nonparametric Density Estimation

Homework Solutions Applied Logistic Regression

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

David M. Drukker Italian Stata Users Group meeting 15 November Executive Director of Econometrics Stata

Suggested Answers Problem set 4 ECON 60303

Firm-sponsored Training and Poaching Externalities in Regional Labor Markets

Chapter 11. Regression with a Binary Dependent Variable

Consider Table 1 (Note connection to start-stop process).

Adding Uncertainty to a Roy Economy with Two Sectors

Introduction to Econometrics

fhetprob: A fast QMLE Stata routine for fractional probit models with multiplicative heteroskedasticity

Marginal Effects in Multiply Imputed Datasets

Description Remarks and examples Reference Also see

Problem Set 10: Panel Data

ECO220Y Simple Regression: Testing the Slope

GMM Estimation in Stata

Right-censored Poisson regression model

Section 7: Local linear regression (loess) and regression discontinuity designs

Lecture 2: Poisson and logistic regression

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

The Regression Tool. Yona Rubinstein. July Yona Rubinstein (LSE) The Regression Tool 07/16 1 / 35

Lecture Data Science

The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Remedial Measures for Multiple Linear Regression Models

2. We care about proportion for categorical variable, but average for numerical one.

The simulation extrapolation method for fitting generalized linear models with additive measurement error

Lab 07 Introduction to Econometrics

Nonparametric Density Estimation

Week 2: Review of probability and statistics

LOGISTIC REGRESSION Joseph M. Hilbe

Business Statistics. Lecture 10: Course Review

General Linear Model (Chapter 4)

Exploring Marginal Treatment Effects

Time Series and Forecasting Lecture 4 NonLinear Time Series

ECON3150/4150 Spring 2016

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

Regression #8: Loose Ends

Econ 371 Problem Set #6 Answer Sheet. deaths per 10,000. The 90% confidence interval for the change in death rate is 1.81 ±

Econometrics I. Lecture 10: Nonparametric Estimation with Kernels. Paul T. Scott NYU Stern. Fall 2018

User s Guide for interflex

Empirical Asset Pricing

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas

An explanation of Two Stage Least Squares

Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones

QIC program and model selection in GEE analyses

A simple alternative to the linear probability model for binary choice models with endogenous regressors

Selection on Observables: Propensity Score Matching.

Practice exam questions

Logit estimates Number of obs = 5054 Wald chi2(1) = 2.70 Prob > chi2 = Log pseudolikelihood = Pseudo R2 =

Estimation of Treatment Effects under Essential Heterogeneity

Multiple Regression: Inference

Estimation of ordered response models with sample selection

Lecture 5: Poisson and logistic regression

From the help desk: Comparing areas under receiver operating characteristic curves from two or more probit or logit models

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Regression Discontinuity Design

Estimation of quantile treatment effects with Stata

Understanding the multinomial-poisson transformation

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata

Applied Statistics and Econometrics

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

Econometrics Review questions for exam

Problem Set 4 ANSWERS

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Stat 5101 Lecture Notes

Lecture 3.1 Basic Logistic LDA

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas

Transcription:

Now what do I do with this function? Enrique Pinzón StataCorp LP December 08, 2017 Sao Paulo (StataCorp LP) December 08, 2017 Sao Paulo 1 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

Initial thoughts Nonparametric regression and about effects/questions npregress Mean relation between an outcome and covariates Model birtweight : age, education level, smoked, number of prenatal visits,... Model wages: age, education level, profession, tenure,... E (y X), conditional mean Parametric models have a known functional form Linear regression: E (y X) = X β Binary: E (y X) = F(Xβ) Poisson: E (y X) = exp(x β) Nonparametric E (y X). The result of using predict (StataCorp LP) December 08, 2017 Sao Paulo 2 / 42

(StataCorp LP) December 08, 2017 Sao Paulo 3 / 42

But... We had nonparametric regression tools lpoly lowess (StataCorp LP) December 08, 2017 Sao Paulo 4 / 42

But... We had nonparametric regression tools lpoly lowess (StataCorp LP) December 08, 2017 Sao Paulo 4 / 42

What happened in the past lpoly bweight mage if (msmoke==0 & medu>12 & fedu>12), /// mcolor(%30) lineopts(lwidth(thick)) (StataCorp LP) December 08, 2017 Sao Paulo 5 / 42

Effects: A thought experiment I give you the true function. list y x a gx in 1/10, noobs y x a gx 13.46181.7630615 2 12.73349 1.41086.9241793 1 1.547555 22.88834 1.816095 2 21.43813 10.97789.8206556 2 13.01466 11.37173.0440157 2 10.13213 -.1938587 1.083093 1.439635 55.87413 3.32037 2 56.56772 2.94979.8900821 1 1.804343-1.178733-2.342678 0-2.856946 48.79958 3.418333 0 49.94323 (StataCorp LP) December 08, 2017 Sao Paulo 6 / 42

Effects: A thought experiment I give you the true function. list y x a gx in 1/10, noobs y x a gx 13.46181.7630615 2 12.73349 1.41086.9241793 1 1.547555 22.88834 1.816095 2 21.43813 10.97789.8206556 2 13.01466 11.37173.0440157 2 10.13213 -.1938587 1.083093 1.439635 55.87413 3.32037 2 56.56772 2.94979.8900821 1 1.804343-1.178733-2.342678 0-2.856946 48.79958 3.418333 0 49.94323 (StataCorp LP) December 08, 2017 Sao Paulo 6 / 42

Effects: A thought experiment I give you the true function Do we know what are the marginal effects Do we know causal/treatment effects Do we know counterfactuals It seems cosmetic We cannot use margins (StataCorp LP) December 08, 2017 Sao Paulo 7 / 42

Effects: A thought experiment I give you the true function Do we know what are the marginal effects Do we know causal/treatment effects Do we know counterfactuals It seems cosmetic We cannot use margins (StataCorp LP) December 08, 2017 Sao Paulo 7 / 42

Effects: A thought experiment I give you the true function Do we know what are the marginal effects Do we know causal/treatment effects Do we know counterfactuals It seems cosmetic We cannot use margins (StataCorp LP) December 08, 2017 Sao Paulo 7 / 42

Effects: A thought experiment I give you the true function Do we know what are the marginal effects Do we know causal/treatment effects Do we know counterfactuals It seems cosmetic We cannot use margins (StataCorp LP) December 08, 2017 Sao Paulo 7 / 42

A detour margins (StataCorp LP) December 08, 2017 Sao Paulo 8 / 42

Effects: outcome of interest (StataCorp LP) December 08, 2017 Sao Paulo 9 / 42

Data crash 1 if crash traffic Measure of vehicular traffic tickets Number of traffic tickets male 1 if male (StataCorp LP) December 08, 2017 Sao Paulo 10 / 42

Probit model and average marginal effects probit crash tickets traffic i.male. margins Predictive margins Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() Delta-method Margin Std. Err. z P> z [95% Conf. Interval] _cons.1626529.0044459 36.58 0.000.153939.1713668. margins, dydx(traffic tickets) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : tickets traffic Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] tickets.0857818.0031049 27.63 0.000.0796963.0918672 traffic.0055371.0020469 2.71 0.007.0015251.009549 (StataCorp LP) December 08, 2017 Sao Paulo 11 / 42

Probit model and average marginal effects probit crash tickets traffic i.male. margins Predictive margins Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() Delta-method Margin Std. Err. z P> z [95% Conf. Interval] _cons.1626529.0044459 36.58 0.000.153939.1713668. margins, dydx(traffic tickets) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : tickets traffic Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] tickets.0857818.0031049 27.63 0.000.0796963.0918672 traffic.0055371.0020469 2.71 0.007.0015251.009549 (StataCorp LP) December 08, 2017 Sao Paulo 11 / 42

Probit model and average marginal effects probit crash tickets traffic i.male. margins Predictive margins Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() Delta-method Margin Std. Err. z P> z [95% Conf. Interval] _cons.1626529.0044459 36.58 0.000.153939.1713668. margins, dydx(traffic tickets) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : tickets traffic Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] tickets.0857818.0031049 27.63 0.000.0796963.0918672 traffic.0055371.0020469 2.71 0.007.0015251.009549 (StataCorp LP) December 08, 2017 Sao Paulo 11 / 42

Not calculus. margins, at(traffic=generate(traffic*1.10)) at(traffic=generate(traffic)) /// > contrast(atcontrast(r) nowald) Contrasts of predictive margins Model VCE : OIM Expression : Pr(crash), predict() 1._at : traffic = traffic*1.10 2._at : traffic = traffic Delta-method Contrast Std. Err. [95% Conf. Interval] _at (2 vs 1) -.0028589.0010882 -.0049917 -.0007262 (StataCorp LP) December 08, 2017 Sao Paulo 12 / 42

Probit model and counterfactuals. margins male Predictive margins Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() Delta-method Margin Std. Err. z P> z [95% Conf. Interval] male 0.0746963.0051778 14.43 0.000.0645481.0848446 1.2839021.008062 35.21 0.000.2681008.2997034. margins, dydx(male) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : 1.male Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] 1.male.2092058.0105149 19.90 0.000.188597.2298145 Note: dy/dx for factor levels is the discrete change from the base level. (StataCorp LP) December 08, 2017 Sao Paulo 13 / 42

Probit model and counterfactuals. margins male Predictive margins Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() Delta-method Margin Std. Err. z P> z [95% Conf. Interval] male 0.0746963.0051778 14.43 0.000.0645481.0848446 1.2839021.008062 35.21 0.000.2681008.2997034. margins, dydx(male) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : 1.male Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] 1.male.2092058.0105149 19.90 0.000.188597.2298145 Note: dy/dx for factor levels is the discrete change from the base level. (StataCorp LP) December 08, 2017 Sao Paulo 13 / 42

More counterfactuals. margins, dydx(tickets) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : tickets Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] tickets.0857818.0031049 27.63 0.000.0796963.0918672. margins, at(tickets=(0(1)5)) contrast(atcontrast(ar) nowald) Contrasts of predictive margins Model VCE : OIM Expression : Pr(crash), predict() 1._at : tickets = 0 2._at : tickets = 1 3._at : tickets = 2 4._at : tickets = 3 5._at : tickets = 4 6._at : tickets = 5 Delta-method Contrast Std. Err. [95% Conf. Interval] _at (2 vs 1).0001208.0001671 -.0002067.0004484 (3 vs 2).0547975.0177313.0200448.0895502 (4 vs 3).3503763.0225727.3061346.3946179 (5 vs 4).091227.0298231.0327747.1496793 (6 vs 5).37736.0283876.3217213.4329986 (StataCorp LP) December 08, 2017 Sao Paulo 14 / 42

More counterfactuals. margins, dydx(tickets) Average marginal effects Number of obs = 948 Model VCE : OIM Expression : Pr(crash), predict() dy/dx w.r.t. : tickets Delta-method dy/dx Std. Err. z P> z [95% Conf. Interval] tickets.0857818.0031049 27.63 0.000.0796963.0918672. margins, at(tickets=(0(1)5)) contrast(atcontrast(ar) nowald) Contrasts of predictive margins Model VCE : OIM Expression : Pr(crash), predict() 1._at : tickets = 0 2._at : tickets = 1 3._at : tickets = 2 4._at : tickets = 3 5._at : tickets = 4 6._at : tickets = 5 Delta-method Contrast Std. Err. [95% Conf. Interval] _at (2 vs 1).0001208.0001671 -.0002067.0004484 (3 vs 2).0547975.0177313.0200448.0895502 (4 vs 3).3503763.0225727.3061346.3946179 (5 vs 4).091227.0298231.0327747.1496793 (6 vs 5).37736.0283876.3217213.4329986 (StataCorp LP) December 08, 2017 Sao Paulo 14 / 42

marginsplot margins, at(tickets=(0(1)5)) marginsplot, ciopts(recast(rarea)) (StataCorp LP) December 08, 2017 Sao Paulo 15 / 42

Back to nonparametric regression npregress and nonparametric regression (StataCorp LP) December 08, 2017 Sao Paulo 16 / 42

Nonparametric regression: discrete covariates Mean function for a discrete covariate Mean wage conditional on having a college degree. mean wage if collgrad==1 Mean estimation Number of obs = 4,795 Mean Std. Err. [95% Conf. Interval] wage 8.648064.0693118 8.512181 8.783947 regress wage collgrad, noconstant E(wage collgrad = 1), nonparametric estimate (StataCorp LP) December 08, 2017 Sao Paulo 17 / 42

Nonparametric regression: discrete covariates Mean function for a discrete covariate Mean wage conditional on having a college degree. mean wage if collgrad==1 Mean estimation Number of obs = 4,795 Mean Std. Err. [95% Conf. Interval] wage 8.648064.0693118 8.512181 8.783947 regress wage collgrad, noconstant E(wage collgrad = 1), nonparametric estimate (StataCorp LP) December 08, 2017 Sao Paulo 17 / 42

Nonparametric regression: discrete covariates Mean function for a discrete covariate Mean wage conditional on having a college degree. mean wage if collgrad==1 Mean estimation Number of obs = 4,795 Mean Std. Err. [95% Conf. Interval] wage 8.648064.0693118 8.512181 8.783947 regress wage collgrad, noconstant E(wage collgrad = 1), nonparametric estimate (StataCorp LP) December 08, 2017 Sao Paulo 17 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Nonparametric regression: continuous covariates Conditional mean for a continuous covariate Mean wage conditional on tenure, measured in years E(wage tenure = 5.583333) Take observations near the value of 5.583333 and then take an average tenure i 5.583333 h h is a small number referred to as the bandwidth (StataCorp LP) December 08, 2017 Sao Paulo 18 / 42

Graphical example (StataCorp LP) December 08, 2017 Sao Paulo 19 / 42

Graphical example (StataCorp LP) December 08, 2017 Sao Paulo 20 / 42

Graphical example continued (StataCorp LP) December 08, 2017 Sao Paulo 21 / 42

Two concepts 1 h 2 Definition of distance between points, x i x h 1 (StataCorp LP) December 08, 2017 Sao Paulo 22 / 42

Kernel weights u x i x h Kernel K (u) ( ) 1 Gaussian exp u2 ( 2π ) 2 3 Epanechnikov 4 1 u2 5 5 I ( u 5 ) ( 3 Epanechnikov2 4 1 u 2 ) I ( u 1) 1 Rectangular(Uniform) 2I ( u 1) Triangular (1 u ) I ( u 1) ( 15 Biweight 16 1 u 2 ) 2 I ( u 1) ( 35 Triweight 32 1 u 2 ) 3 I ( u 1) Cosine (1 + cos (2πu)) I ( u 1 ) ( 2 Parzen 4 3 8u2 + 8 u 3) I ( u 1 2) + 8 3 (1 u )3 I ( 1 2 < u 1) (StataCorp LP) December 08, 2017 Sao Paulo 23 / 42

Kernel weights u x i x h Kernel K (u) ( ) 1 Gaussian exp u2 ( 2π ) 2 3 Epanechnikov 4 1 u2 5 5 I ( u 5 ) ( 3 Epanechnikov2 4 1 u 2 ) I ( u 1) 1 Rectangular(Uniform) 2I ( u 1) Triangular (1 u ) I ( u 1) ( 15 Biweight 16 1 u 2 ) 2 I ( u 1) ( 35 Triweight 32 1 u 2 ) 3 I ( u 1) Cosine (1 + cos (2πu)) I ( u 1 ) ( 2 Parzen 4 3 8u2 + 8 u 3) I ( u 1 2) + 8 3 (1 u )3 I ( 1 2 < u 1) (StataCorp LP) December 08, 2017 Sao Paulo 23 / 42

Discrete bandwidths Default Cell mean { 1 if xi = x k (.) = h otherwise k (.) = { 1 if xi = x 0 otherwise (StataCorp LP) December 08, 2017 Sao Paulo 24 / 42

Bandwidth (bias) (StataCorp LP) December 08, 2017 Sao Paulo 25 / 42

Bandwidth (variance) (StataCorp LP) December 08, 2017 Sao Paulo 26 / 42

Estimation Choose bandwidth optimally. Minimize bias variance trade off Cross-validation (default) Improved AIC (IMAIC) Compute a regression for every point in data (local linear) Computes derivatives and derivative bandwidths Compute a mean for every point in data (local-constant) (StataCorp LP) December 08, 2017 Sao Paulo 27 / 42

Example citations monthly drunk driving citations taxes 1 if alcoholic beverages are taxed fines drunk driving fines in thousands csize city size (small, medium, large) college 1 if college town (StataCorp LP) December 08, 2017 Sao Paulo 28 / 42

Example citations monthly drunk driving citations taxes 1 if alcoholic beverages are taxed fines drunk driving fines in thousands csize city size (small, medium, large) college 1 if college town (StataCorp LP) December 08, 2017 Sao Paulo 28 / 42

npregress bandwidth. npregress kernel citations fines Computing mean function Minimizing cross-validation function: Iteration 0: Cross-validation criterion = 35.478784 Iteration 1: Cross-validation criterion = 4.0147129 Iteration 2: Cross-validation criterion = 4.0104176 Iteration 3: Cross-validation criterion = 4.0104176 Iteration 4: Cross-validation criterion = 4.0104176 Iteration 5: Cross-validation criterion = 4.0104176 Iteration 6: Cross-validation criterion = 4.0104006 Computing optimal derivative bandwidth Iteration 0: Cross-validation criterion = 6.1648059 Iteration 1: Cross-validation criterion = 4.3597488 Iteration 2: Cross-validation criterion = 4.3597488 Iteration 3: Cross-validation criterion = 4.3597488 Iteration 4: Cross-validation criterion = 4.3597488 Iteration 5: Cross-validation criterion = 4.3597488 Iteration 6: Cross-validation criterion = 4.3595842 Iteration 7: Cross-validation criterion = 4.3594713 Iteration 8: Cross-validation criterion = 4.3594713 (StataCorp LP) December 08, 2017 Sao Paulo 29 / 42

npregress output. npregress kernel citations fines, nolog Bandwidth Mean Mean Effect fines.5631079.924924 Local-linear regression Number of obs = 500 Kernel : epanechnikov E(Kernel obs) = 282 Bandwidth: cross validation R-squared = 0.4380 citations Estimate Mean citations 22.33999 Effect fines -7.692388 Note: Effect estimates are averages of derivatives. Note: You may compute standard errors using vce(bootstrap) or reps(). (StataCorp LP) December 08, 2017 Sao Paulo 30 / 42

npregress predicted values. describe _* storage display value variable name type format label variable label _Mean_citations double %10.0g mean function _d_mean_citat ~ s double %10.0g derivative of mean function w.r.t fines (StataCorp LP) December 08, 2017 Sao Paulo 31 / 42

npgraph. npgraph (StataCorp LP) December 08, 2017 Sao Paulo 32 / 42

npregress standard errors I. quietly npregress kernel citations fines, reps(3) seed(111). estimates store A. quietly npregress kernel citations fines, vce(bootstrap, reps(3) seed(111)). estimates store B. estimates table A B, se Variable A B Mean citations 22.339995 22.339995.65062763.65062763 Effect fines -7.6923878-7.6923878.23195785.23195785 legend: b/se (StataCorp LP) December 08, 2017 Sao Paulo 33 / 42

npregress standard errors II (percentile C.I.). npregress Bandwidth Mean Mean Effect fines.5631079.924924 Local-linear regression Number of obs = 500 Kernel : epanechnikov E(Kernel obs) = 282 Bandwidth: cross validation R-squared = 0.4380 Observed Bootstrap Percentile citations Estimate Std. Err. z P> z [95% Conf. Interval] Mean citations 22.33999.6506276 34.34 0.000 21.54051 22.74807 Effect fines -7.692388.2319578-33.16 0.000-7.701931-7.267385 Note: Effect estimates are averages of derivatives. (StataCorp LP) December 08, 2017 Sao Paulo 34 / 42

A more interesting model. npregress kernel citations fines i.taxes i.csize i.college, reps(200) seed(10) Bandwidth Mean Mean Effect fines.4471373.6537197 taxes.4375656.4375656 csize.3938759.3938759 college.554583.554583 Local-linear regression Number of obs = 500 Continuous kernel : epanechnikov E(Kernel obs) = 224 Discrete kernel : liracine R-squared = 0.8010 Bandwidth : cross validation Observed Bootstrap Percentile citations Estimate Std. Err. z P> z [95% Conf. Interval] Mean citations 22.26306.4616724 48.22 0.000 21.39581 23.30278 Effect fines -7.332833.3341222-21.95 0.000-7.970275-6.665263 taxes (tax vs no tax) -4.502718.4946306-9.10 0.000-5.360078-3.465397 csize (medium vs small) 5.300524.2731374 19.41 0.000 4.723821 5.879301 (large vs small) 11.05053.5236424 21.10 0.000 9.942253 12.1252 college (college vs not coll..) 5.953188.500154 11.90 0.000 4.937102 6.969837 Note: Effect estimates are averages of derivatives for continuous covariates and averages of contrasts for factor covariates. (StataCorp LP) December 08, 2017 Sao Paulo 35 / 42

margins. margins, at(fines=generate(fines)) at(fines=generate(fines*1.15)) /// > contrast(atcontrast(r) nowald) reps(200) seed(12) (running margins on estimation sample) Bootstrap replications (200) 1 2 3 4 5... 50... 100... 150... 200 Contrasts of predictive margins Number of obs = 500 Replications = 200 Expression : mean function, predict() 1._at : fines = fines 2._at : fines = fines*1.15 Observed Bootstrap Percentile Contrast Std. Err. [95% Conf. Interval] _at (2 vs 1) -8.254875.8058215-10.44121-7.381583 (StataCorp LP) December 08, 2017 Sao Paulo 36 / 42

Another example with margins 10 + x 3 + ε if a = 0 y = 10 + x 3 10x + ε if a = 1 10 + x 3 + 3x + ε if a = 2 (StataCorp LP) December 08, 2017 Sao Paulo 37 / 42

Mean and marginal effects. quietly regress y (c.x#c.x#c.x)#i.a c.x#i.a. margins Predictive margins Number of obs = 1,000 Model VCE : OLS Expression : Linear prediction, predict() Delta-method Margin Std. Err. t P> t [95% Conf. Interval] _cons 12.02269.0313857 383.06 0.000 11.9611 12.08428. margins, dydx(*) Average marginal effects Number of obs = 1,000 Model VCE : OLS Expression : Linear prediction, predict() dy/dx w.r.t. : 1.a 2.a x Delta-method dy/dx Std. Err. t P> t [95% Conf. Interval] a 1-9.781302.05743-170.32 0.000-9.894-9.668604 2 3.028531.0544189 55.65 0.000 2.921742 3.13532 x 3.97815.0303517 131.07 0.000 3.91859 4.037711 Note: dy/dx for factor levels is the discrete change from the base level. (StataCorp LP) December 08, 2017 Sao Paulo 38 / 42

Mean and marginal effects. quietly regress y (c.x#c.x#c.x)#i.a c.x#i.a. margins Predictive margins Number of obs = 1,000 Model VCE : OLS Expression : Linear prediction, predict() Delta-method Margin Std. Err. t P> t [95% Conf. Interval] _cons 12.02269.0313857 383.06 0.000 11.9611 12.08428. margins, dydx(*) Average marginal effects Number of obs = 1,000 Model VCE : OLS Expression : Linear prediction, predict() dy/dx w.r.t. : 1.a 2.a x Delta-method dy/dx Std. Err. t P> t [95% Conf. Interval] a 1-9.781302.05743-170.32 0.000-9.894-9.668604 2 3.028531.0544189 55.65 0.000 2.921742 3.13532 x 3.97815.0303517 131.07 0.000 3.91859 4.037711 Note: dy/dx for factor levels is the discrete change from the base level. (StataCorp LP) December 08, 2017 Sao Paulo 38 / 42

npregress estimates. npregress kernel y x i.a, vce(bootstrap, reps(100) seed(111)) (running npregress on estimation sample) Bootstrap replications (100) 1 2 3 4 5... 50... 100 Bandwidth Mean Mean Effect x.3630656.5455175 a 3.05e-06 3.05e-06 Local-linear regression Number of obs = 1,000 Continuous kernel : epanechnikov E(Kernel obs) = 363 Discrete kernel : liracine R-squared = 0.9888 Bandwidth : cross validation Mean Effect Observed Bootstrap Percentile y Estimate Std. Err. z P> z [95% Conf. Interval] y 12.34335.3195918 38.62 0.000 11.57571 12.98202 x 3.619627.2937529 12.32 0.000 3.063269 4.143166 a (1 vs 0) -9.881542.3491042-28.31 0.000-10.5277-9.110781 (2 vs 0) 3.168084.2129506 14.88 0.000 2.73885 3.570004 Note: Effect estimates are averages of derivatives for continuous covariates and averages of contrasts for factor covariates. (StataCorp LP) December 08, 2017 Sao Paulo 39 / 42

Function for different values of x. margins, at(x=(1(.5)3)) reps(100) seed(111) (StataCorp LP) December 08, 2017 Sao Paulo 40 / 42

Funtion at different values of x for all a. margins a, at(x=(-1(1)3)) reps(100) seed(111) (StataCorp LP) December 08, 2017 Sao Paulo 41 / 42

Conclusion Intuition about nonparametric regression Details about how npregress Importance of being able to ask questions to your model (StataCorp LP) December 08, 2017 Sao Paulo 42 / 42