Nonparametric Econometrics in R

Similar documents
Nonparametric regresion models estimation in R

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

The np Package. 1. Introduction. Tristen Hayfield ETH Zürich. Jeffrey S. Racine McMaster University

Nonparametric Estimation of Regression Functions In the Presence of Irrelevant Regressors

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Nonparametric Regression

Lecture 3: Multiple Regression

Local regression I. Patrick Breheny. November 1. Kernel weighted averages Local linear regression

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

13 Endogeneity and Nonparametric IV

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Introduction to Nonparametric Regression

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation

Using Nonparametric Smoothing: Adaptation and Testing Parametric Models

Nonparametric Methods

Using Non-parametric Methods in Econometric Production Analysis: An Application to Polish Family Farms

5.1 Model Specification and Data 5.2 Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares

Entropy-Based Inference using R and the np Package: A Primer

Nonparametric Econometrics

Leverage. the response is in line with the other values, or the high leverage has caused the fitted model to be pulled toward the observed response.

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Modelling Non-linear and Non-stationary Time Series

Econometrics Multiple Regression Analysis: Heteroskedasticity

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'

Classification. Chapter Introduction. 6.2 The Bayes classifier

Error distribution function for parametrically truncated and censored data

GeoDa-GWR Results: GeoDa-GWR Output (portion only): Program began at 4/8/2016 4:40:38 PM

USE OF R ENVIRONMENT FOR A FINDING PARAMETERS NONLINEAR REGRESSION MODELS

Lecture Notes 15 Prediction Chapters 13, 22, 20.4.

Regression Discontinuity Design Econometric Issues

Package causalweight

Homoskedasticity. Var (u X) = σ 2. (23)

the error term could vary over the observations, in ways that are related

Econometrics 1. Lecture 8: Linear Regression (2) 黄嘉平

Nonparametric Regression Härdle, Müller, Sperlich, Werwarz, 1995, Nonparametric and Semiparametric Models, An Introduction

ST430 Exam 2 Solutions

Analysis of Cross-Sectional Data

Day 3B Nonparametrics and Bootstrap

A tour of kernel smoothing

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

Time Series and Forecasting Lecture 4 NonLinear Time Series

Day 4A Nonparametrics

Econ 582 Nonparametric Regression

Alternatives to Basis Expansions. Kernels in Density Estimation. Kernels and Bandwidth. Idea Behind Kernel Methods

Local linear multiple regression with variable. bandwidth in the presence of heteroscedasticity

Bayesian estimation of bandwidths for a nonparametric regression model with a flexible error density

Topic 20: Single Factor Analysis of Variance

Introduction to Nonparametric and Semiparametric Estimation. Good when there are lots of data and very little prior information on functional form.

A review of some semiparametric regression models with application to scoring

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

ECON 721: Lecture Notes on Nonparametric Density and Regression Estimation. Petra E. Todd

Test for Discontinuities in Nonparametric Regression

TESTING FOR THE EQUALITY OF k REGRESSION CURVES

Statistical Methods for SVM

Titolo Smooth Backfitting with R

A Monte Carlo Investigation of Smoothing Methods. for Error Density Estimation in Functional Data. Analysis with an Illustrative Application to a

JACKKNIFE MODEL AVERAGING. 1. Introduction

Nonparametric Density Estimation (Multidimension)

USING A BIMODAL KERNEL FOR A NONPARAMETRIC REGRESSION SPECIFICATION TEST

Specification Testing of Production in a Stochastic Frontier Model

ECO Class 6 Nonparametric Econometrics

A New Test in Parametric Linear Models with Nonparametric Autoregressive Errors

Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods

Least Squares Model Averaging. Bruce E. Hansen University of Wisconsin. January 2006 Revised: August 2006

Nonparametric Regression. Badr Missaoui

Non-parametric Inference and Resampling

Uniform Convergence Rates for Nonparametric Estimation

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Introduction. Linear Regression. coefficient estimates for the wage equation: E(Y X) = X 1 β X d β d = X β

JACKKNIFE MODEL AVERAGING. 1. Introduction

Function of Longitudinal Data

Econometric Methods for Panel Data

Test of Convergence in Agricultural Factor Productivity: A Semiparametric Approach

Multiple Regression Analysis

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Hypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Estimation for nonparametric mixture models

Identification and Estimation of Semiparametric Two Step Models

An introduction to nonparametric and semi-parametric econometric methods

A Bootstrap Test for Conditional Symmetry

Nonparametric Methods in Econometrics using

Introduction to Regression

arxiv: v1 [stat.me] 3 Nov 2015

Simple Estimators for Semiparametric Multinomial Choice Models

Nonparametric and Semiparametric Regressions Subject to Monotonicity Constraints: Estimation and Forecasting

SPRING 2007 EXAM C SOLUTIONS

A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES

Some Monte Carlo Evidence for Adaptive Estimation of Unit-Time Varying Heteroscedastic Panel Data Models

Regression Analysis Tutorial 228 LECTURE / DISCUSSION. Simultaneous Equations

Regression Discontinuity Design

8. Instrumental variables regression

ECON 4551 Econometrics II Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova s notes

Working Paper No Maximum score type estimators

Semiparametric Trending Panel Data Models with Cross-Sectional Dependence

SINGLE-STEP ESTIMATION OF A PARTIALLY LINEAR MODEL

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Multiple Regression Analysis

On the Robust Modal Local Polynomial Regression

Transcription:

Nonparametric Econometrics in R Philip Shaw Fordham University November 17, 2011 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 1 / 16

Introduction The NP Package R seems to be the only software package that has a list of comprehensive nonparametric routines A variety of nonparametric econometrics can be run under the NP package Much of the code was written and is maintained by Jeffrey Racine, McMaster University Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 2 / 16

Data for the Examples Cereal Data Scanner data on price (dpfwgtavgppoz) and quantity (lndfshr) of cereal across five brands n=10450 Includes cereal characteristics such as fiber, calories, sugar, etc. Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 3 / 16

Density Estimation Routines in the Package Density Estimation and Bandwidth Selection Suppose you are interested in estimating the unconditional density function: where: ˆf (x) = 1 hn k(x i x ) (1) h k(v) = 1 2π e 1 2 v 2 (2) h is referred to as the bandwidth Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 4 / 16

Density Estimation Routines in the Package Density Estimation and Bandwidth Selection For various types of data: K γ (x i, x) = W h (x c i, x c )L(x d i, x d, λ)j(x s i, x s, λ) (3) W h (x c i, x c ) = r 1 L(x d i, x d, λ) = J(x s i, x s, λ) = j 1 ( x c j x c ) ji w h j h j r 2 j r 3 j (4) λ I (xd ji xd j ) j (5) λ xs ji xs j j (6) where x c are continous, x d are discrete, and x s are discrete ordered variables. Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 5 / 16

Routines in the Package Density Estimation and Bandwidth Selection Bandwidth Selection For density estimation: CV (h) = 1 n 2 h n n i=1 j=1 For kernel regression: where 0 M(x i ) 1. AIC = 1 n k( x i x j ) h γ opt = argmin γ R l i=1 where H ij = K γ,ij / n l=1 K γ,il. 2 n(n 1)h n n i=1 j i,j=1 k( x i x j ) (7) h n (Y t ˆm i (x i )) 2 M(x i ) (8) i=1 n {y i ˆm(x i )} 2 1 + tr(h)/n + 1 {tr(h) + 2}/n Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 6 / 16 (9)

Routines in the Package The npudensbw command Density Estimation and Bandwidth Selection > bw < npudensbw( dpfwgtavgppoz) > bw Data (10450 observations, 1 variable(s)): dpfwgtavgppoz Bandwidth(s): 0.01340203 Bandwidth Selection Method: Maximum Likelihood Cross-Validation Formula: dpfwgtavgppoz Bandwidth Type: Fixed Objective Function Value: -1.66763 (achieved on multistart 1) Continuous Kernel Type: Second-Order Gaussian No. Continuous Vars.: 1 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 7 / 16

Routines in the Package The npudens command Density Estimation and Bandwidth Selection > fhat < npudens(bw) > plot(fhat, plot.errors.method= bootstrap ) Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 8 / 16

Kernel Regression Routines in the Package Kernel Regression Now suppose we want to estimate the demand curve for cereal Typically papers have assumed that demand is a linear function of price such that: q = β 0 + β 1 p + u (10) Instead we would like to estimate the demand curve without imposing a linear assumption on the model: q = f (p) + u (11) Under the assumption that E(u p) = 0 we can consistently estimate the function f (p) as: n E[q x] ˆ i=1 = ˆf (p) = q ik γ (p i, p) n i=1 K γ(p i, p) (12) Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 9 / 16

Routines in the Package The npregbw command Kernel Regression For our example we can select the optimal bandwidth for the kernel regression using the following R code: > bw1 < npregbw(formula=lndfshr dpfwgtavgppoz) > bw1 Regression Data (10450 observations, 1 variable(s)): dpfwgtavgppoz Bandwidth(s): 0.008413665 Regression Type: Local-Constant Bandwidth Selection Method: Least Squares Cross-Validation Formula: lndfshr dpfwgtavgppoz Bandwidth Type: Fixed Objective Function Value: 0.963636 (achieved on multistart 1) Continuous Kernel Type: Second-Order Gaussian No. Continuous Explanatory Vars.: 1 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 10 / 16

The npreg command Routines in the Package Kernel Regression > fhat1 < npreg(bw1) > plot(fhat1, plot.errors.method= bootstrap ) Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 11 / 16

Significance Testing Routines in the Package Significance Testing Suppose we would like to test the following condition: versus H 0 : E(y x) = E(y) (13) H 1 : E(y x) E(y) (14) In the linear model this test is equivalent to using a t-test on the estimated coefficient under the zero null The problem is that this test is in general not a consistent test Racine proposes a consistent test in the nonparametric setting Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 12 / 16

Routines in the Package The npsigtest command Significance Testing > npsigtest(bw1,boot.num=100,boot.method = wild ) Kernel Regression Significance Test Type I Test with Wild Bootstrap (100 replications) Explanatory variables tested for significance: dpfwgtavgppoz (1) dpfwgtavgppoz Bandwidth(s): 0.008413665 Significance Tests P Value: dpfwgtavgppoz < 2.22e 16 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 13 / 16

Routines in the Package Functional Form Testing Functional Form Testing Suppose we would like to test the following condition: H 0 : versus E(y x) = m(x, γ 0 ), for almost all x and for some γ 0 B R p (15) H 1 : E(y x) m(x, γ 0 ) (16) > model < lm(lndfshr dpfwgtavgppoz, x=true, y=true) > model Call: lm(formula = lndfshr dpfwgtavgppoz, x = TRUE, y = TRUE) Coefficients: (Intercept) dpfwgtavgppoz -3.068-2.841 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 14 / 16

Routines in the Package The npcmstest command Functional Form Testing > npcmstest(model = model, xdat = dpfwgtavgppoz, ydat = lndfshr) Consistent Model Specification Test Parametric null model: lm(formula = lndfshr dpfwgtavgppoz, x = TRUE, y = TRUE) Number of regressors: 1 Wild Bootstrap (100 replications) Test Statistic Jn: 119.5579 P Value: < 2.22e 16 Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 15 / 16

Other Procedures in the NP Package Other commands in the NP package Conditional density estimation Local linear regression Statistical test of distribution equality Instrumental variable estimation Semiparametric single index models Philip Shaw (Fordham University) Nonparametric Econometrics in R November 17, 2011 16 / 16