Day 3B Nonparametrics and Bootstrap

Similar documents
Day 4A Nonparametrics

11. Bootstrap Methods

Day 1A Ordinary Least Squares and GLS

Day 2A Instrumental Variables, Two-stage Least Squares and Generalized Method of Moments

Day 4 Nonlinear methods

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

Эконометрика, , 4 модуль Семинар Для Группы Э_Б2015_Э_3 Семинарист О.А.Демидова

Day 1B Count Data Regression: Part 2

ECON 594: Lecture #6

Confidence intervals for kernel density estimation

Now what do I do with this function?

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

GMM Estimation in Stata

Nonparametric Econometrics

Nonparametric Methods

Generalized method of moments estimation in Stata 11

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

STAT 704 Sections IRLS and Bootstrap

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation

Remedial Measures for Multiple Linear Regression Models

Day 2B Inference for Clustered Data: Part 2 With a Cross-section Data Example (Revised August 25 to add 14. Dyadic Clustering)

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

Hypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

12 - Nonparametric Density Estimation

GLS. Miguel Sarzosa. Econ626: Empirical Microeconomics, Department of Economics University of Maryland

Bootstrap-Based Improvements for Inference with Clustered Errors

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

Introduction to Regression

STATISTICS 110/201 PRACTICE FINAL EXAM

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Multivariate Regression: Part I

Unemployment Rate Example

Applied Statistics and Econometrics

Nonlinear Regression Functions

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Dynamic Panels. Chapter Introduction Autoregressive Model

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Empirical Asset Pricing

User s Guide for interflex

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

Lab 11 - Heteroskedasticity

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

Jeffrey M. Wooldridge Michigan State University

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

****Lab 4, Feb 4: EDA and OLS and WLS

Binary Dependent Variables

Instrumental Variables. Ethan Kaplan

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

Introduction to Nonparametric and Semiparametric Estimation. Good when there are lots of data and very little prior information on functional form.

A Practitioner s Guide to Cluster-Robust Inference

Estimating and Interpreting Effects for Nonlinear and Nonparametric Models

A Journey to Latent Class Analysis (LCA)

Essential of Simple regression

Time Series and Forecasting Lecture 4 NonLinear Time Series

Day 1A Count Data Regression: Part 1

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values

Applied Statistics and Econometrics

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Econometrics Midterm Examination Answers

4 Resampling Methods: The Bootstrap

Lecture#12. Instrumental variables regression Causal parameters III

Introduction to Regression

The Nonparametric Bootstrap

David M. Drukker Italian Stata Users Group meeting 15 November Executive Director of Econometrics Stata

Nonparametric Inference via Bootstrapping the Debiased Estimator

Business Statistics. Lecture 10: Course Review

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Review of Statistics 101

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Chapter 6. Estimation of Confidence Intervals for Nodal Maximum Power Consumption per Customer

Model-free prediction intervals for regression and autoregression. Dimitris N. Politis University of California, San Diego

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner

Rank Estimation of Partially Linear Index Models

xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-t panel data models

General Linear Model (Chapter 4)

MC3: Econometric Theory and Methods. Course Notes 4

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Environmental Econometrics

Casuality and Programme Evaluation

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Latent class analysis and finite mixture models with Stata

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11

Lecture 8: Heteroskedasticity. Causes Consequences Detection Fixes

ECON Introductory Econometrics. Lecture 2: Review of Statistics

How To Do Piecewise Exponential Survival Analysis in Stata 7 (Allison 1995:Output 4.20) revised

Transcription:

Day 3B Nonparametrics and Bootstrap c A. Colin Cameron Univ. of Calif.- Davis Frontiers in Econometrics Bavarian Graduate Program in Economics. Based on A. Colin Cameron and Pravin K. Trivedi (2009,2010), Microeconometrics using Stata (MUS), Stata Press. and A. Colin Cameron and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications (MMA), C.U.P. March 21-25, 2011 c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 1 Colin / 28 C

1. ntroduction 1. ntroduction Brief discussion of nonparametric and semiparametric methods and the bootstrap. 1 ntroduction 2 Nonparametric (kernel) density estimation 3 Nonparametric (kernel) regression 4 Semiparametric regression 5 Bootstrap 6 Stata Commands 7 Appendix: Histogram and kernel density estimate c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 2 Colin / 28 C

2. Nonparametric (kernel) density estimation Summary 2. Nonparametric (kernel) density estimation Parametric density estimate assume a density and use estimated parameters of this density e.g. normal density estimate: assume y i N [µ, σ 2 ] and use N [ȳ, s 2 ]. Nonparametric density estimate: a histogram break data into bins and use relative frequency within each bin Problem: a histogram is a step function, even if data are continuous Smooth nonparametric density estimate: kernel density estimate. Kernel density estimate smooths a histogram in two ways: use overlapping bins so evaluate at many more points use bins of greater width with most weight at the middle of the bin. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 3 Colin / 28 C

2. Nonparametric (kernel) density estimation Histogram data example Formula: bf HST (x 0 ) = 1 2Nh N i=1 1(x 0 h < x i < x 0 + h) or bf HST (x 0 ) = 1 Nh N 1 i=1 2 1 x i x 0 h < 1. Data example: histogram of lnwage for 175 observations Varies with the bin width (or equivalently the number of bins) Here 30 bins, each of width 2h ' 0.20 so h ' 0.10. Density 0.2.4.6.8 1 0 1 2 3 4 5 natural log of hwage c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 4 Colin / 28 C

2. Nonparametric (kernel) density estimation Kernel density data example Kernel density estimate of f (x 0 ) replaces 1 (A) by kernel K (A) : bf (x 0 ) = 1 Nh N i=1 K x i x 0 h Data example: kernel of lnwage for 175 observations Epanechnikov kernel K (z) = 0.75(1 z 2 ) 1(jzj < 1) h = 0.07 (oversmooths), 0.21 (default) or 0.63 (undersmooths) kdensity lnhwage 0.2.4.6.8 0 1 2 3 4 5 x Default Twice default Half default c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 5 Colin / 28 C

2. Nonparametric (kernel) density estimation mplementation mplementation Stata examples are kdensity y uses defaults kdensity y, bw(0.2) manually set bandwidth kdensity y, normal overlays the N [ȳ, s 2 ] density hist y, kdensity gives both histogram and kernel estimates Key is choice of bandwidth The default can oversmooth: may need to decrease bw() Less important is choice of kernel: default is Epanechnikov. Other smooth estimators exist including k-nearest neighbors. But usually no reason to use anything but kernel. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 6 Colin / 28 C

3. Kernel regression Local average estimator 3. Kernel regression: Local average estimator The regression model is y i = m(x i ) + u i, u i i.i.d. (0, σ(x i ) 2 ). The functional form m() is not speci ed, so NLS not possible. f many obs have x = x 0 use the average of the yi 0s at x i = x 0 : bm(x 0 ) = i : y xi =x0 i / i : 1 xi =x0 N = i=1 1 (x N i = x 0 ) y i / i=1 1 (x i = x 0 ) nstead few values of y i at x = x 0 so do local average estimator: N bm(x 0 ) = i=1 w N i0y i / i=1 w i0 where weights w i0 = w(x i, x 0 ) are largest for x i close to x 0. Evaluate at a variety of points x 0 gives regression curve. Di erent methods use di erent weight functions w i0 = w(x i, x 0 ) c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 7 Colin / 28 C

3. Kernel regression Common nonparametric regression estimators Common nonparametric regression estimators 1. Kernel estimate of m(x 0 ) replaces 1 (x i x 0 = 0) by K x i x 0 bm(x 0 ) = 1 Nh N i=1 K x i x 0 h yi / 1 Nh N i=1 K x i x 0 h 2. Local linear estimate of bm(x 0 ) minimizes w.r.t. a 0 and b 0 1 Nh N i=1 K x i x 0 h (yi a 0 b 0 (x i x 0 )) 2 Motivation: Kernelestimate is equivalent to bm(x 0 ) minimizes 1 Nh N i =1 K xi x 0 h (y i m 0 ) 2 with respect to m 0. better on endpoints 3. Lowess (locally weighted scatterplot smoothing) variation of local linear with variable bandwidth, tricubic kernel and downweighting of outliers. 4. K-Nearest neighbors Average the yi 0 s for the k x 0 i s that are closest to x 0 c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 8 Colin / 28 C h

3. Kernel regression Kernel regression data example Kernel regression with 95% con dence bands, default Kernel (Epanechnikov) and default bandwidths lpoly lnhwage educatn, ci msize(medsmall) Local poly nomial smooth natural log of hwage 0 1 2 3 4 5 0 5 10 15 20 years of completed schooling 1992 95% C natural log of hwage lpoly smooth kernel = epanechnikov, degree = 0, bandwidth = 1.53, pwidth = 2.3 c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program in March Economics 21-25,. Based 2011 on A. 9 Colin / 28 C

3. Kernel regression Di erent bandwidths Kernel regression with three bandwidths: default, half and double. smoother with larger bandwidth lpoly smooth: natural log of hwage 1.5 2 2.5 3 0 5 10 15 20 lpoly smoothing grid Default Twice default Half default c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 10 Colin / 28 C

3. Kernel regression Compare three methods Kernel, local linear and lowess with default bandwidths graph twoway lpoly y x jj lpoly y x, deg(1) jj lowess y x kernel erroneously underestimates m(x) at the endpoint x = 17. 1.5 2 2.5 3 0 5 10 15 20 Kernel Local linear lowess c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 11 Colin / 28 C

3. Kernel regression mplementation mplementation Di erent methods work di erently Local linear and local polynomial handle endpoints better than kernel. bm(x 0 ) is asymptotically normal this gives con dence bands that allow for heteroskedasticity Bandwidth choice is crucial optimal bandwidth trades o bias (minimized with small bandwidth) and variance (minimized with large bandwidth) theory just says optimal bandwidth for kernel regression is O(N 0.2 ) plug-in or default bandwidth estimates are often not the best so also try e.g. half and two times the default. cross validation minimizes the empirical mean square error i (y i bm i (x i )) 2, where bm i (x i ) is the leave-one-out" estimate of bm(x i ) formed with y i excluded. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 12 Colin / 28 C

4. Semiparametric estimation Motivation 4. Semiparametric estimation Nonparametric regression is problematic when more than one regressor in theory can do multivariate kernel regression in practice the local averages are over sparse cells called the curse of dimensionality Semiparametric methods place some structure on the problem parametric component for part of the model nonparametric component that is often one dimensional c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 13 Colin / 28 C

4. Semiparametric estimation Leading examples Leading semiparametric examples Partially linear model E[y i jx i, z i ] = x 0 i β + λ(z i ) Estimate λ() nonparametrically and ideally p N( b β β) d! N [0, V] Single-index model E[y i jx i ] = g(x 0 i β) Estimate g() nonparametrically and ideally p N( β b β)! d N [0, V] Can only estimate β up to scale in this model Still useful as ratio of coe cients equals ratio of marginal e ects in a single-index models Generalized additive model E[y i jx i ] = g 1 (x 1i ) + + g K (x Ki ) c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 14 Colin / 28 C

5. Bootstrap Estimate of standard error 5. Bootstrap estimate of standard error Basic idea is view f(y 1, x 1 ),..., (y N, x N )g as the population. Then obtain B random samples from this population Get B estimates bθ 1,..., bθ S. Then estimate Var[bθ] using the usual standard deviation of the B estimates bv[bθ] = 1 B 1 B b=1 ( bθ s bθ) 2, where bθ = 1 B B b=1 b θ b. Square root of this is called a bootstrap standard error. To get B di erent samples of size N we resample with replacement from f(y 1, x 1 ),..., (y N, x N )g n each bootstrap sample some original data points appear more than once while others not appear at all. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 15 Colin / 28 C

5. Bootstrap Regression application Regression application Data: Doctor visits (count) and chronic conditions. N = 50. Contains data from musbootdata.dta obs: 50 vars: 3 16 Apr 2010 10:32 size: 750 (99.9% of memory free) storage display value variable name type format label variable label docvis int %8.0g number of doctor visits age float %9.0g Age in years / 10 chronic byte %8.0g = 1 if a chronic condition Sorted by:. summarize Variable Obs Mean Std. Dev. Min Max docvis 50 4.12 7.82106 0 43 age 50 4.162 1.160382 2.6 6.2 chronic 50.28.4535574 0 1 c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 16 Colin / 28 C

5. Bootstrap Standard error estimation Bootstrap standard errors after Poisson regression Use option vce(boot) Set the seed! Set the number of bootstrap repetitions!. * Compute bootstrap standard errors using option vce(bootstrap) to. poisson docvis chronic, vce(boot, reps(400) seed(10101) nodots) Poisson regression Number of obs = 50 Replications = 400 Wald chi2(1) = 3.50 Prob > chi2 = 0.0612 Log likelihood = -238.75384 Pseudo R2 = 0.0917 Observed Bootstrap Normal-based docvis Coef. Std. Err. z P> z [95% Conf. nterval] chronic.9833014.5253149 1.87 0.061 -.0462968 2.0129 _cons 1.031602.3497212 2.95 0.003.3461607 1.717042 Bootstrap se = 0.525 versus White robust se = 0.515. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 17 Colin / 28 C

5. Bootstrap Standard error estimation Results vary with seed and number of reps. * Bootstrap standard errors for different reps and seeds. quietly poisson docvis chronic, vce(boot, reps(50) seed(10101)). estimates store boot50. quietly poisson docvis chronic, vce(boot, reps(50) seed(20202)). estimates store boot50diff. quietly poisson docvis chronic, vce(boot, reps(2000) seed(10101)). estimates store boot2000. quietly poisson docvis chronic, vce(robust). estimates store robust. estimates table boot50 boot50diff boot2000 robust, b(%8.5f) se(%8.5f) Variable boot50 boot50~f boot2000 robust chronic 0.98330 0.98330 0.98330 0.98330 0.47010 0.50673 0.53479 0.51549 _cons 1.03160 1.03160 1.03160 1.03160 0.39545 0.32575 0.34885 0.34467 legend: b/se c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 18 Colin / 28 C

5. Bootstrap Leading uses of bootstrap standard errors Leading uses of bootstrap standard errors Sequential two-step m-estimator First step gives bα used to create a regressor z(bα) Second step regresses y on x and z(bα) Do a paired bootstrap resampling (x, y, z) e.g. Heckman two-step estimator. 2SLS estimator with heteroskedastic errors (if no White option) Paired bootstrap gives heteroskedastic robust standard errors. Functions of other estimates e.g. bθ = bα bβ replaces delta method Clustered data with many small clusters, such as short panels. F F Then resample the clusters. But be careful if model includes cluster-speci c xed e ects. For these in Stata need to use pre x command bootstrap: c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 19 Colin / 28 C

5. Bootstrap General algorithm The bootstrap: general algorithm A general bootstrap algorithm is as follows: 1. Given data w 1,..., w N F draw a bootstrap sample of size N (see below) F denote this new sample w1,..., w N. 2. Calculate an appropriate statistic using the bootstrap sample. Examples include: F (a) estimate bθ of θ; F (b) standard error s b θ of estimate bθ F (c) t statistic t = (bθ bθ)/s b θ centered at bθ. 3. Repeat steps 1-2 B independent times. F Gives B bootstrap replications of bθ 1,..., bθ B or t1,..., t B or... 4. Use these B bootstrap replications to obtain a bootstrapped version of the statistic (see below). c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 20 Colin / 28 C

5. Bootstrap mplementation mplementation Number of bootstraps: B high is best but increases computer time. CT use 400 for se s and 999 for tests and con dence intervals. Defaults are often too low. And set the seed! Various resampling methods 1. Paired (or nonparametric or empirical dist. func.) is most common F w 1,..., w N obtained by sampling with replacement from w 1,..., w N. 2. Parametric bootstrap for fully parametric models. F Suppose y jx F (x, θ 0 ) and generate yi by draws from F (x i, bθ) 3. Residual bootstrap for regression with additive errors F Resample tted residuals bu 1,..., bu N to get (bu 1,..., bu N ) and form new (y1, x 1),..., (yn, x N ). Need to resample over i.i.d. observations resample over clusters if data are clustered F But be careful if model includes cluster-speci c xed e ects. resample over moving blocks if data are serially correlated. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 21 Colin / 28 C

5. Bootstrap Asymptotic re nement Asymptotic re nement The simplest bootstraps are no better than usual asymptotic theory advantage is easy to implement, e.g. standard errors. More complicated bootstraps provide asymptotic re nement this may provide a better nite-sample approximation. Conventional asymptotic tests (such as Wald test). α = nominal size for a test, e.g. α = 0.05. Actual size= α + O(N 1/2 ). Tests with asymptotic re nement Actual size= α + O(N 1 ). asymptotic bias of size O(N 1 ) < O(N 1/2 ) is smaller asymptotically. But need simulation studies to con rm nite sample gains. F e.g. if N = 100 then 100/N = O(N 1 ) > 5/ p N = O(N 1/2 ). c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 22 Colin / 28 C

5. Bootstrap Asymptotically Pivotal Statistic Asymptotically pivotal statistic Asymptotic re nement bootstraps an asymptotically pivotal statistic this means limit distribution does not depend on unknown parameters. a An estimator bθ θ 0 N [0, σ 2 b ] is not asymptotically pivotal θ since σbθ 2 is an unknown parameter. But the studentized t statistic is asymptotically pivotal since t = (bθ θ 0 )/s b θ a N [0, 1] has no unknown parameters. So bootstrap Wald test statistic to get tests and con dence intervals with asymptotically re nement. For con dence intervals can also use BC (bias-corrected) and BCa methods. Econometricians rarely use asymptotic re nement. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 23 Colin / 28 C

5. Bootstrap Con dence intervals. * Bootstrap confidence intervals: normal-based, percentile, BC, and BCa. quietly poisson docvis chronic, vce(boot, reps(999) seed(10101) bca). estat bootstrap, all Poisson regression Number of obs = 50 Replications = 999 Observed Bootstrap docvis Coef. Bias Std. Err. [95% Conf. nterval] chronic.98330144 -.0244473.54040762 -.075878 2.042481 (N) -.1316499 2.076792 (P) -.0820317 2.100361 (BC) -.0215526 2.181476 (BCa) _cons 1.0316016 -.0503223.35257252.3405721 1.722631 (N).2177235 1.598568 (P).2578293 1.649789 (BC).3794897 1.781907 (BCa) (N) normal confidence interval (P) percentile confidence interval (BC) bias-corrected confidence interval (BCa) bias-corrected and accelerated confidence interval (N) is observed coe cient 1.96 bootstrap s.e. (P) is 2.5 to 97.5 percentile of the bootstrap estimates bβ 1,..., bβ B. (BC) and (BCa) have asymptotic re nement. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 24 Colin / 28 C

5. Bootstrap Bootstraps can fail Bootstrap failure The following are cases where standard bootstraps fail so need to adjust standard bootstraps. GMM (and empirical likelihood) in over-identi ed models For overidenti ed models need to recenter or use empirical likelihood. Nonparametric Regression: Nonparametric density and regression estimators converge at rate less than root-n and are asymptotically biased. This complicates inference such as con dence intervals. Non-Smooth Estimators: e.g. LAD. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 25 Colin / 28 C

6. Stata Commands 6. Stata commands Command kernel does kernel density estimate. Command lpoly does several nonparametric regressions kernel is default local linear is option degree(1) local polynomial of degree p is option degree(p) Command lowess does Lowess. Stata has no built-in commands for the semiparametric estimators These methods are not easy to automate as no easy way to automate bandwidth choice and treatment of outliers. For bootstrap use option,vce(boot) or command bootstrap: set the seed!! c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 26 Colin / 28 C

7. Appendix Histogram estimate 7. Appendix: Histogram estimate A histogram is a nonparametric estimate of the density of y break data into bins of width 2h form rectangles of area the relative frequency = freq/n the height is freq/2nh (then area = (freq/2nh) 2h = freq/n). Use freq = N i=1 1(x 0 h < x i < x 0 + h) where indicator function 1(A) equals 1 if event A happens and equals 0 otherwise The histogram estimate of f (x 0 ), the density of x evaluated at x 0, is bf HST (x 0 ) = 1 2Nh N i=1 1(x 0 h < x i < x 0 + h) = 1 Nh N 1 i=1 2 1 x i x 0 h < 1. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 27 Colin / 28 C

7. Appendix Kernel density estimate Appendix: Kernel density estimate Recall bf HST (x 0 ) = 1 Nh N i=1 1 2 1 x i x 0 h < 1 Replace 1 (A) by a kernel function Kernel density estimate of f (x 0 ), the density of x evaluated at x 0, is bf (x 0 ) = 1 Nh N i=1 K x i x 0 h K () is called a kernel function h is called the bandwidth or window width or smoothing parameter h Example is Epanechnikov kernel K (z) = 0.75(1 z 2 ) 1(jzj < 1) more weight on data at center. less weight at end More generally kernel function must satisfy conditions including Continuous, K (z) = K ( z), R K (z)dz = 1, R K (z)dz = 1, tails go to zero. c A. Colin Cameron Univ. of Calif.- Davis (Frontiers BGPE: in Econometrics Nonparametric Bavarian / Bootstrap Graduate Program inmarch Economics 21-25,. 2011 Based on A. 28 Colin / 28 C