Higher-Order von Mises Expansions, Bagging and Assumption-Lean Inference


Higher-Order von Mises Expansions, Bagging and Assumption-Lean Inference

Andreas Buja

joint with: Richard Berk, Lawrence Brown, Linda Zhao, Arun Kuchibhotla, Kai Zhang, Werner Stützle, Ed George, Mikhail Traskin, Emil Pitkin, Dan McCarthy

Mostly at the Department of Statistics, The Wharton School, University of Pennsylvania

WHOA-PSI, 2017/08/14

Outline

- Rehash 1: The big problem, irreproducible empirical research.
- Rehash 2: Review of the PoSI version of post-selection inference.
- Assumption-lean inference, because degrees of misspecification are universal, especially after model selection.
- Regression functionals = statistical functionals of joint (x, y) distributions.
- Assumption-lean standard errors: sandwich estimators, x-y bootstrap.
- Which is better, sandwich or bootstrap?

Principle: Models, yes, but they are not to be believed

Contributing to irreproducibility:
- Formal selection based on algorithms: lasso, stepwise, AIC, ...
- Informal selection based on plots, diagnostics, ...
- Post hoc selection based on subject-matter preference, cost, ...

Every potential (!) decision is informed by the data. The resulting models take advantage of the data in unaccountable ways.

Traditions of not believing in model correctness:
- G.E.P. Box: ...
- Cox (1995): "The very word model implies simplification and idealization."
- Tukey: "The hallmark of good science is that it uses models and theory but never believes them."

But: lip service to these maxims is not enough. There are consequences.

PoSI, just briefly

PoSI: reduction of post-selection inference to simultaneous inference.

- Computational cost: exponential in p, linear in N. PoSI covering all submodels is feasible for p ≲ 20; sparse PoSI (submodel sizes |M| ≤ 5, say) is feasible for p ≲ 60.
- Coverage guarantees are marginal, not conditional.
- Statistical practice is minimally changed: use a single $\hat\sigma$ across all submodels M; for PoSI-valid CIs, enlarge the multiplier of $SE[\hat\beta_{j \cdot M}]$ suitably.
- PoSI covers formal, informal, and post hoc selection.
- PoSI solves the circularity problem of selective inference: selecting regressors with PoSI tests does not invalidate the tests.
- PoSI is possible for response and transformation selection.
- Current PoSI allows first-order misspecification, but uses fixed-X theory and assumes normality, homoskedasticity, and a valid $\hat\sigma$.

Under Misspecification, Conditioning on X is Wrong

What is the justification for conditioning on X, i.e., treating X as fixed?

Answer: Fisher's ancillarity argument for X:

$$ \frac{p(y, \vec x; \beta^{(1)})}{p(y, \vec x; \beta^{(0)})} \;=\; \frac{p(y \mid \vec x; \beta^{(1)})\, p(\vec x)}{p(y \mid \vec x; \beta^{(0)})\, p(\vec x)} \;=\; \frac{p(y \mid \vec x; \beta^{(1)})}{p(y \mid \vec x; \beta^{(0)})} $$

Important! The ancillarity argument assumes correctness of the model.

Refutation of regressor ancillarity under misspecification:
- Assume $x_i$ random and $y_i = f(x_i)$ nonlinear deterministic, with no errors.
- Fit a linear function $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$: a situation with misspecification but no errors.
- Conditional on the $x_i$, the estimates $\hat\beta_j$ have no sampling variability.

However, watch "The Unconditional Movie"...

source("http://stat.wharton.upenn.edu/~buja/src-conspiracy-animation2.r")
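This is not Buja's animation script, but a minimal R sketch of the same phenomenon; the quadratic f, the sample size, and all names are illustrative choices:

    set.seed(1)
    f <- function(x) x^2                    # nonlinear truth, no error term
    slope <- function(x) unname(coef(lm(f(x) ~ x))[2])

    x.fixed <- runif(50)
    fixed.x.slopes  <- replicate(1000, slope(x.fixed))    # regressors held fixed
    random.x.slopes <- replicate(1000, slope(runif(50)))  # regressors redrawn each time

    sd(fixed.x.slopes)    # exactly 0: no variability conditional on the x's
    sd(random.x.slopes)   # > 0: variability from random x plus misspecification

With the x's fixed the fitted slope is a deterministic function of them, so it never varies; redrawing the x's makes the slope of the best linear fit move, which is exactly the unconditional sampling variability the movie shows.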

Random X and Misspecification

Randomness of X and misspecification conspire to create sampling variability in the estimates.

Consequence: Under misspecification, conditioning on X is wrong.

Method 1 for model-robust and unconditional inference: asymptotic inference based on the Eicker/Huber/White sandwich estimator of standard error,

$$ V[\hat\beta] \;\approx\; \frac{1}{N}\; E[\vec X \vec X^T]^{-1}\; E\big[(Y - \vec X^T\beta)^2\, \vec X \vec X^T\big]\; E[\vec X \vec X^T]^{-1}. $$

Method 2 for model-robust and unconditional inference: the pairs or x-y bootstrap (not the residual bootstrap).

Validity of inference is in an assumption-lean / model-robust framework...

An Assumption-Lean Framework

Joint $(\vec x, y)$ distribution, i.i.d. sampling: $(y_i, \vec x_i) \sim P(dy, d\vec x) = P_{Y, \vec X} = P$.

Assume properties sufficient to grant CLTs for the estimates of interest. No assumptions on $\mu(\vec x) = E[Y \mid \vec X = \vec x]$ or $\sigma^2(\vec x) = V[Y \mid \vec X = \vec x]$.

Define a population OLS parameter:

$$ \beta \;:=\; \operatorname*{argmin}_{\beta}\, E\big[(Y - \beta^T \vec X)^2\big] \;=\; E[\vec X \vec X^T]^{-1}\, E[\vec X Y] $$

Target of inference: $\beta = \beta(P)$, the best $L_2$ linear approximation at $P$.

OLS estimator: $\hat\beta = \beta(\hat P_N)$, the plug-in estimator, where $\hat P_N = \frac{1}{N} \sum_i \delta_{(\vec x_i, y_i)}$.

$\beta$ is a statistical functional, here called a regression functional.
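As a worked example of the plug-in view (the simulated population and all names below are illustrative): $\beta(\hat P_N)$ is just OLS computed from the empirical moments, even when the linear model is wrong for the data-generating $P$:

    set.seed(2)
    N <- 200
    X <- cbind(1, runif(N))                           # design with intercept column
    y <- sin(2 * pi * X[, 2]) + rnorm(N)              # nonlinear truth: linear fit is misspecified
    beta.hat <- solve(crossprod(X), crossprod(X, y))  # E.hat[XX']^{-1} E.hat[XY]
    all.equal(drop(beta.hat), unname(coef(lm(y ~ X[, 2]))))  # TRUE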

Implication 1 of Misspecification

[Figure: two panels, "misspecified" and "well-specified", each showing the response surface $Y = \mu(X)$ together with a linear fit.]

Under misspecification, parameters depend on the X distribution:

misspecified: $\beta(P_{Y, \vec X}) = \beta(P_{Y \mid \vec X},\, P_{\vec X})$;  well-specified: $\beta(P_{Y, \vec X}) = \beta(P_{Y \mid \vec X})$.

Upside down: ancillarity = the parameters do not affect the X distribution.
Upside up: "ancillarity" = the parameters are not affected by the X distribution.

Implication 2 of Misspecification

[Figure: two scatterplots with fitted lines, "misspecified" and "well-specified".]

Misspecification and random X generate sampling variability:

misspecified: $V\big[\,E[\hat\beta \mid \vec X]\,\big] > 0$;  well-specified: $V\big[\,E[\hat\beta \mid \vec X]\,\big] = 0$.

(Recall "The Unconditional Movie".)

Diagnostic for Dependence on the X-Distribution

[Figure: six diagnostic panels, one per regressor (MedianInc, PercVacant, PercMinority, PercResidential, PercCommercial, PercIndustrial), each plotting the coefficient ("Coeff: ...") against that regressor's values.]

CLT under Misspecification; Sandwich Estimator

$$ \sqrt{N}\,\big(\hat\beta - \beta(P)\big) \;\xrightarrow{D}\; N\Big(0,\; \underbrace{E[\vec x \vec x^T]^{-1}\; E\big[(y - \vec x^T\beta(P))^2\, \vec x \vec x^T\big]\; E[\vec x \vec x^T]^{-1}}_{AV[P]}\Big) $$

$AV[P]$ is another statistical functional (dependence on $\beta$ not shown).

Sandwich estimator of asymptotic variance: $AV[\hat P_N]$ (plug-in).

Sandwich standard error estimate:

$$ \widehat{SE}_{sand}[\hat\beta_j] = \sqrt{\tfrac{1}{N}\, AV[\hat P_N]_{jj}} $$

an assumption-lean, asymptotically valid standard error.

(Under the linear-model assumptions, with $\epsilon = y - \vec x^T\beta(P)$: $AV[P] = E[\vec x \vec x^T]^{-1} \sigma^2$.)
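A minimal R sketch of the plug-in sandwich standard errors, written directly from the formula above (the function name is illustrative; no finite-sample corrections, i.e., the HC0 flavor):

    sandwich.se <- function(X, y) {
      N <- nrow(X)
      beta.hat <- solve(crossprod(X), crossprod(X, y))
      r <- drop(y - X %*% beta.hat)       # residuals y_i - x_i' beta.hat
      bread <- solve(crossprod(X) / N)    # E.hat[x x']^{-1}
      meat  <- crossprod(X * r) / N       # E.hat[(y - x'beta)^2 x x']
      AV <- bread %*% meat %*% bread      # plug-in AV[P.hat_N]
      sqrt(diag(AV) / N)                  # SE.sand[beta.hat_j]
    }

Up to finite-sample corrections, this should agree with what sandwich::vcovHC(fit, type = "HC0") computes for the corresponding lm fit.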

The x-y Bootstrap for Regression

Assumption-lean framework: the nonparametric x-y bootstrap applies.

Estimate $SE(\hat\beta_j)$ by resampling $(\vec x_i, y_i)$ pairs, $(\vec x_i^*, y_i^*) \mapsto \hat\beta^*$, and setting

$$ \widehat{SE}_{boot}(\hat\beta_j) = SD^*(\hat\beta_j^*). $$

Let $\widehat{SE}_{lin}(\hat\beta_j) = \hat\sigma \sqrt{(X^TX)^{-1}_{jj}}$ be the usual standard error estimate of linear-models theory.

Question: Is the following always true?

$$ \widehat{SE}_{boot}(\hat\beta_j) \;\overset{?}{\approx}\; \widehat{SE}_{lin}(\hat\beta_j) $$

Compare conventional and bootstrap standard errors empirically...
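A minimal R sketch of the pairs (x-y) bootstrap; the function name and the default B are illustrative choices:

    pairs.boot.se <- function(X, y, B = 2000) {
      N <- nrow(X)
      boot.betas <- replicate(B, {
        i <- sample(N, replace = TRUE)    # resample whole (x_i, y_i) cases, never residuals
        Xs <- X[i, , drop = FALSE]
        drop(solve(crossprod(Xs), crossprod(Xs, y[i])))
      })
      apply(boot.betas, 1, sd)            # SD* of beta_j* across resamples
    }

The crucial design choice is resampling rows of (X, y) jointly, so the bootstrap distribution reflects the randomness of X; the residual bootstrap conditions on X and loses exactly the variability at issue here.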

Conventional vs Bootstrap Std Errors

LA Homeless Data (Richard Berk, UPenn). Response: StreetTotal of homeless in a census tract; N = 505; R² ≈ 0.13; residual df = 498.

                      β̂_j     SE_lin   SE_boot   SE_boot/SE_lin    t_lin
    MedianInc        -4.241    4.342     2.651        0.611        -0.977
    PropVacant       18.476    3.595     5.553        1.545         5.140
    PropMinority      2.759    3.935     3.750        0.953         0.701
    PerResidential   -1.249    4.275     2.776        0.649        -0.292
    PerCommercial    10.603    3.927     5.702        1.452         2.700
    PerIndustrial    11.663    4.139     7.550        1.824         2.818

Reason for the discrepancy: misspecification.

Sandwich and x-y Bootstrap Estimators

The x-y bootstrap and the sandwich are both asymptotically correct in the assumption-lean framework. Connection? Which should we use?

Consider the M-of-N bootstrap (e.g., Bickel, Götze, van Zwet, 1997): draw M i.i.d. resamples $(\vec x_i^*, y_i^*)$, $i = 1, \dots, M$, from the sample $(\vec x_i, y_i)$, $i = 1, \dots, N$:

$$ P_M^* = \frac{1}{M} \sum \delta_{(\vec x_i^*, y_i^*)}, \qquad \hat P_N = \frac{1}{N} \sum \delta_{(\vec x_i, y_i)}. $$

M-of-N bootstrap estimator of $AV[P]$: $\;M\, V_{\hat P_N}[\,\beta(P_M^*)\,]$.

Apply the CLT to $\beta(P_M^*)$ as $M \to \infty$ for fixed N, treating $\hat P_N$ as the population:

$$ \sqrt{M}\,\big(\beta(P_M^*) - \beta(\hat P_N)\big) \;\xrightarrow{D}\; N\big(0,\, AV(\hat P_N)\big) \qquad (M \to \infty) $$

Consequence:

$$ M\, V_{\hat P_N}[\,\beta(P_M^*)\,] \;\xrightarrow{M \to \infty}\; AV(\hat P_N) $$

Bootstrap estimator of AV $\to$ sandwich estimator of AV.
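The M-of-N estimator of $AV[P]$ translates directly into code; a sketch in R (names and the default B are illustrative), reusing the notation above:

    m.of.n.av <- function(X, y, M, B = 2000) {
      N <- nrow(X)
      boot.betas <- replicate(B, {
        i <- sample(N, M, replace = TRUE)   # M iid draws from P.hat_N
        Xs <- X[i, , drop = FALSE]
        drop(solve(crossprod(Xs), crossprod(Xs, y[i])))
      })
      M * apply(boot.betas, 1, var)         # M * V.hat_{P.hat_N}[beta(P*_M)]
    }

For M = N this is N times the usual x-y bootstrap variance; as M grows with N fixed, it should approach the diagonal of the plug-in sandwich $AV[\hat P_N]$, as the CLT argument above predicts.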

Connection with M-Bagging

So we have a family of bootstrap estimators $M\, V_{\hat P_N}[\beta(P_M^*)]$ of asymptotic variance, parametrized by M.

Connection with bagging: bagging = averaging an estimate over bootstrap resamples. For example, the M-bagged $\beta(\hat P_N)$ is $E_{\hat P_N}[\,\beta(P_M^*)\,]$.

Observation about the bootstrap variance (assume $\beta$ one-dimensional):

$$ V_{\hat P_N}[\beta(P_M^*)] = E_{\hat P_N}[\beta(P_M^*)^2] - E_{\hat P_N}[\beta(P_M^*)]^2 $$

a simple function of M-bagged functionals.

M-Bagged Functionals Rather than Statistics

Q: What is the target of estimation of an M-bagged statistic $E_{\hat P_N}[\beta(P_M^*)]$?

A: Lift the notion of M-bagging to functionals: draw $(\vec x_i^*, y_i^*) \sim P$, rather than $\hat P_N$, and define

$$ \beta^B_M(P) = E_P[\,\beta(P_M^*)\,]. $$

The functional $\beta^B_M(P)$ is the M-bagged version of the functional $\beta(P)$.

Plug-in produces the usual bagged statistic, $\beta^B_M(\hat P_N) = E_{\hat P_N}[\beta(P_M^*)]$. Its target of estimation is $\beta^B_M(P)$.

The notion of an M-bagged functional is made possible by distinguishing the resample size M from the sample size N.
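A Monte Carlo sketch of the plug-in M-bagged functional $\beta^B_M(\hat P_N)$ in R (names and the default B are illustrative); only the final line differs from the variance estimator above:

    m.bagged.beta <- function(X, y, M, B = 2000) {
      N <- nrow(X)
      boot.betas <- replicate(B, {
        i <- sample(N, M, replace = TRUE)   # resamples of size M from P.hat_N
        Xs <- X[i, , drop = FALSE]
        drop(solve(crossprod(Xs), crossprod(Xs, y[i])))
      })
      rowMeans(boot.betas)                  # E.hat_{P.hat_N}[beta(P*_M)]
    }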

Von Mises Expansions of M-Bagged Functionals

Von Mises expansion of $\beta(\cdot)$ around P:

$$ \beta(Q) = \beta(P) + E_Q[\psi_1(X)] + \tfrac{1}{2}\, E_Q[\psi_2(X_1, X_2)] + \dots $$

where $\psi_k(X_1, \dots, X_k)$ is the k-th order influence function at P. Used to prove asymptotic normality with $Q = \hat P_N$.

Theorem (B-S): M-bagged functionals have von Mises expansions of length M+1.

Detail: the influence functions are obtained from the Serfling-Efron-Stein ANOVA decomposition of $\beta(P_M^*) = \beta(X_1, \dots, X_M)$.

M-Bootstrap or Sandwich?

A sense in which bagging achieves smoothing: the smaller M, the shorter the von Mises expansion, and the smoother the M-bagged functional.

Recall: M-of-N bootstrap estimates of standard error are simple functions of M-bagged functionals, and sandwich estimates of standard error are the $M \to \infty$ limit.

It seems that some form of bootstrapping should be beneficial by providing more stable standard error estimates. (This agrees with Bickel, Götze, van Zwet (1997), who let $M/N \to 0$.)

Caveat: this is not a fully worked-out argument, rather a heuristic. The OLS functional $\beta(P)$ is already very smooth, so there may not be much of an effect.

But: the argument goes through for any type of regression, not just OLS.

Conclusions

- Misspecification makes fixed-x regression illegitimate.
- Misspecification and random X create sampling variability.
- Misspecification creates dependence of parameters on the X distribution. Such dependence can be diagnosed.
- Misspecification may invalidate the usual standard errors. Alternatives: the M-of-N bootstrap, in its x-y version, not the residual version.
- Bootstrap standard errors may be more stable than sandwich.
- Bagging makes anything smooth.

THANKS!

The Meaning of Slopes under Misspecification

Allowing misspecification messes up our regression practice: what is the meaning of slopes under misspecification?

[Figure: two scatterplots with fitted lines.]

Case-wise slopes:

$$ \hat\beta = \sum_i w_i\, b_i, \qquad b_i := \frac{y_i}{x_i}, \qquad w_i := \frac{x_i^2}{\sum_{i'} x_{i'}^2} $$

Pairwise slopes:

$$ \hat\beta = \sum_{i,k} w_{ik}\, b_{ik}, \qquad b_{ik} := \frac{y_i - y_k}{x_i - x_k}, \qquad w_{ik} := \frac{(x_i - x_k)^2}{\sum_{i' \ne k'} (x_{i'} - x_{k'})^2} $$
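A quick numerical check of both identities in R. The toy data are illustrative, and I center x and y first, which is my assumption about the intended setting: on centered data the through-the-origin slope and the with-intercept slope coincide, so both representations target the same $\hat\beta$:

    set.seed(3)
    x <- runif(20); y <- x^2 + rnorm(20, sd = 0.1)
    x <- x - mean(x); y <- y - mean(y)   # center so both identities give the same slope
    b.ols <- sum(x * y) / sum(x^2)       # OLS slope on centered data

    b.case <- sum((x^2 / sum(x^2)) * (y / x))               # case-wise slopes y_i / x_i

    dx <- outer(x, x, "-"); dy <- outer(y, y, "-")
    off <- dx != 0                                          # drop i = k terms (0/0)
    b.pair <- sum((dx^2 / sum(dx^2))[off] * (dy / dx)[off]) # pairwise slopes

    c(b.ols, b.case, b.pair)             # all three agree

The point of the identities is interpretive: even under misspecification, the OLS slope is a weighted average of elementary slopes, with weights concentrating on high-leverage cases or pairs.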