Higher-Order von Mises Expansions, Bagging and Assumption-Lean Inference
Higher-Order von Mises Expansions, Bagging and Assumption-Lean Inference

Andreas Buja

joint with: Richard Berk, Lawrence Brown, Linda Zhao, Arun Kuchibhotla, Kai Zhang, Werner Stützle, Ed George, Mikhail Traskin, Emil Pitkin, Dan McCarthy

Mostly at the Department of Statistics, The Wharton School, University of Pennsylvania

WHOA-PSI, 2017/08/14
Outline

- Rehash 1: the big problem, irreproducible empirical research.
- Rehash 2: review of the PoSI version of post-selection inference.
- Assumption-lean inference, because degrees of misspecification are universal, especially after model selection.
- Regression functionals = statistical functionals of joint (x, y) distributions.
- Assumption-lean standard errors: sandwich estimators, x-y bootstrap.
- Which is better, sandwich or bootstrap?
Principle: Models, yes, but they are not to be believed

Contributing to irreproducibility:
- Formal selection based on algorithms: lasso, stepwise, AIC, ...
- Informal selection based on plots, diagnostics, ...
- Post hoc selection based on subject-matter preference, cost, ...
Every potential (!) decision is informed by the data, so the resulting models take advantage of the data in unaccountable ways.

Traditions of not believing in model correctness:
- G.E.P. Box: ...
- Cox (1995): "The very word 'model' implies simplification and idealization."
- Tukey: "The hallmark of good science is that it uses models and 'theory' but never believes them."

But: lip service to these maxims is not enough. There are consequences.
PoSI, just briefly

PoSI: reduction of post-selection inference to simultaneous inference.
- Computational cost: exponential in p, linear in N. PoSI covering all submodels is feasible for p ≤ 20; sparse PoSI (submodels of size |M| ≤ 5, for example) is feasible for p ≤ 60.
- Coverage guarantees are marginal, not conditional.
- Statistical practice is minimally changed: use a single σ̂ across all submodels M, and for PoSI-valid CIs enlarge the multiplier of SE[β̂_{j·M}] suitably.
- PoSI covers formal, informal, and post hoc selection.
- PoSI solves the circularity problem of selective inference: selecting regressors with PoSI tests does not invalidate the tests.
- PoSI is possible for response and transformation selection.
- Current PoSI allows first-order misspecification, but uses fixed-X theory and assumes normality, homoskedasticity, and a valid σ̂.
Under Misspecification, Conditioning on X is Wrong

What is the justification for conditioning on X, i.e., treating X as fixed?

Answer: Fisher's ancillarity argument for X:
$$\frac{p(y, \vec x; \beta^{(1)})}{p(y, \vec x; \beta^{(0)})}
= \frac{p(y \mid \vec x; \beta^{(1)})\, p(\vec x)}{p(y \mid \vec x; \beta^{(0)})\, p(\vec x)}
= \frac{p(y \mid \vec x; \beta^{(1)})}{p(y \mid \vec x; \beta^{(0)})}$$

Important: the ancillarity argument assumes correctness of the model.

Refutation of regressor ancillarity under misspecification:
- Assume x_i random and y_i = f(x_i) nonlinear deterministic, with no errors.
- Fit a linear function ŷ_i = β̂_0 + β̂_1 x_i: a situation with misspecification, but no errors.
- Conditional on the x_i, the estimates β̂_j have no sampling variability.
- However, watch "The Unconditional Movie":
  source(" buja/src-conspiracy-animation2.r")
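The refutation can be checked numerically. A minimal sketch (my own illustration, not the talk's animation): with a deterministic nonlinear response and no error term at all, the fitted slope never changes when the design is held fixed, but varies once the x_i are redrawn.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    """Slope of the least-squares line with intercept."""
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc * xc))

f = lambda x: x ** 2          # nonlinear, deterministic: y_i = f(x_i), no errors

# Conditional on a fixed design: the fit is identical in every "replication".
x_fixed = rng.normal(size=50)
slopes_fixed = [ols_slope(x_fixed, f(x_fixed)) for _ in range(200)]

# Unconditionally: redrawing the x_i creates sampling variability.
slopes_random = []
for _ in range(200):
    x = rng.normal(size=50)
    slopes_random.append(ols_slope(x, f(x)))

print(np.std(slopes_fixed))   # ~0: no variability given X
print(np.std(slopes_random))  # clearly positive despite zero noise
```

The second standard deviation is strictly positive even though there is no error term, which is exactly the conspiracy of nonlinearity and random regressors.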
Random X and Misspecification

Randomness of X and misspecification conspire to create sampling variability in the estimates. Consequence: under misspecification, conditioning on X is wrong.

Method 1 for model-robust and unconditional inference: asymptotic inference based on the Eicker–Huber–White sandwich estimator of standard error,
$$V[\hat\beta] \;\approx\; \frac{1}{N}\, E[\vec X \vec X^T]^{-1}\, E\big[(Y - \vec X^T \beta)^2\, \vec X \vec X^T\big]\, E[\vec X \vec X^T]^{-1}.$$

Method 2 for model-robust and unconditional inference: the pairs or x-y bootstrap (not the residual bootstrap).

Validity of inference is in an assumption-lean / model-robust framework ...
An Assumption-Lean Framework

Joint (x, y) distribution, i.i.d. sampling: $(y_i, \vec x_i) \sim P(dy, d\vec x) = P_{Y, \vec X} = P$. Assume properties sufficient to grant CLTs for the estimates of interest. No assumptions on $\mu(\vec x) = E[Y \mid \vec X = \vec x]$ or $\sigma^2(\vec x) = V[Y \mid \vec X = \vec x]$.

Define a population OLS parameter:
$$\beta := \operatorname*{argmin}_{\beta}\, E\big[(Y - \beta^T \vec X)^2\big] = E[\vec X \vec X^T]^{-1} E[\vec X Y]$$

Target of inference: β = β(P), the best L₂ approximation at P.
OLS estimator: β̂ = β(P̂_N), the plug-in estimator, where $\hat P_N = \frac{1}{N} \sum_i \delta_{(\vec x_i, y_i)}$.

β = statistical functional =: regression functional.
Implication 1 of Misspecification

[Figure: two panels, "misspecified" and "well-specified", each plotting Y against X together with the conditional mean Y = μ(x).]

Under misspecification, parameters depend on the X distribution:
misspecified: $\beta(P_{Y,\vec X}) = \beta(P_{Y \mid \vec X},\, P_{\vec X})$; well-specified: $\beta(P_{Y,\vec X}) = \beta(P_{Y \mid \vec X})$.

Upside down: ancillarity = the parameters do not affect the X distribution.
Upside up: "ancillarity" = the parameters are not affected by the X distribution.
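The dependence of β(P) on the X distribution can be made concrete. A hypothetical sketch of my own: hold the conditional mean μ(x) = x² fixed and change only the regressor distribution; the population OLS slope changes.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = lambda x: x ** 2        # conditional mean E[Y | X = x], held fixed

def pop_slope(x):
    """Monte Carlo approximation of the population OLS slope beta_1(P)."""
    y = mu(x)                # noiseless; additive noise would not change beta(P)
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc * xc))

n = 1_000_000
b_normal = pop_slope(rng.normal(size=n))       # X ~ N(0,1):  Cov(X, X^2) = 0
b_expon  = pop_slope(rng.exponential(size=n))  # X ~ Exp(1):  Cov(X, X^2)/Var(X) = 4

print(round(b_normal, 2), round(b_expon, 2))   # ≈ 0.0 and ≈ 4.0
```

Same P(Y | X), two different P(X), two different slopes: under a well-specified linear model the two slopes would agree.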
Implication 2 of Misspecification

[Figure: two scatterplots of y against x, "misspecified" and "well-specified".]

Misspecification and random X generate sampling variability:
misspecified: $V[\,E[\hat\beta \mid \vec X]\,] > 0$; well-specified: $V[\,E[\hat\beta \mid \vec X]\,] = 0$.

(Recall "The Unconditional Movie".)
Diagnostic for Dependence on the X-Distribution

[Figure: six panels, one per regressor (MedianInc, PercVacant, PercMinority, PercResidential, PercCommercial, PercIndustrial), each pairing the regressor with a plot of its coefficient.]
CLT under Misspecification; Sandwich Estimator

$$\sqrt{N}\,\big(\hat\beta - \beta(P)\big) \;\xrightarrow{\;D\;}\; \mathcal N\Big(0,\; \underbrace{E[\vec x \vec x^T]^{-1}\, E\big[(y - \vec x^T \beta(P))^2\, \vec x \vec x^T\big]\, E[\vec x \vec x^T]^{-1}}_{AV[P]}\Big)$$

AV[P] is another statistical functional (the dependence on β is not shown).

Sandwich estimator of asymptotic variance: AV[P̂_N] (plug-in).
Sandwich standard error estimate:
$$\widehat{SE}_{sand}[\hat\beta_j] = \sqrt{\tfrac{1}{N}\, AV[\hat P_N]_{jj}}$$
This is an assumption-lean, asymptotically valid standard error.

(Under the linear-model assumptions on $\epsilon = y - \vec x^T \beta(P)$, the formula reduces to $AV[P] = E[\vec x \vec x^T]^{-1} \sigma^2$.)
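As an illustrative plug-in sketch (my code, HC0-style with no finite-sample correction), the sandwich SE can be computed next to the classical linear-models SE on a misspecified fit:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = x ** 2 + rng.normal(size=n)       # misspecified: the true mean is nonlinear

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
r = y - X @ beta_hat                  # residuals

# Classical SE, trusting the linear model: sigma^2 * (X'X)^{-1}
sigma2 = r @ r / (n - X.shape[1])
se_lin = np.sqrt(np.diag(sigma2 * XtX_inv))

# Sandwich SE: (X'X)^{-1} [X' diag(r^2) X] (X'X)^{-1}
meat = (X * r[:, None] ** 2).T @ X
se_sand = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_sand[1] / se_lin[1])   # noticeably > 1 under this misspecification
```

Under this particular nonlinearity the sandwich slope SE is roughly twice the classical one, previewing the discrepancies on the homeless data below.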
The x-y Bootstrap for Regression

Assumption-lean framework: the nonparametric x-y bootstrap applies.

Estimate SE(β̂_j) by resampling $(\vec x_i, y_i)$ pairs: $(\vec x^*_i, y^*_i) \mapsto \hat\beta^*$, and
$$\widehat{SE}_{boot}(\hat\beta_j) = SD^*(\hat\beta^*_j).$$

Let $\widehat{SE}_{lin}(\hat\beta_j) = \hat\sigma \sqrt{(X^T X)^{-1}_{jj}}$ be the usual standard error estimate of linear-models theory.

Question: is the following always true?
$$\widehat{SE}_{boot}(\hat\beta_j) \;\overset{?}{\approx}\; \widehat{SE}_{lin}(\hat\beta_j)$$

Compare conventional and bootstrap standard errors empirically ...
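A sketch of the x-y (pairs) bootstrap in code (my illustration; the point is that whole rows are resampled jointly, never residuals):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 300
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = x ** 2 + rng.normal(size=n)      # the linear fit is misspecified

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Resample (x_i, y_i) pairs: each index picks a whole row of (X, y).
B = 1000
boot = np.empty((B, X.shape[1]))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = ols(X[idx], y[idx])

se_boot = boot.std(axis=0, ddof=1)   # SD over resampled coefficient vectors
print(se_boot)
```

A residual bootstrap would instead fix X and resample only residuals, which reimposes exactly the conditioning on X that misspecification invalidates.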
Conventional vs Bootstrap Std Errors

LA homeless data (Richard Berk, UPenn). Response: StreetTotal of homeless in a census tract; N = 505, residual df = 498.

[Table: for each regressor (MedianInc, PropVacant, PropMinority, PerResidential, PerCommercial, PerIndustrial), the columns β̂_j, SE_lin, SE_boot, SE_boot/SE_lin, and t_lin; the numeric entries and the R² value were lost in transcription.]

Reason for the discrepancy: misspecification.
Sandwich and x-y Bootstrap Estimators

The x-y bootstrap and the sandwich estimator are both asymptotically correct in the assumption-lean framework. Connection? Which should we use?

Consider the M-of-N bootstrap (e.g., Bickel, Götze, van Zwet, 1997): draw M i.i.d. resamples $(\vec x^*_i, y^*_i)$, i = 1, ..., M, from the samples $(\vec x_i, y_i)$, i = 1, ..., N:
$$P^*_M = \frac{1}{M} \sum_i \delta_{(\vec x^*_i,\, y^*_i)}, \qquad \hat P_N = \frac{1}{N} \sum_i \delta_{(\vec x_i,\, y_i)}.$$

M-of-N bootstrap estimator of AV[P]: $M\, V_{\hat P_N}[\,\beta(P^*_M)\,]$.

Apply the CLT to β(P*_M) as M → ∞ for fixed N, treating P̂_N as the population:
$$\sqrt{M}\,\big(\beta(P^*_M) - \beta(\hat P_N)\big) \;\xrightarrow{\;D\;}\; \mathcal N\big(0,\; AV(\hat P_N)\big) \quad (M \to \infty)$$

Consequence:
$$M\, V_{\hat P_N}[\,\beta(P^*_M)\,] \;\xrightarrow{\,M \to \infty\,}\; AV(\hat P_N)$$
Bootstrap estimator of AV → sandwich estimator of AV.
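The limit statement can be checked by simulation. A sketch (my code; the sizes B and M are arbitrary choices): the scaled M-of-N bootstrap variance approaches the plug-in sandwich value for large M.

```python
import numpy as np

rng = np.random.default_rng(4)

n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = x ** 2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def m_of_n_av(M, B=2000):
    """M-of-N bootstrap estimate of AV: M * Var_{P_hat}[beta(P*_M)]."""
    betas = np.empty((B, X.shape[1]))
    for b in range(B):
        idx = rng.integers(0, n, size=M)
        betas[b] = ols(X[idx], y[idx])
    return M * betas.var(axis=0, ddof=1)

# Sandwich estimate of AV, i.e., the M -> infinity limit:
beta_hat = ols(X, y)
r = y - X @ beta_hat
bread = np.linalg.inv(X.T @ X / n)
meat = (X * r[:, None] ** 2).T @ X / n
av_sand = np.diag(bread @ meat @ bread)

av_boot = m_of_n_av(M=5 * n)
print(av_boot[1] / av_sand[1])   # close to 1 for large M
```

Note that M exceeds N here only to exhibit the limit; for standard errors one divides by N, and practical proposals go the other way, with M/N small.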
Connection with M-Bagging

So we have a family of bootstrap estimators $M\, V_{\hat P_N}[\beta(P^*_M)]$ of asymptotic variance, parametrized by M.

Connection with bagging: bagging = averaging an estimate over bootstrap resamples. For example, the M-bagged β(P̂_N) is $E_{\hat P_N}[\,\beta(P^*_M)\,]$.

Observation about bootstrap variance (assume β one-dimensional):
$$V_{\hat P_N}[\,\beta(P^*_M)\,] = E_{\hat P_N}[\,\beta(P^*_M)^2\,] - E_{\hat P_N}[\,\beta(P^*_M)\,]^2,$$
a simple function of M-bagged functionals.
M-Bagged Functionals Rather than Statistics

Q: What is the target of estimation of an M-bagged statistic $E_{\hat P_N}[\,\beta(P^*_M)\,]$?

A: Lift the notion of M-bagging to functionals, with $(\vec x^*_i, y^*_i) \sim P$ rather than P̂_N:
$$\beta^B_M(P) = E_P[\,\beta(P^*_M)\,].$$
The functional β^B_M(P) is the M-bagged version of the functional β(P).

Plug-in produces the usual bagged statistic:
$$\beta^B_M(\hat P_N) = E_{\hat P_N}[\,\beta(P^*_M)\,].$$
Its target of estimation is β^B_M(P).

The notion of an M-bagged functional is made possible by distinguishing the resample size M from the sample size N.
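In code, the plug-in M-bagged functional is just a Monte Carlo average over resamples of size M drawn from P̂_N, and the bootstrap variance is the advertised simple function of the bagged first and second moments (a sketch with sizes of my choosing):

```python
import numpy as np

rng = np.random.default_rng(5)

n = 200
x = rng.normal(size=n)
y = x ** 2 + rng.normal(size=n)

def slope(x, y):
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc * xc))

def resample_values(M, B=3000):
    """Draws of beta(P*_M) for B resamples of size M from P_hat_N."""
    vals = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=M)
        vals[b] = slope(x[idx], y[idx])
    return vals

vals = resample_values(M=50)
bagged_beta  = vals.mean()           # plug-in M-bagged functional E[beta(P*_M)]
bagged_beta2 = (vals ** 2).mean()    # M-bagged version of beta^2
boot_var = bagged_beta2 - bagged_beta ** 2   # = V_{P_hat}[beta(P*_M)]

print(np.isclose(boot_var, vals.var()))  # True: variance from bagged moments
```

Varying M here traces out the whole family of estimators from the previous slide within one function.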
Von Mises Expansions of M-Bagged Functionals

Von Mises expansion of β(·) around P:
$$\beta(Q) = \beta(P) + E_Q[\psi_1(X)] + \frac{1}{2}\, E_Q[\psi_2(X_1, X_2)] + \ldots$$
where $\psi_k(X_1, \ldots, X_k)$ is the k-th order influence function at P. Used to prove asymptotic normality with Q = P̂_N.

Theorem (B-S): M-bagged functionals have von Mises expansions of length M + 1.

Detail: the influence functions are obtained from the Serfling–Efron–Stein ANOVA decomposition of $\beta(P^*_M) = \beta(X_1, \ldots, X_M)$.
M-Bootstrap or Sandwich?

A sense in which bagging achieves smoothing: the smaller M, the shorter the von Mises expansion, and the smoother the M-bagged functional.

Recall: M-of-N bootstrap estimates of standard error are simple functions of M-bagged functionals, and sandwich estimates of standard error are the limit M → ∞.

It seems that some form of bootstrapping should be beneficial by providing more stable standard error estimates. (This agrees with Bickel, Götze, van Zwet (1997), who let M/N → 0.)

Caveat: this is not a fully worked out argument, rather a heuristic. The OLS functional β(P) is already very smooth, so there may not be much of an effect. But the argument goes through for any type of regression, not just OLS.
Conclusions

- Misspecification makes fixed-X regression illegitimate.
- Misspecification and random X create sampling variability.
- Misspecification creates dependence of parameters on the X distribution; such dependence can be diagnosed.
- Misspecification may invalidate the usual standard errors. Alternatives: the M-of-N bootstrap, in its x-y version, not the residual version.
- Bootstrap standard errors may be more stable than sandwich standard errors.
- Bagging makes anything smooth.

THANKS!
The Meaning of Slopes under Misspecification

Allowing misspecification messes up our regression practice: what is the meaning of slopes under misspecification?

[Figure: two scatterplots of y against x illustrating linear fits to nonlinear data.]

Case-wise slopes:
$$\hat\beta = \sum_i w_i\, b_i, \qquad b_i := \frac{y_i}{x_i}, \qquad w_i := \frac{x_i^2}{\sum_{i'} x_{i'}^2}$$

Pairwise slopes:
$$\hat\beta = \sum_{ik} w_{ik}\, b_{ik}, \qquad b_{ik} := \frac{y_i - y_k}{x_i - x_k}, \qquad w_{ik} := \frac{(x_i - x_k)^2}{\sum_{i' \neq k'} (x_{i'} - x_{k'})^2}$$
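Both identities are exact algebraic facts about least squares and can be verified directly (a sketch of mine; the case-wise form is for regression through the origin, the pairwise form for the slope of the fit with intercept):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30
x = rng.normal(loc=2.0, size=n)      # kept away from 0 so y_i / x_i is well behaved
y = 1.5 * x + rng.normal(size=n)

# Case-wise: the no-intercept OLS slope is a weighted mean of the y_i / x_i.
beta_origin = np.sum(x * y) / np.sum(x * x)
w = x ** 2 / np.sum(x ** 2)
print(np.isclose(np.sum(w * (y / x)), beta_origin))     # True

# Pairwise: the with-intercept OLS slope is a weighted mean of pair slopes.
xc, yc = x - x.mean(), y - y.mean()
beta_slope = np.sum(xc * yc) / np.sum(xc * xc)
i, k = np.triu_indices(n, k=1)
d = x[i] - x[k]
b_pair = (y[i] - y[k]) / d
w_pair = d ** 2 / np.sum(d ** 2)
print(np.isclose(np.sum(w_pair * b_pair), beta_slope))  # True
```

Under misspecification these representations give the slope an interpretation that needs no model: a variance-weighted average of observable case-wise or pairwise slopes.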
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationThe bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap
Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X F, where F (x) = Pr(X x) is a distribution function, and we wish to estimate
More informationICES REPORT Model Misspecification and Plausibility
ICES REPORT 14-21 August 2014 Model Misspecification and Plausibility by Kathryn Farrell and J. Tinsley Odena The Institute for Computational Engineering and Sciences The University of Texas at Austin
More informationUniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)
Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: 1304.0282 Victor MIT, Economics + Center for Statistics Co-authors: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationHypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal
Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Bootstrap for Regression Week 9, Lecture 1
MA 575 Linear Models: Cedric E. Ginestet, Boston University Bootstrap for Regression Week 9, Lecture 1 1 The General Bootstrap This is a computer-intensive resampling algorithm for estimating the empirical
More informationA Simple, Graphical Procedure for Comparing Multiple Treatment Effects
A Simple, Graphical Procedure for Comparing Multiple Treatment Effects Brennan S. Thompson and Matthew D. Webb May 15, 2015 > Abstract In this paper, we utilize a new graphical
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationEconometrics - 30C00200
Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business
More informationQuantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be
Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F
More informationA Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008
A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated
More informationWISE International Masters
WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are
More informationIntroduction to Linear regression analysis. Part 2. Model comparisons
Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationAsymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands
Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Elizabeth C. Mannshardt-Shamseldin Advisor: Richard L. Smith Duke University Department
More informationG. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication
G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?
More informationReview of Econometrics
Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationCan we do statistical inference in a non-asymptotic way? 1
Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationRecall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as:
1 Joint hypotheses The null and alternative hypotheses can usually be interpreted as a restricted model ( ) and an model ( ). In our example: Note that if the model fits significantly better than the restricted
More informationBootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.
Bootstrap G. Jogesh Babu Penn State University http://www.stat.psu.edu/ babu Director of Center for Astrostatistics http://astrostatistics.psu.edu Outline 1 Motivation 2 Simple statistical problem 3 Resampling
More informationSanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore
AN EMPIRICAL LIKELIHOOD BASED ESTIMATOR FOR RESPONDENT DRIVEN SAMPLED DATA Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore Mark Handcock, Department
More informationThe Bootstrap 1. STA442/2101 Fall Sampling distributions Bootstrap Distribution-free regression example
The Bootstrap 1 STA442/2101 Fall 2013 1 See last slide for copyright information. 1 / 18 Background Reading Davison s Statistical models has almost nothing. The best we can do for now is the Wikipedia
More informationarxiv: v1 [stat.me] 23 Jun 2018
Assumption Lean Regression arxiv:1806.09014v1 [stat.me] 23 Jun 2018 Richard Berk, Andreas Buja, Lawrence Brown, Edward George Arun Kumar Kuchibhotla, Weijie Su, and Linda Zhao Department of Statistics
More informationPaper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001)
Paper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001) Presented by Yang Zhao March 5, 2010 1 / 36 Outlines 2 / 36 Motivation
More information5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationThe assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values
Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our
More informationECON 4160, Autumn term Lecture 1
ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least
More informationContents. Acknowledgments. xix
Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationSome Curiosities Arising in Objective Bayesian Analysis
. Some Curiosities Arising in Objective Bayesian Analysis Jim Berger Duke University Statistical and Applied Mathematical Institute Yale University May 15, 2009 1 Three vignettes related to John s work
More informationThe regression model with one stochastic regressor.
The regression model with one stochastic regressor. 3150/4150 Lecture 6 Ragnar Nymoen 30 January 2012 We are now on Lecture topic 4 The main goal in this lecture is to extend the results of the regression
More informationModule 17: Bayesian Statistics for Genetics Lecture 4: Linear regression
1/37 The linear regression model Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression Ken Rice Department of Biostatistics University of Washington 2/37 The linear regression model
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More information1. The OLS Estimator. 1.1 Population model and notation
1. The OLS Estimator OLS stands for Ordinary Least Squares. There are 6 assumptions ordinarily made, and the method of fitting a line through data is by least-squares. OLS is a common estimation methodology
More informationSimple Linear Regression
Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent
More informationMultiple Regression Analysis
Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,
More informationSTAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511
STAT 511 Lecture : Simple linear regression Devore: Section 12.1-12.4 Prof. Michael Levine December 3, 2018 A simple linear regression investigates the relationship between the two variables that is not
More informationNotes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed
18.466 Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 1. MLEs in exponential families Let f(x,θ) for x X and θ Θ be a likelihood function, that is, for present purposes,
More information