Some new ideas for post selection inference and model assessment
Robert Tibshirani, Stanford. WHOA!! 2018.
Thanks to Jon Taylor and Ryan Tibshirani for helpful feedback.
Two topics

1. How to improve post-selection inference for the lasso: Keli Liu, Jelena Markovic & RT (with further generalizations by Jon Taylor).
2. Maybe we're answering the wrong question in #1: post model-fitting exploration via Next-Door analysis: Leying Guan & RT.
[Slide: photos of collaborators Keli Liu, Jelena Markovic and Leying Guan.]
Post-selection inference for the lasso

Data $(x_i, y_i)$, $i = 1, 2, \ldots, N$; $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$. $X$ fixed.

Model: $y_i = \beta_0 + \sum_j x_{ij}\beta_j + \epsilon_i$, with $\epsilon_i \sim N(0, \sigma^2)$.

The lasso:
$$\operatorname*{arg\,min}_{\beta_0, \beta_1, \ldots, \beta_p} \; \sum_i \Big(y_i - \beta_0 - \sum_j x_{ij}\beta_j\Big)^2 + \lambda \sum_j |\beta_j|$$
for some $\lambda \ge 0$.
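For concreteness, here is a minimal sketch of this setup in Python, assuming scikit-learn is available (the slides themselves show no code). Note that sklearn's `Lasso` minimizes $(1/2N)\|y - X\beta\|^2 + \alpha\|\beta\|_1$, so its `alpha` plays the role of $\lambda/(2N)$ in the objective above.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
N, p = 100, 10
X = rng.standard_normal((N, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]           # three true signals (illustrative)
y = X @ beta + rng.standard_normal(N)

# sklearn's alpha corresponds to lambda/(2N) in the slide's parameterization
fit = Lasso(alpha=0.1).fit(X, y)
active = np.flatnonzero(fit.coef_)    # the selected (active) set
print("active set:", active)
```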
Review of the truncated Gaussian approach: polyhedral selection events

Response vector $y \sim N(\mu, \Sigma)$. Suppose we make a selection that can be written as $\{y : Ay \le b\}$ with $A$, $b$ not depending on $y$. This is true for forward stepwise regression, the lasso with fixed $\lambda$, least angle regression and other procedures.
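As an illustration of such an event, here is a sketch of my own construction (not from the slides), following the standard derivation for the first step of forward stepwise: selecting variable `jstar` with sign `s` means $s\, x_{j^*}^\top y \ge |x_j^\top y|$ for all $j$, assuming the columns of $X$ are standardized to equal norm. Each absolute value splits into two linear inequalities, giving $\{Ay \le b\}$ with $b = 0$.

```python
import numpy as np

def first_step_polyhedron(X, jstar, s):
    # Illustrative construction (an assumption, per the lead-in):
    # rows encode x_j'y - s*x_jstar'y <= 0 and -x_j'y - s*x_jstar'y <= 0.
    rows = []
    for j in range(X.shape[1]):
        if j == jstar:
            continue
        rows.append(X[:, j] - s * X[:, jstar])
        rows.append(-X[:, j] - s * X[:, jstar])
    A = np.vstack(rows)
    return A, np.zeros(A.shape[0])
```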
The polyhedral lemma [Lee et al.; Ryan Tibshirani et al.]

For any vector $\eta$,
$$F^{[V^-,\,V^+]}_{\eta^\top\mu,\;\sigma^2\eta^\top\eta}(\eta^\top y) \;\Big|\; \{Ay \le b\} \;\sim\; \mathrm{Unif}(0, 1)$$
(a truncated Gaussian distribution), where $V^-$, $V^+$ are (computable) values that are functions of $\eta$, $A$, $b$. Typically one chooses $\eta$ so that $\eta^\top y$ is the partial least squares estimate for a selected variable.
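A sketch of how the lemma is used in practice, assuming $\Sigma = \sigma^2 I$; the truncation limits follow the formulas in Lee et al., but this is a bare-bones illustration, not the authors' packaged code.

```python
import numpy as np
from scipy.stats import norm

def polyhedral_pivot(y, A, b, eta, mu0, sigma):
    """Pivot for H0: eta'mu = mu0, conditional on the event {Ay <= b}."""
    s2 = sigma**2 * (eta @ eta)      # Var(eta'y)
    c = eta / (eta @ eta)            # Sigma eta / (eta'Sigma eta) when Sigma = sigma^2 I
    t = eta @ y
    z = y - c * t                    # part of y independent of eta'y
    Ac, Az = A @ c, A @ z
    with np.errstate(divide="ignore", invalid="ignore"):
        rates = (b - Az) / Ac
    vminus = np.max(rates[Ac < 0], initial=-np.inf)   # lower limit V^-
    vplus = np.min(rates[Ac > 0], initial=np.inf)     # upper limit V^+
    sd = np.sqrt(s2)
    lo, hi = norm.cdf((vminus - mu0) / sd), norm.cdf((vplus - mu0) / sd)
    return (norm.cdf((t - mu0) / sd) - lo) / (hi - lo)  # Unif(0,1) under H0
```

Inverting this pivot over `mu0` yields the selection-adjusted confidence intervals shown in the next example.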
[Figure: geometry of the polyhedral lemma. Conditional on $\{Ay \le b\}$, moving $y$ along the direction $\eta$ with $P_\eta^\perp y$ held fixed confines $\eta^\top y$ to the interval $[V^-(y), V^+(y)]$.]
Example: lasso with fixed $\lambda$

HIV data: mutations that predict response to a drug. Selection intervals for the lasso with fixed tuning parameter $\lambda$.

[Figure: naive and selection-adjusted confidence intervals for each predictor's coefficient.]
A big shortcoming of this approach

Intervals are often very wide, and can even be infinite.

Why? We have conditioned on too much, leaving not enough variation for inference [Fithian & Taylor, "data carving"].

Jonathan Taylor & co-authors have worked to solve this problem by adding noise to the data before model fitting. This is clever and produces shorter intervals and more powerful tests.

Here we show how the problem can be largely solved without randomization, to provide shorter intervals.
Forming a data-driven query: two costs

1. Variable selection: the data is used to decide which variables are worthy of attention, e.g., running the lasso and focusing on the active set.

2. Target formation: having settled on a subset $M \subseteq \{1, \ldots, p\}$ of variables for careful study, what should be the target of our estimation? Two choices: the full target $\beta^F_j$, $j \in M$, where
$$\beta^F = (X^\top X)^{-1} X^\top \mu,$$
or the partial target
$$\beta^{(M)} = (X_M^\top X_M)^{-1} X_M^\top \mu.$$
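A small numpy illustration of the two targets; the function names are mine. In practice $\mu$ is unknown: substituting $y$ gives the OLS estimates whose conditional distribution the selective pivots adjust.

```python
import numpy as np

def full_target(X, mu):
    # beta^F = (X'X)^{-1} X' mu, computed via least squares for stability
    return np.linalg.lstsq(X, mu, rcond=None)[0]

def partial_target(X, mu, M):
    # beta^(M) = (X_M'X_M)^{-1} X_M' mu for the columns indexed by M
    return np.linalg.lstsq(X[:, M], mu, rcond=None)[0]
```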
Consequences

With the full target, our only cost is in #1. Our proposal: instead of conditioning on the entire active set and signs, we can condition just on the event that a given variable $X_j$ was chosen. [Minimal conditioning: it's the event that leads us to ask a question about $X_j$.] This leads to a truncated Gaussian distribution on the union of two disjoint intervals, with exact coverage under Gaussian errors.

With the partial target, we have to deal with both #1 and #2. Details in a few slides.
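The single-interval pivot above generalizes directly to this setting. Here is a sketch of the truncated-Gaussian CDF over a union of disjoint intervals; the interval endpoints are assumed to be computed elsewhere (the slides do not give that computation).

```python
from scipy.stats import norm

def union_truncated_cdf(x, intervals, mu, sd):
    """CDF of N(mu, sd^2) truncated to a union of disjoint intervals, at x."""
    num = den = 0.0
    for lo, hi in intervals:             # e.g. [(-inf, v1), (v2, inf)]
        mass = norm.cdf((hi - mu) / sd) - norm.cdf((lo - mu) / sd)
        den += mass
        if x >= hi:                      # interval lies entirely below x
            num += mass
        elif x > lo:                     # x falls inside this interval
            num += norm.cdf((x - mu) / sd) - norm.cdf((lo - mu) / sd)
    return num / den                     # Unif(0,1) at the observed statistic
```

Inverting this pivot in `mu` gives the exact-coverage intervals referred to above.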
Full model coefficients

[Figure: selection intervals for the full-model coefficients on the prostate cancer data (predictors lcavol, svi, lweight, age, lbph, pgg45, gleason). Legend: Naive (0.33), TZ V (0.29), TZ M (0.82), TZ Ms (1.19).]

Prostate cancer data. Naive = ignore selection; TZ V = condition just on the selected variable; TZ M = condition on the active set; TZ Ms = condition on the active set and signs (Lee et al.).
Partial targets

Idea: we choose a subset $\hat H \subseteq \hat M$ of high value targets (details below). How we choose to summarize the effect of a variable $j \in \hat M$ depends on whether $j$ is a high value target:

High value: we summarize the effect of $j$ using $\beta^{\hat H}_j$, where
$$\beta^{\hat H} = (X_{\hat H}^\top X_{\hat H})^{-1} X_{\hat H}^\top \mu.$$
So our choice of target is fully adaptive for high value targets.

Low value: if variable $j$ is selected by the lasso but is not deemed a high value target, we summarize its effect via $\beta^{\hat H \cup \{j\}}_j$, where
$$\beta^{\hat H \cup \{j\}} = (X_{\hat H \cup \{j\}}^\top X_{\hat H \cup \{j\}})^{-1} X_{\hat H \cup \{j\}}^\top \mu$$
and $X_{\hat H \cup \{j\}}$ is the matrix containing the high value targets as well as variable $j$. The coefficient $\beta^{\hat H \cup \{j\}}_j$ is the effect of variable $j$ after partialing out the effect of the high value targets; i.e., it lets us ask whether variable $j$ contributes any explanatory power beyond the variables in $\hat H$.
Defining high- and low-value targets

Stable-t: take $\hat H$ to be those variables in $\hat M$ with t-statistics surpassing a Bonferroni-corrected threshold. We first fit an OLS model using all the variables in $\hat M$, i.e.,
$$\hat\beta^{\hat M} = (X_{\hat M}^\top X_{\hat M})^{-1} X_{\hat M}^\top y,$$
and allow $j$ to be a high value target if the t-statistic for $\hat\beta^{\hat M}_j$ is large, i.e., if
$$\frac{|\hat\beta^{\hat M}_j|}{\sigma\sqrt{\big[(X_{\hat M}^\top X_{\hat M})^{-1}\big]_{jj}}} > c$$
for some cutoff $c$. If we choose $c$ by Bonferroni, it has the form $\Phi^{-1}\!\big(1 - \tfrac{\alpha}{2p}\big) \approx \sqrt{2\log p}$ for large $p$.

We again get a truncated Gaussian over a union of intervals, and exact coverage with finite samples.
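A sketch of the stable-t rule as stated, assuming $\sigma$ known and $|\hat M| < N$; `stable_t` is an illustrative name, not from the authors' software.

```python
import numpy as np
from scipy.stats import norm

def stable_t(X, y, Mhat, sigma, alpha=0.1):
    XM = X[:, Mhat]
    G = np.linalg.inv(XM.T @ XM)
    beta = G @ XM.T @ y                           # OLS fit on the active set
    tstat = np.abs(beta) / (sigma * np.sqrt(np.diag(G)))
    c = norm.ppf(1 - alpha / (2 * X.shape[1]))    # Bonferroni cutoff ~ sqrt(2 log p)
    return [j for j, t in zip(Mhat, tstat) if t > c]
```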
Partial model coefficients

[Figure: selection intervals for the prostate cancer data, with predictors lcavol, svi, lweight, age, lbph, pgg45, gleason split into high value and low value targets. Legend: Naive (0.30), TZ stab-t (0.40), TZ M (0.80), TZ Ms (1.12).]

Prostate cancer data. Naive = ignore selection; TZ V = condition just on the selected variable; TZ M = condition on the active set; TZ Ms = condition on the active set and signs (Lee et al.); TZ stab-t = stable-t for high value target selection.
n = 100, p = 250, pure noise

[Figure: boxplot of lengths of 90% confidence intervals for partial regression coefficients, comparing naive, bonf, TZ t, TZ l1, TZ M and TZ Ms. Length/coverage values as printed, in order: len 0.32, cov 0.00; len 0.51, cov 0.47; len 0.78, cov 0.92; len 0.74, cov 0.91; len 0.51, cov 0.92; len 7.62, cov (truncated in source). Proportion of infinite intervals: t 0.02; l1 0.00; M 0.09; Ms 0.47.]

Naive = ignore selection; bonf = Bonferroni; TZ t = stable-t for high value target selection; TZ l1 = stable-$\ell_1$ for high value target selection; TZ M = condition on the active set; TZ Ms = condition on the active set and signs (Lee et al.).
Wrapup

- All of this is for $N > p$.
- The ideas are extended to the high-dimensional full target case via ROSI: in preparation with Kevin Fry, Keli Liu, Jonathan Taylor and Rob Tibshirani. Gets good power as well! Application to large GWAS problems.
- Will be added to our selectiveinference R and Python packages.
Next-door analysis: motivation

Having fit a model by e.g. the lasso, post-selection inference (as above) focuses on significance and confidence intervals for each chosen feature. But scientists will often have different questions:

- Is the chosen model the uniquely best one?
- Are there other models with similar prediction performance?
- Is a given predictor indispensable, or can it be swapped out for one or more other predictors?

These are model-centric, as opposed to feature-centric, questions. Our proposed solution is an application of the LOCO (leave-one-covariate-out) method of Lei et al. (the CMU group) [no data splitting; focus on models, not variables].
[Figure: the chosen model {x1, x2, x3} (minimum error) and its leave-one-out neighbors (higher error): leaving out x1 gives {x2, x3, x4}; leaving out x2 gives {x1, x3}; leaving out x3 gives {x1, x5}.]
Algorithm: Next-Door analysis for the lasso

1. Fit the lasso with parameter $\hat\lambda$ chosen by cross-validation. Let the solution be $\hat\beta(\hat\lambda)$. Let $S$ be the active set, where the coefficients in $\hat\beta(\hat\lambda)$ are non-zero.

2. For each $j \in S$, solve the lasso problem with the coefficient for the $j$th predictor fixed at 0:
$$\{\hat\beta_0, \hat\beta;\, \hat\lambda, j\} = \operatorname*{argmin}_{\beta_j = 0} \; \frac{1}{2}\sum_i \Big(y_i - \beta_0 - \sum_{l \ne j} X_{il}\beta_l\Big)^2 + \hat\lambda \sum_l |\beta_l| \tag{1}$$
Let $\hat\beta(\hat\lambda; j)$ be the coefficients and $d_j$ be the increase in validation error for this model relative to the base model.

3. Form an approximately unbiased estimate of $d_j$ and test if predictor $j$ is indispensable: that is, test whether the increase in estimated prediction error $d_j$ is significantly larger than zero.
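Steps 1-2 can be sketched with scikit-learn, since fixing $\beta_j = 0$ is equivalent to deleting column $j$ and re-solving at the same $\hat\lambda$. This is a schematic illustration, not the authors' implementation, and the raw CV difference it returns is the biased estimate that step 3 goes on to debias and test.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV
from sklearn.model_selection import cross_val_score

def next_door_errors(X, y, cv=10):
    cv_fit = LassoCV(cv=cv).fit(X, y)
    lam = cv_fit.alpha_                        # lambda-hat, in sklearn's scaling
    active = np.flatnonzero(cv_fit.coef_)      # the active set S
    base = -cross_val_score(Lasso(alpha=lam), X, y, cv=cv,
                            scoring="neg_mean_squared_error").mean()
    deltas = {}
    for j in active:
        Xj = np.delete(X, j, axis=1)           # forcing beta_j = 0 = dropping column j
        err = -cross_val_score(Lasso(alpha=lam), Xj, y, cv=cv,
                               scoring="neg_mean_squared_error").mean()
        deltas[j] = err - base                 # naive (biased) estimate of d_j
    return active, deltas
```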
Details

We need to condition on the selection events: (1) the chosen model has minimum CV error, and (2) predictor $j$ is in the chosen model. We use the tricks of Markovic and Taylor (adding noise in CV) and Xiaoying Tian (adding $\pm$ noise for $C_p$) to obtain approximately debiased prediction error estimates, and the bootstrap to get approximate type I error control.
Table: Prostate cancer results. The leftmost column shows the fitted model from the lasso, and the remaining columns show the nearby models corresponding to the removal of each predictor (lcavol, lwt, svi, lcp, lbph, pgg45, age). For each model the table reports which predictors are included, the CV error, the debiased error, the test error, the selection frequency, the NextDoor p-value, and the feature (post-selection) p-value. [Numerical entries were not recovered in this transcription.] The post-selection p-value and the frequency of selection speak to feature indispensability!!
Final comments

- Paper on arXiv by Guan & Tibshirani.
- The NextDoor R package will soon be on CRAN.
- Idea: run glmnet to fit the model, then run NextDoor on the output to get a post-fitting summary report.