Statistical Inference


1 Statistical Inference
Liu Yang
Florida State University
October 27, 2016

2 Outline
The Bayesian Lasso. Trevor Park and George Casella, "The Bayesian Lasso," Journal of the American Statistical Association, June 2008, Vol. 103, No. 482, Theory and Methods.
Bootstrap Lasso. Trevor Hastie, Robert Tibshirani and Martin Wainwright (2015), Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman and Hall/CRC.
Post-Selection Inference for the Lasso: The Covariance Test. Trevor Hastie, Robert Tibshirani and Martin Wainwright (2015), Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman and Hall/CRC.

3 Why Statistical Inference
Statistical inference provides confidence intervals and measures of the statistical strength of variables, such as p-values, in models. Statistical inference is well developed in low-dimensional settings, but for high-dimensional problems traditional methods may not be applicable. We need statistical inference methods suited to high-dimensional studies.

4 The Bayesian Lasso
We adopt the approach of Park and Casella (2008), involving a hierarchical model of the form

y | µ, X, β, σ² ~ N_n(µ1_n + Xβ, σ²I_n)   (1)

β | λ, σ ~ ∏_{j=1}^p (λ/(2σ)) e^{−λ|β_j|/σ}   (2)

For a complete Bayesian model, we use the improper prior density π(σ²) = 1/σ² and a hyperprior for λ² from the class of gamma priors:

π(λ²) = (δ^r/Γ(r)) (λ²)^{r−1} exp(−δλ²)   (3)
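To make the hierarchy concrete, here is a minimal Gibbs sampler sketch for this model with λ held fixed, using the scale-mixture-of-normals representation of the Laplace prior from Park and Casella (2008). The function name, defaults, and the fixed-λ simplification (rather than the gamma hyperprior in (3)) are illustrative assumptions, not code from the paper; it assumes y has been centered and X standardized.

```python
import numpy as np

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, seed=0):
    """Sketch of a Park-Casella-style Gibbs sampler with fixed lambda."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta, sigma2 = np.zeros(p), 1.0
    inv_tau2 = np.ones(p)                  # 1/tau_j^2, latent Laplace scales
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # beta | rest ~ N_p(A^{-1} X'y, sigma2 * A^{-1}),  A = X'X + D_tau^{-1}
        A_inv = np.linalg.inv(XtX + np.diag(inv_tau2))
        beta = rng.multivariate_normal(A_inv @ Xty, sigma2 * A_inv)
        # 1/tau_j^2 | rest ~ Inverse-Gaussian(sqrt(lam^2 sigma2 / beta_j^2), lam^2)
        mu_ig = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        inv_tau2 = rng.wald(mu_ig, lam**2)
        # sigma2 | rest ~ Inverse-Gamma under the improper prior 1/sigma2
        resid = y - X @ beta
        shape = (n - 1 + p) / 2.0
        rate = 0.5 * (resid @ resid + beta @ (inv_tau2 * beta))
        sigma2 = rate / rng.gamma(shape)
        draws[t] = beta
    return draws                           # posterior draws of beta
```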

5 Bayesian Lasso
Prior and posterior distribution for the seventh variable in the diabetes example. (figure)

6 Bayesian Lasso (figure)

7 Bayesian Lasso (figure)

8 Bootstrap Lasso
The bootstrap is popular for assessing the statistical properties of complex estimators. How do we obtain the sampling distribution of β̂ through the bootstrap? The nonparametric bootstrap is one method for approximating this sampling distribution; another is the parametric bootstrap.

9 Bootstrap Lasso
Suppose that we have obtained an estimate β̂(λ̂_CV) for a lasso problem via the following cross-validation procedure:
Fit a lasso path to (X, y) over a dense grid of values Λ = {λ_ι}_{ι=1}^L.
Divide the training samples into 10 groups at random.
With the kth group left out, fit a lasso path to the remaining 9/10ths, using the same grid Λ.

10 Bootstrap Lasso
For each λ ∈ Λ, compute the mean-squared prediction error for the left-out group.
Average these errors over the folds to obtain a prediction error curve over the grid Λ.
Find the value λ̂_CV that minimizes this curve, and then return the coefficient vector from the original full-data fit at that value of λ.
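A compact version of this procedure using scikit-learn's LassoCV is sketched below; note that sklearn's alpha corresponds to λ scaled by 1/n relative to the usual lasso objective, and the data here are simulated purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Simulated data for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=100)

# 10-fold CV over a dense grid of 100 alphas (the grid Lambda above).
fit = LassoCV(cv=10, n_alphas=100).fit(X, y)
print("lambda_CV (sklearn alpha):", fit.alpha_)   # minimizer of the CV curve
print("beta(lambda_CV):", fit.coef_)              # coefficients at that value
```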

11 Nonparametric Bootstrap
It approximates the cumulative distribution function F of the random pair (X, Y) by the empirical CDF F̂_N defined by the N samples: draw N samples with replacement from the given dataset.
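A minimal sketch of this pairs (nonparametric) bootstrap for the lasso, assuming a fixed penalty lam in sklearn's scaling; the function name is ours, not from the book.

```python
import numpy as np
from sklearn.linear_model import Lasso

def bootstrap_lasso(X, y, lam, B=500, seed=0):
    """Pairs bootstrap: resample rows of (X, y) from the empirical CDF F_hat_N."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    betas = np.empty((B, p))
    for b in range(B):
        idx = rng.integers(0, n, size=n)        # N draws with replacement
        betas[b] = Lasso(alpha=lam).fit(X[idx], y[idx]).coef_
    return betas   # rows approximate the sampling distribution of beta_hat
```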

12 Bootstrap Lasso (figure)

13 Bootstrap Lasso (figure)

14 Bootstrap Lasso (figure)

15 Parametric Bootstrap
Here we have a parametric estimate of F, or of its corresponding density function f. For example, we can sample from the residuals of the fitted regression, or from a Gaussian model for the regression errors.
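One common variant is the residual bootstrap sketched below: keep X fixed, resample centered residuals from the original fit (or draw Gaussian noise with the estimated variance), and refit the lasso. The function name and defaults are illustrative, not from the book.

```python
import numpy as np
from sklearn.linear_model import Lasso

def residual_bootstrap_lasso(X, y, lam, B=500, seed=0):
    """Residual bootstrap for the lasso at a fixed penalty lam."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    base = Lasso(alpha=lam).fit(X, y)
    fitted = base.predict(X)
    resid = y - fitted
    resid -= resid.mean()                       # center the residuals
    betas = np.empty((B, p))
    for b in range(B):
        y_star = fitted + rng.choice(resid, size=n, replace=True)
        betas[b] = Lasso(alpha=lam).fit(X, y_star).coef_
    return betas
```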

16 Bootstrap Lasso (figure)

17 Bayesian Lasso vs Bootstrap Lasso
In a general sense, results for the Bayesian lasso and the lasso/bootstrap are similar. The nonparametric bootstrap can be treated as a kind of posterior Bayes estimate under a noninformative prior in the multinomial model (Rubin 1981; Efron 1982). The Bayesian lasso leans more on parametric assumptions; the bootstrap scales better.

18 Bayesian Lasso vs Bootstrap Lasso

p    Bayesian Lasso    Lasso/Bootstrap
…    … secs            … secs
…    … secs            … secs
…    … mins            14.7 mins
…    … hours           18.1 mins

Table: Timing for the Bayesian lasso and the bootstrapped lasso

19 The Covariance Test
Bayesian methods and the bootstrap are two "traditional" approaches, and we would now like to present some newer ones. We describe two methods proposed for assigning p-values or confidence intervals to predictors as they are successively entered by the lasso and by forward stepwise regression. The two methods give different results.

20 The Covariance Test (figure)

21 The Covariance Test
We start from the usual linear regression setup:

y = Xβ + ɛ, ɛ ~ N(0, σ²I_{N×N})   (4)

Consider forward-stepwise regression. Defining RSS_k to be the residual sum of squares for the model containing k predictors, we can use the change in residual sum of squares to form a test statistic

R_k = (1/σ²)(RSS_{k−1} − RSS_k)   (5)

and compare it to a χ²₁ distribution.
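As a sketch, the naive test of (5) is a one-liner; it is anti-conservative for adaptively chosen predictors, which is exactly the problem discussed on the next slides. The function name is ours.

```python
from scipy import stats

def forward_step_chi2(rss_prev, rss_curr, sigma2):
    """Naive chi-squared(1) test for one forward-stepwise step, per (5)."""
    r = (rss_prev - rss_curr) / sigma2
    return r, stats.chi2.sf(r, df=1)   # statistic and its nominal p-value
```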

22 The Covariance Test (figure)

23 The Covariance Test
The forward stepwise procedure has chosen the strongest predictor among all available choices, so it yields a larger drop in training error than would be expected under the null. It is difficult to derive an appropriate p-value for forward stepwise regression if we want to properly account for the adaptive nature of the fitting. For the lasso, a simple test can be derived that properly accounts for this adaptivity.

24 The Covariance Test
Suppose that we wish to test the significance of the predictor entered by LAR at λ_k. Let A_{k−1} be the set of predictors with nonzero coefficients before this predictor was added, and let the estimate at the end of this step be β̂(λ_{k+1}). We refit the lasso, keeping λ = λ_{k+1} but using just the variables in A_{k−1}; this yields the estimate β̂_{A_{k−1}}(λ_{k+1}). The covariance test statistic is

T_k = (1/σ²)(⟨y, Xβ̂(λ_{k+1})⟩ − ⟨y, Xβ̂_{A_{k−1}}(λ_{k+1})⟩)   (6)

and under the null,

T_k → Exp(1)   (7)

This statistic measures how much of the covariance between the outcome and the fitted model can be attributed to the predictor that has just entered the model. We can examine a quantile-quantile plot of T_1 versus Exp(1):
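Given the two fitted vectors at λ_{k+1}, the statistic (6) and its Exp(1) p-value (7) can be computed as below; fitted_full and fitted_reduced are assumed to be Xβ̂(λ_{k+1}) and Xβ̂_{A_{k−1}}(λ_{k+1}) obtained from a lasso solver, and the function name is ours.

```python
from scipy import stats

def covariance_test(y, fitted_full, fitted_reduced, sigma2):
    """Covariance test statistic (6) with its asymptotic Exp(1) p-value (7)."""
    T = (y @ fitted_full - y @ fitted_reduced) / sigma2
    return T, stats.expon.sf(T)
```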

25 The Covariance Test (figure)

26 The exponential limiting distribution for the covariance test requires certain conditions on the data: the signal variables must not be too correlated with the noise variables, and the underlying model is assumed to be linear. In the next section, Libo will show a more general scheme that gives the spacing test, which works for any data matrix X and whose null distribution holds exactly for finite N and p.

27 References
Trevor Hastie, Robert Tibshirani, Martin Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, 2015.
Trevor Park and George Casella. The Bayesian Lasso. Journal of the American Statistical Association, June 2008, Vol. 103, No. 482, Theory and Methods.

28 Post-selection Inference and Bayesian Inference
Libo Wang
Department of Statistics, Florida State University
Oct 19th, 2016

29 Motivation
Assigning significance in high-dimensional regression is challenging: most computationally efficient selection algorithms cannot guard against the inclusion of noise variables, and asymptotically valid p-values are not available. Statistics versus machine learning.

30 What is post-selection inference?
Inference the old way (pre-1980?): 1. Devise a model. 2. Collect data. 3. Test hypotheses. (Classical inference.)
Inference the new way: 1. Collect data. 2. Select a model. 3. Test hypotheses. (Post-selection inference.)
Classical tools cannot be used post-selection because they do not yield valid inferences: generally they are too optimistic, understating p-values. Why? For a parametric model with full column rank and p < n, the true parameter β is well defined as the target of statistical inference. When we allow p ≥ n, we must add a selection step before inference that retains only submodels of full column rank.

31 Example: Lasso with fixed λ
HIV data: mutations that predict response to a drug. Selection intervals for the lasso with fixed tuning parameter λ. (figure)

32 Formal goal of post-selection inference [Lee et al.; Fithian, Sun, Taylor]
Having selected a model M̂ based on our data y, we would like to test a hypothesis Ĥ₀. Note that Ĥ₀ will be random, a function of the selected model and hence of y. If our rejection region is {T(y) ∈ R}, we want to control the selected type I error:

Prob(T(y) ∈ R | M̂, Ĥ₀) ≤ α

33 Existing Approaches
Data splitting: fit on one half of the data, do inference on the other half. Problem: the fitted model varies with the random choice of half, and there is a loss of power. Reference: "P-Values for High-Dimensional Regression," Meinshausen N, Meier L, Bühlmann P, 2009.
Permutations and related methods: not clear how to use these beyond the global null.

34 A key mathematical result
Polyhedral lemma: provides a good solution for forward stepwise and an optimal solution for the fixed-λ lasso.
Polyhedral selection events: let the response vector be y ~ N(µ, Σ). Suppose we make a selection that can be written as {y : Ay ≤ b} with A, b not depending on y. This is true for forward stepwise regression, the lasso with fixed λ, least angle regression, and other procedures.

35 Some intuition for forward stepwise regression
Suppose that we run forward stepwise regression for k steps. Then {y : Ay ≤ b} is the set of y vectors that would yield the same predictors, with the same signs, entered at each step. Each step represents a competition involving inner products between each x_j and y; the polyhedron Ay ≤ b summarizes the results of the competition after k steps. A similar result holds for the lasso (fixed λ or LAR).

36 Example: The lasso and its selection event
Lasso estimation: β̂ ∈ argmin_β ‖y − Xβ‖²₂/2 + λ‖β‖₁
Rewrite the KKT conditions by partitioning them according to the active set M̂ and the inactive set −M̂ (to be continued):

X_M̂ᵀ(X_M̂ β̂_M̂ − y) + λŝ_M̂ = 0
X_{−M̂}ᵀ(X_M̂ β̂_M̂ − y) + λŝ_{−M̂} = 0
sign(β̂_M̂) = ŝ_M̂,  ‖ŝ_{−M̂}‖_∞ < 1

37 Example: The lasso and its selection event
The selection event holds if and only if there exist w and u satisfying:

X_M̂ᵀ(X_M̂ w − y) + λs = 0
X_{−M̂}ᵀ(X_M̂ w − y) + λu = 0
sign(w) = s,  ‖u‖_∞ < 1

Solve for w and u from the first two equations, then substitute these expressions for w and u into the sign and norm constraints.
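These partitioned KKT conditions can be checked numerically on any lasso solution. The sketch below uses sklearn (whose objective is (1/2n)‖y − Xβ‖²₂ + λ‖β‖₁, so the stationarity conditions carry a 1/n factor) and simulated data; tolerances are loose because the solver is iterative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 10))
y = X[:, 0] + rng.normal(size=80)
n, lam = len(y), 0.1

beta = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
g = X.T @ (X @ beta - y) / n          # gradient of the smooth part
M = beta != 0                         # active set M_hat

# Active rows: gradient + lam * sign(beta) = 0.
assert np.allclose(g[M], -lam * np.sign(beta[M]), atol=1e-3)
# Inactive rows: the subgradient u must satisfy |u| < 1, i.e. |g| <= lam.
assert np.all(np.abs(g[~M]) <= lam + 1e-3)
```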

38 The polyhedral lemma [Lee et al.; Ryan Tibshirani et al.]
For a vector η,

F^{[ν⁻, ν⁺]}_{ηᵀµ, σ²‖η‖²₂}(ηᵀy) | {Ay ≤ b} ~ Unif(0, 1)

(a truncated Gaussian distribution), where ν⁻, ν⁺ are computable values that are functions of η, A, b. Here F^{[a,b]}_{µ,σ²} denotes the CDF of a N(µ, σ²) random variable truncated to the interval [a, b], that is,

F^{[a,b]}_{µ,σ²}(x) = [Φ((x−µ)/σ) − Φ((a−µ)/σ)] / [Φ((b−µ)/σ) − Φ((a−µ)/σ)]

Typically we choose η so that ηᵀy is the partial least squares estimate for a selected variable.
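A sketch of both ingredients follows: the truncated-Gaussian CDF above, and the computation of ν⁻, ν⁺ for a polyhedron {Ay ≤ b} along the direction η, following the construction in Lee et al.; variable and function names are ours.

```python
import numpy as np
from scipy.stats import norm

def truncated_gaussian_cdf(x, mu, sigma, a, b):
    """F^[a,b]_{mu,sigma^2}(x): CDF of N(mu, sigma^2) truncated to [a, b]."""
    num = norm.cdf((x - mu) / sigma) - norm.cdf((a - mu) / sigma)
    den = norm.cdf((b - mu) / sigma) - norm.cdf((a - mu) / sigma)
    return num / den

def truncation_limits(y, A, b, eta, Sigma):
    """nu_minus, nu_plus: range of eta'y keeping y inside {Ay <= b}."""
    c = Sigma @ eta / (eta @ Sigma @ eta)
    z = y - c * (eta @ y)              # part of y independent of eta'y
    Ac, Az = A @ c, A @ z
    rho = (b - Az) / Ac
    lo, hi = rho[Ac < 0], rho[Ac > 0]
    nu_minus = lo.max() if lo.size else -np.inf
    nu_plus = hi.min() if hi.size else np.inf
    return nu_minus, nu_plus
```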

39 Schematic illustrating the polyhedral lemma for the case N = 2. (figure)

40 Example: Fixed-λ inference for the lasso
The lasso intervals are computed conditional on the selection procedure, which is adaptive to the data (we choose a probabilistic model for the data first and then formulate the test). The OLS model is pre-specified, using only the 7 variables selected by the lasso. Selection-adjusted intervals are similar for strong signals but wider for weak signals, because those variables lie close to the endpoint of the selection region, so wide ranges of µ would be consistent with the observations.

41 Current work: the lasso with λ estimated by cross-validation, and unknown σ
"Selective inference with a randomized response" by Tian and Taylor: one can condition on the selection of λ by CV, in addition to the selection of the model. It is not yet clear how inference differs between the lasso with fixed λ and the lasso with λ estimated by cross-validation.
"Selective inference with unknown variance via the square-root LASSO" by Tian, Loftus and Taylor: focuses on adapting post-selection inference to the case of unknown σ and the choice of tuning parameter. The square-root LASSO is attractive because λ can be chosen independently of the noise level σ.

42 Improving the power
The preceding approach conditions on the part of y orthogonal to the direction of interest η. This is for computational convenience, yielding an analytic solution. Conditioning on less gives more power. Are we conditioning on too much?

43 Data splitting, carving, and adding noise
Further improvements in power [Fithian, Sun, Taylor, Tian]. Selective inference yields the correct post-selection type I error, but confidence intervals are sometimes quite large. How can we do better? (Say, by making the randomness in selection independent of the data used for inference.)
Data carving: withhold a small proportion (say 10%) of the data in the selection stage, then use all the data for inference (conditioning using the theory outlined above).
Randomized response: add noise to y in the selection stage. Like withholding data, but smoother. Then use the un-noised data in the inference stage. Related to differential privacy techniques.

44 Data splitting, carving, and adding noise (figure)

45 Alternative Method: Bayesian quantile regression inference
Setting: y_i = β₀ + β₁X_{1i} + β₂X_{2i} + ɛ_i with β₀ = 1/3, β₁ = β₂ = 1, ɛ_i ~ Exp(1) − log(2).

Method   β̂₀   β̂₁   β̂₂   Err(β̂₀)×100   Err(β̂)×100   Err(ŷ)×100
BQR      …    …    …    …             …            …
QR       …    …    …    …             …            …

Table: Simulation results of Bayesian quantile regression

46 Difference between Bayesian and Classical Frequentist Inference
Frequentist: 1. Point estimates and standard errors or 95% confidence intervals. 2. Deduction from P(data | H₀), by setting α in advance. 3. Accept H₁ if P(data | H₀) < α.
Bayesian: 1. Induction from P(θ | data), starting with P(θ). 2. Broad descriptions of the posterior distribution, such as means and quantiles.
Frequentist: P(data | H₀) is the sampling distribution of the data given the parameter. Bayesian: P(θ) is the prior distribution, and P(θ | data) is the posterior distribution of the parameter.

47 Bayesian feature selection methods
Laplacian shrinkage: the Bayesian lasso. Adaptive shrinkage: spike and slab. Obtain the selection inference P(θ | data) by running 5000 or more iterations.
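As a sketch, once the iterations are run the posterior draws reduce to credible intervals; `draws` below is an (iterations × p) array from any such sampler, for example the bayesian_lasso_gibbs sketch earlier, and the "interval excludes zero" selection rule is one common convention, not the only one.

```python
import numpy as np

def credible_selection(draws, level=0.95):
    """Summarize posterior draws into credible intervals and a selected set."""
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [alpha, 1.0 - alpha], axis=0)
    selected = (lo > 0) | (hi < 0)     # credible interval excludes zero
    return lo, hi, selected
```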

48 Conclusions
Post-selection inference is an exciting new area, with lots of potential research problems and generalizations. Bayesian and frequentist methods both have drawbacks in finite-sample settings. R package on CRAN: selectiveinference (forward stepwise regression, lasso, LARS).

49 References
Book: Statistical Learning with Sparsity, Chapter 6 (Hastie, Tibshirani, Wainwright).
Lee, Sun, Sun, Taylor (2013). Exact post-selection inference with the lasso. arXiv; to appear.
Tian, X. and Taylor, J. (2015). Selective inference with a randomized response. arXiv.

50 Thank you!
