MATH 829: Introduction to Data Mining and Analysis Least angle regression

Dominique Guillot, Departments of Mathematical Sciences, University of Delaware. February 29, 2016.

Least angle regression (LARS)

Recall the forward stagewise approach to linear regression:

1. Start with intercept ȳ, and centered predictors with coefficients initially all 0.
2. At each step, identify the variable most correlated with the current residual.
3. Compute the simple linear regression coefficient of the residual on this chosen variable, and add it to the current coefficient for that variable.
4. Continue until none of the variables have any correlation with the residuals.

This is a greedy approach. However, the solution often looks similar to the lasso solution. Is there a connection between the two methods?
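To make the procedure concrete, here is a minimal sketch of the forward stagewise loop in Python (NumPy only); it is not from the lecture, and the function name, step limit, and tolerance are illustrative choices.

```python
import numpy as np

def forward_stagewise(X, y, max_steps=1000, tol=1e-8):
    """Forward stagewise regression as described above.

    Assumes the columns of X and the response y are centered
    (so no intercept is needed) and the predictors are standardized.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y.copy()                                # current residual
    for _ in range(max_steps):
        corr = X.T @ r                          # correlations with the residual (up to a factor 1/n)
        j = np.argmax(np.abs(corr))             # most correlated variable
        if np.abs(corr[j]) < tol:               # stop: residual uncorrelated with all predictors
            break
        delta = corr[j] / (X[:, j] @ X[:, j])   # simple regression coefficient of r on x_j
        beta[j] += delta                        # add it to the current coefficient
        r -= delta * X[:, j]                    # update the residual
    return beta
```

With standardized predictors, dividing corr by n gives the empirical correlations used on the slides; the selected variable is unchanged.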

Forward stagewise vs lasso

Example: Prostate cancer data (see ESL). (Figure from Efron et al., 2003.)

LARS

Least angle regression (LARS) is similar to forward stagewise, but it only enters "as much" of a predictor as it deserves. See ESL, Algorithm 3.2.

LARS (cont.)

Let A_k be the current active set, and let β_{A_k} be the vector of coefficients (for the variables in A_k) at step k. Let r_k = y − X_{A_k} β_{A_k} denote the residual at step k. Then, at step k, we move the coefficients in the direction

δ_k = (X_{A_k}^T X_{A_k})^{-1} X_{A_k}^T r_k,

i.e., β_{A_k}(α) = β_{A_k} + α δ_k.
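As an aside (not from the slides), here is a minimal sketch of this update in Python (NumPy), assuming the active set is already known; choosing the step length α so that a new variable joins the active set is the part handled by ESL, Algorithm 3.2. Function names are illustrative.

```python
import numpy as np

def lars_direction(X, y, beta, active):
    """delta_k = (X_A^T X_A)^{-1} X_A^T r_k for the current active set A_k."""
    XA = X[:, active]
    r = y - XA @ beta[active]                     # residual r_k at step k
    delta = np.linalg.solve(XA.T @ XA, XA.T @ r)  # least-squares direction on the active set
    return delta

def lars_step(X, y, beta, active, alpha):
    """Move the active coefficients: beta_A(alpha) = beta_A + alpha * delta_k."""
    delta = lars_direction(X, y, beta, active)
    beta = beta.copy()
    beta[active] += alpha * delta
    return beta
```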

Analysis of LARS

How does the correlation between the predictors and the residuals evolve as we move along

δ_k = (X_{A_k}^T X_{A_k})^{-1} X_{A_k}^T r_k,   β_{A_k}(α) = β_{A_k} + α δ_k?

It remains the same for all predictors, and decreases monotonically.

Indeed, suppose each predictor in a linear regression problem has equal correlation (in absolute value) with the response:

(1/n) ⟨x_j, y⟩ = λ,   j = 1, ..., p.

(Recall that we assume the predictors have been standardized.) Let β̂ be the least-squares coefficients of y on X, and let u(α) = α X β̂ for α ∈ [0, 1].

Analysis of LARS (cont.)

We have (1/n) ⟨x_j, y⟩ = λ and u(α) = α X β̂. Now,

( (1/n) ⟨x_j, y − u(α)⟩ )_{j=1}^p = (1/n) X^T (y − u(α))
                                  = (1/n) X^T (y − α X (X^T X)^{-1} X^T y)
                                  = (1/n) (1 − α) X^T y
                                  = (1 − α) λ 1_p,

where 1_p is the all-ones vector. Therefore, the correlation between x_j and the residuals y − u(α) decreases linearly to 0.

In LARS, the parameter α is increased until a new variable becomes equally correlated with the residuals y − u(α). The new variable is then added to the model, and a new direction is computed.
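A quick numerical check of this identity in Python (NumPy), on synthetic data of my own choosing: for the least-squares fit, X^T (y − u(α)) = (1 − α) X^T y, so every predictor's correlation with the residual shrinks by the same factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                               # centered predictors
y = X @ rng.standard_normal(p) + rng.standard_normal(n)
y -= y.mean()

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares coefficients of y on X

for alpha in (0.0, 0.3, 0.7, 1.0):
    u = alpha * X @ beta_hat                      # u(alpha) = alpha * X * beta_hat
    lhs = X.T @ (y - u)                           # n times the correlations with the residual
    rhs = (1 - alpha) * (X.T @ y)
    print(alpha, np.allclose(lhs, rhs))           # True for every alpha
```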

Analysis of LARS (cont.)

Example (figure from Efron et al., 2003): Ĉ_k denotes the current maximal correlation.

Equiangular vector

Why "least angle" regression? Recall that β_{A_k}(α) = β_{A_k} + α δ_k. Thus,

ŷ(α) = X_{A_k} β_{A_k}(α) = X_{A_k} β_{A_k} + α X_{A_k} δ_k.

It is not hard to check that u_k := X_{A_k} δ_k makes equal angles with the predictors in A_k. Indeed,

X_{A_k}^T u_k = X_{A_k}^T X_{A_k} δ_k = X_{A_k}^T X_{A_k} (X_{A_k}^T X_{A_k})^{-1} X_{A_k}^T r_k = X_{A_k}^T r_k.

The entries of the vector X_{A_k}^T r_k are all the same, since the predictors in A_k all have the same correlation with the residuals r_k (by construction).

Conclusion: u_k makes equal angles with the predictors in A_k.

Problem: In general, given v_1, ..., v_k ∈ R^n, how do we find a vector that makes equal angles with v_1, ..., v_k? When is this possible?
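A small numerical illustration (not from the slides), in Python with NumPy: it builds a residual whose correlations with the standardized active predictors are all equal, as happens along the LARS path, and checks that u_k = X_{A_k} δ_k then makes equal angles with them. All numerical choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
XA = rng.standard_normal((n, k))
XA = (XA - XA.mean(axis=0)) / XA.std(axis=0)   # standardized active predictors (equal norms)

# Residual with equal correlations with the active predictors, as along the LARS path:
# X_A^T r = c * (1, ..., 1), plus a component orthogonal to the column space of X_A.
c = 2.0
G = XA.T @ XA
z = rng.standard_normal(n)
r = XA @ np.linalg.solve(G, c * np.ones(k)) + (z - XA @ np.linalg.solve(G, XA.T @ z))

delta = np.linalg.solve(G, XA.T @ r)           # LARS direction delta_k
u = XA @ delta                                 # u_k = X_A delta_k

print(np.allclose(XA.T @ u, XA.T @ r))         # True: X_A^T u_k = X_A^T r_k
print(XA.T @ u)                                # all entries equal (to c)
cosines = XA.T @ u / (np.linalg.norm(XA, axis=0) * np.linalg.norm(u))
print(cosines)                                 # equal cosines, i.e. equal angles
```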

LARS and Lasso

LARS is closely related to stepwise regression. There is also a connection to the Lasso.

On the figure in ESL, the lasso coefficient profiles are almost identical to those of LARS in the left panel, and differ for the first time when the blue coefficient passes back through zero.

LARS and Lasso (cont.)

The previous observation suggests the following LARS modification: ESL, Algorithm 3.2a.

Theorem: The modified LARS (lasso) algorithm (Algorithm 3.2a) yields the solution of the lasso problem if variables appear/disappear one at a time.

See Efron et al., Least angle regression, The Annals of Statistics, 2004.

Note: the theorem explains the piecewise linear nature of the lasso paths.
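As a practical illustration (not part of the slides), scikit-learn exposes both the plain LARS path and its lasso modification through sklearn.linear_model.lars_path; the sketch below compares the two on synthetic data, with all names and sizes chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_normal(n)

# Plain least angle regression path (ESL, Algorithm 3.2).
alphas_lar, _, coefs_lar = lars_path(X, y, method="lar")

# Lasso modification (ESL, Algorithm 3.2a): a variable is dropped from the
# active set when its coefficient crosses zero, and the direction is recomputed.
alphas_lasso, _, coefs_lasso = lars_path(X, y, method="lasso")

print(coefs_lar.shape, coefs_lasso.shape)   # p x (number of breakpoints) for each path
# The two coefficient paths agree until some coefficient would pass back through zero.
```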

Consistency of the Lasso

Recall: We proved before that the least-squares estimator is consistent, i.e., β̂_n → β as the sample size n goes to infinity (under some assumptions). We now study analogous results for the lasso.

Assumptions:
- X_1, ..., X_p are (possibly dependent) random variables.
- |X_j| ≤ M almost surely for some M > 0 (j = 1, ..., p).
- Y = Σ_{j=1}^p β_j X_j + ε for some (unknown) constants β_j.
- ε ~ N(0, σ²) is independent of the X_j (σ² unknown).
- Sparsity assumption (specified later).
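A minimal data-generating sketch matching these assumptions, in Python (NumPy); it is not from the lecture. Bounded predictors are obtained by sampling uniformly on [−M, M], and the coefficients, M, and σ are arbitrary illustrative choices.

```python
import numpy as np

def generate_data(n, p, beta, M=1.0, sigma=0.5, seed=0):
    """Draw n iid observations of (Y, X_1, ..., X_p) under the assumptions above."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-M, M, size=(n, p))    # |X_j| <= M almost surely
    eps = rng.normal(0.0, sigma, size=n)   # eps ~ N(0, sigma^2), independent of the X_j
    y = X @ beta + eps                     # Y = sum_j beta_j X_j + eps
    return X, y

beta_true = np.array([2.0, -1.0, 0.5] + [0.0] * 7)   # sparse truth: sum of |beta_j| = 3.5
X, y = generate_data(n=500, p=10, beta=beta_true)
```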

Consistency of the Lasso (cont.)

We are given n iid observations Z_i = (Y_i, X_{i,1}, ..., X_{i,p}) of (Y, X_1, ..., X_p). Our goal is to recover β_1, ..., β_p as accurately as possible.

Let

Ŷ = Σ_{j=1}^p β_j X_j,

the best predictor of Y if the true coefficients were known. Given candidate coefficients β̃_1, ..., β̃_p, let

Ỹ = Σ_{j=1}^p β̃_j X_j.

Define the mean square prediction error by

MSPE(β̃) = E(Ŷ − Ỹ)².

We will provide a bound on MSPE(β̃) when β̃ is the lasso solution.

Consistency of the Lasso (cont.)

Given K > 0, let β̃^K = (β̃_1^K, ..., β̃_p^K) be the minimizer of

Σ_{i=1}^n (Y_i − β_1 X_{i,1} − ... − β_p X_{i,p})²

under the constraint Σ_{i=1}^p |β_i| ≤ K. (This problem is equivalent to the lasso.)

Theorem: Under the previous assumptions, and assuming Σ_{j=1}^p |β_j| ≤ K for some K > 0 (the sparsity assumption), we have

MSPE(β̃^K) ≤ 2KMσ √(2 log(2p)/n) + 8K²M² √(2 log(2p)/n).

See Chatterjee, Assumptionless consistency of the Lasso, preprint, 2013.
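To illustrate the setup (not part of the slides), here is a rough Monte Carlo sketch in Python estimating MSPE(β̃^K). It solves the penalized form of the lasso with scikit-learn's Lasso rather than the constrained form (the two are equivalent for a suitable λ, chosen here by hand), and estimates the expectation by an empirical average over fresh draws; all numerical choices are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, M, sigma = 500, 50, 1.0, 0.5
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]          # sparse truth, sum of |beta_j| = 3.5 <= K

def draw(n):
    X = rng.uniform(-M, M, size=(n, p))   # |X_j| <= M almost surely
    y = X @ beta_true + rng.normal(0, sigma, size=n)
    return X, y

X, y = draw(n)
# Penalized lasso as a stand-in for the constrained problem sum |beta_j| <= K.
lam = sigma * M * np.sqrt(2 * np.log(2 * p) / n)   # illustrative tuning
beta_tilde = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Monte Carlo estimate of MSPE(beta_tilde) = E(Yhat - Ytilde)^2 on fresh draws of X.
X_new, _ = draw(100000)
mspe = np.mean((X_new @ beta_true - X_new @ beta_tilde) ** 2)
print(mspe)
```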
