A Consistent Model Selection Criterion for L 2 -Boosting in High-Dimensional Sparse Linear Models
|
|
- Mae Fowler
- 6 years ago
- Views:
Transcription
1 A Consistent Model Selection Criterion for L 2 -Boosting in High-Dimensional Sparse Linear Models Tze Leung Lai, Stanford University Ching-Kang Ing, Academia Sinica, Taipei Zehao Chen, Lehman Brothers
2 y in = p n j=1 High Dimensional Regression β jn x ij (n) + ε i, i = 1, 2,..., n, p n = O(exp(n ξ )) for some 0 < ξ < 1. The n s in β jn, y in and x ij (n) may be suppressed. Some typical examples for n << p: Gene Expression Data: n = sample size, p = # genes. Sparse Signal Reconstruction: n = # measurements, p = # signals Image Recovery: n = # grid points, p = # basis functions Portfolio Selection: p = # assets, n = # time points 1
3 L 2 -Boosting: Pure Greedy Algorithm (PGA) Step 1. Define R 0 = (y 1,, y n ) and x j = (x 1j,..., x nj ). Find a variable among {x 1,..., x pn } that is most correlated to R 0. Call the variable xŝ1 and generate residual vector R 1 = R 0 ˆβŝ1 xŝ1, where ˆβŝ1 is the least squares estimate obtained by regressing R 0 on xŝ1. 2
4 Step 2. Find a variable among {x 1,..., x pn } that is most correlated to R 1. Call the variable xŝ2 and let R 2 = R 1 ˆβŝ2 xŝ2.. If iterations stop at step m, then the new outcome, y n+1, is predicted by ŷŝ1,...,ŝ m = ˆβŝ1 x n+1,ŝ1 + + ˆβŝm x n+1,ŝm. 3
5 L 2 -Boosting: Orthogonal Greedy Algorithm (OGA) Step 1. Define y n = (y 1,..., y n ) and x j = (x 1j,..., x nj ). Find a variable among {x 1,..., x pn } that is most correlated to y n. Call the variable xŝo 1 R 1 = y n Mŝo 1 y n, where Mŝo 1 L(xŝo 1 ). and generate residual is the projection matrix into 4
6 Step 2. Find a variable among {x 1,..., x pn } that is most correlated to R 1. Call the variable xŝo 2 R 2 = y n Mŝo 1,ŝo 2 y n, where Mŝo 1,ŝo 2 into L(xŝo 1, x ŝ o 2 ).. and let is the projection matrix If iterations stop at step m, then the new outcome, y n+1, is predicted by ŷŝo = ˆβ o 1,...,ŝo m ŝ x o n+1,ŝ o + + ˆβ o 1 1 ŝ x o n+1,ŝ o, m m where ˆβ o,..., ˆβ o ŝ o ŝ o 1 m are the least squares estimates. 5
7 Convergence Rates in L 2 -Boosting Assumptions: (K.1) Ee tε 1 + supj E exp(sx 1j ) < for t t 0, s s 0. (K.2) 0 < ι 1 < λ min (Γ n ) λ max (Γ n ) < ι 2 < for all n, where Γ n = E(x 1 x 1 ) (K.2*) 0 < η 1 min 1 j p n E(x 2 1j ) max 1 j p n E(x 2 1j ) < η 2 <. (K.3) p n = O(exp(Cn ξ )) for some 0 < ξ < 1 and C > 0. (K.4) sup n 1 pn j=1 β j <. 6
8 Theorem 1. Assume (K.1), (K.2), (K.3) and (K.4). Then, for any choice of m = m n satisfying m n = O(n l ) with 0 < l < (1 ξ)/5, E { (y n+1 ŷŝo 1,...,ŝo m )2 y n, x 1,..., x pn } σ 2 = O p (m 1 ). Theorem 2. Assume (K.1), (K.2*), (K.3) and (K.4). Then, for any choice of m = m n satisfying m n = o(log n) and 0 < θ < 1, 3 E { } (y n+1 ŷ P ŝ 1,...,ŝ m ) 2 y n, x 1,..., x pn σ 2 = O p (m θ ). 7
9 Theorem 3. Assume (K.1), (K.2), (K.3), (K.4) and (K.5) For some 0 γ < (1 ξ)/5, lim inf n nγ min j O n β 2 j > C > 0. Then, for any choice of m = m n satisfying m n = O(n l ) with 0 < l < (1 ξ)/5 and lim n m n /n γ =, where lim n P(Do n) = 1, O n = {j : 1 j p n, β j 0}, D o n = { O n {ŝ o 1,..., ŝo m n } }. 8
10 Theorem 3 indicates that with probability approaching 1, the variables chosen by the OGA after m n steps contains all relevant variables. To prevent overfitting, the algorithm should be stopped at the first time when all relevant variables are included, i.e., choose the smallest correct model along the boosting path. 9
11 High-Dimensional Hannan Quinn Criterion HQ(k) = log ˆσ 2 (xŝo 1,..., x ŝ o k ) + (log p n)c n k n where C n and ˆk HQ = arg min 1 k m n HQ(k), ˆσ 2 (xŝo 1,..., x ŝ o k ) = 1 n y n(i Mŝo 1,...,ŝo k )y n. 10
12 The major difference between HQ and conventional Hannan Quinn is the additional factor log p n, which is related to the maximum of i.i.d. V 1,..., V pn χ 2 (1): max 1 k pn V i 2 log p n 1 in probability. Define A j = {ŝ o,..., 1 ŝo }, j 1, j m k n o n, if O n A mn = min{j : 1 j m n, O n A j }, if O n A mn. 11
13 Theorem 4. Assume (K.1), (K.2), (K.3), (K.4) and (K.5). Then, for any choice of m = m n satisfying m n = O(n l ) with 0 < l < (1 ξ)/5 and lim n m n /n γ =, and any choice of C n satisfying C n log p n = o(n 1 2γ ) and n γ /C n = o(1), lim n P(AˆkHQ = A ko n ) = 1. For γ = 0, one can take C n = log n, giving the high-dimensional version of BIC. 12
14 Sketch of Proof 1. Difference between OGA (or PGA) with its population version: Exponential bounds. 2. The convergence rates in Theorems 1 and 2 are used to prove Theorem Theorem 3, exponential bounds and extension of Hannan Quinn-type arguments to prove Theorem 4. 13
15 Finite-Sample Performance Example 1. (β 1,..., β 5 ) = (3, 3.5, 4, 2.8, 3.2), β j = 0 for j 6 in y i = 5 β j x ij + p β j x ij + ε i, i = 1,..., n. j=1 j=6 ε i i.i.d. N(0, 1), x i = (x i1,..., x ip ) T independent of ε i. x ij = z ij + ηw i, where η > 0, (z T i, w i) T N(0, I) simulation runs for each result. 14
16 Table 1: Frequency of all 5 relevant and i additional variables selected (m n = 30) in 1000 simulations. i η n p method Total OGA+HDBIC OGA+BIC OGA+HDBIC OGA+BIC OGA+HDBIC OGA+BIC OGA+HDBIC OGA+BIC
17 Example 2. q = 10, n = 400, p = 4000 in y i = q j=1 β j x ij + p j=q+1 β j x ij + ε i, i = 1,..., n, (β 1,..., β 10 ) = (3.2, 3.2, 3.2, 3.2, 4.4, 4.4, 3.5, 3.5, 3.5, 3.5), β j = 0 for j > 10. ε i i.i.d. N(0, 2.25). x i = (x i1,..., x ip ) T, same as in Example simulations for each result. 16
18 Table 2: Frequency of all 9 relevant and i additional variables selected (m n = 40) in 100 simulations. i η method Total 1 OGA+HDBIC LASSO OGA+HDBIC LASSO
19 Example 3. Whereas Example 2 satisfies the neighborhood stability condition of Meinshausen and Buhlmann (2006), which is an extension of the irrepresentable condition of Zhao and Yu (2006) to ensure model selection consistency of LASSO to random covariates, this condition is violated in taking q = 10 in Example 2 and (β 1,..., β q ) = (3, 3.75, 4.5, 5.25, 6, 6.75, 7.5, 8.25, 9, 9.75). β j = 0 for j 11, (x i1,..., x iq ) T i.i.d. N(0, I q ). ε i i.i.d. N(0, 1) and independent of (x i1,..., x ip ) T. x ij = z ij + ( q l=1 bx il) for q < j p, z i N(0, I p q /4). qb 2 + 1/4 = 1 (i.e, b = 3/4q). 18
20 Table 3: Frequency of all 10 relevant and i additional variables selected (m n = 40) in 100 simulations. i method or more Total OGA+HDBIC LASSO
21 Conclusion By obtaining the convergence rates of orthogonal greedy L 2 -boosting (OGA), classical model selection criteria such as BIC can be modified to a large number p n of regressors in sparse regression models. Consistency of model selection is formulated in terms of capturing all, and only, variables with nonzero coefficients under (K5). 20
22 OGA with consistent model selection means that the estimate is asymptotically equivalent to OLS with the correct set of covariates. In the absence of (K5), consistency of model selection can be linked to the objective of the regression model, e.g., prediction, estimation of high-dimensional covariance matrix via modified Cholesky decomposition. For the latter, consistent model selection can be carried out via thresholding (Bickel & Levina, 2008, and extensions). 21
On High-Dimensional Cross-Validation
On High-Dimensional Cross-Validation BY WEI-CHENG HSIAO Institute of Statistical Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 11529, Taiwan hsiaowc@stat.sinica.edu.tw 5 WEI-YING
More informationSample Size Requirement For Some Low-Dimensional Estimation Problems
Sample Size Requirement For Some Low-Dimensional Estimation Problems Cun-Hui Zhang, Rutgers University September 10, 2013 SAMSI Thanks for the invitation! Acknowledgements/References Sun, T. and Zhang,
More informationSparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28
Sparsity Models Tong Zhang Rutgers University T. Zhang (Rutgers) Sparsity Models 1 / 28 Topics Standard sparse regression model algorithms: convex relaxation and greedy algorithm sparse recovery analysis:
More informationThe Iterated Lasso for High-Dimensional Logistic Regression
The Iterated Lasso for High-Dimensional Logistic Regression By JIAN HUANG Department of Statistics and Actuarial Science, 241 SH University of Iowa, Iowa City, Iowa 52242, U.S.A. SHUANGE MA Division of
More informationStepwise Searching for Feature Variables in High-Dimensional Linear Regression
Stepwise Searching for Feature Variables in High-Dimensional Linear Regression Qiwei Yao Department of Statistics, London School of Economics q.yao@lse.ac.uk Joint work with: Hongzhi An, Chinese Academy
More informationBAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage
BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage Lingrui Gan, Naveen N. Narisetty, Feng Liang Department of Statistics University of Illinois at Urbana-Champaign Problem Statement
More informationA Multiple Testing Approach to the Regularisation of Large Sample Correlation Matrices
A Multiple Testing Approach to the Regularisation of Large Sample Correlation Matrices Natalia Bailey 1 M. Hashem Pesaran 2 L. Vanessa Smith 3 1 Department of Econometrics & Business Statistics, Monash
More informationA Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models
A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los
More informationLASSO-TYPE RECOVERY OF SPARSE REPRESENTATIONS FOR HIGH-DIMENSIONAL DATA
The Annals of Statistics 2009, Vol. 37, No. 1, 246 270 DOI: 10.1214/07-AOS582 Institute of Mathematical Statistics, 2009 LASSO-TYPE RECOVERY OF SPARSE REPRESENTATIONS FOR HIGH-DIMENSIONAL DATA BY NICOLAI
More informationOrthogonal Matching Pursuit for Sparse Signal Recovery With Noise
Orthogonal Matching Pursuit for Sparse Signal Recovery With Noise The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published
More informationFeature selection with high-dimensional data: criteria and Proc. Procedures
Feature selection with high-dimensional data: criteria and Procedures Zehua Chen Department of Statistics & Applied Probability National University of Singapore Conference in Honour of Grace Wahba, June
More informationHigh-dimensional variable selection via tilting
High-dimensional variable selection via tilting Haeran Cho and Piotr Fryzlewicz September 2, 2010 Abstract This paper considers variable selection in linear regression models where the number of covariates
More informationInference After Variable Selection
Department of Mathematics, SIU Carbondale Inference After Variable Selection Lasanthi Pelawa Watagoda lasanthi@siu.edu June 12, 2017 Outline 1 Introduction 2 Inference For Ridge and Lasso 3 Variable Selection
More informationHigh-dimensional covariance estimation based on Gaussian graphical models
High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,
More informationHigh-dimensional Covariance Estimation Based On Gaussian Graphical Models
High-dimensional Covariance Estimation Based On Gaussian Graphical Models Shuheng Zhou, Philipp Rutimann, Min Xu and Peter Buhlmann February 3, 2012 Problem definition Want to estimate the covariance matrix
More informationOn Model Selection Consistency of Lasso
On Model Selection Consistency of Lasso Peng Zhao Department of Statistics University of Berkeley 367 Evans Hall Berkeley, CA 94720-3860, USA Bin Yu Department of Statistics University of Berkeley 367
More informationHigh-dimensional regression with unknown variance
High-dimensional regression with unknown variance Christophe Giraud Ecole Polytechnique march 2012 Setting Gaussian regression with unknown variance: Y i = f i + ε i with ε i i.i.d. N (0, σ 2 ) f = (f
More informationConfidence Intervals for Low-dimensional Parameters with High-dimensional Data
Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Cun-Hui Zhang and Stephanie S. Zhang Rutgers University and Columbia University September 14, 2012 Outline Introduction Methodology
More informationRegularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008
Regularized Estimation of High Dimensional Covariance Matrices Peter Bickel Cambridge January, 2008 With Thanks to E. Levina (Joint collaboration, slides) I. M. Johnstone (Slides) Choongsoon Bae (Slides)
More informationISyE 691 Data mining and analytics
ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)
More informationLog Covariance Matrix Estimation
Log Covariance Matrix Estimation Xinwei Deng Department of Statistics University of Wisconsin-Madison Joint work with Kam-Wah Tsui (Univ. of Wisconsin-Madsion) 1 Outline Background and Motivation The Proposed
More informationAccepted author version posted online: 12 Aug 2015.
This article was downloaded by: [Institute of Software] On: 13 August 2015, At: 12:15 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office:
More informationsparse and low-rank tensor recovery Cubic-Sketching
Sparse and Low-Ran Tensor Recovery via Cubic-Setching Guang Cheng Department of Statistics Purdue University www.science.purdue.edu/bigdata CCAM@Purdue Math Oct. 27, 2017 Joint wor with Botao Hao and Anru
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationMULTIVARIATE STOCHASTIC REGRESSION IN TIME SERIES MODELING
Statistica Sinica 26 (2016), 1411-1426 doi:http://dx.doi.org/10.5705/ss.2014.211t MULTIVARIATE STOCHASTIC REGRESSION IN TIME SERIES MODELING Tze Leung Lai and Ka Wai Tsang Stanford University Abstract:
More informationVariable Selection for Highly Correlated Predictors
Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu arxiv:1709.04840v1 [stat.me] 14 Sep 2017 Abstract Penalty-based variable selection methods are powerful in selecting relevant covariates
More informationRegression Shrinkage and Selection via the Lasso
Regression Shrinkage and Selection via the Lasso ROBERT TIBSHIRANI, 1996 Presenter: Guiyun Feng April 27 () 1 / 20 Motivation Estimation in Linear Models: y = β T x + ɛ. data (x i, y i ), i = 1, 2,...,
More information(Part 1) High-dimensional statistics May / 41
Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2
More informationThe Sparsity and Bias of The LASSO Selection In High-Dimensional Linear Regression
The Sparsity and Bias of The LASSO Selection In High-Dimensional Linear Regression Cun-hui Zhang and Jian Huang Presenter: Quefeng Li Feb. 26, 2010 un-hui Zhang and Jian Huang Presenter: Quefeng The Sparsity
More informationA New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables
A New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables Niharika Gauraha and Swapan Parui Indian Statistical Institute Abstract. We consider the problem of
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationLeast squares under convex constraint
Stanford University Questions Let Z be an n-dimensional standard Gaussian random vector. Let µ be a point in R n and let Y = Z + µ. We are interested in estimating µ from the data vector Y, under the assumption
More informationPermutation-invariant regularization of large covariance matrices. Liza Levina
Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work
More informationAnalysis of Greedy Algorithms
Analysis of Greedy Algorithms Jiahui Shen Florida State University Oct.26th Outline Introduction Regularity condition Analysis on orthogonal matching pursuit Analysis on forward-backward greedy algorithm
More informationExtended Bayesian Information Criteria for Model Selection with Large Model Spaces
Extended Bayesian Information Criteria for Model Selection with Large Model Spaces Jiahua Chen, University of British Columbia Zehua Chen, National University of Singapore (Biometrika, 2008) 1 / 18 Variable
More informationSparse Linear Models (10/7/13)
STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine
More informationIterative Selection Using Orthogonal Regression Techniques
Iterative Selection Using Orthogonal Regression Techniques Bradley Turnbull 1, Subhashis Ghosal 1 and Hao Helen Zhang 2 1 Department of Statistics, North Carolina State University, Raleigh, NC, USA 2 Department
More informationA Significance Test for the Lasso
A Significance Test for the Lasso Lockhart R, Taylor J, Tibshirani R, and Tibshirani R Ashley Petersen May 14, 2013 1 Last time Problem: Many clinical covariates which are important to a certain medical
More informationConsistent Model Selection Criteria on High Dimensions
Journal of Machine Learning Research 13 (2012) 1037-1057 Submitted 6/11; Revised 1/12; Published 4/12 Consistent Model Selection Criteria on High Dimensions Yongdai Kim Department of Statistics Seoul National
More informationHigh Dimensional Inverse Covariate Matrix Estimation via Linear Programming
High Dimensional Inverse Covariate Matrix Estimation via Linear Programming Ming Yuan October 24, 2011 Gaussian Graphical Model X = (X 1,..., X p ) indep. N(µ, Σ) Inverse covariance matrix Σ 1 = Ω = (ω
More informationarxiv: v1 [stat.me] 30 Dec 2017
arxiv:1801.00105v1 [stat.me] 30 Dec 2017 An ISIS screening approach involving threshold/partition for variable selection in linear regression 1. Introduction Yu-Hsiang Cheng e-mail: 96354501@nccu.edu.tw
More informationComplexity of two and multi-stage stochastic programming problems
Complexity of two and multi-stage stochastic programming problems A. Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA The concept
More informationSGN Advanced Signal Processing Project bonus: Sparse model estimation
SGN 21006 Advanced Signal Processing Project bonus: Sparse model estimation Ioan Tabus Department of Signal Processing Tampere University of Technology Finland 1 / 12 Sparse models Initial problem: solve
More informationIntroduction to graphical models: Lecture III
Introduction to graphical models: Lecture III Martin Wainwright UC Berkeley Departments of Statistics, and EECS Martin Wainwright (UC Berkeley) Some introductory lectures January 2013 1 / 25 Introduction
More informationMS&E 226. In-Class Midterm Examination Solutions Small Data October 20, 2015
MS&E 226 In-Class Midterm Examination Solutions Small Data October 20, 2015 PROBLEM 1. Alice uses ordinary least squares to fit a linear regression model on a dataset containing outcome data Y and covariates
More informationGrouping Pursuit in regression. Xiaotong Shen
Grouping Pursuit in regression Xiaotong Shen School of Statistics University of Minnesota Email xshen@stat.umn.edu Joint with Hsin-Cheng Huang (Sinica, Taiwan) Workshop in honor of John Hartigan Innovation
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationarxiv: v1 [math.st] 8 Jan 2008
arxiv:0801.1158v1 [math.st] 8 Jan 2008 Hierarchical selection of variables in sparse high-dimensional regression P. J. Bickel Department of Statistics University of California at Berkeley Y. Ritov Department
More informationJournal of Multivariate Analysis. Consistency of sparse PCA in High Dimension, Low Sample Size contexts
Journal of Multivariate Analysis 5 (03) 37 333 Contents lists available at SciVerse ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Consistency of sparse PCA
More informationConvex relaxation for Combinatorial Penalties
Convex relaxation for Combinatorial Penalties Guillaume Obozinski Equipe Imagine Laboratoire d Informatique Gaspard Monge Ecole des Ponts - ParisTech Joint work with Francis Bach Fête Parisienne in Computation,
More informationGreedy and Relaxed Approximations to Model Selection: A simulation study
Greedy and Relaxed Approximations to Model Selection: A simulation study Guilherme V. Rocha and Bin Yu April 6, 2008 Abstract The Minimum Description Length (MDL) principle is an important tool for retrieving
More informationAdvanced Econometrics I
Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics
More informationTopic 4 Unit Roots. Gerald P. Dwyer. February Clemson University
Topic 4 Unit Roots Gerald P. Dwyer Clemson University February 2016 Outline 1 Unit Roots Introduction Trend and Difference Stationary Autocorrelations of Series That Have Deterministic or Stochastic Trends
More informationThe Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA
The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:
More informationDISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO. By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich
Submitted to the Annals of Statistics DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich We congratulate Richard Lockhart, Jonathan Taylor, Ryan
More informationTractable Upper Bounds on the Restricted Isometry Constant
Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationZellner s Seemingly Unrelated Regressions Model. James L. Powell Department of Economics University of California, Berkeley
Zellner s Seemingly Unrelated Regressions Model James L. Powell Department of Economics University of California, Berkeley Overview The seemingly unrelated regressions (SUR) model, proposed by Zellner,
More informationarxiv: v2 [stat.me] 16 May 2009
Stability Selection Nicolai Meinshausen and Peter Bühlmann University of Oxford and ETH Zürich May 16, 9 ariv:0809.2932v2 [stat.me] 16 May 9 Abstract Estimation of structure, such as in variable selection,
More informationThe generalized method of moments
Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna February 2008 Based on the book Generalized Method of Moments by Alastair R. Hall (2005), Oxford
More informationTECHNICAL REPORT NO. 1091r. A Note on the Lasso and Related Procedures in Model Selection
DEPARTMENT OF STATISTICS University of Wisconsin 1210 West Dayton St. Madison, WI 53706 TECHNICAL REPORT NO. 1091r April 2004, Revised December 2004 A Note on the Lasso and Related Procedures in Model
More informationLinear Model Selection and Regularization
Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In
More informationAdvanced Econometrics
Based on the textbook by Verbeek: A Guide to Modern Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna May 16, 2013 Outline Univariate
More informationA One-Covariate at a Time, Multiple Testing Approach to Variable Selection in High-Dimensional Linear Regression Models *
Federal Reserve Bank of Dallas Globalization and Monetary Policy Institute Working Paper No. 90 http://www.dallasfed.org/assets/documents/institute/wpapers/016/090.pdf A One-Covariate at a ime Multiple
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationarxiv: v1 [econ.em] 16 Jan 2019
lassopack: Model selection and prediction with regularized regression in Stata arxiv:1901.05397v1 [econ.em] 16 Jan 2019 Achim Ahrens The Economic and Social Research Institute Dublin, Ireland achim.ahrens@esri.ie
More informationCausal Inference: Discussion
Causal Inference: Discussion Mladen Kolar The University of Chicago Booth School of Business Sept 23, 2016 Types of machine learning problems Based on the information available: Supervised learning Reinforcement
More informationMFE Financial Econometrics 2018 Final Exam Model Solutions
MFE Financial Econometrics 2018 Final Exam Model Solutions Tuesday 12 th March, 2019 1. If (X, ε) N (0, I 2 ) what is the distribution of Y = µ + β X + ε? Y N ( µ, β 2 + 1 ) 2. What is the Cramer-Rao lower
More informationBootstrapping high dimensional vector: interplay between dependence and dimensionality
Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang
More informationThe adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
Electronic Journal of Statistics Vol. 0 (2010) ISSN: 1935-7524 The adaptive the thresholded Lasso for potentially misspecified models ( a lower bound for the Lasso) Sara van de Geer Peter Bühlmann Seminar
More informationReconstruction from Anisotropic Random Measurements
Reconstruction from Anisotropic Random Measurements Mark Rudelson and Shuheng Zhou The University of Michigan, Ann Arbor Coding, Complexity, and Sparsity Workshop, 013 Ann Arbor, Michigan August 7, 013
More informationStatistical Learning with the Lasso, spring The Lasso
Statistical Learning with the Lasso, spring 2017 1 Yeast: understanding basic life functions p=11,904 gene values n number of experiments ~ 10 Blomberg et al. 2003, 2010 The Lasso fmri brain scans function
More informationRelaxed Lasso. Nicolai Meinshausen December 14, 2006
Relaxed Lasso Nicolai Meinshausen nicolai@stat.berkeley.edu December 14, 2006 Abstract The Lasso is an attractive regularisation method for high dimensional regression. It combines variable selection with
More informationLasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices
Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,
More informationSparse Permutation Invariant Covariance Estimation: Motivation, Background and Key Results
Sparse Permutation Invariant Covariance Estimation: Motivation, Background and Key Results David Prince Biostat 572 dprince3@uw.edu April 19, 2012 David Prince (UW) SPICE April 19, 2012 1 / 11 Electronic
More informationBayesian Sparse Linear Regression with Unknown Symmetric Error
Bayesian Sparse Linear Regression with Unknown Symmetric Error Minwoo Chae 1 Joint work with Lizhen Lin 2 David B. Dunson 3 1 Department of Mathematics, The University of Texas at Austin 2 Department of
More informationAnalysis of Cross-Sectional Data
Analysis of Cross-Sectional Data Kevin Sheppard http://www.kevinsheppard.com Oxford MFE This version: November 8, 2017 November 13 14, 2017 Outline Econometric models Specification that can be analyzed
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationLinear regression methods
Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response
More informationProperties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation
Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Adam J. Rothman School of Statistics University of Minnesota October 8, 2014, joint work with Liliana
More informationBig Data Analytics: A New Perspective
Big Data Analytics: A New Perspective A. Chudik ederal Reserve Bank of Dallas G. Kapetanios King s College London, University of London M. Hashem Pesaran USC Dornsife INE, and rinity College, Cambridge
More informationA Consistent Information Criterion for Support Vector Machines in Diverging Model Spaces
Journal of Machine Learning Research 17 (2016) 1-26 Submitted 6/14; Revised 5/15; Published 4/16 A Consistent Information Criterion for Support Vector Machines in Diverging Model Spaces Xiang Zhang Yichao
More informationarxiv: v2 [math.st] 12 Feb 2008
arxiv:080.460v2 [math.st] 2 Feb 2008 Electronic Journal of Statistics Vol. 2 2008 90 02 ISSN: 935-7524 DOI: 0.24/08-EJS77 Sup-norm convergence rate and sign concentration property of Lasso and Dantzig
More informationShort Questions (Do two out of three) 15 points each
Econometrics Short Questions Do two out of three) 5 points each ) Let y = Xβ + u and Z be a set of instruments for X When we estimate β with OLS we project y onto the space spanned by X along a path orthogonal
More informationStatistical Machine Learning for Structured and High Dimensional Data
Statistical Machine Learning for Structured and High Dimensional Data (FA9550-09- 1-0373) PI: Larry Wasserman (CMU) Co- PI: John Lafferty (UChicago and CMU) AFOSR Program Review (Jan 28-31, 2013, Washington,
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationEstimation of Threshold Cointegration
Estimation of Myung Hwan London School of Economics December 2006 Outline Model Asymptotics Inference Conclusion 1 Model Estimation Methods Literature 2 Asymptotics Consistency Convergence Rates Asymptotic
More informationHIGH-DIMENSIONAL L 2 BOOSTING: RATE OF CONVERGENCE. By Ye Luo and Martin Spindler University of Florida, University of Hamburg and Max Planck Society
HIGH-DIMENSIONAL L 2 BOOSTING: RATE OF CONVERGENCE By Ye Luo and Martin Spindler University of Florida, University of Hamburg and Max Planck Society Boosting is one of the most significant developments
More informationMultiple Equation GMM with Common Coefficients: Panel Data
Multiple Equation GMM with Common Coefficients: Panel Data Eric Zivot Winter 2013 Multi-equation GMM with common coefficients Example (panel wage equation) 69 = + 69 + + 69 + 1 80 = + 80 + + 80 + 2 Note:
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Regression based methods, 1st part: Introduction (Sec.
More informationSimple Linear Regression
Simple Linear Regression Christopher Ting Christopher Ting : christophert@smu.edu.sg : 688 0364 : LKCSB 5036 January 7, 017 Web Site: http://www.mysmu.edu/faculty/christophert/ Christopher Ting QF 30 Week
More informationVector Auto-Regressive Models
Vector Auto-Regressive Models Laurent Ferrara 1 1 University of Paris Nanterre M2 Oct. 2018 Overview of the presentation 1. Vector Auto-Regressions Definition Estimation Testing 2. Impulse responses functions
More informationInformation Criteria and Model Selection
Information Criteria and Model Selection Herman J. Bierens Pennsylvania State University March 12, 2006 1. Introduction Let L n (k) be the maximum likelihood of a model with k parameters based on a sample
More informationVAR Models and Applications
VAR Models and Applications Laurent Ferrara 1 1 University of Paris West M2 EIPMC Oct. 2016 Overview of the presentation 1. Vector Auto-Regressions Definition Estimation Testing 2. Impulse responses functions
More informationarxiv: v3 [stat.me] 10 Mar 2016
Submitted to the Annals of Statistics ESTIMATING THE EFFECT OF JOINT INTERVENTIONS FROM OBSERVATIONAL DATA IN SPARSE HIGH-DIMENSIONAL SETTINGS arxiv:1407.2451v3 [stat.me] 10 Mar 2016 By Preetam Nandy,,
More informationBayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson
Bayesian variable selection via penalized credible regions Brian Reich, NC State Joint work with Howard Bondell and Ander Wilson Brian Reich, NCSU Penalized credible regions 1 Motivation big p, small n
More information