Variable selection with error control: Another look at Stability Selection
1 Variable selection with error control: Another look at Stability Selection
Richard J. Samworth and Rajen D. Shah, University of Cambridge
RSS Journal Webinar, 25 October 2017
2 High-dimensional data
Many modern applications, e.g. in genomics, can have the number of predictors p greatly exceeding the number of observations n. In these settings, variable selection is particularly important.
[Figure: (a) Microarray data; (b) Sparsity]
3 What is Stability Selection?
Stability Selection (Meinshausen & Bühlmann, 2010) is a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data.
A key feature of Stability Selection is the error control it provides, in the form of an upper bound on the expected number of falsely selected variables.
4 A general model for variable selection
Let $Z_1, \ldots, Z_n$ be i.i.d. random vectors. We think of the indices $S$ of some components of $Z_i$ as signal variables, and the rest $N$ as noise variables.
E.g. $Z_i = (X_i, Y_i)$, with covariate vector $X_i \in \mathbb{R}^p$, response $Y_i \in \mathbb{R}$ and log-likelihood of the form $\sum_{i=1}^n L(Y_i, X_i^T \beta)$ with $\beta \in \mathbb{R}^p$. Then $S = \{k : \beta_k \neq 0\}$ and $N = \{k : \beta_k = 0\}$. Thus $S \subseteq \{1, \ldots, p\}$ and $N = \{1, \ldots, p\} \setminus S$.
A variable selection procedure is a statistic $\hat{S}_n := \hat{S}_n(Z_1, \ldots, Z_n)$ taking values in the set of all subsets of $\{1, \ldots, p\}$.
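To fix ideas, here is a minimal sketch of this abstraction in Python: a selection procedure is just a function mapping a data (sub)sample to a set of selected indices. The lasso-based selector and the fixed penalty `lam` are illustrative assumptions, not part of the definition.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_select(X, y, lam=0.1):
    """One possible variable selection procedure S_hat: the set of indices
    with nonzero estimated coefficient in a lasso fit (illustrative choice)."""
    coef = Lasso(alpha=lam, max_iter=10_000).fit(X, y).coef_
    return set(np.flatnonzero(coef))
```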
5 How does Stability Selection work?
For a subset $A = \{i_1, \ldots, i_{|A|}\} \subseteq \{1, \ldots, n\}$, write $\hat{S}(A) := \hat{S}_{|A|}(Z_{i_1}, \ldots, Z_{i_{|A|}})$.
Meinshausen and Bühlmann defined
$$\hat{\Pi}(k) := \binom{n}{\lfloor n/2 \rfloor}^{-1} \sum_{A \subseteq \{1,\ldots,n\},\ |A| = \lfloor n/2 \rfloor} \mathbf{1}_{\{k \in \hat{S}(A)\}}.$$
Stability Selection fixes $\tau \in [0, 1]$ and selects $\hat{S}^{\mathrm{SS}}_{n,\tau} = \{k : \hat{\Pi}(k) \geq \tau\}$.
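The exact average over all $\binom{n}{\lfloor n/2 \rfloor}$ subsets is infeasible (see slide 7), so in practice $\hat{\Pi}(k)$ is estimated by Monte Carlo over random subsamples of size $\lfloor n/2 \rfloor$. A hedged sketch, reusing the `lasso_select` stub above:

```python
def selection_proportions(X, y, select=lasso_select, n_subsamples=100, seed=0):
    """Monte Carlo estimate of Pi_hat(k): the fraction of random
    size-floor(n/2) subsamples on which variable k is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        A = rng.choice(n, size=n // 2, replace=False)
        for k in select(X[A], y[A]):
            counts[k] += 1
    return counts / n_subsamples
```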
6 Error control of Stability Selection
Assume that $\{\mathbf{1}_{\{k \in \hat{S}_{\lfloor n/2 \rfloor}\}} : k \in N\}$ is exchangeable, and that $\hat{S}_{\lfloor n/2 \rfloor}$ is no worse than random guessing:
$$\frac{E(|\hat{S}_{\lfloor n/2 \rfloor} \cap S|)}{E(|\hat{S}_{\lfloor n/2 \rfloor} \cap N|)} \geq \frac{|S|}{|N|}.$$
Then, for $\tau \in (\tfrac{1}{2}, 1]$,
$$E(|\hat{S}^{\mathrm{SS}}_{n,\tau} \cap N|) \leq \frac{1}{2\tau - 1}\,\frac{(E|\hat{S}_{\lfloor n/2 \rfloor}|)^2}{p}.$$
7 Error control discussion
In principle, this theorem allows the user to choose τ based on the expected number of false positives they are willing to tolerate. However:
The theorem requires two conditions, and the exchangeability assumption is very strong.
There are too many subsets to evaluate $\hat{S}^{\mathrm{SS}}_{n,\tau}$ exactly once $n \geq 30$ or so.
The bound tends to be rather weak.
8 Complementary Pairs Stability Selection (CPSS)
Let $\{(A_{2j-1}, A_{2j}) : j = 1, \ldots, B\}$ be randomly chosen independent pairs of subsets of $\{1, \ldots, n\}$ of size $\lfloor n/2 \rfloor$ such that $A_{2j-1} \cap A_{2j} = \emptyset$. Define
$$\hat{\Pi}_B(k) := \frac{1}{2B} \sum_{j=1}^{2B} \mathbf{1}_{\{k \in \hat{S}(A_j)\}}$$
and select $\hat{S}^{\mathrm{CPSS}}_{n,\tau} := \{k : \hat{\Pi}_B(k) \geq \tau\}$.
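A minimal sketch of CPSS itself: each draw splits a random permutation into two disjoint halves, so every pair is complementary by construction.

```python
def cpss(X, y, select=lasso_select, B=50, tau=0.9, seed=0):
    """Complementary Pairs Stability Selection: returns the selection
    proportions Pi_hat_B and the selected set {k : Pi_hat_B(k) >= tau}."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(B):
        perm = rng.permutation(n)
        half = n // 2
        for A in (perm[:half], perm[half:2 * half]):  # one complementary pair
            for k in select(X[A], y[A]):
                counts[k] += 1
    pi_hat = counts / (2 * B)
    return pi_hat, set(np.flatnonzero(pi_hat >= tau))
```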
9 Worst-case error control bounds
Define the selection probability of variable $k$ to be $p_{k,n} = P(k \in \hat{S}_n)$. We can divide our variables into those that have low and high selection probabilities: for $\theta \in [0, 1]$, let
$$L_\theta := \{k : p_{k,\lfloor n/2 \rfloor} \leq \theta\} \quad \text{and} \quad H_\theta := \{k : p_{k,\lfloor n/2 \rfloor} > \theta\}.$$
If $\tau \in (\tfrac{1}{2}, 1]$, then
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_\theta| \leq \frac{\theta}{2\tau - 1}\, E|\hat{S}_{\lfloor n/2 \rfloor} \cap L_\theta|.$$
Moreover, if $\tau \in [0, \tfrac{1}{2})$, then
$$E|\hat{N}^{\mathrm{CPSS}}_{n,\tau} \cap H_\theta| \leq \frac{1 - \theta}{1 - 2\tau}\, E|\hat{N}_{\lfloor n/2 \rfloor} \cap H_\theta|,$$
where $\hat{N}^{\mathrm{CPSS}}_{n,\tau}$ and $\hat{N}_{\lfloor n/2 \rfloor}$ denote the corresponding sets of unselected variables.
10 Illustration and discussion
Suppose $p = 1000$ and $q := E|\hat{S}_{\lfloor n/2 \rfloor}| = 50$. On average, CPSS with $\tau = 0.6$ then selects no more than a quarter as many variables with below-average selection probability as $\hat{S}_{\lfloor n/2 \rfloor}$ does.
The theorem requires no exchangeability or random guessing conditions.
It holds even when $B = 1$.
If the exchangeability and random guessing conditions do hold, then we recover
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap N| \leq \frac{1}{2\tau - 1}\,\frac{q}{p}\, E|\hat{S}_{\lfloor n/2 \rfloor} \cap L_{q/p}| \leq \frac{1}{2\tau - 1}\Big(\frac{q^2}{p}\Big).$$
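A quick numerical check of this illustration, with the values as stated on the slide:

```python
p, q, tau = 1000, 50, 0.6
theta = q / p                            # below-average selection probability
factor = theta / (2 * tau - 1)           # worst-case shrinkage factor on L_theta
mb_bound = q ** 2 / ((2 * tau - 1) * p)  # recovered bound on expected false selections
print(factor, mb_bound)                  # 0.25 and 12.5
```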
11 Proof
Let
$$\tilde{\Pi}_B(k) := \frac{1}{B} \sum_{j=1}^{B} \mathbf{1}_{\{k \in \hat{S}(A_{2j-1})\}} \mathbf{1}_{\{k \in \hat{S}(A_{2j})\}}.$$
Note that $E\{\tilde{\Pi}_B(k)\} = p^2_{k,\lfloor n/2 \rfloor}$, since $A_{2j-1}$ and $A_{2j}$ are disjoint, so the two indicators are independent. Now
$$0 \leq \frac{1}{B} \sum_{j=1}^{B} (1 - \mathbf{1}_{\{k \in \hat{S}(A_{2j-1})\}})(1 - \mathbf{1}_{\{k \in \hat{S}(A_{2j})\}}) = 1 - 2\hat{\Pi}_B(k) + \tilde{\Pi}_B(k),$$
so that $\hat{\Pi}_B(k) \leq \tfrac{1}{2}(1 + \tilde{\Pi}_B(k))$. Thus
$$P\{\hat{\Pi}_B(k) \geq \tau\} \leq P\{\tfrac{1}{2}(1 + \tilde{\Pi}_B(k)) \geq \tau\} = P\{\tilde{\Pi}_B(k) \geq 2\tau - 1\} \leq \frac{1}{2\tau - 1}\, p^2_{k,\lfloor n/2 \rfloor},$$
where the final step is Markov's inequality.
12 Proof 2
It follows that
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_\theta| = E\Big(\sum_{k : p_{k,\lfloor n/2 \rfloor} \leq \theta} \mathbf{1}_{\{k \in \hat{S}^{\mathrm{CPSS}}_{n,\tau}\}}\Big) = \sum_{k : p_{k,\lfloor n/2 \rfloor} \leq \theta} P(k \in \hat{S}^{\mathrm{CPSS}}_{n,\tau}) \leq \frac{1}{2\tau - 1} \sum_{k : p_{k,\lfloor n/2 \rfloor} \leq \theta} p^2_{k,\lfloor n/2 \rfloor} \leq \frac{\theta}{2\tau - 1}\, E|\hat{S}_{\lfloor n/2 \rfloor} \cap L_\theta|,$$
where the final inequality follows because each summand satisfies $p^2_{k,\lfloor n/2 \rfloor} \leq \theta\, p_{k,\lfloor n/2 \rfloor}$ on $L_\theta$, and
$$E|\hat{S}_{\lfloor n/2 \rfloor} \cap L_\theta| = E\Big(\sum_{k : p_{k,\lfloor n/2 \rfloor} \leq \theta} \mathbf{1}_{\{k \in \hat{S}_{\lfloor n/2 \rfloor}\}}\Big) = \sum_{k : p_{k,\lfloor n/2 \rfloor} \leq \theta} p_{k,\lfloor n/2 \rfloor}.$$
13 Bounds with no assumptions whatsoever
If $Z_1, \ldots, Z_n$ are not identically distributed, the same bound holds, provided in $L_\theta$ we redefine
$$p_{k,\lfloor n/2 \rfloor} := \binom{n}{\lfloor n/2 \rfloor}^{-1} \sum_{|A| = \lfloor n/2 \rfloor} P\{k \in \hat{S}_{\lfloor n/2 \rfloor}(A)\}.$$
Similarly, if $Z_1, \ldots, Z_n$ are not independent, the same bound holds, with $p^2_{k,\lfloor n/2 \rfloor}$ replaced by the average of
$$P\{k \in \hat{S}_{\lfloor n/2 \rfloor}(A_1) \cap \hat{S}_{\lfloor n/2 \rfloor}(A_2)\}$$
over all complementary pairs $A_1, A_2$.
14-15 Can we improve on Markov's inequality?
[Figure: typical and extremal pmfs of $\tilde{\Pi}_{25}(k)$ for a low selection probability variable $k$.]
16 Improved bound under unimodality
Suppose that the distribution of $\tilde{\Pi}_B(k)$ is unimodal for each $k \in L_\theta$. If $\tau \in \{\tfrac{1}{2} + \tfrac{1}{B}, \tfrac{1}{2} + \tfrac{3}{2B}, \tfrac{1}{2} + \tfrac{2}{B}, \ldots, 1\}$, then
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_\theta| \leq C(\tau, B)\,\theta\, E|\hat{S}_{\lfloor n/2 \rfloor} \cap L_\theta|,$$
where, when $\theta \leq 1/\sqrt{3}$,
$$C(\tau, B) = \begin{cases} \dfrac{1}{2(2\tau - 1 - 1/(2B))} & \text{if } \tau \in \big(\min\big(\tfrac{1}{2} + \theta^2,\ \tfrac{1}{2} + \tfrac{1}{2B} + \tfrac{3}{4}\theta^2\big),\ \tfrac{3}{4}\big], \\[1ex] \dfrac{4(1 - \tau + 1/(2B))}{1 + 1/B} & \text{if } \tau \in \big(\tfrac{3}{4}, 1\big]. \end{cases}$$
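A direct transcription of $C(\tau, B)$ as reconstructed above; the domain checks mirror the conditions in the display, so treat this as a sketch of the stated formula rather than a definitive implementation.

```python
import math

def C(tau, B, theta):
    """Unimodal-bound constant C(tau, B) from the display above; valid for
    theta <= 1/sqrt(3) and tau on the grid {1/2 + 1/B, 1/2 + 3/(2B), ..., 1}."""
    assert theta <= 1 / math.sqrt(3)
    if tau <= 0.75:
        # first branch of the case split
        assert tau > min(0.5 + theta ** 2, 0.5 + 1 / (2 * B) + 0.75 * theta ** 2)
        return 1 / (2 * (2 * tau - 1 - 1 / (2 * B)))
    return 4 * (1 - tau + 1 / (2 * B)) / (1 + 1 / B)
```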
17-18 Extremal distribution under unimodality
[Figure: typical and extremal pmfs of $\tilde{\Pi}_{25}(k)$ for a low selection probability variable $k$.]
19 The r-concavity constraint
r-concavity provides a continuum of constraints that interpolate between unimodality and log-concavity.
A non-negative function $f$ on an interval $I \subseteq \mathbb{R}$ is $r$-concave with $r < 0$ if $f^r$ is convex on $I$.
A pmf $f$ on $\{0, 1/B, \ldots, 1\}$ is $r$-concave if the linear interpolant to $\{(i, f(i/B)) : i = 0, 1, \ldots, B\}$ is $r$-concave.
The constraint becomes stronger as $r$ increases to 0, and weaker as $r$ decreases towards $-\infty$.
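For intuition, one common discrete formalisation can be checked directly: with equally spaced nodes, require the values $f(i/B)^r$ to form a convex sequence. This helper is an illustrative assumption, not the paper's exact numerical machinery.

```python
def is_r_concave(pmf, r):
    """Check a discrete r-concavity condition (r < 0) for a pmf on
    {0, 1/B, ..., 1}: the values f(i/B)**r, with 0**r read as +inf,
    must have non-negative second differences (a convex sequence)."""
    assert r < 0
    g = [v ** r if v > 0 else math.inf for v in pmf]
    return all(g[i - 1] + g[i + 1] >= 2 * g[i] for i in range(1, len(g) - 1))
```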
20 Further improvements under r-concavity
Suppose $\tilde{\Pi}_B(k)$ is $r$-concave for all $k \in L_\theta$. Then for $\tau \in (\tfrac{1}{2}, 1]$,
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_\theta| \leq D(\theta^2, 2\tau - 1, B, r)\, |L_\theta|,$$
where $D$ can be evaluated numerically. Our simulations suggest $r = -1/2$ is a reasonable choice.
21-22 Extremal distribution under $-1/2$-concavity
[Figure: typical and extremal pmfs of $\tilde{\Pi}_{25}(k)$ for a low selection probability variable $k$.]
23 $r = -1/2$ is sensible
[Figure: left, typical and extremal pmfs raised to the power $-1/2$; right, tail probabilities from 0.2 onwards, comparing the worst-case, unimodal, $-1/2$-concave and empirical curves.]
24 Reducing the threshold τ
Suppose $\hat{\Pi}_B(k)$ is $-1/4$-concave, and that $\tilde{\Pi}_B(k)$ is $-1/2$-concave, for all $k \in L_\theta$. Then
$$E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_\theta| \leq \min\{D(\theta^2, 2\tau - 1, B, -1/2),\ D(\theta, \tau, 2B, -1/4)\}\, |L_\theta|,$$
for all $\tau \in (\theta, 1]$. (We take $D(\cdot, t, \cdot, \cdot) = 1$ for $t \leq 0$.)
25 Error bound
[Figure: comparison of the improved bounds on $E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_{q/p}|$ as a function of $\tau$, with $p = 1000$, $q = 50$, showing the M & B bound (dashes), the worst-case bound (dot-dash), the unimodal and $r$-concave bounds, and the true value for a simulated example.]
26 Simulation study
Linear model $Y_i = X_i^T \beta + \varepsilon_i$ with $X_i \sim N_p(0, \Sigma)$.
Toeplitz covariance $\Sigma_{ij} = \rho^{|i - j|}$.
$\beta$ has sparsity $s$, with $s/2$ coefficients equally spaced within $[-1, -0.5]$ and $s/2$ equally spaced within $[0.5, 1]$.
$n = 200$, $p = 1000$.
Use the Lasso and seek $E|\hat{S}^{\mathrm{CPSS}}_{n,\tau} \cap L_{q/p}| \leq l$. Fix $q = \sqrt{0.8\, l\, p}$ and, for the worst-case bound, choose $\tau = 0.9$ (so that the worst-case bound equals $l$). Also choose $\hat{\tau}$ from the $r$-concave bound, an oracle $\tau^*$, and an oracle $\lambda^*$ for the Lasso $\hat{S}^{\lambda}_n$. Compare
$$\frac{E|\hat{S}^{\mathrm{CPSS}}_{n,0.9} \cap S|}{E|\hat{S}^{\mathrm{CPSS}}_{n,\tau^*} \cap S|}, \quad \frac{E|\hat{S}^{\mathrm{CPSS}}_{n,\hat{\tau}} \cap S|}{E|\hat{S}^{\mathrm{CPSS}}_{n,\tau^*} \cap S|} \quad \text{and} \quad \frac{E|\hat{S}^{\lambda^*}_n \cap S|}{E|\hat{S}^{\mathrm{CPSS}}_{n,\tau^*} \cap S|}.$$
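A compact sketch of this setup, reusing the `cpss` stub from slide 8; placing the signal coefficients in the first $s$ coordinates and using unit noise variance are simplifying assumptions.

```python
def toeplitz_cov(p, rho):
    """Toeplitz covariance with entries rho**|i-j|."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate(n=200, p=1000, s=8, rho=0.5, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(np.zeros(p), toeplitz_cov(p, rho), size=n)
    beta = np.zeros(p)
    beta[: s // 2] = np.linspace(-1.0, -0.5, s // 2)  # s/2 coefficients in [-1, -0.5]
    beta[s // 2 : s] = np.linspace(0.5, 1.0, s // 2)  # s/2 coefficients in [0.5, 1]
    y = X @ beta + rng.standard_normal(n)
    return X, y, set(range(s))                        # data and the signal set S

X, y, S = simulate(p=200)  # smaller p so the sketch runs quickly
pi_hat, selected = cpss(X, y, B=25, tau=0.9)
print(len(selected & S), "true positives,", len(selected - S), "false positives")
```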
27 Simulation results
[Figure: expected number of true positives using the worst-case and $r$-concave bounds, and an oracle Lasso procedure (crosses), as a fraction of the expected number of true positives for an oracle CPSS procedure, across settings of SNR, $s$ and $\rho$; the y-axis label gives the desired error control level $l$.]
28 Summary
CPSS can be used with any variable selection procedure.
We can bound the average number of low selection probability variables chosen by CPSS, with no conditions needed on the model or the original selection procedure.
Under mild conditions, e.g. unimodality or r-concavity, the bounds can be strengthened, yielding tight error control. This allows the user to choose the threshold τ in an effective way.
R packages: mboost and stabsel.
Thank you for listening.