A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers
1 A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Sahand Negahban, UC Berkeley; Pradeep Ravikumar, UT Austin; Martin Wainwright, UC Berkeley; Bin Yu, UC Berkeley. NIPS Conference
2 Loss functions and regularization
Model class: parameter space Ω ⊆ R^p, and set of probability distributions {P_θ : θ ∈ Ω}
Data: samples X_1^n = {(x_i, y_i), i = 1, ..., n} are drawn from unknown P_{θ*}
Estimation: minimize loss function plus regularization term:

  θ̂ ∈ argmin_{θ ∈ R^p} { L_n(θ; X_1^n) + λ_n r(θ) }

where θ̂ is the estimate, L_n the loss function, and r the regularizer.
Analysis: bound the error d(θ̂ − θ*) under high-dimensional scaling (n, p) → +∞.
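As a concrete sketch of the generic program θ̂ ∈ argmin { L_n(θ) + λ_n r(θ) }, the code below runs proximal gradient descent, instantiated with squared loss and the ℓ1 regularizer. This solver, its parameter names, and the data are illustrative, not from the slides.

```python
import numpy as np

def prox_gradient(grad_loss, prox_reg, theta0, step, lam, n_iter=500):
    """Generic proximal-gradient solver for min_theta L(theta) + lam * r(theta).

    grad_loss(theta) returns the gradient of the loss L_n; prox_reg(v, t) is the
    proximal operator of t * r. Both are supplied by the caller.
    """
    theta = theta0.copy()
    for _ in range(n_iter):
        theta = prox_reg(theta - step * grad_loss(theta), step * lam)
    return theta

# Instantiate with squared loss L_n(theta) = (1/2n) ||y - X theta||^2 and r = l1.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[:3] = [2.0, -1.5, 1.0]                    # sparse true parameter
y = X @ theta_star + 0.1 * rng.standard_normal(n)

grad = lambda th: X.T @ (X @ th - y) / n             # gradient of squared loss
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)  # prox of l1
step = n / np.linalg.norm(X, 2) ** 2                 # 1 / Lipschitz constant
theta_hat = prox_gradient(grad, soft, np.zeros(p), step, lam=0.05)
```

Any loss/regularizer pair with a computable gradient and prox can be plugged into the same loop, which is the point of the unified view.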
3 Example: Sparse regression
Set-up: noisy observations y = Xθ* + w with sparse θ* (support S, complement S^c)
Estimator: Lasso program

  θ̂ ∈ argmin_θ (1/n) Σ_{i=1}^n (y_i − x_i^T θ)^2 + λ_n Σ_{j=1}^p |θ_j|

Some past work: Tibshirani, 1996; Chen et al., 1998; Donoho/Huo, 2001; Tropp, 2004; Fuchs, 2004; Meinshausen/Buhlmann, 2005; Candes/Tao, 2005; Donoho, 2005; Haupt & Nowak, 2006; Zhao/Yu, 2006; Wainwright, 2006; Zou, 2006; Koltchinskii, 2007; Meinshausen/Yu, 2007; Tsybakov et al., 2008
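The Lasso program above is commonly solved by cyclic coordinate descent (a standard solver choice, not one prescribed by the slide). A minimal sketch for the (1/n)-scaled objective, with made-up data:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/n) ||y - X theta||_2^2 + lam * ||theta||_1.

    Each pass soft-thresholds one coordinate at a time while maintaining the
    residual r = y - X @ theta. The n/2 factor matches the (1/n) loss scaling.
    """
    n, p = X.shape
    theta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()                                 # residual for theta = 0
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * theta[j]              # remove coordinate j's contribution
            rho = X[:, j] @ r                    # partial correlation with residual
            theta[j] = np.sign(rho) * max(abs(rho) - lam * n / 2, 0.0) / col_sq[j]
            r -= X[:, j] * theta[j]
    return theta

rng = np.random.default_rng(7)
n, p = 80, 15
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[[0, 4]] = [1.5, -2.0]
y = X @ theta_star + 0.1 * rng.standard_normal(n)
theta_hat = lasso_cd(X, y, lam=0.1)
```

Soft-thresholding zeroes small coordinates exactly, which is how the ℓ1 penalty produces sparse solutions.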
4 Example: Structured inverse covariance matrices
Set-up: samples from a random vector with sparse inverse covariance Θ* (the zero pattern of the inverse covariance encodes the structure)
Estimator: ℓ1-regularized log-determinant program

  Θ̂ ∈ argmin_Θ ⟨(1/n) Σ_{i=1}^n x_i x_i^T, Θ⟩ − log det(Θ) + λ_n Σ_{j ≠ q} |Θ_{jq}|

Some past work: Yuan & Lin, 2006; d'Aspremont et al., 2007; Bickel & Levina, 2007; El Karoui, 2007; Rothman et al., 2007; Zhou et al., 2007; Friedman et al., 2008; Ravikumar et al., 2008
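A minimal numerical sketch of this log-determinant program: proximal gradient steps with off-diagonal soft-thresholding and a positive-definiteness backtrack. This illustrative solver and the chain-graph example are my own, not the algorithms cited on the slide.

```python
import numpy as np

def graphical_lasso_pg(S, lam, step=0.05, n_iter=400):
    """Proximal gradient on <S, Theta> - log det(Theta) + lam * sum_{j != q} |Theta_jq|.

    The smooth part has gradient S - inv(Theta); the prox of the penalty
    soft-thresholds the off-diagonal entries only. The step is halved whenever
    an iterate would leave the positive-definite cone (where log det is defined).
    """
    Theta = np.eye(S.shape[0])
    for _ in range(n_iter):
        G = S - np.linalg.inv(Theta)
        T = Theta - step * G
        prox = np.sign(T) * np.maximum(np.abs(T) - step * lam, 0.0)
        np.fill_diagonal(prox, np.diag(T))           # penalty is off-diagonal only
        if np.linalg.eigvalsh(prox).min() <= 1e-8:   # stay inside the PD cone
            step *= 0.5
            continue
        Theta = prox
    return Theta

# Data with a known sparse inverse covariance (a chain graph), for illustration.
rng = np.random.default_rng(4)
p, n = 5, 2000
Prec = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Prec), size=n)
S_hat = X.T @ X / n
Theta_hat = graphical_lasso_pg(S_hat, lam=0.05)
```

With enough samples, the estimate keeps the chain's adjacent entries and drives the non-edge entries to (near) zero.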
5 Example: Low-rank matrix approximation
Set-up: matrix Θ* ∈ R^{k×m} with rank r ≪ min{k, m} (factorization Θ* = U D V^T with U ∈ R^{k×r}, V ∈ R^{m×r})
Estimator:

  Θ̂ ∈ argmin_Θ (1/n) Σ_{i=1}^n (y_i − ⟨⟨X_i, Θ⟩⟩)^2 + λ_n Σ_{j=1}^{min{k,m}} σ_j(Θ)

where the σ_j(Θ) are the singular values of Θ.
Some past work: Frieze et al., 1998; Achlioptas & McSherry, 2001; Srebro et al., 2004; Drineas et al., 2005; Rudelson & Vershynin, 2006; Recht et al., 2007; Bach, 2008; Meka et al., 2008; Candes & Tao, 2009; Keshavan et al., 2009
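The penalty Σ_j σ_j(Θ) is the nuclear norm, whose proximal operator is singular-value soft-thresholding. A small illustrative sketch (the data and the threshold τ are made up) showing how thresholding removes the noise directions from a noisy low-rank matrix:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the prox of tau * (nuclear norm) at M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(5)
k, m, r = 20, 20, 2
A = rng.standard_normal((k, r)) @ rng.standard_normal((r, m))  # rank-2 signal
Y = A + 0.1 * rng.standard_normal((k, m))                      # noisy observation
A_hat = svt(Y, tau=1.0)
```

The noise matrix's singular values all sit below τ and are zeroed, so `A_hat` is again (near) low-rank and closer to the signal than the raw observation.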
6 Important properties of regularizer/loss
1. Decomposability of regularizer: for vectors u ∈ A and v ∈ B⊥,

  r(u + v) = r(u) + r(v)

which constrains the error Δ = θ̂ − θ* to a smaller set C.
2. Restricted strong convexity: loss functions are not strictly convex in high dimensions, so require curvature only for directions Δ ∈ C. The loss function L(θ) := L_n(θ; X_1^n) satisfies

  L(θ* + Δ) − L(θ*) − ⟨∇L(θ*), Δ⟩ ≥ γ(L) d²(Δ)

(excess loss, minus the score-function term, lower-bounded by the squared error) for all Δ ∈ C.
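For intuition, decomposability is easy to verify numerically for the ℓ1 norm, taking A to be the vectors supported on S and B⊥ the vectors supported on S^c (the subspace roles follow the slide; the dimensions and support below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 10
S = [0, 3, 7]                                  # model support
Sc = [j for j in range(p) if j not in S]

u = np.zeros(p)
u[S] = rng.standard_normal(len(S))             # u in the model subspace A
v = np.zeros(p)
v[Sc] = rng.standard_normal(len(Sc))           # v in the perturbation subspace

lhs = np.abs(u + v).sum()                      # r(u + v)
rhs = np.abs(u).sum() + np.abs(v).sum()        # r(u) + r(v)
# disjoint supports => the l1 norm splits exactly: lhs == rhs
```

The same exact split holds for the block ℓ1/ℓq norms (over row supports) and for the nuclear norm (over suitable row/column subspace pairs).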
7 Main theorem
Quantities that control rates:
- restricted strong convexity parameter: γ(L)
- dual norm of regularizer: r*(v) := sup_{r(u)=1} ⟨v, u⟩
- optimal subspace constant: Ψ(A) = min { c ∈ R : r(θ) ≤ c d(θ) for all θ ∈ A }

Theorem. With regularization constant λ_n ≥ 2 r*(∇L_n(θ*; X_1^n)), any solution θ̂ satisfies

  d(θ̂ − θ*) ≤ (1/γ(L)) [ Ψ(B) λ_n ].

Assumptions:
- θ* belongs to a subspace A
- regularizer r is decomposable over the subspace pair (A, B)
- loss obeys restricted strong convexity with parameter γ(L) > 0
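For example, when r is the ℓ1 norm the dual norm r* is the ℓ∞ norm, so the theorem's λ_n rule for the Lasso reads λ_n ≥ 2‖∇L_n(θ*)‖_∞. A quick numerical check (with an illustrative vector v) that the supremum defining r* is attained at a vertex of the ℓ1 ball:

```python
import numpy as np

rng = np.random.default_rng(2)
p = 6
v = rng.standard_normal(p)

# r*(v) = sup_{||u||_1 = 1} <v, u> is attained at a signed coordinate vector,
# which gives r*(v) = ||v||_inf when r is the l1 norm.
j = int(np.argmax(np.abs(v)))
u_star = np.zeros(p)
u_star[j] = np.sign(v[j])                      # vertex of the l1 ball
dual_at_vertex = v @ u_star

# No random unit-l1 vector does better than the vertex value.
best_random = max(
    v @ (u / np.abs(u).sum()) for u in rng.standard_normal((2000, p))
)
```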
8 Application: Linear regression (hard sparsity)
RSC reduces to a lower bound on restricted eigenvalues of X^T X; for a k-sparse vector, we have ‖θ‖_1 ≤ √k ‖θ‖_2.

Corollary. Suppose that the true parameter θ* is exactly k-sparse. Under RSC and with λ_n ≥ 2 ‖X^T ε / n‖_∞, any Lasso solution satisfies

  ‖θ̂ − θ*‖_2 ≤ (1/γ(L)) √k λ_n.

Some stochastic instances: recover known results
- Compressed sensing: X_ij ~ N(0, 1) and bounded noise ‖ε‖_2 ≤ σ
- Deterministic design: X with bounded columns and ε_i ~ N(0, σ²)

  ‖X^T ε / n‖_∞ ≤ 2σ √(2 log p / n) w.h.p. ⟹ ‖θ̂ − θ*‖_2 ≤ (8σ/γ(L)) √(k log p / n)

(e.g., Candes & Tao, 2007; Meinshausen/Yu, 2007; Bickel et al., 2008)
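The high-probability claim for the deterministic design — that ‖Xᵀε/n‖_∞ stays below a constant multiple of σ√(log p / n) — can be checked by simulation. The sizes, trial count, and the exact constant below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma = 200, 500, 1.0
bound = 2 * sigma * np.sqrt(2 * np.log(p) / n)   # an illustrative w.h.p. bound

trials = 200
violations = 0
for _ in range(trials):
    X = rng.standard_normal((n, p))              # columns have norm about sqrt(n)
    eps = sigma * rng.standard_normal(n)
    score = np.max(np.abs(X.T @ eps) / n)        # ||X^T eps / n||_inf
    violations += score > bound
```

Each coordinate of Xᵀε/n is roughly N(0, σ²/n), so the maximum over p coordinates concentrates near σ√(2 log p / n), comfortably inside the bound on essentially every trial.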
9 Application: Linear regression (weak sparsity)
For some q ∈ [0, 1], say θ* belongs to the ℓ_q-ball

  B_q(R_q) := { θ ∈ R^p : Σ_{j=1}^p |θ_j|^q ≤ R_q }.

Corollary. Under RSC, any Lasso solution satisfies (w.h.p.)

  ‖θ̂ − θ*‖_2² ≤ O( σ² R_q (log p / n)^{1 − q/2} ).

New result; rate known to be minimax optimal (Raskutti et al., 2009)
10 Multivariate regression with block regularizers
Observation model: Y = XΘ* + W, with Θ* ∈ R^{p×m} supported on a subset S of rows (complement S^c).
ℓ1/ℓq-regularized group Lasso:

  Θ̂ ∈ argmin_{Θ ∈ R^{p×m}} { (1/2n) ‖Y − XΘ‖_F² + λ_n ‖Θ‖_{1,q} }

with λ_n ≥ 2 ‖X^T W / n‖_{∞,q̃}, where 1/q + 1/q̃ = 1.

Corollary. Say Θ* is supported on |S| = s rows, X satisfies RSC, and W_ij ~ N(0, σ²). Then we have

  ‖Θ̂ − Θ*‖_F ≤ (2/γ(L)) Ψ_q(S) λ_n,  where Ψ_q(S) = m^{1/q − 1/2} √s if q ∈ [1, 2), and √s if q ≥ 2.
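For q = 2, the proximal operator of the ‖·‖_{1,2} block norm shrinks whole rows at once, which is what makes group-Lasso solutions row-sparse. A small sketch (the matrix and threshold are made up for illustration):

```python
import numpy as np

def row_soft_threshold(Theta, tau):
    """Prox of tau * ||Theta||_{1,2} (the sum of row l2 norms): each row's norm
    is reduced by tau, and rows with norm below tau are zeroed entirely."""
    norms = np.linalg.norm(Theta, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Theta * scale

# Rows with small norm are eliminated as a group; strong rows survive, shrunk.
Theta = np.array([[3.0, 4.0],      # norm 5 -> shrunk to norm 4
                  [0.3, 0.4],      # norm 0.5 -> zeroed as a block
                  [0.0, 2.0]])     # norm 2 -> shrunk to norm 1
out = row_soft_threshold(Theta, tau=1.0)
```

Compare the ordinary Lasso (q = 1), whose prox thresholds entries one at a time and so cannot exploit shared row supports.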
11 Multivariate regression with block regularizers
Effect of varying q ∈ [1, ∞]:
- for q = 1, the problem reduces to an ordinary Lasso with pm parameters and sparsity sm:

  ‖Θ̂ − Θ*‖_F ≤ O( √( s m log(pm) / n ) )

- for q = 2, the rate decouples into two terms:

  ‖Θ̂ − Θ*‖_F ≤ O( √( s log p / n ) + √( s m / n ) )

where the first is a search term (find the s rows) and the second is the cost of estimating the sm parameters.
Similar rates for q = 2: Lounici et al. (2009) and Huang and Zhang (2009)
12 Application: Low-rank matrices and nuclear norm
- low-rank matrix Θ* ∈ R^{k×m} with rank r ≪ min{k, m}
- noisy/partial observations of the form y_i = ⟨⟨X_i, Θ*⟩⟩ + ε_i, i = 1, ..., n, with ε_i ~ N(0, σ²)
- for a rank-r matrix M, we have ‖M‖_1 ≤ √r ‖M‖_F (nuclear norm versus Frobenius norm)
- solve the nuclear-norm-regularized program with λ_n ≥ (2/n) ‖Σ_{i=1}^n X_i ε_i‖_2

Corollary. With regularization parameter λ_n = 16σ( √(k/n) + √(m/n) ), we have w.h.p.

  ‖Θ̂ − Θ*‖_F ≤ (32σ/γ(L)) [ √(rk/n) + √(rm/n) ].
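A sketch of one instance of this program — matrix completion with entrywise observations — solved by proximal gradient with singular-value thresholding. The sizes, observation rate, λ, and iteration count are illustrative, not the slide's choices:

```python
import numpy as np

def complete(Y, mask, lam, n_iter=500):
    """Proximal gradient for (1/2) ||mask * (Theta - Y)||_F^2 + lam * ||Theta||_*.

    The smooth part has gradient mask * (Theta - Y) (Lipschitz constant 1, so a
    unit step works); the prox of the nuclear norm soft-thresholds the singular
    values of the gradient step.
    """
    Theta = np.zeros_like(Y)
    for _ in range(n_iter):
        G = mask * (Theta - Y)
        U, s, Vt = np.linalg.svd(Theta - G, full_matrices=False)
        Theta = U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt
    return Theta

rng = np.random.default_rng(6)
k, m, r = 20, 20, 2
Theta_star = rng.standard_normal((k, r)) @ rng.standard_normal((r, m))
mask = rng.random((k, m)) < 0.6                # observe ~60% of the entries
Y = mask * Theta_star                          # noiseless partial observations
Theta_hat = complete(Y, mask, lam=0.1)
rel_err = np.linalg.norm(Theta_hat - Theta_star) / np.linalg.norm(Theta_star)
```

Even with a large fraction of entries missing, the nuclear-norm bias toward low rank fills in the unobserved entries of a well-conditioned low-rank matrix.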
13 Summary
Unified approach to convergence rates for high-dimensional estimators:
- decomposability of the regularizer r
- restricted strong convexity of loss functions

Actual rates determined by:
- noise measured in the dual function r*
- subspace constant Ψ in moving from r to the error norm d
- restricted strong convexity constant

Recovered some known results as corollaries:
- Lasso with exact sparsity
- multivariate group Lasso
- inverse covariance matrix estimation

Derived new results on:
- low-rank matrix estimation
- approximately sparse models
- other models?
More information