High-dimensional covariance estimation based on Gaussian graphical models

1 High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26, 2011 Joint work with Philipp Rütimann, Min Xu, and Peter Bühlmann

2 Problem definition Want to estimate the covariance matrix of a Gaussian distribution: e.g., stock prices. Take a random sample of vectors X^(1), …, X^(n) i.i.d. ~ N_p(0, Σ_0), where p is understood to depend on n. Let Θ_0 := Σ_0^{-1} denote the concentration matrix. Sparsity: certain elements of Θ_0 are assumed to be zero. Task: use the sample to obtain a set of zeros, and then an estimator for Θ_0 (and Σ_0) given that pattern of zeros. Show consistency in predictive risk and in estimating Θ_0 and Σ_0 as n, p → ∞

3 Gaussian graphical model: representation Let X = (X_1, …, X_p) ~ N(0, Σ_0) be a p-dimensional Gaussian random vector, where Σ_0 = Θ_0^{-1}. In the Gaussian graphical model G = (V, E_0), where |V| = p: a pair (i, j) is NOT contained in E_0 (θ_{0,ij} = 0) iff X_i ⊥ X_j | {X_k; k ∈ V \ {i, j}}. Define the predictive risk with respect to Σ_0 as R(Σ) = tr(Σ^{-1} Σ_0) + log|Σ|, which equals −2 E_0(log f_Σ(X)) up to an additive constant, where the Gaussian log-likelihood is log f_Σ(X) = −(p/2) log 2π − (1/2) log|Σ| − (1/2) X^T Σ^{-1} X

4 Penalized maximum likelihood estimators To estimate a sparse model (i.e., ‖Θ_0‖_0 is small), recent work has considered ℓ_1-penalized maximum likelihood estimators. Let ‖Θ‖_1 = ‖vec Θ‖_1 = ∑_{i,j} |θ_{ij}|. Then Θ̂_n = argmin_{Θ ≻ 0} { tr(Θ Ŝ_n) − log|Θ| + λ_n ‖Θ‖_1 }, where Ŝ_n = n^{-1} ∑_{r=1}^n X^(r) (X^(r))^T is the sample covariance. The graph Ĝ_n is determined by the non-zeros of Θ̂_n. References: Yuan-Lin 07, d'Aspremont-Banerjee-El Ghaoui 08, Friedman-Hastie-Tibshirani 08, Rothman et al. 08, Zhou-Lafferty-Wasserman 08, and Ravikumar et al. 08
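As a concrete illustration of the ℓ_1-penalized objective above, the sketch below uses scikit-learn's GraphicalLasso, one standard implementation of this estimator; the data-generating Θ_0 here is a made-up chain-graph example, not from the talk.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
p, n = 10, 200

# A sparse concentration matrix: tridiagonal, hence a chain graph.
Theta0 = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma0 = np.linalg.inv(Theta0)
X = rng.multivariate_normal(np.zeros(p), Sigma0, size=n)

# l1-penalized MLE: min_Theta tr(Theta S_n) - log|Theta| + alpha * ||Theta||_1
model = GraphicalLasso(alpha=0.1).fit(X)
Theta_hat = model.precision_

# The estimated graph G_n: non-zero off-diagonal entries of Theta_hat.
edges = [(i, j) for i in range(p) for j in range(i + 1, p)
         if abs(Theta_hat[i, j]) > 1e-8]
print(edges)
```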

5 Predictive risks Fix a point of interest f_0 = N(0, Σ_0). For a given L_n, consider a constrained set of positive definite matrices: Γ_n = {Σ : Σ ≻ 0, ‖Σ^{-1}‖_1 ≤ L_n}. Define the oracle estimator as Σ* = argmin_{Σ ∈ Γ_n} R(Σ); recall R(Σ) = tr(Σ^{-1} Σ_0) + log|Σ|. Define Σ̂_n as the minimizer of R̂_n(Σ) subject to Σ ∈ Γ_n: Σ̂_n = argmin_{Σ ∈ Γ_n} { tr(Σ^{-1} Ŝ_n) + log|Σ| } =: argmin_{Σ ∈ Γ_n} R̂_n(Σ). Here R̂_n(Σ) is the negative Gaussian log-likelihood (up to constants) and Ŝ_n is the sample covariance

6 Risk consistency Persistence Theorem: Let p < n^ξ for some ξ > 0. Given Γ_n = {Σ : Σ ≻ 0, ‖Σ^{-1}‖_1 ≤ L_n}, where L_n = o(√(n / log n)) as n → ∞, we have R(Σ̂_n) − R(Σ*_n) →_P 0, where R(Σ) = tr(Σ^{-1} Σ_0) + log|Σ| and Σ*_n = argmin_{Σ ∈ Γ_n} R(Σ). Persistence answers the asymptotic question: how large may the set Γ_n be, so that it is still possible to select empirically a predictor whose risk is close to that of the best predictor in the set (see Greenshtein-Ritov 04)

7 Non-edges act as the constraints Suppose we obtain an edge set E such that E_0 ⊆ E. Define the estimator for the concentration matrix Θ_0 as: Θ̂_n(E) = argmin_{Θ ∈ M_E} ( tr(Θ Ŝ_n) − log|Θ| ), where M_E = {Θ : Θ ≻ 0 and θ_{ij} = 0 for all (i, j) ∉ E with i ≠ j}. Theorem. Assume that 0 < ϕ_min(Σ_0) ≤ ϕ_max(Σ_0) < ∞. Suppose that E_0 ⊆ E and |E \ E_0| = O(S), where S = |E_0|. Then ‖Θ̂_n(E) − Θ_0‖_F = O_P( √((p + S) log max(n, p) / n) ). This is the same rate as Rothman et al. 08 obtain for the ℓ_1-penalized likelihood estimator
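A minimal sketch of this edge-constrained MLE; the choice of cvxpy as solver and all names are my own assumptions, not prescribed by the talk.

```python
import cvxpy as cp

def constrained_mle(S, E, p):
    """MLE of the concentration matrix with zeros forced outside the edge set E.

    S : (p, p) sample covariance (or correlation) matrix
    E : set of undirected edges (i, j), i < j, allowed to be non-zero
    """
    Theta = cp.Variable((p, p), PSD=True)  # symmetric positive semidefinite
    zeros = [Theta[i, j] == 0
             for i in range(p) for j in range(i + 1, p)
             if (i, j) not in E and (j, i) not in E]
    # Negative Gaussian log-likelihood: tr(Theta S) - log|Theta|
    problem = cp.Problem(cp.Minimize(cp.trace(Theta @ S) - cp.log_det(Theta)),
                         zeros)
    problem.solve()
    return Theta.value
```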

8 Get rid of the dependency on p Theorem. Assume that 0 < ϕ_min(Σ_0) ≤ ϕ_max(Σ_0) < ∞ and that Σ_{0,ii} = 1 for all i. Suppose we obtain an edge set E such that E_0 ⊆ E and |E \ E_0| = O(S), where S := |E_0| = ∑_{i=1}^p s^i. Then ‖Θ̂_n(E) − Θ_0‖_F = O_P( √(S log max(n, p) / n) ). In the likelihood function, Ŝ_n is replaced by the sample correlation matrix Γ̂_n = diag(Ŝ_n)^{-1/2} Ŝ_n diag(Ŝ_n)^{-1/2}

9 Main questions: How to select an edge set E so that we estimate Θ_0 well? What assumptions do we need to impose on Σ_0 or Θ_0? How does n scale with p, |E|, or the maximum node degree deg(G)? What if some edges have very small weights? How to ensure that |E \ E_0| is small? How does the edge-constrained maximum likelihood estimator behave with respect to E_0 \ E and E \ E_0?

10 Outline Introduction The regression model The method Theoretical results Conclusion

11 A Regression Model We assume a multivariate Gaussian model X = (X_1, …, X_p) ~ N_p(0, Σ_0), where Σ_{0,ii} = 1. Consider a regression formulation of the model: for all i = 1, …, p, X_i = ∑_{j ≠ i} β^i_j X_j + V_i, where β^i_j = −θ_{0,ij} / θ_{0,ii}, and V_i ~ N(0, σ²_{V_i}) is independent of {X_j; j ≠ i}; we assume there exists v² > 0 such that Var(V_i) = 1/θ_{0,ii} ≥ v² for all i. Recall X_i ⊥ X_j | {X_k; k ∈ V \ {i, j}} ⟺ θ_{0,ij} = 0 ⟺ β^i_j = 0 and β^j_i = 0

12 Want to recover the support of β^i Take a random sample of size n and use it to estimate β^i for all i; that is, for each variable X_i we have X_i = X_{∖i} β^i + ε, with X_i ∈ R^n, X_{∖i} ∈ R^{n×(p−1)}, β^i ∈ R^{p−1}, where we allow p > n, that is, high-dimensional data X. Lasso (Tibshirani 96), a.k.a. Basis Pursuit (Chen, Donoho, and Saunders 98, and others): β̂^i = argmin_β ‖X_i − X_{∖i} β‖²_2 / (2n) + λ_n ‖β‖_1
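A sketch of one nodewise Lasso regression using scikit-learn, whose Lasso objective uses the same 1/(2n) scaling as the display above; the function name and interface are mine.

```python
import numpy as np
from sklearn.linear_model import Lasso

def nodewise_lasso(X, i, lam):
    """Regress X_i on all other columns with an l1 penalty.

    X   : (n, p) data matrix
    i   : index of the response variable
    lam : penalty level lambda_n
    Returns the (p,) coefficient vector with beta[i] = 0.
    """
    n, p = X.shape
    others = [j for j in range(p) if j != i]
    # sklearn's Lasso minimizes ||y - Zb||^2 / (2n) + lam * ||b||_1
    fit = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, i])
    beta = np.zeros(p)
    beta[others] = fit.coef_
    return beta
```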

13 Meinshausen and Bühlmann 06 Perform p regressions using the Lasso to obtain p vectors of regression coefficients β̂^1, …, β̂^p, where for each i, β̂^i = {β̂^i_j; j ∈ {1, …, p} \ {i}}. Then estimate the edge set by the OR rule: estimate an edge between nodes i and j ⟺ β̂^i_j ≠ 0 or β̂^j_i ≠ 0. Under sparsity and neighborhood stability conditions, they show P(Ê_n = E_0) → 1 as n → ∞

14 Sparsity At row i, define s^i_{0,n} as the smallest integer such that ∑_{j=1, j≠i}^p min{θ²_{0,ij}, λ² θ_{0,ii}} ≤ s^i_{0,n} λ² θ_{0,ii}. The essential sparsity s^i_{0,n} at row i counts all (i, j) such that |θ_{0,ij}| ≥ λ √θ_{0,ii}, equivalently |β^i_j| ≥ λ σ_{V_i}. Define S_{0,n} = ∑_{i=1}^p s^i_{0,n} as the essential sparsity of the graph, which counts all (i, j) such that |θ_{0,ij}| ≥ λ min(√θ_{0,ii}, √θ_{0,jj}), equivalently |β^i_j| ≥ λ σ_{V_i} or |β^j_i| ≥ λ σ_{V_j}. Aim to keep at most 2S_{0,n} edges in E
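A sketch computing the per-row essential sparsity directly from its defining inequality above; the function and variable names are mine.

```python
import numpy as np

def essential_sparsity(Theta0, lam):
    """s^i_{0,n} per row i: the smallest integer s with
    sum_{j != i} min(theta_ij^2, lam^2 * theta_ii) <= s * lam^2 * theta_ii."""
    p = Theta0.shape[0]
    s = np.empty(p, dtype=int)
    for i in range(p):
        theta_ii = Theta0[i, i]
        off = np.delete(Theta0[i], i)  # row i without the diagonal entry
        total = np.minimum(off ** 2, lam ** 2 * theta_ii).sum()
        s[i] = int(np.ceil(total / (lam ** 2 * theta_ii)))
    return s  # S_{0,n} = s.sum()
```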

15 Defining 2s_0 Let 0 ≤ s_0 ≤ s be the smallest integer such that ∑_{i=1}^{p−1} min(β²_i, λ²σ²) ≤ s_0 λ²σ², where λ = √(2 log p / n). If we order the β_j's in decreasing order of magnitude, |β_1| ≥ |β_2| ≥ … ≥ |β_{p−1}|, then |β_j| < λσ for all j > s_0. [Figure: ordered coefficient magnitudes for p = 512, n = 500, s = 96, σ = 1, with threshold levels near σ√(2 log p / n) and σ√(log p / n) marking s_0, 2s_0, and s.] This notion of sparsity has been used in linear regression (Candès-Tao 07, Z09, Z10)

16 Selection: individual neighborhood We use the Lasso in combination with thresholding (Z09, Z10) for inferring the graph. Let λ = √(2 log p / n). For each of the nodewise regressions, obtain an initial estimator β̂^i_init using the Lasso with penalty parameter λ_n ∝ λ: β̂^i_init = argmin_{β^i} (1/2n) ∑_{r=1}^n (X^(r)_i − ∑_{j≠i} β^i_j X^(r)_j)² + λ_n ∑_{j≠i} |β^i_j|. Threshold β̂^i_init at a level τ ∝ λ to get the zero set: let D_i = {j : j ≠ i, |β̂^i_{j,init}| < τ}

17 Selection: joining the neighborhoods Define the total zero set as D = {(i, j) : i ≠ j, j ∈ D_i and i ∈ D_j}. Select the edge set E := {(i, j) : i, j = 1, …, p, i ≠ j, (i, j) ∉ D}. That is, the edge set joins the thresholded neighborhoods across all nodes in the graph, i.e., the OR rule (see the sketch below). This reflects the idea that the essential sparsity S_{0,n} of the graph counts all (i, j) such that |θ_{0,ij}| ≥ λ min(√θ_{0,ii}, √θ_{0,jj})
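Putting slides 16-17 together, a sketch of the whole selection step: nodewise Lasso fits, thresholding at τ, then joining neighborhoods by the OR rule. `nodewise_lasso` is the hypothetical helper sketched after slide 12.

```python
import numpy as np

def select_edges(X, lam, tau):
    """Gelato selection: nodewise Lasso + thresholding + OR rule.

    Returns the selected edge set E as a set of pairs (i, j), i < j.
    """
    n, p = X.shape
    # B[i, j] holds the Lasso coefficient beta^i_j of X_j in the regression of X_i
    B = np.vstack([nodewise_lasso(X, i, lam) for i in range(p)])
    keep = np.abs(B) >= tau            # survives thresholding at tau
    E = {(i, j) for i in range(p) for j in range(i + 1, p)
         if keep[i, j] or keep[j, i]}  # OR rule: either direction survives
    return E
```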

18 Example: a star graph Construct Σ_0 from a model used in Ravikumar et al. 08: node 1 is the hub of a star whose spokes are the next s nodes, and the remaining nodes are isolated. Within the star block, Σ_{0,ii} = 1, Σ_{0,1j} = ρ, and Σ_{0,jk} = ρ² for spokes j ≠ k; entries outside the block are 0: Σ_0 = [ 1 ρ ρ ρ … 0; ρ 1 ρ² ρ² …; ρ ρ² 1 ρ² …; ρ ρ² ρ² 1 …; ⋮ ] (p × p)
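A sketch constructing this star-graph covariance, assuming (as the matrix fragment on the slide suggests) that the star spans the first s + 1 nodes and the remaining nodes are isolated; names and the 0-based hub index are mine.

```python
import numpy as np

def star_covariance(p, s, rho):
    """Covariance of a star graph: node 0 is the hub of s spokes (nodes 1..s)."""
    Sigma = np.eye(p)
    Sigma[0, 1:s + 1] = Sigma[1:s + 1, 0] = rho
    # Two spokes are correlated only through the hub: rho * rho.
    block = np.full((s, s), rho ** 2)
    np.fill_diagonal(block, 1.0)
    Sigma[1:s + 1, 1:s + 1] = block
    return Sigma

Sigma0 = star_covariance(p=128, s=8, rho=0.5)  # parameters from slide 19
```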

19 Example: original graph p = 128, n = 96, s = 8, ρ = 0.5; λ_n = 2√(2 log p / n), τ = √(log p / n) [figure-only slide]

20-28 Example: estimated graph [a sequence of figure-only slides showing the estimated graph for n = 96 with λ_n = 2√(2 log p / n); the final slide shows the result of thresholding at τ = √(log p / n)]

29 Gelato: estimation of edge weights Given a graph with edge set E, we estimate the concentration matrix by maximum likelihood. Denote the sample correlation matrix by Γ̂_n = diag(Ŝ_n)^{-1/2} Ŝ_n diag(Ŝ_n)^{-1/2}. The estimator for the concentration matrix Θ_0 is: Θ̂_n(E) = argmin_{Θ ∈ M_{p,E}} ( tr(Θ Γ̂_n) − log|Θ| ), where M_{p,E} = {Θ ∈ R^{p×p} : Θ ≻ 0 and θ_{ij} = 0 for all (i, j) ∈ D} and D := {(i, j) : i, j = 1, …, p, (i, j) ∉ E and i ≠ j}
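In code, this step just feeds the sample correlation matrix into the edge-constrained MLE; `constrained_mle` is the hypothetical helper sketched after slide 7, and the function name is mine.

```python
import numpy as np

def gelato_weights(X, E):
    """Estimate Theta_0 by MLE over the selected edge set E (correlation scale)."""
    n, p = X.shape
    S = X.T @ X / n                      # sample covariance (mean-zero data)
    d = 1.0 / np.sqrt(np.diag(S))
    Gamma = d[:, None] * S * d[None, :]  # sample correlation matrix
    return constrained_mle(Gamma, E, p)
```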

30 Likelihood equations Let diag(Ŝ_n)^{1/2} = diag(σ̂_1, …, σ̂_p). The following relationships hold for the maximum likelihood estimate Θ̂_n and Σ̃_n = (Θ̂_n)^{-1}: Σ̃_{n,ii} = 1 for i = 1, …, p; Σ̃_{n,ij} = Γ̂_{n,ij} = Ŝ_{n,ij} / (σ̂_i σ̂_j) for (i, j) ∈ E; and Θ̂_{n,ij} = 0 for (i, j) ∈ D. This is also known as a positive definite matrix completion problem
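These equations double as a sanity check on any solver's output; a sketch, with hypothetical names, checking them against the fitted matrix from the previous slide's helper:

```python
import numpy as np

def check_likelihood_equations(Theta_hat, Gamma, E, tol=1e-5):
    """Verify the stationarity conditions of the edge-constrained MLE."""
    Sigma_tilde = np.linalg.inv(Theta_hat)
    p = Theta_hat.shape[0]
    assert np.allclose(np.diag(Sigma_tilde), 1.0, atol=tol)
    for i in range(p):
        for j in range(i + 1, p):
            if (i, j) in E or (j, i) in E:
                # On edges, the fitted covariance matches the sample correlation.
                assert abs(Sigma_tilde[i, j] - Gamma[i, j]) < tol
            else:
                # Off edges, the concentration matrix is exactly zero.
                assert abs(Theta_hat[i, j]) < tol
```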

31 Set of assumptions Let c, C be absolute constants. (A0) The size of the neighborhood for each node i ∈ V is bounded by an integer s < p, and the sample size satisfies n ≥ Cs log(cp/s). (A1) The dimension and the number of sufficiently strong edges S_{0,n} satisfy: p = o(e^{cn}) for some 0 < c < 1 and S_{0,n} = o(n / log max(n, p)) as n → ∞. (A2) The minimal and maximal eigenvalues of Σ_0 are bounded away from 0 and ∞, and Σ_{0,ii} = 1 for all i

32 The main theorem: selection Assume that (A0) and (A2) hold. Let λ = √(2 log p / n). Let d, C, D depend on the sparse and restricted eigenvalues of Σ_0, and let λ_n = dλ and τ = Dλ_n be appropriately chosen. Denote the estimated edge set by E = Ê_n(λ_n, τ). Then with high probability, |E| ≤ 2S_{0,n} with |E \ E_0| ≤ S_{0,n}, and ‖Θ_{0,D}‖_F ≤ Cλ_n min{ √(S_{0,n} max_{i=1,…,p} θ²_{0,ii}), √(s_0) ‖diag(Θ_0)‖_F }, where s_0 = max_i s^i_{0,n} denotes the maximum essential node degree

33 Example: p = 128, s = 12, ρ = 0.5; λ_n = 2√(2 log p / n), τ = f √(2 log p / n). [Figure: false positive rate (FPR) and false negative rate (FNR) versus n, comparing the plain Lasso with thresholding at f = 0.30, f = 0.35, and one further value of f.]

34 The main theorem: estimation Assume in addition that (A1) holds. Then for Θ̂_n and Σ̂_n = (Θ̂_n)^{-1}: ‖Θ̂_n − Θ_0‖_F = O_P( √(S_{0,n} log max(n, p) / n) ), ‖Σ̂_n − Σ_0‖_F = O_P( √(S_{0,n} log max(n, p) / n) ), and R(Θ̂_n) − R(Θ_0) = O_P( S_{0,n} log max(n, p) / n ). Since the operator norm is bounded by the Frobenius norm, also ‖Θ̂_n − Θ_0‖_2, ‖Σ̂_n − Σ_0‖_2 = O_P( √(S_{0,n} log max(n, p) / n) )

35 Obtaining an edge set E Let S^i = {j : j ≠ i, β^i_j ≠ 0} and s^i = |S^i|. Let D, λ_n, C be the same as in the main theorem. For each of the nodewise regressions, we apply the same thresholding rule to obtain a subset I^i as follows: I^i = {j : j ≠ i, |β̂^i_{j,init}| ≥ τ = Dλ_n}, and D_i := {1, …, i − 1, i + 1, …, p} \ I^i. Then we have with high probability, |I^i| ≤ 2s^i_0 and |I^i ∪ S^i| ≤ |S^i| + s^i_0, and ‖Θ^i_{0,D_i}‖_2 ≤ Cλ_n √(θ_{0,ii} s^i_0), i.e., ‖β^i_{0,D_i}‖_2 ≤ Cλ_n √(s^i_0). The proof follows from results in Z10 on the thresholded Lasso estimator

36 Oracle inequalities for the Lasso Theorem (Z10). Under (A0) and (A2), for all nodewise regressions, the Lasso estimator achieves squared ℓ_2 loss of O_P(s_0 σ² log p / n). [Figure: the same ordered-coefficient plot as on slide 15, for p = 512, n = 500, s = 96, σ = 1, marking s_0, 2s_0, and s.]

37 Constructing a pivot point Now clearly, by the OR rule, we have E = {(i, j) : j ∈ I^i, i = 1, …, p} and |E| ≤ ∑_{i=1}^p |I^i| ≤ ∑_{i=1}^p 2s^i_0 = 2S_0. Given a 2S_0-sparse set of edges E, define a sparse approximation Θ̃_0 of Θ_0 which is identical to Θ_0 on E and the diagonal, and zero elsewhere: Θ̃_0 = diag(Θ_0) + Θ_{0,E} = diag(Θ_0) + Θ_{0, E ∩ E_0}. Then ‖Θ̃_0‖_0 ≤ p + 2|E ∩ E_0| ≤ p + 4S_0, with (s + 1)-sparse row (column) vectors (see the sketch below)
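The pivot construction is mechanical; a minimal sketch, with names of mine:

```python
import numpy as np

def pivot(Theta0, E):
    """Sparse approximation: Theta0 on the diagonal and on E, zero elsewhere."""
    T = np.diag(np.diag(Theta0))
    for (i, j) in E:
        T[i, j] = Theta0[i, j]
        T[j, i] = Theta0[j, i]
    return T
```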

38 Θ̃_0 as a sparse approximation The bias is small: ‖Θ̃_0 − Θ_0‖_F ≤ C max_{i=1,…,p}(θ_{0,ii}) √(S_0 log p / n). For the norms q = 1, 2, ∞: ‖Θ̃_0 − Θ_0‖_q ≤ C max_{i=1,…,p}(θ_{0,ii}) s_0 λ_n. Note that each row vector would be 2s_0-sparse if we applied the AND rule, however at the cost of a larger bias

39 Θ̃_0 as a pivot The sparsity and the small bias allow us to bound ‖Θ̂_n(E) − Θ̃_0‖_F = O_P( √(S_{0,n} log max(n, p) / n) ) =: r_n, where we use the fact that both the estimator Θ̂_n(E) and the pivot Θ̃_0 are sparse. By the triangle inequality, we conclude that ‖Θ̂_n(E) − Θ_0‖_F ≤ ‖Θ̂_n(E) − Θ̃_0‖_F + ‖Θ̃_0 − Θ_0‖_F = O_P( √(S_{0,n} log max(n, p) / n) )

40 Generalization of the estimation step Assume that (A1) and (A2) hold. Let σ²_max := max_i Σ_{0,ii} < ∞ and σ²_min := min_i Σ_{0,ii} > 0. Let W = diag(Σ_0)^{1/2}. Suppose that we obtain an edge set E such that |E| = lin(S_{0,n}) is a linear function of S_{0,n}, and that for Θ̃_0 = diag(Θ_0) + Θ_{0,E} we have ‖Θ̃_0 − Θ_0‖_F ≤ C √(2S_{0,n} log p / n). We note that this is equivalent to assuming ‖Ω̃_0 − Ω_0‖_F ≤ C √(2S_{0,n} log p / n), where Ω_0 = W Θ_0 W and Ω̃_0 = W Θ̃_0 W

41 Generalization of the estimation step Theorem. Suppose the sample size satisfies n > M S_{0,n} log max(n, p) for a sufficiently large constant M. Then ‖Ω̂_n(E) − Ω̃_0‖_F = O_P( √(2S_{0,n} log max(n, p) / n) ), where Ω̂_n(E) is the maximum likelihood estimator based on the sample correlation matrix Γ̂_n: Ω̂_n(E) = argmin_{Ω ∈ M_{p,E}} ( tr(Ω Γ̂_n) − log|Ω| )

42 Generalization of the estimation step Given Ŵ = diag(Ŝ_n)^{1/2} and Ω̂_n(E), compute Θ̂_n = Ŵ^{-1} Ω̂_n(E) Ŵ^{-1} and Σ̂_n = Ŵ (Ω̂_n(E))^{-1} Ŵ, for which the following hold: Σ̂_{n,ij} = Ŝ_{n,ij} for (i, j) ∈ E ∪ {(i, i) : i = 1, …, p}, and Θ̂_{n,ij} = 0 for (i, j) ∈ D. Following the bound on Ω̂_n(E) and arguments in Rothman et al. (2008), ‖Θ̂_n − Θ_0‖_2 = O_P( √(S_{0,n} log max(n, p) / n) )
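Since Ŵ is diagonal, this back-transformation is a two-line rescaling; a sketch, where `Omega_hat` stands for the correlation-scale MLE from the previous slide:

```python
import numpy as np

def rescale(Omega_hat, S):
    """Map the correlation-scale estimate back to the covariance scale."""
    w = np.sqrt(np.diag(S))                                 # W_hat = diag(S_n)^{1/2}
    Theta_hat = Omega_hat / np.outer(w, w)                  # W^{-1} Omega W^{-1}
    Sigma_hat = np.linalg.inv(Omega_hat) * np.outer(w, w)   # W Omega^{-1} W
    return Theta_hat, Sigma_hat
```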

43 Generalization of the estimation error For the Frobenius norm and the risk to converge to zero, (A1) is to be replaced by: p ≤ n^c for some 0 < c < 1 and p + S_{0,n} = o(n / log max(n, p)) as n → ∞. In this case, we have ‖Θ̂_n − Θ_0‖_F = O_P( √((p + S_{0,n}) log max(n, p) / n) ), ‖Σ̂_n − Σ_0‖_F = O_P( √((p + S_{0,n}) log max(n, p) / n) ), and R(Θ̂_n) − R(Θ_0) = O_P( (p + S_{0,n}) log max(n, p) / n ). We could achieve these rates with Θ̂_n(E) = argmin_{Θ ∈ M_{p,E}} ( tr(Θ Ŝ_n) − log|Θ| )

44 Conclusion Gelato separates the tasks of model selection and (inverse) covariance estimation. Thresholding plays a key role in obtaining a sparse approximation of the graph with small bias from a small sample. Under stronger conditions on the sample size, convergence rates in terms of operator and Frobenius norms and KL divergence are established. The method is feasible in high dimensions: p > n is allowed

45 Related work on inverse/covariance estimation Regression-based selection/estimation: Meinshausen-Bühlmann 06, Peng et al. 09, Yuan 10, Verzelen 10, Cai-Liu-Luo 11. Penalized likelihood methods based on the ℓ_1-norm penalty: Yuan-Lin 07, d'Aspremont-Banerjee-El Ghaoui 08, Friedman-Hastie-Tibshirani 07, Rothman et al. 08, Zhou-Lafferty-Wasserman 08, Ravikumar et al. 08. Nonconvex: Lam-Fan 09. Sparse covariance selection/estimation: Bickel and Levina 06, 08; El Karoui 08; Levina and Vershynin 10; and more

46 References
RUDELSON, M. and ZHOU, S. (2011). Reconstruction from anisotropic random measurements. ArXiv preprint.
ZHOU, S. (2009). Restricted eigenvalue conditions on subgaussian random matrices. ArXiv preprint, v2.
ZHOU, S. (2009). Thresholding procedures for high dimensional variable selection and statistical estimation. In Advances in Neural Information Processing Systems 22. MIT Press.
ZHOU, S. (2010). Thresholded Lasso for high dimensional variable selection and statistical estimation. ArXiv preprint.
ZHOU, S., RÜTIMANN, P., XU, M. and BÜHLMANN, P. (2011). High-dimensional covariance estimation based on Gaussian graphical models. ArXiv preprint, v2.
