Permutation-invariant regularization of large covariance matrices. Liza Levina


Slide 1: Permutation-invariant regularization of large covariance matrices
Liza Levina, Department of Statistics, University of Michigan
Joint work with Adam Rothman, Amy Wagaman, Ji Zhu (University of Michigan) and Peter Bickel (UC Berkeley)

Slide 2: Outline
Introduction
Motivation: a result for ordered variables
Covariance Thresholding
Generalized Thresholding
Discovering Variable Orderings
Regularization of the inverse

Slide 3: Why estimate covariance?
Principal component analysis (PCA)
Linear or quadratic discriminant analysis (LDA/QDA)
Inferring independence and conditional independence in Gaussian graphical models
Inference about the mean (e.g., a longitudinal mean response curve)
The covariance itself is not always the end goal: PCA requires estimation of the eigenstructure; LDA/QDA and conditional independence require the inverse (a.k.a. the concentration or precision matrix).

Slide 4: What's wrong with the sample covariance?
Observe X_1, ..., X_n, i.i.d. p-variate random variables, and form

\hat\Sigma = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar X)^T

It is the MLE, (almost) unbiased, and well-behaved (and well studied) for fixed p and growing n. But it is very noisy if p is large:
Eigenvalues are overdispersed unless p/n → 0 (Marčenko-Pastur, 1967; Wachter, 1978; Geman, 1980; Bai and Yin, 1993; Johnstone, 2001; Paul, 2004; El Karoui, 2007)
Eigenvectors are not consistent (Johnstone and Lu, 2004; Paul, 2006)
LDA breaks down if p/n → ∞ (Bickel and Levina, 2004)
Singular if p > n
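To make the overdispersion concrete, here is a small simulation sketch (not from the talk; the sample sizes and seed are my choices): with a true identity covariance, the sample eigenvalues spread far away from 1, and many are exactly zero once p > n.

```python
import numpy as np

# Minimal sketch: eigenvalue overdispersion of the sample covariance when p is
# comparable to n. The true covariance is the identity, so every true eigenvalue is 1.
rng = np.random.default_rng(0)
n, p = 100, 200                      # "large p" regime, with p > n
X = rng.standard_normal((n, p))

Xc = X - X.mean(axis=0)              # center each variable
S = Xc.T @ Xc / n                    # sample covariance, as in the formula above

eigvals = np.linalg.eigvalsh(S)
print("sample eigenvalue range: [%.2f, %.2f]" % (eigvals.min(), eigvals.max()))
print("eigenvalues numerically zero: %d of %d" % (np.sum(eigvals < 1e-10), p))
```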

Slide 5: Desirable features for a covariance estimator
Positive definite and well-conditioned
Inverse easily available (in some applications)
Sparse covariance or inverse (in some applications)
Ultimately, consistency and good rates for large p

Slide 6: Two general classes of estimators
Variables have a natural notion of distance (time series, spatial data, longitudinal data, spectroscopy): exploit the weaker dependence between distant variables.
The variable order is meaningless (genetics, finance, social sciences): estimators should be invariant to variable permutations.

Slide 7: Outline (repeated; next section: Motivation, a result for ordered variables)

Slide 8: Estimators for ordered variables
Exploit the weaker dependence between distant variables.
Banding or tapering the covariance (Bickel and Levina, 2008; Furrer and Bengtsson, 2007): make the entries far from the diagonal smaller.
The concentration matrix is usually regularized via its Cholesky factor (Wu and Pourahmadi, 2003; Bickel and Levina, 2008; Huang et al., 2006; Levina et al., 2008).

Slide 9: A result for banding/tapering (Bickel and Levina, 2008)
Replace Σ̂ with Σ̂ ∘ R, where ∘ denotes the Schur (element-wise) product.
If R is positive definite, so is Σ̂ ∘ R.
Banding: R_k(i, j) = 1(|i − j| ≤ k) gives B_k(Σ̂) (not necessarily positive definite).
Other, smoother tapers guarantee positive definiteness.
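A small sketch of the Schur-product construction (the function names and the particular linear taper are my choices, not the talk's):

```python
import numpy as np

# Banding and tapering a sample covariance via a Schur (element-wise) product
# with a weight matrix R.
def band_weights(p, k):
    """R_k(i, j) = 1 if |i - j| <= k, else 0 (banding; may lose positive definiteness)."""
    idx = np.arange(p)
    return (np.abs(idx[:, None] - idx[None, :]) <= k).astype(float)

def taper_weights(p, k):
    """A smoother taper: weights decay linearly from 1 to 0 between lag k and 2k."""
    lag = np.abs(np.arange(p)[:, None] - np.arange(p)[None, :])
    return np.clip(2.0 - lag / float(k), 0.0, 1.0)

def band(S, k):
    return S * band_weights(S.shape[0], k)   # Schur product of Sigma-hat with R_k

# usage: S is any p x p sample covariance with a meaningful variable order
S = np.cov(np.random.default_rng(1).standard_normal((50, 20)), rowvar=False)
S_banded = band(S, k=3)
S_tapered = S * taper_weights(S.shape[0], k=3)
```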

Slide 10: All results are in the operator norm (a.k.a. matrix 2-norm): for symmetric M,

\|M\| = \max_i |\lambda_i(M)|

Convergence in operator norm implies convergence of eigenvalues and eigenvectors.
The result is uniform over a class of approximately bandable covariance matrices as p, n → ∞:

U(\alpha, M, C) = \Big\{ \Sigma : \lambda_{\max}(\Sigma) \le M,\ \max_j \sum_{i:\,|i-j|>k} |\sigma_{ij}| \le C\, k^{-\alpha} \text{ for all } k > 0 \Big\}

Slide 11: Theorem 1. If X is Gaussian and k ≍ ((log p)/n)^{−1/(2(α+1))}, then, uniformly on Σ ∈ U(α, M, C),

\| B_k(\hat\Sigma_p) - \Sigma_p \| = O_P\left( \left( \frac{\log p}{n} \right)^{\alpha/(2(\alpha+1))} \right)

For approximately bandable matrices, the banded estimator is consistent if (log p)/n → 0.
The theorem also holds for general tapering.
If we assume λ_min(Σ) ≥ ε > 0, the same rate holds for the inverse.
In practice the tuning parameter k is selected by cross-validation.

Slide 12: Outline (repeated; next section: Covariance Thresholding)

Slide 13: Covariance thresholding (Bickel and Levina, 2007; El Karoui, 2007)
If the variable order is meaningless, estimators should be invariant to variable permutations.
Consider thresholding as a permutation-invariant analogue of banding:

T_t(\hat\Sigma) = \big[\, \hat\sigma_{ij}\, 1(|\hat\sigma_{ij}| \ge t) \,\big]

This has been done in practice for a long time.
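A minimal sketch of the hard-thresholding operator T_t applied to a sample covariance (leaving the diagonal untouched is my convention here, not something stated on the slide):

```python
import numpy as np

def hard_threshold(S, t, keep_diag=True):
    """T_t(S)[i, j] = S[i, j] * 1(|S[i, j]| >= t); variances optionally left alone."""
    T = np.where(np.abs(S) >= t, S, 0.0)
    if keep_diag:
        np.fill_diagonal(T, np.diag(S))
    return T

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 50))
S = np.cov(X, rowvar=False)
S_thresh = hard_threshold(S, t=0.2)
print("nonzero off-diagonal entries:", np.count_nonzero(S_thresh) - S.shape[0])
```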

Slide 14: A class of approximately sparse matrices (a permutation-invariant analogue of "approximately bandable"): for 0 ≤ q < 1,

U_\tau\big( q, M, c_0(p) \big) = \Big\{ \Sigma : \sigma_{ii} \le M,\ \sum_{j=1}^{p} |\sigma_{ij}|^q \le c_0(p) \text{ for all } i \Big\}

If q = 0, U_τ(0, M, c_0(p)) is a class of sparse matrices (the sum then counts the nonzero entries in each row).
U ⊂ U_τ for appropriately chosen constants (approximately bandable matrices are approximately sparse).

Slide 15: Theorem 2. Suppose X is Gaussian. Uniformly on U_τ(q, c_0(p), M), for a sufficiently large constant M', if

t = M' \sqrt{\frac{\log p}{n}} \qquad \text{and} \qquad \frac{\log p}{n} = o(1),

then

\| T_t(\hat\Sigma_p) - \Sigma_p \| = O_P\left( c_0(p) \left( \frac{\log p}{n} \right)^{(1-q)/2} \right)

For approximately sparse matrices, the thresholded estimator is consistent if (log p)/n → 0 (and c_0(p) does not grow too fast).
If we assume λ_min(Σ) ≥ ε > 0, the same rate holds for the inverse.
The threshold is chosen by cross-validation.

Slide 16: Climate data example
January mean temperatures from 1850 to 2006 (n = 157).
The region covered by the p = 2592 recording stations extends from … to … degrees longitude, and from … to 87.5 degrees latitude.
Not all the stations were in place for the entire period.
Look at EOFs (empirical orthogonal functions), the principal components of the spatial covariance matrix.

Slide 17: Sample covariance EOFs (maps over longitude and latitude): EOF #1 (14.82% variance), EOF #2 (8.68% variance).

Slide 18: Thresholded covariance EOFs (maps over longitude and latitude): EOF #1 (8.49% variance), EOF #2 (7.82% variance).

Slide 19: Outline (repeated; next section: Generalized Thresholding)

Slide 20: Generalized thresholding (Rothman, Levina, Zhu, 2008)
Theorem 2 applies to hard thresholding. In wavelets and regression, smoother penalties often do better.
We define generalized thresholding by a function s_t : R → R such that
(i) |s_t(z)| ≤ |z| (shrinkage)
(ii) s_t(z) = 0 for |z| ≤ t (thresholding)
(iii) |s_t(z) − z| ≤ t for all z ∈ R and t ≥ 0 (limited shrinkage)
It also makes sense to require s_t(z) = sign(z) s_t(|z|).
The covariance estimator is S_t(Σ̂) = [s_t(σ̂_ij)].

Slide 21: Examples of generalized thresholding
Hard thresholding: s_t(z) = z · 1(|z| > t)
Soft thresholding (Donoho and Johnstone, 1994, 1995): s_t(z) = sign(z)(|z| − t)_+
Soft thresholding is equivalent to minimizing a quadratic loss with a lasso penalty:

S_t(\hat\Sigma) = \arg\min_{S} \sum_{i,j} (s_{ij} - \hat\sigma_{ij})^2 + t \sum_{i,j} |s_{ij}|

Slide 22: Comparison of generalized thresholding operators: plot of s_t(z) against z for the hard, adaptive lasso, soft, and SCAD operators.

Slide 23: More examples
SCAD (Fan, 1997; Fan and Li, 2001): a linear interpolation between soft and hard thresholding; equivalent to quadratic loss with the SCAD penalty.
Adaptive lasso (Zou, 2006), which goes back to Breiman (1995): equivalent to quadratic loss with a weighted lasso penalty,

S_t(\hat\Sigma) = \arg\min_{S} \sum_{i,j} (s_{ij} - \hat\sigma_{ij})^2 + t \sum_{i,j} w_{ij} |s_{ij}|

with weights w_ij = |σ̂_ij|^{−η}, which penalize small elements more.
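A sketch of these operators applied entrywise to a sample covariance (the parameter names t, eta, a and the choice to leave the diagonal unshrunk are mine):

```python
import numpy as np

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def hard(z, t):
    return z * (np.abs(z) > t)

def adaptive_lasso(z, t, eta=1.0):
    # element-wise solution of the weighted-lasso problem: shrink by t * |z|^(-eta)
    absz = np.abs(z)
    with np.errstate(divide="ignore"):
        shrink = t * absz ** (-eta)
    return np.sign(z) * np.maximum(absz - shrink, 0.0)

def scad(z, t, a=3.7):
    # SCAD thresholding: soft near t, linear interpolation, identity beyond a*t
    absz = np.abs(z)
    return np.where(absz <= 2 * t, soft(z, t),
           np.where(absz <= a * t, ((a - 1) * z - np.sign(z) * a * t) / (a - 2), z))

def generalized_threshold(S, s_t, keep_diag=True, **kwargs):
    T = s_t(S, **kwargs)
    if keep_diag:
        np.fill_diagonal(T, np.diag(S))   # leave variances unshrunk (my convention)
    return T
```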

Slide 24: Theorem 3.
(a) Under the same conditions, the hard-thresholding result (Theorem 2) holds for generalized thresholding.
(b) Sparsistency: with probability tending to 1, s_t(σ̂_ij) = 0 for all (i, j) such that σ_ij = 0.
(c) If we additionally assume that all non-zero elements satisfy |σ_ij| > τ, where √n (τ − t) → ∞, then, with probability tending to 1, s_t(σ̂_ij) ≠ 0 for all (i, j) such that σ_ij ≠ 0.
Tuning parameters have to be selected by cross-validation.
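A sketch of the random-split cross-validation used throughout the talk to pick tuning parameters, here for the hard threshold t (the split sizes, number of splits, and the squared Frobenius validation loss are my choices, not necessarily the talk's):

```python
import numpy as np

def cv_threshold(X, t_grid, n_splits=10, rng=None):
    """Pick a threshold by comparing the thresholded training-split covariance
    to the validation-split sample covariance."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    n1 = int(n * (1 - 1 / np.log(n)))            # training-fold size (one common convention)
    losses = np.zeros(len(t_grid))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        S1 = np.cov(X[perm[:n1]], rowvar=False)   # covariance on the training split
        S2 = np.cov(X[perm[n1:]], rowvar=False)   # covariance on the validation split
        for i, t in enumerate(t_grid):
            T = np.where(np.abs(S1) >= t, S1, 0.0)
            losses[i] += np.sum((T - S2) ** 2)    # squared Frobenius loss
    return t_grid[np.argmin(losses)]

# usage
X = np.random.default_rng(3).standard_normal((120, 40))
print("selected threshold:", cv_threshold(X, t_grid=np.linspace(0.0, 0.5, 11)))
```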

Slide 25: Some generalized thresholding results
SCAD and adaptive lasso do somewhat better than hard and soft thresholding, as measured by operator-norm loss.
Sparsity: look at the true positive rate (fraction of non-zeros estimated as non-zero) versus the false positive rate (fraction of zeros estimated as non-zero).
Example: MA(1) (tri-diagonal matrix, σ_ii = 1, σ_{i,i−1} = 0.3).

Slide 26: True positive rate versus false positive rate for soft, hard, adaptive lasso, and SCAD thresholding, at p = 30 and p = 200.

Slide 27: Outline (repeated; next section: Discovering Variable Orderings)

Slide 28: Discovering Variable Orderings (Wagaman and Levina, 2007)
Rate comparisons, simulations, and intuition suggest that if a structured order exists, it is better to use it than to ignore it.
Goal: reorder the variables to make the covariance matrix as close to bandable as possible.
Main idea: use correlations as a measure of similarity and embed the variables in one dimension. The coordinates provide an ordering.

Slide 29: The Isomap algorithm
The classical tool is multi-dimensional scaling (MDS).
An alternative: Isomap (Tenenbaum et al., 2000), developed for manifold embeddings. Given a dissimilarity measure d(i, j) (we use 1 − ρ̂_ij):
1. For each point, find its r nearest neighbors and construct a neighborhood graph with dissimilarities as edge weights.
2. Estimate the geodesic distance d_r(i, j) between each pair of points i, j by computing the shortest-path graph distance from i to j.
3. Apply metric MDS to the estimated geodesic distances.
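A sketch of the three steps in code, producing a one-dimensional ordering of the variables (the choice of r, the symmetrization, and the assumption that the neighborhood graph is connected are mine; the Isoband estimator on the next slide handles disconnected components separately):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def isomap_order(X, r=5):
    """Order variables by their 1-d Isomap embedding of the correlation dissimilarity."""
    corr = np.corrcoef(X, rowvar=False)
    D = 1.0 - corr                              # dissimilarity d(i, j) = 1 - rho_ij
    p = D.shape[0]

    # Step 1: r-nearest-neighbor graph with dissimilarities as edge weights.
    W = np.zeros((p, p))
    for i in range(p):
        nbrs = np.argsort(D[i])[1:r + 1]        # skip the point itself
        W[i, nbrs] = D[i, nbrs]
    W = np.maximum(W, W.T)                      # symmetrize

    # Step 2: geodesic (shortest-path) distances through the graph (assumed connected).
    G = shortest_path(csr_matrix(W), method="D", directed=False)

    # Step 3: classical (metric) MDS on the geodesics; keep the first coordinate.
    J = np.eye(p) - np.ones((p, p)) / p
    B = -0.5 * J @ (G ** 2) @ J                 # double centering
    vals, vecs = np.linalg.eigh(B)
    coord = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
    return np.argsort(coord)                    # ordering (sign of the axis is arbitrary)

# usage: reorder the variables, then band the reordered covariance
X = np.random.default_rng(4).standard_normal((100, 30))
order = isomap_order(X, r=5)
S_ordered = np.cov(X[:, order], rowvar=False)
```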

Slide 30: The Isoband estimator
1. Construct the neighborhood graph using d(i, j) = 1 − ρ̂_ij.
2. Apply Isomap to each connected component of the graph.
3. Reorder the variables within each component according to their 1-d embedding coordinates and apply banding.
4. Concatenate the components into a block-diagonal estimator.

Slide 31: A bootstrap approach to improve the nearest neighbor graph
There is noise in the k-NN links based on sample correlations.
Main idea: create T bootstrap replicates and only include an edge if it appears in at least cT of the replicates, where 0 < c < 1 is a tuning parameter (we use c = 0.5). Similar to bagging.
Improves detection of independent blocks and of the ordering, but adds computational cost.
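A small sketch of this edge-screening idea (the parameters r, T, c and the resampling details are my choices): keep a neighborhood-graph edge only if it shows up in at least a fraction c of the bootstrap replicates.

```python
import numpy as np

def stable_knn_edges(X, r=5, T=50, c=0.5, rng=None):
    """Boolean adjacency of k-NN edges that appear in at least c*T bootstrap replicates."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    counts = np.zeros((p, p))
    for _ in range(T):
        Xb = X[rng.integers(0, n, size=n)]           # bootstrap resample of the rows
        D = 1.0 - np.corrcoef(Xb, rowvar=False)      # dissimilarities on the replicate
        A = np.zeros((p, p), dtype=bool)
        for i in range(p):
            nbrs = np.argsort(D[i])[1:r + 1]
            A[i, nbrs] = True
        counts += (A | A.T)                          # undirected neighborhood graph
    return counts >= c * T

edges = stable_knn_edges(np.random.default_rng(5).standard_normal((80, 20)))
print("stable edges:", int(edges.sum()) // 2)
```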

Slide 32: A simulation example
AR(1): σ_ij = ρ^{|i−j|}, n = 100, ρ = 0.7; average (SE) operator norm loss over 50 replications.

p      Sample    Band.       Band.Perm.   MDS          Isoband      Bootst.      Thresh.
       (.05)     0.89 (.04)  0.84 (.05)   0.92 (.04)   0.87 (.04)   0.90 (.04)   0.87 (.05)
       (.06)     1.65 (.04)  4.63 (.01)   3.41 (.05)   1.82 (.04)   1.67 (.04)   2.85 (.03)
       (.07)     1.79 (.04)  4.67 (.00)   3.97 (.03)   2.10 (.06)   1.85 (.05)   3.02 (.02)

Slide 33: Assessing sparsity: a block-diagonal example
Three AR blocks of size 50, 30, 20, with ρ = 0.7, 0.8, 0.9.
Heatmap of the percentage of times each element was estimated as zero (black is 0%, white is 100%).
As a benchmark, apply banding to the known blocks separately, each one in the correct order.

Slide 34: Heatmaps of zero-estimation frequency: (a) banding known blocks, (b) thresholding, (c) Isoband, (d) bootstrap.

Slide 35: Outline (repeated; next section: Regularization of the inverse)

Slide 36: Estimators of the inverse invariant under variable permutations
The inverse Ω = Σ^{−1} (a.k.a. the concentration or precision matrix) is needed in graphical models and LDA/QDA.
The graphical models literature mostly focuses on testing for or finding zeros rather than estimating the matrix (Drton and Perlman, 2004; Dobra et al., 2004; Meinshausen and Bühlmann, 2006; Kalisch and Bühlmann, 2007).
Testing for zeros can be followed by constrained estimation (Chaudhuri, Drton and Richardson, 2007), but the resulting estimator is hard to analyze.

Slide 37: A Sparse Permutation-Invariant Estimator (SPICE)
The negative log-likelihood, up to a constant:

\ell(X; \Omega) = \operatorname{tr}(\Omega \hat\Sigma) - \log|\Omega|

Add a lasso penalty on the off-diagonal elements of Ω to force regularization and sparsity. SPICE:

\hat\Omega_\lambda = \arg\min_{\Omega \succ 0} \Big\{ \operatorname{tr}(\Omega\hat\Sigma) - \log|\Omega| + \lambda \sum_{j \ne j'} |\omega_{jj'}| \Big\}

The optimization problem has been studied by several authors; Yuan and Lin (2007) provided a fixed-p analysis.
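A minimal sketch of fitting an l1-penalized Gaussian likelihood for the precision matrix, in the spirit of the SPICE objective above. It uses scikit-learn's off-the-shelf graphical lasso solver rather than the Cholesky-parametrized algorithm of Rothman et al., and the standardize-then-rescale step mirrors the correlation-matrix recommendation discussed on the next slide; the penalty level alpha is an arbitrary choice.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(6)
X = rng.standard_normal((200, 30))

sd = X.std(axis=0)
Z = (X - X.mean(axis=0)) / sd               # standardized data (correlation scale)

gl = GraphicalLasso(alpha=0.1).fit(Z)       # l1-penalized likelihood for the precision
K_corr = gl.precision_                      # estimated inverse correlation matrix

# Rescale back: Sigma = D R D with D = diag(sd), so Omega = D^{-1} R^{-1} D^{-1}.
Omega_hat = K_corr / np.outer(sd, sd)
print("estimated nonzero off-diagonal entries:",
      (np.count_nonzero(np.abs(K_corr) > 1e-8) - K_corr.shape[0]) // 2)
```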

Slide 38: High-dimensional analysis of SPICE (Rothman, Bickel, Levina, Zhu, 2007)
To get the best rate, standardize first: run SPICE on the correlation matrix, then multiply the variances back in to obtain Ω̃_λ.
Theorem 4: Let Ω = Σ^{−1} and take λ ≍ √((log p)/n). Assume
(A1) 0 < ε_0 ≤ λ_min(Σ) ≤ λ_max(Σ) ≤ ε_0^{−1};
(A2) card{(i, j) : Ω_ij ≠ 0, i ≠ j} ≤ s.
Then

\| \tilde\Omega_\lambda - \Omega \| = O_P\left( \sqrt{ \frac{(s+1) \log p}{n} } \right)

SPICE is consistent as long as (s log p)/n → 0. Compare to the rate c_0(p) √((log p)/n) for thresholding.

Slide 39: Solving the optimization problem
The optimization is non-trivial and computationally expensive.
Yuan and Lin (2007): Maxdet algorithm (interior-point optimization).
Banerjee et al. (2005): Nesterov's convex optimization method.
Our approach (Rothman et al., 2008): parametrize via the Cholesky factor to enforce positive definiteness, and use a local quadratic approximation with coordinate-wise descent.
Friedman, Hastie, and Tibshirani (2008): apply the coordinate-wise descent lasso algorithm (R package glasso).

Slide 40: Colon tumor classification example
40 colon adenocarcinoma and 22 healthy tissue samples (Alon et al., 1999); Affymetrix gene expression data.
Top p genes selected by a two-sample t-test (out of 2000).
100 random splits into training (2/3) and test (1/3) sets; LDA classification.
Consider two ways of selecting the tuning parameter.
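A sketch of LDA with a regularized precision matrix, as used in this example: plug an estimated Ω into the linear discriminant score δ_k(x) = xᵀΩμ_k − ½ μ_kᵀΩμ_k + log π_k. The data below are synthetic placeholders (not the Alon et al. dataset), and the graphical lasso stands in for SPICE.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def lda_fit(X, y, alpha=0.2):
    classes = np.unique(y)
    mus = np.array([X[y == k].mean(axis=0) for k in classes])   # class means
    priors = np.array([np.mean(y == k) for k in classes])
    Xc = X - mus[np.searchsorted(classes, y)]                   # center within class
    Omega = GraphicalLasso(alpha=alpha).fit(Xc).precision_      # regularized precision
    return classes, mus, priors, Omega

def lda_predict(X, classes, mus, priors, Omega):
    scores = X @ Omega @ mus.T - 0.5 * np.sum((mus @ Omega) * mus, axis=1) + np.log(priors)
    return classes[np.argmax(scores, axis=1)]

# usage with synthetic two-class data
rng = np.random.default_rng(7)
X = rng.standard_normal((62, 50))
y = (rng.random(62) < 0.35).astype(int)
X[y == 1] += 0.5                       # shift the mean of class 1
model = lda_fit(X, y)
print("training error:", np.mean(lda_predict(X, *model) != y))
```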

Slide 41: Colon tumor classification errors
Tuning parameter for SPICE chosen by (A) 5-fold CV on the training data maximizing the likelihood; (B) 5-fold CV on the training data minimizing the classification error; (C) minimizing the classification error on the test data.

p      Naive Bayes   A            B            C
       (0.8)         13.5 (0.6)   14.6 (0.6)    9.7 (0.5)
       (0.8)         12.1 (0.7)   14.7 (0.7)    9.0 (0.6)
       (0.8)         18.7 (0.8)   16.9 (0.9)    9.1 (0.5)
       (1.0)         18.3 (0.7)   18.0 (0.7)   10.2 (0.5)

Slide 42: Conclusions
Regularization is necessary.
High-dimensional asymptotics give trade-offs between dimension, sample size, and sparsity, and conditions on the covariance model.
Structure helps regularization.
Efficient optimization and low computational complexity are very important.

Slide 43: Thank you
