Robust and sparse Gaussian graphical modelling under cell-wise contamination
Shota Katayama (1), Hironori Fujisawa (2) and Mathias Drton (3)
(1) Tokyo Institute of Technology, Japan; (2) The Institute of Statistical Mathematics, Japan; (3) University of Washington, USA

1 Introduction

Let Y = (Y_1, ..., Y_p)^T be a p-dimensional random vector representing a multivariate observation. The conditional independence graph of Y is the undirected graph G = (V, E) whose vertex set V = {1, ..., p} indexes the individual variables and whose edge set E indicates conditional dependences among them. More precisely, (i, j) ∉ E if and only if Y_i and Y_j are conditionally independent given Y_{V\{i,j}} = {Y_k : k ≠ i, j}. For a Gaussian vector, the edge set E corresponds to the support of the precision matrix. Indeed, it is well known that if Y follows a multivariate Gaussian distribution N_p(μ, Σ) with mean vector μ and covariance matrix Σ, then (i, j) ∉ E if and only if Ω_ij = 0, where Ω = Σ^{-1}. Inference of the conditional independence graph sheds light on direct as opposed to indirect interactions and has received much recent attention (Drton & Maathuis, 2017). In particular, for high-dimensional Gaussian problems, several techniques have been developed that exploit sparsity when inferring the support of the precision matrix Ω. Meinshausen & Bühlmann (2006) suggested fitting node-wise linear regression models with an ℓ1 penalty to recover the support of each row. Yuan & Lin (2007), Banerjee et al. (2008) and Friedman et al. (2008) considered the graphical lasso (Glasso), which is based on the ℓ1-penalized log-likelihood function. Cai et al. (2011) proposed constrained ℓ1 minimization for inverse matrix estimation (CLIME), which may be formulated as a linear program. In fields such as bioinformatics and economics, data are often not only high-dimensional but also subject to contamination. While suited to high dimensionality, the techniques mentioned above are sensitive to contamination.
Moreover, traditional robust methods may not be appropriate when the number of variables is large, because they are based on a model in which an observation vector is either entirely free of contamination or fully contaminated.
Hence, an observation vector is treated as an outlier even if only one of its many variables is contaminated. As a result, these methods down-weight the entire vector regardless of whether it contains clean values for some variables. Such information loss can become fatal as the dimension increases. As a more realistic model for high-dimensional data, Alqallaf et al. (2002) considered cell-wise contamination: the observations X_1, ..., X_n with p variables are generated by

X_i = (I_p − E_i) Y_i + E_i Z_i,  i = 1, ..., n.  (1)

Here, I_p is the p × p identity matrix and each E_i = diag(E_i1, ..., E_ip) is a diagonal random matrix with the E_ij independent and Bernoulli distributed with P(E_ij = 1) = ε_j. The random vectors Y_i and Z_i are independent; Y_i ~ N_p(μ, Σ) corresponds to a clean sample, while Z_i contaminates some elements of X_i. Our goal is to develop a robust method for estimating the conditional independence graph G of Y_i from the cell-wise contaminated observations X_i. Techniques such as node-wise regression, Glasso and CLIME process an estimate of the covariance matrix. Our strategy is thus simply to apply these procedures using a covariance matrix estimator that is robust against cell-wise contamination. However, while many researchers have considered the traditional whole-vector contamination framework (see, e.g., Maronna et al., 2006, Chapter 6), there are fewer existing methods for cell-wise contamination. Specifically, we are aware of three approaches: the use of alternative t-distributions (Finegold & Drton, 2011), the use of rank correlations (Loh & Tan, 2015; Öllerer & Croux, 2015), and the pairwise covariance estimation method of Tarr et al. (2016), who adopt an idea of Gnanadesikan & Kettenring (1972). In contrast, in this talk we provide a robust covariance matrix estimator via the γ-divergence proposed by Fujisawa & Eguchi (2008).
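Model (1) is straightforward to simulate. The following sketch is our own illustration (the AR(1)-type Σ, the common cell probability ε_j ≡ ε, and the N(10, 1) contaminating distribution are arbitrary choices, not from the paper); it also shows why row-wise robustness fails here: a row is fully clean only with probability (1 − ε)^p.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, eps = 200, 5, 0.05          # sample size, dimension, cell contamination level

# Clean covariance with an AR(1)-type structure (illustrative choice).
Sigma = np.array([[0.5 ** abs(j - k) for k in range(p)] for j in range(p)])
Y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)   # clean samples Y_i

# Cell-wise contamination: each cell is independently replaced with
# probability eps by a draw from the contaminating distribution.
E = rng.binomial(1, eps, size=(n, p))                     # E_ij ~ Bernoulli(eps)
Z = rng.normal(10.0, 1.0, size=(n, p))                    # outlying cells
X = (1 - E) * Y + E * Z                                   # model (1), element-wise

print("fraction of contaminated cells:", E.mean())
print("fraction of rows with at least one bad cell:", (E.sum(axis=1) > 0).mean())
```

Already for moderate p the second fraction approaches one, so a whole-vector robust method would down-weight nearly every observation.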
The γ-divergence can automatically reduce the impact of contamination, and it is known to be robust even when the amount of contamination is large.

2 Preliminaries

2.1 Graph estimation

For concise presentation, we focus on node-wise regression, Glasso and CLIME. Let ˆΣ be an estimator of Σ. For index sets A and B, define ˆΣ_{A,B} as the sub-matrix of ˆΣ with the rows in A and the columns in B. We use the shorthand \j for the set V \ {j}, so that ˆΣ_{\j,\j} denotes the sub-matrix with both rows and columns in V \ {j}. In ℓ1-penalized node-wise regression, one finds

ˆβ^{(j)} = argmin_{β ∈ R^{p−1}} (1/2) β^T ˆΣ_{\j,\j} β − ˆΣ_{j,\j} β + λ ‖β‖_1,  j = 1, ..., p,

where the tuning parameter λ > 0 controls the strength of the penalty ‖β‖_1. Large λ yields high sparsity of ˆβ^{(j)}. After obtaining ˆβ^{(1)}, ..., ˆβ^{(p)}, the edge set E is estimated by the AND rule Ê = {(i, j) : ˆβ^{(j)}_i ≠ 0 and ˆβ^{(i)}_j ≠ 0} or the OR rule Ê = {(i, j) : ˆβ^{(j)}_i ≠ 0 or ˆβ^{(i)}_j ≠ 0}. Node-wise regression is well-defined for any positive semidefinite estimate ˆΣ. The Glasso estimator is obtained by solving

ˆΩ_Glasso = argmin_{Ω ∈ R^{p×p}} tr(ˆΣΩ) − log det Ω + λ ‖Ω‖_1,  (2)

where ‖Ω‖_1 is the element-wise ℓ1 norm of Ω and λ > 0 is a tuning parameter that controls the sparsity of Ω. The edge set may be estimated by Ê = {(i, j) : ˆΩ_ij ≠ 0}. Efficient algorithms for the Glasso are given in Friedman et al. (2008) and Hsieh et al. (2011). For convergence, the former requires ˆΣ + λI_p to be positive semidefinite, while the latter requires the same of ˆΣ. Finally, we review the CLIME method. Let

ˆΩ^0 = argmin_{Ω ∈ R^{p×p}} ‖Ω‖_1 subject to ‖ˆΣΩ − I_p‖_∞ ≤ λ,  (3)

where ‖·‖_∞ denotes the element-wise infinity norm. Generally, ˆΩ^0 = (ˆω^0_ij) is not symmetric. The CLIME estimator is defined through a simple symmetrization, namely

ˆΩ_CLIME,ij = ˆω^0_ij I(|ˆω^0_ij| ≤ |ˆω^0_ji|) + ˆω^0_ji I(|ˆω^0_ij| > |ˆω^0_ji|),  i, j = 1, ..., p,

and the edge set is estimated as for the Glasso. Cai et al. (2011) translated the matrix optimization problem (3) into p vector optimization problems, each of which can be solved by linear programming. The CLIME essentially needs positive definiteness of ˆΣ; without it, the optimization problem may be infeasible or return inadequate solutions.

2.2 Robust inference via the γ-divergence

Let f and f_n be the data-generating and empirical densities, respectively. In robust inference one typically assumes that f = (1 − ε)g + εh, where g is the density of the clean data, h is the density of the contamination, and ε ≥ 0 is the contamination level. For estimation of g, consider a model with densities g_θ indexed by a parameter θ.
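Returning to the graph-estimation step of Section 2.1, ℓ1-penalized node-wise regression with the AND rule can be sketched with a simple coordinate descent operating directly on a covariance estimate (a minimal illustration with arbitrary test data; not the authors' code):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * max(abs(x) - t, 0.0)

def nodewise_and_rule(Sigma_hat, lam, n_sweeps=200):
    """Coordinate descent for min_beta 0.5 b'S_{-j,-j}b - S_{j,-j}b + lam||b||_1
    for each j, followed by the AND rule on the fitted coefficients."""
    p = Sigma_hat.shape[0]
    B = np.zeros((p, p))                  # B[j, k]: coefficient of variable k when regressing j
    for j in range(p):
        idx = [k for k in range(p) if k != j]
        A = Sigma_hat[np.ix_(idx, idx)]   # Sigma_hat_{\j,\j}
        b = Sigma_hat[idx, j]             # Sigma_hat_{\j,j}
        beta = np.zeros(p - 1)
        for _ in range(n_sweeps):
            for k in range(p - 1):
                r = b[k] - A[k] @ beta + A[k, k] * beta[k]   # partial residual
                beta[k] = soft_threshold(r, lam) / A[k, k]
        B[j, idx] = beta
    E_hat = (B != 0) & (B.T != 0)         # AND rule; OR rule would use |
    np.fill_diagonal(E_hat, False)
    return E_hat

# Population check on a chain graph: tridiagonal precision matrix.
Omega = np.diag(np.full(4, 2.0)) + np.diag(np.full(3, 0.8), 1) + np.diag(np.full(3, 0.8), -1)
Sigma = np.linalg.inv(Omega)
E_hat = nodewise_and_rule(Sigma, lam=0.05)
print(E_hat.astype(int))
```

Since the population coefficient of variable k in the regression of variable j equals −Ω_jk/Ω_jj, the chain edges (0,1), (1,2), (2,3) carry nonzero coefficients and survive the soft-thresholding.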
The Kullback-Leibler (KL) divergence between f_n and g_θ results in a biased estimate unless ε = 0. To overcome this, Fujisawa & Eguchi (2008) proposed the γ-divergence, given by

d_γ(f_n, g_θ) = −(1/γ) log ∫ f_n(x) g_θ(x)^γ dx + (1/(1+γ)) log ∫ g_θ(x)^{1+γ} dx,

where γ > 0 is a constant that controls the trade-off between efficiency and robustness. In fact, the γ-divergence reduces to the KL divergence as γ → 0. The estimator obtained by minimizing d_γ(f_n, g_θ) over the parameter space is highly robust. Roughly speaking, in the limit n → ∞, d_γ(f_n, g_θ) can be regarded as d_γ(f, g_θ), and

d_γ(f, g_θ) = −(1/γ) log{ (1 − ε) ∫ g(x) g_θ(x)^γ dx + ε ν(θ; γ) } + (1/(1+γ)) log ∫ g_θ(x)^{1+γ} dx,

where ν(θ; γ) = ∫ h(x) g_θ(x)^γ dx. The γ-divergence provides robust estimates whenever ν(θ; γ) ≈ 0 over the parameter space considered. In that case, we see that

d_γ(f, g_θ) ≈ d_γ(g, g_θ) − (1/γ) log(1 − ε),

so the contamination density h is automatically ignored and the minimizer of d_γ(f, g_θ) is approximately equal to the minimizer of d_γ(g, g_θ). This is a favorable property, because when g = g_{θ0}, the minimizer of d_γ(g, g_θ) is θ0. Fortunately, ν(θ; γ) is close to zero when h lies in the tail of g_θ. To illustrate this, assume for a moment that g_θ is the density of N(θ, 1). If h is the density of N(α, 1), then ν(θ; γ) = c_{1,γ} exp{−c_{2,γ}(α − θ)²} for some constants c_{1,γ}, c_{2,γ} > 0. Thus, ν(θ; γ) is small whenever α is not too close to the set of parameters θ that give the better-fitting densities g_θ.

3 Methods

3.1 Proposed methodology

As noted in Section 2.1, we seek a robust covariance estimate ˆΣ for use in graph estimation. In this section, we construct such an estimate via the γ-divergence. Our estimator ˆΣ is built element-wise and exhibits robustness to cell-wise contamination. We begin by writing each covariance as

Σ_jk = √(σ_jj σ_kk) ρ_jk,  (4)

where σ_jj = Var(Y_ij), σ_kk = Var(Y_ik) and ρ_jk = Corr(Y_ij, Y_ik), for j, k = 1, ..., p. Here, Y_i = (Y_i1, ..., Y_ip)^T is the i-th unobserved clean sample in (1).
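The robustness of the γ-divergence can be seen numerically by minimizing the empirical objective for the N(θ, 1) location model over a grid of θ. This is our own illustration (the contamination fraction, its location at 8, and γ = 0.5 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5

# 90% clean N(0, 1) observations plus 10% contamination centred at 8.
x = np.concatenate([rng.normal(0.0, 1.0, 180), rng.normal(8.0, 1.0, 20)])

# Empirical gamma-cross-entropy for the N(theta, 1) model; with the variance
# fixed, the term log int g_theta^(1+gamma) dx is free of theta and is dropped.
grid = np.linspace(-2.0, 10.0, 2401)
vals = -np.log(np.mean(np.exp(-gamma * (x[None, :] - grid[:, None]) ** 2 / 2.0),
                       axis=1)) / gamma
theta_hat = grid[np.argmin(vals)]

print("sample mean (MLE, non-robust):", x.mean())   # dragged toward the contamination
print("gamma-divergence minimizer:", theta_hat)     # stays near the clean centre
```

The contaminated points contribute weights of order exp(−γ(8 − θ)²/2) near the clean centre, which is numerically negligible; this is exactly the ν(θ; γ) ≈ 0 mechanism described above.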
We now derive estimates of the variances and the correlation in (4) based on the observations X_i = (X_i1, ..., X_ip)^T, which under cell-wise contamination may have some of their elements corrupted. Fixing a coordinate j ∈ {1, ..., p}, let g_{(μ_j, σ_jj)} be the density of N(μ_j, σ_jj), and let f_n^{(j)} be the empirical density of X_1j, ..., X_nj. The robust estimators of μ_j and σ_jj based on the γ-divergence are given by

(ˆμ_j, ˆσ_jj) = argmin_{μ_j, σ_jj} d_γ(f_n^{(j)}, g_{(μ_j, σ_jj)}),

d_γ(f_n^{(j)}, g_{(μ_j, σ_jj)}) = −(1/γ) log[ (1/n) Σ_{i=1}^n exp{ −γ(X_ij − μ_j)² / (2σ_jj) } ] + (1/(2(1+γ))) log σ_jj,

up to an additive constant not depending on (μ_j, σ_jj). Fujisawa & Eguchi (2008) gave an efficient iterative algorithm to compute (ˆμ_j, ˆσ_jj). Let μ_j^t and σ_jj^t denote the values at iteration t, starting from initializations μ_j^0 and σ_jj^0. The algorithm repeats the following steps until convergence:

μ_j^{t+1} ← Σ_{i=1}^n w_ij^t X_ij,  σ_jj^{t+1} ← (1 + γ) Σ_{i=1}^n w_ij^t (X_ij − μ_j^{t+1})²,

where the weights are updated as

w_ij^t = exp{ −γ(X_ij − μ_j^t)² / (2σ_jj^t) } / Σ_{i'=1}^n exp{ −γ(X_i'j − μ_j^t)² / (2σ_jj^t) }.

We take the median of X_1j, ..., X_nj as μ_j^0 and the median absolute deviation (MAD) as σ_jj^0. After ˆμ_j and ˆσ_jj are obtained for j = 1, ..., p, we estimate each correlation ρ_jk from the standardized observations Z_ij = (X_ij − ˆμ_j)/√ˆσ_jj. Let h_ρ be the bivariate standard normal density with correlation ρ, and let f_n^{(j,k)} be the empirical density of (Z_1j, Z_1k), ..., (Z_nj, Z_nk). Our correlation estimator is

ˆρ_jk = argmin_{|ρ| < 1} d_γ(f_n^{(j,k)}, h_ρ),  (5)

d_γ(f_n^{(j,k)}, h_ρ) = −(1/γ) log[ (1/n) Σ_{i=1}^n exp{ −γ(Z_ij² + Z_ik² − 2ρ Z_ij Z_ik) / (2(1 − ρ²)) } ] + (1/(2(1+γ))) log(1 − ρ²),

again up to an additive constant. The required univariate optimization problem can be solved with standard techniques; we provide a projected gradient descent algorithm below. Finally, we obtain the estimator of Σ as ˆΣ_jk = √(ˆσ_jj ˆσ_kk) ˆρ_jk. We now outline the projected gradient descent algorithm for computing ˆρ_jk from (5). To avoid numerical singularity, we replace the restriction |ρ| < 1 by |ρ| ≤ R with R < 1.
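The iterative reweighting updates for (ˆμ_j, ˆσ_jj) above can be sketched as follows. This is an illustrative implementation, not the authors' reference code; the scaling of the MAD by the usual normal-consistency factor 1.4826, and the test data, are our own choices.

```python
import numpy as np

def robust_mean_var(x, gamma=0.5, n_iter=100, tol=1e-8):
    """Gamma-divergence estimates of a univariate mean and variance via
    iterative reweighting, with median/MAD initialization."""
    mu = np.median(x)
    # Squared, consistency-scaled MAD as the starting variance (our choice).
    sigma = (1.4826 * np.median(np.abs(x - mu))) ** 2
    for _ in range(n_iter):
        w = np.exp(-gamma * (x - mu) ** 2 / (2.0 * sigma))
        w /= w.sum()                                  # normalized weights w_ij
        mu_new = np.sum(w * x)                        # weighted mean update
        sigma_new = (1.0 + gamma) * np.sum(w * (x - mu_new) ** 2)
        converged = abs(mu_new - mu) + abs(sigma_new - sigma) < tol
        mu, sigma = mu_new, sigma_new
        if converged:
            break
    return mu, sigma

rng = np.random.default_rng(2)
# 95% clean N(1, 4) observations plus 5% far-away contamination.
x = np.concatenate([rng.normal(1.0, 2.0, 190), rng.normal(25.0, 1.0, 10)])
mu_hat, sigma_hat = robust_mean_var(x)
print(mu_hat, sigma_hat)   # should be close to the clean values 1 and 4
```

The factor (1 + γ) in the variance update is essential: the exponential weights tilt the clean sample toward its centre, shrinking the weighted variance to σ/(1 + γ) at the normal model, and the factor exactly undoes this shrinkage.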
For simpler notation, let d_γ(ρ) = d_γ(f_n^{(j,k)}, h_ρ). The gradient of this function is

∇d_γ(ρ) = −(1/(1 − ρ²)²) Σ_{i=1}^n w_i { (1 + ρ²) Z_ij Z_ik − ρ(Z_ij² + Z_ik²) } − ρ / ((1+γ)(1 − ρ²)),

where

w_i = exp{ −γ(Z_ij² + Z_ik² − 2ρ Z_ij Z_ik) / (2(1 − ρ²)) } / Σ_{i'=1}^n exp{ −γ(Z_i'j² + Z_i'k² − 2ρ Z_i'j Z_i'k) / (2(1 − ρ²)) }.

The objective function d_γ(ρ) is locally approximated around ρ' by

φ_s(ρ; ρ') = d_γ(ρ') + ∇d_γ(ρ')(ρ − ρ') + (s/2)(ρ − ρ')²,

where s > 0 is a step-size parameter. We select s sufficiently large that d_γ(ρ) ≤ φ_s(ρ; ρ'). The projected gradient descent method minimizes φ_s(ρ; ρ') over |ρ| ≤ R instead of d_γ(ρ); the minimizer is sgn(˜ρ) min(|˜ρ|, R) with ˜ρ = ρ' − s^{-1} ∇d_γ(ρ'). Algorithm 1 summarizes the procedure.

Algorithm 1: Projected gradient descent for ˆρ
Input: standardized data (Z_1j, Z_1k), ..., (Z_nj, Z_nk); divergence parameter γ > 0.
Initialize t = 0, ρ^0 = 0, s^0 = 0.01, and fix R < 1.
repeat
    s^{t,0} ← s^t, t_in ← 0
    repeat
        ν^{t,t_in} ← ρ^t − ∇d_γ(ρ^t) / s^{t,t_in}
        ρ^{t,t_in+1} ← sgn(ν^{t,t_in}) min(|ν^{t,t_in}|, R)
        s^{t,t_in+1} ← 2 s^{t,t_in}
        t_in ← t_in + 1
    until d_γ(ρ^{t,t_in}) ≤ φ_s(ρ^{t,t_in}; ρ^t) for the current step size s
    ρ^{t+1} ← ρ^{t,t_in}, s^{t+1} ← s^{t,t_in}/2, t ← t + 1
until convergence

3.2 Projection of the covariance matrix estimate

We cannot directly plug an estimate of Σ into the methods of Section 2.1 if it is not positive semidefinite. Node-wise regression and the Glasso require a positive semidefinite estimate, and CLIME needs a positive definite one. However, if an estimate ˆΣ is not positive semidefinite, we may project it onto the set of positive (semi)definite matrices. We simply proceed by solving

min_S ‖S − ˆΣ‖_F subject to S ⪰ δI_p,

where δ ≥ 0, ‖·‖_F denotes the Frobenius norm, and S ⪰ δI_p means that S − δI_p is positive semidefinite. The solution can be computed by taking the eigenvalue decomposition of ˆΣ and replacing each eigenvalue λ_j by max(λ_j, δ), j = 1, ..., p.

References

Alqallaf, F. A., Konis, K. P., Martin, R. D. & Zamar, R. H. (2002), Scalable robust covariance and correlation estimates for data mining. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, USA.
Banerjee, O., El Ghaoui, L. & d'Aspremont, A. (2008), Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9.
Boudt, K., Cornelissen, J. & Croux, C. (2012), The Gaussian rank correlation estimator: robustness properties. Statistics and Computing, 22.
Cai, T., Liu, W. & Luo, X. (2011), A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106.
Drton, M. & Maathuis, M. H. (2017), Structure learning in graphical modeling. Annual Review of Statistics and Its Application, 4.
Finegold, M. & Drton, M. (2011), Robust graphical modeling of gene networks using classical and alternative t-distributions. The Annals of Applied Statistics, 2A.
Friedman, J. H., Hastie, T. & Tibshirani, R. (2008), Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9.
Fujisawa, H. & Eguchi, S. (2008), Robust parameter estimation with a small bias against heavy contamination. Journal of Multivariate Analysis, 99.
Gnanadesikan, R. & Kettenring, J. R. (1972), Robust estimates, residuals and outlier detection with multiresponse data. Biometrics, 28.
Hsieh, C.-J., Sustik, M. A., Dhillon, I. S. & Ravikumar, P. (2011), Sparse inverse covariance matrix estimation using quadratic approximation. In Advances in Neural Information Processing Systems.
Liu, H., Han, F., Yuan, M. & Lafferty, J. (2012), High-dimensional semiparametric Gaussian copula graphical models. The Annals of Statistics, 40.
Loh, P.-L. & Tan, X. L. (2015), High-dimensional robust precision matrix estimation: Cellwise corruption under ε-contamination. arXiv preprint.
Maronna, R. A., Martin, R. D. & Yohai, V. (2006), Robust Statistics: Theory and Methods. Wiley, New York.
Meinshausen, N. & Bühlmann, P. (2006), High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34.
Öllerer, V. & Croux, C. (2015), Robust high-dimensional precision matrix estimation. In Modern Nonparametric, Robust and Multivariate Methods. Springer International Publishing.
Rousseeuw, P. J. & Croux, C. (1993), Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88.
Tarr, G., Müller, S. & Weber, N. C. (2016), Robust estimation of precision matrices under cellwise contamination. Computational Statistics and Data Analysis, 93.
Yuan, M. & Lin, Y. (2007), Model selection and estimation in the Gaussian graphical model. Biometrika, 94.
More informationLearning Binary Classifiers for Multi-Class Problem
Research Memorandum No. 1010 September 28, 2006 Learning Binary Classifiers for Multi-Class Problem Shiro Ikeda The Institute of Statistical Mathematics 4-6-7 Minami-Azabu, Minato-ku, Tokyo, 106-8569,
More informationSparse Permutation Invariant Covariance Estimation: Final Talk
Sparse Permutation Invariant Covariance Estimation: Final Talk David Prince Biostat 572 dprince3@uw.edu May 31, 2012 David Prince (UW) SPICE May 31, 2012 1 / 19 Electronic Journal of Statistics Vol. 2
More informationEstimating Sparse Precision Matrix with Bayesian Regularization
Estimating Sparse Precision Matrix with Bayesian Regularization Lingrui Gan, Naveen N. Narisetty, Feng Liang Department of Statistics University of Illinois at Urbana-Champaign Problem Statement Graphical
More informationVariable Selection for Highly Correlated Predictors
Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu arxiv:1709.04840v1 [stat.me] 14 Sep 2017 Abstract Penalty-based variable selection methods are powerful in selecting relevant covariates
More informationInverse Covariance Estimation with Missing Data using the Concave-Convex Procedure
Inverse Covariance Estimation with Missing Data using the Concave-Convex Procedure Jérôme Thai 1 Timothy Hunter 1 Anayo Akametalu 1 Claire Tomlin 1 Alex Bayen 1,2 1 Department of Electrical Engineering
More informationSparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28
Sparsity Models Tong Zhang Rutgers University T. Zhang (Rutgers) Sparsity Models 1 / 28 Topics Standard sparse regression model algorithms: convex relaxation and greedy algorithm sparse recovery analysis:
More informationGenetic Networks. Korbinian Strimmer. Seminar: Statistical Analysis of RNA-Seq Data 19 June IMISE, Universität Leipzig
Genetic Networks Korbinian Strimmer IMISE, Universität Leipzig Seminar: Statistical Analysis of RNA-Seq Data 19 June 2012 Korbinian Strimmer, RNA-Seq Networks, 19/6/2012 1 Paper G. I. Allen and Z. Liu.
More informationRobust methods and model selection. Garth Tarr September 2015
Robust methods and model selection Garth Tarr September 2015 Outline 1. The past: robust statistics 2. The present: model selection 3. The future: protein data, meat science, joint modelling, data visualisation
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationInequalities on partial correlations in Gaussian graphical models
Inequalities on partial correlations in Gaussian graphical models containing star shapes Edmund Jones and Vanessa Didelez, School of Mathematics, University of Bristol Abstract This short paper proves
More informationVariables. Cho-Jui Hsieh The University of Texas at Austin. ICML workshop on Covariance Selection Beijing, China June 26, 2014
for a Million Variables Cho-Jui Hsieh The University of Texas at Austin ICML workshop on Covariance Selection Beijing, China June 26, 2014 Joint work with M. Sustik, I. Dhillon, P. Ravikumar, R. Poldrack,
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 5, 06 Reading: See class website Eric Xing @ CMU, 005-06
More informationStructure estimation for Gaussian graphical models
Faculty of Science Structure estimation for Gaussian graphical models Steffen Lauritzen, University of Copenhagen Department of Mathematical Sciences Minikurs TUM 2016 Lecture 3 Slide 1/48 Overview of
More informationOPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan
OPTIMISATION CHALLENGES IN MODERN STATISTICS Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan How do optimisation problems arise in Statistics? Let X 1,...,X n be independent and identically distributed
More informationInference in high-dimensional graphical models arxiv: v1 [math.st] 25 Jan 2018
Inference in high-dimensional graphical models arxiv:1801.08512v1 [math.st] 25 Jan 2018 Jana Janková Seminar for Statistics ETH Zürich Abstract Sara van de Geer We provide a selected overview of methodology
More informationRobust Inverse Covariance Estimation under Noisy Measurements
Jun-Kun Wang Intel-NTU, National Taiwan University, Taiwan Shou-de Lin Intel-NTU, National Taiwan University, Taiwan WANGJIM123@GMAIL.COM SDLIN@CSIE.NTU.EDU.TW Abstract This paper proposes a robust method
More informationGroup Sparse Priors for Covariance Estimation
Group Sparse Priors for Covariance Estimation Benjamin M. Marlin, Mark Schmidt, and Kevin P. Murphy Department of Computer Science University of British Columbia Vancouver, Canada Abstract Recently it
More informationHigh Dimensional Semiparametric Gaussian Copula Graphical Models
High Dimeional Semiparametric Gaussian Copula Graphical Models Han Liu, Fang Han, Ming Yuan, John Lafferty and Larry Wasserman February 9, 2012 arxiv:1202.2169v1 [stat.ml] 10 Feb 2012 Contents Abstract:
More informationIndirect multivariate response linear regression
Biometrika (2016), xx, x, pp. 1 22 1 2 3 4 5 6 C 2007 Biometrika Trust Printed in Great Britain Indirect multivariate response linear regression BY AARON J. MOLSTAD AND ADAM J. ROTHMAN School of Statistics,
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationTractable Upper Bounds on the Restricted Isometry Constant
Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.
More informationBiostatistics Advanced Methods in Biostatistics IV
Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu Lecture 12 1 / 36 Tip + Paper Tip: As a statistician the results
More informationRobust variable selection through MAVE
This is the author s final, peer-reviewed manuscript as accepted for publication. The publisher-formatted version may be available through the publisher s web site or your institution s library. Robust
More informationSelection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty
Journal of Data Science 9(2011), 549-564 Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Masaru Kanba and Kanta Naito Shimane University Abstract: This paper discusses the
More informationComputational and Statistical Aspects of Statistical Machine Learning. John Lafferty Department of Statistics Retreat Gleacher Center
Computational and Statistical Aspects of Statistical Machine Learning John Lafferty Department of Statistics Retreat Gleacher Center Outline Modern nonparametric inference for high dimensional data Nonparametric
More informationSparse and Locally Constant Gaussian Graphical Models
Sparse and Locally Constant Gaussian Graphical Models Jean Honorio, Luis Ortiz, Dimitris Samaras Department of Computer Science Stony Brook University Stony Brook, NY 794 {jhonorio,leortiz,samaras}@cs.sunysb.edu
More informationIntroduction to graphical models: Lecture III
Introduction to graphical models: Lecture III Martin Wainwright UC Berkeley Departments of Statistics, and EECS Martin Wainwright (UC Berkeley) Some introductory lectures January 2013 1 / 25 Introduction
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationLecture 2 Part 1 Optimization
Lecture 2 Part 1 Optimization (January 16, 2015) Mu Zhu University of Waterloo Need for Optimization E(y x), P(y x) want to go after them first, model some examples last week then, estimate didn t discuss
More informationA Robust Approach to Regularized Discriminant Analysis
A Robust Approach to Regularized Discriminant Analysis Moritz Gschwandtner Department of Statistics and Probability Theory Vienna University of Technology, Austria Österreichische Statistiktage, Graz,
More informationMachine Learning Linear Models
Machine Learning Linear Models Outline II - Linear Models 1. Linear Regression (a) Linear regression: History (b) Linear regression with Least Squares (c) Matrix representation and Normal Equation Method
More informationBayesian Learning of Sparse Gaussian Graphical Models. Duke University Durham, NC, USA. University of South Carolina Columbia, SC, USA I.
Bayesian Learning of Sparse Gaussian Graphical Models Minhua Chen, Hao Wang, Xuejun Liao and Lawrence Carin Electrical and Computer Engineering Department Duke University Durham, NC, USA Statistics Department
More information