An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss
arXiv v1 [stat.CO] 12 Nov 2018

Cheng Wang
School of Mathematical Sciences, Shanghai Jiao Tong University

and

Binyan Jiang
Department of Applied Mathematics, The Hong Kong Polytechnic University

Abstract

The estimation of high dimensional precision matrices has been a central topic in statistical learning. However, as the number of parameters scales quadratically with the dimension $p$, many state-of-the-art methods do not scale well to problems with a very large $p$. In this paper, we propose a very efficient algorithm for precision matrix estimation via penalized quadratic loss functions. Under the high dimension, low sample size setting, the computational complexity of our algorithm is linear in the sample size and the number of parameters, which is the same as that of computing the sample covariance matrix.

Keywords: ADMM; High dimension; Penalized quadratic loss; Precision matrix.

1 Introduction

The precision matrix plays an important role in statistical analysis. Under Gaussian assumptions, the precision matrix is often used to study the conditional independence among the random variables. Contemporary applications usually require fast methods for estimating a very high dimensional precision
matrix (Meinshausen and Bühlmann, 2006). Despite recent advances, estimation of the precision matrix remains challenging when the dimension $p$ is very large, owing to the fact that the number of parameters to be estimated is of order $O(p^2)$. For example, in the Prostate dataset we study in this paper, 6033 genetic activity measurements are recorded for 102 subjects. The precision matrix to be estimated is of dimension $6033 \times 6033$, resulting in more than 18 million parameters.

One of the most well-known and popular methods for precision matrix estimation is the graphical lasso (Yuan and Lin, 2007; Banerjee et al., 2008; Friedman et al., 2008). Without loss of generality, assume that $X_1, \ldots, X_n$ are i.i.d. observations from a $p$-dimensional Gaussian distribution with mean 0 and covariance matrix $\Sigma$. To estimate the precision matrix $\Omega = \Sigma^{-1}$, the graphical lasso solves the following $\ell_1$ regularized negative log-likelihood:

$$\arg\min_{\Omega \in \mathcal{O}_p} \ \mathrm{tr}(S\Omega) - \log|\Omega| + \lambda\|\Omega\|_1, \qquad (1)$$

where $\mathcal{O}_p$ is the set of non-negative definite matrices in $\mathbb{R}^{p\times p}$, $S$ is the sample covariance matrix, $\|\Omega\|_1$ denotes the sum of the absolute values of $\Omega$, and $\lambda \ge 0$ is the tuning parameter. Although (1) is constructed from the Gaussian likelihood, it is known that the graphical lasso also works for non-Gaussian data (Ravikumar et al., 2011).

Many algorithms have been developed to solve the graphical lasso. Friedman et al. (2008) proposed a coordinate descent procedure and Boyd et al. (2011) provided an alternating direction method of multipliers (ADMM) algorithm for solving (1). In order to obtain faster convergence of the iterations, second order methods and proximal gradient algorithms on the dual problem have also been developed; see for example Hsieh et al. (2014), Dalal and Rajaratnam (2017), and the references therein. However, since the graphical lasso involves the matrix determinant, an eigen-decomposition or the calculation of the determinant of a $p \times p$ matrix is inevitable in these algorithms. Note that the computational complexity of an eigen-decomposition or a matrix determinant is of order $O(p^3)$. Thus, the computation time of these algorithms scales cubically in $p$.

Recently, Zhang and Zou (2014) and Liu and Luo (2015) proposed to estimate $\Omega$ by trace based quadratic loss functions. Using the Kronecker product and matrix vectorization, our interest is to estimate

$$\mathrm{vec}(\Omega) = \mathrm{vec}(\Sigma^{-1}) = (\Sigma \otimes I)^{-1}\mathrm{vec}(I), \qquad (2)$$

or equivalently,

$$\mathrm{vec}(\Omega) = \mathrm{vec}(\Sigma^{-1}) = (I \otimes \Sigma)^{-1}\mathrm{vec}(I), \qquad (3)$$

where $I$ denotes the identity matrix, $\otimes$ denotes the Kronecker product, and the equalities follow from the identity $\mathrm{vec}(AXB) = (B^T\otimes A)\mathrm{vec}(X)$ applied to $\Omega\Sigma = I$ and $\Sigma\Omega = I$.
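To make the vectorized identities concrete, the following is a small numerical check; it is our own illustration in Python/NumPy (not part of the paper), and all variable names are ours.

```python
# Sanity check of identities (2) and (3) on a small positive definite covariance.
import numpy as np

rng = np.random.default_rng(1)
p = 5
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)          # a well-conditioned covariance matrix
Omega = np.linalg.inv(Sigma)             # the precision matrix
I = np.eye(p)
vec = lambda M: M.flatten(order="F")     # column-stacking vec operator

lhs = vec(Omega)
rhs2 = np.linalg.solve(np.kron(Sigma, I), vec(I))      # identity (2)
rhs3 = np.linalg.solve(np.kron(I, Sigma), vec(I))      # identity (3)
print(np.allclose(lhs, rhs2), np.allclose(lhs, rhs3))  # True True
```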
Motivated by (2) and the LASSO (Tibshirani, 1996), a natural way to estimate $\beta = \mathrm{vec}(\Omega)$ is

$$\arg\min_{\beta\in\mathbb{R}^{p^2}} \ \frac{1}{2}\beta^T(S\otimes I)\beta - \beta^T\mathrm{vec}(I) + \lambda\|\beta\|_1. \qquad (4)$$

To obtain a symmetric estimator we can use both (2) and (3), and estimate $\beta$ by

$$\arg\min_{\beta\in\mathbb{R}^{p^2}} \ \frac{1}{4}\beta^T(S\otimes I + I\otimes S)\beta - \beta^T\mathrm{vec}(I) + \lambda\|\beta\|_1. \qquad (5)$$

In matrix notation, writing $\beta = \mathrm{vec}(\Omega)$, (4) can be written as

$$\hat\Omega_1 = \arg\min_{\Omega} \ \frac{1}{2}\mathrm{tr}(\Omega S\Omega^T) - \mathrm{tr}(\Omega) + \lambda\|\Omega\|_1. \qquad (6)$$

Symmetrization can then be applied to obtain a final estimator. The loss function

$$L_1(\Omega) := \frac{1}{2}\mathrm{tr}(\Omega S\Omega^T) - \mathrm{tr}(\Omega) \qquad (7)$$

is used in Liu and Luo (2015), where a column-wise estimation approach called SCIO is proposed. Similarly, the matrix form of (5) is

$$\hat\Omega_2 = \arg\min_{\Omega} \ \frac{1}{4}\{\mathrm{tr}(\Omega^T S\Omega) + \mathrm{tr}(\Omega S\Omega^T)\} - \mathrm{tr}(\Omega) + \lambda\|\Omega\|_1. \qquad (8)$$

The loss function

$$L_2(\Omega) = \frac{1}{2}\{L_1(\Omega) + L_1(\Omega^T)\} = \frac{1}{4}\{\mathrm{tr}(\Omega S\Omega^T) + \mathrm{tr}(\Omega^T S\Omega)\} - \mathrm{tr}(\Omega)$$

is equivalent to the D-trace loss proposed by Zhang and Zou (2014), owing to the fact that $L_2(\Omega)$ naturally forces the solution to be symmetric. In the original papers, Zhang and Zou (2014) and Liu and Luo (2015) established consistency results for the estimators (6) and (8) and showed that their performance is comparable to that of the graphical lasso.

As can be seen from the vectorized formulations (4) and (5), the loss functions $L_1(\Omega)$ and $L_2(\Omega)$ are quadratic in $\beta$. In this note, we show that under such quadratic loss functions, explicit solutions can be obtained in each step of the ADMM algorithm. In particular, we derive explicit formulas for the inverses of $(S\otimes I + \rho I)$ and $\{2^{-1}(S\otimes I + I\otimes S) + \rho I\}$ for any given $\rho > 0$, from which we are able to solve (6) and (8), or equivalently (4) and (5), with computational complexity of order $O(np^2)$. Such a rate is in some sense optimal, as the complexity of computing $S$ is also of order $O(np^2)$.
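The equivalence between the vectorized objective (4) and the matrix form (6), up to the identical penalty term, can also be checked numerically. The snippet below is our own sketch (not from the paper); names are ours.

```python
# Check that 0.5*beta'(S (x) I)beta - beta'vec(I) equals
# 0.5*tr(Omega S Omega^T) - tr(Omega) when beta = vec(Omega).
import numpy as np

rng = np.random.default_rng(2)
n, p = 8, 5
X = rng.standard_normal((n, p))
S = X.T @ X / n
Omega = rng.standard_normal((p, p))
I = np.eye(p)
beta = Omega.flatten(order="F")          # column-stacking vec(Omega)

vec_form = 0.5 * beta @ np.kron(S, I) @ beta - beta @ I.flatten(order="F")
mat_form = 0.5 * np.trace(Omega @ S @ Omega.T) - np.trace(Omega)
print(np.allclose(vec_form, mat_form))   # True
```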
2 Main Results

For any real matrix $M$, we use $\|M\|_2 = \{\mathrm{tr}(MM^T)\}^{1/2}$ to denote its Frobenius norm, and $\|M\|$ to denote its spectral norm, i.e., the square root of the largest eigenvalue of $M^TM$. For the quadratic loss function $L(\Omega) = L_1(\Omega)$ or $L_2(\Omega)$, the augmented Lagrangian is

$$L_a(\Omega, A, B) = L(\Omega) + \frac{\rho}{2}\|\Omega - A + B\|_2^2 + \lambda\|A\|_1,$$

where $\rho > 0$ is the step size of the ADMM. By Boyd et al. (2011), the alternating iterations are

$$\Omega^{k+1} = \arg\min_{\Omega} L_a(\Omega, A^k, B^k),$$
$$A^{k+1} = \arg\min_{A} L_a(\Omega^{k+1}, A, B^k) = \mathrm{soft}(\Omega^{k+1} + B^k, \lambda/\rho),$$
$$B^{k+1} = \Omega^{k+1} - A^{k+1} + B^k,$$

where $\mathrm{soft}(a, \lambda)$ is the element-wise soft thresholding operator. Clearly the computational cost is dominated by the update of $\Omega^{k+1}$, which amounts to solving the following problem:

$$\arg\min_{\Omega\in\mathbb{R}^{p\times p}} \ L(\Omega) + \frac{\rho}{2}\|\Omega - A^k + B^k\|_2^2.$$

From the convexity of the objective function, the solution satisfies

$$L'(\Omega) + \rho(\Omega - A^k + B^k) = 0.$$

Consequently, for the estimators (6) and (8), we need to solve the following equations, respectively:

$$S\Omega + \rho\Omega = I + \rho(A^k - B^k), \qquad (9)$$
$$2^{-1}(S\Omega + \Omega S) + \rho\Omega = I + \rho(A^k - B^k). \qquad (10)$$

By the vectorized formulations (4) and (5), in order to solve (9) and (10) we need to compute the inverses of $(S\otimes I + \rho I)$ and $\{2^{-1}(S\otimes I + I\otimes S) + \rho I\}$. The following proposition provides explicit expressions for these inverses.

Proposition 1. Write the decomposition of $S$ as $S = U\Lambda U^T$, where $U \in \mathbb{R}^{p\times m}$, $m = \min(n, p)$, $U^TU = I_m$, and $\Lambda = \mathrm{diag}\{\tau_1, \ldots, \tau_m\}$ with $\tau_1, \ldots, \tau_m \ge 0$. For any $\rho > 0$, we have

$$(S\otimes I + \rho I)^{-1} = \rho^{-1}I - \rho^{-1}(U\Lambda_1U^T)\otimes I,$$

and

$$\{2^{-1}(S\otimes I + I\otimes S) + \rho I\}^{-1} = \rho^{-1}I - \rho^{-1}(U\Lambda_2U^T)\otimes I - \rho^{-1}I\otimes(U\Lambda_2U^T) + \rho^{-1}(U\otimes U)\mathrm{diag}\{\mathrm{vec}(\Lambda_3)\}(U^T\otimes U^T),$$

where $\Lambda_1 = \mathrm{diag}\{\tau_1/(\tau_1+\rho), \ldots, \tau_m/(\tau_m+\rho)\}$, $\Lambda_2 = \mathrm{diag}\{\tau_1/(\tau_1+2\rho), \ldots, \tau_m/(\tau_m+2\rho)\}$, and $\Lambda_3 = \left(\dfrac{\tau_i\tau_j(\tau_i+\tau_j+4\rho)}{(\tau_i+2\rho)(\tau_j+2\rho)(\tau_i+\tau_j+2\rho)}\right)_{m\times m}$.
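The two closed-form inverses can be verified numerically on a small example. The script below is our own check in Python/NumPy (not from the paper), with the diagonal matrices $\Lambda_1$, $\Lambda_2$ and the Hadamard matrix $\Lambda_3$ built exactly as defined above; the dimensions are kept tiny so that the $p^2 \times p^2$ Kronecker matrices can be inverted directly for comparison.

```python
# Numerical check of the two inverse formulas in Proposition 1.
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 4, 6, 0.7
X = rng.standard_normal((n, p))
S = X.T @ X / n                                   # rank <= n < p
m = min(n, p)

# S = U diag(tau) U^T with U (p x m), obtained from the SVD of X / sqrt(n)
_, d, vt = np.linalg.svd(X / np.sqrt(n), full_matrices=False)
U, tau = vt.T, d ** 2
Ip, I2 = np.eye(p), np.eye(p * p)

# First identity: (S (x) I + rho I)^{-1}
lam1 = tau / (tau + rho)
lhs1 = np.linalg.inv(np.kron(S, Ip) + rho * I2)
rhs1 = (I2 - np.kron((U * lam1) @ U.T, Ip)) / rho
print(np.allclose(lhs1, rhs1))                    # True

# Second identity: (2^{-1}(S (x) I + I (x) S) + rho I)^{-1}
lam2 = tau / (tau + 2 * rho)
L3 = (tau[:, None] * tau[None, :] * (tau[:, None] + tau[None, :] + 4 * rho)
      / ((tau[:, None] + 2 * rho) * (tau[None, :] + 2 * rho)
         * (tau[:, None] + tau[None, :] + 2 * rho)))
lhs2 = np.linalg.inv(0.5 * (np.kron(S, Ip) + np.kron(Ip, S)) + rho * I2)
rhs2 = (I2 - np.kron((U * lam2) @ U.T, Ip) - np.kron(Ip, (U * lam2) @ U.T)
        + np.kron(U, U) @ np.diag(L3.flatten(order="F")) @ np.kron(U.T, U.T)) / rho
print(np.allclose(lhs2, rhs2))                    # True
```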
The main technique used to derive the above proposition is the well-known Woodbury matrix identity. The derivation involves lengthy but elementary calculations and is hence omitted. Nevertheless, one can verify the proposition directly by showing, for example, that

$$(S + \rho I)(\rho^{-1}I - \rho^{-1}U\Lambda_1U^T) = (\rho^{-1}I - \rho^{-1}U\Lambda_1U^T)(S + \rho I) = I.$$

By Proposition 1 and the following identities:

$$\{(U\Lambda U^T)\otimes I\}\mathrm{vec}(C) = \mathrm{vec}(CU\Lambda U^T),$$
$$\{I\otimes(U\Lambda U^T)\}\mathrm{vec}(C) = \mathrm{vec}(U\Lambda U^TC),$$
$$(U\otimes U)\mathrm{diag}\{\mathrm{vec}(\Lambda_3)\}(U^T\otimes U^T)\mathrm{vec}(C) = \mathrm{vec}(U\{\Lambda_3\circ(U^TCU)\}U^T),$$

where $\circ$ denotes the Hadamard product, we obtain the following theorem.

Theorem 1. For a given $\rho > 0$, (i) the solution to the equation $S\Omega + \rho\Omega = C$ is unique and is given by

$$\Omega = \rho^{-1}C - \rho^{-1}U\Lambda_1U^TC;$$

(ii) the solution to the equation $2^{-1}(S\Omega + \Omega S) + \rho\Omega = C$ is unique and is given by

$$\Omega = \rho^{-1}C - \rho^{-1}CU\Lambda_2U^T - \rho^{-1}U\Lambda_2U^TC + \rho^{-1}U\{\Lambda_3\circ(U^TCU)\}U^T.$$

Note that when $S$ is the sample covariance matrix, $U$ and $\Lambda$ can be obtained from the singular value decomposition (SVD) of $X = (X_1, \ldots, X_n)$, whose complexity is of order $O(mnp)$. On the other hand, the solutions in Theorem 1 involve only elementary operations on $p \times m$ and $m \times m$ matrices, so the complexity of solving (9) and (10) is of order $O(mnp + mp^2) = O(np^2)$.

Remark 1. More generally, we can specify a different weight for each element and consider the estimators

$$\hat\Omega_k = \arg\min_{\Omega} L_k(\Omega) + \lambda\|W\circ\Omega\|_1, \qquad k = 1, 2,$$

where $W = (w_{ij})_{p\times p}$ with $w_{ij} \ge 0$. For example:

- setting $w_{ii} = 0$ and $w_{ij} = 1$ for $i \ne j$ leaves the diagonal elements unpenalized;
- using the local linear approximation (Zou and Li, 2008), we can set $W = \{p_\lambda'(\hat\Omega_{ij})\}_{p\times p}$, where $\hat\Omega = (\hat\Omega_{ij})_{p\times p}$ is a LASSO solution and $p_\lambda(\cdot)$ is a general penalty function such as SCAD or MCP.

The ADMM algorithm is the same as in the $\ell_1$ penalized case except that the $A^{k+1}$ update is replaced by element-wise soft thresholding with different thresholding parameters.
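Putting the pieces together, a minimal sketch of the resulting ADMM iteration for the non-symmetric loss (6) might look as follows. This is our own illustrative Python/NumPy implementation, not the authors' EQUAL package; the function names, default step size, and fixed iteration count are assumptions.

```python
# ADMM sketch for min_Omega 0.5*tr(Omega S Omega^T) - tr(Omega) + lam*||Omega||_1,
# with the Omega-update computed in closed form via Theorem 1(i).
import numpy as np

def soft_threshold(M, t):
    """Element-wise soft thresholding: soft(M, t) = sign(M) * max(|M| - t, 0)."""
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

def admm_quadratic_loss(X, lam, rho=1.0, n_iter=200):
    n, p = X.shape
    # S = X^T X / n = U diag(tau) U^T from the SVD of X / sqrt(n); U is p x m
    _, d, vt = np.linalg.svd(X / np.sqrt(n), full_matrices=False)
    U, tau = vt.T, d ** 2
    lam1 = tau / (tau + rho)            # diagonal of Lambda_1 in Theorem 1(i)
    I = np.eye(p)
    Omega, A, B = I.copy(), np.zeros((p, p)), np.zeros((p, p))
    for _ in range(n_iter):
        C = I + rho * (A - B)           # right-hand side of equation (9)
        # Theorem 1(i): solve S*Omega + rho*Omega = C using only p x m matrices
        Omega = (C - (U * lam1) @ (U.T @ C)) / rho
        A = soft_threshold(Omega + B, lam / rho)
        B = B + Omega - A
    return (A + A.T) / 2                # symmetrized sparse estimate
```

Each iteration costs $O(np^2)$, since it only forms products of $p \times m$ and $m \times p$ matrices; the symmetric loss (8) would use the update in Theorem 1(ii) instead.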
Remark 2. Compared with the ADMM algorithm given in Zhang and Zou (2014), our update of $\Omega^{k+1}$ only involves operations on $p \times m$ and $m \times m$ matrices, while operations on $p \times p$ matrices are required in Zhang and Zou (2014); see for example Theorem 1 therein. Consequently, we obtain the order $O(np^2)$ for these updates while Zhang and Zou (2014) require $O((n+p)p^2)$. Our algorithm thus scales much better when $n \ll p$. Similarly, for the graphical lasso, we can also use ADMM (Boyd et al., 2011) to carry out the minimization, where the loss function is $L(\Omega) = \mathrm{tr}(S\Omega) - \log|\Omega|$. The update for $\Omega^{k+1}$ is obtained by solving

$$\rho\Omega - \Omega^{-1} = \rho(A^k - B^k) - S.$$

Denoting the eigenvalue decomposition of $\rho(A^k - B^k) - S$ by $Q^T\Lambda_0Q$ with $\Lambda_0 = \mathrm{diag}\{a_1, \ldots, a_p\}$, we obtain the closed form solution

$$\Omega^{k+1} = Q^T\mathrm{diag}\left\{\frac{a_1 + \sqrt{a_1^2 + 4\rho}}{2\rho}, \ldots, \frac{a_p + \sqrt{a_p^2 + 4\rho}}{2\rho}\right\}Q.$$

Compared with the algorithm based on quadratic loss functions, the computational cost here is dominated by the eigenvalue decomposition of $p \times p$ matrices.
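For contrast, the closed-form $\Omega$-update of the graphical lasso ADMM described in Remark 2 can be sketched as follows; this is our own Python/NumPy illustration and the function name is hypothetical. It requires a full $p \times p$ eigen-decomposition, hence the $O(p^3)$ cost per iteration.

```python
# Closed-form Omega-update for the graphical lasso ADMM.
import numpy as np

def glasso_admm_omega_update(S, A, B, rho):
    """Solve rho*Omega - Omega^{-1} = rho*(A - B) - S in closed form."""
    M = rho * (A - B) - S
    a, Q = np.linalg.eigh(M)                          # M = Q diag(a) Q^T
    d = (a + np.sqrt(a ** 2 + 4 * rho)) / (2 * rho)   # roots of rho*x - 1/x = a_i
    return (Q * d) @ Q.T
```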
3 Simulations

In this section, we conduct several simulations to illustrate the efficiency and estimation accuracy of our proposed methods. We consider the following two precision matrices:

$$\Omega_1 = (0.5^{|i-j|})_{p\times p}, \qquad \Omega_2 = \Omega_1^{-1},$$

where $\Omega_1$ is an asymptotically sparse matrix with many small elements and $\Omega_2$ is a sparse (tridiagonal) matrix in which each column/row has at most three non-zero elements. For all of our simulations, we set the sample size $n = 200$ and generate the data $X_1, \ldots, X_n$ from $N(0, \Sigma)$ with $\Sigma = \Omega_1^{-1}$ or $\Omega_2^{-1}$. For notational convenience, we use the term EQUAL to denote our proposed Efficient admm algorithm via the QUAdratic Loss (4), and similarly, EQUALs to denote the estimator based on the symmetric quadratic loss (5). An R package EQUAL has been developed to implement our methods. For comparison, we consider the following competitors:

- CLIME (Cai et al., 2011), implemented in the R package fastclime;
- glasso (Friedman et al., 2008), implemented in the R package glasso;
- BigQuic (Hsieh et al., 2013), implemented in the R package BigQuic;
- glasso-ADMM, which solves the graphical lasso by ADMM (Boyd et al., 2011);
- SCIO (Liu and Luo, 2015), implemented in the R package scio;
- D-trace (Zhang and Zou, 2014), implemented using the ADMM algorithm provided in that paper.

Table 1 summarizes the computation times in seconds based on 10 replications, where all methods are implemented in R on a PC with a 3.3 GHz Intel Core i7 CPU and 16GB memory. For all the methods, we solve a solution path corresponding to 50 lambda values ranging from $\lambda_{\max}$ to $\lambda_{\max}\log p/n$, where $\lambda_{\max}$ is the maximum absolute element of the sample covariance matrix. Although the stopping criteria differ across methods, Table 1 clearly shows the computational advantage of our methods. In particular, our proposed algorithms are much faster than the original quadratic loss based methods SCIO and D-trace for large $p$. In addition, we can roughly observe that the required time increases quadratically in $p$ for our proposed algorithms.

The second simulation is designed to evaluate estimation accuracy. Given the true precision matrix $\Omega$ and an estimator $\hat\Omega$, we report the following four loss functions:

$$\mathrm{loss1} = \frac{1}{\sqrt{p}}\|\Omega - \hat\Omega\|_2, \qquad \mathrm{loss2} = \|\Omega - \hat\Omega\|,$$
$$\mathrm{loss3} = \frac{1}{p}\{\mathrm{tr}(\Omega^{-1}\hat\Omega) - \log|\Omega^{-1}\hat\Omega| - p\}, \qquad \mathrm{loss4} = \frac{1}{p}\{\mathrm{tr}(\hat\Omega^T\Omega^{-1}\hat\Omega)/2 - \mathrm{tr}(\hat\Omega) + \mathrm{tr}(\Omega)\},$$

where loss1 is the scaled Frobenius loss, loss2 is the spectral loss, loss3 is Stein's loss, which is related to the Gaussian likelihood, and loss4 is related to the quadratic loss. Table 2 reports the simulation results, where the tuning parameter is chosen by five-fold cross-validation. We can see that the performance of all three estimators is comparable, indicating that the penalized quadratic loss estimators are also reliable for high dimensional precision matrix estimation. We also observe that the estimator based on (5) has slightly smaller estimation error than that based on (4). As shown in Table 1, the computation of the quadratic loss estimators is much faster than that of glasso.
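For completeness, the four evaluation losses above can be computed as in the following sketch; this is our own Python/NumPy helper (not from the paper), and its name and interface are assumptions.

```python
# Compute loss1-loss4 for a true precision matrix Omega and an estimate Omega_hat.
import numpy as np

def evaluation_losses(Omega, Omega_hat):
    p = Omega.shape[0]
    Sigma = np.linalg.inv(Omega)                              # Omega^{-1}
    loss1 = np.linalg.norm(Omega - Omega_hat, "fro") / np.sqrt(p)
    loss2 = np.linalg.norm(Omega - Omega_hat, 2)              # spectral norm
    M = Sigma @ Omega_hat
    loss3 = (np.trace(M) - np.linalg.slogdet(M)[1] - p) / p   # Stein's loss
    loss4 = (np.trace(Omega_hat.T @ Sigma @ Omega_hat) / 2
             - np.trace(Omega_hat) + np.trace(Omega)) / p     # quadratic loss
    return loss1, loss2, loss3, loss4
```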
Table 1: The average computation time (standard deviation) of solving a solution path for the high dimensional precision matrix estimation.

p=100 p=200 p=400 p=800 p=1600

Sparse case where Ω = Ω_2
CLIME 0.388(0.029) 2.863(0.056) (0.341) (0.696) (7.953)
glasso 0.096(0.014) 0.562(0.080) 3.108(0.367) (1.739) (11.379)
BigQuic 2.150(0.068) 5.337(0.073) (0.326) (0.867) (2.376)
glasso-ADMM 0.909(0.018) 1.788(0.045) 5.419(0.137) (0.540) (3.993)
SCIO 0.044(0.001) 0.272(0.021) 1.766(0.032) (0.103) (1.325)
EQUAL 0.064(0.002) 0.343(0.006) 1.180(0.031) 4.846(0.112) (0.281)
D-trace 0.088(0.003) 0.504(0.010) 2.936(0.070) (0.598) (4.959)
EQUALs 0.113(0.003) 0.648(0.014) 1.723(0.039) 5.687(0.184) (0.373)

Asymptotic sparse case where Ω = Ω_1
CLIME 0.382(0.026) 2.605(0.081) (0.480) (1.198) (9.645)
glasso 0.052(0.007) 0.289(0.049) 1.467(0.324) 9.124(1.756) (7.386)
BigQuic 1.816(0.043) 4.134(0.033) (0.123) (0.247) (0.574)
glasso-ADMM 0.847(0.018) 1.636(0.032) 5.370(0.239) (0.518) (3.656)
SCIO 0.036(0.000) 0.253(0.028) 1.690(0.030) (0.181) (0.236)
EQUAL 0.031(0.001) 0.167(0.006) 0.614(0.029) 3.084(0.122) (0.129)
D-trace 0.037(0.001) 0.213(0.009) 1.438(0.072) (0.591) (4.898)
EQUALs 0.048(0.001) 0.273(0.011) 0.856(0.043) 3.518(0.107) (0.128)
Moreover, to check the singularity of the estimates, we also report the minimum eigenvalue of each estimate in the final column of Table 2. We observe that all the estimates are positive definite.

Finally, we apply our proposal to the Prostate dataset, which is publicly available at stanford.edu/~hastie/casi_files/data/prostate.html. The data record 6033 genetic activity measurements for the control group (50 subjects) and the prostate cancer group (52 subjects). Here, the data dimension p is 6033 and the sample size n is 50 or 52. We estimate the precision matrix for each group. Since EQUAL and EQUALs give similar results, we only report the EQUALs estimates. It took less than 20 minutes for EQUALs to obtain the solution paths, while glasso could not produce a solution because it ran out of memory in R. The sparsity levels of the solution paths are plotted in the upper panel of Figure 1. To compare the precision matrices between the two groups, the network graphs of the EQUALs estimators with tuning parameter λ = 0.75 are provided in the lower panel of Figure 1.

References

O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. The Journal of Machine Learning Research, 9(Mar), 2008.

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1-122, 2011.

T. Cai, W. Liu, and X. Luo. A constrained l1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494):594-607, 2011.

O. Dalal and B. Rajaratnam. Sparse Gaussian graphical model estimation via alternating minimization. Biometrika, 104(2), 2017.

J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432-441, 2008.
Table 2: The estimation error (standard deviation) for precision matrix estimation.

loss1 loss2 loss3 loss4 min-eigen

Sparse case where Ω = Ω_2
p = 500
EQUAL 0.508(0.009) 1.183(0.051) 0.273(0.004) 0.286(0.004) 0.555(0.018)
EQUALs 0.465(0.010) 1.122(0.050) 0.240(0.003) 0.254(0.004) 0.500(0.013)
glasso 0.554(0.008) 1.209(0.032) 0.232(0.002) 0.273(0.003) 0.298(0.011)
p = 1000
EQUAL 0.605(0.010) 1.323(0.039) 0.304(0.004) 0.326(0.005) 0.577(0.015)
EQUALs 0.550(0.008) 1.274(0.041) 0.267(0.003) 0.289(0.004) 0.527(0.014)
glasso 0.599(0.006) 1.280(0.026) 0.248(0.002) 0.292(0.002) 0.307(0.011)
p = 2000
EQUAL 0.555(0.007) 1.293(0.042) 0.291(0.003) 0.307(0.004) 0.560(0.016)
EQUALs 0.538(0.008) 1.250(0.037) 0.278(0.003) 0.293(0.004) 0.549(0.015)
glasso 0.640(0.005) 1.343(0.025) 0.263(0.001) 0.310(0.002) 0.318(0.010)

Asymptotic sparse case where Ω = Ω_1
p = 500
EQUAL 0.707(0.005) 2.028(0.016) 0.329(0.003) 0.344(0.003) 0.364(0.011)
EQUALs 0.664(0.010) 1.942(0.026) 0.297(0.005) 0.320(0.004) 0.333(0.011)
glasso 0.706(0.004) 2.019(0.010) 0.310(0.002) 0.335(0.001) 0.252(0.008)
p = 1000
EQUAL 0.701(0.008) 2.033(0.020) 0.331(0.004) 0.344(0.004) 0.361(0.012)
EQUALs 0.681(0.003) 1.983(0.013) 0.314(0.002) 0.331(0.002) 0.353(0.011)
glasso 0.728(0.003) 2.070(0.008) 0.321(0.002) 0.344(0.001) 0.261(0.008)
p = 2000
EQUAL 0.860(0.106) 2.351(0.190) 0.446(0.066) 0.426(0.049) 0.322(0.023)
EQUALs 0.666(0.020) 1.984(0.042) 0.317(0.008) 0.331(0.007) 0.348(0.011)
glasso 0.746(0.002) 2.109(0.007) 0.332(0.001) 0.353(0.001) 0.265(0.008)
Figure 1: Estimation for the Prostate data using EQUALs. (a) Solution path for control subjects; (b) solution path for prostate cancer subjects; (c) estimated networks for control subjects; (d) estimated networks for prostate cancer subjects. Upper panel: sparsity level (average number of non-zero elements for each row/column) versus λ. Lower panel: network graphs for the two patient groups when λ = 0.75.

C.-J. Hsieh, M. A. Sustik, I. S. Dhillon, P. K. Ravikumar, and R. Poldrack. BIG & QUIC: Sparse inverse covariance estimation for a million variables. In Advances in Neural Information Processing Systems, 2013.

C.-J. Hsieh, M. A. Sustik, I. S. Dhillon, and P. Ravikumar. QUIC: Quadratic approximation for sparse inverse covariance estimation. The Journal of Machine Learning Research, 15(1), 2014.
W. Liu and X. Luo. Fast and adaptive sparse precision matrix estimation in high dimensions. Journal of Multivariate Analysis, 135, 2015.

N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the lasso. Annals of Statistics, 34(3):1436-1462, 2006.

P. Ravikumar, M. J. Wainwright, G. Raskutti, and B. Yu. High-dimensional covariance estimation by minimizing l1-penalized log-determinant divergence. Electronic Journal of Statistics, 5, 2011.

R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267-288, 1996.

M. Yuan and Y. Lin. Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1):19-35, 2007.

T. Zhang and H. Zou. Sparse precision matrix estimation via lasso penalized D-trace loss. Biometrika, 101(1), 2014.

H. Zou and R. Li. One-step sparse estimates in nonconcave penalized likelihood models. Annals of Statistics, 36(4):1509-1533, 2008.