Inverse Covariance Estimation with Missing Data using the Concave-Convex Procedure


1 Inverse Covariance Estimation with Missing Data using the Concave-Convex Procedure
Jérôme Thai (1), Timothy Hunter (1), Anayo Akametalu (1), Claire Tomlin (1), Alex Bayen (1,2)
(1) Department of Electrical Engineering & Computer Sciences, University of California at Berkeley
(2) Department of Civil & Environmental Engineering, University of California at Berkeley
December 8

2 Outline
- Motivation: Gaussian Markov Random Fields
- Sparse estimator with missing data
- Concave-Convex Procedure
- Comparison with Expectation-Maximization algorithm
- Conclusion

4-5 Multivariate Gaussian & Gaussian Markov Random Fields

Definition: Multivariate Gaussian. A random vector $x = (X_1, \dots, X_p)^T \in \mathbb{R}^p$ is a multivariate Gaussian with mean $\mu$ and inverse covariance matrix $Q$ if its density is
$$\phi(x \mid \mu, Q^{-1}) = (2\pi)^{-p/2} \, |Q|^{1/2} \exp\left(-\tfrac{1}{2}(x-\mu)^T Q (x-\mu)\right)$$

Definition: Gaussian Markov Random Field. A random vector $x = (X_1, \dots, X_p)^T$ is a Gaussian Markov random field w.r.t. a graph $G = (V, E)$ if it is a multivariate Gaussian with
$$Q_{ij} = 0 \iff \{i,j\} \notin E, \ \forall\, i \neq j \iff x_i \perp x_j \mid x_{-ij}$$
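To make the second definition concrete, here is a minimal numpy sketch (not from the slides; the matrix values are invented for illustration) that reads the GMRF edge set off the zero pattern of a precision matrix:

```python
# Hypothetical 3x3 precision matrix: Q[0,2] = 0 encodes that X_1 and X_3
# are conditionally independent given X_2.
import numpy as np

Q = np.array([[ 2.0, -0.8,  0.0],
              [-0.8,  2.0, -0.5],
              [ 0.0, -0.5,  2.0]])

# Edges of the GMRF graph = off-diagonal nonzeros of Q.
edges = [(i, j) for i in range(3) for j in range(i + 1, 3)
         if not np.isclose(Q[i, j], 0.0)]
print(edges)  # [(0, 1), (1, 2)] -- no edge {0, 2}
```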

6-7 Applications of Gaussian Markov Random Fields (GMRF)

Find a sparse $\Sigma^{-1}$ for dependency patterns between biological factors.(1)

Sparse estimator for a GMRF from data $y_1, \dots, y_n \in \mathbb{R}^p$ ($\hat{\mu}$-centered):(2)(3)
$$\hat{Q} = \operatorname*{argmin}_{Q \succ 0} \; -\log|Q| + \operatorname{Tr}\Big(\big(\tfrac{1}{n} \textstyle\sum_{j=1}^n y_j y_j^T\big)\, Q\Big) + \lambda \|Q\|_1$$

(1) Dobra. Variable selection and dependency networks for genomewide data. Biostatistics.
(2) Banerjee, El Ghaoui, d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. JMLR, 9, 2008.
(3) Friedman, Hastie, Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9, 2008.
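As a point of reference for this complete-data estimator, the following sketch fits it with scikit-learn's GraphicalLasso, one off-the-shelf solver for this l1-penalized program; the data model and the penalty weight `alpha` (playing the role of $\lambda$) are arbitrary choices for the example.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Sample complete data from an AR(1) covariance (Model 1 of the experiments).
rng = np.random.default_rng(0)
Sigma = 0.7 ** np.abs(np.subtract.outer(np.arange(10), np.arange(10)))
Y = rng.multivariate_normal(np.zeros(10), Sigma, size=200)

model = GraphicalLasso(alpha=0.1).fit(Y)  # centers the data internally
Q_hat = model.precision_                  # sparse estimate of Q
print((~np.isclose(Q_hat, 0.0)).sum())    # number of nonzero entries
```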

9-10 Maximum likelihood with missing data

Let $\mathrm{obs}_j$ denote the observed entries and $\mathrm{mis}_j$ the missing entries in each sample $y_j$.

Sparse estimator (maximum likelihood) with missing data:
$$\hat{Q} = \operatorname*{argmax}_{Q \succ 0} \; \sum_{j=1}^n \log \phi(y_{j,\mathrm{obs}} \mid \Sigma_{\mathrm{obs}_j}) - \lambda \|Q\|_1$$
where $\phi(\cdot \mid \Sigma_{\mathrm{obs}_j})$ is the density of the marginal Gaussian $\mathcal{N}(\mu_{\mathrm{obs}_j}, \Sigma_{\mathrm{obs}_j})$, with $\mu_{\mathrm{obs}_j}$, $\Sigma_{\mathrm{obs}_j}$ the subvector and submatrix over the entries $\mathrm{obs}_j$.
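For one sample, the term inside the sum can be evaluated directly from $Q$ by inverting and restricting to the observed entries; a minimal sketch of this (assuming the zero-mean case, with a hypothetical helper name) is:

```python
import numpy as np

def marginal_loglik(y, obs, Q):
    """log phi(y_obs | Sigma_obs) for x ~ N(0, Q^{-1}), obs an index array."""
    Sigma_obs = np.linalg.inv(Q)[np.ix_(obs, obs)]
    y_obs = y[obs]
    k = len(obs)
    _, logdet = np.linalg.slogdet(Sigma_obs)
    return -0.5 * (k * np.log(2 * np.pi) + logdet
                   + y_obs @ np.linalg.solve(Sigma_obs, y_obs))
```

The next slides show how to express $(\Sigma_{\mathrm{obs}_j})^{-1}$ in $Q$ without forming the full inverse.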

11-16 What is the dependency of $(\Sigma_{\mathrm{obs}_j})^{-1}$ on $Q$?

Modulo a permutation matrix $P_j$:
$$P_j y_j = \begin{bmatrix} y_{j,\mathrm{obs}} \\ y_{j,\mathrm{mis}} \end{bmatrix}, \qquad P_j Q P_j^T = \begin{bmatrix} Q_{\mathrm{obs}_j} & Q_{\mathrm{obs}_j \mathrm{mis}_j} \\ Q_{\mathrm{mis}_j \mathrm{obs}_j} & Q_{\mathrm{mis}_j} \end{bmatrix}$$

We have the Schur complement w.r.t. $Q_{\mathrm{obs}_j}$:
$$S_j(Q) := (\Sigma_{\mathrm{obs}_j})^{-1} = Q_{\mathrm{obs}_j} - Q_{\mathrm{obs}_j \mathrm{mis}_j} Q_{\mathrm{mis}_j}^{-1} Q_{\mathrm{mis}_j \mathrm{obs}_j}$$

Explicit formulation of the sparse estimator with missing data:(4)(5)
$$\hat{Q} = \operatorname*{argmin}_{Q \succ 0} \; \sum_{j=1}^n \left\{ -\log|S_j(Q)| + \operatorname{Tr}\big(y_{j,\mathrm{obs}} y_{j,\mathrm{obs}}^T S_j(Q)\big) \right\} + \lambda \|Q\|_1$$

(4) Kolar and Xing. Estimating sparse precision matrices from data with missing values. ICML, 2012.
(5) Städler and Bühlmann. Missing values: sparse inverse covariance estimation and an extension to sparse regression. Statistics and Computing, 2012.
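A hedged numpy sketch of $S_j(Q)$ (the function name is ours, not the authors'), with a numerical check that the Schur complement indeed equals the inverse marginal covariance:

```python
import numpy as np

def schur_observed(Q, obs):
    """S(Q) = Q_oo - Q_om Q_mm^{-1} Q_mo for a given observed index set."""
    mis = np.setdiff1d(np.arange(Q.shape[0]), obs)
    Q_oo = Q[np.ix_(obs, obs)]
    Q_om = Q[np.ix_(obs, mis)]
    Q_mm = Q[np.ix_(mis, mis)]
    return Q_oo - Q_om @ np.linalg.solve(Q_mm, Q_om.T)

# Check on a random positive definite precision matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q = A @ A.T + 5 * np.eye(5)
obs = np.array([0, 2, 4])
Sigma_obs = np.linalg.inv(Q)[np.ix_(obs, obs)]
assert np.allclose(schur_observed(Q, obs), np.linalg.inv(Sigma_obs))
```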

17-19 Problem statement
- Inverse covariance estimation with missing data has a complicated non-convex objective.
- We design a novel application of the Concave-Convex Procedure to solve our program.
- It achieves better theoretical and experimental convergence than the Expectation-Maximization algorithm.

21-24 Difference of convex programs

Our program: $\min f(Q) - g(Q)$ s.t. $Q \succ 0$, with
$$f(Q) = -\sum_{j=1}^n \log|S_j(Q)| + \lambda \|Q\|_1, \qquad g(Q) = -\sum_{j=1}^n \operatorname{Tr}\big(y_{j,\mathrm{obs}} y_{j,\mathrm{obs}}^T S_j(Q)\big)$$

Why are both $f$ and $g$ convex?

Lemma. The function $Q \mapsto S(Q)$ is concave on the set of positive definite matrices.

Proposition. The function $Q \mapsto \log|S(Q)|$ is concave on the set of positive definite matrices.
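The Lemma can be spot-checked numerically (a sanity check on random inputs, not a proof): midpoint concavity in the positive semidefinite order means $S\big(\tfrac{Q_1+Q_2}{2}\big) - \tfrac{1}{2}\big(S(Q_1) + S(Q_2)\big) \succeq 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, obs = 6, np.array([0, 1, 3])

def rand_pd(p):
    A = rng.standard_normal((p, p))
    return A @ A.T + p * np.eye(p)

def schur_observed(Q, obs):
    mis = np.setdiff1d(np.arange(Q.shape[0]), obs)
    return (Q[np.ix_(obs, obs)]
            - Q[np.ix_(obs, mis)] @ np.linalg.solve(Q[np.ix_(mis, mis)],
                                                    Q[np.ix_(mis, obs)]))

Q1, Q2 = rand_pd(p), rand_pd(p)
gap = (schur_observed((Q1 + Q2) / 2, obs)
       - (schur_observed(Q1, obs) + schur_observed(Q2, obs)) / 2)
# Concavity in the PSD order: the gap must be positive semidefinite.
assert np.linalg.eigvalsh(gap).min() >= -1e-10
```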

25-27 Review of the Concave-Convex Procedure

Definition: Difference of Convex (DC) program. Let $f, g$ be two convex functions and $\mathcal{X} \subseteq \mathbb{R}^n$ a convex set; a Difference of Convex (DC) program has the form
$$\min f(x) - g(x) \quad \text{s.t.} \quad x \in \mathcal{X}$$

Concave-convex procedure (CCCP) to solve DC programs: at $x^t$, solve the convex approximation obtained by linearizing $g$ at $x^t$:
$$x^{t+1} = \operatorname*{argmin}_{x \in \mathcal{X}} \; f(x) - g(x^t) - \nabla g(x^t)^T (x - x^t)$$

Proposition. CCCP is a descent method: $f(x^{t+1}) - g(x^{t+1}) \le f(x^t) - g(x^t)$.(6)

(6) Yuille and Rangarajan. The Concave-Convex Procedure. Neural Computation, 2003.
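To illustrate the update rule, here is a toy CCCP run on a one-dimensional DC program (our own example, unrelated to the paper's estimator): $f(x) = x^4$ and $g(x) = 2x^2$, so $f - g$ is the nonconvex double well $x^4 - 2x^2$ with minima at $x = \pm 1$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: x**4
g = lambda x: 2 * x**2
grad_g = lambda x: 4 * x

x = 0.1  # initial point
for t in range(50):
    # Convex surrogate: f(x) - g(x_t) - grad_g(x_t) * (x - x_t).
    surrogate = lambda z, xt=x: f(z) - g(xt) - grad_g(xt) * (z - xt)
    x_new = minimize_scalar(surrogate).x
    if abs(x_new - x) < 1e-10:
        break
    x = x_new

print(x, f(x) - g(x))  # approaches the stationary point x = 1, value -1
```

Each iterate satisfies the descent property above; for this example the inner problem even has the closed form $x^{t+1} = (x^t)^{1/3}$.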

28-31 Illustration of the Concave-Convex Procedure (CCCP) [figure-only slides]

33-35 Expectation-Maximization (EM) algorithm

Our estimate:
$$\hat{Q} = \operatorname*{argmin}_{Q \succ 0} \; -\sum_{j=1}^n \log \phi(y_{j,\mathrm{obs}} \mid \Sigma_{\mathrm{obs}_j}) + \lambda \|Q\|_1$$
where $\phi$ is the density of the marginal distribution over $\mathrm{obs}_j$. This is difficult because
$$(\Sigma_{\mathrm{obs}_j})^{-1} = S_j(Q) = Q_{\mathrm{obs}_j} - Q_{\mathrm{obs}_j \mathrm{mis}_j} Q_{\mathrm{mis}_j}^{-1} Q_{\mathrm{mis}_j \mathrm{obs}_j}$$

The Expectation-Maximization (EM) algorithm updates $Q^t$ in two steps.
E-step: complete each sample using $x_{j,\mathrm{mis}} \mid \{x_{j,\mathrm{obs}} = y_{j,\mathrm{obs}},\, Q = Q^t\}$ in expectation:
$$\hat{\Sigma}^t = \sum_j \mathbb{E}_{x_{j,\mathrm{mis}} \mid x_{j,\mathrm{obs}} = y_{j,\mathrm{obs}}}\big[y_j y_j^T\big]$$
M-step:
$$Q^{t+1} = \operatorname*{argmin}_{Q \succ 0} \; -\sum_j \mathbb{E}_{x_{j,\mathrm{mis}} \mid x_{j,\mathrm{obs}} = y_{j,\mathrm{obs}}} \log \phi(x_j \mid Q^{-1}) + \lambda \|Q\|_1 = \operatorname*{argmin}_{Q \succ 0} \; -\log|Q| + \operatorname{Tr}(\hat{\Sigma}^t Q) + \lambda \|Q\|_1$$
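A hedged sketch of the Gaussian E-step in the zero-mean case (function name and structure are ours): condition the missing coordinates on the observed ones under the current precision, and form the completed second moments.

```python
import numpy as np

def e_step_moments(y, obs, Q):
    """Return E[x] and E[x x^T] given x_obs = y[obs], for x ~ N(0, Q^{-1})."""
    p = Q.shape[0]
    mis = np.setdiff1d(np.arange(p), obs)
    x_hat = np.zeros(p)
    x_hat[obs] = y[obs]
    if mis.size:
        # Conditional law: mean -Q_mm^{-1} Q_mo y_obs, covariance Q_mm^{-1}.
        cond_cov = np.linalg.inv(Q[np.ix_(mis, mis)])
        x_hat[mis] = -cond_cov @ Q[np.ix_(mis, obs)] @ y[obs]
    second = np.outer(x_hat, x_hat)
    if mis.size:
        second[np.ix_(mis, mis)] += cond_cov  # only the mis-mis block has spread
    return x_hat, second
```

Summing the `second` matrices over the samples gives the completed statistic $\hat{\Sigma}^t$ used in the M-step.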

36-38 EM algorithm as a Concave-Convex procedure

Proposition. For Gaussians, the EM algorithm is a CCCP with the DC decomposition
$$\min f_{EM}(Q) - g_{EM}(Q) \quad \text{s.t.} \quad Q \succ 0$$
$$f_{EM}(Q) = -\log|Q| + \lambda \|Q\|_1, \qquad g_{EM}(Q) = -\frac{1}{n} \sum_{j=1}^n \left\{ \log|Q_{\mathrm{mis}_j}| + \operatorname{Tr}\big(y_{j,\mathrm{obs}} y_{j,\mathrm{obs}}^T S_j(Q)\big) \right\}$$

If we set $h(Q) := -\frac{1}{n} \sum_{j=1}^n \log|Q_{\mathrm{mis}_j}|$:
EM decomposition: $f_{EM} - (h + g)$
Our decomposition: $(f_{EM} - h) - g$

Proposition. With our decomposition, the CCCP surrogate is a lower bound on the EM surrogate:
$$(f_{EM} - h) - \mathcal{L}g \le f_{EM} - (\mathcal{L}h + \mathcal{L}g)$$
where $\mathcal{L}$ denotes linearization at the current iterate: since the convex function $h$ lies above its linearization, $h \ge \mathcal{L}h$, hence $-h \le -\mathcal{L}h$.

39-42 Experimental setting

1. Generate n samples $y_1, \dots, y_n$ from $\mathcal{N}(0, \Sigma)$ for 3 models, with dimension p = 10, 50, 100 and n = 100, 150, 200 (see the sketch after this slide):(7)
   - Model 1 (AR(1)): $\Sigma_{ij} = 0.7^{|i-j|}$
   - Model 2: $\Sigma_{ij} = \mathbf{1}_{i=j} + 0.4 \cdot \mathbf{1}_{|i-j|=1} + 0.2 \cdot \mathbf{1}_{|i-j| \in \{2,3\}} + 0.1 \cdot \mathbf{1}_{|i-j|=4}$
   - Model 3: $\Sigma = B + \delta I$, where $B_{ij} = 0$ if $i = j$ and $B_{ij} \in \{0, 0.5\}$ with probability 0.5 each if $i \neq j$, with $\delta$ such that $\Sigma$ is positive definite
2. Remove 20, 40, 60, 80% of the data at random from each sample.
3. Impute $y_{j,\mathrm{mis}}$ by row means, giving complete sufficient statistics $\hat{\Sigma}$.
4. Initialization: $Q^0 = \operatorname*{argmin}_{Q \succ 0} -\log|Q| + \operatorname{Tr}(\hat{\Sigma} Q) + \lambda \|Q\|_1$

(7) Städler and Bühlmann. Missing values: sparse inverse covariance estimation and an extension to sparse regression. Statistics and Computing, 2012.
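A hedged numpy sketch of the three covariance models (the shift `delta` in Model 3 is our assumption, chosen here just large enough to make Sigma positive definite):

```python
import numpy as np

def model1(p, rho=0.7):
    """AR(1): Sigma_ij = rho^|i-j|."""
    lag = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return rho ** lag

def model2(p):
    """Banded: 1 on the diagonal, 0.4 at lag 1, 0.2 at lags 2-3, 0.1 at lag 4."""
    lag = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return (1.0 * (lag == 0) + 0.4 * (lag == 1)
            + 0.2 * ((lag == 2) | (lag == 3)) + 0.1 * (lag == 4))

def model3(p, seed=None):
    """Random: B_ij in {0, 0.5} w.p. 1/2 off the diagonal, shifted by delta*I."""
    rng = np.random.default_rng(seed)
    B = np.triu(0.5 * rng.integers(0, 2, size=(p, p)), 1)
    B = B + B.T  # symmetric with zero diagonal
    delta = max(0.0, -np.linalg.eigvalsh(B).min()) + 0.1  # assumed shift
    return B + delta * np.eye(p)

rng = np.random.default_rng(0)
Y = rng.multivariate_normal(np.zeros(10), model1(10), size=100)
```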

43 Numerical results on synthetic datasets [figure: four panels, missing = 20%, 40%, 60%, 80%]

44 Numerical results on real datasets [figure: four panels, missing = 20%, 40%, 60%, 80%]

45 Numerical results [figure]

47 Summary of contributions
- The Schur complement is log-concave.
- Hence the sparse inverse covariance estimator with missing data is a DC program.
- We derive a new CCCP for this sparse inverse covariance estimator.
- It shows superior convergence in the number of iterations.
- This is validated by numerical results on synthetic and real datasets.
