Probabilistic Graphical Models


1 School of Computer Science, Carnegie Mellon University
Probabilistic Graphical Models: Distributed ADMM for Gaussian Graphical Models
Yaoliang Yu, Lecture 29, April 29, 2015

2 Networks / Graphs

3 Where do graphs come from?
- Prior knowledge: "Mom told me A is connected to B."
- Estimated from data: we have seen this in previous classes, and we will see two more approaches today.
- Sometimes we may also be interested in the edge weights (an easier problem).
- Real networks are BIG and require distributed optimization.

4 Structural learning for completely observed MRFs (recall). Data:
$\{(x_1^{(1)}, \dots, x_n^{(1)}),\ (x_1^{(2)}, \dots, x_n^{(2)}),\ \dots,\ (x_1^{(M)}, \dots, x_n^{(M)})\}$

5 Gaussian Graphical Models
Multivariate Gaussian density:
$p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\{-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\}$
WLOG let $\mu = 0$ and write $Q = \Sigma^{-1}$ for the precision matrix. We can view this as a continuous Markov random field with potentials defined on every node and edge:
$p(x_1, \dots, x_n \mid \mu = 0, Q) = \frac{|Q|^{1/2}}{(2\pi)^{n/2}} \exp\{-\tfrac{1}{2} \textstyle\sum_i q_{ii} x_i^2 - \sum_{i<j} q_{ij} x_i x_j\}$
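
As a sanity check on this factorization (our illustration, not from the lecture), the following sketch compares the multivariate Gaussian log-density against the node/edge-potential form built from the precision matrix Q; all names are our own.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)   # a random positive-definite covariance
Q = np.linalg.inv(Sigma)          # precision matrix

x = rng.standard_normal(n)
logp = multivariate_normal(mean=np.zeros(n), cov=Sigma).logpdf(x)

# Pairwise MRF form: -1/2 * sum_i q_ii x_i^2 - sum_{i<j} q_ij x_i x_j + log-normalizer
quad = -0.5 * np.sum(np.diag(Q) * x**2)
for i in range(n):
    for j in range(i + 1, n):
        quad -= Q[i, j] * x[i] * x[j]
lognorm = 0.5 * np.linalg.slogdet(Q)[1] - 0.5 * n * np.log(2 * np.pi)
assert np.isclose(logp, quad + lognorm)
```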

6 The covariance and the precision matrices
- Covariance matrix $\Sigma$: what is its graphical model interpretation?
- Precision matrix $Q = \Sigma^{-1}$: what is its graphical model interpretation?

7 Sparse precision vs. sparse covariance in GGM
- Precision: $Q_{15} = 0 \iff X_1 \perp X_5 \mid X_{\mathrm{nbrs}(1)}$ (or $X_{\mathrm{nbrs}(5)}$), i.e., conditional independence given the neighbors.
- Covariance: $\Sigma_{15} = 0 \iff X_1 \perp X_5$, i.e., marginal independence.
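
To make this concrete (our illustration): for a chain graph, the precision matrix is tridiagonal, yet its inverse, the covariance, is dense.

```python
import numpy as np

# Tridiagonal (chain) precision matrix: only neighboring variables interact.
p = 5
Q = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma = np.linalg.inv(Q)

print(np.round(Q, 2))      # Q[0, 4] == 0: X1, X5 conditionally independent given the rest
print(np.round(Sigma, 2))  # Sigma[0, 4] != 0: X1 and X5 are still marginally correlated
```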

8 Another example: how do we estimate this MRF? What if p >> n? The MLE does not exist in general! What about learning only a sparse graphical model? This is possible when the number of edges s = o(n). Very often it is the structure of the GM that is the more interesting object anyway.

9 Recall: the lasso

10 Graph Regression (Meinshausen & Bühlmann 06): neighborhood selection via the lasso. Regress each variable on all the others with a lasso penalty, and read off its neighborhood from the nonzero coefficients.
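
A minimal sketch of neighborhood selection, assuming scikit-learn is available; the function name, data, and penalty level are placeholders of ours.

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(X, alpha=0.1):
    """Estimate the edge set by lasso-regressing each variable on the rest."""
    n, p = X.shape
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        rest = np.delete(np.arange(p), j)
        coef = Lasso(alpha=alpha).fit(X[:, rest], X[:, j]).coef_
        adj[j, rest] = coef != 0
    # Symmetrize with the "or" rule; the raw estimate is asymmetric (a con noted below)
    return adj | adj.T

X = np.random.default_rng(0).standard_normal((200, 10))
print(neighborhood_selection(X).sum() // 2, "edges")
```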

11 Graph Regression

12 Graph Regression
Pros: computationally convenient; strong theoretical guarantees (allows p <= poly(n)).
Cons: asymmetry (the estimate must be symmetrized); not minimax optimal.

13 The regularized MLE (Yuan & Lin 07):
$\min_Q\; -\log\det Q + \mathrm{tr}(QS) + \lambda \|Q\|_1$
- $S$: the sample covariance matrix; may be singular.
- $\|Q\|_1$: may exclude the diagonal.
- $-\log\det Q$: implicitly forces $Q$ to be positive definite and symmetric.
Pros: a single step estimates both the graph and the inverse covariance, and it is a (regularized) MLE!
Cons: computationally challenging, partly solved by glasso (Banerjee et al. 08, Friedman et al. 08).
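
A minimal usage sketch with scikit-learn's implementation of the graphical lasso; the data and penalty level here are placeholders of ours.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

X = np.random.default_rng(0).standard_normal((200, 10))
model = GraphicalLasso(alpha=0.2).fit(X)
Q_hat = model.precision_                 # estimated inverse covariance
edges = np.abs(Q_hat) > 1e-8             # estimated graph: nonzero off-diagonals
np.fill_diagonal(edges, False)
print(edges.sum() // 2, "edges")
```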

14 Many, many follow-ups.

15 A closer look at the RMLE:
$\min_Q\; -\log\det Q + \mathrm{tr}(QS) + \lambda \|Q\|_1$
Setting the derivative to 0: $-Q^{-1} + S + \lambda\,\mathrm{sign}(Q) = 0$, so at the optimum $\|Q^{-1} - S\|_\infty \le \lambda$.
Can we then solve (?!):
$\min_Q \|Q\|_1 \quad \text{s.t.} \quad \|Q^{-1} - S\|_\infty \le \lambda$

16 CLIME (Cai et al. 11): a further relaxation
$\min_Q \|Q\|_1 \quad \text{s.t.} \quad \|SQ - I\|_\infty \le \lambda$
- The constraint controls $Q \approx S^{-1}$; the objective controls the sparsity of $Q$.
- $Q$ is not required to be PSD or symmetric.
- Separable! Both the objective and the constraint are element-wise separable.
- An LP!!! The problem can be reformulated as a linear program.
- Strong theoretical guarantees; variations are minimax-optimal (Cai et al. 12, Liu & Wang 12).

17 But for BIG problems:
$\min_Q \|Q\|_1 \quad \text{s.t.} \quad \|SQ - I\|_\infty \le \lambda$
- Standard LP solvers can be slow.
- Embarrassingly parallel: solve for each column of $Q$ independently on each core/machine,
$\min_{q_i} \|q_i\|_1 \quad \text{s.t.} \quad \|S q_i - e_i\|_\infty \le \lambda$
thanks to not having a PSD constraint on $Q$ (see the LP sketch below).
- Still troublesome if $S$ is big; we need to consider first-order methods.
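
A sketch of one column subproblem as an LP via the standard split $q = q^+ - q^-$, assuming scipy is available; the ridge term and penalty level are our choices for the demo.

```python
import numpy as np
from scipy.optimize import linprog

def clime_column(S, i, lam):
    """Solve min ||q||_1 s.t. ||S q - e_i||_inf <= lam as an LP with q = qp - qm."""
    p = S.shape[0]
    e = np.zeros(p); e[i] = 1.0
    c = np.ones(2 * p)                    # minimize sum(qp) + sum(qm) = ||q||_1
    # S(qp - qm) - e <= lam   and   e - S(qp - qm) <= lam
    A_ub = np.block([[S, -S], [-S, S]])
    b_ub = np.concatenate([lam + e, lam - e])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    return res.x[:p] - res.x[p:]

X = np.random.default_rng(0).standard_normal((50, 8))
S = np.cov(X, rowvar=False) + 0.1 * np.eye(8)   # keep S invertible for the demo
print(np.round(clime_column(S, 0, lam=0.3), 3))
```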

18 A gentle introduction to the alternating direction method of multipliers (ADMM)

19 Optimization with coupled variables (uncoupled ☺ vs. coupled ☹). Canonical form:
$\min_{w,z}\; f(w) + g(z) \quad \text{s.t.}\quad Aw + Bz = c,$
where $w \in \mathbb{R}^m$, $z \in \mathbb{R}^p$, $A: \mathbb{R}^m \to \mathbb{R}^q$, $B: \mathbb{R}^p \to \mathbb{R}^q$, $c \in \mathbb{R}^q$.
Numerically challenging because:
- $f$ or $g$ may be nonsmooth or constrained (i.e., may take the value $\infty$);
- the linear constraint couples the variables $w$ and $z$;
- at large scale, interior-point methods are not applicable.
Naively alternating over $w$ and $z$ does not work. Example: $\min w^2$ s.t. $w + z = 1$; the optimum is clearly $w = 0$. But starting from, say, $w = 1$: $z = 0 \to w = 1 \to z = 0 \to \dots$, and the iteration never moves. Without the coupling, however, $w$ and $z$ could be solved for separately. Idea: try to decouple the variables in the constraint!

20 Example: Empirical Risk Minimization (ERM)
$\min_w\; n\,g(w) + \sum_{i=1}^n f_i(w)$
Each $i$ corresponds to a training point $(x_i, y_i)$, and the loss $f_i$ measures the fit of the model parameter $w$:
- least squares: $f_i(w) = (y_i - w^\top x_i)^2$
- support vector machines: $f_i(w) = (1 - y_i w^\top x_i)_+$
- boosting: $f_i(w) = \exp(-y_i w^\top x_i)$
- logistic regression: $f_i(w) = \log(1 + \exp(-y_i w^\top x_i))$
$g$ is the regularization function, e.g. $\|w\|_2^2$ or $\|w\|_1$. The variables are coupled in the objective, but there is no constraint. Reformulate: transfer the coupling from the objective to the constraint, arriving at the canonical form and allowing a unified treatment later.

21 Why the canonical form?
ERM: $\min_w\; n\,g(w) + \sum_{i=1}^n f_i(w)$. Canonical form: $\min_{w,z}\; f(w) + g(z)$ s.t. $Aw + Bz = c$.
- The ADMM algorithm (to be introduced shortly) excels at solving the canonical form.
- The canonical form is a general template for constrained problems.
- ERM (and many other problems) can be converted to canonical form through variable duplication (see the next slide).

22 How to: variable duplication
Duplicate variables to achieve the canonical form; all copies $w_i$ must (eventually) agree. Downside: many extra variables, which increases the problem size.
$\min_w\; n\,g(w) + \sum_{i=1}^n f_i(w) \;\;\leadsto\;\; \min_{v,z}\; \underbrace{\textstyle\sum_i f_i(w_i)}_{f(v)} + g(z) \quad \text{s.t.}\; \underbrace{v - [I, \dots, I]^\top z = 0}_{w_i = z,\ \forall i}$
where $v = [w_1, \dots, w_n]^\top$ implicitly maintains the duplicated variables. This is the global consensus constraint: $\forall i,\ w_i = z$.

23 Augmented Lagrangian
Canonical form: $\min_{w,z}\; f(w) + g(z)$ s.t. $Aw + Bz = c$, where $w \in \mathbb{R}^m$, $z \in \mathbb{R}^p$, $A: \mathbb{R}^m \to \mathbb{R}^q$, $B: \mathbb{R}^p \to \mathbb{R}^q$, $c \in \mathbb{R}^q$.
Introduce a Lagrange multiplier $\lambda$ to decouple the variables:
$L_\mu(w, z; \lambda) = f(w) + g(z) + \lambda^\top(Aw + Bz - c) + \frac{\mu}{2}\|Aw + Bz - c\|_2^2$
$L_\mu$ is the augmented Lagrangian. We obtain a more complicated min-max problem, but with no coupling constraints.

24 Why the augmented Lagrangian?
The quadratic term $\frac{\mu}{2}\|Aw + Bz - c\|_2^2$ gives numerical stability:
- it may induce strong convexity in $w$ or $z$, and convergence is faster under strong convexity;
- it allows a larger step size (due to the higher stability);
- it prevents the subproblems from diverging to infinity (again, stability).
But it is sometimes better to work with the ordinary Lagrangian.

25 ADMM Algorithm
Fix the dual $\lambda$ and take block coordinate descent steps on the primal $(w, z)$:
$w_{t+1} \leftarrow \arg\min_w L_\mu(w, z_t; \lambda_t) = \arg\min_w\; f(w) + \frac{\mu}{2}\|Aw + Bz_t - c + \lambda_t/\mu\|^2$
$z_{t+1} \leftarrow \arg\min_z L_\mu(w_{t+1}, z; \lambda_t) = \arg\min_z\; g(z) + \frac{\mu}{2}\|Aw_{t+1} + Bz - c + \lambda_t/\mu\|^2$
Then fix the primal $(w, z)$ and take a gradient ascent step on the dual:
$\lambda_{t+1} \leftarrow \lambda_t + \eta\,(Aw_{t+1} + Bz_{t+1} - c)$
The step size $\eta$ can be large, e.g. $\eta = \mu$. One usually rescales $\lambda \leftarrow \lambda/\mu$ to remove the divisions (the scaled form).
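
To make the three updates concrete, here is a sketch (ours, not from the lecture) of ADMM for the lasso, $\min \tfrac{1}{2}\|Xw - y\|^2 + \alpha\|z\|_1$ s.t. $w - z = 0$, so $A = I$, $B = -I$, $c = 0$; all names and constants are our own.

```python
import numpy as np

def admm_lasso(X, y, alpha, mu=1.0, iters=200):
    """ADMM for min 0.5*||Xw - y||^2 + alpha*||z||_1  s.t.  w - z = 0."""
    n, p = X.shape
    w = z = lam = np.zeros(p)
    G = X.T @ X + mu * np.eye(p)       # factor reused by every w-step
    Xty = X.T @ y
    for _ in range(iters):
        # w-step: argmin_w 0.5*||Xw - y||^2 + mu/2*||w - z + lam/mu||^2
        w = np.linalg.solve(G, Xty + mu * z - lam)
        # z-step: argmin_z alpha*||z||_1 + mu/2*||w - z + lam/mu||^2 (soft-threshold)
        u = w + lam / mu
        z = np.sign(u) * np.maximum(np.abs(u) - alpha / mu, 0.0)
        # dual ascent with step size eta = mu
        lam = lam + mu * (w - z)
    return z

rng = np.random.default_rng(0)
X, y = rng.standard_normal((100, 20)), rng.standard_normal(100)
print(np.count_nonzero(admm_lasso(X, y, alpha=5.0)), "nonzeros")
```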

26 ERM revisited
Reformulate by duplicating variables:
$\min_{v,z}\; g(z) + \sum_i f_i(w_i) \quad \text{s.t.}\; w_i = z,\ \forall i$
The ADMM w-step becomes
$w_{t+1} \leftarrow \arg\min_w L_\mu(w, z_t; \lambda_t) = \arg\min_w \sum_i \Big[ f_i(w_i) + \frac{\mu}{2}\|w_i - z_t + \lambda^t_i/\mu\|^2 \Big]$
Thanks to the duplication, it is completely decoupled: parallelizable across $i$, and in closed form if $f_i$ is simple.

27 ADMM: History and Related
- Augmented Lagrangian method (ALM): solve for $w$ and $z$ jointly, even though they are coupled; (Bertsekas '82) and references therein.
- Alternating direction method of multipliers (ADMM): alternate between $w$ and $z$ as on the previous slide; (Boyd et al. '10) and references therein.
- Operator splitting for PDEs: Douglas, Peaceman, and Rachford ('50s-'70s); Glowinski et al. ('80s); Gabay ('83); Spingarn ('85).
- In variational inequalities: (Eckstein & Bertsekas '92; He et al. '02).
- Lots of recent work.

28 ADMM: Linearization
The demanding step in each iteration of ADMM is (similarly for $z$):
$x_{t+1} \leftarrow \arg\min_x L_\mu(x, z_t; y_t) = \arg\min_x\; f(x) + y_t^\top(Ax + Bz_t - c) + \frac{\mu}{2}\|Ax + Bz_t - c\|_2^2$
- Diagonal $A$: reduces to the proximal map (more later) of $f$; e.g. for $f(x) = \|x\|_1$ it is soft-shrinkage, $\mathrm{sign}(x)(|x| - 1/\mu)_+$.
- Non-diagonal $A$: no closed form, and a messy inner loop. Instead, reduce to the diagonal case, either by a single gradient step,
$x_{t+1} \leftarrow x_t - \eta\big(\nabla f(x_t) + A^\top y_t + \mu A^\top(Ax_t + Bz_t - c)\big),$
or by linearizing the quadratic at $x_t$:
$x_{t+1} \leftarrow \arg\min_x\; f(x) + y_t^\top Ax + (x - x_t)^\top \mu A^\top(Ax_t + Bz_t - c) + \frac{\mu}{2}\|x - x_t\|_2^2 = \arg\min_x\; f(x) + \frac{\mu}{2}\|x - x_t + A^\top(Ax_t + Bz_t - c + y_t/\mu)\|_2^2$
Intuition: $x$ is re-computed in the next iteration anyway, so there is no need to solve for a perfect $x$.
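
A small numerical check of the soft-shrinkage claim (our illustration): the proximal map of the $\ell_1$ norm is computed in closed form and compared against a brute-force minimizer.

```python
import numpy as np
from scipy.optimize import minimize

def prox_l1(v, t):
    """prox of t*||.||_1: argmin_x t*||x||_1 + 0.5*||x - v||^2 (soft-shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

v, t = np.array([1.5, -0.3, 0.8, -2.0]), 0.5
obj = lambda x: t * np.sum(np.abs(x)) + 0.5 * np.sum((x - v) ** 2)
brute = minimize(obj, np.zeros_like(v), method="Powell").x

print(prox_l1(v, t))        # [ 1.  -0.   0.3 -1.5]
print(np.round(brute, 4))   # should match up to optimizer tolerance
```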

29 Convergence Guarantees: Fixed-point theory
Recall some definitions:
- Proximal map: $P^\mu_f(w) := \arg\min_z \frac{1}{2\mu}\|z - w\|_2^2 + f(z)$; well-defined for convex $f$; non-expansive, $\|T(x) - T(y)\|_2 \le \|x - y\|_2$; generalizes the Euclidean projection.
- Reflection map: $R^\mu_f(w) := 2P^\mu_f(w) - w$.
- Lagrangian: $L_0(x, z; y) = \min_x\,[f(x) + y^\top Ax] + \min_z\,[g(z) + y^\top(Bz - c)]$, whose two terms define the dual pieces $d_1(y)$ and $d_2(y)$.
ADMM = Douglas-Rachford splitting on the dual, a fixed-point iteration:
$w \leftarrow \tfrac{1}{2}\big(w + R^\mu_{d_2}(R^\mu_{d_1}(w))\big), \qquad y \leftarrow P^\mu_{d_2}(w)$
Convergence follows, e.g., from (Bauschke & Combettes '13). This explains why the dual $y$, not the primal $x$ or $z$, always converges.

30 ADMM for CLIME

31 Apply ADMM to CLIME
$\min_Q \|Q\|_1 \quad \text{s.t.}\quad \|SQ - E\|_\infty \le \lambda$
Solve a block of columns of $Q$ on each core/machine; $E$ is the corresponding block of columns of $I$.
Step 1: reduce to the ADMM canonical form using variable duplication:
$\min_{Q,Z} \|Q\|_1 \;\;\text{s.t.}\;\; \|Z - E\|_\infty \le \lambda,\; Z = SQ$
or equivalently, moving the constraint into the objective as an indicator,
$\min_{Q,Z} \|Q\|_1 + \iota[\|Z - E\|_\infty \le \lambda] \;\;\text{s.t.}\;\; Z = SQ$

32 Apply ADMM to CLIME (cont'd)
Step 2: write out the augmented Lagrangian:
$L(Q, Z; Y) = \|Q\|_1 + \iota[\|Z - E\|_\infty \le \lambda] + \mathrm{tr}[(SQ - Z)^\top Y] + \frac{\rho}{2}\|SQ - Z\|_F^2$
Step 3: perform the primal-dual updates (in scaled form, with $Y$ the rescaled dual):
$Q \leftarrow \arg\min_Q\; \|Q\|_1 + \frac{\rho}{2}\|SQ - Z + Y\|_F^2$
$Z \leftarrow \arg\min_Z\; \iota[\|Z - E\|_\infty \le \lambda] + \frac{\rho}{2}\|SQ - Z + Y\|_F^2 = \arg\min_{\|Z - E\|_\infty \le \lambda} \frac{\rho}{2}\|SQ - Z + Y\|_F^2$
$Y \leftarrow Y + SQ - Z$

33 Apply ADMM to CLIME (cont'd)
Step 4: solve the subproblems.
- Dual $Y$: trivial.
- Primal $Z$: a projection onto an $\ell_\infty$ ball; separable, easy.
- Primal $Q$: easy if $S$ is orthogonal, but in general a lasso problem. Bypass the double loop by linearization (intuition: it is wasteful to solve for $Q$ to death):
$\min_Q\; \|Q\|_1 + \rho\,\mathrm{tr}\big(Q^\top S(Y + SQ_t - Z)\big) + \frac{\eta}{2}\|Q - Q_t\|_F^2$
which is solved by soft-thresholding.
Putting things together: iterate the $Q$, $Z$, and $Y$ updates above, with the linearized $Q$-step.
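
A compact numpy sketch of the resulting iteration; the constants ($\rho$ and the linearization constant $\eta$) and all names are our own choices, not the paper's.

```python
import numpy as np

def soft(V, t):
    """Elementwise soft-thresholding."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def clime_admm(S, E, lam, rho=1.0, iters=500):
    """Linearized ADMM sketch for min ||Q||_1 s.t. ||SQ - E||_inf <= lam."""
    eta = rho * np.linalg.norm(S, 2) ** 2      # linearization constant
    Q = np.zeros_like(E); Z = np.zeros_like(E); Y = np.zeros_like(E)
    for _ in range(iters):
        # Q-step: linearize rho/2*||SQ - Z + Y||_F^2 at Q, then soft-threshold
        G = rho * S.T @ (S @ Q - Z + Y)
        Q = soft(Q - G / eta, 1.0 / eta)
        # Z-step: project SQ + Y onto the l_inf ball of radius lam around E
        W = S @ Q + Y
        Z = E + np.clip(W - E, -lam, lam)
        # scaled dual update
        Y = Y + S @ Q - Z
    return Q

p = 8
X = np.random.default_rng(0).standard_normal((100, p))
S = np.cov(X, rowvar=False)
Q_hat = clime_admm(S, np.eye(p), lam=0.3)      # all p columns at once
print(np.round(Q_hat, 2))
```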

34 Exploring structure
The expensive step in ADMM-CLIME is the matrix-matrix multiplication $SQ$ (and the like). If $p \gg n$, then $S$ is $p \times p$ but of rank at most $n$: write $S = AA^\top$ with $A = [X_1, \dots, X_n] \in \mathbb{R}^{p \times n}$, and compute $A(A^\top Q)$ instead. A matrix-matrix product is much faster than a for-loop of matrix-vector products, so it is preferable to solve a balanced block of columns at once.
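
A quick check of the low-rank trick (our sketch): the two products agree, but the factored form costs O(pnk) instead of O(p^2 k).

```python
import numpy as np

p, n, k = 2000, 50, 100
A = np.random.default_rng(0).standard_normal((p, n))   # A = [X_1, ..., X_n]
S = A @ A.T                                            # p x p, rank <= n
Q = np.random.default_rng(1).standard_normal((p, k))

assert np.allclose(S @ Q, A @ (A.T @ Q))               # same result, much cheaper
```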

35 Parallelization (Wang et al. 13)
Embarrassingly parallel if $A$ fits into memory. Otherwise, chop $A$ into small blocks and distribute them, though the communication cost may be high.

36 Numerical results (Wang et al. 13)

37 Numerical results, cont'd (Wang et al. 13)

38 Nonparanormal extensions

39 Nonparanormal (Liu et al. 09)
$Z_i = f_i(X_i),\ i = 1, \dots, p, \qquad (Z_1, \dots, Z_p) \sim N(0, \Sigma)$
where the $f_i$ are unknown monotone functions. We observe $X$, but not $Z$. Independence is preserved under the transformations:
$X_i \perp X_j \mid X_{\mathrm{rest}} \iff Z_i \perp Z_j \mid Z_{\mathrm{rest}} \iff Q_{ij} = 0$
One can estimate the $f_i$ first and then apply glasso to the $f_i(X_i)$, but estimating the functions can be slow (nonparametric rate).

40 A neat observation
Since $f_i$ is monotone, the row $Z_{i,1}, Z_{i,2}, \dots, Z_{i,n}$ is comonotone (concordant) with $X_{i,1}, X_{i,2}, \dots, X_{i,n}$: use a rank estimator $R_{i,1}, R_{i,2}, \dots, R_{i,n}$! We want, but do not observe,
$\Sigma_{ij} = \frac{1}{n}\sum_{k=1}^n Z_{i,k} Z_{j,k}$
Maybe (???) replace it with $\frac{1}{n}\sum_{k=1}^n R_{i,k} R_{j,k}$?

41 Kendall's tau
Assuming no ties:
$\hat\tau_{ij} = \frac{1}{n(n-1)} \sum_{k,\ell} \mathrm{sign}\big[(R_{i,k} - R_{i,\ell})(R_{j,k} - R_{j,\ell})\big]$
(What is the complexity of computing Kendall's tau?) Key: for the latent Gaussian, $\Sigma_{ij} = \sin\big(\tfrac{\pi}{2}\, E(\hat\tau_{ij})\big)$, so the plug-in $\sin(\tfrac{\pi}{2}\hat\tau_{ij})$ is a genuine, asymptotically unbiased estimate, and the map is Lipschitz, preserving concentration. After obtaining $\hat\Sigma$, use whatever glasso you like, e.g. CLIME. One can also use other rank statistics, e.g. Spearman's rho, with the analogous map $t \mapsto 2\sin(\tfrac{\pi}{6} t)$.
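
A sketch of the plug-in estimator using scipy's Kendall's tau (assuming the $\sin(\pi\hat\tau/2)$ map above); the data and names are ours.

```python
import numpy as np
from scipy.stats import kendalltau

def rank_correlation_matrix(X):
    """Plug-in estimate of the latent Gaussian correlation via Kendall's tau."""
    n, p = X.shape
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            tau, _ = kendalltau(X[:, i], X[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2)
    return R

# Monotone transforms of a latent Gaussian: the rank estimate ignores them
rng = np.random.default_rng(0)
Z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=500)
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3])
print(rank_correlation_matrix(X)[0, 1])   # close to 0.6
```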

42 Summary
- Gaussian graphical model selection: neighborhood selection, regularized MLE, CLIME; implicit PSD vs. explicit PSD.
- Distributed ADMM: a generic procedure; matrix products can be distributed.
- Nonparanormal Gaussian graphical models: rank statistics; plug-in estimators.

43 References
Graph lasso:
- Meinshausen and Bühlmann. High-Dimensional Graphs and Variable Selection with the Lasso. Annals of Statistics, vol. 34, 2006.
- Yuan and Lin. Model Selection and Estimation in the Gaussian Graphical Model. Biometrika, vol. 94, 2007.
- Cai et al. A Constrained l1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association, 2011.
- Banerjee et al. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data. Journal of Machine Learning Research, 2008.
- Friedman et al. Sparse Inverse Covariance Estimation with the Graphical Lasso. Biostatistics, 2008.
Nonparanormal:
- Han Liu et al. The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs. Journal of Machine Learning Research, 2009.
- Lingzhou Xue and Hui Zou. Regularized Rank-Based Estimation of High-Dimensional Nonparanormal Graphical Models. Annals of Statistics, vol. 40, 2012.
- Han Liu et al. High-Dimensional Semiparametric Gaussian Copula Graphical Models. Annals of Statistics, vol. 40, 2012.
ADMM:
- Wang et al. Large Scale Distributed Sparse Precision Estimation. NIPS 2013.
- Boyd et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2011.
