MATH 829: Introduction to Data Mining and Analysis Graphical Models III - Gaussian Graphical Models (cont.)


Dominique Guillot
Department of Mathematical Sciences
University of Delaware
May 6, 2016

Estimating the conditional independence structure of a GGM

In the last lecture, we showed that when $X \sim N(\mu, \Sigma)$:

1. $X_i \perp\!\!\!\perp X_j \iff \Sigma_{ij} = 0$.
2. $X_i \perp\!\!\!\perp X_j \mid \text{rest} \iff (\Sigma^{-1})_{ij} = 0$.

To discover the conditional independence structure of $X$, we need to estimate the pattern of zeros of the precision matrix $\Omega = \Sigma^{-1}$. We will proceed in a way that is similar to the lasso.

Suppose $x^{(1)}, \dots, x^{(n)} \in \mathbb{R}^p$ are iid observations of $X$. The associated log-likelihood of $(\mu, \Sigma)$ is given by
$$l(\mu, \Sigma) := -\frac{n}{2}\log\det\Sigma - \frac{1}{2}\sum_{i=1}^n (x^{(i)} - \mu)^T \Sigma^{-1} (x^{(i)} - \mu) - \frac{np}{2}\log(2\pi).$$

Classical result: the MLE of $(\mu, \Sigma)$ is given by
$$\hat{\mu} := \frac{1}{n}\sum_{i=1}^n x^{(i)}, \qquad S := \frac{1}{n}\sum_{i=1}^n (x^{(i)} - \hat{\mu})(x^{(i)} - \hat{\mu})^T.$$
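As a quick numerical check of this classical result (an illustration, not part of the slides), the MLE is just the sample mean and the biased ($1/n$) sample covariance:

```python
# MLE of (mu, Sigma): sample mean and the 1/n (biased) sample covariance.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))           # n = 500 iid observations in R^3

mu_hat = X.mean(axis=0)                     # (1/n) * sum_i x^(i)
centered = X - mu_hat
S = centered.T @ centered / X.shape[0]      # (1/n) * sum_i (x^(i)-mu_hat)(x^(i)-mu_hat)^T
print(np.allclose(S, np.cov(X, rowvar=False, bias=True)))  # True
```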

Estimating the CI structure of a GGM (cont.)

Using $\hat{\mu}$ and $S$, we can conveniently rewrite the log-likelihood as:
$$l(\mu, \Sigma) = -\frac{n}{2}\log\det\Sigma - \frac{n}{2}\operatorname{Tr}(S\Sigma^{-1}) - \frac{np}{2}\log(2\pi) - \frac{n}{2}\operatorname{Tr}\big(\Sigma^{-1}(\hat{\mu} - \mu)(\hat{\mu} - \mu)^T\big)$$
(use the identity $x^T A x = \operatorname{Tr}(Axx^T)$).

Note that the last term is minimized when $\mu = \hat{\mu}$ (independently of $\Sigma$) since
$$\operatorname{Tr}\big(\Sigma^{-1}(\hat{\mu} - \mu)(\hat{\mu} - \mu)^T\big) = (\hat{\mu} - \mu)^T \Sigma^{-1} (\hat{\mu} - \mu) \geq 0.$$
(The inequality holds since $\Sigma^{-1}$ is positive definite.)

Therefore, up to an additive constant and a factor of $n/2$, the log-likelihood of $\Omega := \Sigma^{-1}$ is
$$l(\Omega) = \log\det\Omega - \operatorname{Tr}(S\Omega).$$
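A one-line numerical check of the trace identity used above (an illustration of mine, not from the slides):

```python
# Verify x^T A x = Tr(A x x^T) on random data.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
print(np.allclose(x @ A @ x, np.trace(A @ np.outer(x, x))))  # True
```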

The Graphical Lasso

The Graphical Lasso (glasso) algorithm (Friedman, Hastie, and Tibshirani, 2007; Banerjee et al., 2007) solves the penalized likelihood problem:
$$\hat{\Omega}_\rho = \operatorname*{argmax}_{\Omega \text{ psd}} \; \log\det\Omega - \operatorname{Tr}(S\Omega) - \rho\|\Omega\|_1,$$
where $\|\Omega\|_1 := \sum_{i,j=1}^p |\Omega_{ij}|$, and $\rho > 0$ is a fixed regularization parameter.

Idea: Make a trade-off between maximizing the likelihood and having a sparse $\Omega$.

Just like in the lasso problem, using a 1-norm tends to introduce many zeros into $\hat{\Omega}_\rho$.

The regularization parameter $\rho$ can be chosen by cross-validation.

The above problem can be solved efficiently for problems with up to a few thousand variables (see e.g. ESL, Algorithm 17.2).
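As a quick sanity check (not part of the slides), one can run an off-the-shelf solver on simulated data; a minimal sketch assuming scikit-learn >= 0.20, where the solver is exposed as sklearn.covariance.GraphicalLasso and its alpha parameter plays the role of $\rho$:

```python
# Recover a sparse precision matrix from simulated Gaussian data.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Simulate from a GGM with a known sparse (tridiagonal) precision matrix.
p = 5
Omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma = np.linalg.inv(Omega)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=1000)

model = GraphicalLasso(alpha=0.05).fit(X)
print(np.round(model.precision_, 2))  # estimated Omega: small entries are zeroed out
```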

The Graphical Lasso (cont.)

We need to maximize
$$F(\Omega) := \log\det\Omega - \operatorname{Tr}(S\Omega) - \rho\|\Omega\|_1.$$

Since $F$ is concave, we can use the sub-gradient to identify optimal points of $F$ (to be really rigorous, we should be working with $-F$ in order to use the sub-gradient, but the derivation is the same).

We have
$$\frac{\partial}{\partial\Omega}\log\det\Omega = \Omega^{-1}, \qquad \frac{\partial}{\partial\Omega}\operatorname{Tr}(S\Omega) = S.$$

Also,
$$\frac{\partial}{\partial\Omega}\sum_{i,j=1}^p |\Omega_{ij}| = \operatorname{Sign}(\Omega),$$
where
$$\operatorname{Sign}(\Omega)_{ij} = \begin{cases} 1 & \text{if } \Omega_{ij} > 0, \\ -1 & \text{if } \Omega_{ij} < 0, \\ [-1, 1] & \text{if } \Omega_{ij} = 0. \end{cases}$$
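A small finite-difference check of the first identity (an illustration of mine, not from the slides; $\Omega$ is taken symmetric positive definite):

```python
# Check d/dOmega log det Omega = Omega^{-1} by central differences.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
Omega = A @ A.T + 4 * np.eye(4)            # symmetric positive definite

eps = 1e-6
grad = np.zeros_like(Omega)
for i in range(4):
    for j in range(4):
        E = np.zeros_like(Omega)
        E[i, j] = eps
        grad[i, j] = (np.linalg.slogdet(Omega + E)[1]
                      - np.linalg.slogdet(Omega - E)[1]) / (2 * eps)
print(np.allclose(grad, np.linalg.inv(Omega), atol=1e-4))  # True
```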

The Graphical Lasso (cont.)

Putting everything together, we get
$$\partial F = \Omega^{-1} - S - \rho\operatorname{Sign}(\Omega).$$

Just like for the lasso problem, we will derive a coordinate-wise approach to solve the glasso problem.

Let $W = \Omega^{-1}$. Write $W$ and $\Omega$ in block form
$$W = \begin{pmatrix} W_{11} & w_{12} \\ w_{12}^T & w_{22} \end{pmatrix}, \qquad \Omega = \begin{pmatrix} \Omega_{11} & \omega_{12} \\ \omega_{12}^T & \omega_{22} \end{pmatrix},$$
where $W_{11}, \Omega_{11} \in \mathbb{R}^{(p-1)\times(p-1)}$.

We will cyclically optimize $F$, one column/row at a time.

Note that since $W\Omega = I$, we have
$$\begin{pmatrix} W_{11}\Omega_{11} + w_{12}\omega_{12}^T & W_{11}\omega_{12} + w_{12}\omega_{22} \\ w_{12}^T\Omega_{11} + w_{22}\omega_{12}^T & w_{12}^T\omega_{12} + w_{22}\omega_{22} \end{pmatrix} = \begin{pmatrix} I_{(p-1)\times(p-1)} & 0 \\ 0^T & 1 \end{pmatrix}.$$

The Graphical Lasso (cont.)

In particular, we have $W_{11}\omega_{12} + w_{12}\omega_{22} = 0$, i.e.,
$$w_{12} = -W_{11}\frac{\omega_{12}}{\omega_{22}} = W_{11}\beta,$$
where $\beta := -\omega_{12}/\omega_{22}$.

Now, the upper right block of $\Omega^{-1} - S - \rho\operatorname{Sign}(\Omega)$ is equal to
$$w_{12} - s_{12} + \rho\operatorname{Sign}(\beta),$$
since $\omega_{22} > 0$ (so $\operatorname{Sign}(\omega_{12}) = -\operatorname{Sign}(\beta)$).

We need to choose $w_{12}$ such that
$$0 \in w_{12} - s_{12} + \rho\operatorname{Sign}(\beta) \iff 0 \in W_{11}\beta - s_{12} + \rho\operatorname{Sign}(\beta).$$

Observation: in the lasso problem $\min_\beta \frac{1}{2}\|y - Z\beta\|^2 + \rho\|\beta\|_1$, we have
$$\partial\left(\frac{1}{2}\|y - Z\beta\|^2 + \rho\|\beta\|_1\right) = Z^TZ\beta - Z^Ty + \rho\operatorname{Sign}(\beta).$$

The Graphical Lasso (cont.)

So, we have the two optimality conditions:

Glasso update: $0 \in W_{11}\beta - s_{12} + \rho\operatorname{Sign}(\beta)$.
Lasso problem: $0 \in Z^TZ\beta - Z^Ty + \rho\operatorname{Sign}(\beta)$.

Now, let $Z := W_{11}^{1/2}$ and $y := W_{11}^{-1/2}s_{12}$, so that $Z^TZ = W_{11}$ and $Z^Ty = s_{12}$.

The glasso update is thus equivalent to solving the lasso problem:
$$\min_\beta \frac{1}{2}\left\|W_{11}^{-1/2}s_{12} - W_{11}^{1/2}\beta\right\|^2 + \rho\|\beta\|_1.$$

We can therefore solve the glasso problem by cycling through the rows/columns of $W$, and updating them by solving a lasso problem!

32 The Graphical Lasso (cont.) We therefore have the following algorithm to solve the glasso problem (ESL, Algorithm 17.2).
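The slides display ESL's Algorithm 17.2 at this point. As an illustration only, here is a compact numpy sketch of the coordinate-wise scheme just derived; the names (glasso, soft) and the fixed iteration counts are choices of this sketch, and a production solver would use a proper convergence test. The inner loop is coordinate descent for the equivalent subproblem $\min_\beta \frac{1}{2}\beta^T W_{11}\beta - s_{12}^T\beta + \rho\|\beta\|_1$.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def glasso(S, rho, n_iter=100, lasso_iter=100):
    """Sketch of the glasso block-coordinate algorithm (cf. ESL, Alg. 17.2)."""
    p = S.shape[0]
    W = S + rho * np.eye(p)          # initialize W = S + rho*I (diagonal stays fixed)
    B = np.zeros((p, p))             # store each column's beta vector
    for _ in range(n_iter):
        for j in range(p):
            idx = np.arange(p) != j
            V = W[np.ix_(idx, idx)]  # W_11 for the current column
            s12 = S[idx, j]
            beta = B[idx, j]
            # inner coordinate descent for the lasso subproblem
            for _ in range(lasso_iter):
                for k in range(p - 1):
                    r = s12[k] - V[k] @ beta + V[k, k] * beta[k]
                    beta[k] = soft(r, rho) / V[k, k]
            B[idx, j] = beta
            w12 = V @ beta           # update w_12 = W_11 beta
            W[idx, j] = w12
            W[j, idx] = w12
    # recover Omega from W and the betas: omega_22 = 1/(w_22 - w_12^T beta),
    # omega_12 = -beta * omega_22 (since beta = -omega_12/omega_22)
    Omega = np.zeros((p, p))
    for j in range(p):
        idx = np.arange(p) != j
        beta = B[idx, j]
        omega22 = 1.0 / (W[j, j] - W[idx, j] @ beta)
        Omega[j, j] = omega22
        Omega[idx, j] = -beta * omega22
    return Omega, W
```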

MLE estimation of a GGM

From the glasso solution, one infers a conditional independence graph for $X = (X_1, \dots, X_p)$ by examining the zeros in the estimated inverse covariance matrix.

Given a graph $G = (V, E)$ with $p$ vertices, let
$$P_G := \{A \in P_p : A_{ij} = 0 \text{ if } (i, j) \notin E \text{ and } i \neq j\},$$
where $P_p$ denotes the set of $p \times p$ positive definite matrices.

We can now estimate the optimal covariance matrix with the given graph structure by solving:
$$\hat{\Sigma}_G := \operatorname*{argmax}_{\Sigma \,:\, \Omega = \Sigma^{-1} \in P_G} l(\Sigma),$$
where $l(\Sigma)$ denotes the log-likelihood of $\Sigma$.

Note: Instead of maximizing the log-likelihood over all possible psd matrices as in the classical case, we restrict ourselves to the matrices having the right conditional independence structure.

The graphical MLE problem can be solved efficiently for up to a few thousand variables (see e.g. ESL, Algorithm 17.1).

38 MLE estimation of a GGM (cont.) Computing the Gaussian MLE of a multivariate normal random vector with known conditional independence graph G: ESL, Algorithm 17.1. The derivation of the algorithm is similar to the derivation of the glasso algorithm (see ESL, Section 17.3.1).
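In the same spirit as the glasso sketch above, here is a rough numpy sketch of the known-graph MLE update, modeled on the description of ESL's Algorithm 17.1: the lasso subproblem is replaced by an unpenalized linear solve restricted to the neighbours of each vertex. The name ggm_mle and its arguments are hypothetical choices of this sketch.

```python
import numpy as np

def ggm_mle(S, adj, n_iter=100):
    """MLE of a GGM with known graph; adj is a 0/1 adjacency matrix (no self-loops)."""
    p = S.shape[0]
    W = S.copy()                             # diagonal of W stays equal to diag(S)
    B = np.zeros((p, p))
    for _ in range(n_iter):
        for j in range(p):
            idx = np.arange(p) != j
            V = W[np.ix_(idx, idx)]          # W_11
            s12 = S[idx, j]
            nbr = adj[idx, j].astype(bool)   # neighbours of vertex j in G
            beta = np.zeros(p - 1)
            if nbr.any():
                # unpenalized solve on the edge coordinates only
                beta[nbr] = np.linalg.solve(V[np.ix_(nbr, nbr)], s12[nbr])
            B[idx, j] = beta
            w12 = V @ beta
            W[idx, j] = w12
            W[j, idx] = w12
    # recover Omega exactly as in the glasso sketch
    Omega = np.zeros((p, p))
    for j in range(p):
        idx = np.arange(p) != j
        omega22 = 1.0 / (W[j, j] - W[idx, j] @ B[idx, j])
        Omega[j, j] = omega22
        Omega[idx, j] = -B[idx, j] * omega22
    return Omega, W
```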

Application

Example: Estimating the conditional independencies in temperature fields (Guillot et al., 2015).

Reconstructing climate fields using paleoclimate proxies:

Estimate the conditional independence graph on the instrumental period.
Use an EM algorithm with an embedded graphical model.
The resulting algorithm is called GraphEM.

See Guillot et al. (2015) for more details.
