Introduction to graphical models: Lecture III


1 Introduction to graphical models: Lecture III
Martin Wainwright, UC Berkeley, Departments of Statistics and EECS

2-3 Introduction
Markov random fields (undirected graphical models) are central in many application areas of science and engineering. Some fundamental problems:
counting/integrating: computing marginal distributions and partition functions
optimization: computing most probable configurations (or top-M configurations)
model selection: fitting and selecting models on the basis of data

4 Graph structure and factorization
Markov random field: random vector $(X_1, \ldots, X_p)$ with distribution factoring according to a graph $G = (V, E)$.
[Figure: example graph on nodes A, B, C, D.]
Hammersley-Clifford theorem: factorization over cliques
$$Q(x_1, \ldots, x_p; \theta) = \frac{1}{Z(\theta)} \exp\Big\{ \sum_{C \in \mathcal{C}} \theta_C(x_C) \Big\}$$
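A minimal sketch (not from the slides) of this factorization for an Ising-type parameterization, where the clique functions are $\theta_C(x_C) = \theta_{st} x_s x_t$ on edges; the graph and weights below are illustrative. It makes concrete why computing $Z(\theta)$ is a counting problem: the sum runs over all $2^p$ configurations.

```python
# Brute-force evaluation of the clique factorization for a small binary
# pairwise MRF. Nodes take values in {-1, +1}; "edges" maps each edge (s, t)
# to a coupling theta_st, so theta_C(x_C) = theta_st * x_s * x_t.
import itertools
import math

p = 4
edges = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5, (0, 3): 0.5}  # a 4-cycle

def energy(x):
    """Sum of clique functions sum_C theta_C(x_C) for configuration x."""
    return sum(th * x[s] * x[t] for (s, t), th in edges.items())

# Partition function Z(theta) by exhaustive enumeration over 2^p states.
Z = sum(math.exp(energy(x)) for x in itertools.product([-1, 1], repeat=p))

def prob(x):
    """Q(x; theta) = exp{sum_C theta_C(x_C)} / Z(theta)."""
    return math.exp(energy(x)) / Z

# A marginal, i.e. the "counting/integrating" problem, again by brute force.
marg = sum(prob(x) for x in itertools.product([-1, 1], repeat=p) if x[0] == 1)
print(f"Z = {Z:.4f}, Q(X_0 = +1) = {marg:.4f}")
```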

5 Graphical model selection
let $G = (V, E)$ be an undirected graph on $p = |V|$ vertices
pairwise graphical model factorizes over edges of the graph:
$$Q(x_1, \ldots, x_p; \theta) \propto \exp\Big\{ \sum_{(s,t) \in E} \theta_{st}(x_s, x_t) \Big\}$$
given n independent and identically distributed (i.i.d.) samples of $X = (X_1, \ldots, X_p)$, identify the underlying graph structure
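To make the selection problem concrete, here is a hedged sketch of how such samples might be generated in practice: a Gibbs sampler for the pairwise Ising model above. Samples are only approximately i.i.d. (taken after burn-in, with thinning); exact sampling is itself hard in general. The burn-in and thinning values are arbitrary choices for illustration.

```python
# Gibbs sampler for a binary pairwise MRF with +/-1 spins.
import numpy as np

rng = np.random.default_rng(0)

def gibbs_samples(p, edges, n, burn_in=500, thin=10):
    nbrs = {s: [] for s in range(p)}
    for (s, t), th in edges.items():
        nbrs[s].append((t, th))
        nbrs[t].append((s, th))
    x = rng.choice([-1, 1], size=p)
    out = []
    for it in range(burn_in + n * thin):
        for s in range(p):
            field = sum(th * x[t] for t, th in nbrs[s])
            # Q(X_s = +1 | X_{N(s)}) = 1 / (1 + exp(-2 * field))
            x[s] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2 * field)) else -1
        if it >= burn_in and (it - burn_in) % thin == 0:
            out.append(x.copy())
    return np.array(out[:n])

X_ising = gibbs_samples(4, {(0, 1): 0.5, (1, 2): 0.5,
                            (2, 3): 0.5, (0, 3): 0.5}, n=200)
```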

6-9 Various classes of methods
1 Exact solutions
Chow-Liu algorithm for trees (Chow & Liu, 1967; a sketch follows this list)
computationally intractable for hypertrees (Srebro & Karger, 2001)
2 Testing-based approaches
PC algorithm (Spirtes et al., 2000; Kalisch & Bühlmann, 2008)
thresholding (Bresler et al., 2008; Anandkumar et al., 2010)
3 Penalized forms of global likelihood
combinatorial penalties (AIC, BIC, GIC, etc.)
$\ell_1$ and related penalties
classical analysis of penalized Gaussian MLE (Yuan & Lin, 2006)
some fast algorithms (d'Aspremont et al., 2007; Friedman et al., 2008)
4 Pseudolikelihoods and neighborhood regression
pseudolikelihood consistency for Gaussians (Besag, 1977)
pseudolikelihood and BIC criterion (Csiszar & Talata, 2006)
neighborhood regression for Gaussian MRFs (e.g., Meinshausen & Bühlmann, 2005; Wainwright, 2006; Zhao & Yu, 2006)
logistic regression for Ising models (Ravikumar et al., 2010)
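A sketch of the Chow-Liu idea for binary data, under the standard formulation: weight each node pair by its empirical mutual information, then take a maximum-weight spanning tree. It assumes networkx is available; the helper names are illustrative.

```python
# Chow-Liu structure estimation for +/-1 valued samples.
import numpy as np
import networkx as nx

def empirical_mi(a, b):
    """Empirical mutual information between two +/-1 valued sample vectors."""
    mi = 0.0
    for u in (-1, 1):
        for v in (-1, 1):
            p_uv = np.mean((a == u) & (b == v))
            p_u, p_v = np.mean(a == u), np.mean(b == v)
            if p_uv > 0:
                mi += p_uv * np.log(p_uv / (p_u * p_v))
    return mi

def chow_liu(X):
    n, p = X.shape
    G = nx.Graph()
    for s in range(p):
        for t in range(s + 1, p):
            G.add_edge(s, t, weight=empirical_mi(X[:, s], X[:, t]))
    return set(nx.maximum_spanning_tree(G).edges())

# e.g. chow_liu(X_ising) with samples from the Gibbs sketch above; for a
# truly tree-structured model this recovers the maximum likelihood tree.
```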

10-13 1. Global maximum likelihood
given i.i.d. samples $X_1^n := \{X_1, \ldots, X_n\}$, might consider methods based on the global likelihood
$$\ell(\theta; X_1^n) := \frac{1}{n} \sum_{i=1}^n \log Q(X_i; \theta)$$
maximum likelihood for a graphical model in exponential form:
$$\hat{\theta} = \arg\max_{\theta} \Big\{ \underbrace{\sum_{(s,t) \in E} \widehat{\mathbb{E}}\big[\theta_{st}(X_s, X_t)\big]}_{\text{empirical moments}} - \log Z(\theta) \Big\}$$
exact likelihood involves the log partition function $\log Z(\theta)$ (the contrast is sketched in code below):
can be computed for Gaussian MRFs (log-determinant)
intractable for Ising models (binary pairwise MRFs) (Welsh, 1993)
possible solutions:
MCMC methods
stochastic approximation methods
variational approximations (mean field, Bethe and belief propagation)
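A small numerical sketch of the tractability contrast just stated: for a Gaussian MRF with precision matrix $\Theta$, $\log Z(\Theta) = \frac{p}{2}\log(2\pi) - \frac{1}{2}\log\det(\Theta)$, an $O(p^3)$ computation, while the Ising analogue below is a sum with $2^p$ terms. The matrices and weights are illustrative.

```python
import itertools
import math
import numpy as np

# Gaussian: tractable via a log-determinant.
Theta = np.array([[2.0, -0.5, 0.0],
                  [-0.5, 2.0, -0.5],
                  [0.0, -0.5, 2.0]])
sign, logdet = np.linalg.slogdet(Theta)
logZ_gauss = 0.5 * Theta.shape[0] * np.log(2 * np.pi) - 0.5 * logdet

# Ising: brute force, feasible only for tiny p (2^p terms).
edges3 = {(0, 1): 0.5, (1, 2): 0.5}
logZ_ising = math.log(sum(
    math.exp(sum(th * x[s] * x[t] for (s, t), th in edges3.items()))
    for x in itertools.product([-1, 1], repeat=3)))
print(logZ_gauss, logZ_ising)
```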

14 Gaussian graphs
Gaussian graphical model specified by a sparse inverse covariance $\Theta$; the zero pattern of the inverse covariance encodes the graph:
$$Q(x_1, \ldots, x_p; \Theta) = \frac{\sqrt{\det(\Theta)}}{(2\pi)^{p/2}} \exp\Big( -\frac{1}{2} x^T \Theta x \Big)$$
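A sketch of this correspondence for a chain graph, using a hypothetical helper chain_precision: the precision matrix is tridiagonal, so its zero pattern is exactly the chain. Later sketches reuse Gaussian data of this form.

```python
# Sparse precision matrix for a chain graph, and samples X ~ N(0, Theta^{-1}).
import numpy as np

def chain_precision(p, rho=0.4):
    Theta = np.eye(p)
    for s in range(p - 1):
        Theta[s, s + 1] = Theta[s + 1, s] = -rho  # edges (s, s+1) only
    return Theta

rng = np.random.default_rng(1)
p = 10
Theta_star = chain_precision(p)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta_star), size=500)
# Theta_star[s, t] == 0 exactly for the non-edges of the chain.
```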

15-16 Gaussian $\ell_1$-penalized MLE
Estimator: $\ell_1$-regularized log-determinant program (a usage sketch follows below):
$$\hat{\Theta} = \arg\min_{\Theta \succ 0} \Big\{ \underbrace{-\log\det\Theta + \langle \hat{\Sigma}^n, \Theta \rangle}_{\text{negative Gaussian log-likelihood}} + \underbrace{\lambda_n \sum_{i \neq j} |\Theta_{ij}|}_{\text{regularization}} \Big\}$$
Results on this method:
analysis under classical scaling ($n \to \infty$ with $p$ fixed) (Yuan & Lin, 2006)
some fast algorithms (d'Aspremont et al., 2007; Friedman et al., 2008)
high-dimensional analysis of Frobenius norm error (Rothman et al., 2008)
high-dimensional variable selection and $\ell_\infty$ bounds (Ravikumar et al., 2011)
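A sketch of running this program via scikit-learn's GraphicalLasso, one standard implementation of an $\ell_1$-penalized log-determinant estimator; its penalty weight alpha plays the role of $\lambda_n$. It assumes X and p from the chain-graph sampling sketch above, and the alpha value and threshold are arbitrary.

```python
from sklearn.covariance import GraphicalLasso

model = GraphicalLasso(alpha=0.1).fit(X)   # X from the sampling sketch above
Theta_hat = model.precision_

# Read off the estimated edge set from the support of the off-diagonal entries.
E_hat = {(s, t) for s in range(p) for t in range(s + 1, p)
         if abs(Theta_hat[s, t]) > 1e-8}
print(sorted(E_hat))
```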

17 High-dimensional analysis
classical analysis: dimension $p$ fixed, sample size $n \to +\infty$
high-dimensional analysis: allow the dimension $p$, the sample size $n$, and the maximum degree $d$ to increase at arbitrary rates
take n i.i.d. samples from an MRF defined by $G_{p,d}$
study the probability of success as a function of three parameters:
$$\mathrm{Success}(n, p, d) = \mathbb{Q}[\text{Method recovers graph } G_{p,d} \text{ from } n \text{ samples}]$$
theory is non-asymptotic: explicit probabilities for finite $(n, p, d)$

18 Empirical behavior: unrescaled plots
[Figure: chain graph; success probability curves for p = 100, p = 225, and other values of p.]
Plots of success probability versus raw sample size n.

19 Empirical behavior: appropriately rescaled
[Figure: chain graph; success probability curves for p = 100, p = 225, and other values of p.]
Plots of success probability versus rescaled sample size n / log p (a simulation sketch follows below).
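A rough sketch of the protocol behind such plots, not the lecture's actual experiment: estimate Success(n, p, d) by repeated simulation on a chain graph and index it by the rescaled axis n / log p, under which curves for different p should roughly align. It reuses the hypothetical chain_precision helper from earlier; the constant in alpha is a guess at the $\lambda_n \asymp \sqrt{\log p / n}$ scaling.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def success_prob(p, n, trials=20, rng=np.random.default_rng(2)):
    Theta = chain_precision(p)                 # chain graph, d = 2 (defined above)
    E_true = {(s, s + 1) for s in range(p - 1)}
    Sigma = np.linalg.inv(Theta)
    alpha = 0.25 * np.sqrt(np.log(p) / n)      # lambda_n ~ sqrt(log p / n); constant is a guess
    hits = 0
    for _ in range(trials):
        Xs = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
        try:
            P = GraphicalLasso(alpha=alpha).fit(Xs).precision_
        except Exception:
            continue                           # solver failure counted as a miss
        E_hat = {(s, t) for s in range(p) for t in range(s + 1, p)
                 if abs(P[s, t]) > 1e-8}
        hits += (E_hat == E_true)
    return hits / trials

for p_ in (32, 64):
    for ratio in (10, 20, 40):                 # ratio = n / log p
        print(p_, ratio, success_prob(p_, n=int(ratio * np.log(p_))))
```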

20-23 Sufficient conditions for consistent model selection
graph sequences $G_{p,d} = (V, E)$ with $p$ vertices and maximum degree $d$
suitable regularity conditions on the Hessian of the log-determinant, $\Gamma^* := (\Theta^*)^{-1} \otimes (\Theta^*)^{-1}$
Theorem: For a multivariate Gaussian, if the sample size satisfies $n > c_1 \tau d^2 \log p$ and the regularization parameter satisfies $\lambda_n \geq c_2 \sqrt{\frac{\tau \log p}{n}}$, then with probability greater than $1 - 2\exp(-c_3 (\tau - 2) \log p)$:
(a) No false inclusions: the regularized log-determinant estimate $\hat{\Theta}$ returns an edge set $\hat{E} \subseteq E$.
(b) $\ell_\infty$-control: the estimate satisfies $\max_{i,j} |\hat{\Theta}_{ij} - \Theta^*_{ij}| \leq 2 c_4 \sqrt{\frac{\tau \log p}{n}}$.
(c) Model selection consistency: if $\theta_{\min} \geq c_4 \sqrt{\frac{\tau \log p}{n}}$, then $\hat{E} = E$.

24 Some other graphs
(a) 4-grid: $d = 4$. (b) Star: $d \in \{O(\log p), \alpha p\}$.

25 Results for 4-grid graphs
Vertical axis: success probability $\mathbb{Q}[\hat{E} = E]$; horizontal axis: $n / \log p$.
[Figure: 4-nearest-neighbor grid; success probability curves for p = 100, p = 225, and other values of p.]

26 Results for star graphs
Vertical axis: success probability $\mathbb{Q}[\hat{E} = E]$; horizontal axis: $n / \log p$.
[Figure: star graph; success probability curves for p = 100, p = 225, and other values of p.]

27 Proof sketch: primal-dual certificate
construct a candidate primal-dual pair $(\hat{\theta}, \hat{z}) \in \mathbb{R}^{p \times p} \times \mathbb{R}^{p \times p}$
proof technique, not a practical algorithm!
(A) Solve the restricted log-determinant program
$$\hat{\theta} = \arg\min_{\Theta \succ 0, \; \Theta_{S^c} = 0} \Big\{ -\log\det\Theta + \langle \hat{\Sigma}^n, \Theta \rangle + \lambda_n \sum_{i \neq j} |\Theta_{ij}| \Big\}$$
thereby obtaining the candidate solution $\hat{\theta} = (\hat{\theta}_S, 0_{S^c})$.
(B) Choose $\hat{z}_S \in \mathbb{R}^{|S|}$ as an element of the subdifferential $\partial \|\hat{\theta}_S\|_1$.
(C) Using the optimality conditions from the original convex program, solve for $\hat{z}_{S^c}$ and check whether strict dual feasibility $|\hat{z}_j| < 1$ for all $j \in S^c$ holds (a numerical check is sketched below).
Lemma: the full convex program recovers the support exactly if and only if the primal-dual witness construction succeeds.
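A numerical illustration of the dual-feasibility check in step (C), recovering the dual variable from the stationarity condition $\hat{\Sigma} - \hat{\Theta}^{-1} + \lambda \hat{Z} = 0$ of the penalized program. Note the proof builds the pair from the restricted program; this sketch merely inspects a fitted solution. It assumes Theta_hat and X from the graphical-lasso sketch above, scikit-learn's off-diagonal penalty convention, and agreement only up to solver tolerance.

```python
import numpy as np

lam = 0.1                                        # matches alpha used in the fit
Sigma_hat = np.cov(X, rowvar=False, bias=True)   # empirical covariance (1/n)
Z_hat = (np.linalg.inv(Theta_hat) - Sigma_hat) / lam
off = ~np.eye(Theta_hat.shape[0], dtype=bool)
S = off & (np.abs(Theta_hat) > 1e-8)             # estimated off-diagonal support
print("on S:   max |z| =", np.abs(Z_hat[S]).max())        # ~ 1 (sign entries)
print("on S^c: max |z| =", np.abs(Z_hat[off & ~S]).max()) # strictly < 1 if feasible
```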

28 2. Pseudolikelihood and neighborhood approaches
Markov properties encode neighborhood structure:
$$\underbrace{(X_s \mid X_{V \setminus s})}_{\text{condition on full graph}} \;\stackrel{d}{=}\; \underbrace{(X_s \mid X_{N(s)})}_{\text{condition on Markov blanket}}$$
[Figure: node s with Markov blanket N(s) = {t, u, v, w}.]
basis of the pseudolikelihood method (Besag, 1974)
basis of many graph learning algorithms (Friedman et al., 1999; Csiszar & Talata, 2005; Abbeel et al., 2006; Meinshausen & Bühlmann, 2006)

29-31 Graph selection via neighborhood regression
Predict $X_s$ based on $X_{\setminus s} := \{X_t, \; t \neq s\}$.
1 For each node $s \in V$, compute the (regularized) maximum likelihood estimate
$$\hat{\theta}[s] := \arg\min_{\theta \in \mathbb{R}^{p-1}} \Big\{ \underbrace{\frac{1}{n} \sum_{i=1}^n L(\theta; X_{i, \setminus s})}_{\text{local log-likelihood}} + \underbrace{\lambda_n \|\theta\|_1}_{\text{regularization}} \Big\}$$
2 Estimate the local neighborhood $\hat{N}(s)$ as the support of the regression vector $\hat{\theta}[s] \in \mathbb{R}^{p-1}$ (a Gaussian-case sketch follows below).
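A sketch of the Gaussian case, in the spirit of Meinshausen & Bühlmann: lasso-regress each $X_s$ on $X_{\setminus s}$, take the support as $\hat{N}(s)$, and combine neighborhoods. The AND rule used here is one of the two standard symmetrization rules (the OR rule is the other); lam and the threshold are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_select(X, lam):
    n, p = X.shape
    nbhd = []
    for s in range(p):
        y = X[:, s]
        Z = np.delete(X, s, axis=1)
        coef = Lasso(alpha=lam).fit(Z, y).coef_
        others = [t for t in range(p) if t != s]
        nbhd.append({others[j] for j in np.flatnonzero(np.abs(coef) > 1e-8)})
    # AND rule: keep edge (s, t) only if each endpoint selects the other.
    return {(s, t) for s in range(p) for t in nbhd[s] if t > s and s in nbhd[t]}

# e.g. neighborhood_select(X, lam=0.1) on the Gaussian samples drawn earlier.
```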

32 Empirical behavior: unrescaled plots
[Figure: star graph with a linear fraction of neighbors; success probability versus number of samples, for p = 64, 100, 225.]

33 Empirical behavior: appropriately rescaled
[Figure: star graph with a linear fraction of neighbors; success probability versus the control parameter, for p = 64, 100, 225.]

34-36 Sufficient conditions for consistent Ising selection
graph sequences $G_{p,d} = (V, E)$ with $p$ vertices and maximum degree $d$
edge weights $|\theta_{st}| \geq \theta_{\min}$ for all $(s, t) \in E$
draw n i.i.d. samples, and analyze the probability of success indexed by $(n, p, d)$
Theorem (Ravikumar, Wainwright & Lafferty, 2010): Under incoherence conditions, if the sample size satisfies $n > c_1 d^3 \log p$ and the regularization parameter satisfies $\lambda_n \geq c_2 \sqrt{\frac{\log p}{n}}$, then with probability greater than $1 - 2\exp(-c_3 \lambda_n^2 n)$:
(a) Correct exclusion: the estimated sign neighborhood $\hat{N}(s)$ correctly excludes all edges not in the true neighborhood.
(b) Correct inclusion: for $\theta_{\min} \geq c_4 d \lambda_n$, the method selects the correct signed neighborhood (a usage sketch follows below).
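A sketch of the method this theorem analyzes: per-node $\ell_1$-penalized logistic regression, with the signed support giving the estimated neighborhood. In scikit-learn, C is the inverse penalty strength, so the mapping $C \sim 1/(n\lambda_n)$ below is approximate up to the solver's loss scaling; the threshold is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ising_neighborhood(X, s, C=1.0):
    """X: (n, p) array of +/-1 spins; returns the signed neighborhood of node s."""
    y = X[:, s]
    Z = np.delete(X, s, axis=1)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(Z, y)
    others = [t for t in range(X.shape[1]) if t != s]
    return {others[j]: int(np.sign(c))
            for j, c in enumerate(clf.coef_.ravel()) if abs(c) > 1e-8}

# e.g. ising_neighborhood(X_ising, s=0) with samples from the Gibbs sketch.
```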

37 US Senate network (voting)

38-40 3. Info. theory: graph selection as channel coding
graphical model selection is an unorthodox channel coding problem:
codewords/codebook: graph $G$ in some graph class $\mathcal{G}$
channel use: draw sample $X_i = (X_{i1}, \ldots, X_{ip})$ from the Markov random field $Q_{\theta(G)}$
decoding problem: use the n samples $\{X_1, \ldots, X_n\}$ to correctly distinguish the codeword:
$$G \;\to\; Q(X \mid G) \;\to\; X_1, \ldots, X_n$$
Channel capacity for graph decoding is determined by the balance between:
the log number of models
the relative distinguishability of different models

41-43 Necessary conditions for $\mathcal{G}_{d,p}$
$G \in \mathcal{G}_{d,p}$: graphs with $p$ nodes and maximum degree $d$
Ising models with:
minimum edge weight: $|\theta_{st}| \geq \theta_{\min}$ for all edges
maximum neighborhood weight: $\omega(\theta) := \max_{s \in V} \sum_{t \in N(s)} |\theta_{st}|$
Theorem (Santhanam & Wainwright, 2012): If the sample size n is upper bounded by
$$n < \max\Big\{ \frac{d}{8} \log\frac{p}{8d}, \;\; \frac{\exp(\omega(\theta)/4)\, d\, \theta_{\min} \log(pd/8)}{128 \exp(3\theta_{\min}/2)}, \;\; \frac{\log p}{2\theta_{\min}\tanh(\theta_{\min})} \Big\}$$
then the probability of error of any algorithm over $\mathcal{G}_{d,p}$ is at least 1/2.
Interpretation (the three terms are evaluated numerically below):
Naive bulk effect: arises from the log cardinality $\log|\mathcal{G}_{d,p}|$
d-clique effect: difficulty of separating models that contain a near d-clique
Small weight effect: difficulty of detecting edges with small weights
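A plug-in sketch evaluating the three terms as reconstructed above for concrete $(p, d, \theta_{\min})$, to see which effect dominates. For this illustration $\omega$ is taken at its extreme value $d\,\theta_{\min}$; the helper name and the test values are hypothetical.

```python
import numpy as np

def lower_bound_terms(p, d, theta_min):
    omega = d * theta_min
    bulk   = (d / 8) * np.log(p / (8 * d))
    clique = (np.exp(omega / 4) * d * theta_min * np.log(p * d / 8)
              / (128 * np.exp(3 * theta_min / 2)))
    weak   = np.log(p) / (2 * theta_min * np.tanh(theta_min))
    return bulk, clique, weak

for theta in (0.05, 0.2, 1.0):
    print(theta, lower_bound_terms(p=1000, d=10, theta_min=theta))
```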

44-47 Some consequences
Corollary: For asymptotically reliable recovery over $\mathcal{G}_{d,p}$, any algorithm requires at least $n = \Omega(d^2 \log p)$ samples.
note that the maximum neighborhood weight satisfies $\omega(\theta^*) \geq d\,\theta_{\min}$, which forces $\theta_{\min} = O(1/d)$
from the small weight effect: $n = \Omega\big(\frac{\log p}{\theta_{\min}\tanh(\theta_{\min})}\big) = \Omega\big(\frac{\log p}{\theta_{\min}^2}\big)$
conclude that $\ell_1$-regularized logistic regression (LR) is within a factor $\Theta(d)$ of optimal for general graphs
