Max-Sum Exact Inference


1 Probabilistic Graphical Models - Inference - MAP - Max-Sum Exact Inference

2 Product vs. Summation

Taking logs turns max-product into max-sum while preserving the maximizing assignment:

Product:          a1 b1: 8, a1 b2: 1, a2 b1: 0.5, a2 b2: 2
Summation (log2): a1 b1: 3, a1 b2: 0, a2 b1: -1,  a2 b2: 1
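A quick check of this correspondence (a minimal sketch, not from the slides; the dictionary representation is my own):

```python
# The log transform turns max-product into max-sum and preserves the argmax.
import math

product = {("a1", "b1"): 8, ("a1", "b2"): 1, ("a2", "b1"): 0.5, ("a2", "b2"): 2}
summation = {k: math.log2(v) for k, v in product.items()}  # 3, 0, -1, 1

best_prod = max(product, key=product.get)
best_sum = max(summation, key=summation.get)
assert best_prod == best_sum == ("a1", "b1")
```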

3 Max-Sum Elimination in Chains: A - B - C - D - E

$\max_D \max_C \max_B \max_A \big(\theta(A,B) + \theta(B,C) + \theta(C,D) + \theta(D,E)\big)$
$= \max_D \max_C \max_B \big(\theta(B,C) + \theta(C,D) + \theta(D,E) + \max_A \theta(A,B)\big)$
$= \max_D \max_C \max_B \big(\theta(B,C) + \theta(C,D) + \theta(D,E) + \lambda_{1 \to 2}(B)\big)$

4 Factor Summation

θ1(A,B): a1 b1: 3, a1 b2: 0, a2 b1: -1, a2 b2: 1
θ2(B,C): b1 c1: 4, b1 c2: 1.5, b2 c1: 0.2, b2 c2: 2

θ1 + θ2 (A,B,C):
a1 b1 c1: 3+4 = 7      a1 b1 c2: 3+1.5 = 4.5
a1 b2 c1: 0+0.2 = 0.2  a1 b2 c2: 0+2 = 2
a2 b1 c1: -1+4 = 3     a2 b1 c2: -1+1.5 = 0.5
a2 b2 c1: 1+0.2 = 1.2  a2 b2 c2: 1+2 = 3

5 Factor Maximization

θ(A,B,C):
a1 b1 c1: 7    a1 b1 c2: 4.5
a1 b2 c1: 0.2  a1 b2 c2: 2
a2 b1 c1: 3    a2 b1 c2: 0.5
a2 b2 c1: 1.2  a2 b2 c2: 3

max over B:
a1 c1: 7  a1 c2: 4.5
a2 c1: 3  a2 c2: 3
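The two factor operations above are easy to state in code (a minimal sketch; the dict-of-assignments representation is an assumption, not the course's data structure):

```python
from itertools import product

# Factors as dicts from assignments to log-space scores, matching the tables.
theta1 = {("a1", "b1"): 3, ("a1", "b2"): 0, ("a2", "b1"): -1, ("a2", "b2"): 1}
theta2 = {("b1", "c1"): 4, ("b1", "c2"): 1.5, ("b2", "c1"): 0.2, ("b2", "c2"): 2}

A, B, C = ("a1", "a2"), ("b1", "b2"), ("c1", "c2")

# Factor summation: (theta1 + theta2)(a, b, c) = theta1(a, b) + theta2(b, c)
theta12 = {(a, b, c): theta1[a, b] + theta2[b, c] for a, b, c in product(A, B, C)}
assert theta12["a1", "b1", "c1"] == 7

# Factor maximization: eliminate B by maximizing it out
max_over_B = {(a, c): max(theta12[a, b, c] for b in B) for a in A for c in C}
assert max_over_B["a1", "c2"] == 4.5
```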

6 Max-Sum Elimination in Chains (A eliminated): B - C - D - E

$\max_D \max_C \max_B \big(\theta(B,C) + \theta(C,D) + \theta(D,E) + \lambda_{1 \to 2}(B)\big)$
$= \max_D \max_C \big(\theta(C,D) + \theta(D,E) + \max_B (\theta(B,C) + \lambda_{1 \to 2}(B))\big)$
$= \max_D \max_C \big(\theta(C,D) + \theta(D,E) + \lambda_{2 \to 3}(C)\big)$

7 Max-Sum Elimination in Chains (A, B, C eliminated): D - E

$\max_D \max_C \big(\theta(C,D) + \theta(D,E) + \lambda_{2 \to 3}(C)\big)$
$= \max_D \big(\theta(D,E) + \lambda_{3 \to 4}(D)\big) = \lambda_4(E)$

The MAP value is $\max_e \lambda_4(e)$.
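Putting the steps together (a minimal sketch, not from the slides: the A-B and B-C tables match the running example, while the C-D and D-E tables are invented to complete the chain):

```python
# Max-sum variable elimination along the chain A - B - C - D - E.
vals = (0, 1)  # binary variables for illustration
theta = {
    "AB": {(0, 0): 3, (0, 1): 0, (1, 0): -1, (1, 1): 1},
    "BC": {(0, 0): 4, (0, 1): 1.5, (1, 0): 0.2, (1, 1): 2},
    "CD": {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1},    # invented
    "DE": {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 2},    # invented
}

def eliminate(pairwise, incoming):
    # lambda(right) = max_left (pairwise(left, right) + incoming(left))
    return {r: max(pairwise[l, r] + incoming[l] for l in vals) for r in vals}

lam = {x: 0.0 for x in vals}           # nothing eliminated yet
for edge in ("AB", "BC", "CD", "DE"):  # eliminate A, then B, then C, then D
    lam = eliminate(theta[edge], lam)  # lambda_{1->2}(B), ..., lambda_4(E)

print(max(lam.values()))               # MAP value max_e lambda_4(e); 10.0 here
```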

8 Max-Sum in Clique Trees

Chain A - B - C - D - E; clique tree: 1: A,B -- 2: B,C -- 3: C,D -- 4: D,E

Forward messages:
$\lambda_{1 \to 2}(B) = \max_A \theta_1$
$\lambda_{2 \to 3}(C) = \max_B (\theta_2 + \lambda_{1 \to 2})$
$\lambda_{3 \to 4}(D) = \max_C (\theta_3 + \lambda_{2 \to 3})$

Backward messages:
$\lambda_{4 \to 3}(D) = \max_E \theta_4$
$\lambda_{3 \to 2}(C) = \max_D (\theta_3 + \lambda_{4 \to 3})$
$\lambda_{2 \to 1}(B) = \max_C (\theta_2 + \lambda_{3 \to 2})$

9 Convergence of Message Passing

Once $C_i$ receives a final message from all neighbors except $C_j$, the message $\lambda_{i \to j}$ is also final (it will never change). Messages from leaves are immediately final.

(Same clique tree 1: A,B -- 2: B,C -- 3: C,D -- 4: D,E and messages as on the previous slide.)

10 Simple Example: chain A - B - C

θ1(A,B): a1 b1: 3, a1 b2: 0, a2 b1: -1, a2 b2: 1
θ2(B,C): b1 c1: 4, b1 c2: 1.5, b2 c1: 0.2, b2 c2: 2

θ1 + θ2 (A,B,C):
a1 b1 c1: 3+4 = 7      a1 b1 c2: 3+1.5 = 4.5
a1 b2 c1: 0+0.2 = 0.2  a1 b2 c2: 0+2 = 2
a2 b1 c1: -1+4 = 3     a2 b1 c2: -1+1.5 = 0.5
a2 b2 c1: 1+0.2 = 1.2  a2 b2 c2: 1+2 = 3

11 Simple Example

Clique tree: 1: A,B -- 2: B,C

θ1(A,B): a1 b1: 3, a1 b2: 0, a2 b1: -1, a2 b2: 1
θ2(B,C): b1 c1: 4, b1 c2: 1.5, b2 c1: 0.2, b2 c2: 2

Messages over B:
$\lambda_{1 \to 2}(B)$: b1: 3, b2: 1 (max over A of θ1)
$\lambda_{2 \to 1}(B)$: b1: 4, b2: 2 (max over C of θ2)

Beliefs:
β1(A,B): a1 b1: 3+4 = 7, a1 b2: 0+2 = 2, a2 b1: -1+4 = 3, a2 b2: 1+2 = 3
β2(B,C): b1 c1: 4+3 = 7, b1 c2: 1.5+3 = 4.5, b2 c1: 0.2+1 = 1.2, b2 c2: 2+1 = 3

12 Max-Sum BP at Convergence

Beliefs at each clique are max-marginals: $\beta_i(C_i) = \theta_i(C_i) + \sum_k \lambda_{k \to i}$

Calibration: cliques agree on shared variables.

β1(A,B): a1 b1: 3+4 = 7, a1 b2: 0+2 = 2, a2 b1: -1+4 = 3, a2 b2: 1+2 = 3
β2(B,C): b1 c1: 4+3 = 7, b1 c2: 1.5+3 = 4.5, b2 c1: 0.2+1 = 1.2, b2 c2: 2+1 = 3

(Both cliques assign max-marginal 7 to b1 and 3 to b2, so they are calibrated.)
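We can verify calibration numerically (a minimal sketch, not from the slides, reusing the belief tables above):

```python
beta1 = {("a1", "b1"): 7, ("a1", "b2"): 2, ("a2", "b1"): 3, ("a2", "b2"): 3}
beta2 = {("b1", "c1"): 7, ("b1", "c2"): 4.5, ("b2", "c1"): 1.2, ("b2", "c2"): 3}

for b in ("b1", "b2"):
    from_1 = max(v for (a, bb), v in beta1.items() if bb == b)  # max over A
    from_2 = max(v for (bb, c), v in beta2.items() if bb == b)  # max over C
    assert from_1 == from_2  # both cliques give the same max-marginal of B
```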

13 Summary

The same clique tree algorithm used for sum-product can be used for max-sum. As in sum-product, convergence is achieved after a single up-down pass. The result is a max-marginal at each clique C: for each assignment c to C, the score of the best completion of c.

14 Probabilistic Graphical Models - Inference - MAP - Finding a MAP Assignment

15 Decoding a MAP Assignment

Easy if the MAP assignment is unique:
- There is a single maximizing assignment at each clique, whose value is the θ value of the MAP assignment.
- Due to calibration, the choices at all cliques must agree.

θ(A,B,C): a1 b1 c1: 7, a1 b1 c2: 4.5, a1 b2 c1: 0.2, a1 b2 c2: 2, a2 b1 c1: 3, a2 b1 c2: 0.5, a2 b2 c1: 1.2, a2 b2 c2: 3
β1(A,B): a1 b1: 3+4 = 7, a1 b2: 0+2 = 2, a2 b1: -1+4 = 3, a2 b2: 1+2 = 3
β2(B,C): b1 c1: 4+3 = 7, b1 c2: 1.5+3 = 4.5, b2 c1: 0.2+1 = 1.2, b2 c2: 2+1 = 3

(Here the unique maximizer is a1, b1, c1, with value 7.)

16 Decoding a MAP Assignment

If the MAP assignment is not unique, we may have multiple choices at some cliques, and arbitrary tie-breaking may not produce a MAP assignment.

β1(A,B): a1 b1: 2, a1 b2: 1, a2 b1: 1, a2 b2: 2
β2(B,C): b1 c1: 2, b1 c2: 1, b2 c1: 1, b2 c2: 2

(Both a1 b1 and a2 b2 maximize β1, and both b1 c1 and b2 c2 maximize β2; independently picking a1 b1 in clique 1 and b2 c2 in clique 2 is inconsistent on B.)

17 Decoding a MAP Assignment

If the MAP assignment is not unique, we may have multiple choices at some cliques, and arbitrary tie-breaking may not produce a MAP assignment. Two options (see the traceback sketch below):
- Slightly perturb the parameters to make the MAP assignment unique.
- Use a traceback procedure that incrementally builds a MAP assignment, one variable at a time.
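A traceback for a calibrated chain of cliques might look like this (a minimal sketch, assuming beliefs stored as dicts over the two clique variables; the function name and interface are my own):

```python
def traceback(beliefs):
    """beliefs: list of dicts {(x_left, x_right): score} along a chain."""
    # First clique: take any maximizing assignment.
    left, right = max(beliefs[0], key=beliefs[0].get)
    assignment = [left, right]
    # Later cliques: maximize subject to the shared variable already chosen.
    for beta in beliefs[1:]:
        fixed = assignment[-1]
        _, nxt = max((k for k in beta if k[0] == fixed), key=beta.get)
        assignment.append(nxt)
    return assignment

beta1 = {("a1", "b1"): 2, ("a1", "b2"): 1, ("a2", "b1"): 1, ("a2", "b2"): 2}
beta2 = {("b1", "c1"): 2, ("b1", "c2"): 1, ("b2", "c1"): 1, ("b2", "c2"): 2}
print(traceback([beta1, beta2]))  # e.g. ['a1', 'b1', 'c1'], a consistent MAP
```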

18 Probabilistic Graphical Models - Inference - MAP - Tractable MAP Problems

19 Correspondence / Data Association

X_ij = 1 if i is matched to j, 0 otherwise
θ_ij = quality of the match between i and j

Find the highest-scoring matching: maximize $\sum_{ij} \theta_{ij} X_{ij}$ subject to mutual exclusion constraints. Easily solved using matching algorithms (see the sketch below).

Many applications:
- matching sensor readings to objects
- matching features in two related images
- matching mentions in text to entities
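For bipartite matching, off-the-shelf solvers apply directly (a minimal sketch, not from the slides; the score matrix is invented):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

theta = np.array([[0.9, 0.1, 0.3],   # theta[i, j] = match quality of i to j
                  [0.2, 0.8, 0.4],
                  [0.3, 0.5, 0.7]])

rows, cols = linear_sum_assignment(theta, maximize=True)
print([(int(i), int(j)) for i, j in zip(rows, cols)])  # [(0, 0), (1, 1), (2, 2)]
print(theta[rows, cols].sum())                         # ~2.4, the matching score
```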

20 3D Cell Reconstruction

[Figure: correspond tilt images, then compute the 3D reconstruction]

Matching weights: similarity of location and local neighborhood appearance.

Duchi, Tarlow, Elidan, and Koller, NIPS 2006. Amat, Moussavi, Comolli, Elidan, Downing, Horowitz, Journal of Structural Biology, 2006.

21 Mesh Registration

Matching weights: similarity of location and local neighborhood appearance.

[Anguelov, Koller, Srinivasan, Thrun, Pang, Davis, NIPS 2004]

22 Associative Potentials

An arbitrary network over binary variables, using only singleton potentials θ_i and supermodular pairwise potentials θ_ij, admits an exact MAP solution via algorithms for finding minimum cuts in graphs (see the sketch below). Many related variants, such as metric MRFs, admit efficient exact or approximate solutions.
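For a Potts-style special case of associative potentials, the min-cut reduction fits in a few lines (a minimal sketch, not from the slides; the node costs and coupling weights are invented, and the construction assigns label 0 to source-side nodes):

```python
import networkx as nx

u = {1: (0.0, 2.0), 2: (1.5, 0.5), 3: (3.0, 0.0)}  # u[i] = (cost of 0, cost of 1)
w = {(1, 2): 1.0, (2, 3): 1.0}                     # penalty when labels differ

G = nx.DiGraph()
for i, (c0, c1) in u.items():
    G.add_edge("s", i, capacity=c1)  # cutting s->i pays the cost of label 1
    G.add_edge(i, "t", capacity=c0)  # cutting i->t pays the cost of label 0
for (i, j), wij in w.items():
    G.add_edge(i, j, capacity=wij)   # cut when i and j take different labels
    G.add_edge(j, i, capacity=wij)

cut_value, (s_side, t_side) = nx.minimum_cut(G, "s", "t")
labels = {i: (0 if i in s_side else 1) for i in u}
print(labels, cut_value)  # {1: 0, 2: 1, 3: 1} with minimum energy 1.5
```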

23 Example: Depth Reconstruction

[Figure: view 1, view 2, depth reconstruction]

Scharstein & Szeliski, High-accuracy stereo depth maps using structured light. Proc. IEEE CVPR 2003.

24 Cardinality Factors

[Figure: a single factor "score" connected to A, B, C, D]

A factor over arbitrarily many binary variables X_1, ..., X_k whose score depends only on the count of ones: $\text{Score}(X_1, \ldots, X_k) = f\big(\sum_i X_i\big)$

Example applications:
- soft parity constraints
- a prior on the number of pixels assigned to a given category
- a prior on the number of instances assigned to a given cluster
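A cardinality factor touches many variables but is cheap to evaluate, since it only sees the count (a minimal sketch, not from the slides; the parity score is invented):

```python
def cardinality_score(x, f):
    # The factor depends on the assignment only through sum(x).
    return f(sum(x))

parity = lambda n: 0.0 if n % 2 == 0 else -1.0  # soft parity constraint
print(cardinality_score([1, 0, 1, 0], parity))  # 0.0: even count allowed
print(cardinality_score([1, 1, 1, 0], parity))  # -1.0: odd count penalized
```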

25 Sparse Pattern Factors

[Figure: a single factor "score" connected to A, B, C, D]

A factor over variables X_1, ..., X_k whose score is specified for some small number of assignments x_1, ..., x_k and is constant for all other assignments.

Examples: give a higher score to combinations that occur in real data
- in spelling, letter combinations that occur in a dictionary
- 5x5 image patches that appear in natural images
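A sparse pattern factor is just a lookup with a default (a minimal sketch, not from the slides; the pattern scores are invented):

```python
patterns = {("t", "h", "e"): 2.0, ("a", "n", "d"): 1.5}  # dictionary trigrams

def pattern_score(x, default=0.0):
    # Listed assignments get their own score; everything else a constant.
    return patterns.get(tuple(x), default)

print(pattern_score("the"))  # 2.0: a listed pattern
print(pattern_score("xqz"))  # 0.0: the constant default
```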

26 Convexity Factors

Ordered binary variables X_1, ..., X_k with a convexity constraint: the variables set to 1 must form a single contiguous run.

Examples:
- convexity of parts in image segmentation
- contiguity of word labeling in text
- temporal contiguity of subactivities
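Checking the constraint is a one-liner (a minimal sketch, not from the slides, using the contiguous-run reading of convexity stated above):

```python
def convexity_score(x, penalty=-1e9):
    # Score 0 if the 1's form one contiguous run (possibly empty), else penalize.
    ones = [i for i, v in enumerate(x) if v == 1]
    contiguous = not ones or ones[-1] - ones[0] + 1 == len(ones)
    return 0.0 if contiguous else penalty

print(convexity_score([0, 1, 1, 1, 0]))  # 0.0: one contiguous run
print(convexity_score([1, 0, 1, 0, 0]))  # -1e9: violates convexity
```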

27 Summary

Many specialized models admit a tractable MAP solution even though they do not have tractable algorithms for computing marginals. These specialized models are useful both on their own and as components in a larger model with other types of factors.

38 Probabilistic Graphical Models - Inference - MAP - Dual Decomposition

39 Problem Formulation

The score decomposes into singleton factors and large factors:
$\text{MAP}(\theta) = \max_x \Big( \sum_i \theta_i(x_i) + \sum_F \theta_F(x_F) \Big)$

40 Divide and Conquer

41 Divide and Conquer

Each slave is optimized independently:
$L(\lambda) = \sum_F \max_{x_F} \Big( \theta_F(x_F) - \sum_{i \in F} \lambda_{Fi}(x_i) \Big) + \sum_i \max_{x_i} \Big( \theta_i(x_i) + \sum_{F : i \in F} \lambda_{Fi}(x_i) \Big)$

L(λ) is an upper bound on MAP(θ) for any setting of the λ's.
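Why the bound holds (a short derivation, using the slave definitions above): for any single joint assignment $x$, the $\lambda$ terms cancel, so

$\text{MAP}(\theta) = \max_x \Big[ \sum_F \Big( \theta_F(x_F) - \sum_{i \in F} \lambda_{Fi}(x_i) \Big) + \sum_i \Big( \theta_i(x_i) + \sum_{F : i \in F} \lambda_{Fi}(x_i) \Big) \Big] \le L(\lambda)$

because letting each slave pick its own maximizer independently can only increase the value.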

42 Divide and Conquer: Example

A loop X1 - X2 - X3 - X4 with pairwise factors θ_F(X1,X2), θ_G(X2,X3), θ_H(X3,X4), θ_K(X1,X4) and singleton factors θ_1, ..., θ_4 decomposes into slaves:

Factor slaves:
$\theta_F(X_1,X_2) - \lambda_{F1}(X_1) - \lambda_{F2}(X_2)$
$\theta_G(X_2,X_3) - \lambda_{G2}(X_2) - \lambda_{G3}(X_3)$
$\theta_H(X_3,X_4) - \lambda_{H3}(X_3) - \lambda_{H4}(X_4)$
$\theta_K(X_1,X_4) - \lambda_{K1}(X_1) - \lambda_{K4}(X_4)$

Singleton slaves:
$\theta_1(X_1) + \lambda_{F1}(X_1) + \lambda_{K1}(X_1)$
$\theta_2(X_2) + \lambda_{F2}(X_2) + \lambda_{G2}(X_2)$
$\theta_3(X_3) + \lambda_{G3}(X_3) + \lambda_{H3}(X_3)$
$\theta_4(X_4) + \lambda_{K4}(X_4) + \lambda_{H4}(X_4)$

43 Divide and Conquer

Slaves don't have to be factors in the original model; any subset of factors that admits a tractable solution to the local maximization task can serve as a slave. For the loop above, combining two edges per slave:

Factor slaves:
$\theta_F(X_1,X_2) + \theta_G(X_2,X_3) - \lambda_{FG1}(X_1) - \lambda_{FG2}(X_2) - \lambda_{FG3}(X_3)$
$\theta_K(X_1,X_4) + \theta_H(X_3,X_4) - \lambda_{KH1}(X_1) - \lambda_{KH3}(X_3) - \lambda_{KH4}(X_4)$

Singleton slaves:
$\theta_1(X_1) + \lambda_{FG1}(X_1) + \lambda_{KH1}(X_1)$
$\theta_2(X_2) + \lambda_{FG2}(X_2)$
$\theta_3(X_3) + \lambda_{FG3}(X_3) + \lambda_{KH3}(X_3)$
$\theta_4(X_4) + \lambda_{KH4}(X_4)$

44 Divide and Conquer

In pairwise networks, we often divide the factors into a set of disjoint trees, with each edge factor assigned to exactly one tree. Other tractable classes of factor sets include matchings and associative models.

45 Example: 3D Cell Reconstruction

[Figure: correspond tilt images, then compute the 3D reconstruction]

Matching weights: similarity of location and local neighborhood appearance. Pairwise potentials: approximate preservation of relative marker positions across images.

Duchi, Tarlow, Elidan, and Koller, NIPS 2006. Amat, Moussavi, Comolli, Elidan, Downing, Horowitz, Journal of Structural Biology, 2006.

46 Probabilistic Graphical Models - Inference - MAP - Dual Decomposition Algorithm

47 Dual Decomposition Algorithm

Initialize all λ's to 0. Repeat for t = 1, 2, ...:
- Locally optimize all slaves: compute a maximizing assignment $\hat{x}_F$ for each factor slave and $\hat{x}_i$ for each singleton slave.
- For all F and i ∈ F: if $\hat{x}_F[i] \ne \hat{x}_i$, then update $\lambda_{Fi}(\hat{x}_F[i]) \mathrel{+}= \alpha_t$ and $\lambda_{Fi}(\hat{x}_i) \mathrel{-}= \alpha_t$, pushing the two slaves toward agreement.
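A toy end-to-end run (a minimal sketch, not from the slides: the chain X1 - X2 - X3, its tables, and the 200-iteration budget are all invented for illustration):

```python
V = (0, 1)  # binary domains
factor_tables = {
    "F": {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 0.0},  # theta_F(X1,X2)
    "G": {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 4.0, (1, 1): 0.5},  # theta_G(X2,X3)
}
scopes = {"F": (1, 2), "G": (2, 3)}
theta_i = {1: {0: 0.0, 1: 0.5}, 2: {0: 0.0, 1: 0.0}, 3: {0: 0.2, 1: 0.0}}

# Dual variables lambda_{F,i}(x_i), initialized to 0.
lam = {(f, i): {x: 0.0 for x in V} for f, sc in scopes.items() for i in sc}

for t in range(1, 201):
    alpha = 1.0 / t  # satisfies: sum alpha_t = inf, sum alpha_t^2 < inf
    xF, xi = {}, {}
    # Locally optimize factor slaves: theta_F(x) - sum_{i in F} lam_{F,i}(x_i)
    for f, (i, j) in scopes.items():
        tbl = factor_tables[f]
        xF[f] = max(tbl, key=lambda a: tbl[a] - lam[f, i][a[0]] - lam[f, j][a[1]])
    # Locally optimize singleton slaves: theta_i(x) + sum_F lam_{F,i}(x)
    for i in theta_i:
        xi[i] = max(V, key=lambda x: theta_i[i][x]
                    + sum(lam[f, i][x] for f, sc in scopes.items() if i in sc))
    # Subgradient step on disagreements, as in the update rule above.
    for f, sc in scopes.items():
        for pos, i in enumerate(sc):
            if xF[f][pos] != xi[i]:
                lam[f, i][xF[f][pos]] += alpha
                lam[f, i][xi[i]] -= alpha

print(xi)  # {1: 0, 2: 1, 3: 0}: all slaves agree, so this is a MAP assignment
```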

48 Dual Decomposition Convergence

Under weak conditions on the step size α_t, the λ's are guaranteed to converge:
$\sum_t \alpha_t = \infty, \qquad \sum_t \alpha_t^2 < \infty$
(for example, α_t = 1/t). Convergence is to a unique global optimum, regardless of initialization.

49 At Convergence

Each slave has a locally optimal solution over its own variables, but these solutions may not agree on the shared variables. If all slaves agree, the shared solution is a guaranteed MAP assignment. Otherwise, we need to solve a decoding problem to construct a joint assignment.

50 Options for Decoding x*

Several heuristics:
- If we use a decomposition into spanning trees, take the MAP solution of any tree.
- Have each slave vote on the X_i's in its scope, and for each X_i pick the value with the most votes.
- Take a weighted average of the sequence of messages sent regarding each X_i.

The score θ is easy to evaluate, so it is best to generate many candidates and pick the one with the highest score.

51 Upper Bound

L(λ) is an upper bound on MAP(θ), so for any assignment x:
$\text{score}(x) \le \text{MAP}(\theta) \le L(\lambda)$
and therefore
$\text{MAP}(\theta) - \text{score}(x) \le L(\lambda) - \text{score}(x)$
which bounds how far any candidate x is from the MAP value.

52 Important Design Choices

- Division of the problem into slaves: larger slaves (with more factors) improve convergence and often the quality of answers.
- Selection among the locally optimal solutions of each slave: try to move toward faster agreement.
- Adjusting the step size α_t.
- Methods for constructing candidate solutions.

53 Summary: Algorithm

Dual decomposition is a general-purpose algorithm for MAP inference. It divides the model into tractable components, solves each one locally, and passes messages to induce them to agree. Any tractable MAP subclass can be used in this setting.

54 Summary: Theory

Formally, dual decomposition is a subgradient optimization algorithm on a dual problem to MAP. This view provides important guarantees:
- an upper bound on the distance to the MAP value
- conditions that guarantee an exact MAP solution
- even some analysis of which decomposition into slaves is better

55 Summary: Practice

Pros:
- very general purpose
- best theoretical guarantees
- can use very fast, specialized MAP subroutines to solve large model components

Cons:
- not the fastest algorithm
- lots of tunable parameters / design choices
