Semi-Markov/Graph Cuts
1 Semi-Markov/Graph Cuts. Alireza Shafaei, University of British Columbia. August.
2-3 A Quick Review. For a general chain-structured UGM we have
$p(x_1, x_2, \dots, x_n) \propto \prod_{i=1}^{n} \phi_i(x_i) \prod_{i=2}^{n} \phi_{i,i-1}(x_i, x_{i-1}),$
illustrated by the chain X1 - X2 - X3 - X4 - X5 - X6 - X7. In this case we only have the local Markov property,
$x_i \perp x_1, \dots, x_{i-2}, x_{i+2}, \dots, x_n \mid x_{i-1}, x_{i+1}.$
Local Markov property in general UGMs: given the neighbours, we have conditional independence of the other nodes. (Marginal independence corresponds to reachability.)
4-5 A Quick Review. For chain-structured UGMs we learned the Viterbi decoding algorithm.
Forward phase:
$V_{1,s} = \phi_1(s), \qquad V_{i,s} = \max_{s'} \{\phi_i(s)\, \phi_{i,i-1}(s, s')\, V_{i-1,s'}\}.$
Backward phase: backtrack through the argmax values. Solves the decoding problem in $O(n s^2)$ instead of $O(s^n)$.
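To make the recursion concrete, here is a minimal NumPy sketch of Viterbi decoding for a chain UGM. It is not part of the slides; the array layout (node_pot of shape (n, S) for the phi_i, edge_pot of shape (n, S, S) for the phi_{i,i-1}) is an illustrative assumption.

```python
import numpy as np

def viterbi_decode(node_pot, edge_pot):
    """Decode a chain UGM. node_pot is (n, S); edge_pot is (n, S, S) with
    edge_pot[i, s, s_prev] = phi_{i,i-1}(s, s_prev) for i >= 1."""
    n, S = node_pot.shape
    V = np.zeros((n, S))
    back = np.zeros((n, S), dtype=int)
    V[0] = node_pot[0]                       # V_{1,s} = phi_1(s)
    for i in range(1, n):
        # scores[s, s_prev] = phi_i(s) * phi_{i,i-1}(s, s_prev) * V_{i-1, s_prev}
        scores = node_pot[i][:, None] * edge_pot[i] * V[i - 1][None, :]
        back[i] = scores.argmax(axis=1)      # best previous state for each s
        V[i] = scores.max(axis=1)
    # backward phase: backtrack through the argmax values
    x = np.zeros(n, dtype=int)
    x[-1] = V[-1].argmax()
    for i in range(n - 1, 0, -1):
        x[i - 1] = back[i, x[i]]
    return x

# tiny usage example with random potentials (3 states, chain of length 7)
rng = np.random.default_rng(0)
print(viterbi_decode(rng.random((7, 3)), rng.random((7, 3, 3))))
```

The two nested array dimensions give the $O(n s^2)$ cost mentioned above.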
6-9 A Quick Review. For inference in chain-structured UGMs we learned the forward-backward algorithm.
1. Forward phase (sums up paths from the beginning):
$V_{1,s} = \phi_1(s), \qquad V_{i,s} = \sum_{s'} \phi_i(s)\, \phi_{i,i-1}(s, s')\, V_{i-1,s'}, \qquad Z = \sum_s V_{n,s}.$
2. Backward phase (sums up paths to the end):
$B_{n,s} = 1, \qquad B_{i,s} = \sum_{s'} \phi_{i+1}(s')\, \phi_{i+1,i}(s', s)\, B_{i+1,s'}.$
3. Marginals are given by $p(x_i = s) \propto V_{i,s} B_{i,s}$.
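A matching sketch of the forward-backward recursions, under the same assumed array layout as the Viterbi sketch. (In practice the messages are rescaled at every step to avoid numerical underflow; that is omitted here for clarity.)

```python
import numpy as np

def forward_backward(node_pot, edge_pot):
    """Return Z and the node marginals p(x_i = s) for a chain UGM."""
    n, S = node_pot.shape
    V = np.zeros((n, S))
    B = np.ones((n, S))
    V[0] = node_pot[0]                                   # V_{1,s} = phi_1(s)
    for i in range(1, n):
        # V_{i,s} = sum_{s'} phi_i(s) * phi_{i,i-1}(s, s') * V_{i-1, s'}
        V[i] = node_pot[i] * (edge_pot[i] @ V[i - 1])
    Z = V[-1].sum()                                      # Z = sum_s V_{n,s}
    for i in range(n - 2, -1, -1):
        # B_{i,s} = sum_{s'} phi_{i+1}(s') * phi_{i+1,i}(s', s) * B_{i+1, s'}
        B[i] = edge_pot[i + 1].T @ (node_pot[i + 1] * B[i + 1])
    marginals = V * B / Z                                # p(x_i = s) proportional to V_{i,s} B_{i,s}
    return Z, marginals

rng = np.random.default_rng(0)
Z, marg = forward_backward(rng.random((7, 3)), rng.random((7, 3, 3)))
print(Z, marg.sum(axis=1))   # each row of the marginals sums to 1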
10-15 A Quick Review. For chain-structured UGMs we learned the forward-backward and Viterbi algorithms. The same idea was generalized to tree-structured UGMs. For graphs with a small cutset we learned the Cutset Conditioning method, and for graphs with small tree-width we learned the Junction Tree method. Here are two more groups of problems that we can handle exactly in polynomial time: Semi-Markov chain-structured UGMs, and binary, attractive-state UGMs.
16-19 Semi-Markov chain-structured UGMs. Local Markov property in general chain-structured UGMs: given the neighbours, we have conditional independence of the other nodes. In Semi-Markov chain-structured models: given the neighbours and their lengths, we have conditional independence of the other nodes. A subsequence of nodes can have the same state, so you can encourage smoothness. This is useful when you wish to keep track of how long you have been staying in the same state.
20-23 Semi-Markov chain-structured UGMs. Previously, the potential of each edge was a function of the neighbouring vertices, $\phi_{i,i-1}(s, s')$. In Semi-Markov chain-structured models we define the potential as $\phi_{i,i-1}(s, s', l)$: the potential of making a transition from $s'$ to $s$ after $l$ steps. You can encourage staying in certain states for a period of time.
24-29 Decoding the Semi-Markov chain-structured UGMs. Let us look at the Viterbi decoding again:
$V_{1,s} = \phi_1(s), \qquad V_{i,s} = \phi_i(s) \max_{s'} \{\phi_{i,i-1}(s, s')\, V_{i-1,s'}\}.$
How can we update the formula to solve Semi-Markov chain structures?
$V_{1,s} = \phi_1(s), \qquad V_{i,s} = \phi_i(s) \max_{s', l} \{\phi_{i,i-1}(s, s', l)\, V_{i-l,s'}\}.$
Depending on the application we can bound the maximum possible value of $l$ by $L$. For the unbounded case, $L$ is simply $n$, the total length of the chain. Note that this is different from having an order-$L$ Markov chain (why?).
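A minimal sketch of the Semi-Markov Viterbi update with the length bound L. Again not from the slides; the array layout edge_pot[i, s, s_prev, l-1] standing in for phi_{i,i-1}(s, s', l) is an assumption. Only the value of the best configuration is computed here; backtracking over the stored argmax pairs (s', l) would recover the decoding exactly as in the chain case.

```python
import numpy as np

def semi_markov_viterbi_value(node_pot, edge_pot, L):
    """node_pot: (n, S); edge_pot: (n, S, S, L) with
    edge_pot[i, s, s_prev, l-1] = phi_{i,i-1}(s, s_prev, l)."""
    n, S = node_pot.shape
    V = np.zeros((n, S))
    V[0] = node_pot[0]                                # V_{1,s} = phi_1(s)
    for i in range(1, n):
        best = np.zeros(S)
        for l in range(1, min(L, i) + 1):             # jump back l positions
            # candidate[s, s_prev] = phi_{i,i-1}(s, s_prev, l) * V_{i-l, s_prev}
            candidate = edge_pot[i, :, :, l - 1] * V[i - l][None, :]
            best = np.maximum(best, candidate.max(axis=1))
        V[i] = node_pot[i] * best                     # V_{i,s} = phi_i(s) * max_{s', l}(...)
    return V, V[-1].max()

rng = np.random.default_rng(0)
V, val = semi_markov_viterbi_value(rng.random((7, 3)), rng.random((7, 3, 3, 4)), L=4)
print(val)
```

The extra maximization over $l$ multiplies the chain cost by $L$, giving $O(n L s^2)$.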
30-33 Inference in the Semi-Markov chain-structured UGMs. Forward-backward algorithm for the Semi-Markov models.
Forward phase (sums up paths from the beginning):
$V_{1,s} = \phi_1(s), \qquad V_{i,s} = \phi_i(s) \sum_{s', l} \phi_{i,i-1}(s, s', l)\, V_{i-l,s'}, \qquad Z = \sum_s V_{n,s}.$
Backward phase (sums up paths to the end):
$B_{n,s} = 1, \qquad B_{i,s} = \sum_{s', l} \phi_{i+1}(s')\, \phi_{i+1,i}(s', s, l)\, B_{i+l,s'}.$
Marginals are given by $p(x_i = s) \propto V_{i,s} B_{i,s}$. Questions?
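A sketch of the corresponding forward pass for Z, implementing the summation recursion above with the same assumed array layout as the Viterbi sketch; the backward pass is analogous.

```python
import numpy as np

def semi_markov_partition(node_pot, edge_pot, L):
    """Compute Z with the Semi-Markov forward recursion."""
    n, S = node_pot.shape
    V = np.zeros((n, S))
    V[0] = node_pot[0]                                   # V_{1,s} = phi_1(s)
    for i in range(1, n):
        total = np.zeros(S)
        for l in range(1, min(L, i) + 1):
            # sum over s' of phi_{i,i-1}(s, s', l) * V_{i-l, s'}
            total += edge_pot[i, :, :, l - 1] @ V[i - l]
        V[i] = node_pot[i] * total
    return V[-1].sum()                                   # Z = sum_s V_{n,s}
```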
34-43 Solving with graph cuts. Restricting the structure of the graph is just one way to simplify our tasks; we can also restrict the potentials. Here we look at a group of pairwise UGMs with the following restrictions: (1) binary variables, and (2) pairwise potentials that make the problem submodular. Such models can be decoded by reformulation as a Max-Flow problem. The approach can be generalized to non-binary cases, and it can be 2-approximated when the problem is not submodular (under a different constraint). In the general case decoding is known to be NP-Hard. The following material is borrowed from Simon Prince's (UCL) slides, available at computervisionmodels.com.
44-46 The Max-Flow problem. The goal is to push as much flow as possible through a directed graph from the source to the sink. The flow along an edge cannot exceed the (non-negative) capacity $C_{ij}$ associated with that edge.
47-49 Saturation. When we push the maximum flow from source to sink, there must be at least one saturated edge on every path from source to sink; otherwise you could push more flow. The set of saturated edges therefore separates the source and the sink. This cut is the minimum cut, and its total capacity equals the maximum flow.
50-57 An example. In the figures, each edge is labelled with two numbers: current flow / total capacity. Choose any path from source to sink with spare capacity and push as much flow as possible along it, then repeat (the intermediate slides step through these augmenting paths). Eventually no further augmenting path exists, and the saturated edges partition the graph into two subgraphs.
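To experiment with the augmenting-path picture, here is a small sketch using networkx on a toy graph; the graph and its capacities are made up, not the ones drawn in the slides.

```python
import networkx as nx

# toy directed graph with capacities; 's' is the source, 't' the sink
G = nx.DiGraph()
G.add_edge('s', 'a', capacity=3.0)
G.add_edge('s', 'b', capacity=2.0)
G.add_edge('a', 'b', capacity=1.0)
G.add_edge('a', 't', capacity=2.0)
G.add_edge('b', 't', capacity=3.0)

flow_value, flow = nx.maximum_flow(G, 's', 't')              # push as much flow as possible
cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')

print(flow_value, cut_value)      # equal, by the max-flow min-cut theorem
print(source_side, sink_side)     # the saturated edges cross this partition
```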
58-59 The binary MRFs. In the simplest form, let us constrain the pairwise potentials for adjacent nodes $m, n$ to be: $\phi_{m,n}(0,0) = \phi_{m,n}(1,1) = 0$, $\phi_{m,n}(1,0) = \theta_{10}$, $\phi_{m,n}(0,1) = \theta_{01}$. We will build a graph such that each cut corresponds to a configuration.
60-63 The binary MRFs (the graph construction is illustrated step by step in the figures on these slides).
64-66 The binary MRFs. In the general case, the constraint is $\theta_{10} + \theta_{01} > \theta_{11} + \theta_{00}$ (attraction). If it is met, the problem is called submodular and we can solve it in polynomial time.
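Below is a sketch of the standard source/sink construction for the simplest case above ($\phi_{m,n}(0,0) = \phi_{m,n}(1,1) = 0$), using networkx. Here the $\theta$ values and the unary terms are treated as non-negative costs to be minimized, so the minimum cut directly gives the minimum-cost labelling; the three-node chain and the particular numbers are illustrative assumptions, not the example from the slides.

```python
import networkx as nx

def decode_binary_submodular(unary, pairs, theta10, theta01):
    """Minimize sum_i unary[i][x_i] + pairwise costs theta10 / theta01 for
    disagreeing neighbours, over binary labels (theta00 = theta11 = 0,
    all costs assumed non-negative)."""
    G = nx.DiGraph()
    for i, (cost0, cost1) in enumerate(unary):
        G.add_edge('s', i, capacity=cost1)   # cut iff x_i = 1: pays the cost of label 1
        G.add_edge(i, 't', capacity=cost0)   # cut iff x_i = 0: pays the cost of label 0
    for m, n in pairs:
        G.add_edge(m, n, capacity=theta01)   # cut iff x_m = 0 and x_n = 1
        G.add_edge(n, m, capacity=theta10)   # cut iff x_m = 1 and x_n = 0
    cut_value, (source_side, _) = nx.minimum_cut(G, 's', 't')
    # nodes that stay with the source take label 0, the rest take label 1
    labels = [0 if i in source_side else 1 for i in range(len(unary))]
    return labels, cut_value

# three nodes in a chain, with pairwise costs discouraging label changes
labels, energy = decode_binary_submodular(
    unary=[(0.2, 1.0), (0.9, 0.1), (0.8, 0.3)],   # (cost of label 0, cost of label 1) per node
    pairs=[(0, 1), (1, 2)], theta10=0.5, theta01=0.5)
print(labels, energy)
```

Every s-t cut severs exactly one of the two unary edges per node plus the pairwise edges between disagreeing neighbours, which is why the minimum cut corresponds to the best configuration.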
67-71 Other cases. Another type of constraint allows approximate solutions: the pairwise potential being a metric. The Alpha Expansion algorithm (next week) uses the max-flow idea as a subroutine to do coordinate descent in the label space.
72-77 Conclusion. Decoding and inference are still efficient with Semi-Markov models, which are useful if we need to control the length of each state over a sequence. Graph cuts help with decoding on models with pairwise potentials: exact solutions in the binary and multi-label cases if the potentials are submodular, and approximate solutions in the multi-label case if the potential is a metric.
78 Thank you! Questions?