Sistemi Cognitivi per le

Size: px
Start display at page:

Download "Sistemi Cognitivi per le"

Transcription

1 Università degli Studi di Genova Dipartimento di Ingegneria Biofisica ed Elettronica Sistemi Cognitivi per le Telecomunicazioni Prof. Carlo Regazzoni DIBE

2 An Introduction to Bayesian Networks Markov random fields This presentation is included in data fusion as a probabilistic situation awareness tool for problem representation: state estimation. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 2

3 Outline Probabilistic graphical models Bayesian Networks Representation of Bayesian Networks Dynamic Bayesian Networks Data Fusion with Dynamic Bayesian Networks Markov random fields (MRF) MRF: conditional independence MRF: Cliques MRF: Factorization MRF: Example of MRF Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 3

4 Part I A brief introduction to probabilistic graphical models Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 4

5 Probabilistic graphical models Definition: Diagrammatic representations of probability distribution. Properties: A simple way to visualize the structure of a probabilistic model Insights into the properties of the model, including conditional independence properties Expressing complex computations in terms of graphical manipulations Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 5

6 Probabilistic graphical models A graph comprises nodes/ vertices Nodes are connected by edges/ links/ arcs Each node represents one (or a group of) random variable The links express probabilistic relationship between these random variables The link Node a Node a Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 6

7 Probabilistic graphical models Two major classes of graphical models: Bayesian Networks (directed graphical models) Markov random fields (undirected graphical models) Directed graphical models: The links of the graphs have a particular directionality indicated by arrows. Undirected graphical models: The links do not carry arrows and have no directional significance. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 7

8 Probabilistic graphical models Directed graphical models: Useful for expressing causal relationships between random variables. Undirected graphical models: Better suited to express soft constraints between random variables. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 8

9 Part II Bayesian Networks Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 9

10 Bayesian networks To motivate the use of directed graphs consider the following example: Consider the joint distribution ib ti p( a, b, c ) over the random variables a, b, and c Applying the product rule we have: (,, ) = p( c a, b) p( a, b) p a b c = p( c a, b) p( b a) p( a) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 1

11 Bayesian networks First, we introduce a node for each random variable: a b c Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 11

12 Bayesian networks Then, for each conditional distribution we add directed links (arrows) from the nodes on which the distribution is conditioned. a b c Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 12

13 Bayesian networks Explanations: ( ) For the factor p b a there will be a link from node b to node a. We say that the node a is the parent of node b. And the node b is the child of node a. ( ) For the factor p c a, b there will be links from nodes a and b to node c. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 13

14 Bayesian networks ( ) The distribution of p a, b, c is symmetrical with respect to three random variables. In the previous example, the ordering a, b, c was chosen. a b c Different ordering results in different decompositions and hence a different graphical model. (,, ) = p ( cab, ) p ( ab, ) p abc p p = (, ) ( ) ( ) p c a b p b a p a Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 14

15 Bayesian networks Other ordering can be as follows: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ordering a, b, c p a, b, c = p c a, b p b a p a ordering b, a, c p a, b, c = p c a, b p a b p b ordering a, c, b p a, b, c = p b a, c p c a p a ordering cab,, p abc,, = p bac, p ac p c ordering b, c, a p a, b, c = p a b, c p c b p b ordering c, b, a p a, b, c = p ab, c p b c p c Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 15

16 Bayesian networks The previous example can be extended. Consider the joint distribution over K variables. By repeated application of the product rule of probability we have: px,..., x = px x,..., x px x px ( ) ( ) ( ) ( ) 1 K k 1 K Each node has incoming links from all lower numbered nodes. This graph is fully connected because there is a link between every pair of nodes. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 16

17 Bayesian networks In general, the relationship between a directed graph and the corresponding distribution over random variables is given by the product over conditional distribution of all nodes of the graph conditioned on their parents. For a graph with K nodes the distribution is given by: K p x1 x = p x parent x (,..., ) ( ( )) K k k k = 1 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 17

18 Bayesian networks: a general example from a high-level view Task: object recognition Each observed data point corresponds to the image of one object. The latent variables have an interpretation as the position and orientation of the object. Latent (hidden) variables are the ones that are not observed. The goal: given a particular observed image, what is the posterior distribution over objects integrating over all possible positions and orientations? Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 18

19 Bayesian networks: a general example from a high-level view The graphical model is as follows: Object Position Orientation Image Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 19

20 Conditional independence Conditional independence is an important concept for probability distribution over multiple variables. Having a distribution ib ti over three random variables a, b and c, consider that the conditional distribution of a, given b and c, is such that it does not depend on b: (, ) = p( ac) p ab c We say that a is conditionally independent of b given c. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 2

21 Conditional independence The Conditional independence can be expressed in a different way: p ( a, b c) = p( ab, c) p( b c ) = ( ) p ( ) p ac p bc Therefore, the joint distribution of a and b, conditioned on c, factorizes into the product of marginal probabilities of a and b, conditioned on c. The colored-part used the equation in the previous page. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 21

22 Conditional independence The two following formulas for conditional independence are equivalent. (, ) = p( ac) p ab c (, ) = p( ac) p( bc) p abc The above formulas must hold for every ypossible value of c. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 22

23 Conditional independence The graphical model for the aforementioned conditional independence is as follows: p abc,, = p ac p bc p c ( ) ( ) ( ) ( ) c a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 23

24 Investigating conditional independence In order to investigate the conditional independence between a and b given c, three possible combinations of the graphical model can be investigated. These combinations are called: tail-to-tail, head-totail, and head-to-head (next slide) For each combination, two cases are considered (next slide): c is not observed (c is shown using a normal node) ) c is observed (c is shown using a shaded node) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 24

25 Investigating conditional independence c c Tail-to-tail Head-to-tail a b a b a c b a c b a b a b Head-to-head c c is not observed c c is observed Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 25

26 Investigating conditional independence tail-to-tail: we can consider a simple graphical interpretation by considering the path from a to b via c. The node c is tail-to-tail to because it is connected to the tail of two arrows. head-to-tail: the node c is connected to the head of one arrow and to the tail of the other. head-to-head: the node c is connected to the tail of two arrows. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 26

27 Investigating conditional independence Investigating the conditional independence between a and b given c, when c is tail-to-tail: c is not observed c is observed c c a b a b c is not observed c is observed Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 27

28 Investigating conditional independence Case 1: tail-to-tail, c is not observed: we marginalize both sides of the following equation (representing the graph) with respect to c. p a, b, c = p a c p b c p c ( ) ( ) ( ) ( ) c a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 28

29 Investigating conditional independence Marginalizing the equation with respect to c: (, ) p( ac) p( bc) p( c) p ab = c In general, it does not factorize into the product That means, given the above equation, in general: (, ) p( a) p( b) p a b Therefore, a and b are not conditionally independent given a null set (nothing has been observed). p( a) p( b) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 29

30 Investigating conditional independence Graphical interpretation: Consider the path from a to b given c. If c is not observed, the presence of the path connecting a and b causes these nodes to be dependent. c a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 3

31 Investigating conditional independence Case 2: tail-to-tail, c is observed. Bayes theorem p ( abc, ) = = = (,, ) p( c) p a b c ( ) ( ) ( ) p ac p b c p c ( ) p c ( ) ( ) p ac p b c Therefore, a and b are conditionally independent given c. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 31

32 Investigating conditional independence Graphical interpretation: Consider the path from a to b given c. If c is observed, the conditioned node (c) blocks the path from a to b causes these nodes to become conditionally dependent. c a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 32

33 Investigating conditional independence Investigating other cases in the same way we have: head-to-tail, tail c is unobserved: a and b are dependent. head-to-tail, c is observed: a and b are independent. head-to-head, c is unobserved: a and b are dependent. head-to-head, c is observed: a and b are independent. The head-to-head case is inverse with respect to the other two cases. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 33

34 Investigating conditional independence The important feature of graphical models: The conditional independence of the joint distribution can be read directly from the graph. To this end, no analytical manipulation is needed. To read the conditional independence directly from the graph, a general framework can be derived by a reasoning similar to the previous example. The framework is called d-separation (d stands for directed). ) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 34

35 Bayesian Network for solving problems: Video object tracking - an example Consider the following example: A given object should be tracked in an image sequence. To represent the object, the position of the object should be estimated in an image frame. The object position is represented by one point in the image frame called the reference point. The reference point can be any given point, e.g. the center of the object. What is the joint distribution of all random variables involving in tracking the object? Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 35

36 Bayesian Network for solving problems: Video object tracking - solution Here is an object (square) in an image frame: The image frame is a 2D matrix: Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 36

37 Bayesian Network for solving problems: Video object tracking - solution The object shape is represented by its corners. The red boxes are the object corners. Here is the corner-based representation of the object. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 37

38 Bayesian Network for solving problems: Video object tracking - solution Video object tracking solution Here is the corner-based representation of the object. It can be shown as a binary matrix: Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 38

39 Bayesian Network for solving problems: Video object tracking - solution Video object tracking solution Also it is possible to consider one of the object pixels (e.g. the center) as the object position (the red box) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 39

40 Bayesian Network for solving problems: Video object tracking - solution Also it is possible to consider one of the object pixels (e.g. the center) as the object position (the red box). So, the shape model is formed using the green arrows. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 4

41 Bayesian Network for solving problems: Video object tracking - solution The shape, represented by corners, is formed the relative coordinates of corners with respect to the position: C1 (2,2) (2,-2) C4 C2 (-2,2) (-2,-2) C3 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 41

42 Bayesian Network for solving problems: Video object tracking - solution To form the joint probability, to track the object in the next frame, we use the following notation: I N = x y I N N Image frame at time t: t image size: Object position at time t Object shape at time t: X = t {( x,y) 1 x x N,1 y y N} { } = { } S = S C X t i,t 1 i M i,t t 1 i M M is sthe number of corners es&c C is the corner coordinates Observations at time t: Z t = { Z1 Z N } (all image pixels, that may/may not be corners) S t = { } { X X } = ( 2,2 ),( 2, 2 ),( 2,2 ),( 2, 2) corner t Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 42

43 Bayesian Network for solving problems: Video object tracking - solution In the aforementioned example, the object position at time t-1 is used to build the object shape at time t-1. At time t, a corner extractor t extracts t the corners (observations). The object position at time t should be estimated. Therefore, we need to find the joint distribution over the above mentioned variables: p ( X,S,Z) = p( X,S, Z Z ) t t-1 t t t-1 1 N Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 43

44 Bayesian Network for solving problems: Video object tracking - solution From the shape represented before, it is clear that each observation (corner) can be available or unavailable independently from other corners. Given an object at time t-1, itsshape model can be formed. The shape at time t-1, is independent from the observations at time t. The position at time t is estimated using the shape at time t-1 and observations at time t. Therefore, position depends on both observations and shape. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 44

45 Bayesian Network for solving problems: Video object tracking - solution Using Bayesian network the joint distribution can be shown using the following graph: Z Z1 2 Z N St-1 X t Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 45

46 Bayesian Network for solving problems: Video object tracking - solution Using Bayesian network the joint distribution can be shown using the following graph: Z 1 Z 2 Z N S t-1 ( X,S,Z) = ( X,S, Z Z ) = ( X Z Z, S) ( S ) ( Z ) ( Z ) p p p p p p t t-1 t t t-1 1 N t 1 N t-1 1 N N X t = p ( Xt St-1) p( St-1) p( Xt Z n) p( Z n) N n= 1 n= 1 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 46

47 Bayesian Network for solving problems: Video object tracking - solution At the current example, we consider pixels. = 14 9 = 126 We define a function q to show if a given pixel i is a corner q ( Z i ) = 1 or it is not a corner q ( Z i ) = p ( Z i ) is the a priori probability of the pixel i to be a corner. We define it as follows: p( Z i ) =.5 i,1 i N p ( St-1) is the a priori probability of having the shape S at time t-1. since we have just one shape, p ( S t-1 ) = 1 p ( Xt St-1) is the probability that each position Xcan be the object position at time t. since no observation is still observed, all positions have equal probabilities: Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 47 I t

48 Bayesian Network for solving problems: Video object tracking - solution p ( Xt St-1) = N = 126 Therefore, the only unknown term is the posterior probability bilit of the position given some observations: For each position i if no corner is observed, If a corner is observed at position Z then: p ( Xt Z i ) ( ) { Z i Sj,t-1} Z i ε > if Xt + j :1 j M = 1 δ if Xt { Z i + Sj,t-1 } card ( S) p ( Xt Z n ) ( Xt Z n ).8 p = card S is the cardinality of the shape, i.e. the number of shape corners. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 48

49 Bayesian Network for solving problems: Video object tracking - solution p ( X Z ) To make t i a probability function, the marginal probability over all positions must sum to 1: 1 card ( S) δ card ( S) + ε( N card ( S) ) = 1 N δ = ε 1 card ( S) For example, there are 4 corners in the current shape definition, and hence, card S = ( ) 4 ε =.1 N = 126 δ =.2195 ε If we consider and since we will have: Setting the small value of is to avoid a zero matrix. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 49

50 Bayesian Network for solving problems: Video object tracking - solution The aforementioned values indicate that if a corner is observed, 122 object pixels will have a probability equal to.1 1 to be the object position and 4 object pixels will have a probability of Now, consider an example in which 6 corners are observed (next slide). The corners that will be observed in this example are as follows: a= Z31 : q( 3,3) = 1 d = Z91 : q( 7,7) = 1 b = Z 87 : q ( 3,7 ) = 1 e= Z39 : q ( 11,3 113 ) = 1 c=z : q 7,3 = 1 f =Z : q 12,6 = 1 ( ) ( ) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 5

51 Bayesian Network for solving problems: Video object tracking - solution The observations set Z - Nothing is observed Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 51

52 Bayesian Network for solving problems: p Video object tracking - solution ( ) p( ) XZ = XS -1 =.8 X I t n t t t t Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 52

53 Bayesian Network for solving problems: Video object tracking - solution a was observed a Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 53

54 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 54

55 The new probability values: p( XZ t 31) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 55

56 Bayesian Network for solving problems: Video object tracking - solution b was observed a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 56

57 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 57

58 Bayesian Network for solving problems: Video object tracking - solution The new probability values: p ( X ) t Z Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 58

59 Bayesian Network for solving problems: Video object tracking - solution c was observed a c b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 59

60 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a c b Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 6

61 Bayesian Network for solving problems: Video object tracking - solution The new probability values: p ( X ) t Z Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 61

62 Bayesian Network for solving problems: Video object tracking - solution d was observed a c b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 62

63 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a c b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 63

64 Bayesian Network for solving problems: Video object tracking - solution The new probability values: p ( X ) t Z Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 64

65 Bayesian Network for solving problems: Video object tracking - solution e was observed a c e b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 65

66 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a c e b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 66

67 Bayesian Network for solving problems: Video object tracking - solution The new probability values: p ( X ) t Z Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 67

68 Bayesian Network for solving problems: Video object tracking - solution f was observed a c e f b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 68

69 Bayesian Network for solving problems: Video object tracking - solution The probabilities of 4 positions increase. a c e f b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 69

70 Bayesian Network for solving problems: Video object tracking - solution The new probability values: p ( X ) t Z Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 7

71 Bayesian Network for solving problems: Video object tracking - solution Now, we have all probability values. Therefore: ( X, S ) ( ) ( ) ( ) ( -1, Z X S S t t t t t-1 t-1 Z n X Z t n ) p = p p p p = n= 1 n= constant ( Xt Z31) ( Xt Z35 ) ( Xt Z39 ) ( Xt Z68 ) ( Xt Z87 ) ( Xt Z91 ) p p p p p p Note that in the above multiplication, the. symbol is for entry-by-entry multiplication not the matrix multiplication. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 71

72 Bayesian Network for solving problems: Video object tracking - solution ( XZ) ( XZ ) ( XZ ) ( XZ ) ( XZ ) ( XZ) p p p p p p t 31 t 35 t 39 t 68 t 87 t x x x x x x x x x x x x x x Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 72

73 Bayesian Network for solving problems: Video object tracking - solution Multiplying all probability matrices, the posterior probability of the position will be maximized at position (5,5). 5) Therefore, the position (5,5) is the new position of the object. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 73

74 Bayesian Network for solving problems: Video object tracking - solution a c e f b d Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 74

75 Bayesian networks Representation Our goal is to represent the joint distribution over a set of random variables: {,, } χ = X1 Xn P Let us define Val ( X 1 ) as all the possible discrete assignments of X 1 The probability space of the full joint distribution is n i=1 val(x i ) (Eq.1) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 75

76 Bayesian networks Simplest Example Consider the problem of a company trying to hire a recent college graduate. The goal is to hire intelligent students but there is not possibility to measure intelligence directly. The company has access to the Scholastic Aptitude Test (SAT) scores, which are informative but not fully indicative. In this simple example, we induce two random variables: Intelligence that can take values {i,i 1 } and SAT score that can take values {s, s 1 }. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 76

77 Bayesian networks Simplest Example In this case that our random variables are discrete and binary-valued, the number of non-redundant parameters for a full distribution is defined by 2 n 1 (the last parameter is fully defined by the others) equal 2 to 2 1= 3 i i 1 Low Intelligence High Intelligence Intelligence I S P(I,S) i s.665 i s 1.35 s s 1 Low score High score SAT i 1 s.6 i 1 s Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 77

78 Bayesian networks Simplest Example Factorizing the joint distribution pis (, ) can give us more natural casuality meaning to the parameters. Our probabilty bilt distribution ib ti can be factorized as pis (, ) = pi () ps ( I) And the original distribution pis (, ) can be represented with the following CPTs: i i I s s 1 i.95.5 i Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 78

79 Bayesian networks Simplest Example From the mathematical perspective, the last alternative leads to the exactly same joint distribution (you can prove yourself by doing the math) The 2 defined CPTs required 3 non-redundant parameters to be fully specified. 1 parameter from the binomial distribution from first table and 2 parameters for the two binomial distributions of second table (one for every possible assigment of the parents) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 79

80 Bayesian networks Second Example Now let s assume that the company has also access to the student s grade in some course. We can enhance our Bayesian Network with the variable G that can take the values {g 1, g 2, g 3 } p ( ISG,, ) = p ( I ) p ( S I ) p ( G I ) g 1 g 2 A B g 3 C Grade Intelligence SAT I g 1 g 2 g 3 i i Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 8

81 Bayesian networks Second Example Some important remarks about the enhanced model: Although all the joint distribution changed, the CPTs defined in the first example are still valid. (adding nodes to some parts of the graphs does not mean change all the graph) According to Eg.(1), the total number of non-redudant parameters to describe the full distribution is 11 Using the current factorization allows to describe the distribution with only 7 non-redundant paramteres. Which mean that the factorized distribution is more compact. The less number of parameter come from the missing link between G and S. Implicitly this implies that Grades is conditional independent from SAT given Intelligence Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 81

82 Bayesian networks Third Example Lets consider a more complex scenario where we add two random variables D and L The grades now depend on how difficult the course is. It can take the values {d, d 1 }={easy {easy, hard} And a letter recommendation that can take values {l, l 1 } = {strong, weak} Difficulty Intelligence Grade SAT Letter Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 82

83 Bayesian networks Third Example d d Difficulty Intelligence i i g 1 g 2 g 3 i d i d i 1 d i 1 d Grade Letter l l 1 g g g SAT s s 1 i.95.5 i Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 83

84 Bayesian networks Third Example What is the factorization of the presented graph? What is the probability space dimension of the full probability distribution? What is the total t number of non-redudant d parameters of the full distribution? What is the total number of non-redudant parameters of the factorized distribution? Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 84

85 Bayesian networks Third Example What is the factorization of the presented graph? pidgsl (,,,, ) = pi () pdpg ( ) ( IDpS, ) ( I) plg ( ) What is the probability space dimension of the full probability distribution? 48 What is the total t number of non-redudant d parameters of the full posterior? 47 What is the total number of non-redudant parameters of the factorized distribution? 15 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 85

86 Bayesian networks Third Example How to calculate entrances of the full distribution using the CPTs? Let s say we want to find the joint probability of: A student being intelligent. The course to be easy. To obtain a B in given course. Obtain a good score in SAT exam. Receive a strong recommendation letter. pi d g s l pi pd pg i d ps i pl g (,,,, ) = ( ) ( ) (, ) ( ) ( ) =.3*.6*.8*.8*.4 =.468 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 86

87 Bayesian networks Behavior with Evidence Once selected the random variables, constructed the Graphical Model and set the CPTs, we can infer the posterior probability of an even given some evidence. py ( = y E= e) The naive way obtain the this posterior probability is by eliminating the entries in the joint inconsistent with our observation e and renormalize the result entries to sum up to 1. Then we compute the probability of the event y by summing the probabilities of all of the entries in the resulting posterior that are consistent with y. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 87

88 Bayesian networks Behavior with Evidence Let s see how the probabilities change once we get evidence. The probability to obtain a strong recommendation without 1 any evidence is around 5.2% pl ( ) If we know that the student is not intelligent, this probability decreases to 38.9% 1 pl ( i).389 If we discover that the course is an easy class, the probability increases again to 51.3% 1 pl ( i, d ).513 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 88

89 Bayesian networks Behavior with Evidence Let s see another interesting example of the effect called explaining away. Without seeing any evidence, our belief that a student is intelligent is 3% pi ( 1 ) = 3.3 If we have the evidence that the student got a C in the course, the probability of being intelligent decreases to 79%but 7.9% at the same time the probability of the course being difficult increases from 4% to 62,9% pi g 1 3 ( ).79 pd g 1 3 ( ).629 If the students submits the SAT score with a high score, his probability of being intelligent goes from 7.9% to 57.8% and the probability of the course to be difficult to 76% p i g s ( i, ).578 p d g s ( d, ).76 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 89

90 Bayesian networks Behavior with Evidence In the last example the high SAT score outweighs the poor grade because low intelligence students are extremely unlikely to receive high scores in SAT, whereas high intelligence students can still get C s if the course is difficult. Explaining away is an instance of a general reasoning pattern called intercasual reasoning, where different causes of the same effect can interact. This type of reasoning is avery common pattern in human reasoning. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 9

91 Bayesian networks Graph Independences What are the independence that can be drawn from the graph? ( L I, D, S G ) ( S D, G, L I) Difficulty Intelligence ( G S I, D ) ( I D) ( D IS, ) Grade Letter SAT Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 91

92 Dynamic Bayesian networks Dynamic Bayesian Networks (DBNs) can be considered as an extension of Bayesian Networks to handle temporal models. The term dynamic i is due to the fact that t they are use to represent a dynamic model (A model with a variable state over time) A DBN is defined by ( B, B ) where B defines the prior probability over the state and B is a two-slice temporal Bayes net (2TBN) which defines how the systems evolves in time. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 92

93 Dynamic Bayesian networks There are two types of edges (dependencies) that can be defined in a DBN. Intra-slice topology (within a slice) and inter-slice topology (between two slices) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 93

94 Dynamic Bayesian networks The decision of how to relate two variables, if either intra-slice (aka intra-time-slice) or inter-slice (aka intertime-slice) depends on how tight the coupling is between them. If the effect of one variable on the other is inmediate (much shorter then the time granularity) the influence should manifest as intra-slice edge. If the effect is slightly longer-term the influence should manifest as inter-slice edge. An inter-slice edge connecting two instances of the same variable is called persistence-edge Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 94

95 Dynamic Bayesian networks The DBN structure must satisfy the following assumptions: B The structure and CPDs in the time. do not change over Inter-slice arcs are all from left to right, in accordance with the temporal evolution. No cycles must be present in the intra-slice arcs. Thus we can view a DBN as a compact representation from which we can generate an infinite set of Bayesian networks (one for every T>) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 95

96 Dynamic Bayesian networks Hidden Markov Models (HMMs) and Kalman Filter Model (KFM) are specific nontrivial examples of DBNs. The are formed by one hidden variable with persistence links between time steps and one observed. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 96

97 Dynamic Bayesian networks HMM HMM is characterized by one dicrete hidden node. The probabilities that have to be defined are: px ( ) that is the initial state distribution and represents the uncertainty t on the intial value of thestate. t px ( k xk 1) that is the transition model. It describes how the state evolves in time. pz ( k xk ) that is the observation model and represents how the observations are related and generated by the hidden state. It is also called likelihood. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 97

98 Dynamic Bayesian networks KFM Revisited KFM is characterized by one continuous hidden node. All nodes are assumed to be linear-gaussian distributions. ib ti The probabilities then defined as: px ( ) = Nx (, Q) Initial state Transition model Observation model px ( x ) = Ν ( Fx + Gu, Q ) k k 1 k 1 k pz ( x) = Ν( Hx, V ) k k Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 98

99 Dynamic Bayesian networks Data Fusion There mainly three ways to fuse observations in DBNs Conditionally independent fusion Linearly condiationally dependent fusion Conditionally dependent d fusion Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 99

100 Dynamic Bayesian networks Data Fusion Mathematically this relations can be expressed, i 1 defining {, 2,..., L Z k = z k z k z k } as the set of different observations (or sensors), as: Conditionally indepent fusion pz x pz x pz x pz x 1 2 L ( k k) = ( k k) ( k k)... ( k k) Linearly condiationally dependent fusion pz x pz x pz x pz x L L ( k k) = α k ( k k) + α k ( k k) α k ( k k) Subject to: L i α k = 1 Conditionally dependent fusion i pz ( x) = pz ( z, x) pz ( z, x)... pz ( x) L 1: L 1 L 1 1: L 2 1 k k k k k k k k k k 1 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni

101 Dynamic Bayesian networks Data Fusion Let s consider the example of visual tracking: The likelihood coming from motion and color can be taken as conditionally independent (the motion of the object can be assumed not correlated to it s motion) The problem of fusing different cues in order to create active discriminative appearance models using different color spaces can be fused with conditional linear dependency where more weight is given to the cue that is more discriminative at that time step. Now consider the case where we not only want to use different color spaces but we want to actively find the color space that best separates foreground and background. Then try to find the best color description in this color space. In this case, the color description depends on the color space chosen. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 11

102 Dynamic Bayesian networks - Advance DBN structures The DBN in the figure is constructed from HMMs and is called factorial HMM. This type of model is very useful in a variety of appliacations, for example, when several sources of sound are being heard simultaneously through a single microphone. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 12

103 Dynamic Bayesian networks - Advance DBN structures The DBN in the figure is called coupled HMM. This type of model is constructed from a set of chains, with each chain having its own observation. Chains interact t via their state t variable affecting adjacent chains. This kind of HMM is useful, for example, to model interaction between different interacting objects. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 13

104 Bayesian networks belief propagation Consider the simplest tree structured network: X Y { } If evidence e= Y= y is observed, then from Bayes rule, the belief distribution of is given by: X ( ) = ( ) =α ( ) λ( ) BEL x p x p x x ( ) ( ) λ x is the likelihood vector e α p ( e ) p x is the prior probability of x = 1 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 14

105 Bayesian networks belief propagation ( x) p( x) p( y x) λ = e = Y= = Myx Where M yx is the conditional probability matrix: ( ) ( ) ( 1 1) ( 2 1) ( n 1) ( 1 2 ) ( 2 2 ) ( n 2 ) M = p y x = p Y= y X= x = yx p y x p y x p y x p y x p y x p y x p y 1 x p y 2 x p y x ( m ) ( m ) ( n m ) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 15

106 Bayesian networks belief propagation If Y is not observed directly, but it is supported by indirect observation e= { Z= z} we still have: ( ) = ( e) =α ( ) λ( ) BEL x p x p x x X Y Z But the likelihood vector can no longer be directly obtained from M but it must reflect the matrix M yx zyas well. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 16

107 Bayesian networks belief propagation ( x ) p ( e x ) p ( e x, y ) p ( y x ) λ = = = ( ) y y ( e ) ( ) = λ ( ) p y p y x M y λ y can also be obtained from yx M z y Y separates X from the evidence Therefore, the belief of a node in a chain can be obtained by ypropagating p gthe likelihood vector: M ut M M M x u yx zy T U X Y Z λ ( t) ( u) λ λ ( x) λ ( y) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 17

108 Bayesian networks bidirectional belief propagation Now consider that two evidences are observed at the + - two sides of the chain, we call them and + e In this way we have: U X Y e - ( ) = ( e +, e ) = α ( e, e + ) ( e + ) = απ( ) λ( ) BEL x p x p x p x x x e e π ( ) ( + x p x ) = e The posterior probability of x given the evidence ( ) ( ) λ x = p e x The likelihood of x + e Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 18

109 Bayesian networks bidirectional belief propagation Again, if Y is not directly observed, we know how to calculate the likelihood vector in the chain. But, if U is not directly observed, we need + to - e e propagate + the information about π ( x) from e down the chain. posterior probability + e T U X Y Z - e π ( ) ( + ) ( + x p x p x u, ) p( + e e u e ) = = = u u p( xu) π ( u) = π ( u) M xu U separates X from the evidence Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 19

110 Bayesian networks bidirectional belief propagation + e π is forward-propagated. λ is backward-propagated. Each node computes it own after obtaining the of its parent. Each node computes it own after obtaining the of its child. π λ π ( t ) π ( u ) π ( x ) π ( y ) π ( z ) T U X Y Z λ ( t) λ ( u) λ ( x) λ ( y) λ ( z) π λ e Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 11

111 Bayesian networks: summary Properties: It specifies a factorization of the joint distribution into a product of local conditional distribution. It also defines a set of conditional independence properties that must be satisfied by the distribution that factorizes according to the graph. Drawback: Due to presence of paths having head-to-head nodes, the d-separation test is somewhat subtle. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 111

112 part III Markov Random Fields Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 112

113 Markov random fields A Markov random field (MRF) is also known as Markov network or undirected graphical model. It has a set of nodes and a set of links (like Bayesian network). The links are undirected. An MRF is an alternative graphical semantics. Using an MRF, the conditional independence is determined by simple graph separation and easier than a Bayesian network. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 113

114 Markov random fields By removing directionality from the links, the asymmetry between parent and child nodes is removed. Therefore, the head-to-head nodes no longer arise. Using an MRF, the conditional independence is determined by simple graph separation and easier than a Bayesian network. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 114

115 MRF: conditional independence Assume the following MRF: We identify three sets of nodes: A, B, and C. We want to test the conditional independence between set A and set B. A C B Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 115

116 MRF: conditional independence To test the conditional independence: We consider all possible paths that connect nodes in set A to nodes in set B. If all such paths pass through one or more nodes in C, all such paths are blocked, so the conditional independence holds. A C B Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 116

117 MRF: conditional independence To test the conditional independence (continued): If there is at least one path that is not blocked, the property does not necessarily hold. More precisely: there exist at least some distributions corresponding to the graph that do not satisfy the conditional independence relation. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 117

118 MRF: conditional independence An alternative way to test the conditional independence: To remove all nodes in set C together with all links that connect to these nodes. Is there a path that connects any node in A to any node in B? A C B Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 118

119 MRF: factorization If two nodes A and B are not connected by a link, they must be conditionally independent, given all other nodes in the graph. The reason is that: There is no direct path between A and B. All other paths pass through nodes that are observed (they are blocked). Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 119

120 MRF: factorization Markov blanket of a node A consists of the set of neighboring nodes. The conditional distribution ib ti of A, conditioned d on all other variables in the graph, depends only on the variables in the Markov blanket (previous slide). A Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 12

121 MRF: factorization The conditional independence property between and can be expressed as: x j x i (, ) { } = { } ( ) ( ) { } p x x X p x X p x X i j \ i, j i \ i, j j \ i, j X X i denotes the set of all variables, with and \{ i, j} removed. x x j Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 121

122 MRF: factorization - clique To generalize factorization, we need to define another graphical concept called clique. A clique is a subset of nodes such hthat tthere is a link between all pairs of nodes in the subset (fully connected). ) A maximal clique is a clique such that it is not possible to add any other node to it. An example of cliques appears in the next slide. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 122

123 MRF: factorization - clique Cliques of two nodes: x, x, x, x, x, x, x, x, x, x { } { } { } { } { } Maximal cliques: x, x, x, x, x, x { } { } x 1 x 2 x 3 x 4 Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 123

124 MRF: factorization - clique We define the factors in the decomposition of joint distribution to be functions of variables in the cliques. To avoid loss of generality, we can define them over the maximal cliques. We denote a clique by C The set of variables in that clique is denoted by X C We denote potential functions over the maximal ψ X cliques by ( ) C C Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 124

125 MRF: factorization The joint distribution is written as the product of potential functions. p 1 X = ψ X & Z = ψ X Z ( ) ( ) ( ) C C C C C X C Z is called partition function (normalization constant). Considering only potential functions that ψc ( X C ) we ensure that p ( X ) Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 125

126 Markov random field: illustrative example Exploiting dependencies using Graph cuts algorithm on MRF Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 126

127 Markov random field: summary Properties: It specifies a factorization of the joint distribution into a product of potential functions defined over the maximal cliques. It also defines a set of conditional independence properties. Investigating the conditional independence properties is done by simple graph separation. Determining the conditional independence properties is easier than that of Bayesian networks. Corso di Sistemi Cognitivi per le Telecomunicazioni Prof. C. Regazzoni 127

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

A graph contains a set of nodes (vertices) connected by links (edges or arcs) BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,

More information

9 Forward-backward algorithm, sum-product on factor graphs

9 Forward-backward algorithm, sum-product on factor graphs Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

Machine Learning Lecture 14

Machine Learning Lecture 14 Many slides adapted from B. Schiele, S. Roth, Z. Gharahmani Machine Learning Lecture 14 Undirected Graphical Models & Inference 23.06.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de

More information

Directed and Undirected Graphical Models

Directed and Undirected Graphical Models Directed and Undirected Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Last Lecture Refresher Lecture Plan Directed

More information

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4 ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Sargur Srihari srihari@cedar.buffalo.edu 1 Topics 1. What are probabilistic graphical models (PGMs) 2. Use of PGMs Engineering and AI 3. Directionality in

More information

Template-Based Representations. Sargur Srihari

Template-Based Representations. Sargur Srihari Template-Based Representations Sargur srihari@cedar.buffalo.edu 1 Topics Variable-based vs Template-based Temporal Models Basic Assumptions Dynamic Bayesian Networks Hidden Markov Models Linear Dynamical

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

p L yi z n m x N n xi

p L yi z n m x N n xi y i z n x n N x i Overview Directed and undirected graphs Conditional independence Exact inference Latent variables and EM Variational inference Books statistical perspective Graphical Models, S. Lauritzen

More information

COMS 4771 Probabilistic Reasoning via Graphical Models. Nakul Verma

COMS 4771 Probabilistic Reasoning via Graphical Models. Nakul Verma COMS 4771 Probabilistic Reasoning via Graphical Models Nakul Verma Last time Dimensionality Reduction Linear vs non-linear Dimensionality Reduction Principal Component Analysis (PCA) Non-linear methods

More information

Probabilistic Graphical Models (I)

Probabilistic Graphical Models (I) Probabilistic Graphical Models (I) Hongxin Zhang zhx@cad.zju.edu.cn State Key Lab of CAD&CG, ZJU 2015-03-31 Probabilistic Graphical Models Modeling many real-world problems => a large number of random

More information

Recall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem

Recall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem Recall from last time: Conditional probabilities Our probabilistic models will compute and manipulate conditional probabilities. Given two random variables X, Y, we denote by Lecture 2: Belief (Bayesian)

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Roman Barták Department of Theoretical Computer Science and Mathematical Logic Summary of last lecture We know how to do probabilistic reasoning over time transition model P(X t

More information

CS 188: Artificial Intelligence. Bayes Nets

CS 188: Artificial Intelligence. Bayes Nets CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew

More information

Learning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University

Learning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University Learning from Sensor Data: Set II Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University 1 6. Data Representation The approach for learning from data Probabilistic

More information

Bayesian Network Representation

Bayesian Network Representation Bayesian Network Representation Sargur Srihari srihari@cedar.buffalo.edu 1 Topics Joint and Conditional Distributions I-Maps I-Map to Factorization Factorization to I-Map Perfect Map Knowledge Engineering

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Directed Graphical Models or Bayesian Networks
