Graphical Models (Lecture 1 - Introduction)


1 Graphical Models, Lecture 1 - Introduction. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

2 Material on Graphical Models. Many good books: Chris Bishop's book Pattern Recognition and Machine Learning (the Graphical Models chapter is available from his webpage in pdf format, as well as all the figures, many of which are used here in these slides!); Judea Pearl's Probabilistic Reasoning in Intelligent Systems; Steffen Lauritzen's Graphical Models. Unpublished material: Michael Jordan's unpublished book An Introduction to Probabilistic Graphical Models; Koller and Friedman's unpublished book Structured Probabilistic Models. Videos: Sam Roweis' videos on videolectures.net. Excellent!

3 Introduction. Query in quotes, # results in Google Scholar: "Kalman Filter" >0000; "EM algorithm"; "Hidden Markov Models"; "Bayesian Networks" >600; "Markov Random Fields" >5000; "Particle Filters" >4000; "Mixture Models" >4000; "Conditional Random Fields" >500; "Markov Chain Monte Carlo"; "Gibbs Sampling" >8000.

4 Introduction. Graphical Models have been applied to: Image Processing, Speech Processing, Natural Language Processing, Document Processing, Pattern Recognition, Bioinformatics, Computer Vision, Economics, Physics, Social Sciences.

5 Physics [image slide]

6 Biology [image slide]

7 Music [image slide]

8 Computer Vision: Matching [image slide]

9 Applications in Vision and PR: Matching [image slide]

10 Image Processing [image slide]

11 Image Processing [image slide]

12 Introduction. Technically: Graphical Models are multivariate probabilistic models which are structured in terms of conditional independence statements. Informally: models that represent a system by its parts and the possible relations among them, in a probabilistic way.

13 Introduction. Questions we want to ask about these models: estimating the parameters of the model given data; obtaining data samples from the model; computing probabilities of particular outcomes; finding the most likely outcome.

14 Univariate Example. p(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)). Estimate µ and σ given X = {x_1, ..., x_n}. Sample from p(x). Compute P(µ − σ ≤ x ≤ µ + σ) := ∫ from µ−σ to µ+σ of p(x) dx. Find argmax_x p(x).

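As a concrete illustration of these four operations for the univariate Gaussian, here is a minimal Python sketch (the toy data and variable names are my own, not from the slides): it estimates µ and σ by maximum likelihood, draws samples, computes P(µ−σ ≤ x ≤ µ+σ) via the error function, and uses the fact that the density peaks at x = µ.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.5, size=1000)   # toy sample (assumed data)

# 1) Estimate mu and sigma by maximum likelihood
mu_hat = X.mean()
sigma_hat = X.std()          # ML estimate uses 1/n, numpy's default

# 2) Sample from the fitted model
new_samples = rng.normal(mu_hat, sigma_hat, size=5)

# 3) Compute P(mu - sigma <= x <= mu + sigma) using the Gaussian CDF
def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

prob = gaussian_cdf(mu_hat + sigma_hat, mu_hat, sigma_hat) - \
       gaussian_cdf(mu_hat - sigma_hat, mu_hat, sigma_hat)   # ~0.6827

# 4) argmax_x p(x): for a Gaussian the density is maximized at the mean
x_star = mu_hat

print(mu_hat, sigma_hat, prob, x_star)
```
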
15 Multivariate Case. We want to do the same things for multivariate distributions structured according to CI, efficiently and accurately. For example, given p(x_1, ..., x_n; θ): estimate θ given a sample X and a criterion (e.g. ML); compute p(x_A), A ⊆ {1, ..., n} (marginal distributions); find argmax_{x_1, ..., x_n} p(x_1, ..., x_n; θ) (the MAP assignment).

16 Naively... When trying to answer the relevant questions, p(x_1) = Σ_{x_2} ... Σ_{x_N} p(x_1, ..., x_N) costs O(|X_2| |X_3| ··· |X_N|) operations, and similarly for the other questions: NOT GOOD. We need compact representations of multivariate distributions.

17 When to Use Graphical Models. When the compactness of the model arises from conditional independence statements involving its random variables. CAUTION: Graphical Models are useful in such cases. If the probability space is structured in different ways, Graphical Models may not, and in principle should not, be the right framework to represent and deal with the probability distributions involved.

18 Graphical Models, Lecture 2 - Basics. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

19 Notation. Basic definitions involving random quantities: X - a random variable; X = (X_1, ..., X_N) - a random vector; x = (x_1, ..., x_N) - a particular realization of X; 𝒳 - the set of all realizations (sample space); X_A - the random vector of variables indexed by A ⊆ {1, ..., N} (x_A for realizations); X_Ã - the random vector comprised of all variables other than those in X_A, i.e. Ã = {1, ..., N} \ A, A ∩ Ã = ∅ (x_Ã for realizations); p(x_A) := p({X_A = x_A}); p(x) := the probability that X assumes realization x. Basic properties of probabilities: 0 ≤ p(x) ≤ 1, Σ_{x ∈ 𝒳} p(x) = 1.

20 Conditioning and Marginalization. The two rules you will always need. Conditioning: p(x_A | x_B) = p(x_A, x_B) / p(x_B), for p(x_B) > 0. Marginalization: p(x_A) = Σ_{x_Ã ∈ 𝒳_Ã} p(x_A, x_Ã).

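These two rules are easy to check numerically. A minimal sketch (the joint table below is invented for illustration) computes a marginal by summing out a variable and a conditional by dividing by that marginal:

```python
import numpy as np

# Joint distribution p(x1, x2) over two binary variables, stored as a table.
# Values are arbitrary but sum to 1 (illustrative only).
p = np.array([[0.30, 0.10],
              [0.20, 0.40]])          # p[x1, x2]

# Marginalization: p(x1) = sum_{x2} p(x1, x2)
p_x1 = p.sum(axis=1)

# Conditioning: p(x2 | x1) = p(x1, x2) / p(x1), defined where p(x1) > 0
p_x2_given_x1 = p / p_x1[:, None]

print(p_x1)               # [0.4 0.6]
print(p_x2_given_x1)      # each row sums to 1
```
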
21 Independence and Conditional Independence. Independence: p(x_A, x_B) = p(x_A) p(x_B). Conditional independence: p(x_A, x_B | x_C) = p(x_A | x_C) p(x_B | x_C), or equivalently p(x_A | x_B, x_C) = p(x_A | x_C), or equivalently p(x_B | x_A, x_C) = p(x_B | x_C). Notation: X_A ⊥ X_B | X_C.

22 Conditional Independence: Examples. Weather tomorrow ⊥ weather yesterday | weather today. My genome ⊥ my grandparents' genome | my parents' genome. My mood ⊥ my wife's boss's mood | my wife's mood. A pixel's color ⊥ color of far away pixels | color of surrounding pixels.

23 Conditional Independence. The KEY fact is: p(x_A, x_B | x_C), a function of all the variables in A ∪ B ∪ C, equals p(x_A | x_C) · p(x_B | x_C), i.e. it factors as functions over proper subsets of the variables.

24 Conditional Independence. Therefore p(x_A, x_B | x_C) = p(x_A | x_C) p(x_B | x_C) means that p(x_A, x_B | x_C) cannot assume arbitrary values for arbitrary x_A, x_B, x_C: if you vary x_A and x_B for fixed x_C, you can only realize probabilities that satisfy the above condition.

25 What is a Graphical Model? Given a set of conditional independence statements for the random vector X = (X_1, ..., X_N), {X_{A_i} ⊥ X_{B_i} | X_{C_i}}, our object of study will be the family of probability distributions p(x_1, ..., x_N) in which these statements hold.

26 Questions to be addressed. Typical questions when we have a probabilistic model p(x): estimate the parameters of the model given data; compute probabilities of particular outcomes; find particularly interesting realizations, e.g. the MAP assignment. In order to manipulate the probabilistic model: we need to know the mathematical structure of p(x), and we need to find ways of computing efficiently and accurately within that structure.

27 An Exercise. What does p(x) look like? Example: p(x_1, x_2, x_3) where x_1 ⊥ x_3 | x_2. By the chain rule p(x_1, x_2, x_3) = p(x_1) p(x_2 | x_1) p(x_3 | x_1, x_2), and since p(x_3 | x_1, x_2) = p(x_3 | x_2), we get p(x_1, x_2, x_3) = p(x_1) p(x_2 | x_1) p(x_3 | x_2).

28 An Exercise. However, p(x_1, x_2, x_3) = p(x_3) p(x_2 | x_3) p(x_1 | x_2, x_3) is also a valid expansion, and since p(x_1 | x_2, x_3) = p(x_1 | x_2) (the same CI statement), we also have p(x_1, x_2, x_3) = p(x_3) p(x_2 | x_3) p(x_1 | x_2).

29 Cond. Inde. and Factorization. So CI seems to generate factorization of p(x)! Is this useful for the questions we want to ask? Let's see an example of how expensive it is to compute the marginal p(x_2). Without the factorization: p(x_2) = Σ_{x_1} Σ_{x_3} p(x_1, x_2, x_3), which is O(|X_1| |X_2| |X_3|). With the factorization: p(x_2) = Σ_{x_1} Σ_{x_3} p(x_1) p(x_2 | x_1) p(x_3 | x_2) = Σ_{x_1} p(x_1) p(x_2 | x_1) Σ_{x_3} p(x_3 | x_2) = Σ_{x_1} p(x_1) p(x_2 | x_1), which is O(|X_1| |X_2|).

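A small numerical check of this saving, assuming the chain factorization p(x_1, x_2, x_3) = p(x_1) p(x_2|x_1) p(x_3|x_2) discussed above (the tables are random, purely for illustration): both routes give the same marginal, but the factored one never builds the full joint.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4                                   # number of states per variable

def random_cpt(shape):
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)   # normalize over the last axis

p1 = random_cpt((K,))          # p(x1)
p2_1 = random_cpt((K, K))      # p(x2 | x1), rows indexed by x1
p3_2 = random_cpt((K, K))      # p(x3 | x2), rows indexed by x2

# Brute force: build the full joint, then sum over x1 and x3 -> O(K^3)
joint = p1[:, None, None] * p2_1[:, :, None] * p3_2[None, :, :]
p_x2_brute = joint.sum(axis=(0, 2))

# Using the factorization and the distributive law -> O(K^2)
p_x2_fact = p1 @ p2_1          # sum_{x1} p(x1) p(x2|x1); sum_{x3} p(x3|x2) = 1

print(np.allclose(p_x2_brute, p_x2_fact))   # True
```
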
30 Cond. Inde. and Factorization. Therefore conditional independence seems to induce a structure in p(x) that allows us to exploit the distributive law in order to make computations more tractable. However, what about the general case p(x_1, ..., x_N)? What form will p take in general, given a set of conditional independence statements? Will we be able to exploit the distributive law in this general case as well?

31 Re-Writing the Joint Distribution. A little exercise: p(x_1, ..., x_N) = p(x_N | x_1, ..., x_{N−1}) p(x_1, ..., x_{N−1}) = ... = Π_{i=1}^N p(x_i | x_{<i}), where <i := {j : j < i, j ∈ ℕ⁺}. The expansion can be carried out in any order: denote by π a permutation of the labels {1, ..., N}; above we have π = identity, i.e. π_i = i. So we can write p(x) = Π_{i=1}^N p(x_{π_i} | x_{<π_i}).

32 Re-Writing the Joint Distribution. So any p(x) can be written as p(x) = Π_{i=1}^N p(x_{π_i} | x_{<π_i}). Now assume that the following CI statements hold: p(x_{π_i} | x_{<π_i}) = p(x_{π_i} | x_{a_{π_i}}) ∀i, where a_{π_i} ⊆ <π_i. Then we immediately get p(x) = Π_{i=1}^N p(x_{π_i} | x_{a_{π_i}}).

33 Computing in Graphical Models. Algebra is boring, so let's draw this: represent variables as circles, and draw an arrow from j to i if j ∈ a_i. The resulting drawing will be a directed graph. Moreover, it will be acyclic (no directed cycles). Exercise: why? Directed Graphical Models: [figure: a directed graph on X_1, ..., X_6]

34 Computing in Graphical Models. Directed Graphical Models: [figure: the directed graph on X_1, ..., X_6]. p(x) = ? (Exercise.)

35 Bayesian Networks. This is why the name "Graphical Models". Such Graphical Models with arrows are called Bayesian Networks, Bayes Nets, Bayes Belief Nets, Belief Networks, or, more descriptively: Directed Graphical Models.

36 Bayesian Networks. A Bayesian Network associated to a DAG is a set of probability distributions in which each element can be written as p(x) = Π_i p(x_i | x_{a_i}), where random variable X_i is represented as a node in the DAG and a_i = {j : there is an arrow j → i in the DAG}; "a" is for parents. Colloquially, we say the BN "is" the DAG.

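To make the definition concrete, here is a minimal sketch of a discrete Bayesian network stored as a DAG plus one conditional probability table per node; the joint probability of a full assignment is the product of the local conditionals. The three-node network and its numbers are invented for illustration only.

```python
# A tiny hypothetical Bayesian network over binary variables: A -> C <- B
parents = {"A": [], "B": [], "C": ["A", "B"]}

# Conditional probability tables: cpt[node][parent values] = P(node = 1 | parents)
cpt = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.95},
}

def node_prob(node, value, assignment):
    """P(node = value | its parents), read from the table."""
    key = tuple(assignment[p] for p in parents[node])
    p1 = cpt[node][key]
    return p1 if value == 1 else 1.0 - p1

def joint_prob(assignment):
    """p(x) = prod_i p(x_i | x_{a_i}) for a full assignment."""
    result = 1.0
    for node, value in assignment.items():
        result *= node_prob(node, value, assignment)
    return result

print(joint_prob({"A": 1, "B": 0, "C": 1}))   # 0.3 * 0.4 * 0.8 = 0.096
```
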
37 Topological Sorts. A permutation π of the node labels which, for every node, makes each of its parents have a smaller index than that of the node is called a topological sort of the nodes in the DAG. Theorem: every DAG has at least one topological sort. Exercise: prove it.

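A standard way to produce one topological sort is Kahn's algorithm: repeatedly output a node with no remaining parents. A minimal sketch (the example graph is the hypothetical A→C←B network from above, plus C→D):

```python
from collections import deque

def topological_sort(parents):
    """Kahn's algorithm: return node labels so that parents always precede children."""
    children = {v: [] for v in parents}
    indegree = {v: len(ps) for v, ps in parents.items()}
    for v, ps in parents.items():
        for p in ps:
            children[p].append(v)

    queue = deque(v for v, d in indegree.items() if d == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    if len(order) != len(parents):
        raise ValueError("graph has a directed cycle, no topological sort exists")
    return order

print(topological_sort({"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}))
```
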
38 A Little Exercise Revisited. Remember? p(x_1, ..., x_N) = p(x_N | x_1, ..., x_{N−1}) p(x_1, ..., x_{N−1}) = ... = Π_{i=1}^N p(x_i | x_{<i}), where <i := {j : j < i, j ∈ ℕ⁺}; for a permutation π of the labels {1, ..., N} (above, π = identity) we can write p(x) = Π_{i=1}^N p(x_{π_i} | x_{<π_i}).

39 Exercises. How many topological sorts does a BN in which no CI statements hold have? How many topological sorts does a BN in which all CI statements hold have?

40 Graphical Models, Lecture 3 - Bayesian Networks. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

41 Some Key Elements of Lecture 2. We can always write p(x) = Π_i p(x_i | x_{<i}). Create a DAG with an arrow j → i whenever j ∈ <i. Impose CI statements by removing some arrows. The result will be p(x) = Π_i p(x_i | x_{a_i}). Now there will be permutations π other than the identity such that p(x) = Π_i p(x_{π_i} | x_{a_{π_i}}), with π_i > k for every k ∈ a_{π_i}.

42 Refreshing Exercise. Prove that the factorized form for the probability distribution of a Bayesian Network is indeed normalized (sums to 1).

43 Hidden CI Statements? We have obtained a BN by introducing very convenient CI statements, namely those that shrink the factors of the expansion p(x) = Π_{i=1}^N p(x_{π_i} | x_{<π_i}). By doing so, have we induced other CI statements? The answer is YES.

44 Head-to-Tail Nodes: Independence. [graph: a → c → b] Are a and b independent? Does a ⊥ b hold? Check whether p(a, b) = p(a) p(b): p(a, b) = Σ_c p(a, b, c) = Σ_c p(a) p(c | a) p(b | c) = p(a) p(b | a), which in general ≠ p(a) p(b).

45 Head-to-Tail Nodes: Factorization ⇒ CI? [graph: a → c → b] Does p(a, b, c) = p(a) p(c | a) p(b | c) imply a ⊥ b | c? Assume p(a, b, c) = p(a) p(c | a) p(b | c) holds. Then p(a, b | c) = p(a, b, c) / p(c) = p(a) p(c | a) p(b | c) / p(c) = p(c) p(a | c) p(b | c) / p(c) = p(a | c) p(b | c).

46 Head-to-Tail Nodes: CI ⇒ Factorization? [graph: a → c → b] Does a ⊥ b | c imply p(a, b, c) = p(a) p(c | a) p(b | c)? Assume a ⊥ b | c, i.e. p(a, b | c) = p(a | c) p(b | c). Then p(a, b, c) = p(a, b | c) p(c) = p(a | c) p(b | c) p(c) = (Bayes) p(a) p(c | a) p(b | c).

47 Tail-to-Tail Nodes: Independence. [graph: a ← c → b] Are a and b independent? Does a ⊥ b hold? Check whether p(a, b) = p(a) p(b): p(a, b) = Σ_c p(a, b, c) = Σ_c p(c) p(a | c) p(b | c), which in general ≠ p(a) p(b).

48 Tail-to-Tail Nodes: Factorization ⇒ CI? [graph: a ← c → b] Does p(a, b, c) = p(c) p(a | c) p(b | c) imply a ⊥ b | c? Assume p(a, b, c) = p(c) p(a | c) p(b | c). Then p(a, b | c) = p(a, b, c) / p(c) = p(c) p(a | c) p(b | c) / p(c) = p(a | c) p(b | c).

49 Tail-to-Tail Nodes: CI ⇒ Factorization? [graph: a ← c → b] Does a ⊥ b | c imply p(a, b, c) = p(c) p(a | c) p(b | c)? Assume a ⊥ b | c holds, i.e. p(a, b | c) = p(a | c) p(b | c). Then p(a, b, c) = p(a, b | c) p(c) = p(a | c) p(b | c) p(c).

50 Head-to-Head Nodes: Independence. [graph: a → c ← b] Are a and b independent? Does a ⊥ b hold? Check whether p(a, b) = p(a) p(b): p(a, b) = Σ_c p(a, b, c) = Σ_c p(a) p(b) p(c | a, b) = p(a) p(b). So here a ⊥ b does hold.

51 Head-to-Head Nodes: Factorization ⇒ CI? [graph: a → c ← b] Does p(a, b, c) = p(a) p(b) p(c | a, b) imply a ⊥ b | c? Assume p(a, b, c) = p(a) p(b) p(c | a, b) holds. Then p(a, b | c) = p(a, b, c) / p(c) = p(a) p(b) p(c | a, b) / p(c), which in general ≠ p(a | c) p(b | c).

52 CI ⇔ Factorization in 3-Node BNs. Therefore we conclude that conditional independence and factorization are equivalent for the atomic Bayesian Networks with only 3 nodes. Question: are they equivalent for any Bayesian Network? To answer, we need to characterize which conditional independence statements hold for an arbitrary factorization, and check whether a distribution that satisfies those statements will have such a factorization.

53 Blocked Paths. We start by defining a blocked path, which is one containing: an observed TT (tail-to-tail) or HT (head-to-tail) node, or a HH (head-to-head) node such that neither it nor any of its descendants is observed. [figure: two example graphs on nodes a, b, c, e, f illustrating blocked paths]

54 D-Separation. A set of nodes A is said to be d-separated from a set of nodes B by a set of nodes C if every path from A to B is blocked when C is in the conditioning set. [figure: the directed graph on X_1, ..., X_6] Exercise: is X_ d-separated from X_6 when the conditioning set is {X_, X_5}?

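One way to mechanize a d-separation test (not stated on the slides, but a standard equivalent procedure) is: keep only A ∪ B ∪ C and their ancestors, moralize that subgraph (connect co-parents and drop edge directions), delete C, and check whether A and B are still connected. A minimal sketch:

```python
def d_separated(parents, A, B, C):
    """True if A and B are d-separated given C in the DAG described by `parents`."""
    # 1) Ancestral subgraph of A | B | C
    keep, stack = set(), list(A | B | C)
    while stack:
        v = stack.pop()
        if v not in keep:
            keep.add(v)
            stack.extend(parents[v])

    # 2) Moralize: connect co-parents, then drop edge directions
    edges = set()
    for v in keep:
        ps = [p for p in parents[v] if p in keep]
        edges.update(frozenset((p, v)) for p in ps)
        edges.update(frozenset((p, q)) for p in ps for q in ps if p != q)

    # 3) Remove conditioning nodes and test reachability from A to B
    neigh = {v: set() for v in keep if v not in C}
    for e in edges:
        u, w = tuple(e)
        if u not in C and w not in C:
            neigh[u].add(w)
            neigh[w].add(u)
    seen, stack = set(A - C), list(A - C)
    while stack:
        v = stack.pop()
        for w in neigh[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return not (seen & B)

# Chain a -> c -> b: a and b are d-separated by {c}, but not by the empty set.
g = {"a": [], "c": ["a"], "b": ["c"]}
print(d_separated(g, {"a"}, {"b"}, {"c"}), d_separated(g, {"a"}, {"b"}, set()))
```
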
55 CI ⇔ Factorization for BNs. Theorem (Factorization ⇒ CI): if a probability distribution factorizes according to a directed acyclic graph, and if A, B and C are disjoint subsets of nodes such that A is d-separated from B by C in the graph, then the distribution satisfies A ⊥ B | C. Theorem (CI ⇒ Factorization): if a probability distribution satisfies the conditional independence statements implied by d-separation over a particular directed graph, then it also factorizes according to the graph.

56 Factorization ⇒ CI for BNs. Proof strategy: (DF) ⇒ (d-sep). d-sep: d-separation property. DF: Directed Factorization property.

57 CI ⇒ Factorization for BNs. Proof strategy: (d-sep) ⇒ (DL) ⇒ (DF). DL: Directed Local Markov property: x_α ⊥ x_{nd(α)} | x_{a_α} (each node is independent of its non-descendants given its parents). Thus we obtain (DF) ⇒ (d-sep) ⇒ (DL) ⇒ (DF), so the properties are equivalent.

58 Relevance of CI ⇔ Factorization for BNs. Has local, wants global: CI statements are usually what is known by the expert; the expert needs the model p(x) in order to compute things; the CI ⇒ Factorization part gives p(x) from what is known (the CI statements).

59 Graphical Models, Lecture 4 - Markov Random Fields. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

60 Changing the class of CI statements. We obtained BNs by assuming p(x_{π_i} | x_{<π_i}) = p(x_{π_i} | x_{a_{π_i}}) ∀i, where a_{π_i} ⊆ <π_i. We saw that in general such types of CI statements would produce others, and that all CI statements can be read off as d-separation in a DAG. However, there are sets of CI statements which cannot be satisfied by any BN.

61 Markov Random Fields. Ideally we would like to have more freedom. There is another class of Graphical Models, called Markov Random Fields (MRFs). MRFs allow for the specification of a different class of CI statements. The class of CI statements for MRFs can be easily defined by graphical means in undirected graphs.

62 Graph Separation. Definition of graph separation: in an undirected graph G, with A, B and C disjoint subsets of nodes, if every path from A to B includes at least one node from C, then C is said to separate A from B in G. [figure: node sets A, C, B with C separating A from B]

63 Graph Separation. Definition of Markov Random Field: an MRF is a set of probability distributions {p(x) : p(x) > 0} such that there exists an undirected graph G with disjoint subsets of nodes A, B, C in which, whenever C separates A from B in G, A ⊥ B | C holds in p. Colloquially we say that the MRF "is" such an undirected graph; but in reality it is the set of all probability distributions whose conditional independence statements are precisely those given by graph separation in the graph.

64 Cliques and Maximal Cliques. Definitions concerning undirected graphs: a clique of a graph is a complete subgraph of it, i.e. a subgraph where every pair of nodes is connected by an edge; a maximal clique of a graph is a clique which is not a proper subset of another clique. [figure: a 4-node example in which a pair of connected nodes forms a clique and a triangle including X_4 forms a maximal clique]

65 Factorization Property. Definition of factorization w.r.t. an undirected graph: a probability distribution is said to factorize with respect to a given undirected graph if it can be written as p(x) = (1/Z) Π_{c ∈ C} ψ_c(x_c), where C is the set of maximal cliques, c is a maximal clique, x_c is the domain of x restricted to c, and ψ_c(x_c) is an arbitrary non-negative real-valued function. Z ensures Σ_x p(x) = 1.

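A minimal sketch of this definition for a three-node chain MRF with maximal cliques {1,2} and {2,3} (the potential tables are invented): the unnormalized product over cliques is turned into a distribution by a brute-force partition function Z.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
K = 3                                    # states per variable
psi_12 = rng.random((K, K)) + 0.1        # arbitrary non-negative clique potentials
psi_23 = rng.random((K, K)) + 0.1

def unnormalized(x1, x2, x3):
    return psi_12[x1, x2] * psi_23[x2, x3]

# Partition function by brute-force enumeration (only feasible for tiny models)
Z = sum(unnormalized(*x) for x in itertools.product(range(K), repeat=3))

def p(x1, x2, x3):
    return unnormalized(x1, x2, x3) / Z

total = sum(p(*x) for x in itertools.product(range(K), repeat=3))
print(Z, total)   # total is 1.0 up to floating point error
```
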
66 CI ⇔ Factorization for positive MRFs. Theorem (Factorization ⇒ CI): if a probability distribution factorizes according to an undirected graph, and if A, B and C are disjoint subsets of nodes such that C separates A from B in the graph, then the distribution satisfies A ⊥ B | C. Theorem (CI ⇒ Factorization, Hammersley-Clifford): if a strictly positive probability distribution (p(x) > 0) satisfies the conditional independence statements implied by graph separation over a particular undirected graph, then it also factorizes according to the graph.

67 Factorization ⇒ CI for MRFs. Proof: ...

68 CI ⇒ Factorization for +MRFs (H-C Theorem). Möbius inversion: for C ⊆ B ⊆ A ⊆ S and F : P(S) → ℝ, F(A) = Σ_{B ⊆ A} Σ_{C ⊆ B} (−1)^{|B \ C|} F(C). Define F(x) = φ(x) = log p(x) and compute the inner sum for the case where B is not a clique, i.e. X_i, X_j ∈ B not connected. Then the CI statement gives φ(x_i, x_C, x_j) + φ(x_C) = φ(x_C, x_i) + φ(x_C, x_j), and grouping the subsets C ⊆ B that contain neither X_i nor X_j, Σ_{C ⊆ B} (−1)^{|B \ C|} φ(x_C) = Σ_{C ⊆ B; X_i, X_j ∉ C} (−1)^{|B \ C|} [φ(x_i, x_C, x_j) + φ(x_C) − φ(x_C, x_i) − φ(x_C, x_j)] = 0. So only terms indexed by cliques survive, log p(x) is a sum of functions over cliques, and p(x) factorizes according to the graph.

69 Relevance of CI ⇒ Factorization for MRFs. The relevance is analogous to the BN case: CI statements are usually what is known by the expert; the expert needs the model p(x) in order to compute things; the CI ⇒ Factorization part gives p(x) from what is known (the CI statements).

70 Comparison: BNs vs. MRFs. In both types of Graphical Models, a relationship between the CI statements satisfied by a distribution and the associated simplified algebraic structure of the distribution is made in terms of graphical objects. The CI statements are related to concepts of separation between variables in the graph. The simplified algebraic structure (factorization of p(x) in this case) is related to local pieces of the graph: a child plus its parents in BNs, cliques in MRFs.

71 Comparison: BNs vs. MRFs. Differences: the set of probability distributions that can be represented as MRFs is different from the set that can be represented as BNs. Although both MRFs and BNs are expressed as a factorization of local functions on the graph, the MRF has a normalization constant Z = Σ_x Π_{c ∈ C} ψ_c(x_c) that couples all factors, whereas the BN has not. The local pieces of the BN are probability distributions themselves, whereas in MRFs they need only be non-negative functions, i.e. they may not have range [0, 1] as probabilities do.

72 Comparison: BNs vs. MRFs. Exercises: when are the CI statements of a BN and an MRF precisely the same? A graph has 3 nodes A, B and C; we know that A ⊥ B holds, but neither C ⊥ A nor C ⊥ B holds. Can this be represented by a BN? By an MRF?

73 I-Maps, D-Maps and P-Maps. A graph is said to be a D-map (for "dependence map") of a distribution if every conditional independence statement satisfied by the distribution is reflected in the graph. A graph is said to be an I-map (for "independence map") of a distribution if every conditional independence statement implied by the graph is satisfied in the distribution. A graph is said to be a P-map (for "perfect map") of a distribution if it is both a D-map and an I-map for the distribution.

74 I-Maps, D-Maps and P-Maps. [Venn diagram of D and U inside P] D: the set of distributions on n variables that can be represented as a perfect map by a DAG. U: the set of distributions on n variables that can be represented as a perfect map by an undirected graph. P: the set of all distributions on n variables.

75 Markov Blankets. The Markov blanket of a node X_i, in either a BN or an MRF, is the smallest set of nodes A such that p(x_i | x_ĩ) = p(x_i | x_A). BN: the parents, children and co-parents of the node. MRF: the neighbors of the node.

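A minimal sketch of reading off the Markov blanket of a node in a BN as its parents, children and co-parents (the DAG used is my own illustrative example, not from the slides):

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a BN: parents, children, and co-parents."""
    children = [v for v, ps in parents.items() if node in ps]
    co_parents = {p for c in children for p in parents[c] if p != node}
    return set(parents[node]) | set(children) | co_parents

# Hypothetical DAG: A -> C <- B, C -> D
g = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
print(markov_blanket(g, "A"))   # {'C', 'B'}: child C and co-parent B
```
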
76 Exercises. Show that the Markov blanket of a node i in a BN is given by its children, parents and co-parents. Show that the Markov blanket of a node i in an MRF is given by its neighbors.

77 Graphical Models, Lecture 5 - Inference. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

78 Factorized Distributions. Our p(x) has a factorized form: for BNs we have p(x) = Π_i p(x_i | x_{a_i}); for MRFs we have p(x) = (1/Z) Π_{c ∈ C} ψ_c(x_c). Will this enable us to answer the relevant questions in practice?

79 Key Concept: Distributive Law. ab + ac (3 operations) = a(b + c) (2 operations): if the same constant factor (a here) is present in every term, we can gain by pulling it out. Consider computing the marginal p(x_1) for the MRF with factorization p(x) = (1/Z) Π_{i=1}^{N−1} ψ(x_i, x_{i+1}) (Exercise: which graph is this?): p(x_1) = Σ_{x_2, ..., x_N} (1/Z) Π_{i=1}^{N−1} ψ(x_i, x_{i+1}) = (1/Z) Σ_{x_2} ψ(x_1, x_2) Σ_{x_3} ψ(x_2, x_3) ... Σ_{x_N} ψ(x_{N−1}, x_N). The cost is O(Π_{i=1}^N |X_i|) for the naive sum vs. O(Σ_{i=1}^{N−1} |X_i| |X_{i+1}|) when the sums are pushed inside.

80 Elimination Algorithm. The distributive law (DL) is the key to efficient inference in GMs. The simplest algorithm using the DL is the Elimination Algorithm. This algorithm is appropriate when we have a single query, just like in the previous example of computing p(x_1) in a given MRF. The algorithm can be seen as successive elimination of nodes in the graph.

81 Elimination Algorithm. Worked example on the graph over X_1, ..., X_6 [figure repeated at each step]: compute p(x_1) with a chosen elimination order; each elimination pushes one sum inside the product, creating an intermediate factor ("message") m_i over the neighbours of the eliminated node (m_6, then m_5, m_4, ...), until only a function of x_1 remains, which is normalized by Z to give p(x_1).

82 Belief Propagation. The Belief Propagation algorithm (also called Probability Propagation or the Sum-Product algorithm): does not repeat computations, and is specifically targeted at tree-structured graphs.

83 Belief Propagation in a Chain. [figure: chain x_1, ..., x_{n−1}, x_n, x_{n+1}, ..., x_N with forward messages µ_α and backward messages µ_β] p(x_n) = Σ_{x_{<n}} Σ_{x_{>n}} p(x) = (1/Z) [Σ_{x_{<n}} Π_{i=1}^{n−1} ψ(x_i, x_{i+1})] [Σ_{x_{>n}} Π_{i=n}^{N−1} ψ(x_i, x_{i+1})] = (1/Z) µ_α(x_n) µ_β(x_n), where computing µ_α(x_n) is O(Σ_{i=1}^{n−1} |X_i| |X_{i+1}|) and computing µ_β(x_n) is O(Σ_{i=n}^{N−1} |X_i| |X_{i+1}|).

84 Belief Propagation in a Chain. So in order to compute p(x_n) we only need the incoming messages to n. But n is arbitrary, so in order to answer an arbitrary query we need an arbitrary pair of incoming messages; so we need all messages. To compute a message to the right (left) we need all previous messages coming from the left (right). So the protocol should be: start from the leaves, pass messages up to n, then go back towards the leaves. For a chain with N nodes, 2(N − 1) messages have to be computed.

85 Computing Messages. Define trivial messages m_{0→1}(x_1) := 1 and m_{N+1→N}(x_N) := 1. For i = 1 to N − 1 compute m_{i→i+1}(x_{i+1}) = Σ_{x_i} ψ(x_i, x_{i+1}) m_{i−1→i}(x_i). For i = N back to 2 compute m_{i→i−1}(x_{i−1}) = Σ_{x_i} ψ(x_{i−1}, x_i) m_{i+1→i}(x_i).

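A minimal sketch of this two-pass message schedule for a chain MRF with pairwise potentials ψ(x_i, x_{i+1}) (random tables, purely for illustration): after the forward and backward passes, every node marginal is the normalized product of its two incoming messages, and a brute-force check confirms it.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 5, 3
psi = [rng.random((K, K)) + 0.1 for _ in range(N - 1)]   # psi[i] couples x_i and x_{i+1}

# Forward messages: fwd[i] is the message arriving at node i from the left
fwd = [np.ones(K) for _ in range(N)]
for i in range(1, N):
    fwd[i] = fwd[i - 1] @ psi[i - 1]          # sum_{x_{i-1}} fwd(x_{i-1}) psi(x_{i-1}, x_i)

# Backward messages: bwd[i] is the message arriving at node i from the right
bwd = [np.ones(K) for _ in range(N)]
for i in range(N - 2, -1, -1):
    bwd[i] = psi[i] @ bwd[i + 1]              # sum_{x_{i+1}} psi(x_i, x_{i+1}) bwd(x_{i+1})

marginals = [fwd[i] * bwd[i] / np.sum(fwd[i] * bwd[i]) for i in range(N)]

# Check against brute-force enumeration of the joint
joint = np.ones([K] * N)
for i in range(N - 1):
    shape = [1] * N
    shape[i] = shape[i + 1] = K
    joint = joint * psi[i].reshape(shape)
joint /= joint.sum()
brute = [joint.sum(axis=tuple(j for j in range(N) if j != i)) for i in range(N)]
print(all(np.allclose(m, b) for m, b in zip(marginals, brute)))   # True
```
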
86 Belief Propagation in a Tree. The reason why things are so nice in the chain is that every node can be seen as a leaf after it has received the message from one side, i.e. after the nodes from which the message comes have been eliminated. The original leaves give us the right place to start the computations, and from there the adjacent nodes become leaves as well. However, this property also holds in a tree.

87 Belief Propagation in a Tree. Message passing equation: m_{j→i}(x_i) = Σ_{x_j} ψ(x_j, x_i) Π_{k : k∼j, k≠i} m_{k→j}(x_j), with Π_{k : k∼j, k≠i} m_{k→j}(x_j) := 1 whenever j is a leaf. Computing marginals: p(x_i) ∝ Π_{j : j∼i} m_{j→i}(x_i).

88 Max-Product Algorithm. There are important queries other than computing marginals. For example, we may want to compute the most likely assignment x* = argmax_x p(x), as well as its probability. One possibility would be to compute p(x_i) for all i, then x̂_i = argmax_{x_i} p(x_i), and then simply set x̂ = (x̂_1, ..., x̂_N). What's the problem with this?

89 Exercises. Exercise: construct p(x_1, x_2) with x_1, x_2 ∈ {0, 1} such that p(x̂_1, x̂_2) = 0, where x̂_i = argmax_{x_i} p(x_i).

90 Max-Product and Max-Sum Algorithms. Instead, we need to compute directly x* = argmax_{x_1, ..., x_N} p(x_1, ..., x_N). We can use the distributive law again, since max(ab, ac) = a max(b, c) for a > 0. Exactly the same algorithm applies here with max instead of Σ: the max-product algorithm. To avoid underflow we compute via the log: argmax_x p(x) = argmax_x log p(x) = argmax_x Σ_s log f_s(x_s), since log is a monotonic function. We can still use the distributive law since (max, +) is also a commutative semiring, i.e. max(a + b, a + c) = a + max(b, c).

91 A Detail in Max-Sum. After computing the max-marginal for the root x_i, p^max(x_i) (the product of the incoming max-messages at i), and its maximizer x̂_i = argmax_{x_i} p^max(x_i), it is not a good idea simply to pass the messages back to the leaves and then terminate. Why? (If there are ties, maximizing each variable separately can yield a globally inconsistent configuration.) In such cases it is safer to store the maximizing configurations of previous variables with respect to the next variables, and then simply backtrack to recover the maximizing path. In the particular case of a chain this is called the Viterbi algorithm, an instance of dynamic programming.

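A minimal sketch of max-sum with backtracking on the same kind of chain (log-potentials invented): the forward pass stores, for each value of the next variable, the maximizing value of the previous one; backtracking then recovers one jointly optimal configuration.

```python
import numpy as np

rng = np.random.default_rng(4)
N, K = 5, 3
log_psi = [np.log(rng.random((K, K)) + 0.1) for _ in range(N - 1)]

# Forward pass: score[x_i] = max over x_1..x_{i-1} of the accumulated log-potentials
score = np.zeros(K)
backptr = []                               # backptr[i][x_{i+2}] = best value of x_{i+1}
for i in range(N - 1):
    cand = score[:, None] + log_psi[i]     # cand[x_i, x_{i+1}]
    backptr.append(cand.argmax(axis=0))
    score = cand.max(axis=0)

# Backtracking: recover one maximizing configuration
x = [int(score.argmax())]
for bp in reversed(backptr):
    x.append(int(bp[x[-1]]))
x.reverse()

print(x, float(score.max()))               # MAP assignment and its (unnormalized) log-score
```
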
92 Arbitrary Graphs. [figure: directed graph on X_1, ..., X_6] The elimination algorithm is needed to compute marginals.

93 A Problem with the Elimination Algorithm. Elimination example: how do we compute the conditional distribution of a node given observed evidence x̄_6? [figure: the six-node graph with X_6 observed]

94 A Problem with the Elimination Algorithm. Conditioning on the evidence x̄_6 (via an evidence indicator δ(x_6, x̄_6)) and summing out the remaining variables one at a time produces a sequence of intermediate messages m_i, exactly as in the single-query example.

95 A Problem with the Elimination Algorithm. The final messages give the joint of the query node with the evidence, and normalizing yields the desired conditional given x̄_6.

96 A Problem with the Elimination Algorithm. Elimination example: what if we now want the conditional of a different node given the same evidence x̄_6? [figure: the six-node graph again]

97 A Problem with the Elimination Algorithm. Running the elimination algorithm from scratch for the new query generates a new sequence of messages, most of which coincide with messages already computed for the first query.

98 A Problem with the Elimination Algorithm. Comparing the two elimination runs side by side: Repeated Computations!! How to avoid that?

99 The Junction Tree Algorithm. The Junction Tree Algorithm is a generalization of the belief propagation algorithm to arbitrary graphs. In theory it can be applied to any graph (DAG or undirected). However, it will be efficient only for certain classes of graphs.

100 Chordal Graphs. Chordal graphs are also called triangulated graphs; the JT algorithm runs on chordal graphs. A chord in a cycle is an edge connecting two nodes in the cycle which does not itself belong to the cycle, i.e. a shortcut in the cycle. A graph is chordal if every cycle of length greater than 3 has a chord. Triangulation: the first step in the algorithm is to triangulate the graph by proper insertion of edges. [figure: a graph on A, B, C, D, E, F that is not chordal, and a chordal version obtained by adding edges]

101 Chordal Graphs. What if a graph is not chordal? Add edges until it becomes chordal. This will change the graph. Exercise: why is this not a problem?

102 Triangulation Step. Triangulate the graph if it is not triangulated. [figure: the six-node graph with the added triangulating edge]

103 Junction Tree Construction. The Junction Tree Algorithm: create a junction tree. [figure: the triangulated graph on X_1, ..., X_6 and the corresponding junction tree, whose nodes are maximal cliques connected through separator sets]

104 Initialization. The Junction Tree Algorithm: initialize clique potentials and separators. The clique potentials Ψ_C are directly introduced (from the model factors); the separator potentials Φ_S are initialized to ones. [figure: the junction tree annotated with clique potentials Ψ and separator potentials Φ = ones]

105 Propagation. The Junction Tree Algorithm: message passing. A message from clique V to a neighbouring clique W through their separator S updates Φ*_S(x_S) = Σ_{x_{V \ S}} Ψ_V(x_V) and Ψ*_W(x_W) = Ψ_W(x_W) Φ*_S(x_S) / Φ_S(x_S). Passing messages inwards to a root clique and then back out over the junction tree makes the clique and separator potentials consistent. [figure: the junction tree with the updated (starred) potentials after each message]

106 Graphical Models, Lecture 6 - Learning. Tibério Caetano, tiberiocaetano.com. Statistical Machine Learning Group, NICTA Canberra. LLSS Canberra 2009.

107 Learning. We saw that, given p(x; θ), probabilistic inference consists of computing marginals of p(x; θ), conditional distributions, MAP configurations, etc. However, what is p(x; θ) in the first place? Finding p(x; θ) from data is called Learning, or Estimation, or Statistical Inference.

108 Maximum Likelihood Estimation. In the case of Graphical Models we've seen that p(x; θ) = (1/Z) Π_{s ∈ S} f_s(x_s; θ_s), where the {x_s} are subsets of random variables and the {f_s} are non-negative real-valued functions. We can re-write that as p(x; θ) = exp(Σ_{s ∈ S} log f_s(x_s; θ_s) − g(θ)), where g(θ) = log Σ_x exp(Σ_{s ∈ S} log f_s(x_s; θ_s)).

109 Data: IID assumption. We observe data X = {x^1, ..., x^m}. We assume every x^i is a sample from the same unknown distribution p(x; θ*) (the "identical" assumption), and we assume x^i and x^j, i ≠ j, to be drawn independently from p(x; θ*) (the "independence" assumption). This is the iid setting (independently and identically distributed).

110 Maximum Likelihood Estimation. The joint probability of observing the data X is thus p(X; θ) = Π_i p(x^i; θ) = Π_i exp(Σ_s log f_s(x^i_s; θ_s) − g(θ)). Seen as a function of θ, this is the likelihood function. The negative log-likelihood is −log p(X; θ) = m g(θ) − Σ_{i=1}^m Σ_s log f_s(x^i_s; θ_s).

111 Maximum Likelihood Estimation. Maximum likelihood estimation consists of finding the θ that maximizes the likelihood function, or minimizes the negative log-likelihood: θ̂ = argmin_θ [m g(θ) − Σ_{i=1}^m Σ_s log f_s(x^i_s; θ_s)] =: argmin_θ l(θ; X). In order to minimize it we must have ∇_θ l(θ; X) = 0, and therefore ∇_{θ_s} l(θ; X) = 0 ∀s. What happens for BNs and for MRFs?

112 ML Estimation in BNs. For BNs, g(θ) = 0 (Exercise) and Σ_{x_s} f_s(x_s; θ_s) = 1 ∀s, so setting to zero the gradient of the Lagrangian, ∇_{θ_s} [m g(θ) − Σ_{i=1}^m Σ_s log f_s(x^i_s; θ_s) + Σ_s λ_s (Σ_{x_s} f_s(x_s; θ_s) − 1)] = 0, gives Σ_{i=1}^m ∇_{θ_s} log f_s(x^i_s; θ_s) = λ_s Σ_{x_s} ∇_{θ_s} f_s(x_s; θ_s) ∀s. Therefore we have |S| pairs of equations, and every pair can be solved independently for θ_s and λ_s. So the ML estimation problem decouples into local ML estimation problems involving only the variables in each individual set s.

113 ML Estimation in MRFs. For MRFs, g(θ) ≠ 0, so ∇_{θ_s} [m g(θ) − Σ_{i=1}^m Σ_s log f_s(x^i_s; θ_s)] = 0 gives m ∇_{θ_s} g(θ) − Σ_{i=1}^m ∇_{θ_s} log f_s(x^i_s; θ_s) = 0 ∀s. Therefore we have |S| equations that cannot be solved independently, since g(θ) involves all the θ_s. This may give rise to a complex non-linear system of equations. So learning in MRFs is more difficult than in BNs.

114 Exponential Families. Consider the parameterized family of distributions p(x; θ) = exp(⟨Φ(x), θ⟩ − g(θ)). Such a family of distributions is called an Exponential Family. Φ(x) is the sufficient statistics, θ is the natural parameter, and g(θ) = log Σ_x exp(⟨Φ(x), θ⟩) is the log-partition function. This is the form of several distributions of interest, like the Gaussian, binomial, multinomial, Poisson, gamma, Rayleigh, beta, etc.

115 Exponential Families. If we assume that our p(x; θ) is an exponential family, the learning problem becomes particularly convenient because it becomes convex. Why? Recall the form of p(x; θ) for a graphical model: p(x; θ) = exp(Σ_{s ∈ S} log f_s(x_s; θ_s) − g(θ)). For it to be an exponential family we need Σ_{s ∈ S} log f_s(x_s; θ_s) = ⟨Φ(x), θ⟩ = Σ_s ⟨Φ_s(x_s), θ_s⟩.

116 Exponential Families for MRFs. For MRFs the negative log-likelihood now becomes −log p(X; θ) = m g(θ) − Σ_{i=1}^m Σ_s ⟨Φ_s(x^i_s), θ_s⟩ = m g(θ) − m Σ_s ⟨µ_s, θ_s⟩, where we defined µ_s := Σ_{i=1}^m Φ_s(x^i_s) / m. Taking the gradient and setting it to zero: ∇_{θ_s} (m g(θ) − m Σ_s ⟨µ_s, θ_s⟩) = 0 ⇒ ∇_{θ_s} g(θ) = µ_s; but ∇_{θ_s} g(θ) = E_{p(x; θ)}[Φ_s(x_s)] (Why?), so E_{p(x; θ)}[Φ_s(x_s)] = µ_s.

117 Exponential Families for MRFs. In other words: the ML estimate θ̂ must be such that the expected value of the sufficient statistics under p(x; θ̂), for every clique, matches the sample average for that clique. Why is the problem convex? (Exercise.)

118 Exponential Families for BNs. For BNs the negative log-likelihood now becomes −log p(X; θ) = −Σ_{i=1}^m Σ_s ⟨Φ_s(x^i_s), θ_s⟩ = −m Σ_s ⟨µ_s, θ_s⟩, where again µ_s := Σ_{i=1}^m Φ_s(x^i_s) / m. Constructing the Lagrangian corresponding to the constraints Σ_{x_s} exp(⟨Φ_s(x_s), θ_s⟩) = 1 ∀s and setting the gradient to zero, we get m µ_s = λ_s E_{p(x_s; θ_s)}[Φ_s(x_s)] ∀s, which can be solved for θ_s and λ_s using Σ_{x_s} exp(⟨Φ_s(x_s), θ_s⟩) = 1.

119 Example: Discrete BNs. Multinomial random variables, tabular representation: for each configuration (x_v, x_{a_v}) define φ_v(x) := (x_v, x_{a_v}), with one parameter θ_v associated to each (x_v, x_{a_v}), i.e. θ_v(φ_v(x)) := p(x_v | x_{a_v}; θ_v). Note that there are no constraints beyond the normalization constraint. The joint is p(x_V | θ) = Π_v θ_v(φ_v(x)). The likelihood is then log p(X; θ) = log Π_n p(x_{V,n} | θ). (continuing...)

120 Example: Discrete BNs. log p(X; θ) = log Π_n p(x_{V,n} | θ) = log Π_n Π_{x_V} p(x_V; θ)^{δ(x_V, x_{V,n})} = Σ_n Σ_{x_V} δ(x_V, x_{V,n}) log p(x_V; θ) = Σ_{x_V} m(x_V) log p(x_V; θ) = Σ_{x_V} m(x_V) log Π_v θ_v(φ_v(x)) = Σ_{x_V} m(x_V) Σ_v log θ_v(φ_v(x)) = Σ_v Σ_{φ_v(x)} Σ_{x_{V \ φ_v(x)}} m(x_V) log θ_v(φ_v(x)) = Σ_v Σ_{φ_v(x)} m(φ_v(x)) log θ_v(φ_v(x)), where m(·) counts how many times a configuration occurs in the data.

121 Example: Discrete BNs. The Lagrangian is L(θ, λ) = Σ_v Σ_{φ_v(x)} m(φ_v(x)) log θ_v(φ_v(x)) + Σ_v λ_v (1 − Σ_{x_v} θ_v(φ_v(x))), and ∂L(θ, λ)/∂θ_v(φ_v(x)) = m(φ_v(x)) / θ_v(φ_v(x)) − λ_v = 0 ⇒ λ_v = m(φ_v(x)) / θ_v(φ_v(x)); but since Σ_{x_v} θ_v(φ_v(x)) = 1, λ_v = m(x_{a_v}), so θ̂_v(φ_v(x)) = m(x_v, x_{a_v}) / m(x_{a_v}). Matches intuition.

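The closed-form result θ̂_v(x_v | x_{a_v}) = m(x_v, x_{a_v}) / m(x_{a_v}) is just a ratio of counts. A minimal sketch for the hypothetical network A → C ← B used earlier, estimating P(C | A, B) from a toy data set of complete observations (data invented):

```python
from collections import Counter

# Toy complete-data observations for the hypothetical BN A -> C <- B
data = [
    {"A": 1, "B": 0, "C": 1},
    {"A": 1, "B": 0, "C": 0},
    {"A": 1, "B": 0, "C": 1},
    {"A": 0, "B": 1, "C": 1},
    {"A": 0, "B": 1, "C": 0},
    {"A": 0, "B": 0, "C": 0},
]

def mle_cpt(data, node, parents):
    """theta_hat(x_v | x_parents) = count(x_v, x_parents) / count(x_parents)."""
    joint_counts = Counter((tuple(d[p] for p in parents), d[node]) for d in data)
    parent_counts = Counter(tuple(d[p] for p in parents) for d in data)
    return {(pa, v): c / parent_counts[pa] for (pa, v), c in joint_counts.items()}

print(mle_cpt(data, "C", ["A", "B"]))
# e.g. P(C=1 | A=1, B=0) = 2/3, the count ratio m(x_v, x_{a_v}) / m(x_{a_v})
```
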
122 Learning the Potentials: ML Estimation. First: how to learn the potential functions when we have observed data for all variables in the model? Second: how to learn the potential functions when there are latent (hidden) variables, i.e. we do not observe data for them?

123 Learning the Potentials: Completely Observed GMs. Consider the chain MRF p(x_V) = (1/Z) Ψ(x_1, x_2) Ψ(x_2, x_3) [figure: X_1 – X_2 – X_3]. Assume we observe N instances of this model. For IID sampling the sufficient statistics are the empirical marginals p̃(x_1, x_2) and p̃(x_2, x_3). How do we estimate Ψ(x_1, x_2) and Ψ(x_2, x_3) from the sufficient statistics?

124 Learning the Potentials: Completely Observed GMs. Let's make a guess: Ψ̂_ML(x_1, x_2) = p̃(x_1, x_2) and Ψ̂_ML(x_2, x_3) = p̃(x_2, x_3) / p̃(x_2), so that p̂_ML(x) = p̃(x_1, x_2) p̃(x_2, x_3) / p̃(x_2). We can verify that our guess is good because the resulting model marginals reproduce the empirical ones: p̂_ML(x_1, x_2) = p̃(x_1, x_2) and p̂_ML(x_2, x_3) = p̃(x_2, x_3).

125 Learning the Potentials: Completely Observed GMs. The general recipe is: for every maximal clique C, set the clique potential to its empirical marginal; for every intersection S between maximal cliques, associate an empirical marginal with that intersection and divide it into the potential of ONE of the cliques that form the intersection. This will give ML estimates for decomposable Graphical Models.

126 Decomposable Graphs. A cycle is chordless if nodes that are not adjacent in the cycle are not neighbors, that is, any {v_a, v_b} with a, b non-consecutive in the cycle is not in E: there is no chord or shortcut for the cycle. A graph is chordal (also called triangulated) if it contains no chordless cycles of length greater than 3. A graph is complete if E contains all pairs of distinct elements of V. A graph G = (V, E) is decomposable if either (1) G is complete, or (2) we can express V as V = A ∪ B ∪ C where (a) A, B and C are disjoint, (b) A and C are non-empty, (c) B is complete, (d) B separates A and C in G, and (e) A ∪ B and B ∪ C are decomposable. A path is a sequence v_1, ..., v_n of distinct vertices for which all {v_i, v_{i+1}} ∈ E. A tree is an undirected graph in which every pair of vertices is connected by precisely one path. A clique of a graph G is a subset of vertices that are completely connected; a maximal clique of G is a clique for which no superset of vertices of G is a clique. A clique tree for a graph G = (V, E) is a tree T = (V_T, E_T) where V_T is a set of cliques of G that contains all maximal cliques of G. For decomposable graphs the derivative of the log-partition function g(θ) decouples over the cliques (Exercise), which makes MRF learning easy.

127 Learning the Potentials: Completely Observed GMs. Non-decomposable Graphical Models: an iterative procedure must be used. Iterative Proportional Fitting (IPF): Ψ_C^{(t+1)}(x_C) = Ψ_C^{(t)}(x_C) · p̃(x_C) / p^{(t)}(x_C), cycling through the cliques C. It can be shown that after updating clique C, p^{(t+1)}(x_C) = p̃(x_C).

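A minimal sketch of the IPF update on a small non-decomposable model (a 3-cycle with pairwise cliques; all tables and the "true" target joint are invented for illustration): each step rescales one clique potential by the ratio of the empirical marginal to the current model marginal, and after enough sweeps the fitted clique marginals match the empirical ones.

```python
import numpy as np

rng = np.random.default_rng(5)
K = 2
cliques = [(0, 1), (1, 2), (0, 2)]                 # pairwise cliques of a 3-cycle
psi = {c: np.ones((K, K)) for c in cliques}        # initialize potentials to 1

# "Empirical" clique marginals, generated here from a random joint for illustration
true_joint = rng.random((K, K, K))
true_joint /= true_joint.sum()

def marginal(joint, c):
    other = tuple(i for i in range(3) if i not in c)
    return joint.sum(axis=other)

p_tilde = {c: marginal(true_joint, c) for c in cliques}

def model_joint(psi):
    joint = np.ones((K, K, K))
    for (i, j), table in psi.items():
        shape = [1, 1, 1]
        shape[i] = shape[j] = K
        joint = joint * table.reshape(shape)
    return joint / joint.sum()

# IPF: cycle through cliques, rescaling each potential by empirical / current marginal
for sweep in range(50):
    for c in cliques:
        current = marginal(model_joint(psi), c)
        psi[c] = psi[c] * p_tilde[c] / current

fitted = model_joint(psi)
print(max(np.abs(marginal(fitted, c) - p_tilde[c]).max() for c in cliques))  # small after enough sweeps
```
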
128 EM Algorithm. Hidden variables: the EM algorithm. How do we estimate the potentials when there are unobserved variables? [figure: a model over X_1, X_2, X_3 with an unobserved node] Answer: the EM algorithm.

129 EM Algorithm. Denote the observed variables by X and the hidden variables by Z. [figure: the model with X observed and Z hidden] If we knew Z, the problem would reduce to maximizing the complete log-likelihood: l_c(θ; x, z) = log p(x, z | θ).

130 EM Algorithm. However, we don't observe Z, so the probability of the data X is l(θ; x) = log p(x | θ) = log Σ_z p(x, z | θ), which is the incomplete log-likelihood. This is the quantity we really want to maximize. Note that now the logarithm cannot transform the product into a sum, since it is blocked by the sum over Z, and the optimization does not decouple.

131 EM Algorithm. The basic idea of the EM algorithm is: given that Z is not observed, we may try to optimize an averaged version (over all possible values of Z) of the complete log-likelihood. We do that through an averaging distribution q: ⟨l_c(θ; x, z)⟩_q = Σ_z q(z | x) log p(x, z | θ), the expected complete log-likelihood. The hope then is that maximizing this should at least improve the current estimate of the parameters, so that iterating would eventually maximize the log-likelihood.

132 EM Algorithm. In order to present the algorithm we first note that l(θ; x) = log p(x | θ) = log Σ_z p(x, z | θ) = log Σ_z q(z | x) p(x, z | θ) / q(z | x) ≥ Σ_z q(z | x) log [p(x, z | θ) / q(z | x)] =: L(q, θ), where L is the auxiliary function (a lower bound on the log-likelihood, by Jensen's inequality). The EM algorithm is coordinate ascent on L.

133 EM Algorithm. The EM algorithm: E-step: q^{(t+1)} = argmax_q L(q, θ^{(t)}); M-step: θ^{(t+1)} = argmax_θ L(q^{(t+1)}, θ).

134 EM Algorithm. Note that the M step is equivalent to maximizing the expected complete log-likelihood: L(q, θ) = Σ_z q(z | x) log [p(x, z | θ) / q(z | x)] = Σ_z q(z | x) log p(x, z | θ) − Σ_z q(z | x) log q(z | x) = ⟨l_c(θ; x, z)⟩_q − Σ_z q(z | x) log q(z | x), because the second term does not depend on θ.

135 EM Algorithm. The general solution to the E step turns out to be q^{(t+1)}(z | x) = p(z | x, θ^{(t)}), because L(p(z | x, θ^{(t)}), θ^{(t)}) = Σ_z p(z | x, θ^{(t)}) log [p(x, z | θ^{(t)}) / p(z | x, θ^{(t)})] = Σ_z p(z | x, θ^{(t)}) log p(x | θ^{(t)}) = log p(x | θ^{(t)}) = l(θ^{(t)}; x), i.e. this choice of q makes the lower bound tight at θ^{(t)}.

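To make the E and M steps concrete, here is a minimal EM sketch for a simple hidden-variable model that is not from the slides: a mixture of two Bernoulli coins, where Z is the hidden coin identity and X the observed outcome. The E step computes q(z | x) = p(z | x, θ^(t)) (the responsibilities); the M step maximizes the expected complete log-likelihood, which here has a closed form as weighted counts.

```python
import numpy as np

rng = np.random.default_rng(6)

# Generate toy data from a hypothetical 2-component Bernoulli mixture
true_pi, true_p = 0.4, np.array([0.2, 0.85])
z_true = rng.random(500) < true_pi                 # hidden component indicator
x = (rng.random(500) < np.where(z_true, true_p[1], true_p[0])).astype(float)

# EM for theta = (pi, p0, p1)
pi, p = 0.5, np.array([0.3, 0.7])                  # initial guess
for _ in range(200):
    # E step: q(z=1 | x) = p(z=1) p(x|z=1) / p(x)   (posterior responsibilities)
    lik1 = pi * p[1] ** x * (1 - p[1]) ** (1 - x)
    lik0 = (1 - pi) * p[0] ** x * (1 - p[0]) ** (1 - x)
    r = lik1 / (lik0 + lik1)
    # M step: maximize the expected complete log-likelihood (weighted counts)
    pi = r.mean()
    p = np.array([np.sum((1 - r) * x) / np.sum(1 - r),
                  np.sum(r * x) / np.sum(r)])

print(pi, p)   # should land near the generating parameters (up to label swap)
```
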

More information

Notes on Markov Networks

Notes on Markov Networks Notes on Markov Networks Lili Mou moull12@sei.pku.edu.cn December, 2014 This note covers basic topics in Markov networks. We mainly talk about the formal definition, Gibbs sampling for inference, and maximum

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector

More information

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit Chater 5 Statistical Inference 69 CHAPTER 5 STATISTICAL INFERENCE.0 Hyothesis Testing.0 Decision Errors 3.0 How a Hyothesis is Tested 4.0 Test for Goodness of Fit 5.0 Inferences about Two Means It ain't

More information

MAP Estimation Algorithms in Computer Vision - Part II

MAP Estimation Algorithms in Computer Vision - Part II MAP Estimation Algorithms in Comuter Vision - Part II M. Pawan Kumar, University of Oford Pushmeet Kohli, Microsoft Research Eamle: Image Segmentation E() = c i i + c ij i (1- j ) i i,j E: {0,1} n R 0

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

Chapter 10. Supplemental Text Material

Chapter 10. Supplemental Text Material Chater 1. Sulemental Tet Material S1-1. The Covariance Matri of the Regression Coefficients In Section 1-3 of the tetbook, we show that the least squares estimator of β in the linear regression model y=

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

Chapter 16. Structured Probabilistic Models for Deep Learning

Chapter 16. Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 1 Chapter 16 Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 2 Structured Probabilistic Models way of using graphs to describe

More information

BERNOULLI TRIALS and RELATED PROBABILITY DISTRIBUTIONS

BERNOULLI TRIALS and RELATED PROBABILITY DISTRIBUTIONS BERNOULLI TRIALS and RELATED PROBABILITY DISTRIBUTIONS A BERNOULLI TRIALS Consider tossing a coin several times It is generally agreed that the following aly here ) Each time the coin is tossed there are

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 10 Undirected Models CS/CNS/EE 155 Andreas Krause Announcements Homework 2 due this Wednesday (Nov 4) in class Project milestones due next Monday (Nov 9) About half

More information

1 Undirected Graphical Models. 2 Markov Random Fields (MRFs)

1 Undirected Graphical Models. 2 Markov Random Fields (MRFs) Machine Learning (ML, F16) Lecture#07 (Thursday Nov. 3rd) Lecturer: Byron Boots Undirected Graphical Models 1 Undirected Graphical Models In the previous lecture, we discussed directed graphical models.

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform Statistics II Logistic Regression Çağrı Çöltekin Exam date & time: June 21, 10:00 13:00 (The same day/time lanned at the beginning of the semester) University of Groningen, Det of Information Science May

More information

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI ** Iranian Journal of Science & Technology, Transaction A, Vol 3, No A3 Printed in The Islamic Reublic of Iran, 26 Shiraz University Research Note REGRESSION ANALYSIS IN MARKOV HAIN * A Y ALAMUTI AND M R

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution Monte Carlo Studies Do not let yourself be intimidated by the material in this lecture This lecture involves more theory but is meant to imrove your understanding of: Samling distributions and tests of

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:

More information

An Introduction to Bayesian Machine Learning

An Introduction to Bayesian Machine Learning 1 An Introduction to Bayesian Machine Learning José Miguel Hernández-Lobato Department of Engineering, Cambridge University April 8, 2013 2 What is Machine Learning? The design of computational systems

More information

Lecture 6: Graphical Models

Lecture 6: Graphical Models Lecture 6: Graphical Models Kai-Wei Chang CS @ Uniersity of Virginia kw@kwchang.net Some slides are adapted from Viek Skirmar s course on Structured Prediction 1 So far We discussed sequence labeling tasks:

More information

An Introduction to Probabilistic Graphical Models

An Introduction to Probabilistic Graphical Models An Introduction to Probabilistic Graphical Models Cédric Archambeau Xerox Research Centre Europe cedric.archambeau@xrce.xerox.com Pascal Bootcamp Marseille, France, July 2010 Reference material D. Koller

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

Graphical models. Sunita Sarawagi IIT Bombay

Graphical models. Sunita Sarawagi IIT Bombay 1 Graphical models Sunita Sarawagi IIT Bombay http://www.cse.iitb.ac.in/~sunita 2 Probabilistic modeling Given: several variables: x 1,... x n, n is large. Task: build a joint distribution function Pr(x

More information

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete

More information

Lecture 21: Quantum Communication

Lecture 21: Quantum Communication CS 880: Quantum Information Processing 0/6/00 Lecture : Quantum Communication Instructor: Dieter van Melkebeek Scribe: Mark Wellons Last lecture, we introduced the EPR airs which we will use in this lecture

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

A Social Welfare Optimal Sequential Allocation Procedure

A Social Welfare Optimal Sequential Allocation Procedure A Social Welfare Otimal Sequential Allocation Procedure Thomas Kalinowsi Universität Rostoc, Germany Nina Narodytsa and Toby Walsh NICTA and UNSW, Australia May 2, 201 Abstract We consider a simle sequential

More information

CS Lecture 4. Markov Random Fields

CS Lecture 4. Markov Random Fields CS 6347 Lecture 4 Markov Random Fields Recap Announcements First homework is available on elearning Reminder: Office hours Tuesday from 10am-11am Last Time Bayesian networks Today Markov random fields

More information

Generating Swerling Random Sequences

Generating Swerling Random Sequences Generating Swerling Random Sequences Mark A. Richards January 1, 1999 Revised December 15, 8 1 BACKGROUND The Swerling models are models of the robability function (df) and time correlation roerties of

More information

Short course A vademecum of statistical pattern recognition techniques with applications to image and video analysis. Agenda

Short course A vademecum of statistical pattern recognition techniques with applications to image and video analysis. Agenda Short course A vademecum of statistical attern recognition techniques with alications to image and video analysis Lecture 6 The Kalman filter. Particle filters Massimo Piccardi University of Technology,

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

Based on slides by Richard Zemel

Based on slides by Richard Zemel CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we

More information

Chapter 7: Special Distributions

Chapter 7: Special Distributions This chater first resents some imortant distributions, and then develos the largesamle distribution theory which is crucial in estimation and statistical inference Discrete distributions The Bernoulli

More information

4 : Exact Inference: Variable Elimination

4 : Exact Inference: Variable Elimination 10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference

More information

Inference and Representation

Inference and Representation Inference and Representation David Sontag New York University Lecture 5, Sept. 30, 2014 David Sontag (NYU) Inference and Representation Lecture 5, Sept. 30, 2014 1 / 16 Today s lecture 1 Running-time of

More information

Random variables. Lecture 5 - Discrete Distributions. Discrete Probability distributions. Example - Discrete probability model

Random variables. Lecture 5 - Discrete Distributions. Discrete Probability distributions. Example - Discrete probability model Random Variables Random variables Lecture 5 - Discrete Distributions Sta02 / BME02 Colin Rundel Setember 8, 204 A random variable is a numeric uantity whose value deends on the outcome of a random event

More information

Statistical Learning

Statistical Learning Statistical Learning Lecture 5: Bayesian Networks and Graphical Models Mário A. T. Figueiredo Instituto Superior Técnico & Instituto de Telecomunicações University of Lisbon, Portugal May 2018 Mário A.

More information

Directed Graphical Models

Directed Graphical Models CS 2750: Machine Learning Directed Graphical Models Prof. Adriana Kovashka University of Pittsburgh March 28, 2017 Graphical Models If no assumption of independence is made, must estimate an exponential

More information

Lecture 8: Bayesian Networks

Lecture 8: Bayesian Networks Lecture 8: Bayesian Networks Bayesian Networks Inference in Bayesian Networks COMP-652 and ECSE 608, Lecture 8 - January 31, 2017 1 Bayes nets P(E) E=1 E=0 0.005 0.995 E B P(B) B=1 B=0 0.01 0.99 E=0 E=1

More information

CMSC 425: Lecture 4 Geometry and Geometric Programming

CMSC 425: Lecture 4 Geometry and Geometric Programming CMSC 425: Lecture 4 Geometry and Geometric Programming Geometry for Game Programming and Grahics: For the next few lectures, we will discuss some of the basic elements of geometry. There are many areas

More information

Network Configuration Control Via Connectivity Graph Processes

Network Configuration Control Via Connectivity Graph Processes Network Configuration Control Via Connectivity Grah Processes Abubakr Muhammad Deartment of Electrical and Systems Engineering University of Pennsylvania Philadelhia, PA 90 abubakr@seas.uenn.edu Magnus

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

Robust hamiltonicity of random directed graphs

Robust hamiltonicity of random directed graphs Robust hamiltonicity of random directed grahs Asaf Ferber Rajko Nenadov Andreas Noever asaf.ferber@inf.ethz.ch rnenadov@inf.ethz.ch anoever@inf.ethz.ch Ueli Peter ueter@inf.ethz.ch Nemanja Škorić nskoric@inf.ethz.ch

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Numerous alications in statistics, articularly in the fitting of linear models. Notation and conventions: Elements of a matrix A are denoted by a ij, where i indexes the rows and

More information

1 Gambler s Ruin Problem

1 Gambler s Ruin Problem Coyright c 2017 by Karl Sigman 1 Gambler s Ruin Problem Let N 2 be an integer and let 1 i N 1. Consider a gambler who starts with an initial fortune of $i and then on each successive gamble either wins

More information

4.1 Notation and probability review

4.1 Notation and probability review Directed and undirected graphical models Fall 2015 Lecture 4 October 21st Lecturer: Simon Lacoste-Julien Scribe: Jaime Roquero, JieYing Wu 4.1 Notation and probability review 4.1.1 Notations Let us recall

More information