Local Probability Models

Readings: K&F 3.4, 5.1~5.5
Local Probability Models
Lecture 3, Apr 4, 2011
CSE 515, Statistical Methods, Spring 2011
Instructor: Su-In Lee, University of Washington, Seattle

Outline
Last time: conditional parameterization; Bayesian networks; independencies in graphs.
Today: local independencies, d-separation, I-equivalence; from distributions to BN graphs; local probability models (CPDs): tabular, deterministic, context-specific, independence of causal influences, continuous variables.

I-equivalence between graphs
I(G) describes all conditional independencies implied by G. Different Bayesian networks can encode the same independencies. Example: the three graphs X → Z → Y, X ← Z ← Y, and X ← Z → Y all imply (X ⊥ Y | Z) and form equivalence class I, while the v-structure X → Z ← Y implies the marginal independence (X ⊥ Y) and forms equivalence class II.
Two BN graphs G1 and G2 are I-equivalent if I(G1) = I(G2).

I-equivalence between graphs (cont.)
If P factorizes over one graph in an I-equivalence class, then P factorizes over all other graphs in the same class, so P cannot distinguish one I-equivalent graph from another.
Implication for structure learning: we cannot identify the correct structure from within an equivalence class. We will revisit this later.
Test for I-equivalence: d-separation.

Test for I-equivalence
Necessary condition: same graph skeleton. Otherwise, one can find an active path in one graph but not the other. But this is not sufficient: v-structures matter.
Sufficient condition: same skeleton and same v-structures. But this is not necessary: complete graphs (no independencies; every two nodes are connected by some edge) can differ in their v-structures yet be I-equivalent.
Define a v-structure X → Z ← Y as an immorality if X and Y are not directly connected.
Necessary and sufficient: same skeleton and same set of immoralities (a code sketch of this test follows below).

Constructing graphs for P
Can we construct a graph for a distribution P? Any graph that is an I-map for P qualifies, but this is not so useful: complete graphs imply no independence assumptions, and thus are I-maps of any distribution.
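The skeleton-plus-immoralities test is mechanical enough to code directly. Below is a minimal sketch, assuming a DAG is given as a dict mapping each node to its parent set; the helper names (skeleton, immoralities, i_equivalent) are my own, not from the slides.

```python
from itertools import combinations

def skeleton(dag):
    """Undirected edge set of a DAG given as {node: set of parents}."""
    return {frozenset((x, p)) for x, ps in dag.items() for p in ps}

def immoralities(dag):
    """V-structures X -> Z <- Y whose endpoints X, Y are not adjacent."""
    skel = skeleton(dag)
    return {(frozenset((x, y)), z)
            for z, ps in dag.items()
            for x, y in combinations(sorted(ps), 2)
            if frozenset((x, y)) not in skel}

def i_equivalent(g1, g2):
    """G1, G2 are I-equivalent iff same skeleton and same immoralities."""
    return skeleton(g1) == skeleton(g2) and immoralities(g1) == immoralities(g2)

# X -> Z -> Y vs. X <- Z <- Y: same skeleton, no immoralities -> I-equivalent
chain1 = {'X': set(), 'Z': {'X'}, 'Y': {'Z'}}
chain2 = {'Y': set(), 'Z': {'Y'}, 'X': {'Z'}}
vstruct = {'X': set(), 'Y': set(), 'Z': {'X', 'Y'}}   # X -> Z <- Y
print(i_equivalent(chain1, chain2))   # True
print(i_equivalent(chain1, vstruct))  # False: {X, Y} -> Z is an immorality
```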

Minimal I-maps
A graph G is a minimal I-map for P if:
G is an I-map for P, and
removing any edge from G renders it no longer an I-map for P.
Example: if a given graph over X, Y, Z, W is a minimal I-map for P, then none of the four graphs obtained by removing one of its edges is an I-map for P.

Bayesian network definition revisited
A Bayesian network is a pair (G, P):
P factorizes over G, and
P is specified as the set of CPDs associated with G's nodes.
Additional requirement: G is a minimal I-map for P.

Constructing minimal I-maps
Reverse factorization theorem: if P(X1, ..., Xn) = ∏_{i=1}^{n} P(Xi | Pa(Xi)), then G is an I-map of P.
Algorithm for constructing a minimal I-map (sketched in code below)
Input: (1) a fixed ordering of nodes X1, ..., Xn; (2) the set of independencies that hold in P, denoted I.
For each i, select Pa(Xi) as a minimal subset of {X1, ..., Xi−1} such that (Xi ⊥ {X1, ..., Xi−1} − Pa(Xi) | Pa(Xi)) ∈ I.
(Outline of) proof of minimality: the result is an I-map, since the factorization above holds by construction; it is minimal, since by construction removing any edge destroys the factorization.

Non-uniqueness of minimal I-maps
Applying the same I-map construction with different orderings can lead to different structures. (Assume I(G) = I(P).)
The choice of ordering has drastic effects on the complexity of the minimal I-map graph; heuristic: use a causal order.
Different orderings make different independence assumptions (different skeletons; e.g., a marginal independence such as (A ⊥ B) may hold in one graph but not the other).
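The construction only needs an independence oracle for P. Here is a minimal sketch, assuming a hypothetical callback indep(x, rest, parents) that answers whether (x ⊥ rest | parents) holds in P; in practice the oracle could be d-separation in a known graph or a statistical test on data.

```python
from itertools import combinations

def minimal_i_map(order, indep):
    """Greedy minimal I-map for a fixed ordering.

    indep(x, rest, parents) must answer whether (x ⊥ rest | parents)
    holds in P.  For each X_i we take a smallest subset of predecessors
    that renders the remaining predecessors irrelevant; the full
    predecessor set always qualifies (rest is then empty).
    """
    parents = {}
    for i, x in enumerate(order):
        preds = order[:i]
        done = False
        for k in range(len(preds) + 1):       # smallest candidate sets first
            for cand in combinations(preds, k):
                rest = [p for p in preds if p not in cand]
                if indep(x, rest, set(cand)):
                    parents[x] = set(cand)
                    done = True
                    break
            if done:
                break
    return parents

# toy oracle for a distribution whose only independence is (Z ⊥ X | Y)
def indep(x, rest, parents):
    if not rest:
        return True
    return x == 'Z' and set(rest) == {'X'} and parents == {'Y'}

print(minimal_i_map(['X', 'Y', 'Z'], indep))
# {'X': set(), 'Y': {'X'}, 'Z': {'Y'}}  -- the chain X -> Y -> Z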

Perfect maps
G is a perfect map (P-map) for P if I(P) = I(G).
Does every distribution have a P-map? No:
Independencies may be encoded in the CPDs rather than the structure, e.g., a context-specific independence (X ⊥ Y | Z = z) that holds only for a particular value z.
Some sets of independencies cannot be represented by any BN structure. Example: the independencies in P are (A ⊥ C | B, D) and (B ⊥ D | A, C). In a candidate diamond graph (A → B, A → D, B → C ← D), (B ⊥ D | A, C) does not hold, while (B ⊥ D | A) holds in the graph but not in P.

Finding a perfect map
If P has a P-map, can we find it? Not uniquely, since I-equivalent graphs are indistinguishable; thus we represent the whole I-equivalence class and return it.
Recall the I-equivalence test: same skeleton and same set of immoralities. Finding a P-map:
Step I: find the skeleton.
Step II: find the immoralities.
Step III: direct the constrained edges.
Detailed algorithm: please read the textbook.

Outline
Last time: conditional parameterization; Bayesian networks; independencies in graphs; local independencies, d-separation, I-equivalence.
Today: from distributions to BN graphs; local probability models — conditional probability distributions (CPDs): table CPDs, deterministic CPDs, context-specific CPDs, independence of causal influences, continuous variables.

Table CPDs
One entry for each joint assignment of X and Pa(X); for each pa ∈ Val(Pa(X)): Σ_x P(x | pa) = 1.
Most general representation: represents every discrete CPD.
Example: P(I) with P(i0) = 0.7, P(i1) = 0.3; P(S | I) with P(s0 | i0) = 0.95, P(s1 | i0) = 0.05 and P(s0 | i1) = 0.2, P(s1 | i1) = 0.8.
Limitations:
Cannot model continuous random variables.
The number of parameters is exponential in |Pa(X)| — e.g., a table for P(S | I, D, C, H) needs a row for every joint assignment of I, D, C, H.
Cannot model large in-degree dependencies.
Ignores structure within the CPD.
How can we overcome these limitations?
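To make the exponential blow-up concrete, here is a minimal sketch of a table CPD for a binary child, assuming binary parents; table_cpd is a hypothetical helper, not from the slides.

```python
import itertools

def table_cpd(parents, probs):
    """Tabular CPD for a binary X: map each joint parent assignment
    to P(X=1 | pa).  One free parameter per row -> 2**len(parents)."""
    rows = list(itertools.product([0, 1], repeat=len(parents)))
    assert len(probs) == len(rows)
    return dict(zip(rows, probs))

# P(S | I) from the slide: P(s1 | i0) = 0.05, P(s1 | i1) = 0.8
p_s_given_i = table_cpd(['I'], [0.05, 0.8])
print(p_s_given_i[(1,)])   # 0.8
print(2 ** 10)             # a 10-parent binary table already needs 1024 rows
```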

Structured CPDs
Key idea: reduce the number of parameters by modeling P(X | Pa(X)) without explicitly determining all entries; we use constraints instead.
We lose expressive power (cannot represent every CPD). There are many variants depending on the constraints: deterministic, tree-structured, and rule-based CPDs.

Deterministic CPDs
There is a function f : Val(Pa(X)) → Val(X) such that
P(x | pa) = 1 if x = f(pa), and 0 otherwise.
Example functions: OR, XOR, AND, NAND; Z = X + Y (continuous variables).

Deterministic CPDs: example I
Deterministic CPDs induce additional conditional independencies. Example: if T is any deterministic function of T1 and T2, with S1 and S2 children of T, then (S1 ⊥ S2 | T1, T2): observing T1 and T2 determines T exactly.

Deterministic CPDs: example II
Example: if C is a deterministic OR of A and B, then (D ⊥ E | A, B): observing A and B determines C, even though C itself is unobserved.

Deterministic CPDs: example III
Example: if T is a deterministic OR of T1 and T2, then (S1 ⊥ S2 | T1 = true): once T1 is true, T is determined regardless of T2. This is a context-specific independence: it holds for one value of T1 but not the other.

Context-specific independencies
Let X, Y, Z be disjoint sets of random variables, let C be a set of variables, and let c ∈ Val(C).
X and Y are contextually independent given Z and c, denoted (X ⊥_c Y | Z, C = c), if
P(X | Y, Z, c) = P(X | Z, c) whenever P(Y, Z, c) > 0.
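The definition can be checked numerically on a small joint table. A minimal sketch (with Z empty, and contextually_independent a hypothetical helper): it tests P(x | y, c) = P(x | c) for every y with positive probability in the context c.

```python
import numpy as np

def contextually_independent(joint, c_value, tol=1e-9):
    """Check (X ⊥ Y | C = c) in a joint table joint[x, y, c] by testing
    P(x | y, c) = P(x | c) for all x, y with P(y, c) > 0."""
    slice_c = joint[:, :, c_value]                    # unnormalized P(x, y, c)
    p_x_given_c = slice_c.sum(axis=1) / slice_c.sum()
    for y in range(slice_c.shape[1]):
        p_yc = slice_c[:, y].sum()
        if p_yc > 0 and not np.allclose(slice_c[:, y] / p_yc,
                                        p_x_given_c, atol=tol):
            return False
    return True

# toy joint P(X, Y, C): X ⊥ Y when C = 0, dependent when C = 1
joint = np.empty((2, 2, 2))
joint[:, :, 0] = np.outer([0.3, 0.7], [0.5, 0.5]) * 0.5   # product slice
joint[:, :, 1] = np.array([[0.4, 0.1], [0.1, 0.4]]) * 0.5 # coupled slice
print(contextually_independent(joint, 0))  # True
print(contextually_independent(joint, 1))  # False
```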

Tree CPDs
A natural representation for capturing common elements in CPDs; uses a decision tree.
Example: job offer J (Val(J) = {yes, no}) with parents apply-in-time A (Val(A) = {yes, no}), quality of letter L (Val(L) = {good, bad}), and GPA score S (Val(S) = {high, low}).
Full table P(J | A, L, S), entries (P(j0 | ·), P(j1 | ·)):
a0 l0 s0: (0.95, 0.05); a0 l0 s1: (0.95, 0.05); a0 l1 s0: (0.95, 0.05); a0 l1 s1: (0.95, 0.05);
a1 l0 s0: (0.9, 0.1); a1 l0 s1: (0.3, 0.7); a1 l1 s0: (0.2, 0.8); a1 l1 s1: (0.2, 0.8).

Tree CPDs: example
The same CPD as a decision tree: if A = a0, leaf (0.95, 0.05); if A = a1 and L = l0, split on S — s0 gives (0.9, 0.1) and s1 gives (0.3, 0.7); if A = a1 and L = l1, leaf (0.2, 0.8).
The full table needs 8 parameters; the binary tree needs only 4.
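A tree CPD is just nested conditionals. A minimal sketch of the job-offer tree as reconstructed above (the exact tree shape is my reading of the garbled slide, so treat the structure as illustrative):

```python
def tree_cpd_job(a, l, s):
    """P(J = j1 | A=a, L=l, S=s) as a decision tree: 4 parameters
    instead of the 8 a full table needs."""
    if a == 0:                      # did not apply in time
        return 0.05
    if l == 0:                      # applied, weak letter: GPA decides
        return 0.1 if s == 0 else 0.7
    return 0.8                      # applied, good letter: letter dominates

print(tree_cpd_job(1, 0, 1))  # 0.7
```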

Rule CPDs
A rule r is a pair (c; p), where c is an assignment to a subset of variables C and p ∈ [0, 1]. Let Scope[r] = C.
A rule-based CPD P(X | Pa(X)) is a set of rules R such that:
For each rule r ∈ R: Scope[r] ⊆ {X} ∪ Pa(X).
For each assignment (x, u) to {X} ∪ Pa(X), there is exactly one rule (c; p) ∈ R such that c is compatible with (x, u); then P(X = x | Pa(X) = u) = p.

Rule CPDs: example
Let X be a binary variable with Pa(X) = {A, B, C}:
r1: (a1, b1, x0; 0.1)    r6: (a1, b1, x1; 0.9)
r2: (a0, c1, x0; 0.2)    r7: (a0, c1, x1; 0.8)
r3: (b0, c0, x0; 0.3)    r8: (b0, c0, x1; 0.7)
r4: (a1, b0, c1, x0; 0.4)    r9: (a1, b0, c1, x1; 0.6)
r5: (a0, b1, c0; 0.5)
Note: each full assignment maps to exactly one rule (r5 covers both values of X).
Rules cannot always be represented compactly within tree CPDs.
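A rule CPD evaluates by finding the unique compatible rule. A minimal sketch using the rule set above (rule_cpd is a hypothetical helper; the assertion enforces the exactly-one-rule requirement):

```python
RULES = [
    ({'A': 1, 'B': 1, 'X': 0}, 0.1),          # r1
    ({'A': 0, 'C': 1, 'X': 0}, 0.2),          # r2
    ({'B': 0, 'C': 0, 'X': 0}, 0.3),          # r3
    ({'A': 1, 'B': 0, 'C': 1, 'X': 0}, 0.4),  # r4
    ({'A': 0, 'B': 1, 'C': 0}, 0.5),          # r5 (covers both values of X)
    ({'A': 1, 'B': 1, 'X': 1}, 0.9),          # r6
    ({'A': 0, 'C': 1, 'X': 1}, 0.8),          # r7
    ({'B': 0, 'C': 0, 'X': 1}, 0.7),          # r8
    ({'A': 1, 'B': 0, 'C': 1, 'X': 1}, 0.6),  # r9
]

def rule_cpd(assignment):
    """P(X = x | A, B, C): exactly one rule must be compatible with
    each full assignment to {X} ∪ Pa(X)."""
    matches = [p for c, p in RULES
               if all(assignment[v] == val for v, val in c.items())]
    assert len(matches) == 1, "rule set must partition the assignments"
    return matches[0]

print(rule_cpd({'A': 1, 'B': 1, 'C': 0, 'X': 1}))  # 0.9 (rule r6)
```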

Tree CPDs and rule CPDs
Can represent every discrete CPD, and can easily be learned and handled in inference.
But some functions are not represented compactly. XOR in a tree CPD: a single split cannot separate (a1, b0) and (a0, b1) from the other assignments.
Alternative representations exist, e.g., complex logical rules.

Context-specific independencies: example
Consider a CPD tree for X over parents A, B, C, D that splits on A at the root, with the a0 subtree splitting on C and the a1 subtree splitting on B, and leaf distributions (0.9, 0.1), (0.7, 0.3), (0.2, 0.8), (0.4, 0.6). In both contexts A = a0 and A = a1, X does not depend on D, so the edge D → X is spurious. Reasoning by cases over the values of A implies that (X ⊥ D | A).

Independence of causal influence
Causes X1, ..., Xn; effect Y.
General case: Y has a complex dependency on X1, ..., Xn.
Common case: each Xi influences Y separately, and the influences of X1, ..., Xn are combined into an overall influence on Y.

Example 1: Noisy OR
Two independent causal mechanisms from X1 and X2 to Y. Key assumptions:
"OR": Y = y1 cannot happen unless at least one of X1, X2 occurs.
"Noisy": each causal mechanism is noisy.
Independence of the causal mechanisms:
P(Y = y0 | X1 = x1, X2 = x2) = P(Y = y0 | X1 = x1, X2 = x0) · P(Y = y0 | X1 = x0, X2 = x2).

Noisy OR: elaborate representation
Introduce the "effective value" of each cause: X1 → X1' and X2 → X2' through noisy CPDs with noise parameters λ1 = 0.9 and λ2 = 0.8, followed by a deterministic OR: Y = X1' ∨ X2'.

Noisy OR: general case
Y is a binary variable with n binary parents X1, ..., Xn. The CPD P(Y | X1, ..., Xn) is a noisy OR if there are n + 1 noise parameters λ0, λ1, ..., λn such that
P(Y = y0 | x1, ..., xn) = (1 − λ0) ∏_{i: xi = xi1} (1 − λi)
P(Y = y1 | x1, ..., xn) = 1 − (1 − λ0) ∏_{i: xi = xi1} (1 − λi)
where λ0 is the "leak" probability that Y turns on even when no cause is active.
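The general-case formula is a one-liner in code. A minimal sketch (noisy_or is a hypothetical helper), using the two-cause noise parameters from the slide and no leak:

```python
def noisy_or(lam0, lams, x):
    """Noisy-OR CPD: P(Y=1 | x) = 1 - (1 - lam0) * prod over active
    parents of (1 - lam_i).  lam0 is the leak, lams the per-cause noise."""
    p_y0 = 1.0 - lam0
    for lam_i, x_i in zip(lams, x):
        if x_i == 1:
            p_y0 *= 1.0 - lam_i
    return 1.0 - p_y0

print(noisy_or(0.0, [0.9, 0.8], [1, 0]))  # 0.9: only cause 1 active
print(noisy_or(0.0, [0.9, 0.8], [1, 1]))  # 0.98 = 1 - 0.1 * 0.2
```

Note the per-cause factors multiply independently, which is exactly the independence-of-mechanisms assumption: n + 1 parameters instead of 2^n.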

Generalized linear models (GLMs)
A soft version of a linear threshold function. Example: the logistic CPD (see the sketch below).
Binary variables X1, ..., Xn, Y. The (additive) combined effect of the Xi is
z = w0 + Σ_{i=1}^{n} wi · 1{Xi = xi1}.
Logistic CPD: P(Y = y1 | x1, ..., xn) = σ(z), where σ(z) = e^z / (1 + e^z) is the logistic (sigmoid) function, a smooth step function.

General formulation
The CPD P(Y | X1, ..., Xn) exhibits independence of causal influence (ICI) if it can be described via a network X1, ..., Xn → Z1, ..., Zn → Z → Y in which the CPD P(Z | Z1, ..., Zn) is deterministic.
Logistic: Zi = wi · 1{Xi = xi1}; Z = Σ Zi; Y = σ(Z).
Noisy OR: Zi has a noise model; Z is an OR function; Y is the identity CPD.
Key advantage: O(n) parameters.
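A minimal sketch of the logistic CPD (logistic_cpd is a hypothetical helper; the weights below are illustrative, not from the slides):

```python
import math

def logistic_cpd(w0, weights, x):
    """Logistic CPD: P(Y=1 | x) = sigma(w0 + sum_i w_i * 1{x_i = 1}),
    with sigma(z) = e^z / (1 + e^z), the smooth step function."""
    z = w0 + sum(w for w, xi in zip(weights, x) if xi == 1)
    return 1.0 / (1.0 + math.exp(-z))

print(logistic_cpd(-4.0, [3.0, 3.0], [0, 0]))  # ~0.018: no causes active
print(logistic_cpd(-4.0, [3.0, 3.0], [1, 1]))  # ~0.881: both causes active
```

As with noisy OR, this is an ICI model: n + 1 weights parameterize the CPD instead of a 2^n-row table.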

Continuous variables
One solution: discretize. But this often requires too many value states and loses domain structure.
Other solutions: use a continuous function for P(X | Pa(X)). Continuous and discrete variables can be combined, resulting in hybrid networks; inference and learning may become more difficult.

Gaussian density functions
Among the most common continuous representations. Univariate case:
P(X) ~ N(μ, σ²) if p(x) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²)).

Multivariate Gaussian density functions
A multivariate Gaussian distribution over X1, ..., Xn has an n-dimensional mean vector μ and an n×n positive definite covariance matrix Σ (positive definite: xᵀΣx > 0 for all nonzero x ∈ ℝⁿ).
Joint density function:
p(x) = (1 / ((2π)^{n/2} |Σ|^{1/2})) exp(−(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ))
μi = E[Xi]; Σii = Var[Xi]; Σij = Cov[Xi, Xj] = E[Xi Xj] − E[Xi] E[Xj] (i ≠ j).

Gaussian density functions (cont.)
Marginal distributions are easy to compute: if P(X, Y) = N((μX, μY); Σ), then P(X) = N(μX; ΣXX) — just read off the corresponding blocks.
Independencies can be determined from the parameters: if X = X1, ..., Xn has a joint normal distribution N(μ; Σ), then (Xi ⊥ Xj) iff Σij = 0 (for i ≠ j).
This does not hold in general for non-Gaussian distributions.
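Both facts — block marginalization and reading independence off Σ — are directly visible in code. A minimal sketch with illustrative numbers (the matrix below is my own example, chosen to be positive definite):

```python
import numpy as np

mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 2.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Marginalization just selects the corresponding entries/blocks:
idx = [0, 2]                      # marginal over (X1, X3)
print(mu[idx])                    # mean vector of the marginal
print(Sigma[np.ix_(idx, idx)])    # covariance block of the marginal

# Marginal independence can be read off the covariance:
print(Sigma[0, 2] == 0.0)         # True: X1 ⊥ X3 in this Gaussian
```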

Linear Gaussian CPDs
X is a continuous variable with parents Y1, ..., Yn. X has a linear Gaussian model if it can be described using parameters β0, β1, ..., βn and σ² such that
P(X | y1, ..., yn) = N(β0 + β1 y1 + ... + βn yn; σ²).
Vector notation: P(X | y) = N(β0 + βᵀy; σ²).
Pros: simple; captures many interesting dependencies.
Cons: fixed variance (the variance cannot depend on the parents' values).

Linear Gaussian Bayesian networks
A linear Gaussian Bayesian network is a Bayesian network where all variables are continuous and all CPDs are linear Gaussian.
Key result: linear Gaussian models are equivalent to multivariate Gaussian density functions. Proof? (See the two-node sketch below.)
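The key result can be checked empirically in the simplest case. A minimal sketch, assuming a two-node chain X1 → X2 (all numbers illustrative): the derived joint Gaussian should match Monte Carlo samples drawn from the network.

```python
import numpy as np

# Linear Gaussian chain X1 -> X2:
#   P(X1)      = N(mu1; s1^2)
#   P(X2 | x1) = N(b0 + b1*x1; s2^2)
mu1, s1 = 1.0, 0.5
b0, b1, s2 = 2.0, 3.0, 0.4

# Equivalent joint Gaussian, obtained by integrating out X1:
mu = np.array([mu1, b0 + b1 * mu1])
Sigma = np.array([[s1**2,      b1 * s1**2],
                  [b1 * s1**2, s2**2 + b1**2 * s1**2]])
print(mu, Sigma)

# Sanity check by ancestral sampling from the network itself:
rng = np.random.default_rng(0)
x1 = rng.normal(mu1, s1, size=200_000)
x2 = rng.normal(b0 + b1 * x1, s2)
print(np.mean(x1), np.mean(x2))            # ~1.0, ~5.0
print(np.cov(np.vstack([x1, x2])))         # ~Sigma
```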

Hybrid models
Models of continuous and discrete variables:
Continuous variables with discrete parents.
Discrete variables with continuous parents.
Conditional linear Gaussians (CLG): a continuous variable X with continuous parents Y1, ..., Yn and discrete parents U = {U1, ..., Um}; for every assignment u ∈ Val(U):
P(X | u, y1, ..., yn) = N(a_{u,0} + Σ_{i=1}^{n} a_{u,i} yi; σ²_u).
A conditional linear Gaussian Bayesian network is one where discrete variables have only discrete parents and continuous variables have only CLG CPDs.

Hybrid models: continuous parents for discrete children
Threshold models: P(Y = y1 | X = x) = 0.9 if x is below a threshold, and 0.05 otherwise.
Linear sigmoid: P(Y = y1 | x1, ..., xk) = σ(w0 + Σ_{i=1}^{k} wi xi).
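A minimal sketch of the CLG CPD defined above (clg_sample and its parameter layout are my own illustration): the discrete parent value selects which linear Gaussian in the continuous parents to use.

```python
import numpy as np

def clg_sample(rng, u, y, params):
    """Conditional linear Gaussian: params[u] = (a0, a, sigma) encodes
    P(X | u, y) = N(a0 + a.y; sigma^2); the discrete parent u selects
    the regime, the continuous parents y set the mean."""
    a0, a, sigma = params[u]
    return rng.normal(a0 + np.dot(a, y), sigma)

params = {0: (0.0, np.array([1.0, -1.0]), 0.5),
          1: (5.0, np.array([0.2, 0.2]), 2.0)}
rng = np.random.default_rng(1)
print(clg_sample(rng, 0, np.array([2.0, 1.0]), params))  # draw from N(1.0, 0.25)
print(clg_sample(rng, 1, np.array([2.0, 1.0]), params))  # draw from N(5.6, 4.0)
```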

Summary: CPD models
Deterministic functions.
Context-specific dependencies: tree CPDs, rule CPDs.
Independence of causal influence: noisy OR, logistic function.
CPDs capture additional domain structure.

Acknowledgement
These lecture notes were generated based on the slides from Prof. Eran Segal.
