
Announcements
Midterm: Wednesday 7pm-9pm. See the midterm prep page (posted on Piazza and the inst.eecs page).
Four rooms; your room is determined by the last two digits of your SID:
  00-32: Dwinelle 155
  33-45: Genetics and Plant Biology 100
  46-62: Hearst Annex A1
  63-99: Pimentel 1
Discussions this week are by topic.
Survey: complete it before the midterm; 80% participation = +1 pt

Bayes net global semantics
Bayes nets encode joint distributions as a product of conditional distributions, one per variable:
P(X_1, ..., X_n) = ∏_i P(X_i | Parents(X_i))
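The product form above is easy to evaluate directly. Below is a minimal sketch (not the course's code) that computes one entry of the burglary/alarm-network joint this way; the CPT numbers are the standard textbook values for that network, and the dictionary layout is just one possible encoding.

```python
# Evaluate one entry of the alarm-network joint as a product of CPT entries:
# P(b, e, a, j, m) = P(b) P(e) P(a|b,e) P(j|a) P(m|a)
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A_given_BE = {(True, True): 0.95, (True, False): 0.94,
                (False, True): 0.29, (False, False): 0.001}
P_J_given_A = {True: 0.90, False: 0.05}
P_M_given_A = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    p_a = P_A_given_BE[(b, e)] if a else 1 - P_A_given_BE[(b, e)]
    p_j = P_J_given_A[a] if j else 1 - P_J_given_A[a]
    p_m = P_M_given_A[a] if m else 1 - P_M_given_A[a]
    return P_B[b] * P_E[e] * p_a * p_j * p_m

print(joint(True, False, True, True, True))  # P(b, ¬e, a, j, m)
```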

Conditional independence semantics
Every variable is conditionally independent of its non-descendants given its parents.
Conditional independence semantics <=> global semantics

Example
JohnCalls independent of Burglary given Alarm? Yes
JohnCalls independent of MaryCalls given Alarm? Yes
Burglary independent of Earthquake? Yes
Burglary independent of Earthquake given Alarm? NO!
Given that the alarm has sounded, both burglary and earthquake become more likely. But if we then learn that a burglary has happened, the alarm is explained away and the probability of earthquake drops back.
[Network diagram: Burglary and Earthquake are parents of Alarm (a v-structure); Alarm is the parent of JohnCalls and MaryCalls.]

Markov blanket
A variable's Markov blanket consists of its parents, its children, and its children's other parents.
Every variable is conditionally independent of all other variables given its Markov blanket.
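As an illustration, here is a small sketch (an assumption of this write-up, not course code) that reads off a Markov blanket from a Bayes net given only as a parents dictionary, using the alarm network from the previous slide.

```python
# Compute a variable's Markov blanket (parents, children, children's other
# parents) from a parents dictionary describing the alarm network.
parents = {"Burglary": [], "Earthquake": [],
           "Alarm": ["Burglary", "Earthquake"],
           "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"]}

def markov_blanket(var, parents):
    children = [v for v, ps in parents.items() if var in ps]
    blanket = set(parents[var]) | set(children)
    for c in children:                      # children's other parents
        blanket |= set(parents[c]) - {var}
    return blanket

print(sorted(markov_blanket("Alarm", parents)))
# ['Burglary', 'Earthquake', 'JohnCalls', 'MaryCalls']
```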

CS 188: Artificial Intelligence
Bayes Nets: Exact Inference
Instructors: Sergey Levine and Stuart Russell, University of California, Berkeley

Bayes Nets
Part I: Representation
Part II: Exact inference
  - Enumeration (always exponential complexity)
  - Variable elimination (worst-case exponential complexity, often better)
  - Inference is NP-hard in general
Part III: Approximate inference
Later: Learning Bayes nets from data

Inference
Inference: calculating some useful quantity from a probability model (joint probability distribution).
Examples:
  Posterior marginal probability: P(Q | e_1, ..., e_k), e.g., what disease might I have?
  Most likely explanation: argmax_{q,r,s} P(Q=q, R=r, S=s | e_1, ..., e_k), e.g., what did he say?

Inference by Enumeration in Bayes Nets
Reminder of inference by enumeration:
Any probability of interest can be computed by summing entries from the joint distribution, and entries from the joint distribution can be obtained from a BN by multiplying the corresponding conditional probabilities.
P(B | j, m) = α P(B, j, m) = α Σ_{e,a} P(B, e, a, j, m) = α Σ_{e,a} P(B) P(e) P(a | B,e) P(j | a) P(m | a)
So inference in Bayes nets means computing sums of products of numbers: sounds easy!!
Problem: sums of exponentially many products!
[Network diagram: B and E are parents of A; A is the parent of J and M.]
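For concreteness, here is a hedged sketch of inference by enumeration for this exact query, using the standard textbook CPT values; the helper names are made up for illustration.

```python
# Inference by enumeration for P(B | j, m): sum the full joint over the
# hidden variables (E, A) and normalize.
from itertools import product

P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

def query_B_given(j, m):
    # Unnormalized: sum over the hidden variables e and a for each value of B
    unnorm = {b: sum(joint(b, e, a, j, m)
                     for e, a in product([True, False], repeat=2))
              for b in (True, False)}
    z = sum(unnorm.values())            # alpha = 1/z
    return {b: p / z for b, p in unnorm.items()}

print(query_B_given(j=True, m=True))    # ~{True: 0.284, False: 0.716}
```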

Can we do better?
Consider uwy + uwz + uxy + uxz + vwy + vwz + vxy + vxz
  16 multiplies, 7 adds, and lots of repeated subexpressions!
Rewrite as (u+v)(w+x)(y+z)
  2 multiplies, 3 adds
Σ_{e,a} P(B) P(e) P(a | B,e) P(j | a) P(m | a)
  = P(B) P(e) P(a | B,e) P(j | a) P(m | a)
  + P(B) P(¬e) P(a | B,¬e) P(j | a) P(m | a)
  + P(B) P(e) P(¬a | B,e) P(j | ¬a) P(m | ¬a)
  + P(B) P(¬e) P(¬a | B,¬e) P(j | ¬a) P(m | ¬a)
Lots of repeated subexpressions!
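The savings are purely algebraic, as this tiny numeric check illustrates (the values are arbitrary, chosen only to illustrate the identity):

```python
# The expanded sum of 8 products equals the factored form (u+v)(w+x)(y+z).
u, v, w, x, y, z = 0.2, 0.8, 0.3, 0.7, 0.4, 0.6

expanded = (u*w*y + u*w*z + u*x*y + u*x*z +
            v*w*y + v*w*z + v*x*y + v*x*z)   # 16 multiplies, 7 adds
factored = (u + v) * (w + x) * (y + z)       # 2 multiplies, 3 adds

print(expanded, factored)
assert abs(expanded - factored) < 1e-12
```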

Variable elimination: The basic ideas
Move summations inwards as far as possible:
P(B | j, m) = α Σ_{e,a} P(B) P(e) P(a | B,e) P(j | a) P(m | a)
            = α P(B) Σ_e P(e) Σ_a P(a | B,e) P(j | a) P(m | a)
Do the calculation from the inside out, i.e., sum over a first, then sum over e.
Problem: P(a | B,e) isn't a single number, it's a bunch of different numbers depending on the values of B and e.
Solution: use arrays of numbers (of various dimensions) with appropriate operations on them; these are called factors.

Factor Zoo

Factor Zoo I
Joint distribution: P(X,Y)
  Entries P(x,y) for all x, y
  |X| × |Y| matrix
  Sums to 1
  Example P(A,J):
    A \ J    true     false
    true     0.09     0.01
    false    0.045    0.855
Projected joint: P(x,Y)
  A slice of the joint distribution
  Entries P(x,y) for one x, all y
  |Y|-element vector
  Sums to P(x)
  Example P(a,J):
    A \ J    true     false
    true     0.09     0.01
Number of variables (capitals) = dimensionality of the table

Factor Zoo II
Single conditional: P(Y | x)
  Entries P(y | x) for fixed x, all y
  Sums to 1
  Example P(J | a):
    A \ J    true    false
    true     0.9     0.1
Family of conditionals: P(X | Y)
  Multiple conditionals
  Entries P(x | y) for all x, y
  Sums to |Y|
  Example P(J | A):
    A \ J    true    false
    true     0.9     0.1      <- P(J | a)
    false    0.05    0.95     <- P(J | ¬a)

Operation 1: Pointwise product
First basic operation: pointwise product of factors (similar to a database join, not matrix multiplication!)
The new factor has the union of the variables of the two original factors.
Each entry is the product of the corresponding entries from the original factors.
Example: P(J | A) × P(A) = P(A,J)

  P(A)                 P(J | A)                       P(A,J)
  true     0.1         A \ J    true    false         A \ J    true     false
  false    0.9     ×   true     0.9     0.1       =   true     0.09     0.01
                       false    0.05    0.95          false    0.045    0.855
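Here is a sketch of this operation with factors stored as dicts from assignment tuples to numbers (one possible representation, not the course's code); it reproduces the P(J | A) × P(A) example above.

```python
# Pointwise product of two factors over binary variables.
from itertools import product

def pointwise_product(vars1, f1, vars2, f2):
    """Return (vars, factor) for the product; vars is the union of inputs."""
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    out = {}
    for assignment in product([True, False], repeat=len(out_vars)):
        row = dict(zip(out_vars, assignment))
        k1 = tuple(row[v] for v in vars1)
        k2 = tuple(row[v] for v in vars2)
        out[assignment] = f1[k1] * f2[k2]
    return out_vars, out

# P(A) and P(J | A) from the slide; their product is P(A,J).
P_A = {(True,): 0.1, (False,): 0.9}
P_J_given_A = {(True, True): 0.9, (True, False): 0.1,
               (False, True): 0.05, (False, False): 0.95}

vars_AJ, P_AJ = pointwise_product(["A"], P_A, ["A", "J"], P_J_given_A)
print(vars_AJ, P_AJ)   # ['A', 'J'] {(True, True): 0.09, (True, False): 0.01, ...}
```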

Example: Making larger factors
Example: P(A,J) × P(A,M) = P(A,J,M)

  P(A,J)                           P(A,M)
  A \ J    true     false          A \ M    true     false
  true     0.09     0.01       ×   true     0.07     0.03      =   P(A,J,M)
  false    0.045    0.855          false    0.009    0.891

[The result P(A,J,M) is a 2×2×2 table, shown on the slide as two 2×2 slices, one for A=true and one for A=false.]

Example: Making larger factors
Example: P(U,V) × P(V,W) × P(W,X) = P(U,V,W,X)
Sizes: [10,10] × [10,10] × [10,10] = [10,10,10,10]
I.e., 300 numbers blows up to 10,000 numbers!
Factor blowup can make VE very expensive.

Operation 2: Summing out a variable
Second basic operation: summing out (or eliminating) a variable from a factor.
Shrinks a factor to a smaller one.
Example: Σ_j P(A,j) = P(A,j) + P(A,¬j) = P(A)

  P(A,J)                               P(A)
  A \ J    true     false              true     0.1
  true     0.09     0.01       ->      false    0.9
  false    0.045    0.855   (sum out J)
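A matching sketch of summing out, using the same dict-based factor representation as the pointwise-product sketch above (again an assumption of these notes, not the course's code):

```python
# Sum a variable out of a factor over binary variables.
def sum_out(var, vars_, factor):
    """Eliminate `var` from `factor` by summing over its values."""
    keep = [v for v in vars_ if v != var]
    out = {}
    for assignment, p in factor.items():
        row = dict(zip(vars_, assignment))
        key = tuple(row[v] for v in keep)
        out[key] = out.get(key, 0.0) + p
    return keep, out

# Summing J out of P(A,J) recovers P(A).
P_AJ = {(True, True): 0.09, (True, False): 0.01,
        (False, True): 0.045, (False, False): 0.855}

vars_A, P_A = sum_out("J", ["A", "J"], P_AJ)
print(vars_A, P_A)   # ['A'] {(True,): 0.1, (False,): 0.9}
```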

Summing out from a product of factors
Project the factors each way first, then sum the products.
Example:
Σ_a P(a | B,e) × P(j | a) × P(m | a) = P(a | B,e) × P(j | a) × P(m | a) + P(¬a | B,e) × P(j | ¬a) × P(m | ¬a)

Variable Elimination

Variable Elimination
Query: P(Q | E_1=e_1, ..., E_k=e_k)
Start with initial factors: local CPTs (but instantiated by evidence).
While there are still hidden variables (not Q or evidence):
  Pick a hidden variable H
  Join all factors mentioning H
  Eliminate (sum out) H
Join all remaining factors and normalize.

Variable Elimination
function VariableElimination(Q, e, bn) returns a distribution over Q
  factors <- [ ]
  for each var in ORDER(bn.vars) do
    factors <- [MAKE-FACTOR(var, e) | factors]
    if var is a hidden variable then factors <- SUM-OUT(var, factors)
  return NORMALIZE(POINTWISE-PRODUCT(factors))
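Putting the two basic operations together, here is a compact, self-contained Python sketch of the loop above, run on the alarm-network query P(B | j, m). The factor layout and helper names are assumptions of this sketch and the CPT numbers are the standard textbook values; it is an illustration, not the course's reference implementation.

```python
# Variable elimination on the alarm network for P(B | j=true, m=true).
from itertools import product

def make_factor(vars_, fn, evidence):
    """Tabulate fn over the non-evidence variables, fixing evidence values."""
    free = [v for v in vars_ if v not in evidence]
    table = {}
    for vals in product([True, False], repeat=len(free)):
        row = dict(zip(free, vals), **evidence)
        table[vals] = fn(row)
    return free, table

def pointwise_product(f1, f2):
    (v1, t1), (v2, t2) = f1, f2
    out_vars = v1 + [v for v in v2 if v not in v1]
    out = {}
    for vals in product([True, False], repeat=len(out_vars)):
        row = dict(zip(out_vars, vals))
        out[vals] = t1[tuple(row[v] for v in v1)] * t2[tuple(row[v] for v in v2)]
    return out_vars, out

def sum_out(var, factor):
    vars_, table = factor
    keep = [v for v in vars_ if v != var]
    out = {}
    for vals, p in table.items():
        row = dict(zip(vars_, vals))
        key = tuple(row[v] for v in keep)
        out[key] = out.get(key, 0.0) + p
    return keep, out

# Alarm-network CPTs (standard textbook values).
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
cpts = {
    "B": (["B"],           lambda r: 0.001 if r["B"] else 0.999),
    "E": (["E"],           lambda r: 0.002 if r["E"] else 0.998),
    "A": (["A", "B", "E"], lambda r: P_A[(r["B"], r["E"])] if r["A"]
                                      else 1 - P_A[(r["B"], r["E"])]),
    "J": (["J", "A"],      lambda r: (0.90 if r["A"] else 0.05) if r["J"]
                                      else (0.10 if r["A"] else 0.95)),
    "M": (["M", "A"],      lambda r: (0.70 if r["A"] else 0.01) if r["M"]
                                      else (0.30 if r["A"] else 0.99)),
}

def variable_elimination(evidence, order):
    factors = [make_factor(v, fn, evidence) for v, fn in cpts.values()]
    for h in order:                                   # hidden variables only
        mentioning = [f for f in factors if h in f[0]]
        rest = [f for f in factors if h not in f[0]]
        joined = mentioning[0]
        for f in mentioning[1:]:
            joined = pointwise_product(joined, f)     # join all factors with h
        factors = rest + [sum_out(h, joined)]         # then eliminate h
    result = factors[0]
    for f in factors[1:]:
        result = pointwise_product(result, f)         # join remaining factors
    vars_, table = result
    z = sum(table.values())                           # normalize
    return {vals: p / z for vals, p in table.items()}

print(variable_elimination({"J": True, "M": True}, order=["A", "E"]))
# ~{(True,): 0.284, (False,): 0.716}, i.e., P(B | j, m)
```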

Example
Query: P(B | j,m)
Initial factors: P(B), P(E), P(A | B,E), P(j | A), P(m | A)
Choose A: join P(A | B,E), P(j | A), P(m | A) and sum out A, giving P(j,m | B,E)
Remaining factors: P(B), P(E), P(j,m | B,E)

Example
Remaining factors: P(B), P(E), P(j,m | B,E)
Choose E: join P(E) and P(j,m | B,E) and sum out E, giving P(j,m | B)
Remaining factors: P(B), P(j,m | B)
Finish with B: join P(B) and P(j,m | B), giving P(j,m,B)
Normalize to obtain P(B | j,m)

Order matters
Order the terms Z, A, B, C, D:
  P(D) = α Σ_{z,a,b,c} P(z) P(a|z) P(b|z) P(c|z) P(D|z)
       = α Σ_z P(z) Σ_a P(a|z) Σ_b P(b|z) Σ_c P(c|z) P(D|z)
  Largest factor has 2 variables (D,Z)
Order the terms A, B, C, D, Z:
  P(D) = α Σ_{a,b,c,z} P(a|z) P(b|z) P(c|z) P(D|z) P(z)
       = α Σ_a Σ_b Σ_c Σ_z P(a|z) P(b|z) P(c|z) P(D|z) P(z)
  Largest factor has 4 variables (A,B,C,D)
In general, with n leaves, the largest factor has size 2^n.
[Network diagram: Z is the parent of A, B, C, and D.]
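One way to see the effect of ordering without any numbers is to track only the variable sets of the factors, as in this small sketch for the star network above (helper names are made up for illustration):

```python
# Track factor scopes (variable sets) during elimination and report the
# largest factor produced, for the network where Z is the parent of
# A, B, C, D and the query is P(D).
def largest_factor(elim_order, scopes):
    scopes = [set(s) for s in scopes]                 # CPT scopes
    largest = max(len(s) for s in scopes)
    for h in elim_order:
        joined = set().union(*(s for s in scopes if h in s)) - {h}
        scopes = [s for s in scopes if h not in s] + [joined]
        largest = max(largest, len(joined))
    return largest

cpt_scopes = [{"Z"}, {"A", "Z"}, {"B", "Z"}, {"C", "Z"}, {"D", "Z"}]

# Eliminating Z last: largest factor has 2 variables (D, Z).
print(largest_factor(["A", "B", "C", "Z"], cpt_scopes))   # 2
# Eliminating Z first: largest factor has 4 variables (A, B, C, D).
print(largest_factor(["Z", "A", "B", "C"], cpt_scopes))   # 4
```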

VE: Computational and Space Complexity
The computational and space complexity of variable elimination is determined by the largest factor (and it's space that kills you).
The elimination ordering can greatly affect the size of the largest factor, e.g., 2^n vs. 2 in the previous slide's example.
Does there always exist an ordering that only results in small factors? No!

Worst Case Complexity?
Reduction from SAT. CNF clauses:
  1. A v B v C
  2. C v D v A
  3. B v C v D
P(AND) > 0 iff the clauses are satisfiable => inference is NP-hard
P(AND) = S × 0.5^n, where S is the number of satisfying assignments of the clauses => inference is #P-hard

Polytrees
A polytree is a directed graph with no undirected cycles.
For polytrees, the complexity of variable elimination is linear in the network size if you eliminate from the leaves towards the roots.
This is essentially the same theorem as for tree-structured CSPs.

Bayes Nets
Part I: Representation
Part II: Exact inference
  - Enumeration (always exponential complexity)
  - Variable elimination (worst-case exponential complexity, often better)
  - Inference is NP-hard in general
Part III: Approximate inference
Later: Learning Bayes nets from data