Bayesian Networks. Axioms of Probability Theory. Conditional Probability. Inference by Enumeration. CSE 473

Transcription:

Bayesian Networks (CSE 473), Daniel S. Weld

Last Time: Basic notions: atomic events, probabilities, joint distribution, inference by enumeration, independence & conditional independence, Bayes rule. Bayesian networks. Statistical learning. Dynamic Bayesian networks (DBNs). Markov decision processes (MDPs).

Axioms of Probability Theory: All probabilities are between 0 and 1: 0 ≤ P(A) ≤ 1, P(true) = 1, P(false) = 0. The probability of a disjunction is: P(A ∨ B) = P(A) + P(B) − P(A ∧ B).

Conditional Probability: P(A | B) is the probability of A given B. Assumes that B is the only info known. Defined by: P(A | B) = P(A ∧ B) / P(B).

Inference by Enumeration: P(toothache) = .108 + .012 + .016 + .064 = .20, or 20%.
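
The enumeration above can be reproduced in a few lines of code. Below is a minimal sketch, assuming the standard AIMA toothache/cavity/catch joint table (the slide quotes only the four entries that sum to P(toothache)); the helper names prob and cond_prob are purely illustrative.

```python
# Full joint over (toothache, cavity, catch): the standard AIMA dentist table,
# which is where the .108 + .012 + .016 + .064 figures come from.
joint = {
    # (toothache, cavity, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}
VARS = ("toothache", "cavity", "catch")

def prob(**fixed):
    """Marginal probability of an assignment, by summing matching joint entries."""
    return sum(p for event, p in joint.items()
               if all(event[VARS.index(v)] == val for v, val in fixed.items()))

def cond_prob(query, given):
    """P(query | given) = P(query, given) / P(given)."""
    return prob(**query, **given) / prob(**given)

print(prob(toothache=True))                              # 0.2
print(cond_prob({"cavity": True}, {"toothache": True}))  # 0.6
```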

Problems?? Worst-case time: O(d^n), where d = max arity and n = number of random variables. Space complexity is also O(d^n): the size of the joint distribution. How do we get the O(d^n) entries for the table?? Note that the values of cavity & catch are irrelevant when computing P(toothache).

Independence: A and B are independent iff P(A | B) = P(A) and P(B | A) = P(B). These two constraints are logically equivalent. Therefore, if A and B are independent: P(A ∧ B) = P(A) P(B).

Independence: Complete independence is powerful but rare. What to do if it doesn't hold?

Conditional Independence: A & B are not independent, since P(A | B) < P(A). But: A & B are made independent by C: P(A | C) = P(A | B, C).
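
As a quick illustration of both claims, the sketch below checks independence and conditional independence numerically against the same assumed toothache/cavity/catch joint used earlier; the predicate-based helper is ad hoc.

```python
# Check (conditional) independence claims numerically against the joint table.
joint = {  # keys: (toothache, cavity, catch)
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def p(pred):
    """Probability of the event selected by a predicate over (toothache, cavity, catch)."""
    return sum(pr for ev, pr in joint.items() if pred(ev))

# Unconditional: P(toothache, catch) vs P(toothache) * P(catch)
print(p(lambda e: e[0] and e[2]), p(lambda e: e[0]) * p(lambda e: e[2]))
# 0.124 vs 0.068 -> toothache and catch are NOT independent

# Conditional on cavity: P(toothache, catch | cavity) vs
#                        P(toothache | cavity) * P(catch | cavity)
pc = p(lambda e: e[1])
lhs = p(lambda e: e[0] and e[2] and e[1]) / pc
rhs = (p(lambda e: e[0] and e[1]) / pc) * (p(lambda e: e[2] and e[1]) / pc)
print(lhs, rhs)   # 0.54 vs 0.54 -> independent given cavity
```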

Conditional Independence: P(catch | toothache, cavity) = P(catch | cavity) and P(catch | toothache, ¬cavity) = P(catch | ¬cavity).

Conditional Independence II: Why only 5 entries in the table? Instead of 7 entries, only need 5: the full joint over Toothache, Catch, Cavity has 2^3 − 1 = 7 independent entries, but with Catch independent of Toothache given Cavity it decomposes into P(Cavity), P(Toothache | Cavity), and P(Catch | Cavity), i.e. 1 + 2 + 2 = 5 numbers.

Power of Cond. Independence: Often, using conditional independence reduces the storage complexity of the joint distribution from exponential to linear!! Conditional independence is the most basic & robust form of knowledge about uncertain environments.

Bayes Rule: P(H | E) = P(E | H) P(H) / P(E). Simple proof from the definition of conditional probability:
1. P(E | H) = P(H ∧ E) / P(H)   (Def. cond. prob.)
2. P(H | E) = P(H ∧ E) / P(E)   (Def. cond. prob.)
3. P(H ∧ E) = P(E | H) P(H)     (Multiply by P(H) in line 1)
QED: P(H | E) = P(E | H) P(H) / P(E)   (Substitute #3 in #2)

Use Bayes Rule to Compute Diagnostic Probability from Causal Probability: e.g., let M be meningitis, S be stiff neck. P(M) = 0.0001, P(S) = 0.1, P(S | M) = 0.8. Then P(M | S) = P(S | M) P(M) / P(S) = 0.8 × 0.0001 / 0.1 = 0.0008.

Bayes Rule & Cond. Independence.
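
The diagnostic computation is a one-line application of Bayes rule; a tiny sketch using the slide's meningitis numbers:

```python
# Bayes rule with the slide's numbers: P(M) = 0.0001, P(S) = 0.1, P(S | M) = 0.8.
p_m, p_s, p_s_given_m = 0.0001, 0.1, 0.8

p_m_given_s = p_s_given_m * p_m / p_s   # diagnostic direction from causal numbers
print(p_m_given_s)                      # 0.0008: a stiff neck barely raises P(meningitis)
```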

Bayes Nets: In general, a joint distribution P over a set of variables (X1 × ... × Xn) requires exponential space for representation & inference. BNs provide a graphical representation of the conditional independence relations in P: usually quite compact; requires assessment of fewer parameters, those being quite natural (e.g., causal); efficient (usually) inference: query answering and belief update.

Independence (in the extreme): If X1, X2, ..., Xn are mutually independent, then P(X1, X2, ..., Xn) = P(X1) P(X2) ... P(Xn), so the joint can be specified with n parameters (cf. the usual 2^n − 1 parameters required). While extreme independence is unusual, conditional independence is common. BNs exploit this conditional independence.

An Example Bayes Net / Example (con't): A network over Burglary (B), Earthquake (E), Alarm (A), Radio, and the neighbor-call variables N1 and N2, with CPT Pr(A | E, B): e,b: 0.9 (0.1); e,¬b: 0.2 (0.8); ¬e,b: 0.85 (0.15); ¬e,¬b: 0.01 (0.99), and a prior table Pr(t) = 0.05, Pr(f) = 0.95. If I know whether the Alarm went off, no other evidence influences my degree of belief in N1: P(N1 | N2, A, E, B) = P(N1 | A); also: P(N2 | N1, A, E, B) = P(N2 | A) and P(E | B) = P(E). By the chain rule we have P(N1, N2, A, E, B) = P(N1 | N2, A, E, B) P(N2 | A, E, B) P(A | E, B) P(E | B) P(B) = P(N1 | A) P(N2 | A) P(A | B, E) P(E) P(B). The full joint requires only 10 parameters (cf. 32).

BNs: Qualitative Structure: The graphical structure of a BN reflects conditional independence among variables. Each variable X is a node in the DAG; edges denote direct probabilistic influence, usually interpreted causally; the parents of X are denoted Par(X). X is conditionally independent of all non-descendants given its parents. A graphical test exists for more general independence: the Markov Blanket.

Given Parents, X is Independent of Non-Descendants.
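
To make the 10-parameter factorization concrete, here is a small sketch that builds the joint from the CPTs and checks that it sums to 1. Only the P(A | E, B) table and the 0.05/0.95 prior come from the slide; the Earthquake prior and the P(Ni | A) tables are illustrative placeholders, not lecture values.

```python
from itertools import product

# Minimal sketch of the factored joint
#   P(N1,N2,A,E,B) = P(N1|A) P(N2|A) P(A|B,E) P(E) P(B)
P_B_TRUE = 0.05                      # assumed: the slide's 0.05 / 0.95 prior
P_E_TRUE = 0.03                      # placeholder prior for Earthquake
P_A_TRUE = {(True, True): 0.9, (True, False): 0.2,
            (False, True): 0.85, (False, False): 0.01}   # P(A=t | E, B) from the slide
P_N_TRUE = {True: 0.8, False: 0.05}  # placeholder P(Ni=t | A), same for N1 and N2

def bern(p_true, value):
    """Probability of a boolean value under a Bernoulli parameter."""
    return p_true if value else 1.0 - p_true

def joint(n1, n2, a, e, b):
    """Chain-rule product exploiting the net's conditional independences."""
    return (bern(P_N_TRUE[a], n1) * bern(P_N_TRUE[a], n2) *
            bern(P_A_TRUE[(e, b)], a) * bern(P_E_TRUE, e) * bern(P_B_TRUE, b))

# 10 CPT parameters (2 + 2 + 4 + 1 + 1) define all 32 joint entries, which sum to 1.
print(sum(joint(*v) for v in product([True, False], repeat=5)))
```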

For Example / Given Markov Blanket, X is Independent of All Other Nodes: MB(X) = Par(X) ∪ Childs(X) ∪ Par(Childs(X)). (Illustrated on the Burglary / Radio / Alarm network.)

Conditional Probability Tables: For a complete specification of the joint distribution, quantify the BN. For each variable X, specify a CPT: P(X | Par(X)); the number of parameters is locally exponential in |Par(X)|. (Example: Pr(A | E, B): e,b: 0.9 (0.1); e,¬b: 0.2 (0.8); ¬e,b: 0.85 (0.15); ¬e,¬b: 0.01 (0.99); prior Pr(t) = 0.05, Pr(f) = 0.95.) If X1, X2, ..., Xn is any topological sort of the network, then we are assured: P(Xn, Xn-1, ..., X1) = P(Xn | Xn-1, ..., X1) P(Xn-1 | Xn-2, ..., X1) ... P(X2 | X1) P(X1) = P(Xn | Par(Xn)) P(Xn-1 | Par(Xn-1)) ... P(X1).

Inference in BNs: The graphical independence representation yields efficient inference schemes. We generally want to compute Pr(X), or Pr(X | E) where E is (conjunctive) evidence. Computations are organized by network topology. One simple algorithm: variable elimination (VE).

P(B | J=true, M=true): on the alarm network (Burglary, Earthquake, Alarm, JohnCalls, MaryCalls, Radio), P(b | j, m) = α P(b) Σ_e P(e) Σ_a P(a | b, e) P(j | a) P(m | a).
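
A small sketch of that enumeration query follows. The slide does not list CPT numbers for this query, so the values below are the standard AIMA alarm-network parameters, used purely for illustration.

```python
from itertools import product

# Enumeration of P(B | J=true, M=true)
#   = alpha * P(B) * sum_e P(e) * sum_a P(a | B, e) P(j | a) P(m | a)
P_B, P_E = 0.001, 0.002                                  # assumed AIMA priors
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}       # P(A=t | B, E)
P_J = {True: 0.90, False: 0.05}                          # P(J=t | A)
P_M = {True: 0.70, False: 0.01}                          # P(M=t | A)

def bern(p_true, value):
    return p_true if value else 1.0 - p_true

def unnormalized(b):
    """P(b) * sum over e, a of P(e) P(a | b, e) P(j | a) P(m | a), with j = m = true."""
    total = 0.0
    for e, a in product([True, False], repeat=2):
        total += bern(P_E, e) * bern(P_A[(b, e)], a) * P_J[a] * P_M[a]
    return bern(P_B, b) * total

scores = {b: unnormalized(b) for b in (True, False)}
alpha = 1.0 / sum(scores.values())
print({b: alpha * s for b, s in scores.items()})   # P(B=true | j, m) is roughly 0.28
```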

Structure of Computation: Variable Elimination as Dynamic Programming.

Variable Elimination: A factor is a function from some set of variables into a specific value, e.g., f(E, A, N1). CPTs are factors, e.g., P(A | E, B) is a function of A, E, B. VE works by eliminating all variables in turn until there is a factor with only the query variable. To eliminate a variable: join all factors containing that variable (like a DB join), then sum out the influence of the variable on the new factor. This exploits the product form of the joint distribution.

Example of VE: computing P(N1) in the Earthquake / Burglary / Alarm network.
P(N1) = Σ_{N2,A,B,E} P(N1, N2, A, B, E)
      = Σ_{N2,A,B,E} P(N1 | A) P(N2 | A) P(B) P(A | B, E) P(E)
      = Σ_A P(N1 | A) Σ_N2 P(N2 | A) Σ_B P(B) Σ_E P(A | B, E) P(E)
      = Σ_A P(N1 | A) Σ_N2 P(N2 | A) Σ_B P(B) f1(A, B)
      = Σ_A P(N1 | A) Σ_N2 P(N2 | A) f2(A)
      = Σ_A P(N1 | A) f3(A)
      = f4(N1)

Notes on VE: Each operation is simply a multiplication of factors and summing out a variable. Complexity is determined by the size of the largest factor: e.g., in the example, 3 vars (not 5); linear in the number of vars, exponential in the largest factor. The elimination ordering greatly impacts factor size; finding optimal elimination orderings is NP-hard, so use heuristics or special structure (e.g., polytrees). Practically, inference is much more tractable using structure of this sort.
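
The sketch below mirrors the join / sum-out description with factors stored as dictionaries, and runs the P(N1) elimination from the example. As before, the P(A | E, B) table is from the slides, while the priors and the P(Ni | A) entries are placeholders.

```python
from itertools import product

# Variable elimination sketch: a factor maps assignments over an ordered tuple
# of variable names to values.

class Factor:
    def __init__(self, variables, table):
        self.vars = tuple(variables)      # e.g. ("A", "E", "B")
        self.table = dict(table)          # {(a, e, b): value}

    def __mul__(self, other):
        """Join: pointwise product over the union of the two variable sets."""
        joined = self.vars + tuple(v for v in other.vars if v not in self.vars)
        table = {}
        for assign in product([True, False], repeat=len(joined)):
            env = dict(zip(joined, assign))
            table[assign] = (self.table[tuple(env[v] for v in self.vars)] *
                             other.table[tuple(env[v] for v in other.vars)])
        return Factor(joined, table)

    def sum_out(self, var):
        """Eliminate var by summing it out of this factor."""
        keep = tuple(v for v in self.vars if v != var)
        table = {}
        for assign, value in self.table.items():
            env = dict(zip(self.vars, assign))
            key = tuple(env[v] for v in keep)
            table[key] = table.get(key, 0.0) + value
        return Factor(keep, table)

def cpt(var, parents, p_true):
    """Build a factor for P(var | parents) from P(var=t | parent assignment)."""
    vars_ = (var,) + tuple(parents)
    table = {}
    for assign in product([True, False], repeat=len(vars_)):
        pt = p_true[assign[1:]]
        table[assign] = pt if assign[0] else 1.0 - pt
    return Factor(vars_, table)

factors = [
    cpt("B", (), {(): 0.05}),                             # assumed 0.05/0.95 prior
    cpt("E", (), {(): 0.03}),                             # placeholder
    cpt("A", ("E", "B"), {(True, True): 0.9, (True, False): 0.2,
                          (False, True): 0.85, (False, False): 0.01}),
    cpt("N1", ("A",), {(True,): 0.8, (False,): 0.05}),    # placeholder
    cpt("N2", ("A",), {(True,): 0.8, (False,): 0.05}),    # placeholder
]

# Eliminate everything except the query variable N1, in a fixed ordering.
for var in ("E", "B", "N2", "A"):
    related = [f for f in factors if var in f.vars]
    rest = [f for f in factors if var not in f.vars]
    prod = related[0]
    for f in related[1:]:
        prod = prod * f
    factors = rest + [prod.sum_out(var)]

answer = factors[0]
for f in factors[1:]:
    answer = answer * f
print(dict(answer.table))    # P(N1): entries for N1=true and N1=false, summing to 1
```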