
Probabilistic Artificial Intelligence
Problem Set 3
Oct 27, 2017

1. Variable elimination

In this exercise you will use variable elimination to perform inference on a Bayesian network. Consider the network in Figure 1 and its corresponding conditional probability tables (CPTs).

Figure 1: Bayesian network for problem 1, over nodes A, B, C, D, E, F (the CPTs below correspond to edges A→B, C→B, C→D, B→E, D→F, E→F).

P(A = t) = 0.3    (1)
P(C = t) = 0.6    (2)

Table 1: CPTs for problem 1.

(a) P(B | A, C)
A  C  | P(B = t)
f  f  | 0.2
f  t  | 0.8
t  f  | 0.3
t  t  | 0.5

(b) P(D | C)
C  | P(D = t)
f  | 0.9
t  | 0.75

(c) P(E | B)
B  | P(E = t)
f  | 0.2
t  | 0.4

(d) P(F | D, E)
D  E  | P(F = t)
f  f  | 0.95
f  t  | 1
t  f  | 0
t  t  | 0.25
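The CPTs above, together with the priors on A and C, determine the joint distribution P(A, B, C, D, E, F) = P(A) P(C) P(B | A, C) P(D | C) P(E | B) P(F | D, E). As a way to sanity-check the variable-elimination results asked for below, here is a minimal brute-force enumeration sketch in Python; the variable names and table encoding are illustrative assumptions, and only the numbers come from Table 1:

```python
from itertools import product

# Priors and CPTs from Table 1 (True = t, False = f).
P_A = {True: 0.3, False: 0.7}
P_C = {True: 0.6, False: 0.4}
P_B = {(False, False): 0.2, (False, True): 0.8,   # P(B=t | A, C)
       (True, False): 0.3, (True, True): 0.5}
P_D = {False: 0.9, True: 0.75}                     # P(D=t | C)
P_E = {False: 0.2, True: 0.4}                      # P(E=t | B)
P_F = {(False, False): 0.95, (False, True): 1.0,   # P(F=t | D, E)
       (True, False): 0.0, (True, True): 0.25}

def bern(p, val):
    """Probability of a binary variable taking `val` given P(var = t) = p."""
    return p if val else 1.0 - p

def joint(a, b, c, d, e, f):
    """Joint probability via the factorization implied by the network."""
    return (bern(P_A[True], a) * bern(P_C[True], c) * bern(P_B[(a, c)], b) *
            bern(P_D[c], d) * bern(P_E[b], e) * bern(P_F[(d, e)], f))

def query_A(b, d):
    """P(A | B=b, D=d) by summing the joint over the hidden variables C, E, F."""
    post = {}
    for a in (False, True):
        post[a] = sum(joint(a, b, c, d, e, f)
                      for c, e, f in product((False, True), repeat=3))
    z = post[False] + post[True]
    return {a: p / z for a, p in post.items()}

if __name__ == "__main__":
    print(query_A(b=True, d=False))   # compare with your result for P(A | B=t, D=f)
```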

Assuming a query on A with evidence for B and D, i.e. P(A | B, D), use the variable elimination algorithm to answer the following queries. Make explicit the selected ordering for the variables and compute the probability tables of the intermediate factors.

1. P(A = t | B = t, D = f)
2. P(A = f | B = f, D = f)
3. P(A = t | B = t, D = t)

Consider now the ordering C, E, F, D, B, A; use the variable elimination algorithm again and write down the intermediate factors, this time without computing their probability tables. Is this ordering better or worse than the one you used before? Why?

2. Belief propagation

In this exercise, you will implement the belief propagation algorithm for performing inference in Bayesian networks. As you have seen in the class lectures, the algorithm is based on converting the Bayesian network to a factor graph and then passing messages between variable and factor nodes of that graph until convergence. You are provided some skeleton Python code in the .zip file accompanying this document. Take the following steps for this exercise.

1. Install the Python dependencies listed in README.txt, if your system does not already satisfy them. After that, you should be able to run demo.py and produce some plots, albeit wrong ones for now.

2. Implement the missing code in bprop.py marked with TODO. In particular, you have to fill in parts of the two functions that are responsible for sending messages from variable to factor nodes and vice versa, as well as parts of the function that returns the resulting marginal distribution of a variable node after message passing has terminated.

3. Now, set up the full-fledged earthquake network, whose structure was introduced in Problem Set 2 and is shown again in Figure 2. Here is the story behind this network: While Fred is commuting to work, he receives a phone call from his neighbor saying that the burglar alarm in Fred's house is ringing. Upon hearing this, Fred immediately turns around to get back and check his home. A few minutes on his way back, however, he hears on the radio that there was an earthquake near his home earlier that day. Relieved by the news, he turns around again and continues his way to work.

To build up the conditional probability tables (CPTs) for the network of Figure 2, you may make the following assumptions about the variables involved:

All variables in the network are binary. As can be seen from the network structure, burglaries and earthquakes are assumed to be independent. Furthermore, each of them is assumed to occur with probability 0.1%.

Figure 2: The earthquake network to be implemented (nodes Earthquake, Burglar, Radio, Alarm, Phone).

The alarm is triggered in the following ways: (1) when a burglar enters the house, the alarm will ring 99% of the time; (2) when an earthquake occurs, there will be a false alarm 1% of the time; (3) the alarm might go off due to other causes (wind, rain, etc.) 0.1% of the time. These three types of causes are assumed to be independent of each other. (One way to turn these percentages into the alarm CPT is sketched after question 4 below.)

The neighbor is assumed to call only when the alarm is ringing, but only does so 70% of the time when it is actually ringing.

The radio is assumed to never falsely report an earthquake, but it might fail to report an earthquake that actually happened 50% of the time. (This includes the times that Fred fails to listen to the announcement.)

4. After having set up the network and its CPTs, answer the following questions using your belief propagation implementation:

(a) Before Fred gets the neighbor's call, what is the probability of a burglary having occurred? What is the probability of an earthquake having occurred?

(b) How do these probabilities change after Fred receives the neighbor's phone call?

(c) How do these probabilities change after Fred listens to the news on the radio?
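The statement that the three causes of the alarm act independently suggests a noisy-OR combination: the alarm stays silent only if every active cause independently fails to trigger it. Below is a minimal sketch under that reading; the function and constant names are illustrative and not part of the provided bprop.py skeleton.

```python
# Noisy-OR reading of the alarm assumptions: the alarm is silent only if
# every active cause independently fails to trigger it.
P_RING_BURGLAR = 0.99    # a burglar triggers the alarm 99% of the time
P_RING_QUAKE = 0.01      # an earthquake triggers a false alarm 1% of the time
P_RING_OTHER = 0.001     # other causes (wind, rain, ...) 0.1% of the time

def p_alarm(burglar: bool, earthquake: bool) -> float:
    """P(Alarm = t | Burglar, Earthquake) under the noisy-OR assumption."""
    p_silent = 1.0 - P_RING_OTHER            # the "other causes" leak is always present
    if burglar:
        p_silent *= 1.0 - P_RING_BURGLAR
    if earthquake:
        p_silent *= 1.0 - P_RING_QUAKE
    return 1.0 - p_silent

# Resulting CPT rows, e.g. for filling in the earthquake network:
for b in (False, True):
    for e in (False, True):
        print(f"B={b!s:5} E={e!s:5}  P(Alarm=t) = {p_alarm(b, e):.7f}")
```

For example, with no burglar and no earthquake this gives P(Alarm = t) = 0.001, matching the 0.1% chance of a false alarm from other causes.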

3. Belief propagation on tree factor graphs*

In this exercise we will prove that the belief propagation algorithm converges to the exact marginals after a fixed number of iterations, given that the factor graph is a tree, that is, given that the original Bayesian network is a polytree. We will assume that the factor graph contains no single-variable factors. (You have already seen that if those exist, they can easily be incorporated into multi-variable factors without increasing the complexity of the algorithm.) Since the factor graph is a tree, we will designate a variable node, say a, as the root of the tree.

Figure 3: An example factor graph and two of its subtrees.

We will consider subtrees T_[rt], where r and t are adjacent nodes in the factor graph and t is closer to the root than r, which are of two types:

- if r is a factor node (and t a variable node), then T_[rt] denotes a subtree that has t as its root, contains the whole subtree under r and, additionally, the edge {r, t};

- if r is a variable node (and t a factor node), then T_[rt] denotes the whole subtree under r, with r as its root.

See Figure 3 for two example subtrees, one of each type. The depth of a tree is defined as the maximum distance between the root and any other node. Note that both types of subtrees T_[rt] defined above always have depths that are even numbers.

We will use the subscript notation [rt] to refer to quantities constrained to the subtree T_[rt]. In particular, we denote by F_[rt] the set of factors in the subtree and by P_[rt](x_v) the marginal distribution of v when we only consider the nodes of the subtree. More concretely, if r is a variable node, by the sum rule we get

    P_[rt](x_r) ∝ Σ_{x_[rt] \ {r}} Π_{f ∈ F_[rt]} f(x_f),    (1)

where ∝ denotes equality up to a normalization constant.

Remember the form of the messages passed between variable and factor nodes at each iteration of the algorithm:

    µ^(t+1)_{v→f}(x_v) := Π_{g ∈ N(v) \ {f}} µ^(t)_{g→v}(x_v),    (2)

    µ^(t+1)_{f→v}(x_v) := Σ_{x_f \ {v}} f(x_f) Π_{w ∈ N(f) \ {v}} µ^(t)_{w→f}(x_w).    (3)

We also define the estimated marginal distribution of variable v at iteration t as

    P̂^(t)(x_v) := Π_{g ∈ N(v)} µ^(t)_{g→v}(x_v).    (4)

(A small Python sketch of updates (2)-(4) appears after part (v) below.)

Our ultimate goal is to show that the estimated marginals are equal to the true marginals for all variables after a number of iterations. However, we will first consider the rooted version of the factor graph and show that the previous statement holds for the root node a. More concretely, if we denote variable nodes with v and factor nodes with f, we will show using induction that for all subtrees T_[fv] of depth τ, it holds that, for all t ≥ τ,

    µ^(t)_{f→v}(x_v) ∝ P_[fv](x_v).    (5)

Figure 4: The three cases considered in the proof by induction. (a) Base case. (b) First induction step. (c) Second induction step.

(i) Consider the base case of T_[fv] being a subtree of depth τ = 2 (see Figure 4a). Show that (5) holds in this case for all t ≥ 2.

(ii) Now, assume that (5) holds for all subtrees of depth τ. As a first step, show that for any subtree T_[vf] of depth τ (see Figure 4b) it holds that, for all t ≥ τ + 1, µ^(t)_{v→f}(x_v) ∝ P_[vf](x_v).

(iii) Using the result of the previous step, show that, for any subtree T_[fv] of depth τ′ = τ + 2, (5) holds for all t ≥ τ′ (see Figure 4c).

(iv) Show that, if the factor graph rooted at a has depth d, then, for all t ≥ d,

    P̂^(t)(x_a) ∝ P(x_a).

(v) To generalize the statement above, assume that the factor graph has diameter D, i.e., the maximum distance between any two nodes in the graph is D. Show that for all t ≥ D the estimated marginal of any variable v is exact, that is,

    P̂^(t)(x_v) ∝ P(x_v).
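For concreteness, here is a minimal Python sketch of how the message updates (2)-(3) and the estimated marginal (4) can be written for discrete variables. It is independent of the provided bprop.py skeleton; the function names, the message dictionary keyed by (sender, receiver) pairs, and the factor-table layout are all illustrative assumptions.

```python
import numpy as np
from itertools import product

# Messages are stored in a dict keyed by (sender, receiver); each message is a
# vector over the states of the variable involved. A factor is described by its
# variable scope (an ordered list) and a table with one axis per scope variable.

def variable_to_factor(v, f, neighbors_of_v, messages, n_states):
    """Update (2): product of incoming factor-to-variable messages, excluding f."""
    msg = np.ones(n_states[v])
    for g in neighbors_of_v[v]:
        if g != f:
            msg *= messages[(g, v)]
    return msg

def factor_to_variable(f, v, scope, table, messages, n_states):
    """Update (3): sum over all scope variables except v of the factor value
    times the incoming variable-to-factor messages."""
    msg = np.zeros(n_states[v])
    axis_v = scope.index(v)
    for assignment in product(*(range(n_states[u]) for u in scope)):
        val = table[assignment]
        for u, x_u in zip(scope, assignment):
            if u != v:
                val *= messages[(u, f)][x_u]
        msg[assignment[axis_v]] += val
    return msg

def estimated_marginal(v, neighbors_of_v, messages):
    """Update (4), followed by normalization."""
    p = np.ones_like(messages[(neighbors_of_v[v][0], v)])
    for g in neighbors_of_v[v]:
        p *= messages[(g, v)]
    return p / p.sum()
```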