Probability Calculus. Chapter 2: From Propositional to Graded Beliefs


Chapter 2
Probability Calculus

Our purpose in this chapter is to introduce probability calculus, show how it can be used to represent uncertain beliefs, and then show how those beliefs can be changed in the face of new information.

2.1 From Propositional to Graded Beliefs

We have seen in the previous chapter that propositional logic provides a valuable tool for representing beliefs about particular situations. But we have also seen that it is most appropriate for representing categorical beliefs. Specifically, given a propositional knowledge base Δ, one can classify each sentence α as either:

Believed: Δ ⊨ α;
Disbelieved: Δ ⊨ ¬α; or
Neither: Δ ⊭ α and Δ ⊭ ¬α.

This coarse classification of sentences, which can be visualized by examining Figure 2.1, is a consequence of the binary classification imposed by the knowledge base Δ on worlds: a world is either possible or impossible depending on whether it satisfies or contradicts Δ.

[Figure 2.1: Three possible relationships between a knowledge base Δ and a sentence α: (a) Δ ⊨ α since Mods(Δ) ⊆ Mods(α); (b) Δ ⊨ ¬α since Mods(Δ) ⊆ Mods(¬α); and (c) Δ ⊭ α and Δ ⊭ ¬α.]

world   Earthquake   Burglary   Alarm   Pr(·)
ω1      true         true       true    .019
ω2      true         true       false   .001
ω3      true         false      true    .056
ω4      true         false      false   .024
ω5      false        true       true    .162
ω6      false        true       false   .018
ω7      false        false      true    .0072
ω8      false        false      false   .7128

Table 2.1: A state of belief, also known as a joint probability distribution.

One can obtain a much finer classification of sentences through a finer classification of worlds. In particular, we can assign a degree of belief or probability in [0, 1] to each world ω and denote it by Pr(ω). The belief in, or probability of, a sentence α can then be defined as:

    Pr(α) ≝ Σ_{ω ⊨ α} Pr(ω),   (2.1)

which induces a continuous classification on sentences. Consider now Table 2.1, which lists a set of worlds and their corresponding degrees of belief. Table 2.1 is known as a state of belief, or a joint probability distribution, and we will require that the beliefs assigned to worlds add up to 1:

    Σ_ω Pr(ω) = 1.

Based on Table 2.1, we have the following beliefs:

    Pr(Earthquake) = Pr(ω1) + Pr(ω2) + Pr(ω3) + Pr(ω4) = .1
    Pr(Burglary)   = .2
    Pr(Alarm)      = .2442

It is relatively straightforward to establish the following properties of beliefs. First, a bound on the belief in any sentence:

    0 ≤ Pr(α) ≤ 1 for any sentence α.   (2.2)

This follows since every degree of belief must be in [0, 1], leading to 0 ≤ Pr(α), and since the beliefs assigned to worlds must add up to 1, leading to Pr(α) ≤ 1. The second property is a baseline for inconsistent sentences:

    Pr(α) = 0 when α is inconsistent.   (2.3)

This follows since there are no worlds that satisfy α. The third property is a baseline for valid sentences:

    Pr(α) = 1 when α is valid.   (2.4)

This follows since α is satisfied by every world.
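To make Equation 2.1 concrete, here is a minimal Python sketch (not part of the original notes): it stores the joint distribution of Table 2.1 as a dictionary over worlds and computes the belief in a sentence by summing the worlds that satisfy it. The helper name `pr` and the representation of sentences as Python predicates are illustrative choices.

```python
# Joint distribution of Table 2.1: each world maps (Earthquake, Burglary, Alarm) -> Pr(world).
joint = {
    (True,  True,  True):  .0190, (True,  True,  False): .0010,
    (True,  False, True):  .0560, (True,  False, False): .0240,
    (False, True,  True):  .1620, (False, True,  False): .0180,
    (False, False, True):  .0072, (False, False, False): .7128,
}
VARS = ("Earthquake", "Burglary", "Alarm")

def pr(sentence):
    """Equation 2.1: Pr(alpha) is the sum of Pr(omega) over worlds omega satisfying alpha.
    A sentence is any predicate mapping a world (a dict of variable assignments) to True/False."""
    total = 0.0
    for values, p in joint.items():
        world = dict(zip(VARS, values))
        if sentence(world):
            total += p
    return total

assert abs(sum(joint.values()) - 1.0) < 1e-9      # beliefs over worlds add up to 1
print(pr(lambda w: w["Earthquake"]))              # ≈ .1
print(pr(lambda w: w["Burglary"]))                # ≈ .2
print(pr(lambda w: w["Alarm"]))                   # ≈ .2442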

[Figure 2.2: The worlds that satisfy α and those that satisfy ¬α form a partition of the set of all worlds.]

[Figure 2.3: The worlds that satisfy α ∨ β can be partitioned into three sets: those satisfying α ∧ ¬β, α ∧ β, and ¬α ∧ β.]

The following property allows one to compute the belief in a sentence given the belief in its negation:

    Pr(α) + Pr(¬α) = 1.   (2.5)

This follows because every world must either satisfy α or satisfy ¬α, but cannot satisfy both; see Figure 2.2. Consider now Table 2.1 for an example and let α : Burglary. We then have:

    Pr(Burglary)  = Pr(ω1) + Pr(ω2) + Pr(ω5) + Pr(ω6) = .2
    Pr(¬Burglary) = Pr(ω3) + Pr(ω4) + Pr(ω7) + Pr(ω8) = .8

The next property allows us to compute the belief in a disjunction:

    Pr(α ∨ β) = Pr(α) + Pr(β) − Pr(α ∧ β).   (2.6)

This identity is best seen by examining Figure 2.3. If we simply add Pr(α) and Pr(β), we will end up summing the beliefs in worlds that satisfy α ∧ β twice. Hence, by subtracting Pr(α ∧ β), we end up accounting for the belief in every world that satisfies α ∨ β only once. Consider Table 2.1 for an example and let α : Earthquake and β : Burglary. We then have:

    Pr(Earthquake) = Pr(ω1) + Pr(ω2) + Pr(ω3) + Pr(ω4) = .1
    Pr(Burglary)   = Pr(ω1) + Pr(ω2) + Pr(ω5) + Pr(ω6) = .2
    Pr(Earthquake ∧ Burglary) = Pr(ω1) + Pr(ω2) = .02
    Pr(Earthquake ∨ Burglary) = .1 + .2 − .02 = .28
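Continuing the illustrative `pr` helper sketched above (an assumption, not code from the notes), properties 2.5 and 2.6 can be checked numerically against Table 2.1:

```python
e = lambda w: w["Earthquake"]
b = lambda w: w["Burglary"]

# Property 2.5: Pr(alpha) + Pr(not alpha) = 1
assert abs(pr(b) + pr(lambda w: not b(w)) - 1.0) < 1e-9

# Property 2.6 (inclusion-exclusion): Pr(a or b) = Pr(a) + Pr(b) - Pr(a and b)
lhs = pr(lambda w: e(w) or b(w))                            # ≈ .28
rhs = pr(e) + pr(b) - pr(lambda w: e(w) and b(w))
assert abs(lhs - rhs) < 1e-9
```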

The belief in a disjunction α ∨ β can also be computed directly from the belief in α and the belief in β:

    Pr(α ∨ β) = Pr(α) + Pr(β) when α and β are mutually exclusive.

In this case, there is no world that satisfies both α and β. Hence, α ∧ β is inconsistent and Pr(α ∧ β) = 0. A related question is whether we can state a non-trivial logical condition on α and β which would permit one to compute the belief in the conjunction α ∧ β in terms of the belief in α and the belief in β. This turns out to be impossible, but we will later present an interesting non-logical condition for this purpose.

We should stress here that the joint probability distribution is usually too large to allow a direct representation as given in Table 2.1. For example, if we have 20 variables and each has two values, the table will have 1,048,576 entries. And if we have 40 variables, the table will have 1,099,511,627,776 entries! We will discuss in the next chapter, however, a key tool, known as a Bayesian network, for efficiently representing the joint probability distribution.

2.1.1 Notational Conventions

Before we move on to the next subject of updating beliefs, we need to settle some notational conventions. First, it is common to replace the conjoin operator (∧) by a comma (,), so we will often write Pr(α, β) instead of Pr(α ∧ β). Next, it is also common to use the term event to refer to a set of worlds. But we will also use this term when referring to a sentence α, since each sentence denotes a set of worlds, Mods(α). Finally, we will denote variables by upper-case letters (A) and their values by lower-case letters (a). Sets of variables will be denoted by bold-face upper-case letters (A) and their instantiations by bold-face lower-case letters (a). For variable A and value a, we will often write a instead of A = a and, hence, Pr(a) instead of Pr(A = a). For a variable A with values true and false, we may use a to denote A = true and ¬a to denote A = false. Therefore, Pr(A), Pr(A = true), and Pr(a) all represent the same quantity in this case. Similarly, Pr(¬A), Pr(A = false), and Pr(¬a) all represent the same quantity.

2.2 Updating Beliefs

Consider again the state of belief in Table 2.1 and suppose that we now know that the variable Alarm has taken the value true. This piece of information is not compatible with the state of belief, since that state ascribes a belief of only .2442 to the Alarm being true. One of the key questions is then how to update the state of belief so it becomes compatible with this new piece of information, which we will refer to as evidence.

More generally, evidence will be represented by an arbitrary sentence, say β, and our goal is to update the state of belief Pr(·) into a new state of belief, which we will denote by Pr(· | β). Given that β is known for sure, we expect the new state of belief Pr(· | β) to assign a belief of 1 to β: Pr(β | β) = 1. This immediately implies that Pr(¬β | β) = 0 and, hence, every world ω that satisfies ¬β must be assigned the belief 0:

    Pr(ω | β) = 0 for all ω ⊨ ¬β.   (2.7)

To completely define the new state of belief Pr(· | β), all we have to do then is define the new belief in every world ω that satisfies β. We already know that the sum of all such beliefs must be 1:

    Σ_{ω ⊨ β} Pr(ω | β) = 1.   (2.8)

But this leaves us with many options for Pr(ω | β) when world ω satisfies β. Since evidence β tells us nothing about worlds that satisfy β, except that the total belief in them should be 1, it is then reasonable to perturb our beliefs in such worlds as little as possible. To this end, we will insist that our relative beliefs in these worlds stay the same:

    Pr(ω) / Pr(ω′) = Pr(ω | β) / Pr(ω′ | β) for all ω, ω′ ⊨ β with Pr(ω′) ≠ 0.   (2.9)

The constraints expressed by Equations 2.8 and 2.9 leave us with only one option for the new beliefs in worlds that satisfy the evidence β:

    Pr(ω | β) = Pr(ω) / Pr(β) for all ω ⊨ β.

That is, the new beliefs in such worlds are just the result of normalizing our old beliefs, with the normalization constant being our old belief in the evidence, Pr(β). Our new state of belief is now completely defined:

    Pr(ω | β) ≝ 0 if ω ⊨ ¬β;  Pr(ω)/Pr(β) if ω ⊨ β.   (2.10)

The new state of belief Pr(· | β) will be called the result of conditioning the old state Pr on evidence β.
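Equation 2.10 translates directly into code. The sketch below (an illustrative extension of the earlier `joint`/`pr` helpers, not code from the notes) builds the conditioned distribution by zeroing out worlds that contradict the evidence and normalizing the rest by Pr(β):

```python
def condition(joint, vars_, evidence):
    """Equation 2.10: return the new distribution Pr(. | beta).
    `evidence` is a predicate over worlds (the sentence beta)."""
    pr_beta = sum(p for values, p in joint.items()
                  if evidence(dict(zip(vars_, values))))
    if pr_beta == 0:
        raise ValueError("conditioning on evidence with probability zero")
    return {
        values: (p / pr_beta if evidence(dict(zip(vars_, values))) else 0.0)
        for values, p in joint.items()
    }

# Condition Table 2.1 on the evidence Alarm = true.
joint_given_alarm = condition(joint, VARS, lambda w: w["Alarm"])
print(joint_given_alarm[(True, True, True)])   # ≈ .019/.2442 ≈ .0778
```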

Consider now the state of belief in Table 2.1 and suppose that the evidence β is Alarm. The result of conditioning this state of belief on Alarm is given in Table 2.2.

world   Earthquake   Burglary   Alarm   Pr(·)    Pr(· | Alarm)
ω1      true         true       true    .019     .019/.2442
ω2      true         true       false   .001     0
ω3      true         false      true    .056     .056/.2442
ω4      true         false      false   .024     0
ω5      false        true       true    .162     .162/.2442
ω6      false        true       false   .018     0
ω7      false        false      true    .0072    .0072/.2442
ω8      false        false      false   .7128    0

Table 2.2: A state of belief and the result of its conditioning on evidence Alarm.

Let us now examine some of the changes in beliefs that are induced by this new evidence. First, our belief in Burglary increases:

    Pr(Burglary) = .2    Pr(Burglary | Alarm) ≈ .741

And so does our belief in Earthquake:

    Pr(Earthquake) = .1    Pr(Earthquake | Alarm) ≈ .307

One can derive a simple closed form for the updated belief in an arbitrary sentence α given evidence β, without having to explicitly compute the belief Pr(ω | β) for every world ω. The derivation is as follows:

    Pr(α | β)
      = Σ_{ω ⊨ α} Pr(ω | β)                                          by Equation 2.1
      = Σ_{ω ⊨ α, ω ⊨ β} Pr(ω | β) + Σ_{ω ⊨ α, ω ⊨ ¬β} Pr(ω | β)     since ω satisfies β or ¬β but not both
      = Σ_{ω ⊨ α, ω ⊨ β} Pr(ω | β)                                   by Equation 2.10
      = Σ_{ω ⊨ α ∧ β} Pr(ω) / Pr(β)                                  by Equation 2.10
      = (1 / Pr(β)) Σ_{ω ⊨ α ∧ β} Pr(ω)
      = Pr(α ∧ β) / Pr(β)                                            by Equation 2.1.

The closed form,

    Pr(α | β) = Pr(α ∧ β) / Pr(β),   (2.11)

is known as Bayes conditioning. Note that the updated state of belief Pr(· | β) is defined only when Pr(β) ≠ 0.
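As a quick check, the closed form of Equation 2.11 can be wrapped around the earlier illustrative `pr` helper (again an assumed sketch, not code from the notes) and used to reproduce the numbers above:

```python
def pr_given(alpha, beta):
    """Bayes conditioning, Equation 2.11: Pr(alpha | beta) = Pr(alpha & beta) / Pr(beta)."""
    pr_beta = pr(beta)
    if pr_beta == 0:
        raise ValueError("Pr(beta) = 0: conditional belief undefined")
    return pr(lambda w: alpha(w) and beta(w)) / pr_beta

alarm = lambda w: w["Alarm"]
print(pr_given(lambda w: w["Burglary"], alarm))     # ≈ .741
print(pr_given(lambda w: w["Earthquake"], alarm))   # ≈ .307
```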

We will usually avoid stating this condition explicitly in the future, but it should be implicitly assumed. Let us now use Bayes conditioning to further examine some of the belief dynamics in our previous example. In particular, here is how some beliefs would change upon accepting the evidence Earthquake:

    Pr(Burglary) = .2       Pr(Burglary | Earthquake) = .2
    Pr(Alarm)    = .2442    Pr(Alarm | Earthquake)    = .75

That is, the belief in Burglary is not changed, but the belief in Alarm increases. Here are some more belief changes, as a reaction to the evidence Burglary:

    Pr(Alarm)      = .2442    Pr(Alarm | Burglary)      = .905
    Pr(Earthquake) = .1       Pr(Earthquake | Burglary) = .1

The belief in Alarm increases in this case, but the belief in Earthquake stays the same. The above belief dynamics are a property of the state of belief in Table 2.1 and may not hold for other states of belief. For example, it is possible to conceive of a reasonable state of belief in which information about Earthquake would change the belief about Burglary, and vice versa. One of the central questions in building automated reasoning systems is that of synthesizing states of belief that are faithful, i.e., states that correspond to the beliefs held by some human expert. We shall study a major technique for synthesizing faithful states of belief in the next chapter.

Before we move on, let us look at one more example of belief change. We know that the belief in Burglary increases when accepting the evidence Alarm. The question, though, is how such a belief would change further upon obtaining more evidence. Here is what happens when we get a confirmation that an Earthquake took place:

    Pr(Burglary | Alarm) ≈ .741    Pr(Burglary | Alarm ∧ Earthquake) ≈ .253

That is, our belief in a Burglary decreases in this case, as we now have an explanation of Alarm. If, on the other hand, we get a confirmation that there was no Earthquake, our belief in Burglary increases even further:

    Pr(Burglary | Alarm) ≈ .741    Pr(Burglary | Alarm ∧ ¬Earthquake) ≈ .957

As it turns out, some of the above belief changes are not accidental, as they are guaranteed by the method used to construct the state of belief in Table 2.1. We will have more to say about this in the next chapter.
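Using the assumed `pr_given` helper from above, these dynamics (including the "explaining away" effect) can be reproduced directly:

```python
b = lambda w: w["Burglary"]
print(pr_given(b, lambda w: w["Alarm"]))                             # ≈ .741
print(pr_given(b, lambda w: w["Alarm"] and w["Earthquake"]))         # ≈ .253
print(pr_given(b, lambda w: w["Alarm"] and not w["Earthquake"]))     # ≈ .957
```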

2.3 Independence

According to the state of belief in Table 2.1, the evidence Burglary does not change the belief in Earthquake:

    Pr(Earthquake) = .1    Pr(Earthquake | Burglary) = .1

Hence, we say in this case that the state of belief Pr finds Earthquake independent of Burglary. More generally, we will say that Pr finds event α independent of event β iff

    Pr(α | β) = Pr(α) when Pr(β) ≠ 0.   (2.12)

Note that the state of belief in Table 2.1 also finds Burglary independent of Earthquake:

    Pr(Burglary) = .2    Pr(Burglary | Earthquake) = .2

It is indeed a general property that Pr must find event α independent of event β if it also finds β independent of α. Independence satisfies many other properties that we shall explore in depth in future chapters.

Independence provides a general condition under which the belief in a conjunction α ∧ β can be expressed in terms of the belief in α and that in β. Specifically, if Pr finds α independent of β, we must have:

    Pr(α ∧ β) = Pr(α)Pr(β),

which follows immediately from the definition of independence and Bayes conditioning (Equation 2.11). The above equation is sometimes taken as the definition of independence, where the equation Pr(α | β) = Pr(α) is viewed as a consequence. We will sometimes use the above equation to stress the symmetry between α and β in the definition of independence.

It is important here to stress the difference between independence and logical disjointness (mutual exclusiveness), as it is common to mix these two notions. Recall that two events α and β are disjoint (mutually exclusive) iff they share no models: Mods(α) ∩ Mods(β) = ∅. That is, they cannot hold together at the same world. On the other hand, events α and β are independent iff Pr(α ∧ β) = Pr(α)Pr(β). Note that disjointness is an objective property of events, while independence is a property of beliefs. Hence, two people with different beliefs may disagree on whether two events are independent, but they cannot disagree on their disjointness.

2.3.1 Conditional Independence

Independence is a dynamic notion. That is, one may find two events independent at some point, but then find them dependent after obtaining some evidence. For example, we have seen earlier how the state of belief in Table 2.1 finds Burglary independent of Earthquake.
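A direct way to see these definitions at work is to test them against Table 2.1 with the assumed helpers from the earlier sketches:

```python
e = lambda w: w["Earthquake"]
b = lambda w: w["Burglary"]
a = lambda w: w["Alarm"]

def independent(x, y, eps=1e-9):
    """Product form of independence: Pr(x & y) = Pr(x) * Pr(y)."""
    return abs(pr(lambda w: x(w) and y(w)) - pr(x) * pr(y)) < eps

print(independent(e, b))   # True: Earthquake and Burglary are independent under Pr
print(independent(b, a))   # False: Burglary and Alarm are dependent under Pr
```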

This state of belief, however, finds these events dependent on each other after accepting the evidence Alarm:

    Pr(Burglary | Alarm) ≈ .741    Pr(Burglary | Alarm ∧ Earthquake) ≈ .253

That is, the evidence Earthquake changes the belief in Burglary in the presence of evidence Alarm. In general, independent events may become dependent given new evidence, and dependent events may become independent given new evidence. This calls for the following more general definition of independence. We say that a state of belief Pr finds event α conditionally independent of event β given event γ iff

    Pr(α | β ∧ γ) = Pr(α | γ) when Pr(β ∧ γ) ≠ 0.   (2.13)

That is, in the presence of evidence γ, the additional evidence β will not change the belief in α. Conditional independence enables the following more general equation for computing the belief in a conjunction:

    Pr(α ∧ β | γ) = Pr(α | γ)Pr(β | γ).

2.3.2 Variable Independence

We will find it useful in the future to talk about independence between sets of variables. In particular, let X, Y and Z be three disjoint sets of variables. We will say that a state of belief Pr finds X independent of Y given Z, denoted I_Pr(X, Z, Y), to mean that Pr finds α independent of β given γ for all sentences α, β and γ that represent states of variables X, Y, and Z, respectively.

Suppose, for example, that X = {A, B}, Y = {C} and Z = {D, E}. The statement I_Pr(X, Z, Y) is then a compact notation for a number of statements about independence, one for each instantiation of these variables: a ∧ b is independent of c given d ∧ e; a ∧ b is independent of c given d ∧ ¬e; ...; ¬a ∧ ¬b is independent of ¬c given ¬d ∧ ¬e. Using the notation we developed in Section 2.1.1, the statement I_Pr(X, Z, Y) is then asserting that

    Pr(x | y, z) = Pr(x | z) when Pr(y, z) ≠ 0

for all instantiations x, y, and z. This is obviously a much more compact notation, which we will use frequently.

2.4 Further Properties of Beliefs

We will discuss in this section some more properties of beliefs that are commonly used. We start with the Chain Rule:

    Pr(α1 ∧ α2 ∧ ... ∧ αn) = Pr(α1 | α2 ∧ ... ∧ αn) Pr(α2 | α3 ∧ ... ∧ αn) ... Pr(αn).

This rule follows from a repeated application of Bayes conditioning. We will find a major use of the chain rule when we discuss Bayesian networks in the following chapter.
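For the three variables of Table 2.1, the chain rule can be checked numerically with the assumed helpers; the ordering Earthquake, Burglary, Alarm is an arbitrary choice:

```python
# Chain rule: Pr(e & b & a) = Pr(e | b & a) * Pr(b | a) * Pr(a)
lhs = pr(lambda w: e(w) and b(w) and a(w))        # = Pr(omega_1) = .019
rhs = (pr_given(e, lambda w: b(w) and a(w))
       * pr_given(b, a)
       * pr(a))
assert abs(lhs - rhs) < 1e-9
```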

The next important property of beliefs is Case Analysis:

    Pr(α) = Σ_{i=1}^{n} Pr(α ∧ βi),   (2.14)

where the events β1, ..., βn are mutually exclusive and exhaustive.¹ Case analysis holds because the models of α ∧ β1, ..., α ∧ βn form a partition of the models of α. Intuitively, case analysis says that we can compute the belief in event α by adding up our beliefs in a number of non-overlapping cases, α ∧ β1, ..., α ∧ βn, that cover all conditions under which α holds. Another version of case analysis is the following:

    Pr(α) = Σ_{i=1}^{n} Pr(α | βi) Pr(βi),   (2.15)

where the events β1, ..., βn are mutually exclusive and exhaustive. This version is obtained from the first one by applying Bayes conditioning, and calls for considering a number of non-overlapping and exhaustive cases, β1, ..., βn. We compute our belief in α under each one of these cases, Pr(α | βi), and then add up these beliefs after applying the weight of each case, Pr(βi). Two simple and useful forms of case analysis are these:

    Pr(α) = Pr(α ∧ β) + Pr(α ∧ ¬β)
    Pr(α) = Pr(α | β)Pr(β) + Pr(α | ¬β)Pr(¬β).

These equations hold because β and ¬β are mutually exclusive and exhaustive. The main value of case analysis is that, in many situations, computing our beliefs in the cases is easier than computing our belief in α. We shall see many examples of this phenomenon in later chapters.

The last property of beliefs we shall consider is known as Bayes Rule or Bayes Theorem:

    Pr(α | β) = Pr(β | α) Pr(α) / Pr(β),   (2.16)

which follows from applying Bayes conditioning twice. The classical usage of this rule is when event α is perceived to be a cause of event β (for example, α is a disease and β is a symptom) and our goal is to assess our belief in the cause given the symptom. It is common for the belief in an effect given its cause, Pr(β | α), to be more readily available than the belief in a cause given one of its effects, Pr(α | β). Hence, this rule allows us to compute the latter from the former.

To consider an example of Bayes rule, suppose that we have a patient who was just tested for a particular disease and the test came out positive. We know that one in every thousand people has this disease. We also know that the test is not reliable: it has a false positive rate of 2% and a false negative rate of 5%.

¹ That is, Mods(βj) ∩ Mods(βk) = ∅ for j ≠ k, and ∪_{i=1}^{n} Mods(βi) is the set of all worlds.

Our goal is then to assess our belief in the patient having the disease, given that the test came out positive. If we let the variable D stand for "the patient has the disease" and the variable T stand for "the test came out positive," our goal is then to compute Pr(D | T).

From the given information, we know that Pr(D) = .001, since one in every thousand has the disease; this is our belief in the patient having the disease before we run any tests. Since the false positive rate of the test is 2%, we know that Pr(T | ¬D) = .02 and, by Equation 2.5, Pr(¬T | ¬D) = .98. Similarly, since the false negative rate of the test is 5%, we know that Pr(¬T | D) = .05 and Pr(T | D) = .95. Using Bayes rule, we now have

    Pr(D | T) = Pr(T | D) Pr(D) / Pr(T) = (.95)(.001) / Pr(T).

The belief in the test coming out positive for an average individual, Pr(T), is not readily available but can be computed using case analysis:

    Pr(T) = Pr(T | D)Pr(D) + Pr(T | ¬D)Pr(¬D) = (.95)(.001) + (.02)(.999) = .02093,

which leads to

    Pr(D | T) = (.95)(.001) / .02093 ≈ 4.54%.

It turns out that if the test's false positive rate is brought down to 2/1000, the above belief in the disease goes up to around 32.2%.
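The same computation in a few lines of Python (a standalone sketch; the variable names are illustrative):

```python
# Given quantities from the problem statement.
p_d = 0.001          # Pr(D): prior belief in the disease
p_t_given_d = 0.95   # Pr(T | D): 1 - false negative rate
p_t_given_nd = 0.02  # Pr(T | ~D): false positive rate

# Case analysis (Equation 2.15): Pr(T) = Pr(T|D)Pr(D) + Pr(T|~D)Pr(~D)
p_t = p_t_given_d * p_d + p_t_given_nd * (1 - p_d)

# Bayes rule (Equation 2.16): Pr(D|T) = Pr(T|D)Pr(D) / Pr(T)
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_d_given_t, 4))   # ≈ 0.0454, i.e., about 4.54%
```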

Another way to solve the above problem is to construct the state of belief completely and then use it to answer queries. This is feasible in this case because we have only two events of interest, T and D, leading to only four worlds:

world   T       D
ω1      true    true
ω2      true    false
ω3      false   true
ω4      false   false

If we can obtain the belief in each one of these worlds, then we are done, since the belief in any other sentence can be computed mechanically using Equations 2.1 and 2.11. To compute the beliefs in the above worlds, we can use the chain rule:

    Pr(ω1) = Pr(T ∧ D)   = Pr(T | D) Pr(D)
    Pr(ω2) = Pr(T ∧ ¬D)  = Pr(T | ¬D) Pr(¬D)
    Pr(ω3) = Pr(¬T ∧ D)  = Pr(¬T | D) Pr(D)
    Pr(ω4) = Pr(¬T ∧ ¬D) = Pr(¬T | ¬D) Pr(¬D).

All of the above quantities are available directly from the problem statement.
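A sketch of this second route, under the same assumed numbers: build the four-world joint with the chain rule, then answer the query with Equations 2.1 and 2.11.

```python
# Worlds are (T, D) pairs; beliefs come from the chain rule Pr(T, D) = Pr(T | D) Pr(D).
disease_joint = {
    (True,  True):  0.95 * 0.001,   # Pr(omega_1) = Pr(T | D) Pr(D)
    (True,  False): 0.02 * 0.999,   # Pr(omega_2) = Pr(T | ~D) Pr(~D)
    (False, True):  0.05 * 0.001,   # Pr(omega_3) = Pr(~T | D) Pr(D)
    (False, False): 0.98 * 0.999,   # Pr(omega_4) = Pr(~T | ~D) Pr(~D)
}

p_t = sum(p for (t, d), p in disease_joint.items() if t)              # Equation 2.1
p_t_and_d = sum(p for (t, d), p in disease_joint.items() if t and d)
print(p_t_and_d / p_t)   # Equation 2.11: Pr(D | T) ≈ 0.0454, matching Bayes rule
```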

2.5 Soft Evidence

There are two types of evidence that one may encounter: hard evidence and soft evidence. Hard evidence is information to the effect that some event has occurred, which is also the type of evidence we have considered earlier. Soft evidence, on the other hand, is not conclusive: we may get an unreliable testimony that event β occurred, which may increase our belief in β, but not to the point where we would consider it certain. One key issue relating to soft evidence is how to specify it. There are two key methods for this, which we will discuss next.

2.5.1 The "All things considered" Method

One method for specifying soft evidence on event β is by stating the new belief in β after the evidence has been accommodated. For example, we would say "given this soft evidence on β, my belief in β becomes .85." Formally, we are stating that Pr′(β) = .85, where Pr′ denotes the new state of belief after accommodating the evidence. This is sometimes known as the "All things considered" method, since the new belief in β depends not only on the new evidence, but also on our old beliefs. That is, the statement Pr′(β) = .85 is not a statement about the evidence itself, but about the result of its integration with our current beliefs.

Given this method of specifying evidence, computing the new state of belief Pr′ can be done along the same principles we used for Bayes conditioning. In particular, suppose that we obtain some soft evidence on event β which leads us to change our belief in β to q. We will denote such evidence by the pair (β, q) and understand it as imposing the following constraint on the new state of belief Pr′:

    Pr′(β) = q,

which immediately gives the additional constraint Pr′(¬β) = 1 − q. Therefore, we know that we must change the beliefs in worlds that satisfy β so these beliefs add up to q. We also know that we must change the beliefs in worlds that satisfy ¬β so they add up to 1 − q. Again, if we insist on preserving the relative beliefs in worlds that satisfy β, and also on preserving the relative beliefs in worlds that satisfy ¬β, we find ourselves committed to the following definition of Pr′:

    Pr′(ω) = (q / Pr(β)) Pr(ω) if ω ⊨ β;  ((1 − q) / Pr(¬β)) Pr(ω) if ω ⊨ ¬β.

That is, we effectively have to scale our beliefs in the worlds satisfying β using the constant q/Pr(β), and similarly for the worlds satisfying ¬β. There is also a useful closed form for the above definition, which can be derived similarly to Equation 2.11:

    Pr′(α) = q Pr(α ∧ β)/Pr(β) + (1 − q) Pr(α ∧ ¬β)/Pr(¬β),   (2.17)

where Pr′ is the new state of belief after accommodating the soft evidence (β, q). This method of updating a state of belief in the face of soft evidence is known as Jeffrey's Rule. Note that Bayes conditioning is a special case of Jeffrey's rule when q = 1, which is to be expected as they were both derived using the same principle.

Jeffrey's rule has a simple generalization to the case where the evidence concerns a set of mutually exclusive and exhaustive events β1, ..., βn, where the new beliefs in these events are q1, ..., qn, respectively. This soft evidence can be accommodated using the following generalized version of Jeffrey's rule:

    Pr′(α) = Σ_{i=1}^{n} qi Pr(α ∧ βi)/Pr(βi).   (2.18)

Consider the following example, due to Jeffrey. Assume that we are given a piece of cloth C, where its color can be one of: green (c_g), blue (c_b), or violet (c_v). We want to know whether, on the next day, the cloth will be sold (s) or not sold (¬s). Our original state of belief is as follows:

world   S    C     Pr(·)
ω1      s    c_g   .12
ω2      s    c_b   .12
ω3      s    c_v   .32
ω4      ¬s   c_g   .18
ω5      ¬s   c_b   .18
ω6      ¬s   c_v   .08

Therefore, our original belief in the cloth being sold is Pr(s) = .56. Moreover, our original beliefs in the colors c_g, c_b, c_v are .3, .3, and .4, respectively. Assume that we now inspect the cloth by candlelight, and we conclude that our new beliefs in these colors should be .7, .25, and .05, respectively.

If we apply Jeffrey's rule, we get the following new state of belief:

world   S    C     Pr′(·)
ω1      s    c_g   .28
ω2      s    c_b   .10
ω3      s    c_v   .04
ω4      ¬s   c_g   .42
ω5      ¬s   c_b   .15
ω6      ¬s   c_v   .01

Therefore, our new belief in the cloth being sold is now Pr′(s) = .42. We can also obtain this result using the closed form given by Equation 2.18:

    Pr′(s) = (.7)(.12)/.3 + (.25)(.12)/.3 + (.05)(.32)/.4 = .28 + .10 + .04 = .42.
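A minimal sketch of Jeffrey's rule for this example (the dictionary layout and variable names are assumptions, not from the notes):

```python
# Cloth example: worlds are (sold, color) pairs with their original beliefs.
cloth = {
    (True,  "green"): .12, (True,  "blue"): .12, (True,  "violet"): .32,
    (False, "green"): .18, (False, "blue"): .18, (False, "violet"): .08,
}
new_color_beliefs = {"green": .70, "blue": .25, "violet": .05}   # the soft evidence (beta_i, q_i)

old_color_beliefs = {c: sum(p for (s, col), p in cloth.items() if col == c)
                     for c in new_color_beliefs}                 # .3, .3, .4

# Jeffrey's rule on worlds: scale each world by q_i / Pr(beta_i) for its color.
cloth_new = {(s, c): p * new_color_beliefs[c] / old_color_beliefs[c]
             for (s, c), p in cloth.items()}

print(sum(p for (s, c), p in cloth_new.items() if s))   # Pr'(s) ≈ .42
```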

2.5.2 The "Nothing else considered" Method

The second method for specifying soft evidence on event β is based on declaring the strength of this evidence, independently of currently held beliefs. In particular, let us define the odds of event β as follows:

    O(β) ≝ Pr(β) / Pr(¬β).   (2.19)

That is, an odds of 1 indicates that we believe β and ¬β equally, while an odds of 10 indicates that we believe β ten times more than we believe ¬β. Given the notion of odds, we can specify soft evidence on event β by declaring the relative change it induces on the odds of β, that is, by specifying the ratio O′(β)/O(β), where O′(β) is the odds of β after accommodating the evidence, Pr′(β)/Pr′(¬β). The ratio O′(β)/O(β) is known as the Bayes factor. Hence, a Bayes factor of 1 indicates neutral evidence, while a Bayes factor of 2 indicates evidence on β strong enough to double the odds of β. This method of specifying evidence is sometimes known as the "Nothing else considered" method, as it is a statement about the strength of the evidence without any reference to the initial state of belief, since the Bayes factor does not constrain the initial state of belief.

Suppose now that we obtain soft evidence on β whose strength is given by a Bayes factor of k, and our goal is to compute the new state of belief Pr′ that results from accommodating this evidence. If we are able to translate this evidence into a form which is accepted by Jeffrey's rule, then we can use that rule to compute Pr′. This turns out to be possible, as we describe next. First, from the constraint O′(β)/O(β) = k, we get:

    Pr′(β) = k Pr(β) / (k Pr(β) + Pr(¬β)).

Hence, we can view this as a problem of updating the initial state of belief Pr using Jeffrey's rule and the soft evidence given above. That is, what we have done is translate a "Nothing else considered" specification of soft evidence (a constraint on O′(β)/O(β)) into an "All things considered" specification (a constraint on Pr′(β)). Computing Pr′ using Jeffrey's rule and the above soft evidence, we get:

    Pr′(α) = (k Pr(α ∧ β) + Pr(α ∧ ¬β)) / (k Pr(β) + Pr(¬β)),   (2.20)

where Pr′ is the new state of belief after accommodating soft evidence on event β with a Bayes factor of k. Note that Bayes conditioning is a special case of the above rule, obtained when the Bayes factor tends to infinity. Note also that the difference between Equations 2.17 and 2.20 is only in the way soft evidence is specified. The first rule expects the evidence to be specified as a pair (β, q), where q is the new belief in event β, Pr′(β) = q. The second rule expects the evidence to be specified as a pair (β, k), where k = O′(β)/O(β) is a Bayes factor that quantifies the strength of the evidence.

Consider now the following example, due to Pearl, which concerns the alarm of Mr. Holmes' house and the potential of a burglary. The initial state of belief is a joint distribution over the variables Alarm and Burglary, with worlds ω1: Alarm = true, Burglary = true; ω2: Alarm = true, Burglary = false; ω3: Alarm = false, Burglary = true; and ω4: Alarm = false, Burglary = false. One day, Mr. Holmes receives a call from his neighbor, Mrs. Gibbons, saying that she may have heard the alarm of his house going off. Since Mrs. Gibbons suffers from a hearing problem, Mr. Holmes concludes that her testimony increases the odds of the alarm going off by a factor of 4: O′(Alarm)/O(Alarm) = 4. Our goal now is to compute the new belief in a burglary taking place, Pr′(Burglary). Using Equation 2.20 with α : Burglary, β : Alarm, and k = 4, we get:

    Pr′(Burglary) = (4 Pr(Alarm ∧ Burglary) + Pr(¬Alarm ∧ Burglary)) / (4 Pr(Alarm) + Pr(¬Alarm)).

There is a generalization of Equation 2.20 to the case where the soft evidence bears on a set of mutually exclusive and exhaustive events β1, ..., βn. This generalization requires that we define the odds of event βi to event βj:

    O(βi, βj) ≝ Pr(βi) / Pr(βj).

The odds of β, O(β), is then a special case, since O(β, ¬β) = O(β). Given this more general notion of odds, soft evidence bearing on a set of mutually exclusive and exhaustive events β1, ..., βn can then be specified using a set of numbers λ1, ..., λn, with the following interpretation:

    O′(βi, βj) / O(βi, βj) = λi / λj.   (2.21)

That is, we are specifying the relative increase in the odds of βi to βj for every pair of events. Note here that the specific numbers λ1, ..., λn do not matter; only their ratios are important.² Furthermore, each ratio λi/λj is known as the Bayes factor for the events βi and βj. Hence, the numbers λ1, ..., λn are indirectly specifying a set of Bayes factors, one for each pair of distinct events in β1, ..., βn.³

Given the constraints imposed by Equation 2.21 on the new state of belief, one can show that the new belief in any of the events βj must be given by:⁴

    Pr′(βj) = λj Pr(βj) / Σ_{i=1}^{n} λi Pr(βi).

Using these new beliefs and Jeffrey's rule (Equation 2.18), we get the following rule:

    Pr′(α) = Σ_{i=1}^{n} λi Pr(α ∧ βi) / Σ_{i=1}^{n} λi Pr(βi).   (2.22)

This generalizes Equation 2.20, which falls out as a special case when n = 2, β1 = β, β2 = ¬β, λ1 = k, and λ2 = 1. Again, note that the difference between Equation 2.18 and Equation 2.22 is only in the way soft evidence is specified. The first equation expects evidence in the form of a set of mutually exclusive and exhaustive events β1, ..., βn and a corresponding set of beliefs q1, ..., qn, which are interpreted as constraints of the form Pr′(βi) = qi. The second rule expects a different set of numbers λ1, ..., λn, which are interpreted as constraints of the form O′(βi, βj)/O(βi, βj) = λi/λj. Again, we will provide another, possibly more intuitive, interpretation of the numbers λ1, ..., λn in the following section.

To illustrate Equation 2.22, consider the cloth example we discussed in the previous section, and suppose that we have some soft evidence on the mutually exclusive and exhaustive events C = c_g, C = c_b, and C = c_v whose strength is quantified by λ_g : λ_b : λ_v = 7 : 2.5 : .375. We can compute our new belief in the event S = s using Equation 2.22 as follows:

    Pr′(S = s) = ((7)(.12) + (2.5)(.12) + (.375)(.32)) / ((7)(.3) + (2.5)(.3) + (.375)(.4)) = .42.

The above evidence strength was in fact chosen carefully so it leads to the same state of belief Pr′ that we arrived at using Jeffrey's rule.

² See the following section for another interpretation of these ratios.
³ For the special case of n = 2, β2 must be equivalent to ¬β1, O′(β1, β2)/O(β1, β2) = O′(β1)/O(β1), and the ratio λ1/λ2 is then the Bayes factor k discussed earlier.
⁴ This can be shown as follows. By unfolding O′ and O in Equation 2.21, we get Pr′(βi)/λi Pr(βi) = Pr′(βj)/λj Pr(βj). Hence, Pr′(β1)/λ1 Pr(β1) = ... = Pr′(βn)/λn Pr(βn) = k for some constant k. This leads to Pr′(βi) = k λi Pr(βi) for i = 1, ..., n. Adding up the two sides of this equation for i = 1, ..., n, we get 1 = k Σ_{i=1}^{n} λi Pr(βi). For any particular βj, we finally get Pr′(βj) = λj Pr(βj) / Σ_{i=1}^{n} λi Pr(βi).
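The same computation with Equation 2.22, reusing the assumed `cloth` dictionary from the earlier Jeffrey's-rule sketch:

```python
lam = {"green": 7.0, "blue": 2.5, "violet": 0.375}   # the lambda_i (only their ratios matter)

numerator = sum(lam[c] * p for (s, c), p in cloth.items() if s)    # sum_i lambda_i Pr(s & c_i)
denominator = sum(lam[c] * p for (s, c), p in cloth.items())       # sum_i lambda_i Pr(c_i)
print(numerator / denominator)   # Pr'(s) ≈ .42, matching Jeffrey's rule
```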

2.5.3 The Virtual Evidence Method

Equation 2.22 has an alternative semantics based on the notion of virtual evidence, which we explain next. Suppose that we have soft evidence bearing on the set of mutually exclusive and exhaustive events β1, ..., βn. We can model this evidence explicitly by augmenting our language with a new propositional variable V, which represents the event of receiving this soft evidence; for example, V could represent the event of receiving a call from our unreliable neighbor saying that the alarm in our house went off. We can now quantify the strength of this evidence by specifying the probability that we will receive it given each of the events βi:

    Pr(V | βi) = λi, for i = 1, ..., n.

The new state of belief Pr′, after the soft evidence has been accommodated, is now given by Pr(· | V). If we also assume that V is independent of every other event given βi, for i = 1, ..., n, we can then show that Pr(· | V) is indeed equal to Pr′ as given by Equation 2.22. This method is known as the method of virtual evidence, as it is based on introducing a new virtual variable V, which allows one to model the soft evidence in terms of hard evidence on V, where the relationship of V to the events β1, ..., βn is uncertain. Moreover, this uncertainty is captured explicitly using the numbers λ1, ..., λn. According to this method, the ratios of these numbers,

    Pr(V | βi) / Pr(V | βj) = λi / λj,

are interpreted as the odds of receiving evidence V given event βi to receiving it given event βj. Note that these ratios were called Bayes factors in the previous section. We have seen earlier that when the soft evidence bears only on β and ¬β, its strength can be specified using the Bayes factor k = O′(β)/O(β), which corresponds to a setting of λ1 = k and λ2 = 1. This very common case can then be handled using virtual evidence by simply ensuring that:

    Pr(V | β) / Pr(V | ¬β) = k.

The method of virtual evidence is quite important practically, as it allows us to integrate soft evidence using the tools developed for hard evidence. We will rely on this method for accommodating soft evidence in future chapters.
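To illustrate, here is a hedged sketch (my own construction, not from the notes) that models Mrs. Gibbons' call as a virtual-evidence variable V on top of a hypothetical joint over Alarm and Burglary, and checks that conditioning on V matches Equation 2.20 with k = 4. The prior numbers and the two likelihoods for V are made up for illustration; only their ratio matters.

```python
# Hypothetical prior over (Alarm, Burglary); the numbers are illustrative only.
prior = {
    (True,  True):  0.00095, (True,  False): 0.00995,
    (False, True):  0.00005, (False, False): 0.98905,
}
assert abs(sum(prior.values()) - 1.0) < 1e-9

# Virtual evidence: Pr(V | Alarm) / Pr(V | ~Alarm) = 4 (the Bayes factor k).
pv_given_alarm, pv_given_no_alarm = 0.4, 0.1

# Extend each world with V using Pr(V | world) = Pr(V | Alarm-value), then condition on V.
weighted = {w: p * (pv_given_alarm if w[0] else pv_given_no_alarm) for w, p in prior.items()}
pv = sum(weighted.values())
posterior = {w: q / pv for w, q in weighted.items()}
pr_burglary_new = sum(q for (alarm, burg), q in posterior.items() if burg)

# Equation 2.20 with k = 4 gives the same answer.
k = 4
num = k * prior[(True, True)] + prior[(False, True)]
den = k * (prior[(True, True)] + prior[(True, False)]) + (prior[(False, True)] + prior[(False, False)])
assert abs(pr_burglary_new - num / den) < 1e-9
print(pr_burglary_new)
```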


4 Derivations in the Propositional Calculus 4 Derivations in the Propositional Calculus 1. Arguments Expressed in the Propositional Calculus We have seen that we can symbolize a wide variety of statement forms using formulas of the propositional

More information

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw Applied Logic Lecture 1 - Propositional logic Marcin Szczuka Institute of Informatics, The University of Warsaw Monographic lecture, Spring semester 2017/2018 Marcin Szczuka (MIMUW) Applied Logic 2018

More information

Formal Logic. Critical Thinking

Formal Logic. Critical Thinking ormal Logic Critical hinking Recap: ormal Logic If I win the lottery, then I am poor. I win the lottery. Hence, I am poor. his argument has the following abstract structure or form: If P then Q. P. Hence,

More information

CS206 Lecture 21. Modal Logic. Plan for Lecture 21. Possible World Semantics

CS206 Lecture 21. Modal Logic. Plan for Lecture 21. Possible World Semantics CS206 Lecture 21 Modal Logic G. Sivakumar Computer Science Department IIT Bombay siva@iitb.ac.in http://www.cse.iitb.ac.in/ siva Page 1 of 17 Thu, Mar 13, 2003 Plan for Lecture 21 Modal Logic Possible

More information

Reasoning about uncertainty

Reasoning about uncertainty Reasoning about uncertainty Rule-based systems are an attempt to embody the knowledge of a human expert within a computer system. Human knowledge is often imperfect. - it may be incomplete (missing facts)

More information

Lifted Inference: Exact Search Based Algorithms

Lifted Inference: Exact Search Based Algorithms Lifted Inference: Exact Search Based Algorithms Vibhav Gogate The University of Texas at Dallas Overview Background and Notation Probabilistic Knowledge Bases Exact Inference in Propositional Models First-order

More information

Uncertainty. Chapter 13

Uncertainty. Chapter 13 Uncertainty Chapter 13 Outline Uncertainty Probability Syntax and Semantics Inference Independence and Bayes Rule Uncertainty Let s say you want to get to the airport in time for a flight. Let action A

More information

Equivalent Forms of the Axiom of Infinity

Equivalent Forms of the Axiom of Infinity Equivalent Forms of the Axiom of Infinity Axiom of Infinity 1. There is a set that contains each finite ordinal as an element. The Axiom of Infinity is the axiom of Set Theory that explicitly asserts that

More information

Discrete Probability and State Estimation

Discrete Probability and State Estimation 6.01, Fall Semester, 2007 Lecture 12 Notes 1 MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.01 Introduction to EECS I Fall Semester, 2007 Lecture 12 Notes

More information

Logic. Readings: Coppock and Champollion textbook draft, Ch

Logic. Readings: Coppock and Champollion textbook draft, Ch Logic Readings: Coppock and Champollion textbook draft, Ch. 3.1 3 1. Propositional logic Propositional logic (a.k.a propositional calculus) is concerned with complex propositions built from simple propositions

More information

Modeling and reasoning with uncertainty

Modeling and reasoning with uncertainty CS 2710 Foundations of AI Lecture 18 Modeling and reasoning with uncertainty Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square KB systems. Medical example. We want to build a KB system for the diagnosis

More information

Sample Space: Specify all possible outcomes from an experiment. Event: Specify a particular outcome or combination of outcomes.

Sample Space: Specify all possible outcomes from an experiment. Event: Specify a particular outcome or combination of outcomes. Chapter 2 Introduction to Probability 2.1 Probability Model Probability concerns about the chance of observing certain outcome resulting from an experiment. However, since chance is an abstraction of something

More information

CMPT Machine Learning. Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th

CMPT Machine Learning. Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th CMPT 882 - Machine Learning Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th Stephen Fagan sfagan@sfu.ca Overview: Introduction - Who was Bayes? - Bayesian Statistics Versus Classical Statistics

More information

HANDOUT AND SET THEORY. Ariyadi Wijaya

HANDOUT AND SET THEORY. Ariyadi Wijaya HANDOUT LOGIC AND SET THEORY Ariyadi Wijaya Mathematics Education Department Faculty of Mathematics and Natural Science Yogyakarta State University 2009 1 Mathematics Education Department Faculty of Mathematics

More information

Proof Techniques (Review of Math 271)

Proof Techniques (Review of Math 271) Chapter 2 Proof Techniques (Review of Math 271) 2.1 Overview This chapter reviews proof techniques that were probably introduced in Math 271 and that may also have been used in a different way in Phil

More information

13.4 INDEPENDENCE. 494 Chapter 13. Quantifying Uncertainty

13.4 INDEPENDENCE. 494 Chapter 13. Quantifying Uncertainty 494 Chapter 13. Quantifying Uncertainty table. In a realistic problem we could easily have n>100, makingo(2 n ) impractical. The full joint distribution in tabular form is just not a practical tool for

More information

CHAPTER 4 CLASSICAL PROPOSITIONAL SEMANTICS

CHAPTER 4 CLASSICAL PROPOSITIONAL SEMANTICS CHAPTER 4 CLASSICAL PROPOSITIONAL SEMANTICS 1 Language There are several propositional languages that are routinely called classical propositional logic languages. It is due to the functional dependency

More information

Formalizing Probability. Choosing the Sample Space. Probability Measures

Formalizing Probability. Choosing the Sample Space. Probability Measures Formalizing Probability Choosing the Sample Space What do we assign probability to? Intuitively, we assign them to possible events (things that might happen, outcomes of an experiment) Formally, we take

More information

Basics of Probability

Basics of Probability Basics of Probability Lecture 1 Doug Downey, Northwestern EECS 474 Events Event space E.g. for dice, = {1, 2, 3, 4, 5, 6} Set of measurable events S 2 E.g., = event we roll an even number = {2, 4, 6} S

More information

Belief revision: A vade-mecum

Belief revision: A vade-mecum Belief revision: A vade-mecum Peter Gärdenfors Lund University Cognitive Science, Kungshuset, Lundagård, S 223 50 LUND, Sweden Abstract. This paper contains a brief survey of the area of belief revision

More information

UNCERTAINTY. In which we see what an agent should do when not all is crystal-clear.

UNCERTAINTY. In which we see what an agent should do when not all is crystal-clear. UNCERTAINTY In which we see what an agent should do when not all is crystal-clear. Outline Uncertainty Probabilistic Theory Axioms of Probability Probabilistic Reasoning Independency Bayes Rule Summary

More information

Discrete Probability and State Estimation

Discrete Probability and State Estimation 6.01, Spring Semester, 2008 Week 12 Course Notes 1 MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.01 Introduction to EECS I Spring Semester, 2008 Week

More information

KB Agents and Propositional Logic

KB Agents and Propositional Logic Plan Knowledge-Based Agents Logics Propositional Logic KB Agents and Propositional Logic Announcements Assignment2 mailed out last week. Questions? Knowledge-Based Agents So far, what we ve done is look

More information