An objective definition of subjective probability


Nico Roos 1

Abstract. Several attempts have been made to give an objective definition of subjective probability. These attempts can be divided into two approaches. The first approach uses an a priori probability distribution over the set of interpretations of the language that we are using to describe information. The idea is to define such an a priori probability distribution using some general principles, such as the insufficient reason principle of Bernoulli and Laplace. The second approach does not start from a set of interpretations among which we try to find the one describing the world, but instead tries to build a partial model of the world. Uncertainty in the available information results in several possible partial models, each presenting a different view of the world. Using the insufficient reason principle, a probability is assigned to each view. This paper will present arguments for using the second approach instead of the first. Furthermore, a new formalization of the second approach, solving the problems of earlier attempts, will be given.

1 Introduction

Several attempts have been made to give an objective definition of subjective probability. These attempts can be divided into two approaches. The first approach uses an a priori probability distribution on the set of interpretations of the language that we are using to describe information [1, 4, 2, 6]. The idea is to define such an a priori probability distribution using some general principles. The second approach does not start from a set of interpretations among which we try to find the one describing the world, but instead tries to build a partial model of the world [7, 8, 9]. Uncertainty in the available information results in several possible partial models, each presenting a different view of the world. Using the insufficient reason principle of Bernoulli and Laplace, a probability is assigned to each view.
The two approaches can be characterized by two children's games. The approach in which we start with a probability distribution over the set of interpretations can be characterized by the game "Who is it?". The goal of this game is to identify a person from a set of candidates by asking questions. Based on the answers, we eliminate some of the candidates. If we were to define an a priori probability distribution over the set of candidates, we could answer questions such as: what is the chance that the person we try to identify has blue eyes?

The other approach can be characterized as solving a jigsaw puzzle. When we receive some information, this information may represent two or more pieces of which one belongs to the puzzle. Each of these pieces results in a different view on how to complete the puzzle. We may also receive information representing a piece of the puzzle of which we do not know where to put it into the puzzle. There might be more than one position where it could fit. This again can result in different views on how to complete the puzzle. If we have no information to prefer one of these views, then, using the insufficient reason principle of Bernoulli and Laplace, we might consider all these views on how to complete the puzzle as being equally likely. So the uncertainty expresses our lack of knowledge.

For the first approach, one also uses the insufficient reason principle. It is used in the definition of the a priori probability distribution. Can we argue, however, that the candidates should be equally likely? Suppose, for example, that we have the following information about the person to be identified. The person has blue or green eyes. If this person has green eyes, his/her hair is blond.

1 Maastricht University, Department of Computer Science, P.O. Box 616, 6200 MD Maastricht, The Netherlands, e-mail: roos@cs.unimaas.nl
If there are as many candidates with blue as with green eyes, what should be the probability that the person to be identified has green eyes? If all candidates are equally likely, and if there are candidates for each combination of eye color and hair color, then the probability that the person to be identified has green eyes will be less than 0.5. This outcome is perfectly valid for this game, but is it for an agent receiving information about the world? The fact that we do not know the color of the person's hair if he/she has blue eyes can hardly be a reason to consider having blue eyes to be more likely. It would imply that the likelihood of a possible situation is proportional to the lack of information about this situation.

The heart of the problem is that in the first approach, we assume that all worlds are really possible. The only thing that we do not know is which world has been selected for today. The name of the game is, however, that there is one fixed world of which we have to determine what it looks like. Therefore, we should view information as pieces of a puzzle. Some information, such as the color of the person's eyes, represents two pieces, one of which belongs to the puzzle. So, we can have different views on how to complete the puzzle. As a result, uncertainty arises. Furthermore, information stating that the person has blond hair if s/he has green eyes is a piece that we can put into the puzzle in one view, but which is not a piece of the puzzle in the other view. So, we have more information about one possible view than we have about the other. This should not influence the likelihood of the two views.

2 The probability distribution

Since there is only one fixed world, we cannot define a probability distribution over a set of possible worlds. As was pointed out in the introduction, we can only define a probability distribution over the set of views we have about the world.
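The contrast between the two approaches in the eye-colour example can be checked numerically. The following is a minimal sketch; the colour labels and variable names are illustrative, not from the paper:

```python
from fractions import Fraction

# First approach: a uniform a priori distribution over all interpretations
# consistent with the premises "blue or green eyes" and "green -> blond".
interpretations = [(eyes, hair)
                   for eyes in ["blue", "green"]
                   for hair in ["blond", "black"]
                   if not (eyes == "green" and hair != "blond")]
p_green_interp = Fraction(sum(1 for e, _ in interpretations if e == "green"),
                          len(interpretations))

# Second approach: two mutually exclusive views (blue vs. green); the extra
# hair information inside the green view does not change its likelihood.
views = ["blue", "green"]
p_green_views = Fraction(1, len(views))

print(p_green_interp)  # 1/3: blue is favoured merely by our ignorance of hair
print(p_green_views)   # 1/2
```

The interpretation-counting approach yields 1/3 < 1/2 for green eyes, exactly the artefact criticized above, while the view-counting approach yields 1/2.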
Therefore, we must now address the question concerning the requirements such a probability distribution must meet. First of all, views must be mutually exclusive. Since views are in fact epistemic states, it might be possible to combine the information of two different views in one more informative view. In other words, two different views need not give incompatible descriptions of the world. Since such views cannot be considered as being mutually exclusive, we must exclude them. If views are mutually exclusive and if we have no reason to prefer one description of the world to another, all views should be equally likely. Note that the information content of a view cannot be an issue, since we cannot talk about the set of worlds described by a view. There is only one world.

New information described by an implication may change the likelihood of a view. If the consequent of the implication consists of a disjunction, a view may be replaced by two or more new views. For example: If the person has blue eyes, s/he has black or blond hair. The above implication describes two possible ways to extend the information content of a view. So, one view is replaced by two new views. In our example, we go from two views to three views. Since we still have no reason to prefer one view to another, the probability distribution changes.

Now consider the following slightly modified example. A man or a woman is standing in front of the house. If it is a woman, her eyes are blue or green. If it is a man, his eyes are blue. This example also describes three views of the world. Our intuition tells us here that the probability that the person in front of the house is a man should be the same as the probability that this person is a woman. Since this example appears to be similar to the previous example, why are the views not equally likely? The difference between the two examples is that in the first example we have one person to be identified, while in the last example we have two persons, a man and a woman. If the person in front of the house is a woman, then there is uncertainty concerning the color of her eyes. How do we represent this information in one or more views? In a view of the world, the woman will be represented by some object.

(© 1998 N. Roos. ECAI 98, 13th European Conference on Artificial Intelligence, edited by Henri Prade. Published in 1998 by John Wiley & Sons, Ltd.)
Since an object can, in principle, denote any entity in the world, we need a way to indicate that an object in one view representing a woman having blue eyes denotes the same entity as an object in another view representing the same woman having green eyes. We can realize this by introducing the object denoting the woman in a common parent-view of the two views. Naturally, objects that are present in both parent- and child-view should denote the same entity in the world. Since a parent-view can be interpreted as the intersection of its child-views, two of the three views described by our example must be grouped in order to represent that we are talking about one woman. The resulting hierarchy presents the preferences between the views.

Just as describing a single entity can result in a hierarchy of views, so can describing a group of entities. Suppose, for example, that 10 or 15 persons are waiting in a room. If we also know that each person has brown or blond hair, then we again get a hierarchy of views, where the root view consists of two sub-views, one containing 10 persons and one containing 15 persons. Each sub-view is also divided into sub-views which describe the possible properties of the objects. Now suppose that we know that 80% of the persons have brown hair. Then there will be respectively 8 or 12 persons having brown hair. So the view with 10 persons will be divided into (10 choose 8) = 45 sub-views, and the view with 15 persons will be divided into (15 choose 12) = 455 sub-views. Although there are more views with 15 persons, the presence of 15 persons in the room is not more likely than the presence of 10 persons. This result corresponds with our intuitions.

3 The language

We assume a slightly modified first order language L. This language is recursively defined from a set of atomic predicates P, a set of constants C, a set of variables X, a set of objects 𝕆, the operators ¬, ∧, ∨ and →, and the quantified descriptions [Q x ρ].
The quantified description corresponds with a quantifier and a variable in standard first order logic; e.g. ∀x. In the quantified description [Q x ρ], Q represents one of the quantifiers ∀, ∃, (%p) or (#n), where p ∈ [0, 1] and n ∈ ℕ; x is a variable and ρ ∈ L is a formula denoting the reference class. The reference class is optional. If it is absent, all objects belong to the reference class. The premises used in the reasoning process are a finite set of formulas without free variables or objects.

4 Handling uncertainty through partial model construction

The general idea is to approximate the model of the world by constructing a partial model using information about the world. In this way we try to solve the puzzle which is represented by a complete model of the world. Unfortunately, some kinds of information, such as disjunctions and statistical information, allow us to construct different partial models, resulting in uncertainty about what holds in the world.

Definition 1. Let 𝕆 be the set of all possible objects. A partial model M is a tuple ⟨O, F⟩ where O ⊆ C ∪ 𝕆 is a set of objects and F ⊆ L is a set of formulas. Furthermore, any object or constant occurring in a formula of F is an element of O.

Notice that we allow the set of objects O to contain constants as well. This choice avoids the need to specify a denotation of the constants. As a consequence, we will not be able to represent that two constants denote the same object. This may seem worse than it is. Constants are used by an agent for names of objects and for values of sensors. If we say that Mary has blue eyes, we mean that an object named Mary has eye color blue. The person named Mary may have other names, and there may even be other persons having the same name. In all circumstances, Mary is different from Julia, although one person may have both names.

How do we actually construct a partial model? As we have seen above, some information may not uniquely describe a part of the world.
This results in different views of how the world may look. To make things more complicated, statistical information can divide a view into sub-views. Therefore, the partial models must be organized into a tree. In this tree, each node represents a view and its children represent the sub-views. The leaves of the tree consist of the actual partial models. We will call this tree a hierarchical model. The root of a hierarchical model must, of course, satisfy the premises. Furthermore, a formula satisfied by a node (a view) must also be satisfied by its children.

Definition 2. Let V be a view. V is either a partial model M or a set of views {W1, ..., Wk}.
O(V) = ∩ {O | ⟨O, F⟩ ∈ leaves(V)}.
F(V) = ∩ {F | ⟨O, F⟩ ∈ leaves(V)}.

Definition 3. Let φ be a formula and let V be a view. V satisfies φ, V ⊨ φ, is defined in the following way.

V ⊨ φ if for some ψ ∈ L, V ⊨ ψ and V ⊨ ¬ψ.
V ⊨ φ if φ ∈ F(V).
V ⊨ φ1 ∧ φ2 if V ⊨ φ1 and V ⊨ φ2.
V ⊨ ¬(φ1 ∧ φ2) if V ⊨ ¬φ1 or V ⊨ ¬φ2.
V ⊨ φ1 ∨ φ2 if V ⊨ φ1 or V ⊨ φ2.
V ⊨ ¬(φ1 ∨ φ2) if V ⊨ ¬φ1 and V ⊨ ¬φ2.
V ⊨ [∃ x ρ]φ if V = {W1, ..., Wk}, R = {o ∈ O(V) | for each 1 ≤ i ≤ k: Wi ⊨ ρ[x = o]} (R = O(V) if ρ is absent), and for some o ∈ R: Wi ⊨ φ[x = o] for each 1 ≤ i ≤ k.
V ⊨ ¬[Q x ρ]φ if V = {W1, ..., Wk}, R = {o ∈ O(V) | for each 1 ≤ i ≤ k: Wi ⊨ ρ[x = o]} (R = O(V) if ρ is absent), and: if Q = #n, then |{o ∈ R | for each 1 ≤ i ≤ k: Wi ⊨ φ[x = o]}| > n; if Q = ∀, then for some o ∈ R: Wi ⊨ ¬φ[x = o] for each 1 ≤ i ≤ k.
V ⊨ φ if V = {W1, ..., Wk} and for each W ∈ V, W ⊨ φ.

Partial models can be ordered with respect to the amount of information that they contain. Here, we will use an information ordering according to which a partial model N contains at least the same information as a partial model M if and only if N satisfies every formula of M. There is, however, a small problem that we have to deal with. Objects that are not represented by constants are just arbitrary names for entities in the world. Therefore, different partial models can use different names to denote the same entity. To relate the objects of one partial model to the objects of another partial model, we need a mapping of the objects of one model to the objects of the other model.

Definition 4. Let M = ⟨O, F⟩ and N = ⟨O′, F′⟩ be two partial models. Furthermore, let f : 𝕆 → (𝕆 ∪ C) be a mapping of the objects of M to the objects of N. N contains at least the same information as M given f, M ⊑f N, if and only if for each φ ∈ F there holds that N ⊨ φ[f].

We can extend the above defined information ordering to an information ordering on views. In this information ordering, we must take into account that sub-views can be deleted because of new information. Furthermore, views with fewer hierarchical levels are considered to contain less information.
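The propositional core of this satisfaction relation can be sketched in runnable form. The tuple encoding of formulas is an assumption made for illustration, and the quantified and inconsistency clauses are omitted:

```python
# A sketch of Definitions 2 and 3, restricted to their propositional clauses.
# Formulas are encoded as string atoms or nested tuples such as
# ("and", f1, f2), ("or", f1, f2), ("not", f); this encoding is illustrative.
def leaves(view):
    """A view is a leaf (frozenset of formulas) or a list of sub-views."""
    if isinstance(view, frozenset):
        return [view]
    return [leaf for sub in view for leaf in leaves(sub)]

def F(view):
    """F(V): the formulas shared by every leaf below V (Definition 2)."""
    common = None
    for leaf in leaves(view):
        common = leaf if common is None else common & leaf
    return common

def sat(view, phi):
    """V |= phi, following the propositional clauses of Definition 3."""
    if phi in F(view):
        return True
    if isinstance(phi, tuple):
        op, *args = phi
        if op == "and" and sat(view, args[0]) and sat(view, args[1]):
            return True
        if op == "or" and (sat(view, args[0]) or sat(view, args[1])):
            return True
        if op == "not" and isinstance(args[0], tuple):
            iop, *iargs = args[0]
            if iop == "and" and (sat(view, ("not", iargs[0])) or sat(view, ("not", iargs[1]))):
                return True
            if iop == "or" and sat(view, ("not", iargs[0])) and sat(view, ("not", iargs[1])):
                return True
    # last clause of Definition 3: a set-view satisfies phi if every sub-view does
    return isinstance(view, list) and all(sat(sub, phi) for sub in view)

# The eye-colour example: one view per eye colour.
V = [frozenset({"blue"}), frozenset({"green", "blond"})]
print(sat(V, ("or", "blue", "green")))   # True
print(sat(V, "blond"))                   # False
```

Note how the final clause lets a set-view satisfy a disjunction that appears in none of its leaves: each sub-view settles the disjunction in its own way.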
Definition 5. Let V and W be two views and let f : 𝕆 → (𝕆 ∪ C) be a mapping of the objects. W contains at least the same information as V given f, V ⊑f W, if and only if:
V and W are partial models and V ⊑f W; or
for each W′ ∈ W, either V ⊑f W′ or there is a V′ ∈ V such that V′ ⊑f W′.

Definition 6. Let V and W be two hierarchical models. W contains at least the same information as V, V ⊑ W, if and only if for some mapping of the objects f : 𝕆 → (𝕆 ∪ C), V ⊑f W.

A view of a hierarchical model may contain redundant sub-views. We prefer hierarchical models without redundant views, since redundant views provide no relevant information and redundant views are not mutually exclusive. Different views may use different objects to denote the same entity in the world. Therefore, to determine redundancy, we need a mapping of the objects of one view to another view. The objects of a common parent view, however, denote the same entity in every child view.

Definition 7. Let H be a hierarchical model. H is redundancy-free if and only if for no two sub-views V and W of a view U in the hierarchical model H there is a function f : 𝕆 → (𝕆 ∪ C) such that for each o ∈ O(U): f(o) = o and V ⊑f W.

The above defined partial models and views do not guarantee that the truth value of a formula will be defined in terms of its constitutive parts. So, how do we guarantee this? We could try to guarantee this by demanding that a view still satisfies a non-atomic formula after removing this formula from the leaves of the view. Unfortunately, this approach does not guarantee that the partial models and views give a good approximation of the world. We cannot represent the information that there are 5 houses in a street in terms of its constitutive parts in a partial model. The semantics of the formula is incomplete with respect to a partial model. Therefore, another approach is needed.
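Anticipating the probability measure defined below, the man/woman example of Section 2 can already be computed over such a hierarchy of views. This is a sketch with illustrative atom names, simplified so that an atom absent from a leaf counts as false:

```python
from fractions import Fraction

# A view is either a leaf (a frozenset of atomic formulas) or a list of
# sub-views. The woman's two eye-colour views share a parent, so they jointly
# carry the same weight as the single man view.
man = frozenset({"man", "blue_eyes"})
woman_blue = frozenset({"woman", "blue_eyes"})
woman_green = frozenset({"woman", "green_eyes"})
root = [man, [woman_blue, woman_green]]

def prob(view, atom):
    """P_V(atom): 1 or 0 at a leaf, else the average over the sub-views."""
    if isinstance(view, frozenset):
        return Fraction(1) if atom in view else Fraction(0)
    return sum(prob(sub, atom) for sub in view) / len(view)

print(prob(root, "man"))         # 1/2, even though only 1 of 3 leaves says "man"
print(prob(root, "green_eyes"))  # 1/4
```

Grouping the woman's views under a common parent is what makes P(man) = P(woman) = 1/2 rather than 1/3, matching the intuition argued for in Section 2.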
To assure that a partial model gives the best possible representation of the world given the available information, the premises, we will introduce a representation relation. This representation relation tells us whether a view accurately represents a formula.

Definition 8. Let φ be a formula and let V be a view. V represents φ, V ⊩ φ, is defined in the following way.
V ⊩ φ if φ is a literal and φ ∈ F(V).
V ⊩ φ1 ∧ φ2 if V ⊨ φ1 and V ⊨ φ2.
V ⊩ ¬(φ1 ∧ φ2) if V ⊨ ¬φ1 or V ⊨ ¬φ2, and V ⊨ φi or V ⊨ ¬φi for i ∈ {1, 2}.
V ⊩ φ1 ∨ φ2 if V ⊨ φ1 or V ⊨ φ2, and V ⊨ φi or V ⊨ ¬φi for i ∈ {1, 2}.
V ⊩ ¬(φ1 ∨ φ2) if V ⊨ ¬φ1 and V ⊨ ¬φ2.
V ⊩ φ → ψ if V ⊭ φ or V ⊨ ψ.
V ⊩ [Q x ρ]φ if V = {W1, ..., Wk}, R = {o ∈ O(V) | for each 1 ≤ i ≤ k: Wi ⊨ ρ[x = o]} (R = O(V) if ρ is absent), |R| = k, and:
if Q = ∃, then for some o ∈ R: Wi ⊨ φ[x = o] for each 1 ≤ i ≤ k;
if Q = #n, then |{o ∈ R | for each 1 ≤ i ≤ k: Wi ⊨ φ[x = o]}| = n and |{o ∈ R | for each 1 ≤ i ≤ k: Wi ⊨ ¬φ[x = o]}| = k − n;
if Q = %p, then |{o ∈ R | for each 1 ≤ i ≤ k: Wi ⊨ φ[x = o]}| = p·k and |{o ∈ R | for each 1 ≤ i ≤ k: Wi ⊨ ¬φ[x = o]}| = (1 − p)·k;
if Q = ∀, then for each o ∈ R: Wi ⊨ φ[x = o] for each 1 ≤ i ≤ k.

Definition 9. Let V be a view. V is complete if and only if for each formula φ ∈ F(V): V ⊩ φ, and for each W ∈ V such that W ≠ {M} for some partial model M, there holds that W is complete.

Given the above defined views, we can define the least informative complete hierarchical model satisfying the premises.

Definition 10. Let Σ be a set of premises and let V be a view. The view V is a hierarchical model of Σ if and only if V is the least informative, redundancy-free and complete view such that V ⊨ Σ.

By applying the insufficient reason principle, we can now define the probability measure.

Definition 11. Let V be a view and let φ be a formula. If the truth value of φ is defined in every leaf of V, then
P_V(φ) = 1 if V ⊨ φ;
P_V(φ) = 0 if V ⊨ ¬φ;
P_V(φ) = (1/k) · Σ_{W ∈ V} P_W(φ) if V = {W1, ..., Wk}.
Otherwise, P_V(φ) is undefined.

Notice that the elimination of sub-views after receiving new information need not influence the probability of a view. A sub-view only denotes a possible way to extend its parent-view. It does not denote a situation that occurs in a proportion of the worlds. Therefore, the sub-views bear no influence on the parent-view. So, the probability of a formula holding in the parent-view should not change when some sub-views are eliminated after receiving new information. Only the elimination of views on the same level after receiving new information can have this effect. This behavior corresponds more or less with the transferable belief model of Smets and Robert [10]. The example of the murder case of Mr. Jones discussed in [10] can be reformulated in the here proposed approach, leading to the same results as the transferable belief model.

5 Validity

To verify the validity of the proposed approach, we must verify whether the derivable results correspond with our intuitions. First, however, we will look at three general requirements that a subjective probability measure should meet.
- The probability measure must be uniquely determined by the premises.
- The probability measure should not depend on the vocabulary.
- The probability measure should not depend on the number of objects in the world.
The reason for the last two requirements is that every day new concepts are introduced and new objects are created or invented. When, for example, some advertiser introduces some new property of washing powder, this should not influence the chances of rain. Only information relating this new property to the climate can have this effect.

Theorem 1. Let Σ be a set of premises. For each two hierarchical models V and W determined by the premises Σ, there holds: V ⊑ W and W ⊑ V.

This theorem implies that the views V and W are identical except for the names of the objects. So, the probability measure is uniquely determined by the premises. The other two requirements are also met. Since the defined probability measure depends on the current epistemic states, as follows from the above theorem, the vocabulary and the number of objects in the world have no influence on the probability measure.

Now we will look at the probability that an object possesses some property, given statistical information about a group of objects to which the object belongs.

Theorem 2. Let the premises consist of φ(a) and [(%p) x φ(x)]ψ(x), and let V be the corresponding hierarchical model. Then P_V(ψ(a)) = p.

Specificity is the principle by which properties of a smaller, more specific group of objects override the properties of a larger group of objects.

Theorem 3. Let the premises consist of φ(a), [(%p) x φ(x)]ψ(x), [(%q) x θ(x)]ψ(x), and [∀ x φ(x)]θ(x), and let V be the corresponding hierarchical model. Then P_V(ψ(a)) = p.

6 Related work

Carnap [3] makes a distinction between a probability that describes the relation between a hypothesis (a formula) and evidence (the premises), and a probability that says something about the facts of nature. The former, which he calls a logical probability, describes a purely logical relation between evidence and a hypothesis. The latter, which describes statistics, is a mathematical theory about elementary statements based on an empirical procedure. Clearly, the here proposed probability measure defines a logical probability. It describes a logical relation between a formula and the premises. Carnap's main interest is in inductive logic; for example, the derivation of the relative frequency p in the formula [(%p) x ψ(x)]φ(x) on the basis of evidence, the premises. The here defined probability measure is unsuited for this purpose.
Suppose, for example, that there are 10 houses in a street and that we wish to determine whether 50% or 80% of these houses have a basement. After inspecting the first 7 houses, we may have found 5 houses with a basement. One can easily verify that the two hypotheses are still equally likely given this evidence. Hence, the here defined probability measure does not confirm one of the hypotheses.

Roos [7, 8, 9] proposes to establish a relation between information and subjective probability through the construction of a partial model. In the reasoning process that he proposes, a partial model consisting of mutually exclusive views is constructed. On these views the insufficient reason argument is applied. Hence, the probability of a formula is proportional to the number of views that satisfy the formula. So, like the above proposed approach, the probability depends on the number of epistemic states. In [7], Roos implicitly assumes that a reference class consists of an infinite number of objects. Furthermore, he does not require that views are mutually exclusive in the way defined in this paper. In [9], he solves the latter problem. There, he shows that reasoning through partial model construction can be used for handling uncertainty and for reason maintenance. In [8], Roos no longer assumes that a reference class contains an infinite number of objects. Since he does not take into account the dependency of statistical information on the reference class, this approach leads to counter-intuitive results if we do not exactly know the number of objects in a reference class. Furthermore, since Roos requires that all views have the same objects, it also leads to conceptual difficulties. Knowing that there are 4 or 5 houses in a street, what does the fifth object denote in the view where we only know of 4 houses?

Bacchus, Grove, Halpern and Koller [1, 4, 2] propose to define a probability distribution over the set of interpretations. To assure that the interpretations are mutually exclusive, they introduce an important restriction on the set of interpretations. They assume that all interpretations possess the same set of objects. Furthermore, they assume that identical objects in different interpretations denote the same entity in the world. Though similar assumptions have been made in this paper, there is an important difference. The here proposed partial models introduce objects to denote specific entities in the world. By doing so, we implicitly assume that each entity in the world about which we receive information can somehow be distinguished. When we receive information stating that there are 5 houses in a street, we assume that we will be able to identify 5 houses when we enter the street.

The objects of a semantic interpretation do not play the same role as the objects of a partial model. In an interpretation, objects do not by themselves denote entities in the world. They only identify entities in the world through their relations with other objects. In this light, a set of objects shared by all interpretations plays an odd role. It implies that we know of the existence of each object in the set of objects, but we do not know which entity in the world is represented by the object. So, an object can denote a person, a virus, a dragon or whatever in different interpretations. Another odd assumption that Bacchus et al. make is that the number of objects approximates infinity. Though there are of course many objects, we cannot claim that there are infinitely many objects. Assuming a finite number of objects, however, raises problems, since every day new objects are created while others disappear. As pointed out in the previous section, the probability measure should not depend on the vocabulary. The random numbers approach described in [1, 4, 2] does not meet this requirement if finite numbers of objects are allowed for. Another requirement formulated in the previous section is that the probability measure may not depend on the number of objects. Extending the set of objects of an interpretation with a new object may not influence the probability measure. This should particularly hold for the probability that n objects o possess a property φ, i.e. φ(o) holds, if a new object o′ does not possess this property. From this requirement, we can derive the following result.
Theorem 4 If the probability that n objects o possess a property φ does not change after adding a new object o′ for which ¬φ(o′) holds, then for each n ∈ ℕ there is a p ∈ [0, 1] such that for each k ∈ ℕ, k ≥ n:

P([(#n) x]φ(x) | [(#k) x]true) = p

Notice that [(#k) x]true denotes that the world consists of k objects. If we restrict ourselves to monadic predicate logic and apply the insufficient-reason argument to the number of objects possessing the property φ, we get the random-propensities approach proposed in [1]. How to extend the random-propensities approach to non-monadic predicate logic is, however, unclear.

Kyburg [6] uses an approach based on counting models to give a semantics for direct inference. Direct inference is inference from a general statistical or analogical premise to a conclusion that concerns the case [6]. In his semantic definition, Kyburg does not start from the models of the premises. Instead, he considers several broader sets of models. The reason is that, according to Kyburg, logical constraints can give misleading precision. He illustrates this with the following two statements: the proportion of 50-year-old females in the USA that live for more than two years lies in [0.945, 0.965], and the proportion of 50-year-old females in Rochester (USA) that live for more than two years lies in [0.930, 0.975]. Kyburg argues that the statistical information about the 50-year-old females in the USA is better than the statistical information about the 50-year-old females in Rochester. Therefore, the latter information should be ignored. This is, however, only one possible interpretation of the provided statistical information. We could also argue that a proportion of 0.930 of the 50-year-old females in Rochester is possible, and that the 50-year-old females in Rochester may therefore be an exception to the 50-year-old females in the USA.

Smets and Kennes [10] propose the transferable belief model as a way to handle an agent's beliefs.
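The transfer of belief that gives the model its name can be sketched in a few lines (a minimal illustration under the unnormalized, open-world reading of conditioning; the frame and the mass values are invented for the example): when evidence A arrives, the mass assigned to each focal set B is transferred to B ∩ A.

```python
def condition(m, evidence):
    """Transferable-belief-model conditioning (unnormalized): the
    belief mass of each focal set B is transferred to B & evidence."""
    new_m = {}
    for focal, mass in m.items():
        moved = focal & frozenset(evidence)
        new_m[moved] = new_m.get(moved, 0.0) + mass
    return new_m

# Frame {a, b, c}; mass 0.6 on {a, b}, mass 0.4 on the whole frame.
m = {frozenset('ab'): 0.6, frozenset('abc'): 0.4}

# Evidence rules out b: 0.6 is transferred to {a}, 0.4 to {a, c}.
m_after = condition(m, 'ac')
```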
As we have noted at the end of Section 4, the hierarchy of views can exhibit the same behavior as the transferable belief model after receiving new information. The causes of these behaviors are different, however. In the transferable belief model, the transfer of belief is the result of assigning belief masses to sets of worlds, while here it is the result of dependencies between formulas.

7 Conclusion

In this paper a subjective probability measure has been defined by constructing epistemic states from the premises and by assigning probabilities to these states using the insufficient-reason principle. In this way subjective probability has been defined in an objective way.

There are still some issues that are open for further research. Firstly, the way a group of objects is represented in a partial model, or rather the way it is not represented, can be improved. Secondly, a hierarchy of views has been used to represent uncertainty about the properties of objects. Is the proposed hierarchy the best way to represent this uncertainty? Thirdly, after eliminating a sub-view of a view, e.g. because of new information, the likelihood of the view does not change as long as there remain other sub-views. Is this behavior correct in all circumstances? Fourthly, on the pragmatic side, improvements are possible and even required if we wish to use this approach for practical purposes. Humans cannot handle much more than 7 views at the same time [5]. This means that we can consider 1 of 7 objects, or 2 of 4 objects. Computers can perform slightly better, but since selecting 10 of 20 objects can already result in 184756 different views, a computer will soon reach its limits. To cope with this problem, we might assign weights to views; a weighted view should summarize a large number of views. A last issue for further research is the incorporation of frequency information, such as 'the bus is often too late', into the proposed approach.

REFERENCES

[1] F. Bacchus, A. J. Grove, J. Y. Halpern, D.
Koller, From statistics to beliefs, AAAI-92 (1992) 602-608.
[2] F. Bacchus, A. J. Grove, J. Y. Halpern, D. Koller, From statistical knowledge bases to degrees of belief, Artificial Intelligence 87 (1996) 75-143.
[3] R. Carnap, Logical foundations of probability, 2nd edition, The University of Chicago Press (1962).
[4] A. J. Grove, J. Y. Halpern, D. Koller, Random worlds and maximum entropy, Journal of Artificial Intelligence Research 2 (1994) 33-88.
[5] P. N. Johnson-Laird, Mental models: Towards a cognitive science of language, inference, and consciousness, Cambridge University Press, Cambridge (1983).
[6] H. E. Kyburg, Combinatorial semantics: semantics for frequent validity, Computational Intelligence 13 (1997) 215-257.
[7] N. Roos, How to reason with uncertain knowledge, IPMU 90, in: B. Bouchon-Meunier, R. R. Yager, L. A. Zadeh (eds), Uncertainty in knowledge bases, Springer-Verlag (1991) 403-412.
[8] N. Roos, Reasoning with partial models: construction of partial models and management of uncertainty, in: W. van der Hoek, J.-J. Ch. Meyer, Y. H. Tan, C. Witteveen (eds), Non-monotonic reasoning and partial semantics, Ellis Horwood (1992) 79-105.
[9] N. Roos, Uncertain, inconsistent and default knowledge: reasoning through the construction of a partial model, Dutch/German workshop on non-monotonic reasoning techniques and their applications (1993).
[10] Ph. Smets, R. Kennes, The transferable belief model, Artificial Intelligence 66 (1994) 191-234.

Reasoning under Uncertainty 599 N. Roos