Belief functions: past, present and future


1 Belief functions: past, present and future. CSA 2016, Algiers. Fabio Cuzzolin, Department of Computing and Communication Technologies, Oxford Brookes University, Oxford, UK. Algiers, 13/12/2016.

2 IJCAI tutorial web site.

3 Outline
1. Beyond probability: Uncertainty; The cloaked die; The murder trial; Uncertain evidence
2. A theory of evidence: Multivalued mappings; Belief functions; Dempster's combination; Dempster's conditioning; Bayes generalised; Misunderstandings
3. Reasoning: Inference; Combination; Belief vs Bayesian reasoning; Partially reliable data; Making decisions
4. Applications: Recent trends; Climate change prediction; Pose estimation
5. New horizons: Upper and lower likelihood; Generalising logistic regression; A new machine learning
6. Summarising

4 Beyond probability / Uncertainty
Uncertainty. Uncertainty is widespread; however, there is a difference between predictable and unpredictable variation. Second-order uncertainty means being uncertain about our very model of uncertainty. This has consequences for human behaviour: people are averse to unpredictable variation (Ellsberg's paradox in decision making).

5 Beyond probability / Uncertainty
The problem(s) with Bayes. The most common mathematical representation of uncertainty, Bayesian reasoning, uses a very special measure of uncertainty: Kolmogorov's additive probability. It is pretty bad at representing ignorance: uninformative priors are just not adequate, and give different results on different parameter spaces. Applying Bayes' rule P(B|A) = P(A|B)P(B)/P(A) assumes the new evidence comes in the form of certainty ("A is true"); in the real world this is often not the case (uncertain evidence). Beware the prior! Why should we pick a prior? Either there is prior knowledge (beliefs) or there is not; "asymptotically, the choice of the prior does not matter" (really!).

6 Beyond probability / The cloaked die
The die as a random variable. A die is a simple example of a (discrete) random variable: a measurable mapping from a probability space to the real numbers. There is a probability space Ω = {face1, face2, ..., face6} whose elements map to the real numbers 1, 2, ..., 6 (no need for measurability here, since the space is finite).

7 Beyond probability / The cloaked die
The cloaked die: observations which are sets. Now imagine that face1 and face4 are cloaked, and we roll the die. The same probability space Ω = {face1, face2, ..., face6} is still there (nothing has changed in the way the die works); however, the mapping is now different: both face1 and face4 are mapped to the set of possible values {1, 4}, since we cannot observe the outcome. Mathematically, this is called a random set (a set-valued random variable).

8 Beyond probability / The cloaked die
Occluded dice: a more realistic scenario. A more realistic scenario is one in which we roll, say, four dice. For some of them the top face might be occluded, but some of the side faces will still be visible, providing information. For example, I see the top face of the Red, Green and Purple dice, but I cannot see the outcome of the Blue die; however, I can see two of its side faces, so the outcome of Blue is the set {2, 4, 5, 6}. The bottom line is: whenever data are missing, observations are inherently set-valued. Mathematically, we are not sampling a (scalar) random variable; we are sampling a set-valued random variable: a random set.

9 Beyond probability / The murder trial
A murder trial: evidence supporting propositions. Suppose there is a murder, and three people are on trial for it: Θ = {Peter, John, Mary}. There is a witness, who testifies that the person he saw was a man; this amounts to supporting the proposition A = {Peter, John} ⊂ Θ. Should we take this testimony at face value? In fact, the witness was tested, and the machine reported an 80% chance that he was drunk when he reported the crime. It is then natural to assign an 80% chance to proposition A, and a 20% chance to the whole of Θ. Can we do that with probabilities? No.

10 Beyond probability / The murder trial
Dealing with propositional evidence. When data are missing, or evidence comes in the form of a probability on a related space, data directly support propositions. Even when evidence (data) supports propositions, Kolmogorov's probability forces us to specify support for individual outcomes. This is unreasonable: an artificial constraint due to a mathematical model that is not general enough. We have no elements to assign the 80% probability to either Peter or John, nor to distribute it among them. The cause is the additivity of probability measures: but this is not the most general type of measure on sets.
Belief functions and propositional evidence: as random sets, belief functions allow us to assign mass directly to propositions.

11 Beyond probability / Uncertain evidence
Uncertain data. Concepts themselves can be ill defined, e.g. "a dark or somewhat round object" (qualitative data); fuzzy theory accounts for this via the concept of graded membership. Unreliable sensors can generate faulty (outlier) measurements: can we still treat these data as certain, or is it more natural to attach to them a degree of reliability, based on the past track record of the sensor? But then, can we still apply Bayes' rule? People (experts, e.g. doctors) tend to express themselves directly in terms of likelihoods (e.g. "I think diagnosis A is most likely, otherwise either A or B"). Multiple sensors can provide as output a PDF on the same space: e.g. two Kalman filters, one based on colour and the other on motion (optical flow), each providing a normal predictive PDF on the location of a target in the image plane.

12 Beyond probability / Uncertain evidence
Belief functions and uncertain evidence: conditioning versus combination. Belief functions deal with uncertain evidence by moving away from the concept of conditioning (via Bayes' rule) towards that of combining pieces of evidence supporting multiple (intersecting) propositions to various degrees.
Belief functions and evidence: belief reasoning works by combining existing belief functions with new ones, which are able to encode uncertain evidence. In addition, belief functions can represent fuzzy concepts as consonant (nested) belief functions, and unreliable measurements as discounted probabilities (by assigning mass to the entire hypothesis set).

13 A theory of evidence / Outline (section overview; same outline as slide 3).

14 A theory of evidence / Multivalued mappings
Dempster's original setting. Rationale: there exists evidence E in the form of probabilities, which supports degrees of belief on a certain hypothesis space. In the murder trial example, Ω is the space where the evidence E lives, in the form of a probability distribution P; Θ is the hypothesis space, the set of outcomes of the trial. Elements of Ω are mapped to subsets of Θ (e.g. Γ maps {not drunk} ⊂ Ω to {Peter, John} ⊂ Θ). The probability distribution P induces a mass assignment m : 2^Θ → [0, 1] via the multi-valued (one-to-many) mapping Γ : Ω → 2^Θ. The corresponding mass function is m({Peter, John}) = 0.8, m(Θ) = 0.2.

15 A theory of evidence / Belief functions
Belief and plausibility functions (Dempster's upper and lower probabilities).
Belief value: the probability that the evidence implies A: Bel(A) = P({ω ∈ Ω : Γ(ω) ⊆ A}) = Σ_{B⊆A} m(B).
Plausibility value: the probability that the evidence does not contradict A: Pl(A) = P({ω ∈ Ω : Γ(ω) ∩ A ≠ ∅}) = Σ_{B∩A≠∅} m(B) = 1 − Bel(Ā).
Belief and plausibility values can (but this is disputed) be interpreted as lower and upper bounds on the values of an unknown, underlying probability measure: Bel(A) ≤ P(A) ≤ Pl(A) for all A ⊆ Θ.
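As a minimal sketch (not part of the original slides), the two formulas above can be evaluated directly by summing masses over focal elements; the mass values below are those of the murder-trial example, with focal elements represented as Python frozensets:

```python
# Bel and Pl computed by summing masses over focal elements (a sketch),
# using the murder-trial mass assignment.

def bel(m, A):
    """Belief of A: total mass of focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def pl(m, A):
    """Plausibility of A: total mass of focal elements intersecting A."""
    return sum(v for B, v in m.items() if B & A)

theta = frozenset({"Peter", "John", "Mary"})
m = {frozenset({"Peter", "John"}): 0.8, theta: 0.2}   # witness evidence

A = frozenset({"Peter", "John"})
print(bel(m, A), pl(m, A))                                       # 0.8 1.0
print(bel(m, frozenset({"Mary"})), pl(m, frozenset({"Mary"})))   # 0.0 0.2
```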

16 A theory of evidence / Dempster's combination
Dempster's combination. A blond hair has been found, but there is a probability 0.6 that the room was cleaned before the crime. The outcomes compatible with both ω₁ ∈ Ω₁ and ω₂ ∈ Ω₂ are those θ ∈ Γ₁(ω₁) ∩ Γ₂(ω₂). If the sources of evidence are independent, the probability of (ω₁, ω₂) is P₁({ω₁}) · P₂({ω₂}); if Γ₁(ω₁) ∩ Γ₂(ω₂) = ∅, the pair (ω₁, ω₂) cannot be selected.
Dempster's rule: the combination of two mass functions m₁, m₂ is defined as
(m₁ ⊕ m₂)(A) = (1 / (1 − κ)) Σ_{B∩C=A} m₁(B) m₂(C), for ∅ ≠ A ⊆ Θ,
where κ = Σ_{B∩C=∅} m₁(B) m₂(C) is the conflict. Subsets with non-zero mass are called focal elements.
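A hedged sketch of the rule in code, using the same frozenset representation as above; the second mass function in the usage example is purely illustrative and not taken from the slides:

```python
# Dempster's rule for mass functions stored as {frozenset: mass} (a sketch).

def dempster(m1, m2):
    combined, conflict = {}, 0.0
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            A = B & C
            if A:
                combined[A] = combined.get(A, 0.0) + v1 * v2
            else:
                conflict += v1 * v2            # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

theta = frozenset({"Peter", "John", "Mary"})
m1 = {frozenset({"Peter", "John"}): 0.8, theta: 0.2}    # witness evidence
m2 = {frozenset({"Mary"}): 0.4, theta: 0.6}             # hypothetical second item
print(dempster(m1, m2))   # {Peter, John}: ~0.706, {Mary}: ~0.118, theta: ~0.176
```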

17 A theory of evidence / Dempster's combination
Dempster's rule: example (figure). Two mass functions are combined graphically; the resulting masses shown in the figure (0.48, 0.31 and 0.21) sum to one over the focal elements of the combination.

18 A theory of evidence / Dempster's conditioning
Dempster's conditioning. Dempster's rule of combination induces a conditioning operator: given a new event B, the logical (categorical) belief function with m(B) = 1 is combined with the a-priori belief function Bel using Dempster's rule; the resulting BF is the conditional belief function given B, Bel⊕(A|B). In terms of belief and plausibility values, Dempster's conditioning yields
Bel⊕(A|B) = [Bel(A ∪ B̄) − Bel(B̄)] / [1 − Bel(B̄)] = [Pl(B) − Pl(B\A)] / Pl(B),
Pl⊕(A|B) = Pl(A ∩ B) / Pl(B),
i.e. it is obtained from Bayes' rule by replacing probability with plausibility measures!
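Since Dempster's conditioning is just combination with a categorical ("logical") mass function, it can be sketched by reusing the dempster() function above; the conditioning event chosen below is illustrative:

```python
# Dempster's conditioning as combination with a categorical mass function
# assigning all mass to the conditioning event B (reuses dempster() above).

def condition(m, B):
    return dempster(m, {frozenset(B): 1.0})

theta = frozenset({"Peter", "John", "Mary"})
m = {frozenset({"Peter", "John"}): 0.8, theta: 0.2}
print(condition(m, {"John", "Mary"}))   # {John}: 0.8, {John, Mary}: 0.2
```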

19 A theory of evidence / Bayes generalised
A generalisation of Bayesian inference. Belief theory generalises Bayesian probability: classical probability measures are a special class of belief functions (in the finite case) or random sets (in the infinite case); Bayes' certain evidence is a special case of belief functions (the belief function m_A which assigns mass 1 to the single subset A); Bayes' rule of conditioning is a special case of Dempster's rule of combination. However, belief theory overcomes Bayes' limitations: you do not need a prior. If you are ignorant, you use the vacuous BF m_Θ which, when combined with new BFs m encoding the data, does not change the result: m_Θ ⊕ m = m. However, if you do have prior knowledge you are welcome to use it!

20 A theory of evidence / Misunderstandings
Belief functions are not (general) credal sets. A belief function on Θ is in 1-1 correspondence with a convex set of probability distributions there: a credal set. This is the set of probabilities which have belief and plausibility values as lower and upper bounds: Bel(A) ≤ P(A) ≤ Pl(A). However, belief functions are a special class of credal sets: those induced by a random-set mapping.

27 A theory of evidence / Misunderstandings
Belief functions are not second-order distributions. General Bayesian inference leads to probability distributions over the space of parameters: these are second-order probabilities, i.e. probability distributions on hypotheses which are themselves probabilities. Belief functions can be defined on the hypothesis space Ω, or on the parameter space Θ. When defined on Ω they are sets of PDFs, and can then be seen as indicator second-order distributions (see figure); when defined on the parameter space Θ, they amount to families of second-order distributions. In the two cases they generalise MLE/MAP and general Bayesian inference, respectively.

28 Reasoning / Outline (section overview; same outline as slide 3).

29 Reasoning / Inference
Reasoning with belief functions. Working with belief functions involves a number of natural steps. This section covers inference, combination and decision making; we are not going to cover conditioning, propagation, or the generalised theorems.

30 Reasoning / Inference
Dempster's approach to statistical inference (1). Consider a statistical model {f(x|θ), x ∈ X, θ ∈ Θ}, where X is the observation space and Θ the parameter space. Having observed x, how do we quantify the uncertainty about the parameter θ without specifying a prior probability distribution? Assume that the samples X = {x₁, ..., xₙ} are generated as a function X = a(θ, U) of an (unobserved) auxiliary variable U with known probability distribution, independent of θ. For instance, to generate a continuous random variable X with cumulative distribution function (CDF) F_θ, one might draw U from U([0, 1]) and set X = F_θ⁻¹(U).

31 Reasoning / Inference
Dempster's approach to statistical inference (2). The data-generation equation X = a(θ, U) defines a multi-valued mapping Γ : U ↦ Γ(U) = {(X, θ) ∈ X × Θ : X = a(θ, U)}. The probability space (U, B(U), µ) and the multi-valued mapping Γ induce a belief function Bel_{X×Θ} on X × Θ. Conditioning Bel_{X×Θ} on θ yields Bel_X(·|θ) ≡ f(·|θ) on X; conditioning it on X = x gives Bel_Θ(·|x) on Θ.

32 Reasoning / Inference
Likelihood-based inference. Compatible with the likelihood principle: Bel_Θ(·|x) should be based only on the likelihood function L(θ|x) = f(x|θ). It generates a consonant belief function, a BF whose focal elements are nested: A₁ ⊆ A₂ ⊆ A₃ ⊆ ... Bel_Θ(·|x) is the consonant BF whose plausibility of singleton elements equals the normalised likelihood: pl(θ|x) = L(θ|x) / sup_{θ'∈Θ} L(θ'|x). It takes the empirical normalised likelihood to be an upper bound on the probability density of the sought parameter (rather than the actual PDF). The corresponding plausibility function is Pl_Θ(A|x) = sup_{θ∈A} pl(θ|x).
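For a finite parameter space this construction is easy to sketch in code: the contour function is the normalised likelihood and the plausibility of a subset is its supremum over that subset. The likelihood values below are illustrative assumptions:

```python
# Likelihood-based inference on a finite parameter space (a sketch).

def contour(likelihood):
    """Normalised likelihood pl(theta | x) = L(theta | x) / max L."""
    top = max(likelihood.values())
    return {t: L / top for t, L in likelihood.items()}

def plausibility(pl_contour, A):
    """Pl(A | x) = sup of the contour function over A."""
    return max(pl_contour[t] for t in A)

L = {"theta1": 0.05, "theta2": 0.20, "theta3": 0.10}   # illustrative values
pl_c = contour(L)                                # theta1: 0.25, theta2: 1.0, theta3: 0.5
print(plausibility(pl_c, {"theta1", "theta3"}))  # 0.5
```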

33 Reasoning / Inference
Coin toss example. Consider a coin-toss experiment: we toss the coin n = 10 times, obtaining the sample X = {H, H, T, H, T, T, T, H, H, H}. The parameter of interest is the probability θ = p of heads in a single toss; the likelihood of the sample is binomial: P(X|p) = p^k (1−p)^(n−k), where k is the number of heads. Likelihood-based belief function inference determines an entire envelope of PDFs on the parameter space Θ = [0, 1]. We can apply the same criterion to the normalised empirical counts f̂(H) = 1, f̂(T) = 4/6 = 2/3, obtaining the mass assignment m({H}) = 1/3, m({T}) = 0, m(Ω) = 2/3. This robustifies the ML estimate, which is one PDF compatible with the inferred BF.
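The numbers on this slide can be checked with a few lines of code (a sketch; the consonant mass assignment below assumes, as in the slide, that pl(H) ≥ pl(T)):

```python
# Coin-toss check: normalised empirical counts as a contour function,
# then the corresponding consonant mass assignment on {H, T}.

sample = ["H", "H", "T", "H", "T", "T", "T", "H", "H", "H"]
counts = {v: sample.count(v) for v in ("H", "T")}                 # H: 6, T: 4
pl_c = {v: c / max(counts.values()) for v, c in counts.items()}   # H: 1.0, T: 2/3

m = {frozenset({"H"}): 1 - pl_c["T"],        # m({H})   = 1/3
     frozenset({"H", "T"}): pl_c["T"],       # m(Omega) = 2/3
     frozenset({"T"}): 0.0}                  # m({T})   = 0
print(pl_c, m)
```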

34 Reasoning / Combination
Dempster's rule under fire: Zadeh's paradox. The question is: is Dempster's sum the only possible rule of combination? It seems to have paradoxical behaviour in certain circumstances. Example: two doctors have opinions about the condition of a patient, Θ = {M, C, T}, where M stands for meningitis, C for concussion and T for tumour. The two doctors provide the following diagnoses:
D1: "I am 99% sure it's meningitis, but there is a small chance of 1% that it is concussion."
D2: "I am 99% sure it's tumour, but there is a 1% chance that it's concussion."
These can be encoded by the following mass functions:
m₁(A) = 0.99 for A = {M}, 0.01 for A = {C}, 0 otherwise;
m₂(A) = 0.99 for A = {T}, 0.01 for A = {C}, 0 otherwise. (1)

35 Reasoning / Combination
Dempster's rule under fire: Zadeh's paradox (continued). Their (unnormalised) Dempster combination is m(∅) = 0.9999, m({C}) = 0.0001. As the two masses are highly conflicting, normalisation yields the belief function focused on C: "it is definitely concussion", although both experts had left it as only a fringe possibility! Objections: the belief functions in the example are really probabilities, so this is a problem with Bayesian representations, if anything; diseases are never exclusive, so it may be argued that Zadeh's choice of a frame of discernment is misleading (open-world approaches drop the normalisation); the doctors disagree so much that any person would conclude that one of them is just wrong, so the reliability of sources needs to be accounted for.
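Zadeh's example can be reproduced with the dempster() sketch given earlier; virtually all the mass ends up on the empty set before normalisation:

```python
# Zadeh's paradox run through the dempster() sketch above.

m1 = {frozenset({"M"}): 0.99, frozenset({"C"}): 0.01}
m2 = {frozenset({"T"}): 0.99, frozenset({"C"}): 0.01}
print(dempster(m1, m2))   # {C}: 1.0, with conflict kappa = 0.9999
```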

36 Reasoning / Combination
Proposed combination rules. A number of alternative combination mechanisms have been proposed:
- Yager's rule: the conflict mass is assigned to the whole frame Θ
- Dubois' rule: the conflict mass of each pair B ∩ C = ∅ is assigned to B ∪ C
- conjunctive rule: Dempster's rule without normalisation
- disjunctive rule: dual of the conjunctive (and of Dempster's)
- Denoeux's cautious rule: minimum weight after canonical decomposition
- bold rule: dual of the cautious rule
- Murphy's averaging idea; Deng's distance-weighted averaging; Lefevre's weighting factors.
We will see only the first four. My position: working with intervals of belief functions.

37 Reasoning / Combination
Yager's and Dubois' rules. Conflict is generated by non-reliable information sources; the conflicting mass m(∅) = Σ_{B∩C=∅} m₁(B)m₂(C) should be re-assigned to the whole frame Θ. Let m∩(A) = Σ_{B∩C=A} m₁(B)m₂(C); then
m_Y(A) = m∩(A) for A ⊊ Θ, and m_Y(Θ) = m∩(Θ) + m(∅). (2)
Dubois and Prade's idea is similar to Yager's, BUT the conflicting mass is not transferred all the way up to Θ; it goes to B ∪ C instead (by the minimum specificity principle):
m_D(A) = m∩(A) + Σ_{B∪C=A, B∩C=∅} m₁(B)m₂(C). (3)
The resulting BF dominates Yager's combination: m_D(A) ≥ m_Y(A) for every A ⊊ Θ.

38 Reasoning / Combination
Conjunctive and disjunctive rules. Rather than normalising (as in Dempster's rule) or re-assigning the conflicting mass m(∅) to other non-empty subsets (as in Yager's and Dubois' rules), Smets' conjunctive rule leaves the conflicting mass with the empty set:
m∩(A) = Σ_{B∩C=A} m₁(B)m₂(C), for A ⊆ Θ.
It is applicable to unnormalised belief functions, under an open-world assumption: the current frame only approximately describes the set of possible hypotheses. The disjunctive rule of combination is
m∪(A) = Σ_{B∪C=A} m₁(B)m₂(C):
consensus between two sources is expressed by the union of the supported propositions, rather than by their intersection. Note that (Bel₁ ∪ Bel₂)(A) = Bel₁(A) · Bel₂(A): belief values are simply multiplied!
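Both rules are easy to sketch with the same {frozenset: mass} representation used above (the conjunctive rule simply keeps the mass of the empty intersection); the usage example reuses Zadeh's mass functions:

```python
# Conjunctive (unnormalised Dempster) and disjunctive rules (sketches).

def conjunctive(m1, m2):
    out = {}
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            A = B & C                       # may be the empty frozenset
            out[A] = out.get(A, 0.0) + v1 * v2
    return out                              # mass on frozenset() is the conflict

def disjunctive(m1, m2):
    out = {}
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            A = B | C
            out[A] = out.get(A, 0.0) + v1 * v2
    return out

m1 = {frozenset({"M"}): 0.99, frozenset({"C"}): 0.01}
m2 = {frozenset({"T"}): 0.99, frozenset({"C"}): 0.01}
print(conjunctive(m1, m2))   # empty set: 0.9999, {C}: 0.0001
print(disjunctive(m1, m2))   # {M,T}: 0.9801, {M,C}: 0.0099, {C,T}: 0.0099, {C}: 0.0001
```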

39 Reasoning / Combination
Combination: moving forward. Yager's rule is rather unjustified; Dubois' is somewhat intermediate between the conjunctive and disjunctive rules; the cautious and bold rules are rather inspired by possibility theory's min rule. My take on this: Dempster's (conjunctive) combination and disjunctive combination are the two extrema of a spectrum of possible results.
Proposal: combination "tubes"? Meta-uncertainty on the sources generating the input belief functions (their independence and reliability) induces uncertainty on the result of the combination, represented by a bracket of combination rules, which produce a tube of BFs. We will encounter this idea again when generalising the concept of likelihood. We should probably work with intervals of belief functions, then.

40 Reasoning / Belief vs Bayesian reasoning
Belief vs Bayesian reasoning: image data fusion for object classification. Suppose we want to estimate the class of an object appearing in an image, based on feature measurements extracted from the image. We capture a training set of images, complete with annotated object labels. Assuming a PDF of a certain family (e.g. a mixture of Gaussians), we can learn from the training data a likelihood function p(x|y), where y is the object class and x the image feature vector. Suppose n different sensors extract n features x_i from each image: x₁, ..., xₙ. Let us compare how data fusion works under the Bayesian and the belief function paradigms!

41 Reasoning / Belief vs Bayesian reasoning
Bayesian data fusion. The likelihoods of the individual features are computed using the n likelihood functions learned during training: p(x_i|y), for all i = 1, ..., n. Measurements are typically assumed to be conditionally independent, yielding the product likelihood p(x|y) = Π_i p(x_i|y). Bayesian inference is then applied, typically assuming uniform priors (for there is no reason to think otherwise), yielding p(y|x) ∝ p(x|y) = Π_i p(x_i|y).

42 Reasoning / Belief vs Bayesian reasoning
Dempster-Shafer data fusion. For each feature type i a BF is learned from the individual likelihood p(x_i|y) (e.g. via the likelihood-based approach). This yields n belief functions Bel(·|x_i) on the range of possible object classes Y. A combination rule (e.g. Dempster's, conjunctive or disjunctive) is applied to compute an overall BF:
Bel(Y|x) = Bel(Y|x₁) ⊗ ... ⊗ Bel(Y|xₙ), for Y ⊆ Y.
(An empirical comparison of this kind is shown under pose estimation later.)

43 Reasoning / Partially reliable data
Inference under partially reliable data. In the fusion example we have assumed that the data are measured correctly: what if the data-generating process is not completely reliable? Problem: suppose we just want to detect an object (binary decision: yes Y or no N). Two sensors produce image features x₁ and x₂, but we learned from the training data that both are reliable only 80% of the time. At test time we get an image, measure x₁ and x₂, and unluckily sensor 2 got it wrong (the object is actually there). We get the following normalised likelihoods: p(x₁|Y) = 0.9, p(x₁|N) = 0.1; p(x₂|Y) = 0.1, p(x₂|N) = 0.9.

44 Reasoning / Partially reliable data
How do the two fusion pipelines cope with this? The Bayesian scholar assumes the two sensors/processes are conditionally independent and multiplies the likelihoods, obtaining p(x₁, x₂|Y) = 0.9 · 0.1 = 0.09 and p(x₁, x₂|N) = 0.1 · 0.9 = 0.09, so that p(Y|x₁, x₂) = 1/2, p(N|x₁, x₂) = 1/2. Shafer's faithful follower discounts the likelihoods by assigning mass 0.2 to the whole hypothesis space Θ = {Y, N}:
m(Y|x₁) = 0.9 · 0.8 = 0.72, m(N|x₁) = 0.1 · 0.8 = 0.08, m(Θ|x₁) = 0.2;
m(Y|x₂) = 0.1 · 0.8 = 0.08, m(N|x₂) = 0.9 · 0.8 = 0.72, m(Θ|x₂) = 0.2.

45 Reasoning / Partially reliable data
Thus, when we combine them by Dempster's rule we get the BF Bel on {Y, N}: m(Y|x₁, x₂) = 0.458, m(N|x₁, x₂) = 0.458, m(Θ|x₁, x₂) = 0.084. When combined using the disjunctive rule (the least committal one) we get Bel∪: m∪(Y|x₁, x₂) = 0.09, m∪(N|x₁, x₂) = 0.09, m∪(Θ|x₁, x₂) = 0.82. The corresponding (credal) sets of probabilities are shown in the figure. The credal interval for Bel is quite narrow, even though reliability is assumed to be 80% and we got a faulty measurement in one of two (50%)! The disjunctive rule is much more cautious about the correct inference.
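A sketch checking these numbers, reusing the dempster() and disjunctive() functions above; note that the disjunctive values quoted on the slide coincide with what one obtains by combining the undiscounted likelihood BFs disjunctively, which is what is reproduced here:

```python
# Detection example: discounted likelihood BFs combined by Dempster's rule,
# and undiscounted likelihood BFs combined disjunctively.

Y, N = frozenset({"Y"}), frozenset({"N"})
Theta = Y | N

def discount(m, alpha):
    """Transfer a fraction alpha of each mass to the whole frame."""
    out = {A: (1 - alpha) * v for A, v in m.items()}
    out[Theta] = out.get(Theta, 0.0) + alpha
    return out

m1 = discount({Y: 0.9, N: 0.1}, 0.2)    # Y: 0.72, N: 0.08, Theta: 0.2
m2 = discount({Y: 0.1, N: 0.9}, 0.2)    # Y: 0.08, N: 0.72, Theta: 0.2
print(dempster(m1, m2))                  # Y: ~0.458, N: ~0.458, Theta: ~0.084
print(disjunctive({Y: 0.9, N: 0.1}, {Y: 0.1, N: 0.9}))   # Y: 0.09, N: 0.09, Theta: 0.82
```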

46 Reasoning / Making decisions
Decision making with belief functions: a natural application of the belief function representation of uncertainty. Problem: selecting an act f from an available list F (making a "decision") which optimises a certain objective function. Various approaches to decision making exist: decision making in the TBM is based on expected utility via the pignistic transform; Strat has proposed something similar in his "cloaked carnival wheel" scenario; generalised expected utility [Gilboa] is based on classical expected utility theory [Savage, von Neumann]; there is also a lot of interest in multicriteria decision making (based on a number of attributes).

47 Reasoning / Making decisions
Decision making in the TBM. Savage (1954) showed that a preference relation ⪰ verifies some rationality requirements iff there exists a probability measure P on Ω and a utility function u : X → R such that, for all f, g ∈ F, f ⪰ g ⇔ E_P(u ∘ f) ≥ E_P(u ∘ g); the best choice is the one that maximises the expected utility. Does that mean that using belief functions is irrational? In Smets' Transferable Belief Model (TBM), decisions are made by maximising the expected utility of actions, E[u] = Σ_{θ∈Θ} u(f, θ) BetP(θ), based on the pignistic transform
BetP(θ) = Σ_{A⊇{θ}} m(A)/|A|,
the centre of mass of the credal set of probabilities consistent with m.
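A sketch of the pignistic transform and of the resulting expected utility; the mass function and the utilities below are illustrative only:

```python
# Pignistic transform BetP and the expected utility it induces (a sketch).

def pignistic(m):
    betp = {}
    for A, v in m.items():
        for theta in A:
            betp[theta] = betp.get(theta, 0.0) + v / len(A)
    return betp

def expected_utility(u, betp):
    """u: dict theta -> u(f, theta) for a fixed act f."""
    return sum(u[t] * p for t, p in betp.items())

m = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.3, frozenset({"a", "b", "c"}): 0.2}
betp = pignistic(m)                                              # a: ~0.72, b: ~0.22, c: ~0.07
print(expected_utility({"a": 1.0, "b": 0.0, "c": -1.0}, betp))   # ~0.65
```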

48 Reasoning / Making decisions
Lower and upper expected utilities. A preference relation ⪰ meets Gilboa's weaker axioms iff there exists a (not necessarily additive) measure µ and a utility function u : X → R such that, for all f, g ∈ F, f ⪰ g ⇔ C_µ(u ∘ f) ≥ C_µ(u ∘ g), where C_µ is the Choquet integral, defined for X : Ω → R as
C_µ(X) = ∫_0^∞ µ(X ≥ t) dt + ∫_{-∞}^0 [µ(X ≥ t) − 1] dt.
Given a belief function Bel on Ω and a utility function u, this theorem supports making decisions based on the Choquet integral of u with respect to Bel. For finite Ω, it can be shown that
C_Bel(u ∘ f) = Σ_{B⊆Ω} m(B) min_{ω∈B} u(f(ω)), C_Pl(u ∘ f) = Σ_{B⊆Ω} m(B) max_{ω∈B} u(f(ω)).
Let P(Bel), as usual, be the set of probability measures P compatible with Bel, i.e. such that Bel ≤ P. Then it can be shown that
C_Bel(u ∘ f) = min_{P∈P(Bel)} E_P(u ∘ f) = E̲(u ∘ f), C_Pl(u ∘ f) = max_{P∈P(Bel)} E_P(u ∘ f) = Ē(u ∘ f).
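For a finite frame the two Choquet integrals above reduce to mass-weighted minima and maxima, which is trivial to sketch (same illustrative mass and utility as before):

```python
# Lower and upper expected utilities for a finite frame (a sketch).

def lower_upper_expectation(m, u):
    low = sum(v * min(u[t] for t in A) for A, v in m.items())
    up = sum(v * max(u[t] for t in A) for A, v in m.items())
    return low, up

m = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.3, frozenset({"a", "b", "c"}): 0.2}
u_f = {"a": 1.0, "b": 0.0, "c": -1.0}    # utility of act f in each state (illustrative)
print(lower_upper_expectation(m, u_f))   # (0.3, 1.0): the interval [E_low, E_up]
```

The resulting intervals of expected utility are then compared using the criteria listed on the next slide.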

49 Reasoning / Making decisions
Decision making using intervals of expected utilities. For each act f we have two expected utilities, E̲(f) and Ē(f): how do we make a decision? We need to compare intervals. Possible decision criteria based on interval dominance:
1. f ⪰ g iff E̲(u ∘ f) ≥ Ē(u ∘ g) (conservative strategy);
2. f ⪰ g iff E̲(u ∘ f) ≥ E̲(u ∘ g) (pessimistic strategy);
3. f ⪰ g iff Ē(u ∘ f) ≥ Ē(u ∘ g) (optimistic strategy);
4. f ⪰ g iff α E̲(u ∘ f) + (1 − α) Ē(u ∘ f) ≥ α E̲(u ∘ g) + (1 − α) Ē(u ∘ g) for some α ∈ [0, 1] called a pessimism index (Hurwicz criterion).
It can be shown that the observed behaviour in Ellsberg's paradox is explained by the pessimistic strategy.

50 Applications / Outline (section overview; same outline as slide 3).

51 Applications / Recent trends
A new wave of applications. Sensor fusion has always been a stronghold of belief calculus, mainly about merging different sensors using Dempster's rule; typical applications: tracking and data association, reliability in engineering, image processing, robotics, medical imaging and diagnosis, business and finance (audit). A new wave of applications is emerging, on geographical information systems (GIS), communication networks and security, earth sciences, and more. Here we present one (or two!) in more detail: climate change prediction and motion capture in computer vision.

52 Applications / Recent trends
Most popular applications of belief functions:
- information quality in financial accounting [A conceptual framework and belief-function approach to assessing overall information quality (158)]
- auditing [The Bayesian and belief-function formalisms: A general perspective for auditing (148)]
- reputation and trust management in telecoms [An evidential model of distributed reputation management (615)]
- security [An information systems security risk assessment model under the DS theory of belief functions (137)]
- DoS detection [Towards multisensor data fusion for DoS detection (137)]

53 Applications / Recent trends
Most popular applications of belief functions (continued):
- robotics and navigation [An evidential approach to map-building for autonomous vehicles (229)], [Dempster-Shafer theory for sensor fusion in autonomous mobile robots (192)]
- tracking and data association [Shafer-Dempster reasoning with applications to multisensor target identification systems (317)]
- image processing and computer vision [Image annotations by combining multiple evidence & Wordnet (231)], [Evidence-based recognition of 3-D objects (176)]
- biometrics [Image quality assessment for iris biometric (160)]

54 Applications / Climate change prediction
Climate change: adaptation of flood defence structures. Climate change is expected to have an enormous economic impact: damage or destruction from extreme events, coastal flooding and inundation from sea level rise, etc. Adaptation of infrastructure to climate change is a major issue. Engineering design processes and standards are based on the analysis of historical climate data (using, e.g., Extreme Value Theory), under the assumption of a stable climate. Commonly, flood defences in coastal areas are designed to withstand at least 100-year return period events; however, due to climate change, they will be subject during their lifetime to higher loads than the design estimations. The main impact is related to the increase of the mean sea level, which affects the frequency and intensity of surges. For adaptation purposes, statistics of extreme sea levels derived from historical data should be combined with projections of the future sea level rise (SLR).

55 Applications / Climate change prediction
Assumptions and approach. The annual maximum sea level Z at a given location is often assumed to have a Gumbel distribution,
P(Z ≤ z) = exp[−exp(−(z − µ)/σ)],
with mode µ and scale parameter σ. Procedures are based on the return level z_T associated with a return period T, defined as
z_T = µ − σ log[−log(1 − 1/T)].
Because of climate change, it is assumed that the distribution of the annual maximum sea level at the end of the century will be shifted to the right, with shift equal to the SLR: z'_T = z_T + SLR. Approach:
1. represent the evidence on z_T by a likelihood-based belief function using past sea level measurements;
2. represent the evidence on SLR by a belief function describing expert opinions;
3. combine these two items of evidence to get a belief function on z'_T = z_T + SLR.
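A sketch of the return-level formula above, for illustrative (assumed) Gumbel parameters and a 100-year return period:

```python
# Gumbel return level z_T = mu - sigma * log(-log(1 - 1/T)) (a sketch).

import math

def return_level(mu, sigma, T):
    return mu - sigma * math.log(-math.log(1.0 - 1.0 / T))

z_100 = return_level(mu=4.0, sigma=0.2, T=100)   # illustrative parameters
print(z_100)                                      # ~4.92, in the units of the data
print(z_100 + 0.5)                                # shifted by an assumed SLR of 0.5 m
```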

56 Applications / Climate change prediction
Expert evidence on sea level rise. Future SLR projections provided by the last IPCC Assessment Report (2007) give [0.18 m, 0.79 m] as a likely range of values for SLR over the period considered; however, it is indicated that higher values cannot be excluded. Based on a simple statistical model, Rahmstorf (2007) suggests [0.5 m, 1.4 m]. Recent studies indicate that the threshold of 2 m cannot be exceeded by the end of this century, due to physical constraints. The interval [0.5, 0.79] = [0.18, 0.79] ∩ [0.5, 1.4] seems to be fully supported, as it is considered highly plausible by all three sources, while values outside the interval [0, 2] are considered impossible. How do we encode this evidence using belief functions?

57 Applications / Climate change prediction
Combination of expert and statistical evidence. Expert evidence: consonant random intervals with core [0.5, 0.79], support [0, 2] and different plausibility ("contour") functions. (Figure: contour functions π(SLR) and cumulative Bel and Pl of SLR.) Let [U_zT, V_zT] and [U_SLR, V_SLR] be the independent random intervals representing the evidence on z_T and SLR, respectively. The random interval for z'_T = z_T + SLR is
[U_zT, V_zT] + [U_SLR, V_SLR] = [U_zT + U_SLR, V_zT + V_SLR].

58 Applications / Climate change prediction
Some results of combining expert and historical belief functions. The corresponding belief and plausibility functions are, for all A ∈ B(R):
Bel(A) = P([U_zT + U_SLR, V_zT + V_SLR] ⊆ A), Pl(A) = P([U_zT + U_SLR, V_zT + V_SLR] ∩ A ≠ ∅).
Bel(A) and Pl(A) can be estimated by Monte Carlo simulation. (Figure: contour functions pl(z'_T) and cumulative curves Bel(z'_T ≤ z), Pl(z'_T ≤ z) for the linear, convex, concave and constant expert contour functions.)
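A Monte Carlo sketch of this estimation for the event {z'_T ≤ z}; the two interval samplers below are illustrative stand-ins (a rough likelihood-based interval for z_T, and a consonant interval with the core and support quoted above and a linear contour), not the models used in the study:

```python
# Monte Carlo estimation of Bel and Pl of {z'_T <= z}, with z'_T = z_T + SLR,
# from two independent random closed intervals (illustrative samplers only).

import random

def sample_interval_zT():
    c = random.gauss(4.9, 0.1)                 # assumed centre of the z_T interval
    w = random.uniform(0.0, 0.2)               # assumed half-width
    return (c - w, c + w)

def sample_interval_SLR():
    a = random.uniform(0.0, 1.0)                       # random cut level
    return (0.5 * a, 0.79 + (2.0 - 0.79) * (1.0 - a))  # core [0.5, 0.79] at a=1, support [0, 2] at a=0

def bel_pl_leq(z, n=100000):
    bel = pl = 0
    for _ in range(n):
        u1, v1 = sample_interval_zT()
        u2, v2 = sample_interval_SLR()
        lo, hi = u1 + u2, v1 + v2      # [U_zT + U_SLR, V_zT + V_SLR]
        bel += hi <= z                 # interval entirely inside (-inf, z]
        pl += lo <= z                  # interval intersects (-inf, z]
    return bel / n, pl / n

print(bel_pl_leq(6.0))
```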

59 Applications / Pose estimation
Belief Modelling Regression for example-based pose estimation. The available evidence comes in the form of a training set of images containing sample poses of an unspecified object; a configuration is a vector q ∈ Q ⊂ R^D. An "oracle" provides for each training image I_k the configuration q_k of the object portrayed in the image (source of ground truth: a motion capture system); the object location within each training image is known as a bounding box. In training, the object explores its range of possible configurations, and both sample poses Q̃ = {q_k, k = 1, ..., T} and N features Ỹ = {y_i(k), k = 1, ..., T, i = 1, ..., N} are collected. In testing, a supervised localisation algorithm is employed to locate the object within the test image, and such features are exploited to produce an estimate of the object's configuration.

60 Applications / Pose estimation
Learning evidential models. We learn from the training data an approximation ρ of the unknown mapping between each feature space Y_i and the pose space Q. We apply EM clustering to the N training sequences of feature values, obtaining a Mixture of Gaussians (MoG) {Γ_i^j, j = 1, ..., n_i}, with Γ_i^j ~ N(µ_i^j, Σ_i^j), and an approximate feature-pose map ρ_i : Y_i^j ↦ Q_i^j = {q_k ∈ Q̃ : y_i(k) ∈ Y_i^j}.

61 Applications / Pose estimation
Computing belief pose estimates. New visual features y_1, ..., y_N are mapped to a collection of belief functions Bel_1, ..., Bel_N on the set of sample poses Q̃. Belief functions also allow us to take into account the scarcity of the training samples, by assigning some mass m_i(Θ_i) to the whole feature space:
m_i : 2^{Θ_i} → [0, 1], m_i(Y_i^j) = Γ_i^j(y_i) (1 − m_i(Θ_i)) / Σ_k Γ_i^k(y_i).
They are combined by conjunctive combination; this yields the belief estimate of the pose on the set of sample poses Q̃. Then:
1. we can compute the expected pose associated with each vertex p of the credal set, q̂ = Σ_{k=1}^{T} p(q_k) q_k;
2. or, we can approximate the belief estimate b̂ with a probability p̂ on Q̃ (e.g. the pignistic function).

62 Applications / Pose estimation
Two human pose estimation experiments. A person is filmed by two uncalibrated DV cameras. Arm experiment: the subject moves his arm while standing in a fixed floor location. Legs experiment: the person walks normally on the floor; the training set was collected by sampling a random walk on a section of the floor. Length of the training sequences: 1726 frames for the arm and 1952 for the legs. Pose vector: 3D coordinates of the markers. Quite a challenging setup: the background was highly non-static, with people coming in and out of the scene and flickering monitors; self-occlusions.

63 Applications / Pose estimation
Comparison with RVMs and GPR. (Figure: BMR results for components of the pose vector, component 9 on top, components 1 and 6 at the bottom; blue = ground truth, red = pignistic estimate.) Average Euclidean errors for the Relevance Vector Machine (RVM): 25.0, 10.6, 18.6, 7.0 cm; for Gaussian Process Regression (GPR): 31.2, 13.6, 23.0 and 4.5 cm. Our belief-theoretical approach outperforms both competitors!

64 New horizons / Outline (section overview; same outline as slide 3).

65 New horizons
A research programme. We made the case that non-additive probabilities arise from real issues with the way standard probability models the data (or their absence), and showed that random sets are the most natural representation of uncertainty; they are also a straightforward generalisation of mathematical statistics. How should the theory develop? Some modest proposals:
- generalised logistic regression for dealing with rare events
- parameterised families of random sets, which would allow frequentist hypothesis testing and MAP-like estimation; in particular, Gaussian random sets, and how the central limit theorem generalises to random sets
- generalising the total probability theorem and the concept of random variable.
Where can its full impact be felt? New, robust foundations for machine learning; a novel understanding of quantum mechanics; robust models of climate change; a geometry of uncertainty as a general framework for uncertainty theory.

66 New horizons / Upper and lower likelihood
Belief likelihood function: generalising the sample likelihood. The traditional likelihood function is a conditional probability of the data given a parameter θ ∈ Θ, i.e. a family of PDFs over X parameterised by θ. A different take: instead of using the conventional likelihood to build a belief function, can we define a belief likelihood function of a sample x ∈ X? It is natural to define a belief (set-)likelihood function as a family of belief functions on X, Bel_X(·|θ), parameterised by θ ∈ Θ; this is the input of Smets' Generalised Bayesian Theorem, a collection of conditional belief functions. Note that a belief likelihood takes values on sets of outcomes, of which individual outcomes are a special case: this seems a natural setting for computing likelihoods of set-valued observations, coherent with the random set philosophy.

67 New horizons / Upper and lower likelihood
Belief likelihood function: series of trials. What can we say about the belief likelihood function of a series of trials? Observations are a tuple x = (x₁, ..., xₙ) ∈ X₁ × ... × Xₙ, where X_i = X denotes the space of quantities observed at time i. By definition the belief likelihood function is Bel_{X₁×...×Xₙ}(A|θ), where A is any subset of X₁ × ... × Xₙ.
Belief likelihood function of repeated trials:
Bel_{X₁×...×Xₙ}(A|θ) ≐ [Bel_{X₁↑X₁×...×Xₙ} ⊗ ... ⊗ Bel_{Xₙ↑X₁×...×Xₙ}](A|θ),
where Bel_{Xⱼ↑X₁×...×Xₙ} is the vacuous extension of Bel_{Xⱼ} to the Cartesian product X₁ × ... × Xₙ where the observed tuples live, and ⊗ is a combination rule.

68 New horizons / Upper and lower likelihood
Belief likelihood function: series of trials, individual tuples. Can we reduce this to the belief values of the individual trials? Yes, if we wish to compute likelihood values of tuples of individual outcomes rather than sets of them.
Decomposition for individual tuples: when using either Dempster's rule ⊕ or the conjunctive rule ∩ as the combination rule in the definition of the belief likelihood function, the following holds:
L̲(x = (x₁, ..., xₙ)) ≐ Bel_{X₁×...×Xₙ}({(x₁, ..., xₙ)}|θ) = Π_{i=1}^{n} Bel_{Xᵢ}({xᵢ}),
L̄(x = (x₁, ..., xₙ)) ≐ Pl_{X₁×...×Xₙ}({(x₁, ..., xₙ)}|θ) = Π_{i=1}^{n} Pl_{Xᵢ}({xᵢ}).
We can call these the lower and upper likelihoods of the sample x = (x₁, ..., xₙ). The second line amounts to conditional conjunctive independence (but only for individual samples x). This is a new, yet unpublished result; similar regularities hold when using the more cautious disjunctive combination. Open question: does this hold for arbitrary subsets of samples A ⊆ X₁ × ... × Xₙ?

69 New horizons / Upper and lower likelihood
Lower and upper likelihoods: Bernoulli trials. Let us go back to the Bernoulli trials example: X_i = X = {H, T}. Under conditional independence and equidistribution, the traditional likelihood for a series of Bernoulli trials reads as p^k (1−p)^(n−k), where k is the number of successes and n the number of trials. Let us compute the belief likelihood function for Bernoulli trials! We seek the belief function on X = {H, T}, parameterised by p = m({H}), q = m({T}) (with p + q ≤ 1 this time), which best describes the observed sample. If we apply the previous result, since all the Bel_i are equally distributed, the lower and upper likelihoods of the sample x = (x₁, ..., xₙ) are:
L̲(x) = Bel_X({x₁}) · ... · Bel_X({xₙ}) = p^k q^(n−k),
L̄(x) = Pl_X({x₁}) · ... · Pl_X({xₙ}) = (1−q)^k (1−p)^(n−k).
After normalisation, these are PDFs over the space B of all belief functions definable on X!
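A sketch evaluating these lower and upper likelihoods on a grid over the belief functions on {H, T} (i.e. the pairs (p, q) with p + q ≤ 1), using the coin-toss counts from the earlier example; it confirms the behaviour described on the next slide:

```python
# Lower and upper likelihoods of a Bernoulli sample, maximised over a grid
# of belief functions on {H, T} (k = 6 heads out of n = 10 trials).

def lower_likelihood(p, q, k, n):
    return p ** k * q ** (n - k)

def upper_likelihood(p, q, k, n):
    return (1 - q) ** k * (1 - p) ** (n - k)

k, n = 6, 10
grid = [(p / 100, q / 100) for p in range(101) for q in range(101 - p)]  # p + q <= 1
best_low = max(grid, key=lambda pq: lower_likelihood(*pq, k, n))
best_up = max(grid, key=lambda pq: upper_likelihood(*pq, k, n))
print(best_low)   # (0.6, 0.4): on the simplex p + q = 1, at the ML estimate p = k/n
print(best_up)    # (0.0, 0.0): the vacuous belief function
```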

70 New horizons / Upper and lower likelihood
Lower and upper likelihoods (Bernoulli trials). The lower likelihood (left in the figure) reduces to the traditional likelihood p^k (1−p)^(n−k) for p + q = 1, and its maximum is the traditional ML estimate. This makes sense: the lower likelihood is highest for the most committed belief functions (i.e. probabilities). The upper likelihood (right) has its maximum at p = q = 0 (the vacuous BF on {H, T}). The interval of BFs joining the maximum of L̲ with the maximum of L̄ is the set of belief functions such that p/q = k/(n−k), those which preserve the ratio between the empirical counts. Once again the maths leads us to think in terms of intervals of belief functions, rather than individual ones.

71 New horizons / Generalising logistic regression
Generalising logistic regression (1). Bernoulli trials are central in statistics: generalising their likelihood allows us to represent uncertainty in a number of regression problems. In logistic regression,
π_i = P(Y_i = 1|x_i) = 1 / (1 + e^(−(β₀+β₁x_i))), 1 − π_i = P(Y_i = 0|x_i) = e^(−(β₀+β₁x_i)) / (1 + e^(−(β₀+β₁x_i))).
The parameters β₀, β₁ are estimated by maximum likelihood of the sample, where
L(β₀, β₁|Y) = Π_{i=1}^{n} π_i^{Y_i} (1 − π_i)^{1−Y_i},
with Y_i ∈ {0, 1} and π_i a function of β₀, β₁, yielding a single conditional PDF. As in the Bernoulli series experiment, we can replace the conditional probability (π_i, 1 − π_i) on X = {0, 1} with a belief function there.

72 New horizons / Generalising logistic regression
Generalising logistic regression (2). Upper and lower likelihoods can then be computed as
L̲(β|Y) = Π_{i=1}^{n} π_i^{Y_i} q_i^{1−Y_i}, L̄(β|Y) = Π_{i=1}^{n} (1 − q_i)^{Y_i} (1 − π_i)^{1−Y_i},
where this time the Bel_i are not equally distributed. How do we generalise the logit link between observations x and outputs y? We need to enforce an analytical dependency for q_i first. A simple proposal: add a parameter β₂ such that
q_i = m(Y_i = 0|x_i) = β₂ e^(−(β₀+β₁x_i)) / (1 + e^(−(β₀+β₁x_i))).
We can then find lower and upper optimal estimates for the parameters β, as arg max_β L̲ and arg max_β L̄ respectively. Plugging these optimal parameters into the logit expressions for π_i, 1 − π_i, q_i yields an upper and a lower family of conditional belief functions given x, Bel_X(·|β, x).

73 New horizons / Generalising logistic regression
Rare events with belief functions. How do we use belief functions to be cautious about rare-event prediction? When we measure a new observation x, we plug it into the lower and upper families Bel_X(·|β, x) and get a lower and an upper belief function on Y. Note that each belief function is really an envelope of logistic functions: a robust estimate of rare events. How does this relate to the results of classical logit regression? More to come in the near future!

74 New horizons / A new machine learning
What's wrong with machine learning. The unfortunate (but predictable) Tesla accident: we are unable to predict how a system will behave in a radically new setting (e.g. how does a smart car cope with driving through extreme weather conditions?). Most systems have no way of detecting whether their underlying assumptions have been violated: they will happily continue to predict and act even on inputs that are completely outside the scope of what they have actually learned. It is imperative to ensure that these algorithms behave predictably "in the wild".

75 New horizons / A new machine learning
Vapnik's statistical learning theory. Classical statistical learning theory [Vapnik] makes predictions on the reliability of a training set based on simple quantities, such as the number of samples N. Generalisation issue: the training error is different from the expected error,
E_{x∼D}[δ(h(x) ≠ y(x))] vs (1/N) Σ_{n=1}^{N} δ(h(x_n) ≠ y(x_n)),
where the training data are assumed drawn from a distribution D, h(x) is the predicted label for input x and y(x) the actual label.
Probably Approximately Correct (PAC) learning: the learning algorithm finds, with probability at least 1 − δ, a model h ∈ H which is approximately correct, i.e. whose error is no more than ε. The main result of PAC learning is that we can relate the required size N of a training sample to the size of the model space H:
N ≥ (1/ε) (log|H| + log(1/δ)).
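A sketch of the PAC sample-complexity bound quoted above, for a finite hypothesis space; the values of |H|, ε and δ are illustrative:

```python
# PAC sample-complexity bound N >= (1/eps) * (ln|H| + ln(1/delta)) (a sketch).

import math

def pac_sample_size(H_size, epsilon, delta):
    return math.ceil((math.log(H_size) + math.log(1.0 / delta)) / epsilon)

print(pac_sample_size(H_size=10**6, epsilon=0.05, delta=0.01))   # 369
```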

76 New horizons / A new machine learning
Vapnik's statistical learning theory: Vapnik-Chervonenkis dimension. The VC dimension of H is the maximum number of points that can be successfully shattered by hypotheses in H (i.e. they can be correctly classified by some h ∈ H for all possible binary labellings of these points). Example: 4 points in R² with H the space of linear separators: however we arrange the 4 points, there is a labelling that we cannot shatter (correctly reproduce); therefore the VC dimension of linear separators in R² is 3. The VC dimension is pretty useless for model selection, for the bounds are too wide: people do cross-validation instead. However, it provides the only justification for max-margin linear SVMs:
VC_SVM = min{D, 4R²/m²},
where D is the dimension of the data, R the radius of the ball containing them and m the margin.

77 New horizons / A new machine learning
Imprecise-theoretical foundations for machine learning: a modest proposal. Issues with Vapnik's traditional statistical learning theory have recently been recognised by many researchers. What about deep learning? Nobody really has a clue why it works. Approaches should provide worst-case guarantees: at present it is not possible to rule out completely unexpected behaviours or catastrophic failures. Liang's proposal: use minimax optimisation to learn models that are suitable for any target distribution within a "safe" family. Minimax models similar to Liang's are naturally associated with convex sets of probabilities: uncertainty theory may be able to provide worst-case, cautious predictions, delivering AI agents aware of their own limitations. Research programme: a generalisation of the concept of Probably Approximately Correct (where does the probability distribution of the data come from?).

78 Summarising / Outline (section overview; same outline as slide 3).

79 Summarising
A summary. The theory of belief functions:
- is grounded in the beautiful mathematics of random sets
- has strong relationships with other theories of uncertainty
- can be efficiently implemented by Monte Carlo approximation and local propagation.
Statistical evidence may be represented in several ways: by likelihood-based belief functions (generalising both likelihood-based and Bayesian inference), or by Dempster's idea of using auxiliary variables. Decision-making strategies based on intervals of expected utilities can be formulated that are more cautious than traditional ones. Propagation on graphical models can be performed. The extension to continuous domains can be tackled via the Borel-interval representation, or in the more general case using the theory of random sets. A toolbox of estimation, classification and regression tools based on the theory of belief functions is available.

80 Summarising
Recent trends in the theory and application of belief functions. In 2014 alone, almost 1200 papers were published on belief functions. New applications are gaining ground beyond sensor fusion or expert systems: earth sciences, telecoms, etc.

81 Summarising
What still needs to be resolved.
- Clarify once and for all the epistemic interpretation of belief function theory: random variables for set-valued observations.
- The mechanism for evidence combination is still debated; it depends on meta-information on the sources which is hardly accessible. Working with intervals of belief functions may be the way forward, as it acknowledges the meta-uncertainty on the nature of the sources generating the evidence. The same holds for conditioning (although we did not show that).
- What about computational complexity? Not an issue: just apply sampling for approximate inference. We do not need to assign mass to all subsets, but we need to be allowed to do so when necessary (e.g. with missing data).
- Belief functions on the reals: Borel intervals are nice, but the way forward is grounding the theory in the mathematics of random sets.

82 Summarising
Future of random set / belief function theory.
- Further development of machine learning tools: e.g. random-set random forests for multilabel classification; tackling current trends such as transfer learning and deep learning.
- A fully developed theory of statistical inference with random sets: generalised likelihood and logistic regression; limit theorems and total probability for random sets; random-set random variables and processes; frequentist inference with random sets.
- Proposing solutions to high-impact problems: rare-event prediction, robust foundations for machine learning, robust climate change predictions.
- The mathematics and geometry of random sets and other uncertainty measures.

83 Appendix / For Further Reading I
F. Cuzzolin. The geometry of uncertainty: The geometry of imprecise probabilities. Artificial Intelligence: Foundations, Theory, and Algorithms, Springer-Verlag (2017).

84 Appendix / For Further Reading I
G. Shafer. A mathematical theory of evidence. Princeton University Press, 1976.
F. Cuzzolin. Visions of a generalized probability theory. Lambert Academic Publishing, 2014.
F. Cuzzolin (Ed.). Belief functions: theory and applications. LNCS Volume 8764, Springer, 2014.
