Continuous updating rules for imprecise probabilities

Marco Cattaneo, Department of Statistics, LMU Munich
WPMSIIP 2013, Lugano, Switzerland, 6 September 2013

example

$X \in \{1, 2, 3\}$

prior assessment: $\underline{P}(X) = \ldots$

observation: $A = \{X \le 2\}$

updated prevision: $\underline{P}(X \mid A) = \ldots$

natural/regular extension: $\underline{P}(X \mid A)$ vs. $\underline{P}(X)$

updating

likelihood of linear previsions: $\mathrm{lik}(P) = P(A)$

Bayesian updating: $P \mapsto P(\cdot \mid A)$ when $\mathrm{lik}(P) > 0$

regular updating: $\underline{P} \mapsto \inf_{P \in \mathcal{M}(\underline{P}) \,:\, \mathrm{lik}(P) > 0} P(\cdot \mid A)$ when $\mathrm{lik} \not\equiv 0$ on $\mathcal{M}(\underline{P})$

the discontinuities are caused by linear previsions $P \in \mathcal{M}(\underline{P})$ with arbitrarily small likelihood (i.e., almost contradicted by the data)

β-cut updating, with $\beta \in (0, 1)$: $\underline{P} \mapsto \inf_{P \in \mathcal{M}(\underline{P}) \,:\, \mathrm{lik}(P) > \beta\, \overline{P}(A)} P(\cdot \mid A)$ when $\mathrm{lik} \not\equiv 0$ on $\mathcal{M}(\underline{P})$ (where $\overline{P}(A) = \sup_{P \in \mathcal{M}(\underline{P})} \mathrm{lik}(P)$ is the upper probability of $A$)

regular updating is the limit of β-cut updating as $\beta \to 0$, while the limit as $\beta \to 1$ is a generalization of Dempster's rule of conditioning

β-cut updating was used, e.g., in Antonucci et al. (2012), Cattaneo and Wiencierz (2012), and Destercke (2013); similar updating rules for imprecise probabilities were suggested, e.g., in Moral and de Campos (1991), Moral (1992), and Cano and Moral (1996)
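
As a concrete illustration (not from the talk), the following sketch approximates a credal set over {1, 2, 3} by a finite list of probability mass functions and applies the three rules to the gamble X with observation A = {1, 2}; the two extreme pmfs and all function names are illustrative assumptions.

```python
# Minimal sketch: Bayesian, regular, and beta-cut updating on a finite
# approximation of a credal set over {1, 2, 3}. Illustrative only.

A = {1, 2}                      # observed event
X = {1: 1.0, 2: 2.0, 3: 3.0}    # the gamble X (identity on {1, 2, 3})

def lik(p):
    """Likelihood of a linear prevision p: lik(p) = p(A)."""
    return sum(p[w] for w in A)

def bayes(p):
    """Bayesian updating p -> p(.|A); requires lik(p) > 0."""
    pa = lik(p)
    return {w: (p[w] / pa if w in A else 0.0) for w in p}

def lower_prevision(ps, f):
    """Lower prevision of a gamble f over a finite set of pmfs."""
    return min(sum(p[w] * f[w] for w in p) for p in ps)

def regular_update(ps, f):
    """Regular updating: condition every pmf with lik > 0, take the inf."""
    return lower_prevision([bayes(p) for p in ps if lik(p) > 0], f)

def beta_cut_update(ps, f, beta):
    """Beta-cut updating: keep only pmfs with lik(p) > beta * upper P(A)."""
    upper_pa = max(lik(p) for p in ps)
    return lower_prevision([bayes(p) for p in ps if lik(p) > beta * upper_pa], f)

# Toy credal set: the segment between two extreme pmfs (an assumption).
p0 = {1: 0.15, 2: 0.05, 3: 0.8}   # small P(A): almost contradicted by A
p1 = {1: 0.2, 2: 0.6, 3: 0.2}     # large P(A)
ps = [{w: (1 - t / 100) * p0[w] + (t / 100) * p1[w] for w in p0}
      for t in range(101)]

print(regular_update(ps, X))         # limit of beta-cut updating as beta -> 0
print(beta_cut_update(ps, X, 0.5))   # discards the almost-contradicted pmfs
```

Here the infimum of the conditional prevision is attained by the pmf with the smallest $P(A)$, so the β-cut rule, which discards it, returns a strictly higher updated lower prevision than the regular rule.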

example

[plot: the β-cut updated lower prevision $\underline{P}(X \mid A)$ as a function of the prior lower prevision $\underline{P}(X)$, for β = 0.5, β = 0.1, and the limits β → 0 and β → 1]

the slope of the central line segment is $1/\beta$

a more interesting example

Piatti et al. (2005) studied the behavior of the imprecise Dirichlet model (IDM) when the realization of each binary experiment $X_1, X_2, \ldots$ can be observed incorrectly with a known probability ε (the errors of observation are independent, conditional on the realizations of $X_1, X_2, \ldots$)

observation: $X_1 + \cdots + X_{10} = 7$ (i.e., 7 successes in 10 experiments)

lower probability of success in the next experiment (with $s = 2$): $\underline{P}(X_{11} = 1 \mid X_1 + \cdots + X_{10} = 7) = \ldots$

regular updating: $\underline{P}(X_{11} = 1 \mid X_1 + \cdots + X_{10} = 7) = \begin{cases} 7/12 & \text{if } \varepsilon = 0 \\ 0 & \text{if } \varepsilon > 0 \end{cases}$

β-cut updating with β = 0.01: $\underline{P}(X_{11} = 1 \mid X_1 + \cdots + X_{10} = 7) = \ldots$ [table of values for several error probabilities, including ε = 0.01 and ε = 0.1]
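
The following is a hedged numerical reconstruction of this experiment, not the talk's code: under the stated error model, a reported success has probability $\theta(1-\varepsilon) + (1-\theta)\varepsilon$ given the chance of success θ; the IDM prior set consists of Beta(st, s(1−t)) priors with s = 2 and t ∈ (0, 1), and the β-cut keeps the priors whose marginal likelihood exceeds β times the maximum. Grid sizes and the evaluated ε values are my choices, and the printed values are grid approximations.

```python
# Sketch: beta-cut updating of the IDM (s = 2) with observation error eps,
# for the observation "7 reported successes in 10 experiments".
import numpy as np
from math import comb
from scipy.stats import beta as beta_dist

S, N, K = 2.0, 10, 7
theta = np.linspace(1e-4, 1 - 1e-4, 5000)     # chance of success (grid)
DT = theta[1] - theta[0]

def likelihood(eps):
    """P(7 reported successes out of 10 | theta, eps), on the theta grid."""
    q = theta * (1 - eps) + (1 - theta) * eps  # prob. of a reported success
    return comb(N, K) * q**K * (1 - q)**(N - K)

def beta_cut_lower(eps, b):
    """Lower posterior probability of success under beta-cut updating."""
    ts = np.linspace(0.005, 0.995, 199)        # IDM priors Beta(S*t, S*(1-t))
    marg, post_mean = [], []
    for t in ts:
        w = beta_dist.pdf(theta, S * t, S * (1 - t)) * likelihood(eps)
        m = (w * DT).sum()                     # marginal likelihood of prior t
        marg.append(m)
        post_mean.append((w * theta * DT).sum() / m)
    marg, post_mean = np.array(marg), np.array(post_mean)
    kept = marg > b * marg.max()               # the beta-cut
    return post_mean[kept].min()

for eps in [0.0, 0.01, 0.05, 0.1]:
    print(eps, round(beta_cut_lower(eps, 0.01), 3))
```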

distances

distance between two linear previsions $P, P'$ on Ω (dual/operator norm = total variation distance):

$$d(P, P') = \sup_{X : \Omega \to [-1, 1]} |P(X) - P'(X)| = 2 \sup_{A \subseteq \Omega} |P(A) - P'(A)|$$

distance between two (coherent) lower previsions $\underline{P}, \underline{P}'$ on Ω (Hausdorff distance = dual/operator norm):

$$d(\underline{P}, \underline{P}') = \max\left\{ \sup_{P \in \mathcal{M}(\underline{P})} \inf_{P' \in \mathcal{M}(\underline{P}')} d(P, P'),\; \sup_{P' \in \mathcal{M}(\underline{P}')} \inf_{P \in \mathcal{M}(\underline{P})} d(P, P') \right\} = \sup_{X : \Omega \to [-1, 1]} |\underline{P}(X) - \underline{P}'(X)|$$
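
For a finite Ω these distances are directly computable: the total variation distance between two pmfs reduces to $\sum_\omega |p(\omega) - q(\omega)|$, and the Hausdorff distance can be evaluated between finite approximations of the credal sets. A minimal sketch (the example sets are arbitrary):

```python
# Sketch: total variation distance between pmfs on a finite Omega, and the
# Hausdorff distance it induces between finite sets of pmfs.

def tv_distance(p, q):
    """d(P, P') = sup over X: Omega -> [-1, 1] of |P(X) - P'(X)|,
    which for pmfs equals sum_w |p(w) - q(w)|."""
    return sum(abs(p[w] - q[w]) for w in p)

def hausdorff(ps, qs):
    """Hausdorff distance between two finite sets of pmfs under tv_distance."""
    d1 = max(min(tv_distance(p, q) for q in qs) for p in ps)
    d2 = max(min(tv_distance(p, q) for p in ps) for q in qs)
    return max(d1, d2)

ps = [{1: 0.2, 2: 0.3, 3: 0.5}, {1: 0.4, 2: 0.3, 3: 0.3}]
qs = [{1: 0.1, 2: 0.6, 3: 0.3}, {1: 0.3, 2: 0.4, 3: 0.3}]
print(hausdorff(ps, qs))
```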

continuity

Bayesian updating is continuous on all linear previsions $P$ with $P(A) > 0$

regular updating is continuous on all lower previsions $\underline{P}$ with $\underline{P}(A) > 0$

regular updating is not continuous on all lower previsions $\underline{P}$ with $\overline{P}(A) > 0$, except if $A = \Omega$ or $|A| = 1$ (and the same holds for natural extension)

β-cut updating is continuous on all lower previsions $\underline{P}$ with $\overline{P}(A) > 0$, for each $\beta \in (0, 1)$
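
A small construction (mine, not the talk's) makes the last two statements tangible: the credal sets M_delta below converge in Hausdorff distance, as delta → 0, to a set containing a pmf with P(A) = 0, on which regular updating jumps while β-cut updating remains stable.

```python
# Sketch of the discontinuity that beta-cut updating removes: the sets
# M_delta converge to M_0, yet their regular updates do not converge.

A = {1, 2}

def conditional(p):
    pa = sum(p[w] for w in A)
    return {w: (p[w] / pa if w in A else 0.0) for w in p}

def credal_set(delta, steps=1000):
    """Segment of pmfs from (delta, 0, 1 - delta) to (0.5, 0.5, 0)."""
    lo = {1: delta, 2: 0.0, 3: 1.0 - delta}
    hi = {1: 0.5, 2: 0.5, 3: 0.0}
    return [{w: (1 - k / steps) * lo[w] + (k / steps) * hi[w] for w in lo}
            for k in range(steps + 1)]

def lower_prob_2_given_A(ps, beta=None):
    """Lower probability of {2} given A: regular rule if beta is None,
    beta-cut rule otherwise."""
    cut = 0.0 if beta is None else beta * max(sum(p[w] for w in A) for p in ps)
    kept = [conditional(p) for p in ps if sum(p[w] for w in A) > cut]
    return min(p[2] for p in kept)

for delta in [0.01, 0.001, 0.0]:
    print(delta,
          lower_prob_2_given_A(credal_set(delta)),             # regular: jumps
          lower_prob_2_given_A(credal_set(delta), beta=0.1))   # beta-cut: stable
```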

hierarchical model

a likelihood function lik on the set $\mathbb{P}$ of all linear previsions on Ω is described by its normal hypograph $\{(P, \beta) \in \mathbb{P} \times [0, 1] : \beta \sup \mathrm{lik} \le \mathrm{lik}(P)\}$

hierarchical updating: $\mathrm{lik} \mapsto \mathrm{lik}'$ with $\mathrm{lik}'(P') \propto \sup_{P \in \mathbb{P} \,:\, P(\cdot \mid A) = P'} \mathrm{lik}(P)\, P(A)$, when $P \mapsto \mathrm{lik}(P)\, P(A)$ is not identically zero

distance between two likelihood functions lik, lik′ on $\mathbb{P}$: Hausdorff distance between their normal hypographs, with respect to a product metric on $\mathbb{P} \times [0, 1]$

hierarchical updating is continuous on all likelihood functions lik with $P \mapsto \mathrm{lik}(P)\, P(A)$ not identically zero
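
A possible finite rendering of this rule (my reconstruction, under the assumption that the likelihood function is represented by finitely many weighted pmfs): the updated weight of each conditional pmf is the supremum, over the pmfs mapping to it, of the prior weight times P(A), renormalized so that the supremum is 1.

```python
# Sketch: hierarchical updating on a finite family of pmfs, each carrying a
# relative likelihood weight. Illustrative reconstruction only.

A = {1, 2}

def conditional(p):
    """p(.|A) as a hashable tuple over sorted outcomes; None if p(A) = 0."""
    pa = sum(p[w] for w in A)
    if pa == 0:
        return None
    return tuple(p[w] / pa if w in A else 0.0 for w in sorted(p))

def hierarchical_update(weighted):
    """weighted: list of (pmf, likelihood weight) pairs."""
    out = {}
    for p, w in weighted:
        pa = sum(p[x] for x in A)
        key = conditional(p)
        if key is not None and w * pa > 0:
            out[key] = max(out.get(key, 0.0), w * pa)   # sup over preimages
    z = max(out.values())
    return {key: v / z for key, v in out.items()}       # renormalize: sup = 1

weighted = [({1: 0.2, 2: 0.3, 3: 0.5}, 1.0),
            ({1: 0.1, 2: 0.1, 3: 0.8}, 0.7),
            ({1: 0.4, 2: 0.4, 3: 0.2}, 0.4)]
print(hierarchical_update(weighted))
```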

references

Antonucci, A., Cattaneo, M., and Corani, G. (2012). Likelihood-based robust classification with Bayesian networks. In Advances in Computational Intelligence, Part 3. Springer.

Cano, A., and Moral, S. (1996). A genetic algorithm to approximate convex sets of probabilities. In IPMU '96, Vol. 2. Proyecto Sur.

Cattaneo, M., and Wiencierz, A. (2012). Likelihood-based Imprecise Regression. Int. J. Approx. Reasoning 53.

Destercke, S. (2013). A pairwise label ranking method with imprecise scores and partial predictions. In Machine Learning and Knowledge Discovery in Databases. Springer.

Moral, S. (1992). Calculating uncertainty intervals from conditional convex sets of probabilities. In UAI '92. Morgan Kaufmann.

Moral, S., and de Campos, L. M. (1991). Updating uncertain information. In Uncertainty in Knowledge Bases. Springer.

Piatti, A., Zaffalon, M., and Trojani, F. (2005). Limits of learning from imperfect observations under prior ignorance: the case of the imprecise Dirichlet model. In ISIPTA '05. SIPTA.
