
Crafting a Lightweight Bayesian Inference Engine

Feng-Jen Yang, Member, IAENG

Abstract: Many expert systems count on Bayesian inference to perform probabilistic inferences and to provide quantitative support for decisions, hypotheses, forecasts, and diagnoses. With this concern in mind, I implemented a probability-based inference engine that is domain independent and can be plugged into future systems to speed up the development life cycle of machine learning and/or inference systems.

Index Terms: Bayesian inference, Bayesian theory, inference engine.

Manuscript received March 2, 2016; revised March 16, 2016. This research was funded by the Seed Grant of Florida Polytechnic University to Dr. Feng-Jen Yang. The implementation detail of this project does not reflect the policy of Florida Polytechnic University and no official endorsement should be inferred. F. Yang is with the Department of Computer Science and Information Technology, Florida Polytechnic University, FL 33805, USA, e-mail: fyang@flpoly.org.

I. INTRODUCTION

Bayesian inference is rooted in the well-known posthumous theory of Thomas Bayes, which was formulated in the 18th century and soon adopted as the mathematical rationale for processing uncertain information and drawing probabilistic conclusions from the evidence that has been observed [2]. The counter-intuitive calculation that makes Bayesian inference hard to comprehend is that it turns around the evidence and the hypothesis within a conditional probability, a reversal that most beginners do not feel quite comfortable with. Nonetheless, this seemingly unnatural derivation is the essential idea that adapts the mathematical notation to the probabilistic inference performed in most stochastic expert systems. Mathematically, the concept can be demonstrated as computing P(H | E) in terms of P(E | H), where H denotes a hypothesis and E denotes an evidence that supports this hypothesis. As stated in most probability and statistics books, the algebra involved in the calculation of a conditional probability is:

    P(H | E) = P(H ∧ E) / P(E)

From the standpoint of inference, the real semantics behind this calculation is the question: if the evidence E is observed, how likely is the hypothesis H to be true? While designing a system, if a domain expert can provide the values of P(H ∧ E) and P(E), then the inference of P(H | E) is just a simple division. However, P(H ∧ E) and P(E) are not cognitively friendly values that human experts can easily construct mentally. As a result, most knowledge engineers resort to an indirect knowledge extraction and compute P(H | E) as follows:

    P(H | E) = P(H ∧ E) / P(E)
             = P(E | H) P(H) / [P(E | H) P(H) + P(E | ¬H) P(¬H)]

where P(H) is the prior probability of H being true, P(¬H) is the prior probability of H being false, P(E | H) is the conditional probability of seeing E when H is true, P(E | ¬H) is the conditional probability of seeing E when H is false, and P(H | E) is the posterior probability of H being true after seeing E.

II. THE PROBABILISTIC REASONING

A more generalized Bayesian inference can be represented as a series of mathematical derivations. Consider a given problem domain with n evidences and m hypotheses, and the inference question: if the evidences E1, E2, ..., and En are observed, how likely is the hypothesis Hi to be true, where 1 ≤ i ≤ m? The detailed inference steps can be performed by the following computations [1]:

    P(Hi | E1 ∧ E2 ∧ ... ∧ En)
        = P(Hi ∧ E1 ∧ E2 ∧ ... ∧ En) / P(E1 ∧ E2 ∧ ... ∧ En)
        = P(E1 ∧ E2 ∧ ... ∧ En | Hi) P(Hi) / Σ_{j=1}^{m} P(E1 ∧ E2 ∧ ... ∧ En | Hj) P(Hj)

where 1 ≤ i ≤ m. The above series of computations is mathematically sound, but the joint conditional probabilities, which have to cover all possible combinations of evidences and hypotheses, are far too complicated for any human expert to estimate from real-life experience. To make this mathematical rationale more practical, conditional independence among the evidences is usually assumed, so that:

    P(E1 ∧ E2 ∧ ... ∧ En | Hi) = P(E1 | Hi) P(E2 | Hi) ... P(En | Hi)

and the result can be further simplified into:

    P(Hi | E1 ∧ E2 ∧ ... ∧ En)
        = P(E1 | Hi) P(E2 | Hi) ... P(En | Hi) P(Hi) / Σ_{j=1}^{m} P(E1 | Hj) P(E2 | Hj) ... P(En | Hj) P(Hj)

where 1 ≤ i ≤ m. This assumption significantly reduces the domain expert's mental burden and makes the inference process cognitively workable. Instead of considering all evidences at once, the expert now considers one evidence and one hypothesis at a time [3].
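For concreteness, the simplified formula can be evaluated directly in a few lines of Python. The sketch below illustrates only the mathematics; the function and variable names (posterior, priors, likelihoods) are chosen for this example and are not part of the engine described in Section III. It uses the same domain knowledge that is later instantiated in Section V and reproduces the posterior value 0.136 reported there.

# A minimal sketch of the posterior computation under the
# conditional-independence (naive Bayes) assumption.
def posterior(priors, likelihoods, evidences, hypothesis):
    def joint(h):
        # P(E1 | h) * P(E2 | h) * ... * P(En | h) * P(h)
        p = priors[h]
        for e in evidences:
            p *= likelihoods[(e, h)]
        return p
    # Normalize over all competing hypotheses.
    return joint(hypothesis) / sum(joint(h) for h in priors)

priors = {"h1": 0.2, "h2": 0.3, "h3": 0.6}
likelihoods = {
    ("e1", "h1"): 0.3, ("e2", "h1"): 0.9, ("e3", "h1"): 0.6,
    ("e1", "h2"): 0.8, ("e2", "h2"): 0.1, ("e3", "h2"): 0.7,
    ("e1", "h3"): 0.5, ("e2", "h3"): 0.7, ("e3", "h3"): 0.9,
}
print(round(posterior(priors, likelihoods, ["e1", "e2", "e3"], "h1"), 3))  # 0.136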

III. THE DESIGN OF CLASS TYPES

Although this probabilistic inference process looks specific, it involves intensive calculations among prior probabilities, conditional probabilities, and posterior probabilities. To design this inference engine as an open-source module, four top-level class types are presented in the following subsections.

A. The Class Type Representing a Prior Probability

A prior probability within a Bayesian inference domain is an unconditional likelihood that supports a hypothesis without taking any evidence into consideration, such as P(H1), P(H2), ..., or P(Hn). The methods and attributes of the prior probability class type are described in Table I.

TABLE I
THE DESIGN OF THE PRIOR PROBABILITY CLASS TYPE

Class Name: PrioProb
Purpose: Representing a prior probability
Methods:
    __init__   Initializing the prior probability
    __str__    Converting the prior probability into a string representation
Attributes:
    hypo       The hypothesis in the prior probability
    prob       The likelihood value of the prior probability

B. The Class Type Representing a Conditional Probability

A conditional probability within a Bayesian inference domain is the conditional likelihood that an evidence will be revealed if a hypothesis is true, such as P(Ei | H1), P(Ei | H2), ..., or P(Ei | Hn). The methods and attributes of the conditional probability class type are described in Table II.

TABLE II
THE DESIGN OF THE CONDITIONAL PROBABILITY CLASS TYPE

Class Name: CondProb
Purpose: Representing a conditional probability
Methods:
    __init__   Initializing the conditional probability
    __str__    Converting the conditional probability into a string representation
Attributes:
    hypo       The hypothesis in the conditional probability
    evid       The evidence in the conditional probability
    prob       The likelihood value of the conditional probability

C. The Class Type Representing a Knowledge Base

The knowledge base within a Bayesian inference domain consists of a list of prior probabilities and a list of conditional probabilities. The methods and attributes of the knowledge base class type are described in Table III.

TABLE III
THE DESIGN OF THE KNOWLEDGE BASE CLASS TYPE

Class Name: KnowledgeBase
Purpose: Representing a knowledge domain
Methods:
    __init__          Initializing the knowledge base
    __str__           Converting the knowledge base into a string representation
Attributes:
    listOfPrioProbs   The list of prior probabilities in the knowledge domain
    listOfCondProbs   The list of conditional probabilities in the knowledge domain

D. The Class Type Representing a Posterior Probability

A posterior probability within a Bayesian inference domain is an inference result that represents the likelihood of a hypothesis given that some evidences are observed, such as P(Hi | E1 ∧ E2 ∧ ... ∧ En). The methods and attributes of the posterior probability class type are described in Table IV.

TABLE IV
THE DESIGN OF THE POSTERIOR PROBABILITY CLASS TYPE

Class Name: PostProb
Purpose: Representing a resultant posterior probability
Methods:
    __init__      Initializing the domain knowledge
    inference     Applying Bayesian inference to compute the resultant posterior probability
    __str__       Converting the resultant posterior probability into a string representation
Attributes:
    kb            The current knowledge base
    hypo          The hypothesis of the posterior probability
    listOfEvids   The list of evidences observed

IV. THE IMPLEMENTATION OF THIS INFERENCE ENGINE

This inference engine, consisting of the aforementioned four class types, is implemented in the Python programming language as a reusable module, as shown in Appendix A. The module is lightweight and domain independent, and can be used in future probabilistic inference applications to speed up the development life cycle.

V. A DEMONSTRATION OF HOW TO APPLY THIS INFERENCE ENGINE

A demonstration of applying this inference engine is shown in Appendix B, in which the following domain knowledge is instantiated:

P(H1) = 0.2
P(H2) = 0.3
P(H3) = 0.6
P(E1 | H1) = 0.3    P(E2 | H1) = 0.9    P(E3 | H1) = 0.6
P(E1 | H2) = 0.8    P(E2 | H2) = 0.1    P(E3 | H2) = 0.7
P(E1 | H3) = 0.5    P(E2 | H3) = 0.7    P(E3 | H3) = 0.9

and the following three posterior probabilities are inferred:

P(H1 | E1 ∧ E2 ∧ E3) = 0.136
P(H2 | E1 ∧ E2 ∧ E3) = 0.071
P(H3 | E1 ∧ E2 ∧ E3) = 0.793

The program output from this demonstration is shown in Appendix C.

VI. CONCLUSION

Nowadays, Bayes' theorem offers a stochastic foundation for expert systems that deal with forecasting and classification problems, ranging from pattern recognition, medical diagnosis, and weather forecasting to natural language processing [4]. This inference engine is based on the theory of the naive Bayesian network and is implemented in the Python programming language. In light of the theorem's ubiquity, this inference engine is designed to be domain independent. As a performance-centered design, it functions comprehensively without consuming excessive computational resources. I aim to publish this inference engine as open source to further benefit both industrial application development and academic research.

REFERENCES

[1] T. Bayes, "An Essay Towards Solving a Problem in the Doctrine of Chances," Philosophical Transactions of the Royal Society, vol. 53, issue 1, pp. 370-418, 1763.
[2] J. Tabak, Probability and Statistics: The Science of Uncertainty, NY, USA: Facts On File, Inc., pp. 46-50, 2004.
[3] F. Yang, "Eliciting an Overlooked Aspect of Bayesian Reasoning," ACM SIGCSE Bulletin, vol. 39, no. 4, pp. 45-48, 2007.
[4] G. D'Agostini, Bayesian Reasoning in Data Analysis: A Critical Introduction, NJ, USA: World Scientific Publishing Co. Pte. Ltd., 2003.

Dr. Feng-Jen Yang became a Member (M) of IAENG in 20012. He received the B.E. degree in Information Engineering from Feng Chia University, Taichung, Taiwan, in 1989, the M.S. degree in Computer Science from California State University, Chico, California, in 1995, and the Ph.D. degree in Computer Science from Illinois Institute of Technology, Chicago, Illinois, in 2001. Currently, he is an associate professor of Computer Science and Information Technology at Florida Polytechnic University. Before his current academic career, he gained prior research experience as a research assistant at the Chung Shan Institute of Science and Technology (CSIST), Taoyuan, Taiwan, from 1989 to 1993, and as an engineer at the Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan, from 1995 to 1996. His research areas include Artificial Intelligence, Expert Systems, and Software Engineering.

APPENDIX A
THE IMPLEMENTATION OF THE BAYESIAN INFERENCE ENGINE

# File Name: bayesian.py
# Creator: Feng-Jen Yang
# Purpose: Implementing a Bayesian inference engine


class PrioProb(object):
    """Representing a prior probability"""

    def __init__(self, initHypo, initProb):
        """Initializing the prior probability"""
        self.hypo = initHypo
        self.prob = initProb

    def __str__(self):
        """Returning a string representation of the prior probability"""
        return "P(" + self.hypo + ") = " + str(self.prob)


class CondProb(object):
    """Representing a conditional probability"""

    def __init__(self, initEvid, initHypo, initProb):
        """Initializing the conditional probability"""
        self.evid = initEvid
        self.hypo = initHypo
        self.prob = initProb

    def __str__(self):
        """Returning a string representation of the conditional probability"""
        return "P(" + self.evid + " | " + self.hypo + ") = " + str(self.prob)


class KnowledgeBase(object):
    """Representing a knowledge domain"""

    def __init__(self, initPrioProbs, initCondProbs):
        """Initializing the prior probabilities and conditional probabilities"""
        self.listOfPrioProbs = initPrioProbs
        self.listOfCondProbs = initCondProbs

    def __str__(self):
        """Returning a string representation of the knowledge base"""
        result = "The Knowledge Base:"
        for p in self.listOfPrioProbs:
            result += "\n" + str(p)
        for p in self.listOfCondProbs:
            result += "\n" + str(p)
        return result


class PostProb(object):
    """Representing Bayesian inference"""

    def __init__(self, initKb, initHypo, initEvids):
        """Initializing the knowledge base"""
        self.kb = initKb
        self.hypo = initHypo
        self.listOfEvids = initEvids

    def inference(self):
        """Computing the posterior probability"""
        # Numerator: P(E1 | H) * ... * P(En | H) * P(H) for the requested hypothesis
        numerator = 1
        for cp in self.kb.listOfCondProbs:
            if (cp.hypo == self.hypo) and (cp.evid in self.listOfEvids):
                numerator *= cp.prob
        for pp in self.kb.listOfPrioProbs:
            if pp.hypo == self.hypo:
                numerator *= pp.prob
                break
        # Denominator: the sum of the corresponding products over all hypotheses
        denominator = 0
        for pp in self.kb.listOfPrioProbs:
            term = 1
            for cp in self.kb.listOfCondProbs:
                if (cp.hypo == pp.hypo) and (cp.evid in self.listOfEvids):
                    term *= cp.prob
            term *= pp.prob
            denominator += term
        return round(numerator / denominator, 3)

    def __str__(self):
        """Returning a string representation of the inference result"""
        result = "The Posterior Probability:\nP(" + self.hypo + " | ("
        for e in self.listOfEvids:
            result += e
            if e != self.listOfEvids[-1]:
                result += " ^ "
        result += ")) = " + str(self.inference())
        return result

APPENDIX B
A DEMONSTRATION OF APPLYING THIS INFERENCE ENGINE

# File Name: demo.py
# Creator: Feng-Jen Yang
# Purpose: A demonstration of using this Bayesian inference engine

from bayesian import *


def main():
    """The main function that coordinates the program execution"""
    # Instantiate the prior probabilities
    listOfPrioProbs = [PrioProb("h1", 0.2), PrioProb("h2", 0.3), PrioProb("h3", 0.6)]

    # Instantiate the conditional probabilities
    listOfCondProbs = [CondProb("e1", "h1", 0.3), CondProb("e2", "h1", 0.9),
                       CondProb("e3", "h1", 0.6), CondProb("e1", "h2", 0.8),
                       CondProb("e2", "h2", 0.1), CondProb("e3", "h2", 0.7),
                       CondProb("e1", "h3", 0.5), CondProb("e2", "h3", 0.7),
                       CondProb("e3", "h3", 0.9)]

    # Instantiate the knowledge base
    kb = KnowledgeBase(listOfPrioProbs, listOfCondProbs)

    # Display the knowledge base
    print(kb, "\n")

    # Perform the Bayesian inference to compute the posterior probability of h1
    pp = PostProb(kb, "h1", ["e1", "e2", "e3"])
    # Display the inference result
    print(pp, "\n")

    # Perform the Bayesian inference to compute the posterior probability of h2
    pp = PostProb(kb, "h2", ["e1", "e2", "e3"])
    # Display the inference result
    print(pp, "\n")

    # Perform the Bayesian inference to compute the posterior probability of h3
    pp = PostProb(kb, "h3", ["e1", "e2", "e3"])
    # Display the inference result
    print(pp, "\n")


# The entry point of program execution
main()

APPENDIX C
THE RESULTS OF THE DEMONSTRATION

The Knowledge Base:
P(h1) = 0.2
P(h2) = 0.3
P(h3) = 0.6
P(e1 | h1) = 0.3
P(e2 | h1) = 0.9
P(e3 | h1) = 0.6
P(e1 | h2) = 0.8
P(e2 | h2) = 0.1
P(e3 | h2) = 0.7
P(e1 | h3) = 0.5
P(e2 | h3) = 0.7
P(e3 | h3) = 0.9

The Posterior Probability:
P(h1 | (e1 ^ e2 ^ e3)) = 0.136

The Posterior Probability:
P(h2 | (e1 ^ e2 ^ e3)) = 0.071

The Posterior Probability:
P(h3 | (e1 ^ e2 ^ e3)) = 0.793
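These results agree with a direct substitution of the instantiated domain knowledge into the simplified formula of Section II. Each numerator is the product of the three likelihoods and the prior of the corresponding hypothesis, and the common denominator is the sum of the three numerators:

P(H1 | E1 ∧ E2 ∧ E3) = (0.3 × 0.9 × 0.6 × 0.2) / 0.2382 = 0.0324 / 0.2382 ≈ 0.136
P(H2 | E1 ∧ E2 ∧ E3) = (0.8 × 0.1 × 0.7 × 0.3) / 0.2382 = 0.0168 / 0.2382 ≈ 0.071
P(H3 | E1 ∧ E2 ∧ E3) = (0.5 × 0.7 × 0.9 × 0.6) / 0.2382 = 0.1890 / 0.2382 ≈ 0.793

where 0.2382 = 0.0324 + 0.0168 + 0.1890.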