Drawing Conclusions from Data - The Rough Set Way

Drawing Conclusions from Data - The Rough Set Way

Zdzisław Pawlak*
Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, ul. Bałtycka 5, 44-000 Gliwice, Poland

In rough set theory, with every decision rule two conditional probabilities, called certainty and coverage factors, are associated. These two factors are closely related to the lower and the upper approximation of a set, basic notions of rough set theory. It is shown that these two factors satisfy Bayes' theorem. In our case, Bayes' theorem simply reveals a relationship present in the data, without referring to the prior and posterior probabilities intrinsically associated with Bayesian inference, and it can be used to invert decision rules, i.e., to find reasons (explanations) for decisions. © 2001 John Wiley & Sons, Inc.

1. INTRODUCTION

This paper is a modified version of Ref. 1. The relationship between rough set theory (see Refs. 7-11) and Bayes' theorem is shown. However, the meaning of Bayes' theorem in our case differs from its meaning in statistical inference. Statistical inference grounded in Bayes' rule assumes that some prior knowledge (prior probability) about some parameters is given first, without knowledge of the data. The posterior probability is then computed once the data become available, and it is used to verify the prior probability. In the rough set philosophy, with every decision rule two conditional probabilities, called certainty and coverage factors, are associated. These two factors are closely related to the lower and the upper approximation of a set, basic concepts of rough set theory. It turns out that these two factors satisfy Bayes' theorem without referring to prior and posterior probabilities. This property enables us to explain decisions in terms of conditions, i.e., to compute the certainty and coverage factors of inverse decision rules, without referring to the prior and posterior probabilities intrinsically associated with Bayesian reasoning.

*e-mail: zpw@ii.pw.edu.pl

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, VOL. 16, 3-11 (2001). © 2001 John Wiley & Sons, Inc.

2. INFORMATION SYSTEMS AND DECISION TABLES

The starting point of rough-set-based data analysis is a data set, called an information system. An information system is a data table whose columns are labeled by attributes, whose rows are labeled by objects of interest, and whose entries are attribute values.

Formally, by an information system we will understand a pair S = (U, A), where U and A are finite, nonempty sets called the universe and the set of attributes, respectively. With every attribute a ∈ A we associate the set V_a of its values, called the domain of a. Any subset B of A determines a binary relation I(B) on U, which will be called an indiscernibility relation and is defined as follows: (x, y) ∈ I(B) if and only if a(x) = a(y) for every a ∈ B, where a(x) denotes the value of attribute a for element x.

Obviously, I(B) is an equivalence relation. The family of all equivalence classes of I(B), i.e., the partition determined by B, will be denoted by U/I(B), or simply U/B; the equivalence class of I(B), i.e., the block of the partition U/B containing x, will be denoted by B(x). If (x, y) belongs to I(B), we will say that x and y are B-indiscernible, or indiscernible with respect to B. Equivalence classes of the relation I(B) (or blocks of the partition U/B) are referred to as B-elementary sets or B-granules.

If we distinguish in an information system two classes of attributes, called condition and decision attributes, then the system will be called a decision table. A simple, tutorial example of an information system (a decision table) is shown in Table I. The table contains data about six car types, where F, P, and S are condition attributes denoting fuel consumption, selling price, and size, respectively, whereas M denotes marketability and is the decision attribute. In addition, T denotes the car type and N the number of cars sold of a given type. Each row of the decision table determines a decision to be obeyed when the specified conditions are satisfied.

Table I. An example of an information system.

T   F     P     S      M      N
1   med   med   med    poor    8
2   high  med   large  poor   10
3   med   low   large  poor    4
4   low   med   med    good   50
5   high  low   small  poor    8
6   med   low   large  good   20
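Because the indiscernibility relation I(B) is defined purely by attribute equality, the partition U/B for Table I can be computed mechanically. The following is a minimal sketch in plain Python (the names TABLE and partition are ours, not the paper's):

```python
# Table I as a plain dict: car type -> attribute values.
# F = fuel consumption, P = selling price, S = size (condition attributes);
# M = marketability (decision attribute); N = number of cars sold.
TABLE = {
    1: {"F": "med",  "P": "med", "S": "med",   "M": "poor", "N": 8},
    2: {"F": "high", "P": "med", "S": "large", "M": "poor", "N": 10},
    3: {"F": "med",  "P": "low", "S": "large", "M": "poor", "N": 4},
    4: {"F": "low",  "P": "med", "S": "med",   "M": "good", "N": 50},
    5: {"F": "high", "P": "low", "S": "small", "M": "poor", "N": 8},
    6: {"F": "med",  "P": "low", "S": "large", "M": "good", "N": 20},
}

def partition(universe, attrs):
    """Family U/B of equivalence classes of the indiscernibility
    relation I(B): two objects fall into the same class iff they
    agree on every attribute in B."""
    classes = {}
    for x in sorted(universe):
        key = tuple(TABLE[x][a] for a in attrs)
        classes.setdefault(key, set()).add(x)
    return list(classes.values())

# Car types 3 and 6 agree on all condition attributes, so they form
# a single B-granule; every other car type is discernible.
print(partition(TABLE, ["F", "P", "S"]))  # [{1}, {2}, {3, 6}, {4}, {5}]
```

Note that types 3 and 6 carry different decisions (poor vs. good) despite being indiscernible on {F, P, S}; this is exactly what produces the nonempty boundary region discussed next.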

3. APPROXIMATIONS

Suppose we are given an information system (a data set) S = (U, A), a subset X of the universe U, and a subset B of attributes. Our task is to describe the set X in terms of attribute values from B. To this end we define two operations assigning to every X ⊆ U two sets B_*(X) and B^*(X), called the B-lower and the B-upper approximation of X, respectively, and defined as follows:

B_*(X) = ∪ {B(x) : B(x) ⊆ X, x ∈ U}
B^*(X) = ∪ {B(x) : B(x) ∩ X ≠ ∅, x ∈ U}

Hence, the B-lower approximation of a set is the union of all B-granules that are included in the set, whereas the B-upper approximation of a set is the union of all B-granules that have a nonempty intersection with the set. The set

BN_B(X) = B^*(X) − B_*(X)

will be referred to as the B-boundary region of X. If the boundary region of X is the empty set, i.e., BN_B(X) = ∅, then X is crisp (exact) with respect to B; in the opposite case, i.e., if BN_B(X) ≠ ∅, X is referred to as rough (inexact) with respect to B.

For example, for B = {F, P, S} and the set X = X_1 ∪ X_2 ∪ X_3 ∪ X_5 of cars with poor marketability, we have B_*(X) = X_1 ∪ X_2 ∪ X_5, B^*(X) = X_1 ∪ X_2 ∪ X_3 ∪ X_5 ∪ X_6, and BN_B(X) = X_3 ∪ X_6, where X_i denotes the set of all cars of type i.

4. DECISION RULES

With every information system S = (U, A) we associate a formal language L(S), written L when S is understood. Expressions of the language L are logical formulas, denoted by φ, ψ, etc., built up from attribute-value pairs by means of the logical connectives ∧ (and), ∨ (or), ¬ (not) in the standard way. We will denote by ‖φ‖_S the set of all objects x ∈ U satisfying φ in S and refer to it as the meaning of φ in S. The meaning of φ in S is defined inductively as follows:

(1) ‖(a, v)‖_S = {x ∈ U : a(x) = v} for all a ∈ A and v ∈ V_a
(2) ‖φ ∨ ψ‖_S = ‖φ‖_S ∪ ‖ψ‖_S
(3) ‖φ ∧ ψ‖_S = ‖φ‖_S ∩ ‖ψ‖_S
(4) ‖¬φ‖_S = U − ‖φ‖_S

A formula φ is true in S if ‖φ‖_S = U. A decision rule in L is an expression φ → ψ, read "if φ then ψ"; φ and ψ are referred to as the condition and the decision of the rule, respectively.
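The example approximations of Section 3 can be checked mechanically. A short sketch (plain Python; variable names are ours) that recomputes B_*(X), B^*(X), and BN_B(X) for the poor-marketability cars of Table I:

```python
# Table I rows: (type, F, P, S, M, N).
ROWS = [
    (1, "med",  "med", "med",   "poor", 8),
    (2, "high", "med", "large", "poor", 10),
    (3, "med",  "low", "large", "poor", 4),
    (4, "low",  "med", "med",   "good", 50),
    (5, "high", "low", "small", "poor", 8),
    (6, "med",  "low", "large", "good", 20),
]

# B-granules for B = {F, P, S}: car types with identical condition values.
granules = {}
for t, f, p, s, m, n in ROWS:
    granules.setdefault((f, p, s), set()).add(t)
blocks = list(granules.values())

X = {t for t, f, p, s, m, n in ROWS if m == "poor"}  # X = {1, 2, 3, 5}

# Lower approximation: union of all granules contained in X.
lower = set().union(*(b for b in blocks if b <= X))
# Upper approximation: union of all granules meeting X.
upper = set().union(*(b for b in blocks if b & X))
boundary = upper - lower

assert lower == {1, 2, 5}
assert upper == {1, 2, 3, 5, 6}
assert boundary == {3, 6}  # the indiscernible pair of car types 3 and 6
```

The boundary {3, 6} arises precisely because types 3 and 6 share the same condition values but carry different decisions, so X cannot be described exactly in terms of {F, P, S}.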

An example of a decision rule is given below:

(F, med) ∧ (P, low) ∧ (S, large) → (M, poor)

Obviously, a decision rule φ → ψ is true in S if ‖φ‖_S ⊆ ‖ψ‖_S.

With every decision rule φ → ψ we associate a conditional probability, called the certainty factor, introduced first by Łukasiewicz (Ref. 2; see also Refs. 5, 6, and 12) and defined as follows:

cer_S(φ, ψ) = card(‖φ ∧ ψ‖_S) / card(‖φ‖_S)

where ‖φ‖_S ≠ ∅. It is the conditional probability that ψ is true in S given that φ is true in S, where φ holds with the probability π_S(φ) = card(‖φ‖_S)/card(U). This coefficient is widely used in data mining, where it is called the confidence coefficient. Obviously, cer_S(φ, ψ) = 1 if and only if φ → ψ is true in S. If cer_S(φ, ψ) = 1, then φ → ψ will be called a certain decision rule; if 0 < cer_S(φ, ψ) < 1, the decision rule will be referred to as a possible decision rule.

Besides, we will also use a coverage factor, proposed by Tsumoto (Refs. 3, 4) and defined as follows:

cov_S(φ, ψ) = card(‖φ ∧ ψ‖_S) / card(‖ψ‖_S)

which is the conditional probability that φ is true in S, given that ψ is true in S with the probability π_S(ψ).

5. DECISION RULES AND APPROXIMATIONS

Let {φ_i → ψ}, 1 ≤ i ≤ n, be a set of n decision rules such that all conditions φ_i are pairwise mutually exclusive, i.e., ‖φ_i ∧ φ_j‖_S = ∅ for any 1 ≤ i, j ≤ n, i ≠ j, and

Σ_{i=1}^{n} π_S(φ_i) = 1     (1)

Let C be the set of condition attributes and let {φ_i → ψ}, 1 ≤ i ≤ n, be a set of decision rules satisfying Eq. (1). Then the following relationships are valid:

(a) C_*(‖ψ‖_S) = ∪ {‖φ_i‖_S : cer_S(φ_i, ψ) = 1}
(b) C^*(‖ψ‖_S) = ∪ {‖φ_i‖_S : 0 < cer_S(φ_i, ψ) ≤ 1}

(c) BN_C(‖ψ‖_S) = ∪ {‖φ_i‖_S : 0 < cer_S(φ_i, ψ) < 1}

The above properties enable us to introduce the following definitions:

(i) If ‖φ‖_S = C_*(‖ψ‖_S), then the formula φ will be called the C-lower approximation of the formula ψ and will be denoted by C_*(ψ).
(ii) If ‖φ‖_S = C^*(‖ψ‖_S), then the formula φ will be called the C-upper approximation of the formula ψ and will be denoted by C^*(ψ).
(iii) If ‖φ‖_S = BN_C(‖ψ‖_S), then φ will be called the C-boundary of the formula ψ and will be denoted by BN_C(ψ).

In this way we are allowed to approximate not only sets but also formulas. Let us consider the following example. The C-lower approximation of (M, poor) is the formula

C_*((M, poor)) = ((F, med) ∧ (P, med) ∧ (S, med)) ∨ ((F, high) ∧ (P, med) ∧ (S, large)) ∨ ((F, high) ∧ (P, low) ∧ (S, small))

The C-upper approximation of (M, poor) is the formula

C^*((M, poor)) = ((F, med) ∧ (P, med) ∧ (S, med)) ∨ ((F, high) ∧ (P, med) ∧ (S, large)) ∨ ((F, high) ∧ (P, low) ∧ (S, small)) ∨ ((F, med) ∧ (P, low) ∧ (S, large))

The C-boundary of (M, poor) is the formula

BN_C((M, poor)) = (F, med) ∧ (P, low) ∧ (S, large)

After simplification using the rough set approach (not presented here) we get the following approximations:

C_*((M, poor)) = ((F, med) ∧ (P, med)) ∨ (F, high)
C^*((M, poor)) = (F, med) ∨ (F, high)

From the above considerations it follows that any decision ψ can be uniquely described by the following decision rules:

C_*(ψ) → ψ
C^*(ψ) → ψ
BN_C(ψ) → ψ

or equivalently by

C_*(ψ) → ψ
BN_C(ψ) → ψ

Thus, for the considered example, we get two decision rules:

((F, med) ∧ (P, med)) ∨ (F, high) → (M, poor)
(F, med) ∧ (P, low) → (M, poor)

which are associated with the lower approximation and the boundary region of the decision (M, poor), respectively, and together describe the decision (M, poor). For the decision (M, good) we get the following decision rules:

(F, low) → (M, good)
(F, med) ∧ (P, low) → (M, good)

This coincides with the idea given by Ziarko (Ref. 13) of representing decision tables by means of three decision rules corresponding to the positive region, the boundary region, and the negative region of a decision.

6. INVERSE DECISION RULES

Often we are interested in explaining decisions in terms of conditions, i.e., in giving reasons for decisions. To this end we have to invert the decision rules, i.e., to mutually exchange the conditions and decisions of a decision rule. For example, for the set of decision rules describing poor and good marketability:

                                                     cer    cov
((F, med) ∧ (P, med)) ∨ (F, high) → (M, poor)       1.00   0.87
(F, med) ∧ (P, low) → (M, poor)                     0.17   0.13
(F, low) → (M, good)                                1.00   0.71
(F, med) ∧ (P, low) → (M, good)                     0.83   0.29     (1)

we get the following inverse decision rules:

                                                     cer    cov
(M, poor) → ((F, med) ∧ (P, med)) ∨ (F, high)       0.87   1.00
(M, poor) → (F, med) ∧ (P, low)                     0.13   0.17
(M, good) → (F, low)                                0.71   1.00
(M, good) → (F, med) ∧ (P, low)                     0.29   0.83     (2)

We can get more specific decision rules by replacing the sets of decision rules (1) and (2) by simple decision rules (a decision rule is simple if its condition does not contain the connective ∨).
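The certainty and coverage factors in (1) and (2) can be recomputed directly from Table I by weighting each row with the number N of cars sold. A sketch using exact fractions (the helper names sold and cer_cov are ours):

```python
from fractions import Fraction

# Table I rows: (type, F, P, S, M, N).
ROWS = [
    (1, "med",  "med", "med",   "poor", 8),
    (2, "high", "med", "large", "poor", 10),
    (3, "med",  "low", "large", "poor", 4),
    (4, "low",  "med", "med",   "good", 50),
    (5, "high", "low", "small", "poor", 8),
    (6, "med",  "low", "large", "good", 20),
]

def sold(pred):
    """Number of cars sold whose row satisfies pred(F, P, M)."""
    return sum(n for t, f, p, s, m, n in ROWS if pred(f, p, m))

def cer_cov(cond, dec):
    """Certainty cer = P(dec | cond) and coverage cov = P(cond | dec),
    with probabilities taken over all 100 cars sold."""
    both = sold(lambda f, p, m: cond(f, p) and dec(m))
    cer = Fraction(both, sold(lambda f, p, m: cond(f, p)))
    cov = Fraction(both, sold(lambda f, p, m: dec(m)))
    return cer, cov

# First rule of (1): ((F, med) AND (P, med)) OR (F, high) -> (M, poor).
cer, cov = cer_cov(lambda f, p: (f == "med" and p == "med") or f == "high",
                   lambda m: m == "poor")
print(float(cer), round(float(cov), 2))  # 1.0 0.87
```

Swapping the two denominators is all it takes to pass from a rule to its inverse, which is why cer and cov simply exchange places between (1) and (2).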

Thus, instead of decision rules (1), we will have:

                                          cer    cov
(F, med) ∧ (P, med) → (M, poor)          1.00   0.27
(F, high) → (M, poor)                    1.00   0.60
(F, med) ∧ (P, low) → (M, poor)          0.17   0.13
(F, low) → (M, good)                     1.00   0.71
(F, med) ∧ (P, low) → (M, good)          0.83   0.29     (3)

and decision rules (2) can be replaced by:

                                          cer    cov
(M, poor) → (F, med) ∧ (P, med)          0.27   1.00
(M, poor) → (F, high)                    0.60   1.00
(M, poor) → (F, med) ∧ (P, low)          0.13   0.17
(M, good) → (F, low)                     0.71   1.00
(M, good) → (F, med) ∧ (P, low)          0.29   0.83     (4)

From this set of decision rules and their certainty and coverage factors we can draw the following conclusions:

(1) Cars with medium fuel consumption and medium price, or with high fuel consumption, always have poor marketability (sell poorly).
(2) 17% of cars with medium fuel consumption and low price have poor marketability (sell poorly).
(3) Cars with low fuel consumption always have good marketability (sell well).
(4) 83% of cars with medium fuel consumption and low price have good marketability (sell well).

The above conclusions can be explained by means of the inverse decision rules as follows:

(1) 87% of cars selling poorly have medium fuel consumption and medium price (27%) or high fuel consumption (60%).
(2) 13% of cars selling poorly have medium fuel consumption and low price.
(3) 71% of cars selling well have low fuel consumption.
(4) 29% of cars selling well have medium fuel consumption and low price.

7. PROPERTIES OF DECISION RULES

If {φ_i → ψ}, 1 ≤ i ≤ n, is a set of decision rules satisfying condition (1), then the well-known formula for total probability holds:

π_S(ψ) = Σ_{i=1}^{n} cer_S(φ_i, ψ) · π_S(φ_i)     (2)
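The total probability formula above can be verified numerically on Table I using the four simple conditions of rule set (3), which are pairwise exclusive and together cover all 100 cars sold. A sketch (helper names are ours; probabilities are taken as frequencies over the cars sold):

```python
from fractions import Fraction

# Table I rows: (type, F, P, S, M, N).
ROWS = [
    (1, "med",  "med", "med",   "poor", 8),
    (2, "high", "med", "large", "poor", 10),
    (3, "med",  "low", "large", "poor", 4),
    (4, "low",  "med", "med",   "good", 50),
    (5, "high", "low", "small", "poor", 8),
    (6, "med",  "low", "large", "good", 20),
]
TOTAL = sum(n for *_, n in ROWS)  # 100 cars sold

# The four simple conditions of rule set (3): mutually exclusive,
# and their probabilities sum to 1, as condition (1) requires.
CONDS = [
    lambda f, p: f == "med" and p == "med",
    lambda f, p: f == "high",
    lambda f, p: f == "med" and p == "low",
    lambda f, p: f == "low",
]

def pi(cond):
    """pi(phi): fraction of cars sold satisfying the condition."""
    return Fraction(sum(n for t, f, p, s, m, n in ROWS if cond(f, p)), TOTAL)

def cer(cond, m_val):
    """cer(phi, psi): fraction of cars satisfying phi that also satisfy psi."""
    num = sum(n for t, f, p, s, m, n in ROWS if cond(f, p) and m == m_val)
    return Fraction(num, sum(n for t, f, p, s, m, n in ROWS if cond(f, p)))

# Total probability: pi(psi) = sum_i cer(phi_i, psi) * pi(phi_i).
pi_poor = sum(cer(c, "poor") * pi(c) for c in CONDS)
assert pi_poor == Fraction(30, 100)  # 30 of the 100 cars sold poorly

# Bayes-style inversion: cov(phi_j, psi) = cer(phi_j, psi) * pi(phi_j) / pi(psi).
cov_high_poor = cer(CONDS[1], "poor") * pi(CONDS[1]) / pi_poor
print(cov_high_poor)  # 3/5: the coverage 0.60 of (F, high) -> (M, poor) in (3)
```

Because the certainty factors, the probabilities of the conditions, and the coverage factors all come from the same frequency counts, the inversion holds identically, with no prior distribution involved.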

Moreover, for any decision rule φ_j → ψ the following Bayes' theorem is valid:

cov_S(φ_j, ψ) = cer_S(φ_j, ψ) · π_S(φ_j) / Σ_{i=1}^{n} cer_S(φ_i, ψ) · π_S(φ_i)     (3)

That is, any decision table, or any set of implications satisfying condition (1), satisfies Bayes' theorem, without referring to the prior and posterior probabilities fundamental to the Bayesian data analysis philosophy. Bayes' theorem in our case says that if an implication φ → ψ is true to the degree cer_S(φ, ψ), then the implication ψ → φ is true to the degree cov_S(φ, ψ). This property explains the relationship between the certainty and coverage factors and can be used to explain decisions in terms of conditions, i.e., it can be used to compute coverage factors by means of certainty factors, though this is more complicated than computing them directly from the data. Bayes' theorem is more useful when, instead of a data table like Table I, we are given some probabilities, but we will not discuss this issue in this paper.

8. CONCLUSIONS

It is shown in this paper that any decision table satisfies Bayes' theorem. This enables us to apply Bayes' theorem to invert decision rules without referring to the prior and posterior probabilities inherently associated with the classical Bayesian inference philosophy. The inverse decision rules can be used to explain decisions in terms of conditions, i.e., to give reasons for decisions. Let us observe that the conclusions drawn from the data are not universally true, but are valid only for the data at hand. Whether they are valid for a larger universe depends on whether the data form a proper sample of that universe.

Acknowledgments

Thanks are due to Professor Andrzej Skowron and the anonymous referee for their critical remarks.

References

1. Pawlak Z. Decision rules, Bayes' rule and rough sets. In: Zhong N, Skowron A, Ohsuga S, editors. New directions in rough sets, data mining, and granular-soft computing. Proceedings, 7th International Workshop, RSFDGrC '99, Yamaguchi, Japan, November 1999. p 1-9.
2. Łukasiewicz J. Die logischen Grundlagen der Wahrscheinlichkeitsrechnung. Kraków; 1913. Reprinted in: Borkowski L, editor. Jan Łukasiewicz: Selected works. Amsterdam, London: North-Holland; Warsaw: Polish Scientific Publishers; 1970.
3. Tsumoto S, Kobayashi S, Yokomori T, Tanaka H, Nakamura A, editors. Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery (RSFD '96). The University of Tokyo, November 6-8, 1996.
4. Tsumoto S. Modelling medical diagnostic rules based on rough sets. In: Polkowski L, Skowron A, editors. Rough sets and current trends in computing. Lecture Notes in Artificial Intelligence. Proceedings, First International Conference, RSCTC '98, Warsaw, Poland, June 1998. p 475-482.

5. Adams EW. The logic of conditionals: an application of probability to deductive logic. Boston, Dordrecht: D. Reidel Publishing Company; 1975.
6. Grzymała-Busse J. Managing uncertainty in expert systems. Boston, Dordrecht: Kluwer Academic Publishers; 1991.
7. Pawlak Z. Rough sets: theoretical aspects of reasoning about data. Boston, Dordrecht: Kluwer Academic Publishers; 1991.
8. Pawlak Z. Reasoning about data: a rough set perspective. In: Polkowski L, Skowron A, editors. Rough sets and current trends in computing. Lecture Notes in Artificial Intelligence. First International Conference, RSCTC '98, Warsaw, Poland, June 1998. p 25-34.
9. Pawlak Z, Skowron A. Rough membership functions. In: Yager RR, Fedrizzi M, Kacprzyk J, editors. Advances in the Dempster-Shafer theory of evidence. New York: John Wiley & Sons, Inc.; 1994. p 251-271.
10. Polkowski L, Skowron A, editors. Rough sets and current trends in computing. Lecture Notes in Artificial Intelligence. Proceedings, First International Conference, RSCTC '98, Warsaw, Poland, June 1998.
11. Polkowski L, Skowron A, editors. Rough sets in knowledge discovery, Vols. 1-2. Heidelberg: Physica-Verlag; 1998.
12. Skowron A. Management of uncertainty in AI: a rough set approach. In: Proceedings of the SOFTEKS Conference. Springer-Verlag and British Computer Society; 1994. p 69-86.
13. Ziarko W. Approximation region-based decision tables. In: Polkowski L, Skowron A, editors. Rough sets in knowledge discovery. Heidelberg: Physica-Verlag; 1998. p 178-185.