ARPN Journal of Science and Technology

Rule Induction Based on Boundary Region Partition Reduction with Standards Comparisons

Du Weifeng, Min Xiao
School of Mathematics, Physics and Information Engineering, Jiaxing University, Jiaxing, China

ABSTRACT

Prof. Ye Dongyi has pointed out in his paper that the reduction approach proposed by Hu Xiaohua et al. leads to wrong results under some circumstances. In this paper we find, through analysis, that Ye's approach is actually positive region reduction, whereas Hu's approach ensures that the partition of the boundary region is kept unchanged. The main difference between the two approaches is that each uses a different standard, so there is no ground for judging one correct and the other wrong. Furthermore, we clarify the relationship among various reduction standards for decision tables, and we give the relationship between reduction results when there is a strong-weak correlation between two reduction standards.

Keywords: Decision table; Rule induction; Rough set theory (RST); Reduction standard.

1. INTRODUCTION

Knowledge discovery has always played a central role in artificial intelligence. The method of knowledge discovery we discuss here has the following features: (1) the pattern or knowledge is concealed in data that are abundant, incomplete, noisy, and vague; (2) the pattern can be understood by people; (3) the pattern must be useful and novel; (4) the data processing procedure is non-trivial.

In handling uncertain problems, fuzzy set theory and rough set theory (RST) both generalize classical set theory, but their viewpoints differ. A fuzzy set describes approximate knowledge using membership degrees; it mainly handles the fuzzy uncertainty that is inherent in natural languages. Roughness, by contrast, is the result of the granularity of knowledge. If objects having the same description belong to different classes, rough uncertainty appears. This does not mean that such objects are really identical; it only means that our understanding of them is limited, and at that level of cognition we treat some distinct objects as identical. Rough uncertainty decreases as the cognition level rises and the granularity of knowledge is refined.

RST is a mathematical approach to handling uncertain and incomplete information. It was initially proposed by the Polish mathematician Pawlak [1] in 1982. After nearly thirty years of research and development it has made great progress in both theory and applications, and it attracted broad attention after its successful use in knowledge discovery. It has now been applied to a broad range of domains such as artificial intelligence, knowledge discovery in databases, pattern recognition, and failure detection. It is without doubt one of the most challenging, important, and rapidly growing areas of modern computer applications.

Knowledge reduction is one of the main topics of RST. As is well known, the attributes in a knowledge base (information system) are not all equally important, and some attributes are redundant. Knowledge reduction discards irrelevant or unimportant attributes while keeping the classification capability of the knowledge base [3]. Thus, discarding such attributes from an information system does not influence its classification capability: we only need to keep the subsets of attributes that constitute a reduct, and the reduced information system will have the same classification capability as the original one. A natural idea is to search all attribute subsets to acquire all the reducts.
Unfortunately, searching all subsets of a set is an NP-hard problem, so this is infeasible in practice. In 1992, Prof. Skowron of Warsaw University introduced the discernibility matrix and the discernibility function [4]. He pointed out that the conjuncts of the minimal disjunctive normal form of the discernibility function are exactly the reducts of the attribute set. Although transforming the conjunctive normal form of the discernibility function into disjunctive normal form is still of exponential complexity, the method is simple, clear, and easy to operate, and its computational cost can be reduced greatly by applying the absorption law of Boolean expressions. It is feasible when the scale of the problem is not too large. To date this is the best approach for obtaining all the correct reducts, and it can serve as a criterion against which to test heuristic algorithms; moreover, the idea behind the method is of independent significance.

In applications, the Pawlak model is often represented by an information system or a decision table, in which each row represents an object and each column represents an attribute. In a decision table the attributes are classified into conditions and decisions (generally only one column represents the decision). If any two objects with the same values of the conditions must have the same value of the decision, then the decision values are determined by the condition values; in other words, the decisions are consistent with the conditions. Correspondingly, a decision table is said to be inconsistent if two objects with the same values of the conditions may have different decision values. From this it is clear that inconsistent decision tables are more complex and more general than consistent ones. Inconsistent decision tables may also be more common in practical, real-life situations, because some data in our databases are polluted by noise and some data are contradictory because of our own limitations. It is therefore important and necessary to investigate the inconsistent case, and several methods have already been applied to the reduction of inconsistent decision tables. These methods are different, the results they produce are also different, and of course there are relationships among them. Positive domain reduction, distribution reduction, maximal distribution reduction, distributive reduction, and so on are each based on a different rule; the strength of the reductions differs, and the relationship among them is complex.

In paper [5], Prof. Ye pointed out that the reduction approach introduced by Hu et al. [6] gives wrong results in some situations. In this paper we conclude, by analysis, that Ye's reduction approach is positive region reduction, while Hu's approach is in essence to keep the partition of the boundary region unchanged; they simply adopt different standards.

The remainder of this paper is organized as follows. The next section introduces fundamental concepts such as partitions and equivalence relations, rough sets and their approximations, information systems and the indiscernibility relation, inconsistent decision tables, and various reduction standards. Section 3 shows that Ye's approach is positive region reduction. The property of Hu's approach is analyzed in Section 4. Section 5 gives the relationships among all the reduction standards. An example is presented in Section 6 to show how certain and uncertain rules are acquired. The last section concludes the paper.

2. FUNDAMENTAL CONCEPTS

2.1 Partitions and Equivalence Relations

Definition 1 [7]: A partition of a nonempty set $U$ is a collection $\pi = \{A_1, A_2, \ldots, A_n\}$ of subsets of $U$ such that: (a) $A_i \neq \emptyset$ for every $i$; (b) $A_i \cap A_j = \emptyset$ whenever $i \neq j$; (c) $A_1 \cup A_2 \cup \cdots \cup A_n = U$. The subsets in the collection are called the blocks of the partition.

Definition 2: (a) A relation $R$ on a set $U$ is reflexive if $(x, x) \in R$ for all $x \in U$, that is, if $xRx$ for all $x$; (b) $R$ is symmetric if whenever $xRy$ then $yRx$; (c) $R$ is transitive if whenever $xRy$ and $yRz$ then $xRz$.

Definition 3: A relation $R$ on a set $U$ is called an equivalence relation if it is reflexive, symmetric, and transitive.

Theorem 1: A partition of $U$ generates an equivalence relation on $U$, and conversely an equivalence relation $R$ on $U$ generates a partition of $U$. This partition is denoted by $U/R$, and the blocks of the partition are traditionally called the equivalence classes of $R$.

2.2 Rough Sets and Their Approximations

Rough set theory holds that knowledge is essentially a capability of classification, and that the capability of classification embodies the knowledge of its owner.

Definition 4 [7]: Let $U$ be a nonempty set of objects and $R$ an equivalence relation on $U$. The pair $A = (U, R)$ is called an approximation space, and the family of equivalence classes generated by $R$ is denoted $U/R = \{[x]_R \mid x \in U\}$. In the Pawlak model $A = (U, R)$, the equivalence relation $R$ characterizes a classification of the universe $U$; once we have such knowledge, we can express concepts over the universe.
When a concept can be represented exactly by the knowledge in the knowledge base, it is called an exact concept, or an exact set; otherwise it is called a rough concept, or a rough set. A rough set can be approximated by two exact sets, its lower and upper approximations, defined as follows:

Definition 5 [7]: Let $A = (U, R)$ be an approximation space and $X \subseteq U$. Denote

$\underline{R}(X) = \{x \in U \mid [x]_R \subseteq X\}$   (1)

$\overline{R}(X) = \{x \in U \mid [x]_R \cap X \neq \emptyset\}$   (2)

$\underline{R}(X)$ is called the lower approximation of $X$ and $\overline{R}(X)$ is called the upper approximation of $X$. If $\underline{R}(X) \neq \overline{R}(X)$, $X$ is called a rough set.
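As a concrete companion to Definition 5, here is a minimal Python sketch (ours, not part of the paper; the data and all function names are hypothetical) that builds the equivalence classes induced by an object description and computes the approximations of equations (1) and (2).

```python
def equivalence_classes(universe, desc):
    """U/R: group objects with identical descriptions into blocks."""
    blocks = {}
    for x in universe:
        blocks.setdefault(desc[x], set()).add(x)
    return list(blocks.values())

def lower_approx(blocks, X):
    """Union of the blocks wholly contained in X -- eq. (1)."""
    return set().union(*[b for b in blocks if b <= X])

def upper_approx(blocks, X):
    """Union of the blocks that intersect X -- eq. (2)."""
    return set().union(*[b for b in blocks if b & X])

# Six objects; 1 and 2 share a description, as do 3,4 and 5,6.
desc = {1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'c', 6: 'c'}
blocks = equivalence_classes(desc.keys(), desc)
X = {1, 3, 4}
print(lower_approx(blocks, X))  # {3, 4}
print(upper_approx(blocks, X))  # {1, 2, 3, 4}
```

Since the lower and upper approximations of X differ here, X is a rough set in this approximation space.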

2.3 Information Systems and the Indiscernibility Relation

According to RST, the original knowledge can be expressed in an information system (e.g., see Table 1), in which each row represents an object and each column represents an attribute.

Table 1: Information system

Knowledge can therefore be described by an information system as follows:

Definition 6 [7]: A decision table can be denoted by a 4-tuple $S = (U, A, V, f)$, where $U$ is the nonempty finite set of all objects, called the universe; $A$ is the nonempty finite set of all attributes; $V = \bigcup_{a \in A} V_a$, where $V_a$ is the domain of attribute $a$; and $f: U \times A \to V$ is the information function, which assigns a value to every attribute of every object, i.e., $f(x, a) \in V_a$; the information function is sometimes denoted $a(x)$. When no confusion can arise, the decision table is denoted briefly as $S = (U, A)$.

Table 1 records the information of six objects characterized by three attributes. It is easy to see that every attribute in a decision table corresponds to an equivalence relation. For an attribute $a \in A$, the corresponding equivalence relation is

$(x, y) \in R_a \iff f(x, a) = f(y, a)$   (3)

Each attribute set also induces an equivalence relation. For $B \subseteq A$, the corresponding equivalence relation is

$(x, y) \in R_B \iff f(x, a) = f(y, a)$ for all $a \in B$   (4)

The starting point of RST is the indiscernibility relation. The indiscernibility relation identifies objects having the same properties: such objects are indiscernible and are consequently treated as identical. In other words, the indiscernibility relation clusters the elements of the universe into granules of indiscernible objects. In RST these granules, called elementary sets (concepts), are the basic building blocks of knowledge about the universe. With respect to a given set of attributes, objects are indiscernible exactly when the available information cannot distinguish them; for example, in Table 1 three of the objects take the same value on one of the attributes and are therefore indiscernible with respect to that attribute. The set of all objects indiscernible with respect to the considered attributes is called an elementary set.

Let $B$ be a nonempty subset of the set of all attributes, i.e., $B \subseteq A$. The $B$-indiscernibility relation, denoted $\mathrm{ind}(B)$, defines $x$ and $y$ to be $B$-indiscernible as follows:

$(x, y) \in \mathrm{ind}(B) \iff f(x, a) = f(y, a)$ for every $a \in B$   (5)

Obviously $\mathrm{ind}(B)$ is an equivalence relation on $U$: $x$ and $y$ are $B$-indiscernible if they cannot be distinguished using only the subset $B$ of the attributes. The $B$-indiscernibility relation induces the $B$-elementary sets in $U$, and the family of all equivalence classes defined by $\mathrm{ind}(B)$ is denoted $U/\mathrm{ind}(B)$. A partition of the universe is thus generated by an indiscernibility relation: the universe is decomposed into blocks of indiscernible objects, i.e., elementary sets. For example, in Table 1 the single attribute $B = \{a_1\}$ induces two elementary sets, while the attribute pair $B = \{a_1, a_2\}$ induces three; hence $U/\mathrm{ind}(\{a_1, a_2\})$ consists of three blocks.

In order to discuss reducts we define the following. Let $P$ be a family of equivalence relations and $R \in P$. If

$\mathrm{ind}(P) = \mathrm{ind}(P - \{R\})$   (6)

holds, we say $R$ is redundant (dispensable) in $P$; otherwise $R$ is necessary (indispensable). If every $R \in P$ is necessary, $P$ is independent. For $Q \subseteq P$, if $Q$ is independent and $\mathrm{ind}(Q) = \mathrm{ind}(P)$, then $Q$ is a reduct of $P$.
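To illustrate equations (5) and (6), the following sketch (ours; the table values are hypothetical and are not those of Table 1) computes U/ind(B) and finds all reducts by exhaustive search; brute force is viable only for small attribute sets, in line with the NP-hardness noted in the introduction.

```python
from itertools import combinations

def ind_partition(table, attrs):
    """U/ind(B): group objects by their value vectors on attrs -- eq. (5)."""
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return set(map(frozenset, blocks.values()))

def reducts(table, attrs):
    """All minimal B with ind(B) = ind(A), by brute force -- eq. (6)."""
    full = ind_partition(table, attrs)
    found = []
    for k in range(1, len(attrs) + 1):
        for B in combinations(attrs, k):
            if ind_partition(table, B) == full and \
               not any(set(r) <= set(B) for r in found):
                found.append(B)
    return found

# Hypothetical information table: 4 objects, 3 condition attributes.
table = {1: {'a1': 0, 'a2': 0, 'a3': 1},
         2: {'a1': 0, 'a2': 1, 'a3': 0},
         3: {'a1': 1, 'a2': 0, 'a3': 0},
         4: {'a1': 1, 'a2': 1, 'a3': 1}}
print(reducts(table, ['a1', 'a2', 'a3']))
# [('a1', 'a2'), ('a1', 'a3'), ('a2', 'a3')]
```

Minimality is enforced by scanning candidate subsets in order of increasing size and rejecting any candidate that contains an already found reduct; independence then holds automatically.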

For the information system of Table 1 there is exactly one reduct $B$ of $A$ (e.g., see Table 2).

Table 2: Information system S reduced by B

Since $B \subseteq A$, $R_B$ is also an equivalence relation; it is the indiscernibility relation generated by $B$ and is denoted $\mathrm{ind}(B)$.

As for two knowledge bases $K = (U, P)$ and $K' = (U, Q)$: when $\mathrm{ind}(P) = \mathrm{ind}(Q)$ we say that $K$ and $K'$ are equivalent, denoted $K \cong K'$. For example, let $K = (U, A)$ present the information system shown in Table 1 and let $K' = (U, B)$ present the information system shown in Table 2; then $K \cong K'$.

2.4 Inconsistent Decision Tables

Definition 7 [7]: A decision table is a kind of information system in which the attributes are classified into conditions and a decision (e.g., in Table 3 three features define the conditions and one attribute, $d$, describes the decision).

Table 3: Consistent decision table

What is an inconsistent decision table? As noted above, a decision table is an information system $S = (U, C \cup \{d\})$ such that $C \cap \{d\} = \emptyset$, where $C$ and $\{d\}$ are nonempty sets; the elements of $C$ are called conditions, $d$ is called the decision, and the elements of $U$ are interpreted as objects. A decision table is said to be consistent if $\mathrm{ind}(C) \subseteq \mathrm{ind}(d)$ holds. In other words, any two objects with the same values of the conditions must have the same value of the decision; hence the decision values are determined by the condition values, or the decisions are consistent with the conditions. For example, Table 3 is a consistent decision table. Correspondingly, a decision table is said to be inconsistent if $\mathrm{ind}(C) \subseteq \mathrm{ind}(d)$ does not hold: two objects with the same values of the conditions may have different decision values. From this it is clear that inconsistent decision tables are more complex and more general than consistent ones; a consistent decision table is a special case of an inconsistent one. For example, Table 4 is an inconsistent decision table, because it contains objects with the same values of the conditions but different decision values.

Table 4: Inconsistent decision table
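The consistency test ind(C) ⊆ ind(d) is easy to mechanize: a table is consistent exactly when every condition class carries a single decision value. A small Python sketch (ours, with hypothetical data) follows.

```python
def is_consistent(table, conds, dec):
    """True iff ind(C) refines ind(d): each condition class carries
    exactly one decision value."""
    seen = {}
    for row in table.values():
        key = tuple(row[a] for a in conds)
        if seen.setdefault(key, row[dec]) != row[dec]:
            return False  # same conditions, different decisions
    return True

# Objects 1 and 2 agree on the conditions but disagree on d:
table = {1: {'a1': 0, 'a2': 1, 'd': 0},
         2: {'a1': 0, 'a2': 1, 'd': 1},
         3: {'a1': 1, 'a2': 0, 'd': 1}}
print(is_consistent(table, ['a1', 'a2'], 'd'))  # False: inconsistent
```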

2.5 Various Reduction Standards

Let $(L, \leq)$ be a poset. A function $D(\cdot/\cdot)$ is called an inclusion degree on $L$ if it satisfies: (a) $0 \leq D(y/x) \leq 1$; (b) $x \leq y$ implies $D(y/x) = 1$; (c) $x \leq y \leq z$ implies $D(x/z) \leq D(x/y)$.

Let $S = (U, C \cup \{d\})$ be a decision table, with $\mathrm{ind}(C)$ and $\mathrm{ind}(d)$ the equivalence relations derived from the condition attribute set and the decision attribute respectively. Denote

$U/C = \{X_1, X_2, \ldots, X_n\}$   (7)

$U/d = \{Y_1, Y_2, \ldots, Y_m\}$   (8)

$D(Y_j/X_i) = |Y_j \cap X_i| / |X_i|$   (9)

Then $D$ is an inclusion degree on the power set of $U$.

Definition 8: If a set satisfies some property and no proper subset of it satisfies that property, it is called a minimal set satisfying the property.

For $B \subseteq C$ and $x \in U$, denote:

$\mathrm{pos}_B(d) = \bigcup_{Y \in U/d} \underline{B}(Y)$   (10)

$\mu_B(x) = (D(Y_1/[x]_B), D(Y_2/[x]_B), \ldots, D(Y_m/[x]_B))$   (11)

$\gamma_B(x) = \{Y_j : D(Y_j/[x]_B) = \max_{1 \leq k \leq m} D(Y_k/[x]_B)\}$   (12)

$\delta_B(x) = \{Y_j : Y_j \cap [x]_B \neq \emptyset\}$   (13)

$\gamma(B) = |\mathrm{pos}_B(d)| / |U|$   (14)

$H(d \mid B) = -\sum_{X \in U/B} \frac{|X|}{|U|} \sum_{j=1}^{m} D(Y_j/X) \log D(Y_j/X)$   (15)

Here $\mathrm{pos}_B(d)$ is the positive region of $d$ with respect to $B$, $\mu_B$ is the distribution function, $\gamma_B$ is the maximum distribution function, and $\delta_B$ is called the generalized decision (distributive) function of the attribute set $B$ in the decision table; $\gamma(B)$ is the approximation quality and $H(d \mid B)$ is the conditional entropy.

Definition 9 [7]: Let $S = (U, C \cup \{d\})$ be a decision table and $B \subseteq C$.
a. If $\mathrm{pos}_B(d) = \mathrm{pos}_C(d)$, $B$ is called a positive region consistent set, and a minimal positive region consistent set is called a positive region reduct;
b. if $\mu_B(x) = \mu_C(x)$ for all $x \in U$, $B$ is called a distribution consistent set, and a minimal distribution consistent set is called a distribution reduct;
c. if $\gamma_B(x) = \gamma_C(x)$ for all $x \in U$, $B$ is called a maximum distribution consistent set, and a minimal maximum distribution consistent set is called a maximum distribution reduct;
d. if $\delta_B(x) = \delta_C(x)$ for all $x \in U$, $B$ is called a distributive consistent set, and a minimal distributive consistent set is called a distributive reduct;
e. if $\gamma(B) = \gamma(C)$, $B$ is called an approximate consistent set, and a minimal approximate consistent set is called an approximate reduct;
f. if $H(d \mid B) = H(d \mid C)$, $B$ is called an entropy consistent set, and a minimal entropy consistent set is called an entropy reduct.

2.6 Discernibility Matrix and Discernibility Function

Definition 10: Let $S = (U, C \cup \{d\})$ be the original decision table. $B \subseteq C$ is called a consistent set of $S$ under a given standard if $B$ keeps the corresponding property of the decision table [7]; the reduced decision table is then denoted $S_B = (U, B \cup \{d\})$.

Definition 11: The discernibility matrix of a decision table is an $n \times n$ matrix ($n = |U|$) whose element for the object pair $(x_i, x_j)$ is

$m_{ij} = m(x_i, x_j)$   (16)

The discernibility function of the decision table is defined as

$f_S = \bigwedge_{i,j:\, m_{ij} \neq \emptyset} \Big( \bigvee_{a \in m_{ij}} a \Big)$   (17)

If we regard the attributes as Boolean variables, the discernibility function is a Boolean formula, and all the conjuncts of the minimal disjunctive normal form of the discernibility function are exactly the reducts of $S$. The element $m(x_i, x_j)$ is defined differently under each reduction standard. Hu et al. [6] put forward:

$m(x_i, x_j) = \{a \in C : f(x_i, a) \neq f(x_j, a)\}$ if $f(x_i, d) \neq f(x_j, d)$; otherwise $m(x_i, x_j) = \emptyset$.   (18)

Prof. Ye Dongyi pointed out in his paper [5] that this reduction approach is wrong in some situations. Prof. Wang Guoyin [8] discussed Hu's approach, Ye's approach, the algebra view and the information view of rough set theory, and their relationships. He pointed out that all the reduction standards are identical on consistent decision tables, but on inconsistent decision tables they commonly give different results.
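The following sketch (ours; names and data are hypothetical) builds the nonempty entries of Hu's discernibility matrix of eq. (18) and applies the absorption law mentioned in the introduction to prune the conjunctive normal form of eq. (17).

```python
def hu_matrix(table, conds, dec):
    """Nonempty entries of Hu's discernibility matrix -- eq. (18)."""
    objs = sorted(table)
    entries = []
    for i, x in enumerate(objs):
        for y in objs[i + 1:]:
            if table[x][dec] != table[y][dec]:
                diff = frozenset(a for a in conds
                                 if table[x][a] != table[y][a])
                if diff:
                    entries.append(diff)
    return entries

def absorb(clauses):
    """Absorption law: drop any clause that strictly contains another,
    since p AND (p OR q) = p."""
    unique = set(clauses)
    return [c for c in unique if not any(d < c for d in unique)]
```

The discernibility function $f_S$ is then the conjunction of the surviving clauses, each clause being the disjunction of its attributes; expanding it into its minimal disjunctive normal form yields all the reducts under Hu's standard.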

3. YE'S APPROACH IS POSITIVE REGION REDUCTION

In papers [4, 7] the discernibility condition on object pairs for positive region reduction is given; the element of the discernibility matrix is

$m(x_i, x_j) = \{a \in C : f(x_i, a) \neq f(x_j, a)\}$ if $w(x_i, x_j)$ holds; otherwise $m(x_i, x_j) = \emptyset$,   (19)

where $w(x_i, x_j)$ satisfies

$w(x_i, x_j) \iff (x_i \in \mathrm{pos}_C(d) \wedge x_j \notin \mathrm{pos}_C(d)) \vee (x_i \notin \mathrm{pos}_C(d) \wedge x_j \in \mathrm{pos}_C(d)) \vee (x_i \in \mathrm{pos}_C(d) \wedge x_j \in \mathrm{pos}_C(d) \wedge f(x_i, d) \neq f(x_j, d))$   (20)

In papers [9, 10] it has been proved that, without changing the substance of positive region reduction, $w(x_i, x_j)$ can be transformed into $w'(x_i, x_j)$, where $w'(x_i, x_j)$ satisfies

$w'(x_i, x_j) \iff (x_i \in \mathrm{pos}_C(d) \vee x_j \in \mathrm{pos}_C(d)) \wedge f(x_i, d) \neq f(x_j, d)$   (21)

This conclusion is completely equivalent to the condition introduced by Ye in his paper [5], and their forms agree closely. The condition introduced by Ye, written in the notation of this paper, is

$f(x_i, d) \neq f(x_j, d) \wedge \min\{|\delta_C(x_i)|, |\delta_C(x_j)|\} = 1$   (22)

Now we only need to prove:

Lemma 1: $w'(x_i, x_j)$ is equivalent to $f(x_i, d) \neq f(x_j, d) \wedge \min\{|\delta_C(x_i)|, |\delta_C(x_j)|\} = 1$.

Proof: By the definitions of the positive region and of $\delta_C$, $x \in \mathrm{pos}_C(d)$ if and only if $[x]_C$ is contained in a single decision class, i.e., $|\delta_C(x)| = 1$. Hence from $x_i \in \mathrm{pos}_C(d)$ we get $|\delta_C(x_i)| = 1$, and likewise from $x_j \in \mathrm{pos}_C(d)$ we get $|\delta_C(x_j)| = 1$; in either case $\min\{|\delta_C(x_i)|, |\delta_C(x_j)|\} = 1$, and conversely. □

In fact there are still more reduction standards. The so-called algebra view in Wang's paper is the traditional positive region reduction standard. Such a standard guarantees that the certain rules are equivalent before and after reduction, but generally speaking the uncertain rules are not the same. The algebra view of rough sets also yields other reduction standards such as distribution reduction, approximate reduction, distributive reduction, and maximum distribution reduction. In papers [11-13] we discussed their relationships and their logic characteristics. Against this background, Hu's approach can simply be regarded as one more reduction standard, and the different reduction results are just the consequence of different standards. To clarify its meaning, we now analyze the property and the logic characteristic of Hu's approach.

4. PROPERTY OF HU'S APPROACH

Lemma 2: Let $B$ be a Hu consistent set of the decision table $S = (U, C \cup \{d\})$, i.e., a set that intersects every nonempty entry (18) of Hu's discernibility matrix. If $f(x, d) \neq f(y, d)$ and $(x, y) \notin \mathrm{ind}(C)$, then $(x, y) \notin \mathrm{ind}(B)$. Consequently $\delta_B(x) = \delta_C(x)$ for every $x \in U$.

Proof: If $f(x, d) \neq f(y, d)$ and $x, y$ lie in different $C$-classes, then by (18) the entry $m(x, y)$ is a nonempty set of condition attributes, and $B$ must intersect it, so $x$ and $y$ are $B$-discernible. A $B$-class is therefore a union of $C$-classes such that any pair of objects taken from two different constituent classes agree on the decision value; hence merging does not change the set of decision values met by any object, i.e., $\delta_B(x) = \delta_C(x)$. □

Theorem 2: Let $B$ be a Hu reduct of the decision table $S = (U, C \cup \{d\})$. If $x \notin \mathrm{pos}_C(d)$, i.e., $[x]_C$ lies in the boundary region, then $[x]_B = [x]_C$.

Proof: Clearly $[x]_C \subseteq [x]_B$. Suppose $[x]_B \neq [x]_C$; then there is $y \in [x]_B$ with $(x, y) \notin \mathrm{ind}(C)$. Since $x \notin \mathrm{pos}_C(d)$, the class $[x]_C$ meets at least two decision classes, so it contains some $u$ with $f(u, d) \neq f(y, d)$; $u$ and $y$ lie in different $C$-classes, so by Lemma 2 they are $B$-discernible, which contradicts $u, y \in [x]_B$. Hence $[x]_B = [x]_C$. □

Theorem 3: Let $B$ be a Hu reduct of the decision table $S = (U, C \cup \{d\})$. If $x \in \mathrm{pos}_C(d)$, then $f(y, d) = f(x, d)$ for every $y \in [x]_B$; consequently $\mathrm{pos}_B(d) = \mathrm{pos}_C(d)$.

Proof: We prove the first claim by contradiction. Suppose there is $y \in [x]_B$ with $f(y, d) \neq f(x, d)$. Since $x \in \mathrm{pos}_C(d)$, all objects of $[x]_C$ share the decision value $f(x, d)$, so $y \notin [x]_C$; by Lemma 2, $x$ and $y$ must then be $B$-discernible, contradicting $y \in [x]_B$. Hence every $y \in [x]_B$ has $f(y, d) = f(x, d)$, so $[x]_B \subseteq \mathrm{pos}_B(d)$; together with Theorem 2 this gives $\mathrm{pos}_B(d) = \mathrm{pos}_C(d)$. □

From Theorems 2 and 3, Hu reduction merges some equivalence classes inside the positive region but keeps the partition of the boundary region unchanged; in this sense Hu reduction can be called boundary region partition reduction. The sketch in Fig. 1 illustrates this. Fig. 1(a) shows the situation before reduction for a 4-class decision table. Theoretically, in the extreme situation, after boundary region partition reduction the lower approximation of each decision class may collapse into a single equivalence class, so the positive region is reduced to 4 classes at least (Fig. 1(c)); for an n-class decision table, in the extreme situation the positive region may be reduced to n classes at least. Of course, given the actual attribute values of a concrete decision table, such an extreme reduct is unlikely to occur; in general the reduct is as shown in Fig. 1(b).

Fig. 1: Boundary region partition reduction: (a) initial situation; (b) reduct in the general situation; (c) possible reduct in the extreme situation
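The characterization just proved suggests a direct computational check of Hu's standard (a sketch under our reading of Theorems 2 and 3; names and data are ours): B is Hu consistent exactly when the boundary blocks of U/ind(B) coincide with those of U/ind(C).

```python
def boundary_blocks(table, attrs, dec):
    """Blocks of U/ind(attrs) that meet more than one decision value,
    i.e. the partition of the boundary region."""
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return {frozenset(b) for b in blocks.values()
            if len({table[o][dec] for o in b}) > 1}

def is_hu_consistent(table, C, B, dec):
    """Hu's standard: B leaves the boundary region partition of C
    unchanged; positive region classes may only merge within one
    decision value (cf. Theorems 2 and 3)."""
    return boundary_blocks(table, B, dec) == boundary_blocks(table, C, dec)
```

A Hu reduct is then a Hu consistent set none of whose proper subsets is Hu consistent.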

5. RELATIONSHIPS AMONG SEVERAL REDUCTION STANDARDS FOR INCONSISTENT DECISION TABLES

5.1 Relationship between Boundary Region Partition and Positive Region Consistent Sets

The expression of $m(x_i, x_j)$ differs from one reduction standard to another. The element of the discernibility matrix of the boundary region partition reduction proposed by Hu et al. [6] is

$m(x_i, x_j) = \{a \in C : f(x_i, a) \neq f(x_j, a)\}$ if $f(x_i, d) \neq f(x_j, d)$; otherwise $m(x_i, x_j) = \emptyset$.   (23)

In papers [4, 7] the discernibility condition on object pairs for positive region reduction was given; the element of the discernibility matrix is

$m(x_i, x_j) = \{a \in C : f(x_i, a) \neq f(x_j, a)\}$ if $w'(x_i, x_j)$ holds; otherwise $m(x_i, x_j) = \emptyset$,   (24)

where $w'(x_i, x_j)$ satisfies

$w'(x_i, x_j) \iff (x_i \in \mathrm{pos}_C(d) \vee x_j \in \mathrm{pos}_C(d)) \wedge f(x_i, d) \neq f(x_j, d)$   (25)

It is apparent that any pair of objects discerned under positive region reduction is inevitably discerned under boundary region partition reduction, since condition (25) implies the condition of (23). So a boundary region partition consistent set must be a positive region consistent set. The relationship is given in Fig. 2.

Fig. 2: Relationship between the boundary region partition and the positive region of an inconsistent decision table

5.2 Relationship between Boundary Region Partition and Distributive Consistent Sets

Definition 12: Let $S = (U, C \cup \{d\})$ be a decision table and $B \subseteq C$. If $\delta_B(x) = \delta_C(x)$ for all $x \in U$, then $B$ is called a distributive consistent set; if $B$ is distributive consistent and no proper subset of $B$ is distributive consistent, $B$ is called a distributive reduct.

Theorem 4: Let $S = (U, C \cup \{d\})$ be a decision table. A boundary region partition consistent set must be a distributive consistent set.

Proof: We consider the two possible cases separately. (a) If an equivalence class after reduction lies in the positive region, i.e., $[x]_B \subseteq \mathrm{pos}_B(d)$, then by Theorem 3 all of its objects share one decision value, so $\delta_B(x) = \delta_C(x)$. (b) If an equivalence class after reduction lies in the boundary region, then by the definition of boundary region partition reduction the partition of the boundary region is kept unchanged, i.e., $[x]_B = [x]_C$, so again $\delta_B(x) = \delta_C(x)$. □

So a boundary region partition consistent set must even be a distributive consistent set.

In papers [11-14] we obtained the relationships among several other reduction standards; combining them with the relationships involving the boundary region partition reduction standard established above, the relationships among all the discussed consistent sets are summarized in Fig. 3.

Fig. 3: Relationship among all the discussed consistent sets of an inconsistent decision table (distribution (entropy), maximum distribution, distributive (approximate), positive region, boundary region partition)

6. LOGIC CHARACTERISTIC OF BOUNDARY REGION PARTITION REDUCTION

6.1 Rule Acquisition from a Decision Table

Let $S = (U, C \cup \{d\})$ be a decision table. From $S$ we can get decision rules of the form

Rule$(x)$: des$([x]_C) \to$ des$([x]_d)$,

where des$([x]_C)$, the description of the condition class of $x$ by its condition attribute values, is the premise of Rule$(x)$, and des$([x]_d)$ is its conclusion. If $x$ and $y$ belong to the same condition class, Rule$(x)$ and Rule$(y)$ have the same premise, so we can define the decision rules on the equivalence classes of objects. Let $U/C = \{X_1, \ldots, X_n\}$ be the partition of $U$ with regard to the condition attribute set $C$. For $X_i \in U/C$, denote $d(X_i) = \{f(x, d) : x \in X_i\} = \{v_1, \ldots, v_k\}$, the set of values of the decision attribute $d$ on the elements of $X_i$. Then $X_i$ yields the following $k$ decision rules:

Rule$(X_i, v_j)$: des$(X_i) \to (d = v_j)$, $j = 1, \ldots, k$.

The rule precision (rule confidence) $P(\mathrm{Rule}(X_i, v_j))$ is defined as

$P(\mathrm{Rule}(X_i, v_j)) = |X_i \cap Y_j| / |X_i|$   (26)

where $Y_j = \{x \in U : f(x, d) = v_j\}$. The rule precision of Rule$(X_i, v_j)$ is thus the proportion of the objects of $X_i$ whose decision attribute value is $v_j$. If $P(\mathrm{Rule}) = 1$, the rule is called a certain rule; if $P(\mathrm{Rule}) < 1$, the rule is called an uncertain rule.

6.2 Rule Acquisition Based on Boundary Region Partition Reduction

Theorem 5: Let $B$ be a boundary region partition reduct of $S = (U, C \cup \{d\})$. Then one certain rule induced from the reduced decision table $S_B = (U, B \cup \{d\})$ corresponds to several certain rules of the original decision table $S$.

Proof: Let $Z \in U/B$ be a class contained in the positive region. By Theorem 3, $Z$ is a union of $C$-classes, $Z = X_{i_1} \cup \cdots \cup X_{i_t}$, all of whose objects share one decision value $v$. The reduced decision table $S_B$ induces the single certain rule

Rule$(Z)$: des$_B(Z) \to (d = v)$,

while $S$ produces the corresponding certain rules

Rule$(X_{i_s})$: des$_C(X_{i_s}) \to (d = v)$, $s = 1, \ldots, t$.

These rules have the same conclusion as Rule$(Z)$; each premise des$_C(X_{i_s})$ agrees with the premise of Rule$(Z)$ on the attributes of $B$ and additionally includes the values of the attributes of $C - B$. Hence one certain rule of $S_B$ corresponds to several certain rules of the original decision table $S$. □

Theorem 6: Let $B$ be a boundary region partition reduct of $S = (U, C \cup \{d\})$. Then the uncertain rules induced from the reduced decision table $S_B = (U, B \cup \{d\})$ correspond one-to-one to the uncertain rules of the original decision table $S$.

Proof: By Theorem 2, if $x \notin \mathrm{pos}_C(d)$ then $[x]_B = [x]_C$. Suppose $d([x]_C) = \{v_1, \ldots, v_k\}$ with $k \geq 2$. The reduced decision table yields the $k$ uncertain rules Rule$([x]_B, v_j)$: des$_B([x]_B) \to (d = v_j)$, with precision $|[x]_B \cap Y_j| / |[x]_B|$, where $Y_j = \{y \in U : f(y, d) = v_j\}$. The original decision table yields the $k$ uncertain rules Rule$([x]_C, v_j)$: des$_C([x]_C) \to (d = v_j)$ with exactly the same precisions, since $[x]_B = [x]_C$. The correspondence is therefore one-to-one. □
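Equation (26) and the certain/uncertain distinction translate directly into code; the sketch below (ours, with hypothetical data) reads off all the rules of a decision table together with their precisions.

```python
def induce_rules(table, conds, dec):
    """One rule per (condition class, decision value) pair, with the
    precision |X_i ∩ Y_j| / |X_i| of eq. (26)."""
    classes = {}
    for obj, row in table.items():
        classes.setdefault(tuple((a, row[a]) for a in conds), []).append(obj)
    rules = []
    for premise, block in classes.items():
        for v in sorted({table[o][dec] for o in block}):
            prec = sum(table[o][dec] == v for o in block) / len(block)
            rules.append((dict(premise), v, prec))  # prec == 1.0: certain
    return rules

table = {1: {'a1': 0, 'd': 0}, 2: {'a1': 0, 'd': 1},
         3: {'a1': 1, 'd': 1}, 4: {'a1': 1, 'd': 1}}
for premise, v, p in induce_rules(table, ['a1'], 'd'):
    kind = 'certain' if p == 1.0 else 'uncertain'
    print(premise, '->', v, f'(precision {p:.2f}, {kind})')
```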

6.3 Illustrative Example

Example 1: Consider the decision table $S$ shown in Table 5, where $U = \{x_1, x_2, \ldots, x_6\}$ is the object set, $C$ is the condition attribute set, and $d$ is the decision attribute.

Table 5: Decision table S

From $S$ we can get four uncertain rules, Rule$([x_1])$, Rule$([x_2])$, Rule$([x_3])$ and Rule$([x_4])$, and two certain rules, Rule$([x_5])$ and Rule$([x_6])$.

By computing the discernibility matrix and the discernibility function we obtain the boundary region partition reduct $B$ of $S$. From the reduced decision table $S_B = (U, B \cup \{d\})$ we can get the same four uncertain rules and a single certain rule Rule$(X)$. It is obvious that the reduced decision table produces the same uncertain rules as the original decision table; as for the certain rules, Rule$(X)$ is merged from $\{\mathrm{Rule}([x_i]) : i = 5, 6\}$.

7. CONCLUSIONS

In this paper we find by analysis that the reduction approach introduced by Ye Dongyi is virtually positive region reduction, while the reduction approach introduced by Hu Xiaohua et al. keeps the partition of the boundary region unchanged; the two reduction approaches thus simply use different reduction standards. We then analyze the relationships among several reduction standards and obtain two results: (1) a boundary region partition consistent set must be a positive region consistent set; (2) a boundary region partition consistent set must even be a distributive consistent set. Finally, by analyzing the logic characteristic of the boundary region partition reduction standard, we conclude that one certain rule produced from the boundary region partition reduced decision table corresponds to several certain rules of the original decision table, while each uncertain rule produced from the reduced decision table corresponds to exactly one uncertain rule of the original decision table.

ACKNOWLEDGEMENTS

This work was partially supported by the Zhejiang province major (priority subjects) key industrial project (Grant No: 8C), the National Nature Science Foundation of China (Grant No: 673 687534), the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 6637), and the Provincial Nature Science Foundation of Zhejiang (Grant No: LYA9, LYF9).

REFERENCES

[1] Z. Pawlak (1982). Rough sets. International Journal of Computer and Information Sciences, 11, 341-356.

[2] Z. Pawlak (1991). Rough Sets: Theoretical Aspects of Reasoning about Data. Boston: Kluwer Academic Publishers.

[3] Zhang Wenxiu, Wu Weizhi, Liang Jiye, Li Deyu (2001). Rough Set Theory and Approach. Beijing: Science Press, 58-86.

[4] A. Skowron, C. Rauszer (1992). The discernibility matrices and functions in information systems. In: R. Slowinski (Ed.), Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory. Dordrecht: Kluwer Academic Publishers, 331-362.

[5] Ye Dongyi, Chen Zhaojiong (2002). A new discernibility matrix and the computation of a core. Acta Electronica Sinica, 30, 1086-1088.

[6] Hu Xiaohua, N. Cercone (1995). Learning in relational databases: a rough set approach. Computational Intelligence, 11, 323-337.

[7] Zhang Wenxiu, Liang Yi, Wu Weizhi (2003). Information System and Knowledge Discovery. Beijing: Science Press, 48.

[8] Wang Guoyin (2003). Calculation methods for core attributes of decision table. Chinese Journal of Computers, 26, 611-615.

[9] Du Weifeng, Qin Keyun (2006). The improvement to the condition in the discernibility function of the positive reduct of a decision table. Computer Engineering and Applications, 42.

[10] Du Weifeng (2006). Application of Rough Set Theory in Chinese Text Categorization. Doctoral Dissertation, Southwest Jiaotong University.

[11] Du Weifeng, Qin Keyun (2005). The relationship of positive domain reduction to other reductions of inconsistent decision tables. Journal of Hainan Normal College, 18.

[12] Du Weifeng, Yang Li (2008). A brief analysis about the basic reduction standards of decision table. In: Computational Intelligence in Decision and Control (World Scientific Proceedings Series on Computer Engineering and Information Science), Proceedings of the 8th International FLINS Conference, 575-580.

[13] Qin Keyun, Du Weifeng (2006). The logic characteristic of knowledge reduction. Computer Engineering and Applications, 42.

[14] Du Weifeng, Qin Keyun (2008). A brief analysis about the boundary region partition reduction standard. Computer Engineering and Applications, 44.

[15] Qin Keyun, Pei Zheng, Du Weifeng (2005). The relationship among several knowledge reduction approaches. In: Fuzzy Systems and Knowledge Discovery: Second International Conference, FSKD 2005, Proceedings, Part I, Lecture Notes in Artificial Intelligence, vol. 3613. Berlin: Springer.

[16] Bernard Kolman, Robert C. Busby, Sharon Cutler Ross (2005). Discrete Mathematical Structures (Fifth Edition). Beijing: Higher Education Press.