High Frequency Rough Set Model based on Database Systems


Kartik Vaithyanathan (kvaithya@gmail.com)
T. Y. Lin (tylin@cs.sjsu.edu)
Department of Computer Science, San Jose State University, San Jose, CA 94403, USA

Abstract - Rough sets theory was proposed by Pawlak in the 1980s and has been applied successfully in many domains. One of the key concepts of the rough sets model is the computation of core and reduct. It has been shown that finding the minimal reduct is an NP-hard problem, and its computational complexity has implicitly restricted effective applications to small and clean data sets. In order to improve the efficiency of computing core attributes and reducts, many novel approaches have been developed, some of which attempt to integrate database technologies. This paper proposes a novel approach to computing reducts, called high frequency value reducts, using database system concepts. The method deals directly with generating value reducts and also prunes the decision table by placing a lower bound on the frequency of equivalence values in the decision table.

I. INTRODUCTION

Rough sets theory was proposed by Pawlak [8,9] in the 1980s and has been applied successfully in many domains. One of the key concepts of the rough sets model is the computation of core and reduct. Multiple approaches to improving the efficiency of finding core attributes and reducts have been developed [2], including the algorithms presented in [5], which largely improve the generation of the discernibility relation by sorting the objects. Some authors have proposed approaches to reduce data size using relational database system techniques [4] and have developed rough-set based data mining systems that integrate RDBMS capabilities [3]. Another approach redefined concepts of rough set theory such as core attributes and reducts by leveraging set-oriented database operations [7]. The current approach extends the extraction of interconnected Pawlak information systems of various sizes [1] while leveraging existing relational database concepts and operations. An in-depth example illustrating the nuances of this approach is also provided.

II. APPROACH

A decision table such as the one shown in Table I may have more than one value reduct. Any one of them can be used to replace the original table. Finding all the value reducts by eliminating unnecessary attributes from a decision table is NP-hard [6]. Attributes that are redundant given other attributes are perceived as unnecessary. Table I shows a database table of 12 cars with information about the Weight, Door, Size, Cylinder and Mileage. Weight, Door, Size and Cylinder are the condition attributes (represented as C) and Mileage is the decision attribute (represented as D). The attribute Tuple_ID is provided for theoretical understanding only.

TABLE I
12 CARS WITH ATTRIBUTES WEIGHT, DOOR, SIZE, CYLINDER AND MILEAGE

Tuple_ID  Weight  Door  Size     Cylinder  Mileage
t1        low     2     compact  4         high
t2        low     4     sub      6         low
t3        medium  4     compact  4         high
t4        high    2     compact  6         low
t5        high    4     compact  4         low
t6        low     4     compact  4         high
t7        high    4     sub      6         low
t8        low     2     sub      6         low
t9        medium  2     compact  4         high
t10       medium  4     sub      4         high
t11       medium  2     compact  4         low
t12       medium  4     sub      4         low
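The decision table maps directly onto a single relational table. The following is a minimal sketch that loads Table I into a table named CARS; the table name and column types are illustrative assumptions, not part of the original formulation, and the later SQL statements in this paper can be tried against it.

    -- Hypothetical relational encoding of Table I (name and types are assumptions).
    CREATE TABLE CARS (
        Tuple_ID  VARCHAR(4),   -- kept for readability only, as in Table I
        Weight    VARCHAR(8),   -- condition attribute
        Door      INT,          -- condition attribute
        Size      VARCHAR(8),   -- condition attribute (may need quoting in some dialects)
        Cylinder  INT,          -- condition attribute
        Mileage   VARCHAR(8)    -- decision attribute
    );

    INSERT INTO CARS VALUES
        ('t1',  'low',    2, 'compact', 4, 'high'),
        ('t2',  'low',    4, 'sub',     6, 'low'),
        ('t3',  'medium', 4, 'compact', 4, 'high'),
        ('t4',  'high',   2, 'compact', 6, 'low'),
        ('t5',  'high',   4, 'compact', 4, 'low'),
        ('t6',  'low',    4, 'compact', 4, 'high'),
        ('t7',  'high',   4, 'sub',     6, 'low'),
        ('t8',  'low',    2, 'sub',     6, 'low'),
        ('t9',  'medium', 2, 'compact', 4, 'high'),
        ('t10', 'medium', 4, 'sub',     4, 'high'),
        ('t11', 'medium', 2, 'compact', 4, 'low'),
        ('t12', 'medium', 4, 'sub',     4, 'low');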
The traditional approaches (including [7]) perform a two-step process in identifying the value reducts: the first step involves obtaining the minimal attribute reducts by eliminating unnecessary attributes without sacrificing the accuracy of the classification model, and the second step is to generate the value reducts for each attribute reduct. Most approaches to computing core and reduct assume that (a) a tuple in the decision table always contributes to the classification model and is not an outlier, and (b) all tuples in the decision table are consistent.

There are two aspects to the new approach, explained below. The first aspect states that only tuples that occur above a certain lower bound threshold contribute to the classification model (decision). As a result, only high frequency rules that contribute to a decision are short-listed. The trivial case of the lower bound (= 1) is equivalent to the traditional approaches to computing value reducts. The high frequency rule prunes the decision table data and is the first key differentiator of this approach. An algorithm is outlined to ensure that the decision table is consistent before applying the proposed high frequency rule to the tuples in the decision table (a SQL sketch of such a check is shown below). The second novel step in this approach is to directly generate the value reducts instead of a two-step process of first identifying the attribute reducts and then subsequently generating the value reducts.
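As one illustration of the consistency check mentioned above, the sketch below (assuming the hypothetical CARS encoding of Table I) flags the tuples that agree with some other tuple on all four condition attributes but disagree on Mileage.

    -- Sketch: find tuples that make the full decision table inconsistent
    -- (same Weight, Door, Size, Cylinder as another tuple, but a different Mileage).
    SELECT *
    FROM CARS c1
    WHERE EXISTS (
        SELECT 1
        FROM CARS c2
        WHERE c2.Weight   = c1.Weight
          AND c2.Door     = c1.Door
          AND c2.Size     = c1.Size
          AND c2.Cylinder = c1.Cylinder
          AND c2.Mileage <> c1.Mileage
    );
    -- For Table I this returns t9 and t11 (medium, 2, compact, 4) and
    -- t10 and t12 (medium, 4, sub, 4), which share condition attribute
    -- values but have different Mileage values.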

III. CONSISTENT HIGH FREQUENCY DECISION RULES

Given a decision table with m rows and n condition attributes, the number of possible decision rules is m(2^n - 1). The high frequency pruning will eliminate some of the decision rules. Every decision rule can then be analyzed to determine the existence of a value reduct (minimal decision rule). The generation of a decision rule set DR(X,D) can be expressed using SQL statements of the form

    SELECT * FROM (SELECT X, D FROM T)                                  (1)

where X is a subset of C. There are 2^n - 1 possible values of X. There are m rows in each decision rule set. The inconsistent tuples in a decision rule set DR(X,D) are obtained using the following SQL:

    SELECT * FROM DR(X,D) DR1
    WHERE EXISTS (SELECT * FROM DR(X,D) DR2
                  WHERE (DR1.X = DR2.X) AND (DR1.D != DR2.D))           (2)

A consistent decision rule set DR(X,D) is obtained by removing the tuples in DR1(X,D) above from the original set DR(X,D) in (1):

    SELECT * FROM DR(X,D)
    MINUS
    SELECT * FROM DR1(X,D)                                              (3)

The running time for (2) is O(m^2) and for (3) is O(m), where m is the number of tuples (rows) in the decision table. The overall running time for weeding out inconsistent tuples is O(m^2). The high frequency pruning is executed on the consistent decision rule set DR(X,D) from the previous step and can be expressed in SQL as

    SELECT X, D, COUNT(*) AS Frequency
    FROM DR(X,D)
    GROUP BY X, D
    HAVING COUNT(*) >= MIN_FREQ                                         (4)

where MIN_FREQ represents the minimum value of the high frequency rule. The sorting process (i.e. the GROUP BY) takes O(m log m) time and the counting and pruning takes O(m) time, where m is the number of tuples (rows) in the decision table. Thus, the running time for the high frequency pruning is O(m log m). The worst-case running time for obtaining a consistent, high frequency decision rule set DR(X,D) is therefore a polynomial function of the number of rows (m) in the decision table. The overall running time for the creation of all possible consistent, high frequency decision rules for a decision table is O(2^n * m^2), where m is the number of rows and n is the number of attributes (columns) in the decision table.
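To make statements (1)-(4) concrete, the following sketch runs the whole pipeline for the single subset X = {Weight, Door} against the hypothetical CARS table introduced in Section II, with MIN_FREQ = 2. EXCEPT is used where the text uses MINUS (a dialect choice), and Tuple_ID is carried through so that the set difference does not collapse duplicate rows before they are counted.

    -- Sketch: consistent high frequency decision rules for X = {Weight, Door}.
    WITH DR AS (
        SELECT Tuple_ID, Weight, Door, Mileage FROM CARS        -- statement (1)
    ),
    CONSISTENT AS (                                             -- statements (2) and (3)
        SELECT * FROM DR
        EXCEPT
        SELECT * FROM DR d1
        WHERE EXISTS (SELECT 1 FROM DR d2
                      WHERE d2.Weight  = d1.Weight
                        AND d2.Door    = d1.Door
                        AND d2.Mileage <> d1.Mileage)
    )
    SELECT Weight, Door, Mileage, COUNT(*) AS Frequency         -- statement (4)
    FROM CONSISTENT
    GROUP BY Weight, Door, Mileage
    HAVING COUNT(*) >= 2;                                       -- MIN_FREQ = 2
    -- For Table I only the group (high, 4, low) with Frequency 2 survives,
    -- matching decision rule set (2.1) in Section V.
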
IV. VALUE REDUCTS

The goal is to find the value reducts for each tuple in a decision rule set DR(X,D). If DR(X,D) is a k-attribute decision rule set (k <= n), there are 2^k - 2 shorter decision rules for each tuple in the decision rule set, each formed from a proper, non-empty subset of the k attribute values of the original tuple. Each of these decision rules is analyzed for consistency, and one or more minimal decision rules (consistent decision rules with the least number of attributes) are chosen for every tuple; these comprise the value reducts for that tuple. The generation of value reducts is explained in detail with an example in Section V.

V. ILLUSTRATIVE EXAMPLE

The representative example in Table I is broken down into its subsets along with the high frequency values for illustration below. Each subset is analyzed for two high frequencies: (a) MIN_FREQ = 1, which is equivalent to the traditional approach to computing value reducts, and (b) MIN_FREQ = 2, to illustrate the high frequency pruning applied to computing value reducts. All tuples that are inconsistent or do not meet the high frequency criterion are discarded. The inconsistent tuples are noted below the respective decision rule set. The value reducts are also provided for each of these decision rule sets.

TABLE II
1-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT}, {DOOR}, {SIZE}, AND {CYLINDER} WITH MIN_FREQ = 1 AND MIN_FREQ = 2

(1.1) Decision Rules

Tuple_ID     Weight   Mileage  Frequency
t1, t6       low      high     2
t2, t8       low      low      2
t3, t9, t10  medium   high     3
t4, t5, t7   high     low      3
t11, t12     medium   low      2

(1.2) Decision Rules

Tuple_ID         Door  Mileage  Frequency
t1, t9           2     high     2
t2, t5, t7, t12  4     low      4
t3, t6, t10      4     high     3
t4, t8, t11      2     low      3

(1.3) Decision Rules

Tuple_ID         Size     Mileage  Frequency
t1, t3, t6, t9   compact  high     4
t2, t7, t8, t12  sub      low      4
t4, t5, t11      compact  low      3
t10              sub      high     1

(1.4) Decision Rules

Tuple_ID             Cylinder  Mileage  Frequency
t1, t3, t6, t9, t10  4         high     5
t2, t4, t7, t8       6         low      4
t5, t11, t12         4         low      3

TABLE III
2-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT, DOOR}, {WEIGHT, SIZE}, {WEIGHT, CYLINDER}, {DOOR, SIZE}, {DOOR, CYLINDER}, {SIZE, CYLINDER} WITH MIN_FREQ = 1 AND MIN_FREQ = 2

(2.1) Decision Rules

MIN_FREQ = 1 (9 possible decision rules)

Tuple_ID  Weight  Door  Mileage
t1        low     2     high
t2        low     4     low
t3        medium  4     high
t4        high    2     low
t5        high    4     low
t6        low     4     high
t7        high    4     low
t8        low     2     low
t9        medium  2     high
t10       medium  4     high
t11       medium  2     low
t12       medium  4     low

MIN_FREQ = 2 (3 possible decision rules)

Tuple_ID  Weight  Door  Mileage  Frequency
t1        low     2     high     1
t2        low     4     low      1
t3, t10   medium  4     high     2
t4        high    2     low      1
t5, t7    high    4     low      2
t6        low     4     high     1
t8        low     2     low      1
t9        medium  2     high     1
t11       medium  2     low      1
t12       medium  4     low      1

{t1, t8}, {t2, t6}, {t9, t11} and {{t3, t10}, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({high, 4}(W,D) → {low}(M))
    Minimal Decision Rules: ({high}(W) → {low}(M)), ({4}(D) → {low}(M))
    [{4}(D) could imply {high}(M), hence not a value reduct]

(2.2) Decision Rules

MIN_FREQ = 1 (21 possible decision rules)

Tuple_ID  Weight  Size     Mileage
t1        low     compact  high
t2        low     sub      low
t3        medium  compact  high
t4        high    compact  low
t5        high    compact  low
t6        low     compact  high
t7        high    sub      low
t8        low     sub      low
t9        medium  compact  high
t10       medium  sub      high
t11       medium  compact  low
t12       medium  sub      low

Value Reduct: ({low, compact}(W,S) → {high}(M)), ({low, sub}(W,S) → {low}(M)), ({high}(W) → {low}(M))

MIN_FREQ = 2 (9 possible decision rules)

Tuple_ID  Weight  Size     Mileage  Frequency
t1, t6    low     compact  high     2
t2, t8    low     sub      low      2
t3, t9    medium  compact  high     2
t4, t5    high    compact  low      2
t7        high    sub      low      1
t10       medium  sub      high     1
t11       medium  compact  low      1
t12       medium  sub      low      1

{{t3, t9}, t11} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, compact}(W,S) → {high}(M))
    Minimal Decision Rules: ({low}(W) → {high}(M)), ({compact}(S) → {high}(M))
    Both are not valid [{low}(W) could imply {low}(M) and {compact}(S) could imply {low}(M)]
    Value Reduct: ({low, compact}(W,S) → {high}(M))

(b) Consistent Decision Rule: ({low, sub}(W,S) → {low}(M))
    Minimal Decision Rules: ({low}(W) → {low}(M)), ({sub}(S) → {low}(M))
    Both are not valid [{low}(W) could imply {high}(M) and {sub}(S) could imply {high}(M)]
    Value Reduct: ({low, sub}(W,S) → {low}(M))

(c) Consistent Decision Rule: ({high, compact}(W,S) → {low}(M))
    Minimal Decision Rules: ({high}(W) → {low}(M)), ({compact}(S) → {low}(M))
    [{compact}(S) could imply {high}(M), hence not a value reduct]

(2.3) Decision Rules

MIN_FREQ = 1 (21 possible decision rules)

Tuple_ID  Weight  Cylinder  Mileage
t1        low     4         high
t2        low     6         low
t3        medium  4         high
t4        high    6         low
t5        high    4         low
t6        low     4         high
t7        high    6         low
t8        low     6         low
t9        medium  4         high
t10       medium  4         high
t11       medium  4         low
t12       medium  4         low

Value Reduct: ({low, 4}(W,C) → {high}(M)), ({6}(C) → {low}(M)), ({high}(W) → {low}(M))

MIN_FREQ = 2 (9 possible decision rules)

Tuple_ID     Weight  Cylinder  Mileage  Frequency
t1, t6       low     4         high     2
t2, t8       low     6         low      2
t3, t9, t10  medium  4         high     3
t4, t7       high    6         low      2
t5           high    4         low      1
t11, t12     medium  4         low      2

{{t3, t9, t10}, {t11, t12}} is eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, 4}(W,C) → {high}(M))
    Minimal Decision Rules: ({low}(W) → {high}(M)), ({4}(C) → {high}(M))
    Both are not valid.
    Value Reduct: ({low, 4}(W,C) → {high}(M))

(b) Consistent Decision Rule: ({low, 6}(W,C) → {low}(M))
    Minimal Decision Rules: ({low}(W) → {low}(M)), ({6}(C) → {low}(M))

(c) Consistent Decision Rule: ({high, 6}(W,C) → {low}(M))
    Minimal Decision Rules: ({high}(W) → {low}(M)), ({6}(C) → {low}(M))

(2.4) Decision Rules

MIN_FREQ = 1 (3 possible decision rules)

Tuple_ID  Door  Size     Mileage
t1        2     compact  high
t2        4     sub      low
t3        4     compact  high
t4        2     compact  low
t5        4     compact  low
t6        4     compact  high
t7        4     sub      low
t8        2     sub      low
t9        2     compact  high
t10       4     sub      high
t11       2     compact  low
t12       4     sub      low

Value Reduct: ({2, sub}(D,S) → {low}(M))

MIN_FREQ = 2 (no possible decision rules)

Tuple_ID     Door  Size     Mileage  Frequency
t1, t9       2     compact  high     2
t2, t7, t12  4     sub      low      3
t3, t6       4     compact  high     2
t4, t11      2     compact  low      2
t5           4     compact  low      1
t8           2     sub      low      1
t10          4     sub      high     1

{{t1, t9}, {t4, t11}}, {{t2, t7, t12}, t10}, {{t3, t6}, t5} are eliminated due to inconsistency.

(2.5) Decision Rules

MIN_FREQ = 1 (12 possible decision rules)

Tuple_ID  Door  Cylinder  Mileage
t1        2     4         high
t2        4     6         low
t3        4     4         high
t4        2     6         low
t5        4     4         low
t6        4     4         high
t7        4     6         low
t8        2     6         low
t9        2     4         high
t10       4     4         high
t11       2     4         low
t12       4     4         low

MIN_FREQ = 2 (4 possible decision rules)

Tuple_ID     Door  Cylinder  Mileage  Frequency
t1, t9       2     4         high     2
t2, t7       4     6         low      2
t3, t6, t10  4     4         high     3
t4, t8       2     6         low      2
t5, t12      4     4         low      2
t11          2     4         low      1

{{t1, t9}, t11}, {{t3, t6, t10}, {t5, t12}} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({4, 6}(D,C) → {low}(M))
    Minimal Decision Rules: ({4}(D) → {low}(M)), ({6}(C) → {low}(M))

(b) Consistent Decision Rule: ({2, 6}(D,C) → {low}(M))
    Minimal Decision Rules: ({2}(D) → {low}(M)), ({6}(C) → {low}(M))

(2.6) Decision Rules

MIN_FREQ = 1 (15 possible decision rules)

Tuple_ID  Size     Cylinder  Mileage
t1        compact  4         high
t2        sub      6         low
t3        compact  4         high
t4        compact  6         low
t5        compact  4         low
t6        compact  4         high
t7        sub      6         low
t8        sub      6         low
t9        compact  4         high
t10       sub      4         high
t11       compact  4         low
t12       sub      4         low

Value Reduct: ..., ({compact, 4}(S,C) → {low}(M))

MIN_FREQ = 2 (3 possible decision rules)

Tuple_ID        Size     Cylinder  Mileage  Frequency
t1, t3, t6, t9  compact  4         high     4
t2, t7, t8      sub      6         low      3
t4              compact  6         low      1
t5              compact  4         low      1
t10             sub      4         high     1
t11             compact  4         low      1
t12             sub      4         low      1

{{t1, t3, t6, t9}, t11}, {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({sub, 6}(S,C) → {low}(M))
    Minimal Decision Rules: ({sub}(S) → {low}(M)), ({6}(C) → {low}(M))

TABLE IV
3-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT, DOOR, SIZE}, {DOOR, SIZE, CYLINDER}, {WEIGHT, SIZE, CYLINDER}, {WEIGHT, DOOR, CYLINDER} WITH MIN_FREQ = 1 AND MIN_FREQ = 2

(3.1) Decision Rules

MIN_FREQ = 1 (56 possible decision rules)

Tuple_ID  Weight  Door  Size     Mileage
t1        low     2     compact  high
t2        low     4     sub      low
t3        medium  4     compact  high
t4        high    2     compact  low
t5        high    4     compact  low
t6        low     4     compact  high
t7        high    4     sub      low
t8        low     2     sub      low
t9        medium  2     compact  high
t10       medium  4     sub      high
t11       medium  2     compact  low
t12       medium  4     sub      low

Value Reduct: ({low, compact}(W,S) → {high}(M)), ({low, sub}(W,S) → {low}(M)), ({medium, 4, compact}(W,D,S) → {high}(M)), ({high}(W) → {low}(M)), ({2, sub}(D,S) → {low}(M))

MIN_FREQ = 2 (no possible decision rules)

Tuple_ID  Weight  Door  Size     Mileage  Frequency
t1        low     2     compact  high     1
t2        low     4     sub      low      1
t3        medium  4     compact  high     1
t4        high    2     compact  low      1
t5        high    4     compact  low      1
t6        low     4     compact  high     1
t7        high    4     sub      low      1
t8        low     2     sub      low      1
t9        medium  2     compact  high     1
t10       medium  4     sub      high     1
t11       medium  2     compact  low      1
t12       medium  4     sub      low      1

{t9, t11} and {t10, t12} are eliminated due to inconsistency.

(3.2) Decision Rules

MIN_FREQ = 1 (28 possible decision rules)

Tuple_ID  Door  Size     Cylinder  Mileage
t1        2     compact  4         high
t2        4     sub      6         low
t3        4     compact  4         high
t4        2     compact  6         low
t5        4     compact  4         low
t6        4     compact  4         high
t7        4     sub      6         low
t8        2     sub      6         low
t9        2     compact  4         high
t10       4     sub      4         high
t11       2     compact  4         low
t12       4     sub      4         low

Value Reduct: ({2, sub}(D,S) → {low}(M)), ({6}(C) → {low}(M))

MIN_FREQ = 2 (7 possible decision rules)

Tuple_ID  Door  Size     Cylinder  Mileage  Frequency
t1, t9    2     compact  4         high     2
t2, t7    4     sub      6         low      2
t3, t6    4     compact  4         high     2
t4        2     compact  6         low      1
t5        4     compact  4         low      1
t8        2     sub      6         low      1
t10       4     sub      4         high     1
t11       2     compact  4         low      1
t12       4     sub      4         low      1

{{t1, t9}, t11}, {{t3, t6}, t5} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({4, sub, 6}(D,S,C) → {low}(M))
    Minimal Decision Rules: ({4, sub}(D,S) → {low}(M)), ({sub, 6}(S,C) → {low}(M)), ({4, 6}(D,C) → {low}(M))
    Minimal Decision Rules: ({sub}(S) → {low}(M)), ({6}(C) → {low}(M)), ({4}(D) → {low}(M)), ({6}(C) → {low}(M))

(3.3) Decision Rules

MIN_FREQ = 1 (49 possible decision rules)

Tuple_ID  Weight  Size     Cylinder  Mileage
t1        low     compact  4         high
t2        low     sub      6         low
t3        medium  compact  4         high
t4        high    compact  6         low
t5        high    compact  4         low
t6        low     compact  4         high
t7        high    sub      6         low
t8        low     sub      6         low
t9        medium  compact  4         high
t10       medium  sub      4         high
t11       medium  compact  4         low
t12       medium  sub      4         low

Value Reduct: ({low, compact}(W,S) → {high}(M)), ({low, 4}(W,C) → {high}(M)), ({low, sub}(W,S) → {low}(M)), ({6}(C) → {low}(M)), ({high}(W) → {low}(M))

MIN_FREQ = 2 (14 possible decision rules)

Tuple_ID  Weight  Size     Cylinder  Mileage  Frequency
t1, t6    low     compact  4         high     2
t2, t8    low     sub      6         low      2
t3, t9    medium  compact  4         high     2
t4        high    compact  6         low      1
t5        high    compact  4         low      1
t7        high    sub      6         low      1
t10       medium  sub      4         high     1
t11       medium  compact  4         low      1
t12       medium  sub      4         low      1

{{t3, t9}, t11} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, compact, 4}(W,S,C) → {high}(M))
    Minimal Decision Rules: ({low, compact}(W,S) → {high}(M)), ({compact, 4}(S,C) → {high}(M)), ({low, 4}(W,C) → {high}(M))
    Minimal Decision Rules: ({low}(W) → {high}(M)), ({compact}(S) → {high}(M)), ({low}(W) → {high}(M)), ({4}(C) → {high}(M))

    Value Reduct: ({low, compact}(W,S) → {high}(M)), ({low, 4}(W,C) → {high}(M))

(b) Consistent Decision Rule: ({low, sub, 6}(W,S,C) → {low}(M))
    Minimal Decision Rules: ({low, sub}(W,S) → {low}(M)), ({sub, 6}(S,C) → {low}(M)), ({low, 6}(W,C) → {low}(M))
    Minimal Decision Rules: ({low}(W) → {low}(M)), ({sub}(S) → {low}(M)), ({sub}(S) → {low}(M)), ({6}(C) → {low}(M)), ({low}(W) → {low}(M)), ({6}(C) → {low}(M))
    Value Reduct: ({low, sub}(W,S) → {low}(M)), ({6}(C) → {low}(M))

(3.4) Decision Rules

MIN_FREQ = 1 (49 possible decision rules)

Tuple_ID  Weight  Door  Cylinder  Mileage
t1        low     2     4         high
t2        low     4     6         low
t3        medium  4     4         high
t4        high    2     6         low
t5        high    4     4         low
t6        low     4     4         high
t7        high    4     6         low
t8        low     2     6         low
t9        medium  2     4         high
t10       medium  4     4         high
t11       medium  2     4         low
t12       medium  4     4         low

Value Reduct: ({low, 4}(W,C) → {high}(M)), ({6}(C) → {low}(M)), ({high}(W) → {low}(M))

MIN_FREQ = 2 (no possible decision rules)

Tuple_ID  Weight  Door  Cylinder  Mileage  Frequency
t1        low     2     4         high     1
t2        low     4     6         low      1
t3, t10   medium  4     4         high     2
t4        high    2     6         low      1
t5        high    4     4         low      1
t6        low     4     4         high     1
t7        high    4     6         low      1
t8        low     2     6         low      1
t9        medium  2     4         high     1
t11       medium  2     4         low      1
t12       medium  4     4         low      1

{{t3, t10}, t12} and {t9, t11} are eliminated due to inconsistency.

The final list of all value reducts for Table I is documented as follows:

Value Reducts (MIN_FREQ = 1)                  High Frequency Value Reducts (MIN_FREQ = 2)
({high}(W) → {low}(M))                        ({high}(W) → {low}(M))
({6}(C) → {low}(M))                           ({6}(C) → {low}(M))
({low, compact}(W,S) → {high}(M))             ({low, compact}(W,S) → {high}(M))
({low, sub}(W,S) → {low}(M))                  ({low, sub}(W,S) → {low}(M))
({low, 4}(W,C) → {high}(M))                   ({low, 4}(W,C) → {high}(M))
({compact, 4}(S,C) → {low}(M))
({2, sub}(D,S) → {low}(M))
({medium, 4, compact}(W,D,S) → {high}(M))

CONCLUSION

In summary, a new approach has been proposed to directly generate high frequency value reducts without any knowledge of attribute reducts for a given decision table. The running time is a polynomial function of the number of rows (m) and an exponential function of the number of columns (n) in the decision table. The approach combines value reduct generation with high frequency pruning of equivalence values while leveraging set-oriented database operations. The number of iterations needed for high frequency value reducts is smaller than that of the traditional approach to creating value reducts, which considers every tuple in the decision table. Our future work will involve the application of this approach to large data sets stored in database systems, as well as knowledge discovery in very large data sets.

REFERENCES

[1] Tsau Young Lin, "Rough Set Theory in Very Large Databases," Symposium on Modeling, Analysis and Simulation, CESA'96 IMACS Multiconference (Computational Engineering in Systems Applications), Lille, France, July 9-12, 1996, Vol. 2 of 2, pp. 936-941.
[2] Bazan, J., Nguyen, H., Nguyen, S., Synak, P., Wroblewski, J., "Rough set algorithms in classification problems," in Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems, L. Polkowski, T. Y. Lin, and S. Tsumoto (eds.), Physica-Verlag, Heidelberg, Germany, 2000, pp. 49-88.
[3] Fernandez-Baizan, A., Ruiz, E., Sanchez, J., "Integrating RDMS and Data Mining Capabilities Using Rough Sets," Proc. IPMU, Granada, Spain, 1996.
[4] Kumar, A., "New Techniques for Data Reduction in Database Systems for Knowledge Discovery Applications," Journal of Intelligent Information Systems, 10(1), pp. 31-48, 1998.
[5] Nguyen, H., Nguyen, S., "Some efficient algorithms for rough set methods," Proc. IPMU, Granada, Spain, 1996, pp. 1451-1456.
[6] Skowron, A., Rauszer, C., "The discernibility matrices and functions in information systems," in Decision Support by Experience - Application of the Rough Sets Theory, R. Slowinski (ed.), Kluwer Academic Publishers, 1992, pp. 331-362.
[7] Hu, X., Lin, T. Y., Han, J., "A New Rough Sets Model Based on Database Systems," Fundamenta Informaticae, 59(2-3), pp. 135-152, April 2004.
[8] Pawlak, Z., "Rough Sets," International Journal of Information and Computer Science, 11(5), pp. 341-356, 1982.
[9] Pawlak, Z., Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, 1992.