High Frequency Rough Set Model based on Database Systems

Kartik Vaithyanathan, T.Y. Lin
Department of Computer Science
San Jose State University
San Jose, CA 94403, USA

Abstract - Rough set theory was proposed by Pawlak in the 1980s and has been applied successfully in many domains. One of the key concepts of the rough set model is the computation of core and reduct. It has been shown that finding the minimal reduct is an NP-hard problem, and this computational complexity has implicitly restricted its effective application to small and clean data sets. In order to improve the efficiency of computing core attributes and reducts, many novel approaches have been developed, some of which attempt to integrate database technologies. This paper proposes a novel approach to computing reducts, called high frequency value reducts, using database system concepts. The method generates value reducts directly and also prunes the decision table by placing a lower bound on the frequency of equivalence values in the decision table.

I. INTRODUCTION

Rough set theory was proposed by Pawlak [8,9] in the 1980s and has been applied successfully in many domains. One of the key concepts of the rough set model is the computation of core and reduct. Multiple approaches to improve the efficiency of finding core attributes and reducts have been developed [2], including the algorithms presented in [5], which largely improve the generation of the discernibility relation by sorting the objects. Some authors have proposed approaches to reduce data size using relational database system techniques [4] and developed rough-set based data mining systems that integrate RDBMS capabilities [3]. Another approach redefined concepts of rough set theory such as core attributes and reducts by leveraging set-oriented database operations [7]. The current approach extends the extraction of various sizes of interconnected Pawlak information systems [1] while leveraging existing relational database concepts and operations. An in-depth example illustrating the nuances of this approach is also provided.

II. APPROACH

A decision table such as the one shown in Table I may have more than one value reduct. Any one of them can be used to replace the original table. Finding all the value reducts by eliminating unnecessary attributes from a decision table is NP-hard [6]. Attributes that are redundant given other attributes are perceived as unnecessary.

Table I shows a database table of 12 cars with information about Weight, Door, Size, Cylinder and Mileage. Weight, Door, Size and Cylinder are the condition attributes (represented as C) and Mileage is the decision attribute (represented as D). The attribute Tuple_ID is provided for theoretical understanding only.

TABLE I
12 CARS WITH ATTRIBUTES WEIGHT, DOOR, SIZE, CYLINDER AND MILEAGE

    Tuple_ID  Weight  Door  Size     Cylinder  Mileage
    t1        low     2     compact  4         high
    t2        low     4     sub      6         low
    t3        medium  4     compact  4         high
    t4        high    2     compact  6         low
    t5        high    4     compact  4         low
    t6        low     4     compact  4         high
    t7        high    4     sub      6         low
    t8        low     2     sub      6         low
    t9        medium  2     compact  4         high
    t10       medium  4     sub      4         high
    t11       medium  2     compact  4         low
    t12       medium  4     sub      4         low

The traditional approaches (including [7]) perform a two-step process in identifying the value reducts: the first step obtains the minimal attribute reducts by eliminating unnecessary attributes without sacrificing the accuracy of the classification model; the second step generates the value reducts for each attribute reduct.
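To keep the rest of the worked example executable, Table I can be materialized in a relational engine. Below is a minimal sketch using Python's bundled sqlite3 module; the table and column names are ours, not prescribed by the paper:

    import sqlite3

    # Table I: the 12 cars. Tuple_ID is kept only to mirror the paper's exposition.
    CARS = [
        ("t1",  "low",    2, "compact", 4, "high"),
        ("t2",  "low",    4, "sub",     6, "low"),
        ("t3",  "medium", 4, "compact", 4, "high"),
        ("t4",  "high",   2, "compact", 6, "low"),
        ("t5",  "high",   4, "compact", 4, "low"),
        ("t6",  "low",    4, "compact", 4, "high"),
        ("t7",  "high",   4, "sub",     6, "low"),
        ("t8",  "low",    2, "sub",     6, "low"),
        ("t9",  "medium", 2, "compact", 4, "high"),
        ("t10", "medium", 4, "sub",     4, "high"),
        ("t11", "medium", 2, "compact", 4, "low"),
        ("t12", "medium", 4, "sub",     4, "low"),
    ]

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE cars (
        tuple_id TEXT, weight TEXT, door INTEGER,
        size TEXT, cylinder INTEGER, mileage TEXT)""")
    conn.executemany("INSERT INTO cars VALUES (?,?,?,?,?,?)", CARS)
    conn.commit()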
Most approaches to computing core and reduct assume that (a) a tuple in the decision table always contributes to the classification model and is not an outlier, and (b) all tuples in the decision table are consistent. There are two aspects to the new approach, explained below.

The first aspect states that only tuples that occur with at least a certain lower bound frequency contribute to the classification model (decision). As a result, only high frequency rules that contribute to a decision are short-listed. The trivial case of the lower bound (= 1) is equivalent to the traditional approaches of computing value reducts. The high frequency rule prunes the decision table data and is the first key differentiator of this approach. An algorithm is outlined to ensure that the decision table is consistent before the proposed high frequency rule is applied to the tuples in the decision table.

The second novel step in this approach is to generate the value reducts directly, instead of a two-step process of first identifying the attribute reducts and then subsequently generating the value reducts.
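The consistency pre-check can be sketched with a set-oriented grouping, which later sections express in SQL. A minimal plain-Python version follows; the helper name is ours, not from the paper:

    from collections import defaultdict

    def inconsistent_groups(rows, cond_idx, dec_idx):
        """Group rows by their condition-attribute values and return every
        group whose members carry more than one decision value."""
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[i] for i in cond_idx)].append(row)
        return {key: g for key, g in groups.items()
                if len({r[dec_idx] for r in g}) > 1}

    # On the full condition set C = {Weight, Door, Size, Cylinder} (columns
    # 1-4 of CARS, decision in column 5), Table I is itself inconsistent:
    # {t9, t11} and {t10, t12} agree on all of C but disagree on Mileage.
    print(list(inconsistent_groups(CARS, (1, 2, 3, 4), 5)))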

III. CONSISTENT HIGH FREQUENCY DECISION RULES

Given a decision table with m rows and n condition attributes, the number of possible decision rules is m(2^n - 1). The high frequency pruning will eliminate some of the decision rules. Every decision rule can then be analyzed to determine the existence of a value reduct (minimal decision rule).

The generation of a decision rule set DR(X,D) can be expressed using SQL statements of the form

    SELECT * FROM (SELECT X, D FROM T)                  (1)

where X is a subset of C. There are 2^n - 1 possible values of X. There are m rows in each decision rule set.

The inconsistent tuples in a decision rule set DR(X,D) are obtained using the following SQL:

    SELECT * FROM DR(X,D) DR1
    WHERE EXISTS (
        SELECT * FROM DR(X,D) DR2
        WHERE (DR1.X = DR2.X) AND (DR1.D != DR2.D)
    )                                                   (2)

A consistent decision rule set DR(X,D) is obtained by removing the tuples in DR1(X,D) above from the original set DR(X,D) in (1):

    SELECT * FROM DR(X,D)
    MINUS
    SELECT * FROM DR1(X,D)                              (3)

The running time for (2) is O(m^2) and for (3) is O(m), where m is the number of tuples (rows) in the decision table. The overall running time for weeding out inconsistent tuples is O(m^2).

The high frequency pruning is executed on the consistent decision rule set DR(X,D) from the previous step and can be expressed in SQL as

    SELECT X, D, COUNT(*) AS Frequency
    FROM DR(X,D)
    GROUP BY X, D
    HAVING COUNT(*) >= MIN_FREQ                         (4)

where MIN_FREQ represents the minimum value of the high frequency rule. The sorting process (i.e., the GROUP BY) takes O(m log m) time and the counting and pruning takes O(m) time, where m is the number of tuples (rows) in the decision table. Thus, the running time for the high frequency pruning is O(m log m).

The worst-case running time for obtaining a consistent, high frequency decision rule set DR(X,D) is therefore a polynomial function of the number of rows (m) in the decision table. The overall running time for creating all possible consistent, high frequency decision rules for a decision table is O(2^n * m^2), where m is the number of rows and n is the number of attributes (columns) in the decision table.
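As a concrete illustration, statements (2)-(4) can be folded into a single query against the cars table built earlier. A sketch for X = {Weight, Door} follows; the NOT EXISTS plays the role of (2) and (3) together, since SQLite's set-based EXCEPT would collapse the duplicate rows that (4) must count:

    # Consistent, high frequency decision rules for X = {Weight, Door}.
    # NOT EXISTS removes the inconsistent tuples of (2)/(3) while preserving
    # row multiplicities, which GROUP BY ... HAVING of (4) counts and prunes.
    X = ["weight", "door"]
    cols = ", ".join(X)
    eq = " AND ".join(f"dr1.{c} = dr2.{c}" for c in X)

    MIN_FREQ = 2
    rules = conn.execute(f"""
        SELECT {cols}, mileage, COUNT(*) AS frequency
        FROM cars dr1
        WHERE NOT EXISTS (SELECT 1 FROM cars dr2
                          WHERE {eq} AND dr1.mileage <> dr2.mileage)
        GROUP BY {cols}, mileage
        HAVING COUNT(*) >= ?""", (MIN_FREQ,)).fetchall()

    print(rules)   # [('high', 4, 'low', 2)] -- rule (2.1)(a) below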
IV. VALUE REDUCTS

The goal is to find the value reducts for each tuple in a decision rule set DR(X,D). If DR(X,D) is an r-attribute decision rule set (r <= n), there are 2^r - 2 shorter decision rules for each tuple in the decision rule set, each having 1 to r - 1 attribute values, i.e., fewer than the original tuple's r values. Each of these decision rules is analyzed for consistency, and one or more minimal decision rules (consistent decision rules with the least number of attributes) are chosen for every tuple; these comprise the value reducts for that tuple. The generation of value reducts is explained in detail with an example in Section V.

V. ILLUSTRATIVE EXAMPLE

The representative example in Table I is broken down into its subsets, along with the high frequency values, for illustration below. Each subset is analyzed for two frequency thresholds: (a) MIN_FREQ = 1, which is equivalent to the traditional approach to computing value reducts, and (b) MIN_FREQ = 2, to illustrate high frequency pruning in the computation of value reducts. All tuples that are inconsistent or do not meet the high frequency criterion are discarded. In addition, the inconsistent tuples are noted below the respective decision rule set. The value reducts are also provided for each of these decision rule sets.

TABLE II
1-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT}, {DOOR}, {SIZE} AND {CYLINDER} (MIN_FREQ = 1 AND MIN_FREQ = 2)

(1.1) Decision Rules

    Tuple_ID     Weight  Mileage  Frequency
    t1, t6       low     high     2
    t2, t8       low     low      2
    t3, t9, t10  medium  high     3
    t4, t5, t7   high    low      3
    t11, t12     medium  low      2

(1.2) Decision Rules

    Tuple_ID         Door  Mileage  Frequency
    t1, t9           2     high     2
    t2, t5, t7, t12  4     low      4
    t3, t6, t10      4     high     3
    t4, t8, t11      2     low      3

(1.3) Decision Rules

    Tuple_ID         Size     Mileage  Frequency
    t1, t3, t6, t9   compact  high     4
    t2, t7, t8, t12  sub      low      4
    t4, t5, t11      compact  low      3
    t10              sub      high     1

(1.4) Decision Rules

    Tuple_ID             Cylinder  Mileage  Frequency
    t1, t3, t6, t9, t10  4         high     5
    t2, t4, t7, t8       6         low      4
    t5, t11, t12         4         low      3

TABLE III
2-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT, DOOR}, {WEIGHT, SIZE}, {WEIGHT, CYLINDER}, {DOOR, SIZE}, {DOOR, CYLINDER} AND {SIZE, CYLINDER} (MIN_FREQ = 1 AND MIN_FREQ = 2)

(2.1) Decision Rules (9 possible decision rules)

    Tuple_ID  Weight  Door  Mileage
    t1        low     2     high
    t2        low     4     low
    t3        medium  4     high
    t4        high    2     low
    t5        high    4     low
    t6        low     4     high
    t7        high    4     low
    t8        low     2     low
    t9        medium  2     high
    t10       medium  4     high
    t11       medium  2     low
    t12       medium  4     low

(3 possible decision rules)

    Tuple_ID  Weight  Door  Mileage  Frequency
    t1        low     2     high     1
    t2        low     4     low      1
    t3, t10   medium  4     high     2
    t4        high    2     low      1
    t5, t7    high    4     low      2
    t6        low     4     high     1
    t8        low     2     low      1
    t9        medium  2     high     1
    t11       medium  2     low      1
    t12       medium  4     low      1

{t1, t8}, {t2, t6}, {t9, t11} and {{t3, t10}, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({high, 4}(W,D) -> {low}(M))
Minimal Decision Rules: ({high}(W) -> {low}(M)), ({4}(D) -> {low}(M)) [{4}(D) could imply {high}(M), hence not a value reduct]

(2.2) Decision Rules (21 possible decision rules)

    Tuple_ID  Weight  Size     Mileage
    t1        low     compact  high
    t2        low     sub      low
    t3        medium  compact  high
    t4        high    compact  low
    t5        high    compact  low
    t6        low     compact  high
    t7        high    sub      low
    t8        low     sub      low
    t9        medium  compact  high
    t10       medium  sub      high
    t11       medium  compact  low
    t12       medium  sub      low

Value Reduct: ({low, compact}(W,S) -> {high}(M)), ({low, sub}(W,S) -> {low}(M)), ({high}(W) -> {low}(M))

(9 possible decision rules)

    Tuple_ID  Weight  Size     Mileage  Frequency
    t1, t6    low     compact  high     2
    t2, t8    low     sub      low      2
    t3, t9    medium  compact  high     2
    t4, t5    high    compact  low      2
    t7        high    sub      low      1
    t10       medium  sub      high     1
    t11       medium  compact  low      1
    t12       medium  sub      low      1

{{t3, t9}, t11} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, compact}(W,S) -> {high}(M))
Minimal Decision Rules: ({low}(W) -> {high}(M)), ({compact}(S) -> {high}(M)). Both are not valid [{low}(W) could imply {low}(M) and {compact}(S) could imply {low}(M)].
Value Reduct: ({low, compact}(W,S) -> {high}(M))

(b) Consistent Decision Rule: ({low, sub}(W,S) -> {low}(M))
Minimal Decision Rules: ({low}(W) -> {low}(M)), ({sub}(S) -> {low}(M)). Both are not valid [{low}(W) could imply {high}(M) and {sub}(S) could imply {high}(M)].
Value Reduct: ({low, sub}(W,S) -> {low}(M))

(c) Consistent Decision Rule: ({high, compact}(W,S) -> {low}(M))
Minimal Decision Rules: ({high}(W) -> {low}(M)), ({compact}(S) -> {low}(M)) [{compact}(S) could imply {high}(M), hence not a value reduct]
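The sub-rule analysis just performed for (2.2) can be mechanized. Below is a sketch (the function name and layout are ours) that enumerates the sub-rules of a consistent rule from the smallest size up and returns the first consistent ones, i.e. the minimal decision rules that constitute the value reducts:

    from itertools import combinations

    def value_reducts(rows, dec_idx, rule_cond, decision):
        """Minimal consistent sub-rules (value reducts) of a consistent rule.

        rule_cond maps a column index of CARS to the rule's value there,
        e.g. {1: "low", 3: "compact"} for ({low, compact}(W,S) -> ...).
        A sub-rule is valid only if every row matching its condition
        values carries the rule's decision.
        """
        def consistent(sub):
            return all(row[dec_idx] == decision
                       for row in rows
                       if all(row[i] == rule_cond[i] for i in sub))

        for k in range(1, len(rule_cond) + 1):   # try the shortest sub-rules first
            minimal = [sub for sub in combinations(rule_cond, k) if consistent(sub)]
            if minimal:
                return minimal
        return []

    # Rule (2.2)(a): neither {low}(W) nor {compact}(S) is consistent on its
    # own, so the value reduct is the full rule, as the analysis above found.
    print(value_reducts(CARS, 5, {1: "low", 3: "compact"}, "high"))  # [(1, 3)]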

(2.3) Decision Rules (21 possible decision rules)

    Tuple_ID  Weight  Cylinder  Mileage
    t1        low     4         high
    t2        low     6         low
    t3        medium  4         high
    t4        high    6         low
    t5        high    4         low
    t6        low     4         high
    t7        high    6         low
    t8        low     6         low
    t9        medium  4         high
    t10       medium  4         high
    t11       medium  4         low
    t12       medium  4         low

Value Reduct: ({low, 4}(W,C) -> {high}(M)), ({6}(C) -> {low}(M)), ({high}(W) -> {low}(M))

(9 possible decision rules)

    Tuple_ID     Weight  Cylinder  Mileage  Frequency
    t1, t6       low     4         high     2
    t2, t8       low     6         low      2
    t3, t9, t10  medium  4         high     3
    t4, t7       high    6         low      2
    t5           high    4         low      1
    t11, t12     medium  4         low      2

{{t3, t9, t10}, {t11, t12}} is eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, 4}(W,C) -> {high}(M))
Minimal Decision Rules: ({low}(W) -> {high}(M)), ({4}(C) -> {high}(M)). Both are not valid.
Value Reduct: ({low, 4}(W,C) -> {high}(M))

(b) Consistent Decision Rule: ({low, 6}(W,C) -> {low}(M))
Minimal Decision Rules: ({low}(W) -> {low}(M)), ({6}(C) -> {low}(M))

(c) Consistent Decision Rule: ({high, 6}(W,C) -> {low}(M))
Minimal Decision Rules: ({high}(W) -> {low}(M)), ({6}(C) -> {low}(M))

(2.4) Decision Rules (3 possible decision rules)

    Tuple_ID  Door  Size     Mileage
    t1        2     compact  high
    t2        4     sub      low
    t3        4     compact  high
    t4        2     compact  low
    t5        4     compact  low
    t6        4     compact  high
    t7        4     sub      low
    t8        2     sub      low
    t9        2     compact  high
    t10       4     sub      high
    t11       2     compact  low
    t12       4     sub      low

Value Reduct: ({2, sub}(D,S) -> {low}(M))

(no possible decision rules)

    Tuple_ID     Door  Size     Mileage  Frequency
    t1, t9       2     compact  high     2
    t2, t7, t12  4     sub      low      3
    t3, t6       4     compact  high     2
    t4, t11      2     compact  low      2
    t5           4     compact  low      1
    t8           2     sub      low      1
    t10          4     sub      high     1

{{t1, t9}, {t4, t11}}, {{t2, t7, t12}, t10} and {{t3, t6}, t5} are eliminated due to inconsistency.

(2.5) Decision Rules (12 possible decision rules)

    Tuple_ID  Door  Cylinder  Mileage
    t1        2     4         high
    t2        4     6         low
    t3        4     4         high
    t4        2     6         low
    t5        4     4         low
    t6        4     4         high
    t7        4     6         low
    t8        2     6         low
    t9        2     4         high
    t10       4     4         high
    t11       2     4         low
    t12       4     4         low

(4 possible decision rules)

    Tuple_ID     Door  Cylinder  Mileage  Frequency
    t1, t9       2     4         high     2
    t2, t7       4     6         low      2
    t3, t6, t10  4     4         high     3
    t4, t8       2     6         low      2
    t5, t12      4     4         low      2
    t11          2     4         low      1

{{t1, t9}, t11} and {{t3, t6, t10}, {t5, t12}} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({4, 6}(D,C) -> {low}(M))
Minimal Decision Rules: ({4}(D) -> {low}(M)), ({6}(C) -> {low}(M))

(b) Consistent Decision Rule: ({2, 6}(D,C) -> {low}(M))
Minimal Decision Rules: ({2}(D) -> {low}(M)), ({6}(C) -> {low}(M))
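The classes reported as "eliminated due to inconsistency" for (2.5) can be checked mechanically. A sketch reusing the inconsistent_groups helper and the CARS rows from the earlier sketches (Door and Cylinder are columns 2 and 4, Mileage is column 5):

    # Equivalence classes on (Door, Cylinder) whose members disagree on Mileage.
    for key, members in inconsistent_groups(CARS, (2, 4), 5).items():
        print(key, [r[0] for r in members])
    # (2, 4) ['t1', 't9', 't11']               -- the paper's {{t1, t9}, t11}
    # (4, 4) ['t3', 't5', 't6', 't10', 't12']  -- {{t3, t6, t10}, {t5, t12}}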

(2.6) Decision Rules (15 possible decision rules)

    Tuple_ID  Size     Cylinder  Mileage
    t1        compact  4         high
    t2        sub      6         low
    t3        compact  4         high
    t4        compact  6         low
    t5        compact  4         low
    t6        compact  4         high
    t7        sub      6         low
    t8        sub      6         low
    t9        compact  4         high
    t10       sub      4         high
    t11       compact  4         low
    t12       sub      4         low

Value Reduct: ({compact, 4}(S,C) -> {low}(M))

(3 possible decision rules)

    Tuple_ID        Size     Cylinder  Mileage  Frequency
    t1, t3, t6, t9  compact  4         high     4
    t2, t7, t8      sub      6         low      3
    t4              compact  6         low      1
    t5              compact  4         low      1
    t10             sub      4         high     1
    t11             compact  4         low      1
    t12             sub      4         low      1

{{t1, t3, t6, t9}, t11} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({sub, 6}(S,C) -> {low}(M))
Minimal Decision Rules: ({sub}(S) -> {low}(M)), ({6}(C) -> {low}(M))

TABLE IV
3-ATTRIBUTE HIGH FREQUENCY DECISION RULES: {WEIGHT, DOOR, SIZE}, {DOOR, SIZE, CYLINDER}, {WEIGHT, SIZE, CYLINDER} AND {WEIGHT, DOOR, CYLINDER} (MIN_FREQ = 1 AND MIN_FREQ = 2)

(3.1) Decision Rules (56 possible decision rules)

    Tuple_ID  Weight  Door  Size     Mileage
    t1        low     2     compact  high
    t2        low     4     sub      low
    t3        medium  4     compact  high
    t4        high    2     compact  low
    t5        high    4     compact  low
    t6        low     4     compact  high
    t7        high    4     sub      low
    t8        low     2     sub      low
    t9        medium  2     compact  high
    t10       medium  4     sub      high
    t11       medium  2     compact  low
    t12       medium  4     sub      low

Value Reduct: ({low, compact}(W,S) -> {high}(M)), ({low, sub}(W,S) -> {low}(M)), ({medium, 4, compact}(W,D,S) -> {high}(M)), ({high}(W) -> {low}(M)), ({2, sub}(D,S) -> {low}(M))

(no possible decision rules)

    Tuple_ID  Weight  Door  Size     Mileage  Frequency
    t1        low     2     compact  high     1
    t2        low     4     sub      low      1
    t3        medium  4     compact  high     1
    t4        high    2     compact  low      1
    t5        high    4     compact  low      1
    t6        low     4     compact  high     1
    t7        high    4     sub      low      1
    t8        low     2     sub      low      1
    t9        medium  2     compact  high     1
    t10       medium  4     sub      high     1
    t11       medium  2     compact  low      1
    t12       medium  4     sub      low      1

{t9, t11} and {t10, t12} are eliminated due to inconsistency.

(3.2) Decision Rules (28 possible decision rules)

    Tuple_ID  Door  Size     Cylinder  Mileage
    t1        2     compact  4         high
    t2        4     sub      6         low
    t3        4     compact  4         high
    t4        2     compact  6         low
    t5        4     compact  4         low
    t6        4     compact  4         high
    t7        4     sub      6         low
    t8        2     sub      6         low
    t9        2     compact  4         high
    t10       4     sub      4         high
    t11       2     compact  4         low
    t12       4     sub      4         low

Value Reduct: ({2, sub}(D,S) -> {low}(M)), ({6}(C) -> {low}(M))

(7 possible decision rules)

    Tuple_ID  Door  Size     Cylinder  Mileage  Frequency
    t1, t9    2     compact  4         high     2
    t2, t7    4     sub      6         low      2
    t3, t6    4     compact  4         high     2
    t4        2     compact  6         low      1
    t5        4     compact  4         low      1
    t8        2     sub      6         low      1
    t10       4     sub      4         high     1
    t11       2     compact  4         low      1
    t12       4     sub      4         low      1

{{t1, t9}, t11}, {{t3, t6}, t5} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({4, sub, 6}(D,S,C) -> {low}(M))
Minimal Decision Rules: ({4, sub}(D,S) -> {low}(M)), ({sub, 6}(S,C) -> {low}(M)), ({4, 6}(D,C) -> {low}(M))
Minimal Decision Rules: ({sub}(S) -> {low}(M)), ({6}(C) -> {low}(M)), ({4}(D) -> {low}(M)), ({6}(C) -> {low}(M))

(3.3) Decision Rules (49 possible decision rules)

    Tuple_ID  Weight  Size     Cylinder  Mileage
    t1        low     compact  4         high
    t2        low     sub      6         low
    t3        medium  compact  4         high
    t4        high    compact  6         low
    t5        high    compact  4         low
    t6        low     compact  4         high
    t7        high    sub      6         low
    t8        low     sub      6         low
    t9        medium  compact  4         high
    t10       medium  sub      4         high
    t11       medium  compact  4         low
    t12       medium  sub      4         low

Value Reduct: ({low, compact}(W,S) -> {high}(M)), ({low, 4}(W,C) -> {high}(M)), ({low, sub}(W,S) -> {low}(M)), ({6}(C) -> {low}(M)), ({high}(W) -> {low}(M))

(14 possible decision rules)

    Tuple_ID  Weight  Size     Cylinder  Mileage  Frequency
    t1, t6    low     compact  4         high     2
    t2, t8    low     sub      6         low      2
    t3, t9    medium  compact  4         high     2
    t4        high    compact  6         low      1
    t5        high    compact  4         low      1
    t7        high    sub      6         low      1
    t10       medium  sub      4         high     1
    t11       medium  compact  4         low      1
    t12       medium  sub      4         low      1

{{t3, t9}, t11} and {t10, t12} are eliminated due to inconsistency.

(a) Consistent Decision Rule: ({low, compact, 4}(W,S,C) -> {high}(M))
Minimal Decision Rules: ({low, compact}(W,S) -> {high}(M)), ({compact, 4}(S,C) -> {high}(M)), ({low, 4}(W,C) -> {high}(M))
Minimal Decision Rules: ({low}(W) -> {high}(M)), ({compact}(S) -> {high}(M)), ({low}(W) -> {high}(M)), ({4}(C) -> {high}(M))

Value Reduct: ({low, compact}(W,S) -> {high}(M)), ({low, 4}(W,C) -> {high}(M))

(b) Consistent Decision Rule: ({low, sub, 6}(W,S,C) -> {low}(M))
Minimal Decision Rules: ({low, sub}(W,S) -> {low}(M)), ({sub, 6}(S,C) -> {low}(M)), ({low, 6}(W,C) -> {low}(M))
Minimal Decision Rules: ({low}(W) -> {low}(M)), ({sub}(S) -> {low}(M)), ({sub}(S) -> {low}(M)), ({6}(C) -> {low}(M)), ({low}(W) -> {low}(M)), ({6}(C) -> {low}(M))
Value Reduct: ({low, sub}(W,S) -> {low}(M)), ({6}(C) -> {low}(M))

(3.4) Decision Rules (49 possible decision rules)

    Tuple_ID  Weight  Door  Cylinder  Mileage
    t1        low     2     4         high
    t2        low     4     6         low
    t3        medium  4     4         high
    t4        high    2     6         low
    t5        high    4     4         low
    t6        low     4     4         high
    t7        high    4     6         low
    t8        low     2     6         low
    t9        medium  2     4         high
    t10       medium  4     4         high
    t11       medium  2     4         low
    t12       medium  4     4         low

Value Reduct: ({low, 4}(W,C) -> {high}(M)), ({6}(C) -> {low}(M)), ({high}(W) -> {low}(M))

(no possible decision rules)

    Tuple_ID  Weight  Door  Cylinder  Mileage  Frequency
    t1        low     2     4         high     1
    t2        low     4     6         low      1
    t3, t10   medium  4     4         high     2
    t4        high    2     6         low      1
    t5        high    4     4         low      1
    t6        low     4     4         high     1
    t7        high    4     6         low      1
    t8        low     2     6         low      1
    t9        medium  2     4         high     1
    t11       medium  2     4         low      1
    t12       medium  4     4         low      1

{{t3, t10}, t12} and {t9, t11} are eliminated due to inconsistency.

The final list of all value reducts for Table I is documented as follows:

    Value Reducts (MIN_FREQ = 1)                 High Frequency Value Reducts (MIN_FREQ = 2)
    ({high}(W) -> {low}(M))                      ({high}(W) -> {low}(M))
    ({6}(C) -> {low}(M))                         ({6}(C) -> {low}(M))
    ({low, compact}(W,S) -> {high}(M))           ({low, compact}(W,S) -> {high}(M))
    ({low, sub}(W,S) -> {low}(M))                ({low, sub}(W,S) -> {low}(M))
    ({low, 4}(W,C) -> {high}(M))                 ({low, 4}(W,C) -> {high}(M))
    ({compact, 4}(S,C) -> {low}(M))
    ({2, sub}(D,S) -> {low}(M))
    ({medium, 4, compact}(W,D,S) -> {high}(M))
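As a cross-check on this table, the whole pipeline, one consistent high frequency rule set per non-empty subset of C followed by value-reduct extraction, can be driven end to end. A sketch reusing CARS and value_reducts from the earlier sketches (helper names are ours; for MIN_FREQ = 2 it reproduces the right-hand column above):

    from collections import defaultdict
    from itertools import combinations

    COND = {1: "Weight", 2: "Door", 3: "Size", 4: "Cylinder"}
    DEC = 5  # Mileage

    def hf_rules(rows, X, min_freq):
        """Consistent, high frequency rules of DR(X, D) as (condition, decision)."""
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[i] for i in X)].append(row[DEC])
        return [(dict(zip(X, vals)), ds[0]) for vals, ds in groups.items()
                if len(set(ds)) == 1 and len(ds) >= min_freq]

    def all_value_reducts(rows, min_freq):
        found = set()
        for r in range(1, len(COND) + 1):
            for X in combinations(COND, r):
                for cond, dec in hf_rules(rows, X, min_freq):
                    for sub in value_reducts(rows, DEC, cond, dec):
                        rule = tuple(sorted((COND[i], cond[i]) for i in sub))
                        found.add((rule, dec))
        return found

    for rule, dec in sorted(all_value_reducts(CARS, min_freq=2), key=str):
        print(rule, "->", dec)
    # Five rules, e.g. (('Weight', 'high'),) -> low, matching the
    # MIN_FREQ = 2 column of the table above.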
CONCLUSION

In summary, a new approach has been proposed to directly generate high frequency value reducts, without any knowledge of attribute reducts, for a given decision table. The running time is a polynomial function of the number of rows (m) and an exponential function of the number of columns (n) in the decision table. This approach combines value reduct generation with high frequency pruning of equivalence values while leveraging set-oriented database operations. The number of iterations for computing high frequency value reducts is smaller than in the traditional approach to creating value reducts, which considers every tuple in the decision table. Our future work will involve applying this approach to large data sets stored in database systems, as well as knowledge discovery in very large data sets.

REFERENCES

[1] Tsau Young Lin, "Rough Set Theory in Very Large Databases," Symposium on Modeling, Analysis and Simulation, CESA'96 IMACS Multiconference (Computational Engineering in Systems Applications), Lille, France, July 9-12, 1996, Vol. 2 of 2.
[2] Bazan, J., Nguyen, H., Nguyen, S., Synak, P., Wroblewski, J., "Rough set algorithms in classification problems," in Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems, L. Polkowski, T. Y. Lin, and S. Tsumoto (eds.), Physica-Verlag, Heidelberg, Germany, 49-88.
[3] Fernandez-Baizan, A., Ruiz, E., Sanchez, J., "Integrating RDMS and Data Mining Capabilities Using Rough Sets," Proc. IPMU, Granada, Spain.
[4] Kumar, A., "New Techniques for Data Reduction in Database Systems for Knowledge Discovery Applications," Journal of Intelligent Information Systems, 10(1), 31-48.
[5] Nguyen, H., Nguyen, S., "Some efficient algorithms for rough set methods," Proc. IPMU, Granada, Spain.
[6] Skowron, A., Rauszer, C., "The discernibility matrices and functions in information systems," in Decision Support by Experience - Application of the Rough Sets Theory, R. Slowinski (ed.), Kluwer Academic Publishers, 1992.
[7] Hu, Xiaohua, Lin, T. Y., Han, Jianchao, "A New Rough Sets Model Based on Database Systems," Fundamenta Informaticae, 59(2-3), April 2004.
[8] Pawlak, Z., "Rough Sets," International Journal of Computer and Information Sciences, 11(5), 341-356, 1982.
[9] Pawlak, Z., Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, 1992.
