On Granular Rough Computing: Factoring Classifiers through Granulated Decision Systems
Lech Polkowski 1,2, Piotr Artiemjew 2
1 Polish Japanese Institute of Information Technology, Koszykowa 86, Warszawa, Poland
2 Department of Mathematics and Computer Science, University of Warmia and Mazury, Olsztyn, Poland
polkow@pjwstk.edu.pl; artem@matman.uwm.edu.pl

Abstract. The paradigm of Granular Computing has quite recently emerged as an area of research in its own right; in particular, it is pursued within rough set theory, initiated by Zdzisław Pawlak. Granules of knowledge consist of entities whose information content is similar in some sense. The idea of a granular counterpart to a decision/information system has been put forth, along with its consequence in the form of the hypothesis that various operators aimed at dealing with information should factor sufficiently faithfully through granular structures [7], [8]. The most important such operators are algorithms for inducing classifiers. We show results of testing a few well-known algorithms for classifier induction on frequently used data sets from the Irvine Repository in order to verify the hypothesis. The results confirm the hypothesis in the case of selected representative algorithms and open a new prospective area of research.

Keywords: rough inclusion, similarity, granulation of knowledge, granular systems and classifiers

1 Rough Computing

Knowledge is represented as a pair (U, A), called an information system [4], where U is a set of objects and A is a collection of attributes, each a ∈ A construed as a mapping a : U → V_a from U into the value set V_a. The collection IND = {ind(a) : a ∈ A} of indiscernibility relations, where ind(a) = {(u, v) : u, v ∈ U, a(u) = a(v)} for a ∈ A, can be restricted to any set B ⊆ A, yielding the B-indiscernibility relation ind(B) = ∩_{a ∈ B} ind(a). A concept is any subset of the set U.
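For illustration, the partition of U into ind(B)-classes can be computed by grouping objects on their B-value vectors; the following is a minimal sketch of ours (the function and object names are not from the paper):

```python
# A sketch (ours, not the paper's) of computing ind(B)-classes:
# objects are dicts from attribute names to values, B is a subset of A.
from collections import defaultdict

def ind_classes(U, B):
    """Partition U into equivalence classes of ind(B) = the intersection
    of ind(a) over a in B: objects agreeing on all attributes in B."""
    classes = defaultdict(list)
    for u in U:
        classes[tuple(u[a] for a in B)].append(u)
    return list(classes.values())

U = [{"a": 1, "b": 0}, {"a": 1, "b": 0}, {"a": 2, "b": 1}]
print(len(ind_classes(U, ["a", "b"])))  # -> 2: first two objects are indiscernible
```

Restricting B to a single attribute, or to the empty set, coarsens the partition accordingly, which matches the definition of ind(B) as an intersection of the ind(a).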
By a proper rough entity, we mean any entity e constructed from objects in U and relations in R such that its action e(u) on each object u ∈ U satisfies the condition: if (u, v) ∈ r then e(u) = e(v), for each r ∈ R; in particular, proper rough concepts are called exact, and improper rough concepts are called rough. A particular case of an information system is a decision system, i.e., a pair (U, A ∪ {d}) in which d is a singled-out attribute called the decision. The basic primitives in any reasoning based on rough set theory are descriptors, see, e.g., [4], of the form (a = v), with semantics of the form [(a = v)] = {u ∈ U : a(u) = v}, extended to the set of formulae by means of sentential connectives, with appropriately extended semantics. In order to relate the conditional
knowledge (U, IND) to the world knowledge (U, {ind(d)}), decision rules are in use; a decision rule is an implication of the form

∧_{a ∈ A} (a = v_a) ⇒ (d = w). (1)

A classifier is a set of decision rules.

2 Rough Mereology. Rough Inclusions

We outline it here as a basis for the discussion of granules in the wake of [7], [8]. Rough Mereology is concerned with the theory of the predicate of Rough Inclusion.

2.1 Rough Inclusions

A rough inclusion μ_π(x, y, r), where x, y are individual objects and r ∈ [0, 1], satisfies the following requirements, relative to a given part relation π on a set U of individual objects, see [6], [7], [8], [9]:

1. μ_π(x, y, 1) ⇔ x ing_π y;
2. μ_π(x, y, 1) ⇒ [μ_π(z, x, r) ⇒ μ_π(z, y, r)];
3. μ_π(x, y, r) ∧ s < r ⇒ μ_π(x, y, s). (2)

These requirements seem intuitively clear: 1. demands that the predicate μ_π is an extension of the relation ing_π of the underlying system of Mereology; 2. expresses monotonicity of μ_π; and 3. assures the reading "to degree at least r". We use here only one rough inclusion, albeit a fundamental one, viz. (see [6], [7] for its derivation),

μ_L(u, v, r) ⇔ |IND(u, v)| / |A| ≥ r, where IND(u, v) = {a ∈ A : a(u) = a(v)}. (3)

3 Granules

A granule g_μ(u, r) about u ∈ U of the radius r, relative to μ, is defined by letting

g_μ(u, r) = Cls F(u, r), (4)

where the property F(u, r) is satisfied by an object v if and only if μ(v, u, r) holds, and Cls is the class operator, see, e.g., [6]. Practically, in the case of μ_L, the granule g(u, r) collects all v ∈ U such that |IND(v, u)| ≥ r · |A|. For a given granulation radius r, we form the collection U^G_{r,μ} = {g_μ(u, r) : u ∈ U}.
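The rough inclusion μ_L of Eq. (3) and the granule it induces admit a direct rendering; the sketch below is ours (names hypothetical), with v placed in g(u, r) exactly when u and v agree on at least a fraction r of the attributes:

```python
def mu_L(u, v, r, A):
    """The rough inclusion of Eq. (3): |IND(u, v)| / |A| >= r,
    i.e., u and v agree on at least r * |A| attributes."""
    return sum(u[a] == v[a] for a in A) >= r * len(A)

def granule(i, r, U, A):
    """Indices of all objects v in U with mu_L(v, U[i], r)."""
    return [j for j, v in enumerate(U) if mu_L(v, U[i], r, A)]

U = [{"a": 1, "b": 0}, {"a": 1, "b": 1}, {"a": 2, "b": 1}]
A = ["a", "b"]
print(granule(0, 0.5, U, A))  # -> [0, 1]: U[1] agrees with U[0] on one of two attributes
```

Note that u always belongs to its own granule (it agrees with itself on all attributes), and that decreasing r only enlarges the granule, in accord with requirement 3 of (2).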
3.1 Granular decision systems

The idea of a granular decision system was posed in [7]; for a given information system (U, A), a rough inclusion μ, and r ∈ [0, 1], the new universe U^G_{r,μ} is given. We apply a strategy G to choose a covering Cov^G_{r,μ} of the universe U by granules from U^G_{r,μ}. We apply a strategy S in order to assign the value a*(g) of each attribute a ∈ A to each granule g ∈ Cov^G_{r,μ}: a*(g) = S({a(u) : u ∈ g}). The granular counterpart to the information system (U, A) is a tuple (U^G_{r,μ}, G, S, {a* : a ∈ A}); analogously, we define granular counterparts to decision systems by adding the factored decision d*. The heuristic principle that objects similar with respect to conditional attributes in the set A should also reveal similar (i.e., close) decision values, and that therefore granular counterparts to decision systems should lead to classifiers satisfactorily close in quality to those induced from the original decision systems, was stated in [7] and borne out by simple hand examples. In this work, we verify this hypothesis on real data sets.

4 Classifiers: Rough set methods

Classifiers are evaluated by total accuracy, which is the ratio of the number of correctly classified objects to the number of recognized test objects, and by total coverage, rec/test, where rec is the number of recognized test cases and test is the number of test cases. We test the LEM2 algorithm due to Grzymala-Busse, see, e.g., [2], as well as the covering and exhaustive algorithms in the RSES package [12], see [1], [13], [16], [17].

4.1 On the approach in this work

For g(u, r) with r fixed and an attribute a ∈ A ∪ {d}, the factored value a*(g) is defined as S({a(u) : u ∈ g}) for a strategy S; each granule g thus produces a new object g*, with attribute values a(g*) = a*(g) for a ∈ A, possibly not in the data set universe U. From the set U^G_{r,μ}, see Sect. 3.1, of all granules of the form g_μ(u, r), by means of a strategy G, we choose a covering Cov^G_{r,μ} of the universe U.
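A hedged sketch (ours) of the whole construction, under the concrete strategies used later in the paper: G scans objects in order and keeps a granule only if it covers new objects, S is majority voting; granule(i, r, U, A) is assumed to return the indices of objects in the μ_L-granule about U[i]:

```python
from collections import Counter

def granular_system(U, A, r, granule):
    """Build the factored objects g* of a granular counterpart:
    choose a covering by strategy G, then set a*(g) = S({a(u) : u in g})."""
    cover, covered = [], set()
    for i in range(len(U)):            # strategy G: sequential scan of U
        g = granule(i, r, U, A)
        if not set(g) <= covered:      # keep g only if it covers new objects
            cover.append(g)
            covered.update(g)
    # strategy S: majority voting on each attribute over the granule
    return [{a: Counter(U[j][a] for j in g).most_common(1)[0][0] for a in A}
            for g in cover]
```

The objects g* need not occur in U; for intermediate radii the covering is much smaller than U, which is the source of the reductions in training set and rule set sizes reported below.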
Thus, a decision system D* = ({g* : g ∈ Cov^G_{r,μ}}, {a* : a ∈ A} ∪ {d*}) is formed, called the granular counterpart, relative to the strategies G, S, to the decision system D = (U, A ∪ {d}); this new system is substantially smaller in size for intermediate values of r, hence classifiers induced from it have a correspondingly smaller number of rules. As stated above, the hypothesis is that the granular counterpart D* at sufficiently large granulation radii r preserves the knowledge encoded in the decision system D to a satisfactory degree, so that, given an algorithm A for rule induction, classifiers obtained from the training set D(trn) and its granular counterpart D*(trn) should agree with a small error on the test set D(tst).
5 Experiments

In experiments with real data sets, we accept the total accuracy and total coverage coefficients as the quality measures in the comparison of classifiers given in this work. We make use of some well-known real life data sets often used in the testing of classifiers. Due to shortage of space, we include only a very few results. The following data sets have been used: the credit card application approval data set (Australian credit), see [14]; the Pima Indians diabetes data set [14]. As representative and well-established algorithms for rule induction in the public domain, we have selected the RSES exhaustive algorithm, see [12]; the covering algorithm of RSES with p = .1 [12]; and the LEM2 algorithm with p = .5, see [2], [12]. Table 1 shows a comparison of these algorithms on the data set Australian credit split into the training and test sets with the ratio 1:1.

Table 1. Comparison of algorithms on Australian credit data; 345 training objects, 345 test objects. Columns: algorithm, accuracy, coverage, rule number; rows: covering(p = .1), covering(p = .5), covering(p = 1.0), exhaustive, LEM2(p = .1), LEM2(p = .5), LEM2(p = 1.0).

In the rough set literature there are results of tests with other algorithms on the Australian credit data set; we recall some of the best of them in Table 2, and we also include the best granular cases from this work.

Table 2. Best results for Australian credit by some rough set based algorithms; in case, the reduction in object size is 40.6 percent and the reduction in rule number is 43.6 percent; in case, resp.,
10.5, 5.9; in case, resp., 3.6, 1.9. Rows of Table 2 (source, method, accuracy, coverage): Bazan [1], SNAPM(0.9), error = ; S. H. Nguyen [13], simple.templates; S. H. Nguyen [13], general.templates; S. H. Nguyen [13], closest.simple.templates; S. H. Nguyen [13], closest.gen.templates; S. H. Nguyen [13], tolerance.simple.templ; S. H. Nguyen [13], tolerance.gen.templ; J. Wroblewski [17], adaptive.classifier; this work, granular.r = ; this work, granular.r = ; this work, granular.concept.dependent.r = .

For any granule g and any attribute b in the set A ∪ {d} of attributes, the factored value b*(g) at the granule g has been estimated by means of the majority voting strategy, with ties resolved at random; majority voting is one of the most popular strategies and has frequently been applied within rough set theory, see, e.g., [13], [16]. We also use the simplest strategy for finding coverings, i.e., we select coverings by ordering the objects in the set U and choosing sequentially the granules about them in order to obtain an irreducible covering; a random choice of granules is applied in the sections in which this is specifically mentioned. The only enhancement of this simple granulation is discussed in Sect. 6, where concept dependent granules are considered; this approach yields even better classification results.
5.1 Train and test at 1:1 ratio for Australian credit

We include here results for Australian credit. Table 3 shows the sizes of training and test sets in the non-granular and granular cases, as well as classification results versus the radius of granulation. Table 4 shows absolute differences between the non-granular case (r = nil) and granular cases, as well as the sizes of training and rule sets in granular cases as fractions of those in the non-granular case.

Table 3. Australian credit data set: r = granule radius, tst = test sample size, trn = training sample size, rulcov = number of rules with the covering algorithm, rulex = number of rules with the exhaustive algorithm, rullem = number of rules with LEM2, acov/ccov = total accuracy/coverage with the covering algorithm, aex/cex = total accuracy/coverage with the exhaustive algorithm, alem/clem = total accuracy/coverage with LEM2.

Table 4. Australian credit data set, comparison: r = granule radius, acerr/ccerr = absolute total accuracy/coverage error with the covering algorithm, aexerr/cexerr = absolute total accuracy/coverage error with the exhaustive algorithm, alemerr/clemerr = absolute total accuracy/coverage error with LEM2, sper = training sample size as a fraction of the original size, rper = maximal rule set size as a fraction of the original size.

With the covering algorithm, accuracy is better or within an error of 1 percent for all radii; coverage is better or within an error of 4.5 percent from the radius at which the training set size reduction is 99 percent and the reduction in rule set size is 98 percent.
With the exhaustive algorithm, accuracy is within an error of 10 percent from a certain radius on, and it is better or within an error of 4 percent from the radius of 0.5, where the reduction in training set size is 85 percent and the reduction in rule set size is 95 percent. The result of .875 at r = .714 is among the best overall (see Table 2). Coverage is better from r = .214 on in the granular case, where the reduction in objects is 99 percent and the reduction in rule set size is almost 100 percent. LEM2 gives accuracy better or within a 2.6 percent error from the radius of 0.5, where the training set size reduction is 85 percent and the rule set size reduction is 96 percent. Coverage is better or within an error of 7.3 percent from the radius at which the reduction in training set size is 69.6 percent and the rule set size is reduced by 96 percent.
5.2 CV-10 with Pima

We have experimented with the Pima Indians diabetes data set using 10-fold cross validation and a random choice of a covering for the exhaustive and LEM2 algorithms. The results are in Tables 5 and 6.

Table 5. 10-fold CV; Pima; exhaustive algorithm. r = radius, macc = mean accuracy, mcov = mean coverage, mrules = mean rule number, mtrn = mean size of the training set.

Table 6. 10-fold CV; Pima; LEM2 algorithm. r = radius, macc = mean accuracy, mcov = mean coverage, mrules = mean rule number, mtrn = mean size of the training set.

For the exhaustive algorithm, accuracy in the granular case is 95.4 percent of accuracy in the non-granular case from the radius of .25 on, with a reduction in the size of the training set of 82.5 percent; from the radius of .5 on, the difference is less than 3 percent with a reduction in the size of the training set of about 16.3 percent. The difference in coverage is less than .4 percent from r = .25 on, where the reduction in training set size is 82.5 percent. For LEM2, accuracy in both cases differs by less than 1 percent from r = .25 on, and it is better in the granular case from r = .5 on with a reduction in the size of the training set of 16.3 percent; coverage is better in the granular case from r = .375 on with the training set size reduced by 48.2 percent.

5.3 A validation by a statistical test

We have also carried out a test with the Pima Indians diabetes data set [14] and a random choice of coverings, taking a sample of 30 granular classifiers at the radius of .5 with train-and-test at the ratio 1:1, against the matched sample of classification results without granulation, with the covering algorithm for p = .1.
The Wilcoxon [15] signed rank test for matched pairs has in this case given a p value of .14 for coverage, so the null hypothesis of identical means should not be rejected, whereas for accuracy, the hypothesis that the mean in the granular case is equal to .99 of the mean in the non-granular case may be rejected (the p value is .009), and the hypothesis that the mean in the granular case is greater than .98 of the mean in the non-granular case is accepted (the p value is .035) at the adopted confidence level.

6 Concept dependent granulation

A modification of the approach presented in the results shown above is concept dependent granulation; a concept, in the narrow sense, is a decision/classification class, cf., e.g.,
[2]. Granulation in this sense consists in computing granules for objects in the universe U and for all distinct granulation radii as previously, with the only restriction that, given any object u ∈ U and r ∈ [0, 1], the new concept dependent granule g_cd(u, r) is computed taking into account only objects v ∈ U with d(v) = d(u), i.e., g_cd(u, r) = g(u, r) ∩ {v ∈ U : d(v) = d(u)}. This method increases the number of granules in coverings, but it is also expected to increase the quality of classification, as expressed by accuracy and coverage. We show that this is indeed the case by including results of a test in which the exhaustive algorithm and a random choice of coverings were applied tenfold to the Australian credit data set, once with the by now standard granular approach and then with the concept dependent approach. The averaged results are shown in Table 7.

Table 7. Standard and concept dependent granular systems for the Australian credit data set; exhaustive RSES algorithm: r = granule radius, macc = mean accuracy, mcov = mean coverage, mrules = mean number of rules, mtrn = mean training sample size; in each column the first value is for the standard, the second for the concept dependent approach.

Conclusions for concept dependent granulation. Concept dependent granulation, as expected, involves a greater number of granules in a covering, hence a greater number of rules, which is clearly perceptible up to a certain radius; for greater radii the difference is negligible. Accuracy in the case of concept dependent granulation is always better than in the standard case; the difference becomes negligible at the radius at which granules become almost single indiscernibility classes.
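The concept dependent granule g_cd defined in this section can be sketched as follows (our own rendering, repeating the μ_L-granule for self-containment; d names the decision attribute):

```python
def mu_L(u, v, r, A):
    """Rough inclusion of Eq. (3): u, v agree on at least r * |A| attributes."""
    return sum(u[a] == v[a] for a in A) >= r * len(A)

def granule(i, r, U, A):
    """Ordinary granule g(u, r): indices of v in U with mu_L(v, U[i], r)."""
    return [j for j, v in enumerate(U) if mu_L(v, U[i], r, A)]

def granule_cd(i, r, U, A, d):
    """Concept dependent granule: g(u, r) cut down to objects sharing
    the decision value d(u) of the granule's center."""
    return [j for j in granule(i, r, U, A) if U[j][d] == U[i][d]]

U = [{"a": 1, "b": 0, "d": "yes"},
     {"a": 1, "b": 1, "d": "no"},
     {"a": 1, "b": 0, "d": "no"}]
A = ["a", "b"]
print(granule(0, 0.5, U, A))           # -> [0, 1, 2]
print(granule_cd(0, 0.5, U, A, "d"))   # -> [0]: objects 1, 2 carry a different decision
```

Since each g_cd(u, r) is contained in a single decision class, coverings need more granules, but the factored decision d* is determined unambiguously on each granule.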
Coverage in the concept dependent case is almost the same as in the standard case, the difference between the two being not greater than .15 percent from the radius at which the average number of granules in coverings is 5 percent of the number of objects. Accuracy at that radius is better by .04, i.e., by about 5 percent, in the concept dependent case. It follows that concept dependent granulation yields better accuracy, whereas coverage is the same as in the standard case.

7 Conclusions

The results shown in this work confirm the hypothesis put forth in [7], [8] that granular counterparts to data sets preserve the encoded information to a very high degree. The search for a theoretical explanation of this phenomenon, as well as work aimed at developing original algorithms for rule induction based on it, is in progress and will be reported.

References

1. J. G. Bazan, A comparison of dynamic and non-dynamic rough set methods for extracting laws from decision tables, in: Rough Sets in Knowledge Discovery 1, L. Polkowski, A. Skowron, Eds., Physica Verlag, Heidelberg, 1998.
2. J. W. Grzymala-Busse, Data with missing attribute values: Generalization of indiscernibility relation and rule induction, Transactions on Rough Sets I, Springer Verlag, Berlin, 2004.
3. S. Leśniewski, On the foundations of set theory, Topoi 2, 1982.
4. Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer, Dordrecht.
5. L. Polkowski, Rough Sets. Mathematical Foundations, Physica Verlag, Heidelberg.
6. L. Polkowski, Toward rough set foundations. Mereological approach (a plenary lecture), in: Proceedings RSCTC04, Uppsala, Sweden, 2004, LNAI vol. 3066, Springer Verlag, Berlin, 2004.
7. L. Polkowski, Formal granular calculi based on rough inclusions (a feature talk), in: [10].
8. L. Polkowski, Formal granular calculi based on rough inclusions (a feature talk), in: [11].
9. L. Polkowski, A. Skowron, Rough mereology: a new paradigm for approximate reasoning, International Journal of Approximate Reasoning 15(4), 1997.
10. Proceedings of IEEE 2005 Conference on Granular Computing, GrC05, Beijing, China, July 2005, IEEE Press.
11. Proceedings of IEEE 2006 Conference on Granular Computing, GrC06, Atlanta, USA, May 2006, IEEE Press.
12. A. Skowron et al., RSES: A system for data analysis; available at http://logic.mimuw.edu.pl/~rses
13. Sinh Hoa Nguyen, Regularity analysis and its applications in Data Mining, in: Rough Set Methods and Applications, L. Polkowski, S. Tsumoto, T. Y. Lin, Eds., Physica Verlag, Heidelberg, 2000.
14. UCI Machine Learning Repository: http://www.ics.uci.edu/~mlearn/databases/
15. F. Wilcoxon, Individual comparisons by ranking methods, Biometrics 1, 1945.
16. A. Wojna, Analogy-based reasoning in classifier construction, Transactions on Rough Sets IV, LNCS 3700, Springer Verlag, Berlin, 2005.
17. J. Wróblewski, Adaptive aspects of combining approximation spaces, in: Rough Neural Computing, S. K. Pal, L. Polkowski, A. Skowron, Eds., Springer Verlag, 2004.
Rough Sets for Uncertainty Reasoning S.K.M. Wong 1 and C.J. Butz 2 1 Department of Computer Science, University of Regina, Regina, Canada, S4S 0A2, wong@cs.uregina.ca 2 School of Information Technology
More informationOn rule acquisition in incomplete multi-scale decision tables
*Manuscript (including abstract) Click here to view linked References On rule acquisition in incomplete multi-scale decision tables Wei-Zhi Wu a,b,, Yuhua Qian c, Tong-Jun Li a,b, Shen-Ming Gu a,b a School
More informationComparison of Shannon, Renyi and Tsallis Entropy used in Decision Trees
Comparison of Shannon, Renyi and Tsallis Entropy used in Decision Trees Tomasz Maszczyk and W lodzis law Duch Department of Informatics, Nicolaus Copernicus University Grudzi adzka 5, 87-100 Toruń, Poland
More informationGuaranteeing the Accuracy of Association Rules by Statistical Significance
Guaranteeing the Accuracy of Association Rules by Statistical Significance W. Hämäläinen Department of Computer Science, University of Helsinki, Finland Abstract. Association rules are a popular knowledge
More informationSome remarks on conflict analysis
European Journal of Operational Research 166 (2005) 649 654 www.elsevier.com/locate/dsw Some remarks on conflict analysis Zdzisław Pawlak Warsaw School of Information Technology, ul. Newelska 6, 01 447
More informationParameters to find the cause of Global Terrorism using Rough Set Theory
Parameters to find the cause of Global Terrorism using Rough Set Theory Sujogya Mishra Research scholar Utkal University Bhubaneswar-751004, India Shakti Prasad Mohanty Department of Mathematics College
More informationConcept Lattices in Rough Set Theory
Concept Lattices in Rough Set Theory Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca URL: http://www.cs.uregina/ yyao Abstract
More informationTop-k Parametrized Boost
Top-k Parametrized Boost Turki Turki 1,4, Muhammad Amimul Ihsan 2, Nouf Turki 3, Jie Zhang 4, Usman Roshan 4 1 King Abdulaziz University P.O. Box 80221, Jeddah 21589, Saudi Arabia tturki@kau.edu.sa 2 Department
More informationKnowledge Discovery Based Query Answering in Hierarchical Information Systems
Knowledge Discovery Based Query Answering in Hierarchical Information Systems Zbigniew W. Raś 1,2, Agnieszka Dardzińska 3, and Osman Gürdal 4 1 Univ. of North Carolina, Dept. of Comp. Sci., Charlotte,
More informationOn flexible database querying via extensions to fuzzy sets
On flexible database querying via extensions to fuzzy sets Guy de Tré, Rita de Caluwe Computer Science Laboratory Ghent University Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium {guy.detre,rita.decaluwe}@ugent.be
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification
More informationThe Decision List Machine
The Decision List Machine Marina Sokolova SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 sokolova@site.uottawa.ca Nathalie Japkowicz SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 nat@site.uottawa.ca
More informationML in Practice: CMSC 422 Slides adapted from Prof. CARPUAT and Prof. Roth
ML in Practice: CMSC 422 Slides adapted from Prof. CARPUAT and Prof. Roth N-fold cross validation Instead of a single test-training split: train test Split data into N equal-sized parts Train and test
More informationRelationship between Loss Functions and Confirmation Measures
Relationship between Loss Functions and Confirmation Measures Krzysztof Dembczyński 1 and Salvatore Greco 2 and Wojciech Kotłowski 1 and Roman Słowiński 1,3 1 Institute of Computing Science, Poznań University
More informationImproved Closest Fit Techniques to Handle Missing Attribute Values
J. Comp. & Math. Sci. Vol.2 (2), 384-390 (2011) Improved Closest Fit Techniques to Handle Missing Attribute Values SANJAY GAUR and M S DULAWAT Department of Mathematics and Statistics, Maharana, Bhupal
More informationThree-Way Analysis of Facial Similarity Judgments
Three-Way Analysis of Facial Similarity Judgments Daryl H. Hepting, Hadeel Hatim Bin Amer, and Yiyu Yao University of Regina, Regina, SK, S4S 0A2, CANADA hepting@cs.uregina.ca, binamerh@cs.uregina.ca,
More informationDynamic Programming Approach for Construction of Association Rule Systems
Dynamic Programming Approach for Construction of Association Rule Systems Fawaz Alsolami 1, Talha Amin 1, Igor Chikalov 1, Mikhail Moshkov 1, and Beata Zielosko 2 1 Computer, Electrical and Mathematical
More informationOn the Relation of Probability, Fuzziness, Rough and Evidence Theory
On the Relation of Probability, Fuzziness, Rough and Evidence Theory Rolly Intan Petra Christian University Department of Informatics Engineering Surabaya, Indonesia rintan@petra.ac.id Abstract. Since
More informationEvaluation. Albert Bifet. April 2012
Evaluation Albert Bifet April 2012 COMP423A/COMP523A Data Stream Mining Outline 1. Introduction 2. Stream Algorithmics 3. Concept drift 4. Evaluation 5. Classification 6. Ensemble Methods 7. Regression
More informationNotes on Rough Set Approximations and Associated Measures
Notes on Rough Set Approximations and Associated Measures Yiyu Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca URL: http://www.cs.uregina.ca/
More informationARPN Journal of Science and Technology All rights reserved.
Rule Induction Based On Boundary Region Partition Reduction with Stards Comparisons Du Weifeng Min Xiao School of Mathematics Physics Information Engineering Jiaxing University Jiaxing 34 China ABSTRACT
More informationEvaluation. Andrea Passerini Machine Learning. Evaluation
Andrea Passerini passerini@disi.unitn.it Machine Learning Basic concepts requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain
More informationOn Rough Set Modelling for Data Mining
On Rough Set Modelling for Data Mining V S Jeyalakshmi, Centre for Information Technology and Engineering, M. S. University, Abhisekapatti. Email: vsjeyalakshmi@yahoo.com G Ariprasad 2, Fatima Michael
More informationCS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.
CS570 Data Mining Anomaly Detection Li Xiong Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber April 3, 2011 1 Anomaly Detection Anomaly is a pattern in the data that does not conform
More informationResearch Article Special Approach to Near Set Theory
Mathematical Problems in Engineering Volume 2011, Article ID 168501, 10 pages doi:10.1155/2011/168501 Research Article Special Approach to Near Set Theory M. E. Abd El-Monsef, 1 H. M. Abu-Donia, 2 and
More informationEvaluation requires to define performance measures to be optimized
Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation
More informationAPPLICATION FOR LOGICAL EXPRESSION PROCESSING
APPLICATION FOR LOGICAL EXPRESSION PROCESSING Marcin Michalak, Michał Dubiel, Jolanta Urbanek Institute of Informatics, Silesian University of Technology, Gliwice, Poland Marcin.Michalak@polsl.pl ABSTRACT
More informationA PRIMER ON ROUGH SETS:
A PRIMER ON ROUGH SETS: A NEW APPROACH TO DRAWING CONCLUSIONS FROM DATA Zdzisław Pawlak ABSTRACT Rough set theory is a new mathematical approach to vague and uncertain data analysis. This Article explains
More informationLearning Rules from Very Large Databases Using Rough Multisets
Learning Rules from Very Large Databases Using Rough Multisets Chien-Chung Chan Department of Computer Science University of Akron Akron, OH 44325-4003 chan@cs.uakron.edu Abstract. This paper presents
More informationSemantic Rendering of Data Tables: Multivalued Information Systems Revisited
Semantic Rendering of Data Tables: Multivalued Information Systems Revisited Marcin Wolski 1 and Anna Gomolińska 2 1 Maria Curie-Skłodowska University, Department of Logic and Cognitive Science, Pl. Marii
More informationDialectics of Approximation of Semantics of Rough Sets
Dialectics of of Rough Sets and Deptartment of Pure Mathematics University of Calcutta 9/1B, Jatin Bagchi Road Kolkata-700029, India E-Mail: a.mani.cms@gmail.com Web: www.logicamani.in CLC/SLC 30th Oct
More informationReview of Lecture 1. Across records. Within records. Classification, Clustering, Outlier detection. Associations
Review of Lecture 1 This course is about finding novel actionable patterns in data. We can divide data mining algorithms (and the patterns they find) into five groups Across records Classification, Clustering,
More informationLossless Online Bayesian Bagging
Lossless Online Bayesian Bagging Herbert K. H. Lee ISDS Duke University Box 90251 Durham, NC 27708 herbie@isds.duke.edu Merlise A. Clyde ISDS Duke University Box 90251 Durham, NC 27708 clyde@isds.duke.edu
More informationInvestigating Measures of Association by Graphs and Tables of Critical Frequencies
Investigating Measures of Association by Graphs Investigating and Tables Measures of Critical of Association Frequencies by Graphs and Tables of Critical Frequencies Martin Ralbovský, Jan Rauch University
More informationA Fuzzy Entropy Algorithm For Data Extrapolation In Multi-Compressor System
A Fuzzy Entropy Algorithm For Data Extrapolation In Multi-Compressor System Gursewak S Brar #, Yadwinder S Brar $, Yaduvir Singh * Abstract-- In this paper incomplete quantitative data has been dealt by
More informationDiscovery of Concurrent Data Models from Experimental Tables: A Rough Set Approach
From: KDD-95 Proceedings. Copyright 1995, AAAI (www.aaai.org). All rights reserved. Discovery of Concurrent Data Models from Experimental Tables: A Rough Set Approach Andrzej Skowronl* and Zbigniew Suraj2*
More informationP leiades: Subspace Clustering and Evaluation
P leiades: Subspace Clustering and Evaluation Ira Assent, Emmanuel Müller, Ralph Krieger, Timm Jansen, and Thomas Seidl Data management and exploration group, RWTH Aachen University, Germany {assent,mueller,krieger,jansen,seidl}@cs.rwth-aachen.de
More informationRough Sets, Rough Relations and Rough Functions. Zdzislaw Pawlak. Warsaw University of Technology. ul. Nowowiejska 15/19, Warsaw, Poland.
Rough Sets, Rough Relations and Rough Functions Zdzislaw Pawlak Institute of Computer Science Warsaw University of Technology ul. Nowowiejska 15/19, 00 665 Warsaw, Poland and Institute of Theoretical and
More informationRough operations on Boolean algebras
Rough operations on Boolean algebras Guilin Qi and Weiru Liu School of Computer Science, Queen s University Belfast Belfast, BT7 1NN, UK Abstract In this paper, we introduce two pairs of rough operations
More informationParts 3-6 are EXAMPLES for cse634
1 Parts 3-6 are EXAMPLES for cse634 FINAL TEST CSE 352 ARTIFICIAL INTELLIGENCE Fall 2008 There are 6 pages in this exam. Please make sure you have all of them INTRODUCTION Philosophical AI Questions Q1.
More informationIterative Laplacian Score for Feature Selection
Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,
More informationAbduction in Classification Tasks
Abduction in Classification Tasks Maurizio Atzori, Paolo Mancarella, and Franco Turini Dipartimento di Informatica University of Pisa, Italy {atzori,paolo,turini}@di.unipi.it Abstract. The aim of this
More informationRough Set Approaches for Discovery of Rules and Attribute Dependencies
Rough Set Approaches for Discovery of Rules and Attribute Dependencies Wojciech Ziarko Department of Computer Science University of Regina Regina, SK, S4S 0A2 Canada Abstract The article presents an elementary
More informationComputational Learning Theory
Computational Learning Theory Sinh Hoa Nguyen, Hung Son Nguyen Polish-Japanese Institute of Information Technology Institute of Mathematics, Warsaw University February 14, 2006 inh Hoa Nguyen, Hung Son
More informationOutlier Detection Using Rough Set Theory
Outlier Detection Using Rough Set Theory Feng Jiang 1,2, Yuefei Sui 1, and Cungen Cao 1 1 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences,
More informationData Mining and Machine Learning (Machine Learning: Symbolische Ansätze)
Data Mining and Machine Learning (Machine Learning: Symbolische Ansätze) Learning Individual Rules and Subgroup Discovery Introduction Batch Learning Terminology Coverage Spaces Descriptive vs. Predictive
More informationBrock University. Probabilistic granule analysis. Department of Computer Science. Ivo Düntsch & Günther Gediga Technical Report # CS May 2008
Brock University Department of Computer Science Probabilistic granule analysis Ivo Düntsch & Günther Gediga Technical Report # CS-08-04 May 2008 Brock University Department of Computer Science St. Catharines,
More informationCompenzational Vagueness
Compenzational Vagueness Milan Mareš Institute of information Theory and Automation Academy of Sciences of the Czech Republic P. O. Box 18, 182 08 Praha 8, Czech Republic mares@utia.cas.cz Abstract Some
More information