
Information Sciences 184 (2012)

Contents lists available at SciVerse ScienceDirect

Modeling rough granular computing based on approximation spaces

Andrzej Skowron (a), Jarosław Stepaniuk (b), Roman Swiniarski (c,d)

(a) Institute of Mathematics, The University of Warsaw, Banacha 2, Warsaw, Poland
(b) Department of Computer Science, Białystok University of Technology, Wiejska 45A, Białystok, Poland
(c) Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182, USA
(d) Institute of Computer Science, Polish Academy of Sciences, Ordona 21, Warsaw, Poland

E-mail addresses: skowron@mimuw.edu.pl (A. Skowron), j.stepaniuk@pb.edu.pl (J. Stepaniuk), rswiniar@sciences.sdsu.edu (R. Swiniarski).

Article history: Available online 18 August 2011

Keywords: Rough sets; Approximation spaces; Granular computing; Concept; Interactions; Data models

Abstract

The results reported in this paper are a step toward rough set-based foundations of data mining and machine learning. The approach is based on calculi of approximation spaces. In this paper, we summarize and extend the results we have obtained since 2003, when we started investigating the foundations of the approximation of partially defined concepts (see, e.g., [2,3,5,7,20,21,37,38,39,40,42]). We discuss some issues important for modeling granular computations aimed at inducing compound granules relevant for solving problems such as the approximation of complex concepts or the selection of relevant actions (plans) for reaching target goals. The problems discussed in this article are crucial for building computer systems that assist researchers in scientific discoveries in many areas such as biology. In this paper, we present foundations for modeling granular computations inside a system based on granules called approximation spaces. Our approach is based on the rough set approach introduced by Pawlak [24,25]. Approximation spaces are fundamental granules used in searching for relevant complex granules called data models, e.g., approximations of complex concepts, functions, or relations. In particular, we discuss some issues related to generalizations of the approximation space introduced in [33,34]. We present examples of rough set-based strategies for the extension of approximation spaces from samples of objects onto the whole universe of objects. This makes it possible to present foundations for inducing data models, such as approximations of concepts or classifications, analogous to the approaches for inducing different types of classifiers known in machine learning and data mining. Searching for relevant approximation spaces and data models is formulated as a complex optimization problem. The proposed interactive, granular computing systems should be equipped with efficient heuristics that support searching for (semi-)optimal granules.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

We discuss some issues important for modeling granular computations aimed at inducing compound granules relevant for solving problems such as the approximation of complex concepts or the selection of relevant actions (plans) for reaching a target goal. The problems discussed in this article are crucial for building computer systems that assist researchers in scientific discoveries in many application areas such as biology.
These systems should be interactive, allowing users to perform such actions as communicating hypotheses or hints to the system in the search for complex granules, schemes of reasoning by analogy, or strategies for judging under uncertainty, or submitting to the system domain knowledge such as ontologies of concepts.

A more advanced cooperation of the user with the system can help to develop ontology approximation [5,7]. This approach was used in real-life projects related to medical decision support and therapy planning (see, e.g., [5]), control of UAVs (see, e.g., [4,5,20,50]), and sunspot classification [22]. The system should also allow users to receive messages from the system, e.g., on the progress of discovery.

In this paper, we present foundations for modeling granular computations inside a system based on granules called approximation spaces. Our approach is based on rough sets. Rough sets, due to Pawlak [24,25], can be represented by pairs of sets that give the lower and the upper approximation of the original sets. In the standard version of rough set theory, an approximation space is based on the indiscernibility equivalence relation. Approximation spaces belong to the broad spectrum of basic issues investigated in rough set theory (see, e.g., [2,3,7,10,28,29,33,34,37,38,41,42,45]). They can be treated as complex granules. The proposed interactive granular computing systems should be equipped with efficient heuristics that support the discovery of such (semi-)optimal granules.

Over the years, different aspects of approximation spaces have been investigated, and many generalizations of the approach based on the indiscernibility equivalence relation [26] have been proposed. In this paper, we discuss some aspects of the generalizations of approximation spaces investigated in [33,34,42] that are important for real-life applications, e.g., in searching for approximations of complex concepts (see, e.g., [5,7]). This is realized by searching for relevant approximation spaces in a given family of approximation spaces relative to some optimization criteria.

There are several components that are important in searching for relevant approximation spaces relative to approximated concepts or classifications. Among them are:

- neighborhoods (granules) of objects defined by features (attributes),
- inclusion measures making it possible to measure the degree of inclusion of neighborhoods in concepts,
- operations for the inductive extension of approximation spaces, allowing us to induce classifiers,
- optimization measures based on some version of the minimal description length (MDL) principle, used for measuring the quality of an approximation space relative to the approximated concepts,
- search strategies for (semi-)optimal approximation spaces relative to such measures.

Pawlak introduced rough sets [24,25] assuming that objects are perceived through the values of some attributes. Hence, information about objects may be incomplete. However, in machine learning and data mining [12] not only information about objects but also information about concepts is partial, e.g., a concept may be given on a sample of objects only. We propose to deal with this issue using extension operations defined over approximation spaces. An extension of a given approximation space AS is an approximation space defined on a larger universe of objects than the universe of AS, e.g., on a universe including not only the objects from a given sample but also so far unseen objects. Over the years, many strategies for inducing classifiers have been developed [12]. These strategies can be interpreted as search strategies for relevant extensions of approximation spaces.
The extension operations defined over approximation spaces can be treated as tools for inductive reasoning (for performing judgment [15]) on so far unseen objects. For example, an extension of an approximation space can be based, analogously to rule-based classifiers [12,26], on estimating the membership degrees of any newly classified object in the concepts of a classification, on the basis of information about how the object matches patterns covering the concepts (e.g., left-hand sides of decision rules).

The investigated approach enables us to present uniform foundations for inducing approximations of different kinds of higher order granules [27] such as concepts, classifications, or functions. In particular, we emphasize the fundamental role of approximation spaces in inducing diverse kinds of classifiers used in machine learning or data mining. Search problems for relevant approximation spaces and their extensions lead to optimization problems of high computational complexity. Hence, efficient heuristics should be used in searching for approximate solutions of these problems. These heuristics can be based on approximate Boolean reasoning (see, e.g., [21,26]) and/or biologically-inspired metaheuristics (see, e.g., [9]). Moreover, in hierarchical learning of complex concepts, many different approximation spaces should be discovered. Learning of such concepts can be supported by domain knowledge and ontology approximation (see, e.g., [5,7,14]).

Finally, let us mention the software platforms supporting the development of our projects, i.e., the Interactive Classification Engine (RoughICE) [47] and TunedIT [51]. RoughICE is a software platform supporting the approximation of spatio-temporal complex concepts in a given concept ontology acquired in dialogue with the user. RoughICE is freely available on the website [47]. The underlying algorithmic methods, especially for generating reducts and rules, discretization, and decomposition, are outgrowths of our previous tools such as RSES [48] and RSESlib [49]. The RoughICE software as well as the underlying computational methods have been successfully applied in different data mining projects (e.g., in mining traffic data and medical data; for details see [5,50] and the literature cited in [5]).

The TunedIT platform [51], launched recently by members of our research group, facilitates sharing, evaluation, and comparison of data-mining and machine-learning algorithms. The resources used in our experiments (algorithms and datasets, in particular) will be shared on the TunedIT website. This website already contains many publicly available datasets and algorithms, as well as performance data for nearly 100 algorithms tested on numerous datasets; these include the algorithms from the Weka and Rseslib libraries, and the datasets from the UCI Machine Learning Repository. Everyone can contribute new resources and results. TunedIT is composed of three complementary modules: TunedTester, Repository, and Knowledge Base. TunedIT may help researchers design repeatable experiments and generate reproducible results. It may be particularly useful

when conducting experiments intended for publication, as the reproducibility of experimental results is an essential factor that determines the research value of a paper. TunedIT also helps in the dissemination of new ideas and findings. Every researcher may upload implementations, datasets, and documents into the Repository, so that other users can find them easily and employ them in their own research.

This paper is organized as follows. In Section 2, we discuss the basic notions of our approach. In Section 3, we present a generalization of the approximation space definition from [33,34,42]. The rough set approach to inducing rule-based classifiers, k-NN classifiers, and function approximation is presented in Sections 3.1, 3.2 and 3.3, respectively. Relationships of rough granular computing based on approximation spaces and their extensions are discussed in Section 4. Searching for approximation spaces is performed in the set of approximation spaces generated from some atomic approximation spaces by applying operations on approximation spaces; these operations are investigated in Section 5. In the conclusions, we summarize the results of the paper and present some directions for further research.

2. Basic notions

2.1. Concept

Concepts in philosophy are the constituents of thoughts.¹ In this paper, concepts are represented as subsets of some universes of objects (denoted by U* or U** in Table 1). Different universes may be needed to represent different concepts. They are specified by partial information on finite samples (denoted by U in Table 1) of such universes. This partial information is recorded in data tables representing information systems [25,26]. Objects in data tables are labels of real objects (situations, states) perceived by agents (see Table 1), or they are sets of higher types constructed, e.g., in hierarchical modeling [40]. One should distinguish between real objects (perceived by agents) and objects from universes such as U, U*, U** used for concept representation.

2.2. Attributes, signatures of objects and two semantics

In [26] any attribute a is defined as a function from a universe of objects U into the set of attribute values V_a. However, in applications we expect that the value of any attribute should also be defined for objects from extensions of U, i.e., for new objects which can be perceived in the future.² The universe U is only a sample of possible objects. This requires some modification of the basic definitions of attribute and signature of objects [24-26].

One can give an interpretation of attributes using the concept of interaction. In this paper, information systems are used for representing results of interactions [40]. Interactions can be external or internal relative to a given agent ag. The results of external interactions of ag with environments are recorded by (activated) sensory attributes or by attributes used for storing the results of performed actions. The internal interactions of ag with its parts, such as knowledge bases, lead to the activation of sensory attributes or action attributes [40]. We treat attributes as granules and we consider their interactions with environments. If a is a given attribute and e denotes a state of the environment, then the result of the interaction between a and e is a pair (l_e, v), where l_e is a label of e (see Table 1) and v ∈ V_a.
Analogously, if IS = (U, A) is a given information system and e denotes a state of the environment, then by interaction of IS and e we obtain the information system IS' = (U ∪ {l_e}, A'), where A' = {a' : a ∈ A}, a'(u) = a(u) for u ∈ U, and a'(l_e) = v for some v ∈ V_a. Hence, information systems are dynamic objects created via interaction of already existing information systems with environments. Notice that the initial information system can be empty, i.e., its set of objects is empty. Moreover, let us observe that the elements of U are labels of environment states rather than the states themselves.

One can represent any attribute by a family of formulas and interpret the attribute as the result of interaction of this set with the environment. In this case, we assume that, for any attribute a under consideration, a relational structure R_a is given. Together with the simple structure (V_a, =) [26], some other relational structures R_a with the carrier V_a for a ∈ A and a signature τ are considered. We also assume that with any attribute a there is associated a set of generic formulas {α_i}_{i∈J} (where J is a set of indices) interpreted over R_a as subsets of V_a, i.e., ||α_i||_{R_a} = {v ∈ V_a : R_a, v ⊨ α_i}. Moreover, it is assumed that the set {||α_i||_{R_a}}_{i∈J} is a partition of V_a.

Perception of an object u by a given attribute a is represented by the selection of a formula α_i and a value v ∈ V_a such that v ∈ ||α_i||_{R_a}. Using an intuitive interpretation, one can say that such a pair (α_i, v) is selected from {α_i}_{i∈J} and V_a, respectively, as the result of a sensory measurement. We assume that, for a given set of attributes A and any object u, the signature of u relative to A is given by Inf_A(u) = {(a, (α^a_u, v)) : a ∈ A}, where (α^a_u, v) is the result of the sensory measurement by a on u. Let us observe that a triple (a, α^a_u, v) can be encoded by the atomic formula a = v with the interpretation

$$\|a = v\|_U = \{u \in U : (a, (\alpha^a_u, v)) \in \mathrm{Inf}_A(u) \text{ for some } \alpha^a_u\}. \quad (1)$$

For simplicity, we also write (a, v) instead of (a, (α^a_u, v)) if this does not lead to confusion. One can also consider a soft version of the attribute definition [43,44]. In this case, we assume that the semantics of the family {α_i}_{i∈J} is given by fuzzy membership functions for α_i, and the set of these functions defines a fuzzy partition [16].

¹ See
² Objects from U* are treated as labels of real perceived objects.
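For illustration only, the following Python sketch (ours, not from the paper) mimics the interaction view of information systems described above: an initially empty information system grows as labeled environment states and their sensory measurements are recorded, and signatures and the semantics of atomic formulas a = v can then be read off. All labels and values are hypothetical.

```python
# A minimal sketch (our illustration) of IS' = (U ∪ {l_e}, A') growing by
# interaction with an environment, as described above.

class InformationSystem:
    def __init__(self, attributes):
        self.attributes = list(attributes)   # the set A of attribute names
        self.rows = {}                       # object label -> value vector

    def interact(self, label, values):
        """Record the result of an interaction: a new labeled state l_e
        together with the sensory measurements a'(l_e) = v for a in A."""
        assert len(values) == len(self.attributes)
        self.rows[label] = dict(zip(self.attributes, values))

    def signature(self, label):
        """Inf_A(u): the pairs (a, v) recorded for object u (cf. Eq. (1))."""
        return set(self.rows[label].items())

    def semantics(self, attribute, value):
        """||a = v||_U: the objects of U whose signature contains (a, v)."""
        return {u for u, row in self.rows.items() if row[attribute] == value}

# Starting from the empty information system and adding labeled states:
IS = InformationSystem(['a', 'b', 'c'])
IS.interact('l1', (1, 1, 0))
IS.interact('l2', (0, 2, 0))
print(IS.signature('l1'))        # {('a', 1), ('b', 1), ('c', 0)}
print(IS.semantics('b', 2))      # {'l2'}
```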

Table 1. Notation used in this article.

Symbol | Interpretation
U | set of objects (universe, e.g., a sample of objects)
a | condition attribute over U (a : U → V_a)
V_a | set of values of attribute a ∈ A
A | set of condition attributes over U
IS | information system (U, A)
d | decision attribute over U (d : U → V_d)
DT | decision table (U, A ∪ {d})
card(U) | number of elements of U
P(U) | set of all subsets of U
||α||_U | semantics of a formula α over U
AS | approximation space
U* | extension of the sample U (U ⊆ U*)
U* \ U | set of testing objects
a* | extension of a from U to U* (a* : U* → V_a)
A* | set of condition attributes over U* (A* = {a* : a ∈ A})
R_+ | set of non-negative reals
U** | extension of U* by objects that differ in type from the objects of U (e.g., U** \ U* is a set of reals used for the construction of new granules over objects from U*)
I(x) | granule corresponding to x, e.g., a neighborhood of x or a family of neighborhoods of x
X ↾ U | restriction of X to U; in the simplest case the intersection X ∩ U, but if X ⊆ P(U*) then X ↾ U = {Y ∩ U : Y ∈ X}
Inf_A(u) | signature of u, representing the results of sensory measurements by the attributes from A on u; for each attribute a the result of the sensory measurement is recorded by a triple (a, (α^a_u, v)), where α^a_u is a formula selected from the language of a and v is a value satisfying α^a_u
l_e | label of a state e (real object, situation) perceived by a given agent ag; it is created by ag and used as an object in the information system constructed by ag; labels discern the recorded (usually in time) representations of states resulting from the interaction of ag with its environment (for more details see [40])

We construct granular formulas from atomic formulas corresponding to the considered attributes. As a consequence, the satisfiability of such formulas is defined once the satisfiability of atomic formulas is given as a result of sensory measurements. Hence, for any formula α constructed over atomic formulas, one can consider its semantics ||α||_U ⊆ U over U as well as the semantics ||α||_{U*} ⊆ U* over U*, where U ⊆ U* (see Fig. 1).

Fig. 1. Two semantics of α: over U and over U*, respectively.

The difference between these two cases is the following. In the case of U, one can compute ||α||_U, but in the case of ||α||_{U*}, for objects from U* \ U there is no direct information about their membership in ||α||_{U*}. One can estimate the satisfiability of α for objects u ∈ U* \ U only after the relevant attribute values on u are stored in the data table representing the information system (e.g., when the results of sensory measurements on u are stored in the data table). In particular, one can use some methods to estimate relationships among the semantics of formulas over U*, using the relationships among the semantics of the formulas over U. For example, one can apply statistical methods. This step is crucial in the investigation of extensions of approximation spaces relevant for inducing classifiers from data (see, e.g., [2,3,7,12,37,38]).

2.3. Uncertainty function

In [33,34,42], the uncertainty function I defines, for every object x from a given sample U of objects, a set of objects with descriptions similar to the description of x. The set I(x) is called the neighborhood of x.

Let P^ω(U*) = ⋃_{i≥1} P^i(U*), where P^1(U*) = P(U*) and P^{i+1}(U*) = P(P^i(U*)) for i ≥ 1. For example, if card(U*) = 2 and U* = {x_1, x_2}, then we obtain P^1(U*) = {∅, {x_1}, {x_2}, {x_1, x_2}} and P^2(U*) = {∅, {∅}, {{x_1}}, {{x_2}}, {{x_1, x_2}}, {∅, {x_1}}, {∅, {x_2}}, {∅, {x_1, x_2}}, ...}. If

card(U*) = n, where n is a positive natural number, then card(P^1(U*)) = 2^n and card(P^{i+1}(U*)) = 2^{card(P^i(U*))} for i ≥ 1. For example, card(P^3(U*)) = 2^{2^{2^n}}. Hence, we see that the levels of the powerset hierarchy are very rich, and fully automatic searching for relevant sets (structures) on such levels is not feasible. However, this approach allows us to present in a uniform way foundations for modeling granular computing aimed at inducing compound granules, from different levels of the powerset hierarchy, relevant for solving the target task, e.g., the approximation of complex concepts. For applications, it is necessary to restrict the search for relevant granules to relevant fragments of the powerset hierarchy. These fragments are defined by some sets of formulas. The discovery of such sets is often a big challenge.

In this paper, we consider uncertainty functions of the form I : U → P^ω(U*). The values of uncertainty functions are called granular neighborhoods. These granular neighborhoods are defined by so-called granular formulas. The values of such uncertainty functions are not necessarily from P(U*) but from P^ω(U*). In the following sections, we present more details on granular neighborhoods and granular formulas. Fig. 2 presents an illustrative example of an uncertainty function with values in P^2(U*) rather than in P(U*).

Fig. 2. Uncertainty function I : U → P^2(U*). The neighborhood of x ∈ U* \ U, where Inf_A(x) = {(a,1), (b,0), (c,2)}, does not contain training cases from U. The generalizations of this neighborhood described by the formulas ||a = 1 ∧ c = 2||_{U*} and ||a = 1 ∧ b = 0||_{U*} have non-empty intersections with U.

The generalization of neighborhoods discussed here is also motivated by the necessity of modeling or discovering complex structural objects in solving problems of pattern recognition, machine learning, or data mining. These structural objects (granules) can be defined as sets on higher levels of the powerset hierarchy. Among examples of such granules are indiscernibility or similarity classes of patterns or relational structures discovered in images, clusters of time windows, indiscernibility or similarity classes of sequences of time windows representing processes, and behavioral graphs (for more details see, e.g., [5,39] and also [19]).

If X ∈ P^ω(U*) and U ⊆ U*, then by X ↾ U we denote the set defined as follows: (i) if X ∈ P(U*) then X ↾ U = X ∩ U, and (ii) for any i ≥ 1, if X ∈ P^{i+1}(U*) then X ↾ U = {Y ↾ U : Y ∈ X}. For example, if U = {x_1}, U* = {x_1, x_2} and X = {{x_2}, {x_1, x_2}} (so X ∈ P^2(U*)), then X ↾ U = {Y ↾ U : Y ∈ X} = {Y ∩ U : Y ∈ X} = {∅, {x_1}}.

In this section, we discussed uncertainty functions assigning granules from P^2(U*) to objects from U. We assume that objects from U and U* are of atomic type, with the type defined by the attributes of a given information system IS = (U, A) over U. For example, the type of objects from U may be identified with Π_{a∈A} V_a. In Section 5.2, we present a method of defining information systems with objects of higher order type (structural objects). Such objects can be from different levels of P^ω(U*), i.e., they can belong to P^k(U*) for k > 2. Note that neighborhoods (e.g., indiscernibility classes) over such objects are sets of objects of higher order type from P^ω(U*).
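The following short Python sketch (ours) illustrates the two notions just introduced, the levels P^i(U*) of the powerset hierarchy and the restriction operation X ↾ U, on the two-element universe of the example above.

```python
# A small sketch (our illustration) of the powerset hierarchy P^i(U*) and
# the restriction operation X ↾ U.
from itertools import chain, combinations

def powerset(s):
    """P(s) as a set of frozensets."""
    s = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))}

def restrict(X, U):
    """X ↾ U: intersection with U at level 1, applied element-wise above."""
    if isinstance(X, frozenset) and all(not isinstance(e, frozenset) for e in X):
        return X & U                       # X ∈ P(U*): plain intersection
    return frozenset(restrict(Y, U) for Y in X)

U_star = frozenset({'x1', 'x2'})
P1 = powerset(U_star)                      # P^1(U*)
P2 = powerset(P1)                          # P^2(U*)
print(len(P1), len(P2))                    # 4 16, i.e., 2^2 and 2^(2^2)

X = frozenset({frozenset({'x2'}), frozenset({'x1', 'x2'})})   # X ∈ P^2(U*)
print(restrict(X, frozenset({'x1'})))      # {∅, {x1}}, as in the example
```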
2.4. Rough inclusion function

The second component of any approximation space is the rough inclusion function [34,42]. One can consider general constraints which rough inclusion functions should satisfy; in this section, we present only some examples of rough inclusion functions. The rough inclusion function ν : P(U) × P(U) → [0,1] defines the degree of inclusion of X in Y, where X, Y ⊆ U and U is a finite sample of objects. In the simplest case, the standard rough inclusion function ν_SRI can be defined by (see, e.g., [26,30,34,46]):

$$\nu_{SRI}(X, Y) = \begin{cases} \dfrac{card(X \cap Y)}{card(X)}, & \text{if } X \neq \emptyset, \\ 1, & \text{if } X = \emptyset. \end{cases} \quad (2)$$

Some illustrative examples are given in Table 2. It is important to note that an inclusion measure expressed in terms of the confidence measure, widely used in data mining, was considered long ago by Łukasiewicz [17] in studies on assigning fractional truth values to logical formulas.
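As a quick illustration, the following Python sketch (ours) implements the standard rough inclusion function of Eq. (2); the printed values match the rows of Table 2 below.

```python
# A one-function sketch (our illustration) of the standard rough inclusion
# function of Eq. (2), checked against the rows of Table 2.

def nu_SRI(X, Y):
    """Degree of inclusion of X in Y; equals 1 for empty X by convention."""
    X, Y = set(X), set(Y)
    return len(X & Y) / len(X) if X else 1.0

X = {'x1', 'x3', 'x7', 'x8'}
print(nu_SRI(X, {'x2', 'x4', 'x5', 'x6', 'x9'}))          # 0.0
print(nu_SRI(X, {'x1', 'x2', 'x4', 'x5', 'x6', 'x9'}))    # 0.25
print(nu_SRI(X, {'x1', 'x2', 'x3', 'x7', 'x8'}))          # 1.0
```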

Table 2. Illustration of the standard rough inclusion function.

X | Y | ν_SRI(X, Y)
{x_1, x_3, x_7, x_8} | {x_2, x_4, x_5, x_6, x_9} | 0
{x_1, x_3, x_7, x_8} | {x_1, x_2, x_4, x_5, x_6, x_9} | 0.25
{x_1, x_3, x_7, x_8} | {x_1, x_2, x_3, x_7, x_8} | 1

For a definition of the inclusion function for more general granules, e.g., partitions of objects, one can use a measure based on the positive region [26], entropy [12,13], or rough entropy [18,23]. Inclusion measures for more general granules were also investigated in [8,35]. However, more work in this direction is needed, especially on the inclusion of granules with complex structures, in particular granular neighborhoods.

3. Approximation spaces

In this section, we present a generalization of the approximation space definition from [33,34,42]. In applications, approximation spaces are constructed for a given concept, or for a family of concepts creating a partition (in the case of classification), rather than for the class of all concepts. Then searching for the components of an approximation space relevant for the concept approximation becomes feasible. A concept is given only on a sample U of objects. We often restrict the definition of the components of an approximation space to objects from U and/or some patterns from P^ω(U*) or from P^ω(U**), where U* ⊆ U**, necessary for the approximation of a given concept X only. The definitions of the uncertainty function and the inclusion function can be restricted to some subsets of U* and P^ω(U*) (or of U** and P^ω(U**)), respectively, which are relevant for the approximated concept(s). The set U* \ U can be treated as the set of testing objects. We assume that the types of objects from U and U* are the same, i.e., they are defined by Π_{a∈A} V_a, where A is the set of attributes of a given information system IS. In constructing granules relevant for the approximation of more compound granules such as functions, we use objects of different types, e.g., objects described by real values in addition to attribute value vectors from Π_{a∈A} V_a. The set of such objects or values creates the set U** \ U*. Moreover, the optimization requirements for the lower approximation and the upper approximation are then also restricted to the given concept X. These requirements express the closeness of the induced concept approximations to the approximation of the concept(s) on the given sample of objects, and they are combined with the description length of the constructed approximations to obtain the relevant quality measures.

Usually, the uncertainty functions and the rough inclusion functions are parameterized. Then searching (in the family of these functions defined by the possible values of parameters) for the (semi-)optimal uncertainty function and rough inclusion function relative to the selected quality measure becomes feasible.

Definition 1. An approximation space over a set of attributes A, for a concept represented by X ⊆ U* and given on a sample U ⊆ U* of objects, is a system

$$AS = (U, U^*, I, \nu, L), \quad (3)$$

where

- U is a sample of objects with known signatures relative to a given set of attributes A,
- L is a language of granular formulas defined over atomic formulas corresponding to generic formulas from signatures (see Section 2.2),
- the set U* is such that, for any object u ∈ U*, the signature Inf_A(u) of u relative to A can be obtained as the result of sensory measurements on u,
- I : U → P^ω(U*) is an uncertainty function, where U ⊆ U*; we assume that the granular neighborhood I(u) is computable from Inf_A(u), i.e., from Inf_A(u) it is possible to compute a formula α_{Inf_A(u)} ∈ L such that I(u) = ||α_{Inf_A(u)}||_{U*},
- ν : P^ω(U*) × P^ω(U*) → [0,1] is a partial rough inclusion function such that, for any x ∈ U, the value ν(I(x), X) is defined for the considered concept X.

In Section 3.3, we consider an uncertainty function whose values are families of patterns of the form V × D, with V ∈ P(U*) and D ∈ P(R_+), where R_+ is the set of non-negative reals. Hence, we assume that the values of the uncertainty function I may belong to the space of possible patterns from P^ω(U**), where U** = U* ∪ R_+. The partiality of the rough inclusion function makes it possible to define the values of this function on the patterns relevant for the approximation only.

We assume that the lower approximation operation LOW(AS, X) and the upper approximation operation UPP(AS, X) of the concept X in the approximation space AS satisfy the following condition:

$$\nu(LOW(AS, X), UPP(AS, X)) = 1. \quad (4)$$
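Definition 1 can be rendered schematically in code. The following sketch (our reading of the definition, with the language L left abstract) shows an approximation space as a structure of callable components, together with neighborhood-based lower and upper approximation operations. With an inclusion such as ν_SRI, these operations satisfy condition (4), since ν(I(x), X) = 1 implies ν(I(x), X) > 0, and hence LOW(AS, X) ⊆ UPP(AS, X).

```python
# A schematic rendering (our reading of Definition 1, not the authors' code)
# of an approximation space AS = (U, U*, I, ν, L) as a Python structure.
from dataclasses import dataclass
from typing import Any, Callable, Collection

@dataclass
class ApproximationSpace:
    U: Collection[Any]                   # sample with known signatures
    U_star: Collection[Any]              # extension of the sample
    I: Callable[[Any], Any]              # uncertainty function: x -> granular neighborhood
    nu: Callable[[Any, Any], float]      # (partial) rough inclusion function
    L: Callable[[Any], bool]             # membership test for granular formulas

    def lower(self, X):
        """LOW(AS, X): all x whose granular neighborhood is fully included in X."""
        return {x for x in self.U_star if self.nu(self.I(x), X) == 1}

    def upper(self, X):
        """UPP(AS, X): all x whose granular neighborhood overlaps X."""
        return {x for x in self.U_star if self.nu(self.I(x), X) > 0}
```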

Usually the uncertainty function and the rough inclusion function are parameterized. In this parameterized family of approximation spaces, one can search for an approximation space enabling us to approximate the concept X, restricted to a given sample U, with satisfactory quality. The quality of approximation can be expressed by some quality measures. For example, one can use the following criterion:

1. LOW(AS, X) ↾ U is included in X ↾ U to a degree at least deg, i.e., ν(LOW(AS, X) ↾ U, X ↾ U) ≥ deg,
2. X ↾ U is included in UPP(AS, X) ↾ U to a degree at least deg, i.e., ν(X ↾ U, UPP(AS, X) ↾ U) ≥ deg,

where deg is a given threshold from the interval [0,1]. The above conditions express the degree to which the induced approximations in AS are close to the concept X on the sample U. One can also introduce the description length of the induced approximations. A combination of these two measures can be used as the quality measure for the induced approximation space. Then the search problem for the relevant approximation space can be considered as an optimization problem relative to this quality measure. This approach may be interpreted as a form of the minimal description length principle [11,32]. The result of optimization can be checked against a testing sample. This enables us to estimate the quality of approximation. Note that further optimization can be performed relative to the parameters of the selected quality measure.

3.1. Approximations and rule based classifiers

In this section, we discuss the generation of approximations on extensions of samples of objects. In the example, we illustrate how the approximations of sets (concepts) can be estimated using only partial information about these sets. Moreover, the example introduces uncertainty functions with values in P^2(U*) and rough inclusion functions defined for families of subsets from P^2(U*).

Let us assume that DT = (U, A ∪ {d}) is a decision table, where U = {x_1, ..., x_9} is a set of objects and A = {a, b, c} is a set of condition attributes (see Table 3).

Table 3. Decision table over the set of objects U.

object | a | b | c | d
x_1 | 1 | 1 | 0 | 1
x_2 | 0 | 2 | 0 | 1
x_3 | 1 | 0 | 1 | 0
x_4 | 0 | 2 | 0 | 1
x_5 | 0 | 1 | 0 | 1
x_6 | 0 | 0 | 0 | 0
x_7 | 1 | 0 | 2 | 0
x_8 | 1 | 2 | 1 | 0
x_9 | 0 | 0 | 1 | 0

In DT we compute two decision reducts: {a, b} and {b, c}. We obtain the set Rule_set = {r_1, ..., r_12} of minimal (reduct-based) decision rules [26]. From x_1 we obtain two rules:

r_1: if a = 1 and b = 1 then d = 1,
r_2: if b = 1 and c = 0 then d = 1.

From x_2 and x_4 we obtain two rules:

r_3: if a = 0 and b = 2 then d = 1,
r_4: if b = 2 and c = 0 then d = 1.

From x_5 we obtain one new rule:

r_5: if a = 0 and b = 1 then d = 1.

From x_3 we obtain two rules:

r_6: if a = 1 and b = 0 then d = 0,
r_7: if b = 0 and c = 1 then d = 0.

From x_6 we obtain two rules:

r_8: if a = 0 and b = 0 then d = 0,
r_9: if b = 0 and c = 0 then d = 0.

From x_7 we obtain one new rule:

r_10: if b = 0 and c = 2 then d = 0.

From x_8 we obtain two rules:

r_11: if a = 1 and b = 2 then d = 0,
r_12: if b = 2 and c = 1 then d = 0.

Let U* = U ∪ {x_10, x_11, x_12, x_13, x_14} (see Table 4).

Table 4. Decision table over the new objects U* \ U = {x_10, ..., x_14}.

object | a | b | c | d* (votes of matched rules)
x_10 | 0 | 2 | 1 | 1 from r_3 or 0 from r_12
x_11 | 1 | 2 | 0 | 1 from r_4 or 0 from r_11
x_12 | 1 | 2 | 0 | 1 from r_4 or 0 from r_11
x_13 | 0 | 1 | 2 | 1 from r_5
x_14 | 1 | 1 | 2 | 1 from r_1

Let h : [0,1] → {0, 1/2, 1} be the function defined by

$$h(t) = \begin{cases} 1, & \text{if } t > \frac{1}{2}, \\ \frac{1}{2}, & \text{if } t = \frac{1}{2}, \\ 0, & \text{if } t < \frac{1}{2}. \end{cases} \quad (5)$$

Below we present an example of the uncertainty and rough inclusion functions:

$$I(x) = \{\|lh(r)\|_{U^*} : x \in \|lh(r)\|_{U^*} \text{ and } r \in Rule\_set\}, \quad (6)$$

where x ∈ U* and lh(r) denotes the formula on the left-hand side of the rule r, and

$$\nu_U(X, Z) = \begin{cases} h\!\left(\dfrac{card(\{Y \in X : Y \cap U \subseteq Z\})}{card(\{Y \in X : Y \cap U \subseteq Z\}) + card(\{Y \in X : Y \cap U \subseteq U^* \setminus Z\})}\right), & \text{if } X \neq \emptyset, \\ 0, & \text{if } X = \emptyset, \end{cases} \quad (7)$$

where X ⊆ P(U*) and Z ⊆ U*. The inclusion function defined in (7) can be explained as follows. Let us assume that X is a family of neighborhoods matching an object x, and let Z be a decision class. Then in (7) we have the ratio of the number of neighborhoods from X included (after restriction to the sample U) in the decision class Z to the number of neighborhoods from X included (after restriction to the sample U) either in Z or in the complement of Z. Notice that in these calculations we use only the information available on the sample U. Certainly, (7) presents one of many possible definitions of inclusion relevant for inducing rule-based classifiers.

The uncertainty and rough inclusion functions can now be used to define the lower approximation LOW(AS*, Z), the upper approximation UPP(AS*, Z), and the boundary region BN(AS*, Z) of Z ⊆ U* by:

$$LOW(AS^*, Z) = \{x \in U^* : \nu_U(I(x), Z) = 1\}, \quad (8)$$

and

$$UPP(AS^*, Z) = \{x \in U^* : \nu_U(I(x), Z) > 0\}, \quad (9)$$

$$BN(AS^*, Z) = UPP(AS^*, Z) \setminus LOW(AS^*, Z). \quad (10)$$

In the example, we classify objects from U* into the lower approximation of Z if the majority of the rules matching the object vote for Z, and into the upper approximation of Z if at least half of the rules matching the object vote for Z. Certainly, one can follow many other voting schemes developed in machine learning, or introduce less crisp conditions in the definition of the boundary region. The defined approximations can be treated as estimations of the exact approximations of subsets of U*, because the induced approximations are constructed on the basis of samples of subsets of U* restricted to U only.
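For illustration, the following Python sketch (ours, not the authors' software) recomputes the example: it encodes the rules r_1, ..., r_12 and the attribute vectors of Tables 3 and 4 as reconstructed above, implements Eqs. (5)-(10), and reproduces the approximations reported in Table 5 and Eqs. (11)-(13) below.

```python
# A sketch (our illustration) of the rule-based rough classifier of Section 3.1.

# Attribute vectors (a, b, c) over U* = U ∪ {x10, ..., x14}.
U_star = {
    'x1': (1, 1, 0), 'x2': (0, 2, 0), 'x3': (1, 0, 1), 'x4': (0, 2, 0),
    'x5': (0, 1, 0), 'x6': (0, 0, 0), 'x7': (1, 0, 2), 'x8': (1, 2, 1),
    'x9': (0, 0, 1), 'x10': (0, 2, 1), 'x11': (1, 2, 0), 'x12': (1, 2, 0),
    'x13': (0, 1, 2), 'x14': (1, 1, 2),
}
U = {f'x{i}' for i in range(1, 10)}      # training sample
C1 = {'x1', 'x2', 'x4', 'x5'}            # decision class d = 1 on the sample U

# Rules r1..r12 as ({attribute position: value}, decision); positions (a, b, c) = (0, 1, 2).
RULES = [({0: 1, 1: 1}, 1), ({1: 1, 2: 0}, 1), ({0: 0, 1: 2}, 1), ({1: 2, 2: 0}, 1),
         ({0: 0, 1: 1}, 1), ({0: 1, 1: 0}, 0), ({1: 0, 2: 1}, 0), ({0: 0, 1: 0}, 0),
         ({1: 0, 2: 0}, 0), ({1: 0, 2: 2}, 0), ({0: 1, 1: 2}, 0), ({1: 2, 2: 1}, 0)]

def lh(rule):
    """||lh(r)||_{U*}: objects of U* matching the left-hand side of r."""
    cond, _ = rule
    return frozenset(x for x, v in U_star.items()
                     if all(v[pos] == val for pos, val in cond.items()))

def I(x):
    """Uncertainty function of Eq. (6)."""
    return {lh(r) for r in RULES if x in lh(r)}

def h(t):  # Eq. (5)
    return 1 if t > 1 / 2 else (1 / 2 if t == 1 / 2 else 0)

def nu(X, Z):
    """Rough inclusion of Eq. (7); only information on the sample U is used."""
    if not X:
        return 0
    pro = sum(1 for Y in X if Y & U <= Z)
    con = sum(1 for Y in X if Y & U <= U - Z)
    return h(pro / (pro + con))   # rules are consistent on U, so pro + con >= 1

LOW = {x for x in U_star if nu(I(x), C1) == 1}
UPP = {x for x in U_star if nu(I(x), C1) > 0}
print(sorted(LOW))         # x1, x2, x4, x5, x13, x14 -- cf. Eq. (11)
print(sorted(UPP - LOW))   # x10, x11, x12 -- the boundary region of Eq. (13)
```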

One can use standard quality measures developed in machine learning to calculate the quality of such approximations, assuming that after the estimation of the approximations on U*, full information about the membership of objects in the approximated subsets of U* is uncovered on testing sets, analogously to the situation in machine learning.

Let C_1 = {x ∈ U* : d*(x) = 1} = {x_1, x_2, x_4, x_5, x_10, x_13, x_14}. We obtain the set U* \ C_1 = C_0 = {x_3, x_6, x_7, x_8, x_9, x_11, x_12}. The uncertainty function and the rough inclusion are presented in Table 5.

Table 5. Uncertainty function and rough inclusion over the set of objects U*.

x | I(x) | ν_U(I(x), C_1)
x_1 | {{x_1, x_14}, {x_1, x_5}} | h(2/2) = 1
x_2 | {{x_2, x_4, x_10}, {x_2, x_4, x_11, x_12}} | h(2/2) = 1
x_3 | {{x_3, x_7}, {x_3, x_9}} | h(0/2) = 0
x_4 | {{x_2, x_4, x_10}, {x_2, x_4, x_11, x_12}} | h(2/2) = 1
x_5 | {{x_5, x_13}, {x_1, x_5}} | h(2/2) = 1
x_6 | {{x_6, x_9}, {x_6}} | h(0/2) = 0
x_7 | {{x_3, x_7}, {x_7}} | h(0/2) = 0
x_8 | {{x_8, x_11, x_12}, {x_8, x_10}} | h(0/2) = 0
x_9 | {{x_6, x_9}, {x_3, x_9}} | h(0/2) = 0
x_10 | {{x_2, x_4, x_10}, {x_8, x_10}} | h(1/2) = 1/2
x_11 | {{x_8, x_11, x_12}, {x_2, x_4, x_11, x_12}} | h(1/2) = 1/2
x_12 | {{x_8, x_11, x_12}, {x_2, x_4, x_11, x_12}} | h(1/2) = 1/2
x_13 | {{x_5, x_13}} | h(1/1) = 1
x_14 | {{x_1, x_14}} | h(1/1) = 1

Thus, in our example from Table 5, we obtain

$$LOW(AS^*, C_1) = \{x \in U^* : \nu_U(I(x), C_1) = 1\} = \{x_1, x_2, x_4, x_5, x_{13}, x_{14}\}, \quad (11)$$

$$UPP(AS^*, C_1) = \{x \in U^* : \nu_U(I(x), C_1) > 0\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}\}, \quad (12)$$

$$BN(AS^*, C_1) = UPP(AS^*, C_1) \setminus LOW(AS^*, C_1) = \{x_{10}, x_{11}, x_{12}\}. \quad (13)$$

3.2. Approximations and nearest neighbor classifiers

In this section, we present a method for constructing rough set based classifiers based on the k-nearest neighbors idea. The k-nearest neighbors algorithm (k-NN, where k is a positive integer) is a method for classifying objects based on the k closest training examples in the attribute space [12]. An object is classified by a majority vote of its neighbors, the object being assigned to the decision class most common amongst its k nearest neighbors. If k = 1, then the object is simply assigned to the decision class of its nearest neighbor.

Let DT = (U, A ∪ {d}) be a decision table and let DT* = (U*, A* ∪ {d*}) be an extension of DT. We define NN_k : U* → P(INF(A)) by NN_k(x) = a set of k elements of INF(A) with the minimal distances to Inf_A(x), where INF(A) = {Inf_A(x) : x ∈ U}. The Hamming distance d_A^H(u, v) between two strings u, v ∈ Π_{a∈A} V_a of length card(A) is the number of positions at which the corresponding symbols differ. In our example, we use the normalized Hamming distance d_A : Π_{a∈A} V_a × Π_{a∈A} V_a → [0,1] defined by d_A(u, v) = d_A^H(u, v) / card(A).

The description of x_1 is Inf_A(x_1) = (1,1,0) ∈ INF(A)³ (see Table 3) and the description of x_14 is Inf_A(x_14) = (1,1,2) (see Table 4). Because each object is described by 3 condition attributes, the Hamming distance between Inf_A(x_1) = (1,1,0) and Inf_A(x_14) = (1,1,2) is 1, and the normalized Hamming distance is d_A((1,1,0), (1,1,2)) = 1/3.

We define⁴

$$I_{NN_k}(x) = \{\|\textstyle\bigwedge \mathrm{Inf}_A(y)\|_{U^*} : y \in U \text{ and } \mathrm{Inf}_A(y) \in NN_k(x)\}, \quad (14)$$

$$\nu_{NN_k}(X, Y) = \frac{card\left(\bigcup \{Z \cap U : Z \in X \ \& \ Z \cap U \subseteq Y \cap U\}\right)}{card(U)}, \quad (15)$$

where X is a family of pairwise disjoint subsets of U*. In Eq. (14), we consider the family of neighborhoods of x defined by the k objects from U closest to x (i.e., by the set NN_k(x)). The inclusion degree in (15) is equal to the ratio of the number of objects from the sample U matched by those neighborhoods from I_{NN_k}(x) that are included in Y on the sample, to the number of objects in the sample U. Following this explanation and Eqs. (16) and (17), one can define approximations on extensions of the sample U (see (18) and (19)). One can easily see the close analogy to the classification strategy of k-nearest neighbor classifiers.
Let J_ε : U* → P({d(x) : x ∈ U}), for 0 < ε ≤ 1, be defined by

$$J_\varepsilon(x) = \left\{ i : \neg \exists j \neq i \left( \nu_{NN_k}(I_{NN_k}(x), C_j) > \nu_{NN_k}(I_{NN_k}(x), C_i) + \varepsilon \right) \right\}, \quad (16)$$

and

$$\nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) = \begin{cases} 1, & \text{if } J_\varepsilon(x) = \{i\}, \\ \frac{1}{2}, & \text{if } i \in J_\varepsilon(x) \ \& \ card(J_\varepsilon(x)) > 1, \\ 0, & \text{if } i \notin J_\varepsilon(x). \end{cases} \quad (17)$$

The defined uncertainty function I_{NN_k} and rough inclusion function ν^ε_{NN_k} can now be used to define the lower approximation LOW(AS*, C_i), the upper approximation UPP(AS*, C_i), and the boundary region BN(AS*, C_i) of C_i ⊆ U* by:

$$LOW(AS^*, C_i) = \{x \in U^* : \nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) = 1\}, \quad (18)$$

$$UPP(AS^*, C_i) = \{x \in U^* : \nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) > 0\}, \quad (19)$$

and

$$BN(AS^*, C_i) = UPP(AS^*, C_i) \setminus LOW(AS^*, C_i). \quad (20)$$

³ We write Inf_A(x_1) = (1,1,0) ∈ INF(A) instead of Inf_A(x_1) = {(a,1), (b,1), (c,0)}.
⁴ ||⋀ Inf_A(y)||_{U*} denotes the set of all objects from U* satisfying the conjunction of all descriptors a = a(y) for a ∈ A.
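Continuing the sketch above (and reusing U_star, U, and C1 from it), the following code implements Eqs. (14)-(20) for k = 2 and ε = 0.1. Note that the paper breaks distance ties in NN_k at random, while the sketch breaks them deterministically, so objects with tied neighbors may be classified differently from the run reported in Table 6 below.

```python
# A sketch (our illustration) of the k-NN-based approximations of Section 3.2.

def hamming(u, v):
    """Normalized Hamming distance d_A of Section 3.2."""
    return sum(p != q for p, q in zip(u, v)) / len(u)

INF_A = {U_star[x] for x in U}        # INF(A): signatures of training objects
CLASSES = {1: C1, 0: U - C1}          # decision classes on the sample U

def NN(x, k=2):
    # k signatures of INF(A) closest to Inf_A(x); deterministic tie-breaking.
    return sorted(INF_A, key=lambda s: (hamming(s, U_star[x]), s))[:k]

def I_NN(x, k=2):
    """Eq. (14): semantics over U* of the k nearest training signatures."""
    return {frozenset(z for z, v in U_star.items() if v == s) for s in NN(x, k)}

def nu_NN(X, Yi):
    """Eq. (15): share of U covered by neighborhoods included (on U) in Yi."""
    return len(set().union(*(Z & U for Z in X if Z & U <= Yi))) / len(U)

def nu_eps(x, i, k=2, eps=0.1):
    """Eqs. (16)-(17)."""
    nus = {j: nu_NN(I_NN(x, k), CLASSES[j]) for j in CLASSES}
    J = {j for j in nus if not any(nus[m] > nus[j] + eps for m in nus if m != j)}
    return 1 if J == {i} else (1 / 2 if i in J else 0)

LOW = {x for x in U_star if nu_eps(x, 1) == 1}   # cf. Eq. (21)
UPP = {x for x in U_star if nu_eps(x, 1) > 0}    # cf. Eq. (22)
# With this deterministic tie-breaking, x5 and x13 fall into the boundary
# region; with the random choice used in the paper they land in the lower
# approximation of Eq. (21).
```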

Let k = 2 and ε = 0.1; in our example we obtain the results presented in Table 6. The neighbors are taken from the set U of objects for which the correct classification is known. In the classification phase, a new object is classified by assigning the decision class which is most frequent among the 2 training objects nearest to that new object. In the case of more than two nearest objects, we choose two of them randomly.

Table 6. Uncertainty function I_NN2 and rough inclusion ν^0.1_NN2 over the set of objects U* \ U = {x_10, ..., x_14}.

x | NN_2(x) | I_NN2(x) | ν_NN2(I_NN2(x), C_1)
x_10 | {(0,2,0), (1,2,1)} | {{x_2, x_4}, {x_8}, {x_10}} | 2/9
x_11 | {(1,1,0), (0,2,0)} | {{x_1}, {x_2, x_4}, {x_11, x_12}} | 3/9
x_12 | {(1,1,0), (0,2,0)} | {{x_1}, {x_2, x_4}, {x_11, x_12}} | 3/9
x_13 | {(0,1,0), (1,1,0)} | {{x_5}, {x_1}} | 2/9
x_14 | {(1,1,0), (1,0,2)} | {{x_1}, {x_7}} | 1/9

x | ν_NN2(I_NN2(x), C_0) | J_0.1(x) | ν^0.1_NN2(I_NN2(x), C_1)
x_10 | 1/9 | {1} | 1
x_11 | 0 | {1} | 1
x_12 | 0 | {1} | 1
x_13 | 0 | {1} | 1
x_14 | 1/9 | {0, 1} | 1/2

Thus, in our example from Table 6, we obtain

$$LOW(AS^*, C_1) = \{x \in U^* : \nu^{0.1}_{NN_2}(I_{NN_2}(x), C_1) = 1\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}\}, \quad (21)$$

$$UPP(AS^*, C_1) = \{x \in U^* : \nu^{0.1}_{NN_2}(I_{NN_2}(x), C_1) > 0\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}\}, \quad (22)$$

$$BN(AS^*, C_1) = UPP(AS^*, C_1) \setminus LOW(AS^*, C_1) = \{x_{14}\}. \quad (23)$$

3.3. Function approximations

In this subsection, we discuss the rough set approach to function approximation from available incomplete data. Our approach can be treated as a kind of rough clustering of functional data [31]. Let us consider an example of function approximation. We assume that only partial information about a function is available; this means that some points from the graph of the function are known.

Before presenting a more formal description of function approximation, we introduce some notation. A function f : U → R_+ will be called a sample of a function f* : U* → R_+, where R_+ is the set of non-negative reals and U ⊆ U* is a finite subset of U*, if f* is an extension of f. By Gf (Gf*) we denote the graph of f (f*), respectively, i.e., the set {(x, f(x)) : x ∈ U} ({(x, f*(x)) : x ∈ U*}). For any Z ⊆ U* × R_+, by π_1(Z) and π_2(Z) we denote the sets {x ∈ U* : ∃y ∈ R_+ (x, y) ∈ Z} and {y ∈ R_+ : ∃x ∈ U* (x, y) ∈ Z}, respectively.

First, we define approximations of Gf given on a sample U of objects, and next we show how to induce approximations of Gf* over U*, i.e., on an extension of U. Let D be a partition of f(U) into sets of reals of diameter less than δ > 0, where δ is a given threshold. We also assume that IS = (U, A) is a given information system. Let us also assume that to any object signature Inf_A(x) = {(a, a(x)) : a ∈ A} [26] there is assigned an interval of non-negative reals with diameter less than δ. We denote this interval by D_{Inf_A(x)}. Hence, D = {D_{Inf_A(x)} : x ∈ U}. We consider an approximation space AS_{IS,D} = (U, I, ν) (relative to the given IS and D), where

$$I(x) = [x]_{IND(A)} \times D_{\mathrm{Inf}_A(x)}, \quad (24)$$

and

$$\nu(X, Y) = \begin{cases} \dfrac{card(\pi_1(X \cap Y))}{card(\pi_1(X))}, & \text{if } X \neq \emptyset, \\ 1, & \text{if } X = \emptyset, \end{cases} \quad (25)$$

for X, Y ⊆ U × R_+. The lower approximation and the upper approximation of Gf in AS_{IS,D} are defined by

$$LOW(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) = 1\}, \quad (26)$$

and

$$UPP(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) > 0\}, \quad (27)$$

respectively. Observe that this definition is different from the standard definition of the lower approximation [26]. The defined approximation space is a bit more general than in [34]; e.g., the values of the uncertainty function are subsets of U × R_+ instead of U. Moreover, one can easily see that by applying the standard definition of relation approximation to f [26] (a function is a special case of a relation), the lower approximation of a function is almost always equal to the empty set. The new definition makes it possible to express better the fact that a given neighborhood matches the graph of f well [37,42]. For expressing this, the classical set-theoretical inclusion of a neighborhood in the graph of f is not satisfactory.

Example 1. We present a first illustrative example of function approximation. Let f : U → R_+, where U = {1,2,3,4,5,6}. Let f(1) = 3, f(2) = 2, f(3) = 2, f(4) = 5, f(5) = 5 and f(6) = 2. Let IS = (U, A) be an information system, where A = {a} and

$$a(x) = \begin{cases} 0, & \text{if } 0 \le x \le 2, \\ 1, & \text{if } 2 < x \le 4, \\ 2, & \text{if } 4 < x \le 6. \end{cases} \quad (28)$$

Thus the partition U/IND(A) = {{1,2}, {3,4}, {5,6}}. The graph of f is Gf = {(x, f(x)) : x ∈ U} = {(1,3), (2,2), (3,2), (4,5), (5,5), (6,2)}. We define approximations of Gf given on the sample U of objects. We obtain f(U) = {2,3,5}, and let D = {{2,3}, {5}} be a partition of f(U). We consider an approximation space AS_{IS,D} = (U, I, ν) (relative to the given IS and D), where

$$I(x) = [x]_{IND(A)} \times D_{\mathrm{Inf}_A(x)} \quad (29)$$

is defined by

$$I(x) = \begin{cases} \{1,2\} \times [1.5, 4], & \text{if } x \in \{1,2\}, \\ \{3,4\} \times [1.7, 4.5], & \text{if } x \in \{3,4\}, \\ \{5,6\} \times [3, 4], & \text{if } x \in \{5,6\}. \end{cases} \quad (30)$$

We obtain the lower approximation and the upper approximation of Gf in the approximation space AS_{IS,D}:

$$LOW(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) = 1\} = I(1) \cup I(2) = \{1,2\} \times [1.5, 4], \quad (31)$$

and

$$UPP(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) > 0\} = I(1) \cup I(2) \cup I(3) \cup I(4) = \{1,2\} \times [1.5, 4] \cup \{3,4\} \times [1.7, 4.5], \quad (32)$$

respectively.
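The following Python sketch (ours) recomputes Example 1: it encodes the indiscernibility classes and the intervals of Eq. (30) and evaluates the inclusion degrees of Eq. (25), recovering the approximations of Eqs. (31)-(32).

```python
# A short sketch (our illustration) recomputing Example 1.

f = {1: 3, 2: 2, 3: 2, 4: 5, 5: 5, 6: 2}
Gf = set(f.items())

# Indiscernibility classes of IND(A) and the intervals D_{Inf_A(x)} of Eq. (30).
blocks = {
    frozenset({1, 2}): (1.5, 4.0),
    frozenset({3, 4}): (1.7, 4.5),
    frozenset({5, 6}): (3.0, 4.0),
}

def I(x):
    """I(x) = [x]_IND(A) x D_{Inf_A(x)}, represented as (class, interval)."""
    for block, interval in blocks.items():
        if x in block:
            return block, interval

def nu(Ix, G):
    """Eq. (25): fraction of the class whose sample points fall in the interval."""
    block, (lo, hi) = Ix
    hits = {x for (x, y) in G if x in block and lo <= y <= hi}
    return len(hits) / len(block)

low = {I(x) for x in f if nu(I(x), Gf) == 1}
upp = {I(x) for x in f if nu(I(x), Gf) > 0}
# low covers {1,2} x [1.5,4] only; upp additionally covers {3,4} x [1.7,4.5],
# matching Eqs. (31)-(32); the class {5,6} has nu = 0 and is excluded.
```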
Example 2. We present a second illustrative example of function approximation. First, let us recall that an interval is a set of real numbers with the property that any number lying between two numbers in the set is also included in the set. The closed interval of numbers between v and w (v, w ∈ R_+), including v and w, is denoted by [v, w].

Let us consider a function f : R_+ → R_+. We have only partial information about this function, given by G_i = d(x_i) × e(f(x_i)), where d(x_i) denotes a closed interval of reals to which x_i belongs, e(f(x_i)) denotes a closed interval of reals to which f(x_i) belongs, and i = 1, ..., n. The family {G_1, ..., G_n} is called a partial information about the graph Gf = {(x, f(x)) : x ∈ R_+}. Let Nh denote a family of elements of P(R_+) × P(R_+) called neighborhoods. In our example, we consider Nh = {X_1, X_2, X_3}, where X_1 = [1,6] × [0.1,0.4], X_2 = [7,12] × [1.1,1.4], and X_3 = [13,18] × [2.1,2.4]. Let us recall the definition of inclusion between closed intervals:

$$[v_1, w_1] \subseteq [v_2, w_2] \iff v_2 \le v_1 \ \& \ w_1 \le w_2. \quad (33)$$

We define a new rough inclusion function (where the first applicable case is taken) by

$$\nu(X, \{G_1, \ldots, G_n\}) = \begin{cases} 1, & \text{if } \forall_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \rightarrow G_i \subseteq X \right), \\ \frac{1}{2}, & \text{if } \exists_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \ \& \ G_i \cap X \neq \emptyset \right), \\ 0, & \text{if } \forall_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \rightarrow G_i \cap X = \emptyset \right). \end{cases} \quad (34)$$

Let sample values of a function f : R_+ → R_+ be given in Table 7, and let an approximation space AS = (R_+, Nh, ν) be given. We define the lower and upper approximations as follows:

$$LOW(AS, \{G_1, \ldots, G_n\}) = \bigcup \{X \in Nh : \nu(X, \{G_1, \ldots, G_n\}) = 1\}, \quad (35)$$

$$UPP(AS, \{G_1, \ldots, G_n\}) = \bigcup \{X \in Nh : \nu(X, \{G_1, \ldots, G_n\}) > 0\}. \quad (36)$$

Table 7. Sample values of a function f: the intervals d(x_i), e(f(x_i)) and the boxes G_i = d(x_i) × e(f(x_i)).

i | d(x_i) | e(f(x_i)) | G_i | π_1(G_i)
1 | [1,2] | [0.5,0.6] | [1,2] × [0.5,0.6] | [1,2]
2 | [3,4] | [0.6,0.6] | [3,4] × [0.6,0.6] | [3,4]
3 | [5,6] | [0,0.5] | [5,6] × [0,0.5] | [5,6]
4 | [7,8] | [1.3,1.4] | [7,8] × [1.3,1.4] | [7,8]
5 | [9,10] | [1.1,1.2] | [9,10] × [1.1,1.2] | [9,10]
6 | [11,12] | [1.3,1.4] | [11,12] × [1.3,1.4] | [11,12]
7 | [13,14] | [2.2,2.3] | [13,14] × [2.2,2.3] | [13,14]
8 | [15,16] | [2,2.05] | [15,16] × [2,2.05] | [15,16]
9 | [17,18] | [2.45,2.5] | [17,18] × [2.45,2.5] | [17,18]

In our example, we obtain

$$LOW(AS, \{G_1, \ldots, G_9\}) = X_2 = [7,12] \times [1.1,1.4], \quad (37)$$

$$UPP(AS, \{G_1, \ldots, G_9\}) = X_2 \cup X_3 = [7,12] \times [1.1,1.4] \cup [13,18] \times [2.1,2.4]. \quad (38)$$
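A sketch (ours) of Example 2 follows: boxes are represented as pairs of closed intervals, and the inclusion function of Eq. (34) is evaluated over the neighborhoods X_1, X_2, X_3 and the patterns G_1, ..., G_9 of Table 7. It recovers ν(X_2) = 1 and ν(X_3) = 1/2 as in Eqs. (37)-(38); note that under a fully literal reading of Eq. (34) the box G_3 = [5,6] × [0,0.5] also meets X_1 in both coordinates, so the sketch assigns ν(X_1) = 1/2, while the paper reports X_1 outside the upper approximation.

```python
# A sketch (our illustration) of Example 2; boxes are pairs of closed intervals.

G = [((1, 2), (0.5, 0.6)), ((3, 4), (0.6, 0.6)), ((5, 6), (0.0, 0.5)),
     ((7, 8), (1.3, 1.4)), ((9, 10), (1.1, 1.2)), ((11, 12), (1.3, 1.4)),
     ((13, 14), (2.2, 2.3)), ((15, 16), (2.0, 2.05)), ((17, 18), (2.45, 2.5))]
Nh = {'X1': ((1, 6), (0.1, 0.4)),
      'X2': ((7, 12), (1.1, 1.4)),
      'X3': ((13, 18), (2.1, 2.4))}

def overlaps(i, j):
    """Closed intervals i and j intersect."""
    return max(i[0], j[0]) <= min(i[1], j[1])

def contained(i, j):
    """Eq. (33): i is a subinterval of j."""
    return j[0] <= i[0] and i[1] <= j[1]

def nu(X, Gs):
    """Eq. (34); the three cases are checked in the order listed."""
    relevant = [g for g in Gs if overlaps(g[0], X[0])]   # π1(Gi) ∩ π1(X) nonempty
    if all(contained(g[0], X[0]) and contained(g[1], X[1]) for g in relevant):
        return 1.0
    if any(overlaps(g[1], X[1]) for g in relevant):      # Gi ∩ X nonempty as boxes
        return 0.5
    return 0.0

for name, X in Nh.items():
    print(name, nu(X, G))
# X2 -> 1.0 and X3 -> 0.5, as in Eqs. (37)-(38). X1 -> 0.5 here, because the
# value interval [0, 0.5] of G3 touches [0.1, 0.4]; the paper reports
# nu(X1) = 0, i.e., it treats G3 and X1 as disjoint.
```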

The above defined approximations are approximations over the set of objects from the sample U ⊆ U*. Now, we present an approach to inducing approximations of the graph Gf* of a function f* on U*, i.e., on an extension of U. We use an illustrative example to present the approach. It is worth mentioning that, by using Boolean reasoning [26], one can generate patterns described by conjunctions of descriptors over IS such that the deviation of f on such patterns in U is less than a given threshold δ. This means that, for any such formula α, the set f(||α||_U) has diameter less than δ, i.e., the image of ||α||_U, that is the set f(||α||_U), is included in [y - δ/2, y + δ/2) for some y ∈ R_+. Moreover, one can generate minimal such patterns, i.e., formulas α having the above property such that no formula obtained by dropping some descriptors from α has this property [6,26]. By PATTERN(A, f, δ) we denote a set of induced patterns with the above properties. One can also assume⁵ that PATTERN(A, f, δ) is extended by adding some shortenings of minimal patterns. To any pattern α from PATTERN(A, f, δ) there is assigned an interval of reals D_α such that the deviation of f on ||α||_U is within D_α, i.e., f(||α||_U) ⊆ D_α. Note that, for any Boolean combination α of descriptors over A, its semantics ||α||_{U*} over U* is also well defined. However, only information about the part of ||α||_{U*} equal to ||α||_U = ||α||_{U*} ∩ U is available. Assuming that the patterns from PATTERN(A, f, δ) are strong (i.e., their support is sufficiently large), one may induce that the following inclusion holds:

$$f^*(\|\alpha\|_{U^*}) \subseteq [y - \delta/2, y + \delta/2). \quad (39)$$

We can now define a generalized approximation space making it possible to extend the approximation of Gf = {(x, f(x)) : x ∈ U} over the previously defined approximation space AS to an approximation of Gf* = {(x, f*(x)) : x ∈ U*}, where U ⊆ U*. Let us consider a generalized approximation space

$$AS^* = (U, U^*, I^*, \nu^*_{tr}, L), \quad (40)$$

where

- tr is a given threshold from the interval [0, 0.5),
- L is a language of Boolean combinations of descriptors over the information system IS [26], used for the construction of patterns from the set PATTERN(A, f, δ),
- I*(x) = {||α||_{U*} × D_α : α ∈ PATTERN(A, f, δ) & x ∈ ||α||_{U*}} for x ∈ U*, where U* is an extension of the sample U, i.e., U ⊆ U*,
- for any finite family X ⊆ P(U*) × I, where P(U*) is the powerset of U* and I is a family of intervals of reals of diameter less than δ, and for any Y ⊆ U* × R_+ representing the graph of a function from U* into R_+,

$$\nu^*_{tr}(X, Y) = \begin{cases} 1, & \text{if } Max < tr, \\ \frac{1}{2}, & \text{if } tr \le Max < 1 - tr, \\ 0, & \text{if } Max \ge 1 - tr, \end{cases} \quad (41)$$

where

1. $Max = \max\left\{ \dfrac{|y^* - mid(\pi_2(Z))|}{\max\{y^*, mid(\pi_2(Z))\}} : Z \in X \ \& \ \nu_U(Z, Y) > 0 \right\}$,
2. ν_U(Z, Y) = ν((π_1(Z) ∩ U) × π_2(Z), Y ∩ (U × R_+)), where ν is defined by Eq. (25),
3. mid(D) = (a + b)/2, where D = [a, b),

⁵ Analogously to the shortening of decision rules [26].

and

$$y^* = \frac{1}{c} \sum_{Z \in X : \nu_U(Z, Y) > 0} mid(\pi_2(Z)) \cdot card(\pi_1(Z \cap Y) \cap U), \quad (42)$$

where

$$c = card\left( \bigcup_{Z \in X : \nu_U(Z, Y) > 0} \pi_1(Z) \cap U \right). \quad (43)$$

The lower approximation of Gf* is defined by

$$LOW^*(AS^*, Gf^*) = \{(x, y) : \nu^*_{tr}(I^*(x), Gf^*) = 1 \ \& \ x \in U^* \ \& \ y \in [y^* - \delta/2, y^* + \delta/2)\}, \quad (44)$$

where y* is obtained from Eq. (42) in which X is substituted by I*(x) and Y by Gf*, respectively. The upper approximation of Gf* is defined by

$$UPP^*(AS^*, Gf^*) = \{(x, y) : \nu^*_{tr}(I^*(x), Gf^*) > 0 \ \& \ x \in U^* \ \& \ y \in [y^* - \delta/2, y^* + \delta/2)\}, \quad (45)$$

where y* is obtained analogously. Let us observe that, for x ∈ U*, the condition (x, y) ∉ UPP*(AS*, Gf*) means that either ν*_tr(I*(x), Gf*) = 0 and y ∈ [y* - δ/2, y* + δ/2), or y ∉ [y* - δ/2, y* + δ/2). The first condition describes the set of all pairs (x, y) where the deviation of y from y* is small (relative to δ) but the prediction of y on the set of patterns I*(x) is very risky. The values of f* can be induced by

$$\hat{f}(x) = \begin{cases} [y^* - \delta/2, y^* + \delta/2), & \text{if } \nu^*_{tr}(I^*(x), Gf^*) > 0, \\ \text{undefined}, & \text{otherwise}, \end{cases} \quad (46)$$

where x ∈ U* \ U and y* is obtained from Eq. (42) in which X is substituted by I*(x) and Y by Gf*, respectively.

Let us now explain the formal definitions presented above. The value of the uncertainty function I*(x) for a given object x consists of all patterns of the form ||α||_{U*} × D_α such that ||α||_{U*} is matched by the object x. The condition x ∈ ||α||_{U*} can be verified by checking whether the A-signature of x, i.e., Inf_A(x), matches α (to a satisfactory degree). The deviation of f* on ||α||_{U*} is bounded by the interval D_α of reals. The degree to which Z is included in Y is estimated by ν_U(Z, Y), i.e., by the degree to which the projection of the pattern Z restricted to U is included in Y projected on U. The estimated value of f*(x) belongs to the interval [y* - δ/2, y* + δ/2) obtained by fusion of the centers of the intervals assigned to the patterns from X. In this fusion, the weights of these centers reflect the strength on U of the patterns matching Y to a positive degree. The result of the fusion is normalized by c. The degree to which a family of patterns X is included in Y is measured by the deviation of the value y* from the centers of the intervals of the patterns Z matching Y to a positive degree (i.e., with ν_U(Z, Y) > 0).

Fig. 3 illustrates the idea of the presented definition of y*, where:

- Z_i = ||α_i||_{U*} × D_{α_i} for i = 1, 2, 3, and I*(x) = {Z_1, Z_2, Z_3},
- the horizontal bold lines illustrate the projections of the sets Z_i (i = 1, 2, 3) on U*,
- the vertical bold lines illustrate the projections of the sets Z_i (i = 1, 2, 3) on R_+,
- y* = (1/c) Σ_{t=1}^{3} mid(D_{α_t}) · card(||α_t||_U), where c is defined by Eq. (43),
- ν_U(Z_i, Gf*) > 0 for i = 1, 2, 3, because (x_1, f(x_1)) ∈ Z_1 and (x_2, f(x_2)) ∈ Z_2 ∩ Z_3 for x_1, x_2 ∈ U.

Fig. 3. Inducing the value y*.
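Finally, a compact sketch (our reading of Eqs. (42)-(43), on hypothetical patterns and sample values) of the fused prediction y*: a weighted mean of the midpoints of the value intervals of the patterns matching the graph on the sample, with weights given by the number of sample points each pattern matches inside its interval, normalized by c.

```python
# A sketch (our illustration) of the fusion y* of Eqs. (42)-(43). The pattern
# and sample data below are hypothetical; it assumes at least one pattern
# matches the graph on the sample.

def y_star(patterns, graph_on_U):
    """patterns: list of (objects_on_U, (lo, hi)) pairs, i.e., π1(Z) ∩ U and π2(Z);
    graph_on_U: dict x -> f(x) on the sample U."""
    matching = []
    for objs, (lo, hi) in patterns:
        # ν_U(Z, Gf) > 0: some sample point of the pattern falls in its interval.
        hits = {x for x in objs if lo <= graph_on_U.get(x, float('nan')) <= hi}
        if hits:
            matching.append((objs, (lo + hi) / 2, len(hits)))
    c = len(set().union(*(objs for objs, _, _ in matching)))        # Eq. (43)
    return sum(mid * weight for _, mid, weight in matching) / c     # Eq. (42)

# Hypothetical: three patterns supported on a six-point sample of f.
f_sample = {1: 3.0, 2: 2.0, 3: 2.2, 4: 2.4, 5: 5.0, 6: 2.1}
Z = [({1, 2, 3}, (1.8, 3.2)), ({3, 4}, (2.0, 2.6)), ({5, 6}, (4.8, 5.2))]
print(y_star(Z, f_sample))   # (2.5*3 + 2.3*2 + 5.0*1) / 6 = 2.85
```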


More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) This is a sample cover image for this issue. The actual cover is not yet available at this time.) This article appeared in a journal published by Elsevier. The attached copy is furnished to the author

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

The size of decision table can be understood in terms of both cardinality of A, denoted by card (A), and the number of equivalence classes of IND (A),

The size of decision table can be understood in terms of both cardinality of A, denoted by card (A), and the number of equivalence classes of IND (A), Attribute Set Decomposition of Decision Tables Dominik Slezak Warsaw University Banacha 2, 02-097 Warsaw Phone: +48 (22) 658-34-49 Fax: +48 (22) 658-34-48 Email: slezak@alfa.mimuw.edu.pl ABSTRACT: Approach

More information

Rough sets: Some extensions

Rough sets: Some extensions Information Sciences 177 (2007) 28 40 www.elsevier.com/locate/ins Rough sets: Some extensions Zdzisław Pawlak, Andrzej Skowron * Institute of Mathematics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland

More information

A PRIMER ON ROUGH SETS:

A PRIMER ON ROUGH SETS: A PRIMER ON ROUGH SETS: A NEW APPROACH TO DRAWING CONCLUSIONS FROM DATA Zdzisław Pawlak ABSTRACT Rough set theory is a new mathematical approach to vague and uncertain data analysis. This Article explains

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

A Posteriori Corrections to Classification Methods.

A Posteriori Corrections to Classification Methods. A Posteriori Corrections to Classification Methods. Włodzisław Duch and Łukasz Itert Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland; http://www.phys.uni.torun.pl/kmk

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Similarity-based Classification with Dominance-based Decision Rules

Similarity-based Classification with Dominance-based Decision Rules Similarity-based Classification with Dominance-based Decision Rules Marcin Szeląg, Salvatore Greco 2,3, Roman Słowiński,4 Institute of Computing Science, Poznań University of Technology, 60-965 Poznań,

More information

Foundations of Classification

Foundations of Classification Foundations of Classification J. T. Yao Y. Y. Yao and Y. Zhao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {jtyao, yyao, yanzhao}@cs.uregina.ca Summary. Classification

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

An algorithm for induction of decision rules consistent with the dominance principle

An algorithm for induction of decision rules consistent with the dominance principle An algorithm for induction of decision rules consistent with the dominance principle Salvatore Greco 1, Benedetto Matarazzo 1, Roman Slowinski 2, Jerzy Stefanowski 2 1 Faculty of Economics, University

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Rough sets and Boolean reasoning

Rough sets and Boolean reasoning Information Sciences 177 (2007) 41 73 www.elsevier.com/locate/ins Rough sets and Boolean reasoning Zdzisław Pawlak, Andrzej Skowron * Institute of Mathematics, Warsaw University, ul. Banacha 2, 02-097

More information

Multicriteria decision-making method using the correlation coefficient under single-valued neutrosophic environment

Multicriteria decision-making method using the correlation coefficient under single-valued neutrosophic environment International Journal of General Systems, 2013 Vol. 42, No. 4, 386 394, http://dx.doi.org/10.1080/03081079.2012.761609 Multicriteria decision-making method using the correlation coefficient under single-valued

More information

APPLICATION FOR LOGICAL EXPRESSION PROCESSING

APPLICATION FOR LOGICAL EXPRESSION PROCESSING APPLICATION FOR LOGICAL EXPRESSION PROCESSING Marcin Michalak, Michał Dubiel, Jolanta Urbanek Institute of Informatics, Silesian University of Technology, Gliwice, Poland Marcin.Michalak@polsl.pl ABSTRACT

More information

The Decision List Machine

The Decision List Machine The Decision List Machine Marina Sokolova SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 sokolova@site.uottawa.ca Nathalie Japkowicz SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 nat@site.uottawa.ca

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

On Improving the k-means Algorithm to Classify Unclassified Patterns

On Improving the k-means Algorithm to Classify Unclassified Patterns On Improving the k-means Algorithm to Classify Unclassified Patterns Mohamed M. Rizk 1, Safar Mohamed Safar Alghamdi 2 1 Mathematics & Statistics Department, Faculty of Science, Taif University, Taif,

More information

Fuzzy Limits of Functions

Fuzzy Limits of Functions Fuzzy Limits of Functions Mark Burgin Department of Mathematics University of California, Los Angeles 405 Hilgard Ave. Los Angeles, CA 90095 Abstract The goal of this work is to introduce and study fuzzy

More information

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang /

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang / CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors Furong Huang / furongh@cs.umd.edu What we know so far Decision Trees What is a decision tree, and how to induce it from

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING

ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING Mofreh Hogo, Miroslav Šnorek CTU in Prague, Departement Of Computer Sciences And Engineering Karlovo Náměstí 13, 121 35 Prague

More information

Learning from Examples

Learning from Examples Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) (This is a sample cover image for this issue. The actual cover is not yet available at this time.) This article appeared in a journal published by Elsevier. The attached copy is furnished to the author

More information

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes These notes form a brief summary of what has been covered during the lectures. All the definitions must be memorized and understood. Statements

More information

Approximate Boolean Reasoning: Foundations and Applications in Data Mining

Approximate Boolean Reasoning: Foundations and Applications in Data Mining Approximate Boolean Reasoning: Foundations and Applications in Data Mining Hung Son Nguyen Institute of Mathematics, Warsaw University Banacha 2, 02-097 Warsaw, Poland son@mimuw.edu.pl Table of Contents

More information

Minimal Attribute Space Bias for Attribute Reduction

Minimal Attribute Space Bias for Attribute Reduction Minimal Attribute Space Bias for Attribute Reduction Fan Min, Xianghui Du, Hang Qiu, and Qihe Liu School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu

More information

On rule acquisition in incomplete multi-scale decision tables

On rule acquisition in incomplete multi-scale decision tables *Manuscript (including abstract) Click here to view linked References On rule acquisition in incomplete multi-scale decision tables Wei-Zhi Wu a,b,, Yuhua Qian c, Tong-Jun Li a,b, Shen-Ming Gu a,b a School

More information

Mathematical Approach to Vagueness

Mathematical Approach to Vagueness International Mathematical Forum, 2, 2007, no. 33, 1617-1623 Mathematical Approach to Vagueness Angel Garrido Departamento de Matematicas Fundamentales Facultad de Ciencias de la UNED Senda del Rey, 9,

More information

Consistency of Nearest Neighbor Methods

Consistency of Nearest Neighbor Methods E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study

More information

Fuzzy Systems. Introduction

Fuzzy Systems. Introduction Fuzzy Systems Introduction Prof. Dr. Rudolf Kruse Christian Moewes {kruse,cmoewes}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge

More information

A Logical Formulation of the Granular Data Model

A Logical Formulation of the Granular Data Model 2008 IEEE International Conference on Data Mining Workshops A Logical Formulation of the Granular Data Model Tuan-Fang Fan Department of Computer Science and Information Engineering National Penghu University

More information

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat Geometric View of Machine Learning Nearest Neighbor Classification Slides adapted from Prof. Carpuat What we know so far Decision Trees What is a decision tree, and how to induce it from data Fundamental

More information

Data Mining und Maschinelles Lernen

Data Mining und Maschinelles Lernen Data Mining und Maschinelles Lernen Ensemble Methods Bias-Variance Trade-off Basic Idea of Ensembles Bagging Basic Algorithm Bagging with Costs Randomization Random Forests Boosting Stacking Error-Correcting

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

1 [15 points] Frequent Itemsets Generation With Map-Reduce

1 [15 points] Frequent Itemsets Generation With Map-Reduce Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.

More information

Research Article Special Approach to Near Set Theory

Research Article Special Approach to Near Set Theory Mathematical Problems in Engineering Volume 2011, Article ID 168501, 10 pages doi:10.1155/2011/168501 Research Article Special Approach to Near Set Theory M. E. Abd El-Monsef, 1 H. M. Abu-Donia, 2 and

More information

Chapter 1 The Real Numbers

Chapter 1 The Real Numbers Chapter 1 The Real Numbers In a beginning course in calculus, the emphasis is on introducing the techniques of the subject;i.e., differentiation and integration and their applications. An advanced calculus

More information

Knowledge Discovery Based Query Answering in Hierarchical Information Systems

Knowledge Discovery Based Query Answering in Hierarchical Information Systems Knowledge Discovery Based Query Answering in Hierarchical Information Systems Zbigniew W. Raś 1,2, Agnieszka Dardzińska 3, and Osman Gürdal 4 1 Univ. of North Carolina, Dept. of Comp. Sci., Charlotte,

More information

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA address:

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA  address: Topology Xiaolong Han Department of Mathematics, California State University, Northridge, CA 91330, USA E-mail address: Xiaolong.Han@csun.edu Remark. You are entitled to a reward of 1 point toward a homework

More information

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets Today s topics Introduction to Set Theory ( 1.6) Sets Definitions Operations Proving Set Identities Reading: Sections 1.6-1.7 Upcoming Functions A set is a new type of structure, representing an unordered

More information

Synchronization of an uncertain unified chaotic system via adaptive control

Synchronization of an uncertain unified chaotic system via adaptive control Chaos, Solitons and Fractals 14 (22) 643 647 www.elsevier.com/locate/chaos Synchronization of an uncertain unified chaotic system via adaptive control Shihua Chen a, Jinhu L u b, * a School of Mathematical

More information

A Generalized Decision Logic in Interval-set-valued Information Tables

A Generalized Decision Logic in Interval-set-valued Information Tables A Generalized Decision Logic in Interval-set-valued Information Tables Y.Y. Yao 1 and Qing Liu 2 1 Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca

More information

Supplementary Material for MTH 299 Online Edition

Supplementary Material for MTH 299 Online Edition Supplementary Material for MTH 299 Online Edition Abstract This document contains supplementary material, such as definitions, explanations, examples, etc., to complement that of the text, How to Think

More information

Computers and Electrical Engineering

Computers and Electrical Engineering Computers and Electrical Engineering 36 (2010) 56 60 Contents lists available at ScienceDirect Computers and Electrical Engineering journal homepage: wwwelseviercom/locate/compeleceng Cryptanalysis of

More information

Entropy for intuitionistic fuzzy sets

Entropy for intuitionistic fuzzy sets Fuzzy Sets and Systems 118 (2001) 467 477 www.elsevier.com/locate/fss Entropy for intuitionistic fuzzy sets Eulalia Szmidt, Janusz Kacprzyk Systems Research Institute, Polish Academy of Sciences ul. Newelska

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

1 The Well Ordering Principle, Induction, and Equivalence Relations

1 The Well Ordering Principle, Induction, and Equivalence Relations 1 The Well Ordering Principle, Induction, and Equivalence Relations The set of natural numbers is the set N = f1; 2; 3; : : :g. (Some authors also include the number 0 in the natural numbers, but number

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Fuzzy Systems. Introduction

Fuzzy Systems. Introduction Fuzzy Systems Introduction Prof. Dr. Rudolf Kruse Christoph Doell {kruse,doell}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge Processing

More information

Slides credits: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

Slides credits: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

Intuitionistic L-Fuzzy Rings. By K. Meena & K. V. Thomas Bharata Mata College, Thrikkakara

Intuitionistic L-Fuzzy Rings. By K. Meena & K. V. Thomas Bharata Mata College, Thrikkakara Global Journal of Science Frontier Research Mathematics and Decision Sciences Volume 12 Issue 14 Version 1.0 Type : Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

Semantic Rendering of Data Tables: Multivalued Information Systems Revisited

Semantic Rendering of Data Tables: Multivalued Information Systems Revisited Semantic Rendering of Data Tables: Multivalued Information Systems Revisited Marcin Wolski 1 and Anna Gomolińska 2 1 Maria Curie-Skłodowska University, Department of Logic and Cognitive Science, Pl. Marii

More information

Solving Classification Problems By Knowledge Sets

Solving Classification Problems By Knowledge Sets Solving Classification Problems By Knowledge Sets Marcin Orchel a, a Department of Computer Science, AGH University of Science and Technology, Al. A. Mickiewicza 30, 30-059 Kraków, Poland Abstract We propose

More information

Journal of Computational Physics

Journal of Computational Physics Journal of Computational Physics 9 () 759 763 Contents lists available at ScienceDirect Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp Short Note A comment on the computation

More information

Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains

Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains Anatoly Prihozhy and Liudmila Prihozhaya Information Technologies and Robotics Department,

More information

Three-Way Analysis of Facial Similarity Judgments

Three-Way Analysis of Facial Similarity Judgments Three-Way Analysis of Facial Similarity Judgments Daryl H. Hepting, Hadeel Hatim Bin Amer, and Yiyu Yao University of Regina, Regina, SK, S4S 0A2, CANADA hepting@cs.uregina.ca, binamerh@cs.uregina.ca,

More information

ARPN Journal of Science and Technology All rights reserved.

ARPN Journal of Science and Technology All rights reserved. Rule Induction Based On Boundary Region Partition Reduction with Stards Comparisons Du Weifeng Min Xiao School of Mathematics Physics Information Engineering Jiaxing University Jiaxing 34 China ABSTRACT

More information

A first model of learning

A first model of learning A first model of learning Let s restrict our attention to binary classification our labels belong to (or ) We observe the data where each Suppose we are given an ensemble of possible hypotheses / classifiers

More information

Modern Information Retrieval

Modern Information Retrieval Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction

More information

Application of Rough Set Theory in Performance Analysis

Application of Rough Set Theory in Performance Analysis Australian Journal of Basic and Applied Sciences, 6(): 158-16, 1 SSN 1991-818 Application of Rough Set Theory in erformance Analysis 1 Mahnaz Mirbolouki, Mohammad Hassan Behzadi, 1 Leila Karamali 1 Department

More information

Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations

Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations D. R. Wilkins Academic Year 1996-7 1 Number Systems and Matrix Algebra Integers The whole numbers 0, ±1, ±2, ±3, ±4,...

More information

Data Analysis - the Rough Sets Perspective

Data Analysis - the Rough Sets Perspective Data Analysis - the Rough ets Perspective Zdzisław Pawlak Institute of Computer cience Warsaw University of Technology 00-665 Warsaw, Nowowiejska 15/19 Abstract: Rough set theory is a new mathematical

More information

Sequential dynamical systems over words

Sequential dynamical systems over words Applied Mathematics and Computation 174 (2006) 500 510 www.elsevier.com/locate/amc Sequential dynamical systems over words Luis David Garcia a, Abdul Salam Jarrah b, *, Reinhard Laubenbacher b a Department

More information

Learning Decision Trees

Learning Decision Trees Learning Decision Trees Machine Learning Spring 2018 1 This lecture: Learning Decision Trees 1. Representation: What are decision trees? 2. Algorithm: Learning decision trees The ID3 algorithm: A greedy

More information

2 WANG Jue, CUI Jia et al. Vol.16 no", the discernibility matrix is only a new kind of learning method. Otherwise, we have to provide the specificatio

2 WANG Jue, CUI Jia et al. Vol.16 no, the discernibility matrix is only a new kind of learning method. Otherwise, we have to provide the specificatio Vol.16 No.1 J. Comput. Sci. & Technol. Jan. 2001 Investigation on AQ11, ID3 and the Principle of Discernibility Matrix WANG Jue (Ξ ±), CUI Jia ( ) and ZHAO Kai (Π Λ) Institute of Automation, The Chinese

More information

Rough operations on Boolean algebras

Rough operations on Boolean algebras Rough operations on Boolean algebras Guilin Qi and Weiru Liu School of Computer Science, Queen s University Belfast Belfast, BT7 1NN, UK Abstract In this paper, we introduce two pairs of rough operations

More information

CHAPTER-17. Decision Tree Induction

CHAPTER-17. Decision Tree Induction CHAPTER-17 Decision Tree Induction 17.1 Introduction 17.2 Attribute selection measure 17.3 Tree Pruning 17.4 Extracting Classification Rules from Decision Trees 17.5 Bayesian Classification 17.6 Bayes

More information

Characterizing Pawlak s Approximation Operators

Characterizing Pawlak s Approximation Operators Characterizing Pawlak s Approximation Operators Victor W. Marek Department of Computer Science University of Kentucky Lexington, KY 40506-0046, USA To the memory of Zdzisław Pawlak, in recognition of his

More information

Journal of Computational Physics

Journal of Computational Physics Journal of Computational Physics 229 (21) 3884 3915 Contents lists available at ScienceDirect Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp An adaptive high-dimensional

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 4: Vector Data: Decision Tree Instructor: Yizhou Sun yzsun@cs.ucla.edu October 10, 2017 Methods to Learn Vector Data Set Data Sequence Data Text Data Classification Clustering

More information