
Information Sciences 184 (2012)

Contents lists available at SciVerse ScienceDirect

Modeling rough granular computing based on approximation spaces

Andrzej Skowron (a), Jarosław Stepaniuk (b), Roman Swiniarski (c,d)

(a) Institute of Mathematics, The University of Warsaw, Banacha 2, Warsaw, Poland
(b) Department of Computer Science, Białystok University of Technology, Wiejska 45A, Białystok, Poland
(c) Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182, USA
(d) Institute of Computer Science, Polish Academy of Sciences, Ordona 21, Warsaw, Poland

E-mail addresses: skowron@mimuw.edu.pl (A. Skowron), j.stepaniuk@pb.edu.pl (J. Stepaniuk), rswiniar@sciences.sdsu.edu (R. Swiniarski).

Article history: Available online 18 August 2011

Keywords: Rough sets; Approximation spaces; Granular computing; Concept; Interactions; Data models

Abstract

The results reported in this paper are a step toward rough set-based foundations of data mining and machine learning. The approach is based on calculi of approximation spaces. In this paper, we summarize and extend the results we have obtained since 2003, when we started investigating the foundations of the approximation of partially defined concepts (see, e.g., [2,3,5,7,20,21,37,38,39,40,42]). We discuss some issues important for modeling granular computations aimed at inducing compound granules relevant for solving problems such as the approximation of complex concepts or the selection of relevant actions (plans) for reaching target goals. The problems discussed in this article are crucial for building computer systems that assist researchers in scientific discoveries in many areas such as biology. In this paper, we present foundations for modeling granular computations inside a system based on granules called approximation spaces. Our approach is based on the rough set approach introduced by Pawlak [24,25]. Approximation spaces are fundamental granules used in searching for relevant complex granules called data models, e.g., approximations of complex concepts, functions, or relations. In particular, we discuss some issues related to generalizations of the approximation space introduced in [33,34]. We present examples of rough set-based strategies for the extension of approximation spaces from samples of objects onto the whole universe of objects. This makes it possible to present foundations for inducing data models, such as approximations of concepts or classifications, analogous to the approaches for inducing different types of classifiers known in machine learning and data mining. Searching for relevant approximation spaces and data models is formulated as a complex optimization problem. The proposed interactive, granular computing systems should be equipped with efficient heuristics that support searching for (semi-)optimal granules.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

We discuss some issues important for modeling granular computations aimed at inducing compound granules relevant for solving problems such as the approximation of complex concepts or the selection of relevant actions (plans) for reaching a target goal. The problems discussed in this article are crucial for building computer systems that assist researchers in scientific discoveries in many application areas such as biology.
These systems should be interactive, allowing users to perform such actions as communicating hypotheses or hints to the system in the search for complex granules, schemes of reasoning by analogy, or strategies for judging under uncertainty, or submitting to the system domain knowledge such as ontologies of concepts.

A more advanced cooperation of the user with the system can help to develop ontology approximation [5,7]. This approach was used in real-life projects related to medical decision support and therapy planning (see, e.g., [5]), control of UAVs (see, e.g., [4,5,20,50]), and sunspot classification [22]. The system should also allow users to receive messages from the system, e.g., on the progress of discovery.

In this paper, we present foundations for modeling granular computations inside a system based on granules called approximation spaces. Our approach is based on rough sets. Rough sets, due to Pawlak [24,25], can be represented by pairs of sets that give the lower and the upper approximation of the original sets. In the standard version of rough set theory, an approximation space is based on the indiscernibility equivalence relation. Approximation spaces belong to the broad spectrum of basic issues investigated in rough set theory (see, e.g., [2,3,7,10,28,29,33,34,37,38,41,42,45]). They can be treated as complex granules. The proposed interactive granular computing systems should be equipped with efficient heuristics that support the discovery of such (semi-)optimal granules.

Over the years, different aspects of approximation spaces have been investigated, and many generalizations of the approach based on the indiscernibility equivalence relation [26] have been proposed. In this paper, we discuss some aspects of the generalizations of approximation spaces investigated in [33,34,42] that are important for real-life applications, e.g., in searching for approximations of complex concepts (see, e.g., [5,7]). This is realized by searching for relevant approximation spaces in a given family of approximation spaces relative to some optimization criteria.

There are several components that are important in searching for relevant approximation spaces relative to approximated concepts or classifications. Among them are:

- neighborhoods (granules) of objects defined by features (attributes),
- inclusion measures making it possible to measure the degree of inclusion of neighborhoods in concepts,
- operations for the inductive extension of approximation spaces, allowing us to induce classifiers,
- optimization measures based on some version of the minimal description length (MDL) principle, used for measuring the quality of an approximation space relative to the approximated concepts,
- search strategies for (semi-)optimal approximation spaces relative to such measures.

Pawlak introduced rough sets [24,25] assuming that objects are perceived through the values of some attributes. Hence, information about objects may be incomplete. However, in machine learning and data mining [12] not only information about objects but also information about concepts is partial, e.g., a concept may be given on a sample of objects only. We propose to deal with this issue using extension operations defined over approximation spaces. An extension of a given approximation space AS is an approximation space defined on a larger universe of objects than the universe of AS, e.g., on a universe including not only the objects from a given sample but also so far unseen objects. Over the years, many strategies for inducing classifiers have been developed [12]. These strategies can be interpreted as search strategies for relevant extensions of approximation spaces.
The extension operations defined over approximation spaces can be treated as tools for inductive reasoning (for performing judgment [15]) on so far unseen objects. For example, an extension of an approximation space can be based, analogously to rule-based classifiers [12,26], on estimating the membership degrees of any newly classified object in the concepts of a classification, on the basis of information about how the object matches patterns covering the concepts (e.g., left-hand sides of decision rules).

The investigated approach enables us to present uniform foundations for inducing approximations of different kinds of higher order granules [27] such as concepts, classifications, or functions. In particular, we emphasize the fundamental role of approximation spaces in inducing diverse kinds of classifiers used in machine learning or data mining. Search problems for relevant approximation spaces and their extensions lead to optimization problems of high computational complexity. Hence, efficient heuristics should be used in searching for approximate solutions of these problems. These heuristics can be based on approximate Boolean reasoning (see, e.g., [21,26]) and/or biologically-inspired metaheuristics (see, e.g., [9]). Moreover, in hierarchical learning of complex concepts, many different approximation spaces should be discovered. Learning of such concepts can be supported by domain knowledge and ontology approximation (see, e.g., [5,7,14]).

Finally, let us mention the software platforms supporting the development of our projects, i.e., the Interactive Classification Engine (RoughICE) [47] and TunedIT [51]. RoughICE is a software platform supporting the approximation of spatio-temporal complex concepts in a given concept ontology acquired in dialogue with the user. RoughICE is freely available on the website [47]. The underlying algorithmic methods, especially for generating reducts and rules, discretization, and decomposition, are outgrowths of our previous tools such as RSES [48] and RSESlib [49]. The RoughICE software as well as the underlying computational methods have been successfully applied in different data mining projects (e.g., in mining traffic data and medical data; for details see [5,50] and the literature cited in [5]).

The TunedIT platform [51], launched recently by members of our research group, facilitates sharing, evaluation, and comparison of data-mining and machine-learning algorithms. The resources used in our experiments (algorithms and datasets, in particular) will be shared on the TunedIT website. This website already contains many publicly available datasets and algorithms, as well as performance data for nearly 100 algorithms tested on numerous datasets; these include the algorithms from the Weka and Rseslib libraries, and the datasets from the UCI Machine Learning Repository. Everyone can contribute new resources and results. TunedIT is composed of three complementary modules: TunedTester, Repository, and Knowledge Base. TunedIT may help researchers design repeatable experiments and generate reproducible results. It may be particularly useful

when conducting experiments intended for publication, as the reproducibility of experimental results is an essential factor that determines the research value of a paper. TunedIT also helps in the dissemination of new ideas and findings. Every researcher may upload implementations, datasets, and documents into the Repository, so that other users can find them easily and employ them in their own research.

This paper is organized as follows. In Section 2, we discuss the basic notions of our approach. In Section 3, we present a generalization of the approximation space definition from [33,34,42]. The rough set approach to inducing rule-based classifiers, k-NN classifiers, and function approximation is presented in Sections 3.1, 3.2 and 3.3, respectively. Relationships of rough granular computing based on approximation spaces and their extensions are discussed in Section 4. Searching for approximation spaces is performed in the set of approximation spaces generated from some atomic approximation spaces by applying operations on approximation spaces; these operations are investigated in Section 5. In the conclusions, we summarize the results of the paper and present some directions for further research.

2. Basic notions

2.1. Concept

Concepts in philosophy are the constituents of thoughts.¹ In this paper, concepts are represented as subsets of some universes of objects (denoted by U* or U** in Table 1). Different universes may be needed to represent different concepts. They are specified by partial information on finite samples (denoted by U in Table 1) of such universes. This partial information is recorded in data tables representing information systems [25,26]. Objects in data tables are labels of real objects (situations, states) perceived by agents (see Table 1), or they are sets of higher types constructed, e.g., in hierarchical modeling [40]. One should distinguish between real objects (perceived by agents) and objects from universes such as U, U*, U** used for concept representation.

2.2. Attributes, signatures of objects and two semantics

In [26] any attribute a is defined as a function from a universe of objects U into the set of attribute values V_a. However, in applications we expect that the value of any attribute should also be defined for objects from extensions of U, i.e., for new objects which can be perceived in the future.² The universe U is only a sample of possible objects. This requires some modification of the basic definitions of attribute and signature of objects [24-26].

One can give an interpretation of attributes using the concept of interaction. In this paper, information systems are used for representing results of interactions [40]. Interactions can be external or internal relative to a given agent ag. The results of external interactions of ag with environments are recorded by (activated) sensory attributes or by attributes used for storing the results of performed actions. The internal interactions of ag with its parts, such as knowledge bases, lead to the activation of sensory attributes or action attributes [40]. We treat attributes as granules and we consider their interactions with environments. If a is a given attribute and e denotes a state of the environment, then the result of the interaction between a and e is a pair (l_e, v), where l_e is a label of e (see Table 1) and v ∈ V_a.
Analogously, if IS = (U, A) is a given information system and e denotes a state of the environment, then by interaction of IS and e we obtain the information system IS' = (U ∪ {l_e}, A'), where A' = {a' : a ∈ A}, a'(u) = a(u) for u ∈ U, and a'(l_e) = v for some v ∈ V_a. Hence, information systems are dynamic objects created via interaction of already existing information systems with environments. Notice that the initial information system can be empty, i.e., its set of objects is empty. Moreover, let us observe that the elements of U are labels of environment states rather than the states themselves.

One can represent any attribute by a family of formulas and interpret the attribute as the result of interaction of this set with the environment. In this case, we assume that, for any attribute a under consideration, a relational structure R_a is given. Together with the simple structure (V_a, =) [26], some other relational structures R_a with the carrier V_a for a ∈ A and a signature τ are considered. We also assume that with any attribute a there is associated a set of generic formulas {α_i}_{i∈J} (where J is a set of indices) interpreted over R_a as subsets of V_a, i.e., ||α_i||_{R_a} = {v ∈ V_a : R_a, v ⊨ α_i}. Moreover, it is assumed that the set {||α_i||_{R_a}}_{i∈J} is a partition of V_a.

Perception of an object u by a given attribute a is represented by the selection of a formula α_i and a value v ∈ V_a such that v ∈ ||α_i||_{R_a}. Using an intuitive interpretation, one can say that such a pair (α_i, v) is selected from {α_i}_{i∈J} and V_a, respectively, as the result of a sensory measurement. We assume that, for a given set of attributes A and any object u, the signature of u relative to A is given by Inf_A(u) = {(a, (α^a_u, v)) : a ∈ A}, where (α^a_u, v) is the result of the sensory measurement by a on u. Let us observe that a triple (a, α^a_u, v) can be encoded by the atomic formula a = v with the interpretation

$$\|a = v\|_U = \{u \in U : (a, (\alpha^a_u, v)) \in \mathrm{Inf}_A(u) \text{ for some } \alpha^a_u\}. \quad (1)$$

For simplicity, we also write (a, v) instead of (a, (α^a_u, v)) if this does not lead to confusion. One can also consider a soft version of the attribute definition [43,44]. In this case, we assume that the semantics of the family {α_i}_{i∈J} is given by fuzzy membership functions for α_i, and the set of these functions defines a fuzzy partition [16].

¹ See
² Objects from U* are treated as labels of real perceived objects.
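For illustration only, the following Python sketch (ours, not from the paper) mimics the interaction view of information systems described above: an initially empty information system grows as labeled environment states and their sensory measurements are recorded, and signatures and the semantics of atomic formulas a = v can then be read off. All labels and values are hypothetical.

```python
# A minimal sketch (our illustration) of IS' = (U ∪ {l_e}, A') growing by
# interaction with an environment, as described above.

class InformationSystem:
    def __init__(self, attributes):
        self.attributes = list(attributes)   # the set A of attribute names
        self.rows = {}                       # object label -> value vector

    def interact(self, label, values):
        """Record the result of an interaction: a new labeled state l_e
        together with the sensory measurements a'(l_e) = v for a in A."""
        assert len(values) == len(self.attributes)
        self.rows[label] = dict(zip(self.attributes, values))

    def signature(self, label):
        """Inf_A(u): the pairs (a, v) recorded for object u (cf. Eq. (1))."""
        return set(self.rows[label].items())

    def semantics(self, attribute, value):
        """||a = v||_U: the objects of U whose signature contains (a, v)."""
        return {u for u, row in self.rows.items() if row[attribute] == value}

# Starting from the empty information system and adding labeled states:
IS = InformationSystem(['a', 'b', 'c'])
IS.interact('l1', (1, 1, 0))
IS.interact('l2', (0, 2, 0))
print(IS.signature('l1'))        # {('a', 1), ('b', 1), ('c', 0)}
print(IS.semantics('b', 2))      # {'l2'}
```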

Table 1. Notation used in this article.

Symbol | Interpretation
U | set of objects (universe, e.g., a sample of objects)
a | condition attribute over U (a : U → V_a)
V_a | set of values of attribute a ∈ A
A | set of condition attributes over U
IS | information system (U, A)
d | decision attribute over U (d : U → V_d)
DT | decision table (U, A ∪ {d})
card(U) | number of elements of U
P(U) | set of all subsets of U
||α||_U | semantics of a formula α over U
AS | approximation space
U* | extension of the sample U (U ⊆ U*)
U* \ U | set of testing objects
a* | extension of a from U to U* (a* : U* → V_a)
A* | set of condition attributes over U* (A* = {a* : a ∈ A})
R_+ | set of non-negative reals
U** | extension of U* by objects that differ in type from the objects of U (e.g., U** \ U* is a set of reals used for the construction of new granules over objects from U*)
I(x) | granule corresponding to x, e.g., a neighborhood of x or a family of neighborhoods of x
X ↾ U | restriction of X to U; in the simplest case the intersection X ∩ U, but if X ⊆ P(U*) then X ↾ U = {Y ∩ U : Y ∈ X}
Inf_A(u) | signature of u, representing the results of sensory measurements by the attributes from A on u; for each attribute a the result of the sensory measurement is recorded by a triple (a, (α^a_u, v)), where α^a_u is a formula selected from the language of a and v is a value satisfying α^a_u
l_e | label of a state e (real object, situation) perceived by a given agent ag; it is created by ag and used as an object in the information system constructed by ag; labels discern the recorded (usually in time) representations of states resulting from the interaction of ag with its environment (for more details see [40])

We construct granular formulas from atomic formulas corresponding to the considered attributes. As a consequence, the satisfiability of such formulas is defined once the satisfiability of atomic formulas is given as a result of sensory measurements. Hence, for any formula α constructed over atomic formulas, one can consider its semantics ||α||_U ⊆ U over U as well as the semantics ||α||_{U*} ⊆ U* over U*, where U ⊆ U* (see Fig. 1).

Fig. 1. Two semantics of α: over U and over U*, respectively.

The difference between these two cases is the following. In the case of U, one can compute ||α||_U, but in the case of ||α||_{U*}, for objects from U* \ U there is no direct information about their membership in ||α||_{U*}. One can estimate the satisfiability of α for objects u ∈ U* \ U only after the relevant attribute values on u are stored in the data table representing the information system (e.g., when the results of sensory measurements on u are stored in the data table). In particular, one can use some methods to estimate relationships among the semantics of formulas over U*, using the relationships among the semantics of the formulas over U. For example, one can apply statistical methods. This step is crucial in the investigation of extensions of approximation spaces relevant for inducing classifiers from data (see, e.g., [2,3,7,12,37,38]).

2.3. Uncertainty function

In [33,34,42], the uncertainty function I defines, for every object x from a given sample U of objects, a set of objects with descriptions similar to the description of x. The set I(x) is called the neighborhood of x.

Let P^ω(U*) = ⋃_{i≥1} P^i(U*), where P^1(U*) = P(U*) and P^{i+1}(U*) = P(P^i(U*)) for i ≥ 1. For example, if card(U*) = 2 and U* = {x_1, x_2}, then we obtain P^1(U*) = {∅, {x_1}, {x_2}, {x_1, x_2}} and P^2(U*) = {∅, {∅}, {{x_1}}, {{x_2}}, {{x_1, x_2}}, {∅, {x_1}}, {∅, {x_2}}, {∅, {x_1, x_2}}, ...}. If

card(U*) = n, where n is a positive natural number, then card(P^1(U*)) = 2^n and card(P^{i+1}(U*)) = 2^{card(P^i(U*))} for i ≥ 1. For example, card(P^3(U*)) = 2^{2^{2^n}}. Hence, we see that the levels of the powerset hierarchy are very rich, and fully automatic searching for relevant sets (structures) on such levels is not feasible. However, this approach allows us to present in a uniform way foundations for modeling granular computing aimed at inducing compound granules, from different levels of the powerset hierarchy, relevant for solving the target task, e.g., the approximation of complex concepts. For applications, it is necessary to restrict the search for relevant granules to relevant fragments of the powerset hierarchy. These fragments are defined by some sets of formulas. The discovery of such sets is often a big challenge.

In this paper, we consider uncertainty functions of the form I : U → P^ω(U*). The values of uncertainty functions are called granular neighborhoods. These granular neighborhoods are defined by so-called granular formulas. The values of such uncertainty functions are not necessarily from P(U*) but from P^ω(U*). In the following sections, we present more details on granular neighborhoods and granular formulas. Fig. 2 presents an illustrative example of an uncertainty function with values in P^2(U*) rather than in P(U*).

Fig. 2. Uncertainty function I : U → P^2(U*). The neighborhood of x ∈ U* \ U, where Inf_A(x) = {(a,1), (b,0), (c,2)}, does not contain training cases from U. The generalizations of this neighborhood described by the formulas ||a = 1 ∧ c = 2||_{U*} and ||a = 1 ∧ b = 0||_{U*} have non-empty intersections with U.

The generalization of neighborhoods discussed here is also motivated by the necessity of modeling or discovering complex structural objects in solving problems of pattern recognition, machine learning, or data mining. These structural objects (granules) can be defined as sets on higher levels of the powerset hierarchy. Among examples of such granules are indiscernibility or similarity classes of patterns or relational structures discovered in images, clusters of time windows, indiscernibility or similarity classes of sequences of time windows representing processes, and behavioral graphs (for more details see, e.g., [5,39] and also [19]).

If X ∈ P^ω(U*) and U ⊆ U*, then by X ↾ U we denote the set defined as follows: (i) if X ∈ P(U*) then X ↾ U = X ∩ U, and (ii) for any i ≥ 1, if X ∈ P^{i+1}(U*) then X ↾ U = {Y ↾ U : Y ∈ X}. For example, if U = {x_1}, U* = {x_1, x_2} and X = {{x_2}, {x_1, x_2}} (so X ∈ P^2(U*)), then X ↾ U = {Y ↾ U : Y ∈ X} = {Y ∩ U : Y ∈ X} = {∅, {x_1}}.

In this section, we discussed uncertainty functions assigning granules from P^2(U*) to objects from U. We assume that objects from U and U* are of atomic type, with the type defined by the attributes of a given information system IS = (U, A) over U. For example, the type of objects from U may be identified with Π_{a∈A} V_a. In Section 5.2, we present a method of defining information systems with objects of higher order type (structural objects). Such objects can be from different levels of P^ω(U*), i.e., they can belong to P^k(U*) for k > 2. Note that neighborhoods (e.g., indiscernibility classes) over such objects are sets of objects of higher order type from P^ω(U*).
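The following short Python sketch (ours) illustrates the two notions just introduced, the levels P^i(U*) of the powerset hierarchy and the restriction operation X ↾ U, on the two-element universe of the example above.

```python
# A small sketch (our illustration) of the powerset hierarchy P^i(U*) and
# the restriction operation X ↾ U.
from itertools import chain, combinations

def powerset(s):
    """P(s) as a set of frozensets."""
    s = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))}

def restrict(X, U):
    """X ↾ U: intersection with U at level 1, applied element-wise above."""
    if isinstance(X, frozenset) and all(not isinstance(e, frozenset) for e in X):
        return X & U                       # X ∈ P(U*): plain intersection
    return frozenset(restrict(Y, U) for Y in X)

U_star = frozenset({'x1', 'x2'})
P1 = powerset(U_star)                      # P^1(U*)
P2 = powerset(P1)                          # P^2(U*)
print(len(P1), len(P2))                    # 4 16, i.e., 2^2 and 2^(2^2)

X = frozenset({frozenset({'x2'}), frozenset({'x1', 'x2'})})   # X ∈ P^2(U*)
print(restrict(X, frozenset({'x1'})))      # {∅, {x1}}, as in the example
```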
2.4. Rough inclusion function

The second component of any approximation space is the rough inclusion function [34,42]. One can consider general constraints which rough inclusion functions should satisfy; in this section, we present only some examples of rough inclusion functions. The rough inclusion function ν : P(U) × P(U) → [0,1] defines the degree of inclusion of X in Y, where X, Y ⊆ U and U is a finite sample of objects. In the simplest case, the standard rough inclusion function ν_SRI can be defined by (see, e.g., [26,30,34,46]):

$$\nu_{SRI}(X, Y) = \begin{cases} \dfrac{card(X \cap Y)}{card(X)}, & \text{if } X \neq \emptyset, \\ 1, & \text{if } X = \emptyset. \end{cases} \quad (2)$$

Some illustrative examples are given in Table 2. It is important to note that an inclusion measure expressed in terms of the confidence measure, widely used in data mining, was considered long ago by Łukasiewicz [17] in studies on assigning fractional truth values to logical formulas.
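As a quick illustration, the following Python sketch (ours) implements the standard rough inclusion function of Eq. (2); the printed values match the rows of Table 2 below.

```python
# A one-function sketch (our illustration) of the standard rough inclusion
# function of Eq. (2), checked against the rows of Table 2.

def nu_SRI(X, Y):
    """Degree of inclusion of X in Y; equals 1 for empty X by convention."""
    X, Y = set(X), set(Y)
    return len(X & Y) / len(X) if X else 1.0

X = {'x1', 'x3', 'x7', 'x8'}
print(nu_SRI(X, {'x2', 'x4', 'x5', 'x6', 'x9'}))          # 0.0
print(nu_SRI(X, {'x1', 'x2', 'x4', 'x5', 'x6', 'x9'}))    # 0.25
print(nu_SRI(X, {'x1', 'x2', 'x3', 'x7', 'x8'}))          # 1.0
```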

Table 2. Illustration of the standard rough inclusion function.

X | Y | ν_SRI(X, Y)
{x_1, x_3, x_7, x_8} | {x_2, x_4, x_5, x_6, x_9} | 0
{x_1, x_3, x_7, x_8} | {x_1, x_2, x_4, x_5, x_6, x_9} | 0.25
{x_1, x_3, x_7, x_8} | {x_1, x_2, x_3, x_7, x_8} | 1

For a definition of the inclusion function for more general granules, e.g., partitions of objects, one can use a measure based on the positive region [26], entropy [12,13], or rough entropy [18,23]. Inclusion measures for more general granules were also investigated in [8,35]. However, more work in this direction is needed, especially on the inclusion of granules with complex structures, in particular granular neighborhoods.

3. Approximation spaces

In this section, we present a generalization of the approximation space definition from [33,34,42]. In applications, approximation spaces are constructed for a given concept, or for a family of concepts creating a partition (in the case of classification), rather than for the class of all concepts. Then searching for the components of an approximation space relevant for the concept approximation becomes feasible. A concept is given only on a sample U of objects. We often restrict the definition of the components of an approximation space to objects from U and/or some patterns from P^ω(U*) or from P^ω(U**), where U* ⊆ U**, necessary for the approximation of a given concept X only. The definitions of the uncertainty function and the inclusion function can be restricted to some subsets of U* and P^ω(U*) (or of U** and P^ω(U**)), respectively, which are relevant for the approximated concept(s). The set U* \ U can be treated as the set of testing objects. We assume that the types of objects from U and U* are the same, i.e., they are defined by Π_{a∈A} V_a, where A is the set of attributes of a given information system IS. In constructing granules relevant for the approximation of more compound granules such as functions, we use objects of different types, e.g., objects described by real values in addition to attribute value vectors from Π_{a∈A} V_a. The set of such objects or values creates the set U** \ U*. Moreover, the optimization requirements for the lower approximation and the upper approximation are then also restricted to the given concept X. These requirements express the closeness of the induced concept approximations to the approximation of the concept(s) on the given sample of objects, and they are combined with the description length of the constructed approximations to obtain the relevant quality measures.

Usually, the uncertainty functions and the rough inclusion functions are parameterized. Then searching (in the family of these functions defined by the possible values of parameters) for the (semi-)optimal uncertainty function and rough inclusion function relative to the selected quality measure becomes feasible.

Definition 1. An approximation space over a set of attributes A, for a concept represented by X ⊆ U* and given on a sample U ⊆ U* of objects, is a system

$$AS = (U, U^*, I, \nu, L), \quad (3)$$

where

- U is a sample of objects with known signatures relative to a given set of attributes A,
- L is a language of granular formulas defined over atomic formulas corresponding to generic formulas from signatures (see Section 2.2),
- the set U* is such that, for any object u ∈ U*, the signature Inf_A(u) of u relative to A can be obtained as the result of sensory measurements on u,
- I : U → P^ω(U*) is an uncertainty function, where U ⊆ U*; we assume that the granular neighborhood I(u) is computable from Inf_A(u), i.e., from Inf_A(u) it is possible to compute a formula α_{Inf_A(u)} ∈ L such that I(u) = ||α_{Inf_A(u)}||_{U*},
- ν : P^ω(U*) × P^ω(U*) → [0,1] is a partial rough inclusion function such that, for any x ∈ U, the value ν(I(x), X) is defined for the considered concept X.

In Section 3.3, we consider an uncertainty function whose values are families of patterns of the form V × D, with V ∈ P(U*) and D ∈ P(R_+), where R_+ is the set of non-negative reals. Hence, we assume that the values of the uncertainty function I may belong to the space of possible patterns from P^ω(U**), where U** = U* ∪ R_+. The partiality of the rough inclusion function makes it possible to define the values of this function on the patterns relevant for the approximation only.

We assume that the lower approximation operation LOW(AS, X) and the upper approximation operation UPP(AS, X) of the concept X in the approximation space AS satisfy the following condition:

$$\nu(LOW(AS, X), UPP(AS, X)) = 1. \quad (4)$$
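Definition 1 can be rendered schematically in code. The following sketch (our reading of the definition, with the language L left abstract) shows an approximation space as a structure of callable components, together with neighborhood-based lower and upper approximation operations. With an inclusion such as ν_SRI, these operations satisfy condition (4), since ν(I(x), X) = 1 implies ν(I(x), X) > 0, and hence LOW(AS, X) ⊆ UPP(AS, X).

```python
# A schematic rendering (our reading of Definition 1, not the authors' code)
# of an approximation space AS = (U, U*, I, ν, L) as a Python structure.
from dataclasses import dataclass
from typing import Any, Callable, Collection

@dataclass
class ApproximationSpace:
    U: Collection[Any]                   # sample with known signatures
    U_star: Collection[Any]              # extension of the sample
    I: Callable[[Any], Any]              # uncertainty function: x -> granular neighborhood
    nu: Callable[[Any, Any], float]      # (partial) rough inclusion function
    L: Callable[[Any], bool]             # membership test for granular formulas

    def lower(self, X):
        """LOW(AS, X): all x whose granular neighborhood is fully included in X."""
        return {x for x in self.U_star if self.nu(self.I(x), X) == 1}

    def upper(self, X):
        """UPP(AS, X): all x whose granular neighborhood overlaps X."""
        return {x for x in self.U_star if self.nu(self.I(x), X) > 0}
```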

Usually the uncertainty function and the rough inclusion function are parameterized. In this parameterized family of approximation spaces, one can search for an approximation space enabling us to approximate the concept X, restricted to a given sample U, with satisfactory quality. The quality of approximation can be expressed by some quality measures. For example, one can use the following criterion:

1. LOW(AS, X) ↾ U is included in X ↾ U to a degree at least deg, i.e., ν(LOW(AS, X) ↾ U, X ↾ U) ≥ deg,
2. X ↾ U is included in UPP(AS, X) ↾ U to a degree at least deg, i.e., ν(X ↾ U, UPP(AS, X) ↾ U) ≥ deg,

where deg is a given threshold from the interval [0,1]. The above conditions express the degree to which the induced approximations in AS are close to the concept X on the sample U. One can also introduce the description length of the induced approximations. A combination of these two measures can be used as the quality measure for the induced approximation space. Then the search problem for the relevant approximation space can be considered as an optimization problem relative to this quality measure. This approach may be interpreted as a form of the minimal description length principle [11,32]. The result of optimization can be checked against a testing sample. This enables us to estimate the quality of approximation. Note that further optimization can be performed relative to the parameters of the selected quality measure.

3.1. Approximations and rule based classifiers

In this section, we discuss the generation of approximations on extensions of samples of objects. In the example, we illustrate how the approximations of sets (concepts) can be estimated using only partial information about these sets. Moreover, the example introduces uncertainty functions with values in P^2(U*) and rough inclusion functions defined for families of subsets from P^2(U*).

Let us assume that DT = (U, A ∪ {d}) is a decision table, where U = {x_1, ..., x_9} is a set of objects and A = {a, b, c} is a set of condition attributes (see Table 3).

Table 3. Decision table over the set of objects U.

object | a | b | c | d
x_1 | 1 | 1 | 0 | 1
x_2 | 0 | 2 | 0 | 1
x_3 | 1 | 0 | 1 | 0
x_4 | 0 | 2 | 0 | 1
x_5 | 0 | 1 | 0 | 1
x_6 | 0 | 0 | 0 | 0
x_7 | 1 | 0 | 2 | 0
x_8 | 1 | 2 | 1 | 0
x_9 | 0 | 0 | 1 | 0

In DT we compute two decision reducts: {a, b} and {b, c}. We obtain the set Rule_set = {r_1, ..., r_12} of minimal (reduct-based) decision rules [26]. From x_1 we obtain two rules:

r_1: if a = 1 and b = 1 then d = 1,
r_2: if b = 1 and c = 0 then d = 1.

From x_2 and x_4 we obtain two rules:

r_3: if a = 0 and b = 2 then d = 1,
r_4: if b = 2 and c = 0 then d = 1.

From x_5 we obtain one new rule:

r_5: if a = 0 and b = 1 then d = 1.

From x_3 we obtain two rules:

r_6: if a = 1 and b = 0 then d = 0,
r_7: if b = 0 and c = 1 then d = 0.

From x_6 we obtain two rules:

r_8: if a = 0 and b = 0 then d = 0,
r_9: if b = 0 and c = 0 then d = 0.

From x_7 we obtain one new rule:

r_10: if b = 0 and c = 2 then d = 0.

From x_8 we obtain two rules:

r_11: if a = 1 and b = 2 then d = 0,
r_12: if b = 2 and c = 1 then d = 0.

Let U* = U ∪ {x_10, x_11, x_12, x_13, x_14} (see Table 4).

Table 4. Decision table over the new objects U* \ U = {x_10, ..., x_14}.

object | a | b | c | d* (votes of matched rules)
x_10 | 0 | 2 | 1 | 1 from r_3 or 0 from r_12
x_11 | 1 | 2 | 0 | 1 from r_4 or 0 from r_11
x_12 | 1 | 2 | 0 | 1 from r_4 or 0 from r_11
x_13 | 0 | 1 | 2 | 1 from r_5
x_14 | 1 | 1 | 2 | 1 from r_1

Let h : [0,1] → {0, 1/2, 1} be the function defined by

$$h(t) = \begin{cases} 1, & \text{if } t > \frac{1}{2}, \\ \frac{1}{2}, & \text{if } t = \frac{1}{2}, \\ 0, & \text{if } t < \frac{1}{2}. \end{cases} \quad (5)$$

Below we present an example of the uncertainty and rough inclusion functions:

$$I(x) = \{\|lh(r)\|_{U^*} : x \in \|lh(r)\|_{U^*} \text{ and } r \in Rule\_set\}, \quad (6)$$

where x ∈ U* and lh(r) denotes the formula on the left-hand side of the rule r, and

$$\nu_U(X, Z) = \begin{cases} h\!\left(\dfrac{card(\{Y \in X : Y \cap U \subseteq Z\})}{card(\{Y \in X : Y \cap U \subseteq Z\}) + card(\{Y \in X : Y \cap U \subseteq U^* \setminus Z\})}\right), & \text{if } X \neq \emptyset, \\ 0, & \text{if } X = \emptyset, \end{cases} \quad (7)$$

where X ⊆ P(U*) and Z ⊆ U*. The inclusion function defined in (7) can be explained as follows. Let us assume that X is a family of neighborhoods matching an object x, and let Z be a decision class. Then in (7) we have the ratio of the number of neighborhoods from X included (after restriction to the sample U) in the decision class Z to the number of neighborhoods from X included (after restriction to the sample U) either in Z or in the complement of Z. Notice that in these calculations we use only the information available on the sample U. Certainly, (7) presents one of many possible definitions of inclusion relevant for inducing rule-based classifiers.

The uncertainty and rough inclusion functions can now be used to define the lower approximation LOW(AS*, Z), the upper approximation UPP(AS*, Z), and the boundary region BN(AS*, Z) of Z ⊆ U* by:

$$LOW(AS^*, Z) = \{x \in U^* : \nu_U(I(x), Z) = 1\}, \quad (8)$$

and

$$UPP(AS^*, Z) = \{x \in U^* : \nu_U(I(x), Z) > 0\}, \quad (9)$$

$$BN(AS^*, Z) = UPP(AS^*, Z) \setminus LOW(AS^*, Z). \quad (10)$$

In the example, we classify objects from U* into the lower approximation of Z if the majority of the rules matching the object vote for Z, and into the upper approximation of Z if at least half of the rules matching the object vote for Z. Certainly, one can follow many other voting schemes developed in machine learning, or introduce less crisp conditions in the definition of the boundary region. The defined approximations can be treated as estimations of the exact approximations of subsets of U*, because the induced approximations are constructed on the basis of samples of subsets of U* restricted to U only.
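For illustration, the following Python sketch (ours, not the authors' software) recomputes the example: it encodes the rules r_1, ..., r_12 and the attribute vectors of Tables 3 and 4 as reconstructed above, implements Eqs. (5)-(10), and reproduces the approximations reported in Table 5 and Eqs. (11)-(13) below.

```python
# A sketch (our illustration) of the rule-based rough classifier of Section 3.1.

# Attribute vectors (a, b, c) over U* = U ∪ {x10, ..., x14}.
U_star = {
    'x1': (1, 1, 0), 'x2': (0, 2, 0), 'x3': (1, 0, 1), 'x4': (0, 2, 0),
    'x5': (0, 1, 0), 'x6': (0, 0, 0), 'x7': (1, 0, 2), 'x8': (1, 2, 1),
    'x9': (0, 0, 1), 'x10': (0, 2, 1), 'x11': (1, 2, 0), 'x12': (1, 2, 0),
    'x13': (0, 1, 2), 'x14': (1, 1, 2),
}
U = {f'x{i}' for i in range(1, 10)}      # training sample
C1 = {'x1', 'x2', 'x4', 'x5'}            # decision class d = 1 on the sample U

# Rules r1..r12 as ({attribute position: value}, decision); positions (a, b, c) = (0, 1, 2).
RULES = [({0: 1, 1: 1}, 1), ({1: 1, 2: 0}, 1), ({0: 0, 1: 2}, 1), ({1: 2, 2: 0}, 1),
         ({0: 0, 1: 1}, 1), ({0: 1, 1: 0}, 0), ({1: 0, 2: 1}, 0), ({0: 0, 1: 0}, 0),
         ({1: 0, 2: 0}, 0), ({1: 0, 2: 2}, 0), ({0: 1, 1: 2}, 0), ({1: 2, 2: 1}, 0)]

def lh(rule):
    """||lh(r)||_{U*}: objects of U* matching the left-hand side of r."""
    cond, _ = rule
    return frozenset(x for x, v in U_star.items()
                     if all(v[pos] == val for pos, val in cond.items()))

def I(x):
    """Uncertainty function of Eq. (6)."""
    return {lh(r) for r in RULES if x in lh(r)}

def h(t):  # Eq. (5)
    return 1 if t > 1 / 2 else (1 / 2 if t == 1 / 2 else 0)

def nu(X, Z):
    """Rough inclusion of Eq. (7); only information on the sample U is used."""
    if not X:
        return 0
    pro = sum(1 for Y in X if Y & U <= Z)
    con = sum(1 for Y in X if Y & U <= U - Z)
    return h(pro / (pro + con))   # rules are consistent on U, so pro + con >= 1

LOW = {x for x in U_star if nu(I(x), C1) == 1}
UPP = {x for x in U_star if nu(I(x), C1) > 0}
print(sorted(LOW))         # x1, x2, x4, x5, x13, x14 -- cf. Eq. (11)
print(sorted(UPP - LOW))   # x10, x11, x12 -- the boundary region of Eq. (13)
```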

One can use standard quality measures developed in machine learning to calculate the quality of such approximations, assuming that after the estimation of the approximations on U*, full information about the membership of objects in the approximated subsets of U* is uncovered on testing sets, analogously to the situation in machine learning.

Let C_1 = {x ∈ U* : d*(x) = 1} = {x_1, x_2, x_4, x_5, x_10, x_13, x_14}. We obtain the set U* \ C_1 = C_0 = {x_3, x_6, x_7, x_8, x_9, x_11, x_12}. The uncertainty function and the rough inclusion are presented in Table 5.

Table 5. Uncertainty function and rough inclusion over the set of objects U*.

x | I(x) | ν_U(I(x), C_1)
x_1 | {{x_1, x_14}, {x_1, x_5}} | h(2/2) = 1
x_2 | {{x_2, x_4, x_10}, {x_2, x_4, x_11, x_12}} | h(2/2) = 1
x_3 | {{x_3, x_7}, {x_3, x_9}} | h(0/2) = 0
x_4 | {{x_2, x_4, x_10}, {x_2, x_4, x_11, x_12}} | h(2/2) = 1
x_5 | {{x_5, x_13}, {x_1, x_5}} | h(2/2) = 1
x_6 | {{x_6, x_9}, {x_6}} | h(0/2) = 0
x_7 | {{x_3, x_7}, {x_7}} | h(0/2) = 0
x_8 | {{x_8, x_11, x_12}, {x_8, x_10}} | h(0/2) = 0
x_9 | {{x_6, x_9}, {x_3, x_9}} | h(0/2) = 0
x_10 | {{x_2, x_4, x_10}, {x_8, x_10}} | h(1/2) = 1/2
x_11 | {{x_8, x_11, x_12}, {x_2, x_4, x_11, x_12}} | h(1/2) = 1/2
x_12 | {{x_8, x_11, x_12}, {x_2, x_4, x_11, x_12}} | h(1/2) = 1/2
x_13 | {{x_5, x_13}} | h(1/1) = 1
x_14 | {{x_1, x_14}} | h(1/1) = 1

Thus, in our example from Table 5, we obtain

$$LOW(AS^*, C_1) = \{x \in U^* : \nu_U(I(x), C_1) = 1\} = \{x_1, x_2, x_4, x_5, x_{13}, x_{14}\}, \quad (11)$$

$$UPP(AS^*, C_1) = \{x \in U^* : \nu_U(I(x), C_1) > 0\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}\}, \quad (12)$$

$$BN(AS^*, C_1) = UPP(AS^*, C_1) \setminus LOW(AS^*, C_1) = \{x_{10}, x_{11}, x_{12}\}. \quad (13)$$

3.2. Approximations and nearest neighbor classifiers

In this section, we present a method for constructing rough set based classifiers based on the k-nearest neighbors idea. The k-nearest neighbors algorithm (k-NN, where k is a positive integer) is a method for classifying objects based on the k closest training examples in the attribute space [12]. An object is classified by a majority vote of its neighbors, the object being assigned to the decision class most common amongst its k nearest neighbors. If k = 1, then the object is simply assigned to the decision class of its nearest neighbor.

Let DT = (U, A ∪ {d}) be a decision table and let DT* = (U*, A* ∪ {d*}) be an extension of DT. We define NN_k : U* → P(INF(A)) by NN_k(x) = a set of k elements of INF(A) with the minimal distances to Inf_A(x), where INF(A) = {Inf_A(x) : x ∈ U}. The Hamming distance d_A^H(u, v) between two strings u, v ∈ Π_{a∈A} V_a of length card(A) is the number of positions at which the corresponding symbols differ. In our example, we use the normalized Hamming distance d_A : Π_{a∈A} V_a × Π_{a∈A} V_a → [0,1] defined by d_A(u, v) = d_A^H(u, v) / card(A).

The description of x_1 is Inf_A(x_1) = (1,1,0) ∈ INF(A)³ (see Table 3) and the description of x_14 is Inf_A(x_14) = (1,1,2) (see Table 4). Because each object is described by 3 condition attributes, the Hamming distance between Inf_A(x_1) = (1,1,0) and Inf_A(x_14) = (1,1,2) is 1, and the normalized Hamming distance is d_A((1,1,0), (1,1,2)) = 1/3.

We define⁴

$$I_{NN_k}(x) = \{\|\textstyle\bigwedge \mathrm{Inf}_A(y)\|_{U^*} : y \in U \text{ and } \mathrm{Inf}_A(y) \in NN_k(x)\}, \quad (14)$$

$$\nu_{NN_k}(X, Y) = \frac{card\left(\bigcup \{Z \cap U : Z \in X \ \& \ Z \cap U \subseteq Y \cap U\}\right)}{card(U)}, \quad (15)$$

where X is a family of pairwise disjoint subsets of U*. In Eq. (14), we consider the family of neighborhoods of x defined by the k objects from U closest to x (i.e., by the set NN_k(x)). The inclusion degree in (15) is equal to the ratio of the number of objects from the sample U matched by those neighborhoods from I_{NN_k}(x) that are included in Y on the sample, to the number of objects in the sample U. Following this explanation and Eqs. (16) and (17), one can define approximations on extensions of the sample U (see (18) and (19)). One can easily see the close analogy to the classification strategy of k-nearest neighbor classifiers.
Let J_ε : U* → P({d(x) : x ∈ U}), for 0 < ε ≤ 1, be defined by

$$J_\varepsilon(x) = \left\{ i : \neg \exists j \neq i \left( \nu_{NN_k}(I_{NN_k}(x), C_j) > \nu_{NN_k}(I_{NN_k}(x), C_i) + \varepsilon \right) \right\}, \quad (16)$$

and

$$\nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) = \begin{cases} 1, & \text{if } J_\varepsilon(x) = \{i\}, \\ \frac{1}{2}, & \text{if } i \in J_\varepsilon(x) \ \& \ card(J_\varepsilon(x)) > 1, \\ 0, & \text{if } i \notin J_\varepsilon(x). \end{cases} \quad (17)$$

The defined uncertainty function I_{NN_k} and rough inclusion function ν^ε_{NN_k} can now be used to define the lower approximation LOW(AS*, C_i), the upper approximation UPP(AS*, C_i), and the boundary region BN(AS*, C_i) of C_i ⊆ U* by:

$$LOW(AS^*, C_i) = \{x \in U^* : \nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) = 1\}, \quad (18)$$

$$UPP(AS^*, C_i) = \{x \in U^* : \nu^{\varepsilon}_{NN_k}(I_{NN_k}(x), C_i) > 0\}, \quad (19)$$

and

$$BN(AS^*, C_i) = UPP(AS^*, C_i) \setminus LOW(AS^*, C_i). \quad (20)$$

³ We write Inf_A(x_1) = (1,1,0) ∈ INF(A) instead of Inf_A(x_1) = {(a,1), (b,1), (c,0)}.
⁴ ||⋀ Inf_A(y)||_{U*} denotes the set of all objects from U* satisfying the conjunction of all descriptors a = a(y) for a ∈ A.
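Continuing the sketch above (and reusing U_star, U, and C1 from it), the following code implements Eqs. (14)-(20) for k = 2 and ε = 0.1. Note that the paper breaks distance ties in NN_k at random, while the sketch breaks them deterministically, so objects with tied neighbors may be classified differently from the run reported in Table 6 below.

```python
# A sketch (our illustration) of the k-NN-based approximations of Section 3.2.

def hamming(u, v):
    """Normalized Hamming distance d_A of Section 3.2."""
    return sum(p != q for p, q in zip(u, v)) / len(u)

INF_A = {U_star[x] for x in U}        # INF(A): signatures of training objects
CLASSES = {1: C1, 0: U - C1}          # decision classes on the sample U

def NN(x, k=2):
    # k signatures of INF(A) closest to Inf_A(x); deterministic tie-breaking.
    return sorted(INF_A, key=lambda s: (hamming(s, U_star[x]), s))[:k]

def I_NN(x, k=2):
    """Eq. (14): semantics over U* of the k nearest training signatures."""
    return {frozenset(z for z, v in U_star.items() if v == s) for s in NN(x, k)}

def nu_NN(X, Yi):
    """Eq. (15): share of U covered by neighborhoods included (on U) in Yi."""
    return len(set().union(*(Z & U for Z in X if Z & U <= Yi))) / len(U)

def nu_eps(x, i, k=2, eps=0.1):
    """Eqs. (16)-(17)."""
    nus = {j: nu_NN(I_NN(x, k), CLASSES[j]) for j in CLASSES}
    J = {j for j in nus if not any(nus[m] > nus[j] + eps for m in nus if m != j)}
    return 1 if J == {i} else (1 / 2 if i in J else 0)

LOW = {x for x in U_star if nu_eps(x, 1) == 1}   # cf. Eq. (21)
UPP = {x for x in U_star if nu_eps(x, 1) > 0}    # cf. Eq. (22)
# With this deterministic tie-breaking, x5 and x13 fall into the boundary
# region; with the random choice used in the paper they land in the lower
# approximation of Eq. (21).
```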

Let k = 2 and ε = 0.1; in our example we obtain the results presented in Table 6. The neighbors are taken from the set U of objects for which the correct classification is known. In the classification phase, a new object is classified by assigning the decision class which is most frequent among the 2 training objects nearest to that new object. In the case of more than two nearest objects, we choose two of them randomly.

Table 6. Uncertainty function I_NN2 and rough inclusion ν^0.1_NN2 over the set of objects U* \ U = {x_10, ..., x_14}.

x | NN_2(x) | I_NN2(x) | ν_NN2(I_NN2(x), C_1)
x_10 | {(0,2,0), (1,2,1)} | {{x_2, x_4}, {x_8}, {x_10}} | 2/9
x_11 | {(1,1,0), (0,2,0)} | {{x_1}, {x_2, x_4}, {x_11, x_12}} | 3/9
x_12 | {(1,1,0), (0,2,0)} | {{x_1}, {x_2, x_4}, {x_11, x_12}} | 3/9
x_13 | {(0,1,0), (1,1,0)} | {{x_5}, {x_1}} | 2/9
x_14 | {(1,1,0), (1,0,2)} | {{x_1}, {x_7}} | 1/9

x | ν_NN2(I_NN2(x), C_0) | J_0.1(x) | ν^0.1_NN2(I_NN2(x), C_1)
x_10 | 1/9 | {1} | 1
x_11 | 0 | {1} | 1
x_12 | 0 | {1} | 1
x_13 | 0 | {1} | 1
x_14 | 1/9 | {0, 1} | 1/2

Thus, in our example from Table 6, we obtain

$$LOW(AS^*, C_1) = \{x \in U^* : \nu^{0.1}_{NN_2}(I_{NN_2}(x), C_1) = 1\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}\}, \quad (21)$$

$$UPP(AS^*, C_1) = \{x \in U^* : \nu^{0.1}_{NN_2}(I_{NN_2}(x), C_1) > 0\} = \{x_1, x_2, x_4, x_5, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}\}, \quad (22)$$

$$BN(AS^*, C_1) = UPP(AS^*, C_1) \setminus LOW(AS^*, C_1) = \{x_{14}\}. \quad (23)$$

3.3. Function approximations

In this subsection, we discuss the rough set approach to function approximation from available incomplete data. Our approach can be treated as a kind of rough clustering of functional data [31]. Let us consider an example of function approximation. We assume that only partial information about a function is available; this means that some points from the graph of the function are known.

Before presenting a more formal description of function approximation, we introduce some notation. A function f : U → R_+ will be called a sample of a function f* : U* → R_+, where R_+ is the set of non-negative reals and U ⊆ U* is a finite subset of U*, if f* is an extension of f. By Gf (Gf*) we denote the graph of f (f*), respectively, i.e., the set {(x, f(x)) : x ∈ U} ({(x, f*(x)) : x ∈ U*}). For any Z ⊆ U* × R_+, by π_1(Z) and π_2(Z) we denote the sets {x ∈ U* : ∃y ∈ R_+ (x, y) ∈ Z} and {y ∈ R_+ : ∃x ∈ U* (x, y) ∈ Z}, respectively.

First, we define approximations of Gf given on a sample U of objects, and next we show how to induce approximations of Gf* over U*, i.e., on an extension of U. Let D be a partition of f(U) into sets of reals of diameter less than δ > 0, where δ is a given threshold. We also assume that IS = (U, A) is a given information system. Let us also assume that to any object signature Inf_A(x) = {(a, a(x)) : a ∈ A} [26] there is assigned an interval of non-negative reals with diameter less than δ. We denote this interval by D_{Inf_A(x)}. Hence, D = {D_{Inf_A(x)} : x ∈ U}. We consider an approximation space AS_{IS,D} = (U, I, ν) (relative to the given IS and D), where

$$I(x) = [x]_{IND(A)} \times D_{\mathrm{Inf}_A(x)}, \quad (24)$$

and

$$\nu(X, Y) = \begin{cases} \dfrac{card(\pi_1(X \cap Y))}{card(\pi_1(X))}, & \text{if } X \neq \emptyset, \\ 1, & \text{if } X = \emptyset, \end{cases} \quad (25)$$

for X, Y ⊆ U × R_+. The lower approximation and the upper approximation of Gf in AS_{IS,D} are defined by

$$LOW(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) = 1\}, \quad (26)$$

and

$$UPP(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) > 0\}, \quad (27)$$

respectively. Observe that this definition is different from the standard definition of the lower approximation [26]. The defined approximation space is a bit more general than in [34]; e.g., the values of the uncertainty function are subsets of U × R_+ instead of U. Moreover, one can easily see that by applying the standard definition of relation approximation to f [26] (a function is a special case of a relation), the lower approximation of a function is almost always equal to the empty set. The new definition makes it possible to express better the fact that a given neighborhood matches the graph of f well [37,42]. For expressing this, the classical set-theoretical inclusion of a neighborhood in the graph of f is not satisfactory.

Example 1. We present a first illustrative example of function approximation. Let f : U → R_+, where U = {1,2,3,4,5,6}. Let f(1) = 3, f(2) = 2, f(3) = 2, f(4) = 5, f(5) = 5 and f(6) = 2. Let IS = (U, A) be an information system, where A = {a} and

$$a(x) = \begin{cases} 0, & \text{if } 0 \le x \le 2, \\ 1, & \text{if } 2 < x \le 4, \\ 2, & \text{if } 4 < x \le 6. \end{cases} \quad (28)$$

Thus the partition U/IND(A) = {{1,2}, {3,4}, {5,6}}. The graph of f is Gf = {(x, f(x)) : x ∈ U} = {(1,3), (2,2), (3,2), (4,5), (5,5), (6,2)}. We define approximations of Gf given on the sample U of objects. We obtain f(U) = {2,3,5}, and let D = {{2,3}, {5}} be a partition of f(U). We consider an approximation space AS_{IS,D} = (U, I, ν) (relative to the given IS and D), where

$$I(x) = [x]_{IND(A)} \times D_{\mathrm{Inf}_A(x)} \quad (29)$$

is defined by

$$I(x) = \begin{cases} \{1,2\} \times [1.5, 4], & \text{if } x \in \{1,2\}, \\ \{3,4\} \times [1.7, 4.5], & \text{if } x \in \{3,4\}, \\ \{5,6\} \times [3, 4], & \text{if } x \in \{5,6\}. \end{cases} \quad (30)$$

We obtain the lower approximation and the upper approximation of Gf in the approximation space AS_{IS,D}:

$$LOW(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) = 1\} = I(1) \cup I(2) = \{1,2\} \times [1.5, 4], \quad (31)$$

and

$$UPP(AS_{IS,D}, Gf) = \bigcup \{I(x) : \nu(I(x), Gf) > 0\} = I(1) \cup I(2) \cup I(3) \cup I(4) = \{1,2\} \times [1.5, 4] \cup \{3,4\} \times [1.7, 4.5], \quad (32)$$

respectively.
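The following Python sketch (ours) recomputes Example 1: it encodes the indiscernibility classes and the intervals of Eq. (30) and evaluates the inclusion degrees of Eq. (25), recovering the approximations of Eqs. (31)-(32).

```python
# A short sketch (our illustration) recomputing Example 1.

f = {1: 3, 2: 2, 3: 2, 4: 5, 5: 5, 6: 2}
Gf = set(f.items())

# Indiscernibility classes of IND(A) and the intervals D_{Inf_A(x)} of Eq. (30).
blocks = {
    frozenset({1, 2}): (1.5, 4.0),
    frozenset({3, 4}): (1.7, 4.5),
    frozenset({5, 6}): (3.0, 4.0),
}

def I(x):
    """I(x) = [x]_IND(A) x D_{Inf_A(x)}, represented as (class, interval)."""
    for block, interval in blocks.items():
        if x in block:
            return block, interval

def nu(Ix, G):
    """Eq. (25): fraction of the class whose sample points fall in the interval."""
    block, (lo, hi) = Ix
    hits = {x for (x, y) in G if x in block and lo <= y <= hi}
    return len(hits) / len(block)

low = {I(x) for x in f if nu(I(x), Gf) == 1}
upp = {I(x) for x in f if nu(I(x), Gf) > 0}
# low covers {1,2} x [1.5,4] only; upp additionally covers {3,4} x [1.7,4.5],
# matching Eqs. (31)-(32); the class {5,6} has nu = 0 and is excluded.
```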
Example 2. We present a second illustrative example of function approximation. First, let us recall that an interval is a set of real numbers with the property that any number lying between two numbers in the set is also included in the set. The closed interval of numbers between v and w (v, w ∈ R_+), including v and w, is denoted by [v, w].

Let us consider a function f : R_+ → R_+. We have only partial information about this function, given by G_i = d(x_i) × e(f(x_i)), where d(x_i) denotes a closed interval of reals to which x_i belongs, e(f(x_i)) denotes a closed interval of reals to which f(x_i) belongs, and i = 1, ..., n. The family {G_1, ..., G_n} is called a partial information about the graph Gf = {(x, f(x)) : x ∈ R_+}. Let Nh denote a family of elements of P(R_+) × P(R_+) called neighborhoods. In our example, we consider Nh = {X_1, X_2, X_3}, where X_1 = [1,6] × [0.1,0.4], X_2 = [7,12] × [1.1,1.4], and X_3 = [13,18] × [2.1,2.4]. Let us recall the definition of inclusion between closed intervals:

$$[v_1, w_1] \subseteq [v_2, w_2] \iff v_2 \le v_1 \ \& \ w_1 \le w_2. \quad (33)$$

We define a new rough inclusion function (where the first applicable case is taken) by

$$\nu(X, \{G_1, \ldots, G_n\}) = \begin{cases} 1, & \text{if } \forall_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \rightarrow G_i \subseteq X \right), \\ \frac{1}{2}, & \text{if } \exists_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \ \& \ G_i \cap X \neq \emptyset \right), \\ 0, & \text{if } \forall_{i \in \{1,\ldots,n\}} \left( \pi_1(G_i) \cap \pi_1(X) \neq \emptyset \rightarrow G_i \cap X = \emptyset \right). \end{cases} \quad (34)$$

Let sample values of a function f : R_+ → R_+ be given in Table 7, and let an approximation space AS = (R_+, Nh, ν) be given. We define the lower and upper approximations as follows:

$$LOW(AS, \{G_1, \ldots, G_n\}) = \bigcup \{X \in Nh : \nu(X, \{G_1, \ldots, G_n\}) = 1\}, \quad (35)$$

$$UPP(AS, \{G_1, \ldots, G_n\}) = \bigcup \{X \in Nh : \nu(X, \{G_1, \ldots, G_n\}) > 0\}. \quad (36)$$

Table 7. Sample values of a function f: the intervals d(x_i), e(f(x_i)) and the boxes G_i = d(x_i) × e(f(x_i)).

i | d(x_i) | e(f(x_i)) | G_i | π_1(G_i)
1 | [1,2] | [0.5,0.6] | [1,2] × [0.5,0.6] | [1,2]
2 | [3,4] | [0.6,0.6] | [3,4] × [0.6,0.6] | [3,4]
3 | [5,6] | [0,0.5] | [5,6] × [0,0.5] | [5,6]
4 | [7,8] | [1.3,1.4] | [7,8] × [1.3,1.4] | [7,8]
5 | [9,10] | [1.1,1.2] | [9,10] × [1.1,1.2] | [9,10]
6 | [11,12] | [1.3,1.4] | [11,12] × [1.3,1.4] | [11,12]
7 | [13,14] | [2.2,2.3] | [13,14] × [2.2,2.3] | [13,14]
8 | [15,16] | [2,2.05] | [15,16] × [2,2.05] | [15,16]
9 | [17,18] | [2.45,2.5] | [17,18] × [2.45,2.5] | [17,18]

In our example, we obtain

$$LOW(AS, \{G_1, \ldots, G_9\}) = X_2 = [7,12] \times [1.1,1.4], \quad (37)$$

$$UPP(AS, \{G_1, \ldots, G_9\}) = X_2 \cup X_3 = [7,12] \times [1.1,1.4] \cup [13,18] \times [2.1,2.4]. \quad (38)$$
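A sketch (ours) of Example 2 follows: boxes are represented as pairs of closed intervals, and the inclusion function of Eq. (34) is evaluated over the neighborhoods X_1, X_2, X_3 and the patterns G_1, ..., G_9 of Table 7. It recovers ν(X_2) = 1 and ν(X_3) = 1/2 as in Eqs. (37)-(38); note that under a fully literal reading of Eq. (34) the box G_3 = [5,6] × [0,0.5] also meets X_1 in both coordinates, so the sketch assigns ν(X_1) = 1/2, while the paper reports X_1 outside the upper approximation.

```python
# A sketch (our illustration) of Example 2; boxes are pairs of closed intervals.

G = [((1, 2), (0.5, 0.6)), ((3, 4), (0.6, 0.6)), ((5, 6), (0.0, 0.5)),
     ((7, 8), (1.3, 1.4)), ((9, 10), (1.1, 1.2)), ((11, 12), (1.3, 1.4)),
     ((13, 14), (2.2, 2.3)), ((15, 16), (2.0, 2.05)), ((17, 18), (2.45, 2.5))]
Nh = {'X1': ((1, 6), (0.1, 0.4)),
      'X2': ((7, 12), (1.1, 1.4)),
      'X3': ((13, 18), (2.1, 2.4))}

def overlaps(i, j):
    """Closed intervals i and j intersect."""
    return max(i[0], j[0]) <= min(i[1], j[1])

def contained(i, j):
    """Eq. (33): i is a subinterval of j."""
    return j[0] <= i[0] and i[1] <= j[1]

def nu(X, Gs):
    """Eq. (34); the three cases are checked in the order listed."""
    relevant = [g for g in Gs if overlaps(g[0], X[0])]   # π1(Gi) ∩ π1(X) nonempty
    if all(contained(g[0], X[0]) and contained(g[1], X[1]) for g in relevant):
        return 1.0
    if any(overlaps(g[1], X[1]) for g in relevant):      # Gi ∩ X nonempty as boxes
        return 0.5
    return 0.0

for name, X in Nh.items():
    print(name, nu(X, G))
# X2 -> 1.0 and X3 -> 0.5, as in Eqs. (37)-(38). X1 -> 0.5 here, because the
# value interval [0, 0.5] of G3 touches [0.1, 0.4]; the paper reports
# nu(X1) = 0, i.e., it treats G3 and X1 as disjoint.
```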

The above defined approximations are approximations over the set of objects from the sample U ⊆ U*. Now, we present an approach to inducing approximations of the graph Gf* of a function f* on U*, i.e., on an extension of U. We use an illustrative example to present the approach. It is worth mentioning that, by using Boolean reasoning [26], one can generate patterns described by conjunctions of descriptors over IS such that the deviation of f on such patterns in U is less than a given threshold δ. This means that, for any such formula α, the set f(||α||_U) has diameter less than δ, i.e., the image of ||α||_U, that is the set f(||α||_U), is included in [y - δ/2, y + δ/2) for some y ∈ R_+. Moreover, one can generate minimal such patterns, i.e., formulas α having the above property such that no formula obtained by dropping some descriptors from α has this property [6,26]. By PATTERN(A, f, δ) we denote a set of induced patterns with the above properties. One can also assume⁵ that PATTERN(A, f, δ) is extended by adding some shortenings of minimal patterns. To any pattern α from PATTERN(A, f, δ) there is assigned an interval of reals D_α such that the deviation of f on ||α||_U is within D_α, i.e., f(||α||_U) ⊆ D_α. Note that, for any Boolean combination α of descriptors over A, its semantics ||α||_{U*} over U* is also well defined. However, only information about the part of ||α||_{U*} equal to ||α||_U = ||α||_{U*} ∩ U is available. Assuming that the patterns from PATTERN(A, f, δ) are strong (i.e., their support is sufficiently large), one may induce that the following inclusion holds:

$$f^*(\|\alpha\|_{U^*}) \subseteq [y - \delta/2, y + \delta/2). \quad (39)$$

We can now define a generalized approximation space making it possible to extend the approximation of Gf = {(x, f(x)) : x ∈ U} over the previously defined approximation space AS to an approximation of Gf* = {(x, f*(x)) : x ∈ U*}, where U ⊆ U*. Let us consider a generalized approximation space

$$AS^* = (U, U^*, I^*, \nu^*_{tr}, L), \quad (40)$$

where

- tr is a given threshold from the interval [0, 0.5),
- L is a language of Boolean combinations of descriptors over the information system IS [26], used for the construction of patterns from the set PATTERN(A, f, δ),
- I*(x) = {||α||_{U*} × D_α : α ∈ PATTERN(A, f, δ) & x ∈ ||α||_{U*}} for x ∈ U*, where U* is an extension of the sample U, i.e., U ⊆ U*,
- for any finite family X ⊆ P(U*) × I, where P(U*) is the powerset of U* and I is a family of intervals of reals of diameter less than δ, and for any Y ⊆ U* × R_+ representing the graph of a function from U* into R_+,

$$\nu^*_{tr}(X, Y) = \begin{cases} 1, & \text{if } Max < tr, \\ \frac{1}{2}, & \text{if } tr \le Max < 1 - tr, \\ 0, & \text{if } Max \ge 1 - tr, \end{cases} \quad (41)$$

where

1. $Max = \max\left\{ \dfrac{|y^* - mid(\pi_2(Z))|}{\max\{y^*, mid(\pi_2(Z))\}} : Z \in X \ \& \ \nu_U(Z, Y) > 0 \right\}$,
2. ν_U(Z, Y) = ν((π_1(Z) ∩ U) × π_2(Z), Y ∩ (U × R_+)), where ν is defined by Eq. (25),
3. mid(D) = (a + b)/2, where D = [a, b),

⁵ Analogously to the shortening of decision rules [26].

and

$$y^* = \frac{1}{c} \sum_{Z \in X : \nu_U(Z, Y) > 0} mid(\pi_2(Z)) \cdot card(\pi_1(Z \cap Y) \cap U), \quad (42)$$

where

$$c = card\left( \bigcup_{Z \in X : \nu_U(Z, Y) > 0} \pi_1(Z) \cap U \right). \quad (43)$$

The lower approximation of Gf* is defined by

$$LOW^*(AS^*, Gf^*) = \{(x, y) : \nu^*_{tr}(I^*(x), Gf^*) = 1 \ \& \ x \in U^* \ \& \ y \in [y^* - \delta/2, y^* + \delta/2)\}, \quad (44)$$

where y* is obtained from Eq. (42) in which X is substituted by I*(x) and Y by Gf*, respectively. The upper approximation of Gf* is defined by

$$UPP^*(AS^*, Gf^*) = \{(x, y) : \nu^*_{tr}(I^*(x), Gf^*) > 0 \ \& \ x \in U^* \ \& \ y \in [y^* - \delta/2, y^* + \delta/2)\}, \quad (45)$$

where y* is obtained analogously. Let us observe that, for x ∈ U*, the condition (x, y) ∉ UPP*(AS*, Gf*) means that either ν*_tr(I*(x), Gf*) = 0 and y ∈ [y* - δ/2, y* + δ/2), or y ∉ [y* - δ/2, y* + δ/2). The first condition describes the set of all pairs (x, y) where the deviation of y from y* is small (relative to δ) but the prediction of y on the set of patterns I*(x) is very risky. The values of f* can be induced by

$$\hat{f}(x) = \begin{cases} [y^* - \delta/2, y^* + \delta/2), & \text{if } \nu^*_{tr}(I^*(x), Gf^*) > 0, \\ \text{undefined}, & \text{otherwise}, \end{cases} \quad (46)$$

where x ∈ U* \ U and y* is obtained from Eq. (42) in which X is substituted by I*(x) and Y by Gf*, respectively.

Let us now explain the formal definitions presented above. The value of the uncertainty function I*(x) for a given object x consists of all patterns of the form ||α||_{U*} × D_α such that ||α||_{U*} is matched by the object x. The condition x ∈ ||α||_{U*} can be verified by checking whether the A-signature of x, i.e., Inf_A(x), matches α (to a satisfactory degree). The deviation of f* on ||α||_{U*} is bounded by the interval D_α of reals. The degree to which Z is included in Y is estimated by ν_U(Z, Y), i.e., by the degree to which the projection of the pattern Z restricted to U is included in Y projected on U. The estimated value of f*(x) belongs to the interval [y* - δ/2, y* + δ/2) obtained by fusion of the centers of the intervals assigned to the patterns from X. In this fusion, the weights of these centers reflect the strength on U of the patterns matching Y to a positive degree. The result of the fusion is normalized by c. The degree to which a family of patterns X is included in Y is measured by the deviation of the value y* from the centers of the intervals of the patterns Z matching Y to a positive degree (i.e., with ν_U(Z, Y) > 0).

Fig. 3 illustrates the idea of the presented definition of y*, where:

- Z_i = ||α_i||_{U*} × D_{α_i} for i = 1, 2, 3, and I*(x) = {Z_1, Z_2, Z_3},
- the horizontal bold lines illustrate the projections of the sets Z_i (i = 1, 2, 3) on U*,
- the vertical bold lines illustrate the projections of the sets Z_i (i = 1, 2, 3) on R_+,
- y* = (1/c) Σ_{t=1}^{3} mid(D_{α_t}) · card(||α_t||_U), where c is defined by Eq. (43),
- ν_U(Z_i, Gf*) > 0 for i = 1, 2, 3, because (x_1, f(x_1)) ∈ Z_1 and (x_2, f(x_2)) ∈ Z_2 ∩ Z_3 for x_1, x_2 ∈ U.

Fig. 3. Inducing the value y*.
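Finally, a compact sketch (our reading of Eqs. (42)-(43), on hypothetical patterns and sample values) of the fused prediction y*: a weighted mean of the midpoints of the value intervals of the patterns matching the graph on the sample, with weights given by the number of sample points each pattern matches inside its interval, normalized by c.

```python
# A sketch (our illustration) of the fusion y* of Eqs. (42)-(43). The pattern
# and sample data below are hypothetical; it assumes at least one pattern
# matches the graph on the sample.

def y_star(patterns, graph_on_U):
    """patterns: list of (objects_on_U, (lo, hi)) pairs, i.e., π1(Z) ∩ U and π2(Z);
    graph_on_U: dict x -> f(x) on the sample U."""
    matching = []
    for objs, (lo, hi) in patterns:
        # ν_U(Z, Gf) > 0: some sample point of the pattern falls in its interval.
        hits = {x for x in objs if lo <= graph_on_U.get(x, float('nan')) <= hi}
        if hits:
            matching.append((objs, (lo + hi) / 2, len(hits)))
    c = len(set().union(*(objs for objs, _, _ in matching)))        # Eq. (43)
    return sum(mid * weight for _, mid, weight in matching) / c     # Eq. (42)

# Hypothetical: three patterns supported on a six-point sample of f.
f_sample = {1: 3.0, 2: 2.0, 3: 2.2, 4: 2.4, 5: 5.0, 6: 2.1}
Z = [({1, 2, 3}, (1.8, 3.2)), ({3, 4}, (2.0, 2.6)), ({5, 6}, (4.8, 5.2))]
print(y_star(Z, f_sample))   # (2.5*3 + 2.3*2 + 5.0*1) / 6 = 2.85
```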


More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) This is a sample cover image for this issue. The actual cover is not yet available at this time.) This article appeared in a journal published by Elsevier. The attached copy is furnished to the author

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

The size of decision table can be understood in terms of both cardinality of A, denoted by card (A), and the number of equivalence classes of IND (A),

The size of decision table can be understood in terms of both cardinality of A, denoted by card (A), and the number of equivalence classes of IND (A), Attribute Set Decomposition of Decision Tables Dominik Slezak Warsaw University Banacha 2, 02-097 Warsaw Phone: +48 (22) 658-34-49 Fax: +48 (22) 658-34-48 Email: slezak@alfa.mimuw.edu.pl ABSTRACT: Approach

More information

Rough sets: Some extensions

Rough sets: Some extensions Information Sciences 177 (2007) 28 40 www.elsevier.com/locate/ins Rough sets: Some extensions Zdzisław Pawlak, Andrzej Skowron * Institute of Mathematics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland

More information

A PRIMER ON ROUGH SETS:

A PRIMER ON ROUGH SETS: A PRIMER ON ROUGH SETS: A NEW APPROACH TO DRAWING CONCLUSIONS FROM DATA Zdzisław Pawlak ABSTRACT Rough set theory is a new mathematical approach to vague and uncertain data analysis. This Article explains

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

A Posteriori Corrections to Classification Methods.

A Posteriori Corrections to Classification Methods. A Posteriori Corrections to Classification Methods. Włodzisław Duch and Łukasz Itert Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland; http://www.phys.uni.torun.pl/kmk

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Similarity-based Classification with Dominance-based Decision Rules

Similarity-based Classification with Dominance-based Decision Rules Similarity-based Classification with Dominance-based Decision Rules Marcin Szeląg, Salvatore Greco 2,3, Roman Słowiński,4 Institute of Computing Science, Poznań University of Technology, 60-965 Poznań,

More information

Foundations of Classification

Foundations of Classification Foundations of Classification J. T. Yao Y. Y. Yao and Y. Zhao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {jtyao, yyao, yanzhao}@cs.uregina.ca Summary. Classification

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

An algorithm for induction of decision rules consistent with the dominance principle

An algorithm for induction of decision rules consistent with the dominance principle An algorithm for induction of decision rules consistent with the dominance principle Salvatore Greco 1, Benedetto Matarazzo 1, Roman Slowinski 2, Jerzy Stefanowski 2 1 Faculty of Economics, University

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Rough sets and Boolean reasoning

Rough sets and Boolean reasoning Information Sciences 177 (2007) 41 73 www.elsevier.com/locate/ins Rough sets and Boolean reasoning Zdzisław Pawlak, Andrzej Skowron * Institute of Mathematics, Warsaw University, ul. Banacha 2, 02-097

More information

Multicriteria decision-making method using the correlation coefficient under single-valued neutrosophic environment

Multicriteria decision-making method using the correlation coefficient under single-valued neutrosophic environment International Journal of General Systems, 2013 Vol. 42, No. 4, 386 394, http://dx.doi.org/10.1080/03081079.2012.761609 Multicriteria decision-making method using the correlation coefficient under single-valued

More information

APPLICATION FOR LOGICAL EXPRESSION PROCESSING

APPLICATION FOR LOGICAL EXPRESSION PROCESSING APPLICATION FOR LOGICAL EXPRESSION PROCESSING Marcin Michalak, Michał Dubiel, Jolanta Urbanek Institute of Informatics, Silesian University of Technology, Gliwice, Poland Marcin.Michalak@polsl.pl ABSTRACT

More information

The Decision List Machine

The Decision List Machine The Decision List Machine Marina Sokolova SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 sokolova@site.uottawa.ca Nathalie Japkowicz SITE, University of Ottawa Ottawa, Ont. Canada,K1N-6N5 nat@site.uottawa.ca

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

On Improving the k-means Algorithm to Classify Unclassified Patterns

On Improving the k-means Algorithm to Classify Unclassified Patterns On Improving the k-means Algorithm to Classify Unclassified Patterns Mohamed M. Rizk 1, Safar Mohamed Safar Alghamdi 2 1 Mathematics & Statistics Department, Faculty of Science, Taif University, Taif,

More information

Fuzzy Limits of Functions

Fuzzy Limits of Functions Fuzzy Limits of Functions Mark Burgin Department of Mathematics University of California, Los Angeles 405 Hilgard Ave. Los Angeles, CA 90095 Abstract The goal of this work is to introduce and study fuzzy

More information

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang /

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang / CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors Furong Huang / furongh@cs.umd.edu What we know so far Decision Trees What is a decision tree, and how to induce it from

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING

ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING ROUGH SETS THEORY AND DATA REDUCTION IN INFORMATION SYSTEMS AND DATA MINING Mofreh Hogo, Miroslav Šnorek CTU in Prague, Departement Of Computer Sciences And Engineering Karlovo Náměstí 13, 121 35 Prague

More information

Learning from Examples

Learning from Examples Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) (This is a sample cover image for this issue. The actual cover is not yet available at this time.) This article appeared in a journal published by Elsevier. The attached copy is furnished to the author

More information

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes

Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes Foundations of Mathematics MATH 220 FALL 2017 Lecture Notes These notes form a brief summary of what has been covered during the lectures. All the definitions must be memorized and understood. Statements

More information

Approximate Boolean Reasoning: Foundations and Applications in Data Mining

Approximate Boolean Reasoning: Foundations and Applications in Data Mining Approximate Boolean Reasoning: Foundations and Applications in Data Mining Hung Son Nguyen Institute of Mathematics, Warsaw University Banacha 2, 02-097 Warsaw, Poland son@mimuw.edu.pl Table of Contents

More information

Minimal Attribute Space Bias for Attribute Reduction

Minimal Attribute Space Bias for Attribute Reduction Minimal Attribute Space Bias for Attribute Reduction Fan Min, Xianghui Du, Hang Qiu, and Qihe Liu School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu

More information

On rule acquisition in incomplete multi-scale decision tables

On rule acquisition in incomplete multi-scale decision tables *Manuscript (including abstract) Click here to view linked References On rule acquisition in incomplete multi-scale decision tables Wei-Zhi Wu a,b,, Yuhua Qian c, Tong-Jun Li a,b, Shen-Ming Gu a,b a School

More information

Mathematical Approach to Vagueness

Mathematical Approach to Vagueness International Mathematical Forum, 2, 2007, no. 33, 1617-1623 Mathematical Approach to Vagueness Angel Garrido Departamento de Matematicas Fundamentales Facultad de Ciencias de la UNED Senda del Rey, 9,

More information

Consistency of Nearest Neighbor Methods

Consistency of Nearest Neighbor Methods E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study

More information

Fuzzy Systems. Introduction

Fuzzy Systems. Introduction Fuzzy Systems Introduction Prof. Dr. Rudolf Kruse Christian Moewes {kruse,cmoewes}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge

More information

A Logical Formulation of the Granular Data Model

A Logical Formulation of the Granular Data Model 2008 IEEE International Conference on Data Mining Workshops A Logical Formulation of the Granular Data Model Tuan-Fang Fan Department of Computer Science and Information Engineering National Penghu University

More information

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat

Geometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat Geometric View of Machine Learning Nearest Neighbor Classification Slides adapted from Prof. Carpuat What we know so far Decision Trees What is a decision tree, and how to induce it from data Fundamental

More information

Data Mining und Maschinelles Lernen

Data Mining und Maschinelles Lernen Data Mining und Maschinelles Lernen Ensemble Methods Bias-Variance Trade-off Basic Idea of Ensembles Bagging Basic Algorithm Bagging with Costs Randomization Random Forests Boosting Stacking Error-Correcting

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

1 [15 points] Frequent Itemsets Generation With Map-Reduce

1 [15 points] Frequent Itemsets Generation With Map-Reduce Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.

More information

Research Article Special Approach to Near Set Theory

Research Article Special Approach to Near Set Theory Mathematical Problems in Engineering Volume 2011, Article ID 168501, 10 pages doi:10.1155/2011/168501 Research Article Special Approach to Near Set Theory M. E. Abd El-Monsef, 1 H. M. Abu-Donia, 2 and

More information

Chapter 1 The Real Numbers

Chapter 1 The Real Numbers Chapter 1 The Real Numbers In a beginning course in calculus, the emphasis is on introducing the techniques of the subject;i.e., differentiation and integration and their applications. An advanced calculus

More information

Knowledge Discovery Based Query Answering in Hierarchical Information Systems

Knowledge Discovery Based Query Answering in Hierarchical Information Systems Knowledge Discovery Based Query Answering in Hierarchical Information Systems Zbigniew W. Raś 1,2, Agnieszka Dardzińska 3, and Osman Gürdal 4 1 Univ. of North Carolina, Dept. of Comp. Sci., Charlotte,

More information

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA address:

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA  address: Topology Xiaolong Han Department of Mathematics, California State University, Northridge, CA 91330, USA E-mail address: Xiaolong.Han@csun.edu Remark. You are entitled to a reward of 1 point toward a homework

More information

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets Today s topics Introduction to Set Theory ( 1.6) Sets Definitions Operations Proving Set Identities Reading: Sections 1.6-1.7 Upcoming Functions A set is a new type of structure, representing an unordered

More information

Synchronization of an uncertain unified chaotic system via adaptive control

Synchronization of an uncertain unified chaotic system via adaptive control Chaos, Solitons and Fractals 14 (22) 643 647 www.elsevier.com/locate/chaos Synchronization of an uncertain unified chaotic system via adaptive control Shihua Chen a, Jinhu L u b, * a School of Mathematical

More information

A Generalized Decision Logic in Interval-set-valued Information Tables

A Generalized Decision Logic in Interval-set-valued Information Tables A Generalized Decision Logic in Interval-set-valued Information Tables Y.Y. Yao 1 and Qing Liu 2 1 Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca

More information

Supplementary Material for MTH 299 Online Edition

Supplementary Material for MTH 299 Online Edition Supplementary Material for MTH 299 Online Edition Abstract This document contains supplementary material, such as definitions, explanations, examples, etc., to complement that of the text, How to Think

More information

Computers and Electrical Engineering

Computers and Electrical Engineering Computers and Electrical Engineering 36 (2010) 56 60 Contents lists available at ScienceDirect Computers and Electrical Engineering journal homepage: wwwelseviercom/locate/compeleceng Cryptanalysis of

More information

Entropy for intuitionistic fuzzy sets

Entropy for intuitionistic fuzzy sets Fuzzy Sets and Systems 118 (2001) 467 477 www.elsevier.com/locate/fss Entropy for intuitionistic fuzzy sets Eulalia Szmidt, Janusz Kacprzyk Systems Research Institute, Polish Academy of Sciences ul. Newelska

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

1 The Well Ordering Principle, Induction, and Equivalence Relations

1 The Well Ordering Principle, Induction, and Equivalence Relations 1 The Well Ordering Principle, Induction, and Equivalence Relations The set of natural numbers is the set N = f1; 2; 3; : : :g. (Some authors also include the number 0 in the natural numbers, but number

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Fuzzy Systems. Introduction

Fuzzy Systems. Introduction Fuzzy Systems Introduction Prof. Dr. Rudolf Kruse Christoph Doell {kruse,doell}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge Processing

More information

Slides credits: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

Slides credits: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

Intuitionistic L-Fuzzy Rings. By K. Meena & K. V. Thomas Bharata Mata College, Thrikkakara

Intuitionistic L-Fuzzy Rings. By K. Meena & K. V. Thomas Bharata Mata College, Thrikkakara Global Journal of Science Frontier Research Mathematics and Decision Sciences Volume 12 Issue 14 Version 1.0 Type : Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

Semantic Rendering of Data Tables: Multivalued Information Systems Revisited

Semantic Rendering of Data Tables: Multivalued Information Systems Revisited Semantic Rendering of Data Tables: Multivalued Information Systems Revisited Marcin Wolski 1 and Anna Gomolińska 2 1 Maria Curie-Skłodowska University, Department of Logic and Cognitive Science, Pl. Marii

More information

Solving Classification Problems By Knowledge Sets

Solving Classification Problems By Knowledge Sets Solving Classification Problems By Knowledge Sets Marcin Orchel a, a Department of Computer Science, AGH University of Science and Technology, Al. A. Mickiewicza 30, 30-059 Kraków, Poland Abstract We propose

More information

Journal of Computational Physics

Journal of Computational Physics Journal of Computational Physics 9 () 759 763 Contents lists available at ScienceDirect Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp Short Note A comment on the computation

More information

Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains

Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains Methods of Partial Logic for Knowledge Representation and Deductive Reasoning in Incompletely Specified Domains Anatoly Prihozhy and Liudmila Prihozhaya Information Technologies and Robotics Department,

More information

Three-Way Analysis of Facial Similarity Judgments

Three-Way Analysis of Facial Similarity Judgments Three-Way Analysis of Facial Similarity Judgments Daryl H. Hepting, Hadeel Hatim Bin Amer, and Yiyu Yao University of Regina, Regina, SK, S4S 0A2, CANADA hepting@cs.uregina.ca, binamerh@cs.uregina.ca,

More information

ARPN Journal of Science and Technology All rights reserved.

ARPN Journal of Science and Technology All rights reserved. Rule Induction Based On Boundary Region Partition Reduction with Stards Comparisons Du Weifeng Min Xiao School of Mathematics Physics Information Engineering Jiaxing University Jiaxing 34 China ABSTRACT

More information

A first model of learning

A first model of learning A first model of learning Let s restrict our attention to binary classification our labels belong to (or ) We observe the data where each Suppose we are given an ensemble of possible hypotheses / classifiers

More information

Modern Information Retrieval

Modern Information Retrieval Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction

More information

Application of Rough Set Theory in Performance Analysis

Application of Rough Set Theory in Performance Analysis Australian Journal of Basic and Applied Sciences, 6(): 158-16, 1 SSN 1991-818 Application of Rough Set Theory in erformance Analysis 1 Mahnaz Mirbolouki, Mohammad Hassan Behzadi, 1 Leila Karamali 1 Department

More information

Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations

Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations Mathematics Course 111: Algebra I Part I: Algebraic Structures, Sets and Permutations D. R. Wilkins Academic Year 1996-7 1 Number Systems and Matrix Algebra Integers The whole numbers 0, ±1, ±2, ±3, ±4,...

More information

Data Analysis - the Rough Sets Perspective

Data Analysis - the Rough Sets Perspective Data Analysis - the Rough ets Perspective Zdzisław Pawlak Institute of Computer cience Warsaw University of Technology 00-665 Warsaw, Nowowiejska 15/19 Abstract: Rough set theory is a new mathematical

More information

Sequential dynamical systems over words

Sequential dynamical systems over words Applied Mathematics and Computation 174 (2006) 500 510 www.elsevier.com/locate/amc Sequential dynamical systems over words Luis David Garcia a, Abdul Salam Jarrah b, *, Reinhard Laubenbacher b a Department

More information

Learning Decision Trees

Learning Decision Trees Learning Decision Trees Machine Learning Spring 2018 1 This lecture: Learning Decision Trees 1. Representation: What are decision trees? 2. Algorithm: Learning decision trees The ID3 algorithm: A greedy

More information

2 WANG Jue, CUI Jia et al. Vol.16 no", the discernibility matrix is only a new kind of learning method. Otherwise, we have to provide the specificatio

2 WANG Jue, CUI Jia et al. Vol.16 no, the discernibility matrix is only a new kind of learning method. Otherwise, we have to provide the specificatio Vol.16 No.1 J. Comput. Sci. & Technol. Jan. 2001 Investigation on AQ11, ID3 and the Principle of Discernibility Matrix WANG Jue (Ξ ±), CUI Jia ( ) and ZHAO Kai (Π Λ) Institute of Automation, The Chinese

More information

Rough operations on Boolean algebras

Rough operations on Boolean algebras Rough operations on Boolean algebras Guilin Qi and Weiru Liu School of Computer Science, Queen s University Belfast Belfast, BT7 1NN, UK Abstract In this paper, we introduce two pairs of rough operations

More information

CHAPTER-17. Decision Tree Induction

CHAPTER-17. Decision Tree Induction CHAPTER-17 Decision Tree Induction 17.1 Introduction 17.2 Attribute selection measure 17.3 Tree Pruning 17.4 Extracting Classification Rules from Decision Trees 17.5 Bayesian Classification 17.6 Bayes

More information

Characterizing Pawlak s Approximation Operators

Characterizing Pawlak s Approximation Operators Characterizing Pawlak s Approximation Operators Victor W. Marek Department of Computer Science University of Kentucky Lexington, KY 40506-0046, USA To the memory of Zdzisław Pawlak, in recognition of his

More information

Journal of Computational Physics

Journal of Computational Physics Journal of Computational Physics 229 (21) 3884 3915 Contents lists available at ScienceDirect Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp An adaptive high-dimensional

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 4: Vector Data: Decision Tree Instructor: Yizhou Sun yzsun@cs.ucla.edu October 10, 2017 Methods to Learn Vector Data Set Data Sequence Data Text Data Classification Clustering

More information