Rough Sets. V.W. Marek. General introduction and one theorem. Department of Computer Science University of Kentucky. October 2013.



What is it about?
Rough Sets is a popular formalism for talking about approximations.
Especially studied in Poland, Canada, the US, China, and India.
Actually, invented by the late Zdzisław Pawlak.
Ties together several areas of science: Statistics, Logic, Universal Algebra, Topology, Combinatorics, even Functional Analysis.
Motivated by situations when the language is inadequate to describe collections of objects.
(I wrote a few papers on RS and was a coauthor of the first paper on RS.)

What is it about, cont'd
Also, I read a paper by Yanfang Liu and William Zhu, of Zhangzhou Normal University, "Parameterized matroid of rough set", and will present one theorem (and its proof).
The reason is that, originally, I doubted the result was true, and the proof did not make sense.
This specific result ties in with an important area of Combinatorial Optimization and explains why some algorithms for Rough Sets work.
This is (unfortunately) the only actual proof that I will present in this series of lectures.

Plan
One-table bags of records, and the associated equivalence relation.
Rough sets.
Matroids.
The Liu theorem.

Database background
A table is a collection of records, possibly with repetition.
In other words, a bag, not a set, of records.
Let us assume now that we assign to each of these records a unique identifier.
Then there is an equivalence relation $\sim$ on the set of identifiers, namely: $i_1 \sim i_2$ if $i_1, i_2$ are identifiers of the same record.

Example
Here is a table patients:

Id  lname  fname   temp
1   marek  victor  104.2
2   morek  vector  101.2
3   marek  victor  104.2
4   marek  victor  99.6
5   morek  vector  101.2

(But remember that the ids are NOT part of the data.)
Here the relation $\sim$ has three equivalence classes: {1, 3}, {2, 5}, and {4}.
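
The grouping of identifiers into equivalence classes can be sketched in a few lines of Python. This is only an illustration: the dictionary `patients` and the helper `equivalence_classes` are names made up here, and the ids serve purely as bookkeeping keys.

```python
from collections import defaultdict

# The patients table from the example; ids are bookkeeping, not data.
patients = {
    1: ("marek", "victor", 104.2),
    2: ("morek", "vector", 101.2),
    3: ("marek", "victor", 104.2),
    4: ("marek", "victor", 99.6),
    5: ("morek", "vector", 101.2),
}

def equivalence_classes(table):
    """Group identifiers whose records are identical."""
    groups = defaultdict(set)
    for ident, record in table.items():
        groups[record].add(ident)
    # Order the classes by their smallest identifier, for readability.
    return sorted(groups.values(), key=min)

classes = equivalence_classes(patients)
print(classes)
```

For this table the result consists of the three classes {1, 3}, {2, 5}, and {4}.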

Not every subset is describable
If we implement this table in SQL (what is SQL?), the set consisting of the records with identifiers 1, 2, and 3 cannot be described.
The point is that, from the point of view of SQL, the records with ids 2 and 5 cannot be distinguished.
Only sets that are unions of equivalence classes of $\sim$ can be described.
The set {1, 3, 4} can be described by:
SELECT * FROM patients WHERE lname = 'marek'
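
As a rough check of which sets are describable, one can test whether a set is a union of equivalence classes. The function name `is_definable` is an assumption made for this sketch, and the classes are typed in by hand from the example.

```python
def is_definable(x, classes):
    """True when x is a union of equivalence classes."""
    # Collect every class that fits entirely inside x ...
    covered = set().union(*(c for c in classes if c <= x))
    # ... and see whether those classes already make up all of x.
    return covered == x

classes = [{1, 3}, {2, 5}, {4}]
describable = is_definable({1, 3, 4}, classes)      # the "marek" query
not_describable = is_definable({1, 2, 3}, classes)  # splits the class {2, 5}
```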

Two kinds of linguistic inadequacy
Say we have a language for description (think medicine, the original motivation of Pawlak).
There may be sets of objects we cannot describe (given that language).
It is also possible that there is a description, but it is just too big.
This happens when we have plenty of attributes and need to perform attribute reduction to get a human-readable description.
When you do this, records may become indistinguishable.
(Ever heard about the Johnson-Lindenstrauss Theorem?)

So, we have to approximate
[Figure: a set X, with regions A (points a1 to a5) and B (points b1 to b5).]
There is a largest definable set included in a given set X, often called the interior of X, $\underline{X}$.
There is a smallest definable set containing a given set X, often called the closure of X, $\overline{X}$.
We see them in our figure.
There is a large number of obvious identities for interior and closure.

Topological angle
(What about Alexandrov Topology in our context?)
It is just that we are not topologists, and we think about our objects as database objects.
This, of course, has consequences; we implement rough sets.

A few important facts
The equivalence class of x, $[x]$, is $\{y : y \sim x\}$.
The interior of a set X is the union of the equivalence classes included in X.
The closure of a set X is the union of the equivalence classes that have a nonempty intersection with X.
There are various characterizations of interior and closure, in various terms.
One such characterization, by Mirek Truszczyński and myself, is that the pair $\langle \underline{X}, \overline{X} \rangle$ is the best approximation of X in the Kleene ordering of pairs of definable sets.
(The Kleene ordering is the order of approximations where "the lower class goes up and the upper class goes down".)
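
The two descriptions of interior and closure translate almost word for word into code. The following is a minimal sketch, with the function names `interior` and `closure` chosen here and the classes taken from the patients example.

```python
def interior(x, classes):
    """Union of the equivalence classes included in x."""
    return set().union(*(c for c in classes if c <= x))

def closure(x, classes):
    """Union of the equivalence classes that meet x."""
    return set().union(*(c for c in classes if c & x))

classes = [{1, 3}, {2, 5}, {4}]
X = {1, 2, 3}
lower, upper = interior(X, classes), closure(X, classes)
print(lower, upper)
```

For X = {1, 2, 3} the interior is {1, 3} and the closure is {1, 2, 3, 5}; the two coincide exactly when X is definable.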

Rough sets, formally
Given an equivalence relation $\sim$ in a set U, the rough set determined by a set $X \subseteq U$ is the pair $\langle \underline{X}, \overline{X} \rangle$.
Then a rough subset of U is a pair determined by some subset X of U.
Besides the characterization mentioned above, there are other characterizations of rough sets: in terms of topology, in terms of Boolean Algebras with operators, etc.
The person who invented Rough Sets (no longer with us), Professor Zdzisław Pawlak, wrote an often-quoted book on the subject.
There is a journal, Transactions on Rough Sets, and even a Rough Sets Society.
There are plenty of conferences on Rough Sets, in all sorts of places.

Matroid
A matroid is a combinatorial structure that attempts to capture the notions behind concepts such as an independent set of vectors in a vector space.
But also cycle-free subgraphs of an undirected graph.
Formally, a matroid is a pair $\langle A, M \rangle$ where M consists of (some) subsets of A and satisfies the following conditions:
$\emptyset \in M$.
If $S \subseteq T$ and $T \in M$, then $S \in M$.
If $S, T \in M$ and $|S| < |T|$, then for some $x \in T \setminus S$, $S \cup \{x\} \in M$.
(This definition of a matroid abstracts the properties of linearly independent subsets of a vector space.)
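
For small finite families the three conditions can simply be checked by brute force. The sketch below does that; `is_matroid` and the two example families are illustrative names and data, not part of the lecture.

```python
from itertools import combinations

def is_matroid(family):
    """Brute-force check of the three conditions for a finite family of sets."""
    fam = {frozenset(s) for s in family}
    # Condition 1: the empty set is a member.
    if frozenset() not in fam:
        return False
    # Condition 2: every subset of a member is a member.
    for t in fam:
        for r in range(len(t)):
            if any(frozenset(s) not in fam for s in combinations(t, r)):
                return False
    # Condition 3 (exchange): a smaller member can be extended from a larger one.
    for s in fam:
        for t in fam:
            if len(s) < len(t) and not any(s | {x} in fam for x in t - s):
                return False
    return True

# All subsets of {a, b, c} with at most two elements
# (the cycle-free edge sets of a triangle).
forests = [set(c) for r in range(3) for c in combinations("abc", r)]
# Closed under subsets, but {c} cannot be extended from {a, b}.
broken = [set(), {"a"}, {"b"}, {"c"}, {"a", "b"}]

forests_ok = is_matroid(forests)
broken_ok = is_matroid(broken)
```

The first family passes all three conditions; the second fails only the exchange condition.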

Matroids, cont'd
This last property is called the Steinitz exchange property, and whoever has taken a class in linear algebra must have heard about it.
The concept of a matroid is one of the fundamental combinatorial structures.
There are many other characterizations of matroids, in various terms.
One important connection between matroids and Computer Science is the so-called Rado-Edmonds Theorem, which characterizes greedy algorithms in terms of (weighted) matroids.
(Look up an absolute classic: Witold Lipski, jr., Kombinatoryka dla programistów, ISBN 83-204-2968-4.)

Why are matroids important?
They occur in many places, but the important point is the characterization of greedy algorithms via matroids.
Say we have a set A and a weight function $wt : A \to \mathbb{R}^+$.
The weight of a set $S \subseteq A$ is $\sum_{x \in S} wt(x)$.

Rado-Edmonds Theorem
We sort the set A according to weights, in descending order.
The Rado-Edmonds Theorem tells us that if a family $F \subseteq P(A)$ is the family of independent sets of a matroid and we select greedily, then we will compute a base of maximum weight (what is a base?).
Selecting greedily means: we initialize X to the empty set, and in each step we select a fresh maximum-weight element x such that $X \cup \{x\}$ is in F, and then set $X := X \cup \{x\}$.
When no fresh $x \in A$ such that $X \cup \{x\}$ belongs to F can be found, we return X.
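
The greedy selection can be sketched as a single pass over the sorted elements. This assumes F is closed under subsets, so an element rejected once never becomes admissible later; the names `greedy`, `weights`, and the "at most two elements" example family are assumptions made for illustration.

```python
def greedy(ground, weight, independent):
    """Scan elements by descending weight; keep each one that preserves membership in F."""
    chosen = set()
    for x in sorted(ground, key=weight, reverse=True):
        if independent(chosen | {x}):
            chosen.add(x)
    return chosen

weights = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
# Hypothetical example matroid: the sets with at most two elements are independent.
base = greedy(weights, weights.get, lambda s: len(s) <= 2)
print(base, sum(weights[x] for x in base))
```

Here the independence test is passed in as a function (an "oracle"), so the same loop works for any family F without listing F explicitly.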

Rado-Edmonds Theorem, cont'd
Conversely, if F is closed under subsets but is not the family of independent sets of a matroid, then there is a weight function for which we will not get a maximum-weight element of F.
(If you had a serious data-structures course, then certainly these facts were learned, if you were paying attention.)
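
A tiny instance shows the converse at work. The family and the weights below are invented for this sketch: the family is closed under subsets, but {a} cannot be extended from {b, c}, so it is not the family of independent sets of a matroid.

```python
def greedy(ground, weight, independent):
    """Scan elements by descending weight; keep each one that preserves membership in F."""
    chosen = set()
    for x in sorted(ground, key=weight, reverse=True):
        if independent(chosen | {x}):
            chosen.add(x)
    return chosen

# Closed under subsets, but the exchange condition fails for {a} and {b, c}.
family = [set(), {"a"}, {"b"}, {"c"}, {"b", "c"}]
weights = {"a": 3, "b": 2, "c": 2}

def total(s):
    """Weight of a set: the sum of the weights of its elements."""
    return sum(weights[x] for x in s)

picked = greedy(weights, weights.get, lambda s: s in family)
best = max(family, key=total)
print(picked, total(picked), best, total(best))
```

Greedy commits to the heaviest element a and gets stuck with weight 3, while the set {b, c} has weight 4.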

Matroids associated with rough sets
Let U be a set of objects, and $\sim$ an equivalence relation on U. Let $Y \subseteq U$.
Then Y determines a collection $M_Y$ of subsets of U, namely $M_Y = \{A \subseteq U : \underline{A} \subseteq Y\}$.
In our simple example, with $1 \sim 3$, $2 \sim 5$, and 4 in relation with itself only, the set X = {1, 2, 3} determines the following class $M_X$:
the empty set;
the one-element sets {1}, {2}, {3}, {5} (but not {4}).
What about {1, 2}, {1, 3}, {2, 3}? And are there more?
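
The questions on this slide can be explored by enumeration. The sketch below assumes that $M_Y$ consists of the sets A whose interior is included in Y; the helper names `interior`, `powerset`, and `family_m` are made up for the illustration.

```python
from itertools import combinations

def interior(x, classes):
    """Union of the equivalence classes included in x."""
    return set().union(*(c for c in classes if c <= x))

def powerset(universe):
    """All subsets of a finite set, smallest first."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def family_m(y, universe, classes):
    """Assumed reading of M_Y: the sets whose interior is included in y."""
    return [a for a in powerset(universe) if interior(a, classes) <= y]

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]
M_X = family_m({1, 2, 3}, U, classes)
for a in M_X:
    print(sorted(a))
```

Under this reading, a set belongs to $M_X$ exactly when it avoids 4 and does not contain both 2 and 5, which gives twelve sets in all.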

Matroids and rough sets, cont'd
Here is the result of Liu and Zhu: Let $\sim$ be an equivalence relation in the set U. For every set $Y \subseteq U$, the structure $\langle U, M_Y \rangle$ is a matroid.
(We will prove that.)
Since $M_Y$ obviously contains the empty set and is closed under subsets ("if a set grows(?) smaller, the interior grows smaller"), the first two conditions are obvious.
So now, let us assume that we have two sets A, B in $M_Y$, with $|A| < |B|$.
We need to find an object $x \in B \setminus A$ so that $A \cup \{x\}$ also belongs to $M_Y$.

Case 1
Some $x \in B \setminus A$ has the property that $[x] = \{x\}$.
Specifically, that x is in relation with itself but not with any other object.
We claim that this specific x has the property that $A \cup \{x\} \in M_Y$.
Indeed, because $[x]$ is a singleton and $x \notin A$, $\underline{A \cup \{x\}} = \underline{A} \cup \{x\}$.
Now, $\underline{A} \subseteq Y$ (because $A \in M_Y$), and also $x \in Y$, because $B \in M_Y$ and $\{x\} \subseteq \underline{B} \subseteq Y$.
Thus in this case the matter is easy.

Case 2
No $x \in B \setminus A$ has the property that $[x] = \{x\}$.
Our idea now is to assume that for no $x \in B \setminus A$ does $A \cup \{x\}$ belong to $M_Y$, and to work for a contradiction.
The fact that $A \cup \{x\} \notin M_Y$ means that $\underline{A \cup \{x\}}$ is strictly larger than $\underline{A}$.
But there are only two possibilities: either $\underline{A \cup \{x\}}$ is $\underline{A}$ or it is $\underline{A} \cup [x]$.
The first possibility does not hold, so the other one must hold.

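What does it mean?
This means that all $y \sim x$, $y \neq x$, are in A!
And this happens for all $x \in B \setminus A$.
The next question we ask is whether it is possible that for some $x, y \in B \setminus A$, $x \neq y$, we have $x \sim y$.
If that were the case, then $[x] = [y]$ and $[x] \setminus \{x\} \subseteq A$ and also $[y] \setminus \{y\} \subseteq A$.
But then $[x] \subseteq A$, contradicting the fact that $x \notin A$.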

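What is going on?
For each $x \in B \setminus A$, all the elements $y \neq x$ such that $x \sim y$ are in A!
Moreover, because we are in Case 2, every $x \in B \setminus A$ is in relation with some element of A (actually of $A \setminus B$).
Let us select, for each $x \in B \setminus A$, one element y such that $x \sim y$, $x \neq y$, and $y \notin B$.
Such a y exists: otherwise $[x] \subseteq B$, hence $[x] \subseteq \underline{B} \subseteq Y$, and then $\underline{A \cup \{x\}} = \underline{A} \cup [x] \subseteq Y$, so $A \cup \{x\}$ would belong to $M_Y$.
Then, because $[x] \setminus \{x\} \subseteq A$, this function maps $B \setminus A$ into A (in fact into $A \setminus B$, because y was chosen outside B; recall also that no object in $B \setminus A$ is $\sim$-related to any different object in $B \setminus A$).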

A bit of combinatorics
[Figure: The injection of $B \setminus A$ into $A \setminus B$.]
In fact, it is an injection of $B \setminus A$ into $A \setminus B$, because distinct objects in $B \setminus A$ lie in distinct equivalence classes!
But then $|B \setminus A| \leq |A \setminus B|$, and hence $|B| \leq |A|$!
This contradicts the fact that $|A| < |B|$ and completes the argument.
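
The theorem can also be sanity-checked by brute force on the patients example. This is not a proof, only a check of the exchange condition for every parameter Y, assuming that $M_Y$ consists of the sets whose interior is included in Y; all function names are illustrative.

```python
from itertools import combinations

def interior(x, classes):
    """Union of the equivalence classes included in x."""
    return set().union(*(c for c in classes if c <= x))

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def family_m(y, universe, classes):
    """Assumed reading of M_Y: the sets whose interior is included in y."""
    return {a for a in powerset(universe) if interior(a, classes) <= y}

def exchange_holds(fam):
    """Every smaller member can be extended by an element of any larger member."""
    return all(
        any(a | {x} in fam for x in b - a)
        for a in fam for b in fam if len(a) < len(b)
    )

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]
all_ok = all(exchange_holds(family_m(y, U, classes)) for y in powerset(U))
print(all_ok)
```

The check runs over all 32 subsets Y of U; the other two matroid conditions are the obvious ones and are not tested here.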

A few other things
Here is a characterization of $M_Y$: $M_Y = \{A : \underline{A} \subseteq \underline{Y}\}$.
And one more: $M_Y = \{A : \underline{(A \setminus \underline{Y})} = \emptyset\}$.
There are plenty of other similarly easy characterizations of $M_Y$.

More stuff
Discussing with Professor M. Truszczyński of my Department, I learned a few things.
It is quite possible that these things are in many papers of W. Zhu and his coauthors.
We will present Mirek's suggestions now.

Selectors
Let F be a family of pairwise disjoint nonempty sets.
A set $Z \subseteq \bigcup F$ is a selector for F if for all $T \in F$, $|Z \cap T| = 1$.
Something called the Axiom of Choice (what is it?) guarantees that selectors exist, but if F is a finite family of finite sets, then no special axioms are needed.

Family $F_Y$
Now, let Y be a subset of U.
The family $F_Y$ is defined as follows: $F_Y = \{[x] : [x] \cap \underline{Y} = \emptyset\}$.
Thus $F_Y$ consists of the equivalence classes of $\sim$ which are disjoint from $\underline{Y}$, that is, the classes not included in Y.
This family $F_Y$, if nonempty, consists of nonempty sets only, and so has nonempty selectors.

Family $F_Y$, cont'd
All the selectors S for $F_Y$ are of the same size, namely $|F_Y|$.
Here is what Prof. Truszczyński observed: the maximal sets in $M_Y$ are precisely the sets of the form $U \setminus S$, where S is a selector for $F_Y$.
Therefore the family $M_Y$ can be characterized as follows: $M_Y = \{X : \exists T\ (T \text{ is a selector for } F_Y \text{ and } X \cap T = \emptyset)\}$.
(This can be used for an alternative proof of the fact that $M_Y$ is a matroid.)
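
The observation about maximal sets can be compared against a brute-force computation on the patients example. Two readings are assumed in this sketch and should be treated as such: $M_Y$ consists of the sets whose interior is included in Y, and $F_Y$ consists of the equivalence classes that are not included in Y. The function names are illustrative.

```python
from itertools import combinations, product

def interior(x, classes):
    """Union of the equivalence classes included in x."""
    return set().union(*(c for c in classes if c <= x))

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def maximal_members(y, universe, classes):
    """Inclusion-maximal sets of M_Y (assumed: interior of the set is included in y)."""
    fam = [a for a in powerset(universe) if interior(a, classes) <= y]
    return {a for a in fam if not any(a < b for b in fam)}

def selector_complements(y, universe, classes):
    """Sets U minus S, for every selector S of F_Y (assumed: classes not included in y)."""
    f_y = [c for c in classes if not c <= y]
    # A selector picks exactly one element from each class in F_Y.
    return {frozenset(universe) - frozenset(s) for s in product(*f_y)}

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]
Y = frozenset({1, 2, 3})
print(sorted(map(sorted, maximal_members(Y, U, classes))))
agree = all(
    maximal_members(y, U, classes) == selector_complements(y, U, classes)
    for y in powerset(U)
)
```

For Y = {1, 2, 3} the family $F_Y$ is {{2, 5}, {4}}, its selectors are {2, 4} and {5, 4}, and the maximal sets are {1, 3, 5} and {1, 2, 3}.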

Conclusions
Even though rough sets are such a fundamental data structure, people still find new and interesting facts.
In the case of the result we presented, there was a new technology (matroids) that we used.
Maybe it will be useful in further investigations.