Relational Algebra as non-distributive Lattice

VADIM TROPASHKO
Vadim.Tropashko@orcl.com

We reduce the set of classic relational algebra operators to two binary operations: natural join and generalized union. We further demonstrate that this set of operators is relationally complete and honors the lattice axioms.

Categories and Subject Descriptors: H.2 [Database Management]
General Terms: Relational Algebra, Duality, Lattice Theory
Additional Key Words and Phrases: Relational Operators, Projection, Selection, Cartesian Product, Set Operations, Tensor Product, Generalized Union, Formal Concept Analysis

1. INTRODUCTION

Chapter 4 of the Alice book [1] describes the SPC algebra for conjunctive queries. Selection, projection, and Cartesian product are its primitive operations, while a few others, intersection for example, can be expressed as compositions of the primitive ones. Union is a notable exception, so it is introduced as an additional primitive operation to form the SPCU algebra. This apparent lack of symmetry between union and intersection was investigated in [2], where the number of basic relational operators was reduced to two pairs of mutually dual operators. Remarkably, the set operators were excluded from the basic ones.

In this paper we reduce the set of classic relational algebra operators to two binary operations: natural join and generalized union. We demonstrate that this set of operators is relationally complete and honors the lattice axioms. In the final section we draw a connection to Formal Concept Analysis [5].

2. FUNDAMENTAL OPERATORS

2.1 Natural Join

Natural join is formally defined as Cartesian product followed by selection with equality predicates on the columns with identical names:

$$A(x,y) \bowtie B(y,z) = \sigma_{A.y=B.y}\bigl(A(x,y) \times B(y,z)\bigr)$$

The two identical $y$ columns in the resulting relation are merged into a single column. As argued in [2], this action is symmetric to merging duplicate rows and is always assumed as a side effect of selection. In this paper we shift perspective and emphasize natural join as the basic operator. Selection and Cartesian product will be expressed in terms of natural join.

Example 1. Given relations A

| x | y |
|---|---|
| 1 | 2 |
| 1 | 3 |

and B

| y | z |
|---|---|
| 1 | 3 |
| 2 | 4 |

the natural join $A \bowtie B$ is

| x | y | z |
|---|---|---|
| 1 | 2 | 4 |

2.2 Generalized Union

Generalized union, written $\Cup$, is formally defined as set union restricted to the common set of relation attributes:

$$A(x,y) \Cup B(y,z) = \pi_y A(x,y) \cup \pi_y B(y,z)$$

Unlike traditional set-based union, there is no requirement that the signatures of the operands match. Alternatively, generalized union can be defined as tensor product followed by projection onto the set of common columns:

$$A(x,y) \Cup B(y,z) = \pi_y\bigl(A(x,y) \otimes B(y,z)\bigr)$$

Duplicate rows are merged as the usual side effect of projection under set (as opposed to bag) semantics. This definition is dual to the definition of natural join in section 2.1.

Example 2. Given relations A and B from Example 1, the generalized union $A \Cup B$ is

| y |
|---|
| 1 |
| 2 |
| 3 |
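The two fundamental operators can be tried out on the relations of Example 1 and Example 2. The following is a minimal sketch, not the author's implementation: the representation of a relation as a pair (attribute set, set of rows) and the helper names `relation`, `natural_join`, and `generalized_union` are assumptions made for illustration.

```python
from itertools import product


def relation(attrs, rows):
    """Hypothetical helper: a relation is (attribute set, set of rows);
    each row is a frozenset of (attribute, value) pairs."""
    attrs = tuple(attrs)
    return frozenset(attrs), frozenset(frozenset(zip(attrs, row)) for row in rows)


def natural_join(r, s):
    """Combine rows that agree on all shared attributes; signature is the union."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs
    joined = set()
    for t, u in product(r_rows, s_rows):
        dt, du = dict(t), dict(u)
        if all(dt[c] == du[c] for c in common):
            joined.add(frozenset({**dt, **du}.items()))
    return r_attrs | s_attrs, frozenset(joined)


def generalized_union(r, s):
    """Project both operands onto the shared attributes, then take set union."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs

    def project(rows):
        return {frozenset((a, v) for a, v in row if a in common) for row in rows}

    return common, frozenset(project(r_rows) | project(s_rows))


# Relations A(x, y) and B(y, z) from Example 1.
A = relation(("x", "y"), [(1, 2), (1, 3)])
B = relation(("y", "z"), [(1, 3), (2, 4)])

A_join_B = natural_join(A, B)        # one row: x=1, y=2, z=4
A_union_B = generalized_union(A, B)  # rows y=1, y=2, y=3
```

Storing rows as sets of attribute-value pairs makes column order irrelevant and merges duplicate rows automatically, which matches the set semantics assumed in the text.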

Perhaps the name generalized intersection, instead of natural join, would better emphasize the symmetry between the two fundamental operators. We nevertheless stick to the traditional naming convention. Another possibility is to call generalized union natural union. An even better option is to omit the adjectives altogether.

3. RELATIONAL COMPLETENESS

In the next sections we examine the basic operators from [2] one by one and express them in terms of natural join and generalized union.

3.1 Cartesian Product

Cartesian product is a natural join between relations with disjoint sets of columns.

3.2 Selection

Selection $\sigma_{p(x,y)} A(x,y,z)$ is the natural join between A and the relation P(x,y) that holds exactly the tuples satisfying the predicate p:

$$\sigma_{p(x,y)} A(x,y,z) = A(x,y,z) \bowtie P(x,y)$$

Admittedly, the cost of this reduction is introducing a potentially infinite relation corresponding to the selection predicate. In the author's opinion, this price is justified. Raising the abstraction level and reducing complexity tends to produce nicer mathematics. Infinity emerges naturally in many mathematical endeavors and often simplifies the underlying theory.

Example 3a. Given relation A(x,y), selection $\sigma_{x=1} A(x,y)$ is the natural join of A with the finite relation $EQ_1(x)$:

| x |
|---|
| 1 |

Example 3b. Given relation A from Example 1, selection $\sigma_{x>1} A(x,y)$ is the natural join of A with the infinite relation $GT_1(x) = \{ (x) \mid x > 1 \}$, assuming the domain of integers:

| x |
|---|
| 2 |
| 3 |
| 4 |
| ... |

Example 3c. Given relation A from Example 1, selection $\sigma_{x<y} A(x,y)$ is the natural join of A with the infinite relation $LT(x,y) = \{ (x,y) \mid x < y \}$, again assuming the domain of integers:

| x | y |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 1 | 3 |
| 1 | 4 |
| 2 | 4 |
| ... | ... |

3.3 Projection

Projection $\pi_y A(x,y)$ is the generalized union of A and the empty relation Y(y):

$$\pi_y A(x,y) = A(x,y) \Cup Y(y)$$

In order to express the remaining operator, tensor product, we first have to master column renaming.

3.4 Renaming

In classic relational algebra, renaming is a somewhat obscure operation. Textbooks and online lecture notes often include it in the list of basic operations (with a dedicated symbol $\rho$), while others treat it as an implicit manipulation of relations. It is indeed an odd operation: the author is unaware of any other algebraic system in mathematics that has a renaming operation.

The answer to this problem is that renaming $\rho : A(x,y) \to A(x,z)$ can be expressed as the composition of natural join between the given relation A(x,y) and the (potentially infinite) binary identity relation $EQ(y,z) = \{ (y,z) \mid y = z \}$, and projection $\pi_{x,z}$:

$$\rho : A(x,y) \to A(x,z) \;=\; \pi_{x,z}\bigl(A(x,y) \bowtie EQ(y,z)\bigr) \;=\; XZ(x,z) \Cup \bigl(A(x,y) \bowtie EQ(y,z)\bigr)$$

where XZ(x,z) is the empty relation, as in section 3.3.

Example 4. Given relation A(x,y), transitive closure is defined as

$$TC_A(x,y) = A \cup A^2 \cup A^3 \cup \dots$$

where relation powers are defined by induction, with the elementary step

$$A^{i+1} = A^i \cdot A = \pi_{x,y}\Bigl(\bigl(\rho : A^i(x,y) \to A^i(x,s)\bigr) \bowtie \bigl(\rho : A(x,y) \to A(s,y)\bigr)\Bigr)$$

Without renaming, the self join $A \bowtie A$ evaluates to A.
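The reductions of sections 3.2 to 3.4 can be checked mechanically once the infinite predicate relations are cut down to a finite domain. The sketch below rests on stated assumptions: the sample relation `R`, the finite `DOMAIN` standing in for the integers, and all helper names are hypothetical and do not come from the paper.

```python
from itertools import product


def relation(attrs, rows):
    """Hypothetical helper: (attribute set, set of rows as attr/value pairs)."""
    attrs = tuple(attrs)
    return frozenset(attrs), frozenset(frozenset(zip(attrs, row)) for row in rows)


def natural_join(r, s):
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs
    joined = set()
    for t, u in product(r_rows, s_rows):
        dt, du = dict(t), dict(u)
        if all(dt[c] == du[c] for c in common):
            joined.add(frozenset({**dt, **du}.items()))
    return r_attrs | s_attrs, frozenset(joined)


def generalized_union(r, s):
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs

    def project(rows):
        return {frozenset((a, v) for a, v in row if a in common) for row in rows}

    return common, frozenset(project(r_rows) | project(s_rows))


# Assumption: a small finite domain replaces the integers.
DOMAIN = range(1, 5)

# Hypothetical sample relation R(x, y).
R = relation(("x", "y"), [(1, 2), (1, 3), (3, 2)])

# Selection as natural join with a predicate relation (section 3.2).
EQ1 = relation(("x",), [(1,)])
GT1 = relation(("x",), [(v,) for v in DOMAIN if v > 1])
LT = relation(("x", "y"), [(a, b) for a in DOMAIN for b in DOMAIN if a < b])
sel_x_eq_1 = natural_join(R, EQ1)
sel_x_gt_1 = natural_join(R, GT1)
sel_x_lt_y = natural_join(R, LT)

# Projection as generalized union with an empty relation (section 3.3).
Y = relation(("y",), [])
proj_y = generalized_union(R, Y)

# Renaming y -> z: join with the identity relation, then project (section 3.4).
EQ = relation(("y", "z"), [(v, v) for v in DOMAIN])
XZ = relation(("x", "z"), [])
renamed = generalized_union(XZ, natural_join(R, EQ))
```

The finite domain only works here because every value in `R` falls inside `DOMAIN`; with genuinely infinite predicate relations an implementation would have to evaluate the predicate lazily instead of materializing it.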

3.5 Difference

The reduction for the difference operator in [2] involves selection, union, renaming, Cartesian product, and a (natural) extension of the projection operator with aggregation. Leveraging the infinite relation $NEQ(x,y) = \{ (x,y) \mid x \neq y \}$ prompts an alternative implementation via antijoin,

$$A'(x,y) = \pi_{x,y}\Bigl(A(x,y) \bowtie \bigl(\rho : B(x,y) \to B'(x',y')\bigr) \bowtie NEQ(x,x')\Bigr)$$

$$A''(x,y) = \pi_{x,y}\Bigl(A(x,y) \bowtie \bigl(\rho : B(x,y) \to B'(x',y')\bigr) \bowtie NEQ(y,y')\Bigr)$$

which doesn't require aggregation. The difference $A \setminus B$ is then obtained by combining $A'(x,y)$ and $A''(x,y)$.

At this point, we have already proved that natural join and generalized union make up a relationally complete set of operators. Nevertheless, let us demonstrate that tensor product can be expressed via natural join and generalized union as well.

3.6 Tensor Product

Given relations A(x,y) and B(x,y), the tensor product $A \otimes B$ can be produced in 3 steps:

$$A'(x,xx,xy,yx,yy,y) = A(x,y) \bowtie EQ(x,xx) \bowtie EQ(x,xy) \bowtie EQ(y,yx) \bowtie EQ(y,yy)$$

$$B'(x,xx,xy,yx,yy,y) = B(x,y) \bowtie EQ(x,xx) \bowtie EQ(y,xy) \bowtie EQ(x,yx) \bowtie EQ(y,yy)$$

$$A \otimes B = \pi_{xx,xy,yx,yy}\bigl(A'(x,xx,xy,yx,yy,y) \Cup B'(x,xx,xy,yx,yy,y)\bigr)$$

4. LATTICE THEORY PERSPECTIVE

An algebra with two binary operations $\wedge$ and $\vee$ is called a lattice [4] if those operations are idempotent, commutative, associative, and satisfy the absorption law. The lattice operations $\wedge$ and $\vee$ are called meet and join, respectively. There are 2 possibilities for matching the lattice operators against natural join and generalized union, but associating lattice join with natural join would perhaps be the best way to avoid confusion. We write the lattice axioms with the relational operations, natural join $\bowtie$ and generalized union $\Cup$, as follows:

Idempotent laws:
$$A \bowtie A = A$$
$$A \Cup A = A$$

Commutative laws:
$$A \bowtie B = B \bowtie A$$
$$A \Cup B = B \Cup A$$

Associative laws:
$$A \bowtie (B \bowtie C) = (A \bowtie B) \bowtie C$$
$$A \Cup (B \Cup C) = (A \Cup B) \Cup C$$

Absorption laws:
$$A \bowtie (A \Cup B) = A$$
$$A \Cup (A \bowtie B) = A$$

The first three laws for natural join ($\bowtie$) are well known. Their dual counterparts for generalized union ($\Cup$) easily follow from intersection and union being idempotent, commutative, and associative (intersection being applied to columns, and union to rows). Proving the absorption law is just a little bit more involved. For any relation $A'$ whose signature contains the signature of A, the following condition is satisfied:

$$A(x,y) \subseteq A(x,y) \Cup A'(x,y,z)$$

Therefore,

$$A \subseteq A \Cup (A \bowtie B)$$

Conversely, both

$$A(x,y) \Cup \bigl(A(x,y) \bowtie B(y,z)\bigr) = A(x,y) \cup \pi_{x,y}\bigl(A(x,y) \bowtie B(y,z)\bigr)$$

and

$$\pi_{x,y}\bigl(A(x,y) \bowtie B(y,z)\bigr) \subseteq A(x,y)$$

imply

$$A \Cup (A \bowtie B) \subseteq A$$

4.1 Partial order

From the lattice we obtain a partially ordered set by defining

$$A \le B \iff B = A \bowtie B$$

Symmetrically,

$$A \le B \iff A = A \Cup B$$
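The lattice laws and the two equivalent definitions of the partial order can be spot-checked on a handful of concrete relations. This is a sanity check under stated assumptions, not a proof: the helper names, the sample list, and the third relation `C` are hypothetical, while A and B are taken from Example 1.

```python
from itertools import product


def relation(attrs, rows):
    """Hypothetical helper: (attribute set, set of rows as attr/value pairs)."""
    attrs = tuple(attrs)
    return frozenset(attrs), frozenset(frozenset(zip(attrs, row)) for row in rows)


def natural_join(r, s):
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs
    joined = set()
    for t, u in product(r_rows, s_rows):
        dt, du = dict(t), dict(u)
        if all(dt[c] == du[c] for c in common):
            joined.add(frozenset({**dt, **du}.items()))
    return r_attrs | s_attrs, frozenset(joined)


def generalized_union(r, s):
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    common = r_attrs & s_attrs

    def project(rows):
        return {frozenset((a, v) for a, v in row if a in common) for row in rows}

    return common, frozenset(project(r_rows) | project(s_rows))


def lattice_axioms_hold(rels):
    """Check idempotence, commutativity, associativity, and absorption
    for both operators over every triple drawn from rels."""
    pairs = ((natural_join, generalized_union), (generalized_union, natural_join))
    for r, s, t in product(rels, repeat=3):
        for op, dual in pairs:
            if op(r, r) != r:
                return False
            if op(r, s) != op(s, r):
                return False
            if op(r, op(s, t)) != op(op(r, s), t):
                return False
            if op(r, dual(r, s)) != r:
                return False
    return True


def leq(r, s):
    """Partial order of section 4.1: r <= s iff s equals the natural join of r and s."""
    return natural_join(r, s) == s


def orders_agree(rels):
    """Both definitions of <= should give the same answer on every pair."""
    return all(
        leq(r, s) == (generalized_union(r, s) == r)
        for r, s in product(rels, repeat=2)
    )


A = relation(("x", "y"), [(1, 2), (1, 3)])
B = relation(("y", "z"), [(1, 3), (2, 4)])
C = relation(("x", "z"), [(1, 4), (2, 5)])  # hypothetical third relation
samples = [A, B, C, natural_join(A, B), generalized_union(A, B)]
```

Calling `lattice_axioms_hold(samples)` and `orders_agree(samples)` exercises every law on all triples and pairs of the sample relations; under this ordering A lies below $A \bowtie B$, while A and B themselves are incomparable.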

4.2 Least and Greatest Elements

Let us take the meet (generalized union, $\Cup$) of all lattice elements and denote the resulting relation as $\bot$; similarly, denote the join (natural join, $\bowtie$) of all relations as $\top$. Then $\bot \le A \le \top$ for any relation A, and therefore

$$\bot \bowtie A = A, \qquad \bot \Cup A = \bot, \qquad \top \bowtie A = \top, \qquad \top \Cup A = A$$

Clearly, generalized union of all relations implies that

1. the intersection of all relation signatures must be empty, and
2. the union of all possible rows must have at least one row.

Hence $\bot$ is the relation with no columns and one row. (Set semantics allows only one row in a relation with an empty signature.) Likewise, joining all the relations would produce a relation with no rows. In [3] these relations are known as DUM and DEE.

4.3 Distributive and Modular Properties

Unlike set intersection and union, natural join and generalized union don't honor the distributive law. If the relational lattice were distributive, it would be isomorphic to some lattice of sets. Our lattice, however, is a cylindrical sets algebra: while the join of two cylindrical sets is always the same as their set intersection, the meet of two cylindrical sets defined as their set union wouldn't produce a cylindrical set as a result.

Example 5. Given the relations A and B from Example 1, and the relation C

| z |
|---|
| 3 |
| 7 |

the reader can verify that

$$(A \Cup B) \bowtie C \neq (A \bowtie C) \Cup (B \bowtie C)$$

The lattice in Example 5 contains a sublattice isomorphic to $N_5$. By the $M_3$-$N_5$ theorem [4] it follows that it doesn't satisfy the modular property either.

5. CASE STUDY: FORMAL CONCEPT ANALYSIS

Formal Concept Analysis [5] is a method for data analysis, knowledge representation, and information management based on a lattice theory foundation. Unlike relational theory, this niche discipline is little known in the US. On quick examination it turns out that the lattice operations in Formal Concept Analysis are identical to the ones introduced in this paper.

Example 6 ([5]). A formal context of famous animals

| | cartoon | real | tortoise | dog | cat | mammal |
|---|---|---|---|---|---|---|
| Garfield | x | | | | x | x |
| Snoopy | x | | | x | | x |
| Socks | | x | | | x | x |
| Bobby | | x | | x | | x |
| Harriet | | x | x | | | |

produces a concept lattice with the following Hasse diagram.

[Figure: Hasse diagram of the concept lattice. Attribute labels: mammal, real, cartoon, cat, dog, tortoise. Object labels: Garfield, Snoopy, Socks, Bobby, Harriet. One further node is labeled N.]

The lattice elements can be interpreted as relations. Each relation must have at least one column, Name. For each formal attribute element that is greater than the node in question, an additional column is introduced. In Example 6 the relation corresponding to the lattice element N has the signature {Name, ismammal, isreal}. For each object element that is less than the node, the relation contains a row. All the fields are assigned the same value, true, except Name, which holds the object name. The lattice element N in Example 6 is the relation

| Name | ismammal | isreal |
|---|---|---|
| Socks | true | true |
| Bobby | true | true |

Admittedly, the produced relations are rather special. If nothing else, the connection between relational theory and Formal Concept Analysis allows an exchange of methods between the two fields.

REFERENCES

1. ABITEBOUL, S., HULL, R., AND VIANU, V. 1995. Foundations of Databases. Addison-Wesley.
2. HARAKIRI, M., AND TROPASHKO, V. Dualities among Relational Algebra Operators. http://www.dbazine.com/tropashko7.shtml
3. DATE, C. 2003. Introduction to Database Systems. 8th ed. Addison-Wesley.
4. DAVEY, B. A., AND PRIESTLEY, H. A. 2002. Introduction to Lattices and Order. 2nd ed. Cambridge University Press.
5. PRISS, U. 2005. Formal Concept Analysis in Information Science. Annual Review of Information Science and Technology, ARIST, ASIST, vol. 40. http://www.upriss.org.uk/papers/arist.pdf