Databases 2011 The Relational Algebra


Databases 2011 Christian S. Jensen Computer Science, Aarhus University

What is an Algebra?
An algebra consists of
  values
  operators
  rules
Closure: operations yield values
Examples:
  integers with +, −, ×
  sets with ∪, ∩, \
  matrices with +, ×
  functions with ∘, ⁻¹
  relations with query operators

Mathematical Relations
An n-ary relation on a set S is a subset of S^n
Examples:
  < is a binary relation on ℝ, a subset of ℝ × ℝ
    { (1.2, 3.4), (34, 117.363), (−53, 0.1234), ... }
  divides is a binary relation on ℕ, a subset of ℕ × ℕ
    { (2, 4), (3, 9), (3, 12), (17, 34), (1237, 21029), ... }
  negative is a binary relation on ℤ, a subset of ℤ × ℤ
    { (3, −3), (−17, 17), (0, 0), (2, −2), (−2, 2), (87, −87), ... }
  sum is a ternary relation on ℕ, a subset of ℕ × ℕ × ℕ
    { (3, 5, 8), (23, 14, 37), (0, 123, 123), (42, 87, 129), ... }
  married to is a binary relation on people
    { (Hillary, Bill), (Bill, Hillary), (Angelina, Brad), ... }

Tables as Relations
A database relation on a data set D consists of
  a schema of attribute names (a_1, a_2, ..., a_n)
  a finite n-ary relation on D, a subset of D^n
A relation is like a table where
  all columns have the same generic type
  no duplicates are allowed
  no other constraints are imposed
We implicitly allow permutations of the attributes

Relational Operators
Database relations form an algebra with the operators
  union: ∪
  intersection: ∩
  difference: \
  projection: π
  renaming: ρ
  selection: σ
  Cartesian product: ×
  natural join: ⋈
These provide an abstract model of database queries

Union, Intersection, Difference
The arguments must have the same schema
The result again has that schema
  R ∪ S
  R ∩ S
  R \ S
They compute the set operations on the relations

Projection
π_{a_1,...,a_n}(R)
Assume the schema of R is (a_1, ..., a_n, b_1, ..., b_m)
The schema of the result is (a_1, ..., a_n)
The result relation is
  { (d_1, ..., d_n) | (d_1, ..., d_{n+m}) ∈ R }

Renaming
ρ_{a→b}(R)
The name a must occur as a_i in the schema of R
The name b must not occur in the schema of R
Schema of the result: (a_1, ..., a_{i-1}, b, a_{i+1}, ..., a_n)
The result relation is unchanged
ρ_{a→b, c→d, e→f}(R) = ρ_{a→b}(ρ_{c→d}(ρ_{e→f}(R)))

Selection
σ_C(R)
C is a condition on the attributes of R
The resulting schema is unchanged
The relation part is:
  { r | r ∈ R ∧ C(r) }

Cartesian Product
R × S
Assume
  R has schema (a_1, ..., a_m)
  S has schema (b_1, ..., b_n)
The new schema is (a_1, ..., a_m, b_1, ..., b_n)
The relation part is
  { (c_1, ..., c_{m+n}) | (c_1, ..., c_m) ∈ R ∧ (c_{m+1}, ..., c_{m+n}) ∈ S }

Natural Join
R ⋈ S
Assume
  R has schema (a_1, ..., a_k, c_1, ..., c_n)
  S has schema (c_1, ..., c_n, b_1, ..., b_m)
  {a_i} ∩ {b_i} = ∅
The new schema is (a_1, ..., a_k, c_1, ..., c_n, b_1, ..., b_m)
The relation part is
  { (d_1, ..., d_k, e_1, ..., e_n, f_1, ..., f_m) | (d_1, ..., d_k, e_1, ..., e_n) ∈ R ∧ (e_1, ..., e_n, f_1, ..., f_m) ∈ S }
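
The operator definitions above can be tried out in a few lines of Python. This is only a sketch under assumptions of our own: a relation is modelled as a Python set of rows, each row is a frozenset of (attribute, value) pairs, and the helper names (relation, project, rename, select, natural_join) are hypothetical, not part of the slides. Union, intersection and difference are then Python's own |, & and - on sets.

```python
# Sketch: a relation is a set of rows; a row is a frozenset of
# (attribute, value) pairs, so attribute order is irrelevant and
# duplicate rows collapse automatically.

def relation(*rows):
    """Build a relation from dicts mapping attribute names to values."""
    return {frozenset(r.items()) for r in rows}

def project(rel, attrs):
    """Keep only the attributes in attrs (duplicates collapse)."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}

def rename(rel, old, new):
    """Rename attribute old to new; the values are unchanged."""
    return {frozenset((new if a == old else a, v) for a, v in row) for row in rel}

def select(rel, cond):
    """Keep the rows for which cond (a function on a dict) is true."""
    return {row for row in rel if cond(dict(row))}

def natural_join(r, s):
    """Combine rows that agree on all attributes they have in common."""
    out = set()
    for x in r:
        dx = dict(x)
        for y in s:
            dy = dict(y)
            if all(dx[a] == dy[a] for a in dx.keys() & dy.keys()):
                out.add(frozenset({**dx, **dy}.items()))
    return out

# Hypothetical example data: R has schema (a, c), S has schema (c, b).
R = relation({"a": 1, "c": 10}, {"a": 2, "c": 20})
S = relation({"c": 10, "b": "x"}, {"c": 30, "b": "y"})

joined = natural_join(R, S)
print(sorted(sorted(row) for row in joined))
```

When the two schemas share no attribute, the agreement test is vacuously true and natural_join returns the Cartesian product.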

Derived Operators
R ∩ S = R ⋈ S = R \ (R \ S)    when the schemas are identical
R × S = R ⋈ S                   when the schemas are disjoint
R ⋈_Θ S = σ_Θ(R × S)            the theta join
SELECT DISTINCT x_1, ..., x_k FROM R_1, ..., R_n WHERE C
  = π_{x_1,...,x_k}(σ_C(R_1 × ... × R_n))

Query Trees
In which meetings do the owners participate?
  π_{what,meetid}(σ_{status='a'}(ρ_{owner→userid}(Meetings) ⋈ ρ_{pid→userid}(Participants)))
The same query as a tree:
  π_{what,meetid}
    σ_{status='a'}
      ⋈
        ρ_{owner→userid}
          Meetings
        ρ_{pid→userid}
          Participants
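
To make the query concrete, the sketch below evaluates it in Python over a few invented rows. The attribute names what, meetid, status, owner and pid come from the slide; the rows themselves, and the assumption that meetid occurs in both tables while status belongs to Participants, are hypothetical.

```python
# Hypothetical sample data; only the attribute names come from the slide.
meetings = [
    {"meetid": 1, "what": "Planning", "owner": "ann"},
    {"meetid": 2, "what": "Review", "owner": "bob"},
]
participants = [
    {"meetid": 1, "pid": "ann", "status": "a"},
    {"meetid": 1, "pid": "bob", "status": "a"},
    {"meetid": 2, "pid": "bob", "status": "d"},
    {"meetid": 2, "pid": "ann", "status": "a"},
]

# After renaming owner and pid to userid, the natural join matches rows
# on the common attributes meetid and userid.
result = {
    (m["what"], m["meetid"])            # projection on what, meetid
    for m in meetings
    for p in participants
    if m["meetid"] == p["meetid"]       # join condition on meetid
    and m["owner"] == p["pid"]          # join condition on userid
    and p["status"] == "a"              # selection status = 'a'
}
print(result)
```

With these rows only meeting 1 qualifies: the owner of meeting 2 is listed as a participant, but with a status other than 'a'.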

Limitations
The relational algebra cannot answer all queries
Flights
  from          to
  Copenhagen    Madrid
  Rome          London
  Madrid        Athens
  Athens        Rome
  ...           ...
Which cities can be reached from Copenhagen in one or more flights?

Transitive Closure
The transitive closure of a binary relation R:
  R^+ = { (x_1, x_k) | k ≥ 2 and (x_i, x_{i+1}) ∈ R for all i = 1, ..., k-1 }
No relational algebra expression computes R^+
No SQL query can handle it either
  unless SQL is extended with recursion
  or a special closure operator is added (some DBMSs do support this)
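
Although no single relational algebra expression computes the closure, repeating a join step until nothing new appears does. The sketch below shows that iteration in Python; the function name transitive_closure is our own, and the flight pairs are the cities of the Flights example read as (from, to) rows, which is an assumption about how the table was laid out.

```python
def transitive_closure(edges):
    """Repeat one join step until no new pairs are produced (a fixpoint)."""
    closure = set(edges)
    while True:
        # One algebra step: join the pairs found so far with the original
        # relation on the intermediate city, then project on the endpoints.
        step = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if step <= closure:
            return closure
        closure |= step

# Assumed reading of the Flights table as (from, to) pairs.
flights = {
    ("Copenhagen", "Madrid"),
    ("Rome", "London"),
    ("Madrid", "Athens"),
    ("Athens", "Rome"),
}

reachable = {to for (frm, to) in transitive_closure(flights) if frm == "Copenhagen"}
print(sorted(reachable))
```

The number of iterations depends on the data, which is why a fixed algebra expression cannot express the query.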

Algebraic Laws (1/3)
x ∪ x = x                              idempotence
x ∪ y = y ∪ x                          commutativity
x ∩ x = x                              idempotence
x ∩ y = y ∩ x                          commutativity
x ∪ (y ∪ z) = (x ∪ y) ∪ z              associativity
x ∩ (y ∩ z) = (x ∩ y) ∩ z              associativity
x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z)        distributivity

Algebraic Laws (2/3)
σ_C(x ∪ y) = σ_C(x) ∪ σ_C(y)                     distributivity
σ_C(x \ y) = σ_C(x) \ σ_C(y) = σ_C(x) \ y        distributivity
σ_C(x ∩ y) = σ_C(x) ∩ σ_C(y)                     distributivity
σ_C(x ⋈ y) = σ_C(x) ⋈ σ_C(y)                     distributivity
σ_C(x) = σ_C(σ_C(x))                             idempotence
σ_C(σ_D(x)) = σ_D(σ_C(x))                        commutativity
σ_{C∧D}(x) = σ_C(σ_D(x)) = σ_C(x) ∩ σ_D(x)       splitting
σ_{C∨D}(x) = σ_C(x) ∪ σ_D(x)                     splitting
σ_{¬C}(x) = x \ σ_C(x)                           splitting
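
A quick way to build trust in such laws is to check them on a small instance. The sketch below does this in Python for the selection laws over the set operators; the relations x and y and the conditions C and D are hypothetical, and passing on one instance is of course not a proof.

```python
# Rows are plain tuples (a, b); a relation is a Python set of rows.
x = {(a, b) for a in range(4) for b in range(4)}
y = {(a, b) for (a, b) in x if a != b}

def select(cond, rel):
    """Selection: keep the rows that satisfy cond."""
    return {row for row in rel if cond(row)}

C = lambda row: row[0] < 2         # hypothetical condition C
D = lambda row: row[1] % 2 == 0    # hypothetical condition D

# Distributivity over union, difference and intersection.
assert select(C, x | y) == select(C, x) | select(C, y)
assert select(C, x - y) == select(C, x) - select(C, y) == select(C, x) - y
assert select(C, x & y) == select(C, x) & select(C, y)

# Idempotence and commutativity.
assert select(C, select(C, x)) == select(C, x)
assert select(C, select(D, x)) == select(D, select(C, x))

# Splitting conjunction, disjunction and negation.
assert select(lambda r: C(r) and D(r), x) == select(C, select(D, x)) == select(C, x) & select(D, x)
assert select(lambda r: C(r) or D(r), x) == select(C, x) | select(D, x)
assert select(lambda r: not C(r), x) == x - select(C, x)

laws_checked = True
print("all selection laws hold on this instance")
```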

Algebraic Laws (3/3)
π_a(x ∪ y) = π_a(x) ∪ π_a(y)                     distributivity (does not hold for ∩ and \)
ρ_{a→b}(x ∪ y) = ρ_{a→b}(x) ∪ ρ_{a→b}(y)         distributivity
ρ_{a→b}(x \ y) = ρ_{a→b}(x) \ ρ_{a→b}(y)         distributivity
ρ_{b→c}(ρ_{a→b}(x)) = ρ_{a→c}(x)                 cancellation
ρ_{a→b}(ρ_{c→d}(x)) = ρ_{c→d}(ρ_{a→b}(x))        commutativity

Zero and Unit
Define 0 = the empty relation (for each schema)
Define 1 as follows
  the schema is empty
  the relation contains the single empty row
0 ∪ x = x ∪ 0 = x
0 ⋈ x = x ⋈ 0 = 0
1 ⋈ x = x ⋈ 1 = x
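
The zero and unit can be checked directly in a small Python model. In this sketch, which rests on our own modelling assumptions, a row is a frozenset of (attribute, value) pairs, 0 is the empty set of rows, and 1 is the set holding only the empty row; union is read as ∪ and the product-like operator as natural join.

```python
def natural_join(r, s):
    """Combine rows that agree on all attributes they have in common."""
    out = set()
    for x in r:
        dx = dict(x)
        for y in s:
            dy = dict(y)
            if all(dx[a] == dy[a] for a in dx.keys() & dy.keys()):
                out.add(frozenset({**dx, **dy}.items()))
    return out

ZERO = set()            # the empty relation
ONE = {frozenset()}     # empty schema, exactly one (empty) row

# Hypothetical relation x with schema (a, b).
x = {
    frozenset({"a": 1, "b": 2}.items()),
    frozenset({"a": 3, "b": 4}.items()),
}

zero_is_union_identity = (ZERO | x == x) and (x | ZERO == x)
zero_annihilates_join = natural_join(ZERO, x) == ZERO == natural_join(x, ZERO)
one_is_join_identity = natural_join(ONE, x) == x == natural_join(x, ONE)
print(zero_is_union_identity, zero_annihilates_join, one_is_join_identity)
```

The empty row shares no attribute with any other row, so it joins with every row of x and adds nothing to it.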

Division

Division Example Completed student task Fred Database1 Fred Database2 Fred Compiler1 Eugene Database1 Eugene Compiler1 Eugene Compiler2 Sara Database1 Sara Database2 John Usability1 ddb task Database1 Database2 Completed ddb student Fred Sara Those students that have completed all the ddb tasks 21
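
Division can be derived from the other operators. A standard formulation is R ÷ S = π_student(R) \ π_student((π_student(R) × S) \ R): take every candidate student, pair each with every required task, and discard the students for whom some pair is missing. The sketch below follows that formula in Python, using rows taken from the Completed and ddb tables of the example; the variable names are our own.

```python
# Rows from the Completed (student, task) table of the example.
completed = {
    ("Fred", "Database1"), ("Fred", "Database2"), ("Fred", "Compiler1"),
    ("Eugene", "Database1"), ("Eugene", "Compiler1"),
    ("Sara", "Database1"), ("Sara", "Database2"),
    ("John", "Usability1"),
}
ddb = {"Database1", "Database2"}    # the ddb tasks

# Projection of Completed on student: every candidate.
candidates = {student for (student, _) in completed}

# (candidates x ddb) \ Completed: required pairs that are missing.
missing = {(s, t) for s in candidates for t in ddb} - completed

# Candidates with no missing pair have completed all ddb tasks.
result = candidates - {student for (student, _) in missing}
print(sorted(result))
```

The result agrees with the table "Completed ÷ ddb" shown in the example.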


Algebraic Query Optimization
Rewritings may improve efficiency:
  (A ⋈ B) ⋈ C  →  A ⋈ (B ⋈ C)
    A has 10^6 rows, B has 10^6 rows, C has 10 rows
    the intermediate result A ⋈ B has 10^12 rows, while B ⋈ C has 10 rows
  σ_C(A ⋈ B)  →  σ_C(A) ⋈ σ_C(B)
Depends on the predicates (selectivities) and the specific instances
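
The effect of join order on intermediate sizes can be reproduced at a smaller scale. The sketch below is a hypothetical instance, not the one on the slide: A and B have 200 rows each with disjoint schemas, so A ⋈ B degenerates into a Cartesian product, while C has 10 rows and shares an attribute with each of them.

```python
def natural_join(r, s):
    """Nested-loop natural join over lists of dict rows."""
    out = []
    for x in r:
        for y in s:
            if all(x[k] == y[k] for k in x.keys() & y.keys()):
                out.append({**x, **y})
    return out

n = 200                                        # scaled down from 10^6
A = [{"a": i} for i in range(n)]               # schema (a)
B = [{"b": j} for j in range(n)]               # schema (b)
C = [{"a": i, "b": i} for i in range(10)]      # schema (a, b), 10 rows

ab = natural_join(A, B)        # disjoint schemas: n * n rows
left = natural_join(ab, C)     # (A join B) join C

bc = natural_join(B, C)        # only 10 rows
right = natural_join(A, bc)    # A join (B join C)

def as_set(rows):
    return {frozenset(r.items()) for r in rows}

print(len(ab), len(bc), len(left), len(right))
```

Both plans return the same 10 rows, but the first materialises 40000 intermediate rows and the second only 10.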

Rules of Thumb
Push selections down the expression tree
Push projections down the expression tree
Order joins based on size estimates
In general, search for a good expression tree
  use heuristics
  use statistics: table sizes, distinct values for attributes, histograms, etc.

Bag Algebra
Allows relations to contain duplicate entries
Sets are replaced by bags
  The bag versions of ∪, ∩, and \ count copies
  The bag versions of π, σ, and ⋈ keep duplicates
A better match with real-life SQL than sets
Still does not account for the ordering of the tuples
  SQL offers some support for ordering
  Tuples in a relation are stored on disk in some order


Algebraic Laws for Bags
Fewer algebraic laws are valid for the bag algebra
Counterexamples:
  x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z)
  σ_{C∨D}(x) = σ_C(x) ∪ σ_D(x)
  both fail for x, y, z = the relation with the single attribute a and the single row 42, and C, D = true
Beware when optimizing bag queries!
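
The two counterexamples can be replayed with Python's collections.Counter, which behaves like a bag. The sketch assumes that bag union adds multiplicities (Counter +, as in SQL's UNION ALL) and that bag intersection takes the minimum (Counter &); the select helper is our own.

```python
from collections import Counter

# x, y, z: one attribute a, a single row with value 42, one copy each.
x = Counter({(42,): 1})
y = Counter({(42,): 1})
z = Counter({(42,): 1})

def select(cond, bag):
    """Bag selection: keep qualifying rows with their multiplicities."""
    return Counter({row: n for row, n in bag.items() if cond(row)})

C = lambda row: True    # C = true
D = lambda row: True    # D = true

# Distributivity of intersection over union fails.
lhs1 = x & (y + z)          # min(1, 1 + 1) = 1 copy
rhs1 = (x & y) + (x & z)    # 1 + 1 = 2 copies

# Splitting a disjunction fails.
lhs2 = select(lambda r: C(r) or D(r), x)    # 1 copy
rhs2 = select(C, x) + select(D, x)          # 2 copies

print(lhs1[(42,)], rhs1[(42,)], lhs2[(42,)], rhs2[(42,)])
```

In both cases the right-hand side counts the row twice, so an optimizer that applies the set law would change the number of duplicates in the answer.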