Relational Algebra & Calculus Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke 1
Outline v Conceptual Design: ER model v Logical Design: ER to relational model v Querying and manipulating data Practical language: SQL Declarative: say what you want, not how to get it Formal languages: Relational Algebra and Calculus Mathematical foundation for query processing 2
What is an Algebra? v A mathematical system consisting of: Operands: variables or values from which new values can be constructed. Operators: symbols denoting procedures that construct new values from given values. 3
What is Relational Algebra v Relational Algebra: Operands are relations. Each operator takes 1 or 2 relations and produces a relation. v Closure property: relational algebra is closed under the relational model. Relational operators can be arbitrarily composed! 4
Relational Algebra v Basic operations: Selection ( σ ) Selects a subset of rows from a relation. Projection ( π ) Deletes unwanted columns from a relation. Cross-product ( ) Allows us to combine two relations. Set-difference ( - ) Tuples in reln. 1, but not in reln. 2. Union ( ) Tuples in reln. 1 or in reln. 2. v Additional operations: Join ( ), Intersection ( ), Division ( / ), Renaming ( ρ ) Can be derived from basic operators. Not essential, but useful for writing simpler queries. 5
Example Instances Sailors S1 S2 22 dustin 7 45.0 31 lubber 8 55.5 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 R1 Reserves sid bid day 22 101 10/10/96 58 103 11/12/96 6
Projection v Retain only attributes in the projection list; delete others. v Schema of result contains exactly the fields in projection list. S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 sname yuppy 9 lubber 8 guppy 5 rusty 10 rating π ( S2) sname, rating 7
Projection (contd.) S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 age 35.0 55.5 π age ( S2) v Projection operator has to eliminate duplicates! Real systems typically don t do duplicate elimination unless the user explicitly asks for it. (Why not?) 8
Selection v Select rows that satisfy the selection condition; discard others. v Schema of result identical to schema of input. S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 28 yuppy 9 35.0 σ rating S >8 ( 2) 9
Selection (contd.) S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 σ (rating > 5 rating < 5) age < 50 (S2) 10
A Simple Example of Composition v Composition: result relation of an operator can be the input to another operator. 28 yuppy 9 35.0 σ rating S >8 ( 2) sname rating yuppy 9 rusty 10 π σ ( ( S )) sname, rating rating>8 2 11
Union, Intersection, Set-Difference v v v Set operations: Union ( ) Intersection ( ) Set difference ( - ) Take two input relations, which must be union-compatible: Same number of fields. Corresponding fields have the the same type. What is the schema of result? S1 22 dustin 7 45.0 31 lubber 8 55.5 S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 12
Example Set Operations S1 22 dustin 7 45.0 31 lubber 8 55.5 S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 22 dustin 7 45.0 31 lubber 8 55.5 44 guppy 5 35.0 28 yuppy 9 35.0 S1 S2 Duplicate elimination: remove tuples that have same values in all attributes. 13
Example Set Operations S1 22 dustin 7 45.0 31 lubber 8 55.5 S2 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 22 dustin 7 45.0 S1 S2 31 lubber 8 55.5 S1 S2 Duplicates? 14
Cross (Cartesian) Product v S1 R1: Each row of S1 is paired with each row of R1. S1 22 dustin 7 45.0 31 lubber 8 55.5 R1 sid bid day 22 101 10/10/96 58 103 11/12/96 S1 R1 (sid) sname rating age (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 22 101 10/10/96 58 103 11/12/96 15
Cross-Product (contd.) (sid) sname rating age (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 22 101 10/10/96 58 103 11/12/96 v Result schema inherits all fields of S1 and R1. Conflict: Both S1 and R1 have a field called sid. Renaming operator: ρ ( C( 1 sid1, 5 sid2), S1 R1) What if we want to find pairs s.t. S1.sid = S2.sid? 16
Join v Condition (theta) Join: R θ S = σ θ (R S) (sid) sname rating age (sid) bid day 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 58 103 11/12/96 S 1 R 1 S1. sid < R1. sid Result schema same as that of cross-product. But often fewer tuples, more efficient for computation. 17
Join v Equi-Join: A special case of condition join where the condition θ contains only equalities. bid day 22 dustin 7 45.0 101 10/10/96 103 11/12/96 S1 R1 sid Result schema contains only one copy of fields for which equality is specified. v Natural Join ( R w v S ): equi-join on all common fields. 18
Self-Join S1 22 dustin 7 45.0 31 lubber 8 55.5 S2 22 dustin 7 45.0 31 lubber 8 55.5 ρ(s1,s) ρ(s2,s) S1.rating>S2.rating S1.age<S2.age 19
Example Schema v Sailors(sid: integer, sname: string, rating: integer, age: integer) v Boats(bid: integer, color: string) v Reserves(sid: integer, bid: integer, day: date) 20
Find names of sailors who ve reserved boat #103 v Sailors(sid: integer, sname: string, rating: integer, age: integer) v Boats(bid: integer, color: string) v Reserves(sid: integer, bid: integer, day: date) v How many relations do we need to access? If more than 1, use cross-products or joins. v Filtering condition? Use selection. v Attributes to return? Use projection. 21
Find names of sailors who ve reserved boat #103 v Solution 1: π (( σ Re serves ) Sailors ) sname bid =103 v Solution 2: ρ ( Temp1, σ Re serves) bid =103 ρ ( Temp2, Temp1 Sailors) π sname ( Temp2) v Solution 3: π sname( σ (Re serves Sailors)) bid =103 Algebraic equivalence! 22
Find names of sailors who ve reserved a red boat v Sailors(sid: integer, sname: string, rating: integer, age: integer) v Boats(bid: integer, color: string) v Reserves(sid: integer, bid: integer, day: date) v Boat color is only available in Boats; so need an extra join: π sname (( σ color = ' red ' Boats) Re serves Sailors) 23
Find names of sailors who ve reserved a red or a green boat v Sailors(sid: integer, sname: string, rating: integer, age: integer) v Boats(bid: integer, color: string) v Reserves(sid: integer, bid: integer, day: date) 24
Find names of sailors who ve reserved a red or a green boat v Can identify all red or green boats, then find sailors who ve reserved one of these boats: ρ ( Tempboats, ( σ )) color = ' red ' color = ' green' Boats π sname ( Tempboats Re serves Sailors) v Can also define Tempboats using union. How? 25
Find names of sailors who ve reserved a red boat and a green boat v A single selection won t work. Why? v Instead, intersect sailors who ve reserved red boats and sailors who ve reserved green boats. sid is a key for Sailors ρ ( Tempred, π (( σ sid color = ' red ' ρ ( Tempgreen, π (( σ sid color = ' green' Boats) Re serves)) Boats) Re serves)) π sname (( Tempred Tempgreen) Sailors) Can we project onto name and then do the intersection? 26
Find the names of sailors who ve reserved all boats 27
Relational Calculus v Query has the form: ' ) ( ) *! # x1, x2,..., xn p x1, x2,..., xn # " $ + & ), & % ) - domain variables, or constants formula v Answer includes all tuples x1, x2,..., xn that make the formula! # p x1, x2,..., xn # " $ & & % be true. 28
Formulas v Formula is recursively defined: Atomic formulas: retrieving tuples from relations, or making comparisons of values Logical connectives:,, Quantifiers:, (optional) o x F(x) is equivalent to x F(x) 29
Find names of sailors rated > 7 who have reserved boat #103 Relational Algebra: π (( σ Reserves) ( σ sname bid = 103 rating> 7 Sailors)) Relational Calculus: {X sname X sid, X rating, X age Sailors(X sid, X sname, X rating, X age ) X rating > 7 X bid, X day Reserves(X sid, X bid, X day ) X bid =103 } v Where is the join? Use to find a tuple in Reserves that `joins with the Sailors tuple under consideration. 30
Find names of sailors who ve reserved all boats Sailors(sid: integer, sname: string, rating: integer, age: integer) Boats(bid: integer, color: string) Reserves(sid: integer, bid: integer, day: date) {X sname X sid, X rating, X age X sid, X sname, X rating, X age Sailors X bid, X color Boats ( X day X sid, X bid, X day Reserves) } v To find sailors who ve reserved all red boats: {X sname X sid, X rating, X age X sid, X sname, X rating, X age Sailors X bid, X color Boats (X color ʹ redʹ X day X sid, X bid, X day Reserves) } 31
Unsafe Queries, Expressive Power v Some syntactically correct calculus queries can have an infinite number of answers! Such queries are called unsafe. ( " %, * e.g., S $ S Sailors' * ) - * + $ # v Equivalence result: every query that can be expressed in relational algebra can be expressed as a safe query in relational calculus; the converse is also true. ' & *. v Query language (e.g., SQL) can express every query that is expressible in relational algebra/calculus. 32
Find names of sailors who ve reserved all boats {X sname X sid, X rating, X age X sid, X sname, X rating, X age Sailors X bid, X color Boats ( X day X sid, X bid, X day Reserves) } {X sname X sid, X rating, X age X sid, X sname, X rating, X age Sailors X bid, X color Boats ( X day X sid, X bid, X day Reserves) } 33
Find the names of sailors who ve reserved all boats v Step 1: for each sailor, check if there exists a boat that he has not reserved (called formula F). ρ(s _ neg, π sid ( (π sid Re serves) (π bid Boats) (π sid,bid Re serves))) v Step 2: find sailors for which F is not true and retrieve their names π sname ( (π sid Re serves S _ neg) Sailors ) - : the only way to express negation in relational algebra! 34