Lossless Joins, Third Normal Form

Lossless Joins, Third Normal Form (FCDB 3.4-3.5)
Dr. Chris Mayfield, Department of Computer Science, James Madison University
Mar 19, 2018

Decomposition wish list

1. Eliminate redundancy and anomalies
2. Recover the original relation exactly (by joining)
3. Joined relation should satisfy the original FDs

BCNF decomposition: guarantees properties 1 and 2, but not 3
3NF decomposition: guarantees properties 2 and 3, but not 1
No algorithm can guarantee all three for every schema!

Recovering from a decomposition

Suppose R has a schema that violates BCNF.
The BCNF algorithm decomposes R into a set {S1, S2, ..., Sk} of new relations, such that:

1. Each relation Si is in BCNF, and
2. The decomposition of R is a lossless join: R = S1 ⋈ S2 ⋈ ... ⋈ Sk
   Every tuple in R is in S1 ⋈ S2 ⋈ ... ⋈ Sk
   Every tuple in S1 ⋈ S2 ⋈ ... ⋈ Sk is in R

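Lossless join vs lossy join

Consider R(A, B, C) with the FD B → C and tuples (a, b, c) and (d, b, e).
The BCNF decomposition is S(A, B) and T(B, C).
How do we prove that R = S ⋈ T?
Joining S and T can also produce (a, b, e) and (d, b, c), but B → C forces c = e, so these are already tuples of R.
The BCNF decomposition is lossless because of the FDs!

Now consider R(A, B, C) without FDs:

R:   A  B  C
     1  2  3
     4  2  5

Projected onto (A, B) and (B, C):

S:   A  B        T:   B  C
     1  2             2  3
     4  2             2  5

Joining S and T back together gives ...?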
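The two cases on this slide can be tried out directly. The following is a minimal sketch, not part of the original slides: `project` and `natural_join` are hypothetical helper names, and relations are modeled as Python sets of tuples. The second instance, where B → C holds, is made up for contrast.

```python
def project(rows, attrs, onto):
    """Project a set of tuples with schema `attrs` onto the attributes in `onto`."""
    idx = [attrs.index(a) for a in onto]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(left, left_attrs, right, right_attrs):
    """Natural join of two relations; returns (rows, schema)."""
    common = [a for a in left_attrs if a in right_attrs]
    extra = [a for a in right_attrs if a not in left_attrs]
    out = set()
    for l in left:
        for r in right:
            if all(l[left_attrs.index(a)] == r[right_attrs.index(a)] for a in common):
                out.add(l + tuple(r[right_attrs.index(a)] for a in extra))
    return out, left_attrs + extra


def decompose_and_rejoin(rows):
    """Split R(A, B, C) into S(A, B) and T(B, C), then join them back."""
    s = project(rows, ["A", "B", "C"], ["A", "B"])
    t = project(rows, ["A", "B", "C"], ["B", "C"])
    joined, _ = natural_join(s, ["A", "B"], t, ["B", "C"])
    return joined


# Instance from the slide: B -> C does NOT hold (B = 2 maps to C = 3 and C = 5).
no_fd = {(1, 2, 3), (4, 2, 5)}
spurious = decompose_and_rejoin(no_fd) - no_fd
print(sorted(spurious))  # [(1, 2, 5), (4, 2, 3)]

# Made-up instance where B -> C holds: the join gives back exactly R.
with_fd = {(1, 2, 3), (4, 2, 3)}
print(decompose_and_rejoin(with_fd) == with_fd)  # True
```

Without the FD, the join contains every original tuple plus two spurious ones, so the original relation cannot be recovered.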

Dependency Preservation

What does the following relation model?
Teach(dept, course, prof, semester, year)

Example university has the following rules:
Each course is offered in either fall or spring each year: DC → S
Each prof teaches 1 course per semester: PSY → DC

What are the keys? {D, C, P, Y} and {P, S, Y}
Decompose using DC → S: T1(D, C, S) and T2(D, C, P, Y)
How do you enforce PSY → DC? (e.g., with a CHECK constraint)
Not without joining T1 and T2 first
BCNF does not always preserve FDs
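To make the problem concrete, here is a small sketch with made-up rows (the department, course numbers, and professor name are hypothetical). Each table satisfies its own constraints, yet the joined data violates PSY → DC. The helper `fd_conflicts` is an illustrative name, not something from the slides.

```python
# Hypothetical rows, invented for illustration only.
t1 = {("CS", "101", "Fall"), ("CS", "102", "Fall")}                # T1(D, C, S)
t2 = {("CS", "101", "Smith", 2018), ("CS", "102", "Smith", 2018)}  # T2(D, C, P, Y)


def fd_conflicts(rows, lhs_of, rhs_of):
    """Return the LHS values that are paired with more than one RHS value."""
    seen = {}
    for row in rows:
        seen.setdefault(lhs_of(row), set()).add(rhs_of(row))
    return {lhs: rhs for lhs, rhs in seen.items() if len(rhs) > 1}


# DC -> S only mentions attributes of T1, so it can be checked in T1 alone.
dc_s = fd_conflicts(t1, lambda r: r[:2], lambda r: r[2])

# PSY -> DC needs S (stored only in T1) and P, Y (stored only in T2),
# so the check requires joining on (D, C) first.
semester = {(d, c): s for d, c, s in t1}
joined = {(d, c, p, semester[(d, c)], y) for d, c, p, y in t2}
psy_dc = fd_conflicts(joined, lambda r: (r[2], r[3], r[4]), lambda r: r[:2])

print(dc_s)    # {} : no violation visible inside T1
print(psy_dc)  # Smith teaches two courses in Fall 2018
```

Neither table on its own reveals that the same professor has been assigned two courses in one semester; only the join does.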

Third Normal Form (3NF)

R is in Third Normal Form if each nontrivial FD A1 A2 ... An → B1 B2 ... Bm has:
  Either {A1, A2, ..., An} is a superkey for R
  Or each Bi is prime (part of some key of R)
(Relaxed version of BCNF)

Teach(D, C, P, S, Y) has FDs DC → S and PSY → DC
Keys are {D, C, P, Y} and {P, S, Y}
DC → S violates BCNF (since DC ↛ PY, DC is not a superkey)
However, Teach is in 3NF because S is part of a key
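The two-part test above can be written as a short check. This is a sketch under stated assumptions: the candidate keys are supplied by hand (taken from the slides), only the FDs passed in are examined, and `classify_fds` is a hypothetical name. A left side counts as a superkey when it contains one of the given keys.

```python
def classify_fds(fds, keys):
    """Label each nontrivial FD, given the candidate keys of the relation."""
    key_sets = [set(k) for k in keys]
    prime = set().union(*key_sets)          # attributes that appear in some key
    report = {}
    for lhs, rhs in fds:
        new_attrs = set(rhs) - set(lhs)     # right-side attributes not on the left
        if not new_attrs:
            continue                        # trivial FD, nothing to check
        if any(k <= set(lhs) for k in key_sets):
            verdict = "satisfies BCNF and 3NF"
        elif new_attrs <= prime:
            verdict = "violates BCNF, allowed by 3NF"
        else:
            verdict = "violates 3NF"
        report[f"{lhs} -> {rhs}"] = verdict
    return report


# Teach(D, C, P, S, Y) with the keys listed on the slide.
teach = classify_fds([("DC", "S"), ("PSY", "DC")], keys=["DCPY", "PSY"])

# R(A, B, C, D, E) from Example 3.27, with keys {A, B, E} and {A, C, E}.
example_r = classify_fds([("AB", "C"), ("C", "B"), ("A", "D")], keys=["ABE", "ACE"])

for name, verdict in {**teach, **example_r}.items():
    print(name, ":", verdict)
```

For Teach, DC → S is the only FD that breaks BCNF, and it is tolerated by 3NF because S is prime. In the Example 3.27 relation, A → D is a genuine 3NF violation because D belongs to no key.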

More 3NF examples

Teach(dept, course, prof, semester, year)
Each prof teaches 1 course per semester: PSY → DC

What if we change/add other rules?

Offered either in fall or spring of each year, but the semester can change from year to year: DCY → S
  Keys are {P, S, Y} and {D, C, P, Y}; still 3NF
Every time it's offered, each course is taught by at most one prof: DCSY → P
  Keys are {P, S, Y} and {D, C, Y}; still 3NF
Modify the previous constraint: each course is always taught by the same prof: DC → P
  Keys are {P, S, Y} and {D, C, Y}; still 3NF
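The keys listed for each variant can be recomputed with attribute closure. The sketch below uses a brute-force search over attribute subsets, which is fine for five attributes but not meant for large schemas. It assumes the rules accumulate (DCY → S stays in force when the later rules are added), which is what the keys on the slide imply. `closure` and `candidate_keys` are illustrative names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append("".join(combo))
    return keys


SCHEMA = "DCPSY"
variants = {
    "DCY -> S": [("PSY", "DC"), ("DCY", "S")],
    "DCSY -> P added": [("PSY", "DC"), ("DCY", "S"), ("DCSY", "P")],
    "DC -> P instead": [("PSY", "DC"), ("DCY", "S"), ("DC", "P")],
}
keys = {name: candidate_keys(SCHEMA, fds) for name, fds in variants.items()}

for name, found in keys.items():
    print(name, "=>", found)
```

Key attributes are printed in alphabetical order, so {D, C, P, Y} appears as `CDPY` and {D, C, Y} as `CDY`.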

Synthesis algorithm for 3NF

Input: Relation R with set of FDs F
Output: Decomposition of R into 3NF

1. Find a minimal basis G for F
2. For each FD X → A in G, use XA as the schema for one of the new relations
   Drop relations whose schemas are subsets of others
3. If none of the resulting schemas is a superkey for R, add a relation whose schema is a key for R

Read section 3.5.3 to understand why this works!

Example 3.27

Consider R(A, B, C, D, E) with FDs AB → C, C → B, and A → D

1. Is this a minimal basis?
   Find the closure of each left side, using only the other FDs
   Can we eliminate A or B from AB → C?
2. Create relations for each FD
   S1(A, B, C)
   S2(B, C): drop this one; already in S1
   S3(A, D)
3. What are the keys of R? {A, B, E} and {A, C, E}
   Add S4(A, B, E) or S5(A, C, E)
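The three steps of the synthesis algorithm can be sketched in code and run on Example 3.27. This is an illustrative implementation, not the textbook's: the function names are hypothetical, and `find_key` greedily shrinks the full attribute set, so it returns one key rather than all of them (here it happens to pick {A, C, E}).

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (frozenset, frozenset) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_basis(fds):
    """Singleton right sides, no extraneous left-side attributes, no redundant FDs."""
    basis = [(frozenset(l), frozenset(a)) for l, r in fds for a in r]
    for i, (lhs, rhs) in enumerate(basis):
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, basis):
                lhs = smaller                  # attribute `a` was extraneous
                basis[i] = (lhs, rhs)
    basis = list(dict.fromkeys(basis))         # drop exact duplicates
    for fd in list(basis):
        rest = [g for g in basis if g != fd]
        if fd[1] <= closure(fd[0], rest):
            basis = rest                       # FD follows from the others
    return basis


def find_key(attrs, fds):
    """Greedily remove attributes while the rest still determines everything."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.discard(a)
    return frozenset(key)


def synthesize_3nf(attrs, fds):
    all_attrs = set(attrs)
    basis = minimal_basis(fds)                                         # step 1
    schemas = list(dict.fromkeys(lhs | rhs for lhs, rhs in basis))     # step 2
    schemas = [s for s in schemas if not any(s < t for t in schemas)]  # drop subsets
    if not any(closure(s, basis) == all_attrs for s in schemas):       # step 3
        schemas.append(find_key(all_attrs, basis))
    return ["".join(sorted(s)) for s in schemas]


result = synthesize_3nf("ABCDE", [("AB", "C"), ("C", "B"), ("A", "D")])
print(result)  # ['ABC', 'AD', 'ACE']
```

The output corresponds to S1(A, B, C), S3(A, D), and S5(A, C, E). Choosing S4(A, B, E) for the last relation would be equally valid.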

What about 1NF, 2NF, etc.?

http://www.bkent.net/doc/simple5.htm

Summary of normal forms

1st Normal Form (1NF): Each cell contains a single value (no table within a table)
2nd Normal Form (2NF): 1NF + non-prime attributes depend on all key attributes
3rd Normal Form (3NF): 2NF + non-prime attributes depend only on key attributes
  "Each attribute must be a fact about the key, the whole key, and nothing but the key."
Boyce-Codd Normal Form (BCNF): 3NF + every left-hand side (determinant) is a superkey
4th Normal Form (4NF): BCNF + multivalued dependencies are based on superkeys
5th Normal Form (5NF): 4NF + join dependencies are a consequence of superkeys

What about performance?

Normal forms:
  prevent update anomalies and data inconsistencies
  penalize retrieval (i.e., several records instead of one)
There is no obligation to fully normalize all records...

Factors affecting normalization:
  Dependency on the entire key
  Presence of mutual constraints
  Independent vs dependent facts
  Single-valued vs multi-valued