Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Example of bad design
BAD
  Title    Director       Theatre    Address        Time   Price
  Inferno  Ron Howard     Vue        Omni Centre    20:00  11.50
  Inferno  Ron Howard     Vue        Omni Centre    22:30  10.50
  Inferno  Ron Howard     Odeon      Lothian Rd     20:00  10.00
  Inferno  Ron Howard     Cineworld  Fountain Park  18:20   9.50
  Inferno  Ron Howard     Cineworld  Fountain Park  21:00  11.00
  Trolls   Mike Mitchell  Vue        Omni Centre    16:10   9.50
  Trolls   Mike Mitchell  Vue        Omni Centre    19:30  10.00
  Trolls   Mike Mitchell  Odeon      Lothian Rd     15:00   8.50
  Trolls   Mike Mitchell  Cineworld  Fountain Park  17:15   9.00
FDs: { Title → Director;  Theatre, Title, Time → Price;  Theatre → Address }

Why is BAD bad?
Redundancy
  - Many facts are repeated
  - For every showing we list both director and title
  - For every movie playing we repeat the address
Update anomalies
  - Address must be changed for all movies and showtimes
  - If a movie stops playing, the title-director association is lost
  - Cannot add a movie before it starts playing

Good design
Movies: Title → Director
  Title    Director
  Inferno  Ron Howard
  Trolls   Mike Mitchell
Theatres: Theatre → Address
  Theatre    Address
  Vue        Omni Centre
  Odeon      Lothian Rd
  Cineworld  Fountain Park
Showings: Theatre, Title, Time → Price
  Theatre    Title    Time   Price
  Vue        Inferno  20:00  11.50
  Vue        Inferno  22:30  10.50
  Odeon      Inferno  20:00  10.00
  Cineworld  Inferno  18:20   9.50
  Cineworld  Inferno  21:00  11.00
  Vue        Trolls   16:10   9.50
  Vue        Trolls   19:30  10.00
  Odeon      Trolls   15:00   8.50
  Cineworld  Trolls   17:15   9.00

Why is GOOD good?
No redundancy
  - Every FD defines a key
No information loss
  - Movies = π_{Title,Director}(BAD)
  - Theatres = π_{Theatre,Address}(BAD)
  - Showings = π_{Theatre,Title,Time,Price}(BAD)
  - BAD = Movies ⋈ Theatres ⋈ Showings
No constraints are lost
  - All of the original FDs appear as constraints in the new tables
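The "no information loss" claim can be checked directly on the example data. The following is a minimal sketch, not part of the original slides: the helper names `project` and `natural_join` are made up for illustration, and relations are modelled as Python sets of tuples. It projects BAD onto the three GOOD tables and joins them back.

```python
# Rows of the BAD table from the slides, as tuples in column order COLS.
COLS = ("Title", "Director", "Theatre", "Address", "Time", "Price")
BAD = {
    ("Inferno", "Ron Howard", "Vue", "Omni Centre", "20:00", "11.50"),
    ("Inferno", "Ron Howard", "Vue", "Omni Centre", "22:30", "10.50"),
    ("Inferno", "Ron Howard", "Odeon", "Lothian Rd", "20:00", "10.00"),
    ("Inferno", "Ron Howard", "Cineworld", "Fountain Park", "18:20", "9.50"),
    ("Inferno", "Ron Howard", "Cineworld", "Fountain Park", "21:00", "11.00"),
    ("Trolls", "Mike Mitchell", "Vue", "Omni Centre", "16:10", "9.50"),
    ("Trolls", "Mike Mitchell", "Vue", "Omni Centre", "19:30", "10.00"),
    ("Trolls", "Mike Mitchell", "Odeon", "Lothian Rd", "15:00", "8.50"),
    ("Trolls", "Mike Mitchell", "Cineworld", "Fountain Park", "17:15", "9.00"),
}

def project(rows, cols, keep):
    """Relational projection: keep only the named columns, dropping duplicates."""
    idx = [cols.index(c) for c in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(rows1, cols1, rows2, cols2):
    """Natural join on all shared column names; returns (rows, cols)."""
    shared = [c for c in cols1 if c in cols2]
    extra = [c for c in cols2 if c not in cols1]
    rows = {
        r1 + tuple(r2[cols2.index(c)] for c in extra)
        for r1 in rows1
        for r2 in rows2
        if all(r1[cols1.index(c)] == r2[cols2.index(c)] for c in shared)
    }
    return rows, tuple(cols1) + tuple(extra)

M_COLS = ("Title", "Director")
T_COLS = ("Theatre", "Address")
S_COLS = ("Theatre", "Title", "Time", "Price")
movies = project(BAD, COLS, M_COLS)
theatres = project(BAD, COLS, T_COLS)
showings = project(BAD, COLS, S_COLS)

# Movies and Theatres share no column, so their join is a cross product;
# joining with Showings then keeps only the real showings.
joined, joined_cols = natural_join(movies, M_COLS, theatres, T_COLS)
joined, joined_cols = natural_join(joined, joined_cols, showings, S_COLS)
rejoined = project(joined, joined_cols, COLS)  # restore BAD's column order
print(rejoined == BAD)
```

The check only confirms losslessness for this one instance; the general definition quantifies over every relation that satisfies the FDs.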

Boyce-Codd Normal Form (BCNF)
Problems with bad designs are caused by FDs X → Y where X is not a key
A relation with FDs F is in BCNF if for every X → Y in F
  - Y ⊆ X (the FD is trivial), or
  - X is a key
A database is in BCNF if all relations are in BCNF
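The BCNF condition can be tested mechanically using attribute closures. The sketch below is an illustration under stated assumptions: "X is a key" is read as "the closure of X contains every attribute", FDs are modelled as pairs of frozensets, and the names `closure`, `is_key`, `bcnf_violations` and `fd` are hypothetical helpers, not taken from the slides.

```python
def closure(attrs, fds):
    """Attribute closure C_F(attrs): all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_key(attrs, universe, fds):
    # Assumption: "key" means the closure covers the whole attribute set.
    return closure(attrs, fds) == set(universe)

def bcnf_violations(universe, fds):
    """FDs X -> Y in fds that are neither trivial nor have a key on the left."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not is_key(lhs, universe, fds)]

def fd(lhs, rhs):
    """Build an FD from two iterables of attribute names."""
    return frozenset(lhs), frozenset(rhs)

BAD_U = {"Title", "Director", "Theatre", "Address", "Time", "Price"}
BAD_F = [
    fd(["Title"], ["Director"]),
    fd(["Theatre", "Title", "Time"], ["Price"]),
    fd(["Theatre"], ["Address"]),
]

violations = bcnf_violations(BAD_U, BAD_F)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")

# The Movies table of the good design has no violation.
movies_ok = bcnf_violations({"Title", "Director"}, [BAD_F[0]]) == []
print(movies_ok)
```

On BAD, the two FDs whose left-hand sides are Title and Theatre are reported, which matches the two sources of redundancy named earlier.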

Decompositions
Given a set of attributes U and a set of FDs F, a decomposition of (U, F) is a set
  (U_1, F_1), ..., (U_n, F_n)
such that U = ⋃_{i=1}^{n} U_i and F_i is a set of FDs over U_i
BCNF decomposition if each (U_i, F_i) is in BCNF
Criteria for good decompositions
  - Losslessness: no information is lost
  - Dependency preservation: no constraints are lost

Good decompositions
A decomposition of (U, F) into (U_1, F_1), ..., (U_n, F_n) is
Lossless if for every relation R over U that satisfies F
  - each π_{U_i}(R) satisfies F_i, and
  - R = π_{U_1}(R) ⋈ ··· ⋈ π_{U_n}(R)
Dependency preserving if F and ⋃_{i=1}^{n} F_i are equivalent (that is, they have the same closure)
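Dependency preservation is an equivalence test between two sets of FDs, and equivalence can be decided with attribute closures: F and G are equivalent when each FD of one is implied by the other. A minimal sketch follows, assuming FDs are pairs of frozensets; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if fds imply lhs -> rhs."""
    return rhs <= closure(lhs, fds)

def equivalent(f, g):
    """True if f and g imply each other (same closure)."""
    return (all(implies(g, lhs, rhs) for lhs, rhs in f)
            and all(implies(f, lhs, rhs) for lhs, rhs in g))

def preserves_dependencies(f, parts):
    """parts is the list F_1, ..., F_n of the decomposition."""
    union = [dep for part in parts for dep in part]
    return equivalent(f, union)

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

title_dir = fd(["Title"], ["Director"])
show_price = fd(["Theatre", "Title", "Time"], ["Price"])
theatre_addr = fd(["Theatre"], ["Address"])
F = [title_dir, show_price, theatre_addr]

# The good design keeps one FD per table, so nothing is lost.
good = preserves_dependencies(F, [[title_dir], [theatre_addr], [show_price]])
# A hypothetical decomposition that keeps no FD about addresses loses a constraint.
lossy = preserves_dependencies(F, [[title_dir], [show_price]])
print(good, lossy)
```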

Projection of FDs
Let F be a set of FDs over attributes U
The projection of F on V ⊆ U
  π_V(F) = { X → Y | X, Y ⊆ V, Y ⊆ C_F(X) }
is the set of all FDs over V that are implied by F
It can often be represented compactly as a set of FDs F′ such that
  for all X, Y ⊆ V:  F′ ⊨ X → Y  ⟺  F ⊨ X → Y
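A direct way to compute a projection is to enumerate the subsets X of V and read off C_F(X) ∩ V. The sketch below does exactly that and is exponential in |V|, so it is only meant for small examples. It returns one FD per nontrivial pair (X, A), which is a representation of π_V(F) but not necessarily the most compact one. The names `project_fds` and `fd` are hypothetical, and the second FD set is a made-up example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def project_fds(fds, v):
    """Nontrivial FDs X -> A with X a nonempty subset of v and A in v, implied by fds."""
    attrs = sorted(v)
    out = []
    for size in range(1, len(attrs) + 1):
        for xs in combinations(attrs, size):
            x = frozenset(xs)
            for a in sorted((closure(x, fds) & set(attrs)) - x):
                out.append((x, frozenset([a])))
    return out

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

BAD_F = [
    fd(["Title"], ["Director"]),
    fd(["Theatre", "Title", "Time"], ["Price"]),
    fd(["Theatre"], ["Address"]),
]
on_movies = project_fds(BAD_F, {"Title", "Director"})
on_showings = project_fds(BAD_F, {"Theatre", "Title", "Time", "Price"})

# Hypothetical FDs A -> B, B -> C: projecting on {A, C} surfaces the implied A -> C.
implied = project_fds([fd("A", "B"), fd("B", "C")], {"A", "C"})
print(on_movies, on_showings, implied, sep="\n")
```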

BCNF decomposition algorithm
Input: A set of attributes U and a set of FDs F
Output: A database schema S
1. S := {(U, F)}
2. While there is (U_i, F_i) ∈ S not in BCNF:
     Replace (U_i, F_i) by decompose(U_i, F_i)
3. Remove any (U_i, F_i) for which there is (U_j, F_j) with U_i ⊆ U_j
4. Return S
Subprocedure decompose(U, F):
1. Choose (X → Y) ∈ F that violates BCNF
2. Set V := C_F(X) and Z := U − V
3. Return (V, π_V(F)) and (XZ, π_XZ(F))
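The algorithm above can be sketched in a few lines. This is an illustrative version under assumptions, not a reference implementation: the violating FD is simply the first one found, projections are computed by brute-force subset enumeration, "key" is read as "closure equals all attributes", and every function name is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def project_fds(fds, v):
    """Nontrivial FDs X -> A over v implied by fds (exponential in |v|)."""
    attrs = sorted(v)
    out = []
    for size in range(1, len(attrs) + 1):
        for xs in combinations(attrs, size):
            x = frozenset(xs)
            for a in sorted((closure(x, fds) & set(attrs)) - x):
                out.append((x, frozenset([a])))
    return out

def violating_lhs(u, fds):
    """Left-hand side of the first FD in fds that violates BCNF, or None."""
    for lhs, rhs in fds:
        if not rhs <= lhs and closure(lhs, fds) != set(u):
            return lhs
    return None

def bcnf_decompose(u, fds):
    schema = [(frozenset(u), list(fds))]                 # step 1
    while True:                                          # step 2
        for i, (ui, fi) in enumerate(schema):
            x = violating_lhs(ui, fi)
            if x is not None:
                v = frozenset(closure(x, fi))            # V := C_F(X)
                xz = x | (ui - v)                        # XZ with Z := U - V
                schema[i:i + 1] = [(v, project_fds(fi, v)),
                                   (xz, project_fds(fi, xz))]
                break
        else:
            break                                        # no violation left
    # Step 3: drop schemas whose attributes are contained in another schema.
    return [(ui, fi) for i, (ui, fi) in enumerate(schema)
            if not any(ui < uj or (ui == uj and j < i)
                       for j, (uj, _) in enumerate(schema) if j != i)]

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

BAD_U = {"Title", "Director", "Theatre", "Address", "Time", "Price"}
BAD_F = [
    fd(["Title"], ["Director"]),
    fd(["Theatre", "Title", "Time"], ["Price"]),
    fd(["Theatre"], ["Address"]),
]
result = bcnf_decompose(BAD_U, BAD_F)
for attrs, _ in result:
    print(sorted(attrs))
```

With this choice of violating FDs, BAD is split first on Title → Director and then on Theatre → Address, which yields the three attribute sets of the good design. A different choice order may produce a different schema in general.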

Properties of the BCNF algorithm
  - The decomposed schema is in BCNF and lossless-join
  - The output depends on the FDs chosen to decompose
  - Dependency preservation is not guaranteed
Example
  Apply the BCNF algorithm to the BAD schema (blackboard)

BCNF and dependency preservation
Take the relation Lectures: Class, Professor, Time
with FDs F = {C → P, PT → C}
(CPT, F) is not in BCNF: (C → P) ∈ F, but C is not a key
If we decompose using the BCNF algorithm we get
  (CP, C → P) and (CT, ∅)
We lose the constraint PT → C
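To see why losing PT → C matters, consider two small tables that each satisfy their own constraints but whose join violates the lost FD. The rows below are invented for illustration (the class, professor and time values do not come from the slides), and `satisfies` is a hypothetical helper.

```python
def satisfies(rows, cols, lhs, rhs):
    """True if rows (tuples in column order cols) satisfy the FD lhs -> rhs."""
    seen = {}
    for row in rows:
        rec = dict(zip(cols, row))
        key = tuple(rec[a] for a in lhs)
        val = tuple(rec[a] for a in rhs)
        # The first row with this lhs value fixes the expected rhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical instance of the decomposed schema (CP, C -> P) and (CT, no FDs).
cp = {("DB", "Smith"), ("AI", "Smith")}          # Class, Professor
ct = {("DB", "Mon 10:00"), ("AI", "Mon 10:00")}  # Class, Time

cp_ok = satisfies(cp, ("C", "P"), ["C"], ["P"])  # the only local constraint

# Natural join of CP and CT on Class.
joined = {(c, p, t) for (c, p) in cp for (c2, t) in ct if c == c2}
joined_cols = ("C", "P", "T")

keeps_c_p = satisfies(joined, joined_cols, ["C"], ["P"])
keeps_pt_c = satisfies(joined, joined_cols, ["P", "T"], ["C"])
print(cp_ok, keeps_c_p, keeps_pt_c)
```

Both tables pass every check that can be stated on them individually, yet the joined relation has one professor teaching two classes at the same time. Enforcing PT → C would require a check across the two tables.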

Third Normal Form (3NF)
(U, F) is in 3NF if for any FD X → A implied by F one of the following holds:
  - A ∈ X (the FD is trivial)
  - X is a key
  - A is prime
Intuition: in 3NF non-key FDs are allowed as long as they imply only prime attributes
Every schema in BCNF is also in 3NF
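A 3NF test needs the prime attributes, that is, those belonging to some candidate key. The sketch below finds candidate keys by brute force and then checks only the FDs listed in F, split into single right-hand attributes, rather than every implied FD; this is a simplifying assumption of the sketch. "X is a key" is again read as "the closure of X is U", and all function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(u, fds):
    """Minimal attribute sets whose closure is u (exponential in |u|)."""
    attrs = sorted(u)
    keys = []
    for size in range(1, len(attrs) + 1):
        for xs in combinations(attrs, size):
            x = frozenset(xs)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(x, fds) == set(attrs):
                keys.append(x)
    return keys

def is_3nf(u, fds):
    prime = set().union(*candidate_keys(u, fds))
    for lhs, rhs in fds:
        for a in rhs - lhs:                      # skip the trivial part
            if closure(lhs, fds) != set(u) and a not in prime:
                return False
    return True

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

# Lectures: Class, Professor, Time with F = {C -> P, PT -> C}.
lectures_f = [fd("C", "P"), fd("PT", "C")]
lectures_keys = candidate_keys("CPT", lectures_f)
lectures_3nf = is_3nf("CPT", lectures_f)

# The BAD schema fails: Director is not prime and Title is not a key.
BAD_U = {"Title", "Director", "Theatre", "Address", "Time", "Price"}
BAD_F = [
    fd(["Title"], ["Director"]),
    fd(["Theatre", "Title", "Time"], ["Price"]),
    fd(["Theatre"], ["Address"]),
]
bad_3nf = is_3nf(BAD_U, BAD_F)
print(lectures_keys, lectures_3nf, bad_3nf)
```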

3NF and redundancy
Consider again the relation Lectures: Class, Professor, Time
with FDs F = {C → P, PT → C}
(CPT, F) is in 3NF: PT is a candidate key, so P is prime
More redundancy than in BCNF
  - each time a class appears in a tuple, the professor's name is repeated
  - we tolerate this because there is no BCNF decomposition that preserves dependencies

Minimal covers
Let F and G be sets of FDs
G is a cover of F if G⁺ = F⁺
Minimal if
  - Each FD in G has the form X → A
  - No proper subset of G is a cover
    (we cannot remove FDs without losing equivalence to F)
  - For (X → A) ∈ G and X′ ⊂ X, A ∉ C_F(X′)
    (we cannot remove attributes from the LHS of FDs in G)
Intuition: G is a small representation of all FDs in F

Finding minimal covers
1. Put the FDs in standard form: only one attribute on RHS
   - Use Armstrong's decomposition axiom
   - X → A_1 ··· A_n is split into n FDs: X → A_1, ..., X → A_n
2. Minimize the LHS of each FD
   - Check whether attributes in the LHS can be removed
   - For (X → A) ∈ F and X′ ⊂ X check whether A ∈ C_F(X′)
   - If yes, replace X → A by X′ → A and repeat
3. Delete redundant FDs
   - For each (X → A) ∈ F, check whether F − {X → A} ⊨ X → A

Finding minimal covers: Example
Consider the FDs {ABCD → E, E → D, A → B, AC → D}
1. Already in standard form
   {ABCD → E, E → D, A → B, AC → D}
2. The LHS of ABCD → E can be replaced by AC
   {AC → E, E → D, A → B, AC → D}
3. The last FD is redundant (implied by the first two)
   {AC → E, E → D, A → B}
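The three steps for finding a minimal cover, applied to this example, can be sketched as follows. This is one possible implementation under assumptions: attributes are tried for removal in alphabetical order and FDs are tested for redundancy in list order, so on other inputs a different (equally valid) minimal cover may result. The names `minimal_cover` and `fd` are hypothetical.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: standard form, one attribute on each right-hand side.
    g = [(lhs, frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop LHS attributes that are not needed to derive the RHS.
    for i, (lhs, rhs) in enumerate(g):
        for a in sorted(lhs):
            smaller = lhs - {a}
            if rhs <= closure(smaller, g):
                lhs = smaller
                g[i] = (lhs, rhs)
    g = list(dict.fromkeys(g))  # remove duplicates created by step 2
    # Step 3: drop FDs that are implied by the remaining ones.
    for dep in list(g):
        rest = [d for d in g if d != dep]
        if dep[1] <= closure(dep[0], rest):
            g = rest
    return g

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

F = [fd("ABCD", "E"), fd("E", "D"), fd("A", "B"), fd("AC", "D")]
cover = minimal_cover(F)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

On the example, step 2 shrinks ABCD → E to AC → E and step 3 removes AC → D, leaving AC → E, E → D and A → B.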

3NF synthesis algorithm
Input: A set of attributes U and a set of FDs F
Output: A database schema S
1. S := ∅
2. Find a minimal cover G of F
3. For each FD (X → A) ∈ G, add (XA, X → A) to S
4. If no (U_i, F_i) in S is such that U_i is key for (U, F), find a key K for (U, F) and add (K, ∅) to S
5. If S contains (U_i, F_i) and (U_j, F_j) with U_i ⊆ U_j, replace them by (U_j, F_i ∪ F_j)
6. Output S
Simplification
  (XA_1, X → A_1), ..., (XA_n, X → A_n) in the output can be replaced by (XA_1 ··· A_n, X → A_1 ··· A_n)

Properties of the 3NF algorithm
The synthesized schema is
  - in 3NF
  - lossless-join
  - dependency-preserving
Example
  Apply the 3NF algorithm to the Lectures schema (blackboard)
  (that schema is already in 3NF, but let's do it anyway)

3NF synthesis: another example
Input: (ABCD, {A → B, C → B, BD → A})
  - Not in 3NF (the only candidate key is CD)
  - The given set of FDs is already minimal
Output: { (CB, {C → B}), (ABD, {BD → A, A → B}), (CD, ∅) }
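The synthesis steps can be replayed on this input. The sketch below is illustrative only: the schema is modelled as a dictionary from attribute sets to sets of FDs, "U_i is key" is read as "the closure of U_i is U", the key in step 4 is found by greedily dropping attributes from U, and all function names are hypothetical.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    g = [(lhs, frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]
    for i, (lhs, rhs) in enumerate(g):           # minimize left-hand sides
        for a in sorted(lhs):
            if rhs <= closure(lhs - {a}, g):
                lhs = lhs - {a}
                g[i] = (lhs, rhs)
    g = list(dict.fromkeys(g))
    for dep in list(g):                          # delete redundant FDs
        rest = [d for d in g if d != dep]
        if dep[1] <= closure(dep[0], rest):
            g = rest
    return g

def find_key(u, fds):
    """One candidate key, obtained by dropping attributes while still a key."""
    key = set(u)
    for a in sorted(u):
        if closure(key - {a}, fds) == set(u):
            key.discard(a)
    return frozenset(key)

def synthesize_3nf(u, fds):
    schema = {}                                              # step 1
    for lhs, rhs in minimal_cover(fds):                      # steps 2 and 3
        schema.setdefault(lhs | rhs, set()).add((lhs, rhs))
    if not any(closure(ui, fds) == set(u) for ui in schema): # step 4
        schema[find_key(u, fds)] = set()
    merged = True                                            # step 5
    while merged:
        merged = False
        for ui in list(schema):
            bigger = [uj for uj in schema if ui < uj]
            if bigger:
                schema[bigger[0]] |= schema.pop(ui)
                merged = True
                break
    return schema                                            # step 6

def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)

result = synthesize_3nf("ABCD", [fd("A", "B"), fd("C", "B"), fd("BD", "A")])
for attrs, deps in result.items():
    print("".join(sorted(attrs)), sorted(("".join(sorted(l)), "".join(r)) for l, r in deps))

lectures = synthesize_3nf("CPT", [fd("C", "P"), fd("PT", "C")])
```

For the input above, step 3 produces AB, CB and ABD, step 4 adds the key CD, and step 5 merges AB into ABD. Run on the Lectures schema, the same code returns the single relation CPT with both FDs, consistent with that schema already being in 3NF.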

Schema design: Summary
Given the set of attributes U and the set of FDs F
Find a lossless, dependency-preserving decomposition into:
  - BCNF if it exists
  - 3NF if BCNF decomposition cannot be found
In some cases, DBAs de-normalize tables to reduce number of joins