Relational Database Design

Similar documents
Design theory for relational databases

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Chapter 11, Relational Database Design Algorithms and Further Dependencies

Chapter 3 Design Theory for Relational Databases

CSC 261/461 Database Systems Lecture 13. Spring 2018

Lossless Joins, Third Normal Form

Functional Dependencies and Normalization

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

Comp 5311 Database Management Systems. 5. Functional Dependencies Exercises

Relational-Database Design

Relational Database Design

Design Theory for Relational Databases

Design Theory for Relational Databases

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

Database Design and Normalization

Normal Forms Lossless Join.

Desirable properties of decompositions 1. Decomposition of relational schemes. Desirable properties of decompositions 3

Database Design and Implementation

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

Relational Database Design

Schema Refinement. Feb 4, 2010

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Relational Design Theory

Chapter 3 Design Theory for Relational Databases

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

Databases 2012 Normalization

Database Design and Normalization

Schema Refinement & Normalization Theory

Normaliza)on and Func)onal Dependencies

Schema Refinement and Normal Forms

Functional Dependency and Algorithmic Decomposition

CMPT 354: Database System I. Lecture 9. Design Theory

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

Constraints: Functional Dependencies

INF1383 -Bancos de Dados

CS322: Database Systems Normalization

Schema Refinement and Normal Forms. Chapter 19

Introduction to Data Management CSE 344

Functional Dependencies

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

CSE 344 AUGUST 6 TH LOSS AND VIEWS

CS54100: Database Systems

CSE 132B Database Systems Applications

Constraints: Functional Dependencies

Lectures 6. Lecture 6: Design Theory

CSC 261/461 Database Systems Lecture 11

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

Relational Design: Characteristics of Well-designed DB

CSE 344 MAY 16 TH NORMALIZATION

Schema Refinement: Other Dependencies and Higher Normal Forms

CSE 544 Principles of Database Management Systems

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Chapter 10. Normalization Ext (from E&N and my editing)

CSE 344 AUGUST 3 RD NORMALIZATION

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

SCHEMA NORMALIZATION. CS 564- Fall 2015

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Schema Refinement and Normal Forms. Why schema refinement?

Functional Dependency Theory II. Winter Lecture 21

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

Chapter 7: Relational Database Design

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms

Information Systems (Informationssysteme)

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Chapter 8: Relational Database Design

Unit 3 - Functional Dependency and Decomposition

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Chapter 19

Introduction to Management CSE 344

Introduction to Data Management CSE 344

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

Functional Dependencies and Normalization

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Background: Functional Dependencies. æ We are always talking about a relation R, with a æxed schema èset of attributesè and a

Consider a relation R with attributes ABCDEF GH and functional dependencies S:

Schema Refinement and Normal Forms

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Transcription:

Relational Database Design Chapter 15 in 6 th Edition 2018/4/6 1

10 Relational Database Design Anomalies can be removed from relation designs by decomposing them until they are in a normal form. Several problems should be investigated regarding a decomposition. A decomposition of a relation scheme, R, is a set of relation schemes {R 1,...,R n } such that R i R for each i, and ڂ n i=1 R i = R Note that in a decomposition {R 1,...,R n }, the intersect of each pair of R i and R j does not have to be empty. Example: R = {A, B, C, D, E}, R 1 = { A, B}, R 2 ={A, C}, R 3 = {C, D, E} A naïve decomposition: each relation has only attribute. A good decomposition should have the following two properties. 2018/4/6 2

Dependency Preserving Definition: Two sets F and G of FD s are equivalent if F + = G +. Given a decomposition {R 1,...,R n } of R: F i = X Y: X Y F&X R i, Y R i. The decomposition {R 1,...,R n } of R is dependency preserving with respect to F if F + = ڂ i=n + i=1 F. i 2018/4/6 3

Examples F = { A BC, D EG, M A }, R = (A, B, C, D, E, G, M, A) 1) Given R 1 = ( A, B, C, M) and R 2 = (C, D, E, G), F 1 = { A BC, M A}, F 2 = {D EG} F = F 1 U F 2. thus, dependency preserving 2) Suppose that F = F U {M D}. R 1 and R 2 remain the same. Thus, F 1 and F 2 remain the same. We need to verify if M D is inferred by F 1 U F 2. Since M + F1 U F2 = {M, A, B, C}, M D is not inferred by F 1 U F 2. Thus, R 1 and R 2 are not dependency preserving regarding F. 3) F = {A BC, D EG, M A, M C, C D, M D} F 1 = {A BC, M A, M C}, F 2 = {D EG, C D} It can be verified that M D is inferred by F 1 and F 2. Thus, F + = (F 1 U F 2 ) + Hence, R 1 and R 2 are dependency preserving regarding F. 2018/4/6 4

Lossless Join Decomposition A second necessary property for decomposition: A decomposition {R 1,...,R n } of R is a lossless join decomposition with respect to a set F of FD s if for every relation instance r that satisfies F: r = πr 1 r πr n r. If r πr 1 r πr n r, the decomposition is lossy. 2018/4/6 5

Lossless Join Decomposition(cont) Example 2: Suppose that we decompose the following relation: STUDENT_ADVISOR Name Department Advisor Jones Comp Sci Smith Ng Chemistry Turner Martin Physics Bosky Dulles Decision Sci Hall Duke Mathematics James James Comp Sci Clark Evan Comp Sci Smith Baxter English Bronte With dependencies ሼName Department, Name Advisor, Advisor Department ሽ, into two relations: 2018/4/6 6

Lossless Join Decomposition(cont) STUDENT_DEPARTMENT Name Department Jones Comp Sci Ng Chemistry Martin Physics Duke Mathematics Dulles Decision Sci James Comp Sci Evan Comp Sci Baxter English DEPARTMENT_ADVISOR Department Advisor Comp Sci Smith Chemistry Turner Physics Bosky Decision Sci Hall Mathematics James Comp Sci Clark English Bronte If we join these decomposed relations we get: 2018/4/6 7

Lossless Join Decomposition(cont) Name Department Advisor Jones Comp Sci Smith Jones Comp Sci Clark* Ng Chemistry Turner Martin Physics Bosky Dulles Decision Sci Hall Duke Mathematics James James Comp Sci Smith* James Comp Sci Clark Evan Comp Sci Smith Evan Comp Sci Clark* Baxter English Bronte This is not the same as the original relation (the tuples marked with * have been added). Thus the decomposition is lossy. Useful theorem: The decomposition {R 1,R 2 } of R is lossless iff the common attributes R 1 R 2 form a superkey for either R 1 or R 2. 2018/4/6 8

Lossless Join Decomposition(cont) Example 3: Given R(A,B,C) and F = {A B}. The decomposition into R 1 (A,B) and R 2 (A,C) is lossless because A B is an FD over R 1, so the common attribute A is a key of R 1. 2018/4/6 9

10.1 Testing for the lossless join property Algorithm TEST_LJ Step 1: Create a matrix S, each element s i,j S corresponds the relation R i and the attribute A j, such that: s j,i = a if A i R j, otherwise s j,i = b. Step 2: Repeat the following process till S has no change or one row is made up entirely of a symbols. Step 2.1: For each X Y, choose the rows where the elements corresponding to X take the value a. 2018/4/6 Step 2.2: In those chosen rows (must be at least two rows), the elements corresponding to Y also take the value a if one of the chosen rows take the value a on Y. 10

10.1 Testing for the lossless join property(cont) The decomposition is lossless if one row is entirely made up by a values. The algorithm can be found as the Algorithm 15.2 in E/N book. Note: The correctness of the algorithm is based on the assumption that no null values are allowed for the join attributes. If and only if exists an order such that R i M i-1 forms a superkey of R i or M i-1, where M i-1 is the join on R 1, R 2, R i-1 2018/4/6 11

10.1 Testing for the lossless join property(cont) Example: R = (A,B,C,D), F = {A B,A C,C D}. Let R 1 = (A,B,C), R 2 = (C,D). Initially, S is A B C D R 1 a a a b R 2 b b a a Note: rows 1 and 2 of S agree on {C}, which is the left hand side of C D. Therefore, change the D value on rows 1 to a, matching the value from row 2. Now row 1 is entirely a s, so the decomposition is lossless. (Check it.) 2018/4/6 12

10.1 Testing for the lossless join property(cont) Example 4: R = (A,B,C,D,E), F = {AB CD,A E,C D}. Let R 1 = (A,B,C), R 2 = (B,C,D) and R 3 = (C,D,E). Example 5: R = (A,B,C,D,E, F), F = {A B,C DE,AB F}. Let R 1 = (A,B), R 2 = (C,D,E) and R 3 = (A,C, F). 2018/4/6 13

10.1 Testing for the lossless join property(cont) Example 5: R = (A,B,C,D,E, G), F = {AB G, C DE, A B,}. Let R 1 = (A,B,), R 2 = (C, D, E) and R 3 = (A, C,G). A B C D E G R 1 a a b b b b R 2 b b a a a b R 3 a b a b b a A B C D E G R 1 a a b b b b R 2 b b a a a b R 3 a b X a a a a 2018/4/6 14 a

10.2 Lossless decomposition into BCNF Algorithm TO_BCNF D := {R 1,R 2,...R n } While a R i D and R i is not in BCNF Do { find a X Y in R i that violates BCNF; replace R i in D by (R i Y ) and (X Y ); } F = {A C, A D, C E, E D, C G}, R1 = (C, D, E, G), R2 = (A, B, C, D) R11 = (C, E, G), R12 = (E, D) due E D R21 = (A, B, C), R22 = (C, D) because of C D 2018/4/6 15

10.2 Lossless decomposition into BCNF Algorithm TO_BCNF D := {R 1,R 2,...R n } While a R i D and R i is not in BCNF Do { find a X Y in R i that violates BCNF; replace R i in D by (R i Y ) and (X Y ); } Since a X Y violating BCNF is not always in F, the main difficulty is to verify if R i is in BCNF; see the approach below: 1. For each subset X of R i, computer X +. 2. X (X + Ri X) violates BCNF, if X + Ri X and R i X +. Here, X + Ri X = means that each F.D with X as the left hand side is trivial; R i X + = means X is a superkey of R i 2018/4/6 16

10.2 Lossless decomposition into BCNF(cont) Example 6:(From Desai 6.35) Find a BCNF decomposition of the relation scheme below: SHIPPING(Ship, Capacity, Date, Cargo, Value) F consists of: Ship Capacity {Ship, Date} Cargo {Cargo, Capacity} Value 2018/4/6 17

10.2 Lossless decomposition into BCNF(cont) From Ship Capacity, we decompose SHIPPING into R 1 (Ship, Date, Cargo, Value) Key: {Ship,Date} Ship Capacity {Ship, Date} Cargo {Cargo, Capacity} Value A nontrivial FD in F + violates BCNF: {Ship, Cargo} Value and R 2 (Ship, Capacity) Key: {Ship} Only one nontrivial FD in F + : Ship Capacity 2018/4/6 18

10.2 Lossless decomposition into BCNF(cont) R 1 is not in BCNF so we must decompose it further into R 11 (Ship, Date, Cargo) Key: {Ship,Date} Ship Capacity {Ship, Date} Cargo {Cargo, Capacity} Value Only one nontrivial FD in F + with single attribute on the right side: {Ship, Date} Cargo And R 12 (Ship, Cargo, Value) Key: {Ship,Cargo} Only one nontrivial FD in F + with single attribute on the right side: {Ship,Cargo} Value This is in BCNF and the decomposition is lossless but not dependency preserving (the FD {Capacity,Cargo} Value) has been lost. 2018/4/6 19

10.2 Lossless decomposition into BCNF(cont) Or we could have chosen {Cargo, Capacity} Value, which would give us: R 1 (Ship, Capacity, Date, Cargo) Key: {Ship,Date} A nontrivial FD in F + violates BCNF: Ship Capacity {Ship, Date} Cargo {Cargo, Capacity} Value Ship Capacity and R 2 (Cargo, Capacity, Value) Key: {Cargo,Capacity} Only one nontrivial FD in F + with single attribute on the right side: {Cargo, Capacity} Value 2018/4/6 20

10.2 Lossless decomposition into BCNF(cont) and then from Ship Capacity, R 11 (Ship, Date, Cargo) Key: {Ship,Date} Ship Capacity {Ship, Date} Cargo {Cargo, Capacity} Value Only one nontrivial FD in F + with single attribute on the right side: {Ship, Date} Cargo And R 12 (Ship, Capacity) Key: {Ship} Only one nontrivial FD in F + : Ship Capacity This is in BCNF and the decomposition is both lossless and dependency preserving. However, there are relation schemes for which there is no lossless, dependency preserving decomposition into BCNF. A B C 2018/4/6 21