INF1383 -Bancos de Dados

Similar documents
CSE 132B Database Systems Applications

Constraints: Functional Dependencies

Constraints: Functional Dependencies

Problem about anomalies

Database Design and Normalization

Relational Database Design

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

Chapter 7: Relational Database Design

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

CS54100: Database Systems

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Chapter 8: Relational Database Design

Relational Database Design

Design Theory for Relational Databases

Schema Refinement and Normal Forms

SCHEMA NORMALIZATION. CS 564- Fall 2015

Database Design and Implementation

Schema Refinement & Normalization Theory

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Databases 2012 Normalization

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Functional Dependency Theory II. Winter Lecture 21

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Lecture 6 Relational Database Design

CS322: Database Systems Normalization

Schema Refinement and Normal Forms Chapter 19

CSC 261/461 Database Systems Lecture 11

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Lossless Joins, Third Normal Form

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Shuigeng Zhou. April 6/13, 2016 School of Computer Science Fudan University

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

CSC 261/461 Database Systems Lecture 13. Spring 2018

Chapter 3 Design Theory for Relational Databases

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

Schema Refinement: Other Dependencies and Higher Normal Forms

Design Theory for Relational Databases

Relational-Database Design

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

Lectures 6. Lecture 6: Design Theory

Schema Refinement and Normal Forms

Functional Dependencies and Normalization

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Relational Design Theory

Functional Dependencies & Normalization. Dr. Bassam Hammo

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Schema Refinement and Normal Forms. Chapter 19

Schema Refinement and Normal Forms. Why schema refinement?

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

CSC 261/461 Database Systems Lecture 12. Spring 2018

Relational Design: Characteristics of Well-designed DB

Background: Functional Dependencies. æ We are always talking about a relation R, with a æxed schema èset of attributesè and a

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Schema Refinement and Normalization

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Design theory for relational databases

Information Systems (Informationssysteme)

FUNCTIONAL DEPENDENCY THEORY II. CS121: Relational Databases Fall 2018 Lecture 20

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

CMPT 354: Database System I. Lecture 9. Design Theory

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

CSIT5300: Advanced Database Systems

Chapter 3 Design Theory for Relational Databases

CSE 344 AUGUST 6 TH LOSS AND VIEWS

Normal Forms Lossless Join.

Schema Refinement. Feb 4, 2010

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

Normaliza)on and Func)onal Dependencies

Lecture #7 (Relational Design Theory, cont d.)

Functional Dependencies

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

Functional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013

DECOMPOSITION & SCHEMA NORMALIZATION

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Kapitel 3: Formal Design

Functional Dependency and Algorithmic Decomposition

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

Transcription:

INF1383 -Bancos de Dados Prof. Sérgio Lifschitz DI PUC-Rio Eng. Computação, Sistemas de Informação e Ciência da Computação Projeto de BD e Formas Normais Alguns slides são baseados ou modificados dos originais dos livros Elmasri and Navathe, Fundamentals of Database Systems Pearson Education, Inc. e Database System Concepts, McGraw Hill Silberschatz, Korth and Sudarshan Slide 1 Relational Database Design Finding database schemes with good properties Example: Database for information on suppliers, parts supplied, and shipments Information consists of: S#: supplier number SNAME: supplier name SCITY: supplier city P#: part number PNAME: part name PCITY: city where part is stored QTY: quantity of shipment Slide 2

Relational Database Design One possible database schema BAD [S#, SNAME, SCITY, P#, PNAME, PCITY, QTY] Why BAD is bad: redundancy SCITY is determined by S# However, the same SCITY could appear many times with the same S# in this relation Functional dependency: S# SCITY update anomalies SCITY could be changed in one place but not in another, for the same S#, resulting in inconsistency Slide 3 Relational Database Design Why BAD is bad: insertion anomalies cannot record supplier info unless that supplier supplies a part deletion anomalies by deleting parts we can lose supplier info Solution New schema: S[S#, SNAME, SCITY] Key: S# P[P#, PNAME, PCITY] Key: P# SP[S#, P#, QTY] Key: S# P# No other functional dependencies besides key dependencies Normal forms: nice forms for schemas Slide 4

Decomposing Relation Schemes Let R be a relation scheme A decomposition of R is a set ρ={r 1,, R k } of relation schemes such that: att(r) = k i = 1 att(r i ) Example {S, P, SP} is a decomposition of the relation schema BAD (see earlier example) Slide 5 Conditions for a Reasonable Decomposition Do S, P, SP contain the same info as BAD? BAD = π S (BAD) >< π P (BAD) >< π SP (BAD)? Dependencies for BAD can be enforced by local dependencies on S, P, SP? Let F be the set of FDs for BAD F + denotes the set of FDs implied by F π X (F + ) = {V W F + VW X} Is π S (F + ) π P (F + ) π SP (F + ) equivalent to F? Slide 6

Functional-Dependency Theory We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies. We then develop algorithms to generate lossless decompositions into BCNF and 3NF We then develop algorithms to test if a decomposition is dependency-preserving Slide 7 Functional Dependencies Functional dependency over R: expression X Y where X, Y att(r) A relation R satisfies X Y iff whenever two tuples in R agree on X, they also agree on Y e.g. SCHEDULE THEATER TITLE Roxy Artplex Tropa Rambo Satisfies THEATER TITLE SCHEDULE THEATER TITLE Roxy Tropa Artplex Rambo Artplex Platoon Violates THEATER TITLE, satisfies TITLE THEATER 8 Slide 8

Lossless-join Decomposition For the case of R = (R 1, R 2 ), we require that for all possible relations r on schema R r = R1 (r ) R2 (r ) A decomposition of R into R 1 and R 2 is lossless join if and only if at least one of the following dependencies is in F + : R 1 R 2 R 1 R 1 R 2 R 2 Slide 9 Dependency Preservation Let F i be the set of dependencies F + that include only attributes in R i. A decomposition is dependency preserving, if (F 1 F 2 F n ) + = F + If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive. Slide 10

Technical Problem Checking if X A is implied by a set F of fds (F = X A) Notation: F + = {X Y F = X Y} Needed to compute keys and check satisfaction of BCNF and other normal forms. Solution: closure of a set of attributes wrt F: X + X + = {A F = X A} (all attributes determined by X) So, X A F + iff A X + 11 Slide 11 Example of Closure Computation R = ABCDEF F = {A C, BC D AD E} X = AB X (0) = AB X (1) = ABC X (2) = ABCD X (3) = ABCDE X (4) = X (3) X + = ABCDE To check if X is key in R: X + = R A A o B o A o B o C o A B C D o o o o B C D o o o o E o 12 Slide 12

Computing the closure of a set of attributes wrt a set of fd s Let F be a set of fd s over R, and x R. The following computes the closure x + of x wrt FL 1. X (0) X 2. X (i+1) X (i) {Z V Z F, V X (i) } We get: X (0) X (1) X (i) X (i+1) att(r) Since att(r) is finite, there must exist a k 0 such that X (k) = X (k+1) (and thus, X (j) = X (k) j k). Then, X + = X (k) Slide 13 Dependency Preserving Decompositions Notation: if F is a set of fd s over R and X R then π X (F + ) = {V W F + VW X}. these are the fds that apply to the set of attributes X ( local to X) Definition: Let ρ = (R 1,, R k ) be a decomposition for R and F a set of fd s over R. Then ρ preserves F iff the set of local fds k i=1π Ri (F + ) is equivalent to F In other words, all fds in F are implied by k i=1π Ri (F + ) 14 Slide 14

Normal Forms Terminology: Let R be a relation schema and F a set of fd s over R. Key: X att(r) such that X att(r) F +. Minimal key: X att(r) s.t. X att(r) F + and there is no Y X such that Y att(r) F +. A att(r) is prime: A X where X is a minimal key A is non-prime: A is not a member of any minimal key. 15 Slide 15 Purpose of normal forms Eliminate problems of redundancy and anomalies. Normal forms: first, second, third, Boyce-Codd Boyce-Codd Normal Form (BCNF) A relation scheme R is in BCNF wrt a set of fd s F over R, iff whenever X A F + and A X, X is a key for R Only nontrivial dependencies are those induced by keys 16 Slide 16

Example BAD(S#, P#, SNAME, PNAME, SCITY, PCITY, QTY) not in BCNF wrt F = S# SNAME SCITY P# PNAME PCITY S# P# QTY S(S#, SCITY, SNAME) is in BCNF wrt S# SNAME SCITY P(P#, PCITY, PNAME) is in BCNF wrt P# PNAME PCITY SP(S# P# QTY) is in BCNF wrt S# P# QTY 17 Slide 17 Decomposition of a relation schema into BCNF relation schemas, with lossless join Any relation scheme has a decomposition into BCNF relation schemes, with lossless join not always dependency preserving Example R = CITY ZIP ST F = CITY ST ZIP, ZIP CITY R has no decomposition into BCNF schemas, which preserves F (CITY ST ZIP is never preserved) Slide 18

Algorithm to obtain a lossless join decomposition of a given R wrt F Start with ρ = {R} Apply recursively the following procedure: for each S in ρ not in BCNF, let X A F + be such that XA S, A X, X is not a key of S replace S by S 1 and S 2, where S 1 = XA S 2 = S A until all relation schemes are in BCNF Note S S 1 S 2 lossless join: X A F + X is a key for S 1 Slide 19 Note a) R R 1 R i R k Lossless join wrt F S i1 S i2 S im Lossless join (for R i ) wrt π Ri (F+) R 1 R R 2 R i-1 S i1 S im S i2 R k Lossless join wrt F b) If ρ is a lossless join decomposition of R wrt F and ρ ρ then ρ is a lossless join decomposition as well. Slide 20

Example R = C T H R S G C = course T = teacher H = hour R = room S = student G = grade F: C T HR C HT R CS G HS R only key: HS Slide 21 Example CTHRSG Key = HS, C T CS G, HR C, HS R, TH R key = CS CSG CS G CTHRS key = SH C T TH R HR C HS R CT key = C C T CHRS key = SH CH R HR C HS R CHR key = CH, CH R HR HR C CHS key = SH SH C Slide 22

Remarks (drawbacks) Decomposition is not unique CHRS could be decomposed into CHR and CHS or CHR and RHS The decomposition does not preserve TH R local fds: G = {CS G C T CH R HR C SH C} which does not imply TH R Slide 23 BCNF and Dependency Preservation It is not always possible to get a BCNF decomposition that is dependency preserving R = (J, K, L ) F = {JK L L K } Two candidate keys = JK and JL R is not in BCNF Any decomposition of R will fail to preserve JK L This implies that testing for JK L requires a join Slide 24

Efficiency Is there an efficient algorithm for BCNF decomposition? Most likely NO: It is NP-complete to decide whether a relation scheme R is in BCNF wrt a set of fd s. Any algorithm for BCNF is most likely exponential 25 Slide 25 3NF Problem with BCNF: Not every relation schema can be decomposed into BCNF relation schemas which preserve the dependencies and have lossless join. Third Normal Form A relation scheme R is in Third Normal Form wrt a set F of fd s over R, if whenever X A holds in R and A X then either X is a key or A is prime Weaker than BCNF 26 Slide 26

Third Normal Form: Motivation There are some situations where BCNF is not dependency preserving, and efficient checking for FD violation on updates is important Solution: define a weaker normal form, called Third Normal Form (3NF) Allows some redundancy (with resultant problems; we will see examples later) But functional dependencies can be checked on individual relations without computing a join. There is always a lossless-join, dependency-preserving decomposition into 3NF. Slide 27 Computing a 3NF decomposition Minimal covers A set F of fd s is minimal iff 1) X Y F Y is a single attribute 2) for no X A F is F {X A} equivalent to F (no fd can be removed from F) 3) if X A F then there is no proper subset Z of X such that F {X A} {Z A} is equivalent to F (attributes cannot be removed from fd s in F) Fact:. Every set of fd s is equivalent to some minimal set of fds (called a minimal cover). 28 Slide 28

Dependency preserving decompositions into Third Normal Form R a schema F minimal set of fd s over R ρ = {XA 1 A m X A i F} {B R B does not occur in F} Example R = CTHRSG (see earlier example) F = C T CS G HR C HS R (minimal) HT R Then ρ = {CT, HRC, HTR, CSG, HRS} Slide 29 3NF Decomposition Algorithm (Cont.) 3NF decomposition algorithm ensures: each relation schema R i is in 3NF decomposition is dependency preserving and lossless-join Slide 30

Formal Results Theorem. ρ preserves F and each R i in ρ is in 3NF wrt π Ri (F + ). To obtain decomposition that also has lossless join: add to ρ a set K where K is a minimal key for R unless there already is a piece in the decomposition whose attributes form a key for R Theorem ρ {K} is a 3NF decomposition which is dependency preserving and has lossless join. 31 Slide 31 Extraneous Attributes Consider a set F of FDs and the functional dependency α β in F. Attribute A is extraneous in α if A α and F logically implies (F {α β}) {(α A) β}. Attribute A is extraneous in β if A β and the set of functional dependencies (F {α β}) {α (β A)} logically implies F. Example: Given F = {A C, AB C } B is extraneous in AB C because {A C, AB C} logically implies A C (i.e. the result of dropping B from AB C). Example: Given F = {A C, AB CD} C is extraneous in AB CD since AB C can be inferred even after deleting C Slide 32

Testing if an Attribute is Extraneous Consider a set F of functional dependencies and the functional dependency α β in F. To test if attribute A α is extraneous in α 1. compute ({α} A) + using the dependencies in F 2. check that ({α} A) + contains A; if it does, A is extraneous To test if attribute A β is extraneous in β 1. compute α + using only the dependencies in F = (F {α β}) {α (β A)}, 2. check that α + contains A; if it does, A is extraneous Slide 33 Canonical (Minimal) Cover Sets of functional dependencies may have redundant dependencies that can be inferred from the others For example: A C is redundant in: {A B, B C} Parts of a functional dependency may be redundant E.g.: on RHS: {A B, B C, A CD} can be simplified to {A B, B C, A D} E.g.: on LHS: {A B, B C, AC D} can be simplified to {A B, B C, A D} Intuitively, a canonical cover of F is a minimal set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies Slide 34

Canonical (minimal) Cover A canonical cover for F is a set of dependencies F c such that F logically implies all dependencies in F c, and F c logically implies all dependencies in F, and No functional dependency in F c contains an extraneous attribute, and Each left side of functional dependency in F c is unique. To compute a canonical cover for F: repeat Use the union rule to replace any dependencies in F α 1 β 1 and α 1 β 2 with α 1 β 1 β 2 Find a functional dependency α β with an extraneous attribute either in α or in β If an extraneous attribute is found, delete it from α β until F does not change Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied Slide 35 Comparison of BCNF and 3NF It is always possible to decompose a relation into a set of relations that are in 3NF such that: the decomposition is lossless the dependencies are preserved It is always possible to decompose a relation into a set of relations that are in BCNF such that: the decomposition is lossless it may not be possible to preserve dependencies. Slide 36

Design Goals Goal for a relational database design is: BCNF. Lossless join. Dependency preservation. If we cannot achieve this, we accept one of Lack of dependency preservation Redundancy due to use of 3NF Interestingly, SQL does not provide a direct way of specifying functional dependencies other than superkeys. Can specify FDs using assertions, but they are expensive to test Even if we had a dependency preserving decomposition, using SQL we would not be able to efficiently test a functional dependency whose left hand side is not a key. Slide 37 A (simplified) overview of schema design 1. Choose attributes in R 2. Specify dependencies F (tool: Armstrong relations ) 3. Find a lossless-join, dependency preserving decomposition into Normal Form schemas BCNF, if possible 3NF, if not BCNF 38 Slide 38