Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms


Design Quality: Introduction
Good conceptual model:
- Many alternatives
- Informal guidelines
- Wanted: formal methods for comparing designs
Informal guidelines:
- Avoid redundancies, e.g., Movie(id, category, ...) and Tape(id, category, ...)
- Avoid modeling several real objects in one entity, e.g., Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
- Make constraints explicit, e.g., a tape is loaned by zero or one customer

Design Quality: Anomalies by redundancy
Deletion anomaly: deleting a tuple loses information about another data object
Example:
- Consider Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
- Delete the tuple (95, 'Psycho', 'suspense', 1969, 2.00, NULL, 'Hitchcock', 1899-08-13, NULL, ...)
- If this is the only Hitchcock movie in the database, all information about the director is lost

Design Quality: Anomalies by redundancy
Modification anomaly: changing one data object requires updating all tuples concerned
Example:
- Consider Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
- Spielberg moves to a different city: the change must be made in all of his movies

Design Quality: Anomalies by redundancy
Insertion anomaly:
- Inserted data must be consistent with existing tuples
  Example: Hitchcock's birthday must not vary between tuples
- Insertion of a single data object is difficult
  Example: inserting a new director without a film is impossible

Functional dependency: Introduction
Important formal concept:
- Formalizes constraints between attributes
- Constraints of the modeled domain
- Will be used to formalize design rules
Consider attribute sets X, Y ⊆ R
Functional dependency: X → Y
- Constraint on the possible tuples of R
- For all tuples t1, t2 ∈ R: t1[X] = t2[X] ⇒ t1[Y] = t2[Y]
- Function X → Y, not given explicitly

Functional Dependency: Examples
- Attribute director determines attribute birthday
  = birthday is functionally determined by director
  = birthday is functionally dependent on director
  = director → birthday
- Attribute director does not determine attribute movie.title
  = director ↛ title
- Attribute director determines attribute livesincity
  = director → livesincity
- director → {birthday, livesincity}
- birthday ↛ director

Functional Dependency: Key dependencies
Consider R(A1, A2, A3, A4, ..., An) with key {A1, A2, A3}
Then:
- {A1, A2, A3} → {A4, ..., An}
- {A1, A2, A3} → {A4}, ..., {A1, A2, A3} → {An}
The function maps keys to values of attributes
Examples:
- id → {title, category, year, price_per_day, length, director, ...}
- id → {title, category, year, price_per_day}
- id → {title}

Functional Dependency: Key dependencies
Dependencies on superkeys and candidate keys:
- Superkey: subset of attributes SK of relation R for which no two tuples have the same value: SK → R \ SK
- Candidate key: superkey K of relation R such that if any attribute A ∈ K is removed, the set of attributes K \ {A} is not a superkey of R: K → R \ K
Examples:
- {id, title, director} → {category, year, ..., length, birthday, ...}
- {title, year} → {id, category, ..., length, director, birthday, ...}
- {director, year} → {id, category, year, ..., length, birthday, ...}

Functional Dependency: Formal definition
Let R = {A1, A2, ..., An} be the attribute set of a relation R, t1, t2 tuples of R, and X, Y ⊆ R.
Y is functionally dependent on X (X → Y) ⇔ for all tuples t1, t2:
  (∀ xi ∈ X: t1[xi] = t2[xi]) ⇒ (∀ yi ∈ Y: t1[yi] = t2[yi])
- Invariants must hold at all times
- Independent of the state of the db

Functional Dependency: Example
Consider: R(A,B,C,D)
R  A   B   C   D
   a1  b1  c1  d1
   a1  b2  c1  d2
   a2  b2  c2  d2
   a2  b3  c2  d3
   a3  b3  c2  d4
Inferring FDs from a representative table instance (→ holds in the instance, ↛ is violated by it):
- A → C, A ↛ B, A ↛ D
- B ↛ A, B ↛ C, B ↛ D
- C ↛ A, C ↛ B, C ↛ D
- AB → C, AB → D, AC ↛ B, AC ↛ D, ...
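The check against a table instance can be sketched in a few lines of Python. This is only an illustration: the function name fd_holds and the row encoding are assumptions, not part of the slides. Note that an instance can refute a dependency, but it can never prove that the dependency holds in every state of the database.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the instance `rows` does not violate lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The instance of R(A,B,C,D) from the slide, one dict per tuple.
R = [dict(zip("ABCD", values)) for values in [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c1", "d2"),
    ("a2", "b2", "c2", "d2"),
    ("a2", "b3", "c2", "d3"),
    ("a3", "b3", "c2", "d4"),
]]

for lhs, rhs in [("A", "C"), ("A", "B"), ("AB", "D"), ("AC", "B")]:
    print(lhs, "->", rhs, ":", fd_holds(R, lhs, rhs))
```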

Functional Dependency: Example
DB design: infer FDs from application semantics
Examples:
- Tape(id, format, movie_id): id → format, id → movie_id
- Actor(stage_name, real_name, birthday): stage_name → real_name, stage_name → birthday
- Rental(tape_id, member, from, until): {tape_id, member, from} → until

Functional Dependency: Properties
Consider: relation R, attribute sets X, Y, Z in R
- X → Y is a trivial functional dependency if Y ⊆ X, e.g. AB → A, A → A
- X → Y is a full functional dependency if
  - X → Y holds, and
  - there is no Z ⊂ X with Z → Y
- Transitive functional dependency:
  - Suppose X → Y and Y → Z
  - Then X → Z (Z is transitively dependent on X)

Functional Dependency: Properties
Consider: relation R, set of FDs F
Closure of F:
- Set of functional dependencies logically implied by F
- Denoted by F+
- F+ = { X → Y | F ⊨ X → Y }
Minimal cover:
- Minimal set of functional dependencies for relation R

Functional Dependency: Closure
Consider: set of FDs F, set of attributes A = {A1, ..., An}
Computing the closure of attributes:
  result := A
  while (changes in result) {
    for each FD X → Y in F {
      if X ⊆ result then result := result ∪ Y
    }
  }
Result: set of attributes A+ with A ⊆ A+

Functional Dependency: Example
Consider R(A,B,C,D,E,F)
F: {(1) AB → C, (2) BC → AD, (3) D → E, (4) CF → B}
Find the closure of {A,B}:
- res := {A,B}
- 1st iteration: (1) res = res ∪ {C}, res = {A,B,C}
- 2nd iteration: (2) res = res ∪ {A,D}, res = {A,B,C,D}
- 3rd iteration: (3) res = res ∪ {E}, res = {A,B,C,D,E}
Result: {A,B}+ = {A,B,C,D,E}
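A minimal Python sketch of the attribute-closure loop, run on the example with F = {AB → C, BC → AD, D → E, CF → B}. The function name closure and the encoding of FDs as (left side, right side) pairs of attribute strings are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Apply X -> Y if X is already covered and Y adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", F)))  # ['A', 'B', 'C', 'D', 'E']
```

FD (4) never fires because attribute F is not reachable from {A,B}.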

Functional Dependency: Armstrong's Axioms
- Reflexivity rule: if Y ⊆ X then X → Y
- Augmentation rule: if X → Y and Z is a set of attributes then XZ → YZ
- Transitivity rule: if X → Y and Y → Z then X → Z
Complete rules: allow generating all FDs in F+
Sound rules: only logically implied FDs are produced

Functional Dependency: Additional rules
- Union rule: if X → Y and X → Z then X → YZ
  Proof: use Augmentation, Transitivity
- Decomposition rule: if X → YZ then X → Y and X → Z
  Proof: use the definition of FD
- Pseudotransitivity rule: if X → Y and ZY → T then ZX → T
  Proof: use Augmentation, Transitivity

Functional Dependency: Example
Consider R = (A, B, C, G, H, I)
F: { (1) A → B, (2) A → C, (3) CG → H, (4) CG → I, (5) B → H }
Some members of F+:
- A → H [(1), (5), Transitivity]
- CG → HI [(3), (4), Union]
- AG → I [(2), (4), Augmentation, Transitivity]
Candidate key: AG
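Membership in F+ can be tested without enumerating F+: X → Y is implied by F exactly when Y is contained in the attribute closure of X. The sketch below applies this to the example and also checks that AG is a candidate key. The helper names implies and is_candidate_key are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ iff rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


def is_candidate_key(attrs, key, fds):
    """A superkey that stops being one when any single attribute is removed."""
    if closure(key, fds) != set(attrs):
        return False
    return all(closure(set(key) - {a}, fds) != set(attrs) for a in key)


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(implies(F, "A", "H"), implies(F, "CG", "HI"), implies(F, "AG", "I"))
print(is_candidate_key(R, "AG", F))
```

Removing one attribute at a time is sufficient for the minimality test, because every subset of a non-superkey is again a non-superkey.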

Functional Dependency: Example
Consider: Movie_two_concepts_noID(title, category, year, price_per_day, length, director, birthday, livesincity, ...)
Set of FDs F: {{title, director} → category, {title, director} → year, {title, director} → price_per_day, {title, director} → length, director → birthday, director → livesincity}
More functional dependencies:
- Reflexivity: {title, director} → director, ...
- Augmentation: {director, title} → {birthday, title}, ...
- Transitivity: {title, director} → birthday, ...

Functional Dependency: Minimal cover
Consider: sets of dependencies F and G
F and G are equivalent if F+ = G+ (F covers G, G covers F)
Minimal set of dependencies (algorithm):
1. Every right side of a dependency in F is a single attribute (apply decomposition)
2. For no X → Y in F is the set F \ {X → Y} equivalent to F (all dependencies are useful)
3. For no X → Y in F and Z ⊂ X is (F \ {X → Y}) ∪ {Z → Y} equivalent to F (no attribute in any left side is redundant)
4. Every left side of the FDs is unique (apply union rule)

Functional Dependency: Example
Find a minimal cover:
F: {AB → C, C → A, BC → D, ACD → B, D → EG, BE → C, CE → AG, CG → BD}
1. Split right sides: {AB → C, C → A, BC → D, ACD → B, D → E, D → G, BE → C, CE → A, CE → G, CG → B, CG → D}
2. Delete unnecessary FDs: CE → A is redundant (C → A). CG → B is redundant (CG → D, C → A and ACD → B)
3. Minimal left sides: ACD → B becomes CD → B (A is redundant on the left side because C → A)
4. Unique left sides: {AB → C, C → A, BC → D, CD → B, D → EG, BE → C, CE → G, CG → D}
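The minimal-cover steps can be sketched in Python as follows. This is one possible implementation under stated assumptions, not the procedure of the slides: it shortens left sides before deleting redundant FDs, and the function names are hypothetical. Minimal covers are not unique, so the cover it returns depends on the processing order and can differ from the one derived by hand while still being equivalent to F.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split right sides into single attributes.
    g = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    # Shorten left sides: drop an attribute if the rest still determines rhs.
    reduced = []
    for lhs, rhs in g:
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, g):
                lhs = smaller
        reduced.append((lhs, rhs))
    g = list(dict.fromkeys(reduced))    # remove duplicates, keep order
    # Delete FDs that are implied by the remaining ones.
    for fd in list(g):
        rest = [f for f in g if f != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    return g


def equivalent(f, g):
    """F and G are equivalent if each implies every FD of the other."""
    return (all(set(r) <= closure(l, g) for l, r in f)
            and all(set(r) <= closure(l, f) for l, r in g))


F = [("AB", "C"), ("C", "A"), ("BC", "D"), ("ACD", "B"),
     ("D", "EG"), ("BE", "C"), ("CE", "AG"), ("CG", "BD")]
cover = minimal_cover(F)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
print("equivalent to F:", equivalent(F, cover))
```

Applying the union rule to FDs with the same left side (step 4) is left out; it only regroups the result.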

Design Quality: Functional Dependencies
Intuitive goal:
- Find a set of relations which do not have "anomalies"
- Algorithm producing a relational schema from a set of FDs
Invariants:
- in the application domain
- made explicit during requirements analysis
ER-design:
- First step
- Refine afterwards
- A good ER model provides a good design

Design Quality: Functional Dependencies
Consider: Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
Example dependencies:
- id → {birthday, livesincity, ...}
- director → {birthday, livesincity}
Bad design: non-key dependency
Key properties are checked by the DB system, other functional dependencies are not!
e.g. different birthdays or cities for one director cannot be prevented by the DBS

Design Quality: Functional Dependencies
Consider: Movie_two_concepts_noID(title, category, year, price_per_day, length, director, birthday, livesincity, ...)
Example dependencies:
- {title, director} → {birthday, livesincity, ...}
- director → {birthday, livesincity}
Bad design: partial-key dependency
Key properties are checked by the DB system, other functional dependencies are not!
e.g. different birthdays or cities for one director cannot be prevented by the DBS

Design Quality: Normalization
Functional dependencies: generalization of the key concept
Normal forms: quality classes for relation schemes with dependencies
Two approaches:
- Synthesis: set up relations in a certain normal form
- Decomposition: decompose given relations into normalized relations

Design Quality: Normal forms
First normal form (1NF)
- Basic property of a relation
- All attributes have atomic domains
- NF²: Non-First Normal Form allows nested relations
Second normal form (2NF)
- No partial FDs on (candidate) keys for non-key attributes
Third normal form (3NF)
- No dependencies on non-keys for non-key attributes
Boyce-Codd normal form (BCNF)
- Only FDs from keys
- (4NF for multi-valued dependencies)

Design Quality: Second normal form
Consider: relation R, attribute set R, set of FDs F
R is in second normal form (2NF) iff
- Only a single concept is modeled in R
= Every non-(candidate-)key attribute fully depends on every candidate key
= ∀ non-key A ∈ R, ∀ candidate keys X ⊆ R: X → {A} is a full FD
= if A is a non-key attribute and X a candidate key, there is no FD Y → {A} with Y ⊂ X

Design Quality: Second normal form
Example: Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
F: { id → {title, category, year, price_per_day, length, director, birthday, livesincity, ...},
     {title, year} → {id, category, price_per_day, length, director, birthday, livesincity, ...},
     {title, director} → {id, category, year, price_per_day, length, birthday, livesincity, ...},
     director → {birthday, livesincity, ...} }
Not in 2NF since {title, director} → {birthday, livesincity} is only a partial dependency: director → {birthday, livesincity} already holds

Design Quality: Third normal form
Consider: relation R, attribute set R, set of FDs F
R is in third normal form (3NF) iff for all FDs X → {A} with X ⊆ R, A ∈ R one of the three conditions holds:
1. A ∈ X, i.e. trivial FD
2. A ∈ candidate key (A is a prime attribute)
3. X is a superkey in R
Alternative: no transitive FDs of non-keys on keys

Design Quality: Third normal form
Example: Tape_with_seller(id, format, movie_id, seller, phone)
F: { id → {format, movie_id, seller, phone}, seller → phone }
- In 2NF since id → format, id → movie_id, id → seller, id → phone
- Not in 3NF since id → seller → phone = condition (3) fails: {seller} is not a superkey

Design Quality: Boyce-Codd normal form
Consider: relation R, attribute set R, set of FDs F
R is in Boyce-Codd normal form (BCNF) iff for all FDs X → {A} with X ⊆ R, A ∈ R one of the two conditions holds:
1. A ∈ X, i.e. trivial FD
2. X is a superkey in R
Alternative: no FDs between partial key attributes

Design Quality: Boyce-Codd normal form
Example: CustomerAddress(zip, city, street, number, telephone)
F: { {city, street, number} → {zip, telephone}, {zip, street, number} → {city, telephone}, zip → city }
- In 2NF since there are only full FDs of non-keys on keys
- In 3NF since there is no transitive FD using non-keys
- Not in BCNF since zip → city is an FD between partial key attributes = condition (2) fails: {zip} is not a superkey
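The 3NF and BCNF conditions can be checked mechanically for the CustomerAddress example. The sketch below is an illustration under assumptions: the names candidate_keys and violations are hypothetical, candidate keys are found by brute force over attribute subsets (fine for small schemas only), and only the FDs given in F are tested.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute force: smallest attribute sets whose closure is all of R."""
    attrs = frozenset(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            k = frozenset(combo)
            if any(key <= k for key in keys):
                continue                # contains a smaller key, not minimal
            if closure(k, fds) >= attrs:
                keys.append(k)
    return keys


def violations(attrs, fds, allow_prime):
    """FDs X -> A with X not a superkey; 3NF also accepts prime A."""
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(attrs):
            continue                    # X is a superkey
        for a in rhs:
            if a in lhs or (allow_prime and a in prime):
                continue                # trivial, or A is a prime attribute
            bad.append((lhs, a))
    return bad


ATTRS = ("zip", "city", "street", "number", "telephone")
F = [(("city", "street", "number"), ("zip", "telephone")),
     (("zip", "street", "number"), ("city", "telephone")),
     (("zip",), ("city",))]

print("3NF violations: ", violations(ATTRS, F, allow_prime=True))
print("BCNF violations:", violations(ATTRS, F, allow_prime=False))
```

The only BCNF violation reported is zip → city; it is tolerated by 3NF because city belongs to the candidate key {city, street, number}.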

Normalization: Decomposition, Synthesis
Loss-less decomposition / synthesis: split relation R into relations R1, R2, ..., Rn such that
- each Ri is in the desired normal form (3NF or BCNF)
- information is preserved
- dependencies are preserved
Preserve information: R1 ⋈ R2 ⋈ ... ⋈ Rn = R (loss-less join)
Preserve dependencies: (F(R1) ∪ F(R2) ∪ ... ∪ F(Rn))+ = F(R)+

Normalization: Loss-less join example
Example: R(A,B,C), F: {A → B, C → B}
R   A  B  C
    1  2  3
    4  2  5
Decomposition into R1(A,B) and R2(B,C):
R1  A  B
    1  2
    4  2
R2  B  C
    2  3
    2  5
R1 ⋈ R2 ≠ R (lossy decomposition):
R1 ⋈ R2  A  B  C
         1  2  3
         4  2  3
         1  2  5
         4  2  5

Normalization: Loss-less join example
Example: R(A,B,C), F: {A → C, C → B}
R   A  B  C
    1  2  3
    4  2  5
Decomposition into R1(A,C) and R2(C,B):
R1  A  C
    1  3
    4  5
R2  C  B
    3  2
    5  2
R1 ⋈ R2 = R (loss-less decomposition):
R1 ⋈ R2  A  B  C
         1  2  3
         4  2  5
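Both join examples can be replayed on the two-tuple instance of R. The following sketch uses hypothetical helpers project and natural_join on relations stored as sets of tuples with a schema string; it is meant only to show where the spurious tuples of the lossy decomposition come from.

```python
def project(schema, rows, attrs):
    """Projection onto `attrs`; duplicates vanish because rows form a set."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(s1, r1, s2, r2):
    """Natural join of (s1, r1) and (s2, r2); returns (schema, rows)."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    rows = set()
    for t1 in r1:
        for t2 in r2:
            # Combine tuples that agree on all common attributes.
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                rows.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return list(s1) + extra, rows


R = {(1, 2, 3), (4, 2, 5)}              # schema (A, B, C)

# Decomposition into R1(A,B), R2(B,C): the join produces spurious tuples.
schema, joined = natural_join("AB", project("ABC", R, "AB"),
                              "BC", project("ABC", R, "BC"))
lossy = project(schema, joined, "ABC")
print(sorted(lossy))

# Decomposition into R1(A,C), R2(C,B): the join returns exactly R.
schema, joined = natural_join("AC", project("ABC", R, "AC"),
                              "CB", project("ABC", R, "CB"))
lossless = project(schema, joined, "ABC")
print(sorted(lossless))
```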

Normalization: Loss-less joins
The loss-less property depends on the FDs:
Consider: a decomposition (R1, R2) of R, set of FDs F
The decomposition has a loss-less join with respect to F if and only if:
  R(R1) ∩ R(R2) → R(R1) - R(R2) ∈ F+
or
  R(R1) ∩ R(R2) → R(R2) - R(R1) ∈ F+

Normalization: Loss-less joins
Example: R(A,B,C), F: {A → B, C → B}, decomposition (R1, R2)
- R1(A,B), F: {A → B}; R2(C,B), F: {C → B}
- R(R1) ∩ R(R2) → R(R1) - R(R2): B → A ∉ F+
- R(R1) ∩ R(R2) → R(R2) - R(R1): B → C ∉ F+
- F(R1) ∪ F(R2) = F(R)
- Lossy decomposition
Example: R(A,B,C), F: {A → C, C → B}, decomposition (R1, R2)
- R1(A,C), F: {A → C}; R2(C,B), F: {C → B}
- R(R1) ∩ R(R2) → R(R1) - R(R2): C → A ∉ F+
- R(R1) ∩ R(R2) → R(R2) - R(R1): C → B ∈ F+
- F(R1) ∪ F(R2) = F(R)
- Loss-less decomposition
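The criterion for two relations reduces to one attribute closure: compute the closure of the common attributes and test whether it covers R(R1) - R(R2) or R(R2) - R(R1). A minimal sketch, with the hypothetical function name lossless_join:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_join(r1, r2, fds):
    """Binary test: common attributes determine R1 - R2 or R2 - R1."""
    r1, r2 = set(r1), set(r2)
    reach = closure(r1 & r2, fds)
    return r1 - r2 <= reach or r2 - r1 <= reach


print(lossless_join("AB", "CB", [("A", "B"), ("C", "B")]))  # False
print(lossless_join("AC", "CB", [("A", "C"), ("C", "B")]))  # True
```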

Normalization: Loss-less joins
Loss-less join:
- R(R1) ∩ R(R2) → R(R1) - R(R2) ∈ F+ or R(R1) ∩ R(R2) → R(R2) - R(R1) ∈ F+
= the common attributes of R1, R2 form a superkey of R1 or R2
Normalization side-effect: FDs are transformed into key-dependent FDs
Advantage: FDs are efficiently checked by the DB

Normalization: Synthesis
Given: schema R, set of FDs F
Algorithm:
1. Find a minimal cover min(F) of F
2. ∀ FD X → Y ∈ min(F): define Rx with R(Rx) := X ∪ Y, F(Rx) := {X' → Y' | X' ∪ Y' ⊆ R(Rx)}
3. If no Rx contains a candidate key of R: define Rkey with R(Rkey) := candidate key(R), F(Rkey) = {}
4. Remove RY where R(RY) ⊆ R(RX)
Result: loss-less decomposition of R with every Ri in 3NF (preserves information and dependencies)

Design Quality: Synthesis Example
Example: Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
1. Minimal cover: { id → {title, year}, {title, year} → {id, category, price_per_day, length, director}, {title, director} → {title, year}, director → {birthday, livesincity, ...} }
2. Define relations: R1(id, title, year), R2(id, title, year, category, price_per_day, length, director), R3(title, director, year), R4(director, birthday, livesincity, ...)
3. Not necessary
4. Remove R1, R3
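The synthesis steps can be sketched in Python. Because the Movie example contains elided attributes, the sketch runs on the Tape_with_seller relation from the 3NF example instead, assuming F = {id → {format, movie_id, seller, phone}, seller → phone}. All function names are hypothetical, and the minimal cover is computed with one possible strategy (shorten left sides, then drop implied FDs).

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    g = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    reduced = []
    for lhs, rhs in g:                  # shorten left sides
        for a in sorted(lhs):
            if lhs - {a} and rhs <= closure(lhs - {a}, g):
                lhs = lhs - {a}
        reduced.append((lhs, rhs))
    g = list(dict.fromkeys(reduced))
    for fd in list(g):                  # drop implied FDs
        rest = [f for f in g if f != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    return g


def synthesize_3nf(attrs, fds):
    all_attrs = set(attrs)
    # Steps 1 and 2: one schema per left side of the minimal cover.
    schemas = {}
    for lhs, rhs in minimal_cover(fds):
        schemas.setdefault(lhs, set(lhs)).update(rhs)
    result = list(dict.fromkeys(frozenset(s) for s in schemas.values()))
    # Step 3: add a key schema if no schema is a superkey of R.
    if not any(closure(s, fds) >= all_attrs for s in result):
        key = set(all_attrs)
        for a in sorted(all_attrs):
            if closure(key - {a}, fds) >= all_attrs:
                key.discard(a)
        result.append(frozenset(key))
    # Step 4: remove schemas that are contained in another schema.
    return [s for s in result if not any(s < t for t in result)]


ATTRS = ("id", "format", "movie_id", "seller", "phone")
F = [(("id",), ("format", "movie_id", "seller", "phone")),
     (("seller",), ("phone",))]
for schema in synthesize_3nf(ATTRS, F):
    print(sorted(schema))
```

For this input the sketch yields the schemas (id, format, movie_id, seller) and (seller, phone): id → phone is dropped from the cover because it follows from id → seller and seller → phone.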

Normalization: Decomposition
Given: schema R, set of FDs F
Algorithm:
  res := {R}
  while (some relation schema in res is not in BCNF) {
    choose such a schema Rk in res
    find a BCNF-violating FD X → Y in Rk (X ⊆ R(Rk), X ∩ Y = ∅)
    split Rk into Ri, Rj: R(Ri) = R(Rk) - Y, R(Rj) = X ∪ Y
  }
Result: decomposition of R with every Ri in BCNF (preserves information, possible dependency loss)

Normalization: Decomposition Example
Example: Movie_two_concepts(id, title, category, year, price_per_day, length, director, birthday, livesincity, ...)
Violating FD: director → {birthday, livesincity, ...}
Split: R1(id, title, category, year, price_per_day, length, director), R2(director, birthday, livesincity, ...)
F(R1): { id → {title, category, year, price_per_day, length, director},
         {year, director} → {id, title, category, price_per_day, length},
         {title, director} → {id, category, year, price_per_day, length} }
F(R2): { director → {birthday, livesincity, ...} }

Normalization: 3NF vs BCNF
BCNF decomposition: loss-less join property and dependency preservation are not both guaranteed
Normalization to 3NF: best result achievable with both properties
If at most one candidate key has more than one attribute: 3NF = BCNF
Example:
- CustomerAddress(zip, city, street, number, telephone)
- Normalization to BCNF: R1(zip, street, number, telephone), R2(zip, city)
- Lost dependency: {city, street, number} → zip
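A recursive sketch of the BCNF decomposition applied to CustomerAddress. It is one possible reading of the algorithm, under assumptions: violations are searched by brute force over attribute subsets using closures restricted to the current schema, and the function name bcnf_decompose is hypothetical. The final containment check is only a quick sign that an FD can no longer be tested inside a single relation, not a full dependency-preservation test.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_decompose(attrs, fds):
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for combo in combinations(sorted(attrs), size):
            x = frozenset(combo)
            determined = closure(x, fds) & attrs
            # X -> Y violates BCNF if it is non-trivial and X is no superkey.
            if determined != x and determined != attrs:
                y = determined - x
                return (bcnf_decompose(attrs - y, fds)
                        + bcnf_decompose(x | y, fds))
    return [attrs]                      # no violation found


ATTRS = ("zip", "city", "street", "number", "telephone")
F = [(("city", "street", "number"), ("zip", "telephone")),
     (("zip", "street", "number"), ("city", "telephone")),
     (("zip",), ("city",))]

schemas = bcnf_decompose(ATTRS, F)
for s in schemas:
    print(sorted(s))

# {city, street, number} -> zip no longer fits into any single schema.
lost = {"city", "street", "number", "zip"}
print(any(lost <= s for s in schemas))
```

The first violation found is zip → city, so the result agrees with the split into (zip, street, number, telephone) and (zip, city).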

Normalization: Algorithms
Decomposition:
- May produce more relations than necessary
- Time consuming: compute keys for each schema
- Not dependency preserving
- Not deterministic (depends on FD order)
Synthesis:
- Time consuming: compute minimal cover
- Not deterministic (depends on minimal cover)
Problem:
- Designer must specify the initial set of FDs

Design Quality: Multi-valued dependencies
Special attribute dependencies
Example:
Customer_preferences  mem_no  fav_movie  telephone
                      001     345        030-545454
                      001     290        030-545454
                      001     345        0170-55555
                      001     290        0170-55555
                      ...     ...        ...
- Only trivial functional dependencies: BCNF
- mem_no ↠ fav_movie
- mem_no ↠ telephone

Design Quality: Multi-valued dependencies
Multi-valued dependencies (MVD):
- Generalization of FDs
- Notation: X ↠ Y
Definition of a single-attribute MVD:
Consider R = (a, b, c). a ↠ b holds if for each two tuples t1, t2 with t1[a] = t2[a] there exist tuples t3, t4 with
- t1[a] = t2[a] = t3[a] = t4[a]
- t1[b] = t3[b] and t2[b] = t4[b]
- t2[c] = t3[c] and t1[c] = t4[c]
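The definition says that, within each group of tuples agreeing on X, every Y value must occur with every value of the remaining attributes. A minimal sketch of this check on a table instance follows; the function name mvd_holds and the row encoding are assumptions, and the rows are the four visible tuples of Customer_preferences.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Check X ->> Y on an instance given as a list of dicts."""
    rest = [a for a in rows[0] if a not in x and a not in y]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        pair = (tuple(row[a] for a in y), tuple(row[a] for a in rest))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        rest_values = {p[1] for p in pairs}
        # All combinations of Y values and remaining values must be present.
        if pairs != set(product(y_values, rest_values)):
            return False
    return True


prefs = [
    {"mem_no": "001", "fav_movie": 345, "telephone": "030-545454"},
    {"mem_no": "001", "fav_movie": 290, "telephone": "030-545454"},
    {"mem_no": "001", "fav_movie": 345, "telephone": "0170-55555"},
    {"mem_no": "001", "fav_movie": 290, "telephone": "0170-55555"},
]

print(mvd_holds(prefs, ["mem_no"], ["fav_movie"]))       # True
print(mvd_holds(prefs[:3], ["mem_no"], ["fav_movie"]))   # False
```

Dropping the fourth tuple breaks the dependency because movie 290 then no longer occurs together with the second telephone number.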

Design Quality: Multi-valued dependencies
Consider: relation R, attribute sets X, Y, Z, W in R
Properties:
- X ↠ Y is a trivial MVD if Y ⊆ X or Y = R(R) - X
- Complement: if X ↠ Y then X ↠ R(R) - X - Y
- Multi-valued augmentation: if X ↠ Y and W ⊆ Z then ZX ↠ WY
- Multi-valued transitivity: if X ↠ Y and Y ↠ Z then X ↠ Z - Y
- Generalization: if X → Y then X ↠ Y

Design Quality: Fourth normal form
Consider: relation R, attribute set R, set of FDs F
R is in fourth normal form (4NF) iff for all multi-valued dependencies X ↠ Y one of the three conditions holds:
1. Y ⊆ X (trivial MVD)
2. Y = R(R) - X (trivial MVD)
3. X is a superkey in R

Design Quality: 4NF decomposition
Given: schema R, set of FDs F
Algorithm (analogous to BCNF decomposition):
  res := {R}
  while (some relation schema in res is not in 4NF) {
    choose such a schema Rk in res
    find a 4NF-violating MVD X ↠ Y in Rk (X ⊆ R(Rk), X ∩ Y = ∅)
    split Rk into Ri, Rj: R(Ri) = R(Rk) - Y, R(Rj) = X ∪ Y
  }
Result: decomposition of R with every Ri in 4NF (preserves information, possible dependency loss)

Normalization: 4NF decomposition example
Example: Customer_preferences(mem_no, fav_movie, telephone, ...)
F(Customer): {mem_no ↠ fav_movie, mem_no ↠ telephone}
Violating MVDs: mem_no ↠ fav_movie, mem_no ↠ telephone
Split: R1(mem_no, telephone), R2(mem_no, fav_movie)
- F(R1): {}, mem_no ↠ telephone is trivial
- F(R2): {}, mem_no ↠ fav_movie is trivial

Design Quality: Normal forms
Relationships: 1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF ⊃ 4NF
[Figure: ranges of normal forms reachable by dependency-preserving decomposition and by information-preserving decomposition]

Normalization: Example
Movie(id, title, category, year, director, price_per_day, length)
F(Movie): { id → {title, category, year, director, price_per_day, length},
            {title, year} → {id, category, director, price_per_day, length},
            {director, year} → {id, title, category, price_per_day, length} }
Fmin(Movie): { id → {title, category, year, director, price_per_day, length}, {title, year} → id, {director, year} → id }
- 1NF: since only atomic attributes
- 2NF: since id → {title, category, year, director, price_per_day, length}, {title, year} → {title, category, year, director, price_per_day, length}, {director, year} → {title, category, year, director, price_per_day, length}
- 3NF: since id, {title, year}, {director, year} are superkeys
- BCNF: since id, {title, year}, {director, year} are superkeys

Normalization: Example
Tape(id, format, movie_id)
F(Tape): {id → {format, movie_id}}
Fmin(Tape): {id → {format, movie_id}}
- 1NF: since only atomic attributes
- 2NF: since id → {format, movie_id}
- 3NF: since id is a superkey
- BCNF: since id is a superkey
Format(name, charge)
F(Format): {name → charge}
Fmin(Format): {name → charge}
- BCNF: since name is a superkey
- 1NF, 2NF, 3NF since BCNF

Normalization: Example
Actor(stage_name, real_name, birthday)
F(Actor): {stage_name → {real_name, birthday}}
Fmin(Actor): {stage_name → {real_name, birthday}}
- BCNF: since stage_name is a superkey
- 1NF, 2NF, 3NF since BCNF
Play(movie_id, actor_name)
F(Play): {}, only trivial dependencies
Fmin(Play): {}
- BCNF: since only trivial dependencies
- 1NF, 2NF, 3NF since BCNF

Normalization: Example
Customer(mem_no, last_name, first_name, address, telephone)
F(Customer): {mem_no → {last_name, first_name, address, telephone}}
Fmin(Customer): {mem_no → {last_name, first_name, address, telephone}}
- 1NF: since only atomic attributes (address not structured)
- 2NF: since mem_no → {last_name, first_name, address, telephone}
- 3NF: since mem_no is a superkey
- BCNF: since mem_no is a superkey
Rental(tape_id, member, from, until)
F(Rental): {{tape_id, member, from} → until}
Fmin(Rental): {{tape_id, member, from} → until}
- BCNF: since {tape_id, member, from} is a superkey
- 1NF, 2NF, 3NF since BCNF

Normalization: Design Quality
Should relations always be normalized?
Yes:
- Makes invariant checking easy
- No update anomalies
No:
- No updates (e.g., data warehouses)
Decision according to a cost model:
- Cost of selects
- Cost of joins
- Amount of redundant data
- Number of updates

Normalization: Design Quality
ER modeling and normal forms:
- Two mechanisms to set up or enhance a database scheme
- ER is more intuitive, NF uses algorithms
- ER models are often already in NF
Normalization as a complementary design tool:
- Create the ER model
- Transform it to relations
- Normalize each non-normalized relation according to the tradeoff: query time vs redundant data update

Design Quality: Short summary
Redundancies cause anomalies:
- Insert anomaly
- Deletion anomaly
- Update anomaly
Functional dependencies:
- Describe constraints between attributes
- Properties (Armstrong)
Normal forms:
- 1NF, 2NF, 3NF, BCNF
- Normalization: Decomposition, Synthesis
- Design Quality: query time vs redundant update