L14: Normalization. CS3200 Database design (sp18 s2) 3/1/2018

Similar documents
CSC 261/461 Database Systems Lecture 11

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CSE 344 MAY 16 TH NORMALIZATION

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CMPT 354: Database System I. Lecture 9. Design Theory

Lectures 6. Lecture 6: Design Theory

CSE 344 AUGUST 3 RD NORMALIZATION

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Database Design and Implementation

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

Introduction to Management CSE 344

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CSC 261/461 Database Systems Lecture 12. Spring 2018

CSE 544 Principles of Database Management Systems

Schema Refinement. Feb 4, 2010

Introduction to Data Management CSE 344

CSC 261/461 Database Systems Lecture 13. Spring 2018

Practice and Applications of Data Management CMPSCI 345. Lecture 15: Functional Dependencies

L13: Normalization. CS3200 Database design (sp18 s2) 2/26/2018

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

SCHEMA NORMALIZATION. CS 564- Fall 2015

Design theory for relational databases

CSE 344 AUGUST 6 TH LOSS AND VIEWS

Schema Refinement and Normal Forms

Relational Database Design

Relational Database Design

CS322: Database Systems Normalization

CS54100: Database Systems

DECOMPOSITION & SCHEMA NORMALIZATION

Relational Design Theory

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Constraints: Functional Dependencies

Chapter 3 Design Theory for Relational Databases

Design Theory for Relational Databases

Design Theory for Relational Databases

Constraints: Functional Dependencies

INF1383 -Bancos de Dados

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

Schema Refinement & Normalization Theory

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Database Design and Normalization

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

Schema Refinement and Normal Forms Chapter 19

Schema Refinement: Other Dependencies and Higher Normal Forms

Schema Refinement and Normal Forms. Chapter 19

Functional Dependencies & Normalization. Dr. Bassam Hammo

Schema Refinement and Normal Forms

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Databases 2012 Normalization

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Schema Refinement and Normal Forms

Chapter 3 Design Theory for Relational Databases

Schema Refinement and Normal Forms. Why schema refinement?

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

Lossless Joins, Third Normal Form

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Functional Dependencies and Normalization

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Information Systems (Informationssysteme)

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

Relational Design: Characteristics of Well-designed DB

FUNCTIONAL DEPENDENCY THEORY II. CS121: Relational Databases Fall 2018 Lecture 20

Functional Dependency Theory II. Winter Lecture 21

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #15: BCNF, 3NF and Normaliza:on

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

Chapter 11, Relational Database Design Algorithms and Further Dependencies

Schema Refinement and Normalization

Chapter 8: Relational Database Design

Normal Forms Lossless Join.

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms

Lecture #7 (Relational Design Theory, cont d.)

Introduction to Data Management CSE 344

Relational-Database Design

Functional Dependency and Algorithmic Decomposition

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

Exam 1 Solutions Spring 2016

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Functional Dependencies

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Transcription:

L14: Normalization CS3200 Database design (sp18 s2) https://course.ccs.neu.edu/cs3200sp18s2/ 3/1/2018 367

Announcements! Keep bringing your name plates J Outline today - More Normalization - Project 1 discussion & Transactions after Spring Break 368

Closures, Superkeys, and (Candidate) Keys 369

What we will see next Closures Part 2 Superkeys & Keys Practice: The key or a key? 370

Why Do We Need the Closure? With closure we can find all FD s easily To check if X A - Compute X + - Check if A Î X + Note here that X is a set of attributes, but A is a single attribute. Why does considering FDs of this form suffice? Recall the Split/combine rule: X à A 1,, X à A n implies X à {A 1,, A n } 371

Using Closure to Infer ALL FDs Step 1: Compute X +, for every set of attributes X: {A} + = {A} {B} + = {B,D} {C} + = {C} {D} + = {D} {A,B} + = {A,B,C,D} {A,C} + = {A,C} {A,D} + = {A,B,C,D} {A,B,C} + = {A,B,D} + = {A,C,D} + = {A,B,C,D} {B,C,D} + = {B,C,D} {A,B,C,D} + = {A,B,C,D} Example: Given F = {A,B} à C {A,D} à B {B} à D No need to compute all of these- why? 372

Using Closure to Infer ALL FDs Step 1: Compute X +, for every set of attributes X: {A} + = {A}, {B} + = {B,D}, {C} + = {C}, {D} + = {D}, {A,B} + = {A,B,C,D}, {A,C} + = {A,C}, {A,D} + = {A,B,C,D}, {A,B,C} + = {A,B,D} + = {A,C,D} + = {A,B,C,D}, {B,C,D} + = {B,C,D}, {A,B,C,D} + = {A,B,C,D} Example: Given F = {A,B} à C {A,D} à B {B} à D Step 2: Enumerate all FDs X à Y, s.t. Y Í X + and X Ç Y = Æ: {A,B} à {C,D}, {A,D} à {B,C}, {A,B,C} à {D}, {A,B,D} à {C}, {A,C,D} à {B} 373

Using Closure to Infer ALL FDs Step 1: Compute X +, for every set of attributes X: {A} + = {A}, {B} + = {B,D}, {C} + = {C}, {D} + = {D}, {A,B} + = {A,B,C,D}, {A,C} + = {A,C}, {A,D} + = {A,B,C,D}, {A,B,C} + = {A,B,D} + = {A,C,D} + = {A,B,C,D}, {B,C,D} + = {B,C,D}, {A,B,C,D} + = {A,B,C,D} Example: Given F = {A,B} à C {A,D} à B {B} à D Step 2: Enumerate all FDs X à Y, s.t. Y Í X + and X Ç Y = Æ: {A,B} à {C,D}, {A,D} à {B,C}, {A,B,C} à {D}, {A,B,D} à {C}, {A,C,D} à {B} Y is in the closure of X 374

Using Closure to Infer ALL FDs Step 1: Compute X +, for every set of attributes X: {A} + = {A}, {B} + = {B,D}, {C} + = {C}, {D} + = {D}, {A,B} + = {A,B,C,D}, {A,C} + = {A,C}, {A,D} + = {A,B,C,D}, {A,B,C} + = {A,B,D} + = {A,C,D} + = {A,B,C,D}, {B,C,D} + = {B,C,D}, {A,B,C,D} + = {A,B,C,D} Example: Given F = {A,B} à C {A,D} à B {B} à D Step 2: Enumerate all FDs X à Y, s.t. Y Í X + and X Ç Y = Æ: {A,B} à {C,D}, {A,D} à {B,C}, {A,B,C} à {D}, {A,B,D} à {C}, {A,C,D} à {B} The FD X à Y is non-trivial 375

Keys and Superkeys A superkey is a set of attributes A 1,, A n s.t. for any other attribute B in R, we have {A 1,, A n } à B I.e. all attributes are functionally determined by a superkey A key is a minimal superkey (also called "candidate key") This means that no subset of a key is also a superkey (i.e., dropping any attribute from the key makes it no longer a superkey) 376

Finding Keys and Superkeys For each set of attributes X - Compute X + - If X + = set of all attributes then X is a superkey - If X is minimal, then it is a key 377

Example of Finding Keys Product(name, price, category, color) {name, category} à price {category} à color What is a key? 378

Example of Finding Keys Product(name, price, category, color) {name, category} à price {category} à color {name, category} + = {name, price, category, color} = the set of all attributes this is a superkey this is a key, since neither name nor category alone is a superkey 379

Practice Activity-21.ipynb 380

Complete Normalization Practice! 381

Parking Tickets: Original List 382

Parking Tickets: Original List ST ID L Name F Name Phone No St Lic Lic No Ticket # Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 383

Parking Tickets: Original List ST ID L Name F Name Phone No St Lic Lic No Ticket # Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 384

Parking Tickets: Relation in 1NF ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 385

Parking Tickets: Relation in 1NF ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 386

Parking Tickets: Dependency diagram Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 387

Parking Tickets: Dependency diagram Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 388

Parking Tickets: Dependency diagram Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine Next assume, students can have more than one car: ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 389

Parking Tickets: Dependency diagram Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine Next assume, students can have more than one car: ParkingTickets STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 38249 Brown Thomas 111-7804 FL BRY 123 15634 10/17/10 2 $25 38249 Brown Thomas 111-7804 FL BRY 123 16017 11/13/10 1 $15 82453 Green Sally 391-1689 AL TRE 141 14987 10/05/10 3 $100 82453 Green Sally 391-1689 AL TRE 141 16293 11/18/10 1 $15 82453 Green Sally 391-1689 AL TRE-141 17892 12/13/10 2 $25 390

Parking Tickets: Relations in 3NF Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 391

Parking Tickets: Relations in 3NF Assume, each student can have maximal one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine Student STID LName FName PhoneNo StLic LicNo TicketCode Code Fine Ticket Ticketnr Date Code@ STID@ 392

Parking Tickets: Relations in 3NF Next assume, students can have more than one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine 393

Parking Tickets: Relations in 3NF Next assume, students can have more than one car: STID LName FName PhoneNo StLic LicNo Ticketnr Date Code Fine Student STID LName FName PhoneNo Car StLic LicNo STID@ TicketCode Code Fine Ticket Ticketnr Date Code@ StLic@ LicNo@ 394

Parking Tickets: Adding Violation Assume, each student can have maximal one car: Student STID LName FName PhoneNo StLic LicNo TicketCode Code Fine Ticket Ticketnr Date Code@ STID@ 395

Parking Tickets: Adding Violation Assume, each student can have maximal one car: Student STID LName FName PhoneNo StLic LicNo TicketCode Code Fine Violation Ticket Ticketnr Date Code@ STID@ 396

Parking Tickets: ER Diagram Student STID LName FName PhoneNo StLic LicNo TicketCode Code Fine Violation Ticket Ticketnr Date Code@ STID@ ER diagrams do not show foreign keys! 397

Complete Normalization Practice! 398

Example: DreamHome Rental StaffPropertyInspection propertyno paddress idate itime comments staffno sname carreg PG4 PG16 6 Lawrence St, Glasgow 5 Novar Dr, Glasgow 18-Oct-03 10:00 need to replace SG37 Ann Beech M231 JGR crockery 22-Apr-04 09:00 in good order SG14 David Ford M533 HDR 1-Oct-04 12:00 damp rot in bathroom SG14 David Ford N721 HFR 22-Apr-04 13:00 replace living room SG14 David Ford M533 HDR carpet 24-Oct-04 14:00 good condition SG37 Ann Beech N721 HFR Can a database store this information? Is it in 1NF? Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 399

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg PG4 18-Oct-03 10:00 6 Lawrence St, Glasgow PG4 22-Apr-04 09:00 6 Lawrence St, Glasgow PG4 1-Oct-04 12:00 6 Lawrence St, Glasgow need to replace SG37 Ann Beech M231 JGR crockery in good order SG14 David Ford M533 HDR damp rot in bathroom SG14 David Ford N721 HFR PG16 22-Apr-04 13:00 5 Novar Dr, Glasgow replace living room SG14 David Ford M533 HDR carpet PG16 24-Oct-04 14:00 5 Novar Dr, Glasgow good condition SG37 Ann Beech N721 HFR No! Only now a database can store the information: 1NF But we still need a primary key Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 400

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg PG4 18-Oct-03 10:00 6 Lawrence St, Glasgow PG4 22-Apr-04 09:00 6 Lawrence St, Glasgow PG4 1-Oct-04 12:00 6 Lawrence St, Glasgow need to replace SG37 Ann Beech M231 JGR crockery in good order SG14 David Ford M533 HDR damp rot in bathroom SG14 David Ford N721 HFR PG16 22-Apr-04 13:00 5 Novar Dr, Glasgow replace living room SG14 David Ford M533 HDR carpet PG16 24-Oct-04 14:00 5 Novar Dr, Glasgow good condition SG37 Ann Beech N721 HFR Now 1NF + PK Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 401

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg Draw all FDs Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 402

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg (full, PK) Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 403

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg (full, PK) (partial) (transitive) Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 404

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg (full, PK) (other) (partial) (transitive) Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 405

Example: DreamHome Rental StaffPropertyInspection propertyno idate itime paddress comments staffno sname carreg (full, PK) (other) (partial) (Candidate K) (Candidate K) (transitive) Members of DreamHome inspect properties When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. (One car per person & day) However, a car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 406

Example: DreamHome Rental Property propertyno paddress (PK, now full, former partial) Staff staffno sname Inspection propertyno @ (PK, now full, former transitive) idate itime comments staffno@ carreg (PK) (other) Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 407

Example: DreamHome Rental Property propertyno paddress StaffCar idate staffno@ carreg Staff staffno Inspection propertyno @ sname idate@ itime comments staffno@ (PK) We also need to keep track of the fact that "staffno@" was already a foreign key before we put it into another table Extra question: We now have a composite FK (idate, staffno) from INSPECTION to STAFFCAR. Thus (idate, staffno) is a composite PK in STAFFCAR. Assume we like to replace it with a surrogate key. How would the resulting completely normalized tables look like? Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 408

Example: DreamHome Rental Property propertyno paddress StaffCar scid idate staffno@ carreg Staff staffno sname Inspection propertyno @ scid@ itime comments (PK) This is now fully normalized. Downside: we need to join INSPECTION with STAFFCAR every time we like to find out about when a property (by "properyno") was last inspected Source: Connolly, Begg: Database systems, 4th ed, p. 423, 2005. 409

Boyce-Codd Normal Form 410

Quick recap FDs Functional Dependency: The value of one set of attributes (the determinant) uniquely determines the value of another set of attributes (the dependents) A superkey is as a set of attributes of a relation schema upon which all attributes of the schema are functionally dependent. A candidate key (CK) is an attribute or non-redundant combination of attributes that uniquely identifies a row in a relation 2NF: no Partial FD: A FD in which one or more nonkey attributes are functionally dependent on part (but not all) of the PK 3NF: no Transitive dependency: An FD between two (or more) nonkey attributes 411

Boyce-Codd Normal Form (BCNF) Boyce-Codd normal form (BCNF) - A relation is in BCNF, if and only if, every determinant is a candidate key. The difference between 3NF and BCNF is that for a functional dependency AàB, - 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, - whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key. 412

3NF to BCNF Source: Hoffer, Ramesh, Topi, Modern database management, 10 th ed, Appendix B, 2010. 413

3NF to BCNF 414

3NF to BCNF Source: Hoffer, Ramesh, Topi, Modern database management, 10 th ed, Appendix B, 2010. 415

BCNF vs 3NF BCNF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either: - X is a superkey of R - (or Y is a subset of X) 3NF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either: - X is a superkey of R, or - Y is a subset of K for some key K of R N.b., no subset of a key is a key - (or Y is a subset of X) 416

Back to Conceptual Design Now that we know how to find FDs, it s a straight-forward process: - Search for bad FDs - If there are any, then keep decomposing the table into sub-tables until no more bad FDs - When done, the database schema is normalized Recall: there are several normal forms 417

Boyce-Codd Normal Form (BCNF) Main idea is that we define good and bad FDs as follows: - X à A is a good FD if X is a (super)key In other words, if A is the set of all attributes - X à A is a bad FD otherwise We will try to eliminate the bad FDs! 418

Boyce-Codd Normal Form (BCNF) Why does this definition of good and bad FDs make sense? If X is not a (super)key, it functionally determines some of the attributes; therefore, those other attributes can be duplicated - Recall: this means there is redundancy - And redundancy like this can lead to data anomalies! EmpID Name Phone Position E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer 419

Boyce-Codd Normal Form BCNF is a simple condition for removing anomalies from relations: A relation R is in BCNF if: if {A 1,..., A n } à B is a non-trivial FD in R then {A 1,..., A n } is a superkey for R Equivalently: sets of attributes X, either (X + = X) or (X + = all attributes) In other words: there are no bad FDs 420

Example Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle Fred 123-45-6789 206-555-6543 Seattle Joe 987-65-4321 908-555-2121 Westfield Joe 987-65-4321 908-555-1234 Westfield {SSN} à {Name,City} This FD is bad because it is not a superkey Not in BCNF What is the key? {SSN, PhoneNumber} 421

Example Name SSN City Fred 123-45-6789 Seattle Joe 987-65-4321 Madison SSN PhoneNumber 123-45-6789 206-555-1234 123-45-6789 206-555-6543 987-65-4321 908-555-2121 987-65-4321 908-555-1234 Now in BCNF! {SSN} à {Name,City} This FD is now good because it is the key Let s check anomalies: Redundancy? Update? Delete? 422

BCNF Decomposition Algorithm BCNFDecomp(R): Find X s.t.: X + X and X + [all attributes] if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose R into R1(X È Y) and R2(X È Z) Return BCNFDecomp(R1), BCNFDecomp(R2) 423

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] if (not found) then Return R Find a set of attributes X which has non-trivial bad FDs, i.e. is not a superkey, using closures let Y = X + - X, Z = (X + ) C decompose R into R1(X È Y) and R2(X È Z) Return BCNFDecomp(R1), BCNFDecomp(R2) 424

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] if (not found) then Return R If no bad FDs found, in BCNF! let Y = X + - X, Z = (X + ) C decompose R into R1(X È Y) and R2(X È Z) Return BCNFDecomp(R1), BCNFDecomp(R2) 425

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose R into R1(X È Y) and R2(X È Z) Return BCNFDecomp(R1), BCNFDecomp(R2) Let Y be the attributes that X functionally determines (+ that are not in X) And let Z be the complement, the other attributes that it doesn t 426

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] Split into one relation (table) with X plus the attributes that X determines (Y) if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose R into R 1 (X È Y) and R 2 (X È Z) Y X Z Return BCNFDecomp(R1), BCNFDecomp(R2) R 1 R 2 427

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] And one relation with X plus the attributes it does not determine (Z) if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose R into R 1 (X È Y) and R 2 (X È Z) Y X Z Return BCNFDecomp(R1), BCNFDecomp(R2) R 1 R 2 428

BCNF Decomposition Algorithm BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose R into R 1 (X È Y) and R 2 (X È Z) Return BCNFDecomp(R 1 ), BCNFDecomp(R 2 ) Proceed recursively until no more bad FDs! 429

Example BCNFDecomp(R): Find a set of attributes X s.t.: X + X and X + [all attributes] if (not found) then Return R R(A,B,C,D,E) {A} à {B,C} {C} à {D} let Y = X + - X, Z = (X + ) C decompose R into R 1 (X È Y) and R 2 (X È Z) Return BCNFDecomp(R 1 ), BCNFDecomp(R 2 ) 430

Example R(A,B,C,D,E) R(A,B,C,D,E) {A} + = {A,B,C,D} {A,B,C,D,E} {A} à {B,C} {C} à {D} R 1 (A,B,C,D) {C} + = {C,D} {A,B,C,D} R 11 (C,D) R 12 (A,B,C) R 2 (A,E) 431

Practice Activity-22.ipynb 432