Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Similar documents
CSE 132B Database Systems Applications

SCHEMA NORMALIZATION. CS 564- Fall 2015

CS322: Database Systems Normalization

Relational Database Design

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Constraints: Functional Dependencies

Schema Refinement & Normalization Theory

INF1383 -Bancos de Dados

Functional Dependencies & Normalization. Dr. Bassam Hammo

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

CSIT5300: Advanced Database Systems

Database Design and Implementation

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Chapter 8: Relational Database Design

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

CSC 261/461 Database Systems Lecture 13. Spring 2018

Databases 2012 Normalization

Chapter 7: Relational Database Design

CSC 261/461 Database Systems Lecture 11

Schema Refinement and Normal Forms Chapter 19

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Constraints: Functional Dependencies

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Relational Design Theory

Schema Refinement and Normal Forms. Why schema refinement?

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

Schema Refinement and Normal Forms. Chapter 19

Chapter 3 Design Theory for Relational Databases

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Schema Refinement and Normal Forms

Chapter 11, Relational Database Design Algorithms and Further Dependencies

Information Systems (Informationssysteme)

Schema Refinement. Feb 4, 2010

Schema Refinement and Normal Forms

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

Functional Dependency and Algorithmic Decomposition

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

Schema Refinement and Normalization

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Lecture #7 (Relational Design Theory, cont d.)

Design theory for relational databases

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

CS54100: Database Systems

CSC 261/461 Database Systems Lecture 12. Spring 2018

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Functional Dependencies

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Lectures 6. Lecture 6: Design Theory

Relational Design: Characteristics of Well-designed DB

CSE 344 AUGUST 3 RD NORMALIZATION

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Design Theory for Relational Databases

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Relational-Database Design

Introduction to Data Management CSE 344

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

Design Theory for Relational Databases

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Problem about anomalies

L14: Normalization. CS3200 Database design (sp18 s2) 3/1/2018

Chapter 10. Normalization Ext (from E&N and my editing)

DECOMPOSITION & SCHEMA NORMALIZATION

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Relational Database Design

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

Schema Refinement and Normal Forms

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CSE 344 MAY 16 TH NORMALIZATION

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

MIS Database Systems Schema Refinement and Normal Forms

Functional Dependency Theory II. Winter Lecture 21

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Functional Dependencies

Database Design and Normalization

CSE 544 Principles of Database Management Systems

CMPT 354: Database System I. Lecture 9. Design Theory

L13: Normalization. CS3200 Database design (sp18 s2) 2/26/2018

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Desirable properties of decompositions 1. Decomposition of relational schemes. Desirable properties of decompositions 3

Functional Dependencies and Normalization

Normaliza)on and Func)onal Dependencies

Transcription:

Functional Dependencies and Normalization Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu 1

Goal Given a database schema, how do you judge whether or not the design is good? How do you ensure it does not have redundancy or anomaly problems? To ensure your database schema is in a good form we use: Functional Dependencies Normalization Rules 2

What is Normalization Normalization is a set of rules to systematically achieve a good design If these rules are followed, then the DB design is guarantee to avoid several problems: Inconsistent data Anomalies: insert, delete and update Redundancy: which wastes storage, and often slows down query processing 3

Problem I: Insert Anomaly Student snumber sname pnumber pname s1 Dave p1 s2 Greg p2 ER Student Info Professor Info Question: Could we insert a professor without student? Note: We cannot insert a professor who has no students. Insert Anomaly: We are not able to insert valid value/(s) 4

Problem II: Delete Anomaly Student snumber sname pnumber pname s1 Dave p1 s2 Greg p2 ER Student Info Professor Info Question: Can we delete a student and keep a professor info? Note: We cannot delete a student that is the only student of a professor. Delete Anomaly: We are not able to perform a delete without losing some valid information. 5

Problem III: Update Anomaly Student snumber sname pnumber pname s1 Dave p1 s2 Greg p1 VV VV Student Info Professor Info Question: Can we simply update a professor s name? Note: To update the name of a professor, we have to update in multiple tuples. Update Anomaly: To update a value, we have to update multiple rows. Update anomalies are due to redundancy. 6

Problem IV: Inconsistency Student snumber sname pnumber pname s1 Dave p1 s2 Greg p1 VV Student Info Professor Info What if the name of professor p1 is updated in one place and not the other!!! Inconsistent Data: The same object has multiple values. Inconsistency is due to redundancy. 7

Schema Normalization Following the normalization rules, we avoid Insert anomaly Delete anomaly Update anomaly Inconsistency 8

What to Cover Functional Dependencies (FDs) Closure of Functional Dependencies Lossy & Lossless Decomposition Normalization 9

Usage of Functional Dependencies Discover all dependencies between attributes Identify the keys of relations Enable good (Lossless) decomposition of a given relation 10

Functional Dependencies (FDs) Student snumber sname address 1 Dave 144FL 2 Greg 320FL Suppose we have the FD: snumber address That is, there is a functional dependency from snumber to address Meaning: A student number determines the student address Or: For any two rows in the Student relation with the same value for snumber, the value for address must be same. 11

Functional Dependencies (FDs) Require that the value for a certain set of attributes determines uniquely the value for another set of attributes A functional dependency is a generalization of the notion of a key FD: A1,A2, An B1, B2, Bm L.H.S R.H.S 12

Functional Dependencies (FDs) The basic form of a FDs A1,A2, An B1, B2, Bm L.H.S R.H.S >> The values in the L.H.S uniquely determine the values in the R.H.S attributes (when you lookup the DB) >> It does not mean that L.H.S values compute the R.H.S values Examples: SSN à personname, persondob, personaddress DepartmentID, CourseNum à CourseTitle, NumCredits personname X personaddress 13

FD and Keys Student snumber sname address 1 Dave 144FL 2 Greg 320FL Primary Key : <snumber> Questions : Does a primary key implies functional dependencies? Which ones? Does unique keys imply functional dependencies? Which ones? Does a functional dependency imply keys? Which ones? Observation : Any key (primary or candidate) or superkey of a relation R functionally determines all attributes of R. 14

Functional Dependencies (FDs) Let R be a relation schema where α R and β R -- α and β are subsets of R s attributes The functional dependency α β holds on R if and only if: For any legal instance of R, whenever any two tuples t1 and t2 agree on the attributes α, they also agree on the attributes β. That is, t1[α]=t2[α] t1[β] =t2[β] A,B A ) with B 1 4 1 5 3 7 A à B (Does not hold) B à A (holds) 15

Functional Dependencies & Keys K is a superkey for relation schema R if and only if K R -- K determines all attributes of R K is a candidate key for R if and only if K R, and No α K, α R Keys imply FDs, and FDs imply keys 16

Example I Student(SSN, Fname, Mname, Lname, DoB, address, age, admissiondate) If you know that SSN is a key, Then SSN à Fname, Mname, Lname, DoB, address, age, admissiondate If you know that (Fname, Mname, Lname) is a key, Then Fname, Mname, Lname à SSN, DoB, address, age, admissiondate Need to know all of L.H.S to determine any of the R.H.S 17

Example II Student(SSN, Fname, Mname, Lname, DoB, address, age, admissiondate) If you know that SSN à Fname, Mname, Lname, DoB, address, age, admissiondate Then, we infer that SSN is a candidate key If you know that Fname, Mname, Lname à SSN, DoB, address, age, admissiondate Then, we infer that (Fname, Mname, Lname) is a key. Is it Candidate or super key??? Does any pair of attributes together form a key?? If no à (Fname, Mname, Lname) is a candidate key (minimal) If yes à (Fname, Mname, Lname) is a super key 18

Example III Does this FD hold? Title, year à length, genre, studioname YES Does this FD hold? Title, year à starname NO What is a key of this relation instance? {title, year, starname} Is it candidate key? >> For this instance à not a candidate key (title, starname) can be a key 19

Properties of FDs Consider A, B, C, Z are sets of attributes Reflexive (trivial): A B is trivial if B A customer_name, loan_number customer_name customer_name customer_name 20

Properties of FDs (Cont d) Consider A, B, C, Z are sets of attributes Transitive: if A B, and B C, then A C Augmentation: Union: if A B, then AZ BZ if A B, A C, then A BC Decomposition: Use these properties to derive more FDs if A BC, then A B, A C 21

Example Use the FD properties to derive more FDs Given R( A, B, C, D, E) F = {A à BC, DE à C, B à D} Is A a key for R or not? Does A determine all other attributes? A à A B C D NO Is BE a key for R? BE à B E D C NO Is ABE a candidate or super key for R? ABE à A B E D C AE à A E B C D >> ABE is a super key >> AE is a candidate key 22

What to Cover Functional Dependencies (FDs) Closure of Functional Dependencies Lossy & Lossless Decomposition Normalization 23

Closure of a Set of Functional Dependencies Given a set F set of functional dependencies, there are other FDs that can be inferred based on F For example: If A B and B C, then we can infer that A C Closure set F è F + The set of all FDs that can be inferred from F We denote the closure of F by F + F + is a superset of F Computing the closure F + of a set of FDs can be expensive 24

Inferring FDs Suppose we have: a relation R (A, B, C, D) and functional dependencies A B, C D, A à C Question: What is a key for R? We can infer A ABC, and since C D, then A ABCD Hence A is a key in R Is it is the only candidate key??? 25

Attribute Closure Attribute Closure of A Given a set of FDs, compute all attributes X that A determines A à X Attribute closure is easy to compute Just recursively apply the transitive property A can be a single attribute or set of attributes 26

Algorithm for Computing Attribute Closures Computing the closure of set of attributes {A1, A2,, An}: 1. Let X = {A1, A2,, An} 2. If there exists a FD: B1, B2,, Bm C, such that every Bi X, then X = X C 3. Repeat step 2 until no more attributes can be added. X is the closure of the {A1, A2,, An} attributes X = {A1, A2,, An} + 27

Example 1: Inferring FDs Assume relation R (A, B, C) Given FDs : A B, B C, C A What are the possible keys for R? Compute the closure of each attribute X, i.e., X + X + contains all attributes, then X is a key For example: {A} + = {A, B, C} {B} + = {A, B, C} {C} + = {A, B, C} So keys for R are <A>, <B>, <C> 28

Example 2: Attribute Closure Given R( A, B, C, D, E) F = {A à BC, DE à C, B à D} What is the attribute closure {AB} +? {AB} + = {A B} {AB} + = {A B C} {AB} + = {A B C D} What is the attribute closure {BE} +? {BE} + = {B E} {BE} + = {B E D} {BE} + = {B E D C} Set of attributes α is a key if α + contains all attributes 29

Example 3: Inferring FDs Assume relation R (A, B, C, D, E) Given F = {A à B, B à C, C D à E } Does A à E? The above question is the same as Is E in the attribute closure of A (A + )? Is A à E in the function closure F +? A à E does not hold A D à ABCDE does hold A D is a key for R 30

Summary of FDs They capture the dependencies between attributes How to infer more FDs using properties such as transitivity, augmentation, and union Functional closure F + Attribute closure A + Relationship between FDs and keys 31

What to Cover Functional Dependencies (FDs) Closure of Functional Dependencies Lossy & Lossless Decomposition Normalization 32

Decomposing Relations Greg Dave sname p2 p1 pnumber s2 s1 pname snumber StudentProf FDs: pnumber pname Greg Dave sname p2 p1 pnumber s2 s1 snumber Student p2 p1 pnumber pname Professor Greg Dave sname pname S2 S1 snumber Student p2 p1 pnumber pname Professor 33 Lossless Lossy

Lossless vs. Lossy Decomposition Assume R is divided into R1 and R2 Lossless Decomposition R1 natural join R2 should create exactly R Lossy Decomposition R1 natural join R2 adds more records (or deletes records) from R 34

Lossless Decomposition snumber s1 s2 StudentProf sname pnumber Dave p1 Greg p2 FDs: pnumber pname pname Student Lossless Professor snumber s1 s2 sname Dave Greg pnumber p1 p2 pnumber p1 p2 pname Student & Professor are lossless decomposition of StudentProf (Student Professor = StudentProf) 35

Lossy Decomposition snumber s1 s2 StudentProf sname pnumber Dave p1 Greg p2 FDs: pnumber pname pname Student Lossy Professor snumber S1 S2 sname Dave Greg pname pnumber p1 p2 pname Student & Professor are lossy decomposition of StudentProf (Student Professor!= StudentProf) 36

Goal: Ensure Lossless Decomposition How to ensure lossless decomposition? Answer: The common columns must be candidate key in one of the two relations 37

Back to our example snumber StudentProf sname pnumber pname s1 Dave p1 s2 Greg p2 FDs: pnumber pname pnumber is candidate key Student Lossless Professor snumber s1 s2 sname Dave Greg Student pnumber p1 p2 Lossy pnumber pname p1 p2 Professor pname is not candidate key snumber S1 S2 sname Dave Greg pname pnumber p1 p2 pname 38

What to Cover Functional Dependencies (FDs) Closure of Functional Dependencies Lossy & Lossless Decomposition Normalization 39