CSIT5300: Advanced Database Systems

Similar documents
FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

LOGICAL DATABASE DESIGN Part #1/2

Functional Dependencies & Normalization. Dr. Bassam Hammo

Relational Database Design

Relational-Database Design

Chapter 7: Relational Database Design

Functional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Schema Refinement and Normal Forms. Chapter 19

Chapter 8: Relational Database Design

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Functional Dependency and Algorithmic Decomposition

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Lectures 6. Lecture 6: Design Theory

Schema Refinement and Normal Forms

CSE 132B Database Systems Applications

Constraints: Functional Dependencies

Schema Refinement & Normalization Theory

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

CS54100: Database Systems

Lecture 6 Relational Database Design

Schema Refinement. Feb 4, 2010

FUNCTIONAL DEPENDENCY THEORY II. CS121: Relational Databases Fall 2018 Lecture 20

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

SCHEMA NORMALIZATION. CS 564- Fall 2015

Relational Design: Characteristics of Well-designed DB

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Shuigeng Zhou. April 6/13, 2016 School of Computer Science Fudan University

Relational Database Design

CMPS Advanced Database Systems. Dr. Chengwei Lei CEECS California State University, Bakersfield

Schema Refinement and Normal Forms

Functional Dependencies

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

INF1383 -Bancos de Dados

Functional Dependencies and Normalization

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Constraints: Functional Dependencies

Schema Refinement and Normal Forms. Why schema refinement?

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

Functional Dependencies

Design theory for relational databases

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 10. Logical consequence (implication) Implication problem for fds

Introduction to Data Management. Lecture #6 (Relational Design Theory)

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normalization

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Schema Refinement and Normal Forms

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

Chapter 3 Design Theory for Relational Databases

Functional Dependencies

Functional Dependency Theory II. Winter Lecture 21

Schema Refinement and Normal Forms

Schema Refinement: Other Dependencies and Higher Normal Forms

Information Systems (Informationssysteme)

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

HKBU: Tutorial 4

CSE 344 AUGUST 3 RD NORMALIZATION

CS322: Database Systems Normalization

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Relational Design Theory

Normaliza)on and Func)onal Dependencies

Lecture 15 10/02/15. CMPSC431W: Database Management Systems. Instructor: Yu- San Lin

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Background: Functional Dependencies. æ We are always talking about a relation R, with a æxed schema èset of attributesè and a

Detecting Functional Dependencies Felix Naumann

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Normal Forms Lossless Join.

Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms

CSC 261/461 Database Systems Lecture 11

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

Design Theory for Relational Databases

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CSE 344 MAY 16 TH NORMALIZATION

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

Consider a relation R with attributes ABCDEF GH and functional dependencies S:

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Functional Dependencies

Functional Dependencies and Normalization

CSC 261/461 Database Systems Lecture 13. Spring 2018

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Comp 5311 Database Management Systems. 5. Functional Dependencies Exercises

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

Transcription:

CSIT5300: Advanced Database Systems L05: Functional Dependencies Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR, China kwtleung@cse.ust.hk CSIT5300

Functional Dependencies (FD) Definition Let R be a relation schema and X, Y be sets of attributes in R. A functional dependency from X to Y exists if and only if: - For every instance R of R, if two tuples in R agree on the values of the attributes in X, then they agree on the values of the attributes in Y We write X Y and say that X determines Y Example on PGStudent (sid, name, supervisor_id, specialization): {supervisor_id} {specialization} means If two student records have the same supervisor (e.g., Stavros), then their specialization (e.g., Databases) must be the same On the other hand, if the supervisors of 2 students are different, we do not care about their specializations (they may be the same or different). Sometimes, we omit the brackets for simplicity: supervisor_id specialization kwtleung@cse.ust.hk CSIT5300 2

Trivial and Non-trivial FDs A functional dependency X Y is trivial if Y is a subset of X {name, supervisor_id} {name} If two records have the same values on both the name and supervisor_id attributes, then they obviously have the same name. Trivial dependencies hold for all relation instances A functional dependency X Y is non-trivial if Y X = {supervisor_id} {specialization} Non-trivial FDs are given in the form of constraints when designing a database. For instance, the specialization of a student must be the same as that of the supervisor They constrain the set of legal relation instances. For instance, if I try to insert two students under the same supervisor with different specializations, the insertion will be rejected by the DBMS Some FDs are neither trivial nor non-trivial kwtleung@cse.ust.hk CSIT5300 3

Functional Dependencies and Keys A FD is a generalization of the notion of a key For PGStudent (sid, name, supervisor_id, specialization), if we write {sid} {name, supervisor_id, specialization} The sid determines all attributes (i.e., the entire record) If two tuples in the relation student have the same sid, then they must have the same values on all attributes In other words, they must be the same tuple (since the relational model does not allow duplicate records) kwtleung@cse.ust.hk CSIT5300 4

Superkeys and Candidate Keys using FD A set of attributes that determines the entire tuple is a superkey {sid, name} is a superkey for the PGStudent table. Also {sid, name, supervisor_id} etc. A minimal set of attributes that determines the entire tuple is a candidate key {sid, name} is not a candidate key because I can remove the name. sid is a candidate key, and so is HKID (provided that it is stored in the table). If there are multiple candidate keys, the DB designer designates one as the primary key. kwtleung@cse.ust.hk CSIT5300 5

Closure of a Set of Functional Dependencies Given a set of functional dependencies F, there are certain other functional dependencies that are logically implied by F The set of all functional dependencies logically implied by F is the closure of F We denote the closure of F by F + We can find all of F + by applying Armstrong s Axioms: if Y X, then X Y (reflexivity) if X Y, then ZX ZY (augmentation) if X Y and Y Z, then X Z (transitivity) these rules are sound and complete. kwtleung@cse.ust.hk CSIT5300 6

Examples of Armstrong s Axioms if Y X, then X Y (reflexivity generates trivial FDs) name name name, supervisor_id name name, supervisor_id supervisor_id if X Y, then ZX ZY (augmentation) sid name (given) supervisor_id, sid supervisor_id, name if X Y and Y Z, then X Z (transitivity) sid supervisor_id (given) supervisor_id specialization (given) sid specialization kwtleung@cse.ust.hk CSIT5300 7

Additional Rules We can further simplify computation of F + by using the following additional rules If X Y holds and X Z holds, then X YZ holds (union) If X YZ holds, then X Y holds and X Z holds (decomposition) If X Y holds and ZY W holds, then ZX W holds (pseudotransitivity) The above rules can be inferred from Armstrong s axioms. (How?) E.g., pseudotransitivity X Y, ZY W (given) ZX ZY (by augmentation) ZX W (by transitivity) kwtleung@cse.ust.hk CSIT5300 8

Example of FDs in the closure F + R = (A, B, C, G, H, I) F = { A B, A C, CG H, CG I, B H } some members of F + A H (A B; B H) AG I (A C; AG CG; CG I) CG HI (CG H; CG I) kwtleung@cse.ust.hk CSIT5300 9

Closure of Attribute Sets The closure of X under F (denoted by X + ) is the set of attributes that are functionally determined by X under F: X Y is in F + Y X + If sid name is in F + then name is part of sid + i.e., sid + = {sid, name, } If sid supervisor_id is in F + then supervisor_id is part of sid + i.e., sid + = {sid, name, supervisor_id, } kwtleung@cse.ust.hk CSIT5300 10

Algorithm for Computing Attribute Closure Input: R: a relation schema F: a set of functional dependencies X R: the set of attributes for which we want to compute the closure) Output: X +: the closure of X w.r.t. F X (0) := X i = 0 repeat i = i +1 X (i) := X (i-1) Z, where Z is the set of attributes such that there exists Y Z in F, and Y X (i) until X (i) := X (i-1) return X (i) kwtleung@cse.ust.hk CSIT5300 11

Attribute Closure Example R = {A,B,C,D,E,G} F = { {A,B} {C}, {C} {A}, {B,C} {D}, {A,C,D} {B}, {D} {E,G}, {B,E} {C}, {C,G} {B,D}, {C,E} {A,G}} X = {B,D} X (0) = {B,D} {D} {E,G}, X (1) = {B,D,E,G}, {B,E} {C} X (2) = {B,C,D,E,G}, {C} {A} X (3) = {A,B,C,D,E,G} X (4) = X (3) kwtleung@cse.ust.hk CSIT5300 12

Uses of Attribute Closure Testing for superkey To test if X is a superkey, we compute X +, and check if X + contains all attributes of R. Testing functional dependencies To check if a functional dependency X Y holds (or, in other words, X Y is in F + ), just check if Y X +. Computing the closure of F, i.e., F + For each subset X R, we find the closure X +, and for each Y X +, we output a functional dependency X Y. Computing if two sets of functional dependencies F and G are equivalent, i.e., F + = G + For each functional dependency Y Z in F Compute Y + with respect to G If Z Y + then Y Z is in G + And vice versa kwtleung@cse.ust.hk CSIT5300 13

Redundancy of FDs Sets of functional dependencies may have redundant dependencies that can be inferred from the others Example: {A} {C} is redundant in: {{A} {B}, {B} {C},{A} {C}} Because: {A} {B}, {B} {C}: {A} {C} (transitivity) There may be extraneous/redundant attributes on the LHS of a dependency Let α β be a functional dependency in F. Attribute A is extraneous in α if F logically implies (F - {α β}) {(α-a) β} Example : F = {{A} {B}, {B} {C}, {A,C} {D}} can be simplified to {{A} {B}, {B} {C}, {A} {D}} Because: {A} {B}, {B} {C}: {A} {C} (transitivity) {A} {C} : {A} {A,C} (augmentation) {A} {A,C}, {A,C} {D}: {A} {D} (transitivity) kwtleung@cse.ust.hk CSIT5300 14

Redundancy of FDs (cont.) There may be extraneous/redundant attributes on the RHS of a dependency Let α β be a functional dependency in F. Attribute A is extraneous in β if (F - {α β}) {α (β-a)} logically implies F Example: F = {{A} {B}, {B} {C}, {A} {C,D}} can be simplified to {{A} {B}, {B} {C}, {A} {D}} Because: {A} {B}, {B} {C}: {A} {C} (transitivity) {A} {C}, {A} {D}: {A} {C,D} (union) kwtleung@cse.ust.hk CSIT5300 15

Canonical Cover A canonical cover for F is a set of dependencies F c such that F and F c are equivalent F c contains no redundancy Each left side of functional dependency in F c is unique For instance, if we have two FD, X Y and X Z, we convert them to X YZ. Algorithm for canonical cover of F: repeat Use the union rule to replace any dependencies in F X 1 Y 1 and X 1 Y 2 with X 1 Y 1 Y 2 Find a functional dependency X Y with an extraneous attribute either in X or in Y If an extraneous attribute is found, delete it from X Y until F does not change return F Note: The union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied kwtleung@cse.ust.hk CSIT5300 16

Example for Computing the Canonical Cover R = (A, B, C) F = { A BC, B C, A B, AB C } Combine A BC and A B into A BC Set is now F = {A BC, B C, AB C} A is extraneous in AB C, because: B C is already in F Set is now {A BC, B C} C is extraneous in A BC, because: A B, B C: A C (transitivity) A B, A C: A BC (union) The canonical cover is: F c = { A B, B C } kwtleung@cse.ust.hk CSIT5300 17