Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

Similar documents
CSIT5300: Advanced Database Systems

Design theory for relational databases

Relational Database Design

SCHEMA NORMALIZATION. CS 564- Fall 2015

CS54100: Database Systems

Functional Dependencies & Normalization. Dr. Bassam Hammo

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

Chapter 3 Design Theory for Relational Databases

CSC 261/461 Database Systems Lecture 13. Spring 2018

Functional Dependencies and Normalization

Functional Dependencies

Functional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013

Design Theory for Relational Databases

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Schema Refinement. Feb 4, 2010

Schema Refinement & Normalization Theory

CSE 344 AUGUST 3 RD NORMALIZATION

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

Schema Refinement and Normal Forms

Relational-Database Design

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Relational Design Theory

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CSC 261/461 Database Systems Lecture 11

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

CSE 344 MAY 16 TH NORMALIZATION

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

Schema Refinement and Normalization

Design Theory for Relational Databases

Chapter 8: Relational Database Design

Normal Forms Lossless Join.

Functional Dependencies

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Relational Database Design

Chapter 3 Design Theory for Relational Databases

Schema Refinement and Normal Forms

Information Systems (Informationssysteme)

Normaliza)on and Func)onal Dependencies

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

Comp 5311 Database Management Systems. 5. Functional Dependencies Exercises

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

Lectures 6. Lecture 6: Design Theory

Database Design and Implementation

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

CMPT 354: Database System I. Lecture 9. Design Theory

Chapter 11, Relational Database Design Algorithms and Further Dependencies

CSC 261/461 Database Systems Lecture 12. Spring 2018

Schema Refinement and Normal Forms. Chapter 19

Constraints: Functional Dependencies

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Functional Dependency and Algorithmic Decomposition

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Schema Refinement and Normal Forms

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

INF1383 -Bancos de Dados

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Schema Refinement and Normal Forms Chapter 19

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

Introduction to Management CSE 344

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Database Design and Normalization

Functional Dependencies

Functional Dependencies and Normalization

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

5. Data dependences and database schema design. Witold Rekuć Data Processing Technology 104

Constraints: Functional Dependencies

L14: Normalization. CS3200 Database design (sp18 s2) 3/1/2018

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Schema Refinement and Normal Forms

Relational Design: Characteristics of Well-designed DB

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Schema Refinement and Normal Forms. Why schema refinement?

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

CMPS Advanced Database Systems. Dr. Chengwei Lei CEECS California State University, Bakersfield

Databases 2012 Normalization

LOGICAL DATABASE DESIGN Part #1/2

CSE 132B Database Systems Applications

Introduction to Data Management CSE 344

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Detecting Functional Dependencies Felix Naumann

Transcription:

Relational Design Theory I Functional Dependencies Functional Dependencies: why? Design methodologies: Bottom up (e.g. binary relational model) Top-down (e.g. ER leads to this) Needed: tools for analysis of quality of relational schema Goals: reducing redundancy (update/deletion anomalies) avoiding spurious tuples reducing null values Redundancy and Anomalies I Insertion, Update Anomalies Example (Movie database) MOVIE(title, actorname, year, length, country, role) Consider tasks: update year; insert actor; insert movie delete actor 1

Redundancy and Anomalies II Deletion Anomalies Example (student activities) Student Activity Fee Marcus Brennigan Piano $20 Deepa Patel Swimming $15 Marcus Brennigan Swimming $15 Abigail Winter Tennis $30 Prakash Patel Skiing $150 delete student Abigail Winter. Null values I Example (student activities) SID Activity Fee 1001 Piano $20 1090 Swimming $15 1001 Swimming $15 insert new activity Chess with a fee of $20 (what would be a better design) Null values II Example section(cn, sn, teacherid, collink) versus section(cn, sn, teacherid) col(cn, sn, collink) 2

Spurious Tuples Loss of information through additional, spurious tuples. Example: Split student activity into (SID, Fee) and (Activity, Fee) What is the problem? What is the real solution? FUNCTIONAL DEPENDENCIES Functional Dependencies Example (university) SID determines LastName CID determines CourseName CID determines CourseNr {StudentID, GroupID} determines Joined 3

Functional Dependencies I R(A 1, A 2,, A n ) X and Y are subsets of {A 1, A 2,, A n } X Y means that fixing the values of all attributes in X determines the values of all attributes in Y X is called a determinant of Y Dependencies as constraints FDs are constraints we put on the relational schema We can see whether a relational state violates a FD (example) We cannot deduce FDs from a relational state. A relational state fulfilling a FD is a model of that FD. Keys and Superkeys R(A 1, A 2,, A n ) X subset of {A 1, A 2,, A n } X is superkey if X {A 1, A 2,, A n } A minimal superkey is a key, i.e. X is a key if 1) X is a superkey, and 2) no proper subset of X is a superkey. Examples: University Relations Lot(Lot#, County, PropertyID) 4

Inference on FDs {SID} {LastName, FirstName, SID, SSN, Career, Program, City, Started} We can conclude (among others) that {SID} {Career} {SID} {LastName} {SID} {SSN, City} Inference on FDs {StudentID, CourseID} {Quarter, Year} {StudentID} {SID, SSN, City} {SSN} {LastName, FirstName} {Name} {PresidentID, Founded} Which of the following FDs can we infer from these rules? {Name} {Founded} {StudentID, CourseID} {Year, LastName} {SSN} {City} Trivial Dependencies X Y is trivial, if Y is contained in X. R(A, B, C, D) with FDs AB C (or {A, B} {C}) C D D A What nontrivial FDs can we deduce? What are the keys of R? Are there superkeys which are not keys? 5

Inference Example R(A, B, C, D, E) with primary key {A,C,D} And FDs AB CD B E D E What nontrivial FDs can we deduce? What are the keys of R? Reasoning about FDs A FD X Y can be inferred from a set F of functional dependencies, if it holds true in every relational state that satisfies all FDs in F. In other words: every model of F is a model of X Y Example: from {A BC, C D} we can infer {A D} Reasoning about FDs Two sets F and G of FDs are equivalent, if all FDs in F can be inferred from G, and vice versa. In other words: every model of F is a model of G and vice versa Example: {AB C, A B} and {A B, A C} are equivalent. 6

Rules Reflexivity (triviality) Augmentation Transitivity Decomposition Union Pseudotransitive rule Armstrong s inference rules (imply other rules) (pg. 81) Closure How do we determine whether we can infer a FD X Y from a set of FDs F? X +, the closure of X, is the set of all attributes determined by X under F. X Y follows from F if and only if Y is in X + (with regard to F) Example: R(A,B,C,D,E) with FD {AB DE, A E C BD, D E}, does A D? Does AC D? Computing the Closure Given: set of FDs F set of attributes X Goal: X +, the set of all attributes determined by X Algorithm: X + := X while there is Y Z in F such that Y X + and Z not contained in X + X + := X + U Z 7

Closure Examples R(A,B,C,D,E,F,G,H) F = {AE B, BH C, CDE F, G EH, GH D Compute {A} + {AG} + {B} + {AEH} + Why closure? X Y follows from F if and only if Y is in X + (with regard to F) X is superkey for R(A 1, A 2,, A n ) with FDs F if and only if X + = {A 1, A 2,, A n } (with regard to F) Allows us to test for (Candidate) key Inference Equivalence of systems of FDs Cover and Equivalence F, G: sets of FDs F covers G, if all FDs in G can be inferred from F. in other words: every model of F is a model of G F and G are equivalent, if F covers G and G covers F. 8

Minimal Sets of Dependencies F, a set of FDs is minimal, if 1. The rhs of every dependency in F is a single attribute (a singleton). 2. No dependency X A in F can be replaced by Y A, where Y is a proper subset of X, such that the new system of dependencies is equivalent to F. 3. No dependency can be removed from F such that the new system of dependencies is still equivalent to F. Canonical form with no redundancies. Minimal Cover G is minimal cover of F, if G is minimal, and it covers F. Algorithm: 1. Use decomposition rule to split all rhs. 2. Sequentially try removing each attribute from each rule, and retain new rule if system is still equivalent. 3. If removing a dependency leaves the system equivalent to the old system, then remove it. Minimal Cover Algorithm To test whether G {X A} is equivalent to G we only need to test whether X A can be inferred from G - {X A}. To test whether G {X A} U {(X-{B}) A} is equivalent to G we only need to test whether X-{B} A can be inferred from G. 9

Minimal Cover Examples R(A,B,C) with FDs {AB C, A B} R(A,B,C,D) with FDs {A BC, B AC, D ABC} R(A,B,C,D,E,F,G) with FDs {BCD A, BC E, A F, F G, C D, A G} Canonical Cover F, a set of FDs is canonical, if 1. No dependency X Y in F can be replaced by X Y, where X is a proper subset of X, such that the new system of dependencies is equivalent to F. 2. No dependency X Y in F can be replaced by X Y, where Y is a proper subset of Y, such that the new system of dependencies is equivalent to F. 3. Every lhs of a dependency occurs at most once. Canonical Cover Algorithm We can compute the canonical cover of a set FD of functional dependencies, by computing their minimal cover and recombining the rules with identical lhs. Example: If we have a minimal cover FD = {A B, A C, B E, BC D, BC F, C E, }, the canonical cover is {A BC, B E, BC DF, C E} 10