Relational Database Design

Similar documents
UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Constraints: Functional Dependencies

Schema Refinement: Other Dependencies and Higher Normal Forms

Constraints: Functional Dependencies

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Lossless Joins, Third Normal Form

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Functional Dependencies & Normalization. Dr. Bassam Hammo

Functional Dependencies

Databases 2012 Normalization

Schema Refinement and Normal Forms

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms. Why schema refinement?

SCHEMA NORMALIZATION. CS 564- Fall 2015

Schema Refinement. Feb 4, 2010

Relational Design: Characteristics of Well-designed DB

Relational Design Theory

Normaliza)on and Func)onal Dependencies

Schema Refinement and Normal Forms

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Chapter 3 Design Theory for Relational Databases

Schema Refinement and Normal Forms

CS322: Database Systems Normalization

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Schema Refinement & Normalization Theory

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

CS54100: Database Systems

Functional Dependency Theory II. Winter Lecture 21

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

Relational Database Design

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

Functional Dependency and Algorithmic Decomposition

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Functional Dependencies and Normalization

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Chapter 3 Design Theory for Relational Databases

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

Design Theory for Relational Databases

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Chapter 8: Relational Database Design

A few details using Armstrong s axioms. Supplement to Normalization Lecture Lois Delcambre

Schema Refinement and Normal Forms

INF1383 -Bancos de Dados

Chapter 10. Normalization Ext (from E&N and my editing)

Lecture 6 Relational Database Design

Schema Refinement and Normalization

FUNCTIONAL DEPENDENCY THEORY II. CS121: Relational Databases Fall 2018 Lecture 20

Database Design and Implementation

Schema Refinement and Normal Forms. Chapter 19

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms

Information Systems (Informationssysteme)

Background: Functional Dependencies. æ We are always talking about a relation R, with a æxed schema èset of attributesè and a

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Shuigeng Zhou. April 6/13, 2016 School of Computer Science Fudan University

Functional Dependencies

CSE 132B Database Systems Applications

Design Theory for Relational Databases

CSIT5300: Advanced Database Systems

DECOMPOSITION & SCHEMA NORMALIZATION

Database Design and Normalization

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

CSC 261/461 Database Systems Lecture 11

CSC 261/461 Database Systems Lecture 13. Spring 2018

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

Design theory for relational databases

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

CMPS Advanced Database Systems. Dr. Chengwei Lei CEECS California State University, Bakersfield

Database Design and Normalization

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

Kapitel 3: Formal Design

Relational-Database Design

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 10. Logical consequence (implication) Implication problem for fds

Lectures 6. Lecture 6: Design Theory

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Copyright is owned by the Author of the thesis. Permission is given for a copy to be downloaded by an individual for the purpose of research and

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Lecture 15 10/02/15. CMPSC431W: Database Management Systems. Instructor: Yu- San Lin

LOGICAL DATABASE DESIGN Part #1/2

HKBU: Tutorial 4

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Transcription:

Relational Database Design Jan Chomicki University at Buffalo Jan Chomicki () Relational database design 1 / 16 Outline 1 Functional dependencies 2 Normal forms 3 Multivalued dependencies Jan Chomicki () Relational database design 2 / 16

Good and bad database schemas Bad schema Repetition of information. Leads to redundancies, potential inconsistencies, and update anomalies. Inability to represent information. Leads to anomalies in insertion and deletion. Good schema relation schemas in normal form (redundancy- and anomaly-free): BCNF, 3NF. Schema decomposition improving a bad schema desirable properties: lossless join dependency preservation Jan Chomicki () Relational database design 3 / 16 Integrity constraints Functional dependencies key constraints cannot express uniqueness properties holding in a proper subset of all attributes key constraints need to be generalized to functional dependencies Other constraints not relevant for decomposition Jan Chomicki () Relational database design 4 / 16

Functional dependencies (FDs) Notation Relation schema R(A 1,..., A n ) r is an instance of R Sets of attributes of R: X, Y, Z,... {A 1,..., A n } A 1 A n instead of {A 1,..., A n }. XY instead of X Y. Functional dependency syntax: X Y semantics: r satisfies X Y if for all tuples t 1, t 2 r: if t 1 [X ] = t 2 [X ], then also t 1 [Y ] = t 2 [Y ]. Jan Chomicki () Relational database design 5 / 16 Dependency implication Implication A set of FDs F implies an FD X Y, if every relation instance that satisfies all the dependencies in F, also satisfies X Y. Notation F = X Y (F implies X Y ). Closure of a dependency set F The set of dependencies implied by F. Notation F + = {X Y : F = X Y }. Jan Chomicki () Relational database design 6 / 16

Keys Key X {A 1,..., A n } is a key of R if: 1 the dependency X A 1 A n is in F +. 2 for all proper subsets Y of X, the dependency Y A 1 A n is not in F +. Related notions superkey: superset of a key. primary key: one designated key. candidate key: one of the keys. Jan Chomicki () Relational database design 7 / 16 Inference of functional dependencies Dependency inference How to tell whether X Y F +? Inference rules (Armstrong axioms) 1 reflexivity: infer X Y if Y X attrs(r) (trivial dependency) 2 augmentation: From X Y infer XZ YZ if Z attrs(r) 3 transitivity: From X Y and Y Z, infer X Z. Jan Chomicki () Relational database design 8 / 16

Properties of axioms Armstrong axioms are: sound: if X Y is derived from F, then X Y F +. complete: if X Y F +, then X Y is derived from F. Additional (implied) inference rules 4. union: from X Y and X Z, infer X YZ 5. decomposition: from X Y infer X Z, if Z Y Jan Chomicki () Relational database design 9 / 16 Boyce-Codd Normal Form (BCNF) and 3NF BCNF A schema R is in BCNF if for every nontrivial FD X A F, X contains a key of R. Each instance of a relation schema which is in BCNF does not contain a redundancy (that can be detected using FDs alone). 3NF R is in 3NF if for every nontrivial FD X A F : X contains a key of R, or A is part of some key of R. BCNF vs. 3NF if R is in BCNF, it is also in 3NF there are relations that are in 3NF but not in BCNF. Jan Chomicki () Relational database design 10 / 16

Decompositions Decomposition Replacement of a relation schema R by two relation schema R 1 and R 2 such that R = R 1 R 2. Lossless-join decomposition (R 1, R 2 ) is a lossless-join decomposition of R with respect to a set of FDs F if for every instance r of R that satisfies F : π R1 (r) π R2 (r) = r. A simple criterion for checking whether a decomposition (R 1, R 2 ) is lossless-join: R 1 R 2 R 1 F +, or R 1 R 2 R 2 F +. Decomposition into more than two schemas generalized definition more complex losslessness test Jan Chomicki () Relational database design 11 / 16 Dependency preservation Dependencies associated with relation schema R 1 and R 2 in a decomposition (R 1, R 2 ): F R1 = {X Y X Y F + XY R 1 } F R2 = {X Y X Y F + XY R 2 }. (R 1, R 2 ) preserves a dependency f iff f (F R1 F R2 ) +. Jan Chomicki () Relational database design 12 / 16

Decomposition into BCNF Algorithm: decomposition of schema R 1 For some nontrivial nonkey dependency X A in F + : create a relation schema R 1 with the set of attributes XA and FDs F R1. create a relation schema R 2 with the set of attributes R {A} and FDs F R2. 2 Decompose further the resulting schemas which are not in BCNF. This algorithm produces a lossless-join decomposition into BCNF which does not have to preserve dependencies. Jan Chomicki () Relational database design 13 / 16 Decomposition (synthesis) into 3NF Minimal basis F for F set of FDs equivalent to F (F + = (F ) + ), all FDs in F are of the form X A where A is a single attribute, further simplification by removing dependencies or attributes from dependencies in F yields a set of FDs inequivalent to F. Algorithm: 3NF synthesis 1 Create a minimal basis F. 2 Create a relation with attributes XA for every dependency X A F. 3 Create a relation X for some key X of R. 4 Remove redundancies. This algorithm produces a lossless-join decomposition into 3NF which preserves dependencies. Jan Chomicki () Relational database design 14 / 16

Multivalued dependencies (MVDs) Notation Relation schema R(A 1,..., A n ). r is an instance of R Sets of attributes: X, Y, Z,... {A 1,..., A n }. Multivalued dependency syntax: a pair X Y. semantics: r satisfies X Y if for all tuples t 1, t 2 r: if t 1 [X ] = t 2 [X ], then there is a tuple t 3 r such that t 3 [XY ] = t 1 [XY ] and t 3 [Z] = t 2 [Z], where Z = {A 1,..., A n } XY. Implication Defined in the same way as for FDs. Jan Chomicki () Relational database design 15 / 16 Fourth Normal Form (4NF) F is the set of FDs and MVDs associated with a relation schema R = {A 1,..., A n }. 4NF R is in 4NF if for every multivalued dependency X Y entailed by F : Y X or XY = {A 1,..., A n } (trivial MVD), or X contains a key of R. Jan Chomicki () Relational database design 16 / 16