CSE 344 AUGUST 6 TH LOSS AND VIEWS

Similar documents
Introduction to Data Management CSE 344

CSE 544 Principles of Database Management Systems

Introduction to Data Management CSE 344

CSE 344 MAY 16 TH NORMALIZATION

CSE 344 AUGUST 3 RD NORMALIZATION

Database Design and Implementation

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

CSC 261/461 Database Systems Lecture 13. Spring 2018

Practice and Applications of Data Management CMPSCI 345. Lecture 16: Schema Design and Normalization

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

SCHEMA NORMALIZATION. CS 564- Fall 2015

Introduction to Management CSE 344

CMPT 354: Database System I. Lecture 9. Design Theory

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

CSC 261/461 Database Systems Lecture 11

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

Lossless Joins, Third Normal Form

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

Chapter 3 Design Theory for Relational Databases

Design theory for relational databases

Lectures 6. Lecture 6: Design Theory

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

CSC 261/461 Database Systems Lecture 12. Spring 2018

Schema Refinement and Normal Forms

Relational Design Theory

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

Schema Refinement. Feb 4, 2010

Relational Database Design

Chapter 10. Normalization Ext (from E&N and my editing)

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

DECOMPOSITION & SCHEMA NORMALIZATION

CS322: Database Systems Normalization

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Databases 2012 Normalization

CS54100: Database Systems

Relational Design Theory II. Detecting Anomalies. Normal Forms. Normalization

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Design Theory for Relational Databases

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

INF1383 -Bancos de Dados

L14: Normalization. CS3200 Database design (sp18 s2) 3/1/2018

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Design Theory for Relational Databases

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Schema Refinement & Normalization Theory

Relational Database Design

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

Constraints: Functional Dependencies

Functional Dependencies and Normalization

Schema Refinement and Normal Forms. Chapter 19

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Chapter 19

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

Chapter 3 Design Theory for Relational Databases

Schema Refinement and Normal Forms

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Constraints: Functional Dependencies

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Relational Database Design Theory Part II. Announcements (October 12) Review. CPS 116 Introduction to Database Systems

Relational Design: Characteristics of Well-designed DB

Chapter 11, Relational Database Design Algorithms and Further Dependencies

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

Chapter 7: Relational Database Design

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

Schema Refinement and Normal Forms. Why schema refinement?

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Schema Refinement and Normalization

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Information Systems (Informationssysteme)

Schema Refinement and Normal Forms

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Functional Dependency Theory II. Winter Lecture 21

CSE 132B Database Systems Applications

Relational-Database Design

Database Design and Normalization

Functional Dependency and Algorithmic Decomposition

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Practice and Applications of Data Management CMPSCI 345. Lecture 15: Functional Dependencies

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Lecture #7 (Relational Design Theory, cont d.)

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

Comp 5311 Database Management Systems. 5. Functional Dependencies Exercises

Schema Refinement: Other Dependencies and Higher Normal Forms

Desirable properties of decompositions 1. Decomposition of relational schemes. Desirable properties of decompositions 3

Chapter 8: Relational Database Design

Normaliza)on and Func)onal Dependencies

Transcription:

CSE 344 AUGUST 6 TH LOSS AND VIEWS

ADMINISTRIVIA WQ6 due tonight HW7 due Wednesday

DATABASE DESIGN PROCESS Conceptual Model: name product makes company price name address Relational Model: Tables + constraints And also functional dep. Normalization: Eliminates anomalies Conceptual Schema Physical storage details Physical Schema

ELIMINATING ANOMALIES Main idea: X à A is OK if X is a (super)key X à A is bad otherwise Need to decompose the table, but how? Boyce-Codd Normal Form

BOYCE-CODD NORMAL FORM There are no bad FDs: Definition. A relation R is in BCNF if: Whenever Xà B is a non-trivial dependency, then X is a superkey. Equivalently: Definition. A relation R is in BCNF if: " X, either X + = X or X + = [all attributes]

BCNF DECOMPOSITION ALGORITHM Normalize(R) find X s.t.: X X + and X + [all attributes] if (not found) then R is in BCNF let Y = X + - X; Z = [all attributes] - X + decompose R into R1(X Y) and R2(X Z) Normalize(R1); Normalize(R2); Y X Z X +

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D)

R(A,B,C,D) EXAMPLE: BCNF Recall: find X s.t. X X + [all-attrs] R(A,B,C,D) A à B B à C

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) R 2 (A,D)

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) R 2 (A,D)

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) B + = BC ABC R 2 (A,D)

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) B + = BC ABC R 2 (A,D) R 11 (B,C) R 12 (A,B) What happens if in R we first pick B +? Or AB +?]

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) B + = BC ABC R 2 (A,D) R 11 (B,C) R 12 (A,B) What are the keys?

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) B + = BC ABC R 2 (A,D) R 11 (B,C) R 12 (A,B) What are the keys?

R(A,B,C,D) EXAMPLE: BCNF A à B B à C R(A,B,C,D) A + = ABC ABCD R 1 (A,B,C) B + = BC ABC R 2 (A,D) R 11 (B,C) R 12 (A,B) What are the keys?

DECOMPOSITIONS IN GENERAL R(A 1,..., A n, B 1,..., B m, C 1,..., C p ) S 1 (A 1,..., A n, B 1,..., B m ) S 2 (A 1,..., A n, C 1,..., C p ) S 1 = projection of R on A 1,..., A n, B 1,..., B m S 2 = projection of R on A 1,..., A n, C 1,..., C p and R is a subset of S 1 S 2

LOSSLESS DECOMPOSITION Name Price Category Gizmo 19.99 Gadget OneClick 24.99 Camera Gizmo 19.99 Camera Name Price Name Category Gizmo 19.99 OneClick 24.99 Gizmo 19.99 Gizmo OneClick Gizmo Gadget Camera Camera

LOSSY DECOMPOSITION What is lossy here? Name Price Category Gizmo 19.99 Gadget OneClick 24.99 Camera Gizmo 19.99 Camera Name Gizmo OneClick Gizmo Category Gadget Camera Camera Price Category 19.99 Gadget 24.99 Camera 19.99 Camera

LOSSY DECOMPOSITION Name Price Category Gizmo 19.99 Gadget OneClick 24.99 Camera Gizmo 19.99 Camera Name Gizmo OneClick Gizmo Category Gadget Camera Camera Price Category 19.99 Gadget 24.99 Camera 19.99 Camera

DECOMPOSITION IN GENERAL R(A 1,..., A n, B 1,..., B m, C 1,..., C p ) S 1 (A 1,..., A n, B 1,..., B m ) S 2 (A 1,..., A n, C 1,..., C p ) Let: S 1 = projection of R on A 1,..., A n, B 1,..., B m S 2 = projection of R on A 1,..., A n, C 1,..., C p The decomposition is called lossless if R = S 1 S 2 Fact: If A 1,..., A n à B 1,..., B m then the decomposition is lossless It follows that every BCNF decomposition is lossless

IS THIS LOSSLESS? If we decompose R into Π S1 (R), Π S2 (R), Π S3 (R), Is it true that S1 S2 S3 = R? That is true if we can show that: R S1 S2 S3 always holds (why?) R S1 S2 S3 neet to check

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R?

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R)

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R) a b2 c d2 (a,c) S2 = Π BD (R)

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R) a b2 c d2 (a,c) S2 = Π BD (R) a3 b c d (b,c,d) S3 = Π BCD (R)

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: Chase them (apply FDs): AàB A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R) a b2 c d2 (a,c) S2 = Π BD (R) a3 b c d (b,c,d) S3 = Π BCD (R) A B C D a b1 c1 d a b1 c d2 a3 b c d

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: Chase them (apply FDs): AàB BàC A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R) a b2 c d2 (a,c) S2 = Π BD (R) a3 b c d (b,c,d) S3 = Π BCD (R) A B C D a b1 c1 d a b1 c d2 a3 b c d A B C D a b1 c d a b1 c d2 a3 b c d

Example from textbook Ch. 3.4.2 THE CHASE TEST FOR LOSSLESS JOIN R(A,B,C,D) = S1(A,D) S2(A,C) S3(B,C,D) R satisfies: AàB, BàC, CDàA S1 = Π AD (R), S2 = Π AC (R), S3 = Π BCD (R), hence R S1 S2 S3 Need to check: R S1 S2 S3 Suppose (a,b,c,d) S1 S2 S3 Is it also in R? R must contain the following tuples: Chase them (apply FDs): A B C D Why? a b1 c1 d (a,d) S1 = Π AD (R) a b2 c d2 (a,c) S2 = Π BD (R) a3 b c d (b,c,d) S3 = Π BCD (R) AàB BàC CDàA A B C D a b1 c1 d a b1 c d2 A B C D a b1 c d a b1 c d2 A B C D a b1 c d a b1 c d2 Hence R contains (a,b,c,d) a3 b c d a3 b c d a b c d

SCHEMA REFINEMENTS = NORMAL FORMS 1st Normal Form = all tables are flat 2nd Normal Form = no FD with non-prime attributes Obsolete Prime attributes: attributes part of a key Boyce Codd Normal Form = no bad FDs Are there problems with BCNF?

DEPENDENCY PRESERVATION Bookings(title,theatre,city) FD: theatre -> city title,city -> theatre What are the keys?

DEPENDENCY PRESERVATION Bookings(title,theatre,city) FD: theatre -> city title,city -> theatre What are the keys? None of the single attributes {title,city},{theatre,title} BCNF?

DEPENDENCY PRESERVATION Bookings(title,theatre,city) FD: theatre -> city title,city -> theatre What are the keys? None of the single attributes {title,city},{theatre,title} BCNF? No, {theatre} is neither a trivial dependency nor a superkey Decompose?

DEPENDENCY PRESERVATION Bookings(title,theatre,city) FD: theatre -> city title,city -> theatre What are the keys? None of the single attributes {title,city},{theatre,title} BCNF? No, {theatre} is neither a trivial dependency nor a superkey Decompose? R1(theatre,city) R2(theatre,title) What s wrong? (think of FDs)

DEPENDENCY PRESERVATION Bookings(title,theatre,city) FD: theatre -> city title,city -> theatre What are the keys? None of the single attributes {title,city},{theatre,title} BCNF? No, {theatre} is neither a trivial dependency nor a superkey Decompose? R1(theatre,city) R2(theatre,title) What s wrong? (think of FDs) We can t guarantee title,city -> theatre with simple constraints (now need to join)

NORMAL FORMS 3 rd Normal form Allows tables with BCNF violations if a decomposition separates an FD Can result in redundancy 4 th Normal form Multi-valued dependencies Incorporate info about attributes in neither A nor B All MVDs are also FDs Apply BCNF alg with MVD and FD

NORMAL FORMS 5 th Normal Form Join dependency Lossless/exact joining Join independent Tables 6 th Normal Form Only allow trivial join dependencies Only need key/tuple constraints to represent all constraints

KEY POINTS Produce and verify FDs, superkeys, keys Be able to decompose a table into BCNF Flaws of 1NF & BCNF Identify loss and be able to apply the chase test

IMPLEMENTATION We learned about how to normalize tables to avoid anomalies How can we implement normalization in SQL if we can t modify existing tables? This might be due to legacy applications that rely on previous schemas to run Can recover original tables via join on demand and we want those available to queries

VIEWS A view in SQL = A table computed from other tables, s.t., whenever the base tables are updated, the view is updated too More generally: A view is derived data that keeps track of changes in the original data Compare: A function computes a value from other values, but does not keep track of changes to the inputs

Purchase(customer, product, store) Product(pname, price) StorePrice(store, price) A SIMPLE VIEW Create a view that returns for each store the prices of products purchased at that store CREATE VIEW StorePrice AS SELECT DISTINCT x.store, y.price FROM Purchase x, Product y WHERE x.product = y.pname This is like a new table StorePrice(store,price)

Purchase(customer, product, store) Product(pname, price) StorePrice(store, price) WE USE A VIEW LIKE ANY TABLE A "high end" store is a store that sell some products over 1000. For each customer, return all the high end stores that they visit. SELECT DISTINCT u.customer, u.store FROM Purchase u, StorePrice v WHERE u.store = v.store AND v.price > 1000

TYPES OF VIEWS Virtual views Computed only on-demand slow at runtime Always up to date Materialized views Pre-computed offline fast at runtime May have stale data (must recompute or update) The key components of physical tuning of databases are the selection of materialized views and indexes

MATERIALIZED VIEWS CREATE MATERIALIZED VIEW View_name BUILD [IMMEDIATE/DEFERRED] REFRESH [FAST/COMPLETE/FORCE] ON [COMMIT/DEMAND] AS Sql_query Immediate v deferred Build immediately, or after a query Fast v. Complete v. Force Level of refresh log based v. complete rebuild Commit v. Demand Commit: after data is added Demand: after conditions are set (time is common)

CONCLUSION Poor schemas can lead to bugs and inefficiency E/R diagrams are means to structurally visualize and design relational schemas Normalization is a principled way of converting schemas into a form that avoid such problems BCNF is one of the most widely used normalized form in practice