Database Design and Implementation
- Diane Fields
- 5 years ago
Transcription
1 Database Design and Implementation (CS 645): Data provenance
2 Provenance
- provenance, n.: "The fact of coming from some particular source or quarter; origin, derivation" [Oxford English Dictionary]
- Data provenance / lineage [BunemanKhannaTan01] aims to explain how a particular result was derived.
- Data-intensive science: worry about provenance
3 Motivation
- Data integration [WangMadnick90, LeeBressanMadnick98]
- Data warehousing [CuiWidomWiener00]
- Scientific data management [BunemanKhannaTan01]
- Determines trust in results
- Ensures reliability and quality of data
- Repeatability / verifiability
- Avoids duplication of effort
- Understanding the transport of annotations
4 Example of data provenance
A typical question: for a given database query Q, a database D, and a tuple t in the output of Q(D), which parts of D contribute to t?

R:  Emp    Dept
    John   D01
    Susan  D02
    Anna   D04

S:  Did  Mgr
    D01  Mary
    D02  Ken
    D03  Ed

Q = select r.a, r.b, s.c from R r, S s where r.b = s.b

Q(D):  Emp    Dept  Mgr
       John   D01   Mary
       Susan  D02   Ken

The same question can be asked about attribute values, tables, etc.
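To make the question concrete, here is a minimal Python sketch that evaluates the join and records, for each output tuple, which input tuples produced it. It assumes the generic attributes in Q (r.a, r.b, s.c, r.b = s.b) stand for Emp, Dept, Mgr and Dept = Did; the function name join_with_lineage and the (relation, index) tuple identifiers are illustrative choices, not part of the slides.

```python
# Relations from the slide, one dict per tuple.
R = [
    {"Emp": "John", "Dept": "D01"},
    {"Emp": "Susan", "Dept": "D02"},
    {"Emp": "Anna", "Dept": "D04"},
]
S = [
    {"Did": "D01", "Mgr": "Mary"},
    {"Did": "D02", "Mgr": "Ken"},
    {"Did": "D03", "Mgr": "Ed"},
]


def join_with_lineage(r_rows, s_rows):
    """Join on Dept = Did (assumed reading of r.b = s.b).

    Returns a list of (output tuple, lineage) pairs, where the lineage
    names the contributing input tuples as (relation name, row index).
    """
    out = []
    for i, r in enumerate(r_rows):
        for j, s in enumerate(s_rows):
            if r["Dept"] == s["Did"]:
                row = (r["Emp"], r["Dept"], s["Mgr"])
                out.append((row, {("R", i), ("S", j)}))
    return out


result = join_with_lineage(R, S)
for row, lineage in result:
    print(row, "<-", sorted(lineage))
```

Anna and Ed contribute to no output tuple, so they appear in no lineage set.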
5 Two approaches
- Eager, or annotation-based
  - Changes the transformation from Q to Q′ so that it carries extra information
  - Source data is not needed after the transformation
- Lazy, or non-annotation-based
  - Q is unchanged
  - Good when extra storage is an issue
  - Requires recomputation and access to the source
6 Types of provenance
- Why: which DB tuples contribute to the presence of each result tuple?
- How: by what process is each output tuple produced from the DB instance?
- Where: where (from what attribute of what tuple) does each output tuple value come from?
7 Why-provenance example [figure: query projecting a.name, a.phone, with and without DISTINCT]
8 Lineage
- The lineage of an output tuple t is a subset of the input tuples that are relevant to t.
- Example (query with DISTINCT a.name, a.phone): lineage = {t1, t5, t6}
- Problem: not very precise. For example, the lineage above does not say that t5 and t6 need not both exist.
9 Why provenance (query with DISTINCT a.name, a.phone)
- Witness of t: any subset of the database sufficient to reconstruct tuple t in the query result.
- Witness basis: the sets of leaves of the proof trees showing how result tuple t is generated.
- Lineage: {t1, t5, t6}
- Witnesses include: {t1, t5}, {t1, t6}, {t1, t2, t6, t8}
- Witness basis: {{t1, t5}, {t1, t6}}
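10 Why: query rewriting
- [figure: input tuples t1, t2, t3 and output tuple t]
- Two equivalent formulations of a query can give different why-provenance for the same output tuple t: Why(Q, I, t) = {{t1}} for one formulation, Why(Q′, I, t) = {{t1}, {t1, t2}} for the other.
- Minimal witness basis: the minimal witnesses in the witness basis.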
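The step from a witness basis to its minimal witness basis can be sketched in a few lines of Python. The function name minimal_witnesses is hypothetical; witnesses are modelled as sets of tuple identifiers.

```python
def minimal_witnesses(witness_basis):
    """Keep only the witnesses that have no proper subset in the basis."""
    basis = {frozenset(w) for w in witness_basis}
    return {w for w in basis if not any(other < w for other in basis)}


# The larger witness basis from the slide collapses to the smaller one.
print(minimal_witnesses([{"t1"}, {"t1", "t2"}]))
```

Applied to {{t1}, {t1, t2}} this leaves only {t1}, because {t1, t2} contains the smaller witness {t1}.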
11 The view deletion problem
- Setting: D is a database instance and V = Q(D) is a view defined over D.
- Task: find a set of tuples ΔD to remove from D so that a specific tuple t is removed from the view.
- View side-effect problem: minimize the number of side effects in the view.
  - Hard: queries with joins and projection or union
  - PTIME: the rest
- Source side-effect problem: minimize the number of tuples deleted from D.
  - Same dichotomy
- [BunemanKhannaTan, PODS 2002]
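For monotone queries, the source side-effect problem can be read as a hitting-set problem over the witness basis of t: the tuple t disappears from the view exactly when every witness loses at least one tuple. The sketch below is a brute-force illustration under that assumption, not the algorithm from the cited paper; it takes exponential time in general and ignores view side effects. The name smallest_source_deletion is hypothetical.

```python
from itertools import combinations


def smallest_source_deletion(witness_basis):
    """Return a smallest set of source tuples that intersects every witness."""
    witnesses = [frozenset(w) for w in witness_basis]
    candidates = sorted(set().union(*witnesses))
    # Try deletion sets in order of increasing size; the first hit is minimum.
    for size in range(1, len(candidates) + 1):
        for delta in combinations(candidates, size):
            if all(w & set(delta) for w in witnesses):
                return set(delta)
    return set()


# Witness basis {{t1, t5}, {t1, t6}} from the why-provenance slide.
print(smallest_source_deletion([{"t1", "t5"}, {"t1", "t6"}]))
```

For the witness basis {{t1, t5}, {t1, t6}}, deleting t1 alone is enough, whereas deleting t5 would leave the witness {t1, t6} intact.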
12 How provenance
- Identifies the witness tuples and the operations performed on them to produce each result tuple
- Expresses operations using provenance semirings
  - MERGE (+): union or projection
  - JOIN (·): joins
13 Propagating annotations (1): join (on B)

R:  A  B  C
    a  b  c    p

S:  D  B  E
    d  b  e    r

R ⋈ S:  A  B  C  D  E
        a  b  c  d  e    p · r

The annotation p · r means joint use of the data annotated by p and the data annotated by r.
14 Propagating annotations (2): union

R:  A  B  C
    a  b  c    p

S:  A  B  C
    a  b  c    r

R ∪ S:  A  B  C
        a  b  c    p + r

The annotation p + r means alternative use of the data annotated by p and the data annotated by r.
15 Propagating annotations (3): projection

R:  A  B  C
    a  b  c1    p
    a  b  c2    r
    a  b  c3    s

π_AB R:  A  B
         a  b    p + r + s

+ denotes alternative use of data.
16 An example (SPJU)

R:  A  B  C
    a  b  c    p
    d  b  e    r
    f  g  e    s

Q = σ_{C=e} π_AC (π_AB R ⋈ π_BC R ∪ π_AC R ⋈ π_BC R)

    A  C
    a  c
    a  e
    d  c
    d  e
    f  e

For selection, multiply each annotation by 0 or 1.
17-26 Example (continued)
27 Back to example

R:  A  B  C
    a  b  c    p
    d  b  e    r
    f  g  e    s

Q:  A  C
    a  c
    a  e
    d  c
    d  e
    f  e
28 Applying the laws: polynomials

R:  A  B  C
    a  b  c    p
    d  b  e    r
    f  g  e    s

Q:  A  C
    a  e    pr
    d  e    2r² + rs
    f  e    rs + 2s²

Polynomials with coefficients in ℕ and annotation tokens as indeterminates p, r, s capture a very general form of provenance.
29 How to read this provenance

Q:  A  C
    a  e    pr
    d  e    2r² + rs
    f  e    rs + 2s²

- There are 3 ways to derive (d e).
- 2 of the ways use only r, but they use it twice.
- The 3rd uses r once and s once.
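The polynomials on this slide can be reproduced mechanically. The following Python sketch is one possible encoding, not the lecture's implementation: a polynomial is a Counter from monomials (sorted tuples of tokens) to natural-number coefficients, and a relation is a (schema, rows) pair. All function names are illustrative.

```python
from collections import Counter, defaultdict


def token(name):
    """Polynomial consisting of a single annotation token."""
    return Counter({(name,): 1})


def add(p, q):
    """Semiring +: alternative use (union, projection)."""
    return p + q


def mul(p, q):
    """Semiring ·: joint use (join)."""
    out = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return out


def project(rel, attrs):
    schema, rows = rel
    idx = [schema.index(a) for a in attrs]
    out = defaultdict(Counter)
    for row, ann in rows.items():
        key = tuple(row[i] for i in idx)
        out[key] = add(out[key], ann)  # merged rows: annotations are summed
    return (tuple(attrs), dict(out))


def join(r1, r2):
    (s1, rows1), (s2, rows2) = r1, r2
    shared = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    out = {}
    for row1, ann1 in rows1.items():
        for row2, ann2 in rows2.items():
            if all(row1[s1.index(a)] == row2[s2.index(a)] for a in shared):
                key = row1 + tuple(row2[s2.index(a)] for a in extra)
                out[key] = add(out.get(key, Counter()), mul(ann1, ann2))
    return (tuple(s1) + tuple(extra), out)


def union(r1, r2):
    schema, rows1 = r1
    idx = [r2[0].index(a) for a in schema]  # reorder r2's columns to match r1
    out = dict(rows1)
    for row, ann in r2[1].items():
        key = tuple(row[i] for i in idx)
        out[key] = add(out.get(key, Counter()), ann)
    return (schema, out)


def select(rel, pred):
    # Dropping a row corresponds to multiplying its annotation by 0;
    # kept rows are multiplied by 1.
    schema, rows = rel
    return (schema, {r: a for r, a in rows.items() if pred(dict(zip(schema, r)))})


def show(poly):
    terms = []
    for mono, coeff in sorted(poly.items()):
        powers = sorted(Counter(mono).items())
        body = "".join(v if e == 1 else f"{v}^{e}" for v, e in powers)
        terms.append(("" if coeff == 1 else str(coeff)) + body)
    return " + ".join(terms)


R = (("A", "B", "C"), {
    ("a", "b", "c"): token("p"),
    ("d", "b", "e"): token("r"),
    ("f", "g", "e"): token("s"),
})
inner = union(
    join(project(R, ["A", "B"]), project(R, ["B", "C"])),
    join(project(R, ["A", "C"]), project(R, ["B", "C"])),
)
Q = select(project(inner, ["A", "C"]), lambda t: t["C"] == "e")
for row, ann in sorted(Q[1].items()):
    print(row, show(ann))
# ('a', 'e') pr
# ('d', 'e') 2r^2 + rs
# ('f', 'e') rs + 2s^2
```

Because the annotations stay symbolic, the same result can later be specialised to other semirings by substituting values for p, r, and s.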
30 Deletion propagation
Delete (d b e) from R: set r to 0!

R:  A  B  C
    a  b  c    p
    d  b  e    r
    f  g  e    s

Q:  A  C             with r = 0
    a  e    pr         0
    d  e    2r² + rs   0
    f  e    rs + 2s²   2s²

Q after the deletion:  A  C
                       f  e    2s²
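Setting a token to 0 is a simple operation on the polynomials: every monomial that mentions the token vanishes. A minimal Python sketch, assuming polynomials are stored as dicts from monomials (tuples of tokens) to coefficients; the name delete_token is hypothetical.

```python
def delete_token(poly, tok):
    """Set annotation token `tok` to 0: monomials mentioning it vanish."""
    return {mono: c for mono, c in poly.items() if tok not in mono}


# Provenance of Q from the slide.
Q = {
    ("a", "e"): {("p", "r"): 1},               # pr
    ("d", "e"): {("r", "r"): 2, ("r", "s"): 1},  # 2r^2 + rs
    ("f", "e"): {("r", "s"): 1, ("s", "s"): 2},  # rs + 2s^2
}

after = {}
for row, poly in Q.items():
    reduced = delete_token(poly, "r")
    if reduced:  # an empty polynomial is 0, so the tuple leaves the view
        after[row] = reduced

print(after)  # {('f', 'e'): {('s', 's'): 2}}
```

No query re-evaluation is needed: the view is updated from the stored provenance alone.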
31 Some useful commutative semirings
- Set semantics
- Bag semantics
- Probabilistic events
- Access control (clearance levels from Public to Top Secret)
32 Example: access control

R:  A  B  C
    a  b  c    p
    d  b  e    r
    f  g  e    s

q:  A  C
    a  c    2p²
    a  e    pr
    d  c    pr
    d  e    2r² + rs
    f  e    2s² + rs

Evaluate with p = P, r = S, s = T, using min for + and max for ·:

R:  A  B  C
    a  b  c    P
    d  b  e    S
    f  g  e    T

q:  A  C
    a  c    P
    a  e    S
    d  c    S
    d  e    S
    f  e    T

A user with secret clearance sees every tuple except (f e).
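The evaluation on this slide is an instance of evaluating a provenance polynomial in a chosen commutative semiring. The Python sketch below assumes the usual ordering P < C < S < T < 0 for the access-control semiring, with min as +, max as ·, the "nobody" element 0 as additive identity, and P as multiplicative identity; the level C and the element 0 come from that standard definition rather than from the transcript. The function name evaluate is hypothetical.

```python
from functools import reduce
import operator

# Clearance levels encoded as integers; NOBODY plays the role of semiring 0.
P, C, S, T, NOBODY = range(5)


def evaluate(poly, valuation, plus, times, zero, one):
    """Evaluate a polynomial (monomial -> coefficient) in a semiring."""
    total = zero
    for mono, coeff in poly.items():
        term = reduce(times, (valuation[v] for v in mono), one)
        for _ in range(coeff):  # coefficient n means an n-fold sum
            total = plus(total, term)
    return total


q = {
    ("a", "c"): {("p", "p"): 2},
    ("a", "e"): {("p", "r"): 1},
    ("d", "c"): {("p", "r"): 1},
    ("d", "e"): {("r", "r"): 2, ("r", "s"): 1},
    ("f", "e"): {("s", "s"): 2, ("r", "s"): 1},
}

# Access control: p = P, r = S, s = T, with min for + and max for ·.
clearance = {"p": P, "r": S, "s": T}
levels = {row: evaluate(poly, clearance, min, max, NOBODY, P)
          for row, poly in q.items()}
visible_to_secret = [row for row, lvl in levels.items() if lvl <= S]

# Bag semantics: every source tuple has multiplicity 1.
ones = {"p": 1, "r": 1, "s": 1}
counts = {row: evaluate(poly, ones, operator.add, operator.mul, 0, 1)
          for row, poly in q.items()}

print(levels)
print(visible_to_secret)
print(counts)
```

Under bag semantics the same polynomials give multiplicity 3 for (d e), matching the three derivations read off from 2r² + rs.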
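33 Where provenance
- Identifies witness cells
- Important for annotations
- Compare a query and an update over R(A, B):
  - Query: SELECT * FROM R WHERE A <> 5 UNION SELECT A, 7 AS B FROM R WHERE A = 5
  - Update: UPDATE R SET B = 7 WHERE A = 5
- [figure: R(A, B) → ? → result (A, B)]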
34 Color algebra [Geerts, Kementsietsidis, Milano 06]
- Q = SELECT * FROM R WHERE A <> 5 UNION SELECT A, 7 AS B FROM R WHERE A = 5
- [figure: input table (A, B) and output P[Q] (A, B)]
35 Color algebra
- Q = UPDATE R SET B = 7 WHERE A = 5
- [figure: input table (A, B) and output P[Q] (A, B)]
36 Where provenance and semirings
Superscripts are annotations on attributes and cells; the annotation after each row belongs to the tuple.

R (relation annotation u):  A^x  B^y  C^1
                            a^1  b^1  c^1    p

S (relation annotation v):  B^1  C^1
                            b^z  c^1    m

π_AC (π_AB R ⋈ (π_BC R ∪ S)):  A^1  C^1
                               a^1  c^1    u²p²xy² + uvpmxyz

1 is a neutral annotation, used when we don't bother to track data.
37 Different annotations → different tuples

R:  A  B  C
    a  b  c      p
    d  b  e^z    r
    f  g  e^w    s

π_C σ_{C=e} π_AC (π_AB R ⋈ π_BC R):  C
                                     e^z    pr + r²
                                     e^w    s²
38 Wrap up: issues and directions
- Archiving
- Compression
- Generalizations
- Program slicing [Cheney07]
- Negative provenance: Why Not? [SIGMOD09], Artemis [PVLDB09]
- Causality