Relational operations

Size: px
Start display at page:

Download "Relational operations"

Transcription

1 Architecture

2 Relational operations We will consider how to implement and define physical operators for: Projection Selection Grouping Set operators Join Then we will discuss how the optimizer use them to generate physical query plans

3 Java Relational System (JRS):

4 Selectivity of selection The DBMS Catalog stores information about database tables and indexes. (1/10) (1/3) (1/3) (1/4)

5 Selectivity of selection (1/10)

6 Hystograms 105 records and A with 15 possible values in the range 17 to 31 Uniform distribution NonUniform distribution

7 Selectivity of selection with hystograms An Equi-Height histogram is used as an approximation of the actual distribution The active domain of A is divided into k intervals, containing a number of records of about N rec /k For each interval h i are known: min(h i ), max(h i ), N key (h i ), N rec (h i )

8 Equi-Height histograms

9 Equi-Height histograms s f (V = 24) = N rec (V=24)/N rec s f real = 6/105 = 0,057 s f uniform = 1/15 = 7/105= 0,066 s f histeqh = (26/3)/105 = 0,082

10 Physical operators for tables and sort Operators for R : TableScan (R) SortScan (R, {A i }) IndexScan (R, Idx) IndexSequentialScan (R, Idx) Operator to sort ( τ {Ai} ): Sort (O, {Ai})

11 Physical operators for π b {A i } Project (O, {A i }): to project the records of O without duplicates elimination C = C(O) IndexOnlyScan(R, Idx, {A i }): Idx an index on the attributes to project (or that contains them as prefix) C = N leaf (Idx) Is the result equivalent to a Project or to a Project+Distinct?

12 Physical ops for duplicate elimination Distinct (O): to eliminate duplicates from sorted records of O C = C(O) Actually, we only need them to be grouped: r i =r j and i<l<j => r i =r l =r j HashDistinct(O): to eliminate duplicates from records of O; C =?

13 Hashdistinct partitioning phase: duplicate elimination phase (h r h p ): C = C(O) + 2 N pag (O)

14 Sort Hash

15 Result size: Distinct(O) If A is the only O attribute Q = N key (A) If {A 1, A 2,..., A n } are the attributes Q = min( O /2, i N key (A i ) ) Why not just i N key (A i )?

16 Conclusion The sort method often preferred because produces a sorted result and because sort is heavily optimized DBMS: Informix uses Hash DB2, Oracle, Sybase ASE use Sort SQL Server e Sybase ASIQ use both methods

17 Simple selection With no index, and data unsorted: Relation scan with cost N pag (R). With an index on selection attribute: Use index to retrieve RIDs, then retrieve data records: C I + C D. Cost depends on qualifying tuples (size of R s f ), and clustering General selection conditions SELECT * FROM WHERE Students City = PI

18 Physical operators for selection Filter (O, ψ) C = C(O) E rec = s f (ψ) N rec (O) IndexFilter(R, Idx, ψ) = RidIndexFilter(Idx, ψ) + TableAccess(O, R) C = C I +C D E rec = s f (ψ) N rec (R)

19 Physical operators for selection IndexSequentialFilter(R, Idx, ψ) C = s f (ψ) x N leaf (Idx) IndexOnlyFilter(R, Idx, {A i }, ψ) C = s f (ψ) x N leaf (Idx) AndIndexFilter(R, {Idx i, ψ i }) C I = Σ i C I (Idx i ) C D = (E rec,n pag (R)) E rec = s f (ψ) N rec (R) OrIndexFilter(R, {Idx i, ψ i })

20 Conjunctive selection in DBMSs Informix, DB2 use intersection of RID sets Oracle, Sybase ASIQ use bitmaps Oracle use also HashJoin with index (later) on the attribute RID Sybase ASE uses one index only SQL Server use index join (later)

21 Exercises SELECT A, B Idx an index on A FROM R WHERE (A BETWEEN 50 AND 100) AND B > 20; SELECT A, B One index on A, B FROM R WHERE (A BETWEEN 50 AND 100) AND B > 20 ORDER BY A; SELECT DISTINCT A, B Two indexes on A and on B FROM R WHERE (A BETWEEN 50 AND 100) AND B > 20 ORDER BY B;

22 Physical operators for grouping As for duplicate elimination: Sorting Hashing Using an index on the grouping attributes (Duplicate elimination is grouping on all attributes with no aggregation)

23 Physical operators for ( {A i } γ {f j } ) GroupBy (O, {A i }, {f j }): to group the records of O sorted on the {A i }, computing functions in {f j }. {f j } contains the aggregation functions present in the SELECT and HAVING clauses. The operator returns records with attributes {A i } {f j } The records of O must be sorted on the {A i } Any permutation of {A i } is ok (Actually, they only need to be grouped on the {A i }) HashGroupBy (O, {A i }, {f j }) C? Cardinality?

24

25 Computing aggregations Each aggregation functions is an object o with methods start(), next(v), end() With the first record of a group: o.start() For each record of the group o.next(v) is invoked o.end() computes the final aggregate value.

26 Example p b A, T SELECT A, SUM(C) AS T FROM R WHERE A BETWEEN 50 AND 100 GROUP BY A HAVING COUNT(*) > 1; Project ({A, T}) Filter (N > 1) s N > 1 A g SUM(C) AS T, COUNT(*) AS N s... R Logical Tree GroupBy ({A}, {SUM(C) AS T, COUNT(*) AS N}) Sort ({A}) Filter (...) TableScan (R) Physical Tree... = A BETWEEN 50 AND 100

27 Physical ops for join: nested loops foreach r in O E do foreach s in O i do if r.r1 = s.s1 then add <r, s> to result C = C(O E ) + E rec (O E ) C(O I ) E rec = s f (C j ) E rec (O E ) E rec (O I ) Inequality join conditions?

28 Cost of nested loops foreach r in R do foreach s in S do if r.r1 = s.s1 then add <r, s> to result With R external: C = Npag(R) + Nrec(R) Npag(S) Npag(R) Nrec(R) Npag(R) Npag(S) With S external: C = Npag(S) + Nrec(S) Npag(R) Npag(S) Nrec(S) Npag(R) Npag(S) Hence

29 Page nested loops R Pk B S Fk C Pk B Fk C 10 a 20 b 30 c 40 d 10 f Join 20 g = 30 g 10 k 10 a 10 f 20 b 20 g 10 a 10 k 20 b 20 k 30 c 30 g 40 d 40 f 40 f 20 k C = Npag(R) + Npag(R) * Npag(S) The smallest relation is used as external

30 Block nested loops join C BNL = N pag (R) + Τ N pag (R) B N pag (S)

31 Index nested loop Hyp: There is an index on the join column of the internal relation (S) foreach r in R do foreach s in IndexFilter(S, I, s.s1=r.r1) do add <r, s> to result R(r1, r2) S(s1, s2) r1=s1

32 Index nested loop foreach r in R do foreach s in IndexFilter(S,I,s1 =r.r1) do add <r, s> to result Cost for R join S: C = N pag (R) + N rec (R) CaWithIdx(S) General Case: C = C(O E ) + E rec (O E ) (C I +C D )

33 Merge join Hyp: R and S are sorted on the join attribute, a key of the external relation C = C(O E ) + C(O I ) Inequality join conditions?... ri sj

34 Hash join Partitioning (R and S) Matching

35 Hash join: cost Assume N pag (R)/B < B and uniformity C = C(O E ) + C(O I ) + 2( N pag (O E ) + N pag (O I ) ) C = (log B (N pag (O E )) 2-2)*( N pag (O E ) + N pag (O I ) ) C(R join S) = 3 ( Npag(R) + Npag(S) ) What if N pag (R) < B?

36 Complex joins Join with a conjunctive condition Join with a disjunctive condition

37 Physical operators for join NestedLoop (O E,O I, ψ J ) PageNestedLoop (O E,O I, ψ J ) IndexNestedLoop (O E,O I, ψ J ) MergeJoin (O E,O I, ψ J ) HashJoin (O E,O I, ψ J )

38 Example SELECT ar, Sum(cR) FROM R, S WHERE R.PkR=S.FkR AND cs=1000 GROUP BY ar ORDER BY ar C GroupBy = C Sort = C IndexNestedLoop + 2 N pag (Project) N Pag (Project) = (E Rec (IndNestedLoop) (L ar + L cr ))/D pag C INL = C IndexFilterOnR + E rec (IndexFilterOnR) C(Filter) E rec (INL) = s f PkR = FkR E rec (IndexFilterOnR) E rec (Filter)

39 Example C(IndFilOnR) = C I + C D C I = N leaf (IcR)/N key (IcR) C D = (N rec (R)/ N key (IcR), N pag (R)) E rec (IndFilOnR) = N rec (R)/ N key (IcR) C(IndFilOnS) = C I + C D C I = N leaf (IFkR)/N key (IFkR) C D = (N rec (S)/ N key (IFkR), N pag (S)) E rec (Filter) = s f (cs=1000) N rec (S)

40 Set operators Union(O E,O I ), Except(O E,O I ), Intersect(O E,O I ) Operand sorted and without duplicates. HashUnion(O E,O I ), HashExcept(O E,O I ), HasIntersect(O E,O I ) Using hash. UnionAll(O E,O I ) trivial ExceptAll(O E,O I ), IntersectAll(O E,O I ) Not always supported

41 Example SELECT Name FROM Employees UNION SELECT Name FROM Professors ; Distinct Sort ({Name}) Union Distinct Sort ({Name}) Project ({Name}) Project ({Name}) TableScan (Employees) TableScan (Professors)

42 Summary Very few basic operators, hence the implementation of these operators can be carefully tuned. No universally superior technique for operators with many implementations We must consider available alternatives and select the best one: Query optimization... the next topic

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam. Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this

More information

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1)

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1) Correlated subqueries Query Optimization CPS Advanced Database Systems SELECT CID FROM Course Executing correlated subquery is expensive The subquery is evaluated once for every CPS course Decorrelate!

More information

You are here! Query Processor. Recovery. Discussed here: DBMS. Task 3 is often called algebraic (or re-write) query optimization, while

You are here! Query Processor. Recovery. Discussed here: DBMS. Task 3 is often called algebraic (or re-write) query optimization, while Module 10: Query Optimization Module Outline 10.1 Outline of Query Optimization 10.2 Motivating Example 10.3 Equivalences in the relational algebra 10.4 Heuristic optimization 10.5 Explosion of search

More information

Module 10: Query Optimization

Module 10: Query Optimization Module 10: Query Optimization Module Outline 10.1 Outline of Query Optimization 10.2 Motivating Example 10.3 Equivalences in the relational algebra 10.4 Heuristic optimization 10.5 Explosion of search

More information

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing Hector Garcia-Molina Zoltan Gyongyi CS 347 Notes 03 1 Query Processing! Decomposition! Localization! Optimization CS 347

More information

Relational Algebra on Bags. Why Bags? Operations on Bags. Example: Bag Selection. σ A+B < 5 (R) = A B

Relational Algebra on Bags. Why Bags? Operations on Bags. Example: Bag Selection. σ A+B < 5 (R) = A B Relational Algebra on Bags Why Bags? 13 14 A bag (or multiset ) is like a set, but an element may appear more than once. Example: {1,2,1,3} is a bag. Example: {1,2,3} is also a bag that happens to be a

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing Query Processing Decomposition Localization Optimization CS 347 Notes 3 2 Decomposition Same as in centralized system

More information

Databases 2011 The Relational Algebra

Databases 2011 The Relational Algebra Databases 2011 Christian S. Jensen Computer Science, Aarhus University What is an Algebra? An algebra consists of values operators rules Closure: operations yield values Examples integers with +,, sets

More information

Database Systems SQL. A.R. Hurson 323 CS Building

Database Systems SQL. A.R. Hurson 323 CS Building SQL A.R. Hurson 323 CS Building Structured Query Language (SQL) The SQL language has the following features as well: Embedded and Dynamic facilities to allow SQL code to be called from a host language

More information

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You

More information

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins 11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,

More information

SPATIAL INDEXING. Vaibhav Bajpai

SPATIAL INDEXING. Vaibhav Bajpai SPATIAL INDEXING Vaibhav Bajpai Contents Overview Problem with B+ Trees in Spatial Domain Requirements from a Spatial Indexing Structure Approaches SQL/MM Standard Current Issues Overview What is a Spatial

More information

6.830 Lecture 11. Recap 10/15/2018

6.830 Lecture 11. Recap 10/15/2018 6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking

More information

Factorized Relational Databases Olteanu and Závodný, University of Oxford

Factorized Relational Databases   Olteanu and Závodný, University of Oxford November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust

More information

Exam 1. March 12th, CS525 - Midterm Exam Solutions

Exam 1. March 12th, CS525 - Midterm Exam Solutions Name CWID Exam 1 March 12th, 2014 CS525 - Midterm Exam s Please leave this empty! 1 2 3 4 5 Sum Things that you are not allowed to use Personal notes Textbook Printed lecture notes Phone The exam is 90

More information

Relational-Database Design

Relational-Database Design C H A P T E R 7 Relational-Database Design Exercises 7.2 Answer: A decomposition {R 1, R 2 } is a lossless-join decomposition if R 1 R 2 R 1 or R 1 R 2 R 2. Let R 1 =(A, B, C), R 2 =(A, D, E), and R 1

More information

Factorised Representations of Query Results: Size Bounds and Readability

Factorised Representations of Query Results: Size Bounds and Readability Factorised Representations of Query Results: Size Bounds and Readability Dan Olteanu and Jakub Závodný Department of Computer Science University of Oxford {dan.olteanu,jakub.zavodny}@cs.ox.ac.uk ABSTRACT

More information

Tuple Relational Calculus

Tuple Relational Calculus Tuple Relational Calculus Université de Mons (UMONS) May 14, 2018 Motivation S[S#, SNAME, STATUS, CITY] P[P#, PNAME, COLOR, WEIGHT, CITY] SP[S#, P#, QTY)] Get all pairs of city names such that a supplier

More information

FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE

FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE INRIA and ENS Cachan e-mail address: kazana@lsv.ens-cachan.fr WOJCIECH KAZANA AND LUC SEGOUFIN INRIA and ENS Cachan e-mail address: see http://pages.saclay.inria.fr/luc.segoufin/

More information

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #3: SQL---Part 1

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #3: SQL---Part 1 CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #3: SQL---Part 1 Announcements---Project Goal: design a database system applica=on with a web front-end Project Assignment

More information

Binary Decision Diagrams

Binary Decision Diagrams Binary Decision Diagrams Literature Some pointers: H.R. Andersen, An Introduction to Binary Decision Diagrams, Lecture notes, Department of Information Technology, IT University of Copenhagen Tools: URL:

More information

α-acyclic Joins Jef Wijsen May 4, 2017

α-acyclic Joins Jef Wijsen May 4, 2017 α-acyclic Joins Jef Wijsen May 4, 2017 1 Motivation Joins in a Distributed Environment Assume the following relations. 1 M[NN, Field of Study, Year] stores data about students of UMONS. For example, (19950423158,

More information

Spatial Database. Ahmad Alhilal, Dimitris Tsaras

Spatial Database. Ahmad Alhilal, Dimitris Tsaras Spatial Database Ahmad Alhilal, Dimitris Tsaras Content What is Spatial DB Modeling Spatial DB Spatial Queries The R-tree Range Query NN Query Aggregation Query RNN Query NN Queries with Validity Information

More information

Relations. Relations of Sets N-ary Relations Relational Databases Binary Relation Properties Equivalence Relations. Reading (Epp s textbook)

Relations. Relations of Sets N-ary Relations Relational Databases Binary Relation Properties Equivalence Relations. Reading (Epp s textbook) Relations Relations of Sets N-ary Relations Relational Databases Binary Relation Properties Equivalence Relations Reading (Epp s textbook) 8.-8.3. Cartesian Products The symbol (a, b) denotes the ordered

More information

Configuring Spatial Grids for Efficient Main Memory Joins

Configuring Spatial Grids for Efficient Main Memory Joins Configuring Spatial Grids for Efficient Main Memory Joins Farhan Tauheed, Thomas Heinis, and Anastasia Ailamaki École Polytechnique Fédérale de Lausanne (EPFL), Imperial College London Abstract. The performance

More information

Multi-join Query Evaluation on Big Data Lecture 2

Multi-join Query Evaluation on Big Data Lecture 2 Multi-join Query Evaluation on Big Data Lecture 2 Dan Suciu March, 2015 Dan Suciu Multi-Joins Lecture 2 March, 2015 1 / 34 Multi-join Query Evaluation Outline Part 1 Optimal Sequential Algorithms. Thursday

More information

Why duplicate detection?

Why duplicate detection? Near-Duplicates Detection Naama Kraus Slides are based on Introduction to Information Retrieval Book by Manning, Raghavan and Schütze Some slides are courtesy of Kira Radinsky Why duplicate detection?

More information

Problem Set 1. CSE 373 Spring Out: February 9, 2016

Problem Set 1. CSE 373 Spring Out: February 9, 2016 Problem Set 1 CSE 373 Spring 2016 Out: February 9, 2016 1 Big-O Notation Prove each of the following using the definition of big-o notation (find constants c and n 0 such that f(n) c g(n) for n > n o.

More information

Efficient query evaluation

Efficient query evaluation Efficient query evaluation Maria Luisa Sapino Set of values E.g. select * from EMPLOYEES where SALARY = 1500; Result of a query Sorted list E.g. select * from CAR-IMAGE where color = red ; 2 Queries as

More information

Logic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research Almaden. Lecture 4 Part 1

Logic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research Almaden. Lecture 4 Part 1 Logic and Databases Phokion G. Kolaitis UC Santa Cruz & IBM Research Almaden Lecture 4 Part 1 1 Thematic Roadmap Logic and Database Query Languages Relational Algebra and Relational Calculus Conjunctive

More information

CS 512, Spring 2017, Handout 10 Propositional Logic: Conjunctive Normal Forms, Disjunctive Normal Forms, Horn Formulas, and other special forms

CS 512, Spring 2017, Handout 10 Propositional Logic: Conjunctive Normal Forms, Disjunctive Normal Forms, Horn Formulas, and other special forms CS 512, Spring 2017, Handout 10 Propositional Logic: Conjunctive Normal Forms, Disjunctive Normal Forms, Horn Formulas, and other special forms Assaf Kfoury 5 February 2017 Assaf Kfoury, CS 512, Spring

More information

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status) So far we have seen: RECAP How to use functional dependencies to guide the design of relations How to modify/decompose relations to achieve 1NF, 2NF and 3NF relations But How do we make sure the decompositions

More information

CS570 Introduction to Data Mining

CS570 Introduction to Data Mining CS570 Introduction to Data Mining Department of Mathematics and Computer Science Li Xiong Data Exploration and Data Preprocessing Data and Attributes Data exploration Data pre-processing Data cleaning

More information

Chap 2: Classical models for information retrieval

Chap 2: Classical models for information retrieval Chap 2: Classical models for information retrieval Jean-Pierre Chevallet & Philippe Mulhem LIG-MRIM Sept 2016 Jean-Pierre Chevallet & Philippe Mulhem Models of IR 1 / 81 Outline Basic IR Models 1 Basic

More information

Comp487/587 - Boolean Formulas

Comp487/587 - Boolean Formulas Comp487/587 - Boolean Formulas 1 Logic and SAT 1.1 What is a Boolean Formula Logic is a way through which we can analyze and reason about simple or complicated events. In particular, we are interested

More information

Quantum Searching. Robert-Jan Slager and Thomas Beuman. 24 november 2009

Quantum Searching. Robert-Jan Slager and Thomas Beuman. 24 november 2009 Quantum Searching Robert-Jan Slager and Thomas Beuman 24 november 2009 1 Introduction Quantum computers promise a significant speed-up over classical computers, since calculations can be done simultaneously.

More information

Dictionary: an abstract data type

Dictionary: an abstract data type 2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees

More information

CS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce

CS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 11: MapReduce Motivation Distribution makes simple computations complex Communication Load balancing Fault tolerance Not all applications

More information

Dictionary: an abstract data type

Dictionary: an abstract data type 2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees

More information

QSQL: Incorporating Logic-based Retrieval Conditions into SQL

QSQL: Incorporating Logic-based Retrieval Conditions into SQL QSQL: Incorporating Logic-based Retrieval Conditions into SQL Sebastian Lehrack and Ingo Schmitt Brandenburg University of Technology Cottbus Institute of Computer Science Chair of Database and Information

More information

Oracle Spatial: Essentials

Oracle Spatial: Essentials Oracle University Contact Us: 1.800.529.0165 Oracle Spatial: Essentials Duration: 5 Days What you will learn The course extensively covers the concepts and usage of the native data types, functions and

More information

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago COSC 430 Advanced Database Topics Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago Learning objectives and references You should be able to: define the elements of the relational

More information

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and 7 RC Simulates RA. We now show that DRC (and hence TRC) is at least as expressive as RA. That is, given an RA expression E that mentions at most C, there is an equivalent DRC expression E that mentions

More information

Census Mapping with ArcGIS

Census Mapping with ArcGIS Census Mapping with ArcGIS Jin Lee, GIS manager at the Lewis Mumford Center and Center for Social and Demographic Analysis Email: jwlee@albany.edu Phone: 442-5402 Quick summary of Day1 http://csda.albany.edu/events-news/gisworkshop_outline_fall_3-1.pdf

More information

Similarity Join Size Estimation using Locality Sensitive Hashing

Similarity Join Size Estimation using Locality Sensitive Hashing Similarity Join Size Estimation using Locality Sensitive Hashing Hongrae Lee, Google Inc Raymond Ng, University of British Columbia Kyuseok Shim, Seoul National University Highly Similar, but not Identical,

More information

Reasoning with Deterministic and Probabilistic graphical models Class 2: Inference in Constraint Networks Rina Dechter

Reasoning with Deterministic and Probabilistic graphical models Class 2: Inference in Constraint Networks Rina Dechter Reasoning with Deterministic and Probabilistic graphical models Class 2: Inference in Constraint Networks Rina Dechter Road Map n Graphical models n Constraint networks Model n Inference n Search n Probabilistic

More information

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation Dichotomy for Non-Repeating Queries with Negation Queries with Negation in in Probabilistic Databases Robert Dan Olteanu Fink and Dan Olteanu Joint work with Robert Fink Uncertainty in Computation Simons

More information

The Query Containment Problem: Set Semantics vs. Bag Semantics. Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden

The Query Containment Problem: Set Semantics vs. Bag Semantics. Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden The Query Containment Problem: Set Semantics vs. Bag Semantics Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden PROBLEMS Problems worthy of attack prove their worth by hitting

More information

Sub-Queries in SQL SQL. 3 Types of Sub-Queries. 3 Types of Sub-Queries. Scalar Sub-Query Example. Scalar Sub-Query Example

Sub-Queries in SQL SQL. 3 Types of Sub-Queries. 3 Types of Sub-Queries. Scalar Sub-Query Example. Scalar Sub-Query Example SQL Sub-Queries in SQL Peter Y. Wu Department of Computer and Information Systems Robert Morris University Data Definition Language Create table for schema and constraints Drop table Data Manipulation

More information

Skyline Snippets. Markus Endres and Werner Kießling

Skyline Snippets. Markus Endres and Werner Kießling Skyline Snippets Markus Endres and Werner Kießling Outline 1. Skyline and Preference Queries 2. Skyline Snippets 3. Performance Benchmarks 4. Summary and Outlook 2 1. Skyline Queries 3 Skyline Queries

More information

Multi-Dimensional Top-k Dominating Queries

Multi-Dimensional Top-k Dominating Queries Noname manuscript No. (will be inserted by the editor) Multi-Dimensional Top-k Dominating Queries Man Lung Yiu Nikos Mamoulis Abstract The top-k dominating query returns k data objects which dominate the

More information

Relational Algebra & Calculus

Relational Algebra & Calculus Relational Algebra & Calculus Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Outline v Conceptual Design: ER model v Logical Design: ER to relational model v Querying and

More information

Geodatabase Management Pathway

Geodatabase Management Pathway Geodatabase Management Pathway Table of Contents ArcGIS Desktop II: Tools and Functionality 3 ArcGIS Desktop III: GIS Workflows and Analysis 6 Building Geodatabases 8 Data Management in the Multiuser Geodatabase

More information

CSE 562 Database Systems

CSE 562 Database Systems Outline Query Optimization CSE 562 Database Systems Query Processing: Algebraic Optimization Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall

More information

Exercises 1 - Solutions

Exercises 1 - Solutions Exercises 1 - Solutions SAV 2013 1 PL validity For each of the following propositional logic formulae determine whether it is valid or not. If it is valid prove it, otherwise give a counterexample. Note

More information

CS145 Midterm Examination

CS145 Midterm Examination S145 Midterm Examination Please read all instructions (including these) carefully. The exam is open book and open notes; any written materials may be used. There are four parts on the exam, with a varying

More information

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS INTRODUCTION TO RELATIONAL DATABASE SYSTEMS DATENBANKSYSTEME 1 (INF 3131) Torsten Grust Universität Tübingen Winter 2017/18 1 THE RELATIONAL ALGEBRA The Relational Algebra (RA) is a query language for

More information

Relational Algebra and Calculus

Relational Algebra and Calculus Topics Relational Algebra and Calculus Linda Wu Formal query languages Preliminaries Relational algebra Relational calculus Expressive power of algebra and calculus (CMPT 354 2004-2) Chapter 4 CMPT 354

More information

Environment (Parallelizing Query Optimization)

Environment (Parallelizing Query Optimization) Advanced d Query Optimization i i Techniques in a Parallel Computing Environment (Parallelizing Query Optimization) Wook-Shin Han*, Wooseong Kwak, Jinsoo Lee Guy M. Lohman, Volker Markl Kyungpook National

More information

Database Design and Normalization

Database Design and Normalization Database Design and Normalization Chapter 11 (Week 12) EE562 Slides and Modified Slides from Database Management Systems, R. Ramakrishnan 1 1NF FIRST S# Status City P# Qty S1 20 London P1 300 S1 20 London

More information

(a) Write a greedy O(n log n) algorithm that, on input S, returns a covering subset C of minimum size.

(a) Write a greedy O(n log n) algorithm that, on input S, returns a covering subset C of minimum size. Esercizi su Greedy Exercise 1.1 Let S = {[a 1, b 1 ], [a 2, b 2 ],..., [a n, b n ]} be a set of closed intervals on the real line. We say that C S is a covering subset for S if, for any interval [a, b]

More information

arxiv: v1 [cs.ai] 21 Sep 2014

arxiv: v1 [cs.ai] 21 Sep 2014 Oblivious Bounds on the Probability of Boolean Functions WOLFGANG GATTERBAUER, Carnegie Mellon University DAN SUCIU, University of Washington arxiv:409.6052v [cs.ai] 2 Sep 204 This paper develops upper

More information

Progressive & Algorithms & Systems

Progressive & Algorithms & Systems University of California Merced Lawrence Berkeley National Laboratory Progressive Computation for Data Exploration Progressive Computation Online Aggregation (OLA) in DB Query Result Estimate Result ε

More information

Mining Approximative Descriptions of Sets Using Rough Sets

Mining Approximative Descriptions of Sets Using Rough Sets Mining Approximative Descriptions of Sets Using Rough Sets Dan A. Simovici University of Massachusetts Boston, Dept. of Computer Science, 100 Morrissey Blvd. Boston, Massachusetts, 02125 USA dsim@cs.umb.edu

More information

Depth Estimation for Ranking Query Optimization

Depth Estimation for Ranking Query Optimization Depth Estimation for Ranking Query Optimization Karl Schnaitter UC Santa Cruz karlsch@soe.ucsc.edu Joshua Spiegel BEA Systems, Inc. jspiegel@bea.com Neoklis Polyzotis UC Santa Cruz alkis@cs.ucsc.edu ABSTRACT

More information

Errata and Proofs for Quickr [2]

Errata and Proofs for Quickr [2] Errata and Proofs for Quickr [2] Srikanth Kandula 1 Errata We point out some errors in the SIGMOD version of our Quickr [2] paper. The transitivity theorem, in Proposition 1 of Quickr, has a revision in

More information

Multi-column Bitmap Indexing

Multi-column Bitmap Indexing Multi-column Bitmap Indexing by Eduardo Gutarra Velez Bachelor of Computer Science, EAFIT, 2009 A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Computer Science

More information

Online Sorted Range Reporting and Approximating the Mode

Online Sorted Range Reporting and Approximating the Mode Online Sorted Range Reporting and Approximating the Mode Mark Greve Progress Report Department of Computer Science Aarhus University Denmark January 4, 2010 Supervisor: Gerth Stølting Brodal Online Sorted

More information

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 2. More operations: renaming. Previous lecture. Renaming.

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 2. More operations: renaming. Previous lecture. Renaming. Plan of the lecture G53RDB: Theory of Relational Lecture 2 Natasha Alechina chool of Computer cience & IT nza@cs.nott.ac.uk Renaming Joins Definability of intersection Division ome properties of relational

More information

Circles & Interval & Set Notation.notebook. November 16, 2009 CIRCLES. OBJECTIVE Graph a Circle given the equation in standard form.

Circles & Interval & Set Notation.notebook. November 16, 2009 CIRCLES. OBJECTIVE Graph a Circle given the equation in standard form. OBJECTIVE Graph a Circle given the equation in standard form. Write the equation of a circle in standard form given a graph or two points (one being the center). Students will be able to write the domain

More information

Homework Assignment 2. Due Date: October 17th, CS425 - Database Organization Results

Homework Assignment 2. Due Date: October 17th, CS425 - Database Organization Results Name CWID Homework Assignment 2 Due Date: October 17th, 2017 CS425 - Database Organization Results Please leave this empty! 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.15 2.16 2.17 2.18 2.19 Sum

More information

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort Sorting Algorithms We have already seen: Selection-sort Insertion-sort Heap-sort We will see: Bubble-sort Merge-sort Quick-sort We will show that: O(n log n) is optimal for comparison based sorting. Bubble-Sort

More information

Syntax and Semantics of Propositional Linear Temporal Logic

Syntax and Semantics of Propositional Linear Temporal Logic Syntax and Semantics of Propositional Linear Temporal Logic 1 Defining Logics L, M, = L - the language of the logic M - a class of models = - satisfaction relation M M, ϕ L: M = ϕ is read as M satisfies

More information

Integer Sorting on the word-ram

Integer Sorting on the word-ram Integer Sorting on the word-rm Uri Zwick Tel viv University May 2015 Last updated: June 30, 2015 Integer sorting Memory is composed of w-bit words. rithmetical, logical and shift operations on w-bit words

More information

Homework #5 Solutions

Homework #5 Solutions Homework #5 Solutions p 83, #16. In order to find a chain a 1 a 2 a n of subgroups of Z 240 with n as large as possible, we start at the top with a n = 1 so that a n = Z 240. In general, given a i we will

More information

SOLVING FINITE-DOMAIN LINEAR CONSTRAINTS IN PRESENCE OF THE ALLDIFFERENT

SOLVING FINITE-DOMAIN LINEAR CONSTRAINTS IN PRESENCE OF THE ALLDIFFERENT Logical Methods in Computer Science Vol. 12(3:5)2016, pp. 1 28 www.lmcs-online.org Submitted Jul. 31, 2015 Published Sep. 5, 2016 SOLVING FINITE-DOMAIN LINEAR CONSTRAINTS IN PRESENCE OF THE ALLDIFFERENT

More information

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM SPATIAL DATA MINING Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM INTRODUCTION The main difference between data mining in relational DBS and in spatial DBS is that attributes of the neighbors

More information

Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler

Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler Complexity Theory Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 15 May, 2018 Reinhard

More information

Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181.

Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181. Complexity Theory Complexity Theory Outline Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität

More information

1 Frequent Pattern Mining

1 Frequent Pattern Mining Decision Support Systems MEIC - Alameda 2010/2011 Homework #5 Due date: 31.Oct.2011 1 Frequent Pattern Mining 1. The Apriori algorithm uses prior knowledge about subset support properties. In particular,

More information

In-Database Factorised Learning fdbresearch.github.io

In-Database Factorised Learning fdbresearch.github.io In-Database Factorised Learning fdbresearch.github.io Mahmoud Abo Khamis, Hung Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich December 2017 Logic for Data Science Seminar Alan Turing Institute

More information

Uncertainty Aware Query Execution Time Prediction

Uncertainty Aware Query Execution Time Prediction Uncertainty Aware Query Execution Time Prediction Wentao Wu Xi Wu Hakan Hacıgümüş Jeffrey F. Naughton Department of Computer Sciences, University of Wisconsin-Madison NEC Laboratories America {wentaowu,

More information

REVIEW QUESTIONS. Chapter 1: Foundations: Sets, Logic, and Algorithms

REVIEW QUESTIONS. Chapter 1: Foundations: Sets, Logic, and Algorithms REVIEW QUESTIONS Chapter 1: Foundations: Sets, Logic, and Algorithms 1. Why can t a Venn diagram be used to prove a statement about sets? 2. Suppose S is a set with n elements. Explain why the power set

More information

Semantic Optimization Techniques for Preference Queries

Semantic Optimization Techniques for Preference Queries Semantic Optimization Techniques for Preference Queries Jan Chomicki Dept. of Computer Science and Engineering, University at Buffalo,Buffalo, NY 14260-2000, chomicki@cse.buffalo.edu Abstract Preference

More information

Problem 5. Use mathematical induction to show that when n is an exact power of two, the solution of the recurrence

Problem 5. Use mathematical induction to show that when n is an exact power of two, the solution of the recurrence A. V. Gerbessiotis CS 610-102 Spring 2014 PS 1 Jan 27, 2014 No points For the remainder of the course Give an algorithm means: describe an algorithm, show that it works as claimed, analyze its worst-case

More information

INSTITUT FÜR INFORMATIK

INSTITUT FÜR INFORMATIK INSTITUT FÜR INFORMATIK DER LUDWIGMAXIMILIANSUNIVERSITÄT MÜNCHEN Bachelorarbeit Propagation of ESCL Cardinality Constraints with Respect to CEP Queries Thanh Son Dang Aufgabensteller: Prof. Dr. Francois

More information

CS156: The Calculus of Computation

CS156: The Calculus of Computation CS156: The Calculus of Computation Zohar Manna Winter 2010 It is reasonable to hope that the relationship between computation and mathematical logic will be as fruitful in the next century as that between

More information

EECS-3421a: Test #2 Electrical Engineering & Computer Science York University

EECS-3421a: Test #2 Electrical Engineering & Computer Science York University 18 November 2015 EECS-3421a: Test #2 1 of 16 EECS-3421a: Test #2 Electrical Engineering & Computer Science York University Family Name: Given Name: Student#: CSE Account: Instructor: Parke Godfrey Exam

More information

CSE 202 Homework 4 Matthias Springer, A

CSE 202 Homework 4 Matthias Springer, A CSE 202 Homework 4 Matthias Springer, A99500782 1 Problem 2 Basic Idea PERFECT ASSEMBLY N P: a permutation P of s i S is a certificate that can be checked in polynomial time by ensuring that P = S, and

More information

Impression Store: Compressive Sensing-based Storage for. Big Data Analytics

Impression Store: Compressive Sensing-based Storage for. Big Data Analytics Impression Store: Compressive Sensing-based Storage for Big Data Analytics Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda & Zheng Zhang Microsoft Research The Curse of O(N) in

More information

As mentioned, we will relax the conditions of our dictionary data structure. The relaxations we

As mentioned, we will relax the conditions of our dictionary data structure. The relaxations we CSE 203A: Advanced Algorithms Prof. Daniel Kane Lecture : Dictionary Data Structures and Load Balancing Lecture Date: 10/27 P Chitimireddi Recap This lecture continues the discussion of dictionary data

More information

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design! First Normal Form! Pitfalls in Relational Database Design! Functional Dependencies! Decomposition! Boyce-Codd Normal Form! Third

More information

Propositional and Predicate Logic - V

Propositional and Predicate Logic - V Propositional and Predicate Logic - V Petr Gregor KTIML MFF UK WS 2016/2017 Petr Gregor (KTIML MFF UK) Propositional and Predicate Logic - V WS 2016/2017 1 / 21 Formal proof systems Hilbert s calculus

More information

DATA MINING LECTURE 3. Frequent Itemsets Association Rules

DATA MINING LECTURE 3. Frequent Itemsets Association Rules DATA MINING LECTURE 3 Frequent Itemsets Association Rules This is how it all started Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases.

More information

Information theoretical approach for domain ontology exploration in large Earth observation image archives

Information theoretical approach for domain ontology exploration in large Earth observation image archives Information theoretical approach for domain ontology exploration in large Earth observation image archives Mihai Datcu, Mariana Ciucu DLR Oberpfaffenhofen IGARSS 2004 KIM - Knowledge driven Information

More information

n Empty Set:, or { }, subset of all sets n Cardinality: V = {a, e, i, o, u}, so V = 5 n Subset: A B, all elements in A are in B

n Empty Set:, or { }, subset of all sets n Cardinality: V = {a, e, i, o, u}, so V = 5 n Subset: A B, all elements in A are in B Discrete Math Review Discrete Math Review (Rosen, Chapter 1.1 1.7, 5.5) TOPICS Sets and Functions Propositional and Predicate Logic Logical Operators and Truth Tables Logical Equivalences and Inference

More information

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design Functional Dependencies Decomposition Boyce-Codd Normal Form Third Normal

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) Relational Calculus Lecture 5, January 27, 2014 Mohammad Hammoud Today Last Session: Relational Algebra Today s Session: Relational algebra The division operator and summary

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering

More information

Relational Nonlinear FIR Filters. Ronald K. Pearson

Relational Nonlinear FIR Filters. Ronald K. Pearson Relational Nonlinear FIR Filters Ronald K. Pearson Daniel Baugh Institute for Functional Genomics and Computational Biology Thomas Jefferson University Philadelphia, PA Moncef Gabbouj Institute of Signal

More information