CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #3: SQL---Part 1

Similar documents
Database Applications (15-415)

Database Applications (15-415)

Database Systems SQL. A.R. Hurson 323 CS Building

Relational Algebra on Bags. Why Bags? Operations on Bags. Example: Bag Selection. σ A+B < 5 (R) = A B

General Overview - rel. model. Carnegie Mellon Univ. Dept. of Computer Science /615 DB Applications. Motivation. Overview - detailed

Introduction to Data Management. Lecture #12 (Relational Algebra II)

Relational Algebra and Calculus

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #15: BCNF, 3NF and Normaliza:on

Homework Assignment 2. Due Date: October 17th, CS425 - Database Organization Results

Relational Algebra & Calculus

Query Processing. 3 steps: Parsing & Translation Optimization Evaluation

Sub-Queries in SQL SQL. 3 Types of Sub-Queries. 3 Types of Sub-Queries. Scalar Sub-Query Example. Scalar Sub-Query Example

Relational Algebra 2. Week 5

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1)

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS

CS5300 Database Systems

Database Systems Relational Algebra. A.R. Hurson 323 CS Building

Databases 2011 The Relational Algebra

Lineage implementation in PostgreSQL

Schedule. Today: Jan. 17 (TH) Jan. 24 (TH) Jan. 29 (T) Jan. 22 (T) Read Sections Assignment 2 due. Read Sections Assignment 3 due.

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

Carnegie Mellon University Department of Computer Science /615 - Database Applications C. Faloutsos & A. Pavlo, Fall 2016

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

Exam 1. March 12th, CS525 - Midterm Exam Solutions

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CS54100: Database Systems

Query answering using views

Instructor: Sudeepa Roy

Schema Refinement & Normalization Theory

GAV-sound with conjunctive queries

Lectures 6. Lecture 6: Design Theory

CSE 562 Database Systems

CS Data Structures and Algorithm Analysis

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Databases. Exercises on Relational Algebra

Algorithms and Programming I. Lecture#1 Spring 2015

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and

CS632 Notes on Relational Query Languages I

Lecture #7 (Relational Design Theory, cont d.)

Discrete Mathematics. Kenneth A. Ribet. January 31, 2013

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 2. More operations: renaming. Previous lecture. Renaming.

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

CSE 241 Class 1. Jeremy Buhler. August 24,

/home/thierry/columbia/msongsdb/tutorials/tutorial4/tutorial4.py January 25,

EECS-3421a: Test #2 Electrical Engineering & Computer Science York University

CS2742 midterm test 2 study sheet. Boolean circuits: Predicate logic:

Environment (Parallelizing Query Optimization)

Constraints: Functional Dependencies

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

/home/thierry/columbia/msongsdb/pyreport_tutorials/tutorial1/tutorial1.py January 23, 20111

CS 347 Parallel and Distributed Data Processing

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

From BASIS DD to Barista Application in Five Easy Steps

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

6.830 Lecture 11. Recap 10/15/2018

Functional Dependencies. Chapter 3

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

Databases Exam HT2016 Solution

Algorithms for Data Science

Introduction to Python

An analogy from Calculus: limits

Logic: Bottom-up & Top-down proof procedures

Instructor: Amol Deshpande

CS 347 Parallel and Distributed Data Processing

Relational operations

You are here! Query Processor. Recovery. Discussed here: DBMS. Task 3 is often called algebraic (or re-write) query optimization, while

GIS Lecture 4: Data. GIS Tutorial, Third Edition GIS 1

Math 1553 Introduction to Linear Algebra. School of Mathematics Georgia Institute of Technology

Database Design and Implementation

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

COMP Logic for Computer Scientists. Lecture 16

DATA MINING LECTURE 6. Similarity and Distance Sketching, Locality Sensitive Hashing

Lecture 1: Asymptotic Complexity. 1 These slides include material originally prepared by Dr.Ron Cytron, Dr. Jeremy Buhler, and Dr. Steve Cole.

Rough Sets. V.W. Marek. General introduction and one theorem. Department of Computer Science University of Kentucky. October 2013.

Course Announcements. Bacon is due next Monday. Next lab is about drawing UIs. Today s lecture will help thinking about your DB interface.

Efficient query evaluation

Designing Information Devices and Systems II Spring 2016 Anant Sahai and Michel Maharbiz Midterm 2

MAT 1302B Mathematical Methods II

Recursion: Introduction and Correctness

Outline. Approximation: Theory and Algorithms. Application Scenario. 3 The q-gram Distance. Nikolaus Augsten. Definition and Properties

CS5371 Theory of Computation. Lecture 14: Computability V (Prove by Reduction)

Crypto Lab 2011: Code Challenge Website - Report (v1.0)

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from

CSE 494/598 Lecture-4: Correlation Analysis. **Content adapted from last year s slides

CSC236 Week 3. Larry Zhang

Practice and Applications of Data Management CMPSCI 345. Lecture 15: Functional Dependencies

Module 10: Query Optimization

A Note on Turing Machine Design

Chuck Cartledge, PhD. 21 January 2018

Chapter 3 Relational Model

Today. Binary relations establish a relationship between elements of two sets

Mathematics. Lecture and practice. Norbert Bogya. University of Szeged, Bolyai Institute. Norbert Bogya (Bolyai Institute) Mathematics / 27

Functions and Relations

Constraints: Functional Dependencies

Lexical Analysis Part II: Constructing a Scanner from Regular Expressions

Factorized Relational Databases Olteanu and Závodný, University of Oxford

Database design and implementation CMPSCI 645

Unit 7 Logical Database Design With Normalization Zvi M. Kedem 1

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Formal Semantics of SQL (and Cypher) Paolo Guagliardo

Transcription:

CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #3: SQL---Part 1

Announcements---Project Goal: design a database system applica=on with a web front-end Project Assignment 1 released today Due Feb 3, in class, hardcopy (1 per project group) Total of 3 during the semester Heads-up: Start thinking about groups same group for rest of the semester You are free to choose your own project members You can post on piazza as well If you like me to assign you to a group, send me email Min size=2 members, Max size=3 members. Anything else needs an excellent reason (and my permission) Prakash 2016 VT CS 4604 2

Annoucements Reminder: Handout 1 is also on the website We will discuss it in the next class, for prac=ce Bring a printed copy to class Prakash 2016 VT CS 4604 3

Last lecture Rela=onal Algebra Prakash 2016 VT CS 4604 4

Quick Quiz: Independence of Operators R \ S = R (R S) R./ C = C (R S) R./ S =?? Prakash 2016 VT CS 4604 5

Quick Quiz: Independence of R./ S Operators Suppose R and S share the azributes A1,A2,..An Let L be the list of azributes in R \Union list of azributes in S (so no duplicate azributes) Let C be the condi=on R.A1 = S.A1 AND R.A2 = S.A2 AND.. R.An = S.An R./ S = L ( C (R S)) Prakash 2016 VT CS 4604 6

Quick Aside: RA queries can become Normal expression: S1.Name,S2.Name ( long! S1.Address=S2.Address ( S1 (Students) S2 (Students))) Linear Nota=on: Pairs(P1, N1, A1, P2, N2, A2) := S1 (Students) S2 (Students) Matched(P1, N1, A1, P2, N2, A2) := A1=A2(Pairs(P1, N1, A1, P2, N2, A2)) Answer(Name1, Name2) := N1,N2 (Matched(P1, N1, A1, P2, N2, A2)) Prakash 2016 VT CS 4604 7

This lecture Structured Query Language (SQL) Pronounced Sequel Prakash 2016 VT CS 4604 8

Overview - detailed - SQL DML select, from, where, renaming set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 9

Relational Query Languages A major strength of the relational model: supports simple, powerful querying of data. Two sublanguages: DDL Data Definition Language define and modify schema (at all 3 levels) DML Data Manipulation Language Queries can be written intuitively. Prakash 2016 VT CS 4604 10

Relational languages The DBMS is responsible for efficient evaluation. Query optimizer: re-orders operations and generates query plan Prakash 2016 VT CS 4604 11

The SQL Query Language The most widely used relational query language. Major standard is SQL-1999 (=SQL3) Introduced Object-Relational concepts SQL 2003, SQL 2008 have small extensions SQL92 is a basic subset Prakash 2016 VT CS 4604 12

SQL (cont d) PostgreSQL has some unique aspects (as do most systems). XML is the next challenge for SQL. Prakash 2016 VT CS 4604 13

SQLite Most popular embedded db in the world Iphone (ios), Android, Chrome. (Very) Easy to use: no need to set it up Self-contained: data+schema DB on your laptop: useful for tes=ng, understanding. Prakash 2016 VT CS 4604 14

DML General form select a1, a2, an from r1, r2, rm where P [order by.] [group by ] [having ] Prakash 2016 VT CS 4604 15

Reminder: mini-u db STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave CLASS c-id c-name units 4602 s.e. 2 4603 o.s. 2 TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 16

DML - eg: find the ssn(s) of everybody called smith select ssn from student where name= smith Prakash 2016 VT CS 4604 17

DML - observation General form select a1, a2, an from r1, r2, rm where P equivalent rel. algebra query? Prakash 2016 VT CS 4604 18

DML - observation General form select a1, a2, an from r1, r2, rm where P π σ a ( ( r1 r2... rm)) 1, a2,... an P Prakash 2016 VT CS 4604 19

DML observation Set VS Bags General form select distinct a1, a2, an from r1, r2, rm where P π σ a ( ( r1 r2... rm)) 1, a2,... an P NOTE: Rela=onal Algebra is set seman0cs (everything is a set), so removes duplicates automa=cally. SQL is bag seman0cs (everything is a mul=set), so removes duplicates only when asked to (using dis=nct) Prakash 2016 VT CS 4604 20

select clause select [distinct all ] name from student where address= main Prakash 2016 VT CS 4604 21

where clause find ssn(s) of all smith s on main select ssn from student where address= main and name = smith Prakash 2016 VT CS 4604 22

where clause boolean operators (and or not ) comparison operators (<, >, =, ) and more Prakash 2016 VT CS 4604 23

What about strings? find student ssns who live on main (st or str or street - ie., main st or main str ) Prakash 2016 VT CS 4604 24

What about strings? find student ssns who live on main (st or str or street) select ssn from student where address like main% %: variable-length don t care _: single-character don t care Prakash 2016 VT CS 4604 25

from clause find names of people taking 4604 STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave CLASS c-id c-name units 4602 s.e. 2 4603 o.s. 2 TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 26

from clause find names of people taking 4604 select name from student, takes where??? Prakash 2016 VT CS 4604 27

from clause find names of people taking 4604 select name from student, takes where student.ssn = takes.ssn and takes.c-id = 4604 Prakash 2016 VT CS 4604 28

renaming - tuple variables find names of people taking 4604 select name from ourveryownstudent, studenttakingclasses where ourveryownstudent.ssn = studenttakingclasses.ssn and studenttakingclasses.c-id = 4604 Prakash 2016 VT CS 4604 29

renaming - tuple variables find names of people taking 4604 select name from ourveryownstudent as S, studenttakingclasses as T where S.ssn =T.ssn and T.c-id = 4604 Prakash 2016 VT CS 4604 30

renaming - self-join self -joins: find Tom s grandparent(s) PC p-id Mary Peter John c-id Tom Mary Tom PC p-id Mary Peter John c-id Tom Mary Tom Prakash 2016 VT CS 4604 31

renaming - self-join find grandparents of Tom (PC(p-id, c-id)) select gp.p-id from PC as gp, PC where gp.c-id= PC.p-id and PC.c-id = Tom Prakash 2016 VT CS 4604 32

renaming - theta join find course names with more units than 4604 select c1.c-name from class as c1, class as c2 where c1.units > c2.units and c2.c-id = 4604 Prakash 2016 VT CS 4604 33

renaming - theta join find course names with more units than 4604 select c1.c-name from class as c1, class as c2 where c1.units > c2.units and c2.c-id = 4604 Prakash 2016 VT CS 4604 34

Overview - detailed - SQL DML select, from, where set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 35

set operations find ssn of people taking both 4604 and 4613 TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 36

set operations find ssn of people taking both 4604 and 4613 select ssn from takes where c-id= 4604 and c-id= 4613 Prakash 2016 VT CS 4604 37

set operations find ssn of people taking both 4604 and 4613 (select ssn from takes where c-id= 4604 ) Intersect (select ssn from takes where c-id= 4613 ) other ops: union, except Prakash 2016 VT CS 4604 38

Overview - detailed - SQL DML select, from, where set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 39

Ordering find student records, sorted in name order select * from student where Prakash 2016 VT CS 4604 40

Ordering find student records, sorted in name order select * from student order by name asc asc is the default Prakash 2016 VT CS 4604 41

Ordering find student records, sorted in name order; break ties by reverse ssn select * from student order by name, ssn desc Prakash 2016 VT CS 4604 42

Overview - detailed - SQL DML select, from, where set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 43

Aggregate functions find avg grade, across all students select?? from takes TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 44

Aggregate functions find avg grade, across all students select avg(grade) from takes result: a single number Which other functions? TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 45

Aggregate functions A: sum count min max (std) Prakash 2016 VT CS 4604 46

Aggregate functions find total number of enrollments select count(*) from takes TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 47

Aggregate functions find total number of students in 4604 select count(*) from takes where c-id= 4604 TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 48

Aggregate functions find total number of students in each course select count(*) from takes where??? TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 49

Aggregate functions find total number of students in each course select c-id, count(*) from takes group by c-id TAKES SSN c-id grade 123 4613 A 234 4613 B c-id count 4613 2 Prakash 2016 VT CS 4604 50

Aggregate functions find total number of students in each course select c-id, count(*) from takes group by c-id order by c-id TAKES SSN c-id grade 123 4613 A 234 4613 B c-id count 4613 2 Prakash 2016 VT CS 4604 51

Aggregate functions find total number of students in each course, and sort by count, decreasing select c-id, count(*) as pop from takes group by c-id order by pop desc TAKES SSN c-id grade 123 4613 A 234 4613 B c-id pop 4613 2 Prakash 2016 VT CS 4604 52

Aggregate functions- having find students with GPA > 3.0 SSN c-id grade 123 4613 4 234 4613 3 Prakash 2016 VT CS 4604 53

Aggregate functions- having find students with GPA > 3.0 select???, avg(grade) from takes group by??? SSN c-id grade 123 4613 4 234 4613 3 Prakash 2016 VT CS 4604 54

Aggregate functions- having find students with GPA > 3.0 select ssn, avg(grade) from takes group by ssn??? SSN c-id grade 123 4613 4 234 4613 3 SSN avg(grade) 123 4 234 3 Prakash 2016 VT CS 4604 55

Aggregate functions- having find students with GPA > 3.0 select ssn, avg(grade) from takes group by ssn having avg(grade)>3.0 having <-> where for groups SSN c-id grade 123 4613 4 234 4613 3 SSN avg(grade) 123 4 234 3 Prakash 2016 VT CS 4604 56

Aggregate functions- having find students and GPA, for students with > 5 courses select ssn, avg(grade) from takes group by ssn having count(*) > 5 SSN c-id grade 123 4613 4 234 4613 3 SSN avg(grade) 123 4 234 3 Prakash 2016 VT CS 4604 57

Drill: Find the age of the youngest sailor for each ra0ng level Sid Sname Ra7ng Age SELECT S.ra7ng, MIN (S.age) as age FROM Sailors S 22 Dus7n 7 45.0 GROUP BY S.ra7ng 31 Lubber 8 55.5 (1) The sailors tuples are put into same 85 Art 3 25.5 ra=ng groups. 32 Andy 8 25.5 (2) Compute the Minimum age for each 95 Bob 3 63.5 ra=ng group. Ra7ng Age Ra7ng Age 3 25.5 7 45.0 8 25.5 (2) 3 25.5 3 63.5 7 45.0 8 55.5 8 25.5 Prakash 2016 VT CS 4604 58 (1)

Drill: Find the age of the youngest sailor for each ra0ng level Sid Sname Ra7ng Age SELECT S.ra7ng, MIN (S.age) as age FROM Sailors S 22 Dus7n 7 45.0 GROUP BY S.ra7ng 31 Lubber 8 55.5 (1) The sailors tuples are put into same 85 Art 3 25.5 ra=ng groups. 32 Andy 8 25.5 (2) Compute the Minimum age for each 95 Bob 3 63.5 ra=ng group. Ra7ng Age Ra7ng Age 3 25.5 7 45.0 8 25.5 (2) 3 25.5 3 63.5 7 45.0 8 55.5 8 25.5 Prakash 2016 VT CS 4604 59 (1)

Drill: Find the age of the youngest sailor for each ra0ng level that has at least 2 members SELECT S.ra7ng, MIN (S.age) as minage FROM Sailors S GROUP BY S.ra7ng HAVING COUNT(*) > 1 1. The sailors tuples are put into same ra=ng groups. 2. Eliminate groups that have < 2 members. 3. Compute the Minimum age for each ra=ng group. Ra7ng Sid Sname Ra7ng Age 22 Dus7n 7 45.0 31 Lubber 8 55.5 85 Art 3 25.5 32 Andy 8 25.5 95 Bob 3 63.5 Minage 3 25.5 8 25.5 Ra7ng Age 3 25.5 3 63.5 7 45.0 8 55.5 8 25.5 Prakash 2016 VT CS 4604 60

Drill: Find the age of the youngest sailor for each ra0ng level that has at least 2 members SELECT S.ra7ng, MIN (S.age) as minage FROM Sailors S GROUP BY S.ra7ng HAVING COUNT(*) > 1 1. The sailors tuples are put into same ra=ng groups. 2. Eliminate groups that have < 2 members. 3. Compute the Minimum age for each ra=ng group. Ra7ng Sid Sname Ra7ng Age 22 Dus7n 7 45.0 31 Lubber 8 55.5 85 Art 3 25.5 32 Andy 8 25.5 95 Bob 3 63.5 Minage 3 25.5 8 25.5 Ra7ng Age 3 25.5 3 63.5 7 45.0 8 55.5 8 25.5 Prakash 2016 VT CS 4604 61

Overview - detailed - SQL DML select, from, where set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 62

Recap: DML General form select a1, a2, an from r1, r2, rm where P [order by.] [group by ] [having ] Prakash 2016 VT CS 4604 63

DML - nested subqueries find names of students of 4604 select name from student where... ssn in the set of people that take 4604 Prakash 2016 VT CS 4604 64

DML - nested subqueries find names of students of 15-415 select name from student where... select ssn from takes where c-id = 4604 Prakash 2016 VT CS 4604 65

DML - nested subqueries find names of students of 15-415 select name from student where ssn in ( select ssn from takes where c-id = 4604 ) Prakash 2016 VT CS 4604 66

DML - nested subqueries in compares a value with a set of values in can be combined other boolean ops it is redundant (but user friendly!): select name from student.. where c-id = 4604. Prakash 2016 VT CS 4604 67

DML - nested subqueries in compares a value with a set of values in can be combined other boolean ops it is redundant (but user friendly!): select name from student, takes where c-id = 4604 and student.ssn=takes.ssn Prakash 2016 VT CS 4604 68

DML - nested subqueries find names of students taking 4604 and living on main str select name from student where address= main str and ssn in ( select ssn from takes where c-id = 4604 ) Prakash 2016 VT CS 4604 69

DML - nested subqueries in compares a value with a set of values other operators like in?? Prakash 2016 VT CS 4604 70

DML - nested subqueries find student record with highest ssn select * from student where ssn is greater than every other ssn Prakash 2016 VT CS 4604 71

DML - nested subqueries find student record with highest ssn select * from student where ssn greater than every select ssn from student Prakash 2016 VT CS 4604 72

DML - nested subqueries find student record with highest ssn select * from student where ssn > all ( select ssn from student) Prakash 2016 VT CS 4604 73

DML - nested subqueries find student record with highest ssn select * from student where ssn >= all ( select ssn from student) Prakash 2016 VT CS 4604 74

DML - nested subqueries find student record with highest ssn - without nested subqueries? select S1.ssn, S1.name, S1.address from student as S1, student as S2 where S1.ssn > S2.ssn is not the answer (what does it give?) Prakash 2016 VT CS 4604 75

DML - nested subqueries S1 STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave S2 STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave S1.ssn>S2.ssn S1 x S2 S1. ssn S2.ssn. 123 123 234 123 123 234 234 234 Prakash 2016 VT CS 4604 76

DML - nested subqueries select S1.ssn, S1.name, S1.address from student as S1, student as S2 where S1.ssn > S2.ssn gives all but the smallest ssn - aha! Prakash 2016 VT CS 4604 77

DML - nested subqueries find student record with highest ssn - without nested subqueries? select S1.ssn, S1.name, S1.address from student as S1, student as S2 where S1.ssn < S2.ssn gives all but the highest - therefore. Prakash 2016 VT CS 4604 78

DML - nested subqueries find student record with highest ssn - without nested subqueries? (select * from student) except (select S1.ssn, S1.name, S1.address from student as S1, student as S2 where S1.ssn < S2.ssn) Prakash 2016 VT CS 4604 79

DML - nested subqueries (select * from student) except (select S1.ssn, S1.name, S1.address from student as S1, student as S2 where S1.ssn < S2.ssn) select * from student where ssn >= all (select ssn from student) Prakash 2016 VT CS 4604 80

DML - nested subqueries Drill: Even more readable than select * from student where ssn >= all (select ssn from student) Prakash 2016 VT CS 4604 81

DML - nested subqueries Drill: Even more readable than select * from student where ssn >= all (select ssn from student) select * from student where ssn in (select max(ssn) from student) Prakash 2016 VT CS 4604 82

from clause Drill: find the ssn of the student with the highest GPA STUDENT Ssn Name Address 123 smith main str 234 jones forbes ave CLASS c-id c-name units 4602 s.e. 2 4603 o.s. 2 TAKES SSN c-id grade 123 4613 A 234 4613 B Prakash 2016 VT CS 4604 83

DML - nested subqueries Drill: find the ssn and GPA of the student with the highest GPA select ssn, avg(grade) from takes where Prakash 2016 VT CS 4604 84

DML - nested subqueries Drill: find the ssn and GPA of the student with the highest GPA select ssn, avg(grade) from takes group by ssn having avg( grade)... greater than every other GPA on file Prakash 2016 VT CS 4604 85

DML - nested subqueries Drill: find the ssn and GPA of the student with the highest GPA select ssn, avg(grade) from takes group by ssn having avg( grade) >= all ( select avg( grade ) from student group by ssn ) } all GPAs Prakash 2016 VT CS 4604 86

DML - nested subqueries in and >= all compares a value with a set of values other operators like these? Prakash 2016 VT CS 4604 87

DML - nested subqueries <all(), <>all()... <>all is identical to not in >some(), >= some ()... = some() is identical to in exists Prakash 2016 VT CS 4604 88

DML - nested subqueries Drill for exists : find all courses that nobody enrolled in select c-id from class.with no tuples in takes TAKES SSN c-id grade 123 4613 A 234 4613 B CLASS c-id c-name units 4602 s.e. 2 4603 o.s. 2 Prakash 2016 VT CS 4604 89

DML - nested subqueries Drill for exists : find all courses that nobody enrolled in select c-id from class where not exists (select * from takes where class.c-id = takes.c-id) Prakash 2016 VT CS 4604 90

Correlated vs Uncorrelated The previous subqueries did not depend on anything outside the subquery and thus need to be executed just once. These are called uncorrelated. A correlated subquery depends on data from the outer query and thus has to be executed for each row of the outer table(s) Prakash 2016 VT CS 4604 91

Correlated Subqueries Find course names that have been used for two or more courses. SELECT CourseName FROM Courses AS First WHERE CourseName IN (SELECT CourseName FROM Courses WHERE (Number <> First.Number) AND (DeptName <> First.DeptName) ); Prakash 2016 VT CS 4604 92

Evalua0ng Correlated Subqueries SELECT CourseName FROM Courses AS First WHERE CourseName IN (SELECT CourseName FROM Courses WHERE (Number <> First.Number) AND (DeptName <> First.DeptName) ); Evaluate query by looping over tuples of First, and for each tuple evaluate the subquery. Scoping rules: an azribute in a subquery belongs to one of the tuple variables in that subquery s FROM clause, or to the immediately surrounding subquery, and so on. Prakash 2016 VT CS 4604 93

Overview - detailed - SQL DML select, from, where set operations ordering aggregate functions nested subqueries other parts: DDL, constraints etc. Prakash 2016 VT CS 4604 94

(Next Week) Overview - detailed SQL DML other parts: views modifications joins DDL Constraints Prakash 2016 VT CS 4604 95