THJ Tietokannanhallintajärjestelmät - Database Management Systems


THJ Tietokannanhallintajärjestelmät - Database Management Systems
Matti Nykänen
School of Computing, University of Eastern Finland
Academic year, IV quarter

Contents

1 Introduction
2 The Relational Model
  2.1 Tables
  2.2 Keys
  2.3 Integrity Constraints
  2.4 Different Viewpoints to Data
  2.5 Transactions
  2.6 Relational Algebra
3 Structured Query Language
  Data Definition Language
  Query Language
  Data Manipulation Language
  Client-Server Database Architecture
  Installing and Running SimpleDB
  Using a Relational Database from Java
    JDBC Error Handling
    JDBC Transaction Handling
    Impedance Mismatch
4 The Structure of the SimpleDB RDBMS Engine
  File Management
  Log Management
  Buffer Management
  Transaction Management
    Database Recovery
    Concurrency Control
  Record Management
  Metadata Management
  Query Processing
    Query Scans
    Update Scans
  Plans
  Predicates
  Parsing SQL Statements
  Query Execution Planner
  The Remote Database Server
5 Indexing
  Extendable Hashing
  B+-trees
  Using an Index in a Relational Algebra Operation
  Updating Indexed Data
6 Query Optimization
  Heuristic Optimization
  On Cost-Based Optimization

References

Thomas M. Connolly and Carolyn E. Begg. Database Systems: A Practical Approach to Design, Implementation, and Management. Addison Wesley, fifth edition, 2010.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. The MIT Press, third edition, 2009.
Ramez Elmasri and Shamkant B. Navathe. Database Systems: Models, Languages, Design, and Application Programming. Pearson, sixth edition, 2011.
John R. Levine, Tony Mason, and Doug Brown. Lex & Yacc. O'Reilly, second edition, 1992.
Simon Peyton Jones. Beautiful concurrency. In Andy Oram and Greg Wilson, editors, Beautiful Code, chapter 24. O'Reilly, 2007.
Edward Sciore. Database Design and Implementation. Wiley, 2008.
Peter Sestoft. Java Precisely. The MIT Press, second edition, 2005.
Gerhard Weikum and Gottfried Vossen. Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann, 2002.

1 Introduction

The main questions of this course are:

- What features must a Relational Data Base Management System (RDBMS) have?
- How can these features be implemented in an RDBMS?

This course discusses general design principles, not the vendor-specific design issues of MySQL, Oracle, Microsoft Access, or other RDBMS products.

The course book and software

This course is based mainly on the following book:

  Edward Sciore: Database Design and Implementation. Wiley, 2008.

It illustrates these principles via a small and simplified RDBMS called SimpleDB, developed by its author. It can be downloaded from the book's web site. Unfortunately Amazon offers another, non-relational DBMS with the same name.

The wiki page for the course

Direct your browser to the course wiki address and click the following link chain: tkt-wiki - Kurssien kotisivuja - Course homepages - THJ - Tietokannanhallintajärjestelmät - Database Management Systems. Or add its direct link into your bookmarks. This page contains these lecture handouts in weekly installments, as well as other course material.

Knowledge of RDBMS internals is interesting, for instance, to:

- Software developers whose programs communicate with RDBMSs. For instance, they need to understand the concept of transactions and its role in this communication.
- Data Base Administrators (DBAs), the IT specialists responsible for keeping the RDBMS of an organization up and running smoothly and efficiently. If an organization is large, it has dedicated DBAs, since nowadays organizations rely heavily on their databases; if it is small, its IT staff also acts as its DBAs. For instance, a DBA must know what RDBMS parameters like the transaction checkpoint frequency mean and how they affect the performance level.

Figure 1: The Class Diagram for the University Database. (Sciore, 2008)

2 The Relational Model

(Sciore, 2008, part 1)

The earlier course Data Management ("Tiedonhallinta" (THA) in Finnish) has discussed the following:

1. What is the relational data model?
2. How can the class (or Entity-Relationship, or ...) diagram gained from information system design be cast into relational table form? Figure 1 shows the class diagram for an American university database, our running example. (We will omit its parking PERMIT class later.)
3. How can this form be implemented using a Relational Database Management System (RDBMS) such as MySQL, Oracle, MS Access, ...?

Here we revisit the relational model briefly from another viewpoint:

4. What does it expect the RDBMS to be able to do?

Then the rest of this course discusses how the RDBMS can do these things.

2.1 Tables

(Sciore, 2008)

The central feature of the relational data model is to organize data into tables. Moreover, the result of a query in the relational data model is always another table, built from the stored tables. Each table has its own collection of columns, called its attributes.

Figure 2: The Schema for the University Database. (Sciore, 2008)

This collection is called the schema of the table. The collection of all the schemas of all the tables in a database is also called the schema of this database. (Some texts use the correct but old-fashioned plural "schemata".)

Figure 2 shows the schema for our example database, where each table schema is in the form

  TABLENAME(AttrName_1, AttrName_2, AttrName_3, ..., AttrName_n)

which gives each of its attributes its own name.

The schema is one example of metadata: data about data.

"Data" is originally Latin for "the given things". (It is the plural of "datum", one given thing.) "Meta" (μετα) is originally Greek for "after". When later scholars compiled Aristotle's writings, they did not know where to put his "first philosophy", so they put it after the physics, and started calling it "metaphysics" instead. This gave "meta-" the new meaning of "what you must read first, before you can understand the rest". Hence in computer science metadata means the extra data which tells how the actual data is structured. (For instance, XML.)

Requirement 1 (data and metadata). The RDBMS must be able to maintain both the data itself and its metadata.

Rows

A table contains zero or more rows. Each row represents the data corresponding to some specific individual. Each row r of table T representing some specific individual x then tells what is stored in the database about x with respect to the attributes a_1, a_2, a_3, ..., a_n of T, so that the value r.a_i on column a_i of row r tells what x is like with respect to a_i.

Intuitively, such a row r says that "there is some individual x of kind T whose a_1 is r.a_1, and whose a_2 is r.a_2, and whose a_3 is r.a_3, and ... and whose a_n is r.a_n".

Figure 3 shows our example tables with some example rows. For instance, the first row of STUDENT says that there is some student whose student ID number ("opiskelijanumero" in Finnish) is 1, whose name is Joe, whose graduation year is 2004, and whose major subject is computer science.

The rows within a table are unordered. When we said that the first row of STUDENT is Joe's, we meant that the STUDENT rows were shown in a particular order, here by student ID.

Requirement 2 (no order). The RDBMS must be able to sort the rows before showing them to the user. The user can determine the order in which (s)he wants to see them. However, row ordering must not affect anything else than output.

Null Values

Some attribute values may only become known later. For instance:

1. The new STUDENT is registered in the university database and assigned his/her own student ID number in the beginning of his/her studies.
2. But the graduation year is only known at the end of his/her studies.

What is the value of the graduation year attribute during his/her studies? A natural solution is to mark the year as "not known (yet)". The relational model provides special NULL values for such purposes.

These NULLs behave differently than any actual values: Let r be a STUDENT table row of a currently studying student, so that r.GradYear = NULL. Then the answer to every one of these three questions

  r.GradYear < 2015    r.GradYear = 2015    r.GradYear > 2015

must be "No!", because we do not know the actual graduation year yet. In addition, if s is another STUDENT row, then the three questions

  r.GradYear < s.GradYear    r.GradYear = s.GradYear    r.GradYear > s.GradYear

also all get the same answer "No!", whether s.GradYear is a known year or NULL. However, this holds even when row s is r itself! The relational model has the concept of NULL values in general, but not the NULL value(s) specifically for row r.
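The same behaviour can be seen directly in SQL, which uses a three-valued logic for comparisons involving NULL. A small sketch against the STUDENT table of Figure 2 (the queries are our own, not Sciore's):

  -- None of these returns the row of a student whose GradYear is NULL:
  SELECT SName FROM STUDENT WHERE GradYear < 2015;
  SELECT SName FROM STUDENT WHERE GradYear = 2015;
  SELECT SName FROM STUDENT WHERE GradYear > 2015;

  -- Not even comparing the column with itself helps:
  SELECT SName FROM STUDENT WHERE GradYear = GradYear;

  -- The only way to ask for the unknown values is the special predicate:
  SELECT SName FROM STUDENT WHERE GradYear IS NULL;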

Figure 3: Some Contents of the University Database. (Sciore, 2008)

Since NULL value behaviour is so different, some database theorists want to get rid of them altogether. However, they are sometimes the best practical way to represent that the information must exist but is (yet) unknown.

In contrast, attribute values which might or might not exist should be represented in some other way. Suppose we added student mobile phone numbers into our university database. If we added another attribute TelNo to the STUDENT table and allowed NULL values in it, then we would be implicitly claiming that every student does have a phone, but some students have kept their numbers secret. A better design choice would be to add instead a new table with schema

  MOBILE(SId, TelNo).                                              (1)

Then a student (represented by the ID) without a phone would have no rows in this table, whereas a student with many phones would have several. Moreover, since the university cannot use the information that "a student has a phone but its number is secret", the TelNo attribute can be declared to be non-NULL. That is, this new table represents the known mobile numbers.

Requirement 3 (NULL value constraints). The RDBMS must permit the table definition to declare whether a particular attribute can contain NULL values or not. It must enforce such a constraint by rejecting the insertion of a new row which would have a NULL value for an attribute which has been declared non-NULL.

The RDBMS maintains these declarations in its metadata alongside the table definition. However, we shall largely bypass NULL values and their problems in this course.

2.2 Keys

(Sciore, 2008)

Intuitively, some attributes of a table identify or name uniquely the individual x described by a row r, whereas its other attributes describe the other qualities of this x.

The database table T with attributes a_1, a_2, a_3, ..., a_m, b_1, b_2, b_3, ..., b_n satisfies the functional dependency (FD)

  a_1, a_2, a_3, ..., a_m → b_j                                    (2)

if for all possible rows r and s that might be in T we have the following: if r.a_1 = s.a_1 and r.a_2 = s.a_2 and r.a_3 = s.a_3 and ... and r.a_m = s.a_m, then also r.b_j = s.b_j. That is, the values for the attributes a_1, a_2, a_3, ..., a_m on the left-hand side (LHS) of the FD determine what the value for the attribute b_j on its right-hand side (RHS) must be.

Note that this FD concerns the intended meaning of table T in the database schema, not only the rows which T happens to contain just now.

For instance, the MOBILE table in Eq. (1) satisfies the FD

  TelNo → SId

because if r.TelNo = s.TelNo then also r.SId = s.SId, since the mobile phone company will not give two different students r and s the same mobile number (if we assume that two students do not share a common mobile).

Trivially,

  a_1, a_2, a_3, ..., a_m → a_i

for every a_i on its LHS. Transitively, if

  a_1, a_2, a_3, ..., a_m → b_1
  a_1, a_2, a_3, ..., a_m → b_2
  a_1, a_2, a_3, ..., a_m → b_3
  ...
  a_1, a_2, a_3, ..., a_m → b_n

and

  b_1, b_2, b_3, ..., b_n → c

then also

  a_1, a_2, a_3, ..., a_m → c.

We can introduce vector notation in FDs to shorten such indexed sequences into

  if ā → b̄ and b̄ → c̄, then also ā → c̄.                          (3)

Candidate and Primary Keys

Attributes a_1, a_2, a_3, ..., a_m form a candidate key of a table T if both of the following properties hold:

1. T satisfies the FD (2) for every attribute b_j of T.
2. If any of the a_i is taken away from its LHS, then property 1 no longer holds. (That is, every a_i on its LHS is really needed.)

If two rows r and s share the same values for all the LHS attributes a_1, a_2, a_3, ..., a_m, then the database cannot tell them apart: They share the same values also for all the other attributes b_j, by property 1, and their order in T does not matter, by requirement 2. Therefore we see that

- table T should really have just one copy of this row, not two; and

- each stored table should have candidate keys, to eliminate such duplicate rows.

Once the database designer has determined the candidate keys for a new table T, (s)he chooses one of them as its primary key. None of the attributes a_i of this chosen primary key is allowed to contain NULLs, by requirement 3, because they would make it impossible to check whether two rows are two copies of the same row or not.

What if T does not have any such natural candidate keys to choose? One solution is to say that T is "all key" and take all its attributes as the key. Another is to add an artificial identifier field to be the key. This is how the STUDENT, SECTION and ENROLL tables of our university database got their Id fields. Its DEPT and COURSE tables have these Id fields as well, even though they are not necessary: the department name and course title could have been chosen as keys instead. However, UEF must have course IDs, because we have both English and Finnish titles for the same course.

Requirement 4 (key constraints). The RDBMS must permit the table definition to also state which attributes shall be its primary key. It must enforce such a constraint as follows:

1. It must not permit any of these primary key attributes to have NULL values, via requirement 3.
2. It must reject adding another row with identical values for all these key attributes as an already stored row.

The RDBMS maintains this primary key information in its metadata alongside the table definition. The RDBMS can also generate unique values for artificial identifiers; it can for instance maintain counters in its metadata.

The chosen primary key attributes are often shown underlined in the schema of the table.

Only an actually stored database table has a primary key; a table which the RDBMS computes as an answer to show to the user does not. For instance, if we ask for just the students' names in the university database, then the answer will have duplicates, since several students can have the same name. Hence this answer table cannot have a key; it is not even "all key".
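In SQL, requirements 3 and 4 are stated as constraints in the table definition. A minimal sketch for the STUDENT table of Figure 2; the concrete data types are assumptions of ours, not taken from the book:

  CREATE TABLE STUDENT (
      SId      INT NOT NULL,     -- requirement 3: the key attribute may never be NULL
      SName    VARCHAR(20),
      GradYear INT,              -- may stay NULL while the student is still studying
      MajorId  INT,
      PRIMARY KEY (SId)          -- requirement 4: a duplicate SId is rejected
  );

The RDBMS will now reject any INSERT whose SId is NULL or equal to an already stored SId.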

Figure 4: Foreign Keys for the University Database. (Sciore, 2008)

Foreign Keys

An attribute a of a table T is a foreign key referencing another table U if its value r.a for a row r in table T names the row s in table U which corresponds to this row r, so that r.a = s.b, where attribute b is the primary key chosen for table U. If the primary key chosen for table U consists of multiple attributes b_1, b_2, b_3, ..., b_n, then the foreign key in table T consists of corresponding attributes a_1, a_2, a_3, ..., a_n, so that r.a_1 = s.b_1, r.a_2 = s.b_2, r.a_3 = s.b_3, ..., r.a_n = s.b_n. Intuitively, this s is "the U of this r".

Foreign keys are the central tool to glue together the two individuals x and y represented by the two rows r and s in the two relational tables T and U.

Figure 4 shows the foreign keys in our university database example. For instance, attribute MajorId of table STUDENT is a foreign key referencing table DEPT. Hence the attribute r.MajorId of a row r in STUDENT contains the primary key value s.DId of a certain row s in DEPT. That is, this department s is the department of the major subject of this student r. Hence Joe's major is computer science in Figure 3.

Requirement 5 (referential integrity). The RDBMS must permit defining the foreign key attribute(s) from one table into another. Moreover, it must enforce that if an attribute a of a table T is defined to be a foreign key of table U, and its value r.a in a row r in T is not NULL, then table U must contain a row whose primary key value equals this r.a.

In other words, if a row r of table T claims that there is some corresponding row s in table U, then this row s must indeed exist in table U. It is part 1 of requirement 4 applied to foreign keys. Compare to programming: a valid pointer is either NULL or it must point to some valid object.
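In SQL, such a foreign key is declared either inside the referring table's definition or added to it afterwards. A minimal sketch continuing the STUDENT example above (the constraint name is our own); the ON DELETE reactions that can be attached to such a declaration are discussed next:

  -- Inside CREATE TABLE STUDENT:
  --     FOREIGN KEY (MajorId) REFERENCES DEPT (DId)
  -- or added to an existing table:
  ALTER TABLE STUDENT
      ADD CONSTRAINT Student_Major_FK
      FOREIGN KEY (MajorId) REFERENCES DEPT (DId);   -- enforces requirement 5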

The RDBMS must react somehow if the user attempts to delete from table U a row s referenced by some rows r via the foreign key in table T, since this would violate requirement 5. The corresponding SQL definition ON DELETE ... states the reaction:

- IGNORE means that this attempt to delete row s from U will be rejected, because something must be done to the rows r in T first. The other reactions automate some common ways of doing something to these rows r first.
- CASCADE means the following: (1) this row s is deleted from its table U; (2) every referring row r will be deleted from table T; and (3) the RDBMS reacts to each of these deletions in step 2 as defined. This continues until requirement 5 is restored.
- SET NULL first sets the foreign key attributes in table T to NULL. That is, each modified row r will now say that there is no corresponding row s in table U. Of course, all these attributes must permit NULL values, via requirement 3.
- SET DEFAULT constant first sets the foreign key attributes in table T to the given constant instead of NULL. It must be the key of some row s' still in table U. That is, each modified row r will now refer to row s' instead of s.

The RDBMS can maintain these foreign key definitions and their ON DELETE ... definitions (if any) alongside the definition of the referring table T in the metadata.

Normalization

(Sciore, 2008, Chapter 3.6)

A central tool in systematic database design is normalization theory. This approach has developed normal forms (NFs) to guide the design of database tables. Each NF is designed to prevent some kinds of update anomalies: strange behaviour when data is updated. We shall now review FD-based NFs, which already prevent the most common update anomalies. Database theory literature has many more NFs based on generalizations of FDs, which prevent other, less often encountered anomalies.

The insight in FD-based normalization is the following: Suppose we have a database table with the schema T(ā, b̄). Table T is normalized if its FDs are exactly ā → b̄. Otherwise the design of table T still has some redundancy left, causing update anomalies, so further normalization is needed. This further normalization consists of splitting table T into two new tables connected together with foreign keys.

The Oath of the Relational Database Designer

  "I swear to construct my tables so that every non-key attribute depends on (provides a fact about) the key, the whole key, and nothing but the key, so help me Codd!"

Depending on "the key" ensures the First Normal Form (1NF), "the whole key" ensures the Second Normal Form (2NF), and "nothing but the key" ensures the Third Normal Form (3NF). In BCNF the "non-key attribute" condition is extended to every functionally dependent attribute.

1NF

Table T is in 1NF if its chosen primary key does indeed satisfy property 1 of candidate keys.

Consider as an example maintaining the following contact information:

- Person, as the key;
- Address for that person;
- Phone numbers for that person; the same person can have zero or more phone numbers.

A natural schema would be

  CONTACTS(Person, Address, SET OF PhoneNo)

since each person does indeed determine some set of corresponding phone numbers. However, our basic relational data model does not permit this: Each attribute permits only a single indivisible value, and not a compound value with inner structure. Such values would be permitted in so-called non-first normal form (NFNF) data models, which extend the basic model.

A possibility within the basic model might be to fix some upper limit p on phone numbers per person, and use the schema

  CONTACTS(Person, Address, PhoneNo_1, PhoneNo_2, PhoneNo_3, ..., PhoneNo_p)

where each attribute PhoneNo_i would be permitted to have NULLs to mean "this person does not have an ith phone number". However:

- Managing these separate attributes and their NULLs would be tedious.
- What if some well-connected person has more than p phone numbers?

That is, this table design would add a new technological restriction not present in the original situation we are trying to model into tables.

Another possibility might be to give up property 1 and use the schema

  CONTACTS(Person, Address, PhoneNo)

with a duplicate row for each phone number of a given person. However, this table design would implicitly allow the same person to have many different addresses. That is, it would not enforce even the FD

  Person → Address

of the original situation. This is an example of an update anomaly: The RDBMS would not be able to reject an update which would violate the intended meaning.

A solution in 1NF is to split the table into two with schemas

  CONTACTS(Person, Address)   and   PHONES(Person, PhoneNo)

where the new table is "all key". In practice, this solution also needs a fast way to find the phone numbers of a given person. The primary key index of the PHONES table does not help, so we must create a new clustering index too.

Requirement 6 (extra indexes). The RDBMS must permit defining new indexes, and it must maintain these defined indexes automatically as the database contents are modified. Unique indexes associate a single value to each key. Clustered indexes associate a group of values to each key.

The RDBMS maintains these new index definitions in the metadata.

2NF

2NF considers tables whose chosen primary key consists of two or more attributes, and requires that all the other attributes must depend on all of them. The schema

  T(ā, b̄, c, d)   with its FD   b̄ → c                             (4)

shows how 2NF is violated: attribute c depends only on the part b̄ of the whole key ā, b̄. Its solution is to split the table into two tables

  T(ā, b̄, d)   and   U(b̄, c)

where FD (4) has moved into the new table U. These two tables are connected by stating that the attributes b̄ of the old table T are a foreign key referencing the new table U.

As an example, consider the schema

  WORKED(Employee, Project, Department, Hours)                     (5)

read as "this employee has worked this number of hours on that project, which is one of the projects of that department". The violation of 2NF stems from the FD

  Project → Department                                             (6)

which represents the "which" part. A corresponding update anomaly is that the WORKED table permits the same project to belong to many different departments, despite FD (6). Its solution is the two tables

  WORKED(Employee, Project, Hours)   and   PROJECTS(Project, Department)   (7)

read as "this employee has worked this number of hours on that project" and "it is one of the projects of that department", where FD (6) has moved to the new PROJECTS table, and the attribute WORKED.Project is a foreign key referencing it.

3NF

3NF precludes transitive dependencies (3): Each attribute must depend directly on the key. The schema

  T(ā, b, c)   with its FDs   ā → b   and   b → c                  (8)

shows how 3NF is violated: attribute c does depend on the key ā as it should, but only via the non-key intermediate attribute b. Its solution is to split table T into two tables

  T(ā, b)   and   U(b, c)

where FD (8) has moved into the new table U. These two tables are connected by stating that the attribute b of the old table T is a foreign key referencing the new table U.

As an example, consider the schema

  WORKS(Person, Department, Address)

read as "this person works in that department, which is located at that address". The violation of 3NF stems from the FD chain

  Person → Department
  Department → Address                                             (9)

where FD (9) represents the "which" part. A corresponding update anomaly is that the WORKS table permits the same department to be located at many different addresses, despite FD (9). Its solution is the two tables

  WORKS(Person, Department)   and   LOCATION(Department, Address)

read as "this person works in that department" and "this department is located at that address", where FD (9) has moved into the new table LOCATION, and attribute WORKS.Department is a foreign key referencing it.

2.3 Integrity Constraints

(Sciore, 2008)

From the RDBMS perspective, integrity constraints are conditions which the database contents must satisfy in order to be in a consistent state. Requirement 5 is an example of such an integrity constraint: It states that updating the database must not be permitted to break the intended connections from one table into another. The database is only allowed to change from its current consistent state into another consistent state, as defined by the change operation (insertion, deletion, update) and the integrity constraints of the database.

From the database design perspective, integrity constraints describe what it means for the database to reflect the "reality" (whatever that means...) of its intended application area.

The specified integrity constraints are a part of the metadata.

Assertions

Figure 5: Checking assertions in a table definition. (Sciore, 2008)

Assertions are conditions which the database state must satisfy. If the result of a change operation would violate any constraint, then the RDBMS rejects the operation.

In SQL, such an assertion can be expressed with

  check condition

which tests the given truth-valued condition. Such a check can appear in an SQL table definition, where it states a condition which the attribute values of each row must satisfy, as in Figure 5.

The condition to check often has the form

  not exists query

which states that the result of this query must be empty. This lets us express constraints of the form "the database must never be allowed to contain any rows which would satisfy this query" or "the database contents must never be allowed to be as described in this query", as shown in Figures 6-8. These three examples show how we can define and name assertions outside tables:

  create assertion ItsName check condition

Figure 8 shows a named assertion involving two tables.
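In the style of these figures, a named assertion could look as follows. This is a sketch of the general pattern, not Sciore's exact example; it forbids ENROLL rows whose StudentId does not match any STUDENT, i.e. it restates requirement 5 as an assertion:

  CREATE ASSERTION EnrollHasStudent CHECK (
      NOT EXISTS (
          SELECT *
          FROM   ENROLL e
          WHERE  e.StudentId NOT IN (SELECT s.SId FROM STUDENT s)
      )
  );

Note that although CREATE ASSERTION is part of the SQL standard, many RDBMS products do not implement it.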

Figure 6: One example of a named SQL assertion. (Sciore, 2008)
Figure 7: Another example of a named SQL assertion. (Sciore, 2008)
Figure 8: An example of a named SQL assertion over two tables. (Sciore, 2008)

Triggers

Sometimes we do not want to reject a change operation, as assertions do, but to continue with other operations until the database is again in the kind of state we want it to be in. Examples are the ON DELETE IGNORE, CASCADE, SET NULL and SET constant options for handling deletions of rows referenced by foreign keys in section 2.2. They tell how to continue until the referential integrity of the database is restored again.

Triggers are similar rules defined by the DBA. Triggers are also called Event-Condition-Action rules, because they have these three parts:

- Event, because a trigger waits for a certain modification operation like insert, delete or update to happen;
- Condition, because a trigger has a condition which the RDBMS tests when its event happens, and this condition determines whether the trigger will fire or not;
- Action, because if the trigger fires, then the RDBMS performs these other operations.

So for ON DELETE these parts would be:

- Event: a delete operation on the table which is referenced by some foreign key from another table.
- Condition: a test of whether this would delete rows which are referenced by rows of this other table.
- Action: the given option ON DELETE IGNORE, CASCADE, SET NULL or SET constant.

In Figure 9, the university wants to permit several persons in its staff to modify course grades, but also wants to maintain a GRADE LOG table of "who changed what and when" for auditing purposes.

The trigger in Figure 10 enforces the American university policy that when a new student is inserted into the database, his/her forthcoming expected graduation year is no more than 4 years from now. However, note that the expected graduation year of an existing student can still be updated to violate this policy, because the trigger in Figure 10 applies only to insertion events.
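In the style of Figure 9, such an audit trigger could look roughly as follows. The trigger, table and column names are illustrative assumptions of ours, not Sciore's exact code, and trigger syntax varies somewhat between RDBMS products:

  -- Assumed audit table: GRADE_LOG(EId, OldGrade, NewGrade, ChangedBy, ChangedOn)
  CREATE TRIGGER LogGradeChange
  AFTER UPDATE OF Grade ON ENROLL            -- Event: a grade is modified
  REFERENCING OLD ROW AS o NEW ROW AS n
  FOR EACH ROW
  WHEN (o.Grade <> n.Grade)                  -- Condition: the grade actually changed
      INSERT INTO GRADE_LOG                  -- Action: record who changed what and when
      VALUES (n.EId, o.Grade, n.Grade, CURRENT_USER, CURRENT_TIMESTAMP);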

Figure 9: An example of an SQL trigger. (Sciore, 2008)
Figure 10: Another example of an SQL trigger. (Sciore, 2008)

Figure 11: The Schemata.

2.4 Different Viewpoints to Data

(Sciore, 2008, Chapters 1.7 and 4.5)

Although normalization makes sense from the DBA's viewpoint, because it avoids update anomalies, the resulting new table structure might make less sense than the original from the user's viewpoint. For instance, the user may well prefer the original WORKED table (5) over its normalized form (7) with the separate PROJECTS table, because (s)he may wish to be reminded of the department when reading the hours listing.

Hence we differentiate three separate levels of schemas for a database, as shown in Figure 11:

- The conceptual schema consists of the normalized tables derived from the class diagrams describing the application area for which this database has been designed.
- The physical schema implements the conceptual schema with concrete database table and index files. The RDBMS is responsible for maintaining them. A DBMS supports physical data independence if its users do not need to interact with it at this level.
- The external schemas implement the users' various views of the stored data on top of the conceptual schema. A DBMS supports logical data independence if its users can be given their own external schemas, so that they do not need to know the conceptual schema.

Data independence is desirable, because it shields the upper levels from changes in the lower levels.

Requirement 7 (views). To support logical data independence, the RDBMS must support defining views: virtual tables on top of actual tables.

A view can be either purely virtual, so that it exists only as a query Q which accesses the actual tables in the desired way, or materialized, so that the RDBMS maintains its current contents also in a separate actual table V. This information is redundant, since the contents of this table V could instead be created from the database by the defining query Q. In the good ol' times, already normalized tables were later denormalized by hand to provide such redundancy. Views are a better alternative, since the RDBMS can manage them automatically. But there is not (yet) any standard vendor-independent way to define a materialized view...

The user of an external schema should see its views just like ordinary tables. However, there is a difference: It might not be clear how a view can be updated, that is, how the RDBMS should handle insertions, deletions and updates to its rows, because these rows might not really exist. The intuition is that only those views are updatable whose defining query Q is so simple that the affected rows of its underlying actual tables can be determined (Connolly and Begg, 2010, Chapter 4.4.3) (Elmasri and Navathe, 2011, Chapter 5.3.3):

- If Q uses grouping or aggregation operations (explained next), then it is not updatable, since one row in the view is a combination of several rows of the underlying table(s).
- If Q uses more than one table, then the view is in general not updatable, since one row in the view is a combination of several rows, each from a different underlying table.
- If Q contains nested queries, then the view is not updatable, since the update might have to affect these nested queries too.
- If Q does not mention all the non-NULL attributes without default values of its only table, then it is not updatable, since the update would not specify the required values for these missing attributes.
- Otherwise the view can be updatable.

Another alternative which is becoming common in RDBMSs is to use stored procedures instead of view updates. A stored procedure is a combination of programming language and query language constructs, an RDBMS "subroutine". It is stored in the metadata. The user can invoke such a procedure, which the DBA has programmed to do "the right thing" when the user wants to update his/her view. Stored procedures offer more flexibility than plain view updates, where the RDBMS has to guess what "the right thing" to do would be when handling a view update.
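For instance, a user who prefers the original WORKED table (5) over its normalized form (7) could be given it back as a view over the normalized tables. A minimal sketch; the view name is an illustrative assumption of ours:

  CREATE VIEW WORKED_WITH_DEPT AS
      SELECT w.Employee, w.Project, p.Department, w.Hours
      FROM   WORKED w, PROJECTS p
      WHERE  w.Project = p.Project;

Because this defining query combines two tables, the view is by the rules above in general not updatable; it is meant for reading only.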

Grouping Data

The WORKED table example (5) also suggests another viewpoint to data: The user may also wish to list the total number of hours spent per project. The RDBMS could fulfill this wish as follows:

1. Sort the rows of the WORKED table according to its WORKED.Project attribute, via the sorting requirement 2.
2. For each distinct WORKED.Project attribute value p, add together the t.Hours values of all the rows t with t.Project = p. All these rows t are now adjacent to each other, by step 1.
3. Report to the user each value p and the corresponding sum computed in step 2.

Requirement 8 (grouping). The RDBMS must be able to group together related rows and summarize each group into a single representative accumulated value.

2.5 Transactions

The data grouping scenario at the end of section 2.4 shows that an RDBMS must control concurrent access to its contents: One employee x has asked the RDBMS for the listing of total hours per project, while other employees y, z, u, ... insert their own hours into the database at the same time. Which of these new hours will be included in the listing? Note that even when no concurrency is permitted, the RDBMS must somehow be able to enforce that as well. Hence RDBMS implementation includes aspects of concurrent programming.

The RDBMS must also be able to recover properly after a crash. Consider the following scenario:

- Suppose that a user deletes some row which is referenced by a foreign key, and this starts many other CASCADEd deletions in other parts of the database.
- Then the computer running the RDBMS crashes in the middle of these CASCADEd deletions.
- When the computer and RDBMS are restarted after the crash, the RDBMS must first somehow undo all those CASCADEd deletions which it managed to perform before the crash. (Or carry out the rest of them too, but this would be even harder.)
- Otherwise some of the CASCADEd deletions would be done while others would be left undone, and so the referential integrity of the database might be in danger!

The general problem is not in CASCADE: Consider instead a company payroll, a 2% raise to all, and a crash in the middle of computing the new salaries...

Hence RDBMS implementation includes aspects of fault-tolerant computing.

These two scenarios show that the RDBMS must maintain its consistent state in successive "snapshots":

1. The first grouping scenario showed that a query must be evaluated in some static snapshot of the database, and updating it cannot be permitted at the same time.
2. The second crash scenario showed that an update must take the database all the way from one snapshot into the next, even though this may mean many lengthy individual operations.

One concept subsumes both of these concurrency and recovery requirements for an RDBMS:

Requirement 9 (transactions). The RDBMS must permit defining transactions: sequences of operations which satisfy the 4 ACID properties.

Atomicity

A transaction must be an atomic (Greek: ατομος, atomos, "indivisible") unit of work: either every operation in the transaction is executed successfully, or none of them is. Accordingly, a transaction ends in either

- a commit, which means that it has managed to execute all its operations successfully, or
- an abort (also called rollback), which means undoing all the operations which it did manage to execute successfully, so that afterwards everything looks as if the transaction had never started at all.

Hence the abort operation is a very convenient abstraction for "cleaning everything up after an error occurred in the middle of a transaction", a very common programming pattern in fault-tolerant computing.

Atomicity solves the problem in our second crash scenario 2 as follows:

❶ The deletion issued by the user starts a new transaction t. All the CASCADEd operations it causes are also executed in this same transaction t.
❷ When the RDBMS is restarted after a crash, it detects that this transaction t did not manage to commit before the crash, so it aborts t.

In this way the RDBMS recovers from the crash back into the consistent state in which it was before step ❶ started. Hence atomicity is part of the recovery requirement.

Correctness

A transaction must be correct, in the sense that the state of the database after it has committed must again be consistent, as defined by its integrity constraints in section 2.3, although it can be inconsistent during the transaction. For instance, a deletion and all its CASCADEd operations in our second crash scenario 2 are executed in the same transaction t in step ❶. The referential integrity requirement 5 is temporarily broken during t, but is restored after committing or aborting t. In this way, the RDBMS uses transactions internally for its own operations like these CASCADEd deletions.

Hence correctness is part of the concurrency requirement: The RDBMS must ensure that the database always appears to be in a consistent state to the outside world, regardless of its internal state.

The RDBMS must also permit external application programs which use the database to specify their own transactions. The canonical example is: "Transfer X € from bank account Y into Z, if Y has enough money." In pseudocode:

  xfer(X,Y,Z):
  1    SELECT Balance FROM Bank WHERE Account = Y
  2    if Balance ≥ X
  3        UPDATE Bank SET Balance = Balance - X WHERE Account = Y;
  4        UPDATE Bank SET Balance = Balance + X WHERE Account = Z
  5    else what?

The RDBMS can run each of the three SQL statements on lines 1, 3 and 4 in its own internal transaction, but that would not be enough! Instead, lines 1-3 must be executed in the same transaction: Suppose some other execution of xfer(A,Y,B) executes concurrently between lines 1 and 2. Then the value of Balance retrieved on line 1 is already out-of-date on line 2. This other execution has transferred another A € out of account Y, which might leave less than X €, and then line 3 makes the Balance of account Y negative, even though this was exactly what we tried to avoid! Thus we would have a concurrency problem otherwise.
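In SQL, the application would wrap these lines into one explicit transaction. A minimal sketch under the assumption of a Bank(Account, Balance) table as above; the :X, :Y, :Z placeholders stand for values supplied by the application, and the exact statement for starting a transaction varies slightly between products:

  START TRANSACTION;                                            -- begin the unit of work
  SELECT Balance FROM Bank WHERE Account = :Y;                  -- line 1
  -- the application checks Balance >= X here                      line 2
  UPDATE Bank SET Balance = Balance - :X WHERE Account = :Y;    -- line 3
  UPDATE Bank SET Balance = Balance + :X WHERE Account = :Z;    -- line 4
  COMMIT;    -- or ROLLBACK, e.g. when the balance was too small (line 5)

Nothing becomes permanent before the COMMIT, and a ROLLBACK undoes all of it.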

Lines 3-4 must be executed in the same transaction too: Suppose that line 4 fails for some reason. Then we must abort line 3 too, because if we do not, then X € would disappear from account Y into nowhere. Thus we would have a recovery problem otherwise.

Hence lines 1-4 must all be executed in the same transaction.

Consider finally line 5. How do we want to report the error that account Y has less than X €? A good choice would be to abort the transaction (even though it has changed nothing in the Bank), because then an abort means "the money was not transferred, for some reason". Otherwise the transaction could commit in two ways, either with "the money was transferred" or with "there was not enough money to transfer", and the caller of xfer would then have to find out which of these two possibilities actually happened. When this xfer code is used as a small part of a large program which implements the business logic of the organization, this choice to abort becomes more and more attractive to the programmer.

Hence the RDBMS must permit external application programs to begin, commit and abort their own transactions, which may consist of several database and non-database operations. This is required because the database might also have more complex integrity constraints, like "money should not just disappear", which cannot be stated with just the RDBMS assertions and triggers.

Isolation

Transactions must be isolated from each other, in the sense that a transaction must not notice any of the other concurrently running transactions; instead, each transaction must see the database as if it were the only transaction using it. Hence isolation is the other part of the concurrency requirement (besides correctness).

Consider again our first grouping scenario 1. Employee x will get a listing of exactly those hours of the other employees y, z, u, ... whose insertion transactions t_y, t_z, t_u, ... have already been committed before x starts the listing transaction t_x. If such an insertion transaction is running at the same time as the listing transaction, they are isolated from each other. So t_x does not see those transactions of t_y, t_z, t_u, ... which are still running, because they might abort at the end, and must therefore not be listed!

Figure 12: Transaction isolation levels. (Sciore, 2008)

Isolation is the one ACID property which the user can relax, if (s)he tolerates possible inaccuracies in the answer to the query and wants the query to run faster. In other words, the user can "play fast and loose" by altering the transaction isolation level of his/her query, and accept the risks involved. These 4 levels are shown in Figure 12. Its middle column discusses a possible implementation, and we shall return to that column later.

Serializable is the full risk-free isolation of the ACID property. Every transaction should run at this level by default, and in most RDBMSs they do. In our first grouping scenario 1, the listing would contain the effects of only those transactions t_y, t_z, t_u, ... which committed before transaction t_x started.

Repeatable read is the next riskier level. The risk involved is phantoms: new rows which may appear in the database while the current transaction is running.

In our first grouping scenario 1, the listing might also contain some rows added by those transactions t_y, t_z, t_u, ... which committed during transaction t_x, but user x would not know which, because that depends on the concurrent execution order of these transactions t_x, t_y, t_z, t_u, .... This level is useful for transactions which modify an already existing row in the database, because phantoms do not affect that. This is why some RDBMSs (notably Oracle and Sybase) use it as the default isolation level instead of fully Serializable.

Read committed is riskier still. The new risk (in addition to phantoms) is nonrepeatable reads: If a transaction reads the same value twice from the database, then it may get different results, because another transaction has changed the value in between these two reads and committed. Note that the RDBMS may need to reread the same value repeatedly during query evaluation. In our first grouping scenario 1, the listing would include the Hours of some rows modified by those transactions t_y, t_z, t_u, ... which committed during transaction t_x, but again user x would not know which. This level would be OK for a transaction whose operations are unrelated to each other, in the sense that they could be executed in parallel as well as sequentially.

Read uncommitted is the riskiest level. The new risk (in addition to phantoms and nonrepeatable reads) is dirty reads: A transaction can read data as soon as another transaction writes it, even when this other writing transaction later aborts, so that its writes should not have happened at all. This is also very fast, because the reading transaction does not have to stop and wait for any other transactions. In our first grouping scenario 1, the listing would contain whatever was in the WORKED table when transaction t_x happened to read it. However, this level would be OK for read-only transactions whose results do not have to be exactly accurate. For instance, user x can run the listing transaction t_x at this level, if (s)he just wants to compute quickly some rough statistics about approximately how many Hours people have WORKED on each project.

Durability

Durability means that when a transaction commits, the changes it has made to the data are stored permanently, so that even a computer crash does not wipe them out. Hence durability is the other part of the recovery requirement (besides atomicity).
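In SQL, the application chooses the level for its next transaction with a statement along the following lines (a sketch of the standard syntax; keyword details vary a little between products):

  -- Full ACID isolation, the usual default:
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

  -- A faster but riskier choice for a rough, read-only statistics query:
  SET TRANSACTION READ ONLY, ISOLATION LEVEL READ UNCOMMITTED;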

Table 1: Java vs. RDBMS.

- Source: in Java, source code in a .java file; in an RDBMS, an SQL statement from the user (which might also be an application program). This is the declarative approach: what the result must be.
- Intermediate: in Java, the source gets compiled into corresponding Java object code in a .class file by the Java compiler; in an RDBMS, the statement gets translated into a corresponding Relational Algebra expression by the SQL parser of the RDBMS and optimized by its query optimizer. This is the procedural approach: how the result can be formed.
- Runtime: in Java, the object code gets executed by the Java Virtual Machine (JVM); in an RDBMS, the expression gets executed by the internal algorithms chosen by the RDBMS for each operation in the expression.

2.6 Relational Algebra

(Sciore, 2008, Chapter 4.2)

The THA course has already presented the Relational Algebra from its own viewpoint. Here we present it from the RDBMS viewpoint, as the intermediate language between the user-level Structured Query Language (SQL) and the internal algorithms with which the RDBMS can execute each operator of Relational Algebra, as shown in Table 1. Hence we present here a variant of the Relational Algebra which may be closer to the internals of the RDBMS than the one presented in THA.

We also assume that the idea of expressions as trees is familiar from the course Basic Models of Computation ("Laskennan perusmallit" or LAP in Finnish). In particular, when we say here that an argument of a Relational Algebra operator is a table, then it can be either an actual table saved in a database or another Relational Algebra expression whose value is that table. In both cases, this argument table has an associated schema. Recall that the result of a Relational Algebra operation is another table, and that this result has its own schema.

In mathematical presentations of Relational Algebra, these tables are considered to be sets of rows. Here we consider them to be bags, or multisets, of rows instead, because the results computed by RDBMSs have in general duplicate rows, unless they are explicitly suppressed.

Select

(Sciore, 2008, Chapter 4.2.1)

The select operator takes 2 arguments:

- the Table from which rows are selected, and
- a Predicate, which is any Boolean combination of Terms.

We assume that the Boolean operations and (∧), or (∨) and not (¬) are familiar from the course Discrete Structures ("Diskreetit rakenteet" or DSR in Finnish). Here each Term is

  Expression Comparison Expression

where a Comparison is =, <, >, ..., and an Expression consists of attribute names from the schema of the table argument, constants, and operations like +, -, .... Another kind of Term is

  AttributeName IS [NOT] NULL.

The result of select consists of those rows of its table argument for which the predicate is true. Hence its result has the same schema as the table argument.

For instance, the Relational Algebra expression

  Q3 = select(select(STUDENT, GradYear=2004), MajorId=10 or MajorId=20)

of our university example

1. first selects those rows of the STUDENT table where the GradYear attribute equals 2004, as the inner operation, and
2. then selects from them those rows where the MajorId attribute equals either 10 or 20, as the outer operation.

In this way, it selects the students who graduated in 2004 from either computer science or mathematics. Its corresponding expression tree is shown in Figure 13. Note how the result is conceptually computed "inside out": starting at the leaf nodes representing the actually stored tables, here STUDENT, and moving up towards the root, performing the Relational Algebra operator at each internal node.

This operation is often written as σ_predicate(table) in the database literature, σ being the Greek "s".
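For comparison, the same query in user-level SQL, which the RDBMS would translate into a select expression like Q3 (the SQL formulation here is ours, not Sciore's):

  SELECT *
  FROM   STUDENT
  WHERE  GradYear = 2004
    AND  (MajorId = 10 OR MajorId = 20);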

Figure 13: The Relational Algebra expression tree for Q3. (Sciore, 2008)

Project

(Sciore, 2008, Chapter 4.2.2)

The project operation takes 2 arguments:

- the Table whose rows are projected, and
- the Attributes onto which they are projected. Here we represent them as sets of attribute names from the schema of the table argument.

Its result has the same rows as its table argument, but its schema is restricted to consist only of these particular attributes; that is, we "forget" that the table argument has any other attributes than these. This result can contain duplicate rows, by the bag semantics.

For instance, the Relational Algebra expression

  Q6 = project(select(STUDENT, MajorId=10), {SName})

1. first selects all computer science students, and
2. then keeps only their names.

Its corresponding expression tree is given as Figure 14. Its result will in general have duplicate rows, one for each computer science student with that particular name.

This operation is often written as π_attributes(table) in the database literature, π being the Greek "p".
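In SQL, projection corresponds to listing only some attributes in the SELECT clause; the duplicates of the bag semantics remain unless they are suppressed explicitly with DISTINCT (again our own sketch):

  SELECT SName          FROM STUDENT WHERE MajorId = 10;   -- bag: names may repeat
  SELECT DISTINCT SName FROM STUDENT WHERE MajorId = 10;   -- duplicates removed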

Figure 14: The tree for Q6. (Sciore, 2008)

Sort

(Sciore, 2008, Chapter 4.2.3)

The sort operator takes 2 arguments:

- the Table whose rows are sorted, and
- a List of attributes from its schema according to which they are sorted. We write this list in brackets and separated with commas, [like,this].

Such a list [a_1, a_2, a_3, ..., a_p] defines a lexicographic sorting order: To compare two rows t and u, find the smallest index i such that t.a_i ≠ u.a_i, and let that decide which of them appears before the other in the result. If there is no such i, then either row can appear before the other. Furthermore, this order can be either ascending (or "normal") or descending (or "reversed", largest first).

The result is sorted according to this order. That is, the result is now an ordered bag. It has the same schema as the table argument. Because row order must not matter otherwise, sort is usually the last (topmost, root) operator in the expression, and it is used only for displaying the result to the user. It fulfills one part of requirement 2.

For instance, the expression

  Q8 = sort(STUDENT, [GradYear, SName])

sorts and displays the students

1. first ordered by their graduation year, earliest first, and
2. then alphabetically by name within each year.

Rename

(Sciore, 2008, Chapter 4.2.4)

The rename operator takes 3 arguments:

- the Table to have one of its attributes renamed,
- the Attribute from its schema that is renamed, and
- the New name for this attribute.

Its result is the same table argument, except that the attribute argument is now called by this new name in its schema. Relational Algebra also contains operators with two table arguments, as we shall soon see. We must sometimes rename their attributes apart from each other first, to make clear which of these two table arguments contains a particular attribute.

Extend

(Sciore, 2008, Chapter 4.2.5)

The extend operator takes 3 arguments:

- the Table to extend with a new attribute,
- an Expression to compute the value for this new attribute, as in selection, and
- a New name for this new attribute, one which does not yet appear in the schema of the table argument.

For instance, the expression

  Q11 = extend(STUDENT, GradYear-1863, GradClass)

1. goes through the STUDENT table, row by row, and
2. for each row t, computes its c_t = t.GradYear - 1863, counted since the birth of the university in 1863, and
3. adds this c_t as the value of the new GradClass attribute in the result schema into row t.

GroupBy

(Sciore, 2008, Chapter 4.2.6)

The groupby operator takes 3 arguments:

- the Table whose rows are grouped together,
- the Attributes according to which they are grouped, as a set of names from the schema of the table argument, and
- the Expressions whose values are computed as summaries for each group. This is a set of expressions. The functions on attributes in these Expressions are ones which make sense for a whole (nonempty) group of values, such as their Sum, Maximum, ...

This operation handles requirement 8.

The schema of its result consists of every grouping attribute mentioned in that argument, and, for each expression mentioned in that argument, a new attribute named in some implementation-dependent way. For instance, if the expression is Max(AttrName), then this new attribute could get the name MaxOfAttrName, and so on.

The contents of its result can be formed as follows:

1. The rows in the table argument are partitioned into groups, so that two rows t and u are in the same group exactly when t.a = u.a for every attribute a mentioned in the attribute argument.
2. Each such group g generates one tuple t_g into the result. The value t_g.a will be this common attribute value of g, for every attribute a mentioned in the attribute argument.
3. This tuple t_g will also be extended with the values of each of the expressions mentioned in that argument. Here these values are now computed by considering all the rows in g together. Hence they summarize the whole group g. (In contrast, extend computed its new values individually, row by row.)

For instance, the expression

  Q12 = groupby(STUDENT, {MajorId}, {Min(GradYear), Max(GradYear)})

1. groups together the students with the same major, and
2. computes the minimum and maximum graduation year for each group.

Its output is in Figure 15. On the other hand, the expression

  Q13 = groupby(STUDENT, {MajorId, GradYear}, {Count(SId)})

specifies two grouping attributes, MajorId and GradYear, and so its result in Figure 16 tabulates how many graduates each major subject has had each year.

If the attribute argument is empty, then the whole table argument forms a single group, which gets summarized into a single row:

Figure 15: The output for Q12. (Sciore, 2008)
Figure 16: The output for Q13. (Sciore, 2008)

  Q14 = groupby(STUDENT, {}, {Min(GradYear)})

computes the earliest graduation year of any student.

If the expression argument is empty, then groupby just groups the rows of the table argument and removes duplicates:

  Q15 = groupby(STUDENT, {MajorId}, {})

lists all the distinct majors of all students.

The functions in the expression argument come in two flavours. For instance:

- Q16 counts how many students there are with known major subjects; aggregation ignores NULL values, because it is not clear which group they should belong to.
- Q17 counts instead how many distinct major subjects the students have; each major subject is now counted only once, whereas Q16 added 1 to the count for each student.

  Q16 = groupby(STUDENT, {}, {Count(MajorId)})
  Q17 = groupby(STUDENT, {}, {CountDistinct(MajorId)})

Product

(Sciore, 2008, Chapter 4.2.7)

The fundamental tool for combining tables is the product operator. It takes 2 arguments: one table and another table, such that their schemas have no attribute names in common, so we rename them apart first, if necessary. The result of product(T,U) consists of all these combinations:

Figure 17: The result of Q22 = product(STUDENT, DEPT). (Sciore, 2008)

1. let the result be initially empty;
2. for each row r in table T
3.     for each row s in table U
4.         form a new row by combining rows r and s into one;
5.         add this new row into the result.

Figure 17 shows an example. The schema of the result consists of the schemas of its two table arguments together, because they are assumed to be renamed apart from each other.

This operation is often written as

  T × U

Figure 18: The expression tree for Q23. (Sciore, 2008)

in the database literature, because if the tables T and U are sets, then the result is their Cartesian product.

The product operator is on the one hand fundamental, because with it we can combine tables in every way we may want to, but on the other hand impractical, because it is very slow to compute, because its result is so big, and (almost) always we want to combine tables with much more precision than "all rows r from table T with all rows s from table U". For instance, if the attributes b_1, b_2, ..., b_n of the table with schema T(ā, b_1, b_2, ..., b_n) are a foreign key referencing another table with schema U(c̄, d̄), then we almost always want for each ā only the corresponding d̄:

  select(product(T,U), b_1 = c_1 and b_2 = c_2 and ... and b_n = c_n).

In our university example, we may want to combine students and their majors in this way:

  Q23 = select(product(STUDENT, DEPT), MajorId=DId)

Then its result also contains the attribute DName, which gives the name of the major; the MajorId had the same information only as an artificial ID. Figure 18 shows the expression as a tree.

Join

These are examples of join operations. They have 3 arguments: one table, another table, and a predicate, as in selection. They are so common and useful that they warrant their own shorthand notation:

  join(T,U,φ) = select(product(T,U), φ)

Conversely, a product is a join whose comparison predicate φ is always true.

When the comparison predicate φ compares attributes for equality, as in the b_1 = c_1 and b_2 = c_2 and ... and b_n = c_n above, the join is called an equijoin. We focus mainly on them here. When an equijoin is used to traverse the foreign key from table T into table U, as here, it is called a relationship join.

As an example of joining multiple tables together, let us find out the grades Joe received in 2004:

  Q25 = select(STUDENT, SName='joe')
  Q26 = join(Q25, ENROLL, SId=StudentId)
  Q27 = select(SECTION, YearOffered=2004)
  Q28 = join(Q26, Q27, SectionId=SectId)
  Q29 = project(Q28, {Grade})

Q26 finds the courses to which Joe has ENROLLed. This needed his student ID, via Q25. Q28 finds his ENROLLments during 2004. This needed the SECTIONs offered then, via Q27. Figure 19 is its expression tree.

In general, if we must combine m tables, then we need m - 1 joins.
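In user-level SQL, such joins are written by listing the tables and stating the join predicates; the RDBMS translates this into join operations like the ones above. A sketch of Q23 and of the Q25-Q29 query (the SQL formulations are our own):

  -- Q23: each student combined with his/her major department
  SELECT *
  FROM   STUDENT s, DEPT d
  WHERE  s.MajorId = d.DId;

  -- Q25-Q29: the grades Joe received in 2004
  SELECT e.Grade
  FROM   STUDENT s, ENROLL e, SECTION k
  WHERE  s.SName = 'joe'
    AND  e.StudentId = s.SId
    AND  k.SectId = e.SectionId
    AND  k.YearOffered = 2004;

Combining the three tables indeed needed two join conditions, in line with the m - 1 rule above.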

40 These are examples of join operations. They have 3 arguments: one table and another table and a predicate as in selection. They are so common and useful that they warrant their own shorthand notation: join(t,u,φ) select(product(t,u),φ) Conversely, a product is a join whose comparison predicate φ is always true. When the comparison predicate φ is comparing attributes for equality, as in this b 1 = c 1 and b 2 = c 2 and...and b n = c n here, the join is called an equijoin. We focus mainly on them here. When an equijoin is used to traverse the foreign key from table T into table U, as in here, it is called a relationship join. As an example of joining multiple tables together, let us find out the grades Joe received in 2004: Q25 = select(student,sname= joe ) Q26 = join(q25,enroll,sid=studentid) Q27 = select(section,yearoffered=2004) Q28 = join(q26,q27,sectionid=sectid) Q29 = project(q28,{grade) Q26 finds the courses to which Joe has ENROLLed. This needed his student ID via Q25. Q28 finds his ENROLLments during This needed the SECTIONs offered then via Q27. Figure 19 is its expression tree. In general, if we must combine m tables, then we need m 1 joins. 38

Figure 19: The expression tree for Q25 - Q29. (Sciore, 2008)

Semijoin (Sciore, 2008, Chapter 4.2.8)

A semijoin has the same 3 arguments as join. However, its result is different: It consists of those rows r of the first table T for which there exists some matching row s in the second table U; that is, such that r and s together satisfy the join predicate φ. But none of the attributes of this matching row s are included in the result.

It is similar to the selection operation, except that now the rows r of table T are chosen into the result based on the other table U, whereas selection chose the rows r based on the attribute values in each row r itself.

This semijoin(T,U,φ) can be implemented with other operations: project(join(T,U,φ), the attributes of T).

As an example, let us find the students taught by prof. Einstein:

Q38 = select(SECTION, Prof = 'einstein')
Q39 = semijoin(ENROLL, Q38, SectionId = SectId)

Figure 20: The expression tree for Q38 - Q40. (Sciore, 2008)

Q40 = semijoin(STUDENT, Q39, SId = StudentId)

Q39 chooses those ENROLLments whose section IDs are found among the SECTIONs taught by him, as given by Q38. Q40 chooses those STUDENTs whose student IDs are found in Q39. Figure 20 shows it as an expression tree.

Antijoin

The antijoin operator is the complement of the semijoin operator: Its result consists of those rows r of table T for which there does not exist any matching row s in table U.

In contrast to semijoin, this antijoin operation cannot be implemented with our previous operations: All of them can be shown to be monotone in their table argument(s): If we add more rows into the argument(s), then the result is at least as large as before. Moreover, it can be shown that combining monotone operations always yields a monotone result as well. But antijoin(T,U,φ) is antimonotone in its second table argument U: If we add more rows into U, then the result can get smaller than before! Hence we cannot produce this antimonotone result by any combination of our other monotone operations.
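The difference between semijoin and antijoin is easiest to see side by side. Here is a minimal in-memory sketch in the same illustrative style as the earlier product and join sketches: semijoin keeps a row of T when a matching row exists in U, antijoin keeps it when no matching row exists.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

public class SemiAntiJoinSketch {
    // semijoin(T, U, phi): rows r of T for which SOME row s of U satisfies phi(r, s).
    static List<Map<String, Object>> semijoin(List<Map<String, Object>> t,
                                              List<Map<String, Object>> u,
                                              BiPredicate<Map<String, Object>, Map<String, Object>> phi) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> r : t) {
            boolean matchExists = u.stream().anyMatch(s -> phi.test(r, s));
            if (matchExists) {
                result.add(r);               // only r itself, no attributes of s
            }
        }
        return result;
    }

    // antijoin(T, U, phi): rows r of T for which NO row s of U satisfies phi(r, s).
    static List<Map<String, Object>> antijoin(List<Map<String, Object>> t,
                                              List<Map<String, Object>> u,
                                              BiPredicate<Map<String, Object>, Map<String, Object>> phi) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> r : t) {
            boolean matchExists = u.stream().anyMatch(s -> phi.test(r, s));
            if (!matchExists) {
                result.add(r);
            }
        }
        return result;
    }
}

The antimonotonicity argument above is visible directly in the code: adding rows to u can only turn matchExists from false to true, so it can only remove rows from the antijoin result, never add them.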

We need this antijoin for queries whose form is "there does not exist any x such that ...". For instance, a SECTION of a course was easy if no ENROLLed student got the failing grade F. In other words: if there does not exist any ENROLLed student who got an F in this SECTION. In our Relational Algebra this is

Q42 = select(ENROLL, Grade = 'F')
Q43 = antijoin(SECTION, Q42, SectionId = SectId)

or "keep only those SECTIONs which do not appear in the table Q42 of ENROLLments which got an F".

We need antijoin also for queries whose form is "something holds for every x". This is because "φ is true for every x" is logically equivalent to "there exists no x for which φ is false", or symbolically: ∀x.φ is logically equivalent to ¬∃x.¬φ.

For instance, a professor is stern if (s)he has given at least one grade F in every SECTION (s)he has ever taught. In our Relational Algebra this is

Q49 = rename(Q43, Prof, BadProf)
Q50 = antijoin(SECTION, Q49, Prof = BadProf)
Q51 = groupby(Q50, {Prof}, {})

or "keep only the professors of those SECTIONs whose professor has never taught an easy SECTION" (where the previous query Q43 retrieved the easy sections). Figure 21 shows its expression tree.

Note: These double negations can be tricky to read and write! It helps to know something about logic.
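As a small usage example of the antijoin helper sketched earlier (again over illustrative in-memory tables, not SimpleDB code), the easy-SECTIONs query Q42 - Q43 above could be composed like this:

import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class EasySectionsSketch {
    // Q42: the ENROLL rows with the failing grade F.
    // Q43: the SECTION rows with no such ENROLL row, i.e. an antijoin.
    static List<Map<String, Object>> easySections(List<Map<String, Object>> sections,
                                                  List<Map<String, Object>> enrollments) {
        List<Map<String, Object>> q42 = enrollments.stream()
                .filter(e -> "F".equals(e.get("Grade")))            // select(ENROLL, Grade = 'F')
                .collect(Collectors.toList());
        return SemiAntiJoinSketch.antijoin(sections, q42,           // antijoin(SECTION, Q42, ...)
                (sec, enr) -> Objects.equals(sec.get("SectId"), enr.get("SectionId")));
    }
}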

Figure 21: The tree for stern professors. (Sciore, 2008)

Union (Sciore, 2008, Chapter 4.2.9)

The union operator takes 2 arguments: one table and another table which has the same schema; we can rename appropriately first if needed. Its result has also the same schema, and consists of all the rows which appear in at least one of these two table arguments. Hence union(T,U) is similar to T ∪ U in mathematics.

However, union is not needed very often, because we rarely want to know "what information table T or table U contains"; in most situations, we want to know what information we can get by joining them somehow instead. One use is to coalesce similar values together. For instance

Q52 = rename(project(STUDENT, {SName}), SName, Person)
Q53 = rename(project(SECTION, {Prof}), Prof, Person)
Q54 = union(Q52, Q53)

combines both STUDENTs (in Q52) and professors (in Q53) together as Persons, because here a person is either a student or a professor.

Outer Join

The union operator is most commonly used as part of the outer join operator. This outerjoin operator has the same 3 arguments as the join operator. Its result consists of the result of the corresponding join operation, together with (here is the union) all the rows from the two argument tables which did not match the join predicate, with their missing attribute values filled with NULLs (which of course must be permitted by requirement 3). That is, an outerjoin is a join which does include NULLs, because their unknown actual values might have matched the join predicate.

For instance, we may want to see all the current ENROLLments together with all the STUDENTs who have not ENROLLed into anything yet:

Q55 = outerjoin(STUDENT, ENROLL, SId = StudentId)

Figure 22: The result of Q55. (Sciore, 2008)

From this we can count the number of ENROLLments for each STUDENT:

Q58 = groupby(Q55, {SId}, {Count(EId)})

Now a STUDENT with no ENROLLments yet is alone in his/her own group, and since the Count aggregation function ignores the NULL EId value in his/her own group, its value will be 0, as it should. If we had used just ENROLL instead of Q55 in Q58, then we would have missed these STUDENTs with 0 ENROLLments.

In general, there are 3 kinds of outer joins:

Full outer joins, as described here, whose result consists of all rows from both table arguments, with NULLs for those attributes for which no matching row existed in the other table argument.

Left outer joins, whose result consists of all rows from the first table argument, with NULLs for those attributes for which no matching row existed in the second table argument. This Q55 is such a left outer join, because it follows the foreign key from STUDENT into ENROLL, and so each NULL is for a STUDENT without any ENROLLments, and they are all at the right end of the result in Figure 22; whereas there are no ENROLLments without STUDENTs, which would cause NULLs at the left end of the result.

Right outer joins, symmetrically.

2.7 Structured Query Language

The Structured Query Language (SQL) is the standard language for interacting with an RDBMS. It (like all proper DBMS languages) contains 3 main sublanguages:

Data Definition Language (DDL) for defining the elements of the current database schema.

Data Manipulation Language (DML) for populating the tables of the defined schema with rows.

Query Language (QL) for retrieving the information stored in these database table rows in various ways.

2.7.1 Data Definition Language (Connolly and Begg, 2010, Chapter 7.3) (Sciore, 2008, Chapter 2.6)

The CREATE command adds new elements into the database schema, like
- tables, as in Figure 5,
- integrity constraints, like the assertions in Figures 6-8 and the triggers in Figures 9-10,
- views, whose creation consists essentially of giving the defining query Q, and
- indexes on a table and its attributes (in parentheses, separated by commas), like in Figure 23.

Figure 23: Index creation commands. (Sciore, 2008)

The SQL DDL user can ALTER these CREATEd tables and VIEWs (by ADDing and DROPping COLUMNs and integrity constraint ASSERTIONs) later, and DROP them altogether when they are no longer needed. The SQL DDL user can also CREATE and DROP whole SCHEMAs, because the same RDBMS offers different schemas for different users.

2.7.2 Query Language (Sciore, 2008, Chapter 4.3)

Let us review the main (but not nearly all!) query features of SQL, and relate them to our Relational Algebra presented in section 2.6, because here our aim is to understand how an SQL query gets executed by the RDBMS. The SQL query statement has the form

SELECT [DISTINCT] attributes
FROM tables
[WHERE predicate]
[GROUP BY grouping [HAVING predicate]]
[ORDER BY ordering]

where each [bracketed] part is optional.

The SELECT Part (Sciore, 2008, Chapter 4.3.3)

SQL SELECT is the projection operator of Relational Algebra, not selection, despite its name. Its optional DISTINCT qualifier removes duplicate rows from the result using the appropriate groupby operator.

Its attributes are a comma-separated list of FullNames having the form

RangeVar.AttrName

where
- RangeVar is the range variable for some table T, declared in the FROM part to be explained next.
- AttrName is the name of some attribute in this table T. Or it can be * instead; this shorthand expands into all the attributes of table T.

Such a FullName stands for the attribute value r.AttrName for the current row r of table T.

Besides these names, the attributes can also contain

Expression AS NewAttrName

forms. These denote in turn extending the result with this new named attribute, whose value for each row r is obtained by evaluating this Expression. A common use for this form is OldAttrName AS NewAttrName, which essentially renames an old attribute.

The FROM Part (Connolly and Begg, 2010, Chapter 6.3.7) (Sciore, 2008, Chapter 4.3.4)

The tables in the FROM part are a comma-separated list of

TableName RangeVar

forms. Such a form declares that this RangeVar stands for the current row r of TableName. If none of the other TableNames in this FROM part have any attribute names in common with this one, then this RangeVar (and the ".") can be omitted from FullNames, because then their AttrNames are enough to determine that they mean this table.

This TableName can also be another nested SELECT... FROM... WHERE... query (in parentheses). Then its RangeVar ranges over the result rows of this nested query. These nested queries permit one possible implementation for the view from section 2.4: If the TableName is a view, then put its defining query (Q) in its place.

The corresponding Relational Algebra expression is the product of all TableNames and nested queries in this FROM part.

It is also possible to write different kinds of joins in this FROM part with the syntax

first table [FULL or LEFT or RIGHT or NATURAL or CROSS or ...] JOIN second table ON predicate

so Q55 could be written in SQL for instance like

SELECT *
FROM STUDENT s LEFT JOIN ENROLL e ON s.SId = e.StudentId

whose result would then use a row of NULLs for those STUDENT rows s which do not possess any matching ENROLLment rows e.
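To make the NULL-filling behaviour of this left outer join concrete, here is a minimal in-memory sketch in the same illustrative row-as-map style as the earlier operator sketches (it shows only the semantics, not how an RDBMS would actually evaluate the query):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

public class LeftOuterJoinSketch {
    // leftouterjoin(T, U, phi): every row of T appears at least once;
    // if it has no match in U, the attributes of U are filled with NULLs.
    static List<Map<String, Object>> leftOuterJoin(List<Map<String, Object>> t,
                                                   List<Map<String, Object>> u,
                                                   List<String> uAttributes,
                                                   BiPredicate<Map<String, Object>, Map<String, Object>> phi) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> r : t) {
            boolean matched = false;
            for (Map<String, Object> s : u) {
                if (phi.test(r, s)) {
                    Map<String, Object> combined = new HashMap<>(r);
                    combined.putAll(s);
                    result.add(combined);
                    matched = true;
                }
            }
            if (!matched) {                          // a STUDENT without any ENROLLment, say
                Map<String, Object> padded = new HashMap<>(r);
                for (String attr : uAttributes) {
                    padded.put(attr, null);          // NULL for each attribute of U
                }
                result.add(padded);
            }
        }
        return result;
    }
}

This is also why the earlier counting query Q58 works: the Count aggregation function simply ignores these padded NULL EId values.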

The WHERE Part (Sciore, 2008, Chapters ... and 4.3.8)

The optional WHERE part corresponds to the selection operation on this predicate from the big product of the FROM part. If this part is missing, then WHERE true is assumed instead.

A particularly common special case is when the predicate is a conjunction (that is, all ANDs but no ORs) of Terms with the form

one FullName = another FullName

because this is an equijoin of the FROM part. An example of such a query is "the grades Joe received during his graduation year":

SELECT e.Grade
FROM STUDENT s, ENROLL e, SECTION k
WHERE s.SId = e.StudentId AND e.SectionId = k.SectId
  AND k.YearOffered = s.GradYear AND s.SName = 'Joe'

Its direct corresponding Relational Algebra expression is

project(select(product(product(STUDENT,ENROLL),SECTION),
               s.SId = e.StudentId AND e.SectionId = k.SectId
               AND k.YearOffered = s.GradYear AND s.SName = 'Joe'),
        {e.Grade})

but the RDBMS query optimizer can improve it further into Figure 24. This optimization has consisted of

1. considering each Term of the selection predicate separately (this is permitted, because it is a conjunction), and
2. moving each Term down towards the actual tables for as far as it will go, and
3. using each moved Term as a join predicate.

The predicate can also contain another nested query with

FullName [NOT] IN (Query)

which is true if the current value of FullName is in the result of this nested Query. That is, this kind of Term specifies a semijoin, while

its optional NOT specifies an antijoin instead, with the result of this Query on this FullName.

Figure 24: The Relational Algebra tree for Joe's final year grades. (Sciore, 2008)

Another kind of nested query is

[NOT] EXISTS (Query)

which is true if the result of this Query has [no] rows. It is also a semi- or antijoin, but without the FullName. We have already used it in our assertions in Figures 6-8.

The GROUP BY Part (Sciore, 2008, Chapters ...)

The optional GROUP BY part turns the SELECT from a projection into a groupby operation, whose 3 arguments come from the following places:

The table comes from the FROM... WHERE... parts, which cannot therefore use any values produced by groupby, because it takes place only after this joining.

The attributes come from the comma-separated grouping list of FullNames from the table argument.

The expressions come from the SELECT part, which can therefore contain only grouping attributes and aggregate function calls on attributes of the table argument (AS new attributes), with optional DISTINCTness directives.

The optional HAVING part permits testing a WHERE-like condition after the groupby operation, so it can use the produced values.
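For a concrete feel of "grouping first, HAVING afterwards", consider the standard-SQL query SELECT MajorId, COUNT(SId) FROM STUDENT GROUP BY MajorId HAVING COUNT(SId) >= 10 (standard SQL, not the restricted SimpleDB dialect described later; the threshold 10 is an arbitrary illustration). The following in-memory Java sketch computes the same thing, making it visible that the HAVING filter only sees the values produced by the grouping:

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupBySketch {
    // First group the STUDENT rows by MajorId and aggregate with Count,
    // then apply the HAVING condition to the produced counts.
    static Map<Object, Long> largeMajors(List<Map<String, Object>> students) {
        Map<Object, Long> countsPerMajor = students.stream()
                .filter(r -> r.get("MajorId") != null)               // aggregation ignores NULLs
                .collect(Collectors.groupingBy(r -> r.get("MajorId"),
                                               Collectors.counting()));
        // HAVING COUNT(SId) >= 10: a filter applied only after the grouping.
        return countsPerMajor.entrySet().stream()
                .filter(e -> e.getValue() >= 10)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}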

The ORDER BY Part (Sciore, 2008, Chapter ...)

The optional ORDER BY part specifies a sorting operation as the very last step of the whole query. Its ordering is a comma-separated list of AttrNames from the SELECT part, without their possible RangeVars, because the output of the whole query is sorted, not the tables in its FROM part. An AttrName in this list can be optionally followed by DESCending to indicate that it must be sorted in the opposite order.

Combining SELECTion Statements (Connolly and Begg, 2010, Chapter 6.3.9) (Sciore, 2008, Chapter 4.3.9)

SQL also permits the set-theoretical operations
- UNION for T ∪ U, which corresponds to the Relational Algebra union operator,
- INTERSECT for T ∩ U, which can be expressed with a suitable semijoin in Relational Algebra, and
- EXCEPT for T \ U (set difference, or the part of T which does not belong to U, sometimes denoted with a minus sign instead), which can be expressed with a suitable antijoin in Relational Algebra,
between the two (parenthesized) SELECT... FROM... WHERE... queries which produce the two result tables T and U.

2.7.3 Data Manipulation Language (Connolly and Begg, 2010, Chapter ...) (Sciore, 2008, Chapter 4.4)

The SQL statement to insert one new row into a table is

INSERT INTO TableName [(AttributeList)] VALUES (ValueList)

whose
- AttributeList lists the names a1, a2, a3, ..., an of some attributes of TableName. A missing list means all its attributes.
- ValueList lists the values v1, v2, v3, ..., vn given to these named attributes.

The other attributes of the new row receive NULL or default values, as prescribed by the table definition. SQL can also insert many new rows by replacing the VALUES part with a database Query.

The SQL command to delete rows from a table is

DELETE FROM TableName WHERE predicate

whose predicate chooses the rows to delete, based on their attribute values, as in a Query.

The SQL command to update rows in a table is

UPDATE TableName SET AssignmentList WHERE predicate

whose predicate chooses the rows r to update as before, and AssignmentList is a comma-separated list of

AttrName = Expression

forms. Such a form means that r.AttrName is updated into the value of its Expression.

3 Client-Server Database Architecture (Sciore, 2008, Chapter 7)

An RDBMS is usually organized as a client-server architecture:

The clients are other computers. A client
- connects to the server across the network
- runs some application program which interacts with the RDBMS on the server by sending SQL commands and receiving their results.

The server is a separate computer running the actual RDBMS as one operating system (OS) process. It
- handles concurrent communication with its clients via transactions, where each transaction runs as its own OS thread within the RDBMS process
- executes the SQL commands it receives from its clients in these threads
- is the component which manages the actual database at the physical level of files
- keeps the database in a consistent state via transaction recovery.

This client-server architecture is also used on a single computer, so that the clients are other processes running in the same computer as the RDBMS process. This architecture separates the front end of a database application program (which handles the user interface and the part of the business logic of the organization which cannot be represented with database integrity constraints) in the client from its

back end in the server, which provides the common database part for all such applications.

There are also distributed (R)DBMSs: The database is divided among more than one server, which serve the clients together. They are very important, especially on the web. However, this course concentrates only on the classical one-server RDBMSs. Non-client-server architectures can be used instead when a single application owns the whole database privately.

3.1 Installing and Running SimpleDB (Sciore, 2008, Chapter 7)

Here are the general steps for getting the SimpleDB RDBMS up and running on your computer. How each step is carried out in a particular OS is left as an exercise to the reader...

1. Download its latest version from the SimpleDB home page and unzip it. (The version used here is the one current at the time of writing.)

2. Move the unzipped simpledb subdirectory into the serverdirectory where you want the server-side software to be.

3. Ensure that this serverdirectory is in your CLASSPATH environment variable, so that your java can find it. (This SimpleDB version assumes a recent enough Java version.)

4. Ensure that the current working directory "." is in CLASSPATH too (it may already be).

The SimpleDB server-side software should now be installed. The server process can be started as follows:

5. Start the rmiregistry program as another process. This program is part of the Java SDK, which you should already have. It is the Remote Method Invocation (RMI) registry: the "phone directory" for Java methods which can be called from other processes, even across the network. The SimpleDB server registers its public methods there, so that its client processes can invoke them to ask the server to perform database operations.

6. Start the server process with the

java simpledb.server.Startup databasename

command. If your home directory contains a subdirectory named databasename, then the server will continue using the already created database there. If the server starts OK, then you will see the messages

recovering existing database
database server ready

where the server first recovers databasename into a consistent state, because it may have ended abnormally. (For instance, its previous server process may have been killed.)

Otherwise databasename will be created as a new empty database. If the server starts OK, then you will see the messages

creating new database
new transaction: 1
transaction 1 committed
database server ready

where this 1st transaction created the empty database.

This databasename determines the only schema the server will use: SimpleDB does not support multiple schemas at the same time.

Note: If you want to kill and restart the SimpleDB server process, then kill rmiregistry first and wait a while before restarting it; otherwise you might get an RMI error instead.

The SimpleDB server process should now be running.

7. You can try it out for instance with the example client programs in the unzipped studentclient/simpledb/ subdirectory:

CreateStudentDB.java creates the university database, our running example. It shows how to CREATE tables and INSERT rows into them.

FindMajors.java lists all the STUDENTs majoring in the given department and their graduation years. It can be run with the command: java FindMajors department

StudentMajor.java lists all STUDENTs and their majors.

ChangeMajor.java UPDATEs Amy's major subject into drama.

SQLInterpreter.java is a simple interactive SQL shell for SELECTion queries and row UPDATEs.

SimpleDB implements only a very small subset of SQL:

- The SELECT part of a query has just an attribute name list: no *, AS nor DISTINCT.
- Its FROM part is just a table name list: no RangeVariables, JOINs nor nested queries (but views are supported). Hence attribute names must determine tables.
- Its WHERE part is just a conjunction of equality comparisons = of attribute names and constants: no other comparisons nor expressions.
- The only 2 supported attribute types are INT for Java 32-bit integers, and VARCHAR(N) for ASCII strings of at most N characters, without NULLs.
- There is no UNION, GROUP nor ORDER BY.
- There are no keys or integrity constraints.
- An INSERT takes only VALUES, not queries.
- An UPDATE has only one assignment, not many.
- An INDEX can have only one attribute, not many. Moreover, index support must be enabled separately.
- Entities CREATEd in the current schema cannot be DROPped.

Its grammar is in Figure 25.

3.2 Using a Relational Database from Java (Sciore, 2008, Chapter 8)

We shall consider the SimpleDB server-side structure later in this course. Let us consider here the structure of a simple client.

There is a family of client-server database communication protocols called Open Data Base Connectivity (ODBC). There are now ODBC binding libraries for many programming languages. They permit application programs written in that language to communicate with any ODBC-compliant database server. The Java binding is called JDBC (which does not mean "Java DBC", according to Sun's legal position...).

SimpleDB supports enough of the JDBC specification to allow writing simple clients, but not nearly all the features of the whole specification. This basic JDBC API is shown as Figure 26. We shall use the SimpleDB studentclient/simpledb/FindMajors.java client as our example. Such a batch-oriented client has 4 main phases:

Figure 25: A small SQL language dialect. (Sciore, 2008)

Figure 26: The basic JDBC Application Programming Interface. (Sciore, 2008)

1. The client opens a connection to the server. The central Java code is

Driver d = new TheRightDriver();
String url = "jdbc:system://server/path";
Connection conn = d.connect(url, properties);

where

TheRightDriver() is supplied by the RDBMS JDBC binding, and imported into the client code. For SimpleDB, it is simpledb.remote.SimpleDriver.

system is the RDBMS used. For SimpleDB, it is simpledb.

server is the machine running the rmiregistry and the RDBMS processes to which this client wants to connect. If this server is the same machine as this client, then this is localhost.

/path leads to the databasename to use within the server. For SimpleDB it is not needed, because it stores its databasename subdirectories directly in its users' home directories.

properties is an RDBMS-specific string giving extra options for the connection. For instance, if the RDBMS has mandatory access control, then this string can contain the required username and password. SimpleDB does not support any properties, so it is the null pointer.

The vendor-independent parts of JDBC are imported from java.sql.*. The method calls of this created connection
❶ happen remotely via the rmiregistry process running on the server...
❷ which in turn forwards them to the RDBMS process.

Unfortunately this old way to form the connection is not very portable, because the client contains TheRightDriver, which is vendor-dependent. Java supports also newer ways, where the server can send TheRightDriver to its clients based on the system in the url (Sciore, 2008, Chapter 8.2.1).
+ Now the client is vendor-independent, but...
- the server-side setup gets more complicated,
and so we continue using the old way here instead.

2. The client sends an SQL statement to the server. The central Java code for querying the database is

Statement stmt = conn.createStatement();
String qry = statement;
ResultSet rs = stmt.executeQuery(qry);

where statement is an SQL SELECT... FROM... WHERE... statement as text. rs gives the results of the query as a result set, to be processed in the next phase 3.

Other SQL statements can be issued with

int howmany = stmt.executeUpdate(qry);

whose return value tells how many records were affected, instead of a result set.

The RDBMS server
❶ first compiles this statement into Relational Algebra and optimizes it into a form...
❷ which it then executes.

A statement can also be prepared beforehand: The compilation step ❶ happens only once. The same compiled statement can be executed in step ❷ many times, with different parameter values each time. This is useful, because we shall see during this course that step ❶ is not trivial. These parameter positions are marked with question marks "?" within the statement to prepare, while the value for the nth "?" can be set with the method setType(int n, Type value) for each SQL Type (setInt, setString, ...). Figure 27 shows an example.

3. The client receives the result from the server. The result set of a query consists of the corresponding rows. One of them is the current row: a reading position within the result set.

Initially this current row is just before the first row of the result set, so it is not valid yet.

Method next moves this current row to the next row of the result set. It returns false if it moved past the last row of the result set, so that it is no longer valid.

If the current row is valid, then the value of its named attribute can be extracted with the method Type getType(String name) for each SQL Type (getInt, getString, ...). Note: Figure 26 did not mention this name parameter.

SimpleDB will use this current row model also for its internal intermediate result sets of individual Relational Algebra operators.
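To preview what that looks like inside the engine, here is a sketch of a minimal current-row interface in the spirit of SimpleDB's internal scans; the method names deliberately mirror the JDBC flavour used above, but the exact interface in the SimpleDB sources may differ from this simplified version:

// A minimal "current row" abstraction, as used both by JDBC result sets
// and (in simplified form) by the engine's internal scans.
// This is an illustrative sketch, not the actual SimpleDB interface.
public interface RowScan {
    void beforeFirst();                // position just before the first row
    boolean next();                    // advance; false when past the last row
    int getInt(String attrName);       // read an INT attribute of the current row
    String getString(String attrName); // read a VARCHAR attribute of the current row
    void close();                      // release the resources held by this scan
}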

Besides these basic read-forward result sets, JDBC also supports scrollable result sets, whose current row can move also backwards, and updatable result sets, which permit updating the attribute values of the current row (Sciore, 2008, Chapter 8.2.5); these are especially useful in clients with graphical user interfaces (GUIs).

Such a result set is an example of a lazy data structure: it does not exist as a whole, but... its elements are constructed one by one, as the client asks for the next one.

Once the while (rs.next()) loop processing the result set rs finishes, the client should call rs.close() as soon as possible, because the RDBMS maintains each open result set, and they reserve its limited resources.

4. The client closes its connection to the server. Similarly, the client should call conn.close() as soon as it no longer needs this connection to the RDBMS, because open connections are a limited resource too.

SimpleDB source file studentclient/simpledb/FindMajors.java (a special symbol in the original listing denotes a long source code line which had to be divided into many lines on the page):

import java.sql.*;
import simpledb.remote.SimpleDriver;

public class FindMajors {
   public static void main(String[] args) {
      String major = args[0];
      System.out.println("Here are the " + major + " majors");
      System.out.println("Name\tGradYear");
      Connection conn = null;
      try {
         // Step 1: connect to database server
         Driver d = new SimpleDriver();
         conn = d.connect("jdbc:simpledb://localhost", null);

         // Step 2: execute the query
         Statement stmt = conn.createStatement();
         String qry = "select sname, gradyear "
                    + "from student, dept "
                    + "where did = majorid "
                    + "and dname = '" + major + "'";
         ResultSet rs = stmt.executeQuery(qry);

         // Step 3: loop through the result set
         while (rs.next()) {
            String sname = rs.getString("sname");
            int gradyear = rs.getInt("gradyear");
            System.out.println(sname + "\t" + gradyear);

         }
         rs.close();
      }
      catch (Exception e) {
         e.printStackTrace();
      }
      finally {
         // Step 4: close the connection
         try {
            if (conn != null)
               conn.close();
         }
         catch (SQLException e) {
            e.printStackTrace();
         }
      }
   }
}

Figure 27: Preparing an SQL statement and using it. (Sciore, 2008)

3.3 JDBC Error Handling

The FindMajors client code performed its phases 1-3 in a Java try block, because JDBC reports errors by throwing exceptions (Sestoft, 2005, Chapters ...). These exceptions may arise for various phases and reasons:

❶ The client might not be able to connect to the server in phase 1.
❷ There may be something wrong in the SQL statement which the client sends to the server in phase 2.
❸ The server or network might crash during the result-set processing loop of phase 3.
❹ The RDBMS may have to abort the transaction of the client, because the RDBMS is running out of resources.

The client may choose to retry its operation later, especially if the reason for its failure was ❹. The FindMajors client gives up trying, closes the connection, and prints the Java stack trace as the diagnostic information.
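A client which does want to retry after reason ❹ could wrap its work as sketched below. This is an illustrative sketch only: it uses the standard JDBC API (PreparedStatement, setAutoCommit, commit, rollback) rather than SimpleDB's restricted subset, and the retry limit and the query are made-up examples, not part of the course material.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class RetrySketch {
    // Run one transaction, retrying a few times if the RDBMS aborts it.
    static void findMajorsWithRetry(Connection conn, String major) throws SQLException {
        conn.setAutoCommit(false);                       // take over transaction handling
        String qry = "select sname, gradyear from student, dept "
                   + "where did = majorid and dname = ?";
        for (int attempt = 1; attempt <= 3; attempt++) { // an arbitrary retry limit
            try (PreparedStatement stmt = conn.prepareStatement(qry)) {
                stmt.setString(1, major);                // fill in the 1st "?" parameter
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("sname") + "\t" + rs.getInt("gradyear"));
                    }
                }
                conn.commit();                           // end the transaction by hand
                return;                                  // success, no retry needed
            } catch (SQLException e) {
                conn.rollback();                         // abort this attempt...
                if (attempt == 3) throw e;               // ...and give up eventually
            }
        }
    }
}

Whether such a retry makes sense, and how many times, is an application-level decision. Section 3.4 below describes the manual transaction mode which this sketch switches on with setAutoCommit(false).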

In FindMajors, this closing of the connection takes place in the finally part, so it is executed whether the try part executed correctly or caused an exception to catch. This finally part closes the connection if phase 1 managed to open it. It may raise an exception too, and is therefore in its own try block.

3.4 JDBC Transaction Handling

An RDBMS operates in autocommit mode by default, as described by Table 2.

Table 2: With and without autocommit mode.

With AutoCommit still true: The RDBMS executes each SQL statement as its own transaction. The RDBMS commits (or aborts) them internally and automatically; this is what autocommit means.

After setting it to false via the API in Figure 28: The RDBMS continues the same transaction when the client sends its next SQL statement into the connection. The client must commit or abort this transaction by hand at the end.

Figure 28: JDBC transactions. (Sciore, 2008)

An RDBMS operates in its default transaction isolation level, unless the client sets this level explicitly for its connection. For instance,

conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE)

sets it to the full serializable level.

When a client turns off autocommit mode with

conn.setAutoCommit(false)

it might (but hopefully never!) encounter the following pathological situation:

❶ Suppose that the client calls conn.rollback() for some reason, for instance if some SQL statement it sent to the server caused an exception as in section 3.3.
❷ This attempt to abort the transaction fails with an(other) exception?

What should the client do then? Neither committing nor aborting its transaction is possible! Then the database may have become corrupted, because it may not be possible to recover it to the last consistent state before this transaction started. Hence the client should somehow alert the DBA about this danger, if possible.

3.5 Impedance Mismatch (Sciore, 2008, Chapter 9)

A JDBC client like FindMajors is rather procedural programming: An attribute value STUDENT.MajorId yields the key of the row in the DEPT table which corresponds to the department of the student represented by this row of the STUDENT table. For instance, Joe's row in the STUDENT table has the ID of the computer science department in the DEPT table, and so on.

A more object-oriented design would have instead objects like Joe of class STUDENT... with attributes like Joe.majorOf, which points to another object compsci of class DEPT, and so on.

It is possible to build an Object-Relational Mapping (ORM) which builds the latter design on top of the former. The Java Persistence API (JPA) is one tool for generating such an ORM. It uses Java code annotated with the related relational table design, as in Figure 29.

From a programming philosophy (whatever that is...) viewpoint, this impedance mismatch between these models stems from their origins:

The relational model is built on first-order predicate logic, whose structure is flat, because it has just indivisible values and their relations, but flexible, because these relations can be combined freely in formulas/queries.

The object-oriented model is built instead on representing information with structured entities with their own identity and individual properties, but with specific access paths between them, encoded as these per-object properties, such as this student's major.

Each is more useful than the other in some situations.

It is possible to develop a data model based on the object-oriented philosophy. This leads to Object-Oriented Data Base Management Systems (OODBMSs) like O2. However, their market share has remained much smaller than that of RDBMSs, even though object-oriented programming languages have become very common.

Figure 29: JPA annotations combining the STUDENT table and class. (Sciore, 2008) (Continues in Figure 30.)

Figure 30: Rest of Figure 29. (Sciore, 2008)
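In the spirit of the annotations in Figures 29 and 30, a JPA-style mapping of the STUDENT table and its major could look roughly like the following sketch. It uses the standard javax.persistence annotations (jakarta.persistence in newer versions); the constructors, fetch settings and other details are omitted, so this is not the figures' exact code:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.Table;

@Entity
@Table(name = "STUDENT")            // this class is mapped to the STUDENT table
public class Student {
    @Id
    @Column(name = "SId")           // the primary key column
    private int sId;

    @Column(name = "SName")
    private String sName;

    @Column(name = "GradYear")
    private int gradYear;

    @ManyToOne                      // the MajorId foreign key becomes an object reference,
    @JoinColumn(name = "MajorId")   // so joe.getMajorOf() yields a Dept object directly
    private Dept majorOf;

    public Dept getMajorOf() { return majorOf; }
}

@Entity
@Table(name = "DEPT")               // the referenced department entity, annotated similarly
class Dept {
    @Id
    @Column(name = "DId")
    private int dId;

    @Column(name = "DName")
    private String dName;
}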

Moreover, there are other programming philosophies than object-orientation, such as functional and logic programming. They are based on the concept of value instead of (object) identity, and so the relational model is more natural for them. However, despite their long history, they are still niche programming languages.

4 The Structure of the SimpleDB RDBMS Engine (Sciore, 2008, part 3)

Now we examine how an RDBMS server can be implemented, using SimpleDB as our example. Although SimpleDB is a restricted RDBMS written and made available for teaching purposes, it does contain the most important components of a full RDBMS. These components are shown in Figure 31. SimpleDB has chosen straightforward implementations for these components. We shall mention some alternatives too.

We can trace the execution of an SQL query in the SimpleDB RDBMS server process down these components:

❶ The Remote manager handles the communication with the client. The server process allocates a separate thread for each connection via the RMI mechanism.

❷ When a client sends an SQL statement to its open connection, this Remote manager passes it to the Planner component. This component plans how the statement will be executed. This plan is a Relational Algebra expression, which it sends to the Query component. It invokes the Parser component, which turns the statement into a syntax tree containing the tables, attributes, constants, ... mentioned in it. This Parser component in turn invokes the Metadata manager, which keeps track of information about the tables, attributes, indexes, ... CREATEd in the database, to check that the things mentioned in the syntax tree do exist and have the right type.

❸ The Query component turns the plan it received from the Planner component into a scan and executes it. It forms this scan by choosing an implementation for each operation in the expression. For instance, if the expression contains a sort operation, then this Query component chooses a particular sorting algorithm to use. The RDBMS can choose from several algorithms for the same operation, because different algorithms suit different situations, improving performance. This component uses the Metadata manager too, because its information helps in making these choices.

Figure 31: The Components of an RDBMS Engine. (Sciore, 2008, page 310)

This scan is executed using the same current-row approach as the client uses for processing the result in its phase 3 in section 3.2.

❹ Each of these rows processed by the Query component is stored on disk as a record handled by the Record manager. These records are stored in disk blocks held in files managed by the File manager. The Buffer manager is in turn responsible for those disk blocks which have been read into RAM for accessing the records in them.

❺ Each (scan for a) statement is executed as (if in autocommit mode) or within (otherwise) a Transaction. They are managed by a manager responsible for concurrency control and recovery, using a designated Log file managed by its own component.

The relative order of these components may vary according to the architecture: SimpleDB handles concurrency at the Buffer level, so its Transaction manager is located just above it. Other databases handle it at the Record level instead, and so their Transaction managers are above that level instead. However, we will go upwards in Figure 31, so that each component uses services provided by the components below it and provides services to the components above it.

4.1 File Management (Sciore, 2008, Chapter 12)

This lowest level of an RDBMS is the component which handles interaction with the underlying disk drive(s). The RDBMS can do this with

raw disk(s), so that the database resides on dedicated drives (or partitions) with nothing else.
+ This is as fast as possible, but...
- such disks need dedicated special support from the DBA.
This is used only for very high performance requirements.

OS file(s), so that the database is in normal files in normal file systems.
+ They need only the same support as file systems in general, but...
- the OS layer overhead impairs performance.
This is currently the most common choice.

This OS file choice can be divided further into

single file architecture, where the whole database is stored in a single (possibly very) big file, like for instance the .mdb files of Microsoft Access.

multifile architecture, where each database is in a separate subdirectory containing separate files for its tables, indexes, ... like for instance Oracle and SimpleDB do.

The RDBMS treats its files internally like raw disks: It consults the OS only for opening and closing its files, and for extending them with more blocks, but manages these blocks, their buffering, and their allocation by itself. The reason is not only better performance but, even more importantly, ensuring durability: The RDBMS must know precisely which of its data is already stored on disk, and which is still only in RAM and vanishes if the computer crashes.

Disks are persistent storage. In order to guarantee durability, the RDBMS needs some memory whose contents do not disappear when the computer crashes. A disk drive provides such persistent storage. A disk drive consists of sectors, which the OS divides further into blocks.

Big databases require big disks. Big disks are more expensive than small disks.
+ It is possible to connect many small disks into one unit, which looks like a big disk to the OS, because the controller of the unit takes care of spreading the stored data among these disks.

Disk striping builds such a big disk out of many smaller disks. For performance reasons, it spreads the sectors of the big disk evenly across the sectors of the smaller disks, as in Figure 32.

The RDBMS relies on the disk drive to function properly. However, the DBA must also be prepared for disk failures. One countermeasure is to make regular backups (on tape)... Another countermeasure is to use a Redundant Array of Inexpensive Disks (RAID) as the drive. A RAID unit adds extra error-correcting information into a striped disk unit. If one of the smaller disks breaks, the RAID unit can inform the DBA about which of them broke. The DBA can then change the broken disk and reconstruct its contents from the other disks and this extra information.

Figure 32: Two-disk striping. (Sciore, 2008)

The only problem is if another disk breaks during this reconstruction... but this is unlikely. Moreover, adding more error-correcting information makes it possible to reconstruct more than one disk at a time.

There are now 7 levels of RAID, depending on what extra information the unit holds and where.

The simplest is RAID-0, which is plain striping without any extra error-correcting information. Therefore it does not offer any protection against failures.

The next level is RAID-1, where the extra error-correcting information is a mirror of the data disk into another identical disk, as in Figure 33. The DBA can reconstruct the contents of the data disk simply by copying this mirror disk into the replacement disk.

Another kind of error-correcting extra data is parity: The RAID unit consists of N + 1 small disks. N of these disks hold the data. The extra (N + 1)st disk holds the parity blocks of the data blocks. That is, sector s of this extra (N + 1)st disk holds the exclusive-or of the sectors s of the N data disks. In other words, bit b of sector s on this extra (N + 1)st disk is 1 if and only if an odd number of the bits b of the sectors s on the N data disks are 1, otherwise 0. If any one of these N data disks breaks, the DBA can reconstruct its contents from the other (N - 1) still functioning data disks and this extra (N + 1)st disk.
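Both the parity computation and the reconstruction of a broken disk are just exclusive-ors, as the following small sketch illustrates; sector contents are modelled as byte arrays, and the whole in-memory setting is of course only illustrative:

public class ParitySketch {
    // Compute the parity sector: the bitwise exclusive-or of the
    // corresponding sectors of the N data disks.
    static byte[] paritySector(byte[][] dataSectors) {
        byte[] parity = new byte[dataSectors[0].length];
        for (byte[] sector : dataSectors) {
            for (int b = 0; b < parity.length; b++) {
                parity[b] ^= sector[b];
            }
        }
        return parity;
    }

    // Reconstruct the sector of the broken disk from the N - 1 surviving
    // data sectors and the parity sector: XOR them all together.
    static byte[] reconstruct(byte[][] survivingSectors, byte[] paritySector) {
        byte[] lost = paritySector.clone();
        for (byte[] sector : survivingSectors) {
            for (int b = 0; b < lost.length; b++) {
                lost[b] ^= sector[b];
            }
        }
        return lost;
    }
}

Note also that whenever any data sector changes, the corresponding parity sector must be recomputed and rewritten, which is the bottleneck discussed below for RAID-4.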

Figure 33: Mirroring. (Sciore, 2008)

This is more compact than mirroring, because there is only one extra block per N data blocks, whereas mirroring had one extra block per data block. In fact, mirroring could be viewed as parity with N = 1.

This parity idea is RAID-4. It is shown in Figure 34. However, the dedicated extra (N + 1)st parity disk becomes a bottleneck for the whole RAID unit, because whenever a data disk sector changes, the corresponding sector of the parity disk must be updated too.

RAID-5 solves this bottleneck by distributing these parity sectors evenly among the data sectors. Every (N + 1)st sector of a small disk is a parity sector; its other sectors are data. A parity sector s on a small disk d contains the parity of the corresponding sectors s of the small disks other than d itself, that is, of the disks 1, 2, 3, ..., d - 1, d + 1, d + 2, d + 3, ..., N + 1. Then the extra work of updating the parity sectors is divided evenly among all the disks, and so no one disk is a bottleneck any longer. The DBA can still reconstruct the contents of any one broken disk from the other N still functioning disks.

The two most common levels are RAID-1 and RAID-5.

RAID-2 used bit instead of sector striping and an error-correcting code instead of parity, but it was hard to implement and performed poorly, and so is no longer used.

Figure 34: Parity. (Sciore, 2008)

RAID-3 is like RAID-4 but with the less efficient byte instead of sector striping. RAID-6 is like RAID-5 but with two kinds of parity information, so it tolerates two disk failures at the same time.

For instance, the current cs.uef.fi server has one fast RAID-1 unit for the OS and temporary files, two RAID-5 units with N = 4 for user files, and two hot-swap drives, which allow the IT support to reconstruct a broken disk on the fly, without having to shut down the server.

The IT support (including the DBAs) recommends which RAID to buy based on
- the required levels of protection against downtime and loss of work caused by disk failures (in theory, by determining a low enough expected value of the cost involved: disk failure probability × cost of disk failure), and
- the performance requirements for the system,
based on statistics collected about its current use, and estimates about its future use.

Disks are slow. Disk storage is much slower than RAM: mechanical disk drives are several orders of magnitude slower, and even flash drives are still clearly slower than RAM.

Requirement 10 (little I/O). The RDBMS must strive to avoid unnecessary disk I/O whenever possible.


More information

L13: Normalization. CS3200 Database design (sp18 s2) 2/26/2018

L13: Normalization. CS3200 Database design (sp18 s2)   2/26/2018 L13: Normalization CS3200 Database design (sp18 s2) https://course.ccs.neu.edu/cs3200sp18s2/ 2/26/2018 274 Announcements! Keep bringing your name plates J Page Numbers now bigger (may change slightly)

More information

CSCI3390-Assignment 2 Solutions

CSCI3390-Assignment 2 Solutions CSCI3390-Assignment 2 Solutions due February 3, 2016 1 TMs for Deciding Languages Write the specification of a Turing machine recognizing one of the following three languages. Do one of these problems.

More information

Example: 2x y + 3z = 1 5y 6z = 0 x + 4z = 7. Definition: Elementary Row Operations. Example: Type I swap rows 1 and 3

Example: 2x y + 3z = 1 5y 6z = 0 x + 4z = 7. Definition: Elementary Row Operations. Example: Type I swap rows 1 and 3 Linear Algebra Row Reduced Echelon Form Techniques for solving systems of linear equations lie at the heart of linear algebra. In high school we learn to solve systems with or variables using elimination

More information

EEOS 381 -Spatial Databases and GIS Applications

EEOS 381 -Spatial Databases and GIS Applications EEOS 381 -Spatial Databases and GIS Applications Lecture 5 Geodatabases What is a Geodatabase? Geographic Database ESRI-coined term A standard RDBMS that stores and manages geographic data A modern object-relational

More information

Mass Asset Additions. Overview. Effective mm/dd/yy Page 1 of 47 Rev 1. Copyright Oracle, All rights reserved.

Mass Asset Additions.  Overview. Effective mm/dd/yy Page 1 of 47 Rev 1. Copyright Oracle, All rights reserved. Overview Effective mm/dd/yy Page 1 of 47 Rev 1 System References None Distribution Oracle Assets Job Title * Ownership The Job Title [list@yourcompany.com?subject=eduxxxxx] is responsible for ensuring

More information

Computational Tasks and Models

Computational Tasks and Models 1 Computational Tasks and Models Overview: We assume that the reader is familiar with computing devices but may associate the notion of computation with specific incarnations of it. Our first goal is to

More information

Numerical Methods Lecture 2 Simultaneous Equations

Numerical Methods Lecture 2 Simultaneous Equations Numerical Methods Lecture 2 Simultaneous Equations Topics: matrix operations solving systems of equations pages 58-62 are a repeat of matrix notes. New material begins on page 63. Matrix operations: Mathcad

More information

Session-Based Queueing Systems

Session-Based Queueing Systems Session-Based Queueing Systems Modelling, Simulation, and Approximation Jeroen Horters Supervisor VU: Sandjai Bhulai Executive Summary Companies often offer services that require multiple steps on the

More information

Incompatibility Paradoxes

Incompatibility Paradoxes Chapter 22 Incompatibility Paradoxes 22.1 Simultaneous Values There is never any difficulty in supposing that a classical mechanical system possesses, at a particular instant of time, precise values of

More information

Supplementary Notes on Inductive Definitions

Supplementary Notes on Inductive Definitions Supplementary Notes on Inductive Definitions 15-312: Foundations of Programming Languages Frank Pfenning Lecture 2 August 29, 2002 These supplementary notes review the notion of an inductive definition

More information

A GUI FOR EVOLVE ZAMS

A GUI FOR EVOLVE ZAMS A GUI FOR EVOLVE ZAMS D. R. Schlegel Computer Science Department Here the early work on a new user interface for the Evolve ZAMS stellar evolution code is presented. The initial goal of this project is

More information

Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras. Lecture - 15 Propositional Calculus (PC)

Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras. Lecture - 15 Propositional Calculus (PC) Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras Lecture - 15 Propositional Calculus (PC) So, now if you look back, you can see that there are three

More information

Lecture Notes on From Rules to Propositions

Lecture Notes on From Rules to Propositions Lecture Notes on From Rules to Propositions 15-816: Substructural Logics Frank Pfenning Lecture 2 September 1, 2016 We review the ideas of ephemeral truth and linear inference with another example from

More information

Rule-Based Classifiers

Rule-Based Classifiers Rule-Based Classifiers For completeness, the table includes all 16 possible logic functions of two variables. However, we continue to focus on,,, and. 1 Propositional Logic Meta-theory. Inspection of a

More information

Bits. Chapter 1. Information can be learned through observation, experiment, or measurement.

Bits. Chapter 1. Information can be learned through observation, experiment, or measurement. Chapter 1 Bits Information is measured in bits, just as length is measured in meters and time is measured in seconds. Of course knowing the amount of information is not the same as knowing the information

More information

Elementary Linear Algebra, Second Edition, by Spence, Insel, and Friedberg. ISBN Pearson Education, Inc., Upper Saddle River, NJ.

Elementary Linear Algebra, Second Edition, by Spence, Insel, and Friedberg. ISBN Pearson Education, Inc., Upper Saddle River, NJ. 2008 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved. APPENDIX: Mathematical Proof There are many mathematical statements whose truth is not obvious. For example, the French mathematician

More information

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS chapter MORE MATRIX ALGEBRA GOALS In Chapter we studied matrix operations and the algebra of sets and logic. We also made note of the strong resemblance of matrix algebra to elementary algebra. The reader

More information

Lecture 10: Gentzen Systems to Refinement Logic CS 4860 Spring 2009 Thursday, February 19, 2009

Lecture 10: Gentzen Systems to Refinement Logic CS 4860 Spring 2009 Thursday, February 19, 2009 Applied Logic Lecture 10: Gentzen Systems to Refinement Logic CS 4860 Spring 2009 Thursday, February 19, 2009 Last Tuesday we have looked into Gentzen systems as an alternative proof calculus, which focuses

More information

Languages, regular languages, finite automata

Languages, regular languages, finite automata Notes on Computer Theory Last updated: January, 2018 Languages, regular languages, finite automata Content largely taken from Richards [1] and Sipser [2] 1 Languages An alphabet is a finite set of characters,

More information

Geodatabase Management Pathway

Geodatabase Management Pathway Geodatabase Management Pathway Table of Contents ArcGIS Desktop II: Tools and Functionality 3 ArcGIS Desktop III: GIS Workflows and Analysis 6 Building Geodatabases 8 Data Management in the Multiuser Geodatabase

More information

ITI Introduction to Computing II

ITI Introduction to Computing II (with contributions from R. Holte) School of Electrical Engineering and Computer Science University of Ottawa Version of January 9, 2019 Please don t print these lecture notes unless you really need to!

More information

Introduction to ArcGIS Server Development

Introduction to ArcGIS Server Development Introduction to ArcGIS Server Development Kevin Deege,, Rob Burke, Kelly Hutchins, and Sathya Prasad ESRI Developer Summit 2008 1 Schedule Introduction to ArcGIS Server Rob and Kevin Questions Break 2:15

More information

MITOCW watch?v=fkfsmwatddy

MITOCW watch?v=fkfsmwatddy MITOCW watch?v=fkfsmwatddy PROFESSOR: We've seen a lot of functions in introductory calculus-- trig functions, rational functions, exponentials, logs and so on. I don't know whether your calculus course

More information

LECSS Physics 11 Introduction to Physics and Math Methods 1 Revised 8 September 2013 Don Bloomfield

LECSS Physics 11 Introduction to Physics and Math Methods 1 Revised 8 September 2013 Don Bloomfield LECSS Physics 11 Introduction to Physics and Math Methods 1 Physics 11 Introduction to Physics and Math Methods In this introduction, you will get a more in-depth overview of what Physics is, as well as

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Fall 2017 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 42 Schema Design When we get a relational

More information

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Reference: A First Course in Database Systems, 3 rd edition, Chapter 3 Important Notices CMPS 180 Final Exam

More information

CS 361 Meeting 26 11/10/17

CS 361 Meeting 26 11/10/17 CS 361 Meeting 26 11/10/17 1. Homework 8 due Announcements A Recognizable, but Undecidable Language 1. Last class, I presented a brief, somewhat inscrutable proof that the language A BT M = { M w M is

More information

University of California Berkeley CS170: Efficient Algorithms and Intractable Problems November 19, 2001 Professor Luca Trevisan. Midterm 2 Solutions

University of California Berkeley CS170: Efficient Algorithms and Intractable Problems November 19, 2001 Professor Luca Trevisan. Midterm 2 Solutions University of California Berkeley Handout MS2 CS170: Efficient Algorithms and Intractable Problems November 19, 2001 Professor Luca Trevisan Midterm 2 Solutions Problem 1. Provide the following information:

More information

2.6 Variations on Turing Machines

2.6 Variations on Turing Machines 2.6 Variations on Turing Machines Before we proceed further with our exposition of Turing Machines as language acceptors, we will consider variations on the basic definition of Slide 10 and discuss, somewhat

More information

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Raman

More information

JOB REQUESTS C H A P T E R 3. Overview. Objectives

JOB REQUESTS C H A P T E R 3. Overview. Objectives C H A P T E R 3 JOB REQUESTS Overview Objectives Job Requests is one of the most critical areas of payroll processing. This is where the user can enter, update, and view information regarding an employee

More information

MA554 Assessment 1 Cosets and Lagrange s theorem

MA554 Assessment 1 Cosets and Lagrange s theorem MA554 Assessment 1 Cosets and Lagrange s theorem These are notes on cosets and Lagrange s theorem; they go over some material from the lectures again, and they have some new material it is all examinable,

More information

Boolean Algebra and Digital Logic

Boolean Algebra and Digital Logic All modern digital computers are dependent on circuits that implement Boolean functions. We shall discuss two classes of such circuits: Combinational and Sequential. The difference between the two types

More information

Designing and Evaluating Generic Ontologies

Designing and Evaluating Generic Ontologies Designing and Evaluating Generic Ontologies Michael Grüninger Department of Industrial Engineering University of Toronto gruninger@ie.utoronto.ca August 28, 2007 1 Introduction One of the many uses of

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 32 Schema Design When we get a relational

More information

Linear Programming and its Extensions Prof. Prabha Shrama Department of Mathematics and Statistics Indian Institute of Technology, Kanpur

Linear Programming and its Extensions Prof. Prabha Shrama Department of Mathematics and Statistics Indian Institute of Technology, Kanpur Linear Programming and its Extensions Prof. Prabha Shrama Department of Mathematics and Statistics Indian Institute of Technology, Kanpur Lecture No. # 03 Moving from one basic feasible solution to another,

More information

The State Explosion Problem

The State Explosion Problem The State Explosion Problem Martin Kot August 16, 2003 1 Introduction One from main approaches to checking correctness of a concurrent system are state space methods. They are suitable for automatic analysis

More information

From Non-Negative Matrix Factorization to Deep Learning

From Non-Negative Matrix Factorization to Deep Learning The Math!! From Non-Negative Matrix Factorization to Deep Learning Intuitions and some Math too! luissarmento@gmailcom https://wwwlinkedincom/in/luissarmento/ October 18, 2017 The Math!! Introduction Disclaimer

More information

UNIT-VIII COMPUTABILITY THEORY

UNIT-VIII COMPUTABILITY THEORY CONTEXT SENSITIVE LANGUAGE UNIT-VIII COMPUTABILITY THEORY A Context Sensitive Grammar is a 4-tuple, G = (N, Σ P, S) where: N Set of non terminal symbols Σ Set of terminal symbols S Start symbol of the

More information

2 Systems of Linear Equations

2 Systems of Linear Equations 2 Systems of Linear Equations A system of equations of the form or is called a system of linear equations. x + 2y = 7 2x y = 4 5p 6q + r = 4 2p + 3q 5r = 7 6p q + 4r = 2 Definition. An equation involving

More information

Efficient Cryptanalysis of Homophonic Substitution Ciphers

Efficient Cryptanalysis of Homophonic Substitution Ciphers Efficient Cryptanalysis of Homophonic Substitution Ciphers Amrapali Dhavare Richard M. Low Mark Stamp Abstract Substitution ciphers are among the earliest methods of encryption. Examples of classic substitution

More information

Stochastic Processes

Stochastic Processes qmc082.tex. Version of 30 September 2010. Lecture Notes on Quantum Mechanics No. 8 R. B. Griffiths References: Stochastic Processes CQT = R. B. Griffiths, Consistent Quantum Theory (Cambridge, 2002) DeGroot

More information

Basics of Proofs. 1 The Basics. 2 Proof Strategies. 2.1 Understand What s Going On

Basics of Proofs. 1 The Basics. 2 Proof Strategies. 2.1 Understand What s Going On Basics of Proofs The Putnam is a proof based exam and will expect you to write proofs in your solutions Similarly, Math 96 will also require you to write proofs in your homework solutions If you ve seen

More information

Databases Exam HT2016 Solution

Databases Exam HT2016 Solution Databases Exam HT2016 Solution Solution 1a Solution 1b Trainer ( ssn ) Pokemon ( ssn, name ) ssn - > Trainer. ssn Club ( name, city, street, streetnumber ) MemberOf ( ssn, name, city ) ssn - > Trainer.

More information

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization Tim Roughgarden February 28, 2017 1 Preamble This lecture fulfills a promise made back in Lecture #1,

More information

Theory of Computer Science

Theory of Computer Science Theory of Computer Science E1. Complexity Theory: Motivation and Introduction Malte Helmert University of Basel May 18, 2016 Overview: Course contents of this course: logic How can knowledge be represented?

More information

Replay argument. Abstract. Tanasije Gjorgoski Posted on on 03 April 2006

Replay argument. Abstract. Tanasije Gjorgoski Posted on  on 03 April 2006 Replay argument Tanasije Gjorgoski Posted on http://broodsphilosophy.wordpress.com/, on 03 April 2006 Abstract Before a year or so ago, I was trying to think of an example so I can properly communicate

More information

Proving Completeness for Nested Sequent Calculi 1

Proving Completeness for Nested Sequent Calculi 1 Proving Completeness for Nested Sequent Calculi 1 Melvin Fitting abstract. Proving the completeness of classical propositional logic by using maximal consistent sets is perhaps the most common method there

More information

Lecture Notes on Inductive Definitions

Lecture Notes on Inductive Definitions Lecture Notes on Inductive Definitions 15-312: Foundations of Programming Languages Frank Pfenning Lecture 2 September 2, 2004 These supplementary notes review the notion of an inductive definition and

More information

cis32-ai lecture # 18 mon-3-apr-2006

cis32-ai lecture # 18 mon-3-apr-2006 cis32-ai lecture # 18 mon-3-apr-2006 today s topics: propositional logic cis32-spring2006-sklar-lec18 1 Introduction Weak (search-based) problem-solving does not scale to real problems. To succeed, problem

More information

Theory of Computer Science. Theory of Computer Science. E1.1 Motivation. E1.2 How to Measure Runtime? E1.3 Decision Problems. E1.

Theory of Computer Science. Theory of Computer Science. E1.1 Motivation. E1.2 How to Measure Runtime? E1.3 Decision Problems. E1. Theory of Computer Science May 18, 2016 E1. Complexity Theory: Motivation and Introduction Theory of Computer Science E1. Complexity Theory: Motivation and Introduction Malte Helmert University of Basel

More information