
Introduction to Column Stores with MonetDB and Benchmark

Seminar Database Systems
Master of Science in Engineering, Major Software and Systems
HSR Hochschule für Technik Rapperswil

Supervisor: Prof. Stefan Keller
Author: Jannis Grimm

Rapperswil, February 2016

Abstract

Column-store databases (also: column-oriented databases) are a new direction for database management systems. This paper explains the differences to traditional row-oriented databases, goes into greater detail with MonetDB as a column-store database example and concludes with a comparative benchmark between PostgreSQL and MonetDB.

Contents

1. Introduction
2. Column-Store Databases
   2.1. Main differences to row-based databases
   2.2. Advantages and Disadvantages to Row-based Databases
3. MonetDB Introduction
4. MonetDB's Column-Store Implementation
   4.1. Vertical Fragmentation
   4.2. Execution Engine
   4.3. Recycling and Updates
   4.4. Adaptive Indexes (Database Cracking)
   4.5. Data Compression
   4.6. Best Practices
5. Benchmark Adaption From PostgreSQL to MonetDB
   5.1. Dropping existing tables
   5.2. Create tables
   5.3. Import from CSV
   5.4. echo
   5.5. Rename fields
6. Benchmark Execution Environment
   6.1. Physical Machine
   6.2. Virtual Machine
7. Benchmark Execution Results
   7.1. Table gnis
        Query 1a and 1b
        Query 2a and 2b
        Query 3a and 3b
        Query 4a and 4b
   7.2. Table osm_poi_ch
        Query 10x, 10a, 10b, and 10c
        Query 11a, 11b, and 11c
        Query 12a, 12b, and 12c
        Query 13a, 13b, and 13c
        Query 14a, 14b, and 14c
8. Conclusion
References
A. List of Figures
B. Bachelor/Master Lab Lesson
   B.1. Installation
   B.2. Create and Import
   B.3. Insert, Update and Delete
   B.4. Comparison
   B.5. Database Cracking
C. PostgreSQL Scripts
   C.1. gnis_load.sql
   C.2. osm_poi_ch_load.sql
   C.3. bm_prepare.sql
   C.4. bm.sql
D. MonetDB Scripts
   D.1. gnis_load.sql
   D.2. osm_poi_ch_load.sql
   D.3. bm_prepare.sql
   D.4. bm.sql
E. PostgreSQL Benchmark Commands and Output
F. MonetDB Benchmark Commands and Output

1. Introduction

Column-store databases (also: column-oriented databases) are a new direction for database management systems. This paper explains the differences to traditional row-oriented databases, goes into greater detail with MonetDB as a column-store database example and concludes with a comparative benchmark between PostgreSQL and MonetDB.

Chapter 2 starts with the introduction to column-store databases, together with a comparison to traditional row-based databases. Chapter 3 introduces MonetDB as an example of a column-store database. The implementation of the column-store theory in MonetDB is the topic of Chapter 4; this chapter also covers the best practices for working with MonetDB. Chapters 5, 6, and 7 cover the benchmark: while Chapter 5 explains how the given PostgreSQL benchmark implementation was ported to a MonetDB version, Chapter 6 goes into greater detail about the environment in which the benchmark was executed. Chapter 7 shows the results, together with diagrams and an explanation of how the differences between the PostgreSQL and MonetDB times arise. Finally, Chapter 8 gives a conclusion and summarizes the most important findings of this paper.

2. Column-Store Databases

2.1. Main differences to row-based databases

Most databases store data tables in a row-based format, i.e. in every table, each tuple follows the other: the values of all of a tuple's columns are stored sequentially for one row before the values for the next row. If there is a table with the columns 1, 2, 3, and 4, the database would internally store the values of columns 1 to 4 for the first tuple, followed by the values of columns 1 to 4 for the second tuple, and so on. This is shown in Figure 1.

Figure 1: The storage of a table in the row-based format

Because primary storage (e.g. a hard disk) is accessed in blocks of consecutive data, this means that full rows are loaded into CPU registers for processing.

Column-oriented databases follow the opposite storage pattern, i.e. vertical fragmentation: all values of one column are stored before the values of the next column. With a table with the columns 1, 2, 3, and 4, the database would internally store all values of column 1, then all values of column 2, and so on. This is shown in Figure 2.

Figure 2: The storage of a table in the column-oriented format

This allows aggregating or searching single columns without the need to load, and afterwards discard, the unneeded other columns in memory. More details about the concept and implementation of column stores can be found in [ABH+13]. This paper explains the implementation using MonetDB as the example in Chapter 4.

2.2. Advantages and Disadvantages to Row-based Databases

Column-oriented databases are faster when searching for values in a single column, because all values of a single column are saved together. This is especially noticeable with big tables, because more values fit together in one block, which means fewer hard disk accesses are needed. With row-based databases, each value is followed by unneeded values from other columns, so each hard disk block contains fewer values from the column to search in.

Additionally, column-oriented databases are faster when building aggregate values (e.g. sums) over few columns but many rows, because unlike with row-based databases, not every column has to be read. The values are stored together, so both the access time and the computation time are lower than when the values are spread out in the internal data storage.

Furthermore, column-oriented databases are faster when a column of every row has to be changed, because the values are stored next to each other, so single hard disk blocks can be overwritten instead of having to parse the whole table to find the values to change.
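As a rough illustration of these access patterns, consider the gnis table used later in this paper (the key value below is a made-up example). The first query aggregates a single column over all rows and therefore favors a column store; the second fetches every column of one row by its key and therefore favors a row store:

-- Column-store friendly: only the elevation column has to be read.
SELECT avg(elevation) FROM gnis;

-- Row-store friendly: all columns of a single row, found in one
-- consecutive block in a row-based layout (12345 is a hypothetical key).
SELECT * FROM gnis WHERE fid = 12345;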

Finally, column-oriented databases can be compressed better, because similar values are grouped together. If multiple tuples contain the same value for a column, the value only needs to be stored once together with the row numbers, instead of being repeated for every tuple. Most often, however, we are more interested in the speed savings than in the storage savings.

On the other side, row-oriented databases are faster when many columns of single rows are needed and when inserting new rows with values for every column.

In summary, it heavily depends on the data usage which storage system is faster. In general, column-oriented databases are preferred for statistical usage and when most data queries touch only a small subset of columns, whereas row-based databases are faster when most columns are generally read or when there are more insertions of new rows with many values than data queries.

3. MonetDB Introduction

MonetDB is an open-source column-store database management system. It has been developed at the CWI database architectures research group since 1993. As described in [IGN+12, Ch. 1], the primary target for MonetDB was warehouse applications; it is also used for e-science, in health care, in telecommunications and in sciences such as astronomy. MonetDB supports the SQL:2003 standard.[1]

[1] Supported SQL features can be found at [Mon15c], unsupported features at [Mon15d].

As a column-oriented database management system, it uses vertical fragmentation and has an execution engine tailored for columnar execution. It is designed to exploit modern hardware, e.g. the large main memories of modern computer systems, by deploying cache-conscious data structures and algorithms that make use of hierarchical memory systems. Notably, in contrast to most other database management systems, MonetDB is optimized to minimize CPU cache misses rather than I/Os, because it was found that CPU speed advances have outpaced advances in memory latency, as described in [BMK99]. It mainly focuses on analytical and scientific workloads that are read-dominated and where updates mostly consist of appending new data to the database in large chunks at a time.

One main algorithmic principle is supporting a priori unknown or rapidly changing workloads over large data volumes. Examples for this are the intermediate result caching technique called recycling and the adaptive indexing technique called database cracking, which are explained in more detail in Chapter 4. They require only minimal overhead to provide benefit for the actual workload and hot data.

MonetDB also supports extensibility. Both its core and the SQL syntax may be extended in C or in MonetDB's own MAL language. This allows efficient exploitation of domain-specific data characteristics or special application requirements that go beyond the SQL standard. This extensibility is not a topic of this paper.

4. MonetDB's Column-Store Implementation

This chapter goes into greater detail and briefly explains how the column-store principle is implemented in MonetDB, with explanations targeted at the database user. The MonetDB documentation can be found at [Mon15b], but it does not go into depth about the technical side of the implementation. More details on the technical side can be found in [IGN+12, Ch. 2 & 3] or, with a focus on spatial applications, in [VQKN08].

4.1. Vertical Fragmentation

As a column-store database, MonetDB's core concept is vertical fragmentation. Instead of storing all attributes of each relational tuple together in one record (which would be called a row store), each column is stored in a separate table, called a BAT (Binary Association Table). Each BAT has two physical columns: the left column holds the object identifiers (identifying the relational tuple), while the right column holds the actual values. Because the object identifier can be seen as an array index (where the actual values are the array's content), it is not materialized, which also saves storage space (and thus data access time), according to [ABH+13, Ch. 3.2].
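As an illustrative sketch (the values and layout below are made up for illustration and are not actual MonetDB storage syntax), a three-column slice of the gnis table would be split into one BAT per column, all sharing the same object identifiers:

-- Relational view (schematic):
--   tuple 0: fid = 12345, name = 'Alpha Peak', elevation = 810
--   tuple 1: fid = 12346, name = 'Beta Lake',  elevation = 1025
--
-- Stored as one BAT per column (oid -> value); the oids are not materialized:
--   gnis_fid:        0 -> 12345,        1 -> 12346
--   gnis_name:       0 -> 'Alpha Peak', 1 -> 'Beta Lake'
--   gnis_elevation:  0 -> 810,          1 -> 1025
--
-- A query that touches a single attribute therefore only reads one BAT:
SELECT elevation FROM gnis WHERE elevation > 1000;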

4.2. Execution Engine

The MonetDB execution engine evaluates queries with a low-level two-column relational algebra. Because it is known that all physical tables follow the same layout, this algebra can be highly optimized; the technical details can be found in [BK99]. The algebraic operations on the BATs are compiled to MonetDB's MAL language and executed with the operator-at-a-time principle: each operation is evaluated to completion over its entire input data before the subsequent data-dependent operation is executed. This allows exploiting the architecture of modern CPUs by cycling through tight loops over the same data types. (Traditional database management systems use a tuple-at-a-time model, where calculations are done on a per-tuple basis.)

A key aspect of the execution engine is its reliance on hardware-conscious algorithms. MonetDB uses its own algorithms, mainly for joins. The algorithms are optimized to make good use of the CPU cache in order to avoid the "memory wall", i.e. the bottleneck of main memory access time. The algorithms are designed to avoid CPU cache misses, which mainly means that random data accesses are restricted to regions that fit into the cache. To fulfill this requirement, MonetDB's optimizer creates a cost model which takes the memory access cost into account.

MonetDB uses late tuple materialization: the columns are converted back to tuples as late as possible. Every operation works only on BATs and produces new BATs in memory. This allows MonetDB to use a single data structure (the BAT) to manipulate widely different data sets. The algebra is simple (e.g., in contrast to traditional database management systems, the operator functions are not executed with so-called complex parameters) and thus very fast.

4.3. Recycling and Updates

The newly created intermediate BATs are kept as long as they fit into storage and as long as they are hot. This allows their reuse in similar queries. MonetDB avoids touching the base BATs and uses already created intermediate BATs whenever possible.

One main area where vertical fragmentation (column store) is slower than a row store is updates involving many columns, because even simple tuple insertions would require updates to multiple BATs. To avoid this performance problem, MonetDB uses update BATs: for each base BAT there are also update BATs in which only the changes to the base BAT are stored. When an operation uses a base BAT, it is joined with its update BATs. This allows MonetDB to postpone the actual update of the base BATs to a later moment, when multiple values can be changed at once.
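A minimal sketch of the kind of query pair that can benefit from recycling (the predicate value is made up; whether an intermediate is actually reused depends on the recycler's policies and on available memory): both statements filter gnis on the same county, so the intermediate BAT produced for the first WHERE clause is a candidate for reuse by the second query instead of rescanning the base column.

-- First query: produces an intermediate BAT of the tuples with county = 'Texas'.
SELECT count(*) FROM gnis WHERE county = 'Texas';

-- Similar follow-up query: only the aggregation differs, so the cached
-- intermediate for county = 'Texas' can be reused.
SELECT avg(elevation) FROM gnis WHERE county = 'Texas';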

4.4. Adaptive Indexes (Database Cracking)

As dynamic data storage environments often lack a priori workload knowledge and have little idle time to spend on reorganizing data (e.g. building indexes), traditional approaches to index building and maintenance do not apply to MonetDB. (MonetDB ignores index creation SQL statements.)

To solve this problem, MonetDB was the first implementation to use database cracking (as proposed and described in [IKM+07]). This technique adaptively, continuously and automatically creates and maintains indexes according to the workload at hand, without human intervention. This avoids both the need to know which indexes will be needed by future queries and the time spent maintaining indexes whose benefit is lower than their maintenance cost. With database cracking, indexes are created incrementally, partially and on demand. The more queries are processed, the more the relevant indexes are refined. An example for database cracking and what is meant by adaptive partial indexes can be found in [ABH+13, Ch. 4.8].
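To make the idea concrete, a minimal sketch (the range bounds are made up): the first execution of a range selection physically reorganizes ("cracks") the x column around the requested bounds as a side effect of answering the query; repeating the same or an overlapping selection can then skip most of the column, so the query gets faster with every repetition, and no CREATE INDEX statement is ever issued.

-- Run the same range selection repeatedly; no index is created manually.
SELECT count(*) FROM gnis WHERE x > -100.0 AND x < -95.0;
-- 1st run: scans the whole x column and cracks it around -100.0 and -95.0.
-- Later runs: only the matching piece of the cracked column is inspected,
-- so the response time drops with each repetition.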

4.5. Data Compression

Data compression in MonetDB is based on optimized storage structures instead of compression algorithms. As MonetDB's speed focus is on using fewer CPU cycles, the data is not altered for compression, which would trade the saved storage space for additional CPU cycles spent on decompression. The principles and effects behind this decision are described at [Mon15a].

Instead, most space savings are gained by using the smallest possible data structures (plain C arrays for the BATs): dense arrays with the least possible number of bytes per value (1 to 8 bytes). This allows efficient reading and direct mapping from storage into the CPU cache. No overhead is produced by the storage technique (as would be the case with B-trees and others), as the values directly follow each other.

For strings, MonetDB uses dictionary encoding. This saves storage space by only storing dictionary indexes in the BAT (again only 1 to 8 bytes per value) while allowing the same relational algebra on the physical BATs as with numbers. Only with large dictionaries do the maintenance costs grow higher than the query savings can justify, which is why MonetDB then switches to a non-compressed string representation.

4.6. Best Practices

From these core concepts follow some best practices for database administrators when working with MonetDB:

- Think about whether a column-store database management system really is the right choice for the application (i.e. mostly querying or running aggregation functions over single or few columns, data inserts mostly in big blocks).
- Query only the needed columns (e.g. no SELECT *).
- Avoid updating or inserting data with many columns. (It is slow.)
- Forget everything you know about indexes and let MonetDB handle them by itself.
- Use a 64-bit computer architecture, so that MonetDB can address more than 3 gigabytes of data at once.

With these best practices, it is possible to get the most potential (and the highest speed gains over traditional row-based database management systems) out of MonetDB. They show especially when working with huge (multiple gigabytes) data sets.

5. Benchmark Adaption From PostgreSQL to MonetDB

One main task for this paper was to run a MonetDB benchmark to draw a comparison to PostgreSQL. The PostgreSQL scripts given by Prof. Stefan Keller are in Appendix C. These scripts were adapted to MonetDB. This chapter explains the most important changes that were needed to create the benchmark implementation for MonetDB.

5.1. Dropping existing tables

A rather classical procedure when importing whole tables is to drop an existing table with the same name. This allows repeating the same statements multiple times without getting an error that the table already exists.

PostgreSQL:

DROP TABLE IF EXISTS gnis;

The IF EXISTS clause does not exist in MonetDB. It had to be deleted:

MonetDB:

DROP TABLE gnis;

This throws a warning on the first import, when the table does not exist yet, but not an error, and thus can be ignored.

5.2. Create tables

All data types and keywords used in the benchmark implementation for PostgreSQL also exist in MonetDB, so no changes were needed for the data types. The following data types and keywords were used:

- not null
- primary key
- integer
- double precision
- text
- character varying

Hence the table creation statements for PostgreSQL and MonetDB are identical for this benchmark.

PostgreSQL & MonetDB:

CREATE TABLE gnis (
  x double precision not null,
  y double precision not null,
  fid integer primary key,
  name text,
  class text,
  state text,
  county text,
  elevation integer,
  map text
);

5.3. Import from CSV

The syntax to import data from CSV into a database table differs between the two database management systems, because it is not part of the SQL standard.

PostgreSQL:

\COPY osm_poi_tag_ch FROM osm_poi_tag_ch.csv DELIMITER ';' QUOTE '"' CSV HEADER;

MonetDB:

COPY <n> RECORDS INTO osm_poi_tag_ch FROM
  '/path/osm_poi_tag_ch.csv' DELIMITERS ';', '\n', '"' NULL AS '';

Especially notable: MonetDB needs the number of rows to import, as well as an absolute (instead of a relative) path. Empty fields lead to an error, because the keyword null is expected for empty values. To allow the handling of empty fields as null values, an explicit NULL AS definition in the COPY statement is needed, as shown above. Furthermore, it is not possible to automatically ignore the first line when importing, which in this benchmark holds the column names. Thus the first line has to be removed from every CSV file.

5.4. echo

PostgreSQL:

\echo \n=== Table osm_poi_ch

There is no echo command in MonetDB. A SELECT was used for echoing strings.

MonetDB:

SELECT '\n=== Table osm_poi_ch';

5.5. Rename fields

PostgreSQL:

SELECT
  max(version) "version"
FROM osm_poi_ch;

In MonetDB, an explicit AS is needed to rename fields in a query:

MonetDB:

SELECT
  max(version) AS "version"
FROM osm_poi_ch;

6. Benchmark Execution Environment

The commands used to set up the database tables in PostgreSQL and MonetDB, to execute the benchmark, and their output on the command line are listed in Appendix E (PostgreSQL) and Appendix F (MonetDB).

6.1. Physical Machine

The physical machine was used to run the benchmark without too many other processes impacting the result. It represents a database server in this benchmark. The benchmark was run 3 times, taking the median time.

Hardware: Fujitsu Celsius W530, 3.40 GHz Intel Xeon CPU E v3, 16 GB 1600 MHz DDR3 RAM, 256 GB Samsung SSD 840
Software: Ubuntu Server 64-bit 15.10, PostgreSQL 9.4.5, MonetDB Jul2015-SP

6.2. Virtual Machine

The virtual machine was used as a development machine. Based on the feedback at the presentation of this paper, and since the times show interesting differences to the physical machine, these times were kept in this paper. They represent a low-memory computer after multiple executions. The benchmark was run 25 times, taking the median time of the last 3 runs. Because of the multiple runs, the query times are subject to database cracking.

Host hardware: MacBook Pro (Retina, Mid 2012), 2.3 GHz Intel Core i7, 16 GB 1600 MHz DDR3 RAM, 768 GB Apple SSD SM768E
Host software: OS X El Capitan Beta (15C47a), VMware Fusion
Guest simulated hardware: 1 core, 4096 MB RAM, 20 GB hard disk
Guest software: Ubuntu Server 64-bit, PostgreSQL, MonetDB Jul2015-SP1

7. Benchmark Execution Results

In this chapter the benchmark results are shown and explained. An overview of the result times (on the physical machine) is shown in Figure 3.

Figure 3: Benchmark results overview on physical machine, times in milliseconds

7.1. Table gnis

The first queries target the table gnis, a table containing points with x and y coordinates and information about these points. It contains ... tuples. In this part of the benchmark, each query is run twice, hence the two numbers for each category. Because of caching, the second run is faster every time. It is interesting to see that the virtual machine (which had more runs) is faster in every MonetDB query. Database cracking is easy to see here: because the queries are simple value comparisons, the BATs already have the automatic index applied due to the higher number of runs on the virtual machine.

Query 1a and 1b

SELECT name, county, state FROM gnis t WHERE t.fid = ...;

Physical machine: PostgreSQL ... ms, ... ms
Physical machine: MonetDB ... ms, ... ms
Virtual machine: PostgreSQL ... ms, ... ms
Virtual machine: MonetDB ... ms, ... ms

As this query is a primary key search, it simply reduces to an index search test. Since PostgreSQL already has the needed index built while MonetDB is still in the process of building the needed adaptive index, PostgreSQL is faster. A diagram of the result times (on the physical machine) is shown in Figure 4.

Figure 4: Diagram for queries 1a and 1b on physical machine (times in milliseconds)

Query 2a and 2b

SELECT name, county, state FROM gnis t WHERE t.county = 'Texas';

Physical machine: PostgreSQL ... ms, ... ms
Physical machine: MonetDB ... ms, ... ms
Virtual machine: PostgreSQL ... ms, ... ms
Virtual machine: MonetDB ... ms, ... ms

Here the column-oriented storage pays off: to search for one value, MonetDB can exploit its ability to load only the values of one column into the CPU cache. Also, MonetDB sees that this is a good candidate for an index, as there are multiple points with county 'Texas', and moves those tuples to the beginning of the BAT to build a partial index. A diagram of the result times (on the physical machine) is shown in Figure 5.

Figure 5: Diagram for queries 2a and 2b on physical machine (times in milliseconds)

Query 3a and 3b

SELECT avg(t.elevation) FROM gnis t
WHERE t.x > ... and t.y > ... and t.x < ... and t.y < 33.460;

Physical machine: PostgreSQL ... ms, ... ms
Physical machine: MonetDB ... ms, ... ms
Virtual machine: PostgreSQL ... ms, ... ms
Virtual machine: MonetDB ... ms, ... ms

Searching a big data set on few columns is the specialty of MonetDB. It can shine again with its adaptive indexes (as can be seen from the different times of the physical and the virtual MonetDB runs, even with the latter having restricted main memory). A diagram of the result times (on the physical machine) is shown in Figure 6.

Figure 6: Diagram for queries 3a and 3b on physical machine (times in milliseconds)

Query 4a and 4b

SELECT count(*), class FROM gnis GROUP BY class;

Physical machine: PostgreSQL ... ms, ... ms
Physical machine: MonetDB ... ms, ... ms
Virtual machine: PostgreSQL ... ms, ... ms
Virtual machine: MonetDB ... ms, ... ms

Same as above. A diagram of the result times (on the physical machine) is shown in Figure 7.

Figure 7: Diagram for queries 4a and 4b on physical machine (times in milliseconds)

7.2. Table osm_poi_ch

Table osm_poi_ch contains lat/lon coordinates with additional information. For the benchmark, this table is copied into three smaller tables: osm_poi_ch_1mio contains the first 1 million entries, osm_poi_ch_2mio contains the first 2 million entries, and osm_poi_ch_3mio contains the first 3 million entries. The benchmark queries use each table to see which effect the table size has.

Query 10x, 10a, 10b, and 10c

select id, version, lon, lat from osm_poi_ch_1mio where id = 'pt...';
-- Query 10b uses table osm_poi_ch_2mio
-- Query 10c uses table osm_poi_ch_3mio

Physical machine: PostgreSQL 1mio: ... ms, ... ms, 2mio: 0.152, 3mio: ...
Physical machine: MonetDB 1mio: ... ms, ... ms, 2mio: 0.820, 3mio: ...
Virtual machine: PostgreSQL 1mio: ... ms, ... ms, 2mio: 0.523, 3mio: ...
Virtual machine: MonetDB 1mio: ... ms, ... ms, 2mio: 0.759, 3mio: ...

This is the same case as the very first query: PostgreSQL is fast because it can use its prebuilt index, whereas MonetDB would have to build its index at query time and decides against doing so, because it would not be worth the time for just one result (it considers it improbable that the same id will be queried again shortly after). In this query, the 1mio table was queried twice (10x and 10a) to see the effect of result caching. A diagram of the result times (on the physical machine) is shown in Figure 8.

Figure 8: Diagram for queries 10a, 10b, and 10c on physical machine (times in milliseconds)

Query 11a, 11b, and 11c

select id, version, lon, lat from osm_poi_ch_1mio where version > 300 order by version desc limit 10;
-- Query 11b uses table osm_poi_ch_2mio
-- Query 11c uses table osm_poi_ch_3mio

Physical machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Physical machine: MonetDB 1mio: ... ms, 2mio: 3.872, 3mio: ...
Virtual machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...

Only one column to filter on and lots of data: the ideal world for MonetDB. A diagram of the result times (on the physical machine) is shown in Figure 9.

Figure 9: Diagram for queries 11a, 11b, and 11c on physical machine (times in milliseconds)

Query 12a, 12b, and 12c

select id, version, lon, lat
from osm_poi_ch_1mio
where lon > ... and lat > ... and lon < ... and lat < ...;
-- Query 12b uses table osm_poi_ch_2mio
-- Query 12c uses table osm_poi_ch_3mio

Physical machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Physical machine: MonetDB 1mio: ... ms, 2mio: 6.756, 3mio: ...
Virtual machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...

Even with two columns involved, MonetDB works a lot faster. Where PostgreSQL almost doubles its time compared to the last query, MonetDB's time stays almost the same. A diagram of the result times (on the physical machine) is shown in Figure 10.

Figure 10: Diagram for queries 12a, 12b, and 12c on physical machine (times in milliseconds)

Query 13a, 13b, and 13c

select count(id), uid
from osm_poi_ch_1mio
group by uid having count(id) > 1
order by 1 desc limit 10;
-- Query 13b uses table osm_poi_ch_2mio
-- Query 13c uses table osm_poi_ch_3mio

Physical machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Physical machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...

As with the last queries, this shows that MonetDB is fast, no matter how many elements the SQL query processes, as long as few columns are in use. A diagram of the result times (on the physical machine) is shown in Figure 11.

Figure 11: Diagram for queries 13a, 13b, and 13c on physical machine (times in milliseconds)

Query 14a, 14b, and 14c

select e.id, av2.value as name, av3.value as cuisine, lon, lat
from osm_poi_ch_1mio as e
join osm_poi_tag_ch as av on e.id = av.id
left outer join osm_poi_tag_ch as av2 on e.id = av2.id
left outer join osm_poi_tag_ch as av3 on e.id = av3.id
where
  av.key = 'amenity' and av.value = 'restaurant'
  and av2.key = 'name'
  and av3.key = 'cuisine'
  and e.lon > ... and e.lat > ... and e.lon < ... and e.lat < ...
order by name
limit 10;
-- Query 14b uses table osm_poi_ch_2mio
-- Query 14c uses table osm_poi_ch_3mio

Physical machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Physical machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: PostgreSQL 1mio: ... ms, 2mio: ..., 3mio: ...
Virtual machine: MonetDB 1mio: ... ms, 2mio: ..., 3mio: ...

As many columns are involved here, MonetDB cannot fully use its optimization for few columns and is slower than PostgreSQL (which makes use of its indexes on the ids). Interesting to see, however, is that the time rises much more slowly in MonetDB than in PostgreSQL: MonetDB uses its recycling technique to build the intermediate BATs for this query, which it can reuse for the av2 and av3 joins. Together with its database cracking on those intermediate BATs, it can profit hugely from its optimizations. A diagram of the result times (on the physical machine) is shown in Figure 12.

Figure 12: Diagram for queries 14a, 14b, and 14c on physical machine (times in milliseconds)

8. Conclusion

This paper explained column-store database management systems in detail, with MonetDB as the example. It explained what vertical fragmentation looks like in MonetDB, what BATs are, how recycling and updates work in MonetDB, and how database cracking and data compression are used. The benchmark in particular showed where the strengths of MonetDB lie and in which applications and queries it can outperform traditional database management systems. The benchmark showed the same timing patterns for similar queries:

For index searches, where one row is selected by its primary key, PostgreSQL is much faster, because it can use its prebuilt index. Because MonetDB is designed for unknown future queries, it does not build indexes when creating or filling a table. And because starting a partial index for a single row would take more time than it would save in the future, this time does not improve when repeating the query. In the benchmark, queries 1a, 1b, 10x, 10a, 10b, and 10c are examples of index searches. The timing differences for this type of query are shown in Figure 13.

Figure 13: Diagram showing index search query timing differences on the physical machine (times in milliseconds)

The second query category could be named value searches. Here the database tables are filtered by a single value, which returns multiple rows. It is important to note that PostgreSQL has no index here; it has to scan the whole table. It loads whole rows but only needs certain columns, which leads to multiple iterations of data loading. MonetDB, on the other side, can make use of vertical fragmentation and loads just the right column, so that more data fits into each loading step. Also, after filtering once, it starts saving the results as a partial index and thus gets much faster with every repetition, whereas PostgreSQL will never build an index on its own.

On the whole, this makes MonetDB much faster for this type of query, as shown in Figure 14, the timing overview for value search queries, using queries 2a/2b and 11a/11b/11c as examples.

Figure 14: Diagram showing value search query timing differences on the physical machine (times in milliseconds)

The last type of query one could categorize the benchmark queries into is multi-column, where many columns over multiple tables are used. In the benchmark, queries 14a/14b/14c are the example. This is slow in both database management systems. Interesting here is that MonetDB gets faster after each query because it can reuse its intermediate BATs, which makes it beat PostgreSQL over the three queries, even though query 14a was much slower. This is shown in Figure 15.

In conclusion, the two key points of this paper are: MonetDB is fast when working with lots[2] of data in a mostly read environment with few columns in single queries. And MonetDB is optimized for modern CPU architectures and for unknown workloads, where it is not known beforehand which queries will be run. As the benchmark shows, in these two areas MonetDB has the largest speed advantage over PostgreSQL. There are more queries with these characteristics in the benchmark, and they show the same results. In reality, for data warehouses or scientific data evaluations, this will be the most common type of query, and it shows why MonetDB is built on the principle of assuming unknown future queries and relying on database cracking.

[2] In this benchmark: a few hundred megabytes; the differences would be even clearer with bigger data sets of tens to hundreds of gigabytes.

Figure 15: Diagram showing multi column query timing differences on the physical machine (times in milliseconds)

References

[ABH+13] Daniel Abadi, Peter Boncz, Stavros Harizopoulos, Stratos Idreos, et al. The design and implementation of modern column-oriented database systems. Now Publishers, 2013.

[BK99] Peter A. Boncz and Martin L. Kersten. MIL primitives for querying a fragmented world. The VLDB Journal: The International Journal on Very Large Data Bases, 8(2), 1999.

[BMK99] Peter A. Boncz, Stefan Manegold, and Martin L. Kersten. Database architecture optimized for the new bottleneck: Memory access. In VLDB, volume 99, pages 54-65, 1999.

[IGN+12] Stratos Idreos, Fabian Groffen, Niels Nes, Stefan Manegold, et al. MonetDB: Two decades of research in column-oriented database architectures. Data Engineering, page 40, 2012.

[IKM+07] Stratos Idreos, Martin L. Kersten, Stefan Manegold, et al. Database cracking. In CIDR, volume 3, pages 1-8, 2007.

[Mon15a] MonetDB B.V. Data compression - MonetDB. www.monetdb.org/documentation/Guide/Compression. Last visited: February 18.

[Mon15b] MonetDB B.V. Documentation - MonetDB. www.monetdb.org/documentation. Last visited: February 18.

[Mon15c] MonetDB B.V. Supported SQL features - MonetDB. www.monetdb.org/documentation/Manuals/SQLreference/Features/Supported. Last visited: February 18.

[Mon15d] MonetDB B.V. Unsupported SQL features - MonetDB. www.monetdb.org/documentation/Manuals/SQLreference/Features/unsupported. Last visited: February 18.

[VQKN08] Maarten Vermeij, Wilko Quak, Martin Kersten, and Niels Nes. MonetDB, a novel spatial column-store DBMS. In Academic Proceedings of the 2008 Free and Open Source for Geospatial (FOSS4G) Conference, OSGeo, 2008.

A. List of Figures

1. The storage of a table in the row-based format
2. The storage of a table in the column-oriented format
3. Benchmark results overview on physical machine, times in milliseconds
4. Diagram for queries 1a and 1b on physical machine (times in milliseconds)
5. Diagram for queries 2a and 2b on physical machine (times in milliseconds)
6. Diagram for queries 3a and 3b on physical machine (times in milliseconds)
7. Diagram for queries 4a and 4b on physical machine (times in milliseconds)
8. Diagram for queries 10a, 10b, and 10c on physical machine (times in milliseconds)
9. Diagram for queries 11a, 11b, and 11c on physical machine (times in milliseconds)
10. Diagram for queries 12a, 12b, and 12c on physical machine (times in milliseconds)
11. Diagram for queries 13a, 13b, and 13c on physical machine (times in milliseconds)
12. Diagram for queries 14a, 14b, and 14c on physical machine (times in milliseconds)
13. Diagram showing index search query timing differences on the physical machine (times in milliseconds)
14. Diagram showing value search query timing differences on the physical machine (times in milliseconds)
15. Diagram showing multi column query timing differences on the physical machine (times in milliseconds)

B. Bachelor/Master Lab Lesson

One part of the task for this paper was designing a Bachelor/Master lab lesson, which is attached in this chapter.

B.1. Installation

For this lab, you need PostgreSQL and MonetDB. It is assumed that you have already installed PostgreSQL. For MonetDB, please go to the MonetDB download page.

There you will find the download instructions for every common operating system. If using a virtual machine for the lab, please ensure that the VM has at least 4 gigabytes of RAM and uses a 64-bit architecture. The statements given here assume MonetDB on Linux.

After installing MonetDB, create a database farm. In MonetDB, database farms are used to group databases, which allows having different storage paths for different databases.

$ monetdbd create mydbfarm
$ monetdbd start mydbfarm

Next, we need to create a database (note that we now use the monetdb client, i.e. no "d" at the end):

$ monetdb create labdb
$ monetdb release labdb

We can now connect to the database (assuming you kept the default user monetdb during the installation) with the following command:

$ mclient -u monetdb -d labdb

B.2. Create and Import

Create a table in both PostgreSQL and MonetDB with the following statement:

CREATE TABLE gnis (
  x double precision not null,
  y double precision not null,
  fid integer primary key,
  name text,
  class text,
  state text,
  county text,
  elevation integer,
  map text
);

In PostgreSQL, you can import the data for this table from the file gnis_names09.csv with the following command:

\COPY gnis FROM gnis_names09.csv DELIMITER ';' QUOTE '"' CSV HEADER;

Ex. 1: What does the corresponding statement look like for MonetDB? (Hint: There is no option to ignore the first row of the CSV; you have to delete it manually. You also have to delete ';' characters inside strings, because MonetDB cannot handle them.)

B.3. Insert, Update and Delete

To get accustomed to MonetDB and the table in use, please do the following. Because MonetDB uses standard SQL:2003, this should not be hard.

Ex. 2: Insert a new row. What does the statement look like?

Ex. 3: Update the elevation of that row. What does the statement look like?

Ex. 4: Delete the row. What does the statement look like?

B.4. Comparison

Consider the following statements:

SELECT name, county, state FROM gnis t WHERE t.fid = ...;

SELECT name, county, state FROM gnis t WHERE t.county = 'Texas';

SELECT avg(t.elevation) FROM gnis t
WHERE t.x > ... and t.y > ... and t.x < ... and t.y < 33.460;

Ex. 5: Think about each statement: for which statements do you expect PostgreSQL to be faster, and for which MonetDB?

Ex. 6: Execute the statements in both PostgreSQL and MonetDB and note down the execution times. Why is PostgreSQL or MonetDB faster for each statement?

B.5. Database Cracking

You have learned that MonetDB uses database cracking.

Ex. 7: What is database cracking? How do future queries on the same data get faster, and under which circumstances?

Ex. 8: For each of the queries 1 to 3, note down how much faster you expect it to be after 50 iterations.

Ex. 9: Execute each query 50 times and compare the 50th run with your estimates. Explain the differences.

C. PostgreSQL Scripts

C.1. gnis_load.sql

-- gnis_load.sql
-- Tested on PostgreSQL 9.4 using psql. SK

-- Create table gnis:
DROP TABLE IF EXISTS gnis;

CREATE TABLE gnis (
  x double precision not null,
  y double precision not null,
  fid integer primary key,
  name text,
  class text,
  state text,
  county text,
  elevation integer,
  map text
);

-- Copy data from CSV file to database:
\COPY gnis FROM gnis_names09.csv DELIMITER ';' QUOTE '"' CSV HEADER;

C.2. osm_poi_ch_load.sql

-- osm_poi_ch_load.sql
-- Tested on PostgreSQL 9.4 using psql. SK

\echo osm_poi_ch_loader

-- Create new table
DROP TABLE IF EXISTS osm_poi_ch;

CREATE TABLE osm_poi_ch (
  id character varying(64) not null,  -- no primary key since OSM id maybe not unique
  lastchanged character varying(35),
  changeset integer,
  version integer,
  uid integer,
  lon double precision not null,
  lat double precision not null
);

-- Copy data from CSV file to database
\COPY osm_poi_ch FROM osm_poi_ch.csv DELIMITER ';' QUOTE '"' CSV HEADER;

select count(*) from osm_poi_ch;

-- Create new table
DROP TABLE IF EXISTS osm_poi_tag_ch;

CREATE TABLE osm_poi_tag_ch (
  id character varying(64) not null,
  key text not null,
  value text
  -- primary key (id, key). It is not really true. A duplication found.
);

-- Copy data from CSV file to the temporary table
\COPY osm_poi_tag_ch FROM osm_poi_tag_ch.csv DELIMITER ';' QUOTE '"' CSV HEADER;

select count(*) from osm_poi_tag_ch;

C.3. bm_prepare.sql

-- Benchmark
-- Tested on PostgreSQL 9.4 using psql. SK

-- Requirements:
-- Tables gnis, osm_poi_ch and osm_poi_tag_ch exist and are loaded.

\echo Preparing tables. Pls. wait...

\timing on

\echo \n=== Table gnis
-- Preparing index:
DROP INDEX IF EXISTS gnis_fid_idx CASCADE;
CREATE UNIQUE INDEX gnis_fid_idx ON gnis (fid);
CLUSTER gnis USING gnis_fid_idx;
-- Refreshing statistics:
VACUUM FULL ANALYZE gnis;

\echo \n=== Table osm_poi_ch
DROP INDEX IF EXISTS osm_poi_ch_id_idx CASCADE;
CREATE INDEX osm_poi_ch_id_idx ON osm_poi_ch (id);
CLUSTER osm_poi_ch USING osm_poi_ch_id_idx;
VACUUM FULL ANALYZE osm_poi_ch;

\echo \n=== Table osm_poi_tag_ch
DROP INDEX IF EXISTS osm_poi_tag_ch_id_idx;
CREATE INDEX osm_poi_tag_ch_id_idx ON osm_poi_tag_ch (id);  -- 103 sec
CLUSTER osm_poi_tag_ch USING osm_poi_tag_ch_id_idx;
VACUUM FULL ANALYZE osm_poi_tag_ch;

\echo \n=== Table osm_poi_ch_3mio
DROP TABLE IF EXISTS osm_poi_ch_3mio CASCADE;
CREATE TABLE osm_poi_ch_3mio AS
  select
    id,
    max(version) "version",
    max(lastchanged) lastchanged,
    max(uid) uid,
    max(changeset) changeset,
    max(lon) lon,
    max(lat) lat
  from osm_poi_ch
  group by id
  ORDER BY 1 LIMIT 3000000;
ALTER TABLE osm_poi_ch_3mio ADD CONSTRAINT osm_poi_ch_3mio_pk PRIMARY KEY(id);  -- 47 sec
CREATE UNIQUE INDEX osm_poi_ch_3mio_pk_idx ON osm_poi_ch_3mio (id);  -- 38 sec
CLUSTER osm_poi_ch_3mio USING osm_poi_ch_3mio_pk_idx;  -- 112 sec
VACUUM FULL ANALYZE osm_poi_ch_3mio;

\echo \n=== Table osm_poi_ch_2mio
DROP TABLE IF EXISTS osm_poi_ch_2mio CASCADE;
CREATE TABLE osm_poi_ch_2mio AS
  SELECT * FROM osm_poi_ch_3mio
  ORDER BY 1 LIMIT 2000000;
ALTER TABLE osm_poi_ch_2mio ADD CONSTRAINT osm_poi_ch_2mio_pk PRIMARY KEY(id);
CREATE UNIQUE INDEX osm_poi_ch_2mio_pk_idx ON osm_poi_ch_2mio (id);
CLUSTER osm_poi_ch_2mio USING osm_poi_ch_2mio_pk_idx;
VACUUM FULL ANALYZE osm_poi_ch_2mio;

\echo \n=== Table osm_poi_ch_1mio
DROP TABLE IF EXISTS osm_poi_ch_1mio CASCADE;
CREATE TABLE osm_poi_ch_1mio AS
  SELECT * FROM osm_poi_ch_3mio
  ORDER BY 1 LIMIT 1000000;
ALTER TABLE osm_poi_ch_1mio ADD CONSTRAINT osm_poi_ch_1mio_pk PRIMARY KEY(id);
CREATE UNIQUE INDEX osm_poi_ch_1mio_pk_idx ON osm_poi_ch_1mio (id);
CLUSTER osm_poi_ch_1mio USING osm_poi_ch_1mio_pk_idx;
VACUUM FULL ANALYZE osm_poi_ch_1mio;

\echo \nok.

C.4. bm.sql

-- Benchmark
-- Tested on PostgreSQL 9.4 using psql. SK

-- Local configuration
\pset format
\pset pager off

-- Redirect query output to file
\set OUTFILE bm_out.txt
\o :OUTFILE

-- This is a dummy query to fill the cache with (other) tuples
SELECT count(*) FROM osm_poi_tag_ch;

\echo \n=== Table gnis

-- Simple equality search with a single tuple in the return set
\timing off
SELECT count(*) FROM osm_poi_tag_ch;
\timing on
\echo ;1a
SELECT name, county, state FROM gnis t WHERE t.fid = ...;
\echo ;1b
SELECT name, county, state FROM gnis t WHERE t.fid = ...;

-- Simple equality search on county 'Texas':
\timing off
SELECT count(*) FROM osm_poi_tag_ch;
\timing on
\echo ;2a
SELECT name, county, state FROM gnis t WHERE t.county = 'Texas';
\echo ;2b
SELECT name, county, state FROM gnis t WHERE t.county = 'Texas';

-- Range search with aggregate function
\timing off

SELECT count(*) FROM osm_poi_tag_ch;
\timing on
\echo ;3a
SELECT avg(t.elevation)::int FROM gnis t
  WHERE t.x > ... and t.y > ... and t.x < ... and t.y < 33.460;
\echo ;3b
SELECT avg(t.elevation)::int FROM gnis t
  WHERE t.x > ... and t.y > ... and t.x < ... and t.y < 33.460;

-- Group by query
\timing off
SELECT count(*) FROM osm_poi_tag_ch;
\timing on
\echo ;4a
SELECT count(*), class FROM gnis GROUP BY class
  ORDER BY 1 DESC;
\echo ;4b
SELECT count(*), class FROM gnis GROUP BY class
  ORDER BY 1 DESC;

\echo \n=== Table osm_poi_ch

-- Query with equality condition
\timing off
SELECT count(*) FROM gnis;
\timing on
\echo ;10x
select id, version, lon, lat from osm_poi_ch_1mio where id = 'pt...';
\echo ;10a
select id, version, lon, lat from osm_poi_ch_1mio where id = 'pt...';
\echo ;10b
select id, version, lon, lat from osm_poi_ch_2mio where id = 'pt...';
\echo ;10c
select id, version, lon, lat from osm_poi_ch_3mio where id = 'pt...';

-- Query with range condition
\timing off
SELECT count(*) FROM gnis;
\timing on

\echo ;11a
select id, version, lon, lat from osm_poi_ch_1mio where version > 300 order by version desc
  limit 10;
\echo ;11b
select id, version, lon, lat from osm_poi_ch_2mio where version > 300 order by version desc
  limit 10;
\echo ;11c
select id, version, lon, lat from osm_poi_ch_3mio where version > 300 order by version desc
  limit 10;

-- Query with range condition II.
\timing off
SELECT count(*) FROM gnis;
\timing on
\echo ;12a
select id, version, lon, lat
from osm_poi_ch_1mio
where lon > ... and lat > ... and lon < ... and lat < ...
order by version desc;
\echo ;12b
select id, version, lon, lat
from osm_poi_ch_2mio
where lon > ... and lat > ... and lon < ... and lat < ...
order by version desc;
\echo ;12c
select id, version, lon, lat
from osm_poi_ch_3mio
where lon > ... and lat > ... and lon < ... and lat < ...
order by version desc;

-- Query with group by
\timing off
SELECT count(*) FROM gnis;
\timing on
\echo ;13a
select count(id), uid
from osm_poi_ch_1mio
group by uid having count(id) > 1
order by 1 desc limit 10;

\echo ;13b
select count(id), uid
from osm_poi_ch_2mio
group by uid having count(id) > 1
order by 1 desc limit 10;
\echo ;13c
select count(id), uid
from osm_poi_ch_3mio
group by uid having count(id) > 1
order by 1 desc limit 10;

-- Query with 3 joins
-- All restaurants with id, name and cuisine type (if available):
\timing off
SELECT count(*) FROM gnis;
\timing on
\echo ;14a
select e.id, av2.value as name, av3.value as cuisine, lon, lat
from osm_poi_ch_1mio as e
join osm_poi_tag_ch as av on e.id = av.id
left outer join osm_poi_tag_ch as av2 on e.id = av2.id
left outer join osm_poi_tag_ch as av3 on e.id = av3.id
where
  av.key = 'amenity' and av.value = 'restaurant'
  and av2.key = 'name'
  and av3.key = 'cuisine'
  and e.lon > ... and e.lat > ... and e.lon < ... and e.lat < ...
order by name
limit 10;
\echo ;14b
select e.id, av2.value as name, av3.value as cuisine, lon, lat
from osm_poi_ch_2mio as e
join osm_poi_tag_ch as av on e.id = av.id
left outer join osm_poi_tag_ch as av2 on e.id = av2.id
left outer join osm_poi_tag_ch as av3 on e.id = av3.id
where
  av.key = 'amenity' and av.value = 'restaurant'
  and av2.key = 'name'
  and av3.key = 'cuisine'
  and e.lon > ... and e.lat > ... and e.lon < ... and e.lat < ...


More information

Arup Nanda Starwood Hotels

Arup Nanda Starwood Hotels Arup Nanda Starwood Hotels Why Analyze The Database is Slow! Storage, CPU, memory, runqueues all affect the performance Know what specifically is causing them to be slow To build a profile of the application

More information

Introduction to Randomized Algorithms III

Introduction to Randomized Algorithms III Introduction to Randomized Algorithms III Joaquim Madeira Version 0.1 November 2017 U. Aveiro, November 2017 1 Overview Probabilistic counters Counting with probability 1 / 2 Counting with probability

More information

Reaxys Pipeline Pilot Components Installation and User Guide

Reaxys Pipeline Pilot Components Installation and User Guide 1 1 Reaxys Pipeline Pilot components for Pipeline Pilot 9.5 Reaxys Pipeline Pilot Components Installation and User Guide Version 1.0 2 Introduction The Reaxys and Reaxys Medicinal Chemistry Application

More information

Impression Store: Compressive Sensing-based Storage for. Big Data Analytics

Impression Store: Compressive Sensing-based Storage for. Big Data Analytics Impression Store: Compressive Sensing-based Storage for Big Data Analytics Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda & Zheng Zhang Microsoft Research The Curse of O(N) in

More information

StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory

StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory S.V. N. (vishy) Vishwanathan Purdue University and Microsoft vishy@purdue.edu October 9, 2012 S.V. N. Vishwanathan (Purdue,

More information

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 Today we ll talk about a topic that is both very old (as far as computer science

More information

CS 700: Quantitative Methods & Experimental Design in Computer Science

CS 700: Quantitative Methods & Experimental Design in Computer Science CS 700: Quantitative Methods & Experimental Design in Computer Science Sanjeev Setia Dept of Computer Science George Mason University Logistics Grade: 35% project, 25% Homework assignments 20% midterm,

More information

Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models

Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models Chengjie Qin 1, Martin Torres 2, and Florin Rusu 2 1 GraphSQL, Inc. 2 University of California Merced August 31, 2017 Machine

More information

Homework Assignment 2. Due Date: October 17th, CS425 - Database Organization Results

Homework Assignment 2. Due Date: October 17th, CS425 - Database Organization Results Name CWID Homework Assignment 2 Due Date: October 17th, 2017 CS425 - Database Organization Results Please leave this empty! 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.15 2.16 2.17 2.18 2.19 Sum

More information

A Comparison Between MongoDB and MySQL Document Store Considering Performance

A Comparison Between MongoDB and MySQL Document Store Considering Performance A Comparison Between MongoDB and MySQL Document Store Considering Performance Erik Andersson and Zacharias Berggren Erik Andersson and Zacharias Berggren VT 2017 Examensarbete, 15 hp Supervisor: Kai-Florian

More information

Geography 281 Map Making with GIS Project Four: Comparing Classification Methods

Geography 281 Map Making with GIS Project Four: Comparing Classification Methods Geography 281 Map Making with GIS Project Four: Comparing Classification Methods Thematic maps commonly deal with either of two kinds of data: Qualitative Data showing differences in kind or type (e.g.,

More information

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017 HYCOM and Navy ESPC Future High Performance Computing Needs Alan J. Wallcraft COAPS Short Seminar November 6, 2017 Forecasting Architectural Trends 3 NAVY OPERATIONAL GLOBAL OCEAN PREDICTION Trend is higher

More information

Using the File Geodatabase API. Lance Shipman David Sousa

Using the File Geodatabase API. Lance Shipman David Sousa Using the File Geodatabase API Lance Shipman David Sousa Overview File Geodatabase API - Introduction - Supported Tasks - API Overview - What s not supported - Updates - Demo File Geodatabase API Provide

More information

Administering your Enterprise Geodatabase using Python. Jill Penney

Administering your Enterprise Geodatabase using Python. Jill Penney Administering your Enterprise Geodatabase using Python Jill Penney Assumptions Basic knowledge of python Basic knowledge enterprise geodatabases and workflows You want code Please turn off or silence cell

More information

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering TIMING ANALYSIS Overview Circuits do not respond instantaneously to input changes

More information

The conceptual view. by Gerrit Muller University of Southeast Norway-NISE

The conceptual view. by Gerrit Muller University of Southeast Norway-NISE by Gerrit Muller University of Southeast Norway-NISE e-mail: gaudisite@gmail.com www.gaudisite.nl Abstract The purpose of the conceptual view is described. A number of methods or models is given to use

More information

Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc.

Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc. Finite State Machines Introduction Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc. Such devices form

More information

Lineage implementation in PostgreSQL

Lineage implementation in PostgreSQL Lineage implementation in PostgreSQL Andrin Betschart, 09-714-882 Martin Leimer, 09-728-569 3. Oktober 2013 Contents Contents 1. Introduction 3 2. Lineage computation in TPDBs 4 2.1. Lineage......................................

More information

High-Performance Scientific Computing

High-Performance Scientific Computing High-Performance Scientific Computing Instructor: Randy LeVeque TA: Grady Lemoine Applied Mathematics 483/583, Spring 2011 http://www.amath.washington.edu/~rjl/am583 World s fastest computers http://top500.org

More information

High-performance Technical Computing with Erlang

High-performance Technical Computing with Erlang High-performance Technical Computing with Erlang Alceste Scalas Giovanni Casu Piero Pili Center for Advanced Studies, Research and Development in Sardinia ACM ICFP 2008 Erlang Workshop September 27th,

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST.2016.2.1 COMPUTER SCIENCE TRIPOS Part IA Tuesday 31 May 2016 1.30 to 4.30 COMPUTER SCIENCE Paper 2 Answer one question from each of Sections A, B and C, and two questions from Section D. Submit the

More information

Moving into the information age: From records to Google Earth

Moving into the information age: From records to Google Earth Moving into the information age: From records to Google Earth David R. R. Smith Psychology, School of Life Sciences, University of Hull e-mail: davidsmith.butterflies@gmail.com Introduction Many of us

More information

CS 243 Lecture 11 Binary Decision Diagrams (BDDs) in Pointer Analysis

CS 243 Lecture 11 Binary Decision Diagrams (BDDs) in Pointer Analysis CS 243 Lecture 11 Binary Decision Diagrams (BDDs) in Pointer Analysis 1. Relations in BDDs 2. Datalog -> Relational Algebra 3. Relational Algebra -> BDDs 4. Context-Sensitive Pointer Analysis 5. Performance

More information

Geodatabase Best Practices. Dave Crawford Erik Hoel

Geodatabase Best Practices. Dave Crawford Erik Hoel Geodatabase Best Practices Dave Crawford Erik Hoel Geodatabase best practices - outline Geodatabase creation Data ownership Data model Data configuration Geodatabase behaviors Data integrity and validation

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

What are the five components of a GIS? A typically GIS consists of five elements: - Hardware, Software, Data, People and Procedures (Work Flows)

What are the five components of a GIS? A typically GIS consists of five elements: - Hardware, Software, Data, People and Procedures (Work Flows) LECTURE 1 - INTRODUCTION TO GIS Section I - GIS versus GPS What is a geographic information system (GIS)? GIS can be defined as a computerized application that combines an interactive map with a database

More information

Data Structures. Outline. Introduction. Andres Mendez-Vazquez. December 3, Data Manipulation Examples

Data Structures. Outline. Introduction. Andres Mendez-Vazquez. December 3, Data Manipulation Examples Data Structures Introduction Andres Mendez-Vazquez December 3, 2015 1 / 53 Outline 1 What the Course is About? Data Manipulation Examples 2 What is a Good Algorithm? Sorting Example A Naive Algorithm Counting

More information

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan ArcGIS GeoAnalytics Server: An Introduction Sarah Ambrose and Ravi Narayanan Overview Introduction Demos Analysis Concepts using GeoAnalytics Server GeoAnalytics Data Sources GeoAnalytics Server Administration

More information

Factorized Relational Databases Olteanu and Závodný, University of Oxford

Factorized Relational Databases   Olteanu and Závodný, University of Oxford November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST0.2017.2.1 COMPUTER SCIENCE TRIPOS Part IA Thursday 8 June 2017 1.30 to 4.30 COMPUTER SCIENCE Paper 2 Answer one question from each of Sections A, B and C, and two questions from Section D. Submit the

More information

6.830 Lecture 11. Recap 10/15/2018

6.830 Lecture 11. Recap 10/15/2018 6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking

More information

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Why analysis? We want to predict how the algorithm will behave (e.g. running time) on arbitrary inputs, and how it will

More information

csci 210: Data Structures Program Analysis

csci 210: Data Structures Program Analysis csci 210: Data Structures Program Analysis 1 Summary Summary analysis of algorithms asymptotic analysis big-o big-omega big-theta asymptotic notation commonly used functions discrete math refresher READING:

More information

Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics

Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics Chengjie Qin 1 and Florin Rusu 2 1 GraphSQL, Inc. 2 University of California Merced June 29, 2017 Machine Learning (ML) Is

More information

Computation Theory Finite Automata

Computation Theory Finite Automata Computation Theory Dept. of Computing ITT Dublin October 14, 2010 Computation Theory I 1 We would like a model that captures the general nature of computation Consider two simple problems: 2 Design a program

More information

In-Database Factorised Learning fdbresearch.github.io

In-Database Factorised Learning fdbresearch.github.io In-Database Factorised Learning fdbresearch.github.io Mahmoud Abo Khamis, Hung Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich December 2017 Logic for Data Science Seminar Alan Turing Institute

More information

Aalto University 2) University of Oxford

Aalto University 2) University of Oxford RFID-Based Logistics Monitoring with Semantics-Driven Event Processing Mikko Rinne 1), Monika Solanki 2) and Esko Nuutila 1) 23rd of June 2016 DEBS 2016 1) Aalto University 2) University of Oxford Scenario:

More information

Prosurv LLC Presents

Prosurv LLC Presents Prosurv LLC Presents An Enterprise-Level Geo-Spatial Data Visualizer Part IV Upload Data Upload Data Click the Upload Data menu item to access the uploading data page. Step #1: Select a Project All Projects

More information

41. Sim Reactions Example

41. Sim Reactions Example HSC Chemistry 7.0 41-1(6) 41. Sim Reactions Example Figure 1: Sim Reactions Example, Run mode view after calculations. General This example contains instruction how to create a simple model. The example

More information

CPU Scheduling. CPU Scheduler

CPU Scheduling. CPU Scheduler CPU Scheduling These slides are created by Dr. Huang of George Mason University. Students registered in Dr. Huang s courses at GMU can make a single machine readable copy and print a single copy of each

More information

OBEUS. (Object-Based Environment for Urban Simulation) Shareware Version. Itzhak Benenson 1,2, Slava Birfur 1, Vlad Kharbash 1

OBEUS. (Object-Based Environment for Urban Simulation) Shareware Version. Itzhak Benenson 1,2, Slava Birfur 1, Vlad Kharbash 1 OBEUS (Object-Based Environment for Urban Simulation) Shareware Version Yaffo model is based on partition of the area into Voronoi polygons, which correspond to real-world houses; neighborhood relationship

More information

Scripting Languages Fast development, extensible programs

Scripting Languages Fast development, extensible programs Scripting Languages Fast development, extensible programs Devert Alexandre School of Software Engineering of USTC November 30, 2012 Slide 1/60 Table of Contents 1 Introduction 2 Dynamic languages A Python

More information

Supplementary Material

Supplementary Material Supplementary Material Contents 1 Keywords of GQL 2 2 The GQL grammar 3 3 THE GQL user guide 4 3.1 The environment........................................... 4 3.2 GQL projects.............................................

More information

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Raman

More information

Algorithms for Data Science

Algorithms for Data Science Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based

More information

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins 11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,

More information

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You

More information

CHAPTER 2 EXTRACTION OF THE QUADRATICS FROM REAL ALGEBRAIC POLYNOMIAL

CHAPTER 2 EXTRACTION OF THE QUADRATICS FROM REAL ALGEBRAIC POLYNOMIAL 24 CHAPTER 2 EXTRACTION OF THE QUADRATICS FROM REAL ALGEBRAIC POLYNOMIAL 2.1 INTRODUCTION Polynomial factorization is a mathematical problem, which is often encountered in applied sciences and many of

More information

Fundamentals of Computational Science

Fundamentals of Computational Science Fundamentals of Computational Science Dr. Hyrum D. Carroll August 23, 2016 Introductions Each student: Name Undergraduate school & major Masters & major Previous research (if any) Why Computational Science

More information

Geodatabase An Introduction

Geodatabase An Introduction 2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop Geodatabase An Introduction David Crawford and Jonathan Murphy Session Path The Geodatabase What is it?

More information

csci 210: Data Structures Program Analysis

csci 210: Data Structures Program Analysis csci 210: Data Structures Program Analysis Summary Topics commonly used functions analysis of algorithms experimental asymptotic notation asymptotic analysis big-o big-omega big-theta READING: GT textbook

More information

B629 project - StreamIt MPI Backend. Nilesh Mahajan

B629 project - StreamIt MPI Backend. Nilesh Mahajan B629 project - StreamIt MPI Backend Nilesh Mahajan March 26, 2013 Abstract StreamIt is a language based on the dataflow model of computation. StreamIt consists of computation units called filters connected

More information

Analysis of Algorithms

Analysis of Algorithms Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Wiley, 2014 Analysis of Algorithms Input Algorithm Analysis

More information

GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications

GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications Christopher Rodrigues, David J. Hardy, John E. Stone, Klaus Schulten, Wen-Mei W. Hwu University of Illinois at Urbana-Champaign

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST.2016.6.1 COMPUTER SCIENCE TRIPOS Part IB Thursday 2 June 2016 1.30 to 4.30 COMPUTER SCIENCE Paper 6 Answer five questions. Submit the answers in five separate bundles, each with its own cover sheet.

More information

1 ListElement l e = f i r s t ; / / s t a r t i n g p o i n t 2 while ( l e. next!= n u l l ) 3 { l e = l e. next ; / / next step 4 } Removal

1 ListElement l e = f i r s t ; / / s t a r t i n g p o i n t 2 while ( l e. next!= n u l l ) 3 { l e = l e. next ; / / next step 4 } Removal Präsenzstunden Today In the same room as in the first week Assignment 5 Felix Friedrich, Lars Widmer, Fabian Stutz TA lecture, Informatics II D-BAUG March 18, 2014 HIL E 15.2 15:00-18:00 Timon Gehr (arriving

More information

Universal Turing Machine. Lecture 20

Universal Turing Machine. Lecture 20 Universal Turing Machine Lecture 20 1 Turing Machine move the head left or right by one cell read write sequentially accessed infinite memory finite memory (state) next-action look-up table Variants don

More information

GIS-BASED DISASTER WARNING SYSTEM OF LOW TEMPERATURE AND SPARE SUNLIGHT IN GREENHOUSE

GIS-BASED DISASTER WARNING SYSTEM OF LOW TEMPERATURE AND SPARE SUNLIGHT IN GREENHOUSE GIS-BASED DISASTER WARNING SYSTEM OF LOW TEMPERATURE AND SPARE SUNLIGHT IN GREENHOUSE 1,2,* 1,2 Ruijiang Wei, Chunqiang Li, Xin Wang 1, 2 1 Hebei Provincial Institute of Meteorology, Shijiazhuang, Hebei

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Map-Reduce Denis Helic KTI, TU Graz Oct 24, 2013 Denis Helic (KTI, TU Graz) KDDM1 Oct 24, 2013 1 / 82 Big picture: KDDM Probability Theory Linear Algebra

More information

McBits: Fast code-based cryptography

McBits: Fast code-based cryptography McBits: Fast code-based cryptography Peter Schwabe Radboud University Nijmegen, The Netherlands Joint work with Daniel Bernstein, Tung Chou December 17, 2013 IMA International Conference on Cryptography

More information

From BASIS DD to Barista Application in Five Easy Steps

From BASIS DD to Barista Application in Five Easy Steps Y The steps are: From BASIS DD to Barista Application in Five Easy Steps By Jim Douglas our current BASIS Data Dictionary is perfect raw material for your first Barista-brewed application. Barista facilitates

More information