Environment (Parallelizing Query Optimization)

1 Advanced Query Optimization Techniques in a Parallel Computing Environment (Parallelizing Query Optimization). Wook-Shin Han*, Wooseong Kwak, Jinsoo Lee (Kyungpook National University); Guy M. Lohman, Volker Markl (IBM Almaden Research Center)

3 Introduction to Our Recent Query Optimization Research Efforts. Progressive optimization in a shared-nothing parallel database (SIGMOD 2007); Parallelizing query optimization (VLDB 2008). Time-series Database & Ranking: Ranked Subsequence Matching (VLDB 2007). XML Query Processing: StreamTX: Extracting Tuples from Streaming XML Data (VLDB 2008); Mapping-driven XML transformation (WWW 2007); The Dynamic Predicate: Integrating Access Control with Query Processing in XML Databases (VLDB Journal 2007). Moving Object Databases: Cost-Based Predictive Spatio-Temporal Join (IEEE TKDE 2008 (accepted)).

4 My Necessary Conditions for Your Papers to Be Accepted in SIGMOD/VLDB/ICDE. COND1) Solve REAL problems. This gives a very strong motivation! COND2) Solve a SMALL but NOVEL problem. Your paper must be focused and distinguishable! COND3) Do NOT BLAME reviewers. The faults are yours, not the reviewers'! Do not ignore their reviews, even the ones that seem absurd (usually, only to you!).

5 Success of Relational Database Management Systems. Standardization of the SQL query language. Development of sophisticated query optimizers, which enumerate many alternative query execution plans (QEPs) using dynamic programming (DP), estimate the cost of each, and choose the least expensive plan to execute.

6 Motivation. As the number of joins increases, the number of alternative QEPs considered by DP grows exponentially. Approximate search algorithms reduce the enumeration time, but they can produce sub-optimal plans.

7 Motivation (Cont'd). A new wave of multi-core processors is arriving, so a natural way to speed up CPU-bound query optimization is to parallelize it! In the typical parallel DBMS, only a single coordinator node optimizes the query!

8 Problem Definition. Parallelize DP-based query optimization to exploit multi-core processor architectures. The DP algorithm used in join enumeration belongs to the non-serial polyadic DP class [8].

9 Contributions. The first framework for parallel DP optimization that generates optimal plans. A parallel join enumeration algorithm, along with various allocation algorithms. A skip vector array that speeds up our parallel join enumeration algorithm. A formal analysis of why the various allocation schemes generate different sizes of search spaces among threads. Extensive experiments showing that the parallel algorithms allocate the work to threads evenly, and that the enhanced algorithm using the skip vector array outperforms the conventional DP algorithm by up to orders of magnitude.

10 A Running Example

11 Overview of Serial DP-based Optimization. QEPs for single tables. QEPs for joins.
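
The two stages named on this slide (plans for single tables first, then plans for joins of increasing size) can be sketched as follows. This is a minimal illustration and not the optimizer used in the talk: the function names, the toy cost model (cost = sum of intermediate result sizes), the fixed selectivity of 0.1 per join predicate, and the example tables are all assumptions made for the sketch. It also assumes the join graph is connected.

```python
from itertools import combinations

def edges_between(left, right, edges):
    """Count join predicates with one endpoint in `left` and the other in `right`."""
    return sum(1 for e in edges if len(e & left) == 1 and len(e & right) == 1)

def serial_dp(cards, edges, sel=0.1):
    """Size-driven DP join enumeration over a toy cost model (assumed, not from the talk).

    cards: table name -> cardinality; edges: set of frozensets {t1, t2}.
    Returns (cost, cardinality, plan) for the join of all tables.
    """
    # Stage 1: QEPs for single tables.
    best = {frozenset([t]): (0.0, float(c), t) for t, c in cards.items()}
    tables = sorted(cards)
    # Stage 2: QEPs for joins, built from smaller sub-plans in order of size.
    for size in range(2, len(tables) + 1):
        for combo in combinations(tables, size):
            qs = frozenset(combo)
            for k in range(1, size):
                for left_combo in combinations(combo, k):
                    left = frozenset(left_combo)
                    right = qs - left
                    if left not in best or right not in best:
                        continue  # one side has no plan (it was not connected)
                    n = edges_between(left, right, edges)
                    if n == 0:
                        continue  # no join predicate: skip cross products
                    lcost, lcard, lplan = best[left]
                    rcost, rcard, rplan = best[right]
                    card = lcard * rcard * sel ** n
                    cost = lcost + rcost + card
                    if qs not in best or cost < best[qs][0]:
                        best[qs] = (cost, card, (lplan, rplan))
    return best[frozenset(tables)]

# Hypothetical chain query A - B - C.
cards = {"A": 100, "B": 10, "C": 1000}
edges = {frozenset("AB"), frozenset("BC")}
cost, card, plan = serial_dp(cards, edges)
print(cost, plan)
```

Under this toy model the cheapest plan joins A and B first, because that intermediate result is ten times smaller than the result of joining B and C.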

12 Overview of Our Solution. Goals: partition the search space evenly among threads; process each partition independently, without any dependencies among threads. Key insights: When sub-problems are partitioned by size, sub-problems of the same resulting size are mutually independent. As the number of quantifiers increases, the number of sub-problems of the same size grows exponentially. Each sub-problem of size S (= smallsz + largesz) is constructed by combining one smaller sub-problem of size smallsz with another sub-problem of size largesz.
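
As a rough sketch of the last insight, the code below lists the (smallsz, largesz) splits for a target size S and generates the candidate pairs from plan partitions grouped by size. The helper names are hypothetical. The partitions are an assumption based on the four-quantifier plan-partition example in this deck: quantifiers are written as integers, and P1, P2, P3 hold the quantifier sets of sizes 1, 2, and 3.

```python
def size_splits(S):
    """All (smallsz, largesz) with smallsz <= largesz and smallsz + largesz == S."""
    return [(small, S - small) for small in range(1, S // 2 + 1)]

def candidate_pairs(partitions, S):
    """Pair every entry of P[smallsz] with every entry of P[largesz].

    No filtering happens here; overlapping pairs are still included.
    """
    pairs = []
    for small, large in size_splits(S):
        for a in partitions[small]:
            for b in partitions[large]:
                pairs.append((a, b))
    return pairs

# Plan partitions keyed by size (assumed from the four-quantifier example).
partitions = {
    1: [frozenset({1}), frozenset({2}), frozenset({3}), frozenset({4})],
    2: [frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 4})],
    3: [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 3, 4})],
}

print(size_splits(4))                      # splits used to build P4
print(len(candidate_pairs(partitions, 4)))  # 4*3 pairs from P1 x P3 plus 3*3 from P2 x P2
```

Because every pair reads only from partitions of smaller size, the pairs for one target size can be examined in any order, which is what makes it possible to hand them to different threads.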

13 Overview of Our Solution (Cont'd). Transform the enumeration problem into multiple theta joins, called multiple plan joins (MPJs), with the disjoint and connectivity filters as join conditions. Each MPJ is then parallelized using multiple threads, without any dependencies between the threads. By judiciously allocating portions of the MPJ search space to threads, we can achieve linear speed-up.
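
A minimal sketch of one MPJ as a nested-loop theta join whose join condition is the conjunction of a disjoint filter and a connectivity filter. Everything here is an assumption made for illustration: quantifier sets are encoded as bitmasks (quantifier q_i is bit i-1), the join graph is a star with q1 at the center, and the function names are hypothetical.

```python
def qs(*ids):
    """Encode quantifiers q_i as a bitmask with bit i-1 set."""
    return sum(1 << (i - 1) for i in ids)

def disjoint(a, b):
    """Disjoint filter: the two quantifier sets share no quantifier."""
    return a & b == 0

def connected(a, b, edges):
    """Connectivity filter: some join predicate links a quantifier in a to one in b."""
    return any(
        ((a >> u) & 1 and (b >> v) & 1) or ((a >> v) & 1 and (b >> u) & 1)
        for u, v in edges
    )

def mpj(outer, inner, edges):
    """Theta join of two plan partitions; returns (outer_qs, inner_qs, joined_qs)."""
    out = []
    for a in outer:
        for b in inner:
            if disjoint(a, b) and connected(a, b, edges):
                out.append((a, b, a | b))
    return out

# Assumed star-shaped join graph: q1-q2, q1-q3, q1-q4 (edges use 0-based bit positions).
edges = [(0, 1), (0, 2), (0, 3)]
P1 = [qs(1), qs(2), qs(3), qs(4)]
P2 = [qs(1, 2), qs(1, 3), qs(1, 4)]
P3 = [qs(1, 2, 3), qs(1, 2, 4), qs(1, 3, 4)]

result = mpj(P1, P3, edges) + mpj(P2, P2, edges)
for a, b, joined in result:
    print(bin(a), bin(b), bin(joined))
```

Only three of the candidate pairs survive both filters in this setup, and the outer loop never writes to shared state other than its own output list, so disjoint slices of the outer partition could be given to separate threads.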

14 Plan Partitions Example. P4 is built from the candidate pairs of P1 × P3 and P2 × P2, which are divided between thread 1 and thread 2. P1 × P3: (q1, q1q2q3), (q1, q1q2q4), (q1, q1q3q4), (q2, q1q2q3), (q2, q1q2q4), (q2, q1q3q4), (q3, q1q2q3), (q3, q1q2q4), (q3, q1q3q4), (q4, q1q2q3), (q4, q1q2q4), (q4, q1q3q4). P2 × P2: (q1q2, q1q2), (q1q2, q1q3), (q1q2, q1q4), (q1q3, q1q2), (q1q3, q1q3), (q1q3, q1q4), (q1q4, q1q2), (q1q4, q1q3), (q1q4, q1q4).

15 Search Space Allocation Schemes. Total sum allocation scheme; stratified allocation; equi-depth allocation; round-robin outer allocation; round-robin inner allocation.

16 Example: Round-Robin Outer Allocation. P1 × P3, thread 1: (q1, q1q2q3), (q1, q1q2q4), (q1, q1q3q4), (q3, q1q2q3), (q3, q1q2q4), (q3, q1q3q4). P1 × P3, thread 2: (q2, q1q2q3), (q2, q1q2q4), (q2, q1q3q4), (q4, q1q2q3), (q4, q1q2q4), (q4, q1q3q4). P2 × P2, thread 1: (q1q2, q1q2), (q1q2, q1q3), (q1q2, q1q4), (q1q4, q1q2), (q1q4, q1q3), (q1q4, q1q4). P2 × P2, thread 2: (q1q3, q1q2), (q1q3, q1q3), (q1q3, q1q4).
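
The allocation in this example can be reproduced with a few lines, assuming that round-robin outer allocation deals the rows of the outer partition to threads in turn and that each thread pairs its rows with the whole inner partition. The function names are hypothetical, and quantifier sets are written as plain strings for readability.

```python
def round_robin_outer(outer, m):
    """Deal outer rows to m threads in turn: row i goes to thread i % m."""
    return [outer[t::m] for t in range(m)]

def allocate_pairs(outer, inner, m):
    """Per-thread list of (outer, inner) candidate pairs under round-robin outer allocation."""
    return [
        [(a, b) for a in rows for b in inner]
        for rows in round_robin_outer(outer, m)
    ]

P1 = ["q1", "q2", "q3", "q4"]
P2 = ["q1q2", "q1q3", "q1q4"]
P3 = ["q1q2q3", "q1q2q4", "q1q3q4"]

print(round_robin_outer(P1, 2))   # outer rows of P1 x P3 per thread
print(round_robin_outer(P2, 2))   # outer rows of P2 x P2 per thread
for t, pairs in enumerate(allocate_pairs(P2, P2, 2), start=1):
    print("thread", t, len(pairs), "pairs")
```

With an odd number of outer rows, as in P2 × P2, the threads receive different numbers of candidate pairs, which is one reason the choice of allocation scheme matters.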

17 Basic MPJ. Use a block nested-loop join to be cache-conscious. To represent the search space allocated to each thread, use a search space description vector (SSDV).

18 Motivation for Skip Vector Array. Considerable overhead from invoking the disjoint filter an exponential number of times!

19 Skip Vector Array. [Figure: plan partitions P1 and P3, each shown as a table with columns QS, PlanList, and SV, with a SKIP! annotation.] NOTE: The skip vector array can be built in one scan of the quantifier sets.

20 How to Allocate SVAs to Threads? Use a pair of partitioned SVAs as the unit of allocation to threads. First, divide each plan partition into sub-partitions. Second, build the SVAs for all sub-partitions. [Figure: equi-depth partitioning and building SVAs. P3, with quantifier sets q1q2q3, q1q2q4, q1q2q5, q1q2q6, q1q3q4, q1q4q7, q1q4q8, q2q5q6, q4q7q8, is divided into sub-partitions P{3,1}, P{3,2}, P{3,3}, P{3,4}, each with its own QS, PlanList, and SV columns.]
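
The first step, dividing a plan partition into sub-partitions, might look like the sketch below. It assumes that equi-depth partitioning means contiguous ranges of (nearly) equal size, and it treats every pair of an outer and an inner sub-partition as one allocation unit. The function names and the choice of four sub-partitions are assumptions, and the skip vectors themselves are not built here.

```python
from itertools import product

def equi_depth(rows, m):
    """Split rows into m contiguous sub-partitions whose sizes differ by at most one."""
    n = len(rows)
    bounds = [i * n // m for i in range(m + 1)]
    return [rows[bounds[i]:bounds[i + 1]] for i in range(m)]

def allocation_units(outer_parts, inner_parts):
    """Every (outer sub-partition index, inner sub-partition index) pair is one unit of work."""
    return list(product(range(len(outer_parts)), range(len(inner_parts))))

# Quantifier sets of P3 from the slide, in lexicographical order.
P3 = ["q1q2q3", "q1q2q4", "q1q2q5", "q1q2q6", "q1q3q4",
      "q1q4q7", "q1q4q8", "q2q5q6", "q4q7q8"]

parts = equi_depth(P3, 4)
for i, part in enumerate(parts, start=1):
    print("sub-partition", i, part)
print(len(allocation_units(parts, parts)), "units when P3 is joined with itself")
```

Keeping each sub-partition contiguous preserves the lexicographical order inside it, so a skip vector array can still be built separately for every sub-partition.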

21 Formal Analysis of the Various Allocation Schemes. Refer to the paper.

22 Experiments. Environment: Windows Vista PC with two Intel Xeon Quad Core E GHz CPUs and 8 GB of RAM. All serial and parallel algorithms were prototyped in PostgreSQL.

23 Overall Comparison of Different Algorithms. From 17 hours to less than 2 minutes using only 8 threads!

24 Sensitivity Analysis for Basic MPJ (8 threads)

25 Sensitivity Analysis for MPJ with SVA (8 threads)

26 For more experimental results, refer to the paper.

27 Conclusions. We proposed a novel framework for parallelizing query optimization to exploit the coming wave of multi-core processor architectures. By viewing enumeration as a join, we devised a way to partition the search space cleanly into independent problems. To minimize unnecessary calls to the disjoint test, we proposed the skip vector array. We also formally analyzed why our various allocation schemes generate differently sized search spaces among threads, to ensure even allocation of the work among threads.

28 Thank you! Any questions?

29 Backup slides

31 Skip Vector Array. Avoid unnecessary invocations of the disjoint filter. Augment each row in the plan partition with a skip vector for the quantifier set in that row. The plan partition is sorted in lexicographical order so that large groups of quantifier sets can be skipped.
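
The idea can be illustrated with a simplified skip vector. This is a sketch under stated assumptions, not the data structure from the paper: here the skip entry of row i for quantifier q is taken to be the index of the first later row that does not contain q, so that on an overlap the scan can jump over every row that is guaranteed to overlap as well. The function names are hypothetical, and quantifiers are written as integers.

```python
def build_skip_vectors(rows):
    """One backward scan: skip[i][q] = first index k > i whose set lacks q (assumed definition)."""
    n = len(rows)
    skip = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for q in rows[i]:
            if i + 1 < n and q in rows[i + 1]:
                skip[i][q] = skip[i + 1][q]   # the run of rows containing q continues
            else:
                skip[i][q] = i + 1            # the run ends at this row
    return skip

def disjoint_rows(outer_qs, rows, skip):
    """Indices of rows disjoint from outer_qs, plus the number of filter invocations."""
    hits, checks, i = [], 0, 0
    while i < len(rows):
        checks += 1
        overlap = rows[i] & outer_qs
        if not overlap:
            hits.append(i)
            i += 1
        else:
            # Every row before skip[i][q] contains q, so it overlaps too: jump past it.
            i = max(skip[i][q] for q in overlap)
    return hits, checks

# Quantifier sets of a plan partition of size 3, in lexicographical order.
rows = [frozenset(s) for s in (
    {1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6}, {1, 3, 4},
    {1, 4, 7}, {1, 4, 8}, {2, 5, 6}, {4, 7, 8},
)]
skip = build_skip_vectors(rows)
hits, checks = disjoint_rows(frozenset({1}), rows, skip)
print(hits, checks)
```

For the outer set {q1}, the first seven rows all contain q1, so the scan tests row 0 once and jumps straight to row 7, invoking the filter three times instead of nine. The lexicographical sort is what groups the rows that share a leading quantifier into long runs.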

32 Equi-depth Allocation

33 Round-Robin Inner Allocation


Uncertainty Aware Query Execution Time Prediction

Uncertainty Aware Query Execution Time Prediction Uncertainty Aware Query Execution Time Prediction Wentao Wu Xi Wu Hakan Hacıgümüş Jeffrey F. Naughton Department of Computer Sciences, University of Wisconsin-Madison NEC Laboratories America {wentaowu,

More information

Large-scale Electronic Structure Simulations with MVAPICH2 on Intel Knights Landing Manycore Processors

Large-scale Electronic Structure Simulations with MVAPICH2 on Intel Knights Landing Manycore Processors Large-scale Electronic Structure Simulations with MVAPICH2 on Intel Knights Landing Manycore Processors Hoon Ryu, Ph.D. (E: elec1020@kisti.re.kr) Principal Researcher / Korea Institute of Science and Technology

More information

ArcGIS Data Models: Raster Data Models. Jason Willison, Simon Woo, Qian Liu (Team Raster, ESRI Software Products)

ArcGIS Data Models: Raster Data Models. Jason Willison, Simon Woo, Qian Liu (Team Raster, ESRI Software Products) ArcGIS Data Models: Raster Data Models Jason Willison, Simon Woo, Qian Liu (Team Raster, ESRI Software Products) Overview of Session Raster Data Model Context Example Raster Data Models Important Raster

More information

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Why analysis? We want to predict how the algorithm will behave (e.g. running time) on arbitrary inputs, and how it will

More information

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017 HYCOM and Navy ESPC Future High Performance Computing Needs Alan J. Wallcraft COAPS Short Seminar November 6, 2017 Forecasting Architectural Trends 3 NAVY OPERATIONAL GLOBAL OCEAN PREDICTION Trend is higher

More information

591 TFLOPS Multi-TRILLION Particles Simulation on SuperMUC

591 TFLOPS Multi-TRILLION Particles Simulation on SuperMUC International Supercomputing Conference 2013 591 TFLOPS Multi-TRILLION Particles Simulation on SuperMUC W. Eckhardt TUM, A. Heinecke TUM, R. Bader LRZ, M. Brehm LRZ, N. Hammer LRZ, H. Huber LRZ, H.-G.

More information

Factorized Relational Databases Olteanu and Závodný, University of Oxford

Factorized Relational Databases   Olteanu and Závodný, University of Oxford November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust

More information

SP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay

SP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay SP-CNN: A Scalable and Programmable CNN-based Accelerator Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay Motivation Power is a first-order design constraint, especially for embedded devices. Certain

More information

GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications

GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications Christopher Rodrigues, David J. Hardy, John E. Stone, Klaus Schulten, Wen-Mei W. Hwu University of Illinois at Urbana-Champaign

More information

Reducing the Run-time of MCMC Programs by Multithreading on SMP Architectures

Reducing the Run-time of MCMC Programs by Multithreading on SMP Architectures Reducing the Run-time of MCMC Programs by Multithreading on SMP Architectures Jonathan M. R. Byrd Stephen A. Jarvis Abhir H. Bhalerao Department of Computer Science University of Warwick MTAAP IPDPS 2008

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 2: Distributed Database Design Logistics Gradiance No action items for now Detailed instructions coming shortly First quiz to be released

More information

n-level Graph Partitioning

n-level Graph Partitioning Vitaly Osipov, Peter Sanders - Algorithmik II 1 Vitaly Osipov: KIT Universität des Landes Baden-Württemberg und nationales Grossforschungszentrum in der Helmholtz-Gemeinschaft Institut für Theoretische

More information

Rainfall data analysis and storm prediction system

Rainfall data analysis and storm prediction system Rainfall data analysis and storm prediction system SHABARIRAM, M. E. Available from Sheffield Hallam University Research Archive (SHURA) at: http://shura.shu.ac.uk/15778/ This document is the author deposited

More information

Performance and Application of Observation Sensitivity to Global Forecasts on the KMA Cray XE6

Performance and Application of Observation Sensitivity to Global Forecasts on the KMA Cray XE6 Performance and Application of Observation Sensitivity to Global Forecasts on the KMA Cray XE6 Sangwon Joo, Yoonjae Kim, Hyuncheol Shin, Eunhee Lee, Eunjung Kim (Korea Meteorological Administration) Tae-Hun

More information

Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution

Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution Xiangyang Jia (Wuhan University) Carlo Ghezzi (Politecnico di Milano) Shi Ying (Wuhan University) Outline Motivation Logical Basis

More information

Towards parallel bipartite matching algorithms

Towards parallel bipartite matching algorithms Outline Towards parallel bipartite matching algorithms Bora Uçar CNRS and GRAAL, ENS Lyon, France Scheduling for large-scale systems, 13 15 May 2009, Knoxville Joint work with Patrick R. Amestoy (ENSEEIHT-IRIT,

More information

INSTITUT FÜR INFORMATIK

INSTITUT FÜR INFORMATIK INSTITUT FÜR INFORMATIK DER LUDWIGMAXIMILIANSUNIVERSITÄT MÜNCHEN Bachelorarbeit Propagation of ESCL Cardinality Constraints with Respect to CEP Queries Thanh Son Dang Aufgabensteller: Prof. Dr. Francois

More information

New Attacks on the Concatenation and XOR Hash Combiners

New Attacks on the Concatenation and XOR Hash Combiners New Attacks on the Concatenation and XOR Hash Combiners Itai Dinur Department of Computer Science, Ben-Gurion University, Israel Abstract. We study the security of the concatenation combiner H 1(M) H 2(M)

More information

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and 7 RC Simulates RA. We now show that DRC (and hence TRC) is at least as expressive as RA. That is, given an RA expression E that mentions at most C, there is an equivalent DRC expression E that mentions

More information

Data Analytics Beyond OLAP. Prof. Yanlei Diao

Data Analytics Beyond OLAP. Prof. Yanlei Diao Data Analytics Beyond OLAP Prof. Yanlei Diao OPERATIONAL DBs DB 1 DB 2 DB 3 EXTRACT TRANSFORM LOAD (ETL) METADATA STORE DATA WAREHOUSE SUPPORTS OLAP DATA MINING INTERACTIVE DATA EXPLORATION Overview of

More information

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You

More information

MSC HPC Infrastructure Update. Alain St-Denis Canadian Meteorological Centre Meteorological Service of Canada

MSC HPC Infrastructure Update. Alain St-Denis Canadian Meteorological Centre Meteorological Service of Canada MSC HPC Infrastructure Update Alain St-Denis Canadian Meteorological Centre Meteorological Service of Canada Outline HPC Infrastructure Overview Supercomputer Configuration Scientific Direction 2 IT Infrastructure

More information

Static-scheduling and hybrid-programming in SuperLU DIST on multicore cluster systems

Static-scheduling and hybrid-programming in SuperLU DIST on multicore cluster systems Static-scheduling and hybrid-programming in SuperLU DIST on multicore cluster systems Ichitaro Yamazaki University of Tennessee, Knoxville Xiaoye Sherry Li Lawrence Berkeley National Laboratory MS49: Sparse

More information

arxiv: v1 [hep-lat] 7 Oct 2010

arxiv: v1 [hep-lat] 7 Oct 2010 arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA

More information

On Two Class-Constrained Versions of the Multiple Knapsack Problem

On Two Class-Constrained Versions of the Multiple Knapsack Problem On Two Class-Constrained Versions of the Multiple Knapsack Problem Hadas Shachnai Tami Tamir Department of Computer Science The Technion, Haifa 32000, Israel Abstract We study two variants of the classic

More information