CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing

Query Processing Decomposition Localization Optimization CS 347 Notes 3 2

Decomposition Same as in centralized system 1. Normalization 2. Eliminating redundancy 3. Algebraic rewriting CS 347 Notes 3 3

Normalization Convert from general language to a standard form E.g., relational algebra CS 347 Notes 3 4

Normalization Example: select R.A, R.C from R, S where R.A = S.A and ((R.B = 1 and S.D = 2) or (R.C > 3 and S.D = 2)) R.A,R.C σ (R.B=1 R.C>3) S.D=2 A conjunctive normal form R S CS 347 Notes 3 5

Normalization Also detect invalid expressions select * from R where R.D = 3 R does not have D attribute CS 347 Notes 3 6

Eliminating Redundancy E.g., in conditions (S.A = 1) (S.A > 5) false (S.A < 10) (S.A < 5) S.A < 5 CS 347 Notes 3 7

Eliminating Redundancy E.g., sharing common sub-expressions S σ cond σ cond T S σ cond T R R R CS 347 Notes 3 8

Algebraic Rewriting E.g., pushing conditions down σ cond3 σ cond R S σ cond1 σ cond2 R S CS 347 Notes 3 9

Query Processing Decomposition One or more algebraic query trees on relations Localization Replacing relations by corresponding fragments CS 347 Notes 3 10

Localization Steps (1) Start with query (2) Replace relations by fragments (3) Push up and, σ down (apply CS 245 rules) (4) Simplify ~ eliminate unnecessary operations CS 347 Notes 3 11

Localization Notation for fragment [R : cond] conditions its tuples satisfy fragment CS 347 Notes 3 12

Example A (1) σ E=3 R CS 347 Notes 3 13

Example A (2) σ E=3 [R 1 : E<10] [R 2 : E 10] CS 347 Notes 3 14

Example A (3) σ E=3 σ E=3 [R 1 : E<10] [R 2 : E 10] CS 347 Notes 3 15

Example A (3) σ E=3 σ E=3 [R 1 : E<10] [R 2 : E 10] Ø CS 347 Notes 3 16

Example A (4) σ E=3 [R 1 : E<10] CS 347 Notes 3 17

Localization Rule 1 σ c1 [R: c2] σ c1 [R: c1 c2] [R: false] Ø CS 347 Notes 3 18

Example A σ E=3 [R 2 : E 10] σ E=3 [R 2 : E=3 E 10] σ E=3 [R 2 : false] Ø CS 347 Notes 3 19

Example B (1) A A = common attribute R S CS 347 Notes 3 20

Example B (2) A [R 1 : A<5] [R 2 : 5 A 10] [R 3 : A>10] [S 1 : A<5] [S 2 : A 5] CS 347 Notes 3 21

Example B (3) A A A A [R 1 : A<5] [S 1 : A<5] [R 1 : A<5] [S 2 : A 5] [R 3 : A>10] [S 1 : A<5] [R 3 : A>10] [S 2 : A 5] A A [R 2 : 5 A 10] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] CS 347 Notes 3 22

Example B (3) A A A A [R 1 : A<5] [S 1 : A<5] [R 1 : A<5] [S 2 : A 5] [R 3 : A>10] [S 1 : A<5] [R 3 : A>10] [S 2 : A 5] A A [R 2 : 5 A 10] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] CS 347 Notes 3 23

Example B (4) A A A [R 1 : A<5] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] [R 3 : A>10] [S 2 : A 5] CS 347 Notes 3 24

Localization Rule 2 [R: c 1 ] [S: c 2 ] [R S: c 1 c 2 R.A=S.A] A A [R: false] Ø CS 347 Notes 3 25

Example B [R 1 : A < 5] [S 2 : A 5] A [R 1 A S 2 : R 1.A < 5 S 2.A 5 R 1.A=S 2.A] [R 1 A S 2 : false] Ø CS 347 Notes 3 26

Example C (2) K [R 1 : A<10] [R 2 : A 10] [S 1 : K=R.K R.A<10] [S 2 : K=R.K R.A 10] derived fragmentation CS 347 Notes 3 27

Example C (3) K K K K [R 1 ] [S 1 ] [R 1 ] [S 2 ] [S 1 ] [R 2 ] [R 2 ] [S 2 ] CS 347 Notes 3 28

Example C (3) K K K K [R 1 ] [S 1 ] [R 1 ] [S 2 ] [S 1 ] [R 2 ] [R 2 ] [S 2 ] CS 347 Notes 3 29

Example C (4) K K [R 1 : A<10] [S 1 : K=R.K R.A<10] [R 2 : A 10] [S 2 : K=R.K R.A 10] CS 347 Notes 3 30

Example C [R 1 : A < 10] [S 2 : K = R.K R.A 10] K [R 1 K S 2 : R 1.A < 10 S 2.K = R.K R.A 10 R 1.K = S 2.K] [R 1 K S 2 : false] Ø CS 347 Notes 3 31

Example D (1) A R R1 (K, A, B) R2 (K, C, D) vertical fragmentation CS 347 Notes 3 32

Example D (2) A K R 1 (K,A,B) R 2 (K,C,D) CS 347 Notes 3 33

Example D (2) A K K,A K,A R 1 (K,A,B) R 2 (K,C,D) CS 347 Notes 3 34

Example D (4) A R 1 (K, A, B) CS 347 Notes 3 35

Rule 3 Consider the vertical fragmentation R i = (R), A i A A i Then for any B A B (R) = B ( R i B A i Ø ) i CS 347 Notes 3 36

Query Processing Decomposition Localization Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies CS 347 Notes 3 37

Overview Optimization process 1. Generate query plans P 1 P 2 P 3 P n 2. Estimate size of intermediate results 3. Estimate cost of plan C 1 C 2 C 3 C n 4. Pick minimum CS 347 Notes 3 38

Overview Differences from centralized optimization New strategies for some operations Parallel/distributed sort Parallel/distributed join Semi-join Privacy preserving join Duplicate elimination Aggregation Selection Many ways to assign and schedule processors CS 347 Notes 3 39

Parallel/Distributed Sort Input a) Relation R on single site/disk b) Relation R fragmented/partitioned by sort attribute c) Relation R fragmented/partitioned by another attribute CS 347 Notes 3 40

Parallel/Distributed Sort Output a) Sorted R on single site/disk b) Sorted R fragments/partitions 5 6 10 12 15 19 20 21 50 F 1 F 2 F 3 CS 347 Notes 3 41

Basic Sort R(K, ) to be sorted on K Horizontally fragmented on K using vector (k 0, k 1,..., k n ) 7 3 k 0 k 1 11 10 20 17 14 27 22 CS 347 Notes 3 42

Basic Sort Algorithm 1. Sort each fragment independently 2. Ship results if necessary CS 347 Notes 3 43

Basic Sort Same idea on different architectures Shared nothing P 1 P 2 M F 1 M F 2 Shared memory P 1 P 2 F 1 M F 2 CS 347 Notes 3 44

Range-Partition Sort R(K, ) to be sorted on K R located at one or more site/disk, not fragmented on K CS 347 Notes 3 45

Range-Partition Sort Algorithm 1. Range partition on K 2. Basic sort R a R 1 local sort R 1 k 0 R b R 2 local sort R 2 result k 1 local sort R 3 R 3 CS 347 Notes 3 46

Partition Vectors Problem Select a good partition vector given fragments 7 52 11 14 31 8 15 11 32 17 10 12 4 R a R b R c CS 347 Notes 3 47

Partition Vectors Example approach Each site sends to coordinator MIN sort key MAX sort key Number of tuples Coordinator Computes vector and distributes to sites Decides the number of sites to perform local sorts CS 347 Notes 3 48

Partition Vectors Sample scenario Coordinator receives S A : MIN = 5 MAX = 9 # of tuples = 10 S B : MIN = 7 MAX = 16 # of tuples = 10 CS 347 Notes 3 49

Partition Vectors Sample scenario Coordinator receives S A : MIN = 5 MAX = 9 # of tuples = 10 S B : MIN = 7 MAX = 16 # of tuples = 10 Expected tuples 2 = 10 / (9 5 + 1) 1 = 10 / (16 7 + 1) 5 10 15 20 k 0? assuming we want to sort at 2 sites CS 347 Notes 3 50

Partition Vectors Sample scenario Expected tuples 2 1 Expected tuples with key < k 0 = half of total tuples 2(k 0 5) + (k 0 7) = 10 k 0 = (10 + 10 + 7) / 3 = 9 5 10 15 20 k 0? CS 347 Notes 3 51

Partition Vectors Variations Send more info to coordinator: a) Partition vector for local site b) Histogram 3 3 3 5 6 8 10 # of tuples local vector 5 6 7 8 9 10 CS 347 Notes 3 52

Partition Vectors Variations Multiple rounds 1. Sites send range and # of tuples to coordinator 2. Coordinator distributes preliminary vector V 0 3. Sites send coordinator # of tuples in each V 0 range 4. Coordinator computes and distributes final vector V f CS 347 Notes 3 53

Partition Vectors Variations Distributed algorithm that does not require a coordinator? CS 347 Notes 3 54

Parallel External Sort-Merge Same as range-partition sort except does sorting first R a local sort R a R 1 k 0 R b local sort R b R 2 result k 1 R 3 in order merge CS 347 Notes 3 55

Parallel/Distributed Join Input Relations R, S May or may not be fragmented Output R S Result at one or more sites CS 347 Notes 3 56

Partitioned Join (Equijoin) local join R 1 R A S 1 S A R B R 2 S 2 S B f(a) R 3 S 3 S C result f(a) CS 347 Notes 3 57

Partitioned Join (Equijoin) Same partition function f is used for both R and S f can be range or hash partitioning Local join can be of any type (use any CS 245 optimization) Various scheduling options, e.g., a) Partition R; partition S; join b) Partition R; build local hash table for R; partition S and join CS 347 Notes 3 58

Partitioned Join (Equijoin) We already know why it works [R 1 ] [R 2 ] [R 3 ] [S 1 ] [S 2 ] [S 3 ] [R 1 ] [S 1 ] [R 2 ] [S 2 ] [R 3 ] [S 3 ] Sometimes we partition just to make this join possible CS 347 Notes 3 59

Partitioned Join (Equijoin) Selecting a good partition function f is very important Goal is to make all R i + S i the same size CS 347 Notes 3 60

Asymmetric Fragment + Replicate Join local join R A R 1 S S A R B R 2 S S B partition R 3 S result union CS 347 Notes 3 61

Asymmetric Fragment + Replicate Join Can use any partition function f for R (even round robin) Can do any join, not just equijoin (e.g., R S) R.A<S.B CS 347 Notes 3 62

General Fragment + Replicate Join R A R 1 R 1 R B R 2 R 2 f partition 3 fragments R 3 R 3 n copies of each fragment CS 347 Notes 3 63

General Fragment + Replicate Join R A R 1 R 1 R B R 2 R 2 f partition 3 fragments R 3 R 3 n copies of each fragment S is partitioned in similar fashion CS 347 Notes 3 64

General Fragment + Replicate Join R 1 S 1 R 1 S 2 R 2 S 1 R 2 S 2 n m pairings of R, S fragments R 3 S 1 R 3 S 2 result CS 347 Notes 3 65

General Fragment + Replicate Join The asymmetric F+R join is special case of the general F+R Asymmetric F+R may be good if S is small Also works on non-equijoins CS 347 Notes 3 66

Semi-Join Goal: reduce communication traffic R S ( R S ) S or A A A R ( S R ) or A ( R S ) ( S R ) A A A A CS 347 Notes 3 67

Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x CS 347 Notes 3 68

Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x A R = [2, 10, 25, 30] CS 347 Notes 3 69

Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x R S A R = [2, 10, 25, 30] 10 y S R = 25 w CS 347 Notes 3 70

Semi-Join A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x Transmitted data With semi-join R ( S R ): T = 4 A + 2 A + C + result With join R S : T = 4 A + B + result better if B is large CS 347 Notes 3 71

Semi-Join Assume R is the smaller relation Then, in general, ( R S ) S is better than ( R S ) if A A A size( A S ) + size ( R S ) < size ( R ) A Can do similar comparison for other joins Only taking into account transmission cost CS 347 Notes 3 72

Semi-Join Trick: encode A S (or A R) as a bit vector keys in S 0 0 1 1 0 1 0 0 0 0 1 0 1 0 0 one bit/possible key CS 347 Notes 3 73

Semi-Join Useful for three way joins R S T CS 347 Notes 3 74

Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T CS 347 Notes 3 75

Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T Option 2: R S T where R = R S S = S T CS 347 Notes 3 76

Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T Option 2: R S T where R = R S S = S T Many options ~ exponential in # of relations CS 347 Notes 3 77

Privacy Preserving Join Site 1 has R(A, B) Site 2 has S(A, C) Want to compute R S Site 1 should not discover any S info not in the join Site 2 should not discover any R info not in the join CS 347 Notes 3 78

Privacy Preserving Join Semi-join won t work If site 1 sends A R to site 2, site 2 learns all keys of R A B A C R a 1 b 1 A R = (a 1, a 2, a 3, a 4 ) S a 1 c 1 a 2 b 2 a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 a 7 c 4 site 1 site 2 CS 347 Notes 3 79

Privacy Preserving Join Fix: send hash keys only Site 1 hashes each value of A before sending Site 2 hashes its own A values (same h) to see what tuples match R A B a 1 b 1 a 2 b 2 a 3 b 3 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) site 2 sees it has h(a 1 ), h(a 3 ) S A C a 1 c 1 a 3 c 2 a 5 c 3 a 4 b 4 a 7 c 4 (a1, c1), (a3, c3) site 1 site 2 CS 347 Notes 3 80

Privacy Preserving Join What is the problem? A B A C R a 1 b 1 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) S a 1 c 1 a 2 b 2 site 2 sees it has h(a 1 ), h(a 3 ) a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 a 7 c 4 (a1, c1), (a3, c3) site 1 site 2 CS 347 Notes 3 81

Privacy Preserving Join What is the problem? A B A C R a 1 b 1 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) S a 1 c 1 a 2 b 2 site 2 sees it has h(a 1 ), h(a 3 ) a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 (a1, c1), (a3, c3) a 7 c 4 site 1 site 2 Dictionary attack Site 2 can take all keys a 1, a 2, a 3, and checks if h(a 1 ), h(a 2 ), h(a 3 ), matches what site 1 sent CS 347 Notes 3 82

Privacy Preserving Join Our adversary model: honest but curious Dictionary attack is possible (cheating is internal and can t be caught) Sending incorrect keys not possible (cheater could be caught) CS 347 Notes 3 83

Privacy Preserving Join One solution Information Sharing Across Private Databases. Agrawal et al., SIGMOD 2003 Use commutative encryption functions E i (x) = x encrypted using a key private to site i E 1 (E 2 (x)) = E 2 (E 1 (x)) Shorthand: E 1 (x) is x, E 2 (x) is x, E 1 (E 2 (x)) is x CS 347 Notes 3 84

Privacy Preserving Join One solution R A B a 1 b 1 a 2 b 2 ( a 1, a 2, a 3, a 4 ) S A C a 1 c 1 a 3 c 2 a 3 b 3 ( a 1, a 2, a 3, a 4 ) a 5 c 3 a 4 b 4 a 7 c 4 ( a 1, a 3, a 5, a 7 ) site 1 site 2 Computes ( a 1, a 3, a 5, a 7 ), intersects with ( a 1, a 2, a 3, a 4 ) (a 1, b 1 ), (a 3, b 3 ) CS 347 Notes 3 85

Other Privacy Preserving Operations Inequality join Similarity join R S R.A>S.A R S sim(r.a,s.a)<e CS 347 Notes 3 86

Other Parallel Operations Duplicate elimination Sort first (in parallel) then eliminate duplicates in result Partition tuples (range or hash) and eliminate locally Aggregates Partition by grouping attributes; compute aggregate locally CS 347 Notes 3 87

Parallel/Distributed Aggregates R a id dept salary 1 toy 10 2 toy 20 3 sales 15 R b id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum (sal) group by dept CS 347 Notes 3 88

Parallel/Distributed Aggregates R a R b id dept salary 1 toy 10 2 toy 20 3 sales 15 id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 id dept salary 1 toy 10 2 toy 20 5 toy 20 6 mgmt 15 8 mgmt 30 id dept salary 3 sales 15 4 sales 5 7 sales 10 sum (sal) group by dept CS 347 Notes 3 89

Parallel/Distributed Aggregates R a id dept salary 1 toy 10 id dept salary 1 toy 10 sum dept sum toy 50 2 toy 20 2 toy 20 mgmt 45 3 sales 15 5 toy 20 6 mgmt 15 R b id dept salary 4 sales 5 5 toy 20 8 mgmt 30 id dept salary sum dept sum 6 mgmt 15 3 sales 15 sales 30 7 sales 10 4 sales 5 8 mgmt 30 7 sales 10 sum (sal) group by dept CS 347 Notes 3 90

Parallel/Distributed Aggregates R a id dept salary 1 toy 10 2 toy 20 3 sales 15 R b id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 91

Parallel/Distributed Aggregates R a id dept salary 1 toy 10 sum dept salary toy 30 2 toy 20 toy 20 3 sales 15 mgmt 45 R b id dept salary 4 sales 5 5 toy 20 sum dept salary 6 mgmt 15 sales 15 7 sales 10 sales 15 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 92

Parallel/Distributed Aggregates R a id dept salary 1 toy 10 sum dept salary toy 30 sum dept sum toy 50 2 toy 20 toy 20 mgmt 45 3 sales 15 mgmt 45 R b id dept salary 4 sales 5 5 toy 20 sum dept salary sum dept sum 6 mgmt 15 sales 15 sales 30 7 sales 10 sales 15 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 93

Parallel/Distributed Aggregates Enhancement Perform aggregate during partition to reduce data transmitted Does not work for all aggregate functions which ones? CS 347 Notes 3 94

Preview: MapReduce data A 1 map data B 1 reduce data C 1 data A 2 data B 2 data C 1 data A 3 CS 347 Notes 3 95

Parallel/Distributed Selection Straightforward if one can leverage range or hash partitioning How do indexes work? CS 347 Notes 3 96

Parallel/Distributed Selection Partition vector can act as the root of a distributed index k 0 k 1 local indexes site 1 site 2 site 3 CS 347 Notes 3 97

Parallel/Distributed Selection Distributed indexes on a non-partition attributes get complicated k 0 k 1 index sites tuple sites CS 347 Notes 3 98

Parallel/Distributed Selection Indexing schemes Which one is better in a distributed environment? How to make updates and expansion efficient? Where to store the directory and set of participants? Is global knowledge necessary? If the index is not too big, it may be better to keep it whole and replicate it CS 347 Notes 3 99

Query Processing Decomposition Localization Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies CS 347 Notes 3 100

Inter-Operation Parallelism Pipelined Independent CS 347 Notes 3 101

Pipelined Parallelism site 2 σ c S site 1 R site 1 R tuples matching σ c site 2 join S probe result CS 347 Notes 3 102

Pipelined Parallelism Pipelining cannot be used in all cases stream of R tuples stream of S tuples CS 347 Notes 3 103

Independent Parallelism site 1 site 2 R S T V 1. temp 1 R S temp 2 T V 2. result temp 1 temp 2 CS 347 Notes 3 104

Summary As we consider query plans for optimization, we must consider various new strategies for individual operations scheduling operations CS 347 Notes 3 105