CS 347 Parallel and Distributed Data Processing

Size: px
Start display at page:

Download "CS 347 Parallel and Distributed Data Processing"

Transcription

1 CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing

2 Query Processing Decomposition Localization Optimization CS 347 Notes 3 2

3 Decomposition Same as in centralized system 1. Normalization 2. Eliminating redundancy 3. Algebraic rewriting CS 347 Notes 3 3

4 Normalization Convert from general language to a standard form E.g., relational algebra CS 347 Notes 3 4

5 Normalization Example: select R.A, R.C from R, S where R.A = S.A and ((R.B = 1 and S.D = 2) or (R.C > 3 and S.D = 2)) R.A,R.C σ (R.B=1 R.C>3) S.D=2 A conjunctive normal form R S CS 347 Notes 3 5

6 Normalization Also detect invalid expressions select * from R where R.D = 3 R does not have D attribute CS 347 Notes 3 6

7 Eliminating Redundancy E.g., in conditions (S.A = 1) (S.A > 5) false (S.A < 10) (S.A < 5) S.A < 5 CS 347 Notes 3 7

8 Eliminating Redundancy E.g., sharing common sub-expressions S σ cond σ cond T S σ cond T R R R CS 347 Notes 3 8

9 Algebraic Rewriting E.g., pushing conditions down σ cond3 σ cond R S σ cond1 σ cond2 R S CS 347 Notes 3 9

10 Query Processing Decomposition One or more algebraic query trees on relations Localization Replacing relations by corresponding fragments CS 347 Notes 3 10

11 Localization Steps (1) Start with query (2) Replace relations by fragments (3) Push up and, σ down (apply CS 245 rules) (4) Simplify ~ eliminate unnecessary operations CS 347 Notes 3 11

12 Localization Notation for fragment [R : cond] conditions its tuples satisfy fragment CS 347 Notes 3 12

13 Example A (1) σ E=3 R CS 347 Notes 3 13

14 Example A (2) σ E=3 [R 1 : E<10] [R 2 : E 10] CS 347 Notes 3 14

15 Example A (3) σ E=3 σ E=3 [R 1 : E<10] [R 2 : E 10] CS 347 Notes 3 15

16 Example A (3) σ E=3 σ E=3 [R 1 : E<10] [R 2 : E 10] Ø CS 347 Notes 3 16

17 Example A (4) σ E=3 [R 1 : E<10] CS 347 Notes 3 17

18 Localization Rule 1 σ c1 [R: c2] σ c1 [R: c1 c2] [R: false] Ø CS 347 Notes 3 18

19 Example A σ E=3 [R 2 : E 10] σ E=3 [R 2 : E=3 E 10] σ E=3 [R 2 : false] Ø CS 347 Notes 3 19

20 Example B (1) A A = common attribute R S CS 347 Notes 3 20

21 Example B (2) A [R 1 : A<5] [R 2 : 5 A 10] [R 3 : A>10] [S 1 : A<5] [S 2 : A 5] CS 347 Notes 3 21

22 Example B (3) A A A A [R 1 : A<5] [S 1 : A<5] [R 1 : A<5] [S 2 : A 5] [R 3 : A>10] [S 1 : A<5] [R 3 : A>10] [S 2 : A 5] A A [R 2 : 5 A 10] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] CS 347 Notes 3 22

23 Example B (3) A A A A [R 1 : A<5] [S 1 : A<5] [R 1 : A<5] [S 2 : A 5] [R 3 : A>10] [S 1 : A<5] [R 3 : A>10] [S 2 : A 5] A A [R 2 : 5 A 10] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] CS 347 Notes 3 23

24 Example B (4) A A A [R 1 : A<5] [S 1 : A<5] [R 2 : 5 A 10] [S 2 : A 5] [R 3 : A>10] [S 2 : A 5] CS 347 Notes 3 24

25 Localization Rule 2 [R: c 1 ] [S: c 2 ] [R S: c 1 c 2 R.A=S.A] A A [R: false] Ø CS 347 Notes 3 25

26 Example B [R 1 : A < 5] [S 2 : A 5] A [R 1 A S 2 : R 1.A < 5 S 2.A 5 R 1.A=S 2.A] [R 1 A S 2 : false] Ø CS 347 Notes 3 26

27 Example C (2) K [R 1 : A<10] [R 2 : A 10] [S 1 : K=R.K R.A<10] [S 2 : K=R.K R.A 10] derived fragmentation CS 347 Notes 3 27

28 Example C (3) K K K K [R 1 ] [S 1 ] [R 1 ] [S 2 ] [S 1 ] [R 2 ] [R 2 ] [S 2 ] CS 347 Notes 3 28

29 Example C (3) K K K K [R 1 ] [S 1 ] [R 1 ] [S 2 ] [S 1 ] [R 2 ] [R 2 ] [S 2 ] CS 347 Notes 3 29

30 Example C (4) K K [R 1 : A<10] [S 1 : K=R.K R.A<10] [R 2 : A 10] [S 2 : K=R.K R.A 10] CS 347 Notes 3 30

31 Example C [R 1 : A < 10] [S 2 : K = R.K R.A 10] K [R 1 K S 2 : R 1.A < 10 S 2.K = R.K R.A 10 R 1.K = S 2.K] [R 1 K S 2 : false] Ø CS 347 Notes 3 31

32 Example D (1) A R R1 (K, A, B) R2 (K, C, D) vertical fragmentation CS 347 Notes 3 32

33 Example D (2) A K R 1 (K,A,B) R 2 (K,C,D) CS 347 Notes 3 33

34 Example D (2) A K K,A K,A R 1 (K,A,B) R 2 (K,C,D) CS 347 Notes 3 34

35 Example D (4) A R 1 (K, A, B) CS 347 Notes 3 35

36 Rule 3 Consider the vertical fragmentation R i = (R), A i A A i Then for any B A B (R) = B ( R i B A i Ø ) i CS 347 Notes 3 36

37 Query Processing Decomposition Localization Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies CS 347 Notes 3 37

38 Overview Optimization process 1. Generate query plans P 1 P 2 P 3 P n 2. Estimate size of intermediate results 3. Estimate cost of plan C 1 C 2 C 3 C n 4. Pick minimum CS 347 Notes 3 38

39 Overview Differences from centralized optimization New strategies for some operations Parallel/distributed sort Parallel/distributed join Semi-join Privacy preserving join Duplicate elimination Aggregation Selection Many ways to assign and schedule processors CS 347 Notes 3 39

40 Parallel/Distributed Sort Input a) Relation R on single site/disk b) Relation R fragmented/partitioned by sort attribute c) Relation R fragmented/partitioned by another attribute CS 347 Notes 3 40

41 Parallel/Distributed Sort Output a) Sorted R on single site/disk b) Sorted R fragments/partitions F 1 F 2 F 3 CS 347 Notes 3 41

42 Basic Sort R(K, ) to be sorted on K Horizontally fragmented on K using vector (k 0, k 1,..., k n ) 7 3 k 0 k CS 347 Notes 3 42

43 Basic Sort Algorithm 1. Sort each fragment independently 2. Ship results if necessary CS 347 Notes 3 43

44 Basic Sort Same idea on different architectures Shared nothing P 1 P 2 M F 1 M F 2 Shared memory P 1 P 2 F 1 M F 2 CS 347 Notes 3 44

45 Range-Partition Sort R(K, ) to be sorted on K R located at one or more site/disk, not fragmented on K CS 347 Notes 3 45

46 Range-Partition Sort Algorithm 1. Range partition on K 2. Basic sort R a R 1 local sort R 1 k 0 R b R 2 local sort R 2 result k 1 local sort R 3 R 3 CS 347 Notes 3 46

47 Partition Vectors Problem Select a good partition vector given fragments R a R b R c CS 347 Notes 3 47

48 Partition Vectors Example approach Each site sends to coordinator MIN sort key MAX sort key Number of tuples Coordinator Computes vector and distributes to sites Decides the number of sites to perform local sorts CS 347 Notes 3 48

49 Partition Vectors Sample scenario Coordinator receives S A : MIN = 5 MAX = 9 # of tuples = 10 S B : MIN = 7 MAX = 16 # of tuples = 10 CS 347 Notes 3 49

50 Partition Vectors Sample scenario Coordinator receives S A : MIN = 5 MAX = 9 # of tuples = 10 S B : MIN = 7 MAX = 16 # of tuples = 10 Expected tuples 2 = 10 / ( ) 1 = 10 / ( ) k 0? assuming we want to sort at 2 sites CS 347 Notes 3 50

51 Partition Vectors Sample scenario Expected tuples 2 1 Expected tuples with key < k 0 = half of total tuples 2(k 0 5) + (k 0 7) = 10 k 0 = ( ) / 3 = k 0? CS 347 Notes 3 51

52 Partition Vectors Variations Send more info to coordinator: a) Partition vector for local site b) Histogram # of tuples local vector CS 347 Notes 3 52

53 Partition Vectors Variations Multiple rounds 1. Sites send range and # of tuples to coordinator 2. Coordinator distributes preliminary vector V 0 3. Sites send coordinator # of tuples in each V 0 range 4. Coordinator computes and distributes final vector V f CS 347 Notes 3 53

54 Partition Vectors Variations Distributed algorithm that does not require a coordinator? CS 347 Notes 3 54

55 Parallel External Sort-Merge Same as range-partition sort except does sorting first R a local sort R a R 1 k 0 R b local sort R b R 2 result k 1 R 3 in order merge CS 347 Notes 3 55

56 Parallel/Distributed Join Input Relations R, S May or may not be fragmented Output R S Result at one or more sites CS 347 Notes 3 56

57 Partitioned Join (Equijoin) local join R 1 R A S 1 S A R B R 2 S 2 S B f(a) R 3 S 3 S C result f(a) CS 347 Notes 3 57

58 Partitioned Join (Equijoin) Same partition function f is used for both R and S f can be range or hash partitioning Local join can be of any type (use any CS 245 optimization) Various scheduling options, e.g., a) Partition R; partition S; join b) Partition R; build local hash table for R; partition S and join CS 347 Notes 3 58

59 Partitioned Join (Equijoin) We already know why it works [R 1 ] [R 2 ] [R 3 ] [S 1 ] [S 2 ] [S 3 ] [R 1 ] [S 1 ] [R 2 ] [S 2 ] [R 3 ] [S 3 ] Sometimes we partition just to make this join possible CS 347 Notes 3 59

60 Partitioned Join (Equijoin) Selecting a good partition function f is very important Goal is to make all R i + S i the same size CS 347 Notes 3 60

61 Asymmetric Fragment + Replicate Join local join R A R 1 S S A R B R 2 S S B partition R 3 S result union CS 347 Notes 3 61

62 Asymmetric Fragment + Replicate Join Can use any partition function f for R (even round robin) Can do any join, not just equijoin (e.g., R S) R.A<S.B CS 347 Notes 3 62

63 General Fragment + Replicate Join R A R 1 R 1 R B R 2 R 2 f partition 3 fragments R 3 R 3 n copies of each fragment CS 347 Notes 3 63

64 General Fragment + Replicate Join R A R 1 R 1 R B R 2 R 2 f partition 3 fragments R 3 R 3 n copies of each fragment S is partitioned in similar fashion CS 347 Notes 3 64

65 General Fragment + Replicate Join R 1 S 1 R 1 S 2 R 2 S 1 R 2 S 2 n m pairings of R, S fragments R 3 S 1 R 3 S 2 result CS 347 Notes 3 65

66 General Fragment + Replicate Join The asymmetric F+R join is special case of the general F+R Asymmetric F+R may be good if S is small Also works on non-equijoins CS 347 Notes 3 66

67 Semi-Join Goal: reduce communication traffic R S ( R S ) S or A A A R ( S R ) or A ( R S ) ( S R ) A A A A CS 347 Notes 3 67

68 Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x CS 347 Notes 3 68

69 Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x A R = [2, 10, 25, 30] CS 347 Notes 3 69

70 Semi-Join R S A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x R S A R = [2, 10, 25, 30] 10 y S R = 25 w CS 347 Notes 3 70

71 Semi-Join A B A C R 2 a 10 b S 3 x 10 y 25 c 15 z 30 d 25 w 32 x Transmitted data With semi-join R ( S R ): T = 4 A + 2 A + C + result With join R S : T = 4 A + B + result better if B is large CS 347 Notes 3 71

72 Semi-Join Assume R is the smaller relation Then, in general, ( R S ) S is better than ( R S ) if A A A size( A S ) + size ( R S ) < size ( R ) A Can do similar comparison for other joins Only taking into account transmission cost CS 347 Notes 3 72

73 Semi-Join Trick: encode A S (or A R) as a bit vector keys in S one bit/possible key CS 347 Notes 3 73

74 Semi-Join Useful for three way joins R S T CS 347 Notes 3 74

75 Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T CS 347 Notes 3 75

76 Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T Option 2: R S T where R = R S S = S T CS 347 Notes 3 76

77 Semi-Join Useful for three way joins R S T Option 1: R S T where R = R S S = S T Option 2: R S T where R = R S S = S T Many options ~ exponential in # of relations CS 347 Notes 3 77

78 Privacy Preserving Join Site 1 has R(A, B) Site 2 has S(A, C) Want to compute R S Site 1 should not discover any S info not in the join Site 2 should not discover any R info not in the join CS 347 Notes 3 78

79 Privacy Preserving Join Semi-join won t work If site 1 sends A R to site 2, site 2 learns all keys of R A B A C R a 1 b 1 A R = (a 1, a 2, a 3, a 4 ) S a 1 c 1 a 2 b 2 a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 a 7 c 4 site 1 site 2 CS 347 Notes 3 79

80 Privacy Preserving Join Fix: send hash keys only Site 1 hashes each value of A before sending Site 2 hashes its own A values (same h) to see what tuples match R A B a 1 b 1 a 2 b 2 a 3 b 3 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) site 2 sees it has h(a 1 ), h(a 3 ) S A C a 1 c 1 a 3 c 2 a 5 c 3 a 4 b 4 a 7 c 4 (a1, c1), (a3, c3) site 1 site 2 CS 347 Notes 3 80

81 Privacy Preserving Join What is the problem? A B A C R a 1 b 1 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) S a 1 c 1 a 2 b 2 site 2 sees it has h(a 1 ), h(a 3 ) a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 a 7 c 4 (a1, c1), (a3, c3) site 1 site 2 CS 347 Notes 3 81

82 Privacy Preserving Join What is the problem? A B A C R a 1 b 1 A R = (h(a 1 ), h(a 2 ), h(a 3 ), h(a 4 )) S a 1 c 1 a 2 b 2 site 2 sees it has h(a 1 ), h(a 3 ) a 3 c 2 a 3 b 3 a 5 c 3 a 4 b 4 (a1, c1), (a3, c3) a 7 c 4 site 1 site 2 Dictionary attack Site 2 can take all keys a 1, a 2, a 3, and checks if h(a 1 ), h(a 2 ), h(a 3 ), matches what site 1 sent CS 347 Notes 3 82

83 Privacy Preserving Join Our adversary model: honest but curious Dictionary attack is possible (cheating is internal and can t be caught) Sending incorrect keys not possible (cheater could be caught) CS 347 Notes 3 83

84 Privacy Preserving Join One solution Information Sharing Across Private Databases. Agrawal et al., SIGMOD 2003 Use commutative encryption functions E i (x) = x encrypted using a key private to site i E 1 (E 2 (x)) = E 2 (E 1 (x)) Shorthand: E 1 (x) is x, E 2 (x) is x, E 1 (E 2 (x)) is x CS 347 Notes 3 84

85 Privacy Preserving Join One solution R A B a 1 b 1 a 2 b 2 ( a 1, a 2, a 3, a 4 ) S A C a 1 c 1 a 3 c 2 a 3 b 3 ( a 1, a 2, a 3, a 4 ) a 5 c 3 a 4 b 4 a 7 c 4 ( a 1, a 3, a 5, a 7 ) site 1 site 2 Computes ( a 1, a 3, a 5, a 7 ), intersects with ( a 1, a 2, a 3, a 4 ) (a 1, b 1 ), (a 3, b 3 ) CS 347 Notes 3 85

86 Other Privacy Preserving Operations Inequality join Similarity join R S R.A>S.A R S sim(r.a,s.a)<e CS 347 Notes 3 86

87 Other Parallel Operations Duplicate elimination Sort first (in parallel) then eliminate duplicates in result Partition tuples (range or hash) and eliminate locally Aggregates Partition by grouping attributes; compute aggregate locally CS 347 Notes 3 87

88 Parallel/Distributed Aggregates R a id dept salary 1 toy 10 2 toy 20 3 sales 15 R b id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum (sal) group by dept CS 347 Notes 3 88

89 Parallel/Distributed Aggregates R a R b id dept salary 1 toy 10 2 toy 20 3 sales 15 id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 id dept salary 1 toy 10 2 toy 20 5 toy 20 6 mgmt 15 8 mgmt 30 id dept salary 3 sales 15 4 sales 5 7 sales 10 sum (sal) group by dept CS 347 Notes 3 89

90 Parallel/Distributed Aggregates R a id dept salary 1 toy 10 id dept salary 1 toy 10 sum dept sum toy 50 2 toy 20 2 toy 20 mgmt 45 3 sales 15 5 toy 20 6 mgmt 15 R b id dept salary 4 sales 5 5 toy 20 8 mgmt 30 id dept salary sum dept sum 6 mgmt 15 3 sales 15 sales 30 7 sales 10 4 sales 5 8 mgmt 30 7 sales 10 sum (sal) group by dept CS 347 Notes 3 90

91 Parallel/Distributed Aggregates R a id dept salary 1 toy 10 2 toy 20 3 sales 15 R b id dept salary 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 91

92 Parallel/Distributed Aggregates R a id dept salary 1 toy 10 sum dept salary toy 30 2 toy 20 toy 20 3 sales 15 mgmt 45 R b id dept salary 4 sales 5 5 toy 20 sum dept salary 6 mgmt 15 sales 15 7 sales 10 sales 15 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 92

93 Parallel/Distributed Aggregates R a id dept salary 1 toy 10 sum dept salary toy 30 sum dept sum toy 50 2 toy 20 toy 20 mgmt 45 3 sales 15 mgmt 45 R b id dept salary 4 sales 5 5 toy 20 sum dept salary sum dept sum 6 mgmt 15 sales 15 sales 30 7 sales 10 sales 15 8 mgmt 30 sum (sal) group by dept with less data CS 347 Notes 3 93

94 Parallel/Distributed Aggregates Enhancement Perform aggregate during partition to reduce data transmitted Does not work for all aggregate functions which ones? CS 347 Notes 3 94

95 Preview: MapReduce data A 1 map data B 1 reduce data C 1 data A 2 data B 2 data C 1 data A 3 CS 347 Notes 3 95

96 Parallel/Distributed Selection Straightforward if one can leverage range or hash partitioning How do indexes work? CS 347 Notes 3 96

97 Parallel/Distributed Selection Partition vector can act as the root of a distributed index k 0 k 1 local indexes site 1 site 2 site 3 CS 347 Notes 3 97

98 Parallel/Distributed Selection Distributed indexes on a non-partition attributes get complicated k 0 k 1 index sites tuple sites CS 347 Notes 3 98

99 Parallel/Distributed Selection Indexing schemes Which one is better in a distributed environment? How to make updates and expansion efficient? Where to store the directory and set of participants? Is global knowledge necessary? If the index is not too big, it may be better to keep it whole and replicate it CS 347 Notes 3 99

100 Query Processing Decomposition Localization Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies CS 347 Notes 3 100

101 Inter-Operation Parallelism Pipelined Independent CS 347 Notes 3 101

102 Pipelined Parallelism site 2 σ c S site 1 R site 1 R tuples matching σ c site 2 join S probe result CS 347 Notes 3 102

103 Pipelined Parallelism Pipelining cannot be used in all cases stream of R tuples stream of S tuples CS 347 Notes 3 103

104 Independent Parallelism site 1 site 2 R S T V 1. temp 1 R S temp 2 T V 2. result temp 1 temp 2 CS 347 Notes 3 104

105 Summary As we consider query plans for optimization, we must consider various new strategies for individual operations scheduling operations CS 347 Notes 3 105

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing

CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing Hector Garcia-Molina Zoltan Gyongyi CS 347 Notes 03 1 Query Processing! Decomposition! Localization! Optimization CS 347

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 4: Query Optimization Query Optimization Cost estimation Strategies for exploring plans Q min CS 347 Notes 4 2 Cost Estimation Based on

More information

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam. Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 2: Distributed Database Design Logistics Gradiance No action items for now Detailed instructions coming shortly First quiz to be released

More information

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 Star Joins A common structure for data mining of commercial data is the star join. For example, a chain store like Walmart keeps a fact table whose tuples each

More information

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1)

Correlated subqueries. Query Optimization. Magic decorrelation. COUNT bug. Magic example (slide 2) Magic example (slide 1) Correlated subqueries Query Optimization CPS Advanced Database Systems SELECT CID FROM Course Executing correlated subquery is expensive The subquery is evaluated once for every CPS course Decorrelate!

More information

CSE 562 Database Systems

CSE 562 Database Systems Outline Query Optimization CSE 562 Database Systems Query Processing: Algebraic Optimization Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall

More information

Relational Algebra on Bags. Why Bags? Operations on Bags. Example: Bag Selection. σ A+B < 5 (R) = A B

Relational Algebra on Bags. Why Bags? Operations on Bags. Example: Bag Selection. σ A+B < 5 (R) = A B Relational Algebra on Bags Why Bags? 13 14 A bag (or multiset ) is like a set, but an element may appear more than once. Example: {1,2,1,3} is a bag. Example: {1,2,3} is also a bag that happens to be a

More information

CS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce

CS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 11: MapReduce Motivation Distribution makes simple computations complex Communication Load balancing Fault tolerance Not all applications

More information

Introduction to Cryptography Lecture 13

Introduction to Cryptography Lecture 13 Introduction to Cryptography Lecture 13 Benny Pinkas June 5, 2011 Introduction to Cryptography, Benny Pinkas page 1 Electronic cash June 5, 2011 Introduction to Cryptography, Benny Pinkas page 2 Simple

More information

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins 11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,

More information

Logic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research Almaden. Lecture 4 Part 1

Logic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research Almaden. Lecture 4 Part 1 Logic and Databases Phokion G. Kolaitis UC Santa Cruz & IBM Research Almaden Lecture 4 Part 1 1 Thematic Roadmap Logic and Database Query Languages Relational Algebra and Relational Calculus Conjunctive

More information

communication complexity lower bounds yield data structure lower bounds

communication complexity lower bounds yield data structure lower bounds communication complexity lower bounds yield data structure lower bounds Implementation of a database - D: D represents a subset S of {...N} 2 3 4 Access to D via "membership queries" - Q for each i, can

More information

Database Design and Implementation

Database Design and Implementation Database Design and Implementation CS 645 Data provenance Provenance provenance, n. The fact of coming from some particular source or quarter; origin, derivation [Oxford English Dictionary] Data provenance

More information

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions

Quiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You

More information

Cut-and-Choose Yao-Based Secure Computation in the Online/Offline and Batch Settings

Cut-and-Choose Yao-Based Secure Computation in the Online/Offline and Batch Settings Cut-and-Choose Yao-Based Secure Computation in the Online/Offline and Batch Settings Yehuda Lindell Bar-Ilan University, Israel Technion Cryptoday 2014 Yehuda Lindell Online/Offline and Batch Yao 30/12/2014

More information

Benny Pinkas Bar Ilan University

Benny Pinkas Bar Ilan University Winter School on Bar-Ilan University, Israel 30/1/2011-1/2/2011 Bar-Ilan University Benny Pinkas Bar Ilan University 1 Extending OT [IKNP] Is fully simulatable Depends on a non-standard security assumption

More information

Factorized Relational Databases Olteanu and Závodný, University of Oxford

Factorized Relational Databases   Olteanu and Závodný, University of Oxford November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust

More information

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and

7 RC Simulates RA. Lemma: For every RA expression E(A 1... A k ) there exists a DRC formula F with F V (F ) = {A 1,..., A k } and 7 RC Simulates RA. We now show that DRC (and hence TRC) is at least as expressive as RA. That is, given an RA expression E that mentions at most C, there is an equivalent DRC expression E that mentions

More information

Relational operations

Relational operations Architecture Relational operations We will consider how to implement and define physical operators for: Projection Selection Grouping Set operators Join Then we will discuss how the optimizer use them

More information

Flow Algorithms for Two Pipelined Filtering Problems

Flow Algorithms for Two Pipelined Filtering Problems Flow Algorithms for Two Pipelined Filtering Problems Anne Condon, University of British Columbia Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn Ning Wu, Polytechnic

More information

Databases 2011 The Relational Algebra

Databases 2011 The Relational Algebra Databases 2011 Christian S. Jensen Computer Science, Aarhus University What is an Algebra? An algebra consists of values operators rules Closure: operations yield values Examples integers with +,, sets

More information

Multi-join Query Evaluation on Big Data Lecture 2

Multi-join Query Evaluation on Big Data Lecture 2 Multi-join Query Evaluation on Big Data Lecture 2 Dan Suciu March, 2015 Dan Suciu Multi-Joins Lecture 2 March, 2015 1 / 34 Multi-join Query Evaluation Outline Part 1 Optimal Sequential Algorithms. Thursday

More information

2.4 Graphing Inequalities

2.4 Graphing Inequalities .4 Graphing Inequalities Why We Need This Our applications will have associated limiting values - and either we will have to be at least as big as the value or no larger than the value. Why We Need This

More information

CS425: Algorithms for Web Scale Data

CS425: Algorithms for Web Scale Data CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org Challenges

More information

Question 1. The Chinese University of Hong Kong, Spring 2018

Question 1. The Chinese University of Hong Kong, Spring 2018 CSCI 5440: Cryptography The Chinese University of Hong Kong, Spring 2018 Homework 2 Solutions Question 1 Consider the following encryption algorithm based on the shortlwe assumption. The secret key is

More information

ANALYSIS OF PRIVACY-PRESERVING ELEMENT REDUCTION OF A MULTISET

ANALYSIS OF PRIVACY-PRESERVING ELEMENT REDUCTION OF A MULTISET J. Korean Math. Soc. 46 (2009), No. 1, pp. 59 69 ANALYSIS OF PRIVACY-PRESERVING ELEMENT REDUCTION OF A MULTISET Jae Hong Seo, HyoJin Yoon, Seongan Lim, Jung Hee Cheon, and Dowon Hong Abstract. The element

More information

Jeffrey D. Ullman Stanford University

Jeffrey D. Ullman Stanford University Jeffrey D. Ullman Stanford University 3 We are given a set of training examples, consisting of input-output pairs (x,y), where: 1. x is an item of the type we want to evaluate. 2. y is the value of some

More information

Flow Algorithms for Parallel Query Optimization

Flow Algorithms for Parallel Query Optimization Flow Algorithms for Parallel Query Optimization Amol Deshpande amol@cs.umd.edu University of Maryland Lisa Hellerstein hstein@cis.poly.edu Polytechnic University August 22, 2007 Abstract In this paper

More information

Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics

Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics Chengjie Qin 1 and Florin Rusu 2 1 GraphSQL, Inc. 2 University of California Merced June 29, 2017 Machine Learning (ML) Is

More information

Environment (Parallelizing Query Optimization)

Environment (Parallelizing Query Optimization) Advanced d Query Optimization i i Techniques in a Parallel Computing Environment (Parallelizing Query Optimization) Wook-Shin Han*, Wooseong Kwak, Jinsoo Lee Guy M. Lohman, Volker Markl Kyungpook National

More information

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation Dichotomy for Non-Repeating Queries with Negation Queries with Negation in in Probabilistic Databases Robert Dan Olteanu Fink and Dan Olteanu Joint work with Robert Fink Uncertainty in Computation Simons

More information

Lecture 21: Algebraic Computation Models

Lecture 21: Algebraic Computation Models princeton university cos 522: computational complexity Lecture 21: Algebraic Computation Models Lecturer: Sanjeev Arora Scribe:Loukas Georgiadis We think of numerical algorithms root-finding, gaussian

More information

Relational Algebra and Calculus

Relational Algebra and Calculus Topics Relational Algebra and Calculus Linda Wu Formal query languages Preliminaries Relational algebra Relational calculus Expressive power of algebra and calculus (CMPT 354 2004-2) Chapter 4 CMPT 354

More information

Foundations of Databases

Foundations of Databases Foundations of Databases (Slides adapted from Thomas Eiter, Leonid Libkin and Werner Nutt) Foundations of Databases 1 Quer optimization: finding a good wa to evaluate a quer Queries are declarative, and

More information

Lecture 19: Interactive Proofs and the PCP Theorem

Lecture 19: Interactive Proofs and the PCP Theorem Lecture 19: Interactive Proofs and the PCP Theorem Valentine Kabanets November 29, 2016 1 Interactive Proofs In this model, we have an all-powerful Prover (with unlimited computational prover) and a polytime

More information

Relational Algebra 2. Week 5

Relational Algebra 2. Week 5 Relational Algebra 2 Week 5 Relational Algebra (So far) Basic operations: Selection ( σ ) Selects a subset of rows from relation. Projection ( π ) Deletes unwanted columns from relation. Cross-product

More information

Fast Cut-and-Choose Based Protocols for Malicious and Covert Adversaries

Fast Cut-and-Choose Based Protocols for Malicious and Covert Adversaries Fast Cut-and-Choose Based Protocols for Malicious and Covert Adversaries Yehuda Lindell Dept. of Computer Science Bar-Ilan University, Israel lindell@biu.ac.il February 8, 2015 Abstract In the setting

More information

CS54100: Database Systems

CS54100: Database Systems CS54100: Database Systems Relational Algebra 3 February 2012 Prof. Walid Aref Core Relational Algebra A small set of operators that allow us to manipulate relations in limited but useful ways. The operators

More information

Lecture 4 February 2nd, 2017

Lecture 4 February 2nd, 2017 CS 224: Advanced Algorithms Spring 2017 Prof. Jelani Nelson Lecture 4 February 2nd, 2017 Scribe: Rohil Prasad 1 Overview In the last lecture we covered topics in hashing, including load balancing, k-wise

More information

Non-Interactive Zero-Knowledge Proofs of Non-Membership

Non-Interactive Zero-Knowledge Proofs of Non-Membership Non-Interactive Zero-Knowledge Proofs of Non-Membership O. Blazy, C. Chevalier, D. Vergnaud XLim / Université Paris II / ENS O. Blazy (XLim) Negative-NIZK CT-RSA 2015 1 / 22 1 Brief Overview 2 Building

More information

A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries

A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries Bas Ketsman Dan Suciu Abstract We study the optimal communication cost for computing a full conjunctive query

More information

2. Accelerated Computations

2. Accelerated Computations 2. Accelerated Computations 2.1. Bent Function Enumeration by a Circular Pipeline Implemented on an FPGA Stuart W. Schneider Jon T. Butler 2.1.1. Background A naive approach to encoding a plaintext message

More information

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Background We started with schema design ER model translation into a relational schema Then we studied relational

More information

CSE 4502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions

CSE 4502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions CSE 502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions 1. Consider the following algorithm: for i := 1 to α n log e n do Pick a random j [1, n]; If a[j] = a[j + 1] or a[j] = a[j 1] then output:

More information

DRAFT. Algebraic computation models. Chapter 14

DRAFT. Algebraic computation models. Chapter 14 Chapter 14 Algebraic computation models Somewhat rough We think of numerical algorithms root-finding, gaussian elimination etc. as operating over R or C, even though the underlying representation of the

More information

The Query Containment Problem: Set Semantics vs. Bag Semantics. Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden

The Query Containment Problem: Set Semantics vs. Bag Semantics. Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden The Query Containment Problem: Set Semantics vs. Bag Semantics Phokion G. Kolaitis University of California Santa Cruz & IBM Research - Almaden PROBLEMS Problems worthy of attack prove their worth by hitting

More information

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS INTRODUCTION TO RELATIONAL DATABASE SYSTEMS DATENBANKSYSTEME 1 (INF 3131) Torsten Grust Universität Tübingen Winter 2017/18 1 THE RELATIONAL ALGEBRA The Relational Algebra (RA) is a query language for

More information

Turing s Thesis. Fall Costas Busch - RPI!1

Turing s Thesis. Fall Costas Busch - RPI!1 Turing s Thesis Costas Busch - RPI!1 Turing s thesis (1930): Any computation carried out by mechanical means can be performed by a Turing Machine Costas Busch - RPI!2 Algorithm: An algorithm for a problem

More information

α-acyclic Joins Jef Wijsen May 4, 2017

α-acyclic Joins Jef Wijsen May 4, 2017 α-acyclic Joins Jef Wijsen May 4, 2017 1 Motivation Joins in a Distributed Environment Assume the following relations. 1 M[NN, Field of Study, Year] stores data about students of UMONS. For example, (19950423158,

More information

An Efficient and Secure Protocol for Privacy Preserving Set Intersection

An Efficient and Secure Protocol for Privacy Preserving Set Intersection An Efficient and Secure Protocol for Privacy Preserving Set Intersection PhD Candidate: Yingpeng Sang Advisor: Associate Professor Yasuo Tan School of Information Science Japan Advanced Institute of Science

More information

Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University

Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University Goals of Verifiable Computation Provide user with correctness

More information

Communication Complexity

Communication Complexity Communication Complexity Jaikumar Radhakrishnan School of Technology and Computer Science Tata Institute of Fundamental Research Mumbai 31 May 2012 J 31 May 2012 1 / 13 Plan 1 Examples, the model, the

More information

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM SPATIAL DATA MINING Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM INTRODUCTION The main difference between data mining in relational DBS and in spatial DBS is that attributes of the neighbors

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Week 12: Real-Time Data Analytics (2/2) March 31, 2016 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

High Performance Computing

High Performance Computing Master Degree Program in Computer Science and Networking, 2014-15 High Performance Computing 2 nd appello February 11, 2015 Write your name, surname, student identification number (numero di matricola),

More information

Query Processing. 3 steps: Parsing & Translation Optimization Evaluation

Query Processing. 3 steps: Parsing & Translation Optimization Evaluation rela%onal algebra Query Processing 3 steps: Parsing & Translation Optimization Evaluation 30 Simple set of algebraic operations on relations Journey of a query SQL select from where Rela%onal algebra π

More information

Schedule. Today: Jan. 17 (TH) Jan. 24 (TH) Jan. 29 (T) Jan. 22 (T) Read Sections Assignment 2 due. Read Sections Assignment 3 due.

Schedule. Today: Jan. 17 (TH) Jan. 24 (TH) Jan. 29 (T) Jan. 22 (T) Read Sections Assignment 2 due. Read Sections Assignment 3 due. Schedule Today: Jan. 17 (TH) Relational Algebra. Read Chapter 5. Project Part 1 due. Jan. 22 (T) SQL Queries. Read Sections 6.1-6.2. Assignment 2 due. Jan. 24 (TH) Subqueries, Grouping and Aggregation.

More information

Lecture 14: Secure Multiparty Computation

Lecture 14: Secure Multiparty Computation 600.641 Special Topics in Theoretical Cryptography 3/20/2007 Lecture 14: Secure Multiparty Computation Instructor: Susan Hohenberger Scribe: Adam McKibben 1 Overview Suppose a group of people want to determine

More information

Lecture 11: Non-Interactive Zero-Knowledge II. 1 Non-Interactive Zero-Knowledge in the Hidden-Bits Model for the Graph Hamiltonian problem

Lecture 11: Non-Interactive Zero-Knowledge II. 1 Non-Interactive Zero-Knowledge in the Hidden-Bits Model for the Graph Hamiltonian problem CS 276 Cryptography Oct 8, 2014 Lecture 11: Non-Interactive Zero-Knowledge II Instructor: Sanjam Garg Scribe: Rafael Dutra 1 Non-Interactive Zero-Knowledge in the Hidden-Bits Model for the Graph Hamiltonian

More information

1 Approximate Quantiles and Summaries

1 Approximate Quantiles and Summaries CS 598CSC: Algorithms for Big Data Lecture date: Sept 25, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri Suppose we have a stream a 1, a 2,..., a n of objects from an ordered universe. For simplicity

More information

Lecture 3: Interactive Proofs and Zero-Knowledge

Lecture 3: Interactive Proofs and Zero-Knowledge CS 355 Topics in Cryptography April 9, 2018 Lecture 3: Interactive Proofs and Zero-Knowledge Instructors: Henry Corrigan-Gibbs, Sam Kim, David J. Wu So far in the class, we have only covered basic cryptographic

More information

Lecture 4: DES and block ciphers

Lecture 4: DES and block ciphers Lecture 4: DES and block ciphers Johan Håstad, transcribed by Ernir Erlingsson 2006-01-25 1 DES DES is a 64 bit block cipher with a 56 bit key. It selects a 64 bit block and modifies it depending on the

More information

Exam 1. March 12th, CS525 - Midterm Exam Solutions

Exam 1. March 12th, CS525 - Midterm Exam Solutions Name CWID Exam 1 March 12th, 2014 CS525 - Midterm Exam s Please leave this empty! 1 2 3 4 5 Sum Things that you are not allowed to use Personal notes Textbook Printed lecture notes Phone The exam is 90

More information

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018 DESIGN THEORY FOR RELATIONAL DATABASES csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018 1 Introduction There are always many different schemas for a given

More information

Quantum Mechanics - I Prof. Dr. S. Lakshmi Bala Department of Physics Indian Institute of Technology, Madras. Lecture - 16 The Quantum Beam Splitter

Quantum Mechanics - I Prof. Dr. S. Lakshmi Bala Department of Physics Indian Institute of Technology, Madras. Lecture - 16 The Quantum Beam Splitter Quantum Mechanics - I Prof. Dr. S. Lakshmi Bala Department of Physics Indian Institute of Technology, Madras Lecture - 16 The Quantum Beam Splitter (Refer Slide Time: 00:07) In an earlier lecture, I had

More information

FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE

FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE FIRST-ORDER QUERY EVALUATION ON STRUCTURES OF BOUNDED DEGREE INRIA and ENS Cachan e-mail address: kazana@lsv.ens-cachan.fr WOJCIECH KAZANA AND LUC SEGOUFIN INRIA and ENS Cachan e-mail address: see http://pages.saclay.inria.fr/luc.segoufin/

More information

Lecture 11: Data Flow Analysis III

Lecture 11: Data Flow Analysis III CS 515 Programming Language and Compilers I Lecture 11: Data Flow Analysis III (The lectures are based on the slides copyrighted by Keith Cooper and Linda Torczon from Rice University.) Zheng (Eddy) Zhang

More information

NP-COMPLETE PROBLEMS. 1. Characterizing NP. Proof

NP-COMPLETE PROBLEMS. 1. Characterizing NP. Proof T-79.5103 / Autumn 2006 NP-complete problems 1 NP-COMPLETE PROBLEMS Characterizing NP Variants of satisfiability Graph-theoretic problems Coloring problems Sets and numbers Pseudopolynomial algorithms

More information

COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from

COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from COMPUTING SIMILARITY BETWEEN DOCUMENTS (OR ITEMS) This part is to a large extent based on slides obtained from http://www.mmds.org Distance Measures For finding similar documents, we consider the Jaccard

More information

6.830 Lecture 11. Recap 10/15/2018

6.830 Lecture 11. Recap 10/15/2018 6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking

More information

CS6931 Database Seminar. Lecture 6: Set Operations on Massive Data

CS6931 Database Seminar. Lecture 6: Set Operations on Massive Data CS6931 Database Seminar Lecture 6: Set Operations on Massive Data Set Resemblance and MinWise Hashing/Independent Permutation Basics Consider two sets S1, S2 U = {0, 1, 2,...,D 1} (e.g., D = 2 64 ) f1

More information

Advanced Topics in Cryptography

Advanced Topics in Cryptography Advanced Topics in Cryptography Lecture 6: El Gamal. Chosen-ciphertext security, the Cramer-Shoup cryptosystem. Benny Pinkas based on slides of Moni Naor page 1 1 Related papers Lecture notes of Moni Naor,

More information

Approximate Query Processing Using Wavelets

Approximate Query Processing Using Wavelets Approximate Query Processing Using Wavelets Kaushik Chakrabarti Minos Garofalakis Rajeev Rastogi Kyuseok Shim Presented by Guanghua Yan Outline Approximate query processing: Problem and Prior solutions

More information

CS475: Linear Equations Gaussian Elimination LU Decomposition Wim Bohm Colorado State University

CS475: Linear Equations Gaussian Elimination LU Decomposition Wim Bohm Colorado State University CS475: Linear Equations Gaussian Elimination LU Decomposition Wim Bohm Colorado State University Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution

More information

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of

More information

Session 4: Efficient Zero Knowledge. Yehuda Lindell Bar-Ilan University

Session 4: Efficient Zero Knowledge. Yehuda Lindell Bar-Ilan University Session 4: Efficient Zero Knowledge Yehuda Lindell Bar-Ilan University 1 Proof Systems Completeness: can convince of a true statement Soundness: cannot convince for a false statement Classic proofs: Written

More information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information Relational Database Design Database Design To generate a set of relation schemas that allows - to store information without unnecessary redundancy - to retrieve desired information easily Approach - design

More information

Data-Intensive Distributed Computing

Data-Intensive Distributed Computing Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 9: Real-Time Data Analytics (2/2) March 29, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

Lecture 10: Data Flow Analysis II

Lecture 10: Data Flow Analysis II CS 515 Programming Language and Compilers I Lecture 10: Data Flow Analysis II (The lectures are based on the slides copyrighted by Keith Cooper and Linda Torczon from Rice University.) Zheng (Eddy) Zhang

More information

Logical Provenance in Data-Oriented Workflows (Long Version)

Logical Provenance in Data-Oriented Workflows (Long Version) Logical Provenance in Data-Oriented Workflows (Long Version) Robert Ikeda Stanford University rmikeda@cs.stanford.edu Akash Das Sarma IIT Kanpur akashds.iitk@gmail.com Jennifer Widom Stanford University

More information

Optimal Reductions between Oblivious Transfers using Interactive Hashing

Optimal Reductions between Oblivious Transfers using Interactive Hashing Optimal Reductions between Oblivious Transfers using Interactive Hashing Claude Crépeau and George Savvides {crepeau,gsavvi1}@cs.mcgill.ca McGill University, Montéral, QC, Canada. Abstract. We present

More information

A Tight Lower Bound for Dynamic Membership in the External Memory Model

A Tight Lower Bound for Dynamic Membership in the External Memory Model A Tight Lower Bound for Dynamic Membership in the External Memory Model Elad Verbin ITCS, Tsinghua University Qin Zhang Hong Kong University of Science & Technology April 2010 1-1 The computational model

More information

Instructor: Sudeepa Roy

Instructor: Sudeepa Roy CompSci 590.6 Understanding Data: Theory and Applications Lecture 13 Incomplete Databases Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu 1 Today s Reading Alice Book : Foundations of Databases Abiteboul-

More information

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Massimo Franceschet Angelo Montanari Dipartimento di Matematica e Informatica, Università di Udine Via delle

More information

Algorithms for Data Science

Algorithms for Data Science Algorithms for Data Science CSOR W4246 Eleni Drinea Computer Science Department Columbia University Tuesday, December 1, 2015 Outline 1 Recap Balls and bins 2 On randomized algorithms 3 Saving space: hashing-based

More information

Secure Indexes* Eu-Jin Goh Stanford University 15 March 2004

Secure Indexes* Eu-Jin Goh Stanford University 15 March 2004 Secure Indexes* Eu-Jin Goh Stanford University 15 March 2004 * Generalizes an early version of my paper How to search on encrypted data on eprint Cryptology Archive on 7 October 2003 Secure Indexes Data

More information

Query answering using views

Query answering using views Query answering using views General setting: database relations R 1,...,R n. Several views V 1,...,V k are defined as results of queries over the R i s. We have a query Q over R 1,...,R n. Question: Can

More information

Chosen IV Statistical Analysis for Key Recovery Attacks on Stream Ciphers

Chosen IV Statistical Analysis for Key Recovery Attacks on Stream Ciphers Chosen IV Statistical Analysis for Key Recovery Attacks on Stream Ciphers Simon Fischer 1, Shahram Khazaei 2, and Willi Meier 1 1 FHNW and 2 EPFL (Switzerland) AfricaCrypt 2008, Casablanca - June 11-14

More information

Movies Title Director Actor

Movies Title Director Actor Movies Title Director Actor The Trouble with Harry Hitchcock Gwenn The Trouble with Harry Hitchcock Forsythe The Trouble with Harry Hitchcock MacLaine The Trouble with Harry Hitchcock Hitchcock Cries and

More information

CS632 Notes on Relational Query Languages I

CS632 Notes on Relational Query Languages I CS632 Notes on Relational Query Languages I A. Demers 6 Feb 2003 1 Introduction Here we define relations, and introduce our notational conventions, which are taken almost directly from [AD93]. We begin

More information

Busch Complexity Lectures: Turing s Thesis. Costas Busch - LSU 1

Busch Complexity Lectures: Turing s Thesis. Costas Busch - LSU 1 Busch Complexity Lectures: Turing s Thesis Costas Busch - LSU 1 Turing s thesis (1930): Any computation carried out by mechanical means can be performed by a Turing Machine Costas Busch - LSU 2 Algorithm:

More information

GAV-sound with conjunctive queries

GAV-sound with conjunctive queries GAV-sound with conjunctive queries Source and global schema as before: source R 1 (A, B),R 2 (B,C) Global schema: T 1 (A, C), T 2 (B,C) GAV mappings become sound: T 1 {x, y, z R 1 (x,y) R 2 (y,z)} T 2

More information

Differential Privacy

Differential Privacy CS 380S Differential Privacy Vitaly Shmatikov most slides from Adam Smith (Penn State) slide 1 Reading Assignment Dwork. Differential Privacy (invited talk at ICALP 2006). slide 2 Basic Setting DB= x 1

More information

Lecture 16 Oct 21, 2014

Lecture 16 Oct 21, 2014 CS 395T: Sublinear Algorithms Fall 24 Prof. Eric Price Lecture 6 Oct 2, 24 Scribe: Chi-Kit Lam Overview In this lecture we will talk about information and compression, which the Huffman coding can achieve

More information

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 13 Indexes for Multimedia Data 13 Indexes for Multimedia

More information

Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur

Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur March 19, 2018 Modular Exponentiation Public key Cryptography March 19, 2018 Branch Prediction Attacks 2 / 54 Modular Exponentiation

More information

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit Finally the Weakest Failure Detector for Non-Blocking Atomic Commit Rachid Guerraoui Petr Kouznetsov Distributed Programming Laboratory EPFL Abstract Recent papers [7, 9] define the weakest failure detector

More information

Extending Oblivious Transfers Efficiently

Extending Oblivious Transfers Efficiently Extending Oblivious Transfers Efficiently Yuval Ishai Technion Joe Kilian Kobbi Nissim Erez Petrank NEC Microsoft Technion Motivation x y f(x,y) How (in)efficient is generic secure computation? myth don

More information

Lifted Inference: Exact Search Based Algorithms

Lifted Inference: Exact Search Based Algorithms Lifted Inference: Exact Search Based Algorithms Vibhav Gogate The University of Texas at Dallas Overview Background and Notation Probabilistic Knowledge Bases Exact Inference in Propositional Models First-order

More information