Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog
|
|
- Kristopher Green
- 5 years ago
- Views:
Transcription
1 Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Hung Q. Ngo (Stealth Mode) With Mahmoud Abo Khamis and Dan PODS 2017
2 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
3 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
4 Exciting Recent Results on Query Evaluation
5 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n]
6 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n]
7 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n] Relation R F (A F ) for each F E, R F for short e.g. R 123 for R 123 (A 1, A 2, A 3 ) e.g. T 25 for T 25 (A 2, A 5 )
8 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n] Relation R F (A F ) for each F E, R F for short e.g. R 123 for R 123 (A 1, A 2, A 3 ) e.g. T 25 for T 25 (A 2, A 5 )
9 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n] Relation R F (A F ) for each F E, R F for short e.g. R 123 for R 123 (A 1, A 2, A 3 ) e.g. T 25 for T 25 (A 2, A 5 ) Problem (Boolean Conjunctive Query (BCQ)) Q : S() F E R F, // can all R F be satisfied at once?
10 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n] Relation R F (A F ) for each F E, R F for short e.g. R 123 for R 123 (A 1, A 2, A 3 ) e.g. T 25 for T 25 (A 2, A 5 ) Problem (Boolean Conjunctive Query (BCQ)) Q : S() F E R F, // can all R F be satisfied at once? Question How do we evaluate Q efficiently?
11 A Query Evaluation Problem Hypergraph H = ([n], E), E 2 [n] Attribute A i, i [n], A J := {A j j J}, J [n] Relation R F (A F ) for each F E, R F for short e.g. R 123 for R 123 (A 1, A 2, A 3 ) e.g. T 25 for T 25 (A 2, A 5 ) Problem (Boolean Conjunctive Query (BCQ)) Q : S() F E R F, // can all R F be satisfied at once? Question How do we evaluate Q efficiently? Conjunctive, count, aggregate queries are fine too.
12 What do we mean by efficiency? Database centric complexity framework Assumption 1: query size data size Data complexity Fixed parameter tractability (e.g. parameter = some function of query size) Õ(something) = O (f( query ) polylog( data ) something)
13 What do we mean by efficiency? Database centric complexity framework Assumption 1: query size data size Data complexity Fixed parameter tractability (e.g. parameter = some function of query size) Õ(something) = O (f( query ) polylog( data ) something) Assumption 2: known constraints on input relations Cardinalities of materialized relations, or upper bounds cardinality constraints (CC) Functional dependencies, the more the merrier FD constraints (FDC) Degree bounds, the more the merrier Degree constraints (DC)
14 Example Q : S() R 12 R 23 R 34 R 41 A 1 + A 2 = A 3. H = ([4], {12, 23, 34, 14, 123}) A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 e 6 e 7 A 4 A Cardinalities: R 12 = 4, R 23 = 3,...
15 Example Q : S() R 12 R 23 R 34 R 41 A 1 + A 2 = A 3. H = ([4], {12, 23, 34, 14, 123}) A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 e 6 e 7 A 4 A Cardinalities: R 12 = 4, R 23 = 3,... FD: {A 1, A 2 } A 3, {A 3, A 2 } A 1,...
16 Example Q : S() R 12 R 23 R 34 R 41 A 1 + A 2 = A 3. H = ([4], {12, 23, 34, 14, 123}) A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 e 6 e 7 A 4 A Cardinalities: R 12 = 4, R 23 = 3,... FD: {A 1, A 2 } A 3, {A 3, A 2 } A 1,... Degree Bounds: deg 34 (A 4 A 3 = x) def = σ A3=x(R 34 ) 2, x,...
17 Degree- generalize cardinality- and FD-constraints DC CC FDC Degree Constraints (DC): deg F (A Y A X ) N Y X, X Y F E
18 Degree- generalize cardinality- and FD-constraints DC CC FDC Degree Constraints (DC): deg F (A Y A X ) N Y X, X Y F E Cardinality Constraints (CC): R F N deg F (A F A ) N F def = N.
19 Degree- generalize cardinality- and FD-constraints DC CC FDC Degree Constraints (DC): deg F (A Y A X ) N Y X, X Y F E Cardinality Constraints (CC): R F N deg F (A F A ) N F def = N. Functional Dependencies (FDC): A X A Y deg F (A Y A X ) N Y X def = 1.
20 Islands of tractability (redrawn from D. Marx s slides) Prior results with cardinality constraints not FPT FPT PTIME Bounded submodular width Bounded fractional hypertree width Bounded (generalized) Hypertree Width Bounded fractional edge cover number Bounded Treewidth
21 Islands of tractability (redrawn from D. Marx s slides) Prior results with cardinality constraints not FPT FPT PTIME Bounded submodular width Bounded fractional hypertree width Bounded (generalized) Hypertree Width Bounded fractional edge cover number Bounded Treewidth We want the same map with degree constraints.
22 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer
23 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer (a) otime = Q i (D) + answer
24 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer (a) otime = Q i (D) + answer i.e. Q o evaluatable in linear time
25 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer (a) otime = Q i (D) + answer i.e. Q o evaluatable in linear time (b) itime = max D =DC Q i(d)
26 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer (a) otime = Q i (D) + answer i.e. Q o evaluatable in linear time (b) itime = max D =DC Q i(d) i.e. Qi evaluatable within its worst-case output size
27 A Meta Algorithm (i.e. Meta Query Plan) Database D F E R F itime Datalog rule Q i Q i (D) A propositional formula on relations T j, j J Q o otime overall time = Õ( input + itime + otime ) Output Answer (a) otime = Q i (D) + answer i.e. Q o evaluatable in linear time (b) itime = max D =DC Q i(d) i.e. Qi evaluatable within its worst-case output size Design Q i s.t. (a) holds and (b) as small as possible
28 Option 1: Full Conjunctive Query Full conjunctive query Database D Q i Q i (D) Q o Output F E R F itime T [n] otime Answer
29 Option 1: Full Conjunctive Query Full conjunctive query Database D Q i Q i (D) Q o Output F E R F itime T [n] otime Answer otime = Q i (D) + answer trivially true
30 Option 1: Full Conjunctive Query Database D F E R F Full conjunctive query itime Q i Q i (D) T [n] Q o otime Output Answer otime = Q i (D) + answer itime = max Q i(d) D =DC trivially true worst-case optimal algorithm Known if all DC are cardinality constraints NPRR [Ngo, Porat, Ré, Rudra PODS 12] Leapfrog-Triejoin [Veldhuizen ICDT 14] Generic Join [Ngo, Ré, Rudra SIGMOD Records 2013] Unknown for general DC until our work
31 Detour: Tree Decompositions, Informally
32 Detour: Tree Decompositions, Informally S() R(a, b, d) c < d T (c, b, d) U(b, e) V (c, e) b + e = f W (b, e, g) X(i, j, h) e b = k.
33 Detour: Tree Decompositions, Informally S() R(a, b, d) c < d T (c, b, d) U(b, e) V (c, e) b + e = f W (b, e, g) X(i, j, h) e b = k. b, c, e bag a, b, c, d bag b, e, f, g, h bag h, i, j bag e, b, k bag
34 Detour: Tree Decompositions, Informally S() R(a, b, d) c < d T (c, b, d) U(b, e) V (c, e) b + e = f W (b, e, g) X(i, j, h) e b = k. b, c, e bag a, b, c, d bag b, e, f, g, h bag h, i, j bag e, b, k bag Every relation is covered by some bag
35 Detour: Tree Decompositions, Informally S() R(a, b, d) c < d T (c, b, d) U(b, e) V (c, e) b + e = f W (b, e, g) g/f = h X(i, j, h) e b = k. b, c, e bag a, b, c, d bag b, e, f, g, h bag h, i, j bag e, b, k bag Every relation is covered by some bag
36 Detour: Tree Decompositions, Informally S() R(a, b, d) c < d T (c, b, d) U(b, e) V (c, e) b + e = f W (b, e, g) X(i, j, h) e b = k. b, c, e bag a, b, c, d bag b, e, f, g, h bag h, i, j bag e, b, k bag Every relation is covered by some bag Bags conntaining a given variable are connected
37 Detour: Tree Decompositions, Formally Hypergraph H = ([n], E) A Tree Decomposition of H is a pair (T, χ) where T = (V (T ), E(T )) is a tree See [Gottlob et al 2016], Gems of PODS.
38 Detour: Tree Decompositions, Formally Hypergraph H = ([n], E) A Tree Decomposition of H is a pair (T, χ) where T = (V (T ), E(T )) is a tree χ : V (T ) 2 [n] assigns a bag χ(v) to each tree-node v Every hyperedge F E is covered by some bag (F χ(v)) Bags containing i [n] forms a subtree See [Gottlob et al 2016], Gems of PODS.
39 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer
40 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer Q i (D) is Olteanu s factorized database (FDB)
41 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer Q i (D) is Olteanu s factorized database (FDB) otime = Q i (D) + answer Yannakakis for join, FDB/InsideOut for aggregates
42 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer Q i (D) is Olteanu s factorized database (FDB) otime = Q i (D) + answer Yannakakis for join, FDB/InsideOut for aggregates itime = max Q i(d) max max P v(d) D =DC D =DC v V (T ) Pv : T χ(v) R F v V (T ) F E
43 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer Q i (D) is Olteanu s factorized database (FDB) otime = Q i (D) + answer Yannakakis for join, FDB/InsideOut for aggregates itime = max Q i(d) max max P v(d) D =DC D =DC v V (T ) Pv : T χ(v) R F v V (T ) F E
44 Option 2: A Single Tree Decomposition Fix (T, χ) Database D Multiple Conjunctive Rules itime R F T χ(v) F E v V (T ) Q i Q i (D) Q o otime Output Answer Q i (D) is Olteanu s factorized database (FDB) otime = Q i (D) + answer Yannakakis for join, FDB/InsideOut for aggregates itime = max Q i(d) max max P v(d) D =DC D =DC v V (T ) Pv : T χ(v) R F v V (T ) F E min max max P v(d) N fhtw(h) N ghtw(h) N tw(h)+1 (T,χ) D =CC v V (T )
45 Option 3: Multiple Tree Decompositions Database D Q i Q i (D) itime R F T χ(v) F E (T,χ) v V (T ) Q o otime Output Answer ranges over non-redundant TDs (T, χ)
46 Option 3: Multiple Tree Decompositions Database D Q i Q i (D) itime R F T χ(v) F E (T,χ) v V (T ) Q o otime Output Answer ranges over non-redundant TDs (T, χ) otime = Q i (D) + answer Union of Yannakakis on all TDs
47 Option 3: Multiple Tree Decompositions Database D How to evaluate this? Q i Q i (D) itime R F T χ(v) F E (T,χ) v V (T ) Q o otime Output Answer ranges over non-redundant TDs (T, χ) otime = Q i (D) + answer Union of Yannakakis on all TDs itime = max D =DC Q i(d)?
48 Example: Q : S() R 12 R 23 R 34 R 41 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34
49 Example: Q : S() R 12 R 23 R 34 R 41 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 A 3 A 1 A 4 A 3 A 1 A 2 A 4 A 2 A 4 A 3
50 Example: Q : S() R 12 R 23 R 34 R 41 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 A 3 A 1 A 4 A 3 A 1 A 2 A 4 A 2 A 4 A 3 (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41
51 Example: Q : S() R 12 R 23 R 34 R 41 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 A 3 A 1 A 4 A 3 A 1 A 2 A 4 A 2 A 4 A 3 (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41 By distributivity, rewrite an equivalent the head: (T 123 T 124 ) (T 123 T 234 ) (T 134 T 124 ) (T 134 T 234 ) R 12 R 23 R 34 R 41 (Each clause has one bag per TD)
52 Option 3: Multiple Tree Decompositions Database D Multiple Disjunctive Datalog Rules! itime R F T B F E B B B Q i Q i (D) Q o otime Output Answer otime = Q i (D) + answer Union of Yannakakis on all TDs itime = max Q i(d) max D =DC B PB : T B B B F E R F D =DC max P B(D) disjunctive datalog rule
53 Option 3: Multiple Tree Decompositions Database D Multiple Disjunctive Datalog Rules! itime R F T B F E B B B Q i Q i (D) Q o otime Output Answer otime = Q i (D) + answer Union of Yannakakis on all TDs itime = max Q i(d) max D =DC B PB : T B B B F E R F D =DC max P B(D) disjunctive datalog rule max max P B(D) N subw(h) N fhtw(h) D =CC B subw = submodular width (Daniel Marx, JACM 2013)
54 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: Q i : T 1234 R 12 R 23 R 34 R 41
55 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: Option 2: Q i : T 1234 R 12 R 23 R 34 R 41 either Q i : T 123 T 134 R 12 R 23 R 34 R 41 or Q i : T 124 T 234 R 12 R 23 R 34 R 41
56 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: Option 2: Q i : T 1234 R 12 R 23 R 34 R 41 Option 3: either Q i : T 123 T 134 R 12 R 23 R 34 R 41 or Q i : T 124 T 234 R 12 R 23 R 34 R 41 Q i : (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41 Equivalent to: P 123,124 : T 123 T 124 R 12 R 23 R 34 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41 P 134,124 : T 134 T 124 R 12 R 23 R 34 R 41 P 134,234 : T 134 T 234 R 12 R 23 R 34 R 41
57 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: itime = max D =DC Q i(d) = N 2 Q i : T 1234 R 12 R 23 R 34 R 41 Option 2: Option 3: either Q i : T 123 T 134 R 12 R 23 R 34 R 41 or Q i : T 124 T 234 R 12 R 23 R 34 R 41 Q i : (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41 Equivalent to: P 123,124 : T 123 T 124 R 12 R 23 R 34 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41 P 134,124 : T 134 T 124 R 12 R 23 R 34 R 41 P 134,234 : T 134 T 234 R 12 R 23 R 34 R 41
58 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: itime = max D =DC Q i(d) = N 2 Q i : T 1234 R 12 R 23 R 34 R 41 Option 2: itime = max D =DC Q i(d) = N 2 Option 3: either Q i : T 123 T 134 R 12 R 23 R 34 R 41 or Q i : T 124 T 234 R 12 R 23 R 34 R 41 Q i : (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41 Equivalent to: P 123,124 : T 123 T 124 R 12 R 23 R 34 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41 P 134,124 : T 134 T 124 R 12 R 23 R 34 R 41 P 134,234 : T 134 T 234 R 12 R 23 R 34 R 41
59 Example: Q : S() R 12 R 23 R 34 R 41 Option 1: itime = max D =DC Q i(d) = N 2 Q i : T 1234 R 12 R 23 R 34 R 41 Option 2: itime = max D =DC Q i(d) = N 2 either Q i : T 123 T 134 R 12 R 23 R 34 R 41 or Q i : T 124 T 234 R 12 R 23 R 34 R 41 Option 3: itime = max D =DC Q i(d) = N 3/2 Q i : (T 123 T 134 ) (T 124 T 234 ) R 12 R 23 R 34 R 41 Equivalent to: P 123,124 : T 123 T 124 R 12 R 23 R 34 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41 P 134,124 : T 134 T 124 R 12 R 23 R 34 R 41 P 134,234 : T 134 T 234 R 12 R 23 R 34 R 41
60 Roadmap Given degree constraints DC, and a disjunctive datalog rule P : B B T B F E R F
61 Roadmap Given degree constraints DC, and a disjunctive datalog rule P : B B T B F E R F Question (Worst-case Output Size Bound) Find a good upper-bound for max P (D) D =DC
62 Roadmap Given degree constraints DC, and a disjunctive datalog rule P : B B T B F E R F Question (Worst-case Output Size Bound) Find a good upper-bound for max P (D) D =DC Question (Algorithm) Design an algorithm evaluating P within the bound.
63 Roadmap Given degree constraints DC, and a disjunctive datalog rule P : B B T B F E R F Question (Worst-case Output Size Bound) Find a good upper-bound for max P (D) D =DC Question (Algorithm) Design an algorithm evaluating P within the bound. Question (Gathering fruits) Plug bound/algorithm into Meta Algorithm, what do we get?
64 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
65 High-level View of the Bound Given degree constraints DC, a disjunctive datalog rule P : B B T B F E We shall prove bounds of the form R F max log P (D) some function of h D =DC s.t. h is (approximately) entropic and h satisfies degree constraints
66 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34
67 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b
68 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b
69 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b
70 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4
71 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 H unif (A 1 A 2 A 3 A 4 ) = log Q A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4
72 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 H unif (A 1 A 2 A 3 A 4 ) = log Q A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4 H unif (A 1 A 2 ) log R 12, H unif (A 2 A 3 ) log R 23, H unif (A 3 A 4 ) log R 34,...
73 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 H unif (A 1 A 2 A 3 A 4 ) = log Q A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4 H unif (A 1 A 2 ) log R 12, H unif (A 2 A 3 ) log R 23, H unif (A 3 A 4 ) log R 34,... H unif (A 2 A 1 = a ) log σa1 = a R 12, Hunif (A 2 A 1 = b ) log σa1 = b R 12,...
74 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 H unif (A 1 A 2 A 3 A 4 ) = log Q A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4 H unif (A 1 A 2 ) log R 12, H unif (A 2 A 3 ) log R 23, H unif (A 3 A 4 ) log R 34,... H unif (A 2 A 1 = a ) log σa1 = a R 12, Hunif (A 2 A 1 = b ) log σa1 = b R 12,... H unif (A 2 A 1 ) log max σa1 =xr 12 x }{{}
75 An Idea From Gottlob-Lee-Valiant-Valiant, JACM 12 Q(A 1, A 2, A 3, A 4 ) R 12 R 23 R 34 R 41. A 1 A 2 A 3 A 4 a 1 d 4 1/4 b 1 c 3 1/4 b 1 d 4 1/4 b 2 c 3 1/4 H unif (A 1 A 2 A 3 A 4 ) = log Q A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 A 1 A 2 a 1 1/4 b 1 2/4 b 2 1/4 A 2 A 3 1 c 1/4 1 d 2/4 2 c 1/4 A 3 A 4 c 3 2/4 d 4 2/4 d 5 0 A 4 A 1 3 b 2/4 4 a 1/4 4 b 1/4 H unif (A 1 A 2 ) log R 12, H unif (A 2 A 3 ) log R 23, H unif (A 3 A 4 ) log R 34,... H unif (A 2 A 1 = a ) log σa1 = a R 12, Hunif (A 2 A 1 = b ) log σa1 = b R 12,... H unif (A 2 A 1 ) log max σa1 =xr 12 x }{{} deg R12 (A 2 A 1 )
76 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC
77 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC
78 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies):
79 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies): h is Entropic
80 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies): h is Entropic There is some distribution on A [n] such that h(x) is the marginal entropy on A X, for all X
81 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies): h is Entropic There is some distribution on A [n] such that h(x) is the marginal entropy on A X, for all X h satisfies DC
82 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies): h is Entropic There is some distribution on A [n] such that h(x) is the marginal entropy on A X, for all X h satisfies DC h(y X) def = h(y ) h(x) log N Y X, X Y F E
83 Entropic Bound for Full Conjunctive Queries Q : T [n] F E R F, and degree constraints DC max log Q(D) sup h([n]) D =DC subject to (whatever H unif satisfies): h is Entropic There is some distribution on A [n] such that h(x) is the marginal entropy on A X, for all X h satisfies DC h(y X) def = h(y ) h(x) log N Y X, X Y F E Good Bound, but not computable!
84 Hierarchy of Set Functions h : 2 [n] R +, non-negative, monotone, h( ) = 0 h(x) h(y ) if X Y SA n := {h h is sub-additive} h(x Y ) h(x) + h(y ) Γ n := {h h is submodular} h(x Y ) + h(x Y ) h(x) + h(y ) Γ n: topological closure of Γ n Γ n = {h : h is entropic} M n : Modular h(x) = x x h(x)
85 Bounds for Full Conjunctive Query HDC def = { h h(y X) log N Y X, (X, Y, N Y X ) }
86 Bounds for Full Conjunctive Query HDC def = { h h(y X) log N Y X, (X, Y, N Y X ) } Then, max log Q(D) max h([n]) D =DC h Γ n HDC max h([n]) h Γ n HDC max h([n]) h SA n HDC entropic bound polymatroid bound sub-additive bound.
87 Size Bounds for Full Conjunctive Queries Bound Entropic Bound Polymatroid Bound Definition log Q max h Γ n HDC h([n]) log Q max h Γ n HDC h([n])
88 Size Bounds for Full Conjunctive Queries Bound Entropic Bound Polymatroid Bound Definition log Q max h Γ n HDC h([n]) log Q max h Γ n HDC h([n]) CC only AGM bound (Tight) [Atserias et al. FOCS 08] AGM bound (Tight) [Atserias et al. FOCS 08]
89 Size Bounds for Full Conjunctive Queries Bound Entropic Bound Polymatroid Bound Definition log Q max h Γ n HDC h([n]) log Q max h Γ n HDC h([n]) CC only CC + FD only AGM bound (Tight) AGM bound (Tight) [Atserias et al. FOCS 08] [Atserias et al. FOCS 08] Entropic Bound for FD Polymatroid Bound for FD [Gottlob et al. JACM 12] [Gottlob et al. JACM 12] (Tight [Gogacz et al. ICDT 17]) (Not tight [our work] )
90 Size Bounds for Full Conjunctive Queries Bound Entropic Bound Polymatroid Bound Definition log Q max h Γ n HDC h([n]) log Q max h Γ n HDC h([n]) CC only CC + FD only DC AGM bound (Tight) [Atserias et al. FOCS 08] Entropic Bound for FD [Gottlob et al. JACM 12] AGM bound (Tight) [Atserias et al. FOCS 08] Polymatroid Bound for FD [Gottlob et al. JACM 12] (Tight [Gogacz et al. ICDT 17]) (Not tight [our work] ) Entropic Bound for DC Polymatroid Bound for DC (Tight [our work] ) (Not tight [our work] )
91 Disjunctive Datalog: Size Bounds P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B Theorem ( our work ) max log P (D) max D =DC min h(b) h Γ B B n HDC }{{} Entropic bound max min h(b) h Γ n HDC B B }{{} Polymatroid bound Tight Not Tight Imply all known bounds for (Full) Conjunctive Queries!
92 Earlier Example P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34
93 Earlier Example P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 max log P 123,234(D) max min{h(a 1A 2 A 3 ), h(a 2 A 3 A 4 )} D =CC h Γ n CC 1 max h Γ n CC max h Γ n CC 3 log N. 2 2 [h(a 1A 2 A 3 ), h(a 2 A 3 A 4 )] 1 2 [h(a 1A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 )]
94 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
95 Roadmap Given degree constraints DC and a disjunctive datalog rule P : B B T B F E R F Answer (Worst-case Output Size Bound) max log P (D) max D =DC min h Γ n HDC B B h(b) = polymatroid bound. Question (Algorithm) Compute a model for P within Õ(2polymatroid bound ) Question (Gathering fruits) Plug bound/algorithm into Meta Algorithm, what do we get?
96 Connection to Shannon-flow Inequalities Lemma (Linearize it) There exists non-negative λ = (λ B ) B B, with λ 1 = 1, s.t. max min h(b) = h Γ n HDC B B max h Γ n HDC B B λ B h(b) (1)
97 Connection to Shannon-flow Inequalities Lemma (Linearize it) There exists non-negative λ = (λ B ) B B, with λ 1 = 1, s.t. max min h(b) = h Γ n HDC B B max h Γ n HDC B B Lemma (Shannon-flow inequality) There exists δ 0 s.t. 2 polymatroid bound = λ B h(b) B B (X,Y,N Y X ) λ B h(b) (1) (X,Y,N Y X ) N δ Y X Y X, and δ Y X h(y X), h Γ n (2)
98 Connection to Shannon-flow Inequalities Lemma (Linearize it) There exists non-negative λ = (λ B ) B B, with λ 1 = 1, s.t. max min h(b) = h Γ n HDC B B max h Γ n HDC B B Lemma (Shannon-flow inequality) There exists δ 0 s.t. 2 polymatroid bound = λ B h(b) B B (X,Y,N Y X ) λ B h(b) (1) (X,Y,N Y X ) N δ Y X Y X, and δ Y X h(y X), h Γ n (2) (2) is a (vast) generalization of Shearer s lemma
99 PANDA (Proof-Assisted entropic Degree-Aware) What? Compute a model for our disjunctive datalog rule ( Run within Õ 2 polymatroid bound) = Õ (X,Y,N Y X ) N δ Y X Y X :
100 PANDA (Proof-Assisted entropic Degree-Aware) What? Compute a model for our disjunctive datalog rule ( Run within Õ 2 polymatroid bound) = Õ (X,Y,N Y X ) How? Proof as symbolic instructions Construct a Proof Sequence for the corresponding Shannon-flow inequality Proof steps relational operators. N δ Y X Y X :
101 Proof sequence Shannon-flow inequality: λ B h(b) B B h(y X) def = h(y ) h(x), X Y (X,Y,N Y X ) δ Y X h(y X) Proof sequence, convert RHS to LHS using following steps (In)equality Steps (X Y ) h(x) + h(y X) = h(y ) h(x) + h(y X) h(y ) h(y ) = h(x) + h(y X) h(y ) h(x) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) h(y X) h(y Z X Z)
102 Proof sequence Shannon-flow inequality: λ B h(b) B B h(y X) def = h(y ) h(x), X Y (X,Y,N Y X ) δ Y X h(y X) Proof sequence, convert RHS to LHS using following steps (In)equality Steps (X Y ) h(x) + h(y X) = h(y ) h(x) + h(y X) h(y ) h(y ) = h(x) + h(y X) h(y ) h(x) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) h(y X) h(y Z X Z) Theorem There is a proof sequence for every Shannon-flow inequality. The length is data-independent.
103 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator
104 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator (join)
105 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator (join) (data partition)
106 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator (join) (data partition) (projection)
107 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator (join) (data partition) (projection) (NOP)
108 Proof Steps As Relational Operators Shannon-flow inequality: λ B h(b) δ Y X h(y X) B B (X,Y,N Y X ) Proof sequence, Steps (X Y ) h(x) + h(y X) h(y ) h(y ) h(x) + h(y X) h(y ) h(x) h(y X) h(y Z X Z) Relational Operator (join) (data partition) (projection) (NOP) Theorem PANDA solves any disjunctive datalog rule P in time Õ(N + poly(log N) 2 polymatroid bound for P )
109 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N
110 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2
111 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints)
112 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) Proof Step
113 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) )
114 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) )
115 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) )
116 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) )
117 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) )
118 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 A 4 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) )
119 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 A 4 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) ) ( h(a1 A 2 ) h(a 1 A 2 A 3 ) )
120 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 A 4 ) + h(a 3 ) h(a 1 A 2 A 3 ) + h(a 2 A 3 A 4 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) ) ( h(a1 A 2 ) h(a 1 A 2 A 3 ) )
121 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 A 4 ) + h(a 3 ) h(a 1 A 2 A 3 ) + h(a 2 A 3 A 4 ) + h(a 3 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) ) ( h(a1 A 2 ) h(a 1 A 2 A 3 ) ) ( h(a1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 ) )
122 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 log P min(h(a 1 A 2 A 3 ), h(a 2 A 3 A 4 )) (polymatroid bound) 1 2( h(a1 A 2 A 3 ) + h(a 2 A 3 A 4 ) ) (linearize) 1 2( h(a1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) ) (Shannon-flow) 3 2 log N (Cardinality constraints) Proof sequence h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 3 A 4 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 ) + h(a 4 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 ) + h(a 2 A 3 A 4 ) + h(a 3 ) h(a 1 A 2 A 3 ) + h(a 2 A 3 A 4 ) + h(a 3 ) h(a 1 A 2 A 3 ) + h(a 2 A 3 A 4 ) Proof Step ( h(a3 A 4 ) h(a 4 A 3 ) + h(a 3 ) ) ( h(a4 A 3 ) h(a 4 A 2 A 3 ) ) ( h(a2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) ) ( h(a1 A 2 ) h(a 1 A 2 A 3 ) ) ( h(a1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 ) )
123 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) h(a 4 A 3 ) h(a 4 A 2 A 3 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 )
124 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) R 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ), R (h) 3 (A 3) h(a 4 A 3 ) h(a 4 A 2 A 3 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 )
125 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) h(a 4 A 3 ) h(a 4 A 2 A 3 ) R 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ), R (h) 3 (A 3) R (l) 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 )
126 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) h(a 4 A 3 ) h(a 4 A 2 A 3 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) R 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ), R (h) 3 (A 3) R (l) 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ) R 23 (A 2, A 3 ) R (l) 34 (A 3, A 4 ) T 234 (A 2, A 3, A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 )
127 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) h(a 4 A 3 ) h(a 4 A 2 A 3 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) R 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ), R (h) 3 (A 3) R (l) 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ) R 23 (A 2, A 3 ) R (l) 34 (A 3, A 4 ) T 234 (A 2, A 3, A 4 ) R 12 (A 1, A 2 ) R 12 (A 1, A 2 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 )
128 Example: P : T 123 T 234 R 12 R 23 R 34 R 41. R 12, R 23, R 34, R 41 N P N 3/2 h(a 3 A 4 ) h(a 4 A 3 ) + h(a 3 ) h(a 4 A 3 ) h(a 4 A 2 A 3 ) h(a 2 A 3 ) + h(a 4 A 2 A 3 ) h(a 2 A 3 A 4 ) h(a 1 A 2 ) h(a 1 A 2 A 3 ) h(a 1 A 2 A 3 ) + h(a 3 ) h(a 1 A 2 A 3 ) R 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ), R (h) 3 (A 3) R (l) 34 (A 3, A 4 ) R (l) 34 (A 3, A 4 ) R 23 (A 2, A 3 ) R (l) 34 (A 3, A 4 ) T 234 (A 2, A 3, A 4 ) R 12 (A 1, A 2 ) R 12 (A 1, A 2 ) R 12 (A 1, A 2 ) R (h) 3 (A 3) T 123 (A 1, A 2, A 3 )
129 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
130 Roadmap Given degree constraints DC and a disjunctive datalog rule P : B B T B F E R F Answer (Worst-case Output Size Bound) Polymatroid bound max log P (D) max min h(b) D =DC h Γ n HDC B B Answer (Algorithm) PANDA computes a model for P within Õ(2polymatroid bound ) Question (Gathering fruits) Plug bound/algorithm into Meta Algorithm, what do we get?
131 Option 1: Full Conjunctive Query Full conjunctive query Database D Q i Q i (D) Q o Output F E R F itime T [n] otime Answer itime = max Q i(d) max D =DC h Γ 2h([n]) = PANDA s runtime n HDC PANDA is worst-case optimal whenever the polymatroid bound is tight!
132 Option 2: A Single Tree Decomposition Let (T, χ) be a tree decomposition of H to be chosen later Database D F E R F Multiple Conjunctive Rules itime Q i Q i (D) v V (T ) T χ(v) itime = max Q i(d) max max P v(d) D =DC D =DC v V (T ) Q o otime 2 max v V (T ) max h Γn DC h(χ(v)) = PANDA s runtime Pick the best (T, χ) before running PANDA: min max max (T,χ) v V (T ) h Γ n CC h(χ(v)) log N fhtw(q) PANDA evaluates Q within Õ(N fhtw(q) )-time. Output Answer
133 Option 3: Multiple Tree Decompositions Database D Multiple Disjunctive Datalog Rules! itime R F T B F E B B B Q i Q i (D) Q o otime Output Answer log itime = max D =DC log Q i(d) max max B = max max D =DC B max min h(b) = h Γ n HDC B B max min h Γ n HDC (T,χ) v V (T ) max max h Γ n HDC min (T,χ) v V (T ) log P B(D) max max min h(b) h Γ n HDC B B B h(χ(v)) = log(panda s runtime) h(χ(v)) log N subw(q) PANDA evaluates Q within Õ(N subw(q) )-time.
134 Quantities of Interests X {Γ n, Γ n, SA n } Y {HDC, HFD, HCC, log N ED, log N VD} Define LogSizeBound X Y (P ) def = max h X Y min B B h(b) Minimaxwidth X Y (Q) def = min max (T,χ) v V (T ) max h(χ(v)), h X Y Maximinwidth X Y (Q) def = max min max h(χ(v)). h X Y (T,χ) v V (T )
135 Summary of Bounds Z SAn Γn HDC HCC ED log N VD log N Γ n DAEB(Q) log ρ 2 AGM(Q) (Q) log 2 N log 2 VB(Q) DAPB(Q) log ρ 2 AGM(Q) (Q) log 2 N log 2 VB(Q) ρ(q, (NF )F E) ρ(q) log 2 N log 2 VB(Q) eda-fhtw(q) LogSizeBound X Y (Q) da-fhtw(q) eda-subw(q) fhtw(q) tw(q) + 1 fhtw(q) tw(q) + 1 ghtw(q) tw(q) + 1 Minimaxwidth X Y (Q) X da-subw(q) ghtw(q) subw(q) tw(q) + 1 tw(q) + 1 tw(q) + 1 Maximinwidth X Y (Q) Y
136 Many Open Questions Is the entropic bound computable under CC HDC or DC?
137 Many Open Questions Is the entropic bound computable under CC HDC or DC? Worst-case optimal algorithm for full conjunctive queries under CC HDC or DC
138 Many Open Questions Is the entropic bound computable under CC HDC or DC? Worst-case optimal algorithm for full conjunctive queries under CC HDC or DC Worst-case optimal algorithm for disjunctive datalog rules under CC HDC or DC
139 Many Open Questions Is the entropic bound computable under CC HDC or DC? Worst-case optimal algorithm for full conjunctive queries under CC HDC or DC Worst-case optimal algorithm for disjunctive datalog rules under CC HDC or DC Remove the annoying poly-log factor from PANDA
140 Many Open Questions Is the entropic bound computable under CC HDC or DC? Worst-case optimal algorithm for full conjunctive queries under CC HDC or DC Worst-case optimal algorithm for disjunctive datalog rules under CC HDC or DC Remove the annoying poly-log factor from PANDA Other choices for the propositional formula in the Meta Algorithm, perhaps tradding off itime and?
141 Many Open Questions Is the entropic bound computable under CC HDC or DC? Worst-case optimal algorithm for full conjunctive queries under CC HDC or DC Worst-case optimal algorithm for disjunctive datalog rules under CC HDC or DC Remove the annoying poly-log factor from PANDA Other choices for the propositional formula in the Meta Algorithm, perhaps tradding off itime and? What about negations?
142 Many Thanks! Questions?
143 Table of Contents Connecting the Dots Output Size Bounds and Information Theory Shannon-flow Inequalities and the PANDA Algorithm Wrapping it up Appendix
144 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E
145 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34
146 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34
147 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34
148 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b
149 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3
150 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 1 c 3 2 d 4
151 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 1 c 3 2 d 4
152 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c Inclusive Disjunction A 2 A 3 A 4 1 d 4 1 c 3 2 d 4
153 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 1 c 3 2 d 4 No minimal model requirement
154 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 1 c 3 2 d 4 Model size is max( T 123, T 234 ) = 3
155 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 1 c 3 2 d 4 Output size is the minimum over all models
156 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 A minimum-sized model of size 2
157 Semantics of Our Disjunctive Datalog Rule P : T B (A B ) R F (A F ) B B F E A 2 R 12 R 23 A 1 A 3 R 41 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. A 4 R 34 A 1 A 2 a 1 b 1 b 2 A 2 A 3 1 c 1 d 2 c A 3 A 4 c 3 d 4 d 5 A 4 A 1 3 b 4 a 4 b A 1 A 2 A 3 A 4 a 1 d 4 b 1 c 3 b 1 d 4 b 2 c 3 A 1 A 2 A 3 b 1 c b 2 c A 2 A 3 A 4 1 d 4 A minimum-sized model of size 2 Hence, the output size is 2
158 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B
159 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34
160 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34
161 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41.
162 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. max P 123,234(D) N 3/2 D =CC
163 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. max P 123,234(D) N 3/2 D =CC P : T 123 R 12 R 23 R 34 R 41.
164 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. max P 123,234(D) N 3/2 D =CC P : T 123 R 12 R 23 R 34 R 41. P (D) = N 2, for some D
165 Output Size of a Disjunctive Datalog Rule P : B B T B (A B ) F E R F (A F ) P (D) def = min T:T =P max B B T B A 2 R 12 R 23 A 1 A 3 R 41 CC : R 12 N, R 23 N, R 34 N, R 41 N. A 4 R 34 P 123,234 : T 123 T 234 R 12 R 23 R 34 R 41. max P 123,234(D) N 3/2 D =CC P : T 123 R 12 R 23 R 34 R 41. P (D) = N 2, for some D Using Option 3, 4-cycle query answerable in Õ(N 3/2 )-time, matching [Alon, Yuster, Zwick 97]
166 Polymatroid Bound: Examples Q(A 1, A 2, A 3, A 4 ) R 12 (A 1, A 2 ), R 23 (A 2, A 3 ), R 34 (A 3, A 4 ), R 41 (A 4, A 1 ). A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 R 12, R 23, R 34, R 41 N Q N 2
167 Polymatroid Bound: Examples Q(A 1, A 2, A 3, A 4 ) R 12 (A 1, A 2 ), R 23 (A 2, A 3 ), R 34 (A 3, A 4 ), R 41 (A 4, A 1 ). A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 R 12, R 23, R 34, R 41 N Q N 2 log Q = h(a 1 A 2 A 3 A 4 ) h(a 1 A 2 ) + h(a 3 A 4 ) 2 log N
168 Polymatroid Bound: Examples Q(A 1, A 2, A 3, A 4 ) R 12 (A 1, A 2 ), R 23 (A 2, A 3 ), R 34 (A 3, A 4 ), R 41 (A 4, A 1 ). A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 R 12, R 23, R 34, R 41 N Q N 2 log Q = h(a 1 A 2 A 3 A 4 ) h(a 1 A 2 ) + h(a 3 A 4 ) 2 log N deg 12 (A 1 A 2 A 1 ), deg 12 (A 1 A 2 A 2 ) D Q D N 3/2
169 Polymatroid Bound: Examples Q(A 1, A 2, A 3, A 4 ) R 12 (A 1, A 2 ), R 23 (A 2, A 3 ), R 34 (A 3, A 4 ), R 41 (A 4, A 1 ). A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 R 12, R 23, R 34, R 41 N Q N 2 log Q = h(a 1 A 2 A 3 A 4 ) h(a 1 A 2 ) + h(a 3 A 4 ) 2 log N deg 12 (A 1 A 2 A 1 ), deg 12 (A 1 A 2 A 2 ) D Q D N 3/2 2 log Q = 2h(A 1 A 2 A 3 A 4 ) h(a 2 A 3 ) + h(a 3 A 4 ) + h(a 4 A 1 ) + h(a 2 A 1 ) + h(a 1 A 2 ) 3 log N + 2 log D
170 Polymatroid Bound: Examples Q(A 1, A 2, A 3, A 4 ) R 12 (A 1, A 2 ), R 23 (A 2, A 3 ), R 34 (A 3, A 4 ), R 41 (A 4, A 1 ). A 2 R 12 R 23 A 1 A 3 R 41 A 4 R 34 R 12, R 23, R 34, R 41 N Q N 2 log Q = h(a 1 A 2 A 3 A 4 ) h(a 1 A 2 ) + h(a 3 A 4 ) 2 log N deg 12 (A 1 A 2 A 1 ), deg 12 (A 1 A 2 A 2 ) D Q D N 3/2 2 log Q = 2h(A 1 A 2 A 3 A 4 ) h(a 2 A 3 ) + h(a 3 A 4 ) + h(a 4 A 1 ) + h(a 2 A 1 ) + h(a 1 A 2 ) 3 log N + 2 log D A 1 A 2, A 2 A 1 Q N 3/2
171 Polymatroid Bound is not tight!
172 Polymatroid Bound is not tight! Proof Strategy
173 Polymatroid Bound is not tight! Proof Strategy Take a non-shannon inequality (modified from [Zhang-Yeung 98]) 11h(ABXY C) 3h(XY ) + 3h(AX) + 3h(AY ) + h(bx) + h(by ) + 5h(C) + h(abxy C AB) + h(abxy C AC) + 4h(ABXY C AXY ) + h(abxy C BXY ) + 2h(ABXY C XC) + 2h(ABXY C Y C) (3)
174 Polymatroid Bound is not tight! Proof Strategy Take a non-shannon inequality (modified from [Zhang-Yeung 98]) 11h(ABXY C) 3h(XY ) + 3h(AX) + 3h(AY ) + h(bx) + h(by ) + 5h(C) + h(abxy C AB) + h(abxy C AC) + 4h(ABXY C AXY ) + h(abxy C BXY ) + 2h(ABXY C XC) + 2h(ABXY C Y C) (3) Construct a query Q where (3) gives the (entropic) bound 11 log Q = 11h(ABXY C) 11 log N log N 2 = 43 log N
Entropy Bounds for Conjunctive Queries with Functional Dependencies
Entropy Bounds for Conjunctive Queries with Functional Dependencies Tomasz Gogacz 1 and Szymon Toruńczyk 2 1 University of Edinburgh, Edinburgh, UK 2 University of Warsaw, Warsaw, Poland Abstract We study
More informationA Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries
A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries Bas Ketsman Dan Suciu Abstract We study the optimal communication cost for computing a full conjunctive query
More informationarxiv: v6 [cs.db] 6 Feb 2017
FAQ: Questions Asked Frequently Mahmoud Abo Khamis Hung Q. Ngo Atri Rudra arxiv:1504.04044v6 [cs.db] 6 Feb 2017 LogicBlox Inc. {mahmoud.abokhamis,hung.ngo}@logicblox.com Department of Computer Science
More informationApproximating fractional hypertree width
Approximating fractional hypertree width Dániel Marx Abstract Fractional hypertree width is a hypergraph measure similar to tree width and hypertree width. Its algorithmic importance comes from the fact
More informationarxiv: v1 [cs.db] 8 Mar 2012
Worst-case Optimal Join Algorithms arxiv:12031952v1 [csdb] 8 Mar 2012 Hung Q Ngo Computer Science and Engineering, SUNY Buffalo, USA Christopher Ré Computer Science, University of Wisconsin Madison USA
More informationA Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation
Dichotomy for Non-Repeating Queries with Negation Queries with Negation in in Probabilistic Databases Robert Dan Olteanu Fink and Dan Olteanu Joint work with Robert Fink Uncertainty in Computation Simons
More informationStructural Tractability of Counting of Solutions to Conjunctive Queries
Structural Tractability of Counting of Solutions to Conjunctive Queries Arnaud Durand 1 1 University Paris Diderot Journées DAG, june 2013 Joint work with Stefan Mengel (JCSS (to appear) + ICDT 2013) 1
More informationTractable Counting of the Answers to Conjunctive Queries
Tractable Counting of the Answers to Conjunctive Queries Reinhard Pichler and Sebastian Skritek Technische Universität Wien, {pichler, skritek}@dbai.tuwien.ac.at Abstract. Conjunctive queries (CQs) are
More informationParallel-Correctness and Transferability for Conjunctive Queries
Parallel-Correctness and Transferability for Conjunctive Queries Tom J. Ameloot1 Gaetano Geck2 Bas Ketsman1 Frank Neven1 Thomas Schwentick2 1 Hasselt University 2 Dortmund University Big Data Too large
More informationSymmetries in the Entropy Space
Symmetries in the Entropy Space Jayant Apte, Qi Chen, John MacLaren Walsh Abstract This paper investigates when Shannon-type inequalities completely characterize the part of the closure of the entropy
More informationOn Factorisation of Provenance Polynomials
On Factorisation of Provenance Polynomials Dan Olteanu and Jakub Závodný Oxford University Computing Laboratory Wolfson Building, Parks Road, OX1 3QD, Oxford, UK 1 Introduction Tracking and managing provenance
More informationTractable structures for constraint satisfaction with truth tables
Tractable structures for constraint satisfaction with truth tables Dániel Marx September 22, 2009 Abstract The way the graph structure of the constraints influences the complexity of constraint satisfaction
More informationSingle-Round Multi-Join Evaluation. Bas Ketsman
Single-Round Multi-Join Evaluation Bas Ketsman Outline 1. Introduction 2. Parallel-Correctness 3. Transferability 4. Special Cases 2 Motivation Single-round Multi-joins Less rounds / barriers Formal framework
More informationR E P O R T. Generalized Hypertree Decompositions: NP-Hardness and Tractable Variants INSTITUT FÜR INFORMATIONSSYSTEME DBAI-TR
TECHNICAL R E P O R T INSTITUT FÜR INFORMATIONSSYSTEME ABTEILUNG DATENBANKEN UND ARTIFICIAL INTELLIGENCE Generalized Hypertree Decompositions: NP-Hardness and Tractable Variants DBAI-TR-2007-55 Georg Gottlob
More informationFactorised Representations of Query Results: Size Bounds and Readability
Factorised Representations of Query Results: Size Bounds and Readability Dan Olteanu and Jakub Závodný Department of Computer Science University of Oxford {dan.olteanu,jakub.zavodny}@cs.ox.ac.uk ABSTRACT
More informationSize and Treewidth Bounds for Conjunctive Queries
Size and Treewidth Bounds for Conjunctive Queries GEORG GOTTLOB University of Oxford, UK STEPHANIE TIEN LEE University of Oxford, UK GREGORY VALIANT University of California, Berkeley and PAUL VALIANT
More informationarxiv: v1 [cs.db] 9 Mar 2017
Juggling Functions Inside a Database Mahmoud Abo Khamis LogicBlox Inc. mahmoud.abokhamis@logicblox.com Hung Q. Ngo LogicBlox Inc. hung.ngo@logicblox.com Atri Rudra University at Buffalo, SUNY atri@buffalo.edu
More informationarxiv: v7 [cs.db] 22 Dec 2015
It s all a matter of degree: Using degree information to optimize multiway joins Manas R. Joglekar 1 and Christopher M. Ré 2 1 Department of Computer Science, Stanford University 450 Serra Mall, Stanford,
More informationarxiv: v2 [cs.db] 16 Oct 2013
Skew Strikes Back: New Developments in the Theory of Join Algorithms Hung Q. Ngo University at Buffalo, SUNY hungngo@buffalo.edu Christopher Ré Stanford University chrismre@cs.stanford.edu Atri Rudra University
More informationThe complexity of enumeration and counting for acyclic conjunctive queries
The complexity of enumeration and counting for acyclic conjunctive queries Cambridge, march 2012 Counting and enumeration Counting Output the number of solutions Example : sat, Perfect Matching, Permanent,
More informationSkew Strikes Back: New Developments in the Theory of Join Algorithms
Skew Strikes Back: New Developments in the Theory of Join Algorithms Hung Q. Ngo University at Buffalo, SUN hungngo@buffalo.edu Christopher Ré Stanford University chrismre@cs.stanford.edu Atri Rudra University
More informationConsistent Query Answering under Primary Keys: A Characterization of Tractable Queries
Consistent Query Answering under Primary Keys: A Characterization of Tractable Queries Jef Wijsen University of Mons-Hainaut Mons, Belgium jef.wijsen@umh.ac.be ABSTRACT This article deals with consistent
More informationDatalog and Constraint Satisfaction with Infinite Templates
Datalog and Constraint Satisfaction with Infinite Templates Manuel Bodirsky 1 and Víctor Dalmau 2 1 CNRS/LIX, École Polytechnique, bodirsky@lix.polytechnique.fr 2 Universitat Pompeu Fabra, victor.dalmau@upf.edu
More informationarxiv: v1 [cs.ai] 21 Sep 2014
Oblivious Bounds on the Probability of Boolean Functions WOLFGANG GATTERBAUER, Carnegie Mellon University DAN SUCIU, University of Washington arxiv:409.6052v [cs.ai] 2 Sep 204 This paper develops upper
More informationA Toolbox of Query Evaluation Techniques for Probabilistic Databases
2nd Workshop on Management and mining Of UNcertain Data (MOUND) Long Beach, March 1st, 2010 A Toolbox of Query Evaluation Techniques for Probabilistic Databases Dan Olteanu, Oxford University Computing
More informationP Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.
Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this
More informationBounded Query Rewriting Using Views
Bounded Query Rewriting Using Views Yang Cao 1,3 Wenfei Fan 1,3 Floris Geerts 2 Ping Lu 3 1 University of Edinburgh 2 University of Antwerp 3 Beihang University {wenfei@inf, Y.Cao-17@sms}.ed.ac.uk, floris.geerts@uantwerpen.be,
More informationProvenance Semirings. Todd Green Grigoris Karvounarakis Val Tannen. presented by Clemens Ley
Provenance Semirings Todd Green Grigoris Karvounarakis Val Tannen presented by Clemens Ley place of origin Provenance Semirings Todd Green Grigoris Karvounarakis Val Tannen presented by Clemens Ley place
More informationFactorized Relational Databases Olteanu and Závodný, University of Oxford
November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust
More informationBrief Tutorial on Probabilistic Databases
Brief Tutorial on Probabilistic Databases Dan Suciu University of Washington Simons 206 About This Talk Probabilistic databases Tuple-independent Query evaluation Statistical relational models Representation,
More informationReceived: 1 September 2018; Accepted: 10 October 2018; Published: 12 October 2018
entropy Article Entropy Inequalities for Lattices Peter Harremoës Copenhagen Business College, Nørre Voldgade 34, 1358 Copenhagen K, Denmark; harremoes@ieee.org; Tel.: +45-39-56-41-71 Current address:
More informationarxiv: v1 [cs.db] 20 Dec 2017
Boolean Tensor Decomposition for Conjunctive Queries with Negation Mahmoud Abo Khamis & Hung Q. Ngo RelationalAI Dan Olteanu University of Oxford Dan Suciu University of Washington arxiv:1712.07445v1 [cs.db]
More informationTractable Hypergraph Properties for Constraint Satisfaction and Conjunctive Queries
Tractable Hypergraph Properties for Constraint Satisfaction and Conjunctive Queries Dániel Marx School of Computer Science Tel Aviv University Tel Aviv, Israel dmarx@cs.bme.hu ABSTRACT An important question
More informationEntropy Vectors and Network Information Theory
Entropy Vectors and Network Information Theory Sormeh Shadbakht and Babak Hassibi Department of Electrical Engineering Caltech Lee Center Workshop May 25, 2007 1 Sormeh Shadbakht and Babak Hassibi Entropy
More informationThe complexity of acyclic subhypergraph problems
The complexity of acyclic subhypergraph problems David Duris and Yann Strozecki Équipe de Logique Mathématique (FRE 3233) - Université Paris Diderot-Paris 7 {duris,strozecki}@logique.jussieu.fr Abstract.
More informationPropositional Logic: Models and Proofs
Propositional Logic: Models and Proofs C. R. Ramakrishnan CSE 505 1 Syntax 2 Model Theory 3 Proof Theory and Resolution Compiled at 11:51 on 2016/11/02 Computing with Logic Propositional Logic CSE 505
More informationGAV-sound with conjunctive queries
GAV-sound with conjunctive queries Source and global schema as before: source R 1 (A, B),R 2 (B,C) Global schema: T 1 (A, C), T 2 (B,C) GAV mappings become sound: T 1 {x, y, z R 1 (x,y) R 2 (y,z)} T 2
More informationHypertree-Width and Related Hypergraph Invariants
Hypertree-Width and Related Hypergraph Invariants Isolde Adler, Georg Gottlob, Martin Grohe To cite this version: Isolde Adler, Georg Gottlob, Martin Grohe. Hypertree-Width and Related Hypergraph Invariants.
More informationTractable Lineages on Treelike Instances: Limits and Extensions
Tractable Lineages on Treelike Instances: Limits and Extensions Antoine Amarilli 1, Pierre Bourhis 2, Pierre enellart 1,3 June 29th, 2016 1 Télécom ParisTech 2 CNR CRItAL 3 National University of ingapore
More informationFunctional Dependencies and Normalization
Functional Dependencies and Normalization There are many forms of constraints on relational database schemata other than key dependencies. Undoubtedly most important is the functional dependency. A functional
More informationSLD-Resolution And Logic Programming (PROLOG)
Chapter 9 SLD-Resolution And Logic Programming (PROLOG) 9.1 Introduction We have seen in Chapter 8 that the resolution method is a complete procedure for showing unsatisfiability. However, finding refutations
More informationα-acyclic Joins Jef Wijsen May 4, 2017
α-acyclic Joins Jef Wijsen May 4, 2017 1 Motivation Joins in a Distributed Environment Assume the following relations. 1 M[NN, Field of Study, Year] stores data about students of UMONS. For example, (19950423158,
More informationPART III. Outline. Codes and Cryptography. Sources. Optimal Codes (I) Jorge L. Villar. MAMME, Fall 2015
Outline Codes and Cryptography 1 Information Sources and Optimal Codes 2 Building Optimal Codes: Huffman Codes MAMME, Fall 2015 3 Shannon Entropy and Mutual Information PART III Sources Information source:
More informationQuery answering using views
Query answering using views General setting: database relations R 1,...,R n. Several views V 1,...,V k are defined as results of queries over the R i s. We have a query Q over R 1,...,R n. Question: Can
More informationThe Logic of Partitions with an application to Information Theory
1 The Logic of Partitions with an application to Information Theory David Ellerman University of California at Riverside and WWW.Ellerman.org 2 Outline of Talk Logic of partitions dual to ordinary logic
More informationOn Datalog vs. LFP. Anuj Dawar and Stephan Kreutzer
On Datalog vs. LFP Anuj Dawar and Stephan Kreutzer 1 University of Cambridge Computer Lab, anuj.dawar@cl.cam.ac.uk 2 Oxford University Computing Laboratory, kreutzer@comlab.ox.ac.uk Abstract. We show that
More informationHypertree Decompositions: Structure, Algorithms, and Applications
Hypertree Decompositions: Structure, Algorithms, and Applications Georg Gottlob 1, Martin Grohe 2, Nysret Musliu 1, Marko Samer 1, and Francesco Scarcello 3 1 Institut für Informationssysteme, TU Wien,
More informationA Relational Knowledge System
Relational Knowledge System by Shik Kam Michael Wong, Cory James utz, and Dan Wu Technical Report CS-01-04 January 2001 Department of Computer Science University of Regina Regina, Saskatchewan Canada S4S
More informationThe Inflation Technique for Causal Inference with Latent Variables
The Inflation Technique for Causal Inference with Latent Variables arxiv:1609.00672 (Elie Wolfe, Robert W. Spekkens, Tobias Fritz) September 2016 Introduction Given some correlations between the vocabulary
More informationDynamic Programming on Trees. Example: Independent Set on T = (V, E) rooted at r V.
Dynamic Programming on Trees Example: Independent Set on T = (V, E) rooted at r V. For v V let T v denote the subtree rooted at v. Let f + (v) be the size of a maximum independent set for T v that contains
More informationAgnostic Learning of Disjunctions on Symmetric Distributions
Agnostic Learning of Disjunctions on Symmetric Distributions Vitaly Feldman vitaly@post.harvard.edu Pravesh Kothari kothari@cs.utexas.edu May 26, 2014 Abstract We consider the problem of approximating
More informationThe Complexity of Causality and Responsibility for Query Answers and non-answers
The Complexity of Causality and Responsibility for Query Answers and non-answers Alexandra Meliou Wolfgang Gatterbauer Katherine F. Moore Dan Suciu Department of Computer Science and Engineering, University
More informationNotes. Corneliu Popeea. May 3, 2013
Notes Corneliu Popeea May 3, 2013 1 Propositional logic Syntax We rely on a set of atomic propositions, AP, containing atoms like p, q. A propositional logic formula φ Formula is then defined by the following
More informationLecture 19 October 28, 2015
PHYS 7895: Quantum Information Theory Fall 2015 Prof. Mark M. Wilde Lecture 19 October 28, 2015 Scribe: Mark M. Wilde This document is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike
More informationKnowledge Compilation Meets Database Theory : Compiling Queries to Decision Diagrams
Knowledge Compilation Meets Database Theory : Compiling Queries to Decision Diagrams Abhay Jha Computer Science and Engineering University of Washington, WA, USA abhaykj@cs.washington.edu Dan Suciu Computer
More informationarxiv: v1 [cs.db] 10 Nov 2017
Size bounds and query plans for relational joins arxiv:1711.03860v1 [cs.db] 10 Nov 2017 Albert Atserias Universitat Politècnica de Catalunya Barcelona, Spain Martin Grohe Humboldt Universität zu Berlin
More informationLifted Inference: Exact Search Based Algorithms
Lifted Inference: Exact Search Based Algorithms Vibhav Gogate The University of Texas at Dallas Overview Background and Notation Probabilistic Knowledge Bases Exact Inference in Propositional Models First-order
More informationGames and Isomorphism in Finite Model Theory. Part 1
1 Games and Isomorphism in Finite Model Theory Part 1 Anuj Dawar University of Cambridge Games Winter School, Champéry, 6 February 2013 2 Model Comparison Games Games in Finite Model Theory are generally
More informationarxiv: v1 [cs.db] 21 Feb 2017
Answering Conjunctive Queries under Updates Christoph Berkholz, Jens Keppeler, Nicole Schweikardt Humboldt-Universität zu Berlin {berkholz,keppelej,schweika}@informatik.hu-berlin.de February 22, 207 arxiv:702.06370v
More informationP is the class of problems for which there are algorithms that solve the problem in time O(n k ) for some constant k.
Complexity Theory Problems are divided into complexity classes. Informally: So far in this course, almost all algorithms had polynomial running time, i.e., on inputs of size n, worst-case running time
More informationPropositional Logic: Part II - Syntax & Proofs 0-0
Propositional Logic: Part II - Syntax & Proofs 0-0 Outline Syntax of Propositional Formulas Motivating Proofs Syntactic Entailment and Proofs Proof Rules for Natural Deduction Axioms, theories and theorems
More informationMulti-join Query Evaluation on Big Data Lecture 2
Multi-join Query Evaluation on Big Data Lecture 2 Dan Suciu March, 2015 Dan Suciu Multi-Joins Lecture 2 March, 2015 1 / 34 Multi-join Query Evaluation Outline Part 1 Optimal Sequential Algorithms. Thursday
More informationBounded-width QBF is PSPACE-complete
Bounded-width QBF is PSPACE-complete Albert Atserias Universitat Politècnica de Catalunya Barcelona, Spain atserias@lsi.upc.edu Sergi Oliva Universitat Politècnica de Catalunya Barcelona, Spain oliva@lsi.upc.edu
More informationLogic and Databases. Phokion G. Kolaitis. UC Santa Cruz & IBM Research Almaden. Lecture 4 Part 1
Logic and Databases Phokion G. Kolaitis UC Santa Cruz & IBM Research Almaden Lecture 4 Part 1 1 Thematic Roadmap Logic and Database Query Languages Relational Algebra and Relational Calculus Conjunctive
More informationLecture 23: Alternation vs. Counting
CS 710: Complexity Theory 4/13/010 Lecture 3: Alternation vs. Counting Instructor: Dieter van Melkebeek Scribe: Jeff Kinne & Mushfeq Khan We introduced counting complexity classes in the previous lecture
More informationEnumeration Complexity of Conjunctive Queries with Functional Dependencies
Enumeration Complexity of Conjunctive Queries with Functional Dependencies Nofar Carmeli Technion, Haifa, Israel snofca@cs.technion.ac.il Markus Kröll TU Wien, Vienna, Austria kroell@dbai.tuwien.ac.at
More informationOn Resolution Like Proofs of Monotone Self-Dual Functions
On Resolution Like Proofs of Monotone Self-Dual Functions Daya Ram Gaur Department of Mathematics and Computer Science University of Lethbridge, Lethbridge, AB, Canada, T1K 3M4 gaur@cs.uleth.ca Abstract
More informationSome algorithmic improvements for the containment problem of conjunctive queries with negation
Some algorithmic improvements for the containment problem of conjunctive queries with negation Michel Leclère and Marie-Laure Mugnier LIRMM, Université de Montpellier, 6, rue Ada, F-3439 Montpellier cedex
More informationOn Incomplete XML Documents with Integrity Constraints
On Incomplete XML Documents with Integrity Constraints Pablo Barceló 1, Leonid Libkin 2, and Juan Reutter 2 1 Department of Computer Science, University of Chile 2 School of Informatics, University of
More informationProving simple set properties...
Proving simple set properties... Part 1: Some examples of proofs over sets Fall 2013 Proving simple set properties... Fall 2013 1 / 17 Introduction Overview: Learning outcomes In this session we will...
More informationCS632 Notes on Relational Query Languages I
CS632 Notes on Relational Query Languages I A. Demers 6 Feb 2003 1 Introduction Here we define relations, and introduce our notational conventions, which are taken almost directly from [AD93]. We begin
More informationLecture 2: Proof of Switching Lemma
Lecture 2: oof of Switching Lemma Yuan Li June 25, 2014 1 Decision Tree Instead of bounding the probability that f can be written as some s-dnf, we estimate the probability that f can be computed by a
More informationJunta Approximations for Submodular, XOS and Self-Bounding Functions
Junta Approximations for Submodular, XOS and Self-Bounding Functions Vitaly Feldman Jan Vondrák IBM Almaden Research Center Simons Institute, Berkeley, October 2013 Feldman-Vondrák Approximations by Juntas
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing Query Processing Decomposition Localization Optimization CS 347 Notes 3 2 Decomposition Same as in centralized system
More informationα-formulas β-formulas
α-formulas Logic: Compendium http://www.ida.liu.se/ TDDD88/ Andrzej Szalas IDA, University of Linköping October 25, 2017 Rule α α 1 α 2 ( ) A 1 A 1 ( ) A 1 A 2 A 1 A 2 ( ) (A 1 A 2 ) A 1 A 2 ( ) (A 1 A
More informationMaximizing Conjunctive Views in Deletion Propagation
Maximizing Conjunctive Views in Deletion Propagation Extended Version Benny Kimelfeld Jan Vondrák Ryan Williams IBM Research Almaden San Jose, CA 9510, USA {kimelfeld, jvondrak, ryanwill}@us.ibm.com ABSTRACT
More informationMulti-Tuple Deletion Propagation: Approximations and Complexity
Multi-Tuple Deletion Propagation: Approximations and Complexity Benny Kimelfeld IBM Research Almaden San Jose, CA 95120, USA kimelfeld@us.ibm.com Jan Vondrák IBM Research Almaden San Jose, CA 95120, USA
More informationSchema Refinement: Other Dependencies and Higher Normal Forms
Schema Refinement: Other Dependencies and Higher Normal Forms Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Higher Normal Forms 1 / 14 Outline 1
More informationCopyright is owned by the Author of the thesis. Permission is given for a copy to be downloaded by an individual for the purpose of research and
Copyright is owned by the Author of the thesis. Permission is given for a copy to be downloaded by an individual for the purpose of research and private study only. The thesis may not be reproduced elsewhere
More informationDisjunctive Databases for Representing Repairs
Noname manuscript No (will be inserted by the editor) Disjunctive Databases for Representing Repairs Cristian Molinaro Jan Chomicki Jerzy Marcinkowski Received: date / Accepted: date Abstract This paper
More information3F1 Information Theory, Lecture 3
3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2011, 28 November 2011 Memoryless Sources Arithmetic Coding Sources with Memory 2 / 19 Summary of last lecture Prefix-free
More informationLecture 3: Oct 7, 2014
Information and Coding Theory Autumn 04 Lecturer: Shi Li Lecture : Oct 7, 04 Scribe: Mrinalkanti Ghosh In last lecture we have seen an use of entropy to give a tight upper bound in number of triangles
More informationCourse information. Winter ATFD
Course information More information next week This is a challenging course that will put you at the forefront of current data management research Lots of work done by you: extra reading: at least 4 research
More informationEndre Boros a Ondřej Čepekb Alexander Kogan c Petr Kučera d
R u t c o r Research R e p o r t A subclass of Horn CNFs optimally compressible in polynomial time. Endre Boros a Ondřej Čepekb Alexander Kogan c Petr Kučera d RRR 11-2009, June 2009 RUTCOR Rutgers Center
More information1 First-order logic. 1 Syntax of first-order logic. 2 Semantics of first-order logic. 3 First-order logic queries. 2 First-order query evaluation
Knowledge Bases and Databases Part 1: First-Order Queries Diego Calvanese Faculty of Computer Science Master of Science in Computer Science A.Y. 2007/2008 Overview of Part 1: First-order queries 1 First-order
More informationLecture 29: Computational Learning Theory
CS 710: Complexity Theory 5/4/2010 Lecture 29: Computational Learning Theory Instructor: Dieter van Melkebeek Scribe: Dmitri Svetlov and Jake Rosin Today we will provide a brief introduction to computational
More informationCoding of memoryless sources 1/35
Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems
More informationAnswer-Set Programming with Bounded Treewidth
Answer-Set Programming with Bounded Treewidth Michael Jakl, Reinhard Pichler and Stefan Woltran Institute of Information Systems, Vienna University of Technology Favoritenstrasse 9 11; A-1040 Wien; Austria
More informationarxiv: v1 [cs.db] 20 Feb 2016
arxiv:1602.06458v1 [cs.db] 20 Feb 2016 Causes for Query Answers from Databases, Datalog Abduction and View-Updates: The Presence of Integrity Constraints Babak Salimi and Leopoldo Bertossi Carleton University,
More informationComposing Schema Mappings: Second-Order Dependencies to the Rescue
Composing Schema Mappings: Second-Order Dependencies to the Rescue RONALD FAGIN IBM Almaden Research Center PHOKION G. KOLAITIS IBM Almaden Research Center LUCIAN POPA IBM Almaden Research Center WANG-CHIEW
More informationNP-Completeness. Andreas Klappenecker. [based on slides by Prof. Welch]
NP-Completeness Andreas Klappenecker [based on slides by Prof. Welch] 1 Prelude: Informal Discussion (Incidentally, we will never get very formal in this course) 2 Polynomial Time Algorithms Most of the
More informationIn-Database Learning with Sparse Tensors
In-Database Learning with Sparse Tensors Mahmoud Abo Khamis, Hung Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich Toronto, October 2017 RelationalAI Talk Outline Current Landscape for DB+ML
More informationPropositional and First Order Reasoning
Propositional and First Order Reasoning Terminology Propositional variable: boolean variable (p) Literal: propositional variable or its negation p p Clause: disjunction of literals q \/ p \/ r given by
More informationTopics in Probabilistic and Statistical Databases. Lecture 9: Histograms and Sampling. Dan Suciu University of Washington
Topics in Probabilistic and Statistical Databases Lecture 9: Histograms and Sampling Dan Suciu University of Washington 1 References Fast Algorithms For Hierarchical Range Histogram Construction, Guha,
More informationAlgorithmic Meta-Theorems
Algorithmic Meta-Theorems Stephan Kreutzer Oxford University Computing Laboratory stephan.kreutzer@comlab.ox.ac.uk Abstract. Algorithmic meta-theorems are general algorithmic results applying to a whole
More informationReasoning with Deterministic and Probabilistic graphical models Class 2: Inference in Constraint Networks Rina Dechter
Reasoning with Deterministic and Probabilistic graphical models Class 2: Inference in Constraint Networks Rina Dechter Road Map n Graphical models n Constraint networks Model n Inference n Search n Probabilistic
More informationKnowledge base (KB) = set of sentences in a formal language Declarative approach to building an agent (or other system):
Logic Knowledge-based agents Inference engine Knowledge base Domain-independent algorithms Domain-specific content Knowledge base (KB) = set of sentences in a formal language Declarative approach to building
More informationGenerating Maximal Independent Sets for Hypergraphs with Bounded Edge-Intersections
Generating Maximal Independent Sets for Hypergraphs with Bounded Edge-Intersections Endre Boros 1, Khaled Elbassioni 1, Vladimir Gurvich 1, and Leonid Khachiyan 2 1 RUTCOR, Rutgers University, 640 Bartholomew
More informationOn the Succinctness of Query Rewriting for Datalog
On the Succinctness of Query Rewriting for Datalog Shqiponja Ahmetaj 1 Andreas Pieris 2 1 Institute of Logic and Computation, TU Wien, Austria 2 School of Informatics, University of Edinburgh, UK Foundational
More informationThe Sum-Product Theorem: A Foundation for Learning Tractable Models (Supplementary Material)
The Sum-Product Theorem: A Foundation for Learning Tractable Models (Supplementary Material) Abram L. Friesen AFRIESEN@CS.WASHINGTON.EDU Pedro Domingos PEDROD@CS.WASHINGTON.EDU Department of Computer Science
More information