

Jagged Bite Problem NP-Complete Construction

Kris Hildrum and Megan Thomas
Computer Science Division
University of California, Berkeley
{hildrum,mct}@cs.berkeley.edu

ABSTRACT

A hyper-rectangle is commonly used as a bounding predicate in tree-based access methods in databases. To reduce the number of I/Os per query, we would like to reduce the volume of this bounding predicate by cutting chunks out of the corners of the bounding hyper-rectangle. Ideally, one would like to remove the largest hyper-rectangular chunk that does not contain any points. In this paper, we formalize the problem and then show that the problem of finding the largest possible chunk is NP-complete. We accomplish this through a reduction from 3-SAT to our problem.

The Jagged-Bites Problem

The Jagged-Bites problem, defined after we provide some background, is significant to a database access method designer working with nearest neighbor queries because of the way that such queries interact with rectangular bounding predicates (BPs). It was first suggested in [1]. Nearest neighbor queries work by finding points within a given distance of the query point, in essence asking expanding hyper-sphere queries. Figure 1 depicts a rectangular, 2-D BP and the circles of nearest neighbor queries. If the query point is in the middle of a BP, it does not matter how small the BP volume is; this node will be accessed. However, nearest neighbor queries starting from points just outside a particular BP may or may not intersect that BP. For good performance on nearest neighbor query workloads, the intersection of BPs with non-matching query spheres, like the two topmost queries in Figure 1, must be minimized.

The data points of R-tree¹ leaf nodes do not always fill their minimum bounding rectangles (MBRs), instead leaving noticeable gaps at corners of the MBRs. This suggests that query execution costs² may rise while checking the contents of the empty corners of the MBR, as the two top queries in Figure 1 do.
Combined with our intuition about the importance of reducing spherical intersections with BP regions near the BP edges, this leads us to focus on attempting to remove empty areas from the corners of an MBR BP, by "biting" into the volume of the BP from the corners. Consider the points in the rectangle in Figure 1. A two-rectangle BP, pictured in Figure 2, can bound those same data points more tightly than the MBR BP. However, the two-rectangle BP still leaves empty areas at some corners of each rectangle of the BP in this figure. This would be less problematic in a BP that stored the MBR and the largest possible rectangular "bite" taken out of each corner. This type of BP, which we call a "Jagged Bites" BP (JB), would describe the data in a node more precisely than a two-rectangle BP.

The JB BP, pictured in Figure 3 for the same set of data points, stores the MBR of a set of data, as well as a set of points identifying the bites. Just as the MBR of a data set can be represented by storing two points, one for the highest values in each dimension and one for the lowest, a corner bite can be represented by the associated MBR corner point and another point at the one "internal" corner of the bite rectangle, which might not intersect any MBR hyper-edge.

[Figure 1: Nearest Neighbor Queries And Rectangular BPs (the rectangle is labeled "MBR BP")]
[Figure 2: A Two-Rectangle BP]
[Figure 3: A Jagged Bites BP]

¹ An R-tree is a balanced tree-structured database access method for multi-dimensional data. Data is stored at the leaf level of the tree, while upper levels contain minimum bounding rectangles and pointers to the corresponding sets of data points (or MBRs, in the case of multilevel trees) that the MBRs bound.
² Query execution cost, measured in I/Os, increases as the number of leaf-level pages read into memory in order to check for data that actually satisfies the query increases.

Here we prove that a polynomial-time algorithm for constructing an optimal JB bounding predicate is not possible unless P = NP, so a heuristic JB BP construction algorithm is necessary.

Definitions

Without loss of generality, we assume that all points are positive, and that the MBR corner we bite into is located at $\vec{0}$. If this is not the case, the problem can be adjusted by shifting and reflection so that it holds. We also assume that all the $p_i$ in all the points $\vec{p}$ are finite, i.e., for every $p_i$ in every $\vec{p}$, $p_i < C$ for some finite number $C$.

JB problem

JB (jagged-bite) problem: Given a set $P$ of $n$-dimensional points, maximize

$$\prod_{i=0}^{n-1} v_i = v_0 v_1 \cdots v_{n-1}$$

subject to: $\forall \vec{p} \in P$, $\exists i$ such that $p_i \ge v_i$.

JBP problem

To argue NP-hardness, we need to turn the above problem into a problem that has a yes/no answer. Call this problem the JBP (jagged-bite-predicate).

JBP problem: Given a number $k$ and a set of points $P$, does there exist a vector $\vec{v}$ such that:

• $\prod_i v_i \ge k$
• $\forall \vec{p} \in P$, $\exists i$ such that $p_i \ge v_i$?

Notice that these two problems are polynomial-time equivalent. It is clear that solving the jagged-bite problem means one can solve the predicate version. To go the other way, use JBP as an oracle in a binary search to find the maximum $k$, which gives an answer to the JB problem. For example, start with $k = 100$; if the answer is yes, double $k$ (otherwise, halve $k$). If the answer at $k = 200$ is no, set $k = \frac{1}{2}(100 + 200) = 150$; if the answer is then yes, set $k = 175$, and so on until $k$ is found. This takes a logarithmic number of uses of the JBP oracle.

3-SAT

Given variables $x_0, \ldots, x_{n-1}$ and a formula $\phi = c_1 \wedge c_2 \wedge \cdots \wedge c_m$, where each $c_i$ is the disjunction of three literals (for example, $c = x_3 \vee \neg x_2 \vee x_5$), determine whether $\phi$ is satisfiable.
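The two ingredients of the equivalence argument in the JBP problem paragraph, checking a candidate bite and searching over k with an oracle, can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `is_empty_bite`, `jbp_witness_ok`, and `max_k_via_oracle` are hypothetical, the oracle is an arbitrary callable supplied by the caller, and the search assumes that k ranges over non-negative integers and that the oracle is monotone (yes for small k, no for all sufficiently large k). It uses doubling followed by bisection, a standard variant of the doubling and halving described in the text.

```python
from math import prod
from typing import Callable, Sequence

Point = Sequence[float]


def is_empty_bite(points: Sequence[Point], v: Point) -> bool:
    """True if every point p has some coordinate i with p[i] >= v[i],
    i.e. no point lies strictly inside the bite spanned by 0 and v."""
    return all(any(p_i >= v_i for p_i, v_i in zip(p, v)) for p in points)


def jbp_witness_ok(points: Sequence[Point], v: Point, k: float) -> bool:
    """Check both JBP conditions for a candidate vector v."""
    return prod(v) >= k and is_empty_bite(points, v)


def max_k_via_oracle(oracle: Callable[[int], bool], start: int = 100) -> int:
    """Largest integer k >= 1 with oracle(k) true, or 0 if there is none.

    Assumes (hypothetically) that oracle is monotone and eventually false,
    so the doubling loop terminates.
    """
    lo, hi = 0, start            # lo is a known "yes" (or the floor 0)
    while oracle(hi):            # doubling phase: find a "no" above a "yes"
        lo, hi = hi, hi * 2
    while hi - lo > 1:           # bisection: oracle(lo) yes, oracle(hi) no
        mid = (lo + hi) // 2
        if oracle(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For instance, with the hypothetical points (1, 3), (2, 2), (4, 1), the vector (3, 2) is an empty bite of volume 6, while (3, 3) is not, because the point (2, 2) falls strictly inside it. The number of oracle calls grows with the logarithm of the maximum k, matching the claim in the text.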
JBP is NP-hard

We show that a polynomial-time solution of the JBP problem provides a polynomial-time solution to the 3-SAT problem. Therefore, JBP is NP-hard.

The Reduction

Given the formula $\phi$, we create the following JBP problem by creating points of dimension $2n$ and setting $k$. Intuitively, our reduction will relate $\vec{v}$ to a satisfying assignment. The vector $\vec{v}$ will have a location in it for each literal (a literal is a variable or its negation). Loosely, $\vec{v} = (x_0, \neg x_0, x_1, \neg x_1, \ldots, x_{n-1}, \neg x_{n-1})$, where the entry for $x_i$ is 1 if $x_i$ is true and 2 if $x_i$ is false. Likewise, the entry for $\neg x_i$ is 2 if $x_i$ is true and 1 if $x_i$ is false. In the discussion that follows, $v_{2i}$ corresponds to the literal $x_i$, and $v_{2i+1}$ corresponds to $\neg x_i$.

In order for the above description of $\vec{v}$ to make sense, certain values of $\vec{v}$ must be prevented. For example, $\vec{v} = (1, 1, 1, 2, 1, 2)$ would imply that both $x_0$ and $\neg x_0$ are true, which would not correctly model the satisfiability problem. Some of the points created in the reduction exist only to force $\vec{v}$ to be well-formed. Other points are created from the clauses of $\phi$ such that satisfying a clause is equivalent to finding an $i$ such that $p_i \ge v_i$ for the corresponding point. As a running example to clarify the procedure, we will use:

$$\phi = (x_1 \vee x_2 \vee x_3) \wedge (\neg x_1 \vee \neg x_2 \vee \neg x_3)$$

(The example numbers its variables from 1, so $x_1$ occupies coordinates 0 and 1 of each point.)

clause points

Convert a clause $c$ of $\phi$ to a point $\vec{p}$ of dimension $2n$ in the following way: $p_{2i} = 1$ if $x_i$ appears in clause $c$, $p_{2i+1} = 1$ if $\neg x_i$ appears in clause $c$, and both are 0 otherwise. Create one such point for each clause of $\phi$. The purpose of the clause points is to capture the particular formula $\phi$. The clauses in our running example become the points $(1, 0, 1, 0, 1, 0)$ and $(0, 1, 0, 1, 0, 1)$.

two-points

For every $i \in [0, 2n-1]$ we add the point $\vec{p}$ with

$$p_j = \begin{cases} 2 & j = i \\ 0 & \text{otherwise.} \end{cases}$$

The purpose of the two-points is to ensure that all the $v_i$ will be $\le 2$; in other words, to keep the JB hyper-rectangle inside a hyper-square with all sides of length 2. This means, for example, that a vector $\vec{v} = (1, 1, 1, 2^4)$ will not be possible. The example formula generates $(2, 0, 0, 0, 0, 0)$, $(0, 2, 0, 0, 0, 0)$, $(0, 0, 2, 0, 0, 0)$, $(0, 0, 0, 2, 0, 0)$, $(0, 0, 0, 0, 2, 0)$, and $(0, 0, 0, 0, 0, 2)$.

one-points

For every $i \in [0, n-1]$ we add the point $\vec{p}$ with

$$p_j = \begin{cases} 1 & j \in \{2i, 2i+1\} \\ 0 & \text{otherwise.} \end{cases}$$

This gives the points $(1, 1, 0, \ldots, 0)$, $(0, 0, 1, 1, 0, \ldots, 0)$, and so on. These points prevent a variable and its negation from being false at the same time. Indirectly, once $k$ is chosen as below, this prevents both a variable and its negation from being true. In terms of points, one-points prevent $\vec{v}$ from being $(2, 2, 1, 2, \ldots)$. The example formula generates $(1, 1, 0, 0, 0, 0)$, $(0, 0, 1, 1, 0, 0)$, and $(0, 0, 0, 0, 1, 1)$.

Choose $k = 2^n$. Though listed last, this condition is vital. Combined with one-points and two-points, this is the condition that forces $\vec{v}$ to have the desired structure of 1's and 2's. For the example formula, $k = 2^3 = 8$.

In the next two sections, we will prove that our reduction is correct. That is, we must show that $\phi$ is satisfiable if and only if JBP says yes.

$\phi$ satisfiable $\Rightarrow$ JBP says yes

We want to show that if the 3-SAT formula is satisfiable, then the JBP algorithm will say yes. To prove this, we will find a $\vec{v}$ such that:

• $\prod_i v_i \ge 2^n$
• For all points $\vec{p}$, $\exists i$ such that $p_i \ge v_i$
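The construction of this section, together with the two conditions just listed, can be sketched in Python. This is an illustrative sketch rather than code from the paper: the function names and the clause encoding (a list of `(variable_index, is_positive)` pairs, with variables numbered from 0) are assumptions made for the example, so the running example's $x_1, x_2, x_3$ become indices 0, 1, 2.

```python
from math import prod


def reduce_3sat_to_jbp(clauses, n):
    """Build the JBP instance (points, k) for a formula over n variables.

    clauses: list of clauses, each a list of (var_index, is_positive) pairs.
    """
    dim = 2 * n
    points = []
    # Clause points: a 1 in the slot of every literal that appears.
    for clause in clauses:
        p = [0] * dim
        for var, positive in clause:
            p[2 * var + (0 if positive else 1)] = 1
        points.append(tuple(p))
    # Two-points: force every v_i <= 2.
    for i in range(dim):
        p = [0] * dim
        p[i] = 2
        points.append(tuple(p))
    # One-points: force v_{2i} <= 1 or v_{2i+1} <= 1 for each variable.
    for i in range(n):
        p = [0] * dim
        p[2 * i] = p[2 * i + 1] = 1
        points.append(tuple(p))
    return points, 2 ** n


def assignment_to_vector(assignment):
    """Map truth values to v: (1, 2) for a true variable, (2, 1) for false."""
    v = []
    for value in assignment:
        v.extend((1, 2) if value else (2, 1))
    return tuple(v)


def satisfies_jbp(points, v, k):
    """Check prod(v) >= k and that every point has some p_i >= v_i."""
    return prod(v) >= k and all(
        any(p_i >= v_i for p_i, v_i in zip(p, v)) for p in points
    )


# Running example: (x1 or x2 or x3) and (not x1 or not x2 or not x3).
clauses = [[(0, True), (1, True), (2, True)],
           [(0, False), (1, False), (2, False)]]
points, k = reduce_3sat_to_jbp(clauses, 3)
v = assignment_to_vector([True, False, False])
```

On the running example this produces the two clause points, six two-points, and three one-points listed in the text, with k = 8. The vector built from the all-true assignment fails the check, since that assignment does not satisfy the second clause.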

For the $\phi$ given above, one satisfying assignment is $x_1$ = true, $x_2$ = false, and $x_3$ = false. From this assignment, the proof below would construct $\vec{v} = (1, 2, 2, 1, 2, 1)$.

Proof. Choose $v_{2i} = 1$ if $x_i$ is true, 2 otherwise. Similarly, $v_{2i+1} = 2$ if $x_i$ is true, 1 otherwise. Clearly, $\prod_i v_i = 2^n \ge k = 2^n$, so the first condition is satisfied. Now, we must show that for all points $\vec{p}$, $\exists j$ such that $p_j \ge v_j$. There are three cases:

• $\vec{p}$ is a clause point. We need to show that in one of the coordinates where $\vec{p}$ has a 1, $\vec{v}$ also has a 1. Fortunately, we know that at least one literal in the clause corresponding to that point is true. Suppose that literal is $x_i$ (or $\neg x_i$). Then $v_{2i} = 1$ (or $v_{2i+1} = 1$), since we constructed $\vec{v}$ to ensure that this would be true. Therefore, we have that $p_{2i} \ge v_{2i}$ (or $p_{2i+1} \ge v_{2i+1}$).
• $\vec{p}$ is of the form $(0, \ldots, 0, 2, 0, \ldots, 0)$. We selected $\vec{v}$ such that every element of $\vec{v}$ is less than or equal to 2, so this holds.
• $\vec{p}$ is of the form $(0, \ldots, 0, 1, 1, 0, \ldots, 0)$. We constructed $\vec{v}$ such that one of every pair is 1, so this is satisfied.

JBP says yes $\Rightarrow$ $\phi$ satisfiable

Suppose that the JBP says yes. From the vector $\vec{v}$, we will construct a satisfying assignment to $\phi$. First, recall that all $v_i$ are at most 2 (because of the two-points). Further, for every pair $v_{2i}$, $v_{2i+1}$, we know that at least one member of the pair is at most 1 (because of the one-points). So, we know that $v_{2i} \cdot v_{2i+1} \le 2$. Finally, since we know that $\prod_i v_i \ge 2^n$, we can conclude that $v_{2i} \cdot v_{2i+1} = 2$ for every $i$. Since one member of each pair is at most 1 and the other is at most 2, we can conclude that of the pair $v_{2i}$, $v_{2i+1}$, exactly one must be 2, and the other must be 1. So, we construct a satisfying assignment in the following way: $x_i$ is true if $v_{2i} = 1$, and $x_i$ is false if $v_{2i+1} = 1$.

Now we show that for this assignment, each clause has at least one true literal, and so the formula is satisfied. Fix a clause $c$. This clause corresponds to a point $\vec{p}$, and we know that for some $i$, $p_i \ge v_i$. The only way $p_i \ge v_i$ can hold is if $p_i = v_i = 1$ (since $p_i$ can be 0 or 1, and $v_i$ can only be 1 or 2). If $v_i$ is 1, then the corresponding literal is true, and if $p_i$ is 1, then the corresponding literal is in the clause $c$, which means that $c$ contains at least one true literal, so $c$ is satisfied. Since this holds for every $c$, the formula as a whole is satisfied.

NP-completeness

A problem is NP-complete if it is NP-hard and in NP. We have shown that the JBP problem is NP-hard, so to show it is NP-complete, we need only show that it is in NP. A problem is in NP if it has a polynomial-size witness that can be verified in polynomial time. In this case, $\vec{v}$ is a witness. It is easy to check, given $\vec{v}$, that $\prod_i v_i \ge k$ and that $\forall \vec{p}$, $\exists i$ such that $p_i \ge v_i$. This means that JBP is NP-complete.

References

[1] M. Thomas, C. Carson, and J. Hellerstein. Creating a Customized Access Method for Blobworld. Submitted to ICDE 2000.