Geometric Problems in Moderate Dimensions


Timothy Chan (UIUC)

Basic Problems in Comp. Geometry
Orthogonal range search: preprocess n points in R^d s.t. we can detect, count, or report the points inside a query axis-aligned box q
Dominance range search: preprocess n points in R^d s.t. we can detect, count, or report the points dominated by q, i.e., inside (-inf, q_1] x ... x (-inf, q_d]
(many geom. appl'ns: computing skylines, ...)
(orthogonal range search in R^d reduces to dominance in R^{2d}; see the sketch below)
ℓ∞ nearest neighbor search: preprocess n points in R^d s.t. we can find the ℓ∞-closest point to a query point q
ℓ1 nearest neighbor search
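The reduction to dominance mentioned above is easy to make concrete. Here is a minimal Python sketch (helper names are mine, for illustration only): a point p lies in the box [l_1, u_1] x ... x [l_d, u_d] iff the lifted point (p, -p) in R^{2d} is dominated by (u, -l).

```python
def lift_point(p):
    """Map a point p in R^d to R^{2d} for the dominance reduction."""
    return tuple(p) + tuple(-x for x in p)

def lift_box(lo, hi):
    """Map an axis-aligned box [lo, hi] to a dominance query point in R^{2d}."""
    return tuple(hi) + tuple(-x for x in lo)

def dominated(p, q):
    """True iff p is coordinate-wise dominated by q."""
    return all(x <= y for x, y in zip(p, q))

# tiny check: is (2, 3) inside the box [1, 4] x [0, 2]?
p, lo, hi = (2, 3), (1, 0), (4, 2)
assert dominated(lift_point(p), lift_box(lo, hi)) == all(l <= x <= h for x, l, h in zip(p, lo, hi))
```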

Basic Problems in Comp. Geometry
Non-orthogonal range search:
Halfspace range search
Ball range search
ℓ2 nearest neighbor search
(many geom. appl'ns: bichromatic closest/farthest pair, min spanning tree, convex hull, ...)

Setting
focus on orthogonal problems
focus on exact, not approximate
focus on upper bounds
online vs. offline (n queries)

Low Dimensions: Classical Results from Comp. Geometry
Orthogonal range search: d^{O(d)} n log^{O(d)} n time (but meaningful only when d <= delta_0 log n ...)
Non-orthogonal range search: d^{O(d)} n^{2 - 1/O(d)} time

Connection to Non-Geom. Problems
Boolean orthogonal vectors problem (OV): given sets A, B of n vectors in {0, 1}^d, decide ∃ a ∈ A, b ∈ B s.t. a · b = 0
(appl'ns: subset queries, partial match queries for strings, ...)
(equiv. to Boolean version of offline dominance)
(OV Conjecture: no O(n^{2 - δ}) alg'm for d = ω(log n))
Boolean matrix multiplication: given matrices A, B ∈ {0, 1}^{n x n}, compute c_ij = OR_k (a_ik AND b_kj) for each i, j
rectangular version: given A ∈ {0, 1}^{n x d}, B ∈ {0, 1}^{d x n}, compute c_ij = OR_k (a_ik AND b_kj) for each i, j
i.e., given sets A, B of n vectors in {0, 1}^d, decide whether a · b = 0 for each a ∈ A, b ∈ B

Connection to Non-Geom. Problems
All-pairs shortest paths (APSP), or (min,+)-matrix multiplication: given matrices A, B ∈ R^{n x n}, compute c_ij = min_k (a_ik + b_kj) for each i, j
(appl'ns: graph diameter/radius/etc., max 2D subarray, language edit distance, min-weight triangulation of polygons, ...)
rectangular version: given A ∈ R^{n x d}, B ∈ R^{d x n}, compute c_ij = min_k (a_ik + b_kj) for each i, j
i.e., given sets A, B of n vectors in R^d, compute min_k (a_k + b_k) for each a ∈ A, b ∈ B
(reduces to d instances of offline dominance by Fredman's trick: a_{k0} + b_{k0} <= a_k + b_k  iff  a_{k0} - a_k <= b_k - b_{k0})
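To see Fredman's trick in action, here is a minimal Python sketch (illustrative, names mine): the index k0 attains min_k (a_k + b_k) exactly when the difference vector (a_{k0} - a_k)_k is coordinate-wise dominated by (b_k - b_{k0})_k, which is the comparison used in the offline dominance instances.

```python
def is_minimizer(a, b, k0):
    """k0 = argmin_k (a_k + b_k) iff a_k0 - a_k <= b_k - b_k0 for every k."""
    return all(a[k0] - a[k] <= b[k] - b[k0] for k in range(len(a)))

a = [3, 1, 4]
b = [2, 5, 0]
best = min(range(len(a)), key=lambda k: a[k] + b[k])
assert is_minimizer(a, b, best)                               # the true minimizer passes the dominance test
assert not all(is_minimizer(a, b, k) for k in range(len(a)))  # some other index fails it
```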

Connection to Non-Geom. Problems
(min,+)-convolution: given vectors a, b ∈ R^n, compute c_i = min_k (a_k + b_{i-k}) for each i
(appl'ns: jumbled string matching, knapsack, min k-enclosing rectangles, ...)
(reduces to O(√n) (min,+)-matrix multiplications of √n x √n matrices)
3SUM: given vectors a, b, c ∈ R^n, decide ∃ i, j, k s.t. a_i + b_j = c_k

Connection to Non-Geom. Problems
SAT for sparse instances: given a CNF formula in n Boolean vars & cn clauses, decide ∃ satisfying assignment
(reduces to OV with 2^{n/2} Boolean vectors in cn dimensions: for each partial assignment of the first n/2 vars, define a vector a with a_i = 0 iff the i-th clause is already satisfied; for each partial assignment of the last n/2 vars, define a vector b with b_i = 0 iff the i-th clause is already satisfied; see the sketch below)
kSAT (reduces to the sparse case)
(SETH Conjecture: no (2 - δ)^n alg'm for k = ω(1))
MAX-SAT for sparse instances
MAX-kSAT
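Here is a minimal Python sketch of this split-and-list reduction (the clause encoding and helper names are my own, purely illustrative): the formula is satisfiable iff the two lists of clause-indicator vectors contain an orthogonal pair.

```python
from itertools import product

def clause_vector(clauses, assign):
    """Entry i is 0 iff the i-th clause is already satisfied by the partial assignment."""
    def satisfied(cl):
        return any(abs(lit) in assign and (lit > 0) == assign[abs(lit)] for lit in cl)
    return [0 if satisfied(cl) else 1 for cl in clauses]

def half_vectors(clauses, var_ids):
    return [clause_vector(clauses, dict(zip(var_ids, vals)))
            for vals in product([False, True], repeat=len(var_ids))]

def sat_via_ov(clauses, n_vars):
    """Literal v > 0 means variable v is true; v < 0 means it is false."""
    first = list(range(1, n_vars // 2 + 1))
    last = list(range(n_vars // 2 + 1, n_vars + 1))
    A, B = half_vectors(clauses, first), half_vectors(clauses, last)
    # satisfiable iff some pair of vectors is orthogonal
    return any(all(x * y == 0 for x, y in zip(a, b)) for a in A for b in B)

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(sat_via_ov([[1, 2], [-1, 3], [-2, -3]], 3))   # True
```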

Connection to Non-Geom. Problems
Integer 0-1 linear programming for sparse instances: given cn linear inequalities with real coeffs. over n 0-1 vars, decide ∃ satisfying assignment
(reduces to dominance with 2^{n/2} real vectors in cn dimensions)
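The same split-and-list idea works here, with real partial sums and dominance in place of orthogonality. A minimal illustrative Python sketch (encoding and names are mine): a 0-1 assignment satisfies Ax <= rhs iff the first-half partial-sum vector is coordinate-wise dominated by rhs minus the second-half partial-sum vector.

```python
from itertools import product

def half_sums(A, var_ids):
    """For every partial assignment of var_ids, the vector of partial LHS sums."""
    vecs = []
    for vals in product([0, 1], repeat=len(var_ids)):
        vecs.append([sum(A[i][v - 1] * x for v, x in zip(var_ids, vals)) for i in range(len(A))])
    return vecs

def ilp_via_dominance(A, rhs, n_vars):
    """Decide whether some 0-1 vector x satisfies A x <= rhs (coordinate-wise)."""
    first = list(range(1, n_vars // 2 + 1))
    last = list(range(n_vars // 2 + 1, n_vars + 1))
    P = half_sums(A, first)
    Q = [[r - s for r, s in zip(rhs, q)] for q in half_sums(A, last)]
    # feasible iff some vector in P is coordinate-wise dominated by some vector in Q
    return any(all(p_i <= q_i for p_i, q_i in zip(p, q)) for p in P for q in Q)

# x1 + x2 <= 1  and  -x1 - x3 <= -1  (i.e., x1 + x3 >= 1)
print(ilp_via_dominance([[1, 1, 0], [-1, 0, -1]], [1, -1], 3))   # True, e.g. x = (1, 0, 0)
```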

High Dimensions: By Fast Matrix Multiplication
offline dominance in the Boolean case (i.e., OV) can be trivially solved by Boolean rect. matrix multiplication in M(n, d, n) time
M(n, n, n) = O(n^{2.373}); M(n, d, n) = Õ(n^2) for d <= n^{0.1}
offline dominance in the real case can be reduced to the Boolean case, in O(min_s (M(n, ds, n) + dn^2/s)) time [Matoušek '91], by dividing into s buckets of size n/s
(but not clear how to beat n^2 time ...)
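To make the "trivially solved by one rectangular matrix product" remark concrete: a 0/1 vector a is dominated by b iff a has no 1 in a coordinate where b has a 0, i.e., a · (1 - b) = 0. A minimal numpy sketch (illustrative; it uses a dense product, not a fast M(n, d, n) algorithm):

```python
import numpy as np

def boolean_dominance_pairs(A, B):
    """A, B: n x d 0/1 matrices; entry (i, j) of the result is True iff
    A[i] is coordinate-wise dominated by B[j]."""
    violations = A @ (1 - B).T   # (i, j) counts coords k with A[i, k] = 1 and B[j, k] = 0
    return violations == 0

A = np.array([[0, 1, 0], [1, 1, 0]])
B = np.array([[1, 1, 0], [0, 1, 1]])
print(boolean_dominance_pairs(A, B))   # [[ True  True] [ True False]]
```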

Moderate Dimensions: This Talk
subquadratic time for d beyond logarithmic?
let d = c log n (c not necessarily constant)

Two Approaches
Part I. Comp. Geometry Techniques (k-d trees, range trees)
Part II. Polynomial Method

k-d Trees ['75]
1. pick the next axis i ∈ {1, ..., d}
2. divide by the median i-th coord.
3. recurse on both sides
preproc. time O(dn log n)
dominance query time O(n^{1 - 1/d})
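A minimal Python sketch of these three steps, specialized to counting the points dominated by a query (illustrative only; it shows the cyclic axis choice, the median split, and the pruning, not the O(n^{1 - 1/d}) analysis):

```python
def build(points, axis=0):
    """k-d tree: split at the median of the current axis, cycling through the axes."""
    if len(points) <= 1:
        return {"leaf": points}
    d = len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"axis": axis, "split": pts[mid][axis],
            "left": build(pts[:mid], (axis + 1) % d),
            "right": build(pts[mid:], (axis + 1) % d)}

def count_dominated(node, q):
    """Count stored points p with p[k] <= q[k] for all k."""
    if "leaf" in node:
        return sum(all(x <= y for x, y in zip(p, q)) for p in node["leaf"])
    total = count_dominated(node["left"], q)
    if q[node["axis"]] >= node["split"]:   # otherwise the right side holds no dominated point
        total += count_dominated(node["right"], q)
    return total

tree = build([(1, 5), (2, 2), (4, 1), (3, 3)])
print(count_dominated(tree, (3, 3)))       # 2, namely (2, 2) and (3, 3)
```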

Range Trees ['79]
0. recurse on the projection along the 1st coord.
1. divide by the median 1st coord.
2. recurse on both sides
preproc.: P_d(n) <= 2 P_d(n/2) + P_{d-1}(n/2) = O(n log^d n)
query: Q_d(n) <= Q_d(n/2) + Q_{d-1}(n/2) = O(log^d n)

Lopsided Range Tree for Offline Dominance [Impagliazzo Lovett Paturi Schneider '14]
0. recurse on the projection along the 1st coord.
1. divide by the (αn)-th largest 1st coord., for some α
2. recurse on both sides
split: (1 - α)n data pts vs. βm query pts, and αn data pts vs. (1 - β)m query pts; choose the split s.t. α = β
T_d(n, m) <= max_α [ T_d((1 - α)n, αm) + T_d(αn, (1 - α)m) + T_{d-1}((1 - α)n, (1 - α)m) ]
Impagliazzo et al.: n^{2 - 1/Õ(c^{15})} time for n offline queries (for d = c log n)
C. [SODA 15]: n^{2 - 1/Õ(c)} time (subquadratic for c << log n, i.e., d << log^2 n)
(appl'n: integer linear programming with cn constraints in (2 - 1/Õ(c))^n time)
(but only works for offline ...)

New Lopsided k-d Tree for Online Dominance [C., SoCG 17]
0. directly build a data structure for each possible (d/b)-dimensional projection, with b ≈ (1/ε) c log c
1. pick a random axis i ∈ {1, ..., d}
2. divide by the (αn)-th largest i-th coord., with α ≈ 1/c^4
3. recurse on both sides
# projections <= (d choose d/b) = b^{O(d/b)} = b^{O((c log n)/b)} = n^{O(ε)}
preproc. time n^{1 + O(ε)}

New Lopsided k-d Tree for Online Dominance [C., SoCG 17]
Let Q_j(n) = time for a query pt with j bounded coords
If j <= d/b then base case
else Q_j(n) <= max_l [ (l/d) Q_j((1 - α)n) + ((j - l)/d) (Q_j(αn) + Q_{j-1}((1 - α)n)) + ((d - j)/d) (Q_j(αn) + Q_j((1 - α)n)) ]
expected online query time n^{1 - 1/Õ(c)}
(appl'n to APSP: Õ(n^3 / log^3 n) combinatorial alg'm [C., SoCG 17])
(specialization to the Boolean case: the k-d tree becomes a trie)

Open Problems
deterministic online?
lower bounds for geometric tree-based methods?

s-way Range Tree: Reducing Offline Real Dominance to Boolean [C., SoCG 17]
Assume that offline Boolean dominance (i.e., OV) can be solved in n^{2 - f(c)} time
Let T_j(n) be the time for n data & query pts in R^j x [s]^{d-j}
If j = 0 then T_0(n) <= n^{2 - f(cs)}
else T_j(n) <= s T_j(n/s) + T_{j-1}(n)
Set s = c^4: n^{2 - f(c^5) + O(1/c)} time for offline real dominance
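For intuition about the base case T_0(n) <= n^{2 - f(cs)}: once every coordinate lives in the bucket universe [s], dominance can be re-encoded as Boolean dominance in roughly s times the dimension (so c becomes about cs). The following minimal Python sketch of a threshold/unary encoding is my own illustration of that dimension blow-up, not code from the paper:

```python
def unary(x, s):
    """Threshold encoding of x in {1, ..., s}: bit i is 1 iff x > i."""
    return [1 if x > i else 0 for i in range(1, s)]

def encode_point(p, s):
    """Encode a point in [s]^d as a 0/1 vector of dimension d * (s - 1)."""
    return [bit for coord in p for bit in unary(coord, s)]

def dominated(p, q):
    return all(x <= y for x, y in zip(p, q))

s = 4
p, q = (2, 3), (3, 3)
assert dominated(encode_point(p, s), encode_point(q, s)) == dominated(p, q)
```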

Two Approaches
Part I. Comp. Geometry Techniques (k-d trees, range trees)
Part II. Polynomial Method

Boolean OV [Abboud Williams Yu, SODA 15]
First reduce the # of input vectors from n to n/s, by treating each group of s vectors as one:
given sets A, B of n/s vectors in {0, 1}^{ds}, evaluate f(a, b) for each a ∈ A, b ∈ B, for this funny function
f(a, b) = AND_{i,j ∈ [s]} OR_{k ∈ [d]} (a_ik AND b_jk)
a funny rect. matrix multiplication problem

Boolean OV [Abboud Williams Yu, SODA 15]
If we can express f as a polynomial, funny rect. matrix multiplication reduces to standard rect. matrix multiplication
Example: f(a, b) = a_1 b_2 + 4 a_2 b_1 b_2 + 3 a_1 a_3 b_1 - 5 a_2 b_1 b_3 = (a_1, 4a_2, 3a_1a_3, -5a_2) · (b_2, b_1 b_2, b_1, b_1 b_3)
new dim. = # of monomials
Õ((n/s)^2) time if # of monomials <= (n/s)^{0.1} ...
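Here is the slide's example worked out in a minimal numpy sketch (illustrative): once f is split into left and right monomial vectors, evaluating f on all pairs is one matrix product whose inner dimension is the number of monomials.

```python
import numpy as np

def f(a, b):
    return a[0]*b[1] + 4*a[1]*b[0]*b[1] + 3*a[0]*a[2]*b[0] - 5*a[1]*b[0]*b[2]

def left_monomials(a):   # one row per vector a
    return [a[0], 4*a[1], 3*a[0]*a[2], -5*a[1]]

def right_monomials(b):  # one row per vector b
    return [b[1], b[0]*b[1], b[0], b[0]*b[2]]

A = np.array([[1, 0, 1], [1, 1, 0]])
B = np.array([[0, 1, 1], [1, 1, 0]])
L = np.array([left_monomials(a) for a in A])
R = np.array([right_monomials(b) for b in B])
direct = np.array([[f(a, b) for b in B] for a in A])
assert np.array_equal(L @ R.T, direct)   # new inner dimension = # of monomials
```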

Boolean OV [Abboud Williams Yu, SODA 15]
New Problem: express f(a, b) = AND_{i,j ∈ [s]} OR_{k ∈ [d]} (a_ik AND b_jk) as a polynomial with a small # of monomials / small degree

Boolean OV [Abboud Williams Yu, SODA 15]
New Problem: express AND-of-OR(x) = AND_{l ∈ [s^2]} OR_{k ∈ [d]} x_lk as a polynomial with small degree
Naive Sol'n: prod_{l ∈ [s^2]} [1 - prod_{k ∈ [d]} (1 - x_lk)], which has degree d s^2

Boolean OV [Abboud Williams Yu, SODA 15]
New Problem: express AND-of-OR(x) = AND_{l ∈ [s^2]} OR_{k ∈ [d]} x_lk as a polynomial with small degree
Rand. Sol'n: by Razborov-Smolensky's trick ('87):
replace each OR with a random linear combination over F_2
repeat log(100 s^2) times to lower the error prob. to 1/(100 s^2)
replace the AND with another random linear combination over F_2
degree O(log s)
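A minimal Python sketch of the Razborov-Smolensky step for a single OR (illustrative; the AND over the s^2 blocks is handled with the same idea): each random F_2 linear combination is a parity of a random subset of the inputs; it is 0 whenever OR(x) = 0, and it equals 1 with probability 1/2 whenever OR(x) = 1, so OR-ing t independent parities gives one-sided error 2^{-t} with degree t.

```python
import random

def approx_or(x, t, rng):
    """OR of t random parities: degree t, never errs when OR(x) = 0,
    errs with probability 2^-t when OR(x) = 1."""
    result = 0
    for _ in range(t):
        parity = sum(xk for xk in x if rng.random() < 0.5) % 2
        result |= parity
    return result

rng = random.Random(0)
print(approx_or([0, 0, 0, 0], 5, rng))                            # always 0
print(sum(approx_or([0, 1, 0, 1], 5, rng) for _ in range(1000)))  # close to 1000 * (1 - 2**-5)
```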

Boolean OV [Abboud Williams Yu, SODA 15]
degree O(log s): # monomials <= s^2 (d choose O(log s)) = (d / log s)^{O(log s)} = (c/α)^{O(α log n)} for d = c log n, s = n^α
= n^{O(α log(c/α))} <= (n/s)^{0.1} for α ≈ 1/O(log c)
Õ((n/s)^2) = n^{2 - 1/O(log c)} rand. time
(better than n^2/poly(d) when d << 2^{√(log n)})
(similar ideas used in Williams's APSP alg'm [STOC 14] in n^3 / 2^{Ω(√(log n))} time)

Boolean OV [Abboud Williams Yu, SODA 15]
Derandomization [C. Williams, SODA 16]:
use ε-biased spaces for the random linear combinations over F_2
sum over the entire sample space
use modulus-amplifying polynomials before summing
Extends to the counting problem #OV (via SUM-of-OR)
(appl'ns: SAT & #SAT with cn clauses in (2 - 1/O(log c))^n time, kSAT & #kSAT in (2 - 1/O(k))^n time)

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
New Problem: express f(a, b) = AND_{i,j ∈ [s]} [ sum_{k ∈ [d]} (a_ik - b_jk)^2 <= t ] as a polynomial with small degree

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
New Problem: express AND-of-THR(x) = AND_{l ∈ [s^2]} [ sum_{k ∈ [d]} x_lk <= t ] as a polynomial with small degree
Rand. Sol'n 1: [Alman Williams]
replace the AND with a sum
for each THR, take a random sample of size d/2 & recurse
if count ∈ t ± O(√(d log s)), use an interpolating polynomial
degree O(√(d log s))

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
New Problem: express AND-of-THR(x) = AND_{l ∈ [s^2]} [ sum_{k ∈ [d]} x_lk <= t ] as a polynomial with small degree
Simple Det. Sol'n 2: [Alman C. Williams]
sum of Chebyshev polynomials
degree O(√(d log s))
[figure: T_q(x/t) stays within [-1, 1] for x ∈ [0, t] and reaches about e^{q/√d} at x = t + 1]
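The Chebyshev phenomenon behind Sol'n 2 can be checked numerically. A minimal Python sketch (illustrative; the actual construction combines such polynomials to reach degree O(√(d log s))): T_q stays in [-1, 1] on [-1, 1] but grows roughly like exp(q √(2/t)) just past 1, and this gap between counts <= t and counts >= t + 1 is what the construction amplifies.

```python
import math

def chebyshev(q, x):
    """Evaluate the Chebyshev polynomial T_q(x) via the three-term recurrence."""
    if q == 0:
        return 1.0
    prev, cur = 1.0, x
    for _ in range(q - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

t = 400
q = int(3 * math.sqrt(t))                                   # degree ~ sqrt(t)
print(max(abs(chebyshev(q, x / t)) for x in range(t + 1)))  # at most 1
print(chebyshev(q, (t + 1) / t))                            # >> 1, roughly exp(q * sqrt(2 / t))
```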

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
New Problem: express AND-of-THR(x) = AND_{l ∈ [s^2]} [ sum_{k ∈ [d]} x_lk <= t ] as a polynomial with small degree
Combined Sol'n 3: [Alman C. Williams]
for each THR, take a random sample of size r = d^{2/3} log^{1/3} s
use Sol'n 1 on the sample
if count ∈ t ± O((d/√r) √(log s)), use a Chebyshev polynomial
degree O(d^{1/3} log^{2/3} s)

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
degree O(d^{1/3} log^{2/3} s): # monomials <= s^2 (d choose O(d^{1/3} log^{2/3} s)) = (d / log s)^{O(d^{1/3} log^{2/3} s)} <= (c/α)^{O(c^{1/3} α^{2/3} log n)} for d = c log n, s = n^α
= n^{Õ(c^{1/3} α^{2/3})} <= (n/s)^{0.1} for α ≈ 1/Õ(√c)
Õ((n/s)^2) = n^{2 - 1/Õ(√c)} rand. time
(subquadratic when c << log^2 n, i.e., d << log^3 n)

Offline Hamming Nearest Neighbor [Alman Williams, FOCS 15; Alman C. Williams, FOCS 16]
extends to offline ℓ1 nearest neighbor in [U]^d in n^{2 - 1/Õ(√(cU))} rand. time
offline (1+ε)-approximate (ℓ1 or ℓ2) nearest neighbor in n^{2 - Ω(ε^{1/3})} rand. time (via AND-of-Approx-THR)
(improving over Valiant [FOCS 12] & LSH for small ε)
(appl'n: MAX-SAT with cn clauses in (2 - 1/Õ(c^{1/3}))^n time)
(appl'n: MAX-3-SAT with cn clauses in (2 - 1/polylog c)^n time)

Open Problems
derandomize? improve the d^{1/3} degree for AND-of-THR?
ℓ1 nearest neighbor search for larger universe U?
beat LSH for offline 2-approximate nearest neighbor?
online? (Larsen Williams [SODA 17] solved online Boolean OV)
better offline dominance: disprove OV/SETH??
non-orthogonal problems are harder (Williams [SODA 18]: offline ℓ2 nearest neighbor search for d = ω((log log n)^2) can't be solved in O(n^{2 - δ}) time, assuming the OV conjecture)