Polynomial Representations of Threshold Functions and Algorithmic Applications
Ryan Williams (Stanford)
Joint with Josh Alman (Stanford) and Timothy M. Chan (Waterloo)

Outline
The Context: Polynomial Representations, Polynomials for Algorithms, Nearest Neighbors
The New Results
Some Details
Conclusion

Exact Polynomial Representations
Let f : {0,1}^n → {0,1}; let D = F_q or R in the following.
Def. A polynomial p : D^n → D is an (exact) polynomial for f if for all x ∈ {0,1}^n, p(x) = f(x).
Example: OR. OR(x_1, ..., x_n) = 0 if x_1 = ... = x_n = 0, and 1 otherwise.
We can write OR as a polynomial over D:
OR(x_1, ..., x_n) = 1 − (1 − x_1)(1 − x_2)···(1 − x_n).
This has degree n and Ω(2^n) monomials when expanded out. We need other representations to get smaller polynomials.
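As a quick aside (an editorial addition, not part of the talk), the exact OR polynomial above can be checked mechanically; here is a minimal Python sketch that brute-forces all Boolean inputs for a small n.

from itertools import product

def or_polynomial(x):
    # p(x) = 1 - (1 - x_1)(1 - x_2)...(1 - x_n)
    prod = 1
    for xi in x:
        prod *= (1 - xi)
    return 1 - prod

n = 4
assert all(or_polynomial(x) == int(any(x)) for x in product([0, 1], repeat=n))
print("exact OR polynomial agrees with OR on all", 2 ** n, "inputs")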

Probabilistic Polynomials
Let f : {0,1}^n → {0,1}; let D = F_q or R in the following.
Def. A distribution P on degree-d polynomials p : D^n → D is a probabilistic polynomial for f with error at most ε if for all x ∈ {0,1}^n, Pr_{p~P}[p(x) = f(x)] > 1 − ε.
Example: OR with D = F_2 [R 87]. Pick a uniformly random R ⊆ {1, 2, ..., n}, then pick the polynomial p(x_1, ..., x_n) = Σ_{i∈R} x_i.
Then p(x) = 0 when x_1 = ... = x_n = 0, and p(x) is odd with probability 1/2 otherwise.
Can amplify to error ε with degree k = O(log(1/ε)):
p(x_1, ..., x_n) = 1 − (1 − Σ_{i∈R_1} x_i)···(1 − Σ_{i∈R_k} x_i).
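Again as an editorial illustration rather than material from the talk, here is a minimal Python sketch of this construction over F_2: a single random subset R is wrong with probability 1/2 on a nonzero input, and k independent copies drive the error down to about 2^-k. The parameter choices (n = 10, k = 5) are arbitrary.

import random

def sample_or_poly(n, k):
    # k independent random subsets R_1, ..., R_k of {0, ..., n-1}
    subsets = [[i for i in range(n) if random.random() < 0.5] for _ in range(k)]
    def p(x):
        # evaluates 1 - prod_j (1 - sum_{i in R_j} x_i) over F_2
        val = 1
        for R in subsets:
            val *= (1 - sum(x[i] for i in R)) % 2
        return (1 - val) % 2
    return p

n, k, trials = 10, 5, 2000
x = [0] * n
x[3] = 1                       # a nonzero input, so OR(x) = 1
errs = sum(sample_or_poly(n, k)(x) != 1 for _ in range(trials))
print("empirical error:", errs / trials, "(should be roughly 2**-k =", 2 ** -k, ")")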

Polynomial Threshold Functions (PTF)
Let f : {0,1}^n → {0,1}; let D = F_q or R in the following.
Def. A polynomial p : R^n → R is a polynomial threshold function (PTF) for f if for all x ∈ {0,1}^n: p(x) ≥ 0 when f(x) = 1, and p(x) < 0 when f(x) = 0.
Example: OR. Just take a sum! p(x_1, ..., x_n) = x_1 + x_2 + ... + x_n − 1/2.
Def. A distribution P on degree-d polynomials p : R^n → R is a probabilistic PTF for f with error at most ε if for all x ∈ {0,1}^n: Pr_{p~P}[p(x) ≥ 0] > 1 − ε when f(x) = 1, and Pr_{p~P}[p(x) < 0] > 1 − ε when f(x) = 0.
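A one-screen Python check (editorial, not from the talk) of the OR example: the sum-minus-1/2 polynomial is nonnegative exactly on the inputs where OR is 1.

from itertools import product

n = 4
for x in product([0, 1], repeat=n):
    p = sum(x) - 0.5              # the OR PTF from the slide
    assert (p >= 0) == bool(any(x))
print("OR PTF sign pattern verified on all", 2 ** n, "inputs")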

Polynomials For Algorithms
The pipeline: a problem of low complexity → (randomized) reduction → nice polynomials → multipoint evaluation of polynomials (matrix multiplication / FFT) → faster algorithm!
Has led to faster algorithms for:
- All-pairs shortest paths [W 14]
- All-points nearest neighbors in the Hamming metric [AW 15]
- #k-SAT [CW 16]
- Circuit-SAT in many regimes (and thus, circuit lower bounds)
- Succinct stable matching [MPS 16]
- Partial match queries [AWY 15]

Multipoint Evaluation
Reduce multipoint polynomial evaluation to matrix multiplication.
Suppose we want to evaluate a polynomial p(x_1, ..., x_m, y_1, ..., y_m) on all pairs of x ∈ A and y ∈ B.
First expand p out in terms of monomials, for instance:
p(x, y) = 2·x_1·y_1 + 3·x_4·x_7·y_8 − 12·y_9·y_13 + ...
Then p can also be written as an inner product:
p(x, y) = ⟨(2x_1, 3x_4x_7, −12, ...), (y_1, y_8, y_9y_13, ...)⟩.
Hence we can evaluate p on a combinatorial rectangle using fast (rectangular) matrix multiplication [Coppersmith 82].
Evaluation Lemma [C 82, W 14]: Given sets A, B ⊆ {0,1}^m with |A| = |B| = n, and a polynomial p(x_1, ..., x_m, y_1, ..., y_m) with at most n^0.1 monomials, we can evaluate p on all (x, y) ∈ A × B in (n^2 + n^1.1 · m) · poly(log n) time.
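To make the inner-product view concrete, here is a minimal numpy sketch (an editorial addition; it uses ordinary dense matrix multiplication, not the fast rectangular matrix multiplication of [Coppersmith 82]) that batch-evaluates the example polynomial above on all of A × B with a single matrix product.

import numpy as np

# Example polynomial p(x, y) = 2*x1*y1 + 3*x4*x7*y8 - 12*y9*y13  (1-indexed).
def x_features(x):            # the x-side part of each monomial
    return np.array([2 * x[0], 3 * x[3] * x[6], -12.0])

def y_features(y):            # the y-side part of each monomial
    return np.array([y[0], y[7], y[8] * y[12]])

rng = np.random.default_rng(0)
m, n = 16, 8                  # m = dimension, n = |A| = |B|
A = rng.integers(0, 2, size=(n, m))
B = rng.integers(0, 2, size=(n, m))

M_A = np.stack([x_features(x) for x in A])     # n x (#monomials)
M_B = np.stack([y_features(y) for y in B])     # n x (#monomials)
all_values = M_A @ M_B.T                       # entry (i, j) = p(A[i], B[j])

i, j = 2, 5
direct = 2 * A[i, 0] * B[j, 0] + 3 * A[i, 3] * A[i, 6] * B[j, 7] - 12 * B[j, 8] * B[j, 12]
assert np.isclose(all_values[i, j], direct)
print("batch evaluation via one matrix product matches direct evaluation")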

All-Points (Hamming) Nearest Neighbors
Given: sets A, B of n points in {0,1}^d, d = c log n.
Task: for all x ∈ A, output a y ∈ B that minimizes the Hamming distance h(x, y).
Can easily be computed in O(n^2 d) time.
Thm [AW 15]: All-Points Hamming Nearest Neighbors can be solved on n points in c log n dimensions in randomized n^{2 − 1/O(c log^2 c)} time.
Truly subquadratic time for O(log n) dimensions!
Idea 1: Set p to be a PTF testing whether two vectors of length d have Hamming distance ≤ d − k, for each k = d, d−1, ..., 2, 1:
p(x_1, ..., x_d, y_1, ..., y_d) = Σ_{i=1}^d [x_i·y_i + (1 − x_i)(1 − y_i)] − (k − 1/2),
where each term x_i·y_i + (1 − x_i)(1 − y_i) = EQ(x_i, y_i).
But evaluating p on all pairs of vectors only gives Ω(n^2) time...
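For intuition (an editorial addition, assuming numpy), Idea 1 in code: the agreement counts Σ_i EQ(x_i, y_i) for all pairs are two matrix products, and subtracting k − 1/2 gives the PTF score; reading off nearest neighbors this way still inspects all n^2 pairs, which is exactly the bottleneck noted above.

import numpy as np

rng = np.random.default_rng(1)
n, d, k = 100, 20, 15
A = rng.integers(0, 2, size=(n, d))
B = rng.integers(0, 2, size=(n, d))

# (i, j) entry = #{coordinates where A[i] and B[j] agree} = sum_i EQ(x_i, y_i)
agreements = A @ B.T + (1 - A) @ (1 - B).T
ptf_scores = agreements - (k - 0.5)            # >= 0 iff Hamming distance <= d - k
print("pairs at Hamming distance <= d - k:", int((ptf_scores >= 0).sum()))

nearest = agreements.argmax(axis=1)            # largest agreement = smallest distance
hamming = d - agreements[np.arange(n), nearest]
print("first few nearest-neighbor distances:", hamming[:5])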

All-Points (Hamming) Nearest Neighbors
Given: sets A, B of n points in {0,1}^d, d = c log n.
Task: for all x ∈ A, output a y ∈ B that minimizes h(x, y).
Idea 2: Group A and B into ~n/s groups of size s, and let p determine whether there is a close pair among a group from A and a group from B.
Let ATL_k(x) output 1 iff the sum of the bits in x is at least k.
Consider the circuit C_k defined on s vectors from A and s vectors from B: an OR of O(s^2) ATL_k gates, one per pair (x, y) of points in the two groups, where each ATL_k gate reads the d bits EQ(x_1, y_1), EQ(x_2, y_2), ..., EQ(x_d, y_d).
C_k outputs 1 iff some pair of points in the input has Hamming distance at most d − k.
Goal: construct a polynomial representing C_k with few monomials.

All-Points (Hamming) Nearest Neighbors
Thm [AW 15]: All-Points Hamming Nearest Neighbors can be solved on n points in c log n dimensions in randomized n^{2 − 1/O(c log^2 c)} time.
Consider the circuit C_k (an OR of O(s^2) ATL_k gates over the EQ(x_i, y_i) bits); C_k outputs 1 iff some pair of points has Hamming distance at most d − k.
Goal: construct a nice probabilistic polynomial simulating C_k.
Thm [AW 15]: Every symmetric function on d inputs has a probabilistic polynomial of degree O(√(d log(1/ε))) with error at most ε.
Replace each ATL_k in C_k with a probabilistic polynomial having error 1/s^3: we obtain a somewhat sparse probabilistic polynomial for C_k.

All-Points (Hamming) Nearest Neighbors
Thm [AW 15]: All-Points Hamming Nearest Neighbors can be solved on n points in c log n dimensions in randomized n^{2 − 1/O(c log^2 c)} time.
Consider the circuit C_k (an OR of O(s^2) ATL_k gates over the EQ(x_i, y_i) bits); C_k outputs 1 iff some pair of points has Hamming distance at most d − k.
Goal: construct a nice probabilistic polynomial simulating C_k.
Nearest Neighbors on n points reduces to evaluating C_k on O(n^2/s^2) pairs of groups, for each k.
Evaluation Lemma [C 82, W 14]: Given sets A, B ⊆ {0,1}^m with |A| = |B| = n/s, and a polynomial q(x_1, ..., x_m, y_1, ..., y_m) over F with at most (n/s)^0.1 monomials, we can evaluate q on all (x, y) ∈ A × B in ((n/s)^2 + (n/s)^1.1 · m) · poly(log n) time.
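The shape of Idea 2's reduction, as an editorial Python sketch: points are packed into groups of size s and a single group-pair test C_k flags which group pairs can contain a close pair. Here C_k is computed directly (brute force inside each group pair) purely to illustrate the structure; in [AW 15] and this work it is replaced by a sparse probabilistic polynomial and all group pairs are evaluated at once via the Evaluation Lemma. The names and parameters below are illustrative only.

import numpy as np

rng = np.random.default_rng(2)
n, d, s, k = 64, 16, 8, 14
A = rng.integers(0, 2, size=(n, d))
B = rng.integers(0, 2, size=(n, d))

def C_k(group_a, group_b):
    # OR over all s^2 point pairs of "at least k agreeing coordinates"
    agree = group_a @ group_b.T + (1 - group_a) @ (1 - group_b).T
    return bool((agree >= k).any())

groups_a = [A[i:i + s] for i in range(0, n, s)]
groups_b = [B[i:i + s] for i in range(0, n, s)]
flagged = [(i, j) for i, ga in enumerate(groups_a)
                  for j, gb in enumerate(groups_b) if C_k(ga, gb)]
print(len(flagged), "of", len(groups_a) * len(groups_b),
      "group pairs can contain a pair at Hamming distance <= d - k")
# Only the flagged group pairs need the slower exact pairwise search.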

New Results
Probabilistic polynomials with less randomness:
Thm: Every symmetric function on d inputs has a probabilistic polynomial of degree O(√(d log s)) and error 1/s, using O(log d · log(ds)) random bits.
- Degree O(√(d log s)) using Ω(d) random bits was known [AW 15].
- Degree lower bound: Ω(√(d log s)) [R 87, S 87].

New Results
Probabilistic polynomials with less randomness:
Thm: Every symmetric function on d inputs has a probabilistic polynomial of degree O(√(d log s)) and error 1/s, using O(log d · log(ds)) random bits.
A nice PTF for the At-Least-K (threshold) function:
Thm: For all s > 0, there is a PTF P(x_1, ..., x_d) of degree O(√(d log s)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ −1 < P(x) < 0, and ATL_k(x) = 1 ⟹ P(x) > s.
An even nicer probabilistic PTF for At-Least-K:
Thm: For all s > 0, there is a probabilistic PTF P(x_1, ..., x_d, y) of degree O(d^{1/3} log^{2/3}(ds)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ Pr_y[−1 < P(x, y) < 0] ≥ 1 − 1/s, and ATL_k(x) = 1 ⟹ Pr_y[P(x, y) > s] ≥ 1 − 1/s.
All three allow you to compute an OR of s At-Least-K functions!

Main Applications
All-Points Nearest Neighbors in Hamming, ℓ_1, and ℓ_2. Given n red and n blue points in c log n dimensions, we can:
- Find a Hamming nearest blue neighbor for all red points, in randomized dn + n^{2 − 1/Õ(√c)} time, and in deterministic dn + n^{2 − 1/Õ(c)} time (derandomizing [AW 15]).
- Find a (1 + ε)-approximate nearest blue neighbor for all red points in randomized dn + n^{2 − Ω(ε^{1/3})} time, in the Hamming, ℓ_1, and ℓ_2 metrics (improving over Locality Sensitive Hashing [IM 98] and G. Valiant [V 13]).
Circuit Complexity.
- ACC ∘ THR ∘ THR circuits can be evaluated on all 2^n inputs in 2^n · poly(n) (deterministic) time, for circuits with n^{2−ε} bottom THR gates and 2^{n^ε} gates elsewhere. Implies analogous lower bounds for this circuit class.
- Satisfiability of subexponential-size MAJORITY ∘ AC0 ∘ THR ∘ AC0 ∘ THR circuits can be decided in randomized time much less than 2^n, where the MAJORITY and THR gates have fan-in n^{6/5 − ε}.

Prob. Polys With Less Randomness
Thm: ATL_k on d inputs has a probabilistic polynomial of degree O(√(d log s)) and error 1/s, using O(log d · log(ds)) random bits.
Sketch: Let's outline the construction from [AW 15] for ATL_k. Let δ = Θ(√(log s / d)). The construction is recursive. For an input x ∈ {0,1}^d, there are two cases:
1. If |x| ∉ [k − δd, k + δd]: construct a shorter input x' by randomly sampling a 1/10-fraction of x, and let k' = k/10. By Chernoff-Hoeffding and our choice of δ, it is likely that ATL_{k'}(x') = ATL_k(x), so use the (recursively constructed) polynomial for ATL_{k'}(x').
2. If |x| ∈ [k − δd, k + δd]: use an exact polynomial A of degree O(δd) that is guaranteed to give the correct answer on this range (polynomial interpolation).
To determine which case we are in, use ATL_{k+δd}(x) and ATL_{k−δd}(x):
ATL_k(x) ≈ (1 − ATL_{k+δd}(x)) · ATL_{k−δd}(x) · A(x) + [1 − (1 − ATL_{k+δd}(x)) · ATL_{k−δd}(x)] · ATL_{k'}(x').
Observation: in the analysis, we only need O(log d)-wise independence to generate a good random sample x' of x.
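A small numerical sanity check (editorial, not part of the construction) of the Case 1 step: when |x| is well outside the window [k − δd, k + δd], a random 1/10-sample x' satisfies ATL_{k/10}(x') = ATL_k(x) with high probability. The parameters below are arbitrary.

import math, random

d, s = 4000, 100
delta = math.sqrt(math.log(s) / d)             # delta = Theta(sqrt(log(s) / d))
k = d // 2
true_count = int(k + 3 * delta * d)            # a count well above the window
x = [1 if i < true_count else 0 for i in range(d)]

trials, bad = 1000, 0
for _ in range(trials):
    x_prime = [b for b in x if random.random() < 0.1]        # ~1/10-fraction sample
    if (sum(x_prime) >= k / 10) != (true_count >= k):        # ATL_{k'}(x') vs ATL_k(x)
        bad += 1
print("empirical failure rate of the sampled test:", bad / trials)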

Nice PTFs for Threshold Functions
A nice PTF for the At-Least-K function:
Thm: For all s > 0, there is a PTF P(x_1, ..., x_d) of degree O(√(d log s)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ −1 < P(x) < 0, and ATL_k(x) = 1 ⟹ P(x) > s.
Idea: Use Chebyshev polynomials!
T_q(x) = Σ_{i=0}^{⌊q/2⌋} C(q, 2i) · (x^2 − 1)^i · x^{q−2i}
T_q(x) = 2x·T_{q−1}(x) − T_{q−2}(x), with T_0(x) = 1, T_1(x) = x.
T_q(x) = cos(q·arccos x) for |x| ≤ 1.
T_q(x) = (1/2)(x − √(x^2 − 1))^q + (1/2)(x + √(x^2 − 1))^q for |x| ≥ 1.

Nice PTFs for Threshold Functions
A nice PTF for the At-Least-K function:
Thm: For all s > 0, there is a PTF P(x_1, ..., x_d) of degree O(√(d log s)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ −1 < P(x) < 0, and ATL_k(x) = 1 ⟹ P(x) > s.
Idea: Use Chebyshev polynomials! T_q(x) = Σ_{i=0}^{⌊q/2⌋} C(q, 2i) · (x^2 − 1)^i · x^{q−2i}.
The ordinary Chebyshev polynomial gives degree O(√d · log s):
- If x ∈ [−1, 1] then T_q(x) ∈ [−1, 1].
- If x ≥ 1 + ε then T_q(x) ≥ (1/2) · e^{q·√ε}.
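The two Chebyshev facts above are easy to see numerically; the following editorial Python sketch (not from the talk) evaluates T_q with the recurrence T_q = 2x·T_{q−1} − T_{q−2} and compares T_q(1 + ε) against the (1/2)·e^{q√ε} lower bound.

import math

def T(q, x):
    # Chebyshev polynomial of the first kind via the three-term recurrence
    t0, t1 = 1.0, x
    if q == 0:
        return t0
    for _ in range(q - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

q, eps = 40, 0.01
inside = max(abs(T(q, -1 + 2 * i / 200)) for i in range(201))
print("max |T_q| on [-1, 1]       :", round(inside, 6))          # stays at most 1
print("T_q(1 + eps)               :", T(q, 1 + eps))
print("(1/2) * exp(q * sqrt(eps)) :", 0.5 * math.exp(q * math.sqrt(eps)))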

Nice PTFs for Threshold Functions
A nice PTF for the At-Least-K function:
Thm: For all s > 0, there is a PTF P(x_1, ..., x_d) of degree O(√(d log s)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ −1 < P(x) < 0, and ATL_k(x) = 1 ⟹ P(x) > s.
Better degree: use discrete Chebyshev polynomials! [Chebyshev 99]
D_{q,t}(x) = Σ_{i=0}^{q} (−1)^i · C(q, i) · C(t − x, q − i) · C(x, i), where c_{q,t} = (t + 1)_q / q!.
- If x ∈ {0, 1, ..., t} then D_{q,t}(x) ∈ [−1, 1].
- If x ≥ 1 + t then D_{q,t}(x) ≥ e^{q^2/(8t)} / c_{q,t}.

Nice PTFs for Threshold Functions
A nice PTF for the At-Least-K function:
Thm: For all s > 0, there is a PTF P(x_1, ..., x_d) of degree O(√(d log s)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ −1 < P(x) < 0, and ATL_k(x) = 1 ⟹ P(x) > s.
Fact [Chebyshev 99]: For all q ∈ [Ω(√(t log t)), t]:
- If x ∈ {0, 1, ..., t} then D_{q,t}(x) ∈ [−1, 1].
- If x ≤ −1 then D_{q,t}(x) ≥ exp(q^2 / (8(t + 1))).
Define P(x_1, ..., x_d) := D_{q,d}(k − 1 − Σ_i x_i), where q := Θ(√(d log s)). Then:
- Σ_i x_i < k ⟹ P(x) ∈ [−1, 1];
- Σ_i x_i ≥ k ⟹ P(x) ≥ e^{Θ(q^2/d)} = e^{Θ(log s)} = poly(s).
Finally, shift P a little to get the desired properties in the theorem.
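To see the overall PTF shape in code, here is an editorial Python sketch that builds the analogous PTF from the ordinary Chebyshev polynomial of the previous slides (so its degree carries the extra log factor; the discrete Chebyshev version above improves it to O(√(d log s))): the count Σ_i x_i is mapped affinely so that counts below k land in [−1, 1] and counts ≥ k land past 1 + ε, and a final affine shift (the slide's "shift P a little") places the two cases in (−1, 0) and above s. The constants are chosen for the demo, not taken from the paper.

import math

def T(q, x):
    # Chebyshev polynomial of the first kind via the three-term recurrence
    t0, t1 = 1.0, x
    if q == 0:
        return t0
    for _ in range(q - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

d, k, s = 400, 200, 1000
eps = 2 / (k - 1)                                          # count k maps to 1 + eps
q = math.ceil(math.sqrt((k - 1) / 2) * math.log(8 * s))    # degree O(sqrt(k) * log s)

def P(m):
    # PTF score as a function of the count m = sum_i x_i
    z = (2 * m - (k - 1)) / (k - 1)    # {0, ..., k-1} -> [-1, 1], k -> 1 + eps
    return (T(q, z) - 1.5) / 3         # affine shift into the stated ranges

assert all(-1 < P(m) < 0 for m in range(0, k))     # ATL_k(x) = 0  ->  (-1, 0)
assert all(P(m) > s for m in range(k, d + 1))      # ATL_k(x) = 1  ->  > s
print("Chebyshev-based PTF gap verified; degree q =", q)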

Even Nicer Probabilistic PTF
An even nicer probabilistic PTF for At-Least-K:
Thm: For all s > 0, there is a probabilistic PTF P(x_1, ..., x_d, y) of degree O(d^{1/3} log^{2/3}(ds)) for ATL_k on d variables such that for all x: ATL_k(x) = 0 ⟹ Pr_y[−1 < P(x, y) < 0] ≥ 1 − 1/s, and ATL_k(x) = 1 ⟹ Pr_y[P(x, y) > s] ≥ 1 − 1/s.
Idea: Combine the previous two constructions!
Really, really sketchy sketch: Let δ > 0. Construct x' by randomly sampling δd bits of x. Let Q(x') be a probabilistic polynomial for ATL_{k'} with error at most 1/(2s) (here the threshold k' is the appropriately rescaled, slightly smaller version of k). Take a modified discrete Chebyshev polynomial D_{q,t} which blows up to > s when Σ_i x'_i ≥ k', and otherwise stays in the interval [−1, 1]. Set P(x) := D_{q,t}(x') · Q(x'), for an appropriate t (≈ δd).
A careful case analysis and a new setting of δ (and q) yields the result!

Conclusion
- What is the power/limit of probabilistic PTFs representing Boolean functions? How easy/difficult is it to prove degree lower bounds for such representations?
- Can our SAT algorithm for MAJ ∘ AC0 ∘ THR ∘ AC0 ∘ THR be derandomized? This would imply stronger circuit lower bounds.
- How much can the n^{2 − Ω(ε^{1/3})} runtime for (1 + ε)-approximate batch nearest neighbor be improved?
Thank you!