Hardness for easy problems. An introduction


The real world and hard problems. "I've got data. I want to solve this algorithmic problem, but I'm stuck!" "I'm sorry, this problem is NP-hard. A fast algorithm for it would resolve a hard problem in CS/math." "Ok, thanks. I feel better that none of my attempts worked. I'll use some heuristics."

The real world and easy problems. "I've got data. I want to solve this algorithmic problem, but I'm stuck!" "Great news! Your problem is in P. Here's an O(n^2)-time algorithm!" "But my data size n is huge! Don't you have a faster algorithm?! Should I wait? Or should I be satisfied with heuristics?" "Uhm, I don't know... This is already theoretically fast... For some reason I can't come up with a faster algorithm for it right now."

In theoretical CS, polynomial time = efficient/easy. This is for a variety of reasons: e.g., composing two efficient algorithms results in an efficient algorithm, and the notion is model-independent. However, no one would consider an O(n^100)-time algorithm efficient in practice, and if n is huge, then even O(n^2) can be inefficient. How do we explain when we are stuck?

The easy problems. Let's focus on O(N^2) time (N = size of the input).

What do we know about O(N^2) time? Amazingly fast algorithms exist. Classical (almost) linear-time algorithms: connectivity, planarity, minimum spanning tree, single-source shortest paths, min cut, topological order of DAGs. Recent breakthroughs: solving SDD linear systems (m^{1+o(1)} [ST'04]), approximate max flow (m^{1+o(1)} [KLOS'13]), max matching (m^{10/7} [M'14]), min cost flow (~m√n [LS'14]), plus sublinear algorithms/property testing. Good news: a lot of problems can be solved in close to linear time!

Hard problems in O(N^2) time. Bad news: on many problems we are stuck. No N^{2-ε}-time algorithms are known for many problems in computational geometry, e.g.: given n points in the plane, are any three collinear? (A very important primitive!) In general, for many problems in P there is an "easy" O(N^c)-time solution, but essentially no improvement in decades.

Hard problems in O(N^2) time. Bad news: on many problems we are stuck. No N^{2-ε}-time algorithm is known for: many problems in computational geometry; many string matching problems, such as edit distance, sequence local alignment, longest common subsequence, and jumbled indexing.

Sequence alignment. A fundamental problem from computational biology: given two DNA strings, ATCGGGTTCCTTAAGGG and ATTGGTACCTTCAGG, how similar are they? What do they have in common? There are several notions of similarity, e.g. local alignment, edit distance, and longest common subsequence.

Longest Common Subsequence. Given two strings, ATCGGGTTCCTTAAGGG and ATTGGTACCTTCAGG, find a longest string that is a subsequence of both. Applications both in computational biology and in spellcheckers.

Longest Common Subsequence. Given two strings, ATCGGGTTCCTTAAGGG and ATTGGTACCTTCAGG (shown on the slide aligned with gaps to highlight a common subsequence), find a longest string that is a subsequence of both. Applications both in computational biology and in spellcheckers. Solved daily on huge strings!

Sequence problems: theory vs. practice. The fastest known algorithm for most sequence alignment variants runs in O(n^2) time on length-n sequences. Sequence alignment is run on whole-genome sequences; the human genome has 3 × 10^9 base pairs. A quadratic-time algorithm is not fast!
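For concreteness, here is a minimal Python sketch of the textbook quadratic-time dynamic program for LCS (the function name and the example call are my own, not from the talk); this O(n·m) table fill is exactly the kind of algorithm that becomes impractical at genome scale.

def lcs_length(a: str, b: str) -> int:
    # dp[i][j] = length of an LCS of the prefixes a[:i] and b[:j]
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

# The two example strings from the slides:
print(lcs_length("ATCGGGTTCCTTAAGGG", "ATTGGTACCTTCAGG"))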

Hard problems in O(N^2) time. Bad news: on many problems we are stuck. No N^{2-ε}-time algorithm is known for: many problems in computational geometry; many string matching problems; many graph problems in sparse graphs, e.g.: given an n-node, O(n)-edge graph, what is its diameter? A fundamental problem, and even approximation algorithms seem hard!

Hard problems in O(N^2) time. Bad news: on many problems we are stuck. No N^{2-ε}-time algorithm is known for: many problems in computational geometry; many string matching problems; many graph problems in sparse graphs. Moreover, no N^{1.5-ε}-time algorithms are known for many problems in dense graphs: diameter, radius, median, second shortest path, shortest cycle. (N^{1.5}-time algorithms do exist: for a dense graph the input size is N = Θ(n^2), so N^{1.5} is n^3 time.)

Why are we stuck? We are stuck on many problems from different subareas of CS! Are we stuck for the same reason? How do we address this? How did we address this for the hard problems?

A canonical hard problem: k-SAT. Input: variables x_1, ..., x_n and a formula F = C_1 ∧ C_2 ∧ ... ∧ C_m, where each clause C_i is of the form (y_1 ∨ y_2 ∨ ... ∨ y_k) and each y_j is either x_t or ¬x_t for some t. Output: a Boolean assignment to {x_1, ..., x_n} that satisfies all the clauses, or NO if the formula is not satisfiable. Trivial algorithm: try all 2^n assignments. Best known algorithm: O(2^{n - cn/k} n^d) time for constants c, d.
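As a sketch of the trivial 2^n algorithm, here is a hedged Python version (the encoding of a literal as a signed, 1-indexed variable is my own illustrative convention, not from the talk).

from itertools import product

def brute_force_sat(num_vars, clauses):
    # clauses: list of clauses; each clause is a list of literals,
    # where literal +t means x_t and -t means NOT x_t (1-indexed).
    for assignment in product([False, True], repeat=num_vars):
        if all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in clauses):
            return assignment  # a satisfying assignment
    return None  # the formula is unsatisfiable

# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
print(brute_force_sat(3, [[1, 2], [-1, 3], [-2, -3]]))  # (False, True, False)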

Why is k-SAT hard? Theorem [Cook, Karp '72]: k-SAT is NP-complete for all k ≥ 3. Tool: polynomial-time reductions. That is, if there is an algorithm that solves k-SAT instances on n variables in poly(n) time, then all problems in NP have poly(n)-time solutions, and so P = NP. k-SAT (and every other NP-complete problem) is considered hard because a fast algorithm for it would imply fast algorithms for many important problems.

Addressing the hardness of easy problems 1. Identify key hard problems 2. Reduce these to all (?) other hard easy problems 3. Hopefully form equivalence classes Goal: understand the landscape of polynomial time.

CNF-SAT is conjectured to be really hard. Two popular conjectures about SAT [IPZ'01]: ETH: 3-SAT requires 2^{δn} time for some δ > 0. SETH: for every ε > 0, there is a k such that k-SAT on n variables and m clauses cannot be solved in 2^{(1-ε)n} poly(m) time. So we can use k-SAT as our hard problem and ETH or SETH as the conjecture we base hardness on.

Three more problems we can blame. 3SUM: given a set S of n integers, are there a, b, c ∈ S with a + b + c = 0? Orthogonal Vectors (OV): given a set S of n vectors in {0,1}^d, for d = O(log n), are there u, v ∈ S with u · v = 0? All-Pairs Shortest Paths (APSP): given a weighted graph, find the distance between every pair of nodes.

3SUM: given a set S of n numbers, are there a, b, c ∈ S with a + b + c = 0? There is an easy O(n^2)-time algorithm. [BDP'05]: an ~n^2/log^2 n time algorithm for integers. [GP'14]: ~n^2/log n time for real numbers. Here we'll talk about 3SUM over the integers. Folklore: one can assume the integers are in {-n^3, ..., n^3}. 3SUM Conjecture: 3SUM on n integers in {-n^3, ..., n^3} requires n^{2-o(1)} time.
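A minimal Python sketch of the easy O(n^2) algorithm (sort once, then run a two-pointer scan for each element; names and conventions are my own). This is the quadratic baseline that the 3SUM conjecture asserts cannot be beaten by a polynomial factor.

def has_3sum(nums):
    # Returns True if some three entries (at distinct positions) sum to 0.
    s = sorted(nums)
    n = len(s)
    for i in range(n):
        lo, hi = i + 1, n - 1
        while lo < hi:
            total = s[i] + s[lo] + s[hi]
            if total == 0:
                return True
            if total < 0:
                lo += 1
            else:
                hi -= 1
    return False

print(has_3sum([-5, 1, 4, 2, -7, 3]))  # True: -5 + 1 + 4 = 0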

Orthogonal Vectors (OV): given a set S of n vectors in {0,1}^d, for d = O(log n), are there u, v ∈ S with u · v = 0? There is an easy O(n^2 log n)-time algorithm. Best known [AWY'15]: n^{2 - Θ(1/log(d/log n))}. OV Conjecture: OV on n vectors requires n^{2-o(1)} time. [W'04]: SETH implies the OV Conjecture. I'll prove this to you later.
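The easy algorithm simply checks every pair of vectors; below is a minimal Python sketch of that O(n^2 d) = O(n^2 log n) baseline (packing each vector into an integer bit mask is my own choice, not part of the problem statement).

def has_orthogonal_pair(vectors):
    # vectors: list of 0/1 vectors, each a tuple of bits.
    # Two vectors are orthogonal iff no coordinate is 1 in both.
    masks = [sum(bit << i for i, bit in enumerate(v)) for v in vectors]
    n = len(masks)
    for i in range(n):
        for j in range(i + 1, n):
            if masks[i] & masks[j] == 0:
                return True
    return False

print(has_orthogonal_pair([(1, 0, 1), (0, 1, 0), (1, 1, 1)]))  # True: first two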

APSP: given a weighted graph, find the distance between every pair of nodes. A classical problem with a long history of shaving log factors off the cubic runtime:
Fredman (1976): n^3 (log log n)^{1/3} / (log n)^{1/3}
Takaoka (1992): n^3 (log log n)^{1/2} / (log n)^{1/2}
Dobosiewicz (1992): n^3 / (log n)^{1/2}
Han (2004): n^3 (log log n)^{5/7} / (log n)^{5/7}
Takaoka (2004): n^3 (log log n)^2 / log n
Zwick (2004): n^3 (log log n)^{1/2} / log n
Chan (2005): n^3 / log n
Han (2006): n^3 (log log n)^{5/4} / (log n)^{5/4}
Chan (2007): n^3 (log log n)^3 / (log n)^2
Han, Takaoka (2012): n^3 log log n / (log n)^2
Williams (2014): n^3 / exp(√(log n))
APSP Conjecture: APSP on n nodes with O(log n)-bit weights requires n^{3-o(1)} time.
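For reference, the classical cubic-time Floyd-Warshall algorithm for APSP as a minimal Python sketch (the adjacency-matrix input with float('inf') for missing edges is my own convention). The APSP conjecture says that no algorithm, however clever, shaves a polynomial factor off this n^3.

def floyd_warshall(dist):
    # dist: n x n matrix; dist[i][j] is the weight of edge (i, j),
    # float('inf') if absent, and dist[i][i] = 0. Updated in place
    # so that dist[i][j] becomes the shortest-path distance.
    n = len(dist)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = float('inf')
d = floyd_warshall([[0, 3, INF], [INF, 0, 1], [2, INF, 0]])
print(d)  # d[0][2] becomes 4 (via node 1), d[1][0] becomes 3 (via node 2)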

Addressing the hardness of easy problems 1. Identify key hard problems 2. Reduce these to all (?) other hard easy problems 3. Hopefully form equivalence classes

Fine-grained reductions. A is (a, b)-reducible to B if for every ε > 0 there is a δ > 0 and an O(a(n)^{1-δ})-time algorithm that transforms any A-instance of size n into B-instances of sizes n_1, ..., n_k with Σ_i b(n_i)^{1-ε} < a(n)^{1-δ}. Consequence: if B is solvable in O(b(n)^{1-ε}) time, then A is solvable in O(a(n)^{1-δ}) time. Intuition: a(n) and b(n) are the naive runtimes for A and B, so A being reducible to B means that beating the naive runtime for B also beats the naive runtime for A. We focus on exponents, and we can build equivalences: a theory of hardness for polynomial time.

Some structure within P. [The slide shows the reduction landscape.] From Orthogonal Vectors (hard under k-SAT via [W'04], 2^{(1-ε)n}), N^{2-ε} hardness follows for: sparse graph diameter [RV'13], local alignment and longest common substring* [AVW'14], Fréchet distance [Br'14], edit distance [BI'15], LCS [ABV'15, BrK'15], and dynamic problems [P'10], [AV'14], [HKNS'15], [RZ'04]. From 3SUM (N^{2-ε}): a huge literature in computational geometry [GO'95, BH-P'98, ...] (Geombase, 3PointsLine, 3LinesPoint, Polygonal Containment) and string problems (sequence local alignment [AVW'14], jumbled indexing [ACLL'14]). From APSP (n^{3-ε}, i.e. N^{1.5-ε}): in dense graphs, radius, median, betweenness [AGV'15], negative triangle, second shortest path, shortest cycle [VW'10]. We are STUCK on all 3!

Fast OV implies SETH is false [W'04]. Let F be a k-CNF formula on n variables and m = O(n) clauses*. Split the variables into V_1 and V_2 of n/2 variables each. For j = 1, 2 and each partial assignment α of V_j, create a vector v(j, α) of length m + 2: the first two coordinates are 0 1 for every v(1, α) and 1 0 for every v(2, α) (so two vectors from the same half are never orthogonal), and each of the remaining m coordinates, one per clause, is 0 if α satisfies that clause and 1 otherwise. (*By the sparsification lemma, any k-CNF can be converted into a small number of k-CNFs on O(n) clauses.)

Fast OV implies SETH is false. With the vectors defined on the previous slide, the claim is: v(1, α) · v(2, β) = 0 iff the combined assignment (α, β) satisfies F, since orthogonality means every clause is satisfied by at least one of the two halves. This gives N = 2 · 2^{n/2} vectors of dimension O(n) = O(log N), i.e. an OV instance. Since N^{2-ε} = O(2^{(1-ε/2)n}), an O(N^{2-ε})-time algorithm for OV implies SETH is false.
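To make the construction concrete, here is a hedged Python sketch of the vector-building step (the clause encoding, helper names and example are my own; the sparsification step is omitted). Feeding the resulting vectors to any OV algorithm, such as the pairwise check sketched earlier, decides satisfiability.

from itertools import product

def sat_to_ov(num_vars, clauses):
    # clauses: list of clauses, each a list of signed 1-indexed literals
    # (+t means x_t, -t means NOT x_t). Returns 0/1 vectors of dimension
    # m + 2 that contain an orthogonal pair iff the formula is satisfiable.
    half = num_vars // 2

    def satisfied(assignment, offset, clause):
        # Does this partial assignment (variables offset+1 .. offset+len)
        # already satisfy the clause?
        return any(offset < abs(lit) <= offset + len(assignment)
                   and (lit > 0) == assignment[abs(lit) - 1 - offset]
                   for lit in clause)

    vectors = []
    for alpha in product([False, True], repeat=half):             # half V1
        vectors.append((0, 1) + tuple(0 if satisfied(alpha, 0, c) else 1
                                      for c in clauses))
    for beta in product([False, True], repeat=num_vars - half):   # half V2
        vectors.append((1, 0) + tuple(0 if satisfied(beta, half, c) else 1
                                      for c in clauses))
    return vectors

# Example: (x1 OR x3) AND (NOT x2 OR x4), a satisfiable formula on 4 variables.
vecs = sat_to_ov(4, [[1, 3], [-2, 4]])
print(any(all(a * b == 0 for a, b in zip(u, v))
          for i, u in enumerate(vecs) for v in vecs[i + 1:]))  # True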

[The same structure-within-P picture as before, now with CNF-SAT in place of k-SAT feeding into Orthogonal Vectors: the reduction just shown ties the whole N^{2-ε} column to SETH.]

Another popular conjecture. Boolean matrix multiplication (BMM): given two n × n Boolean matrices A and B, compute their Boolean product C, where C[i,j] = ⋁_k (A[i,k] ∧ B[k,j]). BMM can be computed in O(n^ω) time, with ω < 2.38. The algebraic techniques are not very practical, however. The best known "combinatorial" techniques achieve a runtime of at best n^3 / log^4 n [Yu'15]. BMM Conjecture: any "combinatorial" algorithm for BMM requires n^{3-o(1)} time.
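For comparison with those bounds, here is the naive cubic combinatorial algorithm for BMM as a minimal Python sketch (variable names are my own); both the n^3/log^4 n combinatorial speedups and the O(n^ω) algebraic algorithms are far more involved.

def boolean_matrix_multiply(A, B):
    # A, B: n x n matrices with 0/1 entries.
    # Returns C with C[i][j] = OR over k of (A[i][k] AND B[k][j]).
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if A[i][k] and B[k][j]:
                    C[i][j] = 1
                    break
    return C

print(boolean_matrix_multiply([[1, 0], [0, 1]], [[0, 1], [1, 0]]))  # [[0, 1], [1, 0]]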

BMM conjecture consequences. Reductions from BMM are typically used to show that fast matrix multiplication is probably required. Some implications of the BMM conjecture: a triangle in a graph cannot be found faster than n^{3-o(1)} time by any combinatorial algorithm; the radius of an unweighted graph requires n^{3-o(1)} time via combinatorial techniques; maintaining a maximum bipartite matching dynamically with nontrivial update times requires fast matrix multiplication; CFG parsing requires fast matrix multiplication.

Which conjectures are more believable? Besides Ryan's proof that SETH implies the OV Conjecture, no other reductions relating the conjectures are known; it could be that one is true while all the others are false. However: the decision tree complexities of both 3SUM [GP'14] and APSP [Fredman'75] are low, n^{1.5} and n^{2.5} respectively, and this is not known for OV. Perhaps this means the OV conjecture is more believable? On the other hand, OV and APSP both admit super-polylogarithmic improvements over the naive runtime, while 3SUM does not, as far as we know. So maybe the 3SUM conjecture is more believable? There are natural problems that 3SUM, APSP and k-SAT all reduce to: Matching Triangles and Triangle Collection. Amir will talk about them later.

[The structure-within-P picture once more, annotated with the day's schedule: the dense-graph APSP consequences (2pm, Virginia), dynamic problems (3pm, Amir), the sequence problems such as Fréchet distance, edit distance and LCS (4pm Piotr, 4:30pm Arturs), and S-T max flow / dynamic max flow [AVY'15] (Monday 10:15am, Amir). The picture also includes Triangle Collection, a problem that k-SAT, 3SUM and APSP all reduce to.]

Plan for the rest of the day: 2pm Hardness and Equivalences for Graph Problems (Virginia); 3pm Hardness for Dynamic Problems (Amir); 4pm Intro to Hardness for Sequence Problems (Piotr); 4:30pm Hardness for Sequence Problems (Arturs); 5pm Conclusion and Future Work (Amir). THANKS!