Average Case Complexity

A fundamental question in NP-completeness theory is: when, and in what sense, can an NP-complete problem be considered solvable in practice?

In real life a problem often arises with a natural distribution of problem instances. What qualifies as natural?

Computational Complexity, by Y. Fu

Historically the focus was on the probabilistic analysis of algorithms with respect to the uniform distribution. There are hard NP problems that have easy average case algorithms.

The fundamentals of average case complexity were developed by Leonid Levin in 1986 in a two-page paper: Average Case Complete Problems. SIAM Journal on Computing, 15:285-286, 1986.

It aims at an NP-completeness theory with natural distributions.

Synopsis
1. Distributional Problem
2. Natural Distribution
3. DistNP-Completeness
4. SampNP-Completeness

Distributional Problem

Technically a problem that arises in practice is a pair consisting of a decision problem and a distribution over its problem instances.

Distribution Function
A distribution function $\mu : \{0,1\}^* \to [0,1]$ maps strings to real values in $[0,1]$ such that $\mu(x) \le \mu(y)$ whenever $x < y$, and $\lim_{x\to\infty} \mu(x) = 1$.
Using the lexicographic order $<_{lex}$, the order $<$ can be defined by: $x < y$ iff $|x| < |y|$, or $|x| = |y|$ and $x <_{lex} y$.
1. The value $\mu(x)$ is the cumulative probability at $x$.
2. The density function $\mu'$ of $\mu$ is defined by $\mu'(x) = \mu(x) - \mu(x-1)$, where $x-1$ denotes the predecessor of $x$ in the order $<$.
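The order on strings and the step from a cumulative function to its density can be sketched in a few lines of Python. This is only an illustration: the helper names `rank`, `unrank`, `predecessor` and `density` are hypothetical, and `mu` is assumed to be any Python callable that returns the cumulative probability of a binary string.

```python
def rank(x):
    """Position of x in the order <: the empty string is 0, then 0, 1, 00, 01, ..."""
    return (1 << len(x)) - 1 + int(x, 2) if x else 0


def unrank(r):
    """Inverse of rank: the r-th binary string in the order <."""
    n = (r + 1).bit_length() - 1          # length of the string
    return format(r + 1 - (1 << n), f"0{n}b") if n else ""


def predecessor(x):
    """The string written x - 1 in the text (undefined for the empty string)."""
    if not x:
        raise ValueError("the empty string has no predecessor")
    return unrank(rank(x) - 1)


def density(mu, x):
    """mu'(x) = mu(x) - mu(x - 1), taking mu(x - 1) = 0 when x is empty."""
    return mu(x) - (mu(predecessor(x)) if x else 0)


print([unrank(r) for r in range(7)])
print(predecessor("000"))
```

Shorter strings always come first, so the predecessor of the first string of length n is the last string of length n - 1.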

A distributional problem is a pair $\langle L, \mu\rangle$ where $L \subseteq \{0,1\}^*$ and $\mu$ is a distribution function.

We are particularly interested in the distributional problem classes that are average case counterparts of P and NP.

On Average Time
For every TM $A$ and input $x$, let $\mathrm{time}_A(x)$ denote the number of steps $A$ takes on input $x$.
1. We say that the worst case complexity of $A$ is polynomial if $\exists c, d.\ \forall x.\ \mathrm{time}_A(x) \le c|x|^d$.
2. It seems natural to say that a distributional problem $\langle L, \mu\rangle$ is efficiently solvable by a TM $A$ if
$$\exists c, d.\ \forall n.\ \sum_{x\in\{0,1\}^n} \mu'(x)\,\mathrm{time}_A(x) \le cn^d.$$

Polynomial Time on Average
However, the natural definition is pathological because it is not closed under function composition, and it is not model independent.

Polynomial Time on Average
Consider a $k$-tape TM that halts in $n$ steps on every input of length $n$ other than $0^n$, and in $2^n$ steps on input $0^n$. Assume the distribution is uniform. Its expected running time is
$$n\left(1 - \frac{1}{2^n}\right) + \frac{2^n}{2^n} < n + 1.$$
On a machine with only one tape the average running time would be exponential due to a quadratic slowdown.
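The arithmetic in this example can be checked exactly. The sketch below assumes the uniform distribution on strings of length n and models the one-tape simulation by squaring every running time; that squaring is an illustrative assumption, not a statement about a particular machine. The function name `expected_time` is hypothetical.

```python
from fractions import Fraction


def expected_time(n, slowdown=1):
    """Exact expected running time over uniform inputs of length n.

    The machine takes 2**n steps on 0^n and n steps on the other 2**n - 1
    inputs. `slowdown` is the exponent applied to each running time:
    1 for the k-tape machine, 2 for the assumed quadratic simulation.
    """
    p = Fraction(1, 2 ** n)                   # probability of each single input
    slow_input = p * (2 ** n) ** slowdown     # contribution of 0^n
    fast_inputs = (1 - p) * n ** slowdown     # contribution of all other inputs
    return slow_input + fast_inputs


for n in (4, 8, 16, 32):
    print(n, float(expected_time(n)), float(expected_time(n, slowdown=2)))
```

With `slowdown=1` the value stays below n + 1, while with `slowdown=2` the single input 0^n alone contributes 2^n to the expectation.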

Polynomial Time on Average
Let's manipulate the worst case complexity formula slightly:
$$\exists c, d.\ \forall x.\ \frac{\mathrm{time}_A(x)^{1/d}}{|x|} \le \sqrt[d]{c}.$$
By applying the expectation operation to the above formula we get Levin's definition of average case polynomial time.

Average Case Analog of P
A distributional problem $\langle L, \mu\rangle$ is in AvgP if $L$ is accepted by a TM $A$ that renders true the following:
$$\exists C, \epsilon.\ \sum_{x\in\{0,1\}^*} \mu'(x)\,\frac{\mathrm{time}_A(x)^{\epsilon}}{|x|} \le C. \tag{1}$$
For every $d > 0$ condition (1) is equivalent to the following:
$$\exists C, \epsilon.\ \sum_{x\in\{0,1\}^*} \mu'(x)\,\frac{\mathrm{time}_A(x)^{\epsilon}}{|x|^d} \le C.$$
Hint: $E[X]^d \le E[X^d]$ for each $d \ge 1$.
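To see that Levin's condition tolerates a polynomial slowdown, one can evaluate the sum in (1) for a machine that takes $2^n$ steps on $0^n$ and $n$ steps on every other input of length $n$. The sketch below is a hedged illustration: it assumes the density $\mu'(x) = 2^{-|x|}/(|x|(|x|+1))$ on nonempty strings, and it picks $\epsilon = 1$ for this machine, or $\epsilon = 1/2$ if all running times are squared, so that $\mathrm{time}(x)^{\epsilon}$ is exactly $2^n$ or $n$ in both cases. The names `length_weight` and `levin_sum` are hypothetical.

```python
from fractions import Fraction


def length_weight(n):
    """Total probability of the strings of length n under the assumed density."""
    return Fraction(1, n * (n + 1))


def levin_sum(max_len):
    """Partial sum of mu'(x) * time(x)**eps / |x| over 1 <= |x| <= max_len.

    With eps = 1 (or eps = 1/2 after squaring all running times),
    time(x)**eps is 2**n on 0^n and n on every other input of length n.
    """
    total = Fraction(0)
    for n in range(1, max_len + 1):
        p_slow = Fraction(1, 2 ** n)                  # conditional probability of 0^n
        per_length = p_slow * Fraction(2 ** n, n) + (1 - p_slow) * 1
        total += length_weight(n) * per_length
    return total


for n in (1, 10, 100):
    print(n, float(levin_sum(n)))
```

Each length n contributes less than 2/(n(n+1)), so the partial sums stay below 2 no matter how large `max_len` is, and any constant above that bound can serve as C in condition (1) for this toy machine.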

Average Case Analog of P
Observations:
P $\subseteq$ AvgP.
An average case P-time algorithm has a high probability to run in P-time. This is due to Markov's inequality:
$$\Pr_{x\in_R\{0,1\}^*}\left[\frac{\mathrm{time}_A(x)^{\epsilon}}{|x|} \ge KC\right] \le \frac{1}{K}.$$

1. AvgP contains not just theoretically feasible problems, but also practically feasible problems.
2. One gets a super class of NP, denoted by AvgNP, if one replaces the TM in the definition of AvgP by an NDTM. Since NP $\subseteq$ AvgNP, the hard problems in AvgNP are unlikely to have efficient algorithms.
3. One looks for a class of NP problems that have efficient average case algorithms.

Natural Distribution

Levin assumed that natural distributions are P-time computable.

Polynomial Time Computable Distribution
Levin, 1986. A distribution function $\mu$ is P-computable if there is a P-time TM that computes it.

The density function of a P-computable distribution function is also P-time computable.

The density function for the uniform distribution is given by
$$\frac{1}{|x|(|x|+1)}\cdot\frac{1}{2^{|x|}}.$$
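A quick exact check shows why this density deserves the name: the strings of length n share a total weight of 1/(n(n+1)), and these weights telescope to 1. The sketch assumes the formula applies to nonempty strings only, since it is undefined for |x| = 0; the function names are hypothetical.

```python
from fractions import Fraction


def uniform_density(x):
    """Density 1 / (|x| (|x| + 1) 2^|x|) of a nonempty binary string x."""
    n = len(x)
    if n == 0:
        raise ValueError("the formula is undefined for the empty string")
    return Fraction(1, n * (n + 1) * 2 ** n)


def mass_up_to(max_len):
    """Total probability of all strings of length 1..max_len."""
    return sum(2 ** n * uniform_density("0" * n) for n in range(1, max_len + 1))


for max_len in (1, 2, 10, 100):
    print(max_len, mass_up_to(max_len))
```

The partial sums equal 1 - 1/(N+1) for maximum length N, so the total mass tends to 1.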

Arguably a distribution is natural not because we can calculate it efficiently; it is natural because it can be generated efficiently.

Polynomial Time Samplable Distribution
Impagliazzo and Levin, 1990. A distribution function $\mu$ is P-samplable if there is a P-time PTM $A$ such that $A$ outputs $x$ with probability $\mu'(x)$ for all $x \in \{0,1\}^*$.
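As a rough illustration of sampling, the sketch below draws strings whose density is proportional to $2^{-|x|}/(|x|(|x|+1))$. It is not a formal PTM: it uses Python's floating-point generator as a stand-in for fair coin flips, and the `max_len` cut-off is an assumption added here so that the rare very long outputs are redrawn. The function name `sample_string` is hypothetical.

```python
import random


def sample_string(rng, max_len=64):
    """Draw x with probability proportional to 2**-|x| / (|x| * (|x| + 1)).

    max_len is a practical guard (an assumption of this sketch): lengths above
    it are rejected and redrawn, so the output follows the target density
    conditioned on 1 <= |x| <= max_len.
    """
    while True:
        u = 1.0 - rng.random()        # uniform in (0, 1]
        n = int(1.0 / u)              # Pr[n = k] = 1/k - 1/(k + 1) = 1/(k (k + 1))
        if n <= max_len:
            break
    return format(rng.getrandbits(n), f"0{n}b")   # n fair coin flips


rng = random.Random(0)
print([sample_string(rng) for _ in range(5)])
```

The length is drawn first, then the bits, which mirrors how the density factors into a length weight and a uniform choice among the strings of that length.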

Lemma. A P-computable distribution is also P-samplable.

Lemma. Assume P $\ne$ P$^{\#P}$. Then there is a P-samplable distribution that is not P-computable.

DistNP and SampNP
A distributional problem $\langle L, \mu\rangle$ is in DistNP if the following hold: $L \in$ NP, and $\mu$ is P-computable.

A distributional problem $\langle L, \mu\rangle$ is in SampNP if the following hold: $L \in$ NP, and $\mu$ is P-samplable.

DistNP-Completeness

A reduction between problems in DistNP is a Karp reduction that additionally satisfies a continuity property on the distributions.

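Suppose $\langle L_1, \mu_1\rangle$ and $\langle L_2, \mu_2\rangle$ are distributional problems. $\langle L_1, \mu_1\rangle$ average case reduces to $\langle L_2, \mu_2\rangle$, written $\langle L_1, \mu_1\rangle \le_A \langle L_2, \mu_2\rangle$, if there is a P-time computable $f$ and polynomials $p, q$ such that
Correctness. $\forall x \in \{0,1\}^*.\ x \in L_1 \Leftrightarrow f(x) \in L_2$;
Length Regularity. $\forall x \in \{0,1\}^*.\ |f(x)| = p(|x|)$;
Domination. $\forall y \in \{0,1\}^*.\ \sum_{x\in\{0,1\}^*,\, f(x)=y} \mu_1'(x) \le q(|y|)\,\mu_2'(y)$.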

1. Length Regularity implies that $f^{-1}(y)$ is finite for all $y \in \{0,1\}^*$.
2. The Domination condition
$$\forall y \in \{0,1\}^*.\ \sum_{x\in\{0,1\}^*,\, f(x)=y} \mu_1'(x) \le q(|y|)\,\mu_2'(y)$$
is to ensure that the reduction does not map a highly likely instance of the first problem onto a rare instance of the second problem. Otherwise an easy solution to the latter does not necessarily yield an easy solution to the former.
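The Domination condition can be tested mechanically on finite toy distributions. The following sketch is an illustration under stated assumptions: `source` and `target` are hypothetical dictionaries mapping strings to their densities, `f` is any Python function playing the role of the reduction, and `q` is a Python function standing in for the polynomial. Correctness and Length Regularity are not checked here.

```python
from collections import defaultdict
from fractions import Fraction


def satisfies_domination(source, target, f, q):
    """Check sum_{x : f(x) = y} source[x] <= q(|y|) * target[y] for every image y."""
    pushed = defaultdict(Fraction)            # mass that f places on each y
    for x, p in source.items():
        pushed[f(x)] += p
    return all(mass <= q(len(y)) * target.get(y, Fraction(0))
               for y, mass in pushed.items())


# Toy densities: uniform on 2-bit strings and uniform on 3-bit strings.
source = {format(i, "02b"): Fraction(1, 4) for i in range(4)}
target = {format(i, "03b"): Fraction(1, 8) for i in range(8)}


def q(n):
    return 2                                  # hypothetical constant polynomial


def pad(x):
    return x + "0"                            # one-one map


def collapse(x):
    return "000"                              # sends every instance to one string


print(satisfies_domination(source, target, pad, q))       # True
print(satisfies_domination(source, target, collapse, q))  # False
```

The second map fails because it concentrates all the source probability on a single target instance of probability 1/8, which is exactly the situation the condition rules out.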

Lemma. Average case reduction is transitive.

Theorem. If $\langle L_1, \mu_1\rangle \le_A \langle L_2, \mu_2\rangle \in$ AvgP, then $\langle L_1, \mu_1\rangle \in$ AvgP.

Proof. Let $f$ be a reduction from $\langle L_1, \mu_1\rangle$ to $\langle L_2, \mu_2\rangle$ with polynomials $p, q$. Let the running time of $f$ be bounded by $dn^d$. Clearly $dn^d \ge p(n)$. Suppose $A_2$ is a TM for $\langle L_2, \mu_2\rangle$ and $\epsilon, C$ are such that
$$\sum_{y\in\{0,1\}^*} \mu_2'(y)\,\frac{\mathrm{time}_{A_2}(y)^{\epsilon}}{|y|} \le C.$$
Let $A_1$ be the obvious TM for $\langle L_1, \mu_1\rangle$ obtained by composition. The inequality derived on the next slide implies $\langle L_1, \mu_1\rangle \in$ AvgP.

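Writing $y = f(x)$ throughout, and assuming $\epsilon \le 1$ and $q \ge 1$:
$$\begin{aligned}
\sum_{x\in\{0,1\}^*} \mu_1'(x)\,\frac{\mathrm{time}_{A_1}(x)^{\epsilon}}{q(|f(x)|)\,d|x|^d}
&\le \sum_{x\in\{0,1\}^*} \mu_1'(x)\,\frac{\left(\mathrm{time}_{A_2}(y) + d|x|^d\right)^{\epsilon}}{q(|y|)\,d|x|^d} \\
&\le \sum_{x\in\{0,1\}^*} \mu_1'(x)\,\frac{\mathrm{time}_{A_2}(y)^{\epsilon} + \left(d|x|^d\right)^{\epsilon}}{q(|y|)\,d|x|^d} \\
&\le \sum_{x\in\{0,1\}^*} \left(\frac{\mu_1'(x)}{q(|y|)}\cdot\frac{\mathrm{time}_{A_2}(y)^{\epsilon}}{|y|}\right) + 1 \\
&\le \sum_{y\in\{0,1\}^*} \mu_2'(y)\left(\frac{\mathrm{time}_{A_2}(y)^{\epsilon}}{|y|}\right) + 1 \\
&\le C + 1.
\end{aligned}$$
The third step uses $d|x|^d \ge p(|x|) = |y|$, and the fourth step uses Domination. Since $q(|f(x)|)\,d|x|^d$ is bounded by a polynomial in $|x|$, the bound matches the equivalent form of the AvgP condition with a polynomial denominator.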

We say that $\langle L, \mu\rangle$ is DistNP-complete if the following hold: $\langle L, \mu\rangle \in$ DistNP, and $\langle L_1, \mu_1\rangle \le_A \langle L, \mu\rangle$ for all $\langle L_1, \mu_1\rangle \in$ DistNP.

Levin provided the first DistNP-complete problem in 1986.

The proof we will present below is from Yuri Gurevich (1987): Complete and Incomplete Randomized NP Problems. FOCS.

Distributional Bounded Halting Problem
1. Let $U$ contain all tuples $\langle \alpha, x, 1^t\rangle$ such that the NDTM $N_\alpha$ accepts $x$ in $t$ steps.
2. Let $\mu_u$ be the distribution on tuples $\langle \alpha, x, 1^t\rangle$ of length $n$ such that $\alpha \in_R \{0,1\}^{\log(n)}$, $t \in_R \{0, \ldots, n - \log(n)\}$, and $x \in_R \{0,1\}^{n - \log(n) - t}$. This distribution is P-time computable.
3. $\langle U, \mu_u\rangle$ is the distributional version of Bounded Halting.
We could make $\mu_u$ uniform by replacing $1^t$ with a string of equal length and assigning each such string the same probability. But we would lose the domination property had we done that.

Peak Elimination
The obvious reduction fails the Domination property. We bypass the problem by using the following lemma.

Lemma. Let $\mu$ be a P-computable distribution function. There is a P-time computable function $g : \{0,1\}^* \to \{0,1\}^*$ such that
$g$ is one-one: $g(x) = g(x')$ iff $x = x'$.
For every $x \in \{0,1\}^*$, $|g(x)| \le |x| + 1$.
For every $y \in \{0,1\}^*$, $\mu'(\{x \mid y = g(x)\}) \le \frac{1}{2^{|y|-1}}$.

Proof
Given $x \in \{0,1\}^*$, let $h(x)$ be the longest common prefix of the binary representations of $\mu(x)$ and $\mu(x-1)$.
$h$ is P-time computable.
$|h(x)| \le k$ if $\mu'(x) = \mu(x) - \mu(x-1) \ge 2^{-k}$.
$h$ is one-one. Suppose $x < x'$ and $h(x) = h(x')$ with $|h(x)| = k$. The $(k+1)$-th bit of $\mu(x)$ must be 1. Since $\mu(x) \le \mu(x'-1) \le \mu(x')$ and all three values share the prefix $h(x)$, the $(k+1)$-th bit of both $\mu(x')$ and $\mu(x'-1)$ must be 1. Then $\mu(x')$ and $\mu(x'-1)$ share a prefix of length $k+1$, contradicting $|h(x')| = k$.

Proof
1. For every $x \in \{0,1\}^n$, define
$$g(x) = \begin{cases} 0x, & \text{if } \mu'(x) \le 2^{-n}, \\ 1h(x), & \text{otherwise.} \end{cases}$$
Clearly $g$ satisfies the first two conditions of the lemma.
2. We now show that $\mu'(\{x \mid y = g(x)\}) \le \frac{1}{2^{|y|-1}}$ for all $y \in \{0,1\}^*$.
If $y$ is not in the image of $g$, then $\mu'(\{x \mid y = g(x)\}) = 0$.
If $y = 0x$ and $\mu'(x) \le \frac{1}{2^{|x|}}$, then $\mu'(\{x \mid y = g(x)\}) \le \frac{1}{2^{|x|}} = \frac{1}{2^{|y|-1}}$.
If $y = 1h(x)$ and $\mu'(x) > \frac{1}{2^{|x|}}$, then $|h(x)| \le \log\left(\frac{1}{\mu'(x)}\right)$. It follows that $\mu'(\{x \mid y = g(x)\}) \le \frac{1}{2^{|y|-1}}$.
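The construction of $g$ can be tried out on a small example. The sketch below is a hedged illustration, not the P-time procedure of the lemma: it assumes a hypothetical density given as a Python dictionary whose keys are listed in the order $<$, whose cumulative values all stay below 1, and whose probabilities are exact multiples of $2^{-16}$, so that 16 fractional bits represent every value exactly. The names `frac_bits`, `common_prefix` and `peak_elimination` are made up for this example.

```python
from fractions import Fraction

WIDTH = 16  # fractional bits kept; enough for the toy values below


def frac_bits(v):
    """First WIDTH fractional bits of a value v with 0 <= v < 1."""
    return format(int(v * 2 ** WIDTH), f"0{WIDTH}b")


def common_prefix(a, b):
    """Longest common prefix of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def peak_elimination(density):
    """Return {x: g(x)} for a density whose keys are listed in the order <."""
    g, cumulative = {}, Fraction(0)
    for x, p in density.items():
        previous, cumulative = cumulative, cumulative + p
        if p <= Fraction(1, 2 ** len(x)):
            g[x] = "0" + x                    # ordinary instance: keep x
        else:                                 # peak: use the short prefix h(x)
            g[x] = "1" + common_prefix(frac_bits(previous), frac_bits(cumulative))
    return g


# Hypothetical density on the strings of length 1 and 2; "00" is a peak.
toy = {"0": Fraction(1, 16), "1": Fraction(1, 16), "00": Fraction(5, 16),
       "01": Fraction(1, 16), "10": Fraction(1, 16), "11": Fraction(1, 16)}
g = peak_elimination(toy)
print(g)
```

In this toy run the peak "00", whose probability 5/16 exceeds $2^{-2}$, is sent to the string "10", while every other string x is sent to 0x, so all three conditions of the lemma hold on the example.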

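Theorem. $\langle U, \mu_u\rangle$ is DistNP-complete.

Proof. Suppose $\langle L, \mu\rangle \in$ DistNP.
1. Let $N_{\alpha'}$ be a P-time NDTM that accepts $L$. Define $N_\alpha$ by: on input $y$, guess $x$ such that $y = g(x)$; then execute $N_{\alpha'}(x)$. Let $p$ be the polynomial running time of $N_\alpha$.
2. Reduction: $f(x) = \langle \alpha, y, 1^k\rangle$, where $|x| = n$ and $y = g(x)$ and $k = p(n) + \log(n) + n - |\alpha| - |y|$.
3. Correctness and Length Regularity conditions are satisfied. By definition
$$\mu'(\{x \mid \langle \alpha, y, 1^k\rangle = f(x)\}) \le \frac{1}{2^{|y|-1}}.$$
Let $m = |\langle \alpha, y, 1^k\rangle|$. The probability that $\langle \alpha, y, 1^k\rangle$ occurs is at least
$$2^{-\log m}\cdot\frac{1}{m}\cdot\frac{1}{2^{|y|}} = \frac{1}{m^2}\cdot\frac{1}{2^{|y|}}.$$
Hence $\mu'(\{x \mid \langle \alpha, y, 1^k\rangle = f(x)\}) \le 2m^2\,\mu_u'(\langle \alpha, y, 1^k\rangle)$, so the Domination condition is met with $q(m) = 2m^2$.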

It is remarkable that an NPC problem coupled with a simple distribution contains the projected image of everything in DistNP.

Levin's definition of P-computable distribution is crucial to the transformation that maps an instance with higher than average probability to a shorter instance for which the probability is fair.

SampNP-Completeness

SampNP-Completeness
We say that $\langle L, \mu\rangle$ is SampNP-complete if the following hold: $\langle L, \mu\rangle \in$ SampNP, and $\langle L_1, \mu_1\rangle \le_A \langle L, \mu\rangle$ for all $\langle L_1, \mu_1\rangle \in$ SampNP.

SampNP-Completeness
Theorem (Impagliazzo and Levin, 1990). If $\langle L, \mu\rangle$ is DistNP-complete, then it is also SampNP-complete.

Proof. See the paper by Impagliazzo and Levin: No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random. FOCS, 1990.

Levin got it right after all. By restricting to the P-computable distributions, we may overlook some easy problems, but we never turn any easy problems into hard ones.

Average Case Complexity vs. Worst Case Complexity
Investigations have shown that it is unlikely that the existence of an efficient average case algorithm implies the existence of an efficient worst case algorithm.

Application
In cryptography one seeks NP problems that are hard on average. This is a strong motivation for studying average case complexity.

Open Problem. Is factorization (discrete log) DistNP-hard?

Open Problem
1. The relationship between DistNP, SampNP, NP and AvgNP.
2. Natural DistNP-complete problems.