Lecture 2: Minimax theorem, Impagliazzo Hard Core Lemma

Topics in Pseudorandomness and Complexity Theory (Spring 2017)
Rutgers University
Lecturer: Swastik Kopparty
Scribe: Cole Franks

Zero-sum games are two-player games played on a matrix $M \in \mathrm{Mat}_{m \times n}(\mathbb{R})$. The row player, denoted R, chooses a row $i \in [m]$, and the column player C chooses a column $j \in [n]$, simultaneously. The payoff to the row player is $M_{ij}$ and the payoff to the column player is $-M_{ij}$ (hence the game is "zero-sum"). We can also consider randomized strategies, where R chooses a probability distribution $p$ on $[m]$, which can be written as a vector in $\mathbb{R}^m$, and C chooses a probability distribution $q$ on $[n]$, a vector in $\mathbb{R}^n$. The expected payoff to the row player, denoted $E(p, q)$, is thus equal to
\[ p^T M q = \sum_{i,j} M_{ij} p_i q_j. \]

Theorem 1 (Von Neumann Minimax Theorem). For a linear game there is a value $V$ such that:

1. there exists $p^*$ such that for all $q$, $E(p^*, q) \ge V$, and

2. there exists $q^*$ such that for all $p$, $E(p, q^*) \le V$.

In words: R can guarantee that the expected payoff to R is at least $V$, while C can guarantee that the expected payoff to R is at most $V$. This $V$ is called the value of the game.

Proof. We say that R can guarantee that the expected payoff to R is at least $\alpha$ if there exists a probability distribution $p \in \mathbb{R}^m$ such that for all probability distributions $q \in \mathbb{R}^n$, $E(p, q) \ge \alpha$.

Lemma 2. R can guarantee that the expected payoff to R is at least $\alpha$ if and only if there is a $p$ such that for all $j \in [n]$, $E(p, e_j) \ge \alpha$.

Proof. The "only if" direction is clear, because the deterministic strategies are a subset of the randomized strategies. To prove the "if" direction, suppose we have such a $p$. Note that for any $q$,
\[ E(p, q) = \sum_{i \in [m]} \sum_{j \in [n]} p_i M_{ij} q_j = \sum_{j \in [n]} q_j E(p, e_j) \ge \alpha, \]
because $q$ is a probability distribution on $[n]$. $\square$
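Lemma 2 also makes the minimax theorem concrete algorithmically: to compute the value, R only needs to guard against the $n$ pure column strategies, which is a linear program. The following is a minimal sketch of this (my own illustration, not from the lecture; it assumes numpy and scipy are available), run on matching pennies:

    # Numerically verify the Minimax Theorem on a small game by solving the
    # row player's linear program (guarding only against pure column
    # strategies, as Lemma 2 allows).  My own illustration; assumes numpy
    # and scipy.
    import numpy as np
    from scipy.optimize import linprog

    def game_value(M):
        """Return (p, V): maximize V s.t. (p^T M)_j >= V for all j,
        with p a probability distribution."""
        m, n = M.shape
        c = np.zeros(m + 1)
        c[-1] = -1.0                                 # linprog minimizes, so minimize -V
        A_ub = np.hstack([-M.T, np.ones((n, 1))])    # V - (p^T M)_j <= 0 for each j
        b_ub = np.zeros(n)
        A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum_i p_i = 1
        b_eq = np.ones(1)
        bounds = [(0, None)] * m + [(None, None)]    # p >= 0, V free
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        return res.x[:m], res.x[-1]

    M = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies
    p, V = game_value(M)                        # row player's guarantee
    q, W = game_value(-M.T)                     # column player's side, by symmetry
    print(p, V)                                 # ~[0.5 0.5], 0.0
    print(q, -W)                                # ~[0.5 0.5], 0.0: same value V

Solving from both sides returns the same value, as Theorem 1 predicts; for matching pennies it is $0$, with optimal strategies $(1/2, 1/2)$ for both players.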

Lemma 3. Given closed, disjoint convex sets $X, Y$ in $\mathbb{R}^n$, there is a hyperplane that separates $X$ and $Y$; that is, a unit vector $u$ and a number $t$ such that either for all $x \in X$ and $y \in Y$, $\langle u, x \rangle \le t$ and $\langle u, y \rangle > t$, or for all $x \in X$ and $y \in Y$, $\langle u, x \rangle < t$ and $\langle u, y \rangle \ge t$.

Proof sketch. Suppose first that $X$ and $Y$ are compact. Consider $\inf_{x \in X, y \in Y} d(x, y)$; we know this distance is achieved by some $x_0$ and $y_0$ because $d$ is a continuous function on $X \times Y$, which is also compact. One can check that the hyperplane bisecting the segment $x_0 y_0$ separates $X$ and $Y$.

Suppose $X$ and $Y$ are only closed. Then $X_N = X \cap \{x : \|x\| \le N\}$ and $Y_N = Y \cap \{x : \|x\| \le N\}$ are compact sets for each natural number $N$. There is a separating hyperplane $H_N$ for each $X_N$ and $Y_N$; $H_N$ can be described by a unit vector $u_N$ and a threshold $t_N$ such that $H_N = \{x : \langle x, u_N \rangle = t_N\}$. Suppose we always pick $u_N$ so that $X_N$ is on the negative side of $H_N$. Now $|t_N|$ is the distance of $H_N$ from the origin, which must be bounded above: some points $x \in X$ and $y \in Y$ are at a finite distance from the origin and are contained in $X_N$ and $Y_N$ for $N$ sufficiently large, and such points cannot be separated by a hyperplane arbitrarily far from the origin. Hence $\{(u_N, t_N)\}_{N \in \mathbb{N}}$ resides in a compact subset of $\mathbb{R}^{n+1}$, so it has a convergent subsequence; let $(u, t)$ be its limit. For each $x \in X$ and $N$ sufficiently large, $\langle x, u_N \rangle < t_N$, so $\langle x, u \rangle = \lim_N \langle x, u_N \rangle \le t$. Similarly, for $y \in Y$ we have $\langle u, y \rangle \ge t$.

Let $H$ be the hyperplane given by $(u, t)$. The intersections $X \cap H$ and $Y \cap H$ are closed convex sets. If either is empty, we are done. If neither is empty, then by induction (on dimension) there is a hyperplane $G \subseteq H$ that separates $X \cap H$ and $Y \cap H$ as in the theorem. Without loss of generality suppose $G$ does not intersect $X \cap H$. By translating, assume $G$ (and so $H$) passes through the origin, and let $v$ be the normal of $G$ inside $H$, so that $\langle x, v \rangle < 0$ for $x \in X \cap H$ and $\langle y, v \rangle \ge 0$ for $y \in Y \cap H$. Let $u' = u + \epsilon v$ for suitably small $\epsilon > 0$. Now $\langle x, u' \rangle = \langle x, u \rangle + \epsilon \langle x, v \rangle < 0$ for $x \in X$, and similarly $\langle y, u' \rangle \ge 0$ for $y \in Y$. Apply the inverse of the translation to obtain a new hyperplane $(u', t')$ that separates $X$ and $Y$ in the way required by the theorem. $\square$

Returning to the proof of Theorem 1: take some value $\alpha$ such that R cannot guarantee expected payoff at least $\alpha$. Then for all $p$ there exists $j$ such that
\[ \sum_i p_i M_{ij} < \alpha. \]
Let $X = \{u \in \mathbb{R}^n : u_j \ge \alpha \text{ for all } j \in [n]\}$ and $Y = \{p^T M : p \text{ is a probability distribution on } [m]\}$. R cannot guarantee $\alpha$ if and only if $X$ and $Y$ are disjoint. One needs to check that these sets are convex and closed. We claim that if R cannot guarantee $\alpha$, then there is a hyperplane separating $X$ and $Y$; that is, there are $\ell \in \mathbb{R}^n$ and $b \in \mathbb{R}$ such that for all $x \in X$, $\langle \ell, x \rangle > b$, and for all $y \in Y$, $\langle \ell, y \rangle < b$. Observe the following:

1. Every entry of $\ell$ is nonnegative. Otherwise you could make the corresponding entry of $x \in X$ arbitrarily large while leaving all others equal to $\alpha$; this would violate $\langle \ell, x \rangle > b$.

2. Replace $\ell$ from the separating hyperplane by $q = \ell / \sum_i \ell_i$, so that $q$ is a probability distribution, and replace $b$ by $b' = b / \sum_i \ell_i$. So now for all $x \in X$, $\langle q, x \rangle > b'$, and for all $y \in Y$, $\langle q, y \rangle < b'$.

3. $b' < \alpha$, which one can see by applying $\langle q, x \rangle > b'$ to $x = (\alpha, \dots, \alpha) \in X$.

Items 2 and 3 show that for every probability distribution $p$ on $[m]$, $\langle q, p^T M \rangle < b' < \alpha$, i.e. $E(p, q) < \alpha$. In other words, if C plays $q$, she guarantees that the expected payoff to R is less than $\alpha$.

We know that the values R can guarantee comprise a left half-interval, and the values C can guarantee comprise a right half-interval. What we've shown is that any $\alpha$ outside R's left half-interval is in the interior of C's right half-interval. Because our proof was symmetric in R and C, any $\alpha$ outside C's right half-interval is in the interior of R's left half-interval. Together these imply that the intervals are closed and intersect in exactly one point, which is the value $V$. $\square$
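As an aside, the compact case of Lemma 3 is effectively computable: alternating projections between two disjoint compact convex sets converge to a closest pair $(x_0, y_0)$, and the bisecting hyperplane then separates. A small sketch of this (mine, not from the lecture; it assumes numpy and uses two hard-coded disks as the convex sets):

    # Alternating projections find the closest pair between two disjoint
    # compact convex sets; the bisecting hyperplane separates them.  My own
    # illustration of the compact case of Lemma 3.
    import numpy as np

    def proj_disk(p, center, r):
        """Euclidean projection of p onto the closed disk (center, r)."""
        d = p - center
        dist = np.linalg.norm(d)
        return p if dist <= r else center + r * d / dist

    cX, cY = np.array([0.0, 0.0]), np.array([4.0, 1.0])   # centers, radius 1 each

    x = cX.copy()
    for _ in range(100):                 # project back and forth until convergence
        y = proj_disk(x, cY, 1.0)
        x = proj_disk(y, cX, 1.0)

    u = (y - x) / np.linalg.norm(y - x)  # unit normal of the bisecting hyperplane
    t = u @ (x + y) / 2                  # threshold: hyperplane through the midpoint
    print(u, t)   # <u, x'> <= t on the X disk and <u, y'> >= t on the Y disk

Here $u$ is the unit normal and $t$ the threshold of the separating hyperplane through the midpoint of the segment $x_0 y_0$.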

Theorem 4 (Impagliazzo Hard Core Lemma). Let $\epsilon, \delta > 0$ and let $s$ satisfy $\Omega\!\left(\frac{n}{\epsilon^2} \log \frac{n}{\epsilon^2}\right) \le s \le 2^{.9n}$. Suppose $f : \{0,1\}^n \to \{0,1\}$ is such that for all circuits $C$ of size at most $s$,
\[ \Pr_{x \in \{0,1\}^n}[C(x) = f(x)] < 1 - \delta. \]
Then there exists a hard core $H \subseteq \{0,1\}^n$ with $|H| \ge (2\delta)2^n - O(\sqrt{\delta 2^n})$ such that for all circuits $C$ of size at most $\epsilon^2 \delta s / 5n$,
\[ \Pr_{x \in H}[C(x) = f(x)] < 1/2 + \epsilon. \]

Proof. Consider a linear game played by two players, S (sets, row) and C (circuit, column). S chooses from subsets of $\{0,1\}^n$ of size exactly $(2\delta)2^n$, which we will call Large sets, and C chooses from circuits of size at most $s'$, which we will call Small circuits; the parameter $s'$ will be fixed at the end of Case 1. The matrix $M$ of payoffs to the circuit player has entries
\[ M_{S,C} = \Pr_{x \in S}[C(x) = f(x)]. \]
By the Von Neumann Minimax Theorem and Lemma 2, we are in one of two cases. Either

1. there is a distribution $\mu_S$ on Large sets such that for all Small circuits $C$, $\mathbb{E}_{S \sim \mu_S}[M_{S,C}] \le 1/2 + \epsilon$, or

2. there is a distribution $\mu_C$ on Small circuits such that for all Large sets $S$, $\mathbb{E}_{C \sim \mu_C}[M_{S,C}] \ge 1/2 + \epsilon$.

0.1 Case 1:

We want to obtain from the distribution $\mu_S$ a single Large set on which no Small circuit agrees well with $f$. Define
\[ \nu(x) = \Pr_{S \sim \mu_S}[x \in S]. \]
Then $\sum_{x \in \{0,1\}^n} \nu(x) = (2\delta)2^n$, the expected size of $S$, by linearity of expectation. To choose $H$, for each $x \in \{0,1\}^n$, add $x$ to $H$ independently with probability $\nu(x)$. Then:

1. $\Pr[|H| \le (2\delta)2^n(1 - \eta)] \le e^{-\eta^2 (2\delta)2^n/3}$. (The sum of independent Bernoullis is controlled by the multiplicative Chernoff bound; see Alon and Spencer, Corollary A.1.14, or Wikipedia.) We will end up choosing $\eta$ so that this failure probability is small.

2. With high probability over the choice of $H$, for every Small circuit $C$,
\[ \Pr_{x \in H}[f(x) = C(x)] \le 1/2 + O(\epsilon). \]
(Rescaling $\epsilon$ by a constant factor at the outset yields the $1/2 + \epsilon$ claimed in the theorem.)
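To see the sampling step in action, here is a toy simulation (mine, not from the lecture; the marginal profile $\nu$ is made up and the domain is shrunk) of drawing $H$ from the marginals $\nu(x)$:

    # Toy simulation of the Case 1 sampling: include each x in H
    # independently with probability nu(x); |H| concentrates around
    # sum_x nu(x) = (2 delta) 2^n.  My own illustration; nu is a made-up
    # marginal profile and the domain is shrunk to 2^12 points.
    import numpy as np

    rng = np.random.default_rng(0)
    n, delta = 12, 0.1
    N = 2 ** n                          # identify {0,1}^n with [N]
    nu = np.full(N, 2 * delta)          # marginals summing to (2 delta) N

    sizes = [(rng.random(N) < nu).sum() for _ in range(1000)]
    print(np.mean(sizes), 2 * delta * N)   # mean |H| is ~ (2 delta) 2^n
    print(min(sizes), max(sizes))          # deviations are small, per Chernoff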

To prove item 2, fix a circuit $C$. Let $Y$ be the random variable counting the number of agreements between $f$ and $C$ on $H$. Then
\[ \mathbb{E}_H[Y] = \sum_{x \in \{0,1\}^n} \Pr[x \in H] \cdot 1_{f(x)=C(x)} = \sum_{x \in \{0,1\}^n} \nu(x)\, 1_{f(x)=C(x)} \tag{1} \]
again by linearity of expectation. Since $M_{S,C} = \frac{1}{(2\delta)2^n} \sum_{x \in S} 1_{f(x)=C(x)}$, we have
\begin{align*}
\mathbb{E}_{S \sim \mu_S}[M_{S,C}] &= \sum_S \mu_S(S) \cdot \frac{1}{(2\delta)2^n} \sum_{x \in S} 1_{f(x)=C(x)} \tag{2} \\
&= \frac{1}{(2\delta)2^n} \sum_{x \in \{0,1\}^n} 1_{f(x)=C(x)} \sum_{S \ni x} \mu_S(S) \tag{3} \\
&= \frac{1}{(2\delta)2^n} \sum_{x \in \{0,1\}^n} 1_{f(x)=C(x)} \Pr_{S \sim \mu_S}[x \in S] \tag{4} \\
&= \frac{1}{(2\delta)2^n} \sum_{x \in \{0,1\}^n} \nu(x)\, 1_{f(x)=C(x)}. \tag{5}
\end{align*}
Equations (1) and (5), combined with the fact that we are in Case 1, imply
\[ \mathbb{E}[Y] = \mathbb{E}_{S \sim \mu_S}[M_{S,C}] \cdot (2\delta)2^n \le (1/2 + \epsilon)(2\delta)2^n. \]
Note that $Y$ is a sum of independent indicator variables, so by the multiplicative Chernoff bound and $\epsilon < 1/2$,
\[ \Pr[Y > (1/2 + 2\epsilon)(2\delta)2^n] = \Pr[Y > (1/2+\epsilon)(2\delta)2^n + \epsilon(2\delta)2^n] \le e^{-\frac{1}{3}\left(\frac{\epsilon}{.5+\epsilon}\right)^2 (2\delta)2^n} \le e^{-\frac{1}{3}\epsilon^2 (2\delta)2^n}. \]
This was for a fixed $C$. By the union bound, with probability at least
\begin{align*}
& 1 - (\text{number of Small circuits}) \cdot e^{-\frac{1}{3}\epsilon^2 (2\delta)2^n} - e^{-\eta^2(2\delta)2^n/3} \tag{6} \\
&\quad = 1 - (s')^{O(s')} e^{-\frac{1}{3}\epsilon^2 (2\delta)2^n} - e^{-\eta^2(2\delta)2^n/3} \tag{7}
\end{align*}
we have that for all Small circuits $C$, the number of agreements between $C$ and $f$ on $H$ is at most $(1/2+2\epsilon)(2\delta)2^n$, and $|H| \ge (1-\eta)(2\delta)2^n$. Provided (7) is positive, such an $H$ exists, so we can choose $s' = \epsilon^2 \delta s/5n \le 2^{.9n}\epsilon^2\delta$ and $\eta = 5(\delta 2^n)^{-1/2}$, provided $n$ is large enough. With these choices $|H| \ge (2\delta)2^n - O(\sqrt{\delta 2^n})$, and $H$ is the desired hard core.
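A numeric sanity check of the Chernoff step (my own, not from the lecture): take $Y \sim \mathrm{Bin}(m, 1/2 + \epsilon)$, the extreme case permitted by $\mathbb{E}[Y] \le (1/2+\epsilon)m$, with $m$ standing in for $(2\delta)2^n$, and compare the empirical tail beyond $(1/2 + 2\epsilon)m$ with the stated bound $e^{-\epsilon^2 m/3}$:

    # Chernoff sanity check, my own (not from the lecture).  Y ~ Bin(m, 1/2+eps)
    # is the extreme case allowed by E[Y] <= (1/2+eps) m; m stands in for
    # (2 delta) 2^n.  The empirical tail past (1/2 + 2 eps) m sits far below
    # the bound exp(-eps^2 m / 3).
    import numpy as np

    rng = np.random.default_rng(1)
    eps, m, trials = 0.1, 2000, 200_000
    Y = rng.binomial(m, 0.5 + eps, size=trials)
    print(np.mean(Y > (0.5 + 2 * eps) * m))   # empirical tail, essentially 0
    print(np.exp(-eps ** 2 * m / 3))          # bound, ~1.3e-3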

0.2 Case 2:

Now suppose we are in Case 2. We want a small circuit that agrees with $f$ on a $(1-\delta)$ fraction of inputs. First sample $C_1, \dots, C_t$ independently from $\mu_C$. Let $\eta(x) = \Pr_{C \sim \mu_C}[C(x) = f(x)]$. Recall that $M_{S,C} = \Pr_{x \in S}[C(x) = f(x)] = \mathbb{E}_{x \in S}[1_{C(x)=f(x)}]$. So for all Large sets $S$,
\[ \mathbb{E}_{C \sim \mu_C}[M_{S,C}] = \mathbb{E}_{C \sim \mu_C}[\mathbb{E}_{x \in S}[1_{C(x)=f(x)}]] = \mathbb{E}_{x \in S}[\mathbb{E}_{C \sim \mu_C}[1_{C(x)=f(x)}]] = \mathbb{E}_{x \in S}[\eta(x)] \ge 1/2 + \epsilon. \tag{8} \]
Let $\eta'(x) = \frac{1}{t}\sum_{i=1}^t 1_{C_i(x) = f(x)}$.

Claim 5. With probability at least $1 - 2^n e^{-.5\epsilon^2 t}$ over the choice of $C_1, \dots, C_t$: for all $x$, $|\eta(x) - \eta'(x)| < \epsilon/2$.

Proof. Use Chernoff to bound the probability of the condition failing at a specific $x$: by the additive form of the Chernoff bound, $\Pr[|\eta'(x) - \eta(x)| \ge \epsilon/2] \le e^{-.5\epsilon^2 t}$. Then union bound over $x \in \{0,1\}^n$. $\square$

If $t \ge 3n/\epsilon^2$, there are $C_1, \dots, C_t$ such that for all $x$, $|\eta(x) - \eta'(x)| < \epsilon/2$. By (8), $\mathbb{E}_{x \in S}[\eta'(x)] \ge 1/2 + \epsilon/2$ for all $S$ of size $(2\delta)2^n$.

A first attempt: define $C^*(x) = \mathrm{Maj}(C_1(x), \dots, C_t(x))$. Let $S$ be the Large set on which $\mathbb{E}_{x \in S}[\eta'(x)]$ is smallest; in other words, $S$ consists of the first $(2\delta)2^n$ strings $x$ in increasing order of $\eta'(x)$. Note that for $x \notin S$, $\eta'(x) \ge 1/2 + \epsilon/2$. This implies $C^*$ is correct outside $S$, since $C^*$ is the majority of the $C_i$ and more than half the $C_i$ are correct. So $\Pr_{x \in \{0,1\}^n}[C^*(x) = f(x)] \ge 1 - 2\delta$, but we need $1 - \delta$, and it seems this approach won't tell us much about what happens inside $S$. To this end, we modify $C^*$.

The real attempt: let $S$ be defined as in the first attempt, and let $\beta = \max\{\eta'(x) : x \in S\} - 1/2$. Define
\[ C^*(x) = \begin{cases} 0 & \text{if } \frac{1}{t}\sum_i C_i(x) \le \frac{1}{2} - \beta, \\ 1 & \text{if } \frac{1}{t}\sum_i C_i(x) > \frac{1}{2} + \beta, \\ Y & \text{otherwise,} \end{cases} \]
where $Y$ is chosen independently in $\{0,1\}$ with mean
\[ \frac{1}{2} + \frac{\frac{1}{t}\sum_i C_i(x) - \frac{1}{2}}{2\beta}. \]
For $x \notin S$, $\eta'(x) \ge 1/2 + \beta > 1/2$ by the definition of $\beta$, so $C^*$ is correct on $x$. Within $S$, we need to show that the probability over the choice of $x$ and $Y$ that $C^*(x)$ is correct is at least $1/2$. If $f(x) = 1$, then $\frac{1}{t}\sum_i C_i(x) = \eta'(x)$, and so
\[ \Pr_{x \in S, Y}[C^*(x) = f(x)] = \mathbb{E}_{x \in S}[\mathbb{E}_Y[C^*(x)]] \ge \mathbb{E}_{x \in S}\!\left[\frac{1}{2} + \frac{\frac{1}{t}\sum_i C_i(x) - \frac{1}{2}}{2\beta}\right] = \mathbb{E}_{x \in S}\!\left[\frac{1}{2} + \frac{\eta'(x) - \frac{1}{2}}{2\beta}\right] \ge \frac{1}{2} + \frac{\epsilon}{4\beta} > \frac{1}{2}; \]
the first inequality holds even when $\frac{1}{t}\sum_i C_i(x) \le 1/2 - \beta$, because then $C^*(x)$ is zero, which is at least the right-hand side, a nonpositive number. If $f(x) = 0$, then $\frac{1}{t}\sum_i C_i(x) = 1 - \eta'(x)$, and so
\[ \Pr_{x \in S, Y}[C^*(x) = f(x)] = \mathbb{E}_{x \in S}[\mathbb{E}_Y[1 - C^*(x)]] \ge \mathbb{E}_{x \in S}\!\left[\frac{1}{2} - \frac{\frac{1}{t}\sum_i C_i(x) - \frac{1}{2}}{2\beta}\right] = \mathbb{E}_{x \in S}\!\left[\frac{1}{2} + \frac{\eta'(x) - \frac{1}{2}}{2\beta}\right] \ge \frac{1}{2} + \frac{\epsilon}{4\beta} > \frac{1}{2}. \tag{9} \]
Combining the two regions: $C^*$ is correct with probability 1 on the at least $1 - 2\delta$ fraction of inputs outside $S$, and with probability at least $1/2$ on $S$, so $\Pr_{x, Y}[C^*(x) = f(x)] \ge (1 - 2\delta) + 2\delta \cdot \frac{1}{2} = 1 - \delta$.
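The modified predictor is easy to express in code. A sketch (mine, not code from the lecture) of $C^*$ on a single input, given the $t$ votes and $\beta$:

    # Sketch of the modified predictor C* on one input, my own rendering of
    # the definition above.  votes holds the t bits C_1(x), ..., C_t(x).
    import numpy as np

    rng = np.random.default_rng(2)

    def c_star(votes, beta):
        """Supermajority vote outside the band, randomized rounding inside."""
        avg = votes.mean()
        if avg <= 0.5 - beta:
            return 0
        if avg > 0.5 + beta:
            return 1
        # inside the band: output 1 with probability 1/2 + (avg - 1/2)/(2 beta)
        return int(rng.random() < 0.5 + (avg - 0.5) / (2 * beta))

    votes = rng.integers(0, 2, size=300)   # stand-in for the t circuit outputs on x
    print(c_star(votes, beta=0.1))

The two deterministic branches implement the supermajority vote; the random branch outputs 1 with exactly the mean specified above.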

It remains to show that $C^*$ can be computed by a small circuit. This stands to reason: in addition to the $t$ copies of the $C_i$, we just need a gadget that outputs 1 if the sum of the $C_i$ is large enough, 0 if it is small enough, and a random bit with the right probability in between. Intuitively this should take something like $O(t + t s')$ gates.

First, let's show $C^*$ can be computed by a small randomized circuit $C(x, r)$ depending on some random bits $r$. As $2\beta$ is $q/t$ for some integer $q \in [t]$, look at
\[ \frac{1}{2} + \frac{\frac{1}{t}\sum_i C_i(x) - \frac{1}{2}}{2\beta} \tag{10} \]
\[ = \frac{1}{2q}\left(q + 2\sum_i C_i(x) - t\right). \tag{11} \]
This means that once we can compute $q + 2\sum_i C_i(x) - t$ from the $C_i$, we can output 0 if it is $\le 0$ and output 1 if it is $> 2q$; if it is in $[2q]$, we output 1 or 0 according to whether it is $\ge$ or $<$ a random number drawn uniformly between 1 and $2q$, generated from at most $O(\log t)$ random bits. Since $q + 2\sum_i C_i(x) - t$ is at most $2t$ in absolute value, we need at most $O(\log t)$ bits to represent it (including a sign bit) and $O(t)$ gates to compute each bit. To compare the two numbers we only need $O(\log^2 t)$ gates. In any case, since $t = 3n/\epsilon^2$, we need at most
\[ 3ns'/\epsilon^2 + O\!\left(\frac{n}{\epsilon^2}\log\frac{n}{\epsilon^2}\right) \le s \]
gates. (With $s' = \epsilon^2 \delta s/5n$, the first term is $3\delta s/5$, and the hypothesis $s = \Omega((n/\epsilon^2)\log(n/\epsilon^2))$ covers the second.)

If $C(x, r)$ is a randomized circuit (where $x$ is the input and $r$ is the random bits) such that $\mathbb{E}_x[\mathbb{E}_r[1_{f(x)=C(x,r)}]] \ge 1 - \delta$, then there exists an $r_0$ such that $\mathbb{E}_x[1_{f(x)=C(x,r_0)}] \ge 1 - \delta$. Fix that $r_0$ and hardwire it into $C$. This gives a circuit of size at most $s$ that computes $f$ correctly with probability at least $1 - \delta$, contradicting the hardness of $f$. So Case 2 cannot occur, Case 1 must hold, and the hard core $H$ exists. $\square$
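The last step is the standard averaging argument, which is also easy to phrase computationally. A toy sketch (mine, not from the lecture; the accuracy table is fabricated) of fixing the best seed $r_0$:

    # The averaging argument, computationally: if the randomized circuit is
    # correct w.p. >= 1 - delta over (x, r), the best fixed seed r0 is correct
    # w.p. >= 1 - delta over x alone.  My own toy sketch; acc is fabricated,
    # with acc[r, x] = 1 iff C(x, r) = f(x).
    import numpy as np

    rng = np.random.default_rng(3)
    acc = rng.random((8, 64)) < 0.9      # 8 seeds, 64 inputs, ~90% accuracy

    per_seed = acc.mean(axis=1)          # E_x[ 1{C(x,r) = f(x)} ] for each r
    r0 = int(per_seed.argmax())
    print(r0, per_seed[r0], acc.mean())  # the best seed is at least the average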