Randomness-efficient Curve Sampling


Randomness-efficient Curve Sampling

Thesis by Zeyu Guo

In Partial Fulfillment of the Requirements for the Master's Degree in Computer Science

California Institute of Technology
Pasadena, California

2014
(Submitted February 5, 2014)

© 2014 Zeyu Guo
All Rights Reserved

Abstract

Curve samplers are sampling algorithms that proceed by viewing the domain as a vector space over a finite field, and randomly picking a low-degree curve in it as the sample. Curve samplers exhibit a nice property besides the sampling property: the restriction of low-degree polynomials over the domain to the sampled curve is still low-degree. This property is often used in combination with the sampling property and has found many applications, including PCP constructions, local decoding of codes, and algebraic PRG constructions.

The randomness complexity of curve samplers is a crucial parameter for their applications. It is known that (non-explicit) curve samplers using O(log N + log(1/δ)) random bits exist, where N is the domain size and δ is the confidence error. The question of explicitly constructing randomness-efficient curve samplers was first raised in [TSU06], where curve samplers with near-optimal randomness complexity were obtained. In this thesis, we present an explicit construction of low-degree curve samplers with optimal randomness complexity (up to a constant factor) that sample curves of degree (m log_q(1/δ))^{O(1)} in F_q^m. Our construction is a delicate combination of several components, including extractor machinery, limited independence, iterated sampling, and list-recoverable codes.

Contents

Abstract

1 Introduction

2 Preliminaries

3 Basic results
  3.1 Extractor vs. sampler connection
  3.2 Existence of a good curve sampler
  3.3 Lower bounds

4 Explicit constructions
  4.1 Outer sampler
    4.1.1 Block source conversion
    4.1.2 Block source extraction
  4.2 Inner sampler
    4.2.1 Error reduction
    4.2.2 Iterated sampling
    4.2.3 Recursive inner sampler
  4.3 Putting it together
  4.4 An alternative outer sampler

Bibliography

Chapter 1
Introduction

Overview

Randomness has numerous uses in computer science, and sampling is one of its most classical applications: Suppose we are interested in the size of a particular subset A lying in a large domain D. Instead of counting the size of A directly by enumeration, one can randomly draw a small sample from D and calculate the density of A in the sample. The approximated density is guaranteed to be close to the true density (measured by a parameter ε, the accuracy error) with probability 1 − δ, where δ is very small and known as the confidence error. This sampling technique is extremely useful both in practice and in theory.

One class of sampling algorithms, known as curve samplers, proceeds by viewing the domain as a vector space over a finite field, and picking a random low-degree curve in it. Curve samplers exhibit the following nice property besides the sampling property: the restriction of low-degree polynomials over the domain to the sampled curve is still low-degree. This special property, combined with the sampling property, turns out to be useful in many settings, e.g., local decoding of Reed-Muller codes and hardness amplification [STV01], PCP constructions [AS98, ALM+98, MR08], algebraic constructions of pseudorandom generators [SU05, Uma03], extractor constructions [SU05, TSU06], and some pure complexity results (e.g. [SU06]).

The problem of explicitly constructing low-degree curve samplers was raised in [TSU06]. Typically, we are looking for low-degree curve samplers with small sample complexity (polylogarithmic in the domain size) and small confidence error (polynomially small in the domain size), and we focus on minimizing the randomness complexity. The simplest way is picking a completely random low-degree curve, whose sampling properties are guaranteed by tail bounds for limited independence. The randomness complexity of this method, however, is far from optimal. The probabilistic method guarantees the existence of (non-explicit) low-degree curve samplers using O(log N + log(1/δ)) random bits, where N is the domain size and δ is

the confidence error. The real difficulty, however, is to find an explicit construction matching this bound.

Previous work

Randomness-efficient samplers (without the requirement that the sample points form a curve) are constructed in [CG89, Gil98, BR94, Zuc97]. In particular, [Zuc97] obtains explicit samplers with optimal randomness complexity (up to a 1 + γ factor for arbitrarily small γ > 0) using the connection between samplers and extractors. See [Gol11] for a survey of samplers.

Degree-1 curve samplers are also called line samplers. Explicit randomness-efficient line samplers are constructed in the PCP literature [BSSVW03, MR08], motivated by the goal of constructing almost linear sized PCPs. In [BSSVW03] line samplers are derandomized by picking a random point and a direction sampled from an ε-biased set, instead of two random points. An alternative way is suggested in [MR08], where directions are picked from a subfield. It is not clear, however, how to apply these techniques to higher degree curves. In [TSU06] it was shown how to explicitly construct derandomized curve samplers with near-optimal parameters by employing an iterated sampling technique. Formally, they obtained curve samplers picking curves of degree (log log N + log(1/δ))^{O(log log N)} using randomness O(log N + log(1/δ) log log N), and curve samplers picking curves of degree (log(1/δ))^{O(1)} using randomness O(log N + log(1/δ)(log log N)^{1+γ}) for any constant γ > 0, for domain size N, field size (log N)^{Θ(1)} and confidence error δ = N^{−Θ(1)}. Their work left the problem of explicitly constructing low-degree curve samplers (ideally picking curves of degree O(log_q(1/δ))) with essentially optimal O(log N + log(1/δ)) random bits as a prominent open problem.

Main results

It is known that curve samplers in F_q^m must have sample complexity Ω(log(ε/δ)/ε²) and randomness complexity (m − 1) log q + log(1/ε) + log(1/δ) − O(1) [RTS00]. It is also not hard to show that the degree of the sampled curves has to be Ω(log_q(1/δ)) (cf. Theorem 3.3.3). We construct explicit curve samplers with parameters that match or are close to these lower bounds. In particular, we show how to sample degree-(m log_q(1/δ))^{O(1)} curves in F_q^m using O(log N + log(1/δ)) random bits for domain size N = |F_q^m| and confidence error δ = N^{−Θ(1)}.

Before stating our main theorem, we first present the formal definition of samplers and curve samplers.

Samplers. Given a finite set M as the domain, the density of a subset A ⊆ M is µ(A) := |A|/|M|. For a collection of elements T = {t_i ∈ M : i ∈ I} ∈ M^I indexed by a set I, the density of A in T is µ_T(A) := |A ∩ T|/|T| = Pr_{i∈I}[t_i ∈ A].

Definition (sampler). A sampler is a function S : N × D → M, where |D| is its sample complexity and M is its domain. We say S samples A ⊆ M with accuracy error ε and confidence error δ if Pr_{x∈N}[|µ_{S(x)}(A) − µ(A)| > ε] ≤ δ, where S(x) := {S(x, y) : y ∈ D}. We say S is an (ε, δ) sampler if it samples all subsets A ⊆ M with accuracy error ε and confidence error δ. The randomness complexity of S is log(|N|).

Lines, curves, and manifolds. To define curve samplers, we first define curves, lines, and more generally manifolds. Let f : F_q^d → F_q^D be a map. We may view f as D individual functions f_i : F_q^d → F_q describing its operation on each output coordinate, i.e., f(x) = (f_1(x), ..., f_D(x)) for all x ∈ F_q^d.

Definition (manifold). A manifold in F_q^D is a function C : F_q^d → F_q^D where C_1, ..., C_D are d-variate polynomials over F_q. We call d the dimension of C. We say a manifold C has degree t if each polynomial C_i has degree at most t. A 1-dimensional manifold is also called a curve. A curve of degree 1 is also called a line.

Now we are ready to define curve samplers, the central objects studied in this thesis.

Definition (curve/line sampler). Let M = F_q^m and D = F_q. The sampler S : N × D → M is a degree-t curve sampler if for all x ∈ N, the function S(x, ·) : D → M is a curve of degree at most t over F_q. When t = 1, S is also called a line sampler.

The main result of this thesis is as follows.

Theorem (main). For any ε, δ > 0, integer m ≥ 1, and sufficiently large prime power q ≥ (m log(1/δ)/ε)^{Θ(1)}, there exists an explicit degree-t curve sampler for the domain F_q^m with t = (m log_q(1/δ))^{O(1)}, accuracy error ε, confidence error δ, sample complexity q, and randomness complexity O(m log q + log(1/δ)) = O(log N + log(1/δ)), where N = q^m is the domain size. Moreover, the curve sampler itself has degree (m log_q(1/δ))^{O(1)} as a polynomial map.
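A small illustration may help fix ideas. The following Python sketch is not from the thesis; the helper names are ours, and we work over a prime field F_p rather than a general prime power q. It draws a uniformly random degree-t curve in F_p^m and compares the density of a test set A along the curve with its true density, i.e., the quantities µ_{S(x)}(A) and µ(A) in the definitions above:

    # Illustration of the sampler definition: a fully random degree-t curve.
    # Assumption: p prime (the thesis allows any prime power q).
    import random

    p, m, t = 101, 2, 3   # field size, dimension, curve degree

    def random_curve(p, m, t):
        # C : F_p -> F_p^m, one random degree-t polynomial per coordinate.
        coeffs = [[random.randrange(p) for _ in range(t + 1)] for _ in range(m)]
        return lambda y: tuple(sum(c * pow(y, i, p) for i, c in enumerate(row)) % p
                               for row in coeffs)

    in_A = lambda pt: all(c < p // 2 for c in pt)   # a test subset A of F_p^m
    true_density = (p // 2) ** m / p ** m           # mu(A)

    C = random_curve(p, m, t)
    sample = [C(y) for y in range(p)]               # S(x) = {S(x, y) : y in F_p}
    sample_density = sum(map(in_A, sample)) / p     # mu_{S(x)}(A)
    print(abs(sample_density - true_density))       # <= eps except w.p. delta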

The main theorem has a better degree bound and randomness complexity compared with the constructions in [TSU06]. The degree bound, being (m log_q(1/δ))^{O(1)}, is still sub-optimal compared with the lower bound Ω(log_q(1/δ)). However, we remark that in typical settings it is satisfying to achieve such a degree bound. As an example, consider the following setting of parameters: domain size N = q^m, field size q = (log N)^{Θ(1)}, confidence error δ = N^{−Θ(1)}, and accuracy error ε = (log N)^{−Θ(1)}. Note that this is the typical setting in PCP and other literature [ALM+98, AS98, STV01, SU05]. In this setting, we have the following corollary in which the randomness complexity is logarithmic and the degree is polylogarithmic.

Corollary. Given domain size N = |F_q^m|, accuracy error ε = (log N)^{−Θ(1)}, confidence error δ = N^{−Θ(1)}, and large enough field size q = (log N)^{Θ(1)}, there exists an explicit degree-t curve sampler for the domain F_q^m with accuracy error ε, confidence error δ, randomness complexity O(log N), sample complexity q, and t ≤ (log N)^c for some constant c > 0 independent of the field size.

It remains an open problem to explicitly construct curve samplers that have optimal randomness complexity O(log N + log(1/δ)) (up to a constant factor) and sample curves with the optimal degree bound O(log_q(1/δ)). It is also an interesting problem to achieve the optimal randomness complexity up to a 1 + γ factor for any constant γ > 0 (rather than just an O(1) factor), as achieved by [Zuc97] for general samplers. The standard techniques in [Zuc97] are not directly applicable as they increase the dimension of samples and only yield O(1)-dimensional manifold samplers.

Techniques

Extractor machinery. It was shown in [Zuc97] that samplers are equivalent to extractors, objects that convert weakly random distributions into almost uniform distributions. Therefore the techniques for constructing extractors are extremely useful in constructing curve samplers. Our construction employs the technique of block source extraction [NZ96, Zuc97, SZ99]. In addition, we also use the techniques that appeared in [GUV09], especially their constructions of condensers.

Limited independence. It is well known that points on a random degree-(t − 1) curve are t-wise independent. So we may simply pick a random curve and use tail inequalities to bound the confidence error. However, the sample complexity is too high, and hence we need to use the technique of iterated sampling to reduce the number of sample points.

Iterated sampling. Iterated sampling is a useful technique for explicitly constructing randomness-efficient samplers [BR94, TSU06]. The idea is to first pick a large sample from the domain and then draw a sub-sample from the previous sample. The drawback of iterated sampling, however, is that it invests randomness twice while the confidence error does not shrink correspondingly. To remedy this problem, we add another ingredient into our construction, namely the technique of error reduction.

Error reduction via list-recoverable codes. We will use explicit list-recoverable codes (a strengthening of list-decodable codes [GI01]). More specifically, we will employ the list-recoverability of (folded) Reed-Solomon codes [GR08, GUV09]. List-recoverable codes provide a way of obtaining samplers with very small confidence error from those with mildly small confidence error. We refer to this transformation as error reduction; it plays a key role in our construction.

Sketch of the construction

Our curve sampler is the composition of two samplers, which we call the outer sampler and the inner sampler respectively. The outer sampler picks manifolds of dimension O(log m) from the domain M = F_q^m. The outer sampler has near-optimal randomness complexity, but its sample complexity is large. To fix this problem, we employ the idea of iterated sampling. Namely, we regard the manifold picked by the outer sampler as the new domain M', and then construct an inner sampler picking a curve from M' with small sample complexity.

The outer sampler is obtained by constructing an extractor and then using the extractor-sampler connection [Zuc97]. We follow the approach in [NZ96, Zuc97, SZ99]: Given an arbitrary random source with enough min-entropy, we first use a block source converter to convert it into a block source, and then feed it to a block source extractor. In addition, we need to construct these components carefully so as to maintain the low-degree-ness. The way we construct the block source converter is different from those in [NZ96, Zuc97, SZ99] (as they are not in the form of low-degree polynomial maps), and is based on the Reed-Solomon condenser proposed in [GUV09]: To obtain one block, we simply feed the random source and a fresh seed into the condenser, and let the output be the block. We show that this indeed gives a block source.

The inner sampler is constructed using the techniques of iterated sampling and error reduction. We start with the basic curve samplers picking totally random curves, and then apply the error reduction and iterated sampling techniques repeatedly to obtain the desired inner sampler. Either of the two operations improves one parameter while worsening some other one: Iterated sampling reduces the sample complexity but increases the randomness

complexity, whereas error reduction reduces the confidence error but increases the sample complexity. Our construction applies the two techniques alternately such that (1) we keep the invariant that the confidence error is always exponentially small in the randomness complexity, and (2) the sample complexity is finally brought down to q. We remark that the idea of sandwiching several operations to get the desired parameters without spoiling other ones is reminiscent of Reingold's proof that SL = L [Rei08] and Dinur's proof of the PCP theorem [Din07].

Outline

The organization of this thesis is as follows: In Chapter 2 we introduce the preliminary definitions and notions as well as some basic facts that will be used later. In Chapter 3 we present some basic results about samplers and curve samplers. Chapter 4 is devoted to the main result of this thesis, which describes an explicit construction of curve samplers. We divide this construction into two parts, the outer sampler (Section 4.1) and the inner sampler (Section 4.2). We then put it together and finish the construction of the curve samplers in Section 4.3. We present an alternative and simpler construction of outer samplers in Section 4.4.

Chapter 2
Preliminaries

Notations and basic definitions

We denote the set of numbers {1, 2, ..., n} by [n]. Given a prime power q, write F_q for the finite field of size q. Write U_S for the uniform distribution over a finite set S, U_{n,q} for the uniform distribution over F_q^n, and U_n for the uniform distribution over {0, 1}^n. Logarithms are taken with base 2 unless the base is explicitly specified.

Random variables and distributions are represented by upper-case letters whereas their specific values are represented by lower-case letters. Write x ← X if x is sampled according to X. Write X ∈ S if X is a distribution over a set S. The support of a distribution X ∈ S is supp(X) := {x ∈ S : Pr[X = x] > 0}.

We use the statistical distance Δ(·, ·) to measure the closeness of two distributions. The statistical distance between X, Y ∈ S is defined as Δ(X, Y) = max_{T⊆S} |Pr[X ∈ T] − Pr[Y ∈ T]|. Then Δ(·, ·) defines a metric. We say X is ε-close to Y if Δ(X, Y) ≤ ε.

Fact 1. The statistical distance is half the l_1 distance, i.e., for X, Y ∈ S, we have Δ(X, Y) = (1/2) Σ_{x∈S} |Pr[X = x] − Pr[Y = x]|.

For an event A, let I[A] be the indicator variable that evaluates to 1 if A occurs and 0 otherwise. For a distribution X and an event A that occurs with nonzero probability, define the conditional distribution X|_A by Pr[X|_A = x] = Pr[(X = x) ∧ A]/Pr[A].

We use forms like {t_x : x ∈ I} ∈ S^I to denote a collection of elements indexed by I with each element in the set S. Alternatively, we view {t_x : x ∈ I} as the function from I to S that maps x to t_x. We also slightly abuse the notation and use {t_x : x ∈ I} for an (unordered) multi-set.
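Fact 1 makes the statistical distance directly computable for explicitly given distributions. A minimal Python sketch (ours, not from the thesis), with distributions represented as dictionaries mapping outcomes to probabilities:

    # Statistical distance via Fact 1: half the l1 distance.
    def stat_dist(X, Y):
        support = set(X) | set(Y)
        return sum(abs(X.get(a, 0.0) - Y.get(a, 0.0)) for a in support) / 2

    X = {0: 0.5, 1: 0.5}
    Y = {0: 0.75, 1: 0.25}
    print(stat_dist(X, Y))   # 0.25, so X is eps-close to Y for any eps >= 1/4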

Indeterminates are written as upper-case letters. E.g., we use F[X_1, ..., X_n] to denote the polynomial ring over the field F with indeterminates X_1, ..., X_n. We say a polynomial p(X_1, ..., X_n) has degree t if the sum of the individual degrees Σ_{i=1}^n a_i is at most t for every monomial Π_{i=1}^n X_i^{a_i} of p.

Facts about curves and manifolds

The following facts will be useful:

Fact 2. For any distinct x_1, ..., x_{t+1} ∈ F_q and any y_1, ..., y_{t+1} ∈ F_q^n, there exists a unique curve C : F_q → F_q^n of degree at most t such that C(x_i) = y_i for all i ∈ [t + 1]. Indeed, the C_i's are given by the Lagrange polynomials:

  C_i(X) = Σ_{j=1}^{t+1} y_{j,i} Π_{k∈[t+1]\{j}} (X − x_k)/(x_j − x_k),

where y_{j,i} is the i-th coordinate of y_j, i ∈ [n].

Fact 3. Let f_1 : F_q^{d_1} → F_q^{d_0} be a manifold of degree t_1 and f_2 : F_q^{d_2} → F_q^{d_1} a manifold of degree t_2. Then f_1 ∘ f_2 : F_q^{d_2} → F_q^{d_0} is a manifold of degree t_1 t_2.

We also need the following lemma, generalizing the one in [TSU06]:

Lemma 2.0.1. A manifold f : (F_{q^D})^n → (F_{q^D})^m of degree t, when viewed as a function f : (F_q^D)^n → (F_q^D)^m, is also of degree t.

Proof. Write f = (f_1, ..., f_m). By symmetry we just show that f_1, when viewed as a function f_1 : (F_q^D)^n → F_q^D, has degree t. Suppose

  f_1(x_1, ..., x_n) = Σ_{d=(d_1,...,d_n): Σd_i ≤ t} c_d Π_{i=1}^n x_i^{d_i}.

Let (e_1, ..., e_D) be a basis of F_{q^D} over F_q. Writing the i-th variable x_i ∈ F_{q^D} as Σ_{j=1}^D x_{i,j} e_j with x_{i,j} ∈ F_q, and each coefficient c_d ∈ F_{q^D} as Σ_{j=1}^D c_{d,j} e_j with c_{d,j} ∈ F_q, we obtain

  f_1(x_1, ..., x_n) = Σ_{d=(d_1,...,d_n): Σd_i ≤ t} (Σ_{j=1}^D c_{d,j} e_j) Π_{i=1}^n (Σ_{j=1}^D x_{i,j} e_j)^{d_i}.

After multiplying out, each monomial in the variables x_{i,j} has degree at most max_d Σ_i d_i ≤ t, and their coefficients are polynomials in the elements e_1, ..., e_D. Rewriting each of these values in the basis

(e_1, ..., e_D) and gathering the coefficients on e_i, we obtain the i-th coordinate function of f_1, which has degree at most t for all 1 ≤ i ≤ D. Therefore f_1 : (F_q^D)^n → F_q^D is a manifold of degree t, and so is f : (F_q^D)^n → (F_q^D)^m.

Tail probability bounds

We say random variables X_1, ..., X_n are independent if for any specific values x_1, ..., x_n, it holds that

  Pr[∧_{i=1}^n X_i = x_i] = Π_{i=1}^n Pr[X_i = x_i].

We say X_1, ..., X_n are pairwise independent if for any specific values x_1, x_2 and any distinct i_1, i_2 ∈ [n], it holds that

  Pr[X_{i_1} = x_1 ∧ X_{i_2} = x_2] = Pr[X_{i_1} = x_1] · Pr[X_{i_2} = x_2].

In general, for an integer t > 1, we say X_1, ..., X_n are t-wise independent if for any specific values x_1, ..., x_t and any distinct i_1, ..., i_t ∈ [n], it holds that

  Pr[∧_{j=1}^t X_{i_j} = x_j] = Π_{j=1}^t Pr[X_{i_j} = x_j].

We consider the behaviour of a fully random curve C of degree t over F_q: write C = (C_1, ..., C_n); then each C_i is a degree-t univariate polynomial whose t + 1 coefficients are chosen uniformly at random from F_q, and all the C_i's are chosen independently. It is known that the points on C are (t + 1)-wise independent.

Lemma 2.0.2. Let C : F_q → F_q^n be a random curve of degree t over F_q. Then the random variables C(x) are (t + 1)-wise independent, where x ranges over F_q.

Proof. First note that each C(x) is uniformly distributed. By Fact 2, for any distinct y_1, ..., y_{t+1} ∈ F_q and any z_1, ..., z_{t+1} ∈ F_q^n, there is a unique degree-t curve, out of all q^{(t+1)n} curves, that passes through z_i at y_i for all i ∈ [t + 1]. So we have

  Pr[∧_{i=1}^{t+1} C(y_i) = z_i] = 1/q^{(t+1)n} = Π_{i=1}^{t+1} Pr[C(y_i) = z_i].

By definition, the random variables C(x) with x ranging over F_q are (t + 1)-wise independent.
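Fact 2 is constructive, and together with Lemma 2.0.2 it is easy to experiment with. The sketch below is ours; it assumes a prime field F_p, where inverses can be computed as pow(a, p-2, p) by Fermat's little theorem. It recovers the unique degree-t curve through t + 1 prescribed points:

    # Lagrange interpolation over F_p (Fact 2), p prime.
    p = 101

    def curve_through(points):
        # points: t+1 pairs (x_j, y_j) with distinct x_j in F_p, y_j in F_p^n.
        def C(x):
            n = len(points[0][1])
            val = [0] * n
            for j, (xj, yj) in enumerate(points):
                lj = 1   # Lagrange basis polynomial l_j evaluated at x
                for k, (xk, _) in enumerate(points):
                    if k != j:
                        lj = lj * (x - xk) * pow(xj - xk, p - 2, p) % p
                val = [(v + lj * yi) % p for v, yi in zip(val, yj)]
            return tuple(val)
        return C

    pts = [(1, (2, 3)), (4, (5, 6)), (7, (8, 9))]   # t + 1 = 3 points, n = 2
    C = curve_through(pts)
    assert all(C(x) == y for x, y in pts)           # C passes through each point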

We need the Chernoff bound in the following form:

Lemma 2.0.3. Suppose X_1, ..., X_n ∈ [0, 1] are independent random variables. Let X = Σ_{i=1}^n X_i and µ = E[X], and let R ≥ 6µ. Then Pr[X ≥ R] ≤ 2^{−R}.

The following bound follows from Chebyshev's inequality:

Lemma 2.0.4. Suppose X_1, ..., X_n are pairwise independent random variables. Let X = Σ_{i=1}^n X_i and µ = E[X], and let A > 0. Then Pr[|X − µ| ≥ A] ≤ Σ_{i=1}^n Var[X_i]/A².

We will also use the following tail bound for t-wise independent random variables:

Lemma 2.0.5 ([BR94]). Let t ≥ 4 be an even integer. Suppose X_1, ..., X_n ∈ [0, 1] are t-wise independent random variables. Let X = Σ_{i=1}^n X_i and µ = E[X], and let A > 0. Then

  Pr[|X − µ| ≥ A] = O(((tµ + t²)^{t/2})/A^t).

Basic line/curve samplers

The simplest line samplers are those picking completely random lines, as defined below. We call them basic line samplers.

Definition (basic line sampler). For m ≥ 1 and prime power q, let Line_{m,q} : F_q^{2m} × F_q → F_q^m be the line sampler that picks a uniformly random line in F_q^m. Formally,

  Line_{m,q}((a, b), y) := (a_1 y + b_1, ..., a_m y + b_m)

for a = (a_1, ..., a_m), b = (b_1, ..., b_m) ∈ F_q^m and y ∈ F_q.

Remark 1. Note that although Line_{m,q}(x, ·) is a line (i.e., a degree-1 curve) for each x ∈ F_q^{2m}, the function Line_{m,q} itself is a degree-2 manifold.

The basic line samplers are indeed good samplers:

Lemma 2.0.6. For ε > 0, m ≥ 1 and prime power q, Line_{m,q} is an (ε, 1/(ε²q)) line sampler.

Proof. Let A be an arbitrary subset of F_q^m. Note that Line_{m,q} picks a line uniformly at random. By Lemma 2.0.2, the random variables Line_{m,q}(U_{2m,q}, y) with y ranging over F_q are pairwise independent. So the indicator variables I[Line_{m,q}(U_{2m,q}, y) ∈ A] with y ranging over F_q are also pairwise independent. Applying Lemma 2.0.4, we get

  Pr_{x←U_{2m,q}}[|µ_{Line_{m,q}(x,·)}(A) − µ(A)| > ε]
    = Pr[|Σ_{y∈F_q} I[Line_{m,q}(U_{2m,q}, y) ∈ A] − E[Σ_{y∈F_q} I[Line_{m,q}(U_{2m,q}, y) ∈ A]]| > εq]
    ≤ Σ_{y∈F_q} Var[I[Line_{m,q}(U_{2m,q}, y) ∈ A]]/(εq)²
    ≤ 1/(ε²q).

By definition, Line_{m,q} is an (ε, 1/(ε²q)) line sampler.

Similarly we consider the simplest low-degree curve samplers, which pick completely random curves. We call them basic curve samplers.

Definition (basic curve sampler). For m ≥ 1, t ≥ 4 and prime power q, let Curve_{m,t,q} : F_q^{tm} × F_q → F_q^m be the curve sampler that picks a uniformly random curve of degree t − 1 in F_q^m. Formally,

  Curve_{m,t,q}((c_0, ..., c_{t−1}), y) := (Σ_{i=0}^{t−1} c_{i,1} y^i, ..., Σ_{i=0}^{t−1} c_{i,m} y^i)

for c_0 = (c_{0,1}, ..., c_{0,m}), ..., c_{t−1} = (c_{t−1,1}, ..., c_{t−1,m}) ∈ F_q^m and y ∈ F_q.

Remark 2. Note that Curve_{m,t,q} is a manifold of degree t.
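Both basic samplers are immediate to implement. A sketch (ours; a prime field F_p stands in for a general F_q):

    # Basic line and curve samplers over F_p.
    import random
    p = 101

    def line_sampler(a, b, y):
        # Line_{m,p}((a, b), y) = (a_1 y + b_1, ..., a_m y + b_m).
        return tuple((ai * y + bi) % p for ai, bi in zip(a, b))

    def curve_sampler(cs, y):
        # Curve_{m,t,p}((c_0, ..., c_{t-1}), y): coordinate j is
        # sum_i c_{i,j} y^i, a uniformly random degree-(t-1) curve
        # when the coefficient vectors c_i are uniform in F_p^m.
        m = len(cs[0])
        return tuple(sum(ci[j] * pow(y, i, p) for i, ci in enumerate(cs)) % p
                     for j in range(m))

    m, t = 3, 4
    a = [random.randrange(p) for _ in range(m)]
    b = [random.randrange(p) for _ in range(m)]
    print([line_sampler(a, b, y) for y in range(3)])  # first points on the line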

The basic curve samplers are indeed good samplers:

Lemma 2.0.7. For ε > 0, m ≥ 1, t ≥ 4 and sufficiently large prime power q ≥ (t/ε)^{O(1)}, Curve_{m,t,q} is an (ε, q^{−t/4}) sampler.

Proof. Let A be an arbitrary subset of F_q^m. Note that Curve_{m,t,q}(x, ·) picks a degree-(t − 1) curve uniformly at random. By Lemma 2.0.2, the random variables Curve_{m,t,q}(U_{tm,q}, y) with y ranging over F_q are t-wise independent. So the indicator variables I[Curve_{m,t,q}(U_{tm,q}, y) ∈ A] with y ranging over F_q are also t-wise independent. Applying Lemma 2.0.5, we get

  Pr[|µ_{Curve_{m,t,q}(U_{tm,q},·)}(A) − µ(A)| > ε]
    = Pr[|Σ_{y∈F_q} I[Curve_{m,t,q}(U_{tm,q}, y) ∈ A] − E[Σ_{y∈F_q} I[Curve_{m,t,q}(U_{tm,q}, y) ∈ A]]| > εq]
    = O(((tµ(A)q + t²)^{t/2})/(εq)^t)
    ≤ q^{−t/4}

provided that q ≥ (t/ε)^{O(1)} is sufficiently large. By definition, Curve_{m,t,q} is an (ε, q^{−t/4}) sampler.

Extractors and condensers

A (seeded) extractor is an object that takes an imperfect random variable (i.e., a random variable that contains some randomness but is not uniformly distributed) called the (weakly) random source, invests a small amount of randomness called the seed, and produces an output whose distribution is very close to the uniform distribution. We introduce the following notion to measure the amount of randomness contained in a random source.

Definition (min-entropy). We say a random variable X over a set S has min-entropy k, and entropy deficiency log|S| − k, if for any x ∈ S it holds that Pr[X = x] ≤ 2^{−k}. The min-entropy of X is at most log|S|, and it achieves log|S| iff X = U_S. We say X has q-ary min-entropy k if for any x ∈ S it holds that Pr[X = x] ≤ q^{−k} (or equivalently, X has min-entropy k log q).

Lemma 2.0.8 (chain rule for min-entropy). Let (X, Y) be a joint distribution where X ∈ F_q^l and Y has q-ary min-entropy k. We have

  Pr_{x←X}[Y|_{X=x} has q-ary min-entropy k − l − log_q(1/ε)] ≥ 1 − ε.

Proof. We say x ∈ supp(X) is good if Pr[X = x] ≥ ε q^{−l} and bad otherwise. Then Pr_{x←X}[x is bad] ≤ |supp(X)| · ε q^{−l} ≤ ε. Consider an arbitrary good x. For any specific value y of Y, we have Pr[Y|_{X=x} = y] = Pr[(Y = y) ∧ (X = x)]/Pr[X = x] ≤ Pr[Y = y]/(ε q^{−l}) ≤ q^{−(k − l − log_q(1/ε))}. By definition, Y|_{X=x} has q-ary min-entropy k − l − log_q(1/ε) when x is good, which occurs with probability at least 1 − ε.
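For explicitly given distributions, q-ary min-entropy is a one-line computation; a small sketch (ours):

    # q-ary min-entropy: the largest k with max_x Pr[X = x] <= q^{-k}.
    import math

    def qary_min_entropy(X, q):
        # X: dict mapping outcomes to probabilities.
        return -math.log(max(X.values()), q)

    flat = {x: 1 / 8 for x in range(8)}   # flat source on 8 points
    print(qary_min_entropy(flat, 4))      # log_4(8) = 1.5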

Before giving the formal definition of extractors, we first consider a kind of object called condensers, which can be seen as a relaxation of extractors. A condenser is weaker than an extractor in the sense that the output is only required to be close to a distribution with a large amount of min-entropy, rather than close to the uniform distribution.

Definition (condenser). Given a function C : F_q^n × F_q^d → F_q^m, we say C is an (n, k_1) →_ε (m, k_2) condenser if for every distribution X with q-ary min-entropy k_1, C(X, U_{d,q}) is ε-close to a distribution with q-ary min-entropy k_2. The second argument of C is its seed. The quantities n log q, d log q and m log q are called the input length, seed length and output length of C respectively. We call k_1 log q the min-entropy threshold of C and ε the error of C.

Next we define extractors as a strengthening of condensers.

Definition (extractor). The function E : F_q^n × F_q^d → F_q^m is a (k, ε, q) extractor if it is an (n, k) →_ε (m, m) condenser. The seed, input length, seed length, output length, min-entropy threshold, and error of the extractor E are the same as the corresponding parameters of E as a condenser.

We say a condenser/extractor f : F_q^n × F_q^d → F_q^m has degree t if f has degree t as a manifold in F_q^{n+d}.

A random source X is called a flat source if it is uniformly distributed over its support, i.e.,

  Pr[X = a] = 1/|supp(X)| if a ∈ supp(X), and 0 otherwise.

The following basic fact will be useful:

Fact 4. A random source X of min-entropy k is a convex combination of flat sources of min-entropy k. Writing X = Σ_{i∈I} c_i X_i as such a convex combination, we have

  Δ(X, Y) = Δ(Σ_{i∈I} c_i X_i, Σ_{i∈I} c_i Y) ≤ Σ_{i∈I} c_i Δ(X_i, Y) ≤ sup_{i∈I} Δ(X_i, Y).

Thus, we may assume that the input is always a flat source of min-entropy k when proving the extractor or condenser property for min-entropy threshold k.
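Fact 4 also yields a brute-force test of the extractor property for toy parameters: it suffices to check flat sources, and every flat source of q-ary min-entropy at least k is a convex combination of flat sources whose support has size exactly q^k. A sketch (ours; exhaustive, so only feasible for very small q, n, d, m, and it assumes k is an integer):

    # Brute-force extractor check over flat sources (justified by Fact 4).
    from itertools import combinations, product

    def is_extractor(f, q, n, d, m, k, eps):
        # f maps (x, y), x in F_q^n, y in F_q^d, to a tuple in F_q^m.
        inputs = list(product(range(q), repeat=n))
        seeds = list(product(range(q), repeat=d))
        K = q ** k                                  # flat-source support size
        for supp in combinations(inputs, K):
            counts = {}
            for x in supp:
                for y in seeds:
                    z = f(x, y)
                    counts[z] = counts.get(z, 0) + 1
            total = K * len(seeds)
            # statistical distance of f(X, U_d) from U_m, via Fact 1
            dist = sum(abs(c / total - q ** -m) for c in counts.values()) / 2
            dist += (q ** m - len(counts)) * q ** -m / 2    # unhit outputs
            if dist > eps:
                return False
        return True

    # Toy usage: is_extractor(lambda x, y: ((x[0] + x[1] * y[0]) % 2,),
    #                         2, 2, 1, 1, 1, 0.3) returns True.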

Block source extraction

One important class of random sources is the class of block sources, first introduced in [CG88]. A block source is a random source with the property that, conditioned on any prefix of blocks, the remaining blocks still have some min-entropy.

Definition (block source [CG88]). A random source X = (X_1, ..., X_s) ∈ F_q^{n_1} × ... × F_q^{n_s} with each X_i ∈ F_q^{n_i} is a (k_1, ..., k_s) q-ary block source if for any i ∈ [s] and (x_1, ..., x_{i−1}) ∈ supp(X_1, ..., X_{i−1}), the conditional distribution X_i|_{X_1=x_1,...,X_{i−1}=x_{i−1}} has q-ary min-entropy k_i. Each X_i is called a block.

We consider the problem of extracting randomness from block sources:

Definition (block source extractor). A function E : (F_q^{n_1} × ... × F_q^{n_s}) × F_q^d → F_q^m is called a ((k_1, ..., k_s), ε, q) block source extractor if for any (k_1, ..., k_s) q-ary block source (X_1, ..., X_s) ∈ F_q^{n_1} × ... × F_q^{n_s}, the output E((X_1, ..., X_s), U_{d,q}) is ε-close to U_{m,q}.

One nice property of block sources is that their special structure allows us to compose several extractors and get a block source extractor with only a small amount of randomness invested.

Definition (block source extraction by composition). Let s ≥ 1 be an integer and for each i ∈ [s], let E_i : F_q^{n_i} × F_q^{d_i} → F_q^{m_i} be a map. Suppose that m_i ≥ d_{i−1} for all i ∈ [s], where we define d_0 = 0. Define E = BlkExt(E_1, ..., E_s) as follows:

  E : (F_q^{n_1} × ... × F_q^{n_s}) × F_q^{d_s} → F_q^{m_1−d_0} × ... × F_q^{m_s−d_{s−1}}
  ((x_1, ..., x_s), y_s) ↦ (z_1, ..., z_s)

where for i = s, ..., 1, we iteratively define (y_{i−1}, z_i) to be a partition of E_i(x_i, y_i) into the prefix y_{i−1} ∈ F_q^{d_{i−1}} and the suffix z_i ∈ F_q^{m_i−d_{i−1}}. See Figure 2.1 for an illustration of the above definition.

Lemma 2.0.9. Let s ≥ 1 be an integer and for each i ∈ [s], let E_i : F_q^{n_i} × F_q^{d_i} → F_q^{m_i} be a (k_i, ε_i, q) extractor of degree t_i ≥ 1. Then E = BlkExt(E_1, ..., E_s) is a ((k_1, ..., k_s), ε, q) block source extractor of degree t, where ε = Σ_{i=1}^s ε_i and t = Π_{i=1}^s t_i.

Proof. Induct on s. When s = 1 the claim follows from the extractor property of E_1. When s > 1, assume the claim holds for all s' < s. Define E' = BlkExt(E_2, ..., E_s). By the induction hypothesis, E' is a ((k_2, ..., k_s), ε', q) block source extractor of degree t', where ε' = Σ_{i=2}^s ε_i and t' = Π_{i=2}^s t_i.

[Figure 2.1: The composed block source extractor BlkExt(E_1, ..., E_s). The seed Y_s is fed to E_s; for each i, the output of E_i(X_i, Y_i) is split into the seed Y_{i−1} of E_{i−1} and the output block Z_i.]

Let (X_1, ..., X_s) ∈ F_q^{n_1} × ... × F_q^{n_s} be an arbitrary (k_1, ..., k_s) q-ary block source. Let (Y_{i−1}, Z_i) ∈ F_q^{d_{i−1}} × F_q^{m_i−d_{i−1}} be the output of E_i, and let (Z_1, ..., Z_s) be the output of E, when (X_1, ..., X_s) is fed into E as the input and an independent uniform distribution Y_s = U_{d_s,q} is used as the seed (cf. Figure 2.1). The output of E' is then (Y_1, Z_2, ..., Z_s).

Fix x ∈ supp(X_1). By the definition of block sources, the distribution (X_2, ..., X_s)|_{X_1=x} is a (k_2, ..., k_s) q-ary block source. Also note that Y_s|_{X_1=x} = Y_s is an independent uniform distribution. By the induction hypothesis, the distribution (Y_1, Z_2, ..., Z_s)|_{X_1=x} = E'((X_2, ..., X_s), Y_s)|_{X_1=x} is ε'-close to (U_{d_1,q}, U_{m_2−d_1,q}, ..., U_{m_s−d_{s−1},q}). As this holds for all x ∈ supp(X_1), the distribution (X_1, Y_1, Z_2, ..., Z_s) is ε'-close to (X_1, U_{d_1,q}, U_{m_2−d_1,q}, ..., U_{m_s−d_{s−1},q}). So the distribution E((X_1, ..., X_s), Y_s) = (E_1(X_1, Y_1), Z_2, ..., Z_s) is ε'-close to (E_1(X_1, U_{d_1,q}), U_{m_2−d_1,q}, ..., U_{m_s−d_{s−1},q}). Then we know that it is also ε-close to (U_{m_1−d_0,q}, U_{m_2−d_1,q}, ..., U_{m_s−d_{s−1},q}) since E_1 is a (k_1, ε_1, q) extractor.

Finally, to see that E has degree t, note that E'((X_2, ..., X_s), Y_s) = (Y_1, Z_2, ..., Z_s) has degree t' in its variables X_2, ..., X_s and Y_s by the induction hypothesis, and hence Y_1, Z_2, ..., Z_s have degree t' in these variables. Then Z_1 = E_1(X_1, Y_1) has degree t_1 · max{1, t'} = t in X_1, ..., X_s and Y_s. So E((X_1, ..., X_s), Y_s) = (Z_1, ..., Z_s) has degree t in X_1, ..., X_s and Y_s.
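The composition BlkExt is just careful plumbing of seeds, and is easy to mirror in code. A sketch (ours), with each E_i an arbitrary black-box function on tuples of field elements:

    # BlkExt(E_1, ..., E_s): run E_s, ..., E_1; the prefix of E_i's output
    # seeds E_{i-1}, the suffix becomes the output block z_i.
    def blk_ext(Es, ds, xs, y_s):
        # Es = [E_1, ..., E_s]; ds = [d_0, ..., d_{s-1}] with d_0 = 0;
        # xs = blocks (x_1, ..., x_s); y_s = seed for E_s.
        s = len(Es)
        y, zs = y_s, [None] * s
        for i in range(s, 0, -1):                     # i = s, ..., 1
            out = Es[i - 1](xs[i - 1], y)             # E_i(x_i, y_i)
            y, zs[i - 1] = out[:ds[i - 1]], out[ds[i - 1]:]   # (y_{i-1}, z_i)
        return tuple(zs)                              # (z_1, ..., z_s)

    # Toy run showing only the wiring (E just concatenates seed and block):
    E = lambda x, y: tuple(y) + tuple(x)
    print(blk_ext([E, E], [0, 1], [(7,), (8,)], (3,)))   # ((3, 7), (8,))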

Chapter 3
Basic results

3.1 Extractor vs. sampler connection

Our construction of curve samplers relies on the following observation by [Zuc97], which shows the equivalence between extractors and samplers.

Theorem 3.1.1 ([Zuc97], restated). Given a map f : F_q^n × F_q^d → F_q^m, we have the following:

(1) If f is a (k, ε, q) extractor, then it is also an (ε, δ) sampler, where δ = 2q^{k−n}.

(2) If f is an (ε/2, δ) sampler, where δ = εq^{k−n}, then it is also a (k, ε, q) extractor.

Proof. (Extractor to sampler) Assume to the contrary that f is not an (ε, δ) sampler. Then there exists a subset A ⊆ F_q^m with Pr_x[|µ_{f(x)}(A) − µ(A)| > ε] > δ. Then either Pr_x[µ_{f(x)}(A) − µ(A) > ε] > δ/2 or Pr_x[µ_{f(x)}(A) − µ(A) < −ε] > δ/2. Assume Pr_x[µ_{f(x)}(A) − µ(A) > ε] > δ/2 (the other case is symmetric). Let X be the uniform distribution over the set of x such that µ_{f(x)}(A) − µ(A) > ε, i.e., Pr_y[f(x, y) ∈ A] − µ(A) > ε. Then Δ(f(X, U_{d,q}), U_{m,q}) > ε. But the q-ary min-entropy of X is at least log_q((δ/2)q^n) ≥ k, contradicting the extractor property of f.

(Sampler to extractor) Assume to the contrary that f is not a (k, ε, q) extractor. Then there exist a subset A ⊆ F_q^m and a random source X of q-ary min-entropy k satisfying |Pr[f(X, U_{d,q}) ∈ A] − µ(A)| > ε. We may assume X is a flat source (i.e., uniformly distributed over its support), since a general source with q-ary min-entropy k is a convex combination of flat sources with q-ary min-entropy k. Note that |supp(X)| ≥ q^k. By an averaging argument, for at least an ε-fraction of x ∈ supp(X), we have |Pr[f(x, U_{d,q}) ∈ A] − µ(A)| > ε/2. But this implies that for x uniformly chosen from F_q^n, with probability at least ε · µ(supp(X)) ≥ δ we have |µ_{f(x)}(A) − µ(A)| > ε/2, contradicting the sampling property of f.

Table 3.1 shows the rough correspondences between the parameters of extractors and those of samplers.

  extractor                            sampler
  error ε                              accuracy error ε
  entropy deficiency (n − k) log q     confidence error δ
  seed length d log q                  sample complexity q^d
  input length n log q                 randomness complexity n log q
  output length m log q                domain size q^m

  Table 3.1: The correspondences between the parameters of extractors and samplers

3.2 Existence of a good curve sampler

In this section, we prove the existence of a (non-explicit) low-degree curve sampler with low randomness complexity and a small number of sample points.

Theorem 3.2.1. For any m ≥ 1, ε, δ > 0 and sufficiently large prime power q = (log(1/δ)/ε)^{Θ(1)}, there exists a (non-explicit) (ε, δ) degree-t curve sampler S : D × F_q → F_q^m with randomness complexity log|D| = m log q + log(1/δ) + O(1), sample complexity q, and t = O(log_q(1/δ)).

Proof. We use the probabilistic method. Choose the curve sampler S by choosing the degree-t curve S(x, ·) : F_q → F_q^m independently and uniformly at random for each x ∈ D. Let A be an arbitrary subset of F_q^m. Fix x ∈ D. By Lemma 2.0.2, the random variables S(x, y) with y ranging over F_q are (t + 1)-wise independent. So the indicator variables I[S(x, y) ∈ A] with y ranging over F_q are also (t + 1)-wise independent. Applying Lemma 2.0.5, we get

  Pr[|µ_{S(x,·)}(A) − µ(A)| > ε] = Pr[|Σ_{y∈F_q} I[S(x, y) ∈ A] − E[Σ_{y∈F_q} I[S(x, y) ∈ A]]| > εq] = O((((t + 1)µ(A)q + (t + 1)²)^{(t+1)/2})/(εq)^{t+1}) ≤ δ/6

for sufficiently large q. Let B(x) be the event that |µ_{S(x,·)}(A) − µ(A)| > ε. Then Pr[I[B(x)] = 1] ≤ δ/6 and hence E[Σ_x I[B(x)]] ≤ δ|D|/6. The indicator variables I[B(x)] with x ranging over D are independent. Applying Lemma 2.0.3, we obtain

  Pr[Σ_x I[B(x)] ≥ δ|D|] ≤ 2^{−δ|D|}.

There are 2^{q^m} possible subsets A ⊆ F_q^m. So with probability at least 1 − 2^{q^m} · 2^{−δ|D|} > 0 (for sufficiently large log|D| = m log q + log(1/δ) + O(1)), the events Σ_x I[B(x)] < δ|D| for all A ⊆ F_q^m occur simultaneously, by the union bound. Take a curve sampler S that makes all these events occur. Then S is an (ε, δ) degree-t curve sampler by definition.

The most interesting case is when the domain size q^m and the confidence error δ are polynomially related, while the field size q and the degree t are kept small:

Corollary 3.2.2. Given the domain size N = |F_q^m| = q^m, accuracy error ε = (log N)^{−O(1)}, confidence error δ = N^{−O(1)}, and large enough field size q = (log N)^{Θ(1)}, there exists a (non-explicit) (ε, δ) degree-t curve sampler S : D × F_q → F_q^m with randomness complexity log|D| = O(log N), sample complexity q, and t = Θ(log N / log log N).

3.3 Lower bounds

We will use the following optimal lower bound for extractors:

Theorem 3.3.1 ([RTS00], restated). Let E : F_q^n × F_q^d → F_q^m be a (k, ε, q) extractor. Then

(a) if ε < 1/2 and d ≤ m/2, then q^d = Ω((n − k) log q / ε²), and

(b) if d ≤ m/4, then q^{d+k−m} = Ω(1/ε²).

Theorem 3.3.2. Let S : F_q^n × F_q → F_q^m be an (ε, δ) curve sampler where ε < 1/2 and m ≥ 2. Then

(a) the sample complexity q = Ω(log(2ε/δ)/ε²), and

(b) the randomness complexity n log q ≥ (m − 1) log q + log(1/ε) + log(1/δ) − O(1).

Proof. By Theorem 3.1.1, S is a (k, 2ε, q) extractor where k = n − log_q(2ε/δ). The first claim then follows from Theorem 3.3.1(a). Applying Theorem 3.3.1(b), we get (1 + k − m) log q ≥ Ω(log(1/ε)) − O(1). Therefore

  n log q = k log q + log(2ε/δ) ≥ (m − 1) log q + log(1/δ) + Ω(log(1/ε)) − O(1).

In particular, as log(1/ε) = O(log q), the randomness complexity n log q is at least Ω(log N + log(1/δ)) when the domain size N = q^m is at least N_0 for some constant N_0. Therefore the randomness complexity in Theorem 3.2.1 is optimal up to a constant factor. We also present the following lower bound on the degree of curves sampled by a curve sampler:

Theorem 3.3.3. Let S : N × F_q → F_q^m be an (ε, δ) degree-t curve sampler where m ≥ 2, ε < 1/2 and δ < 1. Then t = Ω(log_q(1/δ) + 1).

Proof. Clearly t ≥ 1. Suppose S = (S_1, ..., S_m) and define S' = (S_1, S_2). Let C be the set of curves of degree at most t in F_q^2. Then |C| = q^{2(t+1)}. Consider the map τ : N → C that sends x to S'(x, ·). We can pick k = ⌊q/2⌋ curves C_1, ..., C_k ∈ C such that the union of their preimages

  B := ∪_{i=1}^k τ^{−1}(C_i) = ∪_{i=1}^k {x : S'(x, ·) = C_i}

has size at least k|N|/|C| = k|N|/q^{2(t+1)}. Define A ⊆ F_q^m by

  A := {C_i(y) : i ∈ [k], y ∈ F_q} × F_q^{m−2},

i.e., let A be the set of points in F_q^m whose first two coordinates lie on at least one curve C_i. We have |A| ≤ kq · q^{m−2} and hence µ(A) ≤ k/q ≤ 1/2 < 1 − ε. On the other hand, it follows from the definition of A that we have S(x, y) ∈ A for all x ∈ B and y ∈ F_q. So µ_{S(x)}(A) = 1 for all x ∈ B. Then

  δ ≥ Pr[|µ_{S(x)}(A) − µ(A)| > ε] ≥ |B|/|N| ≥ k/q^{2(t+1)}

and hence

  t ≥ max{1, (1/2) log_q(k/δ) − 1} = Ω(log_q(1/δ) + 1).

We remark that the condition m ≥ 2 is necessary in Theorem 3.3.3, since when m = 1 the sampler S with S(x, y) = y for all x ∈ N and y ∈ F_q is a (0, 0) degree-1 curve sampler.

Chapter 4
Explicit constructions

4.1 Outer sampler

In this section we construct an O(log k)-dimensional manifold sampler, which we call the outer sampler, with optimal randomness complexity, where k = n − log_q(1/δ). In the language of extractors, we construct an extractor with seed in F_q^{O(log k)} for random sources of q-ary min-entropy k.

4.1.1 Block source conversion

Definition (block source converter [NZ96]). A function C : F_q^n × F_q^d → F_q^{m_1} × ... × F_q^{m_s} is called a (k, (k_1, ..., k_s), ε, q) block source converter if for any random source X ∈ F_q^n of q-ary min-entropy k, the output C(X, U_{d,q}) ∈ F_q^{m_1} × ... × F_q^{m_s} is ε-close to a (k_1, ..., k_s) q-ary block source. In addition, we say C has degree t if C has degree t as a manifold in F_q^{n+d}.

It was shown in [NZ96] that one can obtain a block by choosing a pseudorandom subset of the bits of the random source. Yet the proof is pretty delicate and cumbersome. Furthermore, the resulting extractor does not have a nice algebraic structure. Here we make the observation that the following condenser from Reed-Solomon codes in [GUV09] can be used to obtain blocks, and that it is a low-degree manifold.

Definition (condenser from Reed-Solomon codes [GUV09]). Let ζ ∈ F_q be a generator of the multiplicative group F_q^×. For n, m ≥ 1 and prime power q, define RSCon_{n,m,q} : F_q^n × F_q → F_q^m by

  RSCon_{n,m,q}(x, y) = (y, f_x(y), f_x(ζy), ..., f_x(ζ^{m−2}y)),

where f_x(Y) = Σ_{i=0}^{n−1} x_i Y^i for x = (x_0, x_1, ..., x_{n−1}) ∈ F_q^n.
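The condenser is concretely computable. A sketch over a prime field (ours; find_generator is a hypothetical brute-force helper whose factor set {2, 5} is hard-coded for p − 1 = 100, so it only suits this toy p):

    # RSCon_{n,m,p}(x, y) = (y, f_x(y), f_x(zeta*y), ..., f_x(zeta^{m-2} y)).
    p = 101

    def find_generator(p):
        # Generator of F_p^*: g qualifies if g^((p-1)/r) != 1 for every
        # prime factor r of p - 1 (here p - 1 = 100 = 2^2 * 5^2).
        for g in range(2, p):
            if all(pow(g, (p - 1) // r, p) != 1 for r in (2, 5)):
                return g

    def rs_con(x, y, m):
        zeta = find_generator(p)
        f_x = lambda u: sum(xi * pow(u, i, p) for i, xi in enumerate(x)) % p
        return (y,) + tuple(f_x(pow(zeta, j, p) * y % p) for j in range(m - 1))

    print(rs_con([1, 2, 3], 5, 3))   # one output block in F_p^3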

Theorem 4.1.1 ([GUV09]). For any h ≥ 1, n ≥ m ≥ 1, prime power q and ε > 0, RSCon_{n,m,q} is an (n, log_q(H/ε)) →_{2ε} (m, log_q(L/(2ε))) condenser, where H = (h − 1)^{m−1} − 1 and L = ((n − 1)(h − 1)(m − 1)/ε) · (h^{m−1} − 1). In particular, for large enough q ≥ (n/ε)^{O(1)}, RSCon_{n,m,q} is an (n, m) →_ε (m, 0.99m) condenser.

Remark 3. The condenser RSCon_{n,m,q}(x, y) is a degree-n manifold, as each monomial in any of its coordinates is of the form y or x_i(ζ^j y)^i, where i ≤ n − 1.

Remark 4. The reason we use the condenser from Reed-Solomon codes rather than the ones from Parvaresh-Vardy codes [GUV09, TSU12] is that we need the condenser to be a low-degree manifold in both the seed and the random source. The known condensers from Parvaresh-Vardy codes are low-degree in the seed, yet we have no good bound on their degree in the random source.

We apply the above condenser to the random source with an independent seed to obtain each new block. Formally:

Definition (block source converter via condensing). For integers n, m_1, ..., m_s ≥ 1 and prime power q, define the function BlkCnvt_{n,(m_1,...,m_s),q} : F_q^n × F_q^s → F_q^{m_1+...+m_s} by

  BlkCnvt_{n,(m_1,...,m_s),q}(x, y) = (RSCon_{n,m_1,q}(x, y_1), ..., RSCon_{n,m_s,q}(x, y_s))

for x ∈ F_q^n and y = (y_1, ..., y_s) ∈ F_q^s.

The function BlkCnvt_{n,(m_1,...,m_s),q} is indeed a block source converter, as we show below. The intuition is that conditioned on the values of the previous blocks, the random source X still has enough min-entropy, and hence we may apply the condenser to get the next block. We need the following technical lemmas:

Lemma 4.1.1. Let P, Q ∈ I be two distributions with Δ(P, Q) ≤ ε. Let {X_i : i ∈ supp(P)} and {Y_i : i ∈ supp(Q)} be two collections of distributions over the same domain S such that Δ(X_i, Y_i) ≤ ε' for any i ∈ supp(P) ∩ supp(Q). Then X := Σ_{i∈supp(P)} Pr[P = i] · X_i is (2ε + ε')-close to Y := Σ_{i∈supp(Q)} Pr[Q = i] · Y_i.

Proof. Let T be an arbitrary subset of S; we will prove that |Pr[X ∈ T] − Pr[Y ∈ T]| ≤ 2ε + ε'. Note that we can add dummy distributions X_i for i ∈ I \ supp(P) and Y_j for j ∈ I \ supp(Q) such that Δ(X_i, Y_i) ≤ ε' for all i ∈ I, and it still holds that X = Σ_{i∈I} Pr[P = i] · X_i

and Y = Σ_{i∈I} Pr[Q = i] · Y_i. Then we have

  |Pr[X ∈ T] − Pr[Y ∈ T]|
    = |Σ_{i∈I} Pr[P = i] Pr[X_i ∈ T] − Σ_{i∈I} Pr[Q = i] Pr[Y_i ∈ T]|
    ≤ Σ_{i∈I} |(Pr[P = i] − Pr[Q = i]) Pr[X_i ∈ T]| + Σ_{i∈I} Pr[Q = i] |Pr[X_i ∈ T] − Pr[Y_i ∈ T]|
    ≤ (Σ_{i∈I} |Pr[P = i] − Pr[Q = i]|) + ε' (Σ_{i∈I} Pr[Q = i])
    ≤ 2ε + ε'.

Lemma 4.1.2. Let X = (X_1, ..., X_s) ∈ F_q^{n_1} × ... × F_q^{n_s} be a distribution such that for any i ∈ [s] and (x_1, ..., x_{i−1}) ∈ supp(X_1, ..., X_{i−1}), the conditional distribution X_i|_{X_1=x_1,...,X_{i−1}=x_{i−1}} is ε-close to a distribution X'_i(x_1, ..., x_{i−1}) with q-ary min-entropy k_i. Then X is 2sε-close to a (k_1, ..., k_s) q-ary block source.

Proof. Define X' = (X'_1, ..., X'_s) as the unique distribution such that for any i ∈ [s] and any (x_1, ..., x_{i−1}) ∈ supp(X'_1, ..., X'_{i−1}), the conditional distribution X'_i|_{X'_1=x_1,...,X'_{i−1}=x_{i−1}} equals X'_i(x_1, ..., x_{i−1}) if (x_1, ..., x_{i−1}) ∈ supp(X_1, ..., X_{i−1}) (which always holds if i = 1), and otherwise equals U_{n_i,q}. For any i ∈ [s] and (x_1, ..., x_{i−1}) ∈ supp(X'_1, ..., X'_{i−1}), we know X'_i|_{X'_1=x_1,...,X'_{i−1}=x_{i−1}} is either X'_i(x_1, ..., x_{i−1}) or U_{n_i,q}, and in either case it has q-ary min-entropy k_i. So X' is a (k_1, ..., k_s) q-ary block source.

We will then prove that for any i ∈ [s] and any (x_1, ..., x_{i−1}) ∈ supp(X_1, ..., X_{i−1}) ∩ supp(X'_1, ..., X'_{i−1}), the conditional distribution X|_{X_1=x_1,...,X_{i−1}=x_{i−1}} is 2(s − i + 1)ε-close to X'|_{X'_1=x_1,...,X'_{i−1}=x_{i−1}}. Setting i = 1 proves the lemma.

Induct on i. For i = s the claim holds by the definition of X'. For i < s, assume the claim holds for i + 1; we will prove that it holds for i as well. Consider any (x_1, ..., x_{i−1}) ∈ supp(X_1, ..., X_{i−1}) ∩ supp(X'_1, ..., X'_{i−1}). Let A = X_i|_{X_1=x_1,...,X_{i−1}=x_{i−1}} and B = X'_i|_{X'_1=x_1,...,X'_{i−1}=x_{i−1}}. We have

  X|_{X_1=x_1,...,X_{i−1}=x_{i−1}} = Σ_{x_i∈supp(A)} Pr[A = x_i] · X|_{X_1=x_1,...,X_i=x_i}

and

  X'|_{X'_1=x_1,...,X'_{i−1}=x_{i−1}} = Σ_{x_i∈supp(B)} Pr[B = x_i] · X'|_{X'_1=x_1,...,X'_i=x_i}.

By the induction hypothesis, we have

  Δ(X|_{X_1=x_1,...,X_i=x_i}, X'|_{X'_1=x_1,...,X'_i=x_i}) ≤ 2(s − i)ε

for x_i ∈ supp(A) ∩ supp(B). Also note that B is identical to X'_i(x_1, ..., x_{i−1}) and is ε-close to A by definition. The claim then follows from Lemma 4.1.1.

Now we are ready to prove the following theorem.

Theorem 4.1.2. For ε > 0, integers s, n, m_1, ..., m_s ≥ 1 and sufficiently large prime power q ≥ (n/ε)^{O(1)}, the function BlkCnvt_{n,(m_1,...,m_s),q} is a (k, (k_1, ..., k_s), 3sε, q) block source converter of degree n, where k = Σ_{i=1}^s m_i + log_q(1/ε) and each k_i = 0.99m_i.

Proof. The degree of BlkCnvt_{n,(m_1,...,m_s),q} is n since RSCon_{n,m,q} has degree n. Let X be a random source that has q-ary min-entropy k. Let Y_1, ..., Y_s be independent seeds uniformly distributed over F_q. Let Z = (Z_1, ..., Z_s) = BlkCnvt_{n,(m_1,...,m_s),q}(X, (Y_1, ..., Y_s)), where each Z_i = RSCon_{n,m_i,q}(X, Y_i) is distributed over F_q^{m_i}. Define

  B = {(z_1, ..., z_i) : i ∈ [s], (z_1, ..., z_i) ∈ supp(Z_1, ..., Z_i), X|_{Z_1=z_1,...,Z_i=z_i} does not have q-ary min-entropy k − (m_1 + ... + m_i) − log_q(1/ε)}.

Define a new distribution Z' = (Z'_1, ..., Z'_s) as follows: Sample z = (z_1, ..., z_s) ← Z and independently u = (u_1, ..., u_s) ← U_{m_1+...+m_s,q}. If there exists i ∈ [s] such that (z_1, ..., z_{i−1}) ∈ B, then pick the smallest such i and let z' = (z_1, ..., z_{i−1}, u_i, ..., u_s). Otherwise let z' = z. Let Z' be the distribution of z'.

For any i ∈ [s] and (z_1, ..., z_{i−1}) ∈ supp(Z'_1, ..., Z'_{i−1}), if some prefix of (z_1, ..., z_{i−1}) is in B then Z'_i|_{Z'_1=z_1,...,Z'_{i−1}=z_{i−1}} is the uniform distribution U_{m_i,q}; otherwise Z'_i|_{Z'_1=z_1,...,Z'_{i−1}=z_{i−1}} = Z_i|_{Z_1=z_1,...,Z_{i−1}=z_{i−1}}. In the second case, X|_{Z_1=z_1,...,Z_{i−1}=z_{i−1}} has q-ary min-entropy k − (m_1 + ... + m_{i−1}) − log_q(1/ε) ≥ m_i since (z_1, ..., z_{i−1}) ∉ B. In this case, Z'_i|_{Z'_1=z_1,...,Z'_{i−1}=z_{i−1}} is ε-close to a distribution with q-ary min-entropy k_i by Theorem 4.1.1 and the fact

  Z'_i|_{Z'_1=z_1,...,Z'_{i−1}=z_{i−1}} = Z_i|_{Z_1=z_1,...,Z_{i−1}=z_{i−1}} = RSCon_{n,m_i,q}(X|_{Z_1=z_1,...,Z_{i−1}=z_{i−1}}, Y_i).

In either case Z'_i|_{Z'_1=z_1,...,Z'_{i−1}=z_{i−1}} is ε-close to a distribution with q-ary min-entropy k_i. By Lemma 4.1.2, Z' is 2sε-close to a (k_1, ..., k_s) q-ary block source. It remains to prove that Z is sε-close to Z', which implies that Z is 3sε-close to a (k_1, ..., k_s) q-ary block source. By Lemma 2.0.8, for any i ∈ [s], we have Pr[(Z_1, ..., Z_{i−1}) ∈

B] ≤ ε. So the probability that (Z_1, ..., Z_{i−1}) ∈ B for some i ∈ [s] is bounded by sε. Note that the distribution Z' is obtained from Z by redistributing the weights of those (z_1, ..., z_s) satisfying (z_1, ..., z_{i−1}) ∈ B for some i. We conclude that Δ(Z, Z') ≤ sε, as desired.

4.1.2 Block source extraction

We will employ Lemma 2.0.9 and compose some basic extractors to get a block source extractor. These basic extractors are given by the basic line samplers Line_{m,q} (see Definition 2.0.4).

Lemma 4.1.3. For ε > 0, m ≥ 1 and prime power q, Line_{m,q} is a (k, ε, q) extractor of degree 2, where k = 2m − 1 + 3 log_q(2/ε).

Proof. Apply Lemma 2.0.6 and Theorem 3.1.1.

Suppose F_Q is an extension field of F_q with [F_Q : F_q] = d, i.e., Q = q^d. By Lemma 2.0.1, Line_{m,Q} : F_Q^{2m} × F_Q → F_Q^m, as a degree-2 manifold over F_Q, can also be viewed as a degree-2 manifold over F_q: Line_{m,Q} : F_q^{2md} × F_q^d → F_q^{md}.

Now we are ready to state the main result of this section. We first compose the basic line samplers to get a block source extractor. It is then applied to a block source obtained from the block source converter.

Definition (Outer Sampler). For δ > 0, m = 2^s and prime power q, let n = 4m + log_q(2/δ), d = s + 1, and d_i = 2^{s−i} for i ∈ [s]. For i ∈ [s], view Line_{2,q^{d_i}} : F_{q^{d_i}}^4 × F_{q^{d_i}} → F_{q^{d_i}}^2 as a manifold over F_q: Line_{2,q^{d_i}} : F_q^{4d_i} × F_q^{d_i} → F_q^{2d_i}. Composing these line samplers Line_{2,q^{d_i}} for i ∈ [s] gives the function BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}}) : F_q^{4d_1+...+4d_s} × F_q → F_q^m. Finally, define OuterSamp_{m,δ,q} : F_q^n × F_q^d → F_q^m by

  OuterSamp_{m,δ,q}(x, (y, y')) := BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}})(BlkCnvt_{n,(4d_1,...,4d_s),q}(x, y), y')

for x ∈ F_q^n, y ∈ F_q^s and y' ∈ F_q. See Figure 4.1 for an illustration of the above definition.

Theorem. For any ε, δ > 0, integer m ≥ 1, and sufficiently large prime power q ≥ (n/ε)^{O(1)}, OuterSamp_{m,δ,q} is an (ε, δ) sampler of degree t, where d = O(log m), n = O(m + log_q(1/δ)) and t = O(m² + m log_q(1/δ)).

Proof. We first show that OuterSamp_{m,δ,q} is a (4m, ε, q) extractor. Consider any random source X over F_q^n with q-ary min-entropy 4m. Let s, d_i be as in Definition (Outer Sampler). Let k_i = 0.99 · 4d_i for i ∈ [s]. Let ε_0 = ε/(4s).

[Figure 4.1: The extractor OuterExt_{n,k,q} that takes the random source X together with the seed (Y_1, ..., Y_s, Y') and then outputs Z. The condensers RSCon produce the blocks X_1, ..., X_s from X and Y_1, ..., Y_s, and these blocks are fed into BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}}) with seed Y' = Y_s.]

We have (Σ_{i=1}^s 4d_i) + log_q(1/ε_0) ≤ 4m for sufficiently large q ≥ (n/ε)^{O(1)}. So by Theorem 4.1.2, BlkCnvt_{n,(4d_1,...,4d_s),q} is a (4m, (k_1, ..., k_s), 3sε_0, q) block source converter. Therefore the distribution BlkCnvt_{n,(4d_1,...,4d_s),q}(X, U_{s,q}) is 3sε_0-close to a (k_1, ..., k_s) q-ary block source X'. Then OuterSamp_{m,δ,q}(X, U_{d,q}) is 3sε_0-close to BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}})(X', U_{1,q}).

By Lemma 4.1.3, Line_{2,q^{d_i}} is a (k_i/d_i, ε_0, q^{d_i}) extractor for i ∈ [s], since its q^{d_i}-ary min-entropy threshold given by Lemma 4.1.3 is at most k_i/d_i = 3.96 for sufficiently large q. Equivalently, it is a (k_i, ε_0, q) extractor. By Lemma 2.0.9, BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}}) is a ((k_1, ..., k_s), sε_0, q) block source extractor. Therefore BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}})(X', U_{1,q}) is sε_0-close to U_{m,q}, which by the previous paragraph implies that OuterSamp_{m,δ,q}(X, U_{d,q}) is 4sε_0-close to U_{m,q}. By definition, OuterSamp_{m,δ,q} is a (4m, ε, q) extractor. By Theorem 3.1.1, it is also an (ε, δ) sampler.

We have d = s + 1 = O(log m) and n = O(m + log_q(1/δ)). By Lemma 2.0.1, each Line_{2,q^{d_i}} has degree 2 as a manifold over F_q. Therefore by Lemma 2.0.9, BlkExt(Line_{2,q^{d_1}}, ..., Line_{2,q^{d_s}})

has degree 2^s. By Theorem 4.1.2, BlkCnvt_{n,(4d_1,...,4d_s),q} has degree n. Therefore OuterSamp_{m,δ,q} has degree n · 2^s = O(m² + m log_q(1/δ)).

Remark 5. We assumed m is a power of 2 above. For general m, simply pick m' = 2^{⌈log m⌉} and let OuterSamp_{m,δ,q} be the composition of OuterSamp_{m',δ,q} with the projection π : F_q^{m'} → F_q^m onto the first m coordinates. This yields an (ε, δ) sampler of degree t for F_q^m, since π is linear, and approximating the density of a subset A in F_q^m is equivalent to approximating the density of π^{−1}(A) in F_q^{m'}.

Remark 6. The most important properties of the extractors Line_{2,q^{d_i}} used here are (1) they work for a certain constant min-entropy rate, and (2) the seed is shorter than the output by a constant factor. As the reader can check, besides the basic line samplers, we may also use the randomness-efficient line samplers given by [MR06], or the (strong) extractors from the universal family of hash functions {h_{a,b} : x ↦ ax + b} [CW79] (operations are performed in a finite field) together with the leftover hash lemma [ILL89], etc.

The sampler OuterSamp_{m,δ,q} has the optimal randomness complexity O(m log q + log(1/δ)), yet its sample complexity is sub-optimal, being q^d with d = O(log m) instead of q. We will fix this problem by composing it with an inner sampler that has the optimal sample complexity.

4.2 Inner sampler

We will construct a curve sampler of low degree in this section, or what we call the inner sampler. It might be viewed as an extractor with optimal seed length, even though it only extracts a tiny fraction of the min-entropy from the random source. The construction will be based on two techniques called error reduction and iterated sampling.

4.2.1 Error reduction

Condensers are at the core of many extractor constructions [RSW06, TSUZ07, GUV09, TSU12]. In the language of samplers, the use of condensers can be regarded as an error reduction technique, as we shall see below. Given a function f : F_q^n × F_q^d → F_q^m, define

  LIST_f(T, ε) := {x ∈ F_q^n : Pr_y[f(x, y) ∈ T] > ε}

for any T ⊆ F_q^m and ε > 0. We are interested in functions f exhibiting a list-recoverability property: the size of LIST_f(T, ε) stays small when T is not too large.

Definition. A function f : F_q^n × F_q → F_q^m is (ε, L, H) list-recoverable if |LIST_f(T, ε)| ≤ H for all T ⊆ F_q^m of size at most L.
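For toy parameters, LIST_f can be computed by brute force, which makes the definition easy to experiment with. A sketch (ours):

    # Brute-force LIST_f(T, eps) = {x : Pr_y[f(x, y) in T] > eps}.
    from itertools import product

    def LIST(f, q, n, T, eps):
        result = []
        for x in product(range(q), repeat=n):
            hits = sum(f(x, y) in T for y in range(q))
            if hits / q > eps:
                result.append(x)
        return result

    # f is (eps, L, H) list-recoverable iff len(LIST(f, q, n, T, eps)) <= H
    # for every T of size at most L.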


More information

Testing Monotone High-Dimensional Distributions

Testing Monotone High-Dimensional Distributions Testing Monotone High-Dimensional Distributions Ronitt Rubinfeld Computer Science & Artificial Intelligence Lab. MIT Cambridge, MA 02139 ronitt@theory.lcs.mit.edu Rocco A. Servedio Department of Computer

More information

Low Degree Test with Polynomially Small Error

Low Degree Test with Polynomially Small Error Low Degree Test with Polynomially Small Error Dana Moshkovitz January 31, 2016 Abstract A long line of work in Theoretical Computer Science shows that a function is close to a low degree polynomial iff

More information

Homework #2 Solutions Due: September 5, for all n N n 3 = n2 (n + 1) 2 4

Homework #2 Solutions Due: September 5, for all n N n 3 = n2 (n + 1) 2 4 Do the following exercises from the text: Chapter (Section 3):, 1, 17(a)-(b), 3 Prove that 1 3 + 3 + + n 3 n (n + 1) for all n N Proof The proof is by induction on n For n N, let S(n) be the statement

More information

Introduction Long transparent proofs The real PCP theorem. Real Number PCPs. Klaus Meer. Brandenburg University of Technology, Cottbus, Germany

Introduction Long transparent proofs The real PCP theorem. Real Number PCPs. Klaus Meer. Brandenburg University of Technology, Cottbus, Germany Santaló s Summer School, Part 3, July, 2012 joint work with Martijn Baartse (work supported by DFG, GZ:ME 1424/7-1) Outline 1 Introduction 2 Long transparent proofs for NP R 3 The real PCP theorem First

More information

Lecture 3: Randomness in Computation

Lecture 3: Randomness in Computation Great Ideas in Theoretical Computer Science Summer 2013 Lecture 3: Randomness in Computation Lecturer: Kurt Mehlhorn & He Sun Randomness is one of basic resources and appears everywhere. In computer science,

More information

Lecture 2: Minimax theorem, Impagliazzo Hard Core Lemma

Lecture 2: Minimax theorem, Impagliazzo Hard Core Lemma Lecture 2: Minimax theorem, Impagliazzo Hard Core Lemma Topics in Pseudorandomness and Complexity Theory (Spring 207) Rutgers University Swastik Kopparty Scribe: Cole Franks Zero-sum games are two player

More information

IITM-CS6845: Theory Toolkit February 3, 2012

IITM-CS6845: Theory Toolkit February 3, 2012 IITM-CS6845: Theory Toolkit February 3, 2012 Lecture 4 : Derandomizing the logspace algorithm for s-t connectivity Lecturer: N S Narayanaswamy Scribe: Mrinal Kumar Lecture Plan:In this lecture, we will

More information

Lecture 3: Lower bound on statistically secure encryption, extractors

Lecture 3: Lower bound on statistically secure encryption, extractors CS 7880 Graduate Cryptography September, 015 Lecture 3: Lower bound on statistically secure encryption, extractors Lecturer: Daniel Wichs Scribe: Giorgos Zirdelis 1 Topics Covered Statistical Secrecy Randomness

More information

Two Query PCP with Sub-Constant Error

Two Query PCP with Sub-Constant Error Electronic Colloquium on Computational Complexity, Report No 71 (2008) Two Query PCP with Sub-Constant Error Dana Moshkovitz Ran Raz July 28, 2008 Abstract We show that the N P-Complete language 3SAT has

More information

Topics in Probabilistic Combinatorics and Algorithms Winter, Basic Derandomization Techniques

Topics in Probabilistic Combinatorics and Algorithms Winter, Basic Derandomization Techniques Topics in Probabilistic Combinatorics and Algorithms Winter, 016 3. Basic Derandomization Techniques Definition. DTIME(t(n)) : {L : L can be decided deterministically in time O(t(n)).} EXP = { L: L can

More information

Lecture 03: Polynomial Based Codes

Lecture 03: Polynomial Based Codes Lecture 03: Polynomial Based Codes Error-Correcting Codes (Spring 016) Rutgers University Swastik Kopparty Scribes: Ross Berkowitz & Amey Bhangale 1 Reed-Solomon Codes Reed Solomon codes are large alphabet

More information

CSE 190, Great ideas in algorithms: Pairwise independent hash functions

CSE 190, Great ideas in algorithms: Pairwise independent hash functions CSE 190, Great ideas in algorithms: Pairwise independent hash functions 1 Hash functions The goal of hash functions is to map elements from a large domain to a small one. Typically, to obtain the required

More information

Majority is incompressible by AC 0 [p] circuits

Majority is incompressible by AC 0 [p] circuits Majority is incompressible by AC 0 [p] circuits Igor Carboni Oliveira Columbia University Joint work with Rahul Santhanam (Univ. Edinburgh) 1 Part 1 Background, Examples, and Motivation 2 Basic Definitions

More information

Lecture 12: Constant Degree Lossless Expanders

Lecture 12: Constant Degree Lossless Expanders Lecture : Constant Degree Lossless Expanders Topics in Complexity Theory and Pseudorandomness (Spring 03) Rutgers University Swastik Kopparty Scribes: Meng Li, Yun Kuen Cheung Overview In this lecture,

More information

Chapter 11. Min Cut Min Cut Problem Definition Some Definitions. By Sariel Har-Peled, December 10, Version: 1.

Chapter 11. Min Cut Min Cut Problem Definition Some Definitions. By Sariel Har-Peled, December 10, Version: 1. Chapter 11 Min Cut By Sariel Har-Peled, December 10, 013 1 Version: 1.0 I built on the sand And it tumbled down, I built on a rock And it tumbled down. Now when I build, I shall begin With the smoke from

More information

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets

Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Cell-Probe Lower Bounds for Prefix Sums and Matching Brackets Emanuele Viola July 6, 2009 Abstract We prove that to store strings x {0, 1} n so that each prefix sum a.k.a. rank query Sumi := k i x k can

More information

CS 6820 Fall 2014 Lectures, October 3-20, 2014

CS 6820 Fall 2014 Lectures, October 3-20, 2014 Analysis of Algorithms Linear Programming Notes CS 6820 Fall 2014 Lectures, October 3-20, 2014 1 Linear programming The linear programming (LP) problem is the following optimization problem. We are given

More information

Algebra Review 2. 1 Fields. A field is an extension of the concept of a group.

Algebra Review 2. 1 Fields. A field is an extension of the concept of a group. Algebra Review 2 1 Fields A field is an extension of the concept of a group. Definition 1. A field (F, +,, 0 F, 1 F ) is a set F together with two binary operations (+, ) on F such that the following conditions

More information

Handout 5. α a1 a n. }, where. xi if a i = 1 1 if a i = 0.

Handout 5. α a1 a n. }, where. xi if a i = 1 1 if a i = 0. Notes on Complexity Theory Last updated: October, 2005 Jonathan Katz Handout 5 1 An Improved Upper-Bound on Circuit Size Here we show the result promised in the previous lecture regarding an upper-bound

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

Lebesgue Measure on R n

Lebesgue Measure on R n CHAPTER 2 Lebesgue Measure on R n Our goal is to construct a notion of the volume, or Lebesgue measure, of rather general subsets of R n that reduces to the usual volume of elementary geometrical sets

More information

Notes 6 : First and second moment methods

Notes 6 : First and second moment methods Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative

More information

Some notes on streaming algorithms continued

Some notes on streaming algorithms continued U.C. Berkeley CS170: Algorithms Handout LN-11-9 Christos Papadimitriou & Luca Trevisan November 9, 016 Some notes on streaming algorithms continued Today we complete our quick review of streaming algorithms.

More information

CSC 5170: Theory of Computational Complexity Lecture 5 The Chinese University of Hong Kong 8 February 2010

CSC 5170: Theory of Computational Complexity Lecture 5 The Chinese University of Hong Kong 8 February 2010 CSC 5170: Theory of Computational Complexity Lecture 5 The Chinese University of Hong Kong 8 February 2010 So far our notion of realistic computation has been completely deterministic: The Turing Machine

More information

Testing random variables for independence and identity

Testing random variables for independence and identity Testing random variables for independence and identity Tuğkan Batu Eldar Fischer Lance Fortnow Ravi Kumar Ronitt Rubinfeld Patrick White January 10, 2003 Abstract Given access to independent samples of

More information

HARDNESS AMPLIFICATION VIA SPACE-EFFICIENT DIRECT PRODUCTS

HARDNESS AMPLIFICATION VIA SPACE-EFFICIENT DIRECT PRODUCTS HARDNESS AMPLIFICATION VIA SPACE-EFFICIENT DIRECT PRODUCTS Venkatesan Guruswami and Valentine Kabanets Abstract. We prove a version of the derandomized Direct Product lemma for deterministic space-bounded

More information

Theorem 5.3. Let E/F, E = F (u), be a simple field extension. Then u is algebraic if and only if E/F is finite. In this case, [E : F ] = deg f u.

Theorem 5.3. Let E/F, E = F (u), be a simple field extension. Then u is algebraic if and only if E/F is finite. In this case, [E : F ] = deg f u. 5. Fields 5.1. Field extensions. Let F E be a subfield of the field E. We also describe this situation by saying that E is an extension field of F, and we write E/F to express this fact. If E/F is a field

More information

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor)

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Matija Vidmar February 7, 2018 1 Dynkin and π-systems Some basic

More information

Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes

Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes Introduction to Coding Theory CMU: Spring 010 Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes April 010 Lecturer: Venkatesan Guruswami Scribe: Venkat Guruswami & Ali Kemal Sinop DRAFT

More information

On the Complexity of Approximating the VC Dimension

On the Complexity of Approximating the VC Dimension University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 12-2002 On the Complexity of Approximating the VC Dimension Elchanan Mossel University of Pennsylvania Christopher

More information

Noisy Interpolating Sets for Low Degree Polynomials

Noisy Interpolating Sets for Low Degree Polynomials Noisy Interpolating Sets for Low Degree Polynomials Zeev Dvir Amir Shpilka Abstract A Noisy Interpolating Set (NIS) for degree d polynomials is a set S F n, where F is a finite field, such that any degree

More information

Simulating BPP Using a General Weak Random Source

Simulating BPP Using a General Weak Random Source Simulating BPP Using a General Weak Random Source David Zuckerman Dept. of Computer Sciences The University of Texas at Austin Austin, TX 78712 diz@cs.utexas.edu February 21, 1995 Abstract We show how

More information

: Error Correcting Codes. October 2017 Lecture 1

: Error Correcting Codes. October 2017 Lecture 1 03683072: Error Correcting Codes. October 2017 Lecture 1 First Definitions and Basic Codes Amnon Ta-Shma and Dean Doron 1 Error Correcting Codes Basics Definition 1. An (n, K, d) q code is a subset of

More information

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Randomized Algorithms Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized

More information

IMPROVING THE ALPHABET-SIZE IN EXPANDER BASED CODE CONSTRUCTIONS

IMPROVING THE ALPHABET-SIZE IN EXPANDER BASED CODE CONSTRUCTIONS IMPROVING THE ALPHABET-SIZE IN EXPANDER BASED CODE CONSTRUCTIONS 1 Abstract Various code constructions use expander graphs to improve the error resilience. Often the use of expanding graphs comes at the

More information

Counting Matrices Over a Finite Field With All Eigenvalues in the Field

Counting Matrices Over a Finite Field With All Eigenvalues in the Field Counting Matrices Over a Finite Field With All Eigenvalues in the Field Lisa Kaylor David Offner Department of Mathematics and Computer Science Westminster College, Pennsylvania, USA kaylorlm@wclive.westminster.edu

More information

Connectedness. Proposition 2.2. The following are equivalent for a topological space (X, T ).

Connectedness. Proposition 2.2. The following are equivalent for a topological space (X, T ). Connectedness 1 Motivation Connectedness is the sort of topological property that students love. Its definition is intuitive and easy to understand, and it is a powerful tool in proofs of well-known results.

More information

On the Power of the Randomized Iterate

On the Power of the Randomized Iterate On the Power of the Randomized Iterate Iftach Haitner Danny Harnik Omer Reingold August 21, 2006 Abstract We consider two of the most fundamental theorems in Cryptography. The first, due to Håstad et.

More information

Almost k-wise independence versus k-wise independence

Almost k-wise independence versus k-wise independence Almost k-wise independence versus k-wise independence Noga Alon Sackler Faculty of Exact Sciences Tel Aviv University Ramat-Aviv, Israel. nogaa@post.tau.ac.il Yishay Mansour School of Computer Science

More information

BALANCING GAUSSIAN VECTORS. 1. Introduction

BALANCING GAUSSIAN VECTORS. 1. Introduction BALANCING GAUSSIAN VECTORS KEVIN P. COSTELLO Abstract. Let x 1,... x n be independent normally distributed vectors on R d. We determine the distribution function of the minimum norm of the 2 n vectors

More information

Limits to List Decoding Random Codes

Limits to List Decoding Random Codes Limits to List Decoding Random Codes Atri Rudra Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY, 14620. atri@cse.buffalo.edu Abstract

More information

Notes on Discrete Probability

Notes on Discrete Probability Columbia University Handout 3 W4231: Analysis of Algorithms September 21, 1999 Professor Luca Trevisan Notes on Discrete Probability The following notes cover, mostly without proofs, the basic notions

More information

On Recycling the Randomness of the States in Space Bounded Computation

On Recycling the Randomness of the States in Space Bounded Computation On Recycling the Randomness of the States in Space Bounded Computation Preliminary Version Ran Raz Omer Reingold Abstract Let M be a logarithmic space Turing machine (or a polynomial width branching program)

More information

The Banach-Tarski paradox

The Banach-Tarski paradox The Banach-Tarski paradox 1 Non-measurable sets In these notes I want to present a proof of the Banach-Tarski paradox, a consequence of the axiom of choice that shows us that a naive understanding of the

More information

Lecture 3: Lower Bounds for Bandit Algorithms

Lecture 3: Lower Bounds for Bandit Algorithms CMSC 858G: Bandits, Experts and Games 09/19/16 Lecture 3: Lower Bounds for Bandit Algorithms Instructor: Alex Slivkins Scribed by: Soham De & Karthik A Sankararaman 1 Lower Bounds In this lecture (and

More information

Notes on ordinals and cardinals

Notes on ordinals and cardinals Notes on ordinals and cardinals Reed Solomon 1 Background Terminology We will use the following notation for the common number systems: N = {0, 1, 2,...} = the natural numbers Z = {..., 2, 1, 0, 1, 2,...}

More information

Answering Many Queries with Differential Privacy

Answering Many Queries with Differential Privacy 6.889 New Developments in Cryptography May 6, 2011 Answering Many Queries with Differential Privacy Instructors: Shafi Goldwasser, Yael Kalai, Leo Reyzin, Boaz Barak, and Salil Vadhan Lecturer: Jonathan

More information

Lecture 3: AC 0, the switching lemma

Lecture 3: AC 0, the switching lemma Lecture 3: AC 0, the switching lemma Topics in Complexity Theory and Pseudorandomness (Spring 2013) Rutgers University Swastik Kopparty Scribes: Meng Li, Abdul Basit 1 Pseudorandom sets We start by proving

More information

be any ring homomorphism and let s S be any element of S. Then there is a unique ring homomorphism

be any ring homomorphism and let s S be any element of S. Then there is a unique ring homomorphism 21. Polynomial rings Let us now turn out attention to determining the prime elements of a polynomial ring, where the coefficient ring is a field. We already know that such a polynomial ring is a UFD. Therefore

More information

arxiv: v1 [cs.cc] 29 Feb 2012

arxiv: v1 [cs.cc] 29 Feb 2012 On the Distribution of the Fourier Spectrum of Halfspaces Ilias Diakonikolas 1, Ragesh Jaiswal 2, Rocco A. Servedio 3, Li-Yang Tan 3, and Andrew Wan 4 arxiv:1202.6680v1 [cs.cc] 29 Feb 2012 1 University

More information

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra

Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra Course 311: Michaelmas Term 2005 Part III: Topics in Commutative Algebra D. R. Wilkins Contents 3 Topics in Commutative Algebra 2 3.1 Rings and Fields......................... 2 3.2 Ideals...............................

More information

The Randomness Complexity of Parallel Repetition

The Randomness Complexity of Parallel Repetition The Randomness Complexity of Parallel Repetition Kai-Min Chung Rafael Pass September 1, 2011 Abstract Consider a m-round interactive protocol with soundness error 1/2. How much extra randomness is required

More information

Notes for Lecture 11

Notes for Lecture 11 Stanford University CS254: Computational Complexity Notes 11 Luca Trevisan 2/11/2014 Notes for Lecture 11 Circuit Lower Bounds for Parity Using Polynomials In this lecture we prove a lower bound on the

More information

Lecture 8 : Eigenvalues and Eigenvectors

Lecture 8 : Eigenvalues and Eigenvectors CPS290: Algorithmic Foundations of Data Science February 24, 2017 Lecture 8 : Eigenvalues and Eigenvectors Lecturer: Kamesh Munagala Scribe: Kamesh Munagala Hermitian Matrices It is simpler to begin with

More information

COUNTING NUMERICAL SEMIGROUPS BY GENUS AND SOME CASES OF A QUESTION OF WILF

COUNTING NUMERICAL SEMIGROUPS BY GENUS AND SOME CASES OF A QUESTION OF WILF COUNTING NUMERICAL SEMIGROUPS BY GENUS AND SOME CASES OF A QUESTION OF WILF NATHAN KAPLAN Abstract. The genus of a numerical semigroup is the size of its complement. In this paper we will prove some results

More information

Notes on the Dual Ramsey Theorem

Notes on the Dual Ramsey Theorem Notes on the Dual Ramsey Theorem Reed Solomon July 29, 2010 1 Partitions and infinite variable words The goal of these notes is to give a proof of the Dual Ramsey Theorem. This theorem was first proved

More information

1 Lecture 6-7, Scribe: Willy Quach

1 Lecture 6-7, Scribe: Willy Quach Special Topics in Complexity Theory, Fall 2017. Instructor: Emanuele Viola 1 Lecture 6-7, Scribe: Willy Quach In these lectures, we introduce k-wise indistinguishability and link this notion to the approximate

More information

Winkler s Hat Guessing Game: Better Results for Imbalanced Hat Distributions

Winkler s Hat Guessing Game: Better Results for Imbalanced Hat Distributions arxiv:1303.705v1 [math.co] 8 Mar 013 Winkler s Hat Guessing Game: Better Results for Imbalanced Hat Distributions Benjamin Doerr Max-Planck-Institute for Informatics 6613 Saarbrücken Germany April 5, 018

More information

INFINITE RINGS WITH PLANAR ZERO-DIVISOR GRAPHS

INFINITE RINGS WITH PLANAR ZERO-DIVISOR GRAPHS INFINITE RINGS WITH PLANAR ZERO-DIVISOR GRAPHS YONGWEI YAO Abstract. For any commutative ring R that is not a domain, there is a zerodivisor graph, denoted Γ(R), in which the vertices are the nonzero zero-divisors

More information

11.1 Set Cover ILP formulation of set cover Deterministic rounding

11.1 Set Cover ILP formulation of set cover Deterministic rounding CS787: Advanced Algorithms Lecture 11: Randomized Rounding, Concentration Bounds In this lecture we will see some more examples of approximation algorithms based on LP relaxations. This time we will use

More information

Finite fields, randomness and complexity. Swastik Kopparty Rutgers University

Finite fields, randomness and complexity. Swastik Kopparty Rutgers University Finite fields, randomness and complexity Swastik Kopparty Rutgers University This talk Three great problems: Polynomial factorization Epsilon-biased sets Function uncorrelated with low-degree polynomials

More information

Incompressible Functions, Relative-Error Extractors, and the Power of Nondeterminsitic Reductions

Incompressible Functions, Relative-Error Extractors, and the Power of Nondeterminsitic Reductions Electronic Colloquium on Computational Complexity, Revision 1 of Report No. 51 (2015) Incompressible Functions, Relative-Error Extractors, and the Power of Nondeterminsitic Reductions Benny Applebaum Sergei

More information

Noisy Interpolating Sets for Low Degree Polynomials

Noisy Interpolating Sets for Low Degree Polynomials Noisy Interpolating Sets for Low Degree Polynomials Zeev Dvir Amir Shpilka Abstract A Noisy Interpolating Set (NIS) for degree-d polynomials is a set S F n, where F is a finite field, such that any degree-d

More information

Continued fractions for complex numbers and values of binary quadratic forms

Continued fractions for complex numbers and values of binary quadratic forms arxiv:110.3754v1 [math.nt] 18 Feb 011 Continued fractions for complex numbers and values of binary quadratic forms S.G. Dani and Arnaldo Nogueira February 1, 011 Abstract We describe various properties

More information

CSC 5170: Theory of Computational Complexity Lecture 9 The Chinese University of Hong Kong 15 March 2010

CSC 5170: Theory of Computational Complexity Lecture 9 The Chinese University of Hong Kong 15 March 2010 CSC 5170: Theory of Computational Complexity Lecture 9 The Chinese University of Hong Kong 15 March 2010 We now embark on a study of computational classes that are more general than NP. As these classes

More information

Notes on Complex Analysis

Notes on Complex Analysis Michael Papadimitrakis Notes on Complex Analysis Department of Mathematics University of Crete Contents The complex plane.. The complex plane...................................2 Argument and polar representation.........................

More information

Model Counting for Logical Theories

Model Counting for Logical Theories Model Counting for Logical Theories Wednesday Dmitry Chistikov Rayna Dimitrova Department of Computer Science University of Oxford, UK Max Planck Institute for Software Systems (MPI-SWS) Kaiserslautern

More information

Notes on Complexity Theory Last updated: December, Lecture 27

Notes on Complexity Theory Last updated: December, Lecture 27 Notes on Complexity Theory Last updated: December, 2011 Jonathan Katz Lecture 27 1 Space-Bounded Derandomization We now discuss derandomization of space-bounded algorithms. Here non-trivial results can

More information