A List-Decodable Code with Local Encoding and Decoding

Marius Zimand
Towson University, Department of Computer and Information Sciences, Baltimore, MD
http://triton.towson.edu/~mzimand

Abstract

For arbitrary constants ε > 0 and λ > 0 we present a code E : {0,1}^n → {0,1}^n̄ such that n̄ = n^{O(log(1/ε))} and every ball in {0,1}^n̄ of radius (1/2 − ε)·n̄ (in the Hamming-distance sense) contains at most 2^{λn} codewords. Furthermore, the code E has encoding and list-decoding algorithms that produce each bit of their output in time polylog(n).

1 Introduction

Error-correcting codes are commonly used for the transmission of information over noisy channels. A code is a set of strings, called codewords, that are pairwise far apart in the Hamming metric. Messages are encoded into codewords and the codewords are sent over the channel. If the received word has been corrupted and differs from the transmitted codeword in fewer positions than half of the code distance (defined to be the minimum Hamming distance between any two codewords), then it is possible to recover the correct codeword by determining the codeword closest to the received word. This is the case of unique decoding, which, as we have seen, is possible only if the number of errors is less than half the distance of the code. What if the number of errors is larger? The corrupted received word may still provide valuable information about the correct transmitted codeword. Indeed, for some codes there may be just a few codewords in a relatively large vicinity of any (received) word, and in this case, if the number of errors is smaller than the vicinity radius, it is possible to produce a relatively short list of candidates for the correct transmitted codeword. Codes with this property are called list-decodable codes.
List-decodable codes have numerous other applications, identified relatively recently, in areas such as computational complexity (hardness amplification; see the survey paper [8]), cryptography (constructing hard-core predicates; see the same survey [8]), and probabilistic algorithms (constructing extractors; see [7]). The combinatorial quality of a list-decodable code is given by two parameters: ε, which represents the fraction of errors that still allows the construction of the list of candidates, and L, the size of the list of candidates. More precisely, a code is (ε, L) list-decodable if every string z has at most L codewords that differ from z in at most an ε fraction of positions. The goal is to build codes with large ε and small L. The efficiency of the encoding and of the list-decoding procedures is another important attribute of such codes. Recently, Guruswami and Indyk [4] obtained the important result that there exist (1 − ε, O(n)) list-decodable codes with linear-time encoding and list-decoding. The alphabet size of their code depends on ε, which is somewhat undesirable. In this paper we consider binary codes, i.e., we fix the alphabet to be the binary alphabet {0, 1}, and we aim at codes that allow list-decoding recovery when the fraction of errors is (1/2) − ε, which is the case of interest for most applications (and also, roughly speaking, the case for which it is reasonable to hope for meaningful recovery; see [3]). We seek codes that admit sublinear-time encoding and list-decoding (however, the quality of the combinatorial parameters will have to suffer because of known lower bounds, as we discuss below).
More precisely, we look for codes for which: (1) each bit of the codeword can be constructed, given random access to the message, in time polynomial in the length of the index of the bit (this is called local encoding), and (2) each bit of a candidate for the correct message can be constructed, given random access to the received word, in time polynomial in the length of the index of the bit (this is called local list-decoding). Note that in case the codeword length is at most a polynomial in the length of the message, this amounts to encoding and list-decoding in polylog time with respect to the input length. The well-known Hadamard code has local encoding and a list-decoding algorithm that runs in time polylog in the codeword length; however, the codeword length is exponential in the length of the message. Viola [9] has shown lower bounds for list-decodable codes computable by constant-depth circuits, and they imply that the list size of a locally encodable code has to be essentially exponential (i.e., the parameter L has to be of the form 2^{λn} with λ = 1/log^c(n), for some constant c that depends on the circuit size). In this paper we get close to this bound. Namely, we show that for any ε > 0 and for any constant λ > 0, there is a ((1/2) − ε, 2^{λn}) list-decodable code with local encoding and local list-decoding algorithms. In a forthcoming paper we will show that the construction can be modified to yield λ = 1/log^c(n). A few words about the technique that we employ for building our code are in order. Ta-Shma and Zuckerman [7] have observed that list-decodable codes are closely related to extractors, which are procedures that distill nearly uniform random strings from imperfect randomness sources. In fact, list-decodable codes over the binary alphabet, which is the class of codes studied in this paper, are equivalent to a particular type of extractor, namely strong extractors that output a single bit. Recently, the author of this paper established [11] a new method for building extractors based on techniques used in the construction of pseudo-random generators from one-way permutations [1, 10] (the so-called Blum-Micali-Yao pseudo-random generators). The advantage of the extractors in [11] over the other extractors in the literature (for a recent survey on extractors, see [5]) resides in their simplicity and their extreme efficiency.
Specifically, one extractor in [11] produces each output bit individually in time polynomial in the length of the index of the bit, which is polylog in the input length. The code that we present in this paper is derived from the extractor in [11] by specializing some parameters to values adequate for list-decodable codes and by making some important simplifications allowed by the new parameters. For the reader's convenience, we give here a self-contained description that does not involve the concept of an extractor.

2 Definitions

In this paper we consider the binary alphabet {0, 1} (although our results extend to larger alphabets). Some standard notation: if x and y are binary strings, xy denotes the concatenation of the strings x and y, and |x| denotes the length of the string x. If A is a set, |A| denotes its cardinality. For any natural number k, [k] denotes the set {1, ..., k}. For a string x ∈ {0,1}^n, for some n ∈ N, and for any i ∈ [n], x(i) denotes the i-th bit of x. For two strings x, y ∈ {0,1}^n, for some n ∈ N, we use d(x, y) to denote the Hamming distance between x and y, which is the number of positions on which x and y differ. At the highest level of abstraction, a code is a set of strings in {0,1}^n. Since the emphasis in this paper is on the algorithmic aspects of codes, we view a code as a function C : {0,1}^k → {0,1}^n, for some k, n ∈ N. A string x ∈ {0,1}^k is called a message and C(x) is the codeword corresponding to the message x. The set of codewords of the code C is {C(x) | x ∈ {0,1}^k}. The parameter k is called the information rate of the code, and the parameter n is called the block length of the code. The distance d of the code is defined by d = min_{x, x′ ∈ {0,1}^k, x ≠ x′} d(C(x), C(x′)). The relative distance is δ = d/n. As mentioned in the Introduction, if a string z differs from a codeword C(x) in fewer than d/2 positions (these positions are called errors), one can retrieve (not necessarily efficiently) x from z.
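To make the definitions above concrete, here is a minimal sketch in Python. The 3-fold repetition code and all function names are our own illustration, not from the paper; it computes the Hamming distance d(x, y) and the distance d of a small code by brute force.

```python
from itertools import product

def hamming(x, y):
    """Hamming distance: number of positions where bit strings x and y differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

def repetition_code(msg):
    """Toy code C : {0,1}^k -> {0,1}^{3k}: repeat the whole message 3 times."""
    return msg * 3

def code_distance(code, k):
    """d = min over distinct messages x, x' of d(C(x), C(x'))."""
    msgs = ["".join(bits) for bits in product("01", repeat=k)]
    return min(hamming(code(x), code(y))
               for x in msgs for y in msgs if x != y)

print(hamming("0110", "1110"))           # -> 1
print(code_distance(repetition_code, 2)) # -> 3, so up to 1 error is uniquely decodable
```

With d = 3, any received word with at most one error has a unique closest codeword, matching the d/2 bound discussed above.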
By the Plotkin bound, binary codes cannot have relative distance δ ≥ 1/2 (unless they have very few codewords). It follows that binary codes allow the recovery of the correct codeword only if the fraction of errors is less than 1/4. If the fraction of errors is greater than or equal to 1/4, it may still be possible to have a useful form of recovery, which consists in producing a short list of strings (which we call candidates), one of which is the correct codeword. We are interested in the combinatorial and algorithmic attributes of a code C that allow this type of recovery. The combinatorial property is given in the following definition.

Definition 2.1 A code C : {0,1}^k → {0,1}^n is (ε, L) list-decodable if, for all z ∈ {0,1}^n, |{x ∈ {0,1}^k | d(z, C(x)) ≤ ε·n}| ≤ L.

The algorithmic attributes refer to the efficiency of calculating C(x) (encoding) and the efficiency of constructing the list of candidates given a string z ∈ {0,1}^n (list-decoding). We are seeking sublinear-time algorithms for both encoding and list-decoding. A sublinear-time algorithm cannot even scan its entire input. Therefore, for sublinear-time algorithms, the computational model that is used is that of a random access machine (RAM) in which every input bit can be accessed in one time unit. This type of access is also called oracle access. For an algorithm A and a binary string x, we write A^x to denote that the algorithm A has oracle access
to x. A sublinear-time algorithm also cannot write its entire output (when, as is the case for the encoding and list-decoding of interesting codes, the output length is at least polynomial in the input length). Therefore, we require that each bit of the output is calculated separately, given the index of that bit and, as we have just discussed, oracle access to the input string. In this paper we require that the calculations for both encoding and list-decoding take time polynomial in the length of the index that gives the position of the calculated bit within the output string. If the output length is polynomially related to the input length (which is the case in this paper), this means polylog time in the input length (which certainly is sublinear). The formal definitions are as follows.

Definition 2.2 A code C : {0,1}^k → {0,1}^n is locally encodable if there exists a polynomial-time algorithm A such that for all x ∈ {0,1}^k and for all i ∈ [n], A^x(i) = C(x)(i).

The definition of local list-decoding is more subtle. We use the standard version (see [6] or [8]).

Definition 2.3 A ((1/2) − ε, L) list-decodable code C : {0,1}^k → {0,1}^n is locally list-decodable if there exist algorithms U and V such that for each z ∈ {0,1}^n, U with oracle access to z produces L strings d_1, ..., d_L (these strings are called descriptor strings) such that for all x ∈ {0,1}^k with d(C(x), z) ≤ ((1/2) − ε)·n there exists j ∈ [L] such that, for all i ∈ [k], V^z(j, d_j, i) = x(i). The algorithms U and V are required to work in time polylog(k).

Thus, the algorithm U with oracle access to z (viewed here as the received string) produces some amount of information for each element in the list of candidates (i.e., a descriptor d_j for each candidate string y_j), and the algorithm V, using oracle access to z, the index j, and the descriptor d_j, calculates individually every bit of the candidate y_j. Of course, the list of candidates contains x for all x ∈ {0,1}^k with d(C(x), z) ≤ ((1/2) − ε)·n.
We will use the Hadamard error-correcting code Had : {0,1}^n → {0,1}^{2^n}, defined as follows: we index the bits of Had(x) in order with the strings r ∈ {0,1}^n and, for every r ∈ {0,1}^n, the r-th bit of Had(x) is given by the inner product x · r modulo 2, i.e., if x = x_1 ... x_n and r = r_1 ... r_n, with x_i and r_i bits, then x · r = x_1 r_1 + ... + x_n r_n (mod 2). The Hadamard code has a good list-decoding property (the combinatorial property) and is locally encodable and decodable by a probabilistic algorithm that runs in time polynomial in the message length. It has the shortcoming that its rate is exponentially small (i.e., it maps messages of length n into codewords of length 2^n). The properties of the list-decoding algorithm mentioned above are stated in the following theorem of Goldreich and Levin [2] (the variant given below is from Trevisan's survey paper [8]).

Theorem 2.4 Let Had : {0,1}^n → {0,1}^{2^n} be the Hadamard code and let ε > 0. There is a probabilistic algorithm A with oracle access to an arbitrary string z ∈ {0,1}^{2^n} that: runs in time O((1/ε^4)·n log n), uses O(n·log(1/ε)) random bits, outputs a list LIST(z) with O(1/ε^2) strings in {0,1}^n, and has the following property: for every x ∈ {0,1}^n such that d(Had(x), z) ≤ ((1/2) − ε)·2^n, with probability at least 3/4, it holds that x ∈ LIST(z).

Viola [9] (whose result is actually slightly more general) has shown that if a code C : {0,1}^k → {0,1}^n is (O(1), 2^m) list-decodable and computable by a circuit of depth d and size g, then log^{d−1} g = Ω(k/m). Note that if a code C : {0,1}^k → {0,1}^n is locally encodable, then each bit can be calculated by a circuit of depth 3 and size 2^{log^c(k)}, for some constant c. Thus, if n = poly(k), Viola's result implies that m = Ω(k / log^{2c} k). We show in this paper that for any constant λ > 0, there exists a code that is locally encodable and locally decodable and that is ((1/2) − ε, 2^{λk}) list-decodable, i.e., m = λk.
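The Hadamard code defined above is locally encodable in a particularly transparent way: each output bit is a single inner product mod 2, computable in O(n) time from the message. A minimal sketch (function names are ours; the full-codeword routine is for illustration only, since the codeword has length 2^n):

```python
def had_bit(x, r):
    """The r-th bit of Had(x): inner product of bit strings x and r over GF(2)."""
    assert len(x) == len(r)
    return sum(int(a) & int(b) for a, b in zip(x, r)) % 2

def had_codeword(x):
    """Full Hadamard codeword of length 2^n, bits indexed by r in {0,1}^n in order."""
    n = len(x)
    return "".join(str(had_bit(x, format(r, f"0{n}b"))) for r in range(2 ** n))

print(had_bit("101", "110"))   # 1*1 + 0*1 + 1*0 = 1 (mod 2)
print(had_codeword("101"))     # -> 01011010
```

Note that `had_bit` touches each message bit once, so local encoding is immediate, while writing out `had_codeword` takes time exponential in the message length, which is exactly the rate shortcoming mentioned above.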
Our construction can be modified to show that, for any constant c, there exists a ((1/2) − ε, 2^{λk}) list-decodable code that is locally encodable and locally list-decodable, where the encoding can be done by circuits of size g = 2^{O(log^{c+1} k)} and λ = 1/log^c k. Therefore, our method shows that Viola's lower bounds are essentially tight.

3 The Code

The parameter n ∈ N will be considered fixed throughout this section. We denote N = 2^n and N̄ = n·N. For two binary strings x and r of the same length, b(x, r) denotes the inner product of x and r viewed as vectors over the field GF(2). Note that b(x, r) is the r-th bit of Had(x), i.e., the r-th bit of the encoding of x with the Hadamard code. The message string X has length N̄. It is convenient to view it as the truth table of a function X : {0,1}^n → {0,1}^n. The code uses a parameter l ∈ N that depends on the parameters ε and L of the code. More precisely, l = O(log(1/ε)), where the hidden constant depends on L. We define X̄ : {0,1}^{ln} → {0,1}^{ln} by X̄(y_1 ... y_l) = X(y_1) ... X(y_l), i.e., X̄ is the l-fold direct product of X. We also denote ȳ = y_1 ... y_l. Each bit of the codeword E(X) will be indexed by (ȳ, r) ∈ {0,1}^{ln} × {0,1}^{ln}, and thus the length of the codeword E(X) is 2^{2ln}. We will denote 2ln by d. The codeword E(X) is defined by

E(X)((ȳ, r)) = b(X̄(ȳ), r).     (1)

Note that the code E, when viewed as a procedure that maps messages to codewords, is of type E : {0,1}^{N̄} → {0,1}^{2^d}, where 2^d = N^{2l} = N^{O(log(1/ε))}. Thus, for constant ε, the block length is polynomial in the information rate.

Theorem 3.1 Let n ∈ N and N̄ = n·2^n. Let λ > 0. There is a constant β > 0 such that for all ε > 0 with 1/ε ≤ 2^{βn}, for l = O((1/λ)·log(1/ε)), and for n sufficiently large, E is a ((1/2) − ε, 2^{λN̄}) list-decodable code. Furthermore, E is locally encodable and locally decodable.

Proof. We take δ = λ/3 and l = (3/δ)·log(2/ε) (the value of the constant β appearing in the statement of the theorem will be specified later). We also use the parameter w = 6·(1/δ)·(1/ε)·log(2/ε). It can be checked that the values that we have chosen for l and w imply l/w < ε − (1 − δ + e^{−n})^l. Let Z ∈ {0,1}^{2^d} be an arbitrary binary string (intuitively, it represents the received word). We want to estimate the number of codewords that agree with Z in a fraction of at least (1/2) + ε of the positions. Suppose X is a message such that E(X) agrees with Z in a fraction (1/2) + ε of the positions. We will show that X can be reconstructed from Z and some additional information that can be represented with λN̄ bits. From this it follows immediately that the number of messages X with the above property is bounded by 2^{λN̄}. This establishes the combinatorial property of the code E. The reconstruction procedure will also be the basis of the list-decoding algorithm, and in this way we will show the algorithmic properties of the code E. Therefore, we show first the claimed reconstruction procedure.
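As an aside before the reconstruction argument, equation (1) already exhibits the local encoder: the codeword bit indexed by (y, r), with y = (y_1, ..., y_l), needs only the l message blocks X(y_1), ..., X(y_l), so it is computable in O(ln) time given random access to X. A sketch of this (the oracle interface and the toy message are our own illustration, not from the paper):

```python
def E_bit(X_oracle, n, l, y_blocks, r):
    """Locally compute the (y, r)-th bit of the codeword E(X).

    X_oracle(y) returns the n-bit string X(y) (random access to the message);
    y_blocks is the tuple (y_1, ..., y_l) of n-bit strings; r has l*n bits.
    Reads only l blocks of the message, i.e., O(l*n) bits in total.
    """
    concat = "".join(X_oracle(y) for y in y_blocks)   # X(y_1) ... X(y_l)
    assert len(concat) == len(r) == l * n
    # b(., .): inner product over GF(2)
    return sum(int(a) & int(b) for a, b in zip(concat, r)) % 2

# Toy message: X maps each 2-bit string y to its reversal (our own example).
X = lambda y: y[::-1]
print(E_bit(X, n=2, l=2, y_blocks=("01", "10"), r="1010"))  # -> 1
```

The point is that the encoder never materializes the full codeword of length 2^{2ln}; it answers single-bit queries, which is exactly the local-encoding notion of Definition 2.2.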
Let Z(u) denote the u-th bit of Z, for each u ∈ {0,1}^d. The agreement between Z and E(X) can be rewritten as Prob_{ȳ,r}(Z(ȳr) = E(X)((ȳ, r))) > (1/2) + ε, which, taking into account the definition of E(X), implies Prob_{ȳ,r}(Z(ȳr) = X̄(ȳ)·r) > (1/2) + ε. By a standard averaging argument it follows that, for at least a fraction ε of the strings ȳ ∈ {0,1}^{ln},

Prob_r(Z(ȳr) = X̄(ȳ)·r) > (1/2) + ε/2.     (2)

Note that X̄(ȳ)·r is the r-th bit of Had(X̄(ȳ)). Thus, equation (2) implies that for an ε fraction of the ȳ's, we have strings that agree in a (1/2) + (ε/2) fraction of positions with Had(X̄(ȳ)). More precisely, consider the string U(ȳ) defined as follows: for each r ∈ {0,1}^{ln}, the r-th bit of U(ȳ) is Z(ȳr). Thus, equation (2) implies that for at least a fraction ε of ȳ ∈ {0,1}^{ln}, U(ȳ) and Had(X̄(ȳ)) agree on at least a (1/2) + (ε/2) fraction of positions. By the Goldreich-Levin Theorem 2.4, it follows that there exists a probabilistic algorithm B with the following properties: (1) the input is ȳ ∈ {0,1}^{ln} and the algorithm has oracle access to the binary string Z; (2) the algorithm B uses O(ln·log(1/ε)) random bits; (3) the algorithm B runs in O((1/ε^4)·ln·log(ln)) time; (4) for each ȳ such that U(ȳ) and Had(X̄(ȳ)) agree on more than a (1/2) + (ε/2) fraction of positions, with probability at least 3/4, the algorithm B outputs a list with O(1/ε^2) elements, one of which is X̄(ȳ). The probabilistic algorithm B is used in the algorithm that we present next, which on input y ∈ {0,1}^n and having oracle access to Z attempts to determine X(y). Let T be the size of the list of candidates returned by B on an arbitrary input; by property (4) above, T = O(1/ε^2).

Algorithm A
Input: y ∈ {0,1}^n, and random access to the string Z. We assume that Z and E(X) agree on more than a (1/2) + (ε/2) fraction of positions. The goal is to calculate a list of strings that contains X(y).
LIST = ∅.
Repeat the following (4/3)·n·w times:
Pick a random i ∈ {1, ..., l}.
Pick l − 1 random strings in {0,1}^n, denoted y_1, ..., y_{i−1}, y_{i+1}, ..., y_l. Let ȳ = (y_1, ..., y_{i−1}, y, y_{i+1}, ..., y_l).
Run the algorithm B on input ȳ. (Note that B uses O(ln·log(1/ε)) random bits.) The algorithm B returns a list of l-tuples in ({0,1}^n)^l having T elements. (Note: in case of success, one of these l-tuples is X̄(ȳ) = X(y_1), ..., X(y_{i−1}), X(y), X(y_{i+1}), ..., X(y_l).)
Add to LIST the i-th component of every l-tuple in the list produced by B.
End Repeat

Claim 3.2 (Properties of the algorithm A) Let X ∈ {0,1}^{N̄} and Z ∈ {0,1}^{2^d} be such that d(E(X), Z) ≤ ((1/2) − ε)·2^d. (1) The algorithm A on input y ∈ {0,1}^n and with oracle access to Z returns a list, which we call LIST(y), containing at most (4/3)·n·w·T elements, where T = O(1/ε^2) (the constant hidden in the O(·) notation comes from the Goldreich-Levin algorithm). (2) The algorithm A, on each input y, uses O(l·n^2·w·log(1/ε)) random bits. (3) With probability at least 1 − δ over the input y and over the random bits used by A on input y, LIST(y) contains X(y).

Proof. Let GOOD be the set of ȳ ∈ {0,1}^{ln} with the property that the algorithm B on input ȳ returns a list that, with probability at least 3/4, contains X̄(ȳ). From the properties of the algorithm B, the assumption that d(Z, E(X)) ≤ ((1/2) − ε)·2^d implies that |GOOD| ≥ ε·2^{ln}. Let N(y) be the multiset of l-tuples having y as one component, where the multiplicity of a tuple is the number of occurrences of y in the tuple. For a set D ⊆ {0,1}^n, we define N(D) = ∪_{y ∈ D} N(y). On input y, at each iteration, the algorithm chooses uniformly at random an element ȳ of N(y). The algorithm can succeed at that iteration only if ȳ ∈ GOOD. It can be seen that, for all y ∈ {0,1}^n, |N(y)| = l·2^{n(l−1)}. We define

V_w = { y ∈ {0,1}^n : |N(y) ∩ GOOD| / |N(y)| ≥ 1/w }.

Let V̄_w be the complement of V_w. We have

|N(V̄_w) ∩ GOOD| ≤ Σ_{y ∈ V̄_w} |N(y) ∩ GOOD| < 2^n · (1/w) · (l·2^{n(l−1)}) = (l/w) · 2^{ln}.

We show that this is possible only if |V̄_w| < (δ − e^{−n}) · 2^n. Let D ⊆ {0,1}^n be a set with |D| ≥ (δ − e^{−n}) · 2^n. We observe that N(D) covers an overwhelming fraction of ({0,1}^n)^l. Indeed, note that the probability that a tuple (x_1, ..., x_l) is not in N(D) is equal to the probability of the event "x_1 ∉ D and ... and x_l ∉ D", which is bounded by (1 − δ + e^{−n})^l. Therefore, the complement of N(D), denoted N̄(D), satisfies |N̄(D)| ≤ (1 − δ + e^{−n})^l · 2^{ln}. Then,

|N(D) ∩ GOOD| ≥ |GOOD| − |N̄(D)| ≥ [ε − (1 − δ + e^{−n})^l] · 2^{ln}.
Recall that l/w < ε − (1 − δ + e^{−n})^l. Thus, necessarily, |V̄_w| < (δ − e^{−n}) · 2^n and consequently |V_w| > (1 − δ + e^{−n}) · 2^n. We continue with the estimation of the probability that LIST(y) contains X(y). For all y ∈ V_w, |N(y) ∩ GOOD| / |N(y)| ≥ 1/w, and thus the probability that one iteration fails to insert X(y) into LIST(y), conditioned on y ∈ V_w, is at most 1 − (3/4)·(1/w). Since the procedure does (4/3)·n·w iterations, the probability, over y ∈ {0,1}^n and over the random bits used by the algorithm A, conditioned on y ∈ V_w, that X(y) ∉ LIST(y) is at most (1 − (3/4)·(1/w))^{(4/3)·n·w} < e^{−n}. Therefore the probability that X(y) ∉ LIST(y) is bounded by the probability that y ∉ V_w plus the above conditional probability of failure. Thus, it is bounded by δ − e^{−n} + e^{−n} = δ. Points (1) and (2) follow immediately from the description of the algorithm A. This proves the claim.

The algorithm A is used to describe all the messages X such that d(E(X), Z) ≤ ((1/2) − ε)·2^d, for some fixed word Z, as we show next. Note first that it is possible to fix the O(l·n^2·w·log(1/ε)) random bits used by A so that the algorithm A, using the fixed bits in lieu of random bits, has the property that for a (1 − δ) fraction of y ∈ {0,1}^n, LIST(y) contains X(y). Then a string X ∈ {0,1}^{N̄} with the property that d(E(X), Z) ≤ ((1/2) − ε)·2^d can be described, given Z, by the following information:
- O(l·n^2·w·log(1/ε)) bits to represent the fixed bits that are used in lieu of the random bits;
- 2δNn bits for representing the strings y for which X(y) ∉ LIST(y) and the value X(y) for these strings;
- for each of the (1 − δ)·N strings y for which X(y) ∈ LIST(y), log((4/3)·n·w·T) bits to represent the rank of X(y) in LIST(y).

Thus X can be described, given Z, by a number of bits bounded by 2δNn + O(l·n^2·w·log(1/ε)) + N·log n + N·log w + N·log T + O(N). Recall that δ = λ/3,
l = (3/δ)·log(2/ε), w = 6·(1/δ)·(1/ε)·log(2/ε), and T = O(1/ε^2). It can be seen that if 1/ε ≤ 2^{βn} for an appropriately small β > 0, the above value is bounded by λN̄. Thus, each string X ∈ {0,1}^{N̄} with the property that d(E(X), Z) ≤ ((1/2) − ε)·2^d can be described by the string Z and λN̄ bits. It follows that the number of such strings X is bounded by 2^{λN̄}. This finishes the proof of the combinatorial property of the code E. We move now to the algorithmic properties of the code E. Note that each bit of a codeword is indexed by a pair (ȳ, r) ∈ {0,1}^{ln} × {0,1}^{ln}, where ȳ is viewed as an l-tuple of strings in {0,1}^n, i.e., ȳ = (y_1, ..., y_l) with each y_i ∈ {0,1}^n. The (ȳ, r)-th bit of E(X) is given by (X(y_1) ... X(y_l)) · r. Clearly this value can be calculated in O(ln) time, provided there is oracle access to the message X. The list-decoding procedure is given basically by the algorithm A with a few minor modifications. Let Z be a fixed string in {0,1}^{2^d}, which we view as the received word. Recall that Definition 2.3 requires the existence of algorithms U and V such that U produces a list of descriptors {d_j}, j ∈ [L], one for each candidate for a message X with d(E(X), Z) ≤ ((1/2) − ε)·2^d, and V, with oracle access to Z, the descriptor d_j, and the index j, and having input i, produces the i-th bit of the j-th candidate. In our case, there is no need for the descriptor d_j and we will dispense with the algorithm U. We have seen that each message X ∈ {0,1}^{N̄} with d(E(X), Z) ≤ ((1/2) − ε)·2^d can be reconstructed from Z and from some extra information comprised in a binary string of length λN̄. We take this extra information string to be the index j of an element in the list of candidates produced by the list-decoding algorithm that we describe.
Note, from the description of the algorithm A, that j consists of a block B_1 that encodes the set of pairs of strings (y, X(y)) with X(y) ∉ LIST(y), and of a block B_2 consisting of the fixed bits and of the rank of each X(y) in LIST(y) (of course, only for those strings X(y) that belong to LIST(y)). Thus, to reconstruct the i-th bit of X, we first identify the block X(y) in which the i-th bit resides and do a binary search in B_1 to see if this block X(y) is one of the blocks for which X(y) ∉ LIST(y). If this is the case, the value of the block X(y) can be taken directly from the extra information string. If this is not the case, using the extra information in block B_2, we can reconstruct the block X(y) using the algorithm A. It is immediate that this procedure takes time poly(n, l, log(1/ε)), which is polylog(N̄).

References

[1] M. Blum and S. Micali. How to generate cryptographically strong sequences of pseudo-random bits. SIAM Journal on Computing, 13(4):850-864, November 1984.
[2] O. Goldreich and L. Levin. A hard-core predicate for all one-way functions. In Proceedings of the 21st ACM Symposium on Theory of Computing, pages 25-32, 1989.
[3] V. Guruswami, J. Håstad, M. Sudan, and D. Zuckerman. Combinatorial bounds for list decoding. IEEE Transactions on Information Theory, 48(5):1021-1035, May 2002.
[4] V. Guruswami and P. Indyk. Linear time encodable and list decodable codes. In Proceedings of the 35th ACM Symposium on Theory of Computing, 2003.
[5] R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of the EATCS, 77:67-95, June 2002.
[6] M. Sudan, L. Trevisan, and S. Vadhan. Pseudorandom generators without the XOR lemma. Journal of Computer and System Sciences, 62:236-266, 2001.
[7] A. Ta-Shma and D. Zuckerman. Extractor codes. In Proceedings of the 33rd ACM Symposium on Theory of Computing, pages 193-199, 2001.
[8] L. Trevisan. Some applications of coding theory in computational complexity. Technical Report No. 43, Electronic Colloquium on Computational Complexity, September 2004. Available at http://www.eccc.uni-trier.de/eccc-local/lists/tr-2004.html.
[9] E. Viola. Hardness vs. randomness within alternating time. In Proceedings of the 18th IEEE Conference on Computational Complexity, pages 53-69, July 2003.
[10] A. Yao. Theory and application of trapdoor functions. In Proceedings of the 23rd IEEE Symposium on Foundations of Computer Science, pages 80-91, 1982.
[11] M. Zimand. Simple extractors via constructions of cryptographic pseudo-random generators. Technical Report cs.CC/0501075, Computing Research Repository, January 2005. Available at http://arxiv.org/abs/cs.cc/0501075.