Self-Testing Polynomial Functions Efficiently and over Rational Domains


Ronitt Rubinfeld*        Madhu Sudan†

Abstract

In this paper we give the first self-testers and checkers for polynomials over rational and integer domains. We also show significantly stronger bounds on the efficiency of a simple modification of the algorithm for self-testing polynomials over finite fields given in [8].

1 Introduction

Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f. Rather than trust that P works correctly, a self-testing program for f ([5]) verifies that the program P is correct on most inputs (without assuming the correctness of another program that is as difficult as one that computes f), and a self-correcting program ([5], [9]) for f takes a program P that is correct on most inputs and uses it to compute f correctly on every input (with high probability). Both access P only as a black box, and in some precise way are not allowed to compute the function f themselves. Self-testing/correcting is an extension of program result checking as defined in [3], [4]. If f has a self-tester and a self-corrector, then f has a program result checker.

Our first result concerns checking and self-testing over rational and integer domains. To date, most self-testing/correcting pairs and checkers that have been found for numerical functions (cf. [5], [2], [9], [8]) have been for functions over finite fields (for example, polynomial functions over finite fields [2], [9], [8]) or for domains on which a group structure has been imposed on a finite subset of the domain (for example, integer multiplication and division functions [5], and floating point logarithm and exponentiation functions [8]). In [7] there is a checker that can also be made into a self-tester for matrix multiplication over any field, but self-correctors for matrix multiplication are only known over finite fields [5].
Many of the self-testers, for example those for linear and polynomial functions [5], [8], have been based on showing that a program which satisfies certain easy-to-verify conditions on randomly chosen inputs must be computing the correct function on most inputs.

* Hebrew University. This work was done while the author was at Princeton University. Supported by DIMACS (Center for Discrete Mathematics and Theoretical Computer Science), NSF-STC88-09648. Part of this research was also done while the author was visiting Bellcore.
† U.C. Berkeley. Research supported by NSF PYI Grant CCR 8896202.

Finite fields are much easier to work with to this end, because they are conducive to clean, simple arguments about distributions: if y is chosen uniformly at random from a finite field F, then for a fixed c ∈ F, the values y, c + y and c·y are all uniformly distributed in F (shifting and scaling the distribution gives back the same distribution). This is not at all the case for nontrivial distributions over fields of characteristic 0. In fact, if D is a finite subset of such a field, and y is chosen uniformly at random from D, then for most c, c + y and c·y are not even likely to be in D. However, functions over such fields, and in particular over rational domains, are of great importance in application programming.

A first step in the direction of finding self-testing/correcting pairs for polynomial functions is given in [8], where a self-corrector is given for polynomial functions over fixed point rational domains, i.e., domains of the form {y/r : y an integer, |y| bounded}, for a fixed integer r ≠ 0 (note that this includes integer domains and fixed point arithmetic domains). The self-corrector needs a program that is known to be correct for a large fraction of inputs over a larger and finer precision domain than the one on which the self-corrector will be able to compute the function correctly. We show that there is a very simple way to actually verify that the program is good enough for the polynomial function self-corrector over fixed point rational and integer domains, without assuming the correctness of another program that computes the polynomial: we give a self-tester for any (multivariate) polynomial function over fixed point rational and integer domains. Thus we also give the first checkers for polynomials over rational and integer domains.

A first approach to testing the program might be to attempt to interpolate the polynomial being computed by the program, and to verify that the program's output at random points equals the value of the interpolated polynomial.
This approach is not desirable, because both the interpolation procedure and the evaluation of the polynomial at random points make the tester at least as difficult as the program, and not different in the sense defined by [3], [4]. However, similar approaches have been useful in the area of interactive proofs [1], [6], [10], which only requires that the procedure be in polynomial time.

The self-tester for polynomials over finite fields in [8] is composed of two parts. One is a degree test: a test which verifies that the program P is computing a function which is usually equal to some polynomial g of total degree d. The other is an equality test: a test which verifies that the polynomial g is identically equal to the intended polynomial f. The equality test uses the fact that g can be computed easily using P (this follows from the fact that P is usually equal to g and because there is a self-corrector for polynomial functions over finite fields [2], [9]), and verifies that g = f on at least (d+1)^n points (where n is the number of variables). Then, since two different degree d polynomials can agree on fewer than (d+1)^n such points, if the program passes the test, it must be computing the correct degree d polynomial. The second part can easily be extended to work on rational domains, since there is a self-corrector over such domains. We therefore concentrate on showing that there is a degree test which verifies, over the rationals, that the total degree of the polynomial is at most d. The ideas used to get such testers and checkers over the new domains seem quite general, and it is hoped that they will be applicable to many other problems, for example matrix determinant, matrix rank and polynomial multiplication over the rationals and reals.

Our second result concerns the self-tester for polynomials over finite fields in [8], which has been shown to work for multivariate polynomials as well by Shen [11]. In particular, we are interested in the efficiency of the degree test, since often testing is done online (i.e.
when the user decides to use the program) [5].

Efficient degree tests are interesting for another reason in addition to program testing: they have been used as a tool in order to obtain multi-prover interactive proofs for nondeterministic exponential time complete functions in [1]. [6] have in turn used the interactive proof of [1] in order to obtain results about the difficulty of approximating the size of the largest clique, in which the strength of the result depends on the efficiency of the interactive proof. Although their results are not improved by our test, our test is more efficient in situations where the total degree is smaller than the maximum degree of any variable multiplied by the number of variables. The degree test given in [8], [11] makes O(d^2) tests, for a total of O(d^3) calls to the program, in order to verify that the total degree of the polynomial is at most d. On the other hand, it is easy to see that at least d+2 calls to the program are needed. We narrow this gap by showing that a slight modification of the self-tester needs only O(d) tests and O(d^2) total calls. We conjecture that our algorithm (or a slight modification of it) requires only O(1) tests and O(d) calls. The degree tests given in [1], [6] both require a number of calls depending on the number of variables: they grow with n·d_max, where n is the number of variables and d_max is the maximum degree of any variable. All of the tests given in this paper are very simple to compute, and require only a small number of additions, subtractions and comparisons.

2 Definitions

Consider a program P that attempts to compute f. We consider three domains: the query domain D_q, the test domain D_t and the safe domain D_s. We say that program P ε-computes f on D_t if Pr_{x ∈ D_t}[P(x) = f(x)] ≥ 1 − ε. An (ε₁, ε₂)-self-tester (0 ≤ ε₁ < ε₂ ≤ 1) for f on (D_q, D_t) must fail any program that does not ε₂-compute f on D_t, and must pass any program that ε₁-computes f on D_q (note that the behavior of the tester is not specified for all other programs).
If D_q = D_t = D, then we say we have an (ε₁, ε₂)-self-tester for f on D. The tester should satisfy these conditions with error probability at most β, where β is a confidence parameter input by the user. An ε-self-corrector for f on (D, D_t) is a program P′ that uses P as a black box, such that for every x ∈ D, Pr[P′(x) = f(x)] ≥ 2/3, for every P which ε-computes f on D_t. (This probability can be amplified to 1 − β by O(log(1/β)) independent repetitions and a majority vote.) Furthermore, self-testers and self-correctors should require only a small multiplicative overhead over the running time of P, and should be different, simpler and faster than any correct program for f, in the precise sense defined in [4]. We use x ∈_R D to mean that x is chosen uniformly at random from D, and [l] to denote the set of integers {1, …, l}.

The self-testers and self-correctors for polynomials are based on the existence of the following interpolation identity relating the values of the function at evenly spaced points: for all multivariate polynomials f of total degree at most d, and for all x, t,

sum_{i=0}^{d+1} α_i f(x + a_i·t) = 0,

where the a_i's are distinct field elements, a_0 = 0, α_0 = 1, and α_i depends only on the a_i's and d, and not on x, t, or even f. In particular, if p ≥ d+2, then a_i can be chosen to be i, and then α_i = (−1)^i · binom(d+1, i). It is also known that if a function satisfies the interpolation identities at a set of d+2

points, then its values at those d+2 points lie on a degree d polynomial. Furthermore, sum_{i=0}^{d+1} α_i P(x + i·t) can be computed in O(d) time, using only additions and comparisons, by the method of finite differences described in the appendix [12].

3 Testers for Polynomials over Integer and Rational Domains

In this section we present a tester for polynomial functions over rational domains. In [8], a self-corrector for such functions is presented which can compute the correct value of the function on a safe domain, provided the program has been tested over a finite number of test domains D_t. Typically the test domains are much larger and of finer precision than the safe domain, but they are all rational domains. In this section we complement this result by presenting a tester for any rational domain.

Notation: We use Z_n to denote the set {−n, …, n}, so that all rational domains are of the form (1/s)·Z_n = {x/s : x ∈ Z_n} for integers n and s. The tester we construct in this section explicitly tests that certain properties hold over various auxiliary domains, in order to infer that the desired property holds for the domain that we are interested in. We now define the domains that our test considers.

3.1 Domains and their properties.

Throughout this section, x and t denote n-dimensional vectors chosen from an appropriate domain. Let D be the domain over which we wish to certify the program. We use the following domains:

D^0: a slightly shrunken copy of D, chosen so that for all x ∈ D^0 and all offsets t drawn from the domains below, the points x + i·t, for 0 ≤ i ≤ d+1, stay inside the tested domains.

T^j, for 1 ≤ j ≤ d+1: offset domains, each a considerably larger and finer domain than D. The fineness of the T^j allows us to conclude that all of the domains T^j and D contain D^0, a fact which will be useful later.

Frequently in our proof we will need to establish that a certain property holds over a new domain which is not quite the same as any of the domains above, but is close to one of them. We first define the notion of closeness that we use in this paper, and then we establish some relations among the domains.
(The definition, as well as its properties, was initially given in [8].)
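Before turning to the tests, the interpolation identity of Section 2 and its evaluation by finite differences are easy to check numerically. A minimal sketch, assuming the instantiation α_i = (−1)^i · binom(d+1, i) given above (the helper names are mine):

```python
from math import comb

def interp_residual(f, x, t, d):
    # sum_{i=0}^{d+1} (-1)**i * C(d+1, i) * f(x + i*t): zero whenever f agrees
    # with a polynomial of degree <= d on these d+2 evenly spaced points.
    return sum((-1) ** i * comb(d + 1, i) * f(x + i * t) for i in range(d + 2))

def finite_diff_residual(values):
    # The same kind of check using subtractions only (method of finite
    # differences): repeatedly replace the list by its successive differences.
    while len(values) > 1:
        values = [b - a for a, b in zip(values, values[1:])]
    return values[0]

p = lambda z: 3 * z**3 - z + 7                    # a degree-3 polynomial
assert interp_residual(p, 5, 2, 3) == 0           # identity holds for d = 3
assert interp_residual(p, 5, 2, 2) != 0           # and fails for the bound d = 2
assert finite_diff_residual([p(5 + 2 * i) for i in range(5)]) == 0
```

The last assertion is the finite-difference form: the (d+1)-st differences of a degree-d polynomial at evenly spaced points vanish.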

DEFINITION 3.1 ([8]) For probabilities p₁ and p₂: δ(p₁, p₂) = p₂ if p₁ ≥ p₂, and δ(p₁, p₂) = p₁ otherwise (i.e., δ(p₁, p₂) = min(p₁, p₂)).

DEFINITION 3.2 ([8]) For domains D₁ and D₂, both subsets of a universe U,

δ(D₁, D₂) = sum_{z ∈ U} δ( Pr_{x ∈_R D₁}[x = z], Pr_{y ∈_R D₂}[y = z] ).

(Note that 0 ≤ δ(D₁, D₂) ≤ 1.)

DEFINITION 3.3 ([8]) Domains D₁ and D₂ are ε-close if δ(D₁, D₂) ≥ 1 − ε.

Lemma 1 ([8]) If domains D₁ and D₂ are ε-close, and δ is the probability of x lying in a bad set S when x is picked uniformly from D₁, then the probability of y belonging to S when y is picked uniformly from D₂ is at most δ + ε.

Lemma 2 For a fixed x, the domains T^j and x + T^j = {x + t : t ∈ T^j} are ε₁-close, where ε₁ = O(1/c) and c is the constant governing the fineness of T^j.

Proof: Let T^j_k and (x + T^j)_k denote the domains from which the k-th coordinates of t and x + t are drawn; these are intervals of equal length, the second shifted by x_k. Let q_k be the probability that a randomly picked element of (x + T^j)_k belongs to T^j_k. Since the shift x_k is small compared to the length of the interval, q_k ≥ 1 − 1/(c·n). Since the sizes of T^j and x + T^j are equal, the quantity δ(T^j, x + T^j) is exactly the probability that a random element of x + T^j lies in T^j. Hence (for an appropriately chosen constant c)

δ(T^j, x + T^j) ≥ prod_{k=1}^{n} q_k ≥ (1 − 1/(c·n))^n ≥ 1 − 1/c.

Lemma 3 For a fixed x, the domains {x + i·t : t ∈ T^j} and the corresponding test domains are ε-close, where ε = O(1/c).

Proof: Similar to the proof of Lemma 2.

3.2 Tests.

We perform the following tests:

Test 0: Repeat O((1/δ)·log(1/β)) times:
    Pick j ∈_R [d+1], x ∈_R D^0, t ∈_R T^j.
    Verify: sum_{i=0}^{d+1} α_i P(x + i·t) = 0.
Reject if the verification fails more than a δ fraction of the time.

Test j (done separately for each j, 1 ≤ j ≤ d+1): Repeat O((1/δ)·log(1/β)) times:
    Pick l ∈_R [d+1], x ∈_R D^j, t ∈_R T^l.
    Verify: sum_{i=0}^{d+1} α_i P(x + i·t) = 0.
Reject if the verification fails more than a δ fraction of the time.

Test (j,k) (done for all pairs 1 ≤ j, k ≤ d+1): Repeat O((1/δ)·log(1/β)) times:
    Pick x ∈_R D^j, t ∈_R T^k.
    Verify: sum_{l=0}^{d+1} α_l P(x + l·j·t) = 0.
Reject if the verification fails more than a δ fraction of the time.

Next we list a set of properties such that, if a program P does not have these properties, it is very unlikely to pass the corresponding tests.

Property P^0: Pr_{j ∈_R [d+1], x ∈_R D^0, t ∈_R T^j}[ sum_{i=0}^{d+1} α_i P(x + i·t) = 0 ] ≥ 1 − δ.

Property P^j, 1 ≤ j ≤ d+1: Pr_{l ∈_R [d+1], x ∈_R D^j, t ∈_R T^l}[ sum_{i=0}^{d+1} α_i P(x + i·t) = 0 ] ≥ 1 − δ.

Property P^{j,k}, 1 ≤ j, k ≤ d+1: Pr_{x ∈_R D^j, t ∈_R T^k}[ sum_{l=0}^{d+1} α_l P(x + l·j·t) = 0 ] ≥ 1 − δ.

From here on we assume that the program P has all of the above-mentioned properties, and we show that such a program is essentially computing a degree d polynomial.
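All three tests share one template: sample a point and an offset from the appropriate domains, check the interpolation identity, and reject if the empirical failure rate exceeds δ. A minimal sketch of that template over the integers (the samplers, threshold and repetition count below are illustrative stand-ins, not the paper's exact parameters):

```python
import random
from math import comb

def degree_test(P, d, sample_x, sample_t, delta, reps):
    # Generic form of Tests 0, j and (j,k): estimate how often the
    # interpolation identity sum_i (-1)**i C(d+1,i) P(x+i*t) = 0 fails,
    # and accept only if the failure fraction is at most delta.
    failures = 0
    for _ in range(reps):
        x, t = sample_x(), sample_t()
        if sum((-1) ** i * comb(d + 1, i) * P(x + i * t) for i in range(d + 2)) != 0:
            failures += 1
    return failures / reps <= delta

random.seed(0)
d = 2
good = lambda z: z * z - 4 * z + 1                 # honest degree-2 program
bad = lambda z: z * z if z % 7 else z ** 3         # wrong on ~1/7 of the inputs
sample_x = lambda: random.randint(-1000, 1000)
sample_t = lambda: random.randint(1, 1000)
assert degree_test(good, d, sample_x, sample_t, 0.05, 400)
assert not degree_test(bad, d, sample_x, sample_t, 0.05, 400)
```

The faulty program fails the identity on a constant fraction of random (x, t) pairs, which is exactly what the rejection threshold detects.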

3.3 Proof of Correctness.

For x ∈ D^j, we define the function g by

g(x) = majority_{l ∈ [d+1], t ∈ T^l} ( −sum_{i=1}^{d+1} α_i P(x + i·t) ).

Lemma 4 Pr_{x ∈_R D^j}[ P(x) = g(x) ] ≥ 1 − 2δ.

Proof: Fix j, and consider all x ∈ D^j such that

P(x) = majority_{l ∈ [d+1], t ∈ T^l} ( −sum_{i=1}^{d+1} α_i P(x + i·t) ).

For such x's we have g(x) = P(x). But by property P^j and a straightforward counting argument, the fraction of such x's is at least 1 − 2δ.

Lemma 4′ Pr_{x ∈_R D^0}[ P(x) = g(x) ] ≥ 1 − 2δ.

Proof: Similar to the proof of Lemma 4 above (using property P^0 instead of property P^j).

Lemma 5 For all x ∈ D^j,

Pr_{l ∈_R [d+1], t ∈_R T^l}[ g(x) = −sum_{i=1}^{d+1} α_i P(x + i·t) ] ≥ 1 − δ₁, where δ₁ = 2(d+1)(δ + ε).

Proof: Consider l₁, l₂ ∈_R [d+1], t₁ ∈_R T^{l₁} and t₂ ∈_R T^{l₂}. For a fixed i, 1 ≤ i ≤ d+1, by the properties above and the fact that the domains {x + i·t₁ + t : t ∈ T^{l₂}} and the corresponding test domains are ε-close, we get that

Pr[ P(x + i·t₁) = −sum_{k=1}^{d+1} α_k P(x + i·t₁ + k·t₂) ] ≥ 1 − (δ + ε).

Similarly, for a fixed k, 1 ≤ k ≤ d+1, we get

Pr[ P(x + k·t₂) = −sum_{i=1}^{d+1} α_i P(x + i·t₁ + k·t₂) ] ≥ 1 − (δ + ε).

Summing up over i and k, we get that with probability at least 1 − 2(d+1)(δ + ε) = 1 − δ₁,

−sum_{i=1}^{d+1} α_i P(x + i·t₁) = sum_{i=1}^{d+1} sum_{k=1}^{d+1} α_i α_k P(x + i·t₁ + k·t₂) = −sum_{k=1}^{d+1} α_k P(x + k·t₂).

(The probabilities in all of the expressions above are over l₁, l₂ ∈_R [d+1], t₁ ∈_R T^{l₁} and t₂ ∈_R T^{l₂}.) We have shown that, with high probability (1 − δ₁), we get the same answer if we evaluate −sum_{i=1}^{d+1} α_i P(x + i·t) twice, for two independently chosen random offsets t. It is well known that this is a lower bound on the probability of the answer that is most likely to appear. Thus we have

Pr_{l ∈_R [d+1], t ∈_R T^l}[ g(x) = −sum_{i=1}^{d+1} α_i P(x + i·t) ] ≥ 1 − δ₁.

Lemma 6 For all x ∈ D^j,

Pr_{t ∈_R T^j}[ g(x) = −sum_{i=1}^{d+1} α_i P(x + i·t) ] ≥ 1 − δ₂, where δ₂ = (d+1)·δ₁.

Proof: Lemma 5 guarantees that

Pr_{l ∈_R [d+1], t ∈_R T^l}[ g(x) = −sum_{i=1}^{d+1} α_i P(x + i·t) ] ≥ 1 − δ₁.

Since l ranges over only d+1 values, the probability that this happens for the fixed value j must be at least 1 − (d+1)·δ₁.

Lemma 7 For all x ∈ D^j,

Pr_{t ∈_R T^j}[ g(x) = −sum_{i=1}^{d+1} α_i g(x + i·t) ] ≥ 1 − δ₃, where δ₃ = δ₂ + (d+1)(2δ + ε).

Proof: Lemma 6 says that, for all x ∈ D^j,

Pr_{t ∈_R T^j}[ g(x) = −sum_{i=1}^{d+1} α_i P(x + i·t) ] ≥ 1 − δ₂.

Lemma 4, and the fact that, for each i, the distributions {x + i·t : t ∈ T^j} and the corresponding test domains are ε-close, imply that

Pr_{t ∈_R T^j}[ g(x + i·t) = P(x + i·t) ] ≥ 1 − (2δ + ε).

Putting these together, we get

Pr_{t ∈_R T^j}[ g(x) = −sum_{i=1}^{d+1} α_i g(x + i·t) ] ≥ 1 − (δ₂ + (d+1)(2δ + ε)) = 1 − δ₃.

Lemma 8 For all x, t ∈ D^0, sum_{i=0}^{d+1} α_i g(x + i·t) = 0.

Proof: Fix x, t ∈ D^0 and pick t₁ ∈_R T. For each i, 1 ≤ i ≤ d+1, the offset i·t₁ is distributed ε₁-close to a uniform offset, so Lemma 7 (applied at the point x + i·t) gives

g(x + i·t) = −sum_{k=1}^{d+1} α_k g(x + i·(t + k·t₁))

with probability at least 1 − (δ₃ + ε₁). Similarly, for each k, 1 ≤ k ≤ d+1, the offset t + k·t₁ is distributed ε₁-close to a uniform offset, so Lemma 7 (applied at x) gives

g(x) = −sum_{i=1}^{d+1} α_i g(x + i·(t + k·t₁))

with probability at least 1 − (δ₃ + ε₁). Summing up, with probability at least 1 − 2(d+1)(δ₃ + ε₁) > 0 all of these events hold simultaneously, in which case

sum_{i=0}^{d+1} α_i g(x + i·t) = g(x) − sum_{k=1}^{d+1} α_k ( sum_{i=1}^{d+1} α_i g(x + i·(t + k·t₁)) ) = g(x)·(1 + sum_{k=1}^{d+1} α_k) = 0,

using sum_{k=0}^{d+1} α_k = 0 (the interpolation identity applied to a constant function). Since the left-hand side does not depend on t₁, the identity must hold with certainty. Hence g satisfies the interpolation identity for every x, t ∈ D^0, and so g agrees with a single polynomial of total degree at most d on D^0.

Theorem 1 For δ = o(1/d²), there exists a (0, δ)-self-tester for n-variate polynomials of total degree d over the rational domains defined above, which makes O(poly(d)·(1/δ)·log(1/β)) calls to the program.

Proof: The tester performs the tests given above. With probability at least 1 − β, a program P which does not have the properties listed above will not pass the tests. On the other hand, if a program P passes the tests, then there exists a function g which is a polynomial on the domain D^0 (Lemma 8) and which agrees with P on most of the domain D^0 (Lemma 4′). Thus P is a good program.

4 Efficient Testers for Polynomials

In this section we present an improved total degree test for polynomials. The algorithm is a simple modification of the one given in [8], which picks a random point x and a random offset t and verifies that the values of the program at the points x + i·t, 0 ≤ i ≤ d+1, satisfy the interpolation equations, or, equivalently, lie on the same degree d polynomial. The modification in this algorithm is to look at 10d points and make sure that they all lie on the same degree d polynomial. This can be done almost exactly as before, using the interpolation equations, with only a constant factor more running time.
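For n = 1, the modified test can be sketched as follows (the field size, repetition count and acceptance threshold are illustrative stand-ins, not the paper's constants): sample the 10d+1 points x, x+t, …, x+10d·t and check, via finite differences mod p, that some degree-d polynomial fits all of them.

```python
import random

def fits_degree_d(vals, d, p):
    # True iff some polynomial of degree <= d agrees with ALL of the evenly
    # spaced values: equivalently, every (d+1)-st finite difference is 0 mod p.
    for _ in range(d + 1):
        vals = [(b - a) % p for a, b in zip(vals, vals[1:])]
    return all(v == 0 for v in vals)

def improved_total_degree_test(P, d, p, eps, reps):
    l = 10 * d                                     # the modification: 10d points
    fails = 0
    for _ in range(reps):
        x, t = random.randrange(p), random.randrange(1, p)
        if not fits_degree_d([P((x + i * t) % p) for i in range(l + 1)], d, p):
            fails += 1
    return fails / reps <= eps

random.seed(3)
p_mod = 10007                                      # a prime modulus
honest = lambda z: (z**3 + 5 * z + 2) % p_mod      # honest degree-3 program
assert improved_total_degree_test(honest, 3, p_mod, 0.01, 200)
assert not improved_total_degree_test(honest, 2, p_mod, 0.01, 200)
```

Only subtractions and comparisons are needed per trial, matching the paper's remark about the cost of the finite-difference check.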

program Improved-Total-Degree-Test(P, d)
    l ← 10d
    Repeat O((1/ε)·log(1/β)) times:
        Pick x, t ∈_R Z_p^n and test that there exists a polynomial h of degree at most d such that, for all 0 ≤ i ≤ l, h(x + i·t) = P(x + i·t).
    Reject P if the test fails more than an ε fraction of the time.

Using the method of finite differences described in the appendix, it is very easy to determine whether or not there exists a degree d polynomial that agrees with the function at the queried places. This can be done using only O(d²) subtractions and no multiplications.

Theorem 2 If P does not ε-compute any polynomial of total degree at most d on Z_p^n, for ε ≥ 1/(c·(d+1)) with c a suitable absolute constant, then program Improved-Total-Degree-Test rejects P with probability at least 1/2.

If a program P passes the test in [8] for ε = 1/O(d²), then one can infer that the program is essentially computing a degree d polynomial. We show that if P passes the above test for ε = 1/O(d), then one can infer that the program is essentially computing a degree d polynomial. The rest of this section is devoted to the presentation of the proof for n = 1. Some minor modifications, which we omit in this version, are needed to prove the theorem for n > 1.

Before we prove the theorem, we first mention a technical lemma concerning a property of polynomials that will play a central role in our proof. It is a slight generalization of the techniques used in [8]. Essentially, the lemma shows the following. Suppose we have a (d+2)×(d+2) matrix M of points (x, y), such that y is a function of x, and:

1. For all columns, and for all but one row, a degree d polynomial can be associated with the column or row such that the function values of the points in that column/row lie on the associated polynomial. The associated polynomials are not assumed to be the same.

2. Matrix M is a submatrix (obtained by deleting rows and columns) of a larger matrix A, where A has the property that the x coordinates of the points in any of its rows (and in any of its columns) are in arithmetic progression.

Then the function values of the points on the remaining row also lie on some degree d polynomial.
Lemma 9 (Matrix Transposition Lemma) Let x, t₁, t₂ ∈ Z_p. Let M = {m_{ij}}, 0 ≤ i, j ≤ d+1, be a matrix of points of the form m_{ij} = x + i·t₁ + j·t₂, and let f be a function from Z_p to Z_p such that:

1. For each i, 1 ≤ i ≤ d+1, there is a polynomial r_i of degree at most d such that, for all j, 0 ≤ j ≤ d+1, f(m_{ij}) = r_i(m_{ij}). (I.e., all of the function values of the points in row i lie on the same degree d polynomial, for rows 1 through d+1.)

2. For each j, 0 ≤ j ≤ d+1, there is a polynomial c_j of degree at most d such that, for all i, 1 ≤ i ≤ d+1, f(m_{ij}) = c_j(m_{ij}). (I.e., all of the function values of the points in each column lie on the same degree d polynomial, for columns 0 through d+1.)

Then there also exists a polynomial r₀ such that, for all j, 0 ≤ j ≤ d+1, r₀(m_{0j}) = f(m_{0j}). (I.e., all of the points in row 0 also lie on the same degree d polynomial.)

The proof of this lemma is in the appendix. We now turn our attention to showing that the algorithm satisfies the claims of the theorem.

Proof (of Theorem 2):

DEFINITION 4.1 We define δ to be

δ = Pr_{x,t}[ there is no polynomial h of degree at most d such that, for all 0 ≤ i ≤ 10d, P(x + i·t) = h(x + i·t) ].

It is easy to see that if δ is at least ε, then the program is unlikely to pass the test. From here on we assume that we have a program P with δ < 1/(c·(d+1)), where c is the constant from the statement of the theorem.

The proof follows the same basic outline as the one in [8], but in order to achieve the better efficiency, we use ideas that can be thought of in terms of error-correction; thus many of the steps that were quite simple in [8] require more work here. We define a self-corrector which uses P as an oracle, and show that it has the nice property that, for every x, self-correcting with two different random offsets t₁ and t₂ yields the same result (with high probability). We then define a function g in terms of the self-corrector, and show that g has the following properties:

1. g(x) = P(x) with probability at least 1 − 2δ if x is picked randomly from Z_p.

2. For all x, g(x) lies on the same polynomial (of degree at most d) as at least two-thirds of the points g(x+1), g(x+2), …, g(x+10d).

We then show that any function that has the second property mentioned above is a polynomial of degree d. In [8], the function g was defined to be the value that occurs most often (for most t) when one looks at the evaluation at x of the unique polynomial which agrees with the values of P at x+t, …, x+(d+1)·t.
Here, instead, we view the values of a degree d polynomial at x+t, …, x+10d·t as a codeword. Intuitively, the values of P at x+t, …, x+10d·t will often contain enough good information to allow us to get back to a correct codeword. The function g defined below can be thought of as the value that occurs most often (for most t) when one looks at the polynomial defined by the error-correction of the values of P at x+t, …, x+10d·t, evaluated at x. We introduce notation which we will be using in the rest of this section.
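For n = 1, the error-correction view just described can be made concrete with a brute-force decoder (an illustrative stand-in for "most likely polynomial", not the paper's procedure): interpolate through each window of d+1 consecutive points and keep the candidate agreeing with the most of the given values.

```python
def lagrange_eval(xs, ys, z, p):
    # Value at z of the polynomial through the points (xs[i], ys[i]), mod p.
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

def decode(points, d, p):
    # "Most likely" degree-d polynomial, as a value oracle: interpolate
    # through each window of d+1 consecutive points and keep the candidate
    # that agrees with the most of the given values.
    best, best_agree = None, -1
    for s in range(len(points) - d):
        wxs = [x for x, _ in points[s:s + d + 1]]
        wys = [y for _, y in points[s:s + d + 1]]
        agree = sum(lagrange_eval(wxs, wys, x, p) == y for x, y in points)
        if agree > best_agree:
            best, best_agree = (wxs, wys), agree
    return lambda z: lagrange_eval(best[0], best[1], z, p)

p_mod, d = 101, 2
f = lambda z: (7 * z * z + 3 * z + 9) % p_mod
x0, t = 5, 3
pts = [((x0 + i * t) % p_mod, f((x0 + i * t) % p_mod)) for i in range(1, 10 * d + 1)]
pts[4] = (pts[4][0], (pts[4][1] + 1) % p_mod)      # corrupt one of the values
h = decode(pts, d, p_mod)
assert h(x0) == f(x0)                              # the corruption is corrected
```

A window that avoids the corrupted value recovers the true polynomial, which then agrees with all but one of the points; a window that includes it can agree with only a handful, so the decoder picks the right candidate.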

DEFINITION 4.2 Given a set of points S ⊆ Z_p and a function f from Z_p to Z_p, the most likely polynomial fitting f at these points is a polynomial h of degree at most d with the property that it maximizes |{x ∈ S : h(x) = f(x)}|. In case there is more than one candidate for the most likely polynomial, one of them is chosen arbitrarily.

We now define a self-corrector for the function, using P as an oracle.

DEFINITION 4.3 Let h be the most likely polynomial fitting P on the set {x + i·t : 1 ≤ i ≤ 10d}. Then SC_P(x, t) is defined to be h(x) if h(x + i·t) = P(x + i·t) for all but at most a 10δ fraction of the values of i; SC_P(x, t) is defined to be "error" otherwise.

We define g as follows.

DEFINITION 4.4 g(x) = majority_{t ∈ Z_p} SC_P(x, t). In case there is more than one candidate for g(x), one of them can be chosen arbitrarily. However, it will be shown that this is never the case.

Our first lemma shows that P and g agree at most places.

Lemma 10 Pr_{x ∈_R Z_p}[ g(x) = P(x) ] ≥ 1 − 2δ.

The proof of Lemma 10 is similar to the proof of Lemma 7 in Section 3 of [8] and is omitted here. The next two lemmas show that the self-corrector function is well-defined: for all x ∈ Z_p, most t's in Z_p give the same answer when used to evaluate SC_P(x, t).

Lemma 11 For all x ∈ Z_p, Pr_{t₁, t₂ ∈_R Z_p}[ SC_P(x, t₁) = SC_P(x, t₂) ≠ error ] ≥ 4/5.

Proof: In [8], it is shown that if one computes the value of a polynomial function at x by interpolating from the values of the function along offset t₁, which in turn are computed by interpolating from the values of the function along offset t₂, then one gets the same answer as if one had computed the value at x by interpolating along offset t₂ from values computed by interpolating along offset t₁. This is not hard to see, because an interpolation can be thought of as a weighted sum, and so it amounts to changing the order of a double summation.
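The double-summation observation just quoted is mechanical to verify: interpolating along t₁ and then t₂ visits the grid points x + i·t₁ + k·t₂ with weights α_i·α_k, which is symmetric in the two offsets. A tiny sketch (the function and parameter choices are arbitrary):

```python
from math import comb

def alpha(d):
    # alpha_i = (-1)**(i+1) * C(d+1, i), so that f(x) = sum_i alpha_i f(x+i*t)
    # for every polynomial f of degree <= d.
    return [(-1) ** (i + 1) * comb(d + 1, i) for i in range(1, d + 2)]

def two_level(f, x, t_first, t_second, d):
    # Interpolate f at x along offset t_first, where each needed value
    # f(x + i*t_first) is itself obtained by interpolating along t_second.
    a = alpha(d)
    inner = lambda y: sum(a[k - 1] * f(y + k * t_second) for k in range(1, d + 2))
    return sum(a[i - 1] * inner(x + i * t_first) for i in range(1, d + 2))

f = lambda z: z**4 - 2 * z + 11
# Swapping the roles of the two offsets gives the same weighted sum over the
# grid x + i*t1 + k*t2, for ANY function f:
assert two_level(f, 3, 5, 7, 4) == two_level(f, 3, 7, 5, 4)
assert two_level(f, 3, 5, 7, 4) == f(3)   # and it is exact for degree <= d
```

The first assertion is the order-of-summation symmetry; the second is the interpolation identity itself.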
Here, however, the self-correction function is an interpolation of the error-correction of the values of the function, which is no longer a simple algebraic function of the observed values. We avoid analyzing this function directly by reducing the problem to the non-error-correcting version. Consider the matrix A = {a_{ij}}, 0 ≤ i, j ≤ 10d, where a_{ij} = x + i·t₁ + j·t₂. (These are all the values that would affect SC_P(x, t₁) if, wherever the value P(x + i·t₁) is needed, we instead used the value SC_P(x + i·t₁, t₂).) Let r_i denote the most likely polynomial fitting P on the elements of row i of A, and let c_j denote the most likely polynomial fitting P on the elements of column j. Call a row i, 1 ≤ i ≤ 10d,

(column j, 1 ≤ j ≤ 10d) good if all entries of the i-th row (j-th column) have the property that P(a_{ij}) = r_i(a_{ij}) (respectively, P(a_{ij}) = c_j(a_{ij})). Otherwise, call it bad. By the definition of δ, and since for 1 ≤ i, j ≤ 10d the points x + i·t₁ and x + j·t₂ are uniformly distributed in Z_p, we get that

Pr_{t₁, t₂ ∈_R Z_p}[ row i is good ] ≥ 1 − δ.

By applying Markov's inequality we get that

Pr_{t₁, t₂ ∈_R Z_p}[ at most a 10δ fraction of the rows are bad ] ≥ 9/10,

and similarly

Pr_{t₁, t₂ ∈_R Z_p}[ at most a 10δ fraction of the columns are bad ] ≥ 9/10.

Now consider a (d+2)×(d+2) submatrix of A consisting of row 0, any d+1 good rows, and any d+2 good columns. The Matrix Transposition Lemma (applied with the function P) guarantees that the points of row 0 in this submatrix lie on some polynomial r₀. Repeated application, choosing other good columns, shows that all points of row 0 in good columns lie on the same polynomial r₀; thus SC_P(x, t₁) = r₀(x). A similar argument based on good rows shows that there exists a polynomial c₀ such that all points of column 0 in good rows lie on c₀, implying SC_P(x, t₂) = c₀(x). Showing that r₀(x) = c₀(x) will conclude the proof. Consider a submatrix of A containing row 0, d+1 good rows, column 0 and d+1 good columns. Apply the Matrix Transposition Lemma with the function equal to P everywhere except at the argument x, where its value is taken to be r₀(x). The Matrix Transposition Lemma then guarantees that all points of column 0 lie on c₀, implying that c₀(x) = r₀(x).

The following lemma follows immediately, and we state it without proof.

Lemma 12 For all x, Pr_{t ∈_R Z_p}[ g(x) = SC_P(x, t) ] ≥ 4/5.

Lemma 13 For all x, g(x) = h(x), where h is the unique most likely polynomial fitting g at the points x+1, x+2, …, x+10d.

Proof: As in the proof of Lemma 11, we construct a matrix A′ = {a′_{ij}}, 0 ≤ i, j ≤ 10d, where a′_{ij} = (x + i) + j·(t₁ + i·t₂). The rows of this matrix are the elements that affect SC_P(x + i, t₁ + i·t₂); the columns are a subset of the elements that affect SC_P(x + j·t₁, 1 + j·t₂). Call a row i of A′ good if g(x + i) = SC_P(x + i, t₁ + i·t₂). By Lemma 12 we know that

Pr_{t₁, t₂ ∈_R Z_p}[ row i is good ] ≥ 4/5.

Using Markov's inequality we get

Pr_{t₁, t₂ ∈_R Z_p}[ at least two-thirds of the rows are good ] ≥ 2/5.

Thus we find that, with probability at least 1/5, two-thirds of the rows are good and row 0 is good. From now on we work with the submatrix of A′ consisting of row 0 and the other good rows. Let r_i be the most likely polynomial fitting P on the points of the i-th row. Call an element a′_{ij} bad if P(a′_{ij}) ≠ r_i(a′_{ij}), and delete all columns which contain bad elements. Since we are working with only the good rows of A′, each row has at most a 10δ fraction of bad elements, so we delete at most O(d²·δ) columns; since δ < 1/(c·(d+1)), many more than d+2 columns remain. Let c_j be the most likely polynomial fitting P on the elements of column j. Delete all columns containing elements such that c_j(a′_{ij}) ≠ P(a′_{ij}). By the definition of δ and an application of Markov's inequality, we find that with constant probability only a small constant fraction of the columns contain such elements, and so a constant fraction of the columns remains. (This assumes that d is larger than a fixed constant; if d = O(1), the algorithm in [8] is already better.) Applying the Matrix Transposition Lemma to all (d+2)×(d+2) submatrices formed from row 0, the good rows and the remaining columns shows that all of the elements g(x + i), where i is a good row, lie on the same degree d polynomial. Thus at least two-thirds of the points g(x+1), …, g(x+10d), together with the point g(x), lie on the same polynomial. In other words, g(x) = h(x), where h is the most likely polynomial fitting g on the points x+1, …, x+10d. Also, since h agrees with g on at least two-thirds of the points of this set, it must be the unique most likely polynomial fitting these points.

Our next lemma shows that the condition guaranteed above suffices to show that g is a degree d polynomial.

Lemma 14 Let g be a function from Z_p to Z_p such that, for all x, g(x) = h_x(x), where h_x is the unique most likely polynomial, of degree at most d, fitting g on the points x+1, …, x+10d. Then g is a degree d polynomial.

Proof: We prove this by claiming that the unique most likely polynomial fitting g on any set of the form {x+1, …, x+10d} is the same. Assume otherwise. Then there must exist an x such that the most likely polynomials fitting g on the sets {x+1, …, x+10d} and {x, x+1, …, x+10d−1} are different. But this cannot be, since g(x) = h_x(x), where h_x is the most likely polynomial fitting g on the set {x+1, …, x+10d}. Hence the most likely polynomial fitting g on any set of the form {x+1, …, x+10d} is the same polynomial, say h, and hence for all x ∈ Z_p, g(x) = h(x).

5 Open Questions

We have shown that programs for polynomial functions can be self-tested over rational domains. A very important question along these lines is to determine whether there is an approximate self-tester [8] for programs which approximate polynomial functions over rational domains, as would occur in floating point and fixed precision computation (an approximate self-tester verifies that a program gives a good approximation to the function on a large fraction of the inputs). Secondly, we have shown stronger bounds on the efficiency of a simple modification of the algorithm for self-testing polynomials in [8]: this algorithm makes O(d) tests, for a total of O(d²) calls to the program. It would be interesting to show whether or not O(1) tests of this algorithm, or of a slight modification of it, are sufficient.
Hence the most likely polynomial fitting $g$ on any set of the form $\{x + it \mid 1 \le i \le d+1\}$ is the same, say $h$, and hence for all $x \in Z_p$, $g(x) = h(x)$.

5 Open Questions

We have shown that programs for polynomial functions can be self-tested over rational domains. A very important question along these lines is to determine whether there is an approximate self-tester [8] for programs which approximate polynomial functions over rational domains, as would occur in floating point and fixed precision computation (an approximate self-tester verifies that a program gives a good approximation to the function on a large fraction of the inputs). Secondly, we have shown stronger bounds on the efficiency of a simple modification of the algorithm for self-testing polynomials in [8]. This algorithm makes $O(d)$ tests, for a total of $O(d^2)$ calls to the program. It would be interesting to show whether or not $O(1)$ tests of this algorithm, or a slight modification of it, are sufficient.
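For concreteness, the kind of test being counted here can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the prime `p`, degree bound `d`, and the stand-in program `prog` are all assumptions chosen for the example. A single test picks random $x, t$, interpolates a degree-$d$ polynomial through the program's values at $x+t, \ldots, x+(d+1)t$, and checks that its value at $x$ agrees with the program; each test thus makes $d+2$ calls, so $O(d)$ tests cost $O(d^2)$ calls in total.

```python
import random

p, d = 101, 3  # illustrative prime and degree bound (assumptions)

def prog(x):
    # stands in for the program under test; here it really is a
    # degree-3 polynomial over Z_p, so every test should pass
    return (2 * x**3 + x + 5) % p

def interpolate_at(xs, ys, x0, p):
    """Value at x0 of the unique degree-(len(xs)-1) polynomial through
    the points (xs[i], ys[i]), via Lagrange interpolation mod p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * ((x0 - xj) % p) % p
                den = den * ((xi - xj) % p) % p
        # pow(den, p-2, p) is the modular inverse of den (Fermat)
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

def one_test():
    x, t = random.randrange(p), random.randrange(1, p)
    xs = [(x + i * t) % p for i in range(1, d + 2)]  # d+1 points
    ys = [prog(z) for z in xs]                        # d+1 calls
    return prog(x) == interpolate_at(xs, ys, x, p)    # one more call

print(all(one_test() for _ in range(20)))  # → True
```

Since `prog` is genuinely a degree-$d$ polynomial, the interpolated value always matches, and every test passes.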

6 Acknowledgements

We are very grateful to Avi Wigderson for suggesting that we look for more efficient testers for polynomials, as well as for his technical help in proving the theorems in Section 4. We thank Mike Luby, Oded Goldreich and Joan Feigenbaum for their comments on the writeup of this paper.

References

[1] Babai, L., Fortnow, L., Lund, C., Non-Deterministic Exponential Time has Two-Prover Interactive Protocols, Proceedings of the 31st Annual Symposium on Foundations of Computer Science, 1990.

[2] Beaver, D., Feigenbaum, J., Hiding Instances in Multioracle Queries, Proceedings of the Symposium on Theoretical Aspects of Computer Science, 1990.

[3] Blum, M., Designing programs to check their work, Submitted to CACM.

[4] Blum, M., Kannan, S., Program correctness checking... and the design of programs that check their work, Proc. 21st ACM Symposium on Theory of Computing, 1989.

[5] Blum, M., Luby, M., Rubinfeld, R., Self-Testing/Correcting with Applications to Numerical Problems, Proc. 22nd ACM Symposium on Theory of Computing, 1990.

[6] Feige, U., Goldwasser, S., Lovász, L., Safra, S., Szegedy, M., Approximating Clique is Almost NP-Complete, Proceedings of the 32nd Annual Symposium on Foundations of Computer Science, 1991.

[7] Freivalds, R., Fast Probabilistic Algorithms, Springer Verlag Lecture Notes in Computer Science No. 74, Mathematical Foundations of Computer Science, pp. 57-69, 1979.

[8] Gemmell, P., Lipton, R., Rubinfeld, R., Sudan, M., Wigderson, A., Self-Testing/Correcting for Polynomials and for Approximate Functions, Proc. 23rd ACM Symposium on Theory of Computing, 1991.

[9] Lipton, R.J., New Directions in Testing, in Distributed Computing and Cryptography, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 2, 1991, pp. 191-202.

[10] Lund, C., The Power of Interaction, University of Chicago Technical Report 91-01, January 14, 1991.

[11] Shen, Sasha, personal communication, May 1991.

[12] Van der Waerden, B.L., Algebra, Vol. 1, Frederick Ungar Publishing Co., Inc., pp. 86-91, 1970.

A Appendix

I. Proof of Lemma 9

We give the proof of Lemma 9.

Lemma 9 (Matrix Transposition Lemma) Let $t_1, t_2 \in Z_p$. Let $M = \{m_{ij}\}$, $0 \le i, j \le d+1$, be a matrix such that $m_{ij}$ is of the form $x + i t_1 + j t_2$, and let $f$ be a function from $Z_p$ to $Z_p$ such that:

1. For all $1 \le i \le d+1$, there exists a polynomial $r_i$ of degree at most $d$ such that for all $0 \le j \le d+1$, $f(m_{ij}) = r_i(m_{ij})$ (i.e., all of the function values of the points in row $i$ lie on the same degree $d$ polynomial, for rows 1 through $d+1$).

2. For all $0 \le j \le d+1$, there exists a polynomial $c_j$ of degree at most $d$ such that for all $0 \le i \le d+1$, $f(m_{ij}) = c_j(m_{ij})$ (i.e., all of the function values of the points in column $j$ lie on the same degree $d$ polynomial, for columns 0 through $d+1$).

Then there also exists a polynomial $r_0$ such that for all $0 \le j \le d+1$, $r_0(m_{0j}) = f(m_{0j})$ (i.e., all of the points in row 0 also lie on the same degree $d$ polynomial).

Proof: By standard arguments (cf. [9]), the existence of the polynomials $r_1, \ldots, r_{d+1}$ implies that there exist constants $\alpha_0, \ldots, \alpha_{d+1}$ such that for all $1 \le i \le d+1$, $\sum_{j=0}^{d+1} \alpha_j r_i(m_{ij}) = 0$. The constants depend only on $d$ and $p$. Similarly there exist constants $\beta_1, \ldots, \beta_{d+1}$, depending only on $d$ and $p$, such that for all $0 \le j \le d+1$, $f(m_{0j}) = \sum_{i=1}^{d+1} \beta_i f(m_{ij})$. Thus we find that

$$\sum_{j=0}^{d+1} \alpha_j f(m_{0j}) = \sum_{j=0}^{d+1} \alpha_j \sum_{i=1}^{d+1} \beta_i f(m_{ij}) = \sum_{i=1}^{d+1} \beta_i \sum_{j=0}^{d+1} \alpha_j f(m_{ij}) = \sum_{i=1}^{d+1} \beta_i \sum_{j=0}^{d+1} \alpha_j r_i(m_{ij}) = 0.$$

Since the function values at these points satisfy the interpolation identities, we know that there must exist a polynomial $r_0$ such that for all $0 \le j \le d+1$, $r_0(m_{0j}) = f(m_{0j})$.

II. Method of Finite Differences

Here we consider the following problem: Given evenly spaced points $x_1, \ldots, x_n \in Z_p$, with $x_{i+1} = x_i + t$, and the values of some function $f$ at these points:

Does there exist a polynomial $g$, of degree at most $d$, such that $f(x_i) = g(x_i)$ for all $i$? We describe how to use the method of finite differences to test for this property. The test performs $O(nd)$ subtractions and no multiplications.

Define the function $f^1(x) = f(x) - f(x+t)$. In general, define $f^i$ by $f^i(x) = f^{i-1}(x) - f^{i-1}(x+t)$. The method of finite differences implies the following:

Lemma 15 ([12]) $f$ agrees with some degree $d$ polynomial on the points $x_1, \ldots, x_n$ if and only if the $(d+1)$st difference $f^{d+1}(x_i) = 0$ for $1 \le i \le n-(d+1)$.

Using this lemma we find that we only have to evaluate the $f^{d+1}(x_i)$, for $1 \le i \le n-(d+1)$, to check whether the above property holds, and this can be done using $O(nd)$ subtractions.
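The finite-difference test above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function values are given as a list `values` at evenly spaced inputs, the modulus `p` and degree bound `d` are example parameters, and the sample data is hypothetical. The code forms successive differences $f^i$ and checks that the $(d+1)$st difference vanishes, using only subtractions mod $p$.

```python
def passes_degree_test(values, d, p):
    """Return True iff the values (of f at evenly spaced points) are
    consistent with a polynomial of degree at most d over Z_p, by
    checking that the (d+1)st finite difference is identically zero.
    Uses only O(n*d) subtractions, no multiplications."""
    diffs = values
    for _ in range(d + 1):
        # one differencing pass: f^i(x) = f^{i-1}(x) - f^{i-1}(x+t)
        diffs = [(diffs[i] - diffs[i + 1]) % p for i in range(len(diffs) - 1)]
    return all(v == 0 for v in diffs)

p = 101
# values of the degree-2 polynomial 3x^2 + 5x + 7 mod p at x = 0..9
good = [(3 * x * x + 5 * x + 7) % p for x in range(10)]
bad = list(good)
bad[4] = (bad[4] + 1) % p  # corrupt a single value

print(passes_degree_test(good, 2, p))  # → True
print(passes_degree_test(bad, 2, p))   # → False
```

A single corrupted value perturbs several of the $(d+1)$st differences, so the test rejects it, matching Lemma 15's "if and only if" characterization.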