Statistical Analysis of Backtracking on Inconsistent CSPs. Irina Rish and Daniel Frost.

Department of Information and Computer Science, University of California, Irvine, CA 92697-3425
{irinar,frost}@ics.uci.edu

Abstract. We analyze the distribution of computational effort required by backtracking algorithms on unsatisfiable CSPs, using analogies with reliability models, where the lifetime of a specimen before failure corresponds to the runtime of backtracking on unsatisfiable CSPs. We extend the results of [7] by showing empirically that the lognormal distribution is a good approximation of the backtracking effort on unsolvable CSPs not only at the 50% satisfiable point, but in a relatively wide region. We also show how the law of proportionate effect [9], commonly used to derive the lognormal distribution, can be applied to modeling the number of nodes expanded in a search tree. Moreover, for certain intervals of C/N, where N is the number of variables and C is the number of constraints, the parameters of the corresponding lognormal distribution can be approximated by the linear lognormal model [11], where mean log(deadends) is linear in C/N and the variance of log(deadends) is close to constant. The linear lognormal model allows us to extrapolate the results from a relatively easy overconstrained region to the hard critically constrained region and, in particular, to use more efficient strategies for testing backtracking algorithms.

1 Introduction

All models are wrong, but some are useful.
George E. P. Box

Empirical evidence presented in [7] demonstrates that the distribution of effort required to solve CSPs randomly generated at the 50% satisfiable point, when using a backtracking algorithm, can be approximated by two standard families of continuous probability distribution functions. Solvable problems can be approximated by the Weibull distribution and unsolvable problems by the lognormal distribution. Both distributions are widely used in reliability theory for modeling the lifetime of a product. An analogy between a product's lifetime and an algorithm's runtime suggests that reliability models may be applicable to the statistical analysis of algorithms.

This work was partially supported by NSF grant IRI-9157636, Air Force Office of Scientific Research grant AFOSR 900136, Rockwell Micro grant 22147, and UC Micro grant 96-012.

In this paper we focus on unsatisfiable problems, extending the results of [7] to a wider range of parameters (not just the 50% point) and studying how the parameters of the distribution depend on the parameters of a problem generator. We show empirical results for the Davis-Putnam Procedure augmented with the heuristic proposed in [1] on random 3SAT problems, and for the algorithm BJ+DVO [6] on random binary CSPs. Our results on a variety of CSP problems and different backtracking algorithms suggest that the lognormal distribution captures some inherent properties of backtracking search on unsatisfiable CSPs.

The lognormal distribution can be derived from the law of proportionate effect [2], which is used to model certain natural processes, such as failure of a material due to the growing size of fatigue cracks.

We observed that in certain regions of the C/N ratio, including a relatively wide region around the transition point, the performance of backtracking fits the linear lognormal model [11]. This model allows us to compute the parameters of the lognormal distribution in a relatively easy overconstrained region (say, for C/N around 6 for 3SAT) and extrapolate the results to the transition region.

Clearly, studies of artificial random problems are limited since they do not reflect real applications. However, random problems are useful for systematic analysis of algorithms. Also, some real applications can be studied statistically and may accommodate an analysis similar to the one we propose here. One example is scheduling problems (e.g., airline scheduling), which have to take into account daily changes within the same system. Another example is the performance analysis of inference in a knowledge base on a variety of queries.

In the next section, we describe the random problem generators and the algorithms we experimented with. Section 3 gives some statistical background. The law of proportionate effect and its applicability to backtracking search are discussed in Section 4. Section 5 presents an empirical study and the models we propose. In Section 6, we give a summary of our results, discuss their practical importance, and outline some directions for further research.

2 Problems and Algorithms

The binary CSP experiments reported in this paper were run on a model of uniform random binary constraint satisfaction problems that takes four parameters: N, D, T and C. The problem instances are binary CSPs with N variables, each having a domain of size D. The parameter T (tightness) specifies the probability that a value pair in a constraint is disallowed. The parameter C specifies the probability of a binary constraint existing between two variables.

We also experimented with random 3SAT problems, which can be viewed as a type of CSP with ternary constraints and D = 2. The random 3SAT problem generator was implemented as proposed in [10] for k = 3. It takes as input the number of variables N, the number of literals per clause k, and the number of clauses C, and generates each clause by choosing k variables randomly and flipping the sign of each literal with probability p = 0.5.
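The two generators described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the signed-integer encoding of literals, the choice of k distinct variables per clause, and the representation of a constraint as a set of disallowed value pairs are all assumptions made for the example.

```python
import random


def random_3sat(n_vars, n_clauses, k=3, p_neg=0.5, rng=None):
    """Generate a random k-SAT instance as a list of clauses.

    Each clause is a tuple of non-zero ints: +v is variable v, -v its negation.
    Assumption: the k variables within a clause are distinct.
    """
    rng = rng or random.Random()
    clauses = []
    for _ in range(n_clauses):
        # Choose k variables at random, then negate each with probability p_neg.
        variables = rng.sample(range(1, n_vars + 1), k)
        clause = tuple(-v if rng.random() < p_neg else v for v in variables)
        clauses.append(clause)
    return clauses


def random_binary_csp(n, d, t, c, rng=None):
    """Generate a random binary CSP with parameters N, D, T, C.

    Returns a dict mapping a variable pair (i, j), i < j, to the set of
    disallowed value pairs (a, b) of that constraint.
    """
    rng = rng or random.Random()
    constraints = {}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < c:  # a constraint exists with probability C
                # Each value pair is disallowed with probability T.
                nogoods = {(a, b) for a in range(d) for b in range(d)
                           if rng.random() < t}
                constraints[(i, j)] = nogoods
    return constraints


if __name__ == "__main__":
    print(random_3sat(n_vars=5, n_clauses=3, rng=random.Random(0)))
    print(len(random_binary_csp(n=10, d=3, t=0.2, c=0.5, rng=random.Random(0))))
```

Passing an explicit `random.Random` instance keeps each experiment reproducible, which matters when many instances are generated per parameter setting.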

For binary CSPs, we present results using algorithm BJ+DVO [6], which combines backjumping, forward checking style domain filtering, and a dynamic variable ordering scheme. On 3SAT, we experimented with another backtracking algorithm, the Davis-Putnam Procedure (DPP) [4] augmented with the heuristics proposed in [1]. We measure the hardness of a problem by counting the total number of deadends, including "internal" (non-leaf) deadends. On unsatisfiable problems, this number coincides with the number of nodes explored in the search tree.

3 Statistical Background

The two-parameter lognormal distribution is based on the well-known normal or Gaussian distribution. If the logarithm of a random variable is normally distributed, then the random variable itself shows a lognormal distribution. The density function, with scale parameter $\mu$ and shape parameter $\sigma$, is

$$f(t) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma t} \exp\left(-\dfrac{(\log t - \mu)^2}{2\sigma^2}\right), & t > 0 \\ 0, & t \le 0 \end{cases}$$

and the cumulative lognormal distribution function is

$$F(t) = \Phi\left(\frac{\log t - \mu}{\sigma}\right)$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. The mean value of the lognormal distribution is $E = \exp(\mu + \sigma^2/2)$. Simple formulas for the median and mode are given by $\exp(\mu)$ and $\exp(\mu - \sigma^2)$, respectively. Note that for a lognormally distributed variable $t$, the log of the median equals the mean of $\log(t)$.

Given a population sample $x_1, \ldots, x_n$ and a parameterized probability distribution family, there are several methods for estimating the parameters that best match the data. We used maximum likelihood estimators (MLE) [3]: $\hat{\mu}$ = mean log(deadends), $\hat{\sigma}$ = square root of the variance of log(deadends) (correcting a misstatement in [7], the MLE for the lognormal distribution is completely satisfactory).

4 The Proportionate Effect Model

One of the oldest and most widely used methods for deriving the lognormal distribution is the law of proportionate effect [2].
This law states that if the growth rate of a variable at each step in a process is in random proportion to its size at that step, then the size of the variable at time $n$ will be approximately lognormally distributed. In other words, if the value of a random variable at time $i$ is $X_i$, and the relationship

$$X_i = X_{i-1} \, b_i$$

holds, where $(b_1, b_2, \ldots, b_n)$ are positive independent random variables, then $X_i$ is, for large enough $i$, approximately lognormally distributed. The law of proportionate effect follows from the central limit theorem, since

$$\log(X_n) = \sum_{i=1}^{n} \log(b_i)$$

and the sum of the independent random variables $\log(b_i)$ converges to the normal distribution.

In the context of constraint satisfaction problems, we will now show that the number of nodes on level $i$ of the search tree explored by backtracking is distributed lognormally when $i$ is sufficiently large. We restrict our attention to the simple backtracking algorithm with a fixed variable ordering $(Y_1, \ldots, Y_N)$ and to inconsistent random binary CSPs with the parameters $\langle N, D, T, C \rangle$ (inconsistency implies that the entire search tree needs to be explored). Let $X_i$ be the number of nodes on search tree level $i$, $1 \le i \le N$. The branching factor $b_i$ at each level is defined as $X_i / X_{i-1}$ for $2 \le i \le N$, and $b_1 = D$. For $i > 1$, $b_i$ is randomly distributed in $[0, D]$ and specifies how many values of variable $Y_{i-1}$ are consistent with the previous assignment. The probability of a value $k$ for $Y_{i-1}$ being consistent with the assignment to $Y_1, \ldots, Y_{i-2}$ is

$$p_i = (1 - CT)^{i-2},$$

where $C$ is the probability of a constraint between $Y_{i-1}$ and a previous variable, and $T$ is the probability that a value pair is prohibited by that constraint. Then the branching factor $b_i$ is distributed binomially with parameter $p_i$. On each level $i$, $b_i$ is independent of the previous $b$'s. Note that the $b_i$ are non-negative (positive for all levels except the deepest level in the tree reached by backtracking) and can be greater than or less than 1 (since $b_i$ can be zero, the law of proportionate effect is not entirely applicable for some deep levels of the search tree). Then

$$X_i = b_1 b_2 \cdots b_i$$

is lognormally distributed by the law of proportionate effect. This derivation applies to the distribution of nodes on each particular level $i$, where $i$ is large enough.
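The multiplicative process behind the law of proportionate effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the factors $b_i$ are drawn from a uniform distribution on [0.5, 1.5] rather than from the binomial branching model of the search tree, $X_0 = 1$, and the helper names are hypothetical. The point is that $\log X_n$ looks roughly symmetric, as a normal sample would, while $X_n$ itself is strongly right-skewed.

```python
import math
import random
import statistics


def simulate_log_sizes(n_steps, n_runs, seed=0):
    """Return log(X_n) for n_runs independent multiplicative processes.

    X_i = X_{i-1} * b_i with X_0 = 1 and b_i ~ Uniform(0.5, 1.5).
    The uniform factor is an illustrative assumption, not the binomial
    branching factor of the search-tree model.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(n_runs):
        log_x = 0.0
        for _ in range(n_steps):
            # log X_n is the sum of the independent terms log b_i.
            log_x += math.log(rng.uniform(0.5, 1.5))
        logs.append(log_x)
    return logs


def skewness(xs):
    """Sample skewness; close to 0 for a roughly normal sample."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


logs = simulate_log_sizes(n_steps=50, n_runs=2000)
sizes = [math.exp(v) for v in logs]
print("skewness of log X_n:", round(skewness(logs), 3))
print("skewness of X_n:    ", round(skewness(sizes), 3))
```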
It still remains to be shown how this analysis relates to the distribution of the total number of nodes explored in a tree. In a complete search tree, the total number of nodes

$$\sum_{i=0}^{N} D^i = \frac{D^{N+1} - 1}{D - 1} \approx D^N \, \frac{D}{D - 1}$$

is proportional to the number of nodes at the deepest level, $D^N$. It may be possible to derive a similar relation for a backtracking search tree.

Satisfiable CSPs do not fit this scheme since the tree traversal is interrupted when a solution is found. As mentioned above, empirical results also point to a substantial difference in the behavior of satisfiable and unsatisfiable problems.
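The complete-tree identity used in this section is easy to check numerically. The snippet below is a small sanity check with arbitrarily chosen values of $D$ and $N$ (they are not taken from the experiments); it confirms the closed form of the geometric sum and shows that the ratio of total nodes to deepest-level nodes is close to $D/(D-1)$.

```python
def complete_tree_nodes(d, n):
    """Total number of nodes in a complete tree with branching factor d, depth n."""
    return sum(d ** i for i in range(n + 1))


d, n = 8, 10  # arbitrary illustrative values
total = complete_tree_nodes(d, n)
closed_form = (d ** (n + 1) - 1) // (d - 1)
ratio = total / d ** n  # approaches d / (d - 1) as n grows

print(total == closed_form)
print(ratio, d / (d - 1))
```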

5 Empirical Results

We experimented with the DPP and BJ+DVO algorithms described above, running 10,000 experiments for each combination of parameters (in the underconstrained region we had to run up to 100,000 experiments in order to find a satisfactory number of unsatisfiable problems). For each instance we recorded whether a solution was found and the number of deadends (including internal deadends, as mentioned above).

Experiments on 3SAT with 100, 125, 175 and 200 variables and C/N $\in [3, 10]$, and on binary CSPs with 75, 100 and 150 variables and various values of D, T and C, demonstrate a good fit of the lognormal distribution to the empirical data for unsolvable problems. Figure 1 shows histograms (vertical bars) and continuous lognormal distributions (curved lines) for selected experiments, using the algorithms DPP on 3SAT with N = 175 and several values of C/N (on the left) and BJ+DVO on binary CSPs with N = 100, D = 8, T = 0.5 and several values of C (on the right). The x-axis unit is deadends. Vertical dotted lines indicate the median of the data, while ^ indicates the mean. The data has been grouped into 100 intervals of equal length, each summarized by one vertical bar. Data greater than the maximum number of nodes indicated on the x-axis has been truncated from the charts. The y-axis shows the fraction of the sample that is expected (for the distribution functions) or was found to occur (for the experimental data) within each range. The "count" is the number of instances for each set of parameters. We can see that the lognormal distribution with MLE parameters captures the wide variety of shapes in the empirical distributions.

We examined several distributions, such as the normal, lognormal, gamma, Weibull, and inverse Gaussian distributions, which all have an exponential tail and which all have similar curves in certain parameter ranges. When $\sigma$ is small (e.g. less than 0.5) the lognormal distribution looks quite similar to the normal distribution.
Also, for small $\sigma$, the gamma distribution can sometimes provide a slightly better fit (see Figure 2(a)). When the data is strongly skewed, e.g. $\sigma > 1$, the lognormal distribution typically models the data better than the gamma (see Figure 2(b)).

Assuming the lognormal distribution of the unsatisfiable problems, we would like to know how the parameters $\mu$ and $\sigma$ depend on the parameters of the problem and on the heuristic employed by the specific backtracking algorithm. Figure 3(a) shows how $\mu$ and $\sigma$ depend on the C/N ratio for unsatisfiable 3SAT problems. Similar data, but for both satisfiable and unsatisfiable binary CSPs, are shown in Figure 4(b). It is striking that on 3SAT problems $\sigma$ is practically constant over a wide range of C/N for different N (a closer look at $\sigma$ reveals a very slow growth in the transition region). The growth of $\sigma$ in the transition region is more pronounced in binary CSPs (Figure 4(b)).
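The parameter estimates used for the fitted curves, $\hat{\mu}$ = mean log(deadends) and $\hat{\sigma}$ = square root of the variance of log(deadends), together with the median, mean, mode and cumulative distribution function of Section 3, can be computed along the following lines. This is a minimal sketch: the function names are hypothetical, the sample values are made up for illustration, and using the 1/n (population) variance is an assumption consistent with maximum likelihood, since the text does not say which variance estimator was used.

```python
import math
import statistics


def fit_lognormal(deadends):
    """Estimate (mu, sigma) of a lognormal from positive observations."""
    logs = [math.log(x) for x in deadends]
    mu = statistics.fmean(logs)
    # Assumption: the MLE uses the 1/n (population) variance of the logs.
    sigma = statistics.pstdev(logs)
    return mu, sigma


def lognormal_summary(mu, sigma):
    """Median, mean and mode of the lognormal with parameters mu, sigma."""
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma ** 2 / 2),
        "mode": math.exp(mu - sigma ** 2),
    }


def lognormal_cdf(t, mu, sigma):
    """F(t) = Phi((log t - mu) / sigma) for t > 0, and 0 otherwise."""
    if t <= 0:
        return 0.0
    return statistics.NormalDist().cdf((math.log(t) - mu) / sigma)


# Hypothetical deadend counts, for illustration only.
sample = [120, 340, 95, 800, 210, 150, 460, 1300, 75, 260]
mu_hat, sigma_hat = fit_lognormal(sample)
print(mu_hat, sigma_hat)
print(lognormal_summary(mu_hat, sigma_hat))
```

For any $\sigma > 0$ the summary satisfies mode < median < mean, which is the right-skewed shape visible in the histograms.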

[Figure 1: histograms with fitted lognormal curves. Panel annotations as recovered from the figure:
(a) C/N = 4.1, 12% solvable, $\mu$ = 9.57, $\sigma$ = 0.38, count = 1,995; C/N = 4.3, 59% solvable, $\mu$ = 9.19, $\sigma$ = 0.38, count = 8,575; C/N = 5.0, 0% solvable, $\mu$ = 7.63, $\sigma$ = 0.33, count = 12,356; C/N = 10.0, 0% solvable, $\mu$ = 3.84, $\sigma$ = 0.22, count = 100K.
(b) C = .0444, 93% solvable, $\mu$ = 11.44, $\sigma$ = 1.60, count = 589; C = .0525, 23% solvable, $\mu$ = 10.40, $\sigma$ = 1.74, count = 4,791; C = .0566, 3% solvable, $\mu$ = 9.30, $\sigma$ = 1.68, count = 9,674; C = .0808, 0% solvable, $\mu$ = 5.41, $\sigma$ = 0.63, count = 10,000.]

Fig. 1. Lognormal fit to empirical distributions for DPP on 3SAT (a) and BJ+DVO on binary CSPs (b).

[Figure 2: lognormal and gamma fits to the same data. Panel annotations as recovered from the figure:
(a) C/N = 10.0, 0% solvable, count = 100K: lognormal with $\mu$ = 3.84, $\sigma$ = 0.22; gamma with shape parameter 20.7.
(b) C = .0525, 23% solvable, count = 4,791: lognormal with $\mu$ = 10.40, $\sigma$ = 1.74; gamma with shape parameter 0.45.]

Fig. 2. Comparison of lognormal and gamma distribution functions on two samples of data from Figure 1. The value reported for each gamma fit is the shape parameter of the gamma distribution. Typically, the gamma distribution is as good a fit as the lognormal, or better, when the shape is fairly symmetrical (which happens for small $\sigma$, such as $\sigma$ = 0.22 in the first row). But the lognormal provides a substantially better fit for skewed and long-tailed distributions of data with larger $\sigma$, as in the second row ($\sigma$ = 1.74).

[Figure 3: two plots of the parameters against C/N, with one curve per problem size (100, 125, 150, 175 and 200 variables). Panel (a) covers C/N from 3 to 12; panel (b) covers C/N from 4 to 6.]

Fig. 3. Parameters $\mu$ and $\sigma$ as functions of C/N in a wide region (a) and near the transition (b).

[Figure 4: two plots of the parameters against C/N, with separate curves for $\mu$ (satisfiable), $\sigma$ (satisfiable), $\mu$ (unsatisfiable) and $\sigma$ (unsatisfiable).]

Fig. 4. Parameters $\mu$ and $\sigma$ as functions of C/N for both satisfiable and unsatisfiable 3SAT problems with N = 100, 125, 150, 175 and 200 (a) and for binary CSPs with D = 3, T = 2/9, and N = 75, 100, and 150 (b). Steeper lines correspond to larger values of N.

The mean log(deadends) of unsatisfiable problems grows monotonically with decreasing C/N ratio, except sometimes in the underconstrained region of binary CSPs. This relationship implies that underconstrained inconsistent problems are generally harder to solve than inconsistent problems at the transition point (a theory behind this observation is discussed in [12]).

Since the transition region is usually of major interest, we introduce a simplified model that suits that particular region. For example, for 3SAT with C/N < 6, the parameter $\mu$ (mean log(deadends)) can be approximated sufficiently accurately by a linear function of C/N (see Figure 3(b) and Table 1). Table 1 displays the correlation between the parameters $\mu$, $\sigma$, and the C/N ratio. We see that $\mu$ is practically linear in C/N in all cases (the magnitude of the correlation is almost 1). The last two columns in the table show the slope; the slope for $\mu$ is negative and decreases (i.e. the lines become steeper) with increasing N. Note that the slope for $\sigma$ is two orders of magnitude smaller than the slope for $\mu$, so that $\sigma$ can be approximated by a constant.

Table 1. Dependence between $\mu$, $\sigma$ and C/N for unsatisfiable 3SAT problems: correlation and slope are given for each parameter as a function of C/N.

  N     Correlation             Slope
        $\mu$      $\sigma$     $\mu$      $\sigma$
  100   -0.9949    -0.9716      -1.0735    0.0410
  125   -0.9957    -0.5873      -1.3550    0.0438
  150   -0.9963    -0.8734      -1.6130    0.0468
  175   -0.9956    -0.9632      -1.8504    0.0613
  200   -0.9957    -0.9159      -2.0925    0.0741

Our observations can be described by the so-called linear-lognormal model [11] from reliability studies. The model is commonly applied for accelerated testing. In an example from [11], an engineer selects four test temperatures for the equipment to be tested, which are referred to as the stress levels (the higher the stress level, the shorter the mean lifetime of a unit tested), and runs several tests at each stress level. Then, using the linear-lognormal model, he estimates $\mu$ and $\sigma$, instead of testing equipment at all possible temperatures. For CSPs and 3SAT, the C/N ratio plays the role of the "stress level" affecting the "lifetime" of backtracking.

Strictly speaking, the simple linear-lognormal model assumes that
1. Specimen life $t$ at any stress level has a lognormal distribution.
2. The standard deviation $\sigma$ of log life is constant.
3. The mean log life at a (possibly transformed by some $f(x)$) stress level $x$ is $\mu(x) = \gamma_0 + \gamma_1 x$. Parameters $\sigma$, $\gamma_0$, and $\gamma_1$ are estimated from data.
4. The random variations in the specimen lives are statistically independent.

All these assumptions are satisfied in the case of 3SAT problems. A similar model can be applied to the backtracking algorithms on random binary CSPs, except that $\sigma$ grows within the transition region. However, the linear-lognormal model can be modified to fit increasing $\sigma$ and usually still works for $\sigma$ depending on the stress level [11]. The linear lognormal model can be used as well for the overconstrained region, but with a different slope.

In Figure 4, we plotted $\mu$ and $\sigma$ computed for both satisfiable and unsatisfiable problems on 3SAT (Figure 4(a)) and on binary CSPs (Figure 4(b)). Although the satisfiable problems do not fit the lognormal distribution, $\mu$ and $\sigma$ still provide useful information on the complexity of backtracking (recall that $\mu$ = mean log(deadends) and $\sigma$ = standard deviation of log(deadends)). We see that there is no peak at the transition point in mean log(deadends) for either satisfiable or unsatisfiable problems considered separately. The most difficult unsatisfiable problems appear in the satisfiable region, and the most difficult satisfiable ones appear in the unsatisfiable region. The complexity peaks for satisfiable and unsatisfiable problems seem to occur on the opposite sides of the narrow transition region located within C/N $\in [2, 4]$ rather than at the transition point (see Figure 4).

6 Conclusions and Future Work

In this paper, we focus on unsatisfiable CSP problems, supporting the claim made in [7] that the lognormal distribution approximates well the computational effort required by backtracking-based algorithms on unsatisfiable CSP instances from the 50% satisfiable region. In addition,
1. we extend this claim to the non-crossover regions;
2. we show a connection between the performance of backtracking and the law of proportionate effect used to derive the lognormal distribution;
3. we show the applicability of the linear-lognormal model in a relatively wide area around the transition point;
4. we propose using accelerated testing techniques developed in reliability studies for the design of experiments with backtracking algorithms.

When testing an algorithm on a variety of problems (or comparing several algorithms with each other), we would like to know the optimal way to spend a fixed amount of resources to get a good estimate of the algorithms' performance.
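As a rough illustration of how the linear-lognormal model supports extrapolation, the sketch below fits $\mu(x) = \gamma_0 + \gamma_1 x$ to estimates of $\mu$ obtained at a few C/N ratios and predicts $\mu$ at a ratio closer to the transition. Several things here are assumptions: ordinary least squares is used as a simple stand-in for the fitting procedures of [11], the function names are hypothetical, and the input values are invented for the example rather than taken from Table 1 or the figures.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = g0 + g1 * x; returns (g0, g1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    g1 = sxy / sxx
    g0 = mean_y - g1 * mean_x
    return g0, g1


def extrapolate_mu(stress_levels, mu_hats, target):
    """Predict mu at the target stress level from estimates at other levels."""
    g0, g1 = fit_line(stress_levels, mu_hats)
    return g0 + g1 * target


# Hypothetical estimates of mu at three overconstrained C/N ratios.
ratios = [5.0, 5.5, 6.0]
mu_hats = [7.6, 6.7, 5.8]

mu_at_43 = extrapolate_mu(ratios, mu_hats, target=4.3)
print("predicted mu at C/N = 4.3:", mu_at_43)
print("predicted median deadends:", math.exp(mu_at_43))
```

Under the constant-$\sigma$ assumption of the model, the same $\sigma$ estimated in the easy region would be reused at the target ratio.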
The theory of accelerated testing [11] provides us with optimal test plans which minimize the effort of conducting experiments while maximizing the accuracy of the estimates of the model's parameters, given the linear-lognormal model described above. The approach is to select the stress levels (C/N ratios) and to split the number of samples (experiments) among them in such a way that an optimal test is obtained.

The results presented in this paper suggest the following strategy for estimating the parameters $\mu$ and $\sigma$ for an algorithm and a distribution of problems: test a sufficiently large set of relatively easy unsatisfiable problems in the overconstrained region, estimate the distribution parameters from these samples, derive the coefficients of the linear lognormal model from those data, and extrapolate the results to the critical region. Empirical evaluation of accelerated testing strategies for 3SAT and binary CSPs is an area for future research.

There are also many theoretical issues to be investigated:
1. What is the appropriate non-linear model for $\mu$ as a function of C/N for fixed N?
2. How does it change with increasing N?
3. How does $\sigma$ depend on C/N?
4. How should an optimal test plan be selected, i.e. the set of C/N points and the number of experiments per point, for a nonlinear model?

An important direction for further research is obtaining useful statistical models, both empirically and theoretically, for satisfiable problems, and combining them with the models of unsatisfiable problems. Another interesting direction is to investigate the applicability of our results to real-life problems, and to different types of backtracking algorithms, including optimization techniques such as branch-and-bound.

Acknowledgments

We would like to thank Rina Dechter, Eddie Schwalb, and the anonymous reviewers whose valuable comments helped to improve the paper.

References

1. J. M. Crawford and L. D. Auton. Experimental results on the crossover point in satisfiability problems. In Proceedings of the Eleventh National Conference on Artificial Intelligence, pages 21-27, 1993.
2. E. L. Crow and K. Shimizu. Lognormal distributions: theory and applications. Marcel Dekker, Inc., New York, 1988.
3. R. B. D'Agostino and M. A. Stephens. Goodness-Of-Fit Techniques. Marcel Dekker, Inc., New York, 1986.
4. M. Davis, G. Logemann, and D. Loveland. A Machine Program for Theorem Proving. Communications of the ACM, 5:394-397, 1962.
5. R. Dechter and I. Rish. Directional resolution: The Davis-Putnam procedure, revisited. In Proceedings of KR-94, pages 134-145, 1994.
6. D. Frost and R. Dechter. In search of the best constraint satisfaction search. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 1994.
7. D. Frost, I. Rish, and L. Vila. Summarizing CSP hardness with continuous probability distributions. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, page (to appear), 1997.
8. R. M. Haralick and G. L. Elliott. Increasing Tree Search Efficiency for Constraint Satisfaction Problems. Artificial Intelligence, 14:263-313, 1980.
9. N. R. Mann, R. E. Schafer, and N. D. Singpurwalla. Methods for Statistical Analysis of Reliability and Life Data. John Wiley & Sons, New York, 1974.
10. D. Mitchell, B. Selman, and H. Levesque. Hard and Easy Distributions of SAT Problems. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 459-465, 1992.
11. W. Nelson. Accelerated Testing: Statistical Models, Test Plans, and Data Analyses. John Wiley & Sons, New York, 1990.
12. P. van Beek and R. Dechter. Constraint tightness and looseness versus global consistency. Journal of ACM, page to appear, 1997.