Black Box Search By Unbiased Variation

Per Kristian Lehre and Carsten Witt
CERCIA, University of Birmingham, UK
DTU Informatics, Copenhagen, Denmark

ThRaSH, March 24th 2010
State of the Art in Runtime Analysis of RSHs

OneMax
  (1+1) EA      O(n log n)                  [Mühlenbein, 1992]
  (1+λ) EA      O(λn + n log n)             [Jansen et al., 2005]
  (µ+1) EA      O(µn + n log n)             [Witt, 2006]
  1-ANT         O(n²) w.h.p.                [Neumann and Witt, 2006]
  (µ+1) IA      O(µn + n log n)             [Zarges, 2009]

Linear Functions
  (1+1) EA      Θ(n log n)                  [Droste et al., 2002; He and Yao, 2003]
  cGA           Θ(n^{2+ε}), ε > 0 const.    [Droste, 2006]

Max. Matching
  (1+1) EA      e^{Ω(n)}, PRAS              [Giel and Wegener, 2003]

Sorting
  (1+1) EA      Θ(n² log n)                 [Scharnow et al., 2002]

SS Shortest Path
  (1+1) EA      O(n³ log(n·w_max))          [Baswana et al., 2009]
  MO (1+1) EA   O(n³)                       [Scharnow et al., 2002]

MST
  (1+1) EA      Θ(m² log(n·w_max))          [Neumann and Wegener, 2007]
  (1+λ) EA      O(nλ log(n·w_max)), λ = m²/n  [Neumann and Wegener, 2007]
  1-ANT         O(mn log(n·w_max))          [Neumann and Witt, 2008]

Max. Clique (rand. planar)
  (1+1) EA      Θ(n⁵)                       [Storch, 2006]
  (16n+1) RLS   Θ(n^{5/3})                  [Storch, 2006]

Eulerian Cycle
  (1+1) EA      Θ(m² log m)                 [Doerr et al., 2007]

Partition
  (1+1) EA      PRAS, avg.                  [Witt, 2005]

Vertex Cover
  (1+1) EA      e^{Ω(n)}, arb. bad approx.  [Friedrich et al., 2007; Oliveto et al., 2007a]

Set Cover
  (1+1) EA      e^{Ω(n)}, arb. bad approx.  [Friedrich et al., 2007]
  SEMO          Pol. O(log n)-approx.       [Friedrich et al., 2007]

Intersection of p ≥ 3 matroids
  (1+1) EA      1/p-approximation in O(|E|^{p+2} log(|E|·w_max))  [Reichel and Skutella, 2008]

UIO/FSM conf.
  (1+1) EA      e^{Ω(n)}                    [Lehre and Yao, 2007]

See survey [Oliveto et al., 2007b].
Motivation: A Theory of Randomised Search Heuristics

Computational Complexity
- Classification of problems according to inherent difficulty.
- Common limits on the efficiency of all algorithms.
- Assumes a particular model of computation.

Computational Complexity of Search Problems
- Polynomial-time Local Search [Johnson et al., 1988].
- Black-Box Complexity [Droste et al., 2006].
Black Box Complexity [Droste et al., 2006]

The algorithm A queries search points x_1, x_2, x_3, ..., x_t and receives only the values f(x_1), f(x_2), f(x_3), ..., f(x_t) from a black box holding an unknown function f from the class F. (Photo: E. Gerhard, 1846.)

Black box complexity on function class F:
  T_F := min_A max_{f ∈ F} T_{A,f}
Results with the old Model
- Very general model with few restrictions on resources.
- Example: Needle has BB complexity (2^n + 1)/2.
- Some NP-hard problems have polynomial BB complexity.
- Artificially low BB complexity on example functions, e.g.
    n/log(2n+1) − 1 on OneMax
    n/2 + o(n) on LeadingOnes
Refined Black Box Model

The algorithm A no longer sees the search points themselves: it generates new points x_0, x_1, x_2, ... only by applying variation operators to previously generated points, and receives from the black box only the sequence of values f(x_0), f(x_1), f(x_2), ... (e.g. 0, 0, 2, 3, 0, 2, ...).

Unbiased black box complexity on function class F:
  T_F := min_A max_{f ∈ F} T_{A,f}
Unbiased Variation Operators

Encoding of a solution by a bitstring x = x_1 x_2 x_3 x_4 x_5: each bit decides whether the corresponding vertex is in the solution.¹ Flipping a bit toggles membership, e.g. setting x_2 = 1 brings the blue vertex in, setting x_4 = 1 brings the orange vertex in, and flipping x_4 back takes the orange vertex out.

¹ Figure by Dake, available under a Creative Commons Attribution-Share Alike 2.5 Generic license.
Unbiased Variation Operators p(y | x)

For all bitstrings x, y, z and every permutation σ of the bit positions, we require
 1) p(y | x) = p(y ⊕ z | x ⊕ z)
 2) p(y | x) = p(y_{σ(1)} y_{σ(2)} ... y_{σ(n)} | x_{σ(1)} x_{σ(2)} ... x_{σ(n)})

We consider unary operators, but higher arities are possible.
[Droste and Wiesmann, 2000; Rowe et al., 2007]
Unbiased Variation Operators x x r y Condition 1) and 2) imply Hamming-invariance.
Unbiased Black-Box Algorithm Scheme

 1: t ← 0.
 2: Choose x(t) uniformly at random from {0, 1}^n.
 3: repeat
 4:   t ← t + 1.
 5:   Compute f(x(t−1)).
 6:   I(t) ← (f(x(0)), ..., f(x(t−1))).
 7:   Depending on I(t), choose a prob. distr. p_s on {0, ..., t−1}.
 8:   Randomly choose an index j according to p_s.
 9:   Depending on I(t), choose an unbiased variation op. p_v(· | x(j)).
10:   Randomly choose a bitstring x(t) according to p_v.
11: until termination condition met.

Covers: (µ +, λ) EA, simulated annealing, Metropolis, RLS, any population size, any selection mechanism, steady-state EAs, cellular EAs, rank-based mutation, ...
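For instance, the (1+1) EA is one instance of this scheme: the history I(t) is summarised by the best point seen so far, p_s always selects it, and p_v is standard bit mutation. A minimal Python sketch (function names and the OneMax test function are illustrative, not from the slides):

```python
import random

def standard_bit_mutation(x):
    """Unbiased unary operator: flip each bit independently with prob 1/n."""
    n = len(x)
    return [b ^ (random.random() < 1 / n) for b in x]

def one_plus_one_ea(f, n, max_iters=100_000):
    """(1+1) EA as an unbiased black-box algorithm on {0,1}^n.

    Returns the number of iterations until the optimum value n is found
    (assumes f is maximised with optimum value n, as for OneMax)."""
    x = [random.randint(0, 1) for _ in range(n)]  # uniform initial point
    fx = f(x)
    for t in range(max_iters):
        y = standard_bit_mutation(x)   # unbiased variation of a past point
        fy = f(y)
        if fy >= fx:                   # selection uses only f-values
            x, fx = y, fy
        if fx == n:                    # optimum reached
            return t + 1
    return max_iters

onemax = lambda x: sum(x)
print(one_plus_one_ea(onemax, 20))
```

Note that the selection step inspects only the queried fitness values, never the bitstrings themselves, as the refined model requires.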
Simple Unimodal Functions

Algorithm      LeadingOnes
(1+1) EA       Θ(n²)
(1+λ) EA       Θ(n² + λn)
(µ+1) EA       Θ(n² + µn log n)
BB             Ω(n)

Theorem. The expected runtime of any black box algorithm with unary, unbiased variation on LeadingOnes is Ω(n²).

Proof idea
- Potential between n/2 and 3n/4.
- The number of 0-bits flipped is hypergeometrically distributed.
- Lower bound by polynomial drift.
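To illustrate the hypergeometric step: if an unbiased operator flips exactly r positions chosen uniformly at random in an n-bit string containing k zero-bits, the number of flipped zero-bits is hypergeometrically distributed with mean rk/n. A small Python check (the concrete values of n, k, r are illustrative):

```python
from math import comb

def hypergeom_pmf(n, k, r, j):
    """P(exactly j of the r uniformly chosen positions hit the k zero-bits)."""
    return comb(k, j) * comb(n - k, r - j) / comb(n, r)

n, k, r = 20, 8, 5          # string length, zero-bits, flipped positions
mean = sum(j * hypergeom_pmf(n, k, r, j) for j in range(min(k, r) + 1))
assert abs(mean - r * k / n) < 1e-12   # expectation is r*k/n
```

The drift argument in the proof uses this expectation together with tail bounds on the same distribution.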
Escaping from Local Optima

Jump_m(x): a function with a fitness gap of width m that must be jumped to reach the optimum.

Theorem. For any m ≤ n(1−ε)/2 with 0 < ε < 1, the runtime of any black box algorithm with unary, unbiased variation on Jump_m is at least
- 2^{cm} with probability 1 − 2^{−Ω(m)}, and
- (n/(rm))^{cm} with probability 1 − 2^{−Ω(m ln(n/(rm)))}.

These bounds are lower than the Θ(n^m) bound for the (1+1) EA!

Proof idea
- Simplified drift in the gap:
 1. Expectation of the hypergeometric distribution.
 2. Chvátal's bound.
General Pseudo-Boolean Functions

Algorithm      OneMax
(1+1) EA       Θ(n log n)
(1+λ) EA       O(λn + n log n)
(µ+1) EA       O(µn + n log n)
BB             Ω(n/log n)

Theorem. The expected runtime of any black box search algorithm with unbiased, unary variation on any pseudo-boolean function with a single global optimum is Ω(n log n).

Proof idea
- Expected multiplicative weight decrease.
- Chvátal's bound.
Summary and Conclusion
- Refined black box model.
- Proofs are (relatively) easy!
- Comprises EAs never previously analysed.
- Ω(n log n) on general functions with a single global optimum.
- Some bounds coincide with the runtime of the (1+1) EA.
- Future work: k-ary variation operators for k > 1.
References I

Baswana, S., Biswas, S., Doerr, B., Friedrich, T., Kurur, P. P., and Neumann, F. (2009). Computing single source shortest paths using single-objective fitness. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 59–66, New York, NY, USA. ACM.

Doerr, B., Klein, C., and Storch, T. (2007). Faster evolutionary algorithms by superior graph representation. In Proceedings of the 1st IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007), pages 245–250.

Droste, S. (2006). A rigorous analysis of the compact genetic algorithm for linear functions. Natural Computing, 5(3):257–283.

Droste, S., Jansen, T., and Wegener, I. (2002). On the analysis of the (1+1) Evolutionary Algorithm. Theoretical Computer Science, 276:51–81.

Droste, S., Jansen, T., and Wegener, I. (2006). Upper and lower bounds for randomized search heuristics in black-box optimization. Theory of Computing Systems, 39(4):525–544.
References II

Droste, S. and Wiesmann, D. (2000). Metric based evolutionary algorithms. In Genetic Programming, Proceedings of the European Conference, Edinburgh, Scotland, UK, April 15-16, 2000, volume 1802 of Lecture Notes in Computer Science, pages 29–43. Springer.

Friedrich, T., Hebbinghaus, N., Neumann, F., He, J., and Witt, C. (2007). Approximating covering problems by randomized search heuristics using multi-objective models. In Proceedings of the 9th annual conference on Genetic and evolutionary computation (GECCO 2007), pages 797–804, New York, NY, USA. ACM Press.

Giel, O. and Wegener, I. (2003). Evolutionary algorithms and the maximum matching problem. In Proceedings of the 20th Annual Symposium on Theoretical Aspects of Computer Science (STACS 2003), pages 415–426.

He, J. and Yao, X. (2003). Towards an analytic framework for analysing the computation time of evolutionary algorithms. Artificial Intelligence, 145(1-2):59–97.

Jansen, T., Jong, K. A. D., and Wegener, I. (2005). On the choice of the offspring population size in evolutionary algorithms. Evolutionary Computation, 13(4):413–440.
References III

Johnson, D. S., Papadimitriou, C. H., and Yannakakis, M. (1988). How easy is local search? Journal of Computer and System Sciences, 37(1):79–100.

Lehre, P. K. and Yao, X. (2007). Runtime analysis of (1+1) EA on computing unique input output sequences. In Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC 2007), pages 1882–1889. IEEE Press.

Mühlenbein, H. (1992). How genetic algorithms really work I. Mutation and Hillclimbing. In Proceedings of the Parallel Problem Solving from Nature 2, (PPSN-II), pages 15–26. Elsevier.

Neumann, F. and Wegener, I. (2007). Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, 378(1):32–40.

Neumann, F. and Witt, C. (2006). Runtime analysis of a simple ant colony optimization algorithm. In Proceedings of The 17th International Symposium on Algorithms and Computation (ISAAC 2006), number 4288 in LNCS, pages 618–627.
References IV

Neumann, F. and Witt, C. (2008). Ant colony optimization and the minimum spanning tree problem. In Proceedings of Learning and Intelligent Optimization (LION 2008), pages 153–166.

Oliveto, P. S., He, J., and Yao, X. (2007a). Evolutionary algorithms and the vertex cover problem. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2007).

Oliveto, P. S., He, J., and Yao, X. (2007b). Time complexity of evolutionary algorithms for combinatorial optimization: A decade of results. International Journal of Automation and Computing, 4(1):100–106.

Reichel, J. and Skutella, M. (2008). Evolutionary algorithms and matroid optimization problems. Algorithmica.

Rowe, J. E., Vose, M. D., and Wright, A. H. (2007). Neighborhood graphs and symmetric genetic operators. In FOGA, pages 110–122.
References V

Scharnow, J., Tinnefeld, K., and Wegener, I. (2002). Fitness landscapes based on sorting and shortest paths problems. In Proceedings of the 7th Conf. on Parallel Problem Solving from Nature (PPSN VII), number 2439 in LNCS, pages 54–63.

Storch, T. (2006). How randomized search heuristics find maximum cliques in planar graphs. In Proceedings of the 8th annual conference on Genetic and evolutionary computation (GECCO 2006), pages 567–574, New York, NY, USA. ACM Press.

Witt, C. (2005). Worst-case and average-case approximations by simple randomized search heuristics. In Proceedings of the 22nd Annual Symposium on Theoretical Aspects of Computer Science (STACS 05), number 3404 in LNCS, pages 44–56.

Witt, C. (2006). Runtime Analysis of the (µ+1) EA on Simple Pseudo-Boolean Functions. Evolutionary Computation, 14(1):65–86.

Zarges, C. (2009). On the utility of the population size for inversely fitness proportional mutation rates. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 39–46, New York, NY, USA. ACM.