Landscapes and Other Art Forms.

Landscapes and Other Art Forms. Darrell Whitley, Computer Science, Colorado State University.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s).

Blind No More = GRAY BOX Optimization. Darrell Whitley, Computer Science, Colorado State University. With thanks to: Francisco Chicano, Gabriela Ochoa, Andrew Sutton and Renato Tinós.

GRAY BOX OPTIMIZATION

We can construct Gray Box optimization for pseudo-Boolean optimization problems composed of m subfunctions, where each subfunction accepts at most k variables. Exploit the general properties of every Mk Landscape:

f(x) = Σ_{i=1}^{m} f_i(x)

which can be expressed as a Walsh polynomial

W(f(x)) = Σ_{i=1}^{m} W(f_i(x))

or as a sum of k elementary landscapes

f(x) = Σ_{i=1}^{k} φ^{(i)}(W(f(x)))
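As a concrete illustration of the Mk Landscape model above, here is a minimal Python sketch that evaluates f(x) as a sum of k-bounded subfunctions stored as lookup tables over their masks. The class and variable names (MkLandscape, masks, tables) are illustrative, not from any tutorial code.

import random

class MkLandscape:
    def __init__(self, n, masks, tables):
        # masks[i]:  tuple of variable indices read by subfunction f_i (at most k of them)
        # tables[i]: lookup table of length 2**len(masks[i]) holding the values of f_i
        self.n, self.masks, self.tables = n, masks, tables

    def evaluate(self, x):
        # f(x) = sum_i f_i(x restricted to masks[i])
        total = 0.0
        for mask, table in zip(self.masks, self.tables):
            index = 0
            for bit in mask:                 # pack the masked bits into a table index
                index = (index << 1) | x[bit]
            total += table[index]
        return total

# Tiny random instance: n = 6 variables, m = 4 subfunctions, k = 3
random.seed(0)
masks = [(0, 1, 2), (1, 2, 3), (2, 4, 5), (0, 3, 5)]
tables = [[random.random() for _ in range(8)] for _ in masks]
f = MkLandscape(6, masks, tables)
print(f.evaluate([1, 0, 1, 1, 0, 1]))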

Walsh Example: MAXSAT


BLACK BOX OPTIMIZATION. Don't wear a blindfold when walking about London if you can help it!

NK-Landscapes

An Adjacent NK Landscape with n = 6 and k = 3. The subfunctions:

f_0(x_0, x_1, x_2)   f_1(x_1, x_2, x_3)   f_2(x_2, x_3, x_4)
f_3(x_3, x_4, x_5)   f_4(x_4, x_5, x_0)   f_5(x_5, x_0, x_1)

These problems can be solved to optimality using Dynamic Programming.
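For illustration, a sketch of how an Adjacent NK instance like the one above can be generated (adjacent_nk is a hypothetical helper, following the slide's convention that each subfunction reads k consecutive bits with wraparound). Because the interaction structure is a narrow band around a ring, dynamic programming over that band can solve such instances exactly.

import random

def adjacent_nk(n, k, seed=0):
    # Subfunction f_i reads bits i, i+1, ..., i+k-1 (mod n), as in the n = 6, k = 3 example:
    # f_0(x_0, x_1, x_2), f_1(x_1, x_2, x_3), ..., f_5(x_5, x_0, x_1).
    rng = random.Random(seed)
    masks = [tuple((i + j) % n for j in range(k)) for i in range(n)]
    tables = [[rng.random() for _ in range(2 ** k)] for _ in range(n)]
    return masks, tables

masks, tables = adjacent_nk(6, 3)
print(masks)   # [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 0), (5, 0, 1)]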

Percent of Offspring that are Local Optima, Using a Very Simple (Stupid) Hybrid GA:

N     k   Model   2-point Xover   Uniform Xover   PX
100   2   Adj     74.2 ±3.9       0.3 ±0.3        100.0 ±0.0
300   4   Adj     30.7 ±2.8       0.0 ±0.0        94.4 ±4.3
500   2   Adj     78.0 ±2.3       0.0 ±0.0        97.9 ±5.0
500   4   Adj     31.0 ±2.5       0.0 ±0.0        93.8 ±4.0
100   2   Rand    0.8 ±0.9        0.5 ±0.5        100.0 ±0.0
300   4   Rand    0.0 ±0.0        0.0 ±0.0        86.4 ±17.1
500   2   Rand    0.0 ±0.0        0.0 ±0.0        98.3 ±4.9
500   4   Rand    0.0 ±0.0        0.0 ±0.0        83.6 ±16.8

Number of partition components discovered:

N     k   Model      Paired PX Mean   Max
100   2   Adjacent   3.34 ±0.16       16
300   4   Adjacent   5.24 ±0.10       26
500   2   Adjacent   7.66 ±0.47       55
500   4   Adjacent   7.52 ±0.16       41
100   2   Random     3.22 ±0.16       15
300   4   Random     2.41 ±0.04       13
500   2   Random     6.98 ±0.47       47
500   4   Random     2.46 ±0.05       13

Paired PX uses Tournament Selection. The first parent is selected by fitness. The second parent is selected by Hamming distance.

Optimal Solutions for Adjacent NK:

N     k   2-point Found   Uniform Found   Paired PX Found
300   2   18              0               100
300   3   0               0               100
300   4   0               0               98
500   2   0               0               100
500   3   0               0               98
500   4   0               0               70

Percentage over 50 runs where the global optimum was found in the experiments of the hybrid GA on the Adjacent NK Landscape.

Tunnelling Local Optima Networks NK Landscapes: Ochoa et al. GECCO 2015

NK and Mk Landscapes, P and NP


But a Hybrid Genetic Algorithm is NOT how we should solve these NK Landscape problems. We can know the exact location of improving moves in constant time; no enumeration of neighbors is needed. Under well-behaved conditions, we can know the exact location of improving moves r steps ahead in constant time.

Walsh Analysis

Every n-bit MAXSAT, NK-landscape or P-spin problem is a sum of m subfunctions f_i:

f(x) = Σ_{i=1}^{m} f_i(x)

The Walsh transform of f is the sum of the Walsh transforms of the individual subfunctions:

W(f(x)) = Σ_{i=1}^{m} W(f_i(x))

If m is O(n), then the number of Walsh coefficients is at most m·2^k = O(n).
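A brute-force sketch of the per-subfunction Walsh transform (the function and variable names are illustrative): each k-bounded subfunction contributes at most 2^k coefficients, which is how the m·2^k = O(n) bound arises.

from itertools import product

def walsh_of_subfunction(mask, table):
    # table[j] holds f_i(y), where j packs the bits of y in the order given by mask.
    k = len(mask)
    coeffs = {}
    for b_local in product((0, 1), repeat=k):              # one coefficient per subset of the mask
        b_global = tuple(v for v, bit in zip(mask, b_local) if bit)
        w = 0.0
        for y in product((0, 1), repeat=k):                 # enumerate the 2**k local assignments
            sign = (-1) ** sum(bi * yi for bi, yi in zip(b_local, y))
            w += sign * table[int("".join(map(str, y)), 2)]
        coeffs[b_global] = w / (2 ** k)
    return coeffs

# Example: an OR-like 2-bit subfunction on variables (3, 5)
print(walsh_of_subfunction((3, 5), [0, 1, 1, 1]))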

Mk Landscapes: A General Model for all bounded Pseudo-Boolean Problems.

f(x) = Σ_{i=1}^{m} f_i(x, mask_i)

(Figure: subfunctions f_1, f_2, f_3, f_4, ..., f_m, each reading a small mask of bits from the string 1 0 1 0 1 1 1 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 0 1.)

When 1 bit flips, what happens?

f(x) = Σ_{i=1}^{m} f_i(x, mask_i)

(Figure: the same bit string with one bit flipped; only the subfunctions whose masks contain the flipped bit are affected.)

Constant Time Steepest Descent

Assume we flip bit p to move from x to y_p ∈ N(x). Construct a vector Score such that

Score(x, y_p) = −2 Σ_{b: p ∈ b} (−1)^{b^T x} w_b = −2 Σ_{b: p ∈ b} w_b(x)

In this way, all of the Walsh coefficients whose signs will be changed by flipping bit p are collected into a single number Score(x, y_p). The GSAT algorithm has done this for 23 years (thanks to H. Hoos).

NOTE: Hoos and Stützle have claimed a constant time result, but without proof. An average-case complexity proof is required to obtain general constant time complexity results (Whitley 2013, AAAI). Also, it does not matter whether the problems are uniform random or not.
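A sketch of initializing the Score vector from the Walsh coefficients, under the sign convention w_b(x) = (−1)^(b·x) w_b used above; signed_coeff, init_scores and the dictionary layout are illustrative names, not the tutorial's code.

def signed_coeff(b, w, x):
    # w_b(x) = (-1)^(b . x) * w_b
    return w if sum(x[i] for i in b) % 2 == 0 else -w

def init_scores(n, walsh, x):
    # walsh: dict mapping a tuple of variable indices b -> coefficient w_b (order >= 1)
    # Score[p] = f(x with bit p flipped) - f(x) = -2 * sum of w_b(x) over b containing p
    score = [0.0] * n
    for b, w in walsh.items():
        wx = signed_coeff(b, w, x)
        for p in b:                    # each coefficient touches only the bits it contains
            score[p] += -2.0 * wx
    return score

# Example, using the coefficients of the OR-like subfunction above (order-0 term omitted)
walsh = {(3,): -0.25, (5,): -0.25, (3, 5): -0.25}
x = [0, 0, 0, 1, 0, 0]
print(init_scores(6, walsh, x))   # flipping bit 3 lowers f by 1; flipping bit 5 changes nothing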

Best Improving and Next Improving moves

Best Improving and Next Improving moves cost the same. GSAT uses a Buffer of best improving moves:

Buffer(best.improvement) = <M_10, M_1919, M_9999>

But the Buffer does not empty monotonically: this leads to thrashing. Instead, use multiple Buckets to hold improving moves:

Bucket(best.improvement) = <M_10, M_1919, M_9999>
Bucket(best.improvement − 1) = <M_8371, M_4321, M_847>
Bucket(all.other.improving.moves) = <M_40, M_519, M_6799>

This improves the runtime of GSAT by a factor of 20x to 30x. The solution for NK Landscapes is only slightly more complicated.
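A simplified sketch of the bucket idea (not the actual GSAT data structure): improving moves are grouped by their integer improvement value, so a best-improving move can be popped without rescanning the whole Score vector. All names here are hypothetical.

from collections import defaultdict

class ImprovingMoveBuckets:
    def __init__(self):
        self.buckets = defaultdict(set)      # improvement value -> set of bit positions

    def update(self, p, old_gain, new_gain):
        # Call whenever the score of bit p changes; only improving moves (gain > 0) are stored.
        self.buckets[old_gain].discard(p)
        if new_gain > 0:
            self.buckets[new_gain].add(p)

    def pop_best(self):
        nonempty = [g for g, moves in self.buckets.items() if moves]
        if not nonempty:
            return None                      # local optimum: no improving move left
        return self.buckets[max(nonempty)].pop()

# Usage: call update(p, old, new) after every Score update; call pop_best() to take a move.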

The locations of the updates are obvious:

Score(y_p, y_1) = Score(x, y_1)
Score(y_p, y_2) = Score(x, y_2)
Score(y_p, y_3) = Score(x, y_3) − 2( Σ_{b: {p,3} ⊆ b} w_b(x) )
Score(y_p, y_4) = Score(x, y_4)
Score(y_p, y_5) = Score(x, y_5)
Score(y_p, y_6) = Score(x, y_6)
Score(y_p, y_7) = Score(x, y_7)
Score(y_p, y_8) = Score(x, y_8) − 2( Σ_{b: {p,8} ⊆ b} w_b(x) )
Score(y_p, y_9) = Score(x, y_9)
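A sketch of the incremental update after flipping bit p, written with the sign conventions of the Score initialization sketch above (the constants therefore may differ from the slide's notation): only Score entries for variables that share a nonzero Walsh coefficient with p are touched, which is what makes the per-move cost constant for k-bounded problems with bounded variable occurrence.

def signed_coeff(b, w, x):                   # repeated from the earlier sketch
    return w if sum(x[i] for i in b) % 2 == 0 else -w

def flip_and_update(p, x, walsh, score, coeffs_by_var):
    # coeffs_by_var[p]: precomputed list of coefficient index tuples b with p in b
    for b in coeffs_by_var[p]:
        wx = signed_coeff(b, walsh[b], x)    # w_b(x) evaluated before the flip
        for q in b:                          # only variables sharing a coefficient with p
            score[q] += 4.0 * wx             # for q == p this reverses the sign of Score[p]
    x[p] = 1 - x[p]                          # now actually flip the bit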


What if we could look R moves ahead? Consider R = 3.

Let Score(3, x, y_{i,j,k}) indicate that we move from x to y_{i,j,k} by flipping the 3 bits i, j, k. In general, we compute Score(r, x, y_p) when flipping r bits.

f(y_i) = f(x) + Score(1, x, y_i)
f(y_{i,j}) = f(y_i) + Score(1, y_i, y_j)
f(y_{i,j}) = f(x) + Score(2, x, y_{i,j})
f(y_{i,j,k}) = f(y_{i,j}) + Score(1, y_{i,j}, y_k)
f(y_{i,j,k}) = f(x) + Score(3, x, y_{i,j,k})

With thanks to Francisco Chicano!

Why Doesn't this exponentially EXPLODE???

f(y_{i,j,k}) = ((f(x) + Score(1, x, y_i)) + Score(1, y_i, y_j)) + Score(1, y_{i,j}, y_k)
Score(3, x, y_{i,j,k}) = Score(2, x, y_{i,j}) + Score(1, y_{i,j}, y_{i,j,k})

If there is no Walsh coefficient w_{i,j}, then Score(1, y_i, y_{i,j}) = 0. Assume we have already considered all moves of length shorter than 3. If there are no Walsh coefficients linking i, j, k, then Score(3, x, y_{i,j,k}) = 0.
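A sketch of the two-bit case under the same conventions as the earlier Score sketches: the pairwise score is the sum of the two one-bit scores plus an interaction term that is identically zero when i and j share no Walsh coefficient, which is exactly why the lookahead does not explode. The names and constants follow this sketch's conventions, not necessarily the slide's notation.

def signed_coeff(b, w, x):                   # repeated from the earlier sketch
    return w if sum(x[i] for i in b) % 2 == 0 else -w

def score_pair(i, j, x, walsh, score):
    # Score(2, x, y_{i,j}) = Score(1, x, y_i) + Score(1, x, y_j) + interaction(i, j)
    # In practice one would scan only the coefficients stored with i (or j), not all of walsh.
    interaction = sum(4.0 * signed_coeff(b, w, x)
                      for b, w in walsh.items() if i in b and j in b)
    return score[i] + score[j] + interaction   # interaction == 0 if i and j are not linked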

The Variable Interaction Graph

(Figure: the variable interaction graph over vertices 0 through 9.)

Assume all distance-1 moves have been taken. There cannot be an improving move that flips bits 4, 6 and 9, because there are no interactions and no Walsh coefficients linking those variables.
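An illustrative sketch of building the VIG from the nonzero Walsh coefficients and of the connectivity test it enables: only sets of bits that induce a connected subgraph can have a nonzero joint score, so a move such as flipping bits 4, 6, 9 above can be discarded without evaluation. Function names are hypothetical.

from itertools import combinations
from collections import defaultdict

def build_vig(walsh):
    # Two variables are adjacent iff some nonzero coefficient contains both of them.
    vig = defaultdict(set)
    for b in walsh:
        for u, v in combinations(b, 2):
            vig[u].add(v)
            vig[v].add(u)
    return vig

def is_connected(move, vig):
    # True iff the flipped bits induce a connected subgraph of the VIG.
    move = set(move)
    frontier, seen = [next(iter(move))], set()
    while frontier:
        u = frontier.pop()
        seen.add(u)
        frontier.extend((vig[u] & move) - seen)
    return seen == move

vig = build_vig({(0, 1, 2): 0.5, (2, 3): -0.1, (4, 5): 0.2})
print(is_connected((0, 2, 3), vig))   # True: 0-2 and 2-3 are linked
print(is_connected((0, 4), vig))      # False: no coefficient links 0 and 4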

Multiple Step Lookahead Local Search

(Figure: distance to the optimum versus time in seconds for lookahead radius r = 1, 2, 3, 4, 5, 6.) In this figure, N = 12,000, k = 3, and q = 4. The radius is 1, 2, 3, 4, 5, 6. At r = 6 the global optimum is found.

Steepest Descent on Moments

Both f(x) and Avg(N(x)) can be computed with Walsh spans (here d = N, the size of the Hamming-distance-1 neighborhood):

f(x) = Σ_{z=0}^{3} φ^{(z)}(x)
Avg(N(x)) = f(x) − (1/d) Σ_{z=0}^{3} 2z φ^{(z)}(x)
Avg(N(x)) = Σ_{z=0}^{3} φ^{(z)}(x) − (1/N) Σ_{z=0}^{3} 2z φ^{(z)}(x)
Avg(N(x)) = Σ_{z=0}^{3} φ^{(z)}(x) − (2/N) Σ_{z=0}^{3} z φ^{(z)}(x)
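A sketch of computing f(x) and Avg(N(x)) directly from the Walsh coefficients by grouping them into elementary components φ^(z)(x) by order z; conventions and helper names follow the earlier sketches and are illustrative.

from collections import defaultdict

def signed_coeff(b, w, x):                   # repeated from the earlier sketch
    return w if sum(x[i] for i in b) % 2 == 0 else -w

def f_and_neighborhood_avg(x, walsh):
    n = len(x)
    phi = defaultdict(float)                 # phi[z] = sum of w_b(x) over coefficients with |b| = z
    for b, w in walsh.items():
        phi[len(b)] += signed_coeff(b, w, x)
    f_x = sum(phi.values())                                     # f(x) = sum_z phi^(z)(x)
    avg = f_x - (2.0 / n) * sum(z * p for z, p in phi.items())  # Avg(N(x))
    return f_x, avg

# Example: the OR-like subfunction again, now including its order-0 coefficient.
# Steepest descent on moments would descend on avg rather than on f_x.
walsh = {(): 0.75, (3,): -0.25, (5,): -0.25, (3, 5): -0.25}
print(f_and_neighborhood_avg([0, 0, 0, 1, 0, 1], walsh))   # (1.0, 1.0)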

Steepest Descent on Moments

Assume the vectors S_p = Score_p and Z_p just map the change in the Walsh coefficients. Z_p is computed using the changes in z·φ^{(z)}(x).

ScoreAvg(y_p, y_1) = ScoreAvg(x, y_1)
ScoreAvg(y_p, y_2) = ScoreAvg(x, y_2)
ScoreAvg(y_p, y_3) = ScoreAvg(x, y_3) + WeightedWalshUpdate
ScoreAvg(y_p, y_4) = ScoreAvg(x, y_4)
ScoreAvg(y_p, y_5) = ScoreAvg(x, y_5)
ScoreAvg(y_p, y_6) = ScoreAvg(x, y_6)
ScoreAvg(y_p, y_7) = ScoreAvg(x, y_7)
ScoreAvg(y_p, y_8) = ScoreAvg(x, y_8) + WeightedWalshUpdate
ScoreAvg(y_p, y_9) = ScoreAvg(x, y_9)

Θ(1) Steepest Descent on Moments

What's (Obviously) Next? Local Search with r-Move Lookahead PLUS Partition Crossover. Apply r-Move Lookahead and Partition Crossover to MAX-kSAT. Use Deterministic Improving Moves. Use Deterministic Recombination.

What's (Obviously) Next? We can now solve 1 million variable NK-Landscapes to optimality in approximately linear time. (Paper submitted.)

What's (Obviously) Next? Put an End to the domination of Black Box Optimization. Wait for Tonight and Try to Take over the World.

THANK YOU Take Home Message: PROBLEM STRUCTURE MATTERS. Black Box Optimizers can never match the performance of an algorithm that efficiently exploits problem structure. But we need only a small amount of information: Gray Box Optimization. For Mk Landscapes, we can use Deterministic Moves and Deterministic Crossover.