SIMULATED ANNEALING

Fundamental Concept
Motivated by an analogy to the statistical mechanics of annealing in solids: coercing a solid from a poor, unordered state into a low-energy thermodynamic equilibrium (a highly ordered, defect-free state such as a crystal lattice). The material is annealed by heating it to a temperature that permits many atomic rearrangements, then cooling it slowly until it freezes into a good crystal. Metropolis procedure: unlike a greedy algorithm, it allows perturbations to move uphill in a controlled fashion, so the search can escape from local minima.

Design Metaphor
Simulated annealing offers a physical analogy for the solution of optimization problems. Boltzmann distribution:

$$\frac{N_{\text{new\_state}}}{N_{\text{current\_state}}} = \exp\!\left(-\frac{E_{\text{new\_state}} - E_{\text{current\_state}}}{kT}\right)$$

where $N_i$ is the number of atoms with energy $E_i$ ($i$ = new_state or current_state), $k$ is the Boltzmann constant, and $T$ is the absolute temperature. In simulated annealing the left-hand side is interpreted as a probability: the probability of accepting an uphill move of size ΔE at temperature T is

$$\Pr(\text{accept}) = e^{-\Delta E / kT}$$

If ΔE < 0, the new configuration is accepted; if ΔE > 0, the probability of accepting the worse configuration is computed from the Boltzmann distribution.

At higher temperatures, the probability of a large uphill move in energy is large, permitting an aggressive, essentially random search of the configuration space. At lower temperatures the probability is small, so few uphill moves are allowed. By successively lowering the temperature (the cooling schedule) and running the algorithm, we can simulate the material coming into equilibrium at each newly reduced temperature.
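To make the temperature's effect concrete, here is a minimal sketch (Python, not part of the original slides) evaluating the acceptance probability e^(-ΔE/T) for a fixed uphill move at several temperatures, with the constant k absorbed into T:

```python
import math

def accept_prob(delta_e, T):
    """Metropolis acceptance probability for an uphill move (delta_e > 0)."""
    return math.exp(-delta_e / T)

# The same uphill move of size 1.0 is nearly always accepted when hot
# and almost never accepted when cold.
for T in (10.0, 1.0, 0.1):
    print(f"T = {T:5.1f}   Pr(accept, dE = 1.0) = {accept_prob(1.0, T):.4f}")
# T =  10.0   Pr(accept, dE = 1.0) = 0.9048
# T =   1.0   Pr(accept, dE = 1.0) = 0.3679
# T =   0.1   Pr(accept, dE = 1.0) = 0.0000
```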

Cooling Schedule
A starting hot temperature plus rules determining when the temperature should be lowered, by how much, and when annealing should be terminated. A common rule is T = αT with α < 1. If α is very small, the temperature drops very fast and there is a high possibility of being trapped in a local minimum; if α is large (close to 1), the temperature decreases very slowly. Many schemes exist to reduce the temperature:

Fast (Cauchy): $T_k = T_0 / k$
Geometric: $T_k = \alpha^k T_0$
Boltzmann: $T_k = T_0 / \ln(k)$

where $T_0$ is the initial temperature and $T_k$ is the temperature after the k-th temperature decrement.
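The three schedules are easy to compare side by side; a small sketch (Python; ln(k + 1) is used for the Boltzmann schedule so it is defined from k = 1, an assumption consistent with the schedule plots shown later):

```python
import math

def fast_cauchy(T0, k):
    return T0 / k

def geometric(T0, alpha, k):
    return T0 * alpha ** k

def boltzmann(T0, k):
    return T0 / math.log(k + 1)

# Geometric cooling with alpha = 0.99 falls far more slowly than the
# Cauchy schedule but much faster than the logarithmic Boltzmann one.
for k in (1, 10, 100, 1000):
    print(k, fast_cauchy(2, k), geometric(2, 0.99, k), boltzmann(2, k))
```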

Pseudocode

    M = number of moves (perturbations) to attempt
    T = current temperature
    FOR m = 1 TO M
        Generate a random move
        Evaluate the change in energy ΔE
        IF ΔE < 0 THEN
            Accept this move and update the configuration   /* downhill move */
        ELSE
            Accept with probability P = e^(-ΔE/T)
            Update the configuration if accepted
        ENDIF
        Update temperature: T = αT
    ENDFOR
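A runnable version of this pseudocode might look as follows (a Python sketch under the slides' conventions: geometric cooling applied after every move; the names `energy` and `neighbor` are my own, not from the slides):

```python
import math
import random

def simulated_annealing(initial, energy, neighbor, T0=2.0, alpha=0.99,
                        max_moves=100_000, T_min=1e-9):
    """Generic SA loop: always accept downhill moves, accept uphill
    moves with probability exp(-dE/T), and cool geometrically."""
    current = initial
    current_e = energy(current)
    best, best_e = current, current_e
    T = T0
    for _ in range(max_moves):
        candidate = neighbor(current)               # generate a random move
        delta_e = energy(candidate) - current_e     # change in energy
        if delta_e < 0 or random.random() < math.exp(-delta_e / T):
            current, current_e = candidate, current_e + delta_e
            if current_e < best_e:
                best, best_e = current, current_e
        T = max(T * alpha, T_min)                   # update temperature
    return best, best_e
```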

Simulated Annealing for TSP
Initial configuration: a random permutation 1, 2, 3, 4, ..., N
Temperature T = 2; cooling rate α = 0.99
Energy: d(1,2) + d(2,3) + ... + d(N,1)
Generate a new configuration from the current one at random.
Evaluate ΔE = current energy − previous energy.
If ΔE < 0, accept the new configuration (downhill);
else accept the new configuration with probability P = e^(−ΔE/T).
T = αT
Stopping criteria: energy < threshold, or the maximum number of iterations is reached.
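Plugged into the generic routine sketched above, the TSP pieces might look like this (a sketch; `cities` is assumed to be a list of (x, y) pairs, and the move shown is the simple two-vertex swap discussed later):

```python
import math
import random

def tour_length(tour, cities):
    """Energy of a tour: d(1,2) + d(2,3) + ... + d(N,1)."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def swap_two_cities(tour):
    """Random move: exchange two randomly chosen cities in a copy of the tour."""
    i, j = random.sample(range(len(tour)), 2)
    new = tour[:]
    new[i], new[j] = new[j], new[i]
    return new

# Hypothetical usage with the generic sketch above:
# best, d = simulated_annealing(list(range(len(cities))),
#                               lambda t: tour_length(t, cities),
#                               swap_two_cities, T0=2.0, alpha=0.99)
```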

Case Study: 20-city problem
[Figures: the initial random permutation and the best route found for the 20-city problem, d = 4.0185 (d = 4.0192).]

Initial temperature        2        2        2        2        2
Cooling rate               0.8      0.9      0.95     0.99     0.999
Converged iteration        87       155      298      851      5982
Minimal energy obtained    4.1056   4.0920   4.0427   4.0185   4.0185

[Figure: temperature vs. iteration for T_0 = 2, α = 0.99; converged at iteration 851 with d = 4.0185.]

Case Study: 101-city problem (eil101)
T_0 = 200, α = 0.999, iterations = 100,000, d = 661.25 (global minimum = 629).

Research Issues
Initial temperature
Initial configuration
Next move (neighborhood size): is it possible to design a truly local search for combinatorial optimization problems (as opposed to numerical optimization problems)?
Cooling schedule (cooling rate)
Stopping criteria
Acceptance probability

Effect on Initialization
Random sequence, Hilbert space-filling curve, fastest closest path. Experimental results indicate that the initialization method does not much affect the probability of the system converging to the global optimum.

Space-Filling Curve
A space-filling curve is a continuous mapping from a lower-dimensional space into a higher-dimensional one. A space-filling curve is formed by repeatedly copying and shrinking a simple pattern. A Hilbert curve (also known as a Hilbert space-filling curve) is a continuous fractal space-filling curve first described by the German mathematician David Hilbert in 1891.

Divide the square into four smaller squares I_00, I_01, I_10, and I_11. Define f_0 so that it maps [0, 1/4] into I_00, [1/4, 1/2] into I_01, and so on.
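One way to turn this construction into an initial tour (a sketch, not from the slides): discretize the unit square into a 2^order × 2^order grid, compute each city's index along the Hilbert curve with the standard bit-manipulation conversion, and visit the cities in index order.

```python
def hilbert_index(order, x, y):
    """Index of grid cell (x, y) along a Hilbert curve over a 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/reflect the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def hilbert_initial_tour(cities, order=8):
    """Initial TSP tour: city indices sorted by Hilbert-curve position
    (cities are assumed to lie in the unit square)."""
    n = 1 << order
    cell = lambda v: min(int(v * n), n - 1)
    return sorted(range(len(cities)),
                  key=lambda i: hilbert_index(order, cell(cities[i][0]),
                                              cell(cities[i][1])))
```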

Effect on Next Move
Modification: random switch of two vertices; random switch of two edges.
[Figure: example 8-city tours C_1 ... C_8. Starting from the tour 1 2 3 4 5 6 7 8, swapping vertices 2 and 5 gives 1 5 3 4 2 6 7 8; switching two edges (reversing the segment between them) gives 1 5 4 3 2 6 7 8.]

Random switches of three edges
[Figure: example 8-city tours. Starting from 1 2 3 4 5 6 7 8, three-edge switches yield tours such as 1 4 5 2 3 6 7 8, 1 4 5 3 2 6 7 8, 1 5 4 2 3 6 7 8, and 1 3 2 5 4 6 7 8.]
Experimental results show that the three-edge change is much more powerful in making the method converge to the global minimum than the other methods. However, its processing cost is also higher, as one would expect.

2-opt
Cutting the tour into two segments S1 and S2 and reconnecting gives, for the symmetric TSP, the variants [S1, S2] (original), [S1, -S2], and [-S1, S2], where -S denotes a reversed segment.
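A minimal sketch of the [S1, -S2] variant in Python (cut the tour at two random points and reverse the enclosed segment; for a symmetric TSP this is the classic 2-opt move):

```python
import random

def two_opt_move(tour):
    """Reverse the segment between two random cut points."""
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
```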

3-opt
[Figure: the candidate reconnections of three segments S1, S2, S3 — the segment order may change and each segment may be reversed, e.g., [S1, -S2, S3], [S1, -S2, -S3], [S1, S3, S2], [S1, -S3, S2], [S1, -S3, -S2], [-S1, S3, S2], [-S1, -S3, S2], ...]

For the symmetric TSP, the new tour is the best of the 7 new tours; for the asymmetric TSP, the best of the 15 new tours.
Larger neighborhood vs. smaller neighborhood:
Pro: a larger neighborhood gives greater improvement per iteration and faster convergence.
Con: the search space per move grows exponentially.
A best-of-neighborhood step is sketched below.
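The best-of-neighborhood selection can be sketched as follows (an illustration only: repeated random 2-opt moves, reusing `two_opt_move` from the sketch above, stand in for the slides' enumerated 7 or 15 reconnections):

```python
def best_of_neighborhood(tour, energy, n_candidates=7):
    """Generate several candidate tours and return the lowest-energy one."""
    return min((two_opt_move(tour) for _ in range(n_candidates)), key=energy)
```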

Effect on Cooling Schedule
[Figures: temperature T vs. time k for four schedules —
Fast (Cauchy): T_k = T_0 / k;
Geometric: T_k = α^k T_0 for α = 0.1, 0.2, ..., 0.9;
Boltzmann: T_k = T_0 / ln(k + 1);
Logarithmic: T_k = h / ln(k + 1) for h = 2, 4, 6, 8, 10.]

Theoretical Results
Generic SA starts with an initial configuration generated at random. At each step, it selects the next configuration Y from the neighborhood N_X of the current configuration X. The next configuration is accepted as the current one if its cost is no greater than that of the current configuration; otherwise it is accepted only with a probability given by the Boltzmann distribution. The procedure is repeated with a slow decrease of the control parameter T, called the temperature, until a sufficiently good solution has been found.

The performance of generic SA can be analyzed by non-stationary Markov chain theory. Suppose the probability of generating the next configuration Y from the current configuration X is

$$g_{XY} = \begin{cases} \dfrac{1}{|N_X|}, & Y \in N_X \\ 0, & Y \notin N_X \end{cases}$$

which is independent of the temperature T, and the probability of accepting Y as the new configuration is

$$a_{XY}(T) = \min\!\left(1, \exp\!\left(-\frac{E_Y - E_X}{T}\right)\right)$$

The one-step transition probabilities of the non-stationary Markov chain associated with SA are

$$P_{XY}(T) = \begin{cases} g_{XY}\, a_{XY}(T), & Y \in N_X,\ Y \neq X \\ 0, & Y \notin N_X,\ Y \neq X \\ 1 - \sum_{Z \neq X} g_{XZ}\, a_{XZ}(T), & Y = X \end{cases}$$

S: the configuration space; N_X: the neighborhood of the current configuration X.
The convergence of SA has been proven by several researchers, e.g., S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, pp. 721-742, 1984. Given that SA's generation and acceptance probabilities satisfy the equations on the previous slide, SA converges to the set of global minima:

$$\lim_{n \to \infty} \Pr\{X_n \in S^*\} = 1, \qquad S^* = \{X \in S : E_X \le E_Y \ \text{for all}\ Y \in S\}$$

The rate of temperature decrease is

$$T_n = \frac{h}{\ln(n + 1)}$$

where h is a problem-dependent constant and T_n stands for the temperature at time n. A better (faster) schedule achieving the same convergence is reported in:
X. Yao and G.J. Li, "Performance analysis of simulated annealing," in L.W. Chen, B.C. Chen and Q.N. Sun, editors, Proceedings of the Third Pan-Pacific Computer Conference, pp. 792-980, 1989.
In practice, we use $T_{n+1} = \rho T_n$, where ρ is a constant selected between 0.8 and 0.99.

Conjecture: In the earlier stage, raising the temperature makes escaping from local minima easier; however, how much change should be allowed was not addressed.
Theorem: SA with larger neighborhoods has a greater probability of arriving at a global optimum than generic SA, provided the other conditions (initial configuration, initial temperature, and cooling rate) are kept the same.
Guideline: In the earlier stage, allow the next state to be generated from a larger neighborhood of the current state. As the process continues, the neighborhood should be reduced accordingly. But HOW? (One possible heuristic is sketched below.)
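The "HOW" is left open by the slides; one plausible heuristic (my assumption, not the slides' answer) couples the neighborhood size to the temperature, e.g., bounding the reversed segment's length by a fraction that shrinks with T:

```python
import random

def temperature_scaled_move(tour, T, T0):
    """2-opt-style move whose maximum segment length shrinks as T falls.

    This is a hypothetical neighborhood-reduction scheme, not taken
    from the slides: long segments are reversed when hot, short when cold."""
    n = len(tour)
    max_len = max(2, int(n * T / T0))
    i = random.randrange(n - 1)
    j = min(n - 1, i + random.randint(1, max_len))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
```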

Applications
Computer-aided VLSI design: Simulated Annealing for VLSI Design, Kluwer Academic, 1988
Combinatorial optimization: Simulated Annealing: Theory and Applications, Reidel Publishing, 1987
Neural network training: "A learning algorithm for Boltzmann machines," Cognitive Science, 9: 147-169, 1985
Image processing: "Image processing by simulated annealing," IBM Journal of Research and Development, 29: 569-579, 1985
Code design: "Using simulated annealing to design good codes," IEEE Transactions on Information Theory, 33: 116-123, 1987
Function optimization: "An empirical study of bit vector function optimization," in Genetic Algorithms and Simulated Annealing, Chapter 13, pp. 170-204, 1987
etc.

Selected References
S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, "Optimization by simulated annealing," Science, 220(4598), pp. 671-680, 1983.
B. Moon, H.V. Jagadish, C. Faloutsos, and J.H. Saltz, "Analysis of the clustering properties of the Hilbert space-filling curve," IEEE Transactions on Knowledge and Data Engineering, 13(1), pp. 124-141, 2001.
S. Lin and B. Kernighan, "An effective heuristic algorithm for the traveling-salesman problem," Operations Research, 21(2), pp. 498-516, 1973.

Homework #2 (due 10/17/2009)
Problem #1 (Combinatorial Optimization): Develop a generic simulated annealing algorithm to solve the traveling salesman problem with 20 cities that are uniformly distributed within a unit square in a 2-dimensional plane. The coordinates of the 20 cities are given below in a matrix:

cities =
0.6606, 0.9695, 0.5906, 0.2124, 0.0398, 0.1367, 0.9536, 0.6091, 0.8767, 0.8148
0.9500, 0.6740, 0.5029, 0.8274, 0.9697, 0.5979, 0.2184, 0.7148, 0.2395, 0.2867
0.3876, 0.7041, 0.0213, 0.3429, 0.7471, 0.5449, 0.9464, 0.1247, 0.1636, 0.8668
0.8200, 0.3296, 0.1649, 0.3025, 0.8192, 0.9392, 0.8191, 0.4351, 0.8646, 0.6768

Show the best route you find and the associated distance, with the computer code attached (with documentation: show your recipe). An example is given below for reference.

[Figure: example best route for the 20-city problem, plotted in the unit square.]

Problem #2 (Scalability): Extend your simulated annealing algorithm to solve the benchmark 101-city symmetric TSP (eil101, due to Christofides and Eilon). The benchmark problem can be found in the TSPLIB archive at http://www.iwr.uni-heidelberg.de/groups/comopt/software/tsplib95/ No need to turn in the code; only the best route found (i.e., its configuration and distance, in the form of the figure above) is to be turned in. This problem tests whether your algorithm can scale up properly.

Problem #3: Extend your simulated annealing algorithm to solve the benchmark 17-city asymmetric TSP (br17, due to Repetto). The benchmark problem can be found in the TSPLIB archive. Turn in only the best route found. The distance matrix (9999 marks the diagonal):

9999    3    5   48   48    8    8    5    5    3    3    0    3    5    8    8    5
   3 9999    3   48   48    8    8    5    5    0    0    3    0    3    8    8    5
   5    3 9999   72   72   48   48   24   24    3    3    5    3    0   48   48   24
  48   48   74 9999    0    6    6   12   12   48   48   48   48   74    6    6   12
  48   48   74    0 9999    6    6   12   12   48   48   48   48   74    6    6   12
   8    8   50    6    6 9999    0    8    8    8    8    8    8   50    0    0    8
   8    8   50    6    6    0 9999    8    8    8    8    8    8   50    0    0    8
   5    5   26   12   12    8    8 9999    0    5    5    5    5   26    8    8    0
   5    5   26   12   12    8    8    0 9999    5    5    5    5   26    8    8    0
   3    0    3   48   48    8    8    5    5 9999    0    3    0    3    8    8    5
   3    0    3   48   48    8    8    5    5    0 9999    3    0    3    8    8    5
   0    3    5   48   48    8    8    5    5    3    3 9999    3    5    8    8    5
   3    0    3   48   48    8    8    5    5    0    0    3 9999    3    8    8    5
   5    3    0   72   72   48   48   24   24    3    3    5    3 9999   48   48   24
   8    8   50    6    6    0    0    8    8    8    8    8    8   50 9999    0    8
   8    8   50    6    6    0    0    8    8    8    8    8    8   50    0 9999    8
   5    5   26   12   12    8    8    0    0    5    5    5    5   26    8    8 9999