Motivation, Basic Concepts, Basic Methods, Travelling Salesperson Problem (TSP), Algorithms
What is Combinatorial Optimization?

Combinatorial optimization deals with problems in which we must search a very large, but finite, set of possible combinations of objects to find an optimal one. For example: choose the best combination of 42 objects out of 1,234 objects. The set containing the finite number of possibilities is called the solution space (or possibility space). Continuing the example, the solution space consists of all possible combinations of 42 objects taken from the set of 1,234 objects.

Each point in the solution space (a combination of objects) is a possible solution, and each solution has a cost or benefit associated with it. Finding an optimal combination of objects means finding one that minimizes the cost function (e.g., lowest distance) or maximizes the benefit function (e.g., highest profit).

Within the solution space there may be many very good solutions (local optima), which have very low cost or very high benefit, and one best solution (the global optimum), which has the lowest cost or the highest benefit.
Combinatorial Optimization Example

You have n = 10,000 objects, each with a different value:
- Object 1: $9.47
- Object 2: $5.32
- ...
- Object n: $7.44

A combination is defined as a subset of 100 objects; each object may be used exactly once. The solution space consists of all possible combinations of 100 objects, roughly 6.5e+241 combinations in total.

Can you find a combination of 100 objects whose total value is closest to $1,245,678.90?
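The size of this solution space can be computed directly. A quick sketch in Python (the binomial coefficient counts unordered subsets):

```python
import math

# Number of ways to choose 100 objects out of 10,000 (order does not matter):
# C(10000, 100) = 10000! / (100! * 9900!)
n_combinations = math.comb(10_000, 100)
print(f"{n_combinations:.4e}")  # roughly 6.5e+241 possible combinations
```

Even at a billion evaluations per second, enumerating a space of this size is hopeless, which is the point of the example.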
CO Example: Travelling Salesman Problem

The Travelling Salesman Problem (TSP) describes a salesperson who must travel a route to visit a set of cities. The distance between each pair of cities is known. The salesperson's problem is to:
1. Visit each city exactly once (i.e., visit each city at least once and no more than once).
2. Return to the starting city. The starting point of the route can be any of the cities.
3. Find the route that represents the minimum distance travelled.

Each possible route is an ordered combination of cities to visit, and each such ordered combination is a possible solution in the solution space. Each route has a cost: the total distance travelled. Objective: find the route with the least total distance.
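A tiny instance makes the definitions concrete. The sketch below (the four-city layout is made-up data) fixes a start city, scores every ordering of the remaining cities by total tour distance, and keeps the cheapest:

```python
import itertools
import math

# A tiny TSP instance: 4 cities at the corners of a unit square (made-up data).
cities = [(0, 0), (1, 0), (1, 1), (0, 1)]

def route_length(route):
    """Total distance of the closed tour that visits cities in this order."""
    return sum(math.dist(cities[route[i]], cities[route[(i + 1) % len(route)]])
               for i in range(len(route)))

# Exhaustive search: fix city 0 as the start and try every ordering of the rest.
best_rest = min(itertools.permutations(range(1, len(cities))),
                key=lambda rest: route_length((0,) + rest))
best_route = (0,) + best_rest
print(best_route, route_length(best_route))  # (0, 1, 2, 3) 4.0
```

The best tour walks the perimeter of the square, total distance 4.0. Exhaustive search works here only because 3! = 6 orderings exist; the next slides show why it breaks down quickly.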
Combinatorial Optimization Algorithms

Global (exact) optimum algorithms:
- Exhaustive search
- Held and Karp (1962): dynamic programming
- Branch-and-cut (a refinement of branch-and-bound)
- The Concorde implementation holds the current record: finding the best route for 85,900 cities.

However, even for the state-of-the-art Concorde implementation, these exact algorithms take a long time to compute. In April 2006, the Concorde TSP Solver's 85,900-city solution took over 136 CPU-years. (A CPU-year is the amount of computing work done by a 1 GFLOPS reference machine in a year of dedicated service.)

Evolutionary algorithms (heuristic and approximate):
- Hill climbing (and variants, such as random restarts)
- Simulated annealing
- Genetic algorithms
- Artificial neural networks
- Ant Colony Optimization
- Particle Swarm Optimization
Exhaustive search suffers from a serious problem: as the number of variables increases, the number of combinations to be examined explodes. For example, consider the travelling salesperson problem with 23 cities to visit. Given a starting city, we have n-1 choices for the second city, n-2 choices for the third city, and so on. Multiplying these together gives (n-1)! = (n-1) x (n-2) x (n-3) x ... x 3 x 2 x 1. Since our travel costs do not depend on the direction we take around the tour, we divide this number by 2, leaving (n-1)!/2 distinct tours. For n = 23, that is 22!/2, or about 5.6e+20 possible solutions.

How large a number is that? Suppose we have a computer capable of evaluating a feasible solution in one nanosecond (10^-9 s). With only 23 cities to visit, it would take approximately 178 centuries to run through all the possible tours.
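The arithmetic behind the "178 centuries" claim can be checked in a few lines:

```python
import math

n = 23                                   # number of cities
tours = math.factorial(n - 1) // 2       # (n-1)!/2 distinct tours
evals_per_second = 1e9                   # one evaluation per nanosecond
seconds = tours / evals_per_second
centuries = seconds / (3600 * 24 * 365.25 * 100)
print(f"{tours:.3e} tours -> about {centuries:.0f} centuries")
```

22!/2 is about 5.62e+20 tours, and at a billion evaluations per second that is roughly 178 centuries of computation, matching the figure above.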
In contrast, evolutionary algorithms do not take practically forever to solve such problems. These algorithms are called evolutionary because:
- They mimic the natural processes that govern how systems in nature evolve.
- They are iterative algorithms that evolve incrementally, hopefully improving their solution with each iteration.

These algorithms do not attempt to examine the entire solution space. Even so, they have been shown to provide good solutions in a fraction of the time required by exact algorithms, which for large problems take an infeasibly large amount of time.
The annealing of solids is a process in nature that naturally evolves. The term annealing refers to the process in which a solid that has been brought into the liquid phase by increasing its temperature is brought back to the solid phase by slowly reducing the temperature, in such a way that all the particles are allowed to arrange themselves in the strongest possible crystallized state. Such a crystallized state represents the global minimum of the solid's energy function.

The cooling has to be slow enough to guarantee that, at every temperature setting in the cooling process, the particles have time and opportunity to rearrange themselves and find their best positions at that temperature. Once the particles have reached their best positions at a given temperature, the substance is said to have reached thermal equilibrium. The temperature is then lowered once again, and the process continues. At completion, the atoms have attained a nearly globally minimum energy state.
During the annealing process, the probability of a particle's motion is given by the Boltzmann probability function:

    P(dE) = exp(-dE / (k * T))

where dE is the increase in energy from one state to the next, T is the temperature, and k is Boltzmann's constant.

- The probability of a particle's range of motion is proportional to the temperature. At higher temperatures, the particles have a greater range of motion than at lower temperatures. The high temperature at the beginning of the annealing process allows the particles to move over a greater range; as the temperature decreases, the particles' range of motion decreases exponentially.
- The probability of a move is inversely related to the change in energy it causes: for large positive changes in energy (moves into much higher energy states), the probability of movement becomes very small. In optimization terms, once a good low-energy state has been reached, large steps away from it become unlikely.
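The temperature dependence of this rule is easy to see numerically. A minimal sketch (with Boltzmann's constant folded into the temperature units):

```python
import math

def acceptance_probability(delta_e, temperature):
    """Boltzmann-style rule: moves that lower the energy are always taken;
    moves that raise it by delta_e survive with probability exp(-delta_e/T)
    (Boltzmann's constant folded into the temperature units)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)

# The same energy increase is far more likely to be accepted while hot:
for t in (100.0, 10.0, 1.0):
    print(t, round(acceptance_probability(5.0, t), 4))
# 100.0 0.9512
# 10.0 0.6065
# 1.0 0.0067
```

An energy increase of 5 is accepted about 95% of the time at T = 100, but less than 1% of the time at T = 1: the same exponential that later drives the simulated annealing acceptance rule.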
Dependence on Temperature and Cooling Schedule

The annealing process consists of first raising the temperature of a solid to a point where its atoms can move freely (i.e., randomly) and then lowering the temperature, forcing the atoms to rearrange themselves into a lower energy state (i.e., a crystallization process). The cooling schedule is vital in this process. If the solid is cooled too quickly, or if the initial temperature of the system is too low, the substance is not able to become a crystal and instead ends the annealing process in an amorphous state with higher energy. In this case, the system reaches a poor local minimum (a higher energy state) instead of the global minimum (the minimal energy state). For example, if you let metal cool rapidly, its atoms aren't given a chance to settle into a tight lattice and are frozen in a random configuration, resulting in brittle metal. If we decrease the temperature very slowly, the atoms are given enough time to settle into a strong crystal.
Application of Annealing to Combinatorial Optimization Problems

The idea of the annealing process may be applied to combinatorial optimization problems [1][2].

[1] Metropolis, Nicholas; Rosenbluth, Arianna W.; Rosenbluth, Marshall N.; Teller, Augusta H.; Teller, Edward (1953). "Equation of State Calculations by Fast Computing Machines". The Journal of Chemical Physics 21 (6): 1087. Bibcode: 1953JChPh..21.1087M. doi:10.1063/1.1699114.
[2] Kirkpatrick, S.; Gelatt Jr., C. D.; Vecchi, M. P. (1983). "Optimization by Simulated Annealing". Science 220 (4598): 671-680. Bibcode: 1983Sci...220..671K. doi:10.1126/science.220.4598.671. JSTOR 1690046. PMID 17813860.
Application to Combinatorial Optimization Problems, such as TSP

The idea of the annealing process may be applied to combinatorial optimization problems, such as the Travelling Salesperson Problem (TSP). The analogy:
- There are many molecules in a substance. -> There are many cities to visit (the list of cities).
- There are many more positions in the substance that a molecule can take. -> There are many possible routes to visit the cities in the list.
- Each set of positions the molecules can take is called a combination. -> Each route (ordered sequence of cities to visit) is a combination.
- Each combination has an associated cost: the amount of energy represented by the combination. -> Each route has a cost: the total distance of the route.
- The optimization problem is to find a combination with the lowest cost (energy). -> Find the route with the shortest distance.
Simulated Annealing Algorithm: Basic Structure

The basic simulated annealing algorithm is an iterative, evolutionary procedure composed of two loops: an outer loop and a nested inner loop.
- The inner loop simulates the goal of attaining thermal equilibrium at a given temperature. This is called the Thermal Equilibrium Loop.
- The outer loop performs the cooling process, in which the temperature is decreased from its initial value towards zero until a termination criterion is met and the search stops. This is called the Cooling Loop.

The algorithm starts by initializing several parameters:
- The initial temperature is set to a very high value (to mimic the initial temperature of the natural annealing process). This allows the algorithm to search a wide breadth of solutions initially.
- The initial solution is created, usually chosen randomly.
- The number of times to run the inner loop and the outer loop is set.
Each time the Thermal Equilibrium Loop (the inner loop) is entered, it runs at a constant temperature, and its goal is to find the best solution for that temperature, attaining thermal equilibrium. In each iteration of the Thermal Equilibrium Loop, the algorithm performs the following:
- A small random perturbation of the current best-so-far solution is made to create a new candidate solution. Since the algorithm does not know which direction to search, it picks a random direction by randomly perturbing the current solution.
- The goodness of a solution is quantified by a cost function. For example, the cost function for the TSP is the sum of the distances between successive cities in the solution's list of cities to visit.
- We can picture the solution space as a cost surface in hyperspace, where each location is a route (an ordered list of cities) and the height of the surface at that location is the cost value. The algorithm wants to reach the location (route) with the lowest cost.
- A small step is taken because good solutions are generally believed to lie close to each other. This is not guaranteed to hold all of the time, however; it depends on the problem and the distribution of solutions in the solution space.
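For the TSP, one common small perturbation is to reverse a random sub-segment of the tour (a 2-opt style move). A minimal sketch, where the move choice itself is an illustrative assumption rather than something the slides prescribe:

```python
import random

def perturb(route, rng=random):
    """Create a candidate route by a small random change: reverse a randomly
    chosen sub-segment of the tour (a 2-opt style move). Returns a new list."""
    i, j = sorted(rng.sample(range(len(route)), 2))
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]

rng = random.Random(0)
route = list(range(8))           # cities labelled 0..7, in some current order
candidate = perturb(route, rng)
print(candidate)                 # same cities, slightly different order
```

The candidate visits exactly the same cities, just in a slightly different order, so it is always a valid solution in the TSP solution space.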
In each iteration of the Thermal Equilibrium Loop, continued:
- Sometimes the random perturbation results in a better solution: if the cost of the new candidate solution is lower than the cost of the previous solution, the candidate is kept and replaces the previous solution.
- Sometimes the random perturbation results in a worse solution, in which case the algorithm must decide whether to keep or discard it. The decision depends on evaluating a probability function of the current temperature and of the change in cost the candidate represents.
- The higher the temperature, the more likely the algorithm is to keep a worse solution. Keeping a worse solution allows the algorithm to explore the solution space and keeps it from being trapped in a local minimum.
- The lower the temperature, the less likely the algorithm is to keep a worse solution. Discarding worse solutions allows the algorithm to exploit a local optimum, which might be the global optimum.
In each iteration of the Thermal Equilibrium Loop, continued:
The decision outcome depends on evaluating an estimate [1] of Boltzmann's probability function:

    P(accept) = exp(-dE / (k * T))

- For a real annealing process, dE is the change in energy of the atoms between the previous temperature state and the current one, where the energy is the potential and kinetic energy of the atoms in the substance at the given temperature.
- For simulated annealing, dE may be estimated by the change in the cost function: the difference between the cost of the new candidate solution at the current temperature state and the cost of the previously found best solution, dCost = cost(candidate) - cost(best so far).
- Boltzmann's constant k may be estimated by the average cost taken over all iterations of the inner loop. Therefore:

    P(accept) = exp(-dCost / (avgCost * T))
In each iteration of the Thermal Equilibrium Loop, continued:
The decision outcome depends on:

    P(accept) = exp(-dCost / (avgCost * T))

As can be seen, this probability increases with the temperature and decreases with the normalized change in cost of the candidate solution.
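This acceptance rule can be sketched as a small function. The names here (avg_cost for the running average cost that estimates Boltzmann's constant) are illustrative, not taken from the slides:

```python
import math
import random

def accept_worse(delta_cost, temperature, avg_cost, rng=random):
    """Decide whether to keep a candidate whose cost rose by delta_cost > 0,
    using P(accept) = exp(-delta_cost / (avg_cost * temperature))."""
    p_accept = math.exp(-delta_cost / (avg_cost * temperature))
    return rng.random() < p_accept

# At high temperature a small normalized cost increase is usually kept;
# at low temperature the same increase is almost never kept.
```

Dividing by the average cost normalizes the cost change, so the same temperature schedule behaves sensibly whether route distances are measured in metres or kilometres.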
Thus, in one iteration of the inner loop, the algorithm will either find and keep a better solution, keep a worse solution with the acceptance probability, or make no change and keep the previously found best solution. The algorithm repeats this procedure for a set number of inner-loop iterations. Viewed over many such iterations (each taking a new better solution, taking a worse solution, or keeping the previous best), the algorithm performs a random walk in the solution space, looking for a stable sub-optimal solution at the given temperature. Once such a stable solution is found, the process is said to have reached thermal equilibrium. At this point the inner loop completes, and the algorithm returns to the outer loop.
The outer loop (the Cooling Loop) performs the following:
- The current best solution is recorded as the optimal solution found so far.
- The temperature is decreased according to some schedule. The initial temperature is set to a very high value (mimicking the start of the natural annealing process) so the algorithm can initially search a wide breadth of solutions; the final temperature should be set to some low value to prevent the algorithm from accepting worse solutions in the late stages of the process.
- The number of remaining outer loops is decremented. If it hasn't reached zero, the inner loop is called once again; otherwise the algorithm terminates.
[Flowchart: simulated annealing for the TSP (ncl = numCoolingLoops, nel = numEquilibriumLoops).
Initialization: set #ncl and the initial temperature; get an initial route; compute the distance of the initial route; set #nel.
Equilibrium Loop (the temperature is held constant while the system reaches equilibrium, i.e., until the best route is found for the given temperature): perturb the route; compute the distance of the route; if it is a better route, keep it; otherwise generate a random number and keep the worse route if the number falls below the probability of acceptance; decrement #nel; repeat until #nel == 0.
Cooling Loop: reduce the temperature; decrement #ncl; repeat until #ncl == 0, then done.]
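The flowchart can be sketched end to end in Python. The loop counts, the initial temperature, and the geometric cooling factor below are illustrative choices, not values from the slides:

```python
import math
import random

def simulated_annealing_tsp(cities, num_cooling_loops=60,
                            num_equilibrium_loops=500, seed=0):
    """Outer Cooling Loop around an inner Thermal Equilibrium Loop, as in
    the flowchart. Returns the best route found and its total distance."""
    rng = random.Random(seed)

    def length(route):
        return sum(math.dist(cities[route[i]], cities[route[(i + 1) % len(route)]])
                   for i in range(len(route)))

    route = list(range(len(cities)))           # initial route, chosen randomly
    rng.shuffle(route)
    cost = length(route)
    temperature = 1.0                          # initial (normalized) temperature
    for _ in range(num_cooling_loops):         # Cooling Loop
        for _ in range(num_equilibrium_loops): # Thermal Equilibrium Loop
            # Perturb: reverse a random sub-segment (2-opt style move).
            i, j = sorted(rng.sample(range(len(route)), 2))
            candidate = route[:i] + route[i:j + 1][::-1] + route[j + 1:]
            delta = length(candidate) - cost
            # Keep better routes; keep worse ones with Boltzmann probability.
            if delta < 0 or rng.random() < math.exp(-delta / temperature):
                route, cost = candidate, cost + delta
        temperature *= 0.9                     # reduce temperature
    return route, cost

# Example: 8 cities on a circle; the best tour simply follows the circle.
pts = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
       for k in range(8)]
best_route, best_cost = simulated_annealing_tsp(pts)
print(best_route, round(best_cost, 3))
```

For points on a circle the shortest tour is the circle order, with total length 16*sin(pi/8), about 6.123; the sketch reliably finds a tour at or very near that length for this small instance.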
Watch a YouTube simulation: https://www.youtube.com/watch?v=sc5cx8dratu
References
[1] J. D. Hedengren, "Optimization Techniques in Engineering," 5 April 2015. [Online]. Available: http://apmonitor.com/me575/index.php/main/homepage. [Accessed 27 April 2015].
[2] A. R. Parkinson, R. J. Balling and J. D. Hedengren, "Optimization Methods for Engineering Design: Applications and Theory," Brigham Young University, 2013.