
The Traveling Salesman Problem: A Neural Network Perspective

Jean-Yves Potvin
Centre de Recherche sur les Transports
Université de Montréal
C.P. 6128, Succ. A, Montréal (Québec)
Canada H3C 3J7
potvin@iro.umontreal.ca

Abstract. This paper surveys the "neurally" inspired problem-solving approaches to the traveling salesman problem, namely, the Hopfield-Tank network, the elastic net, and the self-organizing map. The latest achievements in the neural network domain are reported, and numerical comparisons are provided with the classical solution approaches of operations research. An extensive bibliography with more than one hundred references is also included.

Introduction

The Traveling Salesman Problem (TSP) is a classical combinatorial optimization problem, which is simple to state but very difficult to solve. The problem is to find the shortest possible tour through a set of N vertices so that each vertex is visited exactly once. This problem is known to be NP-hard, and no polynomial-time exact algorithm is known for it. Many exact and heuristic algorithms have been devised in the field of operations research (OR) to solve the TSP. We refer readers to [15, 64, 65] for good overviews of the TSP. In the sections that follow, we briefly introduce the OR problem-solving approaches to the TSP. Then, the neural network approaches for solving that problem are discussed.

Exact Algorithms

The exact algorithms are designed to find the optimal solution to the TSP, that is, the tour of minimum length. They are computationally expensive because they must (implicitly) consider

all feasible solutions in order to identify the optimum. The exact algorithms are typically derived from the integer linear programming (ILP) formulation of the TSP

Min Σ_i Σ_j d_ij x_ij

subject to:

Σ_j x_ij = 1, i = 1,...,N
Σ_i x_ij = 1, j = 1,...,N
(x_ij) ∈ X
x_ij = 0 or 1,

where d_ij is the distance between vertices i and j and the x_ij's are the decision variables: x_ij is set to 1 when arc (i,j) is included in the tour, and 0 otherwise. (x_ij) ∈ X denotes the set of subtour-breaking constraints that restrict the feasible solutions to those consisting of a single tour. Although the subtour-breaking constraints can be formulated in many different ways, one very intuitive formulation is

Σ_{i,j ∈ S_v} x_ij ≤ |S_v| - 1    (S_v ⊂ V; 2 ≤ |S_v| ≤ N-2),

where V is the set of all vertices, S_v is some subset of V, and |S_v| is the cardinality of S_v. These constraints prohibit subtours, that is, tours on subsets with fewer than N vertices. If there were such a subtour on some subset of vertices S_v, this subtour would contain |S_v| arcs. Consequently, the left-hand side of the inequality would be equal to |S_v|, which is greater than |S_v| - 1, and the constraint would be violated for this particular subset. Without the subtour-breaking constraints, the TSP reduces to an assignment problem (AP), and a solution like the one shown in Figure 1 would then be feasible.

Branch and bound algorithms are commonly used to find an optimal solution to the TSP, and the AP-relaxation is useful to generate good lower bounds on the optimum value. This is true in particular for asymmetric problems, where d_ij ≠ d_ji for some i,j. For symmetric problems, like the Euclidean TSP (ETSP), the AP-solutions often contain many subtours with only two vertices. Consequently,

these problems are better addressed by specialized algorithms that can exploit their particular structure. For instance, a specific ILP formulation can be derived for the symmetric problem which allows for relaxations that provide sharp lower bounds (e.g., the shortest spanning one-tree [46]).

Fig. 1. (a) Solving the TSP, (b) Solving the assignment problem.

It is worth noting that problems with a few hundred vertices can now be routinely solved to optimality. Also, instances involving more than 2,000 vertices have been addressed. For example, the optimal solution to a symmetric problem with 2,392 vertices was identified after two hours and forty minutes of computation time on a powerful vector computer, the IBM 3090/600. [76,77] On the other hand, a classical problem with 532 vertices took five and a half hours on the same machine, indicating that the size of the problem is not the only factor determining computation time. We refer the interested reader to [64] for a complete description of the state of the art with respect to exact algorithms.

Heuristic Algorithms

Running an exact algorithm for hours on an expensive computer may not be very cost-effective if a solution within a few percent of the optimum can be found quickly on a microcomputer. Accordingly, heuristic or approximate algorithms are often preferred to exact algorithms for solving the large TSPs that occur in practice (e.g., drilling problems).
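The two families of methods discussed so far can be contrasted on a toy instance. The sketch below (a made-up random Euclidean instance; the nearest-neighbour construction is a simpler stand-in for the insertion heuristics discussed next) computes the exact optimum by exhaustive enumeration, then builds a tour heuristically and improves it with 2-opt exchanges; none of this is taken from the paper's experiments.

```python
import math
import random
from itertools import permutations

random.seed(42)
# Small random Euclidean instance (illustrative only).
N = 9
pts = [(random.random(), random.random()) for _ in range(N)]

def dist(a, b):
    return math.dist(pts[a], pts[b])

def tour_length(tour):
    return sum(dist(tour[k], tour[(k + 1) % N]) for k in range(N))

# Exact approach: exhaustive enumeration, O((N-1)!) tours; viable for tiny N only.
optimum = min(tour_length((0,) + p) for p in permutations(range(1, N)))

# Construction heuristic: nearest neighbour (repeatedly visit the closest
# unvisited city), a simple member of the tour construction family.
def nearest_neighbour(start=0):
    unvisited = set(range(N)) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Improvement heuristic: 2-opt, i.e. replace arcs (i,k),(j,l) by (i,j),(k,l)
# (a segment reversal) whenever this shortens the tour.
def two_opt(tour):
    improving = True
    while improving:
        improving = False
        for a in range(N - 1):
            for b in range(a + 2, N):
                if a == 0 and b == N - 1:
                    continue  # would remove and re-add the same two arcs
                i, k, j, l = tour[a], tour[a + 1], tour[b], tour[(b + 1) % N]
                if dist(i, j) + dist(k, l) < dist(i, k) + dist(j, l) - 1e-12:
                    tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                    improving = True
    return tour

constructed = nearest_neighbour()
final_tour = two_opt(constructed[:])
```

On instances this small the 2-opt tour is often optimal; what the sketch guarantees is only that improvement never makes the constructed tour worse, and that no heuristic tour can beat the enumerated optimum.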

Generally speaking, TSP heuristics can be classified as tour construction procedures, tour improvement procedures, and composite procedures, which are based on both construction and improvement techniques.

(a) Construction procedures. The best known procedures in this class gradually build a tour by selecting vertices in turn and inserting them one by one into the current tour. Various metrics are used for selecting the next vertex and for identifying the best place to insert it, like the proximity to the current tour and the minimum detour. [88]

(b) Improvement procedures. Among the local improvement procedures, the k-opt exchange heuristics are the most widely used, in particular, the 2-opt, 3-opt, and Lin-Kernighan heuristics. [67,68] These heuristics locally modify the current solution by replacing k arcs in the tour by k new arcs so as to generate a new improved tour. Figure 2 shows an example of a 2-opt exchange. Typically, the exchange heuristics are applied iteratively until a local optimum is found, namely a tour which cannot be improved further via the exchange heuristic under consideration. In order to overcome the limitations associated with local optimality, new heuristics like simulated annealing and tabu search are being used. [25,39,40,60] Basically, these new procedures allow local modifications that increase the length of the tour. By this means, the method can escape from local minima and explore a larger number of solutions. The neural network models discussed in this paper are often compared to the simulated annealing heuristic described in [60]. In this context, simulated annealing refers to an implementation based on the 2-opt exchanges of Lin [67], where an increase in the length of the tour has some probability of being accepted (see the description of simulated annealing in Section 3).

(c) Composite procedures.
Recently developed composite procedures, which use both construction and improvement techniques, are now among the most powerful heuristics for solving TSPs. Among the new generation of composite heuristics, the most successful ones are the CCAO heuristic, [41] the GENIUS heuristic, [38] and the iterated Lin-Kernighan heuristic. [53] For example, the iterated Lin-Kernighan heuristic can routinely find solutions within 1% of the optimum for problems with up to

10,000 vertices. [53] Heuristic solutions within 4% of the optimum for some 1,000,000-city ETSPs are reported in [12]. Here, the tour construction procedure is a simple greedy heuristic. At the start, each city is considered as a fragment, and multiple fragments are built in parallel by iteratively connecting the closest fragments together until a single tour is generated. The solution is then processed by a 3-opt exchange heuristic. A clever implementation of this procedure solved some 1,000,000-city problems in less than four hours on a VAX.

Fig. 2. Exchange of links (i,k),(j,l) for links (i,j),(k,l).

Artificial Neural Networks

Because of the simplicity of its formulation, the TSP has always been a fertile ground for new solution ideas. Consequently, it is not surprising that many problem-solving approaches inspired by artificial neural networks have been applied to the TSP. Currently, neural networks do not provide solution quality that compares with the classical heuristics of OR. However, the technology is quite young, and spectacular improvements have already been achieved since the first attempts in [51]. All of these efforts for solving a problem that has already been quite successfully addressed by operations researchers are motivated, in part, by the fact that artificial neural networks are powerful parallel devices. They are made up of a large number of simple elements that can process their inputs in parallel. Accordingly, they lend themselves naturally to implementations on parallel computers. Moreover, many neural

network models have already been directly implemented in hardware as "neural chips." Hence, the neural network technology could provide a means to solve optimization problems at a speed that has never been achieved before. It remains to be seen, however, whether the quality of the neural network solutions will ever compare to the solutions produced by the best heuristics in OR. Given the spectacular improvements in the neural network technology in the last few years, it would certainly be premature at this time to consider this line of research to be a "dead end."

In the sections that follow, we review the three basic neural network approaches to the TSP, namely, the Hopfield-Tank network, the elastic net, and the self-organizing map. Actually, the elastic nets and self-organizing maps appear to be the best approaches for solving the TSP. But the Hopfield-Tank model was the first to be applied to the TSP, and it has been the dominant neural approach for solving combinatorial optimization problems over the last decade. Even today, many researchers are still working on that model, trying to explain its failures and successes. Because of its importance, a large part of this paper is thus devoted to that model and its refinements over the years.

The paper is organized along the following lines. Sections 1 and 2 first describe the Hopfield-Tank model and its many variants. Sections 3 and 4 are then devoted to the elastic net and the self-organizing map, respectively. Finally, concluding remarks are made in Section 5. Each basic model is described in detail, and no deep understanding of neural network technology is assumed. However, previous exposure to an introductory paper on the subject could help to better understand the various models. [61] In each section, computation times and numerical comparisons with other OR heuristics are provided when they are available.
However, the OR specialist must understand that the computation time for simulating a neural network on a serial digital computer is not particularly meaningful, because such an implementation does not exploit the inherent parallelism of the model. For this reason, computation times are often missing in neural network research papers. A final remark concerns the class of TSPs addressed by neural network researchers. Although the Hopfield-Tank network has been applied to TSPs with randomly generated distance matrices, [106]

virtually all work concerns the ETSP. Accordingly, Euclidean distances should be assumed in the sections that follow, unless it is explicitly stated otherwise. The reader should also note that general surveys on the use of neural networks in combinatorial optimization may be found in [22, 70]. An introductory paper about the impacts of neurocomputing on operations research may be found in [29].

Section 1. The Hopfield-Tank Model

Before going further into the details of the Hopfield model, it is important to observe that the network or graph defining the TSP is very different from the neural network itself. As a consequence, the TSP must be mapped, in some way, onto the neural network structure. For example, Figure 3a shows a TSP defined over a transportation network. The artificial neural network encoding that problem is shown in Figure 3b. In the transportation network, the five vertices stand for cities and the links are labeled or weighted by the inter-city distances d_ij (e.g., d_NY,LA is the distance between New York and Los Angeles). A feasible solution to that problem is the tour Montreal-Boston-NY-LA-Toronto-Montreal, as shown by the bold arcs. In Figure 3b, the Hopfield network [50] is depicted as a 5x5 matrix of nodes or units that are used to encode solutions to the TSP. Each row corresponds to a particular city and each column to a particular position in the tour. The black nodes are the activated units that encode the current solution (namely, Montreal is in first position in the tour, Boston in second position, NY in third, etc.). Only a few connections between the units are shown in Figure 3b. In fact, there is a connection between each pair of units, and a weight is associated with each connection. The signal sent along a connection from unit i to unit j is equal to the weight T_ij if unit i is activated. It is equal to 0 otherwise. A negative weight thus defines an inhibitory connection between the two units.
In such a case, it is unlikely that both units will be active or "on" at the same time, because the first unit that turns on immediately sends an inhibitory signal to the other unit through that connection to prevent its activation. On the other hand, it is more likely for both units to be on at the same time if the connection has a positive weight. In such a

case, the first unit that turns on sends a positive excitatory signal to the other unit through that connection to facilitate its activation.

Fig. 3. Mapping a TSP onto the Hopfield network: (a) the TSP defined over a transportation network; (b) the neural network representation.
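The effect of an inhibitory connection can be illustrated with a toy network of just two binary units. All numerical values below (the weight and the thresholds) are made-up illustrative assumptions, not taken from the paper: the negative weight makes the configuration with both units "on" unstable, so whichever unit updates first switches the other off.

```python
# Toy two-unit network with a single inhibitory connection.
# Weight and threshold values are illustrative assumptions.
T12 = -1.0                 # negative weight: inhibitory connection
theta = [-0.5, -0.5]       # thresholds; an isolated unit prefers to be "on"

def settle(V, order=(0, 1, 0, 1)):
    """Update each unit in turn with the binary threshold rule:
    turn on if the incoming signal exceeds the threshold, off if below."""
    V = list(V)
    for i in order:
        signal = T12 * V[1 - i]      # signal received from the other unit
        if signal > theta[i]:
            V[i] = 1
        elif signal < theta[i]:
            V[i] = 0
    return V
```

Starting from [1, 1], the first unit to update receives the inhibitory signal -1.0, falls below its threshold, and turns off; the network then settles with exactly one active unit, as described above.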

In the TSP context, the weights are derived in part from the inter-city distances. They are chosen to penalize infeasible tours and, among the feasible tours, to favor the shorter ones. For example, T_LA5,NY5 in Figure 3b denotes the weight on the connection between the units that represent a visit to cities LA and NY, both in the fifth position on a tour. Consequently, that connection should be inhibitory (negative weight), because two cities cannot occupy the same position. The first unit to be activated will inhibit the other unit via that connection, so as to prevent an infeasible solution from occurring.

In Section 1.1, we first introduce the Hopfield model, which is a network composed of binary "on/off" or "0/1" units, like the artificial neural network shown in Figure 3b. We will then describe the Hopfield-Tank model, which is a natural extension of the discrete model to units with continuous activation levels. Finally, the application of the Hopfield-Tank network to the TSP will be described.

1.1 The Discrete Hopfield Model

The original Hopfield neural network model [50] is a fully interconnected network of binary units with symmetric connection weights between the units. The connection weights are not learned but are defined a priori from problem data (the inter-city distances in a TSP context). Starting from some arbitrarily chosen initial configuration, either feasible or infeasible, the Hopfield network evolves by updating the activation of each unit in turn (i.e., an activated unit can be turned off, and an unactivated unit can be turned on). The update rule of any given unit involves the activation of the units it is connected to as well as the weights on the connections. Via this update process, various configurations are explored until the network settles into a stable configuration. In this final state, all units are stable according to the update rule and do not change their activation status.
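This relaxation process can be sketched in a few lines. The network below uses random symmetric weights and thresholds (illustrative values, not a TSP encoding) and applies the threshold update rule, formalized next, to one unit at a time; the energy E = -1/2 Σ_i Σ_j T_ij V_i V_j + Σ_i θ_i V_i never increases along the way.

```python
import random

random.seed(1)
L = 12                                   # number of binary units (illustrative)
# Random symmetric weights with zero diagonal, and random thresholds.
T = [[0.0] * L for _ in range(L)]
for i in range(L):
    for j in range(i + 1, L):
        T[i][j] = T[j][i] = random.uniform(-1.0, 1.0)
theta = [random.uniform(-0.5, 0.5) for _ in range(L)]

def energy(V):
    """E = -1/2 sum_ij T_ij V_i V_j + sum_i theta_i V_i."""
    quad = sum(T[i][j] * V[i] * V[j] for i in range(L) for j in range(L))
    return -0.5 * quad + sum(theta[i] * V[i] for i in range(L))

def update(V, i):
    """Threshold update rule for unit i (a tie leaves the unit unchanged)."""
    s = sum(T[i][j] * V[j] for j in range(L))
    if s > theta[i]:
        V[i] = 1
    elif s < theta[i]:
        V[i] = 0

V = [random.randint(0, 1) for _ in range(L)]
trace = [energy(V)]
for _ in range(300):                     # random asynchronous updates
    update(V, random.randrange(L))
    trace.append(energy(V))
```

Because each accepted flip strictly lowers the energy and the number of configurations is finite, the trace is monotone and the network eventually stops changing.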
The dynamics of the Hopfield network can be described formally in mathematical terms. To this end, the activation levels of the binary units are set to zero and one for "off" and "on," respectively. Starting from some initial configuration {V_i} i=1,...,L, where L is the number of units and V_i is the activation level of unit i, the network relaxes to a stable configuration according to the following update rule

set V_i to 1 if Σ_j T_ij V_j > θ_i
set V_i to 0 if Σ_j T_ij V_j < θ_i
do not change V_i if Σ_j T_ij V_j = θ_i,

where T_ij is the connection weight between units i and j, and θ_i is the threshold of unit i. The units are updated at random, one unit at a time. Since the configurations of the network are L-dimensional, the update of one unit from zero to one or from one to zero moves the configuration of the network from one corner to another of the L-dimensional unit hypercube.

The behavior of the network can be characterized by an appropriate energy function. The energy E depends only on the activation levels V_i (the weights T_ij and the thresholds θ_i are fixed and derived from problem data), and is such that it can only decrease as the network evolves over time. This energy is given by

E = -1/2 Σ_i Σ_j T_ij V_i V_j + Σ_i θ_i V_i.  (1.1)

Since the connection weights T_ij are symmetric, each term T_ij V_i V_j appears twice within the double summation of (1.1). Hence, this double summation is divided by 2. It is easy to show that a unit changes its activation level if and only if the energy of the network decreases by doing so. In order to prove that statement, we must consider the contribution E_i of a given unit i to the overall energy E, that is,

E_i = -Σ_j T_ij V_i V_j + θ_i V_i.

Consequently,

if V_i = 1 then E_i = -Σ_j T_ij V_j + θ_i,
if V_i = 0 then E_i = 0.

Hence, the change in energy due to a change ΔV_i in the activation level of unit i is

ΔE_i = -ΔV_i (Σ_j T_ij V_j - θ_i).

Now, ΔV_i is one if unit i changed its activation level from zero to one, and such a change can only occur if the expression between the parentheses is positive. As a consequence, ΔE_i is negative and the energy decreases. This same line of reasoning can be applied when a unit i changes its activation level from one to zero (i.e., ΔV_i = -1). Since the energy can only decrease over time and the number of configurations is finite, the network must necessarily converge to a stable state (but not necessarily the minimum energy state). In the next section, a natural extension of this model to units with continuous activation levels is described.

1.2 The Continuous Hopfield-Tank Model

In [51], Hopfield and Tank extended the original model to a fully interconnected network of nonlinear analog units, where the activation level of each unit is a value in the interval [0,1]. Hence, the space of possible configurations {V_i} i=1,...,L is now continuous rather than discrete, and is bounded by the L-dimensional hypercube defined by V_i = 0 or 1. Obviously, the final configuration of the network can be decoded into a solution of the optimization problem if it is close to a corner of the hypercube (i.e., if the activation value of each unit is close to zero or one). The main motivation of Hopfield and Tank for extending the discrete network to a continuous one was to provide a model that could be easily implemented using simple analog hardware. However, it seems that continuous dynamics also facilitate convergence. [47] The evolution of the units over time is now characterized by the following differential equations (usually called "equations of motion")

dU_i/dt = Σ_j T_ij V_j + I_i - U_i,  i=1,...,L  (1.2)

where U_i, I_i and V_i are the input, input bias, and activation level of unit i, respectively. The activation level of unit i is a function of its input, namely

V_i = g(U_i) = 1/2 (1 + tanh(U_i/U_0)) = 1/(1 + e^(-2U_i/U_0)).  (1.3)

The activation function g is the well-known sigmoidal function, which always returns a value between 0 and 1. The parameter U_0 is

used to modify the slope of the function. In Figure 4, for example, the U_0 value is lower for curve (2) than for curve (1).

Fig. 4. The sigmoidal activation function.

The energy function for the continuous Hopfield-Tank model is now

E = -1/2 Σ_i Σ_j T_ij V_i V_j - Σ_i V_i I_i + Σ_i ∫_0^V_i g^-1(x) dx.  (1.4)

Note in particular that dU_i/dt = -∂E/∂V_i. Accordingly, when the units obey the dynamics of the equations of motion, the network is performing a gradient descent in the network's configuration space with respect to that energy function, and stabilizes at a local minimum. At that point, dU_i/dt = 0 and the input to any given unit i is the weighted sum of the activation levels of all the other units plus the bias, that is

U_i = Σ_j T_ij V_j + I_i.
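The continuous dynamics can be simulated with a simple discrete-time scheme (the same approximation used in the next section). The weights, biases, slope parameter, and step size below are illustrative assumptions; the point of the sketch is that, at convergence, each input approximately satisfies the fixed-point condition U_i = Σ_j T_ij V_j + I_i stated above.

```python
import math
import random

random.seed(2)
L = 8                     # number of units (illustrative)
U0 = 0.05                 # sigmoid slope parameter (illustrative)
dt = 0.001                # discrete time step (illustrative)

# Random symmetric weights with zero diagonal, and small input biases.
T = [[0.0] * L for _ in range(L)]
for i in range(L):
    for j in range(i + 1, L):
        T[i][j] = T[j][i] = random.uniform(-1.0, 1.0)
I = [random.uniform(-0.1, 0.1) for _ in range(L)]

def g(u):
    """Sigmoidal activation (1.3): g(u) = 1/2 (1 + tanh(u/U0))."""
    return 0.5 * (1.0 + math.tanh(u / U0))

U = [random.uniform(-0.01, 0.01) for _ in range(L)]
for _ in range(30000):
    V = [g(u) for u in U]
    # Euler step: U_i(t+dt) = U_i(t) + dt (sum_j T_ij V_j + I_i - U_i)
    U = [U[i] + dt * (sum(T[i][j] * V[j] for j in range(L)) + I[i] - U[i])
         for i in range(L)]
V = [g(u) for u in U]
```

With the steep sigmoid chosen here, most activation levels end up close to zero or one, i.e., near a corner of the hypercube, which is the regime in which a configuration can be decoded into a solution.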

1.3 Simulation of the Hopfield-Tank Model

In order to simulate the behavior of the continuous Hopfield-Tank model, a discrete time approximation is applied to the equations of motion

{U_i(t+Δt) - U_i(t)} / Δt = Σ_j T_ij V_j(t) + I_i - U_i(t),

where Δt is a small time interval. This formula can be rewritten as

U_i(t+Δt) = U_i(t) + Δt (Σ_j T_ij V_j(t) + I_i - U_i(t)).

Starting with some initial values {U_i(0)} i=1,...,L at time t=0, the system evolves according to these equations until a stable state is reached. During the simulation, Δt is usually set to a small value such as 10^-5. Smaller values provide a better approximation of the analog system, but more iterations are then required to converge to a stable state. In the literature, the simulations have mostly been performed on standard sequential machines. However, implementations on parallel machines are discussed in [10, 93]. The authors report that it is possible to achieve almost linear speed-up with the number of processors. For example, a Hopfield-Tank network for a 100-city TSP took almost three hours to converge to a solution on a single processor of the Sequent Balance 8000 computer. [10] The computation time was reduced to about 20 minutes using eight processors.

1.4 Application of the Hopfield-Tank Model to the TSP

In the previous sections, we have shown that the Hopfield-Tank model performs a descent towards a local minimum of the energy function E. The "art" of applying that model to the TSP is to appropriately define the connection weights T_ij and the bias I_i so that the local minima of E will correspond to good TSP solutions. In order to map a combinatorial optimization problem like the TSP onto the Hopfield-Tank model, the following steps are suggested in [83, 84]:

(1) Choose a representation scheme which allows the activation levels of the units to be decoded into a solution of the problem.

(2) Design an energy function whose minimum corresponds to the best solution of the problem.

(3) Derive the connectivity of the network from the energy function.

(4) Set up the initial activation levels of the units.

These ideas can easily be applied to the design of a Hopfield-Tank network in a TSP context:

(1) First, a suitable representation of the problem must be chosen. In [51], the TSP is represented as an NxN matrix of units, where each row corresponds to a particular city and each column to a particular position in the tour (see Figure 3). If the activation level of a given unit V_Xi is close to 1, it is then assumed that city X is visited at the ith position in the tour. In this way, the final configuration of the network can be interpreted as a solution to the TSP. Note that N^2 units are needed to encode a solution for a TSP with N cities.

(2) Second, the energy function must be defined. The following function is used in [51]

E = A/2 Σ_X Σ_i Σ_{j≠i} V_Xi V_Xj
  + B/2 Σ_i Σ_X Σ_{Y≠X} V_Xi V_Yi
  + C/2 (Σ_X Σ_i V_Xi - N)^2
  + D/2 Σ_X Σ_{Y≠X} Σ_i d_XY V_Xi (V_Y,i+1 + V_Y,i-1),  (1.5)

where the A, B, C, and D parameters are used to weight the various components of the energy. The first three terms penalize solutions that do not correspond to feasible tours. Namely, there must be exactly one activated unit in each row and column of the matrix. The first and second terms, respectively, penalize the rows and columns with more than one activated unit, and the third term requires a total of N activated units (so as to avoid the trivial solution V_Xi = 0 for all X,i). The fourth term ensures that the energy function will favor short tours over longer ones. This term adds the distance d_XY to the energy value when cities X and Y are in consecutive positions in the tour (note that subscripts are taken modulo N, so that V_X,N+1 is the same as V_X,1).
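The behavior of this energy function can be checked numerically. The sketch below (using an arbitrary 5-city distance matrix and unit penalty weights, both illustrative assumptions rather than the values of [51]) builds the 0/1 configuration of a valid tour and confirms that the three penalty terms vanish, leaving E = D × (tour length): the data term counts every arc of the tour twice and is divided by 2.

```python
N = 5
A = B = C = D = 1.0   # penalty weights (illustrative; [51] used much larger values)
# Arbitrary symmetric distance matrix (illustrative).
d = [[0, 3, 4, 2, 7],
     [3, 0, 4, 6, 3],
     [4, 4, 0, 5, 8],
     [2, 6, 5, 0, 6],
     [7, 3, 8, 6, 0]]

def energy(V):
    """TSP energy (1.5); V[X][i] is the activation of unit (city X, position i)."""
    rows = sum(V[X][i] * V[X][j] for X in range(N)
               for i in range(N) for j in range(N) if j != i)
    cols = sum(V[X][i] * V[Y][i] for i in range(N)
               for X in range(N) for Y in range(N) if Y != X)
    count = (sum(V[X][i] for X in range(N) for i in range(N)) - N) ** 2
    data = sum(d[X][Y] * V[X][i] * (V[Y][(i + 1) % N] + V[Y][(i - 1) % N])
               for X in range(N) for Y in range(N) if Y != X for i in range(N))
    return A / 2 * rows + B / 2 * cols + C / 2 * count + D / 2 * data

def config(tour):
    """Permutation-matrix configuration: city tour[i] at position i."""
    V = [[0.0] * N for _ in range(N)]
    for i, X in enumerate(tour):
        V[X][i] = 1.0
    return V

def tour_length(tour):
    return sum(d[tour[i]][tour[(i + 1) % N]] for i in range(N))
```

Conversely, the all-zero configuration has zero row, column, and data penalties; only the C term fires, which is exactly the pathology that the global-inhibitor/bias combination described next must counteract.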

(3) Third, the bias and connection weights are derived. To do so, the energy function of Hopfield and Tank (1.5) is compared to the generic energy function (1.6), which is a slightly modified version of (1.4): each unit now has two subscripts (city and position), and the last term is removed (since it does not play any role here)

E = -1/2 Σ_Xi Σ_Yj T_Xi,Yj V_Xi V_Yj - Σ_Xi V_Xi I_Xi.  (1.6)

Consequently, the weights T_Xi,Yj on the connections of the Hopfield-Tank network are identified by looking at the quadratic terms in the TSP energy function, while the bias I_Xi is derived from the linear terms. Hence,

T_Xi,Yj = -A δ_XY (1 - δ_ij) - B δ_ij (1 - δ_XY) - C - D d_XY (δ_j,i+1 + δ_j,i-1),
I_Xi = +C N_e,

where δ_ij = 1 if i=j and 0 otherwise. The first and second terms in the definition of the connection weights stand for inhibitory connections within each row and each column, respectively. Hence, a unit whose activation level is close to 1 tends to inhibit the other units in the same row and column. The third term is a global inhibitor term. The combined action of this term and the input bias I_Xi, which are both derived from the C term in the energy function (1.5), favors solutions with a total of N activated units. Finally, the fourth term is called the "data term" and prevents solutions with adjacent cities that are far apart (namely, the inhibition is stronger between two units when they represent two cities X, Y in consecutive positions in the tour with a large inter-city distance d_XY).

In the experiments of Hopfield and Tank, the parameter N_e in the definition of the bias I_Xi = C N_e does not always correspond exactly to the number of cities N. This parameter is used by Hopfield and Tank to adjust the level of the positive bias signal with respect to the negative signals coming through the other connections, and it is usually slightly larger than N. Note

finally that there are O(N^4) connections between the N^2 units for a TSP with N cities.

(4) The last step is to set the initial activation value of each unit to 1/N, plus or minus a small random perturbation (in this way, the sum of the initial activations is approximately equal to N).

With this model, Hopfield and Tank were able to solve a randomly generated 10-city ETSP, with the following parameter values: A=B=500, C=200, D=500, N_e=15. They reported that for 20 distinct trials, using different starting configurations, the network converged 16 times to feasible tours. Half of those tours were one of the two optimal tours. On the other hand, the network was much less reliable on a randomly generated 30-city ETSP (900 units). Apart from frequent convergence to infeasible solutions, the network commonly found feasible tours with a length over 7.0, as compared to a tour of length 4.26 generated by the Lin-Kernighan exchange heuristic. [68]

Three years later, it was claimed in [105] that the results of Hopfield and Tank were quite difficult to reproduce. For the 10-city ETSP of Hopfield and Tank, using the same parameter settings, the authors report that on 100 different trials, the network converged to feasible solutions only 15 times. Moreover, the feasible tours were only slightly better than randomly generated tours. Other experiments by the same authors, on various randomly generated 10-city ETSPs, produced the same kind of results.

The main weaknesses of the original Hopfield-Tank model, as pointed out in [105], are the following.

(a) Solving a TSP with N cities requires O(N^2) units and O(N^4) connections.

(b) The optimization problem is not solved in a problem space of O(N!), but in a space of O(2^(N^2)) where many configurations correspond to infeasible solutions.

(c) Each valid tour is represented 2N times in the Hopfield-Tank model, because any one of the N cities can be chosen as the starting city, and the two orientations of the tour are equivalent for a symmetric problem. This phenomenon is referred to as "2N-degeneracy" in neural network terminology.

(d) The model performs a gradient descent of the energy function in the configuration space, and is thus plagued with the limitations of "hill-climbing" approaches, where a local optimum is found. As a consequence, the performance of the model is very sensitive to the initial starting configuration.

(e) The model does not guarantee feasibility. In other words, many local minima of the energy function correspond to infeasible solutions. This is related to the fact that the constraints of the problem, namely that each city must be visited exactly once, are not strictly enforced but rather introduced into the energy function as penalty terms.

(f) Setting the values of the parameters A, B, C, and D is much more an art than a science and requires a long "trial-and-error" process. Setting the penalty parameters A, B, and C to small values usually leads to short but infeasible tours. Alternatively, setting the penalty parameters to large values forces the network to converge to any feasible solution regardless of the total length. Moreover, it seems to be increasingly difficult to find "good" parameter settings as the number of cities increases.

(g) Many infeasible tours produced by the network visit only a subset of cities. This is due to the fact that the third term in the energy function (C term) is the only one to penalize such a situation. The first two terms (A and B terms), as well as the fourth term (D term), benefit from such a situation.

(h) It usually takes a large number of iterations (in the thousands) before the network converges to a solution. Moreover, the network can "freeze" far from a corner of the hypercube in the configuration space, where it is not possible to interpret the configuration as a TSP solution. This phenomenon can be explained by the shape of the sigmoidal activation function, which is very flat for large positive and large negative U_i's (see Figure 4).
Consequently, if the activation level V_i of a given unit i is close to zero or one, even large modifications to U_i will produce only slight modifications to the activation level. If a large number of units are in this situation, the network will evolve very slowly, a phenomenon referred to as "network paralysis." Paralysis far from a corner of the hypercube can occur if the slope of the activation function is not very steep. In that case, the flat regions of the sigmoidal function extend further and

affect a larger number of units (even those with activation levels far from zero and one).

(i) The network is not adaptive, because the weights of the network are fixed and derived from problem data, rather than learned from it.

The positive points are that the model can be easily implemented in hardware, using simple analog devices, and that it can also be applied to non-Euclidean TSPs, in particular, problems that do not satisfy the triangle inequality and cannot be interpreted geometrically. [106] This is an advantage over the geometric approaches that are presented in Sections 3 and 4.

Section 2. Variants of the Hopfield-Tank Model

Surprisingly enough, the results of Wilson and Pawley [105] did not discourage the community of researchers, but rather stimulated the search for ways to improve the original Hopfield-Tank model. There were also numerous papers providing in-depth analysis of the model to explain its failures and propose various improvements to the method. [4,5,6,7,17,24,56,86,87,90,108] The modifications to the original model can be classified into six distinct categories: modifications to the energy function, techniques for estimating "good" parameter settings, addition of hard constraints to the model, incorporation of techniques to escape from local minima, new problem representations, and modifications to the starting configurations. We now describe each category, and emphasize the most important contributions.

2.1 Modifications to the Energy Function

The first attempts were aimed at modifying the energy function to improve the performance of the Hopfield-Tank model. Those studies, which add or modify terms to push the model towards feasible solutions, are mostly empirical.

(a) In [18, 75, 78, 95], the authors suggest replacing either the third term (C term) or the first three terms (A, B, and C terms) of the original energy function (1.5) by

F/2 Σ_X (Σ_i V_Xi - 1)^2 + G/2 Σ_i (Σ_X V_Xi - 1)^2.

This modification helps the model converge towards feasible tours, because it heavily penalizes configurations that do not have exactly one active unit in each row and each column. In particular, it prevents many solutions that do not visit all cities. Note that a formulation in which the A, B, and C terms are replaced by the two new terms can be implemented with only O(N^3) connections. In [14], the author proposes an alternative approach to the same problem, adding an excitatory bias to each unit so as to obtain a larger number of activated units.

(b) In [18], the authors suggest adding a new penalty term to the energy function (1.5). This term drives the search away from the center of the hypercube (and thus towards its corners) so as to alleviate the network-paralysis problem. The additional term is

(F/2) Σ_Xi V_Xi (1 - V_Xi) = (F/2) (N^2/4 - Σ_Xi (V_Xi - 1/2)^2).

In the same paper, they also propose a formulation in which the inter-city distances appear only in the linear components of the energy function. Hence, the distances are provided to the network via the input biases rather than encoded into the connection weights. This is a great advantage for a hardware implementation, such as a neural chip, because the connection weights do not change from one TSP instance to another and can be fixed into the hardware at fabrication time. However, a new representation of the problem with O(N^3) units is then required.

Brandt and his colleagues [18] report that the Hopfield-Tank model with the two modifications suggested in (a) and (b) consistently converged to feasible tours for randomly generated 10-city ETSPs. Moreover, the average tour length was now much better than the length of randomly generated tours. Table 1 shows these results for problems with 10, 16, and 32 cities. Note that the heading "Manual" in the table refers to tours constructed by hand.
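The two penalty terms above can be sketched in a few lines of Python for an N x N activation matrix V stored as nested lists. This is an illustration only; the function names are mine, not from [18]. The first function is zero exactly when every row and column sums to one, and the second vanishes only at the 0/1 corners of the hypercube:

```python
def penalty_terms(V, F=1.0, G=1.0):
    # (F/2) sum_X (sum_i V_Xi - 1)^2 + (G/2) sum_i (sum_X V_Xi - 1)^2:
    # zero exactly when every row and every column of V sums to one
    n = len(V)
    rows = sum((sum(V[X]) - 1.0) ** 2 for X in range(n))
    cols = sum((sum(V[X][i] for X in range(n)) - 1.0) ** 2 for i in range(n))
    return F / 2.0 * rows + G / 2.0 * cols

def interior_term(V, F=1.0):
    # (F/2) sum_Xi V_Xi (1 - V_Xi): vanishes only at 0/1 corners of the
    # hypercube, pushing the search away from its center
    return F / 2.0 * sum(v * (1.0 - v) for row in V for v in row)

# a permutation matrix (one active unit per row and column) incurs no penalty
perm = [[1.0, 0.0], [0.0, 1.0]]
print(penalty_terms(perm), interior_term(perm))   # 0.0 0.0
```

Note that the center of the hypercube (all V_Xi = 1/2) satisfies the row and column constraints yet maximizes the interior term, which is why the term in (b) is needed in addition to the terms in (a).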

Table 1. Comparison of Results for Three Solution Procedures. (Columns: number of cities, number of problems, and average tour length for the Brandt, Manual, and Random procedures.)

2.2 Finding Good Settings for the Parameter Values

In [44, 45], the authors experimentally demonstrate that various relationships among the parameters of the energy function must be satisfied for the Hopfield-Tank model to converge to feasible tours. Their work also indicates that the region of good settings in the parameter space quickly becomes very narrow as the number of cities grows. This study supports previous observations about the difficulty of tuning the Hopfield-Tank energy function for problems with a large number of cities. Cuykendall and Reese [28] also provide ways of estimating parameter values from problem data as the number of cities increases. In [4, 5, 6, 7, 27, 57, 58, 74, 86, 87], theoretical relationships among the parameters are investigated in order for feasible tours to be stable. The work described in these papers is mostly based on a close analysis of the eigenvalues and eigenvectors of the connection matrix. Wang and Tsai [102] propose gradually reducing the value of some parameters over time. However, time-varying parameters preclude a simple hardware implementation. Lai and Coghill [63] propose using the genetic algorithm, as described in [49], to find good parameter values for the Hopfield-Tank model. Along this line of research, the most impressive practical results are reported in [28]. The authors generate feasible solutions for a 165-city ETSP by appropriately setting the bias of each unit and the U_0 parameter in the sigmoidal activation function (1.3). Depending on the parameter settings, it took between one and ten hours of computation time on an Apollo DN4000 to converge to a solution. The computation times ranged from 10 to 30 minutes for

another 70-city ETSP. Unfortunately, no comparisons are provided with other problem-solving heuristics.

2.3 Addition of Constraints to the Model

The approaches that we now describe add new constraints to the Hopfield-Tank model so as to restrict the configuration space to feasible tours.

(a) In [81, 97, 98], the activation levels of the units are normalized so that Σ_i V_Xi = 1 for all cities X. The introduction of these additional constraints is only one aspect of the problem-solving methodology, which is closely related to the simulated annealing heuristic. Accordingly, the full discussion is deferred to Section 2.4, where simulated annealing is introduced.

(b) Other approaches are more aggressive and explicitly restrict the configuration space to feasible tours. In [96], the authors calculate the changes required in the remaining units to maintain a feasible solution when a given unit is updated. The energy function is then evaluated on the basis of the change to the updated unit and all the logically implied changes to the other units. This approach converges consistently to feasible solutions on 30-city ETSPs; the resulting tours are only 5% longer on average than those generated by the simulated annealing heuristic. In [69], the author updates the configurations using Lin and Kernighan's exchange heuristic. [68] Foo and Szu [35] use a "divide-and-conquer" approach: they partition the set of cities into subsets and apply the Hopfield-Tank model to each subset. The subtours are then merged into a single larger tour with a simple heuristic. Although their approach is not conclusive, the integration of classical OR heuristics and artificial intelligence within a neural network framework could provide interesting research avenues for the future.

2.4 Incorporation of Techniques to Escape from Local Minima

The Hopfield-Tank model converges to a local minimum and is thus highly sensitive to the starting configuration.
Hence, various modifications have been proposed in the literature to alleviate this problem.

(a) In [1, 2, 3], a Boltzmann machine [48] is designed to solve the TSP. Basically, a Boltzmann machine incorporates the simulated annealing heuristic [25, 60] within a discrete Hopfield network, so as to allow the network to escape from bad local minima. The simulated annealing heuristic performs a stochastic search in the space of configurations of a discrete system, such as a Hopfield network with binary units. As opposed to classical hill-climbing approaches, simulated annealing allows modifications to the current configuration that increase the value of the objective or energy function (for a minimization problem). More precisely, a modification that reduces the energy of the system is always accepted, while a modification that increases the energy by ΔE is accepted with Boltzmann probability e^(-ΔE/T), where T is the temperature parameter. At a high temperature, the probability of accepting an increase in energy is high; this probability decreases as the temperature is reduced. The simulated annealing heuristic is typically initiated at a high temperature, where most modifications are accepted, so as to perform a coarse search of the configuration space. The temperature is then gradually reduced to focus the search on a specific region of the configuration space. At each temperature T, the configurations are modified according to the Boltzmann update rule until the system reaches an "equilibrium." At that point, the configurations follow a Boltzmann distribution, where the probability of the system being in configuration s' at temperature T is

P_T(s') = e^(-E(s')/T) / Σ_s e^(-E(s)/T).    (2.1)

Here, E(s') is the energy of configuration s', and the denominator is a summation over all configurations. According to this distribution, configurations of high energy are very likely to be observed at high temperatures and much less likely at low temperatures; the reverse is true for low-energy configurations.
Hence, by gradually reducing the temperature parameter T and by allowing the system to reach equilibrium at each temperature, the system is expected to ultimately settle down at a configuration of low energy.
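The cooling scheme just described can be sketched as a generic simulated-annealing skeleton. This is an illustration, not code from any of the cited papers; the toy energy function, the geometric cooling schedule, and all parameter values (T0, alpha, steps_per_T, T_min) are my assumptions:

```python
import math
import random

def metropolis_accept(delta_e, T, rng=random):
    # accept always if the energy decreases; otherwise with the
    # Boltzmann probability exp(-delta_e / T)
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e / T)

def anneal(energy, neighbor, state, T0=10.0, alpha=0.95, steps_per_T=100, T_min=1e-3):
    # generic simulated-annealing skeleton: geometric cooling from T0,
    # Metropolis acceptance at each temperature level
    T, e = T0, energy(state)
    while T > T_min:
        for _ in range(steps_per_T):
            cand = neighbor(state)
            delta = energy(cand) - e
            if metropolis_accept(delta, T):
                state, e = cand, e + delta
        T *= alpha                      # gradual temperature reduction
    return state, e

# toy example: minimize (x - 3)^2 over the integers with +/-1 moves
random.seed(0)
s, e = anneal(lambda x: (x - 3) ** 2, lambda x: x + random.choice((-1, 1)), 50)
print(s, e)   # settles at x = 3, energy 0
```

At high T nearly every move is accepted (coarse search); near T_min only energy-decreasing moves survive, so the final state sits in a low-energy configuration.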

This simulated annealing heuristic has been incorporated into the discrete Hopfield model to produce the so-called Boltzmann machine. Here, the binary units obey a stochastic update rule rather than a deterministic one. At each iteration, a unit is randomly selected and the effect on the energy of flipping its activation level (from zero to one, or from one to zero) is evaluated. The probability of accepting the modification is then

1 / (1 + e^(ΔE/T)),

where ΔE is the change in energy. This update probability is slightly different from the one used in simulated annealing. In particular, the probability of accepting a modification that decreases the energy of the network (i.e., ΔE < 0) is not one here, but rather a value between 0.5 and one. However, this new update probability has the same convergence properties as the one used in simulated annealing and, in that sense, the two expressions are equivalent. Aarts and Korst [1, 2, 3] design a Boltzmann machine for solving the TSP based on these ideas. Unfortunately, their approach suffers from very slow convergence and, as a consequence, only 30-city TSPs have been solved with this model.

(b) In [43], the Boltzmann machine is generalized to units with continuous activation levels. A truncated exponential distribution is used to compute the activation level of each unit. As with the discrete Boltzmann machine, the model suffers from slow convergence, and only small 10-city ETSPs have been solved.

(c) The research described in [81, 97, 98], which is derived from mean-field theory, is probably the most important contribution to the literature on the Hopfield-Tank model since its original description in [51]. The term "mean-field" refers to the fact that the model computes the mean activation levels of the stochastic binary units of a Boltzmann machine. This section focuses on the model of Van den Bout and Miller, [97, 98] but the model of Peterson and Soderberg [81] is similar.
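The two acceptance rules compared in point (a) can be checked numerically. The sketch below (function names mine) evaluates both rules at a few energy changes; it also illustrates why they are "equivalent": both yield the same acceptance ratio e^(-ΔE/T) between a move and its reverse, so both leave the Boltzmann distribution invariant:

```python
import math

def metropolis(delta_e, T):
    # simulated-annealing acceptance probability: min(1, exp(-delta_e / T))
    return min(1.0, math.exp(-delta_e / T))

def boltzmann(delta_e, T):
    # Boltzmann-machine acceptance probability: 1 / (1 + exp(delta_e / T))
    return 1.0 / (1.0 + math.exp(delta_e / T))

# an energy-decreasing move (delta_e < 0) is certain under the Metropolis
# rule but only accepted with probability between 0.5 and 1 under the
# Boltzmann rule
T = 1.0
for dE in (-2.0, 0.0, 2.0):
    print(dE, round(metropolis(dE, T), 4), round(boltzmann(dE, T), 4))
```

For both rules the ratio p(accept ΔE) / p(accept -ΔE) equals e^(-ΔE/T), which is the detailed-balance condition behind their shared convergence properties.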
We first introduce the iterative algorithm for updating the

configurations of the network. Then, we explain the relationships between this model and the Boltzmann machine. The neural network model introduced in [97] is characterized by a new, simplified energy function

E = (d_max/2) Σ_i Σ_X Σ_{Y≠X} V_Xi V_Yi + Σ_X Σ_{Y≠X} Σ_i d_XY V_Xi (V_{Y,i+1} + V_{Y,i-1}).    (2.2)

The first summation penalizes solutions with multiple cities at the same position, while the second summation computes the tour length. Note that the penalty is weighted by the parameter d_max. Starting from some arbitrary initial configuration, the model evolves to a stable configuration that minimizes (2.2) via the following iterative algorithm:

1. Set the temperature T.
2. Select a city X at random.
3. Compute U_Xi = -d_max Σ_{Y≠X} V_Yi - Σ_{Y≠X} d_XY (V_{Y,i+1} + V_{Y,i-1}), for i = 1,...,N.
4. Compute V_Xi = e^(U_Xi/T) / Σ_j e^(U_Xj/T), for i = 1,...,N.
5. Evaluate the energy E.
6. Repeat Steps 2 to 5 until the energy no longer decreases (i.e., a stable configuration has been reached).

Note that the activation levels always satisfy the constraints Σ_i V_Xi = 1 for all cities X. Accordingly, each value V_Xi can be interpreted as the probability that city X occupies position i. When a stable configuration is reached, the activation levels V_Xi satisfy the following system of equations (called the "mean-field equations")

V_Xi = e^(U_Xi/T) / Σ_j e^(U_Xj/T),    (2.3)

where U_Xi = -dE/dV_Xi (see Step 3 of the algorithm). To understand the origin of the mean-field equations, we must go back to the evolution of a discrete Hopfield network with binary units, when those units are governed by a stochastic update rule such as the Boltzmann rule (see the description of the simulated annealing heuristic and the Boltzmann machine in point (a)). It is known that the configurations of that network follow a Boltzmann distribution at equilibrium (i.e., after a large number of updates). Since the network is stochastic, it is not possible to know what the exact configuration will be at a given time. On the other hand, the average or mean activation value of each binary unit at Boltzmann equilibrium at a given temperature T is a deterministic value, which can be computed as follows:

<V_Xi> = Σ_s P_T(s) V_Xi(s) = Σ_{s_Xi} P_T(s_Xi).

In this equation, the summations are restricted to the configurations satisfying Σ_j V_Xj = 1 for all cities X (so as to comply with the model of Van den Bout and Miller), P_T(s) is the Boltzmann probability of configuration s at temperature T, V_Xi(s) is the activation level of unit Xi in configuration s, and s_Xi denotes the configurations where V_Xi = 1. Hence, we have

<V_Xi> = Σ_{s_Xi} e^(-E(s_Xi)/T) / Σ_j Σ_{s_Xj} e^(-E(s_Xj)/T).

In this formula, the double summation in the denominator is equivalent to a single summation over all configurations s, because each configuration contains exactly one activated unit in {Xj} j=1,...,N (Σ_j V_Xj = 1 and each V_Xj is either zero or one). Now we can apply the so-called "mean-field approximation" to <V_Xi>. Rather than summing over all configurations, we assume that the activation levels of all units that interact with a given unit Xj are fixed at their mean values. For example, rather than summing up

over all configurations s_Xi in the numerator (configurations where V_Xi = 1), we fix the activation levels of all the other units at their mean values. In this way, the summation can be removed. By applying this idea to both the numerator and the denominator, and by observing that -U_Xi is the contribution of unit Xi to the energy (2.2) when V_Xi = 1, the expression can be simplified to

<V_Xi> = e^(<U_Xi>/T) / Σ_j e^(<U_Xj>/T),

where

<U_Xi> = -d_max Σ_{Y≠X} <V_Yi> - Σ_{Y≠X} d_XY (<V_{Y,i+1}> + <V_{Y,i-1}>).

These equations are the same as the equations (2.3) of Van den Bout and Miller. Hence, the V_Xi values computed via their iterative algorithm can be interpreted as the mean activation levels of the corresponding stochastic binary units at Boltzmann equilibrium (at a given temperature T). At low temperatures, the low-energy configurations have high Boltzmann probability and dominate in the computation of the mean values <V_Xi>. Hence, the stable configuration computed by the algorithm of Van den Bout and Miller, which is composed of those mean values <V_Xi>, is expected to be of low energy for a sufficiently small parameter value T. As noted in [98], all the activation levels are the same at high temperatures, that is, V_Xi → 1/N as T → ∞. As the temperature parameter is lowered, each city gradually settles into a single position, because such configurations correspond to low-energy states. In addition, the model prevents two cities from occupying the same position, because a penalty of d_max/2 is then incurred in the energy function. If the parameter d_max is set to a value slightly larger than twice the largest distance between any two cities, the network can always find a configuration of lower energy simply by moving one of the two cities into an empty position.
Feasible tours are thus guaranteed through the combined actions of the new energy function and the additional constraints imposed on the activation levels V_Xi (once again, for a sufficiently small parameter value T). It is clear that the key problem is to identify a "good" value for the parameter T. By gradually decreasing the temperature, Van den

Bout and Miller identified a critical value T_c where all the energy minimization takes place. Above the critical value T_c, the units rarely converge to zero or one, and feasible tours do not emerge. Below T_c, all the tours generated are feasible, and the best tours emerge when T is close to T_c. Obviously, the critical temperature is highly dependent on the particular TSP to be solved. In [97, 98], Van den Bout and Miller describe a methodology for estimating that value from the inter-city distances. Using various T and d_max parameter values, their best tour on a 30-city TSP had a length of 26.9, as compared to 24.1 for a tour obtained with the simulated annealing heuristic. Peterson and Soderberg [81] test a similar model on much larger problems, ranging in size from 50 to 200 cities. They observe that the tours generated by the neural network approach are about 8% longer on average than the tours generated with a simulated annealing heuristic. Moreover, no tour was more than 10% longer than the corresponding simulated annealing tour. However, the average tour lengths are not provided (the results are displayed only as small histograms), and no computation times are reported. It is quite interesting to note that Bilbro et al. [13] have shown that the evolution of the above model is equivalent to the evolution of the Hopfield-Tank network governed by the equations of motion (see Section 1). However, convergence to a stable configuration is much faster when the mean-field equations are solved directly. This increased convergence speed explains the success of Peterson and Soderberg, who routinely found feasible solutions to TSPs with up to 200 cities, the largest problems ever solved with models derived from the work of Hopfield and Tank.
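The iterative algorithm of Steps 1-6 can be sketched in Python. This is a minimal illustration under my own simplifications, not the authors' implementation: a single fixed temperature and a fixed iteration budget replace the paper's stopping rule and critical-temperature estimate, the noise scale for symmetry breaking and the argmax decoding are mine, and feasibility of the decoded tour is not guaranteed in general:

```python
import math
import random

def mean_field_tsp(d, d_max, T, iters=20000, seed=0):
    # Sketch of the Van den Bout-Miller iterative update for energy (2.2).
    # d[X][Y] are inter-city distances; positions are treated cyclically.
    rng = random.Random(seed)
    N = len(d)
    # near-uniform start; small noise breaks the symmetry between positions
    V = [[1.0 / N + 0.001 * rng.random() for _ in range(N)] for _ in range(N)]
    for _ in range(iters):
        X = rng.randrange(N)                      # step 2: pick a city at random
        U = []
        for i in range(N):                        # step 3: U_Xi = -dE/dV_Xi
            penalty = d_max * sum(V[Y][i] for Y in range(N) if Y != X)
            length = sum(d[X][Y] * (V[Y][(i + 1) % N] + V[Y][(i - 1) % N])
                         for Y in range(N) if Y != X)
            U.append(-penalty - length)
        m = max(u / T for u in U)                 # step 4: softmax over positions
        w = [math.exp(u / T - m) for u in U]      # (shifted for numerical safety)
        total = sum(w)
        V[X] = [wi / total for wi in w]           # row now satisfies sum_i V_Xi = 1
    # read out the most active position for each city
    return [max(range(N), key=lambda i: V[X][i]) for X in range(N)]

# four cities on a unit square, Euclidean distances; d_max is set slightly
# above twice the largest inter-city distance, as suggested in the text
pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
d = [[math.dist(a, b) for b in pts] for a in pts]
tour = mean_field_tsp(d, d_max=2.1 * max(max(row) for row in d), T=0.1)
print(tour)
```

The softmax in Step 4 enforces the constraint Σ_i V_Xi = 1 by construction, which is exactly why each V_Xi can be read as the probability that city X sits at position i.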
(d) Mean-field annealing refers to the application of the mean-field algorithm, as described in (c), with a gradual reduction of the temperature from high to low values, as in the simulated annealing heuristic. [13, 19, 80, 107] As pointed out in [97], this approach is of little use if the critical temperature, where all the energy minimization takes place, can be accurately estimated from problem data. If the estimate is not accurate, however, it can be useful to gradually decrease the temperature.

(e) In [8, 26, 66], random noise is introduced into the activation levels of the units in order to escape from local minima. The random


More information

Mathematics for Decision Making: An Introduction. Lecture 8

Mathematics for Decision Making: An Introduction. Lecture 8 Mathematics for Decision Making: An Introduction Lecture 8 Matthias Köppe UC Davis, Mathematics January 29, 2009 8 1 Shortest Paths and Feasible Potentials Feasible Potentials Suppose for all v V, there

More information

Chapter 3: Discrete Optimization Integer Programming

Chapter 3: Discrete Optimization Integer Programming Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Website: http://home.deib.polimi.it/amaldi/opt-16-17.shtml Academic year 2016-17

More information

8. INTRACTABILITY I. Lecture slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley. Last updated on 2/6/18 2:16 AM

8. INTRACTABILITY I. Lecture slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley. Last updated on 2/6/18 2:16 AM 8. INTRACTABILITY I poly-time reductions packing and covering problems constraint satisfaction problems sequencing problems partitioning problems graph coloring numerical problems Lecture slides by Kevin

More information

CS 380: ARTIFICIAL INTELLIGENCE

CS 380: ARTIFICIAL INTELLIGENCE CS 380: ARTIFICIAL INTELLIGENCE PROBLEM SOLVING: LOCAL SEARCH 10/11/2013 Santiago Ontañón santi@cs.drexel.edu https://www.cs.drexel.edu/~santi/teaching/2013/cs380/intro.html Recall: Problem Solving Idea:

More information

Metaheuristics and Local Search

Metaheuristics and Local Search Metaheuristics and Local Search 8000 Discrete optimization problems Variables x 1,..., x n. Variable domains D 1,..., D n, with D j Z. Constraints C 1,..., C m, with C i D 1 D n. Objective function f :

More information

An artificial neural networks (ANNs) model is a functional abstraction of the

An artificial neural networks (ANNs) model is a functional abstraction of the CHAPER 3 3. Introduction An artificial neural networs (ANNs) model is a functional abstraction of the biological neural structures of the central nervous system. hey are composed of many simple and highly

More information

Scheduling and Optimization Course (MPRI)

Scheduling and Optimization Course (MPRI) MPRI Scheduling and optimization: lecture p. /6 Scheduling and Optimization Course (MPRI) Leo Liberti LIX, École Polytechnique, France MPRI Scheduling and optimization: lecture p. /6 Teachers Christoph

More information

Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). (v ji (1 x i ) + (1 v ji )x i )

Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). (v ji (1 x i ) + (1 v ji )x i ) Symmetric Networks Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). How can we model an associative memory? Let M = {v 1,..., v m } be a

More information

Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems

Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems Yongjia Song James R. Luedtke August 9, 2012 Abstract We study solution approaches for the design of reliably

More information

Part B" Ants (Natural and Artificial)! Langton s Vants" (Virtual Ants)! Vants! Example! Time Reversibility!

Part B Ants (Natural and Artificial)! Langton s Vants (Virtual Ants)! Vants! Example! Time Reversibility! Part B" Ants (Natural and Artificial)! Langton s Vants" (Virtual Ants)! 11/14/08! 1! 11/14/08! 2! Vants!! Square grid!! Squares can be black or white!! Vants can face N, S, E, W!! Behavioral rule:!! take

More information

Local and Stochastic Search

Local and Stochastic Search RN, Chapter 4.3 4.4; 7.6 Local and Stochastic Search Some material based on D Lin, B Selman 1 Search Overview Introduction to Search Blind Search Techniques Heuristic Search Techniques Constraint Satisfaction

More information

Algorithms: COMP3121/3821/9101/9801

Algorithms: COMP3121/3821/9101/9801 NEW SOUTH WALES Algorithms: COMP3121/3821/9101/9801 Aleks Ignjatović School of Computer Science and Engineering University of New South Wales LECTURE 9: INTRACTABILITY COMP3121/3821/9101/9801 1 / 29 Feasibility

More information

16.410/413 Principles of Autonomy and Decision Making

16.410/413 Principles of Autonomy and Decision Making 6.4/43 Principles of Autonomy and Decision Making Lecture 8: (Mixed-Integer) Linear Programming for Vehicle Routing and Motion Planning Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute

More information

Integer Programming ISE 418. Lecture 8. Dr. Ted Ralphs

Integer Programming ISE 418. Lecture 8. Dr. Ted Ralphs Integer Programming ISE 418 Lecture 8 Dr. Ted Ralphs ISE 418 Lecture 8 1 Reading for This Lecture Wolsey Chapter 2 Nemhauser and Wolsey Sections II.3.1, II.3.6, II.4.1, II.4.2, II.5.4 Duality for Mixed-Integer

More information

Revisiting the Hamiltonian p-median problem: a new formulation on directed graphs and a branch-and-cut algorithm

Revisiting the Hamiltonian p-median problem: a new formulation on directed graphs and a branch-and-cut algorithm Revisiting the Hamiltonian p-median problem: a new formulation on directed graphs and a branch-and-cut algorithm Tolga Bektaş 1, Luís Gouveia 2, Daniel Santos 2 1 Centre for Operational Research, Management

More information

The core of solving constraint problems using Constraint Programming (CP), with emphasis on:

The core of solving constraint problems using Constraint Programming (CP), with emphasis on: What is it about? l Theory The core of solving constraint problems using Constraint Programming (CP), with emphasis on: l Modeling l Solving: Local consistency and propagation; Backtracking search + heuristics.

More information

ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES

ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES SANTOSH N. KABADI AND ABRAHAM P. PUNNEN Abstract. Polynomially testable characterization of cost matrices associated

More information

AI Programming CS F-20 Neural Networks

AI Programming CS F-20 Neural Networks AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols

More information

A Randomized Rounding Approach to the Traveling Salesman Problem

A Randomized Rounding Approach to the Traveling Salesman Problem A Randomized Rounding Approach to the Traveling Salesman Problem Shayan Oveis Gharan Amin Saberi. Mohit Singh. Abstract For some positive constant ɛ 0, we give a ( 3 2 ɛ 0)-approximation algorithm for

More information

A Polynomial-Time Algorithm for Pliable Index Coding

A Polynomial-Time Algorithm for Pliable Index Coding 1 A Polynomial-Time Algorithm for Pliable Index Coding Linqi Song and Christina Fragouli arxiv:1610.06845v [cs.it] 9 Aug 017 Abstract In pliable index coding, we consider a server with m messages and n

More information

More on NP and Reductions

More on NP and Reductions Indian Institute of Information Technology Design and Manufacturing, Kancheepuram Chennai 600 127, India An Autonomous Institute under MHRD, Govt of India http://www.iiitdm.ac.in COM 501 Advanced Data

More information

Metaheuristics and Local Search. Discrete optimization problems. Solution approaches

Metaheuristics and Local Search. Discrete optimization problems. Solution approaches Discrete Mathematics for Bioinformatics WS 07/08, G. W. Klau, 31. Januar 2008, 11:55 1 Metaheuristics and Local Search Discrete optimization problems Variables x 1,...,x n. Variable domains D 1,...,D n,

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

Input layer. Weight matrix [ ] Output layer

Input layer. Weight matrix [ ] Output layer MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2003 Recitation 10, November 4 th & 5 th 2003 Learning by perceptrons

More information

Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik. Combinatorial Optimization (MA 4502)

Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik. Combinatorial Optimization (MA 4502) Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik Combinatorial Optimization (MA 4502) Dr. Michael Ritter Problem Sheet 1 Homework Problems Exercise

More information

CS 583: Approximation Algorithms: Introduction

CS 583: Approximation Algorithms: Introduction CS 583: Approximation Algorithms: Introduction Chandra Chekuri January 15, 2018 1 Introduction Course Objectives 1. To appreciate that not all intractable problems are the same. NP optimization problems,

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Neural Networks Varun Chandola x x 5 Input Outline Contents February 2, 207 Extending Perceptrons 2 Multi Layered Perceptrons 2 2. Generalizing to Multiple Labels.................

More information

Chapter 11. Approximation Algorithms. Slides by Kevin Wayne Pearson-Addison Wesley. All rights reserved.

Chapter 11. Approximation Algorithms. Slides by Kevin Wayne Pearson-Addison Wesley. All rights reserved. Chapter 11 Approximation Algorithms Slides by Kevin Wayne. Copyright @ 2005 Pearson-Addison Wesley. All rights reserved. 1 Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should

More information

Theory and Applications of Simulated Annealing for Nonlinear Constrained Optimization 1

Theory and Applications of Simulated Annealing for Nonlinear Constrained Optimization 1 Theory and Applications of Simulated Annealing for Nonlinear Constrained Optimization 1 Benjamin W. Wah 1, Yixin Chen 2 and Tao Wang 3 1 Department of Electrical and Computer Engineering and the Coordinated

More information

Discrete Optimization 2010 Lecture 8 Lagrangian Relaxation / P, N P and co-n P

Discrete Optimization 2010 Lecture 8 Lagrangian Relaxation / P, N P and co-n P Discrete Optimization 2010 Lecture 8 Lagrangian Relaxation / P, N P and co-n P Marc Uetz University of Twente m.uetz@utwente.nl Lecture 8: sheet 1 / 32 Marc Uetz Discrete Optimization Outline 1 Lagrangian

More information

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm

Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Balancing and Control of a Freely-Swinging Pendulum Using a Model-Free Reinforcement Learning Algorithm Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 2778 mgl@cs.duke.edu

More information

Algorithms for a Special Class of State-Dependent Shortest Path Problems with an Application to the Train Routing Problem

Algorithms for a Special Class of State-Dependent Shortest Path Problems with an Application to the Train Routing Problem Algorithms fo Special Class of State-Dependent Shortest Path Problems with an Application to the Train Routing Problem Lunce Fu and Maged Dessouky Daniel J. Epstein Department of Industrial & Systems Engineering

More information

Part III: Traveling salesman problems

Part III: Traveling salesman problems Transportation Logistics Part III: Traveling salesman problems c R.F. Hartl, S.N. Parragh 1/74 Motivation Motivation Why do we study the TSP? it easy to formulate it is a difficult problem many significant

More information

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

More information

Fundamentals of Metaheuristics

Fundamentals of Metaheuristics Fundamentals of Metaheuristics Part I - Basic concepts and Single-State Methods A seminar for Neural Networks Simone Scardapane Academic year 2012-2013 ABOUT THIS SEMINAR The seminar is divided in three

More information

An Effective Chromosome Representation for Evolving Flexible Job Shop Schedules

An Effective Chromosome Representation for Evolving Flexible Job Shop Schedules An Effective Chromosome Representation for Evolving Flexible Job Shop Schedules Joc Cing Tay and Djoko Wibowo Intelligent Systems Lab Nanyang Technological University asjctay@ntuedusg Abstract As the Flexible

More information

Robust Network Codes for Unicast Connections: A Case Study

Robust Network Codes for Unicast Connections: A Case Study Robust Network Codes for Unicast Connections: A Case Study Salim Y. El Rouayheb, Alex Sprintson, and Costas Georghiades Department of Electrical and Computer Engineering Texas A&M University College Station,

More information

CHAPTER 3 FUNDAMENTALS OF COMPUTATIONAL COMPLEXITY. E. Amaldi Foundations of Operations Research Politecnico di Milano 1

CHAPTER 3 FUNDAMENTALS OF COMPUTATIONAL COMPLEXITY. E. Amaldi Foundations of Operations Research Politecnico di Milano 1 CHAPTER 3 FUNDAMENTALS OF COMPUTATIONAL COMPLEXITY E. Amaldi Foundations of Operations Research Politecnico di Milano 1 Goal: Evaluate the computational requirements (this course s focus: time) to solve

More information

Testing Problems with Sub-Learning Sample Complexity

Testing Problems with Sub-Learning Sample Complexity Testing Problems with Sub-Learning Sample Complexity Michael Kearns AT&T Labs Research 180 Park Avenue Florham Park, NJ, 07932 mkearns@researchattcom Dana Ron Laboratory for Computer Science, MIT 545 Technology

More information

Neural Networks Lecture 6: Associative Memory II

Neural Networks Lecture 6: Associative Memory II Neural Networks Lecture 6: Associative Memory II H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural

More information

Notes on Back Propagation in 4 Lines

Notes on Back Propagation in 4 Lines Notes on Back Propagation in 4 Lines Lili Mou moull12@sei.pku.edu.cn March, 2015 Congratulations! You are reading the clearest explanation of forward and backward propagation I have ever seen. In this

More information

In biological terms, memory refers to the ability of neural systems to store activity patterns and later recall them when required.

In biological terms, memory refers to the ability of neural systems to store activity patterns and later recall them when required. In biological terms, memory refers to the ability of neural systems to store activity patterns and later recall them when required. In humans, association is known to be a prominent feature of memory.

More information

Artificial Intelligence Heuristic Search Methods

Artificial Intelligence Heuristic Search Methods Artificial Intelligence Heuristic Search Methods Chung-Ang University, Jaesung Lee The original version of this content is created by School of Mathematics, University of Birmingham professor Sandor Zoltan

More information

CS/COE

CS/COE CS/COE 1501 www.cs.pitt.edu/~nlf4/cs1501/ P vs NP But first, something completely different... Some computational problems are unsolvable No algorithm can be written that will always produce the correct

More information

Analysis of Algorithms. Unit 5 - Intractable Problems

Analysis of Algorithms. Unit 5 - Intractable Problems Analysis of Algorithms Unit 5 - Intractable Problems 1 Intractable Problems Tractable Problems vs. Intractable Problems Polynomial Problems NP Problems NP Complete and NP Hard Problems 2 In this unit we

More information

Linear discriminant functions

Linear discriminant functions Andrea Passerini passerini@disi.unitn.it Machine Learning Discriminative learning Discriminative vs generative Generative learning assumes knowledge of the distribution governing the data Discriminative

More information

Integer weight training by differential evolution algorithms

Integer weight training by differential evolution algorithms Integer weight training by differential evolution algorithms V.P. Plagianakos, D.G. Sotiropoulos, and M.N. Vrahatis University of Patras, Department of Mathematics, GR-265 00, Patras, Greece. e-mail: vpp

More information

Discrete evaluation and the particle swarm algorithm

Discrete evaluation and the particle swarm algorithm Volume 12 Discrete evaluation and the particle swarm algorithm Tim Hendtlass and Tom Rodgers Centre for Intelligent Systems and Complex Processes Swinburne University of Technology P. O. Box 218 Hawthorn

More information

Lecture 4: NP and computational intractability

Lecture 4: NP and computational intractability Chapter 4 Lecture 4: NP and computational intractability Listen to: Find the longest path, Daniel Barret What do we do today: polynomial time reduction NP, co-np and NP complete problems some examples

More information

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University Algorithms NP -Complete Problems Dong Kyue Kim Hanyang University dqkim@hanyang.ac.kr The Class P Definition 13.2 Polynomially bounded An algorithm is said to be polynomially bounded if its worst-case

More information

Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems

Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems Chapter 4 Beyond Classical Search 4.1 Local search algorithms and optimization problems CS4811 - Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline

More information

Lecture 5: Logistic Regression. Neural Networks

Lecture 5: Logistic Regression. Neural Networks Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture

More information