

DUAL-MODE DYNAMICS NEURAL NETWORKS FOR COMBINATORIAL OPTIMIZATION

by

Jun Park

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Electrical Engineering - Systems)

August 1994

Copyright 1994 Jun Park

Acknowledgments

First, I express my appreciation and respect to my advisor, Professor Sukhan Lee. Through detailed, stimulating, and productive discussions, he has given me many essential ideas and guided me in the right direction to complete this dissertation successfully. Also, his inexhaustible passion and dedication to research give me a notion of what kind of researcher I ought to be. It has been my great pleasure and privilege to have him as my advisor. I also would like to thank Professor Bart Kosko and Professor Behrokh Khoshnevis, my dissertation committee, for their constructive comments and valuable suggestions. Also, I express my thanks to Professor Keith Jenkins and Professor Ken Goldberg for serving on my qualifying committee. It has also been my pleasure to have such sincere and cooperative colleagues: Yeong Woo Choi, Chunsik Yi, Shunich Shimoji, Judy Chen, Carlos Luck, Soo Kwang Ro, Andrew H. Fagg, and all the previous group members. I thank them all for their cooperation and encouragement. Special thanks are due to the Electronics and Telecommunications Research Institute for the financial support of my study at the University of Southern California. Finally, I would like to express my gratitude to all my family members. Their constant support and encouragement helped me very much to overcome various difficulties during the period of my study. I wish to express my appreciation and love to my wife, Miran, and to my daughter and son, Dahyun and Jihyun. Especially, I hope I give pleasure to my mother with this dissertation.

Contents

Acknowledgments
List of Figures
List of Tables
Abstract

1 Introduction
  1.1 Combinatorial Optimization Problem and Neural Network
  1.2 Related Works
  1.3 Approach of the Thesis
  1.4 Organization of the Thesis

2 Dual-Mode Dynamics Neural Networks
  2.1 Network Configuration Space and Equilibrium Manifold
  2.2 Network Structure
  2.3 Dual-Mode Dynamics
    2.3.1 Discrete Model Dual-Mode Dynamics
    2.3.2 Continuous Model Dual-Mode Dynamics
    2.3.3 Symmetry Preserving Recurrent Backpropagation
  2.4 Binary Value Solution vs. Continuous State Variable
  2.5 Asymmetric Weight vs. Symmetric Weight

3 Problem Solving with Dual-Mode Dynamics Neural Networks
  3.1 General Design Procedure
  3.2 N-Queen Problem
    3.2.1 D2NN for the N-Queen Problem
    3.2.2 Simulation Results and Discussions
  3.3 Knapsack Packing Problem

    3.3.1 D2NN for the Knapsack Packing Problem
    3.3.2 Simulation Results and Discussions
  3.4 Traveling Salesman Problem
    3.4.1 D2NN for the Traveling Salesman Problem
    3.4.2 Simulation Results and Discussions

4 Conclusion
  4.1 Summary of the Research Contributions
  4.2 Suggestions for Future Research

Appendix

A Convergence Property of Knapsack Packing D2NN

Bibliography

List of Figures

1.1 The relation between the external objective function, the network energy function, the state dynamics, and the weight dynamics a) in conventional approaches, and b) in Dual-Mode Dynamics Neural Networks.
2.1 The network configuration space and the equilibrium manifold.
2.2 The structure of the Dual-Mode Dynamics Neural Network.
2.3 The schematic view of the dual-mode dynamics based on the equilibrium manifold in the network configuration space.
2.4 The hyperquadrant-to-vertex mapping.
2.5 The state evolution in D2NN for a two-variable problem with three inequality constraints.
2.6 Typical behaviors of the state dynamics for a 100-neuron network: a) with asymmetric weights and b) with symmetric weights.
2.7 The computational flow in the continuous model of D2NN.
3.1 The computational cost to find the solution of the N-Queen problem. The solid line indicates the average number of cumulative state dynamics iterations, and the dashed line the average number of weight dynamics iterations, over 10 trials for each problem size.
3.2 Examples of solutions found by D2NN for the 20- and 40-queen problems.
3.3 For the N-Queen problem, the initial weights can be obtained from a (2N-1) x (2N-1) board. For 5 queens, e.g., the weight for the (1,1) unit is indicated by the solid box, and that for the (4,3) unit by the dashed box, on the 9 x 9 board shown above. Note that there are 8(N-1) inhibitory weights out of (2N-1) x (2N-1) board cells. The effects of the inhibitory and excitatory weights are balanced by the values given in the text.
3.5 D2NN structure for the knapsack packing problem.

3.6 Two representation schemes for the traveling salesman problem: a) the city-order representation and b) the city-city representation.
3.7 The optimal tour for the given problem with 10 cities.
3.8 Four semi-optimal tours obtained by D2NN for the given traveling salesman problem with 10 cities.
3.9 Tours found by D2NN for the traveling salesman problems with 20 cities.

List of Tables

3.1 The computational cost to find solutions for each size of the N-Queen problem.
3.2 Comparison of the random initial weight assignment and the heuristic assignment for the 8- and 16-queen problems. The learning rate (\eta) is set to 0.01 in all cases.
3.3 Comparison of D2NN with other neural network approaches to the N-Queen problem. Performance is compared in terms of the success rate in finding a solution versus the number of queens.
3.4 The computational cost of D2NN to find optimal solutions for the knapsack packing problem.
3.5 Comparison of D2NN with the greedy algorithm and Hellstrom & Kanal's approach in terms of the rate of finding the optimal solution.
3.6 Comparison of performance and computation time (sec) for the different approaches on n = m = 30 problems. The simulations were performed on a Sun 4/50 workstation; the data marked with an asterisk (*) are from Ohlsson et al. [31].
3.7 The computational cost to find a solution for the traveling salesman problem with 10 cities. The results were obtained with a time limit of 2000 weight dynamics iterations for the target cost 2.75 and 1000 for the target cost 2.0.
3.8 The computational cost to find a solution for two traveling salesman problem instances with 20 cities.

Abstract

This thesis presents a new approach to solving combinatorial optimization problems, based on a novel dynamic neural network featuring a dual mode of network dynamics: the state dynamics and the weight dynamics. The network is referred to here as the Dual-Mode Dynamics Neural Network (D2NN). A combinatorial optimization problem usually has a huge number of elements in its configuration space, so that we cannot explore them exhaustively. Recently, neural network approaches have been studied for the solution of combinatorial optimization problems. The computational characteristic of neural networks (distributed and collective computation over a massively parallel architecture, emulating nonlinear dynamics) has raised high expectations of overcoming the curse of combinatorial search complexity in optimization. Several effective approaches have been applied to various combinatorial optimization problems and have shown promising preliminary results. There are, however, two major difficulties in the neural network approaches to optimization problems. First, the objective function for a given problem must have a form that can be mapped onto the network; secondly, due to the local minima problem, the quality of the solution is quite sensitive to various factors, such as the initial state and the parameters in the objective function. The proposed scheme overcomes these difficulties 1) by maintaining the objective

function separately from the network energy function, rather than mapping it onto the network, and 2) by introducing a weight dynamics utilizing the objective function to avoid the local minima problem. The state dynamics defines state trajectories in a direction to minimize the network energy specified by the current weights and states, whereas the weight dynamics generates weight trajectories in a direction to minimize a preassigned external objective function at the current state. D2NN is operated in such a way that the two modes of network dynamics alternately govern the network until an equilibrium is reached. D2NN has been applied to the N-Queen problem, the knapsack problem, and the traveling salesman problem, and shows superior performance.

Chapter 1

Introduction

1.1 Combinatorial Optimization Problem and Neural Network

A central problem in the engineering field is the optimization problem. Its concern is to find the best configuration, or set of parameters, to achieve some goal. If the variables in the optimization problem are discrete rather than continuous, we call it a combinatorial optimization problem. In a combinatorial optimization problem the number of elements in the configuration space is factorially large, and therefore we cannot explore them exhaustively. For example, in the traveling salesman problem with 30 cities, the number of feasible tours is approximately 4.4 x 10^30 (29!/2 distinct tours). Different heuristics have been devised for different problems to find a good solution, rather than the globally optimal solution. Recently, artificial neural networks have been applied to solve combinatorial optimization problems. In their pioneering work, Hopfield and Tank [18] showed the feasibility of solving combinatorial optimization problems with neural networks.
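For the traveling salesman problem, fixing the starting city and identifying each tour with its reversal leaves (N-1)!/2 distinct tours, which for 30 cities is 29!/2, about 4.4 x 10^30. A quick sketch of this count (illustrative only, not from the dissertation):

```python
import math

# Distinct closed tours through N cities: fix the starting city
# ((N-1)! orderings of the remaining cities), then identify each
# tour with its reversal (divide by 2).
def distinct_tours(n_cities):
    return math.factorial(n_cities - 1) // 2

tours_30 = distinct_tours(30)   # 29!/2, about 4.4e30
```

Even at a billion tour evaluations per second, exhaustive enumeration at this scale is hopeless, which is what motivates the heuristic and neural approaches discussed next.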

For the Traveling Salesman Problem (TSP), one of the classical combinatorial optimization problems, they mapped a properly defined objective function onto the Hopfield network with symmetric weights and no self-loops, and showed that the solution can be computed collectively as the network dynamics evolves. The underlying idea in Hopfield and Tank's work was adopted from the associative memory model [16, 17, 22, 23]. Given a new pattern, the associative memory responds by producing the stored pattern that most closely resembles it. The mechanism of this process can be explained as follows. First, each of the stored patterns is placed at one of the local minima of the network energy function. Secondly, as the network dynamics evolves, the state of the network descends along the energy surface from the initial state corresponding to the given input pattern. Finally, the state converges to a local minimum of the network energy near the initial state. Note that the network energy is minimized while retrieving the stored pattern. This characteristic of energy minimization in the associative memory was adopted by Hopfield and Tank. For a given optimization problem, a properly defined objective function is mapped onto the network in such a way that the objective function is minimized as the network dynamics evolves. A state at equilibrium is expected to represent a solution of good quality, although not necessarily the globally optimal one. The computational characteristic of neural networks (distributed and collective computation over a massively parallel architecture, emulating nonlinear dynamics) has raised high expectations of overcoming the curse of combinatorial search complexity in optimization problems. Thus, much effort has been devoted to obtaining a neural network solution for a wide variety of combinatorial

optimization problems, including the traveling salesman problem [6, 44, 33], the Hamiltonian cycle problem [30, 38], the knapsack problem [15, 31], the N-Queen problem [3, 39, 40, 41], the scheduling problem [10, 13, 49], the graph partitioning problem [32, 36, 43], etc.

1.2 Related Works

Most neural network approaches to combinatorial optimization adopt the following steps: 1) the formulation of an objective function representing both the cost to be minimized and the constraints to be satisfied, and 2) the assignment of proper weights to the network such that the resulting network dynamics makes the network states converge to the minimum of the network energy function (representing a solution), as shown in Figure 1.1 a). When we formulate an objective function for a given problem, it must have a form that can be mapped onto the network. If we want to map the problem onto the Hopfield network, the objective function should have the same form as the network energy function^1 of the Hopfield network:

    E = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} x_i x_j - \sum_{i=1}^{N} w_{i0} x_i,   (1.1)

where x_i, i = 1, ..., N, is the output of the ith unit, w_ij is the weight between the ith and jth units, and w_i0 is the bias term of the ith unit. For many

^1 We refer to the network energy function as the function of the states and the weights as in (1.1), and to the external objective function (or simply objective function) as the unconstrained function (the sum of the penalty functions associated with the constraints and the optimization measure) to be minimized for the given optimization problem.

optimization problems, however, it is difficult or even impossible to build an objective function as in (1.1). For example, an inequality constraint, a common component of optimization problems, is hard to express in the form given by (1.1). Even when we can find an objective function of a proper form, we still have to select a proper set of parameters. The objective function usually has a set of coefficients which determine the relative weighting among the components of the optimization problem. These coefficients determine the network energy landscape, and thus the performance of the network is highly dependent on their selection. They are usually selected by trial and error; as the problem size grows, however, it becomes very hard to find a suitable set of coefficients. There is another crucial problem, the so-called local minima problem. An objective function formulated in the form (1.1) generally has many local minima. While these local minima are exploited in the associative memory model, they may cause a critical problem in solving optimization problems, since a final state stuck in a local minimum often represents a solution of poor quality, or even an invalid solution. Also, the performance of the network is quite dependent on the selection of the initial state. If we start with an initial state placed in the basin of the global or a near-global minimum of the network energy function, the final state will represent a solution of good quality; otherwise, a poor solution is likely to be obtained [47]. To overcome this local minima problem, several effective approaches have been reported. Early on, simulated annealing [21, 45] was devised for solving combinatorial optimization problems. The key idea comes from an analogy with

statistical thermodynamics. When a liquid freezes and crystallizes under very slow cooling, the material structure achieves the minimum state of the thermodynamic energy. This is because there is always a chance for the material structure to escape from local minima with the help of thermal noise. Similarly, the state (or configuration) in simulated annealing is rearranged not only in directions that decrease the objective function, but also, with some probability, in directions that increase it. As the temperature goes down, the probability of increasing the objective function gradually reduces, and the final system state is expected to reach the global minimum or a semi-minimum of the objective function. In mean field annealing [44, 43, 32], simulated annealing is combined with the mean field network, which is equivalent to the Hopfield network [4]. At a high artificial temperature, the surface of the Hopfield network energy is smoothed out and the basins of local minima tend to disappear, so the state of the network is likely to approach the vicinity of the global minimum. As the temperature goes down, the energy function recovers its original shape and the state is expected to converge to the global or a near-global minimum. Tabu learning [6, 12] is another approach to solving non-convex optimization problems. In tabu learning, an auxiliary energy function is added to the Hopfield network energy. This auxiliary energy function is continuously increased in a neighborhood of the current state, thus penalizing states that have already been visited. If the state is stuck at a local minimum, the auxiliary function around that minimum begins to increase and pushes the state out of that local minimum toward the space not yet visited. Besides the above approaches, many ideas have been proposed to improve

the performance of neural optimization networks. To handle inequality constraints, Tagliarini and Page [39] used slack variables to transform integer inequality constraints into integer equality constraints, and Abe [1, 2] proposed slack variables with a special activation function to handle non-integer inequality constraints. Metha and Fulop [30] derived conditions on the coefficients of the objective function for producing valid solutions of the Hamiltonian cycle problem. Sun and Fu [38] proposed an algorithmic method using the coordinate Newton method to reduce the computation time needed to reach a valid solution. Xu and Tsai [48] proposed the city-city representation scheme for the traveling salesman problem and combined it with the OPT2 algorithm to solve the subtour problem. Different mapping schemes have also been proposed for different problems [11, 19]. Although these approaches have been shown to be effective in improving the quality of solutions, several problems remain. Due to the constraints on formulating the objective function, it is hard to apply neural network approaches to some classes of problems. For example, optimization problems with arbitrary inequality constraints or with high-order optimization measures are not easily mapped onto the network. Also, the quality of the solution is very sensitive to various factors. The annealing process usually takes a long computation time, and a well-devised annealing schedule is needed to get a good solution. When we add an auxiliary function to the network energy function to avoid the local minima problem, as in tabu learning, we should carefully select a set of parameters to control the auxiliary function; otherwise, the final state may represent a solution of poor quality, even though it is good for the modified network energy function. Besides, the coefficients in the objective function and the initial

state have a crucial influence on the quality of the final solution. Unfortunately, there are no guidelines for determining suitable values for these factors.

1.3 Approach of the Thesis

This thesis presents a new approach to the solution of combinatorial optimization problems based on Hopfield-type recurrent neural networks, focusing on the aforementioned local minima problem. In the proposed approach, we design the network dynamics to be governed not only by the state dynamics but also by the weight dynamics. Thus, the network is named "the Dual-Mode Dynamics Neural Network (D2NN)" [25, 27, 24, 26, 28]. In the Dual-Mode Dynamics Neural Network, the external objective function for a given optimization problem is not mapped onto the network but maintained separately from the network energy function. The weight dynamics is introduced to avoid the local minima problem, and is guided by the external objective function. In other words, there exist two kinds of energy functions in D2NN: the external objective function, which is specific to the given optimization problem, and the network energy, which is a function of the network state and the weights. Also, there are two types of dynamics: the state dynamics, which is governed by the network energy function, and the weight dynamics, which is governed by the external objective function, as shown in Figure 1.1 b). The state dynamics is the same as the Hopfield network dynamics. With symmetric weights, the state dynamics is guaranteed to converge to an equilibrium [14, 16, 20]. The weight dynamics is set so as to drive the network states in a direction to minimize the external objective function whenever

Figure 1.1: The relation between the external objective function, the network energy function, the state dynamics, and the weight dynamics a) in conventional approaches, and b) in Dual-Mode Dynamics Neural Networks.

the state dynamics reaches an equilibrium. The repetition of the state dynamics and the weight dynamics leads to a solution, since the weight dynamics provides a means of escaping from a local minimum of the network energy function by changing the network energy profile, and pushes the equilibrium state of the state dynamics toward the minimum of the external objective function.

1.4 Organization of the Thesis

The thesis is organized as follows. Chapter 1 reviews neural optimization networks and describes the problem statement as well as the approach of the thesis. Chapter 2 describes the Dual-Mode Dynamics Neural Network in detail. First, the fundamental idea is clarified through a discussion in the framework of the network configuration space and the equilibrium manifold. Then, the discrete and continuous models of the Dual-Mode Dynamics Neural Network are described. Also, the issues of binary solutions vs. continuous state variables and asymmetric weights vs. symmetric weights are discussed. Chapter 3 presents problem solving with the Dual-Mode Dynamics Neural Network. After describing the general procedure for designing a Dual-Mode Dynamics Neural Network for a given problem, the details of the Dual-Mode Dynamics Neural Networks for the N-Queen problem, the knapsack packing problem, and the traveling salesman problem are explained. Simulation results on these problems are presented and discussed as well. Finally, Chapter 4 summarizes the thesis contributions and proposes future research issues.

Chapter 2

Dual-Mode Dynamics Neural Networks

2.1 Network Configuration Space and Equilibrium Manifold

Let us consider a recurrent network represented by the following dynamics:

    \dot{u}_i = -u_i + \sum_{j=1}^{n} w_{ij} x_j + \theta_i,   (2.1)

    x_i = f(u_i) = \frac{1}{1 + e^{-u_i}},  for i = 1, ..., n,   (2.2)

where n is the number of neurons in the network, and u_i and x_i represent the state and output, respectively, of the ith neuron. With a fixed set of symmetric weights, it has been proven that the network dynamics (2.1) is guaranteed to converge to an equilibrium state [7, 20, 17]. The network dynamics (2.1) can

also be interpreted in terms of the network energy E: Equation (2.1) describes the evolution of the network state along the surface of the network energy function E, in a direction that minimizes the energy, reaching the bottom of the basin containing the initial state. Equation (2.1) indicates that the network energy function E is a function of the state as well as the weights of the network. Figure 2.1a illustrates the network energy function E defined over the Cartesian product of the state and weight spaces, {w x}, called the network configuration space. Note that the state dynamics of a network depends on the selection of a particular set of network weights. In general, network dynamics can be characterized, at a network configuration, in terms of both weight dynamics and state dynamics, based on the network energy function defined over the network configuration space. To apply the general network dynamics, consisting of weight and state dynamics, to optimization problems, we consider only the so-called equilibrium manifold of a network. The equilibrium manifold of a network is defined in the network configuration space as the set of points representing the steady states of the network corresponding to given weights, and is represented by the trace of the bottom of the valley of the network energy function, as shown schematically in Figure 2.1b. Figure 2.1b shows that the performance of most conventional neural network approaches to optimization is sensitive to the selection of network weights and initial states. For instance, in Figure 2.1b, let x_opt represent the optimal solution to be obtained for a given problem. Obviously, to obtain x_opt at the network equilibrium, it is necessary to select w_B as the network weight. But assigning w_B as the network weight is not sufficient for obtaining

Figure 2.1: The network configuration space and the equilibrium manifold.

the solution since, for example, if the initial state is given at P, the network will settle down at T instead of S. For conventional approaches to be successful for optimization, the proper assignment of network weights and initial states seems essential. Unfortunately, a systematic way of assigning proper weights and initial states to a network for optimization is yet to be established. The Dual-Mode Dynamics Neural Network (D2NN) is proposed to solve the problem of weight and initial state assignment associated with conventional approaches to optimization by combining the state dynamics with the weight dynamics. In D2NN, the network can start with arbitrarily chosen weights and initial state. That is, although the network may start at P or Q in Figure 2.1b, which would dictate that the network reach T or R, respectively, at the equilibrium of the state dynamics, D2NN allows the network to evolve toward S (representing x_opt) along the equilibrium manifold by automatically modifying the network weights and initial states through the weight dynamics and state dynamics.

2.2 Network Structure

The structure of the Dual-Mode Dynamics Neural Network (D2NN) is shown in Figure 2.2. D2NN is composed of two layers: the base layer and the supervisory layer. The base layer consists of a set of base units with symmetric connections among them. A base unit is either a visible unit or a hidden unit. For a given problem, the configuration space (or the solution space) is mapped onto the set of visible units, and the hidden units help the visible units to produce the desired solution. The supervisory layer consists of a set of supervisory units without intra-layer connections. One supervisory unit is assigned to each constraint in

the given problem. For the objective measure to be optimized, we set a target value to achieve and treat it as one of the constraints. The connections between a supervisory unit and the visible units are determined by the corresponding constraint, and do not change during computation. The external objective function is formulated over the supervisory units so that it attains its minimum when all the constraints are satisfied. Depending on the external objective function, the connections between the base layer and the supervisory layer can be of higher order, and we can select the desired form of activation function for each supervisory unit. The base layer is the same as the Hopfield network; thus, the state dynamics which governs the base layer is guaranteed to converge to an equilibrium with symmetric weights. At the equilibrium of the state dynamics, each supervisory unit examines the visible units connected to it to see whether the corresponding constraint is satisfied. If it is not, the weight dynamics changes the weights in the base layer in a direction that reduces the external objective function, while maintaining the symmetry of the weights. The state dynamics and the weight dynamics govern the network alternately, and this leads to a solution, since the weight dynamics changes the network energy profile and thus pushes the equilibrium state of the state dynamics toward the minimum of the external cost function.

Figure 2.2: The structure of the Dual-Mode Dynamics Neural Network.
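The two-layer organization just described can be sketched as a minimal data structure. Everything here (the class name, fields, and the example constraint) is an illustrative assumption, not code from the dissertation:

```python
import numpy as np

class D2NNStructure:
    """Sketch of the D2NN layout: a base layer with symmetric weights
    and no self-loops, plus one supervisory unit per constraint that
    reads only the visible base units and stays fixed during computation."""

    def __init__(self, n_visible, n_hidden, constraints, seed=0):
        rng = np.random.default_rng(seed)
        n = n_visible + n_hidden
        W = rng.normal(scale=0.1, size=(n, n))
        self.W = (W + W.T) / 2.0            # symmetric base-layer weights
        np.fill_diagonal(self.W, 0.0)       # no self-loops
        self.w0 = np.zeros(n)               # bias terms
        self.n_visible = n_visible
        self.constraints = constraints      # one supervisory unit per constraint

    def supervisory_outputs(self, x):
        v = x[: self.n_visible]             # supervisory units see visible units only
        return np.array([phi(v) for phi in self.constraints])

# Hypothetical constraint: "exactly two visible units are on",
# encoded so the supervisory output is zero when satisfied.
net = D2NNStructure(4, 2, [lambda v: v.sum() - 2.0])
s = net.supervisory_outputs(np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0]))
```

Encoding each constraint so its supervisory output vanishes when satisfied matches the requirement above that the external objective function attain its minimum when all constraints hold.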

2.3 Dual-Mode Dynamics

2.3.1 Discrete Model Dual-Mode Dynamics

In the discrete model Dual-Mode Dynamics Neural Network, we use a discrete Hopfield network as the base layer. Therefore, the state dynamics which governs the base layer is the discrete Hopfield network dynamics. Let x_i, i = 1, ..., N, be the output of base unit i, where N is the number of base units. Also, let w_ij be the weight between x_i and x_j, and w_i0 the bias term of base unit i. The state dynamics of the base units is

    x_i = \Phi(I_i^x) = \Phi\left( \sum_j w_{ij} x_j + w_{i0} \right),  for i = 1, ..., N,   (2.3)

where I_i^x is the input of the ith base unit and \Phi(\cdot) is the binary activation function of the base unit, i.e., \Phi(x) = 1 if x \geq 0, and \Phi(x) = 0 otherwise. (The symbol \Phi is used here for the activation function lost in transcription.) With no self-loops and symmetric weights, assigned randomly or heuristically at the start, the state dynamics is guaranteed to converge to an equilibrium under asynchronous operation. At equilibrium, the output of the supervisory units is

    s_k = \phi_k(x),  for k = 1, ..., K,   (2.4)

where K is the number of supervisory units and \phi_k(\cdot) represents the relationship between the kth supervisory unit s_k and the base units x in the corresponding constraint. The external objective function is defined on the supervisory layer

by

    C = \sum_{k=1}^{K} C_k(s_k) = \sum_{k=1}^{K} C_k(\phi_k(x)),   (2.5)

where C_k(\cdot) is the component of the objective function associated with each constraint. At the equilibrium state of the state dynamics, the weight dynamics is invoked and updates the weights in a direction that decreases the objective function. First, we define \delta s_k and \delta x_i by

    \delta s_k \triangleq \frac{dC_k}{ds_k},  for k = 1, ..., K,   (2.6)

    \delta x_i \triangleq \sum_{k=1}^{K} \frac{dC_k}{ds_k} \frac{ds_k}{dx_i} \frac{dx_i}{dI_i^x},  for i = 1, ..., n,   (2.7)

where n is the number of visible units. Note that \delta x_i is equal to 0 for i = (n+1), ..., N, since the hidden units are not directly connected to the supervisory units. For the binary threshold activation function \Phi(\cdot), it is intractable to compute dx_i/dI_i^x as it is. We treat \Phi(\cdot) as a linear activation function and set \Phi'(\cdot) to 1 in (2.7). This is a reasonable approximation since both are nondecreasing functions; that is, for a desired \Delta x_i, \Delta I_i^x should have the same sign as \Delta x_i under both functions.^1 Therefore,

    \delta x_i = \sum_{k=1}^{K} \delta s_k \frac{ds_k}{dx_i}.   (2.8)

In case the supervisory unit output s_k is a linear combination of the base unit outputs,

^1 This approximation is more effective than the approximation with the sigmoid function, since it gives rise to a larger |\Delta w_{ij}| when the input to a neuron is deeply saturated. The same idea has been applied to error backpropagation learning to speed up the learning process [46].

i.e.,

s_k = Σ_i w^{sx}_{ki} x_i,   (2.9)

then

Δx_i = Σ_k Δs_k w^{sx}_{ki}.   (2.10)

Using (2.6) through (2.8), the equations for the weight dynamics are obtained as:

Δw_ij = η (Δx_i x_j + Δx_j x_i),   (2.11)

Δw_i0 = η Δx_i,   (2.12)

where η (> 0) is the learning rate.

The overall operation of the discrete model of D2NN is as follows:

Step 1. Initialization
1. Assign the weights in the base layer randomly (or heuristically).
2. Select the initial state randomly.

Step 2. Dual-Mode Dynamics
1. Run the base layer network asynchronously by (2.3) until an equilibrium is reached.
2. If the equilibrium state represents a solution, then go to Step 3.

3. Update the weights by (2.6) through (2.12).
4. If the time limit has not expired, then go to Step 2.

Step 3. Stop

2.3.2 Continuous Model Dual-Mode Dynamics

In this subsection, we develop the continuous model dual-mode dynamics, which clarifies the theoretical aspects of dual-mode dynamics neural networks. We first derive the weight dynamics equation for the continuous model, and then discuss its geometrical interpretation in the framework of the network configuration space and the equilibrium manifold.

In the continuous model, the base layer is the continuous Hopfield network. Therefore, the state dynamics is the same as the general dynamics for recurrent networks in (2.1), but with fixed symmetric weights, i.e., w_ij = w_ji. At equilibrium of the state dynamics, all u̇_i become zero, and thus the equilibrium manifold equation is:

u = W x + θ,   (2.13)

where u, x, θ are the vectors of the state, output and bias, respectively, and W is the weight matrix. Let Δu be the state variation due to sufficiently small variations of the weights and biases, ΔW and Δθ. Then,

u + Δu = (W + ΔW)(x + Δx) + θ + Δθ.   (2.14)

From (2.13) and (2.14), disregarding the O(Δ²) term,

(I − W G) Δu = ΔW x + Δθ,   (2.15)

where G = diag{dx_i/du_i} = diag{x_i(1 − x_i)}, and Δx_i ≈ (dx_i/du_i) Δu_i, i.e., Δx ≈ G Δu.

When the current equilibrium state does not represent the desired solution, we can obtain the desired state variation Δu in the direction that minimizes the objective function, based on the gradient of the objective function with respect to the state. Then, with (2.15), we can get the weight variation ΔW which achieves the desired state variation Δu at the next equilibrium. However, we have to keep the symmetry of W to guarantee the convergence of the state dynamics.

Let us rearrange the right side of (2.15) with the vectorized form of the upper triangular elements of ΔW, Δw_v, as follows:

ζ = K Δw_v,   (2.16)

where

ζ ≜ (I − W G) Δu,   (2.17)

K = [K_1 K_2 ⋯ K_i ⋯ K_n I_n],   (2.18)

K_i is the n × (n−i+1) matrix

K_i = [ x_i      x_{i+1}  x_{i+2}  ⋯  x_n
        0        x_i      0        ⋯  0
        0        0        x_i      ⋯  0
        ⋮                          ⋱  ⋮
        0        0        0        ⋯  x_i ]   (2.19)

(rows i through n are shown; rows 1 through i−1 are zero),

Δw_v = [Δw_{v1}^T ⋯ Δw_{vi}^T ⋯ Δw_{vn}^T  Δθ^T]^T,   (2.20)

Δw_{vi} = [Δw_ii, Δw_{i(i+1)}, …, Δw_in]^T,   (2.21)

and I_n is the n × n identity matrix. Equation (2.16) is under-determined; that is, there are (n(n+1)/2 + n) variables and n constraints. Thus, we may have an infinite number of solutions. However, we assumed a small weight variation in deriving (2.16). So, we choose the pseudo-inverse solution, which gives the minimum norm of Δw_v, ‖Δw_v‖. Let K⁺ be the pseudo-inverse of K. Then,

Δw_v = K⁺ ζ   (2.22)
      = K^T (K K^T)^{−1} ζ.   (2.23)

Note that K K^T is always invertible, because the rank of K is always n due to the last block, I_n. This pseudo-inverse solution for Δw_v requires the matrix inversion of (K K^T). But, from (2.18) and (2.19),

K K^T = diag{S − x_i²} + x x^T,   (2.24)

where S = Σ_i x_i² + 1, and all the diagonal elements of diag{S − x_i²} are positive.
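The structure of K and the closed-form expression for K K^T can be checked numerically; a sketch assuming a small random output vector x and a right-hand side zeta standing in for (I − W G) Δu (the sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
x = rng.random(n)

# Build K = [K_1 ... K_n  I_n] from (2.18)-(2.19): the column for Delta w_ij
# (i <= j) carries x_j in row i and x_i in row j; the trailing identity block
# corresponds to the bias variations Delta theta.
cols = []
for i in range(n):
    for j in range(i, n):
        c = np.zeros(n)
        c[i] = x[j]
        c[j] = x[i]          # when i == j this leaves the single entry x_i
        cols.append(c)
K = np.column_stack(cols + [np.eye(n)])

# (2.24): K K^T = diag{S - x_i^2} + x x^T with S = sum_i x_i^2 + 1
S = x @ x + 1.0
M = K @ K.T
assert np.allclose(M, np.diag(S - x**2) + np.outer(x, x))

# Closed-form inverse of K K^T via the rank-one update identity
D = 1.0 + np.sum(x**2 / (S - x**2))
M_inv = np.diag(1.0 / (S - x**2)) - np.outer(x / (S - x**2), x / (S - x**2)) / D
assert np.allclose(M_inv, np.linalg.inv(M))

# (2.23): K^T (K K^T)^{-1} zeta is the minimum-norm solution of (2.16)
zeta = rng.standard_normal(n)      # stands in for (I - W G) Delta u
dw = K.T @ M_inv @ zeta
assert np.allclose(K @ dw, zeta)                    # satisfies (2.16)
assert np.allclose(dw, np.linalg.pinv(K) @ zeta)    # matches the pseudo-inverse
print("all checks passed")
```

The last assertion confirms that the explicit expression (2.23) agrees with the Moore-Penrose pseudo-inverse, so no general pseudo-inverse computation is needed at run time.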

Thus, by applying the general inversion identity

[A + a b^T]^{−1} = A^{−1} − (A^{−1} a b^T A^{−1}) / (1 + b^T A^{−1} a),

we can compute (K K^T)^{−1} easily as:

[(K K^T)^{−1}]_ij = δ_ij / (S − x_i²) − (1/D) · x_i x_j / ((S − x_i²)(S − x_j²)),   (2.25)

where D = 1 + Σ_i x_i² / (S − x_i²).

Let L = (K K^T)^{−1} = [L_1 L_2 L_3 ⋯ L_n]^T, so that L_i^T is the ith row of (K K^T)^{−1}. Then, from (2.18), (2.19) and (2.23),

[ Δw_ii          [ x_i      0    0    ⋯  0        [ L_i^T ζ
  Δw_{i(i+1)}      x_{i+1}  x_i  0    ⋯  0          L_{i+1}^T ζ
  Δw_{i(i+2)}  =   x_{i+2}  0    x_i  ⋯  0          L_{i+2}^T ζ
  ⋮                ⋮                  ⋱  ⋮          ⋮
  Δw_in ]          x_n      0    0    ⋯  x_i ]      L_n^T ζ ],   (2.26)

and

Δθ = [L_1^T ζ, L_2^T ζ, …, L_n^T ζ]^T = L ζ.   (2.27)

From (2.26) and (2.27), the final symmetry preserving weight update rule is

obtained as follows:

Δw_ij = x_i (L_j^T ζ) + x_j (L_i^T ζ)  for i ≠ j,
Δw_ii = x_i (L_i^T ζ),
Δθ_i = L_i^T ζ.   (2.28)

Figure 2.3 illustrates the dual-mode dynamics based on the equilibrium manifold in the network configuration space. In Figure 2.3, the w_1-axis and w_2-axis represent the weight space, and the u-axis represents the state space. Also, the curved surface represents the equilibrium manifold. Let A(w_A, u_A) represent a current network configuration at equilibrium, and let Δu_d be the desired state variation computed from the objective function. Then, the question is how to obtain the weight variation ΔW which achieves Δu_d at the next equilibrium. So, we first approximate the equilibrium manifold around A by a tangential hyperplane (Eq. (2.15)). Since the dimension of the weight space is much higher than that of the state space, there is an infinite number of solutions (represented by L_sol in Figure 2.3) achieving Δu_d in this linearized equilibrium manifold. We choose the pseudo-inverse solution, S(w_B, u_A + Δu_d), which gives rise to the minimum weight variation, i.e., min ‖Δw‖, among them (Eq. (2.23)). Then, with the new weight w_B and the current state u_A, the state dynamics starts again and moves the network configuration from (w_B, u_A) to B(w_B, u_B), which belongs to the real equilibrium manifold, at the next equilibrium. Note that, with a sufficiently small ‖Δu_d‖, the next state u_B is always obtained in the same direction as Δu_d.

State Variation of Hidden Units

For a given optimization problem, one neuron, which is called a visible unit, is

Figure 2.3: The schematic view of the dual-mode dynamics based on the equilibrium manifold in the network configuration space.
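The tangential-hyperplane approximation (Eq. (2.15)) underlying this picture can be verified numerically: for a small symmetric perturbation (ΔW, Δθ), the shift of the equilibrium should match (I − W G)^{−1}(ΔW x + Δθ) up to second order. A sketch with a small logistic-activation network; the sizes, scales, and the simple fixed-point iteration are illustrative assumptions:

```python
import numpy as np

def f(u):                      # logistic activation
    return 1.0 / (1.0 + np.exp(-u))

def equilibrium(W, theta, u0, iters=3000):
    """Relax u = W f(u) + theta (the equilibrium of the state dynamics)."""
    u = u0.copy()
    for _ in range(iters):
        u = W @ f(u) + theta
    return u

rng = np.random.default_rng(2)
n = 4
A = 0.4 * rng.standard_normal((n, n))
W = (A + A.T) / 2.0            # symmetric weights (small, so iteration contracts)
np.fill_diagonal(W, 0.0)
theta = 0.2 * rng.standard_normal(n)

u = equilibrium(W, theta, np.zeros(n))
x = f(u)
G = np.diag(x * (1.0 - x))     # G = diag{dx_i/du_i}

# small symmetric weight and bias perturbations
eps = 1e-5
dA = eps * rng.standard_normal((n, n))
dW = (dA + dA.T) / 2.0
dtheta = eps * rng.standard_normal(n)

du_true = equilibrium(W + dW, theta + dtheta, u) - u
du_pred = np.linalg.solve(np.eye(n) - W @ G, dW @ x + dtheta)   # (2.15)
print(np.max(np.abs(du_true - du_pred)))   # second-order small
```

The residual shrinks quadratically as eps is reduced, which is exactly the O(Δ²) term disregarded in deriving (2.15).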

assigned to one variable in the solution space. There may also be extra neurons which do not correspond to variables, the so-called hidden units. Since a hidden unit is not assigned a variable, and thus its output is not included in the objective function, the desired variation for a hidden unit is not automatically given by the gradient of the objective function. Hence, there is freedom in selecting the hidden unit variation, which can be utilized to achieve certain desirable performance.

The simplest way to fix the hidden unit variations is to set them all to zero. Separating the visible units and the hidden units in (2.17),

ζ = ( [I_v 0; 0 I_h] − [W_vv W_vh; W_hv W_hh] [G_v 0; 0 G_h] ) [Δu_v; Δu_h]   (2.29)
  = A_v Δu_v + A_h Δu_h,   (2.30)

where

A_v = [I_v − W_vv G_v; −W_hv G_v],   A_h = [−W_vh G_h; I_h − W_hh G_h],   (2.31)

the subscripts v and h stand for the visible units and the hidden units, respectively, and Δu_v and Δu_h represent the variations of the visible units and the hidden units, respectively. With Δu_h = 0,

Δw_v = K⁺ ζ |_{Δu_h = 0}   (2.32)
     = K^T (K K^T)^{−1} A_v Δu_v.   (2.33)

In this case, we actually minimize the norm of the state variation. That is, the

squared norm of the state variation is:

‖Δu‖² = ‖Δu_v‖² + ‖Δu_h‖²,   (2.34)

and Δu_v is given by the gradient of the objective function. Therefore, ‖Δu‖² is minimized with Δu_h = 0.

Since the weight variation is a function of the hidden unit variations, we can also select the hidden unit variation to minimize the weight variation. Let us define V as:

V = ½ ‖Δw_v‖².   (2.35)

Then,

V = ½ [K^T (K K^T)^{−1} ζ]^T [K^T (K K^T)^{−1} ζ]   (2.36)
  = ½ ζ^T (K K^T)^{−1} ζ.   (2.37)

Note that V is a quadratic function of Δu_h as well as of ζ. So, by setting the gradient of V with respect to Δu_h to zero, i.e.,

A_h^T (K K^T)^{−1} (A_v Δu_v + A_h Δu_h) = 0,   (2.38)

we can select Δu_h to minimize V as:

Δu_h = −[A_h^T (K K^T)^{−1} A_h]^{−1} A_h^T (K K^T)^{−1} A_v Δu_v.   (2.39)
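Equation (2.39) is the stationary point of the convex quadratic V, so it should do at least as well as any other choice of Δu_h. A sketch with random stand-ins for A_v, A_h, and a symmetric positive definite matrix playing the role of (K K^T)^{−1}; all sizes and seeds are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
nv, nh, n = 3, 2, 5            # visible, hidden, total units (hypothetical)
B = rng.standard_normal((n, n))
M = np.linalg.inv(B @ B.T + n * np.eye(n))   # SPD stand-in for (K K^T)^{-1}
Av = rng.standard_normal((n, nv))            # stand-in for A_v
Ah = rng.standard_normal((n, nh))            # stand-in for A_h
duv = rng.standard_normal(nv)                # desired visible-unit variation

# (2.39): hidden-unit variation minimizing V = 0.5 * zeta^T M zeta,
# where zeta = A_v du_v + A_h du_h
duh = -np.linalg.solve(Ah.T @ M @ Ah, Ah.T @ M @ Av @ duv)

def V(d):
    z = Av @ duv + Ah @ d
    return 0.5 * z @ M @ z

# the stationary point of the convex quadratic beats random alternatives
vals = [V(duh + 0.1 * rng.standard_normal(nh)) for _ in range(100)]
print(V(duh) <= min(vals))   # True
```

Since M is positive definite and A_h has full column rank here, the stationary point is the unique global minimum, which is what the comparison confirms.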

With this hidden unit variation, the resultant weight variation is

Δw_v = K^T (K K^T)^{−1} [I − A_h (A_h^T (K K^T)^{−1} A_h)^{−1} A_h^T (K K^T)^{−1}] A_v Δu_v.   (2.40)

Through both the state variation and the weight variation, the network changes its configuration along the equilibrium manifold in the network configuration space. So, we can also select the hidden unit variation which minimizes the variation of the network configuration in the network configuration space. Let us modify V in (2.35) to include the state variation:

V = ½ ‖Δw_v‖² + ½ ‖Δu‖².   (2.41)

Then, the resultant weight variation in (2.40) is also slightly modified as:

Δw_v = K^T (K K^T)^{−1} [I − A_h (I_h + A_h^T (K K^T)^{−1} A_h)^{−1} A_h^T (K K^T)^{−1}] A_v Δu_v.   (2.42)

The introduction of hidden units can be helpful, since it is accompanied by an increase in the number of weights of O(n²), while the number of state variations to achieve increases only by O(n). Since the pseudo-inverse solution in (2.23) gives the minimum weight variation for a given state variation, we will use (2.33) for the simulations in the later chapters; it does not require the matrix inversion of (A_h^T (K K^T)^{−1} A_h) or (I_h + A_h^T (K K^T)^{−1} A_h) as in (2.40) or (2.42).

2.3.3 Symmetry Preserving Recurrent Backpropagation

Pineda [34, 35] and Almeida [5] have independently pointed out that backpropagation can be extended to arbitrary networks, and developed the backpropagation

algorithm for recurrent neural networks, called recurrent backpropagation. The goal of recurrent backpropagation is conceptually the same as that of the weight dynamics in D2NN. With recurrent backpropagation, we try to change the weights to minimize the error function, while with the weight dynamics, we try to update the weights in the direction that minimizes the objective function, which is equivalent to the error function in recurrent backpropagation. So, recurrent backpropagation can also be used as the weight dynamics in D2NN.

Since recurrent backpropagation has been derived on the assumption that the networks always converge to stable states, we have to guarantee the stability of the state dynamics at all times. With general asymmetric weights, however, it is hard to maintain the stability of the state dynamics, as will be discussed in Section 2.5. Therefore, by imposing the condition of symmetric weights, we modify the original recurrent backpropagation to derive the symmetry preserving recurrent backpropagation as follows.

For convenience, let us rewrite the network dynamics in (2.1) and (2.2):

u̇_i = −u_i + Σ_{j=1}^{n} w_ij x_j + θ_i,   (2.43)

x_i = f(u_i) = 1 / (1 + e^{−u_i})  for i = 1, …, n.   (2.44)

With the symmetric weights, the network always converges to a fixed point, and the equilibrium manifold equation is:

u_i = Σ_j w_ij x_j + θ_i  for i = 1, …, n.   (2.45)

The goal is to adjust the weights so that the next equilibrium state is formed

along the direction that decreases the objective function. This is accomplished by computing the gradient of the objective function with respect to the weights, and by updating the weights in the direction anti-parallel to that gradient, that is,

Δw_rs = −η Σ_i ∇C_i (∂u_i/∂w_rs),   (2.46)

where

∇C_i ≜ (∂C/∂x_i) f′(u_i),   (2.47)

and C is the objective function. To obtain ∂u_i/∂w_rs, let us differentiate the equilibrium manifold equation in (2.45) with respect to w_rs (= w_sr) to get:

∂u_i/∂w_rs = δ_ir x_s + δ_is x_r + Σ_j w_ij f′(u_j) (∂u_j/∂w_rs),   (2.48)

where δ_ij is the Kronecker delta, i.e., δ_ij = 1 if i = j, and 0 otherwise. Collecting terms, this can be written as

Σ_j L_ij (∂u_j/∂w_rs) = δ_ir x_s + δ_is x_r,   (2.49)

where

L_ij = δ_ij − w_ij f′(u_j).   (2.50)

Inverting the linear equations (2.49) gives:

∂u_k/∂w_rs = (L^{−1})_kr x_s + (L^{−1})_ks x_r.   (2.51)
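The sensitivity formula (2.51) can be checked against finite differences on a small symmetric network; the network size, weight scale, and the simple fixed-point iteration below are illustrative assumptions:

```python
import numpy as np

def f(u):
    return 1.0 / (1.0 + np.exp(-u))

def equilibrium(W, theta, u0, iters=3000):
    """Relax to the fixed point u = W f(u) + theta of (2.43)."""
    u = u0.copy()
    for _ in range(iters):
        u = W @ f(u) + theta
    return u

rng = np.random.default_rng(0)
n = 4
A = 0.4 * rng.standard_normal((n, n))
W = (A + A.T) / 2.0                      # symmetric weights
np.fill_diagonal(W, 0.0)
theta = 0.3 * rng.standard_normal(n)

u = equilibrium(W, theta, np.zeros(n))
x = f(u)
fp = x * (1.0 - x)                       # f'(u) for the logistic function

L = np.eye(n) - W * fp[None, :]          # L_ij = delta_ij - w_ij f'(u_j)  (2.50)
Linv = np.linalg.inv(L)

# (2.51): du_k/dw_rs = (L^-1)_kr x_s + (L^-1)_ks x_r, with w_rs = w_sr
r, s, eps = 0, 2, 1e-6
pred = Linv[:, r] * x[s] + Linv[:, s] * x[r]

Wp = W.copy()
Wp[r, s] += eps
Wp[s, r] += eps                          # keep the perturbation symmetric
fd = (equilibrium(Wp, theta, u) - u) / eps
print(np.max(np.abs(pred - fd)))         # small
```

The perturbation is applied to w_rs and w_sr together, matching the constraint w_rs = w_sr under which (2.48) was differentiated.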

By substituting (2.51) into (2.46), we obtain

Δw_rs = −η (y_r x_s + y_s x_r),   (2.52)

where

y_r = Σ_k ∇C_k (L^{−1})_kr,   (2.53)

y_s = Σ_k ∇C_k (L^{−1})_ks.   (2.54)

Equation (2.52) specifies the symmetry preserving weight update rule, and requires a matrix inversion to get y_r and y_s in (2.53) and (2.54). However, we can undo the inversion in (2.53) and (2.54), and obtain linear equations for the y_k:

Σ_k L_ki y_k = ∇C_i,   (2.55)

or, using (2.50),

y_i = Σ_k w_ki f′(u_i) y_k + ∇C_i.   (2.56)

This equation has the same form as the original equilibrium manifold equation in (2.45), and can be solved in the same way, by the evolution of an auxiliary network with a state dynamics analogous to (2.43):

ẏ_i = −y_i + Σ_{k=1}^{n} w_ki f′(u_i) y_k + ∇C_i.   (2.57)

The auxiliary network has the same topology as the original network, with the connection w_ij from the jth unit to the ith unit replaced by w_ji f′(u_i), a simple linear activation function f(y) = y, and a bias term ∇C_i. Almeida [5] showed that

the convergence of the original network is a sufficient condition for the convergence of the auxiliary network. Note that the convergence of the original network is guaranteed by the symmetry of the weights. The whole computational flow is thus:

1. Relax the original network with (2.43).
2. Calculate ∇C_i in (2.47) from the objective function C.
3. Relax the auxiliary network with (2.57) to find y_i.
4. Update the weights using (2.52).

2.4 Binary Value Solution vs. Continuous State Variable

In combinatorial optimization, the solution space is of binary values, while the state variable is continuous during the computation process of the neural optimization network. When we apply neural network approaches to combinatorial optimization, we actually try to get binary value solutions through a continuous state space.

In the discrete model of D2NN, the output of each neuron in the base layer is automatically of binary value, because the binary threshold function is used as the activation function. Meanwhile, in the continuous model of D2NN, we need some scheme for getting binary value solutions from the continuous state variables.

Hopfield and Tank used a sigmoid function with a steep slope as the activation function of the neurons, so that the output of each neuron was expected to form near 1 or 0 at equilibrium. However, with a fixed steep slope in the activation

function, the state dynamics easily loses its momentum, and the state tends to stay at one of the local minima or on a plateau of the network energy function. Therefore, the final state usually represents a solution of poor quality, as reported by Wilson and Pawley [47].

The annealing process helps to avoid the above phenomena to some degree. While the state dynamics runs actively at the initial high temperature, it effectively becomes discrete as the temperature goes towards zero, and thus the final state will be of binary value. The variation of the temperature should be carefully scheduled, since it has a crucial influence on the quality of the final solution. In practice, a good annealing schedule is problem-dependent, and is usually obtained by trial and error, which takes a rather long a priori processing time.

In the continuous model of D2NN, we use the hyperquadrant-to-vertex mapping to get the binary value solution from the continuous state variable. The hyperquadrant-to-vertex mapping is formally defined as:

Hyperquadrant-to-vertex mapping: A point in the n-dimensional unit hypercube, x = [x_1 x_2 ⋯ x_n], x_i ∈ [0, 1] for i = 1, …, n, is mapped to x^B = [x^B_1 x^B_2 ⋯ x^B_n], where

x^B_i = 1 if x_i ≥ 0.5; 0 otherwise, for i = 1, …, n.

An example of the hyperquadrant-to-vertex mapping is illustrated in Figure 2.4 for the two-dimensional case. With this hyperquadrant-to-vertex mapping, each vertex of a unit hypercube represents all the points in the hyperquadrant

Figure 2.4: The hyperquadrant-to-vertex mapping.
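The mapping defined above is simply a componentwise threshold at 0.5; a minimal sketch:

```python
import numpy as np

def hyperquadrant_to_vertex(x):
    """Map a point of the n-dimensional unit hypercube to the vertex of the
    hyperquadrant containing it: x_i >= 0.5 -> 1, otherwise 0."""
    return (np.asarray(x, dtype=float) >= 0.5).astype(int)

print(hyperquadrant_to_vertex([0.8, 0.3]))        # [1 0]
print(hyperquadrant_to_vertex([0.5, 0.49, 1.0]))  # [1 0 1]
```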

which contains that vertex.

We apply the hyperquadrant-to-vertex mapping when we check whether a state represents a solution or not. In other words, the vertex obtained from the current output of the neurons by the hyperquadrant-to-vertex mapping is tested to see if it satisfies all the constraints of the given problem. Thus, the goal of the computation is slightly modified: we search for a hyperquadrant which contains the vertex representing the solution, rather than for the vertex of the binary value solution itself. This saves considerable computation time, since the state only needs to evolve into some point of the hyperquadrant corresponding to the solution, rather than to the solution vertex itself.

The hyperquadrant-to-vertex mapping is also used in the computation of the desired state variation. If we used the gradient of the objective function at the current output of the neurons, it could happen that the desired state variation is zero even though the current output of the neurons, or the corresponding vertex, does not represent the solution, and thus there would be no way to guide the weight dynamics. Therefore, when we calculate the state variation, we use the gradient of the objective function at the vertex which corresponds to the current output of the neurons, rather than at the current output itself. Then, if the state evolves into a hyperquadrant whose vertex does not represent the solution of the given problem, the state will be pushed out to another hyperquadrant along the opposite direction of the gradient of the objective function at that vertex. Meanwhile, there is no repulsive impetus in a hyperquadrant whose vertex represents the solution. So, the state moves through the equilibrium manifold, governed by the gradients at the vertices, and the weight dynamics stops when the state evolves into the hyperquadrant which contains the vertex

Figure 2.5: The state evolution in D2NN for a two-variable problem with three inequality constraints.

representing the binary value solution.

Figure 2.5 illustrates the state evolution in D2NN for a simple two-dimensional problem. Based on the three inequality constraints, represented as the shaded area, the objective function is formed in such a way that it becomes zero at the vertex V_4 (representing a solution) and generates a repulsive impetus at the other vertices, V_1, V_2, and V_3, illustrated by the shaded big arrows. From an initial state A, the state is guided by the repulsive impetus at V_1, and reaches the upper-left quadrant. Then, the repulsive impetus at V_1 and V_2 guides the state alternately, and the state eventually evolves into the lower-right quadrant, which corresponds to the solution, V_4. Note that the initial state A has momentum due to the hyperquadrant-to-vertex mapping even though it satisfies the constraints.

2.5 Asymmetric Weight vs. Symmetric Weight

With asymmetric weights, the state dynamics either converges to an isolated fixed point, or generates oscillations or chaotic behaviors. The typical state dynamics for 100-neuron networks is shown in Figure 2.6. With asymmetric weights, the state dynamics results in oscillation after quite a long period of chaotic behavior, while with symmetric weights, a fixed point is reached after a short transient period. If we could utilize general behaviors such as chaotic behaviors or limit cycles, neural computation would be enormously powerful. Since we use only an isolated fixed point as the output of the system, however, we need to guarantee the convergence of the state dynamics.

On the stability of asymmetric recurrent neural networks, several sufficient


5. Simulated Annealing 5.1 Basic Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini

5. Simulated Annealing 5.1 Basic Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini 5. Simulated Annealing 5.1 Basic Concepts Fall 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Real Annealing and Simulated Annealing Metropolis Algorithm Template of SA A Simple Example References

More information

Introduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis

Introduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis Introduction to Natural Computation Lecture 9 Multilayer Perceptrons and Backpropagation Peter Lewis 1 / 25 Overview of the Lecture Why multilayer perceptrons? Some applications of multilayer perceptrons.

More information

2 1. Introduction. Neuronal networks often exhibit a rich variety of oscillatory behavior. The dynamics of even a single cell may be quite complicated

2 1. Introduction. Neuronal networks often exhibit a rich variety of oscillatory behavior. The dynamics of even a single cell may be quite complicated GEOMETRIC ANALYSIS OF POPULATION RHYTHMS IN SYNAPTICALLY COUPLED NEURONAL NETWORKS J. Rubin and D. Terman Dept. of Mathematics; Ohio State University; Columbus, Ohio 43210 Abstract We develop geometric

More information

Adaptive linear quadratic control using policy. iteration. Steven J. Bradtke. University of Massachusetts.

Adaptive linear quadratic control using policy. iteration. Steven J. Bradtke. University of Massachusetts. Adaptive linear quadratic control using policy iteration Steven J. Bradtke Computer Science Department University of Massachusetts Amherst, MA 01003 bradtke@cs.umass.edu B. Erik Ydstie Department of Chemical

More information

in a Chaotic Neural Network distributed randomness of the input in each neuron or the weight in the

in a Chaotic Neural Network distributed randomness of the input in each neuron or the weight in the Heterogeneity Enhanced Order in a Chaotic Neural Network Shin Mizutani and Katsunori Shimohara NTT Communication Science Laboratories, 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 69-237 Japan shin@cslab.kecl.ntt.co.jp

More information

Congurations of periodic orbits for equations with delayed positive feedback

Congurations of periodic orbits for equations with delayed positive feedback Congurations of periodic orbits for equations with delayed positive feedback Dedicated to Professor Tibor Krisztin on the occasion of his 60th birthday Gabriella Vas 1 MTA-SZTE Analysis and Stochastics

More information

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the

R. Schaback. numerical method is proposed which rst minimizes each f j separately. and then applies a penalty strategy to gradually force the A Multi{Parameter Method for Nonlinear Least{Squares Approximation R Schaback Abstract P For discrete nonlinear least-squares approximation problems f 2 (x)! min for m smooth functions f : IR n! IR a m

More information

PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE

PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE Artificial Intelligence, Computational Logic PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE Lecture 4 Metaheuristic Algorithms Sarah Gaggl Dresden, 5th May 2017 Agenda 1 Introduction 2 Constraint

More information

Spurious Chaotic Solutions of Dierential. Equations. Sigitas Keras. September Department of Applied Mathematics and Theoretical Physics

Spurious Chaotic Solutions of Dierential. Equations. Sigitas Keras. September Department of Applied Mathematics and Theoretical Physics UNIVERSITY OF CAMBRIDGE Numerical Analysis Reports Spurious Chaotic Solutions of Dierential Equations Sigitas Keras DAMTP 994/NA6 September 994 Department of Applied Mathematics and Theoretical Physics

More information

The Bias-Variance dilemma of the Monte Carlo. method. Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel

The Bias-Variance dilemma of the Monte Carlo. method. Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel The Bias-Variance dilemma of the Monte Carlo method Zlochin Mark 1 and Yoram Baram 1 Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel fzmark,baramg@cs.technion.ac.il Abstract.

More information

squashing functions allow to deal with decision-like tasks. Attracted by Backprop's interpolation capabilities, mainly because of its possibility of g

squashing functions allow to deal with decision-like tasks. Attracted by Backprop's interpolation capabilities, mainly because of its possibility of g SUCCESSES AND FAILURES OF BACKPROPAGATION: A THEORETICAL INVESTIGATION P. Frasconi, M. Gori, and A. Tesi Dipartimento di Sistemi e Informatica, Universita di Firenze Via di Santa Marta 3-50139 Firenze

More information

Manifold Regularization

Manifold Regularization 9.520: Statistical Learning Theory and Applications arch 3rd, 200 anifold Regularization Lecturer: Lorenzo Rosasco Scribe: Hooyoung Chung Introduction In this lecture we introduce a class of learning algorithms,

More information

Linearly-solvable Markov decision problems

Linearly-solvable Markov decision problems Advances in Neural Information Processing Systems 2 Linearly-solvable Markov decision problems Emanuel Todorov Department of Cognitive Science University of California San Diego todorov@cogsci.ucsd.edu

More information

Lecture 35 Minimization and maximization of functions. Powell s method in multidimensions Conjugate gradient method. Annealing methods.

Lecture 35 Minimization and maximization of functions. Powell s method in multidimensions Conjugate gradient method. Annealing methods. Lecture 35 Minimization and maximization of functions Powell s method in multidimensions Conjugate gradient method. Annealing methods. We know how to minimize functions in one dimension. If we start at

More information

Bounded Approximation Algorithms

Bounded Approximation Algorithms Bounded Approximation Algorithms Sometimes we can handle NP problems with polynomial time algorithms which are guaranteed to return a solution within some specific bound of the optimal solution within

More information

John P.F.Sum and Peter K.S.Tam. Hong Kong Polytechnic University, Hung Hom, Kowloon.

John P.F.Sum and Peter K.S.Tam. Hong Kong Polytechnic University, Hung Hom, Kowloon. Note on the Maxnet Dynamics John P.F.Sum and Peter K.S.Tam Department of Electronic Engineering, Hong Kong Polytechnic University, Hung Hom, Kowloon. April 7, 996 Abstract A simple method is presented

More information

Scheduling Adaptively Parallel Jobs. Bin Song. Submitted to the Department of Electrical Engineering and Computer Science. Master of Science.

Scheduling Adaptively Parallel Jobs. Bin Song. Submitted to the Department of Electrical Engineering and Computer Science. Master of Science. Scheduling Adaptively Parallel Jobs by Bin Song A. B. (Computer Science and Mathematics), Dartmouth College (996) Submitted to the Department of Electrical Engineering and Computer Science in partial fulllment

More information

Ecient Higher-order Neural Networks. for Classication and Function Approximation. Joydeep Ghosh and Yoan Shin. The University of Texas at Austin

Ecient Higher-order Neural Networks. for Classication and Function Approximation. Joydeep Ghosh and Yoan Shin. The University of Texas at Austin Ecient Higher-order Neural Networks for Classication and Function Approximation Joydeep Ghosh and Yoan Shin Department of Electrical and Computer Engineering The University of Texas at Austin Austin, TX

More information

Ant Colony Optimization: an introduction. Daniel Chivilikhin

Ant Colony Optimization: an introduction. Daniel Chivilikhin Ant Colony Optimization: an introduction Daniel Chivilikhin 03.04.2013 Outline 1. Biological inspiration of ACO 2. Solving NP-hard combinatorial problems 3. The ACO metaheuristic 4. ACO for the Traveling

More information

COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines. COMP9444 c Alan Blair, 2017

COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines. COMP9444 c Alan Blair, 2017 COMP9444 Neural Networks and Deep Learning 11. Boltzmann Machines COMP9444 17s2 Boltzmann Machines 1 Outline Content Addressable Memory Hopfield Network Generative Models Boltzmann Machine Restricted Boltzmann

More information

Average Reward Parameters

Average Reward Parameters Simulation-Based Optimization of Markov Reward Processes: Implementation Issues Peter Marbach 2 John N. Tsitsiklis 3 Abstract We consider discrete time, nite state space Markov reward processes which depend

More information

Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end wit

Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end wit Chapter 0 Introduction Suppose this was the abstract of a journal paper rather than the introduction to a dissertation. Then it would probably end with some cryptic AMS subject classications and a few

More information

Neural Networks. Hopfield Nets and Auto Associators Fall 2017

Neural Networks. Hopfield Nets and Auto Associators Fall 2017 Neural Networks Hopfield Nets and Auto Associators Fall 2017 1 Story so far Neural networks for computation All feedforward structures But what about.. 2 Loopy network Θ z = ቊ +1 if z > 0 1 if z 0 y i

More information

An Adaptive Bayesian Network for Low-Level Image Processing

An Adaptive Bayesian Network for Low-Level Image Processing An Adaptive Bayesian Network for Low-Level Image Processing S P Luttrell Defence Research Agency, Malvern, Worcs, WR14 3PS, UK. I. INTRODUCTION Probability calculus, based on the axioms of inference, Cox

More information

Using a Hopfield Network: A Nuts and Bolts Approach

Using a Hopfield Network: A Nuts and Bolts Approach Using a Hopfield Network: A Nuts and Bolts Approach November 4, 2013 Gershon Wolfe, Ph.D. Hopfield Model as Applied to Classification Hopfield network Training the network Updating nodes Sequencing of

More information

1. Introduction Let the least value of an objective function F (x), x2r n, be required, where F (x) can be calculated for any vector of variables x2r

1. Introduction Let the least value of an objective function F (x), x2r n, be required, where F (x) can be calculated for any vector of variables x2r DAMTP 2002/NA08 Least Frobenius norm updating of quadratic models that satisfy interpolation conditions 1 M.J.D. Powell Abstract: Quadratic models of objective functions are highly useful in many optimization

More information

A Generalized Homogeneous and Self-Dual Algorithm. for Linear Programming. February 1994 (revised December 1994)

A Generalized Homogeneous and Self-Dual Algorithm. for Linear Programming. February 1994 (revised December 1994) A Generalized Homogeneous and Self-Dual Algorithm for Linear Programming Xiaojie Xu Yinyu Ye y February 994 (revised December 994) Abstract: A generalized homogeneous and self-dual (HSD) infeasible-interior-point

More information

Non-Convex Optimization. CS6787 Lecture 7 Fall 2017

Non-Convex Optimization. CS6787 Lecture 7 Fall 2017 Non-Convex Optimization CS6787 Lecture 7 Fall 2017 First some words about grading I sent out a bunch of grades on the course management system Everyone should have all their grades in Not including paper

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial Linear Algebra (part 4): Eigenvalues, Diagonalization, and the Jordan Form (by Evan Dummit, 27, v ) Contents 4 Eigenvalues, Diagonalization, and the Jordan Canonical Form 4 Eigenvalues, Eigenvectors, and

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

only nite eigenvalues. This is an extension of earlier results from [2]. Then we concentrate on the Riccati equation appearing in H 2 and linear quadr

only nite eigenvalues. This is an extension of earlier results from [2]. Then we concentrate on the Riccati equation appearing in H 2 and linear quadr The discrete algebraic Riccati equation and linear matrix inequality nton. Stoorvogel y Department of Mathematics and Computing Science Eindhoven Univ. of Technology P.O. ox 53, 56 M Eindhoven The Netherlands

More information

MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione

MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione MODELLING OF FLEXIBLE MECHANICAL SYSTEMS THROUGH APPROXIMATED EIGENFUNCTIONS L. Menini A. Tornambe L. Zaccarian Dip. Informatica, Sistemi e Produzione, Univ. di Roma Tor Vergata, via di Tor Vergata 11,

More information

CS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes

CS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes CS 6501: Deep Learning for Computer Graphics Basics of Neural Networks Connelly Barnes Overview Simple neural networks Perceptron Feedforward neural networks Multilayer perceptron and properties Autoencoders

More information

Simulated Annealing for Constrained Global Optimization

Simulated Annealing for Constrained Global Optimization Monte Carlo Methods for Computation and Optimization Final Presentation Simulated Annealing for Constrained Global Optimization H. Edwin Romeijn & Robert L.Smith (1994) Presented by Ariel Schwartz Objective

More information

ground state degeneracy ground state energy

ground state degeneracy ground state energy Searching Ground States in Ising Spin Glass Systems Steven Homer Computer Science Department Boston University Boston, MA 02215 Marcus Peinado German National Research Center for Information Technology

More information

Math 1270 Honors ODE I Fall, 2008 Class notes # 14. x 0 = F (x; y) y 0 = G (x; y) u 0 = au + bv = cu + dv

Math 1270 Honors ODE I Fall, 2008 Class notes # 14. x 0 = F (x; y) y 0 = G (x; y) u 0 = au + bv = cu + dv Math 1270 Honors ODE I Fall, 2008 Class notes # 1 We have learned how to study nonlinear systems x 0 = F (x; y) y 0 = G (x; y) (1) by linearizing around equilibrium points. If (x 0 ; y 0 ) is an equilibrium

More information

ESTIMATING STATISTICAL CHARACTERISTICS UNDER INTERVAL UNCERTAINTY AND CONSTRAINTS: MEAN, VARIANCE, COVARIANCE, AND CORRELATION ALI JALAL-KAMALI

ESTIMATING STATISTICAL CHARACTERISTICS UNDER INTERVAL UNCERTAINTY AND CONSTRAINTS: MEAN, VARIANCE, COVARIANCE, AND CORRELATION ALI JALAL-KAMALI ESTIMATING STATISTICAL CHARACTERISTICS UNDER INTERVAL UNCERTAINTY AND CONSTRAINTS: MEAN, VARIANCE, COVARIANCE, AND CORRELATION ALI JALAL-KAMALI Department of Computer Science APPROVED: Vladik Kreinovich,

More information

AI Programming CS F-20 Neural Networks

AI Programming CS F-20 Neural Networks AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols

More information

On-line Bin-Stretching. Yossi Azar y Oded Regev z. Abstract. We are given a sequence of items that can be packed into m unit size bins.

On-line Bin-Stretching. Yossi Azar y Oded Regev z. Abstract. We are given a sequence of items that can be packed into m unit size bins. On-line Bin-Stretching Yossi Azar y Oded Regev z Abstract We are given a sequence of items that can be packed into m unit size bins. In the classical bin packing problem we x the size of the bins and try

More information

`First Come, First Served' can be unstable! Thomas I. Seidman. Department of Mathematics and Statistics. University of Maryland Baltimore County

`First Come, First Served' can be unstable! Thomas I. Seidman. Department of Mathematics and Statistics. University of Maryland Baltimore County revision2: 9/4/'93 `First Come, First Served' can be unstable! Thomas I. Seidman Department of Mathematics and Statistics University of Maryland Baltimore County Baltimore, MD 21228, USA e-mail: hseidman@math.umbc.edui

More information

Featured Articles Advanced Research into AI Ising Computer

Featured Articles Advanced Research into AI Ising Computer 156 Hitachi Review Vol. 65 (2016), No. 6 Featured Articles Advanced Research into AI Ising Computer Masanao Yamaoka, Ph.D. Chihiro Yoshimura Masato Hayashi Takuya Okuyama Hidetaka Aoki Hiroyuki Mizuno,

More information

1 What a Neural Network Computes

1 What a Neural Network Computes Neural Networks 1 What a Neural Network Computes To begin with, we will discuss fully connected feed-forward neural networks, also known as multilayer perceptrons. A feedforward neural network consists

More information

Neural Networks Lecture 6: Associative Memory II

Neural Networks Lecture 6: Associative Memory II Neural Networks Lecture 6: Associative Memory II H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural

More information

August Progress Report

August Progress Report PATH PREDICTION FOR AN EARTH-BASED DEMONSTRATION BALLOON FLIGHT DANIEL BEYLKIN Mentor: Jerrold Marsden Co-Mentors: Claire Newman and Philip Du Toit August Progress Report. Progress.. Discrete Mechanics

More information

Fundamentals of Metaheuristics

Fundamentals of Metaheuristics Fundamentals of Metaheuristics Part I - Basic concepts and Single-State Methods A seminar for Neural Networks Simone Scardapane Academic year 2012-2013 ABOUT THIS SEMINAR The seminar is divided in three

More information

Notes on Dantzig-Wolfe decomposition and column generation

Notes on Dantzig-Wolfe decomposition and column generation Notes on Dantzig-Wolfe decomposition and column generation Mette Gamst November 11, 2010 1 Introduction This note introduces an exact solution method for mathematical programming problems. The method is

More information

Global Analysis of Piecewise Linear Systems Using Impact Maps and Surface Lyapunov Functions

Global Analysis of Piecewise Linear Systems Using Impact Maps and Surface Lyapunov Functions IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL 48, NO 12, DECEMBER 2003 2089 Global Analysis of Piecewise Linear Systems Using Impact Maps and Surface Lyapunov Functions Jorge M Gonçalves, Alexandre Megretski,

More information

New Integer Programming Formulations of the Generalized Travelling Salesman Problem

New Integer Programming Formulations of the Generalized Travelling Salesman Problem American Journal of Applied Sciences 4 (11): 932-937, 2007 ISSN 1546-9239 2007 Science Publications New Integer Programming Formulations of the Generalized Travelling Salesman Problem Petrica C. Pop Department

More information

Comparison of Simulation Algorithms for the Hopfield Neural Network: An Application of Economic Dispatch

Comparison of Simulation Algorithms for the Hopfield Neural Network: An Application of Economic Dispatch Turk J Elec Engin, VOL.8, NO.1 2000, c TÜBİTAK Comparison of Simulation Algorithms for the Hopfield Neural Network: An Application of Economic Dispatch Tankut Yalçınöz and Halis Altun Department of Electrical

More information

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION A Thesis by MELTEM APAYDIN Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the

More information

MVE165/MMG630, Applied Optimization Lecture 6 Integer linear programming: models and applications; complexity. Ann-Brith Strömberg

MVE165/MMG630, Applied Optimization Lecture 6 Integer linear programming: models and applications; complexity. Ann-Brith Strömberg MVE165/MMG630, Integer linear programming: models and applications; complexity Ann-Brith Strömberg 2011 04 01 Modelling with integer variables (Ch. 13.1) Variables Linear programming (LP) uses continuous

More information

Training Multi-Layer Neural Networks. - the Back-Propagation Method. (c) Marcin Sydow

Training Multi-Layer Neural Networks. - the Back-Propagation Method. (c) Marcin Sydow Plan training single neuron with continuous activation function training 1-layer of continuous neurons training multi-layer network - back-propagation method single neuron with continuous activation function

More information

21. Set cover and TSP

21. Set cover and TSP CS/ECE/ISyE 524 Introduction to Optimization Spring 2017 18 21. Set cover and TSP ˆ Set covering ˆ Cutting problems and column generation ˆ Traveling salesman problem Laurent Lessard (www.laurentlessard.com)

More information

Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts 1 and Stephan Matthai 2 3rd Febr

Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts 1 and Stephan Matthai 2 3rd Febr HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts and Stephan Matthai Mathematics Research Report No. MRR 003{96, Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL

More information

Gaussian Processes for Regression. Carl Edward Rasmussen. Department of Computer Science. Toronto, ONT, M5S 1A4, Canada.

Gaussian Processes for Regression. Carl Edward Rasmussen. Department of Computer Science. Toronto, ONT, M5S 1A4, Canada. In Advances in Neural Information Processing Systems 8 eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press, 1996. Gaussian Processes for Regression Christopher K. I. Williams Neural Computing

More information

An average case analysis of a dierential attack. on a class of SP-networks. Distributed Systems Technology Centre, and

An average case analysis of a dierential attack. on a class of SP-networks. Distributed Systems Technology Centre, and An average case analysis of a dierential attack on a class of SP-networks Luke O'Connor Distributed Systems Technology Centre, and Information Security Research Center, QUT Brisbane, Australia Abstract

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements Feed-Forward Networks Perceptrons (Single-layer,

More information