
UNIVERSITY OF NAIROBI
FACULTY OF ENGINEERING
DEPARTMENT OF ELECTRICAL AND INFORMATION ENGINEERING

COMBINED REAL AND REACTIVE DISPATCH OF POWER USING REINFORCEMENT LEARNING

PROJECT INDEX: 045
SUBMITTED BY: NYABUGA JOSHUA TUONI, F17/1372/2010
SUPERVISOR: MR. PETER MUSAU
EXAMINER: MR. OGABA

PROJECT REPORT SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE AWARD OF THE DEGREE OF BACHELOR OF SCIENCE IN ELECTRICAL AND ELECTRONICS ENGINEERING OF THE UNIVERSITY OF NAIROBI

SUBMITTED ON: 24TH APRIL, 2015

DECLARATION OF ORIGINALITY

NAME OF STUDENT: NYABUGA JOSHUA TUONI
REGISTRATION NUMBER: F17/1372/2010
COLLEGE: Architecture and Engineering
FACULTY/SCHOOL/INSTITUTE: Engineering
DEPARTMENT: Electrical and Information Engineering
COURSE NAME: Bachelor of Science in Electrical and Electronic Engineering
TITLE OF WORK: COMBINED REAL AND REACTIVE DISPATCH OF POWER USING REINFORCEMENT LEARNING

1. I understand what plagiarism is and I am aware of the university policy in this regard.
2. I declare that this final year project report is my original work and has not been submitted elsewhere for examination, award of a degree or publication. Where other people's work or my own work has been used, this has properly been acknowledged and referenced in accordance with the University of Nairobi's requirements.
3. I have not sought or used the services of any professional agencies to produce this work.
4. I have not allowed, and shall not allow, anyone to copy my work with the intention of passing it off as his/her own work.
5. I understand that any false claim in respect of this work shall result in disciplinary action, in accordance with the University anti-plagiarism policy.

Signature: ................ Date: ................

CERTIFICATION

This report has been submitted to the Department of Electrical and Information Engineering of the University of Nairobi with my approval as supervisor:

Mr. Peter Musau
Signature: ................ Date: ................

DEDICATION

To my family, for continued support and prayers.

ACKNOWLEDGEMENT

I would like to thank God for having taken good care of me throughout my academic life and for the good health he has granted me. I extend my gratitude to my supervisor, Mr. Musau, for the support, guidance, useful criticism and encouragement he gave me as I did my project. I appreciate all my lecturers and non-teaching staff in the Department of Electrical and Information Engineering of the University of Nairobi for their contribution towards my degree. I also thank my classmates for the moral support they gave me as I undertook my project. Lastly, I thank my family for the support and understanding they have accorded me throughout my academic life.

TABLE OF CONTENTS

COMBINED REAL AND REACTIVE DISPATCH OF POWER USING REINFORCEMENT LEARNING
DECLARATION OF ORIGINALITY
CERTIFICATION
DEDICATION
ACKNOWLEDGEMENT
List of Figures
List of Tables
List of Abbreviations
ABSTRACT
1 INTRODUCTION
  1.1 Combined Real and Reactive Dispatch of Power
    1.1.1 What is Economic Dispatch?
  1.2 Survey of Earlier Work
    1.2.1 Genetic Algorithm (GA)
    1.2.2 Particle Swarm Optimization (PSO)
    1.2.3 Tabu Search (TS)
    1.2.4 Simulated Annealing (SA)
    1.2.5 Ant Colony Optimization (ACO)
    1.2.6 Neural Networks
    1.2.7 Hybrid Methods
  1.3 Problem Statement
  1.4 Justification
  1.5 Organization of the Report
2 LITERATURE REVIEW
  2.1 Literature Review on Real Power Economic Dispatch
    2.1.1 Real Dispatch of Power Objective Function
  2.2 Literature Review on Reactive Power Economic Dispatch
    2.2.1 Minimize Var Cost
    2.2.2 Minimum Deviation From a Specific Point
    2.2.3 Voltage Stability Related Objectives
    2.2.4 Multi-Objective (MO)
    2.2.5 Reactive Power Dispatch and Voltage Control
    2.2.6 Reactive Dispatch of Power Objective Function
  2.3 Literature Review on Reinforcement Learning
    2.3.1 Background Information on Reinforcement Learning
    2.3.2 N-Arm Bandit Problem
    2.3.3 Parts of Reinforcement Learning
    2.3.4 Multi-stage Decision Problem (MDP)
    2.3.5 Methods for Solving Multi-stage Decision Problems (MDP)
    2.3.6 Reinforcement Learning Approach for Solution
    2.3.7 Action Selection
3 SOLUTION TO COMBINED REAL AND REACTIVE DISPATCH OF POWER USING REINFORCEMENT LEARNING (RL)
  3.1 Formulation of the Real and Reactive Dispatch of Power Problem Using Reinforcement Learning
  3.2 Combined Active/Real and Reactive Power Cost
  3.3 RL Algorithm for Combined Real and Reactive Economic Dispatch Using the ε-Greedy Strategy
    3.3.1 Learning Phase
    3.3.2 Policy Retrieval Phase
  3.4 Flowchart of RL Algorithm for Combined Real and Reactive Dispatch of Power
4 RESULTS AND ANALYSIS
  4.1 Case Study: IEEE 14-Bus System
  4.2 Results
  4.3 Analysis and Discussion
5 CONCLUSION AND RECOMMENDATIONS FOR FURTHER WORK
  5.1 Conclusion
  5.2 Recommendations for Further Work
REFERENCES
APPENDIX: Matlab Code

LIST OF FIGURES

Figure 2-1: Voltage Stability Curve
Figure 2-2: Grid World Problem
Figure 3-1: Flowchart of RL Algorithm for ED
Figure 4-1: One Line Diagram of IEEE 14-Bus System [14]
Figure 4-2: Fuel Cost against Power Demand
Figure 4-3: Power Losses against Power Demand

LIST OF TABLES

Table 4-1: RL Parameters
Table 4-2: Real and Reactive Power Scheduling for a 14-Bus System
Table A-0-1: IEEE 14-Bus System Generator Data
Table A-0-2: IEEE 14-Bus Network Load and Generator Data [14]
Table A-0-3: IEEE 14-Bus Network Line Data [14]

LIST OF ABBREVIATIONS

RL    Reinforcement Learning
ED    Economic Dispatch
OPF   Optimal Power Flow
MW    Megawatts
IEEE  Institute of Electrical and Electronics Engineers
GA    Genetic Algorithm
PSO   Particle Swarm Optimization
TS    Tabu Search
SA    Simulated Annealing
ACO   Ant Colony Optimization
MDP   Multi-stage Decision Problem

ABSTRACT

Most economic dispatch problems involve real power only. With the integration of renewable energy into the grid, reactive power dispatch can no longer be ignored. This project shows how reactive power dispatch and real power dispatch are combined, and proposes an effective algorithm that uses Reinforcement Learning (RL) for optimum generation dispatch to minimize the fuel cost. Various methods have been used to solve the Economic Dispatch (ED) problem. These include conventional methods such as linear programming, non-linear programming, mixed integer programming, interior point methods and quadratic programming. The non-conventional methods are Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Tabu Search (TS), Ant Colony Optimization, simulated annealing, neural networks and hybrid techniques. In this project, the Reinforcement Learning (RL) method has been used to develop an algorithm for economic dispatch. The developed algorithm has been tested on the IEEE 14-bus, five-generator network. The allocation schedule for the five generating units was found for the following sets of real and reactive power demands: 800 MW & 370 MVAR, 900 MW & 470 MVAR and 1000 MW & 570 MVAR. The optimal fuel costs for real power, reactive power and combined real and reactive power generation were also computed.

1 INTRODUCTION

1.1 COMBINED REAL AND REACTIVE DISPATCH OF POWER

1.1.1 What is Economic Dispatch?

The economic dispatch problem is defined as that which minimizes the total operating cost of the power system while meeting the load plus transmission losses within generator limits [1]. The economic load dispatch problem involves the solution of two different problems:

a) Unit commitment/pre-dispatch problem: it is required to select optimally, out of the available generating sources, those to operate so as to meet the expected load and provide a specified margin of operating reserve over a specified period of time [2]. The unit commitment problem involves scheduling unit start-up and shut-down in a way that minimizes cost without compromising system security [3].

b) On-line economic dispatch: it is required to distribute the load among the generating units actually paralleled with the system in such a manner as to minimize the total cost of supplying the minute-to-minute requirements of the system [2].

Spinning reserve: this is a safety margin of generation where more units than necessary are kept on line so that, should a unit unexpectedly fail, or the load rise unexpectedly, the system can meet the load requirement without interrupting service [3].

In a load flow study of the power system, for a particular load demand, the generation at all the generator buses is fixed except at one generator bus, known as the slack, reference or swing bus, where the generation is allowed to take values within certain limits. In the case of economic load dispatch, the generations are not fixed but are allowed to take values, again within certain limits, so as to meet a particular load demand with minimum fuel consumption. This means economic load dispatch is really the solution of a large number of load flow problems and choosing the one which is optimal in the sense that it needs minimum cost of generation [2].

1.2 SURVEY OF EARLIER WORK: OPTIMIZATION METHODS

The optimization techniques used include both conventional and non-conventional ones. Conventional optimization techniques include:

- Linear programming
- Non-linear programming
- Mixed integer programming
- Interior point methods
- Quadratic programming, etc.

The disadvantage of these methods is that they converge to a local optimum solution. Non-conventional optimization techniques include:

- Genetic Algorithm (GA)
- Particle Swarm Optimization (PSO)
- Tabu Search (TS)
- Ant Colony Optimization (ACO)
- Simulated Annealing (SA)
- Neural Networks
- Hybrid techniques

1.2.1 Genetic Algorithm (GA)

GA is part of Evolutionary Algorithms (EA), a family of population-based optimization processes; other members of the EA family are Evolutionary Strategy (ES) and Evolutionary Programming (EP). GA is an optimization technique inspired by the process of natural selection. It does not differentiate the cost function and the constraints, and has a probability of convergence to a global optimum of one. It utilizes the operators of selection, crossover and mutation, combining survival of the fittest among string structures with a structured, yet random, information exchange. In every generation, a new set of artificially developed strings is produced using elements of the fittest of the old; an occasional new element is experimented with for enhancement [4]. A starting population is built with random gene values and evolves through several generations in which selection, crossover and mutation are repeated until a satisfactory solution has been found or a maximum number of iterations has been reached [5]. The algorithm identifies the individuals with the optimizing fitness values, and those with lower fitness naturally get discarded from the population. However, GA cannot assure constant optimization response times, which limits its use in real-time applications. A minimal sketch of these operators follows.
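As a rough illustration, the following MATLAB sketch applies tournament selection, arithmetic crossover and mutation to a hypothetical one-variable cost function. It is a real-coded variant rather than the binary strings described above, and the population size, mutation rate and cost function are illustrative assumptions, not values used in this project.

% Minimal real-coded GA sketch for minimizing a hypothetical cost f(x) = (x - 3)^2
f = @(x) (x - 3).^2;            % toy cost function (assumed)
N = 20;                          % population size (assumed)
pm = 0.1;                        % mutation probability (assumed)
pop = 10*rand(N,1) - 5;          % random initial population in [-5, 5]
for gen = 1:100
    fit = -f(pop);                           % higher fitness = lower cost
    % Tournament selection: keep the better of two random individuals
    parents = zeros(N,1);
    for i = 1:N
        c = randi(N, 1, 2);
        if fit(c(1)) > fit(c(2)), parents(i) = pop(c(1));
        else, parents(i) = pop(c(2)); end
    end
    % Arithmetic crossover between consecutive parents
    child = zeros(N,1);
    for i = 1:2:N-1
        w = rand;
        child(i)   = w*parents(i) + (1-w)*parents(i+1);
        child(i+1) = w*parents(i+1) + (1-w)*parents(i);
    end
    % Mutation: random perturbation with probability pm
    m = rand(N,1) < pm;
    child(m) = child(m) + randn(sum(m),1);
    pop = child;
end
[~, best] = min(f(pop));
bestSolution = pop(best)         % should approach x = 3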

1.2.2 Particle Swarm Optimization (PSO)

This is an intelligent search technique inspired by social dynamics and the behavior emergent from socially organized populations known as swarms, e.g. flocks of birds or schools of fish. The individuals are referred to as particles. The particles change their positions by flying around in a multidimensional search space until a relatively unchanged position has been encountered, or until computational limitations are exceeded [6]. A swarm of potential solutions is a population of particles. A particle bases its search not only on its personal experience but also on the information given by its neighbors in the swarm. Each particle keeps track of its coordinates in the problem space, which are associated with the best solution fitness it has achieved so far; the fitness value is also stored. This value is called pbest. Another best value tracked by the particle swarm optimizer is the location, lbest, obtained thus far by any particle in the neighborhood of the particle. When a particle takes the whole population as its topological neighbors, the best value is a global best and is called gbest [7, 6]. A PSO system combines local search methods with global search methods. It has the problems of dependency on the initial point and parameters, difficulty in finding optimal design parameters, and the stochastic character of the final outputs [8]. The main advantages of PSO are easy implementation, a simple concept, robustness in controlling the parameters and less computational time compared to other optimization techniques [6]. A sketch of the core position and velocity updates is given below.
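The following minimal MATLAB sketch assumes the gbest topology and common textbook values for the inertia weight and acceleration constants; these are illustrative assumptions, not parameters from this project.

% Minimal PSO sketch for minimizing a hypothetical cost f(x) = (x - 3)^2
f = @(x) (x - 3).^2;
N = 15;                          % number of particles (assumed)
w = 0.7; c1 = 1.5; c2 = 1.5;     % inertia and acceleration constants (assumed)
x = 10*rand(N,1) - 5;            % particle positions
v = zeros(N,1);                  % particle velocities
pbest = x; pbestCost = f(x);     % personal bests
[gbestCost, g] = min(pbestCost); gbest = pbest(g);
for it = 1:100
    % velocity and position updates at the heart of PSO
    v = w*v + c1*rand(N,1).*(pbest - x) + c2*rand(N,1).*(gbest - x);
    x = x + v;
    improved = f(x) < pbestCost;             % update personal bests
    pbest(improved) = x(improved);
    pbestCost(improved) = f(x(improved));
    [c, g] = min(pbestCost);                 % update global best
    if c < gbestCost, gbestCost = c; gbest = pbest(g); end
end
gbest                                         % should approach x = 3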

1.2.3 Tabu Search (TS)

This is a kind of iterative search characterized by the use of a flexible memory. It is able to escape local minima and to search areas beyond a local minimum; therefore, it has the ability to find the global minimum of a multimodal search space. The process by which Tabu search overcomes the local optimality problem is based on an evaluation function that chooses the highest-evaluation solution at each iteration. This means moving to the best admissible solution in the neighborhood of the current solution in terms of the objective value and tabu restrictions. The evaluation function selects the move that produces the most improvement or the least deterioration in the objective function. A tabu list is employed to store the characteristics of accepted moves so that these characteristics can be used to classify certain moves as tabu (i.e. to be avoided) in later iterations. The tabu list determines which solutions may be reached by a move from the current solution. Since moves not leading to improvements are accepted in tabu search, it is possible to return to already visited solutions, which might cause cycling; the tabu list is used to overcome this problem. The forbidding strategy is used to control and update the tabu list to avoid previously visited paths, thus allowing exploration of new areas. An aspiration criterion is used to free a tabu solution if it is of sufficient quality, thus preventing cycling [7].

1.2.4 Simulated Annealing (SA)

Annealing is the physical process of heating up a solid and then cooling it down slowly until it crystallizes. At high temperatures, the atoms have high energies and more freedom to arrange themselves. As the temperature is reduced, the atomic energies decrease. A crystal with a regular structure is obtained at the state where the system has minimum energy. If the cooling is carried out very quickly, which is known as rapid quenching, widespread irregularities and defects are seen in the crystal structure; the system does not reach the minimum energy state and ends in a polycrystalline state which has a higher energy [7]. In the analogy between a combinatorial optimization problem and the annealing process, the states of the solid represent feasible solutions of the optimization problem, the energies of the states correspond to the values of the objective function computed at those solutions, the minimum energy state corresponds to the optimal solution to the problem, and rapid quenching can be viewed as local optimization. The algorithm consists of a sequence of iterations. Each iteration consists of randomly changing the current solution to create a new solution in its neighborhood; the neighborhood is defined by the choice of the generation mechanism. Once a new solution is created, the corresponding change in the cost function is computed to decide whether the newly produced solution can be accepted as the current solution. If the change in the cost function is negative, the newly produced solution is directly taken as the current solution. Otherwise, it is accepted according to the Metropolis criterion [Metropolis et al., 1953] based on Boltzmann's probability [5]. A minimal sketch of this acceptance rule follows.
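The sketch below applies the acceptance rule to a hypothetical one-variable cost; the initial temperature and the geometric cooling schedule are assumed for illustration.

% Minimal simulated annealing sketch (hypothetical cost and cooling schedule)
f = @(x) (x - 3).^2;
x = 0;                          % current solution
T = 1.0;                        % initial temperature (assumed)
for it = 1:1000
    xNew = x + randn;           % random neighbour of the current solution
    d = f(xNew) - f(x);         % change in the cost function
    % Metropolis criterion: always accept improvements; accept
    % deteriorations with probability exp(-d/T)
    if d < 0 || rand < exp(-d/T)
        x = xNew;
    end
    T = 0.995*T;                % slow geometric cooling, not rapid quenching
end
x                               % should settle near x = 3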

1.2.5 Ant Colony Optimization (ACO)

ACO was inspired by the behavior of ants in their natural habitat. A colony of ants is able to find the shortest path between the nest and a food source by depositing a trail of a chemical substance, called pheromone, on the ground as they move. This pheromone can be observed by other ants and motivates them to follow the path with high probability. The optimization technique is based on the indirect communication of a colony of simple agents, called (artificial) ants, mediated by (artificial) pheromone trails, which serve as distributed numerical information that the ants use to probabilistically construct solutions to the problem. This information is adapted by the ants during the algorithm's execution to reflect their search experience. In this way, the best solution accumulates more pheromone and has a higher probability of being chosen. The described behavior of real ant colonies can be used to solve combinatorial optimization problems by simulation, using artificial ants that search the solution space by transiting from node to node. The artificial ants' moves are usually associated with their previous actions, stored in memory with a specific data structure. The pheromone levels of all paths are updated only after an ant finishes its tour from the first node to the last node. Every artificial ant has a constant amount of pheromone stored in it when it proceeds from the first node, and this pheromone is distributed evenly along the path after the ant finishes its tour; the quantity of pheromone on a path will therefore be high if artificial ants finished their tours with a good path. The pheromone on the routes progressively decreases by evaporation in order to avoid artificial ants getting stuck in local optima [8].

1.2.6 Neural Networks

Neural networks are modeled on the mechanism of the brain. Theoretically, they have a parallel distributed information processing structure. Two of the major features of neural networks are their ability to learn from examples and their tolerance to noise and damage to their components. A neural network consists of a number of simple processing elements, also called nodes, units, short-term memory elements or neurons. These elements are modeled on the biological neuron and perform local information processing operations. A processing element has several inputs and one output; the inputs could be its own output, the outputs of other processing elements or input signals from external devices. Processing elements are connected to one another through links with weights, which represent the strengths of the connections. The weight of a link determines the effect of the output of one neuron on another neuron, and can be considered part of the long-term memory in a neural network. After the inputs are received by a neuron, a pre-processing operation is applied. The output of the pre-processing operation is passed through a function called the activation function to produce the final output of the processing element. Depending on the problem, various types of activation functions are employed, such as a linear function, step function, sigmoid function, hyperbolic-tangent function, etc. [7].

1.2.7 Hybrid Methods

These are combinations of two or more optimization methods, with the aim of taking advantage of the pros of each method in the mix while reducing computation time, hence speeding up convergence and/or improving the quality of the solution. An example is Expert System SA (ESSA), which uses an expert system consisting of several heuristic rules to find a local optimal solution that is then employed as the initial starting point of the second stage. This method is insensitive to the initial starting point, so the quality of the solution is stable, and it can deal with a mixture of continuous and discrete variables [9].

1.3 PROBLEM STATEMENT

In order to obtain an accurate cost function, the reactive power cost is to be included in the active power cost function. The total cost is given by combining the active and reactive power costs, giving the active power more weight than the reactive power. The objective function becomes:

Minimize F_Total = Σ_{i=1}^{NG} [ W·F(P_gi) + (1 − W)·F(Q_gi) ]

Subject to:

Σ_{i=1}^{NG} P_gi − Σ_{i=1}^{NB} P_Di − P_L = 0

Σ_{i=1}^{NG} Q_gi − Σ_{i=1}^{NB} Q_Di − Q_L = 0

P_gi^min ≤ P_gi ≤ P_gi^max   (i = 1, 2, ..., NG)

Q_gi^min ≤ Q_gi ≤ Q_gi^max   (i = 1, 2, ..., NG)

where:
P_gi, Q_gi are the active/real and reactive generations of the i-th generator
P_Di, Q_Di are the active/real and reactive power demands
P_L, Q_L are the active/real and reactive power transmission losses
NB is the number of buses
NG is the number of generators

For this project, W is taken to be 80% and consequently (1 − W) becomes 20%. Therefore, the combined objective function becomes:

Minimize F_Total = Σ_{i=1}^{NG} [ 0.8·F(P_gi) + 0.2·F(Q_gi) ]

1.4 JUSTIFICATION

While most optimization and soft computing techniques provide solutions for static optimization tasks, Reinforcement Learning based strategies can easily provide solutions for dynamic optimization problems. This makes Reinforcement Learning a good learning strategy suitable for real-time control tasks and many optimization problems. In the case of RL based solution strategies, the environment need not be mathematically well defined; the agent can acquire knowledge, or learn, in a model-free environment. By acquiring knowledge of the rewards or punishments for an action taken in a given environment or state of the system, the learning strategy improves its performance step by step. Through a simple learning procedure with a sufficient number of iterative steps, the agent can learn the best actions in any situation or state of the system. Also, the reward or return function need not be deterministic, since at each action step the agent can accept the reward from a dynamic environment.
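To make the weighted objective of Section 1.3 concrete, the MATLAB sketch below evaluates the combined cost F_Total for a candidate two-generator dispatch with W = 0.8. The cost coefficients and dispatch values here are hypothetical; the data actually used in this project appear in the Appendix.

% Combined real and reactive cost of Section 1.3 with W = 0.8
% (all coefficients and the candidate dispatch are illustrative assumptions)
a  = [0.0050; 0.0060]; b  = [2.0; 1.8]; c  = [100; 120];  % real power cost coefficients
aq = [0.0010; 0.0012]; bq = [0.5; 0.4]; cq = [ 20;  25];  % reactive power cost coefficients
Pg = [400; 380];       Qg = [180; 170];                   % candidate dispatch (MW, MVAr)
W  = 0.8;
FP = a.*Pg.^2 + b.*Pg + c;        % F(P_gi) for each generator
FQ = aq.*Qg.^2 + bq.*Qg + cq;     % F(Q_gi) for each generator
Ftotal = sum(W*FP + (1 - W)*FQ)   % combined objective to be minimized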

1.5 ORGANIZATION OF THE REPORT

This project has been organized into five chapters as follows. In Chapter 1, the ED problem is introduced; other optimization methods that can be used in solving the problem are discussed, together with the problem statement and project objectives. In Chapter 2, a literature review of real and reactive dispatch of power is presented, along with a detailed literature review of reinforcement learning. In Chapter 3, the implementation of combined real and reactive dispatch of power using RL is discussed in detail; the RL algorithm for solving economic dispatch is presented and its flowchart drawn. In Chapter 4, the simulation results obtained from programming in MATLAB are analyzed and discussed. In Chapter 5, conclusions are presented and recommendations for further work stated.

2 LITERATURE REVIEW

2.1 LITERATURE REVIEW ON REAL POWER ECONOMIC DISPATCH

Let us assume that it is known a priori which generators are to be run to meet a particular load demand on the station. Suppose there is a station with NG generators committed and the active power load demand P_D is given. The real power generation P_gi for each generator has to be allocated so as to minimize the total cost. The optimization problem can therefore be stated as:

Minimize F(P_gi) = Σ_{i=1}^{NG} F_i(P_gi)   (2.1a)

Subject to:

i) the energy balance equation:

Σ_{i=1}^{NG} P_gi = P_D   (2.1b)

ii) the inequality constraints:

P_gi^min ≤ P_gi ≤ P_gi^max   (i = 1, 2, ..., NG)   (2.1c)

where:
P_gi is the decision variable, that is, real power generation
P_D is the real power demand
NG is the number of generation plants
P_gi^min is the lower permissible limit of real power generation
P_gi^max is the upper permissible limit of real power generation
F_i(P_gi) is the operating fuel cost of the i-th plant, given by the quadratic equation:

F_i(P_gi) = a_i·P_gi² + b_i·P_gi + c_i  Ksh./hour   (2.1d)

The above constrained optimization problem is converted into an unconstrained optimization problem. The Lagrange multiplier method is used, in which a function is minimized (or maximized) with side conditions in the form of equality constraints. Using this method, an augmented function is defined as:

L(P_gi, λ) = F(P_gi) + λ(P_D − Σ_{i=1}^{NG} P_gi)   (2.2)

where λ is the Lagrangian multiplier. A necessary condition for a function F(P_gi), subject to the energy balance constraint, to have a relative minimum at a point P_gi is that the partial derivative of the Lagrange function L = L(P_gi, λ) with respect to each of its arguments must be zero. So, the necessary conditions for the optimization problem are:

∂L(P_gi, λ)/∂P_gi = ∂F(P_gi)/∂P_gi − λ = 0   (i = 1, 2, ..., NG)   (2.3)

and

∂L(P_gi, λ)/∂λ = P_D − Σ_{i=1}^{NG} P_gi = 0   (2.4)

From equation (2.3),

∂F(P_gi)/∂P_gi = λ   (i = 1, 2, ..., NG)   (2.5)

where ∂F(P_gi)/∂P_gi is the incremental fuel cost of the i-th generator ($/MWh). Optimal loading of the generators corresponds to the equal incremental cost point of all the generators. Equations (2.5), called the coordination equations and numbering NG, are solved simultaneously with the load demand to yield a solution for the Lagrange multiplier λ and the optimal generation of the NG generators. Considering the cost function given by equation (2.1d), the incremental cost can be defined as:

∂F(P_gi)/∂P_gi = 2·a_i·P_gi + b_i   (2.6)

Substituting the incremental cost into equation (2.5), this equation becomes:

2·a_i·P_gi + b_i = λ   (i = 1, 2, ..., NG)   (2.7)

Rearranging equation (2.7) to get P_gi:

P_gi = (λ − b_i) / (2·a_i)   (i = 1, 2, ..., NG)   (2.8)

Substituting the value of P_gi into equation (2.4), we get:

λ = (P_D + Σ_{i=1}^{NG} b_i/(2·a_i)) / (Σ_{i=1}^{NG} 1/(2·a_i))   (2.9)

Thus, λ can be calculated using equation (2.9) and P_gi can be calculated using equation (2.8). Now consider the effect of the generator limits given by the inequality constraint of equation (2.1c). If a particular generator loading P_gi reaches the limit P_gi^min or P_gi^max, its loading is held fixed at this value and the balance load is shared between the remaining generators on an equal incremental cost basis [10].

Limit Constraint Fixing

To fix up the limits, the following strategy can be applied. Let

h = Σ_{i=1}^{R1} h_i^max − Σ_{i=1}^{R2} h_i^min   (2.10)

where
h_i^max = P_gi − P_gi^max   (i = 1, 2, ..., R1 upper bound violations)
h_i^min = P_gi^min − P_gi   (i = 1, 2, ..., R2 lower bound violations)

i) If h > 0, fix all R1 upper bound violations to their upper limits, i.e. P_gi^max.
ii) If h < 0, fix all R2 lower bound violations to their lower limits, i.e. P_gi^min.
iii) If h = 0, fix both the R1 upper and R2 lower bound violations to their respective upper P_gi^max and lower P_gi^min limits.

The new demand is then determined as the original P_D minus the sum of the fixed generation levels, i.e.

P_D^new = P_D − Σ_{i=1}^{R1+R2} P_gi

and this new demand is allocated to the other committed generators on an equal incremental cost basis.
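The coordination equations and the limit-fixing strategy translate directly into code. The MATLAB sketch below computes λ from equation (2.9), dispatches each unit by equation (2.8), and clamps any violated unit before re-allocating the remaining demand; the three-unit cost data and the demand are illustrative assumptions, not the project's IEEE 14-bus data.

% Equal incremental cost dispatch per equations (2.8)-(2.9), with
% generator limits handled by fixing violations and re-dispatching.
% Cost data and demand below are illustrative assumptions.
a = [0.008; 0.009; 0.007]; b = [7.0; 6.3; 6.8];    % F_i = a_i*P^2 + b_i*P + c_i
Pmin = [100; 100; 100]; Pmax = [400; 350; 450];    % generator limits, MW
PD = 975;                                          % demand to allocate, MW
free = true(3,1); Pg = zeros(3,1);
while any(free)
    % lambda over the still-free units, equation (2.9)
    lambda = (PD + sum(b(free)./(2*a(free)))) / sum(1./(2*a(free)));
    Pg(free) = (lambda - b(free)) ./ (2*a(free));  % equation (2.8)
    over  = free & (Pg > Pmax);
    under = free & (Pg < Pmin);
    if ~any(over | under), break; end              % all limits satisfied
    Pg(over) = Pmax(over); Pg(under) = Pmin(under);% fix the violations
    PD = PD - sum(Pg(over | under));               % demand left for free units
    free = free & ~(over | under);
end
disp([Pg; lambda])    % optimal schedule and the incremental cost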

2.1.1 Real Dispatch of Power Objective Function

The economic dispatch problem is defined as that which minimizes the total operating cost of a power system while meeting the total load plus transmission losses within generator limits. Mathematically, the problem is defined as:

Minimize F(P_gi) = Σ_{i=1}^{NG} (a_i·P_gi² + b_i·P_gi + c_i)  $/h   (2.11a)

Subject to:

i) the energy balance equation:

Σ_{i=1}^{NG} P_gi = P_D + P_L   (2.11b)

ii) and the inequality constraints:

P_gi^min ≤ P_gi ≤ P_gi^max   (i = 1, 2, ..., NG)

where
a_i, b_i, c_i are the cost coefficients
P_D is the load demand
P_gi is the real power generation and acts as the decision variable
NG is the number of generation buses
P_L is the transmission power loss

2.2 LITERATURE REVIEW ON REACTIVE POWER ECONOMIC DISPATCH

The majority of reactive power planning (RPP) objectives have been to provide the least cost of new reactive power supplies. Many variants of this objective include the cost of real power losses or the fuel cost. In addition, technical indices such as deviation from a given voltage schedule or the security margin may be used as objectives for optimization [11].

2.2.1 Minimize Var Cost

Generally, there are two Var source cost models for minimization. The first formulation models Var source costs as C_1·Q_c, a linear function with no fixed cost. This model considers only the variable cost relevant to the rating of the newly installed Var source Q_c and ignores the fixed installation cost.

The common unit for C_1 is $/(MVar.hour). A better formulation, of the form (C_0 + C_1·Q_c)·x, also considers the fixed cost C_0 ($/hour), which is the lifetime fixed cost prorated per hour, in addition to the incremental/variable cost C_1 ($/MVar.hour).

Minimize Var Cost and Real Power Losses

This objective may be divided into two groups:

i) minimize C_1(Q_c) + C_2(P_loss)
ii) minimize (C_0 + C_1·Q_c)·x + C_2(P_loss)

where C_2(P_loss) represents the cost of real power loss and C_0 is the fixed cost. The objective can be written as follows:

min F = Σ_{k=0}^{Nc} [ C_1(Q_c) + C_2(P_loss) ]_k

where k (= 0, 1, ..., L, ..., Nc) represents the k-th operating case. Considered here are the base case (k = 0), the contingency cases under preventive mode (k = 1, ..., L), and the contingency cases under corrective mode (k = L+1, ..., Nc).

Minimize Var Cost and Generator Fuel Cost

As an alternative to the cost of real power loss, the fuel cost is adopted as a direct measure of the operation cost. The minimization of real power loss cannot in general guarantee the minimization of the total fuel cost; instead, minimization of the total fuel cost already includes the cost reduction due to the minimization of real power loss. This objective consists of the sum of the costs of the individual generating units:

C_T = Σ_{i=1}^{n} F_i(P_gi)

where F_i(P_gi) = a_0i + a_1i·P_gi + a_2i·P_gi² is the common generator cost-versus-MW curve, approximately modeled as a quadratic function, and a_0i, a_1i, a_2i are cost coefficients.

2.2.2 Minimum Deviation From a Specific Point

This objective is usually defined as the weighted sum of the deviations of the control variables, such as bus voltages, from their given target values. The target values correspond to the initial or specified operating points. Voltage deviation is minimized as Σ_i (V_imax − V_i), where the subscript i represents the different buses for voltage regulation.
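The difference between the two Var source cost models of Section 2.2.1 is easy to see numerically; the short MATLAB fragment below compares them for a hypothetical candidate Var source, with all values assumed for illustration.

% Comparison of the two Var source cost models (illustrative values only)
C0 = 0.22;          % fixed cost prorated per hour, $/h (assumed)
C1 = 0.90;          % variable cost, $/(MVar.hour) (assumed)
Qc = 25;            % rating of the candidate Var source, MVar (assumed)
x  = 1;             % binary decision: 1 if the source is installed
costLinear = C1*Qc              % variable-cost-only model, C1*Qc
costFull   = (C0 + C1*Qc)*x     % model that also carries the fixed cost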

2.2.3 Voltage Stability Related Objectives

The main function of shunt reactive power compensation is to provide voltage support to avoid voltage instability or a large-scale voltage collapse. As shown in Figure 2-1, voltage stability is usually represented by a P-V (or S-V) curve.

Figure 2-1: Voltage Stability Curve

The nose point of the P-V curve is called the point of collapse (PoC), where the voltage drops rapidly with an increase of load. The PoC is also known as the equilibrium point, where the corresponding Jacobian becomes singular. Hence, the power-flow solution fails to converge beyond this limit, which indicates voltage instability and can be associated with a saddle-node bifurcation point. These instabilities are usually local-area voltage problems due to a lack of reactive power. Therefore, one objective can be to increase the static voltage stability margin (SM), defined as the distance between the saddle-node bifurcation point and the base case operating point. SM can be expressed as:

SM = Σ_i S_i^critical − Σ_i S_i^normal

where S_i^normal and S_i^critical are the MVA loads of load bus i at the normal operating state B and at the voltage collapse critical state (PoC) A, as shown in Figure 2-1, respectively. One can expect an improvement in the stability of the system for that operating point.
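Under this definition, the margin follows directly from the bus loadings at the operating point B and at the point of collapse A; a small MATLAB sketch with hypothetical loadings is shown below.

% Static voltage stability margin from normal and critical MVA loads
% (the three bus loadings below are illustrative assumptions)
Snormal   = [60; 45; 80];      % MVA loads of the load buses at state B
Scritical = [95; 70; 120];     % MVA loads at the point of collapse A
SM = sum(Scritical) - sum(Snormal)   % stability margin in MVA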

2.2.4 Multi-Objective (MO)

This objective includes Var investment cost minimization, power loss reduction and voltage deviation. One MO formulation is:

Min F = 10·(voltage violation in p.u.) + (generator Var violation in p.u.)² + power losses in p.u.

Another MO formulation is:

Min F = (C_0 + C_1·Q_c)·x + C_2(P_loss) + ρ_1·Σ_i ((V_i − V_ispec)/V_imax)² + ρ_2·Σ_l ((S_l − S_lspec)/S_lmax)²

where
V_i = voltage magnitude at bus i
V_ispec = specified voltage magnitude at bus i
V_imax = maximum allowable voltage deviation limit at bus i
S_l = MVA flow through line l
S_lspec = MVA capacity limit of line l
S_lmax = specified allowable line flow deviation limit
ρ_1 and ρ_2 are weights for the different objectives

2.2.5 Reactive Power Dispatch and Voltage Control

Reactive power and voltage control have a significant influence on the security of a power system. For efficient and reliable operation of power systems, the voltages at the terminals of all equipment in the system must be maintained within desired limits for power system stability enhancement. Conventionally, minimization of total transmission line losses has been considered the main objective in reactive power dispatch; more recently, the trend has been towards the elimination of security constraint violations. Proper redistribution of reactive power generation offers the following benefits:

- Reduction in real power transmission losses caused by unnecessary reactive power flows, which consequently results in the lowest production cost.
- Increase in system security from augmented reactive power reserves for emergencies.

The reactive power dispatch objective thus seeks to minimize the active power losses in the network.

2.2.6 Reactive Dispatch of Power Objective Function

Reactive power production cost is highly dependent on real power output. If a generator produces its maximum active power (P_max), then no reactive power is produced and the apparent power S equals P_max. However, reactive power production by a generator reduces its capability to produce active power; hence the production of reactive power by a generator results in a reduction of its active power production. So, to generate reactive power Q_gi with generator i, which has been operating at its nominal power P_max, it is required to reduce its active power to P_gi (Hasanpour et al., 2009).

At different values of Q_gi with respect to P_gi, the quadratic cost expression for reactive power is obtained by fitting a curve to a quadratic polynomial. The fuel cost in terms of reactive power output can be expressed as:

F(Q_gi) = Σ_{i=1}^{NG} (a_qi·Q_gi² + b_qi·Q_gi + c_qi)

where a_qi, b_qi, c_qi are reactive power cost coefficients, calculated using curve fitting, and NG is the number of generators. This objective function is very simple and, since it is extracted from the power cost function of the generator, it is more realistic and can provide accurate results in reactive power pricing [12].

2.3 LITERATURE REVIEW ON REINFORCEMENT LEARNING

Reinforcement learning (RL) refers to a class of learning algorithms in which a learning system learns which action to take in different situations by using a scalar evaluation received from the environment on performing an action. RL has been successfully applied to many multi-stage decision making problems (MDPs), in which at each stage the learning system decides which action to take. The economic dispatch (ED) problem is an important scheduling problem in power systems, which decides the amount of generation to be allocated to each generating unit so that the total cost of generation is minimized without violating system constraints. In this project, the economic dispatch problem is formulated as a multi-stage decision making problem, and an RL based algorithm is developed to solve it. The main advantage of RL is that it can learn the schedule for all possible demands simultaneously.

2.3.1 Background Information on Reinforcement Learning

Reinforcement Learning (RL) is the study of how animals and artificial systems can learn to optimize their behavior in the face of rewards and punishments. One way in which animals acquire complex behaviors is by learning to obtain rewards and to avoid punishments: a baby learning to walk, a child learning to ride a bicycle, an animal learning to trap its food, and so on. During this learning process, the agent interacts with the environment. At each step of interaction, on observing the current state, an action is taken by the learner. Depending on the goodness of the action in the particular situation, it is tried in the next stage when the same or a similar situation arises (Bertsekas and Tsitsiklis [1996], Sutton and Barto [1998], Sathyakeerthi and Ravindran [1996]).

The learning methodologies developed for such learning tasks originally combine two disciplines: Dynamic Programming and Function Approximation (Moore et al. [1996]). Dynamic Programming is a field of mathematics that has traditionally been used to solve a variety of optimization problems; however, Dynamic Programming in its pure form is limited in the size and complexity of the problems it can address. Function Approximation methods such as Neural Networks learn the system from different sets of input-output pairs used to train the network. In RL, the goal to be achieved is known, and the system learns how to achieve the goal by trial-and-error interactions with the environment.

In the conventional Reinforcement Learning framework, the agent does not initially know what effects its actions have on the state of the environment, nor what immediate reward it will get on selecting an action. In particular, it does not know what action is best to take. Rather, it tries out the various actions at various states and gradually learns which one is best at each state so as to maximize its long-term reward. The agent thus tries to acquire a control policy, or rule, for choosing an action according to the observed current state of the environment.

The most natural way to acquire this control rule would be for the agent to visit each and every state in the environment and try out the various possible actions. At each state it observes the effect of the actions in terms of rewards, and from the observed rewards the best action at each state, or best policy, is deduced. However, this is not practically possible, since planning ahead involves accurate enumeration of possible actions and rewards at various states, which is computationally very expensive. Such planning is also very difficult since some actions may have stochastic effects, so that performing the same action in two different situations may give different reward values. One promising feature of such Reinforcement Learning problems is that there are simple learning algorithms by means of which an agent can learn an optimal rule or policy without the need for planning ahead. Moreover, such learning requires only a minimal amount of memory: an agent can learn if it can consider only the last action it took, the state in which it took that action and the present state reached.

The concept of the Reinforcement Learning problem and action selection is explained with a simple N-arm bandit problem in the next section. A grid world problem is taken to discuss the different parts of the RL problem. Then the multi-stage decision making tasks are explained. The various techniques of solution, or learning, are described through mathematical formulations. The different action selection strategies and one of the solution methods, namely Q-learning, are discussed. A few applications of RL based learning in the field of power systems are also briefly explained [13].

2.3.2 N-Arm Bandit Problem

The N-arm bandit is a game based on slot machines. The slot machine has a number of arms or levers. To play the game, one pays a fixed fee. The player obtains a monetary reward by playing an arm of his choice; the reward may be greater or less than the fee he has paid. The reward from each arm varies around a mean value with some variance. The aim of the player is to obtain the maximum reward by playing the game. If the play on an arm is considered as an action or decision, then the objective is to find the best action from the action set (the set of arms). Since the reward varies around a mean value, the problem is to find the action giving the highest reward, i.e. the arm with the highest mean value, which can be called the best arm.

To introduce the notation used here, the action of choosing an arm is denoted by a. The goodness of choosing an arm, or the quality of an arm, is the mean reward of the arm and is denoted by Q(a). If the means of all arms are known, the best arm is given by:

a* = argmax_a Q(a)   (2.3.1)

As mentioned earlier, the problem is that the Q(a) values are unknown. One simple and direct method is to play each arm a large number of times. Let the reward received on playing an arm in the k-th trial be r_k(a). Then an estimate of Q(a) after n trials, denoted Q_n(a), is obtained using:

Q_n(a) = (1/n) Σ_{k=1}^{n} r_k(a)   (2.3.2)

By the law of large numbers, Q_n(a) → Q(a) as n → ∞. The optimal action is then obtained by equation (2.3.1). The above method, termed brute force, is time consuming. As a preliminary to understanding an efficient algorithm for finding the Q values (the mean values corresponding to each arm), a well-known recursive method is now derived.

As explained earlier, the average based on n observations is given by:

Q_n(a) = (1/n) Σ_{k=1}^{n} r_k(a)   (2.3.3)

Therefore,

(n+1)·Q_{n+1}(a) = Σ_{k=1}^{n+1} r_k(a) = n·Q_n(a) + r_{n+1}(a)

Then, using equation (2.3.3), that is,

Q_{n+1}(a) = Q_n(a) + (1/(n+1))·[ r_{n+1}(a) − Q_n(a) ]   (2.3.4)

This equation says that the new estimate based on the (n+1)-th observation r_{n+1}(a) is the old estimate Q_n(a) plus a small number times the error, {r_{n+1}(a) − Q_n(a)}.
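Equation (2.3.4) is simply the running average computed incrementally; the MATLAB fragment below confirms that the recursive estimate matches the batch average of equation (2.3.2) for a stream of hypothetical rewards.

% Recursive mean of equation (2.3.4) versus the batch average (2.3.2)
r = 5 + randn(1, 1000);              % hypothetical rewards with mean 5
Q = 0;                               % initial estimate
for n = 0:numel(r)-1
    Q = Q + (1/(n+1)) * (r(n+1) - Q);   % new = old + step * error
end
[Q, mean(r)]                         % the two estimates agree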

There are results which say that, under some technical conditions, a decreasing sequence {α_n} can be used instead of 1/(n+1) to get a recursive equation:

Q_{n+1}(a) = Q_n(a) + α_n·[ r_{n+1}(a) − Q_n(a) ]

The sequence α_n is such that

Σ_n α_n = ∞  and  Σ_n α_n² < ∞

Now an efficient method to find the best arm of the N-arm bandit problem can be stated:

Step 1: Initialize n = 0, α = 0.1
Step 2: Initialize Q_0(a) = 0 for all a in A
Step 3: Select an action a using an action selection strategy
Step 4: Play the arm corresponding to action a and obtain the reward r_n(a)
Step 5: Update the estimate of Q(a): Q(a) = Q(a) + α·[ r_n(a) − Q(a) ]
Step 6: n = n + 1
Step 7: If n < max_iteration, go to Step 3
Step 8: Stop

To use the above algorithm, an efficient action selection strategy is required. One method would be to take an action with uniform probability; in this way one plays all the arms an equal number of times, so that throughout the learning the action space is explored. Instead of playing all the arms this often, it makes sense to play more often the arms which may be the best arm. One such efficient algorithm for action selection is ε-greedy. In this algorithm, the greedy arm is played with probability (1 − ε) and one of the other arms with probability ε. The greedy arm a_g corresponds to the arm with the best estimate of the Q value, that is:

a_g = argmax_a Q_n(a)

It may be noted that if ε = 1 the algorithm selects one of the actions with uniform probability, and if ε = 0 the greedy action is selected. Initially, the estimates Q_n(a) may not be the true values; however, as n → ∞, Q_n(a) → Q(a), and then we may exploit the information contained in Q_n(a). So in the ε-greedy algorithm, ε is initially chosen close to 1 and, as n increases, ε is gradually reduced. A minimal sketch of this procedure is given below.
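The eight steps, combined with ε-greedy action selection, can be put together as follows; the arm means, the constant step size and the ε schedule are illustrative assumptions.

% epsilon-greedy learning on a hypothetical 4-arm bandit
Qtrue = [1.0 2.5 1.8 2.2];           % true (unknown) mean reward of each arm
Q = zeros(1,4);                       % estimates, Step 2
alpha = 0.1; epsilon = 1.0;           % step size and initial epsilon, Step 1
for n = 1:5000
    if rand < epsilon
        a = randi(4);                 % explore: an arm chosen uniformly
    else
        [~, a] = max(Q);              % exploit: the greedy arm a_g
    end
    r = Qtrue(a) + randn;             % play the arm, observe the reward, Step 4
    Q(a) = Q(a) + alpha*(r - Q(a));   % update the estimate, Step 5
    epsilon = max(0.01, epsilon*0.999);  % gradually reduce epsilon
end
Q                                     % estimates; arm 2 should look best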

Proper balancing of exploration and exploitation of the action space ultimately reduces the number of trials needed to find the best arm. A more detailed discussion of the parts of a Reinforcement Learning problem is given in the following sections.

2.3.3 Parts of Reinforcement Learning

The earlier example had only one state. In many practical situations, the problem is to find the best action for each of several different states. In order to make the characteristics of such general Reinforcement Learning problems clearer, and to identify the different parts of a Reinforcement Learning problem, a shortest path problem is considered in this section. Consider the grid world problem given in Figure 2-2.

Figure 2-2: Grid World Problem

The grid has 36 cells arranged in 6 rows and 6 columns. A robot can be in any one of the cells at any instant. G denotes the goal state which the robot aims to reach, and the crossed cells denote cells with some sort of obstacle. There is a cost associated with each cell transition, and the cost of passing through a cell with an obstacle is much higher than for other cells. Starting from any initial position in the grid, the robot can reach the goal cell by following different paths, and correspondingly the cost incurred will also vary.

The problem is to find an optimum path to reach the goal starting from any initial cell position. With respect to this example, the parts of the Reinforcement Learning problem can now be defined.

State Space

The cell number can be taken as the state of the robot at any time. The possible states the robot can occupy come from the entire cell space; in Reinforcement Learning terminology, this is the state space. The state space in a Reinforcement Learning problem is defined as the set of possible states the agent (learner) can occupy at different instants of time. At any instant, the agent is in one of the states from the entire state space. The state of the robot at instant k is denoted x_k, and the entire state space is denoted χ, so that at any instant k, x_k ∈ χ. In order to reach the goal state G from the initial state x_0, the robot has to take a series of actions or cell transitions a_0, a_1, ..., a_n.

Action Space

At any instant k, the robot can take an action (cell transition) a_k from the set of permissible actions, the action set or action space A. The permissible set of actions at each instant k depends on the current state x_k of the robot. If the robot is in any of the cells in the first column, "move left" is not possible. Similarly, for each cell in the grid world there is a set of possible cell movements or state transitions. The set of possible actions or cell transitions at the current state x_k is denoted A_xk, which depends on the current state x_k. For example, if x_k = 7, A_xk = {right, up, down}, and if x_k = 1, A_xk = {right, down}.

System Model

Reinforcement Learning can be used to learn directly by interacting with the system. If that is not possible, a model is required. It need not be a mathematical model; a simulation model is sufficient. In this simple example, a mathematical model can be obtained. On taking an action, the robot proceeds to the next cell position, which is a function of the current state and action. In other words, the state occupied by the robot at instant k+1, x_{k+1}, depends on x_k and a_k. That is,

x_{k+1} = f(x_k, a_k)   (2.3.5)

For example, if x_k = 7 and a_k = down, then x_{k+1} = 13, while if a_k = up, x_{k+1} = 1. For this simple grid world, x_{k+1} is easily obtained by observation. For problems with a larger state space, the state x_{k+1} can be found from a simulation model or by studying the environment in which the robot moves. The aim of the robot in the grid is to reach the goal state, starting from its initial position or state, at minimum cost. At each step it takes an action, which is followed by a state transition or movement in the grid.
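For the grid world, the model of equation (2.3.5) is a simple lookup. The MATLAB sketch below implements x_{k+1} = f(x_k, a_k) for the 6-by-6 grid, with cells numbered row-wise from 1 so that, as in the text, moving down from cell 7 gives cell 13 and moving up gives cell 1; the function name is a hypothetical choice.

% Transition model x(k+1) = f(x(k), a(k)) for the 6x6 grid world,
% cells numbered 1..36 row-wise; actions: 'up','down','left','right'.
% (save as gridStep.m)
function xNext = gridStep(x, a)
    [row, col] = deal(ceil(x/6), mod(x-1, 6) + 1);  % cell -> (row, col)
    switch a
        case 'up',    if row > 1, row = row - 1; end
        case 'down',  if row < 6, row = row + 1; end
        case 'left',  if col > 1, col = col - 1; end
        case 'right', if col < 6, col = col + 1; end
    end
    xNext = (row - 1)*6 + col;   % e.g. gridStep(7,'down') returns 13
end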

The actions whose state transitions reach the goal state at minimum cost constitute the optimum solution. Therefore, the shortest path problem can be stated as finding the sequence of actions a_0, a_1, ..., a_{N-1}, starting from any initial state, such that the total cost of reaching the goal state G is minimum.

Policy

As explained in the previous section, whenever an action a_k is taken in state x_k, a state transition occurs, governed by equation (2.3.5). The ultimate learning solution is to find a rule by which an action is chosen at any of the possible states; in other words, a good mapping from the state space χ to the action space A is to be derived. In Reinforcement Learning problems, any mapping from the state space to the action space is termed a policy and denoted π, so that π(x) denotes the action taken by the robot on reaching state x. At any state x, since there are different possible paths to reach the goal, they are treated as different policies: π_1(x), π_2(x), etc. The optimum policy at any state x is denoted π*(x). Reinforcement Learning methods go through iterative steps to evolve this optimal policy π*(x). In order to find the optimum policy, some means of comparing policies has to be formulated; for this, a reward function is defined which gives a quantitative measure of the goodness of an action at a particular state.

Reinforcement Function

Designing a reinforcement function is an important issue in Reinforcement Learning. The reinforcement function should capture the objective of the agent. In some cases this is straightforward; in others it is not. For example, in the case of the N-arm bandit problem (which can be viewed as a Reinforcement Learning problem with just one state), the reinforcement function is the return obtained when the agent plays an arm. In the case of the grid world problem, the objective is to find the shortest path. Here, it can be assumed that the system incurs a cost of one unit when the agent moves from one cell to another normal cell, and a cost of B units when it moves to a cell with an obstacle. The value B should be chosen depending on how bad the obstacle is. More formally, at stage k the agent performs an action a_k in state x_k and moves to a new state x_{k+1}. The reinforcement function is denoted g(x_k, a_k, x_{k+1}); the reinforcement obtained in each step is also known as the reward and is denoted r_k. The agent learns a sequence of actions to minimize the accumulated g(x_k, a_k, x_{k+1}). In the case of learning by animals, the reward is obtained from the environment; in the case of algorithms, the reinforcement function has to be defined. In this simple grid world, the reinforcement function can be defined as:

g(x_k, a_k, x_{k+1}) = 1 if x_{k+1} is a normal cell, and g(x_k, a_k, x_{k+1}) = B if x_{k+1} is a cell with an obstacle.

If a cell with an obstacle has to be avoided, choose B = 1,000,000; if the obstacle has only a very small effect, then B can be chosen as 10. To find the total cost, the costs or rewards on each transition are accumulated. The total cost for reaching the goal state can be taken as

Σ_{k=0}^{N−1} g(x_k, a_k, x_{k+1})

x_0 being the initial state and N the number of transitions needed to reach the goal state.

Value Function

The issue is how the robot (in general, the agent in a Reinforcement Learning problem) can choose good decisions in order to reach the goal state, starting from an initial state x, at minimum cost. The robot has to follow a good policy from the initial state in order to reach the goal at minimum cost. One measure with which to evaluate the goodness of a policy is the total expected discounted cost incurred while following that policy over N stages. A value function V^π : χ → R is defined to rate the goodness of the different policies. V^π(x) represents the total cost incurred by starting in state x and following policy π over N stages:

V^π(x) = E[ Σ_{k=0}^{N−1} γ^k g(x_k, π(x_k), x_{k+1}) ],  x_0 = x   (2.3.6)

Here γ is the discount factor. The reason for incorporating a discount factor is that the real goodness of an action may not be reflected by its immediate reward. The value of γ is decided by the problem environment, to account for how much the future rewards are to be discounted when rating the goodness of the policy at the present state. The discount factor can take a value between 0 and 1 depending on the problem environment; a value of 1 indicates that all future rewards have the same importance as the immediate reward. In this shortest path problem, since all the costs are relevant to the same extent, γ is taken as 1. With this objective function, a policy π_1 is said to be better than a policy π_2 when V^π1(x) ≤ V^π2(x) for all x ∈ χ. The problem is to find an optimal policy π* such that, starting from an initial state x, the value function, or expected total cost, is lower when following policy π* than when following any other policy π ∈ Π. That is, find π* such that

V^π*(x) ≤ V^π(x)  for all π ∈ Π and all x ∈ χ

Π being the set of policies.
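The value function (2.3.6) can be estimated for any fixed policy simply by following it from x_0 and accumulating the discounted step costs. The sketch below does this with the hypothetical gridStep model given earlier, using γ = 1 as chosen for the shortest path problem; for this deterministic grid a single rollout equals the expectation in (2.3.6). The function name and the cell-array representation of the policy are illustrative assumptions.

% Rollout evaluation of the value function (2.3.6) for a fixed policy.
% pol is a 36-element cell array of action names, one per state;
% obstacle is a logical 36-element vector marking the crossed cells.
% (save as policyValue.m; uses the hypothetical gridStep.m above)
function V = policyValue(x0, pol, obstacle, B, gamma, goal)
    V = 0; x = x0; k = 0;
    while x ~= goal && k < 1000                    % cap the rollout length
        a = pol{x};                                % action prescribed at state x
        xNext = gridStep(x, a);                    % transition, equation (2.3.5)
        if obstacle(xNext), g = B; else, g = 1; end   % reinforcement on this step
        V = V + gamma^k * g;                       % accumulate discounted cost
        x = xNext; k = k + 1;
    end
end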


More information

Artificial Intelligence Methods (G5BAIM) - Examination

Artificial Intelligence Methods (G5BAIM) - Examination Question 1 a) According to John Koza there are five stages when planning to solve a problem using a genetic program. What are they? Give a short description of each. (b) How could you cope with division

More information

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes

Today s s Lecture. Applicability of Neural Networks. Back-propagation. Review of Neural Networks. Lecture 20: Learning -4. Markov-Decision Processes Today s s Lecture Lecture 20: Learning -4 Review of Neural Networks Markov-Decision Processes Victor Lesser CMPSCI 683 Fall 2004 Reinforcement learning 2 Back-propagation Applicability of Neural Networks

More information

Optimal Placement & sizing of Distributed Generator (DG)

Optimal Placement & sizing of Distributed Generator (DG) Chapter - 5 Optimal Placement & sizing of Distributed Generator (DG) - A Single Objective Approach CHAPTER - 5 Distributed Generation (DG) for Power Loss Minimization 5. Introduction Distributed generators

More information

On Optimal Power Flow

On Optimal Power Flow On Optimal Power Flow K. C. Sravanthi #1, Dr. M. S. Krishnarayalu #2 # Department of Electrical and Electronics Engineering V R Siddhartha Engineering College, Vijayawada, AP, India Abstract-Optimal Power

More information

OPTIMAL CAPACITOR PLACEMENT USING FUZZY LOGIC

OPTIMAL CAPACITOR PLACEMENT USING FUZZY LOGIC CHAPTER - 5 OPTIMAL CAPACITOR PLACEMENT USING FUZZY LOGIC 5.1 INTRODUCTION The power supplied from electrical distribution system is composed of both active and reactive components. Overhead lines, transformers

More information

Assessment of Available Transfer Capability Incorporating Probabilistic Distribution of Load Using Interval Arithmetic Method

Assessment of Available Transfer Capability Incorporating Probabilistic Distribution of Load Using Interval Arithmetic Method Assessment of Available Transfer Capability Incorporating Probabilistic Distribution of Load Using Interval Arithmetic Method Prabha Umapathy, Member, IACSIT, C.Venkataseshaiah and M.Senthil Arumugam Abstract

More information

CS221 Practice Midterm

CS221 Practice Midterm CS221 Practice Midterm Autumn 2012 1 ther Midterms The following pages are excerpts from similar classes midterms. The content is similar to what we ve been covering this quarter, so that it should be

More information

Distributed Optimization. Song Chong EE, KAIST

Distributed Optimization. Song Chong EE, KAIST Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links

More information

DISTRIBUTION SYSTEM OPTIMISATION

DISTRIBUTION SYSTEM OPTIMISATION Politecnico di Torino Dipartimento di Ingegneria Elettrica DISTRIBUTION SYSTEM OPTIMISATION Prof. Gianfranco Chicco Lecture at the Technical University Gh. Asachi, Iaşi, Romania 26 October 2010 Outline

More information

Lecture 9 Evolutionary Computation: Genetic algorithms

Lecture 9 Evolutionary Computation: Genetic algorithms Lecture 9 Evolutionary Computation: Genetic algorithms Introduction, or can evolution be intelligent? Simulation of natural evolution Genetic algorithms Case study: maintenance scheduling with genetic

More information

Optimal Operation of Large Power System by GA Method

Optimal Operation of Large Power System by GA Method Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS) (1): 1-7 Scholarlink Research Institute Journals, 01 (ISSN: 11-7016) jeteas.scholarlinkresearch.org Journal of Emerging Trends in

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information

Regular paper. Particle Swarm Optimization Applied to the Economic Dispatch Problem

Regular paper. Particle Swarm Optimization Applied to the Economic Dispatch Problem Rafik Labdani Linda Slimani Tarek Bouktir Electrical Engineering Department, Oum El Bouaghi University, 04000 Algeria. rlabdani@yahoo.fr J. Electrical Systems 2-2 (2006): 95-102 Regular paper Particle

More information

Solving Numerical Optimization Problems by Simulating Particle-Wave Duality and Social Information Sharing

Solving Numerical Optimization Problems by Simulating Particle-Wave Duality and Social Information Sharing International Conference on Artificial Intelligence (IC-AI), Las Vegas, USA, 2002: 1163-1169 Solving Numerical Optimization Problems by Simulating Particle-Wave Duality and Social Information Sharing Xiao-Feng

More information

A.I.: Beyond Classical Search

A.I.: Beyond Classical Search A.I.: Beyond Classical Search Random Sampling Trivial Algorithms Generate a state randomly Random Walk Randomly pick a neighbor of the current state Both algorithms asymptotically complete. Overview Previously

More information

RL 3: Reinforcement Learning

RL 3: Reinforcement Learning RL 3: Reinforcement Learning Q-Learning Michael Herrmann University of Edinburgh, School of Informatics 20/01/2015 Last time: Multi-Armed Bandits (10 Points to remember) MAB applications do exist (e.g.

More information

Real Time Voltage Control using Genetic Algorithm

Real Time Voltage Control using Genetic Algorithm Real Time Voltage Control using Genetic Algorithm P. Thirusenthil kumaran, C. Kamalakannan Department of EEE, Rajalakshmi Engineering College, Chennai, India Abstract An algorithm for control action selection

More information

Research Article A Novel Differential Evolution Invasive Weed Optimization Algorithm for Solving Nonlinear Equations Systems

Research Article A Novel Differential Evolution Invasive Weed Optimization Algorithm for Solving Nonlinear Equations Systems Journal of Applied Mathematics Volume 2013, Article ID 757391, 18 pages http://dx.doi.org/10.1155/2013/757391 Research Article A Novel Differential Evolution Invasive Weed Optimization for Solving Nonlinear

More information

Evaluation of multi armed bandit algorithms and empirical algorithm

Evaluation of multi armed bandit algorithms and empirical algorithm Acta Technica 62, No. 2B/2017, 639 656 c 2017 Institute of Thermomechanics CAS, v.v.i. Evaluation of multi armed bandit algorithms and empirical algorithm Zhang Hong 2,3, Cao Xiushan 1, Pu Qiumei 1,4 Abstract.

More information

CS599 Lecture 1 Introduction To RL

CS599 Lecture 1 Introduction To RL CS599 Lecture 1 Introduction To RL Reinforcement Learning Introduction Learning from rewards Policies Value Functions Rewards Models of the Environment Exploitation vs. Exploration Dynamic Programming

More information

Module 6 : Preventive, Emergency and Restorative Control. Lecture 27 : Normal and Alert State in a Power System. Objectives

Module 6 : Preventive, Emergency and Restorative Control. Lecture 27 : Normal and Alert State in a Power System. Objectives Module 6 : Preventive, Emergency and Restorative Control Lecture 27 : Normal and Alert State in a Power System Objectives In this lecture you will learn the following Different states in a power system

More information

Prof. Dr. Ann Nowé. Artificial Intelligence Lab ai.vub.ac.be

Prof. Dr. Ann Nowé. Artificial Intelligence Lab ai.vub.ac.be REINFORCEMENT LEARNING AN INTRODUCTION Prof. Dr. Ann Nowé Artificial Intelligence Lab ai.vub.ac.be REINFORCEMENT LEARNING WHAT IS IT? What is it? Learning from interaction Learning about, from, and while

More information

CS 570: Machine Learning Seminar. Fall 2016

CS 570: Machine Learning Seminar. Fall 2016 CS 570: Machine Learning Seminar Fall 2016 Class Information Class web page: http://web.cecs.pdx.edu/~mm/mlseminar2016-2017/fall2016/ Class mailing list: cs570@cs.pdx.edu My office hours: T,Th, 2-3pm or

More information

Intuitionistic Fuzzy Estimation of the Ant Methodology

Intuitionistic Fuzzy Estimation of the Ant Methodology BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 9, No 2 Sofia 2009 Intuitionistic Fuzzy Estimation of the Ant Methodology S Fidanova, P Marinov Institute of Parallel Processing,

More information

CSC 4510 Machine Learning

CSC 4510 Machine Learning 10: Gene(c Algorithms CSC 4510 Machine Learning Dr. Mary Angela Papalaskari Department of CompuBng Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/ Slides of this presenta(on

More information

Reactive Power Contribution of Multiple STATCOM using Particle Swarm Optimization

Reactive Power Contribution of Multiple STATCOM using Particle Swarm Optimization Reactive Power Contribution of Multiple STATCOM using Particle Swarm Optimization S. Uma Mageswaran 1, Dr.N.O.Guna Sehar 2 1 Assistant Professor, Velammal Institute of Technology, Anna University, Chennai,

More information

Reactive Power Management using Firefly and Spiral Optimization under Static and Dynamic Loading Conditions

Reactive Power Management using Firefly and Spiral Optimization under Static and Dynamic Loading Conditions 1 Reactive Power Management using Firefly and Spiral Optimization under Static and Dynamic Loading Conditions Ripunjoy Phukan, ripun000@yahoo.co.in Abstract Power System planning encompasses the concept

More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Outline

More information

Machine Learning and Bayesian Inference. Unsupervised learning. Can we find regularity in data without the aid of labels?

Machine Learning and Bayesian Inference. Unsupervised learning. Can we find regularity in data without the aid of labels? Machine Learning and Bayesian Inference Dr Sean Holden Computer Laboratory, Room FC6 Telephone extension 6372 Email: sbh11@cl.cam.ac.uk www.cl.cam.ac.uk/ sbh11/ Unsupervised learning Can we find regularity

More information

5. Simulated Annealing 5.1 Basic Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini

5. Simulated Annealing 5.1 Basic Concepts. Fall 2010 Instructor: Dr. Masoud Yaghini 5. Simulated Annealing 5.1 Basic Concepts Fall 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Real Annealing and Simulated Annealing Metropolis Algorithm Template of SA A Simple Example References

More information

Optimal Placement and Sizing of Distributed Generation for Power Loss Reduction using Particle Swarm Optimization

Optimal Placement and Sizing of Distributed Generation for Power Loss Reduction using Particle Swarm Optimization Available online at www.sciencedirect.com Energy Procedia 34 (2013 ) 307 317 10th Eco-Energy and Materials Science and Engineering (EMSES2012) Optimal Placement and Sizing of Distributed Generation for

More information

Minimization of Energy Loss using Integrated Evolutionary Approaches

Minimization of Energy Loss using Integrated Evolutionary Approaches Minimization of Energy Loss using Integrated Evolutionary Approaches Attia A. El-Fergany, Member, IEEE, Mahdi El-Arini, Senior Member, IEEE Paper Number: 1569614661 Presentation's Outline Aim of this work,

More information

Fundamentals of Metaheuristics

Fundamentals of Metaheuristics Fundamentals of Metaheuristics Part I - Basic concepts and Single-State Methods A seminar for Neural Networks Simone Scardapane Academic year 2012-2013 ABOUT THIS SEMINAR The seminar is divided in three

More information

Genetic Algorithm for Solving the Economic Load Dispatch

Genetic Algorithm for Solving the Economic Load Dispatch International Journal of Electronic and Electrical Engineering. ISSN 0974-2174, Volume 7, Number 5 (2014), pp. 523-528 International Research Publication House http://www.irphouse.com Genetic Algorithm

More information

Course 16:198:520: Introduction To Artificial Intelligence Lecture 13. Decision Making. Abdeslam Boularias. Wednesday, December 7, 2016

Course 16:198:520: Introduction To Artificial Intelligence Lecture 13. Decision Making. Abdeslam Boularias. Wednesday, December 7, 2016 Course 16:198:520: Introduction To Artificial Intelligence Lecture 13 Decision Making Abdeslam Boularias Wednesday, December 7, 2016 1 / 45 Overview We consider probabilistic temporal models where the

More information

Short Course: Multiagent Systems. Multiagent Systems. Lecture 1: Basics Agents Environments. Reinforcement Learning. This course is about:

Short Course: Multiagent Systems. Multiagent Systems. Lecture 1: Basics Agents Environments. Reinforcement Learning. This course is about: Short Course: Multiagent Systems Lecture 1: Basics Agents Environments Reinforcement Learning Multiagent Systems This course is about: Agents: Sensing, reasoning, acting Multiagent Systems: Distributed

More information

Multi Objective Economic Load Dispatch problem using A-Loss Coefficients

Multi Objective Economic Load Dispatch problem using A-Loss Coefficients Volume 114 No. 8 2017, 143-153 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Multi Objective Economic Load Dispatch problem using A-Loss Coefficients

More information

CHAPTER 3 FUZZIFIED PARTICLE SWARM OPTIMIZATION BASED DC- OPF OF INTERCONNECTED POWER SYSTEMS

CHAPTER 3 FUZZIFIED PARTICLE SWARM OPTIMIZATION BASED DC- OPF OF INTERCONNECTED POWER SYSTEMS 51 CHAPTER 3 FUZZIFIED PARTICLE SWARM OPTIMIZATION BASED DC- OPF OF INTERCONNECTED POWER SYSTEMS 3.1 INTRODUCTION Optimal Power Flow (OPF) is one of the most important operational functions of the modern

More information

Reinforcement Learning for Continuous. Action using Stochastic Gradient Ascent. Hajime KIMURA, Shigenobu KOBAYASHI JAPAN

Reinforcement Learning for Continuous. Action using Stochastic Gradient Ascent. Hajime KIMURA, Shigenobu KOBAYASHI JAPAN Reinforcement Learning for Continuous Action using Stochastic Gradient Ascent Hajime KIMURA, Shigenobu KOBAYASHI Tokyo Institute of Technology, 4259 Nagatsuda, Midori-ku Yokohama 226-852 JAPAN Abstract:

More information

Power System Analysis Prof. A. K. Sinha Department of Electrical Engineering Indian Institute of Technology, Kharagpur

Power System Analysis Prof. A. K. Sinha Department of Electrical Engineering Indian Institute of Technology, Kharagpur Power System Analysis Prof. A. K. Sinha Department of Electrical Engineering Indian Institute of Technology, Kharagpur Lecture - 9 Transmission Line Steady State Operation Welcome to lesson 9, in Power

More information

AQUIFER GEOMETRY AND STRUCTURAL CONTROLS ON GROUNDWATER POTENTIAL IN MOUNT ELGON AQUIFER, TRANS-NZOIA COUNTY, KENYA.

AQUIFER GEOMETRY AND STRUCTURAL CONTROLS ON GROUNDWATER POTENTIAL IN MOUNT ELGON AQUIFER, TRANS-NZOIA COUNTY, KENYA. AQUIFER GEOMETRY AND STRUCTURAL CONTROLS ON GROUNDWATER POTENTIAL IN MOUNT ELGON AQUIFER, TRANS-NZOIA COUNTY, KENYA. OGUT JULIUS ODIDA I56/79737/2012 A dissertation submitted to the Department of Geology

More information

Algorithms and Complexity theory

Algorithms and Complexity theory Algorithms and Complexity theory Thibaut Barthelemy Some slides kindly provided by Fabien Tricoire University of Vienna WS 2014 Outline 1 Algorithms Overview How to write an algorithm 2 Complexity theory

More information

Distributed vs Bulk Power in Distribution Systems Considering Distributed Generation

Distributed vs Bulk Power in Distribution Systems Considering Distributed Generation Distributed vs Bulk Power in Distribution Systems Considering Distributed Generation Abdullah A. Alghamdi 1 and Prof. Yusuf A. Al-Turki 2 1 Ministry Of Education, Jeddah, Saudi Arabia. 2 King Abdulaziz

More information

Economic Operation of Power Systems

Economic Operation of Power Systems Economic Operation of Power Systems Section I: Economic Operation Of Power System Economic Distribution of Loads between the Units of a Plant Generating Limits Economic Sharing of Loads between Different

More information

Citation for the original published paper (version of record):

Citation for the original published paper (version of record): http://www.diva-portal.org This is the published version of a paper published in SOP Transactions on Power Transmission and Smart Grid. Citation for the original published paper (version of record): Liu,

More information

Introduction to Reinforcement Learning. CMPT 882 Mar. 18

Introduction to Reinforcement Learning. CMPT 882 Mar. 18 Introduction to Reinforcement Learning CMPT 882 Mar. 18 Outline for the week Basic ideas in RL Value functions and value iteration Policy evaluation and policy improvement Model-free RL Monte-Carlo and

More information

Reinforcement learning an introduction

Reinforcement learning an introduction Reinforcement learning an introduction Prof. Dr. Ann Nowé Computational Modeling Group AIlab ai.vub.ac.be November 2013 Reinforcement Learning What is it? Learning from interaction Learning about, from,

More information

Marks. bonus points. } Assignment 1: Should be out this weekend. } Mid-term: Before the last lecture. } Mid-term deferred exam:

Marks. bonus points. } Assignment 1: Should be out this weekend. } Mid-term: Before the last lecture. } Mid-term deferred exam: Marks } Assignment 1: Should be out this weekend } All are marked, I m trying to tally them and perhaps add bonus points } Mid-term: Before the last lecture } Mid-term deferred exam: } This Saturday, 9am-10.30am,

More information

Lin-Kernighan Heuristic. Simulated Annealing

Lin-Kernighan Heuristic. Simulated Annealing DM63 HEURISTICS FOR COMBINATORIAL OPTIMIZATION Lecture 6 Lin-Kernighan Heuristic. Simulated Annealing Marco Chiarandini Outline 1. Competition 2. Variable Depth Search 3. Simulated Annealing DM63 Heuristics

More information

Overview. Optimization. Easy optimization problems. Monte Carlo for Optimization. 1. Survey MC ideas for optimization: (a) Multistart

Overview. Optimization. Easy optimization problems. Monte Carlo for Optimization. 1. Survey MC ideas for optimization: (a) Multistart Monte Carlo for Optimization Overview 1 Survey MC ideas for optimization: (a) Multistart Art Owen, Lingyu Chen, Jorge Picazo (b) Stochastic approximation (c) Simulated annealing Stanford University Intel

More information

OPTIMIZED RESOURCE IN SATELLITE NETWORK BASED ON GENETIC ALGORITHM. Received June 2011; revised December 2011

OPTIMIZED RESOURCE IN SATELLITE NETWORK BASED ON GENETIC ALGORITHM. Received June 2011; revised December 2011 International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 12, December 2012 pp. 8249 8256 OPTIMIZED RESOURCE IN SATELLITE NETWORK

More information

Available online at ScienceDirect. Procedia Computer Science 20 (2013 ) 90 95

Available online at  ScienceDirect. Procedia Computer Science 20 (2013 ) 90 95 Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 20 (2013 ) 90 95 Complex Adaptive Systems, Publication 3 Cihan H. Dagli, Editor in Chief Conference Organized by Missouri

More information

CMU Lecture 12: Reinforcement Learning. Teacher: Gianni A. Di Caro

CMU Lecture 12: Reinforcement Learning. Teacher: Gianni A. Di Caro CMU 15-781 Lecture 12: Reinforcement Learning Teacher: Gianni A. Di Caro REINFORCEMENT LEARNING Transition Model? State Action Reward model? Agent Goal: Maximize expected sum of future rewards 2 MDP PLANNING

More information

SOULTION TO CONSTRAINED ECONOMIC LOAD DISPATCH

SOULTION TO CONSTRAINED ECONOMIC LOAD DISPATCH SOULTION TO CONSTRAINED ECONOMIC LOAD DISPATCH SANDEEP BEHERA (109EE0257) Department of Electrical Engineering National Institute of Technology, Rourkela SOLUTION TO CONSTRAINED ECONOMIC LOAD DISPATCH

More information

Reinforcement Learning. Yishay Mansour Tel-Aviv University

Reinforcement Learning. Yishay Mansour Tel-Aviv University Reinforcement Learning Yishay Mansour Tel-Aviv University 1 Reinforcement Learning: Course Information Classes: Wednesday Lecture 10-13 Yishay Mansour Recitations:14-15/15-16 Eliya Nachmani Adam Polyak

More information

Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing

Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing Course notes for EE394V Restructured Electricity Markets: Locational Marginal Pricing Ross Baldick Copyright c 2013 Ross Baldick www.ece.utexas.edu/ baldick/classes/394v/ee394v.html Title Page 1 of 132

More information

12. LOCAL SEARCH. gradient descent Metropolis algorithm Hopfield neural networks maximum cut Nash equilibria

12. LOCAL SEARCH. gradient descent Metropolis algorithm Hopfield neural networks maximum cut Nash equilibria 12. LOCAL SEARCH gradient descent Metropolis algorithm Hopfield neural networks maximum cut Nash equilibria Lecture slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley h ttp://www.cs.princeton.edu/~wayne/kleinberg-tardos

More information

Discrete evaluation and the particle swarm algorithm

Discrete evaluation and the particle swarm algorithm Volume 12 Discrete evaluation and the particle swarm algorithm Tim Hendtlass and Tom Rodgers Centre for Intelligent Systems and Complex Processes Swinburne University of Technology P. O. Box 218 Hawthorn

More information

STUDY OF PARTICLE SWARM FOR OPTIMAL POWER FLOW IN IEEE BENCHMARK SYSTEMS INCLUDING WIND POWER GENERATORS

STUDY OF PARTICLE SWARM FOR OPTIMAL POWER FLOW IN IEEE BENCHMARK SYSTEMS INCLUDING WIND POWER GENERATORS Southern Illinois University Carbondale OpenSIUC Theses Theses and Dissertations 12-1-2012 STUDY OF PARTICLE SWARM FOR OPTIMAL POWER FLOW IN IEEE BENCHMARK SYSTEMS INCLUDING WIND POWER GENERATORS Mohamed

More information

Application of Artificial Neural Network in Economic Generation Scheduling of Thermal Power Plants

Application of Artificial Neural Network in Economic Generation Scheduling of Thermal Power Plants Application of Artificial Neural Networ in Economic Generation Scheduling of Thermal ower lants Mohammad Mohatram Department of Electrical & Electronics Engineering Sanjay Kumar Department of Computer

More information

Lecture 25: Learning 4. Victor R. Lesser. CMPSCI 683 Fall 2010

Lecture 25: Learning 4. Victor R. Lesser. CMPSCI 683 Fall 2010 Lecture 25: Learning 4 Victor R. Lesser CMPSCI 683 Fall 2010 Final Exam Information Final EXAM on Th 12/16 at 4:00pm in Lederle Grad Res Ctr Rm A301 2 Hours but obviously you can leave early! Open Book

More information

Minimization of load shedding by sequential use of linear programming and particle swarm optimization

Minimization of load shedding by sequential use of linear programming and particle swarm optimization Turk J Elec Eng & Comp Sci, Vol.19, No.4, 2011, c TÜBİTAK doi:10.3906/elk-1003-31 Minimization of load shedding by sequential use of linear programming and particle swarm optimization Mehrdad TARAFDAR

More information

Zebo Peng Embedded Systems Laboratory IDA, Linköping University

Zebo Peng Embedded Systems Laboratory IDA, Linköping University TDTS 01 Lecture 8 Optimization Heuristics for Synthesis Zebo Peng Embedded Systems Laboratory IDA, Linköping University Lecture 8 Optimization problems Heuristic techniques Simulated annealing Genetic

More information

3D HP Protein Folding Problem using Ant Algorithm

3D HP Protein Folding Problem using Ant Algorithm 3D HP Protein Folding Problem using Ant Algorithm Fidanova S. Institute of Parallel Processing BAS 25A Acad. G. Bonchev Str., 1113 Sofia, Bulgaria Phone: +359 2 979 66 42 E-mail: stefka@parallel.bas.bg

More information

UNIT-I ECONOMIC OPERATION OF POWER SYSTEM-1

UNIT-I ECONOMIC OPERATION OF POWER SYSTEM-1 UNIT-I ECONOMIC OPERATION OF POWER SYSTEM-1 1.1 HEAT RATE CURVE: The heat rate characteristics obtained from the plot of the net heat rate in Btu/Wh or cal/wh versus power output in W is shown in fig.1

More information

Research Article Ant Colony Search Algorithm for Optimal Generators Startup during Power System Restoration

Research Article Ant Colony Search Algorithm for Optimal Generators Startup during Power System Restoration Mathematical Problems in Engineering Volume 2010, Article ID 906935, 11 pages doi:10.1155/2010/906935 Research Article Ant Colony Search Algorithm for Optimal Generators Startup during Power System Restoration

More information

Comparison of Loss Sensitivity Factor & Index Vector methods in Determining Optimal Capacitor Locations in Agricultural Distribution

Comparison of Loss Sensitivity Factor & Index Vector methods in Determining Optimal Capacitor Locations in Agricultural Distribution 6th NATIONAL POWER SYSTEMS CONFERENCE, 5th-7th DECEMBER, 200 26 Comparison of Loss Sensitivity Factor & Index Vector s in Determining Optimal Capacitor Locations in Agricultural Distribution K.V.S. Ramachandra

More information

Local Search & Optimization

Local Search & Optimization Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Some

More information

Using first-order logic, formalize the following knowledge:

Using first-order logic, formalize the following knowledge: Probabilistic Artificial Intelligence Final Exam Feb 2, 2016 Time limit: 120 minutes Number of pages: 19 Total points: 100 You can use the back of the pages if you run out of space. Collaboration on the

More information

Optimal Capacitor placement in Distribution Systems with Distributed Generators for Voltage Profile improvement by Particle Swarm Optimization

Optimal Capacitor placement in Distribution Systems with Distributed Generators for Voltage Profile improvement by Particle Swarm Optimization Optimal Capacitor placement in Distribution Systems with Distributed Generators for Voltage Profile improvement by Particle Swarm Optimization G. Balakrishna 1, Dr. Ch. Sai Babu 2 1 Associate Professor,

More information

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat:

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: Local Search Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: I I Select a variable to change Select a new value for that variable Until a satisfying assignment

More information

Part B" Ants (Natural and Artificial)! Langton s Vants" (Virtual Ants)! Vants! Example! Time Reversibility!

Part B Ants (Natural and Artificial)! Langton s Vants (Virtual Ants)! Vants! Example! Time Reversibility! Part B" Ants (Natural and Artificial)! Langton s Vants" (Virtual Ants)! 11/14/08! 1! 11/14/08! 2! Vants!! Square grid!! Squares can be black or white!! Vants can face N, S, E, W!! Behavioral rule:!! take

More information

Homework 2: MDPs and Search

Homework 2: MDPs and Search Graduate Artificial Intelligence 15-780 Homework 2: MDPs and Search Out on February 15 Due on February 29 Problem 1: MDPs [Felipe, 20pts] Figure 1: MDP for Problem 1. States are represented by circles

More information

Reinforcement Learning

Reinforcement Learning 1 Reinforcement Learning Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Plan 1 Why reinforcement learning? Where does this theory come from? Markov decision

More information