Deep Reinforcement Learning with Experience Replay Based on SARSA


Dongbin Zhao, Haitao Wang, Kun Shao and Yuanheng Zhu
Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China

Abstract: SARSA, one kind of on-policy reinforcement learning method, is integrated with deep learning in this paper to solve the video game control problem. We use a deep convolutional neural network to estimate the state-action value, and SARSA learning to update it. Besides, experience replay is introduced to make the training process suitable for scalable machine learning problems. In this way, a new deep reinforcement learning method, called deep SARSA, is proposed to solve complicated control problems such as imitating human play of video games. From the experimental results, we can conclude that deep SARSA learning shows better performance in some aspects than deep Q learning.

Keywords: SARSA learning; Q learning; experience replay; deep reinforcement learning; deep learning

I. INTRODUCTION

With the development of artificial intelligence (AI), more and more intelligent devices come into use in our daily lives. Faced with unknown complicated environments, these intelligent devices should know how to perceive the environment and make decisions accordingly. In the last 20 years, deep learning (DL) [1] based on neural networks has greatly promoted the development of high-dimensional information perception. With its powerful generalizing ability, DL can retrieve highly abstract structures or features from the real environment and then precisely depict the complicated dependences in raw data such as images and video. With this excellent ability of feature detection, DL has been applied to many learning tasks, such as handwritten digit recognition [2], scenario analysis [3] and so on. Although it has achieved great breakthroughs in information perception, especially in image classification, DL has a natural drawback: it cannot directly select policies or deal with decision-making problems, resulting in limited application in the intelligent control field.
Different from DL, reinforcement learning (RL) is a class of methods which try to find optimal or near-optimal policies for complicated systems or agents [4-6]. As an effective decision-making method, it has been introduced into optimal control [7], model-free control [8, 9] and so on. Generally, RL has two classes: policy iteration and value iteration. On the other hand, it can also be divided into off-policy and on-policy methods. Common RL methods include Q learning, SARSA learning, TD(λ) and so on [4]. Though RL is naturally designed to deal with decision-making problems, it has run into great difficulties when handling high-dimensional data. With the development of feature detection methods like DL, such problems are to be well solved. A new method, called deep reinforcement learning (DRL), emerges to lead the direction of advanced AI research. DRL combines the excellent perceiving ability of DL with the decision-making ability of RL. In 2010, Lange [10] proposed a typical algorithm which applied a deep auto-encoder neural network (DANN) to a visual control task. Later, Abtahi and Fasel [11] employed a deep belief network (DBN) as the function approximator to improve the learning efficiency of the traditional neural fitted-Q method. Then, Arel [12] gave the complete definition of deep reinforcement learning in 2012. Most importantly, the group of DeepMind introduced the deep Q network (DQN) [13, 14], which utilizes a convolutional neural network (CNN) instead of the traditional Q network. Their method has been applied to the video game platform called the Arcade Learning Environment (ALE) [15] and can even obtain higher scores than human players in some games, like breakout. Based on their work, Levine [16] applied recurrent neural networks to the framework proposed by DeepMind. In 2015, DeepMind put forward a new framework of DRL based on Monte Carlo tree search (MCTS) to train a Go agent called AlphaGo [17], which beat one of the most excellent human players, Lee Sedol, in 2016. This match raised people's interest in DRL, which is leading the trend of AI.
Though the DQN algorithm shows excellent performance in video games, it also has drawbacks, such as its inefficient data sampling process and the defects of off-policy RL methods. In this paper, we focus on a brand new DRL method based on SARSA learning, also called deep SARSA, for imitating human players in playing video games. The deep SARSA method integrated with an experience replay process is proposed. To the best of our knowledge, this is the first attempt to combine SARSA learning with DL for complicated systems. The paper is organized as follows. In Section II, SARSA learning and ALE are introduced as the background and preliminary. Then a new deep SARSA method is proposed in Section III to solve complicated control tasks such as video games. Two simulation results are given to validate the effectiveness of the proposed deep SARSA in Section IV. In the end we draw a conclusion. This work was supported in part by the National Natural Science Foundation of China.

II. SARSA LEARNING AND ARCADE LEARNING ENVIRONMENT

A. Q learning and SARSA learning

Considering a Markov decision process (MDP), the goal of the learning task is to maximize the future reward as the agent interacts with the environment. Generally, we define the future reward from time step t as

    R(t) = Σ_{k=0}^{T} γ^k r_{t+k+1},    (1)

where r_{t+k+1} is the reward when an action is taken at time t+k, and T is often regarded as the time when the process terminates. In addition, γ ∈ (0, 1] is the discount factor; γ = 1 and T = ∞ cannot be satisfied simultaneously. Then the state-action value function Q^π(s, a), which indicates the expected return when the agent takes the action a at the state s under the policy π, can be defined as

    Q^π(s, a) = E_π{R(t) | s_t = s, a_t = a},    (2)

where E_π{·} denotes the expectation under π, and π is the policy function over actions. Now the learning task aims at obtaining the optimal state-action value function Q*(s, a), which is usually relevant to the Bellman equation. Two methods, Q learning and SARSA learning, will now be compared for obtaining the optimal state-action value function.

As one of the traditional RL algorithms, Q learning is an off-policy method. The agent independently interacts with the environment, which often means selecting an action a. Then the reward r is fed back from the environment and the next state s' is derived. Here Q(s, a) represents the current state-action value. In order to update the current state-action value function, we employ the next state-action value Q(s', a') to estimate it, although the next action a' is still unknown. So the most important principle in Q learning is to take the greedy action that maximizes the next Q(s', a'). The update equation is

    Q(s, a) ← Q(s, a) + α[r + γ max_{a'} Q(s', a') − Q(s, a)],    (3)

where α represents the learning rate. By contrast, SARSA learning is an on-policy method. This means that when the current state-action value is updated, the next action a' will actually be taken, whereas in Q learning the action a' is completely greedy. Given such analysis, the update equation of the state-action value can be defined as

    Q(s, a) ← Q(s, a) + α[r + γ Q(s', a') − Q(s, a)].    (4)

Actually, the difference between SARSA learning and Q learning lies in the update equations (3) and (4).
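The contrast between updates (3) and (4) can be made concrete in a few lines of tabular code. The following sketch is illustrative only (the two-state example, initial values, and hyperparameters are assumptions, not from the paper's experiments):

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update (3): bootstrap with the greedy next action."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update (4): bootstrap with the action actually taken next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# Two states, two actions; suppose the values of state s1 are already known.
actions = (0, 1)
Q = {(s, a): 0.0 for s in ("s0", "s1") for a in actions}
Q[("s1", 0)], Q[("s1", 1)] = 1.0, -1.0

q_learning_update(Q, "s0", 0, r=0.0, s_next="s1", actions=actions)
# Q learning bootstraps with max_a' Q(s1, a') = 1.0, so Q(s0, 0) rises to 0.09

sarsa_update(Q, "s0", 1, r=0.0, s_next="s1", a_next=1)
# SARSA bootstraps with the taken action's Q(s1, 1) = -1.0, so Q(s0, 1) falls to -0.09
```

Because SARSA evaluates the policy it actually follows, an exploratory next action (here action 1) directly lowers the estimate, which is exactly the on-policy property exploited later in the paper.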
In SARSA learning, the training data is the quintuple (s, a, r, s', a'). In every update process, this quintuple is derived in sequence. In Q learning, however, a' is used only for estimation and will not actually be taken.

B. Arcade Learning Environment

The Arcade Learning Environment (ALE) is a wrapper or platform including many video games for the Atari 2600. As a benchmark for new advanced RL algorithms, it presents an interface of states and rewards to the agent [15]. The states are high-dimensional visual inputs (RGB video at 60 Hz), like what humans receive, and the rewards can be derived from the scores given by the environment when the agent interacts with the platform. Usually, the agent interacts with ALE through a set of 18 actions, only 5 of which are basic actions. These 5 actions contain 4 movement directions, and the remaining one is firing or null. To be clear, the rewards or scores come from the system output instead of being recognized from the images. ALE is designed to be remarkably suitable for testing RL algorithms, so many groups have applied their algorithms to this platform, including the DQN method proposed by DeepMind. Though DQN has achieved excellent performance in video games, it only combines basic Q learning with deep learning. Many other reinforcement learning methods, like on-policy methods, can help improve the performance of deep reinforcement learning. In the next section, we will present a DRL method based on SARSA learning to improve the training process in video games from ALE.

III. DEEP REINFORCEMENT LEARNING METHOD BASED ON SARSA LEARNING

Before deep reinforcement learning algorithms came out, many traditional RL methods had been applied to ALE. Defazio and Graepel [18] applied some RL methods to those complicated video games and compared the advantages and disadvantages of different methods such as Q learning, SARSA learning, actor-critic (AC), GQ, R learning and so on. The results are listed in Table I.

TABLE I. THE PERFORMANCE OF DIFFERENT RL METHODS IN ALE [18]
    Relative performance: SARSA | AC | GQ | Q | R

From Table I, the average performance of Q learning is only 82% of that of SARSA learning in video games.
Though these algorithms only use hand-crafted features, the results above indicate that SARSA learning can achieve better performance than Q learning. Given these facts, a new deep reinforcement learning method based on SARSA learning is proposed as follows.

A. SARSA network

Games from the Atari 2600 can be regarded as MDPs to be solved by RL algorithms. Here SARSA learning will be integrated into the DRL framework. Similar to DQN in [14], given the current state s, the action a is selected by the ε-greedy method. Then the next state s' and the reward r will be observed, and Q(s, a) is the current state-action value. So in DRL based on SARSA, the current optimal state-action value can be estimated by

    Q*(s, a) = E[r + γ Q(s', a') | s, a],    (5)

where a' is the next action selected by ε-greedy. Similarly, in deep SARSA learning, the value function is still approximated with a convolutional neural network (CNN), whose structure is shown in Fig. 1. The input of the CNN is the raw image from the video game and the output is the Q value of each action. θ is defined as the parameters of the CNN. At the i-th iteration of training, the loss function of the network can be defined as

    L_i(θ_i) = (y_i − Q(s, a; θ_i))²,    (6)

where y_i = r + γ Q(s', a'; θ_{i−1}). Then the main objective is to optimize the loss function L_i(θ_i). From the view of supervised learning, y_i is regarded as the label in training, though it is also a variable. By differentiating (6), we get the gradient of the loss function

    ∇_{θ_i} L_i(θ_i) = −(r + γ Q(s', a'; θ_{i−1}) − Q(s, a; θ_i)) ∇_{θ_i} Q(s, a; θ_i),    (7)

where ∇_{θ_i} Q(s, a; θ_i) is the gradient of the current state-action value and the constant factor is absorbed into the learning rate. Then according to (7), we can optimize the loss function by stochastic gradient descent (SGD), Adadelta and so on [19]. Besides, the reinforcement learning process should also be taken into consideration: the last layer of the network outputs the Q value of each action, so we can select the action and update the value by the SARSA method. Fig. 2 depicts the forward data flow of the SARSA network in the training process.

Fig. 1 The convolutional neural network in DRL (input, convolution and pooling layers, target output)

Fig. 2 The forward data flow in DRL

The feature extraction in Fig. 2 can be seen as the image preprocessing and the CNN. After the state-action value is obtained, a proper action is selected to make a decision by the SARSA learning method. Later we introduce the experience replay technique [20] to improve the training process of DRL and adapt reinforcement learning to scalable machine learning processes.

B.
Experience replay

In traditional reinforcement learning methods, learning and updating proceed in sequence. That is to say, every sample stimulates only one update, making the learning process rather slow. In order to adapt to scalable machine learning processes, the historical data is stored in memory and retrained later continuously. In Fig. 2, a quadruple (s, a, r, s') is kept in the historical data D = {e_1, ..., e_N}, where N indicates the size of the historical stack. Then in the training process, we sample the training data from this stack D. There are several methods by which samples can be obtained, such as consecutive sampling, uniform sampling, and sampling weighted by reward. Here we follow the uniform sampling method in DQN, which has two advantages. Firstly, the efficiency of data usage is improved. Secondly, consecutive samples might be greatly correlated with each other, and uniform sampling (s, a, r, s') ~ U(D) can reduce the correlation between input data [14].

Before raw images from the video game are sampled, some preprocessing must be done. We can obtain every frame from the video, but it would be less efficient if one frame were regarded as the state. Additionally, consecutive frames might contain important features of the images, such as speed or geometrical relationships, which can contribute more to the performance of the agent; if only a single frame were trained on, all those vital features would be abandoned. So, in this paper one action is taken every 4 frames, the same as in [14]. The 4 frames are concatenated as the state, and the concatenation is defined as a function φ. After being processed, the states are stored in stack D. The next subsection will introduce the whole process of DRL based on SARSA learning.

C. Deep SARSA learning

Given the number of actions of a video game, n, the SARSA network should contain n outputs, which represent the discrete state-action values, to interact with ALE. The current state s is processed by the CNN to get the current state-action value vector Q_1, which is an n-dimensional vector. Then the current action a is selected with the ε-greedy algorithm, and the reward r and the next state s' are observed. In order to estimate the current Q(s, a), the next state-action value Q(s', a') is obtained according to (4).
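The historical stack D with uniform sampling and the 4-frame concatenation φ described above can be sketched in a few lines. This is a pure-Python stand-in with scalar "frames"; the class name ReplayStack and all sizes here are illustrative assumptions, not from the paper:

```python
import random
from collections import deque

class ReplayStack:
    """Historical stack D = {e_1, ..., e_N}: old samples drop out when full."""
    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def sample(self, batch_size):
        # uniform sampling (s, a, r, s') ~ U(D) reduces the correlation
        # between consecutive inputs
        return random.sample(self.data, batch_size)

def phi(frames):
    """Concatenation function: stack the last 4 frames into one state."""
    return tuple(frames)

random.seed(0)
D = ReplayStack(capacity=1000)
frames = deque(maxlen=4)                 # sliding window of raw frames
prev_state = None
for t in range(8):                       # fake scalar "frames" for brevity
    frames.append(t)
    if len(frames) == 4:                 # a state exists once 4 frames are seen
        state = phi(frames)
        if prev_state is not None:
            D.store(prev_state, a=0, r=0.0, s_next=state)
        prev_state = state

batch = D.sample(2)                      # a uniform minibatch from D
```

Uniform sampling over a large D is what breaks the temporal correlation of consecutive transitions; consecutive sampling would feed the network nearly identical states in every minibatch.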

Here, when the next state s' is input into the CNN, the value vector Q_2 can be obtained. Then we define a label vector related to Q_1 which represents the target vector. The two vectors have only one different component, that is, r + γ Q(s', a') − Q(s, a). Now the whole scheme of DRL based on SARSA learning is presented in Algorithm 1. It should be noted that during training, the next action a' for estimating the current state-action value is never purely greedy; on the contrary, there is a tiny probability that a random action is chosen.

Algorithm 1 Deep Reinforcement Learning based on SARSA
1: initialize data stack D with size N and the parameters θ of the CNN
2: for episode = 1, M do
3:     initialize state x_1 and preprocess the state s_1 = φ(x_1)
4:     select a_1 with the ε-greedy method
5:     for t = 1, T do
6:         take action a_t, observe the next state x_{t+1} and reward r_t, and set s_{t+1} = φ(x_{t+1})
7:         store the data (s_t, a_t, r_t, s_{t+1}) into stack D
8:         sample data (s_j, a_j, r_j, s_{j+1}) from stack D and select a' with the ε-greedy method
9:         y_j = r_j if the episode terminates at step j+1; otherwise y_j = r_j + γ Q(s_{j+1}, a'; θ)
10:        optimize the loss function L(θ) according to (7)
11:    end for
12: end for

IV. EXPERIMENTS AND RESULTS

In this section, two simulation experiments are presented to verify our algorithm. The two video games, called breakout and seaquest, are from the Atari 2600; Fig. 3 shows images of the two games. The CNN contains 3 convolutional layers and two fully connected layers. The discount factor γ and all the other settings in these two experiments are the same as for DQN [14], except for the RL method. Every 250 thousand steps, the agent is tested, and every testing epoch is 125 thousand steps.

Fig. 3 Two video games: breakout and seaquest.

A. Breakout

In breakout, 5 basic actions are given, including up, down, left, right and null. The operation image is like the left of Fig. 3. This game expects the agent to obtain as many scores as possible. The agent controls a paddle which reflects the ball. Once the ball hits a brick in the top area, the agent gets 1 point. If the ball falls down, the number of lives is subtracted by 1, until the game is over.

Fig. 4 and Fig. 5 present the average scores with deep SARSA learning and deep Q learning. We can see that at the end of the 20th epoch, deep SARSA learning reaches an average reward of about 100, while deep Q learning reaches about 170. We can conclude that in the early stage of training, deep SARSA learning converges more slowly than deep Q learning. However, after 30 epochs, deep SARSA learning gains higher average scores. In addition, deep SARSA learning converges more stably than deep Q learning.

Fig. 4 Average scores with deep SARSA learning in Breakout

Fig. 5 Average scores with deep Q learning in Breakout
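For concreteness, the control flow of Algorithm 1 can be sketched with a tabular value function on a toy chain MDP. Everything here is an illustrative stand-in (the environment, ε, α, γ, batch size and episode count are assumptions); the paper's agent replaces the table with a CNN over stacked Atari frames and optimizes the loss (6) by SGD:

```python
import random

# Toy stand-in for ALE: a 3-state chain where action 1 moves right and
# action 0 moves left; reaching state 2 yields reward 1 and ends the episode.
N_STATES, ACTIONS = 3, (0, 1)
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.1

def env_step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

def eps_greedy(Q, s):
    # never purely greedy: a tiny probability of a random action remains
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
D = []                                   # historical data stack D
for episode in range(150):               # "for episode = 1, M do"
    s, done = 0, False
    while not done:
        a = eps_greedy(Q, s)             # select a with the eps-greedy method
        s2, r, done = env_step(s, a)
        D.append((s, a, r, s2, done))    # store the transition into D
        # sample a small batch uniformly from D and apply the SARSA target
        for sj, aj, rj, sj2, dj in random.sample(D, min(8, len(D))):
            if dj:
                y = rj                   # y_j = r_j at termination
            else:
                y = rj + GAMMA * Q[(sj2, eps_greedy(Q, sj2))]  # on-policy a'
            Q[(sj, aj)] += ALPHA * (y - Q[(sj, aj)])  # descend (y - Q)^2
        s = s2
```

After training, the greedy action in each non-terminal state points toward the rewarding end of the chain; because the replayed next action a' is itself ε-greedy, the update stays on-policy, unlike the max operator in DQN.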

The numbers of games during testing with the two algorithms are displayed in Fig. 6 and Fig. 7, which reflect the convergence trend of the algorithms. After training for 20 epochs, deep SARSA learning also converges, to an equilibrium point at about 75; in deep Q learning, the equilibrium point is about 80.

Fig. 6 Number of games during testing with deep SARSA learning in Breakout

Fig. 7 Number of games during testing with deep Q learning in Breakout

B. Seaquest

In seaquest, 5 basic actions are given, including up, down, left, right and firing. The operation image is shown in the right of Fig. 3. This game expects the agent to obtain as many scores as possible by saving divers and killing fish. The agent controls the submarine with the five basic actions mentioned above. Once the submarine saves a diver or kills a fish, the agent gets 20 or 40 points, respectively. If the submarine runs into a fish or the oxygen in the submarine drops to 0, the number of lives drops by 1, until the game is over. So if a human plays this game, the quantity of oxygen should also be taken into consideration.

Fig. 8 and Fig. 9 show the average scores of deep SARSA learning and deep Q learning. We can see that the score of deep SARSA learning increases a little more slowly before the 10th epoch than that of deep Q learning. However, it converges much faster after the 30th epoch. At last deep SARSA learning gains about 5000 points while deep Q learning only gets about 3700 points. The numbers of games during testing with the two algorithms are shown in Fig. 10 and Fig. 11, which also reflect the trend of the DRL process. Deep SARSA learning even shows a smoother process in this video game than deep Q learning.

Fig. 8 Average scores with deep SARSA learning in Seaquest

Fig. 9 Average scores with deep Q learning in Seaquest

Fig. 10 Number of games during testing with deep SARSA learning in Seaquest

Fig. 11 Number of games during testing with deep Q learning in Seaquest

V. CONCLUSION

In this paper, we introduce an on-policy method, SARSA learning, to DRL. SARSA learning has some advantages when applied to decision-making problems: it makes the learning process more stable and more suitable for some complicated systems. Given these facts, a new DRL algorithm based on SARSA, called deep SARSA learning, is proposed to solve the control problem of video games. Two simulation experiments are given to compare the performance of deep SARSA learning and deep Q learning. In Section IV, the results reveal that deep SARSA learning gains higher scores and faster convergence in breakout and seaquest than deep Q learning.

REFERENCES
[1] LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. Nature, 2015, 521(7553).
[2] LeCun, Y., L. Bottou, and Y. Bengio, Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11).
[3] Farabet, C., C. Couprie, L. Najman, and Y. LeCun, Scene parsing with multiscale feature learning, purity trees, and optimal covers. arXiv preprint.
[4] Sutton, R.S. and A.G. Barto, Introduction to Reinforcement Learning. 1998: MIT Press.
[5] Wang, F.Y., H. Zhang, and D. Liu, Adaptive dynamic programming: an introduction. IEEE Computational Intelligence Magazine, 2009, 4(2).
[6] Zhao, D. and Y. Zhu, MEC: a near-optimal online reinforcement learning algorithm for continuous deterministic systems. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(2).
[7] Zhu, Y., D. Zhao, and X. Li, Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problems without system dynamics. IET Control Theory & Applications, 2016, 10(12).
[8] Zhu, Y. and D. Zhao, A data-based online reinforcement learning algorithm satisfying probably approximately correct principle. Neural Computing and Applications, 2015, 26(4).
[9] Xia, Z. and D. Zhao, Online Bayesian reinforcement learning by Gaussian processes. IET Control Theory & Applications, (12).
[10] Lange, S. and M. Riedmiller, Deep auto-encoder neural networks in reinforcement learning, in The 2010 International Joint Conference on Neural Networks (IJCNN).
[11] Abtahi, F. and I. Fasel, Deep belief nets as function approximators for reinforcement learning, in Proceedings of IEEE ICDL-EPIROB.
[12] Arel, I., Deep reinforcement learning as a foundation for artificial general intelligence, in Theoretical Foundations of Artificial General Intelligence. 2012, Springer.
[13] Mnih, V., K. Kavukcuoglu, D. Silver, et al., Playing Atari with deep reinforcement learning. arXiv preprint.
[14] Mnih, V., K. Kavukcuoglu, D. Silver, et al., Human-level control through deep reinforcement learning. Nature, 2015, 518(7540).
[15] Bellemare, M.G., Y. Naddaf, J. Veness, et al., The arcade learning environment: an evaluation platform for general agents. Journal of Artificial Intelligence Research, 2012, 47.
[16] Levine, S., Exploring deep and recurrent architectures for optimal control. arXiv preprint.
[17] Silver, D., A. Huang, C.J. Maddison, et al., Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529(7587).
[18] Defazio, A. and T. Graepel, A comparison of learning algorithms on the Arcade Learning Environment. arXiv preprint.
[19] Zeiler, M.D., ADADELTA: an adaptive learning rate method. arXiv preprint.
[20] Lin, L.J., Reinforcement learning for robots using neural networks. 1993, Technical report: DTIC Document.


Specification -- Assumptions of the Simple Classical Linear Regression Model (CLRM) 1. Introduction ECONOMICS 35* -- NOTE ECON 35* -- NOTE Specfcaton -- Aumpton of the Smple Clacal Lnear Regreon Model (CLRM). Introducton CLRM tand for the Clacal Lnear Regreon Model. The CLRM alo known a the tandard lnear

More information

Boostrapaggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1]

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1] DYNAMIC SHORTEST PATH SEARCH AND SYNCHRONIZED TASK SWITCHING Jay Wagenpfel, Adran Trachte 2 Outlne Shortest Communcaton Path Searchng Bellmann Ford algorthm Algorthm for dynamc case Modfcatons to our algorthm

More information

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm Desgn and Optmzaton of Fuzzy Controller for Inverse Pendulum System Usng Genetc Algorthm H. Mehraban A. Ashoor Unversty of Tehran Unversty of Tehran h.mehraban@ece.ut.ac.r a.ashoor@ece.ut.ac.r Abstract:

More information

Semi-supervised Classification with Active Query Selection

Semi-supervised Classification with Active Query Selection Sem-supervsed Classfcaton wth Actve Query Selecton Jao Wang and Swe Luo School of Computer and Informaton Technology, Beng Jaotong Unversty, Beng 00044, Chna Wangjao088@63.com Abstract. Labeled samples

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Operating conditions of a mine fan under conditions of variable resistance

Operating conditions of a mine fan under conditions of variable resistance Paper No. 11 ISMS 216 Operatng condtons of a mne fan under condtons of varable resstance Zhang Ynghua a, Chen L a, b, Huang Zhan a, *, Gao Yukun a a State Key Laboratory of Hgh-Effcent Mnng and Safety

More information

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester 0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Visual tracking via saliency weighted sparse coding appearance model

Visual tracking via saliency weighted sparse coding appearance model 2014 22nd Internatonal Conference on Pattern Recognton Vual trackng va alency weghted pare codng appearance model Wany L, Peng Wang Reearch Center of Precon Senng and Control Inttute of Automaton, Chnee

More information

MARKOV decision process (MDP) is a long-standing

MARKOV decision process (MDP) is a long-standing 2038 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 24, NO. 12, DECEMBER 2013 Goal Representaton Heurstc Dynamc Programmng on Maze Navgaton Zhen N, Habo He, Senor Member, IEEE, Jnyu Wen,

More information

Multigradient for Neural Networks for Equalizers 1

Multigradient for Neural Networks for Equalizers 1 Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT

More information

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning Journal of Machne Learnng Research 00-9 Submtted /0; Publshed 7/ Erratum: A Generalzed Path Integral Control Approach to Renforcement Learnng Evangelos ATheodorou Jonas Buchl Stefan Schaal Department of

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Using Immune Genetic Algorithm to Optimize BP Neural Network and Its Application Peng-fei LIU1,Qun-tai SHEN1 and Jun ZHI2,*

Using Immune Genetic Algorithm to Optimize BP Neural Network and Its Application Peng-fei LIU1,Qun-tai SHEN1 and Jun ZHI2,* Advances n Computer Scence Research (ACRS), volume 54 Internatonal Conference on Computer Networks and Communcaton Technology (CNCT206) Usng Immune Genetc Algorthm to Optmze BP Neural Network and Its Applcaton

More information

MODELLING OF STOCHASTIC PARAMETERS FOR CONTROL OF CITY ELECTRIC TRANSPORT SYSTEMS USING EVOLUTIONARY ALGORITHM

MODELLING OF STOCHASTIC PARAMETERS FOR CONTROL OF CITY ELECTRIC TRANSPORT SYSTEMS USING EVOLUTIONARY ALGORITHM MODELLING OF STOCHASTIC PARAMETERS FOR CONTROL OF CITY ELECTRIC TRANSPORT SYSTEMS USING EVOLUTIONARY ALGORITHM Mkhal Gorobetz, Anatoly Levchenkov Inttute of Indutral Electronc and Electrotechnc, Rga Techncal

More information

MiniBooNE Event Reconstruction and Particle Identification

MiniBooNE Event Reconstruction and Particle Identification MnBooNE Event Recontructon and Partcle Identfcaton Ha-Jun Yang Unverty of Mchgan, Ann Arbor (for the MnBooNE Collaboraton) DNP06, Nahvlle, TN October 25-28, 2006 Outlne Phyc Motvaton MnBooNE Event Type

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

CHAPTER 9 LINEAR MOMENTUM, IMPULSE AND COLLISIONS

CHAPTER 9 LINEAR MOMENTUM, IMPULSE AND COLLISIONS CHAPTER 9 LINEAR MOMENTUM, IMPULSE AND COLLISIONS 103 Phy 1 9.1 Lnear Momentum The prncple o energy conervaton can be ued to olve problem that are harder to olve jut ung Newton law. It ued to decrbe moton

More information

Module 5. Cables and Arches. Version 2 CE IIT, Kharagpur

Module 5. Cables and Arches. Version 2 CE IIT, Kharagpur odule 5 Cable and Arche Veron CE IIT, Kharagpur Leon 33 Two-nged Arch Veron CE IIT, Kharagpur Intructonal Objectve: After readng th chapter the tudent wll be able to 1. Compute horzontal reacton n two-hnged

More information

Discrete Simultaneous Perturbation Stochastic Approximation on Loss Function with Noisy Measurements

Discrete Simultaneous Perturbation Stochastic Approximation on Loss Function with Noisy Measurements 0 Amercan Control Conference on O'Farrell Street San Francco CA USA June 9 - July 0 0 Dcrete Smultaneou Perturbaton Stochatc Approxmaton on Lo Functon wth Noy Meaurement Q Wang and Jame C Spall Abtract

More information

EXPERT CONTROL BASED ON NEURAL NETWORKS FOR CONTROLLING GREENHOUSE ENVIRONMENT

EXPERT CONTROL BASED ON NEURAL NETWORKS FOR CONTROLLING GREENHOUSE ENVIRONMENT EXPERT CONTROL BASED ON NEURAL NETWORKS FOR CONTROLLING GREENHOUSE ENVIRONMENT Le Du Bejng Insttute of Technology, Bejng, 100081, Chna Abstract: Keyords: Dependng upon the nonlnear feature beteen neural

More information

On the SO 2 Problem in Thermal Power Plants. 2.Two-steps chemical absorption modeling

On the SO 2 Problem in Thermal Power Plants. 2.Two-steps chemical absorption modeling Internatonal Journal of Engneerng Reearch ISSN:39-689)(onlne),347-53(prnt) Volume No4, Iue No, pp : 557-56 Oct 5 On the SO Problem n Thermal Power Plant Two-tep chemcal aborpton modelng hr Boyadjev, P

More information

PROBABILITY-CONSISTENT SCENARIO EARTHQUAKE AND ITS APPLICATION IN ESTIMATION OF GROUND MOTIONS

PROBABILITY-CONSISTENT SCENARIO EARTHQUAKE AND ITS APPLICATION IN ESTIMATION OF GROUND MOTIONS PROBABILITY-COSISTET SCEARIO EARTHQUAKE AD ITS APPLICATIO I ESTIATIO OF GROUD OTIOS Q-feng LUO SUARY Th paper preent a new defnton of probablty-content cenaro earthquae PCSE and an evaluaton method of

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Neural Networks & Learning

Neural Networks & Learning Neural Netorks & Learnng. Introducton The basc prelmnares nvolved n the Artfcal Neural Netorks (ANN) are descrbed n secton. An Artfcal Neural Netorks (ANN) s an nformaton-processng paradgm that nspred

More information

Confidence intervals for the difference and the ratio of Lognormal means with bounded parameters

Confidence intervals for the difference and the ratio of Lognormal means with bounded parameters Songklanakarn J. Sc. Technol. 37 () 3-40 Mar.-Apr. 05 http://www.jt.pu.ac.th Orgnal Artcle Confdence nterval for the dfference and the rato of Lognormal mean wth bounded parameter Sa-aat Nwtpong* Department

More information

Support Vector Machines. Vibhav Gogate The University of Texas at dallas

Support Vector Machines. Vibhav Gogate The University of Texas at dallas Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest

More information

Speeding up Computation of Scalar Multiplication in Elliptic Curve Cryptosystem

Speeding up Computation of Scalar Multiplication in Elliptic Curve Cryptosystem H.K. Pathak et. al. / (IJCSE) Internatonal Journal on Computer Scence and Engneerng Speedng up Computaton of Scalar Multplcaton n Ellptc Curve Cryptosystem H. K. Pathak Manju Sangh S.o.S n Computer scence

More information

Introduction to the Introduction to Artificial Neural Network

Introduction to the Introduction to Artificial Neural Network Introducton to the Introducton to Artfcal Neural Netork Vuong Le th Hao Tang s sldes Part of the content of the sldes are from the Internet (possbly th modfcatons). The lecturer does not clam any onershp

More information

Temperature. Chapter Heat Engine

Temperature. Chapter Heat Engine Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

A New Scrambling Evaluation Scheme based on Spatial Distribution Entropy and Centroid Difference of Bit-plane

A New Scrambling Evaluation Scheme based on Spatial Distribution Entropy and Centroid Difference of Bit-plane A New Scramblng Evaluaton Scheme based on Spatal Dstrbuton Entropy and Centrod Dfference of Bt-plane Lang Zhao *, Avshek Adhkar Kouch Sakura * * Graduate School of Informaton Scence and Electrcal Engneerng,

More information

A METHOD TO REPRESENT THE SEMANTIC DESCRIPTION OF A WEB SERVICE BASED ON COMPLEXITY FUNCTIONS

A METHOD TO REPRESENT THE SEMANTIC DESCRIPTION OF A WEB SERVICE BASED ON COMPLEXITY FUNCTIONS UPB Sc Bull, Sere A, Vol 77, I, 5 ISSN 3-77 A METHOD TO REPRESENT THE SEMANTIC DESCRIPTION OF A WEB SERVICE BASED ON COMPLEXITY FUNCTIONS Andre-Hora MOGOS, Adna Magda FLOREA Semantc web ervce repreent

More information

An Improved multiple fractal algorithm

An Improved multiple fractal algorithm Advanced Scence and Technology Letters Vol.31 (MulGraB 213), pp.184-188 http://dx.do.org/1.1427/astl.213.31.41 An Improved multple fractal algorthm Yun Ln, Xaochu Xu, Jnfeng Pang College of Informaton

More information

Adaptive Consensus Control of Multi-Agent Systems with Large Uncertainty and Time Delays *

Adaptive Consensus Control of Multi-Agent Systems with Large Uncertainty and Time Delays * Journal of Robotcs, etworkng and Artfcal Lfe, Vol., o. (September 04), 5-9 Adaptve Consensus Control of Mult-Agent Systems wth Large Uncertanty and me Delays * L Lu School of Mechancal Engneerng Unversty

More information

Air Age Equation Parameterized by Ventilation Grouped Time WU Wen-zhong

Air Age Equation Parameterized by Ventilation Grouped Time WU Wen-zhong Appled Mechancs and Materals Submtted: 2014-05-07 ISSN: 1662-7482, Vols. 587-589, pp 449-452 Accepted: 2014-05-10 do:10.4028/www.scentfc.net/amm.587-589.449 Onlne: 2014-07-04 2014 Trans Tech Publcatons,

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

ECE559VV Project Report

ECE559VV Project Report ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate

More information

Building A Fuzzy Inference System By An Extended Rule Based Q-Learning

Building A Fuzzy Inference System By An Extended Rule Based Q-Learning Buldng A Fuzzy Inference System By An Extended Rule Based Q-Learnng Mn-Soeng Km, Sun-G Hong and Ju-Jang Lee * Dept. of Electrcal Engneerng and Computer Scence, KAIST 373- Kusung-Dong Yusong-Ku Taejon 35-7,

More information

Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space

Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space 2013 IEEE Conference on Computer Von and Pattern Recognton Dcrete MRF Inference of Margnal Dente for Non-unformly Dcretzed Varable Space Maak Sato Takayuk Okatan Kochro Deguch Tohoku Unverty, Japan {mato,

More information

Application research on rough set -neural network in the fault diagnosis system of ball mill

Application research on rough set -neural network in the fault diagnosis system of ball mill Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(4):834-838 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 Applcaton research on rough set -neural network n the

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Introduction to Interfacial Segregation. Xiaozhe Zhang 10/02/2015

Introduction to Interfacial Segregation. Xiaozhe Zhang 10/02/2015 Introducton to Interfacal Segregaton Xaozhe Zhang 10/02/2015 Interfacal egregaton Segregaton n materal refer to the enrchment of a materal conttuent at a free urface or an nternal nterface of a materal.

More information

Why BP Works STAT 232B

Why BP Works STAT 232B Why BP Works STAT 232B Free Energes Helmholz & Gbbs Free Energes 1 Dstance between Probablstc Models - K-L dvergence b{ KL b{ p{ = b{ ln { } p{ Here, p{ s the eact ont prob. b{ s the appromaton, called

More information

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k)

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k) ISSN 1749-3889 (prnt), 1749-3897 (onlne) Internatonal Journal of Nonlnear Scence Vol.17(2014) No.2,pp.188-192 Modfed Block Jacob-Davdson Method for Solvng Large Sparse Egenproblems Hongy Mao, College of

More information

APPROXIMATE FUZZY REASONING BASED ON INTERPOLATION IN THE VAGUE ENVIRONMENT OF THE FUZZY RULEBASE AS A PRACTICAL ALTERNATIVE OF THE CLASSICAL CRI

APPROXIMATE FUZZY REASONING BASED ON INTERPOLATION IN THE VAGUE ENVIRONMENT OF THE FUZZY RULEBASE AS A PRACTICAL ALTERNATIVE OF THE CLASSICAL CRI Kovác, Sz., Kóczy, L.T.: Approxmate Fuzzy Reaonng Baed on Interpolaton n the Vague Envronment of the Fuzzy Rulebae a a Practcal Alternatve of the Clacal CRI, Proceedng of the 7 th Internatonal Fuzzy Sytem

More information

Deep Belief Network using Reinforcement Learning and its Applications to Time Series Forecasting

Deep Belief Network using Reinforcement Learning and its Applications to Time Series Forecasting Deep Belef Network usng Renforcement Learnng and ts Applcatons to Tme Seres Forecastng Takaom HIRATA, Takash KUREMOTO, Masanao OBAYASHI, Shngo MABU Graduate School of Scence and Engneerng Yamaguch Unversty

More information

CHAPTER IV RESEARCH FINDING AND ANALYSIS

CHAPTER IV RESEARCH FINDING AND ANALYSIS CHAPTER IV REEARCH FINDING AND ANALYI A. Descrpton of Research Fndngs To fnd out the dfference between the students who were taught by usng Mme Game and the students who were not taught by usng Mme Game

More information

A New Virtual Indexing Method for Measuring Host Connection Degrees

A New Virtual Indexing Method for Measuring Host Connection Degrees A New Vrtual Indexng Method for Meaurng ot Connecton Degree Pnghu Wang, Xaohong Guan,, Webo Gong 3, and Don Towley 4 SKLMS Lab and MOE KLINNS Lab, X an Jaotong Unverty, X an, Chna Department of Automaton

More information

Other NN Models. Reinforcement learning (RL) Probabilistic neural networks

Other NN Models. Reinforcement learning (RL) Probabilistic neural networks Other NN Models Renforcement learnng (RL) Probablstc neural networks Support vector machne (SVM) Renforcement learnng g( (RL) Basc deas: Supervsed dlearnng: (delta rule, BP) Samples (x, f(x)) to learn

More information

Feature Selection in Multi-instance Learning

Feature Selection in Multi-instance Learning The Nnth Internatonal Symposum on Operatons Research and Its Applcatons (ISORA 10) Chengdu-Juzhagou, Chna, August 19 23, 2010 Copyrght 2010 ORSC & APORC, pp. 462 469 Feature Selecton n Mult-nstance Learnng

More information

Sparse Gaussian Processes Using Backward Elimination

Sparse Gaussian Processes Using Backward Elimination Sparse Gaussan Processes Usng Backward Elmnaton Lefeng Bo, Lng Wang, and Lcheng Jao Insttute of Intellgent Informaton Processng and Natonal Key Laboratory for Radar Sgnal Processng, Xdan Unversty, X an

More information

Message modification, neutral bits and boomerangs

Message modification, neutral bits and boomerangs Message modfcaton, neutral bts and boomerangs From whch round should we start countng n SHA? Antone Joux DGA and Unversty of Versalles St-Quentn-en-Yvelnes France Jont work wth Thomas Peyrn 1 Dfferental

More information

state Environment reinforcement

state Environment reinforcement Tunng Fuzzy Inference Syems by Q-Learnng Mohamed Boumehraz*, Kher Benmahammed** *Laboratore MSE, Département Electronque, Unveré de Bskra, boumehraz_m@yahoo.fr ** Département Electronque, Unveré de Setf

More information

Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP) Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne

More information

A NUMERICAL MODELING OF MAGNETIC FIELD PERTURBATED BY THE PRESENCE OF SCHIP S HULL

A NUMERICAL MODELING OF MAGNETIC FIELD PERTURBATED BY THE PRESENCE OF SCHIP S HULL A NUMERCAL MODELNG OF MAGNETC FELD PERTURBATED BY THE PRESENCE OF SCHP S HULL M. Dennah* Z. Abd** * Laboratory Electromagnetc Sytem EMP BP b Ben-Aknoun 606 Alger Algera ** Electronc nttute USTHB Alger

More information

arxiv: v1 [cs.gt] 15 Jan 2019

arxiv: v1 [cs.gt] 15 Jan 2019 Model and algorthm for tme-content rk-aware Markov game Wenje Huang, Pham Vet Ha and Wllam B. Hakell January 16, 2019 arxv:1901.04882v1 [c.gt] 15 Jan 2019 Abtract In th paper, we propoe a model for non-cooperatve

More information

APPLICATION OF RBF NEURAL NETWORK IMPROVED BY PSO ALGORITHM IN FAULT DIAGNOSIS

APPLICATION OF RBF NEURAL NETWORK IMPROVED BY PSO ALGORITHM IN FAULT DIAGNOSIS Journal of Theoretcal and Appled Informaton Technology 005-01 JATIT & LLS. All rghts reserved. ISSN: 199-8645 www.jatt.org E-ISSN: 1817-3195 APPLICATION OF RBF NEURAL NETWORK IMPROVED BY PSO ALGORITHM

More information

Wavelet chaotic neural networks and their application to continuous function optimization

Wavelet chaotic neural networks and their application to continuous function optimization Vol., No.3, 04-09 (009) do:0.436/ns.009.307 Natural Scence Wavelet chaotc neural networks and ther applcaton to contnuous functon optmzaton Ja-Ha Zhang, Yao-Qun Xu College of Electrcal and Automatc Engneerng,

More information