Tuning Fuzzy Inference Systems by Q-Learning

Mohamed Boumehraz*, Khier Benmahammed**
*Laboratoire MSE, Département Electronique, Université de Biskra
**Département Electronique, Université de Sétif

Keywords: reinforcement learning, fuzzy inference systems, Q-learning

Abstract: Fuzzy rules for control can be effectively tuned via reinforcement learning. Reinforcement learning is a weak learning method which only requires information on the success or failure of the control application. In this paper a reinforcement learning method is used to tune online the conclusion part of fuzzy inference system rules. The fuzzy rules are tuned in order to maximize the return function. To illustrate its effectiveness, the learning method is applied to the well-known cart-pole balancing problem. The results obtained show significant improvements in the speed of learning.

1. Introduction

Reinforcement learning (RL) refers to a class of learning tasks and algorithms in which the learning system learns an associative mapping by maximizing a scalar evaluation (reinforcement) of its performance from the environment [1,2,3]. Compared to supervised learning, RL is more difficult since it has to work with much less information. Fuzzy inference systems have been shown able to provide excellent control in a number of practical applications. However, the problem in fuzzy systems is how to define the appropriate fuzzy rules. Several approaches have been proposed to automatically extract rules from data: gradient descent [4], fuzzy clustering, genetic algorithms [5,6] and reinforcement learning [7,8,9,10,11]. In this paper we use Q-learning to determine the appropriate conclusions for a Mamdani fuzzy inference system. We assume that the structure of the fuzzy system and the membership functions are specified a priori.

2. Reinforcement Learning

2.1 Reinforcement learning model

In reinforcement learning an agent learns to optimize an interaction with a dynamic environment through trial and error. The agent receives a scalar value or reward with every action it executes.
The goal of the agent is to learn a strategy for selecting actions such that the expected sum of discounted rewards is maximized [1]. In the standard reinforcement learning model, an agent is connected to its environment via perception and action, as depicted in figure 1. At any given time step $t$, the agent perceives the state $s_t$ of the environment and selects an action $a_t$. The environment responds by giving the agent a scalar reinforcement signal, $r(s_t)$, and changing into state $s_{t+1}$. The agent should choose actions that tend to increase the long-run sum of values of the reinforcement signal. It can learn to do this over time by systematic trial and error, guided by a wide variety of algorithms. The agent's goal is to find an optimal policy, $\pi : S \to A$, which maps states to actions and maximizes some long-run measure of reinforcement. In the general case of the reinforcement learning problem, the agent's actions determine not only its immediate rewards, but also the next state of the environment. As a result, when taking actions, the agent has to take the future into account.

Figure 1. Reinforcement learning scheme (the agent sends an action to the environment; the environment returns a state and a reinforcement)

Reinforcement learning can be summarized in the following steps:

Initialize the learning system
repeat
  1. With the system in state s, choose an action a according to an exploration policy and apply it to the system.
  2. The system returns a reward r and yields the next state s'.
  3. Use the experience (s, a, r, s') to update the learning system.
  4. s ← s'
until s is terminal

2.2 The return function

The agent's goal is to maximize the accumulated future rewards. The return function, or the return, $R(t)$, is a long-term measure of rewards. We have to specify how the agent should take the future into account in the decisions it makes about how to select an action now. There are three models that have been the subject of the majority of work in this area.

The finite-horizon model. In this case, the horizon corresponds to a finite number of steps in the future. There exists a terminal state, and the sequence of actions between the initial state and the terminal one is called a period. The return is given by:

$$R(t) = r_t + r_{t+1} + \dots + r_{t+K-1} \tag{1}$$
where $K$ is the number of steps before the terminal state.

The discounted return (infinite-horizon model). In this case the long-run reward is taken into account, but rewards received in the future are geometrically discounted according to a discount factor $\gamma$, $0 < \gamma < 1$, and the criterion becomes:

$$R(t) = \sum_{k=1}^{\infty} \gamma^{k} r_{t+k} \tag{2}$$

The average-reward model. A third criterion, in which the agent is supposed to take actions that optimize its long-run average reward, is also used:

$$R(t) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} r_{t+k} \tag{3}$$

2.3 The state value function or value function

The value function is a mapping from states to state values. The value function $V^{\pi}(s)$ of state $s$, associated with a given policy $\pi(s)$, is defined as [1]:

$$V^{\pi}(s) = E\left\{ \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s \right\} \tag{4}$$

where $s_t$ is the state at time $t$, $r_{t+k+1}$ is the reward received for performing the action

$$a_{t+k} = \pi(s_{t+k}) \tag{5}$$

at time $t+k$, and $\gamma$ is the discount factor ($0 < \gamma < 1$).

2.4 Action-value function or Q-function

The action-value function measures the expected return of executing action $a_t$ at state $s_t$, and then following the policy $\pi$ for selecting actions in subsequent states. The Q-function corresponding to policy $\pi(s)$ is defined as [1]:

$$Q^{\pi}(s_t, a_t) = r_{t+1} + \gamma\, Q^{\pi}(s_{t+1}, a_{t+1}) \tag{6}$$

The advantage of using the Q-function is that the agent is able to perform a one-step lookahead search without knowing the one-step reward and dynamics functions. The disadvantage is that the domain of the Q-function grows from the domain of states $S$ to the domain of state-action pairs $(s, a)$.

3. Reinforcement Learning Methods

3.1 Q-Learning

There exist several approaches for reinforcement learning without models. Some are based on policy iteration, such as Actor-Critic learning, and others on value iteration, such as Q-learning or SARSA. Q-learning, proposed by Watkins [12], is perhaps the most popular of these algorithms because of its simplicity.

One-step Q-Learning. The first version of Q-learning is based on temporal differences of order 0, TD(0), considering only the following step (one-step Q-learning). The agent observes the present state, $s_t$, and executes an action, $a_t$, according to the evaluation of the return that it makes at this stage.
It updates its evaluation of the value of the action while taking into account (a) the immediate reinforcement, $r_{t+1}$, and (b) the estimated value of the new state, $V_t(s_{t+1})$, which is defined by:

$$V_t(s_{t+1}) = \max_{b \in A} Q_t(s_{t+1}, b) \tag{7}$$

The update corresponds to the equation:

$$Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \beta \left[ r_{t+1} + \gamma V_t(s_{t+1}) - Q_t(s_t, a_t) \right] \tag{8}$$

where $\beta$ is a learning rate such that $\beta \to 0$ as $t \to \infty$.

In addition to its simplicity, Q-learning presents several interesting characteristics:
- The evaluations of Q, the Q-values, are independent of the policy followed by the agent. The agent can follow any policy while continuing to construct correct evaluations of the value of actions.
- Q-values are exploitable long before formal convergence, which can sometimes be very slow.
- Lastly, there are proofs of convergence toward the optimal policy [12].

4. Optimization of fuzzy inference systems by Q-Learning

Reinforcement learning has been used for the optimization of fuzzy inference systems by two types of methods: methods based on policy iteration, leading to Actor-Critic architectures [7,8], and methods based on value iteration, which generalize Q-learning [9,10,11]. In [11] Glorennec uses Q-learning for the optimization of a zero-order Takagi-Sugeno FIS with constant conclusions. If the action space is continuous, the conclusions are equally distributed between the lower and upper bounds of the action. In this paper, we consider a Mamdani FIS and continuous state and action spaces. The FIS structure is fixed a priori by the user, and the fuzzy sets for the inputs and output are supposed fixed. Our approach consists in determining the optimal conclusions of the fuzzy inference system.

4.1 Mamdani fuzzy inference system

A Mamdani inference system is described by a set of fuzzy rules of the form [13]:

Rule i: if s is $A_i$ then a is $B_i$

where $s$ is the fuzzy system input, $A_i$ is a fuzzy label for the input in the i-th rule, $a$ is the output of the fuzzy system and $B_i$ is the fuzzy label for the output in the i-th rule. The problem is how to choose the appropriate rules in order to optimize system performance (in RL, to maximize the accumulated future rewards) [13]. In this paper we use Q-learning to optimize rule conclusions.
Several competing conclusions are associated with each rule, and a quality value is assigned to each conclusion. The conclusion with the highest quality is used by the system to generate actions. The fuzzy rule becomes:

Rule i: if s is $A_i$ then a is $\arg\max_{b \in B} Q_i(s, b)$
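The one-step update of equations (7) and (8) and the choice of the highest-quality conclusion can be sketched in a few lines. This is a minimal illustration only: it assumes a small discrete state space held in a NumPy table, whereas the paper works with continuous states, and the names `q_learning_step`, `greedy_conclusion`, `GAMMA` and `BETA` as well as their numeric values are hypothetical.

```python
import numpy as np

GAMMA = 0.9  # discount factor (illustrative value, an assumption)
BETA = 0.1   # learning rate (illustrative value, an assumption)


def q_learning_step(Q, s, a, r, s_next, terminal=False):
    """Apply one one-step Q-learning update to the table Q in place."""
    # Equation (7): the value of the next state is its best Q-value.
    # A terminal next state is given value zero (assumption of this sketch).
    v_next = 0.0 if terminal else float(np.max(Q[s_next]))
    # Equation (8): move Q(s, a) toward the one-step target.
    Q[s, a] += BETA * (r + GAMMA * v_next - Q[s, a])
    return Q


def greedy_conclusion(q_rule):
    """Index of the competing conclusion with the highest quality value."""
    return int(np.argmax(q_rule))


# Toy usage: 3 discrete states, 2 actions, all Q-values start at zero.
Q = np.zeros((3, 2))
q_learning_step(Q, s=0, a=1, r=1.0, s_next=2)
best = greedy_conclusion(Q[0])
```

In the fuzzy setting each rule keeps its own row of quality values, one per candidate conclusion, and `greedy_conclusion` plays the role of the argmax in the rule above.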
4.2 Learning process

Several conclusions are associated with each rule, and each conclusion has a Q-value. The fuzzy rule is of the form:

Rule i: if s is $A_i$ then a is $B_1$ with $Q_i(s, B_1)$
        or a is $B_2$ with $Q_i(s, B_2)$
        or a is $B_3$ with $Q_i(s, B_3)$
        ...
        or a is $B_m$ with $Q_i(s, B_m)$

where $B_1, B_2, \dots, B_m$ are the fuzzy sets of the output and $Q_i(s, B_j)$ is the Q-value of the conclusion "a is $B_j$" of rule i. During learning, the Q-value of each conclusion is updated using Q-learning (equation 8):

$$Q_{i,t+1}(s_t, B_j) = Q_{i,t}(s_t, B_j) + \beta\, \mu_i(s_t) \left[ r_{t+1} + \gamma V_t(s_{t+1}) - Q_{i,t}(s_t, B_j) \right] \tag{9}$$

where $\mu_i(s_t)$ is the truth value of the i-th rule and $B_j$ is the j-th conclusion of the i-th rule. The value of the new state is given by:

$$V_t(s_{t+1}) = \frac{\sum_{i=1}^{N} \mu_i(s_{t+1}) \max_{b \in B} Q_{i,t}(s_{t+1}, b)}{\sum_{i=1}^{N} \mu_i(s_{t+1})} \tag{10}$$

where $N$ is the number of rules. If $s_{t+1}$ is a final state, then:

$$V_t(s_{t+1}) = 0 \tag{11}$$

5. Results

The proposed method is applied to a classic problem: the pole balancing, or inverted pendulum, problem. In this problem a pole is hinged to a motor-driven cart which moves on rail tracks to its right or its left. The primary control task is to keep the pole vertically balanced.

Figure 2. The cart-pole system

The dynamics of the cart-pole system are modeled by the following nonlinear differential equation [7,13]:

$$\ddot{\theta} = \frac{g \sin\theta + \cos\theta \left[ \dfrac{-f - m l \dot{\theta}^{2} \sin\theta}{m_c + m} \right] - \dfrac{\mu_p \dot{\theta}}{m l}}{l \left[ \dfrac{4}{3} - \dfrac{m \cos^{2}\theta}{m_c + m} \right]} \tag{12}$$

where $g$ is the gravity, $m_c$ is the mass of the cart, $m$ is the mass of the pole, $l$ is the half-pole length and $\mu_p$ is the coefficient of friction of the pole on the cart. The sample period is 2 ms. We assume that a failure happens when θ > 4. We also assume that the equations of motion are not known to the controller and that only a vector describing the cart-pole system's state at each time step is known. The inputs of the fuzzy controller are the error $e$ and the error change $\Delta e$:

$$e(k) = \theta(k) \tag{13}$$

$$\Delta e(k) = e(k) - e(k-1) \tag{14}$$

The output is the force $f$ and the Q-values of conclusions. The fuzzy partitions of the inputs and output are described in figure 3.

Figure 3. Membership functions of e, Δe and f

The rule base is chosen arbitrarily and the Q-values of the conclusions are initially set to zero.
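For readers who want to reproduce the benchmark, the pole dynamics can be simulated with a simple Euler integration. The sketch below assumes that equation (12) has the form commonly used for the cart-pole benchmark with pole friction only. All numeric constants are assumptions borrowed from typical cart-pole settings, not values stated in the paper, and the function names are hypothetical.

```python
import math

# Assumed benchmark constants (not taken from the paper).
G = 9.8          # gravity [m/s^2]
M_CART = 1.0     # mass of the cart [kg]
M_POLE = 0.1     # mass of the pole [kg]
HALF_LEN = 0.5   # half-pole length [m]
MU_P = 0.000002  # friction coefficient of the pole on the cart
DT = 0.02        # integration step [s]


def pole_acceleration(theta, theta_dot, force):
    """Angular acceleration of the pole (angles in radians)."""
    total_mass = M_CART + M_POLE
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    numerator = (
        G * sin_t
        + cos_t * (-force - M_POLE * HALF_LEN * theta_dot ** 2 * sin_t) / total_mass
        - MU_P * theta_dot / (M_POLE * HALF_LEN)
    )
    denominator = HALF_LEN * (4.0 / 3.0 - M_POLE * cos_t ** 2 / total_mass)
    return numerator / denominator


def euler_step(theta, theta_dot, force, dt=DT):
    """Advance the pole state by one explicit Euler step."""
    acc = pole_acceleration(theta, theta_dot, force)
    return theta + dt * theta_dot, theta_dot + dt * acc


def controller_inputs(theta, prev_error):
    """Error and error change as in equations (13) and (14)."""
    error = theta
    return error, error - prev_error


# Toy usage: pole released at 0.1 rad with no force applied.
theta, theta_dot = euler_step(0.1, 0.0, 0.0)
```

Explicit Euler is the simplest choice and is adequate for short steps; a higher-order integrator could be substituted without changing the controller.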
We use center-of-area defuzzification and the min operator to implement the premise and implication. A trial in our experiments refers to starting with the cart-pole system set to an initial state and ending with the appearance of a failure signal or successful control of the system for an extended period (1 time steps or 2 seconds). Q-learning was applied to tune the fuzzy rule conclusions. The free constants were $\gamma$ = .9 and $\beta$, set initially to .1 and then decreased. Figure 4 shows the average return per trial during the learning process, and figures 5 and 6 show the response of the system, after learning, for two initial angles, the second equal to 28. It is clear that the average return increases during learning until it reaches a sub-optimal value. The obtained fuzzy controller is able to stabilize the pole over a limited range of angles.
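The learning step of Section 4.2 (equations 9 to 11) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the Q-values are stored in a hypothetical array `q` with one row per rule and one column per candidate conclusion, each rule updates only the conclusion it used, at least one rule is assumed to fire in the next state, and the values of `GAMMA` and `BETA` are illustrative.

```python
import numpy as np

GAMMA = 0.9  # discount factor (illustrative value, an assumption)
BETA = 0.1   # learning rate (illustrative value, an assumption)


def next_state_value(q, mu_next, terminal=False):
    """Equations (10) and (11): firing-strength-weighted best Q-value."""
    if terminal:
        return 0.0
    best_per_rule = q.max(axis=1)  # best conclusion quality of each rule
    # Assumes at least one rule fires, so the denominator is non-zero.
    return float(mu_next @ best_per_rule / mu_next.sum())


def update_conclusions(q, chosen, mu, r, v_next):
    """Equation (9): update the Q-value of the conclusion used by each rule."""
    rules = np.arange(q.shape[0])
    td_error = r + GAMMA * v_next - q[rules, chosen]
    # Each rule learns in proportion to its truth value mu_i(s_t).
    q[rules, chosen] += BETA * mu * td_error
    return q


# Toy usage: 2 rules, 3 candidate conclusions each, Q-values start at zero.
q = np.zeros((2, 3))
mu = np.array([0.75, 0.25])   # truth values of the rules in state s_t
chosen = q.argmax(axis=1)     # greedy conclusion of each rule
v_next = next_state_value(q, mu_next=np.array([0.5, 0.5]))
update_conclusions(q, chosen, mu, r=-1.0, v_next=v_next)
```

Weighting the update by the truth value means that rules which barely fire in the current state are barely changed, which keeps the tuning local to the active region of the input space.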
Figure 4. The average return per trial in the inverted pendulum problem

Figure 5. Angle [°], velocity [°/s] and force [N] versus time [s]

Figure 6. Angle [°], velocity [°/s] and force [N] versus time [s]

Conclusions

In this work we have proposed a new method of optimizing a fuzzy inference system based on Q-learning. The method was applied to the cart-pole system. After learning, the controller is able to stabilize the pendulum. We assume that the structure of the fuzzy system is fixed a priori. The optimization of the membership function parameters and of the number of rules will improve the performance of the proposed method.

References

[1] R. S. Sutton, A. G. Barto, Introduction to Reinforcement Learning, MIT Press/Bradford Books, Cambridge, MA.
[2] V. Gullapalli, Reinforcement learning and its application to control, Ph.D. Thesis, University of Massachusetts, Amherst, MA, USA, 1992.
[3] L. P. Kaelbling, M. L. Littman, A. W. Moore, Reinforcement learning: a survey, Journal of Artificial Intelligence Research 4.
[4] J. R. Jang, Self-Learning Fuzzy Controllers Based on Temporal Back Propagation, IEEE Transactions on Neural Networks, Vol. 3, September.
[5] M. G. Cooper, J. J. Vidal, Genetic Design of Fuzzy Controller, Proceedings of the Second International Conference on Fuzzy Theory and Technology, Durham, NC, October.
[6] A. Bonarini, Evolutionary learning of fuzzy rules: competition and cooperation, in Fuzzy Modeling: Paradigms and Practice, Kluwer Academic Publishers, Norwell, MA.
[7] H. R. Berenji, P. Khedkar, Learning and Tuning Fuzzy Logic Controllers Through Reinforcement, IEEE Transactions on Neural Networks, Vol. 3, September.
[8] M. V. Buijtenen, G. Schram, R. Babuska, B. Verbruggen, Adaptive Fuzzy Control of Satellite Attitude by Reinforcement Learning, IEEE Transactions on Fuzzy Systems, Vol. 6, No. 2, May.
[9] H. R. Berenji, Fuzzy Q-Learning: a new approach for fuzzy dynamic programming, Proceedings of the IEEE International Conference on Fuzzy Systems, NJ.
[10] P. Y. Glorennec, L. Jouffe, Fuzzy Q-Learning, Proceedings of FUZZ-IEEE 97, Barcelona, Spain, July.
[11] P. Y. Glorennec, Reinforcement Learning: an Overview, ESIT 2000, Aachen, Germany, 14-15 September 2000.
[12] C. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, University of Cambridge, England.
[13] K. Passino, S. Yurkovich, Fuzzy Control, Addison Wesley, California, 1998.
Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT
More informationChapter 8. Potential Energy and Conservation of Energy
Chapter 8 Potental Energy and Conservaton of Energy In ths chapter we wll ntroduce the followng concepts: Potental Energy Conservatve and non-conservatve forces Mechancal Energy Conservaton of Mechancal
More informationPop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing
Advanced Scence and Technology Letters, pp.164-168 http://dx.do.org/10.14257/astl.2013 Pop-Clc Nose Detecton Usng Inter-Frame Correlaton for Improved Portable Audtory Sensng Dong Yun Lee, Kwang Myung Jeon,
More informationLecture 21: Numerical methods for pricing American type derivatives
Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)
More informationAsymptotic Quantization: A Method for Determining Zador s Constant
Asymptotc Quantzaton: A Method for Determnng Zador s Constant Joyce Shh Because of the fnte capacty of modern communcaton systems better methods of encodng data are requred. Quantzaton refers to the methods
More informationThe Minimum Universal Cost Flow in an Infeasible Flow Network
Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran
More informationChapter Newton s Method
Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve
More informationON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION
Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION
More informationPhysics 5153 Classical Mechanics. Principle of Virtual Work-1
P. Guterrez 1 Introducton Physcs 5153 Classcal Mechancs Prncple of Vrtual Work The frst varatonal prncple we encounter n mechancs s the prncple of vrtual work. It establshes the equlbrum condton of a mechancal
More informationADVANCED MACHINE LEARNING ADVANCED MACHINE LEARNING
1 ADVANCED ACHINE LEARNING ADVANCED ACHINE LEARNING Non-lnear regresson technques 2 ADVANCED ACHINE LEARNING Regresson: Prncple N ap N-dm. nput x to a contnuous output y. Learn a functon of the type: N
More informationAnalytical Chemistry Calibration Curve Handout
I. Quck-and Drty Excel Tutoral Analytcal Chemstry Calbraton Curve Handout For those of you wth lttle experence wth Excel, I ve provded some key technques that should help you use the program both for problem
More informationMean Field / Variational Approximations
Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but
More informationAdiabatic Sorption of Ammonia-Water System and Depicting in p-t-x Diagram
Adabatc Sorpton of Ammona-Water System and Depctng n p-t-x Dagram J. POSPISIL, Z. SKALA Faculty of Mechancal Engneerng Brno Unversty of Technology Techncka 2, Brno 61669 CZECH REPUBLIC Abstract: - Absorpton
More informationLab 2e Thermal System Response and Effective Heat Transfer Coefficient
58:080 Expermental Engneerng 1 OBJECTIVE Lab 2e Thermal System Response and Effectve Heat Transfer Coeffcent Warnng: though the experment has educatonal objectves (to learn about bolng heat transfer, etc.),
More informationChapter 2 A Class of Robust Solution for Linear Bilevel Programming
Chapter 2 A Class of Robust Soluton for Lnear Blevel Programmng Bo Lu, Bo L and Yan L Abstract Under the way of the centralzed decson-makng, the lnear b-level programmng (BLP) whose coeffcents are supposed
More informationELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM
ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM An elastc wave s a deformaton of the body that travels throughout the body n all drectons. We can examne the deformaton over a perod of tme by fxng our look
More informationMultilayer Perceptron (MLP)
Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne
More informationIntroduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:
CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and
More informationONE DIMENSIONAL TRIANGULAR FIN EXPERIMENT. Technical Advisor: Dr. D.C. Look, Jr. Version: 11/03/00
ONE IMENSIONAL TRIANGULAR FIN EXPERIMENT Techncal Advsor: r..c. Look, Jr. Verson: /3/ 7. GENERAL OJECTIVES a) To understand a one-dmensonal epermental appromaton. b) To understand the art of epermental
More informationThe Feynman path integral
The Feynman path ntegral Aprl 3, 205 Hesenberg and Schrödnger pctures The Schrödnger wave functon places the tme dependence of a physcal system n the state, ψ, t, where the state s a vector n Hlbert space
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationCOMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
More informationSolving Nonlinear Differential Equations by a Neural Network Method
Solvng Nonlnear Dfferental Equatons by a Neural Network Method Luce P. Aarts and Peter Van der Veer Delft Unversty of Technology, Faculty of Cvlengneerng and Geoscences, Secton of Cvlengneerng Informatcs,
More informationAppendix B: Resampling Algorithms
407 Appendx B: Resamplng Algorthms A common problem of all partcle flters s the degeneracy of weghts, whch conssts of the unbounded ncrease of the varance of the mportance weghts ω [ ] of the partcles
More informationInteractive Bi-Level Multi-Objective Integer. Non-linear Programming Problem
Appled Mathematcal Scences Vol 5 0 no 65 3 33 Interactve B-Level Mult-Objectve Integer Non-lnear Programmng Problem O E Emam Department of Informaton Systems aculty of Computer Scence and nformaton Helwan
More informationOnline Classification: Perceptron and Winnow
E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng
More informationDigital Signal Processing
Dgtal Sgnal Processng Dscrete-tme System Analyss Manar Mohasen Offce: F8 Emal: manar.subh@ut.ac.r School of IT Engneerng Revew of Precedent Class Contnuous Sgnal The value of the sgnal s avalable over
More information