A ROLLOUT CONTROL ALGORITHM FOR DISCRETE-TIME STOCHASTIC SYSTEMS

Proceedings of the ASME 2010 Dynamic Systems and Control Conference
DSCC2010
September 12-15, 2010, Cambridge, Massachusetts, USA
DSCC2010-

Andreas A. Malikopoulos
Propulsion System Research Lab
General Motors Global Research & Development
Warren, MI 48090
andreas.malikopoulos@gm.com

ABSTRACT
The growing demand for making autonomous intelligent systems that can learn how to improve their performance while interacting with their environment has induced significant research on computational cognitive models. Computational intelligence, or rationality, can be achieved by modeling a system and the interaction with its environment through actions, perceptions, and associated costs. A widely adopted paradigm for modeling this interaction is the controlled Markov chain. In this context, the problem is formulated as a sequential decision-making process in which an intelligent system has to select those control actions in several time steps to achieve long-term goals. This paper presents a rollout control algorithm that aims to build an online decision-making mechanism for a controlled Markov chain. The algorithm yields a lookahead suboptimal control policy. Under certain conditions, a theoretical bound on its performance can be established.

1. INTRODUCTION
Sequential decision models [1, 2] are mathematical abstractions of situations in which decisions must be made in several decision epochs while incurring a certain cost (or reward) at each epoch. Each decision may influence the circumstances under which future decisions will be made, and thus, the decision maker must balance his/her desire to minimize (maximize) the cost (reward) of the present decision against his/her desire to avoid future situations where high cost is inevitable. A large class of sequential decision-making problems under uncertainty can be solved using dynamic programming (DP) [3]. However, the computational cost of DP in some instances may be prohibitive and can grow intractably as the size of the problem increases.
As an alternative approach to address this issue, Approximate Dynamic Programming (ADP) [4] is employed, providing suboptimal control methods for deterministic and stochastic problems. Rollout algorithms and model predictive control are two major methods within ADP with properties founded on policy iteration. The main idea of rollout algorithms [5-10] is to obtain an improved policy starting from some other suboptimal policy using a one-time policy improvement. It has been proposed by Abramson [11] and by Tesauro and Galperin [12] in the context of game-playing computer programs. In the latter, a backgammon position is evaluated by simulating many games starting from that position and the results are averaged. Model predictive control [13-17] is a popular approach in a variety of control system design contexts, and in particular, in chemical process control. It was motivated by the desire to introduce nonlinearities and constraints into the linear-quadratic control framework, while obtaining a suboptimal but stable closed-loop system.

Other alternatives for approaching these problems have been primarily developed in the field of Reinforcement Learning (RL) [4, 18, 19]. RL has aimed to provide algorithms, founded on DP, for learning suboptimal control policies when analytical methods cannot be used effectively, or the system's state transition probabilities are not known [20]. Although many of these algorithms are eventually guaranteed to find sub-optimal policies in sequential decision-making problems under uncertainty, their use of the accumulated data acquired over the learning process is inefficient, and they require a significant amount of experience to achieve acceptable performance [21]. This requirement arises due to the formation of these algorithms in deriving control policies without learning the system dynamics en route, that is, they do not solve the system identification problem simultaneously.
In addition, RL algorithms are suited to problems in which the system needs to achieve particular goal states, which imposes limitations in employing efficiently these algorithms to solve particular problems.

Copyright 2010 by General Motors. Downloaded 6 Aug 2 to Redistribution subject to ASME license or copyright; see

The Predictive Optimal Decision-making (POD) learning model [22, 23] has aimed to address the system identification problem for a completely unknown system by learning in real time the system's evolution over a varying and unknown finite time horizon. The POD model has been employed in various applications towards making autonomous intelligent systems that can learn to improve their performance over time in stochastic environments. In the cart-pole balancing problem [23], an inverted pendulum was made capable of realizing the balancing control policy and turning into a stable system. In a vehicle cruise control implementation [23], an autonomous cruise controller was developed to learn to maintain the desired vehicle's speed at different road grades. POD has also taken steps toward the development of autonomous intelligent propulsion systems realizing their optimal operation with respect to the driver's driving style [22, 24].

In this paper, a rollout control algorithm that aims to build an online decision-making mechanism for controlled Markov chains is presented. The algorithm can be combined with the POD model to yield a lookahead suboptimal control policy that assesses the system output with respect to alternative control actions and selects those that optimize specified performance criteria. A theoretical bound on its performance is proven in Theorem 4.1, thus establishing that, under certain conditions, the lookahead control policy exists.

The remainder of the paper proceeds as follows: Section 2 establishes the mathematical framework of the controlled Markov chain. Section 3 reviews briefly the Predictive Optimal Decision-making (POD) computational model that aims to learn the transition probabilities and associated costs. Section 4 introduces the rollout control algorithm and formulates the theoretical bound on its performance.
Concluding remarks are presented in Section 5.

2. PROBLEM FORMULATION
The stochastic system model establishes the mathematical framework for the representation of dynamic systems that evolve stochastically over time [2, 25, 26], that is, when incurring a stochastic disturbance or noise at time $k$, $w_k$, in their portrayal. The one-dimensional model is given by an equation of the form

$$s_{k+1} = f_k(s_k, a_k, w_k), \quad k = 0, 1, \ldots \tag{1}$$

where $s_k$ is the system's state that belongs to some state space $S = \{1, 2, \ldots, N\}$, $N \in \mathbb{N}$, $f_k$ is a function that describes how the system's state is updated, $a_k$ is the control action, and $w_k$ is the disturbance at time $k$. The sequence $\{w_k, k \ge 0\}$ is treated as a stochastic process, and the joint probability distribution of the random variables $w_0, w_1, \ldots, w_k$ is unknown for each $k$. The system output is represented by

$$y_k = h_k(s_k, v_k), \quad k = 0, 1, \ldots \tag{2}$$

where $y_k$ is the observation or system's output, $h_k$ is a function that describes how the system output is updated, and $v_k$ is the measurement error or noise. The sequence $\{v_k, k \ge 0\}$ is also considered a stochastic process with unknown probability distribution.

We are interested in deriving a control policy so that a given performance criterion is optimized over all admissible policies $\Pi$. An admissible policy consists of a sequence of functions $\pi = \{\mu_0, \mu_1, \ldots\}$, where $\mu_k$ maps states $s_k$ into actions $a_k = \mu_k(s_k)$. The system's state $s_{k+1}$ depends upon the input sequence $a_0, a_1, \ldots$ as well as the random variables $w_0, w_1, \ldots$, Eq. (1). Consequently, $s_{k+1}$ is a random variable; the system output $y_k = h_k(s_k, v_k)$ is a function of the random variables $s_0, s_1, \ldots, v_0, v_1, \ldots$, and thus is also a random variable. Similarly, the sequence of control actions $a_k = \mu_k(s_k)$, $\{a_k, k \ge 0\}$, constitutes a stochastic process.

Suppose that the previous values of the random variables $s_m$ and $a_m$, $m \le k$, are known. Then the conditional distribution of $s_{k+1}$ given these values will be

$$P^{\pi}(s_{k+1} \mid s_k, \ldots, s_0, a_k, \ldots, a_0) = P^{\pi}(w_k \mid s_k, \ldots, s_0, a_k, \ldots, a_0). \tag{3}$$

The conditional probability distribution of $s_{k+1}$ given $s_k$ and $a_k$ can be independent of the previous values of states and control actions, if it is guaranteed that for every control policy $\pi$, $w_k$ is independent of the random variables $s_m$ and $a_m$, $m < k$.
Kumar and Varaiya [26] proved that this property is imposed under the assumption that the following random variables $s_0, w_0, w_1, \ldots, v_0, v_1, \ldots$ are all independent. The latter imposes a condition directly to the basic random variables which eventually yields that the state $s_{k+1}$ depends only on $s_k$ and $a_k$. Moreover, the conditional probability distributions do not depend on the control policy $\pi$, and thus the superscript can be dropped:

$$P(s_{k+1} \mid s_k, \ldots, s_0, a_k, \ldots, a_0) = P(s_{k+1} \mid s_k, a_k). \tag{4}$$

A stochastic process $\{s_k, k \ge 0\}$ satisfying Eq. (4) is called a Markov process. If the state space is discrete, then the process is defined as a controlled Markov chain. The discrete-time, stationary controlled Markov chain is a stochastic dynamic system specified by the five-tuple $\{S, A, \mathcal{A}, P, R\}$, where
(a) $S = \{1, 2, \ldots, N\}$ is the finite state space;
(b) $A$ is the compact action space;
(c) $\mathcal{A}$ is a family $\{A(i) \mid i \in S\}$ of nonempty measurable subsets of $A$, where $A(i)$ denotes the set of feasible actions when the system is at state $i \in S$, with the property that the set $K := \{(i, a) : i \in S, a \in A(i)\}$ of feasible pairs is a measurable subset of $S \times A$ and contains the graph of a measurable function from $S$ to $A$;
(d) $P$ is the stochastic kernel on $S$ given $K$, that is, the transition probability of the system from state $i \in S$ to $j \in S$; and
(e) $R$ is the measurable one-stage cost function, $R : K \to \mathbb{R}$.

The evolution of the system occurs at each of a sequence of stages $k = 0, 1, \ldots, M$, and is portrayed by the sequence of the random variables $s_k$ and $a_k$ corresponding to the system's state and control action. At each stage, the controller observes the system's state $s_k = i \in S$, and executes an action $a_k$ from the feasible set of actions $A(i) \subseteq A$ at this state. At the next stage, the system transits to the state $s_{k+1} = j \in S$ imposed by the conditional probability $P(j \mid i, a)$, and a cost $R(j \mid i, a)$ is incurred. After the transition to the next state has occurred, a new action is selected, and the process is repeated. The completed period of time over which the system is observed is called the decision-making horizon and is denoted by $M$. The horizon can be either finite or infinite; in this paper, we consider finite-horizon decision-making problems.

A control policy determines the probability distribution of the state process $\{s_k, k \ge 0\}$ and the control process $\{a_k, k \ge 0\}$. Different policies will lead to different probability distributions. In optimal control problems, the objective is to derive the optimal control policy that minimizes (maximizes) the accumulated cost (reward) incurred at each state transition per decision epoch.
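The five-tuple and the stage-by-stage evolution described above can be sketched in a few lines of Python. This is only a minimal illustration and not part of the original paper: the three-state, two-action chain, the tables `P` and `R`, and the helper names `step` and `rollout_cost` are hypothetical, states are indexed from 0 rather than 1, and every action is assumed feasible in every state.

```python
import random

# Hypothetical 3-state, 2-action chain; all numbers are illustrative only.
# P[a][i][j] = probability of moving from state i to state j under action a.
P = {
    0: [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.3, 0.7]],
    1: [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.2, 0.0, 0.8]],
}
# R[a][i][j] = one-stage cost R(j | i, a) incurred by that transition.
R = {
    0: [[1.0, 2.0, 0.0], [2.0, 1.0, 3.0], [0.0, 2.0, 1.0]],
    1: [[0.5, 4.0, 0.0], [0.0, 0.5, 4.0], [1.0, 0.0, 0.5]],
}


def step(i, a, rng):
    """Sample the next state j ~ P(. | i, a) and return (j, R(j | i, a))."""
    j = rng.choices(range(len(P[a][i])), weights=P[a][i])[0]
    return j, R[a][i][j]


def rollout_cost(s0, policy, horizon, rng):
    """Accumulated cost of one sampled trajectory of `horizon` transitions."""
    state, total = s0, 0.0
    for k in range(horizon):
        action = policy(k, state)  # a_k = mu_k(s_k)
        state, cost = step(state, action, rng)
        total += cost
    return total


def always_zero(k, s):
    """A stationary policy that picks action 0 at every stage and state."""
    return 0


rng = random.Random(0)
costs = [rollout_cost(0, always_zero, horizon=5, rng=rng) for _ in range(1000)]
mean_cost = sum(costs) / len(costs)  # Monte Carlo estimate of the expected cost
```

Averaging many sampled trajectories, as in `mean_cost`, gives a simulation-based estimate of the expected accumulated cost of a fixed policy; a different policy function changes the distribution of both the state and the control process.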
If a policy $\pi$ is fixed, the cost incurred by $\pi$ when the process starts from an initial state $s_0$ and up to the time horizon $M$ is

$$J^{\pi}(s_0) = \sum_{k=0}^{M-1} R_k(s_{k+1} = j \mid s_k = i, a_k), \quad \forall i, j \in S, \ \forall a_k \in A(i), \tag{5}$$

where the subscript on $R_k$ indexes the decision epoch. The accumulated cost $J^{\pi}(s_0)$ is a random variable since $s_k$ and $a_k$ are random variables. Hence the expected accumulated cost of a control policy is given by

$$\begin{aligned} J^{\pi}(s_0) &= E_{s_k \in S,\, a_k \in A(s_k)} \Big\{ \sum_{k=0}^{M-1} R_k(s_{k+1} = j \mid s_k = i, a_k(s_k)) \Big\} \\ &= E_{s_k \in S,\, \mu_k \in A(s_k)} \Big\{ \sum_{k=0}^{M-1} R_k(s_{k+1} = j \mid s_k = i, \mu_k(s_k)) \Big\} \\ &= \sum_{k=0}^{M-1} \sum_{j \in S} p(s_{k+1} = j \mid s_k = i, \mu_k(s_k)) \, R_k(s_{k+1} = j \mid s_k = i, \mu_k(s_k)), \end{aligned} \tag{6}$$

where the expectation is taken with respect to the probability distribution of $\{s_k, k \ge 0\}$ and $\{a_k, k \ge 0\}$ determined by the control policy $\pi$. The optimal policy $\pi^* = \{\mu_0^*, \ldots, \mu_{M-1}^*\}$ can be derived by

$$\pi^* = \arg\min_{\pi \in \Pi} J^{\pi}(s_0). \tag{7}$$

3. ONLINE SELF-LEARNING IDENTIFICATION
The problem of making autonomous intelligent systems is formulated as sequential decision-making under uncertainty. In this context, an intelligent system (decision maker), e.g., an advanced propulsion system, robot, automated manufacturing system, etc., has to select those actions in several time steps (decision epochs) to achieve long-term goals efficiently. This problem involves two major sub-problems: (a) the system identification problem, and (b) the stochastic control problem. The first is exploitation of the information acquired from the system output to identify its behavior, that is, how a state representation can be built by observing the system's state transitions. The second is assessment of the system output with respect to alternative control policies, and selecting those that optimize specified performance criteria.

The Predictive Optimal Decision-making (POD) learning model [23] is intended to address the system identification problem for a completely unknown system by learning in real time the system dynamics over a varying and unknown finite time horizon. The model embedded in the self-learning controller is constituted by a state representation which attempts to provide an efficient process in realizing the state transitions that occurred in the Markov domain.
The model considers systems whose evolution can be modeled as a controlled Markov chain under the assumptions that the Markov chain is homogeneous, ergodic, and irreducible. The learning process of the POD model transpires while the system interacts with its environment. Taken in conjunction with assigning values of the control actions from the feasible action space, $A$, this interaction portrays the progressive enhancement of the controller's knowledge of the system's evolution with respect to the control actions. More precisely, at each of a sequence of decision epochs $k = 0, 1, 2, \ldots, M$, a state $s_k \in S$ is introduced to the controller, and on that basis the controller selects an action, $a_k = \mu_k(s_k)$. This state arises as a result of the system's evolution. One epoch later, as a consequence of this action, the system transits to a new state $s_{k+1} = j$, and receives a numerical cost, $R(s_{k+1} = j \mid s_k = i, a_k)$.

At each epoch, the controller implements a mapping from the Cartesian product of the state space and action space to the set of real numbers, $S \times A \to \mathbb{R}$, by means of the costs that it receives. Similarly, another mapping from the Cartesian product of the state space and action space to the closed set $[0, 1]$ is executed, $S \times A \to [0, 1]$, i.e., the transition probability matrix, $P(\cdot, \cdot)$. The latter essentially perceives the incidence in which particular states or particular sequences of states arise.

The POD model possesses a structure that enables convergent behavior of the conditional probabilities infused by the POD state-space representation to the stationary distribution. This behavior is desirable in the effort towards making autonomous intelligent systems that can learn to improve their performance over time in stochastic environments. The convergence of POD to the stationary distribution of the Markov state transitions has been proven in [27], hence establishing POD as a robust model. As the process is stochastic, however, it is still necessary for the controller to build a decision-making mechanism to derive the control policy. This policy is expressed by means of a mapping from states to probabilities of selecting the actions, resulting in the minimum expected accumulated cost.

4. ROLLOUT CONTROL ALGORITHM
The objective of the control algorithm is to evaluate in real time the optimal action at each epoch not only for the current state, but also for the next two subsequent states over the following epochs.
The requirement of real-time implementation imposes a computational burden in allowing the algorithm to look further ahead in time, thus evaluating an action over additional succeeding states. Suppose that the current state is $s_k$ and the following state given an action $a_k \in A(s_k)$ is $s_{k+1}$. The immediate cost incurred by this transition is $R_k(s_{k+1} \mid s_k, a_k)$. The minimum expected cost for the next two subsequent states is perceived in terms of the magnitude, $V(s_{k+1})$, and is equal to

$$V(s_{k+1}) = \min_{a_{k+1} \in A(s_{k+1})} E_{s_{k+2} \in S} \big\{ R_{k+1}(s_{k+2} \mid s_{k+1}, a_{k+1}) \big\}. \tag{8}$$

For the problem of optimal control of uncertain systems, which is treated in a stochastic framework, all uncertain quantities are described by probability distributions and the expected value of the overall cost is minimized. In this context, the control policy realized by the algorithm is based on the minimax control approach, whereby the worst possible values of the uncertain quantities within the given set are assumed to occur. This essentially assures that the control policy will result in at most a maximum overall cost. Consequently, being at state $s_k$ the control algorithm provides the policy $\bar{\pi} = \{\bar{\mu}_0, \ldots, \bar{\mu}_{M-1}\}$, in terms of the values of the controllable variables as

$$\bar{\pi}(s_k) = \arg\min_{\bar{\mu}_k(s_k) \in A(s_k)} \max_{s_{k+1} \in S} \big[ R_k(s_{k+1} \mid s_k, a_k) + V(s_{k+1}) \big]. \tag{9}$$

To evaluate the efficiency of the algorithm, the establishment of a performance bound in terms of the accumulated cost over the decision epochs is necessary. The following lemma (see, e.g., [1]) aims to provide a useful step toward presenting the main result (Theorem 4.1).

Lemma 4.1: Let $f : S \to [-\infty, \infty]$ and $g : S \times A \to [-\infty, \infty]$ be two functions. If

$$\min_{a \in A} g(i, a) > -\infty, \quad \forall i \in S, \tag{10}$$

then we have

$$\min_{\mu(i) \in A} \max_{i \in S} \big[ f(i) + g(i, \mu(i)) \big] = \max_{i \in S} \big[ f(i) + \min_{a \in A} g(i, a) \big], \tag{11}$$

where the function $\mu : S \to A$ maps the state into an action, that is, $\mu(i) = a$, and $S$, $A$ are the state and action space respectively.

Assumption 4.1: The minimum expected cost $V(s_{k+1})$, Eq. (8), incurred at the decision epoch $k+1$ is bounded, that is, $V(s_{k+1}) > -\infty$, $\forall s_{k+1} \in S$.

Theorem 4.1: The accumulated cost $J^{\bar{\pi}}(s_0)$ incurred by the lookahead control policy $\bar{\pi} = \{\bar{\mu}_0, \ldots, \bar{\mu}_{M-1}\}$, namely,

$$\bar{\pi}(s_k) = \arg\min_{\bar{\mu}_k(s_k) \in A(s_k)} \max_{s_{k+1} \in S} \Big[ R_k(s_{k+1} \mid s_k, a_k) + \min_{a_{k+1} \in A(s_{k+1})} E_{s_{k+2} \in S} \big\{ R_{k+1}(s_{k+2} \mid s_{k+1}, a_{k+1}) \big\} \Big], \tag{12}$$

is bounded by the accumulated cost $J^{\tilde{\pi}}(s_0)$ incurred by the minimax control policy $\tilde{\pi} = \{\tilde{\mu}_0, \ldots, \tilde{\mu}_{M-1}\}$, namely,

$$\tilde{\pi}(s_k) = \arg\min_{\tilde{\mu}_k(s_k) \in A(s_k)} \max_{s_{k+1} \in S} \big[ R_k(s_{k+1} \mid s_k, a_k) + J_{k+1}(s_{k+1}) \big], \tag{13}$$

with probability 1, where $J_{k+1}(s_{k+1})$ denotes the minimax cost-to-go from state $s_{k+1}$.
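As a rough illustration of the lookahead rule in Eqs. (8) and (9), the sketch below evaluates it on a small, made-up chain. It is not the author's implementation: the tables `P` and `R`, the names `V` and `lookahead_action`, and the zero-based state indices are hypothetical, every action is assumed feasible in every state, and the worst case is taken only over successor states with nonzero transition probability, which is an assumption added here.

```python
# Hypothetical 3-state, 2-action chain; all numbers are illustrative only.
# P[a][i][j] = transition probability, R[a][i][j] = one-stage cost R(j | i, a).
P = {
    0: [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.3, 0.7]],
    1: [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.2, 0.0, 0.8]],
}
R = {
    0: [[1.0, 2.0, 0.0], [2.0, 1.0, 3.0], [0.0, 2.0, 1.0]],
    1: [[0.5, 4.0, 0.0], [0.0, 0.5, 4.0], [1.0, 0.0, 0.5]],
}
STATES = range(3)
ACTIONS = (0, 1)  # assumption: A(i) is the same for every state i


def V(j):
    """Eq. (8): smallest expected one-stage cost obtainable from state j."""
    return min(
        sum(P[a][j][m] * R[a][j][m] for m in STATES)
        for a in ACTIONS
    )


def lookahead_action(i):
    """Eq. (9): action minimizing the worst case of R(j | i, a) + V(j)."""
    def worst_case(a):
        # Assumption: only successors j with P(j | i, a) > 0 can occur.
        return max(
            R[a][i][j] + V(j)
            for j in STATES
            if P[a][i][j] > 0.0
        )
    return min(ACTIONS, key=worst_case)


policy = {i: lookahead_action(i) for i in STATES}
```

In this toy chain the rule selects action 0 in the first two states and action 1 in the last one, because action 1 lowers the worst-case sum of immediate cost and `V` only there.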

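Proof: Suppose that the chain starts at a state $s_0 = i$, $i \in S$, at time $k = 0$ and ends up at $k = M$. We consider the problem of finding a policy $\pi = \{\mu_0, \ldots, \mu_{M-1}\}$ with $\mu_k(s_k) \in A$ for all $s_k \in S$ and $k$ that minimizes the cost function

$$J^{\pi}(s_0) = \max_{s_{k+1} \in S} \Big[ R_M(s_M) + \sum_{k=0}^{M-1} R_k(s_{k+1} \mid s_k, a_k) \Big]. \tag{14}$$

The DP algorithm for this problem takes the following form starting from the tail sub-problem

$$J_{M-1}(s_{M-1}) = \min_{\mu_{M-1}(s_{M-1}) \in A(s_{M-1})} \max_{s_M \in S} \big[ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) + R_M(s_M) \big], \tag{15}$$

and

$$J_k(s_k) = \min_{\mu_k(s_k) \in A(s_k)} \max_{s_{k+1} \in S} \big[ R_k(s_{k+1} \mid s_k, a_k) + J_{k+1}(s_{k+1}) \big], \tag{16}$$

where $R_M(s_M) = J_M(s_M)$ is the cost of the terminal decision epoch. Following the steps of the DP algorithm proposed by Bertsekas [1], the optimal accumulated cost $J^{\tilde{\pi}}(s_0)$ starting from the last decision epoch and moving backwards is

$$J^{\tilde{\pi}}(s_0) = \min_{\mu_0(s_0) \in A(s_0)} \cdots \min_{\mu_{M-1}(s_{M-1}) \in A(s_{M-1})} \max_{s_1 \in S} \cdots \max_{s_M \in S} \Big[ R_M(s_M) + \sum_{k=0}^{M-1} R_k(s_{k+1} \mid s_k, a_k) \Big]. \tag{17}$$

By applying Lemma 4.1, we can interchange the min over $\mu_{M-1}$ and the max over $s_1, \ldots, s_{M-1}$. The necessary condition in Lemma 4.1 is implied by Assumption 4.1. Equation (17) yields

$$J^{\tilde{\pi}}(s_0) = \min_{\mu_0(s_0) \in A(s_0)} \cdots \min_{\mu_{M-2}(s_{M-2}) \in A(s_{M-2})} \Big[ \max_{s_1 \in S} \cdots \max_{s_{M-2} \in S} \Big[ \sum_{k=0}^{M-3} R_k(s_{k+1} \mid s_k, a_k) + \max_{s_{M-1} \in S} \big[ R_{M-2}(s_{M-1} \mid s_{M-2}, a_{M-2}) + J_{M-1}(s_{M-1}) \big] \Big] \Big] \tag{18}$$

$$= \min_{\mu_0(s_0) \in A(s_0)} \cdots \min_{\mu_{M-2}(s_{M-2}) \in A(s_{M-2})} \max_{s_1 \in S} \cdots \max_{s_{M-1} \in S} \Big[ \sum_{k=0}^{M-2} R_k(s_{k+1} \mid s_k, a_k) + J_{M-1}(s_{M-1}) \Big]. \tag{19}$$

By continuing backwards in a similar way we obtain

$$J^{\tilde{\pi}}(s_0) = J_0(s_0). \tag{20}$$

Consequently, an optimal policy for the minimax problem can be constructed by minimizing the right hand side of Eq. (14). Performing the same task as we did with the DP algorithm by starting from the last epoch of the decision-making process and moving backwards, the accumulated cost, $J^{\bar{\pi}}(s_0)$, incurred by the control policy $\bar{\pi} = \{\bar{\mu}_0, \ldots, \bar{\mu}_{M-1}\}$ is

$$J^{\bar{\pi}}_M(s_M) = \min_{\bar{\mu}_M(s_M) \in A(s_M)} \max_{s_{M+1} \in S} \Big[ R_M(s_{M+1} \mid s_M, a_M) + \min_{a_{M+1} \in A(s_{M+1})} E_{s_{M+2} \in S} \big\{ R_{M+1}(s_{M+2} \mid s_{M+1}, a_{M+1}) \big\} \Big] = R_M(s_M) = J_M(s_M), \tag{21}$$

since $R_{M+1}(s_{M+2} \mid s_{M+1}, a_{M+1}) = 0$ as the terminal epoch is at $k = M$. One epoch earlier,

$$J^{\bar{\pi}}_{M-1}(s_{M-1}) = \min_{\bar{\mu}_{M-1}(s_{M-1}) \in A(s_{M-1})} \max_{s_M \in S} \Big[ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) + \min_{a_M \in A(s_M)} E_{s_{M+1} \in S} \big\{ R_M(s_{M+1} \mid s_M, a_M) \big\} + J^{\bar{\pi}}_M(s_M) \Big] \tag{22}$$

$$= \min_{\bar{\mu}_{M-1}(s_{M-1}) \in A(s_{M-1})} \max_{s_M \in S} \big[ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) + J^{\bar{\pi}}_M(s_M) \big]. \tag{23}$$

Similarly, $R_M(s_{M+1} \mid s_M, a_M) = 0$ since the terminal epoch is at $k = M$. Consequently,

$$J^{\bar{\pi}}_{M-1}(s_{M-1}) = \min_{\bar{\mu}_{M-1}(s_{M-1}) \in A(s_{M-1})} \max_{s_M \in S} \big[ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) + J_M(s_M) \big] = J_{M-1}(s_{M-1}), \tag{24}$$

since $J_M(s_M)$ is a constant quantity. Moving one more epoch backwards,

$$J^{\bar{\pi}}_{M-2}(s_{M-2}) = \min_{\bar{\mu}_{M-2}(s_{M-2}) \in A(s_{M-2})} \max_{s_{M-1} \in S} \Big[ R_{M-2}(s_{M-1} \mid s_{M-2}, a_{M-2}) + \min_{a_{M-1} \in A(s_{M-1})} E_{s_M \in S} \big\{ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) \big\} + J^{\bar{\pi}}_{M-1}(s_{M-1}) \Big]. \tag{25}$$

However,

$$\min_{\bar{\mu}_{M-2}(s_{M-2}) \in A(s_{M-2})} \max_{s_{M-1} \in S} \Big[ R_{M-2}(s_{M-1} \mid s_{M-2}, a_{M-2}) + \min_{a_{M-1} \in A(s_{M-1})} E_{s_M \in S} \big\{ R_{M-1}(s_M \mid s_{M-1}, a_{M-1}) \big\} \Big] \le \min_{\mu_{M-2}(s_{M-2}) \in A(s_{M-2})} \max_{s_{M-1} \in S} \big[ R_{M-2}(s_{M-1} \mid s_{M-2}, a_{M-2}) \big], \tag{26}$$

since the LHS of the inequality will minimize a cost which is not only maximum over the cost incurred when the chain transits from $s_{M-2}$ to $s_{M-1}$ but also minimum over the cost incurred when the chain transits from $s_{M-1}$ to $s_M$. So, the LHS can be at most equal to the cost which is maximum over the transition from $s_{M-2}$ to $s_{M-1}$. Consequently, comparing the accumulated cost of the control policy $\bar{\pi}$, $J^{\bar{\pi}}_{M-2}(s_{M-2})$ in Eq. (25), with the one resulting from the DP at the same decision epoch, namely,

$$J_{M-2}(s_{M-2}) = \min_{\mu_{M-2}(s_{M-2}) \in A(s_{M-2})} \max_{s_{M-1} \in S} \big[ R_{M-2}(s_{M-1} \mid s_{M-2}, a_{M-2}) + J_{M-1}(s_{M-1}) \big], \tag{27}$$

we conclude that

$$J^{\bar{\pi}}_{M-2}(s_{M-2}) \le J_{M-2}(s_{M-2}). \tag{28}$$

By continuing backward with similar arguments, we have

$$J^{\bar{\pi}}(s_0) \le J_0(s_0) = J^{\tilde{\pi}}(s_0). \tag{29}$$

Consequently, the accumulated cost resulting from the control policy $\bar{\pi} = \{\bar{\mu}_0, \ldots, \bar{\mu}_{M-1}\}$ is bounded by the accumulated cost of the optimal minimax control policy $\tilde{\pi}$ with probability 1.

5. CONCLUDING REMARKS
We presented the theoretical framework and a rollout control algorithm toward making autonomous intelligent systems that can learn their optimal operation in real time. The evolution of the system was modeled as a controlled Markov chain, and the task of deriving a control policy was formulated as a sequential decision-making problem under uncertainty. The algorithm comprises the decision-making mechanism that solves the stochastic control problem by utilizing accumulated data acquired as the system interacts with its environment. The solution of the algorithm has a theoretical performance bound that is superior to that of the solution provided by the one-step minimax control algorithm (Theorem 4.1).

The research presented here considered the approximate solution of a discrete optimization problem using procedures capable of magnifying the effectiveness of any given heuristic algorithm through sequential application. In particular, the problem was embedded within a dynamic programming framework, and a two-step rollout algorithm was introduced related to notions of policy iteration.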
Future research should explore the impact of the number of time steps that the algorithm can look forward in time on its performance bound.

REFERENCES
[1] Bertsekas, D. P., Dynamic Programming and Optimal Control (Volumes 1 and 2), Athena Scientific, September 2001.
[2] Bertsekas, D. P. and Shreve, S. E., Stochastic Optimal Control: The Discrete-Time Case, 1st edition, Athena Scientific, February 2007.
[3] Bellman, R., Dynamic Programming, Princeton, NJ, Princeton University Press, 1957.
[4] Bertsekas, D. P. and Tsitsiklis, J. N., Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3), 1st edition, Athena Scientific, May 1996.
[5] Bertsekas, D. P., Tsitsiklis, J. N., and Wu, C., "Rollout Algorithms for Combinatorial Optimization," Heuristics, vol. 3, 1997.
[6] Bertsekas, D. P. and Castanon, D. A., "Rollout Algorithms for Stochastic Scheduling Problems," Heuristics, vol. 5, pp. 89-108, 1999.
[7] Secomandi, N., "Comparing Neuro-Dynamic Programming Algorithms for the Vehicle Routing Problem with Stochastic Demands," Computers and Operations Research, vol. 27, 2000.
[8] McGovern, A., Moss, E., and Barto, A., "Building a Basic Building Block Scheduler Using Reinforcement Learning and Rollouts," Machine Learning, vol. 49, pp. 141-160, 2002.
[9] Bertsimas, D. and Popescu, I., "Revenue Management in a Dynamic Network Environment," Transportation Science, vol. 37, 2003.
[10] Tu, F. and Pattipati, K. R., "Rollout Strategies for Sequential Fault Diagnosis," IEEE Trans on Systems, Man and Cybernetics, Part A, 2003.
[11] Abramson, B., "Expected-Outcome: A General Model of Static Evaluation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, 1990.
[12] Tesauro, G. and Galperin, G., "On-line Policy Improvement Using Monte-Carlo Search," Advances in Neural Information Processing, vol. 9, 1996.
[13] Morari, M. and Lee, J. H., "Model Predictive Control: Past, Present, and Future," Computers and Chemical Engineering, vol. 23, 1999.
[14] Mayne, D. Q., Rawlings, J. B., Rao, C. V., and Scokaert, P. O. M., "Constrained Model Predictive Control: Stability and Optimality," Automatica, vol. 36, 2000.
[15] Rawlings, J. B., "Tutorial Overview of Model Predictive Control," Control Systems Magazine, vol. 20, 2000.
[16] Findeisen, R., Imsland, L., Allgower, F., and Foss, B. A., "State and Output Feedback Nonlinear Model Predictive Control: An Overview," European Journal of Control, vol. 9, pp. 9-25, 2003.
[17] Qin, S. J. and Badgwell, T. A., "A Survey of Industrial Model Predictive Control Technology," Control Engineering Practice, vol. 11, 2003.
[18] Sutton, R. S. and Barto, A. G., Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), The MIT Press, March 1998.
[19] Gosavi, A., Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning, 1st edition, Springer, June 2003.
[20] Borkar, V. S., "A Learning Algorithm for Discrete-Time Stochastic Control," Probability in the Engineering and Information Sciences, vol. 14, 2000.
[21] Kaelbling, L. P., Littman, M. L., and Moore, A. W., "Reinforcement Learning: A Survey," Journal of Artificial Intelligence Research, vol. 4, 1996.
[22] Malikopoulos, A. A., Real-Time, Self-Learning Identification and Stochastic Optimal Control of Advanced Powertrain Systems, Ph.D. Dissertation, Department of Mechanical Engineering, University of Michigan, Ann Arbor, USA, 2008.
[23] Malikopoulos, A. A., Papalambros, P. Y., and Assanis, D. N., "A Real-Time Computational Learning Model for Sequential Decision-Making Problems Under Uncertainty," ASME J. Dyn. Sys., Meas., Control, vol. 131, 4, pp. 041010-(8), 2009.
[24] Malikopoulos, A. A., Assanis, D. N., and Papalambros, P. Y., "Real-Time Self-Learning Optimization of Diesel Engine Calibration," ASME J. Eng. Gas Turbines Power, vol. 131, 2, pp. 022803(7), 2009.
[25] Kushner, H. J., Introduction to Stochastic Control, Holt, Rinehart and Winston, 1971.
[26] Kumar, P. R. and Varaiya, P., Stochastic Systems, Prentice Hall, June 1986.
[27] Malikopoulos, A. A., "Convergence Properties of a Computational Learning Model for Unknown Markov Chains," ASME J. Dyn. Sys., Meas., Control, vol. 131, No. 4, pp. 041011(7), 2009.


More information

Generalized Fano and non-fano networks

Generalized Fano and non-fano networks Generlized Fno nd non-fno networks Nildri Ds nd Brijesh Kumr Ri Deprtment of Electronics nd Electricl Engineering Indin Institute of Technology Guwhti, Guwhti, Assm, Indi Emil: {d.nildri, bkri}@iitg.ernet.in

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

REINFORCEMENT learning (RL) was originally studied

REINFORCEMENT learning (RL) was originally studied IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 45, NO. 3, MARCH 2015 385 Multiobjective Reinforcement Lerning: A Comprehensive Overview Chunming Liu, Xin Xu, Senior Member, IEEE, nd

More information

Chapter 3. Vector Spaces

Chapter 3. Vector Spaces 3.4 Liner Trnsformtions 1 Chpter 3. Vector Spces 3.4 Liner Trnsformtions Note. We hve lredy studied liner trnsformtions from R n into R m. Now we look t liner trnsformtions from one generl vector spce

More information

Applying Q-Learning to Flappy Bird

Applying Q-Learning to Flappy Bird Applying Q-Lerning to Flppy Bird Moritz Ebeling-Rump, Mnfred Ko, Zchry Hervieux-Moore Abstrct The field of mchine lerning is n interesting nd reltively new re of reserch in rtificil intelligence. In this

More information
