An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming


Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, and Steven I. Marcus

Abstract: We present a simulation-based algorithm called Simulated Annealing Multiplicative Weights (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is asymptotically efficient, in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained.

Index Terms: stochastic dynamic programming, Markov decision processes, simulation, learning algorithms, simulated annealing

I. INTRODUCTION

Consider a discrete-time system with a finite horizon $H$: $x_{t+1} = f(x_t, a_t, w_t)$ for $t = 0, 1, \ldots, H-1$, where $x_t$ is the state at time $t$ ranging over a (possibly infinite) set $X$, $a_t$ is the action at time $t$ to be chosen from a nonempty subset $A(x_t)$ of a given (possibly infinite) set of available actions $A$ at time $t$, $w_t$ is a random disturbance uniformly and independently selected from $[0,1]$ at time $t$, representing the uncertainty in the system, and $f : X \times A(X) \times [0,1] \to X$ is a next-state function. Throughout, we assume the initial state $x_0$ is given, but this is without loss of generality, as the results in the paper carry through for the case where $x_0$ follows a given distribution.

Define a nonstationary (non-randomized) policy $\pi = \{\pi_t \mid \pi_t : X \to A(X),\ t = 0, 1, \ldots, H-1\}$, and its corresponding finite-horizon discounted value function given by

$$V^\pi = E_{w_0,\ldots,w_{H-1}}\left[\sum_{t=0}^{H-1} \gamma^t R(x_t, \pi_t(x_t), w_t)\right], \tag{1}$$

with discount factor $\gamma \in (0,1]$ and one-period reward function $R : X \times A(X) \times [0,1] \to \mathbb{R}^+$. We suppress the explicit dependence of $V^\pi$ on the horizon $H$. The function $f$, together with $X$, $A$, and $R$, comprises a stochastic dynamic programming problem or a Markov decision process (MDP) [1], [8].
We assume throughout that the one-period reward function is bounded. For simplicity, but without loss of generality, we take the bound to be $1/H$, i.e., $\sup_{x \in X,\, a \in A,\, w \in [0,1]} R(x, a, w) \le 1/H$, so $0 \le V^\pi \le 1$. The problem we consider is estimating the optimal value over a given finite set of policies $\Pi$:

$$V^* := \max_{\pi \in \Pi} V^\pi. \tag{2}$$

(This work was supported in part by the National Science Foundation under Grant DMI-00, in part by the Air Force Office of Scientific Research under Grant FA , and in part by the Department of Defense. The work of H.S. Chang was also supported by the Sogang University research grants in 2006. H.S. Chang is with the Department of Computer Science and Engineering at Sogang University, Seoul 121-742, Korea (e-mail: hschang@sogang.ac.kr). M.C. Fu is with the Robert H. Smith School of Business and the Institute for Systems Research at the University of Maryland, College Park (e-mail: mfu@rhsmith.umd.edu). J. Hu is with the Department of Applied Mathematics & Statistics, SUNY, Stony Brook (e-mail: jqhu@xx.xx.edu). S.I. Marcus is with the Department of Electrical & Computer Engineering and the Institute for Systems Research at the University of Maryland, College Park (e-mail: marcus@eng.umd.edu). Preliminary portions of this paper appeared in the Proceedings of the 42nd IEEE Conference on Decision and Control, 2003.)

Any policy that achieves $V^*$ is called an optimal policy. Our setting is that in which explicit forms for $f$ and $R$ are not available, but both can be simulated, i.e., sample paths for the states and rewards can be generated from a given random number sequence $\{w_0, \ldots, w_{H-1}\}$.

We present a simulation-based algorithm called Simulated Annealing Multiplicative Weights (SAMW) for solving (2), based on the weighted majority algorithm of [12]. Specifically, we exploit the recent work on the multiplicative weights algorithm studied by Freund and Schapire [7] in a completely different context: noncooperative repeated two-player bimatrix zero-sum games.
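The simulation setting described above, in which a policy is evaluated along a sample path driven by a random number sequence, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the toy next-state function `f`, the reward `R`, the four-state space, and the policy `always_restock` are hypothetical, with rewards scaled into $[0, 1/H]$ so that the estimate lies in $[0, 1]$.

```python
import random

def sample_path_value(policy, f, R, x0, H, gamma, w):
    """Sum of discounted rewards along one sample path driven by w[0..H-1]."""
    x, total = x0, 0.0
    for t in range(H):
        a = policy[t](x)                      # nonstationary: one decision rule per period
        total += (gamma ** t) * R(x, a, w[t])
        x = f(x, a, w[t])
    return total

H, GAMMA = 4, 1.0

def f(x, a, w):
    # Hypothetical dynamics on states {0, 1, 2, 3}: the action adds stock, and
    # one unit is consumed whenever the disturbance w falls below 0.5.
    return max(0, min(3, x + a) - (1 if w < 0.5 else 0))

def R(x, a, w):
    # Hypothetical reward in [0, 1/H], so any discounted sum stays in [0, 1].
    return (x / 3.0) / H

always_restock = [lambda x: 1] * H            # hypothetical policy: pi_t(x) = 1

rng = random.Random(0)
w = [rng.random() for _ in range(H)]          # w_t i.i.d. U(0, 1)
estimate = sample_path_value(always_restock, f, R, x0=0, H=H, gamma=GAMMA, w=w)

# A path on which no unit is ever consumed (all w_t >= 0.5) visits 0, 1, 2, 3.
no_consumption = sample_path_value(always_restock, f, R, 0, H, GAMMA, [0.9] * H)
```

Reusing the same list `w` for several policies gives estimates based on common random numbers; averaging `sample_path_value` over independent draws of `w` approximates the expectation in (1).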
At each iteration, the algorithm updates a probability distribution over $\Pi$ by a multiplicative weight rule using the estimated (from simulation) value functions for all policies in $\Pi$, requiring $|\Pi|$ sample paths. With a proper annealing of the control parameter associated with the algorithm, as in Simulated Annealing (SA) [10], the sequence of distributions generated by the multiplicative weight rule converges to a distribution concentrated only on policies that achieve $V^*$, motivating our choice of SAMW for the name of the algorithm. The algorithm is asymptotically efficient, in the sense that a finite-time upper bound is obtained for the sample mean of the value of an optimal policy, and the upper bound converges to $V^*$ with rate $O(1/\sqrt{T})$, where $T$ is the number of iterations. A sampling version of the algorithm that does not enumerate all policies in $\Pi$ at each iteration, but instead samples from the sequence of generated distributions, is also shown to converge to $V^*$. The sampling version can be used as an on-line simulation-based control in the context of planning.

SAMW differs from the usual SA in that it does not perform any local search; rather, it directly updates a probability distribution over $\Pi$ at each iteration and has a much simpler tuning process than SA. In this regard, it may be said that SAMW is a compressed version of SA with an extension to stochastic dynamic programming. The use of a probability distribution on the search space is a fundamentally different approach from existing simulation-based optimization techniques for solving MDPs, such as (basis-function based) neuro-dynamic programming [2], model-free approaches of Q-learning [19] and TD($\lambda$)-learning [17], and (bandit-theory based) adaptive multistage sampling [4]. Updating a probability distribution over the search space is similar to the learning automata approach for stochastic optimization [13], but SAMW is based on a different multiplicative weight rule.
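Ordinal comparison [9], which simply chooses the current best $\pi \in \Pi$ from the sample mean of $V^\pi$, does not provide a deterministic upper bound even if a probabilistic bound is possible (see, e.g., Theorem 1 in [6], letting each arm of the bandit be a policy). Furthermore, it is not clear how to design a variant of ordinal comparison that does not enumerate all policies in $\Pi$. This is also true for the recently proposed on-line control algorithms, parallel rollout and policy switching [5], for MDPs.

This paper is organized as follows. In Section II, we present the SAMW algorithm, and in Sections III and IV, we analyze its convergence properties. We conclude in Section VI with some remarks.

II. BASIC ALGORITHM DESCRIPTION

Let $\Phi$ be the set of all probability distributions over $\Pi$. For $\phi \in \Phi$ and $\pi \in \Pi$, let $\phi(\pi)$ denote the probability for policy $\pi$. The goal is to concentrate the probability on the optimal policies $\pi^*$ in $\Pi$. The SAMW algorithm iteratively generates a sequence of distributions, where $\phi^i$ denotes the distribution at iteration $i$. Each iteration of SAMW requires $H$ random numbers $w_0, \ldots, w_{H-1}$, i.e., i.i.d. $U(0,1)$ and independent from previous iterations. Each policy $\pi \in \Pi$ is then simulated using the same sequence of random numbers for that iteration (different random number sequences can also be used for each policy, and all of the results still hold) in order to obtain a sample path estimate of the value function (1):

$$V_i^\pi := \sum_{t=0}^{H-1} \gamma^t R(x_t, \pi_t(x_t), w_t), \tag{3}$$

where the subscript $i$ denotes the iteration count, which has been omitted for notational simplicity in the quantities $x_t$ and $w_t$. The estimates $\{V_i^\pi, \pi \in \Pi\}$ are used for updating a probability distribution over $\Pi$ at each iteration $i$. Note that $0 \le V_i^\pi \le 1$ (a.s.) by the boundedness assumption. Note also that the size of $\Pi$ in the worst case can be quite large, i.e., $|A|^{|X|H}$, so we assume here that $\Pi$ is relatively small. In Section IV, we study the convergence property of a sampling version of SAMW that does not enumerate all policies in $\Pi$ at each iteration.

The iterative updating to compute the new distribution $\phi^{i+1}$ from $\phi^i$ and $\{V_i^\pi\}$ uses a simple multiplicative rule:

$$\phi^{i+1}(\pi) = \phi^i(\pi)\,\frac{\beta_i^{V_i^\pi}}{Z^i}, \quad \forall \pi \in \Pi, \tag{4}$$

where $\beta_i > 1$ is a parameter of the algorithm, the normalization factor $Z^i$ is given by $Z^i = \sum_{\pi \in \Pi} \phi^i(\pi)\,\beta_i^{V_i^\pi}$, and the initial distribution $\phi^1$ is the uniform distribution, i.e., $\phi^1(\pi) = 1/|\Pi|$ for all $\pi \in \Pi$.

III. CONVERGENCE ANALYSIS

For $\phi \in \Phi$, define

$$\bar V_i(\phi) := \sum_{\pi \in \Pi} V_i^\pi \phi(\pi), \qquad \Psi_T^\pi := \frac{1}{T}\sum_{i=1}^{T} V_i^\pi,$$

where $\Psi_T^\pi$ is the sample mean estimate for the value function of policy $\pi$. Again, note that (a.s.) $0 \le \bar V_i(\phi) \le 1$ for all $\phi \in \Phi$. We remark that $\bar V_i(\phi)$ represents an expected reward for each fixed (iteration) experiment $i$, where the expectation is w.r.t. the distribution of the policy.

The following lemma provides a finite-time upper bound for the sample mean of the value function of an optimal policy in terms of the probability distributions generated by SAMW via (4).

Lemma 3.1: For $\beta_i = \beta > 1$, $i = 1, \ldots, T$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\Psi_T^{\pi^*} \le \frac{\beta - 1}{\ln \beta} \cdot \frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{T \ln \beta},$$

for any optimal policy $\pi^*$.

Proof: The proof idea follows that of Theorem 1 in [7], for which it is convenient to introduce the following measure of distance between two probability distributions, called the relative entropy (also known as Kullback-Leibler entropy):

$$D(p, q) := \sum_{\pi \in \Pi} p(\pi) \ln\left(\frac{p(\pi)}{q(\pi)}\right), \quad p, q \in \Phi. \tag{5}$$

Although $D(p, q) \ge 0$ for any $p$ and $q$, and $D(p, q) = 0$ if and only if $p = q$, the measure is not symmetric, hence not a true metric.

Consider any Dirac distribution $\phi^* \in \Phi$ such that for an optimal policy $\pi^*$ in $\Pi$, $\phi^*(\pi^*) = 1$ and $\phi^*(\pi) = 0$ for all $\pi \in \Pi - \{\pi^*\}$. We first prove that

$$V_i^{\pi^*} \le \frac{(\beta - 1)\,\bar V_i(\phi^i) + D(\phi^*, \phi^i) - D(\phi^*, \phi^{i+1})}{\ln \beta}, \tag{6}$$

where $\phi^i$ and $\phi^{i+1}$ are generated by SAMW via (4) and $\beta > 1$. From the definition of $D$ given by (5),

$$\begin{aligned}
D(\phi^*, \phi^{i+1}) - D(\phi^*, \phi^i) &= \sum_{\pi \in \Pi} \phi^*(\pi) \ln\left(\frac{\phi^i(\pi)}{\phi^{i+1}(\pi)}\right) = \sum_{\pi \in \Pi} \phi^*(\pi) \ln\frac{Z^i}{\beta^{V_i^\pi}} \\
&= -\sum_{\pi \in \Pi} \phi^*(\pi) \ln \beta^{V_i^\pi} + \ln Z^i \sum_{\pi \in \Pi} \phi^*(\pi) \\
&= (-\ln \beta) \sum_{\pi \in \Pi} \phi^*(\pi) V_i^\pi + \ln Z^i = (-\ln \beta)\, V_i^{\pi^*} + \ln Z^i \\
&\le (-\ln \beta)\, V_i^{\pi^*} + \ln\left[\sum_{\pi \in \Pi} \phi^i(\pi)\big(1 + (\beta - 1) V_i^\pi\big)\right] \\
&= (-\ln \beta)\, V_i^{\pi^*} + \ln\big(1 + (\beta - 1)\,\bar V_i(\phi^i)\big) \\
&\le (-\ln \beta)\, V_i^{\pi^*} + (\beta - 1)\,\bar V_i(\phi^i),
\end{aligned}$$

where the first inequality follows from the property $\beta^a \le 1 + (\beta - 1)a$ for $\beta \ge 0$, $a \in [0,1]$, and the last inequality follows from the property $\ln(1 + a) \le a$ for $a > -1$. Solving for $V_i^{\pi^*}$ (recall $\beta > 1$) yields (6). Summing the inequality (6) over $i = 1, \ldots, T$,

$$\begin{aligned}
\sum_{i=1}^{T} V_i^{\pi^*} &\le \frac{\beta - 1}{\ln \beta} \sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{1}{\ln \beta}\big(D(\phi^*, \phi^1) - D(\phi^*, \phi^{T+1})\big) \\
&\le \frac{\beta - 1}{\ln \beta} \sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{1}{\ln \beta} D(\phi^*, \phi^1) \\
&\le \frac{\beta - 1}{\ln \beta} \sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{\ln \beta},
\end{aligned}$$

where the second inequality follows from $D(\phi^*, \phi^{T+1}) \ge 0$, and the last inequality uses the uniform distribution property that $\phi^1(\pi) = 1/|\Pi|$ for all $\pi$ implies $D(\phi^*, \phi^1) \le \ln |\Pi|$. Dividing both sides by $T$ yields the desired result.

If $(\beta - 1)/\ln \beta$ is very close to 1 and at the same time $\ln |\Pi| / (T \ln \beta)$ is very close to 0, then the above inequality implies that the expected per-iteration performance of SAMW is very close to the optimal value. However, letting $\beta \to 1$, $\ln |\Pi| / \ln \beta \to \infty$. On the other hand, for fixed $\beta$ and $T$ increasing, $\ln |\Pi| / \ln \beta$ becomes negligible relative to $T$. Thus, from the form of the bound, it is clear that the sequence $\beta_T$ should be chosen as a function of $T$ such that $\beta_T \to 1$ and $T \ln \beta_T \to \infty$ in order to achieve convergence.

Define the total variation distance for probability distributions $p$ and $q$ by $d_T(p, q) := \sum_{\pi \in \Pi} |p(\pi) - q(\pi)|$. The following lemma states that the sequence of distributions generated by SAMW converges to a stationary distribution, with a proper tuning or annealing of the $\beta$-parameter.

Lemma 3.2: Let $\{\psi(T)\}$ be a decreasing sequence such that $\psi(T) > 1$ for all $T$ and $\lim_{T \to \infty} \psi(T) = 1$. For $\beta_i = \psi(T)$, $i = 1, \ldots, T + k$, $k \ge 1$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\lim_{T \to \infty} d_T(\phi^T, \phi^{T+k}) = 0.$$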
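The update rule (4) and the finite-time bound of Lemma 3.1 can be exercised numerically with a short Python sketch. Everything specific here is an assumption made for illustration: the three policy means, the shared-noise model for the estimates $V_i^\pi$, the seed, and the choice $\beta = 1 + \sqrt{1/T}$, which is one value that is close to 1 while keeping $T \ln \beta$ large.

```python
import math
import random

def samw_update(phi, values, beta):
    """One step of rule (4): phi'(pi) = phi(pi) * beta**V(pi) / Z."""
    weights = [p * beta ** v for p, v in zip(phi, values)]
    z = sum(weights)                           # normalization factor Z^i
    return [wt / z for wt in weights]

def run_samw(means, T, beta, seed=0):
    """Run T iterations; return the final distribution, the averaged mixture
    value (1/T) * sum_i Vbar_i(phi^i), and the per-policy sample means."""
    rng = random.Random(seed)
    n = len(means)
    phi = [1.0 / n] * n                        # phi^1 is uniform
    mix_sum, sums = 0.0, [0.0] * n
    for _ in range(T):
        w = rng.random()                       # common random number for all policies
        # Hypothetical estimates V_i^pi in [0, 1]: policy mean plus shared noise.
        values = [m + 0.2 * (w - 0.5) for m in means]
        mix_sum += sum(p * v for p, v in zip(phi, values))
        sums = [s + v for s, v in zip(sums, values)]
        phi = samw_update(phi, values, beta)
    return phi, mix_sum / T, [s / T for s in sums]

T = 2000
beta = 1.0 + math.sqrt(1.0 / T)                # assumed schedule value for this T
means = [0.2, 0.5, 0.8]                        # hypothetical; index 2 is the best policy
phi, mix_avg, psi = run_samw(means, T, beta)

# Right-hand side of the Lemma 3.1 inequality for this run.
bound = ((beta - 1) / math.log(beta)) * mix_avg \
        + math.log(len(means)) / (T * math.log(beta))
```

In this toy run the sample mean `psi[2]` of the best policy stays below `bound`, and `phi` places almost all of its mass on index 2, which is the concentration behaviour the analysis describes.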
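Proof: From the definition of $D$ given by (5),

$$\begin{aligned}
D(\phi^T, \phi^{T+1}) &= \sum_{\pi \in \Pi} \phi^T(\pi) \ln\left(\frac{\phi^T(\pi)}{\phi^{T+1}(\pi)}\right) \le \max_{\pi \in \Pi} \ln\left(\frac{\phi^T(\pi)}{\phi^{T+1}(\pi)}\right) \\
&= \max_{\pi \in \Pi} \ln\frac{Z^T}{\psi(T)^{V_T^\pi}} = -\min_{\pi \in \Pi} \ln\frac{\psi(T)^{V_T^\pi}}{Z^T} \le \ln \psi(T),
\end{aligned}$$

since $0 \le V_T^\pi \le 1$ gives $\psi(T)^{V_T^\pi} \ge 1$ and $Z^T \le \psi(T)$ for all $\pi$ and any $T$.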

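Applying Pinsker's inequality [18],

$$d_T(\phi^T, \phi^{T+1}) \le \sqrt{2 D(\phi^T, \phi^{T+1})} \le \sqrt{2 \ln \psi(T)}.$$

Therefore,

$$d_T(\phi^T, \phi^{T+k}) \le \sum_{j=1}^{k} d_T(\phi^{T+j-1}, \phi^{T+j}) \le \sum_{j=0}^{k-1} \sqrt{2 \ln \psi(T + j)}.$$

Because $d_T(\phi^T, \phi^{T+k}) \ge 0$ for any $k$ and $\sum_{j=0}^{k-1} \sqrt{2 \ln \psi(T + j)} \to 0$ as $T \to \infty$, $d_T(\phi^T, \phi^{T+k}) \to 0$ as $T \to \infty$.

Theorem 3.1: Let $\{\psi(T)\}$ be a decreasing sequence such that $\psi(T) > 1$ for all $T$, $\lim_{T \to \infty} \psi(T) = 1$, and $\lim_{T \to \infty} T \ln \psi(T) = \infty$. For $\beta_i = \psi(T)$, $i = 1, \ldots, T$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\frac{\psi(T) - 1}{\ln \psi(T)} \cdot \frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{T \ln \psi(T)} \to V^*,$$

and $\phi^i \to \phi^* \in \Phi$, where $\phi^*(\pi) = 0$ for all $\pi$ such that $V^\pi < V^*$.

Proof: Using $x - 1 \le x \ln x$ for all $x \ge 1$ and Lemma 3.1,

$$\Psi_T^{\pi^*} \le \frac{\psi(T) - 1}{\ln \psi(T)} \cdot \frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{T \ln \psi(T)} \le \psi(T) \cdot \frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{T \ln \psi(T)}. \tag{7}$$

In the limit as $T \to \infty$, the lefthand side converges to $V^*$ by the law of large numbers, and in the rightmost expression in (7), $\psi(T) \to 1$ and the second term vanishes, so it suffices to show that $\frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i)$ is bounded from above by $V^*$ (in the limit). From Lemma 3.2, for every $\epsilon > 0$, there exists $T_\epsilon < \infty$ such that $d_T(\phi^i, \phi^{i+k}) \le \epsilon$ for all $i > T_\epsilon$ and any integer $k \ge 1$. Then, for $T > T_\epsilon$, we have (a.s.)

$$\begin{aligned}
\frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi^i) &= \frac{1}{T}\left[\sum_{i=1}^{T_\epsilon} \bar V_i(\phi^i) + \sum_{i=T_\epsilon+1}^{T} \bar V_i(\phi^i)\right] \\
&= \frac{1}{T}\sum_{i=1}^{T_\epsilon} \bar V_i(\phi^i) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \bar V_i(\phi^{T_\epsilon}) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \sum_{\pi \in \Pi} \big(\phi^i(\pi) - \phi^{T_\epsilon}(\pi)\big) V_i^\pi \\
&\le \frac{1}{T}\sum_{i=1}^{T_\epsilon} \bar V_i(\phi^i) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \bar V_i(\phi^{T_\epsilon}) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \sum_{\pi \in \Pi} \big|\phi^i(\pi) - \phi^{T_\epsilon}(\pi)\big| V_i^\pi \\
&\le \frac{1}{T}\sum_{i=1}^{T_\epsilon} \bar V_i(\phi^i) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \bar V_i(\phi^{T_\epsilon}) + |\Pi|\,\epsilon,
\end{aligned} \tag{8}$$

the last inequality following from $\max_{\pi} |\phi^{i+k}(\pi) - \phi^i(\pi)| \le \epsilon$ and $V_i^\pi \le 1$ for all $i > T_\epsilon$, $k \ge 1$, $\pi \in \Pi$. As $T \to \infty$, the first term of (8) vanishes, and the second term converges by the law of large numbers to $\sum_{\pi \in \Pi} V^\pi \phi^{T_\epsilon}(\pi)$, which is bounded from above by $V^*$. Since $\epsilon$ can be chosen arbitrarily close to zero, the desired convergence follows.

The second part of the theorem follows directly from the first part with Lemma 3.2, with a proof obtained in a straightforward manner by assuming there exists a $\pi \in \Pi$ such that $\phi^*(\pi) \ne 0$ and $V^\pi < V^*$, leading to a contradiction. We skip the details.

An example of a decreasing sequence $\{\psi(T)\}$, $T = 1, 2, \ldots$, that satisfies the condition of Theorem 3.1 is $\psi(T) = 1 + \sqrt{1/T}$, $T > 0$.

IV. CONVERGENCE OF THE SAMPLING VERSION OF THE ALGORITHM

Instead of estimating the value functions for every policy in $\Pi$ according to (3), which requires simulating all policies in $\Pi$, a sampling version of the algorithm would sample a subset of the policies in $\Pi$ at each iteration $i$ according to $\phi^i$ and simulate only those policies (and estimate their corresponding value functions). In this context, Theorem 3.1 essentially establishes that the expected per-iteration performance of SAMW approaches the optimal value as $T \to \infty$ for an appropriately selected tuning sequence $\{\beta_i\}$. Here, we show that the actual (distribution sampled) per-iteration performance also converges to the optimal value using a particular annealing schedule of the parameter $\beta$. For simplicity, we assume that a single policy is sampled at each iteration (i.e., the subset is a singleton). A related result is proven by Freund and Schapire within the context of solving two-player zero-sum bimatrix repeated games [7], and the proof of the following theorem is based on theirs.

Theorem 4.1: Let $T_k = \sum_{j=1}^{k} j$. For $\beta_i = 1 + 1/\sqrt{k}$, $T_{k-1} < i \le T_k$, let $\{\phi^i\}$ denote the sequence of distributions generated by SAMW via (4), with resetting of $\phi^i(\pi) = 1/|\Pi|$ for all $\pi$ at each $i = T_k$. Let $\hat\pi(\phi^i)$ denote the policy sampled from $\phi^i$ (at iteration $i$). Then (a.s. as $k \to \infty$),

$$\frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat\pi(\phi^i)} \to V^*.$$

Proof: The sequence of random variables $\kappa_i = V_i^{\hat\pi(\phi^i)} - \bar V_i(\phi^i)$ forms a martingale difference sequence with $|\kappa_i| \le 1$, since $E[\kappa_i \mid \kappa_1, \ldots, \kappa_{i-1}] = 0$ for all $i$. Let $\epsilon_k = 2\sqrt{\ln k / k}$ and $I_k = [T_{k-1} + 1, T_k]$. Applying Azuma's inequality [14, p. 309], we have that for every $\epsilon_k > 0$,

$$P\left(\left|\frac{1}{k}\sum_{i \in I_k} \Big(V_i^{\hat\pi(\phi^i)} - \bar V_i(\phi^i)\Big)\right| > \epsilon_k\right) \le 2 e^{-0.5 k \epsilon_k^2} = \frac{2}{k^2}. \tag{9}$$

The sum of the probability bound in (9) over all $k$ from 1 to $\infty$ is finite. Therefore, by the Borel-Cantelli lemma, (a.s.) all but a finite number of $I_k$'s ($k = 1, \ldots, \infty$) satisfy

$$\sum_{i \in I_k} \bar V_i(\phi^i) \le \sum_{i \in I_k} V_i^{\hat\pi(\phi^i)} + k \epsilon_k, \tag{10}$$

so those $I_k$ that violate (10) can be ignored (a.s.). From Lemma 3.1 applied on $I_k$ with the definition of $\beta_i$, for all $i \in I_k$,

$$\begin{aligned}
\sum_{i \in I_k} V_i^{\pi^*} &\le \frac{\beta_i - 1}{\ln \beta_i} \sum_{i \in I_k} \bar V_i(\phi^i) + \frac{\ln |\Pi|}{\ln \beta_i} \le \beta_i \sum_{i \in I_k} \bar V_i(\phi^i) + \frac{\beta_i \ln |\Pi|}{\beta_i - 1} \\
&= \left(1 + \frac{1}{\sqrt{k}}\right) \sum_{i \in I_k} \bar V_i(\phi^i) + \ln |\Pi|\,(\sqrt{k} + 1) \\
&\le \sum_{i \in I_k} \bar V_i(\phi^i) + \sqrt{k} + \ln |\Pi|\,(\sqrt{k} + 1),
\end{aligned} \tag{11}$$

where the last inequality follows from $\bar V_i(\phi) \le 1$ for all $\phi \in \Phi$ and $|I_k| = k$.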
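Combining (10) and (11) and summing, we have

$$T_k \Psi_{T_k}^{\pi^*} \le \sum_{i=1}^{T_k} V_i^{\hat\pi(\phi^i)} + \sum_{j=1}^{k} \Big[2\sqrt{j \ln j} + \sqrt{j}\,(\ln |\Pi| + 1) + \ln |\Pi|\Big],$$

so

$$\Psi_{T_k}^{\pi^*} \le \frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat\pi(\phi^i)} + \frac{1}{T_k}\sum_{j=1}^{k} \Big[2\sqrt{j \ln j} + \sqrt{j}\,(\ln |\Pi| + 1) + \ln |\Pi|\Big]. \tag{12}$$

Because $T_k$ is $O(k^2)$, the second term on the righthand side of (12) vanishes as $k \to \infty$. Therefore, for every $\epsilon > 0$, (a.s.) for all but a finite number of values of $k$,

$$\Psi_{T_k}^{\pi^*} \le \frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat\pi(\phi^i)} + \epsilon.$$

We now argue that $\{\phi^i\}$ converges to a fixed distribution as $k \to \infty$, so that eventually the term $\frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat\pi(\phi^i)}$ is bounded from above by $V^*$. With similar reasoning as in the proof of Lemma 3.2, for every $\epsilon > 0$, there exists $T_\epsilon \in I_k$ for some $k > 1$ such that for all $i > T_\epsilon$ with $i + j \in I_k$, $j \ge 1$, $d_T(\phi^i, \phi^{i+j}) \le \epsilon$. Taking $T > T_\epsilon$ with $T \in I_k$,

$$\begin{aligned}
\frac{1}{T}\sum_{i=1}^{T} V_i^{\hat\pi(\phi^i)} &= \frac{1}{T}\left[\sum_{i=1}^{T_\epsilon} V_i^{\hat\pi(\phi^i)} + \sum_{i=T_\epsilon+1}^{T} V_i^{\hat\pi(\phi^i)}\right] \\
&= \frac{1}{T}\sum_{i=1}^{T_\epsilon} V_i^{\hat\pi(\phi^i)} + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} V_i^{\hat\pi(\phi^{T_\epsilon})} + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} \Big(V_i^{\hat\pi(\phi^i)} - V_i^{\hat\pi(\phi^{T_\epsilon})}\Big).
\end{aligned} \tag{13}$$

Letting $T \to \infty$, the first term on the righthand side of (13) vanishes, and the second term is bounded from above by $V^*$, because the second term converges to $\sum_{\pi \in \Pi} V^\pi \phi^{T_\epsilon}(\pi)$ from the law of large numbers. We know that for all $i > T_\epsilon$ in $I_k$, $-\epsilon + \phi^i(\pi) \le \phi^{i+j}(\pi) \le \phi^i(\pi) + \epsilon$ for all $\pi \in \Pi$ and any $j$. Therefore, as $\epsilon$ can be chosen arbitrarily close to zero, $\{\phi^i\}$ converges to the distribution $\phi^{T_\epsilon}$, making the last term vanish (once each policy is sampled from the same distribution over $\Pi$, the simulated value would be the same for the same random numbers), which provides the desired convergence result.

V. A NUMERICAL EXAMPLE

To illustrate the performance of SAMW, we consider a simple finite-horizon inventory control problem with lost sales, zero order lead time, and linear holding and shortage costs. Given an inventory level, orders are placed and received, demand is realized, and the new inventory level is calculated. Formally, we let $D_t$, a discrete random variable, denote the demand in period $t$, $x_t$ the inventory level at period $t$, $a_t$ the order amount at period $t$, $p$ the per period per unit demand lost penalty cost, $h$ the per period per unit inventory holding cost, and $M$ the inventory capacity. Thus, the inventory level evolves according to the following dynamics:

$$x_{t+1} = \max\{0,\ x_t + a_t - D_t\}.$$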
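The lost-sales dynamics above, together with the linear holding and shortage costs, can be traced along a single demand path with a small Python sketch. The order sequence, the demand sequence, and the cost coefficients `h = 1.0` and `p = 4.0` below are hypothetical values picked only to make the arithmetic easy to follow, and the capacity $M$ is deliberately not enforced here.

```python
def inventory_step(x, a, d):
    """Lost-sales dynamics: x_{t+1} = max(0, x_t + a_t - D_t)."""
    on_hand = x + a                  # zero lead time: the order arrives before demand
    lost = max(0, d - on_hand)       # unmet demand is lost, not backordered
    return max(0, on_hand - d), lost

def simulate_costs(x0, orders, demands, h, p):
    """Accumulate holding and lost-sales penalty costs along one path."""
    x, total = x0, 0.0
    for a, d in zip(orders, demands):
        x, lost = inventory_step(x, a, d)
        total += h * x + p * lost    # x is now the end-of-period inventory
    return x, total

# Hypothetical three-period path starting from inventory level 5.
final_level, total_cost = simulate_costs(
    x0=5, orders=[5, 0, 10], demands=[15, 5, 0], h=1.0, p=4.0)
```

On this path the first two periods each lose 5 units of demand and the last period holds 10 units, so `total_cost` is 50.0 and `final_level` is 10.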
The goal is to minimize, over a given set of (nonstationary) policies $\Pi$, the expected total cost over the entire horizon from a given initial inventory level $x_0$, i.e.,

$$\min_{\pi \in \Pi} E\left[\sum_{t=0}^{H-1} \Big[h \max\{0,\ x_t + \pi_t(x_t) - D_t\} + p \max\{0,\ D_t - x_t - \pi_t(x_t)\}\Big] \,\Big|\, x_0 = x\right].$$

The following set of parameters is used in our experiments: $M = 20$, $H = 3$, h = 0.00, p = 0.0, $x_0 = 5$ and $x_t \in \{0, 5, 10, 15, 20\}$ for $t = 1, \ldots, H$, $a_t \in \{0, 5, 10, 15, 20\}$ for all $t = 0, \ldots, H-1$, and $D_t$ is a discrete uniformly distributed random variable taking values in $\{0, 5, 10, 15, 20\}$. The values of $h$ and $p$ are chosen so as to satisfy the one-period reward bound assumed in the SAMW convergence results. Note that since we are ignoring the setup cost (i.e., no fixed order cost), it is easy to see that the optimal order policy follows a threshold rule, in which an order is placed at period $t$ if the inventory level $x_t$ is below a certain threshold $S_t$, and the amount to order is equal to the difference $\max\{0, S_t - x_t\}$. Thus, by taking advantage of this structure, in the actual implementation of SAMW, we restrict the search of the algorithm to the set of threshold policies, i.e., $\Pi = \{(S_0, S_1, S_2) : S_t \in \{0, 5, 10, 15, 20\},\ t = 0, 1, 2\}$, rather than the set of all admissible policies.

We implemented two versions of SAMW: the fully sampled version, which constructs the optimal value function estimate by enumerating all policies in $\Pi$ and using all value function estimates, and the single sampling version introduced in Theorem 4.1, which uses just one sampled policy in each iteration to update the optimal value function estimate; however, updating $\phi^i$ requires value function estimates for all policies in $\Pi$.

For numerical comparison, we also applied the adaptive multi-stage sampling (AMS) algorithm [4] and a non-adaptive multi-stage sampling (NMS) algorithm. These two algorithms are simulation-tree based methods, where each node in the tree represents a state, with the root node being the initial state, and each edge signifies the sampling of an action.
They both use forward search to generate sample paths from the initial state to the final state, and update the value function backwards only at those visited states. The difference between the AMS and NMS algorithms is in the way actions are sampled at each decision period: AMS samples actions in an adaptive manner according to some performance index, whereas NMS simply samples each action a fixed number of times. A detailed description of these approaches can be found in [4].

[Figure 1: vertical axis "value function estimate"; horizontal axis "total periods simulated" ($\times 10^5$); curves: SAMW (fully sampled), SAMW (single sampling), SAMW (fully sampled, constant β), SAMW (single sampling, constant β), AMS, NMS, Optimal.]

Fig. 1. Average performance (mean of 5 simulation replications, resulting in confidence half-widths within 5% of estimated mean) of SAMW, AMS, and NMS on the inventory control problem (h = 0.00, p = 0.0).

Figure 1 shows the performance of these algorithms as a function of the total number of periods simulated, based on 5 independent replications. The results indicate convergence of both versions of SAMW; however, the two alternative benchmark algorithms AMS and NMS seem to provide superior empirical performance over SAMW. We believe this is because the annealing schedule for $\beta$ used in SAMW is too conservative for this problem, thus leading to slow convergence. To improve the empirical performance of SAMW, we also implemented both versions of the algorithm with $\beta$ being held constant throughout the search, i.e., independent of $T$. The constant-$\beta$ case is included in Figure 1, which shows significantly improved performance. Experimentation with the SAMW algorithm also revealed that it performed even better for cost parameter values in the inventory control problem that do not satisfy the strict reward bound. One such example, with different values of $h$ and $p$ (all other parameter values unchanged), is shown in Figure 2.

[Figure 2: vertical axis "value function estimate"; horizontal axis "total periods simulated" ($\times 10^5$); curves: SAMW (fully sampled), SAMW (single sampling), AMS, NMS, Optimal.]

Fig. 2. Average performance (mean of 5 simulation replications, resulting in confidence half-widths within 5% of estimated mean) of SAMW, AMS, and NMS on the inventory control problem with cost parameters that do not satisfy the reward bound.

VI. CONCLUDING REMARKS

SAMW can be naturally parallelized to reduce its computational cost. Partition the given policy space $\Pi$ into $\{\Delta_j\}$ such that $\Delta_j \cap \Delta_{j'} = \emptyset$ for all $j \ne j'$ and $\bigcup_j \Delta_j = \Pi$, and apply the algorithm in parallel for $T$ iterations on each $\Delta_j$. For a fixed value of $\beta > 1$, we have the following finite-time bound from Lemma 3.1:

$$\Psi_T^{\pi^*} \le \max_j \left\{\frac{\beta - 1}{\ln \beta} \cdot \frac{1}{T}\sum_{i=1}^{T} \bar V_i(\phi_j^i) + \frac{\ln |\Delta_j|}{T \ln \beta}\right\},$$

where $\phi_j^i$ is the distribution generated for $\Delta_j$ at iteration $i$.

The original version of SAMW recalculates an estimate of the value function for all policies in $\Pi$ at each iteration, requiring each policy to be simulated. If $\Pi$ is large, this may not be practical, and the sampling version of SAMW given by Theorem 4.1 also requires each value function estimate in order to update $\phi^i$ at each iteration. One simple alternative is to use the prior value function estimates for updating $\phi^i$, except for the single sampled one; thus, only one simulation per iteration would be required. Specifically, $V_i^\pi := V_{i-1}^\pi$ if $\pi$ is not sampled at iteration $i$; else obtain a new estimate of $V_i^\pi$ via (3). An extension of this is to use a threshold on $\phi^i$ to determine which policies will be simulated.
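The single-simulation alternative described above can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions, not a variant whose convergence is established in the text: the helper name `samw_step_cached`, the noiseless `simulate` stand-in, the constant `beta = 1.1`, and the choice to initialize the cache by simulating every policy once are all hypothetical.

```python
import random

def samw_step_cached(phi, cached, simulate, beta, rng):
    """Sample one policy from phi, re-simulate only that policy, reuse the
    cached estimates for all others, then apply the multiplicative update (4)."""
    idx = rng.choices(range(len(phi)), weights=phi, k=1)[0]
    values = list(cached)
    values[idx] = simulate(idx)                # the only simulation this iteration
    weights = [p * beta ** v for p, v in zip(phi, values)]
    z = sum(weights)
    return [wt / z for wt in weights], values, idx

means = [0.2, 0.5, 0.8]                        # hypothetical policy values in [0, 1]
simulate = lambda i: means[i]                  # noiseless stand-in for a sample path
rng = random.Random(1)

phi = [1.0 / len(means)] * len(means)          # uniform initial distribution
cached = [simulate(i) for i in range(len(means))]   # assumed one-time initialization
for _ in range(200):
    phi, cached, sampled = samw_step_cached(phi, cached, simulate, beta=1.1, rng=rng)
```

With a real simulator, the cached entries of rarely sampled policies become stale, which is the trade-off this variant accepts in exchange for one simulation per iteration.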
Since the sequence of the distributions generated by SAMW converges to a distribution concentrated on the optimal policies in $\Pi$, as the number of iterations increases, the contributions from non-optimal policies get smaller and smaller, so these policies need not be resimulated (and value function estimates updated) very often. Specifically, $V_i^\pi := V_{i-1}^\pi$ if $\phi^i(\pi) \le \epsilon$; else obtain a new estimate of $V_i^\pi$ via (3).

The cooling schedule presented in Theorem 4.1 is just one way of controlling the parameter $\beta$. Characterizing properties of good schedules is critical to effective implementation, as the numerical experiments showed. The numerical experiments also demonstrated that the algorithm may work well outside the boundaries of the assumptions under which theoretical convergence is proved, specifically the bound on the one-period reward function and the value of the cooling parameter $\beta$. We suspect this has something to do with the scaling of the algorithm, but more investigation into this phenomenon is clearly warranted.

Finally, we presented SAMW in the MDP framework for optimization of sequential decision making processes. Even though the idea is general, the actual algorithm depends on the sequential structure of MDPs. The general problem given by (2) takes the form of a general stochastic optimization problem, so SAMW can also be adapted to serve as a global stochastic optimization algorithm for bounded expected value objective functions.

REFERENCES

[1] D. P. Bertsekas, Dynamic Programming and Optimal Control, Volumes 1 and 2. Athena Scientific, 1995.
[2] D. P. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
[3] H. S. Chang, M. C. Fu, and S. I. Marcus, "An asymptotically efficient algorithm for finite horizon stochastic dynamic programming problems," in Proc. of the 42nd IEEE Conf. Decision and Control, 2003.
[4] H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An adaptive sampling algorithm for solving Markov decision processes," Operations Research, vol. 53, no. 1, pp. 126-139, 2005.
[5] H. S. Chang, R. Givan, and E. K. P. Chong, "Parallel rollout for online solution of partially observable Markov decision processes," Discrete Event Dynamic Systems: Theory and Application, vol. 14, no. 3, pp. 309-341, 2004.
[6] E. Even-Dar, S. Mannor, and Y. Mansour, "PAC bounds for multi-armed bandit and Markov decision processes," in Proc. of the 15th Annual Conf. on Computational Learning Theory, 2002.
[7] Y. Freund and R. Schapire, "Adaptive game playing using multiplicative weights," Games and Economic Behavior, vol. 29, pp. 79-103, 1999.
[8] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, 1996.
[9] Y. C. Ho, C. Cassandras, C.-H. Chen, and L. Dai, "Ordinal optimization and simulation," J. of Operations Research Society, 2000.
[10] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
[11] A. J. Kleywegt, A. Shapiro, and T. Homem-de-Mello, "The sample average approximation method for stochastic discrete optimization," SIAM J. on Control and Optimization.
[12] N. Littlestone and M. K. Warmuth, "The weighted majority algorithm," Information and Computation, vol. 108, pp. 212-261, 1994.
[13] A. S. Poznyak and K. Najim, Learning Automata and Stochastic Optimization. Springer-Verlag, 1997.
[14] S. M. Ross, Stochastic Processes, Second Edition. John Wiley & Sons, 1996.
[15] R. Y. Rubinstein and A. Shapiro, Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. John Wiley & Sons, 1993.
[16] N. Shimkin and A. Shwartz, "Guaranteed performance regions in Markovian systems with competing decision makers," IEEE Trans. on Automatic Control, vol. 38, no. 1, pp. 84-95, 1993.
[17] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT Press, Cambridge, Massachusetts, 1998.
[18] F. Topsoe, "Bounds for entropy and divergence for distributions over a two-element set," J. of Inequalities in Pure and Applied Mathematics, vol. 2, issue 2, Article 25, 2001.
[19] C. J. C. H. Watkins, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992.


More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming EEL 6266 Power System Operaton and Control Chapter 3 Economc Dspatch Usng Dynamc Programmng Pecewse Lnear Cost Functons Common practce many utltes prefer to represent ther generator cost functons as sngle-

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

Ryan (2009)- regulating a concentrated industry (cement) Firms play Cournot in the stage. Make lumpy investment decisions

Ryan (2009)- regulating a concentrated industry (cement) Firms play Cournot in the stage. Make lumpy investment decisions 1 Motvaton Next we consder dynamc games where the choce varables are contnuous and/or dscrete. Example 1: Ryan (2009)- regulatng a concentrated ndustry (cement) Frms play Cournot n the stage Make lumpy

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

A Simple Inventory System

A Simple Inventory System A Smple Inventory System Lawrence M. Leems and Stephen K. Park, Dscrete-Event Smulaton: A Frst Course, Prentce Hall, 2006 Hu Chen Computer Scence Vrgna State Unversty Petersburg, Vrgna February 8, 2017

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Queueing Networks II Network Performance

Queueing Networks II Network Performance Queueng Networks II Network Performance Davd Tpper Assocate Professor Graduate Telecommuncatons and Networkng Program Unversty of Pttsburgh Sldes 6 Networks of Queues Many communcaton systems must be modeled

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES

TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES SVANTE JANSON Abstract. We gve explct bounds for the tal probabltes for sums of ndependent geometrc or exponental varables, possbly wth dfferent

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

High resolution entropy stable scheme for shallow water equations

High resolution entropy stable scheme for shallow water equations Internatonal Symposum on Computers & Informatcs (ISCI 05) Hgh resoluton entropy stable scheme for shallow water equatons Xaohan Cheng,a, Yufeng Ne,b, Department of Appled Mathematcs, Northwestern Polytechncal

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

General viscosity iterative method for a sequence of quasi-nonexpansive mappings

General viscosity iterative method for a sequence of quasi-nonexpansive mappings Avalable onlne at www.tjnsa.com J. Nonlnear Sc. Appl. 9 (2016), 5672 5682 Research Artcle General vscosty teratve method for a sequence of quas-nonexpansve mappngs Cuje Zhang, Ynan Wang College of Scence,

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

( ) ( ) ( ) ( ) STOCHASTIC SIMULATION FOR BLOCKED DATA. Monte Carlo simulation Rejection sampling Importance sampling Markov chain Monte Carlo

( ) ( ) ( ) ( ) STOCHASTIC SIMULATION FOR BLOCKED DATA. Monte Carlo simulation Rejection sampling Importance sampling Markov chain Monte Carlo SOCHASIC SIMULAIO FOR BLOCKED DAA Stochastc System Analyss and Bayesan Model Updatng Monte Carlo smulaton Rejecton samplng Importance samplng Markov chan Monte Carlo Monte Carlo smulaton Introducton: If

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

CSC 411 / CSC D11 / CSC C11

CSC 411 / CSC D11 / CSC C11 18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Supplement to Clustering with Statistical Error Control

Supplement to Clustering with Statistical Error Control Supplement to Clusterng wth Statstcal Error Control Mchael Vogt Unversty of Bonn Matthas Schmd Unversty of Bonn In ths supplement, we provde the proofs that are omtted n the paper. In partcular, we derve

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

DUE: WEDS FEB 21ST 2018

DUE: WEDS FEB 21ST 2018 HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant

More information

Online Appendix. t=1 (p t w)q t. Then the first order condition shows that

Online Appendix. t=1 (p t w)q t. Then the first order condition shows that Artcle forthcomng to ; manuscrpt no (Please, provde the manuscrpt number!) 1 Onlne Appendx Appendx E: Proofs Proof of Proposton 1 Frst we derve the equlbrum when the manufacturer does not vertcally ntegrate

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

A Note on Bound for Jensen-Shannon Divergence by Jeffreys

A Note on Bound for Jensen-Shannon Divergence by Jeffreys OPEN ACCESS Conference Proceedngs Paper Entropy www.scforum.net/conference/ecea- A Note on Bound for Jensen-Shannon Dvergence by Jeffreys Takuya Yamano, * Department of Mathematcs and Physcs, Faculty of

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Inductance Calculation for Conductors of Arbitrary Shape

Inductance Calculation for Conductors of Arbitrary Shape CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors

More information

The Experts/Multiplicative Weights Algorithm and Applications

The Experts/Multiplicative Weights Algorithm and Applications Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

FUZZY GOAL PROGRAMMING VS ORDINARY FUZZY PROGRAMMING APPROACH FOR MULTI OBJECTIVE PROGRAMMING PROBLEM

FUZZY GOAL PROGRAMMING VS ORDINARY FUZZY PROGRAMMING APPROACH FOR MULTI OBJECTIVE PROGRAMMING PROBLEM Internatonal Conference on Ceramcs, Bkaner, Inda Internatonal Journal of Modern Physcs: Conference Seres Vol. 22 (2013) 757 761 World Scentfc Publshng Company DOI: 10.1142/S2010194513010982 FUZZY GOAL

More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

find (x): given element x, return the canonical element of the set containing x;

find (x): given element x, return the canonical element of the set containing x; COS 43 Sprng, 009 Dsjont Set Unon Problem: Mantan a collecton of dsjont sets. Two operatons: fnd the set contanng a gven element; unte two sets nto one (destructvely). Approach: Canoncal element method:

More information

THE GUARANTEED COST CONTROL FOR UNCERTAIN LARGE SCALE INTERCONNECTED SYSTEMS

THE GUARANTEED COST CONTROL FOR UNCERTAIN LARGE SCALE INTERCONNECTED SYSTEMS Copyrght 22 IFAC 5th rennal World Congress, Barcelona, Span HE GUARANEED COS CONROL FOR UNCERAIN LARGE SCALE INERCONNECED SYSEMS Hroak Mukadan Yasuyuk akato Yoshyuk anaka Koch Mzukam Faculty of Informaton

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

A PROCEDURE FOR SIMULATING THE NONLINEAR CONDUCTION HEAT TRANSFER IN A BODY WITH TEMPERATURE DEPENDENT THERMAL CONDUCTIVITY.

A PROCEDURE FOR SIMULATING THE NONLINEAR CONDUCTION HEAT TRANSFER IN A BODY WITH TEMPERATURE DEPENDENT THERMAL CONDUCTIVITY. Proceedngs of the th Brazlan Congress of Thermal Scences and Engneerng -- ENCIT 006 Braz. Soc. of Mechancal Scences and Engneerng -- ABCM, Curtba, Brazl,- Dec. 5-8, 006 A PROCEDURE FOR SIMULATING THE NONLINEAR

More information

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) , A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS Dr. Derald E. Wentzen, Wesley College, (302) 736-2574, wentzde@wesley.edu ABSTRACT A lnear programmng model s developed and used to compare

More information