An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, and Steven I. Marcus

Abstract: We present a simulation-based algorithm called Simulated Annealing Multiplicative Weights (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is asymptotically efficient, in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained.

Index Terms: stochastic dynamic programming, Markov decision processes, simulation, learning algorithms, simulated annealing

I. INTRODUCTION

Consider a discrete-time system with a finite horizon $H$: $x_{t+1} = f(x_t, a_t, w_t)$ for $t = 0, 1, \ldots, H-1$, where $x_t$ is the state at time $t$ ranging over a (possibly infinite) set $X$, $a_t$ is the action at time $t$ to be chosen from a nonempty subset $A(x_t)$ of a given (possibly infinite) set of available actions $A$ at time $t$, $w_t$ is a random disturbance uniformly and independently selected from $[0,1]$ at time $t$, representing the uncertainty in the system, and $f : X \times A(X) \times [0,1] \to X$ is a next-state function. Throughout, we assume the initial state $x_0$ is given; this is without loss of generality, as the results in the paper carry through for the case where $x_0$ follows a given distribution.

Define a nonstationary (non-randomized) policy $\pi = \{\pi_t \mid \pi_t : X \to A(X),\ t = 0, 1, \ldots, H-1\}$ and its corresponding finite-horizon discounted value function, given by

$$V^{\pi} = E_{w_0,\ldots,w_{H-1}}\left[\sum_{t=0}^{H-1} \gamma^t R(x_t, \pi_t(x_t), w_t)\right], \tag{1}$$

with discount factor $\gamma \in (0,1]$ and one-period reward function $R : X \times A(X) \times [0,1] \to \mathbb{R}^+$. We suppress the explicit dependence of $V^{\pi}$ on the horizon $H$. The function $f$, together with $X$, $A$, and $R$, comprises a stochastic dynamic programming problem or a Markov decision process (MDP) [1], [8].
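As a rough illustration of the setup above: when the next-state and reward functions can only be evaluated by simulation, the expectation defining a policy's value can be approximated by averaging discounted sample-path returns. The sketch below is only that, a sketch; the helper names (`sample_path_return`, `estimate_value`) and the toy simulator are hypothetical and do not come from the paper.

```python
import random

def sample_path_return(f, R, policy, x0, H, gamma, rng):
    """Discounted reward along one simulated path of length H."""
    x, total = x0, 0.0
    for t in range(H):
        w = rng.random()        # disturbance w_t drawn uniformly from [0, 1)
        a = policy[t](x)        # nonstationary policy: one decision rule per period
        total += gamma ** t * R(x, a, w)
        x = f(x, a, w)          # next state returned by the simulator
    return total

def estimate_value(f, R, policy, x0, H, gamma, n_paths, seed=0):
    """Average of n_paths independent sample-path returns."""
    rng = random.Random(seed)
    returns = [sample_path_return(f, R, policy, x0, H, gamma, rng)
               for _ in range(n_paths)]
    return sum(returns) / n_paths

# Hypothetical toy instance: the reward a/H is bounded by 1/H, so values lie in [0, 1].
H = 4
toy_f = lambda x, a, w: x + a if w < 0.5 else x
toy_R = lambda x, a, w: a / H
always_one = [lambda x: 1] * H        # takes action 1 in every period
parity_rule = [lambda x: x % 2] * H   # action depends on the current state
v_always = estimate_value(toy_f, toy_R, always_one, 0, H, 1.0, 200)
v_parity = estimate_value(toy_f, toy_R, parity_rule, 0, H, 1.0, 200)
```

The toy reward does not depend on the disturbance, so `v_always` is the same on every path; with a genuinely random reward the estimate would fluctuate with `n_paths`.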
We assume throughout that the one-period reward function is bounded. For simplicity, but without loss of generality, we take the bound to be $1/H$, i.e., $\sup_{x \in X, a \in A, w \in [0,1]} R(x, a, w) \le 1/H$, so that $0 \le V^{\pi} \le 1$. The problem we consider is estimating the optimal value over a given finite set of policies $\Pi$:

$$V^* := \max_{\pi \in \Pi} V^{\pi}. \tag{2}$$

(Footnote: This work was supported in part by the National Science Foundation under Grant DMI-00, in part by the Air Force Office of Scientific Research under Grant FA , and in part by the Department of Defense. The work of H.S. Chang was also supported by the Sogang University research grants in 2006. H.S. Chang is with the Department of Computer Science and Engineering at Sogang University, Seoul -74, Korea (e-mail: hschang@sogang.ac.kr). M.C. Fu is with the Robert H. Smith School of Business and the Institute for Systems Research at the University of Maryland, College Park (e-mail: mfu@rhsmith.umd.edu). J. Hu is with the Department of Applied Mathematics & Statistics, SUNY, Stony Brook (e-mail: jqhu@xx.xx.edu). S.I. Marcus is with the Department of Electrical & Computer Engineering and the Institute for Systems Research at the University of Maryland, College Park (e-mail: marcus@eng.umd.edu). Preliminary portions of this paper appeared in the Proceedings of the 42nd IEEE Conference on Decision and Control, 2003.)

Any policy that achieves $V^*$ is called an optimal policy. Our setting is that in which explicit forms for $f$ and $R$ are not available, but both can be simulated, i.e., sample paths for the states and rewards can be generated from a given random number sequence $\{w_0, \ldots, w_{H-1}\}$.

We present a simulation-based algorithm called Simulated Annealing Multiplicative Weights (SAMW) for solving (2), based on the weighted majority algorithm of [12]. Specifically, we exploit the recent work on the multiplicative weights algorithm studied by Freund and Schapire [7] in a completely different context: noncooperative repeated two-player bimatrix zero-sum games.
At each iteration, the algorithm updates a probability distribution over $\Pi$ by a multiplicative weight rule using the estimated (from simulation) value functions for all policies in $\Pi$, requiring $|\Pi|$ sample paths. With a proper annealing of the control parameter associated with the algorithm, as in Simulated Annealing (SA) [10], the sequence of distributions generated by the multiplicative weight rule converges to a distribution concentrated only on policies that achieve $V^*$, motivating our choice of SAMW for the name of the algorithm. The algorithm is asymptotically efficient, in the sense that a finite-time upper bound is obtained for the sample mean of the value of an optimal policy, and the upper bound converges to $V^*$ with rate $O(1/\sqrt{T})$, where $T$ is the number of iterations. A sampling version of the algorithm that does not enumerate all policies in $\Pi$ at each iteration, but instead samples from the sequence of generated distributions, is also shown to converge to $V^*$. The sampling version can be used as an on-line simulation-based control in the context of planning.

SAMW differs from the usual SA in that it does not perform any local search; rather, it directly updates a probability distribution over $\Pi$ at each iteration and has a much simpler tuning process than SA. In this regard, it may be said that SAMW is a compressed version of SA with an extension to stochastic dynamic programming. The use of a probability distribution on the search space is a fundamentally different approach from existing simulation-based optimization techniques for solving MDPs, such as (basis-function based) neuro-dynamic programming [2], model-free approaches of Q-learning [19] and TD($\lambda$)-learning [17], and (bandit-theory based) adaptive multistage sampling [4]. Updating a probability distribution over the search space is similar to the learning automata approach for stochastic optimization [13], but SAMW is based on a different multiplicative weight rule.
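Ordinal comparison [9], which simply chooses the current best $\pi \in \Pi$ from the sample mean of $V_i^{\pi}$, does not provide a deterministic upper bound even if a probabilistic bound is possible (see, e.g., the theorem in [6], treating each arm of the bandit as a policy). Furthermore, it is not clear how to design a variant of ordinal comparison that does not enumerate all policies in $\Pi$. This is also true for the recently proposed on-line control algorithms, parallel rollout and policy switching [5], for MDPs.

This paper is organized as follows. In Section II, we present the SAMW algorithm, and in Sections III and IV, we analyze its convergence properties. We conclude in Section VI with some remarks.

II. BASIC ALGORITHM DESCRIPTION

Let $\Phi$ be the set of all probability distributions over $\Pi$. For $\phi \in \Phi$ and $\pi \in \Pi$, let $\phi(\pi)$ denote the probability for policy $\pi$. The goal is to concentrate the probability on the optimal policies in $\Pi$. The SAMW algorithm iteratively generates a sequence of distributions, where $\phi^i$ denotes the distribution at iteration $i$. Each iteration of SAMW requires $H$ random numbers $w_0, \ldots, w_{H-1}$, i.e., i.i.d. $U(0,1)$ and independent from previous iterations. Each policy $\pi \in \Pi$ is then simulated using the same sequence of random numbers for that iteration (different random number sequences can also be used for each policy, and all of the results still hold) in order to obtain a sample path estimate of the value function (1):

$$V_i^{\pi} := \sum_{t=0}^{H-1} \gamma^t R(x_t, \pi_t(x_t), w_t), \tag{3}$$

where the subscript $i$ denotes the iteration count, which has been omitted for notational simplicity in the quantities $x_t$ and $w_t$. The estimates $\{V_i^{\pi}, \pi \in \Pi\}$ are used for updating a probability distribution over $\Pi$ at each iteration. Note that $0 \le V_i^{\pi} \le 1$ (a.s.) by the boundedness assumption. Note also that the size of $\Pi$ in the worst case can be quite large, i.e., $|A|^{|X|H}$, so we assume here that $\Pi$ is relatively small. In Section IV, we study the convergence property of a sampling version of SAMW that does not enumerate all policies in $\Pi$ at each iteration.

The iterative updating to compute the new distribution $\phi^{i+1}$ from $\phi^i$ and $\{V_i^{\pi}\}$ uses a simple multiplicative rule:

$$\phi^{i+1}(\pi) = \phi^i(\pi)\,\frac{\beta_i^{V_i^{\pi}}}{Z^i}, \quad \forall \pi \in \Pi, \tag{4}$$

where $\beta_i > 1$ is a parameter of the algorithm, the normalization factor $Z^i$ is given by $Z^i = \sum_{\pi \in \Pi} \phi^i(\pi)\beta_i^{V_i^{\pi}}$, and the initial distribution $\phi^1$ is the uniform distribution, i.e., $\phi^1(\pi) = 1/|\Pi|$ for all $\pi \in \Pi$.

III. CONVERGENCE ANALYSIS

For $\phi \in \Phi$, define

$$\bar{V}_i(\phi) := \sum_{\pi \in \Pi} V_i^{\pi}\phi(\pi), \qquad \Psi_T^{\pi} := \frac{1}{T}\sum_{i=1}^{T} V_i^{\pi},$$

where $\Psi_T^{\pi}$ is the sample mean estimate for the value function of policy $\pi$. Again, note that (a.s.) $0 \le \bar{V}_i(\phi) \le 1$ for all $\phi \in \Phi$. We remark that $\bar{V}_i(\phi)$ represents an expected reward for each fixed (iteration) experiment, where the expectation is w.r.t. the distribution of the policy.

The following lemma provides a finite-time upper bound for the sample mean of the value function of an optimal policy in terms of the probability distributions generated by SAMW via (4).

Lemma 3.1: For $\beta_i = \beta > 1$, $i = 1, \ldots, T$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\Psi_T^{\pi^*} \le \frac{\beta - 1}{\ln \beta}\cdot\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{T\ln\beta}$$

for any optimal policy $\pi^*$.

Proof: The proof idea follows that of the corresponding theorem in [7], for which it is convenient to introduce the following measure of distance between two probability distributions, called the relative entropy (also known as Kullback-Leibler entropy):

$$D(p, q) := \sum_{\pi \in \Pi} p(\pi)\ln\frac{p(\pi)}{q(\pi)}, \quad p, q \in \Phi. \tag{5}$$

Although $D(p, q) \ge 0$ for any $p$ and $q$, and $D(p, q) = 0$ if and only if $p = q$, the measure is not symmetric, hence not a true metric.

Consider any Dirac distribution $\phi^* \in \Phi$ such that for an optimal policy $\pi^*$ in $\Pi$, $\phi^*(\pi^*) = 1$ and $\phi^*(\pi) = 0$ for all $\pi \in \Pi - \{\pi^*\}$. We first prove that

$$V_i^{\pi^*} \le \frac{\beta - 1}{\ln\beta}\,\bar{V}_i(\phi^i) + \frac{D(\phi^*, \phi^i) - D(\phi^*, \phi^{i+1})}{\ln\beta}, \tag{6}$$

where $\phi^i$ and $\phi^{i+1}$ are generated by SAMW via (4) and $\beta > 1$. From the definition of $D$ given by (5),

$$\begin{aligned}
D(\phi^*, \phi^{i+1}) - D(\phi^*, \phi^i) &= \sum_{\pi}\phi^*(\pi)\ln\frac{\phi^i(\pi)}{\phi^{i+1}(\pi)} = \sum_{\pi}\phi^*(\pi)\ln\frac{Z^i}{\beta^{V_i^{\pi}}} \\
&= -\sum_{\pi}\phi^*(\pi)\ln\beta^{V_i^{\pi}} + \ln Z^i \\
&= (-\ln\beta)\sum_{\pi}\phi^*(\pi)V_i^{\pi} + \ln\sum_{\pi}\phi^i(\pi)\beta^{V_i^{\pi}} \\
&\le (-\ln\beta)V_i^{\pi^*} + \ln\Big[\sum_{\pi}\phi^i(\pi)\big(1 + (\beta - 1)V_i^{\pi}\big)\Big] \\
&= (-\ln\beta)V_i^{\pi^*} + \ln\big(1 + (\beta - 1)\bar{V}_i(\phi^i)\big) \\
&\le (-\ln\beta)V_i^{\pi^*} + (\beta - 1)\bar{V}_i(\phi^i),
\end{aligned}$$

where the first inequality follows from the property $\beta^a \le 1 + (\beta - 1)a$ for $\beta \ge 0$, $a \in [0,1]$, and the last inequality follows from the property $\ln(1 + a) \le a$ for $a > -1$. Solving for $V_i^{\pi^*}$ (recall $\beta > 1$) yields (6). Summing the inequality (6) over $i = 1, \ldots, T$,

$$\begin{aligned}
\sum_{i=1}^{T} V_i^{\pi^*} &\le \frac{\beta - 1}{\ln\beta}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{1}{\ln\beta}\big(D(\phi^*, \phi^1) - D(\phi^*, \phi^{T+1})\big) \\
&\le \frac{\beta - 1}{\ln\beta}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{1}{\ln\beta}D(\phi^*, \phi^1) \\
&\le \frac{\beta - 1}{\ln\beta}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{\ln\beta},
\end{aligned}$$

where the second inequality follows from $D(\phi^*, \phi^{T+1}) \ge 0$, and the last inequality uses the uniform distribution property that $\phi^1(\pi) = 1/|\Pi|$ for all $\pi$ implies $D(\phi^*, \phi^1) \le \ln|\Pi|$. Dividing both sides by $T$ yields the desired result.

If $(\beta - 1)/\ln\beta$ is very close to 1 and at the same time $\ln|\Pi|/(T\ln\beta)$ is very close to 0, then the above inequality implies that the expected per-iteration performance of SAMW is very close to the optimal value. However, letting $\beta \to 1$, $\ln|\Pi|/\ln\beta \to \infty$. On the other hand, for fixed $\beta$ and $T$ increasing, $\ln|\Pi|/\ln\beta$ becomes negligible relative to $T$. Thus, from the form of the bound, it is clear that the sequence $\beta_T$ should be chosen as a function of $T$ such that $\beta_T \to 1$ and $T\ln\beta_T \to \infty$ in order to achieve convergence.

Define the total variation distance for probability distributions $p$ and $q$ by $d_T(p, q) := \sum_{\pi \in \Pi}|p(\pi) - q(\pi)|$. The following lemma states that the sequence of distributions generated by SAMW converges to a stationary distribution, with a proper tuning or annealing of the $\beta$-parameter.

Lemma 3.2: Let $\{\psi(T)\}$ be a decreasing sequence such that $\psi(T) > 1$ for all $T$ and $\lim_{T \to \infty}\psi(T) = 1$. For $\beta_i = \psi(T)$, $i = 1, \ldots, T + k$, $k \ge 1$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\lim_{T \to \infty} d_T(\phi^T, \phi^{T+k}) = 0.$$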
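The multiplicative rule (4) and the finite-time bound of Lemma 3.1 can be checked numerically on a small example. The following is a minimal sketch under stated assumptions: the three-policy toy problem, the noise model, and all function names are hypothetical, and $\beta$ is held fixed as in Lemma 3.1 rather than annealed.

```python
import math
import random

def samw_update(phi, values, beta):
    """One multiplicative-weight step: phi(pi) * beta**V(pi), renormalized."""
    weights = [p * beta ** v for p, v in zip(phi, values)]
    z = sum(weights)                      # normalization factor Z
    return [w / z for w in weights]

def run_samw(draw_values, n_policies, T, beta):
    """Run T iterations from the uniform distribution; keep (phi, values) pairs."""
    phi = [1.0 / n_policies] * n_policies
    history = []
    for _ in range(T):
        values = draw_values()            # one estimate in [0, 1] per policy
        history.append((phi, values))
        phi = samw_update(phi, values, beta)
    return phi, history

# Hypothetical toy problem: three policies with noisy values around fixed means.
rng = random.Random(1)
means = [0.2, 0.5, 0.8]

def draw_values():
    u = rng.random()                      # common random number shared by all policies
    return [m + 0.2 * (u - 0.5) for m in means]

T, beta = 500, 1.5
phi_final, history = run_samw(draw_values, len(means), T, beta)

best = 2                                  # index of the policy with the largest mean
psi_best = sum(vals[best] for _, vals in history) / T
avg_mix = sum(sum(p * v for p, v in zip(phi, vals)) for phi, vals in history) / T
bound = ((beta - 1) / math.log(beta)) * avg_mix \
        + math.log(len(means)) / (T * math.log(beta))
```

Here `psi_best` plays the role of the sample mean for the best policy and `bound` is the right-hand side of the lemma; the inequality holds path by path, so it does not depend on the random seed.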
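Proof: From the definition of $D$ given by (5),

$$D(\phi^T, \phi^{T+1}) = \sum_{\pi}\phi^T(\pi)\ln\frac{\phi^T(\pi)}{\phi^{T+1}(\pi)} \le \max_{\pi}\ln\frac{\phi^T(\pi)}{\phi^{T+1}(\pi)} = \max_{\pi}\ln\frac{Z^T}{\psi(T)^{V_T^{\pi}}} = -\min_{\pi}\ln\frac{\psi(T)^{V_T^{\pi}}}{Z^T} \le \ln\psi(T),$$

since $\psi(T)^{V_T^{\pi}} \ge 1$ and $Z^T \le \psi(T)$ for all $\pi$ and any $T$ (both follow from $0 \le V_T^{\pi} \le 1$).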
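Applying Pinsker's inequality [18],

$$d_T(\phi^T, \phi^{T+1}) \le \sqrt{2D(\phi^T, \phi^{T+1})} \le \sqrt{2\ln\psi(T)}.$$

Therefore,

$$d_T(\phi^T, \phi^{T+k}) \le \sum_{j=1}^{k} d_T(\phi^{T+j-1}, \phi^{T+j}) \le \sum_{j=0}^{k-1}\sqrt{2\ln\psi(T + j)}.$$

Because $d_T(\phi^T, \phi^{T+k}) \ge 0$ for any $k$ and $\sum_{j=0}^{k-1}\sqrt{2\ln\psi(T + j)} \to 0$ as $T \to \infty$, $d_T(\phi^T, \phi^{T+k}) \to 0$ as $T \to \infty$.

Theorem 3.1: Let $\{\psi(T)\}$ be a decreasing sequence such that $\psi(T) > 1$ for all $T$, $\lim_{T \to \infty}\psi(T) = 1$, and $\lim_{T \to \infty} T\ln\psi(T) = \infty$. For $\beta_i = \psi(T)$, $i = 1, \ldots, T$, the sequence of distributions $\phi^1, \ldots, \phi^T$ generated by SAMW via (4) satisfies (a.s.)

$$\frac{\psi(T) - 1}{\ln\psi(T)}\cdot\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{T\ln\psi(T)} \to V^*,$$

and $\phi^i \to \phi^* \in \Phi$, where $\phi^*(\pi) = 0$ for all $\pi$ such that $V^{\pi} < V^*$.

Proof: Using $x - 1 \le x\ln x$ for all $x \ge 1$ and Lemma 3.1,

$$\Psi_T^{\pi^*} \le \frac{\psi(T) - 1}{\ln\psi(T)}\cdot\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{T\ln\psi(T)} \le \psi(T)\cdot\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{T\ln\psi(T)}. \tag{7}$$

In the limit as $T \to \infty$, the lefthand side converges to $V^*$ by the law of large numbers, and in the rightmost expression in (7), $\psi(T) \to 1$ and the second term vanishes, so it suffices to show that $\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i)$ is bounded from above by $V^*$ (in the limit). From Lemma 3.2, for every $\epsilon > 0$, there exists $T_\epsilon < \infty$ such that $d_T(\phi^i, \phi^{i+k}) \le \epsilon$ for all $i > T_\epsilon$ and any integer $k \ge 1$. Then, for $T > T_\epsilon$, we have (a.s.)

$$\begin{aligned}
\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi^i) &= \frac{1}{T}\Big[\sum_{i=1}^{T_\epsilon}\bar{V}_i(\phi^i) + \sum_{i=T_\epsilon+1}^{T}\bar{V}_i(\phi^i)\Big] \\
&= \frac{1}{T}\sum_{i=1}^{T_\epsilon}\bar{V}_i(\phi^i) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T}\bar{V}_i(\phi^{T_\epsilon+1}) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T}\sum_{\pi}\big(\phi^i(\pi) - \phi^{T_\epsilon+1}(\pi)\big)V_i^{\pi} \\
&\le \frac{1}{T}\sum_{i=1}^{T_\epsilon}\bar{V}_i(\phi^i) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T}\bar{V}_i(\phi^{T_\epsilon+1}) + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T}|\Pi|\,\epsilon,
\end{aligned} \tag{8}$$

the last inequality following from $\max_{\pi}|\phi^{i+k}(\pi) - \phi^i(\pi)| \le \epsilon$ and $V_i^{\pi} \le 1$ for all $i > T_\epsilon$, $k \ge 1$, $\pi \in \Pi$. As $T \to \infty$, the first term of (8) vanishes, and the second term converges by the law of large numbers to the value of a policy drawn from the fixed distribution $\phi^{T_\epsilon+1}$, namely $\sum_{\pi}\phi^{T_\epsilon+1}(\pi)V^{\pi}$, which is bounded from above by $V^*$. Since $\epsilon$ can be chosen arbitrarily close to zero, the desired convergence follows.

The second part of the theorem follows directly from the first part with Lemma 3.2, with a proof obtained in a straightforward manner by assuming there exists a $\pi \in \Pi$ such that $\phi^*(\pi) \ne 0$ and $V^{\pi} < V^*$, leading to a contradiction. We skip the details.

An example of a decreasing sequence $\{\psi(T)\}$, $T = 1, 2, \ldots$, that satisfies the condition of Theorem 3.1 is $\psi(T) = 1 + \sqrt{1/T}$, $T > 0$.

IV. CONVERGENCE OF THE SAMPLING VERSION OF THE ALGORITHM

Instead of estimating the value functions for every policy in $\Pi$ according to (3), which requires simulating all policies in $\Pi$, a sampling version of the algorithm would sample a subset of the policies in $\Pi$ at each iteration $i$ according to $\phi^i$ and simulate only those policies (and estimate their corresponding value functions). In this context, Theorem 3.1 essentially establishes that the expected per-iteration performance of SAMW approaches the optimal value as $T \to \infty$ for an appropriately selected tuning sequence $\{\beta_i\}$. Here, we show that the actual (distribution sampled) per-iteration performance also converges to the optimal value using a particular annealing schedule of the parameter $\beta$. For simplicity, we assume that a single policy is sampled at each iteration (i.e., the subset is a singleton). A related result is proven by Freund and Schapire within the context of solving two-player zero-sum bimatrix repeated games [7], and the proof of the following theorem is based on theirs.

Theorem 4.1: Let $T_k = \sum_{j=1}^{k} j^2$. For $\beta_i = 1 + 1/k$, $T_{k-1} < i \le T_k$, let $\{\phi^i\}$ denote the sequence of distributions generated by SAMW via (4), with resetting of $\phi^i(\pi) = 1/|\Pi|$ for all $\pi$ at each $i = T_k$. Let $\hat{\pi}(\phi^i)$ denote the policy sampled from $\phi^i$ (at iteration $i$). Then (a.s. as $k \to \infty$),

$$\frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat{\pi}(\phi^i)} \to V^*.$$

Proof: The sequence of random variables $\kappa_i = V_i^{\hat{\pi}(\phi^i)} - \bar{V}_i(\phi^i)$ forms a martingale difference sequence with $|\kappa_i| \le 1$, since $E[\kappa_i \mid \kappa_1, \ldots, \kappa_{i-1}] = 0$ for all $i$. Let $\epsilon_k = 2\sqrt{\ln k}/k$ and $I_k = [T_{k-1} + 1, T_k]$. Applying Azuma's inequality [14], we have that for every $\epsilon_k > 0$,

$$P\left(\frac{1}{k^2}\left|\sum_{i \in I_k}\Big(V_i^{\hat{\pi}(\phi^i)} - \bar{V}_i(\phi^i)\Big)\right| > \epsilon_k\right) \le 2e^{-0.5k^2\epsilon_k^2} = \frac{2}{k^2}. \tag{9}$$

The sum of the probability bound in (9) over all $k$ from 1 to $\infty$ is finite. Therefore, by the Borel-Cantelli lemma, (a.s.) all but a finite number of the intervals $I_k$ ($k = 1, \ldots, \infty$) satisfy

$$\sum_{i \in I_k}\bar{V}_i(\phi^i) \le \sum_{i \in I_k} V_i^{\hat{\pi}(\phi^i)} + k^2\epsilon_k, \tag{10}$$

so those $I_k$ that violate (10) can be ignored (a.s.). From Lemma 3.1 with the definition of $\beta_i$, for all $i \in I_k$,

$$\begin{aligned}
\sum_{i \in I_k} V_i^{\pi^*} &\le \frac{\beta_i - 1}{\ln\beta_i}\sum_{i \in I_k}\bar{V}_i(\phi^i) + \frac{\ln|\Pi|}{\ln\beta_i} \le \beta_i\sum_{i \in I_k}\bar{V}_i(\phi^i) + \frac{\beta_i\ln|\Pi|}{\beta_i - 1} \\
&= \Big(1 + \frac{1}{k}\Big)\sum_{i \in I_k}\bar{V}_i(\phi^i) + \ln|\Pi|\,(k + 1) \le \sum_{i \in I_k}\bar{V}_i(\phi^i) + k + \ln|\Pi|\,(k + 1),
\end{aligned} \tag{11}$$

where the last inequality follows from $\bar{V}_i(\phi) \le 1$ for all $\phi \in \Phi$ and $|I_k| = k^2$.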
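For readers who want the intermediate steps in the chain of inequalities just above, the following worked derivation spells out the two elementary bounds that appear to be used when $\beta$ is set to $1 + 1/k$. It relies only on the standard fact $\ln x \ge 1 - 1/x$ for $x > 0$; that this is the step the authors had in mind is an assumption.

```latex
% Elementary bounds for \beta = 1 + 1/k > 1, from \ln x \ge 1 - 1/x.
\ln\beta \;\ge\; 1 - \frac{1}{\beta} = \frac{\beta-1}{\beta}
\quad\Longrightarrow\quad
\frac{\beta-1}{\ln\beta} \;\le\; \beta = 1 + \frac{1}{k},
\qquad
\frac{1}{\ln\beta} \;\le\; \frac{\beta}{\beta-1} = k + 1 .

% Final step: each \bar V_i(\phi^i) \le 1 and the block I_k has k^2 iterations, so
\Bigl(1+\frac{1}{k}\Bigr)\sum_{i\in I_k}\bar V_i(\phi^i)
\;\le\; \sum_{i\in I_k}\bar V_i(\phi^i) + \frac{1}{k}\,k^2
\;=\; \sum_{i\in I_k}\bar V_i(\phi^i) + k .
```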
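Combining (10) and (11) and summing, we have

$$T_k\Psi_{T_k}^{\pi^*} \le \sum_{i=1}^{T_k} V_i^{\hat{\pi}(\phi^i)} + \sum_{j=1}^{k}\Big[2j\sqrt{\ln j} + j(\ln|\Pi| + 1) + \ln|\Pi|\Big],$$

so

$$\Psi_{T_k}^{\pi^*} \le \frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat{\pi}(\phi^i)} + \frac{1}{T_k}\sum_{j=1}^{k}\Big[2j\sqrt{\ln j} + j(\ln|\Pi| + 1) + \ln|\Pi|\Big]. \tag{12}$$

Because $T_k$ is $O(k^3)$, the second term on the righthand side of (12) vanishes as $k \to \infty$. Therefore, for every $\epsilon > 0$, (a.s.) for all but a finite number of values of $T_k$,

$$\Psi_{T_k}^{\pi^*} \le \frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat{\pi}(\phi^i)} + \epsilon.$$

We now argue that $\{\phi^i\}$ converges to a fixed distribution as $k \to \infty$, so that eventually the term $\frac{1}{T_k}\sum_{i=1}^{T_k} V_i^{\hat{\pi}(\phi^i)}$ is bounded from above by $V^*$. With similar reasoning as in the proof of Lemma 3.2, for every $\epsilon > 0$, there exists $T_\epsilon \in I_k$ for some $k > 1$ such that for all $i > T_\epsilon$ with $i + j \in I_k$, $j \ge 1$, $d_T(\phi^i, \phi^{i+j}) \le \epsilon$. Taking $T > T_\epsilon$ with $T \in I_k$,

$$\begin{aligned}
\frac{1}{T}\sum_{i=1}^{T} V_i^{\hat{\pi}(\phi^i)} &= \frac{1}{T}\Big[\sum_{i=1}^{T_\epsilon} V_i^{\hat{\pi}(\phi^i)} + \sum_{i=T_\epsilon+1}^{T} V_i^{\hat{\pi}(\phi^i)}\Big] \\
&= \frac{1}{T}\sum_{i=1}^{T_\epsilon} V_i^{\hat{\pi}(\phi^i)} + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T} V_i^{\hat{\pi}(\phi^{T_\epsilon+1})} + \frac{1}{T}\sum_{i=T_\epsilon+1}^{T}\Big(V_i^{\hat{\pi}(\phi^i)} - V_i^{\hat{\pi}(\phi^{T_\epsilon+1})}\Big).
\end{aligned} \tag{13}$$

Letting $T \to \infty$, the first term on the righthand side of (13) vanishes, and the second term is bounded from above by $V^*$, because by the law of large numbers it converges to the value of a policy drawn from the fixed distribution $\phi^{T_\epsilon+1}$. We know that for all $i > T_\epsilon$ in $I_k$, $-\epsilon + \phi^i(\pi) \le \phi^{i+j}(\pi) \le \phi^i(\pi) + \epsilon$ for all $\pi \in \Pi$ and any $j$. Therefore, as $\epsilon$ can be chosen arbitrarily close to zero, $\{\phi^i\}$ converges to a fixed distribution, making the last term vanish (once each policy is sampled from the same distribution over $\Pi$, the simulated value would be the same for the same random numbers), which provides the desired convergence result.

V. A NUMERICAL EXAMPLE

To illustrate the performance of SAMW, we consider a simple finite-horizon inventory control problem with lost sales, zero order lead time, and linear holding and shortage costs. Given an inventory level, orders are placed and received, demand is realized, and the new inventory level is calculated. Formally, we let $D_t$, a discrete random variable, denote the demand in period $t$, $x_t$ the inventory level at period $t$, $a_t$ the order amount at period $t$, $p$ the per-period per-unit lost-demand penalty cost, $h$ the per-period per-unit inventory holding cost, and $M$ the inventory capacity. Thus, the inventory level evolves according to the following dynamics: $x_{t+1} = \max\{0, x_t + a_t - D_t\}$.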
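The lost-sales dynamics, together with the holding cost $h$ and penalty cost $p$ just defined, can be sketched as a small simulator. Everything below is illustrative: the cost values, demand levels, horizon of three periods, and the order-up-to rule are assumptions chosen for the example rather than the paper's experimental settings, and the capacity $M$ is respected only implicitly by keeping the order-up-to level at or below it.

```python
import random

def next_inventory(x, a, d):
    """Lost-sales dynamics: unmet demand is lost, so inventory never goes negative."""
    return max(0, x + a - d)

def period_cost(x, a, d, h, p):
    """Holding cost on leftover stock plus penalty cost on lost demand."""
    return h * max(0, x + a - d) + p * max(0, d - x - a)

def order_up_to(level):
    """Hypothetical ordering rule: raise inventory to `level` when it is below it."""
    return lambda x: max(0, level - x)

def simulate_cost(rules, x0, demands, h, p):
    """Total cost of one demand path under one ordering rule per period."""
    x, total = x0, 0
    for rule, d in zip(rules, demands):
        a = rule(x)
        total += period_cost(x, a, d, h, p)
        x = next_inventory(x, a, d)
    return total

# Deterministic check with assumed costs h = 1, p = 2 and a fixed demand path.
example_cost = simulate_cost([order_up_to(10)] * 3, 5, [10, 0, 20], h=1, p=2)

# Monte Carlo average over assumed uniform demands on {0, 5, 10, 15, 20}.
rng = random.Random(0)
demand_levels = [0, 5, 10, 15, 20]
paths = [[rng.choice(demand_levels) for _ in range(3)] for _ in range(1000)]
mean_cost = sum(simulate_cost([order_up_to(10)] * 3, 5, d, 1, 2) for d in paths) / len(paths)
```

In the deterministic path the three period costs are 0, 10, and 20, so `example_cost` is 30. Since SAMW is stated for bounded rewards to be maximized, using such a cost simulator inside it would require rescaling costs into a bounded reward, which this sketch does not attempt.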
The goal is to minimize, over a given set of (nonstationary) policies $\Pi$, the expected total cost over the entire horizon from a given initial inventory level $x_0$, i.e.,

$$\min_{\pi \in \Pi} E\left[\sum_{t=0}^{H-1}\Big[h\max\{0, x_t + \pi_t(x_t) - D_t\} + p\max\{0, D_t - x_t - \pi_t(x_t)\}\Big] \,\Big|\, x_0 = x\right].$$

The following set of parameters is used in our experiments: $M = 20$, $H = 3$, h = 0.00, p = 0.0, $x_0 = 5$ and $x_t \in \{0, 5, 10, 15, 20\}$ for $t = 1, \ldots, H$, $a_t \in \{0, 5, 10, 15, 20\}$ for all $t = 0, \ldots, H-1$, and $D_t$ is a discrete uniformly distributed random variable taking values in $\{0, 5, 10, 15, 20\}$. The values of $h$ and $p$ are chosen so as to satisfy the one-period reward bound assumed in the SAMW convergence results. Note that since we are ignoring the setup cost (i.e., no fixed order cost), it is easy to see that the optimal order policy follows a threshold rule, in which an order is placed at period $t$ if the inventory level $x_t$ is below a certain threshold $S_t$, and the amount to order is equal to the difference $\max\{0, S_t - x_t\}$. Thus, by taking advantage of this structure, in the actual implementation of SAMW, we restrict the search of the algorithm to the set of threshold policies, i.e., $\Pi = \{(S_0, S_1, S_2) : S_t \in \{0, 5, 10, 15, 20\},\ t = 0, 1, 2\}$, rather than the set of all admissible policies.

We implemented two versions of SAMW: the fully sampled version, which constructs the optimal value function estimate by enumerating all policies in $\Pi$ and using all value function estimates, and the single sampling version introduced in Theorem 4.1, which uses just one sampled policy in each iteration to update the optimal value function estimate; however, updating $\phi^i$ requires value function estimates for all policies in $\Pi$.

For numerical comparison, we also applied the adaptive multi-stage sampling (AMS) algorithm [4] and a non-adaptive multi-stage sampling (NMS) algorithm. These two algorithms are simulation-tree based methods, where each node in the tree represents a state, with the root node being the initial state, and each edge signifies the sampling of an action.
They both use forward search to generate sample paths from the initial state to the final state, and update the value function backwards only at those visited states. The difference between the AMS and NMS algorithms is in the way actions are sampled at each decision period: AMS samples actions in an adaptive manner according to some performance index, whereas NMS simply samples each action a fixed number of times. A detailed description of these approaches can be found in [4].

[Figure 1: value function estimate versus total periods simulated (×10^5); curves for SAMW (fully sampled), SAMW (single sampling), SAMW (fully sampled) with constant β, SAMW (single sampling) with constant β, AMS, NMS, and the optimal value.]

Fig. 1. Average performance (mean of 5 simulation replications, resulting in confidence half-widths within 5% of estimated mean) of SAMW, AMS, and NMS on the inventory control problem (h = 0.00, p = 0.0).

Figure 1 shows the performance of these algorithms as a function of the total number of periods simulated, based on 5 independent replications. The results indicate convergence of both versions of SAMW; however, the two alternative benchmark algorithms AMS and NMS seem to provide superior empirical performance over SAMW. We believe this is because the annealing schedule for $\beta$ used in SAMW is too conservative for this problem, thus leading to slow convergence. To improve the empirical performance of SAMW, we also implemented both versions of the algorithm with $\beta$ held constant throughout the search, i.e., independent of $T$. The constant-$\beta$ case is included in Figure 1, which shows significantly improved performance. Experimentation with the SAMW algorithm also revealed that it performed even better for cost parameter values in the inventory control problem that do not satisfy the strict reward bound. One such example is shown in Figure 2, for a case in which $h$ and $p$ violate that bound (all other parameter values unchanged).

[Figure 2: value function estimate versus total periods simulated (×10^5); curves for SAMW (fully sampled), SAMW (single sampling), AMS, NMS, and the optimal value.]

Fig. 2. Average performance (mean of 5 simulation replications, resulting in confidence half-widths within 5% of estimated mean) of SAMW, AMS, and NMS on the inventory control problem with cost parameters that do not satisfy the reward bound.

VI. CONCLUDING REMARKS

SAMW can be naturally parallelized to reduce its computation time. Partition the given policy space $\Pi$ into $\{\Delta_j\}$ such that $\Delta_j \cap \Delta_{j'} = \emptyset$ for all $j \ne j'$ and $\bigcup_j \Delta_j = \Pi$, and apply the algorithm in parallel for $T$ iterations on each $\Delta_j$. For a fixed value of $\beta > 1$, we have the following finite-time bound from Lemma 3.1:

$$\Psi_T^{\pi^*} \le \max_j\left\{\frac{\beta - 1}{\ln\beta}\cdot\frac{1}{T}\sum_{i=1}^{T}\bar{V}_i(\phi_j^i) + \frac{\ln|\Delta_j|}{T\ln\beta}\right\},$$

where $\phi_j^i$ is the distribution generated for $\Delta_j$ at iteration $i$.

The original version of SAMW recalculates an estimate of the value function for all policies in $\Pi$ at each iteration, requiring each policy to be simulated. If $\Pi$ is large, this may not be practical, and the sampling version of SAMW given by Theorem 4.1 also requires each value function estimate in order to update $\phi^i$ at each iteration. One simple alternative is to use the prior value function estimates for updating $\phi^i$, except for the single sampled policy; thus, only one simulation per iteration would be required. Specifically, $V_i^{\pi} := V_{i-1}^{\pi}$ if $\pi$ is not sampled at iteration $i$; otherwise a new estimate of $V_i^{\pi}$ is obtained via (3). An extension of this is to use a threshold on $\phi^i$ to determine which policies will be simulated.
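The two cheaper variants described above (re-simulate only the sampled policy, or only the policies whose probability exceeds a threshold) might be sketched as follows. This is a hedged illustration, not the authors' implementation: the function name `samw_lazy`, the warm start that simulates every policy once before the first iteration, the constant $\beta$, and the toy simulator are all assumptions, and no convergence guarantee is claimed for it.

```python
import random

def samw_lazy(simulate, n_policies, T, beta, eps=None, seed=0):
    """SAMW update with stale estimates.

    eps is None : re-simulate only the single policy sampled from phi.
    eps given   : re-simulate every policy whose probability exceeds eps.
    """
    rng = random.Random(seed)
    phi = [1.0 / n_policies] * n_policies
    # Assumed warm start: one initial simulation per policy so priors exist.
    estimates = [simulate(k, rng) for k in range(n_policies)]
    n_sims = n_policies
    for _ in range(T):
        if eps is None:
            refresh = [rng.choices(range(n_policies), weights=phi)[0]]
        else:
            refresh = [k for k in range(n_policies) if phi[k] > eps]
        for k in refresh:
            estimates[k] = simulate(k, rng)   # all other estimates are reused
            n_sims += 1
        weights = [p * beta ** v for p, v in zip(phi, estimates)]
        z = sum(weights)
        phi = [w / z for w in weights]
    return phi, n_sims

# Hypothetical simulator: noisy value in [0.1, 0.9] around a fixed mean per policy.
means = [0.2, 0.5, 0.8]
noisy = lambda k, rng: means[k] + 0.2 * (rng.random() - 0.5)

phi_single, sims_single = samw_lazy(noisy, 3, 200, 1.2)
phi_thresh, sims_thresh = samw_lazy(noisy, 3, 200, 1.2, eps=0.05)
```

The single-sample variant uses exactly one simulation per iteration after the warm start, while the threshold variant uses fewer simulations as probability mass drains away from weak policies.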
Since the sequence of distributions generated by SAMW converges to a distribution concentrated on the optimal policies in $\Pi$, the contributions from non-optimal policies get smaller and smaller as the number of iterations increases, so these policies need not be resimulated (and their value function estimates updated) very often. Specifically, $V_i^{\pi} := V_{i-1}^{\pi}$ if $\phi^i(\pi) \le \epsilon$; otherwise a new estimate of $V_i^{\pi}$ is obtained via (3).

The cooling schedule presented in Theorem 4.1 is just one way of controlling the parameter $\beta$. Characterizing properties of good schedules is critical to effective implementation, as the numerical experiments showed. The numerical experiments also demonstrated that the algorithm may work well outside the boundaries of the assumptions under which theoretical convergence is proved, specifically the bound on the one-period reward function and the value of the cooling parameter $\beta$. We suspect this has something to do with the scaling of the algorithm, but more investigation into this phenomenon is clearly warranted.

Finally, we presented SAMW in the MDP framework for optimization of sequential decision making processes. Even though the idea is general, the actual algorithm depends on the sequential structure of MDPs. The general problem given by (2) takes the form of a general stochastic optimization problem, so SAMW can also be adapted to serve as a global stochastic optimization algorithm for bounded expected value objective functions.

REFERENCES

[1] D. P. Bertsekas, Dynamic Programming and Optimal Control, Volumes 1 and 2. Athena Scientific, 1995.
[2] D. P. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
[3] H. S. Chang, M. C. Fu, and S. I. Marcus, "An asymptotically efficient algorithm for finite horizon stochastic dynamic programming problems," in Proc. of the 42nd IEEE Conf. Decision and Control, 2003.
[4] H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An adaptive sampling algorithm for solving Markov decision processes," Operations Research, vol. 53, no. 1, pp. 126-139, 2005.
[5] H. S. Chang, R. Givan, and E. K. P. Chong, "Parallel rollout for online solution of partially observable Markov decision processes," Discrete Event Dynamic Systems: Theory and Application, vol. 14, no. 3, pp. 309-341, 2004.
[6] E. Even-Dar, S. Mannor, and Y. Mansour, "PAC bounds for multi-armed bandit and Markov decision processes," in Proc. of the 15th Annual Conf. on Computational Learning Theory, 2002.
[7] Y. Freund and R. Schapire, "Adaptive game playing using multiplicative weights," Games and Economic Behavior, vol. 29, pp. 79-103, 1999.
[8] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, 1996.
[9] Y. C. Ho, C. Cassandras, C-H. Chen, and L. Dai, "Ordinal optimization and simulation," J. of Operations Research Society, 2000.
[10] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
[11] A. J. Kleywegt, A. Shapiro, and T. Homem-de-Mello, "The sample average approximation method for stochastic discrete optimization," SIAM J. on Control and Optimization.
[12] N. Littlestone and M. K. Warmuth, "The weighted majority algorithm," Information and Computation, vol. 108, pp. 212-261, 1994.
[13] A. S. Poznyak and K. Najim, Learning Automata and Stochastic Optimization. Springer-Verlag, 1997.
[14] S. M. Ross, Stochastic Processes, Second Edition. John Wiley & Sons, 1996.
[15] R. Y. Rubinstein and A. Shapiro, Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. John Wiley & Sons, 1993.
[16] N. Shimkin and A. Shwartz, "Guaranteed performance regions in Markovian systems with competing decision makers," IEEE Trans. on Automatic Control, vol. 38, no. 1, pp. 84-95, 1993.
[17] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT Press, Cambridge, Massachusetts, 1998.
[18] F. Topsoe, "Bounds for entropy and divergence for distributions over a two-element set," J. of Inequalities in Pure and Applied Mathematics, vol. 2, issue 2, Article 25, 2001.
[19] C. J. C. H. Watkins, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992.
CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationThe Minimum Universal Cost Flow in an Infeasible Flow Network
Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran
More informationRyan (2009)- regulating a concentrated industry (cement) Firms play Cournot in the stage. Make lumpy investment decisions
1 Motvaton Next we consder dynamc games where the choce varables are contnuous and/or dscrete. Example 1: Ryan (2009)- regulatng a concentrated ndustry (cement) Frms play Cournot n the stage Make lumpy
More informationThe Second Anti-Mathima on Game Theory
The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player
More informationCSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography
CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve
More informationA Simple Inventory System
A Smple Inventory System Lawrence M. Leems and Stephen K. Park, Dscrete-Event Smulaton: A Frst Course, Prentce Hall, 2006 Hu Chen Computer Scence Vrgna State Unversty Petersburg, Vrgna February 8, 2017
More informationEstimation: Part 2. Chapter GREG estimation
Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the
More informationQueueing Networks II Network Performance
Queueng Networks II Network Performance Davd Tpper Assocate Professor Graduate Telecommuncatons and Networkng Program Unversty of Pttsburgh Sldes 6 Networks of Queues Many communcaton systems must be modeled
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationVQ widely used in coding speech, image, and video
at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng
More informationCHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE
CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng
More informationResource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud
Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal
More informationAPPENDIX A Some Linear Algebra
APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,
More informationReport on Image warping
Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More informationTAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES
TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES SVANTE JANSON Abstract. We gve explct bounds for the tal probabltes for sums of ndependent geometrc or exponental varables, possbly wth dfferent
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationHigh resolution entropy stable scheme for shallow water equations
Internatonal Symposum on Computers & Informatcs (ISCI 05) Hgh resoluton entropy stable scheme for shallow water equatons Xaohan Cheng,a, Yufeng Ne,b, Department of Appled Mathematcs, Northwestern Polytechncal
More informationWinter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan
Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments
More informationSimulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests
Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth
More informationGames of Threats. Elon Kohlberg Abraham Neyman. Working Paper
Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017
More informationGeneral viscosity iterative method for a sequence of quasi-nonexpansive mappings
Avalable onlne at www.tjnsa.com J. Nonlnear Sc. Appl. 9 (2016), 5672 5682 Research Artcle General vscosty teratve method for a sequence of quas-nonexpansve mappngs Cuje Zhang, Ynan Wang College of Scence,
More informationCollege of Computer & Information Science Fall 2009 Northeastern University 20 October 2009
College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:
More informationMaximizing the number of nonnegative subsets
Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum
More informationSimultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals
Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,
More informationA PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS
HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,
More informationMore metrics on cartesian products
More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of
More information( ) ( ) ( ) ( ) STOCHASTIC SIMULATION FOR BLOCKED DATA. Monte Carlo simulation Rejection sampling Importance sampling Markov chain Monte Carlo
SOCHASIC SIMULAIO FOR BLOCKED DAA Stochastc System Analyss and Bayesan Model Updatng Monte Carlo smulaton Rejecton samplng Importance samplng Markov chan Monte Carlo Monte Carlo smulaton Introducton: If
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationVapnik-Chervonenkis theory
Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown
More informationCSC 411 / CSC D11 / CSC C11
18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t
More informationx = , so that calculated
Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to
More informationAssortment Optimization under MNL
Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.
More informationSupplement to Clustering with Statistical Error Control
Supplement to Clusterng wth Statstcal Error Control Mchael Vogt Unversty of Bonn Matthas Schmd Unversty of Bonn In ths supplement, we provde the proofs that are omtted n the paper. In partcular, we derve
More information4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA
4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationMatrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD
Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo
More informationAppendix B. The Finite Difference Scheme
140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton
More informationOn the correction of the h-index for career length
1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat
More informationComputing Correlated Equilibria in Multi-Player Games
Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,
More informationDUE: WEDS FEB 21ST 2018
HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant
More informationOnline Appendix. t=1 (p t w)q t. Then the first order condition shows that
Artcle forthcomng to ; manuscrpt no (Please, provde the manuscrpt number!) 1 Onlne Appendx Appendx E: Proofs Proof of Proposton 1 Frst we derve the equlbrum when the manufacturer does not vertcally ntegrate
More informationEdge Isoperimetric Inequalities
November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary
More informationThe Order Relation and Trace Inequalities for. Hermitian Operators
Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence
More informationA Note on Bound for Jensen-Shannon Divergence by Jeffreys
OPEN ACCESS Conference Proceedngs Paper Entropy www.scforum.net/conference/ecea- A Note on Bound for Jensen-Shannon Dvergence by Jeffreys Takuya Yamano, * Department of Mathematcs and Physcs, Faculty of
More informationMarkov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement
Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs
More informationYong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )
Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationParametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010
Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton
More informationInductance Calculation for Conductors of Arbitrary Shape
CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors
More informationThe Experts/Multiplicative Weights Algorithm and Applications
Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationFUZZY GOAL PROGRAMMING VS ORDINARY FUZZY PROGRAMMING APPROACH FOR MULTI OBJECTIVE PROGRAMMING PROBLEM
Internatonal Conference on Ceramcs, Bkaner, Inda Internatonal Journal of Modern Physcs: Conference Seres Vol. 22 (2013) 757 761 World Scentfc Publshng Company DOI: 10.1142/S2010194513010982 FUZZY GOAL
More informationFoundations of Arithmetic
Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationfind (x): given element x, return the canonical element of the set containing x;
COS 43 Sprng, 009 Dsjont Set Unon Problem: Mantan a collecton of dsjont sets. Two operatons: fnd the set contanng a gven element; unte two sets nto one (destructvely). Approach: Canoncal element method:
More informationTHE GUARANTEED COST CONTROL FOR UNCERTAIN LARGE SCALE INTERCONNECTED SYSTEMS
Copyrght 22 IFAC 5th rennal World Congress, Barcelona, Span HE GUARANEED COS CONROL FOR UNCERAIN LARGE SCALE INERCONNECED SYSEMS Hroak Mukadan Yasuyuk akato Yoshyuk anaka Koch Mzukam Faculty of Informaton
More informationLinear Regression Analysis: Terminology and Notation
ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented
More informationLecture 6 More on Complete Randomized Block Design (RBD)
Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationExcess Error, Approximation Error, and Estimation Error
E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple
More informationA PROCEDURE FOR SIMULATING THE NONLINEAR CONDUCTION HEAT TRANSFER IN A BODY WITH TEMPERATURE DEPENDENT THERMAL CONDUCTIVITY.
Proceedngs of the th Brazlan Congress of Thermal Scences and Engneerng -- ENCIT 006 Braz. Soc. of Mechancal Scences and Engneerng -- ABCM, Curtba, Brazl,- Dec. 5-8, 006 A PROCEDURE FOR SIMULATING THE NONLINEAR
More informationA LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,
A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS Dr. Derald E. Wentzen, Wesley College, (302) 736-2574, wentzde@wesley.edu ABSTRACT A lnear programmng model s developed and used to compare
More information