A Variance Analysis for POMDP Policy Evaluation


Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)

Mahdi Milani Fard and Joelle Pineau
School of Computer Science, McGill University, Montreal, Canada

Peng Sun
Fuqua School of Business, Duke University, Durham, USA

Abstract

Partially Observable Markov Decision Processes have been studied widely as a model for decision making under uncertainty, and a number of methods have been developed to find the solutions for such processes. Such studies often involve calculation of the value function of a specific policy, given a model of the transition and observation probabilities, and the reward. These models can be learned using labeled samples of on-policy trajectories. However, when using empirical models, some bias and variance terms are introduced into the value function as a result of imperfect models. In this paper, we propose a method for estimating the bias and variance of the value function in terms of the statistics of the empirical transition and observation model. Such error terms can be used to meaningfully compare the value of different policies. This is an important result for sequential decision-making, since it will allow us to provide more formal guarantees about the quality of the policies we implement. To evaluate the precision of the proposed method, we provide supporting experiments on problems from the field of robotics and medical decision making.

Introduction

It is common in the context of Markov Decision Processes (MDPs) to calculate the value function of a specific policy, based on some transition and reward model. When the model is not known a priori, one can compute an empirical model from some sample on-policy trajectories using a basic frequentist approach and then use this model (along with Bellman's equation) to calculate the value function of the target policy. Using imperfect models, however, will introduce some error in the estimated value function.
As a general practice with learning methods, we might want to know how good this estimate of the value function is, given the error in the estimated models. This can be expressed in terms of the bias and variance of the calculated value function. The variability of the value function may have two different sources. One is the stochastic nature of MDPs (internal variance), and the other is the error due to the use of the imperfect empirical model instead of the true model (parametric variance). Internal variance and its reduction have been studied in several works (Greensmith, Bartlett, & Baxter 2004). Here we are mostly interested in finding the latter. Mannor et al. (2004) showed that when the empirical model is reasonably close to the true model, we can use a second order approximation to calculate these terms in the value function of an MDP.

In this paper we extend these ideas to the context of Partially Observable Markov Decision Processes (POMDPs) and derive similar expressions for the bias and variance terms. This is an important result for the deployment of autonomous decision-making systems in real-world domains since it is well-known that POMDPs are a more realistic model of decision-making than MDPs (because they allow partial state observability). It is worth noting that approximation methods for POMDPs have made large leaps in recent years; and while these approaches consistently assume a perfect model of the domain, in real-world applications, these models must often be estimated from data.

The method outlined in this paper can be used to assess when we have gathered sufficient data to have a good estimate of the value function. The method can also be used to assess whether we can confidently select one policy over another. Finally, the method can be used to define classes of equivalent policies. These are useful considerations when developing expert systems, especially for critical applications such as human-robot interaction and medical decision-making.

Copyright © 2008, Association for the Advancement of Artificial Intelligence. All rights reserved.
Background

In this section we define the model and notation that will be used in the following sections.

Partially Observable Markov Decision Process

We consider a finite-state, finite-action, discounted reward POMDP (Sondik 1971; Cassandra, Kaelbling, & Littman 1994):

$S$: finite set of states
$A$: finite set of actions
$Z$: finite set of observations
$R^a$: $|S|$-dimensional vector of rewards when selecting action $a$
$T^a$: $|S| \times |S|$ matrix of transition probabilities when selecting action $a$
$O^a$: $|S| \times |Z|$ matrix of observation probabilities when selecting action $a$
$\gamma$: discount factor

It is well known that the value function of the optimal policy of a POMDP in the finite horizon is a convex piecewise linear function of the belief state (Sondik 1971). It is often convenient to use a finite-horizon approximation in the infinite horizon case. Thus, we work only with piecewise linear value functions.

Finite State Controller and Value Function

Sondik (1971) points out that an optimal policy for a finite-horizon POMDP can be represented as an acyclic finite-state controller in which each of the machine states represents a linear piece (or the corresponding alpha vector) in the piecewise linear value function. The state of the controller is based on the observation history, and the action of the agent is based only on the state of the controller. For deterministic policies, each machine state $i$ issues an action $a(i)$ and then the controller transitions to a new machine state according to the received observation. This finite-state controller is usually represented as a policy graph. An example of a policy graph for a POMDP dialog manager is shown in Fig 2.

Cassandra, Kaelbling, & Littman (1994) state that dynamic programming algorithms for infinite-horizon POMDPs, such as value iteration, sometimes converge to an optimal piecewise linear value function that is equivalent to a cyclic finite-state controller. In the case that the optimal value function is not piecewise linear, it is still possible to find an approximate or suboptimal finite-state controller (Poupart & Boutilier 2003).

Given a finite-state controller for a policy, we can extract the value function of the POMDP using a linear system of equations. To extract the $i$th linear piece of the POMDP value function, we calculate the value of each POMDP state over that linear piece.
For each machine state $i$ (corresponding to a linear piece), and each POMDP state $s$, the value of $s$ over the $i$th linear piece is:

$$v_i(s) = r(s, a(i)) + \gamma \sum_{s', z} T^{a(i)}(s, s')\, O^{a(i)}(s', z)\, v_{l(i,z)}(s'),$$

where $r(s, a)$ is the immediate reward and $l(i, z)$ is the next machine state from state $i$ and given observation $z$ (Hansen 1998). We can rewrite the above system of equations in matrix form using the following definitions:

$K$: finite set of machine states in the policy graph
$v_k$ for $k \in K$: $|S|$-dimensional vector of coefficients representing a linear piece in the value function
$V$: $|S||K|$-dimensional vector, vertical concatenation of the $v_k$'s, representing the POMDP value function
$a(k)$ for $k \in K$: the action associated with machine state $k$ according to the fixed policy
$r_k = R^{a(k)}$ for $k \in K$: $|S|$-dimensional vector of coefficients representing a linear piece in the piecewise linear immediate reward function
$R$: $|S||K|$-dimensional vector, concatenation of the $r_k$'s
$T$: $|S||K| \times |S||K|$ block diagonal matrix of $|K| \times |K|$ blocks, with $T^{a(k)}$ as the $k$th diagonal sub-matrix
$O$: $|S||K| \times |Z||S||K|$ block diagonal matrix of $|K| \times |K|$ blocks. Each diagonal block is an $|S| \times |Z||S|$ block diagonal sub-matrix of $|S| \times |S|$ sub-blocks. Each sub-block is therefore a $|Z|$-dimensional row vector. The $k$th block, $s$th sub-block contains the $s$th row of $O^{a(k)}$.
$\Pi$: $|Z||S||K| \times |S||K|$ block matrix of $|K| \times |K|$ blocks. Each block $\Pi_{k_1 k_2}$ is itself a $|Z||S| \times |S|$ block diagonal sub-matrix of $|S| \times |S|$ sub-blocks. Each sub-block is therefore a $|Z|$-dimensional vector. For all $s$, the $z$th component of the $s$th diagonal block of the $(k_1, k_2)$ sub-matrix, $[(\Pi_{k_1 k_2})_s]_z$, is equal to 1 if $k_2$ is the succeeding index of the machine state when the machine state is $k_1$ and the observation is $z$, and 0 otherwise. This matrix represents the transition function $l(i, z)$ of the finite-state controller, that is, the arcs in the policy graph.

We can write the system of equations representing the value of policy $\pi$ in the following matrix form:

$$V^\pi = R + \gamma T O \Pi^\pi V^\pi, \tag{1}$$

leading to:

$$V^\pi = (I - \gamma T O \Pi^\pi)^{-1} R. \tag{2}$$

The above equation can be used to calculate the value function of a given policy, if the models for $T$, $O$ and $R$ are known. This equation is at the core of most policy iteration algorithms for POMDPs (Hansen 1997; 1998), including one of the most recent highly successful approximation methods (Ji et al. 2007). Thus having confidence intervals over the calculated value function might be of great use in such algorithms.

Model Error

Given a POMDP (as defined in the previous section), a fixed policy and a set of labeled on-policy trajectories, one can use a frequentist approach to calculate the models for $T$, $O$ and $R$. The assumption of having training data with known labeled states is a strong assumption and in many POMDP domains may not be plausible. However, it is still more practical than the assumption of having exact true models of $T$, $O$ and $R$. In the case where EM-type algorithms are used to label the data (Koenig & Simmons 1996), the derivation of the estimates with the above assumption is not exactly correct, but might still provide a useful guide to compare competing policies.

Here we focus on the case in which the model for immediate reward is known, while $T$ and $O$ are estimated from data. The method can be further extended to the case where rewards are also estimated from data. If action $a$ is used $N^a_i$ times in state $s_i$, from which there were $N^a_{ij}$ transitions to $s_j$, we can write down the empirical transition probability from $s_i$ to $s_j$ given action $a$ as:

$$\hat{T}^a(i, j) = \frac{N^a_{ij}}{N^a_i}. \tag{3}$$
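As a minimal sketch of Eq. 3, the counts $N^a_{ij}$ can be accumulated directly from labeled transitions. The `(s, a, s_next)` triple format, the function name `empirical_transition_model`, and the toy data below are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def empirical_transition_model(transitions, n_states, n_actions):
    """Frequentist estimate T_hat[a, i, j] = N^a_ij / N^a_i (Eq. 3).

    transitions: iterable of (s, a, s_next) integer triples (hypothetical format).
    Returns (T_hat, N), where N[a, i] counts how often action a was used in state i.
    """
    counts = np.zeros((n_actions, n_states, n_states))
    for s, a, s_next in transitions:
        counts[a, s, s_next] += 1
    N = counts.sum(axis=2)
    # State-action pairs that never occur are left as NaN rather than guessed.
    with np.errstate(invalid="ignore", divide="ignore"):
        T_hat = counts / N[:, :, None]
    return T_hat, N

# Toy data: one action, two states (made-up transitions).
data = [(0, 0, 0), (0, 0, 1), (0, 0, 0),
        (1, 0, 1), (1, 0, 1), (1, 0, 0), (1, 0, 1)]
T_hat, N = empirical_transition_model(data, n_states=2, n_actions=1)
```

Returning the visit counts alongside the estimate is a design choice made here because the same counts appear as the denominators of the row covariances later in the derivation.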

A similar method can be used with the observation model. If there were $M^a_i$ transitions to $s_i$ after action $a$, and $z_j$ was observed in $M^a_{ij}$ of them, the empirical model of observation probabilities would be:

$$\hat{O}^a(i, j) = \frac{M^a_{ij}}{M^a_i}. \tag{4}$$

From these empirical models we can create the $T$ and $O$ models as defined in the previous section. As our training data has a finite number of samples, these empirical models are likely to be imperfect, containing error terms $\tilde{T}$ and $\tilde{O}$. We therefore have:

$$\hat{T} = T + \tilde{T}, \qquad \hat{O} = O + \tilde{O}. \tag{5}$$

As we used a simple frequentist approach to calculate the empirical models, we can assume independence of errors in the following manner: different rows in $\hat{T}$ and $\hat{O}$ are independent from each other, and each row is drawn from a multinomial distribution. Considering statistical properties of the multinomial distribution, we know that the expected errors are zero and independent:

$$E[\tilde{T}] = E[\tilde{O}] = E[\tilde{T}\tilde{O}] = 0. \tag{6}$$

We can write the covariance of the $i$th row of $\hat{T}^a$ (denoted $\hat{T}^a(i)$) as:

$$\operatorname{cov}(T^a(i)) = \frac{1}{N^a_i}\left(\operatorname{diag}(\hat{T}^a(i)) - (\hat{T}^a(i))^\top \hat{T}^a(i)\right), \tag{7}$$

where $\operatorname{diag}(\hat{T}^a(i))$ is a diagonal matrix with $\hat{T}^a(i)$ along the diagonal. Similarly for $\hat{O}^a(i)$ we have:

$$\operatorname{cov}(O^a(i)) = \frac{1}{M^a_i}\left(\operatorname{diag}(\hat{O}^a(i)) - (\hat{O}^a(i))^\top \hat{O}^a(i)\right). \tag{8}$$

Using the above derivations and the definition of the $T$ and $O$ matrices from the previous section, it is straightforward to calculate the four dimensional covariance matrices of $\tilde{T}$ and $\tilde{O}$ in terms of $\operatorname{cov}(T^a(i))$ and $\operatorname{cov}(O^a(i))$. With $\tilde{T}$ and $\tilde{O}$ being zero mean variables, the covariance matrices will be:

$$\operatorname{cov}(T(i, j), T(k, l)) = E[\tilde{T}(i, j)\,\tilde{T}(k, l)], \tag{9}$$

$$\operatorname{cov}(O(i, j), O(k, l)) = E[\tilde{O}(i, j)\,\tilde{O}(k, l)]. \tag{10}$$

These terms capture the variance in the empirical models. The interesting question that arises is how these errors in the empirical models impact our estimate of the value function.

Calculation of Bias and Variance

If we use the empirical models instead of the true models to calculate the value of a given policy $\pi$, we will have:

$$\hat{V}^\pi = (I - \gamma \hat{T} \hat{O} \Pi^\pi)^{-1} R. \tag{11}$$

To simplify the notation, we will drop the $\pi$ superscript in the later derivations.
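The two ingredients introduced above, the multinomial row covariance of Eqs. 7 and 8 and the plug-in value function of Eq. 11, can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: the array layouts, the function names, and all numbers in the toy controller are made up here, and the product $\hat{T}\hat{O}\Pi$ is assembled block by block instead of forming the large block matrices explicitly.

```python
import numpy as np

def multinomial_row_cov(p_hat, n):
    """Eqs. 7/8: (diag(p) - p^T p) / n for an empirical row p built from n samples."""
    p = np.asarray(p_hat, dtype=float)
    return (np.diag(p) - np.outer(p, p)) / n

def evaluate_controller(T, O, r, gamma, action, successor):
    """Solve V = (I - gamma * T O Pi)^{-1} R for a finite-state controller.

    T[a, s, s2]: transition probabilities; O[a, s2, z]: probability of observing z
    in the arrival state s2; r[s, a]: immediate reward; action[k]: action issued by
    machine state k; successor[k, z]: next machine state l(k, z).
    Returns v[k, s], the k-th linear piece evaluated at state s.
    """
    n_nodes, n_obs = successor.shape
    n_states = T.shape[1]
    M = np.zeros((n_nodes * n_states, n_nodes * n_states))  # the product T O Pi
    R = np.zeros(n_nodes * n_states)
    for k in range(n_nodes):
        a = int(action[k])
        rows = slice(k * n_states, (k + 1) * n_states)
        R[rows] = r[:, a]
        for z in range(n_obs):
            l = int(successor[k, z])
            cols = slice(l * n_states, (l + 1) * n_states)
            # Entry (s, s2) of this block is T(s, s2) * O(s2, z).
            M[rows, cols] += T[a] * O[a][:, z][None, :]
    V = np.linalg.solve(np.eye(n_nodes * n_states) - gamma * M, R)
    return V.reshape(n_nodes, n_states)

# Toy empirical models (made-up numbers): one action, two states, two observations.
T_hat = np.array([[[0.9, 0.1], [0.2, 0.8]]])
O_hat = np.array([[[0.85, 0.15], [0.15, 0.85]]])
r = np.array([[1.0], [0.0]])
action = np.array([0, 0])               # both machine states issue action 0
successor = np.array([[0, 1], [1, 0]])  # hypothetical policy-graph arcs
v_hat = evaluate_controller(T_hat, O_hat, r, 0.9, action, successor)
cov_row0 = multinomial_row_cov(T_hat[0, 0], n=50)
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly, which is generally the more stable choice for a system of this kind.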
The above expression can be rewritten as:

$$\hat{V} = (I - \gamma (T + \tilde{T})(O + \tilde{O}) \Pi)^{-1} R. \tag{12}$$

Now using Taylor expansion and matrix manipulation (Mannor et al. 2007), we can re-write the above expression as:

$$\hat{V} = \sum_{k=0}^{\infty} \gamma^k f_k(\tilde{T}, \tilde{O}) R, \tag{13}$$

where

$$X = (I - \gamma T O \Pi)^{-1}, \tag{14}$$

$$f_k(\tilde{T}, \tilde{O}) = \left(X(\tilde{T} O \Pi + T \tilde{O} \Pi + \tilde{T} \tilde{O} \Pi)\right)^k X. \tag{15}$$

We will use the above derivation to approximate the expectation of the calculated value function:

$$E[\hat{V}] = E\left[\sum_{k=0}^{\infty} \gamma^k f_k(\tilde{T}, \tilde{O}) R\right]. \tag{16}$$

Because the exact expression of the above equation cannot be further simplified, we consider a second order approximation instead. The expectation of the value function then becomes:

$$E[\hat{V}] = XR + \gamma E[f_1] R + \gamma^2 E[f_2] R. \tag{17}$$

As $\tilde{O}$ and $\tilde{T}$ are zero mean and independent, $E[f_1(\tilde{T}, \tilde{O})]$ will be 0. By substituting $X$, the above expression becomes:

$$E[\hat{V}] = V + \gamma^2 E[f_2(\tilde{T}, \tilde{O})] R, \tag{18}$$

which shows that the calculated value function is expected to have some non-zero bias term. Using a similar approximation, we can write down the second moment of the value function as:

$$\begin{aligned} E[\hat{V}\hat{V}^\top] = {} & V V^\top + \gamma^2 E[f_1 R R^\top f_1^\top] \\ & + \gamma^2 E[f_0 R R^\top f_2^\top] + \gamma^2 E[f_2 R R^\top f_0^\top]. \end{aligned} \tag{19}$$

The covariance matrix will therefore be:

$$E[\hat{V}\hat{V}^\top] - E[\hat{V}]E[\hat{V}]^\top = \gamma^2 E[f_1 R R^\top f_1^\top]. \tag{20}$$

Substituting $f_1$ with the definition we get:

$$\begin{aligned} \operatorname{cov}(\hat{V}) = {} & \gamma^2 X E[\tilde{T} O \Pi V V^\top \Pi^\top O^\top \tilde{T}^\top] X^\top \\ & + \gamma^2 X T E[\tilde{O} \Pi V V^\top \Pi^\top \tilde{O}^\top] T^\top X^\top. \end{aligned} \tag{21}$$

We will approximately calculate the above expression by substituting the true models with our empirical models (which is a standard classical approach). We also require the following lemma:

Lemma 1. Let $Q$ be an $n \times n$ dimensional matrix:

$$Q = A X A^\top, \tag{22}$$

where $A$ is an $n \times m$ matrix of zero mean random variables and $X$ is a constant matrix of $m \times m$ dimensions. The $ij$th entry of $E[Q]$ is equal to:

$$E\Big[\sum_{k,l} A_{ik} X_{kl} A^\top_{lj}\Big] = \sum_{k,l} X_{kl} E[A_{ik} A_{jl}] = \sum_{k,l} X_{kl} \operatorname{cov}(A_{ik}, A_{jl}), \tag{23}$$

which is only dependent on the four dimensional covariance of the matrix $A$.

By applying Lemma 1 to Eqn 21, we can calculate the covariance of the calculated value function using the covariance of $\tilde{T}$ and $\tilde{O}$ defined in the previous section. In summary, we propose a second order approximation to estimate the expected error in the value function, in terms of the expected error in the empirical models. Using similar calculations, we can also calculate the bias as defined by Eqn 18 (the derivation will appear in a longer version of this paper; in most cases this term is much smaller than the variance).

Experiment and Results

The purpose of this section is two-fold. First, we wish to evaluate the approximations used when deriving our estimate of the variance in the value function. Second, we wish to illustrate how the method can be used to compare different policies for a given task.

POMDP dialog manager

We begin by evaluating the method on synthetic data from a human-robot dialog task. The use of POMDP-based dialog managers is well-established (Doshi & Roy 2007; Williams & Young 2006). However, it is often not easy to get training data in human-robot interaction domains. With small training sets, error terms tend to be important. Estimates of the error variance will therefore be helpful to evaluate and compare policies.

Here we focus on a small simulated problem which requires evaluating dialog policies for the purpose of acquiring motion goals from the user. We presume a human operator is instructing an assistive robot to move to one of two locations (e.g. bedroom or bathroom). While the human intent (i.e. the state) is one of these goals, the observation received by the robot (presumably through a speech recognizer) might be incorrect. The robot has the option to ask again to ensure the goal was understood correctly. Note however that the human may change his/her intent (the state) with a small probability. Fig 1 shows a model of the described situation. In the generative model (used to provide the training data), we assume the probability of a wrong observation is 0.15 and the human might change goals with probability […]. If the robot acts as requested, it gets a reward of 10; otherwise it gets a penalty of 40.
There is a small penalty of 1 when asking for clarification. We assume $\gamma =$ […]. Fig 2 shows a policy graph for the described POMDP dialog manager. This policy graph corresponds to the policy where the robot keeps asking the human until it receives an observation twice more than the other one.

Figure 1: Example of a dialog POMDP. Dashed lines refer to taking action ask.

Figure 2: Policy graph for the POMDP dialog manager.

We ran the following experiment: given the fixed policy of Fig 2 and a fixed number $n$, we draw on-policy labeled trajectories that on the whole contain $n$ transitions. We use these to calculate the empirical models (Eqns 3 and 4), and use Eqn 2 to calculate the value function. Then we use Eqn 21 to calculate the covariance and standard deviation of the value function at the initial belief point ($b_0 = [0.5; 0.5]$). Let $V(b_0)$ be the expected value at the initial belief state $b_0$, and let $\alpha = [\alpha_1; \alpha_2]$ be the vector of coefficients describing the corresponding linear piece in the piecewise linear value function. We have $V(b_0) = E[\alpha \cdot b] = (\alpha_1 + \alpha_2)/2$ and thus the variance of $V(b_0)$ can be calculated as:

$$\operatorname{var}(V(b_0)) = \frac{\operatorname{var}(\alpha_1) + \operatorname{var}(\alpha_2) + 2\operatorname{cov}(\alpha_1, \alpha_2)}{4}. \tag{24}$$

Fixing the size of the training set, we run the above experiment 1000 times. Each time, we calculate the empirical value of the initial belief state ($\hat{V}(b_0)$), and estimate its variance using Eqn 24. We then calculate the percentage of cases in which the estimated value ($\hat{V}(b_0)$) lies within 1 and 2 estimated standard deviations of the true value ($V(b_0)$). Assuming that the error between the calculated and true value has a Gaussian distribution (this was confirmed by plotting the histogram of error terms), these values should be 68% and 95% respectively. Fig 3 confirms that the variance estimation we propose satisfies this criterion. The result holds for a variety of sample set sizes (from n=1000 to n=5000).
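The variance at a belief point (Eq. 24) and the 1 and 2 standard deviation coverage check described above can be sketched as follows. The function names and every number in the example are hypothetical; in the experiment itself the covariance of $\alpha$ would come from Eqn 21 and the estimates from repeated sampling runs.

```python
import numpy as np

def value_variance_at_belief(belief, cov_alpha):
    """Variance of b . alpha given the covariance matrix of alpha.

    For b = [0.5, 0.5] this reduces to Eq. 24.
    """
    b = np.asarray(belief, dtype=float)
    return float(b @ np.asarray(cov_alpha, dtype=float) @ b)

def coverage(v_hat, std_hat, v_true, k):
    """Fraction of runs whose estimate lies within k estimated STDs of v_true."""
    v_hat = np.asarray(v_hat, dtype=float)
    std_hat = np.asarray(std_hat, dtype=float)
    return float(np.mean(np.abs(v_hat - v_true) <= k * std_hat))

# Made-up covariance of (alpha_1, alpha_2) for illustration only.
cov_alpha = np.array([[4.0, 1.0], [1.0, 2.0]])
var_b0 = value_variance_at_belief([0.5, 0.5], cov_alpha)  # (4 + 2 + 2*1) / 4

# Made-up estimates from four hypothetical runs, each with estimated STD 1.0.
v_hat = np.array([9.0, 10.5, 11.5, 13.0])
std_hat = np.ones(4)
within_1 = coverage(v_hat, std_hat, v_true=10.0, k=1)
within_2 = coverage(v_hat, std_hat, v_true=10.0, k=2)
```

Writing the variance as the quadratic form $b^\top \Sigma b$ keeps the same function usable for belief points other than the uniform one.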
To investigate how these variance estimates can be useful to compare competing policies, we calculate the variance of the value function for two other policies on this dialog problem (we presume these dialog policies are provided by an expert, though they could be acquired from a policy iteration algorithm such as Ji et al. (2007)). One policy is to ask for the goal only once, and then act according to that single observation. The other policy is to keep asking until the robot observes one of the goals three times more than the other one, and then act accordingly. Fig 4 shows the 1 standard deviation interval for the calculated value of the initial belief state as a function of the number of samples, for each of our three policies (including the one shown in Fig 2). Given larger sample sizes, the policy in Fig 2 becomes a clear favorite, whereas the other two are not significantly different from each other. This illustrates how our estimates can be used practically to assess the difference between policies using more information than simply their expected value (as is usually standard in the POMDP literature).

Figure 3: Percentage of the cases in which $\hat{V}(b_0)$ lies within 1 (+) and 2 (×) approximately calculated standard deviations from $V(b_0)$, for the dialog problem. (x-axis: number of samples; y-axis: percentage below 1 and 2 STDs.)

Figure 4: 1 standard deviation interval for the calculated value of the initial belief state for different policies (ask once, ask twice, ask three times) on the dialog problem. (x-axis: number of samples; y-axis: 1 STD interval of the value of the initial belief state.)

Figure 5: The policy graph for the STAR*D problem. (Nodes: MedA, MedB, MedC; arcs labeled l, m, h.)

Medical Domain

We now evaluate the accuracy of our approximation in a medical decision-making task involving real data. The data was collected as part of a large (4000+ patients) multi-step randomized clinical trial, designed to investigate the comparative effectiveness of different treatments provided sequentially for patients suffering from depression (Fava et al. 2003). The POMDP framework offers a powerful model for optimizing treatment strategies from such data. However, given the sensitive nature of the application, as well as the cost involved in collecting such data, estimates of the potential error are highly useful. The dataset provided includes a large number of measured outcomes, which will be the focus of future investigations.
For the current experiment, we focus on a numerical score called the Quick Inventory of Depressive Symptomatology (QIDS), which roughly indicates the level of depression. This score was collected throughout the study in two different ways: a self-report version (QIDS-SR) was collected using an automated phone system; a clinical version (QIDS-C) was also collected by a qualified clinician. For the purposes of our experiment, we presume the QIDS-C score completely describes the patient's state, and the QIDS-SR score is a noisy observation of the state. To make the problem tractable with small training data, we discretize the score (which usually ranges from 0 to 27) uniformly according to quantiles into 2 states and 3 observations. The dataset includes information about 4 steps of treatments. We focus on policies which only differ in terms of treatment options in the second step of the sequence (other treatment steps are held constant). There are seven treatment options at that step. A reward of 1 is given if the patient achieves remission (at any step); a reward of 0 is given otherwise. Although this is a relatively small POMDP domain, it is nonetheless an interesting validation for our estimate, since it uses real data, and highlights the type of problem where these estimates are particularly crucial.

We focus on estimating the variance in the value estimate for the policy shown in Fig 5. This policy includes only three treatments: medication A is given to patients with low QIDS-SR scores, medication B is given to patients with medium QIDS-SR scores, and medication C is given to patients with high QIDS-SR scores. Since we do not know the exact value of this policy (over an infinitely large data set), we use a bootstrapping estimate. This means we take all the samples in our dataset which are consistent with this policy, and presume that they define the true model and true value function. Now to investigate the accuracy of our variance estimate, we subsample this data set, estimate the corresponding parameters, and calculate the value function using Eqn 2.
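One possible reading of the quantile-based discretization mentioned earlier in this subsection is equal-frequency binning, sketched below. The function name `discretize_by_quantiles` and the score arrays are made up for illustration and are not STAR*D data; the paper does not specify the exact cut points it used.

```python
import numpy as np

def discretize_by_quantiles(scores, n_bins):
    """Map raw scores to bin labels 0..n_bins-1 using equal-frequency cut points."""
    scores = np.asarray(scores, dtype=float)
    # Interior cut points at the 1/n_bins, 2/n_bins, ... quantiles of the data.
    edges = np.quantile(scores, np.arange(1, n_bins) / n_bins)
    labels = np.digitize(scores, edges, right=True)
    return labels, edges

# Made-up scores in the 0..27 QIDS range (not real patient data).
qids_c = np.array([0, 3, 5, 8, 12, 15, 20, 27])    # stands in for clinician scores
qids_sr = np.array([1, 4, 6, 9, 11, 16, 19, 26])   # stands in for self-report scores

states, state_edges = discretize_by_quantiles(qids_c, 2)       # 2 states
observations, obs_edges = discretize_by_quantiles(qids_sr, 3)  # 3 observations
```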
To summarize the value function into a single value (denoted by $V(B)$), we simply take the average over the 3 linear pieces in the value function. The variance of $\hat{V}(B)$ will therefore be the average of the elements of the covariance matrix we calculated for the value function. To check the quality of the estimates, we calculate the percentage of cases in which the calculated value lies within 1 and 2 standard deviations from the true value. If the error term in the value function has a normal distribution these percentages should again be 68 and 95. Fig 6 shows the mentioned percentages as a function of the number of samples. Here again, the variance estimates are close to what is observed empirically.

Figure 6: Percentage of cases in which $\hat{V}(B)$ lies within 1 (+) and 2 (×) approximately calculated standard deviations from $V(B)$, for the STAR*D problem. (x-axis: number of samples; y-axis: percentage below 1 and 2 STDs.)

Finally, we conducted an experiment to compare policies with different choices of medications in the policy graph of Fig 5. During the STAR*D experiment, patients mostly preferred not to use a certain treatment (CT: Cognitive Therapy). To study the effect of this preference, we compared two policies, only one of which uses CT. As shown in Fig 7, the CT-based policy has a slightly better expected value and much higher variation. Using the result of this analysis, one might prefer the non CT-based policy for two reasons: even with high empirical values, we have little evidence to support the CT-based policy; moreover, CT is not preferred by most patients. Such a method can be applied in similar cases for comparison between an empirically optimal policy and medically preferred ones.

Figure 7: 2 standard deviation interval for the calculated value of the summarized belief state for different policies (not using CT, using CT) on the STAR*D problem. (x-axis: number of samples; y-axis: 2 STD interval of the summarized value function.)

Discussion

Most of the literature on sequential decision-making focuses strictly on the problem of making the best possible decision. This paper argues that it is sometimes important to take into account the error in our value function, when comparing alternative policies. In particular, we show that when we use imperfect empirical models generated from sample data (instead of the true model), some bias and variance terms are introduced in the value function of a POMDP. We also present a method to approximately calculate these errors in terms of the statistics of the empirical models.

Such information can be highly valuable when comparing different action selection strategies. During policy search, for instance, one could make use of these error terms to search for policies that have high expected value and low expected variance. Furthermore, in some domains (including human-robot interaction and medical treatment design), where there is an extensive tradition of using hand-crafted policies to select actions, the method we present would be useful to compare hand-crafted policies with the best policy selected by an automated planning method.

The method we presented can be further extended to work in cases where the reward model is also unknown and is approximated by sampling. However, the derived equations are more cumbersome as we need to take into account the potential correlations between reward and transition models.

Acknowledgment

Funding for this work was provided by the National Institutes of Health (grant R21 DA019800) and the NSERC Discovery Grant program.

References

Cassandra, A. R.; Kaelbling, L. P.; and Littman, M. L. 1994. Acting optimally in partially observable stochastic domains. In Proceedings of AAAI.
Doshi, F., and Roy, N. 2007. Efficient model learning for dialog management. In Proceedings of HRI.
Fava, M.; Rush, A.; Trivedi, M.; Nierenberg, A.; Thase, M.; Sackeim, H.; Quitkin, F.; Wisniewski, S.; Lavori, P.; Rosenbaum, J.; and Kupfer, D. 2003. Background and rationale for the sequenced treatment alternatives to relieve depression (STAR*D) study. Psychiatr Clin North Am 26(2).
Greensmith, E.; Bartlett, P. L.; and Baxter, J. 2004. Variance reduction techniques for gradient estimates in reinforcement learning. J. Mach. Learn. Res. 5.
Hansen, E. A. 1997. An improved policy iteration algorithm for partially observable MDPs. In Proceedings of NIPS.
Hansen, E. A. 1998. Solving POMDPs by searching in policy space. In Proceedings of UAI.
Ji, S.; Parr, R.; Li, H.; Liao, X.; and Carin, L. 2007. Point-based policy iteration. In Proceedings of AAAI.
Koenig, S., and Simmons, R. 1996. Unsupervised learning of probabilistic models for robot navigation. In Proceedings of ICRA.
Mannor, S.; Simester, D.; Sun, P.; and Tsitsiklis, J. N. 2004. Bias and variance in value function estimation. In Proceedings of ICML.
Mannor, S.; Simester, D.; Sun, P.; and Tsitsiklis, J. N. 2007. Bias and variance approximation in value function estimates. Manage. Sci. 53(2).
Poupart, P., and Boutilier, C. 2003. Bounded finite state controllers. In Proceedings of NIPS, volume 16.
Sondik, E. J. 1971. The optimal control of partially observable Markov processes. Ph.D. Dissertation, Stanford.
Williams, J. D., and Young, S. 2006. Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language 21(2).


More information

CS667 Lecture 6: Monte Carlo Integration 02/10/05

CS667 Lecture 6: Monte Carlo Integration 02/10/05 CS667 Lecture 6: Monte Crlo Integrtion 02/10/05 Venkt Krishnrj Lecturer: Steve Mrschner 1 Ide The min ide of Monte Crlo Integrtion is tht we cn estimte the vlue of n integrl by looking t lrge number of

More information

Predict Global Earth Temperature using Linier Regression

Predict Global Earth Temperature using Linier Regression Predict Globl Erth Temperture using Linier Regression Edwin Swndi Sijbt (23516012) Progrm Studi Mgister Informtik Sekolh Teknik Elektro dn Informtik ITB Jl. Gnesh 10 Bndung 40132, Indonesi 23516012@std.stei.itb.c.id

More information

The steps of the hypothesis test

The steps of the hypothesis test ttisticl Methods I (EXT 7005) Pge 78 Mosquito species Time of dy A B C Mid morning 0.0088 5.4900 5.5000 Mid Afternoon.3400 0.0300 0.8700 Dusk 0.600 5.400 3.000 The Chi squre test sttistic is the sum of

More information

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by.

NUMERICAL INTEGRATION. The inverse process to differentiation in calculus is integration. Mathematically, integration is represented by. NUMERICAL INTEGRATION 1 Introduction The inverse process to differentition in clculus is integrtion. Mthemticlly, integrtion is represented by f(x) dx which stnds for the integrl of the function f(x) with

More information

Acceptance Sampling by Attributes

Acceptance Sampling by Attributes Introduction Acceptnce Smpling by Attributes Acceptnce smpling is concerned with inspection nd decision mking regrding products. Three spects of smpling re importnt: o Involves rndom smpling of n entire

More information

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives Block #6: Properties of Integrls, Indefinite Integrls Gols: Definition of the Definite Integrl Integrl Clcultions using Antiderivtives Properties of Integrls The Indefinite Integrl 1 Riemnn Sums - 1 Riemnn

More information

Decision Networks. CS 188: Artificial Intelligence Fall Example: Decision Networks. Decision Networks. Decisions as Outcome Trees

Decision Networks. CS 188: Artificial Intelligence Fall Example: Decision Networks. Decision Networks. Decisions as Outcome Trees CS 188: Artificil Intelligence Fll 2011 Decision Networks ME: choose the ction which mximizes the expected utility given the evidence mbrell Lecture 17: Decision Digrms 10/27/2011 Cn directly opertionlize

More information

Lecture 14: Quadrature

Lecture 14: Quadrature Lecture 14: Qudrture This lecture is concerned with the evlution of integrls fx)dx 1) over finite intervl [, b] The integrnd fx) is ssumed to be rel-vlues nd smooth The pproximtion of n integrl by numericl

More information

Improper Integrals. Type I Improper Integrals How do we evaluate an integral such as

Improper Integrals. Type I Improper Integrals How do we evaluate an integral such as Improper Integrls Two different types of integrls cn qulify s improper. The first type of improper integrl (which we will refer to s Type I) involves evluting n integrl over n infinite region. In the grph

More information

19 Optimal behavior: Game theory

19 Optimal behavior: Game theory Intro. to Artificil Intelligence: Dle Schuurmns, Relu Ptrscu 1 19 Optiml behvior: Gme theory Adversril stte dynmics hve to ccount for worst cse Compute policy π : S A tht mximizes minimum rewrd Let S (,

More information

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS.

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS. THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS RADON ROSBOROUGH https://intuitiveexplntionscom/picrd-lindelof-theorem/ This document is proof of the existence-uniqueness theorem

More information

APPROXIMATE INTEGRATION

APPROXIMATE INTEGRATION APPROXIMATE INTEGRATION. Introduction We hve seen tht there re functions whose nti-derivtives cnnot be expressed in closed form. For these resons ny definite integrl involving these integrnds cnnot be

More information

Jim Lambers MAT 169 Fall Semester Lecture 4 Notes

Jim Lambers MAT 169 Fall Semester Lecture 4 Notes Jim Lmbers MAT 169 Fll Semester 2009-10 Lecture 4 Notes These notes correspond to Section 8.2 in the text. Series Wht is Series? An infinte series, usully referred to simply s series, is n sum of ll of

More information

1 Linear Least Squares

1 Linear Least Squares Lest Squres Pge 1 1 Liner Lest Squres I will try to be consistent in nottion, with n being the number of dt points, nd m < n being the number of prmeters in model function. We re interested in solving

More information

CS 188: Artificial Intelligence Spring 2007

CS 188: Artificial Intelligence Spring 2007 CS 188: Artificil Intelligence Spring 2007 Lecture 3: Queue-Bsed Serch 1/23/2007 Srini Nrynn UC Berkeley Mny slides over the course dpted from Dn Klein, Sturt Russell or Andrew Moore Announcements Assignment

More information

Goals: Determine how to calculate the area described by a function. Define the definite integral. Explore the relationship between the definite

Goals: Determine how to calculate the area described by a function. Define the definite integral. Explore the relationship between the definite Unit #8 : The Integrl Gols: Determine how to clculte the re described by function. Define the definite integrl. Eplore the reltionship between the definite integrl nd re. Eplore wys to estimte the definite

More information

Math 1B, lecture 4: Error bounds for numerical methods

Math 1B, lecture 4: Error bounds for numerical methods Mth B, lecture 4: Error bounds for numericl methods Nthn Pflueger 4 September 0 Introduction The five numericl methods descried in the previous lecture ll operte by the sme principle: they pproximte the

More information

Non-Linear & Logistic Regression

Non-Linear & Logistic Regression Non-Liner & Logistic Regression If the sttistics re boring, then you've got the wrong numbers. Edwrd R. Tufte (Sttistics Professor, Yle University) Regression Anlyses When do we use these? PART 1: find

More information

Numerical integration

Numerical integration 2 Numericl integrtion This is pge i Printer: Opque this 2. Introduction Numericl integrtion is problem tht is prt of mny problems in the economics nd econometrics literture. The orgniztion of this chpter

More information

Bias and Variance Approximation in Value Function Estimates

Bias and Variance Approximation in Value Function Estimates Bis nd Vrince Approximtion in Vlue Function Estimtes Shie Mnnor Duncn Simester Peng Sun John N. Tsitsiklis July 11, 2004 Revised: July 5, 2005 Abstrct We consider Mrkov Decision Process nd study the bis

More information

CHM Physical Chemistry I Chapter 1 - Supplementary Material

CHM Physical Chemistry I Chapter 1 - Supplementary Material CHM 3410 - Physicl Chemistry I Chpter 1 - Supplementry Mteril For review of some bsic concepts in mth, see Atkins "Mthemticl Bckground 1 (pp 59-6), nd "Mthemticl Bckground " (pp 109-111). 1. Derivtion

More information

Math 8 Winter 2015 Applications of Integration

Math 8 Winter 2015 Applications of Integration Mth 8 Winter 205 Applictions of Integrtion Here re few importnt pplictions of integrtion. The pplictions you my see on n exm in this course include only the Net Chnge Theorem (which is relly just the Fundmentl

More information

Solution for Assignment 1 : Intro to Probability and Statistics, PAC learning

Solution for Assignment 1 : Intro to Probability and Statistics, PAC learning Solution for Assignment 1 : Intro to Probbility nd Sttistics, PAC lerning 10-701/15-781: Mchine Lerning (Fll 004) Due: Sept. 30th 004, Thursdy, Strt of clss Question 1. Bsic Probbility ( 18 pts) 1.1 (

More information

Chapter 14. Matrix Representations of Linear Transformations

Chapter 14. Matrix Representations of Linear Transformations Chpter 4 Mtrix Representtions of Liner Trnsformtions When considering the Het Stte Evolution, we found tht we could describe this process using multipliction by mtrix. This ws nice becuse computers cn

More information

Physics 116C Solution of inhomogeneous ordinary differential equations using Green s functions

Physics 116C Solution of inhomogeneous ordinary differential equations using Green s functions Physics 6C Solution of inhomogeneous ordinry differentil equtions using Green s functions Peter Young November 5, 29 Homogeneous Equtions We hve studied, especilly in long HW problem, second order liner

More information

Recitation 3: Applications of the Derivative. 1 Higher-Order Derivatives and their Applications

Recitation 3: Applications of the Derivative. 1 Higher-Order Derivatives and their Applications Mth 1c TA: Pdric Brtlett Recittion 3: Applictions of the Derivtive Week 3 Cltech 013 1 Higher-Order Derivtives nd their Applictions Another thing we could wnt to do with the derivtive, motivted by wht

More information

Data Assimilation. Alan O Neill Data Assimilation Research Centre University of Reading

Data Assimilation. Alan O Neill Data Assimilation Research Centre University of Reading Dt Assimiltion Aln O Neill Dt Assimiltion Reserch Centre University of Reding Contents Motivtion Univrite sclr dt ssimiltion Multivrite vector dt ssimiltion Optiml Interpoltion BLUE 3d-Vritionl Method

More information

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams Chpter 4 Contrvrince, Covrince, nd Spcetime Digrms 4. The Components of Vector in Skewed Coordintes We hve seen in Chpter 3; figure 3.9, tht in order to show inertil motion tht is consistent with the Lorentz

More information

The Regulated and Riemann Integrals

The Regulated and Riemann Integrals Chpter 1 The Regulted nd Riemnn Integrls 1.1 Introduction We will consider severl different pproches to defining the definite integrl f(x) dx of function f(x). These definitions will ll ssign the sme vlue

More information

8 Laplace s Method and Local Limit Theorems

8 Laplace s Method and Local Limit Theorems 8 Lplce s Method nd Locl Limit Theorems 8. Fourier Anlysis in Higher DImensions Most of the theorems of Fourier nlysis tht we hve proved hve nturl generliztions to higher dimensions, nd these cn be proved

More information

Math& 152 Section Integration by Parts

Math& 152 Section Integration by Parts Mth& 5 Section 7. - Integrtion by Prts Integrtion by prts is rule tht trnsforms the integrl of the product of two functions into other (idelly simpler) integrls. Recll from Clculus I tht given two differentible

More information

For the percentage of full time students at RCC the symbols would be:

For the percentage of full time students at RCC the symbols would be: Mth 17/171 Chpter 7- ypothesis Testing with One Smple This chpter is s simple s the previous one, except it is more interesting In this chpter we will test clims concerning the sme prmeters tht we worked

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

Theoretical foundations of Gaussian quadrature

Theoretical foundations of Gaussian quadrature Theoreticl foundtions of Gussin qudrture 1 Inner product vector spce Definition 1. A vector spce (or liner spce) is set V = {u, v, w,...} in which the following two opertions re defined: (A) Addition of

More information

The Wave Equation I. MA 436 Kurt Bryan

The Wave Equation I. MA 436 Kurt Bryan 1 Introduction The Wve Eqution I MA 436 Kurt Bryn Consider string stretching long the x xis, of indeterminte (or even infinite!) length. We wnt to derive n eqution which models the motion of the string

More information

Week 10: Line Integrals

Week 10: Line Integrals Week 10: Line Integrls Introduction In this finl week we return to prmetrised curves nd consider integrtion long such curves. We lredy sw this in Week 2 when we integrted long curve to find its length.

More information

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1

Exam 2, Mathematics 4701, Section ETY6 6:05 pm 7:40 pm, March 31, 2016, IH-1105 Instructor: Attila Máté 1 Exm, Mthemtics 471, Section ETY6 6:5 pm 7:4 pm, Mrch 1, 16, IH-115 Instructor: Attil Máté 1 17 copies 1. ) Stte the usul sufficient condition for the fixed-point itertion to converge when solving the eqution

More information

13: Diffusion in 2 Energy Groups

13: Diffusion in 2 Energy Groups 3: Diffusion in Energy Groups B. Rouben McMster University Course EP 4D3/6D3 Nucler Rector Anlysis (Rector Physics) 5 Sept.-Dec. 5 September Contents We study the diffusion eqution in two energy groups

More information

Testing categorized bivariate normality with two-stage. polychoric correlation estimates

Testing categorized bivariate normality with two-stage. polychoric correlation estimates Testing ctegorized bivrite normlity with two-stge polychoric correltion estimtes Albert Mydeu-Olivres Dept. of Psychology University of Brcelon Address correspondence to: Albert Mydeu-Olivres. Fculty of

More information

MIXED MODELS (Sections ) I) In the unrestricted model, interactions are treated as in the random effects model:

MIXED MODELS (Sections ) I) In the unrestricted model, interactions are treated as in the random effects model: 1 2 MIXED MODELS (Sections 17.7 17.8) Exmple: Suppose tht in the fiber breking strength exmple, the four mchines used were the only ones of interest, but the interest ws over wide rnge of opertors, nd

More information

Math 270A: Numerical Linear Algebra

Math 270A: Numerical Linear Algebra Mth 70A: Numericl Liner Algebr Instructor: Michel Holst Fll Qurter 014 Homework Assignment #3 Due Give to TA t lest few dys before finl if you wnt feedbck. Exercise 3.1. (The Bsic Liner Method for Liner

More information

Best Approximation. Chapter The General Case

Best Approximation. Chapter The General Case Chpter 4 Best Approximtion 4.1 The Generl Cse In the previous chpter, we hve seen how n interpolting polynomil cn be used s n pproximtion to given function. We now wnt to find the best pproximtion to given

More information

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7 CS 188 Introduction to Artificil Intelligence Fll 2018 Note 7 These lecture notes re hevily bsed on notes originlly written by Nikhil Shrm. Decision Networks In the third note, we lerned bout gme trees

More information

Ordinary differential equations

Ordinary differential equations Ordinry differentil equtions Introduction to Synthetic Biology E Nvrro A Montgud P Fernndez de Cordob JF Urchueguí Overview Introduction-Modelling Bsic concepts to understnd n ODE. Description nd properties

More information

Numerical Integration

Numerical Integration Chpter 5 Numericl Integrtion Numericl integrtion is the study of how the numericl vlue of n integrl cn be found. Methods of function pproximtion discussed in Chpter??, i.e., function pproximtion vi the

More information

Numerical Analysis: Trapezoidal and Simpson s Rule

Numerical Analysis: Trapezoidal and Simpson s Rule nd Simpson s Mthemticl question we re interested in numericlly nswering How to we evlute I = f (x) dx? Clculus tells us tht if F(x) is the ntiderivtive of function f (x) on the intervl [, b], then I =

More information

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation Strong Bisimultion Overview Actions Lbeled trnsition system Trnsition semntics Simultion Bisimultion References Robin Milner, Communiction nd Concurrency Robin Milner, Communicting nd Mobil Systems 32

More information

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007

A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H. Thomas Shores Department of Mathematics University of Nebraska Spring 2007 A REVIEW OF CALCULUS CONCEPTS FOR JDEP 384H Thoms Shores Deprtment of Mthemtics University of Nebrsk Spring 2007 Contents Rtes of Chnge nd Derivtives 1 Dierentils 4 Are nd Integrls 5 Multivrite Clculus

More information

Driving Cycle Construction of City Road for Hybrid Bus Based on Markov Process Deng Pan1, a, Fengchun Sun1,b*, Hongwen He1, c, Jiankun Peng1, d

Driving Cycle Construction of City Road for Hybrid Bus Based on Markov Process Deng Pan1, a, Fengchun Sun1,b*, Hongwen He1, c, Jiankun Peng1, d Interntionl Industril Informtics nd Computer Engineering Conference (IIICEC 15) Driving Cycle Construction of City Rod for Hybrid Bus Bsed on Mrkov Process Deng Pn1,, Fengchun Sun1,b*, Hongwen He1, c,

More information

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below. Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we

More information

Bayesian Networks: Approximate Inference

Bayesian Networks: Approximate Inference pproches to inference yesin Networks: pproximte Inference xct inference Vrillimintion Join tree lgorithm pproximte inference Simplify the structure of the network to mkxct inferencfficient (vritionl methods,

More information

1 Probability Density Functions

1 Probability Density Functions Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our

More information

New data structures to reduce data size and search time

New data structures to reduce data size and search time New dt structures to reduce dt size nd serch time Tsuneo Kuwbr Deprtment of Informtion Sciences, Fculty of Science, Kngw University, Hirtsuk-shi, Jpn FIT2018 1D-1, No2, pp1-4 Copyright (c)2018 by The Institute

More information

221B Lecture Notes WKB Method

221B Lecture Notes WKB Method Clssicl Limit B Lecture Notes WKB Method Hmilton Jcobi Eqution We strt from the Schrödinger eqution for single prticle in potentil i h t ψ x, t = [ ] h m + V x ψ x, t. We cn rewrite this eqution by using

More information

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 17

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 17 EECS 70 Discrete Mthemtics nd Proility Theory Spring 2013 Annt Shi Lecture 17 I.I.D. Rndom Vriles Estimting the is of coin Question: We wnt to estimte the proportion p of Democrts in the US popultion,

More information

Lecture 3 Gaussian Probability Distribution

Lecture 3 Gaussian Probability Distribution Introduction Lecture 3 Gussin Probbility Distribution Gussin probbility distribution is perhps the most used distribution in ll of science. lso clled bell shped curve or norml distribution Unlike the binomil

More information

3.4 Numerical integration

3.4 Numerical integration 3.4. Numericl integrtion 63 3.4 Numericl integrtion In mny economic pplictions it is necessry to compute the definite integrl of relvlued function f with respect to "weight" function w over n intervl [,

More information

1 The Lagrange interpolation formula

1 The Lagrange interpolation formula Notes on Qudrture 1 The Lgrnge interpoltion formul We briefly recll the Lgrnge interpoltion formul. The strting point is collection of N + 1 rel points (x 0, y 0 ), (x 1, y 1 ),..., (x N, y N ), with x

More information

ECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance

ECO 317 Economics of Uncertainty Fall Term 2007 Notes for lectures 4. Stochastic Dominance Generl structure ECO 37 Economics of Uncertinty Fll Term 007 Notes for lectures 4. Stochstic Dominnce Here we suppose tht the consequences re welth mounts denoted by W, which cn tke on ny vlue between

More information

MAA 4212 Improper Integrals

MAA 4212 Improper Integrals Notes by Dvid Groisser, Copyright c 1995; revised 2002, 2009, 2014 MAA 4212 Improper Integrls The Riemnn integrl, while perfectly well-defined, is too restrictive for mny purposes; there re functions which

More information

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties; Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties; Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

Chapter 0. What is the Lebesgue integral about?

Chapter 0. What is the Lebesgue integral about? Chpter 0. Wht is the Lebesgue integrl bout? The pln is to hve tutoril sheet ech week, most often on Fridy, (to be done during the clss) where you will try to get used to the ides introduced in the previous

More information

Lecture 1. Functional series. Pointwise and uniform convergence.

Lecture 1. Functional series. Pointwise and uniform convergence. 1 Introduction. Lecture 1. Functionl series. Pointwise nd uniform convergence. In this course we study mongst other things Fourier series. The Fourier series for periodic function f(x) with period 2π is

More information

1.9 C 2 inner variations

1.9 C 2 inner variations 46 CHAPTER 1. INDIRECT METHODS 1.9 C 2 inner vritions So fr, we hve restricted ttention to liner vritions. These re vritions of the form vx; ǫ = ux + ǫφx where φ is in some liner perturbtion clss P, for

More information

State space systems analysis (continued) Stability. A. Definitions A system is said to be Asymptotically Stable (AS) when it satisfies

State space systems analysis (continued) Stability. A. Definitions A system is said to be Asymptotically Stable (AS) when it satisfies Stte spce systems nlysis (continued) Stbility A. Definitions A system is sid to be Asymptoticlly Stble (AS) when it stisfies ut () = 0, t > 0 lim xt () 0. t A system is AS if nd only if the impulse response

More information

Module 6: LINEAR TRANSFORMATIONS

Module 6: LINEAR TRANSFORMATIONS Module 6: LINEAR TRANSFORMATIONS. Trnsformtions nd mtrices Trnsformtions re generliztions of functions. A vector x in some set S n is mpped into m nother vector y T( x). A trnsformtion is liner if, for

More information

Math 113 Exam 2 Practice

Math 113 Exam 2 Practice Mth 3 Exm Prctice Februry 8, 03 Exm will cover 7.4, 7.5, 7.7, 7.8, 8.-3 nd 8.5. Plese note tht integrtion skills lerned in erlier sections will still be needed for the mteril in 7.5, 7.8 nd chpter 8. This

More information

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 17

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 17 CS 70 Discrete Mthemtics nd Proility Theory Summer 2014 Jmes Cook Note 17 I.I.D. Rndom Vriles Estimting the is of coin Question: We wnt to estimte the proportion p of Democrts in the US popultion, y tking

More information

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature CMDA 4604: Intermedite Topics in Mthemticl Modeling Lecture 19: Interpoltion nd Qudrture In this lecture we mke brief diversion into the res of interpoltion nd qudrture. Given function f C[, b], we sy

More information

Lecture 20: Numerical Integration III

Lecture 20: Numerical Integration III cs4: introduction to numericl nlysis /8/0 Lecture 0: Numericl Integrtion III Instructor: Professor Amos Ron Scribes: Mrk Cowlishw, Yunpeng Li, Nthnel Fillmore For the lst few lectures we hve discussed

More information

ODE: Existence and Uniqueness of a Solution

ODE: Existence and Uniqueness of a Solution Mth 22 Fll 213 Jerry Kzdn ODE: Existence nd Uniqueness of Solution The Fundmentl Theorem of Clculus tells us how to solve the ordinry dierentil eqution (ODE) du f(t) dt with initil condition u() : Just

More information

1.2. Linear Variable Coefficient Equations. y + b "! = a y + b " Remark: The case b = 0 and a non-constant can be solved with the same idea as above.

1.2. Linear Variable Coefficient Equations. y + b ! = a y + b  Remark: The case b = 0 and a non-constant can be solved with the same idea as above. 1 12 Liner Vrible Coefficient Equtions Section Objective(s): Review: Constnt Coefficient Equtions Solving Vrible Coefficient Equtions The Integrting Fctor Method The Bernoulli Eqution 121 Review: Constnt

More information

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction

UNIT 1 FUNCTIONS AND THEIR INVERSES Lesson 1.4: Logarithmic Functions as Inverses Instruction Lesson : Logrithmic Functions s Inverses Prerequisite Skills This lesson requires the use of the following skills: determining the dependent nd independent vribles in n exponentil function bsed on dt from

More information

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations.

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations. Lecture 3 3 Solving liner equtions In this lecture we will discuss lgorithms for solving systems of liner equtions Multiplictive identity Let us restrict ourselves to considering squre mtrices since one

More information

Quadratic Forms. Quadratic Forms

Quadratic Forms. Quadratic Forms Qudrtic Forms Recll the Simon & Blume excerpt from n erlier lecture which sid tht the min tsk of clculus is to pproximte nonliner functions with liner functions. It s ctully more ccurte to sy tht we pproximte

More information

Chapter 6 Notes, Larson/Hostetler 3e

Chapter 6 Notes, Larson/Hostetler 3e Contents 6. Antiderivtives nd the Rules of Integrtion.......................... 6. Are nd the Definite Integrl.................................. 6.. Are............................................ 6. Reimnn

More information