Tutorial: A Unified Framework for Optimization under Uncertainty


1 Tutorial: A Unified Framework for Optimization under Uncertainty. INFORMS Annual Meeting, Nashville, November 13, 2016. Warren B. Powell, Princeton University, Department of Operations Research and Financial Engineering. © 2016 Warren B. Powell, Princeton University

2 Learning problems
Health sciences
» Sequential design of experiments for drug discovery
» Drug delivery: optimizing the design of protective membranes to control drug release
» Medical decision making: optimal learning for medical treatments

3 Meeting variability with portfolios of generation with mixtures of dispatchability


5 Real-time logistics
Uber
» Provides real-time, on-demand transportation.
» Drivers are encouraged to enter or leave the system using pricing signals and informational guidance.
Decisions:
» How to price to get the right balance of drivers relative to customers.
» Assigning and routing drivers to manage Uber-created congestion.
» Real-time management of drivers.
» Pricing (trips, new services, ...)
» Policies (rules for managing drivers, customers, ...)

6 Planning for a risky world
Disaster response
» Robust design of emergency response networks.
» Design of sensor networks and communication systems to manage responses to hurricanes, tsunamis, nuclear disasters and terrorist attacks.
Disease
» Management of medical personnel, equipment and vaccines to respond to a disease outbreak.
» Robust design of supply chains to mitigate the disruption of transportation systems.

7 Designing robust power grids
The power grid
» Loss of power creates cascading failures (lack of fuel, inability to pump water)
» How to plan?
» How to react?
Hurricane Sandy
» Once in 100 years?
» Rare convergence of events
» But, meteorologists did an amazing job of forecasting the storm.

8 Modeling
Before we can solve complex problems, we have to know how to think about them.
» Mathematician: $\min \mathbb{E}\{cx\}$ subject to $Ax = b$, $x \ge 0$
» Software: organize class libraries, and set up communications and databases
The biggest challenge when making decisions under uncertainty is modeling.

9 Modeling
For deterministic problems, we speak the language of mathematical programming
» Linear programming: $\min_x cx$ subject to $Ax = b$, $x \ge 0$
» For time-staged problems: $\min_{x_0,\ldots,x_T} \sum_{t=0}^T c_t x_t$ subject to $A_t x_t - B_{t-1} x_{t-1} = b_t$, $D_t x_t \le u_t$, $x_t \ge 0$
Arguably Dantzig's biggest contribution, more so than the simplex algorithm, was his articulation of optimization problems in a standard format, which has given algorithmic researchers a common language.

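10 Modeling
For deterministic problems, we speak the language of mathematical programming
» Linear programming: $\min_x cx$ subject to $Ax = b$, $x \ge 0$
» For time-staged problems: $\min_{x_0,\ldots,x_T} \sum_{t=0}^T c_t x_t$ subject to $A_t x_t - B_{t-1} x_{t-1} = b_t$, $D_t x_t \le u_t$, $x_t \ge 0$
» Optimal control: $\min_{u_0,\ldots,u_T} \sum_{t=0}^T L_t(x_t, u_t) + J_T(x_T)$ subject to $x_{t+1} = f(x_t, u_t)$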

11 [Figure: word cloud of field names: stochastic programming, approximate dynamic programming, robust optimization, decision analysis, simulation optimization, optimal learning, dynamic programming and control, model predictive control, stochastic search, optimal control, bandit problems, online computation, stochastic control, reinforcement learning, Markov decision processes]


13 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization


15 Canonical problems
Decision trees

16 Canonical problems
Stochastic search (derivative based)
» Basic problem: $\max_x \mathbb{E} F(x, W)$
» Stochastic gradient: $x^{n+1} = x^n + \alpha_n \nabla_x F(x^n, W^{n+1})$
» Convergence: $\lim_{n \to \infty} \mathbb{E} F(x^n, W) = \mathbb{E} F(x^*, W)$
Examples: manufacturing network (x = design); unit commitment problem (x = day-ahead decisions); transformers (x = replacement policy); inventory system (x = design, replenishment policy); battery system (x = choice of material); patient treatment cost (x = drug, treatments); trucking company (x = fleet size and mix)

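17 Canonical problems
Ranking and selection (derivative free)
» Basic problem: $\max_{x \in \{x_1,\ldots,x_M\}} \mathbb{E} F(x, W)$
» We need to design a policy $X^\pi(S^n)$ that produces a final design $x^{\pi,N}$ solving $\max_\pi \mathbb{E} F(x^{\pi,N}, W)$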

18 Canonical problems
Multi-armed bandit problems
» We do not know the expected winnings from each slot machine ("arm").
» Collect information by playing a machine.
» We need to find a policy $X^\pi(S^n)$ for playing machine $x$ that maximizes $\max_\pi \mathbb{E} \sum_{n=0}^{N-1} W^{n+1}_{x^n}$, where
  $W^{n+1}_x$ = "winnings" (new information)
  $S^n$ = state of knowledge (what we know about each slot machine)
  $x^n = X^\pi(S^n)$ = choose next arm to play

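19 Canonical problems
Two-stage stochastic programming
» Make initial decision $x_0$: how many Christmas trees to plant
» See information $W_1$: see the orders for Christmas trees from retailers
» Make final decision $x_1$: shipping Christmas trees to retailers
Optimization model: $\min_{x_0} \left( c_0 x_0 + \mathbb{E}\, Q(x_0, W_1) \right)$, where $Q(x_0, W_1(\omega)) = \min_{x_1(\omega) \in \mathcal{X}_1(\omega)} c_1(\omega) x_1(\omega)$.
This is often solved using the sampled problem $\min_{x_0,\,(x_1(\omega))_{\omega \in \hat\Omega}} \left( c_0 x_0 + \sum_{\omega \in \hat\Omega} p(\omega)\, c_1(\omega) x_1(\omega) \right)$.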

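20 Canonical problems
Multi-stage stochastic programming
» The stochastic programming community likes to write: [equation not captured in the transcription]
» This is the same as: [equation not captured in the transcription]
» which is the same as $\min_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t)) \mid S_0 \right\}$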

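21 Canonical problems
(Discrete) Markov decision processes
» Bellman's optimality equation:
  $V_t(S_t) = \min_{a_t \in \mathcal{A}} \left( C(S_t, a_t) + \gamma\, \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t \} \right) = \min_{a_t \in \mathcal{A}} \left( C(S_t, a_t) + \gamma \sum_{s'} p(S_{t+1} = s' \mid S_t, a_t)\, V_{t+1}(s') \right)$
» where
  $S_t$ = discrete state (node in network, items in inventory)
  $a_t$ = action (transition to node, purchases)
  $W_{t+1}$ = random information (demand, prices, wind, deposits)
  $S_{t+1} = S^M(S_t, a_t, W_{t+1})$
» Solve starting at $t = T$ with $V_T(S_T) = 0$ and step backward in time.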
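The backward recursion on this slide can be sketched in a few lines. This is a minimal illustration, not code from the tutorial: the function `backward_induction`, the two-state example, and its costs and transition probabilities are all assumptions made up for the sketch.

```python
def backward_induction(states, actions, cost, trans_prob, T, gamma=1.0):
    """Solve V_t(s) = min_a [C(s,a) + gamma * sum_s' p(s'|s,a) V_{t+1}(s')]."""
    V = {T: {s: 0.0 for s in states}}   # terminal condition V_T(S_T) = 0
    policy = {}
    for t in range(T - 1, -1, -1):      # step backward in time
        V[t], policy[t] = {}, {}
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions:
                expected_next = sum(prob * V[t + 1][s_next]
                                    for s_next, prob in trans_prob(s, a).items())
                q = cost(s, a) + gamma * expected_next
                if q < best_q:
                    best_a, best_q = a, q
            V[t][s], policy[t][s] = best_q, best_a
    return V, policy


# Hypothetical two-state example: state 1 costs 1 per period, moving costs 0.3.
def example_cost(s, a):
    return float(s) + (0.3 if a == "move" else 0.0)


def example_trans_prob(s, a):
    if a == "stay":
        return {s: 1.0}
    return {1 - s: 0.9, s: 0.1}         # a move succeeds with probability 0.9


V, policy = backward_induction([0, 1], ["stay", "move"],
                               example_cost, example_trans_prob, T=3)
```

In this toy problem it pays to leave the costly state early in the horizon, but not in the last period, when there is no future left to benefit from the move.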

22 Canonical problems
Linear quadratic regulation (LQR)
» A popular optimal control problem in engineering involves solving: $\min_{u_0,\ldots,u_T} \mathbb{E} \sum_{t=0}^T \left( (x_t)^T Q_t x_t + (u_t)^T R_t u_t \right)$
» where:
  $x_t$ = state at time $t$
  $u_t$ = control at time $t$ (must be $\mathcal{F}_t$-measurable)
  $x_{t+1} = f(x_t, u_t) + w_t$ ($w_t$ is random at time $t$)
» Possible to show that the optimal policy looks like $U^*_t(x_t) = K_t x_t$, where $K_t$ is a complicated function of $Q_t$ and $R_t$.

23 Canonical problems
Optimal stopping - find the best time to stop and sell an asset
» Model:
  Exogenous process: $\omega = (p_1, p_2, \ldots, p_T)$ = sequence of stock prices
  Decision: $X_t(\omega) = 1$ if we stop and sell at time $t$, $0$ otherwise
  Reward: $f(p_t)$ = reward received if we stop at time $t$ (e.g. $f(p_t) = p_t$)
» Optimization problem: $\max_\tau \mathbb{E}\, X_\tau f(p_\tau)$, where $\tau$ is a stopping time (or "$\mathcal{F}_t$-measurable function")

24 Canonical problems
Stochastic control (from text by Rene Carmona) [equation not captured in the transcription]
» This is MCCM - mathematically correct, computationally meaningless.

25 Canonical problems
Engineers like to write $\max_{x_0,\ldots,x_T} \mathbb{E} \sum_{t=0}^T C(S_t, x_t)$
» This way of modeling is astonishingly common in the engineering literature, but it is simply incorrect - $x_t$ is a random variable. This does not model the flow of information.
Mathematicians like to write $\max_{x_0,\ldots,x_T} \mathbb{E} \sum_{t=0}^T C(S_t, x_t)$ where $x_t$ is $\mathcal{F}_t$-measurable.
» This is mathematically correct, but with no path to computation.
(Cartoon caption: "I am not smart enough to do stochastic optimization.")

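26 Canonical problems
Better:
» Maximize over policies: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t)) \mid S_0 \right\}$, where $X^\pi_t(S_t)$ is a function of the state $S_t$.
» Now we just have to show how to search over policies. This is likely to look like $\max_{\pi = (f \in \mathcal{F},\, \theta^f \in \Theta^f)} \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t \mid \theta^f)) \mid S_0 \right\}$, where $f \in \mathcal{F}$ ranges over function classes and $\theta^f \in \Theta^f$ are tunable parameters.
» In this tutorial, we are going to show that all of these canonical problems can be modeled this way.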

27 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

28 Problem classes
Staging of information and decisions
» Static stochastic optimization: decision, information.
» Two-stage stochastic programming (vector x): decision, information, decision.
» Multistage stochastic programming (vector x): decision, information, decision, information, ..., decision.
» Finite horizon Markov decision process (finite actions): decision, information, decision, information, ..., decision.
» Asymptotic stochastic search: decision, information, decision, information, ...
» Infinite horizon Markov decision process (finite actions): decision, information, decision, information, decision, ...
Contextual information
» Each problem above starts with initial information from an exogenous source (the "context").

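29 Problem classes
Learning problems (state independent)
» Arises when we are trying to optimize an unknown function (black box simulation, lab experiment, newsvendor with unknown distribution): $\max_x \mathbb{E} F(x, W) = \mathbb{E}\{ p \min(x, W) - cx \}$
» The state variable is $S_t = K_t$ = our state of knowledge about $\mathbb{E} F(x, W)$
» Transition: $x_t = X^\pi(S_t)$, $\hat{F}_{t+1} = F(x_t, W_{t+1})$, $K_t \to K_{t+1}$, giving the sequence
  $(S_0 = K_0,\; x_0 = X^\pi(S_0),\; W_1,\; \hat{F}_1 = F(x_0, W_1),\; S_1 = K_1,\; x_1 = X^\pi(S_1), \ldots)$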

30 Problem classes
Parametric belief model
» We have a nonlinear model $f(x \mid \theta)$ with uncertain $\theta$
» Knowledge is a sampled belief model: $K^n = (p^n_k, \theta_k)_{k=1}^K$, where $p^n_k = \mathrm{Prob}[\theta = \theta_k]$

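31 Problem classes
State-dependent problems
» Dynamic information process
» Imagine price is revealed before making a decision: $\max_x \mathbb{E} F(x, W) = \mathbb{E}\{ p_t \min(x, W) - cx \}$
» State variable is now $S_t = (p_t, K_t)$
» Transitions: $x_t = X^\pi(S_t)$, $\hat{F}_{t+1} = F(x_t, W_{t+1})$, $K_t \to K_{t+1}$, $p_{t+1} = p_t + \hat{p}_{t+1}$, giving the sequence
  $(S_0 = (p_0, K_0),\; x_0 = X^\pi(S_0),\; W_1 = (\hat{p}_1, \hat{F}_1),\; S_1 = (p_1, K_1),\; x_1 = X^\pi(S_1), \ldots)$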

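32 Problem classes
State-dependent problems
» Dynamic resource process. Excess inventory held over: $R_t$ = inventory available at time $t$
» Optimization problem is now $\max_{0 \le x \le R_t} \mathbb{E} F(x, W) = \mathbb{E}\{ p_t \min(x, \hat{R}_{t+1}) - cx \}$
» State variable is now $S_t = (R_t, p_t, K_t)$
» Transitions: $x_t = X^\pi(S_t)$, $\hat{F}_{t+1} = F(x_t, W_{t+1})$, $K_t \to K_{t+1}$, $R_{t+1} = \max\{0, R_t + x_t - \hat{R}_{t+1}\}$, $p_{t+1} = p_t + \hat{p}_{t+1}$, giving the sequence
  $(S_0 = (R_0, p_0, K_0),\; x_0 = X^\pi(S_0),\; W_1 = (\hat{R}_1, \hat{p}_1, \hat{F}_1),\; S_1 = (R_1, p_1, K_1),\; x_1 = X^\pi(S_1), \ldots)$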

33 Problem classes
Offline (final reward)
» We can iteratively search for the best solution.
» We only care about the final solution.
» Asymptotic formulation: $\max_x \mathbb{E} F(x, W)$
» Finite horizon formulation: $\max_\pi \mathbb{E} F(x^{\pi,N}, W)$
Online (cumulative reward)
» We have to learn as we go: $\max_\pi \mathbb{E} \sum_{n=0}^{N-1} F(X^\pi(S^n), W^{n+1})$

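34 Problem classes
Our most general formulation that covers all of these problems is
  $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t), W_{t+1}) \mid S_0 \right\}$ or $\max_\pi \mathbb{E}\left\{ F(X^{\pi,T}, W) \mid S_0 \right\}$
» where $S_{t+1} = S^M(S_t, X^\pi_t(S_t), W_{t+1})$
So, how do we design policies?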

35 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

36 Solution strategies
Special structure
» Where expectations can be computed, turning the problem into a deterministic problem
» Implies we can solve $\max_x \mathbb{E} F(x, W)$ exactly as a deterministic problem (this is what we are doing when we use Bellman's equation).
Sampled problems (SAA, scenario trees)
» Uncontrolled sampling
» Controlled sampling
Adaptive learning algorithms
» Works with full probability space
» This is our focus.

37 Solution strategies
Sampled problems
» Sample average approximation (stochastic optimization): $\min_x \frac{1}{N} \sum_{n=1}^N F(x, W^n)$
» Probabilistic learning of a sampled model: $\min_x \sum_{k=1}^K p^n_k F(x \mid \theta_k)$
» Statistical learning: $\min_\theta \frac{1}{N} \sum_{n=1}^N \left( y^n - f(x^n \mid \theta) \right)^2$
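As an illustration of the sample average approximation, the sketch below applies it to the newsvendor function used elsewhere in the deck, written as a cost $F(x, W) = cx - p\min(x, W)$ so that it matches the minimization on this slide. The function name `saa_newsvendor`, the demand samples and the prices are assumptions chosen for the example.

```python
def saa_newsvendor(samples, p, c):
    """Minimize (1/N) * sum_n F(x, W^n) with F(x, W) = c*x - p*min(x, W)."""
    def average_cost(x):
        return sum(c * x - p * min(x, w) for w in samples) / len(samples)

    # The sampled objective is piecewise linear and convex with kinks at the
    # samples, so (for 0 < c < p) it is enough to compare the sample values.
    candidates = sorted(set(samples))
    best_x = min(candidates, key=average_cost)
    return best_x, average_cost(best_x)


demand_samples = [10, 20, 30, 40, 50]   # illustrative data, not from the slides
x_saa, cost_saa = saa_newsvendor(demand_samples, p=10, c=3)
```

The answer is optimal for the sampled problem only; how well it performs on the original problem depends on how the samples were generated, which is the subject of the next slide.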

38 Solution strategies
Sampling
» Uncontrolled: this is what is implied when we are given the data. Batch dataset ("big data"). No control over the arrival process, e.g. patients arriving to a hospital.
» Direct control: creating samples that accurately represent the underlying stochastic process. Quantization/epi-splines; Voronoi quantization; K-L divergence (more generally phi-divergence).
» Indirect control: decision influences distribution (e.g. selling stock influences price)

39 Solution strategies
Adaptive learning algorithms
» We seek methods that are trying to solve the original problem (not the sampled approximation)
» We are interested in: asymptotic optimality; rapid finite-time convergence
Strategies:
» Derivative-based stochastic search: asymptotic analysis (Robbins-Monro etc.); finite-time analysis
» Derivative-free stochastic search: requires iterative learning

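40 Solution strategies
Derivative-based stochastic search, infinite horizon
» Stochastic gradient algorithm for $\max_x \mathbb{E} F(x, W) = \mathbb{E}\{ p \min(x, W) - cx \}$:
  $x^{n+1} = x^n + \alpha_n \nabla_x F(x^n, W^{n+1})$
  $\nabla_x F(x^n, W^{n+1}) = p - c$ if $x^n < W^{n+1}$, and $-c$ if $x^n > W^{n+1}$
  $\lim_{n \to \infty} \mathbb{E} F(x^n, W) = \mathbb{E} F(x^*, W)$
» Asymptotic analysis produces a deterministic solution $x^*$. This is deterministic thinking on a stochastic problem.
» What happens when we want the best solution in N iterations?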
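The stochastic gradient iteration for the newsvendor problem can be sketched as follows. This is only an illustration under stated assumptions: demand is taken to be uniform on [0, 100], the stepsize is a harmonic rule $\alpha_n = \theta/(\theta + n - 1)$, and the iterate is projected onto $x \ge 0$; none of these choices come from the slides.

```python
import random


def newsvendor_gradient(x, w, p, c):
    """Sample gradient of F(x, W) = p*min(x, W) - c*x with respect to x."""
    return p - c if x < w else -c


def stochastic_gradient(p=10.0, c=4.0, n_iters=20000, theta=50.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    for n in range(1, n_iters + 1):
        w = rng.uniform(0.0, 100.0)        # assumed demand distribution
        alpha = theta / (theta + n - 1)    # assumed harmonic stepsize rule
        # gradient ascent step, projected onto x >= 0
        x = max(0.0, x + alpha * newsvendor_gradient(x, w, p, c))
    return x


x_final = stochastic_gradient()
```

Under these assumptions the optimum satisfies $P(W > x^*) = c/p$, so $x^* = 60$, and the iterates settle near that value. The run length and the stepsize parameter `theta` are exactly the kind of tunable choices the finite-horizon slides treat as a policy.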

41 Solution strategies
Derivative-based stochastic search, finite horizon
» We want a method (an algorithm) that produces the best solution by time N: $x^{n+1} = x^n + \alpha_n \nabla_x F(x^n, W^{n+1})$
» Assume that our stepsize rule is $\alpha_n = \theta / (\theta + N^n)$, where $N^n$ = number of times the solution has not improved.
» After n iterations, our state is $S^n = (x^n, N^n)$, with $S^{n+1} = S^M(S^n, \alpha_n, W^{n+1})$
» Given the state $S^n$ and the parameter $\theta$, we can determine (after sampling $W^{n+1}$) the next state $S^{n+1}$.
(Callouts on the slide label the state, the decision, the exogenous information, the transition function and the policy.)

42 Solution strategies
Testing different stepsize rules ("policies")
[Figure: percentage error from optimal for the stepsize rules OSA, 1/n and Kalman]
» We want to optimize the rate of convergence: different stepsize rules; different ways of computing the gradient

43 Solution strategies
Derivative-based stochastic search, finite horizon
» If $X^\pi(S^n)$ is our algorithm (policy), we follow a sample path $W^1, \ldots, W^N$ to obtain a final solution $x^{\pi,N}$, which is a random variable.
» Our optimization problem is to find the best policy (algorithm) $X^\pi(S^n)$, which requires taking an expectation over the samples: $\max_\pi \mathbb{E}_{W^1,\ldots,W^N}\, \mathbb{E}_{W \mid x^{\pi,N}} F(x^{\pi,N}, W)$

44 Solution strategies
Derivative-free stochastic search
» Start by assuming that our set of possible decisions is finite: $x \in \mathcal{X} = \{x_1, x_2, \ldots, x_M\}$
» Assume we have some belief about our function (say, lookup table). Using a Bayesian model, we assume we have a distribution of belief about $f(x) = \mathbb{E} F(x, W)$ given by $\mathbb{E} F(x, W) \sim N(\bar\mu^0_x, \sigma^{2,0}_x)$, where $\beta^0_x = 1/\sigma^{2,0}_x$ is the precision.
» We refer to $S^0 = K^0 = N(\bar\mu^0, \sigma^{2,0})$ as our prior state of knowledge.

45 Derivative-free, finite horizon
Belief state for ranking and selection
» $S^n$ is our state of knowledge: $S^n = (S^n_1, \ldots, S^n_5)$, where for example $S^n_5 = N(\bar\mu^n_5, \sigma^{2,n}_5)$

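46 Derivative-free, finite horizon
Updating beliefs
» After n experiments, our belief is $\mu_x \sim N(\bar\mu^n_x, \sigma^{2,n}_x)$, with precision $\beta^n_x = 1/\sigma^{2,n}_x$
» Assume that based on this belief, we choose $x^n = X^\pi(S^n)$ to run for our next experiment (experiment n+1), observing $W^{n+1}_{x^n}$
» We update our beliefs for $x = x^n$ using
  $\bar\mu^{n+1}_x = \dfrac{\beta^n_x \bar\mu^n_x + \beta^W W^{n+1}_x}{\beta^n_x + \beta^W}$, $\quad \beta^{n+1}_x = \beta^n_x + \beta^W$
Transition function: $S^{n+1} = S^M(S^n, x^n, W^{n+1})$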
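The belief update on this slide is a precision-weighted average of the prior mean and the new observation. A minimal sketch, assuming normally distributed beliefs and observation noise with known precision; the function name `update_belief` and the numbers are illustrative only.

```python
def update_belief(mu_bar, beta, w, beta_w):
    """One Bayesian update of a normal belief with known observation precision.

    mu_bar, beta : current mean and precision of the belief about alternative x
    w            : the new observation W^{n+1}_x
    beta_w       : precision of the observation noise
    """
    beta_new = beta + beta_w
    mu_new = (beta * mu_bar + beta_w * w) / beta_new
    return mu_new, beta_new


# Illustrative numbers: a vague prior followed by two observations.
mu_bar, beta = 10.0, 1.0
for w in [20.0, 14.0]:
    mu_bar, beta = update_belief(mu_bar, beta, w, beta_w=1.0)
```

Because precisions add, each observation shrinks the variance of the belief, and this pair `(mu_bar, beta)` for every alternative is what the state of knowledge $S^n$ carries forward.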

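47 Derivative-free, finite horizon
Designing a policy
» We need a rule for picking which decision to try next. We call this rule our policy $X^\pi(S^n)$. Some examples are:
  Interval estimation: $X^{IE}(S^n \mid \theta^{IE}) = \arg\max_x \left( \bar\mu^n_x + \theta^{IE} \bar\sigma^n_x \right)$, where $\bar\sigma^n_x$ = std. dev. of $\bar\mu^n_x$
  Upper confidence bounding: $X^{UCB}(S^n \mid \theta^{UCB}) = \arg\max_x \left( \bar\mu^n_x + \theta^{UCB} \sqrt{\log n / N^n_x} \right)$, where $N^n_x$ = no. of times $x$ is tested
  Thompson sampling: $X^{TS}(S^n) = \arg\max_x \hat\mu^n_x$, where $\hat\mu^n_x \sim N(\bar\mu^n_x, \sigma^{2,n}_x)$
  Knowledge gradient (expected value of information): $X^{KG}(S^n) = \arg\max_x \nu^{KG,n}(x)$, where $\nu^{KG,n}(x) = \mathbb{E}\{ \max_y F(y, K^{n+1}(x)) \mid S^n \} - \max_y F(y, K^n)$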
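Three of these rules are simple enough to sketch directly. The code below is an illustration under assumptions, not the tutorial's implementation: beliefs are stored as plain lists indexed by alternative, and the UCB function tries any untested alternative first to avoid dividing by zero, which is a common convention rather than something stated on the slide.

```python
import math
import random


def interval_estimation(mu_bar, sigma_bar, theta_ie):
    """argmax_x of mu_bar[x] + theta_ie * sigma_bar[x]."""
    return max(range(len(mu_bar)),
               key=lambda x: mu_bar[x] + theta_ie * sigma_bar[x])


def ucb(mu_bar, counts, n, theta_ucb):
    """argmax_x of mu_bar[x] + theta_ucb * sqrt(log(n) / counts[x])."""
    for x, count in enumerate(counts):
        if count == 0:                 # assumption: test untried arms first
            return x
    return max(range(len(mu_bar)),
               key=lambda x: mu_bar[x] + theta_ucb * math.sqrt(math.log(n) / counts[x]))


def thompson_sampling(mu_bar, sigma, rng):
    """Sample one value per alternative from its belief and pick the largest."""
    draws = [rng.gauss(m, s) for m, s in zip(mu_bar, sigma)]
    return max(range(len(draws)), key=draws.__getitem__)


# Illustrative beliefs about three alternatives.
means, std_devs = [1.0, 1.2, 0.5], [0.1, 0.1, 1.0]
greedy_choice = interval_estimation(means, std_devs, theta_ie=0.0)
exploring_choice = interval_estimation(means, std_devs, theta_ie=2.0)
sampled_choice = thompson_sampling(means, std_devs, random.Random(0))
```

With `theta_ie=0` the rule is purely greedy and picks the alternative with the highest mean; raising it shifts the choice toward the alternative we know least about, which is why the parameter has to be tuned.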

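48 Derivative-free, finite horizon
Testing policies
» We have three sources of randomness:
  The true function $\mu_x \sim N(\bar\mu^0_x, \sigma^{2,0}_x)$ (Bayesian belief model)
  The samples $W^1, \ldots, W^N$ generated from truth $\mu$, $W^{n+1}_{x^n} = \mu_{x^n} + \varepsilon^{n+1}$, while following policy $\pi$
  Finally, the uncertainty $\widehat{W}$ needed to evaluate the final design $x^{\pi,N}$
» We choose our policy by solving $\max_\pi \mathbb{E}_\mu\, \mathbb{E}_{W^1,\ldots,W^N \mid \mu}\, \mathbb{E}_{\widehat{W} \mid \mu} F(x^{\pi,N}, \widehat{W})$, written compactly as $\max_\pi \mathbb{E} F(x^{\pi,N}, W)$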

49 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

50 Modeling
We lack a standard language for modeling sequential, stochastic decision problems.
» In the slides that follow, we propose to model problems along five fundamental dimensions: state variables; decision variables; exogenous information; transition function; objective function
» This framework draws heavily from Markov decision processes and the control theory communities, but it is not the standard form used anywhere.

51 Modeling dynamic problems
The system state:
» Controls community: $x_t$ = "information state"
» Operations research/MDP/computer science: $S_t = (R_t, I_t, K_t)$ = system state, where:
  $R_t$ = resource state (physical state): location/status of truck/train/plane; energy in storage
  $I_t$ = information state: prices; weather
  $K_t$ = knowledge state ("belief state"): belief about traffic delays; belief about the status of equipment

52 The state variable
Classes of state variables: $R_t$ = resource/physical state; $I_t$ = information state; $K_t$ = knowledge/belief state

53 The state variables
What is a state variable?
» Bellman's classic text on dynamic programming (1957) describes the state variable with: "... we have a physical system characterized at any stage by a small set of parameters, the state variables."
» The most popular book on dynamic programming (Puterman, 2005, p. 18) defines a state variable with the following sentence: "At each decision epoch, the system occupies a state."
» Wikipedia: "State commonly refers to either the present condition of a system or entity" or "A state variable is one of the set of variables that are used to describe the mathematical state of a dynamical system"

54 The state variable
A proposed definition of a state variable: The state $S_t$ is the minimally dimensioned function of history that, combined with the exogenous information, is necessary and sufficient to calculate the costs/rewards, constraints, and transitions, from time $t$ onward.
» The first depends on a policy. The second depends only on the problem.
» Using either definition, all properly modeled problems are Markovian!

55 Modeling dynamic problems
Decisions:
» Markov decision processes/computer science: $a_t$ = discrete action
» Control theory: $u_t$ = low-dimensional continuous vector
» Operations research: $x_t$ = usually a discrete or continuous but high-dimensional vector of decisions
At this point, we do not specify how to make a decision. Instead, we define the function $X^\pi(s)$ (or $A^\pi(s)$ or $U^\pi(s)$), where $\pi$ specifies the type of policy. "$\pi$" carries information about the type of function $f$, and any tunable parameters $\theta \in \Theta^f$.

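56 The decision variables
Styles of decisions
» Binary: $x \in \mathcal{X} = \{0, 1\}$
» Finite: $x \in \mathcal{X} = \{1, 2, \ldots, M\}$
» Continuous scalar: $x \in \mathcal{X} = [a, b]$
» Continuous vector: $x = (x_1, \ldots, x_K)$, $x_k \in \mathbb{R}$
» Discrete vector: $x = (x_1, \ldots, x_K)$, $x_k \in \mathbb{Z}$
» Categorical: $x = (a_1, \ldots, a_I)$, where $a_i$ is a category (e.g. red/green/blue)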

57 Modeling dynamic problems
Exogenous information: $W_t$ = new information that first became known at time $t$ = $(\hat{R}_t, \hat{D}_t, \hat{p}_t, \hat{E}_t)$
  $\hat{R}_t$ = equipment failures, delays, new arrivals; new drivers being hired to the network
  $\hat{D}_t$ = new customer demands
  $\hat{p}_t$ = changes in prices
  $\hat{E}_t$ = information about the environment (temperature, ...)
Note: Any variable indexed by $t$ is known at time $t$. This convention, which is not standard in control theory, dramatically simplifies the modeling of information.
Below, we let $\omega$ represent a sequence of actual observations $W_1, W_2, \ldots$; $W_t(\omega)$ refers to a sample realization of the random variable $W_t$.

58 Modeling dynamic problems
The transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$
  $R_{t+1} = R_t + x_t + \hat{R}_{t+1}$ (inventories)
  $p_{t+1} = p_t + \hat{p}_{t+1}$ (spot prices)
  $D_{t+1} = D_t + \hat{D}_{t+1}$ (market demands)
Also known as the: system model, state transition model, plant model, plant equation, transition law, transfer function, transformation function, law of motion, model.
For many applications, these equations are unknown. This is known as model-free dynamic programming.
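The three update equations on this slide translate directly into a transition function. A minimal sketch: the class names `State` and `Exogenous` and the field names are assumptions introduced for illustration, not notation from the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    R: float   # inventory
    p: float   # spot price
    D: float   # market demand


@dataclass(frozen=True)
class Exogenous:
    R_hat: float   # exogenous change in inventory
    p_hat: float   # change in price
    D_hat: float   # change in demand


def transition(s: State, x: float, w: Exogenous) -> State:
    """S_{t+1} = S^M(S_t, x_t, W_{t+1}) for the three equations on the slide."""
    return State(R=s.R + x + w.R_hat,
                 p=s.p + w.p_hat,
                 D=s.D + w.D_hat)


next_state = transition(State(R=10.0, p=5.0, D=3.0), x=2.0,
                        w=Exogenous(R_hat=-1.0, p_hat=0.5, D_hat=1.0))
```

Keeping the state immutable makes the flow of information explicit: the next state is computed only from the current state, the decision, and the information that arrives after the decision is made.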

59 Modeling dynamic problems
The objective function
Dimensions of objective functions
» Performance metrics
» Properties (convexity, monotonicity, continuity, unimodularity, ...)
» Final reward vs. cumulative reward
» Time to compute (fractions of seconds to minutes, to hours, to days or months)
» Expectation or risk measures

60 Objective functions
Performance metrics
» Costs, profits, revenues, contributions (business)
» Gains, losses (engineering)
» Strength, conductivity, diffusivity (materials science)
» Tolerance, toxicity, effectiveness (health)
» Stability, robustness (engineering)
» Risk, volatility (finance)
» Utility (economics)

61 Objective functions
Objective functions
» Deterministic costs: $c^T x$ = deterministic linear costs
» State-independent: $F(x^n, W^{n+1}) = p \min(x^n, W^{n+1}) - c x^n$
» State-dependent:
  $C(S_t, x_t) = p_t x_t$ ($p_t$ is in the state variable)
  $C(S_t, x_t, W_{t+1}) = p_t \min(x_t, W_{t+1}) - c x_t$
  $C(S_t, x_t, S_{t+1}) = S_t^T Q S_t + S_{t+1}^T R S_{t+1}$

62 Objective functions
Characteristics of the objective function
» Analytical behavior: concave/convex, unimodal, monotone, smooth, ...
» Computational cost:
  Fractions of a second: analytical functions
  Minutes: computer simulations
  Hours: laboratory experiments/computer simulations
  Days (or longer): laboratory/field experiments
  Weeks to months: field experiments
» Startup/switching costs: What is involved to observe the function for different inputs? Is there a cost to switch to different inputs?
» Risk operators: expectations; risk measures; robust/worst case

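63 Modeling stochastic, dynamic problems
Objective functions
» Offline (asymptotic) stochastic search: $\max_x \mathbb{E} F(x, W)$
» Two-stage stochastic programming: $\max_{x_0} \left( c_0 x_0 + \mathbb{E}\, Q(x_0, W_1) \right)$
» Offline (finite time) stochastic search: $\max_\pi \mathbb{E} F(X^{\pi,N}, W)$
» Multiarmed bandit problem: $\max_\pi \mathbb{E} \sum_{n=0}^{N-1} F(X^\pi(S^n), W^{n+1})$
» Contextual bandit problem: $\max_\pi \mathbb{E}\left\{ \sum_{n=0}^{N-1} F(X^\pi(S^n), W^{n+1}) \mid S_0 \right\}$
» Full dynamic programming: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t)) \mid S_0 \right\}$
» Offline dynamic programming: $\max_{\pi^{learn}} \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^{\pi^{imp}}_t(S_t)) \mid S_0 \right\}$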

64 Problem classes
The circle of stochastic optimization
[Figure: the objective functions for stochastic search, two-stage stochastic programming, finite-time offline search, multiarmed and contextual bandit problems, and full and offline dynamic programming, arranged around a circle]

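65 Modeling stochastic, dynamic problems
The universal objective function: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t), W_{t+1}) \mid S_0 \right\}$
  $\max_\pi$ = finding the best policy
  $\mathbb{E}$ = expectation over all random outcomes
  $C$ = contribution function
  $X^\pi_t$ = decision function (policy)
  $S_t$ = state variable
  $W_{t+1}$ = new information
  $S_0$ = initial state variable
Given a system model (transition function) $S_{t+1} = S^M(S_t, x_t, W_{t+1}(\omega))$.
Now we just have to find the best policy.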
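For a fixed policy, the expectation in the universal objective can be estimated by simulation. The sketch below is a minimal illustration; the function `simulate_policy` and the inventory example (an order-up-to-5 rule, a price of 10, a unit cost of 4, integer demand between 0 and 6) are assumptions made up for the example.

```python
import random


def simulate_policy(policy, transition, contribution, sample_w,
                    s0, T, n_paths, seed=0):
    """Monte Carlo estimate of E[ sum_{t=0}^T C(S_t, X(S_t), W_{t+1}) | S_0 ]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        s = s0
        for _t in range(T + 1):
            x = policy(s)                    # decision x_t = X^pi(S_t)
            w = sample_w(rng)                # exogenous information W_{t+1}
            total += contribution(s, x, w)   # C(S_t, x_t, W_{t+1})
            s = transition(s, x, w)          # S_{t+1} = S^M(S_t, x_t, W_{t+1})
    return total / n_paths


# Hypothetical inventory example; the state is the inventory on hand.
def order_up_to_5(inventory):
    return max(0, 5 - inventory)


def inventory_transition(inventory, x, demand):
    return max(0, inventory + x - demand)


def inventory_contribution(inventory, x, demand):
    return 10.0 * min(inventory + x, demand) - 4.0 * x


def uniform_demand(rng):
    return rng.randint(0, 6)


estimate = simulate_policy(order_up_to_5, inventory_transition,
                           inventory_contribution, uniform_demand,
                           s0=0, T=20, n_paths=500)
```

Note that the decision is computed before the new information is sampled, so the simulation respects the flow of information. Comparing such estimates across policies is the basic mechanism behind policy search.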

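66 Problem classes
Major problem classes
» State independent, offline (terminal reward): $\max_x \mathbb{E} F(x, W)$ or $\max_\pi \mathbb{E} F(X^{\pi,N}, W)$ (stochastic search)
» State independent, online (cumulative reward): $\max_\pi \mathbb{E}\left\{ \sum_{n=0}^{N-1} F(X^\pi(S^n), W^{n+1}) \mid S_0 \right\}$ (multiarmed bandit problems)
» State dependent, offline (terminal reward): $\max_{\pi^{learn}} \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^{\pi^{imp}}(S_t)) \mid S_0 \right\}$, where $S_{t+1} = S^M(S_t, X^{\pi^{imp}}(S_t), W_{t+1})$
» State dependent, online (cumulative reward): $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi(S_t)) \mid S_0 \right\}$, where $S_{t+1} = S^M(S_t, X^\pi(S_t), W_{t+1})$ (dynamic programming)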

67 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

68 An energy storage problem
Consider a basic energy storage problem:
» We are going to show that with minor variations in the characteristics of this problem, we can make each class of policy work best.

69 An energy storage problem
A model of our problem
» State variables
» Decision variables
» Exogenous information
» Transition function
» Objective function

70 An energy storage problem
State variables
[Diagram with nodes E (wind energy), B (battery), L (load) and G (grid)]
» We will present the full model, accumulating the information we need in the state variable.
» We will highlight information we need as we proceed. This information will make up our state variable.

71 An energy storage problem
Decision variables
» $x_t = (x^{EL}_t, x^{EB}_t, x^{GL}_t, x^{GB}_t, x^{BL}_t)$
» Constraints:

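72 An energy storage problem
Exogenous information: $W_t = (\hat{E}_t, \varepsilon^p_t, \hat{D}_t, f^L_t)$
  $\hat{E}_t$ = change in energy from wind between $t-1$ and $t$
  $\varepsilon^p_t$ = noise in the price process between $t-1$ and $t$
  $\hat{D}_t$ = change in load between $t-1$ and $t$
  $f^L_{tt'}$ = forecast of load $D_{t'}$ provided by vendor at time $t$; $f^L_t = (f^L_{tt'})_{t' > t}$ is provided exogenously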

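73 An energy storage problem
Transition function
  $E_{t+1} = E_t + \hat{E}_{t+1}$
  $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \varepsilon^p_{t+1}$
  $D_{t+1} = D_t + \hat{D}_{t+1}$
  $R^{battery}_{t+1} = R^{battery}_t + x_t$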

74 An energy storage problem
Objective function
  $C(S_t, x_t) = p_t \left( x^{GB}_t + x^{GL}_t \right)$
  $\min_\pi \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t)) \mid S_0 \right\}$
The expectation depends on the forecasts $f$.

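75 An energy storage problem
State variables: $S_t = (R_t, E_t, L_t, (p_t, p_{t-1}, p_{t-2}), f^L_t)$
» Cost function: $p_t$ = price of electricity
» Decision function: constraints
» Transition function: $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \varepsilon^p_{t+1}$
» Forecasts $f^L_t$: needed to compute the probability model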

76 Modeling
Notes
» There is a common misunderstanding that state variables have to be simple (they don't).
» There is also a tendency to refer to problems that depend on prior information (such as $p_{t-1}$ and $p_{t-2}$) as "history dependent." But this is information known at time $t$ (who cares when it first became known).
» All properly modeled problems are Markovian!
» Understanding state variables is very important in dynamic systems, because it forces you to understand what you know at time $t$, and what you don't.

77 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

78 Modeling uncertainty
There are two information processes that drive the system:
» Decisions $x_t$: this is the endogenously controllable information process.
» Exogenous information: this comes from the initial state $S_0$, and the exogenous information process $W_t$.
To figure out how to make good decisions, you need:
» The system model $S_{t+1} = S^M(S_t, x_t, W_{t+1})$
» The initial state $S_0$ and the exogenous information process $W_t$.
» The contribution function $C(S_t, x_t, W_{t+1})$

79 Modeling uncertainty
The initial state $S_0$. This contains:
» All deterministic parameters needed by the system. This is static data, so it is not modeled as part of the dynamic state $S_t$, $t > 0$.
» State of knowledge: probabilistic information about uncertain parameters. This information is always represented as a probability distribution of some form.

80 Modeling uncertainty
The exogenous information process $W_t$, which might include:
» Passive information: this is information that arrives regardless of any actions we may take.
  Purely exogenous: information that is not influenced by the state of the system or any actions we take. Examples: rainfall, stock prices (if we are a small player).
  Exogenous distributions may be influenced by states and/or actions (stock prices if we are a large player).
» Active information: this is information we choose to collect. Examples: running a laboratory experiment; purchasing a report.

81 Modeling uncertainty
Types of uncertainty
» Observational uncertainty: errors in our observations of the state of the system. What is the CO2 content of the atmosphere? What is the inventory of oil in the U.S.?
» Prognostic uncertainty: uncertainty in the forecast of a future event. Forecasting demands; forecasting the weather.

82 Modeling uncertainty
Types of uncertainty
» Experimental noise: this is the variability that arises when running repeated experiments (either in a lab or in the field). Testing the impact of a new flu drug; testing the effect of a new material on battery lifetimes.
» Transitional uncertainty: we have a model of how a (presumably) deterministic system evolves, but there is still noise: $S_{t+1} = S^M(S_t, x_t) + \varepsilon_{t+1}$. Modeling the location of an aircraft moving at a certain speed from a known location; predicting the time of arrival of a car at a downstream node.

83 Modeling uncertainty
Types of uncertainty
» Inferential uncertainty: uncertainty in parameters estimated from observational data. Sometimes known as diagnostic uncertainty, which might arise in the context of estimating a condition such as disease or the reason for a malfunction (in an engine). Such an assessment would be an inference based on indirect observations.
» Model uncertainty: this is uncertainty about the model itself, which comes in two forms:
  Uncertainty about the structure of the model: linear approximation of a nonlinear model; different sets of equations describing the climate
  Parameters characterizing the model

84 Modeling uncertainty
Types of uncertainty
» Systematic exogenous uncertainty - errors in the model of exogenous information that occur on long time scales: modeling the effect of long-term drops in oil consumption due to conservation; modeling the effect of increased cloud cover due to climate change
  $W_t$ = base signal (forecasted) + low-frequency noise ("scenarios") + high-frequency noise

85 Modeling uncertainty
Types of uncertainty
» Control uncertainty: you ask for $x$ but you get $\hat{x}$. Wiley sets a wholesale price of $80, but Amazon sells at some random price above that (limits Wiley's ability to set prices).
» Algorithmic uncertainty: run the same algorithm twice, and you may get different answers (depends on the algorithm and the nature of the compute environment).

86 Modeling uncertainty
Bayesian vs. frequentist uncertainty
» Bayesian uncertainty is captured by a distribution of belief derived from prior information: expert judgment; information collected from different settings; past experience. Bayesian uncertainty is always communicated through $S_0$.
» Frequentist uncertainty: this is uncertainty derived from statistical analysis of the variability inherent in the exogenous information $W_t$.

87 Modeling uncertainty
Types of distributions
» Probability distributions come in different forms:
  Classical thin-tailed distributions: exponential family (normal, exponential, gamma); uniform; discrete variants
  Heavy-tailed distributions: Cauchy distribution (may have infinite variance); jump diffusion (sum of low-variance normally distributed error, plus a high-variance error that occurs with low probability)
  Spikes
  Bursts
  Rare events

88 Modeling uncertainty
Notes
» How can you claim you have an optimal policy if you have not modeled the problem properly?
» Uncertainty is easily the most subtle and overlooked aspect of modeling.
» It is not enough to include uncertainty: you have to capture uncertainty in a way that represents reality. This issue universally pervades applications of stochastic optimization to real problems.

89 Outline » Canonical problems » Problem classes » Solution strategies for learning problems » Elements of a dynamic model » An energy storage illustration » Modeling uncertainty » Designing policies » The four classes of policies » From deterministic to stochastic optimization

90 Designing policies
We have to start by describing what we mean by a policy.
» Definition: A policy is a mapping from a state to an action ... any mapping.
How do we search over an arbitrary space of policies?

91 Designing policies
Policies and the English language: Behavior, Belief, Bias, Commandment, Conduct, Convention, Culture, Customs, Dogma, Etiquette, Fashion, Formula, Habit, Laws/bylaws, Manner, Method, Mode, Mores, Patterns, Plans, Policies, Practice, Prejudice, Principle, Procedure, Process, Protocols, Recipe, Ritual, Rule, Style, Technique, Tenet, Tradition, Way of life

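92 Designing policies
Two fundamental strategies:
1) Policy search: search over a class of functions for making decisions to optimize some metric.
  $\max_{\pi = (f \in \mathcal{F},\, \theta^f \in \Theta^f)} \mathbb{E}\left\{ \sum_{t=0}^T C(S_t, X^\pi_t(S_t \mid \theta^f)) \mid S_0 \right\}$
2) Lookahead approximations: approximate the impact of a decision now on the future.
  $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^T C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$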

93 Designing policies
Policy search:
1a) Analytical functions that directly map states to actions ("policy function approximations" or PFAs): $x_t = X^{PFA}(S_t \mid \theta)$
» Lookup tables: "when in this state, take this action"
» Parametric functions: order-up-to policies (if inventory is less than s, order up to S); affine policies $X^{PFA}(S_t \mid \theta) = \sum_{f \in \mathcal{F}} \theta_f \phi_f(S_t)$; neural networks
» Locally/semi/non parametric: requires optimizing over local regions
1b) Maximizing analytical approximations of costs and/or constraints ("cost function approximations" or CFAs)
» Optimizing a deterministic model modified to handle uncertainty (buffer stocks, schedule slack): $X^{CFA}(S_t \mid \theta) = \arg\max_{x_t \in \mathcal{X}^\pi_t(\theta)} \bar{C}^\pi(S_t, x_t \mid \theta)$
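Policy search over a parametric PFA can be sketched with the order-up-to rule. To keep the example short, this sketch assumes a single parameter $\theta$ (reorder point and order-up-to level coincide, a simplification of the (s, S) rule on the slide), a grid search as the search method, and made-up prices and demands. The same sample paths are reused for every $\theta$ so that the comparison is not blurred by sampling noise.

```python
import random


def average_profit(theta, demands_by_path, price=10.0, cost=4.0):
    """Average cumulative profit of the rule 'order up to theta' over sample paths."""
    total = 0.0
    for demands in demands_by_path:
        inventory = 0
        for demand in demands:
            x = max(0, theta - inventory)               # PFA: X(S | theta)
            total += price * min(inventory + x, demand) - cost * x
            inventory = max(0, inventory + x - demand)  # transition function
    return total / len(demands_by_path)


def policy_search(grid, demands_by_path):
    """Pick the parameter value with the best simulated performance."""
    return max(grid, key=lambda theta: average_profit(theta, demands_by_path))


rng = random.Random(0)
sample_paths = [[rng.randint(0, 6) for _ in range(10)] for _ in range(200)]
best_theta = policy_search(range(0, 11), sample_paths)
```

A grid search is only workable for one or two parameters; for higher-dimensional $\theta$ the derivative-based and derivative-free stochastic search methods from earlier in the deck take its place.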

94 Designing policies
Policy search:
» Typically involves searching within a parameterized family
» but may involve comparisons across classes of functional approximations.
» May be done offline (terminal reward) or online (cumulative reward).
Styles
» Active learning: experiment with new policies with the hope of finding improvements, but risks spending time using less effective policies.
» Passive learning: using the policy you believe is best, do updating based on samples that work well.

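95 Designing policies
Lookahead approximations: approximate the impact of a decision now on the future.
» An optimal policy (based on looking ahead):
  $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^T C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$
2a) Approximating the value of being in a downstream state using machine learning ("value function approximations"):
  $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \right)$
  $X^{VFA}_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\{ \bar{V}_{t+1}(S_{t+1}) \mid S_t, x_t \} \right) = \arg\max_{x_t} \left( C(S_t, x_t) + \bar{V}^x_t(S^x_t) \right)$
2b) Approximate lookahead models: optimize over an approximate model of the future:
  $X^{LA}_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \tilde{\mathbb{E}}\left\{ \max_{\tilde\pi} \tilde{\mathbb{E}}\left\{ \sum_{t'=t+1}^T C(\tilde{S}_{tt'}, \tilde{X}^{\tilde\pi}(\tilde{S}_{tt'})) \mid \tilde{S}_{t,t+1} \right\} \mid S_t, x_t \right\} \right)$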

96 Designing policies
Policies based on value function approximations
» This is the foundation of all solution strategies that depend on Bellman (or Hamilton-Jacobi) optimality equations.
» Exact value functions are rare: discrete states and actions, with a computable one-step transition matrix; analytical solutions for special functions (e.g. LQR)
» Approximate value functions are generally based on: approximate value iteration; approximate policy iteration

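97 Designing policies
The ultimate lookahead policy is optimal:
  $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^T C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$
It contains a maximization that we cannot compute and expectations that we cannot compute.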

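98 Designing policies
The ultimate lookahead policy is optimal:
  $X^*_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \mathbb{E}\left\{ \max_\pi \mathbb{E}\left\{ \sum_{t'=t+1}^T C(S_{t'}, X^\pi_{t'}(S_{t'})) \mid S_{t+1} \right\} \mid S_t, x_t \right\} \right)$
Instead, we have to solve an approximation called the lookahead model:
  $X^{LA}_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \tilde{\mathbb{E}}\left\{ \max_{\tilde\pi} \tilde{\mathbb{E}}\left\{ \sum_{t'=t+1}^{t+H} C(\tilde{S}_{tt'}, \tilde{X}^{\tilde\pi}(\tilde{S}_{tt'})) \mid \tilde{S}_{t,t+1} \right\} \mid S_t, x_t \right\} \right)$
» A lookahead policy works by approximating the lookahead model.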

99 Designing policies
Types of lookahead approximations
» One-step lookahead, widely used in pure learning policies: Bayes greedy/naïve Bayes; Thompson sampling; value of information (knowledge gradient).
» Multi-step lookahead: deterministic lookahead, also known as model predictive control or rolling horizon procedure; stochastic lookahead, either two-stage (widely used in stochastic linear programming) or multistage.
» Monte Carlo tree search (MCTS) for discrete action spaces.
» Multistage scenario trees (stochastic linear programming), typically not tractable.

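100 Four (meta)classes of policies
Policy search:
1) Policy function approximations (PFAs)
» Lookup tables, rules, parametric/nonparametric functions
2) Cost function approximations (CFAs)
» $X^{CFA}(S_t \mid \theta) = \arg\max_{x_t \in \mathcal{X}^\pi_t(\theta)} \bar C^\pi(S_t, x_t \mid \theta)$
Lookahead approximations:
3) Policies based on value function approximations (VFAs)
» $X^{VFA}_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \bar V^x_t(S^x_t(S_t, x_t)) \right)$
4) Direct lookahead policies (DLAs)
» Deterministic lookahead/rolling horizon procedure/model predictive control: $X^{LA-D}_t(S_t) = \arg\max_{\tilde x_{tt},\dots,\tilde x_{t,t+H}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{t'=t+1}^{T} C(\tilde S_{tt'}, \tilde x_{tt'})$
» Chance constrained programming: $P[A_t x_t \le f(W)] \ge 1 - \delta$
» Stochastic lookahead/stochastic programming/Monte Carlo tree search: $X^{LA-S}_t(S_t) = \arg\max_{\tilde x_{tt},\tilde x_{t,t+1},\dots,\tilde x_{t,T}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{\tilde\omega \in \tilde\Omega_t} p(\tilde\omega) \sum_{t'=t+1}^{T} C(\tilde S_{tt'}(\tilde\omega), \tilde x_{tt'}(\tilde\omega))$
» Robust optimization: $X^{LA-RO}_t(S_t) = \arg\max_{\tilde x_{tt},\dots,\tilde x_{t,t+H}} \min_{w \in \tilde{\mathcal{W}}_t(\theta)} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{t'=t+1}^{T} C(\tilde S_{tt'}(w), \tilde x_{tt'}(w))$

101 Four (meta)classes of policies
Function approximations (classes 1 to 3):
1) Policy function approximations (PFAs)
» Lookup tables, rules, parametric/nonparametric functions
2) Cost function approximations (CFAs)
» $X^{CFA}(S_t \mid \theta) = \arg\max_{x_t \in \mathcal{X}^\pi_t(\theta)} \bar C^\pi(S_t, x_t \mid \theta)$
3) Policies based on value function approximations (VFAs)
» $X^{VFA}_t(S_t) = \arg\max_{x_t} \left( C(S_t, x_t) + \bar V^x_t(S^x_t(S_t, x_t)) \right)$
4) Direct lookahead policies (DLAs)
» Deterministic lookahead/rolling horizon procedure/model predictive control: $X^{LA-D}_t(S_t) = \arg\max_{\tilde x_{tt},\dots,\tilde x_{t,t+H}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{t'=t+1}^{T} C(\tilde S_{tt'}, \tilde x_{tt'})$
» Chance constrained programming: $P[A_t x_t \le f(W)] \ge 1 - \delta$
» Stochastic lookahead/stochastic programming/Monte Carlo tree search: $X^{LA-S}_t(S_t) = \arg\max_{\tilde x_{tt},\tilde x_{t,t+1},\dots,\tilde x_{t,T}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{\tilde\omega \in \tilde\Omega_t} p(\tilde\omega) \sum_{t'=t+1}^{T} C(\tilde S_{tt'}(\tilde\omega), \tilde x_{tt'}(\tilde\omega))$
» Robust optimization: $X^{LA-RO}_t(S_t) = \arg\max_{\tilde x_{tt},\dots,\tilde x_{t,t+H}} \min_{w \in \tilde{\mathcal{W}}_t(\theta)} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{t'=t+1}^{T} C(\tilde S_{tt'}(w), \tilde x_{tt'}(w))$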

102 Approximation strategies
» Lookup tables: independent beliefs; correlated beliefs
» Linear parametric models: linear models; sparse-linear; tree regression
» Nonlinear parametric models: logistic regression; neural networks
» Nonparametric models: Gaussian process regression; hierarchical aggregation

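103 Designing policies
Finding the best policy
» We first have to articulate our classes of policies: $f \in \mathcal{F} = \{\text{PFAs, CFAs, VFAs, LAs}\}$, with $\theta \in \Theta^f$ the parameters that characterize each family.
» So optimizing over $\pi$ means optimizing over the pair $\pi = (f, \theta)$.
» We then have to pick an objective such as the cumulative reward $\max_\pi \mathbb{E} \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t))$ or the terminal reward $\max_\pi \mathbb{E}\, F(X^{\pi,T}, W)$.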

104 Designing policies
Notes:
» Three of the four classes of policies involve some form of function approximation (PFA, CFA, VFA).
» Lookahead policies require approximating the lookahead model, which requires (among other approximations) replacing the full probability space with a sampled approximation (which can be hard to solve).
» Searching for the best parameterized policy is just like solving a stochastic search problem: it may be derivative-based or derivative-free, and may be solved offline or online.
» VFAs have to be estimated using biased observations.

105 Outline
» Canonical problems
» Problem classes
» Solution strategies for learning problems
» Elements of a dynamic model
» An energy storage illustration
» Modeling uncertainty
» Designing policies
» The four classes of policies
» From deterministic to stochastic optimization

106 Outline
The four classes of policies
» Policy function approximations (PFAs)
» Cost function approximations (CFAs)
» Value function approximations (VFAs)
» Direct lookahead policies (DLAs)

108 Policy function approximations
Battery arbitrage: when to charge, when to discharge, given volatile LMPs.

109 Policy function approximations
Grid operators require that batteries bid charge and discharge prices, an hour in advance. We have to search for the best values for the policy parameters $\theta^{Charge}$ and $\theta^{Discharge}$.

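110 Policy function approximations
Our policy function might be the parametric model (this is nonlinear in the parameters):
$$X^\pi(S_t \mid \theta) = \begin{cases} +1 & \text{if } p_t < \theta^{charge} \\ 0 & \text{if } \theta^{charge} < p_t < \theta^{discharge} \\ -1 & \text{if } p_t > \theta^{discharge} \end{cases}$$
where $p_t$ is the price of electricity; the state $S_t$ also includes the energy in storage.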

111 Policy function approximations
Finding the best policy
» We need to maximize
$$\max_\theta F(\theta) = \mathbb{E} \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t \mid \theta))$$
» We cannot compute the expectation, so we run simulations over the charge and discharge parameters.
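The simulation-based search described on this slide can be sketched as follows. This is a minimal illustration, not the model used in the talk: the price process (`sample_prices`, Gaussian around 30), the one-unit charge/discharge step, the battery capacity and the grid of candidate thresholds are all assumptions made for the example.

```python
import random

def pfa_policy(price, theta_charge, theta_discharge):
    """Buy-low, sell-high rule: +1 = charge, -1 = discharge, 0 = hold."""
    if price < theta_charge:
        return 1
    if price > theta_discharge:
        return -1
    return 0

def simulate(theta_charge, theta_discharge, prices, capacity=10):
    """Profit from following the rule along one sample path of prices."""
    energy, profit = 0, 0.0
    for p in prices:
        x = pfa_policy(p, theta_charge, theta_discharge)
        if x == 1 and energy < capacity:
            energy += 1
            profit -= p          # pay the price to charge one unit
        elif x == -1 and energy > 0:
            energy -= 1
            profit += p          # earn the price by discharging one unit
    return profit

def sample_prices(T, rng):
    """Hypothetical price model (an assumption): Gaussian noise around 30."""
    return [max(0.0, rng.gauss(30.0, 10.0)) for _ in range(T)]

def estimate_F(theta_charge, theta_discharge, n_paths=100, T=100, seed=0):
    """Sample-average estimate of F(theta); the fixed seed reuses the same paths for every theta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += simulate(theta_charge, theta_discharge, sample_prices(T, rng))
    return total / n_paths

def grid_search(candidates):
    """Derivative-free search over pairs with theta_charge < theta_discharge."""
    best = None
    for tc in candidates:
        for td in candidates:
            if tc < td:
                value = estimate_F(tc, td)
                if best is None or value > best[0]:
                    best = (value, tc, td)
    return best

if __name__ == "__main__":
    print(grid_search([20, 25, 30, 35, 40]))
```

Reusing one seed for every candidate means all thresholds are compared on the same sample paths, which reduces the noise in the comparison. A grid search is only one option; any derivative-free or stochastic gradient method could replace it.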

112 Outline
The four classes of policies
» Policy function approximations (PFAs)
» Cost function approximations (CFAs)
» Value function approximations (VFAs)
» Direct lookahead policies (DLAs)

113 Robust cost function approximation
Inventory management
» How much product should I order to anticipate future demands?
» Need to accommodate different sources of uncertainty: market behavior; transit times; supplier uncertainty; product quality.

114 Robust cost function approximations
Imagine that we want to purchase parts from different suppliers. Let $x_{tp}$ be the amount of product we purchase at time $t$ from supplier $p$ to meet forecasted demand $D_t$. We would solve
$$X^\pi_t(S_t) = \arg\min_{x_t} \sum_{p \in \mathcal{P}} c_p x_{tp}$$
subject to
$$\sum_{p \in \mathcal{P}} x_{tp} \ge D_t, \qquad x_{tp} \le u_p, \qquad x_{tp} \ge 0.$$
» This assumes our demand forecast $D_t$ is accurate.

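115 Robust cost function approximations
Imagine that we want to purchase parts from different suppliers. Let $x_{tp}$ be the amount of product we purchase at time $t$ from supplier $p$ to meet forecasted demand $D_t$. We would solve
$$X^\pi_t(S_t \mid \theta) = \arg\min_{x_t} \sum_{p \in \mathcal{P}} c_p x_{tp}$$
subject to the same supplier limits $x_{tp} \le u_p$ and $x_{tp} \ge 0$, but with the forecast $D_t$ replaced by a parametrically modified requirement that adds a reserve and a buffer stock, both controlled by tunable parameters $\theta$.
» This is a parametric cost function approximation.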
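A minimal sketch of such a policy, assuming the simplest possible modification: a single additive buffer `theta_buffer` on top of the forecast (the exact form of the reserve and buffer terms on the slide is not recoverable, so this form is an assumption). With one covering constraint and simple upper bounds, the linear program can be solved greedily by buying from the cheapest suppliers first. The function name and the numbers are hypothetical.

```python
def cfa_order(costs, capacities, forecast, theta_buffer=0.0):
    """Order quantities that cover forecast + theta_buffer at minimum purchase cost.

    costs[p] and capacities[p] are the unit cost and upper bound u_p of supplier p.
    theta_buffer is the tunable parameter of the cost function approximation.
    """
    remaining = forecast + theta_buffer
    order = [0.0] * len(costs)
    # Cheapest-first is optimal here because the only coupling constraint
    # is the total quantity that has to be covered.
    for p in sorted(range(len(costs)), key=lambda i: costs[i]):
        if remaining <= 0:
            break
        order[p] = min(capacities[p], remaining)
        remaining -= order[p]
    if remaining > 1e-9:
        raise ValueError("forecast plus buffer exceeds total supplier capacity")
    return order

if __name__ == "__main__":
    # Three hypothetical suppliers with unit costs 3, 1, 2 and capacity 5 each.
    print(cfa_order([3, 1, 2], [5, 5, 5], forecast=8))
    print(cfa_order([3, 1, 2], [5, 5, 5], forecast=8, theta_buffer=2))
```

Setting `theta_buffer=0` recovers the deterministic model that trusts the forecast; a positive value trades extra purchase cost for protection against demand above the forecast.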

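116 Cost function approximations
A general way of creating CFAs:
» Define our policy:
$$X^\pi(S_t \mid \theta) = \arg\min_{x_t} \left( C(S_t, x_t) + \sum_{f \in \mathcal{F}} \theta_f \phi_f(S_t, x_t) \right)$$
where the sum is a cost correction term.
» This has been confused with approximate dynamic programming, but the correction term is not a value function.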

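117 Cost function approximations
An even more general CFA model:
» Define our policy:
$$X^\pi(S_t \mid \theta) = \arg\min_{x_t} \bar C^\pi(S_t, x_t \mid \theta) \quad \text{(parametrically modified costs)}$$
subject to
$$A x_t = \bar b^\pi(\theta) \quad \text{(parametrically modified constraints)}$$
» We tune $\theta$ by optimizing:
$$\min_\theta F^\pi(\theta) = \mathbb{E} \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t \mid \theta))$$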
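Tuning $\theta$ can be sketched as a search over simulated costs. The example below is an illustration under stated assumptions, not the model from the talk: a single supplier, an additive buffer `theta` on the forecast, Gaussian forecast errors, and a linear shortage penalty are all hypothetical choices.

```python
import random

def simulate_cost(theta, forecasts, unit_cost, shortage_penalty, noise_sd, rng):
    """Cost of one sample path when ordering forecast + theta in every period."""
    total = 0.0
    for forecast in forecasts:
        order = max(0.0, forecast + theta)
        demand = max(0.0, rng.gauss(forecast, noise_sd))  # assumed demand model
        total += unit_cost * order + shortage_penalty * max(0.0, demand - order)
    return total

def tune_theta(candidates, forecasts, unit_cost=1.0, shortage_penalty=5.0,
               noise_sd=10.0, n_paths=500, seed=0):
    """Return the candidate theta with the lowest average simulated cost."""
    def average_cost(theta):
        rng = random.Random(seed)  # same sample paths for every candidate
        costs = (simulate_cost(theta, forecasts, unit_cost,
                               shortage_penalty, noise_sd, rng)
                 for _ in range(n_paths))
        return sum(costs) / n_paths
    return min(candidates, key=average_cost)

if __name__ == "__main__":
    print(tune_theta([0.0, 5.0, 10.0, 15.0, 20.0], forecasts=[100.0] * 10))
```

The policy itself stays a deterministic optimization; all of the handling of uncertainty is pushed into the choice of `theta`, which is evaluated against the simulated (base) model.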

118 Outline
The four classes of policies
» Policy function approximations (PFAs)
» Cost function approximations (CFAs)
» Value function approximations (VFAs)
» Direct lookahead policies (DLAs)

119 Schneider National

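120 Value function approximation
Pre-decision state: we see the demands. Loads shown in the figure: $350, $300, $150, $450. $S_t = (\text{TX}_t, \hat D_t)$

121 Value function approximation
We use initial value function approximations $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{NY}) = 0$, $\bar V^0(\text{CA}) = 0$. Loads: $350, $300, $150, $450. $S_t = (\text{TX}_t, \hat D_t)$

122 Value function approximation
and make our first choice $x^1$. Values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{NY}) = 0$, $\bar V^0(\text{CA}) = 0$. Loads: $350, $300, $150, $450. Post-decision state: $S^x_t = (\text{NY}_{t+1})$

123 Value function approximation
Update the value of being in Texas: $\bar V^1(\text{TX}) = 450$. Other values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{NY}) = 0$, $\bar V^0(\text{CA}) = 0$. Loads: $350, $300, $150, $450. $S^x_t = (\text{NY}_{t+1})$

124 Value function approximation
Now move to the next state, sample new demands and make a new decision. Values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{NY}) = 0$, $\bar V^0(\text{CA}) = 0$, $\bar V^1(\text{TX}) = 450$. Loads: $400, $180, $600, $125. $S_{t+1} = (\text{NY}_{t+1}, \hat D_{t+1})$

125 Value function approximation
Update value of being in NY: $\bar V^0(\text{NY}) = 600$. Other values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{CA}) = 0$, $\bar V^1(\text{TX}) = 450$. Loads: $400, $180, $600, $125. $S^x_{t+1} = (\text{CA}_{t+2})$

126 Value function approximation
Move to California. Values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CA}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^1(\text{TX}) = 450$, $\bar V^0(\text{NY}) = 600$. Loads: $200, $350, $400, $150. $S_{t+2} = (\text{CA}_{t+2}, \hat D_{t+2})$

127 Value function approximation
Make decision to return to TX and update value of being in CA: $\bar V^0(\text{CA}) = 800$. Other values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^1(\text{TX}) = 450$, $\bar V^0(\text{NY}) = 500$. Loads: $200, $350, $400, $150. $S_{t+2} = (\text{CA}_{t+2}, \hat D_{t+2})$

128 Value function approximation
Updating the value function:
» Old value: $\bar V^1(\text{TX}) = \$450$
» New estimate: $\hat v^2(\text{TX}) = \$800$
How do we merge old with new?
$$\bar V^2(\text{TX}) = (1 - \alpha)\, \bar V^1(\text{TX}) + \alpha\, \hat v^2(\text{TX}) = (0.90)\,\$450 + (0.10)\,\$800 = \$485$$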
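The smoothing step on this slide is a one-line calculation. The sketch below reproduces the arithmetic with the stepsize $\alpha = 0.10$ implied by the weights 0.90 and 0.10; the function name `smooth` is hypothetical.

```python
def smooth(old_value, new_estimate, alpha):
    """Exponential smoothing: keep (1 - alpha) of the old value, blend in alpha of the new."""
    return (1 - alpha) * old_value + alpha * new_estimate

if __name__ == "__main__":
    # Old value of being in TX is $450, new sampled estimate is $800.
    updated = smooth(450.0, 800.0, alpha=0.10)
    print(round(updated, 2))  # 0.90 * 450 + 0.10 * 800, about 485
```

A larger stepsize makes the approximation respond faster to new observations but also makes it noisier.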

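129 Value function approximation
An updated value of being in TX: $\bar V^1(\text{TX}) = 485$. Other values: $\bar V^0(\text{MN}) = 0$, $\bar V^0(\text{CO}) = 0$, $\bar V^0(\text{NY}) = 600$, $\bar V^0(\text{CA}) = 800$. Loads: $385, $275, $800, $125. $S_{t+3} = (\text{TX}_{t+3}, \hat D_{t+3})$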

130 Approximate value iteration
Step 1: Start with a pre-decision state $S^n_t$.
Step 2 (deterministic optimization): Solve the deterministic optimization using an approximate value function:
$$\hat v^n_t = \min_{x_t} \left( C_t(S^n_t, x_t) + \bar V^{n-1}_t(S^{M,x}(S^n_t, x_t)) \right)$$
to obtain $x^n_t$.
Step 3 (recursive statistics): Update the value function approximation:
$$\bar V^n_{t-1}(S^{x,n}_{t-1}) = (1 - \alpha_{n-1})\, \bar V^{n-1}_{t-1}(S^{x,n}_{t-1}) + \alpha_{n-1}\, \hat v^n_t$$
Step 4 (simulation): Obtain a Monte Carlo sample of $W_t(\omega^n)$ and compute the next pre-decision state:
$$S^n_{t+1} = S^M(S^n_t, x^n_t, W_{t+1}(\omega^n))$$
Step 5: Return to step 1.
This is "on policy learning".
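The five steps can be sketched on a small trucker-style problem. This is a minimal illustration under stated assumptions: the load payments are drawn uniformly from a made-up range, the approximation is a lookup table indexed by time and destination city, the stepsize is constant, and the problem maximizes contributions (as in the trucker example) rather than minimizing costs as written on the slide.

```python
import random

CITIES = ["TX", "NY", "CA", "CO", "MN"]  # locations from the trucker illustration

def sample_loads(location, rng):
    """Hypothetical exogenous information: a random payment for moving to each other city."""
    return {c: rng.uniform(100.0, 600.0) for c in CITIES if c != location}

def approximate_value_iteration(n_iterations=200, horizon=5, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # v_bar[t][c] approximates the value of the post-decision state "heading to c"
    # after the decision at time t. The last period keeps its terminal value of 0.
    v_bar = [{c: 0.0 for c in CITIES} for _ in range(horizon)]
    for _ in range(n_iterations):
        location = "TX"                          # Step 1: starting pre-decision state
        for t in range(horizon):
            loads = sample_loads(location, rng)  # Step 4: Monte Carlo sample of new information
            # Step 2: deterministic optimization using the current approximation.
            decision = max(loads, key=lambda c: loads[c] + v_bar[t][c])
            v_hat = loads[decision] + v_bar[t][decision]
            # Step 3: smooth v_hat into the value of the previous post-decision state.
            if t > 0:
                v_bar[t - 1][location] = ((1 - alpha) * v_bar[t - 1][location]
                                          + alpha * v_hat)
            location = decision                  # follow the decision (on-policy learning)
    return v_bar

if __name__ == "__main__":
    for t, values in enumerate(approximate_value_iteration()):
        print(t, {c: round(v) for c, v in values.items()})
```

Because the trajectory follows the decisions produced by the current approximation, cities that look unattractive early on may rarely be visited again, which is one reason the observations used to estimate VFAs are biased.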

132 Approximate dynamic programming: a typical performance graph
[Figure: objective function versus iterations]

133 Outline
The four classes of policies
» Policy function approximations (PFAs)
» Cost function approximations (CFAs)
» Value function approximations (VFAs)
» Direct lookahead policies (DLAs)

134 Lookahead policies
Planning your next chess move:
» You put your finger on the piece while you think about moves into the future. This is a lookahead policy, illustrated for a problem with discrete actions.

136 Lookahead policies
Decision trees:

137 Lookahead policies
Modeling lookahead policies
» Lookahead policies solve a lookahead model, which is an approximation of the future.
» It is important to understand the difference between: the base model, which is the model we are trying to solve by finding the best policy (this is usually some form of simulator); and the lookahead model, which is our approximation of the future to help us make better decisions now.
» The base model is typically a simulator, or it might be the real world.

138 Lookahead policies
Lookahead models use five classes of approximations:
» Horizon truncation: replacing a longer horizon problem with a shorter horizon.
» Stage aggregation: replacing multistage problems with a two-stage approximation.
» Outcome aggregation/sampling: simplifying the exogenous information process.
» Discretization: of time, states and decisions.
» Dimensionality reduction: we may ignore some variables (such as forecasts) in the lookahead model that we capture in the base model (these become latent variables in the lookahead model).

139 Lookahead policies
Notes:
» The academic literature at the moment does not distinguish between lookahead models and base models.
» When uncertainty is involved, base models are almost always simulators that simulate different policies (which might include a lookahead policy).
» When we use a deterministic lookahead policy to solve a stochastic problem, we understand that the model being solved by the lookahead policy is just an approximation.
» When we use a stochastic lookahead model, then things tend to get confusing.

140 Lookahead policies
Lookahead policies are the trickiest to model:
» We create tilde variables for the lookahead model:
$\tilde S_{tt'}$ = approximated state variable (e.g. coarse discretization)
$\tilde x_{tt'}$ = decision we plan on implementing at time $t'$ when we are planning at time $t$, $t' = t, t+1, \dots, t+H$
$\tilde x_t = (\tilde x_{tt}, \tilde x_{t,t+1}, \dots, \tilde x_{t,t+H})$
$\tilde W_{tt'}$ = approximation of the information process
$\tilde c_{tt'}$ = forecast of costs at time $t'$ made at time $t$
$\tilde b_{tt'}$ = forecast of right hand sides for time $t'$ made at time $t$
» All variables are indexed by $t$ (when the lookahead model is being generated) and $t'$ (the time within the lookahead model).

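141 Lookahead policies
We can use this notation to create a policy based on our lookahead model:
$$X^{*}_t(S_t) = \arg\max_{x_t}\left( C(S_t,x_t) + \tilde{\mathbb{E}}\left\{ \max_{\tilde\pi} \tilde{\mathbb{E}}\left\{ \sum_{t'=t+1}^{t+H} C(\tilde S_{tt'}, \tilde X^{\tilde\pi}(\tilde S_{tt'})) \,\Big|\, \tilde S_{t,t+1} \right\} \,\Big|\, S_t, x_t \right\} \right)$$
The approximations in this model are:
» Limited horizon
» Restricted/simplified set of policies
» Sampled set of realizations (or deterministic)
» Aggregated staging of decisions and information
» Simplified/discretized set of state variables
» Simplest lookahead is deterministic.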

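142 Lookahead policies
Deterministic lookahead:
$$X^{LA-D}_t(S_t) = \arg\min_{\tilde x_{tt}, \tilde x_{t,t+1}, \dots, \tilde x_{t,T}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{t'=t+1}^{T} C(\tilde S_{tt'}, \tilde x_{tt'})$$
Stochastic lookahead (with two-stage approximation), using scenario trees:
$$X^{LA-S}_t(S_t) = \arg\min_{\tilde x_{tt}, \tilde x_{t,t+1}, \dots, \tilde x_{t,T}} C(\tilde S_{tt}, \tilde x_{tt}) + \sum_{\tilde\omega \in \tilde\Omega_t} p(\tilde\omega) \sum_{t'=t+1}^{T} C(\tilde S_{tt'}(\tilde\omega), \tilde x_{tt'}(\tilde\omega))$$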
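A deterministic lookahead can be sketched for a small storage problem. This is an illustration under assumptions that are not in the slides: energy is stored in whole units, one unit can be charged or discharged per period, and the lookahead model is solved by backward recursion over the forecast horizon. The sketch maximizes forecast profit, which is equivalent to minimizing cost as written above; only the first decision is implemented.

```python
def deterministic_lookahead(energy, forecast_prices, capacity=3):
    """Return the first decision (+1 charge, 0 hold, -1 discharge) of the plan
    that maximizes profit against a point forecast of prices."""
    # value[r]: best forecast profit over the remaining periods with r units stored.
    value = [0.0] * (capacity + 1)
    first_decision = 0
    for k in range(len(forecast_prices) - 1, -1, -1):
        price = forecast_prices[k]
        new_value, best_action = [], []
        for r in range(capacity + 1):
            best, arg = None, 0
            for x in (0, 1, -1):                 # holding wins ties
                r_next = r + x
                if 0 <= r_next <= capacity:
                    total = -price * x + value[r_next]
                    if best is None or total > best:
                        best, arg = total, x
            new_value.append(best)
            best_action.append(arg)
        value = new_value
        if k == 0:
            first_decision = best_action[energy]
    return first_decision

if __name__ == "__main__":
    # Cheap now, expensive later: the plan starts by charging.
    print(deterministic_lookahead(energy=0, forecast_prices=[10.0, 50.0]))
    # Expensive now, cheap later: the plan starts by discharging.
    print(deterministic_lookahead(energy=1, forecast_prices=[50.0, 10.0]))
```

In a rolling horizon procedure this function would be called again at every time step with the updated state and a fresh forecast, so the later decisions in each plan are never implemented.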

143 Lookahead policies
Lookahead policies peek into the future
» Optimize over deterministic lookahead model
[Figure: the lookahead model and the real process]

147 Lookahead policies
There are two strategies for formulating and solving stochastic lookahead models:
» Solve a sampled model of the future over a set of scenarios. We have two options: a full multistage tree, which is usually impossible to solve; or a two-stage approximation, in which we break the problem into an initial decision, seeing all future information, and then making all remaining decisions. These may still be very hard to solve.
» Use value function approximations: Benders cuts (SDDP); other strategies for approximating value functions.
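The two-stage approximation over a sampled set of scenarios can be sketched with a toy ordering problem. Everything specific here is an assumption for illustration: the decision is a single order quantity, the scenarios are demand values with probabilities, and the second stage simply sells as much as demand allows.

```python
def two_stage_decision(candidates, scenarios, cost, price):
    """Pick the initial decision that maximizes expected profit over the scenarios.

    scenarios is a list of (demand, probability) pairs standing in for the
    sampled future; after the initial order x, each scenario reveals all
    remaining information and the recourse is to sell min(x, demand).
    """
    def expected_profit(x):
        return sum(p * (price * min(x, d) - cost * x) for d, p in scenarios)
    return max(candidates, key=expected_profit)

if __name__ == "__main__":
    scenarios = [(5, 0.5), (10, 0.5)]  # hypothetical sampled demands
    print(two_stage_decision(range(0, 11), scenarios, cost=1.0, price=3.0))
    print(two_stage_decision(range(0, 11), scenarios, cost=2.0, price=3.0))
```

Enumerating candidates only works because the first-stage decision is a scalar; with vector-valued decisions the same sampled model would typically be written as one large linear program over all scenarios.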

148 Multistage lookahead models
Stochastic lookahead
» Here, we approximate the information model by using a Monte Carlo sample to create a scenario tree.
[Figure: scenario tree of changes in wind speed, 1am to 5am]

149 Multistage lookahead models
We can then simulate this lookahead policy over time.
[Figure: the lookahead model and the base model]

153 Lookahead policies
Notes:
» Solving stochastic lookahead policies can be hard!
» But this is still just a lookahead policy, which is a class of rolling horizon heuristic.
» Even if solving the lookahead model is hard, an optimal solution of a lookahead model (even a stochastic one) is (with rare exceptions) not an optimal policy.

154 Lookahead policies
There are two ways of evaluating a policy:
» Offline learning using a simulator:
$$F^\pi(\omega) = \sum_{t=0}^{T} C(S_t(\omega), X^\pi_t(S_t(\omega)))$$
where
$$S_{t+1}(\omega) = S^M(S_t(\omega), X^\pi_t(S_t(\omega)), W_{t+1}(\omega))$$
» Online learning (in the field): implement the policy and observe how well it works. Make adjustments as necessary.
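Offline evaluation amounts to running the policy through the transition function on sampled information and averaging the accumulated contributions. The sketch below is generic; the argument names (`policy`, `transition`, `contribution`, `sample_w`) and the toy inventory model in the usage lines are hypothetical.

```python
import random

def evaluate_policy(policy, transition, contribution, sample_w,
                    s0, T, n_paths, seed=0):
    """Average over sample paths of the sum of C(S_t, X(S_t)) for t = 0..T."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        state = s0
        for _t in range(T + 1):
            x = policy(state)                 # x_t = X^pi(S_t)
            total += contribution(state, x)   # accumulate C(S_t, x_t)
            w = sample_w(rng)                 # W_{t+1}, revealed after the decision
            state = transition(state, x, w)   # S_{t+1} = S^M(S_t, x_t, W_{t+1})
    return total / n_paths

if __name__ == "__main__":
    # Toy inventory: order up to 10 units, random demand, pay 1 per unit ordered.
    avg = evaluate_policy(
        policy=lambda s: max(0, 10 - s),
        transition=lambda s, x, w: max(0, s + x - w),
        contribution=lambda s, x: -1.0 * x,
        sample_w=lambda rng: rng.randint(0, 8),
        s0=0, T=20, n_paths=100,
    )
    print(avg)
```

Any of the four classes of policies can be passed in as `policy`, which is what makes the simulator (the base model) a common yardstick for comparing them.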

155 Lookahead policies
Offline learning
» Can test policies fairly quickly
» Can test conditions that have not actually happened.
» Requires making assumptions about dynamics and random events
Online learning
» Takes a day to observe a day's performance
» Have to live with real world events
» No assumptions required

156 Lookahead policies
Notes:
» Lookahead policies are the only class that does not require using any form of statistical function approximation.
» Lookahead policies do not require tuning, but you may want to test parameter settings (horizon, number of scenarios).
» The price of these features tends to be policies that are much harder to compute.
» Approximation strategies can only be evaluated in the controlled setting of a simulator.

157 An energy storage problem
Consider a basic energy storage problem:
» We are going to show that with minor variations in the characteristics of this problem, we can make each class of policy work best.

158 An energy storage problem
We can create distinct flavors of this problem:
» Problem class 1, best for PFAs: highly stochastic (heavy tailed) electricity prices; stationary data.
» Problem class 2, best for CFAs: stochastic prices and wind (but not heavy tailed); stationary data.
» Problem class 3, best for VFAs: stochastic wind and prices (but not too random); time varying loads, but inaccurate wind forecasts.
» Problem class 4, best for deterministic lookaheads: relatively low noise problem with accurate forecasts.
» Problem class 5, where a hybrid policy worked best: stochastic prices and wind, nonstationary data, noisy forecasts.

159 An energy storage problem
The policies
» The PFA: charge battery when price is below p1; discharge when price is above p2.
» The CFA: optimize over a horizon H; maintain upper and lower bounds (u, l) for every time period except the first (note that this is a hybrid with a lookahead).
» The VFA: piecewise linear, concave value function in terms of energy, indexed by time.
» The lookahead (deterministic): optimize over a horizon H (the only tunable parameter) using forecasts of demand, prices and wind energy.
» The lookahead CFA: use a lookahead policy (deterministic), but with a tunable parameter that improves robustness.

160 An energy storage problem
Each policy is best on certain problems
» Results are percent of posterior optimal solution.
» Any policy might be best depending on the data.
Joint research with Prof. Stephan Meisel, University of Muenster, Germany.

161 Outline
» Canonical problems
» Problem classes
» Solution strategies for learning problems
» Elements of a dynamic model
» An energy storage illustration
» Modeling uncertainty
» Designing policies
» The four classes of policies
» From deterministic to stochastic optimization

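162 From deterministic to stochastic
Imagine that you would like to solve the time-dependent linear program:
$$\min_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$$
» subject to
$$A_0 x_0 = b_0, \qquad A_t x_t - B_{t-1} x_{t-1} = b_t, \quad t \ge 1.$$
We can convert this to a proper stochastic model by replacing $x_t$ with $X^\pi_t(S_t)$:
$$\min_\pi \mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t)$$
The policy has to satisfy $A_t x_t = R_t$ with transition function:
$$S_{t+1} = S^M(S_t, x_t, W_{t+1})$$

163 Modeling
Deterministic
» Objective function: $\min_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$
» Decision variables: $(x_0, \dots, x_T)$
» Constraints at time $t$: $A_t x_t = R_t$, $x_t \ge 0$
» Transition function: $R_{t+1} = b_{t+1} + B_t x_t$
Stochastic
» Objective function: $\max_\pi \mathbb{E}\left\{ \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t), W_{t+1}) \mid S_0 \right\}$
» Policy: $X^\pi : \mathcal{S} \to \mathcal{X}$
» Constraints at time $t$: $x_t = X^\pi_t(S_t) \in \mathcal{X}_t$
» Transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$
» Exogenous information: $(W_1, W_2, \dots, W_T)$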

164 From deterministic to stochastic
Stochastic problems
» Modeling is the most important, and hardest, aspect of stochastic optimization.
» Searching for policies is important, but less critical.
» Modeling uncertainty is often overlooked, but is of central importance.
» Evaluating a policy is important, and difficult.
Deterministic problems
» Modeling is important, but not central.
» Algorithms are the most important, and hardest, part.
» Modeling uncertainty: Huh?
» Evaluating a solution: Just add up the costs!!

165 Modeling stochastic, dynamic problems
The universal objective function:
$$\max_\pi \mathbb{E}\left\{ \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t), W_{t+1}) \mid S_0 \right\}$$
with
$$S_{t+1} = S^M(S_t, x_t, W_{t+1}(\omega))$$
Now search for policies:
» Policy search: PFAs, CFAs
» Lookahead policies: VFAs, DLAs

166 Theory, Computation, Modeling, Applications


More information

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),

More information

Unit Root Time Series. Univariate random walk

Unit Root Time Series. Univariate random walk Uni Roo ime Series Univariae random walk Consider he regression y y where ~ iid N 0, he leas squares esimae of is: ˆ yy y y yy Now wha if = If y y hen le y 0 =0 so ha y j j If ~ iid N 0, hen y ~ N 0, he

More information

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes Some common engineering funcions 2.7 Inroducion This secion provides a caalogue of some common funcions ofen used in Science and Engineering. These include polynomials, raional funcions, he modulus funcion

More information

1. Consider a pure-exchange economy with stochastic endowments. The state of the economy

1. Consider a pure-exchange economy with stochastic endowments. The state of the economy Answer 4 of he following 5 quesions. 1. Consider a pure-exchange economy wih sochasic endowmens. The sae of he economy in period, 0,1,..., is he hisory of evens s ( s0, s1,..., s ). The iniial sae is given.

More information

Introduction to Probability and Statistics Slides 4 Chapter 4

Introduction to Probability and Statistics Slides 4 Chapter 4 Inroducion o Probabiliy and Saisics Slides 4 Chaper 4 Ammar M. Sarhan, asarhan@mahsa.dal.ca Deparmen of Mahemaics and Saisics, Dalhousie Universiy Fall Semeser 8 Dr. Ammar Sarhan Chaper 4 Coninuous Random

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

Tom Heskes and Onno Zoeter. Presented by Mark Buller

Tom Heskes and Onno Zoeter. Presented by Mark Buller Tom Heskes and Onno Zoeer Presened by Mark Buller Dynamic Bayesian Neworks Direced graphical models of sochasic processes Represen hidden and observed variables wih differen dependencies Generalize Hidden

More information

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks -

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks - Deep Learning: Theory, Techniques & Applicaions - Recurren Neural Neworks - Prof. Maeo Maeucci maeo.maeucci@polimi.i Deparmen of Elecronics, Informaion and Bioengineering Arificial Inelligence and Roboics

More information

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Kriging Models Predicing Arazine Concenraions in Surface Waer Draining Agriculural Waersheds Paul L. Mosquin, Jeremy Aldworh, Wenlin Chen Supplemenal Maerial Number

More information

Appendix to Creating Work Breaks From Available Idleness

Appendix to Creating Work Breaks From Available Idleness Appendix o Creaing Work Breaks From Available Idleness Xu Sun and Ward Whi Deparmen of Indusrial Engineering and Operaions Research, Columbia Universiy, New York, NY, 127; {xs2235,ww24}@columbia.edu Sepember

More information

Smoothing. Backward smoother: At any give T, replace the observation yt by a combination of observations at & before T

Smoothing. Backward smoother: At any give T, replace the observation yt by a combination of observations at & before T Smoohing Consan process Separae signal & noise Smooh he daa: Backward smooher: A an give, replace he observaion b a combinaion of observaions a & before Simple smooher : replace he curren observaion wih

More information

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS NA568 Mobile Roboics: Mehods & Algorihms Today s Topic Quick review on (Linear) Kalman Filer Kalman Filering for Non-Linear Sysems Exended Kalman Filer (EKF)

More information

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H.

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H. ACE 56 Fall 005 Lecure 5: he Simple Linear Regression Model: Sampling Properies of he Leas Squares Esimaors by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Inference in he Simple

More information

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

MATHEMATICAL DESCRIPTION OF THEORETICAL METHODS OF RESERVE ECONOMY OF CONSIGNMENT STORES

MATHEMATICAL DESCRIPTION OF THEORETICAL METHODS OF RESERVE ECONOMY OF CONSIGNMENT STORES MAHEMAICAL DESCIPION OF HEOEICAL MEHODS OF ESEVE ECONOMY OF CONSIGNMEN SOES Péer elek, József Cselényi, György Demeer Universiy of Miskolc, Deparmen of Maerials Handling and Logisics Absrac: Opimizaion

More information

An Introduction to Stochastic Programming: The Recourse Problem

An Introduction to Stochastic Programming: The Recourse Problem An Inroducion o Sochasic Programming: he Recourse Problem George Danzig and Phil Wolfe Ellis Johnson, Roger Wes, Dick Cole, and Me John Birge Where o look in he ex pp. 6-7, Secion.2.: Inroducion o sochasic

More information

OBJECTIVES OF TIME SERIES ANALYSIS

OBJECTIVES OF TIME SERIES ANALYSIS OBJECTIVES OF TIME SERIES ANALYSIS Undersanding he dynamic or imedependen srucure of he observaions of a single series (univariae analysis) Forecasing of fuure observaions Asceraining he leading, lagging

More information

Lecture 33: November 29

Lecture 33: November 29 36-705: Inermediae Saisics Fall 2017 Lecurer: Siva Balakrishnan Lecure 33: November 29 Today we will coninue discussing he boosrap, and hen ry o undersand why i works in a simple case. In he las lecure

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

Maximum Likelihood Parameter Estimation in State-Space Models

Maximum Likelihood Parameter Estimation in State-Space Models Maximum Likelihood Parameer Esimaion in Sae-Space Models Arnaud Douce Deparmen of Saisics, Oxford Universiy Universiy College London 4 h Ocober 212 A. Douce (UCL Maserclass Oc. 212 4 h Ocober 212 1 / 32

More information

= ( ) ) or a system of differential equations with continuous parametrization (T = R

= ( ) ) or a system of differential equations with continuous parametrization (T = R XIII. DIFFERENCE AND DIFFERENTIAL EQUATIONS Ofen funcions, or a sysem of funcion, are paramerized in erms of some variable, usually denoed as and inerpreed as ime. The variable is wrien as a funcion of

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still. Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in

More information

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC This documen was generaed a :37 PM, 1/11/018 Copyrigh 018 Richard T. Woodward 1. An inroducion o dynamic opimiaion -- Opimal Conrol and Dynamic Programming AGEC 64-018 I. Overview of opimiaion Opimiaion

More information

Object tracking: Using HMMs to estimate the geographical location of fish

Object tracking: Using HMMs to estimate the geographical location of fish Objec racking: Using HMMs o esimae he geographical locaion of fish 02433 - Hidden Markov Models Marin Wæver Pedersen, Henrik Madsen Course week 13 MWP, compiled June 8, 2011 Objecive: Locae fish from agging

More information

( ) a system of differential equations with continuous parametrization ( T = R + These look like, respectively:

( ) a system of differential equations with continuous parametrization ( T = R + These look like, respectively: XIII. DIFFERENCE AND DIFFERENTIAL EQUATIONS Ofen funcions, or a sysem of funcion, are paramerized in erms of some variable, usually denoed as and inerpreed as ime. The variable is wrien as a funcion of

More information

Online Convex Optimization Example And Follow-The-Leader

Online Convex Optimization Example And Follow-The-Leader CSE599s, Spring 2014, Online Learning Lecure 2-04/03/2014 Online Convex Opimizaion Example And Follow-The-Leader Lecurer: Brendan McMahan Scribe: Sephen Joe Jonany 1 Review of Online Convex Opimizaion

More information

Article from. Predictive Analytics and Futurism. July 2016 Issue 13

Article from. Predictive Analytics and Futurism. July 2016 Issue 13 Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning

More information

Understanding the asymptotic behaviour of empirical Bayes methods

Understanding the asymptotic behaviour of empirical Bayes methods Undersanding he asympoic behaviour of empirical Bayes mehods Boond Szabo, Aad van der Vaar and Harry van Zanen EURANDOM, 11.10.2011. Conens 2/20 Moivaion Nonparameric Bayesian saisics Signal in Whie noise

More information

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic

More information

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD HAN XIAO 1. Penalized Leas Squares Lasso solves he following opimizaion problem, ˆβ lasso = arg max β R p+1 1 N y i β 0 N x ij β j β j (1.1) for some 0.

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar CONROL OF SOCHASIC SYSEMS P.R. Kumar Deparmen of Elecrical and Compuer Engineering, and Coordinaed Science Laboraory, Universiy of Illinois, Urbana-Champaign, USA. Keywords: Markov chains, ransiion probabiliies,

More information

Linear Time-invariant systems, Convolution, and Cross-correlation

Linear Time-invariant systems, Convolution, and Cross-correlation Linear Time-invarian sysems, Convoluion, and Cross-correlaion (1) Linear Time-invarian (LTI) sysem A sysem akes in an inpu funcion and reurns an oupu funcion. x() T y() Inpu Sysem Oupu y() = T[x()] An

More information

Optimal Investment under Dynamic Risk Constraints and Partial Information

Optimal Investment under Dynamic Risk Constraints and Partial Information Opimal Invesmen under Dynamic Risk Consrains and Parial Informaion Wolfgang Puschögl Johann Radon Insiue for Compuaional and Applied Mahemaics (RICAM) Ausrian Academy of Sciences www.ricam.oeaw.ac.a 2

More information

Macroeconomic Theory Ph.D. Qualifying Examination Fall 2005 ANSWER EACH PART IN A SEPARATE BLUE BOOK. PART ONE: ANSWER IN BOOK 1 WEIGHT 1/3

Macroeconomic Theory Ph.D. Qualifying Examination Fall 2005 ANSWER EACH PART IN A SEPARATE BLUE BOOK. PART ONE: ANSWER IN BOOK 1 WEIGHT 1/3 Macroeconomic Theory Ph.D. Qualifying Examinaion Fall 2005 Comprehensive Examinaion UCLA Dep. of Economics You have 4 hours o complee he exam. There are hree pars o he exam. Answer all pars. Each par has

More information

CH Sean Han QF, NTHU, Taiwan BFS2010. (Joint work with T.-Y. Chen and W.-H. Liu)

CH Sean Han QF, NTHU, Taiwan BFS2010. (Joint work with T.-Y. Chen and W.-H. Liu) CH Sean Han QF, NTHU, Taiwan BFS2010 (Join work wih T.-Y. Chen and W.-H. Liu) Risk Managemen in Pracice: Value a Risk (VaR) / Condiional Value a Risk (CVaR) Volailiy Esimaion: Correced Fourier Transform

More information

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates Biol. 356 Lab 8. Moraliy, Recruimen, and Migraion Raes (modified from Cox, 00, General Ecology Lab Manual, McGraw Hill) Las week we esimaed populaion size hrough several mehods. One assumpion of all hese

More information

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011 Mainenance Models Prof Rober C Leachman IEOR 3, Mehods of Manufacuring Improvemen Spring, Inroducion The mainenance of complex equipmen ofen accouns for a large porion of he coss associaed wih ha equipmen

More information

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models.

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models. Technical Repor Doc ID: TR--203 06-March-203 (Las revision: 23-Februar-206) On formulaing quadraic funcions in opimizaion models. Auhor: Erling D. Andersen Convex quadraic consrains quie frequenl appear

More information

On a Discrete-In-Time Order Level Inventory Model for Items with Random Deterioration

On a Discrete-In-Time Order Level Inventory Model for Items with Random Deterioration Journal of Agriculure and Life Sciences Vol., No. ; June 4 On a Discree-In-Time Order Level Invenory Model for Iems wih Random Deerioraion Dr Biswaranjan Mandal Associae Professor of Mahemaics Acharya

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis Speaker Adapaion Techniques For Coninuous Speech Using Medium and Small Adapaion Daa Ses Consaninos Boulis Ouline of he Presenaion Inroducion o he speaker adapaion problem Maximum Likelihood Sochasic Transformaions

More information

Západočeská Univerzita v Plzni, Czech Republic and Groupe ESIEE Paris, France

Západočeská Univerzita v Plzni, Czech Republic and Groupe ESIEE Paris, France ADAPTIVE SIGNAL PROCESSING USING MAXIMUM ENTROPY ON THE MEAN METHOD AND MONTE CARLO ANALYSIS Pavla Holejšovsá, Ing. *), Z. Peroua, Ing. **), J.-F. Bercher, Prof. Assis. ***) Západočesá Univerzia v Plzni,

More information

A Dynamic Model of Economic Fluctuations

A Dynamic Model of Economic Fluctuations CHAPTER 15 A Dynamic Model of Economic Flucuaions Modified for ECON 2204 by Bob Murphy 2016 Worh Publishers, all righs reserved IN THIS CHAPTER, OU WILL LEARN: how o incorporae dynamics ino he AD-AS model

More information

Lecture 1 Overview. course mechanics. outline & topics. what is a linear dynamical system? why study linear systems? some examples

Lecture 1 Overview. course mechanics. outline & topics. what is a linear dynamical system? why study linear systems? some examples EE263 Auumn 27-8 Sephen Boyd Lecure 1 Overview course mechanics ouline & opics wha is a linear dynamical sysem? why sudy linear sysems? some examples 1 1 Course mechanics all class info, lecures, homeworks,

More information

not to be republished NCERT MATHEMATICAL MODELLING Appendix 2 A.2.1 Introduction A.2.2 Why Mathematical Modelling?

not to be republished NCERT MATHEMATICAL MODELLING Appendix 2 A.2.1 Introduction A.2.2 Why Mathematical Modelling? 256 MATHEMATICS A.2.1 Inroducion In class XI, we have learn abou mahemaical modelling as an aemp o sudy some par (or form) of some real-life problems in mahemaical erms, i.e., he conversion of a physical

More information

Economics 8105 Macroeconomic Theory Recitation 6

Economics 8105 Macroeconomic Theory Recitation 6 Economics 8105 Macroeconomic Theory Reciaion 6 Conor Ryan Ocober 11h, 2016 Ouline: Opimal Taxaion wih Governmen Invesmen 1 Governmen Expendiure in Producion In hese noes we will examine a model in which

More information

The equation to any straight line can be expressed in the form:

The equation to any straight line can be expressed in the form: Sring Graphs Par 1 Answers 1 TI-Nspire Invesigaion Suden min Aims Deermine a series of equaions of sraigh lines o form a paern similar o ha formed by he cables on he Jerusalem Chords Bridge. Deermine he

More information

Comparing Means: t-tests for One Sample & Two Related Samples

Comparing Means: t-tests for One Sample & Two Related Samples Comparing Means: -Tess for One Sample & Two Relaed Samples Using he z-tes: Assumpions -Tess for One Sample & Two Relaed Samples The z-es (of a sample mean agains a populaion mean) is based on he assumpion

More information

Right tail. Survival function

Right tail. Survival function Densiy fi (con.) Lecure 4 The aim of his lecure is o improve our abiliy of densiy fi and knowledge of relaed opics. Main issues relaed o his lecure are: logarihmic plos, survival funcion, HS-fi mixures,

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits DOI: 0.545/mjis.07.5009 Exponenial Weighed Moving Average (EWMA) Char Under The Assumpion of Moderaeness And Is 3 Conrol Limis KALPESH S TAILOR Assisan Professor, Deparmen of Saisics, M. K. Bhavnagar Universiy,

More information

CSE 3802 / ECE Numerical Methods in Scientific Computation. Jinbo Bi. Department of Computer Science & Engineering

CSE 3802 / ECE Numerical Methods in Scientific Computation. Jinbo Bi. Department of Computer Science & Engineering CSE 3802 / ECE 3431 Numerical Mehods in Scienific Compuaion Jinbo Bi Deparmen of Compuer Science & Engineering hp://www.engr.uconn.edu/~jinbo 1 Ph.D in Mahemaics The Insrucor Previous professional experience:

More information