DYNAMIC PROGRAMMING

Dynamic Programming

Dynamic programming (DP) is a useful mathematical technique for making a sequence of interrelated decisions.
It is a systematic procedure for determining the optimal combination of decisions.
There is no standard mathematical formulation of "the" dynamic programming problem.
Knowing when to apply dynamic programming depends largely on experience with its general structure.

João Miguel da Costa Sousa / Alexandra Moutinho

Prototype example

Stagecoach problem: a fortune seeker wants to go from Missouri (A) to California (J) in the mid-19th century.
The journey has 4 stages.
The cost is that of the life insurance for a specific route; the lowest cost is equivalent to the safest trip.

Costs

The cost $c_{ij}$ of going from state $i$ to state $j$ is:

         B   C   D
    A    2   4   3

         E   F   G
    B    7   4   6
    C    3   2   4
    D    4   1   5

         H   I
    E    1   4
    F    6   3
    G    3   3

         J
    H    3
    I    4

Problem: which route minimizes the total cost of the policy?

Solving the problem

Note that the greedy approach does not work. Its solution A → B → F → I → J has a total cost of 13. However, e.g., A → D → F is cheaper than A → B → F.
Other possibility: trial and error. This is too much effort even for this simple problem.
Dynamic programming is much more efficient than exhaustive enumeration, especially for large problems.
It starts from the last stage of the problem and enlarges it one stage at a time.

Formulation

The decision variables $x_n$ ($n = 1, 2, 3, 4$) are the immediate destinations of stage $n$.
The route is A → $x_1$ → $x_2$ → $x_3$ → $x_4$, where $x_4$ = J.
$f_n(s, x_n)$ is the total cost of the best overall policy for the remaining stages, given that the current state is $s$, the fortune seeker is ready to start stage $n$, and $x_n$ is selected as the immediate destination.
$x_n^*$ minimizes $f_n(s, x_n)$, and $f_n^*(s)$ is the minimum value of $f_n(s, x_n)$:

$$f_n^*(s) = \min_{x_n} f_n(s, x_n) = f_n(s, x_n^*)$$
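The two claims on the "Solving the problem" slide (the greedy route costs 13, and trial and error means checking every route) can be verified with a short sketch. The dictionary `COST` encodes the cost tables above; the names `route_cost`, `greedy_route` and `all_routes` are illustrative and not part of the slides.

```python
from itertools import product

# Link costs c_ij of the stagecoach network, as in the cost tables.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


def route_cost(route):
    """Total cost of a route given as a sequence of states."""
    return sum(COST[a][b] for a, b in zip(route, route[1:]))


def greedy_route(start="A", goal="J"):
    """Always take the cheapest next link (first one wins on ties)."""
    route = [start]
    while route[-1] != goal:
        links = COST[route[-1]]
        route.append(min(links, key=links.get))
    return route


def all_routes():
    """Enumerate every route A -> x1 -> x2 -> x3 -> J (3 * 3 * 2 = 18 routes)."""
    for x1, x2, x3 in product("BCD", "EFG", "HI"):
        yield ["A", x1, x2, x3, "J"]


greedy = greedy_route()
best = min(all_routes(), key=route_cost)
print(greedy, route_cost(greedy))  # ['A', 'B', 'F', 'I', 'J'] 13
print(best, route_cost(best))      # ['A', 'C', 'E', 'H', 'J'] 11
```

Enumeration is affordable here only because there are 18 routes; the number of routes grows multiplicatively with the number of stages, which is the motivation for the stage-by-stage procedure that follows.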

Formulation (cont.)

Here

$$f_n(s, x_n) = \text{immediate cost (stage } n) + \text{minimum future cost (stages } n+1 \text{ onward)} = c_{s x_n} + f_{n+1}^*(x_n)$$

The value of $c_{s x_n}$ is given by $c_{ij}$ with $i = s$ (current state) and $j = x_n$ (immediate destination).
Objective: find $f_1^*(A)$ and the corresponding route.
Dynamic programming finds successively $f_4^*(s)$, $f_3^*(s)$, $f_2^*(s)$ and finally $f_1^*(A)$.

Solution procedure

When $n = 4$, the route is determined by its current state $s$ (H or I) and its final destination J.
Since $f_4^*(s) = f_4(s, J) = c_{sJ}$, the solution for $n = 4$ is:

    s    f4*(s)   x4*
    H      3      J
    I      4      J

Stage n = 3

This stage needs a few calculations. If the fortune seeker is in state F, he can go to either H or I, with costs $c_{F,H} = 6$ or $c_{F,I} = 3$.
Choosing H, the minimum additional cost is $f_4^*(H) = 3$, so the total cost is 6 + 3 = 9.
Choosing I, the total cost is 3 + 4 = 7. This is smaller, so I is the optimal choice for state F.
Similar calculations can be made for the two other possible states, $s = E$ and $s = G$, resulting in the table for $n = 3$, where $f_3(s, x_3) = c_{s x_3} + f_4^*(x_3)$:

    s    x3 = H   x3 = I   f3*(s)   x3*
    E       4        8        4     H
    F       9        7        7     I
    G       6        7        6     H

Stage n = 2

In this case, $f_2(s, x_2) = c_{s x_2} + f_3^*(x_2)$. Example for node C:
$x_2 = E$: $f_2(C, E) = c_{C,E} + f_3^*(E) = 3 + 4 = 7$ (optimal)
$x_2 = F$: $f_2(C, F) = c_{C,F} + f_3^*(F) = 2 + 7 = 9$
$x_2 = G$: $f_2(C, G) = c_{C,G} + f_3^*(G) = 4 + 6 = 10$
Similar calculations can be made for the two other possible states, $s = B$ and $s = D$, resulting in the table for $n = 2$:

    s    x2 = E   x2 = F   x2 = G   f2*(s)   x2*
    B      11       11       12       11     E or F
    C       7        9       10        7     E
    D       8        8       11        8     E or F
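The stage tables above all apply the same recursion, $f_n^*(s) = \min_{x_n} \{ c_{s x_n} + f_{n+1}^*(x_n) \}$. A minimal sketch of that backward pass is given below; it also carries the recursion one step further to stage 1. The names `COST`, `STAGES` and `backward_recursion` are illustrative, and $f^*(J) = 0$ is the boundary condition assumed for the destination.

```python
# Link costs c_ij of the stagecoach network, as in the cost tables.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}

# Possible states at the beginning of stages 1, 2, 3 and 4.
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]


def backward_recursion():
    """Return (f, policy): minimum cost-to-go and optimal decisions per state."""
    f = {"J": 0}   # nothing remains to be paid at the destination
    policy = {}
    for states in reversed(STAGES):           # stage 4 first, stage 1 last
        for s in states:
            totals = {x: c + f[x] for x, c in COST[s].items()}
            f[s] = min(totals.values())
            # Keep every tied decision, as the tables do ("E or F").
            policy[s] = [x for x, v in totals.items() if v == f[s]]
    return f, policy


f, policy = backward_recursion()
for s in "HIEFGBCDA":
    print(s, f[s], policy[s])
```

Each state is evaluated once and each link is examined once, so the work grows with the number of links rather than with the number of routes.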

Stage n = 1

There is just one possible starting state: A.
$x_1 = B$: $f_1(A, B) = c_{A,B} + f_2^*(B) = 2 + 11 = 13$
$x_1 = C$: $f_1(A, C) = c_{A,C} + f_2^*(C) = 4 + 7 = 11$ (optimal)
$x_1 = D$: $f_1(A, D) = c_{A,D} + f_2^*(D) = 3 + 8 = 11$ (optimal)
This results in the table, where $f_1(s, x_1) = c_{s x_1} + f_2^*(x_1)$:

    s    x1 = B   x1 = C   x1 = D   f1*(s)   x1*
    A      13       11       11       11     C or D

Optimal solution

There are three optimal solutions, all with $f_1^*(A) = 11$. Following the optimal decisions in the stage tables, they are:
A → C → E → H → J
A → D → E → H → J
A → D → F → I → J

Characteristics of DP

1. The problem can be divided into stages, with a policy decision required at each stage.
   Example: 4 stages and a life insurance policy to choose at each.
   Dynamic programming problems require making a sequence of interrelated decisions.

2. Each stage has a number of states associated with the beginning of that stage.
   Example: the states are the possible territories where the fortune seeker could be located.
   States are the possible conditions in which the system might be.

3. The policy decision transforms the current state into a state associated with the beginning of the next stage.
   Example: the fortune seeker's decision led him from his current state to the next state on his journey.
   DP problems can be interpreted in terms of networks: each node corresponds to a state.
   The value assigned to each link is the immediate contribution to the objective function from making that policy decision.
   In most cases, the objective corresponds to finding the shortest or the longest path.

4. The solution procedure finds an optimal policy for the overall problem.
   It finds a prescription of the optimal policy decision at each stage for each of the possible states.
   Example: the solution procedure constructed a table for each stage $n$ that prescribed the optimal decision $x_n^*$ for each possible state $s$.
   In addition to identifying optimal solutions, DP provides a policy prescription of what to do under every possible circumstance (which is why a decision is called a policy decision). This is useful for sensitivity analysis.

5. Given the current state, an optimal policy for the remaining stages is independent of the policy decisions adopted in previous stages.
   The optimal immediate decision depends only on the current state and not on how it was reached: this is the principle of optimality for DP.
   Example: at any state, the best insurance policy is independent of how the fortune seeker got there.
   Knowledge of the current state conveys all the information necessary for determining the optimal policy henceforth (Markovian property).
   Problems lacking this property are not dynamic programming problems.
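The principle of optimality is what lets the optimal routes be read off one decision at a time: at each state, any decision that attains $f^*(s)$ can be followed, regardless of how that state was reached. The sketch below uses a memoized top-down recursion (a swapped-in but equivalent way to compute the same $f^*$ values as the backward tables) and then follows every tied optimal decision. The names `COST`, `f_star` and `optimal_routes` are illustrative.

```python
from functools import lru_cache

# Link costs c_ij of the stagecoach network, as in the cost tables.
COST = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


@lru_cache(maxsize=None)
def f_star(s):
    """Minimum total cost from state s to the destination J."""
    if s == "J":
        return 0
    return min(c + f_star(x) for x, c in COST[s].items())


def optimal_routes(s="A"):
    """All minimum-cost routes from s, following every tied optimal decision."""
    if s == "J":
        return [["J"]]
    routes = []
    for x, c in COST[s].items():
        if c + f_star(x) == f_star(s):        # x is an optimal decision at s
            routes += [[s] + tail for tail in optimal_routes(x)]
    return routes


for route in optimal_routes():
    print(" -> ".join(route), f_star("A"))
```

Note that `optimal_routes` never looks at the path taken so far, only at the current state, which is the Markovian property in item 5.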

Characteristics of DP (cont.)

6. The solution procedure begins by finding the optimal policy for the last stage. This solution is usually trivial.

7. A recursive relationship that identifies the optimal policy for stage $n$, given the optimal policy for stage $n + 1$, is available.
   Example: the recursive relationship was

   $$f_n^*(s) = \min_{x_n} \left\{ c_{s x_n} + f_{n+1}^*(x_n) \right\}$$

   The recursive relationship differs somewhat among dynamic programming problems.

   Notation:
   $N$ = number of stages.
   $n$ = label for the current stage ($n = 1, 2, \ldots, N$).
   $s_n$ = current state for stage $n$.
   $x_n$ = decision variable for stage $n$.
   $x_n^*$ = optimal value of $x_n$ (given $s_n$).
   $f_n(s_n, x_n)$ = contribution of stages $n, n+1, \ldots, N$ to the objective function if the system starts in state $s_n$ at stage $n$, the immediate decision is $x_n$, and optimal decisions are made thereafter.
   $f_n^*(s_n) = f_n(s_n, x_n^*)$.

   Recursive relationship:

   $$f_n^*(s_n) = \max_{x_n} \{ f_n(s_n, x_n) \} \quad \text{or} \quad f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \}$$

   where $f_n(s_n, x_n)$ is written in terms of $s_n$, $x_n$, $f_{n+1}^*(s_{n+1})$, and probably some measure of the immediate contribution of $x_n$ to the objective function.

8. Using the recursive relationship, the solution procedure starts at the end and moves backward stage by stage.
   It stops when the optimal policy starting at the initial stage is found; at that point the optimal policy for the entire problem has been found.
   Example: the tables for the stages show this procedure.
   For DP problems, a table such as the following would be obtained for each stage ($n = N, N-1, \ldots, 1$):

    s_n   |   f_n(s_n, x_n) for each x_n   |   f_n*(s_n)   |   x_n*

Deterministic dynamic programming

Deterministic problems: the state at the next stage is completely determined by the state and the policy decision at the current stage.
Form of the objective function: minimize or maximize the sum, product, etc. of the contributions from the individual stages.
Set of states: may be discrete or continuous, or a state vector.
Decision variables can also be discrete or continuous.
Example: distributing medical teams

The World Health Council has five medical teams to allocate to three underdeveloped countries.
Measure of performance: additional person-years of life, i.e., increased life expectancy (in years) times the country's population.

Thousands of additional person-years of life:

    Medical teams   Country 1   Country 2   Country 3
          1             45          20          50
          2             70          45          70
          3             90          75          80
          4            105         110         100
          5            120         150         130

Formulation of the problem

The problem requires three interrelated decisions: how many teams to allocate to each of the three countries (stages).
$x_n$ is the number of teams to allocate to stage (country) $n$.
What are the states? What changes from one stage to another?

States to be considered

$s_n$ = number of medical teams still available for the remaining countries ($n, \ldots, 3$).
Thus: $s_1 = 5$, $s_2 = 5 - x_1 = s_1 - x_1$, $s_3 = s_2 - x_2$.

Overall problem

$p_n(x_n)$: measure of performance from allocating $x_n$ medical teams to country $n$.

$$\text{Maximize } \sum_{n=1}^{3} p_n(x_n), \quad \text{subject to } \sum_{n=1}^{3} x_n = 5, \text{ where the } x_n \text{ are nonnegative integers.}$$

Recursive relationship relating the functions:

$$f_n^*(s_n) = \max_{x_n = 0, 1, \ldots, s_n} \left\{ p_n(x_n) + f_{n+1}^*(s_n - x_n) \right\}, \quad \text{for } n = 1, 2$$

$$f_3^*(s_3) = \max_{x_3 = 0, 1, \ldots, s_3} p_3(x_3)$$

Solution procedure, stage n = 3

For the last stage, $n = 3$, the values of $p_3(x_3)$ are the last column of the data table. Here, $x_3^* = s_3$ and $f_3^*(s_3) = p_3(s_3)$.

    s3    f3*(s3)   x3*
    0        0       0
    1       50       1
    2       70       2
    3       80       3
    4      100       4
    5      130       5

Stage n = 2

Here, finding $x_2^*$ requires calculating $f_2(s_2, x_2)$ for the values $x_2 = 0, 1, \ldots, s_2$.
Example for $s_2 = 2$ (network figure): from state 2, the decisions $x_2 = 0, 1, 2$ contribute 0, 20 and 45 and lead to the states $s_3 = 2, 1, 0$, whose values are $f_3^*(s_3) = 70, 50, 0$.
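The arithmetic behind the $s_2 = 2$ example can be written out as follows. It uses $p_2$ from the country 2 column and $f_3^*$ from the stage 3 table; $p_2(0) = 0$ is taken as an assumption consistent with the stage tables, since the data table starts at one team.

```latex
\begin{aligned}
f_2(2, 0) &= p_2(0) + f_3^*(2) = 0 + 70 = 70,\\
f_2(2, 1) &= p_2(1) + f_3^*(1) = 20 + 50 = 70,\\
f_2(2, 2) &= p_2(2) + f_3^*(0) = 45 + 0 = 45,\\
f_2^*(2) &= \max\{70,\, 70,\, 45\} = 70, \qquad x_2^* = 0 \text{ or } 1.
\end{aligned}
```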

Stage n = 2 (cont.)

Similar calculations can be made for the other values of $s_2$, with $f_2(s_2, x_2) = p_2(x_2) + f_3^*(s_2 - x_2)$:

    s2   x2=0   x2=1   x2=2   x2=3   x2=4   x2=5   f2*(s2)   x2*
    1     50     20                                   50     0
    2     70     70     45                            70     0 or 1
    3     80     90     95     75                     95     2
    4    100    100    115    125    110             125     3
    5    130    120    125    145    160    150      160     4

Stage n = 1

The only state is the starting state $s_1 = 5$, with $f_1(s_1, x_1) = p_1(x_1) + f_2^*(s_1 - x_1)$:

    s1   x1=0   x1=1   x1=2   x1=3   x1=4   x1=5   f1*(s1)   x1*
    5    160    170    165    160    155    120      170     1

Optimal policy decision

Reading the tables forward: $x_1^* = 1$ gives $s_2 = 4$, so $x_2^* = 3$ and $s_3 = 1$, so $x_3^* = 1$. The total is 45 + 75 + 50 = 170 thousand additional person-years of life.

Distribution of effort problem

One kind of resource is allocated to a number of activities.
Objective: determine how to distribute the effort (resource) among the activities most effectively.
DP involves only one (or a few) resources, while LP can deal with thousands of resources.
The LP assumptions of proportionality, divisibility and certainty can be violated in DP.
Only additivity (or its analog for a product of terms) is necessary, because of the principle of optimality.
The World Health Council problem violates proportionality and divisibility (why?).

Formulation of distribution of effort

Stage $n$ = activity $n$ ($n = 1, 2, \ldots, N$).
$x_n$ = amount of resource allocated to activity $n$.
State $s_n$ = amount of resource still available for allocation to the remaining activities ($n, \ldots, N$).
When the system starts at stage $n$ in state $s_n$, the choice $x_n$ results in the next state at stage $n + 1$ being $s_{n+1} = s_n - x_n$:

    Stage:    n      →    n + 1
    State:   s_n   (x_n)   s_n - x_n

Example: distributing scientists to research teams

Three teams are working on an engineering problem: how to safely fly people to Mars.
Two extra scientists are available to reduce the probability of failure.

Probability of failure:

    New scientists   Team 1   Team 2   Team 3
          0           0.40     0.60     0.80
          1           0.20     0.40     0.50
          2           0.15     0.20     0.30
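Both examples above fit the distribution-of-effort formulation and differ only in how stage contributions are combined (sum versus product) and in the direction of optimization. The sketch below is one way to express that; `distribute` and its parameters are hypothetical names, the zero-team row of the medical data is an assumption taken from the stage tables, and treating the scientists objective as the product of the three failure probabilities is also an assumption (the slides do not state the objective explicitly).

```python
from operator import add, mul


def distribute(returns, total, combine=add, better=max, terminal=0):
    """Backward recursion for a distribution-of-effort problem.

    returns[n][x] is the contribution of activity n when it receives x units.
    Returns the optimal objective value and one optimal allocation.
    """
    n_act = len(returns)
    # f[n][s]: best value from activity n onward with s units still available.
    f = [[terminal] * (total + 1) for _ in range(n_act + 1)]
    choice = [[0] * (total + 1) for _ in range(n_act)]
    for n in reversed(range(n_act)):
        for s in range(total + 1):
            options = {x: combine(returns[n][x], f[n + 1][s - x])
                       for x in range(s + 1)}
            best_x = better(options, key=options.get)   # first optimum on ties
            f[n][s], choice[n][s] = options[best_x], best_x
    # Forward pass: read the allocation off the stored decisions.
    plan, s = [], total
    for n in range(n_act):
        plan.append(choice[n][s])
        s -= plan[-1]
    return f[0][total], plan


# Medical teams: maximize the sum; index 0 (no team) is assumed to yield 0.
teams = [[0, 45, 70, 90, 105, 120],
         [0, 20, 45, 75, 110, 150],
         [0, 50, 70, 80, 100, 130]]
print(distribute(teams, 5))   # (170, [1, 3, 1])

# Scientists: minimize the product of the failure probabilities.
failure = [[0.40, 0.20, 0.15],
           [0.60, 0.40, 0.20],
           [0.80, 0.50, 0.30]]
value, plan = distribute(failure, 2, combine=mul, better=min, terminal=1.0)
print(round(value, 3), plan)
```

The `terminal` value is the neutral element of `combine` (0 for a sum, 1 for a product), so the last activity is handled by the same loop as the others.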

Continuous dynamic programming

The previous examples had a discrete state variable $s_n$ at each stage.
They have all been reversible: the solution procedure could have moved backward or forward stage by stage.
The next example is continuous. As $s_n$ can take any value in certain intervals, the solutions $f_n^*(s_n)$ and $x_n^*$ must be expressed as functions of $s_n$.
The stages in the next example correspond to time periods, so the solution must proceed backwards.

Example: scheduling jobs

The company Local Job Shop needs to schedule its employment levels because of seasonal fluctuations.
Machine operators are difficult to hire and costly to train.
The peak-season payroll should not be maintained afterwards.
Overtime work on a regular basis should be avoided.
Minimum requirements in the near future:

    Season         Spring   Summer   Autumn   Winter   Spring
    Requirements     255      220      240      200      255

Employment above the level in the table costs $2,000 per person per season.
The total cost of changing the level of employment from one season to the next is $200 times the square of the difference in employment levels.
Fractional levels are possible due to part-time employees.

Formulation

From the data, the maximum employment should be 255 (spring).
It is necessary to find the level of employment for the other seasons.
Seasons are stages: one cycle of four seasons, where stage 1 is summer and stage 4 is spring (known employment).
$x_n$ = employment level for stage $n$ ($n = 1, 2, 3, 4$); $x_4 = 255$.
$r_n$ = minimum employment requirement for stage $n$: $r_1 = 220$, $r_2 = 240$, $r_3 = 200$, $r_4 = 255$.
Thus: $r_n \le x_n \le 255$.

Cost for stage $n$ = $200(x_n - x_{n-1})^2 + 2000(x_n - r_n)$.
State $s_n$: employment in the preceding season, $s_n = x_{n-1}$ (for $n = 1$: $s_1 = x_0 = x_4 = 255$).
Problem: choose $x_1$, $x_2$ and $x_3$ so as to

$$\text{minimize } \sum_{n=1}^{4} \left[ 200(x_n - x_{n-1})^2 + 2000(x_n - r_n) \right], \quad \text{subject to } r_n \le x_n \le 255, \text{ for } n = 1, 2, 3, 4.$$

Data

    n   r_n   Feasible x_n       Possible s_n = x_{n-1}   Cost
    1   220   220 ≤ x1 ≤ 255     s1 = 255                 200(x1 - 255)^2 + 2000(x1 - 220)
    2   240   240 ≤ x2 ≤ 255     220 ≤ s2 ≤ 255           200(x2 - x1)^2 + 2000(x2 - 240)
    3   200   200 ≤ x3 ≤ 255     240 ≤ s3 ≤ 255           200(x3 - x2)^2 + 2000(x3 - 200)
    4   255   x4 = 255           200 ≤ s4 ≤ 255           200(255 - x3)^2

Formulation (cont.)

Recursive relationship:

$$f_n^*(s_n) = \min_{r_n \le x_n \le 255} \left\{ 200(x_n - s_n)^2 + 2000(x_n - r_n) + f_{n+1}^*(x_n) \right\}$$

Basic structure of the problem: (figure not reproduced).

Solution procedure

Stage 4: the solution is known to be $x_4^* = 255$.

    s4                f4*(s4)            x4*
    200 ≤ s4 ≤ 255    200(255 - s4)^2    255

Stage 3, for $240 \le s_3 \le 255$:

$$f_3^*(s_3) = \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2000(x_3 - 200) + f_4^*(x_3) \right\} = \min_{200 \le x_3 \le 255} \left\{ 200(x_3 - s_3)^2 + 2000(x_3 - 200) + 200(255 - x_3)^2 \right\}$$

Graphical solution for $f_3^*(s_3)$: (figure not reproduced).

Calculus solution for $f_3^*(s_3)$

Using calculus:

$$\frac{\partial}{\partial x_3} f_3(s_3, x_3) = 400(x_3 - s_3) + 2000 - 400(255 - x_3) = 400(2x_3 - s_3 - 250) = 0 \quad \Rightarrow \quad x_3^* = \frac{s_3 + 250}{2}$$

Does this guarantee a minimum? Yes: the second derivative is 800 > 0, and for $240 \le s_3 \le 255$ the value $x_3^*$ lies between 245 and 252.5, inside the feasible range.

    s3                f3*(s3)                                              x3*
    240 ≤ s3 ≤ 255    50(250 - s3)^2 + 50(260 - s3)^2 + 1000(s3 - 150)     (s3 + 250)/2

Stage 2

Stage 2 is solved in a similar fashion, with

$$f_2(s_2, x_2) = 200(x_2 - s_2)^2 + 2000(x_2 - r_2) + f_3^*(x_2) = 200(x_2 - s_2)^2 + 2000(x_2 - 240) + 50(250 - x_2)^2 + 50(260 - x_2)^2 + 1000(x_2 - 150)$$

for $220 \le s_2 \le 255$ (possible values) and $240 \le x_2 \le 255$ (feasible values).
Solving $\frac{\partial}{\partial x_2} f_2(s_2, x_2) = 0$ yields

$$x_2^* = \frac{2 s_2 + 240}{3}$$
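The recursion above can also be checked numerically. The sketch below replaces the calculus with a brute-force minimization over a grid of employment levels and carries the backward pass through stages 3, 2 and 1. The 0.5-person grid spacing is an assumption of this sketch (a coarser or finer grid would only approximate the continuous optimum), and the names `grid`, `stage_cost` and `solve` are illustrative.

```python
R = [220, 240, 200, 255]   # minimum requirements r_1..r_4 (summer .. spring)
X_MAX = 255.0              # peak employment level
STEP = 0.5                 # grid spacing: an assumption of this sketch


def grid(lo, hi, step=STEP):
    """Evenly spaced employment levels from lo to hi inclusive."""
    n = round((hi - lo) / step)
    return [lo + k * step for k in range(n + 1)]


def stage_cost(n, s, x):
    """Cost of stage n (0-based) when employment changes from s to x."""
    return 200 * (x - s) ** 2 + 2000 * (x - R[n])


def solve(s1=255.0):
    # Stage 4 is forced (x4 = 255), so f4*(s4) = 200 (255 - s4)^2.
    f_next = {s: stage_cost(3, s, X_MAX) for s in grid(R[2], X_MAX)}
    policies = []
    for n in (2, 1, 0):                                  # stages 3, 2, 1
        states = [s1] if n == 0 else grid(R[n - 1], X_MAX)
        f_cur, pol = {}, {}
        for s in states:
            x_best = min(grid(R[n], X_MAX),
                         key=lambda x: stage_cost(n, s, x) + f_next[x])
            f_cur[s] = stage_cost(n, s, x_best) + f_next[x_best]
            pol[s] = x_best
        policies.insert(0, pol)
        f_next = f_cur
    # Forward pass: follow the stored decisions from the initial state.
    plan, s = [], s1
    for pol in policies:
        s = pol[s]
        plan.append(s)
    return f_next[s1], plan + [X_MAX]


cost, plan = solve()
print(cost, plan)
```

A quick spot check of the closed form for stage 3: with $s_3 = 245$, the formula gives $x_3^* = (245 + 250)/2 = 247.5$, which is the third entry of the plan the sketch prints.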

Stage 2 (cont.)

The solution has to be feasible for $220 \le s_2 \le 255$ (i.e., $240 \le x_2 \le 255$ for $220 \le s_2 \le 255$).
$x_2^* = \frac{2 s_2 + 240}{3}$ is only feasible for $240 \le s_2 \le 255$.
We therefore need to find the feasible value of $x_2$ that minimizes $f_2(s_2, x_2)$ when $220 \le s_2 < 240$.
For $s_2 < 240$, $\frac{\partial}{\partial x_2} f_2(s_2, x_2) > 0$ for $240 \le x_2 \le 255$, so $x_2^* = 240$.
Why? The derivative is $200(3x_2 - 2s_2 - 240)$, and with $x_2 \ge 240$ and $s_2 < 240$ we have $3x_2 - 2s_2 - 240 > 720 - 480 - 240 = 0$, so the cost increases with $x_2$ over the whole feasible range.

Stage 2 and Stage 1

    s2                f2*(s2)                                                                   x2*
    220 ≤ s2 ≤ 240    200(240 - s2)^2 + 115,000                                                 240
    240 ≤ s2 ≤ 255    (200/9)[(240 - s2)^2 + (255 - s2)^2 + (270 - s2)^2] + 2000(s2 - 195)      (2 s2 + 240)/3

Stage 1: the procedure is similar.

    s1     f1*(s1)    x1*
    255    185,000    247.5

How?

Solution: $x_1^* = 247.5$, $x_2^* = 245$, $x_3^* = 247.5$, $x_4^* = 255$, with a total cost of $185,000.

Deterministic continuous problem

Consider the following nonlinear programming problem:
Maximize Z = […], subject to […].
(There are no nonnegativity constraints.)
Use dynamic programming to solve this problem.

Probabilistic dynamic programming

The state at the next stage is not completely determined by the state and policy decision at the current stage.
There is a probability distribution for determining the next state (see figure).
$S$ = number of possible states at stage $n + 1$.
The system goes to state $i$ ($i = 1, 2, \ldots, S$) with probability $p_i$, given state $s_n$ and decision $x_n$ at stage $n$.
$C_i$ = contribution of stage $n$ to the objective function.
If the figure is expanded to all possible states and decisions at all stages, it is a decision tree.

Basic structure: (figure not reproduced).

Probabilistic dynamic programming (cont.)

The relation between $f_n(s_n, x_n)$ and $f_{n+1}^*(s_{n+1})$ depends upon the form of the overall objective function.
Example: minimize the expected sum of the contributions from the individual stages.
$f_n(s_n, x_n)$ is the minimum expected sum from stage $n$ onward, given state $s_n$ and policy decision $x_n$ at stage $n$:

$$f_n(s_n, x_n) = \sum_{i=1}^{S} p_i \left[ C_i + f_{n+1}^*(i) \right], \quad \text{with} \quad f_{n+1}^*(i) = \min_{x_{n+1}} f_{n+1}(i, x_{n+1})$$

Example: determining reject allowances

The Hit-and-Miss Manufacturing Company received an order to supply one item of a particular type.
The customer has specified stringent quality requirements.
The manufacturer may have to produce more than one item to achieve one acceptable item.
The number of extra items is the reject allowance.
The probability that an item is acceptable (or defective) is 1/2.
The number of acceptable items in a lot of size $L$ has a binomial distribution: the probability of no acceptable item is $(1/2)^L$.
Setup cost = $300, cost per item = $100.
Maximum number of production runs = 3.
Cost of having no acceptable item after 3 runs = $1,600.

Formulation

Objective: determine the policy regarding the lot size (1 + reject allowance) for the required production run(s) that minimizes the total expected cost.
Stage $n$ = production run $n$ ($n = 1, 2, 3$).
$x_n$ = lot size for stage $n$.
State $s_n$ = number of acceptable items still needed (1 or 0) at the beginning of stage $n$.
At stage 1, the state is $s_1 = 1$.

$f_n(s_n, x_n)$ = total expected cost for stages $n, \ldots, 3$ if optimal decisions are made thereafter:

$$f_n^*(s_n) = \min_{x_n = 0, 1, \ldots} f_n(s_n, x_n), \qquad f_n^*(0) = 0$$

The monetary unit is $100.
The contribution to the cost from stage $n$ is $K(x_n) + x_n$, with

$$K(x_n) = \begin{cases} 0, & \text{if } x_n = 0 \\ 3, & \text{if } x_n > 0 \end{cases}$$

Note that $f_4^*(1) = 16$.

Basic structure of the problem: (figure not reproduced).

Recursive relationship:

$$f_n^*(1) = \min_{x_n = 0, 1, \ldots} \left\{ K(x_n) + x_n + \left(\tfrac{1}{2}\right)^{x_n} f_{n+1}^*(1) \right\}, \quad \text{for } n = 1, 2, 3$$

Solution procedure

$n = 3$: $f_3(1, x_3) = K(x_3) + x_3 + 16 \left(\tfrac{1}{2}\right)^{x_3}$

    s3   x3=0   x3=1   x3=2   x3=3   x3=4   x3=5   f3*(s3)   x3*
    1     16     12      9      8      8     8.5      8      3 or 4

$n = 2$: $f_2(1, x_2) = K(x_2) + x_2 + \left(\tfrac{1}{2}\right)^{x_2} f_3^*(1)$

    s2   x2=0   x2=1   x2=2   x2=3   x2=4   f2*(s2)   x2*
    1      8      8      7      7     7.5      7      2 or 3

$n = 1$: $f_1(1, x_1) = K(x_1) + x_1 + \left(\tfrac{1}{2}\right)^{x_1} f_2^*(1)$

    s1   x1=0   x1=1    x1=2    x1=3    x1=4   f1*(s1)   x1*
    1      7     7.5    6.75    6.875   7.44    6.75     2

Optimal solution? Reading the tables: produce 2 items in the first run; if none is acceptable, produce 2 or 3 in the second run; if none is acceptable again, produce 3 or 4 in the third run. The total expected cost is 6.75 monetary units, i.e., $675.

Probabilistic problem

An enterprising young statistician believes that she has developed a system for winning a popular Las Vegas game. Her colleagues do not believe that her system works, so they have made a large bet with her that if she starts with three chips, she will not have at least five chips after three plays of the game.
Each play of the game involves betting any desired number of available chips and then either winning or losing this number of chips. The statistician believes that her system will give her a probability of 2/3 of winning a given play of the game.
Assuming the statistician is correct, use dynamic programming to determine her optimal policy regarding how many chips to bet (if any) at each of the three plays of the game. The decision at each play should take into account the results of earlier plays. The objective is to maximize the probability of winning her bet with her colleagues.
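One possible way to set up the recursion for this problem is sketched below; it is not taken from the slides. It assumes stage $n$ = play $n$, state $s_n$ = chips in hand, decision $x_n$ = chips bet, and $f_n^*(s_n)$ = best probability of finishing with at least five chips, so that $f_n^*(s) = \max_x \{ p\, f_{n+1}^*(s + x) + (1 - p)\, f_{n+1}^*(s - x) \}$. The win probability is a parameter with an assumed default of 2/3, and the name `best_bets` is illustrative. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def best_bets(start=3, target=5, plays=3, p_win=Fraction(2, 3)):
    """Return (probability of reaching the target, bet tables per play).

    tables[n][s] lists every optimal bet at play n + 1 when holding s chips.
    """
    top = start * 2 ** plays   # chips can at most double on each play
    # After the last play: success (1) if at least `target` chips, else 0.
    f_next = [Fraction(int(s >= target)) for s in range(top + 1)]
    tables = []
    for _ in range(plays):                      # plays 3, 2, 1 (backward)
        f_cur, bets = [], []
        for s in range(top + 1):
            # Bets are capped so that s + x stays inside the table; this never
            # binds for chip counts that are reachable from `start`.
            values = {x: p_win * f_next[s + x] + (1 - p_win) * f_next[s - x]
                      for x in range(min(s, top - s) + 1)}
            best = max(values.values())
            f_cur.append(best)
            bets.append([x for x, v in values.items() if v == best])
        tables.insert(0, bets)
        f_next = f_cur
    return f_next[start], tables


prob, tables = best_bets()
print(prob)            # 20/27
print(tables[0][3])    # optimal first bet with 3 chips: [1]
```

Under these assumptions the first bet is one chip; the later bets depend on the outcome, for example `tables[1][4]` (second play after a win) and `tables[1][2]` (second play after a loss).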