Inventory Control with Convex Costs

Similar documents
ON THE STRUCTURE OF OPTIMAL ORDERING POLICIES FOR STOCHASTIC INVENTORY SYSTEMS WITH MINIMUM ORDER QUANTITY

Keywords: inventory control; finite horizon, infinite horizon; optimal policy, (s, S) policy.

Stochastic Models. Edited by D.P. Heyman Bellcore. MJ. Sobel State University of New York at Stony Brook

Structure of optimal policies to periodic-review inventory models with convex costs and backorders for all values of discount factors

Single-stage Approximations for Optimal Policies in Serial Inventory Systems with Non-stationary Demand

Optimal ordering policies for periodic-review systems with replenishment cycles

Coordinating Inventory Control and Pricing Strategies with Random Demand and Fixed Ordering Cost: The Finite Horizon Case

Optimality Results in Inventory-Pricing Control: An Alternate Approach

A Proof of the EOQ Formula Using Quasi-Variational. Inequalities. March 19, Abstract

A Single-Unit Decomposition Approach to Multiechelon Inventory Systems

Stochastic Shortest Path Problems

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS

Optimal Control of Stochastic Inventory System with Multiple Types of Reverse Flows. Xiuli Chao University of Michigan Ann Arbor, MI 48109

1 Markov decision processes

(s, S) Optimality in Joint Inventory-Pricing Control: An Alternate Approach

(s, S) Optimality in Joint Inventory-Pricing Control: An Alternate Approach*

Single-stage Approximations for Optimal Policies in Serial Inventory Systems with Non-stationary Demand

Chapter 16 focused on decision making in the face of uncertainty about one future

Total Expected Discounted Reward MDPs: Existence of Optimal Policies

On the Convergence of Optimal Actions for Markov Decision Processes and the Optimality of (s, S) Inventory Policies

Ordering Policies for Periodic-Review Inventory Systems with Quantity-Dependent Fixed Costs

Optimal Control of an Inventory System with Joint Production and Pricing Decisions

A Hierarchy of Suboptimal Policies for the Multi-period, Multi-echelon, Robust Inventory Problem

Dynamic Inventory Models and Stochastic Programming*

Are Base-stock Policies Optimal in Inventory Problems with Multiple Delivery Modes?

2001, Dennis Bricker Dept of Industrial Engineering The University of Iowa. DP: Producing 2 items page 1

Markov decision processes and interval Markov chains: exploiting the connection

Markov decision processes with threshold-based piecewise-linear optimal policies

arxiv: v2 [math.oc] 25 Mar 2016

MDP Preliminaries. Nan Jiang. February 10, 2019

Discrete-Time Markov Decision Processes

Time is discrete and indexed by t =0; 1;:::;T,whereT<1. An individual is interested in maximizing an objective function given by. tu(x t ;a t ); (0.

A Customer-Item Decomposition Approach to Stochastic Inventory Systems with Correlation

Full terms and conditions of use:

Abstract Dynamic Programming

Serial Supply Chains with Economies of Scale: Bounds and Approximations

A New Algorithm and a New Heuristic for Serial Supply Systems.

Inventory control with partial batch ordering

A monotonic property of the optimal admission control to an M/M/1 queue under periodic observations with average cost criterion

Economics 2010c: Lectures 9-10 Bellman Equation in Continuous Time

Economic lot-sizing games

Lecture notes for Analysis of Algorithms : Markov decision processes

21 Markov Decision Processes

Purchase Contract Management with Demand Forecast Updates

New Markov Decision Process Formulations and Optimal Policy Structure for Assemble-to-Order and New Product Development Problems

Serial Inventory Systems with Markov-Modulated Demand: Solution Bounds, Asymptotic Analysis, and Insights

A Dynamic model for requirements planning with application to supply chain optimization

Computational complexity estimates for value and policy iteration algorithms for total-cost and average-cost Markov decision processes

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

Stochastic Analysis of Bidding in Sequential Auctions and Related Problems.

Optimal Control of Parallel Make-To-Stock Queues in an Assembly System

Percentile Threshold Policies for Inventory Problems with Partially Observed Markovian Demands

A Duality-Based Relaxation and Decomposition Approach for Inventory Distribution Systems

We consider the classic N -stage serial supply systems with linear costs and stationary

Make-to-Stock Production-Inventory Systems. and Lost Sales

Markov Decision Processes and Dynamic Programming

Information Relaxation Bounds for Infinite Horizon Markov Decision Processes

Time-dependent order and distribution policies in supply networks

Chapter 3. Dynamic Programming

UNCORRECTED PROOFS. P{X(t + s) = j X(t) = i, X(u) = x(u), 0 u < t} = P{X(t + s) = j X(t) = i}.

Value Function Iteration

Zero-Inventory Conditions For a Two-Part-Type Make-to-Stock Production System

arxiv: v1 [math.oc] 5 Jan 2015

Reflected Brownian Motion

1 Positive Ordering Costs

Stochastic (Random) Demand Inventory Models

Central-limit approach to risk-aware Markov decision processes

On the Convergence of Optimal Actions for Markov Decision Processes and the Optimality of (s, S) Inventory Policies

Stochastic Differential Equations.

Introduction to Continuous-Time Dynamic Optimization: Optimal Control Theory

7.1 INTRODUCTION. In this era of extreme competition, each subsystem in different

Markov Chains (Part 4)

Dynamic stochastic game and macroeconomic equilibrium

Bayesian Inference and the Symbolic Dynamics of Deterministic Chaos. Christopher C. Strelioff 1,2 Dr. James P. Crutchfield 2

Markov decision processes

Serial Inventory Systems with Markov-Modulated Demand: Derivative Bounds, Asymptotic Analysis, and Insights

Markov Decision Processes and Dynamic Programming

Optimal Backlogging Over an Infinite Horizon Under Time Varying Convex Production and Inventory Costs

Approximate Dynamic Programming for High Dimensional Resource Allocation Problems

Value and Policy Iteration

Asymptotically Optimal Inventory Control For Assemble-to-Order Systems

Lecture 5 Linear Quadratic Stochastic Control

UNIVERSITY OF MANITOBA

DYNAMIC LECTURE 5: DISCRETE TIME INTERTEMPORAL OPTIMIZATION

Influence of product return lead-time on inventory control

Optimality Inequalities for Average Cost MDPs and their Inventory Control Applications

7. Introduction to Numerical Dynamic Programming AGEC Summer 2012

Topic 6: Projected Dynamical Systems

Financial Optimization ISE 347/447. Lecture 21. Dr. Ted Ralphs

This paper studies the optimization of the S T inventory policy, where T is the replenishment interval and S

On Parallel Machine Replacement Problems with General Replacement Cost Functions and Stochastic Deterioration

Time-Varying Parameters

Planning and Acting in Partially Observable Stochastic Domains

WIRELESS systems often operate in dynamic environments

The Impact of Customer Impatience on Production Control

Piecewise linear approximations of the standard normal first order loss function and an application to stochastic inventory control

Decision Sciences, Vol. 5, No. 1 (January 1974)

Abstrct. In this paper, we consider the problem of optimal flow control for a production system with one machine which is subject to failures and prod

Dynamic Programming with Hermite Interpolation

ISM206 Lecture, May 12, 2005 Markov Chain

Transcription:

Inventory Control with Convex Costs

Jian Yang
Department of Industrial and Manufacturing Engineering
New Jersey Institute of Technology, Newark, NJ 07102
yang@adm.njit.edu

Gang Yu
Department of Management Science and Information Systems
The University of Texas at Austin, Austin, TX 78712
yu@uts.cc.utexas.edu

December 2000

Abstract

For an infinite-horizon stochastic inventory control problem, we find that an optimal generalized base stock policy exists when the ordering and holding-backlogging costs are both convex, and that an optimal finite generalized base stock policy exists when the ordering cost is further piecewise-linear.

Keywords: Inventory Control, Markov Decision Process

The first author is supported, in part, by New Jersey Institute of Technology under Grant No. 4-21830.

1 Introduction

The inventory control problem is concerned with controlling order sizes so as to meet demands at minimum total cost over a given time frame. In most cases, the costs in a period that affect the decision include: the ordering cost, a function of the order size in that period; the inventory holding cost, a function of the positive inventory level carried through that period; the backlogging cost, a function of the negative inventory level carried through that period; and a penalty incurred per unit of unsatisfied demand in that period. In this paper, we study a stationary infinite-horizon stochastic inventory control problem with full backlogging under convex ordering and convex holding-backlogging costs.

There is a tremendous amount of literature on stochastic and dynamic inventory control [1], [6]. When the ordering cost is convex and all other costs are linear, Karlin [5] showed the optimal policy to be a generalized base stock policy. That is, there exists a nonnegative function $y(x)$ with $0 \le dy/dx \le 1$ such that, in any period, if the starting inventory level is $x$, one orders so as to bring the inventory level up to $\max\{x, y(x)\}$. The policy is perhaps better illustrated in terms of order size versus starting inventory level. In those terms, the policy states: there exists a nonnegative function $y(x)$ with $-1 \le dy/dx \le 0$ and $\lim_{x \to \infty} y(x) = 0$ such that, when the starting inventory level in a period is $x$, the order size is $y(x)$. Figures 1 and 2 depict the policy in terms of order up-to level versus starting inventory level and in terms of order size versus starting inventory level, respectively.

*** Figure 1 is about here. ***

*** Figure 2 is about here. ***

For the single-period problem, when the ordering cost is further piecewise-linear, the optimal policy was shown to be a finite generalized base stock policy [4], [6]. In terms of order size versus starting inventory level, the policy is as follows: if the ordering cost consists of $L$ linear pieces, then $y(x)$ consists of $2L$ pieces in which pieces with slopes 0 and $-1$ alternate. In addition, the ranges of the pieces with slope $-1$ are exactly the intervals between two consecutive kinks of the ordering cost function.
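To make the two representations above concrete, here is a minimal sketch (not from the paper) of how a generalized base stock policy is evaluated. The curve y_example is a made-up order-up-to function, chosen only so that its slope lies between 0 and 1 as the policy requires.

```python
# A minimal sketch of a generalized base stock policy (order-up-to representation
# and order-size representation). The curve y_example below is hypothetical; any
# nondecreasing curve with slope between 0 and 1 would do.

def order_up_to(x, y):
    """Inventory level right after ordering: raise x to y(x) whenever x is below it."""
    return max(x, y(x))

def order_size(x, y):
    """Order quantity placed when the period starts at inventory level x."""
    return max(0, y(x) - x)

def y_example(x):
    """Hypothetical order-up-to curve with 0 <= dy/dx <= 1."""
    return 20 + max(0, x) // 2

if __name__ == "__main__":
    for x in (-10, 0, 10, 30, 50):
        print(x, order_up_to(x, y_example), order_size(x, y_example))
```

In the second representation, the quantity order_size(x, y_example) plays the role of the paper's $y(x)$: it is nonincreasing in $x$, its slope lies between $-1$ and 0, and it vanishes for large $x$.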

Besides Karlin's work [4], Sobel [8] used a two-piece linear convex ordering cost function to model the proportional changes in the ordering cost brought about by the deviation of the order level from a prescribed constant. With the assumption of full backlogging and convex holding-backlogging cost, Sobel concluded that the optimal policy is a two-base-stock generalized base stock policy. Figures 3 and 4 depict the finite generalized base stock policy in terms of order up-to level versus starting inventory level and in terms of order size versus starting inventory level, respectively.

*** Figure 3 is about here. ***

*** Figure 4 is about here. ***

Although it may be folklore that the above results extend to the infinite-horizon case under general convex and general piecewise-linear convex ordering costs, we have not seen a rigorous proof of this. In this paper, we attempt to fill this gap. Our approach differs from most work in stochastic inventory control in that we treat all quantities as discrete. By taking limits and making ordinary smoothness assumptions, the results can be translated back into the continuous setting straightforwardly. Our major findings are as follows: when the stationary ordering cost is convex and the stationary holding-backlogging cost is convex, the optimal policy is a generalized base stock policy; when the stationary ordering cost is piecewise-linear convex and the stationary holding-backlogging cost is convex, the optimal policy is a finite generalized base stock policy.

The rest of the paper is organized as follows. In Section 2, we introduce the problem formulation and the value iteration solution approach. In Section 3, we study the problem under general stationary convex ordering and convex holding-backlogging costs. In Section 4, we derive the form of an optimal policy under the further assumption that the stationary ordering cost is piecewise-linear. We conclude the paper in Section 5.

2 Formulation and Convergence of Value Iteration

Let us assume that the numbers of demands arriving in the various periods are i.i.d. nonnegative random variables with common distribution $\{P_d, d \in \mathbb{Z}_+\}$, where $P_d$ is the probability that $d$ new orders arrive in a period. Let $\alpha \in (0, 1)$ be the constant discount factor per period. Assume also that the stationary ordering cost function is a nonnegative convex function $V(\cdot)$ with $V(0) = 0$. Finally, let the nonnegative convex function $C(\cdot)$ with $C(0) = 0$ be the stationary holding-backlogging cost. That is, $C(i)$ is the holding cost for carrying $i$ items over one period when $i > 0$, and $C(i)$ is the backlogging cost for carrying $-i$ unsatisfied items over one period when $i < 0$.

The state in each period is the starting inventory level $j$. An ordering policy $X \equiv \{x(t), t \in \mathbb{Z}_+\}$, with $x(t) \equiv \{x_j(t), j \in \mathbb{Z}\}$ for every $t \in \mathbb{Z}_+$ and $x_j(t) \in \mathbb{Z}_+$ for every $j$ and $t$, determines the size $x_j(t)$ of the order to be placed in period $t$ when the firm is in state $j$. Under a given policy $X$, the system evolves as a Markov chain. The transition matrix at $t$, $P^t(x(t))$, satisfies

$$P^t_{jj'}(x_j(t)) = \begin{cases} P_{j + x_j(t) - j'} & \text{if } j + x_j(t) - j' \ge 0, \\ 0 & \text{otherwise}, \end{cases}$$

since it takes a demand of size $j + x_j(t) - j'$ in period $t$ to change the inventory level from $j$ in period $t$ to $j'$ in period $t + 1$.

Let $z^*_j$ be the minimum expected discounted total cost incurred from period 0 to the infinite future when the state in period 0 is $j$. From Blackwell [2], we know that $z^* \equiv \{z^*_j, j \in \mathbb{Z}\}$ is a solution to the equations

$$z^*_j = \min_{x_j \in \mathbb{Z}_+} \Big\{ V(x_j) + \sum_{d=0}^{\infty} P_d \big[ C(j + x_j - d) + \alpha z^*_{j + x_j - d} \big] \Big\}$$

for all $j$, and that there exists an optimal stationary policy $x^*(t) = x^* \equiv \{x^*_j, j \in \mathbb{Z}\}$.

Define $f(x, z) \equiv \{f_j(x_j, z), j \in \mathbb{Z}\}$ to be the functional which satisfies, for any $j$,

$$f_j(x_j, z) = V(x_j) + \sum_{d=0}^{\infty} P_d \big[ C(j + x_j - d) + \alpha z_{j + x_j - d} \big].$$

Also, let $f^*(z) = \min_x f(x, z)$, where the minimization is carried out componentwise with each $x_j \in \mathbb{Z}_+$. Then $z^*$ satisfies

$$z^* = f^*(z^*) = \min_x f(x, z^*), \qquad (1)$$

and $x^*$ satisfies

$$x^* = \mathop{\mathrm{argmin}}_x f(x, z^*).$$
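As an illustration only, the one-stage functional $f_j(x_j, z)$ and the componentwise minimization can be sketched as follows. The discount factor, cost functions, and truncated demand distribution below are made-up stand-ins, and the infinite sums and state space of the model are replaced by finite ones so the code runs.

```python
# A minimal sketch of f_j(x_j, z) = V(x_j) + sum_d P_d [ C(j + x_j - d) + alpha * z_{j+x_j-d} ]
# with illustrative, made-up data. The demand distribution is truncated and the value
# function z is a dictionary defaulting to 0 outside its window; both are assumptions
# of this sketch rather than features of the model.

alpha = 0.9                                # discount factor per period
P = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}       # P_d: probability of d demands in a period

def V(x):
    """Convex ordering cost with V(0) = 0 (illustrative)."""
    return 2.0 * x + 0.1 * x * x

def C(i):
    """Convex holding (i > 0) / backlogging (i < 0) cost with C(0) = 0 (illustrative)."""
    return 1.0 * i if i >= 0 else -3.0 * i

def f(j, x_j, z):
    """One-stage functional f_j(x_j, z) for starting inventory j and order size x_j."""
    return V(x_j) + sum(p * (C(j + x_j - d) + alpha * z.get(j + x_j - d, 0.0))
                        for d, p in P.items())

def best_order(j, z, x_max=50):
    """Componentwise minimization over a bounded set of order sizes, a computational
    stand-in for the minimization over all of Z_+."""
    return min(range(x_max + 1), key=lambda x: f(j, x, z))
```

For example, best_order(j, {j: 0.0 for j in range(-20, 60)}) returns the myopic (single-period) order size at state j under these illustrative costs.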

According to Denardo [3], $f^*(\cdot)$ is a contraction mapping. Thus, the mapping has a unique fixed point, which is our solution value $z^*$, and we can use the value iteration method [7] to compute $z^*$: 1) pick an arbitrary $z^0$; 2) for every $t \ge 0$, let $z^{t+1} = f^*(z^t)$. Then $\lim_{t \to \infty} z^t = z^*$. Our ensuing derivations hinge on the convergence of the value iteration method and on the fact that $x^*$ solves the minimization problem for $f(x, z^*)$.

3 Under General Convex Ordering Costs

The convexity of $V(\cdot)$ and $C(\cdot)$ directly leads to a positive correlation between the starting inventory level and the inventory level right after ordering. That is, given $x^*_j$, there is an $x^*_{j+1}$ such that $j + 1 + x^*_{j+1} \ge j + x^*_j$, as stated in Theorem 1.

Theorem 1  For any $j$, there must exist some $x^*_{j+1}$ which satisfies $x^*_{j+1} \ge x^*_j - 1$.

Proof: We need only prove that, for any $x \le x^*_j$, $f_{j+1}(x - 1, z^*) \ge f_{j+1}(x^*_j - 1, z^*)$ (every nonnegative order size smaller than $x^*_j - 1$ can be written as $x - 1$ for some such $x$). For any such $x$, we have $f_j(x, z^*) \ge f_j(x^*_j, z^*)$ due to the optimality of $x^*_j$ for $f_j(\cdot, z^*)$. Furthermore, we have

$$f_{j+1}(x - 1, z^*) - f_j(x, z^*) = V(x - 1) - V(x)$$

and

$$f_{j+1}(x^*_j - 1, z^*) - f_j(x^*_j, z^*) = V(x^*_j - 1) - V(x^*_j).$$

But since $V(\cdot)$ is convex and $x \le x^*_j$, we get

$$f_{j+1}(x - 1, z^*) - f_{j+1}(x^*_j - 1, z^*) \ge f_j(x, z^*) - f_j(x^*_j, z^*) \ge 0.$$
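A minimal sketch of the value iteration scheme $z^{t+1} = f^*(z^t)$ described above is given next; the finite state window, the bounded order sizes, and the stopping tolerance are assumptions made only so the loop terminates, and f is meant to be a one-stage functional such as the sketch following Section 2's equations. The returned actions can also be used to observe the monotonicity of Theorem 1 numerically, away from the truncated boundary.

```python
# Value iteration on a truncated state window: start from an arbitrary z^0 and iterate
# z^{t+1}_j = min_x f(j, x, z^t) until successive iterates are close. The one-stage
# functional f(j, x, z) is passed in (e.g., the sketch from Section 2); the truncation
# to `states` and `x_max` is an assumption of this illustration, not of the paper.

def value_iteration(f, states, x_max=50, tol=1e-6, max_iter=10_000):
    z = {j: 0.0 for j in states}                       # arbitrary z^0
    for _ in range(max_iter):
        z_new = {j: min(f(j, x, z) for x in range(x_max + 1)) for j in states}
        gap = max(abs(z_new[j] - z[j]) for j in states)
        z = z_new
        if gap < tol:
            break
    x_star = {j: min(range(x_max + 1), key=lambda x: f(j, x, z)) for j in states}
    return z, x_star

# Example usage (with the earlier f): z_star, x_star = value_iteration(f, range(-20, 60))
# Theorem 1 predicts x_star[j + 1] >= x_star[j] - 1 at interior states j of the window.
```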

The convexity of $V(\cdot)$ and $C(\cdot)$ also leads to the convexity of $z^*_j$ as a function of $j$; that is, in our terms, $z^*$ is convex in $j$.

Theorem 2  $z^*$ is convex in $j$.

Proof: We first prove the following property: if some $z$ is convex in $j$, then for any $j$ and nonnegative $x_1$ and $x_2$, there exist nonnegative $x_m$ and $x_M$ which satisfy

$$G_j(x_1, x_2, x_m, x_M, z) \equiv f_{j+1}(x_2, z) + f_{j-1}(x_1, z) - f_j(x_m, z) - f_j(x_M, z) \ge 0.$$

When $x_2 \le x_1 + 1$, we let $x_m$ be $x_1$ and $x_M$ be $x_2$. Otherwise, when $x_2 \ge x_1 + 2$, if $x_1 + x_2$ is odd, we let $x_m$ be $(x_1 + x_2 - 1)/2$ and $x_M$ be $(x_1 + x_2 + 1)/2$; otherwise, we let both $x_m$ and $x_M$ be $(x_1 + x_2)/2$. Note that we always have $x_m + x_M = x_1 + x_2$, and when $x_2 \ge x_1 + 2$, it is true that $x_1 + 1 \le x_m \le x_M \le x_2 - 1$.

Hence, when $x_2 \le x_1 + 1$, we have

$$G_j(x_1, x_2, x_1, x_2, z) = \sum_{d=0}^{\infty} P_d \big[ C(j + x_2 - d + 1) + C(j + x_1 - d - 1) - C(j + x_2 - d) - C(j + x_1 - d) + \alpha ( z_{j + x_2 - d + 1} + z_{j + x_1 - d - 1} - z_{j + x_2 - d} - z_{j + x_1 - d} ) \big].$$

The convexity of $C(\cdot)$ and of $z$ in $j$ leads to the nonnegativity of this term. When $x_2 \ge x_1 + 2$, we have

$$G_j(x_1, x_2, x_m, x_M, z) = V(x_2) + V(x_1) - V(x_M) - V(x_m) + \sum_{d=0}^{\infty} P_d \big[ C(j + x_2 - d + 1) + C(j + x_1 - d - 1) - C(j + x_M - d) - C(j + x_m - d) + \alpha ( z_{j + x_2 - d + 1} + z_{j + x_1 - d - 1} - z_{j + x_M - d} - z_{j + x_m - d} ) \big].$$

The convexity of $V(\cdot)$, $C(\cdot)$, and of $z$ in $j$ leads to the nonnegativity of this term. Thus, the property is proved.

Now, we proceed by induction. Let an arbitrary $z^0$ be convex in $j$, and for every nonnegative $t$, let $z^{t+1} = f^*(z^t)$. Suppose that, for a given nonnegative $t$, $z^t$ is convex in $j$; we show that $z^{t+1}$ is convex in $j$ as well. To this end, let $x^t$ be such that $x^t = \mathop{\mathrm{argmin}}_x f(x, z^t)$. Then, from the convexity of $z^t$ in $j$ and the property just proved, we know there exist nonnegative $x_m$ and $x_M$ such that

$$f_{j+1}(x^t_{j+1}, z^t) + f_{j-1}(x^t_{j-1}, z^t) - f_j(x_m, z^t) - f_j(x_M, z^t) \ge 0.$$

But, by definition, we have $z^{t+1}_j \le f_j(x, z^t)$ for any $j$ and nonnegative $x$. Therefore, for any $j$,

$$z^{t+1}_j \le \frac{f_j(x_m, z^t) + f_j(x_M, z^t)}{2} \le \frac{f_{j-1}(x^t_{j-1}, z^t) + f_{j+1}(x^t_{j+1}, z^t)}{2} = \frac{z^{t+1}_{j-1} + z^{t+1}_{j+1}}{2}.$$

Thus, $z^{t+1}$ is convex in $j$. From the convergence of $z^t$ to $z^*$, we conclude that $z^*$ is convex in $j$.
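Theorem 2 asserts discrete convexity of $z^*$ in $j$. On a truncated numerical approximation, this can be checked with second differences, as in the small helper below; it is purely a sanity check on computed values, not part of the proof, and the restriction to interior states is an assumption that sidesteps truncation effects.

```python
# Check z_{j-1} + z_{j+1} - 2 z_j >= 0 at all interior states of a window, i.e.
# discrete convexity of a value function computed on that window.

def is_discretely_convex(z, states, tol=1e-9):
    states = sorted(states)
    return all(z[j - 1] + z[j + 1] - 2.0 * z[j] >= -tol for j in states[1:-1])
```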

Furthermore, the convexity of $V(\cdot)$ and $C(\cdot)$ and the convexity of $z^*$ in $j$ allow some $x^*_{j+1}$ to be no larger than $x^*_j$.

Theorem 3  For any $j$, given $x^*_j$, we can find some $x^*_{j+1}$ which is no larger than $x^*_j$.

Proof: We first show that, for any $j$, $f_j(x, z^*)$ is a convex function of $x$. Indeed, for any positive $x$, we have

$$f_j(x + 1, z^*) + f_j(x - 1, z^*) - 2 f_j(x, z^*) = V(x + 1) + V(x - 1) - 2 V(x) + \sum_{d=0}^{\infty} P_d \big[ C(j + x - d + 1) + C(j + x - d - 1) - 2 C(j + x - d) + \alpha ( z^*_{j + x - d + 1} + z^*_{j + x - d - 1} - 2 z^*_{j + x - d} ) \big] \ge 0$$

due to the convexity of $V(\cdot)$ and $C(\cdot)$ and the convexity of $z^*$ in $j$.

Next, we prove the property that, for any nonnegative $x$, $f_j(x, z^*) \le f_j(x + 1, z^*)$ implies $f_{j+1}(x, z^*) \le f_{j+1}(x + 1, z^*)$. To this end, we have

$$f_{j+1}(x + 1, z^*) - f_{j+1}(x, z^*) - f_j(x + 1, z^*) + f_j(x, z^*) = \sum_{d=0}^{\infty} P_d \big[ C(j + x - d + 2) + C(j + x - d) - 2 C(j + x - d + 1) + \alpha ( z^*_{j + x - d + 2} + z^*_{j + x - d} - 2 z^*_{j + x - d + 1} ) \big] \ge 0$$

due to the convexity of $C(\cdot)$ and the convexity of $z^*$ in $j$.

Now, due to the optimality of $x^*_j$ for $f_j(\cdot, z^*)$ and the convexity of $f_j(\cdot, z^*)$ in $x$, we have

$$f_j(x^*_j, z^*) \le f_j(x^*_j + 1, z^*) \le f_j(x^*_j + 2, z^*) \le \cdots.$$

By the property just proved, we also have

$$f_{j+1}(x^*_j, z^*) \le f_{j+1}(x^*_j + 1, z^*) \le f_{j+1}(x^*_j + 2, z^*) \le \cdots.$$

Therefore, for every $j$, we may always find some $x^*_{j+1}$ which is no larger than $x^*_j$.
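Theorems 1 and 3 together say that some optimal selection satisfies $x^*_{j+1} \in \{x^*_j - 1, x^*_j\}$ for every $j$. The helper below checks this on a computed action table; it may report False spuriously if the numerical argmin breaks ties differently from the selection whose existence the theorems guarantee, so it is only a rough diagnostic.

```python
# Rough diagnostic for the combined conclusion of Theorems 1 and 3 on a computed
# action table x_star over a window of states: moving from j to j + 1, the order
# size drops by one or stays the same. Ties in the minimization can make a valid
# solution fail this check, so treat a False as a prompt to inspect, not a proof.

def looks_generalized_base_stock(x_star, states):
    states = sorted(states)
    return all(x_star[j + 1] in (x_star[j] - 1, x_star[j]) for j in states[:-1])
```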

So, when the stationary ordering and holding-backlogging costs are convex, $z^*$ is convex in $j$ and there is an $x^*$ satisfying $x^*_{j+1} = x^*_j - 1$ or $x^*_{j+1} = x^*_j$ for every $j$. Taking the limit to the continuous case, the policy is a generalized base stock policy.

4 Under Piecewise-Linear Convex Ordering Costs

Here, we further assume that $V(\cdot)$ is piecewise-linear, with kinks occurring at $X_0, \ldots, X_L$, where $0 = X_0 < X_1 < \cdots < X_{L-1} < X_L$. For $l = 1, \ldots, L$, the marginal ordering cost for order sizes between $X_{l-1}$ and $X_l$ is assumed to be $F_l$, with $0 \le F_1 < \cdots < F_L$. Therefore, for $l = 1, \ldots, L$ and $X_{l-1} \le x \le X_l - 1$, we have

$$V(x) = F_l x + \sum_{l'=1}^{l-1} (F_{l'} - F_{l'+1}) X_{l'}.$$

Figure 5 illustrates this piecewise-linear convex ordering cost function.

*** Figure 5 is about here. ***
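As an illustration of the formula above, the following sketch evaluates a piecewise-linear convex ordering cost from its kinks and marginal costs. The particular numbers in X and F are made up for the example (here with three pieces, the last of which extends to all larger order sizes).

```python
# Piecewise-linear convex ordering cost: on the l-th piece (X_{l-1} <= x <= X_l),
# V(x) = F_l * x + sum_{l' < l} (F_{l'} - F_{l'+1}) * X_{l'}.  The lists below hold
# illustrative values only; X stores X_0, ..., X_{L-1} with X_0 = 0, and the last
# piece (marginal cost F_L) extends to all larger order sizes.

X = [0, 10, 25]          # X_0, X_1, X_2
F = [1.0, 2.5, 4.0]      # F_1, F_2, F_3 (increasing marginal ordering costs)

def V(x):
    assert x >= 0
    l = sum(1 for kink in X if x >= kink)          # piece index l with X_{l-1} <= x
    return F[l - 1] * x + sum((F[lp - 1] - F[lp]) * X[lp] for lp in range(1, l))

# With these numbers: V(0) == 0.0, V(10) == 10.0, V(25) == 47.5, V(30) == 67.5.
```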

We may perceive that, whenever $x^*_j$ becomes as small as $X_l - 1$ as $j$ increases, it is possible to have $x^*_{j+1} = x^*_j - 1$. Theorem 4 confirms this.

Theorem 4  For any $j$ and $l = 1, \ldots, L$, when $X_{l-1} + 1 \le x^*_j \le X_l - 1$, we can find some $x^*_{j+1}$ equal to $x^*_j - 1$.

Proof: Since $x^*_{j+1}$ may be taken to be either $x^*_j$ or $x^*_j - 1$, we need only prove that $f_{j+1}(x^*_j - 1, z^*) \le f_{j+1}(x^*_j, z^*)$. For $X_{l-1} \le x \le X_l - 1$, it is true that $V(x + 1) - V(x) = F_l$. Hence, we have

$$f_{j+1}(x^*_j, z^*) - f_{j+1}(x^*_j - 1, z^*) - f_j(x^*_j + 1, z^*) + f_j(x^*_j, z^*) = 2 V(x^*_j) - V(x^*_j - 1) - V(x^*_j + 1) = 0.$$

Since we also have $f_j(x^*_j, z^*) \le f_j(x^*_j + 1, z^*)$, the theorem is proved.

Note that, when $X_{l-1} + 1 \le x^*_j \le X_l$ and $x^*_{j+1} = x^*_j - 1$, we have $z^*_{j+1} = z^*_j - F_l$.

The above results lead to the existence of an optimal policy $x^*$ of the following $2L$-piece linear nonincreasing zigzag form: there is some $K \equiv \{K_0, K_1, \ldots, K_{L-1}\}$, with $K_1, \ldots, K_{L-1}$ all nonnegative, such that

$$x^*_j = 0 \quad \text{for } j \ge K_0;$$

$$x^*_j = K_0 - \sum_{l'=1}^{l-1} K_{l'} - j \quad \text{for } K_0 - \sum_{l'=1}^{l-1} K_{l'} - X_l \le j \le K_0 - \sum_{l'=1}^{l-1} K_{l'} - X_{l-1} - 1, \quad l = 1, \ldots, L-1;$$

$$x^*_j = X_l \quad \text{for } K_0 - \sum_{l'=1}^{l} K_{l'} - X_l \le j \le K_0 - \sum_{l'=1}^{l-1} K_{l'} - X_l - 1, \quad l = 1, \ldots, L-1; \qquad (2)$$

$$x^*_j = K_0 - \sum_{l'=1}^{L-1} K_{l'} - j \quad \text{for } j \le K_0 - \sum_{l'=1}^{L-1} K_{l'} - X_{L-1} - 1.$$

The optimal solution value $z^*$ is convex in $j$. In addition, for $l = 1, \ldots, L$ and $K_0 - \sum_{l'=1}^{l-1} K_{l'} - X_l \le j \le K_0 - \sum_{l'=1}^{l-1} K_{l'} - X_{l-1} - 1$, we have $z^*_{j+1} = z^*_j - F_l$.

When taking the limit to the continuous case, the above policy is exactly the finite generalized base stock policy.
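The $2L$-piece policy (2) can be written out directly; the sketch below evaluates $x^*_j$ from the constants $K_0, \ldots, K_{L-1}$ and the kinks of the ordering cost, with illustrative numbers matching the three-piece shape of Figures 3 and 4. The specific values of K are made up; in the model they would come from solving the optimality equations.

```python
# Evaluate the 2L-piece zigzag policy (2).  K = [K_0, ..., K_{L-1}] and
# X = [X_0, ..., X_{L-1}] with X_0 = 0; S_l = K_0 - K_1 - ... - K_l is the l-th
# order-up-to level.  The numeric constants below are illustrative only.

def x_star(j, K, X):
    L = len(K)
    S = [K[0] - sum(K[1:l + 1]) for l in range(L)]     # S_0 = K_0, ..., S_{L-1}
    if j >= S[0]:
        return 0                                       # no order
    for l in range(1, L):
        if S[l - 1] - X[l] <= j <= S[l - 1] - X[l - 1] - 1:
            return S[l - 1] - j                        # ramp: order up to S_{l-1}
        if S[l] - X[l] <= j <= S[l - 1] - X[l] - 1:
            return X[l]                                # flat: order exactly X_l
    return S[L - 1] - j                                # last ramp: order up to S_{L-1}

# Illustrative constants for a three-piece ordering cost, as in Figures 3-5:
K = [40, 12, 8]          # K_0, K_1, K_2
X = [0, 10, 25]          # X_0, X_1, X_2
```

Scanning $j$ downward from $K_0$, the order size produced by this function decreases by at most one per unit decrease in $j$, matching the structure established in Section 3.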

5 Conclusion

In this paper, we have derived the forms of the optimal policies for the stochastic infinite-horizon inventory control problem with convex ordering and holding-backlogging costs. Our work extends the work of S. Karlin and M.J. Sobel on convex-cost stochastic inventory control problems. In the derivation, we have used more convenient representations for the generalized base stock policy and the finite generalized base stock policy. Moreover, our discrete-quantity approach appears to be efficient.

References

[1] Arrow, K., S. Karlin, and H. Scarf (1958), Studies in the Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, California.

[2] Blackwell, D. (1962), Discrete Dynamic Programming, Annals of Mathematical Statistics, 33, pp. 719-726.

[3] Denardo, E.V. (1967), Contraction Mappings in the Theory Underlying Dynamic Programming, SIAM Review, 9, pp. 165-177.

[4] Karlin, S. (1958), One-Stage Inventory Models with Uncertainty, in K. Arrow, S. Karlin, and H. Scarf (Eds.), Studies in the Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, California.

[5] Karlin, S. (1958), Optimal Inventory Policy for the Arrow-Harris-Marschak Dynamic Model, in K. Arrow, S. Karlin, and H. Scarf (Eds.), Studies in the Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, California.

[6] Porteus, E.L. (1990), Stochastic Inventory Theory, in D.P. Heyman and M.J. Sobel (Eds.), Handbooks in Operations Research and Management Science, Volume 2: Stochastic Models, Elsevier Science Publishers B.V., North-Holland, pp. 605-652.

[7] Puterman, M.L. (1990), Markov Decision Processes, in D.P. Heyman and M.J. Sobel (Eds.), Handbooks in Operations Research and Management Science, Volume 2: Stochastic Models, Elsevier Science Publishers B.V., North-Holland, pp. 331-434.

[8] Sobel, M.J. (1970), Making Short-Run Changes in Production When the Employment Level is Fixed, Management Science, 18, pp. 35-51.

Figure 1: Generalized Base Stock Policy Representation I (order up-to level versus starting inventory level)

Figure 2: Generalized Base Stock Policy Representation II (order size versus starting inventory level)

Figure 3: Finite Generalized Base Stock Policy Representation I (order up-to level versus starting inventory level, with breakpoints at K_0 - K_1 - K_2 - X_2, K_0 - K_1 - X_2, K_0 - K_1 - X_1, K_0 - X_1, and K_0)

Figure 4: Finite Generalized Base Stock Policy Representation II (order size versus starting inventory level, alternating between flat pieces at X_1 and X_2 and unit-slope decreasing pieces over the same breakpoints)

Figure 5: Piecewise-Linear Convex Ordering Cost (ordering cost versus order size, with kinks at X_1 and X_2 and corresponding values F_1 X_1 and F_2 X_2 + (F_1 - F_2) X_1)