Management of dam systems via optimal price control


Available online at www.sciencedirect.com. Procedia Computer Science 4 (2011) 1373–1382. International Conference on Computational Science, ICCS 2011.

Management of dam systems via optimal price control

Boris M. Miller (Monash University, Clayton, VIC 3800, Australia, and Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia)
Daniel J. McInnes (Monash University, Clayton, VIC 3800, Australia)

Abstract

This paper considers an optimal management strategy for a system of linked dams. The level of each dam is approximated by N discrete levels, and each dam is then modeled as a continuous-time controlled Markov chain on a finite control period. The inflow processes for each dam are non-stationary, as are the customer demands; we also consider non-stationary losses from each dam due to evaporation. The controls are a time- and state-dependent price control, whose bounds are prescribed by regulators, and time- and state-dependent flow controls between dams. The innovation in this model is that the price control is a feedback control that takes into account the active sectoral demands of customers. The general approach is to pose this stochastic optimization problem in the average sense and solve it by dynamic programming. We consider some issues of the numerical procedures involved in this method, and parallelization as a means to deal with higher-dimensional problems in reasonable time. We show that optimal price controls can be obtained numerically for each joint state of the dam system. The result is illustrated by a numerical example.

Keywords: stochastic control, optimal control, optimization problems, dynamic programming, Markov decision problems, parallelization.

1. Introduction

The availability of a secure water supply is fundamental for human life and all of the activities associated with human civilization.
Urbanization and the need to supply an urban population with all of its necessities have required the development of dams and reservoirs, both as a buffer against the vagaries of seasonal rainfall and as centralized infrastructure for water supply. Without such dam systems many cities simply could not exist. In this paper we consider one approach to dam system management as an example of optimal stochastic control, one that tries to balance the demands of long-term sustainability of a resource with the demands of customers in a methodical way. This type of problem is typically approached as a continuous-time Markov decision process (MDP), and there is abundant literature on this topic (see, for example, [1], [2] and [3]).

(This work was supported in part by Australian Research Council Grant DP0988685 and Russian Foundation for Basic Research Grant 10-01-00071. Corresponding author email addresses: Boris.Miller@monash.edu (Boris M. Miller), Daniel.McInnes@monash.edu (Daniel J. McInnes). 1877-0509 © 2011 Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and/or peer-review under responsibility of Prof. Mitsuhisa Sato and Prof. Satoshi Matsuoka. doi:10.1016/j.procs.2011.04.148)

Of course, dams have not escaped such

study, and a range of papers have been written in which the inflow process is a Wiener process or a compound Poisson process and the dam has either finite or infinite capacity (for examples see [4], [5], [6], [7] and [8]). It has been noted that a Wiener process is not very realistic for modeling inflows but simplifies some calculations [4]; in general, compound Poisson processes are used. The key point of commonality in all of the above examples is that they use very simple threshold control models, with a long-run average criterion as the principal optimality criterion. Such models generally assume an infinite time horizon for control and stationary inflow data. It is obvious, however, that inflows to dams are non-stationary, varying with seasonal rains. The long time horizon requires that inflows be greater than or equal to outflows, or the system will run out of water in finite time, yet in the short term usage may well exceed inflows. It is also clear that short-term water availability is extremely important and that some account needs to be taken of this in the optimality conditions. Finally, the long-term average criterion does not consider the costs of transient states and the resources required for these transitions on a finite time interval [9].

We consider the optimal management of a system of d dams via a state- and time-dependent price control and flow controls between dams in the system. The level of each dam in the system is approximated by N discrete levels, and each is then modeled as a continuous-time controlled Markov chain. The general approach to the solution of this type of problem is to reduce the stochastic problem to a deterministic one with integral and terminal optimality criteria and then solve it via dynamic programming. This type of problem has been solved for server queueing systems by Miller [9] and in general by Miller et al. [10, 11].
We assume that the intensities of the customers' seasonal demand functions, the intensities of the natural inflows to each dam and the intensities of evaporation from each dam are deterministic. We also set a bound on the price control, p(t), with p(t) ∈ [p_min, p_max]. Each of the flow controls u_{i,j}(t), i, j ∈ {1, ..., d}, from dam i to dam j is defined as a rate of random transitions (X_{i,k}, X_{j,l}) → (X_{i,k−1}, X_{j,l+1}), where u_{i,j} ∈ [0, U_max]; for the sake of simplicity we put U_max = 1.

1.1. Structure

In Section 2 we detail the method of modeling a multi-dam system as linked continuous-time controlled Markov chains. Section 3 provides details of the key innovation of this model and explains how state-dependent consumption functions are derived for each dam given our price feedback control. In Section 4 we consider a general performance criterion and the general solution of this problem as developed in [9, 10, 11]. Section 5 develops the specific performance criteria appropriate to this setting. Section 6 discusses some issues concerning the numerical solution and the application of parallelization to parts of the solution. Section 7 gives a numerical example for a system with only two dams, so that the solutions can be readily visualized. The final section outlines future directions for research and enhancement of this model and solution method.

2. Model of the controlled dam system

In order to model the dam system, we make some assumptions about the behavior of each dam. We assume that each dam has a natural inflow process, independent of flows into other dams. We likewise assume that natural losses from each dam due to evaporation are independent of evaporative losses in other dams. In terms of consumption, we assume that the consumption in each dam is controlled by a time- and state-dependent price and therefore depends on the joint state of the system.
Likewise, cross-flows between dams are time- and state-dependent and depend on the joint state of the system. For each dam, we approximate its level by discretizing it into N + 1 states, N < ∞, and then let L_i(t) ∈ {0, ..., N}, i = 1, ..., d, be an integer-valued random variable describing the level of dam i at time t. Using the martingale approach [12] we describe the N + 1 possible levels in each dam by unit vectors in R^{N+1}, giving us S_i = {e_0^{(i)}, ..., e_N^{(i)}} for each i = 1, ..., d.

All processes are defined on the probability space {Ω, F, P}. Specifically, we define X_{i,t}, i = 1, ..., d, where X_{i,t} ∈ S_i, t ∈ [0, T], for T < ∞, as a controlled jump Markov process with piecewise-constant right-continuous paths. This represents the process of change in the level of each dam on the interval [0, T]. We also make the following assumptions about the price control, p(t), and the transfer controls, u_{i,j}(t), between dams i and j, for i, j = 1, ..., d and i ≠ j.

Assumption 2.1. Assume that the sets of admissible controls, P̄ = {p(·)} and Ū = {u_{i,j}(·) : i, j = 1, ..., d; i ≠ j}, are sets of F_t^X-predictable controls taking values in P = {p ∈ [p_min, p_max]} and U = {u ∈ [0, 1]} respectively, where X = (X_1, X_2, ..., X_d).

Remark 2.2. Assumption 2.1 ensures that if the number of jumps in the i-th dam up to time t ∈ [0, T] is N_t, τ_k is the time of the k-th jump and X_i^t = {(X_{i,0}, 0), (X_{i,1}, τ_1), ..., (X_{i,N_t}, τ_{N_t})} is the set of states and jump times, then for τ_{N_t} ≤ t < τ_{N_t+1} the controls p(t) = p(t, X^t) and u_{i,j}(t) = u_{i,j}(t, X^t) are measurable with respect to t and X^t, where X^t = (X_1^t, ..., X_d^t) [12, 9].

2.1. Dam system dynamics

In our approach we suppose that the inflows and outflows of each dam in the system can be approximated by general F_t^X-predictable counting processes with unit jumps, and we let the inflow into dam i be Y_in^{(i)}(t). We assume that the natural component of this has a deterministic intensity, λ_i(t). The intensities of inflow components from other dams are the result of our water transfer controls, {u_{j,i}(t) : j, i = 1, ..., d; j ≠ i}. So for the i-th dam the inflow process has the form

Y_in^{(i)}(t) = ∫_0^t (λ_i(s) + Σ_{j≠i} u_{j,i}(s)) I{L_i(s) < N} ds + M_in^{(i)}(t),

where M_in^{(i)}(t) is a square-integrable martingale with quadratic variation

⟨M_in^{(i)}⟩_t = ∫_0^t (λ_i(s) + Σ_{j≠i} u_{j,i}(s)) I{L_i(s) < N} ds.
For the outflow process in each dam there is a natural component and components due to consumption and water transfer controls. Let the outflow from dam i be Y_out^{(i)}(t), and let the intensity of evaporation from dam i be μ_i(t) = μ_i(t, X_t), such that this intensity depends on the joint state of the process. The intensities of outflows from water transfers are {u_{i,j}(t) : i, j = 1, ..., d; j ≠ i}, and the intensity of outflow from consumption is the controllable consumption rate C_i(t) = C_i(t, p(t), X), which depends on the current price of water and the intensity of customer demands; this will be derived in the next section. So, in similar form to Y_in^{(i)}(t),

Y_out^{(i)}(t) = ∫_0^t (μ_i(s) + C_i(s) + Σ_{j≠i} u_{i,j}(s)) I{L_i(s) > 0} ds + M_out^{(i)}(t),

where M_out^{(i)}(t) is a square-integrable martingale with quadratic variation

⟨M_out^{(i)}⟩_t = ∫_0^t (μ_i(s) + C_i(s) + Σ_{j≠i} u_{i,j}(s)) I{L_i(s) > 0} ds.

It follows that the approximate dynamics of the i-th dam are governed by the equation

L_i(t) = Y_in^{(i)}(t) − Y_out^{(i)}(t).
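These counting-process dynamics are straightforward to simulate. Below is a minimal sketch in Python (not the authors' Mathematica code) for a single dam, with illustrative rate functions of our own choosing; the time-varying intensities are handled by thinning against a constant dominating rate.

```python
import math
import random

def simulate_dam_level(T=1.0, N=15, L0=8,
                       lam=lambda t: math.sin(2 * math.pi * t) + 10.0,
                       out_rate=lambda t, L: 5.0, seed=0):
    """Simulate one dam's level as a birth-death counting process.

    Inflow jumps L -> L+1 at rate lam(t) while L < N; outflow jumps
    L -> L-1 at rate out_rate(t, L) (evaporation + consumption +
    transfers, lumped together here) while L > 0.  Time-varying rates
    are handled by thinning against the constant dominating rate below.
    """
    rng = random.Random(seed)
    RATE_MAX = 40.0               # assumed upper bound on total jump rate
    t, L = 0.0, L0
    path = [(t, L)]
    while True:
        t += rng.expovariate(RATE_MAX)    # candidate event time
        if t >= T:
            break
        up = lam(t) if L < N else 0.0     # inflow blocked at level N
        down = out_rate(t, L) if L > 0 else 0.0  # outflow blocked at 0
        u = rng.random() * RATE_MAX
        if u < up:
            L += 1
        elif u < up + down:
            L -= 1
        # otherwise: thinned (phantom) event, level unchanged
        path.append((t, L))
    return path
```

The indicator structure of the integrands above appears here as the boundary guards that switch off inflow at level N and outflow at level 0.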

It should be emphasized that the dynamics of each dam clearly depend on the dynamics of the other dams. In essence, by splitting each dam into N levels we are saying that the mean times between level changes of the continuous flow processes correspond to the mean times between jumps in our counting-process approximations. The martingale terms provide the random perturbation about this mean and, importantly, have zero mean, so that

E[Y_in^{(i)}(t)] = E[∫_0^t (λ_i(s) + Σ_{j≠i} u_{j,i}(s)) I{L_i(s) < N} ds],

and

E[Y_out^{(i)}(t)] = E[∫_0^t (μ_i(s) + C_i(s) + Σ_{j≠i} u_{i,j}(s)) I{L_i(s) > 0} ds].

2.2. Controlled dam system as a system of controlled Markov chains

The above approximation of the dam system dynamics allows us to make the following proposition with respect to each dam in the system.

Proposition 2.3. Given the approximate dynamics for the i-th dam, as stated in Section 2.1, in a system of d dams, the controlled process for this dam is represented by a controlled Markov chain with N + 1 states and an (N + 1) × (N + 1) matrix, describing the infinitesimal generator of the corresponding Markov chain, A_i(t, p(t), u_{i,·}(t), u_{·,i}(t)) = A_i(t, C(t, p), u_{i,·}(t), u_{·,i}(t)), where

A_i(t, C, u_{i,·}, u_{·,i}) =
  [ −I_i[0,·]     O_i[1,·]                  0         ...        0
     I_i[0,·]   −(O_i[1,·] + I_i[1,·])   O_i[2,·]     ...        0
       ...           ...                   ...        ...       ...
        0       ...  I_i[N−2,·]   −(O_i[N−1,·] + I_i[N−1,·])   O_i[N,·]
        0       ...      0               I_i[N−1,·]           −O_i[N,·] ],

where I_i[x,·] = λ_i + Σ_{j≠i} u_{j,i} and O_i[x,·] = C_i + μ_i + Σ_{j≠i} u_{i,j} when dam i is in state X_i = x and the joint state of the other dams in the system is represented by a dot. One can say that O_i corresponds to the outflow and I_i to the inflow. The column number corresponds to the current state of the i-th dam, and the entries of each column add to zero.

Proof. The proof of this proposition is accomplished in the same way as for the generator of a controlled Markov chain for a single queueing system, given by Miller [9].
The only difference is that the chains are linked via the water transfer controls and a price defined on the joint states; however, as all controls are F_t^X-predictable, this does not affect the proof.

3. Derivation of controlled demand functions

As already stated, the key innovation of this model is the use of a time- and state-dependent feedback control, p(t, X_t), to take into account the active seasonal demands of consumers. It is more intuitive, and makes calculation easier, to find p(t, X_t) through the effect it has on consumption in each dam. For the i-th dam, the resulting controlled demand is denoted C_i(t, p(t, X_t)) = C(t, p) for brevity. Here we show how we take the price of water into account through controlled consumption. To be clear, we are looking for a single price structure for all users of the dam system.

So, considering the i-th dam, let there be n sectors, or consumers, each with its own seasonal demand intensity, x_{i,k}(t), for k = 1, ..., n. In order to have some control over this demand intensity, we want to set an optimal demand intensity for each sector, which we denote x̄_{i,k}, for k = 1, ..., n. We now define in what sense we want this target to be optimal, by minimizing the utility function, min_{x̄_{i,k}} f_{i,k}(x̄_{i,k}), where

f_{i,k}(x̄_{i,k}) = α((1 − r) x_{i,k}(t) − x̄_{i,k})² + p(t) x̄_{i,k},

α is a parameter which sets a limit on consumption above net natural flow, and r is a minimum demand-reduction target. To find the minimum we differentiate and solve for x̄_{i,k}, giving

x̄_{i,k}(t, p(t)) = ((1 − r) x_{i,k}(t) − p(t)/(2α)) · I{(1 − r) x_{i,k}(t) ≥ p(t)/(2α)}.

This is the optimal intensity of demand for the k-th sector. It follows that for the i-th dam, the total optimal intensity of demand is

C_i(t, p(t)) = Σ_{k=1}^n x̄_{i,k}(t, p(t)).   (1)

Recalling that p(t) = p(t, X_t), we now have a vector of optimal demand intensities for the i-th dam. Since we also know that p(t) ∈ [p_min, p_max], we can define maximum and minimum optimal demand intensities for the i-th dam in the following way:

C_{i,max}(t) = Σ_{k=1}^n x̄_{i,k}(t, p_min) ≥ C_i(t, p) = Σ_{k=1}^n x̄_{i,k}(t, p) ≥ C_{i,min}(t) = Σ_{k=1}^n x̄_{i,k}(t, p_max).   (2)

These relations allow us to define piecewise functions for the solution of C_i(t, p(t)) in the dynamic programming equations.

4. Dynamic programming and optimal control

In Miller [9], the solution for the optimal control of a single-server queueing system is developed via dynamic programming. This method can be extended to systems defined by multiple controlled Markov chains, as developed in this section; first, however, we detail the basic elements of the dynamic-programming method of solving for optimal controls on a single dam with the single control p(t).

4.1. General performance criterion

A performance criterion provides some restriction on the way the Markov chain is allowed to behave and still be considered optimal. Let f(s, p(s), X_s) be the running cost function when the chain is in state X_s at time s ∈ [0, T], T < ∞. Then a general performance criterion has the form

min J[p(·)],   J[p(·)] = E^{p(·)} [ φ_0(X_T) + ∫_0^T f(s, p(s), X_s) ds ],

where φ_0 ∈ R^{N+1}, ⟨·,·⟩ is the standard inner product, φ_0(X_T) = ⟨φ_0, X_T⟩ and f(s, p(s), X_s) = ⟨f(s, p(s)), X_s⟩.
The definition of f(s, p(s), X_s) gives rise to the definition of a vector of running cost functions of the Markov chain, f(s, p(s)) = (f(s, p(s), e_0), ..., f(s, p(s), e_N)).

Assumption 4.1. For each e_i, i = 0, 1, ..., N, the elements of f(s, p(s)) are bounded below and continuous on [0, T] × [p_min, p_max].

4.2. Value function

The value function of this Markov chain gives the total cost of control starting at time t ∈ [0, T] and state X_t = x. It has the form

V(t, x) = inf_{p(·)} J[p(·) | X_t = x],   J[p(·) | X_t = x] = E [ φ_0(X_T) + ∫_t^T f(s, p(s), X_s) ds | X_t = x ].   (3)

Assumption 4.1 ensures that this infimum exists. We now represent V(t, x) as V(t, x) = ⟨φ(t), x⟩, where φ(t) = (φ_0(t), φ_1(t), ..., φ_N(t))^T ∈ R^{N+1} is some measurable function, recalling that the x are unit vectors forming a basis in R^{N+1}.
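Before turning to the dynamic programming equation, the closed-form demand response derived in Section 3 can be sketched directly. A minimal Python version (ours, not the authors' code; the parameter values in the usage check are illustrative):

```python
def optimal_sector_demand(x_k, p, alpha, r):
    """Minimizer over y >= 0 of f(y) = alpha*((1 - r)*x_k - y)**2 + p*y,
    i.e. the optimal demand intensity for one sector at price p."""
    y = (1.0 - r) * x_k - p / (2.0 * alpha)
    return max(y, 0.0)   # the indicator truncation from the closed form

def controlled_consumption(demands, p, alpha, r):
    """Total controlled demand C_i(t, p): the sum of sectoral optima, eq. (1)."""
    return sum(optimal_sector_demand(x, p, alpha, r) for x in demands)
```

Since each sectoral optimum is non-increasing in p, `controlled_consumption` evaluated at p_min and p_max reproduces the ordering C_{i,max} ≥ C_i ≥ C_{i,min} in (2).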

4.3. Dynamic programming

Now we consider φ̂(t), of the same form as φ(t), and define the dynamic programming equation with respect to φ̂(t):

0 = ⟨dφ̂(t)/dt, x⟩ + min_{p∈P} [ ⟨φ̂(t), A(t, p(t)) x⟩ + ⟨f(t, p(t)), x⟩ ]
  = ⟨dφ̂(t)/dt, x⟩ + min_{p∈P} H(t, φ̂(t), p(t), x)   (4)
  = ⟨dφ̂(t)/dt, x⟩ + H*(t, φ̂(t), x),

with boundary condition φ̂(T) = φ_0 [12, 13, 14]. Since H(t, φ̂(t), p(t), x) is continuous in (t, p(t)) and affine in φ̂, for any (t, x) ∈ [0, T] × S, H*(t, φ̂(t), x) is Lipschitz in φ̂ [9].

Proposition 4.2. With Assumption 4.1 holding, equation (4) has a unique solution on [0, T].

Proof. The result follows from well-known results on the solutions of ordinary differential equations.

Remark 4.3. If we now let x = e^{(i)}, i = 0, 1, ..., N, then we get a system of ODEs

dφ̂_i(t)/dt = −H*(t, φ̂(t), e^{(i)}),   i = 0, 1, ..., N.   (5)

4.4. Optimal control

Given that the system in Section 4.3 can be solved, the optimal control is characterized by the following theorem [12, 9].

Theorem 4.4. Let φ̂(t) be the solution of the system of equations (5); then for each (t, x) ∈ [0, T] × S there exists p*(t, x) ∈ P such that H(t, φ̂(t), p(t), x) achieves a minimum at p*(t, x). Then:
1. There exists an F_t^X-predictable optimal control ˆp(t, X_t) such that V(t, x) = J[ˆp(·) | X_t = x] = ⟨φ̂(t), x⟩.
2. The optimal control can be chosen as Markovian, that is, ˆp(t, X_t) = p*(t, X_t) = arg min_{p∈P} H(t, φ̂(t), p(t), X_t).

4.5. Extension to d dams

There are two ways to extend these results to d dams. The first is to find the infinitesimal generator of the controlled Markov chain for the entire system, writing the entire set of possible joint states as a vector and solving as in Section 4.3. While this is possible, it presents practical problems. Firstly, the generator is of the form

G = A_1 ⊗ I ⊗ I ⊗ ... ⊗ I + I ⊗ A_2 ⊗ I ⊗ ... ⊗ I + ... + I ⊗ I ⊗ ... ⊗ A_{d−1} ⊗ I + I ⊗ I ⊗ ... ⊗ I ⊗ A_d,

which is a matrix of dimensions (N + 1)^d × (N + 1)^d.
With a large number of states and dams this would be extremely cumbersome to deal with. There is also no existing operation to arrange the set of joint states into a vector with the elements in the correct positions for calculation, although it can be done in an ad hoc fashion. The second method does not have these problems and relies on Theorem 4.4. We consider the value function V(t, x) = ⟨φ(t), x_1 ⊗ x_2 ⊗ ... ⊗ x_d⟩, where φ(t) is a tensor of depth d. From Theorem 4.4 we know that there exist optimal Markovian controls that satisfy V(t, x) = ⟨φ(t), x⟩ = ⟨φ̂(t), x⟩, where φ̂(t) has the same form as φ(t). So, considering that

d⟨φ(t), X⟩/dt = ⟨dφ(t)/dt, X⟩ + ⟨φ(t), A_1 X_1 ⊗ X_2 ⊗ ... ⊗ X_d + X_1 ⊗ A_2 X_2 ⊗ ... ⊗ X_d + ... + X_1 ⊗ X_2 ⊗ ... ⊗ A_d X_d⟩,

we minimize ⟨φ(t), X⟩, taking into account the depth-d tensor of performance criteria, f(t, p(t), u_{l,m}(t)), and solve the resulting system of ODEs:

⟨dφ(t)/dt, X⟩ = −min_{p(·), u_{l,m}(·)} [ ⟨φ(t), A_1 X_1 ⊗ X_2 ⊗ ... ⊗ X_d + X_1 ⊗ A_2 X_2 ⊗ ... ⊗ X_d + ... + X_1 ⊗ X_2 ⊗ ... ⊗ A_d X_d⟩ + ⟨f(t, p(t), u_{l,m}(t)), X⟩ ].   (6)

Here we use unit vectors e_i^{(k)}, i = 0, 1, ..., N, for x_k, k = 1, ..., d, and u_{l,m}(t) represents all flow controls. The actual method of computing this will be considered in Section 6.
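To make the backward integration of a system of the form (5) concrete, here is a small sketch in Python with NumPy for a single dam with a price grid. All rates, costs and grid sizes below are illustrative assumptions of ours, not the paper's two-dam data, and a simple explicit Euler step stands in for a proper ODE solver.

```python
import numpy as np

def solve_dp(N=5, T=1.0, steps=500, prices=np.linspace(1.0, 1.75, 16),
             lam=1.0, mu=0.5, demand=4.0, alpha=1.0, r=0.25, M=2, K=5.0):
    """Backward-Euler integration of dphi_i/dt = -min_p [(A(p)^T phi)_i + f_i(p)].

    Single dam with N+1 levels; consumption C(p) comes from the truncated
    quadratic utility; running cost = (C(p) - demand)^2 + K*1{level <= M};
    terminal cost = K*1{level <= M}.  Rates are held constant in time.
    """
    def consumption(p):
        return max((1 - r) * demand - p / (2 * alpha), 0.0)

    def generator(C):
        # Tridiagonal generator: columns sum to zero (column = current state).
        A = np.zeros((N + 1, N + 1))
        for i in range(N + 1):
            if i < N:                    # inflow jump i -> i+1
                A[i + 1, i] += lam
                A[i, i] -= lam
            if i > 0:                    # outflow jump i -> i-1
                A[i - 1, i] += mu + C
                A[i, i] -= mu + C
        return A

    dt = T / steps
    phi = np.array([K * (i <= M) for i in range(N + 1)], dtype=float)
    for _ in range(steps):               # integrate backwards from T to 0
        best = None
        for p in prices:                 # minimize pointwise over the price grid
            C = consumption(p)
            f = np.array([(C - demand) ** 2 + K * (i <= M)
                          for i in range(N + 1)])
            cand = generator(C).T @ phi + f
            best = cand if best is None else np.minimum(best, cand)
        phi = phi + dt * best            # dphi/dt = -H*  =>  step backwards
    return phi
```

The elementwise `np.minimum` over the price grid reflects that the Markovian control may be chosen separately in each state.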

5. Performance criteria

Performance criteria define in what way we want the dam management policy to be optimal. For a problem with many dams we consider four types of performance criteria. The first type gives, for each dam and each joint state, the squared difference between the optimal consumption and the total customer demand on that dam. For a d-dam system we have

J_1(t, p(t), X_t) = ( C_1^X(t) − Σ_{k=1}^n x_{1,k}(t) )², ..., J_d(t, p(t), X_t) = ( C_d^X(t) − Σ_{k=1}^n x_{d,k}(t) )²,   (7)

where each of the J_i, i = 1, ..., d, is a tensor of depth d.

The second type of performance criterion concerns controlled transfers between dams. We consider the squared difference between the natural inflows and transfers into each dam, on the one hand, and the customer demand and evaporation in that dam, on the other. For a d-dam system the criteria have the form

J_{1+d}(t, u_{·,1}(t), X_t) = ( λ_1(t) + Σ_{j≠1} u_{j,1}(t) − Σ_{k=1}^n x_{1,k}(t) − μ_1(t) )², ..., J_{2d}(t, u_{·,d}(t), X_t) = ( λ_d(t) + Σ_{j≠d} u_{j,d}(t) − Σ_{k=1}^n x_{d,k}(t) − μ_d(t) )²,   (8)

where each of the J_{i+d}, i = 1, ..., d, is a tensor of depth d, recalling that u_{j,·}(t) = u_{j,·}(t, X_t). The idea here is to maintain balance between inflows and outflows via these transfers. These simple quadratic criteria are just to demonstrate the method; clearly more realistic criteria could be developed. Also, we treat the transfer intensities separately in the performance criteria, but what we would observe is the difference between these intensities as a single flow intensity; this is explained in Section 7.

The third type of performance criterion is a single criterion which gives the sum of the average probabilities that the level of each dam in the system falls below level M ≤ N over the interval [0, T]:

J_{2d+1}(t, p(t), u_{l,m}(t), X_t) = Σ_{l=1}^d E^{p,u} [ ∫_0^T Σ_{k=0}^M X_{l,k}(s) ds ].   (9)

As with the other criteria, this is a tensor of depth d. Now we define f(t, p(t), u_{l,m}(t), X_t) = Σ_{k=1}^{2d+1} J_k(·); this is then used in the dynamic programming equation.
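For a two-dam system, the running-cost tensor f over joint states can be assembled directly from the criteria above. A small sketch (Python; all numeric inputs here are placeholders of our own, and the demands, transfers and consumption are held constant rather than state-dependent, purely for illustration):

```python
import numpy as np

def running_cost_tensor(N, C1, C2, demand1, demand2,
                        lam1, lam2, mu1, mu2, u12, u21, M, K):
    """Depth-2 running-cost tensor over joint states (i, j) for two dams:
    consumption-tracking terms as in (7), flow-balance terms as in (8),
    and a low-level indicator as in (9), weighted by K."""
    f = np.zeros((N + 1, N + 1))
    for i in range(N + 1):            # level of dam 1
        for j in range(N + 1):        # level of dam 2
            f[i, j] = (
                (C1 - demand1) ** 2 + (C2 - demand2) ** 2        # (7)
                + (lam1 + u21 - demand1 - mu1) ** 2              # (8), dam 1
                + (lam2 + u12 - demand2 - mu2) ** 2              # (8), dam 2
                + K * ((i <= M) + (j <= M))                      # (9)
            )
    return f
```

In the full model each entry would use the consumption and transfers chosen for that joint state; the tensor layout, however, is the same.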
A final criterion concerns the terminal state. We consider the sum of the probabilities that the level of each dam is below level M ≤ N at time T:

J_{2d+2}(t, p(t), u_{l,m}(t), X_t) = Σ_{l=1}^d E^{p,u} [ Σ_{k=0}^M X_{l,k}(T) ].   (10)

This last criterion gives us the terminal conditions for the solution of the system of ODEs given at (6). In this paper we consider the control resources of the entire dam system to be unconstrained; however, it is possible to consider the problem with constrained control resources. This type of work has been done by Miller et al. [10] and uses the Lagrangian approach to find the optimal weighting of each criterion. This will be implemented in our further research into dam system management.

6. Computational methods

With the results of Sections 4.5 and 5, we can solve the system numerically. So far, all of our numerical work has been done using Mathematica 7. Clearly the numerical solution of this problem will be implemented differently in

different languages; however, there are some common issues to deal with in any implementation. The first is how to handle expressions like ⟨φ(t), A_1 X_1 ⊗ X_2 ⊗ ... ⊗ X_d + X_1 ⊗ A_2 X_2 ⊗ ... ⊗ X_d + ... + X_1 ⊗ X_2 ⊗ ... ⊗ A_d X_d⟩, recalling that φ(t) is a tensor of depth d and all of the X_k are unit vectors. In Mathematica 7 this is handled via the repeated use of a generalized inner product, such that

⟨φ(t), A_1 X_1 ⊗ X_2 ⊗ ... ⊗ X_d⟩ = ⟨X_d, ⟨X_{d−1}, ..., ⟨X_2, ⟨A_1 X_1, φ(t)⟩⟩ ... ⟩⟩.

This may be an abuse of notation, since an inner product gives a scalar; however, the implementation produces the correct result. In general, this calculation would be carried out as follows, where * refers to the transpose operation:

X_d* . ( X_{d−1}* . ( ... ( X_2* . ( (A_1 X_1)* . φ(t) ) ) ... ) ).

All other calculations involving the extraction of the tensor element corresponding to a particular joint state, or operations on a particular state, are handled in the same way.

Another issue is the minimization operation involved in each ODE of the system. In the case we have, with the integral performance criteria given, these minimizations can all be carried out prior to solving the ODE system. Essentially we have to minimize over C_l(t, p(t), X_t) and u_{l,m}(t, X_t) in (6). Due to the particular structure we have, we simply take the partial derivatives with respect to these controls and minimize, since the minimizations are separable. If the performance criteria were not such nice integral criteria, we might have to minimize during the solution of the ODE system, resulting in far more complex calculations. In the case we have, the actual time of solution of the ODE system is much less than that taken to carry out the minimizations. Since these are done prior to the ODE system solution, they can be done separately in parallel. Mathematica 7 has good support for parallelization, and we have parallelized the minimization operations.
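The nested inner-product evaluation described above carries over directly to an array language such as NumPy. A sketch (ours, not the Mathematica implementation), with the d = 2 case checkable against an explicit computation:

```python
import numpy as np

def drift_term(phi, A_list, xs):
    """Evaluate <phi, A_1 x_1 ⊗ x_2 ⊗ ... ⊗ x_d + x_1 ⊗ A_2 x_2 ⊗ ... + ...>
    for a depth-d tensor phi by contracting one vector per axis, trailing
    axis first, mirroring the nested inner products in the text."""
    d = len(xs)
    total = 0.0
    for k in range(d):
        # In the k-th term the k-th factor is A_k x_k; the rest are x_i.
        vecs = [A_list[i] @ xs[i] if i == k else xs[i] for i in range(d)]
        t = phi
        for v in reversed(vecs):      # each @ removes the last tensor axis
            t = t @ v
        total += float(t)
    return total
```

For d = 2 this reduces to (A_1 x_1)ᵀ φ x_2 + x_1ᵀ φ (A_2 x_2), which the test below uses as a cross-check.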
As yet, this has only been tested on a dual-core machine, but the results are promising. We have access to a computer cluster and will implement the program on this cluster and report on the results when available. Mathematica has been useful for experimenting and prototyping, but in future work we plan to rewrite this program in Fortran or C.

7. Numerical example

We include a two-dam system model as a numerical example of our results so far. Table 1 is a list of parameters and functions corresponding to those defined in previous sections, together with a value K. The values of the performance criteria for the probability of the dams falling below level M, during and at the end of the control interval, are multiplied by K to make the solutions more sensitive to these criteria. The first subscript refers to the first or second dam as appropriate. The values given for α_1 and α_2 correspond to an allowance of an extra 20% consumption above net natural flows in each dam. The performance criteria (7), (8) and (9) are included in the ODE system definition, while (10) is dealt with by the terminal conditions.

On a dual-core desktop, over three runs the calculations took on average 575.18 CPU seconds in serial and 71.39 CPU seconds with some parallelization, an increase in speed of around eight times. Figures 1 to 3 give the type of price structure achieved, noting that this is not a surface but a lattice of prices. It is clear that the slightly higher demand on dam one biases the price structure to behave more responsively to changes in dam one, which is reasonable. The structure is also quite stable, except at the end: with the control objectives largely achieved by t = 1, the prices move towards the minimum.

A similar result holds for transfers between dams. Figure 4 gives the intensity of selected flows between dams one and two. This is defined as u_{1,2}(t) − u_{2,1}(t), taken at the states specified in Figure 4; that is, it is the net intensity of flows between the dams.
If the intensity is positive, the flow is from dam one to dam two; if negative, from dam two to dam one. This graph shows a clear bias toward transfers into dam one, where the demand is higher. One can also observe the quadratic nature of the performance criteria at work. This may not be realistic, but we are simply demonstrating the method at this stage; we will work on better performance criteria in future research.

Figure 5 shows the effect of these controls on the total demand. It shows the total original demand for the system and a weighted average of the total controlled demand, the weighted average being given by

  Σ_{l=1}^{2} Σ_{i=1}^{N} Σ_{j=1}^{N} P{Dam l is in state [i, j] at time t} C_l[i, j](t).

The probabilities are found by solving the Kolmogorov forward differential equations for each dam, given the solution to the system of equations (6). The figure clearly shows that, on average, there would be a significant reduction in water use in the system under this method of feedback price control.

Table 1: Model parameters and functions.

  Parameters:    N = 15, M = 5, n_1 = 3, n_2 = 3, p_max = 1.75, p_min = 1.0,
                 r_1 = r_2 = 0.25, α_1 = 0.91, α_2 = 1.82, K = 15
  Natural flows: λ_1(t) = sin(2πt) + 10
                 λ_2(t) = sin(2πt + π/6) + 9
                 μ_{1,N}(t) = sin(2πt) + 4.5
                 μ_{2,N}(t) = sin(2πt + π/6) + 3.5
  Demands:       x_{1,1}(t) = cos(2πt) + 4.5
                 x_{1,2}(t) = 0.3 cos(2πt) + 4.5
                 x_{1,3}(t) = 0.5 cos(2πt) + 5
                 x_{2,1}(t) = cos(2πt) + 5
                 x_{2,2}(t) = 0.4 cos(2πt) + 4
                 x_{2,3}(t) = 0.3 cos(2πt) + 4

Figure 1: Prices at t = 0. Figure 2: Prices at t = 0.5. (Price lattices over dam 1 level and dam 2 level, 1 to 15; price axis $1.00 to $1.75.)

8. Conclusion

In this paper we have developed a model for managing water use in a dam system via a dynamic feedback price control.
We have shown via a numerical example that the resulting price structure is reasonable, in that prices are generally high when the water level is low and low when it is high, allowing for the bias toward the dam with the greater demand. In this article we have considered users without a connection to any particular dam suited to their needs. More realistic would be a relation between a particular customer and a particular dam, or even a variable and controllable structure of customer-dam relations.
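To make the computation behind Figure 5 concrete, the following Python sketch integrates the Kolmogorov forward equation p'(t) = p(t) Q(t) for a single dam modelled as a birth-death chain on levels 1, ..., N, then forms a one-dam analogue of the weighted-average controlled demand. The generator Q(t) and the level-dependent demand vector C are illustrative stand-ins patterned on Table 1, not the controlled system obtained from equations (6):

```python
import numpy as np

N = 15  # number of dam levels, as in Table 1

def Q(t):
    """Generator of a birth-death chain on levels 1..N with periodic
    inflow/outflow intensities (illustrative rates, not the controlled
    generator of equations (6))."""
    lam = np.sin(2 * np.pi * t) + 10.0  # inflow intensity
    mu = np.sin(2 * np.pi * t) + 4.5    # outflow intensity
    Qm = np.zeros((N, N))
    for i in range(N):
        if i + 1 < N:
            Qm[i, i + 1] = lam  # water level rises by one
        if i > 0:
            Qm[i, i - 1] = mu   # water level falls by one
        Qm[i, i] = -Qm[i].sum() # rows of a generator sum to zero
    return Qm

def forward(p0, T=1.0, steps=2000):
    """Euler integration of the forward equation p'(t) = p(t) Q(t)."""
    dt = T / steps
    p = p0.copy()
    for k in range(steps):
        p = p + dt * (p @ Q(k * dt))
    return p

p0 = np.zeros(N)
p0[N // 2] = 1.0   # start with the dam at a middle level
pT = forward(p0)   # level distribution at the end of the interval

# Hypothetical controlled demand per level at t = 1 (lower demand at
# low levels, where prices are high), then the weighted average
# corresponding to the sum over states in the text:
C = (np.cos(2 * np.pi * 1.0) + 4.5) * np.linspace(0.8, 1.2, N)
weighted = pT @ C

assert np.isclose(pT.sum(), 1.0)       # total probability is preserved
assert C.min() <= weighted <= C.max()  # a proper average over levels
```

In the paper, the joint two-dam distribution over states [i, j] replaces this single-level distribution, and C_l[i, j](t) comes from the optimal price control rather than a fixed profile, but the mechanics are the same: solve the forward equations, then average the controlled demand against the state probabilities.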

Figure 3: Prices at t = 1. Figure 4: Selected flows between dams 1 and 2 (net intensity of dam 1 to dam 2 flows at states [1,1], [1,3], [1,6], [1,9], [1,12] and [1,15], plotted against time in years). Figure 5: Total original demand and controlled demand (in ML, against time in years).

Future work in this area will introduce more seasonal variability to the natural inflows and outflows. We will also consider the problem under control resource constraints. At present we have allowed the flows between dams to be random; however, we want to schedule flows between dams to make the model more realistic, which will result in a hybrid model. Alongside this work will be further research on computational methods using cluster computing and parallelization. The results of this work will be reported in appropriate journals as our research proceeds.