Best Guaranteed Result Principle and Decision Making in Operations with Stochastic Factors and Uncertainty


Stochastics and uncertainty underlie all the processes of the Universe. N.N. Moiseev

Best Guaranteed Result Principle and Decision Making in Operations with Stochastic Factors and Uncertainty

by Iouldouz S. Raguimov
Department of Mathematics and Statistics, York University
4700 Keele Street, Toronto, Ontario, Canada, M3J 1P3

November 13, 2014

Abstract

The Best Guaranteed Result Principle is introduced, and decision making in operations with stochastic factors and uncertainty is studied. The notions of an information hypothesis and of a decision-making turn are introduced. Various scenarios of game situations with different turn sequences of decision making and voluntary information exchange between the players are studied. Decision making in operations with stochastic factors is analyzed under the frequency definition of probability. In our setting, the possibility of using stochastic information about an uncontrollable factor is connected with the possibility (and reasonableness) of passing from the original operation to an extended one with infinitely many repetitions. It is shown that if in the extended operation we maximize the lower limit of the average value of the original criterion, then the best guaranteed result of the Operating Side is determined as the solution to a certain optimization problem, and its optimal strategy is to choose, at each step of the extended operation, an optimal solution to that optimization problem. Similar results are obtained for the problem of maximizing the probability of exceeding a given level and for the problem of maximizing a level with a given guarantee. Applications of the obtained results to a two-stage stochastic programming problem are considered.

1 Introduction

Problems of control in economic, financial, as well as engineering systems are often formulated as decision-making problems under stochastics and uncertainty. The mathematical formalization of such problems usually requires a non-trivial, non-classical approach. Let us define an operation as a series of actions to achieve a certain objective (criterion), i.e., as a series of objective-oriented acts. In an operation, a rational entity consisting of one or several persons aiming to achieve the given objective is called the operating side (OS). Within the operating side we distinguish an operations research analyst (ORA): a group of one or several persons that does not make decisions itself, but helps the operating side to do so. The operation usually involves other participants with different objectives, and the result of the operation depends on their activities as well. Consequently, as a rule, there are some factors in the operation uncontrollable by the operating side which, by influencing the result of the operation, create a so-called operational situation. According to the information available to the ORA, these uncontrollable factors can be divided into three groups:

1. Constant factors whose values are known to the ORA;
2. Stochastic factors, i.e., stochastic processes with given distributions;
3. Uncertain factors that have known ranges of values, but whose exact value at any particular time is unknown. (If an uncertain factor is known to be stochastic, it is assumed that the ORA does not know its exact distribution.)

The uncertain factors can in turn be divided into three groups:

(a) uncertain factors that result from incomplete knowledge about some processes or quantities in the operation (we call them natural uncertainties);
(b) uncertain factors related to an ambiguity in the operating side's objective;

(c) uncertain factors representing the activities of other participants involved in the operation that act independently and have objectives different from those of the OS. A participant with the opposite objective is called an opponent.

2 Decision making under uncertainty

2.1 Mathematical model of an operation

In operations containing only uncontrollable factors of type 1, the decision-making problem is usually reduced to the solution of a deterministic optimization problem. But if the operation also contains uncontrollable factors of the second and third types, i.e., stochastic and uncertain factors, then its analysis requires the methods of Stochastic Optimization, Game Theory, and Decision Theory. A number of decision-making principles are widely used in Decision Theory: the minimax criterion; Savage's minimax regret criterion; the expected value criterion; Hurwicz's criterion; Bayes's criterion; and others. However, our main goal here is to introduce the approach based on the Best Guaranteed Result Principle. Suppose that in an operation the criterion of the OS is formulated as the maximization of a function F(x, y, z), where the vector x ∈ M_0 denotes the control variables, the vector y ∈ N denotes the uncontrollable uncertain factors, and z denotes a vector of uncontrollable stochastic parameters with distribution function θ(z) on the set Z. The function θ(z) is either known exactly, or it is only given that θ ∈ Θ, where Θ is a set of distribution functions. Usually, the following two ways of specifying the set Θ are considered:

1. Θ = {θ_α : α ∈ A}. Here the type of the distribution rule θ_α is given, while the vector of parameters α determining the exact distribution function is an uncertain factor with range of values in the set A;

2. Θ = {θ : a̲_i ≤ ∫_Z a_i(z) dθ(z) ≤ ā_i, i = 1, 2, …, l}, where a_i(z), i = 1, 2, …, l, are given integrable functions and a̲_i, ā_i are given bounds. Note that here the type of the distribution rule is unknown, but we are given some constraints on its integral characteristics.

As a rule, the model of an operation is analyzed before the operation takes place. Assume that another participant is involved in the operation whose criterion is formulated as the maximization of G(x, y, z). In particular, if G(x, y, z) = −F(x, y, z), we obtain a zero-sum game, and this participant, as mentioned above, is called an opponent. Usually, while an operation evolves, the OS receives some information about the uncontrollable factors. An exact description of the process of receiving this information is called an information hypothesis. An action allowed by an information hypothesis is called a strategy of the operating side. Mathematically, a strategy is defined as a mapping x̃ : N × Z → M_0. Denote the set of all such functions by M̃, and let M ⊆ M̃ be the set of strategies, i.e., the set of functions in M̃ corresponding to the information hypothesis. Notice that strategies are usually determined before the operation, once its model has been analyzed by the ORA. Consider the following examples of information hypotheses and the corresponding sets of strategies:

(a) No information is expected about the uncontrollable factors. Then M = M_0, where M_0 is the set of all constant strategies, i.e., x̃ ≡ x;
(b) At the beginning of the operation the exact values of the uncontrollable factors y and z are known. Then M = M̃, where M̃ is the set of all functions x̃ : N × Z → M_0;
(c) At the beginning of the operation the exact values of the uncertain factors y are known, and no information is expected about the stochastic factors. Then M = M_y, where M_y is the set of all functions x̃ : N → M_0;
(d) The information hypothesis may be given using a so-called information function R : N × Z → R^m, defined on N × Z. In this case, at the

beginning of the operation the exact values of R(y, z) for all values of y and z are known. Then the set of strategies of the operating side is

M = {x̃ ∈ M̃ : R(y_1, z_1) = R(y_2, z_2) ⟹ x̃(y_1, z_1) = x̃(y_2, z_2) for all y_1, z_1, y_2, z_2}.

Therefore, the mathematical model of an operation is determined as the following set of objects: F(x, y, z), x ∈ M_0, y ∈ N, z ∈ Z, θ ∈ Θ, x̃ ∈ M.

3 The efficiency of strategies

How do we compare strategies? There exist two different approaches to evaluating the efficiency of a strategy x̃ : N × Z → M_0, based on either of the following principles: (A) averaging the value of the objective function; (B) the value-at-risk or conditional value-at-risk measures of the risk of loss. Other principles for evaluating strategies exist as well; however, our approach is primarily based on the Best Guaranteed Result Principle. We begin with pure strategies; mixed strategies are considered in Section 3.2.

3.1 Expected value criterion

In our setting, for every strategy x̃ ∈ M, its efficiency, denoted by W(x̃), is estimated using the Guaranteed Result Principle, where W(x̃) is a certain guaranteed value of the objective function. In comparing two strategies, we define the better one to be the one with the larger efficiency estimate. The expected value criterion suggests estimating the efficiency of a strategy x̃(y, z) using the function

F̄(x̃(y, z), y, θ) = ∫_Z F(x̃(y, z), y, z) dθ(z)

instead of the original criterion F(x̃(y, z), y, z). As a result, the OS maximizes the average efficiency. Further suppose that:

1. Every uncertain factor in y is either a natural uncertainty or a strategy of an opponent who has no information about the realizations of the stochastic factor z;
2. The opponent's objective is either unknown or opposite to the objective of the operating side;
3. The OS agrees to the above-mentioned averaging of the criterion with respect to the stochastic factors.

Then the efficiency estimate for a strategy x̃ is given as

W(x̃) = inf_{y∈N} inf_{θ∈Θ} F̄(x̃, y, θ) = inf_{y∈N} inf_{θ∈Θ} ∫_Z F(x̃(y, z), y, z) dθ(z).

Definition: A strategy x̃* ∈ M is said to be optimal if

W(x̃*) = max_{x̃∈M} W(x̃).

This value is called the Best Guaranteed Result of the OS. Let ε > 0 be given. A strategy x̃_ε is called ε-optimal if

W(x̃_ε) ≥ sup_{x̃∈M} W(x̃) − ε.

Please note that determining the value of max_{x̃∈M} W(x̃) is a problem of maximization on the set of functions M. Let us now assume that the opponent knows the realization of the stochastic factor z and, consequently, chooses the vector y accordingly. Then the efficiency estimate for a strategy x̃ is given as

W̃(x̃) = inf_{θ∈Θ} ∫_Z inf_{y∈N} F(x̃(y, z), y, z) dθ(z).

Indeed, in the worst case, when the opponent knows the strategy x̃, s/he will be able to achieve inf_{y∈N} F(x̃(y, z), y, z) for the given realization of z. It is clear that W̃(x̃) ≤ W(x̃): the more information the opponent has, the smaller the efficiency estimate of the operating side's strategy.
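For finite sets the efficiency estimate above is a direct computation. The sketch below, with hypothetical data (the matrix F, the family Theta, and all set sizes are illustrative assumptions, not taken from the text), evaluates W(x) = min over y and θ of the averaged criterion, and maximizes it over constant strategies:

```python
# Best guaranteed result over constant strategies, for finite M_0, N, Z and a
# finite family Theta of candidate distributions (hypothetical data throughout).

def avg_criterion(F, x, y, theta):
    """Average of F(x, y, z) over z with weights theta(z)."""
    return sum(p * F[x][y][z] for z, p in enumerate(theta))

def guaranteed_result(F, Theta, X, Y):
    """max over x of W(x) = min over y and theta of the averaged criterion."""
    best_x, best_w = None, float("-inf")
    for x in X:
        w = min(avg_criterion(F, x, y, th) for y in Y for th in Theta)
        if w > best_w:
            best_x, best_w = x, w
    return best_x, best_w

# Hypothetical 2x2x2 criterion F[x][y][z] and two candidate distributions.
F = [[[4, 0], [2, 2]],
     [[3, 3], [1, 5]]]
Theta = [(0.5, 0.5), (0.8, 0.2)]
x_star, W_star = guaranteed_result(F, Theta, X=[0, 1], Y=[0, 1])
# Here x_star = 0 with W_star = 2.0: x = 0 guards against both the worst y
# and the worst distribution in Theta.
```
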

3.2 Mixed strategies in an operation

Suppose no information is expected about the uncontrollable factors, that is, x̃ ≡ x ∈ M = M_0. If the OS is not satisfied with the best guaranteed result sup_{x∈M_0} W(x), then mixed strategies can be used.

Definition: A probability distribution ϕ on M_0 is called a mixed strategy.

For the OS, using the mixed strategy ϕ is equivalent to taking as its strategy x ∈ M_0 the value of a random variable with probability distribution ϕ on M_0. As a result, the OS introduces one more stochastic factor into the operation. Let us determine the conditions related to the use of mixed strategies in an operation. Suppose, in addition, that the opponent has no information about the realizations of x ∈ M_0. Then the efficiency of the mixed strategy ϕ is given as

W(ϕ) = inf_{y∈N} inf_{θ∈Θ} ∫_Z ∫_{M_0} F(x, y, z) dϕ(x) dθ(z).

In the case when y is controlled by the opponent and s/he knows the realizations of z, the efficiency of the strategy ϕ is

W̃(ϕ) = inf_{θ∈Θ} ∫_Z inf_{y∈N} ∫_{M_0} F(x, y, z) dϕ(x) dθ(z).

It is clear that once the operation involves uncontrollable stochastic factors, it is easier for the ORA to convince the OS to use mixed strategies.

Problem 1: Let i ∈ M_0 = {1, 2, 3, 4}, let j ∈ N = {1, 2, 3}, and let the criterion of the OS be given by the matrix [F(i, j)]_{4×3}, where the OS chooses the rows of the matrix, i.e., i ∈ M_0 = {1, 2, 3, 4}, and the states of Nature are given by the columns of the matrix, i.e., j ∈

N = {1, 2, 3}. Denote a strategy ĩ : N → M_0 of the OS as a vector ĩ = (ĩ(j), j ∈ N). Evaluate the efficiency of the pure strategies ĩ_1 = (1, 1, 1), ĩ_2 = (3, 2, 4) and of the mixed strategy ϕ = (1/2, 1/3, 0, 1/6) of the OS, if:

a) j is an uncertain factor;
b) j is a stochastic factor with the distribution θ = (2/3, 1/6, 1/6).

3.3 Optimal strategies in an operation

What strategies in an operation are optimal?

Definition: A strategy x̃* ∈ M is said to be optimal whenever

F_Γ(M) := max_{x̃∈M} W(x̃) = W(x̃*).

The value F_Γ(M) is called the best guaranteed result of the OS. When the OS wants to minimize the value of the criterion, the maximum is replaced by a minimum. In particular, the best guaranteed result of the OS in mixed strategies is defined as

F_m := sup_{ϕ∈{ϕ}} W(ϕ),

where {ϕ} is the set of all mixed strategies. When the maximum max_{x̃∈M} W(x̃) is not attained, ε-optimal strategies are used instead of optimal ones.

Definition: For every sufficiently small ε > 0, a strategy x̃_ε ∈ M is said to be ε-optimal whenever

W(x̃_ε) ≥ sup_{x̃∈M} W(x̃) − ε = F_Γ(M) − ε.

Since we need to maximize W(x̃) on the set of functions M, determining the value of F_Γ(M) is quite a challenging problem. However, this problem is simplified if the set M contains so-called absolute optimal strategies.
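Since the matrix of Problem 1 did not survive transcription, the sketch below uses a hypothetical 4×3 payoff matrix (an assumption, not the problem's data) to show how the two efficiencies of a mixed strategy are computed: min over j of the ϕ-average of each column when j is uncertain, and the θ-average of those column values when j is stochastic.

```python
# Efficiency of a mixed strategy phi on a finite matrix game.
# F is a HYPOTHETICAL 4x3 matrix F[i][j]; phi and theta are the distributions
# stated in Problem 1.

F = [[3, 1, 4],
     [2, 5, 0],
     [1, 1, 6],
     [4, 2, 2]]           # hypothetical F[i][j], i in M_0, j in N
phi = (1/2, 1/3, 0, 1/6)   # the mixed strategy from Problem 1
theta = (2/3, 1/6, 1/6)    # the distribution from Problem 1(b)

# phi-average of each column: expected payoff against each fixed j.
col = [sum(phi[i] * F[i][j] for i in range(4)) for j in range(3)]

W_uncertain = min(col)                                   # j uncertain
W_stochastic = sum(theta[j] * col[j] for j in range(3))  # j stochastic
# For this hypothetical matrix, W_uncertain = 7/3 and W_stochastic = 97/36:
# knowing the distribution of j raises the efficiency estimate.
```
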

Assume that the OS knows the probability distribution θ of the vector of stochastic factors z.

Definition: A strategy x̃_a ∈ M is called absolute optimal if, for every y ∈ N,

F̄(x̃_a, y, θ) = max_{x̃∈M} F̄(x̃, y, θ).

It is clear that, for every value of the vector of uncontrollable uncertain factors, an absolute optimal strategy is the best strategy among all strategies in M.

Theorem 1: If there exists an absolute optimal strategy x̃_a ∈ M, then x̃_a is an optimal strategy of the OS and

F_Γ(M) = inf_{y∈N} max_{x̃∈M} F̄(x̃, y, θ).

In particular, if for all y ∈ N and z ∈ Z the maxima max_{x∈M_0} F̄(x, y, θ) and max_{x∈M_0} F(x, y, z) exist, then

F_y := F_Γ(M_y) = inf_{y∈N} max_{x∈M_0} F̄(x, y, θ)

and

F̃ := F_Γ(M̃) = inf_{y∈N} ∫_Z max_{x∈M_0} F(x, y, z) dθ(z).

Let us now suppose that for some y ∈ N the maximum max_{x̃∈M} F̄(x̃, y, θ) is not attained, but the corresponding supremum is finite. Then the notion of an ε-absolute optimal strategy can be introduced.

Definition: For every sufficiently small ε > 0, a strategy x̃_a^ε ∈ M is said to be ε-absolute optimal if, for all y ∈ N,

F̄(x̃_a^ε, y, θ) ≥ sup_{x̃∈M} F̄(x̃, y, θ) − ε.

Unfortunately, as the next example shows, an ε-absolute optimal strategy may not exist at all.

Example 1: Let i ∈ M = M_0 = {1, 2} be a controllable factor, let j ∈ N = {1, 2} be an uncontrollable factor, and suppose that there is no stochastic factor in the operation. The criterion of the operation is given by the following matrix:

[F(i, j)]_{2×2} =
  −1   1
   1  −1

Then for every ε ∈ [0, 2) there is no ε-absolute optimal strategy in the set M_0. Indeed, when i = 1, for all ε ∈ [0, 2),

F(1, 1) = −1 < 1 − ε = F(2, 1) − ε = max_{i∈M_0} F(i, 1) − ε,

i.e., for j = 1 the inequality from the definition of an ε-absolute optimal strategy is not satisfied. Thus, the strategy i = 1 is not an ε-absolute optimal strategy. Show that i = 2 is also not an ε-absolute optimal strategy for any ε ∈ [0, 2).

Theorem 2: If x̃_a^ε ∈ M is an ε-absolute optimal strategy, then it is an ε-optimal strategy.

Let us assume that for every ε > 0 there exists an ε-absolute optimal strategy x̃_a^ε ∈ M. Then

F_Γ(M) = inf_{y∈N} sup_{x̃∈M} F̄(x̃, y, θ).

In particular, if for all y ∈ N and z ∈ Z the values sup_{x∈M_0} F̄(x, y, θ) and sup_{x∈M_0} F(x, y, z) are finite, then

F_y := F_Γ(M_y) = inf_{y∈N} sup_{x∈M_0} F̄(x, y, θ)

and

F̃ := F_Γ(M̃) = inf_{y∈N} ∫_Z sup_{x∈M_0} F(x, y, z) dθ(z).

Finally, let us compare the best guaranteed results for the following sets of strategies: M_0, {ϕ}, M_y and M̃. Denote

F_Γ^0 := F_Γ(M_0) = max_{x∈M_0} min_{y∈N} F̄(x, y, θ).
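Example 1's claim is easy to check mechanically. The sketch below assumes the matrix reconstructed from the inequalities in the text, F = [[−1, 1], [1, −1]]; each pure strategy falls short of the column maximum by exactly 2 in one column, so no ε < 2 works:

```python
# Check of Example 1: neither pure strategy is eps-absolute optimal for any
# eps in [0, 2), with the matrix reconstructed from the text's inequalities.

F = [[-1, 1], [1, -1]]

def is_eps_absolute_optimal(F, i, eps):
    """True if F[i][j] >= max_i' F[i'][j] - eps for every column j."""
    cols = range(len(F[0]))
    return all(F[i][j] >= max(row[j] for row in F) - eps for j in cols)

for eps in (0.0, 1.0, 1.999):
    assert not is_eps_absolute_optimal(F, 0, eps)
    assert not is_eps_absolute_optimal(F, 1, eps)
assert is_eps_absolute_optimal(F, 0, 2.0)  # the gap closes exactly at eps = 2
```
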

As before, F_m = sup_{ϕ∈{ϕ}} W(ϕ); F_y = F_Γ(M_y); F̃ = F_Γ(M̃).

Theorem 3: The following inequalities are satisfied:

F_Γ^0 ≤ F_m ≤ F_y ≤ F̃.

Proof: Since M_0 ⊂ {ϕ},

F_Γ^0 = sup_{x∈M_0} inf_{y∈N} ∫_Z F(x, y, z) dθ(z) ≤ sup_{ϕ∈{ϕ}} inf_{y∈N} ∫_{M_0} ∫_Z F(x, y, z) dθ(z) dϕ(x) = F_m,

and, for every ϕ ∈ {ϕ},

W(ϕ) = inf_{y∈N} ∫_{M_0} ∫_Z F(x, y, z) dθ(z) dϕ(x) ≤ inf_{y∈N} sup_{x∈M_0} F̄(x, y, θ) = F_y.

Hence, F_m ≤ F_y. Finally, M_y ⊂ M̃ implies F_y ≤ F̃. ∎

Remark (interpretation of Theorem 3): Let y be the strategy of an opponent whose goal is opposite to the goal of the OS. Denote

P_E := F_m − F_Γ^0.

The value of P_E can be considered as the price that the opponent would be willing to pay for information about the strategy x of the OS. Indeed, if the opponent knows the realization of x, then the OS can guarantee for itself only the result F_Γ^0: the efficiency of any mixed strategy ϕ,

W(ϕ) = ∫_{M_0} min_{y∈N} F̄(x, y, θ) dϕ(x),

is not greater than F_Γ^0. Similarly, the value

P_OS := F_y − F_m

can be considered as the price that the OS would be willing to pay for information about the strategy y of the opponent. Indeed, if information about the strategy y is expected, then the OS obtains the result F_y, and otherwise only the result F_m.

Example 2: Let i ∈ M_0 = {1, 2, 3} be a controllable factor, let j ∈ N = {1, 2, 3, 4} be an uncontrollable factor, and suppose that there is no stochastic factor in the operation. The criterion of the operation is given by the matrix [F(i, j)]_{3×4}.

a) M = M_y is the set of all functions ĩ : N → M_0. There are 3^4 = 81 different strategies. According to Theorem 1,

F_y = F_Γ(M_y) = min_{1≤j≤4} max_{1≤i≤3} F(i, j) = 2.

A strategy ĩ^0 is optimal if min_{j∈N} F(ĩ^0(j), j) = 2, i.e., F(ĩ^0(j), j) ≥ 2 for all j ∈ N. So the set of all optimal strategies is: ĩ^0(1) ∈ {2, 3}; ĩ^0(2) ∈ {1, 2, 3}; ĩ^0(3) ∈ {1, 3}; ĩ^0(4) ∈ {1, 2}. There is a total of 2 · 3 · 2 · 2 = 24 optimal strategies. Let us now find all absolute optimal strategies ĩ_a:

F(ĩ_a(j), j) = max_{i∈M_0} F(i, j) for all j ∈ N.

Hence, the possible absolute optimal strategies ĩ_a are: ĩ_a(1) = 3; ĩ_a(2) ∈ {1, 2, 3}; ĩ_a(3) = 1; ĩ_a(4) ∈ {1, 2}. In total, there are 1 · 3 · 1 · 2 = 6 of them.

b) Suppose that at the beginning of the operation the OS will know to which of the sets N_1 = {1, 2} or N_2 = {3, 4} the value of the uncontrollable factor j belongs. Then a strategy of the OS is described as

ĩ(j) = i_1, if j ∈ N_1;  i_2, if j ∈ N_2.

The set M consists of 9 strategies. It contains an absolute optimal strategy ĩ_a: i_1^a = 3 and i_2^a = 1. By Theorem 1, F_Γ(M) = 2. There are two optimal strategies in the set M: i_1^0 ∈ {2, 3}, i_2^0 = 1.

Problem 2: In Example 2, for the cases (a) and (b), determine the optimal strategies of the OS on the set M_y.

4 Hierarchic games of two persons (principal-agent games)

Assume that there is no stochastic factor, and that in the operation we want to maximize a function F(x, y), where x ∈ X denotes our control and y ∈ Y denotes an uncontrollable uncertain factor. If we do not know the other participant's objective, then we consider this operation as a game against Nature. According to the Maximin Principle, we first calculate the function

W(x) = inf_{y∈Y} F(x, y)

and then determine sup_{x∈X} inf_{y∈Y} F(x, y) by solving the optimization problem

W(x) → sup_{x∈X}.

However, in conflict situations with two participants the interests of the participants are not always opposite. Consider a two-person game in normal form, that is, a game in which every participant selects her/his strategy without knowing the selection of the opponent. A pair (x, y) of strategies is called

a game situation. Let the payoff of Player 1 (i.e., the operating side) be described by a function F(x, y), and the payoff of Player 2 (i.e., a partner) by a function G(x, y), where G(x, y) ≢ −F(x, y). Assume each player is trying to maximize her/his payoff by choosing strategies x ∈ X and y ∈ Y, respectively. Therefore, the two-person game in normal form is given by Γ = ⟨X, Y, F(x, y), G(x, y)⟩. We shall be considering two-person games with a turn sequence and voluntary information exchange between the players. These types of games are called hierarchic games (also, games with non-antagonistic interests, or principal-agent games). It is supposed that Player 1 has the right to the first turn, and we will be trying to determine the best guaranteed result that Player 1 (the operating side) is able to obtain. For simplicity, let F(x, y), G(x, y) ∈ C(X × Y), where X and Y are compact sets. Depending on Player 1's information about the strategies of Player 2, it is possible to formulate different games.

Game Γ_1: Player 1 does not expect to have any information about the strategies of Player 2; she has to select her strategy x ∈ X and inform Player 2 about it. This will be denoted schematically as

P1 --x--> P2 --> y.

Then Player 2 will be able to choose her strategies in the form g : X → Y. Denote the set of such strategies by {g}. Hence, Game Γ_1 in normal form is given as Γ_1 = ⟨X, {g}, F(x, g), G(x, g)⟩, where F(x, g) = F(x, g(x)) and G(x, g) = G(x, g(x)). Player 2, knowing x, will choose a strategy

y ∈ Y(x) = Arg max_{y∈Y} G(x, y),

that is, a strategy maximizing her payoff G(x, y). Since Player 1 knows G(x, y), s/he is able to determine the set Y(x). (The set Y(x) is called the set of indifference of Player 2 under the given constraints.) Then the value

W(x) = min_{y∈Y(x)} F(x, y)

is said to be the efficiency estimate for the strategy x. Consequently, the best guaranteed result of Player 1 is determined as

F_1 = sup_{x∈X} W(x) = sup_{x∈X} min_{y∈Y(x)} F(x, y).

Note that since Y(x) ⊆ Y, the reward of Player 1 may be greater than a simple maximin.
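For finite strategy sets, F_1 is computable by direct enumeration. The sketch below runs the Γ_1 computation on the matrices of Example 3 later in the text (reconstructed with the dropped minus signs restored, an assumption of this sketch):

```python
# Game Gamma_1 for finite X, Y: Player 1 announces i, Player 2 best-responds
# within Y(i) = Argmax_j B[i][j], Player 1 guards against the worst tie-break.

A = [[3, 6, 8], [4, 3, 2], [7, -5, -1]]   # Player 1's payoffs (Example 3)
B = [[7, 4, 3], [7, 7, 3], [4, 6, 6]]     # Player 2's payoffs (Example 3)

def gamma1(A, B):
    best = float("-inf")
    for i in range(len(A)):
        top = max(B[i])                                   # Player 2's best vs i
        Y_i = [j for j, b in enumerate(B[i]) if b == top] # indifference set
        best = max(best, min(A[i][j] for j in Y_i))       # W(i)
    return best

F1 = gamma1(A, B)   # evaluates to 3, matching Example 3
```
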

Game Γ_2: Player 1 expects to have (and will really have) full information about the strategy y ∈ Y of Player 2. Since she has the right to the first turn, she chooses her strategy as a function f : Y → X and informs Player 2 about it. Denote the set of such strategies by {f}. The scheme of information exchange in Game Γ_2 may be given as

P1 --f--> P2 --y--> P1: x = f(y).

Player 2, knowing f, will choose her strategy y from the set

Y(f) = Arg max_{y∈Y} G(f(y), y).

Note that the set Y(f) may be empty unless f(y) is continuous on Y; when Y(f) is empty, Player 2 can choose any strategy y ∈ Y. Define

Y*(f) = Y(f), if Y(f) ≠ ∅;  Y, if Y(f) = ∅.

Then the optimal strategy of Player 2 will be a choice of any y ∈ Y*(f), and the efficiency estimate for the strategy f of Player 1 is given as

W(f) = inf_{y∈Y*(f)} F(f(y), y).

Consequently, the best guaranteed result of Player 1 is determined as

F_2 = sup_{f∈{f}} W(f) = sup_{f∈{f}} inf_{y∈Y*(f)} F(f(y), y).

Game Γ_3: Suppose that Player 2 plays Game Γ_2 against Player 1, i.e., Player 2, expecting to have full information about the strategy x ∈ X of Player 1, selects her strategy as a function g : X → Y and informs Player 1 about it. However, as before, the right to the first turn belongs to Player 1. So the latter, knowing g, selects some operator f_1 : {g} → X and informs Player 2 about it. The scheme of information exchange in Game Γ_3 may be described as

P1 --f_1--> P2 --g--> P1: x = f_1(g) --> y = g(x).

Correspondingly, the best guaranteed result of Player 1 is determined as

F_3 = sup_{f_1∈{f_1}} inf_{g∈Y*(f_1)} F(f_1(g), g),

where

Y*(f_1) = Y(f_1), if Y(f_1) ≠ ∅;  {g}, if Y(f_1) = ∅,

and

Y(f_1) = Arg max_{g∈{g}} G(f_1(g), g).

Therefore, to find the best guaranteed result of Player 1 in Games Γ_1, Γ_2 and Γ_3, it is necessary to calculate a so-called maximin with connected variables. Moreover, finding the best guaranteed result of Player 1 in Games Γ_2 and Γ_3 requires us to solve an optimization problem on a set of functions. Let us consider Game Γ_2. We will need the following notation:

G_2 = max_{y∈Y} min_{x∈X} G(x, y),

which is the best guaranteed result of Player 2 when Player 1, for every y ∈ Y, uses against her a so-called punishment strategy f_p(y):

f_p(y) ∈ Arg min_{x∈X} G(x, y), y ∈ Y;

E = Arg max_{y∈Y} min_{x∈X} G(x, y);

D = {(x, y) ∈ X × Y : G(x, y) > G_2};

K = sup_{(x,y)∈D} F(x, y), if D ≠ ∅; −∞, if D = ∅;

M = min_{y∈E} max_{x∈X} F(x, y).

Theorem 4 (Yu.B. Germeier): The best guaranteed result of Player 1 in Game Γ_2 is

F_2 = max{K, M}.
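For finite X and Y, Theorem 4 turns the functional maximin of Game Γ_2 into a finite computation over its ingredients G_2, E, D, K, M. A sketch, run on the matrices of Example 3 later in the text (reconstructed with the dropped minus signs restored, an assumption of this sketch):

```python
# Theorem 4 (Germeier) for finite bimatrix games:
# G2 = max_y min_x B; E = that argmax set; D = {(x, y): B > G2};
# K = max of A over D (-inf if D is empty); M = min over E of max_x A.

A = [[3, 6, 8], [4, 3, 2], [7, -5, -1]]   # Player 1's payoffs (Example 3)
B = [[7, 4, 3], [7, 7, 3], [4, 6, 6]]     # Player 2's payoffs (Example 3)

def germeier_F2(A, B):
    nx, ny = len(A), len(A[0])
    col_min = [min(B[i][j] for i in range(nx)) for j in range(ny)]
    G2 = max(col_min)                                     # P2's punishment value
    E = [j for j in range(ny) if col_min[j] == G2]
    D = [(i, j) for i in range(nx) for j in range(ny) if B[i][j] > G2]
    K = max((A[i][j] for i, j in D), default=float("-inf"))
    M = min(max(A[i][j] for i in range(nx)) for j in E)
    return max(K, M)

F2 = germeier_F2(A, B)   # evaluates to max{4, 6} = 6, matching Example 3
```
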

Now consider Game Γ_3. Define:

G_3 = min_{x∈X} max_{y∈Y} G(x, y) = max_{y∈Y} G(x_p, y),

which represents the best guaranteed result of Player 2 when Player 1 uses against her the punishment strategy x_p;

D̃ = {(x, y) ∈ X × Y : G(x, y) > G_3};

K̃ = sup_{(x,y)∈D̃} F(x, y), if D̃ ≠ ∅; −∞, if D̃ = ∅.

Theorem 5 (N.S. Kukushkin): The best guaranteed result of Player 1 in Game Γ_3 is

F_3 = max{K̃, F_1}.
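Theorem 5 admits the same kind of finite computation. The sketch below again assumes the reconstructed Example 3 matrices; F_1 is computed as in Game Γ_1:

```python
# Theorem 5 (Kukushkin) for finite bimatrix games:
# G3 = min_x max_y B (Player 2's value against the punishment strategy x_p),
# D~ = {(x, y): B > G3}, K~ = max of A over D~, and F3 = max{K~, F1}.

A = [[3, 6, 8], [4, 3, 2], [7, -5, -1]]   # Player 1's payoffs (Example 3)
B = [[7, 4, 3], [7, 7, 3], [4, 6, 6]]     # Player 2's payoffs (Example 3)

def gamma1(A, B):
    """F1 of Game Gamma_1: worst tie-break over Player 2's best responses."""
    best = float("-inf")
    for i in range(len(A)):
        top = max(B[i])
        best = max(best, min(A[i][j] for j, b in enumerate(B[i]) if b == top))
    return best

def kukushkin_F3(A, B):
    G3 = min(max(row) for row in B)
    D = [(i, j) for i, row in enumerate(B) for j, b in enumerate(row) if b > G3]
    K = max((A[i][j] for i, j in D), default=float("-inf"))
    return max(K, gamma1(A, B))

F3 = kukushkin_F3(A, B)   # evaluates to max{4, 3} = 4, matching Example 3
```
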

Continuing the recursion defined above, it is possible to formulate Game Γ_4, Game Γ_5, …, Game Γ_k, …. However, it was shown by Yu.B. Germeier and N.S. Kukushkin in the 1970s [1, 2, 3] that

F_{2n} = F_2 and F_{2n+1} = F_3 for all n = 2, 3, ….

Moreover, it is shown that

F_1 ≤ F_3 ≤ F_2.

The last inequality is a consequence of the fact that in Game Γ_3 Player 2 also receives (and might be able to use) some information about the strategies of Player 1. Therefore, in a hierarchic game (principal-agent game, two-person game with non-antagonistic interests):

1. For both players it is better to be mutually informed.
2. The information exchange between the players is useful only up to a certain level: beyond that level, further collection and/or exchange of information, i.e., additional complication of the situation after a certain depth of recursion, will not increase the payoff of Player 1 (the operating side).

Example 3: Consider the bimatrix game:

(3, 7)   (6, 4)   (8, 3)
(4, 7)   (3, 7)   (2, 3)
(7, 4)   (−5, 6)  (−1, 6)

Here the criteria of Player 1 and Player 2 are given by the matrices

A =
  3   6   8
  4   3   2
  7  −5  −1

and

B =
  7  4  3
  7  7  3
  4  6  6

respectively, and their sets of strategies are X = Y = {1, 2, 3}. Let us formulate Games Γ_1, Γ_2 and Γ_3 for this bimatrix game.

Game Γ_1:

F_1 = max_{i∈X} min_{j∈Y(i)} a_ij = max_{1≤i≤3} min_{j∈Y(i)} a_ij = max_{1≤i≤3} W(i), where Y(i) = Arg max_{1≤j≤3} b_ij.

Since W(1) = W(2) = 3 and W(3) = −5, the best guaranteed result of Player 1 is F_1 = 3, and the optimal strategies are i = 1, 2.

Game Γ_2:

G_2 = max_{1≤j≤3} min_{1≤i≤3} b_ij = 4; E = Arg max_{1≤j≤3} min_{1≤i≤3} b_ij = {1, 2}, and the punishment strategy for j = 1 is i_p = 3;

D = {(i, j) : b_ij > 4} = {(1, 1), (2, 1), (2, 2), (3, 2), (3, 3)};

K = max_{(i,j)∈D} a_ij = 4; M = min_{j∈E} max_{i∈X} a_ij = min_{1≤j≤2} max_{1≤i≤3} a_ij = 6.

Hence, the best guaranteed result of Player 1 is F_2 = max{K, M} = max{4, 6} = 6, and the optimal strategy is

f*(j) = 3, if j = 1;  1, if j = 2, 3.

Game Γ_3:

G_3 = min_{1≤i≤3} max_{1≤j≤3} b_ij = max_{1≤j≤3} b_{i_p j} = max_{1≤j≤3} b_3j = 6;

D̃ = {(i, j) : b_ij > 6} = {(1, 1), (2, 1), (2, 2)}. So, since K̃ = max_{(i,j)∈D̃} a_ij = 4, the best guaranteed result of Player 1 is F_3 = max{K̃, F_1} = max{4, 3} = 4, and the optimal strategy is

f_1*(g) = 2, if g(2) = 1;  3, if g(2) ≠ 1.

Note that max_{1≤i≤3} min_{1≤j≤3} a_ij = 3 and min_{1≤j≤3} max_{1≤i≤3} a_ij = 6.

5 Decision making in operations with stochastic factors

As we have seen, the problem of determining the best guaranteed result sup_{x̃∈M} W(x̃) is not straightforward: in general, it requires maximizing W(x̃) on the set of functions M. However, it is substantially simplified if M contains so-called absolute optimal strategies. In this section it is assumed that the OS knows exactly the distribution function θ of the stochastic vector z.

Definition: A strategy x̃_a ∈ M is called absolute optimal if, for all y ∈ N,

F̄(x̃_a, y, θ) = max_{x̃∈M} F̄(x̃, y, θ).

When the maximum is not attained, for a given ε > 0, a strategy x̃_a^ε ∈ M is called ε-absolute optimal if, for all y ∈ N,

F̄(x̃_a^ε, y, θ) ≥ sup_{x̃∈M} F̄(x̃, y, θ) − ε.

It is not difficult to show that every absolute optimal strategy x̃_a is an optimal strategy and

W(x̃_a) = max_{x̃∈M} W(x̃) = inf_{y∈N} max_{x̃∈M} F̄(x̃, y, θ).

Similarly, every ε-absolute optimal strategy x̃_a^ε is an ε-optimal strategy, and

sup_{x̃∈M} W(x̃) = inf_{y∈N} sup_{x̃∈M} F̄(x̃, y, θ).

Unfortunately, absolute optimal strategies, as well as ε-absolute optimal strategies, rarely exist in real-world operations.

Let us now consider an operation containing an uncontrollable stochastic vector z with the given distribution function θ(z) on the set Z. Suppose that there are no uncertain factors in the operation and that the operating side wants to maximize a function F(x, z), where the vector x ∈ X is a strategy and z denotes the above-mentioned stochastic vector. It is clear that the statement "z is a stochastic vector" represents some additional information for the OS and, generally speaking, may allow the OS to get a result better than inf_{z∈Z} F(x, z). To evaluate the efficiency of a strategy x in operations with stochastic factors, the following two methods are often used:

1. The set Z is replaced by S ⊆ Z such that P[z ∈ S] ≥ 1 − β, where β > 0 is quite a small number, and the function

W(x) = inf_{z∈S} F(x, z)

is used to estimate the efficiency of the strategy x;

2. The objective function F(x, z) is replaced by the new objective function

F̄(x) = ∫_Z F(x, z) dθ(z).

It is clear that both methods entail a certain risk, and this risk may be quite large if the operation is not repeated a sufficient number of times. Hence, by using either of these methods we introduce a new element into the operation: the consent to a certain risk.

Now we will accept the frequency definition of probability. Stated simply, the probability of an event is assumed to be the frequency of this event occurring in a very long series of experiments. More formally, it is supposed that probability is defined as a certain limit when the operation is repeated infinitely often. We assume that:

1. The operation may be repeated infinitely, that is, the operating side can choose a strategy x ∈ X any number of times, and values of the vector z will appear the same number of times. Let us denote a sequence of occurred values of z by z̄, i.e., z̄ = {z^1, z^2, z^3, …, z^N, …}, where z^i ∈ Z denotes the uncontrollable factor at the i-th stage (repetition) of the operation, i = 1, 2, …, N, …;

2. In this infinitely repeated operation (we will call it the extended operation) the uncontrollable factor, instead of the vector z, is the infinite sequence z̄; this may be any random sequence, with the single constraint that the limit of the realization (occurrence) frequency of each vector z_j exists and is equal to p_j. So, if we denote by N the current number of repetitions of the operation and by χ_j(z) the characteristic function

χ_j(z) = 1, if z = z_j;  0, if z ≠ z_j,

then, for all j such that z_j ∈ Z, the limit

lim_{N→∞} (1/N) Σ_{i=1}^N χ_j(z^i)

exists and is equal to p_j.

Therefore, in our setting the possibility of using stochastic information about z is connected to the possibility (and reasonableness) of the transition from the original operation to the extended operation with infinitely many repetitions. Suppose that the objective function F(x, z) is bounded, that is, there exists a positive real number C such that −C < F(x, z) < C for all x ∈ X ⊆ R^n and all stochastic vectors z ∈ Z, with P(z = z_j) = p_j, j = 1, 2, …, and Σ_{j: z_j∈Z} p_j = 1.

Assume that the operating side has agreed to pass from the original one-stage operation to the extended operation. Then, in the extended operation, the set of its strategies is defined as

X̄ = {x̄ = {x^i}_{i=1}^∞ : x^i ∈ X, i = 1, 2, 3, …},

where x^i is the choice of the OS at stage i; and the range of values of the uncontrollable stochastic factor is

Z̄ = {z̄ = {z^i}_{i=1}^∞ : z^i ∈ Z and lim_{N→∞} (1/N) Σ_{i=1}^N χ_j(z^i) = p_j, j = 1, 2, 3, …},

where z^i denotes the uncontrollable stochastic factor z at stage i. Define the average value of the objective function over N steps by

F̄_N(x^1, …, x^N; z^1, …, z^N) = (1/N) Σ_{i=1}^N F(x^i, z^i),

and assume that in the extended operation the operating side wants to maximize

F̄(x̄; z̄) = lim inf_{N→∞} F̄_N(x^1, …, x^N; z^1, …, z^N) = lim inf_{N→∞} (1/N) Σ_{i=1}^N F(x^i, z^i).   (1)

Under these conditions it has been proved that

sup_{x̄∈X̄} inf_{z̄∈Z̄} F̄(x̄; z̄) = sup_{x∈X} E F(x, z).   (2)

Therefore, we have the following theorem:

Theorem 6: If, in the extended operation, the OS wants to maximize the lower limit (1), then its best guaranteed result is determined as the optimal objective value of the problem sup_{x∈X} E F(x, z), and the optimal strategy is to choose at each stage of the operation some x* ∈ X such that

E F(x*, z) = max_{x∈X} E F(x, z),

if the maximum is attained on X, or the corresponding ε-optimal strategy otherwise.
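A small illustration of Theorem 6 with hypothetical data (the matrix F and the probabilities p are illustrative assumptions): for finite X and Z the best guaranteed result of the extended operation is max_x E F(x, z), and repeating the maximizing x makes the running average of the criterion approach that value.

```python
# Theorem 6, sketched: stage-wise repetition of the x maximizing E F(x, z)
# drives the average criterion toward sup_x E F(x, z).

import random

F = [[2, 8], [5, 4]]   # hypothetical F[x][z], two strategies, two outcomes
p = (0.25, 0.75)       # hypothetical limit frequencies P(z = z_j)

EF = [sum(pj * F[x][j] for j, pj in enumerate(p)) for x in range(2)]
x_star = max(range(2), key=lambda x: EF[x])   # optimal stage-wise choice

random.seed(0)
N = 200_000
avg = sum(F[x_star][0 if random.random() < p[0] else 1] for _ in range(N)) / N
# avg is close to max_x E F(x, z) (here 6.5), illustrating the limit in (1)-(2)
```
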

Next, assume that in the extended operation the OS is interested in the maximization of the functional
$$\Phi(\bar{x}; \bar{z}) = \liminf_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \chi(x^i, z^i), \qquad (3)$$
where
$$\chi(x, z) = \begin{cases} 1, & \text{if } F(x, z) \geq u_0, \\ 0, & \text{if } F(x, z) < u_0, \end{cases} \qquad (4)$$
and $u_0$ is some given value of $F(x, z)$, with $u_0 \in (-C, C)$. Then, applying Theorem 6 to the operation with objective function (3), we obtain
$$\sup_{\bar{x} \in \bar{X}} \inf_{\bar{z} \in \bar{Z}} \Phi(\bar{x}; \bar{z}) = \sup_{x \in X} \mathbb{E}\chi(x, z).$$
On the other hand, from definition (4),
$$\mathbb{E}\chi(x, z) = \sum_{j=1}^{\infty} p_j \chi(x, z_j) = \sum_{j \in J(u_0, x)} p_j, \qquad (5)$$
where
$$J(u_0, x) = \{\, j \in \mathbb{Z}_+ \mid F(x, z_j) \geq u_0 \,\}.$$
Consequently,
$$\sup_{\bar{x} \in \bar{X}} \inf_{\bar{z} \in \bar{Z}} \Phi(\bar{x}; \bar{z}) = \sup_{x \in X} \sum_{j \in J(u_0, x)} p_j. \qquad (6)$$
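For a finite scenario set, the right-hand side of (6) is easy to compute directly: for each $x$, sum the probabilities of the scenarios on which $F(x, z_j) \geq u_0$, and take the best $x$. The following sketch uses hypothetical data, chosen only to illustrate the computation.

```python
# Hypothetical instance: two strategies, three scenarios, a level u_0.
X = [0, 1]
p = [0.5, 0.3, 0.2]                    # p_j for z_1, z_2, z_3
F = {(0, 0): 2.0, (0, 1): 0.5, (0, 2): -1.0,
     (1, 0): 1.2, (1, 1): 1.1, (1, 2): -2.0}
u0 = 1.0

def prob_exceed(x):
    # E chi(x, z) = sum of p_j over J(u0, x) = { j : F(x, z_j) >= u0 }.
    return sum(pj for j, pj in enumerate(p) if F[(x, j)] >= u0)

best = max(X, key=prob_exceed)
assert abs(prob_exceed(0) - 0.5) < 1e-9   # only z_1 reaches the level
assert abs(prob_exceed(1) - 0.8) < 1e-9   # z_1 and z_2 reach it
```

Note that the strategy maximizing the probability of exceeding the level need not maximize the expected value of $F$ itself; the two criteria select different strategies in general.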

Therefore, we have the following corollary.

Corollary 1: If, in the extended operation, the OS wants to maximize the functional (3), then its best guaranteed result is the supremum over $x \in X$ of the value (5), and its optimal strategy is to choose, at each stage of the operation, some $\bar{x} \in X$ such that
$$\mathbb{E}\chi(\bar{x}, z) = \max_{x \in X} \mathbb{E}\chi(x, z),$$
if the maximum is attained on $X$, or the corresponding $\varepsilon$-optimal strategy otherwise.

It is not difficult to see that for every $x \in X$ the value (5) is the probability of exceeding the given level $u_0$. At the same time, it is worth noting that although result (6) is a particular case of (2), it is significantly important for applications.

Let us now assume that the level $u$ is not fixed. Then
$$\chi(x, z, u) = \begin{cases} 1, & \text{if } F(x, z) \geq u, \\ 0, & \text{if } F(x, z) < u. \end{cases} \qquad (7)$$
Denote
$$\Phi(\bar{x}; \bar{z}; u) = \liminf_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \chi(x^i, z^i, u)$$
and consider the functional
$$\Omega(u) = \sup_{\bar{x} \in \bar{X}} \inf_{\bar{z} \in \bar{Z}} \Phi(\bar{x}; \bar{z}; u).$$

Suppose that in the extended operation the Operating Side is interested in the maximization of the value $u$ subject to the constraint $\Omega(u) \geq p$, that is, in solving the optimization problem
$$\begin{cases} u \to \sup, \\ \Omega(u) \geq p, \end{cases} \qquad (8)$$
where $p \in (0, 1)$ is a given probability. Let $u^*$ be the solution to problem (8). Then, if we denote $U = \{\, u \mid \Omega(u) \geq p \,\}$, we obtain $u^* = \sup U$. Therefore, under these conditions we have the following theorem.

Theorem 7:
$$u^* = \sup_{x \in X} v(x), \quad \text{where} \quad v(x) = \sup_{S \subseteq Z:\, P(S) \geq p} \ \inf_{z \in S} F(x, z).$$

It is not difficult to see that, in the extended operation, solving problem (8) is equivalent to maximizing the value of $u$ subject to the condition that the supremum over $x \in X$ of the probability that $F(x, z)$ exceeds the level $u$ is not less than the given $p$, for $z \in Z$. So it is clear that in this case the Operating Side is maximizing the level with a given guarantee. Under these circumstances, the efficiency of a strategy $x$ can be evaluated by the function $v(x)$, and the best guaranteed result equals $\sup_{x \in X} v(x)$. At the same time, the optimization problem
$$v(x) \to \sup_{x \in X}$$
is equivalent to the problem of finding
$$\sup_{S \subseteq Z:\, P(S) \geq p} \ \sup_{x \in X} \ \omega$$

subject to the constraints
$$\omega \leq F(x, z_j), \quad j \in J_S,$$
where $J_S = \{\, j \geq 1 \mid z_j \in S \,\}$. Note that the maximin problem has thus been reduced to a constrained maximization problem.
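For a finite scenario set, $v(x)$ from Theorem 7 can be computed by brute force over the subsets $S$ with $P(S) \geq p$, and the supremum is attained by greedily collecting the scenarios with the largest values of $F(x, z_j)$ until their total probability reaches $p$. The sketch below (hypothetical data, `p_star` standing for the given probability $p$) checks the two computations against each other.

```python
from itertools import combinations

# Hypothetical instance: three scenarios, a fixed strategy x, level p_star.
p = [0.5, 0.3, 0.2]
Fx = [1.0, 3.0, 2.0]          # F(x, z_j) for the fixed x
p_star = 0.5
EPS = 1e-12                   # tolerance for float probability sums

def v_bruteforce():
    # v(x) = sup over S with P(S) >= p_star of min_{z in S} F(x, z).
    best = float("-inf")
    idx = range(len(p))
    for k in range(1, len(p) + 1):
        for S in combinations(idx, k):
            if sum(p[j] for j in S) >= p_star - EPS:
                best = max(best, min(Fx[j] for j in S))
    return best

def v_greedy():
    # Take scenarios in decreasing order of F(x, z_j) until P(S) >= p_star.
    order = sorted(range(len(p)), key=lambda j: -Fx[j])
    total, level = 0.0, float("inf")
    for j in order:
        total += p[j]
        level = min(level, Fx[j])
        if total >= p_star - EPS:
            return level

assert v_bruteforce() == v_greedy() == 2.0
```

In effect $v(x)$ is a quantile-type guarantee: the highest level that $F(x, z)$ reaches with probability at least $p$.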

6 Applications to a two-stage stochastic programming problem

Let the profit of an enterprise from the realization of a strategy (plan) $x$ and its correction $r$, for every value of the stochastic factor $z \in Z$, be determined by the value of the function
$$F(x, r, z), \qquad (9)$$
where the plan $x$ and its correction $r$ must, for all $z \in Z$, satisfy the constraints
$$f_k(x, r, z) \leq 0, \quad k = 1, \ldots, m, \qquad (10)$$
$$x \in X, \quad r \in R,$$
where $x = (x_1, x_2, \ldots, x_n)$ and $r = (r_1, r_2, \ldots, r_n)$ are strategies of the enterprise and $z$ is an uncontrollable stochastic factor.

Denote by $r(x, z)$ the correction $r$ that maximizes (9) subject to the constraints (10) for given values of $x$ and $z$. Then the anticipated (expected) profit from the realization of plan $x$ and correction $r(x, z)$ is
$$\mathbb{E}F(x, r(x, z), z). \qquad (11)$$
The problem is thus reduced to finding an $x \in X$ that maximizes the function (11), i.e. to the optimization problem
$$\mathbb{E}F(x, r(x, z), z) \to \sup_{x \in X}. \qquad (12)$$

This stochastic programming problem can be considered as a zero-sum game, where Player 1 (the enterprise) tries to maximize $F(x, r(x, z), z)$ by choosing $x \in X$, while Player 2 (Nature) tries to minimize $F(x, r(x, z), z)$ by choosing $z \in Z$.

Consider this problem under the frequency definition of probability for the probabilistic characteristics of the stochastic factor $z$. Assume that the set $Z$ is at most countable, and let $J = \{\, j \geq 1 \mid z_j \in Z \,\}$. Suppose that the operation of choosing $x$ may be repeated infinitely, and at each step $i$, when the enterprise makes its choice $x^i$, some $z^i = z_j \in Z$ occurs, such that
$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \chi_j(z^i) = p_j, \quad j \in J, \qquad \text{with } \sum_{j \in J} p_j = 1,$$
where $N$ is the current number of repetitions of the operation.

If the enterprise intends to carry out the operation just once and is interested in the result of the one-step operation, its best guaranteed result is
$$\sup_{x \in X} \inf_{z \in Z} F(x, r(x, z), z).$$
On the other hand, if the enterprise agrees to transfer from the one-step operation to the infinitely repeated extended operation with the objective function
$$\bar{F}(\bar{x}; \bar{z}) = \liminf_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} F(x^i, r(x^i, z^i), z^i),$$
where $\bar{x} \in \bar{X}$, $\bar{z} \in \bar{Z}$, and the sets $\bar{X}$ and $\bar{Z}$ are defined as in the previous section, then, applying Theorem 6, we find that the best guaranteed result is
$$\sup_{\bar{x} \in \bar{X}} \inf_{\bar{z} \in \bar{Z}} \bar{F}(\bar{x}; \bar{z}) = \sup_{x \in X} \mathbb{E}F(x, r(x, z), z).$$
Hence, the best guaranteed result of the enterprise in the extended operation is the optimal objective value of problem (12).

Further, if in the two-stage problem the enterprise wants to maximize the probability of exceeding a given level, it may construct an extended operation and formulate a problem similar to optimization problem (6); the solution to problem (6) will then determine the best guaranteed result and the optimal strategy in the extended operation. Finally, if in the two-stage problem the enterprise is interested in maximizing the level with a given guarantee, it may transfer to an
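A finite toy version of problem (12) can be computed directly (all functions and sets below are hypothetical, chosen only to illustrate the two-stage structure): for each plan $x$ and scenario $z$, the correction $r(x, z)$ maximizes the profit subject to the constraints, and then $x$ maximizes the expected corrected profit.

```python
# Hypothetical finite instance of the two-stage problem.
X = [0, 1]
R = [0, 1, 2]
Z = [0, 1]
p = [0.7, 0.3]

def F(x, r, z):
    # Illustrative profit function F(x, r, z).
    return 2.0 * x + r - 0.5 * r * r + (1.0 if z == x else 0.0)

def feasible(x, r, z):
    # Stand-in for the constraints f_k(x, r, z) <= 0.
    return x + r <= 2

def best_correction(x, z):
    # r(x, z): the feasible correction maximizing F for given x and z.
    return max((r for r in R if feasible(x, r, z)), key=lambda r: F(x, r, z))

def expected_profit(x):
    # E F(x, r(x, z), z), the criterion (11).
    return sum(pj * F(x, best_correction(x, zj), zj) for zj, pj in zip(Z, p))

x_star = max(X, key=expected_profit)   # solves the toy version of (12)
```

The point of the zero-sum interpretation is that under the frequency assumption this expectation-maximizing $x$ is also the best guaranteed choice in the extended operation.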

extended operation and formulate an optimization problem similar to (8). Correspondingly, the solution to problem (8) will determine the best guaranteed result and the optimal strategy in the extended operation.

Let us now assume that only the constraints of the problem contain the stochastic factor $z$. The enterprise's strategies, as before, are $x$ and $r$; suppose that the constraints on $x$ are known, but the constraints on $r$ depend on $z$, for instance being given in the form $r \in R(x, z)$. For simplicity assume that for every $z \in Z$ the maximum $\max_{r \in R(x, z)} F(x, r)$ is attained. Then, if the enterprise intends to carry out just the one-step operation, its best guaranteed result is
$$\sup_{x \in X} \inf_{z \in Z} \max_{r \in R(x, z)} F(x, r).$$
However, if stochastic information about $z$ is available and the enterprise intends to use it by transferring to the extended operation (for example, with an objective function of type (1)), it will need to solve the optimization problem
$$\sup_{x \in X} \sum_{j \in J} p_j \max_{r \in R(x, z_j)} F(x, r).$$
Additionally, when the original problem, besides $r \in R(x, z)$, also contains a constraint $g(x, z) \leq 0$, denote $R'(x, z) = \{\, r \mid r \in R(x, z),\ g(x, z) \leq 0 \,\}$ and suppose, for simplicity, that $R'(x, z) \neq \emptyset$ for all $x \in X$. Then in the original one-step operation the best guaranteed result is
$$\sup_{x \in X} \inf_{z \in Z} \max_{r \in R'(x, z)} F(x, r).$$
However, once the enterprise transfers to the extended operation, it will have to solve the optimization problem
$$\sup_{x \in X} \ \sup_{\bar{r} \in \bar{R}(x)} \ \sum_{j \in J} p_j F(x, r_j),$$
where
$$\bar{R}(x) = \Big\{\, \bar{r} = (r_j)_{j \in J} \;\Big|\; r_j \in R(x, z_j),\ \sum_{j \in J} p_j g(x, r_j) \leq 0 \,\Big\},$$
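The first of these extended-operation problems, with scenario-dependent feasible sets only, is straightforward to evaluate on a finite instance. The sketch below uses hypothetical data: a profit $F(x, r)$ that no longer depends on $z$ directly, and a feasible set $R(x, z)$ that shrinks in the unfavorable scenario.

```python
# Hypothetical instance: stochastic factor enters only the constraints.
X = [0, 1]
p = [0.6, 0.4]

def F(x, r):
    # Illustrative profit, independent of z.
    return x + 2.0 * r - r * r

def R(x, z):
    # Scenario-dependent feasible corrections: scenario z = 1 restricts r.
    return [0, 1, 2] if z == 0 else [r for r in [0, 1, 2] if r <= x]

def extended_value(x):
    # sum_j p_j * max_{r in R(x, z_j)} F(x, r)
    return sum(pj * max(F(x, r) for r in R(x, zj))
               for zj, pj in enumerate(p))

x_star = max(X, key=extended_value)
```

Here the plan $x = 1$ is preferred precisely because it keeps the good correction feasible in the restrictive scenario, which is the kind of trade-off the two-stage formulation captures.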

and the optimal behavior at the second stage of the two-stage stochastic programming problem is defined for all realizations of $z$ simultaneously. Although the problem becomes more complicated, the Operating Side may obtain a substantial increment in the value of the operation.

7 What can be done further

There is a lot of room for development and generalization of the suggested approach. Of particular interest is an extension of the suggested method to multi-step operations with stochastic factors. Indeed, it is impossible to step into the same river twice. Additional research also needs to be conducted for the case when the set of values of a stochastic factor is uncountable.



More information

Area I: Contract Theory Question (Econ 206)

Area I: Contract Theory Question (Econ 206) Theory Field Exam Winter 2011 Instructions You must complete two of the three areas (the areas being (I) contract theory, (II) game theory, and (III) psychology & economics). Be sure to indicate clearly

More information

Cowles Foundation for Research in Economics at Yale University

Cowles Foundation for Research in Economics at Yale University Cowles Foundation for Research in Economics at Yale University Cowles Foundation Discussion Paper No. 1904 Afriat from MaxMin John D. Geanakoplos August 2013 An author index to the working papers in the

More information

Tijmen Daniëls Universiteit van Amsterdam. Abstract

Tijmen Daniëls Universiteit van Amsterdam. Abstract Pure strategy dominance with quasiconcave utility functions Tijmen Daniëls Universiteit van Amsterdam Abstract By a result of Pearce (1984), in a finite strategic form game, the set of a player's serially

More information

In N we can do addition, but in order to do subtraction we need to extend N to the integers

In N we can do addition, but in order to do subtraction we need to extend N to the integers Chapter 1 The Real Numbers 1.1. Some Preliminaries Discussion: The Irrationality of 2. We begin with the natural numbers N = {1, 2, 3, }. In N we can do addition, but in order to do subtraction we need

More information

Change-point models and performance measures for sequential change detection

Change-point models and performance measures for sequential change detection Change-point models and performance measures for sequential change detection Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Greece moustaki@upatras.gr George V. Moustakides

More information

Birgit Rudloff Operations Research and Financial Engineering, Princeton University

Birgit Rudloff Operations Research and Financial Engineering, Princeton University TIME CONSISTENT RISK AVERSE DYNAMIC DECISION MODELS: AN ECONOMIC INTERPRETATION Birgit Rudloff Operations Research and Financial Engineering, Princeton University brudloff@princeton.edu Alexandre Street

More information

Robust Partially Observable Markov Decision Processes

Robust Partially Observable Markov Decision Processes Submitted to manuscript Robust Partially Observable Markov Decision Processes Mohammad Rasouli 1, Soroush Saghafian 2 1 Management Science and Engineering, Stanford University, Palo Alto, CA 2 Harvard

More information

Bayesian Persuasion Online Appendix

Bayesian Persuasion Online Appendix Bayesian Persuasion Online Appendix Emir Kamenica and Matthew Gentzkow University of Chicago June 2010 1 Persuasion mechanisms In this paper we study a particular game where Sender chooses a signal π whose

More information

(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define

(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define Homework, Real Analysis I, Fall, 2010. (1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define ρ(f, g) = 1 0 f(x) g(x) dx. Show that

More information

The Pre-nucleolus for Fuzzy Cooperative Games

The Pre-nucleolus for Fuzzy Cooperative Games Journal of Game Theory 207, 6(2): 43-5 DOI: 05923/jjgt207060203 The Pre-nucleolus for Fuzzy Cooperative Games Yeremia Maroutian 39, E98 Str Broolyn, NY 236 Abstract In this paper invented by D Schmeidler

More information

Chapter 3 Continuous Functions

Chapter 3 Continuous Functions Continuity is a very important concept in analysis. The tool that we shall use to study continuity will be sequences. There are important results concerning the subsets of the real numbers and the continuity

More information

Chapter 5. Measurable Functions

Chapter 5. Measurable Functions Chapter 5. Measurable Functions 1. Measurable Functions Let X be a nonempty set, and let S be a σ-algebra of subsets of X. Then (X, S) is a measurable space. A subset E of X is said to be measurable if

More information

DO FIVE OUT OF SIX ON EACH SET PROBLEM SET

DO FIVE OUT OF SIX ON EACH SET PROBLEM SET DO FIVE OUT OF SIX ON EACH SET PROBLEM SET 1. THE AXIOM OF FOUNDATION Early on in the book (page 6) it is indicated that throughout the formal development set is going to mean pure set, or set whose elements,

More information

Econ 2148, spring 2019 Statistical decision theory

Econ 2148, spring 2019 Statistical decision theory Econ 2148, spring 2019 Statistical decision theory Maximilian Kasy Department of Economics, Harvard University 1 / 53 Takeaways for this part of class 1. A general framework to think about what makes a

More information

Coherent Choice Functions Under Uncertainty* OUTLINE

Coherent Choice Functions Under Uncertainty* OUTLINE Coherent Choice Functions Under Uncertainty* Teddy Seidenfeld joint work with Jay Kadane and Mark Schervish Carnegie Mellon University OUTLINE 1. Preliminaries a. Coherent choice functions b. The framework

More information

The Folk Theorem for Finitely Repeated Games with Mixed Strategies

The Folk Theorem for Finitely Repeated Games with Mixed Strategies The Folk Theorem for Finitely Repeated Games with Mixed Strategies Olivier Gossner February 1994 Revised Version Abstract This paper proves a Folk Theorem for finitely repeated games with mixed strategies.

More information

Recitation 7: Uncertainty. Xincheng Qiu

Recitation 7: Uncertainty. Xincheng Qiu Econ 701A Fall 2018 University of Pennsylvania Recitation 7: Uncertainty Xincheng Qiu (qiux@sas.upenn.edu 1 Expected Utility Remark 1. Primitives: in the basic consumer theory, a preference relation is

More information

Towards Decision Making under Interval Uncertainty

Towards Decision Making under Interval Uncertainty Journal of Uncertain Systems Vol.1, No.3, pp.00-07, 018 Online at: www.us.org.uk Towards Decision Making under Interval Uncertainty Andrze Pownuk, Vladik Kreinovich Computational Science Program, University

More information

REPEATED GAMES. Jörgen Weibull. April 13, 2010

REPEATED GAMES. Jörgen Weibull. April 13, 2010 REPEATED GAMES Jörgen Weibull April 13, 2010 Q1: Can repetition induce cooperation? Peace and war Oligopolistic collusion Cooperation in the tragedy of the commons Q2: Can a game be repeated? Game protocols

More information

REVIEW OF ESSENTIAL MATH 346 TOPICS

REVIEW OF ESSENTIAL MATH 346 TOPICS REVIEW OF ESSENTIAL MATH 346 TOPICS 1. AXIOMATIC STRUCTURE OF R Doğan Çömez The real number system is a complete ordered field, i.e., it is a set R which is endowed with addition and multiplication operations

More information

1 Markov decision processes

1 Markov decision processes 2.997 Decision-Making in Large-Scale Systems February 4 MI, Spring 2004 Handout #1 Lecture Note 1 1 Markov decision processes In this class we will study discrete-time stochastic systems. We can describe

More information

Optimism in the Face of Uncertainty Should be Refutable

Optimism in the Face of Uncertainty Should be Refutable Optimism in the Face of Uncertainty Should be Refutable Ronald ORTNER Montanuniversität Leoben Department Mathematik und Informationstechnolgie Franz-Josef-Strasse 18, 8700 Leoben, Austria, Phone number:

More information

Persuading a Pessimist

Persuading a Pessimist Persuading a Pessimist Afshin Nikzad PRELIMINARY DRAFT Abstract While in practice most persuasion mechanisms have a simple structure, optimal signals in the Bayesian persuasion framework may not be so.

More information

Principal-Agent Games - Equilibria under Asymmetric Information -

Principal-Agent Games - Equilibria under Asymmetric Information - Principal-Agent Games - Equilibria under Asymmetric Information - Ulrich Horst 1 Humboldt-Universität zu Berlin Department of Mathematics and School of Business and Economics Work in progress - Comments

More information

On the Existence of Price Equilibrium in Economies with Excess Demand Functions

On the Existence of Price Equilibrium in Economies with Excess Demand Functions On the Existence of Price Equilibrium in Economies with Excess Demand Functions Guoqiang Tian Abstract This paper provides a price equilibrium existence theorem in economies where commodities may be indivisible

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

Cooperative assignment games with the inverse Monge property

Cooperative assignment games with the inverse Monge property Cooperative assignment games with the inverse Monge property F. Javier Martínez-de-Albéniz a,1, Carles Rafels a a Dep. de Matemàtica Econòmica, Financera i Actuarial Universitat de Barcelona Av. Diagonal

More information

Ergodicity and Non-Ergodicity in Economics

Ergodicity and Non-Ergodicity in Economics Abstract An stochastic system is called ergodic if it tends in probability to a limiting form that is independent of the initial conditions. Breakdown of ergodicity gives rise to path dependence. We illustrate

More information