The simplex method is strongly polynomial for deterministic Markov decision processes


Ian Post        Yinyu Ye

May 31, 2013

Abstract

We prove that the simplex method with the highest-gain/most-negative-reduced-cost pivoting rule converges in strongly polynomial time for deterministic Markov decision processes (MDPs) regardless of the discount factor. For a deterministic MDP with n states and m actions, we prove the simplex method runs in O(n^3 m^2 log^2 n) iterations if the discount factor is uniform and O(n^5 m^3 log^2 n) iterations if each action has a distinct discount factor. Previously the simplex method was known to run in polynomial time only for discounted MDPs where the discount was bounded away from 1 [Ye11]. Unlike in the discounted case, the algorithm does not greedily converge to the optimum, and we require a more complex measure of progress. We identify a set of layers in which the values of primal variables must lie and show that the simplex method always makes progress optimizing one layer, and when the upper layer is updated the algorithm makes a substantial amount of progress. In the case of nonuniform discounts, we define a polynomial number of milestone policies, and we prove that, while the objective function may not improve substantially overall, the value of at least one dual variable is always making progress toward some milestone, and the algorithm will reach the next milestone in a polynomial number of steps.

1 Introduction

Markov decision processes (MDPs) are a powerful tool for modeling repeated decision making in stochastic, dynamic environments. An MDP consists of a set of states and a set of actions that one may perform in each state. Based on the agent's actions it receives rewards and affects the future evolution of the process, and the agent attempts to maximize its reward over time (see Section 2 for a formal definition). MDPs are widely used in machine learning, robotics and control, operations research, economics, and related fields. See the books [Put94] and [Ber96] for a thorough overview. Solving MDPs is also an important problem theoretically.
Department of Combinatorics and Optimization, University of Waterloo. Research done while at Stanford University. ian@ianpost.org. Research supported by an NSF grant. We also acknowledge financial support from grant #FA from the U.S. Air Force Office of Scientific Research (AFOSR) and the Defense Advanced Research Projects Agency (DARPA).

Department of Management Science and Engineering, Stanford University. yinyu-ye@stanford.edu. Research supported in part from grant #FA from the U.S. Air Force Office of Scientific Research (AFOSR).

Optimizing an MDP can be formulated as a linear program (LP), and although these LPs possess extra structure that can be exploited by algorithms like Howard's policy iteration method [How60], they lie just beyond the point at which

our ability to solve LPs in strongly-polynomial time ends (and are a natural target for extending this ability), and they have proven to be hard in general for algorithms previously thought to be quite powerful, such as randomized simplex pivoting rules [FHZ11]. In practice [LDK95] MDPs are solved using policy iteration, which may be viewed as a parallel version of the simplex method with multiple simultaneous pivots, or value iteration [Bel57], an inexact approximation to policy iteration that is faster per iteration. If the discount factor γ, which determines the effective time horizon (see Section 2), is small, it has long been known that policy and value iteration will find an ɛ-approximation to the optimum [Bel57]. It is also well-known that value iteration may be exponential, but policy iteration resisted worst-case analysis for many years. It was conjectured to be strongly polynomial, but except for highly-restricted examples [Mad02] only exponential time bounds were known [MS99]. Building on results for parity games [Fri09], Fearnley recently gave an exponential lower bound [Fea10]. Friedmann, Hansen, and Zwick extended Fearnley's techniques to achieve sub-exponential lower bounds for randomized simplex pivoting rules [FHZ11] using MDPs, and Friedmann gave an exponential lower bound for MDPs using the least-entered pivoting rule [Fri11]. Melekopoglou and Condon proved several other simplex pivoting rules are exponential [MC94]. On the positive side, Ye designed a specialized interior-point method that is strongly polynomial in everything except the discount factor [Ye05]. Ye later proved that for discounted MDPs with n states and m actions, the simplex method with the most-negative-reduced-cost pivoting rule and, by extension, policy iteration, runs in time O(nm/(1 − γ) · log(n/(1 − γ))) on discounted MDPs, which is polynomial for fixed γ [Ye11].
Hansen, Miltersen, and Zwick improved the policy iteration bound to O(m/(1 − γ) · log(n/(1 − γ))) and extended it to value iteration as well as the strategy iteration algorithm for two-player turn-based stochastic games [HMZ11]. But the performance of policy iteration and simplex-style basis-exchange algorithms on MDPs remains poorly understood. Policy iteration, for instance, is conjectured to run in O(m) iterations on deterministic MDPs, but the best upper bounds are exponential, although a lower bound of Ω(m) is known [HZ10]. Improving our understanding of these algorithms is an important step in designing better ones with polynomial or even strongly-polynomial guarantees. Motivated by these questions, we analyze the simplex method with the most-negative-reduced-cost pivoting rule on deterministic MDPs. For a deterministic MDP with n states and m actions, we prove that the simplex method terminates in O(n^3 m^2 log^2 n) iterations regardless of the discount factor, and if each action has a distinct discount factor, then the algorithm runs in O(n^5 m^3 log^2 n) iterations. Our results do not extend to policy iteration, and we leave this as a challenging open question. Deterministic MDPs were previously known to be solvable in strongly polynomial time using specialized methods not applicable to general MDPs: minimum mean cycle algorithms [PT87] or, in the case of nonuniform discounts, by exploiting the property that the dual LP has only two variables per inequality [HN94]. The fastest known algorithm for uniformly discounted deterministic MDPs runs in time O(mn) [MTZ10]. However, these problems were not known to be solvable in polynomial time with the more-generic simplex method. More generally, we believe that our results help shed some light on how algorithms like simplex and policy iteration function on MDPs. Our proof techniques, particularly in the case of nonuniform discounts, may be of independent interest.
For uniformly discounted MDPs, we show that the values of the primal flux variables must lie within one of two intervals or layers of polynomial size, depending on whether an action is on a path or a cycle. Most iterations update variables in the smaller path layer, and we show these

converge rapidly to a locally optimal policy for the paths, at which point the algorithm must update the larger cycle layer and make a large amount of progress toward the optimum. Progress takes the form of many small improvements interspersed with a few much larger ones rather than uniform convergence. The nonuniform case is harder, and our measure of progress is unusual and, to the best of our knowledge, novel. We again define a set of intervals in which the values of variables on cycles must fall, and these define a collection of intermediate milestone or checkpoint values for each dual variable (the values of the states in the MDP). Whenever a variable enters a cycle layer, we argue that a corresponding dual variable is making progress toward the layer's milestone and will pass this value after enough updates. When each of these checkpoints has been passed, the algorithm must have reached the optimum. We believe some of these ideas may prove useful in other problems as well.

In Section 2 we formally define MDPs and describe a number of well-known properties that we require. In Section 3 we analyze the case of a uniform discount factor, and in Section 4 we extend these results to the nonuniform case.

2 Preliminaries

Many variations and extensions of MDPs have been defined, but we will study the following problem. A Markov decision process consists of a set of n states S and m actions A. Each action a is associated with a single state in which it can be performed, a reward r_a ∈ R for performing the action, and a probability distribution P_a over states to which the process will transition when using action a. We denote by P_{a,s} the probability of transitioning to state s when taking action a. There is at least one action usable in each state. Let r be the vector of rewards indexed by a with entries r_a, let A_s ⊆ A be the set of actions performable in state s, and let P be the n-by-m matrix with columns P_a and entries P_{a,s}. We will restrict the distributions P_a to be deterministic for all actions, in which case states may be thought of as nodes in a graph and actions as directed edges.
However, the results in this section apply to MDPs with stochastic transitions as well. At each time step, the MDP starts in some state s and performs an action a admissible in state s, at which point it receives the reward r_a and transitions to a new state according to the probability distribution P_a. We are given a discount factor γ < 1 as part of the input, and our goal is to choose actions to perform so as to maximize the expected discounted reward we accumulate over an infinite time horizon. The discount can be thought of as a stopping probability: at each time step the process ends with probability 1 − γ. Normally, the discount γ is uniform for the entire MDP, but in Section 4 we will allow each action to have a distinct discount γ_a. Due to the Markov property (transitions depend only on the current state and action), there is an optimal strategy that is memoryless and depends only on the current state. Let π be such a policy, a distribution of actions to perform for each state. This defines a Markov chain and a value for each state:

Definition 2.1. Let π be a policy, P^π be the n-by-n matrix where P^π_{s,s'} is the probability of transitioning from s to s' using π, and r^π the vector of expected rewards for each state according to the distribution of actions in π. The value vector v^π is indexed by states, and v^π_s is equal to the expected total discounted reward of starting in state s and following policy π. It is defined as

    v^π = Σ_{i ≥ 0} (γ (P^π)^T)^i r^π = (I − γ P^π)^{−T} r^π,

or equivalently by

    v^π = r^π + γ (P^π)^T v^π.    (1)
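The fixed-point form (1) can be checked numerically. Below is a minimal sketch (not from the paper) that computes the value vector of a fixed policy in a deterministic MDP by iterating the Bellman equation v ← r^π + γ(P^π)^T v; the three-state instance, its rewards, and the function name are made up for illustration.

```python
# Value vector of a fixed policy (Definition 2.1), computed by iterating
# v <- r^pi + gamma * (P^pi)^T v. For a deterministic policy, (P^pi)^T v
# just reads off the value of each state's unique successor.

def policy_value(succ, reward, gamma, iters=10000):
    """succ[s]: successor of s under the policy;
    reward[s]: reward of the action the policy uses in state s."""
    n = len(succ)
    v = [0.0] * n
    for _ in range(iters):
        v = [reward[s] + gamma * v[succ[s]] for s in range(n)]
    return v

# Three states: 0 -> 1 -> 2 -> 1, i.e., a path feeding a 2-cycle.
succ = [1, 2, 1]
reward = [1.0, 0.0, 2.0]
v = policy_value(succ, reward, 0.9)
```

Around the 2-cycle the closed form is v_1 = γ · r_2 / (1 − γ^2) = 1.8/0.19, which the iteration reproduces to floating-point accuracy, and each v_s satisfies equation (1) exactly at the fixed point.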

If policy π is randomized and uses two or more actions in some state s, then the value of v^π_s is an average of the values of performing each of the pure actions in s, and one of these is the largest. Therefore we can replace the distribution by a single action and only increase the value of the state. In the remainder of the paper we will restrict ourselves to pure policies in which a single action is taken in each state.

In addition to the value vector, a policy π also has an associated flux vector x^π that will play a critical role in our analysis. It acts as a kind of discounted flow. Suppose we start with a single unit of mass on every state and then run the Markov chain. At each time step we remove a 1 − γ fraction of the mass on each state and redistribute the remaining mass according to the policy π. Summing over all time steps, the total amount of mass that passes through each action is its flux. More formally,

Definition 2.2. Let π be a policy and P^π the n-by-n transition matrix for π formed by the columns P_a for actions in π. The flux vector x^π is indexed by actions. If action a is not in π then x^π_a = 0, and if π uses a in state s, then x^π_a = z_s, where

    z = Σ_{i ≥ 0} (γ P^π)^i 1 = (I − γ P^π)^{−1} 1,    (2)

and 1 is the all-ones vector of dimension n. The flux is the total discounted number of times we use each action if we start the MDP in all states and run the Markov chain P^π, discounting by γ each iteration. Note that if a ∈ π then x^π_a ≥ 1, since the initial flux placed on a state always passes through its action. Further note that each bit of flux can be traced back to one of the initial units of mass placed on each state, although the vector x^π sums flux from all states. This will be important in Section 4.
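The mass-redistribution description of flux can be simulated directly. The sketch below (not from the paper; the chain and function name are made up) starts one unit of mass on every state and accumulates the flux through each state's action, illustrating that x^π_a ≥ 1 for every used action and that the total flux is n/(1 − γ).

```python
# Flux vector of a deterministic policy (Definition 2.2): run the Markov
# chain with one unit of mass on every state, discounting by gamma each
# step, and sum the mass passing through each state's action.

def policy_flux(succ, gamma, iters=10000):
    n = len(succ)
    mass = [1.0] * n
    flux = [0.0] * n          # flux[s]: flux through the action used in s
    for _ in range(iters):
        for s in range(n):
            flux[s] += mass[s]              # all mass at s uses s's action
        new_mass = [0.0] * n
        for s in range(n):
            new_mass[succ[s]] += gamma * mass[s]   # discount and move
        mass = new_mass
    return flux

succ = [1, 2, 1]      # state 0 starts a path into the cycle 1 <-> 2
gamma = 0.9
flux = policy_flux(succ, gamma)
```

Here flux[0] is exactly 1 (state 0 is never revisited, so only its own initial unit passes through its action), and the three entries sum to n/(1 − γ) = 30.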
Solving the MDP can be formulated as the following primal/dual pair of LPs, in which the flux and value vectors correspond to primal and (possibly infeasible) dual solutions:

Primal:
    maximize    Σ_a r_a x_a
    subject to  Σ_{a ∈ A_s} x_a = 1 + γ Σ_a P_{a,s} x_a    for all s ∈ S,
                x ≥ 0    (3)

Dual:
    minimize    Σ_s v_s
    subject to  v_s ≥ r_a + γ Σ_{s'} P_{a,s'} v_{s'}    for all s ∈ S, a ∈ A_s    (4)

The constraint matrix of (3) is equal to M − γP, where M_{s,a} = 1 if action a can be used in state s and 0 otherwise. The dual value LP (4) is often defined as the primal, as it is perhaps more intuitive, and (3) is rarely considered. However, our analysis centers on the flux variables, and algorithms that manipulate policies can more naturally be seen as moving through the polytope (3), since vertices of the polytope represent policies:

Lemma 2.3. The LP (3) is non-degenerate, and there is a bijection between vertices of the polytope and policies of the MDP.

Proof. Policies have exactly n nonzero variables, and solving for the flux vector in (2) is identical to solving for a basis in the polytope, so policies map to bases. Write the constraints in (3) in the

standard matrix form Ax = b. The vector b is 1, and A = M − γP. In a row s of A the only positive entries are on actions usable in state s, so if Ax = b, then x must have a nonzero entry for every state, i.e., a choice of action for every state. Bases of the LP have n variables, so they must include only one action per state. Finally, as shown above, x^π_a ≥ 1 for all a in a policy/basis, so the LP is not degenerate, and bases correspond to vertices.

By Lemma 2.3, the simplex method applied to (3) corresponds to a simple, single-switch version of policy iteration: we start with an arbitrary policy, and in each iteration we change a single action that improves the value of some state. Since the LP is not degenerate, the simplex method will find the optimal policy with no cycling. We will use Dantzig's most-negative-reduced-cost pivoting rule to choose the action switched. Since (3) is written as a maximization problem, we will refer to reduced costs as gains and always choose the highest-gain action to switch/pivot. For MDPs, the gains have a simple interpretation:

Definition 2.4. The gain (or reduced cost) of an action a for state s with respect to a policy π is denoted r̄^π_a and is the improvement in the value of s if s uses action a once and then follows π for all time. Formally, r̄^π_a = (r_a + γ P_a^T v^π) − v^π_s, or, in vector form,

    r̄^π = r − (M − γP)^T v^π.    (5)

We denote the optimal policy by π*, and the optimal flux, values, and gains by x*, v*, and r̄*. The following are basic properties of the simplex method, and we prove them for completeness.

Lemma 2.5. Let π and π' be any policies. The gains satisfy the following properties:

    (r̄^π)^T x^{π'} = r^T x^{π'} − r^T x^π = 1^T v^{π'} − 1^T v^π,

r̄^π_a = 0 for all a ∈ π, and r̄*_a ≤ 0 for all a.

Proof. From the definition of gains,

    (r̄^π)^T x^{π'} = (r − (M − γP)^T v^π)^T x^{π'} = r^T x^{π'} − (v^π)^T (M − γP) x^{π'} = r^T x^{π'} − (v^π)^T 1,

using that (M − γP) is the constraint matrix of (3). From the definition of the value and flux vectors,

    r^T x^π = r_π^T (I − γ P^π)^{−1} 1 = (v^π)^T 1,

where r_π is the reward vector restricted to indices in π.
Combining these two gives the first result. For the second result, if a is in π, then v^π_s = r_a + γ P_a^T v^π, so r̄^π_a = 0. Finally, if r̄*_a > 0 for some a, then consider the policy π' that is identical to π* but uses a. Then (r̄*)^T x^{π'} > 0, and the first identity proves that π* is not optimal.

A key property of the simplex method on MDPs that we will employ repeatedly is that not only is the overall objective improving, but also the values of all states are monotone non-decreasing, and there exists a single policy we denote by π* that maximizes the values of all states:

Lemma 2.6. Let π and π' be policies appearing in an execution of the simplex method with π' being used after π. Then v^{π'} ≥ v^π. Further, let π* be the policy when simplex terminates, and π be any other policy. Then v* ≥ v^π.
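The single-switch view of simplex from Lemma 2.3 is short to state in code. The sketch below (not from the paper) runs the highest-gain rule on a made-up two-state deterministic MDP; values are computed by fixed-point iteration rather than a basis solve, which is an implementation shortcut for illustration, not the paper's method, and all names and the instance are invented.

```python
# Single-switch simplex / policy iteration with Dantzig's highest-gain
# rule. Each action is a triple (state, successor, reward).

def solve_mdp(n, actions, gamma, iters=10000):
    # Arbitrary starting policy: the first action listed for each state.
    policy = {}
    for i, (s, _, _) in enumerate(actions):
        policy.setdefault(s, i)
    while True:
        # Value of the current policy (Definition 2.1, by iteration).
        v = [0.0] * n
        for _ in range(iters):
            v = [actions[policy[s]][2] + gamma * v[actions[policy[s]][1]]
                 for s in range(n)]
        # Gains (Definition 2.4): improvement from using a once, then pi.
        gains = [r + gamma * v[t] - v[s] for (s, t, r) in actions]
        best = max(range(len(actions)), key=lambda i: gains[i])
        if gains[best] <= 1e-9:
            return policy, v              # no improving switch: optimal
        policy[actions[best][0]] = best   # pivot: switch a single action

# Two states, each with a self-loop and an edge to the other state.
actions = [(0, 0, 0.5), (0, 1, 0.0), (1, 1, 2.0), (1, 0, 0.0)]
policy, v = solve_mdp(2, actions, 0.5)
```

On this instance the optimum keeps the reward-2 self-loop at state 1 (v_1 = 2/(1 − γ) = 4) and routes state 0 into it (v_0 = γ · v_1 = 2); consistent with Lemma 2.6, both state values only increase across the single pivot the run performs.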

Proof. Suppose π and π' are subsequent policies. The gains of all actions in π' with respect to π are equal to r^{π'} − (I − γ P^{π'})^T v^π, all of which are nonnegative. Therefore

    0 ≤ (I − γ P^{π'})^{−T} (r^{π'} − (I − γ P^{π'})^T v^π) = v^{π'} − v^π,

using that (I − γ P^{π'})^{−T} = Σ_{i ≥ 0} (γ (P^{π'})^T)^i ≥ 0. By induction, this holds if π and π' occur further apart. Performing a similar calculation using the gains r̄*, which are nonpositive, shows that v* − v^π ≥ 0 for any policy π.

3 Uniform discounts

As a warmup before delving into our analysis of deterministic MDPs, we briefly review the analysis of [Ye11] for stochastic MDPs with a fixed discount. Consider the flux vector in Definition 2.2. One unit of flux is added to each state, and every step it is discounted by a factor of γ, for a total of n(1 + γ + γ^2 + ···) = n/(1 − γ) flux overall. If π is the current policy and Δ is the highest gain, then, by Lemma 2.5, the farthest π can be from π* is if all n/(1 − γ) units of flux in π* are on the action with gain Δ, so r^T x* − r^T x^π ≤ nΔ/(1 − γ). If we pivot on this action, at least 1 unit of flux is placed on the new action, increasing the objective by at least Δ. Thus we have reduced the gap to π* by a 1 − (1 − γ)/n fraction, which is substantial if 1/(1 − γ) is polynomial. Now consider r^T x* − r^T x^π = −(r̄*)^T x^π. All the terms −r̄*_a x^π_a are nonnegative, and for some action a in π we have −r̄*_a x^π_a ≥ (−(r̄*)^T x^π)/n. The term −r̄*_a x^π_a is at most −r̄*_a · n/(1 − γ), so −r̄*_a ≥ (−(r̄*)^T x^π)/(n^2/(1 − γ)). But for any policy π' that includes a, r^T x* − r^T x^{π'} = −(r̄*)^T x^{π'} ≥ −r̄*_a x^{π'}_a ≥ −r̄*_a, so after r^T x* − r^T x^π has shrunk by a factor of n^2/(1 − γ), action a cannot appear in any future policy, and this occurs after

    log_{1/(1 − (1 − γ)/n)} (n^2/(1 − γ)) = O((n/(1 − γ)) log(n/(1 − γ)))

steps. See [Ye11] for the details.

The above result hinged on the fact that the sizes of all nonzero fluxes lay within the interval [1, n/(1 − γ)], which was assumed to be polynomial but gives a weak bound if γ is very close to 1. However, consider a policy for a deterministic MDP.
It can be seen as a graph with a node for each state and a single directed edge leaving each state representing the action, so the graph consists of one or more directed cycles and directed paths leading to these cycles. Starting on a path, the MDP uses each path action once before reaching a cycle, so the flux on paths must be small. Flux on the cycles may be substantially larger, but since the MDP revisits each action after at most n steps, the flux on cycle actions varies by at most a factor of n.

Lemma 3.1. Let π be a policy with flux vector x^π and a an action in π. If a is on a path in π then 1 ≤ x^π_a ≤ n, and if a is on a cycle then 1/(1 − γ) ≤ x^π_a ≤ n/(1 − γ). The total flux on paths is at most n^2, and the total flux on cycles is at most n/(1 − γ).

Proof. All actions have at least 1 flux. If a is on a path, then starting from any state we can only use a once and never return, contributing flux at most 1 per state, so x^π_a ≤ n. Summing over all path actions, the total flux is at most n^2. If a is on a cycle, each state on the cycle contributes a total of 1/(1 − γ) flux to the cycle. By symmetry this flux is distributed evenly among actions on the cycle, so x^π_a ≥ 1/(1 − γ). The total flux in the MDP is n/(1 − γ), so x^π_a ≤ n/(1 − γ).

The overall range of fluxes is large, but all values must lie within one of two polynomial layers. We will prove that simplex can essentially optimize each layer separately. If a cycle is not updated, then
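The two layers of Lemma 3.1 are easy to observe numerically. The sketch below (not from the paper) simulates the flux of a made-up five-state deterministic policy, two path states feeding a 3-cycle, and checks that path fluxes land in [1, n] while cycle fluxes land in [1/(1 − γ), n/(1 − γ)]; all names are invented.

```python
# Flux of a deterministic policy with states 0, 1 on a path feeding the
# cycle 2 -> 3 -> 4 -> 2, simulated as in Definition 2.2.

def policy_flux(succ, gamma, iters=20000):
    n = len(succ)
    mass = [1.0] * n
    flux = [0.0] * n        # flux[s]: flux through the action used in s
    for _ in range(iters):
        for s in range(n):
            flux[s] += mass[s]
        new_mass = [0.0] * n
        for s in range(n):
            new_mass[succ[s]] += gamma * mass[s]
        mass = new_mass
    return flux

succ = [1, 2, 3, 4, 2]
gamma = 0.99
flux = policy_flux(succ, gamma)
# Path layer: flux in [1, n]; cycle layer: flux in [1/(1-g), n/(1-g)].
```

With γ = 0.99 the path fluxes stay below n = 5 while every cycle action carries at least 1/(1 − γ) = 100 units, so the two layers do not overlap, exactly the separation the analysis exploits.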

not much progress is made toward the optimum, but we make a substantial amount of progress in optimizing the paths for the current cycles. When the paths are optimal the algorithm is forced to update a cycle, at which point we make a substantial amount of progress toward the optimum but reset all progress on the paths. First we analyze progress on the paths:

Lemma 3.2. Suppose the simplex method pivots from π to π', which does not create a new cycle. Let π'' be the final policy such that cycles in π'' are a subset of those in π (i.e., the final policy before a new cycle is created). Then r^T (x^{π''} − x^{π'}) ≤ (1 − 1/n^2) r^T (x^{π''} − x^π).

Proof. Let Δ = max_a r̄^π_a be the highest gain. Consider (r̄^π)^T x^{π''}. Since cycles in π'' are contained in π, r̄^π_a = 0 for any action a on a cycle in π'', and by Lemma 3.1, π'' has at most n^2 units of flux on paths, so

    (r̄^π)^T x^{π''} = r^T (x^{π''} − x^π) ≤ n^2 Δ.

Policy π' has at least 1 unit of flux on the action with gain Δ, so

    r^T (x^{π''} − x^{π'}) ≤ r^T (x^{π''} − x^π) − Δ ≤ (1 − 1/n^2) r^T (x^{π''} − x^π).

Due to the polynomial contraction in the lemma above, not too many iterations can pass before a new cycle is formed.

Lemma 3.3. Let π be a policy. After O(n^2 log n) iterations starting from π, either the algorithm finishes, a new cycle is created, a cycle is broken, or some action in π never appears in a policy again until a new cycle is created.

Proof. Let π be the policy in some iteration, π'' the last policy before a new cycle is created, and π' an arbitrary policy occurring between π and π'' in the algorithm. Policy π'' differs from π in actions on paths and possibly in cycles that exist in π but have been broken in π''. By Lemma 2.5,

    (r̄^π)^T x^{π''} = r^T (x^{π''} − x^π) = 1^T (v^{π''} − v^π).

We divide the analysis into two cases. First suppose that there exists an action a used in state s on a path such that r̄^π_a x^{π''}_a ≥ (r̄^π)^T x^{π''}/n (note (r̄^π)^T x^{π''} ≥ 0). Since a is on a path, x^{π''}_a ≤ n, which implies r̄^π_a ≥ (r̄^π)^T x^{π''}/n^2.
Now if policy π' uses action a, then

    (r̄^π)^T x^{π'} = 1^T (v^{π'} − v^π) ≥ v^{π'}_s − v^π_s = (r_a + γ P_a^T v^{π'}) − v^π_s ≥ (r_a + γ P_a^T v^π) − v^π_s = r̄^π_a ≥ (r̄^π)^T x^{π''}/n^2,

using that the values of all states are monotone increasing.

In the second case there is no action a on a path satisfying r̄^π_a x^{π''}_a ≥ (r̄^π)^T x^{π''}/n. The remaining portion of (r̄^π)^T x^{π''} is due to cycles, so there must be some cycle C consisting of actions {a_1, ..., a_k} used in states {s_1, ..., s_k} such that Σ_{a ∈ C} r̄^π_a x^{π''}_a ≥ (r̄^π)^T x^{π''}/n. All flux in C first enters C either from a path ending at C or from the initial unit of flux placed on some state in C. If y_s ≥ 1 units of flux first enter C at state s in policy π'', then that flux earns y_s (v^{π''}_s − v^π_s) reward with respect to the rewards r̄^π, so Σ_{a ∈ C} r̄^π_a x^{π''}_a = Σ_{s ∈ C} y_s (v^{π''}_s − v^π_s). Moreover, each term v^{π''}_s − v^π_s is nonnegative, since the values of all states are nondecreasing. Now note that Σ_{s ∈ C} (v^{π''}_s − v^π_s) = Σ_{a ∈ C} r̄^π_a/(1 − γ), and at most n units of flux enter each state from outside. Therefore

    n Σ_{a ∈ C} r̄^π_a/(1 − γ) ≥ Σ_{a ∈ C} r̄^π_a x^{π''}_a,

implying n^2 Σ_{a ∈ C} r̄^π_a/(1 − γ) ≥ (r̄^π)^T x^{π''}.

As long as cycle C is intact, each a ∈ C has at least 1/(1 − γ) flux from states in C (Lemma 3.1), so if C is in policy π' then

    (r̄^π)^T x^{π'} = 1^T (v^{π'} − v^π) ≥ Σ_{s ∈ C} (v^{π'}_s − v^π_s) = Σ_{a ∈ C} r̄^π_a/(1 − γ) ≥ (r̄^π)^T x^{π''}/n^2.    (6)

Now if log_{n^2/(n^2−1)} n^2 iterations occur between π and π', Lemma 3.2 implies

    r^T (x^{π''} − x^{π'}) < (1 − 1/n^2)^{log_{n^2/(n^2−1)} n^2} · r^T (x^{π''} − x^π) ≤ (r̄^π)^T x^{π''}/n^2.

In the first case action a cannot appear in π', and in the second case cycle C must be broken in π'. This takes log_{n^2/(n^2−1)} n^2 = O(n^2 log n) iterations if no new cycle interrupts the process.

Lemma 3.4. Either the algorithm finishes or a new cycle is created after O(n^2 m log n) iterations.

Proof. Let π_0 be a policy after a new cycle is created, and consider the policies π_1, π_2, ..., each separated by O(n^2 log n) iterations. If no new cycle is created, then by Lemma 3.3 each of these policies π_i has either broken another cycle in π_0 or contains an action that cannot appear in π_j for all j > i. There are at most n cycles in π_0 and at most m actions that can be eliminated, so after (m + n) · O(n^2 log n) = O(n^2 m log n) iterations, the algorithm must terminate or create a new cycle.

When a new cycle is formed, the algorithm makes a substantial amount of progress toward the optimum but also resets the path optimality above.

Lemma 3.5. Let π and π' be subsequent policies such that π' creates a new cycle. Then r^T (x* − x^{π'}) ≤ (1 − 1/n) r^T (x* − x^π).

Proof. Let Δ = max_a r̄^π_a and a = argmax_a r̄^π_a. There is a total of n/(1 − γ) flux in the MDP, so r^T x* − r^T x^π = (r̄^π)^T x* ≤ nΔ/(1 − γ). By Lemma 3.1, pivoting on a and creating a cycle will result in at least 1/(1 − γ) flux through a. Therefore r^T x^{π'} ≥ r^T x^π + Δ/(1 − γ), so

    r^T (x* − x^{π'}) ≤ r^T (x* − x^π) − Δ/(1 − γ) ≤ (1 − 1/n) r^T (x* − x^π).

Lemma 3.6. Let π be a policy. Starting from π, after O(n log n) iterations in which a new cycle is created, some action in π is either eliminated from cycles for the remainder of the algorithm or entirely eliminated from policies for the remainder of the algorithm.

Proof. Consider a policy π with respect to the optimal gains r̄*.
There is an action a such that −r̄*_a x^π_a ≥ (−(r̄*)^T x^π)/n. If a is on a path in π, then 1 ≤ x^π_a ≤ n, so −r̄*_a ≥ (−(r̄*)^T x^π)/n^2, and if a is on a cycle, then 1/(1 − γ) ≤ x^π_a ≤ n/(1 − γ), so −r̄*_a/(1 − γ) ≥ (−(r̄*)^T x^π)/n^2. Since r̄* are the gains for the optimal policy, r̄*_a ≤ 0 for all a. Therefore if π' is any policy containing a, then −r̄*_a ≤ −r̄*_a x^{π'}_a ≤ −(r̄*)^T x^{π'}, and if π' is any policy containing a on a cycle, then −r̄*_a/(1 − γ) ≤ −r̄*_a x^{π'}_a ≤ −(r̄*)^T x^{π'}. Now by Lemma 3.5, if there are more than log_{n/(n−1)} n^2 = O(n log n) new cycles created between policies π and π', then

    −(r̄*)^T x^{π'} < (1 − 1/n)^{log_{n/(n−1)} n^2} (−(r̄*)^T x^π) = (−(r̄*)^T x^π)/n^2.

Therefore if π contained a on a path, then a cannot appear in any policy after π' for the remainder of the algorithm, and if π contained a on a cycle, then a cannot appear in a cycle after π' (but may appear in a path) for the remainder of the algorithm.

Theorem 3.7. The simplex method converges in at most O(n^3 m^2 log^2 n) iterations on deterministic MDPs with uniform discounts using the highest-gain pivoting rule.

Proof. Consider the policies π_0, π_1, π_2, ..., where O(n log n) new cycles have been created between π_i and π_{i+1}. By Lemma 3.6, each π_i contains an action that is either eliminated entirely in π_j for j > i or eliminated from cycles. Each action can be eliminated from cycles and paths, so after 2m such rounds of O(n log n) new cycles the algorithm has converged. By Lemma 3.4 cycles are created every O(n^2 m log n) iterations, for a total of O(n^3 m^2 log^2 n) iterations.

4 Varying Discounts

In this section we allow each action a to have a distinct discount γ_a. This significantly complicates the proof of convergence since the total flux is no longer fixed. When updating a cycle we can no longer bound the distance to the optimum based solely on the maximum gain, since the optimal policy may employ actions with smaller gains than the current policy but substantially more flux. We are able to exhibit a set of layers in which the flux on cycles must lie based on the discounts of the actions, and we will show that when a cycle is created in a particular layer we make progress toward the optimum value for the updated state, assuming that it lies within that layer. These layers will define a set of bounds whose values we must surpass, which serve as milestones or checkpoints to the optimum. When we update a cycle we cannot claim that the overall objective increases substantially but only that the values of individual states make progress toward one of these milestone values. When the values of all states have surpassed each of these intermediate milestones the algorithm will terminate. We first define some notation.
Recall that to calculate flux we place one unit of mass in each state and then run the Markov chain, so all flux traces back to some state, but x^π aggregates all of it together. Because we will be concerned with analyzing the values of individual states in this section, it will be useful to separate out the flux originating in a particular state s. Consider the following alternate LP:

    maximize    r^T x
    subject to  Σ_{a ∈ A_s} x_a = 1 + Σ_a γ_a P_{a,s} x_a,
                Σ_{a ∈ A_{s'}} x_a = Σ_a γ_a P_{a,s'} x_a    for all s' ≠ s,
                x ≥ 0    (7)

The LP (7) is identical to (3), except that initial flux is only added to state s rather than all states, and the dual of (7) matches (4) if the objective in (4) is changed to minimize only v_s. Feasible solutions in (7) measure only flux originating in s and contributing to v_s. For a state s and policy π we use the notation x^{π,s} to denote the corresponding vertex in (7). Note that x^π = Σ_s x^{π,s}. The following lemma is analogous to Lemma 2.5 and has an identical proof:

Lemma 4.1. For a state s and for policies π and π',

    (r̄^π)^T x^{π',s} = r^T x^{π',s} − r^T x^{π,s} = v^{π'}_s − v^π_s.

We now define the intervals in which the flux must lie. As in Section 3, flux on paths is in [1, n]. Let C be a cycle in some policy, and let γ_C = Π_{a ∈ C} γ_a be the total discount of C. We will prove that the smallest discount in C determines the rough order of magnitude of the flux through C.

Definition 4.2. Let C be a cycle and a an action in C; then the discount of a dominates the discount of C if γ_a ≤ γ_{a'} for all a' ∈ C.

Lemma 4.3. Let π be a policy containing the cycle C with discount dominated by γ_a and total discount γ_C. Let s be a state on C, a_s the action used in s, and a' an arbitrary action in C. Then x^{π,s}_{a_s} = 1/(1 − γ_C), γ_C/(1 − γ_C) ≤ x^{π,s}_{a'} ≤ 1/(1 − γ_C), and 1/(n(1 − γ_a)) ≤ 1/(1 − γ_C) ≤ 1/(1 − γ_a).

Proof. For the first equality, all flux originates at s, so the flux through a_s (used in state s) either just originated in s or came around the cycle from s, implying x^{π,s}_{a_s} = 1 + γ_C x^{π,s}_{a_s}. An analogous equation holds for all other actions a' on C, but now the initial flow from s may have been discounted by at most γ_C before reaching a', giving γ_C/(1 − γ_C) ≤ x^{π,s}_{a'} ≤ 1/(1 − γ_C). The upper bound in the final inequality, 1/(1 − γ_C) ≤ 1/(1 − γ_a), holds since a ∈ C (γ_a dominates the discount of C). For the lower bound, let l = 1 − γ_a. Then

    γ_C ≥ γ_a^n = (1 − l)^n ≥ 1 − nl = 1 − n(1 − γ_a),

implying 1/(1 − γ_C) ≥ 1/(n(1 − γ_a)).

Flux on paths still falls in [1, n], so the algorithm behaves the same on paths as it did in the uniform case:

Lemma 4.4. Either the algorithm finishes or a new cycle is created after O(n^2 m log n) iterations.

Proof. This is identical to the proof of Lemma 3.4, which depends on Lemmas 3.2 and 3.3. Lemma 3.2 holds for nonuniform discounts, and Lemma 3.3 holds after adjusting Equation (6) as follows:

    (r̄^π)^T x^{π'} ≥ Σ_{s ∈ C} (v^{π'}_s − v^π_s) = Σ_{a ∈ C} r̄^π_a/(1 − γ_C) ≥ (r̄^π)^T x^{π''}/n^2,

using that Σ_{a ∈ C} r̄^π_a · n/(1 − γ_C) ≥ (r̄^π)^T x^{π''}/n and Lemma 4.3.

Now suppose the simplex method updates the action for state s in policy π and creates a cycle dominated by γ_a. Again, v_s may not improve much, since there may be a cycle with discount much larger than γ_a.
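Lemma 4.3's flux bounds can be checked on a single cycle with per-action discounts. The sketch below (not from the paper; the discounts and function name are made up) tracks one unit of mass started at state s of a 3-cycle, discounting by γ_a as each action is traversed, and compares against 1/(1 − γ_C) with γ_C the product of the cycle discounts.

```python
# Flux from a single source state around a k-cycle 0 -> 1 -> ... -> 0
# with nonuniform discounts, as in Lemma 4.3.

def cycle_flux_from(s, gamma, iters=10000):
    """gamma[i]: discount of the action used in cycle state i."""
    k = len(gamma)
    flux = [0.0] * k
    mass, state = 1.0, s
    for _ in range(iters):
        flux[state] += mass
        mass *= gamma[state]       # discount while traversing the action
        state = (state + 1) % k
    return flux

gamma = [0.9, 0.99, 0.8]           # gamma_a = 0.8 dominates the cycle
flux = cycle_flux_from(0, gamma)
gamma_C = gamma[0] * gamma[1] * gamma[2]
# Lemma 4.3: flux through state 0's own action equals 1/(1 - gamma_C);
# every other cycle action carries between gamma_C/(1 - gamma_C) and
# 1/(1 - gamma_C) units.
```

Here 1/(1 − γ_C) ≈ 3.48 indeed lies between 1/(n(1 − γ_a)) ≈ 1.67 and 1/(1 − γ_a) = 5, so the dominating discount γ_a = 0.8 pins down the order of magnitude of the cycle flux.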
However, in any policy π' where s is on a cycle dominated by γ_a and s uses some action a', we have 1/(n(1 − γ_a)) ≤ x^{π',s}_{a'} ≤ 1/(1 − γ_a), which allows us to argue that v_s has made progress toward the highest value achievable when s is on a cycle dominated by γ_a, and after enough such progress has been made, v_s will beat this value and never again appear on any cycle dominated by γ_a. The optimal values achievable for each state on a cycle dominated by each γ_a serve as the above-mentioned milestones. Since all cycles are dominated by some γ_a, there are m milestones per state.

Lemma 4.5. Suppose the simplex method moves from π to π' by updating the action for state s, creating a new cycle C with discount dominated by γ_a for some a in π'. Let π'' be the final policy used by the simplex method in which s is in a cycle dominated by γ_a. Then v^{π''}_s − v^{π'}_s ≤ (1 − 1/n^2)(v^{π''}_s − v^π_s).

Proof. Let Δ = max_{a'} r̄^π_{a'} be the value of the highest gain with respect to π. Any cycle contains at most n actions, each of which has gain at most Δ in r̄^π, so if s is on a cycle dominated by γ_a in π'', then by Lemma 4.3 and Lemma 4.1, v^{π''}_s − v^π_s ≤ nΔ/(1 − γ_a), and since π' creates a cycle dominated by γ_a, by the same lemmas v^{π'}_s ≥ v^π_s + Δ/(n(1 − γ_a)). Combining the two,

    v^{π''}_s − v^{π'}_s = (v^{π''}_s − v^π_s) − (v^{π'}_s − v^π_s) ≤ (v^{π''}_s − v^π_s) − Δ/(n(1 − γ_a)) ≤ (1 − 1/n^2)(v^{π''}_s − v^π_s).

The following lemma is the crux of our analysis and allows us to eliminate actions when we get close to a milestone value. This occurs because the positive gains must shrink or else the algorithm would surpass the milestone, and as the positive gains shrink they can no longer balance larger negative gains, forcing such actions out of the cycle.

Lemma 4.6. Suppose policy π contains a cycle C with discount dominated by γ_a and s is a state in C. There is some action a' in C (depending on s) such that after O(n^2 log n) iterations that change the action for s and create a cycle with discount dominated by γ_a, action a' will never again appear in a cycle dominated by γ_a.

Proof. Let π be a policy containing a cycle C with discount dominated by γ_a and s a state in C. Let π' be another policy where s is on a cycle dominated by γ_a after at least 1 + log_{n^2/(n^2−1)} n^5 = O(n^2 log n) iterations that create such a cycle by changing the action for s, and let π'' be the final policy used by the algorithm in which s is on a cycle dominated by γ_a. Consider the policy π̂ in the iteration immediately preceding π'. By Lemma 4.5 and the choice of π',

    v^{π''}_s − v^{π̂}_s ≤ (1 − 1/n^2)^{log_{n^2/(n^2−1)} n^5} (v^{π''}_s − v^π_s) = (v^{π''}_s − v^π_s)/n^5,

or equivalently v^{π''}_s − v^π_s ≥ n^5 (v^{π''}_s − v^{π̂}_s), implying

    v^π_s − v^{π̂}_s = −(v^{π''}_s − v^π_s) + (v^{π''}_s − v^{π̂}_s) ≤ (1 − n^5)(v^{π''}_s − v^{π̂}_s).    (8)

Since the gap v^π_s − v^{π̂}_s is large and negative, there must be highly negative gains in r̄^{π̂}. By Lemma 4.1, v^π_s − v^{π̂}_s = (r̄^{π̂})^T x^{π,s}. Let r̄^{π̂}_{a'} = min_{a'' ∈ C} r̄^{π̂}_{a''} and let s' be the state using a'. By Lemma 4.3, x^{π,s}_{a''} ≤ 1/(1 − γ_a) for every a'' ∈ C, and C has at most n states, so applying Equation (8),

    r̄^{π̂}_{a'} ≤ ((1 − γ_a)/n)(v^π_s − v^{π̂}_s) ≤ −((n^5 − 1)(1 − γ_a)/n)(v^{π''}_s − v^{π̂}_s).    (9)
(9) n The poitive entrie in rˆπ mut all be mall, ince there i only a mall increae in the value of. Let = max rˆπ. The algorithm pivot on the highet gain, and by aumption it update the action for and create a cycle dominated by γ a. By Lemma 4.3, the new action i ued at leat 1/(n(1 γ a )) time by flux from, ince it i the firt action in the cycle, o n(1 γ a ) vπ vˆπ v π vˆπ. (10) We prove that the highly negative rˆπ a cannot coexit with only mall poitive gain bounded by. Conider any policy in which i on a cycle C containing a (but not necearily containing ) with total gain γ C dominated by γ a. By Lemma 4.3, there i at leat 1/(1 γ C ) 1/(n(1 γ a )) 11

flux from s′ going through a′, and in the rest of the cycle there are at most n − 1 other actions, each with at most 1/(1−γ_{C′}) ≤ 1/(1−γ_a) flux. The highest gain with respect to π̂ is Δ̂, so the value of v_{s′} relative to r^{π̂} is at most

    r^{π̂}_{a′}/(n(1−γ_a)) + nΔ̂/(1−γ_a) ≤ (−n³ + 1/n²)(v_s^{π″} − v_s^{π̂}) + n²(v_s^{π″} − v_s^{π̂}) = (−n³ + 1/n² + n²)(v_s^{π″} − v_s^{π̂}) < 0

using Equations (9) and (10). But v^{π̂}_{s′} = 0 relative to r^{π̂}, and it only increases in future iterations, so a′ cannot appear again in a cycle dominated by γ_a.

Lemma 4.7. For any action a, there are at most O(n³ m log n) iterations that create a cycle with discount dominated by γ_a.

Proof. After O(n³ log n) iterations that create a cycle dominated by γ_a, by the pigeonhole principle some state must have been updated in the O(n² log n) of those iterations required by Lemma 4.6, so some action will never appear again in a cycle dominated by γ_a. After m repetitions of this process all actions have been eliminated.

Theorem 4.8. The simplex method terminates in at most O(n⁵ m³ log² n) iterations on deterministic MDPs with nonuniform discounts using the highest-gain pivoting rule.

Proof. There are O(m) possible discounts γ_a that can dominate a cycle, and by Lemma 4.7 there are at most O(n³ m log n) iterations creating a cycle dominated by any particular γ_a, for a total of O(n³ m² log n) iterations that create a cycle. By Lemma 4.4 a new cycle is created every O(n² m log n) iterations, for a total of O(n⁵ m³ log² n) iterations overall.

5 Open problems

A difficult but natural next step would be to try to extend these techniques to handle policy iteration on deterministic MDPs. The main problem encountered is that the multiple simultaneous pivots used in policy iteration can interfere with each other in such a way that the algorithm effectively pivots on the smallest improving switch rather than the largest. See [HZ10] for such an example. Another challenging open question is to design a strongly polynomial algorithm for general MDPs.
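To make the algorithm analyzed above concrete, the following is a minimal sketch, in our own notation rather than the paper's, of the simplex method with the highest-gain pivoting rule on a discounted deterministic MDP. Each iteration evaluates the current policy exactly (every state's path closes a unique cycle, whose value can be solved in closed form) and then switches the single action with the largest positive gain, i.e., reduced cost. The function names and the toy instance are hypothetical illustrations, not artifacts of the paper.

```python
# Minimal sketch: simplex method / single-switch policy iteration with the
# highest-gain pivoting rule on a discounted deterministic MDP.
# Our own illustrative construction; names and instance are hypothetical.

# An action is a tuple (state s, successor t, reward r, discount gamma).
ACTIONS = [
    (0, 1, 1.0, 0.9),  # state 0: go to state 1
    (0, 0, 0.5, 0.9),  # state 0: self-loop
    (1, 2, 0.0, 0.9),  # state 1: go to state 2
    (1, 0, 2.0, 0.8),  # state 1: go back to state 0
    (2, 2, 3.0, 0.5),  # state 2: self-loop
    (2, 0, 1.0, 0.9),  # state 2: go to state 0
]

def policy_values(n_states, actions, policy):
    """Exact values of a deterministic policy: follow each state's unique
    path until it closes a cycle, solve the cycle, then back-substitute."""
    v = [None] * n_states
    for start in range(n_states):
        if v[start] is not None:
            continue
        path, seen = [], {}
        s = start
        while v[s] is None and s not in seen:
            seen[s] = len(path)
            path.append(s)
            s = actions[policy[s]][1]          # deterministic successor
        if v[s] is None:                       # the walk closed a new cycle at s
            R, G = 0.0, 1.0                    # discounted reward / total discount
            for u in path[seen[s]:]:
                _, _, r, g = actions[policy[u]]
                R += G * r
                G *= g
            v[s] = R / (1.0 - G)               # v_s = R + G * v_s
        for u in reversed(path):               # back-substitute along the path
            if v[u] is None:
                _, t, r, g = actions[policy[u]]
                v[u] = r + g * v[t]
    return v

def solve_deterministic_mdp(n_states, actions, tol=1e-12, max_iter=100_000):
    """Pivot on the action of highest gain until no improving switch exists."""
    policy = {}
    for idx, (s, _, _, _) in enumerate(actions):
        policy.setdefault(s, idx)              # arbitrary initial policy
    for _ in range(max_iter):
        v = policy_values(n_states, actions, policy)
        # Gain (reduced cost) of each action relative to the current values.
        gain, best = max((r + g * v[t] - v[s], i)
                         for i, (s, t, r, g) in enumerate(actions))
        if gain <= tol:
            return policy, v                   # no improving switch: optimal
        policy[actions[best][0]] = best        # pivot on the highest gain
    raise RuntimeError("did not converge")

policy, v = solve_deterministic_mdp(3, ACTIONS)
print(v)  # optimal values of the toy instance
```

On this instance the optimal policy routes every state through the 0 ↔ 1 cycle; note that the sketch greedily improves the objective, whereas the paper's point is that with nonuniform discounts the iteration bound must instead track per-state progress toward milestones.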
Finally, we believe the technique of dividing variable values into polynomially sized layers may be helpful for entirely different problems.

Acknowledgments. The authors would like to thank Kazuhisa Makino for pointing out an error in Lemma 3.3.

References

[Bel57] Richard E. Bellman. Dynamic Programming. Princeton University Press, 1957.

[Ber96] Dimitri P. Bertsekas. Dynamic programming and optimal control. Athena Scientific, 1996.

[Fea10] John Fearnley. Exponential lower bounds for policy iteration. In Automata, Languages and Programming, volume 6199 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010.

[FHZ11] Oliver Friedmann, Thomas Dueholm Hansen, and Uri Zwick. Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. In Proc. 43rd Symposium on Theory of Computing, STOC '11. ACM, 2011.

[Fri09] Oliver Friedmann. An exponential lower bound for the parity game strategy improvement algorithm as we know it. In Proc. 24th Logic In Computer Science, LICS '09, 2009.

[Fri11] Oliver Friedmann. A subexponential lower bound for Zadeh's pivoting rule for solving linear programs and games. In Integer Programming and Combinatorial Optimization, volume 6655 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2011.

[HMZ11] Thomas Dueholm Hansen, Peter Bro Miltersen, and Uri Zwick. Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In ICS, 2011.

[HN94] Dorit S. Hochbaum and Joseph (Seffi) Naor. Simple and fast algorithms for linear and integer programs with two variables per inequality. SIAM Journal on Computing, 23:1179, 1994.

[How60] Ronald Howard. Dynamic programming and Markov decision processes. MIT, Cambridge, 1960.

[HZ10] Thomas Hansen and Uri Zwick. Lower bounds for Howard's algorithm for finding minimum mean-cost cycles. In Otfried Cheong, Kyung-Yong Chwa, and Kunsoo Park, editors, Algorithms and Computation, volume 6506 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010.

[LDK95] Michael L. Littman, Thomas L. Dean, and Leslie Pack Kaelbling. On the complexity of solving Markov decision problems. In Proc. 11th Uncertainty in Artificial Intelligence, UAI '95, 1995.

[Mad02] Omid Madani. On policy iteration as a Newton's method and polynomial policy iteration algorithms. In Proc. 18th National Conference on Artificial Intelligence, 2002.

[MC94] Mary Melekopoglou and Anne Condon. On the complexity of the policy improvement algorithm for Markov decision processes. ORSA Journal on Computing, 6(2), 1994.

[MS99] Yishay Mansour and Satinder Singh. On the complexity of policy iteration. In Proc. 15th Uncertainty in Artificial Intelligence, UAI '99, 1999.

[MTZ10] Omid Madani, Mikkel Thorup, and Uri Zwick. Discounted deterministic Markov decision processes and discounted all-pairs shortest paths. ACM Transactions on Algorithms (TALG), 6(2):33:1–33:25, 2010.

[PT87] Christos Papadimitriou and John N. Tsitsiklis. The complexity of Markov decision processes. Mathematics of Operations Research, 12(3), August 1987.

[Put94] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York, NY, USA, 1994.

[Ye05] Yinyu Ye. A new complexity result on solving the Markov decision problem. Mathematics of Operations Research, 30(3), August 2005.

[Ye11] Yinyu Ye. The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4), November 2011.


More information

A BATCH-ARRIVAL QUEUE WITH MULTIPLE SERVERS AND FUZZY PARAMETERS: PARAMETRIC PROGRAMMING APPROACH

A BATCH-ARRIVAL QUEUE WITH MULTIPLE SERVERS AND FUZZY PARAMETERS: PARAMETRIC PROGRAMMING APPROACH Mathematical and Computational Application Vol. 11 No. pp. 181-191 006. Aociation for Scientific Reearch A BATCH-ARRIVA QEE WITH MTIPE SERVERS AND FZZY PARAMETERS: PARAMETRIC PROGRAMMING APPROACH Jau-Chuan

More information

arxiv: v2 [math.nt] 30 Apr 2015

arxiv: v2 [math.nt] 30 Apr 2015 A THEOREM FOR DISTINCT ZEROS OF L-FUNCTIONS École Normale Supérieure arxiv:54.6556v [math.nt] 3 Apr 5 943 Cachan November 9, 7 Abtract In thi paper, we etablih a imple criterion for two L-function L and

More information

Sociology 376 Exam 1 Spring 2011 Prof Montgomery

Sociology 376 Exam 1 Spring 2011 Prof Montgomery Sociology 76 Exam Spring Prof Montgomery Anwer all quetion. 6 point poible. You may be time-contrained, o pleae allocate your time carefully. [HINT: Somewhere on thi exam, it may be ueful to know that

More information

Technical Appendix: Auxiliary Results and Proofs

Technical Appendix: Auxiliary Results and Proofs A Technical Appendix: Auxiliary Reult and Proof Lemma A. The following propertie hold for q (j) = F r [c + ( ( )) ] de- ned in Lemma. (i) q (j) >, 8 (; ]; (ii) R q (j)d = ( ) q (j) + R q (j)d ; (iii) R

More information

On the Stability Region of Congestion Control

On the Stability Region of Congestion Control On the Stability Region of Congetion Control Xiaojun Lin and Ne B. Shroff School of Electrical and Computer Engineering Purdue Univerity, Wet Lafayette, IN 47906 {linx,hroff}@ecn.purdue.edu Abtract It

More information

Stochastic Neoclassical Growth Model

Stochastic Neoclassical Growth Model Stochatic Neoclaical Growth Model Michael Bar May 22, 28 Content Introduction 2 2 Stochatic NGM 2 3 Productivity Proce 4 3. Mean........................................ 5 3.2 Variance......................................

More information

Linear Motion, Speed & Velocity

Linear Motion, Speed & Velocity Add Important Linear Motion, Speed & Velocity Page: 136 Linear Motion, Speed & Velocity NGSS Standard: N/A MA Curriculum Framework (006): 1.1, 1. AP Phyic 1 Learning Objective: 3.A.1.1, 3.A.1.3 Knowledge/Undertanding

More information

Optimal Coordination of Samples in Business Surveys

Optimal Coordination of Samples in Business Surveys Paper preented at the ICES-III, June 8-, 007, Montreal, Quebec, Canada Optimal Coordination of Sample in Buine Survey enka Mach, Ioana Şchiopu-Kratina, Philip T Rei, Jean-Marc Fillion Statitic Canada New

More information

Problem 1. Construct a filtered probability space on which a Brownian motion W and an adapted process X are defined and such that

Problem 1. Construct a filtered probability space on which a Brownian motion W and an adapted process X are defined and such that Stochatic Calculu Example heet 4 - Lent 5 Michael Tehranchi Problem. Contruct a filtered probability pace on which a Brownian motion W and an adapted proce X are defined and uch that dx t = X t t dt +

More information

Multicast Network Coding and Field Sizes

Multicast Network Coding and Field Sizes Multicat Network Coding and Field Size Qifu (Tyler) Sun, Xunrui Yin, Zongpeng Li, and Keping Long Intitute of Advanced Networking Technology and New Service, Univerity of Science and Technology Beijing,

More information

arxiv: v2 [math.co] 11 Sep 2017

arxiv: v2 [math.co] 11 Sep 2017 The maximum number of clique in graph without long cycle Ruth Luo September 13, 017 arxiv:1701.0747v [math.co] 11 Sep 017 Abtract The Erdő Gallai Theorem tate that for k 3 every graph on n vertice with

More information

ON A CERTAIN FAMILY OF QUARTIC THUE EQUATIONS WITH THREE PARAMETERS. Volker Ziegler Technische Universität Graz, Austria

ON A CERTAIN FAMILY OF QUARTIC THUE EQUATIONS WITH THREE PARAMETERS. Volker Ziegler Technische Universität Graz, Austria GLASNIK MATEMATIČKI Vol. 1(61)(006), 9 30 ON A CERTAIN FAMILY OF QUARTIC THUE EQUATIONS WITH THREE PARAMETERS Volker Ziegler Techniche Univerität Graz, Autria Abtract. We conider the parameterized Thue

More information

Planning and Learning with Stochastic Action Sets

Planning and Learning with Stochastic Action Sets Planning and Learning with Stochatic Action Set Craig Boutilier, Alon Cohen, Avinatan Haidim, Yihay Manour, Ofer Mehi, Martin Mladenov and Dale Schuurman Google Reearch {cboutilier,aloncohen,avinatan,manour,mehi,chuurman}@google.com

More information

2 Hatad, Jukna & Pudlak gate, namely we hall tudy the ize of depth-three circuit. The technique we hall ue ha two ource. The rt one i a \nite" verion

2 Hatad, Jukna & Pudlak gate, namely we hall tudy the ize of depth-three circuit. The technique we hall ue ha two ource. The rt one i a \nite verion TOP-DOWN LOWER BOUNDS FOR DEPTH-THREE CIRCUITS J. Hatad, S. Jukna and P. Pudlak Abtract. We preent a top-down lower bound method for depth-three ^ _ :-circuit which i impler than the previou method and

More information

LINEAR ALGEBRA METHOD IN COMBINATORICS. Theorem 1.1 (Oddtown theorem). In a town of n citizens, no more than n clubs can be formed under the rules

LINEAR ALGEBRA METHOD IN COMBINATORICS. Theorem 1.1 (Oddtown theorem). In a town of n citizens, no more than n clubs can be formed under the rules LINEAR ALGEBRA METHOD IN COMBINATORICS 1 Warming-up example Theorem 11 (Oddtown theorem) In a town of n citizen, no more tha club can be formed under the rule each club have an odd number of member each

More information

Z a>2 s 1n = X L - m. X L = m + Z a>2 s 1n X L = The decision rule for this one-tail test is

Z a>2 s 1n = X L - m. X L = m + Z a>2 s 1n X L = The decision rule for this one-tail test is M09_BERE8380_12_OM_C09.QD 2/21/11 3:44 PM Page 1 9.6 The Power of a Tet 9.6 The Power of a Tet 1 Section 9.1 defined Type I and Type II error and their aociated rik. Recall that a repreent the probability

More information

Ecient Parallel Algorithms for Computing All Pair. University of Kentucky. Duke University

Ecient Parallel Algorithms for Computing All Pair. University of Kentucky. Duke University Ecient Parallel Algorithm for Computing All Pair Shortet Path in Directed Graph 1 Yijie Han 2 * Victor Y. Pan 3 ** John H. Reif*** *Department of Computer Science Univerity of Kentucky Lexington, KY 40506

More information

THE DIVERGENCE-FREE JACOBIAN CONJECTURE IN DIMENSION TWO

THE DIVERGENCE-FREE JACOBIAN CONJECTURE IN DIMENSION TWO THE DIVERGENCE-FREE JACOBIAN CONJECTURE IN DIMENSION TWO J. W. NEUBERGER Abtract. A pecial cae, called the divergence-free cae, of the Jacobian Conjecture in dimenion two i proved. Thi note outline an

More information

Rolling horizon procedures in Semi-Markov Games: The Discounted Case

Rolling horizon procedures in Semi-Markov Games: The Discounted Case Rolling horizon procedure in Semi-Markov Game: The Dicounted Cae Eugenio Della Vecchia, Silvia C. Di Marco, Alain Jean-Marie To cite thi verion: Eugenio Della Vecchia, Silvia C. Di Marco, Alain Jean-Marie.

More information

OPTIMAL STOPPING FOR SHEPP S URN WITH RISK AVERSION

OPTIMAL STOPPING FOR SHEPP S URN WITH RISK AVERSION OPTIMAL STOPPING FOR SHEPP S URN WITH RISK AVERSION ROBERT CHEN 1, ILIE GRIGORESCU 1 AND MIN KANG 2 Abtract. An (m, p) urn contain m ball of value 1 and p ball of value +1. A player tart with fortune k

More information

Performance Evaluation

Performance Evaluation Performance Evaluation 95 (206) 40 Content lit available at ScienceDirect Performance Evaluation journal homepage: www.elevier.com/locate/peva Optimal cheduling in call center with a callback option Benjamin

More information

Math 273 Solutions to Review Problems for Exam 1

Math 273 Solutions to Review Problems for Exam 1 Math 7 Solution to Review Problem for Exam True or Fale? Circle ONE anwer for each Hint: For effective tudy, explain why if true and give a counterexample if fale (a) T or F : If a b and b c, then a c

More information

The Impact of Imperfect Scheduling on Cross-Layer Rate. Control in Multihop Wireless Networks

The Impact of Imperfect Scheduling on Cross-Layer Rate. Control in Multihop Wireless Networks The mpact of mperfect Scheduling on Cro-Layer Rate Control in Multihop Wirele Network Xiaojun Lin and Ne B. Shroff Center for Wirele Sytem and Application (CWSA) School of Electrical and Computer Engineering,

More information

ONLINE APPENDIX: TESTABLE IMPLICATIONS OF TRANSLATION INVARIANCE AND HOMOTHETICITY: VARIATIONAL, MAXMIN, CARA AND CRRA PREFERENCES

ONLINE APPENDIX: TESTABLE IMPLICATIONS OF TRANSLATION INVARIANCE AND HOMOTHETICITY: VARIATIONAL, MAXMIN, CARA AND CRRA PREFERENCES ONLINE APPENDIX: TESTABLE IMPLICATIONS OF TRANSLATION INVARIANCE AND HOMOTHETICITY: VARIATIONAL, MAXMIN, CARA AND CRRA PREFERENCES CHRISTOPHER P. CHAMBERS, FEDERICO ECHENIQUE, AND KOTA SAITO In thi online

More information