Stochastic Subgradient Methods with Approximate Lagrange Multipliers


Víctor Valls, Douglas J. Leith — Trinity College Dublin

(This work was supported by Science Foundation Ireland under Grant No. /PI/77.)

Abstract—We study the use of approximate Lagrange multipliers in the stochastic subgradient method for the dual problem in convex optimisation. The use of approximate Lagrange multipliers (instead of the true multipliers) is motivated by the fact that it lets us cast some control problems typically thought of as non-convex as convex optimisations. For example, we can solve some stochastic discrete decision making problems by simply solving a sequence of unconstrained convex optimisations. The use of approximate multipliers also allows us to solve problems where there are constraints on the order in which control actions can be made.

Index Terms—convex optimisation, max-weight scheduling, subgradient methods, approximate multipliers, stochastic control.

I. INTRODUCTION

In this paper we study the use of approximate Lagrange multipliers in subgradient methods for constrained convex optimisation problems. One motivation for using an approximation of the Lagrange multipliers is that in some problems the exact multipliers might not be available, but a perturbed or outdated version of them is; for example, in distributed optimisation the exchange of system state information might be affected by transmission delays or losses. Another motivation is the application to network scheduling problems involving queues. As shown in [1], it is possible to identify scaled queue occupancies with approximate Lagrange multipliers, i.e., the Lagrange multiplier λ_k generated by the subgradient method stays close to the α-scaled queue occupancy αQ_k in a network (where α is the step size used in the subgradient method). A direct consequence is that scaled queue occupancies in networks can be used as surrogates for the Lagrange multipliers in the subgradient method for the dual problem. In other words, by using approximate multipliers it is possible to cast some discrete decision making problems as convex optimisations.

Our convex approach to solving network/scheduling optimisation problems is in marked contrast to approaches like max-weight [2] that are built on Lyapunov optimisation theory. Whereas max-weight approaches focus on constructing a policy that stabilises a system, we instead solve the dual problem and then derive the stability of the system from the properties of the dual problem. This way of looking at the problem allows us to construct scheduling policies in a more flexible manner than with max-weight approaches. In particular, we can design scheduling policies that have constraints on the order in which discrete actions can be selected.

A. Related Work

The max-weight scheduling algorithm was proposed by Tassiulas and Ephremides in [2]. The algorithm is able to find an optimal scheduling policy (which consists of choosing the discrete action that minimises the sum of the squared queue backlogs) without prior knowledge of the mean packet arrival rates in the system. The max-weight algorithm has been extended in a series of papers [3], [4], [5] to consider the minimisation of a convex utility function, and to general functions in [6]. The works in [5], [6] also extend max-weight to allow convex constraints. The construction of max-weight policies with constraints (or penalties) is proposed in [7].
There the authors consider the problem of dynamic allocation of servers with switchover delay. The work in [8] studies the allocation of rates in a network while providing fairness, using a dynamical system approach and Lyapunov stability. In [9] the authors use a convex approach and study the optimal allocation of rates using subgradient methods. Importantly, [8] and [9] work with continuous valued variables rather than with the discrete actions used in the max-weight approaches [2], [3], [4], [5], [6]. The works in [10], [11] are the first to establish rigorous connections between the max-weight and dual approaches in convex optimisation via Lagrange multipliers.

II. CONVEX OPTIMISATION

Consider the following convex optimisation problem P in standard form:

    minimise_{x ∈ X} f(x)    (1)
    subject to g_i(x) ≤ 0, i = 1,...,m    (2)

where f, g_i : X → R are convex functions and X ⊆ R^n is a convex set. We will always assume that the set X_0 := {x ∈ X : g_i(x) ≤ 0, i = 1,...,m} is non-empty, and so problem P is feasible. Further, using standard notation we let f* := min_{x∈X_0} f(x) and x* ∈ argmin_{x∈X_0} f(x). As is usual in convex optimisation, the Lagrange dual function of problem P is given by

    q(λ) = inf_{x∈X} L(x,λ) = inf_{x∈X} { f(x) + Σ_{i=1}^m λ^(i) g_i(x) },    (3)

where λ^(j) is the j-th component of vector λ ∈ R^m_+. Recall that since q(λ) is the infimum of a set of affine functions of λ, it is concave.
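As a concrete illustration (a toy problem of our own, not one of the paper's examples), the short sketch below evaluates the dual function of a one-dimensional problem for which the inner minimisation has a closed form; printing the values over a grid shows the concavity of q and that its maximum equals f* under Slater's condition.

```python
# A minimal numerical illustration (toy problem, not from the paper) of the
# Lagrange dual function q(lambda) = inf_x L(x, lambda) for
#   minimise (x - 3)^2  subject to  x - 1 <= 0,  x in X = [-5, 5].
import numpy as np

def q(lam):
    """Dual function: minimise (x-3)^2 + lam*(x-1) over x in [-5, 5]."""
    x = np.clip(3.0 - lam / 2.0, -5.0, 5.0)   # unconstrained minimiser, projected onto X
    return (x - 3.0) ** 2 + lam * (x - 1.0)

lams = np.linspace(0.0, 10.0, 101)
vals = np.array([q(l) for l in lams])

# q is concave (infimum of affine functions of lambda); here it is maximised at
# lambda* = 4, where q(lambda*) = f* = 4 (Slater's condition holds, so strong duality).
print("argmax over grid:", lams[np.argmax(vals)], " max value:", vals.max())
```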

In order to keep notation short we will write L(x,λ) = f(x) + λ^T g(x), where g = [g_1,...,g_m]^T, and define X(λ) := {argmin_{x∈X} L(x,λ)} to be the set of minima of L(·,λ) for a given vector λ. The following two assumptions are key in our work.

Assumption 1 (Bounded Set). X is convex and bounded.

Assumption 2 (Slater Condition). relint(X_0) is non-empty, i.e., there exists a point x ∈ X such that g(x) < 0.

Assumption 1 is important because it guarantees that q is Lipschitz continuous with constant L_g := max_{x∈X} ‖g(x)‖, and Assumption 2 ensures that strong duality holds. Namely, the solution of the dual problem P_D,

    maximise_{λ ⪰ 0} q(λ),    (4)

coincides with the solution of the primal problem P. That is, max_{λ⪰0} q(λ) =: q(λ*) = f*, where λ* ∈ argmax_{λ⪰0} q(λ). Another consequence of the Slater condition is that the set of dual optima, Λ* := {argmax_{λ⪰0} q(λ)}, is a bounded subset of R^m_+ (see Lemma 1 in [12]).

A. Subgradient Method for the Dual Problem

Problem P_D is a concave maximisation problem that can be solved using the (projected) subgradient method. Recall that when the dual function is differentiable, the subgradient is simply the gradient. In short, the subgradient method consists of the following update

    λ_{k+1} = [λ_k + α g(x_k)]^+,    (5)

where [·]^+ = max{0,·}, λ_1 ∈ R^m_+, g(x_k) is a subgradient of q at point λ_k, and α > 0 is a constant step size. The subgradient method can make use of more elaborate step size rules, but as we will show in Section II-B, a constant step size plays a prominent role when working with averages.

Lemma 1 (Subgradient Method). Consider optimisation problem P_D and the subgradient update (5), where x_k ∈ X(λ_k) and α > 0. Suppose Assumptions 1 and 2 hold. Then, there exists a non-negative and non-increasing sequence {β_k} that converges to α(3/2)L_g² for k large enough and satisfies β_k ≥ q(λ*) − q(λ_k) ≥ 0 for all k.

The intuition behind the lemma is that λ_k is monotonically attracted to a vector λ* ∈ Λ* until it converges to a ball around Λ*, the size of which depends on the parameter α. Hence, by the Lipschitz continuity of the dual function we can expect that q(λ_k) converges (possibly not monotonically) to a ball around q(λ*). Moreover, since λ_k is monotonically attracted to a ball around Λ*, and Λ* is a bounded subset of R^m_+ by Assumption 2, it follows that λ_k is bounded for all k. See [12, Lemma 3] for a bound on ‖λ_k‖. Figure 1 illustrates schematically the convergence of q(λ_k) to the level set q(λ*) − q(λ) ≤ α(3/2)L_g².

Fig. 1. Illustrating the convergence of the subgradient method for the dual problem with constant step size.
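The following minimal sketch (continuing the toy problem above; the step size and iteration count are arbitrary choices) implements the constant-step-size dual update (5) and illustrates the behaviour described in Lemma 1.

```python
# A sketch (toy problem, not from the paper) of the constant-step-size subgradient
# method (5) for the dual problem:  lambda_{k+1} = [lambda_k + alpha * g(x_k)]^+,
# with x_k a minimiser of the Lagrangian at lambda_k.
import numpy as np

def x_of_lambda(lam):
    # x_k in X(lambda_k): minimiser of (x-3)^2 + lam*(x-1) over X = [-5, 5]
    return np.clip(3.0 - lam / 2.0, -5.0, 5.0)

def g(x):
    # constraint function g(x) = x - 1 (a subgradient of q at lambda_k)
    return x - 1.0

alpha, K = 0.05, 2000
lam = 0.0
for k in range(K):
    x = x_of_lambda(lam)
    lam = max(0.0, lam + alpha * g(x))      # projected dual update (5)

# For this smooth toy dual the iterate converges to lambda* = 4; in general
# (non-smooth q) it settles in a ball around the dual optimal set whose size
# scales with alpha, consistent with Lemma 1.
print("final lambda_k:", lam)
```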
B. Approximate Primal Solutions Using Averages

We now proceed to show how we can recover approximate primal solutions using the subgradient method for the dual problem. First we present the following lemma, which relates bounded Lagrange multipliers and the feasibility of averages of primal variables.

Lemma 2 (Feasible Solutions using Averages). Let {x_k} be a sequence of points in X and consider the update λ_{k+1} = [λ_k + α g(x_k)]^+. Then, g(x̄_k) ⪯ λ_{k+1}/(αk), where x̄_k = (1/k) Σ_{i=1}^k x_i.

Observe from Lemma 2 that if λ_{k+1} is bounded for all k (which is the case in the subgradient method) then x̄_k is attracted to a feasible point as k → ∞. We now present one of our main results, which is a refinement of Theorem 3 in [13] with sharper bounds and a simpler proof.

Theorem 1 (Approximate Primal Solutions using Averages). Consider the subgradient method for the dual problem with constant step size α > 0, i.e.,

    λ_{k+1} = [λ_k + α g(x_k)]^+,    (6)

where x_k ∈ X(λ_k) := {argmin_{x∈X} L(x,λ_k)}. Suppose Assumptions 1 and 2 hold. Then,

    −β_k − λ̄_k/(kα) ≤ f(x̄_k) − f* ≤ αL_g²/2 + ‖λ_1‖²/(2kα),    (7)

where λ̄_k = λ_k^T λ_{k+1}, x̄_k := (1/k) Σ_{i=1}^k x_i, and {β_k} is the sequence given by the subgradient method.

Theorem 1 says that f(x̄_k) converges to a value close to f* for k sufficiently large and α sufficiently small. Observe that if we let k → ∞ we obtain

    −α(3/2)L_g² ≤ f(x̄_k) − f* ≤ αL_g²/2,    (8)

and if we let α → 0 then f(x̄_k) → f*. Further, since by Lemma 2 we have that g(x̄_k) ⪯ 0 as k → ∞, we can conclude that x̄_k → x*. How fast the sequence {β_k} converges to α(3/2)L_g² will depend on the step size and on the characteristics of the objective function and constraints. For example, if q(λ) were strongly concave we could use second order methods and obtain a sequence {β_k} that converges quickly to α(3/2)L_g².
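Continuing the same toy sketch, the snippet below keeps the running average x̄_k alongside the dual update and reports the quantities appearing in Lemma 2 and Theorem 1 (the feasibility bound on g(x̄_k) and the gap f(x̄_k) − f*); all numbers are illustrative.

```python
# A sketch (same toy problem as above, not from the paper) of recovering an
# approximate primal solution from the running average x_bar_k = (1/k) * sum_i x_i.
import numpy as np

f = lambda x: (x - 3.0) ** 2
g = lambda x: x - 1.0
x_of_lambda = lambda lam: np.clip(3.0 - lam / 2.0, -5.0, 5.0)

alpha, K = 0.05, 5000
lam, x_sum = 0.0, 0.0
for k in range(1, K + 1):
    x = x_of_lambda(lam)
    x_sum += x
    lam = max(0.0, lam + alpha * g(x))      # dual update (6)
x_bar = x_sum / K

# f* = 4 at x* = 1 for this toy problem; the gap shrinks with alpha and with k,
# and g(x_bar_K) respects the Lemma 2 bound lambda_{K+1}/(K*alpha).
print("f(x_bar_K) =", f(x_bar), " g(x_bar_K) =", g(x_bar),
      " Lemma 2 bound =", lam / (K * alpha))
```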

Theorem 1 can be extended to consider the case where the subgradient g(x_k) is computed using a point nearby λ_k. Formally, x_k ∈ X(μ_k), where μ_k is an approximate Lagrange multiplier of λ_k that satisfies ‖λ_k − μ_k‖ ≤ ǫ/L_g for all k and ǫ ≥ 0. As we will show in Section III, it is possible to identify scaled queue occupancies with the approximate multipliers μ_k. A consequence of this is that scaled queue occupancies can be used as surrogate quantities in network optimisation. We have the following corollary to Theorem 1.

Corollary 1. Consider the setup of Theorem 1 where update (6) now uses x_k ∈ X(μ_k) with ‖λ_k − μ_k‖ ≤ ǫ/L_g for all k = 1,2,... and ǫ ≥ 0. Then, bound (7) holds and the sequence {β_k} converges to α(3/2)L_g² + ǫ for k sufficiently large.

The use of a nearby or approximate Lagrange multiplier to compute the subgradient of the dual function is in fact equivalent to running the subgradient method for the dual problem with ǫ-subgradients (see [14]). Observe that |q(λ_k) − q(μ_k)| ≤ ‖λ_k − μ_k‖ L_g ≤ ǫ. Figure 2 illustrates the convergence of q(λ_k) to a ball around q(λ*) when the subgradient is computed using a nearby Lagrange multiplier. Comparing Figures 1 and 2, observe that the ball to which q(λ_k) converges now also depends on the parameter ǫ (which controls the distance between λ_k and μ_k).

Fig. 2. Illustrating the convergence of the subgradient method for the dual problem with constant step size when the subgradients are obtained using nearby (dual) points or approximate Lagrange multipliers. Observe from the figure that the approximate Lagrange multiplier μ_k (dashed line) is always close to the Lagrange multiplier λ_k (solid line).

C. Subgradient Method with Noisy Dual Updates

The subgradient method for the dual problem can be extended to consider noisy dual updates. We have the following lemma.

Lemma 3 (Stochastic Subgradient Method with Approximate Lagrange Multipliers). Consider optimisation problem P_D and the subgradient update

    λ_{k+1} = [λ_k + α(g(x_k) + B_k)]^+,    (9)

where x_k ∈ X(μ_k) with ‖λ_k − μ_k‖ ≤ ǫ/L_g, ǫ ≥ 0, and the B_k are i.i.d. random variables with mean b ∈ R^m and finite variance σ_B². Suppose Assumption 1 holds and that there exists a point x ∈ X such that g(x) + b < 0. Also, suppose the subgradients of the dual function are bounded, i.e., max_{x∈X} ‖g(x) + B_k‖ ≤ L_h for all k. Then,

    ‖λ_1 − λ*‖²/(2kα) + αL_h²/2 + ǫ ≥ q(λ*) − q(E(λ̄_k)) ≥ 0,    (10)

where λ̄_k := (1/k) Σ_{i=1}^k λ_i.

In the stochastic subgradient method we have that the expected value of the running average λ̄_k is attracted to a ball around Λ*. This is schematically illustrated in Figure 3. An important point to note is that now we require that the Slater condition hold with the mean of the random variable, i.e., there exists a point x ∈ X such that g(x) + b < 0, and not just g(x) < 0.

Fig. 3. Illustrating the convergence of the stochastic subgradient method with approximate Lagrange multipliers. In contrast to the deterministic case (see Figure 1), we now have that λ̄_k converges to a ball around λ*.

We can obtain approximate primal solutions using the stochastic subgradient update.

Theorem 2 (Approximate Primal Solutions using Averages, Approximate Lagrange Multipliers and Noisy Dual Updates). Consider the stochastic subgradient method of Lemma 3. Suppose Assumption 1 holds and that there exists a point x ∈ X such that g(x) + b < 0. Also, suppose the subgradients of the dual function are bounded, i.e., max_{x∈X} ‖g(x) + B_k‖ ≤ L_h for all k. Then,

    −‖λ_1 − λ*‖²/(2kα) − αL_h²/2 − ǫ − λ̄ᴱ_k/(kα) ≤ f(E(x̄_k)) − f* ≤ αL_h²/2 + ‖λ_1‖²/(2kα),    (11)

where x̄_k := (1/k) Σ_{i=1}^k x_i and λ̄ᴱ_k := E[λ̄_k]^T E[λ_{k+1}].

Theorem 2 says that f(E(x̄_k)) converges to a ball around f* for k sufficiently large, and when k → ∞ we have that −αL_h²/2 − ǫ ≤ f(E(x̄_k)) − f* ≤ αL_h²/2. Further, if α → 0 then f(E(x̄_k)) → f* as k → ∞.
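To make the stochastic setting of Lemma 3 and Theorem 2 concrete, the sketch below runs the noisy dual update (9) on the same toy problem, using a perturbed multiplier μ_k to compute x_k and bounded i.i.d. noise B_k with mean b. The noise distribution, ǫ and the other constants are assumptions made only for illustration.

```python
# A sketch (toy problem, not from the paper) of the stochastic dual update (9),
#   lambda_{k+1} = [lambda_k + alpha * (g(x_k) + B_k)]^+,
# where x_k minimises the Lagrangian at an *approximate* multiplier mu_k close to
# lambda_k, and B_k is i.i.d. noise with mean b (so the effective constraint is
# g(x) + b <= 0). All numbers here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: (x - 3.0) ** 2
g = lambda x: x - 1.0
x_of_mu = lambda mu: np.clip(3.0 - mu / 2.0, -5.0, 5.0)   # minimiser of L(x, mu) over X

alpha, K, b, eps = 0.05, 20000, 0.5, 0.1
lam, lam_sum, x_sum = 0.0, 0.0, 0.0
for k in range(1, K + 1):
    mu = lam + eps * rng.uniform(-1.0, 1.0)     # approximate multiplier, |lam - mu| <= eps
    x = x_of_mu(max(0.0, mu))
    B = b + rng.uniform(-0.5, 0.5)              # bounded i.i.d. noise with mean b
    lam = max(0.0, lam + alpha * (g(x) + B))    # noisy dual update (9)
    lam_sum += lam
    x_sum += x

# The running averages play the role of E(lambda_bar_k) and E(x_bar_k) in Lemma 3 /
# Theorem 2: lam_sum/K concentrates near the optimum of the perturbed dual, and
# f(x_sum/K) near the optimal cost of the perturbed problem (x <= 1 - b = 0.5 here).
print("average multiplier:", lam_sum / K, " f(x_bar):", f(x_sum / K), " x_bar:", x_sum / K)
```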
III. APPROXIMATE LAGRANGE MULTIPLIERS AND QUEUE CONTINUITY

Queue updates in networks are usually given by Q_{k+1} = [Q_k + δ_k]^+, where Q_k ∈ R^m_+ and δ_k ∈ Z^m. The difference δ_k is discrete and represents the change in the discrete quantities (packets, cars, people, etc.) that enter or leave the queue in a time slot. Next, observe that since [·]^+ = max{0,·} is a (positively) homogeneous function, if we let μ_k := αQ_k we can write μ_{k+1} = [μ_k + αδ_k]^+, where α is the step size used in the subgradient method. That is, an α-scaled queue behaves like a subgradient update, with the difference that δ_k is constrained to be discrete.
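The identification μ_k = αQ_k rests precisely on this positive homogeneity of the projection [·]^+. The following few lines verify it numerically for a random integer-valued difference sequence (an illustrative check, not part of the paper).

```python
# A quick numerical check (illustrative) that the projection [.]^+ = max{0, .} is
# positively homogeneous, so scaling a queue update by alpha gives exactly the
# multiplier-style recursion mu_{k+1} = [mu_k + alpha*delta_k]^+ with mu_k = alpha*Q_k.
import numpy as np

rng = np.random.default_rng(1)
alpha, m, K = 0.05, 3, 1000
Q = np.zeros(m)
mu = alpha * Q
for k in range(K):
    delta = rng.integers(-2, 3, size=m)          # integer arrivals/departures per slot
    Q = np.maximum(0.0, Q + delta)               # queue update Q_{k+1} = [Q_k + delta_k]^+
    mu = np.maximum(0.0, mu + alpha * delta)     # scaled update with mu_k = alpha*Q_k
    assert np.allclose(mu, alpha * Q)            # identical trajectories

print("alpha-scaled queue and multiplier-style update coincide for all k")
```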

Recall that μ_k will be an approximate Lagrange multiplier if it stays uniformly close to λ_k for all k. This will indeed be the case when the sequence of discrete quantities {δ_k} and the subgradients {g(x_k)} stay close in an appropriate sense, and the constraints in optimisation P are linear [13]. We have the following lemma, which is a restatement of a result from [15].

Lemma 4 (Continuity of the Skorokhod Map). Consider the updates λ_{k+1} = [λ_k + αx_k]^+ and μ_{k+1} = [μ_k + αy_k]^+, where λ_1 = μ_1 ⪰ 0, α > 0, and {x_k} and {y_k} are two sequences of points from R^n such that ‖Σ_{i=1}^k (x_i − y_i)‖ ≤ ǫ for all k. Then,

    ‖λ_{k+1} − μ_{k+1}‖ ≤ 2αǫ.    (13)

Lemma 4 requires ‖Σ_{i=1}^k (x_i − y_i)‖ to be uniformly bounded so that the difference ‖λ_k − μ_k‖ is also bounded. Hence, it is interesting to know under which conditions it is possible to construct a sequence {y_k} ⊆ Y ⊆ R^n that stays close to an arbitrary sequence {x_k} ⊆ X ⊆ R^n. Next we show that this is always possible when Y is a finite collection of points from R^n and X ⊆ conv(Y), the convex hull of Y.

A. Construction of Discrete Sequences

Consider a set of m discrete points Y from R^n such that X ⊆ conv(Y). Collect the points in Y as columns of a matrix W. We can write any point x ∈ X as a convex combination of points in Y, i.e., x = Wu where u ∈ U := {u ∈ R^m : 1^T u = 1, u ⪰ 0}. (Note that there might exist multiple vectors in U such that Wu = x; one such vector can be efficiently found by solving the convex optimisation problem min_{u∈U} ‖Wu − x‖ with standard convex methods, see Chapters 6 and 9 in [16].) Similarly, define E := {e ∈ {0,1}^m : 1^T e = 1} to be the set containing the m-dimensional standard basis vectors. There is exactly one vector e ∈ E for each y ∈ Y such that We = y. We can write Σ_{i=1}^k (x_i − y_i) = W Σ_{i=1}^k (u_i − e_i), and since ‖W Σ_{i=1}^k (u_i − e_i)‖ ≤ ‖W‖ ‖Σ_{i=1}^k (u_i − e_i)‖ by the Cauchy-Schwarz inequality, showing that

    ‖Σ_{i=1}^k (u_i − e_i)‖    (14)

is bounded for all k is sufficient to establish the boundedness of ‖Σ_{i=1}^k (x_i − y_i)‖. We show this in the following corollary, which is a straightforward extension of [13, Theorem 4].

Corollary 2 (Block Sequence). Let T ∈ N, U := {u ∈ R^m : 1^T u = 1, u ⪰ 0}, E := {e ∈ {0,1}^m : 1^T e = 1} and define S := Σ_{i=1}^{Tm} u_i, where u_i ∈ U for all i = 1,...,Tm. Let δ ∈ D := {w ∈ R^m : 1^T w = 0, ‖w‖ ≤ 1} and select

    e_k ∈ argmin_{e∈E} ‖δ + S − Σ_{i=1}^{k−1} e_i − e‖.    (15)

Then, we have that (δ + S − S̄) ∈ D, where S̄ := Σ_{i=1}^{Tm} e_i.

Fig. 4. Illustrating the construction of a subsequence in an online mode in a delayed or shifted manner with m = 3 and T = 2. Subsequence {e_7,...,e_12} is constructed with update (15) from subsequence {u_1,...,u_6}, and therefore (δ + Σ_{i=1}^6 u_i − Σ_{j=7}^{12} e_j) ∈ D. Subsequence {e_1,...,e_6} can be selected randomly from E and does not affect the boundedness of (14).

Corollary 2 says that for any subsequence {u_1,...,u_{Tm}} we can use update (15) to construct a subsequence {e_1,...,e_{Tm}} such that

    (δ + S − S̄) = δ + Σ_{i=1}^{Tm} (u_i − e_i) =: δ_2 ∈ D.    (16)

Hence, for any other subsequence {u_{Tm+1},...,u_{2Tm}} we can use update (15) to construct a subsequence {e_{Tm+1},...,e_{2Tm}} such that δ_2 + Σ_{i=Tm+1}^{2Tm} (u_i − e_i) = (δ + Σ_{i=1}^{2Tm} (u_i − e_i)) ∈ D. Next, observe that if we let S_l := Σ_{j=(l−1)Tm+1}^{lTm} u_j and S̄_l := Σ_{j=(l−1)Tm+1}^{lTm} e_j with l = 1,2,..., we will always have that {Σ_{i=1}^l (S_i − S̄_i)} is a sequence of points from D := {w ∈ R^m : 1^T w = 0, ‖w‖ ≤ 1} and therefore bounded. The boundedness of (14) for all k = 1,2,... follows immediately from the fact that subsequences have finite length. Importantly, note that update (15) requires knowing Tm elements from U in advance in order to select Tm elements from E. This implies that it will only be possible to construct subsequences in an online mode in a delayed or shifted manner; see the example in Figure 4.
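The sketch below implements one plausible reading of the greedy block rule (15): within each block of Tm slots the block sum of the u_i is assumed known in advance, and each e_k is chosen as the standard basis vector that minimises the norm of the running mismatch. The block length, dimension and the random simplex points are illustrative assumptions.

```python
# A sketch (under assumptions; one reading of update (15)) of the greedy block
# construction of a discrete sequence {e_k} that tracks a sequence {u_k} from the
# simplex. Quantities below (T, m, the random u's) are illustrative.
import numpy as np

rng = np.random.default_rng(2)
m, T = 3, 2
E = np.eye(m)                                   # standard basis vectors e in {0,1}^m

def build_block(delta, U_block):
    """Greedy selection of T*m basis vectors for one block of T*m simplex points."""
    S = U_block.sum(axis=0)                     # block sum, known in advance
    chosen, acc = [], np.zeros(m)
    for _ in range(U_block.shape[0]):
        residuals = [np.linalg.norm(delta + S - acc - e) for e in E]
        e = E[int(np.argmin(residuals))]        # update (15): closest basis vector
        chosen.append(e)
        acc += e
    return np.array(chosen), delta + S - acc    # new carried-over mismatch

delta = np.zeros(m)
for block in range(4):
    U_block = rng.dirichlet(np.ones(m), size=T * m)   # arbitrary points of the simplex
    e_block, delta = build_block(delta, U_block)
    # the mismatch sums to zero and stays small, so cumulative sums of the u's and
    # e's remain uniformly close (cf. Corollary 2 and Lemma 4)
    print("block", block, "mismatch:", np.round(delta, 3), "sum:", round(delta.sum(), 9))
```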
Nonetheless, since subsequences have finite length, constructing subsequences in a shifted manner does not affect the boundedness of (14). An important observation is that the elements in a subsequence {e_1,...,e_{Tm}} can be reordered and we still have that (δ + Σ_{i=1}^{Tm} (u_i − e_i)) ∈ D. This will be particularly useful in problems where there are constraints on the order in which actions y_k (and so e_k) can be selected. In the next section we present a network example where the selection of discrete actions has such constraints and cannot be handled with classic max-weight approaches.

IV. WIRELESS NETWORK EXAMPLE

A. Problem Setup

Consider the network shown in Figure 5: an Access Point (AP) that transmits to two wireless nodes. Time is slotted and in each time slot packets arrive at the queues of the AP (Q_1 and Q_2) with probability p_1 and p_2, respectively. In each time slot the AP takes an action from the action set Y := {y_0, y_1, y_2} = {[0,0], [1,0], [0,1]}, where each action corresponds, respectively, to not transmitting, to transmitting one packet from Q_1 to node 1, and to transmitting one packet from Q_2 to node 2.

Fig. 5. Illustrating the network in the example of Section IV. The Access Point (AP) sends packets from Q_1 to node 1, and from Q_2 to node 2.

The transmission protocol of the AP has constraints on how actions can be selected. In particular, it is not possible to select action y_1 after y_2 without first selecting y_0. Likewise, it is not possible to select y_2 after y_1 without first selecting y_0. However, y_1 or y_2 can be selected repeatedly in consecutive slots. An example of an admissible sequence is {y_1, y_1, y_1, y_0, y_2, y_0, y_1, y_1, y_0, y_2, y_2, ...}. These types of constraints appear in different areas, and are known in the literature as reconfiguration delays [7]. An example is asking for the Channel State Information (CSI) in wireless communications in order to adjust the transmission parameters. (In practice the CSI is requested periodically, and not only at the beginning of a transmission, but we make this assumption for simplicity of exposition; the extension is nevertheless straightforward.) Our goal is to design a scheduling policy for the AP (select actions from set Y) that minimises a convex cost function of the average throughput x̄, and ensures that the system is stable, i.e., the queues do not overflow and so all traffic can be served.

B. Problem Formulation

The convex or fluid formulation of the problem is

    minimise_{x∈X} f(x)    (17)
    subject to b ⪯ x    (18)

where f : R² → R, b ∈ R²_+ and X is a bounded convex subset of conv(Y) ⊆ R² that depends on the protocol constraints, i.e., on how actions can be selected. If there were no constraints on the order in which actions could be selected we would have X := conv(Y); however, this is not the case here. Characterising X is as simple as noting that if we have a subsequence of actions in which y_1 and y_2 each appear consecutively, then y_0 should appear at least twice in order for the subsequence to be compliant with the transmission protocol, for example {y_0, y_1, y_1, y_1, y_0, y_2, y_2, y_2, y_2}. Conversely, any subsequence in which y_0 appears at least twice can be reordered to obtain a subsequence that is compliant with the transmission protocol. Since we can always choose a subsequence of discrete actions and then reorder its elements (see the discussion after Corollary 2), we just need to enforce that y_0 appears twice in a subsequence. This can be obtained if the convex combination of a point x ∈ X gives point y_0 weight at least 2/(Tm), where Tm is the length of a subsequence. This will be the case when

    X := η conv(Y),    (19)

where η := 1 − 2/(Tm). Note from (19) that as T gets larger, η → 1 and therefore X → conv(Y), i.e., the network capacity region increases. Figure 6 illustrates the network capacity region of the example.

Fig. 6. Network capacity region of the example in Section IV. X = η conv(Y) with η = 1 − 2/(Tm).

1) Updates: We use scaled queue occupancies as approximate multipliers. At each time slot we have the updates

    x_k ∈ argmin_{x∈X} { f(x) − αQ_k^T x },    (20)
    Q_{k+1} = [Q_k − y_k + B_k]^+,    (21)

where the B_k are i.i.d. random vectors that take values in {0,1}² with mean [p_1, p_2]^T. Action y_k = We_k, and e_k is obtained with (15) in the online mode explained in Section III-A. The elements in a subsequence are reordered to comply with the transmission protocol constraints.
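A simplified end-to-end simulation sketch of updates (20)–(21) follows. It is not the paper's exact implementation: the block construction of Section III-A and the protocol-compliant reordering are replaced by a plain greedy vertex-tracking rule, and the cost, arrival probabilities and step size are illustrative assumptions.

```python
# A simplified simulation sketch of the example in Section IV (illustrative only).
import numpy as np

rng = np.random.default_rng(3)
Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # actions: idle, serve Q1, serve Q2
p = np.array([0.25, 0.5])                             # packet arrival probabilities
alpha, T, m, K = 0.05, 3, 3, 50000
eta = 1.0 - 2.0 / (T * m)                             # X = eta * conv(Y), cf. (19)
f = lambda x: np.sum(x ** 2)                          # convex cost of average throughput

def primal_update(c):
    """Exact minimiser of ||x||^2 - c^T x over X = {x >= 0, x1 + x2 <= eta} (2-D)."""
    x = c / 2.0
    if x.sum() <= eta:
        return x
    x1 = np.clip((2.0 * eta + c[0] - c[1]) / 4.0, 0.0, eta)
    return np.array([x1, eta - x1])

Q = np.zeros(2)
x_sum, y_sum = np.zeros(2), np.zeros(2)
for k in range(1, K + 1):
    x = primal_update(alpha * Q)                      # update (20) with lambda_k = alpha*Q_k
    x_sum += x
    # greedy stand-in for (15): pick the action whose cumulative sum best tracks x_sum
    y = Y[np.argmin([np.linalg.norm(x_sum - (y_sum + v)) for v in Y])]
    y_sum += y
    B = (rng.random(2) < p).astype(float)             # Bernoulli(p) arrivals
    Q = np.maximum(0.0, Q - y + B)                    # update (21): queue dynamics

print("average throughput x_bar:", x_sum / K, " cost f(x_bar):", f(x_sum / K))
print("final queue occupancies:", Q, " (bounded queues indicate stability)")
```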
2) Simulation: We run a simulation with objective function f = ‖x‖² and parameters p_1 = 0.25, p_2 = 0.5, α = 0.05, λ_1 = αQ_1 = 0, and T = 3 (so the number of elements in a subsequence is 9, since m = 3). Figure 7 shows the convergence of λ̄_k to a ball around λ*. Interestingly, note that whereas λ̄_k converges to λ*, the Lagrange multipliers λ_k (grey points) do not converge to a ball around λ*. The stability of the system follows from the fact that if the average of the Lagrange multipliers is bounded, then the average of the α-scaled queue occupancies is also bounded. Figure 8 shows that f(x̄_k) converges to a ball around f* as k increases.

V. CONCLUSIONS

We have shown that the stochastic subgradient method for the dual problem in convex optimisation allows the use of approximate Lagrange multipliers. The use of approximate Lagrange multipliers is motivated by the fact that it is possible to identify α-scaled queue occupancies as approximations of the Lagrange multipliers in the subgradient method. As a result, it is possible to cast some discrete decision making problems as convex optimisations. One of the advantages of our convex approach is that we can make optimal control decisions under constraints on the order in which actions can be selected.

Fig. 7. Illustrating the convergence of the average λ̄_{k+1} (blue line) to a ball around λ* = [0.5, 1]^T (grey area). Grey points indicate the values of the Lagrange multipliers λ_k, and λ^(1) and λ^(2) denote the first and second components of vector λ.

Fig. 8. Illustrating the convergence of f(x̄_k) to f* = 0.3125 (straight line) in the network example of Section IV.

APPENDIX

A. Proof of Lemma 1

Consider the update λ_{k+1} = [λ_k + αg(x_k)]^+ and suppose we select x_k ∈ X(μ_k) := {x ∈ X : L(x, μ_k) − q(μ_k) ≤ ǫ}. Observe that

    ‖λ_{k+1} − λ*‖² = ‖[λ_k + αg(x_k)]^+ − λ*‖² ≤ ‖λ_k + αg(x_k) − λ*‖² = ‖λ_k − λ*‖² + α²‖g(x_k)‖² + 2α(λ_k − λ*)^T g(x_k) ≤ ‖λ_k − λ*‖² + α²L_g² + 2α(q(λ_k) + ǫ − q(λ*)),    (22)

where the last inequality follows from the fact that (λ_k − λ*)^T g(x_k) = L(x_k, λ_k) − L(x_k, λ*) ≤ L(x_k, λ_k) − q(λ*) ≤ q(λ_k) + ǫ − q(λ*), and ‖g(x_k)‖ ≤ L_g for all k. Rearranging terms yields

    ‖λ_{k+1} − λ*‖² − ‖λ_k − λ*‖² ≤ α²L_g² + 2α(q(λ_k) + ǫ − q(λ*)).    (23)

Now let Λ_δ := {λ ∈ R^m_+ : q(λ*) − q(λ) ≤ αL_g²/2 + ǫ} and consider two cases.

Case (i) (λ_k ∉ Λ_δ): Observe that the RHS of (23) is strictly negative and therefore λ_k is monotonically attracted to Λ_δ. Hence, for any λ_1 ∈ R^m_+ and k sufficiently large it will eventually happen that λ_k ∈ Λ_δ and therefore q(λ*) − q(λ_k) ≤ αL_g²/2 + ǫ.

Case (ii) (λ_k ∈ Λ_δ): We cannot ensure that the RHS of (23) is strictly negative; however, from the Lipschitz continuity of the dual function we have |q(λ_{k+1}) − q(λ_k)| ≤ ‖λ_{k+1} − λ_k‖ L_g = ‖[λ_k + αg(x_k)]^+ − λ_k‖ L_g ≤ ‖αg(x_k)‖ L_g ≤ αL_g². Now see that q(λ*) − q(λ_{k+1}) = q(λ_k) − q(λ_{k+1}) + q(λ*) − q(λ_k) ≤ |q(λ_{k+1}) − q(λ_k)| + q(λ*) − q(λ_k) ≤ αL_g² + αL_g²/2 + ǫ = α(3/2)L_g² + ǫ. That is, λ_{k+1} ∈ Λ'_δ := {λ ∈ R^m_+ : q(λ*) − q(λ) ≤ α(3/2)L_g² + ǫ}.

Finally, note that Λ_δ ⊆ Λ'_δ and that since, by case (i), any λ_{k+1} ∈ Λ'_δ \ Λ_δ will be attracted to Λ_δ, we can conclude that for every λ_1 ∈ R^m_+ there exists a non-negative and non-increasing sequence {β_k} that converges to α(3/2)L_g² + ǫ for k sufficiently large and satisfies β_k ≥ q(λ*) − q(λ_k) ≥ 0 for all k.

B. Proof of Lemma 2

Observe that λ_{i+1} = [λ_i + αg(x_i)]^+ ⪰ λ_i + αg(x_i), and rearranging terms αg(x_i) ⪯ λ_{i+1} − λ_i. Summing over i = 1,...,k yields α Σ_{i=1}^k g(x_i) ⪯ λ_{k+1} − λ_1 ⪯ λ_{k+1}. Dividing by αk and using the fact that g(x̄_k) ⪯ (1/k) Σ_{i=1}^k g(x_i) (by convexity of g) yields the stated result.

C. Proof of Theorem 1

From Lemma 1 we have that β_k ≥ q(λ*) − q(λ_k) ≥ 0, where {β_k} is a non-increasing sequence that converges to α(3/2)L_g² + ǫ for k sufficiently large. Next, since q(λ*) − q(λ_k) ≥ q(λ*) − L(x̄_k, λ_k) = q(λ*) − f(x̄_k) − λ_k^T g(x̄_k), by rearranging terms and using the fact that q(λ*) = f* (by strong duality) we obtain −β_k − λ_k^T g(x̄_k) ≤ f(x̄_k) − f*. From Lemma 2 we have that g(x̄_k) ⪯ λ_{k+1}/(αk) for all k, and since λ_k ⪰ 0 we obtain

    −β_k − λ̄_k/(kα) ≤ f(x̄_k) − f*,    (24)

where λ̄_k = λ_k^T λ_{k+1}. We now show the upper bound in (7). Since f* = q(λ*) ≥ (1/k) Σ_{i=1}^k q(λ_i) = (1/k) Σ_{i=1}^k ( f(x_i) + λ_i^T g(x_i) ) ≥ f(x̄_k) + (1/k) Σ_{i=1}^k λ_i^T g(x_i), by rearranging terms we have that f(x̄_k) − f* ≤ −(1/k) Σ_{i=1}^k λ_i^T g(x_i). Now observe that for any sequence {x_k} in X we can write ‖λ_{k+1}‖² = ‖[λ_k + αg(x_k)]^+‖² ≤ ‖λ_k + αg(x_k)‖² = ‖λ_k‖² + α²‖g(x_k)‖² + 2α λ_k^T g(x_k). Applying the latter equation recursively for i = 1,...,k we obtain ‖λ_{k+1}‖² ≤ ‖λ_1‖² + α² Σ_{i=1}^k ‖g(x_i)‖² + 2α Σ_{i=1}^k λ_i^T g(x_i), and rearranging terms and dividing by 2αk yields −(1/k) Σ_{i=1}^k λ_i^T g(x_i) ≤ (α/(2k)) Σ_{i=1}^k ‖g(x_i)‖² + ‖λ_1‖²/(2αk). Finally, since ‖g(x_i)‖ ≤ L_g by Assumption 1 we obtain

    f(x̄_k) − f* ≤ αL_g²/2 + ‖λ_1‖²/(2αk),

which concludes the proof.

D. Proof of Lemma 3

At each iteration we aim to decrease the expected distance of λ_k from λ*. In short, observe that for any vector θ ∈ R^m_+ we have E(‖λ_{k+1} − θ‖² | λ_k) = E(‖[λ_k + α g̃(x_k)]^+ − θ‖² | λ_k) ≤ E(‖λ_k + α g̃(x_k) − θ‖² | λ_k), where g̃(x_k) = g(x_k) + B_k.

Expanding the RHS of the last equation and rearranging terms yields

    E(‖λ_{k+1} − θ‖² | λ_k) ≤ ‖λ_k − θ‖² + α² E(‖g̃(x_k)‖² | λ_k) + 2α E((λ_k − θ)^T g̃(x_k) | λ_k)    (25)
                           ≤ ‖λ_k − θ‖² + α² L_h² + 2α E((λ_k − θ)^T g̃(x_k) | λ_k),    (26)

where the last inequality follows from the boundedness of the subgradients. Next, see that since E((λ_k − θ)^T (g(x_k) + B_k) | λ_k) = (λ_k − θ)^T (g(x_k) + b) = L(x_k, λ_k) − L(x_k, θ) ≤ L(x_k, λ_k) − q(θ) ≤ q(λ_k) + ǫ − q(θ), we have that E(‖λ_{k+1} − θ‖² | λ_k) ≤ ‖λ_k − θ‖² + α²L_h² + 2α(q(λ_k) + ǫ − q(θ)), and taking expectations yields E(‖λ_{k+1} − θ‖²) ≤ E(‖λ_k − θ‖²) + α²L_h² + 2α E(q(λ_k) + ǫ − q(θ)). Applying the latter expression recursively we obtain E(‖λ_{k+1} − θ‖²) ≤ ‖λ_1 − θ‖² + kα²L_h² + 2α Σ_{i=1}^k E(q(λ_i) + ǫ − q(θ)). Dropping E(‖λ_{k+1} − θ‖²), further rearranging and dividing by 2αk yields

    ‖λ_1 − θ‖²/(2kα) + αL_h²/2 + ǫ ≥ (1/k) Σ_{i=1}^k E( q(θ) − q(λ_i) ).    (27)

Next, see that by the linearity of expectation one can write (1/k) Σ_{i=1}^k E(q(λ_i)) = E( (1/k) Σ_{i=1}^k q(λ_i) ), where λ_i, i = 1,...,k, are the Lagrange multipliers obtained from update (9). Next, let λ̄_k := (1/k) Σ_{i=1}^k λ_i and see that by the concavity of q we can write E( (1/k) Σ_{i=1}^k q(λ_i) ) ≤ E(q(λ̄_k)) ≤ q(E(λ̄_k)). Hence,

    ‖λ_1 − θ‖²/(2kα) + αL_h²/2 + ǫ ≥ q(θ) − q(E(λ̄_k)),

and if we let θ = λ* we obtain that q(E(λ̄_k)) converges to a ball around q(λ*).

E. Proof of Theorem 2

We follow the same steps as in the proof of Theorem 1. From the stochastic subgradient method (see Lemma 3) we have that for any θ ∈ R^m_+ there exists a non-negative and non-increasing sequence {β_k} such that β_k ≥ q(θ) − q(E(λ̄_k)) for all k. Also, since we can always write q(E(λ̄_k)) ≤ L(E(x̄_k), E(λ̄_k)) = f(E(x̄_k)) + E(λ̄_k)^T g(E(x̄_k)), and g(E(x̄_k)) ⪯ E(λ_{k+1})/(αk) from Lemma 2, rearranging terms yields

    −β_k − λ̄ᴱ_k/(kα) ≤ f(E(x̄_k)) − q(θ),    (28)

where λ̄ᴱ_k = E(λ̄_k)^T E(λ_{k+1}). We now proceed to upper bound f(E(x̄_k)). From (27) we have (1/k) Σ_{i=1}^k E(q(λ_i)) − q(θ) = (1/k) Σ_{i=1}^k E( f(x_i) + λ_i^T (g(x_i) + b) ) − q(θ) ≥ f(E(x̄_k)) + E( (1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b) ) − q(θ), and therefore

    f(E(x̄_k)) − q(λ*(b)) ≤ −E( (1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b) ),    (29)

where λ*(b) depends on b. Now observe that using θ = 0 in (26) we have E(‖λ_{k+1}‖² | λ_k) ≤ ‖λ_k‖² + α²L_h² + 2α E(λ_k^T (g(x_k) + b) | λ_k). Applying the latter equation recursively we obtain E(‖λ_{k+1}‖²) ≤ ‖λ_1‖² + kα²L_h² + 2α E( Σ_{i=1}^k λ_i^T (g(x_i) + b) ). Rearranging terms and dividing by 2αk yields

    −E( (1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b) ) ≤ αL_h²/2 + ‖λ_1‖²/(2αk).    (30)

Finally, letting θ = λ*(b) in (28) concludes the proof.

REFERENCES

[1] V. Valls and D. J. Leith. On the relationship between queues and multipliers. In Proc. 52nd Annual Allerton Conference on Communication, Control, and Computing, Sept. 2014.
[2] L. Tassiulas and A. Ephremides. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control, 37(12):1936–1948, Dec. 1992.
[3] A. Eryilmaz and R. Srikant. Fair resource allocation in wireless networks using queue-length-based scheduling and congestion control. In Proc. IEEE INFOCOM, vol. 3, March 2005.
[4] M. J. Neely. Optimal energy and delay tradeoffs for multiuser wireless downlinks. IEEE Transactions on Information Theory, 53(9), Sept. 2007.
[5] A. L. Stolyar. Maximizing queueing network utility subject to stability: Greedy primal-dual algorithm. Queueing Systems, 50(4):401–457, 2005.
[6] M. J. Neely. Stochastic Network Optimization with Application to Communication and Queueing Systems. Morgan and Claypool Publishers, 2010.
[7] G. D. Celik, L. B. Le, and E. Modiano. Dynamic server allocation over time-varying channels with switchover delay. IEEE Transactions on Information Theory, 58(9), Sept. 2012.
[8] F. P. Kelly, A. K. Maulloo, and D. K. H. Tan. Rate control for communication networks: Shadow prices, proportional fairness and stability. The Journal of the Operational Research Society, 49(3):237–252, 1998.
[9] A. Nedić and A. Ozdaglar. Subgradient methods in network resource allocation: Rate analysis. In Proc. 42nd Annual Conference on Information Sciences and Systems (CISS), March 2008.
[10] L. Huang and M. J. Neely. Delay reduction via Lagrange multipliers in stochastic network optimization. IEEE Transactions on Automatic Control, 56(4):842–857, April 2011.
[11] V. Valls and D. J. Leith. Max-weight revisited: Sequences of non-convex optimizations solving convex optimizations. IEEE/ACM Transactions on Networking, PP(99), 2015.
[12] A. Nedić and A. Ozdaglar. Approximate primal solutions and rate analysis for dual subgradient methods. SIAM Journal on Optimization, 19(4), 2009.
[13] V. Valls and D. J. Leith. Descent with approximate multipliers is enough: Generalising max-weight. arXiv e-print, 2015.
[14] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, 2nd edition, 1999.
[15] S. Meyn. Control Techniques for Complex Networks. Cambridge University Press, 1st edition, 2007.
[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 1st edition, 2004.


More information

Resource Pooling for Optimal Evacuation of a Large Building

Resource Pooling for Optimal Evacuation of a Large Building Proceedings of the 47th IEEE Conference on Decision and Control Cancun, Mexico, Dec. 9-11, 28 Resource Pooling for Optimal Evacuation of a Large Building Kun Deng, Wei Chen, Prashant G. Mehta, and Sean

More information

Channel Allocation Using Pricing in Satellite Networks

Channel Allocation Using Pricing in Satellite Networks Channel Allocation Using Pricing in Satellite Networks Jun Sun and Eytan Modiano Laboratory for Information and Decision Systems Massachusetts Institute of Technology {junsun, modiano}@mitedu Abstract

More information

ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications

ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications Professor M. Chiang Electrical Engineering Department, Princeton University February

More information

On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging

On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging arxiv:307.879v [math.oc] 7 Jul 03 Angelia Nedić and Soomin Lee July, 03 Dedicated to Paul Tseng Abstract This paper considers

More information

On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels

On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels Saeed Kaviani and Witold A. Krzymień University of Alberta / TRLabs, Edmonton, Alberta, Canada T6G 2V4 E-mail: {saeed,wa}@ece.ualberta.ca

More information

Achieving Shannon Capacity Region as Secrecy Rate Region in a Multiple Access Wiretap Channel

Achieving Shannon Capacity Region as Secrecy Rate Region in a Multiple Access Wiretap Channel Achieving Shannon Capacity Region as Secrecy Rate Region in a Multiple Access Wiretap Channel Shahid Mehraj Shah and Vinod Sharma Department of Electrical Communication Engineering, Indian Institute of

More information

Optimizing Queueing Cost and Energy Efficiency in a Wireless Network

Optimizing Queueing Cost and Energy Efficiency in a Wireless Network Optimizing Queueing Cost and Energy Efficiency in a Wireless Network Antonis Dimakis Department of Informatics Athens University of Economics and Business Patission 76, Athens 1434, Greece Email: dimakis@aueb.gr

More information

A Proof of the Converse for the Capacity of Gaussian MIMO Broadcast Channels

A Proof of the Converse for the Capacity of Gaussian MIMO Broadcast Channels A Proof of the Converse for the Capacity of Gaussian MIMO Broadcast Channels Mehdi Mohseni Department of Electrical Engineering Stanford University Stanford, CA 94305, USA Email: mmohseni@stanford.edu

More information

The Power of Online Learning in Stochastic Network Optimization

The Power of Online Learning in Stochastic Network Optimization he Power of Online Learning in Stochastic Networ Optimization Longbo Huang, Xin Liu, Xiaohong Hao {longbohuang haoxh10@tsinghua.edu.cn, IIIS, singhua University liuxin@microsoft.com, Microsoft Research

More information

Queue Proportional Scheduling in Gaussian Broadcast Channels

Queue Proportional Scheduling in Gaussian Broadcast Channels Queue Proportional Scheduling in Gaussian Broadcast Channels Kibeom Seong Dept. of Electrical Engineering Stanford University Stanford, CA 9435 USA Email: kseong@stanford.edu Ravi Narasimhan Dept. of Electrical

More information

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence:

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence: A Omitted Proofs from Section 3 Proof of Lemma 3 Let m x) = a i On Acceleration with Noise-Corrupted Gradients fxi ), u x i D ψ u, x 0 ) denote the function under the minimum in the lower bound By Proposition

More information

Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks

Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks 1 Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks Changhee Joo, Member, IEEE, Xiaojun Lin, Member, IEEE, and Ness B. Shroff, Fellow, IEEE Abstract

More information

NEW LAGRANGIAN METHODS FOR CONSTRAINED CONVEX PROGRAMS AND THEIR APPLICATIONS. Hao Yu

NEW LAGRANGIAN METHODS FOR CONSTRAINED CONVEX PROGRAMS AND THEIR APPLICATIONS. Hao Yu NEW LAGRANGIAN METHODS FOR CONSTRAINED CONVEX PROGRAMS AND THEIR APPLICATIONS by Hao Yu A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment

More information

Convex Optimization and Modeling

Convex Optimization and Modeling Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

More information

SCHEDULING is one of the core problems in the design

SCHEDULING is one of the core problems in the design IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 66, NO. 6, JUNE 2017 5387 Head-of-Line Access Delay-Based Scheduling Algorithm for Flow-Level Dynamics Yi Chen, Student Member, IEEE, Xuan Wang, and Lin

More information

Optimization for Communications and Networks. Poompat Saengudomlert. Session 4 Duality and Lagrange Multipliers

Optimization for Communications and Networks. Poompat Saengudomlert. Session 4 Duality and Lagrange Multipliers Optimization for Communications and Networks Poompat Saengudomlert Session 4 Duality and Lagrange Multipliers P Saengudomlert (2015) Optimization Session 4 1 / 14 24 Dual Problems Consider a primal convex

More information

Primal-dual Subgradient Method for Convex Problems with Functional Constraints

Primal-dual Subgradient Method for Convex Problems with Functional Constraints Primal-dual Subgradient Method for Convex Problems with Functional Constraints Yurii Nesterov, CORE/INMA (UCL) Workshop on embedded optimization EMBOPT2014 September 9, 2014 (Lucca) Yu. Nesterov Primal-dual

More information