Stochastic Subgradient Methods with Approximate Lagrange Multipliers
Víctor Valls, Douglas J. Leith
Trinity College Dublin

Abstract: We study the use of approximate Lagrange multipliers in the stochastic subgradient method for the dual problem in convex optimisation. The use of approximate Lagrange multipliers in the optimisation (instead of the true multipliers) is motivated by the fact that we can cast some control problems typically thought of as non-convex as convex optimisations. For example, we can solve some stochastic discrete decision-making problems by simply solving a sequence of unconstrained convex optimisations. The use of approximate multipliers also allows us to solve problems where there are constraints on the order in which control actions can be made.

Index Terms: convex optimisation, max-weight scheduling, subgradient methods, approximate multipliers, stochastic control.

I. INTRODUCTION

In this paper we study the use of approximate Lagrange multipliers in subgradient methods for constrained convex optimisation problems. One motivation for using an approximation of the Lagrange multipliers is that in some problems the exact multipliers might not be available, but a perturbed or outdated version of them is available instead. For example, in distributed optimisation the exchange of system state information might be affected by transmission delays or losses. Another motivation for using approximate Lagrange multipliers is their application to network scheduling problems involving queues. As shown in [1], it is possible to identify scaled queue occupancies with approximate Lagrange multipliers, i.e., the Lagrange multiplier λ_k generated by the subgradient method stays close to the α-scaled queue occupancy αQ_k in a network (where α is the step size used in the subgradient method).
A direct consequence of this is that the scaled queue occupancies in networks can be used as surrogates for the Lagrange multipliers in the subgradient method for the dual problem. In other words, by using approximate multipliers it is possible to cast some discrete decision-making problems as convex optimisations. Our convex approach to solving network scheduling optimisation problems is in marked contrast to approaches like max-weight [2] that are built on Lyapunov optimisation theory. Whereas max-weight approaches focus on constructing a policy that stabilises a system, we instead solve the dual problem and then derive the stability of the system from the properties of the dual problem. This way of looking at the problem allows us to construct scheduling policies in a more flexible manner than with max-weight approaches. In particular, we can design scheduling policies that have constraints on the order in which discrete actions can be selected. (This work was supported by Science Foundation Ireland under Grant No. 11/PI/1177.)

A. Related Work

The max-weight scheduling algorithm was proposed by Tassiulas and Ephremides in [2]. The algorithm is able to find an optimal scheduling policy (which consists of choosing the discrete action that minimises the sum of the squared queue backlogs) without prior knowledge of the mean packet arrival rates in the system. The max-weight algorithm has been extended in a series of papers [3], [4], [5] to consider the minimisation of a convex utility function, and to general functions in [6]. The works in [5], [6] also extend max-weight to allow convex constraints. The construction of max-weight policies with constraints (or penalties) is proposed in [7]; there the authors consider the problem of dynamic allocation of servers with switchover delay. The work in [8] studies the allocation of rates in a network while providing fairness, using a dynamical-system approach and Lyapunov stability.
In [9] the authors use a convex approach and study the optimal allocation of rates using subgradient methods. Importantly, [8] and [9] work with continuous-valued variables rather than with the discrete actions used in the max-weight approaches [2], [3], [4], [5], [6]. The works in [10], [11] are the first to establish rigorous connections between the max-weight and dual approaches in convex optimisation via Lagrange multipliers.

II. CONVEX OPTIMISATION

Consider the following convex optimisation problem P in standard form:

  minimise_{x ∈ X} f(x)   (1)
  subject to g_i(x) ≤ 0, i = 1, ..., m,   (2)

where f, g_i : X → R are convex functions and X ⊆ R^n is a convex set. We will always assume that the set X_0 := {x ∈ X : g_i(x) ≤ 0, i = 1, ..., m} is non-empty, and so problem P is feasible. Further, using standard notation we let f* := min_{x ∈ X_0} f(x) and x* ∈ argmin_{x ∈ X_0} f(x). As is usual in convex optimisation, the Lagrange dual function of problem P is given by

  q(λ) = inf_{x ∈ X} L(x, λ) = inf_{x ∈ X} { f(x) + Σ_{i=1}^m λ^(i) g_i(x) },   (3)
where λ^(j) is the j-th component of vector λ ∈ R^m_+. Recall that since q(λ) is the infimum of a set of affine functions it is concave. To keep notation short we write L(x, λ) = f(x) + λ^T g(x), where g = [g_1, ..., g_m]^T, and define X(λ) := argmin_{x ∈ X} L(x, λ) to be the set of minimisers of L(·, λ) for a given vector λ. The following two assumptions are key in our work.

Assumption 1 (Bounded Set). X is convex and bounded.

Assumption 2 (Slater Condition). relint(X_0) is non-empty, i.e., there exists a point x ∈ X such that g(x) < 0.

Assumption 1 is important because it guarantees that q is Lipschitz continuous with constant L_g := max_{x ∈ X} ||g(x)||, and Assumption 2 ensures that strong duality holds. Namely, the solution of the dual problem P_D,

  maximise_{λ ≥ 0} q(λ),   (4)

coincides with the solution of the primal problem P. That is, max_{λ ≥ 0} q(λ) =: q(λ*) = f*, where λ* ∈ argmax_{λ ≥ 0} q(λ). Another consequence of the Slater condition is that the set of dual optima, Λ* := argmax_{λ ≥ 0} q(λ), is a bounded subset of R^m_+ (see Lemma 1 in [12]).

A. Subgradient Method for the Dual Problem

Problem P_D is an unconstrained concave optimisation problem that can be solved using the subgradient method. Recall that when the dual function is differentiable the subgradient is indeed the gradient. In short, the subgradient method consists of the update

  λ_{k+1} = [λ_k + α g(x_k)]^+,   (5)

where [·]^+ = max{0, ·} componentwise, λ_1 ∈ R^m_+, g(x_k) is a subgradient of q at the point λ_k (with x_k ∈ X(λ_k)), and α > 0 is a constant step size. The subgradient method can make use of more sophisticated step-size rules, but as we will show in Section II-B, a constant step size plays a prominent role when working with averages.

Lemma 1 (Subgradient Method). Consider optimisation problem P_D and the subgradient update (5), where x_k ∈ X(λ_k) and α > 0. Suppose Assumptions 1 and 2 hold. Then, there exists a non-negative and non-increasing sequence {β_k} that converges to α(3/2)L_g² for k large enough, and β_k ≥ q(λ*) − q(λ_k) ≥ 0 for all k.
The intuition behind the lemma is that λ_k is monotonically attracted to a vector λ* ∈ Λ* until it enters a ball around Λ*, the size of which depends on the parameter α. Hence, by the Lipschitz continuity of the dual function we can expect q(λ_k) to converge (possibly not monotonically) to a ball around q(λ*). Moreover, since λ_k is monotonically attracted to a ball around Λ*, and Λ* is a bounded subset of R^m_+ by Assumption 2, it follows that λ_k is bounded for all k; see [13, Lemma 3] for an explicit bound on ||λ_k||. Figure 1 illustrates schematically the convergence of q(λ_k) to the level set q(λ*) − q(λ) ≤ α(3/2)L_g².

Fig. 1. Illustrating the convergence of the subgradient method for the dual problem with constant step size.

B. Approximate Primal Solutions Using Averages

We now proceed to show how we can recover approximate primal solutions using the subgradient method for the dual problem. First we present the following lemma, which relates bounded Lagrange multipliers to the feasibility of averages of the primal variables.

Lemma 2 (Feasible Solutions using Averages). Let {x_k} be a sequence of points in X and consider the update λ_{k+1} = [λ_k + α g(x_k)]^+. Then, g(x̄_k) ≤ λ_{k+1}/(kα), where x̄_k := (1/k) Σ_{i=1}^k x_i.

Observe from Lemma 2 that if λ_{k+1} is bounded for all k (which is the case in the subgradient method) then x̄_k is attracted to a feasible point as k → ∞. We now present one of our main results, which is a refinement of Theorem 3 in [13] with sharper bounds and a simpler proof.

Theorem 1 (Approximate Primal Solutions using Averages). Consider the subgradient method for the dual problem with constant step size α > 0, i.e.,

  λ_{k+1} = [λ_k + α g(x_k)]^+,   (6)

where x_k ∈ X(λ_k) := argmin_{x ∈ X} L(x, λ_k). Suppose Assumptions 1 and 2 hold. Then,

  −β_k − λ̃_k/(kα) ≤ f(x̄_k) − f* ≤ α L_g²/2 + ||λ_1||²/(2kα),   (7)

where λ̃_k := λ_k^T λ_{k+1}, x̄_k := (1/k) Σ_{i=1}^k x_i, and {β_k} is the sequence given by Lemma 1.

Theorem 1 says that f(x̄_k) converges to a value close to f* for k sufficiently large and α sufficiently small. Observe that if we let k → ∞ we obtain

  −α(3/2)L_g² ≤ f(x̄_k) − f* ≤ α L_g²/2,   (8)

and if we make α → 0 then f(x̄_k) → f*.
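The dual subgradient update with constant step size and the primal averaging above can be illustrated numerically. The following minimal Python sketch uses a toy problem of our own choosing (not from the paper): minimise f(x) = x² over X = [−2, 2] subject to g(x) = 1 − x ≤ 0, whose solution is x* = 1, f* = 1 with optimal multiplier λ* = 2.

```python
import numpy as np

# Toy problem (ours, not the paper's): minimise f(x) = x^2 over X = [-2, 2]
# subject to g(x) = 1 - x <= 0, so x* = 1, f* = 1 and lambda* = 2.
def dual_subgradient(alpha=0.01, iters=5000):
    lam = 0.0
    xs = []
    for _ in range(iters):
        # x_k in X(lambda_k): minimise x^2 + lam * (1 - x) over [-2, 2]
        x = float(np.clip(lam / 2.0, -2.0, 2.0))
        xs.append(x)
        # dual update: lambda_{k+1} = [lambda_k + alpha * g(x_k)]^+
        lam = max(0.0, lam + alpha * (1.0 - x))
    return lam, float(np.mean(xs))

lam, x_bar = dual_subgradient()
# lam approaches lambda* = 2; the running average x_bar approaches x* = 1,
# even though no single iterate x_k needs to be feasible along the way.
```

As the theory predicts, the individual iterates x_k chatter around the optimum while the running average x̄_k converges to a (feasible) near-optimal point, with an offset controlled by α.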
Further, since by Lemma 2 we have that g(x̄_k) ≤ 0 as k → ∞, we can conclude that x̄_k → x*. How fast the sequence {β_k} converges to α(3/2)L_g² will depend on the step size and on the characteristics of the objective function and constraints. For example, if q(λ) were strongly concave we could use second-order methods and obtain a sequence {β_k} that converges quickly to α(3/2)L_g².

Theorem 1 can be extended to the case where the subgradient g(x_k) is computed using a point near λ_k. Formally, x_k ∈ X(µ_k), where µ_k is an approximate Lagrange multiplier that satisfies ||λ_k − µ_k|| ≤ ε/L_g for all k, with ε ≥ 0. As we will show in Section III it
Fig. 2. Illustrating the convergence of the subgradient method for the dual problem with constant step size when the subgradients are obtained using nearby (dual) points, i.e., approximate Lagrange multipliers. Observe from the figure that the approximate Lagrange multiplier µ_k (dashed line) is always close to the Lagrange multiplier λ_k (solid line).

Fig. 3. Illustrating the convergence of the stochastic subgradient method with approximate Lagrange multipliers. In contrast to the deterministic case (see Figure 1), we now have that λ̄_k converges to a ball around λ*.

will be possible to identify scaled queue occupancies with approximate multipliers µ_k. A consequence of this is that scaled queue occupancies can be used as surrogate quantities in network optimisation. We have the following corollary to Theorem 1.

Corollary 1. Consider the setup of Theorem 1, where update (6) now uses x_k ∈ X(µ_k) with ||λ_k − µ_k|| ≤ ε/L_g for all k = 1, 2, ... and ε ≥ 0. Then, the bound (7) holds and the sequence {β_k} converges to α(3/2)L_g² + ε for k sufficiently large.

The use of a nearby or approximate Lagrange multiplier to compute the subgradient of the dual function is in fact equivalent to running the subgradient method for the dual problem with ε-subgradients (see [14, pp. 625]). Observe that |q(λ_k) − q(µ_k)| ≤ ||λ_k − µ_k|| L_g ≤ ε. Figure 2 illustrates the convergence of q(λ_k) to a ball around q(λ*) when the subgradient is computed using a nearby Lagrange multiplier. Compare Figures 1 and 2 to see that the ball to which q(λ_k) converges now also depends on the parameter ε (which controls the distance between λ_k and µ_k).

C. Subgradient Method with Noisy Dual Updates

The subgradient method for the dual problem can be extended to consider noisy dual updates. We have the following lemma.

Lemma 3 (Stochastic Subgradient Method with Approximate Lagrange Multipliers). Consider optimisation problem P_D and the subgradient update

  λ_{k+1} = [λ_k + α(g(x_k) + B_k)]^+,   (9)

where x_k ∈ X(µ_k) with ||λ_k − µ_k|| ≤ ε/L_g, ε ≥ 0, and the B_k are i.i.d. random variables with mean b ∈ R^m and finite variance σ_B².
Suppose Assumption 1 holds and that there exists a point x ∈ X such that g(x) + b < 0. Also, suppose the noisy subgradients of the dual function are bounded, i.e., max_{x ∈ X} ||g(x) + B_k|| ≤ L_h for all k. Then,

  −||λ_1 − λ*||²/(2kα) − α L_h²/2 − ε ≤ q(E(λ̄_k)) − q(λ*) ≤ 0,   (10)

where λ̄_k := (1/k) Σ_{i=1}^k λ_i.

In the stochastic subgradient method we have that the expected value of the running average λ̄_k is attracted to a ball around Λ*. The latter is schematically illustrated in Figure 3. An important point to note is that we now require the Slater condition to hold with the mean of the random perturbation, i.e., there must exist a point x ∈ X such that g(x) + b < 0, and not just g(x) < 0.

We can obtain approximate primal solutions using the stochastic subgradient update.

Theorem 2 (Approximate Primal Solutions using Averages, Approximate Lagrange Multipliers and Noisy Dual Updates). Consider the stochastic subgradient method of Lemma 3. Suppose Assumption 1 holds and that there exists a point x ∈ X such that g(x) + b < 0. Also, suppose the noisy subgradients of the dual function are bounded, i.e., max_{x ∈ X} ||g(x) + B_k|| ≤ L_h for all k. Then,

  −β_k − λ̃_k/(kα) ≤ f(E(x̄_k)) − f* ≤ α L_h²/2 + ||λ_1||²/(2kα),   (11)

where x̄_k := (1/k) Σ_{i=1}^k x_i and λ̃_k := E(λ̄_k)^T E(λ_{k+1}).

Theorem 2 says that f(E(x̄_k)) converges to a ball around f* for k sufficiently large, and as k → ∞ we have that

  −α(3/2)L_h² − ε ≤ f(E(x̄_k)) − f* ≤ α L_h²/2.   (12)

Further, if α → 0 then f(E(x̄_k)) → f* as k → ∞.

III. APPROXIMATE LAGRANGE MULTIPLIERS & QUEUE CONTINUITY

Queue updates in networks are usually of the form Q_{k+1} = [Q_k + δ_k]^+, where Q_k ∈ R^m_+ and δ_k ∈ Z^m. The difference δ_k is discrete and represents the change in the discrete quantities (packets, cars, people, etc.) that get in/out of the queue in a time slot. Next, observe that since [·]^+ = max{0, ·} is a homogeneous function, if we let µ_k := αQ_k we can write µ_{k+1} = [µ_k + αδ_k]^+, where α is the step size used in the subgradient method. That is, an α-scaled queue is like a subgradient update with the
difference that δ_k is constrained to be discrete. Recall that µ_k will be an approximate Lagrange multiplier if it stays uniformly close to λ_k for all k. This will actually be the case when the sequence of discrete quantities {δ_k} and the subgradients {g(x_k)} stay close in an appropriate sense, and the constraints in the optimisation P are linear [13]. We have the following lemma, which is a restatement of [15, Proposition 3..].

Lemma 4 (Continuity of the Skorokhod Map). Consider the updates λ_{k+1} = [λ_k + αx_k]^+ and µ_{k+1} = [µ_k + αy_k]^+, where λ_1 = µ_1 ≥ 0, α > 0, and {x_k} and {y_k} are two sequences of points from R such that |Σ_{i=1}^k (x_i − y_i)| ≤ ε for all k. Then,

  |λ_{k+1} − µ_{k+1}| ≤ 2αε.   (13)

Lemma 4 requires Σ_{i=1}^k (x_i − y_i) to be uniformly bounded so that the difference λ_k − µ_k is also bounded. Hence, it is interesting to know under which conditions it is possible to construct a sequence {y_k} ⊆ Y ⊆ R^n that stays close to an arbitrary sequence {x_k} ⊆ X ⊆ R^n. Next we show that this is always possible when Y is a finite collection of points from R^n and X ⊆ conv(Y), the convex hull of Y.

A. Construction of Discrete Sequences

Consider a set of m discrete points Y from R^n such that X ⊆ conv(Y). Collect the points in Y as the columns of a matrix W. We can write any point x ∈ X as a convex combination of the points in Y, i.e., x = Wu, where u ∈ U := {u ∈ R^m : 1^T u = 1, u ≥ 0}. Note there might exist multiple vectors in U such that Wu = x. Similarly, define E := {e ∈ {0,1}^m : 1^T e = 1} to be the set containing the m-dimensional standard basis vectors. There is exactly one vector e ∈ E for each y ∈ Y such that We = y. We can write Σ_{i=1}^k (x_i − y_i) = W Σ_{i=1}^k (u_i − e_i), and since ||W Σ_{i=1}^k (u_i − e_i)|| ≤ ||W|| ||Σ_{i=1}^k (u_i − e_i)|| by the Cauchy-Schwarz inequality, it follows that showing that

  Σ_{i=1}^k (u_i − e_i)   (14)

is bounded for all k is sufficient to establish the boundedness of Σ_{i=1}^k (x_i − y_i). We show this in the following corollary, which is a straightforward extension of [13, Theorem 4].

Corollary 2 (Block Sequence). Let T ∈ N, U := {u ∈ R^m : 1^T u = 1, u ≥ 0}, E := {e ∈ {0,1}^m : 1^T e = 1}, and define S_k := Σ_{i=1}^k u_i, where u_i ∈ U for all i = 1, ..., Tm.
Let δ ∈ D := {w ∈ R^m : 1^T w = 0, ||w|| ≤ 1} and select

  e_k ∈ argmin_{e ∈ E} || δ + S_k − S̃_{k−1} − e ||,   (15)

where S̃_{k−1} := Σ_{i=1}^{k−1} e_i. Then, we have that (δ + S_{Tm} − S̃_{Tm}) ∈ D, where S̃_{Tm} = Σ_{i=1}^{Tm} e_i. (One such vector u can be efficiently found by solving the convex optimisation problem min_{u ∈ U} ||Wu − x|| with standard convex methods; see Chapters 6 and 9 in [16].)

Fig. 4. Illustrating the construction of a subsequence in an online mode, in a delayed or shifted manner, with m = 3 and T = 2. Subsequence {e_7, ..., e_12} is constructed with update (15) from subsequence {u_1, ..., u_6}, and therefore (δ + Σ_{i=1}^6 u_i − Σ_{j=7}^{12} e_j) ∈ D. Subsequence {e_1, ..., e_6} can be selected randomly from E and does not affect the boundedness of (14).

Corollary 2 says that for any subsequence {u_1, ..., u_{Tm}} we can use update (15) to construct a subsequence {e_1, ..., e_{Tm}} such that

  (δ + S_{Tm} − S̃_{Tm}) = δ + Σ_{i=1}^{Tm} (u_i − e_i) =: δ̃ ∈ D.   (16)

Hence, for any further subsequence {u_{Tm+1}, ..., u_{2Tm}} we can use update (15) to construct a subsequence {e_{Tm+1}, ..., e_{2Tm}} such that

  δ + Σ_{i=1}^{2Tm} (u_i − e_i) = ( δ̃ + Σ_{i=Tm+1}^{2Tm} (u_i − e_i) ) ∈ D.

Next, observe that if we let S'_l = Σ_{j=(l−1)Tm+1}^{lTm} u_j and S̃'_l = Σ_{j=(l−1)Tm+1}^{lTm} e_j with l = 1, 2, ..., we will always have that {Σ_{i=1}^l (S'_i − S̃'_i)} is a sequence of points from D := {w ∈ R^m : 1^T w = 0, ||w|| ≤ 1}, and therefore bounded. The boundedness of (14) for all k = 1, 2, ... follows immediately from the fact that subsequences have finite length.

Importantly, note that update (15) requires knowing Tm elements from U in advance in order to select Tm elements from E. This implies that it will only be possible to construct subsequences in an online mode in a delayed or shifted manner; see the example in Figure 4. Nonetheless, since subsequences have finite length, constructing subsequences in a shifted manner does not affect the boundedness of (14). An important observation is that the elements in a subsequence {e_1, ..., e_{Tm}} can be reordered and we still have (δ + Σ_{i=1}^{Tm} (u_i − e_i)) ∈ D. This will be particularly useful in problems where there are constraints on the order in which actions y_k (and so e_k) can be selected.
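The greedy block-sequence selection rule above can be sketched in a few lines of Python. This is a simplified version of our own (with δ = 0 and a randomly generated simplex sequence, not an example from the paper): at each step it picks the standard basis vector whose running sum best tracks the running sum of the u_i, and the tracking gap Σ (u_i − e_i) stays bounded.

```python
import numpy as np

# Sketch of the greedy block-sequence selection (our simplification: delta = 0).
# Given simplex points u_1, ..., u_K, pick basis vectors e_k so that the
# running sum of the e_k tracks the running sum of the u_k; the gap
# sum_i (u_i - e_i) then stays bounded, as in the Block Sequence corollary.
def track_discrete(us):
    m = us.shape[1]
    E = np.eye(m)              # the m standard basis vectors
    S_u = np.zeros(m)          # running sum of the u_i
    S_e = np.zeros(m)          # running sum of the chosen e_i
    chosen = []
    for u in us:
        S_u += u
        # greedy rule: basis vector minimising the tracking error
        j = int(np.argmin(np.linalg.norm(S_u - S_e - E, axis=1)))
        S_e += E[j]
        chosen.append(j)
    return chosen, S_u - S_e

rng = np.random.default_rng(0)
us = rng.dirichlet(np.ones(3), size=90)   # e.g. 90 simplex points with m = 3
chosen, gap = track_discrete(us)          # gap stays small in every coordinate
```

Because the selected index is always the coordinate with the largest accumulated deficit, each coordinate of the gap remains within a small constant of zero regardless of the sequence length, which is the boundedness property the corollary requires.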
In the next section we present a network example where the selection of discrete actions is constrained and the problem cannot be solved with classic max-weight approaches.

IV. WIRELESS NETWORK EXAMPLE

A. Problem Setup

Consider the network shown in Figure 5: an Access Point (AP) that transmits to two wireless nodes. Time is slotted, and in each time slot packets arrive at the queues of the AP (Q_1 and Q_2) with probabilities p_1 and p_2, respectively. In each time slot the AP takes an action from the action set Y := {y_0, y_1, y_2} = {[0,0]^T, [1,0]^T, [0,1]^T}, where each action corresponds, respectively, to not transmitting, to transmitting
Fig. 5. Illustrating the network in the example of Section IV. The Access Point (AP) sends packets from Q_1 to node 1, and from Q_2 to node 2.

one packet from Q_1 to node 1, and to transmitting one packet from Q_2 to node 2. The transmission protocol of the AP has constraints on how actions can be selected. In particular, it is not possible to select action y_2 after y_1 without first selecting y_0. Likewise, it is not possible to select y_1 after y_2 without first selecting y_0. However, y_1 or y_2 can be selected repeatedly. An example of an admissible sequence is {y_1, y_1, y_1, y_0, y_2, y_0, y_1, y_1, y_0, y_2, y_2, ...}. Constraints of this type appear in different areas and are known in the literature as reconfiguration delays [7]. An example is asking for the Channel State Information (CSI) in wireless communications in order to adjust the transmission parameters.

Our goal is to design a scheduling policy for the AP (i.e., select actions from the set Y) in order to minimise a convex cost function of the average throughput x̄, and to ensure that the system is stable, i.e., the queues do not overflow and so all traffic can be served.

B. Problem Formulation

The convex or fluid formulation of the problem is

  minimise_{x ∈ X} f(x)   (17)
  subject to b − x ≤ 0,   (18)

where f : R² → R, b ∈ R²_+, and X is a bounded convex subset of conv(Y) ⊆ R² that depends on the protocol constraints, i.e., on how actions can be selected. If there were no constraints on the order in which actions could be selected we would have X := conv(Y); however, this is not the case here. Characterising X is as simple as noting that if we have a subsequence of actions in which y_1 and y_2 each appear consecutively, then y_0 should appear at least twice in order to have a subsequence that is compliant with the transmission protocol, for example {y_0, y_1, y_1, y_1, y_0, y_2, y_2, y_2, y_2}. Conversely, any subsequence in which y_0 appears at least twice can be reordered to obtain a subsequence that is compliant with the transmission protocol.
Since we can always choose a subsequence of discrete actions and then reorder its elements (see the discussion after Corollary 2), we just need to enforce that y_0 appears twice in a subsequence. This can be obtained if the convex combination of a point x ∈ X uses point y_0 with weight at least 2/(Tm), where Tm is the length of a subsequence. This will be the case when

  X := η conv(Y),   (19)

where η := 1 − 2/(Tm). Note from (19) that as T gets larger, η → 1 and therefore X → conv(Y), i.e., the network capacity region increases. Figure 6 illustrates the network capacity region of the example. (The CSI in wireless communications is in practice requested periodically, and not only at the beginning of a transmission, but we assume this for simplicity of exposition. The extension is nevertheless straightforward.)

Fig. 6. Network capacity region of the example in Section IV: X = η conv(Y) with η = 1 − 2/(Tm).

1) Updates: We use scaled queue occupancies as approximate multipliers. In each time slot we have the updates

  x_k ∈ argmin_{x ∈ X} { f(x) − αQ_k^T x },   (20)
  Q_{k+1} = [Q_k − y_k + B_k]^+,   (21)

where the B_k are i.i.d. random variables that take values in {0,1}² with mean [p_1, p_2]^T. The action is y_k = We_k, where e_k is obtained with (15) in the online mode explained in Section III-A. The elements in a subsequence are reordered to comply with the transmission protocol constraints.

2) Simulation: We run a simulation with objective function f(x) = ||x||² and parameters p_1 = 0.25, p_2 = 0.5, α = 0.05, λ_1 = αQ_1 = 0, and T = 3 (so the number of elements in a subsequence is Tm = 9 since m = 3). Figure 7 shows the convergence of λ̄_k to a ball around λ*. Interestingly, observe that whereas λ̄_k converges to λ*, the Lagrange multipliers λ_k (grey points) do not converge to a ball around λ*. The stability of the system follows from the fact that if the average of the Lagrange multipliers is bounded, then the average of the α-scaled queue occupancies is also bounded. Figure 8 shows that f(x̄_k) converges to a ball around f* as k increases.
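The queue-as-multiplier updates above can be sketched in simulation. The sketch below uses illustrative parameters of our own choosing and a linear transmission cost, and it omits the protocol ordering constraint (which the paper handles via the block construction of Section III-A): in each slot the AP picks the discrete action minimising the Lagrangian with the α-scaled queue as approximate multiplier, and the long-run service rates track the arrival rates while the queues stay bounded.

```python
import numpy as np

# Hypothetical parameters (ours, not the paper's); the protocol ordering
# constraint and the block reordering of Section III-A are omitted here.
rng = np.random.default_rng(1)
p = np.array([0.2, 0.3])                   # packet arrival probabilities
alpha, N = 0.05, 20000                     # step size, number of slots
Y = np.array([[0, 0], [1, 0], [0, 1]])     # idle, serve Q1, serve Q2
Q = np.zeros(2)                            # queue occupancies
served = np.zeros(2)

for _ in range(N):
    # discrete action minimising f(y) - (alpha * Q)^T y with a linear
    # transmission cost f(y) = 0.1 * (y1 + y2); the alpha-scaled queue
    # plays the role of the approximate Lagrange multiplier
    j = int(np.argmin(0.1 * Y.sum(axis=1) - Y @ (alpha * Q)))
    y = np.minimum(Y[j], Q)                # cannot serve an empty queue
    served += y
    arrivals = (rng.random(2) < p).astype(float)
    Q = Q - y + arrivals                   # alpha * Q is the scaled-queue update

x_bar = served / N                         # long-run service (throughput) rates
```

With these parameters the queues stay small (the policy serves whichever queue makes the Lagrangian term negative), and the average served rates x_bar settle near the arrival probabilities p, which is the stability behaviour the convex analysis predicts.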
CONCLUSIONS

We have shown that the stochastic subgradient method for the dual problem in convex optimisation allows the use of approximate Lagrange multipliers. The use of approximate Lagrange multipliers is motivated by the fact that it is possible to identify α-scaled queue occupancies as approximations of the Lagrange multipliers in the subgradient method. As a result, it is possible to cast some discrete decision-making problems as convex optimisations. One of the advantages of our convex approach is that we can make optimal control decisions that have ordering constraints.
Fig. 7. Illustrating the convergence of the average λ̄_{k+1} (blue line) to a ball around λ* = [0.5, 1]^T (grey area). Grey points indicate the values of the Lagrange multipliers λ_k, and λ^(1) and λ^(2) denote the first and second components of vector λ.

Fig. 8. Illustrating the convergence of f(x̄_k) to f* = 0.3125 (straight line) in the network example of Section IV.

APPENDIX

A. Proof of Lemma 1

Consider the update λ_{k+1} = [λ_k + αg(x_k)]^+ and suppose we select x_k ∈ X(µ_k) := {x ∈ X : L(x, µ_k) ≤ q(µ_k) + ε}. (With ε = 0 this reduces to x_k ∈ X(λ_k), so the argument covers Lemma 1 and Corollary 1 simultaneously.) Observe that

  ||λ_{k+1} − λ*||² = ||[λ_k + αg(x_k)]^+ − λ*||²
    ≤ ||λ_k + αg(x_k) − λ*||²
    = ||λ_k − λ*||² + α²||g(x_k)||² + 2α(λ_k − λ*)^T g(x_k)
    ≤ ||λ_k − λ*||² + α²L_g² + 2α(q(λ_k) + ε − q(λ*)),   (22)

where the last inequality follows from the fact that

  (λ_k − λ*)^T g(x_k) = L(x_k, λ_k) − L(x_k, λ*) ≤ L(x_k, λ_k) − q(λ*) ≤ q(λ_k) + ε − q(λ*),

and ||g(x_k)|| ≤ L_g for all k. Rearranging terms yields

  ||λ_{k+1} − λ*||² − ||λ_k − λ*||² ≤ α²L_g² + 2α(q(λ_k) + ε − q(λ*)).   (23)

Now let Λ_δ := {λ ∈ R^m_+ : q(λ*) − q(λ) ≤ α L_g²/2 + ε} and consider two cases.

Case (i) (λ_k ∉ Λ_δ). Observe that the RHS of (23) is strictly negative and therefore λ_k is monotonically attracted to Λ_δ. Hence, for any λ_1 ∈ R^m_+ and k sufficiently large it will eventually happen that λ_k ∈ Λ_δ, and therefore q(λ*) − q(λ_k) ≤ α L_g²/2 + ε.

Case (ii) (λ_k ∈ Λ_δ). We cannot ensure that the RHS of (23) is strictly negative; however, from the Lipschitz continuity of the dual function we have

  |q(λ_{k+1}) − q(λ_k)| ≤ ||λ_{k+1} − λ_k|| L_g = ||[λ_k + αg(x_k)]^+ − λ_k|| L_g ≤ α||g(x_k)|| L_g ≤ α L_g².

Now see that

  q(λ*) − q(λ_{k+1}) = q(λ*) − q(λ_k) + q(λ_k) − q(λ_{k+1}) ≤ (α L_g²/2 + ε) + α L_g² = α(3/2)L_g² + ε.

That is, λ_{k+1} ∈ Λ'_δ := {λ ∈ R^m_+ : q(λ*) − q(λ) ≤ α(3/2)L_g² + ε}. Finally, note that Λ_δ ⊆ Λ'_δ and that, since from case (i) any λ_{k+1} ∈ Λ'_δ \ Λ_δ will be attracted to Λ_δ, we can conclude that for every λ_1 ∈ R^m_+ there exists a non-negative and non-increasing sequence {β_k} that converges to α(3/2)L_g² + ε for k sufficiently large, with β_k ≥ q(λ*) − q(λ_k) ≥ 0 for all k.

B. Proof of Lemma 2

Observe that λ_{i+1} = [λ_i + αg(x_i)]^+ ≥ λ_i + αg(x_i), and rearranging terms, λ_{i+1} − λ_i ≥ αg(x_i). Summing over i = 1, ..., k yields α Σ_{i=1}^k g(x_i) ≤ λ_{k+1} − λ_1 ≤ λ_{k+1}.
Dividing by kα and using the fact that g(x̄_k) ≤ (1/k) Σ_{i=1}^k g(x_i) (by the convexity of the g_i) yields the stated result.

C. Proof of Theorem 1

From Lemma 1 we have that β_k ≥ q(λ*) − q(λ_k) ≥ 0, where {β_k} is a non-increasing sequence that converges to α(3/2)L_g² + ε for k sufficiently large. Next, since q(λ_k) ≤ L(x̄_k, λ_k) = f(x̄_k) + λ_k^T g(x̄_k), by rearranging terms and using the fact that q(λ*) = f* (by strong duality) we obtain

  −β_k − λ_k^T g(x̄_k) ≤ f(x̄_k) − f*.

From Lemma 2 we have that g(x̄_k) ≤ λ_{k+1}/(kα) for all k, and since λ_k ≥ 0 (with ||λ_{k+1}|| ≤ ||λ_k|| + α L_g ensuring the iterates stay bounded) we obtain

  −β_k − λ̃_k/(kα) ≤ f(x̄_k) − f*,   (24)

where λ̃_k = λ_k^T λ_{k+1}. We now show the upper bound in (7). Since

  f* = q(λ*) ≥ (1/k) Σ_{i=1}^k q(λ_i) = (1/k) Σ_{i=1}^k ( f(x_i) + λ_i^T g(x_i) ) ≥ f(x̄_k) + (1/k) Σ_{i=1}^k λ_i^T g(x_i),

by rearranging terms we have that f(x̄_k) − f* ≤ −(1/k) Σ_{i=1}^k λ_i^T g(x_i). Now observe that for any sequence {x_k} in X we can write

  ||λ_{k+1}||² = ||[λ_k + αg(x_k)]^+||² ≤ ||λ_k + αg(x_k)||² = ||λ_k||² + α²||g(x_k)||² + 2α λ_k^T g(x_k).

Applying the latter equation recursively for i = 1, ..., k we obtain

  ||λ_{k+1}||² ≤ ||λ_1||² + α² Σ_{i=1}^k ||g(x_i)||² + 2α Σ_{i=1}^k λ_i^T g(x_i),

and rearranging terms and dividing by 2kα yields

  −(1/k) Σ_{i=1}^k λ_i^T g(x_i) ≤ (α/(2k)) Σ_{i=1}^k ||g(x_i)||² + ||λ_1||²/(2kα).

Finally, since ||g(x_i)|| ≤ L_g by Assumption 1 we obtain

  f(x̄_k) − f* ≤ α L_g²/2 + ||λ_1||²/(2kα),

which concludes the proof.

D. Proof of Lemma 3

In each iteration we aim to decrease the expected distance of λ_k from λ*. In short, observe that for any vector θ ∈ R^m_+ we have

  E(||λ_{k+1} − θ||² | λ_k) = E(||[λ_k + α ĝ(x_k)]^+ − θ||² | λ_k) ≤ E(||λ_k + α ĝ(x_k) − θ||² | λ_k),

where ĝ(x_k) := g(x_k) + B_k.
Expanding the RHS of the last inequality and rearranging terms yields

  E(||λ_{k+1} − θ||² | λ_k) ≤ ||λ_k − θ||² + α² E(||ĝ(x_k)||² | λ_k) + 2α E((λ_k − θ)^T ĝ(x_k) | λ_k)   (25)
    ≤ ||λ_k − θ||² + α² L_h² + 2α E((λ_k − θ)^T ĝ(x_k) | λ_k),   (26)

where the last inequality follows from the boundedness of the noisy subgradients. Next, since

  E((λ_k − θ)^T (g(x_k) + B_k) | λ_k) = (λ_k − θ)^T (g(x_k) + b) = L(x_k, λ_k) − L(x_k, θ) ≤ q(λ_k) + ε − q(θ)

(where L and q now refer to the problem with the perturbed constraint g(x) + b ≤ 0), we have that

  E(||λ_{k+1} − θ||² | λ_k) ≤ ||λ_k − θ||² + α² L_h² + 2α (q(λ_k) + ε − q(θ)),

and taking expectations yields

  E(||λ_{k+1} − θ||²) ≤ E(||λ_k − θ||²) + α² L_h² + 2α E(q(λ_k) + ε − q(θ)).

Applying the latter expression recursively we obtain

  E(||λ_{k+1} − θ||²) ≤ ||λ_1 − θ||² + k α² L_h² + 2α Σ_{i=1}^k E(q(λ_i) + ε − q(θ)).

Dropping the non-negative term E(||λ_{k+1} − θ||²), rearranging and dividing by 2kα yields

  −||λ_1 − θ||²/(2kα) − α L_h²/2 − ε ≤ (1/k) Σ_{i=1}^k E(q(λ_i)) − q(θ).   (27)

Next, see that by the linearity of expectation (1/k) Σ_{i=1}^k E(q(λ_i)) = E((1/k) Σ_{i=1}^k q(λ_i)), where λ_i, i = 1, ..., k, are the Lagrange multipliers obtained from update (9). Next, let λ̄_k := (1/k) Σ_{i=1}^k λ_i and see that by the concavity of q (Jensen's inequality, applied twice) we can write E((1/k) Σ_{i=1}^k q(λ_i)) ≤ E(q(λ̄_k)) ≤ q(E(λ̄_k)). Hence,

  −||λ_1 − θ||²/(2kα) − α L_h²/2 − ε ≤ q(E(λ̄_k)) − q(θ),

and if we let θ = λ* we obtain that q(E(λ̄_k)) converges to a ball around q(λ*).

E. Proof of Theorem 2

We follow the same steps as in the proof of Theorem 1. From the stochastic subgradient method (see Lemma 3) we have that for any θ ∈ R^m_+ there exists a non-negative and non-increasing sequence {β_k} such that β_k ≥ q(θ) − q(E(λ̄_k)) ≥ 0 for all k. Also, since we can always write q(E(λ̄_k)) ≤ L(E(x̄_k), E(λ̄_k)) = f(E(x̄_k)) + E(λ̄_k)^T g(E(x̄_k)), and g(E(x̄_k)) ≤ E(λ_{k+1})/(kα) from Lemma 2, rearranging terms yields

  −β_k − λ̃_k/(kα) ≤ f(E(x̄_k)) − q(θ),   (28)

where λ̃_k = E(λ̄_k)^T E(λ_{k+1}). We now proceed to upper bound f(E(x̄_k)). From (27) we have

  (1/k) Σ_{i=1}^k E(q(λ_i)) − q(θ) = (1/k) Σ_{i=1}^k E(f(x_i) + λ_i^T (g(x_i) + b)) − q(θ) ≥ f(E(x̄_k)) + E((1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b)) − q(θ)

(by the convexity of f), and therefore

  f(E(x̄_k)) − q(λ*(b)) ≤ −E((1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b)),   (29)

where λ*(b) depends on b. Now observe that using θ = 0 in (26) we have

  E(||λ_{k+1}||² | λ_k) ≤ ||λ_k||² + α² L_h² + 2α E(λ_k^T (g(x_k) + b) | λ_k).

Applying the latter equation recursively we obtain

  E(||λ_{k+1}||²) ≤ ||λ_1||² + k α² L_h² + 2α E(Σ_{i=1}^k λ_i^T (g(x_i) + b)).
Rearranging terms and dividing by 2kα yields

  −E((1/k) Σ_{i=1}^k λ_i^T (g(x_i) + b)) ≤ α L_h²/2 + ||λ_1||²/(2kα).   (30)

Finally, letting θ = λ*(b) in (28) and combining (29) with (30) concludes the proof.

REFERENCES

[1] V. Valls and D. J. Leith. On the relationship between queues and multipliers. In Communication, Control, and Computing (Allerton), 2014 52nd Annual Allerton Conference on, Sept 2014.
[2] L. Tassiulas and A. Ephremides. Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control, 37(12):1936-1948, Dec 1992.
[3] A. Eryilmaz and R. Srikant. Fair resource allocation in wireless networks using queue-length-based scheduling and congestion control. In INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings IEEE, volume 3, March 2005.
[4] M. J. Neely. Optimal energy and delay tradeoffs for multiuser wireless downlinks. IEEE Transactions on Information Theory, 53(9), Sept 2007.
[5] A. L. Stolyar. Maximizing queueing network utility subject to stability: Greedy primal-dual algorithm. Queueing Systems, 50(4):401-457, 2005.
[6] M. J. Neely. Stochastic Network Optimization with Application to Communication and Queueing Systems. Morgan and Claypool Publishers, 2010.
[7] G. D. Celik, L. B. Le, and E. Modiano. Dynamic server allocation over time-varying channels with switchover delay. IEEE Transactions on Information Theory, 58(9), Sept 2012.
[8] F. P. Kelly, A. K. Maulloo, and D. K. H. Tan. Rate control for communication networks: Shadow prices, proportional fairness and stability. The Journal of the Operational Research Society, 49(3):237-252, 1998.
[9] A. Nedić and A. Ozdaglar. Subgradient methods in network resource allocation: Rate analysis. In Information Sciences and Systems, 2008. CISS 2008. 42nd Annual Conference on, March 2008.
[10] L. Huang and M. J. Neely. Delay reduction via Lagrange multipliers in stochastic network optimization.
IEEE Transactions on Automatic Control, 56(4):842-857, April 2011.
[11] V. Valls and D. J. Leith. Max-weight revisited: Sequences of nonconvex optimizations solving convex optimizations. IEEE/ACM Transactions on Networking, PP(99), 2015.
[12] A. Nedić and A. Ozdaglar. Approximate primal solutions and rate analysis for dual subgradient methods. SIAM Journal on Optimization, 19(4):1757-1780, 2009.
[13] V. Valls and D. J. Leith. Descent with approximate multipliers is enough: Generalising max-weight. arXiv e-print, 2015.
[14] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, 2nd edition, 1999.
[15] S. Meyn. Control Techniques for Complex Networks. Cambridge University Press, 1st edition, 2007.
[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 1st edition, 2004.
Uniqueness of Generalized Equilibrium for Box Constrained Problems and Applications Alp Simsek Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology Asuman E.
More informationAsynchronous Non-Convex Optimization For Separable Problem
Asynchronous Non-Convex Optimization For Separable Problem Sandeep Kumar and Ketan Rajawat Dept. of Electrical Engineering, IIT Kanpur Uttar Pradesh, India Distributed Optimization A general multi-agent
More informationConstrained Consensus and Optimization in Multi-Agent Networks
Constrained Consensus Optimization in Multi-Agent Networks The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher
More informationPrimal Solutions and Rate Analysis for Subgradient Methods
Primal Solutions and Rate Analysis for Subgradient Methods Asu Ozdaglar Joint work with Angelia Nedić, UIUC Conference on Information Sciences and Systems (CISS) March, 2008 Department of Electrical Engineering
More informationDistributed Scheduling Algorithms for Optimizing Information Freshness in Wireless Networks
Distributed Scheduling Algorithms for Optimizing Information Freshness in Wireless Networks Rajat Talak, Sertac Karaman, and Eytan Modiano arxiv:803.06469v [cs.it] 7 Mar 208 Abstract Age of Information
More informationChannel Probing in Communication Systems: Myopic Policies Are Not Always Optimal
Channel Probing in Communication Systems: Myopic Policies Are Not Always Optimal Matthew Johnston, Eytan Modiano Laboratory for Information and Decision Systems Massachusetts Institute of Technology Cambridge,
More informationA Low Complexity Algorithm with O( T ) Regret and Finite Constraint Violations for Online Convex Optimization with Long Term Constraints
A Low Complexity Algorithm with O( T ) Regret and Finite Constraint Violations for Online Convex Optimization with Long Term Constraints Hao Yu and Michael J. Neely Department of Electrical Engineering
More informationLecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem
Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R
More informationInformation Theory vs. Queueing Theory for Resource Allocation in Multiple Access Channels
1 Information Theory vs. Queueing Theory for Resource Allocation in Multiple Access Channels Invited Paper Ali ParandehGheibi, Muriel Médard, Asuman Ozdaglar, and Atilla Eryilmaz arxiv:0810.167v1 cs.it
More informationApproximate Primal Solutions and Rate Analysis for Dual Subgradient Methods
Approximate Primal Solutions and Rate Analysis for Dual Subgradient Methods Angelia Nedich Department of Industrial and Enterprise Systems Engineering University of Illinois at Urbana-Champaign 117 Transportation
More informationRate Control in Communication Networks
From Models to Algorithms Department of Computer Science & Engineering The Chinese University of Hong Kong February 29, 2008 Outline Preliminaries 1 Preliminaries Convex Optimization TCP Congestion Control
More informationDistributed Resource Allocation Using One-Way Communication with Applications to Power Networks
Distributed Resource Allocation Using One-Way Communication with Applications to Power Networks Sindri Magnússon, Chinwendu Enyioha, Kathryn Heal, Na Li, Carlo Fischione, and Vahid Tarokh Abstract Typical
More informationUtility-Maximizing Scheduling for Stochastic Processing Networks
Utility-Maximizing Scheduling for Stochastic Processing Networks Libin Jiang and Jean Walrand EECS Department, University of California at Berkeley {ljiang,wlr}@eecs.berkeley.edu Abstract Stochastic Processing
More informationDistributed Approaches for Proportional and Max-Min Fairness in Random Access Ad Hoc Networks
Distributed Approaches for Proportional and Max-Min Fairness in Random Access Ad Hoc Networks Xin Wang, Koushik Kar Department of Electrical, Computer and Systems Engineering, Rensselaer Polytechnic Institute,
More informationDistributed Multiuser Optimization: Algorithms and Error Analysis
Distributed Multiuser Optimization: Algorithms and Error Analysis Jayash Koshal, Angelia Nedić and Uday V. Shanbhag Abstract We consider a class of multiuser optimization problems in which user interactions
More informationInformation in Aloha Networks
Achieving Proportional Fairness using Local Information in Aloha Networks Koushik Kar, Saswati Sarkar, Leandros Tassiulas Abstract We address the problem of attaining proportionally fair rates using Aloha
More informationIEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 43, NO. 3, MARCH
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 43, NO. 3, MARCH 1998 315 Asymptotic Buffer Overflow Probabilities in Multiclass Multiplexers: An Optimal Control Approach Dimitris Bertsimas, Ioannis Ch. Paschalidis,
More informationDelay Analysis for Max Weight Opportunistic Scheduling in Wireless Systems
Delay Analysis for Max Weight Opportunistic Scheduling in Wireless Systems Michael J. eely arxiv:0806.345v3 math.oc 3 Dec 008 Abstract We consider the delay properties of max-weight opportunistic scheduling
More informationRouting. Topics: 6.976/ESD.937 1
Routing Topics: Definition Architecture for routing data plane algorithm Current routing algorithm control plane algorithm Optimal routing algorithm known algorithms and implementation issues new solution
More informationLagrange Relaxation and Duality
Lagrange Relaxation and Duality As we have already known, constrained optimization problems are harder to solve than unconstrained problems. By relaxation we can solve a more difficult problem by a simpler
More informationNode-based Distributed Optimal Control of Wireless Networks
Node-based Distributed Optimal Control of Wireless Networks CISS March 2006 Edmund M. Yeh Department of Electrical Engineering Yale University Joint work with Yufang Xi Main Results Unified framework for
More informationEnhanced Fritz John Optimality Conditions and Sensitivity Analysis
Enhanced Fritz John Optimality Conditions and Sensitivity Analysis Dimitri P. Bertsekas Laboratory for Information and Decision Systems Massachusetts Institute of Technology March 2016 1 / 27 Constrained
More informationSTABILITY OF MULTICLASS QUEUEING NETWORKS UNDER LONGEST-QUEUE AND LONGEST-DOMINATING-QUEUE SCHEDULING
Applied Probability Trust (7 May 2015) STABILITY OF MULTICLASS QUEUEING NETWORKS UNDER LONGEST-QUEUE AND LONGEST-DOMINATING-QUEUE SCHEDULING RAMTIN PEDARSANI and JEAN WALRAND, University of California,
More informationA State Action Frequency Approach to Throughput Maximization over Uncertain Wireless Channels
A State Action Frequency Approach to Throughput Maximization over Uncertain Wireless Channels Krishna Jagannathan, Shie Mannor, Ishai Menache, Eytan Modiano Abstract We consider scheduling over a wireless
More informationDistributed Stochastic Optimization in Networks with Low Informational Exchange
Distributed Stochastic Optimization in Networs with Low Informational Exchange Wenjie Li and Mohamad Assaad, Senior Member, IEEE arxiv:80790v [csit] 30 Jul 08 Abstract We consider a distributed stochastic
More informationIntroduction to Machine Learning Lecture 7. Mehryar Mohri Courant Institute and Google Research
Introduction to Machine Learning Lecture 7 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Convex Optimization Differentiation Definition: let f : X R N R be a differentiable function,
More informationUnderstanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks
Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks Changhee Joo, Xiaojun Lin, and Ness B. Shroff Abstract In this paper, we characterize the performance
More informationA Starvation-free Algorithm For Achieving 100% Throughput in an Input- Queued Switch
A Starvation-free Algorithm For Achieving 00% Throughput in an Input- Queued Switch Abstract Adisak ekkittikul ick ckeown Department of Electrical Engineering Stanford University Stanford CA 9405-400 Tel
More informationPrimal/Dual Decomposition Methods
Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients
More informationA Geometric Framework for Nonconvex Optimization Duality using Augmented Lagrangian Functions
A Geometric Framework for Nonconvex Optimization Duality using Augmented Lagrangian Functions Angelia Nedić and Asuman Ozdaglar April 15, 2006 Abstract We provide a unifying geometric framework for the
More informationScheduling Algorithms for Optimizing Age of Information in Wireless Networks with Throughput Constraints
SUBITTED TO IEEE/AC TRANSACTIONS ON NETWORING Scheduling Algorithms for Optimizing Age of Information in Wireless Networks with Throughput Constraints Igor adota, Abhishek Sinha and Eytan odiano Abstract
More informationLow-Complexity and Distributed Energy Minimization in Multi-hop Wireless Networks
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. X, NO. XX, XXXXXXX X Low-Complexity and Distributed Energy Minimization in Multi-hop Wireless Networks Longbi Lin, Xiaojun Lin, Member, IEEE, and Ness B. Shroff,
More informationDuality (Continued) min f ( x), X R R. Recall, the general primal problem is. The Lagrangian is a function. defined by
Duality (Continued) Recall, the general primal problem is min f ( x), xx g( x) 0 n m where X R, f : X R, g : XR ( X). he Lagrangian is a function L: XR R m defined by L( xλ, ) f ( x) λ g( x) Duality (Continued)
More informationEnergy Optimal Control for Time Varying Wireless Networks. Michael J. Neely University of Southern California
Energy Optimal Control for Time Varying Wireless Networks Michael J. Neely University of Southern California http://www-rcf.usc.edu/~mjneely Part 1: A single wireless downlink (L links) L 2 1 S={Totally
More informationNetwork Optimization: Notes and Exercises
SPRING 2016 1 Network Optimization: Notes and Exercises Michael J. Neely University of Southern California http://www-bcf.usc.edu/ mjneely Abstract These notes provide a tutorial treatment of topics of
More informationUnderstanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks
Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks Changhee Joo Departments of ECE and CSE The Ohio State University Email: cjoo@ece.osu.edu Xiaojun
More informationA Distributed Newton Method for Network Optimization
A Distributed Newton Method for Networ Optimization Ali Jadbabaie, Asuman Ozdaglar, and Michael Zargham Abstract Most existing wor uses dual decomposition and subgradient methods to solve networ optimization
More informationA Distributed Newton Method for Network Utility Maximization, I: Algorithm
A Distributed Newton Method for Networ Utility Maximization, I: Algorithm Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract Most existing wors use dual decomposition and first-order
More informationDECENTRALIZED algorithms are used to solve optimization
5158 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 19, OCTOBER 1, 016 DQM: Decentralized Quadratically Approximated Alternating Direction Method of Multipliers Aryan Mohtari, Wei Shi, Qing Ling,
More informationAdaptive Distributed Algorithms for Optimal Random Access Channels
Forty-Eighth Annual Allerton Conference Allerton House, UIUC, Illinois, USA September 29 - October, 2 Adaptive Distributed Algorithms for Optimal Random Access Channels Yichuan Hu and Alejandro Ribeiro
More informationOn the convergence to saddle points of concave-convex functions, the gradient method and emergence of oscillations
53rd IEEE Conference on Decision and Control December 15-17, 214. Los Angeles, California, USA On the convergence to saddle points of concave-convex functions, the gradient method and emergence of oscillations
More informationAn Improved Bound for Minimizing the Total Weighted Completion Time of Coflows in Datacenters
IEEE/ACM TRANSACTIONS ON NETWORKING An Improved Bound for Minimizing the Total Weighted Completion Time of Coflows in Datacenters Mehrnoosh Shafiee, Student Member, IEEE, and Javad Ghaderi, Member, IEEE
More informationAlternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications
Alternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization
More informationBattery-Aware Selective Transmitters in Energy-Harvesting Sensor Networks: Optimal Solution and Stochastic Dual Approximation
Battery-Aware Selective Transmitters in Energy-Harvesting Sensor Networs: Optimal Solution and Stochastic Dual Approximation Jesús Fernández-Bes, Carlos III University of Madrid, Leganes, 28911, Madrid,
More informationContinuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks
Continuous-Model Communication Complexity with Application in Distributed Resource Allocation in Wireless Ad hoc Networks Husheng Li 1 and Huaiyu Dai 2 1 Department of Electrical Engineering and Computer
More informationLecture Note 5: Semidefinite Programming for Stability Analysis
ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State
More informationThe Multi-Path Utility Maximization Problem
The Multi-Path Utility Maximization Problem Xiaojun Lin and Ness B. Shroff School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47906 {linx,shroff}@ecn.purdue.edu Abstract
More informationNetwork Control: A Rate-Distortion Perspective
Network Control: A Rate-Distortion Perspective Jubin Jose and Sriram Vishwanath Dept. of Electrical and Computer Engineering The University of Texas at Austin {jubin, sriram}@austin.utexas.edu arxiv:8.44v2
More informationDistributed Random Access Algorithm: Scheduling and Congesion Control
Distributed Random Access Algorithm: Scheduling and Congesion Control The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As
More informationA Distributed Newton Method for Network Utility Maximization, II: Convergence
A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility
More informationWireless Peer-to-Peer Scheduling in Mobile Networks
PROC. 46TH ANNUAL CONF. ON INFORMATION SCIENCES AND SYSTEMS (CISS), MARCH 212 (INVITED PAPER) 1 Wireless Peer-to-Peer Scheduling in Mobile Networs Abstract This paper considers peer-to-peer scheduling
More informationTHE EFFECT OF DETERMINISTIC NOISE 1 IN SUBGRADIENT METHODS
Submitted: 24 September 2007 Revised: 5 June 2008 THE EFFECT OF DETERMINISTIC NOISE 1 IN SUBGRADIENT METHODS by Angelia Nedić 2 and Dimitri P. Bertseas 3 Abstract In this paper, we study the influence
More informationCHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.
1 ECONOMICS 594: LECTURE NOTES CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS W. Erwin Diewert January 31, 2008. 1. Introduction Many economic problems have the following structure: (i) a linear function
More informationOptimal Power Control in Decentralized Gaussian Multiple Access Channels
1 Optimal Power Control in Decentralized Gaussian Multiple Access Channels Kamal Singh Department of Electrical Engineering Indian Institute of Technology Bombay. arxiv:1711.08272v1 [eess.sp] 21 Nov 2017
More informationOptimization methods
Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,
More informationLexicographic Max-Min Fairness in a Wireless Ad Hoc Network with Random Access
Lexicographic Max-Min Fairness in a Wireless Ad Hoc Network with Random Access Xin Wang, Koushik Kar, and Jong-Shi Pang Abstract We consider the lexicographic max-min fair rate control problem at the link
More informationWE consider an undirected, connected network of n
On Nonconvex Decentralized Gradient Descent Jinshan Zeng and Wotao Yin Abstract Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been
More informationarxiv: v2 [math.oc] 5 May 2018
The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems Viva Patel a a Department of Statistics, University of Chicago, Illinois, USA arxiv:1709.04718v2 [math.oc]
More informationOptimization and Optimal Control in Banach Spaces
Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,
More informationA POMDP Framework for Cognitive MAC Based on Primary Feedback Exploitation
A POMDP Framework for Cognitive MAC Based on Primary Feedback Exploitation Karim G. Seddik and Amr A. El-Sherif 2 Electronics and Communications Engineering Department, American University in Cairo, New
More informationCS-E4830 Kernel Methods in Machine Learning
CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This
More informationImperfect Randomized Algorithms for the Optimal Control of Wireless Networks
Imperfect Randomized Algorithms for the Optimal Control of Wireless Networks Atilla ryilmaz lectrical and Computer ngineering Ohio State University Columbus, OH 430 mail: eryilmaz@ece.osu.edu Asuman Ozdaglar,
More informationSome Background Math Notes on Limsups, Sets, and Convexity
EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 1 Some Background Math Notes on Limsups, Sets, and Convexity I. LIMITS Let f(t) be a real valued function of time. Suppose f(t) converges
More informationLearning Algorithms for Minimizing Queue Length Regret
Learning Algorithms for Minimizing Queue Length Regret Thomas Stahlbuhk Massachusetts Institute of Technology Cambridge, MA Brooke Shrader MIT Lincoln Laboratory Lexington, MA Eytan Modiano Massachusetts
More informationOPTIMALITY OF RANDOMIZED TRUNK RESERVATION FOR A PROBLEM WITH MULTIPLE CONSTRAINTS
OPTIMALITY OF RANDOMIZED TRUNK RESERVATION FOR A PROBLEM WITH MULTIPLE CONSTRAINTS Xiaofei Fan-Orzechowski Department of Applied Mathematics and Statistics State University of New York at Stony Brook Stony
More informationStability of Primal-Dual Gradient Dynamics and Applications to Network Optimization
Stability of Primal-Dual Gradient Dynamics and Applications to Network Optimization Diego Feijer Fernando Paganini Department of Electrical Engineering, Universidad ORT Cuareim 1451, 11.100 Montevideo,
More informationEnergy Harvesting Multiple Access Channel with Peak Temperature Constraints
Energy Harvesting Multiple Access Channel with Peak Temperature Constraints Abdulrahman Baknina, Omur Ozel 2, and Sennur Ulukus Department of Electrical and Computer Engineering, University of Maryland,
More informationNonlinear Programming 3rd Edition. Theoretical Solutions Manual Chapter 6
Nonlinear Programming 3rd Edition Theoretical Solutions Manual Chapter 6 Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts 1 NOTE This manual contains
More informationResource Pooling for Optimal Evacuation of a Large Building
Proceedings of the 47th IEEE Conference on Decision and Control Cancun, Mexico, Dec. 9-11, 28 Resource Pooling for Optimal Evacuation of a Large Building Kun Deng, Wei Chen, Prashant G. Mehta, and Sean
More informationChannel Allocation Using Pricing in Satellite Networks
Channel Allocation Using Pricing in Satellite Networks Jun Sun and Eytan Modiano Laboratory for Information and Decision Systems Massachusetts Institute of Technology {junsun, modiano}@mitedu Abstract
More informationELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications
ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications Professor M. Chiang Electrical Engineering Department, Princeton University February
More informationOn Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging
On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging arxiv:307.879v [math.oc] 7 Jul 03 Angelia Nedić and Soomin Lee July, 03 Dedicated to Paul Tseng Abstract This paper considers
More informationOn the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels
On the Optimality of Multiuser Zero-Forcing Precoding in MIMO Broadcast Channels Saeed Kaviani and Witold A. Krzymień University of Alberta / TRLabs, Edmonton, Alberta, Canada T6G 2V4 E-mail: {saeed,wa}@ece.ualberta.ca
More informationAchieving Shannon Capacity Region as Secrecy Rate Region in a Multiple Access Wiretap Channel
Achieving Shannon Capacity Region as Secrecy Rate Region in a Multiple Access Wiretap Channel Shahid Mehraj Shah and Vinod Sharma Department of Electrical Communication Engineering, Indian Institute of
More informationOptimizing Queueing Cost and Energy Efficiency in a Wireless Network
Optimizing Queueing Cost and Energy Efficiency in a Wireless Network Antonis Dimakis Department of Informatics Athens University of Economics and Business Patission 76, Athens 1434, Greece Email: dimakis@aueb.gr
More informationA Proof of the Converse for the Capacity of Gaussian MIMO Broadcast Channels
A Proof of the Converse for the Capacity of Gaussian MIMO Broadcast Channels Mehdi Mohseni Department of Electrical Engineering Stanford University Stanford, CA 94305, USA Email: mmohseni@stanford.edu
More informationThe Power of Online Learning in Stochastic Network Optimization
he Power of Online Learning in Stochastic Networ Optimization Longbo Huang, Xin Liu, Xiaohong Hao {longbohuang haoxh10@tsinghua.edu.cn, IIIS, singhua University liuxin@microsoft.com, Microsoft Research
More informationQueue Proportional Scheduling in Gaussian Broadcast Channels
Queue Proportional Scheduling in Gaussian Broadcast Channels Kibeom Seong Dept. of Electrical Engineering Stanford University Stanford, CA 9435 USA Email: kseong@stanford.edu Ravi Narasimhan Dept. of Electrical
More informationOn Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence:
A Omitted Proofs from Section 3 Proof of Lemma 3 Let m x) = a i On Acceleration with Noise-Corrupted Gradients fxi ), u x i D ψ u, x 0 ) denote the function under the minimum in the lower bound By Proposition
More informationUnderstanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks
1 Understanding the Capacity Region of the Greedy Maximal Scheduling Algorithm in Multi-hop Wireless Networks Changhee Joo, Member, IEEE, Xiaojun Lin, Member, IEEE, and Ness B. Shroff, Fellow, IEEE Abstract
More informationNEW LAGRANGIAN METHODS FOR CONSTRAINED CONVEX PROGRAMS AND THEIR APPLICATIONS. Hao Yu
NEW LAGRANGIAN METHODS FOR CONSTRAINED CONVEX PROGRAMS AND THEIR APPLICATIONS by Hao Yu A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment
More informationConvex Optimization and Modeling
Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual
More informationSCHEDULING is one of the core problems in the design
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 66, NO. 6, JUNE 2017 5387 Head-of-Line Access Delay-Based Scheduling Algorithm for Flow-Level Dynamics Yi Chen, Student Member, IEEE, Xuan Wang, and Lin
More informationOptimization for Communications and Networks. Poompat Saengudomlert. Session 4 Duality and Lagrange Multipliers
Optimization for Communications and Networks Poompat Saengudomlert Session 4 Duality and Lagrange Multipliers P Saengudomlert (2015) Optimization Session 4 1 / 14 24 Dual Problems Consider a primal convex
More informationPrimal-dual Subgradient Method for Convex Problems with Functional Constraints
Primal-dual Subgradient Method for Convex Problems with Functional Constraints Yurii Nesterov, CORE/INMA (UCL) Workshop on embedded optimization EMBOPT2014 September 9, 2014 (Lucca) Yu. Nesterov Primal-dual
More information