From Models to Algorithms
Department of Computer Science & Engineering
The Chinese University of Hong Kong
February 29, 2008
Outline
1. Preliminaries: Convex Optimization; TCP Congestion Control
2. Motivation; Basic Model; Optimization Decomposition
3. Primal Problem; Dual Algorithm; Relation to TCP; Stability and Convergence
Convex Optimization | TCP Congestion Control

Convex Set
Set C is a convex set if the line segment between any two points in C lies in C, i.e., if for any x_1, x_2 ∈ C and any θ ∈ [0, 1], we have θx_1 + (1−θ)x_2 ∈ C.

Convex Hull
The convex hull of C is the set of all convex combinations of points in C:
conv C = { Σ_{i=1}^k θ_i x_i : x_i ∈ C, θ_i ≥ 0, i = 1, 2, …, k, Σ_{i=1}^k θ_i = 1 }.
Convex Function: Jensen's Inequality
f: ℝⁿ → ℝ is a convex function if dom f is a convex set and, for all x, y ∈ dom f and t ∈ [0, 1], we have
f(tx + (1−t)y) ≤ t f(x) + (1−t) f(y).
f is strictly convex if the inequality above holds strictly for all x ≠ y and 0 < t < 1.
Standard Form
A convex optimization problem with variable x:
minimize   f_0(x)
subject to f_i(x) ≤ 0, i = 1, 2, …, m
           h_i(x) = 0, i = 1, 2, …, p,
where f_0, f_1, …, f_m are convex functions and the h_i are affine.
Objective Function: minimize a convex objective (or, equivalently, maximize a concave objective).
Inequality Constraints: upper-bound inequality constraints on convex functions.
Equality Constraints: equality constraints must be affine.
Lagrangian Function
Absorb the constraints into the objective as penalties. Lagrangian function:
L(x, λ, ν) = f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x),
where the Lagrange multipliers (dual variables) satisfy λ ≥ 0, with ν free.
Dual Problem
Perform unconstrained minimization of L(x, λ, ν) over x, obtaining the Lagrange dual function:
g(λ, ν) = inf_x L(x, λ, ν).
Example
Primal problem:
minimize   f_0(x) = x_1 log x_1 + x_2 log x_2
subject to x_1 + 2x_2 ≤ 2, x_1, x_2 > 0.
Lagrangian: L(x, λ) = x_1 log x_1 + x_2 log x_2 + λ(x_1 + 2x_2 − 2), with λ ≥ 0.
Optimal x for fixed λ: x_1 = e^{−λ−1}, x_2 = e^{−2λ−1}.
Dual problem: maximize D(λ) = −e^{−λ−1} − e^{−2λ−1} − 2λ over λ ≥ 0.
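This derivation is easy to check numerically: for each fixed λ, the closed-form minimizer of the Lagrangian should attain exactly the value D(λ). A minimal sketch (the sampled λ values are arbitrary):

```python
import math

def D(lam):
    # Closed-form dual function: D(lam) = -e^{-lam-1} - e^{-2 lam - 1} - 2 lam
    return -math.exp(-lam - 1) - math.exp(-2 * lam - 1) - 2 * lam

def lagrangian(x1, x2, lam):
    # L(x, lam) = x1 log x1 + x2 log x2 + lam * (x1 + 2 x2 - 2)
    return x1 * math.log(x1) + x2 * math.log(x2) + lam * (x1 + 2 * x2 - 2)

for lam in [0.0, 0.5, 1.0]:
    # Minimizers obtained by setting the gradient of L over x to zero
    x1, x2 = math.exp(-lam - 1), math.exp(-2 * lam - 1)
    assert abs(lagrangian(x1, x2, lam) - D(lam)) < 1e-12
```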
Economic Interpretation
Primal objective f_0(x): cost of operation.
Primal constraints f_i(x): can be violated.
Dual variables (λ, ν): the price for violating the corresponding constraint (dollars per unit violation); at the same price, unused violation can be sold for revenue.
Lagrangian L(x, λ, ν): total cost.
Lagrange dual function g(λ, ν): optimal cost as a function of the violation prices (Lagrange multipliers).
Question
Does the optimal value of the primal problem equal the optimal value of the dual problem?
KKT Optimality Conditions
Karush-Kuhn-Tucker (KKT) conditions for a standard convex optimization problem:
Primal constraints: f_i(x) ≤ 0 and h_i(x) = 0.
Dual constraints: λ ≥ 0.
Complementary slackness: λ_i f_i(x) = 0.
The gradient of the Lagrangian with respect to x vanishes:
∇f_0(x) + Σ_{i=1}^m λ_i ∇f_i(x) + Σ_{i=1}^p ν_i ∇h_i(x) = 0.
If strong duality holds and x, λ, ν are optimal, then they must satisfy the KKT conditions.
TCP Congestion Control

Problem Description
Congestion is the conflict between demand and capacity; congestion control is therefore a problem of resource management. Congestion leads to buffer overflow, large delays, and bandwidth underutilization.
Current Solutions
Rate adaptation at the source (e.g., TCP).
Controllers at the buffer (e.g., AQM).
Question
Is current TCP merely a heuristic algorithm?
TCP Versions
Tahoe (Jacobson 1988): slow start, congestion avoidance, fast retransmit.
Reno (Jacobson 1990): further adds fast recovery.
Vegas (Brakmo & Peterson 1994): delay (RTT)-based congestion avoidance.
Active Queue Management
Random Early Detection (Floyd & Jacobson 1993).
Proportional Integral (Hollot, Misra & Towsley 2001).
Random Exponential Marking (Athuraliya & Low 2000).
TCP Reno/RED Dynamics
[figure: the congestion window at the host over time, and the router's marking/dropping behavior as a function of the average queue length]
Motivation | Basic Model | Optimization Decomposition

Motivating Example
Given a network with:
two links with capacities c_1 and c_2;
three end-to-end flows x_1, x_2, and x_3 (x_1 traverses both links; x_2 uses only link 1; x_3 uses only link 2).
Question: how should bandwidth be allocated among the end-to-end flows?
Potential Solutions
x_1 = 0, x_2 = c_1, x_3 = c_2.
x_1 = c_1/2, x_2 = c_1/2, x_3 = c_2 − c_1/2.
Which solution is the BEST? It depends on our objective!
Definition
In economics, utility is a measure of the relative happiness or satisfaction (gratification) gained by consuming different bundles of goods and services. — Wikipedia
Notation
C_j (j ∈ J): the finite capacity of link j.
r (r ∈ R): a route, i.e., a non-empty set of links.
x_s (s ∈ S): the flow rate allocated to user s.
A = (A_js, j ∈ J, s ∈ S): a 0-1 matrix that describes the routing; A_js = 1 if user s uses link j.
U = (U_s(·), s ∈ S): the utility of user s at rate x_s.
Assumption
The utility function U_s(x_s) is an increasing, strictly concave, and continuously differentiable function of x_s. (Traffic that leads to such a utility function is called elastic traffic by S. Shenker.)
System Model
SYSTEM(U, A, C):
maximize   Σ_{s∈S} U_s(x_s)
subject to Ax ≤ C
over       x ≥ 0.
Example 1
Let U_s(x_s) = log x_s; what is the optimal solution?
Example 2
Let U_s(x_s) = x_s; what is the optimal solution?
Routing matrix and capacities:
A = ( 1 1 0
      1 0 1 ),
with both links of capacity 2 Mbps (x_1 uses both links; x_2 uses link 1; x_3 uses link 2).
Solving SYSTEM(U, A, C)
Lagrangian form:
L(x; µ) = Σ_{s∈S} U_s(x_s) + µ^T(C − Ax)
        = Σ_{s∈S} (U_s(x_s) − x_s Σ_{j∈s} µ_j) + Σ_{j∈J} µ_j C_j,
where µ = (µ_j, j ∈ J) are the Lagrange multipliers. Then
∂L/∂x_s = U_s′(x_s) − Σ_{j∈s} µ_j.
Optimize the dual function over the feasible region.
Solving SYSTEM(U, A, C)
Using the KKT conditions, optimality can be expressed compactly: x solves SYSTEM(U, A, C) if and only if there exist multipliers µ such that
Ax ≤ C, x ≥ 0;
µ ≥ 0;
µ^T(C − Ax) = 0 and (Σ_{j∈s} µ_j − U_s′(x_s)) x_s = 0 for every s.
The first row is primal feasibility, the second is dual feasibility, and the third comprises complementary slackness.
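These conditions can be checked concretely on the two-link example with U_s(x_s) = log x_s and both capacities equal to 2. A sketch, where the candidate optimum and multipliers are closed-form guesses verified against each KKT condition:

```python
# Two-link example: source 0 uses links 1 and 2, source 1 uses link 1,
# source 2 uses link 2.  U_s(x_s) = log x_s, C = (2, 2).
x = [2/3, 4/3, 4/3]          # candidate optimal rates
mu = [3/4, 3/4]              # candidate multipliers (link prices)
A = [[1, 1, 0],
     [1, 0, 1]]

# Primal feasibility: A x <= C and x >= 0
y = [sum(A[j][s] * x[s] for s in range(3)) for j in range(2)]
assert all(yj <= 2 + 1e-12 for yj in y) and all(xs > 0 for xs in x)
# Dual feasibility and complementary slackness: mu >= 0, mu_j (C_j - y_j) = 0
assert all(m >= 0 for m in mu)
assert all(abs(m * (2 - yj)) < 1e-12 for m, yj in zip(mu, y))
# Stationarity for x_s > 0: U_s'(x_s) = 1/x_s equals the path price
path_price = [mu[0] + mu[1], mu[0], mu[1]]
assert all(abs(1 / xs - p) < 1e-12 for xs, p in zip(x, path_price))
```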
Discussion
THE GOOD: mathematically tractable, thanks to convexity.
THE BAD: the utilities are unlikely to be known by the network.
THE UGLY: the flow rates are derived in a centralized manner.
Question
How can we obtain a distributed algorithm to allocate rates?
Road Map
Solve THE BAD first, then THE UGLY.
Definition
The shadow price is the change in the optimal objective value of an optimization problem obtained by relaxing a constraint by one unit. In a business application, a shadow price is the maximum price that management is willing to pay for an extra unit of a given limited resource. The shadow price equals the value of the Lagrange multiplier at the optimal solution. — Wikipedia
Decomposition
From the perspective of economic theory, the original problem is replaced by two simpler problems, one for the users and one for the network.
User's Angle
USER_s(U_s; λ_s):
maximize U_s(w_s/λ_s) − w_s
over     w_s ≥ 0,
where w_s is the amount user s pays per unit time and λ_s is the charge per unit flow for user s. Hence the flow rate is exactly x_s = w_s/λ_s.
The Important Idea: user s maximizes its profit by choosing the optimal payment w_s.
Network's Angle
NETWORK(A, C; w):
maximize   Σ_{s∈S} w_s log x_s
subject to Ax ≤ C
over       x ≥ 0.
The Important Idea: the network knows the payments w_s from all users and chooses the rate allocation to maximize its revenue.
Question
Is the optimization decomposition valid?
Theorem 1
There always exist vectors w, λ, and x, satisfying w_s = λ_s x_s, such that w_s solves USER_s(U_s; λ_s) and x solves NETWORK(A, C; w); furthermore, the vector x is then the unique solution to SYSTEM(U, A, C).
Proof
The combined KKT conditions of the USER and NETWORK problems are identical to those of the SYSTEM problem.
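Theorem 1's fixed point can be illustrated on the two-link example with U_s(x_s) = log x_s. The equilibrium values below are closed-form assumptions, verified against the optimality conditions of USER and NETWORK:

```python
# Fixed point of the decomposition on the two-link example, U_s = log x_s.
x = [2/3, 4/3, 4/3]                       # SYSTEM-optimal rates (assumed)
lam = [3/2, 3/4, 3/4]                     # per-unit path charges lambda_s
w = [l * xs for l, xs in zip(lam, x)]     # payments w_s = lambda_s * x_s

# USER_s optimality: d/dw [log(w / lambda) - w] = 1/w - 1 = 0  =>  w_s = 1
assert all(abs(ws - 1.0) < 1e-12 for ws in w)
# NETWORK optimality: x_s = w_s / lambda_s recovers the SYSTEM rates
assert all(abs(ws / l - xs) < 1e-12 for ws, l, xs in zip(w, lam, x))
```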
Physical Meaning
The users and the network each optimize their individual benefit, and social welfare is achieved automatically.
Recap
SYSTEM is decomposed into many local USER problems and one global NETWORK problem in which the local utility functions are not needed.
Fairness?
What is the relationship between the optimal rate allocation and fairness?
Definition
Max-Min Fairness: a vector of rates x is max-min fair if it is feasible and if, for each s, x_s cannot be increased without decreasing x_{s′} for some s′ with x_{s′} ≤ x_s.
Proportional Fairness: a feasible x* is proportionally fair (per unit charge) if, for any other feasible x,
Σ_{s∈S} w_s (x_s − x*_s)/x*_s ≤ 0.
Theorem 2
x solves NETWORK(A, C; w) if and only if it is proportionally fair (per unit charge).
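Theorem 2 can be spot-checked numerically by sampling feasible rate vectors for the two-link example and testing the proportional-fairness inequality against the NETWORK optimum. The optimum x* used below is an assumed closed form for w = (1, 1, 1) and capacities (2, 2):

```python
import random

# Spot-check: for the NETWORK optimum x* = (2/3, 4/3, 4/3) with w = (1,1,1),
# every feasible x should satisfy sum_s w_s (x_s - x*_s)/x*_s <= 0.
xstar = [2/3, 4/3, 4/3]
w = [1.0, 1.0, 1.0]
random.seed(0)
for _ in range(1000):
    # Sample a feasible rate vector: x1 + x2 <= 2 and x1 + x3 <= 2
    x1 = random.uniform(0.0, 2.0)
    x2 = random.uniform(0.0, 2.0 - x1)
    x3 = random.uniform(0.0, 2.0 - x1)
    agg = sum(ws * (xs - xo) / xo for ws, xs, xo in zip(w, [x1, x2, x3], xstar))
    assert agg <= 1e-9
```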
Physical Meaning
The network cannot achieve better social revenue by changing the rate vector x. There is a tradeoff between maximum capacity and max-min throughput; in general, their relationship can be depicted as a spectrum:
[figure: fairness-efficiency spectrum, from max-min throughput (most fair) through proportional fairness to maximum capacity (least fair)]
Road Map
THE BAD is solved; the remaining problem is THE UGLY.
Primal Problem | Dual Algorithm | Relation to TCP | Stability and Convergence

Global Picture
Design distributed algorithms using gradient-based methods.
The congestion indications (link prices) can be generated according to different performance goals (e.g., loss rate, delay, robustness).
The congestion indications (link prices) can be fed back to the source in several ways.
The rate control system is globally stable when propagation delays are neglected.
NETWORK(A, C; w) Problem with Variable x
maximize   Σ_{s∈S} w_s log x_s
subject to Ax ≤ C, x ≥ 0.
Lagrangian function:
L(x; µ) = Σ_{s∈S} w_s log x_s + µ^T(C − Ax).
Unique optimum: x_s = w_s / Σ_{j∈s} µ_j.
Primal Algorithm
dx_s(t)/dt = κ(w_s − x_s(t) Σ_{j∈s} µ_j(t)),
µ_j(t) = p_j(Σ_{s: j∈s} x_s(t)),
where κ is a small constant.
Interpretation
Link j charges p_j(y) per unit flow when the total flow on link j is y. Each source adjusts its rate so that its total payment per unit time approaches the target value w_s.
Another Interpretation
Link j generates a feedback signal p_j(y) when the total flow on link j is y. Each source increases its rate linearly (proportional to w_s) and decreases it multiplicatively (proportional to the total feedback).
Implementation
x_s(t+1) = x_s(t) + κ(w_s − x_s(t) Σ_{j∈s} µ_j(t)),
µ_j(t+1) = p_j(Σ_{s: j∈s} x_s(t+1)).
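The discrete-time implementation can be simulated directly on the earlier two-link example (x_1 crosses both links, x_2 and x_3 one link each). A minimal sketch; the penalty-style price function p_j(y) = B·max(0, y − c_j) and the constants κ, B, w are illustrative assumptions, not prescribed by the framework:

```python
# Primal algorithm on the two-link example: source 0 uses links 0 and 1,
# source 1 uses link 0 only, source 2 uses link 1 only.
routes = [[0, 1], [0], [1]]          # links traversed by each source
c = [2.0, 2.0]                       # link capacities
w = [1.0, 1.0, 1.0]                  # willingness to pay per unit time
kappa, B = 0.005, 50.0               # gain and (assumed) price slope

x = [0.5, 0.5, 0.5]
for _ in range(40000):
    # Link prices mu_j = p_j(y_j), with p_j(y) = B * max(0, y - c_j)
    y = [sum(x[s] for s in range(3) if j in routes[s]) for j in range(2)]
    mu = [B * max(0.0, y[j] - c[j]) for j in range(2)]
    # Source update: x_s += kappa * (w_s - x_s * path price)
    x = [x[s] + kappa * (w[s] - x[s] * sum(mu[j] for j in routes[s]))
         for s in range(3)]
# x approaches the proportionally fair allocation (2/3, 4/3, 4/3)
```

With the penalty price the equilibrium only approximates the constrained optimum; the approximation tightens as B grows.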
Dual Algorithm
dµ_j(t)/dt = κ(Σ_{s: j∈s} x_s(t) − q_j(µ_j(t))),
x_s(t) = w_s / Σ_{j∈s} µ_j(t),
where q_j(µ_j(t)) is the amount of flow on link j that would generate the price µ_j(t).
Implementation
Link algorithm: µ_j(t+1) = µ_j(t) + κ(Σ_{s: j∈s} x_s(t) − q_j(µ_j(t))).
Source algorithm: x_s(t+1) = U_s′⁻¹(p_s(t)), where p_s(t) = Σ_{j∈s} µ_j(t) is the path price.
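A matching sketch of the dual algorithm on the two-link example, with the simplifying assumption q_j(µ_j) = c_j (each link drives its aggregate rate toward capacity); the constants are again illustrative:

```python
# Dual algorithm on the two-link example: prices adapt, rates follow.
routes = [[0, 1], [0], [1]]
c = [2.0, 2.0]
w = [1.0, 1.0, 1.0]
kappa = 0.1

mu = [1.0, 1.0]
x = [0.0, 0.0, 0.0]
for _ in range(2000):
    # Source algorithm: x_s = w_s / path price is the maximizer of
    # w_s log x_s - x_s * sum_{j in s} mu_j
    x = [w[s] / sum(mu[j] for j in routes[s]) for s in range(3)]
    y = [sum(x[s] for s in range(3) if j in routes[s]) for j in range(2)]
    # Link algorithm: price rises when demand exceeds capacity
    # (kept strictly positive to avoid division by zero)
    mu = [max(1e-6, mu[j] + kappa * (y[j] - c[j])) for j in range(2)]
# mu approaches (3/4, 3/4) and x approaches (2/3, 4/3, 4/3)
```

Unlike the penalty-based primal sketch, this iteration converges to the exact proportionally fair allocation, since the capacity constraints enter through the prices rather than through a penalty.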
Primal Algorithm vs. Dual Algorithm
Primal algorithm: the rates vary gradually, and the shadow prices are given as functions of the rates.
Dual algorithm: the shadow prices vary gradually, and the rates are given as functions of the shadow prices.
In the primal algorithm, the sender adjusts its rate according to the fed-back congestion signals. In the dual algorithm, the network computes the shadow prices directly and sends them back to the sender.
Mapping to TCP/AQM
A TCP scheme can be mapped to a specific utility function.
The major TCP schemes approximately carry out primal or dual algorithms.
Congestion measures (prices): queueing delay, queue length, packet loss.
TCP Utility Functions
TCP Reno:  U_s^Reno(x_s) = (√2 / D_s) tan⁻¹(x_s D_s / √2)
TCP Vegas: U_s^Vegas(x_s) = α_s d_s log x_s
Queue Management
FIFO: ṗ_l = (1/c_l)(y_l(t) − c_l)⁺.
RED: ḃ_l = (y_l(t) − c_l)⁺; ṙ_l = −α_l c_l (r_l(t) − b_l(t)); p_l = m_l(r_l).
REM: p_l(t+1) = [p_l(t) + γ(y_l(t) − c_l)]⁺.
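The REM price update is directly implementable. A tiny sketch (the capacity, load values, and step size γ are made-up illustrations):

```python
def rem_update(p, y, cap, gamma=0.015):
    """One REM price step: p <- [p + gamma * (y - cap)]^+ ."""
    return max(0.0, p + gamma * (y - cap))

# Price rises while the offered load y exceeds capacity, then relaxes.
p = 0.0
for y in [12.0, 12.0, 8.0]:        # offered load against capacity 10
    p = rem_update(p, y, 10.0)
# p ends at 0.03 + 0.03 - 0.03 = 0.03
```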
System Block Diagram
[figure]
Question
What is the most important property of a rate controller? Stability!
Basic Notion
Insensitivity to small perturbations, where the perturbations are modeling errors of the system, environment noise, etc.
Lyapunov Stability
Consider an autonomous nonlinear dynamical system
ẋ = f(x(t)), x(0) = x_0,
where x(t) ∈ D ⊆ ℝⁿ denotes the system state vector, D is an open set containing the origin, and f: D → ℝⁿ is continuous on D. Without loss of generality, we may assume that the origin is an equilibrium.
The origin of the above system is said to be Lyapunov stable if, for every ε > 0, there exists a δ = δ(ε) > 0 such that, if ‖x(0)‖ < δ, then ‖x(t)‖ < ε for every t ≥ 0. — from Wikipedia
Lyapunov's Second Theorem on Stability
Consider a function V(x): ℝⁿ → ℝ such that
V(x) ≥ 0, with equality if and only if x = 0 (positive definite);
V̇(x(t)) < 0 for x ≠ 0 (negative definite).
Then V(x) is called a Lyapunov function candidate, and the system is asymptotically stable in the sense of Lyapunov (i.s.L.).
Interpretation
The generalized energy of the autonomous dynamical system eventually vanishes.
Stability of Primal Algorithms
Establish a Lyapunov function under mild regularity conditions:
U(x) = Σ_{s∈S} w_s log x_s − Σ_{j∈J} ∫_0^{Σ_{s: j∈s} x_s} p_j(y) dy.
Stability of Dual Algorithms
Establish a Lyapunov function under mild regularity conditions:
V(µ) = Σ_{s∈S} w_s log(Σ_{j∈s} µ_j) − Σ_{j∈J} ∫_0^{µ_j} q_j(η) dη.
Prove global stability via Lyapunov's second theorem.
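The Lyapunov argument can be illustrated numerically: along the discrete primal trajectory, U(x) should be nondecreasing. The sketch below uses the two-link example with the illustrative penalty price p_j(y) = B·max(0, y − c_j), whose integral is (B/2)·max(0, y − c_j)²; all constants are assumptions:

```python
import math

# Check that U(x) = sum_s w_s log x_s - sum_j int_0^{y_j} p_j(y) dy
# is nondecreasing along the discrete-time primal trajectory.
routes = [[0, 1], [0], [1]]
c = [2.0, 2.0]
w = [1.0, 1.0, 1.0]
kappa, B = 0.005, 50.0

def lyapunov(x):
    y = [sum(x[s] for s in range(3) if j in routes[s]) for j in range(2)]
    penalty = sum(0.5 * B * max(0.0, y[j] - c[j]) ** 2 for j in range(2))
    return sum(w[s] * math.log(x[s]) for s in range(3)) - penalty

x = [0.5, 0.5, 0.5]
values = [lyapunov(x)]
for _ in range(5000):
    y = [sum(x[s] for s in range(3) if j in routes[s]) for j in range(2)]
    mu = [B * max(0.0, y[j] - c[j]) for j in range(2)]
    x = [x[s] + kappa * (w[s] - x[s] * sum(mu[j] for j in routes[s]))
         for s in range(3)]
    values.append(lyapunov(x))

# U increases monotonically (up to floating-point noise) toward its maximum
assert all(b >= a - 1e-9 for a, b in zip(values, values[1:]))
```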
Related Issues
Rate of convergence.
Stochastic perturbations.
Time-delay systems.
Routing.
Other possible decompositions.
Logical Network
[figure: three sources S1, S2, S3 and destinations D1, D2, D3 connected through three switches; Link 1 and Link 2 are the shared links]
Note: We use an alternative dual algorithm (REM) in the simulation, since F. P. Kelly's work presents a theoretical framework rather than an algorithm implementable in a real network. We evaluate the price update of the REM algorithm. Each PC was equipped with 64 MB of RAM and a 100-Mb/s PCI Ethernet card. The packets are 500 B long, containing a 489 B data payload. Utility function: U_i(x_i) = α_i log x_i; γ = 1.5 × 10⁻².
Homogeneous Case: each source transmitted data for a total of 120 s, with starting times staggered by intervals of 40 s: source 1 started transmitting at time 0, source 2 at 40 s, and source 3 at 80 s. α_1 = α_2 = α_3 = 1 × 10⁻⁴.
Heterogeneous Case: the setup in this experiment is the same as in Experiment 1, except that the utility function of source 3 has α_3 = 2 × 10⁻⁴, double that of sources 1 and 2.
Examples
The model above assumes that C is a fixed vector, which does not hold in wireless networks, where the link capacity is a function of the scheduler or of the transmission power. The original maximization is then decomposed into a flow-rate control subproblem and a lower-layer subproblem:
maximize   Σ_{j∈J} µ_j c_j
subject to MAC- or PHY-layer constraints.
Appendix: Key References
F. P. Kelly, A. K. Maulloo, et al., "Rate control for communication networks: shadow prices, proportional fairness and stability," Journal of the Operational Research Society, 1998.
S. H. Low and D. E. Lapsley, "Optimization Flow Control, I: Basic Algorithm and Convergence," IEEE/ACM Transactions on Networking, 1999.
M. Chiang, S. H. Low, et al., "Layering as optimization decomposition," Proceedings of the IEEE, 2007.