Advanced Networking Technologies


Chapter: Routers and Switches - Input-buffered switches and matchings

This chapter is heavily based on material by Holger Karl (Universität Paderborn) and Prof. Yashar Ganjali (University of Toronto).

Content
- Head-of-line blocking: the missing analysis
  - A Markov model
  - A closed-form approach
- Virtual output queues revisited
  - Under uniform traffic
  - Under non-uniform traffic, with known traffic matrix
  - Under non-uniform traffic, with unknown traffic matrix
    - Maximum size matching
    - Maximum weight matching
    - Maximal size matching

Input-Queued Switch: How It Works
- The switch matches inputs and outputs
- Packets are queued at the inputs

Input-Queued Switch: Speed-Up Advantage
- At most one packet leaves from each input (and arrives at each output) ⇒ speed-up = 1, not N
- Speedup: ratio of switch speed to port speed

Head-of-Line Blocking
[Figure: head-of-line packets marked as blocked at three inputs]
- The switch is NOT work-conserving! (There is a packet for an output port, but the port is idle.)

Assumptions for a simple analysis
- Time is slotted, all packets have the same size
- At each time-slot, at each of the N inputs: i.i.d. packet arrivals with probability ρ
- Each packet is destined for one of the N outputs uniformly at random
  - By symmetry, consider some given output
- Scheduling: at each time-slot, each output picks an HoL packet uniformly at random
  - Or is idle if there is no packet for this output
- Question: what throughput ρ (per link) can we get?
  - More precisely: the largest ρ such that queues stay finite

[Figures: HoL blocking in a 2x2 switch]

Balls-and-Bins Model
- Saturated switch
  - Assume an infinite number of packets in each queue
  - They are all destined to some output u.a.r. (random coloring of packets)
- Balls-and-bins model
  - N outputs ⇔ N bins
  - N HoL packets ⇔ N balls
  - Balls reflect ONLY the HoL packets; the queue state is not relevant
- At each time-slot
  - Remove one ball from each non-empty bin
  - Assign new balls to bins with prob. ½, independently and u.a.r.
  - It does not matter which ball we remove (why?!)

[Figures: balls-and-bins model]

Markov Chain
- There are three states for the bin occupancy: (2,0), (1,1), (0,2)
  - E.g., (2,0) means both HoL packets are destined to the first output
- We get a discrete-time Markov chain over the states (2,0), (1,1), (0,2)

Transition Probabilities in Markov Chain
- Transition from (2,0): stay in (2,0) with probability 1/2, move to (1,1) with probability 1/2

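Transition Probabilities in Markov Chain
- From (2,0): to (2,0) with 1/2, to (1,1) with 1/2
- From (1,1): to (2,0) with 1/4, to (1,1) with 1/2, to (0,2) with 1/4
- From (0,2): to (1,1) with 1/2, to (0,2) with 1/2
- Equilibrium state distribution: π = {¼, ½, ¼}
- Output throughput = 1 - P(output empty) = 75%

Side Note: State Collapse
- Symmetric Markov chain
- State collapse: (2,0) and (1,1), where (2,0) now also stands for (0,2)
- All four transition probabilities of the collapsed chain are 1/2
- Equilibrium (collapsed) state distribution: (1/2, 1/2); from it we get the real state distribution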

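3x3 Switch
- Markov chain with the following states: (3,0,0), (0,3,0), (0,0,3), (2,1,0), (2,0,1), (1,2,0), (0,2,1), (0,1,2), (1,0,2), (1,1,1)
- State collapse into: (3,0,0), (2,1,0) and (1,1,1)
- Transition probabilities of the collapsed chain:
  - From (3,0,0): to (3,0,0) with 1/3, to (2,1,0) with 2/3
  - From (2,1,0): to (3,0,0) with 1/9, to (2,1,0) with 2/3, to (1,1,1) with 2/9
  - From (1,1,1): to (3,0,0) with 1/9, to (2,1,0) with 2/3, to (1,1,1) with 2/9

3x3 Switch: equilibrium state distribution
- Per-output throughput: 75% for 2x2, 68% for 3x3
- But: state space explosion for large N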
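The Markov-chain numbers above can be checked mechanically for small N. The following is a minimal sketch, not part of the original slides: it enumerates all bin-occupancy states of the saturated balls-and-bins model, builds the transition probabilities by brute force, and approximates the equilibrium by power iteration. The function name `hol_throughput` and the iteration count are illustrative choices.

```python
from collections import defaultdict
from itertools import product


def hol_throughput(n, iters=500):
    """Per-output throughput of a saturated n x n FIFO input-queued switch."""
    # A state lists how many HoL packets (balls) target each output (bin).
    states = [s for s in product(range(n + 1), repeat=n) if sum(s) == n]

    def step(state):
        # Remove one ball from every non-empty bin, then throw the freed
        # balls back independently and uniformly at random.
        left = [max(c - 1, 0) for c in state]
        freed = n - sum(left)
        out = defaultdict(float)
        for choice in product(range(n), repeat=freed):
            nxt = list(left)
            for b in choice:
                nxt[b] += 1
            out[tuple(nxt)] += 1.0 / n ** freed
        return out

    trans = {s: step(s) for s in states}

    # Power iteration towards the equilibrium distribution.
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        new = defaultdict(float)
        for s, p in pi.items():
            for t, q in trans[s].items():
                new[t] += p * q
        pi = new

    # An output is busy exactly when its bin is non-empty.
    busy = sum(p * sum(1 for c in s if c > 0) for s, p in pi.items())
    return busy / n


print(hol_throughput(2), hol_throughput(3))
```

The brute-force enumeration grows very quickly with N, which is the state space explosion mentioned on the slide; the sketch is only meant for N up to about 5.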

Method #2: Closed-form equations for balls in bins
- Suppose (!) each output behaves like an M/D/1 system
- Given from Tele/Perf analysis: the terms of Pollaczek-Khinchine for the mean number k in the system, with Little's law $E\{k\} = \lambda E\{T_v\}$
- Hence, for deterministic service times:
  $E\{k\} = \rho + \frac{\rho^2}{2(1-\rho)}$
- In the saturated switch the N HoL packets are spread over N outputs, so $E\{k\} = 1$
- For $N \to \infty$: $\rho = 2 - \sqrt{2} \approx 58\%$

HoL Blocking vs. OQ Switch
[Figure: delay versus load (0% to 100%) for an IQ switch with HoL blocking and for an OQ switch; the IQ curve saturates at 2 - √2 ≈ 58% load]

VOQs: How Packets Move
[Figure: VOQs at the inputs, controlled by a scheduler]

Basic Switch Model
[Figure: N x N switch with arrivals A_ij(n), VOQ occupancies Q_ij(n), schedule S(n) and departures D_ij(n)]

Notations: Arrivals
- A_ij(n): packet arrivals at input i for output j at time-slot n
  - A_ij(n) = 0 or 1
- λ_ij = E[A_ij(n)]: arrival rate
- Λ = [λ_ij]: traffic matrix
- A = [A_ij(n)] is admissible iff:
  - For all i, Σ_j λ_ij < 1: no input is oversubscribed
  - For all j, Σ_i λ_ij < 1: no output is oversubscribed

Notations: Schedule
- Q_ij(n): queue size of VOQ(i,j)
  - Q = [Q_ij(n)]
- S_ij(n): whether the schedule connects input i to output j
  - S_ij(n) = 0 or 1
- No speedup: each input is connected to at most one output, each output to at most one input
- We will assume that each input is connected to exactly one output, and each output to exactly one input ⇒ S = [S_ij(n)] is a permutation matrix
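The two conditions above (admissible traffic matrix, schedule is a permutation matrix) translate directly into row and column sums. A small sketch, assuming matrices are given as plain nested lists; the helper names `is_admissible` and `is_permutation` are illustrative and not from the slides.

```python
def is_admissible(rates):
    """True if no input (row) and no output (column) is oversubscribed."""
    rows_ok = all(sum(row) < 1 for row in rates)
    cols_ok = all(sum(col) < 1 for col in zip(*rates))
    return rows_ok and cols_ok


def is_permutation(schedule):
    """True if all entries are 0 or 1 and every row and column sums to 1."""
    entries_ok = all(x in (0, 1) for row in schedule for x in row)
    rows_ok = all(sum(row) == 1 for row in schedule)
    cols_ok = all(sum(col) == 1 for col in zip(*schedule))
    return entries_ok and rows_ok and cols_ok


# Hypothetical 2x2 examples.
print(is_admissible([[0.5, 0.4], [0.4, 0.5]]))   # rows and columns sum to 0.9
print(is_admissible([[0.7, 0.4], [0.2, 0.5]]))   # input 0 is oversubscribed
print(is_permutation([[0, 1], [1, 0]]))
```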

Scheduling Algorithm
- What it does: determine S(n)
- How:
  - Either using the traffic matrix Λ,
  - Or, in most cases, using the queue sizes Q(n) (because Λ is unknown)
- Objective: 100% throughput
  - So that lines are fully utilized
- Secondary objective: minimize packet delays/backlogs

What is 100% throughput?
- Work-conserving scheduler
  - Definition: if there are one or more packets in the system for an output, then that output is busy (i.e., the output is kept busy whenever there is work for it)
  - An output-queued switch is trivially work-conserving
    - Each output can be modeled as an independent single-server queue
    - If λ < µ then E[Q_ij(n)] < c for some c
    - Therefore, we say it achieves 100% throughput
  - For fixed-size packets, work-conservation also minimizes average packet delay
    - Q: What happens when packet sizes vary?
- Non work-conserving scheduler
  - An input-queued switch is, in general, non work-conserving
  - Q: What definitions make sense for 100% throughput?

Common Definitions of 100% throughput (from stronger to weaker)
- Work-conserving
- For all n, i, j: Q_ij(n) < c
- For all n, i, j: E[Q_ij(n)] < c
  - We will focus on this definition
- Departure rate = arrival rate

Uniform Traffic
- Definition: λ_ij = λ for all i, j
  - i.e., all input-output pairs have the same traffic rate
- Condition for admissible traffic: λ < 1/N
- Example: Bernoulli traffic
  - λ = ρ/N
  - Arrivals at input i are Bernoulli(ρ) and i.i.d.

100% Throughput for Uniform Traffic
- Nearly all algorithms in the literature can give 100% throughput when traffic is uniform
- For example:
  - Uniform cyclic
  - Random permutation
  - Wait-until-full
  - Maximum size matching (MSM)
  - Maximal size matching (e.g. WFA, PIM, iSLIP)

Uniform Cyclic Scheduling
[Figure: cyclic connection patterns between ports A, B, C, D]
- Each (i, j) pair is served every N time slots: M/D/1 with λ = ρ/N < 1/N and µ = 1/N
- Stable for ρ < 1

Wait Until Full
- We don't have to do much at all to achieve 100% throughput when arrivals are Bernoulli i.i.d. uniform
- Simulation suggests that the following algorithm leads to 100% throughput
- Wait-until-full:
  - If any VOQ is empty, do nothing (i.e. serve no queues)
  - If no VOQ is empty, pick a random permutation

Simple Algorithms with 100% Throughput
[Figure: comparison of wait-until-full, uniform cyclic, maximal matching algorithm (iSLIP) and MSM]

Uniform Random Scheduling
- At each time-slot, pick a schedule u.a.r. among:
  - The N cyclic permutations
  - Or the N! permutations
- Then P(S_ij = 1) = 1/N
  - Q: why?

Uniform Random Scheduling
- We get an M/M/1 system with λ = ρ/N and µ = 1/N
- Birth-death chain
- We get: E{Delay} = 1/(µ - λ) = N/(1 - ρ)
- Stable when ρ < 1

Non-Uniform Traffic
- Assume the traffic matrix Λ is admissible and non-uniform
[Matrix: 4x4 example Λ annotated with its row and column sums, all below 1]

Uniform Schedule?
- What if we use a uniform schedule?
  - Each VOQ is serviced at rate µ = 1/N = 1/4
  - But one VOQ of the example Λ has an arrival rate above 1/4
  - Arrival rate > departure rate ⇒ switch unstable!
- Need to adapt the schedule to the traffic matrix

Example Scheduling (Trivial)
- Assume we know the traffic matrix, it is admissible, and it follows a permutation
[Matrix: Λ = 0.99 · P, with P a 4x4 permutation matrix]
- Then we can simply choose S(n) = P for all n

Example Scheduling
- Assume we know the traffic matrix, and it doesn't follow a permutation
[Matrix: Λ = 0.99 times a 4x4 matrix with fractional entries]
- Then we can choose a sequence of service permutations
[Matrices: the service permutations S(n) for this example]
- And either cycle through it or pick randomly
- In general, if we know an admissible Λ, can we pick a sequence S(n) so that λ_ij < µ_ij?

Definitions
- Doubly stochastic matrix: an N x N matrix with nonnegative entries where all rows and all columns sum to 1
- Doubly sub-stochastic matrix: an N x N matrix with nonnegative entries where the sum of entries in each row or column is less than or equal to 1

Doubly Stochastic Matrices
- Λ is admissible, hence doubly sub-stochastic
- Theorem (von Neumann): there exists Λ' = {λ'_ij} such that Λ < Λ', i.e. every element of Λ is smaller than the corresponding element of Λ', and Λ' is doubly stochastic: Σ_i λ'_ij = Σ_j λ'_ij = 1
- Example:
[Matrices: the 4x4 example Λ and a doubly stochastic Λ' with Λ < Λ']

Doubly Stochastic Matrices
- Fact 1. The set of doubly stochastic matrices is convex and compact (closed and bounded) in $\mathbb{R}^{N^2}$
- Fact 2. Any convex, compact set in $\mathbb{R}^{N^2}$ has extreme points, and is equal to the convex hull of its extreme points (Krein-Milman Theorem)

Doubly Stochastic Matrices
- Theorem (Birkhoff): permutation matrices are the extreme points of the set of doubly stochastic matrices
- In other words: given Λ', there exist K numbers a_k > 0 and K permutation matrices P_k such that
  $\Lambda < \Lambda' = \sum_{k=1}^{K} a_k P_k$
  where the inequality is von Neumann's theorem and the equality is Birkhoff's theorem
- Note: $K \le N^2 - 2N + 2$

Birkhoff-von Neumann (BvN) Scheduling
- BvN decomposition: Λ ⇒ Λ' ⇒ {a_k, P_k}
- BvN weighted random scheduling: pick P_k with probability a_k
- Theorem: BvN scheduling achieves 100% throughput

BvN Example
- For a given Λ
[Matrix: the 4x4 example Λ]
- How do we find a feasible Λ'?
- Lots of more or less complicated ways are possible:
  - Linear optimization
  - Build a linear equation system for the values added to each λ_ij and add additional constraints as needed
  - ...

BvN Example
- Let's take the following matrix Λ'
[Matrix: 4x4 doubly stochastic Λ']
- How do we get a valid schedule?

BvN Example
- Define a helper matrix from Λ'
[Matrix: helper matrix]
- Choose a permutation P (at random or with a strategy) and subtract it from the helper matrix as often as possible
[Matrices: the helper matrix, the chosen permutation, and the remainder after subtraction]

BvN Example
- Repeat!
[Matrices: three further rounds, each with the remaining helper matrix, the chosen permutation, and the new remainder; the last remainder is the zero matrix]

BvN Example
- Now we know that Λ' = a_1 P_1 + a_2 P_2 + a_3 P_3 + a_4 P_4
[Matrices: Λ' written as a weighted sum of the four permutation matrices]
- Note that $\sum_{k=1}^{K} a_k = 1$
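The subtract-and-repeat procedure of the example can be sketched in a few lines. This is an illustration under stated assumptions, not the slides' own implementation: the input is assumed to be exactly doubly stochastic, the permutation inside the support is found by brute force over all N! candidates (fine for small N only), and the 3x3 matrix below is a hypothetical example rather than the garbled one from the slides. Exact `Fraction` arithmetic avoids rounding residue.

```python
from fractions import Fraction
from itertools import permutations


def bvn_decompose(matrix):
    """Split a doubly stochastic matrix into (a_k, P_k) terms.

    Each permutation is returned as a tuple p where input i is
    connected to output p[i].
    """
    rest = [[Fraction(x) for x in row] for row in matrix]
    n = len(rest)
    terms = []
    while any(x > 0 for row in rest for x in row):
        # Birkhoff: while row and column sums are equal and positive,
        # some permutation lies entirely on positive entries.
        perm = next((p for p in permutations(range(n))
                     if all(rest[i][p[i]] > 0 for i in range(n))), None)
        if perm is None:
            raise ValueError("matrix is not doubly stochastic")
        # Subtract the permutation "as often as possible".
        a = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= a
        terms.append((a, perm))
    return terms


half = Fraction(1, 2)
example = [[half, half, 0], [half, 0, half], [0, half, half]]
for weight, perm in bvn_decompose(example):
    print(weight, perm)
```

BvN weighted random scheduling would then draw one of the returned permutations per time-slot with probabilities a_k, for instance with `random.choices` and the a_k as weights.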

Proof: 100% Throughput
- Lindley's equation: $Q_{ij}(n) = \left[Q_{ij}(n-1) + A_{ij}(n) - S_{ij}(n)\right]^+$
- Arrival rate: P(A_ij(n) = 1) = E[A_ij(n)] = λ_ij
- Departure rate: $P(S_{ij}(n) = 1) = \sum_{k} a_k P_k(i,j) = \lambda'_{ij}$
- Arrival rate < departure rate ⇒ 100% throughput

Unknown Traffic Matrix
- We want to maximize throughput
- Traffic matrix unknown ⇒ cannot use BvN
- Idea/intuition: maximize instantaneous throughput
  - In other words: transfer as many packets as possible at each time-slot
- Maximum Size Matching (MSM) algorithm, using the request graph
  - A bipartite graph: input ports i and output ports j
  - Edge from i to j if Q_ij > 0 (queue size of the virtual output queue)
  - Goal: find the largest number of edges such that each node has at most one edge (in or out)

Maximum Size Matching (MSM)
- MSM maximizes instantaneous throughput
[Figure: request graph with an edge wherever Q_ij(n) > 0, and the resulting bipartite maximum size match]
- MSM algorithm: among all maximum size matches, pick a random one

Implementing MSM
- How can we find maximum size matches?
- We do so by recasting the problem as a network flow problem
  - Classic math problem

Question
- Is the intuition right?
  - In particular: is it a good idea to pick one of several maximum matchings at random?
- Answer: NO!
- There is a counter-example for which, in a given VOQ(i,j), λ_ij < µ_ij but MSM does not provide 100% throughput
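Returning to the implementation question: for a bipartite request graph with unit capacities, the network-flow formulation reduces to repeatedly searching for augmenting paths. The sketch below uses that simplification (often attributed to Kuhn) and is not taken from the slides; unlike the MSM algorithm described above it returns one deterministic maximum match instead of a random one. The name `maximum_size_matching` is illustrative.

```python
def maximum_size_matching(queues):
    """Return match[i] = output matched to input i, or None.

    The request graph has an edge (i, j) whenever queues[i][j] > 0.
    """
    n = len(queues)
    owner = [None] * n  # owner[j] = input currently matched to output j

    def augment(i, seen):
        # Try to give input i an output, re-routing earlier choices
        # along an augmenting path if necessary.
        for j in range(n):
            if queues[i][j] > 0 and j not in seen:
                seen.add(j)
                if owner[j] is None or augment(owner[j], seen):
                    owner[j] = i
                    return True
        return False

    for i in range(n):
        augment(i, set())

    match = [None] * n
    for j, i in enumerate(owner):
        if i is not None:
            match[i] = j
    return match


# Input 0 must give up output 0 so that input 1 can be served as well.
print(maximum_size_matching([[1, 1], [1, 0]]))
```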

Counter-example
- Consider the following non-uniform traffic pattern, with Bernoulli i.i.d. arrivals: λ_11 = λ_12 = 1/2 - δ, λ_21 = 1/2 - δ, λ_32 = 1/2 - δ
- Three possible matches S(n)
- Consider the case when Q_21 and Q_32 both have arrivals, which happens w.p. (1/2 - δ)^2
  - When packets are processed on Q_21 or Q_32, input 1 is served w.p. at most 2/3 (schedules equally probable)
- Overall, the service rate µ_1 for input 1 is at most
  $\mu_1 \le \frac{2}{3}\left(\frac{1}{2}-\delta\right)^2 + 1\cdot\left[1-\left(\frac{1}{2}-\delta\right)^2\right] = 1 - \frac{1}{3}\left(\frac{1}{2}-\delta\right)^2$
- Switch unstable for small δ: the arrival rate at input 1, λ_11 + λ_12 = 1 - 2δ, exceeds this bound when
  $\delta < \frac{7-\sqrt{48}}{2} \approx 0.0359$

Scheduling in Input-Queued Switches
[Figure: N x N switch with arrivals A_ij(n), VOQ occupancies Q_ij(n), schedule S*(n) and departures D_ij(n)]
- Problem. Maximum size matching
  - Maximizes instantaneous throughput
  - Does not take into account VOQ backlogs
- Solution. Give higher priority to VOQs which have more packets
  - Look at the weighted version of the bipartite matching problem

Maximum Weight Matching (MWM)
- Assign weights to the edges of the request graph
[Figure: request graph (edge wherever Q_ij(n) > 0) turned into a weighted request graph with weights W_ij]
- Find the matching with maximum weight

MWM Scheduling
- Create the request graph
- Find the associated link weights
- Find the matching with maximum weight
  - How?
- Transfer packets from the ingress lines to the egress lines based on the matching
- Question. How often do we need to calculate MWM?

Options for Weights
- Longest Queue First (LQF)
  - The weight associated with each link is the length of the corresponding VOQ
  - MWM tends to give priority to long queues
  - Does not necessarily serve the longest queue
- Oldest Cell First (OCF)
  - The weight of each link is the waiting time of the HoL packet in the corresponding queue
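To make the LQF weighting concrete, here is a brute-force sketch that tries every permutation and keeps the one with the largest total queue length. It is meant only for tiny N and is not how a real scheduler would compute MWM; the function name `mwm_lqf` and the 2x2 occupancy matrix are hypothetical.

```python
from itertools import permutations


def mwm_lqf(queues):
    """Maximum weight matching with LQF weights w_ij = queue length.

    Returns a tuple p where input i is connected to output p[i].
    """
    n = len(queues)
    return max(permutations(range(n)),
               key=lambda p: sum(queues[i][p[i]] for i in range(n)))


queues = [[10, 9],
          [9, 0]]
print(mwm_lqf(queues))
```

In this example the longest queue (10 packets at VOQ(0,0)) is not served: connecting input 0 to output 1 and input 1 to output 0 has total weight 18, which beats 10. This is the sense in which MWM-LQF "does not necessarily serve the longest queue".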

Longest Queue First (LQF)
- LQF is the name given to the maximum weight matching where the weight is w_ij(n) = L_ij(n), the queue length
  - But the name is so bad that people keep the name MWM!
  - LQF doesn't necessarily serve the longest queue
- Theorem. MWM-LQF scheduling provides 100% throughput
- Problem: LQF can leave a short queue unserved indefinitely
- MWM-LQF is very important theoretically: most (if not all) scheduling algorithms that provide 100% throughput for unknown traffic matrices are variants of MWM!

Proof Idea: Use Lyapunov Functions
- Basic idea: when queues become large, the MWM schedule tends to give them a negative drift
- We will try to show that
  $E\{L(n+1) - L(n) \mid L(n)\} < 0$
  with L(n) = f(Q(n)) being a Lyapunov function and Q(n) the queue lengths written in a single vector
- Using this method we expect to find
  $E\{L(n+1) - L(n) \mid L(n)\} < -\varepsilon \lVert Q(n) \rVert + c$
- That is, for queues being long enough there is a negative drift

Lyapunov Analysis: Simple Example
- Suppose we have a long tube
  - Cross section A m², with a defined water level at t = 0 s
  - Water is poured into the tube at a rate of V m³/s for t > 0 s
  - An orifice with an area of a m² sits at the bottom of the tube
- Let h(t) be the water level. What happens?
- The volume changes according to the differential equation
  $A \frac{dh}{dt} = V - a\sqrt{2 g h(t)}$
  with g being the gravitational acceleration
- Larger volume ⇒ higher drainage
- Set h(t) to be a Lyapunov function L(t):
  $\frac{dL}{dt} = \left(V - a\sqrt{2 g L(t)}\right)/A$
  $\frac{dL}{dt} < 0 \;\Rightarrow\; L(t) > \frac{V^2}{2 g a^2}$
- The system is stable for any start value

Lyapunov Functions
- How do we use this approach in general?
- Find a positive function L(t) that increases with some state of the system
  - Some more properties are needed, e.g. it needs to map to a real value, be smooth, etc.
  - In the example, $L(t) = \sqrt{h(t)}$ or $L(t) = (h(t))^2$ would have been ok, too
- Show that dL/dt is negative whenever L(t) > c
  - Note: it may be positive for values below c; if it is exactly zero after some point, the system may be stable depending on the starting conditions
- Much more theory behind this
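The tube example can be checked numerically. The sketch below integrates A dh/dt = V - a√(2gh) with a plain forward-Euler step and compares the final level with the threshold V²/(2ga²) above which the drift is negative. All parameter values (tube area, orifice area, inflow, step size, duration) are assumptions chosen for illustration; the function name `simulate_tube` is not from the slides.

```python
import math


def simulate_tube(h0, inflow, tube_area, hole_area,
                  g=9.81, dt=0.01, t_end=600.0):
    """Forward-Euler integration of A dh/dt = V - a*sqrt(2*g*h)."""
    h = h0
    for _ in range(int(t_end / dt)):
        outflow = hole_area * math.sqrt(2.0 * g * max(h, 0.0))
        h += dt * (inflow - outflow) / tube_area
    return h


V, A, a = 0.02, 1.0, 0.01          # assumed values in m^3/s, m^2, m^2
h_star = V ** 2 / (2 * 9.81 * a ** 2)

# Starting above or below the threshold ends at the same level.
print(h_star, simulate_tube(1.0, V, A, a), simulate_tube(0.0, V, A, a))
```

Both start values settle at the same level, matching the statement that the system is stable for any start value: above the threshold the level drifts down, below it the level drifts up.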

Back to the Outline of the Proof
- Intuition: can we find S* such that, for any Λ = [λ_ij],
  $Q^T(n)\,\Lambda \le Q^T(n)\,S^*$ ?
- Solution: we look for the worst-case traffic
  $\Lambda' = \arg\max_{\Lambda} Q^T(n)\,\Lambda$
  with $\forall j: \sum_i \lambda_{ij} \le 1$, $\forall i: \sum_j \lambda_{ij} \le 1$, $\forall (i,j): \lambda_{ij} \ge 0$
- We know (Birkhoff: "permutations are extreme points of doubly stochastic matrices") that $Q^T(n)\,\Lambda'$ is at most $Q^T(n)\,S^*$ with
  $S^* = \arg\max_{S} Q^T(n)\,S$
  with $\forall j: \sum_i S_{ij} = 1$, $\forall i: \sum_j S_{ij} = 1$, $\forall (i,j): S_{ij} \ge 0$
- Therefore, for any Λ: $Q^T(n)\,\Lambda \le Q^T(n)\,S^*$

Outline of Proof
1. We know that if we pick $S^* = \arg\max_S Q^T(n)\,S$, then $Q^T(n)(\Lambda - S^*) \le 0$ for all Λ.
2. Next we can use this fact to show that
   $E\{L(n+1) - L(n) \mid L(n)\} \le -\varepsilon \lVert Q(n) \rVert + c$
   with $L(n) = Q^T(n)\,Q(n)$, i.e. a quadratic Lyapunov function.
3. Hence, if $\lVert Q(n) \rVert$ is large enough, buffers do not grow. (Of course, only once there is a small pause is there an expected single-step downward drift in occupancy.)

Note: proof details in the paper by McKeown et al.

LQF Variants
- Question: what if w_ij(n) is some other function of the queue length L_ij(n)?
- What if the weight is w_ij(n) = W_ij(n) (waiting time)?
  - Preference is given to cells that have waited a long time
  - Is it stable?
  - We call the algorithm OCF (Oldest Cell First)
  - Remember that it doesn't guarantee to serve the oldest cell!

Summary of MWM Scheduling
- MWM-LQF scheduling provides 100% throughput
  - It can starve some of the packets
- MWM-OCF scheduling gives 100% throughput
  - No starvation
- Question. Are these fast enough to implement in real switches?
  - Not obviously so (recall: 8 ns!)
  - A non-trivial amount of parallelization of these algorithms is necessary
  - Or: relax the requirements and look at a less challenging version of the matching problem

[Figure: simulation of a simple switch example]

The Story So Far
- Output-queued switches
  - Best performance
  - Impractical: need a speedup of N
- Input-queued switches
  - Head-of-line blocking → VOQs
  - Known traffic matrix → BvN
  - Unknown traffic matrix → MWM

Complexity of Maximum Matchings
- Maximum size matchings: typical complexity O(N^2.5)
- Maximum weight matchings: typical complexity O(N^3)
- In general:
  - Hard to implement in hardware
  - Slow
- Can we find a faster algorithm? No, but we can relax the requirements

Maximal Matching
- A maximal matching is a matching in which adding any edge to it destroys the matching property
- Realization: a maximal matching can be computed by algorithms in which each edge is added one at a time, and is not later removed from the matching
  - No augmenting paths allowed as in the Ford-Fulkerson network flow (they remove edges added earlier)
- Consequence: no input and output are left unnecessarily idle

Example of Maximal Matching
[Figure: a request graph with inputs A-F and outputs 1-6, shown next to a maximal size matching and a maximum size matching]

Properties of Maximal Matchings
- In general, maximal matching is much simpler to implement, and has a much faster running time
- A maximal size matching is at least half the size of a maximum size matching. (Why?)
- Simplest case: Greedy LQF
- Further (more relevant) examples:
  - WFA
  - PIM
  - iSLIP

Greedy LQF
- Greedy LQF (Greedy Longest Queue First) is defined as follows:
  - Pick the VOQ with the largest number of packets (if there are ties, pick at random among the VOQs that are tied). Say it is VOQ(i_1, j_1).
  - Then, among all free VOQs, pick again the VOQ with the largest number of packets (say VOQ(i_2, j_2), with i_2 ≠ i_1, j_2 ≠ j_1).
  - Continue likewise until the algorithm converges.
- Greedy LQF is also called iLQF (iterative LQF) and Greedy Maximal Weight Matching
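The definition above maps onto a short greedy loop. The sketch below sorts all non-empty VOQs once by length and then accepts each VOQ whose input and output are still free, which is equivalent to repeatedly picking the longest free VOQ. One deliberate deviation from the slide: ties are broken by index order rather than at random, so that the result is reproducible. The function name `greedy_lqf` is illustrative.

```python
def greedy_lqf(queues):
    """Return the (input, output) pairs chosen longest-queue-first."""
    n = len(queues)
    # All non-empty VOQs, longest first (stable sort: ties by index).
    candidates = sorted(
        ((queues[i][j], i, j)
         for i in range(n) for j in range(n) if queues[i][j] > 0),
        key=lambda c: -c[0])
    used_in, used_out, match = set(), set(), []
    for _, i, j in candidates:
        if i not in used_in and j not in used_out:
            match.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return match


print(greedy_lqf([[10, 9], [9, 0]]))
```

For this occupancy matrix the greedy choice serves only VOQ(0,0) with weight 10, whereas the maximum weight matching would serve the two queues of length 9 (weight 18). The greedy result is still maximal and carries more than half of the maximum weight.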

Properties of Greedy LQF
- The algorithm converges in at most N iterations. (Why?)
- Greedy LQF results in a maximal size matching. (Why?)
- Greedy LQF produces a matching that has at least half the size and half the weight of a maximum weight matching. (Why?)

Wave Front Arbiter (WFA) [Tamir and Chi, 1993]
[Figure: request matrix and the resulting match]

Wave Front Arbiter
[Figure: request matrix and the resulting match]

Wave Front Arbiter Implementation
[Figure: grid of cells (i,j), each a simple combinational logic block]

Wave Front Arbiter: Wrapped WFA (WWFA)
- N steps instead of 2N-1
[Figure: request matrix and the resulting match]

Properties of Wave Front Arbiters
- Feed-forward (i.e. non-iterative) design lends itself to pipelining
- Always finds a maximal match
- Usually requires a mechanism to prevent Q_11 from getting preferential service
- In principle, can be distributed over multiple chips
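A software sketch of the (unwrapped) wave front idea: the request matrix is swept along its 2N-1 anti-diagonals, and a cell is granted if it holds a request and neither its row nor its column has been granted earlier. Cells on one anti-diagonal never share a row or column, which is why hardware can decide them in parallel; the loop below simply visits them one after another. This is an illustration under the assumption that the wave starts at the top-left cell, and the name `wave_front_arbiter` is not from the slides.

```python
def wave_front_arbiter(requests):
    """Return the granted (input, output) pairs of one arbitration."""
    n = len(requests)
    row_free = [True] * n
    col_free = [True] * n
    grants = []
    for wave in range(2 * n - 1):            # anti-diagonal i + j = wave
        for i in range(max(0, wave - n + 1), min(n, wave + 1)):
            j = wave - i
            if requests[i][j] and row_free[i] and col_free[j]:
                grants.append((i, j))
                row_free[i] = False
                col_free[j] = False
    return grants


print(wave_front_arbiter([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))
print(wave_front_arbiter([[1, 1], [1, 0]]))
```

The second call shows the preferential treatment of the top-left cell: cell (0,0) is always granted when it requests, even though granting (0,1) and (1,0) instead would yield a larger match.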

Parallel Iterative Matching [Anderson et al., 1993]
[Figure: two PIM iterations (#1, #2), each with three phases: Requests, Grant (u.a.r. selection), Accept/Match (u.a.r. selection)]

PIM Properties
- Guaranteed to find a maximal match in at most N iterations. (Why?)
- In each phase, each input and output arbiter can make decisions independently
- In general, will converge to a maximal match in < N iterations
- How many iterations should we run?

Parallel Iterative Matching: Convergence Time
- Number of iterations to converge:
  $E[U_i] \le \frac{N^2}{4^i}$, $E[C] \approx \log N$
  - C = # of iterations required to resolve connections
  - N = # of ports
  - U_i = # of unresolved connections after iteration i
- Anderson et al., High-Speed Switch Scheduling for Local Area Networks, 1993

[Figure: Parallel Iterative Matching]
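The request/grant/accept phases can be sketched as follows. This is a sequential illustration of a scheme that runs in parallel in hardware; the function name `pim`, the dictionary-based bookkeeping and the optional `rng` parameter (used only to make runs reproducible) are assumptions of this sketch.

```python
import random


def pim(requests, iterations, rng=None):
    """Parallel Iterative Matching sketch; returns {input: output}."""
    rng = rng or random.Random()
    n = len(requests)
    match = {}                     # input -> output
    matched_outputs = set()
    for _ in range(iterations):
        # Request + grant: every unmatched output picks u.a.r. among
        # the unmatched inputs that hold a cell for it.
        grants = {}                # input -> list of granting outputs
        for j in range(n):
            if j in matched_outputs:
                continue
            askers = [i for i in range(n)
                      if i not in match and requests[i][j]]
            if askers:
                grants.setdefault(rng.choice(askers), []).append(j)
        if not grants:
            break                  # nothing left to resolve: maximal
        # Accept: every input picks u.a.r. among the grants it received.
        for i, offers in grants.items():
            j = rng.choice(offers)
            match[i] = j
            matched_outputs.add(j)
    return match


rng = random.Random(1)
reqs = [[rng.random() < 0.5 for _ in range(5)] for _ in range(5)]
print(pim(reqs, iterations=5, rng=rng))
```

Every iteration that still sees an unresolved request adds at least one pair to the match, which is the reason N iterations always suffice for a maximal match.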

Parallel Iterative Matching
[Figure: PIM with a single iteration]
[Figure: PIM with several iterations]

iSLIP [McKeown et al., 1999]
[Figure: two iSLIP iterations (#1, #2), each with three phases: Requests, Grant, Accept/Match]

iSLIP Operation
- Grant phase: each output selects the requesting input at the pointer, or the next input in round-robin order
  - It only updates its pointer if the grant is accepted
- Accept phase: each input selects the granting output at the pointer, or the next output in round-robin order
- Consequence: under high load, grant pointers tend to move to unique values

iSLIP Properties
- Random under low load
- TDM under high load
- Lowest priority to MRU (most recently used)
- 1 iteration: fair to outputs
- Converges in at most N iterations. (On average, simulations suggest < log N)
- Implementation: N priority encoders
- 100% throughput for uniform i.i.d. traffic
- But: some pathological patterns can lead to low throughput

[Figure: iSLIP]
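A single iSLIP iteration can be sketched as below. Two details are assumptions taken from the standard description of the algorithm rather than from the slides: pointers advance to one position beyond the matched port, and the accept pointer is updated on every accept. The function name `islip_iteration` and the in-place pointer lists are illustrative.

```python
def islip_iteration(requests, grant_ptr, accept_ptr):
    """One iSLIP iteration; pointer lists are updated in place.

    Returns {input: output} for the connections made.
    """
    n = len(requests)

    def round_robin(candidates, start):
        # First candidate at or after `start`, wrapping around.
        return min(candidates, key=lambda x: (x - start) % n)

    # Grant phase: each output grants the requesting input that is
    # nearest to its grant pointer.
    grants = {}                    # input -> list of granting outputs
    for j in range(n):
        askers = [i for i in range(n) if requests[i][j]]
        if askers:
            chosen = round_robin(askers, grant_ptr[j])
            grants.setdefault(chosen, []).append(j)

    # Accept phase: each input accepts the granting output that is
    # nearest to its accept pointer.
    match = {}
    for i, offers in grants.items():
        j = round_robin(offers, accept_ptr[i])
        match[i] = j
        accept_ptr[i] = (j + 1) % n
        grant_ptr[j] = (i + 1) % n   # moved only because the grant was accepted
    return match


g, a = [0, 0], [0, 0]
full = [[1, 1], [1, 1]]
for _ in range(3):
    print(islip_iteration(full, g, a))
```

With all VOQs backlogged, the first slot connects only one pair because both outputs grant to input 0. From the second slot on the grant pointers have desynchronized and every slot yields a full match, which illustrates the "TDM under high load" behaviour.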

[Figure: iSLIP]

iSLIP Implementation
[Figure: N grant arbiters and N accept arbiters built from programmable priority encoders, each with log N bits of pointer state, feeding the decision logic]

Maximal Matches
- Maximal matching algorithms are widely used in industry (especially algorithms based on WFA and iSLIP)
- PIM and iSLIP are rarely run to completion (i.e. they are sub-maximal)
- We will see that a maximal match with a speedup of 2 is stable for non-uniform traffic

Conclusion
- Switch architecture decision: output buffer or input buffer
- Output buffer: conceptually simple, but requires expensive hardware (switching fabric!)
- Input buffer replaces hardware by brainware: clever scheduling gives comparable performance with a much simpler/cheaper switching fabric
  - But needs non-trivial computational effort

References
- N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, Achieving 100% Throughput in an Input-Queued Switch (Extended Version), IEEE Transactions on Communications, Vol. 47, No. 8, August 1999.
- A. Mekkittikul and N. McKeown, A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches, IEEE Infocom 98, Vol. 2, pp. 792-799, April 1998, San Francisco.
- A. Schrijver, Combinatorial Optimization - Polyhedra and Efficiency, Springer-Verlag, 2003.
- T. Anderson, S. Owicki, J. Saxe, and C. Thacker, High-Speed Switch Scheduling for Local-Area Networks, ACM Transactions on Computer Systems, 11(4):319-352, November 1993.
- Y. Tamir and H.-C. Chi, Symmetric Crossbar Arbiters for VLSI Communication Switches, IEEE Transactions on Parallel and Distributed Systems, 4(1):13-27, 1993.
- N. McKeown, The iSLIP Scheduling Algorithm for Input-Queued Switches, IEEE/ACM Transactions on Networking, 7(2):188-201, April 1999.