A Game Theoretic Approach to Distributed Opportunistic Scheduling


Albert Banchs, Senior Member, IEEE, Andres Garcia-Saavedra, Pablo Serrano, Member, IEEE, and Joerg Widmer, Senior Member, IEEE

Abstract: Distributed Opportunistic Scheduling (DOS) is inherently more difficult than conventional opportunistic scheduling due to the absence of a central entity that knows the channel state of all stations. With DOS, stations use random access to contend for the channel and, upon winning a contention, they measure the channel conditions. After measuring the channel conditions, a station only transmits if the channel quality is good; otherwise, it gives up the transmission opportunity. The distributed nature of DOS makes it vulnerable to selfish users: by deviating from the protocol and using more transmission opportunities, a selfish user can gain a greater share of wireless resources at the expense of well-behaved users. In this paper, we address the problem of selfishness in DOS from a game theoretic standpoint. We propose an algorithm that satisfies the following properties: (i) when all stations implement the algorithm, the wireless network is driven to the optimal point of operation, and (ii) one or more selfish stations cannot obtain any gain by deviating from the algorithm. The key idea of the algorithm is to react to a selfish station by using a more aggressive configuration that (indirectly) punishes this station. We build on multivariable control theory to design a mechanism for punishment that is sufficiently severe to prevent selfish behavior yet not so severe as to render the system unstable. We conduct a game theoretic analysis based on repeated games to show the algorithm's effectiveness against selfish stations. These results are confirmed by extensive simulations.

Index Terms: Contention-based channel access, distributed opportunistic scheduling, game theory, multivariable control theory, repeated games, selfish stations, wireless networks.

I. INTRODUCTION

Opportunistic scheduling techniques have been shown to significantly improve performance in wireless networks. These techniques take advantage of the fluctuations in the channel conditions of different wireless stations over time; by selecting the station with the best instantaneous channel for data transmission, opportunistic scheduling can utilize wireless resources more efficiently. A key assumption of most opportunistic scheduling techniques [1], [2] is that the scheduler is centralized and has knowledge of the instantaneous channel conditions of all stations.

Distributed Opportunistic Scheduling (DOS) techniques [3]-[6] have been proposed only recently. In contrast to centralized schemes, with DOS each station has to make scheduling decisions without knowing the channel conditions of the other stations. Stations contend for the channel using random access with a given access probability. After successful contention, a station measures the channel and, if the channel conditions are poor (i.e., the instantaneous transmission rate is below a given threshold), the station gives up the transmission opportunity. This allows all stations to contend for the channel again, letting a station with better conditions win the contention, which increases the overall throughput. DOS techniques thus exploit both multi-user diversity across stations and time diversity across slots.

The lack of global channel information makes DOS systems very vulnerable to selfish users. By deviating from the above protocol and using a more aggressive configuration, a selfish user can easily gain a greater share of wireless resources at the expense of the other, well-behaved users. In this paper, we address the problem of selfishness in DOS from a game theoretic standpoint. In our formulation of the problem, the players are wireless stations that implement DOS and strive to obtain as great a share of resources as possible from the wireless network. We show that, in the absence of penalties, the wireless network naturally tends to either great unfairness or network collapse.
Building on this result, we design a penalty mechanism in which any player who misbehaves will be punished by other players in such a way that there is no incentive to misbehave. A key challenge when designing such a penalty scheme is to carefully adjust the punishment inflicted on a misbehaving station. If the punishment is too light, a selfish station may still benefit from misbehaving. If it is too excessive, however, the punishment itself could be interpreted as misbehavior and trigger punishment from other stations, leading to an endless spiral of increasing punishments and ultimately throughput collapse. We address this challenge through a combination of game theory and control theory.

The most relevant prior work on DOS by Zheng et al. [3] lays the basic foundations of distributed opportunistic scheduling. The authors propose a mechanism based on optimal stopping theory and analyze its performance with well-behaved as well as selfish users. The aim of the algorithm is to maximize the total throughput of the network. The works in [4]-[6] extend the basic mechanism of [3] by analyzing the case of imperfect channel information [4], improving channel estimation through two-level channel probing [5], and incorporating delay constraints [6]. While our algorithm deals with the basic DOS mechanism of [3], it can be extended to incorporate the enhancements of [4]-[6].

The key contributions of our work are as follows:

1) We perform a joint optimization of both the transmission rate thresholds and the access probabilities, whereas [3] only optimizes the thresholds.

2) We provide a proportionally fair allocation that achieves a good tradeoff between total throughput and fairness. In contrast, [3] maximizes the total throughput of the network and, as a result, it risks starving stations with poor channel conditions.

3) We propose a simple algorithm based on control theory that guarantees stability and quick convergence to the optimal point of operation, in contrast to the comparatively complex heuristics of [3].

4) Our game theoretic analysis considers that users can selfishly configure both their access probability and transmission rate threshold, whereas the analysis of [3] assumes that selfish users only have control over the thresholds.

5) We use a penalty mechanism to force an optimal Nash equilibrium, whereas [3] introduces a pricing mechanism which may not be practical in many scenarios; furthermore, the performance of the pricing mechanism relies heavily on the cost parameter and it is suboptimal even for the best parameter setting.

Some of the concepts and tools used in this paper build on our previous works of [7] and [8]. In [7] we proposed an algorithm based on control theory to optimally adjust the configuration of DOS. In contrast to [7], in this paper our aim is to prevent selfish users from obtaining any benefit by deviating from the optimal configuration, which is a much more difficult problem. In [8] we designed an algorithm based on multivariable control theory to adjust the contention parameters of a WLAN. In this paper, we obtain the same linearized system as [8], and hence we use the corresponding part of the analysis from that paper; however, the purpose, algorithm and most of the analysis of this paper differ substantially from [8].

The remainder of the paper is organized as follows. In Section II, we present an analysis of our system and derive the optimal configuration of access probabilities and transmission rate thresholds. In Section III, we show that, in the absence of penalties, the wireless network tends to a highly undesirable resource allocation.
We then propose an algorithm called Distributed Opportunistic scheduling with distributed Control (DOC) that avoids this by implementing a decentralized penalty mechanism to control selfish users. Section IV shows, by means of control theory, that when all the stations implement DOC, the system is stable and converges to the optimal point of operation derived in Section II. In Section V, we conduct a game theoretic analysis of DOC to show that stations cannot obtain any gain by behaving selfishly. The performance of the proposed scheme is extensively evaluated through simulations in Section VI. Finally, Section VII provides some concluding remarks.

II. ANALYSIS AND OPTIMAL CONFIGURATION

In the following, we present our system model and analyze the throughput as a function of the access probabilities and transmission rate thresholds. We then compute the optimal configuration of these parameters for a proportionally fair throughput allocation, which is well known to provide a good tradeoff between total throughput and fairness.¹

[Fig. 1. Example of channel contention (idle, successful contention, collision, data transmission).]

A. System model

Our system model follows that of [3]-[6]. We consider a single-hop wireless network with $N$ stations, where station $i$ contends for the channel with an access probability $p_i$. We assume a collision model for the channel access, where a station contends successfully for the channel if no other station contends at the same time. Let $\tau$ denote the duration of a mini slot for channel contention, which can either be empty, contain a successful contention, or a collision.

As in [3]-[6], we assume that a station obtains its local channel conditions after a successful contention. Let $R_i(\theta)$ denote the corresponding transmission rate at time $\theta$. If $R_i(\theta)$ is small (indicating a poor channel), station $i$ gives up this transmission opportunity and lets all the stations contend for the channel again. Otherwise, it transmits for a duration of $T$. Fig. 1 depicts an example of such channel contention.
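The collision model above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function name `slot_outcome_probabilities` and the access probabilities in the example are hypothetical and do not come from the paper. For one mini slot it computes the probability that the slot is idle, that it carries a successful contention of each station (that station contends and no other does), and that it carries a collision.

```python
from math import prod


def slot_outcome_probabilities(p):
    """Return (idle, per-station success, collision) probabilities for one mini slot.

    p[i] is the access probability of station i. A station succeeds only if it
    contends while every other station stays silent (collision model).
    """
    n = len(p)
    idle = prod(1.0 - q for q in p)
    success = [
        p[i] * prod(1.0 - p[j] for j in range(n) if j != i)
        for i in range(n)
    ]
    collision = 1.0 - idle - sum(success)
    return idle, success, collision


# Hypothetical access probabilities for three stations.
idle, success, collision = slot_outcome_probabilities([0.1, 0.2, 0.3])
```

The three outcomes are exhaustive, so the returned probabilities always sum to one; the per-station success values are the quantities the throughput analysis builds on.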
Our model, like that of [3]-[6], assumes that $R_i(\theta)$ remains constant for the duration of a data transmission and that different observations of $R_i(\theta)$ are independent.² From [3], we have that the optimal transmission policy is a threshold policy: for a given threshold $\bar{R}_i$, station $i$ only transmits after a successful contention if $R_i(\theta) \geq \bar{R}_i$.

B. Throughput analysis

The throughput $r_i$ achieved by station $i$ is a function of the parameters $\mathbf{p}$ and $\bar{\mathbf{R}}$. Let $l_i$ be the average number of bits that station $i$ transmits following a successful contention and $T_i$ be the average time it holds the channel (including the time spent in contention). Then, the throughput of station $i$ is

$$r_i = \frac{p_{s,i}\, l_i}{\sum_j p_{s,j} T_j + (1 - p_s)\tau} \tag{1}$$

where $p_{s,i}$ is the probability that a mini slot contains a successful contention of station $i$,

$$p_{s,i} = p_i \prod_{j \neq i} (1 - p_j) \tag{2}$$

and $p_s$ is the probability that the mini slot contains a successful contention of any station in the system,

$$p_s = \sum_i p_{s,i}. \tag{3}$$

Both $l_i$ and $T_i$ depend on $\bar{R}_i$. When a station contends successfully, it holds the channel for a time $T + \tau$ if it transmits data and $\tau$ if it gives up the transmission opportunity. Thus, $T_i$ can be computed as

$$T_i = \mathrm{Prob}(R_i(\theta) < \bar{R}_i)\,\tau + \mathrm{Prob}(R_i(\theta) \geq \bar{R}_i)\,(T + \tau). \tag{4}$$

When the station uses the transmission opportunity, it transmits a number of bits given by $R_i(\theta) T$, which yields

$$l_i = \int_{\bar{R}_i}^{\infty} r\, T f_{R_i}(r)\, dr \tag{5}$$

where $f_{R_i}(r)$ is the pdf of $R_i(\theta)$.

Based on the above, we can compute $r_i$ from $\mathbf{p} = \{p_1, \ldots, p_N\}$ and $\bar{\mathbf{R}} = \{\bar{R}_1, \ldots, \bar{R}_N\}$. In the following, we obtain the optimal configuration of these parameters to achieve proportional fairness.

C. Optimal p configuration

The problem of determining the configuration that provides proportional fairness can be formulated as the unconstrained optimization problem of finding the $\mathbf{p}$ and $\bar{\mathbf{R}}$ configuration that maximizes $\sum_i \log(r_i)$. We start by computing the optimal configuration of $\mathbf{p}$. Let us define $w_i$ as

$$w_i = \frac{p_{s,i}}{p_{s,1}} \tag{6}$$

where we take station 1 as reference. From the above equation we have that $p_{s,i} = w_i p_s / \sum_j w_j$. Substituting this into (1) yields

$$r_i = \frac{w_i\, p_s\, l_i}{\sum_j w_j p_s T_j + \sum_j w_j (1 - p_s)\tau}. \tag{7}$$

Following the results of [12], we approximate the optimal success probability $p_s$ by $P = (1 - 1/N)^{N-1}$, which is more accurate than the well-known approximation of $1/e$ for slotted random access.³ The numerical results provided in Section VI-A confirm the accuracy of the approximation. Substituting $p_s$ by the constant value $P$ in (7) gives

$$r_i = \frac{w_i\, l_i}{\sum_j w_j T_j + \sum_j w_j \left(\frac{1}{P} - 1\right)\tau}. \tag{8}$$

The problem of determining the optimal $\mathbf{p}$ configuration is equivalent to finding the $w_i$ values that maximize $\sum_i \log(r_i)$, for $r_i$ defined in (8). To obtain these $w_i$ values, we impose

$$\frac{\partial \sum_j \log(r_j)}{\partial w_i} = 0 \tag{9}$$

which yields

$$\frac{1}{w_i} - N\, \frac{T_i + \left(\frac{1}{P} - 1\right)\tau}{\sum_j w_j T_j + \sum_j w_j \left(\frac{1}{P} - 1\right)\tau} = 0. \tag{10}$$

¹ The notion of proportional fairness was originally proposed by F. Kelly [9] and has been later applied to opportunistic scheduling []. According to [9], an allocation $\{r_1, \ldots, r_N\}$ is proportionally fair if (i) it is feasible, and (ii) for any other feasible allocation, the aggregate of proportional changes is zero or negative. [9] shows that the proportionally fair allocation maximizes $\sum_i \log(r_i)$.

² The assumption that $R_i(\theta)$ remains constant during a data transmission is a standard assumption for the block-fading channel in wireless communications [10], [11]. The assumption that different observations are independent is justified in [3] through numerical calculations which show that in many practical scenarios the channel correlation between two adjacent successful contentions of a station is very small with a very high probability.

³ [13] shows that the approximation of $1/e$ holds both for symmetric and asymmetric access probabilities, and it further shows that this approximation is accurate as long as the number of stations is sufficiently large and the access probabilities are sufficiently small.
Combining this expression for $w_i$ and $w_j$, we obtain

$$\frac{w_i}{w_j} = \frac{T_j + \left(\frac{1}{P} - 1\right)\tau}{T_i + \left(\frac{1}{P} - 1\right)\tau}. \tag{11}$$

From the above, the values of $\mathbf{p}$ that solve the optimization problem are those that satisfy both $p_s = P$ and (11). These values can be obtained by solving the following system of equations:

$$\sum_i p_i \prod_{j \neq i} (1 - p_j) = P \tag{12}$$

$$\frac{p_i \prod_{j \neq i} (1 - p_j)}{p_1 \prod_{j \neq 1} (1 - p_j)} = \frac{T_1 + \tau\left(\frac{1}{P} - 1\right)}{T_i + \tau\left(\frac{1}{P} - 1\right)}, \quad i = 2, \ldots, N. \tag{13}$$

As $P$ is only an approximation to the optimal $p_s$, the above system of equations has in fact two solutions. This can be seen as follows. From (13) we can express $\{p_i\}_{i=2,\ldots,N}$ as a function of $p_1$. With this, (12) becomes an equation with only one unknown ($p_1$). The left-hand side of this equation increases from 0 (for $p_1 = 0$) to a maximum value that is greater than $P$ and then decreases to 0 (for $p_1 = 1$). Hence, there are two distinct values of $p_1$ that solve (12). Taking these two values of $p_1$ and computing the corresponding values of $\{p_i\}_{i=2,\ldots,N}$ in each case, we obtain the two solutions of the system of equations. For one of the solutions, all of the access probabilities are larger than the corresponding ones from the other solution; we select the solution with the larger access probabilities. As an exception to this, when all access probabilities are equal, the optimal $p_s$ is exactly $P$ and the system has only one solution; in this case, we select this unique solution. We denote the selected solution by $\mathbf{p}^* = \{p_1^*, \ldots, p_N^*\}$, and refer to these probabilities as the optimal access probabilities.

To determine $\mathbf{p}^*$ above, the $T_i$ values have to be computed for all stations. These depend on the optimal configuration of the thresholds $\bar{\mathbf{R}}$. In the following section, we compute the optimal $\bar{\mathbf{R}}$, which we denote by $\bar{\mathbf{R}}^* = \{\bar{R}_1^*, \ldots, \bar{R}_N^*\}$.

D. Optimal R configuration

In order to obtain the optimal configuration of $\bar{\mathbf{R}}$, we need to find the transmission threshold of each station that, given the $\mathbf{p}^*$ computed above, optimizes the overall performance in terms of proportional fairness. This is given by the following theorem.

Theorem 1: Consider a station $k$ that is alone in the network and contends for the channel with $p_k = P$.
Let $\tilde{R}_k$ be the transmission rate threshold that optimizes the throughput of this station under the assumption that different channel observations are independent. Then, $\bar{R}_k^* = \tilde{R}_k$.

Proof: The proof is by contradiction. Assume there exists a configuration $\bar{\mathbf{R}}$ with $\bar{R}_k \neq \tilde{R}_k$ for some station $k$ that provides proportional fairness. Let $\tilde{l}_k$ and $\tilde{T}_k$ be the values of $l_k$ and $T_k$ for the threshold $\tilde{R}_k$, and $l_k$ and $T_k$ the corresponding values for $\bar{R}_k$. Since $\tilde{R}_k$ maximizes $r_k$ when station $k$ is alone:

$$\frac{\tilde{l}_k}{\tilde{T}_k + \left(\frac{1}{P} - 1\right)\tau} > \frac{l_k}{T_k + \left(\frac{1}{P} - 1\right)\tau}. \tag{14}$$

Consider a network with $N$ stations that use configuration $\bar{\mathbf{R}}$. Given $\bar{\mathbf{R}}$, the $\mathbf{p}$ that maximizes $\sum_i \log(r_i)$ is given by (12) and (13). This results in the following throughput for station $k$:

$$r_k = \frac{p_{s,k}\, l_k}{\sum_j p_{s,j}\left(T_j + \left(\frac{1}{P} - 1\right)\tau\right)} = \frac{l_k}{N\left(T_k + \left(\frac{1}{P} - 1\right)\tau\right)} \tag{15}$$

and for the other stations:

$$r_i = \frac{l_i}{N\left(T_i + \left(\frac{1}{P} - 1\right)\tau\right)} \quad \forall i \neq k. \tag{16}$$

Let us now consider the alternative configuration that uses $\tilde{R}_k$ (the single-station optimal threshold of Theorem 1) for station $k$ and $\bar{R}_i$ for the other stations. If we take the $p_k$ and $p_i$ configuration that satisfies (12) and (13) with this alternative configuration, we obtain the following throughput for station $k$:

$$\tilde{r}_k = \frac{\tilde{l}_k}{N\left(\tilde{T}_k + \left(\frac{1}{P} - 1\right)\tau\right)} > r_k \tag{17}$$

and for the other stations:

$$\tilde{r}_i = \frac{l_i}{N\left(T_i + \left(\frac{1}{P} - 1\right)\tau\right)} \quad \forall i \neq k. \tag{18}$$

With the above, we have found an alternative configuration that provides a higher throughput to station $k$ and the same throughput to all other stations. This alternative configuration thus increases $\sum_i \log(r_i)$, which contradicts the initial assumption that the configuration $\bar{\mathbf{R}}$ provides proportional fairness.

Following the above theorem, the optimal configuration of the thresholds $\bar{\mathbf{R}}^*$ can be computed using optimal stopping theory. This is done in [3], which finds that the optimal threshold $\bar{R}_i^*$ can be obtained by solving the following fixed point equation:

$$E\left[\left(R_i(\theta) - \bar{R}_i^*\right)^+\right] = \frac{\bar{R}_i^*\, \tau}{T P}. \tag{19}$$

The above concludes the search for the optimal configuration. The key advantage of this configuration is that it allows each station to compute its $\bar{R}_i^*$ based on local information only, and thus decouples the computation of $\bar{\mathbf{R}}^*$ from that of $\mathbf{p}^*$. We use this finding to design a distributed mechanism for computing the optimal configuration, where each station uses a fixed $\bar{R}_i = \bar{R}_i^*$ obtained locally, together with an adaptive algorithm to determine the optimal $p_i$.

III. DOC ALGORITHM

In this section we propose an adaptive algorithm that satisfies the following properties: (i) when all stations implement the algorithm, it leads to the optimal configuration computed above, and (ii) a selfish station cannot obtain any gain by deviating from the algorithm. We first motivate our algorithm by showing that, in the absence of punishments, the system will naturally tend to a highly undesirable point of operation.
We then present our algorithm, which uses punishments to drive the system to the optimal point of operation derived in the previous section.

A. Motivation

If no constraints are imposed on the wireless network and stations are allowed to configure their $\{p_i, \bar{R}_i\}$ parameters to selfishly maximize their own benefit, the network will not naturally tend to the optimum configuration derived above. To show this, we model the wireless system as a static game in which each station can choose its configuration without suffering any penalty. The following theorem characterizes the Nash equilibria of this game.

Theorem 2: In the absence of penalties, there is at least one station that plays $p_i = 1$ in any Nash equilibrium.

Proof: The proof is by contradiction. Let us assume that there is a Nash equilibrium such that $p_j \neq 1\ \forall j$. If we consider one player $i$ and take the partial derivative of its throughput $r_i$, we obtain

$$\frac{\partial r_i}{\partial p_i} = \frac{\prod_{j \neq i}(1 - p_j)\, l_i\, \hat{T}_{-i}}{\left(p_i \hat{T}_i + (1 - p_i)\hat{T}_{-i}\right)^2} > 0 \tag{20}$$

where $\hat{T}_i$ is the average duration during which the channel is occupied when station $i$ transmits and $\hat{T}_{-i}$ is the average duration of a transmission or an empty mini slot when station $i$ does not transmit. From the above, it can be seen that the throughput $r_i$ is a strictly increasing function of $p_i$. It follows from this that $\{p_i, \bar{R}_i\}$, with $p_i \neq 1$, is not the best strategy for player $i$ given the configuration of the other stations, since station $i$ could obtain a higher throughput by increasing $p_i$ to 1 and using the same $\bar{R}_i$. The configuration $\{p_i, \bar{R}_i\}$, with $p_i \neq 1$, is therefore not a Nash equilibrium, which contradicts our initial assumption.

All of the above Nash equilibria are highly undesirable. If station $i$ is the only one that plays $p_i = 1$, then player $i$ achieves non-zero throughput while all other players have zero throughput. Conversely, if some other station $j$ also plays $p_j = 1$, the result is a network collapse with all players obtaining zero throughput. We conclude from the above that, in the absence of punishments, selfish behavior will severely degrade the performance of the wireless system.
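To illustrate Theorem 2 numerically, the sketch below evaluates the throughput expression of Section II-B (per-station success probability divided by the average slot duration) while one station raises its access probability. This is only an illustration under assumptions: the function name `throughputs` and all numeric values (bits per success, hold times, mini-slot duration) are hypothetical and are not taken from the paper.

```python
from math import prod


def throughputs(p, l, T_hold, tau):
    """Throughput of every station, following the formula of Section II-B.

    p[i]      access probability of station i
    l[i]      average bits sent by station i after a successful contention
    T_hold[i] average time station i holds the channel per success
    tau       mini-slot duration
    """
    n = len(p)
    p_succ = [p[i] * prod(1.0 - p[j] for j in range(n) if j != i) for i in range(n)]
    p_s = sum(p_succ)
    avg_slot = sum(ps * t for ps, t in zip(p_succ, T_hold)) + (1.0 - p_s) * tau
    return [ps * li / avg_slot for ps, li in zip(p_succ, l)]


# Hypothetical parameters: three identical stations.
l = [1000.0, 1000.0, 1000.0]
T_hold = [1.0, 1.0, 1.0]
tau = 0.1

# Station 0 becomes increasingly aggressive; the others keep p = 0.2.
sweep = [throughputs([p0, 0.2, 0.2], l, T_hold, tau) for p0 in (0.2, 0.5, 0.8, 1.0)]
own_rates = [r[0] for r in sweep]

# Two stations playing p = 1 collide in every slot.
two_greedy = throughputs([1.0, 1.0, 0.2], l, T_hold, tau)
```

With these values, `own_rates` grows with the access probability of station 0, the other stations drop to zero throughput once station 0 plays 1, and `two_greedy` is zero for every station, matching the two equilibrium outcomes described above.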
In the following, we propose an algorithm that addresses this problem by implementing a distributed punishment mechanism.

B. Rationale behind the algorithm

Before presenting the algorithm, we first discuss the rationale behind its design. This rationale relies heavily on the notion of channel time that a station obtains over a certain interval $\Theta$, defined as

$$t_i(\Theta) = \sum_{k=1}^{n_i(\Theta)} \left(T_i^k(\Theta) + \left(\frac{1}{P} - 1\right)\tau\right) \tag{21}$$

where $n_i(\Theta)$ is the number of successful contentions of station $i$ in that period and $T_i^k(\Theta)$ is the duration of the $k$th successful contention of the station in the interval. The above definition comprises the aggregate transmission time of the station plus a fixed overhead of $(1/P - 1)\tau$ that is added every time the station accesses the channel.
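As a concrete reading of the channel time definition, the following sketch sums the hold durations that a station observed over one interval and adds the fixed per-access overhead $(1/P - 1)\tau$, with $P = (1 - 1/N)^{N-1}$ as in Section II-C. The function names and the sample values are hypothetical, chosen only for illustration.

```python
def approx_success_probability(n_stations):
    """Approximation P = (1 - 1/N)^(N - 1) of the optimal success probability."""
    return (1.0 - 1.0 / n_stations) ** (n_stations - 1)


def channel_time(hold_durations, n_stations, tau):
    """Channel time of one station over an interval.

    hold_durations lists the duration of each successful contention of the
    station in the interval; every access is charged a fixed overhead of
    (1/P - 1) * tau on top of the time the channel was actually held.
    """
    overhead = (1.0 / approx_success_probability(n_stations) - 1.0) * tau
    return sum(duration + overhead for duration in hold_durations)


# Hypothetical interval: two transmissions (10 time units each) and one
# abandoned opportunity (1 time unit), in a network of 4 stations.
t_i = channel_time([10.0, 1.0, 10.0], n_stations=4, tau=1.0)
```

A station that wins no contention in the interval gets a channel time of zero, and each additional access adds the same overhead regardless of whether data was transmitted.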

An important observation that drives the design of our algorithm is that, with the configuration of Section II, all stations receive the same channel time on average, i.e., $t_i = t_j\ \forall i, j$ (where $t_i = E[t_i(\Theta)]$). This can be seen as follows. From (21) we have

$$\frac{t_i}{t_j} = \frac{E[n_i(\Theta)]\left(E[T_i^k(\Theta)] + \left(\frac{1}{P} - 1\right)\tau\right)}{E[n_j(\Theta)]\left(E[T_j^k(\Theta)] + \left(\frac{1}{P} - 1\right)\tau\right)} = \frac{p_{s,i}\left(T_i + \left(\frac{1}{P} - 1\right)\tau\right)}{p_{s,j}\left(T_j + \left(\frac{1}{P} - 1\right)\tau\right)}. \tag{22}$$

Furthermore, from (13) we have $p_{s,i}(T_i + (1/P - 1)\tau) = p_{s,j}(T_j + (1/P - 1)\tau)$ and thus $t_i = t_j$.

When all stations use the optimal configuration, the overhead in the definition of channel time, $(1/P - 1)\tau$, coincides with the average time between two successes. As a result, for an interval $\Theta$ of duration $T_{interval}$ it holds that $E[\sum_i t_i(\Theta)] = T_{interval}$. From this and $t_i = t_j$, we have that with the optimal configuration, all stations receive an average channel time of

$$E[t_i(\Theta)] = T_{interval}/N. \tag{23}$$

We define this average channel time as the optimal channel time and denote it by $t^*$ (i.e., $t^* = T_{interval}/N$).

The last observation upon which our algorithm relies is that as long as a selfish station does not receive more channel time than $t^*$, it cannot increase its throughput. The throughput of a station with a given channel time and $\bar{R}_i$ is equal to the throughput it would obtain if it were alone in the channel during this time with $p_i = P$ and the same $\bar{R}_i$. From Theorem 1, we have that this throughput is maximized for the optimal transmission rate threshold $\bar{R}_i^*$. Therefore, as long as the station does not receive extra channel time, it will not be able to achieve a higher throughput.

Given these observations, we base our algorithm on the following principles: (i) if a given station $i$ detects that another station $k$ is receiving more channel time than itself, it considers station $k$ to be selfish and indirectly punishes it by using a more aggressive configuration, and (ii) when punishing station $k$, the punishment needs to be severe enough to keep station $k$'s channel time below $t^*$ so that station $k$ does not benefit from misbehaving.

C. Algorithm design

The objective of DOC is to drive the system to the optimal configuration $\{\mathbf{p}^*, \bar{\mathbf{R}}^*\}$ obtained in Section II.
As discussed in Section II-D, each station can locally compute its optimal configuration of $\bar{R}_i$ independently of the configuration of the other stations. Therefore, with DOC each station maintains a fixed $\bar{R}_i$ (equal to the optimal value) and implements an adaptive algorithm to configure its access probability $p_i$. Time is divided into intervals of fixed length $T_{interval}$, and each station updates its access probability $p_i$ at the beginning of every interval. We use the discrete variable $\Theta$ to refer to the different intervals, and $p_i(\Theta)$ to denote the value of $p_i$ in a given interval $\Theta$. The central idea behind DOC is that when a misbehaving station is detected, the other stations increase their access probabilities in subsequent intervals to prevent the selfish station from benefiting from its misbehavior.

[Fig. 2. DOC control system.]

A key challenge in DOC is to determine the appropriate reaction against a selfish station. If the reaction is not severe enough, a selfish station may benefit from misbehaving. However, if the reaction is too severe, the system may become unstable by entering an endless loop where all stations indefinitely increase their $p_i$ to punish each other. Control theory is a particularly suitable tool to address this challenge, as it helps guarantee the convergence and stability of adaptive algorithms.

We use techniques from multi-variable control theory [14] for the design of the DOC algorithm. The algorithm is based on the classic system illustrated in Fig. 2, where each station runs an independent controller to compute its configuration. The controller that we have chosen for this paper is a proportional-integral (PI) controller, a well-known controller from classic control theory. As shown in the figure, the PI controller of station $i$ takes as input the error signal measured over an interval $\Theta$, $E_i(\Theta)$, and provides as output the control signal $P_i(\Theta)$ for the next interval. The error signal indicates how far the system is from the desired point of operation.
If the system is operating as desired, the error signals of all stations are zero; otherwise, the error signals are non-zero and the state of the system needs to change from its current point of operation to the desired one. To do this, the PI controller adjusts the control signal $P_i(\Theta)$, increasing it if $E_i(\Theta) > 0$ and decreasing it otherwise. In the following, we address the design of $P_i(\Theta)$ and $E_i(\Theta)$.

D. Control signal P

The DOC algorithm needs to adjust the access probability $p_i(\Theta)$ based on the control signal. To do this, there needs to be a one-to-one mapping between the control signal $P_i(\Theta)$ given by the controller and $p_i(\Theta)$. In addition, we design the system such that the $P_i(\Theta)$ values are the same for all stations at the optimal point of operation. This latter requirement is necessary to derive the conditions for stability in Section IV. Based on the above requirements, we design $P_i(\Theta)$ as

$$P_i(\Theta) = \frac{p_i(\Theta)}{1 - p_i(\Theta)} \left(T_i + \left(\frac{1}{P} - 1\right)\tau\right). \tag{24}$$

A station can therefore compute its $p_i(\Theta)$ from the control signal $P_i(\Theta)$ as

$$p_i(\Theta) = \frac{P_i(\Theta)}{T_i + \left(\frac{1}{P} - 1\right)\tau + P_i(\Theta)}. \tag{25}$$
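The one-to-one mapping between the control signal and the access probability can be checked numerically. The sketch below is a minimal illustration, assuming hypothetical function names and parameter values; it implements the two formulas above and confirms that applying one after the other returns the starting access probability.

```python
def control_from_probability(p, T_hold, P_succ, tau):
    """Control signal for access probability p (0 <= p < 1).

    T_hold is the station's average channel hold time per success, P_succ the
    approximated optimal success probability P, tau the mini-slot duration.
    """
    return p / (1.0 - p) * (T_hold + (1.0 / P_succ - 1.0) * tau)


def probability_from_control(P_ctrl, T_hold, P_succ, tau):
    """Access probability that corresponds to a non-negative control signal."""
    scale = T_hold + (1.0 / P_succ - 1.0) * tau
    return P_ctrl / (scale + P_ctrl)


# Hypothetical values: N = 4 stations gives P = 0.75 ** 3.
P_succ = 0.75 ** 3
tau = 1.0

signal = control_from_probability(0.25, T_hold=5.0, P_succ=P_succ, tau=tau)
recovered = probability_from_control(signal, T_hold=5.0, P_succ=P_succ, tau=tau)

# The same control value yields a smaller access probability for a station
# that holds the channel longer per success.
p_short = probability_from_control(2.0, T_hold=2.0, P_succ=P_succ, tau=tau)
p_long = probability_from_control(2.0, T_hold=8.0, P_succ=P_succ, tau=tau)
```

Because the inverse mapping has the form x / (c + x) with c > 0, any positive control signal produced by the controller maps to a valid probability strictly between 0 and 1.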

E. Error signal E

The design of the error signal $E_i(\Theta)$ has the following two goals: (i) selfish stations should not be able to obtain extra channel time from the wireless network by using a configuration different from the optimal, and (ii) as long as there are no selfish stations, $\mathbf{p}(\Theta)$ should converge to the optimal $\mathbf{p}^*$.

For the design of the error signal, DOC relies (like [15], [16]) on the broadcast nature of the wireless medium, which enables stations to overhear the transmissions of the other stations. In particular, in every interval $\Theta$, each station measures (i) the channel time used by the other stations, $t_j(\Theta)$, and (ii) the average time (over the interval) that they hold the channel upon a successful contention, $T_j(\Theta) = \sum_{k=1}^{n_j(\Theta)} T_j^k(\Theta) / n_j(\Theta)$. Based on this, station $i$ computes the error signal at the end of the interval as

$$E_i(\Theta) = \sum_{j \neq i} \left(t_j(\Theta) - t_i(\Theta)\right) - F_i(\Theta) \tag{26}$$

where $F_i(\Theta)$ is a function that we design below.

The error signal $E_i(\Theta)$ consists of the following two components. The first component, $\sum_{j \neq i} (t_j(\Theta) - t_i(\Theta))$, punishes selfish stations. If a station receives less channel time than the other stations, this component will be positive and hence station $i$ will increase its access probability $p_i(\Theta)$. The second component, $F_i(\Theta)$, drives the system to the desired point of operation in the absence of selfish behavior (i.e., when all stations receive the same channel time).

We next address the design of the function $F_i(\Theta)$. In order to drive $\mathbf{p}(\Theta)$ to the desired $\mathbf{p}^*$ when all stations receive the same channel time, we need $F_i(\Theta) > 0$ for $p_i(\Theta) > p_i^*$, such that in this case $p_i(\Theta)$ decreases, and $F_i(\Theta) < 0$ for $p_i(\Theta) < p_i^*$.

The design of $F_i(\Theta)$ should also prevent selfish stations from obtaining more channel time than $t^*$. In the following, we derive the conditions that $F_i(\Theta)$ needs to meet in order to satisfy this requirement. To derive these conditions, we assume that the system is in steady state, which implies that selfish stations play with a static configuration. (In the analysis of Section V we show that DOC is also effective against selfish strategies that change the configuration over time.)
We first consider the case where one station $k$ is selfish and all others are well-behaved and run the DOC algorithm. Since the PI controller drives the error signal $E_i(\Theta)$ to 0 in steady state, the following holds for all well-behaved stations:

$$F_i(\Theta) = \sum_{j \neq i} \left( t_j(\Theta) - t_i(\Theta) \right). \tag{27}$$

Summing $F_i(\Theta)$ over all stations except the selfish one yields:

$$\sum_{i \neq k} F_i(\Theta) = (N-1)\, t_k(\Theta) - \sum_{i \neq k} t_i(\Theta) = N t_k(\Theta) - \sum_{i} t_i(\Theta). \tag{28}$$

If we combine the above with the requirement that the selfish station cannot gain, i.e., $t_k(\Theta) \leq t^*$, we obtain the following inequality,

$$\sum_{i \neq k} F_i(\Theta) \leq D(\Theta) \tag{29}$$

where $D(\Theta)$ is defined as the difference between the sum of channel times in optimal operation and the sum of channel times in the current interval, i.e., $D(\Theta) = N t^* - \sum_i t_i(\Theta)$. Note that, if the current access probabilities are not optimal, $\sum_i t_i(\Theta)$ will be smaller than $N t^*$. Hence, $D(\Theta)$ reflects the channel time lost due to non-optimal access probabilities.

The following upper bound on $F_i(\Theta)$ guarantees that (29) is satisfied, and thus ensures that a selfish station does not benefit from misbehaving:

$$F_i(\Theta) \leq \frac{D(\Theta)}{N-1}. \tag{30}$$

The intuition behind this upper bound is as follows. When a selfish station misbehaves, it receives more channel time than the well-behaved stations. This, however, moves the point of operation away from the optimal access probabilities, reducing the overall efficiency in terms of aggregate channel time. The above upper bound ensures that the additional channel time received by the selfish station does not outweigh the channel time it loses due to the overall loss of aggregate channel time. This guarantees that the selfish station does not receive more channel time and hence does not benefit from misbehaving.

We next consider the case of multiple selfish stations. In this case, the aggregate channel time received by the selfish stations must not exceed the aggregate channel time that they would receive in optimal operation, i.e., $\sum_{i=1}^{m} t_i(\Theta) \leq m t^*$ (where $\{1, \ldots, m\}$ is the set of selfish stations). Following similar reasoning to that above, we obtain the upper bound

$$F_i(\Theta) \leq \frac{m}{N-m} D(\Theta). \tag{31}$$

Given all the above requirements, we design $F_i(\Theta)$ as:

$$F_i(\Theta) = \begin{cases} \min\left( (N-1) D(\Theta),\ \dfrac{D(\Theta)}{N} \right), & p_i(\Theta) > p_i^{\min} \\[2ex] \min\left( (N-1) D(\Theta),\ \dfrac{-D(\Theta)}{N},\ (N-1)\Delta \right), & p_i(\Theta) \leq p_i^{\min} \end{cases} \tag{32}$$

where $\mathbf{p}^{\min} = \{p_1^{\min}, \ldots, p_N^{\min}\}$ are the access probabilities that minimize $D = E[D(\Theta)]$ subject to $t_i = t_j\ \forall i, j$, and $\Delta$ is the value that $D$ takes at this point,

$$\Delta = D \big|_{\mathbf{p} = \mathbf{p}^{\min}}. \tag{33}$$

In order to compute $\mathbf{p}^{\min}$ and $\Delta$, the $T_j$ of all stations are required. For these, we use the $T_j(\Theta)$ values measured over the current interval.

The above design satisfies all of our previous requirements:

The term $D(\Theta)/N$ ensures that (30) and (31) are satisfied when $D(\Theta) > 0$ and the term $(N-1) D(\Theta)$ ensures that they are satisfied when $D(\Theta) < 0$. This provides the required protection against (one or more) selfish stations.

As illustrated in Fig. 3, when all stations have the same expected channel time, the expected value of $F_i(\Theta)$ is
positive for $p_i(\Theta) > p_i^*$ and negative otherwise. This ensures that $\mathbf{p}(\Theta)$ is driven to the desired $\mathbf{p}^*$.

Fig. 3. $F_i$ as a function of $p_i(\Theta)$ when $t_i = t_j\ \forall i, j$.

The above design of the DOC algorithm is based on the assumption that the number of stations in the wireless network is fixed. In the following, we address the case of stations joining and leaving the network. With DOC, each station only keeps the state maintained by the PI controller, $\sum_{\Theta} \left( \sum_{j \neq i} (t_j(\Theta) - t_i(\Theta)) - F_i(\Theta) \right)$, which accounts for the deficit or surplus of the station's channel time over the other stations in the network. When a new station joins the wireless network, this station does not have a surplus or deficit, and therefore the other stations keep their state. The new station initializes the state of its PI controller such that its initial $p_i$ corresponds to the optimal $p_i^*$. When a station leaves, the remaining stations keep their state: this ensures that the deficit accumulated by a selfish station is not reset if it leaves and rejoins the network.

This concludes the design of the algorithm. In the following two sections, we analytically evaluate its performance when all stations are well-behaved (Section IV) and when some stations misbehave (Section V).

IV. DOC ANALYSIS

In this section we analyze the performance of DOC when all stations are well-behaved. As stations do not obtain any benefit from misbehaving, it is to be expected that they will all play DOC, and therefore this is the most meaningful scenario for the performance analysis of the system. We first analyze the wireless system under steady state conditions and show that it is driven to the desired point of operation obtained in Section II. We then conduct a transient analysis and derive sufficient conditions for stability.

A. Steady state analysis

Our analysis is based on the system model of Fig. 4.

Fig. 4. Control system.

In this model, $\mathbf{C}$ represents the function implemented by the controllers, which computes the control signals $P_i(\Theta)$, taking the error signals $E_i(\Theta)$ as input.
$\mathbf{H}$ represents the wireless system which provides the error signals $E_i(\Theta)$ based on the control signals $P_i(\Theta)$. In line with standard control theory [7], we model the randomness of the channel with the noise signals $W_i(\Theta)$ and let $E_i(\Theta)$ represent the expected value of the error signal for the given control signals $P_i(\Theta)$.

Since the controller includes an integrator, there is no steady state error [7] and the steady state solution can be obtained from

$$E_i(\Theta) = 0. \tag{34}$$

Using (26) and (32), $E_i(\Theta)$ can be computed from $\mathbf{p}(\Theta)$. This enables (34) to be expressed as a system of equations in $\mathbf{p}(\Theta)$. The following theorem guarantees that the solution of this system of equations is unique and shows that the unique stable point in steady state is the desired point of operation from Section II.

Theorem 3: The unique stable point of operation of the system in steady state is $\mathbf{p}(\Theta) = \mathbf{p}^*$.

Proof: Let us consider two stations $i$ and $j$. From (34) we have $E_i(\Theta) - E_j(\Theta) = 0$, which yields

$$N t_j(\Theta) + F_j(\Theta) - N t_i(\Theta) - F_i(\Theta) = 0. \tag{35}$$

Note that $t_j(\Theta) > t_i(\Theta)$ implies $F_j(\Theta) \geq F_i(\Theta)$, and vice versa. This can be seen as follows: If $p_j(\Theta) > p_j^{\min}$ and $p_i(\Theta) > p_i^{\min}$, then $F_j(\Theta) = F_i(\Theta)$. If $p_j(\Theta) \leq p_j^{\min}$ and $p_i(\Theta) \leq p_i^{\min}$, then also $F_j(\Theta) = F_i(\Theta)$. If $p_j(\Theta) > p_j^{\min}$ and $p_i(\Theta) \leq p_i^{\min}$, then $F_j(\Theta) \geq F_i(\Theta)$. When $t_j(\Theta) > t_i(\Theta)$, we are in one of these three cases, and hence $F_j(\Theta) \geq F_i(\Theta)$. Combining this with (35) yields $t_i(\Theta) = t_j(\Theta)\ \forall i, j$, since otherwise the left-hand side of (35) would be strictly positive. Substituting this into $E_i(\Theta) = 0$ yields $F_i(\Theta) = 0$. Given $t_i(\Theta) = t_j(\Theta)$, $F_i(\Theta)$ is an increasing function of $p_i(\Theta)$ that crosses 0 at $p_i(\Theta) = p_i^*$. Hence, the only $p_i(\Theta)$ that satisfies $F_i(\Theta) = 0$ is $p_i^*$. Since this holds for all $i$, the unique stable point of operation is $p_i(\Theta) = p_i^*$.

B. Stability analysis

We now conduct a stability analysis of DOC to configure the parameters of the PI controller.
According to the definition of a PI controller [7], station $i$ computes the value of $P_i$ at interval $\Theta'$ as a function of the error values measured by the station in the current and previous intervals based on the following equation:

$$P_i(\Theta') = K_p E_i(\Theta') + K_i \sum_{\Theta=0}^{\Theta'-1} E_i(\Theta) \tag{36}$$

where $K_p$ and $K_i$ are the parameters of the controller that we need to configure.

In order to analyze our system from a control theoretic standpoint, we need to characterize the transfer functions $\mathbf{C}$ and $\mathbf{H}$ in the system model of Fig. 4. The control and error signals in the figure are given by the following vectors in the z-domain [7]:

$$\mathbf{P}(z) = (P_1(z), \ldots, P_N(z))^T \tag{37}$$

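and

$$\mathbf{E}(z) = (E_1(z), \ldots, E_N(z))^T. \tag{38}$$

Our control system consists of one PI controller in each station that takes $E_i(z)$ as input and provides $P_i(z)$ as output. We can therefore express the relationship between $\mathbf{E}(z)$ and $\mathbf{P}(z)$ as follows

$$\mathbf{P}(z) = \mathbf{C} \cdot \mathbf{E}(z) \tag{39}$$

where

$$\mathbf{C} = \begin{pmatrix} C_{PI}(z) & 0 & \cdots & 0 \\ 0 & C_{PI}(z) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_{PI}(z) \end{pmatrix} \tag{40}$$

with $C_{PI}(z)$ being the z-transform of a PI controller [7],

$$C_{PI}(z) = K_p + \frac{K_i}{z-1}. \tag{41}$$

In order to characterize our wireless system with a transfer function $\mathbf{H}$ that takes $\mathbf{P}(z)$ as input and has $\mathbf{E}(z)$ as output, we proceed as follows. Equation (26) provides a nonlinear relationship between $\mathbf{E}(\Theta)$ and $\mathbf{P}(\Theta)$. To express this relationship as a transfer function, we linearize it at the optimal point of operation (footnote 4). We then study the linearized model and ensure its stability through appropriate choice of parameters. Note that the stability of the linearized model guarantees that our system is locally stable (footnote 5).

Footnote 4: This linearization provides a good approximation of the behavior of the system when it suffers small perturbations around the stable point of operation.

Footnote 5: A similar approach was used in [8] to analyze RED from a control theoretic standpoint.

We express the perturbations around the stable point of operation as follows:

$$\mathbf{P}(\Theta) = \mathbf{P}^* + \delta\mathbf{P}(\Theta) \tag{42}$$

where $\mathbf{P}^*$ is the stable point of operation as given by (24) with $\mathbf{p}(\Theta) = \mathbf{p}^*$. With the above, the perturbations of $\mathbf{E}$ can be approximated by

$$\delta\mathbf{E}(\Theta) = \mathbf{H} \cdot \delta\mathbf{P}(\Theta) \tag{43}$$

where

$$\mathbf{H} = \begin{pmatrix} \dfrac{\partial E_1(\Theta)}{\partial P_1(\Theta)} & \dfrac{\partial E_1(\Theta)}{\partial P_2(\Theta)} & \cdots & \dfrac{\partial E_1(\Theta)}{\partial P_N(\Theta)} \\ \dfrac{\partial E_2(\Theta)}{\partial P_1(\Theta)} & \dfrac{\partial E_2(\Theta)}{\partial P_2(\Theta)} & \cdots & \dfrac{\partial E_2(\Theta)}{\partial P_N(\Theta)} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial E_N(\Theta)}{\partial P_1(\Theta)} & \dfrac{\partial E_N(\Theta)}{\partial P_2(\Theta)} & \cdots & \dfrac{\partial E_N(\Theta)}{\partial P_N(\Theta)} \end{pmatrix}. \tag{44}$$

To compute these partial derivatives we proceed as follows. The error signal $E_i(\Theta)$ can be expressed as

$$E_i(\Theta) = T_{interval} \sum_{j \neq i} \left( \frac{p_{s,j}(\Theta) \left( T_j + \left( \frac{1}{P} - 1 \right) \tau \right)}{\sum_k p_{s,k}(\Theta) T_k + (1 - p_s(\Theta)) \tau} - \frac{p_{s,i}(\Theta) \left( T_i + \left( \frac{1}{P} - 1 \right) \tau \right)}{\sum_k p_{s,k}(\Theta) T_k + (1 - p_s(\Theta)) \tau} \right) - F_i(\Theta). \tag{45}$$

The above can be rewritten as a function of $\mathbf{P}(\Theta)$ given by

$$E_i(\Theta) = T_{interval} \frac{\sum_{j \neq i} \left( P_j(\Theta) - P_i(\Theta) \right)}{\sum_j P_j(\Theta) - \frac{p_s(\Theta)}{p_e(\Theta)} \left( \frac{1}{P} - 1 \right) \tau + \frac{1 - p_s(\Theta)}{p_e(\Theta)} \tau} - F_i(\Theta) \tag{46}$$

where $p_e(\Theta) = \prod_j (1 - p_j(\Theta))$.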
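The per-station controller described by (36) and (41) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class name is hypothetical, and it assumes that the integral term sums the errors of the previous intervals only, which is the time-domain form corresponding to $K_p + K_i/(z-1)$.

```python
class PIController:
    """Discrete PI controller: P(n) = initial + Kp*E(n) + Ki*sum_{m<n} E(m)."""

    def __init__(self, kp, ki, initial=0.0):
        self.kp = kp
        self.ki = ki
        self.initial = initial   # assumed starting value of the control signal
        self.integral = 0.0      # running sum of the errors of past intervals

    def update(self, error):
        """Return the control signal for the interval that just ended."""
        output = self.initial + self.kp * error + self.ki * self.integral
        # The current error only enters the integral from the next interval on.
        self.integral += error
        return output


# Illustrative gains and error sequence.
ctrl = PIController(kp=2.0, ki=0.5)
outputs = [ctrl.update(e) for e in (1.0, 1.0, 0.0)]
```

The integrator is what removes the steady state error: the output keeps changing for as long as the accumulated error is non-zero, so the loop can only settle where the error signal vanishes.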
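We start by showing that $\partial F_i(\Theta) / \partial P_i(\Theta) = 0$ at the stable point of operation. It follows from (32) that

$$\frac{\partial D(\Theta)}{\partial P_i(\Theta)} = 0 \implies \frac{\partial F_i(\Theta)}{\partial P_i(\Theta)} = 0. \tag{47}$$

$D(\Theta)$ can be expressed as

$$D(\Theta) = N t^* - T_{interval} \frac{\sum_i p_{s,i}(\Theta) T_i + p_s(\Theta) \left( \frac{1}{P} - 1 \right) \tau}{\sum_i p_{s,i}(\Theta) T_i + (1 - p_s(\Theta)) \tau}. \tag{48}$$

The partial derivative of $D(\Theta)$ can be computed as

$$\frac{\partial D(\Theta)}{\partial P_i(\Theta)} = \frac{\partial D(\Theta)}{\partial p_i(\Theta)} \frac{\partial p_i(\Theta)}{\partial P_i(\Theta)}. \tag{49}$$

Taking the partial derivative of (48) with respect to $p_i(\Theta)$ and evaluating it at the stable point of operation yields

$$\frac{\partial D(\Theta)}{\partial p_i(\Theta)} = -T_{interval} \left( \frac{\tau / P}{\sum_i p_{s,i}(\Theta) \left( T_i + \left( \frac{1}{P} - 1 \right) \tau \right)} \right) \frac{\partial p_s(\Theta)}{\partial p_i(\Theta)}. \tag{50}$$

Since $p_s(\Theta)$ has a maximum value at the stable point of operation, we have that $\partial p_s(\Theta) / \partial p_i(\Theta) = 0$, which yields $\partial D(\Theta) / \partial P_i(\Theta) = 0$ and hence

$$\frac{\partial F_i(\Theta)}{\partial P_i(\Theta)} = 0. \tag{51}$$

The partial derivative of $E_i(\Theta)$ evaluated at the stable point of operation can then be computed from (46) as

$$\left. \frac{\partial E_i(\Theta)}{\partial P_i(\Theta)} \right|_{\mathbf{P}(\Theta) = \mathbf{P}^*} = -(N-1) \frac{T_{interval}}{\sum_j P_j^*}. \tag{52}$$

Using similar reasoning, we can see that, for $j \neq i$,

$$\left. \frac{\partial E_i(\Theta)}{\partial P_j(\Theta)} \right|_{\mathbf{P}(\Theta) = \mathbf{P}^*} = \frac{T_{interval}}{\sum_j P_j^*}. \tag{53}$$

Substituting these expressions in matrix $\mathbf{H}$ yields

$$\mathbf{H} = K_H \begin{pmatrix} -(N-1) & 1 & \cdots & 1 \\ 1 & -(N-1) & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & -(N-1) \end{pmatrix} \tag{54}$$

where

$$K_H = \frac{T_{interval}}{\sum_j P_j^*}. \tag{55}$$

Thus, the linearized system is fully characterized by the matrices $\mathbf{C}$ and $\mathbf{H}$. The next step is to configure the $K_p$ and $K_i$ parameters. The following theorem provides sufficient conditions which $\{K_p, K_i\}$ must meet to ensure stability:

Theorem 4: The linearized system is guaranteed to be stable as long as $K_p$ and $K_i$ meet the following conditions:

$$K_i < K_p + \frac{1}{N K_H}, \qquad K_i > 2 K_p - \frac{2}{N K_H}. \tag{56}$$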
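A quick numerical sanity check of how the controller gains affect stability can be sketched in Python. This rests on assumptions that are not spelled out in this section: that each non-trivial mode of the loop formed by a PI controller $K_p + K_i/(z-1)$, the one-interval delay $z^{-1}$ of Fig. 4 and the eigenvalue $-N K_H$ of a matrix with diagonal $-(N-1)K_H$ and off-diagonal $K_H$ has the characteristic polynomial $z^2 + (N K_H K_p - 1) z + N K_H (K_i - K_p)$. The function names and all numeric values below are illustrative only.

```python
import cmath


def closed_loop_poles(kp, ki, n, kh):
    """Roots of z^2 + (a*kp - 1) z + a*(ki - kp), with a = n * kh.

    The polynomial is the assumed characteristic equation of one
    non-trivial mode of the linearized loop (see the lead-in above).
    """
    a = n * kh
    b = a * kp - 1.0
    c = a * (ki - kp)
    root = cmath.sqrt(b * b - 4.0 * c)
    return ((-b + root) / 2.0, (-b - root) / 2.0)


def is_stable(kp, ki, n, kh):
    """True if both poles lie strictly inside the unit circle."""
    return all(abs(z) < 1.0 for z in closed_loop_poles(kp, ki, n, kh))


# Illustrative numbers: n stations and a hypothetical plant gain kh.
n, kh = 10, 0.5
kp = 0.4 / (n * kh)        # gains in the style of Ziegler-Nichols PI tuning
ki = kp / (0.85 * 2.0)
moderate = is_stable(kp, ki, n, kh)
aggressive = is_stable(10.0 * kp, 10.0 * ki, n, kh)  # both gains scaled up
```

For these sample values the moderate gains place both poles inside the unit circle, whereas scaling both gains up by a factor of 10 pushes one pole well outside it. The sketch only covers the modes associated with the eigenvalue $-N K_H$; it says nothing about the mode in which all control signals move together.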

Proof: The reader is referred to [8] for the proof of the theorem. Since [8] uses the same linearized system as this paper, the proof follows very closely that of [8].

In addition to guaranteeing stability, our goal in the configuration of the $\{K_p, K_i\}$ parameters is to find the right balance between reaction time in transients and oscillations in steady state. To this end, we use the Ziegler-Nichols rules [7], which have been designed for this purpose. Following these rules (see [8] for a detailed description), we obtain the configuration:

$$K_p = \frac{0.4}{N K_H}, \qquad K_i = \left( \frac{1}{0.85 \cdot 2} \right) \frac{0.4}{N K_H}. \tag{57}$$

The stability of the resulting configuration is guaranteed by the following corollary:

Corollary 1: The $K_p$ and $K_i$ configuration given by (57) is stable.

Proof: The proof follows from the fact that the configuration of (57) meets the conditions of Theorem 4.

Note that the above control theoretic analysis guarantees that the system will always converge to the desired point of operation regardless of the initial state. This implies that the system remains stable in the presence of any kind of perturbation. Such perturbations include, among others, transient selfish behavior or stations joining and leaving the network.

V. GAME THEORETIC ANALYSIS

In the previous section we have shown that, when all stations implement the DOC algorithm, they all play with $p_i = p_i^*$ and $\bar{R}_i = \bar{R}_i^*$, which leads to the optimal throughput allocation $\mathbf{r}^*$ obtained in Section II (footnote 6). In this section we conduct a game theoretic analysis to show that one or more stations cannot obtain any gain by deviating from DOC. In what follows, we say that a station is honest or well-behaved when it implements the DOC algorithm to configure its $p_i$ and $\bar{R}_i$ parameters, while we say that it is selfish or misbehaving when it plays a strategy different from DOC to configure these parameters in order to obtain a greater share of wireless resources.

The game theoretic analysis conducted in this section assumes that users are rational and want to maximize their own benefit or utility, which is given by the throughput.
Furthermore, it is reasonable to assume that the game is noncooperative in that no binding agreements can be reached between the players as to their future play [6].

The model is based on the theory of repeated games [9]. In repeated games, time is divided into stages and a player can take new decisions at each stage based on the observed behavior of the other players in the previous stages. This matches our algorithm, where time is divided into intervals and stations update their configuration at each interval (footnote 7). Like previous analyses on repeated games [5], [6], we consider an infinitely repeated game, which is a common assumption when the players do not know when the game will end.

Footnote 6: Since the throughput allocation $\{r_1^*, \ldots, r_N^*\}$ maximizes $\sum_i \log(r_i)$, it is Pareto optimal. This follows from the fact that if there existed another feasible allocation that provided all stations with more throughput than $\mathbf{r}^*$, this allocation would yield a larger $\sum_i \log(r_i)$.

Footnote 7: Note that the game theoretic study conducted in Section III-A was based on static games instead of repeated ones. The reason is that we considered a system without penalties where a user does not react to the behavior of other users. Hence, we could model it as a static game where all players only make a single move at the beginning of the game.

A. Single selfish station

While the design of the DOC algorithm in Section III guarantees that a station cannot benefit from playing with a fixed selfish configuration, selfish stations might still benefit by varying their configuration over time. As an example, let us consider a naive algorithm that only takes into account the stations' behavior in the previous stage. While this algorithm may be effective against a fixed selfish configuration, it could easily be defeated by a selfish station that alternates between a selfish configuration ($p_k = 1$, $\bar{R}_k = 0$) and an honest one ($p_k = p_k^*$, $\bar{R}_k = \bar{R}_k^*$) at every other stage. Since this station would play selfish when all the others play honest, it would achieve a significantly higher throughput every other interval, thus benefiting from its misbehavior.
The above example shows that it is important to ensure that a selfish station cannot obtain any gain no matter how it varies its configuration over time. The following theorem confirms the effectiveness of DOC against any (fixed or variable) selfish strategy. The proof of the theorem relies on the integrator component of the PI controller, which keeps track of the aggregate channel time received by all stations and can thus be used to guarantee that this aggregate does not exceed a given amount.

Theorem 5: Let us consider a selfish station that uses a $p_k(\Theta)$ and $\bar{R}_k(\Theta)$ configuration that can vary over time. If all the other stations implement the DOC algorithm, the throughput received by this station will be no larger than $r_k^*$.

Proof: The PI controller computes $P_i$ at a given interval $\Theta'$ according to the following expression:

$$P_i(\Theta') = P_{initial} + K_p \left( \sum_{j \neq i} \left( t_j(\Theta') - t_i(\Theta') \right) - F_i(\Theta') \right) + K_i \sum_{\Theta=0}^{\Theta'-1} \left( \sum_{j \neq i} \left( t_j(\Theta) - t_i(\Theta) \right) - F_i(\Theta) \right). \tag{58}$$

With the above, $P_i(\Theta')$ stays between 0 and a given maximum value $P_{max}$. If at some point $P_i$ reaches a $P_{max}$ value such that $p_i = 1$, this will result in $t_j = 0$ for $j \neq i$ and $F_i > -(N-1) t_i$, which yields $E_i < 0$, and therefore $P_i$ will decrease. Similarly, if at any time $P_i$ reaches 0, then $t_i = 0$ and $F_i \leq 0$, which yields $E_i > 0$, and therefore $P_i$ will increase.

Considering that $0 \leq P_i(\Theta') \leq P_{max}$, the above equation can be expressed as

$$\sum_{\Theta=0}^{\Theta'} \left( \sum_{j \neq i} \left( t_j(\Theta) - t_i(\Theta) \right) - F_i(\Theta) \right) = B_i \tag{59}$$

where $B_i$ is a bounded value: $B_i = (P_{max} - P_{initial} + (K_i - K_p) E_i(\Theta')) / K_i$.

Let us consider the case in which there is a selfish station that changes its configuration over time and receives a channel
time $t_k(\Theta)$. Equation (59) can be written as

$$\sum_{\Theta} t_k(\Theta) = \sum_{\Theta} \left( (N-1)\, t_i(\Theta) - \sum_{j \neq i,k} t_j(\Theta) + F_i(\Theta) \right) + B_i. \tag{60}$$

Let us now consider a given interval $\Theta$. From (30), we have

$$F_j(\Theta) \leq \frac{1}{N-1} \left( N t^* - \sum_i t_i(\Theta) \right). \tag{61}$$

Summing the above expression over all $j \neq k$, we have $\sum_i t_i(\Theta) + \sum_{j \neq k} F_j(\Theta) \leq N t^*$. As this is satisfied for all $\Theta$,

$$\sum_{\Theta} \left( \sum_i t_i(\Theta) + \sum_{j \neq k} F_j(\Theta) \right) \leq \sum_{\Theta} N t^*. \tag{62}$$

Furthermore, by summing (60) over all $j \neq k$,

$$(N-1) \sum_{\Theta} t_k(\Theta) = \sum_{\Theta} \sum_{j \neq k} \left( t_j(\Theta) + F_j(\Theta) \right) + \sum_{j \neq k} B_j. \tag{63}$$

Combining the above two equations yields $N \sum_{\Theta} t_k(\Theta) \leq N \sum_{\Theta} t^* + \sum_{j \neq k} B_j$. If we consider a long period of time, the constant term $\sum_{j \neq k} B_j$ can be neglected, resulting in $\sum_{\Theta} t_k(\Theta) \leq \sum_{\Theta} t^*$. This means that the selfish station cannot receive more channel time using a selfish strategy than by playing DOC. Following the argument of Section III-B, this implies that it cannot obtain more throughput than it would by playing DOC, i.e., $r_k \leq r_k^*$, which proves the theorem.

The above theorem leads to Corollary 2.

Corollary 2: A state in which all stations play DOC (All-DOC) is a Nash equilibrium of the game.

Proof: According to Theorem 5, if all stations but one play DOC, the best response of this station is to play DOC as well since it cannot benefit from playing a different strategy. Thus, All-DOC is a Nash equilibrium.

The above shows that if all stations start playing with no previous history, none of them can benefit by deviating from DOC. In addition to this, in repeated games it is also important to ensure that, if at some point the game has a given history, a selfish station cannot exploit knowledge of this history by playing a different strategy from DOC. The following theorem confirms that All-DOC is a Nash equilibrium of any subgame (where a subgame is defined as the game resulting from starting to play with a certain history). Therefore, a selfish station cannot benefit by deviating from DOC for any previous history of the game.

Theorem 6: All-DOC is a subgame perfect Nash equilibrium of the game.

Proof: Since the proof of Theorem 5 is not dependent on past history and can therefore be applied to any subgame, All-DOC is a Nash equilibrium of any subgame.
Note that, even though the above analysis assumes a fixed number of stations in the wireless network, it also holds for the case when the number of stations changes over time, as long as these changes occur over sufficiently long periods such that the constant term $\sum_{j \neq k} B_j$ is not significant.

B. Multiple selfish stations

The above results show the effectiveness of DOC against a single selfish station. In the following, we focus on the case of multiple selfish stations. The following theorem shows that, by following a different strategy from DOC, multiple stations cannot gain any aggregate channel time.

Theorem 7: Let us consider a scenario with $m$ selfish stations. If all other stations play DOC, the selfish stations cannot gain any aggregate channel time.

Proof: Without loss of generality, let us consider that stations $i = \{1, \ldots, m\}$ are selfish. Applying a reasoning similar to Theorem 5 leads to $\sum_{i=1}^{m} \sum_{\Theta} t_i(\Theta) \leq m \sum_{\Theta} t^*$. As the left-hand side of this inequality is the aggregate channel time obtained by the selfish stations, and the right-hand side is the aggregate channel time that they would obtain if they played DOC, the theorem is proven.

According to the above theorem, it is possible for a selfish station to obtain some gain, but this will be at the expense of another selfish station that receives less channel time. Corollary 3 follows from this.

Corollary 3: Let us consider a scenario with $m$ selfish stations. If all other stations play DOC and a selfish station $k$ receives a throughput larger than $r_k^*$, then there exists another selfish station $l$ that receives a throughput smaller than $r_l^*$.

Proof: If there is some station $k \in \{1, \ldots, m\}$ for which $r_k > r_k^*$, this station must necessarily receive more channel time than it would if all stations played DOC. Since (according to Theorem 7) the selfish stations cannot gain any aggregate channel time, there must then exist some other station $l \in \{1, \ldots, m\}$ that receives less channel time. For this station, it holds that $r_l < r_l^*$, which proves the corollary.
Based on the above, we argue that DOC is effective against multiple selfish stations, since two or more selfish stations cannot simultaneously benefit and therefore do not have any incentive to play a coordinated strategy different from DOC.

VI. PERFORMANCE EVALUATION

In this section we evaluate DOC by means of simulation to show that (i) in the absence of selfish stations, DOC provides optimal performance while remaining stable and reacting quickly to changes, and (ii) selfish stations cannot benefit by following a strategy different from DOC. Unless otherwise stated, we assume that different observations of the channel conditions are independent. We also assume that the available transmission rate for a given SNR is given by the Shannon channel capacity: $R(h) = W \log_2(1 + \rho |h|^2)$ bits/s, where $W$ is the channel bandwidth, $\rho$ is the normalized average SNR and $h$ is the random gain of Rayleigh fading.

We implemented the DOC algorithm in OMNET++. In the simulations, we set $W = 10^7$, T/τ = 0 and the interval of the controller $T_{total} = 10^5 \tau$. For all results, 95% confidence intervals are below 0.5%.

A. Validation of the optimal configuration

In order to assess the accuracy of the analysis of Section II, we compare the performance of the configuration computed
in that section ("optimal configuration") against the result of performing an exhaustive search over all possible configurations of $\{\mathbf{p}, \bar{\mathbf{R}}\}$ and selecting the best one ("exhaustive search"). We perform the following two experiments.

First, we consider a wireless network with $N$ stations, $N/2$ of them with a normalized SNR equal to $\rho_1$ and the other $N/2$ with a normalized SNR equal to $\rho_2$. Fig. 5 shows the total throughput obtained by both approaches for different numbers of stations ($N = \{2, 4, \ldots, 20\}$) and levels of heterogeneity (ρ_1 =, ρ_2 = {, 5, 0, 00}). We observe that both approaches perform very closely, with a difference well below 0.5% in all cases.

Fig. 5. Validation of the optimal configuration of Section II for different numbers of stations and levels of heterogeneity. (Total throughput versus N; curves: optimal configuration for each ρ_2 value, exhaustive search.)

Next, we consider a wireless network with $N$ stations, $N = \{2, 4, \ldots, 20\}$, where the $\rho_i$ of each station is randomly chosen in the range $(\rho_1, \rho_2)$, for ρ_1 = and ρ_2 = {, 5, 0, 00}. The results (not shown in a graph) confirm that also here the difference between the two approaches is well below 0.5% in all cases.

We conclude from these two experiments that the analysis is very accurate. The key approximation of the analysis is to assume that $p_s$ is equal to $(1 - 1/N)^{N-1}$, which corresponds to the value of the optimal success probability with symmetric access probabilities []. By analyzing the access probabilities in the above experiments, we observe that all stations use similar access probabilities regardless of their channel conditions, which makes this approximation particularly accurate. For instance, even for the extreme case of $N = 2$ and ρ_2 = 00, the difference between the access probabilities of the two stations is only around 8%.

Fig. 6. Proportional fairness as a function of SNR (ρ =, ρ 0). ($\sum_i \log(r_i)$ versus ρ_2; curves: optimal configuration, DOC, DOS, non-opportunistic.)
This is a consequence of (), whereby the ratio of the access probabilities of two stations depends on their $T_i$ values; since the thresholds $\bar{R}_i$ are set so that all stations have a similar probability of using a transmission opportunity, this implies that all $T_i$ values are similar, and as a result, the access probabilities are also similar.

B. Throughput evaluation

For the throughput evaluation, we compare the performance of DOC to the following approaches: (i) the static optimal configuration obtained in Section II ("optimal configuration"), (ii) the configuration proposed in [3] ("DOS"), and (iii) an approach that does not perform opportunistic scheduling but always transmits after successful contention ("non-opportunistic").

We consider a scenario with N = 0 stations, half of them with a normalized SNR of ρ_1 = and the other half with a normalized SNR $\rho_2$ that varies from 1 to 10. Fig. 6 shows the proportional fairness metric, $\sum_i \log(r_i)$, as a function of $\rho_2$. We observe that DOC performs at the same level as the benchmark given by the optimal configuration, while the other two approaches (DOS and non-opportunistic) provide a substantially lower performance.

For the above scenario with $\rho_2 = 4$, Fig. 7 depicts the individual throughput allocation of two stations (where $r_1$ is the throughput of a station with $\rho_1$ and $r_2$ that of a station with $\rho_2$). DOC is effective in driving the system to the optimal point of operation and provides the same throughput as the optimal configuration. In contrast, DOS exhibits a high degree of unfairness as it provides a much higher throughput to the station with high SNR. The non-opportunistic approach provides a reasonable degree of fairness but has lower throughput due to the lack of opportunistic scheduling. In conclusion, the proposed DOC algorithm provides a good tradeoff between overall throughput and fairness.

Fig. 7. Throughput for heterogeneous SNRs (ρ =, ρ = 4). (Throughputs $r_1$ and $r_2$ for the optimal configuration, DOC, DOS and non-opportunistic.)
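The proportional fairness metric used in this comparison can be computed with a short Python sketch. The function name and the sample allocations are hypothetical, and the natural logarithm is assumed (another base would only rescale the metric by a constant factor).

```python
import math


def proportional_fairness(throughputs):
    """Sum of log-throughputs; a larger value means a better allocation."""
    if any(r <= 0 for r in throughputs):
        raise ValueError("all throughputs must be strictly positive")
    return sum(math.log(r) for r in throughputs)


# Two illustrative allocations with the same total throughput.
even = proportional_fairness([1.0, 1.0])
skewed = proportional_fairness([0.2, 1.8])
```

For the same aggregate throughput, the even allocation scores higher than the skewed one, which is why an approach that strongly favors stations with high SNR can lose on this metric even when its total throughput is high.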

C. Selfish station with fixed configuration

We verify that a station cannot obtain more throughput with a selfish configuration than by playing DOC in a scenario with N = 0 stations, half of them with ρ_1 = and the other half (including the selfish station) with $\rho_2 = 4$. The selfish station uses a fixed configuration and all other stations implement DOC. Fig. 8 shows the throughput of the selfish station for different configurations $\{p_k, \bar{R}_k\}$ of the selfish station. This is compared to the throughput that the station would obtain if it played DOC, given by the horizontal line. We observe that none of the selfish configurations provides greater throughput than DOC.

Fig. 8. Throughput of a selfish station for fixed configurations of $\{p_k, \bar{R}_k\}$. (Throughput versus $p_k$ for several fixed values of $\bar{R}_k$ relative to $\bar{R}_k^*$, compared with DOC.)

Fig. 9 analyzes the impact of fixed selfish configurations for a range of different $N$ and $\rho_2$ values. It shows the largest throughput that a selfish station can receive with a fixed configuration, which is obtained by performing an exhaustive search over the $\{p_k, \bar{R}_k\}$ space. This throughput is compared to that which the station would receive if it played DOC. Again, we observe that the station never benefits from playing selfishly, which validates the design of the DOC algorithm.

Fig. 9. Selfish station with fixed configuration for different $N$ and $\rho_2$ values. (Throughput versus N; curves: selfish configuration for each $\rho_2$ value, DOC.)

D. Selfish station with variable configuration

According to Theorem 5, a selfish station cannot benefit from changing its configuration over time. To verify this, we evaluate the throughput obtained by a selfish station with different adaptive strategies. These strategies are inspired by the schemes used in [5] for a similar purpose.

Fig. 10. Throughput of selfish station with different adaptive strategies. (Throughput for several values of N; bars: DOC, adaptive $p_k$ strategy, adaptive $\bar{R}_k$ strategy, adaptive $p_k$ and $\bar{R}_k$ strategy.)
The underlying principle of all of them is that the cheating station uses a selfish configuration to gain more throughput and, when it realizes that it is not obtaining more throughput, it assumes that it has been detected as selfish and switches back to the honest configuration to avoid being punished. In particular, we consider the following strategies.

The adaptive $p_k$ strategy fixes the $\bar{R}_k$ configuration of the selfish station to its optimal value, $\bar{R}_k = \bar{R}_k^*$, and modifies the $p_k$ configuration as follows: the station uses a selfish configuration of $p_k = 1$ as long as it obtains some gain, i.e., $r_k > r_k^*$. When $r_k$ drops below $r_k^*$, the station switches to the honest configuration, $p_k = p_k^*$, and stays with this configuration as long as $r_k$ remains below $0.95\, r_k^*$. It switches back to $p_k = 1$ when $r_k$ exceeds $0.95\, r_k^*$.

The adaptive $\bar{R}_k$ strategy fixes the $p_k$ configuration to the optimal value, $p_k = p_k^*$, and modifies the $\bar{R}_k$ configuration following a strategy similar to the one above: the station uses a selfish configuration of $\bar{R}_k = 0$ (i.e., it uses all transmission opportunities) as long as it obtains some gain and switches to the honest configuration when it stops benefiting.

Finally, the adaptive $p_k$ and $\bar{R}_k$ strategy follows a similar behavior to the previous ones but adapts the configuration of both $p_k$ and $\bar{R}_k$.

Fig. 10 compares the throughput obtained with each of the above strategies to that obtained with DOC for different values of $N$. As expected, when all other stations play DOC, a given station maximizes its payoff by playing DOC as well, as this results in a larger throughput for the station than any of the other strategies. This confirms the result of Theorem 5.

E. Multiple selfish stations

Corollary 3 states that all of the selfish stations cannot simultaneously benefit by deviating from DOC: if one or more of the selfish stations experience throughput gains, there must be other selfish stations that suffer some loss. To validate this
result, we consider a network with N = 0 stations (two selfish), half of them (including one of the selfish stations) with ρ_1 = and the other half (including the other selfish station) with $\rho_2 = 4$. We perform an exhaustive search over a wide range of $\{p_i, \bar{R}_i\}$ configurations of the two selfish stations. The results of this experiment show that there is no configuration that simultaneously improves the throughput of the two selfish stations, which confirms the result of Corollary 3.

F. Parameter setting of the PI controller

The main objective in the configuration of the $K_p$ and $K_i$ parameters proposed in Section IV is to achieve a good tradeoff between stability and reaction time. To validate that our system guarantees stable behavior, we analyze the evolution of the throughput received by a station over time in a wireless network with N = 0 stations. Fig. 11 shows the throughput for the chosen setting (labeled "K_p, K_i") and for a configuration of these parameters 0 times larger (labeled "K_p *0, K_i *0"). We observe that with the proposed setting, the throughput only suffers minor deviations around its average value. In contrast, for a larger setting, it exhibits highly oscillatory, unstable behavior.

Fig. 11. Stability analysis of the parameters of the PI controller. (Throughput versus interval for the two settings.)

To investigate the speed with which the system reacts against selfish stations, we consider the following scenario. In a wireless network with N = 0 stations, initially all stations play DOC. After 50 intervals, one station becomes selfish and changes its access probability to $p_k = 1$. Fig. 12 shows the evolution of the throughput of the selfish station over time. We observe from the figure that with our setting (labeled "K_p, K_i"), the system reacts quickly, and after a few tens of intervals the selfish station no longer benefits from its behavior.

Fig. 12. Speed of reaction provided by the parameters of the PI controller. (Throughput of the selfish station versus interval for the two settings.)
In contrast, for a parameter setting 0 times smaller (labeled "K_p /0, K_i /0") the reaction is very slow and it takes almost 000 intervals for the station to stop benefiting from its misbehavior (footnote 8).

The results show that with a larger setting of $\{K_p, K_i\}$ the system suffers from instability while with a smaller one it reacts too slowly. Hence, the proposed setting provides a good tradeoff between stability and reaction time.

Footnote 8: We note that, while the analysis of Section IV guarantees stability when all stations run DOC, our system is also stable when some of the stations are selfish. This is shown by the experiment of Fig. 12 where, after one of the stations turns selfish, the others increase their access probability to a value that ensures the selfish station does not have any gain. The system then remains stable at this point of operation.

G. Impact of channel coherence time

Our channel model is based on the assumption that different observations of the channel conditions are independent. In order to understand the impact of this assumption, we repeat the experiment of Fig. 7 using Jakes' channel model [0] to obtain the different channel observations. The results, for a Doppler frequency of f_D = π/00τ, are given in Fig. 13. We observe that the throughput obtained is slightly smaller than that of Fig. 7. This is due to the fact that when the channel is bad, a station does not transmit after a successful contention and therefore it takes (on average) a shorter time until the next successful contention of this station. As a result, a station accesses the channel more often when it is bad than when it is good, which introduces a bias that slightly reduces the throughput. Overall, the results are sufficiently similar to those of Fig. 7 to conclude that our assumption on the channel model only has a minor impact on the resulting performance.

Fig. 13. Performance with Jakes' channel model. (Throughputs $r_1$ and $r_2$ for the optimal configuration, DOC, DOS and non-opportunistic.)

We further investigate whether, in the above scenario, a station with $\rho_2 = 4$ could obtain more throughput by using a selfish configuration.
While the station obtains .75 Mbps with DOC, it can obtain up to .757 Mbps with a selfish configuration. Note that this increase is not due to the DOC design, as no other configuration provides the selfish station with more channel time, but rather due to the fact that the transmission rate threshold of [3] is not truly optimal under Jakes' channel model. In any case, the throughput gain of the selfish station is negligible.

H. Stations joining and leaving the network

To assess the effectiveness of DOC with stations joining and leaving the network, we perform the following experiment. We consider a wireless network with 5 stations, one of which is a selfish station. After 000 intervals, 5 additional stations join the wireless network, stay for I intervals and then leave. The initial 5 stations stay for another 000 intervals. The selfish station plays with a configuration {p, R} when there are 5 stations in the network and a configuration {p, R} when there are 0 stations. We obtain these configurations by performing an exhaustive search over all possible configurations and selecting the {p, R} and {p, R} values that provide the selfish station with the largest average throughput. Fig. 4 shows the average throughput obtained by the station with this selfish strategy compared to the throughput it would obtain if it played DOC. The results confirm that the selfish station cannot obtain any gain by deviating from DOC.

[Fig. 4: No throughput gain for the selfish station with stations joining and leaving the network. Horizontal axis: number of intervals (I). Legend: DOC, Selfish.]

VII. CONCLUSIONS

Recently proposed Distributed Opportunistic Scheduling (DOS) techniques provide throughput gains in wireless networks that do not have a centralized scheduler. One of the problems of these techniques, however, is that they are vulnerable to malicious users who may configure their parameters to obtain a greater share of the wireless resources. In this paper we address this problem by proposing a novel algorithm that prevents such throughput gains from selfish behavior. With our approach, upon detecting a selfish user, stations react by using a more aggressive parameter configuration that indirectly punishes the selfish station. Such an adaptive algorithm has to carefully adjust the reaction against a selfish station in order to prevent the system from becoming unstable. A key aspect of the paper is that we use tools from control theory combined with game theory to design our algorithm: by conducting a control theoretic analysis, we show that when all stations run DOC the system converges to the desired configuration, and by conducting a game theoretic analysis, we show that selfish stations cannot benefit from playing a different strategy.

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous reviewers and Dr. S. Murphy for their valuable feedback.

REFERENCES

[1] M. Andrews et al., "Providing quality of service over a shared wireless link," vol. 39, no., pp. 50 54, Feb. 00.
[2] P. Viswanath, D. Tse, and R. Laroia, "Opportunistic beamforming using dumb antennas," IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 77 94, Jun. 00.
[3] D. Zheng, W. Ge, and J. Zhang, "Distributed opportunistic scheduling for ad hoc networks with random access: An optimal stopping approach," IEEE Trans. Inf. Theory, vol. 55, no., pp. 05, Jan. 2009.
[4] D. Zheng et al., "Distributed opportunistic scheduling for ad hoc communications with imperfect channel information," IEEE Trans. Wireless Commun., vol. 7, no., pp. 5450 5460, Dec. 2008.
[5] P. S. C. Thejaswi et al., "Distributed opportunistic scheduling with two-level probing," IEEE/ACM Trans. Netw., vol. 8, no. 5, pp. 464 477, Oct. 00.
[6] S. Tan, D. Zheng, J. Zhang, and J. R. Zeidler, "Distributed opportunistic scheduling for ad-hoc communications under delay constraints," in Proc. IEEE INFOCOM, Mar. 00, pp. 874 88.
[7] A. Garcia-Saavedra, A. Banchs, P. Serrano, and J. Widmer, "Distributed opportunistic scheduling: A control theoretic approach," in Proc. IEEE INFOCOM, Mar. 0, pp. 540 548.
[8] P. Patras, A. Banchs, P. Serrano, and A. Azcorra, "A control-theoretic approach to distributed optimal configuration of 802.11 WLANs," IEEE Trans. Mob. Comput., vol. 0, no. 6, pp. 897 90, Jun. 0.
[9] F. Kelly, "Charging and rate control for elastic traffic," Eur. Trans. Telecommun., vol. 8, no., pp. 33 37, Jan. 1997.
[10] G. Holland, N. Vaidya, and P. Bahl, "A rate-adaptive MAC protocol for multi-hop wireless networks," in Proc. ACM MOBICOM, Jul. 00, pp. 36 5.
[11] B. Sadeghi, V. Kanodia, A. Sabharwal, and E. Knightly, "Opportunistic media access for multirate ad hoc networks," in Proc. ACM MOBICOM, Sep. 00, pp. 4 35.
[12] I. Menache and N. Shimkin, "Rate-based equilibria in collision channels with fading," IEEE J. Sel. Areas Commun., vol. 6, no. 7, pp. 070 077, Sep. 2008.
[13] P. Gupta, Y. Sankarasubramaniam, and A. Stolyar, "Random-access scheduling with service differentiation in wireless networks," in Proc. IEEE INFOCOM, Mar. 2005, pp. 85 85.
[14] T. Glad and L. Ljung, Control Theory: Multivariable and Nonlinear Methods. London: Taylor & Francis, 2000.
[15] M. Cagalj, S. Ganeriwal, I. Aad, and J.-P. Hubaux, "On selfish behavior in CSMA/CA networks," in Proc. IEEE INFOCOM, Mar. 2005, pp. 53 54.
[16] J. Konorski, "A game-theoretic study of CSMA/CA under a backoff attack," IEEE/ACM Trans. Netw., vol. 6, no. 6, pp. 67 78, Dec. 2006.
[17] G. F. Franklin, J. D. Powell, and M. L. Workman. Reading, MA: Addison-Wesley, 1990.
[18] C. Hollot, V. Misra, D. Towsley, and W. B. Gong, "A control theoretic analysis of RED," in Proc. IEEE INFOCOM, Apr. 00, pp. 50 59.
[19] D. Fudenberg and J. Tirole, Game Theory. Cambridge, MA: MIT Press, 99.
[20] W. C. Jakes, Microwave Mobile Communications. New York: Wiley & Sons, 1975.

Albert Banchs (M 04 SM) received his degree in telecommunications engineering from the Polytechnic University of Catalonia in 1997, and his Ph.D. degree from the same university in 00. He received a national award for the best Ph.D. thesis on broadband networks. He was a visiting researcher at ICSI, Berkeley, in 1997, worked for Telefonica I+D, Spain, in 1998, and for NEC Europe Ltd., Germany, from 1998 to 2003. He has been with the University Carlos III of Madrid since 2003. Since 2009, he also has a double affiliation as Deputy Director of the institute IMDEA Networks. Albert Banchs has authored over 80 publications in peer-reviewed journals and conferences and holds six patents. He is area editor for Computer Communications and has been senior and associate editor for IEEE Communications Letters and guest editor for IEEE Wireless Communications, Computer Networks and Computer Communications. He has served on the TPC of a number of conferences and workshops including IEEE INFOCOM, IEEE ICC and IEEE GLOBECOM, and has been TPC chair for European Wireless 00, IEEE HotMESH 00 and IEEE WoWMoM 0. He is a senior member of the IEEE.

Andres Garcia-Saavedra received his B.Sc. degree in Telecommunications Engineering from the University of Cantabria in 2009, and his M.Sc. degree in Telematics Engineering from University Carlos III of Madrid (UC3M) in 00. He currently holds the position of Teaching Assistant and pursues his Ph.D. in the Telematics Department of UC3M. His work focuses on performance evaluation and energy efficiency of wireless networks.

Pablo Serrano (M 09) received his degree in telecommunication engineering and his Ph.D. degree from University Carlos III of Madrid (UC3M) in 00 and 2006, respectively. He has been with the Telematics Department of UC3M since 00, where he currently holds the position of associate professor. In 2007 he was a visiting researcher at the Computer Network Research Group at the University of Massachusetts, Amherst. His current work focuses on performance evaluation of wireless networks. He has authored over 40 scientific papers in peer-reviewed international journals and conferences. He also serves as TPC member of several international conferences, including IEEE GLOBECOM and IEEE INFOCOM.

Joerg Widmer (M 06 SM 0) is a Chief Researcher at Institute IMDEA Networks in Madrid, Spain. His research expertise covers computer networks and distributed systems, ranging from MAC layer design, sensor networking, and network coding to transport protocols and Future Internet architectures. From June 2005 to July 00, he was manager of the Ubiquitous Networking Research Group at DOCOMO Euro-Labs in Munich, Germany, leading several projects in the area of mobile and cellular networks. Before joining DOCOMO Euro-Labs, he worked as a post-doctoral researcher at EPFL, Switzerland, focusing on ultra-wide band communication and network coding. Joerg Widmer received his M.S. and Ph.D. degrees in computer science from the University of Mannheim, Germany, in 2000 and 2003, respectively. In 1999 and 2000 he was a visiting researcher at the International Computer Science Institute in Berkeley, CA, USA. He has authored more than 00 conference and journal papers and three IETF RFCs. He also holds several patents, serves on the editorial board of IEEE TRANSACTIONS ON COMMUNICATIONS, and regularly participates in the program committees of several major conferences. He is a senior member of the IEEE and ACM.