Detecting Stations Cheating on Backoff Rules in 802.11 Networks Using Sequential Analysis

Yanxia Rong
Department of Computer Science
George Washington University
Washington DC
Email: yxrong@gwu.edu

Sang-Kyu Lee
Department of Computer Science
Sookmyung Women's University
Seoul, Korea
Email: sanglee@sookmyung.ac.kr

Hyeong-Ah Choi
Department of Computer Science
George Washington University
Washington DC
Email: hchoi@gwu.edu

Abstract

As the commercial success of the IEEE 802.11 protocol has led to widely deployed wireless infrastructure, user organizations are increasingly concerned about the new vulnerabilities in their networks. While various security issues have been extensively studied, the threats posed by denial-of-service (DoS) attacks have not been fully explored. In this paper, we consider DoS attacks that cheat on the backoff rules of the IEEE 802.11 DCF protocol and propose a scheme for detecting such adversaries. Our scheme is based on sequential hypothesis testing. We first develop analytical models for the packet inter-arrival time distribution at each station in a network where multiple cheating stations coexist. Using the characterization of this probability distribution, we develop an algorithm that detects cheating stations based on the throughput degradation observed at normal stations. Our simulation results show that the proposed algorithm requires only a very small number of packet observations while keeping the rates of false positive and false negative decisions very small (i.e., less than .1%). That is, the proposed algorithm is both fast and accurate.

I. INTRODUCTION

As the commercial success of the IEEE 802.11 protocol [14] in access point-based wireless networks (Wi-Fi) has led to the rapid deployment of wireless infrastructure, user organizations are increasingly concerned about the new vulnerabilities in their networks.
While a more secure derivative of the 802.11 protocol, 802.11i, is available in the standards community, and security mechanisms at the network layer have been extensively discussed, the threats posed by denial-of-service (DoS) attacks have not been fully explored. This paper focuses on threats posed by DoS attacks against the 802.11 MAC layer protocol.

In 802.11, the likelihood of collisions is reduced by employing the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) algorithm. The basic idea is that the sender and the receiver first exchange short control frames before transmitting actual data frames. The success of this initial exchange reserves the medium for a period of time specified in the sender's control frame, and all other listening stations should not initiate any transmission until the indicated length of time has elapsed. Such a protocol works well if all stations in a network respect the rules of the protocol. However, network adapters are becoming more and more programmable, and an attacker can easily modify the wireless interface to obtain more bandwidth at the expense of others.

A. DoS Attacks by Cheating on Backoff Rule

An attacker's goal is to waste as much network bandwidth as possible while remaining difficult to detect. Such an attack is possible when adversaries modify the protocol's backoff mechanism. The 802.11 protocol defines two major functions: the point coordination function (PCF) and the distributed coordination function (DCF). While the PCF is a centralized scheme, the more widely used DCF is a random access scheme in which retransmission of collided packets is managed according to the binary exponential backoff rule.
At each packet (control or data frame) transmission, the backoff time is chosen uniformly in the range $(0, CW - 1)$, where the contention window $CW$ is initially set to the minimum contention window $CW_{\min}$; at each unsuccessful transmission, $CW$ is doubled, up to the maximum value $CW_{\max}$. ($CW_{\min}$ and $CW_{\max}$ are physical-layer-dependent values specified by the 802.11 standards.) The backoff counter is decremented by one in each time slot as long as the channel is sensed idle and is frozen when a transmission is detected on the channel. The station transmits when the backoff counter reaches zero.

Consider adversaries that select backoff values from a different distribution, e.g., the backoff time is randomly chosen in the range $(0, g \cdot CW - 1)$, where $g$ is between 0 and 1. (Note that if $g = 1$, the adversary also observes the rule.) A naive analysis of a transmission log file (even if such a file exists) cannot detect this type of cheating because the backoff values are random.

B. Scope

In this paper, we consider DoS attacks posed by cheating on the backoff rules in the DCF and propose a scheme for detecting such adversaries. Our proposed scheme is based on two technical components: (1) the analysis of the inter-arrival time distribution between packets successfully transmitted from each station and (2) the sequential analysis initially introduced by Wald [2] and thereafter extensively studied in many variations and application domains.

The rest of this paper is organized as follows. In the next section, the DCF of the 802.11 protocol is re-examined. In particular, a stochastic model developed by Bianchi in [1] is reviewed in detail, as some of the results in [1] form the basis of our analysis in subsequent sections. In Section III, we extend the network model to include stations cheating on the backoff rule and study the inter-arrival time distribution of successful packets at both normal and misbehaving stations. In Section IV, based on the analysis presented in Section III, we develop a sequential analysis approach that detects stations cheating on backoff rules. The performance of our approach under various conditions is also discussed. Finally, our conclusions and some ideas for further research are presented in Section V.

II. PRIOR AND RELATED WORK

In the following, the 802.11 DCF protocol is re-examined, and other related work is reviewed.

A. The IEEE 802.11 DCF Protocol

The IEEE 802.11 DCF is based on the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol, which is designed to reduce collisions between stations using the same channel. Each station monitors the channel before it transmits its packet. Before a station starts to transmit a packet, it must sense the channel idle for a duration called the distributed interframe space (DIFS), plus an additional backoff time. The backoff time is an integer multiple of a basic slot duration $\delta$, where the backoff number is drawn randomly in the range $(0, CW - 1)$. Here $CW$ is the contention window, initially set to a value called $CW_{\min}$. The station decrements its counter if the channel is idle during a slot period $\delta$, and otherwise freezes its counter until the channel becomes idle. Once the channel becomes idle, the station waits for another DIFS period before it resumes decrementing its counter after each idle slot. When the backoff number reaches zero, the station transmits its packet. When the receiver has received the packet, it waits for a shorter period called the short interframe space (SIFS) and then sends an ACK packet back to the sender to indicate that the transmission was successful.
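As a rough illustration of the backoff draw described so far, the following sketch compares a rule-abiding station with one that shrinks its window by a greedy factor $g$. The function names, the default values `cw_min=16` and `m=6`, and the truncation of $g \cdot CW$ to an integer are assumptions made for this example, not details taken from the standard.

```python
import random


def window_size(stage, g=1.0, cw_min=16, m=6):
    """Number of admissible backoff values after `stage` failed attempts.

    The window doubles per failure and is capped at 2**m * cw_min
    (assumed here to play the role of CW_max). A greedy factor g < 1
    shrinks the window; truncating to an integer is an assumption.
    """
    return max(int(g * cw_min * 2 ** min(stage, m)), 1)


def draw_backoff(stage, g=1.0, cw_min=16, m=6, rng=random):
    """Draw a backoff count uniformly from 0 .. window_size - 1."""
    return rng.randrange(window_size(stage, g, cw_min, m))


def mean_backoff(stage, g=1.0, cw_min=16, m=6):
    """Expected backoff count for the given stage and greedy factor."""
    return (window_size(stage, g, cw_min, m) - 1) / 2
```

With these assumed values, a normal station waits 7.5 slots on average at its first attempt, whereas a station with $g = 0.25$ waits 1.5 slots, yet any single draw of either station looks like an ordinary random value.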
If the sender has not received the ACK within a specified timeout, or it finds that some other station is transmitting a packet on the channel, the sender doubles its contention window $CW$ and chooses a random number in the range $(0, CW - 1)$. If its contention window $CW$ is already equal to $CW_{\max}$, it does not double the window even when its transmission is unsuccessful, and it uses the current window value for selecting the next backoff value. Note that $CW_{\max} = 2^m CW_{\min}$ for some integer $m$. Once a transmission is successfully completed, $CW$ is reset to $CW_{\min}$ for the next packet transmission. This access mechanism is called the basic access mechanism.

Wireless networks suffer from the hidden terminal problem. While a sender is sending packets to a receiver, a third station that is outside the sender's transmission range, but whose own transmissions reach the receiver, senses the channel as idle and sends a packet, which can cause a collision at the receiver. Similarly, a station outside the receiver's transmission range cannot hear the ACK from the receiver, and its transmission can cause a collision at the sender. To deal with this issue, IEEE 802.11 adds two signalling packets, the request to send (RTS) and the clear to send (CTS). After the channel is sensed idle for DIFS, the sender sends an RTS to the receiver. If the receiver decides to accept the packet, it sends back a CTS after receiving the RTS. After receiving the CTS, the sender transmits its packet. Both the RTS and the CTS specify the duration of the transaction, so hidden stations on either the sender or the receiver side can defer their transmissions until the ongoing transaction is finished.

B. Bianchi's Stochastic Model

Our network model assumes that every station is saturated, i.e., each station always has packets waiting to be transmitted. This assumption is justified when the network is congested, as is the case when one or more stations are trying to deprive legitimate stations of their share of bandwidth. In [1], a Markov chain model for the IEEE 802.11 DCF protocol was developed assuming that the collision probability, denoted $p$, is a constant, i.e., the probability that a packet collides with others, given that the packet is transmitted from a station, is constant. Note that $p$ is a conditional probability.

In this stochastic process, a station has different contention windows $CW$ and different backoff time counters $k$ at different times. If $CW = 2^i CW_{\min}$, the station is said to be at stage $i$, with corresponding contention window denoted $W_i$, where $0 \le i \le m$. Further, $s(t)$ represents the stage of the station at time $t$, and $b(t)$ represents the backoff time counter of the station at time $t$. The stochastic process is described as follows:

$$
\begin{aligned}
P\{i, k \mid i, k+1\} &= 1, & k \in (0, W_i - 2),\ & i \in (0, m) \\
P\{0, k \mid i, 0\} &= \frac{1-p}{W_0}, & k \in (0, W_0 - 1),\ & i \in (0, m) \\
P\{i, k \mid i-1, 0\} &= \frac{p}{W_i}, & k \in (0, W_i - 1),\ & i \in (1, m) \\
P\{m, k \mid m, 0\} &= \frac{p}{W_m}, & k \in (0, W_m - 1) &
\end{aligned} \tag{1}
$$

where $W_i = 2^i CW_{\min}$ and $P\{s(t+1) = i_1, b(t+1) = k_1 \mid s(t) = i_0, b(t) = k_0\}$ is written as $P\{i_1, k_1 \mid i_0, k_0\}$ for brevity. The first equation in (1) states that the backoff time counter is decremented by one in each slot time. The second equation states that after a successful transmission the station returns to stage 0 and chooses a random backoff number $k$. The third equation states that after an unsuccessful transmission the contention window is doubled and a new random number $k$ is chosen. The fourth equation states that when a transmission collides while the station is at the maximum stage $m$, the station stays at stage $m$ and chooses a random number $k$ for retransmission.

The probability that a station transmits a packet in a randomly chosen slot is denoted $\tau$. Based on this Markov model, $\tau$ is calculated as

$$
\tau = \frac{2(1-2p)}{(1-2p)(W_0+1) + p W_0 (1-(2p)^m)}. \tag{2}
$$

If there are $n$ stations using the channel,

$$
p = 1 - (1-\tau)^{n-1}. \tag{3}
$$
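Equations (2) and (3) form a fixed-point problem in $\tau$ and $p$. The sketch below solves it by bisection; it is one possible numerical approach, not necessarily the one used in [1]. The function names and the default values $W_0 = 16$, $m = 6$ are assumptions. Equation (2) is evaluated in the algebraically equivalent form $\tau = 2 / (1 + W_0 + p W_0 \sum_{j=0}^{m-1} (2p)^j)$, which avoids the $0/0$ form at $p = 1/2$.

```python
def tau_from_p(p, w0=16, m=6):
    """Equation (2), using (1-(2p)^m)/(1-2p) = sum_{j<m} (2p)^j."""
    series = sum((2.0 * p) ** j for j in range(m))
    return 2.0 / (1.0 + w0 + p * w0 * series)


def solve_bianchi(n, w0=16, m=6, tol=1e-12):
    """Solve Equations (2) and (3) for (tau, p) with n saturated stations.

    tau_from_p decreases in p and p increases in tau, so the residual
    tau_from_p(p(tau)) - tau changes sign exactly once on (0, 1).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        tau = 0.5 * (lo + hi)
        p = 1.0 - (1.0 - tau) ** (n - 1)
        if tau_from_p(p, w0, m) > tau:
            lo = tau  # model predicts more transmissions: raise tau
        else:
            hi = tau
    tau = 0.5 * (lo + hi)
    return tau, 1.0 - (1.0 - tau) ** (n - 1)
```

For example, `solve_bianchi(10)` gives roughly $\tau \approx 0.052$ and $p \approx 0.38$ under the assumed window parameters.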

Equation (3) states that a transmitted packet collides when, in the same slot, at least one of the other $n - 1$ stations transmits. Equations (2) and (3) can be solved to compute the two unknowns $p$ and $\tau$.

C. Other Related Work

Various techniques and protocols have been proposed to detect or punish misbehavior in wireless networks. Game theory is a major technique for studying misbehavior in ad hoc networks. From a game-theoretic point of view, each station is selfish and wishes to maximize its own throughput. The authors of [5] show that the existence of a small population of selfish, non-cooperative stations leads to network collapse, which gives stations an incentive to cooperate with each other. They propose a detection mechanism that identifies misbehaving stations and a penalizing scheme that jams any station whose throughput exceeds the average of the other stations. In [6], [7], [8], [9], [10], misbehavior in Aloha is studied from a game-theoretic point of view. In [3], [4], Konorski proposes protocols that are resilient to misbehaving stations.

In [12], Kyasanur and Vaidya propose a protocol to detect misbehaving stations. In this protocol, the receiver assigns a backoff number $B_{exp}$ to the sender. The receiver then counts the actual backoff number $B_{act}$ observed between two consecutive packets transmitted by the sender. If $B_{act}$ is less than $B_{exp}$, the sender is assigned a larger backoff number for the next transmission. If the sum of the differences $B_{exp} - B_{act}$ over the last several packets is larger than a positive threshold, the sender is identified as misbehaving. In [13], a protocol called ERA-802.11 is presented. In this protocol, if at least one of the sender and the receiver is honest, a uniformly distributed random backoff is ensured by letting the sender and the receiver agree on a random value. The trusted random value helps the receiver detect misbehavior by observing deviations from the trusted value.
Without modifying the IEEE 802.11 protocol, Raya, Hubaux, and Aad [11] present a system, DOMINO, installed at the access point. To detect whether misbehaving stations gain an advantage over normal stations, this system compares the actual average backoff of a station with the nominal average backoff time to observe whether the station deviates from the protocol.

III. MODELING 802.11 NETWORKS WITH CHEATING STATIONS

When the network includes stations cheating on backoff rules, the performance of the network and of each individual station changes. In order to identify the misbehaving stations, we investigate their properties, in particular the throughput and the packet inter-arrival time. In the following, we present a stochastic model, based on Bianchi's model discussed in the previous section, to analyze the inter-arrival time distribution at each station. Our model also assumes that the network is saturated.

A. Markov Chain Model

In this section, we consider a network of $n$ stations, among which $l$ stations cheat on the backoff rules with greedy factors $g_1, \dots, g_l$ ($0 < g_1, \dots, g_l < 1$) and the remaining $n - l$ stations observe the rule. A misbehaving station $S_a$ with greedy factor $g_a$ chooses a random backoff value in the range $(0, g_a W - 1)$, where $W$ denotes the current contention window $CW$. Let $p_0$ denote the collision probability of normal stations, and let $p_a$ ($1 \le a \le l$) denote the collision probability of the misbehaving station $S_a$ with greedy factor $g_a$. The stochastic process for each misbehaving station $a$ is then modeled as follows.
$$
\begin{aligned}
P_a\{i, k \mid i, k+1\} &= 1, & k \in (0, g_a W_i - 2),\ & i \in (0, m) \\
P_a\{0, k \mid i, 0\} &= \frac{1-p_a}{g_a W_0}, & k \in (0, g_a W_0 - 1),\ & i \in (0, m) \\
P_a\{i, k \mid i-1, 0\} &= \frac{p_a}{g_a W_i}, & k \in (0, g_a W_i - 1),\ & i \in (1, m) \\
P_a\{m, k \mid m, 0\} &= \frac{p_a}{g_a W_m}, & k \in (0, g_a W_m - 1) &
\end{aligned} \tag{4}
$$

Equation (4) is similar to Equation (1), except that the collision probability at the misbehaving station is $p_a$ and the station chooses its random backoff number in the range $(0, g_a W_i - 1)$, with $W_i = 2^i CW_{\min}$ as before. Let $b^a_{i,k} = \lim_{t \to \infty} P_a\{s(t) = i, b(t) = k\}$. The limiting probabilities $b^a_{i,k}$ can be obtained as follows:

$$
b^a_{i,k} = \frac{g_a W_i - k}{g_a W_i} \cdot
\begin{cases}
(1-p_a) \sum_{j=0}^{m} b^a_{j,0} & \text{for } i = 0 \\
p_a\, b^a_{i-1,0} & \text{for } 0 < i < m \\
p_a \left(b^a_{m-1,0} + b^a_{m,0}\right) & \text{for } i = m
\end{cases} \tag{5}
$$

and

$$
b^a_{i,0} = (p_a)^i\, b^a_{0,0} \ \text{ for } 0 < i < m, \qquad
b^a_{m,0} = \frac{(p_a)^m}{1-p_a}\, b^a_{0,0}. \tag{6}
$$

From Equations (5) and (6) together with the normalization condition

$$
\sum_{i=0}^{m} \sum_{k=0}^{g_a W_i - 1} b^a_{i,k} = 1, \tag{7}
$$

we obtain

$$
b^a_{0,0} = \frac{2(1-2p_a)(1-p_a)}{(1-2p_a)(g_a W_0+1) + p_a g_a W_0 (1-(2p_a)^m)}. \tag{8}
$$

Now, let $\tau_a$ be the probability that the misbehaving station $a$ transmits in an arbitrary time slot. Then

$$
\tau_a = \sum_{i=0}^{m} b^a_{i,0} = \frac{2(1-2p_a)}{(1-2p_a)(g_a W_0+1) + p_a g_a W_0 (1-(2p_a)^m)}. \tag{9}
$$

The stochastic process for the normal stations is

$$
\begin{aligned}
P\{i, k \mid i, k+1\} &= 1, & k \in (0, W_i - 2),\ & i \in (0, m) \\
P\{0, k \mid i, 0\} &= \frac{1-p_0}{W_0}, & k \in (0, W_0 - 1),\ & i \in (0, m) \\
P\{i, k \mid i-1, 0\} &= \frac{p_0}{W_i}, & k \in (0, W_i - 1),\ & i \in (1, m) \\
P\{m, k \mid m, 0\} &= \frac{p_0}{W_m}, & k \in (0, W_m - 1) &
\end{aligned} \tag{10}
$$

Let $b_{i,k} = \lim_{t \to \infty} P\{s(t) = i, b(t) = k\}$. We obtain $b_{0,0}$ in the same way as in Equation (8), except that the collision probability is $p_0$:

$$
b_{0,0} = \frac{2(1-2p_0)(1-p_0)}{(1-2p_0)(W_0+1) + p_0 W_0 (1-(2p_0)^m)}. \tag{11}
$$

Let $\tau_0$ denote the probability that a normal station transmits in an arbitrary time slot. Then, similarly,

$$
\tau_0 = \frac{2(1-2p_0)}{(1-2p_0)(W_0+1) + p_0 W_0 (1-(2p_0)^m)}. \tag{12}
$$

The probability that a packet transmitted from misbehaving station $a$ collides is equal to the probability that at least one of the other stations transmits in the same time slot. Therefore, for each misbehaving station $a$ ($1 \le a \le l$),

$$
1 - p_a = (1-\tau_0)^{n-l} \prod_{1 \le i \le l,\ i \ne a} (1-\tau_i). \tag{13}
$$

Similarly, the collision probability for a normal station is equal to the probability that at least one of the other stations (including the misbehaving stations) transmits. Thus,

$$
1 - p_0 = (1-\tau_0)^{n-l-1} \prod_{1 \le i \le l} (1-\tau_i). \tag{14}
$$

We therefore have the following $2l + 2$ equations:

$$
\begin{aligned}
\tau_0 &= \frac{2(1-2p_0)}{(1-2p_0)(W_0+1) + p_0 W_0 (1-(2p_0)^m)} \\
\tau_a &= \frac{2(1-2p_a)}{(1-2p_a)(g_a W_0+1) + p_a g_a W_0 (1-(2p_a)^m)}, \quad 1 \le a \le l \\
p_0 &= 1 - (1-\tau_0)^{n-l-1} \prod_{1 \le i \le l} (1-\tau_i) \\
p_a &= 1 - (1-\tau_0)^{n-l} \prod_{1 \le i \le l,\ i \ne a} (1-\tau_i), \quad 1 \le a \le l
\end{aligned} \tag{15}
$$

with $2l + 2$ unknowns $\tau_0, \tau_1, \dots, \tau_l, p_0, p_1, \dots, p_l$. Finding a closed form for each unknown is non-trivial, so we compute each value using a numerical method in the model validation discussed in a later section.

B. Inter-Arrival Time Distribution

We are now ready to formulate the distribution of the packet inter-arrival time. We focus our discussion on the RTS/CTS access mechanism, as the analysis can easily be extended to the basic access scheme. Throughout the paper, we assume that data packets have the same size, and $T_P$ denotes the amount of time it takes to transmit a packet in its entirety. Let $T$ denote a random variable representing the inter-arrival time between two packets successfully transmitted from a station.
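The coupled system in Equation (15) has no convenient closed form, but it can be solved numerically. The sketch below uses a damped fixed-point iteration; this is one possible method chosen for illustration, and the paper does not state which numerical method the authors used. The function names, the damping factor, the iteration count, and the defaults $W_0 = 16$, $m = 6$ are assumptions.

```python
from math import prod


def tau_from_p(p, g=1.0, w0=16, m=6):
    """Equations (9)/(12) in the equivalent series form (no 0/0 at p = 1/2)."""
    series = sum((2.0 * p) ** j for j in range(m))
    gw = g * w0
    return 2.0 / (1.0 + gw + p * gw * series)


def collision_probs(taus):
    """Equations (13)/(14): a station collides if any other station transmits."""
    idle = [1.0 - t for t in taus]
    return [1.0 - prod(idle[:a] + idle[a + 1:]) for a in range(len(taus))]


def solve_mixed(greedy, n_normal, w0=16, m=6, damping=0.25, iters=2000):
    """Solve Equation (15); returns (taus, ps), normal stations listed first.

    Damping is an assumption of this sketch: the undamped map tends to
    oscillate because tau falls as p rises while p rises with tau.
    """
    factors = [1.0] * n_normal + list(greedy)
    taus = [0.1] * len(factors)
    for _ in range(iters):
        ps = collision_probs(taus)
        new = [tau_from_p(p, g, w0, m) for p, g in zip(ps, factors)]
        taus = [(1 - damping) * t + damping * u for t, u in zip(taus, new)]
    return taus, collision_probs(taus)
```

Calling `solve_mixed([0.25, 0.5, 0.75], n_normal=7)` treats every station individually; by symmetry the seven normal stations converge to the same $\tau_0$ and $p_0$, and the stations with smaller greedy factors end up with larger transmission probabilities.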
For a given value $t$ ($t > 0$), our interest is to compute the probability that the inter-arrival time between two successful packets equals $t$, i.e., the probability that $T = t$. As will become clear in the following discussion, we only need to consider discrete values of $t$.

After receiving the ACK frame corresponding to the previous data packet, the station waits for a DIFS period and performs the following steps if it has another packet ready to transmit.

(1) It chooses a random backoff number $k_1$ within the current contention window. (If the channel is idle, this step is skipped. Hence, we can treat this case as $k_1 = 0$.)
(2) When the backoff counter has decreased to 0, an RTS frame is transmitted from the sender.
(3) Two situations may occur after Step (2).
  (3-1) The RTS frame is successfully transmitted:
    (3-1-1) The receiver waits for a SIFS period and starts to send a CTS frame.
    (3-1-2) The sender waits for a SIFS period after the CTS frame is completely received, and starts to transmit a data packet, which takes $T_P$.
    (3-1-3) After the data packet is completely received at the receiver, the receiver waits for a SIFS period and starts to send an ACK frame.
  (3-2) The RTS frame collides:
    (3-2-1) The sender assumes that the RTS frame has collided if a CTS frame is not received after a SIFS period.
    (3-2-2) The sender then waits for an additional (DIFS - SIFS) period from the point when the CTS frame was supposed to be received.
    (3-2-3) The sender doubles its contention window and chooses a random backoff number $k_2$. (Subsequently, it chooses $k_3, k_4, \dots, k_i, k_{i+1}$, assuming that $i$ collisions of RTS frames occur before an RTS frame is successfully transmitted at the $(i+1)$th try. Let $k = k_1 + \dots + k_i + k_{i+1}$.)
    (3-2-4) Go to Step (2).
Let $T_s$ denote the time from when the sender starts to transmit an RTS frame to when an ACK frame is successfully received, plus an additional idle channel period DIFS, i.e., $T_s$ covers Steps (2) and (3-1) plus an additional DIFS period after the ACK frame is received. We then have

$$
T_s = RTS + SIFS + CTS + SIFS + T_P + SIFS + ACK + DIFS. \tag{16}
$$

Let $T_c$ denote the time from when the sender starts to transmit an RTS frame to when it assumes a collision and starts to choose a new backoff number, i.e., $T_c$ covers Steps (2) and (3-2-1)-(3-2-2). We then have

$$
T_c = RTS + DIFS. \tag{17}
$$

Figure 1 depicts the RTS/CTS mechanism, where $BC(k_j)$ denotes the actual time taken for a backoff number $k_j$ to decrease to 0. The figure shows the inter-arrival time between two packets successfully received at the receiver.

[Fig. 1. Inter-Arrival Times using RTS/CTS Mechanism. The inter-arrival time consists of $BC(k_1), T_c, BC(k_2), T_c, \dots, BC(k_i), T_c, BC(k_{i+1}), T_s$.]

Consider a network with $n$ stations with greedy factors $g_1, g_2, \dots, g_n$. If a station strictly follows the DCF protocol, its greedy factor is equal to 1; otherwise its greedy factor is less than 1. Between two successful transmissions from a station, say $S_a$ with greedy factor $g_a$, many scenarios are possible: no other station may transmit at all, there may be collisions with or without the involvement of $S_a$, or other stations may complete successful transmissions.

Let $T_a$ be a random variable denoting the inter-arrival time at station $S_a$. Let $T_a(i, k, s, f)$ denote the value of $T_a$ such that during this period there are $f$ collisions, of which $i$ ($i \le f$) collisions involve the station $S_a$ and $f - i$ collisions do not involve $S_a$, and $s$ successful transmissions have been completed by other stations. Let $k_1, k_2, \dots, k_i, k_{i+1}$ denote the random backoff numbers chosen by the station $S_a$ during the period $T_a(i, k, s, f)$, where $k = \sum_{j=1}^{i+1} k_j$ and $k_j < 2^{j-1} g_a CW_{\min}$. Since the station $S_a$ makes a successful transmission at the $(i+1)$th attempt, the total number of successful transmissions (including that of station $S_a$) is $s + 1$. We thus have

$$
T_a(i, k, s, f) = k\delta + f T_c + (s+1) T_s. \tag{18}
$$

Let $P_a(i, k, s, f)$ denote the probability that $T_a$ satisfies this equation. We next compute $P_a(i, k, s, f)$. The station $S_a$ chooses a random value $k_j$ in the range $(0, 2^{j-1} g_a CW_{\min} - 1)$ for $1 \le j \le i+1$. Hence the probability of choosing a particular $k_j$ at the $j$th attempt is equal to $\frac{1}{2^{j-1} g_a CW_{\min}}$. Given backoff numbers $k_1, \dots, k_{i+1}$, the probability that the transmission at the $(i+1)$th attempt is successful after $i$ collisions is

$$
Q_a(k_1, \dots, k_{i+1}) =
\begin{cases}
\displaystyle \left(\prod_{j=1}^{i} \frac{p_a}{2^{j-1} g_a CW_{\min}}\right) \frac{1-p_a}{2^{i} g_a CW_{\min}} & \text{for } 1 \le i \le m, \\[2ex]
\displaystyle \left(\prod_{j=1}^{m} \frac{p_a}{2^{j-1} g_a CW_{\min}}\right) \left(\frac{p_a}{2^{m} g_a CW_{\min}}\right)^{i-m} \frac{1-p_a}{2^{m} g_a CW_{\min}} & \text{for } i > m,
\end{cases}
$$

where $p_a$ denotes the collision probability at the station $S_a$, and $CW_{\max} = 2^m CW_{\min}$.
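To make Equations (16)-(18) concrete, the following sketch evaluates $T_s$, $T_c$, and $T_a(i, k, s, f)$ in microseconds. All numeric values are assumptions for illustration: a 1 Mbps channel (one bit per microsecond), a 50 µs slot, a 160-bit RTS, 112-bit CTS and ACK frames, a 128-bit PHY header on every frame, and $T_P$ taken as payload plus MAC and PHY headers. The constant and function names are hypothetical.

```python
# Assumed frame sizes in bits; at 1 Mbps one bit takes one microsecond.
PHY_HEADER = 128
MAC_HEADER = 272
PAYLOAD = 8184

RTS = 160 + PHY_HEADER          # microseconds
CTS = 112 + PHY_HEADER
ACK = 112 + PHY_HEADER
T_P = PAYLOAD + MAC_HEADER + PHY_HEADER  # assumed definition of T_P

SLOT = 50    # delta
SIFS = 28
DIFS = 128

# Equation (16): successful RTS/CTS exchange plus the trailing DIFS.
T_S = RTS + SIFS + CTS + SIFS + T_P + SIFS + ACK + DIFS
# Equation (17): time lost to a collided RTS.
T_C = RTS + DIFS


def inter_arrival(i, k, s, f):
    """Equation (18): T_a(i, k, s, f) = k*delta + f*T_c + (s+1)*T_s.

    `i` does not enter the duration itself; it is kept only to mirror
    the paper's notation (i <= f collisions involve the tagged station).
    """
    if i > f:
        raise ValueError("i collisions involving S_a cannot exceed f")
    return k * SLOT + f * T_C + (s + 1) * T_S
```

Under these assumptions $T_s = 9564$ µs and $T_c = 416$ µs, so each additional successful transmission by another station shifts the inter-arrival time by almost $10^4$ µs, while backoff slots and collisions contribute much smaller increments.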
Let $C(i, k)$ denote the number of possible combinations of choosing $i + 1$ numbers $k_1, \dots, k_{i+1}$ such that $\sum_{j=1}^{i+1} k_j = k$ and $k_j < 2^{j-1} g_a CW_{\min}$. We then have a recursive form for $C(i, k)$:

$$
C(i, k) = \sum_{j=0}^{2^i g_a CW_{\min} - 1} C(i-1, k-j).
$$

Intuitively, if $k_{i+1} = 0$, there are $C(i-1, k)$ possible combinations; if $k_{i+1} = 1$, there are $C(i-1, k-1)$ possible combinations; and so on. $C(i, k)$ is the sum of the possible combinations over all values of $k_{i+1}$. The boundary cases are $C(0, k) = 1$ for $k < g_a CW_{\min}$, and $C(i, k) = 0$ for any $k > \sum_{j=0}^{i} \left(2^j g_a CW_{\min} - 1\right)$ ($k$ exceeds the maximum possible value). Define $P_a(i, k)$ to be

$$
P_a(i, k) = Q_a(k_1, \dots, k_{i+1})\, C(i, k). \tag{19}
$$

We then have

$$
P_a(i, k, s, f) = P_a(i, k)\, P^a_{sc}(s, f-i), \tag{20}
$$

where $P^a_{sc}(s, f-i)$ is the probability that $s$ successful transmissions and $f - i$ collisions occur among other stations without involving the station $S_a$.

Note that $P^a_{sc}(s, f-i)$ is defined by events that do not include the station $S_a$, so we need to model the behavior of the other stations while $S_a$ decrements its backoff value. For any randomly chosen time slot, if the backoff number of $S_a$ is not zero, $S_a$ is not ready for transmission. The probability that some other station, say $S_b$, attempts a transmission and succeeds is therefore $\tau_b \prod_{j=1,\ j \ne a, b}^{n} (1-\tau_j)$. Hence, the probability that a successful transmission by another station occurs is

$$
p^a_{os} = \sum_{i=1,\ i \ne a}^{n} \left\{ \tau_i \prod_{j=1,\ j \ne i,\ j \ne a}^{n} (1-\tau_j) \right\}. \tag{21}
$$

The probability that a slot is idle is

$$
p^a_{oidle} = \prod_{i=1,\ i \ne a}^{n} (1-\tau_i). \tag{22}
$$
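The recursion for $C(i, k)$ and the slot probabilities of Equations (21) and (22) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, $g_a CW_{\min}$ is truncated to an integer, and the window is capped at stage $m$ (an assumption made to stay consistent with $CW_{\max}$).

```python
from functools import lru_cache
from math import prod


def make_combination_counter(g, cw_min=16, m=6):
    """Return count(i, k) implementing the recursion for C(i, k)."""

    def window(j):
        # Admissible values for the (j+1)th backoff draw: 0 .. window(j) - 1.
        return max(int(g * cw_min * 2 ** min(j, m)), 1)

    @lru_cache(maxsize=None)
    def count(i, k):
        if k < 0:
            return 0
        if i == 0:
            return 1 if k < window(0) else 0
        # Sum over every admissible value j of the last draw k_{i+1}.
        return sum(count(i - 1, k - j) for j in range(window(i)))

    return count


def other_station_slot_probs(taus, a):
    """Equations (21) and (22) for the stations other than index `a`.

    Returns (p_os, p_oidle, p_oc), where p_oc = 1 - p_os - p_oidle is the
    probability that the other stations collide among themselves.
    """
    others = taus[:a] + taus[a + 1:]
    p_oidle = prod(1.0 - t for t in others)
    p_os = sum(
        t * prod(1.0 - u for j, u in enumerate(others) if j != i)
        for i, t in enumerate(others)
    )
    return p_os, p_oidle, 1.0 - p_os - p_oidle
```

For instance, with $g = 0.25$ and $CW_{\min} = 16$ the first two windows hold 4 and 8 values, so `count(1, 3)` is 4 (the pairs $(0,3), (1,2), (2,1), (3,0)$) and `count(1, 11)` is 0 because the largest possible sum is $3 + 7 = 10$.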

From Equations (21) and (22), the probability that a collision among other stations occurs in an arbitrary slot is

$$
p^a_{oc} = 1 - p^a_{os} - p^a_{oidle}. \tag{23}
$$

Thus, the probability that $s$ successful transmissions and $f - i$ collisions occur without involving $S_a$ during the period $T_a(i, k, s, f)$ can be represented as

$$
P^a_{sc}(s, f-i) = \binom{k+s+f-i}{s} \binom{k+f-i}{f-i} \left(p^a_{os}\right)^{s} \left(p^a_{oc}\right)^{f-i} \left(p^a_{oidle}\right)^{k}.
$$

Finally, the probability that $T_a = T_a(i, k, s, f)$ is

$$
P_a(i, k, s, f) = P_a(i, k)\, P^a_{sc}(s, f-i). \tag{24}
$$

Note that each $\tau_a$ can be computed using the stochastic model discussed in Section III-A. We conclude this section with the following main result: the probability that $T_a = k\delta + f T_c + (s+1) T_s$ is $P_a(i, k, s, f) = P_a(i, k)\, P^a_{sc}(s, f-i)$.

C. Numerical and Simulation Results

We have developed software in Java that simulates the 802.11 DCF protocol. Many simulators, such as ns-2 [15], are widely used in the research community. However, we find that protocol parameters are easier to modify in our in-house simulator (for example, adding greedy factors to the DCF protocol), and that it runs faster because it considers only MAC layer behavior. In this section, we compare the numerical results of the model with simulation results to validate our model. Table I lists the parameter values used for the numerical and simulation results.

TABLE I
CALCULATION AND SIMULATION PARAMETERS

Parameter               Value
Payload Size            8184 bits
MAC Header              272 bits
PHY Header              128 bits
ACK Frame               112 bits + PHY header
RTS Frame               160 bits + PHY header
CTS Frame               112 bits + PHY header
Data Rate               1 Mbps
Slot Time               50 µs
SIFS                    28 µs
DIFS                    128 µs
CW min                  16
CW max                  1024
Max # of Retransmits    7

TABLE II
τ AND p IN A NETWORK WITH 10 NORMAL STATIONS

Greedy factor   Quantity   Modeling   Simulation
g = 1.0         τ          0.0525     0.0565
                p          0.3844     0.3651

TABLE III
τ AND p IN A NETWORK WITH 7 NORMAL AND 3 CHEATING STATIONS

Greedy factor   Quantity   Modeling   Simulation
g = 0.25        τ          0.2358     .273
                p          0.3269     0.2763
g = 0.5         τ          0.0808     0.0968
                p          0.4404     0.3982
g = 0.75        τ          0.0502     0.0594
                p          0.4584     0.4217
g = 1           τ          0.0365     0.0425
                p          0.4661     .434
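The authors' Java simulator is not reproduced in the paper. As a much simpler stand-in, the sketch below simulates saturated stations at the level of abstract backoff slots, in the spirit of the Markov model of Section III-A. It is an assumption-laden illustration rather than a reimplementation: frame durations, interframe spaces, and the retransmission limit are ignored, every station decrements its counter once per slot, and all names and defaults are hypothetical.

```python
import random


def simulate_dcf(factors, slots=200_000, cw_min=16, m=6, seed=1):
    """Slot-level simulation of saturated stations with greedy factors.

    Returns (tau, p): per-station transmission attempts per slot and the
    fraction of each station's attempts that collided.
    """
    rng = random.Random(seed)
    n = len(factors)

    def window(station, stage):
        return max(int(factors[station] * cw_min * 2 ** stage), 1)

    stage = [0] * n
    counter = [rng.randrange(window(s, 0)) for s in range(n)]
    attempts = [0] * n
    collisions = [0] * n

    for _ in range(slots):
        senders = [s for s in range(n) if counter[s] == 0]
        collided = len(senders) > 1
        # Stations that are still counting down lose one slot.
        for s in range(n):
            if counter[s] > 0:
                counter[s] -= 1
        # Senders update their stage and draw a fresh backoff value.
        for s in senders:
            attempts[s] += 1
            if collided:
                collisions[s] += 1
                stage[s] = min(stage[s] + 1, m)
            else:
                stage[s] = 0
            counter[s] = rng.randrange(window(s, stage[s]))

    tau = [a / slots for a in attempts]
    p = [c / a if a else 0.0 for c, a in zip(collisions, attempts)]
    return tau, p
```

Running `simulate_dcf([1.0] * 10)` approximates the homogeneous network, while `simulate_dcf([1.0] * 7 + [0.25, 0.5, 0.75])` approximates the mixed one; in the latter the stations with smaller greedy factors transmit far more often per slot.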
We first consider a network with 10 normal stations. Table II shows the values of $\tau$ and $p$ computed from Equation (15) and obtained from simulations. Figure 2 shows the numerical and simulation results for the packet inter-arrival time distribution at each individual station. In extensive simulations, we observed that all 10 stations have almost identical inter-arrival time distributions.

Next, we consider a network with 7 stations observing the rule (i.e., with greedy factor 1.0) and 3 stations cheating on the backoff rule with greedy factors 0.25, 0.5, and 0.75. Equation (15) then yields 8 unknowns, as shown in Equation (25), where station $S_i$, for $0 \le i \le 3$, has greedy factor 1.0, 0.25, 0.5, and 0.75, respectively. Table III shows these 8 values, computed from Equation (15) using a numerical method.

$$
\begin{aligned}
\tau_0 &= \frac{2(1-2p_0)}{(1-2p_0)(W_0+1) + p_0 W_0 (1-(2p_0)^m)} \\
\tau_1 &= \frac{2(1-2p_1)}{(1-2p_1)(0.25 W_0+1) + p_1 (0.25 W_0)(1-(2p_1)^m)} \\
\tau_2 &= \frac{2(1-2p_2)}{(1-2p_2)(0.5 W_0+1) + p_2 (0.5 W_0)(1-(2p_2)^m)} \\
\tau_3 &= \frac{2(1-2p_3)}{(1-2p_3)(0.75 W_0+1) + p_3 (0.75 W_0)(1-(2p_3)^m)} \\
p_0 &= 1 - (1-\tau_1)(1-\tau_2)(1-\tau_3)(1-\tau_0)^6 \\
p_1 &= 1 - (1-\tau_2)(1-\tau_3)(1-\tau_0)^7 \\
p_2 &= 1 - (1-\tau_1)(1-\tau_3)(1-\tau_0)^7 \\
p_3 &= 1 - (1-\tau_1)(1-\tau_2)(1-\tau_0)^7
\end{aligned} \tag{25}
$$

Figure 3 shows the numerical and simulation results for the packet inter-arrival time distribution at the normal and malicious stations. Again, the simulation results of all 7 normal stations are almost identical, so we show the results of only one such station.

Several important observations can be made from our analytical and simulation results. With regard to the packet inter-arrival time, each graph (from modeling or simulation) shows several peaks. The first peak corresponds to the case in which no other station has a packet successfully transmitted, i.e., $T(i, k, s, f)$ with $s = 0$. The $(s+1)$th peak in (a) or (b) of the figure corresponds to the case in which $s$ packets have been successfully transmitted by other stations.
Stations cheating on the backoff rule achieve higher throughput because their inter-arrival time distributions place more probability on small values. We therefore conclude that monitoring the packet inter-arrival times at each station provides significant information for detecting such stations. In the following section, we discuss how the results of this section lead to a scheme for detecting cheating stations.
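The peak structure follows from the factor $P^a_{sc}(s, f-i)$ in Equation (24), whose two binomial coefficients multiply out to a multinomial coefficient over idle slots, successes, and collisions of the other stations. A minimal sketch of this factor is given below; the function name and the argument `c` (standing for $f - i$) are hypothetical.

```python
from math import comb


def p_sc(s, c, k, p_os, p_oc, p_oidle):
    """P_sc(s, f - i) with c = f - i and k idle (backoff) slots.

    comb(k+s+c, s) * comb(k+c, c) equals (k+s+c)! / (s! c! k!), i.e. the
    number of orderings of s successes, c collisions and k idle slots.
    """
    ways = comb(k + s + c, s) * comb(k + c, c)
    return ways * p_os ** s * p_oc ** c * p_oidle ** k
```

As a consistency check, when `p_os + p_oc + p_oidle` equals 1, summing `p_sc(s, c, k, ...)` over all non-negative triples with a fixed total `s + c + k` gives 1, as expected of a multinomial distribution.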

[Fig. 2. Inter-arrival time distribution at each station in a network with 10 normal stations. Panels: (a) Modeling, (b) Simulations.]

IV. DETECTION OF CHEATING STATIONS

In this section, we develop an algorithm for detecting stations cheating on backoff numbers. Our algorithm is based on the well-known sequential probability ratio test developed by Wald [2]. In the following, Wald's work is briefly reviewed.

A. Sequential Ratio Test

Suppose we have two hypotheses, $H_1$ and $H_0$ (exactly one of which is true), and two corresponding probability density functions (pdf), $P(x \mid H_1)$ and $P(x \mid H_0)$. To decide whether $H_1$ or $H_0$ is true, we make a sequence of observations $x_1, x_2, \dots$. Given $x_1$, we calculate the ratio

$$
R(1) = \frac{P[x_1 \mid H_1]}{P[x_1 \mid H_0]}.
$$

If $R(1)$ is very large, the likelihood that $x_1$ is generated under $H_1$ is much larger than under $H_0$, so we have enough confidence to say that $H_1$ is true. On the other hand, if $R(1)$ is very small, the likelihood that $x_1$ is generated under $H_0$ is much larger than under $H_1$, and we accept that $H_0$ is true. If $R(1)$ is at neither extreme, we make an additional observation, say $x_2$, and calculate a new probability ratio by accumulating the likelihood ratios,

$$
R(2) = R(1) \cdot \frac{P[x_2 \mid H_1]}{P[x_2 \mid H_0]}.
$$

If $R(2)$ is at either extreme, we accept $H_1$ or $H_0$. Otherwise, we continue to make additional observations and calculate the next probability ratio until we can make a decision. In general,

$$
R(n) = R(n-1) \cdot \frac{P[x_n \mid H_1]}{P[x_n \mid H_0]}.
$$

Since this is a hypothesis test, wrong decisions are possible. There are two kinds of wrong decisions. The first kind of error is that we accept $H_1$ when $H_0$ is actually true. The probability of committing such an error is denoted by $\alpha$. The second kind of error is that we accept $H_0$ when $H_1$ is actually true.
The probability that we commit such an error is denoted by $\beta$. To terminate the sequential test, we must have enough confidence: $\alpha$ must be very small if we accept $H_1$, and $\beta$ must be very small if we accept $H_0$. The general approach is to specify the values of $\alpha$ and $\beta$ before the test starts. Given $\alpha$ and $\beta$, we compute two threshold values $A$ and $B$ such that, after making observations $x_1, \ldots, x_n$, (1) we accept $H_1$ and terminate the test if $R(n) \ge A$, (2) we accept $H_0$ and terminate the test if $R(n) \le B$, and (3) we make an additional observation $x_{n+1}$ and calculate the probability ratio $R(n+1)$ if $B < R(n) < A$. The two threshold values $A$ and $B$ should be chosen to guarantee that the probabilities of the two kinds of errors are no more than $\alpha$ and $\beta$, respectively.

If the sample sequence $(x_1, x_2, \ldots, x_n)$ leads to accepting $H_1$, i.e., $R(n) \ge A$, we call it a sample of type 1. If it leads to accepting $H_0$, i.e., $R(n) \le B$, we call it a sample of type 0. Suppose we terminate the test by accepting $H_1$. This means that the observed sample of type 1 is at least $A$ times as probable under $H_1$ as under $H_0$. Note that the total probability of samples of type 1 is equal to the probability that we terminate the test by accepting $H_1$, which is $1 - \beta$ under $H_1$ and $\alpha$ under $H_0$. Hence $1 - \beta \ge A\alpha$, which gives an upper bound for $A$,

$$A \le \frac{1 - \beta}{\alpha},$$

and a similar discussion gives a lower bound for $B$,

$$B \ge \frac{\beta}{1 - \alpha}.$$

It is tedious to calculate the precise values for $A$ and $B$ given $\alpha$ and $\beta$. However, Wald pointed out in [2] that by making $A$ and $B$ equal to the above upper and lower bounds, respectively, the test would provide at least the same level of precision as the test using the precise values for $A$ and $B$. In the following section, we present an algorithm to detect stations cheating on backoff numbers using the technique discussed in this section, where $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$ for given $\alpha$ and $\beta$.

B. Sequential Hypothesis Testing for Detection of Cheating Stations

As discussed in Section III-B, adversaries can achieve a significant level of throughput at the expense of other normal stations by choosing smaller greedy factors while maintaining the randomness of the selection, hence hiding their malicious behavior. In the following, based on the sequential probability ratio test, we develop an algorithm for detecting such malicious behavior.

Fig. 3. Inter-arrival time distribution at each station in a network with 7 normal stations and 3 malicious stations, with g = .75, .5, and .25, respectively: (a) Normal station (M); (b) Normal station (S); (c) Station with g = .75 (M); (d) Station with g = .75 (S); (e) Station with g = .5 (M); (f) Station with g = .5 (S); (g) Station with g = .25 (M); (h) Station with g = .25 (S).

1) Data Analysis: Our work is grounded in the packet inter-arrival time distribution and the throughput achieved at each station under various scenarios (e.g., number of active stations, greedy factor used by each station, etc.). We note that if the network is not saturated, the impact of cheating on backoff numbers may be negligible; hence, we are only interested in the situation where the network is saturated. In both the simulation and our analytical model, the packet inter-arrival times are discrete, but in real situations this may not be the case for reasons such as signal dissipation during transmission and variable packet lengths. So, we simplify the expressions of the distributions as $P(t_1 \le T < t_2)$. In other words, we divide the time scale into smaller intervals, $[t_0, t_1), [t_1, t_2), \ldots, [t_k, t_{k+1})$, and we calculate $P(t_i \le T < t_{i+1})$ for each station, where $P(t_i \le T < t_{i+1})$ denotes the probability that the packet inter-arrival time lies between $t_i$ and $t_{i+1}$.

What, then, are reasonable time intervals that do not lose any important characteristics of the inter-arrival time? Fortunately, the distributions calculated by both our model and simulations show an interesting property: burstiness. In Figure 2, for example, the first major peak corresponds to a set of inter-arrival times during which no other station makes a successful transmission. Similarly, the second major peak corresponds to a set of inter-arrival times during which there is exactly one successful transmission by some other station.
Therefore, the starting point of the second major peak is approximately $2T_s$. The remaining peaks may be interpreted similarly. Within one major peak, the probability decreases as the inter-arrival time grows and approaches zero near the start of the next major peak. On entering the next peak, the probability first becomes large and then again decreases as the inter-arrival time grows. In other words, the distribution of the inter-arrival times presents an excellent guideline for dividing the time scale. Therefore, we divide the inter-arrival time into $[0, T_s), [T_s, 2T_s), \ldots, [kT_s, (k+1)T_s)$.

To execute the sequential probability ratio test, we have to know the distributions of the inter-arrival times under $H_1$ and under $H_0$. In other words, we have to know the exact greedy factor of the misbehaving station. Such information is not available in practice. We may not even know whether such a misbehaving station exists. To complicate matters further, there may be multiple misbehaving stations with different greedy factors. We first tackle the case of a single misbehaving station and then move to the case of multiple misbehaving stations.

2) Single Cheating Station: We start with the simplest scenario: there is only one cheating station and its greedy factor $g_0 < 1$ is known. The problem is then to find out which station is the cheating station. Note that the greedy factor of any normal station is 1. The two hypotheses can then be expressed as $H_1: g = g_0$ and $H_0: g = 1$. For the $i$th observed inter-arrival time $x_i$, if $x_i$ falls into the $j$th major peak, i.e., $jT_s \le x_i < (j+1)T_s$, we denote the probability $P[jT_s \le x_i < (j+1)T_s \mid H_1]$ as $P[Q_j \mid H_1]$, and $P[jT_s \le x_i < (j+1)T_s \mid H_0]$ as $P[Q_j \mid H_0]$.
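To make the binning concrete, the following is a minimal Python sketch of how the test of Section IV-A could operate on peak indices $j$ rather than on raw inter-arrival times. It is an illustration under stated assumptions, not the authors' implementation: the function names (`peak_index`, `estimate_peak_probs`, `sprt`) are hypothetical, inter-arrival times beyond the last tabulated peak are lumped into the last bin, a small probability floor is applied so that no bin has zero probability, and the ratio is accumulated in the log domain (equivalent to the product $R(n)$, but less prone to underflow).

```python
import math
from collections import Counter


def peak_index(x, t_s):
    """Return j such that j*t_s <= x < (j+1)*t_s (x and t_s in the same unit)."""
    return int(x // t_s)


def estimate_peak_probs(samples, t_s, num_peaks, floor=1e-6):
    """Empirical estimate of P[Q_j] for j = 0..num_peaks-1 from a non-empty sample.

    Assumptions of this sketch: overflow samples go to the last bin, and
    each probability is floored before renormalising to avoid zeros.
    """
    counts = Counter(min(peak_index(x, t_s), num_peaks - 1) for x in samples)
    total = len(samples)
    probs = [max(counts.get(j, 0) / total, floor) for j in range(num_peaks)]
    norm = sum(probs)
    return [p / norm for p in probs]


def sprt(observations, p_h1, p_h0, t_s, alpha, beta):
    """Wald's sequential test on binned inter-arrival times.

    p_h1[j] and p_h0[j] play the role of P[Q_j | H1] and P[Q_j | H0].
    Returns (decision, n) where decision is "H1", "H0", or None if the
    observations run out before a threshold is crossed.
    """
    log_a = math.log((1 - beta) / alpha)   # log of A = (1 - beta) / alpha
    log_b = math.log(beta / (1 - alpha))   # log of B = beta / (1 - alpha)
    last = len(p_h0) - 1
    log_ratio = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        j = min(peak_index(x, t_s), last)
        log_ratio += math.log(p_h1[j] / p_h0[j])
        if log_ratio >= log_a:
            return "H1", n
        if log_ratio <= log_b:
            return "H0", n
    return None, n
```

For instance, with $\alpha = \beta = 0.01$ the thresholds are $A = 99$ and $B = 1/99$, so each observation in a peak that is more likely under $H_1$ moves the accumulated ratio toward $A$, and each observation in a peak that is more likely under $H_0$ moves it toward $B$.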
Note that the values of $P[Q_j \mid H_0]$ and $P[Q_j \mid H_1]$ for each $j$ are available from real experiments. Given the desired values of $\alpha$ and $\beta$, let $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$. Our algorithm assumes that the packet inter-arrival times $x_i$ of each suspicious station are monitored. The algorithm is described below.

Algorithm
1: $i = 1$; $P_r = 1$.
2: Make the $i$th observation, calculate the inter-arrival time $x_i$, and find the $j$th peak into which $x_i$ falls.
3: $P_r = P_r \cdot P[Q_j \mid H_1] / P[Q_j \mid H_0]$.
4: If $B < P_r < A$, set $i \leftarrow i + 1$ and go to step 2.
5: If $P_r \ge A$, return $H_1$ and terminate the algorithm.
6: If $P_r \le B$, return $H_0$ and terminate the algorithm.

Now, we move to a more complicated case: we know there exists only one misbehaving station, but we do not know its greedy factor. For a suspicious station, we need to test the hypothesis $H_1: g < 1$ against the hypothesis $H_0: g = 1$. We first observe the following fact. If the only misbehaving station has $g < g_0$, then by applying the test $H_1: g = g_0$ against $H_0: g = 1$, the probabilities that we commit the first and second kind of error are less than $\alpha$ and $\beta$, respectively. The reason is as follows. Consider two scenarios: (1) $n - 1$ normal stations coexist with one cheating station with $g = g_0$; (2) $n - 1$ normal stations coexist with one cheating station with $g < g_0$, for a given value $g_0$. As discussed before, in scenario (1), by applying the hypothesis test $H_1: g = g_0$ against $H_0: g = 1$, the probabilities that we commit the two kinds of errors are nearly $\alpha$ and $\beta$, respectively.

Compared with the cheating station in scenario (1), the cheating station in scenario (2) misbehaves more aggressively: the station with $g < g_0$ tends to send its packets after waiting for a shorter period. Hence, it has more packets falling into the first several peaks, and fewer packets falling into the peaks that correspond to large inter-arrival times. When we apply the hypothesis test $H_1: g = g_0$ against $H_0: g = 1$ to the station with $g < g_0$, the algorithm speeds up and tends to terminate by accepting $H_1$. In other words, the probability of the second kind of error is smaller than when applying the same test to a station with $g = g_0$, where it is nearly $\beta$. In general, the probability of the second kind of error decreases as $g$ decreases in the domain $g \le g_0$.

Meanwhile, compared with the normal stations in scenario (1), the normal stations in scenario (2) look even more normal. Because of the station with $g < g_0$, the normal stations tend to send their packets after a longer period, so more of their inter-arrival times fall into the peaks that correspond to large inter-arrival times. When we apply the hypothesis test $H_1: g = g_0$ against $H_0: g = 1$ to the normal stations coexisting with a station with $g < g_0$, the algorithm is more likely to terminate by accepting $H_0$. In general, as the $g$ of the misbehaving station decreases in the domain $g \le g_0$, the probability of the first kind of error decreases.

So if we know the range of the misbehaving station's greedy factor $g$, say $g < g_0$, we can apply the hypothesis test $H_1: g = g_0$ against $H_0: g = 1$ to find the misbehaving station. We can select the value $g_0$ ourselves as long as we are sure that it is larger than the real $g$ of the misbehaving station. The value $g_0$ may be a critical point for the network performance. For example, if there exists a misbehaving station with $g = g_0 = .565$, the throughput of a normal station is observed to decrease by 1% when there are 9 normal stations. So, once we find that the throughput of a normal station has decreased by at least 1%, which means that the misbehaving station must have $g \le g_0$, we test the composite hypothesis $H_1: g \le g_0$ against $H_0: g = 1$ by running the test $H_1: g = g_0$ against $H_0: g = 1$. As discussed before, by applying this test to both misbehaving and normal stations, the probabilities of committing the two kinds of errors are no more than $\alpha$ and $\beta$, respectively.

3) Multiple Cheating Stations: Now, consider the case that among the total $n$ stations, $l$ stations are misbehaving with greedy factors $g_1, \ldots, g_l$ ($0 < g_1, \ldots, g_l < 1$), but we know neither how many cheating stations are in the network nor their greedy factors. How should we then proceed to apply the hypothesis test? As mentioned before, we can choose $g_0$ according to the throughput of a normal station in the network.
For example, we may think that 1% throughput degradation of a normal station is intolerable. For each throughput degradation $d$%, for example $d = 1$, we calculate the following parameters. If there are $n$ stations in total, we calculate the greedy factor value $g_i^d$ ($1 \le i \le n - 1$) such that if there exist $i$ misbehaving stations with the same greedy factor $g_i^d$, the throughput of a normal station decreases by $d$%. The $g_i^d$ values can be obtained in advance through experiments (in our case, through simulations). Then

$$g_1^d < g_2^d < \cdots < g_i^d < \cdots < g_{n-1}^d.$$

This is because if $i$ misbehaving stations with greedy factor $g_i^d$ cause the throughput of a normal station to decrease by $d$%, then forcing one of the remaining $n - i$ normal stations to misbehave with $g = g_i^d$ must decrease the throughput of a normal station by more than $d$%. So $g_{i+1}^d > g_i^d$.

Once we find that the throughput of a normal station has decreased by at least $d$%, we start with the hypothesis test $H_1: g = g_1^d$ against $H_0: g = 1$, assuming that there is only one misbehaving station. If there is only one misbehaving station, it must have $g \le g_1^d$ in order to cause at least $d$% throughput degradation of a normal station. By applying the hypothesis test $H_1: g = g_1^d$ against $H_0: g = 1$, we can detect the misbehaving station as discussed before. However, if there is more than one misbehaving station, each misbehaving station does not need $g$ as small as $g_1^d$ to cause $d$% throughput degradation. For example, two misbehaving stations, both with $g_2^d$ ($g_2^d > g_1^d$), can still decrease the throughput of a normal station by at least $d$%. Compared with a single cheating station with $g_1^d$, the two stations with $g_2^d$ tend to have relatively larger inter-arrival times. Thus, when we apply the test $H_1: g = g_1^d$ against $H_0: g = 1$ to the two misbehaving stations, we may not be able to detect either of them. But if we apply the test $H_1: g = g_2^d$ against $H_0: g = 1$, we can detect them using the distribution $P(x \mid g = g_2^d)$.

Now, consider the case that the two cheating stations have different greedy factors. If the throughput of a normal station is decreased by at least $d$%, at least one misbehaving station must have $g \le g_2^d$. The reason is that if both have $g > g_2^d$, the two stations are less harmful than two stations with $g = g_2^d$, so the throughput degradation of a normal station must be less than $d$%. Compared with the case that the two stations have the same $g = g_2^d$, the station with the smallest greedy factor does more harm to the network. Therefore, when we apply the hypothesis test $H_1: g = g_2^d$ against $H_0: g = 1$ to the station with the smallest greedy factor, the test tends to terminate by accepting $H_1$ more quickly, which means that the probability of the second kind of error is smaller than $\beta$.

In general, if there are $i$ misbehaving stations with different greedy factors and the throughput of a normal station is decreased by at least $d$%, the station with the smallest greedy factor must have $g \le g_i^d$. Therefore, compared with $i$ cheating stations that share the same value $g_i^d$, the station with the smallest greedy factor among the $i$ misbehaving stations tends to have smaller inter-arrival times. Thus, when applying the test $H_1: g = g_i^d$ against $H_0: g = 1$ to that station, the test tends to terminate more quickly by accepting $H_1$, and the probability that we accept the most misbehaving station as normal is even smaller than $\beta$. As for the normal stations, since their throughputs decrease by at least $d$%, they are at least as normal as the normal stations coexisting with $i$ misbehaving stations with the same $g = g_i^d$. Thus, when applying the hypothesis test $H_1: g = g_i^d$ against $H_0: g = 1$ to the normal stations, the probability that we commit the first kind of error is bounded by $\alpha$.
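The reasoning above suggests trying the thresholds $g_1^d, g_2^d, \ldots$ in increasing order until some station is flagged. The following is a minimal Python sketch of one such escalation pass, under stated assumptions: `escalate_detection` and its parameters are hypothetical names, the table of $g_k^d$ values is assumed to have been obtained in advance, and `detect` stands in for whatever per-station sequential test is used (it is passed in by the caller rather than implemented here). Disabling flagged stations and re-checking the throughput are left to the caller.

```python
def escalate_detection(stations, g_table, detect):
    """Try H1: g = g_k^d for k = 1, 2, ... until at least one station is flagged.

    stations : iterable of station identifiers (hypothetical representation)
    g_table  : list with g_table[k - 1] = g_k^d, assumed increasing in k
    detect   : callable (station, g_threshold) -> True if the test run
               against that threshold accepts H1 for the station
    Returns (k, flagged_stations) for the first k that flags anything,
    or (None, []) if no threshold flags any station.
    """
    stations = list(stations)
    for k, g_k in enumerate(g_table, start=1):
        flagged = [s for s in stations if detect(s, g_k)]
        if flagged:
            return k, flagged
    return None, []
```

Passing the per-station test in as a callable keeps the escalation logic separate from the choice of test, so the same loop can be exercised with a stub in place of a real sequential test.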
The overall idea of our approach is as follows. If we find that the throughput of a normal station is decreased by at least $d$%, we first apply the test $H_1: g = g_1^d$ against $H_0: g = 1$, assuming that there is only one misbehaving station. If we find a misbehaving station, we disable it. If not, which means that there may be multiple misbehaving stations with relatively larger greedy factors, we move to the test $H_1: g = g_2^d$ against $H_0: g = 1$, assuming that there are two misbehaving stations. If we identify some stations as misbehaving (i.e., we detect the stations with the smallest greedy factors), we disable them and check whether the throughput goes back to normal. If the throughput is still degraded by at least $d$%, we start the procedure again by applying the test $H_1: g = g_1^d$ against $H_0: g = 1$. If we cannot find any misbehaving station when applying the test $H_1: g = g_2^d$ against $H_0: g = 1$, which means that there are more than two misbehaving stations with relatively larger greedy factors, we move to the test $H_1: g = g_3^d$ against $H_0: g = 1$.

We keep repeating this procedure: if we cannot find any misbehaving station by applying the test $H_1: g = g_i^d$ against $H_0: g = 1$, we move to the test $H_1: g = g_{i+1}^d$ against $H_0: g = 1$; if we find some misbehaving stations, we disable them and check whether the throughput goes back to normal. If the throughput is still abnormal after disabling the detected stations, we start the procedure again, until all misbehaving stations are disabled. During this procedure, we can guarantee that the detected misbehaving stations have the smallest greedy factors among all misbehaving stations. For the station with the smallest greedy factor, the probability that we accept it as normal is smaller than $\beta$; for any truly normal station, the probability that we accept it as misbehaving is close to $\alpha$. A formal description of this approach is presented in the next section.

C. Sequential Algorithm

Algorithm Sequential Hypothesis Testing Algorithm
1: For a specific $d$ value, obtain a table that stores the value $g_{k,n}^d$ such that if $k$ ($1 \le k \le n - 1$) of the $n$ stations are misbehaving with the same greedy factor $g_{k,n}^d$, the throughput of a normal station is decreased by $d$%. This can be done by experimenting in a real situation (by simulating, in our case) with $k$ misbehaving stations and checking how much the throughput of a normal station is decreased.
Then, for each station, calculate the probability of the inter-arrival time $x$, $P[jT_s \le x < (j+1)T_s \mid g = g_{k,n}^d] = P[Q_j \mid g = g_{k,n}^d]$, and for one of the $n - k$ normal stations, calculate the probability of the inter-arrival time $x$, $P[jT_s \le x < (j+1)T_s \mid g = 1] = P[Q_j \mid g = 1]$, based on the monitoring of packet inter-arrival times.
2: In the network with $n$ working stations, roughly estimate whether the throughput of a normal station is decreased by at least $d$%.
3: Set $k = 1$. For each suspicious station, apply the algorithm Detect($k$, $n$, $d$). If Detect($k$, $n$, $d$) terminates by accepting some stations as misbehaving, remove the misbehaving ones from the network and go back to step 2. If Detect($k$, $n$, $d$) did not find any misbehaving station, set $k \leftarrow k + 1$ and apply Detect($k$, $n$, $d$) to each suspicious station.

Algorithm Detect($k$, $n$, $d$)
1: $i = 1$; $P_r = 1$.
2: Make the $i$th observation, calculate the inter-arrival time $x_i$, and find the $j$th peak into which $x_i$ falls. Obtain the values $P[Q_j \mid H_1] = P[Q_j \mid g = g_{k,n}^d]$ and $P[Q_j \mid H_0] = P[Q_j \mid g = 1]$.
3: $P_r = P_r \cdot P[Q_j \mid H_1] / P[Q_j \mid H_0]$.
4: If $B < P_r < A$, set $i \leftarrow i + 1$ and go back to step 2.
5: If $P_r \ge A$, return $H_1$ and terminate the algorithm.
6: If $P_r \le B$, return $H_0$ and terminate the algorithm.

D. Performance analysis

1) There is only one cheating station: Suppose $n = 10$ and there is only one cheating station. Suppose we believe that a 1% throughput degradation of a normal station is intolerable. Then we obtain $g_0 = .565$ from the simulation such that if a misbehaving station has $g \le .565$, the throughput of a normal station is decreased by at least 1%. We want to detect the station whose $g$ value is no larger than .565.

TABLE IV
APPLY THE TEST $H_1: g = .565$ AGAINST $H_0: g = 1$ IN THE NETWORK WITH ONLY ONE MISBEHAVING STATION WITH g = .565

g      α    β    # of exp.   Ave. # pkts.   # of wrongs
.565   .1   .1   1           34.4           15
.565   .1   .1   1           35.9           1
.565   .1   .1   1           49.3           12
.565   .1   .1   1           5.6            1
1      .1   .1   1           34.1           18
1      .1   .1   1           49.6           17
1      .1   .1   1           35.1           4
1      .1   .1   1           52.8           3

Test 1: Suppose the misbehaving station has g = .565.
When we find that some station is suspicious, we check the packets transmitted by this station and calculate the inter-arrival time between two successful transmissions. There are two possibilities: the suspected station does have g = .565, or a normal station is suspected of misbehaving. Table IV shows the result. In our experiments, we chose $\alpha$ and $\beta$ for different experiment settings. For each pair of $\alpha$ and $\beta$, we run the algorithm the number of times given in the "# of exp." column; in each run, we keep feeding the next inter-arrival time to the algorithm until it terminates by accepting $H_1$ (misbehaving station) or accepting $H_0$ (normal station). "Ave. # pkts." is the average number of packets the algorithm needs before it terminates, and "# of wrongs" is the total number of wrong decisions made by the algorithm over these runs. If the input is the inter-arrival time of the misbehaving station, the algorithm is expected to terminate mostly by accepting $H_1$, and the probability that we accept the station as normal is about $\beta$; if the input is the inter-arrival time of a normal station, the probability that we accept it as misbehaving is about $\alpha$. The test results shown in the remaining tables should be read in the same way. Test 2: Suppose the throughput degradation is about 1% (the same as in Test 1), but the misbehaving station actually has g = .5. We run the algorithm with the same threshold value .565 to see whether it can differentiate the misbehaving station