Distributed Event Detection under Byzantine Attack in Wireless Sensor Networks

Pengfei Zhang 1,3, Jing Yang Koh 2,3, Shaowei Lin 3, Ido Nevat 3
1. School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore
2. School for Integrative Sciences and Engineering, National University of Singapore, Singapore
3. Sense and Sense-abilities, Institute for Infocomm Research, Singapore
Email: {pzhang1@e.ntu.edu.sg, kohjy@i2r.a-star.edu.sg, lins@i2r.a-star.edu.sg, ido-nevat@i2r.a-star.edu.sg}

Abstract

We present two novel distributed event detection algorithms based on a statistical approach that tolerate Byzantine attacks, in which malicious (compromised) sensors send false sensing data to the gateway, leading to an increased false alarm rate. We study the problems of Byzantine attack function optimization and decision threshold optimization, and consider two practical cases in our algorithms. In the first case, the Channel State Information (CSI) between the event generating source and the sensors is unknown while the CSI between the sensors and the gateway is known. In the second case, the CSI between the source and the sensors as well as between the sensors and the gateway are unknown. We develop an optimal event detection decision rule under Byzantine attacks for the first case and a novel low-complexity event detection algorithm based on Gaussian approximation and Moment Matching for the second case, which considers a global decision. We evaluate our algorithms through extensive simulations. Simulation results show the Receiver Operating Characteristics (ROC) curves under different cases and scenarios, and therefore provide useful upper bounds for various centralized and distributed scheme designs. We also show that our algorithms provide superior detection performance when compared to local decision based schemes.

I. INTRODUCTION

Wireless sensor networks (WSNs) have gained popularity in recent years due to their low cost and flexibility in deployment.
They have been deployed for environmental monitoring and industrial control, where they are mainly used to monitor physical phenomena of interest such as temperature, noise, dust, and light [1]-[3]. Typically, the main purpose of the WSN is to raise alarms when events of interest, such as fires or loud noises, occur. Network managers then respond to the event by sending a specialist to physically verify and rectify it. Because it would be inefficient to regularly send a specialist to the field in response to false events, it is important to correctly identify valid events through distributed event detection.

While there has been much research on the detection of malicious users [4], the detection of events in the presence of malicious nodes is still largely unexplored. WSNs function in a distributed manner, and a large number of sensor nodes may be scattered throughout a wide area. It is therefore not cost effective to physically monitor and protect all the sensor nodes. Malfunctioning or compromised nodes may report anomalous data, and the gateway must be able to detect such anomalies. In addition, it needs to determine with high probability whether an actual event has happened. One of the malicious attacks that sensor nodes are vulnerable to is the Byzantine attack [5], in which Byzantine nodes collude to send anomalous readings in the network.

Most existing work on event detection for wireless sensor networks relies on the simplifying assumption that perfect Channel State Information (CSI) is available between the sensors and the gateway (GW). In addition, it does not consider the effect of malicious nodes on the system performance. In this paper, we develop a novel algorithm based on a statistical approach and model the Byzantine attack function in order to accurately detect the occurrence of events in the face of such attacks.
We consider two practical cases: 1) the CSI between the event source and the sensors is unknown while the CSI between the sensors and the GW is known; 2) the CSI between the source and the sensors and between the sensors and the GW are both unknown. We also formulate an optimal Byzantine attack function from the attacker's point of view. We propose two distributed algorithms for these cases. We develop the optimal event detection decision rule under Byzantine attacks for the first case and a novel low-complexity event detection algorithm based on Gaussian approximation and Moment Matching for the second case, which considers a global decision.

The rest of this paper is organized as follows: Section II presents a brief survey of closely related work. In Section III, we present our network model. In Section IV, we formulate the optimal detection decision rule and optimize the Byzantine attack function. Sections V and VI derive the decision rules for the two cases. Extensive simulation results and discussions are provided in Section VII. Finally, Section VIII concludes the paper.

II. RELATED WORK

A popular approach to identifying Byzantine attacks involves local decisions of sensors [6]. In [7], the problem of distributed detection was considered, where the sensors transmit their local decisions over perfectly known wireless channels. In [8], a theoretical performance analysis was derived for detection fusion under conditionally dependent and independent local decisions. In [9], the authors consider the problem of joint optimization of the fusion rule and local sensor thresholds for a fixed Byzantine strategy. In [10], new and optimal algorithms were proposed for distributed detection in sensor networks over fading channels with multiple receive antennas at the GW.

Other methods for detecting malicious users in wireless sensor networks have also been considered. In [11], a witness-based approach is used to determine the validity of results. In [4], a detailed survey is given on the problem of identifying anomalous data. In [12], the anomalous data is identified using Bayesian Belief Networks. However, most of these works do not provide a detailed analysis of the behavior of malicious users.

Our approach in this paper differs from the above methods in the following ways:
1) Instead of using local decisions by each sensor, each sensor transmits its raw information to the GW and the GW makes a global optimal decision based on the raw information.
2) Instead of assuming perfect CSI for the channels, we consider the case when CSI is unknown.
3) Previous works do not deal with malicious nodes in the context of event detection. In this paper, we optimize the Byzantine attack function and propose optimal decision rules to counter such attacks.

III. SENSOR NETWORK MODEL

We consider a wireless sensor network consisting of $M$ sensors which observe an event represented by a single binary parameter $\theta \in \{0, \theta\}$ over the wireless channel. The GW knows the value of $\theta$ a priori. The system model is depicted in Fig. 1. Each sensor transmits its incoming signal to the gateway (GW). The GW makes a binary centralized decision regarding the value of $\theta$.

Fig. 1. Sensor network model with $M$ sensors over wireless channels: the event $\theta = \{0, \theta\}$ reaches sensors $1, \dots, M$ through channels $F_1, \dots, F_M$ with sensor noise $V_1, \dots, V_M$; each sensor applies $T_i$ and transmits over channels $G_1, \dots, G_M$ (TDMA), with noise $W$, to the gateway. Each sensor may be compromised by a malicious (Byzantine) attack $T_B(R_m(l))$.

We now present the sensor network model:
A1 The source is present, $\theta = \theta$ ($H_1^\theta$), or absent, $\theta = 0$ ($H_0^\theta$), throughout a frame of $L$ samples.
A2 At each frame of $L$ samples, the observed signal at the $m$-th sensor ($m = 1, \dots, M$), denoted $R_m(l)$, is given by:
$$H_0^\theta: \; R_m(l) = V_m(l), \quad l = 1, \dots, L$$
$$H_1^\theta: \; R_m(l) = F_m(l)\theta + V_m(l), \quad l = 1, \dots, L,$$
where $F_m(l)$ denotes the wireless channel between the source and the $m$-th sensor and $V_m(l) \sim \mathcal{CN}(0, \sigma_V^2)$ is the i.i.d. noise at the $m$-th sensor.
A3 Each of the $M$ sensors may be either honest or Byzantine (malicious), where we assume an Independent Malicious Byzantine Attacks model [13].
In this case, each Byzantine sensor makes independent decisions based solely on its own observations. We define $P(H_B^s)$ as the prior probability that a sensor is Byzantine (malicious), given by:
$$P(H_B^s) = \frac{N}{M}, \tag{1}$$
where $N$ is the average number of malicious sensors [11].
A4 Sensor processing: each sensor processes its observations before transmitting them to the GW as follows:
$$\begin{cases} H_H^s: \; T_H(R_m(l)) & \text{(Honest)}, \\ H_B^s: \; T_B(R_m(l)) & \text{(Byzantine)}, \end{cases}$$
where $T_H: \mathbb{R} \to \mathbb{R}$ and $T_B: \mathbb{R} \to \mathbb{R}$ are the relay functions of an honest sensor and a Byzantine sensor, respectively. A common example of $T_H(R_m(l))$ is the Amplify-and-Forward function [10]. The function of the malicious sensor, $T_B(R_m(l))$, will be detailed in Section IV-A.
A5 The received signal at the GW from the $m$-th sensor over wireless channels at epoch $l$ is given by
$$y_m(l) = G_m(l)\, T_i(R_m(l)) + W(l), \quad i \in \{H, B\}. \tag{2}$$
$y_m(l)$ is obtained from the following model:
$$y_m(l) = \begin{cases} G_m(l)\, T_H(V_m(l)) + W(l), & H_H^s, H_0^\theta \\ G_m(l)\, T_H(F_m(l)\theta + V_m(l)) + W(l), & H_H^s, H_1^\theta \\ G_m(l)\, T_B(V_m(l)) + W(l), & H_B^s, H_0^\theta \\ G_m(l)\, T_B(F_m(l)\theta + V_m(l)) + W(l), & H_B^s, H_1^\theta \end{cases} \tag{3}$$
where $y_m(l)$ is the received signal at the $l$-th sample for the $m$-th sensor, $G_m(l)$ is the wireless channel between the $m$-th sensor and the GW, and $F_m(l)$ is the wireless channel between the source and the $m$-th sensor. The additive noise at the GW is $W(l) \sim \mathcal{CN}(0, \sigma_W^2)$, and $V_m(l)$ is the random additive noise at the $m$-th sensor.
A6 All wireless channels are assumed to be independent and follow a Rayleigh distribution (denoted as $\mathcal{CN}$, i.e., complex normal distribution), as follows:
$$F(l) \sim \mathcal{CN}(\bar{F}(l), \Sigma_F) \quad \text{(source-sensor links)},$$
$$G(l) \sim \mathcal{CN}(\bar{G}(l), \Sigma_G) \quad \text{(sensor-GW links)},$$
where $F(l) \in \mathbb{C}^{M \times 1}$ is the wireless channel between the source and the sensors and $G(l) \in \mathbb{C}^{M \times 1}$ is the wireless channel between the sensors and the GW. $\bar{G}(l)$ and $\bar{F}(l)$ are the channel mean values, and $\Sigma_F = \sigma_F^2 I$ and $\Sigma_G = \sigma_G^2 I$ are the covariance matrices.

IV. OPTIMAL DETECTION DECISION RULE

In this section we derive the optimal detection decision rule. The objective of the decision rule is to identify which hypothesis an observation belongs to, given a set of observations. We develop an algorithm to perform the optimal decision rule. The decision rule is a threshold test based on the likelihood ratio [14], given by:
$$\Lambda(Y(1:L)) \triangleq \frac{p(y_{1:M}(1:L) \mid H_1^\theta)}{p(y_{1:M}(1:L) \mid H_0^\theta)} \underset{H_0^\theta}{\overset{H_1^\theta}{\gtrless}} \gamma, \tag{4}$$

where the threshold $\gamma$ can be set to assure a fixed system false-alarm rate under the Neyman-Pearson approach or can be chosen to minimize the overall probability of error under the Bayesian approach [15]. We can further decompose the full marginals under each hypothesis, $p(y_{1:M}(1:L) \mid H_k^\theta)$, $k = 0, 1$, and marginalise over the sensor's state (Honest, Byzantine) as follows:
$$p(y_{1:M}(1:L) \mid H_k^\theta) = \prod_{l=1}^{L} p(y_{1:M}(l) \mid H_k^\theta) = \prod_{l=1}^{L} \prod_{m=1}^{M} p(y_m(l) \mid H_k^\theta) = \prod_{l=1}^{L} \prod_{m=1}^{M} \sum_{j \in \{H, B\}} p(y_m(l) \mid H_j^s, H_k^\theta)\, P(H_j^s). \tag{5}$$
Before we present our algorithms for calculating the optimal decision rule in (5), we need to define the Byzantine attack function $T_B(\cdot)$.

A. Byzantine attack function optimization

In this section we derive the optimal Byzantine attack function for increasing the false alarm rate, from the attacker's point of view. A compromised (malicious) sensor will transform the observation in such a way that it seems to have been generated from the opposite hypothesis, thereby fooling the GW. This means that the Byzantine function should attempt to satisfy the following:
$$T_{B,m}(R_m(l) \mid H_0^\theta) \overset{d}{=} \mathcal{CN}(\bar{F}_m(l)\theta, \; \theta^2\sigma_F^2 + \sigma_V^2)$$
$$T_{B,m}(R_m(l) \mid H_1^\theta) \overset{d}{=} \mathcal{CN}(0, \; \sigma_V^2),$$
where $\overset{d}{=}$ denotes equivalence between the distributions. To execute this strategy perfectly, the attacker would need to estimate the value of $\theta$. However, the attacker may choose to implement a linear attack function $T_{B,m}(x) = a_m x + b_m$ for the sake of simplicity. We say that the function is optimal if it minimizes the sum of the Fréchet distances [16], [17] between $T_{B,m}(R_m(l) \mid H_i^\theta)$ and the observation under the opposite hypothesis, $R_m(l) \mid H_{1-i}^\theta$, for each $i \in \{0, 1\}$. In Lemma 1, we present the Byzantine attack function.

Lemma 1. The linear Byzantine attack function is given by $T_{B,m}(x) = a_m x + b_m$, where
$$\begin{bmatrix} a_m \\ b_m \end{bmatrix} = (P^T P)^{-1} P^T Q, \quad P = \begin{bmatrix} \sigma_V & 0 \\ \sqrt{\theta^2\sigma_F^2 + \sigma_V^2} & 0 \\ 0 & 1 \\ \bar{F}_m(l)\theta & 1 \end{bmatrix}, \quad Q = \begin{bmatrix} \sqrt{\theta^2\sigma_F^2 + \sigma_V^2} \\ \sigma_V \\ \bar{F}_m(l)\theta \\ 0 \end{bmatrix}.$$

Proof. Given two normal variables, $R_m(l) \mid H_0^\theta \sim \mathcal{N}(0, \sigma_V^2)$ and $R_m(l) \mid H_1^\theta \sim \mathcal{N}(\bar{F}_m(l)\theta, \theta^2\sigma_F^2 + \sigma_V^2)$, the optimality condition on $T_{B,m}$ is equivalent to minimizing
$$\left[(a \cdot 0 + b) - \bar{F}_m(l)\theta\right]^2 + \left[(a \bar{F}_m(l)\theta + b) - 0\right]^2 + \left[a\sigma_V - \sqrt{\theta^2\sigma_F^2 + \sigma_V^2}\right]^2 + \left[a\sqrt{\theta^2\sigma_F^2 + \sigma_V^2} - \sigma_V\right]^2.$$
The first two terms match the means and the last two match the standard deviations of the transformed and target distributions. We solve this optimization problem via least squares by minimizing $\|Q - Pz\|^2$, $z = [a, b]^T$, which results in the solution $\hat{z} = (P^T P)^{-1} P^T Q$.

V. UNKNOWN CSI BETWEEN SOURCE TO SENSORS AND PERFECT CSI BETWEEN SENSORS TO GW

We derive the optimal decision rule in (4) for the case in which the GW has perfect knowledge of $G(l)$ and no knowledge of $F(l)$. This scenario is practical when a training phase for channel estimation between the sensors and the GW is available. The marginal likelihood of the system model (3), when $G(l)$ is known and $F(l)$ is random and unknown, is given by:
$$y_m(l) \mid g(l) \sim \begin{cases} \mathcal{CN}\big(\underbrace{0}_{\mu_{H0,m}(l)}, \; \underbrace{\alpha^2\sigma_V^2\, g_m(l) g_m(l)^H + \sigma_W^2}_{\Sigma_{H0,m}(l)}\big), & H_H^s, H_0^\theta \\ \mathcal{CN}\big(\underbrace{\alpha g_m(l) \bar{F}_m(l)\theta}_{\mu_{H1,m}(l)}, \; \underbrace{\alpha^2\beta(l) + \sigma_W^2}_{\Sigma_{H1,m}(l)}\big), & H_H^s, H_1^\theta \\ \mathcal{CN}\big(\underbrace{b\, g_m(l)}_{\mu_{B0,m}(l)}, \; \underbrace{a^2\sigma_V^2\, g_m(l) g_m(l)^H + \sigma_W^2}_{\Sigma_{B0,m}(l)}\big), & H_B^s, H_0^\theta \\ \mathcal{CN}\big(\underbrace{a g_m(l) \bar{F}_m(l)\theta + b\, g_m(l)}_{\mu_{B1,m}(l)}, \; \underbrace{a^2\beta(l) + \sigma_W^2}_{\Sigma_{B1,m}(l)}\big), & H_B^s, H_1^\theta \end{cases} \tag{6}$$
where $\beta(l) \triangleq (\sigma_V^2 + \theta^2\sigma_F^2)\, g_m(l) g_m(l)^H$ and $\alpha$ is the amplify-and-forward coefficient. Equation (6) gives the distribution of the system model (3), where $\mu_{Hi,m}(l)$, $\Sigma_{Hi,m}(l)$, $i = 0, 1$, refer to the mean and variance for $H_H^s, H_i^\theta$, and $\mu_{Bi,m}(l)$, $\Sigma_{Bi,m}(l)$, $i = 0, 1$, to those for $H_B^s, H_i^\theta$, respectively. By using the decomposition (5), the test statistic in (4) is given by:
$$\Lambda(Y(1:L)) = \prod_{l=1}^{L} \prod_{m=1}^{M} \frac{p(y_m(l) \mid H_1^\theta)}{p(y_m(l) \mid H_0^\theta)} = \prod_{l=1}^{L} \prod_{m=1}^{M} \frac{p(y_m(l) \mid H_H^s, H_1^\theta) P(H_H^s) + p(y_m(l) \mid H_B^s, H_1^\theta) P(H_B^s)}{p(y_m(l) \mid H_H^s, H_0^\theta) P(H_H^s) + p(y_m(l) \mid H_B^s, H_0^\theta) P(H_B^s)}, \tag{7}$$
where $P(H_B^s)$ is defined in A3, $P(H_H^s) = 1 - P(H_B^s)$, and $p(y_m(l) \mid H_j^s, H_i^\theta)$ is given by:
$$p(y_m(l) \mid H_j^s, H_i^\theta) = \frac{1}{\sqrt{2\pi\Sigma_{ji}(l)}} \exp\left(-\frac{1}{2}\,(y_m(l) - \mu_{ji}(l))^H\, \Sigma_{ji}^{-1}(l)\, (y_m(l) - \mu_{ji}(l))\right), \tag{8}$$
where $\mu_{ji}(l)$ and $\Sigma_{ji}(l)$ are given in (6).
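The moments in (6) depend on the attack coefficients $a$ and $b$ of Lemma 1, so the two steps can be illustrated together. The following is a minimal Python sketch, not the authors' implementation. It assumes a real-valued channel mean (called `f_mean` here) so that the 2x2 normal equations of the least-squares problem can be solved in closed form, and it evaluates the density in the form of (8). All function and parameter names are hypothetical, and the numerical values in the example are arbitrary.

```python
import math


def attack_coefficients(f_mean, theta, sigma_v, sigma_f):
    """Least-squares fit of the linear attack T_B(x) = a*x + b (Lemma 1).

    Assumption for illustration: f_mean (the channel mean) is real-valued.
    """
    s = math.sqrt(theta**2 * sigma_f**2 + sigma_v**2)  # std of R_m under H1
    m = f_mean * theta                                  # mean of R_m under H1
    # Normal equations (P^T P) z = P^T Q with
    # P = [[sigma_v, 0], [s, 0], [0, 1], [m, 1]] and Q = [s, sigma_v, m, 0].
    p11 = sigma_v**2 + s**2 + m**2
    p12 = m
    p22 = 2.0
    q1 = 2.0 * sigma_v * s
    q2 = m
    det = p11 * p22 - p12**2
    a = (p22 * q1 - p12 * q2) / det
    b = (p11 * q2 - p12 * q1) / det
    return a, b


def moments(g, a, b, alpha, f_mean, theta, sigma_v, sigma_f, sigma_w):
    """(mean, variance) of y_m(l) given g_m(l) for the four cases in (6)."""
    gg = abs(g) ** 2  # g_m(l) g_m(l)^H for a scalar channel
    beta = (sigma_v**2 + theta**2 * sigma_f**2) * gg
    return {
        ("H", 0): (0.0, alpha**2 * sigma_v**2 * gg + sigma_w**2),
        ("H", 1): (alpha * g * f_mean * theta, alpha**2 * beta + sigma_w**2),
        ("B", 0): (b * g, a**2 * sigma_v**2 * gg + sigma_w**2),
        ("B", 1): (a * g * f_mean * theta + b * g, a**2 * beta + sigma_w**2),
    }


def density(y, mu, var):
    """Gaussian density in the form of (8) for a scalar observation y."""
    return math.exp(-0.5 * abs(y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    # Arbitrary illustrative values, not taken from the simulations.
    a, b = attack_coefficients(f_mean=0.2, theta=1.0, sigma_v=0.5, sigma_f=1.0)
    table = moments(g=0.8 + 0.3j, a=a, b=b, alpha=1.0, f_mean=0.2,
                    theta=1.0, sigma_v=0.5, sigma_f=1.0, sigma_w=0.5)
    for case, (mu, var) in table.items():
        print(case, mu, var, density(0.1 + 0.1j, mu, var))
```

Solving the normal equations directly avoids a linear algebra dependency. A library least-squares solver applied to $P$ and $Q$ should give the same coefficients.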
We take the logarithm of (7) and get:
$$\log\{\Lambda(Y(1:L))\} = \sum_{l=1}^{L} \sum_{m=1}^{M} \Big[ \log\{p(y_m(l) \mid H_H^s, H_1^\theta) P(H_H^s) + p(y_m(l) \mid H_B^s, H_1^\theta) P(H_B^s)\} - \log\{p(y_m(l) \mid H_H^s, H_0^\theta) P(H_H^s) + p(y_m(l) \mid H_B^s, H_0^\theta) P(H_B^s)\} \Big]. \tag{9}$$
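As an illustration of how the log test statistic in (9) might be evaluated numerically, the sketch below sums the per-sensor, per-sample log mixture terms and compares the result with $\log\gamma$. It is only a sketch under stated assumptions: the log densities $\log p(y_m(l) \mid H_j^s, H_i^\theta)$ are taken as precomputed, finite inputs, the two-term mixture is evaluated with a log-sum-exp step (a numerical choice made here, not part of the paper), and all function names are hypothetical.

```python
import math


def log_mixture(log_p_honest, log_p_byz, p_byz):
    """log( p_honest * (1 - p_byz) + p_byzantine * p_byz ), computed stably."""
    terms = []
    if p_byz < 1.0:
        terms.append(log_p_honest + math.log(1.0 - p_byz))
    if p_byz > 0.0:
        terms.append(log_p_byz + math.log(p_byz))
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_likelihood_ratio(log_densities, p_byz):
    """Sum over all (l, m) pairs of the bracketed difference in (9).

    log_densities: iterable with one dict per (sample, sensor) pair, keyed by
    (state, hypothesis) with state in {"H", "B"} and hypothesis in {0, 1}.
    """
    total = 0.0
    for ld in log_densities:
        total += log_mixture(ld[("H", 1)], ld[("B", 1)], p_byz)
        total -= log_mixture(ld[("H", 0)], ld[("B", 0)], p_byz)
    return total


def decide(log_densities, p_byz, gamma):
    """Return 1 (event declared) if log Lambda exceeds log gamma, else 0."""
    return 1 if log_likelihood_ratio(log_densities, p_byz) > math.log(gamma) else 0


if __name__ == "__main__":
    # One (sample, sensor) pair with arbitrary illustrative density values.
    example = [{("H", 1): math.log(0.2), ("B", 1): math.log(0.1),
                ("H", 0): math.log(0.1), ("B", 0): math.log(0.2)}]
    print(log_likelihood_ratio(example, p_byz=0.3), decide(example, 0.3, 1.0))
```

Working in the log domain keeps the product over $L \times M$ terms in (7) from underflowing when $M$ is large.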

The false alarm probability and positive detection probability are given by $p(\Lambda(Y(1:L)) > \gamma \mid H_0^\theta)$ and $p(\Lambda(Y(1:L)) > \gamma \mid H_1^\theta)$, respectively. Deriving these probabilities involves intractable integrals which cannot be expressed in closed form. We therefore perform Monte Carlo simulations to show the effect of the parameters on these probabilities. The event detection algorithm is shown in Algorithm 1.

Algorithm 1 Event Detection Algorithm for Case I
Input: $Y_{1:M}(1:L)$, $\gamma$, $F$, $G$, $\Sigma_F$, $\Sigma_V$, $\Sigma_W$
Output: Binary decision ($H_0^\theta$, $H_1^\theta$)
1) Calculate $P(H_j^s)$ according to (1).
2) Calculate $p(y_m(l) \mid H_j^s, H_i^\theta)$ according to (8).
3) Calculate $\Lambda(Y_{1:M}(1:L))$ via (7) and compare to the threshold $\gamma$.

VI. UNKNOWN CSI BETWEEN SOURCE TO SENSORS AND UNKNOWN CSI BETWEEN SENSORS TO GW

We generalize the case in Section V by considering the scenario where neither $G(l)$ nor $F(l)$ is known. This case is applicable when there is no training phase for estimating the channels between the sensors and the GW. Consequently, the distribution of the marginal likelihood $p(y_m(l) \mid H_k^\theta)$, $k = 0, 1$, in (2) is intractable. To overcome this problem, we develop a novel low-complexity detection algorithm based on moment matching. This algorithm approximates one distribution with another distribution by matching their moments. A popular choice is to match the distribution with a normal distribution, due to its simplicity. Our approximation is based on Lemma 2.

Lemma 2.
The first two moments of $Z = XY$, where $X \sim \mathcal{N}(\bar{G}_m(l), \sigma_G^2)$ and $Y \sim \mathcal{N}(\bar{T}_i(R_m(l)), \sigma_{T_i(R_m(l))}^2)$ are independent, are given by:
$$E[Z] = E[XY] = \bar{G}_m(l)\, \bar{T}_i(R_m(l))$$
$$\mathrm{Var}(Z) = E[X^2]E[Y^2] - (E[XY])^2 = \left(\sigma_G^2 + \bar{G}_m(l)^2\right)\left(\sigma_{T_i(R_m(l))}^2 + \bar{T}_i(R_m(l))^2\right) - \bar{G}_m(l)^2\, \bar{T}_i(R_m(l))^2.$$

To apply Lemma 2, we first observe that the distribution of $T_i(R_m(l))$ is given by:
$$T_i(R_m(l)) \sim \begin{cases} \mathcal{CN}\big(0, \; \underbrace{\alpha^2\sigma_V^2}_{\Sigma_0}\big), & H_H^s, H_0^\theta \\ \mathcal{CN}\big(\underbrace{\alpha\theta\bar{F}_m(l)}_{\mu_1}, \; \underbrace{\alpha^2\theta^2\sigma_F^2 + \alpha^2\sigma_V^2}_{\Sigma_1}\big), & H_H^s, H_1^\theta \\ \mathcal{CN}\big(b, \; \underbrace{a^2\sigma_V^2}_{\Sigma_2}\big), & H_B^s, H_0^\theta \\ \mathcal{CN}\big(\underbrace{a\theta\bar{F}_m(l) + b}_{\mu_3}, \; \underbrace{a^2\theta^2\sigma_F^2 + a^2\sigma_V^2}_{\Sigma_3}\big), & H_B^s, H_1^\theta \end{cases}$$
Consequently, the distribution of the received signal in (3) is approximated by normal distributions:
$$y_m(l) \overset{d}{\approx} \begin{cases} \mathcal{CN}\big(0, \; (\sigma_G^2 + \bar{G}_m(l)^2)\Sigma_0 + \sigma_W^2\big), & H_H^s, H_0^\theta \\ \mathcal{CN}\big(\bar{G}_m(l)\mu_1, \; (\sigma_G^2 + \bar{G}_m(l)^2)(\Sigma_1 + \mu_1^2) - \bar{G}_m(l)^2\mu_1^2 + \sigma_W^2\big), & H_H^s, H_1^\theta \\ \mathcal{CN}\big(b\bar{G}_m(l), \; (\sigma_G^2 + \bar{G}_m(l)^2)(\Sigma_2 + b^2) - \bar{G}_m(l)^2 b^2 + \sigma_W^2\big), & H_B^s, H_0^\theta \\ \mathcal{CN}\big(\bar{G}_m(l)\mu_3, \; (\sigma_G^2 + \bar{G}_m(l)^2)(\Sigma_3 + \mu_3^2) - \bar{G}_m(l)^2\mu_3^2 + \sigma_W^2\big), & H_B^s, H_1^\theta \end{cases}$$
where $\overset{d}{\approx}$ denotes approximation in distribution. The distribution of $y_m(l)$ has the same structure as (6). Therefore, the detection algorithm under the Gaussian approximation is implemented similarly to Algorithm 1.

VII. SIMULATION RESULTS

In this section, we present Monte Carlo simulation results for the proposed algorithms in the two CSI cases of Sections V and VI. The settings for all the simulations are as follows: the prior distribution for all the channels is Rayleigh fading, and the channels are assumed to be both spatially and temporally independent. We set the observed binary parameter $\theta$ to 1, and the results are obtained from simulations over 50,000 realizations of channel and noise for a given set of $N$, $M$, $L$, $\sigma_V^2$ and $\sigma_W^2$. In the following sections we present the detection performance of the algorithms via Receiver Operating Characteristics (ROC) for the following configuration: number of sensors, $M = 100$; ratio of malicious users, $P(H_B^s) \in \{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6\}$; number of samples, $L = 1$; noise variance, $\sigma_V^2 = \sigma_W^2 = 0.25$; and $\alpha = 1$.

A. Case I-Unknown CSI between source to sensors and Perfect CSI between Sensors to GW

Fig. 2 presents ROC results for the case of unknown CSI between the source and sensors and perfect CSI between sensors and the GW (Section V) for the optimal detection algorithm presented in Algorithm 1. The results show that as the number of malicious users increases, the performance of the algorithm decreases. This implies that the optimization of the malicious function in Section IV-A affects the performance of the algorithm, as intended. When the number of malicious users is 0, the algorithm has the highest positive detection probabilities. As the ratio of malicious users grows, it becomes increasingly difficult for the GW to determine whether the data sent from a sensor is malicious. Interestingly, for false detection rates of less than 20%, the system can tolerate up to 20% malicious users while maintaining a positive detection probability higher than 80%.

Fig. 3 presents ROC results for Case I as the mean $\bar{F}$ of the unknown channel between the source and the sensors varies. A larger mean could indicate a stronger line of sight between the source and the sensors, and a stronger line of sight leads to better performance of the algorithm. The reason is that when $\bar{F}$ increases, the difference between $H_H^s(0)$ and $H_B^s(\theta)$ also increases, making it easier for the GW to find out which hypothesis the observation belongs to.

Fig. 2. Case I - ROC performance given unknown CSI between the source and sensors and perfect CSI between sensors and the GW under different configurations of the number of malicious sensors, $P(H_B^s) \in \{0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6\}$ ($M = 100$, $L = 1$).

Fig. 3. Case I - ROC performance given unknown CSI between the source and sensors and perfect CSI between sensors and the GW under different configurations of $\bar{F} \in \{0.1 + 0.1i, \dots, 0.5 + 0.5i\}$ ($P(H_B^s) = 0.3$, $M = 100$, $L = 1$).

B. Case II-Unknown CSI between the source and sensors and Unknown CSI between sensors and the GW

Fig. 4 presents ROC results for the case of unknown CSI between the source and sensors and unknown CSI between sensors and the GW (Section VI) for the optimal detection algorithm. As in Case I, the result shows that as the number of malicious users increases, the performance of the algorithm decreases. One observation is that Case II performs worse than Case I. This is expected, as unknown channels between the sensors and the GW add uncertainty to the detection problem. Fig. 5 presents ROC results for Case II as the mean $\bar{F}$ of the unknown channel between the source and sensors varies. As in Case I, the performance of the algorithm improves as $\bar{F}$ increases.

Fig. 4. Case II - ROC performance under unknown CSI between the source and sensors and unknown CSI between sensors and the GW under different configurations of the number of malicious sensors, $P(H_B^s) \in \{0, 0.1, \dots, 0.6\}$ ($M = 100$, $L = 1$).

Fig. 5. Case II - ROC performance under unknown CSI between the source and sensors and unknown CSI between sensors and the GW under different configurations of $\bar{F} \in \{0.1 + 0.1i, \dots, 0.45 + 0.45i\}$ ($M = 100$, $P(H_B^s) = 0.3$, $L = 1$).

C. Comparison with local decision based schemes

We compare our global decision algorithm (GD) with two other local decision based schemes (LD-1, LD-2).
The result for Case I is shown in Fig. 6. In LD-1, sensors make a local decision based on the information they receive and assume that a Byzantine attack function exists. Each sensor then transmits 0 or 1 to the GW, where 1 represents $H_1^\theta$ and 0 represents $H_0^\theta$. Based on the binary decisions of all the sensors, the GW makes a final decision according to the majority rule, which means that if more than half of the sensors are shown to have observed the event, then the GW decides that the event occurred. In LD-2, sensors make a binary decision without assuming the existence of a Byzantine attack function; a malicious user may then try to send opposite binary decisions from each sensor to the GW. The GW also makes a binary decision according to the information it receives, assuming that a malicious function exists and applying the majority rule.

Fig. 6. Case I - ROC performance under unknown CSI between the source and sensors and known CSI between sensors and the GW for different schemes, $\bar{F} = 0.2 + 0.2i$ ($M = 100$, $P(H_B^s) = 0.3$, $L = 1$).

The result shows that LD-1 and LD-2 perform worse than our scheme. This is expected because in our scheme the GW makes a global optimal decision by taking into consideration information from all the sensors. LD-1 and LD-2 make local decisions at each sensor; therefore, a suboptimal solution is achieved.

VIII. CONCLUSION

In this paper, we presented a distributed event detection algorithm, based on a statistical approach, that considers the existence of malicious nodes. We developed optimal Byzantine attack functions and optimal decision rules to decide whether the event occurred. We studied the optimal decision rules for two cases. In Case I, there is perfect CSI between sensors and the gateway but unknown CSI between the source and sensors. In Case II, both the CSI between the source and sensors and the CSI between sensors and the gateway are unknown. We developed the optimal event detection decision rule under Byzantine attack for the first case and a novel low-complexity detection algorithm based on Gaussian approximation and moment matching for the second case. Through extensive simulations, we demonstrated the performance of the optimal decision rules under various scenarios. Our schemes outperformed two suboptimal algorithms which are based on local decisions.

REFERENCES

[1] R. Vullers, R. Schaijk, H. Visser, J. Penders, and C. Hoof, "Energy harvesting for autonomous wireless sensor networks," IEEE Solid-State Circuits Magazine, vol. 2, no. 2, pp. 29-38, 2010.
[2] T. Arampatzis, J. Lygeros, and S. Manesis, "A survey of applications of wireless sensors and wireless sensor networks," Proceedings of the 2005 IEEE International Symposium on Intelligent Control, pp. 719-724, 2005.
[3] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "Wireless sensor networks: a survey," Computer Networks, vol. 38, no. 4, pp. 393-422, 2002.
[4] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Computing Surveys, vol. 41, no. 3, pp. 15:1-15:58, 2009.
[5] L. Lamport, R. Shostak, and M. Pease, "The Byzantine generals problem," ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 4, no. 3, pp. 382-401, 1982.
[6] A. Rawat, P. Anand, H. Chen, and P. K. Varshney, "Collaborative spectrum sensing in the presence of Byzantine attacks in cognitive radio networks," IEEE Transactions on Signal Processing, vol. 59, no. 2, pp. 774-786, 2011.
[7] X. Zhang, H. Poor, and M. Chiang, "Optimal power allocation for distributed detection over MIMO channels in wireless sensor networks," IEEE Transactions on Signal Processing, vol. 56, no. 9, pp. 4124-4140, 2008.
[8] D. Ciuonzo, G. Romano, and P. Rossi, "Performance analysis and design of maximum ratio combining in channel-aware MIMO decision fusion," IEEE Transactions on Wireless Communications, vol. 12, no. 9, pp. 4716-4728, 2013.
[9] B. Kailkhura, S. Brahma, Y. S. Han, and P. K. Varshney, "Optimal distributed detection in the presence of Byzantines," in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 2925-2929.
[10] I. Nevat, G. W. Peters, and I. Collings, "Distributed detection in sensor networks over fading channels with multiple antennas at the fusion centre," IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 671-683, 2014.
[11] W. Du, J. Deng, Y. S. Han, and P. K. Varshney, "A witness-based approach for data fusion assurance in wireless sensor networks," in IEEE Global Telecommunications Conference (GLOBECOM '03), vol. 3, 2003, pp. 1435-1439.
[12] D. Janakiram, V. Adi Mallikarjuna Reddy, and A. Phani Kumar, "Outlier detection in wireless sensor networks using Bayesian belief networks," in First International Conference on Communication System Software and Middleware (Comsware 2006), 2006, pp. 1-6.
[13] A. S. Rawat, P. Anand, H. Chen, and P. K. Varshney, "Collaborative spectrum sensing in the presence of Byzantine attacks in cognitive radio networks," IEEE Transactions on Signal Processing, vol. 59, no. 2, pp. 774-786, 2011.
[14] H. Van Trees, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory. Wiley, New York, 1968.
[15] S. Kay, Fundamentals of Statistical Signal Processing, Volume 2: Detection Theory. Prentice Hall PTR, 1998.
[16] M. M. Fréchet, "Sur quelques points du calcul fonctionnel," Rendiconti del Circolo Matematico di Palermo (1884-1940), vol. 22, no. 1, pp. 1-72, 1906.
[17] D. Dowson and B. Landau, "The Fréchet distance between multivariate normal distributions," Journal of Multivariate Analysis, vol. 12, no. 3, pp. 450-455, 1982.

ACKNOWLEDGEMENT

This work was carried out while Zhang and Koh were postgraduate students at Nanyang Technological University and National University of Singapore, respectively, sponsored by A*STAR's Graduate Scholarship (AGS) from September 2010 and August 2013.