Generative Model for Burst Error Characterization in Narrowband Indoor Powerline Channel

Emilio Balda, Xabier Insausti, Josu Bilbao and Pedro M. Crespo

Abstract: This paper presents a generative model for burst characterization of the underlying error profiles obtained from the Narrowband Indoor Powerline Channel. Using error sequences measured from the transmission link, a generative model that produces error sequences of any length, with similar relevant statistics, is obtained.

I. INTRODUCTION

Powerline networks present an interesting alternative for no-new-wires scenarios where electrical power distribution wiring is available [1]. Unlike other communication channels, the powerline channel (PLC) cannot be modeled as an additive white Gaussian noise channel [2], [4], [5], [6]. The types of noise within this channel include periodic impulsive noise generated by zero crossings at 50/60 Hz and asynchronous impulsive noise due to switching transients in the network [2], [6]. This noise has a time-varying behavior that produces noise bursts, on a scale from µs to ms, with significant implications for powerline communications.

From recorded error sequences, generative models capable of generating binary error sequences with a similar statistical distribution can be developed. These models allow studies of the channel to be performed without having the channel available. Obtaining these models involves two steps [3]: 1) guessing a model with enough degrees of freedom to resemble the channel behavior and 2) developing methods to parametrize the model from the recorded error sequences.

A generative model provides a method for generating long error sequences with a reduced computational load when compared with the standard method of obtaining them by computer simulation of the overall communication link [3]. This advantage is of great importance for testing error-correcting codes in video and/or audio applications transmitted through the channel.
Since long transmitted sequences are required for a good quality of the received signal, a generative model substantially reduces the simulation time needed to evaluate the performance of different channel coding schemes. The model should reproduce the statistics that are relevant for the development of channel coding schemes.

In this paper we design a generative model, called hidden Markov generative model (HGM), for the Narrowband Indoor Powerline channel. It is based on hidden Markov models (HMM) and is particularly suitable for the characterization of channels with error bursts. This HGM is inspired by the one designed in [3] for the indoor radio channel. Finally, we use error profiles measured in [1] to train our model.

II. PREVIOUS DEFINITIONS

An HMM is an extension of the classical Markov Model in the sense that the observation symbols generated are probabilistic functions of the current state rather than deterministic ones¹. This means that the resulting HMM is a doubly embedded stochastic process with an underlying stochastic process that is not observable [3]. An HMM is characterized by the following parameters:

1) $K$, the number of states of the model. We denote the set of all states as $S \triangleq \{S_1, S_2, \ldots, S_K\}$. Let $q_t \in S$ be a random variable that indicates the state of the HMM at time $t \in \mathbb{N}$.

2) $D$, the number of distinct observation symbols that the model can generate. The set of all the symbols is denoted by $V \triangleq \{v_1, v_2, \ldots, v_D\}$. Let $r_t \in V$ be a random variable that indicates the observation symbol generated at time $t \in \mathbb{N}$.

3) Let $a_{ij}$, $1 \le i, j \le K$, be the probability of transition from the state $S_i$ to the state $S_j$, that is

$$a_{ij} \triangleq \Pr\{q_{t+1} = S_j \mid q_t = S_i\}, \quad 1 \le i, j \le K. \tag{1}$$

We define the state transition probability matrix $A \in \mathbb{R}^{K \times K}$ as the matrix whose element at row $i$ and column $j$ is $a_{ij}$.

4) Let $b_j(k)$, $1 \le j \le K$, $1 \le k \le D$, be the probability of getting the output observation symbol $v_k$ in state $S_j$, that is

$$b_j(k) \triangleq \Pr\{r_t = v_k \mid q_t = S_j\}, \quad 1 \le j \le K,\ 1 \le k \le D. \tag{2}$$

We define the observation symbol probability distribution matrix $B \in \mathbb{R}^{K \times D}$ as the matrix whose element at row $j$ and column $k$ is $b_j(k)$.

5) Let $\pi_i$ be the probability for the initial state to be $S_i$, that is

$$\pi_i \triangleq \Pr\{q_1 = S_i\}, \quad 1 \le i \le K. \tag{3}$$

We define the initial state distribution vector $\pi \in \mathbb{R}^{1 \times K}$ as the row vector whose element at column $i$ is $\pi_i$.

In the sequel, we will denote an HMM by the triple $\lambda = (A, B, \pi)$.

¹ In the classical Markov Model each state is associated to a single observation symbol.
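As an illustration of the definitions above, the following minimal sketch draws an observation sequence from an HMM $\lambda = (A, B, \pi)$. It is not taken from the paper: the function name `sample_hmm`, the use of nested Python lists for $A$ and $B$, and the toy parameter values are assumptions made only for this example.

```python
import random

def sample_hmm(A, B, pi, length, rng=None):
    """Draw `length` observation indices (0-based) from lambda = (A, B, pi).

    A[i][j] plays the role of a_ij, B[j][k] the role of b_j(k),
    and pi[i] the role of pi_i. All names are illustrative.
    """
    rng = rng or random.Random()
    states = range(len(A))
    symbols = range(len(B[0]))
    q = rng.choices(states, weights=pi)[0]           # initial state q_1 ~ pi
    observations = []
    for _ in range(length):
        # the symbol depends only on the current (hidden) state
        observations.append(rng.choices(symbols, weights=B[q])[0])
        # the next state depends only on the current state
        q = rng.choices(states, weights=A[q])[0]
    return observations

# Toy two-state, two-symbol model (values are assumptions, not from the paper).
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[1.0, 0.0], [0.3, 0.7]]
pi = [1.0, 0.0]
seq = sample_hmm(A, B, pi, 20, random.Random(0))
```

Only the symbols in `seq` would be visible to an observer; the state path `q` stays hidden, which is the doubly stochastic structure described above.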

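A. Burst Definition

A sequence of $n$ bits $a = (a_1, \ldots, a_n) \in \{0, 1\}^n$ is sent through the PL channel, and another sequence $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_n) \in \{0, 1\}^n$ is received at the output. Let the error profile sequence $x = (x_1, \ldots, x_n) \in \{0, 1\}^n$ of the channel be

$$x = a \oplus \tilde{a} = (a_1 \oplus \tilde{a}_1, \ldots, a_n \oplus \tilde{a}_n), \tag{4}$$

where $\oplus$ denotes the mod-2 addition (i.e., the xor operator)².

Let $T \in \mathbb{N}$ be a parameter. For the sake of notation, we will denote $[x]_p^q \triangleq (x_p, \ldots, x_q)$, $1 \le p < q \le n$. A burst is defined as a sub-sequence of $x$ where there is at least one error every $T$ bits. Let $b_k^{(T)}$ be the $k$-th burst of the sequence $x$, defined as

$$b_k^{(T)} \triangleq [x]_{i_k^{(T)}}^{f_k^{(T)}}, \tag{5}$$

where

$$i_1^{(T)} = \operatorname*{argmin}_{i \in \mathbb{N}:\, i > 1} \left\{ [x]_i = 1,\ [x]_{i+1}^{i+T} \neq 0 \right\}, \tag{6}$$

$$f_1^{(T)} = \operatorname*{argmin}_{i \in \mathbb{N}:\, i > i_1^{(T)}} \left\{ [x]_i = 1,\ [x]_{i+1}^{i+T} = 0 \right\}, \tag{7}$$

$$i_k^{(T)} = \operatorname*{argmin}_{i \in \mathbb{N}:\, i > f_{k-1}^{(T)}} \left\{ [x]_i = 1,\ [x]_{i+1}^{i+T} \neq 0 \right\}, \quad k > 1, \tag{8}$$

$$f_k^{(T)} = \operatorname*{argmin}_{i \in \mathbb{N}:\, i > i_k^{(T)}} \left\{ [x]_i = 1,\ [x]_{i+1}^{i+T} = 0 \right\}, \quad k > 1. \tag{9}$$

Consequently, let

$$\left\{ b_k^{(T)} : 1 \le k \le N \right\} \tag{10}$$

be the set of all the bursts within $x$. Figure 1 shows what the $k$-th burst within $x$ looks like.

Fig. 1. The $k$-th burst within $x$ for a selected $T$.

Now, let $z_q^{(T)}$ be the $q$-th non-burst sequence, defined as

$$z_q^{(T)} \triangleq [x]_{f_q^{(T)}+1}^{i_{q+1}^{(T)}-1}, \quad 1 \le q \le N-1, \tag{11}$$

where we discard the bits before the first burst and after the last one. The set of non-burst sequences will be denoted as $A \triangleq \{ z_q^{(T)} : 1 \le q \le N-1 \}$. Consequently, the length in bits of a non-burst sequence $z_q^{(T)}$ is

$$\Lambda_q = i_{q+1}^{(T)} - f_q^{(T)} - 1. \tag{12}$$

² In other words, when $x_i$ is equal to one an error has occurred and when $x_i = 0$ it has not.

B. Partition of the Burst Set

The length in bits of the burst $b_k^{(T)}$ is

$$f_k^{(T)} - i_k^{(T)} + 1. \tag{13}$$

Let $L \in \mathbb{N}$ be a parameter. For each $b_k^{(T)}$, $1 \le k \le N$, we define a sequence $c_k^{(L,T)}$ and the length $D_k^{(L,T)}$ of that sequence as

$$D_k^{(L,T)} \triangleq \left\lceil \frac{f_k^{(T)} - i_k^{(T)} + 1}{L} \right\rceil, \tag{14}$$

$$c_k^{(L,T)} \triangleq \left( c_{k,1}, \ldots, c_{k,D_k^{(L,T)}} \right) \in \mathbb{N}^{D_k^{(L,T)}}, \tag{15}$$

where³

$$c_{k,l} \triangleq \sum_{i=L(l-1)+1}^{Ll} \left[ b_k^{(T)} \right]_i, \quad 1 \le l \le D_k^{(L,T)}. \tag{16}$$

The notation $[b_k^{(T)}]_i$ stands for the value at the position $i$ of the vector $b_k^{(T)}$. In words, $c_k^{(L,T)}$ indicates the number of errors on each block of length $L$ of the burst $b_k^{(T)}$, as shown in Figure 2. This sequence format is also known as the compact format version of $b_k^{(T)}$ [3]. Let $C \triangleq \{ c_k^{(L,T)} : 1 \le k \le N \}$ be the set of all the compact format bursts.

Fig. 2. The compact format version $c_k^{(L,T)}$ of the burst $b_k^{(T)}$. Also $\mathrm{PNE}_k = 3$.

Now we characterize each burst in compact format by its peak number of errors $\mathrm{PNE}_k \in \mathbb{N}$, that is

$$\mathrm{PNE}_k \triangleq \max_{l = 1, \ldots, D_k^{(L,T)}} c_{k,l}, \quad 1 \le k \le N. \tag{17}$$

Note that the PNE is always less than or equal to $L$.

The set $C$ will be partitioned into $M + 1$ sets $Z, W_1, W_2, \ldots, W_M$, defined as

$$Z \triangleq \left\{ c_k^{(L,T)} \in C : D_k^{(L,T)} < l_{\min}^{(0)} \right\} \tag{18}$$

and

$$W_i \triangleq \left\{ c_k^{(L,T)} \in C : D_k^{(L,T)} \ge l_{\min}^{(0)},\ \xi (i-1) < \mathrm{PNE}_k \le \xi i \right\}, \tag{19}$$

where $\xi = L/M$ and $l_{\min}^{(0)} \in \mathbb{N}$ is a parameter.

³ We assume that $b_k^{(T)}$ is zero-padded up to length $L\, D_k^{(L,T)}$.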
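The burst definition and the compact format of Sections II-A and II-B can be illustrated with a short sketch. It is only one possible reading of the definitions: indices are 0-based, the look-ahead window after an error is taken to be the next $T$ bits, bursts touching the ends of the sequence are handled without special treatment, and the compact length uses a ceiling with zero padding. The function names and the toy sequence are hypothetical.

```python
import math

def next_error(x, pos, T):
    """Index of the first error among the T bits after `pos`, or None."""
    for j in range(pos + 1, min(pos + T, len(x) - 1) + 1):
        if x[j]:
            return j
    return None

def find_bursts(x, T):
    """Return (start, end) pairs, 0-based and inclusive, of the bursts in x.

    A burst starts at an error that has another error within the next T bits
    and ends at the first error followed by T error-free bits (assumed reading).
    """
    bursts, i = [], 0
    while i < len(x):
        f = next_error(x, i, T) if x[i] else None
        if f is None:              # no error here, or an isolated error
            i += 1
            continue
        start = i
        while f is not None:       # keep extending while errors stay close
            end = f
            f = next_error(x, end, T)
        bursts.append((start, end))
        i = end + 1
    return bursts

def compact(burst, L):
    """Compact format: number of errors in each block of L bits (zero-padded)."""
    D = math.ceil(len(burst) / L)
    padded = burst + [0] * (D * L - len(burst))
    return [sum(padded[l * L:(l + 1) * L]) for l in range(D)]

# Toy error profile (not measured data).
x = [0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0]
bursts = find_bursts(x, T=3)                       # [(1, 4), (15, 16)]
compact_bursts = [compact(x[s:e + 1], L=2) for s, e in bursts]
pne = [max(c) for c in compact_bursts]             # peak number of errors
gap_lengths = [bursts[q + 1][0] - bursts[q][1] - 1 for q in range(len(bursts) - 1)]
```

In this toy sequence the isolated error at index 9 is not part of any burst, so it stays inside the non-burst sequence between the two bursts.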

III. MODELING THE ERROR PROFILES WITH THE PROPOSED HIDDEN GENERATIVE MODEL

The HGM designed in this paper is inspired by [3] and consists of concatenating sub-models of different behaviors of the channel into one Markov-like global model. This global model has different states and state transition probabilities, and each state contains a sub-model that generates a sequence of bits emulating the error profile of the corresponding set. Figure 3(a) shows the structure of the HGM.

The behavior of the bursts within sets $W_1, W_2, \ldots, W_M$ will be modeled in Section III-A using HMMs, and those sets will be called burst classes (class 1 to class $M$). Furthermore, the behavior of the bursts in set $Z$ will be modeled in Section III-B, and the behavior of the non-burst sequences in set $A$ will be modeled in Section III-C. Finally, the transition probabilities of the HGM $P_{A,1}, P_{A,2}, \ldots, P_{A,M}, P_{A,Z}$ shown in Figure 3(a) are computed as

$$P_{A,m} = \frac{|W_m|}{|C|}, \quad 1 \le m \le M, \qquad P_{A,Z} = \frac{|Z|}{|C|},$$

where $|\cdot|$ denotes the cardinal of a set.

Fig. 3. (a) Concatenation of different HMM sub-models. (b) Example of left-right HMM with five states and maximum step size $d$ of two.

A. Modeling the behavior of bursts within the classes $W_1, \ldots, W_M$

A class $W_m$, $1 \le m \le M$, will be characterized by the following statistics:

1) $\mathrm{PNE}_m$: the mean PNE value, that is

$$\mathrm{PNE}_m \triangleq \frac{1}{|W_m|} \sum_{k:\, c_k^{(L,T)} \in W_m} \mathrm{PNE}_k. \tag{20}$$

2) $D_m$: the mean $D_k^{(L,T)}$ value, that is

$$D_m \triangleq \frac{1}{|W_m|} \sum_{k:\, c_k^{(L,T)} \in W_m} D_k^{(L,T)}. \tag{21}$$

3) $E_m$: the mean number of errors inside a burst, that is

$$E_m \triangleq \frac{1}{|W_m|} \sum_{k:\, c_k^{(L,T)} \in W_m} \ \sum_{l=1}^{D_k^{(L,T)}} c_{k,l}. \tag{22}$$

Let a left-right HMM with maximum step size $d \in \mathbb{N}$ be an HMM $\lambda = (A, B, \pi)$ where:

1) The state transition probability matrix $A \in \mathbb{R}^{K \times K}$ is upper triangular, that is, $a_{ij} = 0$ for $j < i$.

2) Sequences have finite length. The symbol generated from the last state $S_K$ is the last symbol of the sequence. In other words, a left-right model continues generating symbols until the last state is reached.

3) $d \in \mathbb{N}$ is the maximum step size of the left-right model (i.e., transitions are possible from the current state $S_i$ to at most $d$ units forward, $S_{i+d}$), that is, $a_{ij} = 0$ for $j > i + d$.

An example of a left-right HMM is depicted in Figure 3(b).

To model the behavior of the bursts within the class $W_m$ ($1 \le m \le M$) we will use left-right HMMs. We denote by $\lambda_m = (A_m, B_m, \pi_m)$ the left-right HMM corresponding to the class $W_m$. The first initial state distribution vector $\pi_m$ that comes to mind for a model $\lambda_m$ is $\pi_m = (1, 0, 0, \ldots, 0)$, as in [3]. Nevertheless, in order to increase the degrees of freedom of the model, we choose a $\pi_m$ identical to the first row of the state transition probability matrix $A_m$ of the corresponding sub-model⁴. Now the first state of the model can be any of the $d_m + 1$ first states, $d_m$ being the maximum step size of the model $\lambda_m$.

⁴ Therefore, the sub-model $\lambda_m$ is fully characterised as $\lambda_m = (A_m, B_m)$.

Let $l_{\min}^{(m)}$ be the minimum burst length present within the class $W_m$, that is

$$l_{\min}^{(m)} \triangleq \min \left\{ D_k^{(L,T)} : c_k^{(L,T)} \in W_m \right\}, \quad 1 \le m \le M. \tag{23}$$

A model $\lambda_m$ corresponding to the class $W_m$ should:

1) be able to generate bursts with length equal to or higher than $l_{\min}^{(m)}$,

2) not be able to generate bursts with length lower than $l_{\min}^{(m)}$.

The possible number of states (given a $d_m$ value) for this model $\lambda_m$ depends on these two restrictions. Let $N_m$ be the number of states of the model $\lambda_m$. If $N_m$ is too large, the model $\lambda_m$ will not fulfill the first condition. Likewise, if $N_m$ is too small, the model $\lambda_m$ will not fulfill the second condition. Therefore, $N_m$ should lie between two finite values:

$$2 + d_m\, l_{\min}^{(m)} \le N_m \le 1 + d_m \left( l_{\min}^{(m)} + 1 \right). \tag{24}$$

To obtain the values for $A_m$ and $B_m$ of $\lambda_m$ we train this model using the well-known Baum-Welch algorithm (reference). This algorithm needs an initial estimated model $\tilde{\lambda}_m = (\tilde{A}_m, \tilde{B}_m)$ and the training burst sequences $W_m$ to generate a model $\lambda_m$ iteratively. To obtain the sought model $\lambda_m$ we design the following algorithm.

Algorithm 1:

1) Compute $l_{\min}^{(m)}$ corresponding to the class $W_m$ according to (23).

2) Assign $N_m = 2 + d_m\, l_{\min}^{(m)}$.

3) Generate an initial estimated model $\tilde{\lambda}_m = (\tilde{A}_m, \tilde{B}_m)$ where the rows of $\tilde{B}_m$ follow a uniform distribution and $\tilde{A}_m$ follows the rules of a left-right HMM. The non-zero values within the rows of $\tilde{A}_m$ also follow a uniform distribution.

4) Use the Baum-Welch algorithm with input parameter $\tilde{\lambda}_m$ to obtain a left-right HMM $\lambda_m$.

5) Using $\lambda_m$, generate a large number $R$ of bursts $\hat{c}_r^{(L,T)}$, $1 \le r \le R$. Let $\hat{W}_m \triangleq \{ \hat{c}_r^{(L,T)} : 1 \le r \le R \}$ be the set of these generated bursts. Compute the corresponding parameters $\widehat{\mathrm{PNE}}_m$, $\hat{D}_m$, and $\hat{E}_m$ of $\hat{W}_m$ as in (20), (21), and (22) respectively.

6) For a fixed relative error $\epsilon > 0$, we say that the left-right HMM $\lambda_m$ is good enough if all of the following hold:

$$\left| \widehat{\mathrm{PNE}}_m - \mathrm{PNE}_m \right| < \epsilon\, \mathrm{PNE}_m, \qquad \left| \hat{D}_m - D_m \right| < \epsilon\, D_m, \qquad \left| \hat{E}_m - E_m \right| < \epsilon\, E_m. \tag{25}$$

7) If the left-right HMM is not good enough, then:
   - If $N_m < 1 + \left( l_{\min}^{(m)} + 1 \right) d_m$, increase $N_m$ by one and go to 3.
   - If $N_m = 1 + \left( l_{\min}^{(m)} + 1 \right) d_m$, the class $W_m$ is partitioned into two sub-classes⁵. Therefore, increase the total number of classes $M$ by one and go to 1.

⁵ The bursts of the former class $W_m$ with length lower than $D_m$ will form one sub-class and the other ones will form the other sub-class.

Remark 1: In the event that Algorithm 1 does not converge, the allowed relative error $\epsilon$ should be increased.

B. Modeling the behavior of the bursts within $Z$

Let $l(x)$, $x \in \mathbb{N}$, be the distribution of the lengths defined in (13) associated to the bursts in $Z$, that is

$$l(x) \triangleq \frac{\left| \left\{ k : c_k^{(L,T)} \in Z,\ f_k^{(T)} - i_k^{(T)} + 1 = x \right\} \right|}{|Z|}. \tag{26}$$

Let $q(x)$, $1 \le x \le L$, be the distribution of the observation symbols $c_{k,l}$ associated to the bursts $c_k^{(L,T)} = (c_{k,1}, \ldots, c_{k,l}, \ldots, c_{k,D_k^{(L,T)}}) \in Z$, that is

$$q(x) \triangleq \frac{\left| \left\{ (k, l) : c_k^{(L,T)} \in Z,\ c_{k,l} = x \right\} \right|}{\left| \left\{ (k, l) : c_k^{(L,T)} \in Z \right\} \right|}. \tag{27}$$

The sub-model corresponding to the set $Z$ is characterized by these two distributions. Therefore, this model generates bursts of length $l$, where $l$ is distributed according to $l(x)$, and each burst has $q$ errors randomly situated every $L$ bits, where $q$ is distributed according to $q(x)$.

C. Modeling the behavior of the non-burst sequences within $A$

The first parameter of this model is the statistical distribution $r(x)$, $x \in \mathbb{N}$, of the length $\Lambda_q$ of the non-burst sequences, defined in (12), that is

$$r(x) \triangleq \frac{\left| \left\{ q : z_q^{(T)} \in A,\ \Lambda_q = x \right\} \right|}{|A|}. \tag{28}$$

In the model corresponding to the set $A$ we emulate the impulsive periodic noise generated by the zero crossings of the power line at frequency $f_c$ ($f_c$ is either 50 or 60 Hz). Let $T_s$ be the bit rate of the transmission link in bits/sec. Since there are two zero crossings every $1/f_c$ seconds, the model should introduce impulsive periodic noise every $T_s / (2 f_c)$ bits. Suppose that the training error profile sequence $x$ introduces such a periodic impulsive error with probability $P_s$. Therefore, our model introduces an error every $T_s / (2 f_c)$ bits with probability $P_s$.

D. Selection criteria of the Parameters $T$, $L$

To select a proper value for $L$ we must keep in mind that the compact format bursts should reflect how the error density varies within the burst. If the chosen $L$ is too small, most of the symbols inside the compact format bursts will take the maximum value (i.e., $L$). This produces a saturation effect that destroys the property of representing the density behavior. On the other hand, if $L$ is too large, the compact format bursts will have short lengths that will not represent the density behavior either. A large $L$ would also imply a loss of accuracy in the model due to the back conversion from compact format symbols to bits (recall that the errors inside a generated block of length $L$ are randomly situated).

Since a burst is defined as having at least one error every $T$ bits, it seems reasonable for $L$ to have a smaller value than $T$. This is because, on average, every block of $L$ bits will have at least one error.
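The back conversion from compact format symbols to bits mentioned above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the function name `expand_compact` is hypothetical, error positions are drawn uniformly without replacement inside each block, and no trimming of the padded block or alignment of the burst edges is attempted.

```python
import random

def expand_compact(c, L, rng=None):
    """Convert compact symbols back to bits.

    Each symbol c_l becomes a block of L bits containing exactly c_l errors
    at uniformly random positions (assumed convention).
    """
    rng = rng or random.Random()
    bits = []
    for errors in c:
        block = [0] * L
        for pos in rng.sample(range(L), errors):   # distinct positions
            block[pos] = 1
        bits.extend(block)
    return bits

# Toy compact burst with L = 4 (values chosen only for illustration).
bits = expand_compact([1, 3, 0], L=4, rng=random.Random(1))
block_sums = [sum(bits[i:i + 4]) for i in range(0, len(bits), 4)]
```

The number of errors per block is preserved, but their positions inside each block are not, which is why a large $L$ costs accuracy.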
Regarding $T$, since we do not want to include the noise generated by zero crossings at $f_c$ Hz inside the bursts, the value of $T$ should verify that

$$T < \frac{T_s}{2 f_c}. \tag{29}$$

Knowing this bound, we can iterate over candidate values of $T$ or guess a reasonable value.

IV. SIMULATION RESULTS

The error profile sequence $x$ of length $n = 1.7 \times 10^7$ bits used to train the HGM was measured in [1], where the transmission was done at a bit rate $T_s = 4800$ bits/sec in a powerline with zero crossing frequency $f_c = 50$ Hz⁶.

⁶ Therefore, as shown in (29), $T$ should be lower than 48 bits.
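The bound in footnote 6 and the periodic-noise rule of Section III-C can be checked numerically with a small sketch. The function names, the phase of the first zero crossing, and the example value of $P_s$ are assumptions; the paper does not state a value for $P_s$ here.

```python
import random

def zero_crossing_period(bit_rate, f_c):
    """Number of bits between consecutive zero crossings: T_s / (2 f_c)."""
    return bit_rate / (2 * f_c)

def non_burst_sequence(length, period, p_s, rng=None):
    """Error-free sequence with an error every `period` bits with probability p_s.

    The position of the first zero crossing (end of the first period) is an
    assumption made for this illustration.
    """
    rng = rng or random.Random()
    z = [0] * length
    for pos in range(period - 1, length, period):
        if rng.random() < p_s:
            z[pos] = 1
    return z

period = zero_crossing_period(4800, 50)          # 48.0 bits, so T must be < 48
z = non_burst_sequence(200, int(period), p_s=0.5, rng=random.Random(0))
```

With $T_s = 4800$ bits/sec and $f_c = 50$ Hz the period is 48 bits, which matches the bound stated in footnote 6.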

A. Parameters Selection

The maximum step size of all the left-right HMMs considered in our HGM is $d_m = 3$ for $1 \le m \le M$ (as in [3]) to avoid increasing the complexity of the model. We choose the largest $T$ such that the length in bits of the longest burst does not exceed 48 bits. In our case this leads to $T = 25$. Iterating over different values of $L$, the best results are obtained for $L = 9$. Figure 4 shows the distributions of the lengths and the PNE values obtained for the selected $L$ and $T$.

Since we have small burst lengths, $l_{\min}^{(0)} = 2$ (the lowest possible value for $l_{\min}^{(0)}$ such that $Z$ is not empty) is selected. Also, given the small PNE values (when compared to $L$), $M = L$ is chosen to produce the maximum number of classes. Consequently, $\xi = 1$. A relative error tolerance $\epsilon = 0.1$ is chosen to train the HGM.

B. Obtained HGM

The obtained HGM has five sub-models. These sub-models correspond to sets $A$, $Z$, $W_1$, $W_2$ and $W_3$, and the transition probabilities are shown in Table I.

TABLE I
HGM TRANSITION PROBABILITIES

$P_{A,Z}$    0.9937
$P_{A,1}$    0.0040
$P_{A,2}$    0.0021
$P_{A,3}$    0.0002

Figure 5 shows the distributions $l(x)$ and $q(x)$ that characterize the sub-model corresponding to the set $Z$, computed according to (26) and (27) respectively. Table II shows the characteristics of the obtained sub-models corresponding to the classes $W_1$, $W_2$, and $W_3$.

TABLE II
CHARACTERISTICS OF THE SUB-MODELS CORRESPONDING TO SETS $W_m$

$m$    $N_m$    $l_{\min}^{(0)}$    $\mathrm{PNE}_m$    $D_m$     $\hat{E}_m$
1      7        2                   1                   2.0884    2
2      7        2                   2                   2.1568    3.08
3      7        2                   3                   2         4

C. Comparison Between Relevant Statistics

For the comparison between the obtained HGM and the real measurements we select statistics that are considered relevant in [3] and [1]. The selected statistics are as follows:

1) Error-free interval distribution $\Pr(0^m \mid 1)$: the probability of having at least $m$ consecutive error-free bits after an error (shown in Figure 7).

2) Error cluster distribution $\Pr(1^m \mid 0)$: the probability of having exactly $m$ consecutive errors after an error-free bit (shown in Figure 8).

3) $P(v, k)$ distribution: the probability that a block of $v$ bits contains exactly $k$ errors. In Figure 6, $P(v, k)$ is displayed for block lengths of 50, 100, 200, and 400.

4) Bit error interarrival period distribution $\mathrm{BEIP}(m)$: the probability of having exactly $m$ consecutive error-free bits between one error and the next one (shown in Figure 9).

Fig. 7. Error-free interval distribution $\Pr(0^m \mid 1)$.

Fig. 8. Normalized cluster distribution $\Pr(1^m \mid 0)$.

Fig. 9. Bit error interarrival period distribution $\mathrm{BEIP}(m)$.
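The four statistics listed above can be estimated from any binary error sequence, measured or generated. The sketch below is one straightforward way to do so and is not the authors' code: function names are hypothetical, blocks for $P(v, k)$ are taken as non-overlapping, every run of errors is counted as a cluster, and the error-free tail after the last error is ignored.

```python
from collections import Counter
from itertools import groupby

def gaps_between_errors(x):
    """Numbers of error-free bits between consecutive errors."""
    pos = [i for i, bit in enumerate(x) if bit]
    return [b - a - 1 for a, b in zip(pos, pos[1:])]

def normalize(values):
    """Relative frequency of each distinct value."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in sorted(counts.items())}

def beip(x):
    """BEIP(m): exactly m error-free bits between one error and the next."""
    return normalize(gaps_between_errors(x))

def error_free_interval(x, m):
    """Pr(0^m | 1): at least m error-free bits follow an error."""
    gaps = gaps_between_errors(x)
    return sum(g >= m for g in gaps) / len(gaps)

def cluster_distribution(x):
    """Pr(1^m | 0): relative frequency of runs of exactly m consecutive errors."""
    return normalize([len(list(run)) for bit, run in groupby(x) if bit == 1])

def block_error_distribution(x, v):
    """P(v, k): fraction of non-overlapping v-bit blocks with exactly k errors."""
    return normalize([sum(x[i:i + v]) for i in range(0, len(x) - v + 1, v)])

# Toy sequence (not measured data).
x = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
stats = (beip(x), error_free_interval(x, 2),
         cluster_distribution(x), block_error_distribution(x, 4))
```

Applying the same functions to the measured profile and to a sequence produced by the HGM gives the pairs of curves that a comparison of this kind relies on.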

Fig. 4. (a) Burst peak number of errors (PNE) distribution. (b) Burst length $D^{(L,T)}$ distribution.

Fig. 5. (a) $l(x)$ distribution. (b) $q(x)$ distribution.

REFERENCES

[1] Josu Bilbao, Aitor Calvo, and Igor Armendariz. Fast characterization method and error sequence analysis for narrowband indoor powerline channel. Proc. IEEE, year ??
[2] X. Gu. Time frequency analysis of noise generated by electrical loads in PLC. pp. 864-871, 2010.
[3] D. Javier Garcia-Frías and D. Pedro Crespo. Hidden Markov Models for Burst Error Characterization in Indoor Radio Channels. Proc. IEEE, vol. 46, no. 4, 1997.
[4] T. Yamazato, M. Katayama, and H. Okada. A mathematical model of noise in narrowband powerline communication systems. 2006.
[5] A. Dabak and M. Nassar. Cyclostationary noise modelling in narrowband powerline communication for smart grid applications. ICASSP, 2012.
[6] M. Zimmermann and K. Dostert. Analysis and Modeling of Impulsive Noise in Broad-Band Powerline Communications. vol. 44, pp. 249-258, 2002.

Fig. 6. Probability $P(v, k)$ that a block of $v$ bits contains exactly $k$ errors: (a) $v = 50$, (b) $v = 100$, (c) $v = 200$, (d) $v = 400$.