On the exact bit error probability for Viterbi decoding of convolutional codes

Irina E. Bocharova, Florian Hug, Rolf Johannesson, and Boris D. Kudryashov

Dept. of Information Systems, St. Petersburg Univ. of Information Technologies, Mechanics and Optics, St. Petersburg 197101, Russia. Email: {irina, boris}@eit.lth.se
Dept. of Electrical and Information Technology, Lund University, P. O. Box 118, SE-221 00 Lund, Sweden. Email: {florian, rolf}@eit.lth.se

Abstract. Forty years ago, Viterbi published upper bounds on both the first error event (burst error) and bit error probabilities for Viterbi decoding of convolutional codes. These bounds were derived using a signal flow chart technique for convolutional encoders. In 1995, Best et al. published a formula for the exact bit error probability for Viterbi decoding of the rate R = 1/2, memory m = 1 convolutional encoder with generator matrix G(D) = (1, 1 + D) when used to communicate over the binary symmetric channel. Their method was later extended to the rate R = 1/2, memory m = 2 generator matrix G(D) = (1 + D², 1 + D + D²) by Lentmaier et al. In this paper, we shall use a different approach to derive the exact bit error probability. We derive and solve a general matrix recurrent equation connecting the average information weights at the current and previous steps of the Viterbi decoding. A closed form expression for the exact bit error probability is given. Our general solution yields the expressions for the exact bit error probability obtained by Best et al. (m = 1) and Lentmaier et al. (m = 2) as special cases.

I. INTRODUCTION

In 1971, Viterbi [1] published a now classical upper bound on the bit error probability P_b for Viterbi decoding when convolutional codes are used to communicate over the binary symmetric channel (BSC). This bound was derived from the extended path weight enumerators obtained using a signal flow chart technique for convolutional encoders. Van de Meeberg [2] used a clever observation to tighten Viterbi's bound.
The challenging problem of deriving an expression for the exact bit error probability was first addressed by Morrissey in 1970 [3] for a suboptimum feedback decoding technique. Apparently, for the memory m = 1 convolutional encoder with generator matrix G(D) = (1, 1 + D), he obtained an expression which coincides with the Viterbi decoding bit error probability published in 1995 by Best et al. [4]. They used a more general approach based on considering a Markov chain of the so-called metric states of the Viterbi decoder [5]. Their new method looked simpler to generalize to larger memory encoders and to other channel models, which was later done in a number of papers. In particular, the extension to the memory m = 2 convolutional encoder with generator matrix G(D) = (1 + D², 1 + D + D²) was given by Lentmaier et al. [6]. We use a different approach to derive the exact bit error probability for Viterbi decoding of minimal convolutional encoders when used to communicate over the BSC. A matrix recurrent equation will be derived and solved for the average information weights at the current and previous states that are connected by the branches decided by the Viterbi decoder during the current step. In this presentation we consider, for notational convenience, only rate R = 1/2 minimal convolutional feed-forward encoders realized in controller canonical form. The extension to rate R = 1/c is trivial, and to rate R = b/c as well as to feedback encoders straightforward. Before proceeding we would like to emphasize that the bit error probability is an encoder property, not a code property. Assume that the all-zero sequence is transmitted over the BSC. Let W_t(σ) denote the weight of the information sequence corresponding to the code sequence decided by the Viterbi decoder at state σ at time t. If its initial value W_0(σ) is known, then the random process W_t(σ) is a function of the random process of the received c-tuples r_i, i = 0, 1, ..., t − 1.
Thus, the ensemble {r_i, i = 0, 1, ..., t − 1} determines the ensemble {W_i(σ), i = 1, 2, ..., t}. Our goal is to determine the mathematical expectation of the random variable W_t(σ) over this ensemble, since for minimal convolutional encoders the bit error probability can be computed as

   P_b = lim_{t→∞} E[W_t(σ = 0)] / t      (1)

assuming that this limit exists.

II. A RECURRENT EQUATION FOR THE INFORMATION WEIGHTS

Since we have chosen realizations in controller canonical form, the encoder states can be represented by the m-tuples of the inputs of the shift register, that is, σ_t = (u_{t−1} u_{t−2} ... u_{t−m}). In the sequel we usually denote these encoder states σ, σ ∈ {0, 1, ..., 2^m − 1}. During the decoding step at time t + 1 the Viterbi algorithm computes the cumulative Viterbi metric vector μ_{t+1} = (μ_{t+1}(0) μ_{t+1}(1) ... μ_{t+1}(2^m − 1)) at time t + 1 using the vector μ_t at time t and the received c-tuple r_t. In our analysis it is convenient to normalize the metrics such that the cumulative metric at the all-zero state is always zero; that is,
we subtract the value μ_t(0) from μ_t(0), μ_t(1), ..., μ_t(2^m − 1) and introduce the cumulative normalized metric vector

   φ_t = (φ_t(1) φ_t(2) ... φ_t(2^m − 1)) = (μ_t(1) − μ_t(0)  μ_t(2) − μ_t(0)  ...  μ_t(2^m − 1) − μ_t(0))      (2)

For a memory m = 1 encoder we obtain the scalar

   φ_t = φ_t(1)      (3)

while for a memory m = 2 encoder we have the vector

   φ_t = (φ_t(1) φ_t(2) φ_t(3))      (4)

[Figure 1: the trellis sections of the memory m = 1 encoder, one for each pair of a normalized cumulative metric φ_t and a received tuple r_t, with the decided branches drawn in bold and the resulting metrics φ_{t+1} indicated.]
Fig. 1. The different trellis sections for the G(D) = (1, 1 + D) generator matrix.

First we consider the rate R = 1/2, memory m = 1 minimal encoder with generator matrix G(D) = (1, 1 + D). In Fig. 1 we show the different trellis sections corresponding to the M = 5 different normalized cumulative metrics φ_t = 2, 1, 0, −1, −2 and the four different received tuples r_t = 00, 01, 10, 11. The bold branches correspond to the branches decided by the Viterbi decoder at time t + 1. When we have two branches entering the same state with the same state metric we have a tie, which we, in our analysis, resolve by coin-flipping. The normalized cumulative metric Φ_t is a 5-state Markov
chain with transition probability matrix Φ = (φ_jk), where

   φ_jk = Pr(φ_{t+1} = φ^(k) | φ_t = φ^(j))      (5)

From the trellis sections in Fig. 1 we obtain the following transition probability matrix, with rows φ^(j) and columns φ^(k) ordered 2, 1, 0, −1, −2:

   Φ = ( q²   0   2pq     0   p²
         q²   0   2pq     0   p²
         0    q   0       p   0
         pq   0   p²+q²   0   pq
         pq   0   p²+q²   0   pq )      (6)

Let p_t denote the probability distribution over the M different metric values of Φ_t, that is, φ_t ∈ {φ^(1), φ^(2), ..., φ^(M)}. The stationary distribution of the normalized cumulative metrics Φ_t is denoted p = (p_1 p_2 ... p_M) and is determined as the solution of, for example, the first M − 1 equations of (7) together with (8):

   p Φ = p      (7)

   Σ_{i=1}^{M} p_i = 1      (8)

For our m = 1 convolutional encoder we obtain, with the metric values ordered as in (6),

   p^T = 1/(1 + 3p² − 2p³) (1 − 4p + 8p² − 7p³ + 2p⁴
                            2p − 5p² + 5p³ − 2p⁴
                            2p − 3p² + 2p³
                            2p² − 3p³ + 2p⁴
                            p² + p³ − 2p⁴)      (9)

Now we return to the information weight W_t(σ). From the trellis sections in Fig. 1 it is easily seen how the information weights are transformed during one step of the Viterbi decoding. Transitions from state 0 or state 1 to state 0 decided by the Viterbi decoder without tiebreaking do not cause an increment of the information weights; we simply copy the information weight from the state at the root of the branch to the state at the terminus of the branch, since such a transition corresponds to û_t = 0. Having a transition from state 0 to state 1 decided by the Viterbi decoder without tiebreaking, we obtain the information weight at state 1 and time t + 1 by incrementing the information weight at state 0 and time t, since such a transition corresponds to û_t = 1. Similarly, coming from state 1 we obtain the information weight at state 1 and time t + 1 by incrementing the information weight at state 1 and time t. If we have tiebreaking, we use the arithmetic average of the information weights at the two states 0 and 1 at time t in our updating procedure. Now we introduce some notations for rate R = 1/c, memory m encoders. The values of the random variable W_t(σ) are distributed over the cumulative metrics φ_t according to the vector p_t.
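The metric-state chain and its stationary distribution can also be obtained programmatically. The following sketch (our own illustration, not part of the original derivation; the function names are ours) builds the transition matrix of the normalized metric φ_t = μ_t(1) − μ_t(0) for the m = 1 encoder G(D) = (1, 1 + D) by enumerating the four received tuples, and then solves (7) and (8) numerically:

```python
import numpy as np

def metric_chain(p):
    """Transition matrix of the normalized metric phi_t = mu_t(1) - mu_t(0)
    for the rate R = 1/2, memory m = 1 encoder G(D) = (1, 1 + D) on a
    BSC with crossover probability p, assuming the all-zero sequence
    is transmitted."""
    q = 1.0 - p
    # probability of each received 2-tuple when 00 is sent
    r_prob = {(0, 0): q * q, (0, 1): q * p, (1, 0): p * q, (1, 1): p * p}
    states = [2, 1, 0, -1, -2]                 # the M = 5 metric values
    idx = {s: i for i, s in enumerate(states)}
    dist = lambda r, v: (r[0] ^ v[0]) + (r[1] ^ v[1])   # Hamming distance
    Phi = np.zeros((5, 5))
    for phi in states:
        for r, pr in r_prob.items():
            # survivor metrics relative to mu_t(0) = 0, mu_t(1) = phi
            m0 = min(dist(r, (0, 0)), phi + dist(r, (0, 1)))  # into state 0
            m1 = min(dist(r, (1, 1)), phi + dist(r, (1, 0)))  # into state 1
            Phi[idx[phi], idx[m1 - m0]] += pr  # new normalized metric
    return Phi, states

def stationary(Phi):
    """Solve p Phi = p together with sum(p) = 1, cf. (7) and (8)."""
    n = Phi.shape[0]
    A = np.vstack([Phi.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]
```

For p = 0.1 the first component, the stationary probability of φ = 2, evaluates to about 0.655, in agreement with the first entry of (9).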
Let w_t be the vector of the information weights at time t, split both on the 2^m states σ_t and the M metric values φ_t; that is, we can write w_t as the following vector of M·2^m entries:

   w_t = (w_t(φ^(1), σ = 0) ... w_t(φ^(M), σ = 0)
          w_t(φ^(1), σ = 1) ... w_t(φ^(M), σ = 1)
          ...
          w_t(φ^(1), σ = 2^m − 1) ... w_t(φ^(M), σ = 2^m − 1))      (10)

The vector w_t describes the dynamics of the information weights when we proceed along the trellis. It satisfies the following recurrent equation

   w_{t+1} = w_t A + b_t
   b_{t+1} = b_t Π      (11)

where A is an (M·2^m) × (M·2^m) nonnegative matrix and Π is an (M·2^m) × (M·2^m) stochastic matrix. The matrix A is the linear part of the affine transformation of the information weights, and it can be determined from the trellis sections in Fig. 1. The vector b_t of length M·2^m describes the increments of the information weights. For simplicity, we choose the initial values

   w_0 = 0      (12)

and

   b_0 = (0_M 0_M ... 0_M  p p ... p)      (13)

where 0_M denotes the all-zero vector of length M, p is the stationary probability distribution of the normalized cumulative metrics Φ_t, and each of the two is repeated 2^{m−1} times. The following two examples illustrate how A can be obtained from the trellis sections in Fig. 1. Consider first a situation without tiebreaking; for example, the trellis section in the upper left corner, where we have φ_t = 0, φ_{t+1} = 1, and r_t = 00. Following the bold branches, we first copy with probability P(r_t = 00) = q² the information weight from state σ_t = 0 to state σ_{t+1} = 0, and obtain the information weight at σ_{t+1} = 1 as the information weight at σ_t = 1 plus 1, since û_t = 1 for this branch. We have now determined four of the entries in A, namely, the two entries for (σ_t = 0, σ_{t+1} = 0) and (σ_t = 1, σ_{t+1} = 1) with φ_t = 0 and φ_{t+1} = 1, which both are q², and the two entries for the crossing pairs (σ_t = 0, σ_{t+1} = 1) and (σ_t = 1, σ_{t+1} = 0), which both are 0. Notice that, when we determine the entry for φ_{t+1} = 1, we have to add the probabilities for the two trellis sections with φ_{t+1} = 1. Next we include tiebreaking and choose the trellis section with φ_t = 1, φ_{t+1} = 2, and r_t = 00. Here we have to resolve ties at σ_{t+1} = 1.
By following the bold branch from σ_t = 0 to σ_{t+1} = 0 we conclude that the information weight at state σ_{t+1} = 0 is a copy of the information weight at state σ_t = 0. Then we follow the two bold branches to state σ_{t+1} = 1, where the information weight is the arithmetic average of the information weights at states σ_t = 0 and σ_t = 1, plus 1. We have now determined another four entries of A, namely, the entry for σ_t = 0, φ_t = 1, φ_{t+1} = 2, and σ_{t+1} = 0, which is q²; the two entries for φ_t = 1, φ_{t+1} = 2, and σ_{t+1} = 1, which are both q²/2 (the tie is resolved by coin-flipping); and, finally, the entry for σ_t = 1, φ_t = 1, φ_{t+1} = 2, and σ_{t+1} = 0, which is 0, since there is no bold branch between σ_t = 1 and σ_{t+1} = 0 in this trellis section. Proceeding in this manner yields the matrix A (14) for the memory m = 1 convolutional encoder with generator matrix G(D) = (1, 1 + D); this matrix is specified at the bottom of this page. The second equation in (11) determines the dynamics of the information weight increments. We notice that

   Σ_{i=0}^{2^m − 1} A_ij = Φ,   j = 0, 1, ..., 2^m − 1      (15)

Moreover, we should only have increments when entering the states σ_{t+1} whose first digit is a 1. Thus, the first half of the entries of b_t are 0s for t = 0, 1, ..., and it follows that we can choose Π to be

   Π = ( Φ  0
         0  Φ )      (16)

In the next section we shall solve the recurrent matrix equation (11).

III. SOLVING THE RECURRENT EQUATION

By iterating the recurrent equation (11) and using the initial values (12) and (13) we obtain

   w_{t+1} = b_0 A^t + b_0 Π A^{t−1} + ... + b_0 Π^t      (17)

From (17) it follows that

   lim_{t→∞} w_t / t = lim_{t→∞} (1/t) Σ_{j=0}^{t−1} b_0 Π^j A^{t−1−j} = b_0 Π^∞ A^∞      (18)

where Π^∞ and A^∞ denote the limits of the sequences Π^t and A^t when t tends to infinity. We also used that, if a sequence is convergent to a finite limit, then it is Cesàro-summable to the same limit. The limit Π^∞ can be written as

   Π^∞ = diag(Φ^∞, Φ^∞, ..., Φ^∞)      (19)

where Φ^∞ is a matrix whose rows are identical and equal to the stationary distribution p. Then (18) can be written as

   lim_{t→∞} w_t / t = b̄ A^∞      (20)

where

   b̄ = b_0 Π^∞ = (0_M 0_M ... 0_M  p p ... p)      (21)

We have the following two important properties of A = (a_ij): nonnegativity, that is, a_ij ≥ 0 for 1 ≤ i, j ≤ M·2^m; and, for any convolutional encoder, a block structure A = (A_ij), i, j = 0, 1, ..., 2^m − 1, where the block A_ij corresponds to the transitions from σ_t = i to σ_{t+1} = j. Summing over the blocks columnwise yields

   Σ_{i=0}^{2^m − 1} A_ij = Φ,   j = 0, 1, ..., 2^m − 1      (22)

From (22) it follows that

   e_L = (p p ... p)      (23)
satisfies

   e_L A = e_L      (24)

and, hence, e_L is a left eigenvector of A with eigenvalue λ = 1. From the nonnegativity of A it follows (see [7, Ch. 8]) that λ = 1 is a maximal eigenvalue of A. For the memory m = 1 encoder, A is the 10 × 10 matrix

   A = ( A_00  A_01
         A_10  A_11 )      (14)

whose 5 × 5 blocks A_ij collect, for every metric pair (φ_t, φ_{t+1}), the probabilities of the decided branches from σ_t = i to σ_{t+1} = j read off from the trellis sections in Fig. 1, with tie-broken branches contributing half their probability (which yields entries such as q²/2, pq/2, and 3pq/2). Let e_R be the right eigenvector corresponding to the eigenvalue λ = 1.
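The decoding dynamics that the matrix A summarizes can also be cross-checked by simulating the decoder directly. The following sketch (our own illustration; the function names are ours) implements Viterbi decoding for the m = 1 encoder G(D) = (1, 1 + D) with coin-flip tie-breaking and estimates P_b assuming, as in the analysis above, that the all-zero sequence is transmitted:

```python
import random

def viterbi_decode(received, rng=random):
    """Viterbi decoder for the rate R = 1/2, memory m = 1 encoder
    G(D) = (1, 1 + D); metric ties are resolved by coin-flipping.
    `received` is a sequence of 2-tuples of bits."""
    metric = [0, float("inf")]   # start in state 0
    paths = [[], []]             # survivor information sequences
    for r in received:
        new_metric, new_paths = [], []
        for u in (0, 1):         # the next state equals the input bit u
            cand = []
            for s in (0, 1):     # predecessor state
                v = (u, u ^ s)   # code symbols v1 = u, v2 = u + u_{t-1}
                cand.append(metric[s] + (r[0] ^ v[0]) + (r[1] ^ v[1]))
            if cand[0] < cand[1] or (cand[0] == cand[1] and rng.random() < 0.5):
                s_win = 0
            else:
                s_win = 1
            new_metric.append(cand[s_win])
            new_paths.append(paths[s_win] + [u])
        metric, paths = new_metric, new_paths
    return paths[0] if metric[0] <= metric[1] else paths[1]

def estimate_pb(p, n=2000, seed=17):
    """Monte Carlo estimate of P_b for all-zero transmission over BSC(p)."""
    rng = random.Random(seed)
    rx = [(int(rng.random() < p), int(rng.random() < p)) for _ in range(n)]
    decoded = viterbi_decode(rx, rng)
    return sum(decoded) / n      # every decoded 1 is a bit error
```

The simulation converges slowly in t, so it serves only as a sanity check of the closed-form results, not as a replacement for them.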
Let e_R be normalized such that e_L e_R = 1. If e_L is unique up to normalization, then it follows (see [7, Ch. 8]) that

   lim_{t→∞} A^t = e_R e_L      (25)

Combining (20), (23), and (25) yields

   lim_{t→∞} w_t / t = b̄ e_R e_L = (b̄ e_R)(p p ... p)      (26)

From (26) it follows that the expression for the exact bit error probability can be written as

   P_b = lim_{t→∞} (1/t) Σ_{i=1}^{M} w_t(φ^(i), σ = 0) = lim_{t→∞} (1/t) w_t(σ = 0) 1_M^T      (27)

where 1_M is the all-one row vector of length M. In other words, to get the expression for P_b we sum up the first M components of the vector on the right side of (26), or, equivalently, we multiply this vector by the vector (1_M 0_M ... 0_M)^T. Then we obtain

   P_b = (0_M 0_M ... 0_M  p p ... p) e_R      (28)

In summary, for rate R = 1/2, memory m convolutional encoders we can determine the exact bit error probability P_b for Viterbi decoding, when communicating over the BSC, as follows: construct the set of metric states and find the stationary probability distribution p; construct the matrix A analogously to the memory m = 1 example given above and compute its right eigenvector e_R, normalized according to (p p ... p) e_R = 1; compute P_b using (28).

IV. EXAMPLES

First we consider the rate R = 1/2, memory m = 1 convolutional code with generator matrix G(D) = (1, 1 + D). Its set of metric states is {2, 1, 0, −1, −2} and the stationary probability distribution p is given by (9). From the trellis sections in Fig. 1 we obtain the matrix A (14). Computing its normalized right eigenvector e_R (29) and inserting (9) and (29) into (28) yields

   P_b = (14p² − 23p³ + 16p⁴ + 2p⁵ − 16p⁶ + 8p⁷) / (2(1 + 3p² − 2p³)(1 − p + 4p² − 4p³))      (30)

which coincides with the bit error probability formula in [4]. Next we consider the rate R = 1/2, memory m = 2 convolutional encoder with generator matrix G(D) = (1 + D², 1 + D + D²). In Fig. 2 we show four of its trellis sections, one for each received tuple r_t, together with the corresponding metric states φ_{t+1} at time t + 1.

[Figure 2: four trellis sections for the memory m = 2 encoder, one for each received tuple r_t.]
Fig. 2. Four different trellis sections of the in total 124 for the G(D) = (1 + D², 1 + D + D²) generator matrix.
Completing the set of trellis sections yields 31 different normalized metric states. Thus, we have 124 different trellis sections. The matrix A is a block matrix that consists of eight different nontrivial blocks corresponding to the eight branches in each trellis section. We obtain the 124 × 124 matrix

   A = ( A_00  0_31  A_02  0_31
         A_10  0_31  A_12  0_31
         0_31  A_21  0_31  A_23
         0_31  A_31  0_31  A_33 )      (31)

where 0_31 denotes the 31 × 31 all-zero matrix.
[Figure 3: plot of the exact bit error probability P_b versus the BSC crossover probability p for the three encoders below.]
Fig. 3. Exact bit error probability for rate R = 1/2 and memory m = 1 (G(D) = (1, 1 + D)), memory m = 2 (G(D) = (1 + D², 1 + D + D²)), and memory m = 3 (G(D) = (1 + D² + D³, 1 + D + D² + D³)).

Similarly to (16) and (21) we have

   Π = diag(Φ, Φ, Φ, Φ)      (32)

and

   b̄ = (0_31 0_31  p p)      (33)

where Φ is the 31 × 31 transition probability matrix for the 31-state Markov chain of the normalized cumulative metrics Φ_t and p is the stationary probability distribution for Φ_t. Following the method of calculating the exact bit error probability in Section III we obtain a closed-form expression for P_b, whose series expansion begins

   P_b = 44p³ + ...      (34)

which coincides with the previously obtained result by Lentmaier et al. [6]. Finally, we consider the rate R = 1/2, memory m = 3 convolutional encoder with generator matrix G(D) = (1 + D² + D³, 1 + D + D² + D³). If we consider all trellis sections we find that the number of normalized metric states is 433. Then we obtain the (433·2³) × (433·2³) matrix

   A = ( A_00  0     0     0     A_04  0     0     0
         A_10  0     0     0     A_14  0     0     0
         0     A_21  0     0     0     A_25  0     0
         0     A_31  0     0     0     A_35  0     0
         0     0     A_42  0     0     0     A_46  0
         0     0     A_52  0     0     0     A_56  0
         0     0     0     A_63  0     0     0     A_67
         0     0     0     A_73  0     0     0     A_77 )

where 0 denotes the 433 × 433 all-zero matrix. Since this example is essentially more complex, we evaluated the exact bit error probability, following the method in Section III, only numerically. The result is shown in Fig. 3 and compared with the curves for the previously discussed memory m = 1 and m = 2 encoders.

ACKNOWLEDGMENT

This work was supported in part by the Swedish Research Council under Grant 6-7-68.

REFERENCES

[1] A. J. Viterbi, "Convolutional codes and their performance in communication systems," IEEE Trans. Inf. Theory, vol. IT-19, no. 5, pp. 751-772, Oct. 1971.
[2] L.
Van de Meeberg, "A tightened upper bound on the error probability of binary convolutional codes with Viterbi decoding," IEEE Trans. Inf. Theory, vol. IT-20, no. 3, pp. 389-391, May 1974.
[3] T. N. Morrissey, Jr., "Analysis of decoders for convolutional codes by stochastic sequential machine methods," IEEE Trans. Inf. Theory, vol. IT-16, no. 4, pp. 460-469, Jul. 1970.
[4] M. R. Best, M. V. Burnashev, Y. Levy, A. Rabinovich, P. C. Fishburn, A. R. Calderbank, and D. J. Costello, Jr., "On a technique to calculate the exact performance of a convolutional code," IEEE Trans. Inf. Theory, vol. 41, no. 2, pp. 441-447, Mar. 1995.
[5] M. V. Burnashev and D. L. Kon, "Symbol error probability for convolutional codes," Problems of Information Transmission, vol. 26, no. 4, pp. 289-298, 1990.
[6] M. Lentmaier, D. V. Truhachev, and K. S. Zigangirov, "Analytic expressions for the bit error probabilities of rate-1/2 memory 2 convolutional encoders," IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1303-1311, Jun. 2004.
[7] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, Feb. 1990.