392D: Coding for the AWGN Channel                       Stanford, Winter 2007
Handout #6                                              Wednesday, January 24, 2007

Problem Set 2 Solutions

Problem 2.1 (Cartesian-product constellations)

(a) Show that if A' is the K-fold Cartesian product of A, A' = A^K, then the parameters N', log_2 |A'|, E(A') and K_min(A') of A' are K times as large as the corresponding parameters of A, whereas the normalized parameters ρ, E_s, E_b and d_min^2(A') of A' are the same as those of A. Verify that these relations hold for (M × M)-QAM constellations.

There are |A| = M possibilities for each of the K components of A', so the total number of possibilities is |A'| = M^K. The number of bits supported by A' is therefore log_2 |A'| = K log_2 M, K times the number for A. The number of dimensions of A' is N' = KN, so its nominal spectral efficiency is ρ' = 2(log_2 |A'|)/N' = 2(K log_2 M)/(KN) = 2(log_2 M)/N = ρ, the same as that of A. The average energy of A' is E(A') = E[||a_1||^2 + ||a_2||^2 + ... + ||a_K||^2] = K E(A). The average energy per bit is E_b' = E(A')/log_2 |A'| = E(A)/log_2 M = E_b, and the average energy per two dimensions is E_s' = 2E(A')/N' = 2E(A)/N = E_s; both are the same as for A.

Two distinct points in A' must differ in at least one component. The minimum squared distance between them is therefore the minimum squared distance in any single component, which is d_min^2(A). The number of nearest neighbors to a point (a_1, a_2, ..., a_K) ∈ A' is the sum of the numbers of nearest neighbors to a_1, a_2, ..., a_K, respectively, since there is a nearest neighbor in A' for each nearest neighbor to each component. The average number K_min(A') of nearest neighbors in A' is therefore the sum of the average numbers of nearest neighbors in each component, which is K_min(A') = K K_min(A).

An (M × M)-QAM constellation A' is equal to the 2-fold Cartesian product A^2, where A is an M-PAM constellation. Therefore all of the above results hold with K = 2. In particular, the QAM constellation has the same d_min^2 as the PAM constellation, but twice the average number K_min of nearest neighbors.
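As a numerical sanity check (not part of the original handout; the spacing d = 2 is an arbitrary assumption), the sketch below builds a 4-PAM constellation, forms its 2-fold Cartesian product to obtain (4 × 4)-QAM, and verifies the parameter relations by brute force:

```python
def pam(M, d=2.0):
    """Centered M-PAM constellation with spacing d: {±d/2, ±3d/2, ...}."""
    return [d * (j - (M - 1) / 2) for j in range(M)]

def params(points):
    """Average energy, minimum squared distance, average number of nearest neighbors."""
    n = len(points)
    E = sum(sum(x * x for x in p) for p in points) / n
    d2 = min(sum((x - y) ** 2 for x, y in zip(p, q))
             for p in points for q in points if p != q)
    # count, per point, the neighbors at exactly d2 (within a tolerance)
    K = sum(1 for p in points for q in points
            if p != q and abs(sum((x - y) ** 2 for x, y in zip(p, q)) - d2) < 1e-9) / n
    return E, d2, K

M = 4
A = [(x,) for x in pam(M)]              # 4-PAM, one dimension
A2 = [p + q for p in A for q in A]      # (4 x 4)-QAM = A^2

E1, d2_1, K1 = params(A)
E2, d2_2, K2 = params(A2)
assert abs(E2 - 2 * E1) < 1e-9          # E(A') = K E(A)
assert abs(d2_2 - d2_1) < 1e-9          # d_min^2 unchanged
assert abs(K2 - 2 * K1) < 1e-9          # K_min doubles
```

For 4-PAM this gives E(A) = 5, d_min^2 = 4 and K_min = 3/2 = 2(M-1)/M, and the QAM product doubles the energy and neighbor count while leaving d_min^2 fixed.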
(b) Show that if the signal constellation A' is a Cartesian product A^K, then MD detection can be performed by performing independent MD detection on each of the K components of the received KN-tuple y = (y_1, y_2, ..., y_K). Using this result, sketch the decision regions of a (4 × 4)-QAM signal set.

Given a received signal y = (y_1, y_2, ..., y_K), to minimize the squared distance ||y_1 - a_1||^2 + ||y_2 - a_2||^2 + ... + ||y_K - a_K||^2 over A', we may minimize each component ||y_j - a_j||^2 separately, since the choice of one component a_j imposes no restrictions on the choices of the other components. The MD decision regions of a (4 × 4)-QAM signal set A' = A^2 are thus simply those of a 4-PAM signal set A, applied independently in each coordinate. Such decision regions are sketched in the Problem Set 1 solutions, Problem 1.3(b).
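A minimal sketch of componentwise MD detection (illustrative; the alphabet and the received point are made-up examples, not from the handout):

```python
def md_detect_1d(y, A):
    """Minimum-distance decision for one coordinate over a 1-D alphabet A."""
    return min(A, key=lambda a: (y - a) ** 2)

def md_detect_product(y, A):
    """MD detection over the Cartesian product A^K, done componentwise."""
    return tuple(md_detect_1d(yk, A) for yk in y)

A = [-3.0, -1.0, 1.0, 3.0]            # 4-PAM alphabet; A^2 is (4 x 4)-QAM
y = (0.4, -2.6)                       # hypothetical noisy received pair

decision = md_detect_product(y, A)
assert decision == (1.0, -3.0)

# cross-check against exhaustive minimum-distance search over all of A^2
best = min(((a, b) for a in A for b in A),
           key=lambda p: (y[0] - p[0]) ** 2 + (y[1] - p[1]) ** 2)
assert best == decision
```

The exhaustive search over all M^K points and the K independent one-dimensional slicers always agree, which is the content of part (b).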
(c) Show that if Pr(E) is the probability of error for MD detection of A, then the probability of error for MD detection of A' = A^K is Pr(E') = 1 - (1 - Pr(E))^K. Show that Pr(E') ≈ K Pr(E) if Pr(E) is small.

A signal in A' is received correctly if and only if each component is received correctly. The probability of a correct decision is therefore the product of the probabilities of correct decisions for the components separately, namely (1 - Pr(E))^K. The probability of error for A' is therefore Pr(E') = 1 - (1 - Pr(E))^K. When Pr(E) is small, (1 - Pr(E))^K ≈ 1 - K Pr(E), so Pr(E') ≈ K Pr(E).

Problem 2.2 (Pr(E) invariance to translation, orthogonal transformations, or scaling)

Let Pr(E | a_j) be the probability of error when a signal a_j is selected equiprobably from an N-dimensional signal set A and transmitted over a discrete-time AWGN channel, and the channel output Y = a_j + N is mapped to a signal â_j ∈ A by a minimum-distance decision rule. An error event E occurs if â_j ≠ a_j. Pr(E) denotes the average error probability.

(a) Show that the probabilities of error Pr(E | a_j) are unchanged if A is translated by any vector v; i.e., the constellation A' = A + v has the same Pr(E) as A.

If all signals are equiprobable and the noise is iid Gaussian, then the optimum detector is a minimum-distance (MD) detector. In this case the received sequence Y' = a_j + v + N may be mapped reversibly to Y = Y' - v = a_j + N, and then an MD detector for A based on Y is equivalent to an MD detector for A' based on Y'. In particular, it has the same probabilities of error Pr(E | a_j).

(b) Show that Pr(E) is invariant under orthogonal transformations; i.e., A' = UA has the same Pr(E) as A when U is any orthogonal N × N matrix (i.e., U^{-1} = U^T).

In this case the received sequence Y' = Ua_j + N may be mapped reversibly to Y = U^{-1}Y' = a_j + U^{-1}N.
Since the noise distribution depends only on the squared norm ||n||^2, which is not affected by orthogonal transformations, the noise sequence N' = U^{-1}N has the same distribution as N, so again the probability of error Pr(E) is unaffected.

(c) Show that Pr(E) is unchanged if both A and N are scaled by α > 0.

In this case the received sequence Y' = α a_j + α N may be mapped reversibly to Y = α^{-1} Y' = a_j + N, which again reduces the model to the original scenario.

Problem 2.3 (Optimality of zero-mean constellations)

Consider an arbitrary signal set A = {a_j, 1 ≤ j ≤ M}. Assume that all signals are equiprobable. Let m(A) = (1/M) Σ_j a_j be the average signal, and let A' be A translated by m(A) so that the mean of A' is zero: A' = A - m(A) = {a_j - m(A), 1 ≤ j ≤ M}. Let E(A) and E(A') denote the average energies of A and A', respectively.

(a) Show that the error probability of an MD detector is the same for A' as it is for A.

This follows directly from the translation invariance of Problem 2.2(a), with translation vector v = -m(A).
(b) Show that E(A') = E(A) - ||m(A)||^2. Conclude that removing the mean m(A) is always a good idea.

Let A be the random variable with alphabet A that is equal to a_j with probability 1/M for all j, and let A' = A - m(A) be the fluctuation of A. Then E(A) = E[||A||^2], m(A) = E[A], and

E(A') = E[||A - E[A]||^2] = E[||A||^2] - 2⟨E[A], E[A]⟩ + ||E[A]||^2 = E[||A||^2] - ||E[A]||^2 = E(A) - ||m(A)||^2.

Thus the second moment of A is greater than or equal to its variance, with equality if and only if the mean of A is zero. This result and that of part (a) imply that if m(A) ≠ 0, then by replacing A with A' = A - m(A) the average energy can be reduced without changing the probability of error. Therefore A' is always preferable to A; i.e., an optimum constellation must have zero mean.

(c) Show that a binary antipodal signal set A = {±a} is always optimal for M = 2.

A two-point constellation with zero mean must have a_1 + a_2 = 0, which implies a_2 = -a_1; i.e., A = {±a} with a = a_1.

Problem 2.4 (UBE for M-PAM constellations)

For an M-PAM constellation A, show that K_min(A) = 2(M - 1)/M. Conclude that the union bound estimate of Pr(E) is

Pr(E) ≈ 2((M - 1)/M) Q(d/2σ).

Observe that in this case the union bound estimate is exact. Explain why.

The M - 2 interior points have 2 nearest neighbors each, while the 2 boundary points have 1 nearest neighbor each, so the average number of nearest neighbors is K_min(A) = (1/M)((M - 2)(2) + (2)(1)) = 2(M - 1)/M. In general the union bound estimate is Pr(E) ≈ K_min(A) Q(d_min(A)/2σ). Plugging in d_min(A) = d and the above expression for K_min(A), we get the desired expression.

For the M - 2 interior points, the exact error probability is 2Q(d/2σ). For the 2 boundary points, the exact error probability is Q(d/2σ). If all points are equiprobable, then the average Pr(E) is exactly the UBE given above. In general, the union bound is exact for one-dimensional constellations, and only for one-dimensional constellations. The union bound estimate is therefore exact only for equi-spaced one-dimensional constellations; i.e., essentially only for M-PAM.
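The exactness of the UBE for M-PAM can be checked numerically. The following sketch (illustrative, not from the handout) compares the closed-form M-PAM error probability with the UBE, using Q(x) = erfc(x/√2)/2:

```python
from math import erfc, sqrt

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x) = erfc(x/sqrt(2))/2."""
    return 0.5 * erfc(x / sqrt(2))

def pam_pr_error_exact(M, d, sigma):
    """Exact Pr(E): M-2 interior points see 2Q(d/2s); 2 edge points see Q(d/2s)."""
    q = Q(d / (2 * sigma))
    return ((M - 2) * 2 * q + 2 * q) / M

def pam_pr_error_ube(M, d, sigma):
    """Union bound estimate K_min(A) Q(d/2s) with K_min(A) = 2(M-1)/M."""
    return (2 * (M - 1) / M) * Q(d / (2 * sigma))

for M in (2, 4, 8, 16):
    exact = pam_pr_error_exact(M, 2.0, 0.5)
    ube = pam_pr_error_ube(M, 2.0, 0.5)
    assert abs(exact - ube) < 1e-12   # the UBE is exact for M-PAM
```

Algebraically, ((M-2)·2 + 2)/M = 2(M-1)/M, so the two expressions coincide term by term.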
Problem 2.5 (Invariance of coding gain)

Show that in the power-limited regime the nominal coding gain γ_c(A) of (5.9), the UBE (5.10) of P_b(E), and the effective coding gain γ_eff(A) are invariant to scaling, orthogonal transformations and Cartesian products.

In the power-limited regime, the nominal coding gain is defined as γ_c(A) = d_min^2(A)/(4E_b(A)). Scaling A by α > 0 multiplies both d_min^2(A) and E_b(A) by α^2, and therefore leaves γ_c(A) unchanged. Orthogonal transformations of A change neither d_min^2(A) nor E_b(A). As we have seen in Problem 2.1, taking Cartesian products also changes neither d_min^2(A) nor E_b(A). Therefore γ_c(A) is invariant under all these operations.

The UBE of P_b(E) involves γ_c(A) and K_b(A) = K_min(A)/log_2 |A|. K_min(A) is also obviously unchanged under scaling or orthogonal transformations. Problem 2.1 showed that K_min(A) increases by a factor of K under a K-fold Cartesian product, but so does log_2 |A|, so K_b(A) is also unchanged under Cartesian products. The effective coding gain is a function of the UBE of P_b(E), and therefore it is invariant as well.

Problem 2.6 (Orthogonal signal sets)

An orthogonal signal set is a set A = {a_j, 1 ≤ j ≤ M} of M orthogonal vectors in R^M with equal energy E(A); i.e., ⟨a_j, a_k⟩ = E(A) δ_{jk} (Kronecker delta).

(a) Compute the nominal spectral efficiency ρ of A in bits per two dimensions. Compute the average energy E_b per information bit.

The rate of A is log_2 M bits per M dimensions, so the nominal spectral efficiency is ρ = (2/M) log_2 M bits per two dimensions. The average energy per symbol is E(A), so the average energy per bit is E_b = E(A)/log_2 M.

(b) Compute the minimum squared distance d_min^2(A). Show that every signal has K_min(A) = M - 1 nearest neighbors.

The squared distance between any two distinct vectors is ||a_j - a_k||^2 = ||a_j||^2 - 2⟨a_j, a_k⟩ + ||a_k||^2 = E(A) - 0 + E(A) = 2E(A), so d_min^2(A) = 2E(A), and every vector has all M - 1 other vectors as nearest neighbors, so K_min(A) = M - 1.
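A concrete orthogonal signal set (the scaled standard basis of R^M, an assumed example for illustration; any orthogonal set with equal energies behaves identically) makes these parameters easy to verify:

```python
from math import log2, sqrt

def orthogonal_set(M, E):
    """M orthogonal vectors in R^M with equal energy E (scaled standard basis)."""
    return [tuple(sqrt(E) if i == j else 0.0 for i in range(M)) for j in range(M)]

M, E = 4, 3.0
A = orthogonal_set(M, E)

# inner products: E(A) on the diagonal, 0 off the diagonal
for j, aj in enumerate(A):
    for k, ak in enumerate(A):
        ip = sum(x * y for x, y in zip(aj, ak))
        assert abs(ip - (E if j == k else 0.0)) < 1e-12

# minimum squared distance is 2E(A), attained by every pair of distinct vectors
d2 = min(sum((x - y) ** 2 for x, y in zip(p, q))
         for p in A for q in A if p != q)
assert abs(d2 - 2 * E) < 1e-12

rho = (2 / M) * log2(M)    # nominal spectral efficiency, b/2D
Eb = E / log2(M)           # energy per information bit
assert abs(rho - 1.0) < 1e-12 and abs(Eb - 1.5) < 1e-12   # values for M = 4, E = 3
```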
(c) Let the noise variance be σ^2 = N_0/2 per dimension. Show that the probability of error of an optimum detector is bounded by the UBE Pr(E) ≤ (M - 1) Q(√(E(A)/N_0)).
The pairwise error probability between any two distinct vectors is

Pr{a_j → a_k} = Q(√(||a_j - a_k||^2 / 4σ^2)) = Q(√(2E(A)/2N_0)) = Q(√(E(A)/N_0)).

By the union bound, for any a_j ∈ A,

Pr(E | a_j) ≤ Σ_{k ≠ j} Pr{a_j → a_k} = (M - 1) Q(√(E(A)/N_0)),

so the average Pr(E) also satisfies this upper bound.

(d) Let M → ∞ with E_b held constant. Using an asymptotically accurate upper bound for the Q(·) function (see Appendix), show that Pr(E) → 0 provided that E_b/N_0 > 2 ln 2 (1.42 dB). How close is this to the ultimate Shannon limit on E_b/N_0? What is the nominal spectral efficiency ρ in the limit?

By the Chernoff bound of the Appendix of Chapter 5, Q(x) ≤ e^{-x^2/2}. Therefore

Pr(E) ≤ (M - 1) e^{-E(A)/2N_0} < e^{ln M} e^{-(E_b log_2 M)/2N_0}.

Since ln M = (log_2 M)(ln 2), as M → ∞ this bound goes to zero provided that

E_b/2N_0 > ln 2,

or equivalently E_b/N_0 > 2 ln 2 (1.42 dB).

The ultimate Shannon limit on E_b/N_0 is E_b/N_0 > ln 2 (-1.59 dB), so this shows that we can get to within 3 dB of the ultimate Shannon limit with orthogonal signalling. (It can be shown that orthogonal signalling can actually achieve Pr(E) → 0 for any E_b/N_0 > ln 2, the ultimate Shannon limit.) Unfortunately, the nominal spectral efficiency ρ = (2 log_2 M)/M goes to 0 as M → ∞.

Problem 2.7 (Simplex signal sets)

Let A be an orthogonal signal set as above.

(a) Denote the mean of A by m(A). Show that m(A) ≠ 0, and compute ||m(A)||^2.

By definition, m(A) = (1/M) Σ_j a_j. Therefore, using orthogonality, we have

||m(A)||^2 = (1/M^2) Σ_j ||a_j||^2 = E(A)/M ≠ 0.

By the strict positivity of the Euclidean norm, ||m(A)||^2 ≠ 0 implies that m(A) ≠ 0.

The zero-mean set A' = A - m(A) is called a simplex signal set. It is universally believed to be the optimum set of M signals in AWGN in the absence of bandwidth constraints, except at ridiculously low SNRs.
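The mean computation in part (a) can be checked numerically; the sketch below (illustrative, assuming the scaled-standard-basis orthogonal set as a concrete example) verifies ||m(A)||^2 = E(A)/M and that the simplex set A' = A - m(A) has zero mean:

```python
from math import sqrt

M, E = 4, 2.0
# assumed concrete orthogonal set: scaled standard basis of R^M
A = [tuple(sqrt(E) if i == j else 0.0 for i in range(M)) for j in range(M)]

# mean m(A) and its squared norm
m = tuple(sum(a[i] for a in A) / M for i in range(M))
norm2_m = sum(x * x for x in m)
assert abs(norm2_m - E / M) < 1e-12       # ||m(A)||^2 = E(A)/M != 0

# simplex set A' = A - m(A) has zero mean
Asimplex = [tuple(x - mx for x, mx in zip(a, m)) for a in A]
mean_simplex = tuple(sum(a[i] for a in Asimplex) / M for i in range(M))
assert all(abs(x) < 1e-12 for x in mean_simplex)
```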
(b) For M = 2, 3, 4, sketch A and A'.

For M = 2, 3, 4, A consists of M orthogonal vectors in M-space (hard to sketch for M = 4). For M = 2, A' consists of two antipodal signals in a 1-dimensional subspace of 2-space; for M = 3, A' consists of the three vertices of an equilateral triangle in a 2-dimensional subspace of 3-space; and for M = 4, A' consists of the four vertices of a regular tetrahedron in a 3-dimensional subspace of 4-space.

(c) Show that all signals in A' have the same energy E(A'). Compute E(A'). Compute the inner products ⟨a_j', a_k'⟩ for all a_j', a_k' ∈ A'.

The inner product of m(A) with any a_j is

⟨m(A), a_j⟩ = (1/M) Σ_k ⟨a_k, a_j⟩ = E(A)/M.

The energy of a_j' = a_j - m(A) is therefore

||a_j'||^2 = ||a_j||^2 - 2⟨m(A), a_j⟩ + ||m(A)||^2 = E(A) - 2E(A)/M + E(A)/M = (1 - 1/M) E(A).

For j ≠ k, the inner product ⟨a_j', a_k'⟩ is

⟨a_j - m(A), a_k - m(A)⟩ = 0 - 2E(A)/M + E(A)/M = -E(A)/M.

In other words, the inner product is equal to (1 - 1/M) E(A) if j = k and -(1/M) E(A) for j ≠ k.

(d) [Optional]. Show that for ridiculously low SNR, a signal set consisting of M - 2 zero signals and two antipodal signals {±a} has a lower Pr(E) than a simplex signal set. [Hint: see M. Steiner, "The strong simplex conjecture is false," IEEE Transactions on Information Theory, pp. 721-731, May 1994.]

See the cited article.

Problem 2.8 (Biorthogonal signal sets)

The set A' = ±A of size 2M consisting of the M signals in an orthogonal signal set A with symbol energy E(A) and their negatives is called a biorthogonal signal set.

(a) Show that the mean of A' is m(A') = 0, and that the average energy is E(A).

The mean is

m(A') = (1/2M) Σ_j (a_j - a_j) = 0,

and every vector has energy E(A).

(b) How much greater is the nominal spectral efficiency ρ of A' than that of A?

The rate of A' is log_2 2M = 1 + log_2 M bits per M dimensions, so its nominal spectral efficiency is ρ' = (2/M)(1 + log_2 M) b/2D, which is 2/M b/2D greater than that of A. This is helpful for small M, but negligible as M → ∞.
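The biorthogonal construction of Problem 2.8 can likewise be checked numerically (an illustrative sketch, assuming the standard-basis orthogonal set):

```python
from math import log2, sqrt

M, E = 4, 2.0
A = [tuple(sqrt(E) if i == j else 0.0 for i in range(M)) for j in range(M)]
Abio = A + [tuple(-x for x in a) for a in A]     # biorthogonal set A' = ±A

assert len(Abio) == 2 * M
# zero mean: each a_j cancels with -a_j
mean = tuple(sum(a[i] for a in Abio) / len(Abio) for i in range(M))
assert all(abs(x) < 1e-12 for x in mean)
# every vector keeps energy E(A)
assert all(abs(sum(x * x for x in a) - E) < 1e-12 for a in Abio)

# spectral efficiency gain over the orthogonal set is 2/M b/2D
rho_orth = (2 / M) * log2(M)
rho_bio = (2 / M) * (1 + log2(M))
assert abs(rho_bio - rho_orth - 2 / M) < 1e-12
```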
(c) Show that the probability of error of A' is approximately the same as that of an orthogonal signal set with the same size and average energy, for M large.

Each vector in A' has 2M - 2 nearest neighbors at squared distance 2E(A), and one antipodal vector at squared distance 4E(A). The union bound estimate is therefore

Pr(E) ≈ (2M - 2) Q(√(E(A)/N_0)) ≈ |A'| Q(√(E(A)/N_0)),

which is approximately the same as the estimate Pr(E) ≈ (2M - 1) Q(√(E(A)/N_0)) ≈ |A''| Q(√(E(A)/N_0)) for an orthogonal signal set A'' of size |A''| = 2M.

(d) Let the number of signals be a power of 2: 2M = 2^k. Show that the nominal spectral efficiency is ρ(A') = 4k 2^{-k} b/2D, and that the nominal coding gain is γ_c(A') = k/2. Show that the number of nearest neighbors is K_min(A') = 2^k - 2.

If M = 2^{k-1}, then the nominal spectral efficiency is ρ(A') = (2/M)(1 + log_2 M) = 2^{2-k} k = 4k 2^{-k} b/2D. We are in the power-limited regime, so the nominal coding gain is

γ_c(A') = d_min^2(A')/(4E_b(A')) = 2E(A')/(4E(A')/k) = k/2.

The number of nearest neighbors is K_min(A') = 2M - 2 = 2^k - 2.
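A quick tabulation (illustrative, not part of the handout) of the nominal parameters for 2M = 2^k confirms the formulas:

```python
from math import log2

def biorthogonal_params(k):
    """Nominal parameters of a biorthogonal set with 2M = 2^k signals (M = 2^(k-1))."""
    M = 2 ** (k - 1)
    rho = (2 / M) * (1 + log2(M))    # nominal spectral efficiency, b/2D
    gamma_c = k / 2                  # nominal coding gain in the power-limited regime
    K_min = 2 * M - 2                # number of nearest neighbors
    return rho, gamma_c, K_min

for k in range(2, 10):
    rho, gc, Km = biorthogonal_params(k)
    assert abs(rho - 4 * k * 2 ** (-k)) < 1e-12   # rho = 4k 2^-k b/2D
    assert Km == 2 ** k - 2
```

The coding gain grows linearly in k while the spectral efficiency decays as 4k 2^{-k}, which is the power-limited trade-off noted in Problem 2.6(d).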