Noisy Constrained Capacity for BSC Channels


Philippe Jacquet, INRIA Rocquencourt, 78153 Le Chesnay Cedex, France

Wojciech Szpankowski, Department of Computer Science, Purdue University, W. Lafayette, IN 47907, U.S.A.

Abstract—We study the classical problem of noisy constrained capacity in the case of the binary symmetric channel (BSC), namely, the capacity of a BSC whose input is a sequence from a constrained set. As stated in [4], "... while calculation of the noise-free capacity of constrained sequences is well known, the computation of the capacity of a constraint in the presence of noise ... has been an unsolved problem in the half-century since Shannon's landmark paper ...". We first express the constrained capacity of a binary symmetric channel with (d, k)-constrained input as a limit of the top Lyapunov exponents of certain matrix random processes. Then, we compute asymptotic approximations of the noisy constrained capacity for cases where the noise parameter ε is small. In particular, we show that when k ≤ 2d, the error term (excess of capacity beyond the noise-free capacity) is O(ε), whereas it is O(ε log ε) when k > 2d. In both cases, we compute the coefficient of the error term. In the course of establishing these findings, we also extend our previous results on the entropy of a hidden Markov process to higher-order finite memory processes. These conclusions are proved by a combination of analytic and combinatorial methods.

I. INTRODUCTION

We consider a binary symmetric channel (BSC) with crossover probability ε, and a constrained set of inputs. More precisely, let S_n denote the set of binary sequences of length n satisfying a given (d, k)-RLL constraint [8], i.e., no sequence in S_n contains a run of zeros of length shorter than d or longer than k (we assume that the values d and k, d ≤ k, are understood from the context). We write x_1^n ∈ S_n for x_1^n = x_1 ... x_n, and we denote S = ∪_{n>0} S_n. We assume that the input to the channel is a stationary process X = {X_k}_{k≥1} supported on S. We regard the BSC channel as emitting a Bernoulli noise sequence E = {E_k}_{k≥1}, independent of X, with P(E_i = 1) = ε. The channel output is Z_i = X_i ⊕ E_i, where ⊕ denotes addition modulo 2 (exclusive-or). For ease of notation, we identify the BSC channel with its parameter ε.

Let C(ε) denote the conventional BSC channel capacity (over unconstrained binary sequences), namely, C(ε) = log 2 − H(ε), where H(ε) = −ε log ε − (1 − ε) log(1 − ε). We use natural logarithms throughout; entropies are correspondingly measured in nats. The entropy of a random variable or process will be denoted H(X_1^n), and the entropy rate by H(X). The noisy constrained capacity C(S, ε) is defined [4] by

C(S, ε) = sup_{X ∈ S} I(X; Z) = lim_{n→∞} (1/n) sup_{X_1^n ∈ S_n} I(X_1^n; Z_1^n),    (1)

where I(X; Z) is the mutual information rate and the suprema are over all stationary processes supported on S and S_n, respectively. The noiseless capacity of the constraint is C(S) = C(S, 0). This quantity has been extensively studied, and several interpretations and methods for its explicit derivation are known (see, e.g., [8] and the extensive bibliography therein). As for C(S, ε), the best results in the literature have been in the form of bounds and numerical simulations based on producing random (and, hopefully, typical) channel output sequences (see, e.g., [26], [23], [1] and references therein).

(A preliminary version of this paper was presented at ISIT, Nice, 2007. Work of W. Szpankowski was supported in part by NSF STC Grant CCF, NSF Grants DMS and CCF, NSA Grant H, the AFOSR Grant FA, and the Humboldt Foundation.)
These methods allow for fairly precise numerical approximations of the capacity for given constraints and channel parameters.

Our approach to the noisy constrained capacity C(S, ε) is different. We first consider the corresponding mutual information,

I(X; Z) = H(Z) − H(Z|X).    (2)

Since H(Z|X) = H(ε), the problem reduces to finding H(Z), the entropy rate of the output process. If we restrict our attention to constrained processes that are generated by Markov sources, the output process Z can be regarded as a hidden Markov process (HMP), and the problem of computing I(X; Z) reduces to that of computing the entropy rate of this HMP. The noisy constrained capacity follows provided we find the maximizing distribution of X.

It is well known (see, e.g., [8]) that we can regard the (d, k) constraint as the output of a kth-order finite memory (Markov) stationary process, uniquely defined by conditional probabilities P(x_t | x_{t−k}^{t−1}), where for any sequence {x_i}_{i≥1} we denote by x_i^j, j ≥ i, the subsequence x_i, x_{i+1}, ..., x_j. For nontrivial constraints, some of these conditional probabilities must be set to zero in order to enforce the constraint (for example, the probability of a zero after seeing k consecutive zeros, or of a one after seeing fewer than d consecutive zeros). When the remaining free probabilities are assigned so that the entropy of the process is maximized, we say that the process is entropic, and we denote the maximizing distribution by P^max.
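For concreteness, the constraint set S_n can be tested and enumerated directly for small n. The sketch below is our own illustration of the definitions above (not part of the paper); the handling of border runs is an assumption, consistent with the "border effects" remark made later in the paper, and the printed ratio log2|S_n|/n approaches the noiseless capacity discussed next.

```python
from itertools import product
from math import log2

def satisfies_dk(bits, d, k):
    """Membership test for S_n: every maximal run of zeros has length at most k,
    and every run of zeros enclosed between two ones has length at least d
    (border runs are only held to the upper bound; a modeling assumption)."""
    run = 0
    for b in bits:
        run = run + 1 if b == 0 else 0
        if run > k:
            return False
    ones = [j for j, b in enumerate(bits) if b == 1]
    return all(b - a - 1 >= d for a, b in zip(ones, ones[1:]))

def count_Sn(n, d, k):
    """Brute-force |S_n| for small n (exponential; illustration only)."""
    return sum(satisfies_dk(seq, d, k) for seq in product((0, 1), repeat=n))

for n in (4, 8, 12, 16):                 # (1,3)-RLL example
    c = count_Sn(n, d=1, k=3)
    print(n, c, round(log2(c) / n, 3))
```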

The noiseless capacity C(S) is equal to the entropy of P^max [8].

The Shannon entropy (or, simply, entropy) of a HMP was studied as early as [2], where the analysis suggests the intrinsic complexity of the HMP entropy as a function of the process parameters. Blackwell [2] showed an expression of the entropy in terms of a measure Q, obtained by solving an integral equation dependent on the parameters of the process. The measure is hard to extract from the equation in any explicit way. Recently, we have seen a resurgence of interest in estimating HMP entropies [7], [8], [14], [19], [20], [27]. In particular, one recent approach is based on computing the coefficients of an asymptotic expansion of the entropy rate around certain values of the Markov and channel parameters. The first result along these lines was presented in [14], where the Taylor expansion around ε = 0 is studied for a binary HMP of order one. In particular, the first derivative of the entropy rate at ε = 0 is expressed very compactly as a Kullback-Leibler divergence between two distributions on binary triplets, derived from the marginals of the input process. It is also shown in [14], [15] that the entropy rate of a HMP can be expressed in terms of the top Lyapunov exponent of a random process of 2 × 2 matrices (cf. also [11], where the capacity of certain channels with memory is also shown to be related to top Lyapunov exponents). Further improvements, and new methods for the asymptotic expansion approach, were obtained in [19], [27], and [8]. In [20] the authors express the entropy rate for a binary HMP where one of the transition probabilities is equal to zero as an asymptotic expansion including an O(ε log ε) term. As we shall see in the sequel, this case is related to the (1, ∞) (or the equivalent (0, 1)) RLL constraint. Analyticity of the entropy as a function of ε was studied in [7].

In Section II of this paper we extend the results of [14], [15] on HMP entropy to higher-order Markov processes. We show that the entropy of an rth-order HMP can be expressed as the top Lyapunov exponent of a random process of matrices of dimensions 2^r × 2^r (cf. Theorem 1). As an additional result of this work, of interest on its own, we derive the asymptotic expansion of the HMP entropy rate around ε = 0 for the case where all transition probabilities are positive (cf. Theorem 2). In particular, we derive an expression for the first derivative of the entropy rate as the Kullback-Leibler divergence between two distributions on (2r+1)-tuples, again generalizing the formula for r = 1 of [14]. The results of Section II are applied, in Section III, to express the noisy constrained capacity as a limit of top Lyapunov exponents of certain matrix processes. These exponents, however, are notoriously difficult to compute [25]. Hence, as in the case of the entropy of HMPs, it is interesting to study asymptotic expansions of the noisy constrained capacity. In Section III-B, we study the asymptotics of the noisy constrained capacity, and we show that for (d, k) constraints with k ≤ 2d, we have C(S, ε) = C(S) + K ε + O(ε^2 log ε), where K is a well characterized constant. On the other hand, when k > 2d, we have C(S, ε) = C(S) + L ε log ε + O(ε), where, again, L is an explicit constant. The latter case covers the (0, 1) constraint (and also the equivalent (1, ∞) constraint). Our formula for the constant L in this case is consistent with the one derived from the results of [20]. Preliminary results of this paper were presented in [16].
We also remark that recently Han and Marcus [9], [10] reached similar conclusions and obtained some generalizations using a different methodology.

II. ENTROPY OF HIGHER ORDER HMPS

Let X = {X_i}_{i≥1} be an rth-order stationary finite memory (Markov) process over a binary alphabet A = {0, 1}. The process is defined by the set of conditional probabilities P(X_t = a_{r+1} | X_{t−r}^{t−1} = a_1^r), a_1^{r+1} ∈ A^{r+1}. The process X is equivalently interpreted as the Markov chain of its states s_t = X_{t−r+1}^t, t > 0 (we assume X_{−r+1}^0 is defined and distributed according to the stationary distribution of the process; we generally use the term finite memory process for the first interpretation, and Markov chain for the second). Clearly, a transition from a state u ∈ A^r to a state v ∈ A^r can have positive probability only if u and v satisfy u_2^r = v_1^{r−1}, in which case we say that (u, v) is an overlapping pair.

The noise process E = {E_i}_{i≥1} is Bernoulli (binary i.i.d.), independent of X, with P(E_i = 1) = ε. Finally, the HMP is Z = {Z_i}_{i≥1},

Z_i = X_i ⊕ E_i,  i ≥ 1.    (3)

Let Z̃_i = (Z_i, Z_{i+1}, ..., Z_{i+r−1}) and Ẽ_i = (E_i, ..., E_{i+r−1}). Also, for e ∈ {0, 1}, let Ẽ_i^e = (e, E_i, ..., E_{i+r−2}). We next compute the probability of Z̃_1^n := Z_1^{n+r−1}. From the definitions of X and E, we have

P(Z̃_1^n, Ẽ_n) = Σ_{e∈A} P(Z̃_1^n, Ẽ_n, E_{n−1} = e)    (4)
             = Σ_{e∈A} P(Z̃_1^{n−1}, Z_{n+r−1}, E_{n−1} = e, Ẽ_n)
             = Σ_{e∈A} P(Z_{n+r−1}, E_{n+r−1} | Z̃_1^{n−1}, Ẽ_n^e) P(Z̃_1^{n−1}, Ẽ_n^e)
             = Σ_{e∈A} P(E_{n+r−1}) P_X(Z̃_n ⊕ Ẽ_n | Z̃_{n−1} ⊕ Ẽ_n^e) P(Z̃_1^{n−1}, Ẽ_n^e).

Observe that in the last line the transition probabilities P_X(·|·) are with respect to the original Markov chain. (In general, the measures governing probability expressions will be clear from the context; in cases when confusion is possible, we will explicitly indicate the measure, e.g., P_X.)

We next derive, from (4), an expression for P(Z̃_1^n) as a product of matrices, extending our earlier work [14], [15]. In what follows, vectors are of dimension 2^r, and matrices are of dimensions 2^r × 2^r. We denote row vectors by bold lowercase letters, matrices by bold uppercase letters, and we let 1 = [1, ..., 1]; superscript t denotes transposition. Entries in vectors and matrices are indexed by vectors in A^r. For a_i ∈ A^r, let

p_n = [P(Z̃_1^n, Ẽ_n = a_1), P(Z̃_1^n, Ẽ_n = a_2), ..., P(Z̃_1^n, Ẽ_n = a_{2^r})]

be a row vector of dimension 2^r.

Further, let M(Z̃_n | Z̃_{n−1}) be a 2^r × 2^r matrix defined as follows: if (e_{n−1}, e_n) ∈ A^r × A^r is an overlapping pair, then the entry (e_{n−1}, e_n) of M(Z̃_n | Z̃_{n−1}) is

M_{e_{n−1}, e_n}(Z̃_n | Z̃_{n−1}) = P_X(Z̃_n ⊕ e_n | Z̃_{n−1} ⊕ e_{n−1}) P(Ẽ_n = e_n).    (5)

All other entries are zero. Clearly, M(Z̃_n | Z̃_{n−1}) is a random matrix, drawn from a set of 2^{r+1} possible realizations. With these definitions, it follows from (4) that

p_n = p_{n−1} M(Z̃_n | Z̃_{n−1}).    (6)

Since P(Z̃_1^n) = p_n 1^t = Σ_{e∈A^r} P(Z̃_1^n, Ẽ_n = e), after iterating (6), we obtain

P(Z̃_1^n) = p_1 M(Z̃_2 | Z̃_1) · · · M(Z̃_n | Z̃_{n−1}) 1^t.    (7)

The joint distribution P_Z(Z_1^n) of the HMP, presented in (7), has the form p_1 A_n 1^t, where A_n is the product of the first n − 1 random matrices of the process

M = M(Z̃_2 | Z̃_1), M(Z̃_3 | Z̃_2), ..., M(Z̃_n | Z̃_{n−1}), ...    (8)

Applying a subadditive ergodic theorem, and noting that p_1 A_n 1^t is a matrix norm of A_n (indeed, both p_1 and 1^t are positive and A_n is nonnegative, hence p_1 A_n 1^t satisfies the conditions for a matrix norm, as already observed in [11]), it is readily proved that −(1/n) E[log P_Z(Z_1^n)] must converge to a constant ξ known as the top Lyapunov exponent of the random process M (cf. [5], [21], [25]). This leads to the following theorem.

Theorem 1: The entropy rate of the HMP Z of (3) satisfies

H(Z) = lim_{n→∞} −(1/n) E[ log P_Z(Z_1^{n+r−1}) ]
     = lim_{n→∞} −(1/n) E[ log ( p_1 M(Z̃_2 | Z̃_1) · · · M(Z̃_n | Z̃_{n−1}) 1^t ) ] = ξ,

where ξ is the top Lyapunov exponent of the process M of (8).

Theorem 1 and its derivation generalize the results, for r = 1, of [14], [15], [27], [28]. It is known that computing top Lyapunov exponents is hard (maybe infeasible), as shown in [25]. Therefore, we shift our attention to asymptotic approximations.
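Before turning to these approximations, it is worth illustrating Theorem 1 numerically. For r = 1 the matrix product in (7) is evaluated by the standard forward recursion of a hidden Markov process, and a single long simulated path yields a Monte Carlo estimate of ξ = H(Z). The sketch below is our own illustration, not part of the paper's analysis; the transition matrix is a hypothetical example and the result carries sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)

def hmp_entropy_rate_mc(P, eps, n=100_000):
    """Monte Carlo estimate of H(Z) for a binary first-order (r = 1) Markov
    source observed through a BSC(eps).  By Theorem 1, -(1/n) log P_Z(Z_1^n)
    converges to the top Lyapunov exponent xi of the matrix process M; the
    forward recursion below evaluates that matrix product with scaling."""
    pi = np.array([P[1, 0], P[0, 1]])
    pi = pi / pi.sum()                          # stationary distribution of the chain
    x = np.empty(n, dtype=int)                  # simulate X_1^n ...
    x[0] = rng.choice(2, p=pi)
    for t in range(1, n):
        x[t] = rng.choice(2, p=P[x[t - 1]])
    z = x ^ (rng.random(n) < eps).astype(int)   # ... and the BSC output Z_1^n
    emit = lambda zt: np.where(np.arange(2) == zt, 1.0 - eps, eps)
    alpha = pi * emit(z[0])                     # forward recursion with scaling
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, n):
        alpha = (alpha @ P) * emit(z[t])
        s = alpha.sum()
        loglik += np.log(s)
        alpha /= s
    return -loglik / n                          # estimate of H(Z), in nats per symbol

P = np.array([[0.7, 0.3], [0.4, 0.6]])          # hypothetical chain, all transitions positive
print(hmp_entropy_rate_mc(P, eps=0.05))
```

Simulation-based estimates of this kind are essentially the approach of [1]; they become slow and delicate exactly in the small-ε regime analyzed below, which motivates the expansions that follow.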
z 2r+, we have f (P ) = P (z 2r+ )log P (z 2r+ P (ž 2r+ z 2r+ ) ) = D ( P (z 2r+ ) P (ž 2r+ ) ). (5) Here, D( ) is the Kullback-Liebler divergence, applied here to distributions on A 2r+ derived from the marginals of. A question arises about the asymptotic expansion of the entropy H(Z) when some of the conditional probabilities are zero. Clearly, when some transition probabilities are zero, then certain sequences x n are not reachable by the Markov process, which provides the link to constrained sequences. Example. Consider a Markov chain with the following transition probabilities [ ] p p P = (6) 0

A question arises about the asymptotic expansion of the entropy H(Z) when some of the conditional probabilities are zero. Clearly, when some transition probabilities are zero, certain sequences x_1^n are not reachable by the Markov process, which provides the link to constrained sequences.

Example 1. Consider a Markov chain with the transition probabilities

P = [ 1−p   p ]
    [  1    0 ]    (16)

where 0 ≤ p ≤ 1. This process generates sequences satisfying the (1, ∞) constraint (or, under a different interpretation of rows and columns, the equivalent (0, 1) constraint). The output sequence Z, however, will generally not satisfy the constraint. The probability of the constraint-violating sequences at the output of the channel is a polynomial in ε, which will generally contribute a term O(ε log ε) to the entropy rate H(Z) when ε is small. This was already observed in [20] for the transition matrix P as in (16), where it was shown that, as ε → 0,

H(Z) = H(X) − [p(2 − p)/(1 + p)] ε log ε + O(ε).    (17)

In this paper, in Section IV and Appendix A, we prove the following generalization of Theorem 2 for (d, k) sequences.

Theorem 3: Let Z be a HMP. Then, in general,

H(Z) = H(X) − f_0(P_X) ε log ε + f_1(P_X) ε + O(ε² log ε)    (18)

for some f_0(P_X) and f_1(P_X).

Observe that if all transition probabilities of X are positive, then f_0(P_X) = 0, and the coefficient f_1(P_X) of ε is as presented in Theorem 2. The coefficient f_0(P_X) is derived in Section IV for HMPs representing (d, k) sequences, and for the maximizing distribution it is presented in Theorems 5 and 6. We should point out that recently Han and Marcus [9] showed that in general, for any HMP, H(Z) = H(X) − f_0(P_X) ε log ε + O(ε), which is further generalized in [10] to H(Z) = H(X) − f_0(P_X) ε log ε + f_1(P_X) ε + O(ε² log ε) when at least one of the transition probabilities in the Markov chain is zero.

III. CAPACITY OF THE NOISY CONSTRAINED SYSTEM

We now apply the results on HMPs to the problem of noisy constrained capacity.

A. Capacity as a Lyapunov Exponent

Recall that I(X; Z) = H(Z) − H(ε) and, by Theorem 1, when X is a Markov process we have H(Z) = ξ(P_X), where ξ(P_X) is the top Lyapunov exponent of the process {M(Z̃_i | Z̃_{i−1})}_{i>1}. In [3] it is proved that the process optimizing the mutual information can be approached by a sequence of Markov representations of increasing order. Therefore, as a direct consequence of this fact and Theorem 1, we conclude the following.

Theorem 4: The noisy constrained capacity C(S, ε) for a (d, k) constraint through a BSC channel of parameter ε is given by

C(S, ε) = lim_{r→∞} sup_{P^(r)} ξ(P^(r)) − H(ε),    (19)

where P^(r) denotes the probability law of an rth-order Markov process generating the (d, k) constraint S.

Clearly, estimating the top Lyapunov exponents in (19) is computationally prohibitively expensive, if feasible at all. Therefore, we next turn our attention to asymptotic expansions of C(S, ε) near ε = 0.

B. Asymptotic Behavior

A nontrivial constraint will necessarily have some zero-valued conditional probabilities. Therefore, the associated HMP will not be covered by Theorem 2, but rather by Theorem 3. For (d, k) sequences we have

H(Z) = H(X) − f_0(P_X) ε log ε + f_1(P_X) ε + O(ε² log ε)    (20)

for some f_0(P_X) and f_1(P_X), where P_X is the law of the underlying Markov process. Since H(ε) = −ε log ε + ε + O(ε²) for small ε, we obtain

C(S, ε) = sup_{X ∈ S} H(Z) + ε log ε − ε + O(ε²).

We need to find the maximizing distribution to estimate the capacity. We shall prove in Section IV-D that this maximizing distribution is actually the entropic distribution P^max (maximizing the entropy of the underlying Markov process). However, this introduces an additional error term O(ε² log² ε), which exceeds the error term O(ε² log ε) of the entropy estimation in Theorem 3. The same result was established by Han and Marcus [9], [10] using a different methodology. In summary, we are led to

C(S, ε) = C(S) + (1 − f_0(P^max)) ε log ε + (f_1(P^max) − 1) ε + O(ε² log² ε),    (21)

where C(S) is the capacity of the noiseless RLL system.
Various methods exist to derive C(S) [8]. In particular, one can write [8], [24] C(S) = log(1/ρ_0), where ρ_0 is the smallest real root of

Σ_{l=d}^{k} ρ_0^{l+1} = 1.    (22)

Our goal is to derive explicit expressions for f_0(P^max) and f_1(P^max) for (d, k) sequences. For example, we will show in Theorem 5 below that for some RLL constraints we have f_0(P^max) = 1 in (21), hence the noisy constrained capacity is of the form C(S, ε) = C(S) + O(ε). Explicit expressions for the remaining coefficients are given in Theorems 5 and 6 below.
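To make (22) concrete, ρ_0 and C(S) are easily obtained numerically; the sketch below is ours (for finite k), uses plain bisection, and its output can be checked against the brute-force growth rate of |S_n| from the earlier sketch.

```python
import math

def noiseless_capacity(d, k):
    """Noiseless (d, k)-RLL capacity C(S) = log(1/rho_0) in nats, where rho_0
    is the smallest positive root of sum_{l=d}^{k} rho^(l+1) = 1, as in (22).
    A sketch for finite k, using plain bisection on [0, 1]."""
    f = lambda rho: sum(rho ** (l + 1) for l in range(d, k + 1)) - 1.0
    lo, hi = 0.0, 1.0                    # f(0) = -1 < 0 and f(1) = k - d >= 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    rho0 = 0.5 * (lo + hi)
    return -math.log(rho0), rho0

C_nats, rho0 = noiseless_capacity(1, 3)   # (1,3)-RLL constraint
print(C_nats / math.log(2))               # roughly 0.55 bits per symbol
```

The entropic choice p_l = ρ_0^{l+1}, introduced in (27) below, attains this value as the entropy rate of the underlying Markov process.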

We apply the same approach as in the previous section; that is, we use the auxiliary function R_n(s, ε) defined in (9). To start, we find a simpler expression for P_Z(z_1^n). Summing over the number of errors introduced by the channel, we find

P_Z(z_1^n) = P_X(z_1^n)(1 − ε)^n + ε(1 − ε)^{n−1} Σ_{i=1}^{n} P_X(z_1^n ⊕ e_i)

plus an error term O(ε²) (resulting from two or more errors), where e_j = (0, ..., 0, 1, 0, ..., 0) ∈ A^n with a 1 at position j. Let B_n ⊂ A^n denote the set of sequences z_1^n at Hamming distance one from S_n, and let C_n = A^n \ (S_n ∪ B_n). Notice that sequences in C_n are at distance at least two from S_n, and contribute the O(ε²) term. From the above, we conclude

R_n(s, ε) = Σ_{z_1^n ∈ A^n} P_Z(z_1^n)^s    (23)
          = Σ_{z_1^n ∈ S_n} P_Z(z_1^n)^s + Σ_{z_1^n ∈ B_n} P_Z(z_1^n)^s + Σ_{z_1^n ∈ C_n} P_Z(z_1^n)^s.

We observe that Σ_{z_1^n ∈ B_n} P_Z(z_1^n)^s = O(ε^s) and Σ_{z_1^n ∈ C_n} P_Z(z_1^n)^s = O(ε^{2s}) as ε → 0. Defining

φ_n(s) = Σ_{z_1^n ∈ S_n} P_X(z_1^n)^{s−1} Σ_{i=1}^{n} P_X(z_1^n ⊕ e_i),
Q_n(s) = Σ_{z_1^n ∈ B_n} ( Σ_{i=1}^{n} P_X(z_1^n ⊕ e_i) )^s,

we arrive, after some algebra, at the following expression for R_n(s, ε):

R_n(s, ε) = (1 − ε)^{ns} R_n(s, 0) + s ε (1 − ε)^{ns} φ_n(s) + ε^s (1 − ε)^{(n−1)s} Q_n(s) + O(ε² + ε^{1+s} + ε^{2s}).    (24)

Notice that

φ_n(1) + Q_n(1) = Σ_{z_1^n} Σ_{i=1}^{n} P_X(z_1^n ⊕ e_i) = n.

We now derive H(Z_1^n) = −∂_s R_n(s, ε)|_{s=1}, using the fact that R_n(1, ε) = 1. Since all the functions involved are analytic, we obtain from (24)

H(Z_1^n) = H(X_1^n)(1 − nε) + nε − ε(φ_n(1) + φ_n'(1)) − ε log ε · Q_n(1) − ε Q_n'(1) + O(nε² log ε),    (25)

where the error term is derived in Appendix A. In the above, φ_n'(1) and Q_n'(1) are, respectively, the derivatives of φ_n(s) and Q_n(s) at s = 1. Notice also that the term −nH(X_1^n)ε, of order n²ε, is cancelled by −(φ_n'(1) + Q_n'(1))ε = (nH(X_1^n) + O(n))ε, and only an O(nε) term remains (see the next section for details).

The case k ≤ 2d is interesting: a one-bit flip in a (d, k) sequence is guaranteed to violate the constraint, and consequently, for z_1^n ∈ S_n and all i, P_X(z_1^n ⊕ e_i) = 0. Therefore φ_n(s) = 0 in this case, leaving Q_n(1) = n. Thus, in the case k ≤ 2d, we have f_0(P_X) = 1, and the ε log ε term in (21) cancels out. Further considerations are required to compute Q_n'(1) and obtain the coefficient of ε in (25). Here, we provide the necessary definitions, and state our results, which are proved in Section IV.

Ignoring border effects, which do not affect the asymptotics (in general a (d, k) sequence may have at most k starting and at most k ending zeros, which cannot affect the entropy rate), we restrict our analysis to (d, k) sequences over the extended alphabet (of phrases) [8]

B = { 0^d 1, 0^{d+1} 1, ..., 0^k 1 }.

In other words, we consider only (d, k) sequences that end with a 1. For such sequences, we assume that they are generated by a memoryless process over the super-alphabet; this is further discussed in Section IV. Let p_l denote the probability of the super-symbol 0^l 1. We stress the fact that p_l differs from the probability that l + 1 consecutive symbols equal 0^l 1. In fact, it is equal to the probability that two consecutive returns to the symbol 1 are separated by exactly l zeros. Therefore,

p_l = P_X( X_{i+1}^{i+l+1} = 0^l 1 | X_i = 1 ),  i ≥ 1,  d ≤ l ≤ k.    (26)

In Appendix B we prove that the entropic distribution P^max corresponds to the case

p_l = ρ_0^{l+1},    (27)

with ρ_0 as in (22).

Example 2. Consider again the Markov process discussed in Example 1 with the transition matrix (16). This represents the (1, ∞) constrained system. Observe that

p_l = P_X(0^l 1 | 1) = P(0|1) P^{l−1}(0|0) P(1|0) = p(1 − p)^{l−1}.

The maximum entropy distribution is achieved for p = 1/ϕ², where ϕ = (1 + √5)/2 is the golden ratio; also ρ_0 = 1/ϕ. In this case p = ρ_0² and 1 − p = ρ_0, thus p_l = ρ_0^{l+1}.
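To make Example 2 concrete, the following sketch (our own check) verifies numerically that p = 1/ϕ² gives p_l = ρ_0^{l+1} with ρ_0 = 1/ϕ, and that the resulting entropy rate per bit equals the noiseless capacity C(S) = log ϕ of the (1, ∞) constraint; the truncation level of the infinite sums is an arbitrary choice.

```python
import math

phi = (1 + math.sqrt(5)) / 2            # golden ratio
rho0 = 1 / phi                          # solves sum_{l>=1} rho^(l+1) = rho^2/(1-rho) = 1
p = 1 / phi ** 2                        # maxentropic transition probability in (16)

# p_l = p (1-p)^(l-1) should coincide with rho0^(l+1) for every l >= 1
for l in range(1, 6):
    print(l, p * (1 - p) ** (l - 1), rho0 ** (l + 1))

# entropy per super-symbol divided by expected super-symbol length lambda
L = 2000                                # truncation of the infinite sums
pl = [p * (1 - p) ** (l - 1) for l in range(1, L)]
h_super = -sum(q * math.log(q) for q in pl)
lam = sum((l + 1) * q for l, q in zip(range(1, L), pl))
print(h_super / lam, math.log(phi))     # both about 0.4812 nats per bit
```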
The expected length of a super-symbol in B is λ = Σ_{l=d}^{k} (l + 1) p_l. We also introduce the generating function r(s, z) = Σ_l p_l^s z^{l+1}. By ρ(s) we denote the smallest root in z of r(s, z) = 1, that is, r(s, ρ(s)) = 1. Clearly, ρ(1) = 1 and ρ'(1) = −Σ_l p_l log p_l / λ, the entropy rate per bit of the super-alphabet; that is, ρ'(1) = H(X). Furthermore, we define λ(s) = ∂_z r(s, z)|_{z=ρ(s)}, and notice that λ(1) = λ. Finally, to present our results succinctly, we introduce some additional notation. Let

α(s, z) = Σ_l (2d − l) p_l^s z^{l+1}.

For integers l_1, l_2, with d ≤ l_1, l_2 ≤ k, let I_{l_1, l_2} denote the interval

I_{l_1, l_2} = { l : −min⁺{l_1 − d, k − l_2} ≤ l ≤ min⁺{l_2 − d, k − l_1} },

6 where min + {a, b} = {min{a, b}, 0}. We shall write Il,l 2 = I l,l 2 \ {0}. At last, we define τ(s, z) = τ (s, z) + τ 2 (s, z) + τ 3 (s, z) where τ (s, z) = 2 {0, l + l 2 k d}p s l p s l 2 z l2+l2+2 τ 2 (s, z) = τ 3 (s, z) = l =d l 2=d θ Il,l 2 θ I l,l 2 p l+θp l2 θ z l2+l2+2 Now we are in a position to present our main results. The proofs are delayed till the next section. The following theorem summarizes our findings for the case k 2d. Theorem 5: Consider a (d, k) constrained system with k 2d. Then, C(S, ε) = C(S) ( f 0 (P ))ε + O(ε2 log 2 ε), where for p l = ρ l+ 0 f 0 (P ) = log λ + () 2λ λ + s τ(, ) + sα(, ) λ + ρ ()( 2 2 α(, ) + τ(, )) sz sz + ρ () λ ( z α(, ) + τ(, )) z for ε 0 and λ(s), α(s, z) and τ(s, z) are defined above. In the complementary case k > 2d, the term φ n (s) in (23) does not vanish, and thus the O(ε log ε) term in (2) is generally nonzero. For this case, using techniques similar to the ones leading to Theorem 5, we obtain the following result. Theorem 6: Consider the constrained (d, k) system with k 2d. Define γ = (l 2d)p l, δ = p l p l2, l>2d d l +l 2+ k and λ = k = (l + )p l. Then for p l = ρ l+ 0 C(S, ε) = C(S) ( f 0 (P ))εlog ε + O(ε), (28) where for ε 0. and δ =. Thus, f 0 (P ) = γ + δ p(p 2) = λ p, consistent with the calculation of the same quantity in [20]. The noisy constrained capacity is obtained when P = P, d l,l 2 k i.e., p = /ϕ k k 2, where ϕ = (+ 5)/2, the golden ratio. Then, 2 (p l p l2 + p l+θp l2 θ) s z l2+l2+2 f 0 (P ) = / 5, and by Theorem 6 C(S, ε) = log ϕ ( / 5)ε log(/ε) + O(ε) k k for ε 0. 2 min{k, l + l 2 d} (l + l 2 ) + l =d l 2=d IV. ANALYSIS s f 0 (P ) = γ + δ λ Example 3. We consider the (, ) constraint with transition matrix P as in (6). Computing the quantities called for in Theorem 6 for d = and k =, we obtain p l = ( p) l p as in Example 2, and λ = + p p ( p)2, γ =, p In this section, we derive explicit expression for the coefficients f 0 (P ) and f (P ) of Theorem 3, as well as f 0 (P ) and f (P ) of Theorems 5 and 6. We also establish the error term in Theorem 5. Throughout, we consider the super-alphabet approach. Recall that a super-symbol is a text 0 l for d l k which is drawn from a memoryless source. This model is equivalent to a Markov process with renewals at symbols. To simplify the analysis, we first consider sequences of length n consisting only of (full) super symbols. We call such sequence a reduced (d, k) sequence of length n. Let x n be a reduced (d, k)-sequence of length n made of m super-symbols: x n = 0 0l l2...0 lm. In the sequel, for reduced (d, k) sequences of length n, we define P(x n ) = m p li. Notice that P(x n ) = 0 if xn is not a reduced (d, k) sequence (i.e., it doesn t end on a ). We also notice that quantities P(x n ) do not form a probability distribution over the reduced (d, k)-sequence of length n because they don t sum to. In view of this we define where i= P,n (x n ) = P(x n ) P n P n = x n P(x n ). Note that P n is the probability that the n-th symbol is exactly a (in other words, x n is built from a finite number of super symbols). We observe that P,n (x n ) is not the original probability distribution P (x n ) generated by a memoryless source of super-symbols. In fact, it is no longer memoryless, but it converges to it in a way that allows us to cope with truncation problems. The later does not effect the asymptotic value of the entropy rate. Recalling the definition r(s, z) = l ps l zl+, we find P n z n = r(, z). 
n Indeed, every reduced (d, k) sequence consists of an empty string, one super symbol, two super symbols or more, thus

7 n P nz n = k rk (, z) = /( r(, z)) (cf. [24]). By the Cauchy formula [24] we obtain P n = dz 2πi r(, z) z n+ = z r(, ) + O(µ n ) = λ + O(µ n ) for some µ >, since is the largest root of = r(, z) and z r(, ) = λ = k (l + )p l. Let Sm be the set of (d, k) reduced sequences made of exactly m super-symbols with no restriction on its length. Let S = S m m. Let B be the set of sequences that are exactly at Hamming distance from a sequence in S.By our convention, if x S m for some m, (i.e. if x = 0 l 0 l2...0 lm ), then P(x) = i=m i= p l; otherwise P(x) = 0). We denote by L(x) the length of x. We call such a model the variable-length model. To derive H(Z n ) presented in (25) we need to evaluate φ n () and Q n (). We estimate these quantities in the variablelength model as described above and then re-interpret them in the original model. Define which we re-write as φ(s, z) = n Q(s, z) = n φ(s, z) = Q(s, z) = x S x B L(x) P s (x) i= P s n φ n(s)z n, (29) P s n Q n(s)z n (30) i= P(x e i )z L(x), s L(x) P(x e i ) z L(x). We notice that φ(, z) + Q(, z) = L(x)z L(x) = z z r(, z). x S We can also write where φ m (s, z) = Q m (s, z) = φ(s, z) = m Q(s, z) = m x S m P s (x) x S m j=l(x) j= i=l(x) i= φ m (s, z), (3) Q m (s, z), (32) L(x) i= P(x e i )z L(x), x ej / S B(x e j ) S P(x e j e i ) s z L(x) (33) where B(y) is the set of sequences that are at Hamming distance from sequence y. The factor y/ is there to S enforce that y should not be in S, and therefore is in B (since it is at Hamming distance from S ). The division by B(y) S is to ensure that we do not over-count: it expresses the number of ways y can be reached from S. We next evaluate φ(s, z) and Q(s, z). A. Computation of φ m (s, z) The case k 2d is easy since x e j / S m when x S m. Thus φ m (s, z) = 0. In the sequel we concentrate on k > 2d. Theorem 7: For reduced (d, k) sequences consisting of m super symbols, we have φ m (s, z) = mb (s, z)r m (s, z)+(m )b 2 (s, z)r m 2 (s, z), where b (s, z) = b 2 (s, z) = In particular, k p s l d l +l 2 k b (, ) = l p j p l j z l+, j= p s l p s l 2 p l+l 2+z l+l2+2. l=k p j p l j, b 2 (, ) = {0, l 2d}p l. l Proof. We need to consider two cases: one in which the error changes a 0 to a, and the other one when the error occurs on a. In the first case, m super symbols are not changed and each contributes r(s, z). The corrupted super symbol is divided into two and its contribution is summarized in b (s, z). In the second case, an ending is changed into a 0 so two super symbols (except the last one) collapsed into a one super symbol. This contribution is summarized by b 2 (s, z) while the other m 2 super symbols, represented by r(s, z) are unchanged. B. Computation of Qm (s, z) We recall the following definitions. For integers l, l 2, d l, l 2 k, let I l,l 2 denote the interval I l,l 2 = {l: min + {l d, k l 2 } l min + {l 2 d, k l }}, where min + {a, b} = {min{a, b}, 0}. We shall write I l,l 2 = I l,l 2 \ {0}. Theorem 8: For reduced (d, k) sequences consisting of m super symbols, the following holds Q m (s, z) = mα(s, z)r m (s, z) + (m )τ(s, z)r m 2 (s, z) where α(s, z) = l j {0, 2d l}p s lz l+

8 and τ(s, z) = τ (s, z) + τ 2 (s, z) + τ 3 (s, z) where τ (s, z) = τ 2 (s, z) = τ 3 (s, z) = k l =d l 2=d k ({0, d(l ) + l 2 k} + {0, d(l 2 ) + l k})p s l p s l 2 z l+l2+2, k k θ d (p l p l2 p l+θp l2 θ) s z l+l2+2 2 l =d l 2=d θ Il,l 2 k k l =d l 2=d l+l 2+>k 2 min{k, l + l 2 d} (l + l 2 ) + θ I l,l 2 p l+θp l2 θ s z l+l2+2, with d(l) = min{d, l d}. In particular, for k 2d we have the following simplifications: α(s, z) = (2d l)p s l zl+, l and τ (s, z) = l,l 2 2 {0, l + l 2 k d}p s l p s l 2 z l2+l2+2, τ 2 (s, z) = τ 3 (s, z) = k k l =d l 2=d θ Il,l 2 k k 2 min{k, l + l 2 d} (l + l 2 ) + l =d l 2=d s z l2+l2+2. θ I l,l 2 p l+θp l2 θ Proof. As in the previous proof, the main idea is to enumerate all possible ways a sequence x leaves the status of (d, k) after a bit corruption and returns to (d, k) status after a second bit corruption. In other words, x S, x e j / S, and x e j e i S. We often refer to representation (33) in the proof. We consider several cases: a) Property : Let x be a single super-symbol: x = 0 l. Consider now x e j. First, suppose l 2d and the error e j falls on a zero of x. If e j falls on a zero between l d and d, then 0 l e j = 0 l 0 l2, and at least one of l, l 2 is smaller than d. Therefore, x e j is not a (d, k) sequence. The only way e i can produce a (d, k) sequence is when it is equal to e j : B(x e j ) S =. Assume now l > 2d. If e j falls at distance greater than d from both ends, then x e j S and does not leave S. b) Property 2: If the error e j falls on a symbol 0 l in x = 0 l 0 l2, on the last min{d, l d} zeros, then with θ min{d, l 2 d} 0 l 0 l2 e j = 0 l θ 0 θ 0 l2, and x / S. We have: if it falls also on the last min{d, l d, k l 2 } zeros, i.e θ min{d, l d, k l 2 }, then the only e i that moves x e i e j back a (d, k) sequence is either e j = e i or e j such that it falls on the of 0 l, and B(x e j ) S = 2, otherwise, the only acceptable j is i, so that B(x e j ) S = and x e j / S. c) Property 2bis: If the error e j in x = 0 l 0 l2 falls on the first min{d, l 2 d} zeros of 0 l2, then if it falls also on the first min{d, l 2 d, k l } zeros, then the only e j that moves x e i e j back a (d, k) sequence is either e j = e i or e j such that it falls on the of 0 l, and B(x e j ) S = 2, otherwise, the only acceptable j is i so that B(x e j ) S = and x e j / S. d) Property 3: We still consider x = 0 l 0 l2. If the error falls on the of 0 l, then the only e j that moves x e j e i back (d, k) sequences are those that either fall back on the, or on the min{l 2 d, k l } first zeros of 0 l2, or on the min{l d, k l 2 } last zeros of )0 l, and then B(x e j ) S = + min{l d, k l 2 } + min{l 2 d, k l } = + 2 min{k, l + l 2 d} l l 2. 2 (p l p l2 + p l+θp l2 θ) s z l2+l2+2 Clearly,, then we must have l + l 2 + > k in order x e j / S m. Given these four properties we can define the following quantities α(s, z) = l {0, 2d l}p s l zl+ and τ(s, z) = τ (s, z)+τ 2 (s, z)+τ 3 (s, z) with the convention that α(s, z) corresponds to Property, τ (s, z) to Property 2 and 2bis (second bullet), τ 2 (s, z) to Property 2 and Property 2bis (first bullet), τ 3 (s, z) to Property 3. This completes the proof. C. Asymptotic analysis Finally, we can re-interpret our results for reduced (d, k) sequences of the variable-length model in terms of the original (d, k) sequences of the fixed length. Our aim is to provide an asymptotic evaluation of φ n (), Q n (), φ n() and Q n() as n. To this end, we will present an asymptotic evaluation of φ n (s) and Q n (s). 
Using Theorems 7 and 8 we easily conclude φ(s, z) = Q(s, z) = m m φ m (s, z) = b (s, z) + b 2 (s, z) ( r(s, z)) 2, Q m (s, z) = α(s, z) + τ(s, z) ( r(s, z)) 2. Then by Cauchy formula applied to (29) and (30) Pn s φ n(s, z) = φ(s, z) dz 2iπ z n+, PnQ s n (s, z) = Q(s, z) dz 2iπ z n+.

9 A simple application of the residue analysis leads to Pn s φ n(s) = ρ n (s) λ(s) 2 ((n + )(b (s, ρ(s)) + b 2 (s, ρ(s))) z b (s, ρ(s)) ) z b (s, ρ(s)) + O(µ n ), P s n Q n(s) = ρ n (s) λ(s) 2 ((n + )(α(s, ρ(s)) + τ(s, ρ(s))) z α(s, ρ(s)) z τ(s, ρ(s)) ) + O(µ n ). Since functions involved are analytic and uniformly bounded in s in a compact neighborhood, the asymptotic estimates of φ n() and Q n() can be easily derived. In summary, we find φ n() + Q n() = (n + )ρ ()(φ n () + Q n ()) + O(n) = nh( n ) + O(n), which cancels the coefficient nεh( n ) in the expansion of H(Z n ) in (25). More precisely, φ n () + Q n () = nh(n ) + n log λ () 2λ λ + n ( λ s b (, ) + s b 2(, ) + s α(, ) + τ(, ) ( s ρ 2 () sz b (, ) + 2 sz b 2(, ) 2 )) 2 + α(, ) + τ(, ) sz sz ( + n ρ () λ z b (, ) + z b 2(, ) + z α(, ) + z τ(, ) ) + O(). (34) The expression for f 0 (P ) in Theorem 5 follows directly from the expression (34) since the coefficient at ε is exactly nh( n ) + φ n() + Q n() + φ n () and φ n () = 0 when k 2d. The proof of Theorem 6 is even easier since f 0 (P ) = Q n() n We have from (34): = φ n() n. ( b (, ) + b 2 (, ) φ n () = n λ ). Observe that b (, ) exactly matches γ and b 2 (, ) matches δ in Theorem 6. D. Error Term in Theorem 5 To complete the proof of Theorem 5, we establish here that the dominating error term of the capacity C(S, ε) estimation is O(ε 2 log 2 ε). For this we need to show that the imizing distribution P (ε) of H(Z) introduces error of order O(ε 2 log 2 ε). Recall that P imizes H(). In Appendix A we show that H(Z) = O(log ε) ǫ uniformly in P. As a consequence H(Z) converges to H() uniformly in P as ε 0. We also prove in the Appendix that H(Z) = H()+f 0 (P )ε log ε+f (P )ε+g(p )O(ε 2 log ε), where the functions f 0, f and g of P are in C (all continuous and infinitely many differentiable functions). Let P (ε) be the distribution that imizes H(Z), hence the capacity C(S, ε). For α > 0 let K α be a compact set of distributions that are at topological distance smaller than or equal to α from P. Since H(Z) converges to H() uniformly, there exists ε > 0 such that ε < ε, ε > 0 we have P K α. Let now β = P K α {g(p )}. Clearly, β g(p ) as α 0. Let also and F(P, ε) = H() + f 0 (P )ε log ε + f (P )ε, F α (ε) = P K α {F(P, ε)}. The following inequality for ε < follows from our analysis in Appendix A F α (ε) + βε 2 log ε H(P (ε)) F α(ε) βε 2 log ε. We will prove here that Let F α (ε) = F(P, ε) + O(ε 2 log 2 ε). P P = arg{f(p, ε)}. We have F(, ε) = 0, where F denotes the gradient of F with respect to P. Defining dp = P P we find F( P, ε) = F(P, ε) + 2 F(P, ε)dp + O( dp 2 ) where 2 F is the second derivative matrix (i.e., Hessian) of F and v is the norm of vector v. Since F( P, ε) = 0 and H(P ) = 0, thus F(P, ε) = f 0 (P )ε log ε + f (P )ε. Denoting F 2 = 2 F(P ) and its inverse matrix as F we arrive at and F 2 dp = f 0 (P +O( dp 2 ), 2, )ε log ε + f (P )ε (35) dp = F2 ( f 0 (P )ε log ε + f (P )ε) +O( dp 2 ). (36)

10 This lead to dp = O(ε log ε) for sufficiently small ε such that dp α. Thus F( P, ε) = F(P, ε) + 2 dp F 2 dp + f 0 (P )dp ε log ε + f (P )dp ε + O( dp 3 ), plugging the expression of dp from (36) yields: F α (ε) = F( P, ε) = F(P, ε) 2 f 0(P ) F 2 f 0 (P )ε2 log 2 ε f 0 (P ) F 2 f (P )ε2 log ε 2 f (P ) F2 f (P )ε 2 + O(ε 3 log 3 ε). This completes the proof. V. CONCLUSION We study the capacity of the constrained BSC channel in which the input is a (d, k) sequence. After observing that a (d, k) sequence can be generated by a k-order Markov chain, we reduce the problem to estimating the entropy rate of the underlying hidden Markov process (HMM). In our previous paper [4], [5], we established that the entropy rate for a HMM process is equal to a Lyapunov exponent. After realizing that such an exponent is hard to compute, theoretically and numerically, we obtained an asymptotic expansion of the entropy rate when the error rate ε is small (cf. also [27]). In this paper, we extend previous results in several directions. First, we present asymptotic expansion of the HMM when some of the transition probabilities of the underlying Markov are zero. This adds additional term of order ε log ε to the asymptotic expansion. Then, we return to the noisy constrained capacity and prove that the exact capacity is related to supremum of Lyapunov exponents over increasing order Markov processes. Finally, for (d, k) sequences we obtain an asymptotic expansion for the noisy capacity when the noise ε 0. In particular, we prove that for k 2d the noisy capacity is equal to the noiseless capacity plus a term O(ε). In the case k > 2d, the correction term is O(ε log ε). We should point out that recently Han and Marcus [9], [0] reached similar conclusions (and obtained some generalizations) using quite different methodology. APPENDI A: PROOF OF THEOREM 3 In this Appendix we prove the error term in (8) in Theorem 3 using the methodology developed by us in [5]. We need to prove that for ε < /2 H(Z n ) = H( n )+nf (P )ε+nf 0 (P )ε logε+o(nε 2 log ε) (37) for some f (P ) and f 0 (P ). We start with H(Z n ) = H( n ) ε ǫ H(Zn ) + G n (38) and show at the end of this section that G n = O(nε 2 log ε). We first concentrate on proving that ǫ H(Zn ) = nf (P ) + nf 0 (P )log ε (39) for some f 0 (P ) and f (P ). We use equation (48) from [5] which we reproduce below ǫ P Z(z) = (P Z (z e i ) P Z (z)) 2ε i for any sequence z of length n (hereafter, we simply write x for x n and z for z n ). Consequently, ǫ H(Zn ) = 2ε that can be rewritten as ǫ H(Zn ) = 2ε (P Z (z e i ) P Z (z))log P Z (z) z i x i P Z (z)log P Z(z e i ). P Z (z) In order to estimate the ratio of P Z (z e i ) and P Z (z), we observe that P Z (z) = ( ε) ( ) dh(x,z) ε n P (x), ε x where d H (, x, z) is the Hamming distance between x and z. Similarly, P Z (z e i ) = ( ε) ( ) dh(x,z e ε i) n P (x). ε x The following inequality is easy to prove ( ) dh(x,z e ε i) d H(x,z) min P Z(z e i ) i ε P Z (z) ( ) dh(x,z e ε i) d H(x,z). i ε Since d H (x, z e i ) = d H (x, z) ± we conclude that z i ε ε P Z(z e i ) P Z (z) ε. ε Thus P Z (z)log P Z(z e i ) n log( ε) n log ε P Z (z) and this completes the proof of (37). To finish the proof of Theorem 3, it remains to show that that G n = O(nε 2 log ε), that is, uniformly in n and ε > 0 H(Z n ) = H(n ) ε ǫ H(Zn ) + O(nε2 log ε). (40) To this end, we make use of the Taylor expansion: H(Z n ) = H( n ) ε ǫ H(Zn ) ε 0 θ 2 ǫ 2H(Zn ) ε=θdθ,

11 and prove that for ε small enough we have uniformly in n and ε > 0 2 ǫ 2H(Zn ) = O(n log ε), (4) from which the error term O(nε 2 log ε) follows immediately. In [5] we proved that for all sequences z 2 ǫ 2 P Z(z) = 2 2ε ǫ P Z(z) ( 2ε) 2 (P Z (z e i e j ) P Z (z e i ) P Z (z e j ) + P Z (z)), which led to equation (49) of [5] repeated below 2 ǫ 2H(Zn ) = 2 2ε ǫ H(Zn ) i,j ( 2ε) 2 (D + D 2 ), where D = P Z (z e i e j ) P Z (z e i ) z i,j P Z (z e j ) + P Z (z)log P Z (z), and D 2 = (P Z (z e i )) P Z (z)) z ij (P Z (z e j )) P Z (z)) P Z (z). We will prove that D = O(n log ε) and D 2 = O(n). Let first deal with D. We can write it as D = P Z (z)log P Z(z e i e j )P Z (z) P z Z (z e i )P(z e j ). i,j We now split D = D + D where D involves the pairs (i, j) such that i j k + and D deals with such pairs that j i > k +. For all z and all i and j such that i j k +, we have ε 2 ( ε) 2 < P Z(z e i e j )P Z (z) ( ε)2 < P Z (z e i )P Z (z e j ) ε 2. (42) Therefore, D z j i k+ P Z (z)( 2 log( ε) 2 log ε) (k + )n( 2 log( ε) log ε). For j i > k +, we observe, as in [5], that there exists µ < such that for all z P Z (z e i e j )P Z (z) P Z (z e i )P Z (z e j ) = + O(µ i ) + O(µ j ) + O(µ j i ) +O(µ n i ) + O(µ n i ). Thus we find D = P Z (z)log ( + O(ρ i ) + O(µ j ) z j i >k+ ) +O(µ j i ) + O(µ n i ) + O(µ n i ) = z P Z (z)o(n/( µ)) = O(n). Now we turn our attention to D 2, and similarly we split D 2 = D 2 +D 2 with D 2 involving only i, j such that i j k + and D 2 involving i, j such that i j > k +. We easily see that D 2 n(k + ), and then D 2 = P Z (z) P Z (z e i ) z i j >k P Z (z e j ) + P Z (z e i e j ( ) PZ (z e i )P Z (z e j ) + P Z (z e i e j )P Z (z) P Z (z e i e j, ε). We now notice that P Z (z) P Z (z e i ) P Z (z e j )+P Z (z e i e j ) = 0. z i,j Restricting this sum to i j > k+ we observe that it gives the opposite of the sum for i j k +. Therefore, the total contribution is O((k + )n. Furthermore, ( ) PZ (z e i )P Z (z e j ) P Z (z e i e j )P Z (z) z i j >k+ P Z (z e i e j ) = z P Z (z)o(n/( µ)) = O(n), and this completes the proof of Theorem 3. APPENDI B: PROOF OF (27) Our aim is to show that p l that imize the sequence entropy rate is given by (27). Recall that the entropy rate is equal to ρ () with k ρ () = p l log p l k (l + )p. l If we extend above p l such that l=k p l, then we need to modify it to ρ () = ( k p l ) ( k log p l ) k k (l + )p l p l log p l. The optimal distribution (p d,...,p k ) is the one imizing the gradient of ρ () which implies that the gradient of the denominator is collinear with the gradient of the numerator. Thus there exists ν such that: ( k ) ( k k k p l log p l ) p l log p l = ν (l+)p l. All computations done, this implies that for all l between d and k ( k ) log p i log p l = (l + )ν. i=d For p l such that k p l =, the above identity becomes for all d l k log p l = (l + )ν, hence p l = (e ν ) l+. Setting ρ 0 = e ν with ρ 0 being the unique root of k ρl+ =, we establish (27).

ACKNOWLEDGMENT

The authors thank B. Marcus for very helpful discussions during the summer of 2006 when this research was shaping up. We also thank G. Seroussi for participating in the initial stage of this research.

REFERENCES

[1] D. M. Arnold, H.-A. Loeliger, P. O. Vontobel, A. Kavcic, and W. Zeng, "Simulation-based computation of information rates for channels with memory," IEEE Trans. Information Theory, 52, 2006.
[2] D. Blackwell, "The entropy of functions of finite-state Markov chains," in Trans. First Prague Conf. Information Theory, Statistical Decision Functions, Random Processes (Prague, Czechoslovakia), pp. 13-20, Pub. House Czechoslovak Acad. Sci., 1957.
[3] J. Chen and P. Siegel, "Markov processes asymptotically achieve the capacity of finite-state intersymbol interference channels," IEEE Trans. Information Theory, 54, 2008.
[4] J. Fan, T. L. Poo, and B. Marcus, "Constraint gain," IEEE Trans. Information Theory, 50, 2004.
[5] H. Furstenberg and H. Kesten, "Products of random matrices," Ann. Math. Statist., 1960.
[6] R. Gharavi and V. Anantharam, "An upper bound for the largest Lyapunov exponent of a Markovian product of nonnegative matrices," Theoretical Computer Science, 332, nos. 1-3, 2005.
[7] G. Han and B. Marcus, "Analyticity of entropy rate of hidden Markov chains," IEEE Trans. Information Theory, 52, 2006.
[8] G. Han and B. Marcus, "Analyticity of entropy rate in families of hidden Markov chains (II)," IEEE Trans. Information Theory, 52, 2006.
[9] G. Han and B. Marcus, "Capacity of noisy constrained channels," Proc. ISIT 2007, Nice, France, 2007.
[10] G. Han and B. Marcus, "Asymptotics of the input-constrained binary symmetric channel capacity," Annals of Applied Probability, 19, 2009.
[11] T. Holliday, A. Goldsmith, and P. Glynn, "Capacity of finite state channels based on Lyapunov exponents of random matrices," IEEE Trans. Information Theory, 52, 2006.
[12] K. A. Schouhamer Immink, Codes for Mass Data Storage Systems, Shannon Foundation Publishers, Eindhoven.
[13] P. Jacquet and W. Szpankowski, "Entropy computations via analytic depoissonization," IEEE Trans. Information Theory, 45, 1999.
[14] P. Jacquet, G. Seroussi, and W. Szpankowski, "On the entropy of a hidden Markov process (extended abstract)," Data Compression Conference, Snowbird, 2004.
[15] P. Jacquet, G. Seroussi, and W. Szpankowski, "On the entropy of a hidden Markov process (full version)," Theoretical Computer Science, 395, 2008.
[16] P. Jacquet, G. Seroussi, and W. Szpankowski, "Noisy constrained capacity," 2007 International Symposium on Information Theory, Nice, 2007.
[17] S. Karlin and H. Taylor, A First Course in Stochastic Processes. New York: Academic Press, 1975.
[18] B. Marcus, R. Roth, and P. Siegel, "Constrained systems and coding for recording channels," Chap. 20 in Handbook of Coding Theory (eds. V. S. Pless and W. C. Huffman), Elsevier Science, 1998.
[19] E. Ordentlich and T. Weissman, "New bounds on the entropy rate of hidden Markov processes," Information Theory Workshop, San Antonio, 2004.
[20] E. Ordentlich and T. Weissman, "On the optimality of symbol by symbol filtering and denoising," IEEE Trans. Information Theory, 52, 19-40, 2006.
[21] V. Oseledec, "A multiplicative ergodic theorem," Trudy Moskov. Mat. Obshch., 1968.
[22] E. Seneta, Non-Negative Matrices. New York: John Wiley & Sons, 1973.
[23] S. Shamai (Shitz) and Y. Kofman, "On the capacity of binary and Gaussian channels with run-length limited inputs," IEEE Trans. Commun., 38, 1990.
[24] W. Szpankowski, Average Case Analysis of Algorithms on Sequences. New York: John Wiley & Sons, 2001.
[25] J. Tsitsiklis and V. Blondel, "The Lyapunov exponent and joint spectral radius of pairs of matrices are hard - when not impossible - to compute and to approximate," Mathematics of Control, Signals, and Systems, 10, 31-40, 1997.
[26] E. Zehavi and J. K. Wolf, "On runlength codes," IEEE Trans. Information Theory, 34, 45-54, 1988.
[27] O. Zuk, I. Kanter, and E. Domany, "Asymptotics of the entropy rate for a hidden Markov process," J. Stat. Phys., 121, 2005.
[28] O. Zuk, E. Domany, I. Kanter, and M. Aizenman, "From finite-system entropy to entropy rate for a hidden Markov process," IEEE Signal Processing Letters, 13, 2006.

Philippe Jacquet is a research director at INRIA. He graduated from Ecole Polytechnique in 1981 and from Ecole Nationale des Mines in 1984. He received his Ph.D. degree from Paris Sud University in 1989 and his habilitation degree from Versailles University in 1998. He is currently the head of the HIPERCOM project, which is devoted to high performance communications. His research interests cover information theory, probability theory, quantum telecommunication, evaluation of performance and algorithm design for telecommunication, wireless and ad hoc networking. Philippe Jacquet is an expert in wireless ad hoc networking. He authored the communication protocol OLSR for mobile ad hoc networks (MANET). He is working on flooding techniques in ad hoc networks, as well as on the Multipoint Relaying technique. His activities span space-time information theory considerations in massively dense mobile networks, complexity of distributed capacity, and quality of service management. He holds seven patents. Philippe Jacquet is the author of numerous papers that have appeared in international journals. In 1999 he co-chaired the Information Theory and Networking Workshop, Metsovo, Greece.

Wojciech Szpankowski is Saul Rosen Professor of Computer Science and (by courtesy) Electrical and Computer Engineering at Purdue University, where he teaches and conducts research in analysis of algorithms, information theory, bioinformatics, analytic combinatorics, random structures, and stability problems of distributed systems. He received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from Gdansk University of Technology. He has held several Visiting Professor/Scholar positions, including McGill University, INRIA, France, Stanford, Hewlett-Packard Labs, Universite de Versailles, University of Canterbury, New Zealand, Ecole Polytechnique, France, and the Newton Institute, Cambridge, UK. He is a Fellow of IEEE and an Erskine Fellow. In 2009 he received the Humboldt Research Award. In 2001 he published the book Average Case Analysis of Algorithms on Sequences (John Wiley & Sons, 2001). He has been a guest editor and an editor of technical journals, including Theoretical Computer Science, the ACM Transactions on Algorithms, the IEEE Transactions on Information Theory, Foundations and Trends in Communications and Information Theory, Combinatorics, Probability, and Computing, and Algorithmica. In 2008 he launched the interdisciplinary Institute for Science of Information, and in 2010 he became the Director of the newly established NSF Science and Technology Center for Science of Information.


More information

Symmetric Characterization of Finite State Markov Channels

Symmetric Characterization of Finite State Markov Channels Symmetric Characterization of Finite State Markov Channels Mohammad Rezaeian Department of Electrical and Electronic Eng. The University of Melbourne Victoria, 31, Australia Email: rezaeian@unimelb.edu.au

More information

WE study the capacity of peak-power limited, single-antenna,

WE study the capacity of peak-power limited, single-antenna, 1158 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 3, MARCH 2010 Gaussian Fading Is the Worst Fading Tobias Koch, Member, IEEE, and Amos Lapidoth, Fellow, IEEE Abstract The capacity of peak-power

More information

Analytic Pattern Matching: From DNA to Twitter. Information Theory, Learning, and Big Data, Berkeley, 2015

Analytic Pattern Matching: From DNA to Twitter. Information Theory, Learning, and Big Data, Berkeley, 2015 Analytic Pattern Matching: From DNA to Twitter Wojciech Szpankowski Purdue University W. Lafayette, IN 47907 March 10, 2015 Information Theory, Learning, and Big Data, Berkeley, 2015 Jont work with Philippe

More information

Distributed Power Control for Time Varying Wireless Networks: Optimality and Convergence

Distributed Power Control for Time Varying Wireless Networks: Optimality and Convergence Distributed Power Control for Time Varying Wireless Networks: Optimality and Convergence Tim Holliday, Nick Bambos, Peter Glynn, Andrea Goldsmith Stanford University Abstract This paper presents a new

More information

The Effect of Memory Order on the Capacity of Finite-State Markov and Flat-Fading Channels

The Effect of Memory Order on the Capacity of Finite-State Markov and Flat-Fading Channels The Effect of Memory Order on the Capacity of Finite-State Markov and Flat-Fading Channels Parastoo Sadeghi National ICT Australia (NICTA) Sydney NSW 252 Australia Email: parastoo@student.unsw.edu.au Predrag

More information

Shannon Legacy and Beyond. Dedicated to PHILIPPE FLAJOLET

Shannon Legacy and Beyond. Dedicated to PHILIPPE FLAJOLET Shannon Legacy and Beyond Wojciech Szpankowski Department of Computer Science Purdue University W. Lafayette, IN 47907 May 18, 2011 NSF Center for Science of Information LUCENT, Murray Hill, 2011 Dedicated

More information

lossless, optimal compressor

lossless, optimal compressor 6. Variable-length Lossless Compression The principal engineering goal of compression is to represent a given sequence a, a 2,..., a n produced by a source as a sequence of bits of minimal possible length.

More information

On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel

On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel Yair Carmon and Shlomo Shamai (Shitz) Department of Electrical Engineering, Technion - Israel Institute of Technology 2014

More information

DISCRETE sources corrupted by discrete memoryless

DISCRETE sources corrupted by discrete memoryless 3476 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 8, AUGUST 2006 Universal Minimax Discrete Denoising Under Channel Uncertainty George M. Gemelos, Student Member, IEEE, Styrmir Sigurjónsson, and

More information

Discrete Memoryless Channels with Memoryless Output Sequences

Discrete Memoryless Channels with Memoryless Output Sequences Discrete Memoryless Channels with Memoryless utput Sequences Marcelo S Pinho Department of Electronic Engineering Instituto Tecnologico de Aeronautica Sao Jose dos Campos, SP 12228-900, Brazil Email: mpinho@ieeeorg

More information

Entropy and Ergodic Theory Notes 21: The entropy rate of a stationary process

Entropy and Ergodic Theory Notes 21: The entropy rate of a stationary process Entropy and Ergodic Theory Notes 21: The entropy rate of a stationary process 1 Sources with memory In information theory, a stationary stochastic processes pξ n q npz taking values in some finite alphabet

More information

Notes 3: Stochastic channels and noisy coding theorem bound. 1 Model of information communication and noisy channel

Notes 3: Stochastic channels and noisy coding theorem bound. 1 Model of information communication and noisy channel Introduction to Coding Theory CMU: Spring 2010 Notes 3: Stochastic channels and noisy coding theorem bound January 2010 Lecturer: Venkatesan Guruswami Scribe: Venkatesan Guruswami We now turn to the basic

More information

Exercise 1. = P(y a 1)P(a 1 )

Exercise 1. = P(y a 1)P(a 1 ) Chapter 7 Channel Capacity Exercise 1 A source produces independent, equally probable symbols from an alphabet {a 1, a 2 } at a rate of one symbol every 3 seconds. These symbols are transmitted over a

More information

Some Aspects of Finite State Channel related to Hidden Markov Process

Some Aspects of Finite State Channel related to Hidden Markov Process Some Aspects of Finite State Channel related to Hidden Markov Process Kingo Kobayashi National Institute of Information and Communications Tecnology(NICT), 4-2-1, Nukui-Kitamachi, Koganei, Tokyo, 184-8795,

More information

Bounded Expected Delay in Arithmetic Coding

Bounded Expected Delay in Arithmetic Coding Bounded Expected Delay in Arithmetic Coding Ofer Shayevitz, Ram Zamir, and Meir Feder Tel Aviv University, Dept. of EE-Systems Tel Aviv 69978, Israel Email: {ofersha, zamir, meir }@eng.tau.ac.il arxiv:cs/0604106v1

More information

A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources

A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources Wei Kang Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland, College

More information

Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood

Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood Gadiel Seroussi, Fellow, IEEE, Wojciech Szpankowski, Fellow, IEEE, and Marcelo J. Weinberger Fellow, IEEE Abstract We study the problem

More information

On the static assignment to parallel servers

On the static assignment to parallel servers On the static assignment to parallel servers Ger Koole Vrije Universiteit Faculty of Mathematics and Computer Science De Boelelaan 1081a, 1081 HV Amsterdam The Netherlands Email: koole@cs.vu.nl, Url: www.cs.vu.nl/

More information

Discrete Universal Filtering Through Incremental Parsing

Discrete Universal Filtering Through Incremental Parsing Discrete Universal Filtering Through Incremental Parsing Erik Ordentlich, Tsachy Weissman, Marcelo J. Weinberger, Anelia Somekh Baruch, Neri Merhav Abstract In the discrete filtering problem, a data sequence

More information

ITCT Lecture IV.3: Markov Processes and Sources with Memory

ITCT Lecture IV.3: Markov Processes and Sources with Memory ITCT Lecture IV.3: Markov Processes and Sources with Memory 4. Markov Processes Thus far, we have been occupied with memoryless sources and channels. We must now turn our attention to sources with memory.

More information

An Achievable Error Exponent for the Mismatched Multiple-Access Channel

An Achievable Error Exponent for the Mismatched Multiple-Access Channel An Achievable Error Exponent for the Mismatched Multiple-Access Channel Jonathan Scarlett University of Cambridge jms265@camacuk Albert Guillén i Fàbregas ICREA & Universitat Pompeu Fabra University of

More information

Lattices for Distributed Source Coding: Jointly Gaussian Sources and Reconstruction of a Linear Function

Lattices for Distributed Source Coding: Jointly Gaussian Sources and Reconstruction of a Linear Function Lattices for Distributed Source Coding: Jointly Gaussian Sources and Reconstruction of a Linear Function Dinesh Krithivasan and S. Sandeep Pradhan Department of Electrical Engineering and Computer Science,

More information

5958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 12, DECEMBER 2010

5958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 12, DECEMBER 2010 5958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 12, DECEMBER 2010 Capacity Theorems for Discrete, Finite-State Broadcast Channels With Feedback and Unidirectional Receiver Cooperation Ron Dabora

More information

Midterm Exam Information Theory Fall Midterm Exam. Time: 09:10 12:10 11/23, 2016

Midterm Exam Information Theory Fall Midterm Exam. Time: 09:10 12:10 11/23, 2016 Midterm Exam Time: 09:10 12:10 11/23, 2016 Name: Student ID: Policy: (Read before You Start to Work) The exam is closed book. However, you are allowed to bring TWO A4-size cheat sheet (single-sheet, two-sided).

More information

On rational approximation of algebraic functions. Julius Borcea. Rikard Bøgvad & Boris Shapiro

On rational approximation of algebraic functions. Julius Borcea. Rikard Bøgvad & Boris Shapiro On rational approximation of algebraic functions http://arxiv.org/abs/math.ca/0409353 Julius Borcea joint work with Rikard Bøgvad & Boris Shapiro 1. Padé approximation: short overview 2. A scheme of rational

More information

Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood

Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood Deinterleaving Finite Memory Processes via Penalized Maximum Likelihood Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, CA, USA gseroussi@ieee.org Wojciech Szpankowski Purdue University, West Lafayette,

More information

NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS

NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS NUMERICAL COMPUTATION OF THE CAPACITY OF CONTINUOUS MEMORYLESS CHANNELS Justin Dauwels Dept. of Information Technology and Electrical Engineering ETH, CH-8092 Zürich, Switzerland dauwels@isi.ee.ethz.ch

More information

Multi-Kernel Polar Codes: Proof of Polarization and Error Exponents

Multi-Kernel Polar Codes: Proof of Polarization and Error Exponents Multi-Kernel Polar Codes: Proof of Polarization and Error Exponents Meryem Benammar, Valerio Bioglio, Frédéric Gabry, Ingmar Land Mathematical and Algorithmic Sciences Lab Paris Research Center, Huawei

More information

The Poisson Channel with Side Information

The Poisson Channel with Side Information The Poisson Channel with Side Information Shraga Bross School of Enginerring Bar-Ilan University, Israel brosss@macs.biu.ac.il Amos Lapidoth Ligong Wang Signal and Information Processing Laboratory ETH

More information

Lecture 6 I. CHANNEL CODING. X n (m) P Y X

Lecture 6 I. CHANNEL CODING. X n (m) P Y X 6- Introduction to Information Theory Lecture 6 Lecturer: Haim Permuter Scribe: Yoav Eisenberg and Yakov Miron I. CHANNEL CODING We consider the following channel coding problem: m = {,2,..,2 nr} Encoder

More information

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 1. Cascade of Binary Symmetric Channels The conditional probability distribution py x for each of the BSCs may be expressed by the transition probability

More information

Information Transfer in Biological Systems

Information Transfer in Biological Systems Information Transfer in Biological Systems W. Szpankowski Department of Computer Science Purdue University W. Lafayette, IN 47907 July 1, 2007 AofA and IT logos Joint work with A. Grama, P. Jacquet, M.

More information

A Note on a Problem Posed by D. E. Knuth on a Satisfiability Recurrence

A Note on a Problem Posed by D. E. Knuth on a Satisfiability Recurrence A Note on a Problem Posed by D. E. Knuth on a Satisfiability Recurrence Dedicated to Philippe Flajolet 1948-2011 July 4, 2013 Philippe Jacquet Charles Knessl 1 Wojciech Szpankowski 2 Bell Labs Dept. Math.

More information

(Classical) Information Theory III: Noisy channel coding

(Classical) Information Theory III: Noisy channel coding (Classical) Information Theory III: Noisy channel coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract What is the best possible way

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

On the Capacity of Markov Sources over Noisy Channels

On the Capacity of Markov Sources over Noisy Channels On the Capacity of Markov Sources over Noisy Channels Aleksandar KavCiC Division of Engineering and Applied Sciences Harvard University, Cambridge, MA 02138 Abstract- We present an expectation-maximization

More information

arxiv: v1 [cs.it] 5 Sep 2008

arxiv: v1 [cs.it] 5 Sep 2008 1 arxiv:0809.1043v1 [cs.it] 5 Sep 2008 On Unique Decodability Marco Dalai, Riccardo Leonardi Abstract In this paper we propose a revisitation of the topic of unique decodability and of some fundamental

More information

Lecture 5: Channel Capacity. Copyright G. Caire (Sample Lectures) 122

Lecture 5: Channel Capacity. Copyright G. Caire (Sample Lectures) 122 Lecture 5: Channel Capacity Copyright G. Caire (Sample Lectures) 122 M Definitions and Problem Setup 2 X n Y n Encoder p(y x) Decoder ˆM Message Channel Estimate Definition 11. Discrete Memoryless Channel

More information

Exercises with solutions (Set B)

Exercises with solutions (Set B) Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th

More information

Compact Suffix Trees Resemble Patricia Tries: Limiting Distribution of Depth

Compact Suffix Trees Resemble Patricia Tries: Limiting Distribution of Depth Purdue University Purdue e-pubs Department of Computer Science Technical Reports Department of Computer Science 1992 Compact Suffix Trees Resemble Patricia Tries: Limiting Distribution of Depth Philippe

More information

Variable-to-Variable Codes with Small Redundancy Rates

Variable-to-Variable Codes with Small Redundancy Rates Variable-to-Variable Codes with Small Redundancy Rates M. Drmota W. Szpankowski September 25, 2004 This research is supported by NSF, NSA and NIH. Institut f. Diskrete Mathematik und Geometrie, TU Wien,

More information

Information Theory and Hypothesis Testing

Information Theory and Hypothesis Testing Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking

More information

2012 IEEE International Symposium on Information Theory Proceedings

2012 IEEE International Symposium on Information Theory Proceedings Decoding of Cyclic Codes over Symbol-Pair Read Channels Eitan Yaakobi, Jehoshua Bruck, and Paul H Siegel Electrical Engineering Department, California Institute of Technology, Pasadena, CA 9115, USA Electrical

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle   holds various files of this Leiden University dissertation Cover Page The handle http://hdlhandlenet/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

A One-to-One Code and Its Anti-Redundancy

A One-to-One Code and Its Anti-Redundancy A One-to-One Code and Its Anti-Redundancy W. Szpankowski Department of Computer Science, Purdue University July 4, 2005 This research is supported by NSF, NSA and NIH. Outline of the Talk. Prefix Codes

More information

ELEC546 Review of Information Theory

ELEC546 Review of Information Theory ELEC546 Review of Information Theory Vincent Lau 1/1/004 1 Review of Information Theory Entropy: Measure of uncertainty of a random variable X. The entropy of X, H(X), is given by: If X is a discrete random

More information

Capacity of a Class of Semi-Deterministic Primitive Relay Channels

Capacity of a Class of Semi-Deterministic Primitive Relay Channels Capacity of a Class of Semi-Deterministic Primitive Relay Channels Ravi Tandon Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland, College Park, MD 2742 ravit@umd.edu

More information

Lecture 8: Shannon s Noise Models

Lecture 8: Shannon s Noise Models Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007) Lecture 8: Shannon s Noise Models September 14, 2007 Lecturer: Atri Rudra Scribe: Sandipan Kundu& Atri Rudra Till now we have

More information

Solutions to Homework Set #1 Sanov s Theorem, Rate distortion

Solutions to Homework Set #1 Sanov s Theorem, Rate distortion st Semester 00/ Solutions to Homework Set # Sanov s Theorem, Rate distortion. Sanov s theorem: Prove the simple version of Sanov s theorem for the binary random variables, i.e., let X,X,...,X n be a sequence

More information

A t super-channel. trellis code and the channel. inner X t. Y t. S t-1. S t. S t+1. stages into. group two. one stage P 12 / 0,-2 P 21 / 0,2

A t super-channel. trellis code and the channel. inner X t. Y t. S t-1. S t. S t+1. stages into. group two. one stage P 12 / 0,-2 P 21 / 0,2 Capacity Approaching Signal Constellations for Channels with Memory Λ Aleksandar Kav»cić, Xiao Ma, Michael Mitzenmacher, and Nedeljko Varnica Division of Engineering and Applied Sciences Harvard University

More information

On the Throughput, Capacity and Stability Regions of Random Multiple Access over Standard Multi-Packet Reception Channels

On the Throughput, Capacity and Stability Regions of Random Multiple Access over Standard Multi-Packet Reception Channels On the Throughput, Capacity and Stability Regions of Random Multiple Access over Standard Multi-Packet Reception Channels Jie Luo, Anthony Ephremides ECE Dept. Univ. of Maryland College Park, MD 20742

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information

Variable-Rate Universal Slepian-Wolf Coding with Feedback

Variable-Rate Universal Slepian-Wolf Coding with Feedback Variable-Rate Universal Slepian-Wolf Coding with Feedback Shriram Sarvotham, Dror Baron, and Richard G. Baraniuk Dept. of Electrical and Computer Engineering Rice University, Houston, TX 77005 Abstract

More information

On Scalable Coding in the Presence of Decoder Side Information

On Scalable Coding in the Presence of Decoder Side Information On Scalable Coding in the Presence of Decoder Side Information Emrah Akyol, Urbashi Mitra Dep. of Electrical Eng. USC, CA, US Email: {eakyol, ubli}@usc.edu Ertem Tuncel Dep. of Electrical Eng. UC Riverside,

More information

CS 229r Information Theory in Computer Science Feb 12, Lecture 5

CS 229r Information Theory in Computer Science Feb 12, Lecture 5 CS 229r Information Theory in Computer Science Feb 12, 2019 Lecture 5 Instructor: Madhu Sudan Scribe: Pranay Tankala 1 Overview A universal compression algorithm is a single compression algorithm applicable

More information

Subset Universal Lossy Compression

Subset Universal Lossy Compression Subset Universal Lossy Compression Or Ordentlich Tel Aviv University ordent@eng.tau.ac.il Ofer Shayevitz Tel Aviv University ofersha@eng.tau.ac.il Abstract A lossy source code C with rate R for a discrete

More information

Context tree models for source coding

Context tree models for source coding Context tree models for source coding Toward Non-parametric Information Theory Licence de droits d usage Outline Lossless Source Coding = density estimation with log-loss Source Coding and Universal Coding

More information

Electrical and Information Technology. Information Theory. Problems and Solutions. Contents. Problems... 1 Solutions...7

Electrical and Information Technology. Information Theory. Problems and Solutions. Contents. Problems... 1 Solutions...7 Electrical and Information Technology Information Theory Problems and Solutions Contents Problems.......... Solutions...........7 Problems 3. In Problem?? the binomial coefficent was estimated with Stirling

More information

Shannon Legacy and Beyond

Shannon Legacy and Beyond Shannon Legacy and Beyond Wojciech Szpankowski Department of Computer Science Purdue University W. Lafayette, IN 47907 July 4, 2011 NSF Center for Science of Information Université Paris 13, Paris 2011

More information

EE376A - Information Theory Final, Monday March 14th 2016 Solutions. Please start answering each question on a new page of the answer booklet.

EE376A - Information Theory Final, Monday March 14th 2016 Solutions. Please start answering each question on a new page of the answer booklet. EE376A - Information Theory Final, Monday March 14th 216 Solutions Instructions: You have three hours, 3.3PM - 6.3PM The exam has 4 questions, totaling 12 points. Please start answering each question on

More information

Mismatched Multi-letter Successive Decoding for the Multiple-Access Channel

Mismatched Multi-letter Successive Decoding for the Multiple-Access Channel Mismatched Multi-letter Successive Decoding for the Multiple-Access Channel Jonathan Scarlett University of Cambridge jms265@cam.ac.uk Alfonso Martinez Universitat Pompeu Fabra alfonso.martinez@ieee.org

More information

Feedback Capacity of the First-Order Moving Average Gaussian Channel

Feedback Capacity of the First-Order Moving Average Gaussian Channel Feedback Capacity of the First-Order Moving Average Gaussian Channel Young-Han Kim* Information Systems Laboratory, Stanford University, Stanford, CA 94305, USA Email: yhk@stanford.edu Abstract The feedback

More information

Energy State Amplification in an Energy Harvesting Communication System

Energy State Amplification in an Energy Harvesting Communication System Energy State Amplification in an Energy Harvesting Communication System Omur Ozel Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland College Park, MD 20742 omur@umd.edu

More information

Estimation-Theoretic Representation of Mutual Information

Estimation-Theoretic Representation of Mutual Information Estimation-Theoretic Representation of Mutual Information Daniel P. Palomar and Sergio Verdú Department of Electrical Engineering Princeton University Engineering Quadrangle, Princeton, NJ 08544, USA {danielp,verdu}@princeton.edu

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

Entropy Vectors and Network Information Theory

Entropy Vectors and Network Information Theory Entropy Vectors and Network Information Theory Sormeh Shadbakht and Babak Hassibi Department of Electrical Engineering Caltech Lee Center Workshop May 25, 2007 1 Sormeh Shadbakht and Babak Hassibi Entropy

More information

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have EECS 229A Spring 2007 * * Solutions to Homework 3 1. Problem 4.11 on pg. 93 of the text. Stationary processes (a) By stationarity and the chain rule for entropy, we have H(X 0 ) + H(X n X 0 ) = H(X 0,

More information

On the Distribution of Mutual Information

On the Distribution of Mutual Information On the Distribution of Mutual Information J. Nicholas Laneman Dept. of Electrical Engineering University of Notre Dame Notre Dame IN 46556 Email: jnl@nd.edu Abstract In the early years of information theory

More information

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal

More information

Optimal matching in wireless sensor networks

Optimal matching in wireless sensor networks Optimal matching in wireless sensor networks A. Roumy, D. Gesbert INRIA-IRISA, Rennes, France. Institute Eurecom, Sophia Antipolis, France. Abstract We investigate the design of a wireless sensor network

More information