An Implementable Scheme for Universal Lossy Compression of Discrete Markov Sources

Shirin Jalali, Andrea Montanari and Tsachy Weissman
Department of Electrical Engineering, Stanford University, Stanford, CA 94305
Department of Electrical Engineering, Technion, Haifa 32000, Israel
{shjalali, montanar,

Abstract: We present a new lossy compressor for discrete sources. For coding a source sequence x^n, the encoder starts by assigning a certain cost to each reconstruction sequence. It then finds the reconstruction that minimizes this cost and describes it losslessly to the decoder via a universal lossless compressor. The cost of a sequence is given by a linear combination of its empirical probabilities of some order k+1 and its distortion relative to the source sequence. The linear structure of the cost in the empirical count matrix allows the encoder to employ a Viterbi-like algorithm for obtaining the minimizing reconstruction sequence simply. We identify a choice of coefficients for the linear combination in the cost function which ensures that the algorithm universally achieves the optimum rate-distortion performance of any Markov source in the limit of large n, provided k is increased as o(log n).

I. INTRODUCTION

Let X = {X_i : i ≥ 1} represent a discrete-valued stationary ergodic process with unknown statistics, and consider the problem of compressing X at rate R such that the incurred distortion is minimized. Let X and X̂ denote finite source and reconstruction alphabets respectively. The performance of a coding scheme is measured by its average expected distortion between source and reconstruction blocks, i.e.,

D = E d_n(X^n, X̂^n) = (1/n) Σ_{i=1}^n E d(X_i, X̂_i),  (1)

where d : X × X̂ → R^+ is a single-letter distortion measure. For any R ≥ 0, the minimum achievable distortion (cf. [4] for the exact definition of achievability) is characterized as [1], [2], [3]

D(X, R) = lim_{n→∞} min_{p(X̂^n | X^n) : I(X^n; X̂^n) ≤ nR} E d_n(X^n, X̂^n).  (2)

A sequence of codes at rate R is called universal if for every stationary ergodic source X its asymptotic performance converges to D(X, R), i.e.,

lim sup_{n→∞} E d_n(X^n, X̂^n) ≤ D(X, R).  (3)

For lossless compression, where the source is to be recovered without any errors, there already exist well-known implementable universal schemes such as Lempel-Ziv coding [5] and arithmetic coding [6]. In contrast, for D > 0 there are no well-known practical schemes that universally achieve the rate-distortion curve. In recent years there has been progress towards designing universal lossy compressors, especially in trying to tune some of the existing universal lossless coders to work in the lossy case as well [7], [8], [9]. All of these algorithms are either provably suboptimal, or optimal but with exponential complexity. Another approach to lossy compression, well-studied in the literature and even implemented in the JPEG 2000 image compression standard, is trellis coded quantization, i.e., a trellis structured code plus Viterbi encoding (cf. [10], [11] and references therein). This method is in general suboptimal for coding sources that have memory [11]. In [12], an algorithm for fixed-slope trellis source coding is

proposed, and is shown to be able to get arbitrarily close to the rate-distortion curve for continuous-valued stationary ergodic sources. The proposed method is efficient in the low-rate region.

In a recent work [13], a new implementable algorithm for lossy compression of discrete-valued stationary ergodic sources was proposed. Instead of fixing the rate (or distortion) and minimizing the distortion (or rate), the new algorithm fixes a Lagrangian coefficient α and minimizes R + αD. This is done by assigning an energy E(y^n), representing R + αD, to each possible reconstruction sequence and finding the sequence that minimizes the cost by simulated annealing. The algorithm starts by letting y^n = x^n, and at each iteration chooses an index i ∈ {1, ..., n} uniformly at random and probabilistically changes y_i to some y ∈ X̂ such that there is a positive probability (which goes to zero as the number of iterations increases) that the resulting sequence has higher energy than the original sequence. Allowing the energy to increase, especially at initial steps, prevents the algorithm from being trapped in a local minimum. It was shown that using a universal lossless compressor to describe the reconstruction sequence resulting from this process to the decoder yields a scheme which is universal in the limit of many iterations and large block length. The drawback of the proposed scheme is that although its computational complexity per iteration is independent of the block length n and linear in a parameter k = o(log n), there is no useful bound on the number of iterations required for convergence.

In this paper, inspired by the previous method, we propose yet another approach to lossy compression of discrete Markov sources which universally achieves the optimum rate-distortion performance for any discrete Markov source. We start by assigning to each possible reconstruction sequence the same cost that was defined in [13]: a linear combination of two terms, its empirical conditional entropy and its distance to the source sequence to be coded. We show that there exists a proper linear approximation of the first term such that minimizing the linearized cost results in the same performance as minimizing the original cost. The advantage is that minimizing the modified cost can be done via the Viterbi algorithm, in lieu of the simulated annealing that was used for minimizing the original cost.

The organization of the paper is as follows. In Section II we set up the notation and define the count matrix and empirical conditional entropy of a sequence. Section III describes a new coding scheme for fixed-slope lossy compression which universally achieves the rate-distortion curve for any discrete Markov source, and Section IV describes how to compute the coefficients required by the algorithm outlined in the previous section. Section V explains how the Viterbi algorithm can be used for implementing the coding scheme described in Section III. Section VI presents some simulation results, and finally Section VII concludes the paper with a discussion of some future directions. Proofs that are not presented in the paper will appear in the full version.

II. NOTATION AND REQUIRED DEFINITIONS

Let X and X̂ denote the source and reconstruction alphabets respectively. Let the matrix m(y^n) ∈ R^{|X̂| × |X̂|^k} represent the (k+1)-th order empirical count of y^n, defined as

m_{β,b}(y^n) = (1/n) |{1 ≤ i ≤ n : y_{i−k}^{i−1} = b, y_i = β}|.  (4)

In (4), and throughout, we assume a cyclic convention whereby y_i = y_{n+i} for i ≤ 0. Let H_k(y^n) denote the conditional empirical entropy of order k induced by y^n, i.e.,

H_k(y^n) = H(Y_{k+1} | Y^k),  (5)

where Y^{k+1} on the right-hand side of (5) is distributed according to

P(Y^{k+1} = [b, β]) = m_{β,b}(y^n),  (6)

where β ∈ X̂ and b ∈ X̂^k, and [b, β] represents the vector made by concatenation of b and β. We will use the same notation throughout the paper, namely β, β′, ... ∈ X̂ and b, b′, ... ∈ X̂^k. The conditional empirical entropy in (5) can be expressed as a function of m(y^n) as follows:

H_k(y^n) = H_k(m(y^n)) := Σ_b H(m_{·,b}(y^n)) · 1^T m_{·,b}(y^n),  (7)

where 1 and m_{·,b}(y^n) denote the all-ones column vector of length |X̂| and the column of m(y^n) corresponding to b, respectively. For a vector v = (v_1, ..., v_l)^T with non-negative components, we let H(v) denote the entropy of the random variable whose probability mass function (pmf) is proportional to v. Formally,

H(v) = Σ_{i=1}^l (v_i / ‖v‖_1) log(‖v‖_1 / v_i)  if v ≠ (0, ..., 0)^T,
H(v) = 0                                        if v = (0, ..., 0)^T.  (8)

III. LINEARIZED COST FUNCTION

Consider the following scheme for lossy source coding at fixed slope α > 0. For each source sequence x^n, let the reconstruction block x̂^n be

x̂^n = arg min_{y^n ∈ X̂^n} [H_k(y^n) + α d_n(x^n, y^n)].  (9)

The encoder, after computing x̂^n, losslessly conveys it to the decoder using LZ compression. Let k grow slowly enough with n that

lim sup_{n→∞} max_{y^n} [ (1/n) ℓ_LZ(y^n) − H_k(y^n) ] ≤ 0,  (10)

where ℓ_LZ(y^n) denotes the length of the LZ representation of y^n. Note that Ziv's inequality guarantees that (10) holds if k = k_n = o(log n).

Theorem 1 ([13]): Let X be a stationary and ergodic source, let R(X, D) denote its rate-distortion function, and let X̂^n denote the reconstruction obtained by the above scheme for coding X^n. Then

lim_{n→∞} E[ (1/n) ℓ_LZ(X̂^n) + α d_n(X^n, X̂^n) ] = min_{D ≥ 0} [R(X, D) + αD].  (11)

In other words, conveying the reconstruction sequence to the decoder via universal lossless compression (the choice of the LZ algorithm here is for concreteness; other universal lossless methods can be used as well) achieves optimum fixed-slope rate-distortion performance universally. As proposed in [13], the exhaustive search required by this algorithm can be tackled through simulated annealing Gibbs sampling. Here, assuming the source is a discrete Markov source, we propose another method for finding a sequence achieving the minimum in (9). The advantage of the new method is that its computational complexity is linear in n for fixed k. Before describing the new scheme, consider the problems (P1) and (P2) below:

(P1): min_{y^n} [H_k(m(y^n)) + α d_n(x^n, y^n)],  (12)

and

(P2): min_{y^n} [ Σ_{β,b} λ_{β,b} m_{β,b}(y^n) + α d_n(x^n, y^n) ].  (13)
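To make the objective in (9) and (P1) concrete, the following sketch (Python; the binary alphabet, Hamming distortion, and all function names are this example's assumptions, not the paper's) computes the cost H_k(y^n) + α d_n(x^n, y^n) from the count matrix of (4) with the cyclic convention, and finds the minimizer by the brute-force search that the rest of the paper is designed to avoid:

```python
import itertools
import math
from collections import Counter

def cost(x, y, k, alpha):
    """H_k(y) + alpha * d_n(x, y): the conditional empirical entropy of
    order k (count matrix of eq. (4), cyclic convention) plus the
    per-letter Hamming distortion. Entropy is measured in bits."""
    n = len(y)
    pair, ctx = Counter(), Counter()
    for i in range(n):
        b = tuple(y[(i - k + j) % n] for j in range(k))  # context y_{i-k}^{i-1}
        pair[(b, y[i])] += 1.0 / n
        ctx[b] += 1.0 / n
    h = -sum(p * math.log2(p / ctx[b]) for (b, _), p in pair.items())
    return h + alpha * sum(a != c for a, c in zip(x, y)) / n

def exhaustive_encode(x, k, alpha, alphabet=(0, 1)):
    """Brute-force minimizer of eq. (9) over all |alphabet|^n candidates.
    Feasible only for tiny n; used here purely as a sanity check."""
    return min((list(y) for y in itertools.product(alphabet, repeat=len(x))),
               key=lambda y: cost(x, y, k, alpha))
```

Since `exhaustive_encode` scans |X̂|^n candidates, it is only a reference point for the Viterbi coder of Section V; for large α (small target distortion) it tends to return x^n itself.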

Comparing (P1) with (9) reveals that it is the optimization required by the exhaustive-search coding scheme described before. The question is whether it is possible to choose a set of coefficients {λ_{β,b}}, β ∈ X̂ and b ∈ X̂^k, such that (P1) and (P2) have the same set of minimizers, or at least such that the set of minimizers of (P2) is a subset of the minimizers of (P1). If the answer is affirmative, then instead of solving (P1) one can solve (P2), which, as we describe in Section V, can be done simply via the Viterbi algorithm.

Let S_1 and S_2 denote the sets of minimizers of (P1) and (P2). Consider some z^n ∈ S_1, and let m* = m(z^n). Since H(m) is concave in m,¹ for any empirical count matrix m we have

H(m) ≤ H(m*) + ∇H(m)|_{m=m*} · (m − m*)  (14)
     =: Ĥ(m).  (15)

Now assume that in (P2) the coefficients are chosen as

λ_{β,b} = ∂H(m)/∂m_{β,b} |_{m=m*}.  (16)

Note that since H(m) is positively homogeneous of degree one, Euler's relation gives ∇H(m*) · m* = H(m*), so Ĥ(m) = Σ_{β,b} λ_{β,b} m_{β,b}, and the cost in (P2) is exactly Ĥ(m(y^n)) + α d_n(x^n, y^n).

Lemma 1: (P1) and (P2) have the same minimum value if the coefficients are chosen according to (16). Moreover, if all the sequences in S_1 have the same type, then S_1 = S_2.

Proof: For any y^n ∈ X̂^n,

H(m(y^n)) + α d_n(x^n, y^n) ≤ Ĥ(m(y^n)) + α d_n(x^n, y^n).  (17)

Therefore,

min_{y^n} [H(m(y^n)) + α d_n(x^n, y^n)] ≤ min_{y^n} [Ĥ(m(y^n)) + α d_n(x^n, y^n)]  (18)
  ≤ Ĥ(m(z^n)) + α d_n(x^n, z^n)  (19)
  = min_{y^n} [H(m(y^n)) + α d_n(x^n, y^n)].  (20)

This shows that (P1) and (P2) have the same minimum value. For any sequence y^n with m(y^n) ≠ m*, by strict concavity of H,

Ĥ(m(y^n)) + α d_n(x^n, y^n) > H(m(y^n)) + α d_n(x^n, y^n)  (21)
  ≥ min_{y^n} [H(m(y^n)) + α d_n(x^n, y^n)].  (22)

As a result, all sequences in S_2 must have empirical count matrix equal to m*. Since for these sequences H(m*) = Ĥ(m*), we also conclude that S_2 ⊆ S_1. If there is a unique minimizing type m*, then S_1 = S_2.

This shows that if we knew the optimal type m*, we could compute the optimal coefficients via (16) and solve (P2) instead of (P1). The problem is that m* is not known to the encoder, since knowledge of m* requires solving (P1), which is the very problem we are trying to avoid. In the next section, we describe a method for approximating m*, and hence the coefficients {λ_{β,b}}.

¹ As proved in Appendix B.
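The gradient in (16) has a simple closed form: writing H(m) = Σ_{β,b} −m_{β,b} log( m_{β,b} / 1^T m_{·,b} ), differentiating term by term gives λ_{β,b} = −log( m_{β,b} / 1^T m_{·,b} ) = −log P(β | b), i.e. the ideal conditional code length under the type m. A sketch (Python; the dict representation of m, base-2 logs, and the +∞ convention for zero entries are this example's choices):

```python
import math

def coefficients(m):
    """Coefficients of eq. (16): partial derivatives of the conditional
    entropy H(m) of eq. (7), evaluated at a count matrix m given here as a
    dict {(context tuple b, next symbol beta): probability}. The derivative
    works out to lambda_{beta,b} = -log2 P(beta | b); zero-probability
    entries get +inf (a convention of this sketch)."""
    s = {}                                  # column sums 1^T m_{.,b}
    for (b, _), p in m.items():
        s[b] = s.get(b, 0.0) + p
    return {(b, beta): (-math.log2(p / s[b]) if p > 0 else math.inf)
            for (b, beta), p in m.items()}
```

Consistent with the Euler-relation remark above, Σ_{β,b} λ_{β,b} m_{β,b} evaluated at the same m recovers H(m) exactly.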

IV. HOW TO CHOOSE THE COEFFICIENTS?

For a given stationary ergodic source X, and for any given count matrix m, define D(m) to be the minimum average expected distortion among all processes Y that are jointly stationary and ergodic with X and whose (k+1)-th order stationary distribution is given by m.² D(m) can equivalently be defined as

D(m) = lim_{k_1→∞} min_{p(x^{k_1}, y^{k_1}) ∈ M^{(k_1)}} E_p d_{k_1}(X^{k_1}, Y^{k_1}),  (23)

where M^{(k_1)} is the set of all jointly stationary distributions p(x^{k_1}, y^{k_1}) of (X^{k_1}, Y^{k_1}) whose marginal with respect to x coincides with the k_1-th order distribution of the X process, and whose marginal with respect to y coincides with m, i.e., has the (k+1)-th order marginal distribution described by m.

Lemma 2: If the source is ℓ-th order Markov, then

D(m) = min_{p(x^{k_1}, y^{k_1}) ∈ M^{(k_1)}} E_p d_{k_1}(X^{k_1}, Y^{k_1}),  (24)

where k_1 = max(ℓ, k+1).

Proof (outline): Using the technique described in Appendix A, for any legitimate given joint distribution p(x^{k_1}, y^{k_1}) whose marginal with respect to x coincides with the source distribution and whose marginal with respect to y coincides with some given distribution m, it is possible to construct a process which is jointly stationary and ergodic with our source process and has k_1-th order joint distribution p(x^{k_1}, y^{k_1}). This gives an achievable distortion, i.e., an upper bound on D(m). On the other hand, the limit in (23) approaches D(m) from below. Combining the upper and lower bounds yields the desired equality.

Since by assumption the encoder does not know ℓ, it cannot compute max(ℓ, k+1). But letting k_1 = k+1, where k = o(log n), for any fixed order ℓ, k_1 will eventually, for n large enough, exceed ℓ and hence equal max(ℓ, k+1). With this observation in mind, consider the following optimization problem:

min H(m) + α D(m)  s.t.  m ∈ M^{(k_1)}.  (25)

By Lemma 2, an equivalent representation of (25) is

min  H(m) + α Σ_{β,β′,b,b′} d_{k_1}(β′b′, βb) p_x(β′b′) q_{y|x}(βb | β′b′)
s.t. m_{β,b} = Σ_{β′,b′} p_x(β′b′) q_{y|x}(βb | β′b′),  ∀ β, b,
     0 ≤ q_{y|x}(βb | β′b′) ≤ 1,  ∀ β, β′, b, b′,
     Σ_{β,b} q_{y|x}(βb | β′b′) = 1,  ∀ β′, b′,
     Σ_{β,β′} p_x(β′b′) q_{y|x}(βb | β′b′) = Σ_{β,β′} p_x(b′β′) q_{y|x}(bβ | b′β′),  ∀ b, b′.  (26)

The last constraint in (26) is the stationarity condition defined in (A-1), and ensures that the joint distribution defined by p_x(βb) q_{y|x}(β′b′ | βb) over (x^{k+1}, y^{k+1}) corresponds to the (k+1)-th order marginal distribution of some jointly stationary process (X, Y). Note that the variables in (26) are the conditional distributions q_{y|x}(y^{k_1} | x^{k_1}), but we are only interested in the m that they induce.

² As discussed in Appendix A, the set of such processes is non-empty for any legitimate m.

Lemma 3: If for each n, (P1) has a unique minimizing type m*_n, then

‖m*_n − m̂‖_TV → 0, a.s.,  (27)

where m̂ is the solution of (26).

Remark: In (26), the only dependence on n is through k_1. Therefore, if the encoder knew the distribution of the source, it could solve (26), find a good approximation of m*, and then use (16) to compute the coefficients required by (P2). The problem is that the encoder does not have this information, and only knows that the source is Markov (but not its order). To overcome this lack of information, a reasonable step is to use the empirical distribution of the source instead of the true unknown distribution in (26). For a^{k_1} ∈ X^{k_1}, define the k_1-th order empirical distribution of the source as

p̂_x^{(k_1)}(a^{k_1}) = (1/n) |{i : (x_{i−k_1}, ..., x_{i−1}) = a^{k_1}}|.  (28)

The following lemma shows that for k_1 = o(log n), p̂^{(k_1)} converges to the actual k_1-th order distribution of the source, and can therefore be considered a good approximation of it.

Lemma 4: For k_1 = o(log n) and any stationary ergodic Markov source,

‖p̂^{(k_1)} − p^{(k_1)}‖_TV → 0 a.s.,  (29)

where p^{(k_1)} is the true k_1-th order distribution of the Markov source.

Assume x^n is generated by a discrete Markov source, and let p̂_x^{(k_1)} be its empirical distribution defined in (28). Consider the following optimization problem:

min  H(m) + α Σ_{β,β′,b,b′} d_{k_1}(β′b′, βb) p̂_x^{(k_1)}(β′b′) q_{y|x}(βb | β′b′)
s.t. m_{β,b} = Σ_{β′,b′} p̂_x^{(k_1)}(β′b′) q_{y|x}(βb | β′b′),  ∀ β, b,
     0 ≤ q_{y|x}(βb | β′b′) ≤ 1,  ∀ β, β′, b, b′,
     Σ_{β,b} q_{y|x}(βb | β′b′) = 1,  ∀ β′, b′,
     Σ_{β,β′} p̂_x^{(k_1)}(β′b′) q_{y|x}(βb | β′b′) = Σ_{β,β′} p̂_x^{(k_1)}(b′β′) q_{y|x}(bβ | b′β′),  ∀ b, b′,  (30)

and let m̂_n denote the output of the above optimization problem.

Lemma 5: For k_1 = k_1(n) = o(log n), ‖m̂_n − m̂‖_TV → 0, a.s.

Proof (outline): The input parameters of the optimization problem (30) are {p̂^{(k_1)}(a^{k_1})}_{a^{k_1} ∈ X^{k_1}}; therefore m̂_n = m̂_n({p̂^{(k_1)}(a^{k_1})}_{a^{k_1} ∈ X^{k_1}}). On the other hand, both the cost function and the constraints of (30) are continuous in the input parameters as well as in the optimization variables. This means that m̂_n is in turn a continuous function of {p̂(x^{k_1})}_{x^{k_1} ∈ X^{k_1}}.
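The estimator (28) is computable in one pass over x^n; a sketch (Python, using the cyclic indexing convention of Section II; the function name and dict representation are this example's):

```python
from collections import Counter

def empirical_dist(x, k1):
    """k1-th order empirical distribution of eq. (28), with the cyclic
    convention x_i = x_{n+i}: the frequency of every length-k1 window
    of x, returned as a dict {window tuple: probability}."""
    n = len(x)
    p = Counter()
    for i in range(n):
        p[tuple(x[(i + j) % n] for j in range(k1))] += 1.0 / n
    return dict(p)
```

For a sample from an ℓ-th order Markov source, Lemma 4 says this estimate converges in total variation to the true k_1-th order marginal as long as k_1 grows as o(log n).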
Let {λ_{β,b}(n)} denote the optimal values of the coefficients defined at m*_n (as given in (16)), and let {λ̂_{β,b}(n)} be the coefficients computed at m̂_n. Then:

Lemma 6: max_{β,b} |λ_{β,b}(n) − λ̂_{β,b}(n)| → 0 as n → ∞.  (31)

These results suggest that for computing the coefficients we can solve the optimization problem given in (30) (whose complexity can be controlled through the rate of increase of k_1), and then substitute the result

in (16) to obtain the approximate coefficients. After that, (P2) defined by these coefficients can be solved using the Viterbi algorithm in a way that will be detailed in the next section. The succession of lemmas detailed in the previous sections then allows us to prove the following theorem.

Theorem 2: Let X be a stationary and ergodic Markov source, and let R(X, D) denote its rate-distortion function. Let X̂^n be the reconstruction sequence obtained by the above scheme for coding X^n, choosing k_1 = k+1, where k = o(log n). Then

lim_{n→∞} E[ H_k(m(X̂^n)) + α d_n(X^n, X̂^n) ] = min_{D ≥ 0} [R(X, D) + αD].  (32)

Remark: Theorem 2 implies the fixed-slope universality of a scheme which losslessly compresses the reconstruction by first describing its count matrix (costing a number of bits that is negligible for large n) and then doing conditional entropy coding.

V. VITERBI CODER

As proved in Section III, instead of solving (P1) one can solve (P2) for proper choices of the coefficients {λ_{β,b}}. Note that

Σ_{β,b} λ_{β,b} m_{β,b}(y^n) + α d_n(x^n, y^n) = (1/n) Σ_{i=1}^n [ λ_{y_i, y_{i−k}^{i−1}} + α d(x_i, y_i) ].  (33)

This alternative representation of the cost function suggests that instead of using simulated annealing, we can find the minimizing sequence by the Viterbi algorithm. For i = k+1, ..., n, let s_i = y_{i−k}^i be the state at time i, let S be the set of all 2^{k+1} possible states, and for s = b^{k+1} define

w(s, i) := λ_{b_{k+1}, b^k} + α d(x_i, b_{k+1}).

From our definition of the states, s_i = g(s_{i−1}, y_i), where g : S × X̂ → S. This representation leads to a trellis diagram corresponding to the evolution of the states {s_i}_{i=k+1}^n, in which each state has two states leading to it and two states branching from it. Assume that the weight w(s_i, i) is assigned to the edge connecting states s_{i−1} and s_i, i.e., the cost of each edge depends only on the tail state. It is clear that in our representation there is a one-to-one correspondence between binary sequences y^n and sequences of states {s_i}_{i=k+1}^n, and minimizing (33) is equivalent to finding the path of minimum weight in the corresponding trellis diagram, i.e., the path {s_i}_{i=k+1}^n that minimizes Σ_{i=k+1}^n w(s_i, i). This minimization can readily be carried out by the Viterbi algorithm, which can be described as follows. For each state s, let L(s) be the two states leading to it, and for any i > k+1 let

C(s, i) := min_{s′ ∈ L(s)} [w(s, i) + C(s′, i−1)].  (34)

For i = k+1 and s = b^{k+1}, let C(s, k+1) := λ_{b_{k+1}, b^k} + α d_{k+1}(x^{k+1}, b^{k+1}), where d_{k+1} denotes the distortion over the first k+1 letters. Using this procedure, each state s at each time j carries the minimum-weight path among all possible paths between i = k+1 and i = j such that s_j = s. After computing {C(s, i)}_{s ∈ S, i ∈ {k+1,...,n}}, at time i = n let

s* = arg min_{s ∈ S} C(s, n).  (35)

It is not hard to see that the path leading to s* is the path of minimum weight among all possible paths. Note that the computational complexity of this procedure is linear in n but exponential in k, because the number of states grows exponentially with k.
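The dynamic program (34)-(35) can be sketched as follows (Python; the names, the finite `penalty` charged for contexts missing from the coefficient table, the non-cyclic boundary handling, Hamming distortion, and the binary default alphabet are this example's simplifying assumptions):

```python
import itertools

def viterbi_encode(x, k, alpha, lam, alphabet=(0, 1), penalty=50.0):
    """Sketch of the Section V coder for (P2). State s_i = (y_{i-k},...,y_i);
    the edge into s_i costs lam[(y_{i-k}^{i-1}, y_i)] + alpha * 1{x_i != y_i}.
    `lam` maps (context tuple, symbol) to a coefficient; absent contexts are
    charged the finite `penalty`. The first state is charged distortion on
    all k+1 symbols it covers, mirroring the initialization C(s, k+1)."""
    n = len(x)
    states = list(itertools.product(alphabet, repeat=k + 1))

    def lam_of(s):
        return lam.get((s[:-1], s[-1]), penalty)

    # initialize at time i = k (Python indexing: x[0..k] covered by the state)
    C = {s: lam_of(s) + alpha * sum(a != b for a, b in zip(x[:k + 1], s))
         for s in states}
    path = {s: list(s) for s in states}          # reconstruction built so far
    for i in range(k + 1, n):
        newC, newpath = {}, {}
        for s in states:
            # predecessors L(s): states whose last k symbols are s's first k
            p = min((q for q in states if q[1:] == s[:-1]), key=lambda q: C[q])
            newC[s] = C[p] + lam_of(s) + alpha * (x[i] != s[-1])
            newpath[s] = path[p] + [s[-1]]
        C, path = newC, newpath
    s_star = min(states, key=lambda s: C[s])     # eq. (35)
    return path[s_star], C[s_star]
```

A production version would keep backpointers instead of full per-state paths to save memory; the run time is O(n |X̂|^{k+1}), linear in n and exponential in k, as noted above.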

[Fig. 1: (d_n(x^n, x̂^n), H_k(x̂^n)) for the output points of the Viterbi encoder when the coefficients are computed at m(x^n), plotted against R(D) and the Shannon lower bound for α ∈ {2, 2.5, 3, 3.5, 4}. For each value of α, the algorithm is run L = 20 times. Here n = 5000, k = 7, and the source is binary Markov with q = 0.2.]

VI. SIMULATION RESULTS

In this section, some preliminary simulation results for the Viterbi encoder described in the previous section are presented. In our simulations, instead of computing the coefficients {λ_{β,b}} from (16) at the optimal point m*, we compute them at the count matrix of the input sequence x^n, i.e., at m(x^n). Fig. 1 shows (d_n(x^n, y^n), H_k(m(y^n))) for the output points of the described algorithm. The block length is n = 5000, k = 7, and the source is a first-order binary symmetric Markov source with transition probability q = 0.2. For each value of α, the algorithm is applied to L = 20 different randomly generated sequences. The reason some points fall below the rate-distortion curve is that the actual number of bits required for describing x̂^n losslessly to the decoder is larger than H_k(x̂^n), though it converges to it as n grows. For example, for the simple scheme of separately describing the subsequences corresponding to the different preceding contexts, this surplus is of order 2^k log n / n. The effect of this excess rate is not reflected in the figure, which explains why some points appear below the rate-distortion curve. It can be observed that for larger values of α the output points are closer to the curve. The reason is that large values of α correspond to small values of distortion, and if the distortion is small then m(x^n) is a good approximation of m(y^n).

Finally, Fig. 2 compares the performance of the new Viterbi encoder and the MCMC encoder described in [13]. Here the source is again binary symmetric Markov with q = 0.2, and the other parameters are k = 7, n = 5000, β_t = log t, and r = 10n, where β_t determines the cooling schedule of the MCMC coder and r is its number of iterations. Each point in the figure corresponds to the average performance over L = 10 random realizations of the source. It can be observed that even for this simplistic choice of the coefficients the performance of the two algorithms is comparable, while the Viterbi encoder, in this example, runs at least 40 times faster.

VII. CONCLUSIONS AND CURRENT DIRECTIONS

In this paper, a new method for universal fixed-slope lossy compression of discrete Markov sources was proposed. The new method achieves the rate-distortion curve for any discrete Markov source. Extending

the algorithm to work on any stationary ergodic source is under current investigation. We believe that in fact the same algorithm works for the general class of stationary ergodic sources, and that only the proof needs to be extended to cover this case. Another direction for future work is finding a simple method for approximating the optimal coefficients that would alleviate the need for solving the optimization problem (30).

[Fig. 2: Comparison of the performances of the Viterbi encoder and the MCMC encoder proposed in [13], for α ∈ {2, 2.5, 3, 3.5, 4}, against R(D) and the Shannon lower bound.]

APPENDIX A: STATIONARITY CONDITION

Assume that we are given an |X̂| × |X̂|^k matrix m with all elements positive and summing to one. The question is under what condition(s) this matrix can be the (k+1)-th order stationary distribution of a stationary process. For ease of notation, instead of the matrix m, consider p(x^{k+1}) as a distribution defined on X̂^{k+1}. We show that a necessary and sufficient condition is the so-called stationarity condition:

Σ_{β ∈ X̂} p(βx^k) = Σ_{β ∈ X̂} p(x^k β).  (A-1)

- Necessity: The necessity of (A-1) is a direct result of the definition of stationarity of a process. If p(x^{k+1}) is to represent the (k+1)-th order marginal distribution of a stationary process, then it must be consistent with the k-th order marginal distribution and hence satisfy (A-1).

- Sufficiency: To prove sufficiency, we assume that (A-1) holds and build a stationary process with (k+1)-th order marginal distribution p(x^{k+1}). Consider a k-th order Markov chain with transition probabilities

q(x_{k+1} | x^k) = p(x^{k+1}) / p(x^k).  (A-2)

Note that p(x^k) is well-defined by (A-1). Moreover, again from (A-1), p(x^{k+1}) is the stationary

distribution of the defined Markov chain, because

Σ_{x_1} q(x_{k+1} | x^k) p(x^k) = Σ_{x_1} p(x^{k+1}) = p(x_2^{k+1}).  (A-3)

Therefore we have found a stationary process that has the desired marginal distribution. Finally, we show that if m is the count matrix of a sequence y^n, then there exists a stationary process with marginal distribution coinciding with m. From what we just proved, we only need to show that (A-1) holds, i.e.,

Σ_β m_{β,b} = Σ_β m_{b_k, [β, b_1, ..., b_{k−1}]}.  (A-4)

But this is true because both sides of (A-4) are equal to |{i : y_{i+1}^{i+k} = b}| / n.

APPENDIX B: CONCAVITY OF H(m)

For simplicity, assume that X = X̂ = {0, 1}. By definition,

H(m) = Σ_{b ∈ {0,1}^k} (m_{0,b} + m_{1,b}) h( m_{0,b} / (m_{0,b} + m_{1,b}) ),  (B-1)

where h(α) = −α log α − ᾱ log ᾱ and ᾱ = 1 − α. We need to show that for any θ ∈ [0, 1] and empirical count matrices m^(1) and m^(2),

θ H(m^(1)) + θ̄ H(m^(2)) ≤ H(θ m^(1) + θ̄ m^(2)).  (B-2)

Fix b and write s^(i) = m^(i)_{0,b} + m^(i)_{1,b}, with θ_1 = θ and θ_2 = θ̄. From the concavity of h it follows that

θ s^(1) h( m^(1)_{0,b} / s^(1) ) + θ̄ s^(2) h( m^(2)_{0,b} / s^(2) )
  = (θ s^(1) + θ̄ s^(2)) Σ_{i ∈ {1,2}} [ θ_i s^(i) / (θ s^(1) + θ̄ s^(2)) ] h( m^(i)_{0,b} / s^(i) )
  ≤ (θ s^(1) + θ̄ s^(2)) h( (θ m^(1)_{0,b} + θ̄ m^(2)_{0,b}) / (θ s^(1) + θ̄ s^(2)) ),  (B-3)

which is the b-th term of the right-hand side of (B-2). Summing both sides of (B-3) over all b ∈ X̂^k yields the desired result.

REFERENCES

[1] C. Shannon, "Coding theorems for a discrete source with a fidelity criterion," IRE Nat. Conv. Rec., part 4.
[2] R. G. Gallager, Information Theory and Reliable Communication. New York, NY: John Wiley & Sons.
[3] T. Berger, Rate-Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, NJ: Prentice-Hall.
[4] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley.
[5] J. Ziv and A. Lempel, "Compression of individual sequences via variable-rate coding," IEEE Trans. on Inf. Theory, vol. 24, no. 5, Sep.
[6] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic coding for data compression," Commun. Assoc. Comp. Mach., vol. 30, no. 6.
[7] I. Kontoyiannis, "An implementable lossy version of the Lempel-Ziv algorithm, part I: optimality for memoryless sources," IEEE Trans. on Inform. Theory, vol. 45, Nov.
[8] E. Yang, Z. Zhang, and T. Berger, "Fixed-slope universal lossy data compression," IEEE Trans. on Inform. Theory, vol. 43, no. 5, Sep.
[9] E. H. Yang and J. Kieffer, "Simple universal lossy data compression schemes derived from the Lempel-Ziv algorithm," IEEE Trans. on Inform. Theory, vol. 42, no. 1.
[10] T. Berger and J. D. Gibson, "Lossy source coding," IEEE Trans. on Inform. Theory, vol. 44, no. 6.
[11] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Springer.
[12] E. Yang and Z. Zhang, "Variable-rate trellis source encoding," IEEE Trans. on Inform. Theory, vol. 45, no. 2.
[13] S. Jalali and T. Weissman, "Lossy coding via Markov chain Monte Carlo," in Proc. IEEE International Symposium on Information Theory, Toronto, Canada, 2008.


(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

Complex Analysis Spring 2001 Homework I Solution

Complex Analysis Spring 2001 Homework I Solution Complex Aalysis Sprig 2001 Homework I Solutio 1. Coway, Chapter 1, sectio 3, problem 3. Describe the set of poits satisfyig the equatio z a z + a = 2c, where c > 0 ad a R. To begi, we see from the triagle

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

Beurling Integers: Part 2

Beurling Integers: Part 2 Beurlig Itegers: Part 2 Isomorphisms Devi Platt July 11, 2015 1 Prime Factorizatio Sequeces I the last article we itroduced the Beurlig geeralized itegers, which ca be represeted as a sequece of real umbers

More information

Rank Modulation with Multiplicity

Rank Modulation with Multiplicity Rak Modulatio with Multiplicity Axiao (Adrew) Jiag Computer Sciece ad Eg. Dept. Texas A&M Uiversity College Statio, TX 778 ajiag@cse.tamu.edu Abstract Rak modulatio is a scheme that uses the relative order

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

Entropies & Information Theory

Entropies & Information Theory Etropies & Iformatio Theory LECTURE I Nilajaa Datta Uiversity of Cambridge,U.K. For more details: see lecture otes (Lecture 1- Lecture 5) o http://www.qi.damtp.cam.ac.uk/ode/223 Quatum Iformatio Theory

More information

Vector Permutation Code Design Algorithm. Danilo SILVA and Weiler A. FINAMORE

Vector Permutation Code Design Algorithm. Danilo SILVA and Weiler A. FINAMORE Iteratioal Symposium o Iformatio Theory ad its Applicatios, ISITA2004 Parma, Italy, October 10 13, 2004 Vector Permutatio Code Desig Algorithm Dailo SILVA ad Weiler A. FINAMORE Cetro de Estudos em Telecomuicações

More information

OPTIMAL PIECEWISE UNIFORM VECTOR QUANTIZATION OF THE MEMORYLESS LAPLACIAN SOURCE

OPTIMAL PIECEWISE UNIFORM VECTOR QUANTIZATION OF THE MEMORYLESS LAPLACIAN SOURCE Joural of ELECTRICAL EGIEERIG, VOL. 56, O. 7-8, 2005, 200 204 OPTIMAL PIECEWISE UIFORM VECTOR QUATIZATIO OF THE MEMORYLESS LAPLACIA SOURCE Zora H. Perić Veljo Lj. Staović Alesadra Z. Jovaović Srdja M.

More information

Generalized Semi- Markov Processes (GSMP)

Generalized Semi- Markov Processes (GSMP) Geeralized Semi- Markov Processes (GSMP) Summary Some Defiitios Markov ad Semi-Markov Processes The Poisso Process Properties of the Poisso Process Iterarrival times Memoryless property ad the residual

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Multiterminal source coding with complementary delivery

Multiterminal source coding with complementary delivery Iteratioal Symposium o Iformatio Theory ad its Applicatios, ISITA2006 Seoul, Korea, October 29 November 1, 2006 Multitermial source codig with complemetary delivery Akisato Kimura ad Tomohiko Uyematsu

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values Iteratioal Joural of Applied Operatioal Research Vol. 4 No. 1 pp. 61-68 Witer 2014 Joural homepage: www.ijorlu.ir Cofidece iterval for the two-parameter expoetiated Gumbel distributio based o record values

More information

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday

More information

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170 UC Berkeley CS 170: Efficiet Algorithms ad Itractable Problems Hadout 17 Lecturer: David Wager April 3, 2003 Notes 17 for CS 170 1 The Lempel-Ziv algorithm There is a sese i which the Huffma codig was

More information

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP Etropy ad Ergodic Theory Lecture 5: Joit typicality ad coditioal AEP 1 Notatio: from RVs back to distributios Let (Ω, F, P) be a probability space, ad let X ad Y be A- ad B-valued discrete RVs, respectively.

More information

New Bounds on the Rate-Distortion Function of a

New Bounds on the Rate-Distortion Function of a ISIT2007, ice, Frace, Jue 24 - Jue 29, 2007 ew Bouds o the Rate-Distortio Fuctio of a Biary Markov Source Shiri Jalali Departmet of Electrical Egieerig Staford Uiversity Staford, CA, 94305, USA shjalali@

More information

On Evaluating the Rate-Distortion Function of Sources with Feed-Forward and the Capacity of Channels with Feedback.

On Evaluating the Rate-Distortion Function of Sources with Feed-Forward and the Capacity of Channels with Feedback. O Evaluatig the Rate-Distortio Fuctio of Sources with Feed-Forward ad the Capacity of Chaels with Feedback. Ramji Vekataramaa ad S. Sadeep Pradha Departmet of EECS, Uiversity of Michiga, A Arbor, MI 4805

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Vector Quantization: a Limiting Case of EM

Vector Quantization: a Limiting Case of EM . Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z

More information

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE Geeral e Image Coder Structure Motio Video (s 1,s 2,t) or (s 1,s 2 ) Natural Image Samplig A form of data compressio; usually lossless, but ca be lossy Redudacy Removal Lossless compressio: predictive

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Expectation-Maximization Algorithm.

Expectation-Maximization Algorithm. Expectatio-Maximizatio Algorithm. Petr Pošík Czech Techical Uiversity i Prague Faculty of Electrical Egieerig Dept. of Cyberetics MLE 2 Likelihood.........................................................................................................

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Multiterminal Source Coding with an Entropy-Based Distortion Measure

Multiterminal Source Coding with an Entropy-Based Distortion Measure 20 IEEE Iteratioal Symposium o Iformatio Theory Proceedigs Multitermial Source Codig with a Etropy-Based Distortio Measure Thomas A. Courtade ad Richard D. Wesel Departmet of Electrical Egieerig Uiversity

More information

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar. Clusterig CM226: Machie Learig for Bioiformatics. Fall 216 Sriram Sakararama Ackowledgmets: Fei Sha, Ameet Talwalkar Clusterig 1 / 42 Admiistratio HW 1 due o Moday. Email/post o CCLE if you have questios.

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Properties of Point Estimators and Methods of Estimation

Properties of Point Estimators and Methods of Estimation CHAPTER 9 Properties of Poit Estimators ad Methods of Estimatio 9.1 Itroductio 9. Relative Efficiecy 9.3 Cosistecy 9.4 Sufficiecy 9.5 The Rao Blackwell Theorem ad Miimum-Variace Ubiased Estimatio 9.6 The

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Machine Learning for Data Science (CS 4786)

Machine Learning for Data Science (CS 4786) Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Singular Continuous Measures by Michael Pejic 5/14/10

Singular Continuous Measures by Michael Pejic 5/14/10 Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

Finite Block-Length Gains in Distributed Source Coding

Finite Block-Length Gains in Distributed Source Coding Decoder Fiite Block-Legth Gais i Distributed Source Codig Farhad Shirai EECS Departmet Uiversity of Michiga A Arbor,USA Email: fshirai@umichedu S Sadeep Pradha EECS Departmet Uiversity of Michiga A Arbor,USA

More information

Polynomial identity testing and global minimum cut

Polynomial identity testing and global minimum cut CHAPTER 6 Polyomial idetity testig ad global miimum cut I this lecture we will cosider two further problems that ca be solved usig probabilistic algorithms. I the first half, we will cosider the problem

More information

Achieving Stationary Distributions in Markov Chains. Monday, November 17, 2008 Rice University

Achieving Stationary Distributions in Markov Chains. Monday, November 17, 2008 Rice University Istructor: Achievig Statioary Distributios i Markov Chais Moday, November 1, 008 Rice Uiversity Dr. Volka Cevher STAT 1 / ELEC 9: Graphical Models Scribe: Rya E. Guerra, Tahira N. Saleem, Terrace D. Savitsky

More information

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients.

Definitions and Theorems. where x are the decision variables. c, b, and a are constant coefficients. Defiitios ad Theorems Remember the scalar form of the liear programmig problem, Miimize, Subject to, f(x) = c i x i a 1i x i = b 1 a mi x i = b m x i 0 i = 1,2,, where x are the decisio variables. c, b,

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

A New Multivariate Markov Chain Model with Applications to Sales Demand Forecasting

A New Multivariate Markov Chain Model with Applications to Sales Demand Forecasting Iteratioal Coferece o Idustrial Egieerig ad Systems Maagemet IESM 2007 May 30 - Jue 2 BEIJING - CHINA A New Multivariate Markov Chai Model with Applicatios to Sales Demad Forecastig Wai-Ki CHING a, Li-Mi

More information

Summary. Recap ... Last Lecture. Summary. Theorem

Summary. Recap ... Last Lecture. Summary. Theorem Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam 4 will cover.-., 0. ad 0.. Note that eve though. was tested i exam, questios from that sectios may also be o this exam. For practice problems o., refer to the last review. This

More information

Fixed-Threshold Polar Codes

Fixed-Threshold Polar Codes Fixed-Threshold Polar Codes Jig Guo Uiversity of Cambridge jg582@cam.ac.uk Albert Guillé i Fàbregas ICREA & Uiversitat Pompeu Fabra Uiversity of Cambridge guille@ieee.org Jossy Sayir Uiversity of Cambridge

More information

Differentiable Convex Functions

Differentiable Convex Functions Differetiable Covex Fuctios The followig picture motivates Theorem 11. f ( x) f ( x) f '( x)( x x) ˆx x 1 Theorem 11 : Let f : R R be differetiable. The, f is covex o the covex set C R if, ad oly if for

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

5.1 A mutual information bound based on metric entropy

5.1 A mutual information bound based on metric entropy Chapter 5 Global Fao Method I this chapter, we exted the techiques of Chapter 2.4 o Fao s method the local Fao method) to a more global costructio. I particular, we show that, rather tha costructig a local

More information

Introduction to Optimization Techniques. How to Solve Equations

Introduction to Optimization Techniques. How to Solve Equations Itroductio to Optimizatio Techiques How to Solve Equatios Iterative Methods of Optimizatio Iterative methods of optimizatio Solutio of the oliear equatios resultig form a optimizatio problem is usually

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Time-Domain Representations of LTI Systems

Time-Domain Representations of LTI Systems 2.1 Itroductio Objectives: 1. Impulse resposes of LTI systems 2. Liear costat-coefficiets differetial or differece equatios of LTI systems 3. Bloc diagram represetatios of LTI systems 4. State-variable

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Optimization Methods MIT 2.098/6.255/ Final exam

Optimization Methods MIT 2.098/6.255/ Final exam Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

Feedback in Iterative Algorithms

Feedback in Iterative Algorithms Feedback i Iterative Algorithms Charles Byre (Charles Byre@uml.edu), Departmet of Mathematical Scieces, Uiversity of Massachusetts Lowell, Lowell, MA 01854 October 17, 2005 Abstract Whe the oegative system

More information

1 Hash tables. 1.1 Implementation

1 Hash tables. 1.1 Implementation Lecture 8 Hash Tables, Uiversal Hash Fuctios, Balls ad Bis Scribes: Luke Johsto, Moses Charikar, G. Valiat Date: Oct 18, 2017 Adapted From Virgiia Williams lecture otes 1 Hash tables A hash table is a

More information

10.6 ALTERNATING SERIES

10.6 ALTERNATING SERIES 0.6 Alteratig Series Cotemporary Calculus 0.6 ALTERNATING SERIES I the last two sectios we cosidered tests for the covergece of series whose terms were all positive. I this sectio we examie series whose

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

A new iterative algorithm for reconstructing a signal from its dyadic wavelet transform modulus maxima

A new iterative algorithm for reconstructing a signal from its dyadic wavelet transform modulus maxima ol 46 No 6 SCIENCE IN CHINA (Series F) December 3 A ew iterative algorithm for recostructig a sigal from its dyadic wavelet trasform modulus maxima ZHANG Zhuosheg ( u ), LIU Guizhog ( q) & LIU Feg ( )

More information