58487 Data Compression Techniques (Spring 0)
Model Solutions for Exercise 4

If you have any feedback or corrections, please contact jro.lo at cs.helsinki.fi.

1. Problem: Let T = […] and Σ = {[…], […], […], […]}. Encode T using adaptive Huffman coding.

Solution:

[Figure: the adaptive Huffman trees after each step, with root R and internal nodes U and V.]

Initially, all counts are set to one. We read […] and output the code 0. We swap […]. Then we read […] and output 00. We swap V. Then we read […] and output 0. We swap V, […]. Then we read […] and output 000. We swap […]. Then we read […] and output 0.
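We swap V. Finally, we read […] and output […].

Remark. Rebalancing is done after the codeword is output. Otherwise, it is difficult to determine the same rebalancing operations when decoding the output.

2. Problem: Show that

\[ H_0(T) = \log n - \frac{1}{n} \sum_{s \in \Sigma} n_s \log n_s \]

\[ H_k(T) = \frac{1}{n} \sum_{w \in \Sigma^k} n_w \log n_w - \frac{1}{n} \sum_{w \in \Sigma^{k+1}} n_w \log n_w \]

Does the above mean that $\sum_k H_k(T) = \log n$ for all $T$?

Solution: Recall that $\log \frac{a}{b} = \log a - \log b$. Also, notice that $\sum_{s \in \Sigma} n_s = n$.

\begin{align*}
H_0(T) &= -\sum_{s \in \Sigma} \frac{n_s}{n} \log \frac{n_s}{n}
        = \sum_{s \in \Sigma} \frac{n_s}{n} \log \frac{n}{n_s}
        = \frac{1}{n} \sum_{s \in \Sigma} n_s \log \frac{n}{n_s} \\
       &= \frac{1}{n} \sum_{s \in \Sigma} n_s (\log n - \log n_s)
        = \frac{1}{n} \sum_{s \in \Sigma} n_s \log n - \frac{1}{n} \sum_{s \in \Sigma} n_s \log n_s \\
       &= \frac{\log n}{n} \sum_{s \in \Sigma} n_s - \frac{1}{n} \sum_{s \in \Sigma} n_s \log n_s
        = \log n - \frac{1}{n} \sum_{s \in \Sigma} n_s \log n_s.
\end{align*}

Similarly for the second case (notice that $\sum_{s \in \Sigma} n_{ws} = n_w$ for any $w$):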
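\begin{align*}
H_k(T) &= -\sum_{w \in \Sigma^k} \frac{n_w}{n} \sum_{s \in \Sigma} \frac{n_{ws}}{n_w} \log \frac{n_{ws}}{n_w}
        = -\frac{1}{n} \sum_{w \in \Sigma^k} \sum_{s \in \Sigma} n_{ws} \log \frac{n_{ws}}{n_w} \\
       &= -\frac{1}{n} \sum_{w \in \Sigma^k} \sum_{s \in \Sigma} n_{ws} (\log n_{ws} - \log n_w) \\
       &= \frac{1}{n} \sum_{w \in \Sigma^k} \sum_{s \in \Sigma} n_{ws} \log n_w - \frac{1}{n} \sum_{w \in \Sigma^k} \sum_{s \in \Sigma} n_{ws} \log n_{ws} \\
       &= \frac{1}{n} \sum_{w \in \Sigma^k} \log n_w \sum_{s \in \Sigma} n_{ws} - \frac{1}{n} \sum_{w \in \Sigma^k} \sum_{s \in \Sigma} n_{ws} \log n_{ws} \\
       &= \frac{1}{n} \sum_{w \in \Sigma^k} n_w \log n_w - \frac{1}{n} \sum_{w \in \Sigma^{k+1}} n_w \log n_w.
\end{align*}

The claim $\sum_k H_k(T) = \log n$ does not hold for all $T$ because, based on the above equations, the sum telescopes (for $k = 0$ the only context is the empty string, which occurs $n$ times) and we have

\[ \sum_{k=0}^{m} H_k(T) = \log n - \frac{1}{n} \sum_{w \in \Sigma^{m+1}} n_w \log n_w. \]

There exist input strings such that the sum on the right side is larger than 0 for any $m$. Take, for example, the string $T = aaa$ and an arbitrarily long context $w = aa \ldots a$: the sum is simply $n_w \log n_w = 3 \log 3$, because we also count the cyclic occurrences.

3. Problem: Assume the LZW dictionary is initialized with $Z_0 = ␣$, $Z_1 = a$, $Z_2 = b$, ..., $Z_{26} = z$. Decode

15 20 1 0 22 1 12 28 30 32 28 31 33 29 38 20 9 15 39

Solution: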
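The decoding loop for this problem can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `lzw_decode` is hypothetical, the initial dictionary assumes that $Z_0$ is the space character, and the code list is a reconstruction of the problem input from the solution table.

```python
import string


def lzw_decode(codes):
    """Decode a list of LZW codes (hypothetical helper for illustration)."""
    # Assumed initial dictionary: Z_0 = space, Z_1..Z_26 = a..z.
    dictionary = [" "] + list(string.ascii_lowercase)
    prev = dictionary[codes[0]]
    output = [prev]
    for code in codes[1:]:
        if code < len(dictionary):
            entry = dictionary[code]
        elif code == len(dictionary):
            # Tricky case: the encoder used the phrase it had just added,
            # so the phrase must be prev followed by prev's first symbol.
            entry = prev + prev[0]
        else:
            raise ValueError(f"invalid code {code}")
        # New phrase: previous output plus first symbol of current output.
        dictionary.append(prev + entry[0])
        output.append(entry)
        prev = entry
    return "".join(output)


# Reconstructed code sequence of this problem (an assumption, see above).
codes = [15, 20, 1, 0, 22, 1, 12, 28, 30, 32, 28, 31, 33, 29, 38, 20, 9, 15, 39]
print(lzw_decode(codes))  # ota valta valtavalta valtiolta
```

The `elif` branch handles the tricky case in which a code refers to the phrase the encoder has only just created; for instance, `lzw_decode([1, 2, 27, 29])` yields `abababa` even though code 29 is not yet in the decoder's dictionary when it is read.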
Code   Output   Add to dictionary
15     o
20     t        Z_27 = ot
1      a        Z_28 = ta
0      ␣        Z_29 = a␣
22     v        Z_30 = ␣v
1      a        Z_31 = va
12     l        Z_32 = al
28     ta       Z_33 = lt
30     ␣v       Z_34 = ta␣
32     al       Z_35 = ␣va
28     ta       Z_36 = alt
31     va       Z_37 = tav
33     lt       Z_38 = val
29     a␣       Z_39 = lta
38     val      Z_40 = a␣v
20     t        Z_41 = valt
9      i        Z_42 = ti
15     o        Z_43 = io
39     lta      Z_44 = ol

Note: There is one tricky case which does not show up in this example. While decoding, the decoder's dictionary has all the phrases that the encoder had at that point, except the latest phrase. If the encoder happens to use the latest phrase immediately, then the decoder has a problem. Luckily there is a way out: see lecture 6, slide 7.

4. Problem: Let $X$ be a random (incompressible) string of length $k$ over a binary alphabet. All variants of LZ78 and LZ77 compression with unlimited distance and length values will parse $X$ into $\Theta(k / \log k)$ phrases with no phrase longer than $O(\log k)$. For LZ77 with limits $d_{\max}$ and $l_{\max}$, $\log k$ is replaced by $\min(\log k, \log d_{\max}, l_{\max})$. What is the (asymptotic) compression ratio of the following algorithms on $T = X^{n/k}$?

(a) LZ77 with $d_{\max} = l_{\max}$ and fixed length encoding.
(b) LZ77 without length or distance limits and $\gamma$ coding.
(c) LZ78.

Solve at least two of the three subproblems.

Solution:

(a) The input string is $T = \underbrace{X X \cdots X}_{X \text{ repeated } n/k \text{ times}}$. When we use fixed length encoding, each phrase requires $\lceil \log d_{\max} \rceil + \lceil \log (l_{\max} + 1) \rceil + \lceil \log \sigma \rceil$ bits. Since $\lceil \log \sigma \rceil = 1$ and $d_{\max} = l_{\max}$, the space simplifies to $O(\log d_{\max})$ bits per phrase. Let $z$ denote the number of phrases. Recalling the definition of compression ratio, we have

\[ \text{compression ratio} = \frac{\text{compressed size}}{\text{uncompressed size}} = O\left( \frac{z \log d_{\max}}{n \log \sigma} \right) = O\left( \frac{z}{n} \log d_{\max} \right). \]

Now let us look at two cases to determine the asymptotic behavior of $z$:
If $k \le d_{\max}$, we achieve good compression. Each $X$ (except the first one) is represented with at most one phrase (i.e. the phrase $X$). In general, we represent $\lfloor d_{\max} / k \rfloor$ consecutive $X$'s with one phrase (this is how many $X$'s fit into one window of length $d_{\max}$). Now, after the first few $X$'s, the rest of the text $T$ is encoded with phrases of length $k \lfloor d_{\max} / k \rfloor = \Theta(d_{\max})$. If we omit the first few $X$'s, there are in total at most $z = O(n / d_{\max})$ phrases. The compression ratio is then $O((\log d_{\max}) / d_{\max})$.

If $k > d_{\max}$, we cannot achieve any compression. The compression ratio becomes larger than 1.

(b) Recall that $\gamma(i)$ takes $2 \lfloor \log (i + 1) \rfloor + 1 = O(\log i)$ bits. Since there are no length or distance limits, we encode all $X$'s (except the first one) as one self-referential phrase. The self-referential phrase is of length $n - k$ and requires

\[ 2 \lfloor \log (n - k + 1) \rfloor + 1 + 2 \lfloor \log k \rfloor + 1 + \underbrace{\lceil \log \sigma \rceil}_{= 1} = O(\log n + \log k) = O(\log n) \]

bits. Since the first $X$ can be represented in $O(\sum_{i=1}^{k} \log i) = O(k \log k)$ bits, the total compressed size is $O(k \log k + \log n)$ bits, and the compression ratio is $O((k \log k + \log n) / n)$, or simply $O((\log n) / n)$ if we omit the first $X$. This is potentially a much better ratio than those achieved in part (a).

Remark. It is wasteful to encode the first $X$ using phrases. This is, of course, a naive upper bound, but you cannot achieve a much better upper bound since $X$ is incompressible. An ad hoc solution would be to use $k \lceil \log \sigma \rceil = O(k)$ bits to represent the first $X$ as plain text, plus an additional $O(\log k)$ bits to notify the decoder that this part of the encoding is plain text, including e.g. the length of the plain-text representation as a $\gamma$-code. This would decrease the (asymptotic) space of the first $X$ from $O(k \log k)$ to $O(k + \log k)$, which is negligible if $k = O(\log n)$.

(c) With fixed length codes, each phrase requires $\lceil \log (j + 1) \rceil + \lceil \log \sigma \rceil = O(\log j)$ bits, where $j$ is the size of the effective dictionary, and $\lceil \log \sigma \rceil = 1$ because $\sigma = 2$ for the binary alphabet. We group the phrases by their starting position in $X$. Notice that, for each starting position $i$, the next phrase that starts at position $i$ is one symbol longer. For large enough $n$, we assume that there are about the same number of phrases in each group $i$. Let $z'$ denote the number of phrases in one group, and let $z = k z'$ denote the total number of phrases. Now

\[ n = \sum_{j=1}^{k} \sum_{i=1}^{z'} i = \Theta\left( k (z')^2 \right) = \Theta\left( \frac{z^2}{k} \right), \]

where $(z')^2 = (z / k)^2$.
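From the symmetry of $\Theta(\cdot)$, it follows that the total number of phrases is $z = \Theta(\sqrt{nk})$. The total size of the encoding is then $O(\sum_{j=1}^{z} \log j)$ bits because $O(\log j)$ bits are required to represent the $j$-th phrase. Finally, the compression ratio is

\[ O\left( \frac{\sum_{j=1}^{z} \log j}{n \log \sigma} \right) = O\left( \frac{\sqrt{nk} \log \sqrt{nk}}{n} \right) = O\left( \sqrt{\frac{k}{n}} \log (nk) \right). \]

5. Problem: Encode the following string using LZFG:

how much wood would woodchuck chuck if woodchuck could chuck wood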
Solution:

phrase:    h o w m u h w o o
encoding:  h o w m u h w o o

phrase:    wo u l woo h u
encoding:  3, 5 u l, 4 5, 0, 3, 5

phrase:    hu i f woohu o ul hu woo
encoding:  6, 4 i f 3, 0 o 4, 5 6, 7 4, 6

6. Problem: Let $R$ be the string of terminals and non-terminals resulting from running Re-Pair on text $T$. Let $\alpha$ and $\beta$ be two substrings of $R$. Show that $\exp(\alpha) = \exp(\beta)$ if and only if $\alpha = \beta$, where $\exp(\alpha)$ is the result of repeatedly replacing non-terminals in $\alpha$ with their right-hand side until there are no non-terminals left.

Solution: First, notice that if $\alpha = \beta$, then $\exp(\alpha) = \exp(\beta)$, because the rules generated by Re-Pair ensure that $\exp(\alpha)$ always produces the same output (no branching or cycles).

It remains to show that if $\exp(\alpha) = \exp(\beta)$, then $\alpha = \beta$. Let $T'$ denote a substring that appears more than once in $T$. We assume that $T' = \exp(\alpha) = \exp(\beta)$. Let $ab$ denote a pair of terminal symbols in $T'$. Notice that if Re-Pair creates a new rule $X \to ab$, the pair $ab$ gets replaced by $X$ in all occurrences of $T'$ in $T$. Thus, if we trace the rules from $T'$ upwards, both $\alpha$ and $\beta$ must have the transition $X \to ab$ at some stage of their $\exp$. The same holds for all other pairs of terminals and non-terminals, i.e. $aB$, $Ab$, $AB$, which occur in other stages of $\exp$.

Furthermore, if there exists a run of one symbol, say $aaa$, we have to assume that Re-Pair always replaces the left-most pair first. Otherwise, the rule $A \to aa$ could create both $Aa$ and $aA$ out of the substring $aaa$.

Finally, notice that for any pair that occurs across the start or end boundary of $T'$, the rule $X \to ab$ cannot exist, because then there would be no substrings $\alpha$, $\beta$ in $R$ that expand to the substring $T'$: they would expand into e.g. $aT'$ or $T'b$.
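The role of the left-most replacement rule can be illustrated with a small Python sketch. The helper names `expand` and `replace_pair_leftmost` and the toy rule set are assumptions made for illustration only; this is not a full Re-Pair implementation (it does not select the most frequent pair).

```python
def expand(seq, rules):
    """exp(seq): replace non-terminals by their right-hand sides
    until only terminals remain (hypothetical helper)."""
    out = []
    stack = list(reversed(seq))
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            # Push right first so that left is expanded first.
            stack.append(right)
            stack.append(left)
        else:
            out.append(sym)
    return "".join(out)


def replace_pair_leftmost(seq, pair, symbol):
    """Replace non-overlapping occurrences of pair, scanning left to right."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


# Toy rule A -> aa, as in the run example above.
rules = {"A": ("a", "a")}

# Both Aa and aA expand to aaa, so without a fixed replacement
# order two different substrings could share one expansion.
print(expand(["A", "a"], rules), expand(["a", "A"], rules))

# Left-most replacement always turns the run aaa into Aa.
print(replace_pair_leftmost(list("aaa"), ("a", "a"), "A"))
```

With the left-to-right scan, every occurrence of the run `aaa` is rewritten in the same way, which is the assumption the proof relies on.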