http://dx.doi.org/10.5755/j01.eie.23.4.1877

Switched Quasi-Logarithmic Quantizer with Golomb-Rice Coding

Nikola Vucic 1, Zoran Peric 1, Milan Dincic 1
1 Faculty of Electronic Engineering, University of Nis, Aleksandar Medvedev St. 14, 18000 Nis, Serbia
zoran.peric@elfak.ni.ac.rs

Abstract: This paper proposes a model of a switched quasi-logarithmic quantizer for speech signals, based on the G.711 standard and using Golomb-Rice (GR) coding. To achieve better performance, a switched quantization method is applied. The variance range is split among several quantizers, and each of them is designed separately, i.e. its support region is determined. The support region and the parameter μ are optimized in order to obtain a quantizer that obeys the G.712 standard and gives the minimal average bit rate. Every quantizer within the variance range has its own model with a two-stage coder. The two stages are introduced to reduce the bit rate, with the GR code playing the role of the variable length code (VLC). The first stage uses a GR coder for coding the segments of the quantizer's support region, whereas the second stage applies fixed-length coding to the cells within a segment. GR coding has a simpler and cheaper hardware realization than other VLC codes, Huffman's for instance, with very satisfying results regarding the quality of the quantized signal.

Index Terms: Quantization; speech processing; speech coding; signal to noise ratio; Golomb-Rice coding.

I. INTRODUCTION

The G.711 standard (G.711 quantizer) defines fixed length coding that provides high quality of the reconstructed signal for fixed bit rates [1]. The G.711.0 standard defines the G.711 quantizer with variable length codes (VLC) and assumes the use of one of many coding techniques, where the choice is made based on the input signal's characteristics [2]. This standard does not examine switched quantizers, and therefore our motivation is to design a switched quantizer with a VLC. The Golomb-Rice (GR) coding method [3], [4], like Huffman's, belongs to the group of VLC techniques.
Some VLC scalar quantizers with Huffman's code and similar ones are analysed in [5]–[9], whereby [8] studies VLC for lossless compression. GR coding is simpler than Huffman coding, although the latter is customarily used for smaller code books. We examine here a code book with N = 256 levels, thus the GR code is a better solution for our model. The G.712 standard gives a lower limit for the signal to quantization noise ratio (SQNR) of the transmitted speech signal depending on input variances [10], [11]. Paper [12] proposes a fixed quasi-logarithmic quantizer with GR coding. Switched quantizers are examined in [13] and adaptive quantizers in [14]. Our motivation is to analyse different switched quantizers with the μ-law of compression and GR coding, and to determine the switched quantizer which satisfies the G.712 standard and has the lowest possible average bit rate.

Manuscript received 14 December, 2016; accepted 6 April, 2017. This work was supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia under the project TR 32035.

II. SWITCHED QUASI-LOGARITHMIC QUANTIZER WITH GOLOMB-RICE CODING

A. Switched Quasi-Logarithmic Quantizer

As analysed in [13], the switched quantizer provides better performance than the quantizer designed according to the G.711 standard. Introducing a switched quantizer for speech signal transmission means dividing the observed variance range L, from -20 dB to 20 dB, into $N_g$ equally wide subranges, each served by its own quantizer. The signal's variance $\sigma^2$ is expressed in decibels as

$$\sigma^2\,[\mathrm{dB}] = 10\log_{10}\frac{\sigma^2}{\sigma_{ref}^2},$$

where $\sigma_{ref} = 1$. Each quantizer is labelled with an index i, i = 1, ..., $N_g$. With a known signal variance $\sigma^2$ the matching quantizer is unambiguously determined. For every quantizer within the range L we try to find the parameters that lead to optimal results. Additional information about the quantizer, for a frame of size M, is transmitted to the receiver via $\log_2 N_g$ bits per frame of M samples, where M = 120, 180, 240.
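The selection of the quantizer from the frame variance can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the frame variance is estimated as the mean square of the samples (zero mean assumed), and variances outside the range L are clipped to its ends, a detail the text does not specify.

```python
import math


def variance_db(frame):
    """Frame variance in dB relative to sigma_ref = 1 (zero-mean signal assumed)."""
    power = sum(x * x for x in frame) / len(frame)
    # An all-zero frame has no finite level in dB; report minus infinity.
    return 10 * math.log10(power) if power > 0 else float("-inf")


def quantizer_index(var_db, n_q=32, lo_db=-20.0, hi_db=20.0):
    """1-based index of the quantizer whose variance subrange contains var_db."""
    width = (hi_db - lo_db) / n_q                # equally wide subranges
    clipped = min(max(var_db, lo_db), hi_db)     # assumption: clip to the range L
    return min(int((clipped - lo_db) // width) + 1, n_q)


def side_info_bits_per_sample(n_q=32, frame_size=240):
    """log2(N_g) bits sent once per frame, expressed per sample."""
    return math.log2(n_q) / frame_size
```

With $N_g$ = 32 the subranges are 1.25 dB wide, so a variance of 0.625 dB selects quantizer 17 and a variance of -19.375 dB selects quantizer 1. For M = 240 the side information costs 5/240, roughly 0.021 bit/sample.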
The compression characteristic of the proposed model is a piecewise linear approximation of the μ-law characteristic, where μ + 1 is the ratio between the biggest and the smallest quantization step of the non-linear logarithmic quantizer [10], whereby

$$\mu = 2^{n} - 1, \tag{1}$$

where $n \in \mathbb{N}$. The compression function with the μ-law of compression [10] is

$$c(x) = x_{\max}\,\frac{\ln\left(1 + \mu\,\dfrac{|x|}{x_{\max}}\right)}{\ln(1+\mu)}\,\operatorname{sgn}(x). \tag{2}$$

It is very important to notice that the parameters μ and $N_g$ are
given for the entire range L, whereas the support region is different for every quantizer, and therefore we introduce the notation $x_{\max}^{i}$, i = 1, ..., $N_g$. For brevity the quantizer index is omitted in the expressions below. The support region $[-x_{\max}, x_{\max}]$ of this symmetric quantizer is divided into 2l = 16 segments whose thresholds $x_i$ are defined as

$$x_i = \frac{x_{\max}}{\mu}\left((1+\mu)^{i/l} - 1\right), \tag{3}$$

where i = 0, ..., l. The symmetry of the proposed quantizer allows us to restrict further examination to the segments from the positive part of the support region. Inside a segment the quantization is uniform, and the quantization step size $\Delta_i$ is

$$\Delta_i = \frac{x_{i+1} - x_i}{t} = \frac{x_{\max}}{\mu\,t}\,(1+\mu)^{i/l}\left((1+\mu)^{1/l} - 1\right), \tag{4}$$

where i = 0, ..., l - 1. Step sizes of consecutive segments have the following ratio

$$\frac{\Delta_{i+1}}{\Delta_i} = (1+\mu)^{1/l}. \tag{5}$$

Decision levels $x_{i,j}$, which are equally spaced within the i-th segment, can be calculated as

$$x_{i,j} = x_i + j\,\Delta_i, \tag{6}$$

where i = 0, ..., l - 1, j = 1, ..., t - 1. The borderline cases are defined as well:

$$x_{i,0} = x_i, \qquad x_{i,t} = x_{i+1}. \tag{7}$$

All the samples between $x_{i,j-1}$ and $x_{i,j}$ are represented with the representation level

$$y_{i,j} = x_i + \frac{2j-1}{2}\,\Delta_i, \tag{8}$$

where i = 0, ..., l - 1, j = 1, ..., t.

The parameters that influence performance are: the parameter μ of the compression law [10], the number of switched quantizers $N_g$, and the support regions $x_{\max}^{i}$. Our goal is to determine each $x_{\max}^{i}$, for i = 1, ..., $N_g$, for given values of μ and $N_g$, within the range L. All support regions $x_{\max}^{i}$ influence the quantizer's performance measures, among which the most important for us are the SQNR and the average bit rate R. The design of a quantizer often requires simultaneous fulfilment of two opposing criteria: maximal signal quality (SQNR) and minimal average bit rate. Here we decided to develop a quantizer whose performance fulfils the G.712 standard [11] with as small an R as possible when using VLC coding. Our goal is to minimize the average bit rate, since the difference between the achieved bit rate and the one prescribed by the G.711 standard represents an important saving in signal transfer. The lower limit of SQNR is set by the G.712 standard, and therefore we look for the value of $x_{\max}^{i}$ which satisfies this standard and gives the minimal possible value of R.

B. Two-Stage Coder

Having in mind that the support region is split into 2l = 16 segments and each segment is split into t = 16 cells, we need to find an efficient way to code this information. The G.711 coder shown in Fig. 1 codes as follows: the whole region is symmetrically split into two parts, positive and negative; therefore the sign of the segment is coded with one bit ("0" or "1"). Then each of the l = 8 segments is coded with 3 bits (from "000" to "111") and each of the t = 16 cells is coded with 4 bits (from "0000" to "1111"). Accordingly, the G.711 coder always requires 8 bits, which places it in the group of coders with fixed length code words [1]. Therefore its average bit rate is always equal to 8 bits per sample.

Our goal is to find a solution that requires a lower bit rate than the G.711 quantizer. Therefore we propose the coder depicted in Fig. 2, which is implemented as a two-stage coder. In the first stage we split the whole support region into 2l = 16 segments and code them with code words of variable length. The segments are labelled from left to right on the axis with the numeral i = 1, ..., 2l, where i = 1 marks the leftmost segment, i.e. the segment bounded by $-x_{\max}$, and i = 16 marks the rightmost segment, i.e. the segment next to $x_{\max}$. In this way, with a proper technique, we can achieve better bit usage than the G.711 standard. Here we use a Golomb-Rice coder where values of n from 0 to 15 are coded with code words of various lengths, as seen in Table I. To obtain the most efficient results we must assign shorter codes to the segments with a higher probability of appearance and longer codes to the segments with a lower probability of appearance. Therefore we developed a rule for this assignment, shown in the last column of Table I. This concludes the first stage of the proposed coder. In the second stage we code each of the t = 16 cells naturally with 4 bits (from "0000" to "1111").

Fig. 1. G.711 speech signal coding scheme (blocks: sign +/-, segment coding with values 0–7, cell coding with values 0–15).
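A minimal sketch of the two-stage coder described above, under stated assumptions. All function names are hypothetical. The Golomb-Rice code word follows the construction detailed in Section II-C (q ones, a zero, the remainder on k bits) with k = 1, the segment-to-code rule is read off the last column of Table I, and the thresholds follow (3). Two details that the text leaves open are chosen here purely for illustration: cells are numbered by magnitude, from the inner to the outer edge of a segment, and overloaded samples are mapped to the outermost cell.

```python
import bisect

L_HALF = 8     # l: segments per half of the support region
T_CELLS = 16   # t: cells per segment


def thresholds(x_max, mu=255):
    """Segment thresholds x_0..x_l of the positive half, eq. (3)."""
    return [x_max / mu * ((1 + mu) ** (i / L_HALF) - 1) for i in range(L_HALF + 1)]


def gr_encode(n, k=1):
    """Golomb-Rice code word of n for m = 2**k: q ones, a zero, r on k bits."""
    q, r = divmod(n, 1 << k)
    return "1" * q + "0" + (format(r, f"0{k}b") if k else "")


def segment_to_n(s):
    """Table I rule: segments 9, 8, 10, 7, ... receive n = 0, 1, 2, 3, ..."""
    return 2 * (s - 9) if s >= 9 else 2 * (8 - s) + 1


def encode_sample(x, x_max, mu=255):
    """Stage I: GR code of the segment label; stage II: 4-bit cell index."""
    th = thresholds(x_max, mu)
    mag = min(abs(x), th[-1])                              # assumption: clamp overload
    i = min(bisect.bisect_right(th, mag) - 1, L_HALF - 1)  # segment, 0 = innermost
    step = (th[i + 1] - th[i]) / T_CELLS                   # uniform step inside segment
    j = min(int((mag - th[i]) / step), T_CELLS - 1)        # cell, numbered by magnitude
    s = 9 + i if x >= 0 else 8 - i                         # label 1..16, left to right
    return gr_encode(segment_to_n(s)) + format(j, "04b")
```

For $x_{\max}$ = 160 and μ = 255, the sample x = 0.1 falls into the innermost positive segment (label 9, n = 0) and costs 2 + 4 = 6 bits, whereas an overloaded sample lands in segment 16 (n = 14) and costs 9 + 4 = 13 bits. Since speech samples concentrate around zero, the short code words dominate.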
Fig. 2. Proposed two-stage speech signal coding scheme (stage I: segment coding with variable codeword lengths, values 0–15, Golomb-Rice coder; stage II: cell coding with fixed codeword lengths, values 0–15).

C. The Golomb-Rice Coding Technique

The Golomb-Rice (GR) coding technique uses code words of variable length and was created by combining Golomb's (Solomon W. Golomb) and Rice's (Robert F. Rice) coding techniques [3], [4]. Golomb's coding assumes
that the number n is decomposed with a divisor m into an integer quotient q and a remainder r, so that

$$n = q\,m + r, \qquad n, m, q, r \in \mathbb{N}. \tag{9}$$

Golomb-Rice coding has an additional requirement:

$$m = 2^{k}, \qquad k \in \mathbb{N}_0. \tag{10}$$

The number n is coded as follows: first the quotient q is coded in unary, i.e. we write a run of q ones, then we put a zero to ensure correct decoding, and finally the remainder r is binary coded on k bits. Therefore, depending on the parameter k, the GR coding can be done in various ways. In our research we examined all variants and concluded that the best results for this model are achieved for k = 1. Table I gives an overview of the GR code for k = 1, n = 0, ..., 15, where S represents the code and $l_i$ its length, whereas i marks the coder's segment which is coded with this particular code [3], [4]. For example, n = 5 gives q = 2 and r = 1, hence the code word 1101.

Decoding is done in the following way: we count the ones before the first zero, which acts as a separator (a "decimal point"). The number of counted ones is the quotient q. Then we read the k bits after the zero; they are the binary record of the remainder r. Since k, q and r are known, the number n is calculated with (9) and (10). Starting from the (k+1)-th bit after the separator, we continue counting ones up to the next zero and the decoding process continues [3], [4].

TABLE I. GOLOMB-RICE CODE, k = 1, m = 2; ASSIGNING SEGMENTS i = 1, ..., 2l TO CODES n = 0, ..., 15.

| n | S | l_i | i |
|---|---|---|---|
| 0 | 00 | 2 | 9 |
| 1 | 01 | 2 | 8 |
| 2 | 100 | 3 | 10 |
| 3 | 101 | 3 | 7 |
| 4 | 1100 | 4 | 11 |
| 5 | 1101 | 4 | 6 |
| 6 | 11100 | 5 | 12 |
| 7 | 11101 | 5 | 5 |
| 8 | 111100 | 6 | 13 |
| 9 | 111101 | 6 | 4 |
| 10 | 1111100 | 7 | 14 |
| 11 | 1111101 | 7 | 3 |
| 12 | 11111100 | 8 | 15 |
| 13 | 11111101 | 8 | 2 |
| 14 | 111111100 | 9 | 16 |
| 15 | 111111101 | 9 | 1 |

III. PERFORMANCES AND NUMERICAL RESULTS

We assume that the quantizer's input speech signal can be described with the Laplacian probability density function

$$p(x,\sigma) = \frac{1}{\sqrt{2}\,\sigma}\exp\left(-\frac{\sqrt{2}\,|x|}{\sigma}\right). \tag{11}$$

The measure of the signal's quality is distortion. It consists of the granular distortion $D_g$ and the overload distortion $D_{ov}$, whose sum is the total distortion $D = D_g + D_{ov}$. The signal to quantization noise ratio (SQNR) is defined as

$$\mathrm{SQNR} = 10\log_{10}\frac{\sigma^2}{D}. \tag{12}$$

Granular distortion $D_g$ is the distortion for finite cell widths, i.e. within the support region $[-x_{\max}, x_{\max}]$.
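The overload distortion $D_{ov}$ is measured outside of that region, i.e. on $(-\infty, -x_{\max}) \cup (x_{\max}, \infty)$. Thus, the granular distortion for piecewise linear quantization and high bit rate (asymptotic quantization) is [10]

$$D_g = \frac{1}{12}\int_{-x_{\max}}^{x_{\max}} \Delta^2(x,\sigma)\,p(x,\sigma)\,dx, \tag{13}$$

where Δ(x,σ) represents the cell width defined in (4). We can further derive it as in [10]

$$D_g = \frac{2}{12}\sum_{i=0}^{l-1}\Delta_i^2\,P_i, \tag{14}$$

where $P_i$ is the probability of the i-th segment

$$P_i = \int_{x_i}^{x_{i+1}} p(x,\sigma)\,dx = \frac{1}{2}\left[\exp\left(-\frac{\sqrt{2}\,x_i}{\sigma}\right) - \exp\left(-\frac{\sqrt{2}\,x_{i+1}}{\sigma}\right)\right]. \tag{15}$$

On the other side, the overload distortion is calculated as [10]

$$D_{ov} = 2\int_{x_{\max}}^{\infty}\left(x - y_{l-1,t}\right)^2 p(x,\sigma)\,dx, \tag{16}$$

where $y_{l-1,t}$ is defined according to (8). Therefore, the exact expression is

$$D_{ov} = \exp\left(-\frac{\sqrt{2}\,x_{\max}}{\sigma}\right)\left[\left(x_{\max} - y_{l-1,t}\right)^2 + \sqrt{2}\,\sigma\left(x_{\max} - y_{l-1,t}\right) + \sigma^2\right]. \tag{17}$$

A. Signal to Quantization Noise Ratio

On the range L, where $\sigma_s^2\,[\mathrm{dB}] = 10\log_{10}\sigma_s^2$, we calculate the average SQNR over a large number of points p as follows

$$\mathrm{SQNR}_{av} = \frac{1}{p}\sum_{s=1}^{p}\mathrm{SQNR}(\sigma_s). \tag{18}$$

The values of $\mathrm{SQNR}_{av}$ are shown in Table II. For the fields marked with a slash (/) it is not possible to design a quantizer which respects the G.712 standard with the given parameters $N_g$ and μ.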
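The distortion expressions above can be evaluated numerically for one quantizer and one variance. The sketch below is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the thresholds follow (3), the segment probabilities (15), the granular distortion (14) with the one-sided probabilities counted twice, and the overload distortion the closed form obtained by integrating (16) with the density (11). The second function anticipates the bit-rate expressions (19), (21) and (22) of Section III-B, taking the first-stage code lengths as 2, 3, ..., 9 from the innermost to the outermost segment (Table I, k = 1).

```python
import math


def segment_thresholds(x_max, mu=255, n_seg=8):
    """Thresholds x_0..x_l of the positive half, eq. (3)."""
    return [x_max / mu * ((1 + mu) ** (i / n_seg) - 1) for i in range(n_seg + 1)]


def segment_probabilities(sigma, x):
    """One-sided Laplacian segment probabilities P_i, eq. (15)."""
    a = math.sqrt(2) / sigma
    return [0.5 * (math.exp(-a * x[i]) - math.exp(-a * x[i + 1]))
            for i in range(len(x) - 1)]


def sqnr_db(var_db, x_max, mu=255, n_seg=8, t=16):
    """SQNR in dB, eq. (12), for a variance given in dB (sigma_ref = 1)."""
    sigma = 10 ** (var_db / 20)            # var_db = 10 log10(sigma^2)
    x = segment_thresholds(x_max, mu, n_seg)
    p = segment_probabilities(sigma, x)
    steps = [(x[i + 1] - x[i]) / t for i in range(n_seg)]
    d_g = 2 / 12 * sum(d * d * pi for d, pi in zip(steps, p))   # eq. (14)
    gap = steps[-1] / 2                    # x_max - y_{l-1,t}, from eq. (8)
    d_ov = math.exp(-math.sqrt(2) * x_max / sigma) * (
        gap ** 2 + math.sqrt(2) * sigma * gap + sigma ** 2)     # from eq. (16)
    return 10 * math.log10(sigma ** 2 / (d_g + d_ov))


def bit_rate(var_db, x_max, mu=255, n_seg=8, t=16, n_q=32, frame=240):
    """Bit rate at one variance: first stage + second stage + side information."""
    sigma = 10 ** (var_db / 20)
    p = segment_probabilities(sigma, segment_thresholds(x_max, mu, n_seg))
    lengths = range(2, n_seg + 2)          # assumption: l_i = 2..9, inner to outer
    r_first = 2 * sum(li * pi for li, pi in zip(lengths, p))    # eq. (19)
    return r_first + math.log2(t) + math.log2(n_q) / frame      # eqs. (21), (22)
```

For a variance of 0.625 dB with $x_{\max}$ = 160, μ = 255, $N_g$ = 32 and M = 240, this sketch gives about 34.14 dB and 6.546 bit/sample, in line with the values 34.1437 dB and 6.5458 bit/sample listed in Table IV.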
TABLE II. AVERAGE SIGNAL TO QUANTIZATION NOISE RATIO SQNRav [dB] DEPENDING ON THE NUMBER OF QUANTIZERS Ng AND THE PARAMETER μ.

| SQNRav [dB] | μ=15 | μ=31 | μ=63 | μ=127 | μ=255 |
|---|---|---|---|---|---|
| Ng=1 | / | / | / | 34.1900 | 33.8617 |
| Ng=2 | / | / | 34.354 | 33.8864 | 33.5398 |
| Ng=4 | 34.4067 | 34.0171 | 33.6413 | 33.344 | 33.0904 |
| Ng=8 | 33.4359 | 33.147 | 32.9884 | 3.817 | 32.6519 |
| Ng=16 | 32.8550 | 32.6967 | 32.5699 | 32.4753 | 32.3789 |
| Ng=32 | 32.5454 | 32.4156 | 3.348 | 32.2838 | 32.2305 |

B. Bit Rate

Since the SQNR depends on the signal variance and not on the code word lengths, the crucial parameter for the choice of code is the average bit rate. With the described two-stage coding method we try to obtain a lower average bit rate than the G.711 coder, which has a constant value of 8 bit/sample. The first stage's average bit rate for a particular standard deviation σ is equal to

$$R_I(\sigma) = 2\sum_{i=1}^{l} l_i\,P_{i-1}, \tag{19}$$

where $l_i$ is the code word length of the i-th segment counted outward from the centre, as defined in Table I, and $P_{i-1}$ is the probability of that segment, defined in (15) with the index starting from 0. We calculate the average bit rate of the first stage on the variance range L over a large number of points p as

$$R_I = \frac{1}{p}\sum_{s=1}^{p} R_I(\sigma_s). \tag{20}$$

The bit rate of the second stage is constant

$$R_{II} = \log_2 t. \tag{21}$$

The total average bit rate of the proposed quasi-logarithmic quantizer, with the additional information about the quantizer, is

$$R = R_I + R_{II} + \frac{\log_2 N_g}{M}. \tag{22}$$

Values of R are shown in Table III, where the frame size is M = 240. The values lower than 8 bit/sample (under the double line in the table) belong to the area of our special interest in this research.

TABLE III. AVERAGE BIT RATE R [bit/sample] DEPENDING ON THE NUMBER OF QUANTIZERS Ng AND THE PARAMETER μ.

| R [bit/sample] | μ=15 | μ=31 | μ=63 | μ=127 | μ=255 |
|---|---|---|---|---|---|
| Ng=1 | / | / | / | 11.0585 | 9.76 |
| Ng=2 | / | / | 11.8040 | 9.8311 | 8.5547 |
| Ng=4 | 13.640 | 10.5641 | 8.9969 | 8.013 | 7.3480 |
| Ng=8 | 10.5745 | 8.9155 | 7.9433 | 7.3106 | 6.8717 |
| Ng=16 | 9.5135 | 8.445 | 7.514 | 7.008 | 6.6739 |
| Ng=32 | 9.0534 | 7.979 | 7.306 | 6.893 | 6.5869 |

C. Numerical Results

As can be seen in Table III, the minimal R is obtained for $N_g$ = 32 and μ = 255. These parameters are the best choice for our model and will be used to compute the performances. In this case a saving of bit rate equal to 1.4331 bit/sample is made, whereby the $\mathrm{SQNR}_{av}$ is 32.2305 dB. SQNR(σ) is calculated on the entire variance range L and shown in Fig. 3. Further, we can get the support regions as the set of values $x_{\max}[i]$ = [103 102 100 98 98 97 95 92 88 89 96 103 110 117 124 139 160 185 214 247 285 329 380 439 507 586 677 782 903 1043 1204 1390], for i = 1, ..., 32. Segments are coded as described in Table I, whereby the code lengths are l[i] = [2 3 4 5 6 7 8 9], i = 1, ..., 8. For the variance values $\sigma_1^2$ [dB] = 0.625 dB and $\sigma_2^2$ [dB] = -19.375 dB the most important quantizer parameters are given in Table IV.

Fig. 3. SQNR(σ) on the variance range L.

TABLE IV. NUMERICAL RESULTS FOR PARTICULAR VALUES OF SIGNAL VARIANCE σ² [dB].

| Segment | x (σ1² = 0.625 dB, i = 17) | Δ | P | x (σ2² = -19.375 dB, i = 1) | Δ | P |
|---|---|---|---|---|---|---|
| 1 | 0.6275 | 0.0392 | 0.2810 | 0.4039 | 0.0252 | 0.4975 |
| 2 | 1.8824 | 0.0784 | 0.1770 | 1.2118 | 0.0505 | 0.0025 |
| 3 | 4.3922 | 0.1569 | 0.0404 | 2.8275 | 0.1010 | 0 |
| 4 | 9.4118 | 0.3137 | 0.0015 | 6.0588 | 0.2020 | 0 |
| 5 | 19.4510 | 0.6275 | 0 | 12.5216 | 0.4039 | 0 |
| 6 | 39.5294 | 1.2549 | 0 | 25.4471 | 0.8078 | 0 |
| 7 | 79.6863 | 2.5098 | 0 | 51.2980 | 1.6157 | 0 |
| 8 | 160 | 5.0196 | 0 | 103 | 3.2314 | 0 |
| | SQNRav = 34.1437 dB | R = 6.5458 bit/sample | | SQNRav = 23.3097 dB | R = 6.0257 bit/sample | |

In comparison with the non-adaptive fixed quantizer with GR code described in [12], where the quantizer fulfilling the G.712 standard for μ = 255 has an average bit rate of 7.9195 bit/sample, this model has a bit rate saving of around 1.3 bit/sample. Compared with an adaptive quantizer [14], where the bit rate is $R = \log_2 N + \frac{\log_2 N_g}{M}$, our model's average bit rate (22) is significantly lower.

IV. CONCLUSIONS

We calculated the performance of switched quantizers that satisfy the G.712 standard for different values of μ: 15, 31, 63, 127, 255. Contrary to the expectation that better performance would be obtained for the lowest value of the parameter μ, the lowest average bit rate is actually reached with the switched quantizer for the maximal value μ = 255. Compared to the non-adaptive
logarithmic G.711 quantizer, the proposed quantizer achieves an average bit rate decrease of 1.43 bit/sample. The recommended VLC, the Golomb-Rice code, is simpler than Huffman's code, which is another advantage of the proposed solution. We look forward to seeing this quantization method applied to the coding and compression of other signal types.

REFERENCES

[1] Recommendation G.711, "Pulse code modulation of voice frequencies", ITU-T, 1972. [Online]. Available: http://www.itu.int/rec/t-rec-g.711
[2] Recommendation G.711.0, "Lossless compression of G.711 pulse code modulation", ITU-T, 2009. [Online]. Available: http://www.itu.int/rec/t-rec-g.711.0
[3] S. W. Golomb, "Run-length encodings", IEEE Transactions on Information Theory, IT-12(3), pp. 399–401, 1966. [Online]. Available: http://dx.doi.org/10.1109/tit.1966.1053907
[4] D. Salomon, A concise introduction to data compression, Springer-Verlag, London, 2008, pp. 36–38. [Online]. Available: https://doi.org/10.1007/978-1-84800-072-8
[5] M. Dinčić, Z. Perić, D. Denić, "Uniform polar quantizer with three-stage hierarchical variable-length coding for measurement signals with Gaussian distribution", Measurement, vol. 88, pp. 14, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0263224116300367
[6] Z. Perić, J. Nikolić, J. Lukić, D. Denić, "Two-stage quantizer with Huffman coding based on G.711 standard", Przeglad Elektrotechniczny (Electrical Review), ISSN 0033-2097, vol. 88, no. 09a, pp. 300–302, 2012. [Online]. Available: http://pe.org.pl/articles/2012/9a/65.pdf
[7] Z. Perić, M. Dinčić, D. Denić, A. Jocić, "Forward adaptive logarithmic quantizer with new lossless coding method for Laplacian source", Wireless Personal Communications, vol. 59, no. 4, pp. 625–641, 2011. [Online]. Available: http://dx.doi.org/10.1007/s11277-010-999-3
[8] O. Ordentlich, U. Erez, "Performance analysis and optimal filter design for sigma-delta modulation via duality with DPCM", in Proc. IEEE Int. Symp. on Information Theory, Hong Kong, 2015, pp. 31–35. [Online]. Available: http://dx.doi.org/10.1109/isit.2015.78469
[9] N. Vučić, Z. Perić, N.
Simić, "Two-stage quantizer with Golomb-Rice coding", 3rd International Conference TAKTONS, Novi Sad, Serbia, 2015, pp. 34–35.
[10] N. S. Jayant, P. Noll, Digital coding of waveforms, principles and applications to speech and video, Prentice Hall, New Jersey, 1984, pp. 115–251.
[11] Recommendation G.712, "Transmission performance characteristics of pulse code modulation", ITU-T, 1992. [Online]. Available: http://www.itu.int/rec/t-rec-g.712
[12] N. Vučić, Z. Perić, "Quasi-logarithmic quantizer with Golomb-Rice coding", XIII International Conference Systems, Automatic Control and Measurements - SAUM, Niš, Serbia, 2016, pp. 14–17.
[13] Z. Peric, J. Nikolic, "An adaptive waveform coding algorithm and its application in speech coding", Digital Signal Processing, vol. 22, no. 1, pp. 199–209, 2012. [Online]. Available: http://dx.doi.org/10.1016/j.dsp.2011.09.001
[14] L. Velimirović, S. Marić, "New Adaptive Compandor for LTE Signal Compression Based on Spline Approximations", ETRI Journal, vol. 38, no. 3, pp. 463–468, 2016. [Online]. Available: http://dx.doi.org/10.4218/etrij.16.0115.0506