
Transform Coding
- Another concept for partially exploiting the memory gain of vector quantization
- Used in virtually all lossy image and video coding applications
- Samples of source $s$ are grouped into vectors $s$ of adjacent samples
- Transform coding consists of the following steps
  1. Linear analysis transform $A$, converting source vectors $s$ into transform coefficient vectors $u = A s$
  2. Scalar quantization of the transform coefficients, $u \mapsto u'$
  3. Linear synthesis transform $B$, converting quantized transform coefficient vectors $u'$ into decoded source vectors $s' = B u'$
- 2-D transform: rotation by $\varphi = 45^\circ$
  $$A = \begin{bmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$
[Figure: scatter plot of adjacent samples $(S_0, S_1)$ and scatter plot of the transform coefficients $(U_0, U_1)$]
January 3, 13
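The three steps listed above can be sketched for the 2-D rotation example. This is a minimal illustration, not the lecture's implementation: the function names, the uniform rounding quantizer, and the step size 0.5 are assumptions chosen for the example.

```python
import math

def analysis(s, phi=math.pi / 4):
    """Step 1: u = A s with the rotation matrix A = [[cos, sin], [-sin, cos]]."""
    c, sn = math.cos(phi), math.sin(phi)
    return [c * s[0] + sn * s[1], -sn * s[0] + c * s[1]]

def quantize(u, step):
    """Step 2: uniform scalar quantization of each coefficient (assumed quantizer)."""
    return [d * round(x / d) for x, d in zip(u, step)]

def synthesis(u, phi=math.pi / 4):
    """Step 3: s' = B u' with B = A^T, i.e. the inverse rotation."""
    c, sn = math.cos(phi), math.sin(phi)
    return [c * u[0] - sn * u[1], sn * u[0] + c * u[1]]

s = [4.0, 3.0]                        # one vector of two adjacent samples
u = analysis(s)                       # transform coefficients
u_q = quantize(u, step=[0.5, 0.5])    # quantized coefficients u'
s_rec = synthesis(u_q)                # decoded samples s'
print(u, u_q, s_rec)
```

Without the quantization step, `synthesis(analysis(s))` returns the input up to floating-point rounding.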

Overview
- Structure of Transform Coding Systems
- Motivation of Transform Coding
- Orthogonal Block Transforms
- Bit Allocation for Transform Coefficients
- Karhunen Loève Transform (KLT)
- Signal-Independent Transforms
  - Walsh-Hadamard Transform
  - Discrete Fourier Transform (DFT)
  - Discrete Cosine Transform (DCT)
- 2-d Transforms in Image and Video Coding
- Entropy coding of transform coefficients
- Distribution of transform coefficients for Images
- Modified Discrete Cosine Transform (MDCT)
- Summary

Structure of Transform Coding Systems
[Figure: block diagram. $s$ → analysis transform $A$ → coefficients $u_0, \dots, u_{N-1}$ → quantizers $Q_0, \dots, Q_{N-1}$ → quantized coefficients $u'_0, \dots, u'_{N-1}$ → synthesis transform $B$ → $s'$]
- Synthesis transform is typically the inverse of the analysis transform
- Separate scalar quantizer $Q_n$ for each transform coefficient $u_n$
- Vector quantization of all bands or some of them is also possible, but
  - Transforms are designed to have a decorrelating effect $\Rightarrow$ memory gain of VQ is reduced
  - Shape gain can be obtained by ECSQ
  - Space-filling gain is left as a possible additional gain for VQ
- Combination of decorrelating transformation, scalar quantization and entropy coding is highly efficient in terms of both rate-distortion performance and complexity

Motivation of Transform Coding
- Exploitation of statistical dependencies
  - Transforms are typically designed in a way that, for typical input signals, the signal energy is concentrated in a few transform coefficients
  - Coding of a few coefficients and many zero-valued coefficients can be very efficient (e.g., using arithmetic coding, run-length coding)
  - Scalar quantization is more effective in the transform domain
- Efficient trade-off between coding efficiency & complexity
  - Vector quantization: searching through codebook for best matching vector
  - Combination of transform and scalar quantization typically results in a substantial reduction in computational complexity
- Suitable for quantization using perceptual criteria
  - In image & video coding, quantization in the transform domain typically leads to an improvement in subjective quality
  - In speech & audio coding, frequency bands might be used to simulate processing of the human ear
  - Reduce perceptually irrelevant content

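Transform Encoder and Decoder
[Figure: encoder. $s$ → analysis transform $A$ → coefficients $u_0, \dots, u_{N-1}$ → encoder mappings $\alpha_0, \dots, \alpha_{N-1}$ → indexes $i_0, \dots, i_{N-1}$ → entropy coder $\gamma$ → bitstream $b$]
[Figure: decoder. $b$ → entropy decoder $\gamma^{-1}$ → indexes $i_0, \dots, i_{N-1}$ → decoder mappings $\beta_0, \dots, \beta_{N-1}$ → $u'_0, \dots, u'_{N-1}$ → synthesis transform $B$ → $s'$]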

Linear Block Transforms
Linear Block Transform
- Each component of the $N$-dimensional output vector represents a linear combination of the $N$ components of the $N$-dimensional input vector
- Can be written as matrix multiplication
  - Analysis transform
    $$u = A\,s \tag{1}$$
  - Synthesis transform
    $$s' = B\,u' \tag{2}$$
- Vector interpretation: $s'$ is represented as a linear combination of column vectors of $B$
  $$s' = \sum_{n=0}^{N-1} u'_n\, b_n = u'_0\, b_0 + u'_1\, b_1 + \dots + u'_{N-1}\, b_{N-1} \tag{3}$$

Linear Block Transforms (cont'd)
Perfect Reconstruction Property
- Consider the case that no quantization is applied ($u' = u$)
- Optimal synthesis transform:
  $$B = A^{-1} \tag{4}$$
- Reconstructed samples are equal to source samples
  $$s' = B\,u = B\,A\,s = A^{-1} A\,s = s \tag{5}$$
Optimal Synthesis Transform (in presence of quantization)
- Optimality: minimum MSE distortion among all synthesis transforms
- $B = A^{-1}$ is optimal if
  - $A$ is invertible and produces independent transform coefficients
  - the component quantizers are centroidal quantizers
- If the above conditions are not fulfilled, a synthesis transform $B \neq A^{-1}$ may reduce the distortion

Orthogonal Block Transforms
Orthonormal Basis
- An analysis transform $A$ forms an orthonormal basis if
  - basis vectors (matrix rows) are orthogonal to each other
  - basis vectors have length 1
- The corresponding transform is called an orthogonal transform
- The transform matrices are called unitary matrices
- Unitary matrices with real entries are called orthogonal matrices
- Inverse of unitary matrices: conjugate transpose
  $$A^{-1} = A^{\dagger} \quad (\text{for orthogonal matrices: } A^{-1} = A^{T}) \tag{6}$$
Why are orthogonal transforms desirable?
- MSE distortion can be minimized by independent scalar quantization of the transform coefficients
- Orthogonality of the basis vectors is sufficient: vector norms can be taken into account in the quantizer design

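Properties of Orthogonal Block Transforms
- Transform coding with orthogonal transform and perfect reconstruction $B = A^{-1} = A^{\dagger}$ preserves MSE distortion
  $$\begin{aligned}
  d_N(s, s') &= \tfrac{1}{N}\,(s - s')^{\dagger}(s - s') \\
  &= \tfrac{1}{N}\,(A^{-1}u - B\,u')^{\dagger}(A^{-1}u - B\,u') \\
  &= \tfrac{1}{N}\,(A^{\dagger}u - A^{\dagger}u')^{\dagger}(A^{\dagger}u - A^{\dagger}u') \\
  &= \tfrac{1}{N}\,(u - u')^{\dagger} A\,A^{-1} (u - u') \\
  &= \tfrac{1}{N}\,(u - u')^{\dagger}(u - u') = d_N(u, u')
  \end{aligned} \tag{7}$$
- Scalar quantization that minimizes MSE in the transform domain also minimizes MSE in the original signal space
- For the special case of orthogonal matrices: $(\cdot)^{\dagger} = (\cdot)^{T}$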
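The distortion-preservation property can be checked numerically. The sketch below assumes the orthonormal 2x2 matrix $\frac{1}{\sqrt{2}}[[1, 1], [1, -1]]$ and a crude rounding quantizer with step size 1; both choices and all helper names are illustrative only.

```python
import math

R2 = 1 / math.sqrt(2)
A = [[R2, R2],
     [R2, -R2]]                      # orthonormal analysis matrix (assumed example)

def matvec(m, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mse(x, y):
    """Mean squared error between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

s = [4.0, 3.0]
u = matvec(A, s)
u_q = [float(round(c)) for c in u]   # rounding quantizer, step size 1 (assumption)
s_rec = matvec(transpose(A), u_q)    # synthesis with B = A^T

d_signal = mse(s, s_rec)             # distortion in signal space
d_transform = mse(u, u_q)            # distortion in transform domain
print(d_signal, d_transform)         # the two values agree up to rounding
```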

Properties of Orthogonal Block Transforms (cont'd)
- Covariance matrix of transform coefficients
  $$C_{UU} = E\{(U - E\{U\})(U - E\{U\})^T\} = A\,E\{(S - E\{S\})(S - E\{S\})^T\}\,A^T = A\,C_{SS}\,A^{-1} \tag{8}$$
- Since the trace of a matrix is similarity-invariant,
  $$\operatorname{tr}(X) = \operatorname{tr}(P\,X\,P^{-1}), \tag{9}$$
  and the trace of an autocovariance matrix is the sum of the variances of the vector components, we have
  $$\frac{1}{N} \sum_{i=0}^{N-1} \sigma_i^2 = \sigma_S^2. \tag{10}$$
- The arithmetic mean of the variances of the transform coefficients is equal to the variance of the source

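Geometrical Interpretation of Orthogonal Transforms
- Inverse 2-d transform matrix (= transpose of forward transform matrix)
  $$B = \begin{bmatrix} b_0 & b_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = A^T$$
- Vector interpretation for 2-d example
  $$s = u_0\, b_0 + u_1\, b_1$$
  $$\begin{bmatrix} s_0 \\ s_1 \end{bmatrix} = u_0 \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + u_1 \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$
  $$\begin{bmatrix} 4 \\ 3 \end{bmatrix} = 3.5 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 0.5 \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$
  yielding transform coefficients $u_0 = 3.5\sqrt{2}$ and $u_1 = 0.5\sqrt{2}$
- An orthogonal transform is a rotation from the signal coordinate system into the coordinate system of the basis functions
[Figure: vector $s$ in the $(s_0, s_1)$ plane, decomposed into $u_0 b_0$ and $u_1 b_1$ along the basis vectors $b_0$ and $b_1$]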

Transform Example N = 2
- Adjacent samples of Gauss-Markov source with different correlation factors $\rho$
[Figure: scatter plots of $(S_0, S_1)$ for $\rho = 0$, $\rho = 0.5$, $\rho = 0.9$, $\rho = 0.95$]
- Transform coefficients for orthonormal 2D transform
[Figure: scatter plots of $(U_0, U_1)$ for $\rho = 0$, $\rho = 0.5$, $\rho = 0.9$, $\rho = 0.95$]

Example for Waveforms (Gauss-Markov Source with $\rho = 0.95$)
- Top: signal $s[k]$
- Middle: transform coefficient $u_0[k/2]$, also called dc coefficient
- Bottom: transform coefficient $u_1[k/2]$, also called ac coefficient
- Number of transform coefficients $u_0$ is half the number of samples $s$
- Number of transform coefficients $u_1$ is half the number of samples $s$
[Figure: three waveform plots, $s[k]$ over $k$, $u_0[k/2]$ over $k/2$, and $u_1[k/2]$ over $k/2$]

Scalar Quantization in Transform Domain
Consider Transform Coding with Orthogonal Transforms
[Figure: three panels. Quantization cells for direct coding; quantization cells for transform coding in the transform domain; quantization cells for transform coding in signal space]
- Quantization cells are
  - hyper-rectangles as in scalar quantization
  - but rotated and aligned with the transform basis vectors
- Number of quantization cells with appreciable probabilities is reduced $\Rightarrow$ indicates improved coding efficiency for correlated sources

Bit Allocation for Transform Coefficients
- Problem: distribute bit rate $R$ among the $N$ transform coefficients such that the resulting distortion $D$ is minimized
  $$\min\; D(R) = \frac{1}{N} \sum_{i=1}^{N} D_i(R_i) \quad \text{subject to} \quad \frac{1}{N} \sum_{i=1}^{N} R_i \le R \tag{11}$$
  with $D_i(R_i)$ being the operational distortion-rate functions of the scalar quantizers
- Approach: minimize the Lagrangian cost function $J = D + \lambda R$
  $$\frac{\partial}{\partial R_i} \left( \sum_{i=1}^{N} D_i(R_i) + \lambda \sum_{i=1}^{N} R_i \right) = \frac{\partial D_i(R_i)}{\partial R_i} + \lambda \overset{!}{=} 0 \tag{12}$$
- Solution: Pareto condition
  $$\frac{\partial D_i(R_i)}{\partial R_i} = -\lambda = \text{const} \tag{13}$$
- Move bits from coefficients with small distortion reduction per bit to coefficients with larger distortion reduction per bit
- Similar to bit allocation problem in discrete sets: $\min\; D_i + \lambda R_i$

Bit Allocation for Transform Coefficients (cont'd)
- Operational distortion-rate function of scalar quantizers can be written as
  $$D_i(R_i) = \sigma_i^2\, g_i(R_i) \tag{14}$$
- Justified to assume that $g_i(R_i)$ is a continuous strictly convex function
- $g_i(R_i)$ has a continuous strictly increasing derivative $g_i'(R_i)$ with $g_i'(\infty) = 0$
- Pareto condition becomes
  $$-\sigma_i^2\, g_i'(R_i) = \lambda \tag{15}$$
- If $\lambda \ge -\sigma_i^2\, g_i'(0)$, the quantizer for $u_i$ cannot be operated at the given slope $\Rightarrow$ set the corresponding component rate to $R_i = 0$
- Bit allocation rule
  $$R_i = \begin{cases} 0 & : \; -\sigma_i^2\, g_i'(0) \le \lambda \\ \eta_i\!\left(-\dfrac{\lambda}{\sigma_i^2}\right) & : \; -\sigma_i^2\, g_i'(0) > \lambda \end{cases} \tag{16}$$
  where $\eta_i(\cdot)$ denotes the inverse of the derivative $g_i'(\cdot)$

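Approximation for Gaussian Sources
- Transform coefficients also have a Gaussian distribution
- Experimentally found approximation for entropy-constrained scalar quantization for Gaussian sources ($a \approx 0.95$)
  $$g(R) = \frac{\pi e}{6a}\, \ln\!\left(a \cdot 2^{-2R} + 1\right) \tag{17}$$
- Use parameter
  $$\theta = \lambda\, \frac{3\,(a + 1)}{\pi e \ln 2} \quad \text{with} \quad 0 \le \theta \le \sigma_{\max}^2 \tag{18}$$
- Bit allocation rule
  $$R_i(\theta) = \begin{cases} 0 & : \; \theta \ge \sigma_i^2 \\ \dfrac{1}{2} \log_2\!\left( \dfrac{\sigma_i^2}{\theta}\,(a + 1) - a \right) & : \; \theta < \sigma_i^2 \end{cases} \tag{19}$$
- Resulting component distortions
  $$D_i(\theta) = \begin{cases} \sigma_i^2 & : \; \theta \ge \sigma_i^2 \\ -\dfrac{\varepsilon^2 \ln 2}{a}\, \sigma_i^2\, \log_2\!\left( 1 - \dfrac{\theta}{\sigma_i^2}\, \dfrac{a}{a + 1} \right) & : \; \theta < \sigma_i^2 \end{cases} \tag{20}$$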

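High-Rate Approximation
- Assumption: high-rate approximation valid for all component quantizers
- High-rate approximation for distortion-rate function of component quantizers
  $$D_i(R_i) = \varepsilon_i^2\, \sigma_i^2\, 2^{-2R_i} \tag{21}$$
  where $\varepsilon_i^2$ depends on transform coefficient distribution and quantizer
- Pareto condition
  $$\frac{\partial}{\partial R_i} D_i(R_i) = -2 \ln 2\; \varepsilon_i^2\, \sigma_i^2\, 2^{-2R_i} = -2 \ln 2\; D_i(R_i) = -\lambda = \text{const} \tag{22}$$
  states that all quantizers are operated at the same distortion
- Bit allocation rule
  $$R_i(D) = \frac{1}{2} \log_2\!\left( \frac{\varepsilon_i^2\, \sigma_i^2}{D} \right) \tag{23}$$
- Overall operational rate-distortion function
  $$R(D) = \frac{1}{N} \sum_{i=0}^{N-1} R_i(D) = \frac{1}{2N} \sum_{i=0}^{N-1} \log_2\!\left( \frac{\sigma_i^2\, \varepsilon_i^2}{D} \right) \tag{24}$$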

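High-Rate Approximation (cont'd)
- Overall operational rate-distortion function
  $$R(D) = \frac{1}{2N} \sum_{i=0}^{N-1} \log_2\!\left( \frac{\sigma_i^2\, \varepsilon_i^2}{D} \right) = \frac{1}{2} \log_2\!\left( \frac{\tilde{\varepsilon}^2\, \tilde{\sigma}^2}{D} \right) \tag{25}$$
  with geometric means
  $$\tilde{\sigma}^2 = \left( \prod_{i=0}^{N-1} \sigma_i^2 \right)^{\frac{1}{N}} \quad \text{and} \quad \tilde{\varepsilon}^2 = \left( \prod_{i=0}^{N-1} \varepsilon_i^2 \right)^{\frac{1}{N}} \tag{26}$$
- Overall distortion-rate function
  $$D(R) = \tilde{\varepsilon}^2\, \tilde{\sigma}^2\, 2^{-2R} \tag{27}$$
- For Gaussian sources (transform coefficients are also Gaussian) and entropy-constrained scalar quantizers, we have $\varepsilon_i^2 = \varepsilon^2 = \frac{\pi e}{6}$, yielding
  $$D_G(R) = \frac{\pi e}{6}\, \tilde{\sigma}^2\, 2^{-2R} \tag{28}$$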
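Combining the high-rate bit allocation rule with the overall distortion-rate function gives $R_i = R + \frac{1}{2}\log_2(\sigma_i^2 / \tilde{\sigma}^2)$ when all $\varepsilon_i^2$ are equal. The sketch below evaluates this under that assumption. The function name and the example variances (1.9 and 0.1) are chosen for illustration, and the rule is not clipped at zero, so it is only meaningful when every resulting rate is non-negative.

```python
import math

def high_rate_allocation(variances, avg_rate):
    """Component rates R_i = R + 0.5*log2(var_i / geometric mean of variances).

    Assumes identical eps_i^2 for all components (e.g. Gaussian source, ECSQ).
    """
    n = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / n   # log2 of geometric mean
    return [avg_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]

variances = [1.9, 0.1]            # hypothetical coefficient variances
rates = high_rate_allocation(variances, avg_rate=3.0)

eps2 = math.pi * math.e / 6       # high-rate factor for Gaussian ECSQ
distortions = [eps2 * v * 2 ** (-2 * r) for v, r in zip(variances, rates)]
print(rates, distortions)         # all component distortions coincide
```

The coefficient with the larger variance receives more bits, the average rate stays at the target, and every component ends up at the same distortion, which is what the Pareto condition states.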

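Transform Coding Gain at High Rates
- Transform coding gain is the ratio of the distortion for scalar quantization and the distortion for transform coding
  $$G_T = \frac{\varepsilon_S^2\, \sigma_S^2\, 2^{-2R}}{\tilde{\varepsilon}^2\, \tilde{\sigma}^2\, 2^{-2R}} = \frac{\varepsilon_S^2\, \sigma_S^2}{\tilde{\varepsilon}^2\, \tilde{\sigma}^2} \tag{29}$$
  with
  - $\sigma_S^2$: variance of the input signal
  - $\varepsilon_S^2$: factor of high-rate approximation for direct scalar quantization
- High-rate transform coding gain for Gaussian sources
  $$G_T = \frac{\sigma_S^2}{\tilde{\sigma}^2} = \frac{\frac{1}{N} \sum_{i=0}^{N-1} \sigma_i^2}{\sqrt[N]{\prod_{i=0}^{N-1} \sigma_i^2}} \tag{30}$$
- Ratio of arithmetic and geometric mean of the transform coefficient variances
- The high-rate transform coding gain for Gaussian sources is maximized if the geometric mean is minimized ($\Rightarrow$ Karhunen Loève Transform)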
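The ratio of arithmetic to geometric mean is easy to evaluate. The following sketch uses a hypothetical helper `coding_gain` and, as an example, the variances $1 + \rho$ and $1 - \rho$ of a two-point orthonormal transform of a unit-variance source with correlation $\rho = 0.9$ (an assumed input, for which the ratio equals $1/\sqrt{1 - \rho^2}$).

```python
import math

def coding_gain(variances):
    """High-rate transform coding gain: arithmetic mean / geometric mean."""
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric

rho = 0.9
gain = coding_gain([1 + rho, 1 - rho])   # assumed example variances
gain_db = 10 * math.log10(gain)
print(gain, gain_db)
```

Equal variances give a gain of exactly 1 (0 dB), so a transform only helps when it makes the coefficient variances unequal.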

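Example: Orthogonal Transformation with N = 2
- Input vector and transform matrix
  $$s = \begin{bmatrix} s_0 \\ s_1 \end{bmatrix} \quad \text{and} \quad A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{31}$$
- Transformation
  $$u = \begin{bmatrix} u_0 \\ u_1 \end{bmatrix} = A\,s = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} s_0 \\ s_1 \end{bmatrix} \tag{32}$$
- Coefficients
  $$u_0 = \frac{1}{\sqrt{2}}\,(s_0 + s_1), \qquad u_1 = \frac{1}{\sqrt{2}}\,(s_0 - s_1) \tag{33}$$
- Inverse transformation
  $$A^{-1} = A^T = A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{34}$$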

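Example: Orthogonal Transformation with N = 2 (cont'd)
- Variance of transform coefficients
  $$\sigma_0^2 = E\{U_0^2\} = E\left\{ \tfrac{1}{2}\,(S_0 + S_1)^2 \right\} = \tfrac{1}{2} \left( E\{S_0^2\} + E\{S_1^2\} + 2\,E\{S_0 S_1\} \right) = \tfrac{1}{2} \left( \sigma_S^2 + \sigma_S^2 + 2\,\sigma_S^2 \rho \right) = \sigma_S^2\,(1 + \rho) \tag{35}$$
  $$\sigma_1^2 = E\{U_1^2\} = \sigma_S^2\,(1 - \rho) \tag{36}$$
- Cross-correlation of transform coefficients
  $$E\{U_0 U_1\} = \tfrac{1}{2}\, E\{(S_0 + S_1)(S_0 - S_1)\} = \tfrac{1}{2}\, E\{S_0^2 - S_1^2\} = \tfrac{1}{2} \left( \sigma_S^2 - \sigma_S^2 \right) = 0 \tag{37}$$
- Transform coding gain for Gaussian sources (assuming optimal bit allocation)
  $$G_T = \frac{\sigma_S^2}{\sqrt{\sigma_0^2\, \sigma_1^2}} = \frac{1}{\sqrt{1 - \rho^2}} \tag{38}$$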

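Example: Analysis of Transform Coding for N = 2
- R-d cost before transform
  $$J^{(0)} = 2\,(D + \lambda R) \quad \text{(for 2 samples)}$$
- R-d cost after transform
  $$J^{(1)} = (D_0 + D_1) + \lambda\,(R_0 + R_1) \quad \text{(for both transform coefficients)}$$
- Gain in r-d cost due to transform at same rate ($R_0 + R_1 = 2R$)
  $$\Delta J = J^{(0)} - J^{(1)} = 2D - D_0 - D_1 \tag{39}$$
- For Gaussian sources, input and output of transform have Gaussian pdf
- With the operational distortion-rate function for an entropy-constrained scalar quantizer at high rates ($D = \varepsilon^2 \sigma^2\, 2^{-2R}$ with $\varepsilon^2 = \pi e / 6$), we have
  $$\Delta J = \varepsilon^2 \sigma_S^2 \left( 2^{-2R + 1} - (1 + \rho)\, 2^{-2R_0} - (1 - \rho)\, 2^{-2R_1} \right) \tag{40}$$
- By eliminating $R_1$ using $R_1 = 2R - R_0$, we get
  $$\Delta J = \varepsilon^2 \sigma_S^2 \left( 2^{-2R + 1} - (1 + \rho)\, 2^{-2R_0} - (1 - \rho)\, 2^{-2(2R - R_0)} \right) \tag{41}$$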

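Example: Analysis of Transform Coding for N = 2 (cont'd)
- Gain in rate-distortion cost due to transform
  $$\Delta J = \varepsilon^2 \sigma_S^2 \left( 2^{-2R + 1} - (1 + \rho)\, 2^{-2R_0} - (1 - \rho)\, 2^{-2(2R - R_0)} \right) \tag{42}$$
- To maximize the gain, we set
  $$\frac{\partial}{\partial R_0} \Delta J = \varepsilon^2 \sigma_S^2 \left( 2 \ln 2\, (1 + \rho)\, 2^{-2R_0} - 2 \ln 2\, (1 - \rho)\, 2^{-4R + 2R_0} \right) \overset{!}{=} 0 \tag{43}$$
  yielding the bit allocation rule
  $$R_0 = R + \frac{1}{2} \log_2 \sqrt{\frac{1 + \rho}{1 - \rho}} \tag{44}$$
- The same expression is obtained by using the previously derived high-rate bit allocation rule
  $$R_i = \frac{1}{2} \log_2\!\left( \frac{\varepsilon^2\, \sigma_i^2}{D} \right) \tag{45}$$
- Operational high-rate distortion-rate function (Gaussian, ECSQ, N = 2)
  $$D(R) = \frac{\pi e}{6}\, \sqrt{1 - \rho^2}\; \sigma_S^2\, 2^{-2R} \tag{46}$$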

Example: Experimental R-D curves for N = 2
[Figure: distortion over bit rate. Curves: TC1 theoretical; TC1 experimental (ECSQ); optimal bit allocation using Pareto condition; cloud of RD points by combination of RD values of TC1 through TC4]

Example: Experimental R-D curves for N = 4
[Figure: distortion over bit rate. Curves: TC1 theoretical; TC1 experimental (ECSQ); optimal bit allocation using Pareto condition; cloud of RD points by combination of RD values of TC1 through TC4]

General Bit Allocation for Transform Coefficients
- For Gaussian sources, the following points need to be considered:
  - High-rate approximations are not valid for low bit rates; better approximations should be used for low rates
  - For low rates, Pareto conditions cannot be fulfilled for all transform coefficients, since the component rates $R_i$ must not be less than 0
  - Solution:
    - Use generalized approximation of $D_i(R_i)$ for component quantizers
    - Set component rates $R_i$ to zero for all transform coefficients for which the Pareto condition $\frac{\partial}{\partial R_i} D(R_i) = -\lambda$ cannot be fulfilled for $R_i \ge 0$
    - Distribute rate among remaining coefficients
- For non-Gaussian sources, the following needs to be considered in addition
  - The transform coefficients have different (non-Gaussian) distributions (except for large transform sizes)
  - Using the same quantizer design for all transform coefficients with $D_i(R_i) = \sigma_i^2\, g(R_i)$ is suboptimal

Linear Transform Examples
[Figure: basis vectors $b_0, \dots, b_7$ of the WHT, DFT, DCT, and KLT]
- WHT: Walsh-Hadamard Transform
- DFT: Discrete Fourier Transform
- DCT: Discrete Cosine Transform
- KLT: Karhunen Loève Transform

Karhunen Loève Transform (KLT)
Karhunen Loève Transform
- Orthogonal transform that decorrelates the input vectors
- Transform matrix depends on the source
- Autocorrelation matrix of input vectors $s$
  $$R_{SS} = E\{S S^T\} \tag{47}$$
- Autocorrelation matrix of transform coefficient vectors $u$
  $$R_{UU} = E\{U U^T\} = E\{(A S)(A S)^T\} = A\, E\{S S^T\}\, A^T = A\, R_{SS}\, A^T \tag{48}$$
- By multiplying with $A^{-1} = A^T$ from the front, we get
  $$R_{SS}\, A^T = A^T R_{UU} \tag{49}$$
- To get uncorrelated transform coefficients, we need to obtain a diagonal autocorrelation matrix $R_{UU}$ for the transform coefficients

Karhunen Loève Transform (cont'd)
- Expression for autocorrelation matrices
  $$R_{SS}\, A^T = A^T R_{UU} \tag{50}$$
- $R_{UU}$ is a diagonal matrix if the eigenvector equation
  $$R_{SS}\, b_i = \xi_i\, b_i \tag{51}$$
  is fulfilled for all basis vectors $b_i$ (column vectors of $A^T$, row vectors of $A$)
- The transform matrix $A$ decorrelates the input vectors if its rows are equal to the unit-norm eigenvectors $v_i$ of $R_{SS}$
  $$A_{KLT} = \begin{bmatrix} v_0 & v_1 & \cdots & v_{N-1} \end{bmatrix}^T \tag{52}$$
- The resulting autocorrelation matrix $R_{UU}$ is a diagonal matrix with the eigenvalues of $R_{SS}$ on its main diagonal
  $$R_{UU} = \begin{bmatrix} \xi_0 & 0 & \cdots & 0 \\ 0 & \xi_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \xi_{N-1} \end{bmatrix} \tag{53}$$
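As an illustration of how a KLT matrix could be obtained numerically, the sketch below diagonalizes a symmetric autocorrelation matrix with the classical cyclic Jacobi eigenvalue iteration. The choice of this numerical method, the function name `jacobi_klt`, and its parameters are assumptions made for the example; any symmetric eigensolver would serve the same purpose.

```python
import math

def jacobi_klt(r_ss, sweeps=50, tol=1e-24):
    """Return (eigenvalues, A_KLT) for a symmetric matrix r_ss.

    Rows of A_KLT are unit-norm eigenvectors, sorted by decreasing eigenvalue.
    """
    n = len(r_ss)
    a = [row[:] for row in r_ss]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                      # off-diagonal energy is negligible
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], folded into [-pi/4, pi/4]
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):         # columns p, q: a <- a G, v <- v G
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):         # rows p, q: a <- G^T a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    order = sorted(range(n), key=lambda i: -a[i][i])
    eigenvalues = [a[i][i] for i in order]
    a_klt = [[v[k][i] for k in range(n)] for i in order]
    return eigenvalues, a_klt

rho = 0.9                                  # assumed example correlation
eigenvalues, a_klt = jacobi_klt([[1.0, rho], [rho, 1.0]])
print(eigenvalues, a_klt)
```

For this 2x2 example the eigenvalues are $1 + \rho$ and $1 - \rho$, and the rows of the resulting matrix are, up to sign, $\frac{1}{\sqrt{2}}[1, 1]$ and $\frac{1}{\sqrt{2}}[1, -1]$.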

Optimality of KLT for Gaussian Sources
- Transform coding with orthogonal $N \times N$ transform matrix $A$ and $B = A^T$
- Scalar quantization using scaled quantizers
  $$D(R, A_k) = \sum_{i=0}^{N-1} \sigma_i^2(A_k)\; g(R_i) \tag{54}$$
  with $\sigma_i^2(A_k)$ being the variance of the $i$-th transform coefficient and $A_k$ being the transform matrix
- Consider an arbitrary orthogonal transform matrix $A_0$ and an arbitrary bit allocation given by the vector $r = [R_0, \dots, R_{N-1}]^T$ with $\sum_{i=0}^{N-1} R_i = R$
- Starting with the arbitrary orthogonal matrix $A_0$, apply an iterative algorithm that generates a series of orthonormal transform matrices $\{A_k\}$, $k = 1, 2, \dots$
- Iteration $A_{k+1} = J_k A_k$ consists of Jacobi rotation and re-ordering $\Rightarrow$ transform matrix approaches a KLT matrix
- Can show that for all $A_k$: $D(R, A_{k+1}) \le D(R, A_k)$ $\Rightarrow$ KLT is the optimal transform for Gaussian sources (minimizes MSE)

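Asymptotic Rate Distortion Efficiency for KLT of Gaussian Sources at High Rates
- Transform coefficient variances $\sigma_i^2$ are equal to the eigenvalues $\xi_i$ of $R_{SS}$
- High-rate approximation for Gaussian source and optimal ECSQ
  $$D(R) = \frac{\pi e}{6}\, \tilde{\sigma}^2\, 2^{-2R} = \frac{\pi e}{6}\, \tilde{\xi}\; 2^{-2R} = \frac{\pi e}{6}\; 2^{\frac{1}{N} \sum_{i=0}^{N-1} \log_2 \xi_i}\; 2^{-2R} \tag{55}$$
- For $N \to \infty$, we can apply Szegő's theorem for infinite Toeplitz matrices: if all eigenvalues $\xi_i$ of an infinite autocorrelation matrix ($N \to \infty$) are finite and $G(\xi_i)$ is any continuous function over all eigenvalues,
  $$\lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} G(\xi_i) = \frac{1}{2\pi} \int_{-\pi}^{\pi} G(\Phi(\omega))\, d\omega \tag{56}$$
- Resulting distortion-rate function for KLT of infinite size for high rates
  $$D_{KLT}^{\infty}(R) = \frac{\pi e}{6}\; 2^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log_2 \Phi_{SS}(\omega)\, d\omega}\; 2^{-2R} \tag{57}$$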

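Asymptotic Rate Distortion Efficiency for KLT of Gaussian Sources at High Rates (cont'd)
- Asymptotic distortion-rate function for KLT of infinite size for high rates
  $$D_{KLT}^{\infty}(R) = \frac{\pi e}{6}\; 2^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log_2 \Phi_{SS}(\omega)\, d\omega}\; 2^{-2R} \tag{58}$$
- Information distortion-rate function (fundamental bound) is smaller by a factor $\varepsilon^2 = \pi e / 6$
  $$D(R) = 2^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log_2 \Phi_{SS}(\omega)\, d\omega}\; 2^{-2R} \tag{59}$$
- Asymptotic transform gain ($N \to \infty$) at high rates
  $$G_T^{\infty} = \frac{\varepsilon^2\, \sigma_S^2\, 2^{-2R}}{D_{KLT}^{\infty}(R)} = \frac{\frac{1}{2\pi} \int_{-\pi}^{\pi} \Phi_{SS}(\omega)\, d\omega}{2^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log_2 \Phi_{SS}(\omega)\, d\omega}} \tag{60}$$
- Asymptotic transform gain ($N \to \infty$) at high rates is identical to the asymptotic prediction gain at high rates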

KLT Transform Size Gain for Gauss-Markov at High Rates
- Operational distortion-rate function for KLT of size $N$, ECSQ, and optimum bit allocation for Gauss-Markov sources with correlation factor $\rho$
  $$D_N(R) = \frac{\pi e}{6}\, \sigma_S^2\, (1 - \rho^2)^{1 - 1/N}\; 2^{-2R} \tag{61}$$
[Figure: gain $10 \log_{10} \frac{D_1(R)}{D_N(R)}$ in dB over transform size $N$ for $\rho = 0.9$, approaching $G_T^{\infty} = 7.21$ dB]
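Since $D_N(R)$ is proportional to $(1 - \rho^2)^{1 - 1/N}$, the gain of a size-$N$ KLT over direct scalar quantization ($N = 1$) follows in closed form. The sketch below evaluates it for a few sizes; the helper name `size_gain_db` and the chosen sizes are illustrative.

```python
import math

def size_gain_db(n, rho):
    """10*log10(D_1/D_N) with D_N proportional to (1 - rho^2)^(1 - 1/N)."""
    return -10 * (1 - 1 / n) * math.log10(1 - rho ** 2)

rho = 0.9
gains = {n: size_gain_db(n, rho) for n in (1, 2, 4, 8, 16)}
limit_db = -10 * math.log10(1 - rho ** 2)   # limit for N -> infinity
for n, g in gains.items():
    print(f"N = {n:2d}: {g:.2f} dB")
print(f"limit: {limit_db:.2f} dB")
```

Most of the asymptotic gain is already reached at moderate sizes: the gap to the limit shrinks in proportion to $1/N$.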

Distortion-Rate Functions for Gauss-Markov
- Distortion-rate curves for coding a first-order Gauss-Markov source with correlation factor $\rho = 0.9$ and different transform sizes $N$
[Figure: SNR [dB] over bit rate [bit/sample]. Curves: distortion-rate function $D(R)$; ECSQ+KLT for $N \to \infty$, $N = 16$, $N = 8$, $N = 4$, $N = 2$; EC-Lloyd (no transform). Annotations: space-filling gain 1.53 dB; $G_T^{\infty} = 7.21$ dB]

KLTs for Gauss-Markov Sources
[Figure: KLT basis vectors $b_0, \dots, b_7$ for $\rho = 0.1$, $\rho = 0.5$, $\rho = 0.9$, and $\rho = 0.95$]

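Walsh-Hadamard Transform
- Very simple orthogonal transform (only additions & final scaling)
- For transform sizes $N$ that are positive integer powers of 2
  $$A_N = \frac{1}{\sqrt{2}} \begin{bmatrix} A_{N/2} & A_{N/2} \\ A_{N/2} & -A_{N/2} \end{bmatrix} \quad \text{with} \quad A_1 = [1]. \tag{62}$$
- Transform matrix for $N = 8$
  $$A_8 = \frac{1}{2\sqrt{2}} \begin{bmatrix}
  1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
  1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
  1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
  1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
  1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
  1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
  1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
  1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
  \end{bmatrix} \tag{63}$$
- Piecewise-constant basis vectors
- Image & video coding: produces subjectively disturbing artifacts when combined with strong quantization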
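The recursive definition translates directly into code. This is a small sketch with a hypothetical function name `hadamard`; it builds the matrix in the row order produced by the recursion and includes the $1/\sqrt{2}$ scaling at every level, so the result is orthonormal.

```python
import math

def hadamard(n):
    """Orthonormal Walsh-Hadamard matrix of size n (n must be a power of 2)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive integer power of 2")
    if n == 1:
        return [[1.0]]
    half = hadamard(n // 2)
    scale = 1 / math.sqrt(2)
    top = [[scale * x for x in row + row] for row in half]
    bottom = [[scale * x for x in row + [-y for y in row]] for row in half]
    return top + bottom

a8 = hadamard(8)
for row in a8:
    print(" ".join("+" if x > 0 else "-" for x in row))   # sign pattern only
```

A practical implementation would usually work with the integer sign pattern and apply the scaling once at the end, which is what "only additions & final scaling" refers to.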

Discrete Fourier Transform (DFT)
- Discrete version of the Fourier transform
- Forward transform
  $$u[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} s[n]\, e^{-j \frac{2\pi k n}{N}} \tag{64}$$
- Inverse transform
  $$s[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} u[k]\, e^{\,j \frac{2\pi k n}{N}} \tag{65}$$
- DFT is an orthonormal transform (specified by a unitary transform matrix)
- Produces complex transform coefficients
- For real inputs, it obeys the symmetry $u[k] = u^*[N - k]$, so that $N$ real samples are mapped onto $N$ real values
- FFT is a fast algorithm for DFT computation, uses sparse matrix factorization
- Implies periodic signal extension: differences between the left and right signal boundary reduce the rate of convergence of the Fourier series
- Strong quantization $\Rightarrow$ significant high-frequency artifacts

Discrete Fourier Transform vs. Discrete Cosine Transform
[Figure: (a) input time-domain signal; (b) time-domain replica in case of DFT; (c) time-domain replica in case of DCT-II]

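Derivation of DCT
- High-frequency DFT quantization error components can be reduced by introducing implicit symmetry at the boundaries of the input signal and applying a DFT of approximately double length
- Signal with mirror symmetry
  $$s^*[n] = \begin{cases} s[n - 1/2] & : \; 0 \le n < N \\ s[2N - n - 3/2] & : \; N \le n < 2N \end{cases} \tag{66}$$
- Transform coefficients (orthonormal: divide $u^*[0]$ by $\sqrt{2}$)
  $$\begin{aligned}
  u^*[k] &= \frac{1}{\sqrt{2N}} \sum_{i=0}^{2N-1} s^*[i]\, e^{-j \frac{2\pi k i}{2N}} \\
  &= \frac{1}{\sqrt{2N}} \sum_{n=0}^{N-1} s[n - 1/2] \left( e^{-j \frac{\pi}{N} k n} + e^{-j \frac{\pi}{N} k (2N - n - 1)} \right) \\
  &= \frac{1}{\sqrt{2N}} \sum_{n=0}^{N-1} s[n] \left( e^{-j \frac{\pi}{N} k \left(n + \frac{1}{2}\right)} + e^{\,j \frac{\pi}{N} k \left(n + \frac{1}{2}\right)} \right) \\
  &= \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} s[n] \cos\!\left( \frac{\pi}{N}\, k \left( n + \frac{1}{2} \right) \right)
  \end{aligned} \tag{67}$$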

Discrete Cosine Transform (DCT)
- Implicit periodicity of DFT leads to loss of coding efficiency
- This can be reduced by introducing mirror symmetry at the boundaries and applying a DFT of approximately double size
- Due to mirror symmetry, imaginary sine terms get eliminated and only cosine terms remain
- Most common DCT is the so-called DCT-II (mirror symmetry with sample repetitions at both sides: $n = -\frac{1}{2}$)
- DCT and IDCT Type-II are given by
  $$u[k] = \alpha_k \sum_{n=0}^{N-1} s[n] \cos\!\left[ k \left( n + \frac{1}{2} \right) \frac{\pi}{N} \right] \tag{68}$$
  $$s[n] = \sum_{k=0}^{N-1} \alpha_k\, u[k] \cos\!\left[ k \left( n + \frac{1}{2} \right) \frac{\pi}{N} \right] \tag{69}$$
  where $\alpha_0 = \sqrt{\frac{1}{N}}$ and $\alpha_n = \sqrt{\frac{2}{N}}$ for $n \neq 0$
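The forward and inverse DCT-II formulas can be transcribed literally. The sketch below is a direct $O(N^2)$ evaluation, not a fast algorithm; the function names `dct2` and `idct2` and the test signal are chosen for illustration.

```python
import math

def alpha(k, n):
    """Normalization factor: sqrt(1/N) for k = 0, sqrt(2/N) otherwise."""
    return math.sqrt((1 if k == 0 else 2) / n)

def dct2(s):
    """Orthonormal forward DCT-II of a list of samples."""
    n = len(s)
    return [alpha(k, n) * sum(s[i] * math.cos(k * (i + 0.5) * math.pi / n)
                              for i in range(n))
            for k in range(n)]

def idct2(u):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(u)
    return [sum(alpha(k, n) * u[k] * math.cos(k * (i + 0.5) * math.pi / n)
                for k in range(n))
            for i in range(n)]

s = [10.0, 11.0, 12.0, 11.5, 10.5, 9.0, 8.0, 8.5]   # hypothetical smooth signal
u = dct2(s)
s_rec = idct2(u)
print([round(x, 3) for x in u])
```

For a smooth input such as this one, most of the energy lands in the first few coefficients, and a constant input produces a single nonzero (dc) coefficient.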

DCT vs. KLT
- Correlation matrix of a first-order Markov process can be written as
  $$R_{SS} = \sigma_S^2 \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{N-1} \\ \rho & 1 & \rho & \cdots & \rho^{N-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{N-1} & \rho^{N-2} & \rho^{N-3} & \cdots & 1 \end{bmatrix} \tag{70}$$
- DCT is a good approximation of the eigenvectors of $R_{SS}$
- DCT basis vectors approach the basis functions of the KLT for first-order Markov processes for $\rho \to 1$
- DCT does not depend on the input signal
- Fast algorithms for computing forward and inverse transform
- Justification for wide usage of DCT (or integer approximations thereof) in image and video coding: JPEG, H.261, H.262/MPEG-2, H.263, MPEG-4, H.264/AVC, H.265/HEVC

KLT Convergence Towards DCT for $\rho \to 1$
[Figure: basis vectors $b_0, \dots, b_7$ of the KLT for $\rho = 0.9$ and of the DCT-II]
[Figure: difference between the transform matrices of KLT and DCT-II, $\delta(\rho) = \lVert A_{KLT}(\rho) - A_{DCT} \rVert$, plotted over $\rho$]

Bit Allocation for Audio Signals
- Bit allocation based on human response to sound signals: psychoacoustic models (PM)
- After the transform, sets of frequencies are grouped together to form bands
- For each band, the PM gives the maximum allowed quantization noise so that the distortion cannot be heard, i.e. noise is masked
- The goals of the encoder are
  - Enforcing the bit rate specified by the user
  - Implementing the PM threshold
  - Adding noise in less offensive places when there are not enough bits
- As can be seen, the transform provides a means for eliminating not only redundancy but also irrelevancy

Transform Type
[Figure: coding efficiency for a speech signal [Zelinski and Noll, 1977]]

Two-dimensional Transforms
- 2-D linear transform: input image is represented as a linear combination of basis images
- An orthonormal transform is separable and symmetric if the transform of a signal block $s$ of size $N \times N$ can be expressed as
  $$u = A\,s\,A^T \tag{71}$$
  where $A$ is the transformation matrix and $u$ is the matrix of transform coefficients, both of size $N \times N$
- The inverse transform is
  $$s = A^T u\,A \tag{72}$$
- Great practical importance: the transform requires 2 matrix multiplications of size $N \times N$ instead of one multiplication of a vector of size $1 \times N^2$ with a matrix of size $N^2 \times N^2$
- Reduction of the complexity from $O(N^4)$ to $O(N^3)$
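A separable 2-D transform amounts to two matrix products. The sketch below uses the orthonormal DCT-II matrix as $A$ (one possible choice, assumed here for illustration) and hypothetical helper names; the inverse multiplies the coefficient matrix by $A^T$ from the left and $A$ from the right.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(k * (i + 0.5) * math.pi / n)
             for i in range(n)] for k in range(n)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_2d(s, a):
    """u = A s A^T: transform the columns, then the rows."""
    return matmul(matmul(a, s), transpose(a))

def inverse_2d(u, a):
    """s = A^T u A."""
    return matmul(matmul(transpose(a), u), a)

a = dct_matrix(4)
block = [[10.0 + r + 2 * c for c in range(4)] for r in range(4)]   # hypothetical ramp
coeffs = forward_2d(block, a)
restored = inverse_2d(coeffs, a)
print([[round(x, 3) for x in row] for row in coeffs])
```

For this linear ramp only the first row and first column of the coefficient matrix are nonzero, which illustrates the energy concentration of the separable DCT.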

DCT Example
[Figure: image block (left) and result of the column-wise DCT (right)]
- 1-d DCT is applied column-wise on the image block to obtain the column-wise DCT result
- Notice the energy concentration in the first row (DC coefficients)

DCT Example (cont'd)
[Figure: column-wise DCT result (left) and result after the additional row-wise DCT (right)]
- For convenience, the column-wise DCT result of the previous slide is repeated on the left side
- 1-d DCT is applied row-wise on the column-wise DCT result to obtain the final result
- Notice the energy concentration in the first coefficient

Entropy coding of transform coefficients
- AC coefficients are very likely equal to zero (for moderate quantization)
- For 2-d, ordering of the transform coefficients by zig-zag (or similar) scan
- Example for zig-zag scanning in case of a 2-d transform
[Figure: block of quantized transform coefficients traversed by a zig-zag scan]
- Huffman code for events {number of leading zeros, coefficient value} or events {end-of-block, number of leading zeros, coefficient value}
- Arithmetic coding: for example, use probabilities that a particular coefficient is unequal to zero when quantizing with a particular step size
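A zig-zag scan followed by the conversion into {number of leading zeros, coefficient value} events can be sketched as follows. The scan direction convention, the `"EOB"` marker, the inclusion of the dc coefficient in the scan, and the 4x4 block of values are all assumptions made for this example; actual codecs define these details in their specifications.

```python
def zigzag_order(n):
    """(row, col) index pairs of an n x n block in zig-zag order."""
    order = []
    for d in range(2 * n - 1):                 # d = row + col of the anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)              # even diagonals run bottom-left to top-right
        order.extend((r, d - r) for r in rows)
    return order

def run_level_events(block):
    """Events (leading zeros, value) in scan order, terminated by 'EOB'."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    events, run = [], 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    events.append("EOB")                       # trailing zeros are not coded
    return events

block = [[9, 3, 1, 0],                         # hypothetical quantized coefficients
         [-2, 0, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0]]
events = run_level_events(block)
print(events)
```

The scan places the low-frequency coefficients first, so the long run of trailing zeros collapses into the single end-of-block event.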

Analysis of DCT Coefficient Distributions for Images
- Let $s_{x,y}$ denote pixel intensity in a block. The $s_{x,y}$'s are assumed to be identically distributed, but not necessarily Gaussian
- DCT is a weighted summation of $s_{x,y}$'s
- By the central limit theorem, the weighted summation of identically distributed random variables can be well approximated as having a Gaussian distribution
- Therefore, DCT coefficients of this block should be approximately distributed as Gaussian
  $$f(u \mid \sigma^2) \approx \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{u^2}{2\sigma^2}} \tag{73}$$
- In typical images, the variance of the blocks has approximately an exponential distribution
  $$f(\sigma^2) \approx \lambda\, e^{-\lambda \sigma^2} \tag{74}$$
- Can show that the pdf of each transform coefficient then has approximately a Laplacian distribution
  $$f(u) \approx \frac{\sqrt{2\lambda}}{2}\; e^{-\sqrt{2\lambda}\,|u|} \tag{75}$$

Example Histograms of DCT Coefficients
[Figure: picture Lena; histogram of transform coefficients for picture Lena]

Distribution of Variances
[Figure: distribution of block variances $\sigma^2$ with the approximation $f(\sigma^2) \approx \lambda\, e^{-\lambda \sigma^2}$]

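Calculation of Approximate Distribution of DCT Coefficients
- Distribution of DCT coefficients can be written as
  $$f(u) = \int_0^{\infty} f(u \mid \sigma^2)\, f(\sigma^2)\, d\sigma^2 \tag{76}$$
- With the conditional pdf assumption
  $$f(u \mid \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{u^2}{2\sigma^2}} \tag{77}$$
  we get a Laplacian distribution for the transform coefficients
  $$\begin{aligned}
  f(u) &= \int_0^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{u^2}{2\sigma^2}}\; \lambda\, e^{-\lambda\sigma^2}\, d\sigma^2 \\
  &= \sqrt{\frac{2}{\pi}}\; \lambda \int_0^{\infty} e^{-\frac{u^2}{2\sigma^2} - \lambda\sigma^2}\, d\sigma \\
  &= \sqrt{\frac{2}{\pi}}\; \lambda \cdot \frac{1}{2} \sqrt{\frac{\pi}{\lambda}}\; e^{-\sqrt{2\lambda}\,|u|} = \frac{\sqrt{2\lambda}}{2}\; e^{-\sqrt{2\lambda}\,|u|}
  \end{aligned} \tag{78}$$
- Note (from integral table):
  $$\int_0^{\infty} e^{-a x^2 - \frac{b}{x^2}}\, dx = \frac{1}{2} \sqrt{\frac{\pi}{a}}\; e^{-2\sqrt{ab}}$$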

Modified Discrete Cosine Transform (MDCT)
- MDCT introduced by [Princen, Johnson, and Bradley 1987] to avoid block boundary artifacts
- MDCT is a lapped transform that maps $2N$-dimensional data to $N$-dimensional data
  $$F : \mathbb{R}^{2N} \to \mathbb{R}^{N}$$
- Forward transform
  $$u_k = \sum_{n=0}^{2N-1} s_n \cos\!\left[ \frac{\pi}{N} \left( n + \frac{1}{2} + \frac{N}{2} \right) \left( k + \frac{1}{2} \right) \right] \tag{79}$$
- Inverse transform
  $$s_n = \frac{1}{N} \sum_{k=0}^{N-1} u_k \cos\!\left[ \frac{\pi}{N} \left( n + \frac{1}{2} + \frac{N}{2} \right) \left( k + \frac{1}{2} \right) \right] \tag{80}$$
- Perfect reconstruction is achieved by adding the IMDCTs of subsequent overlapping blocks
- Used extensively in audio coding: MP3, AAC, Dolby AC3, etc.
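The overlap-add reconstruction can be checked with a direct transcription of the forward and inverse formulas. The sketch below applies no window function (an assumption of this example; practical audio codecs apply windows satisfying additional conditions), and the helper names and the test signal are illustrative.

```python
import math

def mdct(block):
    """Forward MDCT: 2N input samples -> N coefficients."""
    n = len(block) // 2
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n = len(coeffs)
    return [sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for k in range(n)) / n
            for i in range(2 * n)]

def overlap_add(signal, n):
    """Transform 50%-overlapping blocks of 2N samples and add the IMDCT outputs."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        y = imdct(mdct(signal[start:start + 2 * n]))
        for i, value in enumerate(y):
            out[start + i] += value
    return out

n = 4
signal = [float((3 * i * i + 5 * i) % 11) for i in range(4 * n)]   # arbitrary test data
rec = overlap_add(signal, n)
print(signal[n:3 * n])
print([round(x, 6) for x in rec[n:3 * n]])
```

Only samples covered by two overlapping blocks are reconstructed; the first and last $N$ samples of the sequence are covered once and remain aliased, which is why a real coder pads or handles the signal boundaries separately.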

MDCT Block Diagram
[Figure: frames $k$, $k+1$, $k+2$, $k+3$ of $N$ samples each; overlapping blocks of $2N$ samples → MDCT → $N$ coefficients → IMDCT → $2N$ samples; overlap-add of adjacent IMDCT outputs reconstructs frames $k+1$ and $k+2$]

Summary on Transform Coding
- Orthonormal transform: rotation of coordinate system in signal space
- Purpose of transform: decorrelation, energy concentration $\Rightarrow$ align quantization cells with primary axis of joint pdf
- KLT achieves optimum decorrelation, but is signal dependent and, hence, without a fast algorithm
- DCT shows reduced blocking artifacts compared to DFT
- For Gauss-Markov and $\rho \to 1$: DCT approaches KLT
- For Gaussian sources: bit allocation proportional to logarithm of variance, similar to bit allocation in discrete sets ($D + \lambda R$)
- For high rates: optimum bit allocation yields equal component distortion
- Larger transform size increases gain for Gauss-Markov source
- For picture coding: decorrelating transform + entropy-constrained quantization + zig-zag scan + entropy coding is widely used today (e.g. JPEG, MPEG-1/2/4, ITU-T H.261/2/3/4)
- For pictures: pdf of transform coefficients is Laplacian because of exponential distribution of block variances
- For audio coding: MDCT is widely used