Transform Coding Principle of block-wise transform coding Properties of orthonormal transforms Discrete cosine transform (DCT) Bit allocation for transform coefficients Entropy coding of transform coefficients Typical coding artifacts Fast implementation of the DCT Thomas Wiegand: Digital Image Communication Transform Coding - Transform Coding Principle Structure u U V T Quantizer T - v Transform coder (T α) / decoder ( β T - ) structure u U i V v T α β Τ Insert entropy coding ( γ) and transmission channel u U i b b i V T α γ channel γ β T - v Thomas Wiegand: Digital Image Communication Transform Coding -
Transform Coding and Quantization u U V v u U V T Q T -!!!! u U V v v Transformation of vector u=(u,u,,u ) into U=(U,U,,U ) Quantizer Q may be applied to coefficients U i separately (scalar quantization: low complexity) jointly (vector quantization, may require high complexity, exploiting of redundancy between U i ) Inverse transformation of vector V=(V,V,,V ) into v=(v,v,,v ) Why should we use a transform? Thomas Wiegand: Digital Image Communication Transform Coding - 3 Geometrical Interpretation A linear transform can decorrelate random variables An orthonormal transform is a rotation of the signal vector around the origin Parseval s Theorem holds for orthonormal transforms DC basis function U u U u u = u = T = u AC basis function U 3 U = = Tu = = U Thomas Wiegand: Digital Image Communication Transform Coding -
Properties of Orthonormal Transforms Forward transform transform coefficients U=Tu Transform matrix of size x input signal block of size Orthonormal transform property: inverse transform u=t U=TT U Linearity: u is represented as linear combination of basis functions Thomas Wiegand: Digital Image Communication Transform Coding - 5 Transform Coding of Images Exploit horizontal and vertical dependencies by processing blocks original image reconstructed image Original image block Reconstructed image block Transform T Transform coefficients Quantization Entropy Coding Transmission Inverse transform T - Quantized transform coefficients Thomas Wiegand: Digital Image Communication Transform Coding -
Separable Orthonormal Transforms, I Problem: size of vectors * (typical value of : 8) An orthonormal transform is separable, if the transform of a signal block of size *-can be expressed by U=TuT * transform coefficients Orthonormal transform matrix of size * * block of input signal The inverse transform is u= T UT Great practical importance: transform requires matrix multiplications of size * instead one multiplication of a vector of size* with a matrix of size * Reduction of the complexity from O( ) to O( 3 ) Thomas Wiegand: Digital Image Communication Transform Coding - 7 Separable Orthonormal Transforms, II Separable -D transform is realized by two -D transforms along rows and columns of the signal block u column-wise Tu row-wise TuTT -transform -transform * block of pixels * block of transform coefficients Thomas Wiegand: Digital Image Communication Transform Coding - 8
Criteria for the Selection of a Particular Transform Decorrelation, energy concentration KLT, DCT, Transform should provide energy compaction Visually pleasant basis functions pseudo-random-noise, m-sequences, lapped transforms, Quantization errors make basis functions visible Low complexity of computation Separability in -D Simple quantization of transform coefficients Thomas Wiegand: Digital Image Communication Transform Coding - 9 Karhunen Loève Transform (KLT) Decorrelate elements of vector u R u =E{uu T },! U=Tu,! R U =E{UU T }=TE{uu T }T T = TR u T T =diag{ } Basis functions are eigenvectors of the covariance matrix of the input signal. KLT achieves optimum energy concentration. Disadvantages: - KLT dependent on signal statistics - KLT not separable for image blocks - Transform matrix cannot be factored into sparse matrices. Thomas Wiegand: Digital Image Communication Transform Coding - α i
Comparison of Various Transforms, I Comparison of D basis functions for block size =8 () Karhunen Loève transform (98/9) () Haar transform (9) (3) Walsh-Hadamard transform (93) () Slant transform (Enomoto, Shibata, 97) () () (3) () (5) (5) Discrete CosineTransform (DCT) (Ahmet, atarajan, Rao, 97) Thomas Wiegand: Digital Image Communication Transform Coding - Comparison of Various Transforms, II Energy concentration measured for typical natural images, block size x3 (Lohscheller) KLT is optimum DCT performs only slightly worse than KLT Thomas Wiegand: Digital Image Communication Transform Coding -
DCT - Type II-DCT of blocksize M x M - D basis functions of the DCT: is defined by transform matrix A containing elements π(k + ) i aik = αi cos M ik, =...( M ) with α = M αi = i M Thomas Wiegand: Digital Image Communication Transform Coding - 3 Discrete Cosine Transform and Discrete Fourier Transform Transform coding of images using the Discrete Fourier Transform (DFT): edge For stationary image statistics, the energy folded concentration properties of the DFT converge against those of the KLT for large block sizes. Problem of blockwise DFT coding: blocking effects DFT of larger symmetric block -> DCT due to circular topology of the DFT and Gibbs phenomena. pixel folded Remedy: reflect image at block boundaries Thomas Wiegand: Digital Image Communication Transform Coding -
Histograms of DCT Coefficients: Image: Lena, 5x5 pixel DCT: 8x8 pixels DCT coefficients are approximately distributed like Laplacian pdf Thomas Wiegand: Digital Image Communication Transform Coding - 5 Distribution of the DCT Coefficients Central Limit Theorem requests DCT coefficients to be Gaussian distributed Model variance of Gaussian DCT coefficients distribution as random variable (Lam & Goodmann ) u pu ( σ ) = exp πσ σ Using conditional probability pu ( ) = pu ( σ ) p( σ ) dσ Thomas Wiegand: Digital Image Communication Transform Coding -
Distribution of the Variance Exponential Distribution { } p( σ ) = µ exp µσ Thomas Wiegand: Digital Image Communication Transform Coding - 7 Distribution of the DCT Coefficients pu ( ) = pu ( σ ) p( σ ) dσ u = exp µ exp{ µσ } dσ πσ σ u = µ µσ σ π exp d σ π µ u = µ exp π µ { } µ = exp { µ u } Laplacian Distribution Thomas Wiegand: Digital Image Communication Transform Coding - 8
Bit Allocation for Transform Coefficients I Problem: divide bit-rate R among transform coefficients such that resulting distortion D is minimized. DR ( ) = Di( Ri), s.t. Ri R i= i= Average Average distortion rate Distortion contributed by coefficient i Rate for coefficient i Approach: minimize Lagrangian cost function d ddi( Ri)! Di( Ri) + λ Ri = + λ = dr dr i i= i= Solution: Pareto condition ddi( Ri) = λ dr Move bits from coefficient with small distortion reduction per bit to coefficient with larger distortion reduction per bit Thomas Wiegand: Digital Image Communication Transform Coding - 9 i i Bit Allocation for Transform Coefficients II Assumption: high rate approximations are valid R dd ( ) i i Ri Ri Di( Ri) aσi, alnσi = λ dri σ σ + a ln σ σ λ " + = i R " σ + i log i log Ri log i log R log R ( σ i ) aln aln R = Ri = logσi + log = log + log i= = # i$%$$& λ i = λ #$%$& log " σ " σ Operational Distortion Rate function for transform coding: a a DR ( ) = D( R) = = a σ σ σ σ log i " R Ri R " i i i i σ i= i= i= Thomas Wiegand: Digital Image Communication Transform Coding - Geometric mean: " σ = ( σ i ) i =
Entropy Coding of Transform Coefficients Previous derivation assumes: AC coefficients are very likely to be zero Ordering of the transform coefficients by zig-zag scan Design Huffman code for event: # of zeros and coefficient value Arithmetic code maybe simpler Probability that coefficient is not zero when quantizing with: V i =*round(u i /).5 Ri 3 σ = max[(log i ), ] bit D 5 7 8 DC coefficient 8 7 5 3 Thomas Wiegand: Digital Image Communication Transform Coding - Entropy Coding of Transform Coefficients II Encoder 95 88 93 9 57 9 9 93 88 87 95 93 3 93 8 9 8 95 8 5 99 93 7 7 79 79 5 8 98 83 9 95 9 7 59 85 8 75 3 5 7 73 85 5 7 5 7 8 8 7 73 98 3 5 8 9 9 59 3 Original 8x8 block DCT 8 9 33-5 - 33-38 -5-7 -3-9 3 - - - 9 8 7 7-3 -5-3 - 3-8 -3-3 -5-9 3 3 3-7 - - -9 8-7 -5-8 -7 α 85 3-3 - - - - - - - zig-zag-scan (85 3 - -3 - - - - - - EOB) run-level coding Mean of block: 85 (,3) (,) (,) (,) (,) (,-) (,) (,) (,) (,-3) (,) (,-) (,) (,-) (,-) (,-) (,) (9,-) (,-) (EOB) entropy coding & transmission Thomas Wiegand: Digital Image Communication Transform Coding -
3 - - -3 3 - - -3 3 - - -3 3 - - -3 3 - - -3 3 - - -3 Entropy Coding of Transform Coefficients III Decoder transmission Mean of block: 85 (,3) (,) (,) (,) (,) (,- ) (,) (,) (,) (,-3) (,) (,-) (,) (,-) (,-) (,-) (,) (9,-) (,-) (EOB) Inverse zig-zagscan run-leveldecoding (85 3 - -3 - - - - - - EOB) 85 3-3 - - - - - - - Scaling and inverse DCT: β 9 93 87 9 79 7 9 89 98 88 8 98 9 9 8 85 89 9 97 7 59 8 89 7 8 8 77 5 53 87 89 99 78 5 3 85 79 7 93 7 5 79 97 7 9 98 95 93 9 5 8 79 9 9 9 85 9 57 Reconstructed 8x8 block Thomas Wiegand: Digital Image Communication Transform Coding - 3 Detail in a Block vs. DCT Coefficients Transmitted image block DCT coefficients quantized DCT block of block coefficients reconstructed of block from quantized coefficients Thomas Wiegand: Digital Image Communication Transform Coding -
Typical DCT Coding Artifacts DCT coding with increasingly coarse quantization, block size 8x8 quantizer step size quantizer step size quantizer step size for AC coefficients: 5 for AC coefficients: for AC coefficients: Thomas Wiegand: Digital Image Communication Transform Coding - 5 Influence of DCT Block Size Efficiency as a function of block size x, measured for 8 bit Quantization in the original domain and equivalent quantization in the transform domain. = Memoryless entropy of original signal mean entropy of transform coefficients Block size 8x8 is a good compromise between coding efficiency and complexity Thomas Wiegand: Digital Image Communication Transform Coding -
Fast DCT Algorithm I DCT matrix factored into sparse matrices (Arai, Agui, and akajima; 988): y= M x = S P M M M 3 M M 5 M x S = s s s s 3 s s 5 s s 7 P = M = - - M = - - M = 3 C -C C -C -C C M = - M = 5 - - - - M = - - - - Thomas Wiegand: Digital Image Communication Transform Coding - 7 Fast DCT Algorithm II Signal flow graph for fast (scaled) 8-DCT according to Arai, Agui, akajima: x s y scaling x s y x x 3 x x 5 a a a 3 s s s 5 s y y y 5 y only 5 + 8 Multiplications (direct matrix multiplication: multiplications) x a x 7 a 5 s 7 s 3 y 7 y 3 Multiplication: u m m u Addition: u v u v u+v u-v a= C a = C C a3 = C a = C+ C a5 = C s = sk = ; k =... 7 CK CK = cos(ks ) Thomas Wiegand: Digital Image Communication Transform Coding - 8
Transform Coding: Summary Orthonormal transform: rotation of coordinate system in signal space Purpose of transform: decorrelation, energy concentration KLT is optimum, but signal dependent and, hence, without a fast algorithm DCT shows reduced blocking artifacts compared to DFT Bit allocation proportional to logarithm of variance Threshold coding + zig-zag scan + 8x8 block size is widely used today (e.g. JPEG, MPEG-//, ITU-T H.//3) Fast algorithm for scaled 8-DCT: 5 multiplications, 9 additions Thomas Wiegand: Digital Image Communication Transform Coding - 9