Laboratory
Discrete Cosine Transform and Karhunen-Loeve Transform

Miaohui Wang, ID 55006952
Electronic Engineering, CUHK, Shatin, HK
Oct. 26, 202

1 Objective

1. To investigate the usage of transforms in visual signal coding.
2. To compare the performance of the Discrete Cosine Transform (DCT) and the Karhunen-Loeve Transform (KLT).

2 Introduction to Block Based Transform Coding

Block based transform coding converts spatial pixel values into transform coefficients in the frequency domain. Because the linear transforms used in image and video compression are unitary, the energy in the spatial domain equals the energy in the transform domain, but it is compacted into the low-frequency coefficients. The benefit is that most of the energy is carried by a few large transform coefficients. After quantization, most of the coefficients become zero, which saves many of the bits needed to represent the original image. The reasons for using a block transform in image compression can be summarized in two points:

- Energy compaction: a few basis functions are sufficient to represent a given image.
- Decorrelation: the coefficients in the transform domain are decorrelated.

Linear transform coding can be regarded as representing a given image with a set of basis functions. In other words, an image $X$ can be expanded as a linear combination of basis functions $s_i$:

\[ X = \sum_{i=1}^{n} \lambda_i s_i, \tag{1} \]

where the $s_i$ are the basis functions and the $\lambda_i$ are the corresponding expansion coefficients. For the KLT, the $s_i$ are the eigenvectors of the covariance matrix of the given signal, and the variance of each coefficient equals the corresponding eigenvalue.

Many types of unitary transforms have been used for image and video coding, including the Fourier transform, the Karhunen-Loeve transform [8], the Walsh-Hadamard transform [3], the discrete cosine transform [1], and, more recently, the wavelet transform in JPEG2000 [9]. The Karhunen-Loeve Transform (KLT) is derived from the statistics of the content, and therefore it has the optimum energy concentration. In general, the KLT has no fast algorithm because its kernel depends on the signal. Therefore, the discrete cosine transform was proposed by Ahmed, Natarajan, and Rao [1] to replace the KLT in image compression. The DCT has the following main advantages:

1). an explicit transform kernel that does not depend on the content
2). a butterfly algorithm that greatly speeds up computation
3). integer approximations of the kernel that avoid floating-point drift
4). nearly optimum performance, approaching that of the KLT for first-order Markov wide-sense stationary signals

3 Karhunen-Loeve Transform

3.1 Definition and Properties

Let $\{x(n),\ n = 1, \dots, N\}$ be a real random sequence, written as the vector $x$, and let $R = E(xx^T)$ denote its autocorrelation matrix (for complex-valued data, the transpose below is replaced by the conjugate transpose). Let the columns of $\Phi$ be the orthonormal eigenvectors of $R$, and let $\Lambda$ be the diagonal matrix of the corresponding eigenvalues. The KLT is defined by the following equation:

\[ y = \Phi^T x. \tag{2} \]

The KLT preserves the total energy of the input data:

\[ \|y\|^2 = y^T y = (\Phi^T x)^T (\Phi^T x) = x^T \Phi \Phi^T x = x^T x = \|x\|^2. \tag{3} \]

The KLT also converts the covariance matrix into a diagonal matrix:

\[ E(yy^T) = \Phi^T E(xx^T) \Phi = \Phi^T R \Phi = \Lambda. \tag{4} \]

3.2 Estimation of Covariance

The KLT is typically computed by finding the eigenvectors of the covariance matrix. When the entire signal is available, for example a given image, the covariance matrix can be estimated from its blocks:

\[ R = E\left[ (u - \bar{u})(u - \bar{u})^T \right] \approx \frac{1}{N} \sum_{i=1}^{N} (u_i - \bar{u})(u_i - \bar{u})^T = \frac{1}{N} \bar{U} \bar{U}^T, \tag{5} \]

where $u_i$ is the column vector obtained by lexicographic ordering of block $i$, $\bar{u}$ is the mean of the $N$ block vectors, and $\bar{U}$ is the matrix whose $i$-th column is $u_i - \bar{u}$.

3.3 Calculation of Eigenvectors

For the actual computation of the eigenvectors of a matrix, many numerical packages, such as MATLAB [4] and LINPACK [2], provide functions for the solution. The diagonalization of the matrix $R$ is given as:

\[ R = \Phi \Lambda \Phi^T. \tag{6} \]

3.4 Forward and Inverse KLT

The forward KLT is given by:

\[ y_i = \Phi^T (u_i - \bar{u}). \tag{7} \]

After quantization, $y_i$ becomes $\tilde{y}_i$, and the inverse KLT is given by:

\[ \tilde{u}_i = \Phi \tilde{y}_i + \bar{u}. \tag{8} \]

3.5 Experimental results

To evaluate the performance of the transform, the mean squared error (MSE) is employed, defined as:

\[ \mathrm{MSE} = \frac{1}{WH} \sum_{i} \sum_{j} \left( \mathrm{Org}(i,j) - \mathrm{rec}(i,j) \right)^2, \tag{9} \]

where $W$ and $H$ are the width and height of the image, respectively, and the sums run over all pixels.

Figure 1: Experimental results of KLT. Panels: Original; All coefficients (MSE: 3.072e-027); QP Type 1 (MSE: 587.5625); QP Type 2 (MSE: 240.5799); QP Type 3 (MSE: 66.425); QP Type 4 (MSE: 2.6366).

Fig. 1 shows that if all coefficients are kept, the original image is reconstructed perfectly (top-right panel). QP Type $i$ ($i = 1, 2, 3, 4$) denotes that the first 1, 3, 10 and 36 largest coefficients are preserved to reconstruct the original image. Fig. 1 also shows that the first few largest coefficients, for example those of QP Type 3, already represent the original image well, which is very convenient for data compression and transmission.

3.6 Basis of KLT

Fig. 2 shows the two-dimensional basis images of the KLT. Note that the basis images are obtained by reshaping each column $\phi_i$ of $\Phi$ into an 8x8 block.

Figure 2: Two dimensional basis image of KLT
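The procedure of Sections 3.2 to 3.4 (Eqs. 5 to 8) together with the MSE of Eq. (9) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used for the experiments above: the function names, the synthetic test image, and the use of `np.linalg.eigh` in place of the MATLAB or LINPACK routines are all choices made here for illustration. It also assumes that "keeping the first k coefficients" means keeping the k coefficients with the largest eigenvalues in every block.

```python
import numpy as np

def image_to_blocks(img, b=8):
    """Return the b x b blocks of img as columns u_i (lexicographic order)."""
    h, w = img.shape
    assert h % b == 0 and w % b == 0, "sketch assumes dimensions divisible by b"
    blocks = img.reshape(h // b, b, w // b, b).swapaxes(1, 2).reshape(-1, b * b)
    return blocks.T.astype(float)                 # shape (b*b, number of blocks)

def blocks_to_image(U, shape, b=8):
    """Inverse of image_to_blocks."""
    h, w = shape
    return U.T.reshape(h // b, w // b, b, b).swapaxes(1, 2).reshape(h, w)

def estimate_klt(U):
    """Estimate R as in Eq. (5) and diagonalize it as in Eq. (6)."""
    mean = U.mean(axis=1, keepdims=True)          # block mean, u-bar
    Uc = U - mean                                 # mean-removed blocks
    R = Uc @ Uc.T / U.shape[1]                    # sample covariance matrix
    lam, Phi = np.linalg.eigh(R)                  # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]                 # reorder: largest eigenvalue first
    return Phi[:, order], lam[order], mean

def klt_reconstruct(U, Phi, mean, keep):
    """Forward KLT (Eq. 7), zero all but the first `keep` coefficients, inverse KLT (Eq. 8)."""
    Y = Phi.T @ (U - mean)
    Y[keep:, :] = 0.0
    return Phi @ Y + mean

def mse(a, b):
    """Mean squared error of Eq. (9)."""
    return float(np.mean((a - b) ** 2))

# Synthetic, spatially correlated test image (made-up data, not the report's image).
rng = np.random.default_rng(0)
img = rng.normal(size=(64, 64)).cumsum(axis=0).cumsum(axis=1)

U = image_to_blocks(img)
Phi, lam, mean = estimate_klt(U)
for keep in (1, 3, 10, 36, 64):
    rec = blocks_to_image(klt_reconstruct(U, Phi, mean, keep), img.shape)
    print(keep, mse(img, rec))
```

Because the basis in this sketch is estimated from the same blocks it is applied to, the MSE after keeping the first k coefficients equals the sum of the discarded eigenvalues divided by the 64 pixels of a block, so it can only decrease as k grows.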
Figure 3: Four QP types of DCT

4 Discrete Cosine Transform

4.1 Definition and Properties

The discrete cosine transforms (DCT) are members of a family of sinusoidal unitary transforms [5]. They are real, orthogonal, and separable, with fast algorithms for their computation. The family of discrete trigonometric transforms contains 8 versions of the DCT. Each transform is identified as EVEN or ODD and as type I, II, III, or IV. DCT-II [1] is usually used for image compression. Unless stated otherwise, DCT means DCT-II in this report. For an input of length $N$, the 1-D forward and inverse DCT are defined as:

\[
\begin{aligned}
C(u) &= a(u) \sum_{i=0}^{N-1} f(i) \cos\left[ \frac{\pi}{N} \left( i + \frac{1}{2} \right) u \right], \\
f(i) &= \sum_{u=0}^{N-1} a(u)\, C(u) \cos\left[ \frac{\pi}{N} \left( i + \frac{1}{2} \right) u \right], \\
a(u) &= \begin{cases} \sqrt{1/N}, & u = 0, \\ \sqrt{2/N}, & u = 1, \dots, N-1, \end{cases}
\end{aligned}
\tag{10}
\]

where $C(u)$ are the transform coefficients, $f(i)$ is the input data, and $\cos\left[ \frac{\pi}{N} \left( i + \frac{1}{2} \right) u \right]$ is called the basis function. Similarly, the 2-D DCT is defined as:

\[
\begin{aligned}
C(u,v) &= a(u)\, a(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i,j) \cos\left[ \frac{\pi}{N} \left( i + \frac{1}{2} \right) u \right] \cos\left[ \frac{\pi}{N} \left( j + \frac{1}{2} \right) v \right], \\
a(u), a(v) &= \begin{cases} \sqrt{1/N}, & u, v = 0, \\ \sqrt{2/N}, & u, v = 1, \dots, N-1. \end{cases}
\end{aligned}
\tag{11}
\]

The 2-D DCT can be implemented in a separable matrix form. For an 8x8 block $f$ it is given by:

\[ V = T f T^T, \tag{12} \]

where $T$ is the DCT transform kernel, given by

\[
T(u,v) = \begin{cases}
\sqrt{\dfrac{1}{8}}, & u = 0,\ 0 \le v \le 7, \\[2mm]
\sqrt{\dfrac{1}{4}} \cos\left[ \dfrac{\pi}{8} \left( v + \dfrac{1}{2} \right) u \right], & 1 \le u \le 7,\ 0 \le v \le 7.
\end{cases}
\tag{13}
\]

4.2 Quantization of DCT coefficients

Fig. 3 shows the four QP types that are used to evaluate the performance of the DCT.

4.3 Experimental results

In Fig. 5, the top-right panel shows the result when all coefficients are kept. Several coefficient cut-offs are also tested. The experimental results show that the more transform coefficients are kept, the better the reconstructed image represents the original. Fig. 5 shows that QP Type 3 is able to reconstruct the original image very well.

4.4 Basis of DCT

Fig. 4 shows the two-dimensional basis images of the DCT. A simple comparison with those of the KLT shows that the DCT basis images vary in a regular, ordered way and contain no blurred-looking kernels.

Figure 4: Two dimensional basis image of DCT

5 Comparison of KLT and DCT

Fig. 6 shows the MSE results of the KLT and the DCT. The experimental results show that the KLT is slightly better than the DCT at reconstructing the image under the same conditions, but they also show that the DCT is very close to the KLT. In fact, the DCT is a close suboptimal approximation to the KLT under the first-order Markov process assumption when the correlation is high; $\rho = 0.95$ is the value suggested for natural images. Table 1 shows that the KLT gives a lower MSE than the DCT for every coefficient cut-off.

Types      1 coef     3 coefs    10 coefs   36 coefs
MSE-KLT    587.5625   240.5799   66.425     2.6366
MSE-DCT    596.939    302.89     93.37      9.958

Table 1: Comparison of KLT and DCT
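The separable matrix form of Eqs. (12) and (13) and a coefficient cut-off in the spirit of Section 4.2 can be illustrated with a short pure-Python sketch. The helper names and the ramp test block are made up for illustration. The triangular low-frequency mask, which keeps the coefficients with u + v below a threshold d, is only an assumption: it happens to keep 1, 3, 10 and 36 coefficients for d = 1, 2, 4, 8, but the report does not state how the QP types of Fig. 3 are actually defined.

```python
import math

N = 8  # block size used in the report

def dct_matrix(n=N):
    """Transform kernel T of Eq. (13): row index u, column index v."""
    T = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        T.append([scale * math.cos(math.pi / n * (v + 0.5) * u) for v in range(n)])
    return T

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(f, T):
    """Forward 2-D DCT in separable form, V = T f T^T (Eq. 12)."""
    return matmul(matmul(T, f), transpose(T))

def idct2(V, T):
    """Inverse 2-D DCT, f = T^T V T, valid because T is orthogonal."""
    return matmul(matmul(transpose(T), V), T)

def keep_low_frequencies(V, d):
    """Zero every coefficient with u + v >= d (hypothetical triangular mask)."""
    n = len(V)
    return [[V[u][v] if u + v < d else 0.0 for v in range(n)] for u in range(n)]

def mse(a, b):
    """Mean squared error between two square blocks (Eq. 9 applied to one block)."""
    n = len(a)
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

T = dct_matrix()
block = [[float(16 * i + 8 * j) for j in range(N)] for i in range(N)]  # made-up ramp block
V = dct2(block, T)
for d in (1, 2, 4, 8):
    kept = d * (d + 1) // 2                      # number of coefficients inside the mask
    rec = idct2(keep_low_frequencies(V, d), T)
    print(f"{kept:2d} coefficients kept, MSE = {mse(block, rec):.4f}")
```

Since T is orthogonal, the MSE of a truncated block equals the energy of the discarded coefficients divided by 64, which is why nested masks give a non-increasing error. Unlike the KLT, the kernel here is computed once from its definition and reused for every image.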
Figure 5: Experimental results of DCT. Panels: Original; All coefficients (MSE: .633e-026); QP Type 1 (MSE: 596.939); QP Type 2 (MSE: 302.89); QP Type 3 (MSE: 93.37); QP Type 4 (MSE: 9.958).

Figure 6: Performance comparison of KLT and DCT. Plot of MSE (vertical axis) against QP types (horizontal axis), with one curve for klt and one for dct.

6 Conclusion

In this laboratory [6, 7], we investigated two linear unitary transforms: the KLT and the DCT. The KLT is known as the optimal transform for a signal with a given covariance matrix. The KLT is therefore a signal-dependent transform, which means that the transform kernel has to be solved for again for each type of signal. In contrast, the transform matrix of the DCT is given explicitly by its definition, so the same kernel can be applied to different images without extra computation. The results show that only a few transform coefficients are needed to reconstruct the original image with acceptable visual quality. The experimental results also confirm that the DCT is suboptimal but close to the KLT. In addition, a fixed kernel makes it possible to design fast algorithms for the transform. Together, these advantages explain why the DCT is popular for image and video compression.

References

[1] N. Ahmed, T. Natarajan, and K.R. Rao. Discrete cosine transform. IEEE Transactions on Computers, 100(1):90-93, 1974.
[2] J.J. Dongarra, J.R. Bunch, C.B. Moler, and G.W. Stewart. LINPACK users' guide. Number 8. Society for Industrial Mathematics, 1987.
[3] B.J. Fino. Relations between Haar and Walsh/Hadamard transforms. Proceedings of the IEEE, 60(5):647-648, 1972.
[4] D. Hanselman and B.C. Littlefield. Mastering MATLAB 5: A comprehensive tutorial and reference. Prentice Hall PTR, 1997.
[5] A.K. Jain. A sinusoidal family of unitary transforms. IEEE Transactions on Pattern Analysis and Machine Intelligence, (4):356-365, 1979.
[6] L. Sheng and R. Shi. ELEG 543 laboratory notes.
[7] Ngan King Ngi. ELEG 543 lecture notes.
[8] E. Oja and J. Karhunen. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. Journal of Mathematical Analysis and Applications, 106(1):69-84, 1985.
[9] A. Skodras, C. Christopoulos, and T. Ebrahimi. The JPEG 2000 still image compression standard. IEEE Signal Processing Magazine, 18(5):36-58, 2001.