Rounding Transform and Its Application for Lossless Pyramid Structured Coding

ABSTRACT

A new transform, called the rounding transform (RT), is introduced in this paper. This transform maps an integer vector onto another integer vector by using weighted average and difference filters followed by a rounding operation. The RT can be applied to lossless pyramid structured coding with various elementary block sizes and filters. In addition, it generalizes other mean based lossless pyramid structured coding schemes.

I. INTRODUCTION

Although linear predictive coding [1,2] is usually used for lossless image compression, pyramid structured coding remains an attractive solution because it supports progressive transmission [3,4]. Two families of pyramid structured coding methods have been studied in depth: the sub-sample based methods [5-7] and the mean based methods [8-11]. The sub-sample based methods mainly transmit difference images between two successive levels of the sub-sample pyramid. The mean based methods, such as the S-Transform [8-10] and the hierarchy embedded differential image (HEDI) [11], construct a mean based pyramid by rounding the mean of each non-overlapping block and then generate the differences between neighboring nodes of the mean pyramid to achieve good lossless compression. In recent works [12,13], more efficient compression was obtained using S-Transform based predictive coding, which is referred to as the S+P (Sequential + Prediction) Transform or the RTS (Reversible Two Six) Transform. However, the computations for the mean based methods have not yet been generalized, mainly due to a lack of precise analysis of the rounding operations. For example, the usable block size is restricted to two sizes, 2 × 2 or 3 × 3, and the computations of the reconstruction must be done successively.
In this paper, we propose a new transform, called the rounding transform (RT), that maps an integer vector to another integer vector by using weighted average and difference filters followed by a rounding operation. The RT is a reversible transform if the floor (ceiling) and ceiling (floor) operations are used in the forward and inverse transformations, respectively. A proof of reversibility is given in Appendix A. The RT can be applied to lossless pyramid structured coding with various elementary block sizes and various filters. We show that the RT generalizes the previous mean based lossless pyramid structured coding schemes [8-11], including the reduced difference pyramid (RDP) [14]. All of these schemes can be implemented in parallel at both the transmitting and receiving ends if the proposed method is used.

II. ROUNDING TRANSFORM

The RT of a one-dimensional (1-D) integer vector $X_{N \times 1}$ and its inverse transform are defined as

$$Y_{N \times 1} = \lfloor R_N X_{N \times 1} \rfloor \tag{1}$$

$$X_{N \times 1} = \lceil R_N^{-1} Y_{N \times 1} \rceil \tag{2}$$

where $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ denote the floor and ceiling operations, $R_N$ is the $N \times N$ rounding transform matrix (RT matrix), and $R_N^{-1}$ is its inverse matrix. Note that the floor operator can be used in the inverse RT if the ceiling operator is used in the forward RT, i.e., the order of the rounding operations is interchangeable. A sufficient condition for the reversibility of the transform pair is that the non-singular matrix $R_N$ has the following properties: all the elements are integers except for those of the first row, the first row sums to 1, and every other row sums to 0. To avoid roundoff error due to finite precision arithmetic [15], the RT matrix should be machine representable. For its inverse to be machine representable as well, the determinant should be of the form $2^k$, where $k \in \mathbb{Z}$. A proof of reversibility is given in Appendix A.

The transformed vector $Y_{N \times 1}$ is formed by computing the weighted average and the $N-1$ difference values, followed by a floor operation. In fact, the floor operation has no effect on the differences, since they are calculated from integers. The original integer data can be exactly recovered from the floored weighted average and the differences by using the inverse RT with the ceiling operation. In a similar manner, a 2-D RT is defined as
$$Y_{M \times N} = \lfloor \lfloor R^r_{M \times M} X_{M \times N} \rfloor (R^c_{N \times N})^{t} \rfloor \tag{3}$$

$$X_{M \times N} = \lceil (R^r_{M \times M})^{-1} \lceil Y_{M \times N} (R^c_{N \times N})^{-t} \rceil \rceil \tag{4}$$

where the superscript $-t$ indicates the transposed inverse, and $R^r_{M \times M}$ and $R^c_{N \times N}$ satisfy the RT matrix conditions. Note that the order of the inverse transform must be the reverse of that of the forward transform.

The floored weighted average has the same dynamic range as the input data, and its probability distribution may be very similar to that of the input data. The difference values are significantly decorrelated, since they tend to be close to zero when the input data are highly correlated. Taking this decorrelation into account, the RT can be applied to lossless pyramid structured coding. Fig. 1 depicts the basic algorithm for constructing the pyramids on an elementary block of size M × N. This structure is similar to the other mean based pyramids [9-11,14], but various elementary block sizes can be used. In addition, the RT allows the use of various weighted averaging filters. We refer to the result as a lossless weighted average based pyramid. Using the 2-D RT, the weighted average and the difference pyramids can easily be constructed. The structure can also be obtained by using the 1-D RT if the block of size M × N is rearranged into a 1-D vector such as

$$X_{(MN) \times 1} = (x_{11}\; x_{12}\; \cdots\; x_{1N}\; x_{21}\; \cdots\; x_{2N}\; \cdots\; x_{MN})^{t} \tag{5}$$

and the RT matrix is of size MN × MN.
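As an illustration of the 2-D RT, the following Python sketch transforms a single 2 × 2 block with the S-Transform matrix [[1/2, 1/2], [1, -1]] and its inverse [[1, 1/2], [1, -1/2]]. It assumes that a floor is taken after each of the two matrix products in the forward transform and a ceiling after each product in the inverse transform. The helper names (`matmul`, `rt2d_forward`, `rt2d_inverse`) are hypothetical and chosen only for this example, and exact fractions are used so that no floating point roundoff enters.

```python
from fractions import Fraction as F
from math import floor, ceil


def matmul(a, b):
    """Exact product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def rounded(m, op):
    """Apply a rounding operator (floor or ceil) to every element."""
    return [[op(v) for v in row] for row in m]


# S-Transform pair, used here for both the row and the column matrix.
R = [[F(1, 2), F(1, 2)], [F(1), F(-1)]]
R_INV = [[F(1), F(1, 2)], [F(1), F(-1, 2)]]


def rt2d_forward(x, r_row, r_col):
    """Assumed nesting: Y = floor(floor(Rr X) Rc^t)."""
    z = rounded(matmul(r_row, x), floor)
    return rounded(matmul(z, transpose(r_col)), floor)


def rt2d_inverse(y, r_row_inv, r_col_inv):
    """Undo the two steps in reverse order, with ceilings."""
    z = rounded(matmul(y, transpose(r_col_inv)), ceil)
    return rounded(matmul(r_row_inv, z), ceil)


block = [[3, 4], [10, 7]]
coeffs = rt2d_forward(block, R, R)
restored = rt2d_inverse(coeffs, R_INV, R_INV)
print(coeffs)    # [[5, 1], [-5, -4]]
print(restored)  # [[3, 4], [10, 7]]
```

The top-left coefficient is the rounded weighted average that would be passed to the next pyramid level, and the other three coefficients are the difference values of the block.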
Fig. 1. The basic structure of the weighted average based pyramid for a block of size M × N. The weighted average and the difference images at level k are obtained from the weighted average image at level k+1.

III. RELATION WITH OTHER MEAN BASED METHODS

The other mean based pyramid coding methods can be represented by the RT matrix and its inverse, as shown in Table 1. The S-Transform is represented by a 2-D RT, where the same matrix is used for both $R^r_{2 \times 2}$ and $R^c_{2 \times 2}$. The HEDI and the RDP can be represented by the 1-D RT. HEDI-CM2 and triplet-a RDP are considered as representatives of the HEDI and the RDP, as these were empirically found to yield the lowest entropy in [11] and [14], respectively. Note that triplet-a RDP is the same as HEDI-CM2 except for the change in sign. It should be noted that the RDP, which is a lossy scheme in [11], can losslessly reconstruct the original image if the proposed rounding operations are used. In addition, the RT enables all of these schemes to be implemented in parallel at both the transmitting and receiving ends.

Method          RT dimension   Basic block size
S-Tr.           2-D            2 × 2
HEDI-CM2        1-D            2 × 2
Triplet-A RDP   1-D            2 × 2
Selected RT     2-D            4 × 4

RT matrix and inverse matrix of the S-Tr.:

$$R = \begin{pmatrix} 1/2 & 1/2 \\ 1 & -1 \end{pmatrix}, \qquad R^{-1} = \begin{pmatrix} 1 & 1/2 \\ 1 & -1/2 \end{pmatrix}$$

RT matrix of HEDI-CM2 and of triplet-A RDP: the first row is (1/4 1/4 1/4 1/4), and each of the remaining three rows is the difference of two pixels of the block (elements 0, 1, and -1).

RT matrix and inverse matrix of the selected RT:

$$R = \begin{pmatrix} a & 1/2 - a & 1/2 - a & a \\ 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \qquad R^{-1} = \begin{pmatrix} 1 & 1 - a & 1/2 & a \\ 1 & -a & 1/2 & a \\ 1 & -a & -1/2 & a \\ 1 & -a & -1/2 & -1 + a \end{pmatrix}$$

Table 1. Previous mean based pyramid methods and a selected 2-D RT. The S-Transform is represented by a 2-D RT. HEDI-CM2 and triplet-a RDP are represented by the 1-D RT.

Note that the reconstruction processes of the mean based pyramid methods, based on their
formulations [9-14], have to be calculated successively. These can now be calculated in parallel owing to the RT matrix representation.

IV. PERFORMANCE AND SIMULATION RESULTS

For our computer simulation, we use the 2-D RT with the 4 × 4 RT matrix (see the bottom of Table 1). The same matrix is employed for both $R^r_{4 \times 4}$ and $R^c_{4 \times 4}$, where a is machine representable. The matrix is designed with a symmetric weighted average filter, taking into account the phase linearity for the reduced resolution images. The differentiators take the difference between two adjacent pixels, since their correlation is generally the highest. In order to find the optimal a, we consider the output variance (the sum of the diagonal elements of the transformed auto-covariance matrix), since the entropy often increases with the output variance [16]. We assume that an input vector is a 1-D first order Markov process with a correlation coefficient ρ. Differentiating the output variance with respect to a and setting the result to zero yields the optimal coefficient $a_{opt} = 1/(2(2-\rho))$. Note that if $|\rho| \le 1$, then $1/6 \le a_{opt} \le 1/2$. Similarly, the output variance of the S-Transform (in this case, the S-Transform matrix is expanded to size 4 × 4) can be obtained. Fig. 2 shows the difference between the two output variances: the output variance of the proposed optimal RT matrix minus that of the S-Transform. From this figure, one can see that the proposed RT matrix produces a smaller variance than the S-Transform when ρ is larger than 0.542.
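The optimization of a can be checked numerically. The Python sketch below evaluates the output variance as the trace of $R C R^t$ for a unit-variance first order Markov input with covariance $C_{ij} = \rho^{|i-j|}$, ignoring the rounding operations. The selected RT matrix is taken with first row (a, 1/2 - a, 1/2 - a, a) and adjacent-pixel differences. The 4 × 4 matrix used for the S-Transform (two cascaded levels written as a single linear map) is an assumption made for this illustration, as are all function names.

```python
def ar1_covariance(n, rho):
    """Covariance of a unit-variance first order Markov process."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def output_variance(r, rho):
    """Trace of R C R^t, i.e. the sum of the output variances."""
    n = len(r)
    c = ar1_covariance(n, rho)
    return sum(row[i] * c[i][j] * row[j]
               for row in r for i in range(n) for j in range(n))


def rt_matrix(a):
    """Selected 4 x 4 RT matrix with a symmetric weighted average row."""
    b = 0.5 - a
    return [[a, b, b, a], [1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]]


# Assumed 4 x 4 expansion of the S-Transform (two levels, no rounding).
S4 = [[0.25, 0.25, 0.25, 0.25], [0.5, 0.5, -0.5, -0.5],
      [1, -1, 0, 0], [0, 0, 1, -1]]


def a_opt(rho):
    """Closed-form minimizer of the output variance."""
    return 1 / (2 * (2 - rho))


def variance_gap(rho):
    """Output variance of the optimal RT minus that of the S-Transform."""
    return (output_variance(rt_matrix(a_opt(rho)), rho)
            - output_variance(S4, rho))


def crossing(lo=0.0, hi=0.99, steps=50):
    """Bisection for the rho at which the gap changes sign."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if variance_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(a_opt(0.5), variance_gap(0.5) > 0, variance_gap(0.6) < 0)
print(round(crossing(), 3))
```

Under these assumptions the gap is positive at ρ = 0.5, negative at ρ = 0.6, and changes sign close to the value 0.542 quoted in the text.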
Fig. 2. Difference between the two output variances: the output variance (the sum of the diagonal elements of the transformed auto-covariance matrix) of the proposed optimal RT matrix minus that of the S-Transform. The proposed RT matrix produces a smaller output variance than the S-Transform when ρ > 0.542.

Two sets of images are used (see Table 2). The first seven images are well known; two of them are Lena images: Lena-g (green component) and Lena-y (luminance). The others are medical images: "MAMMOGRAM", cardiac "ANGIOGRAM", chest "X-RAY", and head "M.R.I.". All the test images are 8-bit images of size 512 × 512, except for the first two images, which are of size 256 × 256. The proposed RT is compared, in terms of the total lossless first order entropy, with the three mean based pyramid structured coding schemes, the six-level S+P Transform predictor A [12] (or RTS-Transform [13]), and the lossless JPEG third-order predictors #4 and #7 [1,2]. These entropies are shown in Table 2. Three values are selected for the parameter a of the RT matrix. However, it has been observed, by varying a, that the entropy curve is quite flat near the minimum when $1/6 \le a \le 1/2$. Hence, parameter tuning is not very important in this region. The simulation results indicate that our proposed scheme is generally more effective than the HEDI, the RDP, and the S-Transform. Its performance is very close to that of lossless JPEG, but it is inferior to the S+P Transform for the majority of the test images.

Test        Original   HEDI-CM2,  S-Tr.      RT         RT         RT         S+P Tr.    JPEG       JPEG
image       entropy    RDP                   a=1/4      a=5/16     a=3/8      (A),       #4         #7
                                                                              RTS-Tr.
Camera      7.009      5.073      4.960      4.949      4.953      4.969      4.978      5.089      4.849
Couple      6.390      4.649      4.501      4.301      4.297      4.301      4.327      4.234      4.370
Peppers     7.592      5.112      4.996      5.103      5.108      5.121      4.787      5.346      4.849
Lena-g      7.594      5.181      5.105      5.039      5.034      5.039      4.828      5.175      4.911
Lena-y      7.445      4.893      4.817      4.719      4.713      4.717      4.504      4.808      4.614
Baboon      7.357      6.559      6.369      6.356      6.349      6.357      6.219      6.610      6.259
Airplane    6.705      4.818      4.649      4.460      4.453      4.451      4.383      4.408      4.358
MAMM.       7.576      3.885      3.715      3.523      3.513      3.508      3.369      3.470      4.342
ANGIO.      6.857      4.168      4.133      4.170      4.168      4.175      3.916      4.364      3.826
X-RAY       6.667      2.924      2.768      2.631      2.632      2.632      2.506      2.623      2.398
MRI.        5.124      3.698      3.424      3.071      3.060      3.053      2.835      2.718      3.166
Mean        6.937      4.632      4.494      4.392      4.389      4.393      4.241      4.440      4.358

HEDI-CM2, RDP, S-Tr., and RT are pyramid schemes with non-overlapping blocks; the S+P Transform is a pyramid scheme with overlapping blocks; JPEG is a non-pyramid scheme. Reconstruction is successive for HEDI-CM2, RDP, S-Tr., and the S+P Transform, and parallel for the RT.

Table 2. Comparative evaluation of the total lossless first order entropy (bpp) obtained with the proposed RT matrix (a = 1/4, 5/16, and 3/8), the other mean based pyramid methods, the six-level S+P Transform predictor A (RTS-Tr.), and lossless JPEG with the third-order predictors #4 and #7.

V. CONCLUSION

An integer-to-integer transform, called the rounding transform, was proposed in this paper. We have shown that the RT is reversible. We have applied the RT to weighted average based lossless pyramid structured coding with various elementary block sizes and various filters. The RT generalizes the previous mean based lossless pyramid structured coding schemes, including the RDP. We also showed that the proposed RT is effective for lossless pyramid structured coding.

Acknowledgment. The authors would like to thank the reviewers for their helpful comments and suggestions.

REFERENCES

[1] G. K. Wallace, "The JPEG Still Picture Compression Standard," Communications of the ACM, vol. 34, no. 4, pp. 30-44, April 1991.
[2] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, N.Y., USA, 1993.
[3] K. H. Tzou, "Progressive Image Transmission: A Review and Comparison of Techniques," Optical Engineering, vol. 26, no. 7, pp. 581-589, July 1987.
[4] K. R. Sloan Jr. and S. L. Tanimoto, "Progressive Refinement of Raster Images," IEEE Trans. Computers, vol. C-28, no. 11, pp. 871-874, November 1979.
[5] P. Roos, M. A. Viergever, M. C. A. van Dijke, and J. H. Peters, "Reversible Intraframe Compression of Medical Images," IEEE Trans. Medical Imaging, vol. 7, no. 4, pp. 328-336, December 1988.
[6] D. Houlding and J. Vaisey, "Low Entropy Image Pyramids for Efficient Lossless Coding," IEEE Trans. Image Processing, vol. 4, no. 8, pp. 1150-1153, August 1995.
[7] B. Aiazzi, L. Alparone, and S. Baronti, "A Reduced Laplacian Pyramid for Lossless and Progressive Image Communication," IEEE Trans. Communications, vol. 44, no. 1, pp. 18-22, January 1996.
[8] A. Lux, "A Novel Set of Closed Orthogonal Functions for Picture Coding," Arch. Elek. Ubertragung, vol. 31, pp. 267-274, 1977.
[9] I. A. Shah, O. A.-Assani, and B. Johnson, "A Chip Set for Lossless Image Compression," IEEE Journal of Solid-State Circuits, vol. 26, no. 3, pp. 237-244, March 1991.
[10] H. Blume and A. Fand, "Reversible and Irreversible Image Data Compression Using the S-Transform and Lempel-Ziv Coding," in Proc. SPIE Medical Imaging III: Image Capture and Display, vol. 1091, 1989, pp. 2-18.
[11] W. Y. Kim, P. T. Balsara, D. T. Harper III, and J. W. Park, "Hierarchy Embedded Differential Images for Progressive Transmission Using Lossless Compression," IEEE Trans. Circuits and Systems for Video Technology, vol. 5, no. 1, pp. 1-13, February 1995.
[12] A. Said and W. A. Pearlman, "An Image Multiresolution Representation for Lossless and Lossy Compression," IEEE Trans. Image Processing, vol. 5, no. 9, pp. 1303-1310, September 1996.
[13] A. Zandi, J. Allen, E. Schwartz, and M. Boliek, "CREW: Compression with Reversible Embedded Wavelets," IEEE Data Compression Conference, Snowbird, Utah, USA, pp. 212-221, March 1995.
[14] L. Wang and M. Goldberg, "Reduced-Difference Pyramid: A Data Structure for Progressive Image Transmission," Optical Engineering, vol. 28, no. 7, pp. 708-716, July 1989.
[15] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Maryland, USA, 1983.
[16] A. Papoulis, Probability, Random Variables, and Stochastic Processes (second edition), McGraw-Hill Book Company, NY, USA, 1984.
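APPENDIX A

Given an $N \times N$ matrix $R_N = (r_{ij})$ satisfying the following RT matrix conditions: $R_N$ is a non-singular matrix where $\sum_{j=1}^{N} r_{ij} = 1$ for $i = 1$, $\sum_{j=1}^{N} r_{ij} = 0$ for $i = 2, \ldots, N$, and $r_{ij} \in \mathbb{Z}$ for $i = 2, \ldots, N$ and $j = 1, \ldots, N$, it immediately follows that the first column of the inverse matrix is the unit column vector, i.e., the vector whose elements are all equal to 1. Rewrite $R_N$ and $R_N^{-1}$ as

$$R_N = \begin{pmatrix} H_{1 \times N} \\ H_{(N-1) \times N} \end{pmatrix}, \qquad R_N^{-1} = \begin{pmatrix} V_{N \times 1} & V_{N \times (N-1)} \end{pmatrix} \tag{A-1}$$

Since

$$R_N^{-1} R_N = \begin{pmatrix} V_{N \times 1} & V_{N \times (N-1)} \end{pmatrix} \begin{pmatrix} H_{1 \times N} \\ H_{(N-1) \times N} \end{pmatrix} = V_{N \times 1} H_{1 \times N} + V_{N \times (N-1)} H_{(N-1) \times N} = I_N \tag{A-2}$$

where $I_N$ is the identity matrix, $V_{N \times 1}$ must be the unit column vector, since there is a unique solution for $R_N^{-1}$ and the first row of $R_N$ sums to 1 while all other rows sum to 0.

We prove the reversibility of the RT using the definition of the RT. The floor is omitted on $H_{(N-1) \times N} X_{N \times 1}$ because these elements are integers.

$$
\begin{aligned}
\lceil R_N^{-1} \lfloor R_N X_{N \times 1} \rfloor \rceil
&= \left\lceil \begin{pmatrix} V_{N \times 1} & V_{N \times (N-1)} \end{pmatrix} \begin{pmatrix} \lfloor H_{1 \times N} X_{N \times 1} \rfloor \\ H_{(N-1) \times N} X_{N \times 1} \end{pmatrix} \right\rceil \\
&= \lceil V_{N \times 1} \lfloor H_{1 \times N} X_{N \times 1} \rfloor + V_{N \times (N-1)} H_{(N-1) \times N} X_{N \times 1} \rceil \\
&= \lceil V_{N \times 1} \lfloor H_{1 \times N} X_{N \times 1} \rfloor + (I_N - V_{N \times 1} H_{1 \times N}) X_{N \times 1} \rceil \\
&= X_{N \times 1} + \lceil V_{N \times 1} ( \lfloor H_{1 \times N} X_{N \times 1} \rfloor - H_{1 \times N} X_{N \times 1} ) \rceil \\
&= X_{N \times 1}
\end{aligned}
\tag{A-3}
$$

The last step holds because every element of $V_{N \times 1}$ is 1 and $-1 < \lfloor H_{1 \times N} X_{N \times 1} \rfloor - H_{1 \times N} X_{N \times 1} \le 0$, so the ceiling of each element is 0.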
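The reversibility argument can also be checked by brute force. The following Python sketch applies the 1-D RT and its inverse to every integer vector of length 4 with elements between -3 and 3. It assumes the 4 × 4 RT matrix with first row (a, 1/2 - a, 1/2 - a, a), adjacent-pixel differences, and a = 5/16; the inverse is written out by solving the averaging and difference equations by hand. The names `selected_rt`, `apply`, and `round_trips` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product
from math import floor, ceil


def selected_rt(a):
    """Return the assumed 4 x 4 RT matrix and its inverse (exact)."""
    b = F(1, 2) - a
    r = [[a, b, b, a], [1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]]
    r_inv = [
        [1, 1 - a, F(1, 2), a],
        [1, -a, F(1, 2), a],
        [1, -a, F(-1, 2), a],
        [1, -a, F(-1, 2), a - 1],
    ]
    return r, r_inv


def apply(m, v, op):
    """Matrix-vector product followed by element-wise rounding."""
    return [op(sum(c * x for c, x in zip(row, v))) for row in m]


R, R_INV = selected_rt(F(5, 16))


def round_trips(x, fwd=floor, inv=ceil):
    """True if the inverse RT recovers x from its forward RT."""
    return apply(R_INV, apply(R, x, fwd), inv) == list(x)


VECTORS = list(product(range(-3, 4), repeat=4))
ALL_OK = all(round_trips(x) for x in VECTORS)
SWAPPED_OK = all(round_trips(x, ceil, floor) for x in VECTORS)
SAME_OP_OK = all(round_trips(x, floor, floor) for x in VECTORS)
print(ALL_OK, SWAPPED_OK, SAME_OP_OK)  # True True False
```

The floor/ceiling pair and the swapped ceiling/floor pair both recover every vector, whereas using the floor in both directions fails as soon as the weighted average is not an integer.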
Owing to finite precision arithmetic [15], the elements of the first row of the RT matrix should be machine representable (i.e., of the form $\sum_k c_k 2^k$, where $c_k \in \{0, 1\}$ and $k \in \mathbb{Z}$) so that the condition that the first row sums to 1 is not violated. In addition, as $R_N^{-1} = \mathrm{adj}(R_N) / \det(R_N)$, the determinant should be of the form $2^k$, where $k \in \mathbb{Z}$.
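The RT matrix conditions can be tested mechanically for a candidate matrix. The Python sketch below is one possible check using exact rational arithmetic. Machine representability is interpreted here as having a power-of-two denominator, and the determinant test is applied to the absolute value of the determinant; both interpretations, together with the function names, are assumptions of this example. The 4 × 4 matrix with first row (a, 1/2 - a, 1/2 - a, a) and adjacent-pixel differences is used as the test case.

```python
from fractions import Fraction as F


def is_power_of_two(q):
    """True if q equals 2**k for some integer k (k may be negative)."""
    q = F(q)
    n, d = q.numerator, q.denominator
    return n > 0 and n & (n - 1) == 0 and d & (d - 1) == 0


def is_dyadic(q):
    """True if q has a finite binary expansion, i.e. q = m / 2**k."""
    d = F(q).denominator
    return d & (d - 1) == 0


def det(m):
    """Exact determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return F(m[0][0])
    return sum((-1) ** j * v * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, v in enumerate(m[0]))


def rt_conditions(r):
    """Evaluate the RT matrix conditions for a square matrix r."""
    return {
        "first_row_sums_to_1": sum(r[0]) == 1,
        "other_rows_sum_to_0": all(sum(row) == 0 for row in r[1:]),
        "other_rows_integer": all(F(v).denominator == 1
                                  for row in r[1:] for v in row),
        "first_row_machine_representable": all(is_dyadic(v) for v in r[0]),
        "det_magnitude_power_of_two": is_power_of_two(abs(det(r))),
    }


def selected_rt(a):
    b = F(1, 2) - a
    return [[a, b, b, a], [1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]]


GOOD = rt_conditions(selected_rt(F(5, 16)))
BAD = rt_conditions(selected_rt(F(1, 3)))
print(all(GOOD.values()))                      # True
print(BAD["first_row_machine_representable"])  # False
```

With a = 5/16 every condition holds, while a = 1/3 still gives rows with the correct sums but a first row that has no finite binary expansion.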