VIDEO CODING USING A SELF-ADAPTIVE REDUNDANT DICTIONARY CONSISTING OF SPATIAL AND TEMPORAL PREDICTION CANDIDATES

Author 1 and Author 2
Address - Line 1, Address - Line 2, Address - Line 3

ABSTRACT

All standard video coders are based on the prediction-plus-transform representation of an image block, which predicts the current block using various intra- and inter-prediction modes and then represents the prediction error using a fixed orthonormal transform. We propose to directly represent a mean-removed block using a redundant dictionary consisting of all possible inter-prediction candidates with integer motion vectors (mean-removed) and the basis vectors of an orthonormal basis (e.g., the DCT). We determine the coefficients by minimizing the L1 norm of the coefficients subject to a constraint on the approximation error. We show that such a self-adaptive dictionary can lead to a very sparse representation, with significantly fewer non-zero coefficients than applying the DCT to the prediction error. We further propose to orthonormalize the chosen atoms using a modified Gram-Schmidt process and to quantize the coefficients associated with the resulting orthonormalized basis vectors. Each image block is represented by its mean, which is predictively coded, the indices of the chosen atoms, and the quantized coefficients. Each variable is coded based on its unconditional distribution. Simulation results show that the proposed coder can achieve significant gain over the H.264 coder (x264).

1. INTRODUCTION

Recent progress in sparse representation has shown that signal representation using a redundant dictionary can be more efficient than using an orthonormal transform, because the redundant dictionary can be designed so that a typical signal is approximated well by a sparse set of dictionary atoms [1]. Instead of using a fixed dictionary learned from training image blocks, we propose to represent each image block in a video frame using a self-adaptive dictionary consisting of all possible spatial and temporal prediction candidate blocks following a preset prediction rule. For example, it may include all inter-prediction candidates, which are shifted blocks of the same size in the previous frame within a defined search range, and all possible intra-prediction candidates, which are obtained with the various intra-prediction modes of the H.264/HEVC encoder. The rationale for using such prediction candidates as the dictionary atoms is that the current block is likely to be very similar to a few of these candidates, and hence only a few candidates may be needed to represent the current block accurately. To anticipate the event that some blocks cannot be represented efficiently by the prediction candidates, we also incorporate some pre-designed fixed atoms in the redundant dictionary. Essentially, these fixed atoms are used to describe the residual error left by the chosen prediction candidates; they also serve to mitigate the accumulation of reconstruction errors in previously decoded frames. Currently, we simply use the DCT basis vectors for the fixed part, considering that all current video coders use a DCT (or DCT-like) basis to specify the prediction error. Optimal design of this fixed part is subject to further study. We determine the sparse set of dictionary atoms and the coefficients associated with them by minimizing the L1 norm of the coefficients subject to a constraint on the approximation error.
In all currently prevalent block-based video coding standards [2], a single best prediction candidate is chosen among all prediction candidates to predict the current block, and the prediction error block is then represented with a fixed orthogonal transform (e.g., the Discrete Cosine Transform, or DCT). This method essentially represents the current block by a slightly redundant dictionary consisting of a fixed set of atoms that are basis elements of the orthonormal transform plus the best matching candidate. Furthermore, the coefficient corresponding to the best matching candidate is constrained to be 1. When a fractional-pel motion vector or multiple reference frames are used, instead of a single best prediction candidate the coder uses a linear combination of a few candidates, with preset constraints on the possible combinations of candidates and their weights. It is natural to wonder: if we do not force such constraints, would we be able to represent the prediction error with fewer DCT basis vectors? The proposed representation allows any weighted combination of the prediction candidates, and hence covers the above prediction-plus-transform approach as a special case. We have found that using the proposed self-adaptive dictionary can lead to a very sparse representation, with significantly fewer non-zero coefficients than using the DCT on the error between the original block and the best prediction candidate.

Several research groups have attempted to use redundant dictionaries for block-based image and video coding, including [3-7]. In all reported dictionary-based video coders, the dictionary atoms are used to represent the motion-compensation error block for interframe video coding; they are therefore very different from what is proposed here. Instead of using a single dictionary, [6] uses multiple dictionaries, pre-designed for different residual energy levels. The work in [7] codes each frame in intra mode, with a dictionary that is updated in real time based on previously coded frames. Although such online adaptation can yield a dictionary that matches the video content very well, it is computationally very demanding. The proposed framework uses a self-adaptive dictionary that depends only on the block location, without requiring real-time design or redesign of the dictionary.

A major challenge in applying sparse representation to compression is that the dictionary atoms are generally not orthogonal. Quantizing the coefficients associated with them directly and independently is not efficient. First, the quantization errors of the coefficients are related to the errors in the reconstructed samples in a complicated way. Second, these coefficients are likely to be highly correlated.

To the best of our knowledge, none of the dictionary-based video coders has produced compression performance better than the H.264 and HEVC standards. We believe that one reason these coders have not been more successful is that they quantize the sparse coefficients associated with the chosen atoms directly. We propose instead to represent the subspace spanned by the chosen atoms by a set of orthonormal vectors. The coefficients corresponding to these orthonormal vectors are much less correlated and can be quantized and coded independently without losing coding efficiency. We find the orthonormal vectors and their corresponding quantized coefficients jointly through a modified Gram-Schmidt orthogonalization process with embedded quantization. The encoder only specifies which atoms are chosen (a subset of the originally chosen atoms) and the quantized coefficients corresponding to the orthonormal vectors. The decoder can perform the same orthonormalization process on the chosen atoms to derive the orthonormal vectors used at the encoder. We note that this method of orthonormalizing the original dictionary atoms and performing quantization and coding in the orthonormalized subspace representation is applicable to any dictionary-based coding method.

In the remainder of this paper, we describe the specific algorithms used for the different parts of the proposed coder in Sec. 2-4 and show simulation results in Sec. 5. We conclude the paper in Sec. 6.

2. SPARSE REPRESENTATION USING SPATIAL-TEMPORAL PREDICTION CANDIDATES

Instead of representing the original block using the prediction candidates directly, we perform mean subtraction on the original block, and mean subtraction and normalization on the candidates. We use N to denote the total number of atoms, which includes all prediction candidates and a pre-designed set of atoms (in our current implementation, all 2-D DCT basis vectors except the all-constant DC one). We denote the mean-removed block by F and the dictionary atoms by A_n, n = 1, 2, ..., N. Note that F and A_n are vector representations of 2-D blocks, each of dimension M, where M is the number of pixels in a block. Generally, M < N, so that all the atoms form a redundant dictionary. To derive the sparse representation of F using A_n with coefficients w_n, we solve the following constrained optimization problem:

\min_{\{w_n\}} \sum_n |w_n| \quad \text{subject to} \quad \frac{1}{M} \Big\| \sum_n w_n A_n - F \Big\|_2^2 \le \epsilon_1 \qquad (1)

Note that this is a classical sparse coding (LASSO) problem, and there are various methods to solve it. We use the least angle regression method (LARS) [8], using the MATLAB code provided at [9]. This algorithm is chosen because of its fast convergence and because it handles the constrained formulation directly, so that we can control the target representation error ε₁ directly. Note that the final reconstruction error is the sum of the sparse representation error and the error due to quantization of the coefficients, assuming the two types of error are independent. Therefore, ε₁ should be proportionally smaller than the target final reconstruction error. Given a target reconstruction error, how to optimally allocate between the sparse representation error and the quantization error remains an open research problem. In our current implementation, we set the approximation error ε₁ to half of the targeted reconstruction error. Better rate-distortion performance is expected from optimizing this allocation.
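To make this step concrete, the following minimal sketch (Python with NumPy and scikit-learn's lars_path standing in for the LARS solver of [8, 9]; the function names build_dictionary and sparse_code, the search-grid convention, and the normalization details are illustrative assumptions, not the authors' implementation) builds the self-adaptive dictionary for one block and walks the lasso path until the per-pixel error constraint of Eq. (1) is met:

```python
import numpy as np
from sklearn.linear_model import lars_path


def dct_matrix(B):
    """Orthonormal 1-D DCT-II matrix; row k is the k-th basis vector."""
    n = np.arange(B)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * B))
    C[0, :] *= 1.0 / np.sqrt(B)
    C[1:, :] *= np.sqrt(2.0 / B)
    return C


def build_dictionary(ref_frame, row, col, B=16, R=24):
    """Self-adaptive dictionary for the B x B block at (row, col):
    (i) all integer-shift inter-prediction candidates from the reference
    frame within a +/-R search range, mean-removed and L2-normalized, and
    (ii) the 2-D DCT basis vectors except the all-constant (DC) one."""
    atoms = []
    H, W = ref_frame.shape
    for dy in range(-R, R):          # 48 x 48 = 2304 shifts (grid convention assumed)
        for dx in range(-R, R):
            r, c = row + dy, col + dx
            if 0 <= r <= H - B and 0 <= c <= W - B:
                cand = ref_frame[r:r + B, c:c + B].astype(float).ravel()
                cand -= cand.mean()
                nrm = np.linalg.norm(cand)
                if nrm > 1e-8:       # skip flat (all-constant) candidates
                    atoms.append(cand / nrm)
    C = dct_matrix(B)
    for u in range(B):
        for v in range(B):
            if (u, v) != (0, 0):     # DC is handled by the coded block mean
                atoms.append(np.outer(C[u], C[v]).ravel())
    return np.stack(atoms, axis=1)   # shape (M, N) with M = B*B


def sparse_code(A, f, eps1):
    """Error-constrained L1 coding of the mean-removed block f (Eq. (1)):
    walk the LARS/lasso regularization path and return the first solution
    whose per-pixel squared error is at most eps1."""
    _, _, coefs = lars_path(A, f, method="lasso")
    for k in range(coefs.shape[1]):
        w = coefs[:, k]
        if np.mean((A @ w - f) ** 2) <= eps1:
            return w
    return coefs[:, -1]              # fall back to the least-regularized point
```

Stopping along the path at the error target mirrors the constrained formulation of Eq. (1) more directly than tuning a fixed L1 penalty; the chosen atoms are then the entries of w whose magnitude exceeds ε₂.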
Fig. 1. Spiral order of 2-D candidate displacements.

After solving the sparse representation problem, we have a set of L chosen candidates, namely those whose coefficients have magnitude larger than a small threshold ε₂ (to avoid numerical error). We use m(l) to denote the index of the l-th chosen atom, and B_l = A_m(l) the actual atom, with l = 1, 2, ..., L.

3. ORTHONORMALIZATION AND QUANTIZATION

A straightforward way to find a set of orthonormalized vectors C_l from the chosen atoms B_l is to apply the well-known Gram-Schmidt orthogonalization algorithm to the chosen atoms sequentially, using

\tilde{C} = B_l - \sum_{i=1}^{l-1} (B_l, C_i)\, C_i, \qquad C_l = \tilde{C} / \|\tilde{C}\|_2 \qquad (2)

where (B, C) denotes the inner product of B and C, and ‖C‖₂ denotes the 2-norm of C. The coefficients corresponding to the orthonormal vectors can be found easily by inner products, i.e., t_k = (F, C_k). In our current implementation, we apply uniform quantization to each coefficient t_k with the same stepsize q. We denote the quantized value by t̂_k.

A problem with the above approach is that the coefficients corresponding to some of the resulting orthonormal vectors may be zero after quantization. Ideally, we want to keep only those vectors (and their corresponding atoms) that have non-zero quantized coefficients. In addition, we would like the resulting orthonormal vectors to have coefficients that are decreasing in magnitude with high likelihood. Towards these goals, we first order the chosen candidates B_l so that their coefficients are decreasing in magnitude, and then perform orthonormalization and quantization jointly. This uses a Gram-Schmidt-like orthogonalization procedure in which vectors (and their corresponding atoms) with zero quantized coefficients are discarded. Specifically, if a newly obtained orthonormalized vector has a coefficient that is quantized to zero, we remove this vector and the original atom used to derive it, move to the next atom, and orthonormalize that atom with respect to all previously derived orthonormal vectors. At the end of this process, we have K (K ≤ L) orthonormalized vectors C(k), which correspond to original atoms with indices n(k), and quantized coefficients t̂(k) with quantization indices t(k). Note that the C(k) are equivalent to the orthonormal vectors obtained from the set of original atoms A(n(k)), k = 1, 2, ..., K, using the original Gram-Schmidt algorithm. Therefore, upon receiving the indices n(k), the decoder can derive C(k) by applying the original Gram-Schmidt algorithm to the atoms with indices n(k).
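The joint orthonormalization and quantization step can be summarized by the sketch below (again an illustrative Python sketch under the naming assumptions above, not the authors' code): the encoder visits the atoms in order of decreasing coefficient magnitude, orthonormalizes each against the vectors kept so far as in Eq. (2), quantizes its coefficient with stepsize q, and drops any atom whose coefficient quantizes to zero; the decoder reruns plain Gram-Schmidt on the surviving atoms and obtains the same orthonormal vectors.

```python
import numpy as np


def gs_quantize(f, atoms, q):
    """Joint Gram-Schmidt orthonormalization and uniform quantization.
    `atoms` is the list of chosen unit-norm atoms B_l, ordered by decreasing
    |coefficient|; `f` is the mean-removed block. Returns the indices (into
    `atoms`) that survive, the quantized coefficients of the orthonormalized
    vectors, and the resulting reconstruction of f."""
    kept, t_hat, C = [], [], []
    recon = np.zeros_like(f, dtype=float)
    for idx, b in enumerate(atoms):
        c = b.astype(float).copy()
        for ci in C:                      # Gram-Schmidt step of Eq. (2)
            c -= np.dot(b, ci) * ci
        nrm = np.linalg.norm(c)
        if nrm < 1e-8:
            continue                      # atom already lies in the kept subspace
        c /= nrm
        t = np.dot(f, c)                  # coefficient along the new direction
        th = q * np.round(t / q)          # uniform quantization with stepsize q
        if th == 0.0:
            continue                      # quantizes to zero: drop this atom
        kept.append(idx)
        t_hat.append(float(th))
        C.append(c)
        recon += th * c
    return kept, t_hat, recon


def decode_block(atoms_kept, t_hat, M):
    """Decoder side: re-derive the same orthonormal vectors from the kept
    atoms with plain Gram-Schmidt and rebuild the mean-removed block."""
    C, rec = [], np.zeros(M)
    for b, th in zip(atoms_kept, t_hat):
        c = b.astype(float).copy()
        for ci in C:
            c -= np.dot(b, ci) * ci
        c /= np.linalg.norm(c)
        C.append(c)
        rec += th * c
    return rec
```

The reordering passes described next would simply re-sort the kept atoms by the magnitude of their quantized coefficients and call gs_quantize again.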

The above algorithm can be iterated several times to further reduce the number of remaining atoms. At the end of each iteration, if the number of chosen atoms is smaller than in the previous iteration, all remaining atoms are reordered based on the magnitudes of the coefficients associated with their corresponding orthonormal vectors, and the same algorithm is applied to this reordered set of atoms. The iteration continues until no more zero coefficients are identified in the last pass. We have found that two passes are sufficient for most image blocks.

Fig. 2. Representation of a sample block (the original block, the reconstructed block with MSE = 2.287, and the best inter-prediction candidate are shown alongside) using different methods. Top: the atoms chosen by the proposed representation, with coefficient magnitudes [24, 8, 6, 2, 2, 2]; Middle: the orthonormalized vectors obtained from the chosen atoms, with coefficient magnitudes [244.7, 71.8, 62.4, 14.5, 11.7, 25.7]; Bottom: the best matching block and the DCT basis images used to represent the prediction error, with coefficient magnitudes [9, 36, 36, 18, 9, 18, 18, 36, 36, 54, 18, 18, 18, 18, 18, 18].

4. ENTROPY CODING FOR CHOSEN ATOM INDICES AND QUANTIZED COEFFICIENTS

For each block, we first code the quantized mean value of the block, then the indices of the chosen atoms in the same order used for producing the final orthonormal vectors, and finally the quantized coefficients corresponding to the orthonormalized vectors. For the block mean, we perform predictive coding: we predict the mean value of the current block from the co-located block in the previous frame and quantize the prediction error. We collect the probability distribution of the quantized mean prediction error from training images and use the entropy of this distribution to estimate the bits needed for coding the mean value. We include a special symbol EOB among the possible symbols to indicate the case that the quantized prediction error is zero and no other non-zero coefficients are needed. This is the case when a block can be represented, up to the target coding distortion, by a constant block with the predicted mean value.

To specify which atoms are chosen, we arrange all atoms in a pre-defined order and code the indices of the chosen atoms successively. Specifically, we put all possible inter-prediction candidates in a 2-D array based on their displacement vectors with respect to the current block position, and then convert them to a 1-D array using a clockwise spiral path starting from the center. For example, the first (n = 1), second (n = 2), and third (n = 3) candidates in the spiral path are those with displacements (0, 0), (-1, 0), and (1, 0), respectively, as illustrated in Fig. 1. We attach all DCT basis vectors at the end, following the well-known zig-zag order. In our current implementation, we do not use intra-prediction candidates, because our experiments show that these candidates are seldom chosen.
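As an illustration of this ordering, the sketch below maps integer displacements to a 1-D index so that candidates with small motion receive small indices, and lists the DCT atoms in a zig-zag scan. The paper's exact clockwise-spiral and zig-zag conventions are only partially specified, so the ring-by-ring traversal and the particular zig-zag used here are stand-in assumptions:

```python
def displacement_order(R):
    """Map integer displacements (dx, dy), |dx|, |dy| <= R, to a 1-D index so
    that displacements closer to (0, 0) get smaller indices (ring-by-ring in
    Chebyshev distance; the within-ring order is an arbitrary stand-in for
    the paper's clockwise spiral)."""
    disps = [(dx, dy) for dx in range(-R, R + 1) for dy in range(-R, R + 1)]
    disps.sort(key=lambda d: (max(abs(d[0]), abs(d[1])), d))
    return {d: i for i, d in enumerate(disps)}


def zigzag_order(B):
    """One common zig-zag scan of a B x B transform block, used here to
    append the DCT atoms (excluding DC) after the inter-prediction
    candidates."""
    pairs = [(u, v) for u in range(B) for v in range(B)]
    pairs.sort(key=lambda uv: (uv[0] + uv[1],
                               uv[0] if (uv[0] + uv[1]) % 2 else uv[1]))
    return [uv for uv in pairs if uv != (0, 0)]
```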
To code the indices of the chosen atoms, our experiments show that there is very little correlation between the positions of the chosen candidates within the same block or across adjacent blocks. However, the probability distribution of the index of the first chosen atom is quite different from that of the second chosen atom, which in turn differs from that of the third, and so on. The first few chosen atoms are more likely to be prediction candidates associated with small motion vectors, whereas the remaining atoms are more randomly distributed. Based on this observation, we code the index of the k-th chosen atom using its own probability distribution, and use the entropy of each distribution to estimate the bit rate needed to code each index. Our experiments have shown that the index distributions for k > 1 are very similar, and they can therefore be coded using the same distribution without noticeable loss in coding efficiency. We include a special EOB symbol among the possible symbols of each distribution to indicate that no more atoms are chosen.

The quantized coefficient values are also coded sequentially, with the k-th coefficient coded using the probability distribution of the k-th coefficient's quantization index. This strategy is motivated by the observation that the distributions of the first few coefficients are somewhat different, whereas the distributions for k > 1 are very similar and can be coded using the same distribution without loss in coding efficiency.

For the results reported in this paper, we estimate the average number of bits for all symbols to be coded using the entropies derived from their corresponding probability distributions. We note that the current scheme does not exploit the redundancy in the possible patterns of successive coefficient values. Using an arithmetic coding scheme to take advantage of such redundancy, similar to the CABAC method used in H.264 and HEVC [10], is likely to improve the coding efficiency.
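The entropy-based rate estimate described above can be formed as in the following sketch (illustrative Python; the stream names and the toy symbol lists are assumptions): each variable type is collected into its own symbol stream, EOB is treated as an ordinary symbol, and the bit count is estimated as the stream length times the empirical entropy.

```python
import numpy as np
from collections import Counter


def empirical_entropy(symbols):
    """Entropy in bits/symbol of the empirical distribution of a symbol list."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def estimate_bits(streams):
    """Estimate total bits as the sum over streams of (#symbols x entropy).
    `streams` maps a stream name to the list of symbols observed in that
    stream over all coded blocks; EOB is simply another symbol."""
    return sum(len(s) * empirical_entropy(s) for s in streams.values() if s)


# Toy example with one stream per variable type, sharing a single stream
# (and hence a single distribution) for all positions k > 1:
streams = {
    "mean_pred_error": [0, 1, 0, -1, "EOB"],
    "index_k1": [0, 1, 0, 2],
    "index_rest": [5, 9, "EOB", 7, "EOB"],
    "coef_k1": [12, -2, 3, 1],
    "coef_rest": [1, -1, "EOB", 2, "EOB"],
}
print("estimated bits:", round(estimate_bits(streams), 2))
```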

5. SIMULATION RESULTS

Although the proposed coding framework can accommodate both inter- and intra-prediction candidates, our preliminary simulation results have shown that intra-prediction candidates are very rarely chosen. Therefore, in all results presented here, only inter-prediction candidates are used, together with the DCT basis vectors. We implemented the proposed method with the following parameters: a block size of 16x16 (M = 256) and an inter-candidate search range of +/-24 with integer shifts only, which leads to 2304 (= 48 x 48) candidates. Together with the 255 DCT basis vectors (excluding DC), this gives a total of N = 2559 atoms. We evaluated the performance with quantization stepsizes of q = 20, 36, and 50. For the results reported, we chose the thresholds ε₁ = 3 and ε₂ = 10⁻⁶. Generally, these thresholds should be chosen to be smaller than the expected mean square quantization error for a given stepsize.

To get some insight into how the algorithm works, we first show intermediate results for a sample block from a sample frame (shown in Fig. 6) of the sequence "trail pink kid" (tk) from [11]. In Fig. 2, we show the atoms chosen by the LARS algorithm to represent this block, with their corresponding coefficients, and the orthonormalized vectors with their corresponding quantized coefficients. For comparison, we also show the best matching candidate and the DCT basis vectors with non-zero quantized coefficients. It is clear that for this sample block the proposed representation is more efficient. The chosen candidates resemble the block very closely. We note that there is generally a high likelihood that the first chosen candidate (the one with the largest coefficient) is the same as the best prediction candidate, as demonstrated in this case.

Fig. 3 compares the distributions of the number of non-zero coefficients needed using the DCT vs. using the proposed method, calculated over all blocks in the same sample frame. The average number of non-zero coefficients in this example is reduced by 32.48%.

It is well known that, for a transform coder, the coding efficiency depends on how fast the coefficient variance drops: a steeper slope leads to higher coding efficiency. Fig. 5(a) shows the variances of the DCT coefficients of the prediction error using the best prediction candidate, where for each block only the DCT coefficients with non-zero values are considered, ordered by decreasing coefficient magnitude. Fig. 5(b) shows the variances of the coefficients w_l corresponding to the chosen atoms, where the atoms are ordered by decreasing coefficient magnitude. Fig. 5(c) shows the variances of the coefficients t_k associated with the orthonormal vectors derived from the chosen atoms. We see that using the adaptive dictionary directly already makes the coefficient variance drop with a steeper slope than using the DCT. Orthonormalization of the chosen atoms improves the steepness further and significantly, which helps to improve the coding efficiency.
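For reference, the DCT side of the Fig. 3 comparison can be sketched as below (illustrative Python; q here is the stepsize used for the DCT route, 18 in Fig. 3, and best_pred is assumed to be the best matching candidate found by full search): count the non-zero quantized 2-D DCT coefficients of the best-candidate residual and compare against the number of atoms kept by gs_quantize above.

```python
import numpy as np
from scipy.fft import dctn


def dct_nonzeros(block, best_pred, q):
    """Count the non-zero quantized 2-D DCT coefficients of the residual
    between a block and its best prediction candidate (the conventional
    prediction-plus-transform route shown in the bottom of Fig. 3)."""
    resid = block.astype(float) - best_pred.astype(float)
    coef = dctn(resid, type=2, norm="ortho")
    return int(np.count_nonzero(np.round(coef / q)))

# Compare, for the same block, against the number of atoms kept by the
# proposed representation, e.g. len(gs_quantize(f, chosen_atoms, q)[0]).
```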
Fig. 3. Distributions of the number of non-zero coefficients. Top: the proposed method (reconstruction PSNR = 42.37, obtained with ε₁ = 3 and q = 20); Bottom: using the DCT on the prediction error (obtained with q = 18).

Fig. 4. PSNR vs. bit rate curves obtained using the three coders (x264 with CAVLC, x264 with CABAC and all partitions, and the proposed codec) for the test sequence.

Finally, we show the coding performance of the proposed coder and two comparison coders for one test sequence consisting of 50 frames at a frame rate of 30 Hz. We coded the first frame as an I-frame using the H.264 coder with a QP of 15 (reconstruction PSNR = 49.19), and coded all remaining frames as P-frames using the different methods.

The rate and PSNR reported are averaged over only the P-frames for all coders. For the proposed coder, we fixed the target sparse representation error at ε₁ = 3 and varied the quantization step size q to obtain different rate points. We determine the probability distribution of each type of variable to be coded (the quantized block-mean prediction error, the index of the k-th chosen candidate in spiral order, and the quantized value of the k-th coefficient) from the occurrence frequencies of the symbols over all coded blocks in all frames, and estimate the average number of bits using the entropies of these distributions. We compare the proposed coder with the following two variants of the H.264 coder: H264CAVLC refers to H.264 using CAVLC for entropy coding, with only 16x16 blocks for both inter- and intra-prediction and for the transform, but with quarter-pel accuracy motion; H264CABAC refers to H.264 using CABAC for entropy coding and all advanced options (including variable block sizes from 4x4 to 16x16). The H.264 results are obtained using the x264 software [12]. Because the current implementation of the proposed coder uses a fixed block size of 16x16 and does not perform arithmetic coding, a relatively fair comparison is with H264CAVLC.

Fig. 4 shows the PSNR vs. bit rate curves obtained by the three methods, and Fig. 6 shows decoded frames for the same sample frame using these three methods at similar bit rates. It is very encouraging that the proposed coder achieves significant gains over H264CAVLC, which like the proposed coder uses fixed block sizes and unconditional entropy coding. Even more encouraging, the proposed coder shows significant gains even over H264CABAC with all options enabled. It is expected that with variable block sizes and a more efficient entropy coding method, the proposed coder could achieve even more significant gains over H.264.

6. CONCLUSION AND OPEN RESEARCH

The superior performance of the proposed coder compared to H.264, even when many components are not optimized, is very encouraging and testifies to the great promise of using a self-adaptive dictionary for video block representation. When using a redundant dictionary, it is critical to design an efficient quantization and coding method to describe the resulting sparse representation. In addition to the use of the self-adaptive dictionary, the proposed joint orthonormalization and quantization process also contributes greatly to the efficiency of the proposed coder. Note that this method of orthonormalizing the original dictionary atoms and performing quantization and coding in the orthonormalized subspace representation is applicable to any dictionary-based coding method. Although this work did not consider intra-frame coding, a similar idea applies there, with the adaptive part of the dictionary consisting of shifted blocks in previously coded areas of the same frame as well as the intra-prediction candidates generated by the various intra-prediction modes of H.264 and HEVC.

Under the proposed general framework, many components can be further optimized. The current entropy coding scheme does not exploit the redundancy in the possible patterns of successive coefficient values and atom indices.
Using an arithmetic coding scheme to take advantage of such redundancy, similar to the CABAC method in H.264 and HEVC, is likely to further improve the coding efficiency. The current coder also uses a fixed block size; however, it is relatively straightforward to apply it with variable block sizes and to choose the block size using a rate-distortion optimization approach, which is expected to provide additional significant gain.

Fig. 5. Coefficient variances in decreasing order. Top: using the DCT on the prediction error; Middle: the variances of the coefficients associated with the original chosen atoms; Bottom: the variances of the coefficients associated with the orthonormal vectors.

Another open question is, given the target reconstruction error or target bit rate, how to choose the sparse representation error threshold and the quantization step size. This may be formulated as a rate-distortion optimized parameter selection problem. Finally, how to design the fixed part of the dictionary is another interesting and challenging research problem.

Fig. 6. Sample coded frames using the three comparison methods, from top to bottom: (a) the proposed coder (PSNR = 42.88), (b) x264 with 16x16 partitions and CAVLC (PSNR = 41.9), (c) x264 with all advanced options (PSNR = 42.58); all three are coded at similar bit rates.

7. REFERENCES

[1] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. on Signal Processing, vol. 54, no. 11, 2006.
[2] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the high efficiency video coding (HEVC) standard," IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, 2012.
[3] K. Skretting and K. Engan, "Image compression using learned dictionaries by RLS-DLA and compared with K-SVD," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011.
[4] J. Zepeda, C. Guillemot, and E. Kijak, "Image compression using sparse representations and the iteration-tuned and aligned dictionary," IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 5, 2011.
[5] P. Schmid-Saugeon and A. Zakhor, "Dictionary design for matching pursuit and application to motion-compensated video coding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 14, no. 6, 2004.
[6] J.-W. Kang, C.-C. J. Kuo, R. Cohen, and A. Vetro, "Efficient dictionary based video coding with reduced side information," in Proc. IEEE International Symposium on Circuits and Systems (ISCAS), 2011.
[7] Y. Sun, M. Xu, X. Tao, and J. Lu, "Online dictionary learning based intra-frame video coding via sparse representation," in Proc. International Symposium on Wireless Personal Multimedia Communications (WPMC), 2012.
[8] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," Annals of Statistics, vol. 32, 2004.
[9] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online learning for matrix factorization and sparse coding," Journal of Machine Learning Research, vol. 11, pp. 19-60, 2010.
[10] D. Marpe, H. Schwarz, and T. Wiegand, "Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.
[11] A. K. Moorthy, L. K. Choi, A. C. Bovik, and G. de Veciana, "Video quality assessment on mobile devices: Subjective, behavioral and objective studies," IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 6, October 2012.
[12] L. Aimar et al., "x264 open-source video encoder."


More information

Context-adaptive coded block pattern coding for H.264/AVC

Context-adaptive coded block pattern coding for H.264/AVC Context-adaptive coded block pattern coding for H.264/AVC Yangsoo Kim a), Sungjei Kim, Jinwoo Jeong, and Yoonsik Choe b) Department of Electrical and Electronic Engineering, Yonsei University 134, Sinchon-dong,

More information

Residual Correlation Regularization Based Image Denoising

Residual Correlation Regularization Based Image Denoising IEEE SIGNAL PROCESSING LETTERS, VOL. XX, NO. XX, MONTH 20 1 Residual Correlation Regularization Based Image Denoising Gulsher Baloch, Huseyin Ozkaramanli, and Runyi Yu Abstract Patch based denoising algorithms

More information

Wyner-Ziv Coding of Video with Unsupervised Motion Vector Learning

Wyner-Ziv Coding of Video with Unsupervised Motion Vector Learning Wyner-Ziv Coding of Video with Unsupervised Motion Vector Learning David Varodayan, David Chen, Markus Flierl and Bernd Girod Max Planck Center for Visual Computing and Communication Stanford University,

More information

Parcimonie en apprentissage statistique

Parcimonie en apprentissage statistique Parcimonie en apprentissage statistique Guillaume Obozinski Ecole des Ponts - ParisTech Journée Parcimonie Fédération Charles Hermite, 23 Juin 2014 Parcimonie en apprentissage 1/44 Classical supervised

More information

Linear Methods for Regression. Lijun Zhang

Linear Methods for Regression. Lijun Zhang Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived

More information

arxiv: v1 [cs.mm] 16 Feb 2016

arxiv: v1 [cs.mm] 16 Feb 2016 Perceptual Vector Quantization for Video Coding Jean-Marc Valin and Timothy B. Terriberry Mozilla, Mountain View, USA Xiph.Org Foundation arxiv:1602.05209v1 [cs.mm] 16 Feb 2016 ABSTRACT This paper applies

More information

AN ENHANCED EARLY DETECTION METHOD FOR ALL ZERO BLOCK IN H.264

AN ENHANCED EARLY DETECTION METHOD FOR ALL ZERO BLOCK IN H.264 st January 0. Vol. 7 No. 005-0 JATIT & LLS. All rights reserved. ISSN: 99-865 www.jatit.org E-ISSN: 87-95 AN ENHANCED EARLY DETECTION METHOD FOR ALL ZERO BLOCK IN H.6 CONG-DAO HAN School of Electrical

More information

Original citation: Prangnell, Lee, Sanchez Silva, Victor and Vanam, Rahul (05) Adaptive quantization by soft thresholding in HEVC. In: IEEE Picture Coding Symposium, Queensland, Australia, 3 May 03 Jun

More information

Information and Entropy

Information and Entropy Information and Entropy Shannon s Separation Principle Source Coding Principles Entropy Variable Length Codes Huffman Codes Joint Sources Arithmetic Codes Adaptive Codes Thomas Wiegand: Digital Image Communication

More information

Analysis of Rate-distortion Functions and Congestion Control in Scalable Internet Video Streaming

Analysis of Rate-distortion Functions and Congestion Control in Scalable Internet Video Streaming Analysis of Rate-distortion Functions and Congestion Control in Scalable Internet Video Streaming Min Dai Electrical Engineering, Texas A&M University Dmitri Loguinov Computer Science, Texas A&M University

More information

Order Adaptive Golomb Rice Coding for High Variability Sources

Order Adaptive Golomb Rice Coding for High Variability Sources Order Adaptive Golomb Rice Coding for High Variability Sources Adriana Vasilache Nokia Technologies, Tampere, Finland Email: adriana.vasilache@nokia.com Abstract This paper presents a new perspective on

More information

The MPEG4/AVC standard: description and basic tasks splitting

The MPEG4/AVC standard: description and basic tasks splitting The MPEG/AVC standard: description and basic tasks splitting Isabelle Hurbain 1 Centre de recherche en informatique École des Mines de Paris hurbain@cri.ensmp.fr January 7, 00 1 35, rue Saint-Honoré, 77305

More information

Rate-Distortion Based Temporal Filtering for. Video Compression. Beckman Institute, 405 N. Mathews Ave., Urbana, IL 61801

Rate-Distortion Based Temporal Filtering for. Video Compression. Beckman Institute, 405 N. Mathews Ave., Urbana, IL 61801 Rate-Distortion Based Temporal Filtering for Video Compression Onur G. Guleryuz?, Michael T. Orchard y? University of Illinois at Urbana-Champaign Beckman Institute, 45 N. Mathews Ave., Urbana, IL 68 y

More information

Rate-distortion Analysis and Control in DCT-based Scalable Video Coding. Xie Jun

Rate-distortion Analysis and Control in DCT-based Scalable Video Coding. Xie Jun Rate-distortion Analysis and Control in DCT-based Scalable Video Coding Xie Jun School of Computer Engineering A thesis submitted to the Nanyang Technological University in fulfillment of the requirement

More information

Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation

Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation James E. Fowler Department of Electrical and Computer Engineering GeoResources Institute GRI Mississippi State University, Starville,

More information