Rate-Distortion Based Temporal Filtering for Video Compression. Beckman Institute, 405 N. Mathews Ave., Urbana, IL 61801


Onur G. Guleryuz*, Michael T. Orchard†

* University of Illinois at Urbana-Champaign, Beckman Institute, 405 N. Mathews Ave., Urbana, IL 61801
† Department of Electrical and Computer Engineering, Princeton University, Engineering Quadrangle, Princeton, NJ 08544

January 1996

Abstract

We consider the temporal DPCM loop at the heart of most modern high-performance video coders. Targeting low-bitrate, low-complexity video applications, we show that DPCM is inefficient in this region. The DPCM codec is analyzed in the low bitrate region and rate-distortion optimal modifications are proposed that do not violate the low complexity requirement. The proposed modifications involve negligible added complexity at the encoder and no added complexity at the decoder, and are thus compatible with standard coders and bit streams.

1 Introduction

Modern video coders are based on a Differential Pulse Code Modulation (DPCM) loop (Figure 1). In order to take advantage of the temporal dependencies inherent in a video sequence, the frame to be coded is predicted from previously decoded frames using a motion compensating predictor. The resulting error frames are quantized and coded for transmission using a coder that removes the spatial redundancies. The use of previously decoded frames, rather than the originals, at the encoder enables the decoder to produce the same prediction as the encoder. Codecs that do not use DPCM (intraframe, or I-frame, codecs) are forced to transmit large amounts of data, which is not acceptable for most applications. Noting that the primary function of DPCM is to reduce error frame variance using temporal dependencies, it is easy to see that the DPCM loop is the most crucial part of modern video coders. The success of these coders has opened the way to very low bitrate applications such as videophones, video conferencing, and multimedia.
In these applications it is also very important to keep the encoder-decoder pair at low complexity. While much attention has been given to fine-tuning various parameters of the DPCM codec to the low-complexity, low-bitrate region, the efficiency of the DPCM loop at these bitrates has not been questioned. For example, for a Gaussian Auto Regressive (AR) source, the DPCM loop asymptotically performs within 0.25 bit of the theoretical

rate-distortion function, but at low bitrates the difference between DPCM and the rate-distortion function increases, especially for high correlation values (Figure 2). In this paper we analyze this low bitrate region. Modelling video DPCM as a temporal AR predictor (with an AR coefficient of unity), we propose a simple modification to the DPCM loop using results derived from Gaussian autoregressive processes. The proposed modification, while optimizing the rate-distortion performance of the DPCM codec, involves no added complexity at the decoder and negligible added complexity at the encoder. Specifically, we consider a temporal prefilter at the input and a quantization error spectrum shaping filter, both of which are tuned to a given target distortion to minimize rate. In Section 2 the DPCM codec is analyzed for Gaussian AR processes to arrive at our main results in Section 2.2. Section 3 discusses the application of these results to video sequence coding. Brief examples are provided in Section 4, followed by concluding remarks in Section 5.

Figure 1: DPCM in Video Coding. (Block diagram: a DPCM encoder and a DPCM decoder, each containing a motion compensating predictor that operates on previously decoded frames. The frames to be coded minus the prediction give the error frames, which pass through the error frame coder and decoder; motion information is transmitted alongside.)

Figure 2: DPCM Rate-Distortion Performance. (Plot for a first order Gaussian AR process $y_n = a y_{n-1} + z_n$ with $a = 0.9$: rate in bits versus $D/\sigma_Y^2$ on a log scale, showing the DPCM curve, the theoretical $R(D)$, and the increasing gap between DPCM and the theoretical $R(D)$ at low rates.)
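The theoretical $R(D)$ curve referred to above can be evaluated numerically for a first order Gaussian AR source. The following is a minimal sketch, assuming the standard reverse water-filling parametric form for a stationary Gaussian source, with the frequency integrals approximated by averages over a uniform grid. The function names, the grid size, and the choice of bits (log base 2) as the rate unit are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def ar1_spectrum(omega, a, sigma_z2=1.0):
    """Power spectrum of y_n = a*y_{n-1} + z_n, with innovation variance sigma_z2."""
    return sigma_z2 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2


def rate_distortion_point(theta, a, sigma_z2=1.0, n_grid=1 << 14):
    """Return (D, R) for water level theta; D in variance units, R in bits per sample.

    D = (1/2pi) * integral of min(theta, Phi(w)) dw
    R = (1/4pi) * integral of max(0, log2(Phi(w)/theta)) dw
    """
    # Midpoint grid on [-pi, pi); the mean over the grid approximates (1/2pi) * integral.
    omega = -np.pi + (np.arange(n_grid) + 0.5) * (2.0 * np.pi / n_grid)
    phi = ar1_spectrum(omega, a, sigma_z2)
    distortion = np.mean(np.minimum(theta, phi))
    rate = 0.5 * np.mean(np.maximum(0.0, np.log2(phi / theta)))
    return distortion, rate


if __name__ == "__main__":
    a = 0.9                      # assumed AR coefficient for the illustration
    sigma_y2 = 1.0 / (1.0 - a ** 2)
    for theta in (0.05, 0.2, 1.0, 5.0, 20.0):
        d, r = rate_distortion_point(theta, a)
        print(f"theta={theta:6.2f}  D/sigma_Y^2={d / sigma_y2:.4f}  R={r:.4f} bits")
```

Sweeping `theta` from small to large values traces the curve from the high-rate region, where the water level lies below the whole spectrum and $D = \theta$, to the point where the rate reaches zero and $D$ equals the source variance.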

2 Analysis of DPCM

In this section we analyze the rate-distortion performance of the DPCM codec for Gaussian Auto Regressive processes. For simplicity we consider a first order Gaussian AR process; however, we note that our results can be generalized to higher orders in a straightforward manner. Thus let $y_n = a y_{n-1} + z_n$, $(n = \ldots, -1, 0, 1, \ldots)$ denote the process, where $z_n$ are the independent innovations generating the process and $a$ is the AR coefficient with $|a| < 1$. Let $N(0, 1)$ be the Gaussian probability density function (pdf) with zero mean and unit variance; then we have $z_n \sim N(0, \sigma_z^2)$ and $y_n \sim N(0, \sigma_z^2/(1 - a^2))$. The theoretical rate-distortion function of a Gaussian AR process is easily obtained from its spectrum via the reverse water filling paradigm []. Let

$$\Phi_Y(\omega) = \frac{\sigma_z^2}{|1 - a e^{-j\omega}|^2}$$

denote the spectrum of $y_n$; then with mean squared error being the fidelity criterion we have (Figure 3):

Result 1 (Theorem 4.5.3 in [1]) The mean squared error rate-distortion function of $y_n$ has the parametric representation

$$D_\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} \min[\theta, \Phi_Y(\omega)]\,d\omega \quad \text{(distortion)} \qquad (1)$$

$$R(D_\theta) = \frac{1}{4\pi}\int_{-\pi}^{\pi} \max\left[0, \log\frac{\Phi_Y(\omega)}{\theta}\right]d\omega \quad \text{(rate)} \qquad (2)$$

Figure 3: Theoretical Rate-Distortion on Signal Spectrum. (Three panels: (a) $\Phi_Y(\omega)$ for $a = 0.9$ with the water level $\theta$, (b) implied transmitted signal and distortion spectrum, (c) spectrum of resampled signal; efficient and inefficient regions are marked.)

2.1 DPCM

Although the previous result advocates a transform based coder [1, 2], DPCM has received a lot of attention thanks to its simplicity and excellent performance at high rates []. As the rate is decreased, however, the difference between DPCM and the theoretical rate-distortion function increases, especially for values of the AR coefficient close to 1 (Figure 2). This becomes particularly pronounced for high distortion values

where $\omega_\theta$ in Figure 3 becomes smaller than $\pi$. Notice that for such distortion values the theoretical rate-distortion function implies a transmitted signal and distortion spectrum as shown in Figure 3 (b). [3] considers a scenario in which the $y_n$ are the sampled values of a time continuous Gaussian process. Then, changing the sampling rate (followed by tuning ideal antialiasing filters) demonstrates better rate-distortion performance for a PCM system coding the resampled process, with matching modifications at the decoder (Figure 3 (c)). In the present paper we assume that such a resampling scheme is not possible. Moreover, we assume that the DPCM decoder is fixed due to complexity reasons (see Section 3).

Figure 4: Standard and Proposed DPCM Coders. (Block diagrams: (a) DPCM Encoder, (b) DPCM Decoder, (c) Equivalent innovations encoder, (d) Proposed Encoder. The uniform quantizer output is the input plus quantization noise $q_n$; in (d) the prefilter $L(\omega)$ acts at the input and the quantization noise shaping filter $C(\omega)$ feeds $q_n$ back to form the quantizer input $p_n$.)

Figure 4 (a-c) display the standard DPCM encoder and the equivalent DPCM encoder operating on the innovations process, where we assume the quantization is uniform. Under the restrictions stated in the above paragraph we propose two modifications (Figure 4 (d)):
- a prefilter $L(\omega)$ at the input,
- a quantization noise shaping filter $C(\omega)$ at the quantization loop.

We now derive these filters to optimize the rate-distortion performance of the modified codec. In the rest of this section we refer to the notation of Figure 4 (d).

2.2 Main Derivation

Before proceeding with the derivation let us point out some of the concepts complicating the analysis:
- The uniform quantizer in Figure 4 does not insert Gaussian quantization noise as desired by rate-distortion theory [4, 5].

- Because of the feedback structure of the quantizer, the quantization noise is not necessarily white.
- The pdf of $p_n$ is not known exactly for the above reasons; this, together with high quantization distortion, makes simple integral approximations to rate [6] useless.

The pdf of $p_n$ has been analyzed for standard DPCM systems (i.e. when $C(\omega) = a e^{-j\omega}$) using orthogonal polynomial expansions [4, 5]. Fortunately, it can be shown that this density is closely approximated by a Gaussian pdf (see e.g. [4]). With this approximation in hand we also make the following assumptions:
- We assume that the quantization noise is approximately white and the quantization distortion is given by the ubiquitous $\sigma_q^2 = \Delta^2/12$, where $\Delta$ is the quantizer step size.
- We also assume that the error process caused by the prefilter, $f_n = z_n - p_n$, and the quantization error process are decorrelated.

With these approximations we are now ready to derive the filters. Using the fact that the rate for a Gaussian random variable $p$ undergoing uniform quantization is governed by the ratio $\sigma_p^2/\sigma_q^2$, we can formulate our rate-distortion optimization function as:

$$J = \frac{\sigma_p^2}{\sigma_q^2} + \lambda (D - D_T) \qquad (3)$$

where $D_T$ is the target distortion and $\lambda$ is a positive Lagrange multiplier chosen such that the overall distortion (prefilter distortion + quantization distortion) $D = D_T$. Using our approximations we can write

$$J\sigma_q^2 + \mu D_T = J_L + J_C = \frac{1}{2\pi}\int_{-\pi}^{\pi} |L(\omega)|^2 \sigma_z^2\,d\omega + \frac{1}{2\pi}\int_{-\pi}^{\pi} |C(\omega)|^2 \sigma_q^2\,d\omega + \frac{\mu}{2\pi}\int_{-\pi}^{\pi} |1 - L(\omega)|^2 |G(\omega)|^2 \sigma_z^2\,d\omega + \frac{\mu}{2\pi}\int_{-\pi}^{\pi} |1 - C(\omega)|^2 |G(\omega)|^2 \sigma_q^2\,d\omega \qquad (4)$$

where $G(\omega) = \frac{1}{1 - a e^{-j\omega}}$, $\mu = \lambda\sigma_q^2$, and $L(\omega)$, $C(\omega)$ are real impulse response filters chosen to minimize $J$.

Optimization of the prefilter $L(\omega)$: Using

$$J_L = \frac{1}{2\pi}\int_{-\pi}^{\pi} |L(\omega)|^2 \sigma_z^2\,d\omega + \frac{\mu}{2\pi}\int_{-\pi}^{\pi} |1 - L(\omega)|^2 |G(\omega)|^2 \sigma_z^2\,d\omega,$$

let us first make the following observations:
- For fixed $|L(\omega)|$ ($|L(-\omega)|$) the value of $L(\omega)$ ($L(-\omega)$) that minimizes $J_L$ is real.
- For fixed $|1 - L(\omega)|$ the value of $L(\omega)$ that minimizes $J_L$ is $L(\omega) \le 1$.

Thus, we conclude that

Result 2 For optimality $L(\omega)$ has to be real and even with $L(\omega) \le 1$.

Writing

$$J_L = \frac{\sigma_z^2}{2\pi}\int_{-\pi}^{\pi} \left| L(\omega) - \frac{\mu |G(\omega)|^2}{1 + \mu |G(\omega)|^2} \right|^2 \left(1 + \mu |G(\omega)|^2\right) d\omega + (\text{terms independent of } L(\omega)),$$

with $\mu = \lambda\sigma_q^2$ as above, we see that the optimal

$$L(\omega) = \frac{\mu |G(\omega)|^2}{1 + \mu |G(\omega)|^2}.$$

Having poles both inside and outside of the unit circle, this form of $L(\omega)$ is not implementable. Moreover, with video coding as our intended application, approximating this form of $L(\omega)$ with an FIR filter having a large number of taps is out of the question. Instead we propose a manageable three tap FIR filter $L(\omega) = \beta e^{-j\omega} + \alpha + \beta e^{j\omega}$, consistent with Result 2.

Optimization of the quantization error shaping filter $C(\omega)$: The main constraint involved in the selection of $C(\omega)$ is that it has to be a causal filter, using only the past values of the quantization error process. Optimizing $J_C$ under this constraint yields:

Result 3 For optimality

$$C(\omega) = \frac{(a - \gamma) e^{-j\omega}}{1 - \gamma e^{-j\omega}},$$

with parameter $\gamma$, for the form of the quantization error shaping filter.

With the above determined forms of the filters let us now optimize the parameters $\alpha, \beta, \gamma$ for a given target distortion $D_T$. Let

$$D_L = \frac{1}{2\pi}\int_{-\pi}^{\pi} |1 - L(\omega)|^2 |G(\omega)|^2 \sigma_z^2\,d\omega$$

denote the prefilter distortion. Then the quantization distortion

$$D_C = \frac{1}{2\pi}\int_{-\pi}^{\pi} |1 - C(\omega)|^2 |G(\omega)|^2 \sigma_q^2\,d\omega$$

is given by $\sigma_q^2/(1 - \gamma^2)$, and

$$D_T = D_L + \frac{\sigma_q^2}{1 - \gamma^2}$$

$$\sigma_q^2 = (D_T - D_L)(1 - \gamma^2)$$

$$J = \frac{(\alpha^2 + 2\beta^2)\,\sigma_z^2}{(D_T - D_L)(1 - \gamma^2)} + \frac{(a - \gamma)^2}{1 - \gamma^2}$$

where $J$ is optimized with respect to $\alpha, \beta, \gamma$ subject to $D_L \le D_T$. Figure 6 and Table 1 display optimization results. Notice that the optimization proceeds by picking a possibly different filter combination for each target distortion. Thus the overall rate-distortion curve resides on the convex hull of the chosen filter rate-distortion curves.

3 Video Coding

In this section we apply the previous results to the coding of video sequences. Figure 1 displays the standard DPCM based video codec. At this point, let us take a more detailed look at the "motion compensating predictor" employed in these systems.
The primary function of this block consists of two parts:
- A search algorithm with which pixels in the present frame are matched to pixels in the previously decoded frames (motion vector detection),

- A simple prediction algorithm where from each pixel in the present frame, the matched pixel from the previously decoded frames is subtracted (motion compensation).

Notice that the second step is a simple DPCM predictor with an AR coefficient of unity. Thus we can apply our results directly to video sequences provided that the filtering is done "along motion trajectories" as temporal filtering. However, with a prefilter consistent with Result 2 one needs to know the future as well as current motion vectors, and filter along current and future motion trajectories. Figure 7 depicts the situation for the "Block Matching" algorithm.

Table 1: Optimized Filter Parameters
Distortion $D/\sigma_Y^2$ :9 :89 :98 :9 :399 :453 :6985 :85 :779 :3 :5435 :78 :59 :359 :4596 :565 :539 :439 :47 :8

Figure 5: Progression of Optimal Filters with Increasing Distortion (solid: $|L(\omega)|$, dashed: $|C(\omega)|$, $a = 0.9$). (Five panels, one per target distortion.)

A second issue that deserves attention is that, with noncausal temporal filters necessitating future motion vectors, one needs to contend with motion vectors obtained by matching original blocks instead of matching the original block to a previously decoded block as is done in standard systems. In practice, however, the difference between the two methods is insignificant and can be totally eliminated by doing a very local final search (after filtering and prior to coding) without raising complexity appreciably.

We note that because some part of the target distortion is allocated to the prefilter, quantization noise is reduced compared to standard systems, resulting in frames with less noticeable quantization artifacts. Note also that codecs based on the proposed system will improve the performance the most when they are allowed to operate

DPCM-like, i.e., when they are allowed to transmit fewer I frames than standard coders. Finally, since the proposed modifications are at the encoder, codecs based on the proposed system can target standard decoders, using standard bit streams and compression formats.

Figure 6: Rate-Distortion Performance of Optimized System ($a = 0.9$). (Plot of rate in bits versus $D/\sigma_Y^2$ showing DPCM, the optimized system, and the theoretical $R(D)$; the optimized rate-distortion curve resides on the convex hull of the optimized filters.)

4 Simulation Results

This section presents the simulation results for the standard and optimized DPCM systems. We emphasize, though, that the results presented in this section are meant as preliminary examples and should not be taken as the best performance of the proposed system, since no attempt has been made to optimize the filters to video. Rather, the following results are for a fixed prefilter ($\alpha$ = :78, $\beta$ = :643) and with the quantization shaping filter $C(\omega)$ set to its standard value (i.e. $\gamma = 0$). It is clear from our framework that these filters should be optimized for each target distortion. Moreover, this optimization can be carried out in the DCT domain, where different DCT coefficients have different variances and hence should be filtered differently. Notice that even in the most basic configuration (fixed unoptimized prefilter, no quantization spectrum shaping filter) one can observe improvement over the standard system. For the following, both the standard and optimized systems use an MPEG compatible coder with integer block matching (30 frames per second with no B frames allowed).

5 Conclusion

DPCM forms the main part of most high performance video coders. With newly emerging low bitrate video applications, the performance of video coders in the low-complexity, low-bitrate region has gained a lot of importance. Pointing out the inefficiency of the DPCM coder at low bitrates, we analyzed and optimized the DPCM

system in this region. As a result, rate-distortion optimized DPCM systems with negligible added complexity at the encoder and no added complexity at the decoder were designed. Thus codecs based on the proposed system can utilize standard formats and bit streams. Applying our results directly to video sequences, we demonstrated the better performance of the optimized system. Finally, we addressed the implementation issues related to the optimized system. Optimization for video coding, as well as issues related to the analysis of matching modifications at the decoder, will be taken up in a future paper.

Figure 7: Motion Trajectories. (Filtering along motion trajectories for block matching, shown across the past, present and future frames with the motion vectors and the trajectory for one patch. Resolving implementation issues for a block in the present frame: pixels used by one pixel in the future frame take the future motion trajectory given by that future pixel; pixels used by more than one pixel in the future frame average the future pixels to obtain the future motion trajectory; pixels left unused by the future frame repeat the current pixel to obtain the future motion trajectory.)

Table 2: Results for Standard DPCM
Sequence: Claire, CIF, 35 4
Rate (bits per second)   Average MSE (per pixel)   PSNR      I FRAMES
5574                     6:394                     35:984    7
68                       9:6673                    35:934    5
43696                    :77                       34:5683   3

References

[1] T. Berger, "Rate Distortion Theory", Englewood Cliffs, NJ: Prentice Hall, 1971.
[2] L. D. Davisson, "Rate-Distortion Theory and Application", Proceedings of the IEEE, Vol. 60, no. 7, pp. 800-808, July 1972.

[3] W. C. Kellogg, "Information Rates in Sampling and Quantization", IEEE Trans. on Information Theory, Vol. IT-13, no. 3, pp. 506-511, July 1967.
[4] D. S. Arnstein, "Quantization Error in Predictive Coders", IEEE Trans. on Communications, Vol. COM-23, no. 4, pp. 423-429, April 1975.
[5] N. Farvardin, J. W. Modestino, "Rate-Distortion Performance of DPCM Schemes for Autoregressive Sources", IEEE Trans. on Information Theory, Vol. IT-31, no. 3, pp. 402-418, May 1985.
[6] V. R. Algazi, J. T. DeWitte, Jr., "Theoretical Performance of Entropy-Encoded DPCM", IEEE Trans. on Communications, Vol. COM-30, no. 5, pp. 1088-1095, May 1982.

Table 3: Results for Optimized DPCM
Sequence: Claire, CIF, 35 4
Rate (bits per second)   Average MSE (per pixel)   PSNR      I FRAMES
76                       5:895                     36:46     7
679                      9:643                     35:359    5
4398                     :56                       34:6563   3

Figure 8: Results. (Claire sequence, solid: prefiltered DPCM, dashed: standard DPCM. Three panels of PSNR versus frame number, one for each I-frame interval.)
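As a numerical companion to the parameter optimization of Section 2.2, the sketch below evaluates the rate proxy $J = \sigma_p^2/\sigma_q^2$ for a three tap prefilter and a one-parameter noise shaping filter, then searches a coarse grid. It is an illustration under stated assumptions rather than the authors' procedure: the names `alpha` (center tap), `beta` (side taps) and `gamma` (shaping parameter), the restriction $|\gamma| < 1$, the grid ranges, and the numerical integration are all choices made here.

```python
import numpy as np


def prefilter_distortion(alpha, beta, a, sigma_z2=1.0, n_grid=4096):
    """D_L = (1/2pi) * integral of |1 - L(w)|^2 |G(w)|^2 sigma_z^2 dw.

    L(w) = alpha + 2*beta*cos(w) is the three tap filter (assumed tap naming),
    G(w) = 1 / (1 - a*exp(-jw)) is the DPCM decoder response.
    """
    omega = -np.pi + (np.arange(n_grid) + 0.5) * (2.0 * np.pi / n_grid)
    l_resp = alpha + 2.0 * beta * np.cos(omega)
    g2 = 1.0 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2
    return sigma_z2 * np.mean((1.0 - l_resp) ** 2 * g2)


def objective(alpha, beta, gamma, a, d_target, sigma_z2=1.0):
    """Rate proxy J for a target distortion; returns inf if the point is infeasible."""
    d_l = prefilter_distortion(alpha, beta, a, sigma_z2)
    if d_l >= d_target or abs(gamma) >= 1.0:
        return np.inf
    # The remaining distortion budget fixes the quantization noise variance.
    sigma_q2 = (d_target - d_l) * (1.0 - gamma ** 2)
    prefilter_term = (alpha ** 2 + 2.0 * beta ** 2) * sigma_z2 / sigma_q2
    shaping_term = (a - gamma) ** 2 / (1.0 - gamma ** 2)
    return prefilter_term + shaping_term


def grid_search(a, d_target, sigma_z2=1.0):
    """Coarse exhaustive search; returns (J, alpha, beta, gamma) of the best grid point."""
    alphas = np.arange(0, 21) / 20.0    # 0.00 ... 1.00
    betas = np.arange(0, 11) / 20.0     # 0.00 ... 0.50
    gammas = np.arange(-9, 10) / 10.0   # -0.9 ... 0.9
    best = (np.inf, None, None, None)
    for alpha in alphas:
        for beta in betas:
            for gamma in gammas:
                j = objective(alpha, beta, gamma, a, d_target, sigma_z2)
                if j < best[0]:
                    best = (j, alpha, beta, gamma)
    return best


if __name__ == "__main__":
    a, d_target = 0.9, 1.0  # illustrative values only
    print("standard DPCM J:", objective(1.0, 0.0, 0.0, a, d_target))
    print("best grid point (J, alpha, beta, gamma):", grid_search(a, d_target))
```

Setting `alpha = 1`, `beta = 0`, `gamma = 0` recovers standard DPCM, for which the expression reduces to $\sigma_z^2/D_T + a^2$. Because that point lies on the grid, the search result can never be worse than the standard system under this proxy. A finer grid or a continuous optimizer could replace the exhaustive loop.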