On the DPCM Compression of Gaussian Auto-Regressive Sequences


Onur G. Guleryuz, Michael T. Orchard
Department of Electrical Engineering, Polytechnic University, Brooklyn, NY 11201
Department of Electrical and Computer Engineering, Princeton University, Engineering Quadrangle, Princeton, NJ

July 14, 2000

Abstract

Differential Pulse-Coded Modulation (DPCM) encoding of Gaussian Auto-Regressive sequences is considered. It is pointed out that DPCM is rate-distortion inefficient at low bit rates. Simple filtering modifications are proposed and incorporated into DPCM. A rate-distortion optimization framework that results in optimal filters is presented. It is shown that the designed filters take advantage of less significant process spectral components in order to achieve superior rate-distortion performance. Design equations are derived, and issues related to optimization and complexity are addressed. It is shown that simple DPCM systems with the proposed modifications significantly outperform their standard counterparts.

Keywords: DPCM, Rate-Distortion, Prefilter, Postfilter

List of Figures

1. Standard and proposed DPCM coders for a first order Gaussian AR process y_n = a y_{n−1} + z_n. (a) DPCM encoder, (b) DPCM decoder, (c) Equivalent innovations encoder, (d) Equivalent decoder, (e) Proposed encoder, (f) Proposed decoder.
2. DPCM rate-distortion performance.
3. Theoretic rate-distortion on the signal spectrum. (a) Low distortion region where all spectral components of the process are allocated bit rate. (b) High distortion region where only the low frequency spectral components are allocated bit rate.
4. Rate-distortion trade-off for DPCM. Spectral components at ω₁ and ω₂ contribute equally to the total rate, but the contribution to the reduction of distortion is determined by |G(ω)|², which weighs lower frequencies more heavily.
5. Optimality of T(ω) = L(ω)P(ω). (a) For fixed |T(ω)|, the optimal T(ω) is real with T(ω) ≥ 0. (b) For fixed |1 − T(ω)|, the optimal T(ω) is real with T(ω) ≤ 1.
6. Progression of L(ω), P(ω), C(ω), 1 − C(ω) with increasing distortion, a = 0.9.
7. Rate-distortion performance of the optimized system (a = 0.9).
8. (a) Mismatch between targeted and actual values (observed via simulation) for the optimized system. (b) Agreement between targeted and actual values (observed via simulation) for the optimized system with dither.
9. Rate-distortion performance of the optimized system with dither (a = 0.9).
10. Rate-distortion performance of the optimized three-tap prefilter (a = 0.9).
11. Rate-distortion performance of optimized systems (a = 0.9).
12. Rate-distortion performance of optimized systems (a = 0.8).

1 Introduction

Differential pulse-coded modulation (DPCM) is a well known predictive compression method. The sample value to be coded is predicted from previously coded sample values, and the error of this prediction is quantized and transmitted to the decoder, where the inverse operation takes place. Figures 1(a)-(b) illustrate the DPCM codec¹ operating on a first order Gaussian Auto-Regressive sequence. Thanks to its excellent performance at high bit rates [2], DPCM is an effective and simple method employed in widely varying scenarios including audio and video compression [3, 4]. In newly emerging low bit rate applications, however, the high bit rate efficiency of DPCM becomes less important and one must question its low bit rate rate-distortion performance. For example, for a Gaussian Auto-Regressive (AR) source, the DPCM loop asymptotically performs within 0.25 bit of the theoretic rate-distortion function [2], but at low bit rates the difference between DPCM and the theoretic rate-distortion function increases², especially for highly correlated sources (Figure 2). Thus, as the bit rate is lowered the high bit rate efficiency of DPCM starts to decrease, casting doubt on the usefulness of DPCM for low complexity, low bit rate applications.

In this paper, we analyze the low bit rate region and present simple modifications to DPCM that significantly improve its rate-distortion performance on Gaussian AR sequences. Unlike previous work (see, e.g., [8, 9]) that concentrated on tuning the prediction operation to account for quantization, or on designing optimal quantizers to replace the uniform quantizer in Figure 1, our approach involves simple filtering operations that modify the spectral components of the source in order to make it more readily compressible for DPCM.

¹ Throughout this paper we will assume the use of a uniform quantizer whose output is losslessly coded with a first order entropy coder.
² Of course this difference starts to approach zero at very low bit rates, where distortion approaches its maximum value (see, e.g., Figures 11 and 12).
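The kind of operating points plotted in Figure 2 can be reproduced with a short simulation. The following sketch is a simplified stand-in for the codec of Figure 1(a)-(b): it runs standard DPCM on a simulated first order Gaussian AR sequence, estimates rate as the first order entropy of the quantizer indices, and measures distortion as mean squared reconstruction error. The step sizes swept at the bottom are arbitrary choices, not values from the paper.

```python
import numpy as np

def simulate_dpcm(a=0.9, n=200_000, step=1.0, seed=0):
    """Standard DPCM on y_n = a*y_{n-1} + z_n with a uniform quantizer.

    Rate is estimated as the first-order entropy of the quantizer indices
    (bits/sample); distortion is the mean squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)              # innovations, sigma_z^2 = 1
    y_hat_prev = 0.0                        # shared encoder/decoder reconstruction
    y_prev = 0.0
    idx = np.empty(n, dtype=np.int64)
    y = np.empty(n)
    y_rec = np.empty(n)
    for k in range(n):
        y[k] = a * y_prev + z[k]
        e = y[k] - a * y_hat_prev           # prediction error (z~_n)
        q = int(np.round(e / step))         # uniform quantizer index
        idx[k] = q
        y_hat = a * y_hat_prev + q * step   # decoder reconstruction
        y_rec[k] = y_hat
        y_prev, y_hat_prev = y[k], y_hat
    # first-order entropy of the index stream
    _, counts = np.unique(idx, return_counts=True)
    p = counts / counts.sum()
    rate = -np.sum(p * np.log2(p))
    dist = np.mean((y - y_rec) ** 2) / np.var(y)   # D / sigma_Y^2
    return rate, dist

for step in [0.25, 0.5, 1.0, 2.0, 4.0]:
    r, d = simulate_dpcm(step=step)
    print(f"step={step:4.2f}  rate={r:5.2f} bits  D/var(y)={d:6.4f}")
```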

1.1 Basic Idea

It is commonly believed that matched prediction in DPCM is optimal in the sense that it removes the dependency of the coding operation on the process spectral components at all bit rates. In other words, the optimal DPCM coder operating on a Gaussian AR process minimizes the prediction error variance and codes the residual at all bit rates. At high bit rates this observation is accurate and leads to generic rate-distortion asymptotics [2]. However, at low bit rates process spectral components play an important role and lead to interesting performance trade-offs. As we will see, these trade-offs provide rate-distortion benefits that cannot be obtained by a matched predictor.

In order to illustrate the main idea, consider a first-order, zero mean Gaussian AR process; however, note that the results can be generalized to higher orders in a straightforward manner. Let y_n = a y_{n−1} + z_n, (n = ..., −1, 0, 1, ...) denote the process, where the z_n are the independent innovations generating the process and a is the AR coefficient with |a| < 1. For simplicity, assume 0 ≤ a < 1. Let N(0, σ²) denote the Gaussian probability density function (pdf) with zero mean and variance σ²; then we have z_n ~ N(0, σ_z²) and y_n ~ N(0, σ_z²/(1 − a²)). The theoretic rate-distortion function of a Gaussian AR process is easily obtained from its spectrum via the reverse water-filling paradigm [1]. Let Φ_Y(ω) = σ_z²/|1 − a e^{−jω}|² denote the spectrum of y_n; then we have (Figure 3):

Result 1.1 (Theorem in [1]) The mean-squared error rate-distortion function of y_n has the parametric representation

D_θ = (1/2π) ∫_{−π}^{π} min[θ, Φ_Y(ω)] dω    (distortion)    (1)

R(D_θ) = (1/4π) ∫_{−π}^{π} max[0, log₂(Φ_Y(ω)/θ)] dω    (rate)    (2)

where 0 ≤ θ ≤ σ_z²/(1 − a)².
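As a minimal sketch of how Result 1.1 is used, the parametric representation can be evaluated numerically by sweeping θ and integrating over a dense frequency grid; the grid size and θ sweep below are arbitrary choices.

```python
import numpy as np

def ar1_rd_curve(a=0.9, sigma_z2=1.0, n_theta=50, n_w=4096):
    """Evaluate the parametric rate-distortion function of Result 1.1
    by numerically integrating over the AR(1) spectrum Phi_Y(w)."""
    w = np.linspace(-np.pi, np.pi, n_w)
    phi = sigma_z2 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2   # Phi_Y(w)
    thetas = np.linspace(1e-4, phi.max(), n_theta)            # 0 <= theta <= sigma_z^2/(1-a)^2
    dw = w[1] - w[0]
    D = np.array([np.sum(np.minimum(t, phi)) * dw / (2 * np.pi) for t in thetas])
    R = np.array([np.sum(np.maximum(0.0, np.log2(phi / t))) * dw / (4 * np.pi) for t in thetas])
    return D, R

D, R = ar1_rd_curve()
sigma_y2 = 1.0 / (1 - 0.9 ** 2)
for d, r in zip(D[::10], R[::10]):
    print(f"D/sigma_Y^2 = {d / sigma_y2:.3f}   R = {r:.2f} bits")
```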

Figure 3(a) indicates that at high bit rates the theoretic rate-distortion function is obtained by allocating bits to all spectral components, whereas from Figure 3(b) it can be seen that at low bit rates the available bits are allocated to the magnitudewise significant low frequency spectral components, i.e., spectral components with frequencies less than ω_θ are transmitted with distortion θ, whereas those with frequencies greater than ω_θ are suppressed. Hence at low bit rates the magnitudewise less significant components of the spectrum are traded off for more significant components in order to arrive at better rate-distortion performance.

As argued in [1, 2], the above result advocates a computationally complex transform-based coder for the compression of Gaussian AR sequences. Indeed, since DPCM works in the time domain with a limited number of sample process values, the relevance of process spectral components to the overall compression performance is not clear, especially under matched prediction. In particular, the generalization of the above trade-off to practical DPCM coders operating on Gaussian AR processes is not obvious. However, as will be shown in this paper, one can indeed take advantage of the existence of magnitudewise less significant spectral components to further the performance of DPCM at low bit rates.

Consider Figures 1(c)-(d), where we see that DPCM effectively codes z̃_n, the process innovations z_n (white, Gaussian) plus a feedback quantization error term a q_{n−1}. Assume that z̃_n is approximately Gaussian and has a white spectrum (Section 2). Then the effect of the various spectral components of z̃_n on the rate incurred by the encoder is equal, because this rate only depends on the variance of z̃_n and the quantizer step size used (Section 2). However, due to the reconstruction filter G(ω) = 1/(1 − a e^{−jω}) on the decoder side, the effect of spectral components on the reduction of incurred distortion is not equal (i.e., distortionwise, high-frequency spectral components have less importance than low-frequency spectral components, Figure 4). Thus, at high target distortion levels (low target bit rates), a DPCM encoder can be modified to somehow trade off or suppress these less important spectral components for a reduction in rate, at the expense of increased distortion, in order to achieve better overall rate-distortion performance³. Of course, one must pose this problem in a rate-distortion optimization framework and find rate-distortion optimal solutions in order to validate the existence of such a trade-off. This framework and the resulting optimal solutions are the main contributions of this paper.

³ Note that under such a setting the total distortion will have two parts, the first part due to suppressed spectral components and the second due to quantization, similar to the reverse water-filling formulation.
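The asymmetry of Figure 4 is easy to quantify: with a white quantizer input, a unit of coding error costs the same rate at any frequency, but its contribution to the reconstruction distortion is weighted by |G(ω)|². A short check (the two frequencies are arbitrary examples):

```python
import numpy as np

a = 0.9
G2 = lambda w: 1.0 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2   # |G(w)|^2

for w in [0.1 * np.pi, 0.9 * np.pi]:
    print(f"w = {w/np.pi:.1f}*pi   |G(w)|^2 = {G2(w):7.3f}")
# A unit of white coding error at the low frequency is amplified far more by
# the reconstruction filter than the same error at the high frequency, while
# both cost the same rate for a white quantizer input.
```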

Reference [5] considers a scenario where the y_n are the sampled values of a time-continuous Gaussian process. The time-continuous Gaussian process is sampled, coded with a PCM coder, and reconstructed at the decoding end. It is shown that tuning ideal antialiasing filters and correspondingly reducing the sampling rate result in better rate-distortion performance for a PCM system encoding the resampled process at low bit rates⁴. Note that this operation effectively removes less important high frequency spectral components using the antialiasing filter (thereby incurring a fixed distortion) at the benefit of reducing the number of samples that need to be transmitted, resulting in a rate gain. In the present work, we will assume that such a resampling scheme is not possible and propose simple improvements to standard DPCM to optimize its rate-distortion performance⁵.

In this paper, the following modifications (Figure 1(e)-(f)) to standard DPCM are proposed: a prefilter L(ω) at the input to the encoder, a quantization noise-shaping filter C(ω) in the quantization loop, and a postfilter P(ω) at the input to the decoder. From the above discussion, the prefilter is expected to trade off less important spectral components for a reduction in rate and a better overall rate-distortion performance. Naturally we expect to have different prefilters for each target distortion. For example, for the first order Gaussian AR process considered in this paper (a = 0.9) we expect to see an all-pass filter at high bit rates and progressively lower-pass filters at low bit rates. Such prefilters will modify the source and may result in unoccupied spectral bands depending on the target distortion. The idea behind the postfilter P(ω) and quantization noise-shaping filter C(ω) follows from distortion-optimized feedback quantizer design, where a bandlimited source is quantized using a feedback quantizer.

⁴ Of course the PCM decoder is modified to match the encoder modifications.
⁵ The reader should also note that [5] does not provide a rate-distortion optimization framework.

The feedback quantizer tries to shape the quantization noise out of the frequency bands occupied by the source for subsequent removal by a postfilter at the decoding end [6]. All of these filters are obtained via a rate-distortion optimization framework, and in particular the derivations of the postfilter and the quantization noise-shaping filter are different from the work in [6]. We now derive these filters to optimize the rate-distortion performance of the modified codec.

In the rest of this paper, refer to the notation of Figures 1(e)-(f). The DPCM codec is analyzed for Gaussian AR processes to arrive at our main results in Section 2. The analysis is carried out in a general framework which includes a prefilter and a quantization error-shaping filter at the encoder and a postfilter at the decoder. Issues complicating the analysis and the necessary assumptions are discussed in Section 2. The derivations of the prefilter, the quantization noise-shaping filter and the postfilter are examined in Sections 2.1, 2.2, and 2.3, respectively. Section 3 considers the results of the optimization and examines the validity of our assumptions. Section 4 includes a discussion of the proposed rate-distortion optimization framework and briefly compares it to other formulations, followed by Section 5 of concluding remarks. All rate-distortion performance plots are actual results on a simulated Gaussian process except for Figure 8, which investigates the validity of our assumptions.

2 Main formulation

Before proceeding with the formulation, the following issues complicating the analysis need to be pointed out:

- The uniform quantizer in Figure 1 does not insert Gaussian quantization noise [7, 9].
- Due to the feedback structure of the quantizer, the quantization noise is not necessarily white.
- The pdf of p̃_n is not exactly known for the above reasons. Together with the high quantization distortion incurred at low bit rates, this makes simple integral approximations to rate [10] useless.

The pdf of p̃_n has been analyzed for standard DPCM systems (i.e., when C(ω) = a e^{−jω}) using orthogonal polynomial expansions [7, 9]. Fortunately, it can be shown that this density is closely approximated by a Gaussian pdf (e.g., [7])⁶. Using this approximation we make the following assumptions:

- We will assume that the quantization noise is approximately white and that the quantization distortion is given by the ubiquitous σ_q² = Δ²/12, where Δ is the quantizer step size.
- We will assume that the input to the feedback quantizer, p_n, is decorrelated from the feedback quantization error at each n, i.e., p̃_n is the sum of two decorrelated terms.
- We will also assume that the error process caused by the prefilter, f_n = z_n − p_n, and the quantization-error process are decorrelated, i.e., the total distortion is due to two decorrelated error processes.

With these approximations, we are now ready to derive the filters. As noted above, the high quantization distortion incurred at low bit rates precludes the use of integral approximations to rate, and getting analytical expressions for the actual rate (necessary in a rate-distortion optimization framework) is difficult even with the foregoing assumptions. However, the problem can be equivalently posed with simple quantities by using the following observation: the rate for a Gaussian random variable p undergoing uniform quantization is governed by the ratio σ_p/Δ, i.e., the actual rate is a monotonic function of this quantity. With this observation, we can formulate the familiar rate-distortion optimization function (R + λD) equivalently as

J = σ_p²/σ_q² + λ₁ (D − D_T)    (3)

where D_T is the target distortion, σ_q² = Δ²/12, and λ₁ is a nonnegative Lagrange multiplier chosen such that the overall distortion (for the prefilter, postfilter and quantizer combined) satisfies D = D_T. Thus, σ_p²/σ_q² is used in place of the actual expression for rate because minimizing this quantity is equivalent to minimizing rate.

⁶ The general problem of the analysis of a feedback quantizer with arbitrary C(ω) acting on colored Gaussian input (p_n) is intractable even with orthogonal expansion techniques [7, 11] and is not considered in this work.
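The observation that the rate is a monotonic function of σ_p/Δ is easy to verify numerically. A minimal sketch, using the exact bin probabilities of a zero-mean Gaussian under a unit-step uniform quantizer (the list of ratios below is arbitrary):

```python
from math import erf, sqrt, log2

def entropy_uniform_quantized_gaussian(sigma_over_delta, kmax=2000):
    """First-order entropy (bits) of a zero-mean Gaussian quantized with a
    unit-step uniform quantizer, as a function of sigma_p / Delta."""
    s = sigma_over_delta
    H = 0.0
    for k in range(-kmax, kmax + 1):
        # P(index = k) for the bin [(k - 0.5), (k + 0.5)] in units of Delta
        p = 0.5 * (erf((k + 0.5) / (s * sqrt(2))) - erf((k - 0.5) / (s * sqrt(2))))
        if p > 0:
            H -= p * log2(p)
    return H

for ratio in [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]:
    print(f"sigma_p/Delta = {ratio:4.2f}   H = {entropy_uniform_quantized_gaussian(ratio):5.2f} bits")
```

The entropy increases monotonically with σ_p/Δ, which is what allows σ_p²/σ_q² to stand in for the rate in (3).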

Using our approximations and Figures 1(e)-(f) we can write

D = (1/2π) ∫_{−π}^{π} |1 − L(ω)P(ω)|² |G(ω)|² σ_z² dω + (1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² σ_q² dω    (4)

σ_p² = (1/2π) ∫_{−π}^{π} |L(ω)|² σ_z² dω + (1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω    (5)

where the first term in the distortion expression represents the distortion due to the prefilter-postfilter pair and the second term is due to quantization distortion. Note that both D and σ_p² are decoupled with respect to L(ω) and C(ω) thanks to our decorrelation assumptions, and the only coupling is due to the postfilter P(ω) (here G(ω) = 1/(1 − a e^{−jω})). Rearranging terms to reflect this decoupling,

σ_q² J + λ₂ D_T = J_L + J_C    (6)

J_L = (1/2π) ∫_{−π}^{π} |L(ω)|² σ_z² dω + λ₂ (1/2π) ∫_{−π}^{π} |1 − L(ω)P(ω)|² |G(ω)|² σ_z² dω    (7)

J_C = (1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω + λ₂ (1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² σ_q² dω    (8)

where λ₂ = σ_q² λ₁ and L(ω), C(ω), P(ω) are real impulse response filters chosen to minimize J.
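A minimal numerical sketch of Equations (4)-(5), evaluating D and σ_p² on a dense frequency grid for given filter responses. The value of σ_q² below is an arbitrary placeholder; standard DPCM corresponds to L(ω) = P(ω) = 1 and C(ω) = a e^{−jω}.

```python
import numpy as np

def dpcm_model_quantities(L, C, P, a=0.9, sigma_z2=1.0, sigma_q2=0.1, n_w=4096):
    """Evaluate the model distortion D of Eq. (4) and the quantizer input
    variance sigma_p^2 of Eq. (5) on a dense frequency grid.

    L, C, P are callables returning the frequency responses L(w), C(w), P(w);
    G(w) = 1/(1 - a e^{-jw}) is the AR(1) reconstruction filter.
    """
    w = np.linspace(-np.pi, np.pi, n_w)
    dw = w[1] - w[0]
    G = 1.0 / (1.0 - a * np.exp(-1j * w))
    Lw, Cw, Pw = L(w), C(w), P(w)
    D = (np.sum(np.abs(1 - Lw * Pw) ** 2 * np.abs(G) ** 2) * sigma_z2
         + np.sum(np.abs(1 - Cw) ** 2 * np.abs(Pw * G) ** 2) * sigma_q2) * dw / (2 * np.pi)
    sigma_p2 = (np.sum(np.abs(Lw) ** 2) * sigma_z2
                + np.sum(np.abs(Cw) ** 2) * sigma_q2) * dw / (2 * np.pi)
    return D, sigma_p2

# Standard DPCM: L(w) = 1, P(w) = 1, C(w) = a e^{-jw}.
a = 0.9
D, sp2 = dpcm_model_quantities(lambda w: np.ones_like(w),
                               lambda w: a * np.exp(-1j * w),
                               lambda w: np.ones_like(w), a=a)
print("D =", D, "  sigma_p^2 =", sp2)   # D recovers sigma_q^2 for the standard system
```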

2.1 Optimization of the prefilter L(ω)

Let T(ω) = L(ω)P(ω); then

J_L = (1/2π) ∫_{−π}^{π} (|T(ω)|²/|P(ω)|²) σ_z² dω + λ₂ (1/2π) ∫_{−π}^{π} |1 − T(ω)|² |G(ω)|² σ_z² dω    (9)

and we can make the following observations (Figure 5):

- For fixed |T(ω)| (|T(−ω)|), the value of T(ω) (T(−ω)) that minimizes J_L is real with T(ω) ≥ 0.
- For fixed |1 − T(ω)|, the value of T(ω) that minimizes J_L satisfies T(ω) ≤ 1.

Combining these two observations, we conclude that

Proposition 2.1 For optimality, T(ω) has to be real and even with 0 ≤ T(ω) ≤ 1.

Writing

J_L = (1/2π) ∫_{−π}^{π} |T(ω) − λ₂|G(ω)|²|P(ω)|²/(1 + λ₂|G(ω)|²|P(ω)|²)|² · ((1 + λ₂|G(ω)|²|P(ω)|²)/|P(ω)|²) · σ_z² dω + (terms independent of T(ω))

we see that the optimal T(ω) = λ₂|G(ω)|²|P(ω)|²/(1 + λ₂|G(ω)|²|P(ω)|²). For a given λ₂ it can be seen that T(ω) tends to 1 as |G(ω)P(ω)| gets large and to 0 as |G(ω)P(ω)| gets small. When the postfilter is set to its standard form (P(ω) = 1) it is clear that the optimal prefilter tends to suppress magnitudewise less significant spectral components, while preserving the significant spectral components. With arbitrary postfilters P(ω), the optimal form of T(ω), having poles both inside and outside of the unit circle, is not implementable. Moreover, approximating this form of T(ω) with an FIR filter having a large number of taps is not desirable for complexity reasons. Instead, we will use FIR filters with a small number of taps consistent with Proposition 2.1⁷. The reader should note that an FIR T(ω) can easily be determined for a given postfilter and λ₂ by taking derivatives of (9) with respect to the filter tap values and equating the result to zero. Using such a T(ω), the prefilter can be determined as L(ω) = T(ω)/P(ω) for a given postfilter P(ω).

⁷ Notice that Proposition 2.1 calls for a symmetric, look-ahead filter.
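A small sketch of the closed-form optimum above, evaluated on a frequency grid with the standard postfilter P(ω) = 1. The λ₂ values are arbitrary; large λ₂ corresponds to a low target distortion and yields an essentially all-pass T(ω), while small λ₂ yields a progressively lower-pass T(ω).

```python
import numpy as np

def optimal_T(w, lam2, a=0.9, P=None):
    """Per-frequency optimum of Eq. (9):
    T(w) = lam2*|G(w)|^2*|P(w)|^2 / (1 + lam2*|G(w)|^2*|P(w)|^2)."""
    G2 = 1.0 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2
    P2 = np.ones_like(w) if P is None else np.abs(P(w)) ** 2
    x = lam2 * G2 * P2
    return x / (1.0 + x)

w = np.linspace(0, np.pi, 5)
for lam2 in [100.0, 1.0, 0.05]:        # large lam2 ~ low target distortion
    T = optimal_T(w, lam2)
    print("lam2 =", lam2, "  T(w) =", np.round(T, 3))
```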

2.2 Optimization of the quantization error-shaping filter C(ω)

The main constraint involved in the selection of C(ω) is that it has to be a causal filter, using only the past values of the quantization-error process, i.e., the optimization is constrained. Thus,

J_C = (1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω + λ₂ (1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² σ_q² dω

is optimized subject to a causal C(ω) with (1/2π) ∫_{−π}^{π} C(ω) dω = 0, in order to ensure that the output of C(ω) depends only on the past input values.

Optimization of feedback quantizers for bandlimited sources has been studied using distortion alone as the optimization criterion [6, 13]. This usually involves designing a quantization error-shaping filter which tries to put the quantization error into spectral bands that are not occupied by the input signal. The quantized signal is then filtered by an ideal postfilter at the decoder in order to eliminate the out-of-band quantization error. As pointed out by [12], the procedure has practical shortcomings because ideal postfilters at the decoder are not feasible and quantization saturation effects at the encoder have to be accounted for, i.e., assuming a practical quantizer with a fixed number of levels, the magnitude of the feedback term should not become arbitrarily large. [12] tries to remedy this problem by incorporating nonideal filters in the analysis and proposes to curb saturation effects by constraining the squared magnitude of the shaping filter⁸. As such, the latter part of their work can be considered similar to this section, because even though this work assumes entropy coding of the quantizer output (and hence, a quantizer with a fixed number of levels is not necessary), the squared magnitude of the shaping filter C(ω) is incorporated into the optimization since it affects the rate expression, being a part of σ_p² (Equation (5)).

Let D(ω) = 1 − C(ω); then D(ω) has to be causal with (1/2π) ∫_{−π}^{π} D(ω) dω = 1 and

(1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω = (1/2π) ∫_{−π}^{π} |1 − D(ω)|² σ_q² dω = (1/2π) ∫_{−π}^{π} |D(ω)|² σ_q² dω − σ_q².

Thus, J_C becomes

J_C = (1/2π) ∫_{−π}^{π} |D(ω)|² σ_q² dω − σ_q² + λ₂ (1/2π) ∫_{−π}^{π} |D(ω)|² |P(ω)|² |G(ω)|² σ_q² dω
    = (1/2π) ∫_{−π}^{π} |D(ω)|² (1 + λ₂ |P(ω)|² |G(ω)|²) σ_q² dω − σ_q²
    = (1/2π) ∫_{−π}^{π} |D(ω) H(ω)|² σ_q² dω − σ_q²    (10)

where |H(ω)|² = 1 + λ₂ |P(ω)|² |G(ω)|². Assuming P(ω) and G(ω) are stable filters with rational z-transforms, H(ω) can be obtained from |H(ω)|² via simple factorization. With a nonnegative Lagrange multiplier λ₁, λ₂ = σ_q² λ₁ is nonnegative and one can obtain a stable H(ω) with |H(ω)|² > 0.

Incorporating (1/2π) ∫_{−π}^{π} D(ω) dω = 1 into Equation (10) with a Lagrange multiplier λ₃ to enforce this constraint, the optimization problem can be written as

(1/2π) ∫_{−π}^{π} |D(ω) − λ₃/(H(ω) σ_q²)|² |H(ω)|² σ_q² dω + (terms independent of D(ω)).

Then D(ω) = δ/H(ω), where δ is a constant chosen to ensure (1/2π) ∫_{−π}^{π} D(ω) dω = 1, and we have

Proposition 2.2 For optimality, C(ω) = 1 − δ/H(ω), where δ = 1/((1/2π) ∫_{−π}^{π} (1/H(ω)) dω) and H(ω) is the stable factorization of |H(ω)|² = 1 + λ₂ |P(ω)|² |G(ω)|² (|H(ω)|² > 0) chosen to yield a causal C(ω).

⁸ It should also be noted that [12] restricts the search for the shaping filter to FIR filters, as opposed to the more general formulation presented in this paper.
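A minimal numerical sketch of Proposition 2.2: |H(ω)|² is factored into its minimum-phase (causal, stable) factor via the real cepstrum, δ is computed from the formula above, and the causality of the resulting C(ω) is checked. The grid size and λ₂ are arbitrary, and P(ω) = 1 is assumed for simplicity.

```python
import numpy as np

def noise_shaper(a=0.9, lam2=2.0, N=4096, P=None):
    """Proposition 2.2 sketch: C(w) = 1 - delta/H(w), with H(w) the
    minimum-phase spectral factor of |H(w)|^2 = 1 + lam2*|P(w)|^2*|G(w)|^2."""
    w = 2 * np.pi * np.arange(N) / N                       # uniform grid on [0, 2pi)
    G2 = 1.0 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2
    P2 = np.ones(N) if P is None else np.abs(P(w)) ** 2
    S = 1.0 + lam2 * P2 * G2                               # |H(w)|^2  (>= 1)

    cep = np.fft.ifft(np.log(S)).real                      # real cepstrum of |H|^2
    mp = np.zeros(N)
    mp[0], mp[N // 2] = cep[0] / 2.0, cep[N // 2] / 2.0    # fold onto the causal part
    mp[1:N // 2] = cep[1:N // 2]
    H = np.exp(np.fft.fft(mp))                             # minimum-phase factor

    delta = 1.0 / np.mean(1.0 / H)                         # 1 / [(1/2pi) int dw / H(w)]
    C = 1.0 - delta / H                                    # Proposition 2.2

    c_imp = np.fft.ifft(C)                                 # impulse response of C
    anticausal_energy = np.sum(np.abs(c_imp[N // 2:]) ** 2)
    print("mean of C over frequency (should be ~0):", np.mean(C).real)
    print("anticausal energy of c_n (should be ~0):", anticausal_energy)
    return w, C

noise_shaper()
```

Choosing the minimum-phase factor is what makes 1/H(ω), and hence C(ω), causal, while δ forces the zeroth coefficient of C(ω) to vanish as required.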

2.3 Optimization of the postfilter P(ω)

The optimization of the postfilter P(ω) jointly with the other filters presents a nonlinear optimization problem and is tackled numerically. Using T(ω) = P(ω)L(ω), the optimization problem can be formulated as

D = D_L + D_C    (11)

D_L = (1/2π) ∫_{−π}^{π} |1 − T(ω)|² |G(ω)|² σ_z² dω

D_C = (1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² σ_q² dω

σ_p² = (1/2π) ∫_{−π}^{π} (|T(ω)|²/|P(ω)|²) σ_z² dω + (1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω    (12)

and

J = (1/2π) ∫_{−π}^{π} (|T(ω)|²/|P(ω)|²) (σ_z²/σ_q²) dω + (1/2π) ∫_{−π}^{π} |C(ω)|² dω
  + λ₁ ( (1/2π) ∫_{−π}^{π} |1 − T(ω)|² |G(ω)|² σ_z² dω + (1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² σ_q² dω − D_T )    (13)

First note that (13) depends on the product |P(ω)|² σ_q², i.e., scaling P(ω) by a constant results in the same optimization point if σ_q² is inverse-scaled by the same amount. Choosing this scaling constant to yield, say, P(0) = 1, one can obtain σ_q² as

σ_q² = (D_T − D_L) / ((1/2π) ∫_{−π}^{π} |1 − C(ω)|² |P(ω)|² |G(ω)|² dω)

provided the prefilter distortion D_L ≤ D_T. Meeting the total distortion constraint in this way, J becomes

J = (1/2π) ∫_{−π}^{π} (|T(ω)|²/|P(ω)|²) (σ_z²/σ_q²) dω + (1/2π) ∫_{−π}^{π} |C(ω)|² dω

which is to be optimized subject to D_L ≤ D_T and Propositions 2.1, 2.2.

3 Optimization results

Figure 6 displays the filters obtained via numerical optimization. Twenty-nine tap FIR filters were used for T(ω) and P(ω). Proposition 2.1 is satisfied by T(ω) and Proposition 2.2 is used to obtain C(ω). The numerical optimization proceeds by solving for T(ω) and C(ω) for a given P(ω), and then updating P(ω) using the solved quantities in an iterative fashion subject to D = D_T. For a given P(ω) and λ₂, (9) is solved for the optimal T(ω). The resulting solution is rejected if D_L > D_T. Otherwise, Proposition 2.2 is used to obtain C(ω). P(ω) and λ₂ are then updated via a simplex search and the algorithm is repeated until a tolerance threshold is reached.

In Figure 7, the rate-distortion performance of the proposed system on a simulated Gaussian process is shown. Note that the optimization proceeds by choosing a possibly different filter combination for each target distortion. Thus, the overall rate-distortion curve resides on the convex hull of the chosen filter rate-distortion curves.
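A simplified, hedged sketch of this optimization loop follows. The postfilter is held at its standard value P(ω) = 1 and the per-frequency optimal T(ω) is used in place of a 29-tap FIR fit; for each candidate λ₂ the optimal T(ω) is computed, the candidate is rejected if D_L ≥ D_T, otherwise C(ω) is obtained from Proposition 2.2, σ_q² is set to meet D = D_T, and the objective σ_p²/σ_q² is evaluated. The paper's simplex search over P(ω) and λ₂ is replaced here by a crude grid over λ₂ only.

```python
import numpy as np

def min_phase_factor(S):
    """Minimum-phase (causal, stable) H(w) with |H(w)|^2 = S(w), via the real cepstrum."""
    N = len(S)
    cep = np.fft.ifft(np.log(S)).real
    m = np.zeros(N)
    m[0], m[N // 2] = cep[0] / 2.0, cep[N // 2] / 2.0
    m[1:N // 2] = cep[1:N // 2]
    return np.exp(np.fft.fft(m))

def inner_cost(lam2, D_T, a=0.9, sigma_z2=1.0, N=4096):
    """One 'inner' evaluation for fixed P(w) = 1 and a given lam2."""
    w = 2 * np.pi * np.arange(N) / N
    G2 = 1.0 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2
    # np.mean over the uniform grid approximates (1/2pi) * integral over [0, 2pi)
    T = lam2 * G2 / (1.0 + lam2 * G2)                       # optimal T for P = 1
    D_L = np.mean(np.abs(1 - T) ** 2 * G2) * sigma_z2       # prefilter distortion
    if D_L >= D_T:
        return np.inf                                       # reject this lam2
    H = min_phase_factor(1.0 + lam2 * G2)
    delta = 1.0 / np.mean(1.0 / H)
    C = 1.0 - delta / H                                     # Proposition 2.2
    denom = np.mean(np.abs(1 - C) ** 2 * G2)                # (1/2pi) int |1-C|^2 |PG|^2 dw
    sigma_q2 = (D_T - D_L) / denom                          # meet D = D_T
    sigma_p2 = np.mean(np.abs(T) ** 2) * sigma_z2 + np.mean(np.abs(C) ** 2) * sigma_q2
    return sigma_p2 / sigma_q2                              # monotone in rate

# crude outer search over lam2 for one (arbitrary) target distortion
sigma_y2 = 1.0 / (1 - 0.9 ** 2)
D_T = 0.3 * sigma_y2
lams = np.logspace(-2, 3, 60)
costs = [inner_cost(lam, D_T) for lam in lams]
best = lams[int(np.argmin(costs))]
print("best lam2:", best, "  J:", min(costs))
```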

As can be seen from Figure 6, the prefilter trades off the high frequencies in the process spectrum, which contribute less to the reduction in distortion, for gains in rate. This is consistent with earlier discussions. The postfilter ensures Proposition 2.1 for T(ω) = L(ω)P(ω) and rejects out-of-band quantization noise. The curves for the quantization error-shaping filter C(ω) are somewhat surprising from a distortion-only viewpoint. In Figure 6, 1 − C(ω), which determines the effective quantization noise via |1 − C(ω)|² σ_q², is also plotted. Clearly, in order to minimize quantization distortion, we want C(ω) to shape the quantization error into the bands suppressed by P(ω)G(ω) as much as possible. However, it can be seen from Figure 6 that the standard system puts more of the quantization noise into these bands. As noted by [12], putting more of the quantization noise into suppressed bands, while reducing the quantization distortion, has the unfortunate effect of increasing (1/2π) ∫_{−π}^{π} |C(ω)|² σ_q² dω (and hence, the rate), and a compromise is necessary. Note that the formulation presented in this paper results in the rate-distortion optimal compromise.

As the target distortion increases, our assumptions are inevitably violated and there is increased discrepancy between calculated values and values observed via simulation. Most notably, as shown in Figure 8, the targeted distortion falls short of the observed value and σ_q² = Δ²/12 is also compromised. At these distortion values, the input to the uniform quantizer (prefiltered term plus the feedback quantization error) becomes increasingly colored and, together with large step sizes Δ, results in nonwhite quantization noise not obeying σ_q² = Δ²/12. Thus, in a strict sense, one tends to lose optimality as the distortion is increased due to lost conformance to the assumed model. However, because the optimized filters coalesce at increased distortion values (Figure 6), the actual effect of this discrepancy is expected to be marginal.

The effect of the feedback quantizer on the violation of the assumed model can be demonstrated via dithering techniques. A predetermined white random noise process distributed uniformly over the interval [−Δ/2, Δ/2] can be added to the input to the quantizer p̃_n as a dither signal [17]. The dither signal is known to the encoder-decoder pair and the entropy coding (decoding) takes place conditioned on the known dither. Such an additive dither aids the feedback quantizer to produce white quantization noise obeying σ_q² = Δ²/12. The agreement between calculated and observed quantities and the rate-distortion performance for the dithering scheme⁹ are shown in Figures 8(b) and 9.

⁹ The use of dithering is for the purpose of showing the effect of the feedback quantizer, and the reader should note that while providing access to a white quantizer, dithering comes at a rate-distortion performance penalty, especially for high distortion values.
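A minimal sketch of the dithered quantizer inside a standard DPCM loop, assuming one common realization of the scheme in which the shared dither is added before quantization and subtracted afterwards; the step size is an arbitrary (coarse) choice. The empirical quantization error should come out white with variance close to Δ²/12.

```python
import numpy as np

def dithered_dpcm_error(a=0.9, n=200_000, step=2.0, seed=1):
    """Standard DPCM loop with a dithered uniform quantizer.

    A dither sequence, uniform on [-step/2, step/2] and known to both ends,
    is added before quantization and subtracted afterwards; the resulting
    quantization error is white, independent of the input, with variance
    step^2 / 12.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    d = rng.uniform(-step / 2, step / 2, n)     # shared dither
    y_hat_prev, y_prev = 0.0, 0.0
    q_err = np.empty(n)
    for k in range(n):
        y = a * y_prev + z[k]
        e = y - a * y_hat_prev
        e_hat = np.round((e + d[k]) / step) * step - d[k]   # dithered quantizer
        q_err[k] = e_hat - e
        y_hat_prev = a * y_hat_prev + e_hat
        y_prev = y
    print("empirical var of q:", np.var(q_err), "  step^2/12 =", step ** 2 / 12)
    # whiteness check: lag-1 autocorrelation should be ~0
    r1 = np.corrcoef(q_err[:-1], q_err[1:])[0, 1]
    print("lag-1 correlation of q:", r1)

dithered_dpcm_error()
```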

Finally, for low-complexity applications it becomes important to realize most of the gain of the proposed system with filters of the smallest dimension. In typical applications it is also very important to keep the decoder at very low complexity. Figure 10 shows the rate-distortion performance of a DPCM system modified only by an optimized three-tap prefilter (C(ω) = a e^{−jω}, P(ω) = 1). As illustrated, it is possible to obtain the benefits of the proposed system with a complexitywise low-cost prefilter.
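A minimal sketch of this encoder-only configuration: a symmetric 3-tap prefilter is applied ahead of an otherwise standard DPCM encoder, and the completely standard decoder is used, with distortion measured against the original signal. The tap values and step size below are illustrative placeholders, not the paper's optimized values; to realize the gains of Figure 10 the taps must be optimized per target distortion as in Section 2.1.

```python
import numpy as np

def dpcm_rate_dist(x, y_ref, a=0.9, step=2.0):
    """Run standard DPCM on x, decode with the standard decoder, and report
    (first-order entropy of the indices, MSE against the reference y_ref)."""
    n = len(x)
    y_hat_prev = 0.0
    idx = np.empty(n, dtype=np.int64)
    rec = np.empty(n)
    for k in range(n):
        e = x[k] - a * y_hat_prev
        q = int(np.round(e / step))
        idx[k] = q
        y_hat_prev = a * y_hat_prev + q * step
        rec[k] = y_hat_prev
    _, counts = np.unique(idx, return_counts=True)
    p = counts / counts.sum()
    rate = -np.sum(p * np.log2(p))
    return rate, np.mean((rec - y_ref) ** 2)

rng = np.random.default_rng(0)
a, n = 0.9, 200_000
z = rng.standard_normal(n)
y = np.empty(n)
prev = 0.0
for k in range(n):
    prev = a * prev + z[k]
    y[k] = prev

# Illustrative symmetric 3-tap prefilter (NOT the paper's optimized taps):
h = np.array([0.15, 0.70, 0.15])
y_pre = np.convolve(y, h, mode="same")

for name, x in [("standard DPCM", y), ("3-tap prefilter + standard DPCM", y_pre)]:
    r, d = dpcm_rate_dist(x, y, a=a, step=2.0)
    print(f"{name:32s} rate={r:5.2f} bits  D/var(y)={d/np.var(y):6.3f}")
```

Because only the encoder input changes, the decoder in this configuration remains the standard DPCM decoder, which is the point made in the paragraph above.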

4 Discussion of Results

As outlined in Section 1.1, the main contribution of this paper is the establishment of the DPCM rate-distortion trade-off shown in Figure 4, i.e., less significant process spectral components are suppressed for a reduction in rate at the expense of increased distortion, in order to achieve better overall rate-distortion performance. This trade-off manifests itself at intermediate to low bit rates, where it becomes advantageous to utilize rate-distortion optimized prefilters, postfilters and quantization noise-shaping filters to substantially improve over the rate-distortion performance of an unmodified DPCM system. Figures 11 and 12 compare the rate-distortion performance of the optimized system, a system optimized only by a 9-tap prefilter (C(ω) = a e^{−jω}, P(ω) = 1), and a system optimized only by a 3-tap prefilter (C(ω) = a e^{−jω}, P(ω) = 1). It can be seen that the systems that only incorporate optimized prefilters perform very closely to the fully optimized system and that the improved performance over a standard DPCM codec is mostly due to the prefilter:

Distortion vs. Rate-Distortion Optimization: The difference between distortion vs. rate-distortion optimization formulations and the role of the prefilter can be most clearly demonstrated by examining the following special case. Assume that the postfilter and the quantization noise-shaping filter are set to their standard values (C(ω) = a e^{−jω}, P(ω) = 1). If one is only interested in minimizing distortion, examination of the total distortion expression in Equation (11) reveals that one must set T(ω) = L(ω) = 1. This results in the standard DPCM system. On the other hand, if a rate-distortion optimization is desired, then one can obtain a non-trivial prefilter (L(ω) = T(ω)) via the results of Section 2.1. The substantial performance difference between distortion-only (standard DPCM) and rate-distortion optimization can be observed in Figures 11 and 12.

Postfilters and Quantization Noise-Shaping Filters vs. Prefilters: In order to present a complete framework, this paper examines the optimization of C(ω) and P(ω) in addition to L(ω). The reader should note that the extra benefit (in a rate-distortion sense) provided by optimizing C(ω) and P(ω) is modest. While the distortion optimization of these filters in conjunction with L(ω) is quite useful in other contexts (see, e.g., [14, 15, 16] and references therein), the important rate-distortion trade-off identified in Figure 4 is established by the rate-distortion optimization of the prefilter L(ω) as derived in Section 2.1.

Fixed-rate quantization vs. entropy coded uniform quantization: DPCM variants incorporating fixed-rate quantizers (i.e., quantizers that utilize a fixed number of levels without entropy coding) have been analyzed [6, 9, 12, 13]. In principle, one can formulate the rate-distortion optimization for these systems as a distortion minimization subject to a given target rate constraint (as determined by the number of quantizer levels). The main difficulty involved in the optimization, even under simplifying assumptions, is the presence of quantizer saturation effects. Quantizer saturation leads to overload distortion, which may be difficult to analyze and incorporate in an optimization framework. Intuitively, one must coordinate the design of the fixed-rate quantizer with the variance at the input of the quantizer (σ_p² in our notation) such that granular and overload distortions are accounted for.

A rate-distortion trade-off similar to Figure 4 manifests itself if one allows for the curbing of the variance at the input to the quantizer via a prefilter. A good prefilter will reduce this variance by selectively suppressing less significant process spectral components (based on the target rate). If the variance is reduced, then a quantizer with reduced granular and overload distortions can be utilized, at the expense of distortion incurred due to the suppressed spectral components. On the other hand, no curbing or no prefiltering will result in more granular and overload distortion while not incurring any extra distortion due to suppressed spectral components. This leads to the aforementioned trade-off. Note that in order to derive correctly optimized prefilters one must accurately account for the overload distortions and not ignore them as is typically done, since overload distortions may become significant at low bit rates. Note also that this possible curbing formulation with fixed-rate quantizers is much less direct¹⁰ when compared to the formulation of this paper where, thanks to the joint utilization of a uniform quantizer and an entropy coder, analysis is simplified and the reduction of σ_p² becomes directly equivalent to a reduction of rate¹¹.

Prefilters and Computational Complexity Issues: Figures 11 and 12 indicate that one can obtain most of the benefits of the proposed system by only employing optimized prefilters (C(ω) = a e^{−jω}, P(ω) = 1). Furthermore, for the special but important case of a first order process, even a 3-tap prefilter is capable of substantially improving over the standard system. Designing optimized systems by only incorporating optimized prefilters reduces the computational complexity of the proposed system and, in addition, enables encoder-side modifications only, i.e., since C(ω) and P(ω) are set to their standard values, one can still target standard decoders while maintaining most of the benefits of the proposed framework with encoder-only modifications.

¹⁰ This is especially the case if analysis is desired for non-uniform quantizers. If one allows a fixed-rate uniform quantizer (N levels, step size Δ) and assumes that a granular region the size of a given multiple (say k) of the input standard deviation is sufficient to ignore overload distortions, then the rate constraint can be established by kσ_p/Δ = N, which will lead to a rate-distortion optimization formulation similar to that of this paper.
¹¹ As mentioned in Section 2.2, [12] tries to curb saturation effects by heuristically constraining the squared magnitude of the shaping filter. Note that this procedure will still not lead to the identified trade-off, as the constraint is not established on the prefilter.

Of course, for higher order processes, a sufficient number of taps must be allowed in the prefilters so that suppression of insignificant process spectral components can be achieved and the DPCM rate-distortion trade-off realized¹².

¹² It is clear that for higher order processes, the optimized prefilters will not necessarily exhibit low-pass behavior. Rather, the frequency response of the optimized prefilters will ensure that insignificant process spectral components (as determined by the target distortion level) are suppressed wherever they lie in the process spectrum (Section 2.1).

5 Conclusion

DPCM is a simple and powerful compression strategy at high bit rates. With newly emerging low bit rate applications, the performance of DPCM coders in the low-complexity, low-bit-rate region has gained a lot of importance. Pointing out the inefficiency of the DPCM coder at low bit rates, we analyzed and optimized the DPCM system in this region. We proposed jointly optimized encoder-decoder pairs utilizing only simple filtering operations. As a result, rate-distortion optimized DPCM systems that significantly outperform standard DPCM were designed. Implementation issues related to the optimized system were addressed and low complexity solutions proposed.

References

[1] T. Berger, Rate Distortion Theory. Englewood Cliffs, NJ: Prentice-Hall, 1971.

[2] L. D. Davisson, "Rate-distortion theory and application," Proceedings of the IEEE, vol. 60, no. 7, July 1972.

[3] P. Noll, "Digital Audio Coding for Visual Communications," Proceedings of the IEEE, vol. 83, no. 6, June 1995.

[4] D. J. Le Gall, "The MPEG video compression algorithm: A review," Proceedings of SPIE, The International Society for Optical Engineering, vol. 1452.

[5] W. C. Kellog, "Information rates in sampling and quantization," IEEE Transactions on Information Theory, vol. IT-13, no. 3, July 1967.

[6] H. A. Spang, III and P. M. Schultheiss, "Reduction of quantizing noise by use of feedback," IRE Transactions on Communications Systems, vol. CS-10, Dec. 1962.

[7] D. S. Arnstein, "Quantization error in predictive coders," IEEE Transactions on Communications, vol. COM-23, no. 4, Apr. 1975.

[8] P. Noll, "On Predictive Quantizing Schemes," Bell System Technical Journal, vol. 57, no. 5, May-June 1978.

[9] N. Farvardin and J. W. Modestino, "Rate-distortion performance of DPCM schemes for autoregressive sources," IEEE Transactions on Information Theory, vol. IT-31, no. 3, May 1985.

[10] V. R. Algazi and J. T. DeWitte, Jr., "Theoretical performance of entropy-encoded DPCM," IEEE Transactions on Communications, vol. COM-30, no. 5, May 1982.

[11] M. Naraghi-Pour and D. L. Neuhoff, "Mismatched DPCM encoding of autoregressive processes," IEEE Transactions on Information Theory, vol. 36, no. 2, March 1990.

[12] D. D. Stacey, R. L. Frost, and G. A. Ware, "Error spectrum shaping quantizers with non-ideal reconstruction filters and saturating quantizers," in Proceedings ICASSP '91, vol. 3, 1991.

[13] E. G. Kimme and F. F. Kuo, "Synthesis of Optimal Filters for a Feedback Quantization System," IEEE Transactions on Circuit Theory, Sept. 1963.

[14] R. C. Brainard and J. C. Candy, "Direct-Feedback Coders: Design and Performance with Television Signals," Proceedings of the IEEE, vol. 57, no. 5, May 1969.

[15] E. F. Brown, "A Sliding Scale Direct-Feedback Coder for Television," Bell System Technical Journal, May-June 1969.

[16] N. S. Jayant and P. Noll, Digital Coding of Waveforms. Englewood Cliffs, NJ: Prentice-Hall, 1984.

[17] L. Schuchmann, "Dither signals and their effects on quantization noise," IEEE Transactions on Communications Technology, vol. COM-12, Dec. 1964.

Figures

Figure 1: Standard and proposed DPCM coders for a first order Gaussian AR process y_n = a y_{n−1} + z_n. (a) DPCM encoder, (b) DPCM decoder, (c) Equivalent innovations encoder, (d) Equivalent decoder, (e) Proposed encoder, (f) Proposed decoder.

Figure 2: DPCM rate-distortion performance (theoretic R(D) vs. DPCM for y_n = a y_{n−1} + z_n, a = 0.9; rate in bits vs. D/σ_Y²).

Figure 3: Theoretic rate-distortion on the signal spectrum Φ_Y(ω) (a = 0.9). (a) Low distortion region where all spectral components of the process are allocated bit rate. (b) High distortion region where only the low frequency spectral components are allocated bit rate.

Figure 4: Rate-distortion trade-off for DPCM. Spectral components at ω₁ and ω₂ contribute equally to the total rate, but the contribution to the reduction of distortion is determined by |G(ω)|², which weighs lower frequencies more heavily.

Figure 5: Optimality of T(ω) = L(ω)P(ω). (a) For fixed |T(ω)|, the optimal T(ω) is real with T(ω) ≥ 0. (b) For fixed |1 − T(ω)|, the optimal T(ω) is real with T(ω) ≤ 1.

Figure 6: Progression of L(ω), P(ω), C(ω), 1 − C(ω) with increasing distortion, a = 0.9.

Figure 7: Rate-distortion performance of the optimized system (a = 0.9). The optimized system resides on the convex hull of the optimized-filter curves.

Figure 8: (a) Mismatch between targeted and actual values (observed via simulation) for the optimized system. (b) Agreement between targeted and actual values (observed via simulation) for the optimized system with dither.

Figure 9: Rate-distortion performance of the optimized system with dither (a = 0.9).

Figure 10: Rate-distortion performance of the optimized three-tap prefilter (a = 0.9).

Figure 11: Rate-distortion performance of optimized systems (a = 0.9).

Figure 12: Rate-distortion performance of optimized systems (a = 0.8).


More information

Encoder Decoder Design for Feedback Control over the Binary Symmetric Channel

Encoder Decoder Design for Feedback Control over the Binary Symmetric Channel Encoder Decoder Design for Feedback Control over the Binary Symmetric Channel Lei Bao, Mikael Skoglund and Karl Henrik Johansson School of Electrical Engineering, Royal Institute of Technology, Stockholm,

More information

On Compression Encrypted Data part 2. Prof. Ja-Ling Wu The Graduate Institute of Networking and Multimedia National Taiwan University

On Compression Encrypted Data part 2. Prof. Ja-Ling Wu The Graduate Institute of Networking and Multimedia National Taiwan University On Compression Encrypted Data part 2 Prof. Ja-Ling Wu The Graduate Institute of Networking and Multimedia National Taiwan University 1 Brief Summary of Information-theoretic Prescription At a functional

More information

Optimal Multiple Description and Multiresolution Scalar Quantizer Design

Optimal Multiple Description and Multiresolution Scalar Quantizer Design Optimal ultiple Description and ultiresolution Scalar Quantizer Design ichelle Effros California Institute of Technology Abstract I present new algorithms for fixed-rate multiple description and multiresolution

More information

Communication constraints and latency in Networked Control Systems

Communication constraints and latency in Networked Control Systems Communication constraints and latency in Networked Control Systems João P. Hespanha Center for Control Engineering and Computation University of California Santa Barbara In collaboration with Antonio Ortega

More information

Digital Image Processing Lectures 25 & 26

Digital Image Processing Lectures 25 & 26 Lectures 25 & 26, Professor Department of Electrical and Computer Engineering Colorado State University Spring 2015 Area 4: Image Encoding and Compression Goal: To exploit the redundancies in the image

More information

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak 4. Quantization and Data Compression ECE 32 Spring 22 Purdue University, School of ECE Prof. What is data compression? Reducing the file size without compromising the quality of the data stored in the

More information

A NEW BASIS SELECTION PARADIGM FOR WAVELET PACKET IMAGE CODING

A NEW BASIS SELECTION PARADIGM FOR WAVELET PACKET IMAGE CODING A NEW BASIS SELECTION PARADIGM FOR WAVELET PACKET IMAGE CODING Nasir M. Rajpoot, Roland G. Wilson, François G. Meyer, Ronald R. Coifman Corresponding Author: nasir@dcs.warwick.ac.uk ABSTRACT In this paper,

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Can the sample being transmitted be used to refine its own PDF estimate?

Can the sample being transmitted be used to refine its own PDF estimate? Can the sample being transmitted be used to refine its own PDF estimate? Dinei A. Florêncio and Patrice Simard Microsoft Research One Microsoft Way, Redmond, WA 98052 {dinei, patrice}@microsoft.com Abstract

More information

Magnitude F y. F x. Magnitude

Magnitude F y. F x. Magnitude Design of Optimum Multi-Dimensional Energy Compaction Filters N. Damera-Venkata J. Tuqan B. L. Evans Imaging Technology Dept. Dept. of ECE Dept. of ECE Hewlett-Packard Labs Univ. of California Univ. of

More information

Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y)

Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y) Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y) E{(X-Y) 2 } D

More information

6196 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 9, SEPTEMBER 2011

6196 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 9, SEPTEMBER 2011 6196 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 9, SEPTEMBER 2011 On the Structure of Real-Time Encoding and Decoding Functions in a Multiterminal Communication System Ashutosh Nayyar, Student

More information

encoding without prediction) (Server) Quantization: Initial Data 0, 1, 2, Quantized Data 0, 1, 2, 3, 4, 8, 16, 32, 64, 128, 256

encoding without prediction) (Server) Quantization: Initial Data 0, 1, 2, Quantized Data 0, 1, 2, 3, 4, 8, 16, 32, 64, 128, 256 General Models for Compression / Decompression -they apply to symbols data, text, and to image but not video 1. Simplest model (Lossless ( encoding without prediction) (server) Signal Encode Transmit (client)

More information

16.7 Multistep, Multivalue, and Predictor-Corrector Methods

16.7 Multistep, Multivalue, and Predictor-Corrector Methods 740 Chapter 16. Integration of Ordinary Differential Equations 16.7 Multistep, Multivalue, and Predictor-Corrector Methods The terms multistepand multivaluedescribe two different ways of implementing essentially

More information

COMPLEX SIGNALS are used in various areas of signal

COMPLEX SIGNALS are used in various areas of signal IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 2, FEBRUARY 1997 411 Second-Order Statistics of Complex Signals Bernard Picinbono, Fellow, IEEE, and Pascal Bondon, Member, IEEE Abstract The second-order

More information

Lecture 7 Predictive Coding & Quantization

Lecture 7 Predictive Coding & Quantization Shujun LI (李树钧): INF-10845-20091 Multimedia Coding Lecture 7 Predictive Coding & Quantization June 3, 2009 Outline Predictive Coding Motion Estimation and Compensation Context-Based Coding Quantization

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

Quadrature-Mirror Filter Bank

Quadrature-Mirror Filter Bank Quadrature-Mirror Filter Bank In many applications, a discrete-time signal x[n] is split into a number of subband signals { v k [ n]} by means of an analysis filter bank The subband signals are then processed

More information

I. INTRODUCTION. A. Related Work

I. INTRODUCTION. A. Related Work 1624 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 17, NO. 9, SEPTEMBER 2008 Rate Bounds on SSIM Index of Quantized Images Sumohana S. Channappayya, Member, IEEE, Alan Conrad Bovik, Fellow, IEEE, and Robert

More information

Multiple Description Transform Coding of Images

Multiple Description Transform Coding of Images Multiple Description Transform Coding of Images Vivek K Goyal Jelena Kovačević Ramon Arean Martin Vetterli U. of California, Berkeley Bell Laboratories École Poly. Féd. de Lausanne École Poly. Féd. de

More information

Transformation Techniques for Real Time High Speed Implementation of Nonlinear Algorithms

Transformation Techniques for Real Time High Speed Implementation of Nonlinear Algorithms International Journal of Electronics and Communication Engineering. ISSN 0974-66 Volume 4, Number (0), pp.83-94 International Research Publication House http://www.irphouse.com Transformation Techniques

More information

Machine Learning. A Bayesian and Optimization Perspective. Academic Press, Sergios Theodoridis 1. of Athens, Athens, Greece.

Machine Learning. A Bayesian and Optimization Perspective. Academic Press, Sergios Theodoridis 1. of Athens, Athens, Greece. Machine Learning A Bayesian and Optimization Perspective Academic Press, 2015 Sergios Theodoridis 1 1 Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens,

More information

MATCHING-PURSUIT DICTIONARY PRUNING FOR MPEG-4 VIDEO OBJECT CODING

MATCHING-PURSUIT DICTIONARY PRUNING FOR MPEG-4 VIDEO OBJECT CODING MATCHING-PURSUIT DICTIONARY PRUNING FOR MPEG-4 VIDEO OBJECT CODING Yannick Morvan, Dirk Farin University of Technology Eindhoven 5600 MB Eindhoven, The Netherlands email: {y.morvan;d.s.farin}@tue.nl Peter

More information

CS578- Speech Signal Processing

CS578- Speech Signal Processing CS578- Speech Signal Processing Lecture 7: Speech Coding Yannis Stylianou University of Crete, Computer Science Dept., Multimedia Informatics Lab yannis@csd.uoc.gr Univ. of Crete Outline 1 Introduction

More information

Linear, Worst-Case Estimators for Denoising Quantization Noise in Transform Coded Images

Linear, Worst-Case Estimators for Denoising Quantization Noise in Transform Coded Images Linear, Worst-Case Estimators for Denoising Quantization Noise in Transform Coded Images 1 Onur G. Guleryuz DoCoMo Communications Laboratories USA, Inc. 181 Metro Drive, Suite 3, San Jose, CA 91 guleryuz@docomolabs-usa.com,

More information

arxiv: v1 [cs.it] 20 Jan 2018

arxiv: v1 [cs.it] 20 Jan 2018 1 Analog-to-Digital Compression: A New Paradigm for Converting Signals to Bits Alon Kipnis, Yonina C. Eldar and Andrea J. Goldsmith fs arxiv:181.6718v1 [cs.it] Jan 18 X(t) sampler smp sec encoder R[ bits

More information

EE5356 Digital Image Processing. Final Exam. 5/11/06 Thursday 1 1 :00 AM-1 :00 PM

EE5356 Digital Image Processing. Final Exam. 5/11/06 Thursday 1 1 :00 AM-1 :00 PM EE5356 Digital Image Processing Final Exam 5/11/06 Thursday 1 1 :00 AM-1 :00 PM I), Closed books and closed notes. 2), Problems carry weights as indicated. 3), Please print your name and last four digits

More information

Intraframe Prediction with Intraframe Update Step for Motion-Compensated Lifted Wavelet Video Coding

Intraframe Prediction with Intraframe Update Step for Motion-Compensated Lifted Wavelet Video Coding Intraframe Prediction with Intraframe Update Step for Motion-Compensated Lifted Wavelet Video Coding Aditya Mavlankar, Chuo-Ling Chang, and Bernd Girod Information Systems Laboratory, Department of Electrical

More information

ECE533 Digital Image Processing. Embedded Zerotree Wavelet Image Codec

ECE533 Digital Image Processing. Embedded Zerotree Wavelet Image Codec University of Wisconsin Madison Electrical Computer Engineering ECE533 Digital Image Processing Embedded Zerotree Wavelet Image Codec Team members Hongyu Sun Yi Zhang December 12, 2003 Table of Contents

More information

Analysis of methods for speech signals quantization

Analysis of methods for speech signals quantization INFOTEH-JAHORINA Vol. 14, March 2015. Analysis of methods for speech signals quantization Stefan Stojkov Mihajlo Pupin Institute, University of Belgrade Belgrade, Serbia e-mail: stefan.stojkov@pupin.rs

More information

Audio Coding. Fundamentals Quantization Waveform Coding Subband Coding P NCTU/CSIE DSPLAB C.M..LIU

Audio Coding. Fundamentals Quantization Waveform Coding Subband Coding P NCTU/CSIE DSPLAB C.M..LIU Audio Coding P.1 Fundamentals Quantization Waveform Coding Subband Coding 1. Fundamentals P.2 Introduction Data Redundancy Coding Redundancy Spatial/Temporal Redundancy Perceptual Redundancy Compression

More information

Approximately achieving the feedback interference channel capacity with point-to-point codes

Approximately achieving the feedback interference channel capacity with point-to-point codes Approximately achieving the feedback interference channel capacity with point-to-point codes Joyson Sebastian*, Can Karakus*, Suhas Diggavi* Abstract Superposition codes with rate-splitting have been used

More information

DPCM FOR QUANTIZED BLOCK-BASED COMPRESSED SENSING OF IMAGES

DPCM FOR QUANTIZED BLOCK-BASED COMPRESSED SENSING OF IMAGES th European Signal Processing Conference (EUSIPCO 12) Bucharest, Romania, August 27-31, 12 DPCM FOR QUANTIZED BLOCK-BASED COMPRESSED SENSING OF IMAGES Sungkwang Mun and James E. Fowler Department of Electrical

More information

Logarithmic quantisation of wavelet coefficients for improved texture classification performance

Logarithmic quantisation of wavelet coefficients for improved texture classification performance Logarithmic quantisation of wavelet coefficients for improved texture classification performance Author Busch, Andrew, W. Boles, Wageeh, Sridharan, Sridha Published 2004 Conference Title 2004 IEEE International

More information

Iterative Encoder-Controller Design for Feedback Control Over Noisy Channels

Iterative Encoder-Controller Design for Feedback Control Over Noisy Channels IEEE TRANSACTIONS ON AUTOMATIC CONTROL 1 Iterative Encoder-Controller Design for Feedback Control Over Noisy Channels Lei Bao, Member, IEEE, Mikael Skoglund, Senior Member, IEEE, and Karl Henrik Johansson,

More information

Soft-Output Trellis Waveform Coding

Soft-Output Trellis Waveform Coding Soft-Output Trellis Waveform Coding Tariq Haddad and Abbas Yongaçoḡlu School of Information Technology and Engineering, University of Ottawa Ottawa, Ontario, K1N 6N5, Canada Fax: +1 (613) 562 5175 thaddad@site.uottawa.ca

More information

Achieving the Gaussian Rate-Distortion Function by Prediction

Achieving the Gaussian Rate-Distortion Function by Prediction Achieving the Gaussian Rate-Distortion Function by Prediction Ram Zamir, Yuval Kochman and Uri Erez Dept. Electrical Engineering-Systems, Tel Aviv University Abstract The water-filling solution for the

More information

Transform Representation of Signals

Transform Representation of Signals C H A P T E R 3 Transform Representation of Signals and LTI Systems As you have seen in your prior studies of signals and systems, and as emphasized in the review in Chapter 2, transforms play a central

More information

Dither and noise modulation in sigma delta modulators

Dither and noise modulation in sigma delta modulators Audio Engineering Society Convention Paper Presented at the 5th Convention 003 October 0 3 New York, New York This convention paper has been reproduced from the author's advance manuscript, without editing,

More information

Summary of Lecture 3

Summary of Lecture 3 Summary of Lecture 3 Simple histogram based image segmentation and its limitations. the his- Continuous and discrete amplitude random variables properties togram equalizing point function. Images as matrices

More information

COMPRESSIVE (CS) [1] is an emerging framework,

COMPRESSIVE (CS) [1] is an emerging framework, 1 An Arithmetic Coding Scheme for Blocked-based Compressive Sensing of Images Min Gao arxiv:1604.06983v1 [cs.it] Apr 2016 Abstract Differential pulse-code modulation (DPCM) is recentl coupled with uniform

More information

Lab 4: Quantization, Oversampling, and Noise Shaping

Lab 4: Quantization, Oversampling, and Noise Shaping Lab 4: Quantization, Oversampling, and Noise Shaping Due Friday 04/21/17 Overview: This assignment should be completed with your assigned lab partner(s). Each group must turn in a report composed using

More information

Afundamental component in the design and analysis of

Afundamental component in the design and analysis of IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 2, MARCH 1999 533 High-Resolution Source Coding for Non-Difference Distortion Measures: The Rate-Distortion Function Tamás Linder, Member, IEEE, Ram

More information