Audio Denoising by Time-Frequency Block Thresholding

Guoshen Yu, Stéphane Mallat and Emmanuel Bacry
CMAP, Ecole Polytechnique, Palaiseau, France
March 27

Abstract

For audio denoising, diagonal thresholding estimators of spectrogram coefficients produce a musical noise that degrades audio perception. We introduce a block thresholding estimator that produces hardly any musical noise and improves the SNR compared to diagonal thresholding or Ephraim and Malah estimators. Spectrogram coefficients are grouped into blocks to compute attenuation factors. This block grouping regularizes the estimation, which removes the musical noise. The block size is adapted to the signal properties by minimizing a Stein unbiased estimator of the block thresholding risk.

Index Terms: Audio denoising, Block thresholding, Diagonal thresholding, Ephraim and Malah, SURE.

1 Introduction

Audio signals are often contaminated by background environment noise and by buzzing or humming noise from audio equipment. Audio denoising aims at attenuating the noise while retaining the underlying signals. Applications such as music and speech restoration are numerous.

Thresholding estimators [11] remove noise by setting to zero the small coefficients of an appropriate sparse signal representation. Thresholding wavelet coefficients is particularly effective for suppressing noise in images, and these estimators are used in many applications. For audio signals, despite interesting work on such thresholding estimators [8, 21, 24], the results are less convincing. Indeed, thresholding the spectrogram or the wavelet coefficients of a noisy audio signal produces a musical noise [6, 26]. This noise is a sum of localized time-frequency structures corresponding to isolated spectrogram or wavelet coefficients above the threshold. It contaminates the denoised sound and degrades the audio perception.
Currently, the audio denoising method most often used is the Ephraim and Malah noise suppression rule [12, 13] and its variants [25] applied to spectrograms. This technique introduces little musical noise and maintains a small-amplitude residual noise that masks this musical noise. This paper introduces a block thresholding estimator that produces hardly any musical noise with no residual noise, by grouping spectrogram coefficients in time-frequency blocks [26]. A block thresholding restores spectrograms that are more regular, without the isolated coefficients responsible for musical noise. Taking advantage of the time-frequency regularity of audio sounds, it also improves the resulting SNR. Comparisons are made with Ephraim and Malah estimators.

Block thresholding estimators were first introduced by Cai and Silverman [3, 4, 5] to improve noise removal in orthonormal wavelet bases. Mathematical studies [15, 16, 17] proved the minimax optimality of wavelet block thresholding for certain classes of signals. For audio denoising, the grouping of spectrogram coefficients in blocks can be automatically adjusted to the signal content by minimizing the resulting risk calculated with the Stein estimator [23].

We begin by reviewing conventional diagonal thresholding estimators and explain why they produce musical noise for audio signals. Section 3 introduces the block thresholding estimators of Cai and Silverman [4] in the general context of orthogonal bases and frames. Block thresholding of spectrogram coefficients is studied for audio denoising, and comparisons are made with Ephraim and Malah methods. To adjust the size of the blocks that group spectrogram coefficients, Section 4 explains how to compute the Stein unbiased risk estimate [23] of a block thresholding algorithm and how to adjust the block size to minimize this risk estimate. A post-processing with an empirical Wiener shrinkage [14] is presented in Section 5 to further improve the estimation.

2 Diagonal Thresholding

Section 2.1 describes the properties of diagonal thresholding estimators both in orthogonal bases and in frames, and Section 2.2 explains why they produce musical noise when applied to audio spectrograms.

2.1 Properties of Diagonal Thresholding Estimators

Let $y$ be a noisy signal that is the sum of a clean signal $f$ and a noise $\epsilon$ of zero mean:

$$y[n] = f[n] + \epsilon[n], \quad n = 0, 1, \ldots, N-1. \tag{1}$$

Thresholding estimators decompose noisy signals in a basis or in a frame and set to zero the small-amplitude coefficients. Let $F = \{g_m\}_{1 \le m \le N}$ be a family of vectors that define an orthonormal basis of $\mathbb{R}^N$. Decomposing $y$ in $F$ yields

$$y_F[m] = f_F[m] + \epsilon_F[m], \quad 1 \le m \le N \tag{2}$$

with $y_F[m] = \langle y, g_m \rangle$, $f_F[m] = \langle f, g_m \rangle$ and $\epsilon_F[m] = \langle \epsilon, g_m \rangle$. A diagonal estimator in this basis modifies the amplitude of each coefficient $y_F[m]$ with a factor $a[m]$ and reconstructs

$$\hat f = \sum_{m=1}^{N} D_m(y_F[m])\, g_m = \sum_{m=1}^{N} a[m]\, y_F[m]\, g_m. \tag{3}$$

To reduce the quadratic risk $E\{\|f - \hat f\|^2\}$ one can verify that the attenuation factor should satisfy $0 \le a[m] \le 1$. The estimator is said to be diagonal if $a[m]$ depends only upon $y_F[m]$.
For diagonal estimators, one can verify [11] that a lower bound of the quadratic risk $E\{\|f - \hat f\|^2\}$ is obtained by choosing

$$a[m] = \frac{|f_F[m]|^2}{|f_F[m]|^2 + \sigma^2[m]} \tag{4}$$

where $\sigma^2[m] = E\{|\epsilon_F[m]|^2\}$ is the variance of each noisy coefficient. The resulting lower bound risk is

$$R_o = \sum_{m=1}^{N} \frac{|f_F[m]|^2\, \sigma^2[m]}{|f_F[m]|^2 + \sigma^2[m]}. \tag{5}$$
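The step from (3) to (4) and (5) can be made explicit for a single coefficient. The short derivation below is a sketch that assumes a real, deterministic (oracle) factor $a$ and a zero-mean noise coefficient of variance $\sigma^2$; the notation $r(a)$ for the per-coefficient risk is introduced here only for illustration.

```latex
% Per-coefficient risk of the estimate a*y with y = f + eps, E{eps} = 0, E{|eps|^2} = sigma^2
\begin{align*}
r(a) &= E\{|f - a\,y|^2\} = (1-a)^2\,|f|^2 + a^2\,\sigma^2, \\
\frac{dr}{da} &= -2(1-a)\,|f|^2 + 2a\,\sigma^2 = 0
  \;\Longrightarrow\; a = \frac{|f|^2}{|f|^2 + \sigma^2}, \\
r\!\left(\frac{|f|^2}{|f|^2+\sigma^2}\right) &= \frac{|f|^2\,\sigma^2}{|f|^2 + \sigma^2}.
\end{align*}
```

Summing the last expression over $m$ gives the oracle risk (5).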

This lower bound cannot be reached because the oracle attenuation factor (4) depends upon $f_F[m]$, which is unknown. A simple diagonal estimator is the empirical Wiener estimator [2] defined by

$$D_m(x) = x \left( 1 - \frac{\sigma^2[m]}{|x|^2} \right)_+ \tag{6}$$

where we write $(z)_+ = \max(z, 0)$. Donoho and Johnstone [11] have introduced better thresholding estimators that can produce a risk close to the oracle lower bound. A hard thresholding keeps coefficients above a threshold $T_m = \lambda \sigma[m]$:

$$D_m(x) = x\, \mathbf{1}_{\{|x| > \lambda \sigma[m]\}} \tag{7}$$

in which case the attenuation factor $a[m]$ is 0 or 1. A soft thresholding reduces the amplitude of all coefficients:

$$D_m(x) = x \left( 1 - \frac{\lambda \sigma[m]}{|x|} \right)_+. \tag{8}$$

To minimize the risk, Donoho and Johnstone proved that the threshold $T_m$ should be proportional to the noise standard deviation and should depend upon the signal size. Asymptotically, an optimal choice is

$$T_m = \sqrt{2 \log_e N}\; \sigma[m]. \tag{9}$$

When the noise $\epsilon$ is Gaussian and white, and hence $\sigma[m] = \sigma$ for all $1 \le m \le N$, Donoho and Johnstone [11] proved that for $N \ge 4$ the hard and soft thresholding risk is close to the minimum oracle risk:

$$R_o \le E\{\|f - \hat f\|^2\} \le (2 \log_e N + 2.4)\,(\sigma^2 + R_o). \tag{10}$$

A frame is a family of $M \ge N$ vectors $F = \{g_m\}_{m \in \Gamma}$ that defines a redundant signal representation $f_F[m] = \langle f, g_m \rangle$. A tight frame satisfies an energy conservation like an orthogonal basis,

$$\|f\|^2 = \frac{1}{A} \sum_{m \in \Gamma} |f_F[m]|^2,$$

and as a result one can prove [19] that

$$f = \frac{1}{A} \sum_{m \in \Gamma} f_F[m]\, g_m,$$

where $A$ is the frame bound. A thresholding estimator in a tight frame behaves similarly to an average of thresholding estimators in several orthonormal bases, which often improves the resulting SNR [9]. The thresholding risk in a frame can also be related to an oracle risk with an upper bound similar to (10). In numerical applications, thresholding estimators in tight frames are thus preferred to thresholding estimators in a single orthogonal basis.
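The three diagonal rules (6), (7) and (8) and the asymptotic threshold (9) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, each function returns the attenuation factor $a[m]$ for a single coefficient `x`, and a zero coefficient is mapped to a zero factor by convention.

```python
import math

def wiener_gain(x, sigma):
    """Empirical Wiener attenuation (6): (1 - sigma^2 / |x|^2)_+."""
    energy = abs(x) ** 2
    if energy == 0.0:
        return 0.0  # convention chosen here for a zero coefficient
    return max(1.0 - sigma ** 2 / energy, 0.0)

def hard_gain(x, sigma, lam):
    """Hard thresholding (7): the factor is 1 above lam * sigma, else 0."""
    return 1.0 if abs(x) > lam * sigma else 0.0

def soft_gain(x, sigma, lam):
    """Soft thresholding (8): (1 - lam * sigma / |x|)_+."""
    magnitude = abs(x)
    if magnitude == 0.0:
        return 0.0
    return max(1.0 - lam * sigma / magnitude, 0.0)

def universal_threshold(n, sigma):
    """Asymptotic threshold (9): sigma * sqrt(2 * log_e(n))."""
    return sigma * math.sqrt(2.0 * math.log(n))
```

Because each factor depends on one coefficient only, an isolated noise coefficient that happens to exceed the threshold is kept regardless of its neighbors.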

2.2 Audio Denoising by Diagonal Thresholding

Audio signal denoising can be implemented with a thresholding in a windowed Fourier frame. It amounts to a simple thresholding of the resulting spectrogram, but it produces a musical noise corresponding to isolated coefficients above the threshold. Let $w[n]$ be a window of size $R$ normalized to $\|w\|^2 = 1$. A windowed Fourier frame is defined by

$$F = \{g_{l,r}[n]\} = \left\{ w[n - lu] \exp\left( \frac{i 2\pi r n}{R} \right) \right\}_{1 \le l \le N/u,\ 1 \le r \le R}$$

where $u$ is the window shifting step, and $l$, $r$ are respectively the time and frequency indices. The resulting windowed Fourier coefficients are computed with an FFT for each translated window,

$$f_F[l, r] = \langle f, g_{l,r} \rangle = \sum_{n=1}^{N} f[n]\, w[n - lu] \exp\left( \frac{-i 2\pi r n}{R} \right),$$

and $\{|f_F[l, r]|^2\}_{1 \le l \le N/u,\ 1 \le r \le R}$ is the spectrogram. Thresholding windowed Fourier coefficients thus amounts to thresholding a spectrogram. If the window $w[n]$ is chosen so that

$$\sum_{l} |w[n - lu]|^2 = \frac{A}{R}, \quad \forall n, \tag{11}$$

then one can prove [1] that the windowed Fourier frame is a tight frame with frame bound $A$. In the following, we use half-overlapping windows with $u = R/2$ and a window $w$ that is the square root of a Hanning window, which satisfies (11).

If the noise is stationary then the noise variance $\sigma^2[l, r] = E\{|\epsilon_F[l, r]|^2\}$ depends only upon the frequency index $r$, and if it is white then it has a constant value $\sigma^2$. For an empirical Wiener diagonal estimator (6), the attenuation factor is

$$a[l, r] = \left( 1 - \frac{\sigma^2[l, r]}{|y_F[l, r]|^2} \right)_+,$$

which coincides with the square of the suppression rule for the method of power subtraction [1, 2, 18], and is known to produce musical noise.

To illustrate the musical noise produced by a spectrogram thresholding, Fig. 1 shows the denoising of a short recording of the Mozart oboe concerto contaminated by white Gaussian noise. Fig. 1(a) and 1(b) show respectively the log-spectrograms $\log |f_F[l, r]|$ and $\log |y_F[l, r]|$ of the original signal $f$ and of its noisy version $y$. Thresholding $y_F[l, r]$ amounts to multiplying it by attenuation factors $a[l, r]$ equal to 0 or 1.
Fig. 1(c) shows this attenuation map, with black points corresponding to $a[l, r] = 1$. As can be observed in the zoom in Fig. 1(c'), this attenuation map includes many isolated black points. In the reconstruction process, these isolated coefficients restore isolated windowed Fourier vectors $g_{l,r}[n]$ that are perceived as a musical noise. A soft thresholding produces a similar phenomenon because each coefficient is also thresholded independently of its neighbors. To remove this musical noise, Section 3 uses a block thresholding estimator that takes into account the fact that the large spectrogram coefficients of most audio sounds are aggregated together in the time-frequency plane.
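The tight-frame condition (11) for the half-overlapping square-root Hanning window of Section 2.2 can be checked numerically. The sketch below assumes a periodic Hanning window $0.5\,(1 - \cos(2\pi n / R))$ and an even window size; the helper names `sqrt_hann` and `overlap_energy` are hypothetical.

```python
import math

def sqrt_hann(R):
    """Square root of a periodic Hanning window, normalized so that ||w||^2 = 1."""
    hann = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / R)) for n in range(R)]
    total = sum(hann)
    return [math.sqrt(v / total) for v in hann]

def overlap_energy(w, u):
    """Left-hand side of (11), sum over l of |w[n - l*u]|^2, for n = 0 .. u-1.

    The sum is periodic in n with period u, so one period is enough.
    Assumes that u divides the window size.
    """
    R = len(w)
    return [sum(w[n + l * u] ** 2 for l in range(R // u)) for n in range(u)]

R = 64
w = sqrt_hann(R)
energies = overlap_energy(w, R // 2)
# Frame bound A implied by (11): A = R * sum_l |w[n - l*u]|^2
frame_bound = R * energies[0]
```

With $u = R/2$ the sum in (11) is the same for every $n$, and the implied frame bound equals the redundancy factor 2 of the half-overlapping frame.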

Figure 1: Log-spectrogram of original and noisy Mozart and attenuation coefficients of hard thresholding and block thresholding. (a) Log-spectrogram of original Mozart. (b) Log-spectrogram of noisy Mozart. (c) Hard thresholding. (d) Adaptive block thresholding. (a')(b')(c')(d') are respectively zooms of the marked regions in (a)(b)(c)(d). Values of attenuation coefficients range from 1 (black) to 0 (white).

3 Time-Frequency Block Thresholding

The block thresholding algorithm of Cai and Silverman [3, 4] regularizes diagonal thresholding estimations by grouping coefficients in blocks and computing a single attenuation factor for all coefficients in each block. We present this estimator in the general context of orthogonal bases and frames before applying it to spectrograms for audio denoising. By regularizing the thresholding estimation over blocks of coefficients, the musical noise is almost completely removed and the SNR is improved.

3.1 Block Thresholding in Bases and Frames

Let $F = \{g_m\}_{m \in \Gamma}$ be an orthonormal basis or a frame of $\mathbb{R}^N$. The set $\Gamma$ of all indices $m$ is segmented in $K$ blocks $B_k$ in which indices are grouped together. If $F$ is a windowed Fourier frame then the time-frequency indices $m = (l, r)$ are grouped in time-frequency blocks $B_k$ whose shape may a priori be chosen arbitrarily. A block thresholding estimator multiplies all coefficients within $B_k$ by the same attenuation factor $a_k$:

$$\hat f = \sum_{k=1}^{K} \sum_{m \in B_k} a_k\, y_F[m]\, g_m. \tag{12}$$

This estimator is not diagonal because the value of each $a_k$ may depend upon all coefficients $y_F[m]$ within $B_k$.

A lower bound of the risk $E\{\|\hat f - f\|^2\}$ is obtained with an oracle attenuation. Let $B_k^\#$ be the number of coefficients within a block $B_k$. The average signal and noise energies in this block are

$$\bar f_{F,k}^2 = \frac{1}{B_k^\#} \sum_{m \in B_k} |f_F[m]|^2 \quad \text{and} \quad \bar\sigma_k^2 = \frac{1}{B_k^\#} \sum_{m \in B_k} \sigma^2[m].$$

Similarly to the oracle attenuation factor (4), one can verify that a minimum risk is obtained by choosing

$$a_k = \frac{\bar f_{F,k}^2}{\bar f_{F,k}^2 + \bar\sigma_k^2} = 1 - \frac{\bar\sigma_k^2}{\bar f_{F,k}^2 + \bar\sigma_k^2}, \tag{13}$$

and the resulting oracle block risk, in which each block contributes its per-coefficient risk once for each of its $B_k^\#$ coefficients, is

$$R_{bo} = \sum_{k=1}^{K} B_k^\#\, \frac{\bar f_{F,k}^2\, \bar\sigma_k^2}{\bar f_{F,k}^2 + \bar\sigma_k^2}. \tag{14}$$

Clearly the oracle block attenuation factor $a_k$ in (13) cannot be calculated since it depends upon the values of $f_F[m]$. The goal is to find a block estimator whose risk $E\{\|\hat f - f\|^2\}$ is as close as possible to the lower bound $R_{bo}$.

Observe that the oracle risk with blocks $R_{bo}$ in (14) is always larger than the oracle risk $R_o$ in (5) without blocks, because it is obtained through the same minimization but with fewer parameters, as attenuation factors remain constant over each block. Reducing the number of attenuation parameters with a block technique increases the oracle risk lower bound, but it regularizes the estimation when attenuation factors are computed from empirical coefficients.
A direct calculation shows that

$$R_{bo} - R_o = \sum_{k=1}^{K} \sum_{m \in B_k} \frac{\xi_{F,k}\, \xi_F[m]\, (\bar\sigma_k^2 - \sigma^2[m]) + (\bar f_{F,k}^2 - |f_F[m]|^2)}{(\xi_{F,k} + 1)(\xi_F[m] + 1)}, \tag{15}$$

where $\xi_{F,k} = \bar f_{F,k}^2 / \bar\sigma_k^2$ is the average SNR in block $B_k$ and $\xi_F[m] = |f_F[m]|^2 / \sigma^2[m]$ is the SNR of the coefficient corresponding to the index $m$. Equation (15) indicates that $R_{bo}$ is close to $R_o$ if both the noise and the signal coefficients have little variation in each block. Consequently the risk of the block thresholding estimator is reduced by choosing the blocks so that in each block $B_k$ either
(i) $f_F[m]$ and $\sigma^2[m]$ vary little; or
(ii) $\xi_{F,k} \gg 1$, $\xi_F[m] \gg 1$ and $\sigma^2[m]$ varies little; or
(iii) $\xi_{F,k} \ll 1$, $\xi_F[m] \ll 1$ and $f_F[m]$ varies little.

Cai and Silverman block thresholding operators [3, 4] use the James-Stein shrinkage rule [22]. We cannot compute the original signal energy in the block but we can calculate the noisy signal energy

$$\bar y_{F,k}^2 = \frac{1}{B_k^\#} \sum_{m \in B_k} |y_F[m]|^2$$

and observe that

$$E\{\bar y_{F,k}^2\} = \bar f_{F,k}^2 + \bar\sigma_k^2. \tag{16}$$

The James-Stein shrinkage rule [22] is similar to the oracle formula (13) where $\bar f_{F,k}^2 + \bar\sigma_k^2$ is replaced by $\bar y_{F,k}^2$:

$$a_k = \left( 1 - \frac{\lambda\, \bar\sigma_k^2}{\bar y_{F,k}^2} \right)_+, \tag{17}$$

with a thresholding parameter $\lambda \ge 1$. For blocks of size 1, if $\lambda = 1$ then this shrinkage rule corresponds to the empirical diagonal Wiener estimator defined in (6).

If the noise $\epsilon$ is a Gaussian white noise then, as in the case of diagonal thresholding estimators, the resulting risk $E\{\|\hat f - f\|^2\}$ can be shown to be close to the oracle risk (14). The average noise energy over a block $B_k$,

$$\bar\epsilon_{F,k}^2 = \frac{1}{B_k^\#} \sum_{m \in B_k} |\epsilon_F[m]|^2, \tag{18}$$

follows, up to the normalization factor $\sigma^2 / B_k^\#$, a $\chi^2$ distribution with $B_k^\#$ degrees of freedom, because each noise coefficient $\epsilon_F[m]$ is a Gaussian random variable of variance $\sigma^2$. If all blocks $B_k$ have the same size $B^\#$, then Cai [3] proved that

$$R_{bo} \le E\{\|\hat f - f\|^2\} \le 2\lambda R_{bo} + 4 N \sigma^2\, \mathrm{Prob}\{\bar\epsilon_F^2 > \lambda \sigma^2\}, \tag{19}$$

where $\mathrm{Prob}\{\cdot\}$ is the probability measure and $\bar\epsilon_F^2$ is the average noise energy over a block of size $B^\#$. The second term $4 N \sigma^2\, \mathrm{Prob}\{\bar\epsilon_F^2 > \lambda\sigma^2\}$ in the risk upper bound (19) is a variance term corresponding to the probability of keeping pure noise coefficients, i.e., $f$ is zero ($y = \epsilon$) and $a_k \neq 0$ (cf. (17)). $\mathrm{Prob}\{\bar\epsilon_F^2 > \lambda\sigma^2\}$ is the probability of keeping a residual noise. The oracle risk and the variance terms in (19) are competing.
When $\lambda$ increases, the first term increases and the variance term decreases. Similarly, when the block size $B_k^\#$ increases, the oracle risk $R_{bo}$ increases whereas the variance decreases. Adjusting $\lambda$ and the block sizes $B_k^\#$ can be interpreted as an optimization between the bias and the variance of our block thresholding estimator. The parameters $\lambda$ and $B_k^\#$ are set by adjusting the residual noise probability

$$\mathrm{Prob}\{\bar\epsilon_F^2 > \lambda\sigma^2\} = \delta \tag{20}$$

where $\delta$ is the residual noise probability that one tolerates.
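The block shrinkage rule (17) and the block estimator (12) can be illustrated on a plain list of coefficients. The sketch below assumes white noise of known variance `sigma2`, so that the average noise energy of every block equals `sigma2`; the function names and the representation of blocks as lists of indices are hypothetical choices made for the example.

```python
def block_attenuation(block_coeffs, sigma2, lam):
    """James-Stein factor (17) for one block: (1 - lam * sigma2 / mean energy)_+."""
    mean_energy = sum(abs(c) ** 2 for c in block_coeffs) / len(block_coeffs)
    if mean_energy == 0.0:
        return 0.0  # an all-zero block is simply kept at zero
    return max(1.0 - lam * sigma2 / mean_energy, 0.0)

def block_threshold(coeffs, blocks, sigma2, lam):
    """Apply one common attenuation factor per block of indices, as in (12).

    coeffs: list of noisy coefficients y_F[m] (real or complex).
    blocks: list of index lists forming a partition of range(len(coeffs)).
    """
    out = list(coeffs)
    for indices in blocks:
        a_k = block_attenuation([coeffs[m] for m in indices], sigma2, lam)
        for m in indices:
            out[m] = a_k * coeffs[m]
    return out

# One energetic block and one block at the noise level (sigma2 = 1, lam = 2).
noisy = [2.0, -2.0, 2.0, 2.0, 0.5, -1.0, 1.0, 0.5]
denoised = block_threshold(noisy, [[0, 1, 2, 3], [4, 5, 6, 7]], 1.0, 2.0)
```

In this toy input the first block has mean energy 4 and is attenuated by 0.5, while the second block has mean energy below `lam * sigma2` and is set to zero as a whole, so no isolated coefficient survives inside it.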

Cai [3] shows that choosing $B^\# = \log_e N$ and $\lambda = 4.505$ turns (19) into the following block oracle inequality:

$$E\{\|\hat f - f\|^2\} \le 2\lambda R_{bo} + 2\sigma^2. \tag{21}$$

A tight frame is similar to a union of several orthonormal bases, and the risk of a block thresholding estimator in a tight frame behaves similarly to the sum of the risks in several orthonormal bases. However, even if the noise is Gaussian white, because of the redundancy between frame vectors, the average noise energy $\bar\epsilon_F^2$ over a block of size $B^\#$ no longer follows a $\chi^2_{B^\#}$ distribution.

3.2 Block Thresholding in Short-Time Fourier Frames

The time-frequency block thresholding can be applied directly with short-time Fourier frames. The choice of parameters is discussed below.

Choice of Block. We group time-frequency contiguous short-time Fourier coefficients in disjoint rectangular blocks. The block size is $B_k^\# = L_k \times W_k$, where $L_k$ and $W_k$ are respectively the block length in time and the block width in frequency. For simplicity, dyadic lengths $L_k = 8, 4, 2$ and widths $W_k = 16, 8, 4, 2, 1$ will be used (the unit being the time-frequency index in the spectrogram). In this section, a fixed block length and width are assigned to all the blocks, i.e., $L_k = L$, $W_k = W$ and $B_k^\# = B^\# = L \times W$, $\forall k$.

Choice of Thresholding Level $\lambda$. Given a choice of block size and the residual noise probability level $\delta$ that one tolerates, the thresholding level $\lambda$ is defined by (20). For each block width and length, $\lambda$ is estimated using a Monte Carlo simulation of $\bar\epsilon_F^2$. Table 1 shows the resulting $\lambda$ with $\delta = 0.1\%$. Let us remark that for a block width $W > 1$, blocks that contain the same number of coefficients $B^\# = L \times W$ have close $\lambda$ values.

Table 1: Thresholding level $\lambda$ calculated with different block sizes $B^\# = L \times W$ and with $\delta = 0.1\%$. [Columns: W = 16, W = 8, W = 4, W = 2, W = 1; one row per block length L. The numerical entries were not recovered in this transcription.]
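The Monte Carlo calibration of $\lambda$ from the residual noise probability (20) can be sketched as follows. This is a simplified illustration under a stated assumption: the noise coefficients of a block are drawn as independent real standard Gaussians (the orthonormal-basis case with $\sigma = 1$). In a windowed Fourier frame the noise coefficients are correlated, so reproducing the paper's Table 1 would require simulating the frame coefficients of white noise instead. The function name and its defaults are hypothetical.

```python
import random

def estimate_lambda(block_size, delta, trials=20000, seed=0):
    """Empirical (1 - delta) quantile of the average noise energy of a block.

    Each trial draws block_size independent N(0, 1) coefficients and records
    their mean energy; lambda is chosen so that about a fraction delta of
    the pure-noise blocks have a mean energy above lambda (sigma = 1).
    """
    rng = random.Random(seed)
    energies = sorted(
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(block_size)) / block_size
        for _ in range(trials)
    )
    k = min(int((1.0 - delta) * trials), trials - 1)
    return energies[k]

lam_small_block = estimate_lambda(4, 0.05)
lam_large_block = estimate_lambda(16, 0.05)
```

Larger blocks average more noise coefficients, so their mean energy concentrates around $\sigma^2$ and the same tolerance $\delta$ is reached with a smaller $\lambda$. A small $\delta$ such as the one used for Table 1 needs many more trials than the default used here.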
3.3 Block Thresholding and Ephraim and Malah

In the Ephraim and Malah methods [12, 13, 6] and their variants [7, 25], two factors contribute essentially to the elimination of musical noise: the recursive decision-directed a priori SNR estimator, which induces a temporal regularization in the estimator, and the suppression rules, which retain a uniform noise that efficiently masks the musical noise in denoised signals. We discuss a connection between the block thresholding estimation and the decision-directed a priori SNR estimator. The masking noise technique is then incorporated in the block thresholding estimator.

Ephraim and Malah Methods. Estimating the a priori SNR $\xi[l, r] = |f_F[l, r]|^2 / \sigma^2[l, r]$ is an important step of most noise suppression rules. In their milestone paper [12], Ephraim and Malah proposed a decision-directed estimator of the a priori SNR with a recursive procedure

$$\hat\xi[l, r] = \alpha\, \frac{|\hat f_F[l-1, r]|^2}{\sigma^2[l-1, r]} + (1 - \alpha) \left( \frac{|y_F[l, r]|^2}{\sigma^2[l, r]} - 1 \right)_+, \tag{22}$$

where $\alpha \in [0, 1]$ is a weighting parameter. In the first term, $\hat f_F[l-1, r]$ is the previously computed estimate of $f_F[l-1, r]$. The second term is a maximum likelihood estimate of the SNR of the current coefficient. The decision-directed SNR estimator is recursive and induces a temporal regularization on $\hat\xi[l, r]$ with an exponentially decaying causal smoothing window.

Based on an independent Gaussian distribution assumption for the signal coefficients $f_F[l, r]$, Ephraim and Malah proposed the noise suppression rule

$$\hat f_F[l, r] = a[l, r]\, y_F[l, r] \tag{23}$$

with

$$a[l, r] = \frac{\sqrt{\pi}}{2}\, \frac{\sqrt{v[l, r]}}{\gamma[l, r]} \exp\left( -\frac{v[l, r]}{2} \right) \left[ (1 + v[l, r])\, I_0\!\left( \frac{v[l, r]}{2} \right) + v[l, r]\, I_1\!\left( \frac{v[l, r]}{2} \right) \right] \tag{24}$$

where $\gamma[l, r] = |y_F[l, r]|^2 / \sigma^2[l, r]$ is called the a posteriori SNR of $f_F[l, r]$, $v[l, r]$ is defined by $v[l, r] = \frac{\xi[l, r]}{\xi[l, r] + 1}\, \gamma[l, r]$, and $I_0(\cdot)$ and $I_1(\cdot)$ denote respectively the modified Bessel functions of zero and first order. Fig. 2(b) shows the value of $a[l, r]$ as a function of $\xi[l, r]$ in dB for different values of $\gamma[l, r]$. Note that the curve corresponding to $\gamma - 1 = \xi$ is close to the average case, since $E\{\gamma\} - 1 = \xi$. The Ephraim and Malah suppression rule, compared with block thresholding in Fig. 2(a), performs a less severe attenuation when the a priori SNR $\xi[l, r]$ is very small; moreover, the attenuation decreases when the a posteriori SNR $\gamma[l, r]$ increases. As a result, the Ephraim and Malah suppression rule is able to retain some residual masking noise.
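The recursion (22) can be sketched for a single frequency index $r$ followed over time. This is an illustration only: the function name is hypothetical, the noise variance is taken constant over time, the previous estimate is initialized to zero before the first frame, and the default `gain` is the Wiener-type rule $\xi / (1 + \xi)$, used here as a simple stand-in for the suppression rule (24), which requires Bessel functions.

```python
def decision_directed_snr(y_energy, sigma2, alpha=0.98,
                          gain=lambda xi: xi / (1.0 + xi)):
    """Decision-directed a priori SNR estimates (22) along time for one frequency.

    y_energy: list of |y_F[l, r]|^2 for successive time indices l.
    sigma2:   noise variance, assumed constant over time.
    gain:     suppression rule mapping the estimated SNR to a[l, r]
              (stand-in for (24), an assumption of this sketch).
    """
    xi_hat = []
    prev_clean_energy = 0.0  # |f_hat[l-1, r]|^2, zero before the first frame
    for energy in y_energy:
        ml_snr = max(energy / sigma2 - 1.0, 0.0)  # maximum likelihood term
        xi = alpha * prev_clean_energy / sigma2 + (1.0 - alpha) * ml_snr
        xi_hat.append(xi)
        # (23): |f_hat|^2 = a^2 |y|^2, reused at the next time index
        prev_clean_energy = gain(xi) ** 2 * energy
    return xi_hat

snr_track = decision_directed_snr([5.0, 5.0], 1.0, alpha=0.5)
```

With $\alpha$ close to 1 the estimate at time $l$ is dominated by the previous frame, which is the temporal regularization described above.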
Block Thresholding. A block thresholding estimation (17) also depends upon an estimated a priori SNR calculated on each block:

$$a_k = \left( 1 - \frac{\lambda\, \bar\sigma_k^2}{\bar y_{F,k}^2} \right)_+ = \left( 1 - \frac{\lambda}{\hat\xi_k + 1} \right)_+, \tag{25}$$

where

$$\hat\xi_k = \frac{\bar y_{F,k}^2}{\bar\sigma_k^2} - 1 \tag{26}$$

is an unbiased estimate of the a priori SNR $\xi[l, r]$ computed by averaging the coefficient energy in a block.

To retain a low-amplitude masking noise, a non-zero attenuation floor value is kept by modifying (25):

$$a_k = \max\left( 1 - \frac{\lambda\, \bar\sigma_k^2}{\bar y_{F,k}^2},\; a_0 \right) = \max\left( 1 - \frac{\lambda}{\hat\xi_k + 1},\; a_0 \right) \tag{27}$$

where $0 < a_0 \le 1$ is a masking noise attenuation factor. The experiments show that with $a_0$ around 0.5, the small residual noise completely masks the remaining very weak musical noise. Fig. 2(a) plots the attenuation factor (27) of the block thresholding as a function of $\hat\xi_k$ for different $\lambda$ and $a_0$. Note that the curve with $\lambda = 1$ corresponds to the attenuation with oracle. The block thresholding attenuates more strongly than the Ephraim and Malah suppression rule when the a priori SNR is weak. This explains why the block thresholding is better at eliminating the noise (if $a_0$ is small) than the Ephraim and Malah suppression rule.

Figure 2: Attenuation factor (gain in dB) versus a priori SNR $\xi$ (in dB). (a) Block thresholding (27) for different thresholding parameters $\lambda$ and masking noise attenuation factors $a_0$. (b) Ephraim and Malah suppression rule (24) for different a posteriori SNR $\gamma$.

3.4 Experiments and Results

The experiments presented below have been performed on various types of signals: Piano is a simple example that contains a single clear clavier stroke; Mozart and Centuria are musical excerpts that contain respectively quick notes played by a solo oboe and by some drums; Tête is a speech signal (in French). Centuria is sampled at 44 kHz and all the other signals are sampled at 11 kHz. They were corrupted by white Gaussian noise of different amplitudes. Short-time Fourier transforms with half-overlapping windows were used in the experiments. These windows are square roots of Hanning windows of size 5 ms for Piano and Mozart, 3 ms for Centuria and 2 ms for Tête. (Footnote 1)

Footnote 1: The audio denoising examples are available online at ?????.

3.4.1 Performance Comparison

Table 2 compares the performance in terms of SNR for block thresholding (block lengths and widths are discussed in the next section), for the Ephraim and Malah suppression rule equipped with the decision-directed SNR estimator [12], and for hard thresholding. Two levels of noise removal have been used for the block thresholding and the Ephraim and Malah method. For the partial noise removal level (P), both methods were calibrated to retain a residual noise of similar energy: we chose $a_0 = 0.5$ in (27) for block thresholding and $\alpha = 0.98$ in (22) for the Ephraim and Malah method. To achieve the maximum noise removal level (M), we chose $a_0 = 0$ and $\alpha = 0.999$. For hard thresholding, the threshold was set equal to $3\sigma$, where $\sigma^2$ is the noise variance.

Table 2: Performance comparison. Top: Mozart with different SNR. Bottom: Piano, Centuria and Tête with 1 dB SNR. From left to right: hard thresholding, block thresholding (with partial (P) and maximum (M) noise removal), Ephraim and Malah suppression rule equipped with the decision-directed SNR estimator (with partial (P) and maximum (M) noise removal levels). [The numerical entries were not recovered in this transcription.]

With the partial noise removal level (P), in both methods, the residual noise masks the musical noise; however, block thresholding introduces less signal distortion, as reflected by the systematic 2 dB SNR improvement. With the maximum noise removal level (M), the musical noise cannot be masked by the residual noise since there is nearly no residual noise left. Whereas block thresholding hardly produces any musical noise, the Ephraim and Malah method results in noticeable musical noise, especially when the SNR of the noisy signal is small (Mozart at 0 dB and 5 dB).
Note that the Ephraim and Malah method sometimes produces a resonance artifact, as if the sound were coming from far away. Such artifacts are especially strong for speech signals when $\alpha$ in the decision-directed SNR estimator (22) is close to 1, which leads to a temporal window that decreases very slowly. Block thresholding does not create such artifacts.

Table 2 shows that a hard thresholding produces a smaller SNR than block thresholding (for both levels (P) and (M)). It also produces a very strong musical noise. Fig. 3 displays the different attenuation coefficient maps for the Tête signal. It shows that the block thresholding coefficients (Fig. 3(c)) are closer to the oracle coefficients (Fig. 3(f)) than the hard thresholding coefficients (Fig. 3(b)). Moreover, the block thresholding coefficient map is much more regular than the hard thresholding one. This gives a visual confirmation that block thresholding produces less signal distortion than hard thresholding.

Note that the block thresholding scheme can also be implemented with half-overlapping blocks to further regularize the estimator. It is equivalent to computing 4 block thresholding estimators with blocks shifted by $L/2$ in time and/or by $W/2$ in frequency and then averaging the 4 signal estimations. It leads to a 0.2 dB SNR improvement over the standard block thresholding with non-overlapping blocks, which is not much given the significant increase in computational complexity.

Figure 3: (a) Log-spectrogram of Tête. Attenuation coefficients of hard thresholding in (b), block thresholding in (c), adaptive block thresholding in (d), adaptive block thresholding with the empirical Wiener shrinkage as a post-processing in (e) and attenuation with oracle in (f). Values of attenuation coefficients range from 1 (black) to 0 (white).

3.4.2 Block Sizes in Block Thresholding

The block thresholding results presented in Table 2 are obtained with optimal block sizes that maximize the SNR among block lengths $L = 8, 4, 2$ in time and block widths $W = 16, 8, 4, 2, 1$ in frequency. Optimal block sizes are respectively $(L, W) = (4, 1)$ for Piano, $(L, W) = (8, 1)$ for Mozart, $(L, W) = (8, 16)$ for Centuria and $(L, W) = (4, 8)$ for Tête. Since the noise is white and thus uniform in time and frequency, (15) shows that the optimal block size and shape depend upon the time-frequency spread of the signal components. Within the block size family previously mentioned, there is a difference of more than 2 dB SNR between the best and worst block sizes.

Block sizes could also be adapted to different signal parts. Fig. 4 zooms on the onset of the Mozart signal whose log-spectrogram is illustrated in Fig. 1(b). As shown in Figs. 4(a) and (b), at the beginning of the harmonics, blocks of large attenuation factors spread beyond the onset of the signal. Fig. 4(b') illustrates the horizontal blocks at the onsets marked in Figs. 4(a) and (b). This produces a pre-echo artifact (see footnote 2) in the denoised signal. In the time interval where the blocks extend before the signal onset, little attenuation is performed and the noise is not eliminated; consequently a sound is heard before the very beginning of the original signal. A smaller block size would shorten this time interval and thus reduce this pre-echo artifact.

Figure 4: Zoom on the onset of Mozart. (a) Log-spectrogram. Attenuation coefficients of block thresholding in (b) and adaptive block thresholding in (c). Values of attenuation coefficients range from 1 (black) to 0 (white). (b') and (c') illustrate respectively the block partition with block thresholding and adaptive block thresholding at the onset marked in (b) and (c).
4 Adaptive Block Thresholding

An adaptive block thresholding adapts block sizes to the time-frequency signal properties by minimizing an estimate of the risk. Appropriate block sizes reduce pre-echo artifacts (as described in Section 3.4.2) and improve the SNR.

Footnote 2: We call this artifact pre-echo though, originally, pre-echo corresponds to a psychoacoustic phenomenon where an unusually noticeable artifact is heard in a sound recording from the energy of time domain transients smeared backwards in time after processing in the frequency domain due to the Gibbs phenomenon.

4.1 SURE of Block Thresholding Estimator

The best choice of block sizes minimizes the estimation risk $E\{\|\hat f - f\|^2\}$. This risk cannot be calculated since $f$ is unknown, but it can be estimated with a Stein Unbiased Risk Estimate (SURE) [23]. Best block sizes are computed by minimizing this estimated risk.

SURE is an estimate of the risk of an arbitrary estimator $\hat Y$ of the mean vector $Y$ of a multivariate normal random vector $X$ having an identity covariance matrix. Since it is unbiased, $E\{\mathrm{SURE}\} = E\|\hat Y - Y\|^2$.

Theorem (Stein Unbiased Risk Estimate, SURE). Let $X = (x_1, \ldots, x_p)$ be a multivariate normal random vector of dimension $p$ with mean $Y$ and an identity covariance matrix. Let $X + h(X)$ be an estimate of $Y$, where $h = (h_1, \ldots, h_p) : \mathbb{R}^p \to \mathbb{R}^p$ is almost differentiable ($h_i : \mathbb{R}^p \to \mathbb{R}$, $\forall i$). Define $\nabla \cdot h = \sum_{i=1}^{p} \frac{\partial h_i}{\partial x_i}$. If $E\left\{ \sum_{i=1}^{p} \left| \frac{\partial h_i}{\partial x_i}(X) \right| \right\} < \infty$, then

$$E\|X + h(X) - Y\|^2 = p + E\left\{ \|h(X)\|^2 + 2\, \nabla \cdot h(X) \right\}. \tag{28}$$

So

$$\mathrm{SURE} := p + \|h(X)\|^2 + 2\, \nabla \cdot h(X) \tag{29}$$

is an unbiased estimate of the risk of $X + h(X)$, called the Stein Unbiased Risk Estimate (SURE) [23].

The proof of (28) is essentially based on the fact that $\varphi'(y) = -y\, \varphi(y)$, where $\varphi(y)$ is the standard normal density [23].

Following the approach of Cai [3, 5], one can apply the SURE estimator to compute the risk of a block thresholding estimator. The Gaussian noise coefficients are uncorrelated and hence independent. Let us normalize the observed data, $z_F[m] = y_F[m] / \sigma[m]$, $m \in \Gamma$, so that the normalized noise has an identity covariance matrix. Applying the SURE to the block thresholding estimator (17) on a block $B_k$ of size $p = B_k^\#$, one has

$$h_m(X) = -\frac{\lambda}{\bar z_{F,k}^2}\, z_F[m]\, \mathbf{1}_{\bar z_{F,k}^2 > \lambda} - z_F[m]\, \mathbf{1}_{\bar z_{F,k}^2 \le \lambda}, \quad m \in B_k, \tag{30}$$

where $\bar z_{F,k}^2 = \frac{1}{B_k^\#} \sum_{m \in B_k} |z_F[m]|^2$. Applying (29), one gets $\mathrm{SURE}_{B_k}$ for a block thresholding estimator:

$$\mathrm{SURE}_{B_k} = B_k^\# + \frac{\lambda^2 B_k^\# - 2\lambda (B_k^\# - 2)}{\bar z_{F,k}^2}\, \mathbf{1}_{\bar z_{F,k}^2 > \lambda} + B_k^\# (\bar z_{F,k}^2 - 2)\, \mathbf{1}_{\bar z_{F,k}^2 \le \lambda}. \tag{31}$$

Since SURE is unbiased, $E\{\mathrm{SURE}_{B_k}\} = E\left\{ \sum_{m \in B_k} |f_F[m] - \hat f_F[m]|^2 \right\}$, where the risk is measured on the normalized coefficients.
When the noise is Gaussian white, orthogonal coefficients are independent. For a tight frame this hypothesis is not valid, but (31) still applies approximately because a tight frame behaves similarly to a union of orthogonal bases.

One can verify that the variance of $\frac{1}{B_k^\#} \mathrm{SURE}_{B_k}$ is approximately proportional to $\frac{1}{B_k^\#}$. When the blocks are small it is necessary to reduce this variance by averaging over several blocks $B_k$ inside a macroblock $M$: $\mathrm{SURE}_M = \sum_{k \in M} \mathrm{SURE}_{B_k}$. Let $M^\#$ be the number of coefficients in all the blocks included in $M$; then $\frac{1}{M^\#} \mathrm{SURE}_M$ has a variance proportional to $\frac{1}{M^\#}$.
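The closed form (31) is straightforward to evaluate on one block of normalized coefficients. The sketch below is a minimal transcription of that formula under its own assumptions (coefficients already divided by $\sigma[m]$, treated as real with unit noise variance); the function name `sure_block` is hypothetical.

```python
def sure_block(z, lam):
    """SURE (31) of the block shrinkage (17) on one block of normalized
    coefficients z (noise variance 1), for a thresholding parameter lam."""
    p = len(z)
    mean_energy = sum(abs(v) ** 2 for v in z) / p
    if mean_energy > lam:
        # Block kept and shrunk by (1 - lam / mean_energy).
        return p + (lam ** 2 * p - 2.0 * lam * (p - 2)) / mean_energy
    # Block set to zero: p + p * (mean_energy - 2).
    return p * (mean_energy - 1.0)

kept = sure_block([2.0, 2.0, 2.0, 2.0], 2.0)
zeroed = sure_block([0.5, 0.5, 0.5, 0.5], 2.0)
```

The second branch can be negative for a single block, which is expected of an unbiased estimate with a non-negligible variance; this is the reason for summing it over a macroblock as described above.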

The adaptive block thresholding groups coefficients in blocks whose sizes are adjusted to minimize SURE, and it attenuates coefficients in those blocks. The blocks $B_k$ are sets of coefficients that are not necessarily connected or rectangular. In the following, by block size we mean a choice of block shape and size among a collection of possibilities. In this adaptive grouping procedure, neighboring coefficients $y_F[m]$ are grouped in disjoint macroblocks $M_j$, $j = 1, 2, \ldots, J$. A macroblock $M_j$ can be segmented in blocks $B_k$ of the same size $B^\#(j)$. Several such segmentations are possible and we want to choose the one that leads to the smallest risk estimated with SURE. The optimal block size $B^\#(j)$ for the blocks $B_k$ in $M_j$ is calculated by minimizing the SURE in $M_j$, i.e.,

$$B^\#(j) = \arg\min_{B^\#} \mathrm{SURE}_{M_j} = \arg\min_{B^\#} \sum_{k \in M_j} \mathrm{SURE}_{B_k}, \quad j = 1, 2, \ldots, J. \tag{32}$$

To reduce its variance, SURE is calculated over blocks of identical size imposed in each macroblock. The macroblock size should not be too large, in order to maintain enough adaptivity in the evolution of block sizes. Once the block sizes are computed, coefficients in each $B_k$ are attenuated with (17), where $\lambda$ is calculated with (20).

4.2 Adaptive Block Thresholding in Short-Time Fourier Frames

The time-frequency adaptive block thresholding is applied directly to short-time Fourier frames. In numerical experiments each macroblock is segmented with 15 possible block sizes $B^\# = L \times W$, combining block lengths $L = 8, 4, 2$ and block widths $W = 16, 8, 4, 2, 1$. The thresholding parameter $\lambda$ is calculated with (20). The size of macroblocks is set equal to the maximum block size $B^\#_{\max} = 8 \times 16$. Fig. 5 illustrates different segmentations of these macroblocks into time-frequency blocks of the same size.
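The block size selection (32) inside one macroblock can be sketched as an exhaustive search over the 15 candidate sizes. The code below is an illustration under stated assumptions: the macroblock is a list of rows indexed as `macro[time][frequency]` holding normalized coefficients, `lam_of` is a hypothetical callable returning the thresholding level for a given $(L, W)$ (in the paper these levels come from (20)), and `sure_block` repeats the closed form (31).

```python
def sure_block(z, lam):
    """SURE (31) for one block of normalized coefficients (noise variance 1)."""
    p = len(z)
    mean_energy = sum(abs(v) ** 2 for v in z) / p
    if mean_energy > lam:
        return p + (lam ** 2 * p - 2.0 * lam * (p - 2)) / mean_energy
    return p * (mean_energy - 1.0)

def best_block_size(macro, lam_of, lengths=(8, 4, 2), widths=(16, 8, 4, 2, 1)):
    """Return the (L, W) that minimizes the summed SURE over the macroblock (32)."""
    n_time, n_freq = len(macro), len(macro[0])
    best = None
    for L in lengths:
        for W in widths:
            if n_time % L or n_freq % W:
                continue  # this size does not tile the macroblock
            lam = lam_of(L, W)
            total = 0.0
            for t0 in range(0, n_time, L):
                for f0 in range(0, n_freq, W):
                    block = [macro[t][f]
                             for t in range(t0, t0 + L)
                             for f in range(f0, f0 + W)]
                    total += sure_block(block, lam)
            if best is None or total < best[0]:
                best = (total, L, W)
    return best[1], best[2]

# Toy macroblock: a single stationary "harmonic" at frequency index 3.
macro = [[0.0] * 16 for _ in range(8)]
for t in range(8):
    macro[t][3] = 10.0
choice = best_block_size(macro, lambda L, W: 2.0)  # constant lam: an assumption
```

On this toy input the search selects long and narrow blocks, which matches the behavior reported for harmonic signals; with real data the outcome also depends on how $\lambda$ varies with the block size.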
Experiments have been performed on the same audio signals as in Subsection 3.4, with 1 dB SNR, with the same short-time Fourier frames and with the maximum noise removal level (M), i.e., with the corresponding setting of a in (27). The first two columns of Table 3 compare the SNR of the adaptive block thresholding with that of the block thresholding using an optimal fixed block size obtained with an oracle. For three out of the four signals, the adaptive block thresholding improves the SNR relative to the optimal fixed-size block thresholding. With Piano the SNR improvement is as high as 0.5 dB. With Mozart, the result is the second best among the 15 block size candidates and 0.25 dB below the result obtained with the optimal block size.

As shown in Figs. 4(c)(c'), compared with Figs. 4(b)(b'), in the first part of Mozart the adaptive block method chooses blocks of shorter length L that hardly exceed the onset of the signal. This considerably reduces the pre-echo artifact discussed earlier. After the onset, the adaptive block method chooses narrow horizontal blocks, of the same width as the non-adaptive method, that are able to capture the harmonic structure of the signal.

Figure 5: Partition of macroblocks into blocks of different sizes.

Table 3: Performance comparison between the block thresholding with the optimal fixed block size, the adaptive block thresholding and the adaptive block thresholding with the empirical Wiener shrinkage as a post-processing. Columns: Block Thresholding with Optimal Fixed Size; Adaptive Block Thresholding; Adaptive Block Thresholding with Empirical Wiener Shrinkage as Post-processing. Rows: Piano, Mozart, Centuria, Tête.

5 Post-processing: Empirical Wiener Shrinkage

As a post-processing, an empirical Wiener shrinkage [14] is cascaded after the adaptive block thresholding. It allows a more flexible and accurate attenuation decision while inheriting the time-frequency regularization of the estimate from the adaptive block thresholding. The basic idea is to use the denoised signal as if it were the clean signal. Let $\hat f$ denote the denoised signal obtained by the adaptive block thresholding algorithm and $\hat f_F[m] = \langle \hat f, g_m \rangle$. An empirical Wiener shrinkage is a diagonal thresholding with attenuation coefficients defined as in (4):

$$a[m] = \frac{|\hat f_F[m]|^2}{|\hat f_F[m]|^2 + \sigma^2}. \qquad (33)$$

Table 3 shows that the empirical Wiener shrinkage used as a post-processing brings an SNR improvement of 0.25 dB on average and 0.5 dB on Mozart. Audio improvement due to the post-processing includes less distortion of the underlying signals and further removal of the musical noise.
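The attenuation rule (33) is simple enough to sketch directly. The following is a minimal illustration, assuming the time-frequency coefficients of the noisy signal and of the first-pass denoised signal are already available as arrays of the same shape; the function name `empirical_wiener` and the argument names are hypothetical, and applying the factor to the noisy coefficients follows the usual convention for diagonal estimators.

```python
import numpy as np

def empirical_wiener(noisy_coeffs, pilot_coeffs, sigma2):
    """Attenuate noisy coefficients with factors computed from a pilot estimate.

    noisy_coeffs : coefficients of the noisy signal
    pilot_coeffs : coefficients of the first-pass denoised signal,
                   used as if it were the clean signal
    sigma2       : noise variance (scalar or array broadcastable to the
                   coefficients), assumed strictly positive
    """
    pilot_energy = np.abs(pilot_coeffs) ** 2
    attenuation = pilot_energy / (pilot_energy + sigma2)   # rule (33)
    return attenuation * noisy_coeffs, attenuation
```

Where the pilot estimate is zero the coefficient is removed entirely, and where the pilot energy dominates the noise variance the coefficient is left almost unchanged, so the time-frequency regularity of the first pass carries over to the attenuation map.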

Fig. 3(e) displays the attenuation coefficient map of the empirical Wiener shrinkage. It maintains the same time-frequency regularity as the adaptive block thresholding (Fig. 3(d)), and its coefficients are closer to the oracle coefficients (Fig. 3(f)).

6 Conclusion

A diagonal thresholding of spectrogram coefficients is unsuitable for audio signal denoising because it produces too much musical noise. This paper describes a time-frequency block thresholding which produces hardly any musical noise and improves the SNR relative to state-of-the-art methods such as the Ephraim and Malah estimators. A block thresholding groups time-frequency signal coefficients in blocks and then attenuates the coefficients in each block. This block grouping regularizes the estimation and contributes to the elimination of the musical noise. The block size can also be adapted to the signal properties by minimizing a SURE estimator of the block thresholding risk. For audio signals this adaptation reduces distortions such as pre-echo artifacts.

References

[1] M. Berouti, R. Schwartz, J. Makhoul, Enhancement of speech corrupted by acoustic noise, Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP, Vol. 4.
[2] S. Boll, Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoust., Speech, Signal Process., ASSP-27.
[3] T. Cai, Adaptive wavelet estimation: a block thresholding and oracle inequality approach, Ann. Statist., 27.
[4] T. Cai and B.W. Silverman, Incorporating information on neighboring coefficients into wavelet estimation, Sankhya, 63, 2001.
[5] T. Cai and H. Zhou, A data-driven block thresholding approach to wavelet estimation, Technical Report, Statistics Department, University of Pennsylvania, 2005.
[6] O. Cappé, Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor, IEEE Trans. Speech and Audio Processing, vol. 2, Apr.
[7] I. Cohen, Speech enhancement using a noncausal a priori SNR estimator, IEEE Signal Processing Letters, vol. 11, issue 9, Sept. 2004.
[8] I. Cohen, Enhancement of speech using Bark-scaled wavelet packet decomposition, Eurospeech, 2001, Scandinavia.
[9] R.R. Coifman, D.L. Donoho, Translation-invariant de-noising.
[10] I. Daubechies, A. Grossmann, Y. Meyer, Painless nonorthogonal expansions, J. Math. Phys., Vol. 27, No. 5, 1986.
[11] D. Donoho and I. Johnstone, Ideal spatial adaptation via wavelet shrinkage, Biometrika, vol. 81.
[12] Y. Ephraim, D. Malah, Speech enhancement using a minimum mean square error short-time spectral amplitude estimator, IEEE Trans. Acoust., Speech, Signal Process., 32 (6), Dec.
[13] Y. Ephraim and D. Malah, Speech enhancement using a minimum mean square error log-spectral amplitude estimator, IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, Apr.
[14] S. Ghael, A. Sayeed and R. Baraniuk, Improved wavelet denoising via empirical Wiener filtering, Proceedings of SPIE, Mathematical Imaging, San Diego, July.
[15] P. Hall, G. Kerkyacharian and D. Picard, A note on the wavelet oracle, Statistics and Probability Letters, 43.
[16] P. Hall, G. Kerkyacharian and D. Picard, Block threshold rules for curve estimation using kernel and wavelet methods, Ann. Statist., 26.
[17] P. Hall, G. Kerkyacharian and D. Picard, On the minimax optimality of block thresholded wavelet estimators, Statistica Sinica, 9, 33-50.
[18] J.S. Lim and A.V. Oppenheim, Enhancement and bandwidth compression of noisy speech, Proc. of the IEEE, vol. 67, Dec.
[19] S. Mallat, A Wavelet Tour of Signal Processing, 2nd edition, New York: Academic.
[20] R.J. McAulay and M.L. Malpass, Speech enhancement using soft decision noise suppression filter, IEEE Trans. Acoust., Speech, Signal Process., ASSP-28, 1980.
[21] H. Sheikhzadeh and H. R. Abutalebi, An improved wavelet-based speech enhancement system, EUROSPEECH, 2001.
[22] C. Stein and W. James, Estimation with quadratic loss, Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1 (Berkeley, University of California Press).
[23] C. Stein, Estimation of the mean of a multivariate normal distribution, Ann. Statist., 198.
[24] J. S. Walker, Denoising Gabor transforms, submitted.
[25] P. J. Wolfe and S. J. Godsill, Simple alternatives to the Ephraim and Malah suppression rule for speech enhancement, IEEE Workshop on Statistical Signal Processing, Aug. 2001.
[26] G. Yu, E. Bacry and S. Mallat, Audio signal denoising with complex wavelets and adaptive block attenuation, to appear in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hawaii, 2007.


More information

Digital Image Processing Lectures 15 & 16

Digital Image Processing Lectures 15 & 16 Lectures 15 & 16, Professor Department of Electrical and Computer Engineering Colorado State University CWT and Multi-Resolution Signal Analysis Wavelet transform offers multi-resolution by allowing for

More information

Two Denoising Methods by Wavelet Transform

Two Denoising Methods by Wavelet Transform IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 47, NO. 12, DECEMBER 1999 3401 Two Denoising Methods by Wavelet Transform Quan Pan, Lei Zhang, Guanzhong Dai, and Hongcai Zhang Abstract Two wavelet-based noise

More information

Digital Signal Processing

Digital Signal Processing Digital Signal Processing 0 (010) 157 1578 Contents lists available at ScienceDirect Digital Signal Processing www.elsevier.com/locate/dsp Improved minima controlled recursive averaging technique using

More information

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated

More information

COMPLEX WAVELET TRANSFORM IN SIGNAL AND IMAGE ANALYSIS

COMPLEX WAVELET TRANSFORM IN SIGNAL AND IMAGE ANALYSIS COMPLEX WAVELET TRANSFORM IN SIGNAL AND IMAGE ANALYSIS MUSOKO VICTOR, PROCHÁZKA ALEŠ Institute of Chemical Technology, Department of Computing and Control Engineering Technická 905, 66 8 Prague 6, Cech

More information

A priori SNR estimation and noise estimation for speech enhancement

A priori SNR estimation and noise estimation for speech enhancement Yao et al. EURASIP Journal on Advances in Signal Processing (2016) 2016:101 DOI 10.1186/s13634-016-0398-z EURASIP Journal on Advances in Signal Processing RESEARCH A priori SNR estimation and noise estimation

More information

CEPSTRAL analysis has been widely used in signal processing

CEPSTRAL analysis has been widely used in signal processing 162 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 7, NO. 2, MARCH 1999 On Second-Order Statistics and Linear Estimation of Cepstral Coefficients Yariv Ephraim, Fellow, IEEE, and Mazin Rahim, Senior

More information

A Multi-window Fractional Evolutionary Spectral Analysis

A Multi-window Fractional Evolutionary Spectral Analysis A Multi-window Fractional Evolutionary Spectral Analysis YALÇIN ÇEKİÇ, AYDIN AKAN, and MAHMUT ÖZTÜRK University of Bahcesehir, Department of Electrical and Electronics Engineering Bahcesehir, 49, Istanbul,

More information

UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS

UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS F. C. Nicolls and G. de Jager Department of Electrical Engineering, University of Cape Town Rondebosch 77, South

More information

A NO-REFERENCE SHARPNESS METRIC SENSITIVE TO BLUR AND NOISE. Xiang Zhu and Peyman Milanfar

A NO-REFERENCE SHARPNESS METRIC SENSITIVE TO BLUR AND NOISE. Xiang Zhu and Peyman Milanfar A NO-REFERENCE SARPNESS METRIC SENSITIVE TO BLUR AND NOISE Xiang Zhu and Peyman Milanfar Electrical Engineering Department University of California at Santa Cruz, CA, 9564 xzhu@soeucscedu ABSTRACT A no-reference

More information

An Overview of Sparsity with Applications to Compression, Restoration, and Inverse Problems

An Overview of Sparsity with Applications to Compression, Restoration, and Inverse Problems An Overview of Sparsity with Applications to Compression, Restoration, and Inverse Problems Justin Romberg Georgia Tech, School of ECE ENS Winter School January 9, 2012 Lyon, France Applied and Computational

More information

Simultaneous Multi-frame MAP Super-Resolution Video Enhancement using Spatio-temporal Priors

Simultaneous Multi-frame MAP Super-Resolution Video Enhancement using Spatio-temporal Priors Simultaneous Multi-frame MAP Super-Resolution Video Enhancement using Spatio-temporal Priors Sean Borman and Robert L. Stevenson Department of Electrical Engineering, University of Notre Dame Notre Dame,

More information

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction

More information

A Priori SNR Estimation Using a Generalized Decision Directed Approach

A Priori SNR Estimation Using a Generalized Decision Directed Approach A Priori SNR Estimation Using a Generalized Decision Directed Approach Aleksej Chinaev, Reinhold Haeb-Umbach Department of Communications Engineering, Paderborn University, 3398 Paderborn, Germany {chinaev,haeb}@nt.uni-paderborn.de

More information

Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum

Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum Downloaded from vbn.aau.dk on: marts 31, 2019 Aalborg Universitet Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum Karimian-Azari, Sam; Mohammadiha, Nasser; Jensen, Jesper

More information

arxiv:math/ v1 [math.na] 12 Feb 2005

arxiv:math/ v1 [math.na] 12 Feb 2005 arxiv:math/0502252v1 [math.na] 12 Feb 2005 An Orthogonal Discrete Auditory Transform Jack Xin and Yingyong Qi Abstract An orthogonal discrete auditory transform (ODAT) from sound signal to spectrum is

More information

Acoustic MIMO Signal Processing

Acoustic MIMO Signal Processing Yiteng Huang Jacob Benesty Jingdong Chen Acoustic MIMO Signal Processing With 71 Figures Ö Springer Contents 1 Introduction 1 1.1 Acoustic MIMO Signal Processing 1 1.2 Organization of the Book 4 Part I

More information

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is Multinomial Data The multinomial distribution is a generalization of the binomial for the situation in which each trial results in one and only one of several categories, as opposed to just two, as in

More information

Low-Complexity Image Denoising via Analytical Form of Generalized Gaussian Random Vectors in AWGN

Low-Complexity Image Denoising via Analytical Form of Generalized Gaussian Random Vectors in AWGN Low-Complexity Image Denoising via Analytical Form of Generalized Gaussian Random Vectors in AWGN PICHID KITTISUWAN Rajamangala University of Technology (Ratanakosin), Department of Telecommunication Engineering,

More information

PARAMETRIC coding has proven to be very effective

PARAMETRIC coding has proven to be very effective 966 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 3, MARCH 2007 High-Resolution Spherical Quantization of Sinusoidal Parameters Pim Korten, Jesper Jensen, and Richard Heusdens

More information

Sparse & Redundant Signal Representation, and its Role in Image Processing

Sparse & Redundant Signal Representation, and its Role in Image Processing Sparse & Redundant Signal Representation, and its Role in Michael Elad The CS Department The Technion Israel Institute of technology Haifa 3000, Israel Wave 006 Wavelet and Applications Ecole Polytechnique

More information

A Lower Bound Theorem. Lin Hu.

A Lower Bound Theorem. Lin Hu. American J. of Mathematics and Sciences Vol. 3, No -1,(January 014) Copyright Mind Reader Publications ISSN No: 50-310 A Lower Bound Theorem Department of Applied Mathematics, Beijing University of Technology,

More information

Multivariate Bayes Wavelet Shrinkage and Applications

Multivariate Bayes Wavelet Shrinkage and Applications Journal of Applied Statistics Vol. 32, No. 5, 529 542, July 2005 Multivariate Bayes Wavelet Shrinkage and Applications GABRIEL HUERTA Department of Mathematics and Statistics, University of New Mexico

More information

Wavelet Shrinkage for Nonequispaced Samples

Wavelet Shrinkage for Nonequispaced Samples University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Wavelet Shrinkage for Nonequispaced Samples T. Tony Cai University of Pennsylvania Lawrence D. Brown University

More information

MULTI-SCALE IMAGE DENOISING BASED ON GOODNESS OF FIT (GOF) TESTS

MULTI-SCALE IMAGE DENOISING BASED ON GOODNESS OF FIT (GOF) TESTS MULTI-SCALE IMAGE DENOISING BASED ON GOODNESS OF FIT (GOF) TESTS Naveed ur Rehman 1, Khuram Naveed 1, Shoaib Ehsan 2, Klaus McDonald-Maier 2 1 Department of Electrical Engineering, COMSATS Institute of

More information

THE quintessential goal of statistical estimation is to

THE quintessential goal of statistical estimation is to I TRANSACTIONS ON INFORMATION THORY, VOL. 45, NO. 7, NOVMBR 1999 2225 On Denoising and Best Signal Representation Hamid Krim, Senior Member, I, Dewey Tucker, Stéphane Mallat, Member, I, and David Donoho

More information

Wavelet Based Image Denoising Technique

Wavelet Based Image Denoising Technique (IJACSA) International Journal of Advanced Computer Science and Applications, Wavelet Based Image Denoising Technique Sachin D Ruikar Dharmpal D Doye Department of Electronics and Telecommunication Engineering

More information

A Data-Driven Block Thresholding Approach To Wavelet Estimation

A Data-Driven Block Thresholding Approach To Wavelet Estimation A Data-Driven Block Thresholding Approach To Wavelet Estimation T. Tony Cai 1 and Harrison H. Zhou University of Pennsylvania and Yale University Abstract A data-driven block thresholding procedure for

More information