Digital Image Processing Lectures 15 & 16

Lectures 15 & 16, Professor Department of Electrical and Computer Engineering Colorado State University

CWT and Multi-Resolution Signal Analysis

The wavelet transform offers multi-resolution analysis by allowing the time and frequency resolutions to vary. This is done by using analysis (bandpass) filters with constant Q, i.e. $\Delta\omega / \omega_0 = \text{constant}$ (e.g., the frequency response of the cochlea in the inner ear, which offers better auditory perception). Thus, as the frequency increases, $\Delta\omega$ also increases, which (by the Heisenberg inequality) allows a smaller time resolution $\Delta t$. That is, the time resolution $\Delta t$ becomes arbitrarily good at high frequencies, while the frequency resolution becomes arbitrarily good at low frequencies. For example, two very close short time bursts can always be separated by going up to a higher analysis frequency to improve the time resolution.

The CWT accomplishes exactly this, with an added simplification: all the basis functions are defined as scaled (i.e. stretched or compressed) versions of a prototype (mother wavelet) $\Psi(t)$, i.e.

$$\Psi_a(t) = \frac{1}{\sqrt{a}}\,\Psi\!\left(\frac{t}{a}\right)$$

where $a$ is called the scaling or dilation factor and $1/\sqrt{a}$ is used for energy normalization. A small $a < 1$ implies narrow time windows for detecting high-frequency activity (zoom-in), while a large $a > 1$ widens the window for low-frequency analysis (zoom-out). The CWT is then defined by

$$X_{CWT}(\tau, a) = \frac{1}{\sqrt{a}} \int x(t)\,\Psi^{*}\!\left(\frac{t-\tau}{a}\right) dt = \int x(t)\,\Psi_a^{*}(t-\tau)\,dt$$

or, equivalently (substituting $t \to at$),

$$X_{CWT}(\tau, a) = \sqrt{a} \int x(at)\,\Psi^{*}\!\left(t - \frac{\tau}{a}\right) dt$$

From the second equation, a very large $a$ means that a strongly contracted version of the signal, $x(at)$, is seen through a constant filter (global view), while a very small $a$ gives a detailed view.

Remarks and Properties of CWT

1. A scale change of a continuous signal (e.g. $x(at)$) does not change its resolution, since it can be reversed. For discrete signals, increasing the scale involves subsampling (e.g., $x(an)$ for $a > 1$) and hence a reduction of resolution. Decreasing the scale (upsampling) can be undone and therefore does not change the resolution.
2. In contrast to the STFT, the wavelet basis functions are no longer linked to frequency modulation; instead they are scaled versions of a prototype (mother) wavelet.
3. The function $\Psi(t)$ should satisfy the following conditions in order to qualify as a mother wavelet. $\Psi(t)$ is absolutely integrable and square integrable (finite energy):

$$\int |\Psi(t)|\,dt < \infty, \qquad \int |\Psi(t)|^{2}\,dt < \infty$$

If $\Psi(\omega)$ is the FT of $\Psi(t)$, the admissibility condition, which is required for the inverse operation, is

$$C = \int \frac{|\Psi(\omega)|^{2}}{|\omega|}\,d\omega < \infty$$

Since $\Psi(\omega)$ is a continuous function, finiteness of $C$ implies that $\Psi(0) = \int \Psi(t)\,dt = 0$ (the reason $\Psi(t)$ is called a wavelet).

4. If the above conditions are satisfied, the inverse operation or synthesis equation is

$$x(t) = \frac{1}{C} \iint X_{CWT}(\tau, a)\,\Psi_a(t-\tau)\,\frac{da\,d\tau}{a^{2}}$$

i.e. any signal $x(t)$ can be written as a superposition of shifted and dilated wavelets. Here, $C$ is a constant that depends on the choice of $\Psi(t)$.

5. The CWT is unitary:

$$\frac{1}{C} \iint |X_{CWT}(\tau, a)|^{2}\,\frac{da\,d\tau}{a^{2}} = \int |x(t)|^{2}\,dt$$

i.e. it preserves the signal energy. Thus, by analogy with the spectrogram of the STFT, $|X_{CWT}(\tau, a)|^{2}$ is called the scalogram; it gives the energy distribution of the signal in the time-scale plane. In contrast to the spectrogram, the energy of the signal is distributed at multiple resolutions. The following figures show the CWT (using the Morlet wavelet $\Psi(t) = \frac{1}{\sqrt{2\pi}}\,e^{j\omega_0 t}\,e^{-t^{2}/2}$) of the two previous STFT examples. The multi-resolution property is clearly evident ($f \propto 1/a$).
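The scalogram can be approximated numerically by replacing the CWT integral with a Riemann sum. The following is a minimal sketch, not an optimized implementation: the function names `morlet` and `cwt_direct`, the value $\omega_0 = 6$, the sampling step, and the test signal (a 2 Hz cosine) are all assumptions made for illustration.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Morlet wavelet from the text: (1/sqrt(2*pi)) * exp(j*w0*t) * exp(-t^2/2)."""
    return np.exp(1j * w0 * t) * np.exp(-t**2 / 2.0) / np.sqrt(2.0 * np.pi)

def cwt_direct(x, t, scales, w0=6.0):
    """Riemann-sum approximation of X_CWT(tau, a), evaluated at tau = t."""
    dt = t[1] - t[0]
    out = np.empty((len(scales), len(t)), dtype=complex)
    for i, a in enumerate(scales):
        # arg[m, n] = (t_n - tau_m) / a; rows index the shift tau
        arg = (t[None, :] - t[:, None]) / a
        out[i] = (np.conj(morlet(arg, w0)) @ x) * dt / np.sqrt(a)
    return out

# Assumed test signal: a cosine at angular frequency w = 2*pi*2 rad/s
t = np.arange(0.0, 20.0, 0.02)
w = 2.0 * np.pi * 2.0
x = np.cos(w * t)

scales = np.linspace(0.2, 1.0, 41)
scalogram = np.abs(cwt_direct(x, t, scales)) ** 2

# Scale of maximum energy at the middle of the record (away from the edges)
a_peak = scales[np.argmax(scalogram[:, len(t) // 2])]
```

For a pure tone the energy concentrates near the scale $a \approx \omega_0/\omega$, which is the $f \propto 1/a$ relation noted above.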

Besides linearity, the CWT possesses the following properties.

6. Shifting property: If $x(t)$ has CWT $X_{CWT}(\tau, a)$, then the CWT of the shifted signal $y(t) = x(t - \tau')$ is

$$Y_{CWT}(\tau, a) = X_{CWT}(\tau - \tau', a)$$

i.e. invariance to shift.

7. Scaling property: If $x(t)$ has CWT $X_{CWT}(\tau, a)$, then the CWT of the scaled signal $y(t) = \frac{1}{\sqrt{b}}\,x(t/b)$ is

$$Y_{CWT}(\tau, a) = X_{CWT}\!\left(\frac{\tau}{b}, \frac{a}{b}\right)$$

8. Time localization property: Let $x(t) = \delta(t - t_0)$; then the CWT is

$$X_{CWT}(\tau, a) = \frac{1}{\sqrt{a}} \int \Psi^{*}\!\left(\frac{t-\tau}{a}\right) \delta(t - t_0)\,dt = \frac{1}{\sqrt{a}}\,\Psi^{*}\!\left(\frac{t_0 - \tau}{a}\right)$$

which, as a function of $\tau$, is the scaled mother wavelet reversed in time and located at $t_0$, i.e. perfect localization.
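The shifting and scaling properties both follow from a change of variable in the CWT integral. The derivation below is a sketch that assumes $a > 0$, $b > 0$ and the $1/\sqrt{a}$ normalization; the symbols $\tau'$ and $u$ are introduced here only for illustration.

```latex
% Shifting: y(t) = x(t - \tau'), substitute u = t - \tau'
Y_{CWT}(\tau, a)
  = \frac{1}{\sqrt{a}} \int x(t - \tau')\, \Psi^{*}\!\left(\frac{t - \tau}{a}\right) dt
  = \frac{1}{\sqrt{a}} \int x(u)\, \Psi^{*}\!\left(\frac{u - (\tau - \tau')}{a}\right) du
  = X_{CWT}(\tau - \tau', a)

% Scaling: y(t) = b^{-1/2} x(t/b), substitute u = t/b (so dt = b\,du)
Y_{CWT}(\tau, a)
  = \frac{1}{\sqrt{ab}} \int x\!\left(\frac{t}{b}\right) \Psi^{*}\!\left(\frac{t - \tau}{a}\right) dt
  = \frac{1}{\sqrt{a/b}} \int x(u)\, \Psi^{*}\!\left(\frac{u - \tau/b}{a/b}\right) du
  = X_{CWT}\!\left(\frac{\tau}{b}, \frac{a}{b}\right)
```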

Wigner-Ville Distribution

An alternative to the spectrogram for non-stationary signal analysis is the Wigner-Ville (WV) distribution (a bilinear expansion):

$$WV_x(\tau, \omega) = \int x\!\left(\tau + \frac{t}{2}\right) x^{*}\!\left(\tau - \frac{t}{2}\right) e^{-j\omega t}\,dt$$

The WV distribution has many interesting properties:

- It is always real-valued.
- It preserves time shifts and frequency shifts.
- The time and frequency integrals of the WVD give the signal's spectral energy density and its instantaneous power, respectively, i.e.

$$\int_{\tau} WV_x(\tau, \omega)\,d\tau = |X(\omega)|^{2}, \qquad \frac{1}{2\pi} \int_{\omega} WV_x(\tau, \omega)\,d\omega = |x(\tau)|^{2}$$

The WVD has time-frequency shift invariance, i.e. if $y(t) = x(t - \tau_0)\,e^{j\omega_0 t}$, then $WV_y(\tau, \omega) = WV_x(\tau - \tau_0, \omega - \omega_0)$. The relation between the WVD and the CWT is given by

$$|CWT(\tau, a)|^{2} = \iint WV_x(t, \xi)\; WV_{\Psi}\!\left(\frac{t - \tau}{a}, a\xi\right) dt\,d\xi$$

i.e. a 2-D correlation between the WVD of the signal and the WVD of the basic wavelet.
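A discrete-time version of the WVD can be sketched with one FFT per time instant. This is an illustrative sketch under stated assumptions: the function name `wigner_ville`, the lag truncation at the record edges, and the complex-exponential test signal are not from the lecture. Because the lag enters as $x[n+m]\,x^{*}[n-m]$ (a step of two samples), FFT bin $k$ corresponds to angular frequency $\pi k / N$ rather than $2\pi k / N$.

```python
import numpy as np

def wigner_ville(x):
    """Discrete WVD sketch: W[n, k] = sum_m x[n+m] conj(x[n-m]) exp(-j*2*pi*k*m/N).

    Lags are truncated so that n+m and n-m stay inside the record.
    Bin k corresponds to angular frequency pi*k/N rad/sample.
    """
    x = np.asarray(x, dtype=complex)
    N = len(x)
    W = np.zeros((N, N))
    for n in range(N):
        mmax = min(n, N - 1 - n)
        m = np.arange(-mmax, mmax + 1)
        r = np.zeros(N, dtype=complex)
        # r is Hermitian in m, so its DFT is real (up to rounding)
        r[m % N] = x[n + m] * np.conj(x[n - m])
        W[n] = np.fft.fft(r).real
    return W

# Assumed test signal: complex exponential at pi/4 rad/sample
N = 64
x = np.exp(1j * (np.pi / 4) * np.arange(N))
W = wigner_ville(x)
```

In this sketch the average over frequency bins, `W.sum(axis=1) / N`, returns $|x[n]|^2$, which is the discrete counterpart of the frequency-integral property above.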

Discretization of Time-Scale Parameters & Wavelet Series

A natural way to discretize the time-scale parameters $\tau$ and $a$ is via dyadic sampling, where $a = a_0^{j}$ and $\tau = k\,a_0^{j}\,T$ with $j, k \in I$, the set of integers. The idea behind this sampling is that narrow (high-frequency) windows are translated by small steps in order to catch small details, while wider (low-frequency) wavelets are translated by larger steps. Note that this sampling is NOT applied to the signal, i.e. $x(t)$ is still continuous in time. The sampled wavelets are

$$\Psi_{j,k}(t) = a_0^{-j/2}\,\Psi\!\left(a_0^{-j}\,t - kT\right)$$

resulting in the analysis equation

$$X_{j,k} = \int x(t)\,\Psi_{j,k}^{*}(t)\,dt$$

To observe small details (i.e. magnification), $a_0^{-j}$ must be large. This calls for $j$ negative and large, whereas global views require $a_0^{-j}$ to be small, i.e. $j$ positive. The reconstruction (synthesis) equation is

$$x(t) = c \sum_{j} \sum_{k} X_{j,k}\,\Psi_{j,k}(t)$$

Typically, we choose $a_0 = 2$, $T = 1$, for which there exist very special choices of $\Psi(t)$ such that the $\Psi_{j,k}$ constitute an orthonormal basis:

$$\int \Psi_{j,k}(t)\,\Psi_{j',k'}^{*}(t)\,dt = \begin{cases} 1, & \text{when } j = j',\ k = k' \\ 0, & \text{otherwise} \end{cases}$$

Thus, an arbitrary signal can be represented exactly as a weighted sum of the basis functions, i.e. the synthesis equation becomes

$$x(t) = \sum_{j} \sum_{k} X_{j,k}\,\Psi_{j,k}(t)$$
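The orthonormality condition can be checked numerically for one well-known choice of $\Psi(t)$. The sketch below assumes the Haar wavelet as the mother wavelet (the lecture does not name the "special choice" at this point) with $a_0 = 2$, $T = 1$; the helper names `haar`, `psi_jk` and `inner` are illustrative.

```python
import numpy as np

def haar(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    up = np.where((t >= 0.0) & (t < 0.5), 1.0, 0.0)
    down = np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0)
    return up - down

def psi_jk(t, j, k, a0=2.0, T=1.0):
    """Sampled wavelet a0^(-j/2) * Psi(a0^(-j) * t - k*T)."""
    return a0 ** (-j / 2.0) * haar(a0 ** (-j) * t - k * T)

# Dyadic grid, so the Haar breakpoints fall exactly on sample points
dt = 2.0 ** -10
t = np.arange(-8.0, 8.0, dt)

def inner(j1, k1, j2, k2):
    """Riemann-sum approximation of the inner product of two sampled wavelets."""
    return np.sum(psi_jk(t, j1, k1) * psi_jk(t, j2, k2)) * dt

same = inner(1, 0, 1, 0)           # same scale and shift
other_shift = inner(-1, 0, -1, 1)  # same scale, different shift
other_scale = inner(0, 0, 1, 0)    # different scales
```

With these choices `same` evaluates to 1 and the two cross terms to 0, matching the two cases of the orthonormality condition.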

Filter Bank Implementation, Sub-band Coding and DWT

Given a signal $x(n)$, we first obtain a lower-resolution signal by filtering with a half-band low-pass filter (LPF) with impulse response $g(n)$. By the Nyquist rule, we can then down-sample the resulting signal by two (drop every other sample), i.e. double the scale in the analysis. This gives

$$y(k) = \sum_{n} g(n)\,x(2k - n)$$

i.e. the half-band LPF reduces the resolution by 2 (loss of high-frequency detail) while the scale is unchanged; the subsequent subsampling doubles the scale. Note that up-sampling by two (inserting a zero between samples) followed by low-pass filtering (with $g'(n)$) halves the scale while leaving the resolution unchanged.
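The filter-then-subsample step maps directly onto a convolution followed by keeping the even-indexed outputs. The sketch below uses a two-tap averaging filter purely as a stand-in for a half-band LPF; the function names and the test signal are assumptions for illustration.

```python
import numpy as np

def lowpass_downsample(x, g):
    """y(k) = sum_n g(n) x(2k - n): full convolution, then keep even-indexed samples."""
    return np.convolve(x, g)[::2]

def upsample_lowpass(y, g_syn):
    """Insert a zero between samples, then low-pass filter with g_syn."""
    up = np.zeros(2 * len(y))
    up[::2] = y
    return np.convolve(up, g_syn)

g = np.array([0.5, 0.5])            # assumed stand-in for a half-band LPF
x = np.array([1.0, 2.0, 3.0, 4.0])

y = lowpass_downsample(x, g)        # half as many samples per unit time
z = upsample_lowpass(y, g)          # back to the original sampling rate
```

Note that `z` is only a smoothed version of `x`: the detail removed by the low-pass filter is not recovered by up-sampling, which is why the detail branch of the filter bank is needed.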

This is the principal idea behind sub-band coding, which is widely used for speech and image data compression. The above system shows a one-level DWT, where $a_1(k)$ and $d_1(k)$ are called the low-pass approximation and the added detail, respectively. The reconstruction via the synthesis filter bank is not perfect, i.e. $\hat{x}(n) \neq x(n)$, unless the impulse responses of the synthesis FIR filters $g'_1$ and $h'_1$ are time-reversed versions of $g_1$ and $h_1$, respectively, i.e. $g'_1(n) = g_1(L-1-n)$ and $h'_1(n) = h_1(L-1-n)$.

Further, if we assume that the LP and HP FIR filters $g_1$ and $h_1$ are related via the alternating flip, i.e.

$$h_1(n) = (-1)^{n}\,g_1(L-1-n)$$

where $L$ (an even number) is the filter length, then the sub-band analysis/synthesis corresponds to a decomposition onto an orthonormal basis. The low-pass approximation and added detail are

$$a_1(k) = \sum_{n} x(n)\,g_1(2k - n), \qquad d_1(k) = \sum_{n} x(n)\,h_1(2k - n)$$

Now, because the filter impulse responses form an orthonormal set, $x(n)$ can easily be reconstructed as

$$x(n) = \sum_{k} \left[ a_1(k)\,g_1(n - 2k) + d_1(k)\,h_1(n - 2k) \right]$$

i.e. a weighted sum of the orthogonal impulse responses, where the weights are the inner products of the signal with the impulse responses.
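The orthonormality implied by the alternating flip can be verified numerically for a concrete filter. The sketch below assumes the 4-tap Daubechies low-pass filter as $g_1$ (the lecture does not fix a filter here); `alternating_flip` and `shifted_inner` are illustrative helper names.

```python
import numpy as np

def alternating_flip(g):
    """h(n) = (-1)^n * g(L-1-n) for a length-L filter."""
    n = np.arange(len(g))
    return (-1.0) ** n * g[::-1]

def shifted_inner(u, v, k):
    """sum_n u(n) v(n - 2k) for finite sequences supported on n = 0..L-1."""
    total = 0.0
    for n in range(len(u)):
        m = n - 2 * k
        if 0 <= m < len(v):
            total += u[n] * v[m]
    return total

# Assumed example: 4-tap Daubechies low-pass filter (L = 4)
r3 = np.sqrt(3.0)
g1 = np.array([1 + r3, 3 + r3, 3 - r3, 1 - r3]) / (4.0 * np.sqrt(2.0))
h1 = alternating_flip(g1)

gg = [shifted_inner(g1, g1, k) for k in (0, 1)]      # unit norm, then zero
hh = [shifted_inner(h1, h1, k) for k in (0, 1)]      # unit norm, then zero
gh = [shifted_inner(g1, h1, k) for k in (-1, 0, 1)]  # all zero
```

Each filter is orthogonal to its own even shifts, and the low-pass and high-pass filters are orthogonal to each other at every even shift, which is what makes the one-level decomposition an orthonormal expansion.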

The sub-band decomposition can be iterated on the low-pass approximations to yield the multi-level DWT. A three-level DWT decomposition is shown, with one lowest-order approximation, $a_3(m)$, and three added details $d_1(k)$, $d_2(l)$, and $d_3(m)$.

[Figure: three-level DWT analysis filter bank. $x(n)$ is filtered by $g_1(n)$ and $h_1(n)$, each followed by down-sampling by 2, giving $a_1(k)$ and $d_1(k)$; the same stage applied to $a_1(k)$ gives $a_2(l)$ and $d_2(l)$, and applied to $a_2(l)$ gives the lowest-order approximation $a_3(m)$ and $d_3(m)$. The outputs $d_1(k)$, $d_2(l)$, $d_3(m)$ are the added details.]

For an N-level DWT we have one lowest-order approximation and N added details, i.e.

$$a_N(p) = \sum_{n} x(n)\,g_N(2^{N}p - n)$$

$$d_j(m) = \sum_{n} x(n)\,h_j(2^{j}m - n), \qquad j \in [1, N]$$

where $g_N(n) = \sum_{k} g_{N-1}(k)\,g_1(n - 2k)$ is an LPF and the $h_j(n) = \sum_{k} g_{j-1}(k)\,h_1(n - 2k)$ are BPFs or an HPF.

The reconstruction equation becomes

$$x(n) = \sum_{j=1}^{N} \sum_{k} d_j(k)\,h_j(n - 2^{j}k) + \sum_{k} a_N(k)\,g_N(n - 2^{N}k)$$

Note that if the sub-band decomposition is iterated on both the low-pass approximation and the added details, the result is a multi-level wavelet packet (WP) decomposition. The subsequent figures show the 5-level DWT decompositions of a frequency-break signal and a noisy speech signal using the Daubechies (db4) wavelet. As can be observed, each sub-band picks up the behavior of the signal at a certain resolution and scale. The last two figures show the wavelet functions and the FIR filter impulse responses of the two filters for two different types of wavelets, namely db4 and Symlet (Sym4). The former wavelet is orthogonal but not linear phase, while the latter is orthogonal with nearly linear phase (good for speech and acoustic processing).
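Iterating the one-level split on the approximation can be sketched in a few lines. The example below uses the Haar filters instead of db4 to keep the code short, and pairs samples as $(x[2k], x[2k+1])$ with 0-based indexing; both choices, the function names, and the random test signal are assumptions for illustration.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_dwt(x, levels):
    """Multi-level Haar DWT: returns (a_N, [d_1, ..., d_N])."""
    a = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = a[0::2], a[1::2]
        details.append((odd - even) / S)  # added detail at this level
        a = (odd + even) / S              # low-pass approximation, iterated next
    return a, details

def haar_idwt(a, details):
    """Invert haar_dwt by undoing the levels from coarsest to finest."""
    for d in reversed(details):
        x = np.empty(2 * len(a))
        x[1::2] = (a + d) / S
        x[0::2] = (a - d) / S
        a = x
    return a

rng = np.random.default_rng(0)
x = rng.standard_normal(32)               # length must be divisible by 2**levels
a3, (d1, d2, d3) = haar_dwt(x, levels=3)
x_rec = haar_idwt(a3, [d1, d2, d3])

energy_in = np.sum(x ** 2)
energy_out = np.sum(a3 ** 2) + sum(np.sum(d ** 2) for d in (d1, d2, d3))
```

Because the decomposition is orthonormal, the coefficients carry the same total energy as the signal, and the signal is recovered from one approximation plus N details.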

Wavelet Transform and Denoising

The DWT can also be used for denoising applications. The idea is that noise commonly manifests itself as fine-grained structure in an image, and the wavelet transform provides a scale-based decomposition; hence most of the noise tends to be represented by wavelet coefficients at the finer scales. Discarding these coefficients naturally filters out the noise. The method of Donoho et al. (1992) sets the wavelet coefficients to zero if their magnitudes are below a threshold. These coefficients mostly correspond to noise. The edge-related coefficients, on the other hand, are usually above the threshold. An alternative to such hard thresholding is soft thresholding, which leads to less severe distortion of the object of interest. The hard and soft thresholding methods are given by the following equations, respectively:

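$$x_{th} = \begin{cases} x & \text{if } |x| > t \\ 0 & \text{if } |x| \le t \end{cases} \qquad\qquad x_{th} = \begin{cases} \operatorname{sign}(x)\,(|x| - t) & \text{if } |x| > t \\ 0 & \text{if } |x| \le t \end{cases}$$

There are several approaches for setting the threshold for each band of the DWT. A common approach is to choose the threshold based on the histogram of each sub-image.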
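The two thresholding rules translate directly into element-wise operations. This is a minimal sketch; the function names, the sample coefficients and the threshold value are illustrative, and choosing $t$ (e.g. from the sub-image histogram) is left out.

```python
import numpy as np

def hard_threshold(x, t):
    """Keep coefficients with |x| > t unchanged, zero the rest."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > t, x, 0.0)

def soft_threshold(x, t):
    """Zero coefficients with |x| <= t and shrink the others toward zero by t."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > t, np.sign(x) * (np.abs(x) - t), 0.0)

coeffs = np.array([-3.0, -1.0, 0.5, 2.0])
hard = hard_threshold(coeffs, t=1.0)   # [-3, 0, 0, 2]
soft = soft_threshold(coeffs, t=1.0)   # [-2, 0, 0, 1]
```

Soft thresholding is continuous at $|x| = t$, whereas hard thresholding jumps from 0 to $\pm t$ there; this is one way to see why the soft rule tends to distort the retained structure less abruptly.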

Example 1

Show that the alternating flip $h_1(n) = (-1)^{n}\,g_1(L-1-n)$ maps an LPF $g_1$ to an HPF $h_1$.

Proof: Let $G_1(z) = \sum_{n=0}^{L-1} g_1(n)\,z^{-n}$; then

$$H_1(z) = \sum_{n=0}^{L-1} (-1)^{n}\,g_1(L-1-n)\,z^{-n} = \sum_{n=0}^{L-1} (-1)^{L-1-n}\,g_1(n)\,z^{-(L-1-n)}$$

$$= (-z)^{-(L-1)} \sum_{n=0}^{L-1} g_1(n)\,(-z^{-1})^{-n} = (-z)^{-(L-1)}\,G_1(-z^{-1})$$

Since $G_1(z)$ is an LPF, $G_1(z)|_{z=-1} = 0$ and $G_1(z)|_{z=1} = 1$. It is then clear that $|H_1(z)|_{z=-1} = 1$ and $H_1(z)|_{z=1} = 0$, i.e. $H_1(z)$ is an HPF.
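The result of Example 1 can be spot-checked numerically by evaluating both transfer functions at $z = 1$ (DC) and $z = -1$ (highest frequency). The sketch assumes a simple binomial low-pass filter $\{1, 3, 3, 1\}/8$, chosen only because it satisfies $G_1(1) = 1$ and $G_1(-1) = 0$; it is not a filter used in the lecture, and the helper `ztransform` is illustrative.

```python
import numpy as np

def ztransform(c, z):
    """Evaluate C(z) = sum_n c(n) z^(-n) at a single point z."""
    n = np.arange(len(c))
    return np.sum(c * complex(z) ** (-n))

# Assumed low-pass filter with G(1) = 1 and G(-1) = 0 (L = 4)
g1 = np.array([1.0, 3.0, 3.0, 1.0]) / 8.0
L = len(g1)
h1 = (-1.0) ** np.arange(L) * g1[::-1]   # alternating flip

G_dc, G_nyq = ztransform(g1, 1.0), ztransform(g1, -1.0)
H_dc, H_nyq = ztransform(h1, 1.0), ztransform(h1, -1.0)

# Check the identity H1(z) = (-z)^-(L-1) * G1(-1/z) at an arbitrary point
z0 = 0.7 + 0.4j
lhs = ztransform(h1, z0)
rhs = (-z0) ** (-(L - 1)) * ztransform(g1, -1.0 / z0)
```

Here `G_dc` is 1 and `G_nyq` is 0, while `H_dc` is 0 and `abs(H_nyq)` is 1, so the flipped filter passes the highest frequency and blocks DC.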

Example 2

Show that the Haar filter bank offers both linear phase and orthonormality properties.

Proof: (a) The LPF and HPF for the Haar filter bank are $g_1(n) = \left\{ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\}$ and $h_1(n) = \left\{ \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right\}$. Clearly, $h_1(n) = (-1)^{n}\,g_1(1-n)$, i.e. the alternating flip is satisfied. Further, $G_1(\Omega) = \sqrt{2}\cos(\Omega/2)\,e^{-j\Omega/2}$ and $H_1(\Omega) = \sqrt{2}\,j\sin(\Omega/2)\,e^{-j\Omega/2}$, i.e. both filters have linear phase.

(b) To show orthonormality, we need (i) $\langle g_1(2k-n), g_1(2l-n) \rangle = \delta(k-l)$, (ii) $\langle h_1(2k-n), h_1(2l-n) \rangle = \delta(k-l)$, and (iii) $\langle g_1(2k-n), h_1(2l-n) \rangle = 0 \ \ \forall k, l$. Using $g_1(n)$ it is easy to show that

$$\langle g_1(2k-n), g_1(2k-n) \rangle = \sum_{n} g_1^{2}(2k-n) = \tfrac{1}{2} + \tfrac{1}{2} = 1$$

while

$$\langle g_1(2k-n), g_1(2l-n) \rangle = \sum_{n} g_1(2k-n)\,g_1(2l-n) = 0, \qquad k \neq l$$

and similarly for (ii).

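Also, for (iii) we have

$$\langle g_1(2k-n), h_1(2l-n) \rangle = \sum_{n} g_1(2k-n)\,h_1(2l-n) = 0, \qquad \forall k, l$$

The low-pass approximation and added details using the Haar basis are

$$a_1(k) = \sum_{n} x(2k-n)\,g_1(n) = \frac{1}{\sqrt{2}}\left( x(2k) + x(2k-1) \right)$$

$$d_1(k) = \sum_{n} x(2k-n)\,h_1(n) = \frac{1}{\sqrt{2}}\left( x(2k) - x(2k-1) \right)$$

Thus, the reconstruction equations are

$$x(2k) = \frac{1}{\sqrt{2}}\left( a_1(k) + d_1(k) \right), \qquad x(2k-1) = \frac{1}{\sqrt{2}}\left( a_1(k) - d_1(k) \right)$$

which gives the Haar reconstruction filters as $g'_1(n) = \left\{ \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\}$ and $h'_1(n) = \left\{ -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\}$.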
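The Haar analysis and reconstruction equations can be reproduced with plain convolutions, which makes the filter-bank structure explicit. In this sketch the synthesis filters are taken as the time-reversed analysis filters; the names `g1_syn`, `h1_syn`, `analysis`, `synthesis` and the test signal are assumptions for illustration, and samples outside the record are treated as zero.

```python
import numpy as np

s = np.sqrt(2.0)
g1 = np.array([1.0, 1.0]) / s     # Haar low-pass analysis filter
h1 = np.array([1.0, -1.0]) / s    # Haar high-pass analysis filter
g1_syn = g1[::-1]                 # time-reversed synthesis filters
h1_syn = h1[::-1]

def analysis(x):
    """a1(k) = sum_n x(n) g1(2k-n), d1(k) = sum_n x(n) h1(2k-n)."""
    return np.convolve(x, g1)[::2], np.convolve(x, h1)[::2]

def synthesis(a, d):
    """Up-sample both branches by 2, filter with the synthesis filters, and add."""
    ua = np.zeros(2 * len(a))
    ud = np.zeros(2 * len(d))
    ua[::2] = a
    ud[::2] = d
    return np.convolve(ua, g1_syn) + np.convolve(ud, h1_syn)

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 2.0])
a1, d1 = analysis(x)
x_hat = synthesis(a1, d1)

# The causal filter bank reproduces x with a delay of L - 1 = 1 sample
x_rec = x_hat[1:len(x) + 1]
```

The one-sample delay is the price of using causal filters; apart from it, `x_rec` equals `x`, which is the perfect-reconstruction property.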

2-D DWT and Filter Banks

An obvious way to extend the DWT to the 2-D case is to use separable wavelets obtained from 1-D wavelets. A one-level 2-D DWT of an $N \times N$ image can be implemented using a 1-D DWT along the rows, leading to two sub-images of size $\frac{N}{2} \times N$, followed by a 1-D DWT along the columns of these two images, resulting in four sub-images of size $\frac{N}{2} \times \frac{N}{2}$. The figure below shows this process.

The first sub-image, obtained by low-pass filtering and subsampling (by 2) along both rows and columns, gives the low-pass approximation. The second is obtained by low-pass filtering and subsampling along the rows and high-pass filtering and subsampling (edge extraction) along the columns, giving the first added details, which correspond to the vertical edge details. The third and fourth similarly give the horizontal and diagonal edge details. Reconstruction from these sub-images can be done as in the 1-D case. The process can be iterated on the low-pass approximation several times, as in the 1-D case, to obtain finer frequency resolution and perform a multi-level 2-D DWT. The following examples show single-level and two-level DWTs of the Peppers image using the orthogonal Db4 wavelet.
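The separable row-then-column procedure can be sketched with a 1-D Haar step applied along each axis in turn. Haar is used here instead of Db4 only to keep the example short, and the sub-band names simply record the filter used in each direction; the function names and the random test image are assumptions for illustration.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_1d(X, axis):
    """One Haar analysis step along the given axis: returns (low-pass, high-pass)."""
    X = np.moveaxis(np.asarray(X, dtype=float), axis, -1)
    low = (X[..., 0::2] + X[..., 1::2]) / S
    high = (X[..., 0::2] - X[..., 1::2]) / S
    return np.moveaxis(low, -1, axis), np.moveaxis(high, -1, axis)

def dwt2_haar(img):
    """One-level separable 2-D Haar DWT: filter within rows, then within columns."""
    L, H = haar_1d(img, axis=1)          # two half-width sub-images
    LL, LH = haar_1d(L, axis=0)          # row low-pass, column low/high-pass
    HL, HH = haar_1d(H, axis=0)          # row high-pass, column low/high-pass
    return LL, LH, HL, HH

rng = np.random.default_rng(0)
img = rng.random((8, 8))                 # assumed stand-in for an N x N image
LL, LH, HL, HH = dwt2_haar(img)

flat_LL, flat_LH, flat_HL, flat_HH = dwt2_haar(np.full((8, 8), 3.0))
```

For a constant image all three detail sub-images are zero and only the approximation is non-zero, which is the behavior exploited when the detail sub-images are thresholded.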

As can be seen, the detail sub-images mostly contain very low-intensity pixels and some edge/texture details. To compress the image data using the 2-D DWT, the detail sub-images can be thresholded to extract only the useful edges/texture and to zero out those pixels with small intensities (smaller than the chosen threshold). Once this is done, the lowest-order approximation sub-image can be encoded using straight PCM, while only the edges need to be encoded in the detail sub-images. This yields a substantial reduction in the total number of bits.

Image Fusion Using 2-D DWT

The following are visible and IR satellite images, together with the fused image obtained using the lowest-order approximation from the IR channel and the added details from the visible channel.

[Figure panels: Image 1, Image 2, Fused Image]
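The fusion rule described above can be sketched with a one-level 2-D Haar transform: decompose both images, keep the approximation of one and the details of the other, and invert. The number of levels, the Haar wavelet, the function names and the random arrays standing in for the IR and visible images are all assumptions; the lecture does not specify them.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_1d(X, axis):
    """One Haar analysis step along an axis: returns (low-pass, high-pass)."""
    X = np.moveaxis(np.asarray(X, dtype=float), axis, -1)
    low = (X[..., 0::2] + X[..., 1::2]) / S
    high = (X[..., 0::2] - X[..., 1::2]) / S
    return np.moveaxis(low, -1, axis), np.moveaxis(high, -1, axis)

def ihaar_1d(low, high, axis):
    """Inverse of haar_1d along the same axis."""
    low, high = np.moveaxis(low, axis, -1), np.moveaxis(high, axis, -1)
    X = np.empty(low.shape[:-1] + (2 * low.shape[-1],))
    X[..., 0::2] = (low + high) / S
    X[..., 1::2] = (low - high) / S
    return np.moveaxis(X, -1, axis)

def dwt2(img):
    """One-level 2-D Haar DWT: returns (approximation, (three detail sub-images))."""
    L, H = haar_1d(img, axis=1)
    LL, LH = haar_1d(L, axis=0)
    HL, HH = haar_1d(H, axis=0)
    return LL, (LH, HL, HH)

def idwt2(LL, details):
    """Inverse of dwt2."""
    LH, HL, HH = details
    L = ihaar_1d(LL, LH, axis=0)
    H = ihaar_1d(HL, HH, axis=0)
    return ihaar_1d(L, H, axis=1)

def fuse(ir, visible):
    """Approximation from the IR image, added details from the visible image."""
    approx_ir, _ = dwt2(ir)
    _, details_vis = dwt2(visible)
    return idwt2(approx_ir, details_vis)

rng = np.random.default_rng(1)
ir = rng.random((8, 8))        # stand-in for the IR image
visible = rng.random((8, 8))   # stand-in for the visible image
fused = fuse(ir, visible)
```

Fusing an image with itself returns the image unchanged, which is a quick sanity check that the forward and inverse transforms are consistent.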