École Polytechnique Fédérale de Lausanne
Christophe Ancey
Laboratoire hydraulique environnementale (LHE)
École Polytechnique Fédérale de Lausanne, Écublens, CH-1015 Lausanne
Lecture notes: Rheology and Fluid Dynamics, version 1.0 of 4th September 2009
Table of contents

1 Wavelet decomposition
  1.1 Introduction to wavelets
    1.1.1 Properties
    1.1.2 Multiresolution analysis
    1.1.3 Multiresolution analysis for finite samples
    1.1.4 Cascade algorithm for multiresolution analysis for finite samples
Bibliography
1 Wavelet decomposition

1.1 Introduction to wavelets

1.1.1 Properties

In order to improve the resolution of methods such as the collocation method, we have to use more elaborate or efficient polynomials. An alternative to Legendre polynomials is to use a wavelet decomposition. Wavelets are generated from dilations and translations of a special function called the mother wavelet $\psi$. Let us consider a wavelet function $\psi$ with support $I_\psi$; this function, usually referred to as the mother wavelet, is associated with a scaling function $\phi$ with support $I_\phi = [0, B]$. We can define a family of orthogonal (in a sense that is specified below) functions:
$$\psi_{ij}(x) = 2^{i/2}\,\psi(2^i x - j) \quad\text{and}\quad \phi_{ij}(x) = 2^{i/2}\,\phi(2^i x - j).$$
These functions enjoy helpful properties:
- pairwise orthogonality: $\langle \psi_{ij}, \psi_{kl}\rangle = \delta_{ik}\delta_{jl}$ and $\langle \phi_{ij}, \phi_{ik}\rangle = \delta_{jk}$;
- orthogonality: $\langle \phi_{ij}, \psi_{kl}\rangle = 0$ for $k \geq i$;
- moments: $\int_{\mathbb{R}} \psi(x)\,\mathrm{d}x = 0$ and $\int_{\mathbb{R}} \phi(x)\,\mathrm{d}x = 1$;
- normality: $\int_{\mathbb{R}} \psi^2(x)\,\mathrm{d}x = 1$ and $\int_{\mathbb{R}} \phi^2(x)\,\mathrm{d}x = 1$;
- scaling property: there is a (usually finite) set of coefficients $h_k$ such that
$$\phi(x) = \sum_{k\in\mathbb{Z}} h_k \sqrt{2}\,\phi(2x - k). \qquad (1.1)$$

The last property is important because it allows the multiresolution analysis (Vidakovic, 1999; Mallat, 1998). The coefficients $h_k$ are called the wavelet filter; they act as a low-pass averaging filter. We introduce $n_h$, the number of nonzero wavelet coefficients $h_k$. The support of $\phi$ is directly related to $n_h$: $\mathrm{supp}\,\phi = [0, n_h - 1]$; the support of the mother wavelet is $[1 - n_h/2,\, n_h/2]$. A key point is that it is possible to relate $\phi(x)$ to $\phi(2x - k)$, i.e., by averaging what happens on a fine scale, we can obtain the trend. Another aspect is that this relationship is a convolution; therefore, if we work in the Fourier domain, we obtain the relationship:
$$\hat{\phi}(\omega) = m_0\!\left(\frac{\omega}{2}\right)\hat{\phi}\!\left(\frac{\omega}{2}\right),$$
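The filter properties implied by the list above can be checked numerically. The following is a minimal sketch (using NumPy; the Haar filter and the four-tap Daubechies filter, called D2 in these notes, serve as examples): the filter $(h_k)$ must sum to $\sqrt{2}$, have unit energy, and be orthogonal to its even shifts.

```python
import numpy as np

# Wavelet filters h_k from Eq. (1.1): Haar and the four-tap Daubechies
# filter (called D2 in these notes).
sqrt3 = np.sqrt(3.0)
h_haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
h_d2 = np.array([1 + sqrt3, 3 + sqrt3, 3 - sqrt3, 1 - sqrt3]) / (4 * np.sqrt(2.0))

def check_filter(h):
    """Verify the low-pass filter properties stated in Section 1.1.1."""
    # sum_k h_k = sqrt(2), obtained by integrating Eq. (1.1) with int phi = 1
    assert np.isclose(h.sum(), np.sqrt(2.0))
    # unit energy, from the normality of phi
    assert np.isclose(np.dot(h, h), 1.0)
    # orthogonality of phi to its integer translates <=> sum_k h_k h_{k-2m} = 0
    for m in range(1, len(h) // 2):
        assert np.isclose(np.dot(h[2 * m:], h[:len(h) - 2 * m]), 0.0)

check_filter(h_haar)
check_filter(h_d2)
```

Both filters pass all three checks, which is exactly what makes the dilated and translated families above orthonormal.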
where $\hat{\phi}(\omega)$ is the Fourier transform of $\phi$ and $m_0$ is the discrete Fourier transform of $(h_k)$.

In many applications, one tries to find sparse representations of functions, i.e., approximations with few non-zero coefficients. It is then of great importance to select wavelet bases that efficiently approximate particular classes of functions. Wavelet design is optimized to produce a maximum of coefficients $\langle f, \psi_{ij}\rangle$ that are zero or close to zero. An important property is the number $p$ of vanishing moments of the mother wavelet $\psi$. The $i$-th moment is:
$$M_i = \langle x^i, \psi(x)\rangle = \int_{\mathbb{R}} x^i\,\psi(x)\,\mathrm{d}x.$$
If $\psi$ has $p$ vanishing moments, then $\psi$ is orthogonal to any polynomial of degree $p - 1$. If a function is locally $C^k$, then over a small interval it is approximated by a polynomial of degree $k$. Therefore, for $k < p$, we have:
$$\langle f, \psi_{ij}\rangle = 2^{i/2}\int_{\mathbb{R}} f(x)\,\psi(2^i x - j)\,\mathrm{d}x = 2^{i/2}\int_{\mathbb{R}} f(x + 2^{-i}j)\,\psi(2^i x)\,\mathrm{d}x,$$
$$\langle f, \psi_{ij}\rangle = 2^{i/2}\sum_{n=0}^{k}\frac{f^{(n)}(2^{-i}j)}{n!}\int_{\mathbb{R}} x^n\,\psi(2^i x)\,\mathrm{d}x + O(x^p) = 2^{i/2}\sum_{n=0}^{k}\frac{f^{(n)}(2^{-i}j)}{n!}\,M_n + O(x^p).$$
Finally we deduce that $\langle f, \psi_{ij}\rangle = O(x^p)$. In other words, if the mother wavelet has $p$ vanishing moments, then any polynomial of degree $p - 1$ is fully reproduced by the scaling function. It can be shown that the following properties are equivalent:

1. the mother wavelet has $p$ vanishing moments;
2. $\hat{\psi}(\omega)$ and its first $p - 1$ derivatives are zero at $\omega = 0$;
3. $|\mathrm{supp}\,\psi| \geq 2p - 1$;
4. for any $0 \leq k < p$, $q_k(t) = \sum_{n\in\mathbb{Z}} n^k\,\phi(t - n)$ is a polynomial of degree $k$.

The last property shows how it is possible to exactly represent the polynomials $\{1, t, t^2, \ldots\}$ by wavelets. Daubechies wavelets are optimal because they have a minimum-size support (according to property 3, $|\mathrm{supp}\,\psi| = 2p - 1$) for $p$ vanishing moments¹: $\mathrm{supp}\,\phi[D_p] = [0, 2p - 1]$ and $\mathrm{supp}\,\psi[D_p] = [-p + 1, p]$. When choosing a particular wavelet, we have to find a trade-off between the number of vanishing moments and the support width. Indeed, the wider the support of $\psi$ compared to the support of a function $f$, the more numerous the shift indices required to span $\mathrm{supp}\,f$.
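The vanishing-moment property translates into a simple algebraic constraint on the filter, namely $\sum_k (-1)^k k^m h_k = 0$ for $m = 0, \ldots, p - 1$. A quick numerical check (a sketch only; the D2 filter below has $p = 2$ vanishing moments, the Haar filter only $p = 1$):

```python
import numpy as np

sqrt3 = np.sqrt(3.0)
h_haar = np.array([1.0, 1.0]) / np.sqrt(2.0)                                         # p = 1
h_d2 = np.array([1 + sqrt3, 3 + sqrt3, 3 - sqrt3, 1 - sqrt3]) / (4 * np.sqrt(2.0))   # p = 2

def discrete_moment(h, m):
    """Discrete counterpart of the m-th wavelet moment: sum_k (-1)^k k^m h_k."""
    k = np.arange(len(h))
    return np.sum((-1.0) ** k * k ** m * h)

# p vanishing moments <=> the first p discrete moments vanish
assert np.isclose(discrete_moment(h_haar, 0), 0.0)
assert not np.isclose(discrete_moment(h_haar, 1), 0.0)   # Haar stops at p = 1
assert np.isclose(discrete_moment(h_d2, 0), 0.0)
assert np.isclose(discrete_moment(h_d2, 1), 0.0)
```

This is why a D2 decomposition reproduces affine trends exactly while the Haar decomposition only reproduces constants.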
Furthermore, if $f$ has few isolated singularities and is very regular, it is more convenient to select a wavelet with a large support and many vanishing moments to provide a sparse representation of $f$. Otherwise, it may be better to decrease the support width of $\psi$ to single out the singularities and avoid high-amplitude coefficients.

1.1.2 Multiresolution analysis

Multiresolution analysis is a technique that provides approximations of signals at various resolution levels by using orthogonal projections onto different spaces. These projections can be computed by using projector operators in a given function space,

1. This property is easily generalized to any wavelet whose scaling function has support $[N_1, N_2]$; in that case, we have $\mathrm{supp}\,\psi = [(N_1 - N_2 + 1)/2,\,(N_2 - N_1 + 1)/2]$.
but it can also be shown that multiresolution approximations are entirely characterized by the wavelet filter, which can be interpreted as an operator controlling the loss of information across different levels (see § 1.1.3). The idea is to consider that the approximation of a function $f$ at the resolution level $i$ is equivalent to a local average of $f$ over neighborhoods of size $2^{-i}$. It is possible to decompose any function $f(x)$ of $L^2[a, b]$:
$$f(x) = \sum_{i=-\infty}^{\infty}\sum_{j\in J_i} \beta_{ij}\,\psi_{ij}(x),$$
where $J_i$ is the set of $j$-indices for which $0 \leq 2^i x - j \leq B$ for a given scale $i$, and $\beta_{ij} = \int_{\mathbb{R}} \psi_{ij}(x) f(x)\,\mathrm{d}x$. The problem is that the summation is made over infinite sets. It can be shown that:
$$f(x) = \sum_{j\in J_{i_0}} \alpha_{i_0 j}\,\phi_{i_0 j}(x) + \sum_{i \geq i_0}\sum_{j\in J_i} \beta_{ij}\,\psi_{ij}(x).$$
The first term on the right-hand side of the equation is called the trend, i.e., it represents the mean or filtered (low-pass) behavior of the function $f$ at the scale $i_0$. The second term represents the deviation from this trend; the summation is made over the different scales $i$. $J_i$ denotes the set of $j$-indices needed to describe $f$ at scale $i$. Since $\alpha_{ij} = \langle f, \phi_{ij}\rangle$, we have:
$$\alpha_{ij} = \int_a^b f(x)\,\phi_{ij}(x)\,\mathrm{d}x = 2^{i/2}\int_a^b f(x)\,\phi(2^i x - j)\,\mathrm{d}x = 2^{-i/2}\int_{2^i a - j}^{2^i b - j} f\!\left(\frac{\xi + j}{2^i}\right)\phi(\xi)\,\mathrm{d}\xi.$$
For this integral not to be zero, the bounds must verify $2^i a - j < B$ and $2^i b - j > 0$, i.e.:
$$J_i:\quad 2^i a - B < j < 2^i b.$$
For instance, let us consider the Haar wavelets ($I_\psi = I_\phi = [0, 1]$) and a function with a support over $[-1, 1]$. At each scale $i$, we define the set $J_i$: $-2^i - 1 < j < 2^i$. Thus at scale $i = 0$, $J_0 = \{-1, 0\}$; at $i = 1$, $J_1 = \{-2, -1, 0, 1\}$; at $i = 2$, $J_2 = \{-4, -3, -2, -1, 0, 1, 2, 3\}$; etc. For instance, for $f(x) = \sin x$, up to scale $i = 1$ one has (see Fig. 1.1):
$$f(x) \approx 0.459\,(\phi_{0,0}(x) - \phi_{0,-1}(x)) - 0.214\,(\psi_{0,0}(x) + \psi_{0,-1}(x)) - 0.064\,(\psi_{1,-2}(x) + \psi_{1,1}(x)) - 0.085\,(\psi_{1,-1}(x) + \psi_{1,0}(x)) + \cdots$$

1.1.3 Multiresolution analysis for finite samples

In practice, one has a sampled function $f(x_i)$ for $i = 1, \ldots, n$ (here we assume that $n = 2^N$) and we want to obtain a wavelet decomposition from this sample.
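The leading coefficients of this Haar expansion can be recovered from the antiderivative of $\sin$; a minimal check (the closed forms below are the Haar inner products written out by hand):

```python
from math import cos, sqrt

# Haar coefficients of f(x) = sin x over [-1, 1], computed exactly:
# alpha_00 = int_0^1 sin x dx
alpha_00 = 1 - cos(1.0)
# beta_00 = int_0^{1/2} sin x dx - int_{1/2}^1 sin x dx
beta_00 = 1 - 2 * cos(0.5) + cos(1.0)
# beta_10 = sqrt(2) * (int_0^{1/4} sin x dx - int_{1/4}^{1/2} sin x dx)
beta_10 = sqrt(2.0) * (1 - 2 * cos(0.25) + cos(0.5))

print(alpha_00, beta_00, beta_10)
```

The three printed values agree with the coefficients 0.459, -0.214, and -0.085 quoted in the expansion above (to three figures).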
Here we assume that the wavelet support is $[0, 1]$. Except for some basic wavelets such as the Haar wavelets, the support is usually different from $[0, 1]$. It is, however, possible to still
Figure 1.1: Decomposition of $f(x) = \sin x$ defined over $[-1, 1]$ into Haar wavelets (up to $i = 2$); the three panels correspond to $i = 0$, $i = 1$, and $i = 2$.

Figure 1.2: Left: scaling function (solid line) and wavelet (dashed line) for the Daubechies wavelet D2. Right: periodized scaling function (solid line) $\phi^{\mathrm{per}}_{00}(x)$ compared to the original scaling function $\phi$.

work on the interval $[0, 1]$ by transforming (periodizing) the wavelets in the following way [see (Mallat, 1998), § 7.5.1; (Vidakovic, 1999), § 5.6]:
$$\psi^{\mathrm{per}}_{ij}(x) = \sum_{k\in\mathbb{Z}} 2^{i/2}\,\psi(2^i (x + k) - j)$$
for the wavelets $\psi_{ij}$ whose support intersects or includes the interval $[0, 1]$. When the support of the wavelet at scale $i$ and for shift index $j$ lies within $[0, 1]$, the function $\psi_{ij}$ is preserved. We report a typical example of periodization for the Daubechies wavelet D2 in Fig. 1.2. Therefore, in the following, we can consider without loss of generality that the support of the wavelets is $[0, 1]$; we no longer use the superscript "per" for the sake of simplicity. Note that this choice entails some drawbacks: periodic wavelets behave poorly near the boundaries because they induce high-amplitude coefficients in the neighborhood of the boundaries when the function $f$ is not periodic. It is possible to alleviate these problems by altering the wavelet coefficients close to the boundaries (by using boundary wavelets), but the computations become more complicated [see (Mallat, 1998), § 7.5.3].

The discrete wavelet transform maps the data $d = (f(x_i))_{1 \leq i \leq n}$ to the wavelet domain $w = (c_{ij}, d_{ij})$. The result is a vector of the same size $n$. There is an $n \times n$ orthogonal (or close to orthogonal) matrix $W$ such that $d = W w$, where $W_{pq} = n^{-1/2}\,\Psi_q(p/n)$, in which $q = (i, j)$ is an appropriate set of scale and translation indices. We have introduced the function family $\Psi_k$ that provides a generic
representation of the scaling and wavelet functions involved in the decomposition². Here $\Psi$ has to be defined at each scale. At the coarsest level, the scale $i = 0$, we have $\Psi_0 = \phi_{00}$ and $\Psi_1 = \psi_{00}$; at scale $i = 1$, one has $\Psi_2 = \psi_{10}$ and $\Psi_3 = \psi_{11}$; at any scale $i$, one has $2^i$ coefficients $d_{ij}$, implying that $\Psi_k = \psi_{ij}$ with $k = 2^i + j$ ($0 \leq j \leq 2^i - 1$).

Let us take a typical example with the Haar wavelet. Consider the function $f(x) = \sin x$ sampled at $x_i = -1 + 2i/(n - 1)$ with $n = 8$ ($N = 3$). We have:
$$d = \{-0.841, -0.655, -0.416, -0.142, 0.142, 0.416, 0.655, 0.841\}.$$
The functions $\Psi$ are: $\Psi_0 = \phi_{00}$, $\Psi_1 = \psi_{00}$, $\Psi_2 = \psi_{10}$, $\Psi_3 = \psi_{11}$, $\Psi_4 = \psi_{20}$, $\Psi_5 = \psi_{21}$, $\Psi_6 = \psi_{22}$, and $\Psi_7 = \psi_{23}$. We deduce the following $8 \times 8$ orthogonal matrix, whose rows (of $W^{T}$) are the sampled, normalized basis functions:
$$W^{T} = \begin{pmatrix}
\tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} \\
\tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & \tfrac{1}{2\sqrt{2}} & -\tfrac{1}{2\sqrt{2}} & -\tfrac{1}{2\sqrt{2}} & -\tfrac{1}{2\sqrt{2}} & -\tfrac{1}{2\sqrt{2}} \\
\tfrac{1}{2} & \tfrac{1}{2} & -\tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{2} & \tfrac{1}{2} & -\tfrac{1}{2} & -\tfrac{1}{2} \\
\tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}}
\end{pmatrix}$$
By taking $w = W^{-1} d = W^{T} d$, one deduces:
$$w = \{0, -1.452, -0.469, -0.469, -0.13, -0.193, -0.193, -0.13\}.$$

A tricky point: can we approximate $f(x)$ by $\sum_i w_i \Psi_i$? In other words, is there a direct link between the discrete wavelet coefficients $w = (c_{00}, d_{00}, \ldots)$ and their continuous counterparts $c = (\alpha_{ij}, \beta_{ij})$? The answer is positive:
$$c_{ij} \approx \sqrt{n}\,\alpha_{ij} \quad\text{and}\quad d_{ij} \approx \sqrt{n}\,\beta_{ij}. \qquad (1.2)$$
To show this, two arguments can be used.

Inner product: the inner product in $L^2(\mathbb{R})$ is defined by $\langle f, g\rangle = \int_{\mathbb{R}} f(x)\,g(x)\,\mathrm{d}x$. We are interested in its relationship with a discrete equivalent $[f, g] = \sum_i f(x_i)\,g(x_i)$ computed at certain interpolation points $x_i$ regularly spaced by a distance $\delta$. If there are $n_k$ interpolation points, then $\delta \approx |\mathrm{supp}\,f \cap \mathrm{supp}\,g| / n_k$, implying $[f, g] \approx n_k\,\langle f, g\rangle$ and $\|f\| = \langle f, f\rangle^{1/2} \approx n_k^{-1/2}\,[f, f]^{1/2}$. Continuous and discrete relationships are thus related via a factor $\sqrt{n}$.

Behavior at the finest scale: let us consider the wavelet coefficients at the finest scale. If we have $n = 2^N$ data, the finest scale is at $i = N$.
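The matrix and the vector $w$ can be reproduced in a few lines. The sketch below builds $W^{T}$ row by row (each row is a sampled, normalized basis function) and applies it to the sample:

```python
import numpy as np

n = 8
x = -1 + 2 * np.arange(n) / (n - 1)   # x_i = -1 + 2 i / (n - 1)
d = np.sin(x)

s8, s2 = 1 / np.sqrt(8.0), 1 / np.sqrt(2.0)
Wt = np.array([
    [s8,  s8,  s8,  s8,  s8,  s8,  s8,  s8],   # phi_00
    [s8,  s8,  s8,  s8, -s8, -s8, -s8, -s8],   # psi_00
    [0.5, 0.5, -0.5, -0.5, 0,  0,   0,   0],   # psi_10
    [0,   0,   0,   0, 0.5, 0.5, -0.5, -0.5],  # psi_11
    [s2, -s2,  0,   0,   0,   0,   0,   0],    # psi_20
    [0,   0,  s2, -s2,   0,   0,   0,   0],    # psi_21
    [0,   0,   0,   0,  s2, -s2,   0,   0],    # psi_22
    [0,   0,   0,   0,   0,   0,  s2, -s2],    # psi_23
])

assert np.allclose(Wt @ Wt.T, np.eye(n))   # W is orthogonal
w = Wt @ d                                 # w = W^T d
print(np.round(w, 3))
```

The printed vector matches the coefficients quoted above up to rounding in the last digit.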
The continuous wavelet coefficient is of the form:
$$\beta_{ij} = \langle f(x), \psi_{ij}\rangle = 2^{i/2}\int_{\mathbb{R}} f(x)\,\psi(2^i x - j)\,\mathrm{d}x = 2^{i/2}\int_{\mathbb{R}} f(x + j 2^{-i})\,\psi(2^i x)\,\mathrm{d}x.$$
If $i = N$ is sufficiently high, $\psi(2^i x)$ is highly concentrated in a small region near $x = 0$. So, to leading order in $2^{-i}$, we have:
$$\beta_{ij} = 2^{-i/2}\,f\!\left(\frac{j}{2^i}\right) + O(2^{-i}). \qquad (1.3)$$

2. For instance, taking the example above of the decomposition of $\sin x$ over $[-1, 1]$, we set: $\Psi_1 = \phi_{0,-1}$, $\Psi_2 = \phi_{0,0}$, $\Psi_3 = \psi_{0,-1}$, $\Psi_4 = \psi_{0,0}$, etc.
Figure 1.3: Left: in the discrete wavelet decomposition, the interval $[a, b]$ over which the function is sampled is mapped to $[0, 1]$. Right: decomposition of $f(x_i) = \sin x_i$ into Haar wavelets, with $x_i = -1 + 2i/7$ and $0 \leq i \leq 7$. The dots represent the measurements $f(x_i)$, the solid line represents the function $\sin$, the dashed curve represents the discrete wavelet approximation, and the long-dashed curve is the continuous wavelet decomposition (same curve as in Fig. 1.1).

The interpretation is quite simple: at the finest scale $i = N$, the coefficients $\beta_{Nj}$ correspond to the signal measured at the dyadic points $j 2^{-N}$, multiplied by $2^{-N/2} = 1/\sqrt{n}$. Here, by definition, the discrete coefficients $d_{Nj}$ at the finest scale are the sampled values $f(j 2^{-N})$. We finally deduce:
$$\beta_{Nj} \approx n^{-1/2}\,d_{Nj}. \qquad (1.4)$$
From the discrete wavelet transform, we can deduce the continuous coefficients by dividing by $\sqrt{n}$, here $2\sqrt{2}$ for our example. We find: $\alpha_{00} = 0$, $\beta_{00} = -0.513$, $\beta_{10} = -0.165$, $\beta_{11} = -0.165$, etc. Furthermore, we have to take into account that the decomposition was made on a wrapped sample $\{i/n, f(x_i)\}$, i.e., by transforming the sample so that it lies within $[0, 1]$ (see Fig. 1.3). At the last stage, we have to map $[0, 1]$ back to the initial interval $[a, b]$. Since we have defined $x_i = a + i(b - a)/n$, we have $i/n = (x - a)/(b - a)$. Finally, we can express the approximate wavelet transform as:
$$f(x) \approx \sum_i \frac{w_i}{\sqrt{n}}\,\Psi_i\!\left(\frac{x - a}{b - a}\right).$$
For our example, we arrive at the following approximation:
$$\sin(x) \approx -0.513\,\psi_{00}(X) - 0.165\,(\psi_{10}(X) + \psi_{11}(X)) - 0.046\,(\psi_{20}(X) + \psi_{23}(X)) - 0.068\,(\psi_{21}(X) + \psi_{22}(X))$$
with $X = (x + 1)/2$. The resulting curve is plotted in Fig. 1.3. Compared to the discrete wavelet decomposition obtained by using the function $\sin(x)$ instead of a sample, there are small deviations, but they are less than $1/n$ (the accuracy of the relationship given in equation 1.2). A major drawback of the Haar decomposition comes from its lack of smoothness. Other bases such as the Daubechies wavelets can be used instead.
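The claimed $1/n$ accuracy can be verified for the coefficient $\beta_{00}$ of the wrapped sample, whose underlying function is $g(t) = \sin(2t - 1)$ on $[0, 1]$ (a sketch; the exact coefficient comes from the antiderivative of $g$):

```python
import numpy as np

n = 8
x = -1 + 2 * np.arange(n) / (n - 1)
d = np.sin(x)

# Discrete Haar coefficient d_00 and its continuous estimate via Eq. (1.2)/(1.4)
d00 = (d[:4].sum() - d[4:].sum()) / np.sqrt(n)
beta00_discrete = d00 / np.sqrt(n)

# Exact beta_00 of the wrapped function g(t) = sin(2t - 1) on [0, 1]:
# beta_00 = int_0^{1/2} g dt - int_{1/2}^1 g dt = cos(1) - 1
beta00_exact = np.cos(1.0) - 1.0

# The deviation stays below the 1/n accuracy claimed in the text
assert abs(beta00_discrete - beta00_exact) < 1 / n
```

Here the deviation is about 0.054, comfortably below $1/n = 0.125$, consistent with the accuracy of relationship (1.2).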
Figure 1.4 shows the comparison between the D2 and Haar wavelets for the function $f(x) = \sin 2x$ sampled
Figure 1.4: Comparison between the D2-wavelet decomposition (solid line) and the Haar-wavelet decomposition (dashed line) for the function $f(x) = \sin 2x$ sampled at $2^5$ points over the interval $[-1, 1]$.

at $2^5$ points over the interval $[-1, 1]$. A better agreement is obtained with the D2 wavelets, but spurious wide fluctuations are introduced at the boundaries.

1.1.4 Cascade algorithm for multiresolution analysis for finite samples

Note that, in practice, the coefficients are computed with a cascade algorithm (Mallat, 1998), which is a fast algorithm that requires approximately $n$ operations to compute $w$, instead of $n^2$ if we use matrix operations. If we write $c_{ij} = \langle f, \phi_{ij}\rangle$ and $d_{ij} = \langle f, \psi_{ij}\rangle$, we have:
$$c_{i-1,j} = \sum_{n\in\mathbb{Z}} h_{n-2j}\,c_{i,n}, \qquad d_{i-1,j} = \sum_{n\in\mathbb{Z}} g_{n-2j}\,c_{i,n},$$
where $h_i$ are the wavelet coefficients (see Eq. 1.1) and $g_i = (-1)^i h_{1-i}$. We have found that $c_{i-1,j}$ can be computed by taking every other term of the convolution of $c_{i,\cdot}$ with $h$. The coefficients $c_{ij}$ and $d_{ij}$ pertaining to resolution level $i$ can be obtained from those at the finer scale $i + 1$. If the procedure is repeated, we can recursively build the sequences $c_{ij}$ and $d_{ij}$ provided we specify the initial elements corresponding to the finest scale. Using equations (1.3) and (1.4), we can take at the finest scale $d_{ij} = f(j/2^i)$, i.e., the function sampled at the dyadic points $x_j = j/2^i$. A striking point in this cascade algorithm is that we only need $h_i$ and $g_i$: we do not need to compute the inner products $\langle f, \phi_{ij}\rangle$ and $\langle f, \psi_{ij}\rangle$. In practice, different variants can be used. Let us consider that the number of data is $n = 2^N$ (otherwise we can pad with zeros to obtain an appropriate number). We can repeat the recursive procedure from the finest to the coarsest level. We can consider that the coarsest level is $i = 0$ (described by $c_{00}$ and $d_{00}$) and the finest is $i = N$ (described by $c_{Nk}$ and $d_{Nk}$ for $0 \leq k \leq 2^N - 1$).
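One level of the cascade is thus a convolution with $h$ (resp. $g$) followed by downsampling by two. A minimal sketch for the Haar filter (periodic data are assumed, so no boundary padding is needed here):

```python
import numpy as np

h = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass (trend) filter
g = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass (detail) filter, g_k = (-1)^k h_{1-k}

def cascade_step(c):
    """One level: c_{i-1,j} = h_0 c_{i,2j} + h_1 c_{i,2j+1}, same with g for details."""
    c = np.asarray(c, dtype=float)
    trend = h[0] * c[0::2] + h[1] * c[1::2]
    detail = g[0] * c[0::2] + g[1] * c[1::2]
    return trend, detail

def haar_dwt(data):
    """Full pyramid: iterate from the finest scale down to a single trend c_00."""
    details = []
    c = np.asarray(data, dtype=float)
    while len(c) > 1:
        c, dd = cascade_step(c)
        details.append(dd)
    return c, details

data = np.sin(-1 + 2 * np.arange(8) / 7)
c00, details = haar_dwt(data)

# The transform is orthogonal: the energy (sum of squares) is preserved,
# and the final trend is the normalized mean sum(data)/sqrt(n).
energy_w = c00[0] ** 2 + sum((dd ** 2).sum() for dd in details)
assert np.isclose(energy_w, (data ** 2).sum())
assert np.isclose(c00[0], data.sum() / np.sqrt(len(data)))
```

Only the filters $h$ and $g$ appear: no inner product is ever evaluated, which is the point made above.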
An alternative is to iterate the decomposition as many times as possible and stop when the length of the trend $c_{i_0,\cdot}$ becomes either odd or smaller than the filter length $n_h$; this is, for instance, the procedure used in Mathematica.
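Under one reading of this stopping rule, the number of decomposition levels can be sketched as a small helper (the function name is illustrative, not part of any library):

```python
def decomposition_depth(n, n_h):
    """Number of cascade levels performed before the trend length
    becomes odd or smaller than the filter length n_h (the stopping
    rule described above); one reading of the rule, as a sketch."""
    depth = 0
    while n % 2 == 0 and n >= n_h:
        n //= 2
        depth += 1
    return depth
```

For instance, with $n = 2^5$ samples, the Haar filter ($n_h = 2$) allows a full decomposition down to a single coefficient, whereas a four-tap filter ($n_h = 4$) stops earlier.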
Figure 1.5: Comparison for the D4-wavelet decomposition (dots) between three types of boundary conditions: reflective (left), zero (center), and periodic (right), for the function $f(x) = \sin 2x$ sampled at $2^5$ points over the interval $[-1, 1]$.

Figure 1.6: Comparison, for the function $f(x) = \sin 2x$ (solid line) sampled at $2^5$ points over the interval $[-1, 1]$, between the D4-wavelet decomposition obtained with the procedure used in Mathematica (dots) and the complete decomposition.

Another problem concerns the boundaries. For instance, the boundary points can be obtained using:
$$c_{i-1,0} = \sum_{k=-m}^{n_h - 1 - m} h_{k+m}\,c_{i,k} \quad\text{and}\quad c_{i-1,\,2^{i-1}-1} = \sum_{k=0}^{n_h - 1} h_k\,c_{i,\,2^i - 2 - m + k},$$
where $m$ is an integer with $0 \leq m \leq n_h - 2$. So we need to pad the original data with $m$ data points at the front and $n_h - 2 - m$ data points at the end. The values of the data points that are padded at the boundaries are determined by the choice of the boundary condition (periodic, reflective, etc.). In Fig. 1.5, we have reported three wavelet decompositions for the wavelet basis D4 in the case where we sample the function $f(x) = \sin 2x$ over the interval $[-1, 1]$ (here $n = 2^5$ data points). Here the best agreement is obtained with periodic boundary conditions. In Fig. 1.6, we have plotted the D4-wavelet decomposition for two different assumptions on the coarsest level: in Mathematica, the coarsest level corresponds to the resolution level below which the number of coefficients $c_{ij}$ would become either odd or lower than the filter length. The usual way is to repeat the procedure until there is a single coefficient $c_{00}$.
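The three padding choices compared in Fig. 1.5 can be illustrated with NumPy's `pad` (a sketch: one padded point at each end, i.e., $m = 1$ and $n_h - 2 - m = 1$ for a four-tap filter):

```python
import numpy as np

c = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Pad one point at the front and one at the end (m = 1, n_h = 4),
# with the three boundary conditions compared in Fig. 1.5:
periodic   = np.pad(c, (1, 1), mode='wrap')      # wrap-around values
zero       = np.pad(c, (1, 1), mode='constant')  # zeros
reflective = np.pad(c, (1, 1), mode='reflect')   # mirror without repeating the edge

print(periodic)    # [5. 1. 2. 3. 4. 5. 1.]
print(zero)        # [0. 1. 2. 3. 4. 5. 0.]
print(reflective)  # [2. 1. 2. 3. 4. 5. 4.]
```

The filter is then applied to the padded array, so the boundary coefficients depend entirely on which of these extensions is chosen.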
Bibliography

Mallat, S. 1998 A Wavelet Tour of Signal Processing, 2nd edn. San Diego: Academic Press.
Vidakovic, B. 1999 Statistical Modeling by Wavelets. New York: John Wiley & Sons, Inc.