A new class of shift-invariant operators

Janne Heikkilä
Machine Vision Group, Department of Electrical and Information Engineering
P.O. Box 4500, 90014 University of Oulu, Finland
Tel.: +358 8 553 2786, Fax: +358 8 553 2612
E-mail: jth@ee.oulu.fi

Abstract

This paper proposes a class of operators with a shift invariance property. These operators are derived from two-dimensional complex moment invariants based on the observation that there is a duality between rotation invariance and shift invariance. A general form of the shift invariants belonging to this class is presented, which shows that polyspectral invariants such as the power spectrum and the bispectrum are members of the class. Methods for computing shift invariants for one-dimensional and two-dimensional signals are also presented. The examples given in the paper suggest that the higher order operators can preserve the original signal waveform better than the autocorrelation.

Index Terms

Discrete Fourier transform, translation invariance, moment invariants, power spectrum, bispectrum.

EDICS Category: 1-TFSR

I. INTRODUCTION

In signal analysis, a common approach is to compare the incoming signal waveform with a set of prototypes and to select the prototype that it resembles most closely. Before this comparison can be made, the signals need to be aligned in order to compensate for a possible delay or shift between them. Another possibility is to convert the signals into a shift-invariant representation, in which case alignment is not necessary. Formally, if $r(t)$ is a delayed version of $s(t)$ so that

$$ r(t) = s(t - t_0), \tag{1} $$

where $t_0$ is an arbitrary delay, a shift-invariant operator $S\{\cdot\}$ satisfies the following equation:

$$ S\{r(t)\} = S\{s(t)\}. \tag{2} $$

A well-known operator for attaining shift invariance is the autocorrelation

$$ \varrho(l) = \sum_{k=0}^{N-1} r_k r_{k+l}, \tag{3} $$

defined here for a discrete-time sequence $\{r_k\}_{k=0}^{N-1}$. Another commonly used shift-invariant operator is the power spectrum, which is the Fourier transform of the autocorrelation:

$$ \mathcal{F}\{\varrho(l)\}_n = \mathcal{F}\{r_k\}_n \, \mathcal{F}\{r_k\}_n^{*} = \left| \mathcal{F}\{r_k\}_n \right|^2, \tag{4} $$

where $\mathcal{F}\{\cdot\}$ denotes the discrete Fourier transform (DFT).

A straightforward extension of the autocorrelation is the higher order statistics (HOS) with $N$th order autocorrelations or cumulants, which also have the shift invariance property [1]. Higher order autocorrelation features have been used in pattern recognition, for example, in [2]. The Fourier transforms of the cumulants are called polyspectra [3]. The bispectrum is a special case of the polyspectra and it is defined by

$$ B_{n_1,n_2} = \mathcal{F}\{r_k\}_{n_1} \, \mathcal{F}\{r_k\}_{n_2} \, \mathcal{F}\{r_k\}_{n_1+n_2}^{*}. \tag{5} $$

As a result, a two-dimensional shift-invariant representation of the one-dimensional signal is obtained. Shift and scale invariant features derived from the bispectrum have been proposed in [4] for 1-D signals and in [5] for 2-D signals. The use of the bispectrum in system identification has been discussed in [6], and several algorithms for reconstructing the signal from the bispectrum have been suggested in the literature, e.g. in [7] and [8]. The trispectrum is another instance of the polyspectra, giving a 3-D representation of a 1-D signal [3].

Another operator for attaining shift invariance is the R-transform [9], which has a fast FFT-like algorithm that requires only additions, subtractions, and absolute value operations. Wagh and Kanetkar [10] defined a class of shift-invariant transforms to which the R-transform also belongs. In that class, the M-transform is another transform that has the same invariance property.

In some cases shift invariance can be achieved by normalizing the data by using the first order moments.
This is usually carried out by shifting the signal so that its centroid lies at zero. However, this procedure is very sensitive to noise, because only the first order statistics of the data are considered. In this paper, a new class of shift-invariant operators is proposed based on the relationship between complex moments, moment invariants and shift invariants.
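The invariance of the classical operators in (3)-(5) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the index $k + l$ in (3) is wrapped modulo $N$ (the text does not specify the boundary handling), the third factor of the bispectrum is taken as the complex conjugate of bin $n_1 + n_2$, and a naive DFT is used so that the example needs no external library. All function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT with kernel exp(-2*pi*j*n*k/N); an FFT would be used in practice."""
    N = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * n * k / N) for k, v in enumerate(x))
            for n in range(N)]

def autocorrelation(r):
    """Eq. (3), with the index k + l wrapped modulo N (assumption)."""
    N = len(r)
    return [sum(r[k] * r[(k + l) % N] for k in range(N)) for l in range(N)]

def power_spectrum(r):
    """Eq. (4): squared magnitude of each DFT bin."""
    return [abs(c) ** 2 for c in dft(r)]

def bispectrum(r):
    """Eq. (5): B[n1][n2] = R[n1] * R[n2] * conj(R[n1 + n2])."""
    R = dft(r)
    N = len(r)
    return [[R[n1] * R[n2] * R[(n1 + n2) % N].conjugate() for n2 in range(N)]
            for n1 in range(N)]

def circular_shift(r, d):
    """Delay the sequence by d samples, wrapping around at the end."""
    N = len(r)
    return [r[(k - d) % N] for k in range(N)]

def flatten(rows):
    return [v for row in rows for v in row]

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

r = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 4.0, 1.5]   # arbitrary test sequence
s = circular_shift(r, 3)

# Each difference should be at the level of floating-point rounding.
print(max_abs_diff(autocorrelation(r), autocorrelation(s)))
print(max_abs_diff(power_spectrum(r), power_spectrum(s)))
print(max_abs_diff(flatten(bispectrum(r)), flatten(bispectrum(s))))
```

The bispectrum stays invariant because the phase factors introduced by the shift at bins $n_1$, $n_2$ and $n_1 + n_2$ cancel in the product.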
II. DERIVATION OF THE OPERATORS

Let us start from the complex moment $c_{pq}$ of order $p + q$, which is defined for a two-dimensional probability mass function $f(m, n)$ as follows:

$$ c_{pq} = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} (m + jn)^p (m - jn)^q f(m, n), \tag{6} $$

where $j$ is the imaginary unit. In a finite domain, (6) can be expressed as

$$ c_{pq} = \sum_{k=0}^{N-1} (x_k + j y_k)^p (x_k - j y_k)^q f(x_k, y_k), \tag{7} $$

and in polar coordinates the equivalent representation is

$$ c_{pq} = \sum_{k=0}^{N-1} r_k^{p+q} e^{j(p-q)\theta_k} f(r_k, \theta_k). \tag{8} $$

Assuming that $\theta_k = 2\pi k/N$ and $f(r_k, \theta_k) = 1/N$, the complex moment becomes

$$ c_{pq} = \frac{1}{N} \sum_{k=0}^{N-1} r_k^{p+q} e^{2\pi j (p-q) k / N}, \tag{9} $$

which we can recognize, up to the sign convention of the exponent and the factor $1/N$, as the $(p - q)$th bin of the DFT calculated for the sequence $\{r_k^{p+q}\}_{k=0}^{N-1}$. With the notation used in the previous section the moment is expressed as

$$ c_{pq} = \mathcal{F}\{r_k^{p+q}\}_{p-q}. \tag{10} $$

It should be noticed that when using the discrete phase angles $\theta_k$ instead of continuous values we assume that $r_k$ is sampled uniformly, which is the most common case with 1-D time series. Another important observation is that $\{r_k\}$ can be an arbitrary discrete-time sequence with $r_k \in \mathbb{R}$, and it can also have negative values, although the usual convention is to interpret $r_k$ as a positive distance from the origin of the complex plane to the point $x_k + j y_k$. However, we can also adopt another interpretation where $r_k$ is negative, in which case the phase angle is shifted by 180°, i.e., $r_k e^{j\theta_k} = -r_k e^{j(\theta_k + \pi)}$. Assuming that $p$ and $q$ are integers, it is straightforward to prove that (9) also holds with this interpretation:

$$ \frac{1}{N} \sum_{k=0}^{N-1} (-r_k)^{p+q} e^{j(p-q)(\theta_k + \pi)} = \frac{1}{N} \sum_{k=0}^{N-1} e^{j\pi(p+q)} r_k^{p+q} e^{j(p-q)\theta_k} e^{j\pi(p-q)} = \frac{1}{N} \sum_{k=0}^{N-1} e^{2j\pi p} r_k^{p+q} e^{j(p-q)\theta_k} = \frac{1}{N} \sum_{k=0}^{N-1} r_k^{p+q} e^{j(p-q)\theta_k} = c_{pq}. \tag{11} $$

It is evident from (6)-(9) that rotation invariance in the 2-D space implies shift invariance for the 1-D signal $\{r_k\}$, because rotating the point set by a multiple of $2\pi/N$ corresponds to a circular shift of the sequence. This gives us a basis for deriving the new class of shift-invariant operators. If we can find complex moment invariants to 2-D rotations, we will also get 1-D shift invariants based on these equations.
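The correspondence between (7) and (9), and the way a shift of the sequence acts as a rotation, can be illustrated with a short Python sketch. It assumes $\theta_k = 2\pi k/N$ and $f = 1/N$ as in (9); the function names and the test values are hypothetical. The sequence deliberately contains negative samples, as permitted by (11).

```python
import cmath

def moment_cartesian(points, p, q):
    """Eq. (7) with f = 1/N: average of z^p * conj(z)^q over the points z = x + jy."""
    return sum(z ** p * z.conjugate() ** q for z in points) / len(points)

def moment_polar(r, p, q):
    """Eq. (9): (1/N) * sum_k r_k^(p+q) * exp(2*pi*j*(p-q)*k/N)."""
    N = len(r)
    return sum(rk ** (p + q) * cmath.exp(2j * cmath.pi * (p - q) * k / N)
               for k, rk in enumerate(r)) / N

r = [1.0, 2.5, -0.5, 3.0, 0.75, -1.25]        # negative values are allowed, cf. (11)
N = len(r)
points = [rk * cmath.exp(2j * cmath.pi * k / N) for k, rk in enumerate(r)]

p, q = 2, 1
c = moment_polar(r, p, q)
print(abs(c - moment_cartesian(points, p, q)))   # both forms agree

# A circular shift by d samples is a rotation by 2*pi*d/N in the complex plane:
# the moment is only multiplied by a unit-magnitude phase factor.
d = 2
shifted = [r[(k - d) % N] for k in range(N)]
c_shifted = moment_polar(shifted, p, q)
phase = cmath.exp(2j * cmath.pi * (p - q) * d / N)
print(abs(c_shifted - phase * c))
```

Since the shift changes only the phase of each moment, any product of moments in which these phases cancel is shift-invariant.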
Moment invariants have been widely studied in pattern recognition. Hu [11] introduced seven invariants of the second and third order, $\phi_1, \ldots, \phi_7$, that are invariant to translation, rotation, and scale changes. Flusser [12] showed that Hu's invariants are partly dependent, and he proposed another set of invariants $\psi_1, \ldots, \psi_6$ based on the second and third order moments. With the notation used above we can rewrite these moment invariants in the following form:

$$
\begin{aligned}
\psi_1 &= \phi_1 = c_{11} = \mathcal{F}\{r_k^2\}_0, \\
\psi_2 &= \phi_4 = c_{21} c_{12} = \mathcal{F}\{r_k^3\}_1 \, \mathcal{F}\{r_k^3\}_{-1}, \\
\psi_3 &= \phi_6 = \mathrm{Re}(c_{20} c_{12}^2) = \mathrm{Re}\{\mathcal{F}\{r_k^2\}_2 \, (\mathcal{F}\{r_k^3\}_{-1})^2\}, \\
\psi_4 &= \mathrm{Im}(c_{20} c_{12}^2) = \mathrm{Im}\{\mathcal{F}\{r_k^2\}_2 \, (\mathcal{F}\{r_k^3\}_{-1})^2\}, \\
\psi_5 &= \phi_5 = \mathrm{Re}(c_{30} c_{12}^3) = \mathrm{Re}\{\mathcal{F}\{r_k^3\}_3 \, (\mathcal{F}\{r_k^3\}_{-1})^3\}, \\
\psi_6 &= \phi_7 = \mathrm{Im}(c_{30} c_{12}^3) = \mathrm{Im}\{\mathcal{F}\{r_k^3\}_3 \, (\mathcal{F}\{r_k^3\}_{-1})^3\},
\end{aligned}
\tag{12}
$$

where $\mathrm{Re}(\cdot)$ means the real part and $\mathrm{Im}(\cdot)$ the imaginary part of a complex number.

The first moment invariant $\psi_1$ is the DFT coefficient defined at frequency 0. As can be seen from (9), this invariant equals the average of $r_k^2$, which is clearly shift-invariant. Comparing $\psi_2$ with (4) reveals that this invariant is the power spectrum component of $\{r_k^3\}$ with $n = 1$. The invariants $\psi_5$ and $\psi_6$ are actually the real and the imaginary parts of the trispectrum [3] of $\{r_k^3\}$. On the other hand, the invariants $\psi_3$ and $\psi_4$ cannot be explained by the shift invariance of the power spectrum or the polyspectra, because two different sequences $\{r_k^2\}$ and $\{r_k^3\}$ are involved. This suggests that there is some more general theory behind these moment invariants as well as the corresponding shift invariants.

Recently, Flusser [12] proposed a general framework for constructing rotation invariants from complex moments. He showed that

$$ I = \prod_{i=1}^{m} c_{p_i q_i}^{d_i} \tag{13} $$

is invariant to rotation if

$$ \sum_{i=1}^{m} d_i (p_i - q_i) = 0, \tag{14} $$

where $m \geq 1$, $i = 1, \ldots, m$, and $d_i$, $p_i$ and $q_i$ are non-negative integers. In order to generalize our discussion about the shift invariants we need to review the proof of these formulas from [12].
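As an illustration of (12), the sketch below evaluates $\psi_1, \ldots, \psi_6$ for a real sequence and for a circularly shifted copy of it. It is a minimal example, assuming the moment definition (9) with $\theta_k = 2\pi k/N$; the helper names `c` and `flusser_invariants` and the test data are hypothetical.

```python
import cmath

def c(r, p, q):
    """Complex moment of eq. (9) for a uniformly sampled real sequence r."""
    N = len(r)
    return sum(rk ** (p + q) * cmath.exp(2j * cmath.pi * (p - q) * k / N)
               for k, rk in enumerate(r)) / N

def flusser_invariants(r):
    """psi_1 ... psi_6 of eq. (12), built from second and third order moments."""
    c11, c20, c30 = c(r, 1, 1), c(r, 2, 0), c(r, 3, 0)
    c21, c12 = c(r, 2, 1), c(r, 1, 2)
    a = c20 * c12 ** 2      # exponents satisfy (14): 1*(2-0) + 2*(1-2) = 0
    b = c30 * c12 ** 3      # exponents satisfy (14): 1*(3-0) + 3*(1-2) = 0
    # c21 * c12 is real for real r, because c12 is the conjugate of c21.
    return [c11.real, (c21 * c12).real, a.real, a.imag, b.real, b.imag]

r = [1.0, 2.5, -0.5, 3.0, 0.75, -1.25, 2.0, 0.25]
N = len(r)
s = [r[(k - 3) % N] for k in range(N)]          # circular shift by 3 samples

psi_r = flusser_invariants(r)
psi_s = flusser_invariants(s)
for x, y in zip(psi_r, psi_s):
    print(x, y)                                  # pairs agree up to rounding

# psi_1 is the average of r_k^2, as noted in the text.
print(psi_r[0], sum(v * v for v in r) / N)
```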
Let $f'$ be a rotated version of a 2-D image $f$ so that $f'(r, \theta) = f(r, \theta + \alpha)$, where $\alpha$ is the angle of rotation. The complex moments of $f'$, denoted by $c'_{pq}$, can be expressed in terms of $c_{pq}$ in the following manner:

$$ c'_{pq} = e^{-j(p-q)\alpha} c_{pq}. \tag{15} $$

In order to construct a rotation invariant descriptor we need to eliminate $\alpha$ from its expression. This can be achieved by multiplying moments of different order in such a way that the condition in (14) is satisfied. However, closer inspection of these formulas reveals that satisfying the condition (14) does not necessarily require that the moments involved in the product (13) have the same radial distance $r$. The only requirement is that each pair of moments must satisfy (15).

Based on the duality between rotation and shift invariance, we can extend this result to construct shift invariants. It was already pointed out that $r_k$ can be an arbitrary signal, but now it is evident that the signals involved in the expression of the invariant do not have to be identical; they only need to be shifted by the same amount. If we want to construct a shift invariant for a single sequence, we can utilize this result by introducing a set of real 1-D functionals $\tau_i : \mathbb{R} \to \mathbb{R}$ and using $\tau_i(r_k)$ instead of $r_k$. For example, $\psi_3$ and $\psi_4$ in (12) use the functionals $\tau_1(r) = r^2$ and $\tau_2(r) = \tau_3(r) = r^3$. From (10) we notice that the functionals in all moment invariants have the same form $\tau_i(r) = r^{p+q}$. In general, the functionals can be any real-valued and position invariant mappings of $r$. Henceforth, we will call the functionals $\tau_i$ shaping functions.

Assuming that we have a 1-D real-valued discrete-time sequence $\{r_k\}$, we can now write the following general form of a shift-invariant operator:

$$ \Psi_{\omega_1, \omega_2, \ldots, \omega_m}\{r_k\} = \prod_{i=1}^{m} \mathcal{F}\{\tau_i(r_k)\}_{\omega_i}, \tag{16} $$

where $\omega_1, \ldots, \omega_m$ are integer parameters that must satisfy the constraint

$$ \sum_{i=1}^{m} \omega_i = 0. \tag{17} $$

These equations specify a new class of shift-invariant operators. Because the shaping function $\tau_i$ can have different forms as discussed above, (16) represents an unlimited number of shift invariants.
Notice that because of the constraint (17) the parameter space spanned by $(\omega_1, \omega_2, \ldots, \omega_m)$ has only $m - 1$ degrees of freedom. We can immediately see that the power spectrum (4) and the bispectrum (5) are members of this class. For the power spectrum, $m = 2$, $\tau_1(r) = \tau_2(r) = r$, and $\omega_1 = -\omega_2 = n$, where $n = 0, \ldots, N - 1$. For the bispectrum, $m = 3$, $\tau_1(r) = \tau_2(r) = \tau_3(r) = r$, $\omega_1 = n_1$, $\omega_2 = n_2$, and $\omega_3 = -n_1 - n_2$.
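A direct, unoptimized reading of (16) and (17) can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used in the paper: the function names are hypothetical, the shaping functions are assumed to act sample by sample, the shift is assumed to be circular, and the DFT kernel is taken as $e^{-2\pi j n k/N}$ (the invariance does not depend on the sign convention).

```python
import cmath

def dft_bin(x, n):
    """Single DFT bin with kernel exp(-2*pi*j*n*k/N); n may be negative."""
    N = len(x)
    return sum(v * cmath.exp(-2j * cmath.pi * n * k / N) for k, v in enumerate(x))

def shift_invariant(r, shaping, omegas):
    """Eq. (16): product of DFT bins of the shaped sequences tau_i(r_k)."""
    if len(shaping) != len(omegas):
        raise ValueError("one frequency index is needed per shaping function")
    if sum(omegas) != 0:
        raise ValueError("constraint (17) violated: the omegas must sum to zero")
    value = 1.0 + 0.0j
    for tau, omega in zip(shaping, omegas):
        value *= dft_bin([tau(v) for v in r], omega)
    return value

def identity(v):
    return v

def square(v):
    return v * v

def cube(v):
    return v ** 3

r = [0.5, 1.0, 3.0, -2.0, 0.5, 0.0, 4.0, 1.5]
s = [r[(k - 3) % len(r)] for k in range(len(r))]     # circular shift by 3 samples

# Shaping functions of psi_3 and psi_4 in (12): tau_1 = r^2, tau_2 = tau_3 = r^3.
mixed_r = shift_invariant(r, [square, cube, cube], [2, -1, -1])
mixed_s = shift_invariant(s, [square, cube, cube], [2, -1, -1])
print(abs(mixed_r - mixed_s))

# Shaping functions need not be powers of r; abs() is used here as an example.
other_r = shift_invariant(r, [identity, abs, cube], [3, -1, -2])
other_s = shift_invariant(s, [identity, abs, cube], [3, -1, -2])
print(abs(other_r - other_s))

# The power spectrum is the member with m = 2 and omega_1 = -omega_2 = n.
power = shift_invariant(r, [identity, identity], [2, -2])
print(abs(power - abs(dft_bin(r, 2)) ** 2))
```

Rejecting parameter sets that violate (17) is a design choice of this sketch: for such sets the phase factors introduced by a shift do not cancel, so the product is generally not invariant.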
III. COMPUTATIONAL ASPECTS

In its basic form the operator (16) produces an $(m - 1)$-dimensional representation of the 1-D signal. In many cases, it is more desirable to have a representation with the same dimensionality as the original signal. In this section, the dimensionality problem is solved by considering only linear slices of the multidimensional representation. The sequence of the invariants is also transformed back to the spatial domain. In the following, we will assume only 1-D and 2-D input signals, but the generalization of the method to N-D signals is straightforward.

A. 1-D signals

Let $\{r_k\}$ be an arbitrary real-valued 1-D discrete-time signal with $N$ samples. From (16) we can see that there are various options for constructing the invariants for $\{r_k\}$. We can select different values for $m$ and for the parameters $\omega_1, \ldots, \omega_m$. Also, the shaping functions may change for each different set of parameter values. Because of this vast number of possibilities we need to limit ourselves to a more compact set of invariants.

The first principle is that we only compute a 1-D sequence of invariants with a length of $mN$ samples. This length is consistent with the requirement set for the power spectrum and the autocorrelation as well as for the other polyspectra to avoid the wraparound error.

The second criterion for selecting the invariants is that only a single linear 1-D slice of the $(m - 1)$-dimensional parameter space is used. This guarantees that the sequence containing the invariants is symmetric and consequently its inverse DFT is real-valued. In principle, slices of different orientations could be used, but in this paper we select the diagonal slice so that $\omega_1 = \omega_2 = \ldots = \omega_{m-1} = n$, where $n = 0, \ldots, mN - 1$. As shown in [13], the diagonal slice of the bispectrum can be used for reconstructing the signal, which indicates that it contains all the necessary information about the signal waveform.
The third constraint is that the shaping function $\tau_i$ is not changed for the different values of $\omega_1, \ldots, \omega_m$. This is just for a practical reason, because otherwise it would not be possible to utilize fast Fourier transform (FFT) algorithms for computing the invariants.

For notational convenience, let $R_{n,i} \equiv \mathcal{F}\{\tau_i(r_k)\}_n$. In general, we would need to compute $mN$ samples of the DFT for each $i = 1, \ldots, m$. However, when using the diagonal slice $\omega_1 = \omega_2 = \ldots = \omega_{m-1} = n$, the first $m - 1$ DFTs in (16) are the same, assuming that $\tau_1(r) = \tau_2(r) = \ldots = \tau_{m-1}(r)$. Based on the constraint (17), the last DFT to be computed becomes $R_{-(m-1)n,m} = R^{*}_{(m-1)n,m}$. In other words, we need the DFT samples from the bins $n$ and $-(m - 1)n$, where $n = 0, \ldots, mN - 1$. The first set of samples is obtained directly from the sequence $\{R_{n,1}\}$. For the second set we need to permute the samples.
Here, we can utilize the conjugate symmetry and periodicity of the DFT by considering the sequence as having infinite length with a period of $mN$ samples. The sample indices needed are obtained from the equation

$$ l = -(m - 1)\, n \bmod mN, \quad n = 0, \ldots, mN - 1. \tag{18} $$

Next, a set of invariants $\{\Psi_{n,n,\ldots,n,l}\{r_k\}\}_{n=0}^{mN-1}$ is computed using (16). Finally, we can return to the time domain by taking the inverse DFT:

$$ \rho_k = \mathcal{F}^{-1}\{\Psi_{n,n,\ldots,n,l}\{r_k\}\}_k, \quad k = 0, \ldots, mN - 1. \tag{19} $$

This makes the resulting descriptor $\rho_k$ comparable with the autocorrelation function. Notice that $\{\rho_k\}$ is a real-valued sequence, because $\{\Psi\}$ is symmetric. However, $\{\rho_k\}$ is not necessarily symmetric, although the autocorrelation is always a symmetric function.

B. 2-D signals

Let $\{r_{k_1,k_2} \mid k_1 = 0, \ldots, N_1 - 1,\ k_2 = 0, \ldots, N_2 - 1\}$ be an $N_1$ by $N_2$ array, and $\{R_{n_1,n_2,i} \mid n_1 = 0, \ldots, mN_1 - 1,\ n_2 = 0, \ldots, mN_2 - 1\}$ the corresponding 2-D DFT array, where $R_{n_1,n_2,i} = \mathcal{F}\{\tau_i(r_{k_1,k_2})\}_{n_1,n_2}$. In order to apply the shift-invariant operator to the 2-D array, we need to perform a similar permutation for both indices as in the 1-D case so that

$$
\begin{aligned}
l_1 &= -(m - 1)\, n_1 \bmod mN_1, \quad n_1 = 0, \ldots, mN_1 - 1, \\
l_2 &= -(m - 1)\, n_2 \bmod mN_2, \quad n_2 = 0, \ldots, mN_2 - 1.
\end{aligned}
\tag{20}
$$

Again, the shift invariants are computed based on (16) for each $n_1$ and $n_2$. Finally, the array of invariants is converted into the spatial domain by using the 2-D inverse DFT.

IV. NUMERICAL EXAMPLES

This section gives some examples of how different shift invariants from the new class work in practice. Both 1-D and 2-D cases are considered and the methods described in the previous section are applied. For brevity, we only use two third order operators denoted by ρ(b) and ρ(c) and compare them with the well-known autocorrelation denoted by ρ(a). The operator ρ(b) is characterized by $m = 3$ and $\tau_1(r) = \tau_2(r) = \tau_3(r) = r$, which is basically the inverse DFT of the 1-D diagonal slice of the bispectrum.
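The 1-D procedure of Section III-A, i.e. the diagonal slice, the index permutation (18) and the inverse DFT (19), can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the DFT is a naive zero-padded one (an FFT would be used in practice), the shaping functions act sample by sample and map zero to zero so that they commute with zero padding, and the shifted test signal is chosen so that no samples wrap around. The names `diagonal_descriptor`, `identity` and the test data are hypothetical.

```python
import cmath

def dft(x, length):
    """Naive DFT of x zero-padded to `length` samples."""
    return [sum(v * cmath.exp(-2j * cmath.pi * n * k / length) for k, v in enumerate(x))
            for n in range(length)]

def idft(X):
    """Naive inverse DFT."""
    L = len(X)
    return [sum(v * cmath.exp(2j * cmath.pi * n * k / L) for n, v in enumerate(X)) / L
            for k in range(L)]

def diagonal_descriptor(r, m, tau_first, tau_last):
    """Diagonal slice omega_1 = ... = omega_{m-1} = n, then eq. (19)."""
    L = m * len(r)                                   # mN samples
    A = dft([tau_first(v) for v in r], L)            # shared by the first m - 1 factors
    B = dft([tau_last(v) for v in r], L)             # last factor
    # Eq. (18): the last index is l = -(m - 1) * n mod mN.
    psi = [A[n] ** (m - 1) * B[(-(m - 1) * n) % L] for n in range(L)]
    return idft(psi)

def identity(v):
    return v

r = [0.0, 0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 0.0, 0.0, 0.0]
s = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0, -2.0, 0.5, 0.0]   # r delayed by 3 samples

rho_a = diagonal_descriptor(r, 2, identity, identity)     # autocorrelation
rho_b_r = diagonal_descriptor(r, 3, identity, identity)   # bispectrum diagonal slice
rho_b_s = diagonal_descriptor(s, 3, identity, identity)

print(max(abs(v.imag) for v in rho_b_r))                  # real-valued up to rounding
print(max(abs(a - b) for a, b in zip(rho_b_r, rho_b_s)))  # shift-invariant
print(rho_a[0].real, sum(v * v for v in r))               # lag 0 equals the signal energy
print(rho_b_r[1].real, rho_b_r[-1].real)                  # not symmetric in general
```

With $m = 2$ the same routine yields the autocorrelation, which makes the third order descriptor directly comparable with it.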
The operator ρ(c) is characterized by $m = 3$, $\tau_1(r) = \tau_2(r) = r$ and $\tau_3(r) = \partial_t r$, where the time derivative $\partial_t r$ can be approximated in the discrete domain by $\partial_t r \approx r_{k+1} - r_k$. Notice that this approximation is position invariant although it depends on the indices $k$ and $k + 1$.
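To make the role of the difference shaping function concrete, the following Python sketch computes ρ(a) and ρ(c) for a test signal, a delayed copy and a mirrored copy. It is an illustration only, with several assumptions: the signal is zero-padded to $mN$ samples before shaping, the difference $r_{k+1} - r_k$ wraps around at the end of the padded sequence, and a naive DFT replaces the FFT. Function names and test data are hypothetical.

```python
import cmath

def dft(x):
    """Naive DFT with kernel exp(-2*pi*j*n*k/L)."""
    L = len(x)
    return [sum(v * cmath.exp(-2j * cmath.pi * n * k / L) for k, v in enumerate(x))
            for n in range(L)]

def idft_real(X):
    """Inverse DFT; only the real part is kept, because the slice is conjugate symmetric."""
    L = len(X)
    return [(sum(v * cmath.exp(2j * cmath.pi * n * k / L) for n, v in enumerate(X)) / L).real
            for k in range(L)]

def forward_difference(x):
    """tau_3: r_{k+1} - r_k, with the index wrapped at the end of the padded sequence."""
    L = len(x)
    return [x[(k + 1) % L] - x[k] for k in range(L)]

def descriptor(r, m, last_shaping=None):
    """Diagonal-slice descriptor with tau_1 = ... = tau_{m-1} = identity."""
    L = m * len(r)
    x = list(r) + [0.0] * (L - len(r))               # zero-pad to mN samples
    y = last_shaping(x) if last_shaping else x       # shaped copy for the last factor
    X, Y = dft(x), dft(y)
    psi = [X[n] ** (m - 1) * Y[(-(m - 1) * n) % L] for n in range(L)]
    return idft_real(psi)

def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

r = [0.0, 0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, -2.0, 0.5]   # r delayed by 2 samples
mirrored = r[::-1]

rho_a = descriptor(r, 2)                               # autocorrelation
rho_c = descriptor(r, 3, forward_difference)           # third order, difference shaping

print(max_abs_diff(rho_a, descriptor(shifted, 2)))                       # ~0
print(max_abs_diff(rho_c, descriptor(shifted, 3, forward_difference)))   # ~0
print(max_abs_diff(rho_a, descriptor(mirrored, 2)))                      # ~0: mirroring is lost
print(max_abs_diff(rho_c, descriptor(mirrored, 3, forward_difference)))  # clearly nonzero
```

In this sketch the autocorrelation cannot distinguish the signal from its mirror image, whereas the third order descriptor changes when the signal is flipped.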
Fig. 1. 1-D examples. From left to right: originals, autocorrelations, and two third order descriptors.

In the first example, illustrated in Fig. 1, four 1-D signals are shown in the leftmost column. The first two signals are mutually shifted, the third one is a mirrored version of the second signal, and the fourth one was obtained by adding zero mean Gaussian noise to the second signal. The corresponding autocorrelations ρ(a) are in the second column, and the responses of the operators ρ(b) and ρ(c) are shown in the last two columns, respectively. As we can see, all three operators are shift-invariant. However, the autocorrelation is also invariant to signal mirroring, which makes it impossible to determine whether the signal has been flipped or not, whereas this information is still available from the other two operators. The reason for this is that the autocorrelation does not preserve the phase spectrum of the signal, unlike the other two operators. As is well known, the phase spectrum contains important shape information. From the last signal we notice that ρ(a) and ρ(b) suppress the noise, whereas ρ(c) seems to pass the noise with less attenuation. This indicates that ρ(c) is more sensitive to high-frequency components, which might be a useful property when small details must be recognized.

Fig. 2. 2-D examples. From left to right: originals, autocorrelations, and two third order descriptors.

In the second example, the 2-D versions of the same operators ρ(a), ρ(b) and ρ(c) are applied to four intensity images. In the case of ρ(c) the shaping function is now the spatial difference along the vertical image axis, i.e. $\tau_3(x, y) = r(x, y + 1) - r(x, y)$. The original images are shown in Fig. 2 on the left. The first and third images represent the two letters G and P. The second image is a shifted and noisy version of the first one, and the last image is a mirror-reflected version of the third one. Again, the autocorrelations are in the second column, and the responses of the operators ρ(b) and ρ(c) are in the following two columns. The same observations can be made as in the 1-D case. We can also notice some resemblance between the original images and the responses of the last two operators. This suggests that ρ(b) and ρ(c) could be better operators for pattern recognition purposes than the autocorrelation.

V. CONCLUSION

This paper has presented a new class of shift-invariant operators that can be computed efficiently with the discrete Fourier transform. Within the new class, an unlimited number of operators can be constructed
by using different shaping functions. These operators can also be extended to multidimensional signals in a straightforward manner. The power spectrum and the bispectrum as well as the other polyspectra can be seen as members of this class. Since the power spectrum does not preserve the phase information, it does not always provide a sufficient basis for signal analysis, while the higher order operators could give a better description of the original signal by retaining the phase information. The examples given in the paper indicate that these higher order operators, transformed back to the time or spatial domain, preserve the characteristics of the original signal better than the autocorrelation. This property is likely to be useful in pattern recognition applications, where shift invariance plays a key role.

REFERENCES

[1] J. A. McLaughlin and J. Raviv, "Nth-order autocorrelations in pattern recognition," Information and Control, vol. 12, pp. 121-142, 1968.
[2] T. Kurita, N. Otsu, and T. Sato, "A face recognition method using higher order local autocorrelation and multivariate analysis," in Proc. 11th IAPR International Conference on Pattern Recognition (ICPR 92), The Hague, Netherlands, Aug. 1992, pp. 213-216.
[3] J. M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications," Proc. IEEE, vol. 79, no. 3, pp. 278-305, 1991.
[4] V. Chandran and S. L. Elgar, "Pattern recognition using invariants defined from higher order spectra - one-dimensional inputs," IEEE Trans. Signal Processing, vol. 41, no. 1, pp. 205-212, 1993.
[5] V. Chandran, B. Carswell, B. Boashash, and S. L. Elgar, "Pattern recognition using invariants defined from higher order spectra - 2-D image inputs," IEEE Trans. Image Processing, vol. 6, no. 5, pp. 703-712, 1997.
[6] G. B. Giannakis and J. M. Mendel, "Identification of non-minimum phase systems using higher-order statistics," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 360-377, 1989.
[7] B. Sadler and G. B. Giannakis, "Shift and rotation invariant object reconstruction using the bispectrum," Journal of the Optical Society of America A, vol. 9, pp. 57-69, 1992.
[8] A. P. Petropulu and H. Pozidis, "Phase reconstruction from bispectrum slices," IEEE Trans. Image Processing, vol. 46, no. 2, pp. 527-530, 1998.
[9] H. Reitboeck and T. P. Brody, "A transformation with invariance under cyclic permutation for applications in pattern recognition," Information and Control, vol. 15, no. 2, pp. 130-154, 1969.
[10] M. Wagh and S. Kanetkar, "A class of translation invariant transforms," IEEE Trans. Acoust., Speech, Signal Processing, vol. 25, no. 2, pp. 203-205, 1977.
[11] M. K. Hu, "Visual pattern recognition by moment invariants," IEEE Trans. Inform. Theory, vol. 8, pp. 179-187, 1962.
[12] J. Flusser, "On the independence of rotation moment invariants," Pattern Recognition, vol. 33, pp. 1405-1410, 2000.
[13] S. A. Dianat and M. R. Raghuveer, "Fast algorithms for phase and magnitude reconstruction from bispectra," Optical Engineering, vol. 29, no. 5, pp. 504-512, 1990.