CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION

4.1 INTRODUCTION

Weighted average-based fusion algorithms are among the widely used fusion methods for multi-sensor data integration. These methods involve the selection of appropriate weights to combine the images, so as to reduce the effects of distortion and also give a satisfactory visual image quality. The method of selecting weights based on the energy of the decomposed wavelet coefficients was described in Chapter 3. This chapter outlines a fusion scheme that uses weights based on the statistical nature of the data to be combined. The statistical tool used is Principal Component Analysis (PCA). Two approaches to using PCA in image fusion have been listed by Genderen and Pohl (1998):

1. PCA of multi-channel images where the first principal component is replaced by different images. This method, known as Principal Component Substitution (PCS), has been used for remote sensing data fusion by Chavez et al. (1991).

2. PCA of all multi-image data channels, as outlined by Yesou et al. (1993) and whose results were reported by Richards (1984).

In addition to these, a third approach can be included, in which the PCA transform is used to calculate the weights of a linear combination of the input images, as in the algorithms developed by Das et al. (2000), Haq et al. (2005) and Zheng et al. (2007). Haq et al. and Zheng et al. describe the PCA fusion rule along with multi-resolution image decomposition using the DWT. This research work focuses on the third method detailed above and offers alternative algorithms to determine the weights of fusion.

The PCA-based weighted fusion involves separately fusing the high frequency (HF) and the low frequency (LF) parts of an image. The two frequency components are obtained by a filtering mechanism, and finally the fused components are added together to get the resultant fused output.

This chapter begins with an introduction to the PCA transform from a statistical standpoint, followed by the principal component analysis of images in section 4.3. The newly designed fusion algorithms that make use of the PCA for calculating the weights are described in the next section. This includes a discussion of the PCA Gaussian weighted average fusion method and the efficient PCA Max fusion scheme in sections 4.4.2 and 4.4.3 respectively. The performance of the fusion schemes detailed in the previous section is analysed in section 4.5. This chapter concludes with a summary of the proposed techniques.

4.2 PRINCIPAL COMPONENT ANALYSIS (PCA) TRANSFORM

4.2.1 Introduction

Principal Component Analysis is a quantitatively rigorous method for achieving simplification. Often, its operation can be thought of as revealing the internal structure of the data in a way which best explains the variance in the data. The method generates a new set of variables called Principal Components (PCs).

Each principal component is a linear combination of the original variables and all the PCs are orthogonal to each other; as a whole they form an orthogonal basis for the space of the data, thereby removing redundant information, if any.

The first principal component is a single axis in space. When each of the observations in the data set is projected on this axis, the resulting values form a new variable, and the variance of this variable is the maximum among all possible choices of the first axis. The second principal component is another axis in space, perpendicular to the first. Projecting the observations on this axis generates another new variable, such that the variance of this variable is the maximum among all possible choices of this second axis. The full set of principal components is as large as the original set of variables. In general, the sum of the variances of the first few principal components exceeds 80% of the total variance of the original data. Since the original data can largely be recovered from the first few PCs themselves, Principal Component Analysis enables a decrease in the number of channels (or bands) by reducing the inter-channel dependencies. The multidimensional space is mapped into a space of fewer dimensions by transforming the original space using a linear transformation via a principal component analysis.

The steps involved in the PCA transform, also called the Karhunen-Loeve (KL) transform, are:

1. Calculate the covariance matrix or the correlation matrix of the data sets to be transformed. The covariance matrix is used in the case of the unstandardised PCA, while the standardised PCA uses the correlation matrix.

2. Calculate the eigenvalues and the eigenvectors from the correlation / covariance matrix.

3. Obtain the principal components of the given data set as the eigenvectors of the covariance matrix of the input.

If the correlation matrix of the data is constructed and the eigenvectors are found and listed in eigenvalue order, then just the first few eigenvectors can be used to reconstruct a large fraction of the variance of the original data. The first few eigenvectors can often be interpreted in terms of the large-scale physical behaviour of the system. The original space is thus reduced to the space spanned by a few eigenvectors, with some data loss, but retaining the most important variance. A lossless dimensionality reduction is possible if the data in question falls exactly on a smooth, locally flat subspace; however, noisy data prevents such an exact mapping, introducing some loss of information.

4.2.2 Mathematical Analysis of the Principal Component Analysis Transform

Consider n-dimensional data in the form of M vectors x_1, x_2, x_3, ..., x_M, defined by X as

    X = [x_1, x_2, x_3, ..., x_M]                                   (4.1)

The first principal component is the n-dimensional vector along whose direction the variance is maximized. This principal component can be computed as the eigenvector of the correlation matrix having the largest eigenvalue. The next step is the computation of the mean vector of the population, defined by the equation

    m_x = (1/M) Σ_{k=1}^{M} x_k                                     (4.2)

where m_x is the mean vector associated with the population of x vectors. For M vector samples from a random population, the covariance matrix can be approximated from the samples by

    C_x = (1/M) Σ_{k=1}^{M} [x_k x_k^T - m_x m_x^T]                 (4.3)

where T indicates vector transposition. C_x is a covariance matrix of order n x n. Element c_ii of C_x is the variance of x_i, the i-th component of the x vectors in the population, and element c_ij of C_x is the covariance between elements x_i and x_j of these vectors. The matrix C_x is real and symmetric. If elements x_i and x_j are uncorrelated, their covariance is zero and, therefore, c_ij = c_ji = 0. Since C_x is real and symmetric, finding a set of n orthogonal eigenvectors is always possible. Let e_i and λ_i, i = 1, 2, ..., n, be the eigenvectors and corresponding eigenvalues of C_x, arranged in descending order so that

    λ_j ≥ λ_{j+1}   for j = 1, 2, ..., n-1                          (4.4)

Let A be a matrix whose rows are formed from the eigenvectors of C_x, ordered so that the first row of A is the eigenvector corresponding to the largest eigenvalue and the last row is the eigenvector corresponding to the smallest eigenvalue.

The matrix A is called the transformation matrix that maps the vectors x into vectors denoted by y as follows:

    y = A (x - m_x)                                                 (4.5)

This equation is called the principal components transform (also called the Hotelling transform). The mean of the y vectors resulting from this transformation is zero,

    m_y = 0                                                         (4.6)

and the covariance matrix of the y vectors can be obtained in terms of A and C_x by

    C_y = A C_x A^T                                                 (4.7)

Furthermore, C_y is a diagonal matrix whose elements along the main diagonal are the eigenvalues of C_x; that is,

    C_y = diag(λ_1, λ_2, ..., λ_n)                                  (4.8)

The off-diagonal elements of this covariance matrix are 0, and hence the elements of the y vectors are uncorrelated. The rows of matrix A are the normalized eigenvectors of C_x. Because C_x is real and symmetric, these vectors form an orthonormal set, and it follows that the elements along the main diagonal of C_y are the eigenvalues of C_x. The main diagonal element in the i-th row of C_y is the variance of vector element y_i.

Because the rows of A are orthonormal, its inverse equals its transpose. Thus, one can recover the vectors x by performing the inverse transformation

    x = A^T y + m_x                                                 (4.9)

To conclude this section, principal component analysis is a mathematical way of determining the linear transformation of a sample of points in an N-dimensional space which exhibits the properties of the sample most clearly along the coordinate axes, and in this process reduces the inter-channel dependencies. The eigenvalue of each principal component corresponds to the amount of the total variance in the data described by that component.

4.3 PRINCIPAL COMPONENTS ANALYSIS FOR IMAGES

4.3.1 Basic Steps Involved (Gonzales and Woods 2002)

PCA finds applications in image processing, where it has been used for identifying the patterns in data and expressing the data in such a way as to highlight their similarities and differences. The steps involved in finding the PCA of the given images are the same as those followed in the previous section for statistical data sets. However, as the first stage, each given two-dimensional image is represented as a single vector; in general, an M x N image is represented as an MN x 1 column vector. Then the covariances of the given images are calculated, followed by the eigenvectors and eigenvalues of the covariance matrix, and finally the images are transformed using the eigenvectors. The transformed image consists of the principal components of the given images. The principal components of the images are the original data represented solely in terms of the eigenvectors.
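A minimal sketch of these steps in Python is given below. It assumes two registered, same-size, single-channel images supplied as NumPy arrays; the function and variable names (pca_transform and so on) are illustrative and not taken from the thesis.

    import numpy as np

    def pca_transform(images):
        """Hotelling / PCA transform of a set of images (a sketch of section 4.3.1).

        Each M x N image is flattened to an MN-element vector; the vectors form
        the columns of the data matrix X, so the covariance matrix is n x n for
        n input images.
        """
        # Vector-populate: one column per image, one row per pixel position.
        X = np.column_stack([np.asarray(img, dtype=float).ravel() for img in images])

        m_x = X.mean(axis=0)                       # mean of each image channel
        C_x = np.cov(X, rowvar=False)              # n x n covariance matrix

        # Eigenvectors of C_x, sorted by decreasing eigenvalue.
        eigvals, eigvecs = np.linalg.eigh(C_x)
        order = np.argsort(eigvals)[::-1]
        eigvals, A = eigvals[order], eigvecs[:, order].T   # rows of A are eigenvectors

        # Hotelling transform: y = A (x - m_x); each column of Y is a principal component image.
        Y = (X - m_x) @ A.T
        return Y, A, m_x, eigvals

    # Example with two same-size images (e.g. IR and CCD frames):
    ir = np.random.rand(64, 64)
    ccd = 0.6 * ir + 0.4 * np.random.rand(64, 64)
    Y, A, m_x, eigvals = pca_transform([ir, ccd])
    pc1 = Y[:, 0].reshape(ir.shape)    # component with the largest variance

The rows of A are the eigenvectors ordered by decreasing eigenvalue, so the first column of Y corresponds to the principal component carrying the largest variance.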

This process can be illustrated using an example. Consider two images of the same scene (a road) captured using two different image sensing systems. The first image, shown in Figure 4.1, is taken using an infrared camera, which responds to changes in the heat intensity of the source being imaged, and the second image, shown in Figure 4.2, is obtained using a conventional CCD camera.

Figure 4.1 IR Image
Figure 4.2 CCD Image

The pixels from both images, which are in two-dimensional matrix form of size M x N, are vector populated, similar to equation (4.1), as shown in Figure 4.5:

Figure 4.3 Matrix representation of Image 1 (X)
Figure 4.4 Matrix representation of Image 2 (Y)
Figure 4.5 Vector populated matrix (first column: pixels of Image 1, X; second column: pixels of Image 2, Y)

The next step is to find the covariance of the vector populated matrix. The covariance matrix gives the relation between the given images. If n images are given, the covariance matrix will be of dimension n x n.

Similar to equation (4.3), for the two-dimensional case the covariance can be represented by

    C(X, Y) = (1/(n-1)) Σ_{i=1}^{n} (X_i - m_X)(Y_i - m_Y)          (4.10)

where m_X is the mean of the samples of the image represented by vector X, and m_Y is the mean of the samples of the image represented by vector Y.

In the case of more than two source images, more than one covariance measurement is required. For example, from a three-image (X, Y, Z) vector populated matrix of dimension (number of pixels in a single image x 3), a covariance matrix of dimension 3 x 3 is obtained:

    C(X, Y, Z) = | cov(X,X)  cov(X,Y)  cov(X,Z) |
                 | cov(Y,X)  cov(Y,Y)  cov(Y,Z) |                   (4.11)
                 | cov(Z,X)  cov(Z,Y)  cov(Z,Z) |

The matrix elements (1,1), (2,2) and (3,3) give the variance of images X, Y and Z, respectively. The element (1,2) is the relation between images X and Y, and similarly for elements (1,3) and (2,3). The values of (1,2) and (2,1) are the same. Each entry in the matrix is the result of calculating the covariance between two separate dimensions. The following conclusions can be drawn from the covariance matrix: if the value of the covariance is positive, it indicates that the gray values in both images increase together; if the value is negative, it implies a negative correlation, that is, as the gray level in one image increases, the value in the other image decreases; and finally, a zero covariance indicates that the two images are independent of each other.

4.3.2 Determination of Eigenvalues of the Covariance Matrix

Since the covariance matrix is a square matrix, the eigenvectors and eigenvalues of this matrix can be computed, and they give useful information about the images. For the images under consideration, the eigenvalues and eigenvectors of the 2 x 2 covariance matrix take the form

    Eigenvalues  = (λ_1, λ_2)                                       (4.12)
    Eigenvectors = (e_1, e_2)                                       (4.13)

The graphical representation of the eigenvectors of the images is given in Figure 4.6.

Figure 4.6 Plot of the images (grey levels of the IR image against grey levels of the CCD image) with the eigenvectors of the covariance matrix overlaid on top

When plotted, both eigenvectors appear as dotted diagonal lines on the plot and are perpendicular to each other. More importantly, they provide information about the patterns in the data. It can be seen from the plot that one of the eigenvectors goes through the middle of the points, similar to plotting a line of best fit, and shows how the two data sets are related along that line. The second eigenvector gives the other, less important, pattern in the data, signified by the amount by which all the points following the main line are offset from it. This process of taking the eigenvectors of the covariance matrix enables extraction of the lines that characterise the data. In the above example, the eigenvector with the largest eigenvalue is the one that points down the middle of the data, giving the most significant relationship between the images.

The next step in finding the principal components of the given images is to transform the images using the eigenvectors, as in equation (4.5), which is expressed below in words:

    Principal Component = Eigenvector x (Mean subtracted vector populated matrix)

The eigenvector used above is either one or both of the eigenvectors (if two images are taken). In order that the matrix dimensions agree, either the eigenvector or the vector populated matrix has to be transposed. The principal component obtained using the eigenvector with the largest eigenvalue contains more information than that obtained using the eigenvector with the smallest eigenvalue. This can be seen from Figures 4.7 and 4.8, which show the images constructed using the largest eigenvector and the smallest eigenvector, respectively.

Figure 4.7 Principal components using the eigenvector corresponding to the largest eigenvalue
Figure 4.8 Principal components using the eigenvector corresponding to the smallest eigenvalue

From the discussion in this section, it can be summarized that the PCA technique has three effects: it orthogonalizes the components of the input vectors (so that they are uncorrelated with each other); it orders the resulting orthogonal components (principal components) so that those with the largest variation come first; and it eliminates those components that contribute the least to the variation in the data set.

Figure 4.9 Plot of the principal components obtained using both the eigenvectors

The PCA process has transformed the data so that it is expressed in terms of the patterns between the images, these patterns being the eigenvectors that most closely describe the relationships between the data. This is helpful because it enables the classification of all the data points (pixels) as a combination of the contributions from each of the eigenvectors. Initially, as in Figure 4.6, the plot has simple grey level axes, each axis representing the grey levels of one image, which does not convey any information on the relationship of the data points with each other. However, after processing the images using the PCA transform, the values of the data points specify exactly where they lie with respect to the trend lines, as seen in Figure 4.9. In the case of the transformation using both the eigenvectors, the data has been altered so that it is expressed in terms of the eigenvectors instead of the usual axes.

4.4 PCA-BASED WEIGHTED FUSION

4.4.1 Introduction

In this section, two new methods of computing weights for the additive fusion of two images are presented. The fusion schemes involve separating the low frequency (LF) and the high frequency (HF) components of the images and detail rules for fusing the frequency components. Consider the multi-sensor input images obtained using an infrared sensor and a visible spectrum CCD sensor, denoted I_ir and I_vis. The algorithm assumes that the source images are registered with each other and processed to be of the same size.

The frequency component separation is achieved by the use of a Gaussian low pass filter for smoothing the images. A typical smoothing convolution filter is essentially a matrix having an integer value in each row and column, the values chosen depending on the type of filter being used. For the Gaussian low pass filter, the two-dimensional kernel used is a 3 x 3 mask h (equation (4.14)). When an image is convolved with this type of filter, the gray value of each pixel is replaced by the average intensity of its eight nearest neighbors and itself. If the gray value of any pixel overlaid by the convolution kernel is dramatically different from that of its neighbors, the averaging effect of the filter will tend to reduce the effect of the noise by distributing it among all of the neighboring pixels. The smoothed images are

    S_ir  = I_ir  * h                                               (4.15)
    S_vis = I_vis * h                                               (4.16)

where * is the convolution operator. The images S_ir and S_vis are the low frequency components, representing the visible portion of the source images. The high frequency components are obtained by finding the deviations from the smoothed images:

    D_vis = I_vis - S_vis                                           (4.17)
    D_ir  = I_ir  - S_ir                                            (4.18)

The result of the filtering is shown in Figure 4.10 for the image of a house on a hill captured using a CCD camera, and in Figure 4.11 for the corresponding infrared image.

Figure 4.10 Effect of low pass filtering to separate frequency components of the CCD image (panels: input CCD image, low frequency component image, high frequency component image)
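A minimal sketch of this frequency component separation is given below, assuming NumPy/SciPy arrays; the exact integer mask of equation (4.14) is not reproduced in this extract, so the normalized 3 x 3 Gaussian values used here are an assumption.

    import numpy as np
    from scipy.signal import convolve2d

    # Assumed 3 x 3 Gaussian smoothing kernel (stand-in for the mask of eq. (4.14)).
    h = np.array([[1, 2, 1],
                  [2, 4, 2],
                  [1, 2, 1]], dtype=float) / 16.0

    def split_frequencies(image, kernel=h):
        """Split an image into low and high frequency parts (eqs. (4.15)-(4.18))."""
        img = np.asarray(image, dtype=float)
        smooth = convolve2d(img, kernel, mode='same', boundary='symm')  # S = I * h
        detail = img - smooth                                           # D = I - S
        return smooth, detail

    # Usage: S_ir, D_ir = split_frequencies(ir_image); S_vis, D_vis = split_frequencies(ccd_image)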

Figure 4.11 Effect of low pass filtering to separate frequency components of the IR image (panels: input IR image, low frequency component image, high frequency component image)

These two frequency components are then fused separately using different fusion rules. The fused components are then added together to get the fused output image. The fusion rules proposed in this thesis are discussed in the following sub-sections.

4.4.2 PCA Gaussian Fusion Algorithm

This fusion rule involves combining the low frequency (LF) components, S_ir and S_vis, using simple averaging. The high frequency (HF) components, D_ir and D_vis, are combined by weighted addition.

4.4.2.1 High Frequency Component Fusion Rule

The weights for fusion are calculated from the principal components of the high frequency part of the source images. This involves the computation of the principal components of the deviation components, as discussed in section 4.3. Let these components be denoted PC_1 and PC_2, corresponding to the largest eigenvalue and the smallest eigenvalue, respectively.

These principal components are used to define the ratios P_1 and P_2, which are in turn used to calculate the weights for the fusion. The ratios P_1 and P_2 are obtained as follows:

    P_1 = PC_1 / (PC_1 + PC_2)                                      (4.19)
    P_2 = PC_2 / (PC_1 + PC_2)                                      (4.20)

where PC_1 is the principal component corresponding to the largest eigenvalue, and PC_2 is the principal component corresponding to the smallest eigenvalue. The weights are obtained by smoothing the ratios of the principal components using a low pass filter similar to that of section 4.4.1 (equations (4.15) and (4.16)). The weights are given by

    w_1 = P_1 * h                                                   (4.21)
    w_2 = P_2 * h                                                   (4.22)

where * is the convolution operator and h is the Gaussian kernel defined in equation (4.14). Image fusion of the high frequency components is achieved by the weighted, normalized sum of the deviations defined by the fusion rule

    D_fuse = (D_ir w_1 + D_vis w_2) / (w_1 + w_2)                   (4.23)

The fused output is shown in Figure 4.12.
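Before turning to the low frequency rule, a sketch of this high frequency fusion rule is given below. It treats PC_1 and PC_2 as principal component images of the two deviation images, reuses the smoothing kernel from the earlier sketch, and adds a small epsilon purely as a numerical guard; the helper name and the epsilon are assumptions, not part of the thesis formulation.

    import numpy as np
    from scipy.signal import convolve2d

    def fuse_high_freq(d_ir, d_vis, kernel):
        """PCA-weighted fusion of the high frequency (deviation) images,
        following equations (4.19)-(4.23). Names are illustrative."""
        # Principal component images of the two deviations (section 4.3).
        X = np.column_stack([d_ir.ravel(), d_vis.ravel()])
        C = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(C)
        order = np.argsort(eigvals)[::-1]
        A = eigvecs[:, order].T                       # rows: eigenvectors, largest eigenvalue first
        Y = (X - X.mean(axis=0)) @ A.T
        pc1 = Y[:, 0].reshape(d_ir.shape)
        pc2 = Y[:, 1].reshape(d_ir.shape)

        eps = 1e-12                                   # numerical guard (assumption)
        p1 = pc1 / (pc1 + pc2 + eps)                  # eq. (4.19)
        p2 = pc2 / (pc1 + pc2 + eps)                  # eq. (4.20)

        w1 = convolve2d(p1, kernel, mode='same', boundary='symm')   # eq. (4.21)
        w2 = convolve2d(p2, kernel, mode='same', boundary='symm')   # eq. (4.22)

        return (d_ir * w1 + d_vis * w2) / (w1 + w2 + eps)           # eq. (4.23)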

Figure 4.12 Fusion of HF components

4.4.2.2 Low Frequency Component Fusion Rule

The low frequency components are combined by averaging the intensity levels of the two images. This LF image contributes the background information in the picture and is used to identify the presence of objects in the image. Mathematically,

    S_fuse = (S_ir + S_vis) / 2                                     (4.24)

Figure 4.13 Fusion of LF components

The final fused output is obtained by adding the weighted deviations of equations (4.17) and (4.18), fused as above, to the low frequency background image, and is shown in Figure 4.14:

    I_fuse = S_fuse + D_fuse                                        (4.25)

Figure 4.14 Fused image output using additive weighted fusion
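Putting the pieces together, a compact sketch of the full PCA Gaussian fusion algorithm is shown below; it reuses the split_frequencies and fuse_high_freq helpers and the kernel h from the earlier sketches, all of which are illustrative names rather than the thesis's own code.

    def pca_gaussian_fuse(i_ir, i_vis, kernel=h):
        """End-to-end PCA Gaussian fusion (section 4.4.2): average the LF parts,
        fuse the HF parts with PCA-derived weights, then add (eqs. (4.24)-(4.25))."""
        s_ir, d_ir = split_frequencies(i_ir, kernel)
        s_vis, d_vis = split_frequencies(i_vis, kernel)
        s_fuse = 0.5 * (s_ir + s_vis)                 # eq. (4.24)
        d_fuse = fuse_high_freq(d_ir, d_vis, kernel)  # eq. (4.23)
        return s_fuse + d_fuse                        # eq. (4.25)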

4.4.3 Fusion Scheme Using the PCA Max Rule

This newly designed scheme overcomes the performance limitation of the weight-based fusion algorithm proposed in section 4.4.2. It uses the same technique of separating the image into the low and high frequency components. The low frequency components are combined using the principle of choosing the maximum intensity pixels from each LF image. The HF components are fused using the weighted additive fusion rule of section 4.4.2.1.

4.4.3.1 Low Frequency Component Fusion Rule

The low frequency images S_ir and S_vis are fused using the Select Max principle as discussed by Zheng et al. (2007). Since the visible information is contained in the low frequency components, fusing the images by selecting the pixel values with the highest intensity gives an output image that has a very high quality as perceived by a human observer. This rule involves choosing the maximum intensity level of the corresponding pixels from each LF image to represent the pixel value in the fused image. However, to enable comparison of the images on a pixel by pixel basis, a histogram matching of the two images is performed. This is achieved by matching the histogram of the visible image to that of the IR image (Gonzales and Woods 2002, Jain 1989). The decision map of the low frequency image fusion rule is

    S_fuse(m, n) = S_ir(m, n)    if S_ir(m, n) > S_vis(m, n)
                   S_vis(m, n)   if S_ir(m, n) < S_vis(m, n)        (4.26)

where m = 1, 2, 3, ..., S_m, n = 1, 2, 3, ..., S_n, and S_m x S_n are the dimensions of the LF component image.

The fused output is shown in Figure 4.15.

Figure 4.15 LF component fusion using the choose max rule

4.4.3.2 High Frequency Component Fusion Rule

The weights for fusion are calculated from the principal components of the high frequency part of the source images. This involves the computation of the principal components of the deviation components, as discussed in section 4.3. Let these components be denoted PC_1 and PC_2, corresponding to the largest eigenvalue and the smallest eigenvalue, respectively. These principal components are used to define the ratios P_1 and P_2, which are in turn used to calculate the weights for the fusion. The ratios P_1 and P_2 are obtained as follows:

    P_1 = PC_1 / (PC_1 + PC_2)                                      (4.27)
    P_2 = PC_2 / (PC_1 + PC_2)                                      (4.28)

where PC_1 is the principal component corresponding to the largest eigenvalue, and PC_2 is the principal component corresponding to the smallest eigenvalue. The weights are obtained by smoothing the ratios of the principal components using a low pass filter similar to that of section 4.4.1 (equations (4.15) and (4.16)). The weights are given by

    w_1 = P_1 * h                                                   (4.29)
    w_2 = P_2 * h                                                   (4.30)

where * is the convolution operator and h is the Gaussian kernel defined in equation (4.14). Image fusion of the high frequency components is achieved by the weighted, normalized sum of the deviations defined by the fusion rule

    D_fuse = (D_ir w_1 + D_vis w_2) / (w_1 + w_2)                   (4.31)

The fused output is shown in Figure 4.16.

Figure 4.16 HF component fusion using the weighted average method

The final fused output is given by

    I_fuse = S_fuse + D_fuse                                        (4.32)

as shown in Figure 4.17.

Figure 4.17 Fused output using the PCA Max method
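A minimal sketch of the complete PCA Max scheme is given below; it reuses the split_frequencies and fuse_high_freq helpers and the kernel h sketched earlier, and the quantile-based histogram matcher is a simple stand-in for the histogram matching step, not the exact procedure of Gonzales and Woods.

    import numpy as np

    def match_histogram(source, reference):
        """Map the grey-level distribution of `source` onto that of `reference`
        (a simple quantile-based stand-in for histogram matching)."""
        src = source.ravel()
        ref = np.sort(reference.ravel())
        ranks = np.argsort(np.argsort(src))                   # rank of each source pixel
        idx = np.round(ranks * (ref.size - 1) / max(src.size - 1, 1)).astype(int)
        return ref[idx].reshape(source.shape)

    def pca_max_fuse(i_ir, i_vis, kernel=h):
        """PCA Max fusion (section 4.4.3): histogram-match the visible image to the
        IR image, select the per-pixel maximum of the LF parts (eq. (4.26)),
        fuse the HF parts with PCA weights (eqs. (4.27)-(4.31)), then add (eq. (4.32))."""
        i_vis_matched = match_histogram(np.asarray(i_vis, dtype=float),
                                        np.asarray(i_ir, dtype=float))
        s_ir, d_ir = split_frequencies(i_ir, kernel)
        s_vis, d_vis = split_frequencies(i_vis_matched, kernel)
        s_fuse = np.maximum(s_ir, s_vis)                      # select-max decision map
        d_fuse = fuse_high_freq(d_ir, d_vis, kernel)          # same HF rule as section 4.4.2.1
        return s_fuse + d_fuse                                # eq. (4.32)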

4.5 RESULTS

Experiments were carried out with the newly developed PCA fusion algorithms on different sets of images pertaining to surveillance and night vision applications. The results of the experiments are tabulated in Table 4.1; a more detailed comparison of the results is given in Chapter 6. In addition, the average results obtained by conducting experiments on 40 sets of images are shown in Table 4.2. Here the results are compared with the existing DWT-PCA-Max algorithm (Zheng et al. 2007) and the PCA weighted superposition method (Rockinger 1999). Based on the experimental work performed using the various quality metrics for the newly designed PCA-based fusion, the results obtained are given in Table 4.1.

Table 4.1 Performance metrics for the newly designed PCA-based fusion algorithms

    Image         Fusion Scheme    EN    SSIM    MI    SD    CE
    Boat          PCA Max
    Boat          PCA Gaussian
    Road Scene    PCA Max
    Road Scene    PCA Gaussian

Figure 4.18 Fusion output for the boat image: (a) PCA Max, (b) PCA Gaussian
Figure 4.19 Fusion output for the road scene image: (c) PCA Max, (d) PCA Gaussian

Table 4.2 Average values for different sets of images

    Image               Fusion Scheme    CE    EN    SSIM    MI    SD
    40 sets of images   PCA Max
                        PCA Gaussian
                        DWT PCA Max
                        PCA fusion
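For reference, a minimal sketch of how two of the tabulated metrics could be computed is given below, assuming 8-bit grey-level images as NumPy arrays; the entropy helper is illustrative, and SSIM is taken from scikit-image rather than from the thesis's own implementation.

    import numpy as np
    from skimage.metrics import structural_similarity

    def entropy(img, bins=256):
        """Shannon entropy (EN) of a grey-level image."""
        hist, _ = np.histogram(img, bins=bins, range=(0, 255))
        p = hist / hist.sum()
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    # Structural similarity (SSIM) of the fused output against a source image:
    # ssim_value = structural_similarity(fused, ir_image, data_range=255)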

4.6 CONCLUDING REMARKS

This chapter gave a brief introduction to Principal Component Analysis, followed by the application of the PCA to analyzing images. The fusion rule that combines the PCA weighted scheme with the selection of maximum intensity pixels was then presented. This algorithm performs very well both in terms of the visual quality of the fused image, evaluated subjectively, and in terms of the quality metrics. Next, the PCA Gaussian fusion scheme was described. This newly designed fusion technique makes use of the principal components as the weights for the additive fusion.

The performance of these two methods is compared in Table 4.1. From the parameters it is seen that the PCA Gaussian method gives a higher SSIM index. Also, from the output images it can be observed that this rule gives an output image of better quality when the source images have very high intensity pixels. In the case of source images with an evenly distributed histogram, the two algorithms give an equivalent performance. These results are presented in section 6.3. A comparative study of the fusion schemes proposed in this thesis, along with existing algorithms, is presented in Chapter 6. The average results obtained for a set of 40 images also show a higher value of the SSIM-based measure for the two new algorithms compared to the existing techniques. The cross entropy (CE) of the existing methods is better, and so is the standard deviation (SD). However, these two metrics determine only the amount of information transferred from the source images to the fused output. Observation of the fused output images correlates with the SSIM results on the efficiency of the new PCA algorithms.
