Unsupervised Hyperspectral Image Analysis Using Independent Component Analysis (ICA)

Shao-Shan Chiang, Chein-I Chang, Irving W. Ginsberg

Remote Sensing Signal and Image Processing Laboratory, Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD 21250
Remote Sensing Laboratory, U.S. Department of Energy, Las Vegas, Nevada 899

ABSTRACT

In this paper, an ICA-based approach is proposed for hyperspectral image analysis. It can be viewed as a random version of the commonly used linear spectral mixture analysis, in which the abundance fractions in a linear mixture model are considered to be unknown independent signal sources. Unlike most ICA methods, it does not require the separating matrix to be of full rank or orthogonal. More importantly, the learning algorithm is designed based on the independence of the material abundance vector rather than the independence of the separating matrix generally used to constrain the standard ICA. As a result, the designed learning algorithm is able to converge to non-orthogonal independent components. This is particularly useful in hyperspectral image analysis, since many materials extracted from a hyperspectral image may have similar spectral signatures and may not be orthogonal. The AVIRIS experiments have demonstrated that the proposed ICA provides an effective unsupervised technique for hyperspectral image classification.

I. INTRODUCTION

In the past years [1], linear spectral mixture analysis (LSMA) has been widely used for endmember unmixing. It models a pixel in an image scene as a linear mixture of materials with relative abundance concentrations. Two restrictions are generally applied to LSMA. One is that complete knowledge about the materials must be given a priori. In many practical applications, obtaining such a priori information is usually difficult, if not impossible. To relax this requirement, an unsupervised method to generate material information from the image data is needed.
The second restriction is that the abundance fractions of endmembers are unknown, non-random constants that can be estimated by statistical methods such as least squares estimation. Due to noise and atmospheric effects, however, the mixture of apparent (i.e., observed) material abundance fractions may vary pixel by pixel and consequently can be viewed as a random process resulting from an apparently random composition of multiple spectra for distinct materials plus random noise. Therefore, it is more realistic to assume that the abundance fractions of materials in a pixel are random quantities rather than unknown constants. In order to appropriately represent such a random linear mixture, the abundance fraction of each material must be viewed as a random signal source. Under this circumstance, prior knowledge of their probability distributions is required, which further complicates the problem.

Independent Component Analysis (ICA) [2] seems to provide a feasible approach to solving this random abundance mixture problem. ICA is an unsupervised source separation process. It differs from Principal Components Analysis (PCA) in many aspects. Unlike PCA, which only requires second order statistics, ICA looks for components that are statistically independent, which is a much stronger condition than being uncorrelated. As a result, it requires statistics of orders higher than the second. In addition, ICA components are not necessarily geometrically orthogonal. The most important difference is that ICA needs a linear model to describe the data while PCA does not. Therefore, ICA is not a generalization of PCA. However, the requirement of a linear model is exactly what we are interested in for ICA. Assume that the abundance fraction of each material is an unknown random signal source. Then the source mixing model considered in ICA can be directly applied to LSMA, in which case ICA can be an effective means to solve for the random abundance fractions in the linear mixture model used in LSMA.

II. INDEPENDENT COMPONENT ANALYSIS (ICA)

Suppose that $L$ is the number of spectral bands. Let $\mathbf{r}$ be an $L \times 1$ column pixel vector in a multispectral or hyperspectral image, where boldface is used for vectors. Let $M$ be an $L \times p$ endmember signature matrix denoted by $[\mathbf{m}_1 \; \mathbf{m}_2 \; \cdots \; \mathbf{m}_p]$, where $\mathbf{m}_j$ is an $L \times 1$ column vector represented by the signature of the $j$th endmember of the materials resident in the pixel $\mathbf{r}$ and $p$ is the number of endmembers in the pixel. Let $\boldsymbol{\alpha} = (\alpha_1 \; \alpha_2 \; \cdots \; \alpha_p)^T$ be a $p \times 1$ abundance column vector associated with $\mathbf{r}$, where $\alpha_j$ denotes the fraction of the $j$th signature present in the pixel vector $\mathbf{r}$. Assume that the spectral signatures of the
endmembers in the pixel vector $\mathbf{r}$ are linearly mixed and that $\boldsymbol{\alpha}$ is an unknown constant vector. In this case, the spectral signature of a pixel vector $\mathbf{r}$ can be represented by the linear regression model

$$\mathbf{r} = M\boldsymbol{\alpha} + \mathbf{n}, \tag{1}$$

where $\mathbf{n}$ is noise that can be interpreted as measurement error.

One drawback of LSMA is that the signature matrix $M$ must be known a priori. In this section, an ICA is described. It is also based on model (1), but does not require prior knowledge of $M$. In addition, it assumes that the abundance fractions $\alpha_1, \alpha_2, \ldots, \alpha_p$ are unknown random signal sources instead of unknown constants as assumed in model (1). However, in this case we also need to make three additional assumptions on the abundance vector $\boldsymbol{\alpha} = (\alpha_1 \; \alpha_2 \; \cdots \; \alpha_p)^T$:

(i) The endmember signature matrix $M$ is full rank, that is, the material endmember signature vectors $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p$ must be linearly independent.
(ii) The abundance fractions $\alpha_1, \alpha_2, \ldots, \alpha_p$ are mutually statistically independent.
(iii) At most one of the abundance fractions $\alpha_1, \alpha_2, \ldots, \alpha_p$ is Gaussian.

In order to use ICA for our application in hyperspectral image analysis, the mixing matrix is the $M$ in model (1) and the unknown signal sources to be separated are the random abundance fraction sources denoted by $\alpha_1, \alpha_2, \ldots, \alpha_p$. ICA finds a $p \times L$ separating matrix $W$ to unmix $\alpha_1, \alpha_2, \ldots, \alpha_p$ from $\mathbf{r}$ via the equation

$$\hat{\boldsymbol{\alpha}}(\mathbf{r}) = W\mathbf{r}, \tag{2}$$

where $\hat{\boldsymbol{\alpha}}(\mathbf{r}) = (\hat{\alpha}_1(\mathbf{r}), \hat{\alpha}_2(\mathbf{r}), \ldots, \hat{\alpha}_p(\mathbf{r}))^T$ is the estimated abundance vector based on $\mathbf{r}$ and is used to unmix the independent random abundance fractions $\alpha_1, \alpha_2, \ldots, \alpha_p$. Under the above assumptions, the estimate of the $i$th abundance fraction $\alpha_i$ may appear as any component $\hat{\alpha}_j(\mathbf{r})$, because changing the order of the components in $\hat{\boldsymbol{\alpha}}(\mathbf{r})$ does not affect their statistical independence.

In order to simplify the infomax criterion used in ICA, Comon [3] introduced an alternative criterion, referred to as contrast functions, that maximizes the higher order statistics of the data, given by

$$\max_{W,\; E[\hat{\boldsymbol{\alpha}}(\mathbf{r})\hat{\boldsymbol{\alpha}}^T(\mathbf{r})] = I} \psi(W) = \sum_{j=1}^{p} E\left[\hat{\alpha}_j^m(\mathbf{r})\right] \quad \text{for } m \geq 3, \tag{3}$$

where $I$ is the identity matrix.
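To make the quantities in (1) to (3) concrete, the following is a minimal NumPy sketch, not the authors' implementation. It simulates pixels from the linear mixture model with independent, non-Gaussian abundance sources, unmixes them with one possible separating matrix, and evaluates a sample version of the contrast for m = 3. The dimensions, the exponential source distribution, the noise level, and the function names `simulate_pixels` and `contrast` are all assumptions made for illustration.

```python
import numpy as np

def simulate_pixels(M, n_pixels, noise_std, rng):
    """Draw pixels r = M @ alpha + n with independent non-Gaussian abundances."""
    L, p = M.shape
    # Assumption: exponential sources (skewed, hence non-Gaussian).
    alpha = rng.exponential(scale=1.0, size=(p, n_pixels))
    noise = noise_std * rng.standard_normal((L, n_pixels))
    return M @ alpha + noise, alpha

def contrast(W, R, m=3):
    """Sample estimate of psi(W) = sum_j E[alpha_hat_j(r)^m] from (3)."""
    alpha_hat = W @ R                      # equation (2), one column per pixel
    return np.sum(np.mean(alpha_hat ** m, axis=1))

rng = np.random.default_rng(0)
L, p = 20, 3                               # hypothetical band and endmember counts
M = rng.uniform(0.1, 1.0, size=(L, p))     # hypothetical signature matrix
R, alpha = simulate_pixels(M, n_pixels=5000, noise_std=0.01, rng=rng)

# When M is known, its pseudo-inverse is one valid p x L separating matrix.
W = np.linalg.pinv(M)
alpha_hat = W @ R
print(alpha_hat.shape)
print(contrast(W, R, m=3))
```

The sketch only evaluates the contrast; it does not enforce the constraint on the second moment of the estimates, and it uses the true M purely to show what a separating matrix does. In the unsupervised setting described in the text, M is unknown and W has to be learned.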
This constrained maximization problem is equivalent to maximizing the following cost function

$$J(W) = \psi(W) - \frac{\lambda}{2} \sum_{i,j=1}^{p} \left( E\left[\hat{\alpha}_i(\mathbf{r})\hat{\alpha}_j(\mathbf{r}) - \delta_{ij}\right] \right)^2. \tag{4}$$

From (4), a learning algorithm to generate the separating matrix $W$ can be obtained as

$$W_{k+1} = W_k + \eta \left( I - E\left[\hat{\boldsymbol{\alpha}}(\mathbf{r})\hat{\boldsymbol{\alpha}}^T(\mathbf{r})\right] \right) E\left[\hat{\boldsymbol{\alpha}}(\mathbf{r})\mathbf{r}^T\right] + \mu E\left[g(\hat{\boldsymbol{\alpha}}(\mathbf{r}))\mathbf{r}^T\right], \tag{5}$$

where $\mu$ and $\eta$ are learning parameters.

III. EXPERIMENTS

The hyperspectral data used are Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data extracted from a scene of the Lunar Crater Volcanic Field in Northern Nye County, Nevada (Figure 1). Water bands and low signal-to-noise ratio bands have been removed from the data, reducing the image data from 224 to 158 bands. There are five target signatures of interest: cinders, rhyolite, playa (dry lakebed), vegetation and shade.
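As an illustration of how update (5) might be run on a pixel matrix, here is a minimal batch sketch in NumPy. It is not the authors' code. The paper does not define g, so the choice g(y) = y^(m-1), which is proportional to the derivative of the contrast in (3), is an assumption, as are the step sizes, the random initialization, the toy data, and the helper names `ica_update`, `fit_separating_matrix` and `whiten`. The whitening step is an added preprocessing convenience that keeps fixed step sizes well behaved in this toy setting; it is not part of the method described in the text.

```python
import numpy as np

def ica_update(W, R, mu, eta, m=3):
    """One batch step of update (5), with expectations replaced by pixel averages."""
    n_pixels = R.shape[1]
    A = W @ R                                  # alpha_hat(r) per pixel, shape (p, N)
    G = A ** (m - 1)                           # ASSUMPTION: g(y) = y**(m-1)
    contrast_term = (G @ R.T) / n_pixels       # E[g(alpha_hat) r^T]
    second_moment = (A @ A.T) / n_pixels       # E[alpha_hat alpha_hat^T]
    cross_moment = (A @ R.T) / n_pixels        # E[alpha_hat r^T]
    identity = np.eye(W.shape[0])
    return W + eta * (identity - second_moment) @ cross_moment + mu * contrast_term

def fit_separating_matrix(R, p, mu=0.02, eta=0.1, n_iter=500, seed=0):
    """Iterate the update from a small random p x L start (W need not be square)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((p, R.shape[0]))
    for _ in range(n_iter):
        W = ica_update(W, R, mu, eta)
    return W

def whiten(R):
    """ASSUMED preprocessing, not from the paper: center and PCA-whiten the pixels."""
    Rc = R - R.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(Rc @ Rc.T / Rc.shape[1])
    return (eigvecs / np.sqrt(eigvals)).T @ Rc

# Toy data: 2 skewed sources mixed into 4 "bands" plus noise (all values hypothetical).
rng = np.random.default_rng(1)
sources = rng.exponential(size=(2, 5000))
M = rng.uniform(0.1, 1.0, size=(4, 2))
R = whiten(M @ sources + 0.05 * rng.standard_normal((4, 5000)))

W = fit_separating_matrix(R, p=2)
A = W @ R
print(W.shape)
print(A @ A.T / A.shape[1])    # sample second moment of the estimates
```

Because the second-moment condition enters only through a penalty weighted by η, the printed second moment is pulled toward the identity rather than forced to equal it whenever µ is nonzero. On real imagery the step sizes would likely need tuning, and no convergence guarantee is implied by this sketch.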
Figure 1

Since there is no prior knowledge about the number of target signatures, we first assume that there is a large number of materials, p = 58. Skewness and kurtosis were used in (4) as criteria. Our experiments showed that skewness performed better than kurtosis, so only the skewness results are given in this paper. Since very little information was found in the images after the 9th component, Fig. 2 shows only the first 9 component images, labeled (a-i). The targets cinders, rhyolite and shade were extracted in Figures 2(a), 2(d) and 2(i), respectively, while the vegetation was picked up in Figures 2(c) and 2(e), and the playa (dry lakebed) showed up in Figures 2(b) and 2(f-h). This experiment showed that if p is taken too large, spectral variations produced by mixing are classified as target materials. This is illustrated in Fig. 2, where vegetation was classified in two separate components in Figures 2(c) and 2(e). Similarly, due to the large coverage of the dry lakebed, the different abundance fractions of the playa were detected and classified in four separate images in Figures 2(b) and 2(f-h).

Figure 2 (component images (a) to (i))

However, when a smaller value of p was chosen, only the first six component images, labeled (a-f), contained information, as shown in Figure 3. The cinders, vegetation, rhyolite and shade were extracted in Figures 3(a), 3(b), 3(d) and 3(f), while the playa was detected in Figures 3(c) and 3(e). Also in this case, the vegetation was classified in only one image, and the number of images in which the playa was detected was cut down from 4 to 2.
If p was further reduced to 8, Figure 4 shows the first 5 component images, wherein the cinders, vegetation and rhyolite were detected and classified in Figures 4(a), 4(b) and 4(d). The playa was still classified into two separate images in Figures 4(c) and 4(e). In this case, only five component images were found to contain information, of which two were used to classify the playa. As a result, no component image could be spared to classify the shade. These three experiments demonstrated that the value of p to be used is crucial for unsupervised image analysis. This issue is closely related to the determination of the intrinsic dimensionality of images, and has been investigated in [5].

Figure 3 (component images (a) to (f))

Figure 4 (component images (a) to (e))
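The paper reports that skewness and kurtosis were used as criteria but does not say how component images were screened for information content. The following is a minimal sketch of one plausible screening step, assuming that component images are ranked by the magnitude of their sample skewness (or excess kurtosis). The function names, the synthetic 32 x 32 images, and the ranking rule itself are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def sample_skewness(x):
    """Third standardized moment of a 1-D array."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

def sample_excess_kurtosis(x):
    """Fourth standardized moment minus 3 (close to zero for Gaussian data)."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

def rank_components(components, criterion=sample_skewness):
    """Return component indices sorted by decreasing |criterion|, with the scores."""
    scores = np.array([criterion(c.ravel()) for c in components])
    order = np.argsort(-np.abs(scores))
    return order, scores[order]

# Hypothetical stack of three 32 x 32 component images.
rng = np.random.default_rng(0)
components = np.stack([
    rng.standard_normal((32, 32)),     # Gaussian-like, little structure
    rng.exponential(size=(32, 32)),    # strongly skewed, target-like
    rng.uniform(size=(32, 32)),        # symmetric and flat
])
order, scores = rank_components(components)
print(order)
print(scores)
```

Passing `criterion=sample_excess_kurtosis` ranks by the fourth-order statistic instead, which corresponds to m = 4 in the contrast, whereas skewness corresponds to m = 3.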
IV. CONCLUSION

Recently, ICA has received considerable interest in hyperspectral image analysis [4]. This paper presented an ICA approach to hyperspectral image analysis that differs from the commonly used ICA approach. First, the separating matrix W is not necessarily a square matrix of full rank. Second, the mixing matrix M is not necessarily orthogonal. Third, the learning algorithm was designed based on the independence of the material abundance fractions, not of the matrix W. These three advantages have been shown by experiments to be very useful in hyperspectral image classification. However, since ICA separates unknown signal sources rather than estimating the strengths of the signals, it is very effective in detection and classification, but not in quantification.

ACKNOWLEDGMENTS

The authors would like to thank Bechtel Nevada under contract No. DE-AC08-96NV11718 through the U.S. Department of Energy for their support and Dr. J.C. Harsanyi for providing the AVIRIS data.

REFERENCES

[1] J.B. Adams, M.O. Smith, and A.R. Gillespie, "Image spectroscopy: interpretation based on spectral mixture analysis," Remote Geochemical Analysis: Elemental and Mineralogical Composition, edited by C.M. Pieters and P.A. Englert, Cambridge University Press, pp. 145-166, 1993.
[2] T.W. Lee, Independent Component Analysis: Theory and Applications, Boston: Kluwer Academic Publishers, 1998.
[3] P. Comon, "Independent component analysis, A new concept?", Signal Processing, vol. 36, pp. 287-314, 1994.
[4] C.H. Chen and X. Zhang, "Independent component analysis for remote sensing study," EOS/SPIE Symposium on Remote Sensing, Conference on Image and Signal Processing for Remote Sensing V, SPIE vol. 3871, Florence, Italy, pp. 150-158, September 20-24, 1999.
[5] C.-I Chang and Q. Du, "A noise subspace projection approach to determination of intrinsic dimensionality for hyperspectral imagery," EOS/SPIE Symposium on Remote Sensing, Conference on Image and Signal Processing for Remote Sensing V, SPIE vol. 3871, Florence, Italy, pp. 34-44, September 20-24, 1999.