MODULE 5 LECTURE NOTES 5 PRINCIPAL COMPONENT ANALYSIS

Similar documents
GNR401 Principles of Satellite Image Processing

GEOG 4110/5100 Advanced Remote Sensing Lecture 15

Vegetation Change Detection of Central part of Nepal using Landsat TM

Digital Change Detection Using Remotely Sensed Data for Monitoring Green Space Destruction in Tabriz

Comparison of MLC and FCM Techniques with Satellite Imagery in A Part of Narmada River Basin of Madhya Pradesh, India

What is a vegetation index?

GeoComputation 2011 Session 4: Posters Accuracy assessment for Fuzzy classification in Tripoli, Libya Abdulhakim khmag, Alexis Comber, Peter Fisher ¹D

Principal Components Analysis (PCA)

IMAGE CLASSIFICATION TOOL FOR LAND USE / LAND COVER ANALYSIS: A COMPARATIVE STUDY OF MAXIMUM LIKELIHOOD AND MINIMUM DISTANCE METHOD

12.2 Dimensionality Reduction

A GLOBAL ANALYSIS OF URBAN REFLECTANCE. Christopher SMALL

Deriving Uncertainty of Area Estimates from Satellite Imagery using Fuzzy Land-cover Classification

IMPROVING REMOTE SENSING-DERIVED LAND USE/LAND COVER CLASSIFICATION WITH THE AID OF SPATIAL INFORMATION

EECS490: Digital Image Processing. Lecture #26

A Method to Improve the Accuracy of Remote Sensing Data Classification by Exploiting the Multi-Scale Properties in the Scene

Monitoring Vegetation Growth of Spectrally Landsat Satellite Imagery ETM+ 7 & TM 5 for Western Region of Iraq by Using Remote Sensing Techniques.

The Principal Component Analysis

Principal Component Analysis

1. Introduction. S.S. Patil 1, Sachidananda 1, U.B. Angadi 2, and D.K. Prabhuraj 3

HYPERSPECTRAL IMAGING

CS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)

AN INTEGRATED METHOD FOR FOREST CANOPY COVER MAPPING USING LANDSAT ETM+ IMAGERY INTRODUCTION

MAPPING LAND USE/ LAND COVER OF WEST GODAVARI DISTRICT USING NDVI TECHNIQUES AND GIS Anusha. B 1, Sridhar. P 2

PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani

Many of remote sensing techniques are generic in nature and may be applied to a variety of vegetated landscapes, including

MULTI-SOURCE IMAGE CLASSIFICATION

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17

MULTI-CHANNEL REMOTE SENSING DATA AND ORTHOGONAL TRANSFORMATIONS FOR CHANGE DETECTION 1

Chapter 6 Multispectral Transformations of Image Data

Lecture: Face Recognition and Feature Reduction

Chapter 6 Spectral Domain Image Transforms

Defining microclimates on Long Island using interannual surface temperature records from satellite imagery

CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION

Comparative Analysis of Supervised and

ANALYSIS AND VALIDATION OF A METHODOLOGY TO EVALUATE LAND COVER CHANGE IN THE MEDITERRANEAN BASIN USING MULTITEMPORAL MODIS DATA

Lecture: Face Recognition and Feature Reduction

Machine Learning 11. week

Application of Remote Sensing Techniques for Change Detection in Land Use/ Land Cover of Ratnagiri District, Maharashtra

INDIAN INSTITUTE OF TECHNOLOGY ROORKEE

Introduction to Principal Component Analysis (PCA)

Lecture 7: Con3nuous Latent Variable Models

Principal Component Analysis (PCA) of AIRS Data

ASSESSMENT OF NDVI FOR DIFFERENT LAND COVERS BEFORE AND AFTER ATMOSPHERIC CORRECTIONS

AGRY 545/ASM 591R. Remote Sensing of Land Resources. Fall Semester Course Syllabus

SPECTRAL DISCRIMINATION OF ROCK TYPES IN THE ARAVALLI MOUNTAIN RANGES OF RAJASTHAN (INDIA) USING LANDSAT THEMATIC MAPPER DATA

Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation)

Markov Chains, Stochastic Processes, and Matrix Decompositions

Introduction to Machine Learning

Machine Learning 2nd Edition

Principal Component Analysis (PCA) Principal Component Analysis (PCA)

Landuse and Landcover change analysis in Selaiyur village, Tambaram taluk, Chennai

L11: Pattern recognition principles

Land Features Extraction from Landsat TM Image Using Decision Tree Method

Principal Component Analysis

Data Preprocessing Tasks

Testing the Sensitivity of Vegetation Indices for Crop Type Classification Using Rapideye Imagery

Use of Corona, Landsat TM, Spot 5 images to assess 40 years of land use/cover changes in Cavusbasi

Regularized Discriminant Analysis and Reduced-Rank LDA

December 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data.

Karhunen-Loève Transform KLT. JanKees van der Poel D.Sc. Student, Mechanical Engineering

Mapping and Quantification of Land Area and Cover Types with Landsat TM in Carey Island, Selangor, Malaysia

Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA

Drought Estimation Maps by Means of Multidate Landsat Fused Images

International Journal of Scientific & Engineering Research, Volume 6, Issue 7, July ISSN

Change Detection Methods Using Band Ratio and Raster to Vector Transform

What is Principal Component Analysis?

Unsupervised Learning: Dimensionality Reduction

A SURVEY OF REMOTE SENSING IMAGE CLASSIFICATION APPROACHES

INTERNATIONAL JOURNAL OF GEOMATICS AND GEOSCIENCES Volume 4, No 1, 2013

IR-MAD Iteratively Re-weighted Multivariate Alteration Detection

Principal Component Analysis vs. Independent Component Analysis for Damage Detection

2 Dr.M.Senthil Murugan

Comparison between Land Surface Temperature Retrieval Using Classification Based Emissivity and NDVI Based Emissivity

PCA, Kernel PCA, ICA

AN INVESTIGATION OF AUTOMATIC CHANGE DETECTION FOR TOPOGRAPHIC MAP UPDATING

Computational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms.

Image Contrast Enhancement and Quantitative Measuring of Information Flow

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Relevance of Hyperspectral Data for Sustainable Management of Natural Resources

Classification of High Spatial Resolution Remote Sensing Images Based on Decision Fusion

Urban remote sensing: from local to global and back

A NORMALIZED DIFFERENCE VEGETATION INDEX (NDVI) TIME-SERIES OF IDLE AGRICULTURE LANDS: A PRELIMINARY STUDY

Multi- and Hyperspectral Remote Sensing Change Detection with Generalized Difference Images by the IR-MAD Method

PCA and admixture models

Image Analysis. PCA and Eigenfaces

UCLA STAT 233 Statistical Methods in Biomedical Imaging

Myoelectrical signal classification based on S transform and two-directional 2DPCA

GIS GIS.

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Rating of soil heterogeneity using by satellite images

Application of Satellite Images and GIS in Evaluation of Green Space Destruction in Urban Area (Case study: Boukan City)

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

SATELLITE REMOTE SENSING

Machine Learning (Spring 2012) Principal Component Analysis

MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A

Data Fusion and Multi-Resolution Data

7. Variable extraction and dimensionality reduction

Bootstrapping, Randomization, 2B-PLS

CHANGE DETECTION USING REMOTE SENSING- LAND COVER CHANGE ANALYSIS OF THE TEBA CATCHMENT IN SPAIN (A CASE STUDY)

Transcription:

MODULE 5 LECTURE NOTES 5 PRINCIPAL COMPONENT ANALYSIS. (PCA) Principal component analysis (PCA), also known as Karhunen-Loeve analysis, transforms the information inherent in multispectral remotely sensed data into new principal component images that are more interpretable than the original data. It compresses the information content of a number of bands into a few principal component images. This enables dimensionality reduction of hyperspectral data. Generally within a multispectral imagery, the adjacent bands will depict mutual correlation. For example, if a sensor captures information using visible/near infrared wavelengths, the vegetated areas obtained using both the bands will be negatively correlated in nature. Imagine a multispectral or a hyperspectral imagery with more than bands which are inter-correlated. The inter correlations between bands depicts repetition of information between the adjacent bands. Consider two variables x and y that are mutually correlated and which are plotted using a scatter diagram. The relationship between x and y can be very well represented using a straight line sloping upwards towards right (assuming that x and y are positively correlated). Now suppose that x and y are not perfectly correlated and that there exists a variability along some other axis. Then the dominant direction of variability can be chosen as the major axis while another second minor axes can be drawn at right angles to it. A plot with both these major and minor axis may be a better representation of the x-y structure than the original horizontal and vertical axes. Using this background information, assume that the pixel values of two bands of Thematic Mapper are drawn using a scatter plot. Let X and X denote the respective bands and let and represent their corresponding mean values. The spread of points (pixel values) indicates the correlation and hence the quality of information present in both the two bands. If the points are tightly clustered within a two dimensional space, it means that they would provide very little information to the end user. It means that the original axis of X and X might not be a very good representative of the D feature space in order to analyze the information content associated with these two bands. Principal component analysis can be used to rotate the location of original axes so that the original D Nagesh Kumar, IISc, Bangalore M5L5

brightness values ( pixel values) be redistributed or reprojected onto a new set of axes (principal axis). For example, the new coordinate system (with locations of X, X ) can be obtained by a simple translation of X X and X X. Once this translation is accomplished, the new coordinates can be rotated about the new origin in the new coordinate system by say some angle. Now the first axis of X will be associated with the maximum amount of variance which is now called as the first principal component (PC). And the second principal component is orthogonal to PC. Similarly, the third, fourth and subsequent components can be arrived at which will be arranged in the decreasing amount of variance found in the data set. X X PC PC D Nagesh Kumar, IISc, Bangalore M5L5

In order to arrive at the principal axes, certain transformation coefficients need to be obtained which can be applied to the original pixel values. The steps for this transformation are discussed below: a) Compute the covariance matrix of the n dimensional remotely sensed data set. The importance of variance to define the points represented by a scatter plot along the dominant direction has already been stressed. If variance be used to define the shape of the ellipsoid covering the points ( in an n dimensional variable space) then, the scales used for measuring each variable must be comparable with one another. If not, neither the variance of each variable be the same nor will the shape of enclosing ellipse remain same. This may also create further complications as the shape of one ellipsoid cannot be related mathematically to the shape of the second ellipsoid. In these circumstances, the correlation coefficient can be used rather than the covariance to measure standardized variables. To standardize the variables, the mean value can be subtracted from all measurements and then the result can be divided by their standard deviation which would convert the raw values to z scores or standard scores having zero mean and a variance of unity. It should be noted that usage of covariance matrix will yield unstandardized PCA and use of correlation matrix will yield in a standardized PCA. b) Computation of eigenvalues and eigenvectors Consider that there are n number of bands within a multispectral remotely sensed imagery. For these n bands there will be n rows and n columns. Quantities known as eigenvalues can be found for the chosen matrix. Eigenvalues are proportional to the length of principal axes of the ellipsoid whose units are measured using variance. In order that the variables be measured on comparable scales, standardized units of variance must be used, as stated in previous paragraph. Each eigenvalue will be associated with a set of coordinates which are known as eigenvectors. The eigenvalues and eigenvectors will together describe the lengths and directions of the principal axes. The eigenvalues will contain important information such as the total percent of variance explained by each of the principal components using the expression D Nagesh Kumar, IISc, Bangalore 3 M5L5

TotalVaria nce(%) n Eigenvalue *00 Eigenvalue i If i, j represents the eigenvalues of an nxn covariance matrix which can be represented as:, 0 0 0 0 0 0, 0 0 0 0 0 0 3, 3 0 0 0 0 0 0 4, 4 0 0 0 0 0 0 5, 5 0 0 0 0 0 0 n, n c) Estimation of factor loadings The eigenvectors when scaled using the square roots of their corresponding eigenvalues can be interpreted as correlations between the principal components and the individual bands of the image. The correlation of each band with respect to each of the principal components can be computed. This gives us an idea regarding how each band loads or otherwise is associated with respect to each principal component. The expression can be given as: R kp a kp * Var k p where a kp = Eigenvector for band k and component p p = pth eigenvalue Var k = Variance of band k in the covariance matrix D Nagesh Kumar, IISc, Bangalore 4 M5L5

This results in factor loadings. Numerical Example PCA is based on four assumptions namely, linearity, sufficiency of mean and variance, orthogonality of principal components and that large variances have important dynamics. The second assumption states are the mean and variance is used as sufficient statistics to fully define the probability distribution. For this assumption to be true, the probability distribution of the variable considered must be exponential in nature. This guarantees that the signal to noise ratio together with the covariance matrix is sufficient to fully describe the noise and redundancies. The third assumption indicates that the data has a high signal to noise ratio. And hence, the principal components with a larger variance will represent more dynamics than those with lower variances which will depict noise. PCA can be solved using linear algebra decomposition techniques. Assume a hypothetical situation of an image at row and column for seven bands of a satellite sensor which are represented using a vector X such that, X BV BV BV BV BV BV BV,,,,,,3,,4,,5,,7,,6 0 30 60 70 6 50 We will now apply the appropriate transformation to this data such that it is projected onto the first principal component s axes. In this way we will find out what the new brightness value will be, for this component. It is computed using the formula: newbv i, j, p n k a kp BV i, j, k D Nagesh Kumar, IISc, Bangalore 5 M5L5

Where a kp = eigenvectors, BV ijk =brightness value in band k for the pixel at row i, column j and n = number of bands. In our hypothetical example, this yields, newbv,, a, ( BV,, ) a,( BV,, ) a3,( BV,,3 ) a4,( BV,,4 ) a5,( BV,,5 ) a6,( BV,,7 ) a7,( BV,, 6 ) = 0.05 (0) + 0.7(30) + 0.04 () + 0.443 (60) + 0.74(70) + 0.376(6) + 0.06(50) = 9.53 This pseudomeasurement is a linear combindation of original brightness value and factor scores (eigenvectors). The new brightness value for row, column in principal component after truncation to an integer is 9. This procedure takes place for every pixel in the original image data to produce the principal component image dataset. Then, p is incremented by and principal component is created pixel by pixel. Noise Adjusted PCA The presence of noise in any data set should be low, else no matter what the analysis technique, the information content extracted will be a minimal. Noise dominates signal in lower- order principal component images. Hence, the PCA needs to be adjusted for the noise variance. Once the PCA images have been adjusted for noise, the result can be used to generate PCA images that are unaffected by noise. Noise is relatively expressed with respect to the measurement in terms of signal to noise ratio given by the expression: SNR signal noise A high value of SNR indicates high precision data whereas a lower value indicates data contaminated with noise. Principal components are linear combinations of the original variables (like image pixel values) with the coefficients being defined such that the criterion of maximum variance gets satisfied. The question which needs to be asked is whether there exists any other criterion other than that of minimum variance that can be used to estimated weights for linear combinations. In this context, a new criterion i.e., maximizing the SNR ratio can be followed. How to maximize this criterion? D Nagesh Kumar, IISc, Bangalore 6 M5L5

How do we calculate signaland noise? A method should be devised that is capable of separating the measurements into two parts, with the first part showing the signal and the second part showing the contribution of noise. If the dataset consists of n number of bands, firstly the covariance matrix can be computed (C). Then, the horizontal and vertical pixel differences in each of these n bands can be determined. This in turn can be used to compute their covariance matrices which when combined produce the noise covariance matrix (C N ). The covariance matrix of the signal (C S ) is estimated by subtracting the covariance matrix of the noise from that of the measurement. This results in the criterion which can be written as maximizing the ratio of C S /C N. The outcome of noise adjusted PCA analysis is a linear combinations of the n spectral bands that are ranked from ( having the highest signal to noise ratio) to n (having the lowest signal to noise ratio). The coefficients are applied to the data in exactly a similar manner as PCA coefficients. Hence, the principal components are estimated using the least signal to noise ratio instead of the minimum variance criterion. D Nagesh Kumar, IISc, Bangalore 7 M5L5

Bibliography. Bastin, L. (997) Comparison of fuzzy C-Means classification, linear mixture modelling and MLC probabilities as tools for unmixing coarse pixels, International Journal of Remote sensing, Vol.8, 369-3648.. Binaghi, E., Brivio, P.A., Ghezzi, P., Rampini, A. (999) A fuzzy set-based accuracy assessment of soft classification, Pattern Recognition Letters, Vol. 0, 935-948. 3. Congalton, R.G. and Green, K. (998) Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, Lewis Publishers, New York. 4. Deering, D. W., J. W. Rouse, R. H. Haas, and J. A. Schell, 975, Measuring Forage production of grazing units from Landsat MSS Data, Proceedings, 0 th International Symposium on Remote Sensing of Environment, :69-78. 5. Dunn, J.C. (973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, Vol. 3, 3-57. 6. Foody, G.M. (00) Status of land cover classification accuracy assessment, Remote Sensing of Environment, Vol.80,85 0. 7. Foody, G.M. and Cox, D.P. (994) Sub-pixel land cover composition estimation using a linear mixture model and fuzzy membership functions, International Journal of Remote Sensing, Vol.5, 69 63. 8. Groten, S. M., 993, NDVI-Crop Monitoring and Early Warning Yield Assessment of Burkina Faso, International Journal of Remote Sensing, 4 (8):495-55. 9. Guyot, G. and Gu, X.-F., 994, Effect of radiometric corrections on NDVI determined from SPOT-HRV and Landsat TM data. Remote Sensing of Environment, 49, 69-80. 0. Huete, A., 989, Soil influences in remotely sensed vegetation-canopy spectra. In: Asrar, G. (ed.) (99), 07-4.. John R. Jensen, 996, Introductory Digital Image Processing, Prentice Hall. Kauth, R. J. and G. S. Thomas, 976, The Tasseled Cap-A Graphic description of the spectral-temporal development of agricultural crops as seen by Landsat, Proceedings, Symposium on Machine Processing of remotely sensed data. West Lafayette, IN:Laboratory for applications of remote sensing, pp. 4-5. D Nagesh Kumar, IISc, Bangalore 8 M5L5

3. Kauth, R. J., P. F. Lambeck, W. Richardson, G. S. Thomas, and A. P. Petland, 979, Feature extraction applied to agricultural crops as seen by Landsat, Proceedings, of the Technical session, LACIE Symposium. Houston: National Aeronautics and space administration, 705-7. 4. Lillesand T. M. & Kiefer R. W., 000. Remote Sensing and Image Interpretation, 4 th ed. Wiley & Sons. 5. Lunetta, R. S. and Elvridge, C. D. (eds.), 998, Remote Sensing Change Detection: Environmental Monitoring, Methods and Applications. Chelsea, MI: Ann Arbor Press. 6. Paul. MK. Mather, 004, Computer Processing of Remotely- Sensed Images, Wiley & Sons. 7. Perry, C. R., and L. F. Lautenschlager, 984, Functional Equivalence of spectral vegetation indices, Remote Sensing of Environment, 4:69-8. 8. Richardson, A. J. and C. L. Wiegand, 977, Distinguishing vegetation from soil background information, Remote sensing of environment, 8:307-3. 9. Rouse, J. W., R. H. Haas, J. A. Schell, and D. W. Deering, 973, Monitoring vegetation systems in the great plains with ERTS, Proceedings, 3 rd ERTS Symposium, Vol., pp. 48-6. 0. Sellers, P., 989, Vegetation-canopy reflectance and biophysical properties. In: Asrar, G. (ed.) (989), 97-335.. Steven, M. D., 998, The sensitivity of the OSAVI vegetation index to observational parameters. Remote Sensing of Environment, 63, 49-60.. Thompson, D. R., and O. A. Wehmanen, 980, Using Landsat Digital Data to detect moisture stress in corn-soybean growing regions, Photogrammetric Engineering & Remote Sensing, 46:08-089. 3. Wang, F. (990) Improving Remote Sensing Image Analysis through Fuzzy Information Representation, Photogrammetric Engineering And Remote Sensing, Vol. 56, 63-69. 4. Wu, K.L. and Yang, M.S. ( 00) Alternative c-means clustering algorithms, Pattern Recognition, Vol.35, 67 78. 5. Yang, M.S., Hwang, P.Y. and Chem, D.H. (003) Fuzzy clustering algorithms for mixed feature variables, Fuzzy Sets and Systems, Vol.4, 30 37. D Nagesh Kumar, IISc, Bangalore 9 M5L5

6. Zadeh, L.A. (973) Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on systems, Man And Cybernetics, Vol.SMC-3, No.,8-44. 7. Zhang, J., Foody, G.M. (998) A fuzzy classification of sub-urban land cover from remotely sensed imagery, International Journal of Remote Sensing, Vol.9, No 4, 7-738. D Nagesh Kumar, IISc, Bangalore 0 M5L5