EXAMINING OUTLIER DETECTION PERFORMANCE FOR PRINCIPAL COMPONENTS ANALYSIS METHOD AND ITS ROBUSTIFICATION METHODS
International Journal of Advances in Engineering & Technology, Vol. 6, Issue 2, pp. 573-582, May 2013.

Nada Badr, Noureldien A. Noureldien
Department of Computer Science, University of Science and Technology, Omdurman, Sudan

ABSTRACT

Intrusion detection has attracted the attention of both commercial institutions and the academic research community. In this paper PCA (Principal Components Analysis) is utilized as an unsupervised technique to detect multivariate outliers in a dataset of one hour's duration. PCA is sensitive to outliers since it depends on non-robust estimators. This led us to use MCD (Minimum Covariance Determinant) and PP (Projection Pursuit) as two different robustification techniques for PCA. The results obtained from the experiments show that PCA generates a high number of false alarms due to masking and swamping effects, while the MCD and PP detection rates are much more accurate, and both reveal the masking and swamping effects that the PCA method suffers from.

KEYWORDS: Multivariate Techniques, Robust Estimators, Principal Components, Minimum Covariance Determinant, Projection Pursuit.

I. INTRODUCTION

Principal Components Analysis (PCA) is a multivariate statistical method concerned with analyzing and understanding data in high dimensions; that is, PCA analyzes data sets that represent observations described by several inter-correlated dependent variables. PCA is one of the best known and most used multivariate exploratory analysis techniques [5]. Several robust competitors to the classical PCA estimators have been proposed in the literature. A natural way to robustify PCA is to use robust location and scatter estimators instead of the sample mean and sample covariance matrix when estimating the eigenvalues and eigenvectors of the population covariance matrix.
The minimum covariance determinant (MCD) method is a highly robust estimator of multivariate location and scatter. Its objective is to find h observations out of n whose covariance matrix has the lowest determinant. The MCD location estimate is then the mean of these h points, and the estimate of scatter is their covariance matrix. Another robust method for principal component analysis uses the Projection-Pursuit (PP) principle. Here, one projects the data onto a lower-dimensional space such that a robust measure of variance of the projected data is maximized.

In this paper we investigate the effectiveness of the robust estimators provided by MCD and PP by applying PCA to the Abilene dataset and comparing its outlier detection performance to that of MCD and PP.

The rest of this paper is organized as follows. Section 2 is an overview of related work. Section 3 is dedicated to classical PCA. The PCA robustification methods, MCD and PP, are discussed in Section 4. In Section 5 the experiment results are shown; conclusions and future work are drawn in Section 6.

II. RELATED WORK

A number of researchers have utilized principal components analysis to reduce dimensionality and to detect anomalous network traffic. The use of PCA to structure network traffic flows was introduced
by Lakhina [3], whereby principal components analysis is used to decompose the structure of Origin-Destination flows from two backbone networks into three main constituents, namely periodic trends, bursts and noise. Labib [2] utilized PCA for reducing the dimension of the traffic data and for visualizing and identifying attacks. Bouzida et al. [7] presented a performance study of two machine learning algorithms, nearest neighbors and decision trees, when used on traffic data with or without PCA. They found that when PCA is applied to the KDD99 dataset to reduce the dimension of the data, the algorithms' learning speed was improved while accuracy remained the same. Terrel [9] used principal components analysis on features of aggregated network traffic of a link connecting a university campus to the Internet in order to detect anomalous traffic. Sastry [] proposed the use of singular value decomposition and wavelet transforms for detecting anomalies in self-similar network traffic data. Wong [2] proposed an anomaly intrusion detection model based on PCA for monitoring network behaviors. The model utilizes PCA to reduce the dimensions of historical data and to build the normal profile, as represented by the first few principal components. An anomaly is flagged when the distance between a new observation and the normal profile exceeds a predefined threshold. Mei-ling [4] proposed an anomaly detection scheme based on robust principal components analysis. Two classifiers were implemented to detect anomalies: one based on the major components that capture most of the variation in the data, and the second based on the minor components, or residuals. A new observation is considered an outlier, or anomalous, when the sum of squares of its weighted principal components exceeds the threshold in either of the two classifiers.
Lakhina [6] applied principal components analysis to Origin-Destination (OD) flow traffic. The traffic is separated into normal and anomalous spaces by projecting the data onto the resulting principal components one at a time, ordered from high to low. Principal components (PCs) are added to the normal space as long as a predefined threshold is not exceeded; when the threshold is exceeded, that PC and all subsequent PCs are added to the anomalous space. New OD flow traffic is projected onto the anomalous space, and an anomaly is flagged if the value of the squared prediction error, or Q-statistic, exceeds a predefined limit.

PCA is thus widely used to identify lower-dimensional structure in data, and is commonly applied to high-dimensional data. PCA represents data by a small number of components that account for the variability in the data. This dimension reduction step can be followed by other multivariate methods, such as regression, discriminant analysis, cluster analysis, etc. In classical PCA the sample mean and the sample covariance matrix are used to derive the principal components. These two estimators are highly sensitive to outlying observations and render PCA unreliable when outliers are encountered.

III. CLASSICAL PCA MODEL

The PCA detection model detects outliers by projecting the observations of the dataset onto newly computed axes known as PCs. The outliers detected by the PCA method are of two types: outliers detected by the major PCs, and outliers detected by the minor PCs. The basic goals of PCA [5] are to extract the important information from a data set, to compress the size of the data set by keeping only this important information, and to simplify the description of the data and analyze the structure of the observations and variables (finding patterns of similarity and difference). To achieve these goals PCA calculates new variables from the original variables, called Principal Components (PCs).
The computed variables are linear combinations of the original variables (chosen to maximize the variance of the projected observations) and are uncorrelated. The first computed PCs, called the major PCs, have the largest inertia (total variance in the data set), while the later PCs, called the minor PCs, have the greatest residual inertia and are orthogonal to the first principal components. The Principal Components define orthogonal directions in the space of observations. In other words, PCA just makes a change of orthogonal reference frame, the original variables being replaced by the Principal Components.
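As a rough illustration of this detection model, the following sketch projects observations onto the PCs and flags outliers separately on the major and minor components. It uses synthetic correlated data and an illustrative cutoff, not the paper's Abilene traffic or thresholds:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 200 observations of 4 correlated variables (synthetic stand-in).
X = rng.normal(size=(200, 4)) @ np.array([[2.0, 1.0, 0.0, 0.0],
                                          [0.0, 1.0, 1.0, 0.0],
                                          [0.0, 0.0, 1.0, 1.0],
                                          [0.0, 0.0, 0.0, 1.0]])

Y = X - X.mean(axis=0)                      # center the observations
eigvals, eigvecs = np.linalg.eigh(np.cov(Y, rowvar=False))
order = np.argsort(eigvals)[::-1]           # PCs ordered from largest variance down
eigvals, loadings = eigvals[order], eigvecs[:, order]
scores = Y @ loadings                       # observations projected on the PCs

# Standardized score distances on the major PCs (variance-inflating outliers)
# and on the minor PCs (structure-violating outliers).
major = np.sqrt((scores[:, :2] ** 2 / eigvals[:2]).sum(axis=1))
minor = np.sqrt((scores[:, -2:] ** 2 / eigvals[-2:]).sum(axis=1))

cut = 3.0                                   # illustrative cutoff, not the paper's
type1 = np.where(major > cut)[0]            # outliers seen by the major PCs
type2 = np.where(minor > cut)[0]            # outliers seen by the minor PCs
```

The two distance vectors correspond to the two outlier types the model distinguishes: a point can be extreme along the high-variance directions, extreme in the residual directions, or both.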
3.1 PCA Advantages

Common advantages of PCA are:

3.1.1 Exploratory Data Analysis

PCA is mostly used for making 2-dimensional plots of the data for visual examination and interpretation. For this purpose, the data is projected on factorial planes that are spanned by pairs of Principal Components chosen among the first ones (that is, the most significant ones). From these plots, one tries to extract information about the data structure, such as the detection of outliers (observations that are very different from the bulk of the data). According to most research [8][], PCA detects two types of outliers: type (1), outliers that inflate variance, which are detected by the major PCs; and type (2), outliers that violate structure, which are detected by the minor PCs.

3.1.2 Data Reduction Technique

All multivariate techniques are prone to the bias-variance tradeoff, which states that the number of variables entering a model should be severely restricted. Data is often described by many more variables than are necessary for building the best model. PCA improves on other statistical reduction techniques in that it selects and feeds the model with a reduced number of variables.

3.1.3 Low Computational Requirement

PCA needs low computational effort since its algorithm consists of simple calculations.

3.2 PCA Disadvantages

It may be noted that PCA is based on the assumptions that the dimensionality of the data can be efficiently reduced by a linear transformation and that most information is contained in those directions where the input data variance is maximum. As is evident, these conditions are by no means always met. For example, if the points of an input set are positioned on the surface of a hypersphere, no linear transformation can reduce its dimension (a nonlinear transformation, however, can easily cope with this task).
From the above, the following disadvantages of PCA can be concluded:

3.2.1 Dependence On Linear Algebra

PCA relies on simple linear algebra as its main mathematical engine, and it is quite easy to interpret geometrically. But this strength is also a weakness, for it might very well be that other synthetic variables, more complex than plain linear combinations of the original variables, would lead to a better data description.

3.2.2 Smallest Principal Components Receive No Attention in Statistical Techniques

This lack of interest is due to the fact that, compared with the largest principal components, which contain most of the total variance in the data, the smallest principal components only contain the noise of the data and therefore appear to contribute minimal information. However, because outliers are a common source of noise, the smallest principal components should be useful for outlier detection.

3.2.3 High False Alarms

Principal components are sensitive to outliers, since the principal components are determined by their directions and calculated from classical estimators such as the classical mean and the classical covariance or correlation matrices.

IV. PCA ROBUSTIFICATION

In real datasets it often happens that some observations are different from the majority; such observations are called outliers, intrusions, discordant observations, etc. However, the classical PCA method can be
affected by outliers, so that the PCA model cannot detect all of the actually deviating observations; this is known as the masking effect. In addition, some good data points might even appear to be outliers; this is known as the swamping effect. Masking and swamping cause PCA to generate a high number of false alarms. To reduce these false alarms the use of robust estimators was proposed, since outlying points are less likely to enter into the calculation of robust estimators. The well-known PCA robustification methods are the Minimum Covariance Determinant (MCD) and the Projection-Pursuit (PP) principle.

The objective of the raw MCD is to find h > n/2 observations out of n whose covariance matrix has the smallest determinant. Its breakdown value is bn = (n − h + 1)/n, hence the number h determines the robustness of the estimator. In the Projection-Pursuit principle [3], one projects the data onto a lower-dimensional space such that a robust measure of variance of the projected data is maximized. PP is applied where the number of variables or dimensions is very large, so PP has an advantage over MCD, since MCD requires the dimension of the dataset not to exceed 50. Principal Component Analysis (PCA) is an example of the PP approach, because both search for the directions with maximal dispersion of the data projected on them; but instead of using the variance as the measure of dispersion, PP uses a robust scale estimator [4].

V. EXPERIMENTS AND RESULTS

In this section we show how we test PCA and its robustification methods, MCD and PP, on a dataset. The data used consists of OD (Origin-Destination) flows which were collected and made available by Zhang []. The dataset is an extraction of sixty minutes of traffic flows from the first week of the traffic matrix, which is the traffic matrix Yin Zhang built from the Abilene network.
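The breakdown value bn from Section 4 can be made concrete with a quick computation. The subset-size rule h = floor((n + p + 1) / 2) below is the common default from the MCD literature, used here only as an assumption; n and p match this paper's 144 by 12 data matrix:

```python
def mcd_breakdown(n: int, h: int) -> float:
    """Breakdown value bn = (n - h + 1) / n of the raw MCD with subset size h."""
    if not (n // 2 < h <= n):
        raise ValueError("h must satisfy n/2 < h <= n")
    return (n - h + 1) / n

# Common default subset size h = floor((n + p + 1) / 2).
n, p = 144, 12
h = (n + p + 1) // 2            # 78
bn = mcd_breakdown(n, h)        # (144 - 78 + 1) / 144, roughly 0.465
```

Larger h makes the estimator less robust (smaller bn) but more statistically efficient; h = n recovers the classical, non-robust covariance.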
Availability of the dataset is in offline mode, as it is extracted from an offline traffic matrix.

5.1 PCA on Dataset

First, the traffic matrix is arranged into the data matrix X, where rows represent observations and columns represent variables or dimensions:

X(144,12) = [ x1,1 … x1,12 ; … ; x144,1 … x144,12 ]

The following steps are considered in applying the PCA method to the dataset. First, center the dataset to have zero mean; the mean vector is calculated from the following equation:

μ = (1/n) Σi=1..n xi    (1)

and the mean is subtracted off for each dimension. The product of this step is a centered data matrix Y, which has the same size as the original dataset:

Y(n,p) = (xi,j − μj)    (2)

The covariance matrix is calculated from the following equation:

C(X) or Σ(X) = (1/(n−1)) (X − T(X))ᵀ (X − T(X))    (3)

where T(X) is the location (mean) of X. Next, find the eigenvectors and eigenvalues of the covariance matrix, where the eigenvalues are the diagonal elements obtained by the eigen-decomposition in equation (4):

Eᵀ ΣY E = Λ    (4)

where E holds the eigenvectors and Λ the eigenvalues. Order the eigenvalues in decreasing order and sort the eigenvectors according to the ordered eigenvalues; the sorted eigenvector matrix is the loadings matrix. Finally, calculate the scores matrix (the dataset projected on the principal components), which expresses the relations between the principal components and the observations:

scores(n,p) = Y(n,p) · loadings(p,p)    (5)
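The steps above can be sketched in a few lines of NumPy. The data here is synthetic, standing in for the 144 by 12 traffic matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(144, 12))        # synthetic stand-in for the traffic matrix

# Equations (1)-(2): center each column.
mu = X.mean(axis=0)
Y = X - mu

# Equation (3): sample covariance matrix.
C = (Y.T @ Y) / (len(X) - 1)

# Equation (4): eigen-decomposition, eigenvalues sorted in decreasing order.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, loadings = eigvals[order], eigvecs[:, order]

# Equation (5): scores = centered data times loadings.
scores = Y @ loadings

# Fraction of total variance carried by the first two PCs (the scree-plot decision).
explained = eigvals[:2].sum() / eigvals.sum()
```

A useful sanity check is that the covariance matrix of the scores is diagonal with the eigenvalues on the diagonal, confirming the components are uncorrelated.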
Applying the 97.5% tolerance ellipse to the bivariate datasets (the data projected on the first PCs, and the data projected on the minor PCs) reveals the outliers automatically. The ellipse is defined by the data points whose distance equals the square root of the 97.5% quantile of the chi-square distribution with 2 degrees of freedom. The form of the distance cutoff is

dist = sqrt(χ²2,0.975)    (6)

The screeplot was studied: the first and second principal components accounted for 98% of the total variance of the dataset, so the first two principal components were retained to represent the dataset as a whole. Figure (1) shows the screeplot; the plot of the data projected onto the first two principal components, which reveals the outliers in the dataset visually, is shown in figure (2).

[Figure 1: PCA Screeplot]
[Figure 2: PCA Visual Outliers]

Figure (3) shows the tolerance ellipse on the major PCs, and figures (4) and (5) show, respectively, the scatter plot of the data projected on the minor principal components, and the outliers detected by the minor principal components tuned by the tolerance ellipse.

[Figure 3: PCA Tolerance Ellipse]
[Figure 4: PCA Type-2 Outliers]
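The cutoff in equation (6) can be computed directly with SciPy. The score values and unit PC variances below are made-up illustrations, not taken from the Abilene data:

```python
import numpy as np
from scipy.stats import chi2

# Equation (6): flag a point when its score distance in the plane of two PCs
# exceeds the square root of the 0.975 chi-square quantile with 2 degrees of freedom.
cutoff = np.sqrt(chi2.ppf(0.975, df=2))     # about 2.716

# Toy scores of three observations on two PCs with unit variances (assumed).
scores2 = np.array([[0.1, -0.2],
                    [2.5,  3.0],
                    [-0.3, 0.4]])
pc_var = np.array([1.0, 1.0])
dist = np.sqrt((scores2 ** 2 / pc_var).sum(axis=1))
outliers = np.where(dist > cutoff)[0]       # only the second observation is flagged
```

With more than two retained components the same rule applies with the chi-square degrees of freedom set to the number of components.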
[Figure 5: Tuned Minor PCs]

5.2 MCD on Dataset

Testing the robust MCD (Minimum Covariance Determinant) estimator yields the robust location measure Tmcd and the robust dispersion Σmcd. The following steps are applied to test MCD on the dataset in order to reach the robust principal components. The MCD robust distance is calculated from the formula:

R = (xi − Tmcd(X))ᵀ · Σmcd(X)⁻¹ · (xi − Tmcd(X)),  for i = 1 to n    (7)

From the robust location Tmcd (or μmcd) and the robust covariance matrix Σmcd the following are calculated: the robust eigenvalues, as a diagonal matrix, as in equation (4) with n replaced by h; and the robust eigenvectors, as the loadings matrix, as in equation (5). The robust scores matrix is calculated in the following form:

robustscores(n,p) = Y(n,p) · loadings(p,p)    (8)

The robust screeplot, retaining the first two robust principal components, which accounted for above 98% of the total variance, is shown in figure (6). Figures (7) and (8) show, respectively, the scatter plot of the data projected on the robust major principal components, and the outliers detected by the robust major principal components tuned by the tolerance ellipse. Figures (9) and (10) show, respectively, the scatter plot of the data projected on the robust minor principal components, and the outliers detected by the robust minor principal components tuned by the tolerance ellipse.

[Figure 6: MCD Screeplot]
[Figure 7: MCD Visual Outliers]
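A sketch of this procedure using scikit-learn's MinCovDet in place of the paper's MCD implementation; the data is synthetic with a few planted gross outliers, not the Abilene matrix:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(1)
X = rng.normal(size=(144, 5))
X[:10] += 10.0                      # plant ten gross outliers

# Robust location and scatter from the MCD.
mcd = MinCovDet(random_state=0).fit(X)
T_mcd, S_mcd = mcd.location_, mcd.covariance_

# PCA on the robust covariance: eigen-decomposition, largest eigenvalues first.
eigvals, eigvecs = np.linalg.eigh(S_mcd)
order = np.argsort(eigvals)[::-1]
loadings = eigvecs[:, order]

# Equation (8): robust scores, centering with the MCD location.
robust_scores = (X - T_mcd) @ loadings

# Squared robust distances as in equation (7) separate the planted outliers.
d2 = mcd.mahalanobis(X)
```

Because the MCD fit is driven by the clean majority, the planted points receive far larger robust distances than any clean point, which is exactly the masking resistance classical PCA lacks.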
[Figure 8: MCD Tolerance Ellipse]
[Figure 9: MCD Type-2 Outliers]
[Figure 10: MCD Tuned Minor PCs]

5.3 Projection Pursuit on Dataset

Testing the projection pursuit method on the dataset involves the following steps. Center the data matrix X(n,p) around the L1-median to reach the centralized data matrix Y(n,p):

Y(n,p) = X(n,p) − L(X)    (9)

where L(X) is a highly robust estimator of multivariate data location that resists up to 50% of outliers []. Construct the candidate directions Pi as the normalized rows of the matrix; this process includes the following:

PYi = Y[i, :]  for i = 1, …, n    (10)

NPYi = max(svd(PYi))    (11)

where SVD stands for singular value decomposition, so NPYi is the norm of the row.

Pi = PYi / NPYi    (12)

Project the dataset on all possible directions:

Ti = Y · Piᵀ    (13)

Calculate the robust scale estimator for all the projections, and find the direction that maximizes the Qn estimator:

q = argmaxi Qn(Ti)    (14)

Qn is a scale estimator; essentially it is the first quartile of all pairwise distances between data points [5]. These steps yield the robust eigenvectors (PCs), and the square of the robust scale estimator gives the eigenvalues. Project all the data on the selected direction Pq to obtain the robust principal components:

t = Y · Pqᵀ    (15)

Update the data matrix by its orthogonal complement:

Y = Y − (Y · Pqᵀ) Pq    (16)
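The direction search in equations (9)-(14) can be sketched compactly. Two stand-ins are assumed here: the coordinatewise median replaces the L1-median for L(X), and the MAD replaces the Qn scale estimator; the data is synthetic:

```python
import numpy as np

def mad(z):
    # Median absolute deviation, consistency-scaled; a stand-in for the
    # Qn estimator used in the paper.
    return 1.4826 * np.median(np.abs(z - np.median(z)))

def pp_first_direction(X):
    # Candidate directions are the normalized centered observations
    # (equations 10-12); keep the one maximizing a robust scale of the
    # projected data (equations 13-14).
    Y = X - np.median(X, axis=0)            # coordinatewise median as location
    norms = np.linalg.norm(Y, axis=1)       # same as max(svd(row)) per row
    cands = Y[norms > 0] / norms[norms > 0, None]
    scales = np.array([mad(Y @ p) for p in cands])
    best = int(np.argmax(scales))
    return cands[best], scales[best]

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4)) * np.array([5.0, 1.0, 1.0, 1.0])
direction, scale = pp_first_direction(X)
# The winning direction leans heavily toward the high-variance first axis.
```

Repeating the search on the deflated matrix of equation (16) yields the subsequent robust components one at a time.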
Project all the data on the orthogonal complement:

scores = Y · Piᵀ    (17)

The plot of the data projected on the first two robust principal components, to detect outliers visually, is shown in figure (11), and the tuning of the first two robust principal components by the tolerance ellipse is shown in figure (12). Figures (13) and (14) show, respectively, the plot of the data projected on the minor robust principal components, to detect outliers visually, and the tuning of the last robust principal components by the tolerance ellipse.

[Figure 11: PP Visual Outliers]
[Figure 12: PP Tolerance Ellipse]
[Figure 13: PP Type-2 Outliers]
[Figure 14: PP Tuned Minor PCs]

5.4 Results

Table (1) summarizes the outliers detected by each method. The table shows that PCA suffers from both masking and swamping. The MCD and PP results reveal the effects of masking and swamping on the PCA method. The PP results are similar to those of MCD, with slight differences, since we use 2 dimensions of the dataset.

Table 1: Outliers Detection

PCA (major and minor PCs) | MCD (major and minor PCs) | PP (major and minor PCs) | Masking | Swamping
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
 |  |  | No | No
Normal | Normal | 69 | Yes | No
Normal | Normal | 7 | Yes | No
7 | Normal | Normal | No | Yes
76 | Normal | Normal | No | Yes
8 | Normal | Normal | No | Yes
 | Normal | Normal | No | Yes
4 | Normal | Normal | No | Yes
 | Normal | Normal | No | Yes
44 | Normal | Normal | No | Yes
 | Normal | Normal | No | Yes
 | Normal | Normal | Yes | No
 | Normal | Normal | Yes | No
 | Normal |  | Yes | No
 | Normal |  | Yes | No

VI. CONCLUSION AND FUTURE WORK

The study has examined the performance of PCA and its robustification methods (MCD, PP) for intrusion detection by presenting the bi-plots and extracting outlying observations that are very different from the bulk of the data. The study showed that the tuned results are identical to the visualized ones. The study attributes the PCA false alarms to the masking and swamping effects. The comparison showed that the PP results are similar to those of MCD, with slight differences in type-2 outliers, since these are considered a source of noise. Our future work will go into applying the hybrid method (ROBPCA), which uses PP as a reduction technique and MCD as a robust measure, for further performance, and applying a dynamic robust PCA model for online intrusion detection.

REFERENCES

[]. Abilene TMs, collected by Zhang. Visited on 3/7/2012.
[2]. Khalid Labib and V. Rao Vemuri, "An application of principal components analysis to the detection and visualization of computer network attacks", Annals of Telecommunications, pages 2834, 2005.
[3]. C. Croux and A. Ruiz-Gazen, "A fast algorithm for robust principal components based on projection pursuit", COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, 1996.
[4]. Mei-Ling Shyu, Shu-Ching Chen, Kanoksri Sarinnapakorn, and LiWu Chang, "A novel anomaly detection scheme based on principal component classifier", in Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, in conjunction with the Third IEEE International Conference on Data Mining (ICDM'03), 2003.
[5]. J. Edward Jackson, "A User's Guide to Principal Components",
Wiley-Interscience, 1st edition, 2003.
[6]. Anukool Lakhina, Mark Crovella, and Christophe Diot, "Diagnosing network-wide traffic anomalies", in Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ACM, 2004.
[7]. Yacine Bouzida, Frédéric Cuppens, Nora Cuppens-Boulahia, and Sylvain Gombault, "Efficient intrusion detection using principal component analysis", La Londe, France, June 2004.
[8]. R. Gnanadesikan, "Methods for Statistical Data Analysis of Multivariate Observations", Wiley-Interscience, New York, 2nd edition, 1997.
[9]. J. Terrell, K. Jeffay, L. Zhang, H. Shen, Z. Zhu, and A. Nobel, "Multivariate SVD analysis for network anomaly detection", in Proceedings of the ACM SIGCOMM Conference, 2005.
[]. Challa S. Sastry, Sanjay Rawat, Arun K. Pujari, and V. P. Gulati, "Network traffic analysis using singular value decomposition and multiscale transforms", Information Sciences: An International Journal.
[]. I. T. Jolliffe, "Principal Component Analysis", Springer Series in Statistics, Springer, New York, 2nd edition, 2007.
[2]. Wei Wang, Xiaohong Guan, and Xiangliang Zhang, "Processing of massive audit data streams for real-time anomaly intrusion detection", Computer Communications, Elsevier, 2008.
[3]. A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E. Kolaczyk, and N. Taft, "Structural analysis of network traffic flows", in Proceedings of SIGMETRICS, New York, NY, USA, 2004.

AUTHORS BIOGRAPHIES

Nada Badr earned her BSc in Mathematical and Computer Science at the University of Gezira, Sudan. She received her MSc in Computer Science at the University of Science and Technology. She is pursuing her PhD in Computer Science at the University of Science and Technology, Omdurman, Sudan. She is currently serving as a lecturer at the University of Science and Technology, Faculty of Computer Science and Information Technology.

Noureldien A. Noureldien is working as an associate professor of Computer Science, Department of Computer Science and Information Technology, University of Science and Technology, Omdurman, Sudan. He received his BSc and MSc from the School of Mathematical Sciences, University of Khartoum, and received his PhD in Computer Science from the University of Science and Technology, Khartoum, Sudan. He has many papers published in journals of repute. He is currently working as the dean of the Faculty of Computer Science and Information Technology at the University of Science and Technology, Omdurman, Sudan.
More informationLecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, data types 3 Data sources and preparation Project 1 out 4
Lecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, data types 3 Data sources and preparation Project 1 out 4 Data reduction, similarity & distance, data augmentation
More informationData Mining Lecture 4: Covariance, EVD, PCA & SVD
Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The
More informationRobustness of Principal Components
PCA for Clustering An objective of principal components analysis is to identify linear combinations of the original variables that are useful in accounting for the variation in those original variables.
More informationDamage detection in the presence of outliers based on robust PCA Fahit Gharibnezhad 1, L.E. Mujica 2, Jose Rodellar 3 1,2,3
Damage detection in the presence of outliers based on robust PCA Fahit Gharibnezhad 1, L.E. Mujica 2, Jose Rodellar 3 1,2,3 Escola Universitària d'enginyeria Tècnica Industrial de Barcelona,Department
More informationMultivariate Statistics (I) 2. Principal Component Analysis (PCA)
Multivariate Statistics (I) 2. Principal Component Analysis (PCA) 2.1 Comprehension of PCA 2.2 Concepts of PCs 2.3 Algebraic derivation of PCs 2.4 Selection and goodness-of-fit of PCs 2.5 Algebraic derivation
More informationPrinciple Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA
Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis: Uses one group of variables (we will call this X) In
More informationROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015
ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti
More informationMachine Learning (Spring 2012) Principal Component Analysis
1-71 Machine Learning (Spring 1) Principal Component Analysis Yang Xu This note is partly based on Chapter 1.1 in Chris Bishop s book on PRML and the lecture slides on PCA written by Carlos Guestrin in
More informationPrincipal Component Analysis. Applied Multivariate Statistics Spring 2012
Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction
More informationDimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining
Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can
More informationSTATISTICAL LEARNING SYSTEMS
STATISTICAL LEARNING SYSTEMS LECTURE 8: UNSUPERVISED LEARNING: FINDING STRUCTURE IN DATA Institute of Computer Science, Polish Academy of Sciences Ph. D. Program 2013/2014 Principal Component Analysis
More informationEvaluation of robust PCA for supervised audio outlier detection
Evaluation of robust PCA for supervised audio outlier detection Sarka Brodinova, Vienna University of Technology, sarka.brodinova@tuwien.ac.at Thomas Ortner, Vienna University of Technology, thomas.ortner@tuwien.ac.at
More informationNoise & Data Reduction
Noise & Data Reduction Paired Sample t Test Data Transformation - Overview From Covariance Matrix to PCA and Dimension Reduction Fourier Analysis - Spectrum Dimension Reduction 1 Remember: Central Limit
More informationAnomaly Detection via Online Oversampling Principal Component Analysis
Anomaly Detection via Online Oversampling Principal Component Analysis R.Sundara Nagaraj 1, C.Anitha 2 and Mrs.K.K.Kavitha 3 1 PG Scholar (M.Phil-CS), Selvamm Art Science College (Autonomous), Namakkal,
More informationPrincipal Component Analysis and Singular Value Decomposition. Volker Tresp, Clemens Otte Summer 2014
Principal Component Analysis and Singular Value Decomposition Volker Tresp, Clemens Otte Summer 2014 1 Motivation So far we always argued for a high-dimensional feature space Still, in some cases it makes
More informationMotivating the Covariance Matrix
Motivating the Covariance Matrix Raúl Rojas Computer Science Department Freie Universität Berlin January 2009 Abstract This note reviews some interesting properties of the covariance matrix and its role
More informationCSC 411 Lecture 12: Principal Component Analysis
CSC 411 Lecture 12: Principal Component Analysis Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 12-PCA 1 / 23 Overview Today we ll cover the first unsupervised
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationPrincipal Component Analysis (PCA) Principal Component Analysis (PCA)
Recall: Eigenvectors of the Covariance Matrix Covariance matrices are symmetric. Eigenvectors are orthogonal Eigenvectors are ordered by the magnitude of eigenvalues: λ 1 λ 2 λ p {v 1, v 2,..., v n } Recall:
More informationPrincipal Component Analysis (PCA) Theory, Practice, and Examples
Principal Component Analysis (PCA) Theory, Practice, and Examples Data Reduction summarization of data with many (p) variables by a smaller set of (k) derived (synthetic, composite) variables. p k n A
More informationIntroduction to Principal Component Analysis (PCA)
Introduction to Principal Component Analysis (PCA) NESAC/BIO NESAC/BIO Daniel J. Graham PhD University of Washington NESAC/BIO MVSA Website 2010 Multivariate Analysis Multivariate analysis (MVA) methods
More informationClusters. Unsupervised Learning. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved
Clusters Unsupervised Learning Luc Anselin http://spatial.uchicago.edu 1 curse of dimensionality principal components multidimensional scaling classical clustering methods 2 Curse of Dimensionality 3 Curse
More informationLinear Dimensionality Reduction
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Principal Component Analysis 3 Factor Analysis
More informationMachine Learning Approaches to Network Anomaly Detection
Machine Learning Approaches to Network Anomaly Detection Tarem Ahmed, Boris Oreshkin and Mark Coates tarem.ahmed@mail.mcgill.ca, boris.oreshkin@mail.mcgill.ca, coates@ece.mcgill.ca USENIX SysML, Cambridge,
More informationDimensionality Reduction Techniques (DRT)
Dimensionality Reduction Techniques (DRT) Introduction: Sometimes we have lot of variables in the data for analysis which create multidimensional matrix. To simplify calculation and to get appropriate,
More informationDegenerate Expectation-Maximization Algorithm for Local Dimension Reduction
Degenerate Expectation-Maximization Algorithm for Local Dimension Reduction Xiaodong Lin 1 and Yu Zhu 2 1 Statistical and Applied Mathematical Science Institute, RTP, NC, 27709 USA University of Cincinnati,
More informationMATH 829: Introduction to Data Mining and Analysis Principal component analysis
1/11 MATH 829: Introduction to Data Mining and Analysis Principal component analysis Dominique Guillot Departments of Mathematical Sciences University of Delaware April 4, 2016 Motivation 2/11 High-dimensional
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationDimensionality Reduction
Dimensionality Reduction Le Song Machine Learning I CSE 674, Fall 23 Unsupervised learning Learning from raw (unlabeled, unannotated, etc) data, as opposed to supervised data where a classification of
More informationMachine Learning 2nd Edition
INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010
More informationLecture: Face Recognition and Feature Reduction
Lecture: Face Recognition and Feature Reduction Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab Lecture 11-1 Recap - Curse of dimensionality Assume 5000 points uniformly distributed
More informationECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering
More informationLinear Algebra & Geometry why is linear algebra useful in computer vision?
Linear Algebra & Geometry why is linear algebra useful in computer vision? References: -Any book on linear algebra! -[HZ] chapters 2, 4 Some of the slides in this lecture are courtesy to Prof. Octavia
More informationPrincipal Component Analysis -- PCA (also called Karhunen-Loeve transformation)
Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation) PCA transforms the original input space into a lower dimensional space, by constructing dimensions that are linear combinations
More informationCHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION
59 CHAPTER 4 PRINCIPAL COMPONENT ANALYSIS-BASED FUSION 4. INTRODUCTION Weighted average-based fusion algorithms are one of the widely used fusion methods for multi-sensor data integration. These methods
More informationRare Event Discovery And Event Change Point In Biological Data Stream
Rare Event Discovery And Event Change Point In Biological Data Stream T. Jagadeeswari 1 M.Tech(CSE) MISTE, B. Mahalakshmi 2 M.Tech(CSE)MISTE, N. Anusha 3 M.Tech(CSE) Department of Computer Science and
More informationCS 4495 Computer Vision Principle Component Analysis
CS 4495 Computer Vision Principle Component Analysis (and it s use in Computer Vision) Aaron Bobick School of Interactive Computing Administrivia PS6 is out. Due *** Sunday, Nov 24th at 11:55pm *** PS7
More informationECE 661: Homework 10 Fall 2014
ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationISSN: (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 3, Issue 5, May 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at:
More informationClassification 2: Linear discriminant analysis (continued); logistic regression
Classification 2: Linear discriminant analysis (continued); logistic regression Ryan Tibshirani Data Mining: 36-462/36-662 April 4 2013 Optional reading: ISL 4.4, ESL 4.3; ISL 4.3, ESL 4.4 1 Reminder:
More informationCS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.
CS570 Data Mining Anomaly Detection Li Xiong Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber April 3, 2011 1 Anomaly Detection Anomaly is a pattern in the data that does not conform
More informationCourse in Data Science
Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an
More informationModeling Classes of Shapes Suppose you have a class of shapes with a range of variations: System 2 Overview
4 4 4 6 4 4 4 6 4 4 4 6 4 4 4 6 4 4 4 6 4 4 4 6 4 4 4 6 4 4 4 6 Modeling Classes of Shapes Suppose you have a class of shapes with a range of variations: System processes System Overview Previous Systems:
More informationFactor Analysis (10/2/13)
STA561: Probabilistic machine learning Factor Analysis (10/2/13) Lecturer: Barbara Engelhardt Scribes: Li Zhu, Fan Li, Ni Guan Factor Analysis Factor analysis is related to the mixture models we have studied.
More informationEigenvalues, Eigenvectors, and an Intro to PCA
Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.
More informationSparse PCA for high-dimensional data with outliers
Sparse PCA for high-dimensional data with outliers Mia Hubert Tom Reynkens Eric Schmitt Tim Verdonck Department of Mathematics, KU Leuven Leuven, Belgium June 25, 2015 Abstract A new sparse PCA algorithm
More informationEigenvalues, Eigenvectors, and an Intro to PCA
Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.
More informationDIMENSION REDUCTION OF THE EXPLANATORY VARIABLES IN MULTIPLE LINEAR REGRESSION. P. Filzmoser and C. Croux
Pliska Stud. Math. Bulgar. 003), 59 70 STUDIA MATHEMATICA BULGARICA DIMENSION REDUCTION OF THE EXPLANATORY VARIABLES IN MULTIPLE LINEAR REGRESSION P. Filzmoser and C. Croux Abstract. In classical multiple
More informationIMPROVING THE SMALL-SAMPLE EFFICIENCY OF A ROBUST CORRELATION MATRIX: A NOTE
IMPROVING THE SMALL-SAMPLE EFFICIENCY OF A ROBUST CORRELATION MATRIX: A NOTE Eric Blankmeyer Department of Finance and Economics McCoy College of Business Administration Texas State University San Marcos
More informationRegularized Discriminant Analysis and Reduced-Rank LDA
Regularized Discriminant Analysis and Reduced-Rank LDA Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Regularized Discriminant Analysis A compromise between LDA and
More informationDecember 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis
.. December 20, 2013 Todays lecture. (PCA) (PLS-R) (LDA) . (PCA) is a method often used to reduce the dimension of a large dataset to one of a more manageble size. The new dataset can then be used to make
More informationCPSC 340: Machine Learning and Data Mining. More PCA Fall 2017
CPSC 340: Machine Learning and Data Mining More PCA Fall 2017 Admin Assignment 4: Due Friday of next week. No class Monday due to holiday. There will be tutorials next week on MAP/PCA (except Monday).
More informationPRINCIPAL COMPONENTS ANALYSIS
121 CHAPTER 11 PRINCIPAL COMPONENTS ANALYSIS We now have the tools necessary to discuss one of the most important concepts in mathematical statistics: Principal Components Analysis (PCA). PCA involves
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction
More informationLecture: Face Recognition and Feature Reduction
Lecture: Face Recognition and Feature Reduction Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab 1 Recap - Curse of dimensionality Assume 5000 points uniformly distributed in the
More information-Principal components analysis is by far the oldest multivariate technique, dating back to the early 1900's; ecologists have used PCA since the
1 2 3 -Principal components analysis is by far the oldest multivariate technique, dating back to the early 1900's; ecologists have used PCA since the 1950's. -PCA is based on covariance or correlation
More informationPrompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014
Prompt Network Anomaly Detection using SSA-Based Change-Point Detection Hao Chen 3/7/2014 Network Anomaly Detection Network Intrusion Detection (NID) Signature-based detection Detect known attacks Use
More informationPrincipal Component Analysis (PCA) CSC411/2515 Tutorial
Principal Component Analysis (PCA) CSC411/2515 Tutorial Harris Chan Based on previous tutorial slides by Wenjie Luo, Ladislav Rampasek University of Toronto hchan@cs.toronto.edu October 19th, 2017 (UofT)
More informationLEC 2: Principal Component Analysis (PCA) A First Dimensionality Reduction Approach
LEC 2: Principal Component Analysis (PCA) A First Dimensionality Reduction Approach Dr. Guangliang Chen February 9, 2016 Outline Introduction Review of linear algebra Matrix SVD PCA Motivation The digits
More informationStreaming multiscale anomaly detection
Streaming multiscale anomaly detection DATA-ENS Paris and ThalesAlenia Space B Ravi Kiran, Université Lille 3, CRISTaL Joint work with Mathieu Andreux beedotkiran@gmail.com June 20, 2017 (CRISTaL) Streaming
More informationMultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A
MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018 Pietro Guccione, PhD DEI - DIPARTIMENTO DI INGEGNERIA ELETTRICA E DELL INFORMAZIONE POLITECNICO DI
More informationDeriving Principal Component Analysis (PCA)
-0 Mathematical Foundations for Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deriving Principal Component Analysis (PCA) Matt Gormley Lecture 11 Oct.
More informationRobust estimation of principal components from depth-based multivariate rank covariance matrix
Robust estimation of principal components from depth-based multivariate rank covariance matrix Subho Majumdar Snigdhansu Chatterjee University of Minnesota, School of Statistics Table of contents Summary
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions
More information