STATISTICS 407 METHODS OF MULTIVARIATE ANALYSIS TOPICS


TOPICS

Principal Component Analysis (PCA): Reduce the dimension of the data, summarize the sources of variation in the data, and transform the data into a new data set where the variables are uncorrelated.

Factor Analysis (FA): When it's not possible to measure the variables of interest directly, measure what's possible and construct the variables of interest from the observed data.

Discriminant Analysis (classification, supervised learning): Build a rule to predict the class or group id from observed training data.

Cluster Analysis (unsupervised learning): Find similar groups of individuals, or cluster observations, based on their similarity.

M(ultivariate)ANOVA: Infer information about the population means based on the sample means.

Multivariate Regression and Canonical Correlation Analysis: Model relationships between sets of variables, and explore the association between a set of dependent variables and a set of explanatory variables.

PLUS how to plot multivariate data.

A TAXONOMY OF TECHNIQUES

Variable-directed: Quantifying the relationships between variables, e.g. principal component analysis, factor analysis, canonical correlation analysis.

Individual-directed: Summarizing relationships that exist between individuals, or experimental units, e.g. cluster analysis, discriminant analysis, MANOVA.

MULTIVARIATE DATA

Example: nutritional information for chocolates (100g equivalent) from around the world.

SOME MATH... Matrix Notation

Data (n observations, p variables) has matrix form as follows:

$$X = [X_1\ X_2\ \cdots\ X_p] = \begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1p} \\ X_{21} & X_{22} & \cdots & X_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{np} \end{pmatrix}_{n \times p}$$

$X_{ij}$ is the element in the $i$th row and $j$th column, that is, the $i$th case and $j$th variable.
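A minimal numpy sketch of this notation, using synthetic numbers rather than the chocolates data, to show how the n × p layout and the X_ij indexing look in code:

```python
import numpy as np

n, p = 10, 6                     # 10 observations, 6 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(n, p))      # rows = cases, columns = variables

print(X.shape)                   # (10, 6)
print(X[0, 1])                   # X_12: 1st case, 2nd variable (numpy indexes from 0)
```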

SOME MATH... Mean Vector, Variance-Covariance/Correlation Matrices

$$\bar{X} = \begin{pmatrix} \bar{X}_1 \\ \vdots \\ \bar{X}_p \end{pmatrix} \qquad S = \begin{pmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{pmatrix} \qquad R = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1p} \\ r_{21} & 1 & \cdots & r_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & \cdots & 1 \end{pmatrix}$$

How do you calculate $S_{11}$, $S_{12}$, $r_{12}$? (See the sketch below.)

MULTIVARIATE DATA

For the chocolates data example, we'd calculate the mean vector, var-cov matrix, and correlation matrix for the quantitative variables. For the categorical variables we'd report counts and proportions. The mean vector, var-cov and correlation matrices might also be reported for each category of the categorical variables.
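A minimal numpy sketch, on synthetic data, computing the mean vector, S, and R, and verifying $S_{11}$, $S_{12}$, and $r_{12}$ from their definitions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))          # synthetic 10 x 3 data matrix
n = X.shape[0]

xbar = X.mean(axis=0)                 # mean vector (one entry per variable)
S = np.cov(X, rowvar=False)           # variance-covariance matrix (n-1 divisor)
R = np.corrcoef(X, rowvar=False)      # correlation matrix

# By hand: S_11 is the sample variance of variable 1, S_12 the sample
# covariance of variables 1 and 2, and r_12 = S_12 / sqrt(S_11 * S_22).
S11 = np.sum((X[:, 0] - xbar[0]) ** 2) / (n - 1)
S12 = np.sum((X[:, 0] - xbar[0]) * (X[:, 1] - xbar[1])) / (n - 1)
r12 = S12 / np.sqrt(S11 * S[1, 1])

assert np.isclose(S11, S[0, 0])
assert np.isclose(S12, S[0, 1])
assert np.isclose(r12, R[0, 1])
```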

MULTIVARIATE DATA

Chocolates data: n = 10, p = 6 (Calories, TotFat, Chol, Sodium, Fiber, Sugars).

What's the mean Calories? The variance of Fiber? The correlation between Sugars and Calories? Which variables have negative covariance? Which variable has the largest variance? The standard deviation of Chol?

$$\bar{X} = \begin{pmatrix} 551.18 \\ 36.31 \\ 9.48 \\ 56.88 \\ 6.27 \\ 40.01 \end{pmatrix} \qquad S = \begin{pmatrix} 2299.6 & 229.26 & -156.05 & -296.3 & 48.78 & -425.1 \\ 229.3 & 45.14 & -16.62 & -166.5 & 5.97 & -66.3 \\ -156.1 & -16.62 & 115.50 & -108.8 & -5.65 & 60.4 \\ -296.3 & -166.50 & -108.79 & 2275.2 & -61.30 & 90.8 \\ 48.8 & 5.97 & -5.65 & -61.3 & 7.16 & -23.6 \\ -425.1 & -66.28 & 60.36 & 90.8 & -23.60 & 196.9 \end{pmatrix}$$

$$R = \begin{pmatrix} 1.000 & 0.712 & -0.303 & -0.130 & 0.380 & -0.632 \\ 0.712 & 1.000 & -0.230 & -0.520 & 0.332 & -0.703 \\ -0.303 & -0.230 & 1.000 & -0.212 & -0.197 & 0.400 \\ -0.130 & -0.520 & -0.212 & 1.000 & -0.480 & 0.136 \\ 0.380 & 0.332 & -0.197 & -0.480 & 1.000 & -0.629 \\ -0.632 & -0.703 & 0.400 & 0.136 & -0.629 & 1.000 \end{pmatrix}$$

DATA SUMMARY

Table 1: Summary statistics for nutritional information about chocolates from around the world, based on 10 observations.

          Calories   TotFat (g)   Cholesterol (mg)   Sodium (mg)   Fiber (g)   Sugars (g)
Mean      551.18     36.31        9.48               56.88         6.27        40.01
Std dev    47.95      6.72       10.75               47.70         2.68        14.03

It is also a good idea to include the minimum, maximum and median of each variable.
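A quick arithmetic check, using only numbers printed above, of how Table 1 relates to S: the standard deviations are the square roots of the diagonal of S, and $r_{12} = S_{12}/\sqrt{S_{11} S_{22}}$:

```python
import numpy as np

# Values read off the slides: variances of Calories and TotFat, and
# their covariance, from the chocolates var-cov matrix S.
S11, S22, S12 = 2299.6, 45.14, 229.26

sd_calories = np.sqrt(S11)              # ~47.95, matching Table 1
sd_totfat = np.sqrt(S22)                # ~6.72, matching Table 1
r12 = S12 / (sd_calories * sd_totfat)   # ~0.71, the Calories-TotFat correlation
print(sd_calories, sd_totfat, r12)
```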

DATA SUMMARY

Table 2: Correlations between the nutrition variables collected on chocolates from around the world, based on 10 observations.

              Calories   TotFat   Cholesterol   Sodium   Fiber   Sugars
Calories        1.00      0.71      -0.30       -0.13     0.38   -0.63
TotFat          0.71      1.00      -0.23       -0.52     0.33   -0.70
Cholesterol    -0.30     -0.23       1.00       -0.21    -0.20    0.40
Sodium         -0.13     -0.52      -0.21        1.00    -0.48    0.14
Fiber           0.38      0.33      -0.20       -0.48     1.00   -0.63
Sugars         -0.63     -0.70       0.40        0.14    -0.63    1.00

MULTIVARIATE DATA

Chocolates data: summary of the categorical variables using counts. These are the class (grouping) variables, or even the dependent variables, that might be used for classifying observations.

MORE MATH... Linear Combinations and Projections

If

$$\alpha = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_p \end{pmatrix}_{p \times 1}$$

then

$$X\alpha = \begin{pmatrix} \alpha_1 X_{11} + \alpha_2 X_{12} + \cdots + \alpha_p X_{1p} \\ \vdots \\ \alpha_1 X_{n1} + \alpha_2 X_{n2} + \cdots + \alpha_p X_{np} \end{pmatrix}_{n \times 1}.$$

If $\alpha_1^2 + \cdots + \alpha_p^2 = 1$ then $\alpha$ is a projection vector, and $X\alpha$ is a projection of the data. Used, for example, in principal component analysis.

MORE MATH... Distance Measures

For two points (rows of the data matrix) $A = (A_1\ A_2\ \cdots\ A_p)$ and $B = (B_1\ B_2\ \cdots\ B_p)$, Euclidean distance is defined as

$$d(A, B) = \sqrt{(A - B)(A - B)'} = \sqrt{(A_1 - B_1)^2 + \cdots + (A_p - B_p)^2},$$

and statistical distance (or Mahalanobis distance) is defined as

$$d(A, B) = \sqrt{(A - B) S^{-1} (A - B)'}.$$

Used, for example, in cluster analysis. Generally any distance measure can be defined, but it must satisfy (1) $d(A, B) = d(B, A)$; (2) $d(A, B) > 0$ if $A \neq B$; (3) $d(A, B) = 0$ if $A = B$; (4) $d(A, B) \leq d(A, C) + d(C, B)$, for any intermediate point $C$.
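A minimal numpy sketch, on synthetic data, of a projection and of both distance measures:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3))                 # synthetic 10 x 3 data matrix

# A projection vector: squared entries must sum to 1.
alpha = np.array([1.0, 1.0, 0.0])
alpha /= np.linalg.norm(alpha)               # alpha_1^2 + ... + alpha_p^2 = 1
proj = X @ alpha                             # n x 1 projection of the data

# Distances between two cases (rows) A and B.
A, B = X[0], X[1]
d_euclidean = np.sqrt((A - B) @ (A - B))     # Euclidean distance

S = np.cov(X, rowvar=False)
d_mahalanobis = np.sqrt((A - B) @ np.linalg.inv(S) @ (A - B))
print(d_euclidean, d_mahalanobis)
```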

MORE MATH... Scaling

The standardized data matrix is

$$Z = \begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1p} \\ z_{21} & z_{22} & \cdots & z_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ z_{n1} & z_{n2} & \cdots & z_{np} \end{pmatrix}, \quad \text{where } z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, \ldots, n;\ j = 1, \ldots, p.$$

The standardized data has mean vector all zeros, and variances all equal to 1. This is different from sphering! Standardizing doesn't change the correlation between variables; sphering does remove correlation. More to come on this.
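A minimal numpy sketch of the contrast, assuming "sphering" here means multiplying the centered data by $S^{-1/2}$ (the usual whitening transform): standardizing leaves the correlation matrix unchanged, while sphering makes the variance-covariance matrix the identity:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0.0, 0.0], [[2.0, 1.2], [1.2, 1.0]], size=200)

# Standardizing: subtract column means, divide by column standard deviations.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
print(np.corrcoef(Z, rowvar=False))   # same correlation matrix as the raw data

# Sphering: multiply the centered data by S^(-1/2), computed via an
# eigendecomposition of the variance-covariance matrix S.
S = np.cov(X, rowvar=False)
vals, vecs = np.linalg.eigh(S)
W = (X - X.mean(axis=0)) @ vecs @ np.diag(vals ** -0.5) @ vecs.T
print(np.cov(W, rowvar=False))        # ~ identity: the correlation is removed
```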
This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.