Techniques and Applications of Multivariate Analysis

Department of Statistics
Professor Yong-Seok Choi
E-mail: yschoi@pusan.ac.kr
Home: yschoi.pusan.ac.kr

Contents

Multivariate Statistics (I) in Spring
1. Introduction of Multivariate Analysis
2. Principal Component Analysis (PCA)
3. Factor Analysis (FA)
4. Canonical Correlation Analysis (CCA)
5. Cluster Analysis (CA)

Multivariate Statistics (II) in Autumn
6. Discrimination and Classification Analysis (DCA)
7. Multidimensional Scaling (MDS)
8. Correspondence Analysis (CRA)
9. Biplots
10. Estimation and Testing

Lecture 1. Introduction of Multivariate Analysis

Lecture 1-1
1.1 Multivariate data analysis
1.3 Matrix representation of multivariate data

Lecture 1-2
1.4 Descriptive statistics
1.5 Multivariate normal distribution and its useful properties
1.6 Testing multivariate normality
1.7 Appendix: A1 and A2

1.1 Multivariate Data Analysis

Definition: A collection of techniques dealing with data containing observations on two or more variables. In general, the data contain n observations o_1, ..., o_n on p variables x_1, ..., x_p.

Techniques based on geometrical ideas:
- R-techniques: analyses based on the matrix of covariances or correlations between variables (PCA, FA, CCA, Biplot).
- Q-techniques: analyses based on the matrix of distances between observations (CA, DA, MDS).
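The two families start from different matrices, which a small numerical sketch can make concrete. The example below assumes NumPy and uses the three-student marks table from these notes. The variable names (X, S, R, D) and the choice of Euclidean distance are illustrative assumptions, not part of the lecture.

```python
import numpy as np

# Marks of three students on Mechanics, Algebra and Statistics
# (the three-student table in these notes).
X = np.array([[77, 67, 81],
              [63, 80, 81],
              [50, 50, 50]], dtype=float)
n, p = X.shape  # n observations (rows), p variables (columns)

# R-technique input: p x p matrices describing relations between variables.
S = np.cov(X, rowvar=False)        # sample covariance matrix (divisor n - 1)
R = np.corrcoef(X, rowvar=False)   # sample correlation matrix

# Q-technique input: n x n matrix of distances between observations.
# Euclidean distance is an assumption here; other distances are possible.
diff = X[:, None, :] - X[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=2))

print("covariance matrix (p x p):\n", S)
print("distance matrix (n x n):\n", D)
```

Here n and p happen to be equal, so both matrices are 3 x 3. In general S and R are p x p, while D is n x n.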

Geometrical Representations in 3-dimensional space

Students  Mechanics  Algebra  Statistics
1         77         67       81
2         63         80       81
3         50         50       50

a) n = 3 points in p-space
b) p = 3 points in n-space
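The two representations correspond to reading the same data matrix by rows or by columns. The sketch below assumes NumPy, and the variable names are illustrative. It also checks a standard fact that the slide does not state: once each variable vector is centred at its mean, the cosine of the angle between two variable vectors in n-space equals their sample correlation.

```python
import numpy as np

X = np.array([[77, 67, 81],
              [63, 80, 81],
              [50, 50, 50]], dtype=float)

# View (a): each row is one student, a point in p = 3 dimensional variable space.
row_points = X

# View (b): each column is one variable, a point in n = 3 dimensional observation space.
col_points = X.T

# Centre each variable vector at its own mean.
centred = col_points - col_points.mean(axis=1, keepdims=True)

# Cosine of the angle between every pair of centred variable vectors.
norms = np.linalg.norm(centred, axis=1)
cosines = (centred @ centred.T) / np.outer(norms, norms)

# The cosines should agree with the sample correlation matrix.
print(np.allclose(cosines, np.corrcoef(X, rowvar=False)))
```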

[Data 1] Examination marks on 5 subjects (Mardia et al., 1979, pp. 3-4)

Columns: Obs, Mechanics, Vectors, Algebra, Analysis, Statistics, with the subjects grouped into closed-book and open-book examinations.

Questions:
- How should these marks be combined or averaged?
- What is the relationship between open-book and closed-book marks?

[Data 2] Protein consumption in 25 European countries

Protein sources: Meat, Pigs, Eggs, Milk, Fish, Cereals, Starchy foods, Nuts/Oil-seeds, Fruits/Vegetables

Questions:
- Which countries have high consumption of each kind of protein?
- What is the relationship between protein sources?

[Data 3] Fitness club data (SAS Institute Inc., 1990, Chapter 15)

Variables: Physiological (Weight, Waist, Pulse) vs Exercise (Chins, Situps, Jumps)

Questions:
- How can the physiological variables be related to the exercise variables?

[Data 4] Fisher's Iris flower data (Johnson & Wichern, 2002, p. 657)

Types: Setosa, Versicolour, Virginica
Variables: X1: Sepal length, X2: Sepal width, X4: Petal length, X5: Petal width

Questions:
- To which species does a new iris of unknown species belong?
- How can criteria for classifying be found?

Sir Ronald Aylmer Fisher (17 February 1890 - 29 July 1962): English statistician, evolutionary biologist, and geneticist.

[Data 5] Dissimilarity matrix for economic views of 14 institutes (Choi, 1995, Chapter 1)

Questions:
- Which institutes have a similar economic view?
- What are their economic views?

[Data 6] Contingency table for academic careers and preference grades of a renowned brand (Baek and Lee)

Preference grades run from "Very" to "Not at all".

Questions:
- Are the rows and columns independent?
- Which columns are associated with which rows?
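For the independence question, one common starting point is the Pearson chi-square statistic, which compares the observed counts with the counts expected under independence (row total times column total, divided by the grand total). The sketch below assumes NumPy. The function name and the counts are hypothetical and are not the Baek and Lee data, whose values are not reproduced in these notes.

```python
import numpy as np

def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a two-way table."""
    observed = np.asarray(table, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)   # shape (r, 1)
    col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, c)
    total = observed.sum()
    # Expected counts under independence of rows and columns.
    expected = row_totals @ col_totals / total
    statistic = ((observed - expected) ** 2 / expected).sum()
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return statistic, dof

# HYPOTHETICAL counts for illustration only (not the data from the slide).
counts = [[10, 20],
          [20, 10]]
statistic, dof = chi_square_independence(counts)
print(statistic, dof)
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom suggests that rows and columns are not independent. Correspondence analysis, listed in the course contents, then addresses which rows and columns go together.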

Remark: Objectives of Multivariate Analysis
1. Data Reduction or Structure Simplification - Making interpretation easier
2. Sorting and Grouping - Creating groups of similar objects or variables
3. Investigation of the dependence among variables - Making independent variables
4. Prediction
5. Hypothesis Testing