MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 8: Canonical Correlation Analysis


Contents: canonical correlation analysis and its solution, the sample version, testing independence, and canonical correlation scores and prediction.

Canonical correlation analysis involves a partition of the variables into two vectors $x$ and $y$. The aim is to find linear combinations $\alpha^T x$ and $\beta^T y$ that have the largest possible correlation.

Let $x$ be a $p$-variate random vector and let $y$ be a $q$-variate random vector. The objective in canonical correlation analysis is to find linear combinations $u_k = \alpha_k^T x$ and $v_k = \beta_k^T y$ that maximize the correlation $\operatorname{corr}(u_k, v_k)$ between $u_k$ and $v_k$ subject to $\operatorname{var}(u_k) = \operatorname{var}(v_k) = 1$, and $\operatorname{corr}(u_k, u_t) = 0$, $\operatorname{corr}(v_k, v_t) = 0$ for all $t < k$.

Correlation. Quick reminder: $\operatorname{corr}(w_1, w_2) = \dfrac{E[(w_1 - \mu_{w_1})(w_2 - \mu_{w_2})]}{\sigma_{w_1}\sigma_{w_2}}$.

The vectors $\alpha_k$ and $\beta_k$ are called the $k$th canonical vectors, and the $\rho_k = \operatorname{corr}(u_k, v_k)$ are called the canonical correlations.

Whereas principal component analysis considers interrelationships within a set of variables, canonical correlation analysis considers relationships between two groups of variables.

Examples: exercise and health; open book exams and closed book exams; job satisfaction and performance.

Regression. Canonical correlation analysis can be seen as an extension of multivariate regression analysis. However, note that in canonical correlation analysis there is no assumption of causal asymmetry: $x$ and $y$ are treated symmetrically!

Solution. Let $z = (x^T, y^T)^T$, and let
$$\operatorname{cov}(z) = \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.$$
Define
$$M_1 = \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \quad\text{and}\quad M_2 = \Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.$$

Now, the canonical vectors $\alpha_k$ are the eigenvectors of $M_1$ ($\alpha_k$ corresponding to the $k$th largest eigenvalue), the canonical vectors $\beta_k$ are the eigenvectors of $M_2$, and the squared canonical correlations $\rho_k^2$ are the eigenvalues of $M_1$ (and of $M_2$ as well). A proof of this solution can be found on pages 283–284 of [1].

Note that the eigenvectors $\alpha_k$ and $\beta_k$ do not automatically have unit length! The requirements $\operatorname{var}(u_k) = \operatorname{var}(\alpha_k^T x) = 1$ and $\operatorname{var}(v_k) = \operatorname{var}(\beta_k^T y) = 1$ determine the lengths of the eigenvectors.
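To make the recipe concrete, here is a minimal NumPy sketch of the eigen-solution above, assuming $\Sigma_{11}$ and $\Sigma_{22}$ are invertible (the function name cca_from_cov and its interface are illustrative, not from the lecture):

```python
import numpy as np

def cca_from_cov(Sigma, p):
    """Canonical vectors and correlations from a (p+q) x (p+q)
    covariance matrix, partitioned after the first p variables."""
    S11, S12 = Sigma[:p, :p], Sigma[:p, p:]
    S21, S22 = Sigma[p:, :p], Sigma[p:, p:]
    m = min(p, Sigma.shape[0] - p)

    # M1 and M2 as defined above; they share the eigenvalues rho_k^2.
    M1 = np.linalg.inv(S11) @ S12 @ np.linalg.inv(S22) @ S21
    M2 = np.linalg.inv(S22) @ S21 @ np.linalg.inv(S11) @ S12

    # Eigenvectors of M1 give the alpha_k, eigenvectors of M2 the beta_k,
    # both ordered by decreasing eigenvalue (taking real parts guards
    # against tiny imaginary numerical noise).
    w1, A = np.linalg.eig(M1)
    w2, B = np.linalg.eig(M2)
    A = A[:, np.argsort(w1.real)[::-1]].real[:, :m]
    B = B[:, np.argsort(w2.real)[::-1]].real[:, :m]
    rho = np.sqrt(np.clip(np.sort(w1.real)[::-1][:m], 0.0, 1.0))

    for k in range(m):
        # Rescale so that var(u_k) = alpha_k^T S11 alpha_k = 1 and
        # var(v_k) = beta_k^T S22 beta_k = 1, as required above.
        A[:, k] /= np.sqrt(A[:, k] @ S11 @ A[:, k])
        B[:, k] /= np.sqrt(B[:, k] @ S22 @ B[:, k])
        # Eigenvector signs are arbitrary; flip beta_k if needed so that
        # corr(u_k, v_k) = alpha_k^T S12 beta_k is nonnegative.
        if A[:, k] @ S12 @ B[:, k] < 0:
            B[:, k] = -B[:, k]
    return A, B, rho
```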

If the covariance matrices $\Sigma_{11}$ and $\Sigma_{22}$ are not of full rank, similar results may be obtained using generalized inverses. One may also consider dimension reduction as a first step.

Sample Version. Sample estimates $\hat\alpha_k$, $\hat\beta_k$, and $\hat\rho_k$ of $\alpha_k$, $\beta_k$, and $\rho_k$, respectively, are obtained by using sample covariance matrices computed from the samples $x_1, x_2, \dots, x_n$, $y_1, y_2, \dots, y_n$, and $z_1, z_2, \dots, z_n$.
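In code, the sample version simply feeds the sample covariance matrix of $z$ into the same routine; a short sketch reusing the hypothetical cca_from_cov helper above:

```python
import numpy as np

def sample_cca(X, Y):
    """Sample canonical vectors and correlations from an n x p data
    matrix X and an n x q data matrix Y."""
    Z = np.hstack([X, Y])                # rows are z_i = (x_i^T, y_i^T)^T
    Sigma_hat = np.cov(Z, rowvar=False)  # sample covariance matrix of z
    return cca_from_cov(Sigma_hat, X.shape[1])

# Toy usage on simulated data:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X @ rng.normal(size=(3, 2)) + 0.5 * rng.normal(size=(200, 2))
A_hat, B_hat, rho_hat = sample_cca(X, Y)
```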

Standardization. As in PCA, in canonical correlation analysis the data are sometimes standardized first.
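Since the covariance matrix of standardized data is the correlation matrix, standardizing first amounts to running the same computation on the sample correlation matrix; the canonical correlations are unchanged by this rescaling, only the canonical vectors change scale. A brief sketch with the same hypothetical helpers as above:

```python
# CCA on standardized data = CCA on the sample correlation matrix.
R_hat = np.corrcoef(np.hstack([X, Y]), rowvar=False)
A_std, B_std, rho_std = cca_from_cov(R_hat, X.shape[1])
```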

Testing Independence. Assume that $z = (x^T, y^T)^T \sim N_{p+q}(\mu, \Sigma)$. Consider testing $H_0$: $x$ and $y$ are independent, against $H_1$: $x$ and $y$ are not independent.

Let $m = \min\{p, q\}$, and let
$$T = -\left(n - 1 - \tfrac{1}{2}(p + q + 3)\right) \ln\left(\prod_{k=1}^{m}\bigl(1 - \hat\rho_k^2\bigr)\right).$$
Now, under $H_0$ (and under the assumption of multivariate normality), the test statistic $T$ is asymptotically $\chi^2(pq)$-distributed.
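A minimal sketch of this test in Python, taking the sample canonical correlations rho_hat computed earlier (the helper name is illustrative):

```python
import numpy as np
from scipy.stats import chi2

def independence_test(rho_hat, n, p, q):
    """Asymptotic chi^2 test of H0: x and y are independent,
    based on the sample canonical correlations (normality assumed)."""
    m = min(p, q)
    T = -(n - 1 - 0.5 * (p + q + 3)) * np.log(np.prod(1.0 - rho_hat[:m] ** 2))
    return T, chi2.sf(T, df=p * q)   # statistic and asymptotic p-value
```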

Testing Partial Independence. Assume that $z = (x^T, y^T)^T \sim N_{p+q}(\mu, \Sigma)$. Consider testing $H_0$: only $s$ of the canonical correlation coefficients are nonzero, against $H_1$: more than $s$ of the canonical correlation coefficients are nonzero.

Let $m = \min\{p, q\}$, and let
$$T_s = -\left(n - 1 - \tfrac{1}{2}(p + q + 3)\right) \ln\left(\prod_{k=s+1}^{m}\bigl(1 - \hat\rho_k^2\bigr)\right).$$
Now, under $H_0$ (and under the assumption of multivariate normality), the test statistic $T_s$ is asymptotically $\chi^2((p - s)(q - s))$-distributed.
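The partial test differs only in which correlations enter the product and in the degrees of freedom; running the sketch below for $s = 0, 1, 2, \dots$ gives a simple sequential way to estimate the number of nonzero canonical correlations:

```python
def partial_independence_test(rho_hat, n, p, q, s):
    """Asymptotic chi^2 test of H0: only the first s canonical
    correlations are nonzero (normality assumed)."""
    m = min(p, q)
    T_s = -(n - 1 - 0.5 * (p + q + 3)) * np.log(np.prod(1.0 - rho_hat[s:m] ** 2))
    return T_s, chi2.sf(T_s, df=(p - s) * (q - s))
```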

Let $X$ and $Y$ denote the $n \times p$ and $n \times q$ data matrices for $n$ individuals, and let $\hat\alpha_k$ and $\hat\beta_k$ denote the $k$th (sample) canonical vectors. Then the $n \times 1$ vectors
$$\eta_k = X\hat\alpha_k \quad\text{and}\quad \phi_k = Y\hat\beta_k$$
contain the scores of the $n$ individuals on the $k$th canonical correlation variables.

If the $x$ and $y$ variables are interpreted as the "predictor" and "predicted" variables, respectively, then the $\eta_k$ score vector can be used to predict the $\phi_k$ score vector by least squares regression:
$$(\hat\phi_k)_i = \hat\rho_k\bigl((\eta_k)_i - \hat\alpha_k^T \bar{x}\bigr) + \hat\beta_k^T \bar{y}.$$
The squared canonical correlation $\hat\rho_k^2$ estimates the proportion of the variance of $\phi_k$ that is explained by the regression on $x$.
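A short sketch of the scores and the least squares prediction above (helper names are illustrative; A_hat, B_hat, and rho_hat are as computed earlier):

```python
def canonical_scores(X, Y, A_hat, B_hat):
    """Column k of each returned matrix holds the score vector
    eta_k = X alpha_hat_k, resp. phi_k = Y beta_hat_k."""
    return X @ A_hat, Y @ B_hat

def predict_phi_k(X, k, A_hat, B_hat, rho_hat, x_bar, y_bar):
    """Least squares prediction of the k-th y-side scores phi_k from
    the k-th x-side scores eta_k, following the formula above."""
    eta_k = X @ A_hat[:, k]
    return rho_hat[k] * (eta_k - A_hat[:, k] @ x_bar) + B_hat[:, k] @ y_bar

# e.g. predicted first-pair scores:
# phi_1_hat = predict_phi_k(X, 0, A_hat, B_hat, rho_hat,
#                           X.mean(axis=0), Y.mean(axis=0))
```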

Example: closed book exams vs. open book exams.

Some Remarks. Because the procedure maximizes the correlation between linear combinations of the variables, the results can be difficult to interpret. Correlation does not automatically imply causality. The normality assumption is required in testing.

Next Week. Next week we will talk about discriminant analysis and classification.

References

[1] K. V. Mardia, J. T. Kent, J. M. Bibby, Multivariate Analysis, Academic Press, London, 2003 (reprint of 1979).
[2] R. V. Hogg, J. W. McKean, A. T. Craig, Introduction to Mathematical Statistics, Pearson Education, Upper Saddle River, 2005.
[3] R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press, New York, 1985.
[4] R. A. Horn, C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, New York, 1991.