LECTURE 9 CANONICAL CORRELATION ANALYSIS

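Introduction

The concept of canonical correlation arises when we want to quantify the associations between two sets of variables. For example, suppose that the first set of variables, labeled 'arithmetic', records $x_1$, the speed of an individual in working problems, and $x_2$, the accuracy. The second set of variables, labeled 'reading', consists of $x_3$, reading speed, and $x_4$, comprehension. We can examine the six pairwise correlations, but in addition we ask whether it makes sense to say that arithmetic is correlated with reading. The answer is given by considering a linear combination of the arithmetic variables, say $u$, and a linear combination of the reading variables, say $v$, and using their correlation to represent the association between the groups. Thus we construct

$$u = a_1 x_1 + a_2 x_2 \quad\text{and}\quad v = b_1 x_3 + b_2 x_4$$

and we seek coefficients so that this correlation is maximized. (NOTE: Every text I know of uses u and v for these variables. SAS PROC CANCORR uses v and w. That is OK but don't get confused.)

Development

Suppose we have a vector of variables, $\mathbf{x}$, that consists of two sets of variables, $\mathbf{x}_1$ and $\mathbf{x}_2$, where $\mathbf{x}_1$ has length $p_1$ and $\mathbf{x}_2$ has length $p_2$. Assume that $p_1 \le p_2$. To develop the notation, let

$$\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}, \qquad \boldsymbol{\mu} = E[\mathbf{x}] = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}, \qquad \Sigma = \mathrm{Var}(\mathbf{x}) = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

The matrix $\Sigma_{12}$ gives the covariances between the variables in set one and set two, and in correlation form it gives the correlations. When $p_1$ and $p_2$ are moderately large, examining the $p_1 p_2$ correlations and drawing conclusions is not an easy task. As an alternative, we consider linear combinations

$$u = \mathbf{a}'\mathbf{x}_1 \quad\text{and}\quad v = \mathbf{b}'\mathbf{x}_2.$$

Note that

$$\mathrm{Var}[u] = \mathbf{a}'\Sigma_{11}\mathbf{a}, \qquad \mathrm{Var}[v] = \mathbf{b}'\Sigma_{22}\mathbf{b}, \qquad \mathrm{Cov}[u, v] = \mathbf{a}'\Sigma_{12}\mathbf{b}.$$

We want to determine the vectors $\mathbf{a}$ and $\mathbf{b}$ so that

$$\mathrm{Corr}[u, v] = \frac{\mathbf{a}'\Sigma_{12}\mathbf{b}}{\sqrt{\mathbf{a}'\Sigma_{11}\mathbf{a}}\,\sqrt{\mathbf{b}'\Sigma_{22}\mathbf{b}}}$$

is as large as possible. To this end, we determine $\mathbf{a}$ and $\mathbf{b}$ as the solution to the problem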

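$$\text{maximize } \mathbf{a}'\Sigma_{12}\mathbf{b} \quad \text{subject to: } \mathbf{a}'\Sigma_{11}\mathbf{a} = \mathbf{b}'\Sigma_{22}\mathbf{b} = 1.$$

The variables so determined are called the first pair of canonical variables, $u_1$ and $v_1$. The second pair of canonical variables, $u_2$ and $v_2$, are similarly determined by linear combinations of $\mathbf{x}_1$ and $\mathbf{x}_2$ with unit variance and maximum correlation among all variables that are uncorrelated with the first pair. This reminds us of the discussion of principal components and leads to the determination of eigenvalues and eigenvectors.

Introducing Lagrange multipliers $\lambda$ and $\theta$ for the two constraints, the solution leads us to the stationary equations

$$\Sigma_{12}\mathbf{b} - \lambda\Sigma_{11}\mathbf{a} = \mathbf{0}, \qquad \Sigma_{21}\mathbf{a} - \theta\Sigma_{22}\mathbf{b} = \mathbf{0}.$$

Multiplying the first equation by $\mathbf{a}'$ and the second by $\mathbf{b}'$ shows that

$$\lambda = \theta = \mathbf{a}'\Sigma_{12}\mathbf{b}.$$

We thus seek $\lambda$ so that

$$\begin{vmatrix} -\lambda\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda\Sigma_{22} \end{vmatrix} = 0.$$

The following result is useful: if the matrix $A$ is written in partitioned form as

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

then

$$|A| = |A_{11}|\,|A_{22} - A_{21}A_{11}^{-1}A_{12}| = |A_{22}|\,|A_{11} - A_{12}A_{22}^{-1}A_{21}|.$$

Applying the second form of this to our matrix we have, for $\lambda \ne 0$,

$$\begin{vmatrix} -\lambda\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda\Sigma_{22} \end{vmatrix} = |-\lambda\Sigma_{22}|\,\left|-\lambda\Sigma_{11} - \Sigma_{12}(-\lambda\Sigma_{22})^{-1}\Sigma_{21}\right|$$

$$= (-1)^{p_2}\lambda^{p_2 - p_1}\,|\Sigma_{22}|\,\left|\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda^2\Sigma_{11}\right|$$

$$= (-1)^{p_2}\lambda^{p_2 - p_1}\,|\Sigma_{22}|\,|\Sigma_{11}|\,\left|\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda^2 I\right|.$$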

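Since the nonzero values of $\lambda$ enter only through the last determinant, it follows that we can determine the values of $\lambda$ by finding the eigenvalues of the matrix $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ and taking the square root. The positive square root of the largest eigenvalue gives the largest correlation. Note that the matrix has at most $p_1$ non-zero eigenvalues.

To find $\mathbf{a}$ and $\mathbf{b}$ we return to the stationary equations. Recalling that $\lambda = \theta$, and multiplying the second equation by $\Sigma_{22}^{-1}$, we see that

$$\mathbf{b} = \frac{1}{\lambda}\,\Sigma_{22}^{-1}\Sigma_{21}\mathbf{a}.$$

Substituting this in the first equation and rearranging terms, we see that $\mathbf{a}$ is given by the solution of the equations

$$\left(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda^2 I\right)\mathbf{a} = \mathbf{0}.$$

That is, the vector $\mathbf{a}$ is the eigenvector corresponding to the eigenvalue $\lambda^2$. Similar computations show that the vector $\mathbf{b}$ is given by the solution to the equations

$$\left(\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} - \lambda^2 I\right)\mathbf{b} = \mathbf{0}.$$

From here on, write $\lambda_1 \ge \lambda_2 \ge \cdots$ for the ordered eigenvalues of this matrix, so that each $\lambda_i$ is a squared correlation. Thus the first pair of canonical variates are

$$u_1 = \mathbf{a}_1'\mathbf{x}_1 \quad\text{and}\quad v_1 = \mathbf{b}_1'\mathbf{x}_2$$

with correlation $\rho_1 = \sqrt{\lambda_1}$.

To find the second canonical pair, $u_2$, $v_2$, we solve the problem

$$\text{maximize } \mathbf{a}'\Sigma_{12}\mathbf{b} \quad \text{subject to: } \mathbf{a}'\Sigma_{11}\mathbf{a} = \mathbf{b}'\Sigma_{22}\mathbf{b} = 1, \quad \mathbf{a}_1'\Sigma_{11}\mathbf{a} = 0, \quad \mathbf{b}_1'\Sigma_{22}\mathbf{b} = 0.$$

It follows that the squared correlation between $u_2$ and $v_2$ is $\lambda_2$, the second largest eigenvalue of the matrix, and the vectors $\mathbf{a}_2$ and $\mathbf{b}_2$ are obtained by solving the above equations using $\lambda_2$. Although we did not specify this in our optimization problem, it also follows that

$$\mathbf{a}_1'\Sigma_{12}\mathbf{b}_2 = 0 \quad\text{and}\quad \mathbf{b}_1'\Sigma_{21}\mathbf{a}_2 = 0.$$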
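The eigenvalue route described above can be sketched numerically. What follows is a minimal sketch, assuming NumPy is available; the function name `canonical_correlations` and its argument names are illustrative and do not come from the lecture. It assumes every returned correlation is nonzero, because it divides by each correlation when recovering the vectors b. It is run on the correlation blocks of the reading-arithmetic example in these notes.

```python
import numpy as np

def canonical_correlations(S11, S12, S22):
    """Return (rho, A, B) for the partitioned covariance blocks.

    Columns of A and B hold the coefficient vectors a_i and b_i, scaled so
    that a_i' S11 a_i = b_i' S22 b_i = 1. Assumes all rho_i are nonzero.
    """
    S21 = S12.T
    # M = S11^{-1} S12 S22^{-1} S21; its eigenvalues are the squared correlations.
    M = np.linalg.solve(S11, S12 @ np.linalg.solve(S22, S21))
    eigvals, eigvecs = np.linalg.eig(M)
    # M is not symmetric, so eig may return a complex dtype; in theory the
    # eigenvalues are real and nonnegative, so keep the real parts.
    eigvals, eigvecs = eigvals.real, eigvecs.real
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    rho = np.sqrt(np.clip(eigvals, 0.0, None))   # positive square roots
    # Rescale each unit-length eigenvector so that a_i' S11 a_i = 1.
    scale = np.sqrt(np.sum(eigvecs * (S11 @ eigvecs), axis=0))
    A = eigvecs / scale
    # Second stationary equation: b_i = (1 / rho_i) S22^{-1} S21 a_i.
    B = np.linalg.solve(S22, S21 @ A) / rho
    return rho, A, B

# Correlation blocks from the reading-arithmetic example.
R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])

rho, A, B = canonical_correlations(R11, R12, R22)
print(np.round(rho, 2))  # approximately [0.74 0.03]
```

The sign of each eigenvector is arbitrary, so the columns of `A` and `B` may come out with flipped signs relative to a hand calculation; the correlations are unaffected.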

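We can continue this for all non-zero eigenvalues.

Summary

The canonical variable pairs, $u_i = \mathbf{a}_i'\mathbf{x}_1$ and $v_i = \mathbf{b}_i'\mathbf{x}_2$, as determined have the following properties:

$$\mathrm{Corr}(u_i, v_i) = \sqrt{\lambda_i}, \qquad \mathrm{Corr}(u_i, u_j) = 0, \qquad \mathrm{Corr}(v_i, v_j) = 0, \qquad \mathrm{Corr}(u_i, v_j) = 0 \quad \text{for } i \ne j.$$

These properties can be summarized by the correlation matrix

$$R_{uv} = \begin{pmatrix} I_p & \mathrm{Diag}(\sqrt{\lambda_i}) \\ \mathrm{Diag}(\sqrt{\lambda_i}) & I_p \end{pmatrix},$$

where $p$ is the number of canonical pairs.

Example

Returning to the reading-arithmetic example, suppose the sample correlation matrix is given by

$$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix} = \begin{pmatrix} 1 & .4 & .5 & .6 \\ .4 & 1 & .3 & .4 \\ .5 & .3 & 1 & .2 \\ .6 & .4 & .2 & 1 \end{pmatrix},$$

so that

$$R_{11} = \begin{pmatrix} 1 & .4 \\ .4 & 1 \end{pmatrix}, \quad R_{22} = \begin{pmatrix} 1 & .2 \\ .2 & 1 \end{pmatrix}, \quad R_{12} = \begin{pmatrix} .5 & .6 \\ .3 & .4 \end{pmatrix}, \quad R_{21} = R_{12}' = \begin{pmatrix} .5 & .3 \\ .6 & .4 \end{pmatrix}.$$

Note that it is best to apply the results to standardized data and hence we use the correlation matrix. We may then compute

$$A = R_{11}^{-1}R_{12}R_{22}^{-1}R_{21} = \begin{pmatrix} .452 & .289 \\ .146 & .095 \end{pmatrix} \quad\text{and}\quad B = R_{22}^{-1}R_{21}R_{11}^{-1}R_{12} = \begin{pmatrix} .206 & .251 \\ .278 & .340 \end{pmatrix}.$$

The eigenvalues of these two matrices are the same, that is, $\lambda_1 = .5457$ and $\lambda_2 = .0009$. The eigenvectors of $A$ and $B$ are the columns of the matrices

$$\mathrm{VecA} = \begin{pmatrix} .951 & -.540 \\ .309 & .842 \end{pmatrix} \quad\text{and}\quad \mathrm{VecB} = \begin{pmatrix} .595 & -.774 \\ .804 & .633 \end{pmatrix}.$$

Recall that we have specified that the variances of the $u$ and $v$ must be one. That is,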

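$$\mathbf{a}'R_{11}\mathbf{a} = 1 \quad\text{and}\quad \mathbf{b}'R_{22}\mathbf{b} = 1.$$

The eigenvectors as determined are normalized to have length one but do not satisfy this condition, so they must be scaled. Reusing the symbols $A$ and $B$ for the matrices of scaled eigenvectors, these are given by

$$A = \mathrm{VecA}\begin{pmatrix} 1.23 & 0 \\ 0 & .636 \end{pmatrix}^{-1/2} \quad\text{and}\quad B = \mathrm{VecB}\begin{pmatrix} 1.19 & 0 \\ 0 & .804 \end{pmatrix}^{-1/2},$$

where the diagonal entries are the values of $\mathbf{a}'R_{11}\mathbf{a}$ and $\mathbf{b}'R_{22}\mathbf{b}$ for the unit-length eigenvectors. Thus,

$$A = \begin{pmatrix} .856 & -.677 \\ .278 & 1.055 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} .545 & -.863 \\ .737 & .706 \end{pmatrix}.$$

It follows that the first canonical pair is defined by

$$u_1 = .856 z_1 + .278 z_2, \qquad v_1 = .545 z_3 + .737 z_4$$

with correlation $\rho_1 = \sqrt{.5457} = .74$. The second canonical pair is defined by

$$u_2 = -.677 z_1 + 1.055 z_2, \qquad v_2 = -.863 z_3 + .706 z_4$$

with correlation $\rho_2 = \sqrt{.0009} = .03$.

We see that the first pair captures most of the relation between arithmetic and reading. The canonical variate for arithmetic, $u_1$, places over three times as much weight on speed as it does on accuracy, and the canonical variate for reading, $v_1$, puts more weight on comprehension than on speed, in proportion 4:3. Note that this does not say, for example, that speed is three times as important as accuracy in arithmetic. It simply says that if we are asking for a measure of the relation between arithmetic and reading, these functions provide the essential component of that relation.

Interpretation of Canonical Variables

In general, the canonical variables are artificial and may have no physical meaning. The interpretation is often aided by computing the correlation between the original variables and the canonical variables. To do this, note that the canonical variables are related to the original variables by the equations

$$\mathbf{u} = A'\mathbf{z}_1 \quad\text{and}\quad \mathbf{v} = B'\mathbf{z}_2,$$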

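where $\mathbf{z}$ denotes the standardized data from which the eigenvectors have been determined. Recalling that the canonical variables have been standardized to have variance one, it follows that

$$\mathrm{Corr}(\mathbf{u}, \mathbf{z}_1) = \mathrm{Cov}(\mathbf{u}, \mathbf{z}_1) = \mathrm{Cov}(A'\mathbf{z}_1, \mathbf{z}_1) = A'R_{11}.$$

Similarly,

$$\mathrm{Corr}(\mathbf{u}, \mathbf{z}_2) = \mathrm{Cov}(A'\mathbf{z}_1, \mathbf{z}_2) = A'R_{12}, \qquad \mathrm{Corr}(\mathbf{v}, \mathbf{z}_1) = B'R_{21}, \qquad \mathrm{Corr}(\mathbf{v}, \mathbf{z}_2) = B'R_{22}.$$

Example:

Returning to the arithmetic-reading example, we see that

$$\mathrm{Corr}(u_1, \mathbf{z}_1) = (.856 \;\; .278)\begin{pmatrix} 1 & .4 \\ .4 & 1 \end{pmatrix} = (.97 \;\; .62)$$

and

$$\mathrm{Corr}(v_1, \mathbf{z}_2) = (.545 \;\; .737)\begin{pmatrix} 1 & .2 \\ .2 & 1 \end{pmatrix} = (.69 \;\; .85).$$

We see that of the two variables in $\mathbf{z}_1$, $u_1$ is most highly correlated with the first. Of the two variables in $\mathbf{z}_2$, $v_1$ is most highly correlated with the second. Similarly, we obtain the correlations

$$\mathrm{Corr}(u_1, \mathbf{z}_2) = (.51 \;\; .63) \quad\text{and}\quad \mathrm{Corr}(v_1, \mathbf{z}_1) = (.71 \;\; .46).$$

As in our study of principal components, it is more informative to look at the correlations as opposed to the eigenvectors.

Observations

It can be shown that the first canonical correlation is at least as large as the absolute value of any of the simple correlations in $R_{12}$.

If there is one variable in set one, but several in set two, the squared canonical correlation is the squared multiple correlation, $R^2$, in the regression of $\mathbf{z}_1$ on $\mathbf{z}_2$.

In general, it can be shown that the squared multiple correlation for the regression of $u_k$ on $\mathbf{z}_2$ is given by $\rho_k^2$. This is also the squared multiple correlation for the regression of $v_k$ on $\mathbf{z}_1$.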
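The correlations between the canonical variables and the original variables, and the last observation above, can be checked with a few matrix products. This is a minimal sketch, assuming NumPy; the variable names are illustrative, and the coefficient vectors are the rounded values reported in the example, so results agree with the text only to about two decimals.

```python
import numpy as np

# Correlation blocks from the arithmetic-reading example.
R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R21 = R12.T

# First pair of scaled coefficient vectors, as reported (rounded) in the example.
a1 = np.array([0.856, 0.278])
b1 = np.array([0.545, 0.737])

# Correlations of the canonical variables with the standardized variables:
# rows of A' R11, B' R22, A' R12 and B' R21 for the first pair.
corr_u1_z1 = a1 @ R11
corr_v1_z2 = b1 @ R22
corr_u1_z2 = a1 @ R12
corr_v1_z1 = b1 @ R21

# Squared multiple correlation for the regression of u1 on z2:
# Cov(u1, z2) R22^{-1} Cov(z2, u1), which should be close to lambda_1.
r2_u1_on_z2 = corr_u1_z2 @ np.linalg.solve(R22, corr_u1_z2)

# Largest simple correlation between the two sets, for comparison with rho_1.
max_simple = np.abs(R12).max()

print(np.round(corr_u1_z1, 2), np.round(corr_v1_z2, 2))
print(np.round(corr_u1_z2, 2), np.round(corr_v1_z1, 2))
print(round(float(r2_u1_on_z2), 3), max_simple)
```

Because the inputs are rounded, the squared multiple correlation comes out near, not exactly at, the eigenvalue .5457 quoted in the example.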