LECTURE 9 CANONICAL CORRELATION ANALYSIS

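Introduction

The concept of canonical correlation arises when we want to quantify the associations between two sets of variables. For example, suppose that the first set of variables, labeled 'arithmetic', records $x_1$, the speed of an individual in working problems, and $x_2$, the accuracy. The second set of variables, labeled 'reading', consists of $x_3$, reading speed, and $x_4$, comprehension. We can examine the six pairwise correlations, but in addition we ask whether it makes sense to ask if arithmetic is correlated with reading. The answer is given by considering a linear combination of the arithmetic variables, say $u$, and a linear combination of the reading variables, say $v$, and using their correlation to represent the association between the groups. Thus we construct

$$u = a_1 x_1 + a_2 x_2 \quad\text{and}\quad v = b_1 x_3 + b_2 x_4$$

and we seek coefficients so that this correlation is maximized. (NOTE: Every text I know of uses u and v for these variables. SAS PROC CANCORR uses v and w. That is OK but don't get confused.)

Development

Suppose we have a vector of variables, $\mathbf{x}$, that consists of two sets of variables, $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, where $\mathbf{x}^{(1)}$ has length $p_1$ and $\mathbf{x}^{(2)}$ has length $p_2$. Assume that $p_1 \le p_2$. To develop the notation, let

$$\mathbf{x} = \begin{bmatrix}\mathbf{x}^{(1)} \\ \mathbf{x}^{(2)}\end{bmatrix}, \qquad \boldsymbol{\mu} = E[\mathbf{x}] = \begin{bmatrix}\boldsymbol{\mu}^{(1)} \\ \boldsymbol{\mu}^{(2)}\end{bmatrix}, \qquad \Sigma = \mathrm{Var}(\mathbf{x}) = \begin{bmatrix}\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}\end{bmatrix}.$$

The matrix $\Sigma_{12}$ gives the covariances between the variables in set one and set two, and in correlation form it gives the correlations. When $p_1$ and $p_2$ are moderately large, examining the $p_1 p_2$ correlations and drawing conclusions is not an easy task. As an alternative, we consider linear combinations

$$u = \mathbf{a}'\mathbf{x}^{(1)} \quad\text{and}\quad v = \mathbf{b}'\mathbf{x}^{(2)}.$$

Note that

$$\mathrm{Var}[u] = \mathbf{a}'\Sigma_{11}\mathbf{a}, \qquad \mathrm{Var}[v] = \mathbf{b}'\Sigma_{22}\mathbf{b}, \qquad \mathrm{Cov}[u, v] = \mathbf{a}'\Sigma_{12}\mathbf{b}.$$

We want to determine the vectors $\mathbf{a}$ and $\mathbf{b}$ so that

$$\mathrm{Corr}[u, v] = \frac{\mathbf{a}'\Sigma_{12}\mathbf{b}}{\sqrt{\mathbf{a}'\Sigma_{11}\mathbf{a}}\,\sqrt{\mathbf{b}'\Sigma_{22}\mathbf{b}}}$$

is as large as possible. To this end, we determine $\mathbf{a}$ and $\mathbf{b}$ as the solution to the problem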

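$$\text{maximize } \mathbf{a}'\Sigma_{12}\mathbf{b} \quad \text{subject to: } \mathbf{a}'\Sigma_{11}\mathbf{a} = \mathbf{b}'\Sigma_{22}\mathbf{b} = 1.$$

The variables so determined are called the first pair of canonical variables, $u_1$ and $v_1$. The second pair of canonical variables, $u_2$ and $v_2$, are similarly determined by linear combinations of $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$ with unit variance and maximum correlation among all variables that are uncorrelated with the first pair. This reminds us of the discussion of principal components and leads to the determination of eigenvalues and eigenvectors.

The solution leads us to the stationary equations,

$$\Sigma_{12}\mathbf{b} - \lambda\,\Sigma_{11}\mathbf{a} = \mathbf{0}, \qquad \Sigma_{21}\mathbf{a} - \theta\,\Sigma_{22}\mathbf{b} = \mathbf{0}.$$

Multiplying the first equation by $\mathbf{a}'$ and the second by $\mathbf{b}'$, and using the unit-variance constraints, shows that

$$\lambda = \theta = \mathbf{a}'\Sigma_{12}\mathbf{b}.$$

We thus seek $\lambda$ so that

$$\begin{vmatrix} -\lambda\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda\Sigma_{22} \end{vmatrix} = 0.$$

The following result is useful: if the matrix $A$ is written in partitioned form as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$

then

$$|A| = |A_{11}|\,|A_{22} - A_{21}A_{11}^{-1}A_{12}| = |A_{22}|\,|A_{11} - A_{12}A_{22}^{-1}A_{21}|.$$

Applying the second form of this to our matrix we have, for $\lambda \ne 0$,

$$\begin{vmatrix} -\lambda\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & -\lambda\Sigma_{22} \end{vmatrix} = |-\lambda\Sigma_{22}|\,\bigl|-\lambda\Sigma_{11} - \Sigma_{12}(-\lambda\Sigma_{22})^{-1}\Sigma_{21}\bigr|$$

$$= |-\lambda\Sigma_{22}|\,\bigl|\lambda^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda\Sigma_{11}\bigr|$$

$$= (-1)^{p_2}\lambda^{p_2 - p_1}\,|\Sigma_{22}|\,|\Sigma_{11}|\,\bigl|\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda^2 I\bigr|.$$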

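Since, apart from a power of $\lambda$, $\lambda$ is only involved in the last determinant, it follows that we can determine the nonzero values of $\lambda$ by finding the eigenvalues of the matrix $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ and taking the square root. We write these eigenvalues as $\lambda_1^2 \ge \lambda_2^2 \ge \cdots$. The positive square root of the largest eigenvalue gives the largest correlation. Note that the matrix has at most $p_1$ non-zero eigenvalues.

To find $\mathbf{a}$ and $\mathbf{b}$ we return to the stationary equations. Recalling that $\lambda = \theta$, multiplying the second by $\Sigma_{22}^{-1}$ we see that

$$\mathbf{b} = \frac{1}{\lambda}\,\Sigma_{22}^{-1}\Sigma_{21}\mathbf{a}.$$

Substituting this in the first equation, and rearranging terms, we see that $\mathbf{a}$ is given by the solution of the equations

$$\bigl(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} - \lambda^2 I\bigr)\mathbf{a} = \mathbf{0}.$$

That is, the vector $\mathbf{a}$ is the eigenvector corresponding to the eigenvalue $\lambda^2$. Similar computations show that the vector $\mathbf{b}$ is given by the solution to the equations

$$\bigl(\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} - \lambda^2 I\bigr)\mathbf{b} = \mathbf{0}.$$

Thus the first pair of canonical variates are

$$u_1 = \mathbf{a}_1'\mathbf{x}^{(1)} \quad\text{and}\quad v_1 = \mathbf{b}_1'\mathbf{x}^{(2)}$$

with correlation $\rho_1 = \sqrt{\lambda_1^2}$.

To find the second canonical pair, $u_2$, $v_2$, we solve the problem

$$\text{maximize } \mathbf{a}'\Sigma_{12}\mathbf{b} \quad \text{subject to: } \mathbf{a}'\Sigma_{11}\mathbf{a} = \mathbf{b}'\Sigma_{22}\mathbf{b} = 1, \quad \mathbf{a}'\Sigma_{11}\mathbf{a}_1 = 0, \quad \mathbf{b}'\Sigma_{22}\mathbf{b}_1 = 0.$$

It follows that the squared correlation between $u_2$ and $v_2$ is $\lambda_2^2$, the second largest eigenvalue of the matrix, and the vectors $\mathbf{a}_2$ and $\mathbf{b}_2$ are obtained by solving the above equations using $\lambda_2^2$. Although we did not specify this in our optimization problem, it also follows that

$$\mathbf{a}_2'\Sigma_{12}\mathbf{b}_1 = 0 \quad\text{and}\quad \mathbf{b}_2'\Sigma_{21}\mathbf{a}_1 = 0.$$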
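The eigenvalue recipe above can be sketched numerically. The following is a minimal illustration, not the course's own code: the function name `canonical_pairs` and the simulated data are assumptions made for the example, and it presumes that $\Sigma_{11}$ and $\Sigma_{22}$ are nonsingular and that all $p_1$ canonical correlations are nonzero.

```python
import numpy as np

def canonical_pairs(S11, S12, S22):
    """Return (rho, A, B): canonical correlations and coefficient vectors as columns.

    Hypothetical helper; assumes S11, S22 are nonsingular and every rho is nonzero.
    """
    S21 = S12.T
    # M = S11^{-1} S12 S22^{-1} S21; its eigenvalues are the squared correlations.
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S21)
    eigvals, eigvecs = np.linalg.eig(M)
    # M is not symmetric, so keep only the real parts of the result.
    eigvals, eigvecs = eigvals.real, eigvecs.real
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    rho = np.sqrt(np.clip(eigvals, 0.0, None))
    # Rescale each eigenvector a so that a' S11 a = 1.
    scale = np.sqrt(np.einsum("ik,ij,jk->k", eigvecs, S11, eigvecs))
    A = eigvecs / scale
    # b = (1 / lambda) S22^{-1} S21 a, one column per canonical pair.
    B = np.linalg.solve(S22, S21 @ A) / rho
    return rho, A, B

# Simulated data (illustrative only): p1 = 2 variables in set one, p2 = 3 in set two.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 2:] += 0.5 * X[:, [0]]                    # induce dependence between the sets
S = np.cov(X, rowvar=False)
S11, S12, S22 = S[:2, :2], S[:2, 2:], S[2:, 2:]
rho, A, B = canonical_pairs(S11, S12, S22)
print(rho)
```

Computing $\mathbf{b}$ from $\mathbf{a}$ through the stationary equation, rather than from a second eigenproblem, keeps the signs of each pair aligned so that $\mathbf{a}_i'\Sigma_{12}\mathbf{b}_i$ comes out positive.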

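We can continue this for all non-zero eigenvalues.

Summary

The canonical variable pairs, $u_i = \mathbf{a}_i'\mathbf{x}^{(1)}$ and $v_i = \mathbf{b}_i'\mathbf{x}^{(2)}$, as determined have the following properties:

$$\mathrm{Corr}(u_i, v_i) = \lambda_i, \qquad \mathrm{Corr}(u_i, u_j) = 0, \qquad \mathrm{Corr}(v_i, v_j) = 0, \qquad \mathrm{Corr}(u_i, v_j) = 0 \quad \text{for } i \ne j.$$

These properties can be summarized by the correlation matrix

$$R_{uv} = \begin{bmatrix} I_{p_1} & \mathrm{Diag}(\lambda_i) \\ \mathrm{Diag}(\lambda_i) & I_{p_1} \end{bmatrix}.$$

Example

Returning to the reading-arithmetic example, suppose the sample correlation matrix is given by

$$R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix} = \begin{bmatrix} 1 & .4 & .5 & .6 \\ .4 & 1 & .3 & .4 \\ .5 & .3 & 1 & .2 \\ .6 & .4 & .2 & 1 \end{bmatrix},$$

so that

$$R_{11} = \begin{bmatrix} 1 & .4 \\ .4 & 1 \end{bmatrix}, \qquad R_{12} = R_{21}' = \begin{bmatrix} .5 & .6 \\ .3 & .4 \end{bmatrix}, \qquad R_{22} = \begin{bmatrix} 1 & .2 \\ .2 & 1 \end{bmatrix}.$$

Note that it is best to apply the results to standardized data and hence we use the correlation matrix. We may then compute

$$A = R_{11}^{-1}R_{12}R_{22}^{-1}R_{21} = \begin{bmatrix} .452 & .289 \\ .146 & .095 \end{bmatrix} \quad\text{and}\quad B = R_{22}^{-1}R_{21}R_{11}^{-1}R_{12} = \begin{bmatrix} .206 & .251 \\ .278 & .340 \end{bmatrix}.$$

The eigenvalues of these two matrices are the same, that is, $\lambda_1^2 = .5457$ and $\lambda_2^2 = .0009$. The eigenvectors of $A$ and $B$ are the columns of the matrices

$$\mathrm{VecA} = \begin{bmatrix} .951 & -.540 \\ .309 & .842 \end{bmatrix} \quad\text{and}\quad \mathrm{VecB} = \begin{bmatrix} .595 & -.774 \\ .804 & .633 \end{bmatrix}.$$

Recall that we have specified that the variances of the $u_i$ and $v_i$ must be one. That is,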

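$$\mathbf{a}_i' R_{11}\mathbf{a}_i = 1 \quad\text{and}\quad \mathbf{b}_i' R_{22}\mathbf{b}_i = 1.$$

The eigenvectors as determined are normalized to have length one but do not satisfy this condition. The eigenvectors must be scaled: each unit-length eigenvector $\mathbf{e}$ is divided by $\sqrt{\mathbf{e}'R_{11}\mathbf{e}}$ (for VecA) or $\sqrt{\mathbf{e}'R_{22}\mathbf{e}}$ (for VecB). Reusing the symbols $A$ and $B$ from here on for the matrices of scaled eigenvectors, these are given by

$$A = \mathrm{VecA}\begin{bmatrix} 1.23 & 0 \\ 0 & .636 \end{bmatrix}^{-1/2} \quad\text{and}\quad B = \mathrm{VecB}\begin{bmatrix} 1.19 & 0 \\ 0 & .804 \end{bmatrix}^{-1/2}.$$

Thus,

$$A = \begin{bmatrix} .856 & -.677 \\ .278 & 1.055 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} .545 & -.863 \\ .737 & .706 \end{bmatrix}.$$

It follows that the first canonical pair is defined by

$$u_1 = .856 z_1 + .278 z_2, \qquad v_1 = .545 z_3 + .737 z_4$$

with correlation $\rho_1 = \sqrt{.5457} = .74$.

The second canonical pair is defined by

$$u_2 = -.677 z_1 + 1.055 z_2, \qquad v_2 = -.863 z_3 + .706 z_4$$

with correlation $\rho_2 = \sqrt{.0009} = .03$.

We see that the first pair captures most of the relation between arithmetic and reading. The canonical variate for arithmetic, $u_1$, places over three times as much weight on speed as it does on accuracy, and the canonical variate for reading, $v_1$, puts more weight on comprehension than on speed, in proportion 4:3. Note that this does not say, for example, that speed is three times as important as accuracy in arithmetic. It simply says that if we are asking for a measure of the relation between arithmetic and reading, these functions provide the essential component of that relation.

Interpretation of Canonical Variables

In general, the canonical variables are artificial and may have no physical meaning. The interpretation is often aided by computing the correlation between the original variables and the canonical variables. To do this, note that the canonical variables are related to the original variables by the equations

$$\mathbf{u} = A'\mathbf{z}^{(1)} \quad\text{and}\quad \mathbf{v} = B'\mathbf{z}^{(2)}$$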

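where $\mathbf{z}$ denotes the standardized data from which the eigenvectors have been determined. Recalling that the canonical variables have been standardized to have variance one, it follows that

$$\mathrm{Corr}(\mathbf{u}, \mathbf{z}^{(1)}) = \mathrm{Cov}(\mathbf{u}, \mathbf{z}^{(1)}) = \mathrm{Cov}(A'\mathbf{z}^{(1)}, \mathbf{z}^{(1)}) = A'R_{11}.$$

Similarly,

$$\mathrm{Corr}(\mathbf{u}, \mathbf{z}^{(2)}) = \mathrm{Cov}(A'\mathbf{z}^{(1)}, \mathbf{z}^{(2)}) = A'R_{12}, \qquad \mathrm{Corr}(\mathbf{v}, \mathbf{z}^{(2)}) = B'R_{22}, \qquad \mathrm{Corr}(\mathbf{v}, \mathbf{z}^{(1)}) = B'R_{21}.$$

Example:

Returning to the arithmetic-reading example, we see that

$$\mathrm{Corr}(u_1, \mathbf{z}^{(1)}) = \begin{pmatrix} .856 & .278 \end{pmatrix}\begin{bmatrix} 1 & .4 \\ .4 & 1 \end{bmatrix} = \begin{pmatrix} .97 & .62 \end{pmatrix}$$

and

$$\mathrm{Corr}(v_1, \mathbf{z}^{(2)}) = \begin{pmatrix} .545 & .737 \end{pmatrix}\begin{bmatrix} 1 & .2 \\ .2 & 1 \end{bmatrix} = \begin{pmatrix} .69 & .85 \end{pmatrix}.$$

We see that of the two variables in $\mathbf{z}^{(1)}$, $u_1$ is most highly correlated with the first. Of the two variables in $\mathbf{z}^{(2)}$, $v_1$ is most highly correlated with the second. Similarly, we obtain the correlations

$$\mathrm{Corr}(u_1, \mathbf{z}^{(2)}) = \begin{pmatrix} .51 & .63 \end{pmatrix} \quad\text{and}\quad \mathrm{Corr}(v_1, \mathbf{z}^{(1)}) = \begin{pmatrix} .71 & .46 \end{pmatrix}.$$

As in our study of principal components, it is more informative to look at the correlations as opposed to the eigenvectors.

Observations

It can be shown that the first canonical correlation is at least as large as the absolute value of any of the simple correlations in $R_{12}$.

If there is one variable in set one, but several in set two, the squared canonical correlation is the squared multiple correlation, $R^2$, in the regression of $z^{(1)}$ on $\mathbf{z}^{(2)}$.

In general, it can be shown that the squared multiple correlation for the regression of $u_k$ on $\mathbf{z}^{(2)}$ is given by $\rho_k^2$. This is also the squared multiple correlation for the regression of $v_k$ on $\mathbf{z}^{(1)}$.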
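As a check on the arithmetic-reading example, the correlations between canonical and original variables can be recomputed from the correlation matrix. This is a sketch under assumptions, not part of the original notes: the variable names are illustrative, and the sign convention (each column of `A` is flipped so its entries sum to a positive number) is an arbitrary choice made here to match the signs printed in the example.

```python
import numpy as np

# Blocks of the sample correlation matrix from the Example.
R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R21 = R12.T

# Eigen-decomposition of R11^{-1} R12 R22^{-1} R21 (squared canonical correlations).
M = np.linalg.solve(R11, R12) @ np.linalg.solve(R22, R21)
lam2, vec = np.linalg.eig(M)
lam2, vec = lam2.real, vec.real
order = np.argsort(lam2)[::-1]
lam2, vec = lam2[order], vec[:, order]
rho = np.sqrt(lam2)                       # roughly 0.74 and 0.03, as in the notes

# Scale eigenvectors so a' R11 a = 1; the sign choice is an assumption (see lead-in).
A = vec / np.sqrt(np.diag(vec.T @ R11 @ vec))
A *= np.sign(A.sum(axis=0))
B = np.linalg.solve(R22, R21 @ A) / rho   # b = (1 / lambda) R22^{-1} R21 a

# Row k of each product holds the correlations of the k-th canonical variable.
corr_u_z1 = A.T @ R11
corr_u_z2 = A.T @ R12
corr_v_z2 = B.T @ R22
corr_v_z1 = B.T @ R21

# Squared multiple correlation of u_1 regressed on z^(2); should equal rho_1^2.
r2_u1 = corr_u_z2[0] @ np.linalg.solve(R22, corr_u_z2[0])

print(np.round(corr_u_z1[0], 2), np.round(corr_v_z2[0], 2))
print(np.round(r2_u1, 4), np.round(rho[0] ** 2, 4))
```

The last two quantities illustrate the closing observation: the squared multiple correlation of $u_1$ on $\mathbf{z}^{(2)}$ agrees with $\rho_1^2$, and $\rho_1$ exceeds the largest simple correlation in $R_{12}$.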