Non-linear Canonical Correlation Analysis Using a RBF Network


[ESANN'2002 proceedings - European Symposium on Artificial Neural Networks, Bruges (Belgium), 24-26 April 2002, d-side publi., ISBN 2-930307-02-1, pp. 507-512]

Sukhbinder Kumar, Elaine B. Martin and Julian Morris

Centre for Process Analytics and Control Technology, University of Newcastle, Newcastle upon Tyne, NE1 7RU, England

Abstract: A non-linear version of the multivariate statistical technique of canonical correlation analysis (CCA) is proposed through the integration of a radial basis function (RBF) network. The advantage of the RBF network is that the solution of linear CCA can be used to train the network, hence the training effort is minimal. Also, the canonical variables can be extracted simultaneously. It is shown that the proposed technique can be used to extract non-linear structures inherent within a data set.

1. Introduction

Over the past decade, a number of techniques have been proposed for the extraction of non-linear features inherent within process data, including the multivariate statistical technique of principal component analysis [1-4]. More recently a non-linear variant of Canonical Correlation Analysis (CCA) has been proposed [5] through the integration of a Multi-Layer Perceptron (MLP) network. A drawback of this approach is that the optimisation problem is non-linear and thus suffers from the potential problem of becoming trapped within a local minimum. Hsieh [5] addressed this issue by training an ensemble of neural networks. Although not a serious limitation of the methodology, it does require a major training effort. The other limitation is that when using an MLP network, the canonical variables cannot be extracted simultaneously. This has two repercussions. First, the number of MLP networks to be trained (and hence the training effort) increases with the number of canonical variables; secondly, since the MLP networks are trained on the residuals, the extraction of subsequent canonical variables becomes difficult because of the reduction in signal to noise ratio. In this paper an alternative method of implementing non-linear CCA using a Radial Basis Function (RBF) network is proposed.

2. Linear Canonical Correlation Analysis

Canonical Correlation Analysis (CCA) is a multivariate statistical technique that identifies a linear relationship between two sets of variables x ∈ R^m and y ∈ R^n. Linear CCA seeks to find vectors a ∈ R^m and b ∈ R^n such that the linear combinations:

    u = a^T x  and  v = b^T y    (1)

have maximum correlation. The vectors a and b are the canonical correlation vectors, and u and v are the canonical variables. The above problem is solved as follows. Let Σ_xx and Σ_yy be the covariance matrices of x and y respectively and Σ_xy be the cross-covariance matrix between x and y. Let the matrix K be defined as:

    K = Σ_xx^(-1/2) Σ_xy Σ_yy^(-1/2)    (2)

If k is the rank of the matrix K, then by singular value decomposition, K can be decomposed as:

    K = (α_1, α_2, ..., α_k) D (β_1, β_2, ..., β_k)^T    (3)

where α_i and β_i are the eigenvectors of the matrices KK^T and K^T K respectively, and D is a diagonal matrix comprising the square roots of the k non-zero eigenvalues. Letting:

    a_i = Σ_xx^(-1/2) α_i  and  b_i = Σ_yy^(-1/2) β_i,  for i = 1, ..., k    (4)

then a_i and b_i are the (k) canonical vectors.
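As an illustration, a minimal NumPy sketch of equations (2)-(4) follows. The function names and the use of an eigendecomposition for the inverse matrix square roots are our choices, not the paper's; the covariance matrices are assumed to be positive definite.

    # Minimal sketch of linear CCA via SVD, following equations (2)-(4).
    import numpy as np

    def linear_cca(X, Y, k):
        """X: (N, m), Y: (N, n) data matrices. Returns matrices A, B whose
        columns a_i, b_i give the canonical variables u_i = X a_i, v_i = Y b_i,
        plus the first k canonical correlations."""
        X = X - X.mean(axis=0)
        Y = Y - Y.mean(axis=0)
        N = X.shape[0]
        Sxx = X.T @ X / (N - 1)
        Syy = Y.T @ Y / (N - 1)
        Sxy = X.T @ Y / (N - 1)

        def inv_sqrt(S):
            # Inverse matrix square root of a symmetric positive definite matrix.
            w, V = np.linalg.eigh(S)
            return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

        Sxx_ih, Syy_ih = inv_sqrt(Sxx), inv_sqrt(Syy)
        K = Sxx_ih @ Sxy @ Syy_ih              # equation (2)
        U, d, Vt = np.linalg.svd(K)            # equation (3): alpha_i, D, beta_i
        A = Sxx_ih @ U[:, :k]                  # equation (4): a_i
        B = Syy_ih @ Vt.T[:, :k]               #               b_i
        return A, B, d[:k]

The singular values d returned by the decomposition are the canonical correlations between successive pairs of canonical variables.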

3. Non-linear CCA using a RBF Network

Non-linear Canonical Correlation Analysis (CCA) is similar to linear CCA except that the linear transformation applied to the variables x and y is replaced by a non-linear transformation. In this paper a RBF network provides the non-linear transformation. Non-linear CCA is performed in two stages. First the variables x and y are projected from the higher dimensional space down onto a lower dimensional space, and then the latent variables are transformed back to the original variables. The second step is termed self-consistency [6].

3.1 Mapping from the Original Data Space to the Canonical Variables

The mapping of x and y to the canonical variables u and v is from R^m to R^k and from R^n to R^k respectively. For simplicity, k, the number of canonical variables, is taken to be unity. The situation where k is greater than unity is a straightforward extension of the described methodology. Given the centres c_x, c_y and the widths σ_x, σ_y of the radial basis functions for the two mappings, the canonical variables u and v can be defined as:

    u = w_x^T f(x) = Σ_{i=1..p} w_xi f_i(x)  and  v = w_y^T g(y) = Σ_{j=1..q} w_yj g_j(y)    (5)

where f = [f_1, f_2, ..., f_p]^T and g = [g_1, g_2, ..., g_q]^T are the RBF vectors, and w_x = [w_x1, w_x2, ..., w_xp]^T and w_y = [w_y1, w_y2, ..., w_yq]^T are the weight vectors for the mappings from x to u and from y to v respectively. Non-linear CCA then reduces to finding the weight vectors w_x and w_y such that there is maximum correlation between u and v. This problem is similar to the linear case except that the vectors x and y are replaced by f and g respectively. If A_xx and A_yy denote the covariance matrices of the radial basis functions f and g respectively and A_xy is the cross-covariance matrix, then, by analogy with equations (2) and (3), the matrix:

    M = A_xx^(-1/2) A_xy A_yy^(-1/2)    (6)

can be decomposed using singular value decomposition:

    M = [p_1, p_2, ..., p_k] Λ [q_1, q_2, ..., q_k]^T    (7)

The weight vectors w_x and w_y are calculated as follows:

    w_x = A_xx^(-1/2) p_1    (8)
    w_y = A_yy^(-1/2) q_1    (9)

In the case where more than one canonical variable is required, the weight vectors of the networks transforming the variables x and y into the successive canonical variables are obtained by using the vectors p_i and q_i, for i = 2, ..., k, in equations (8) and (9) respectively. Thus the canonical variables can be obtained simultaneously without solving any non-linear optimization problem. The basis functions for the mapping from x to u are chosen such that y is predicted from x, that is:

    y = Σ_{i=1..p} γ_i f_i(x) + ε

where ε is the prediction error and γ_i is a coefficient. Similarly, the basis functions for the mapping from y to v are selected so that x is predicted from y with minimum error. There exist many techniques for adjusting the centres and the widths σ_x and σ_y of the radial basis functions [7-8]. Here the centres are determined by fitting a Gaussian mixture model with circular covariances using the EM algorithm, with the widths set equal to the maximum inter-centre distance.
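The projection stage can be sketched as below, reusing linear_cca from the sketch in section 2. The Gaussian form of the basis functions and the externally supplied centres are assumptions for illustration; the paper obtains the centres from an EM-fitted Gaussian mixture, whereas here they are simply passed in (e.g. from a clustering step).

    # Sketch of the projection stage (equations (5)-(9)).
    import numpy as np

    def gaussian_features(X, centres, width):
        # f_i(x) = exp(-||x - c_i||^2 / (2 width^2)), one column per centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * width ** 2))

    def max_intercentre_width(centres):
        # Width heuristic from the paper: the maximum inter-centre distance.
        d2 = ((centres[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        return np.sqrt(d2.max())

    def rbf_cca(X, Y, cx, cy, k):
        # Replace x and y by the RBF feature vectors f(x), g(y), then solve
        # the same SVD problem (eqs (6)-(9)) for the weight vectors w_x, w_y.
        F = gaussian_features(X, cx, max_intercentre_width(cx))   # (N, p)
        G = gaussian_features(Y, cy, max_intercentre_width(cy))   # (N, q)
        Wx, Wy, corr = linear_cca(F, G, k)
        U = (F - F.mean(axis=0)) @ Wx       # canonical variables u_1 .. u_k
        V = (G - G.mean(axis=0)) @ Wy       # canonical variables v_1 .. v_k
        return U, V, corr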

3.2 Mapping from the Canonical Variables to the Original Data Space

The scores u and v calculated in the first stage of the algorithm should be a good approximation of the original vectors x and y. The next stage is to apply an inverse transformation, again using a RBF network. The parameters of the mappings from the scores u to x and from v to y are adjusted such that the sums of squared prediction errors are minimised:

    E_x = Σ_{i=1..N} ||x_i - x̂_i||^2  and  E_y = Σ_{i=1..N} ||y_i - ŷ_i||^2    (10)

The centres and the widths are calculated as described in section 3.1, and the weights are determined by least squares. To avoid overfitting, a regularization term is added to the sum of squares of the errors while finding the parameters of the network.
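A sketch of this inverse mapping follows, reusing gaussian_features from the previous sketch. A ridge penalty on the output weights stands in for the unspecified regularizer, and the strength lam is an assumed value; the paper does not give one.

    # Sketch of the inverse mapping of section 3.2 with ridge regularization.
    import numpy as np

    def fit_inverse_map(U, X, centres, width, lam=1e-3):
        # Solve min ||X - Phi W||^2 + lam ||W||^2 for the output weights W,
        # i.e. the regularized least-squares step described above.
        Phi = gaussian_features(U, centres, width)     # (N, p) design matrix
        p = Phi.shape[1]
        return np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ X)

    def reconstruct(U, centres, width, W):
        # x-hat in equation (10): the network's approximation of x from u.
        return gaussian_features(U, centres, width) @ W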

4. Test Example

The proposed approach to non-linear CCA is applied to the test problem given in [5]. The variables x and y are three dimensional, each the sum of two modes: x = [x_1^(1) + x_1^(2), x_2^(1) + x_2^(2), x_3^(1) + x_3^(2)]^T and y = [y_1^(1) + y_1^(2), y_2^(1) + y_2^(2), y_3^(1) + y_3^(2)]^T, where

    x_1^(1) = t - 0.3t^2,  x_2^(1) = t + 0.3t^3,  x_3^(1) = t^2;    (11)
    y_1^(1) = t^3,  y_2^(1) = -t + 0.3t^3,  y_3^(1) = t + 0.3t^2;    (12)
    x_1^(2) = -s - 0.3s^2,  x_2^(2) = s - 0.3s^3,  x_3^(2) = -s^4;    (13)
    y_1^(2) = sech(4s),  y_2^(2) = s + 0.3s^3,  y_3^(2) = s + 0.3s^2;    (14)

and t and s are independent and uniformly distributed over [0, 1]. The plots of modes 1 and 2 in x-space and y-space are shown in Fig. 1. The data set was generated by setting the variance of the second canonical variable to one third of that of the first canonical variable. Gaussian noise with standard deviation equal to a fixed percentage of the signal standard deviation was added. The variables were then auto-scaled and non-linear CCA was applied. The number of neurons in the projection stage was optimised through cross-validation to reproduce the vectors x and y; for the test problem, the number of neurons was fifteen. Two canonical variables explained approximately 95% of the variance in X and Y. In the inverse mapping, from the canonical variables to the original variables, the number of neurons was twelve. The plots of mode 1 and mode 2 in x- and y-space extracted from the data are shown in Figs. 2 and 3 respectively. Comparing Fig. 1 with Figs. 2 and 3, the proposed technique is able to extract Mode 1 and Mode 2 reasonably well from the data.

Fig. 1. Modes 1 and 2 in x-space (LHS) and y-space (RHS). (data; o Mode 1; Mode 2)

Fig. 2. Extraction of Mode 1 in x-space (LHS) and y-space (RHS). (Data; o Extracted Mode 1; Actual Mode 1)

Fig. 3. Extraction of Mode 2 in x-space (LHS) and y-space (RHS). (Data; o Extracted Mode 2; Actual Mode 2)
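For concreteness, the synthetic data of equations (11)-(14) can be generated as below. The sample size, the noise percentage, and the exact way the one-third variance scaling is applied are our assumptions; the paper's values are not recoverable from the scanned text.

    # Sketch of the synthetic test data of section 4 (equations (11)-(14)).
    import numpy as np

    rng = np.random.default_rng(0)
    N = 500                            # sample size: assumed, not from the paper
    t = rng.uniform(0.0, 1.0, N)       # mode 1 signal
    s = rng.uniform(0.0, 1.0, N)       # mode 2 signal

    X1 = np.column_stack([t - 0.3 * t**2, t + 0.3 * t**3, t**2])     # eq. (11)
    Y1 = np.column_stack([t**3, -t + 0.3 * t**3, t + 0.3 * t**2])    # eq. (12)
    X2 = np.column_stack([-s - 0.3 * s**2, s - 0.3 * s**3, -s**4])   # eq. (13)
    Y2 = np.column_stack([1.0 / np.cosh(4 * s),                      # eq. (14)
                          s + 0.3 * s**3, s + 0.3 * s**2])

    def third_variance(A, ref):
        # Scale the second mode so its variance is one third of the first
        # mode's (one reading of the paper's description).
        return A * np.sqrt(ref.var() / (3.0 * A.var()))

    X = X1 + third_variance(X2, X1)
    Y = Y1 + third_variance(Y2, Y1)

    # Additive Gaussian noise (10% of the signal std is assumed here),
    # followed by auto-scaling of each variable.
    X = X + rng.normal(0.0, 0.1 * X.std(axis=0), X.shape)
    Y = Y + rng.normal(0.0, 0.1 * Y.std(axis=0), Y.shape)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)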

The correlation between u_1 and v_1 is 0.997 and that between u_2 and v_2 is 0.9844. The MSE of x after the extraction of the first canonical variable is 0.875, and for y it is 0.69. After extraction of both canonical variables, the MSEs in x and y are 0.96 and 0.59 respectively. After the non-linear CCA model is built, it is used to predict y from given values of x. The average MSE for the prediction of y for new data sets, given x, is 0.9. These results are comparable with those reported in [5].

5. Conclusions

In this paper non-linear CCA using a radial basis function network has been proposed. For this method the training effort is small because of the near-linear nature of the problem. Also, the canonical variables can be extracted simultaneously. However, the issue of how many canonical variables should be retained to build the model remains unresolved. The model has been tested on synthetic data. The aim is to apply the methodology to fault detection and diagnosis.

6. Acknowledgements

S. Kumar would like to acknowledge the Centre for Process Analytics and Control Technology (CPACT) and the EPSRC for financial support.

7. References

1. M. A. Kramer: Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal, 37(2), 233-243 [1991].
2. D. Dong, T. J. McAvoy: Nonlinear principal component analysis based on principal curves and neural networks. Computers and Chemical Engineering, 20(1), 65-78 [1996].
3. S. Tan, M. L. Mavrovouniotis: Reducing data dimensionality through optimizing neural network inputs. AIChE Journal, 41(6), 1471-1480 [1995].
4. F. Jia, E. B. Martin, A. J. Morris: Non-linear principal components analysis with application to process fault detection. International Journal of Systems Science, 31(11), 1473-1487 [2000].
5. W. W. Hsieh: Nonlinear canonical correlation analysis by neural networks. Neural Networks, 13, 1095-1105 [2000].
6. D. J. H. Wilson, G. W. Irwin, G. Lightbody: RBF principal manifolds for process monitoring. IEEE Transactions on Neural Networks, 10(6), 1424-1434 [1999].
7. C. M. Bishop: Neural networks for pattern recognition. Oxford University Press [1995].
8. T. Kohonen: Self-organization and associative memory. Springer Series in Information Sciences. Springer-Verlag [1984].