J. Japan Statist. Soc.
Vol. 11 No. 1 1981 43-53

EXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS

Haruo Yanai*

Generalized expressions of canonical correlation analysis and partial canonical correlation analysis are introduced, in which the sums of the squared canonical and partial canonical correlation coefficients are given as the trace of the product of two orthogonal projectors and of two oblique projectors, respectively. Following this result, some explicit expressions of projectors are obtained in connection with the product of two projectors defined in terms of canonical variables arising in canonical correlation analysis, partial canonical correlation analysis and part canonical correlation analysis. The results are applied to show that the Euclidean distance based on canonical variables turns out to be Mahalanobis' generalized distance with a slight modification.

1. Introduction

Canonical correlation analysis, established by Hotelling (1936) as a method of analysing the relationship between two sets of variables, is a generalization of regression analysis. It also subsumes multiple regression analysis, discriminant analysis, canonical analysis based on discrete variables and canonical factor analysis. Recently, Khatri (1976) showed that the theory can be developed, by using the generalized inverse defined by Rao (1962), without assuming the non-singularity of the covariance matrix associated with the joint distribution of the explanatory and criterion variables. The purpose of the present article is to further extend the theory to the case where the two sets of variables are not necessarily linearly independent and a third set of variables is available in addition to the two sets.
For this purpose, we give some explicit expressions of projectors defined in terms of canonical variables obtained from canonical correlation analysis, partial canonical correlation analysis and multiple discriminant analysis, using the general theory of the projector developed by Rao and Mitra (1971), Takeuchi and Yanai (1972), Rao (1974) and Rao and Yanai (1979). Furthermore, we shall show that the obtained results can be used effectively for clarifying the form of the Euclidean distance based on the canonical variables arising from canonical correlation analysis and multiple discriminant analysis.

In the next section, we give some preliminary results, and in the subsequent sections we shall show that they can be used to clarify the relations among the projectors defined in terms of canonical variables arising in canonical correlation analysis and partial canonical correlation analysis.

Received Nov. 1, 1979. Revised Dec. 4, 1980.
* Chiba University.
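The statement that the sum of the squared canonical correlation coefficients equals the trace of the product of two orthogonal projectors can be checked numerically. The following is a minimal sketch under stated assumptions, not the derivation of this paper: the data are simulated, the columns are centred, the Moore-Penrose inverse is used as one particular generalized inverse, and the helper names `orthogonal_projector` and `canonical_correlations` are hypothetical. One column of X is made redundant on purpose, so that X is not of full column rank.

```python
import numpy as np

def orthogonal_projector(A, rcond=1e-10):
    """Orthogonal projector onto the column space of A, A (A'A)^- A'."""
    # The Moore-Penrose inverse is one valid choice of generalized inverse.
    # An explicit cut-off keeps the rank-deficient case numerically stable.
    return A @ np.linalg.pinv(A.T @ A, rcond) @ A.T

def canonical_correlations(X, Y, tol=1e-10):
    """Canonical correlations as the singular values of Ux' Uy, where Ux and
    Uy are orthonormal bases of the column spaces of X and Y."""
    def basis(A):
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > tol * s.max()]
    return np.linalg.svd(basis(X).T @ basis(Y), compute_uv=False)

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(n, 2))
# Append a redundant column so that X is not of full column rank.
X = np.column_stack([X, X[:, 0] + X[:, 1]])
# Centre the columns so that inner products correspond to covariances.
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

rho = canonical_correlations(X, Y)
trace_value = float(np.trace(orthogonal_projector(X) @ orthogonal_projector(Y)))
sum_sq = float(np.sum(rho ** 2))
print(trace_value, sum_sq)  # the two values should agree up to rounding error
```

The two quantities are computed by independent routes (a generalized inverse on one side, orthonormal bases on the other), so their agreement is a genuine check rather than a restatement.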
which is an estimate of Mahalanobis' generalized distance between groups 1 and 2 (see Yanai and Takane (1977, p. 81)). From the above result, it follows that the Euclidean distance based on canonical variables turns out to be Mahalanobis' generalized distance with a slight modification. We prove a stronger result.

COROLLARY 6.
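The relation between the Euclidean distance on canonical variables and Mahalanobis' generalized distance can be illustrated by a small numerical experiment. The sketch below rests on assumptions that are not taken from this paper: the three groups are simulated, and the canonical (discriminant) variables are normalized with respect to the pooled within-group covariance matrix W. Under this particular normalization the two squared distances between group centroids coincide exactly; the normalization used in the text may differ, which is presumably where the slight modification enters.

```python
import numpy as np

rng = np.random.default_rng(1)
p, sizes = 4, [30, 40, 50]                     # simulated design (assumption)
shifts = rng.normal(scale=2.0, size=(len(sizes), p))
groups = [rng.normal(size=(n_k, p)) + shift for n_k, shift in zip(sizes, shifts)]

n, g = sum(sizes), len(sizes)
means = np.array([G.mean(axis=0) for G in groups])
grand_mean = np.vstack(groups).mean(axis=0)

# Pooled within-group covariance W and between-group matrix B.
W = sum((G - m).T @ (G - m) for G, m in zip(groups, means)) / (n - g)
B = sum(n_k * np.outer(m - grand_mean, m - grand_mean)
        for n_k, m in zip(sizes, means))

# Solve B a = lambda W a through the Cholesky factor W = L L', so that the
# coefficient matrix A of the canonical variables satisfies A' W A = I.
L = np.linalg.cholesky(W)
L_inv = np.linalg.inv(L)
eigvals, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
order = np.argsort(eigvals)[::-1][: g - 1]     # keep the g - 1 leading variables
A = L_inv.T @ U[:, order]

centroids = means @ A                          # group centroids on canonical variables

def euclid_sq(i, j):
    """Squared Euclidean distance between centroids i and j on canonical variables."""
    d = centroids[i] - centroids[j]
    return float(d @ d)

def mahalanobis_sq(i, j):
    """Squared Mahalanobis distance between group means i and j, based on W."""
    d = means[i] - means[j]
    return float(d @ np.linalg.solve(W, d))

for i in range(g):
    for j in range(i + 1, g):
        print(i, j, euclid_sq(i, j), mahalanobis_sq(i, j))
```

Only g - 1 canonical variables are retained. The discarded directions correspond to zero eigenvalues of the between-group matrix, along which all group means coincide, so nothing is lost in the distances between centroids.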
Reasoning similar to that in the proof of Corollary 6 leads to the following corollary.

COROLLARY 7.

Finally, we consider the Euclidean distances based on the canonical variables of canonical correlation analysis. In this case, the following theorem is established.

(34)

The proof follows immediately, using the result (10) and Corollary 2. In order to get explicit expressions of d2yb(gi, gj), we may replace X by Y and Y by X in both equations of (34). From Corollary 5, we have the following theorem, which is a generalization of Theorem 7.

The theorem follows from the result of Corollary 3 and an explicit expression of $P_{Q_Z X}$, which is $Q_Z X (X' Q_Z X)^{-} X' Q_Z$.

Acknowledgement

Part of the result of this paper was obtained while the author was visiting the Indian Statistical Institute, New Delhi, from December 1977 to January 1978. I should like to express my sincere gratitude to Dr. C. R. Rao for his useful comments on this work. Thanks are also due to Dr. M. Sibuya, IBM Japan, and Dr. S. Iwatsubo, National Center for Entrance Examination of Universities, for their suggestions, which led me to Corollaries 1 and 6, respectively. Furthermore, I express my deep thanks to the referees for their useful suggestions for revising the manuscript, which led me to incorporate Corollary 4 in this paper.
REFERENCES

[1] Cooley, W. W. and Lohnes, P. R. (1962). Multivariate Procedure for the Behavioral Science, John Wiley & Sons, New York.
[2] Gnanadesikan, R. (1977). Method for Statistical Data Analysis of Multivariate Observations, John Wiley & Sons, New York.
[3] Hotelling, H. (1936). Relations between two sets of variates, Biometrika, 28, 321-377.
[4] Khatri, C. G. (1976). A note on multiple and canonical correlation for a singular matrix, Psychometrika, Vol. 41, No. 4, 465-470.
[5] Rao, C. R. (1962). A note on a generalized inverse of a matrix with applications to problems in mathematical statistics, J. of Royal Statistical Society, B 24, 152-158.
[6] Rao, C. R. and Mitra, S. K. (1971). Generalized Inverse of Matrices and its Applications, John Wiley & Sons, New York.
[7] Rao, C. R. (1974). Projectors, generalized inverse and BLUES, J. of Royal Statistical Society, B 36, 442-448.
[8] Rao, C. R. and Yanai, H. (1979). General definition of a projector, its decomposition, and application to statistical problems, J. of Statistical Planning and Inference, Vol. 3, No. 1, 1-17.
[9] Timm, N. H. and Carlson, J. E. (1976). Part and bipartial canonical correlation analysis, Psychometrika, Vol. 41, 159-176.
[10] Takeuchi, K. and Yanai, H. (1972). Tahenryokaiseki no Kiso (Foundations of Multivariate Analysis), Toyo Keizai Press.
[11] Yanai, H. and Takane, Y. (1977). Tahenryokaiseki (Multivariate Analysis), Asakura Publishing Company, Tokyo.