AN APPLICATION OF THE CANONICAL CORRELATION

Similar documents
632 CHAP. 11 EIGENVALUES AND EIGENVECTORS. QR Method

446 CHAP. 8 NUMERICAL OPTIMIZATION. Newton's Search for a Minimum of f(x,y) Newton s Method

Random Vectors, Random Matrices, and Matrix Expected Value

Multivariate Statistics Fundamentals Part 1: Rotation-based Techniques

MULTIVARIATE TIME SERIES ANALYSIS AN ADAPTATION OF BOX-JENKINS METHODOLOGY Joseph N Ladalla University of Illinois at Springfield, Springfield, IL

DISCRIMINANT ANALYSIS IN THE STUDY OF ROMANIAN REGIONAL ECONOMIC DEVELOPMENT

Partial derivatives, linear approximation and optimization

Nonparametric regresion models estimation in R

Canonical Correlation & Principle Components Analysis

Exercise Set 7.2. Skills

4.1 Order Specification

STATISTICAL METHODS FOR PROCESSING OF BIOSIGNALS

DIFFERENTIATION. MICROECONOMICS Principles and Analysis Frank Cowell. July 2017 Frank Cowell: Differentiation

A Least Squares Formulation for Canonical Correlation Analysis

Standardization and Singular Value Decomposition in Canonical Correlation Analysis

1 Number Systems and Errors 1


Canonical Correlation Analysis of Longitudinal Data

TWO BOUNDARY ELEMENT APPROACHES FOR THE COMPRESSIBLE FLUID FLOW AROUND A NON-LIFTING BODY

Ch.3 Canonical correlation analysis (CCA) [Book, Sect. 2.4]

Performance In Science And Non Science Subjects

Jordan Canonical Form Homework Solutions

Math 20F Final Exam(ver. c)

This paper is not to be removed from the Examination Halls

LECTURE 9 LIMITS AND CONTINUITY FOR FUNCTIONS OF SEVERAL VARIABLES

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 8: Canonical Correlation Analysis

ON THE TRUNCATED COMPOSITE WEIBULL-PARETO MODEL

Constrained optimization.

4. Nonlinear regression functions

Ion Patrascu, Florentin Smarandache The Polars of a Radical Center

Introduction to Applied Linear Algebra with MATLAB

STATS306B STATS306B. Discriminant Analysis. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

Data Matrix User Guide

Chapter 6. Eigenvalues. Josef Leydold Mathematical Methods WS 2018/19 6 Eigenvalues 1 / 45

Simultaneous Diagonalization of Positive Semi-definite Matrices

Logistic Regression: Regression with a Binary Dependent Variable

YORK UNIVERSITY. Faculty of Science Department of Mathematics and Statistics MATH M Test #1. July 11, 2013 Solutions

22.4. Numerical Determination of Eigenvalues and Eigenvectors. Introduction. Prerequisites. Learning Outcomes

The LIML Estimator Has Finite Moments! T. W. Anderson. Department of Economics and Department of Statistics. Stanford University, Stanford, CA 94305

7. Response Surface Methodology (Ch.10. Regression Modeling Ch. 11. Response Surface Methodology)

ON THE SINGULAR DECOMPOSITION OF MATRICES

EIGENVALUES AND SINGULAR VALUE DECOMPOSITION

ENGI 5708 Design of Civil Engineering Systems

Lecture 8: Linear Algebra Background

ILLUSTRATIVE EXAMPLES OF PRINCIPAL COMPONENTS ANALYSIS

lecture 2 and 3: algorithms for linear algebra

THE USAGE OF GIS APPLICATIONS IN SURVEYS DEPARTMENT. Tools for planning and managing the work of interviewers

ENGI 5708 Design of Civil Engineering Systems

ANALELE ŞTIINŢIFICE ALE UNIVERSITĂŢII ALEXANDRU IOAN CUZA DIN IAŞI Tomul LII/LIII Ştiinţe Economice 2005/2006

Anexa nr. 2. Subject Content

Random linear systems and simulation

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or

CS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)

forms Christopher Engström November 14, 2014 MAA704: Matrix factorization and canonical forms Matrix properties Matrix factorization Canonical forms

Preface to Second Edition... vii. Preface to First Edition...

Regional economic disparities in Romania

SHORT ANALYSIS OF REGIONAL DISPARITIES ÎN ROMÂNIA

lecture 3 and 4: algorithms for linear algebra

Eigenvalues and Eigenvectors: An Introduction

Elliptically Contoured Distributions

U.P.B. Sci. Bull., Series A, Vol. 74, Iss. 3, 2012 ISSN SCALAR OPERATORS. Mariana ZAMFIR 1, Ioan BACALU 2

JUST THE MATHS UNIT NUMBER 9.9. MATRICES 9 (Modal & spectral matrices) A.J.Hobson

ECE580 Partial Solution to Problem Set 3

LOSS FACTOR AND DYNAMIC YOUNG MODULUS DETERMINATION FOR COMPOSITE SANDWICH BARS REINFORCED WITH STEEL FABRIC

Neuendorf MANOVA /MANCOVA. Model: X1 (Factor A) X2 (Factor B) X1 x X2 (Interaction) Y4. Like ANOVA/ANCOVA:

WOMP 2001: LINEAR ALGEBRA. 1. Vector spaces

Introduction to Operations Research

HW2 - Due 01/30. Each answer must be mathematically justified. Don t forget your name.

Multivariate Regression (Chapter 10)

Machine Learning 11. week

Czech J. Anim. Sci., 50, 2005 (4):

Two Posts to Fill On School Board

Linear Algebra in Computer Vision. Lecture2: Basic Linear Algebra & Probability. Vector. Vector Operations

Lecture Note 1: Probability Theory and Statistics

Poverty assessment me-thodology.

Module 3: 3D Constitutive Equations Lecture 10: Constitutive Relations: Generally Anisotropy to Orthotropy. The Lecture Contains: Stress Symmetry

Response Surface Methodology

EE 227A: Convex Optimization and Applications October 14, 2008

MEMORIAL UNIVERSITY OF NEWFOUNDLAND

Chapter 4 Euclid Space

Here each term has degree 2 (the sum of exponents is 2 for all summands). A quadratic form of three variables looks as

Family Feud Review. Linear Algebra. October 22, 2013

THE TECHNOLOGICAL OPTIMIZATION OF DISCONTINUOUS FILTRATION USING GEOMETRICAL PROGRAMMING

Moore-Penrose Conditions & SVD

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Final Exam

Quadratic Equations 6 QUESTIONS. Relatively Easy: Questions 1 to 2 Moderately Difficult: Questions 3 to 4 Difficult: Questions 5 to 6

MATHEMATICAL MODELING OF METHANOGENESIS

Neuendorf MANOVA /MANCOVA. Model: X1 (Factor A) X2 (Factor B) X1 x X2 (Interaction) Y4. Like ANOVA/ANCOVA:

Principal Component Analysis CS498

On the Equivalence Between Canonical Correlation Analysis and Orthonormalized Partial Least Squares

16.584: Random Vectors

Chapter 11 Canonical analysis

MSA220 Statistical Learning for Big Data

Another algorithm for nonnegative matrices

SEC POWER METHOD Power Method

Chapter 7: Symmetric Matrices and Quadratic Forms

Physics 202 Laboratory 5. Linear Algebra 1. Laboratory 5. Physics 202 Laboratory

STAT 501 Assignment 1 Name Spring 2005

Functional Analysis. James Emery. Edit: 8/7/15

Transcription:

AN APPLICATION OF THE CANONICAL CORRELATION Reader, Dan Petru Vasiliu, Ph. D Faculty of Economic tudies, University Titu Maiorescu Bucharest enior lect. Ganea Tudor, Ph. D Faculty of Economic tudies, University Titu Maiorescu Bucharest Abstract : the paper deals with an experimental application of the canonical correlation in the study of the social environment in the last year Romania. To do this, two groups of variables are selected : one group repersenting the social effort and the other the social effect of the labour. There studied the characteristic of the dependence between them. The conclusion is that in the considered period the greater was the effort, the smaller was the effect. Two different variants of the canonical correlation method were used to prove this Both of them confirm this conclusion. Keywords: canonical correlation ; canonical variates; the COREMAX method multivariate analysis The aim of the classical canonical correlation analysis is studying relationships between sets of multiple dependent and multiple independent variables. o, let X be a p dimensional and Y be a q dimensional random variable. We look for two vectors a, b such that a R coefficient correl ( a X,b Y be maximal. p,b R q and the quadratic KENDALL correlation Note that the univariate random variable U = a X and V = b Y are known as the canonical variates of the variables X, resp. Y,and that a, b are the corresponding canonical coefficients. This problem was solved long before (from Harold Hotteling -939 to Joseph F. Hair,998 for example. In this paper the canonical correlation method will be applied to study dependence between the group X = { civil employment ; employment rate } as the group of independent variables, and Y = { average net nominal monthly earnings ; total expenditure of household } as the group of independent variables. The data in our study are the next (http://www.insse.ro/cms/files/pdf/en/cp3.pdf Year X X Y Y 997 050 60.9 63. 38.58* 998 0845 59.6 04 96.3* 999 0776 63.5 5 74.* 000 0508 63.6 4 375.9* 00 0440 6.9 30 56.5 00 934 58 379 65.66 003 93 57.8 484 78.45

Here, as stated above : 004 958 57.9 599 049.44 005 947 57.7 746 49.33 X = civil employment X = employment rate Y = average net nominal monthly earnings Y = total expenditure of household. ( The stared data are author s estimations. First there proceeds to normalize this data,in order to obtain a less error level ( as is well known, the normalized variable V that corresponds to the variable V, is given by V V E(V σ (V (V =, where E(V is the expected value,and σ - the standard deviation of the variable V. The normalized variables are then represented by the next table : Year X X Y 997.7 0.68 -.74 -.80 998 0.970-0.38-0.999 -.03 999 0.886.8-0.794-0.80 000 0.563.30-0.530-0.53 00 0.480.047-0.54-0.47 00-0.977-0.86 0.74 0. 003-0.990-0.939 0.63 0.577 004 -.068-0.900.3.30 005 -.08-0.978.74.583 The estimated total correlation matrix of this last dataset is then : Y, with : 0,7834 0,7834 ; 0,967 0,6744 0,968 0,6783 0,967 0,968 0,6744 ; 0,6783 0,9957 0,9957 There determines now the largest eigenvalues of the matrices

A = ( and B = ( ( ( 0,9586 0,6 0,7034,5554 0,6959 0,0850 0,75,5770 The ( common largest eigenvalue of the matrices A, B is λ = 0,87. This is equally the quadratic measure of dependence between the two groups, { X,X } and { Y,Y } ( details will be seen later. The canonical coefficients are the eigenvectors, corresponding to λ = 0,87 : there obtain therefore a = ( 0,99 ; - 0,63, b = ( - 0,438 ; 0,904 U = 0,99 X - 0,63 X ; V = - 0,438 Y + 0,904 Y. U V.738-0.5893 0.9994-0.578 0.7784-0.4087 0.3975-0.649 0.3446-0.0703-0.860 0.305-0.8636 0.6788-0.9464 0.7389-0.9498 0.7 If compute now the correlation coefficient correl (U,V, there obtains obviously, [correl (U, V ] = λ. correl (U,V = - 0,9385 ; For correl (U,V < 0, there results that the dependence of { Y,Y } with respect to { X,X } is of negative type : the greater { X,X } become, the poor { Y,Y } become. Otherwise, correl (U,V being so near of -, the dependence is strong enough to let us consider the group { X,X } determinative for { Y,Y }. The values of the canonical variates are processed below : X X Y Y 0.54 0 0 0.89 0.3 0.060 0.057

0.856 0.983 0.30 0.34 0.75.000 0. 0.35 0.679 0.88 0.350 0.374 0.046 0.05 0.463 0.508 0.040 0.07 0.66 0.636 0.006 0.034 0.785 0.90 0 0.000 A reason the canonical correlation method is not used so widely is that the practical meaning of the canonical coefficients is rarely clear. To remediate that, different versions of the method are improved. One of them is the COREMAX method ( see (, that gives the coefficients the meaning of degrees of contributions of each component of the group to the corresponding canonical variate. To do this, we ll denote: η= α X µ = β Y + ( α X + ( β Y ; ; 0 α 0 β Now, we look for values α, β that maximize correl (η, µ. To make economic interpretation possible, we ll enforce to the variables another kind of being comparable, instead of normalization : thus, the normed variables are more adequate.this is justified by the fact that negative values of the variables are equally difficult to interpret ; or, the normalization unavoidable gives negative values to normalized variables. Remind that, for a variable series V, the normed series V is given by X X Y 0.54 0 0 0.89 0.3 0.060 0.057 0.856 0.983 0.30 0.34 0.75.000 0. 0.35 0.679 0.88 0.350 0.374 0.046 0.05 0.463 0.508 0.040 0.07 0.66 0.636 0.006 0.034 0.785 0.90 0 0.000 Y V V min{v} max{v} min{v} =. There obtaines :

Applying technique described in (, there obtains α = 0,47 ; β = 0,70 ; max Correl( µ ; η = - 0,8376 this value being close enough to max correl (U,V = - 0,9385. Finally : in the next, - the canonical variate µ = 0,47 X + 0,53 X gets the name of synthetical employment - the canonical variate η = 0,7 Y + 0,3 Y gets the name of synthetical earning Then: - the synthetical earning depends strongly from the synthetical employment ; - the intensity of this dependence is about 83,76% - the dependence is of the negative type : the greater the synthetical employment, the smaller the synthetical earning becomes. In other words : in the mentioned period, the greater was the effort, the smaller was the effect. BIBLIOGRAPHY: ( :Multivariate Data Analysis, 5th edition by Joseph F. Hair, Jr., Rolph E. u.a. Prentice Hall, Inc. 998. ( : Metode matematice pentru fundamentarea deciziilor in productie, by D.P.Vasiliu, Editura Tehnica, Bucuresti, 986