Fast Computation of Moore-Penrose Inverse Matrices

Similar documents
Case report on the article Water nanoelectrolysis: A simple model, Journal of Applied Physics (2017) 122,

A proximal approach to the inversion of ill-conditioned matrices

A Slice Based 3-D Schur-Cohn Stability Criterion

A new simple recursive algorithm for finding prime numbers using Rosser s theorem

Methylation-associated PHOX2B gene silencing is a rare event in human neuroblastoma.

On Newton-Raphson iteration for multiplicative inverses modulo prime powers

Towards an active anechoic room

Full-order observers for linear systems with unknown inputs

Smart Bolometer: Toward Monolithic Bolometer with Smart Functions

Easter bracelets for years

Vibro-acoustic simulation of a car window

Exact Comparison of Quadratic Irrationals

Analysis of Boyer and Moore s MJRTY algorithm

On The Exact Solution of Newell-Whitehead-Segel Equation Using the Homotopy Perturbation Method

The FLRW cosmological model revisited: relation of the local time with th e local curvature and consequences on the Heisenberg uncertainty principle

On Symmetric Norm Inequalities And Hermitian Block-Matrices

Completeness of the Tree System for Propositional Classical Logic

Low frequency resolvent estimates for long range perturbations of the Euclidean Laplacian

New estimates for the div-curl-grad operators and elliptic problems with L1-data in the half-space

Widely Linear Estimation with Complex Data

b-chromatic number of cacti

Unbiased minimum variance estimation for systems with unknown exogenous inputs

Can we reduce health inequalities? An analysis of the English strategy ( )

A non-commutative algorithm for multiplying (7 7) matrices using 250 multiplications

Thomas Lugand. To cite this version: HAL Id: tel

Finite volume method for nonlinear transmission problems

The Accelerated Euclidean Algorithm

Hook lengths and shifted parts of partitions

The magnetic field diffusion equation including dynamic, hysteresis: A linear formulation of the problem

Nodal and divergence-conforming boundary-element methods applied to electromagnetic scattering problems

Dispersion relation results for VCS at JLab

On Symmetric Norm Inequalities And Hermitian Block-Matrices

On Poincare-Wirtinger inequalities in spaces of functions of bounded variation

Solution to Sylvester equation associated to linear descriptor systems

Gaia astrometric accuracy in the past

Passerelle entre les arts : la sculpture sonore

Some explanations about the IWLS algorithm to fit generalized linear models

Evolution of the cooperation and consequences of a decrease in plant diversity on the root symbiont diversity

Exogenous input estimation in Electronic Power Steering (EPS) systems

Norm Inequalities of Positive Semi-Definite Matrices

Question order experimental constraints on quantum-like models of judgement

On path partitions of the divisor graph

L institution sportive : rêve et illusion

Multiple sensor fault detection in heat exchanger system

Soundness of the System of Semantic Trees for Classical Logic based on Fitting and Smullyan

On infinite permutations

A new approach of the concept of prime number

Comparison of Harmonic, Geometric and Arithmetic means for change detection in SAR time series

Linear Quadratic Zero-Sum Two-Person Differential Games

On size, radius and minimum degree

Influence of a Rough Thin Layer on the Potential

Comparison of NEWUOA with Different Numbers of Interpolation Points on the BBOB Noiseless Testbed

Solving an integrated Job-Shop problem with human resource constraints

Particle-in-cell simulations of high energy electron production by intense laser pulses in underdense plasmas

A note on the computation of the fraction of smallest denominator in between two irreducible fractions

On the longest path in a recursively partitionable graph

A note on the acyclic 3-choosability of some planar graphs

All Associated Stirling Numbers are Arithmetical Triangles

Solving the neutron slowing down equation

A non-commutative algorithm for multiplying (7 7) matrices using 250 multiplications

Quantum efficiency and metastable lifetime measurements in ruby ( Cr 3+ : Al2O3) via lock-in rate-window photothermal radiometry

Lorentz force velocimetry using small-size permanent magnet systems and a multi-degree-of-freedom force/torque sensor

Stickelberger s congruences for absolute norms of relative discriminants

Comments on the method of harmonic balance

Impedance Transmission Conditions for the Electric Potential across a Highly Conductive Casing

Nonlocal computational methods applied to composites structures

Computation and Experimental Measurements of the Magnetic Fields between Filamentary Circular Coils

From Unstructured 3D Point Clouds to Structured Knowledge - A Semantics Approach

approximation results for the Traveling Salesman and related Problems

ON THE UNIQUENESS IN THE 3D NAVIER-STOKES EQUATIONS

RHEOLOGICAL INTERPRETATION OF RAYLEIGH DAMPING

A Context free language associated with interval maps

Comment on: Sadi Carnot on Carnot s theorem.

A simple kinetic equation of swarm formation: blow up and global existence

A Simple Proof of P versus NP

Interactions of an eddy current sensor and a multilayered structure

Hardware Operator for Simultaneous Sine and Cosine Evaluation

On one class of permutation polynomials over finite fields of characteristic two *

Finiteness properties for Pisot S-adic tilings

Numerical Exploration of the Compacted Associated Stirling Numbers

One Use of Artificial Neural Network (ANN) Modeling in Simulation of Physical Processes: A real-world case study

The Mahler measure of trinomials of height 1

A NON - CONVENTIONAL TYPE OF PERMANENT MAGNET BEARING

Climbing discrepancy search for flowshop and jobshop scheduling with time-lags

About partial probabilistic information

A Study of the Regular Pentagon with a Classic Geometric Approach

Influence of network metrics in urban simulation: introducing accessibility in graph-cellular automata.

Primate cranium morphology through ontogenesis and phylogenesis: factorial analysis of global variation

Thermodynamic form of the equation of motion for perfect fluids of grade n

STATISTICAL ENERGY ANALYSIS: CORRELATION BETWEEN DIFFUSE FIELD AND ENERGY EQUIPARTITION

Sparse multivariate factorization by mean of a few bivariate factorizations

Impulse response measurement of ultrasonic transducers

Entropies and fractal dimensions

Non Linear Observation Equation For Motion Estimation

A non-linear simulator written in C for orbital spacecraft rendezvous applications.

On production costs in vertical differentiation models

Solving a quartic equation and certain equations with degree n

A remark on a theorem of A. E. Ingham.

Characterization of the local Electrical Properties of Electrical Machine Parts with non-trivial Geometry

A Simple Model for Cavitation with Non-condensable Gases

Transcription:

Fast Computation of Moore-Penrose Inverse Matrices Pierre Courrieu To cite this version: Pierre Courrieu. Fast Computation of Moore-Penrose Inverse Matrices. Neural Information Processing - Letters and Reviews, KAIST Press, 2005, 8 (2), pp.25-29. <hal-00276477> HAL Id: hal-00276477 https://hal.archives-ouvertes.fr/hal-00276477 Submitted on 29 Apr 2008 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Neural Information Processing - Letters and Reviews Vol.8, No.2, August 2005 LETTER Fast Computation of Moore-Penrose Inverse Matrices Pierre Courrieu Laboratoire de Psychologie Cognitive, UMR CNRS 6146, Université de Provence - Centre St Charles bat. 9, case D, 3 place Victor Hugo, 13331 Marseille cedex 1, France. E-mail: courrieu@up.univ-mrs.fr (Submitted on May 24, 2005) Abstract- Many neural learning algorithms require to solve large least square systems in order to obtain synaptic weights. Moore-Penrose inverse matrices allow for solving such systems, even with rank deficiency, and they provide minimum-norm vectors of synaptic weights, which contribute to the regularization of the input-output mapping. It is thus of interest to develop fast and accurate algorithms for computing Moore-Penrose inverse matrices. In this paper, an algorithm based on a full rank Cholesky factorization is proposed. The resulting pseudoinverse matrices are similar to those provided by other algorithms. However the computation time is substantially shorter, particularly for large systems. Keywords- Rank Deficient Least Square Systems, Moore-Penrose Inverse, Pseudoinverse, Generalized Inverse, Neural Learning, Minimum-norm Synaptic Weight Vectors, Regularization. 1. Introduction The minimization of quadratic error functions is commonly used in learning algorithms of feed-forward neural networks. This can be achieved by applying gradient methods, but one can also use other methods of linear algebra and matrix computation, particularly in the case of Radial Basis Function Networks [1, 2]. Given a set of m learning examples and a set of n basis functions (or hidden neurons), with m n, one forms the m n real matrix G of the n basis functions for the input values of the m examples, and one must find a synaptic weight matrix W such that the quantity GW - F 2 is minimum, where F is the m k real matrix of the expected output values (in R k ) for the m examples. Whenever G is rank deficient, the solution (W) to the above minimization problem is not unique, it deps on the method used to solve the least square system, and possibly on some random variables (initial conditions, search parameters) for certain algorithms. However, one knows that the network input-output mapping must be suitably regularized in order to obtain a good generalization power, and the regularization is usually achieved by minimizing a norm of some differential operator applied to the mapping [1, 2]. In order to do this, one must choose suitable basis functions. However the choice of the synaptic weights of the output neurons can also play an important role. Let K be a compact subset of R d, on which a network provides an approximation f of a sampled function. Then, for any input point X K, the output f(x) is given by the scalar product of the vector g(x) of the n basis function at X with the vector of output synaptic weights w. Let denote, for convenience, a partial derivative operator (of any order). One has supx K f ( X) = sup X K ( g( X). w) w supx K g( X), which shows the interest of minimizing the norm w of the synaptic weight vector. The Moore-Penrose inverse [3], also called Pseudoinverse, or Generalized Inverse, allows for solving least square systems, even with rank deficient matrices, in such a way that each column vector of the solution has a minimum norm, which is the desired property stated above. The Moore-Penrose inverse of a m n matrix G is the unique n m matrix G + with the following four properties: G G + G = G, G + G G + = G +, (G G + ) = G G +, ( G + G ) = G + G, 25

Fast Computation of Moore-Penrose Inverse Matrices P. Courrieu where M denotes the transpose (real case) or the adjoint (complex case) of the matrix M. The desired least square solution is given by W = G + W. Whenever G is of full rank ( n ), the Moore-Penrose inverse reduces to the usual pseudoinverse: G + = (G G ) -1 G. However, when G is rank deficient (that is when its rank r is lower than n ), the computation of G + is more complex. There are several methods for computing Moore-Penrose inverse matrices [3]. The most commonly used is the Singular Value Decomposition (SVD) method, that is implemented, for example, in the "pinv" function of Matlab (version 6.5.1), as well as in the "PseudoInverse" function of Mathematica (version 5.1). This method is very accurate, but time consuming when the matrix is large. Other well-known methods are Greville's algorithm, the full rank QR factorization by Gram-Schmidt orthonormalization (GSO), and iterative methods of various orders [3]. A number of expansions of the Moore-Penrose inverse can also be used to develop effective algorithms [4], while certain algorithms are specially devoted to parallel computation [5], but these last ones are hardly usable on serial processors given that they use Cramer's rule. In this paper, we propose an algorithm for fast computation of Moore-Penrose inverse matrices on any computer. The algorithm is based on a known reverse order law (eq. 3.2 from [4]), and on a full rank Cholesky factorization of possibly singular symmetric positive matrices (Theorem 4 from [6]). 2. Algorithm Foundations Consider the symmetric positive n n matrix G G, and assume that it is of rank r n. Using Theorem 4 from [6], one knows that there is a unique upper triangular matrix S with exactly n r zero rows, and such that S S = G G, while the computation of S is a simple extension of the usual Cholesky factorization of nonsingular matrices. Removing the zero rows from S, one obtains a r n matrix, say L, of full rank r, and we have G G = S S = LL (1) Note that one can as well directly compute the matrix L, as this is implemented, in Matlab code, in the function "geninv" listed in Appix. We use also a general relation concerning the Moore-Penrose inverse of a matrix product AB given by eq. 3.2 from [4] (AB) + = B (A A B B ) + A (2) With B = I, one obtains from (2) A + = (A A) + A (3) If B = A and A is a n r matrix of rank r, then one obtains from (2) (AA ) + = A (A A) -1 (A A) -1 A. (4) We are now ready to prove the following result: Theorem 1. With G and L defined as above, one has G + = L (L L) -1 (L L) -1 L G. Proof. Using Eq.(3) one obtains G + = (G G) + G, and using Eqs. (1) and (4), one obtains (G G) + = (L L ) + = L (L L) -1 (L L) -1 L, which completes the proof. The function "geninv" listed in Appix provides all implementation details of the above solution, in Matlab code, and it returns the Moore-Penrose inverse of any rectangular matrix given as argument. The two 26

Neural Information Processing - Letters and Reviews Vol.8, No.2, August 2005 main operations are the full rank Cholesky factorization of G G and the inversion of L L. On a serial processor, these operations are of complexity order O(n 3 ) and O(r 3 ), respectively. However, in a parallel architecture, with as many processors as necessary, the time complexity for the Cholesky factorization of G G could reduce to O(n), as suggested by the notation in Matlab code, while the time complexity for the inversion of the symmetric positive definite matrix L L could reduce to O(log r), according to [7]. 3. Computational Test In this section, we compare the performance of the proposed method ("geninv" function) to that of four usual algorithms for the computation of Moore-Penrose inverse matrices: Greville's method [8], the SVD method (Matlab pinv function), the full rank QR factorization by GSO method, and an iterative method of optimized order [3]. For this last one, the optimal order was determined experimentally, and it was found to be p = 2 9 = 512 for the family of matrices that were tested. All algorithms were carefully implemented and tested in Matlab (Version 6.5.1) on a common personal computer. The test matrices were of size m n, with n = 2 k, k = 5.. 10, and m = 2n, and they were rank deficient, with rank r = 7n / 8. The matrix coefficients were real random numbers in [-1,1]. The accuracy of the algorithms was examined in error matrices corresponding to the four properties characterizing the Moore-Penrose inverse: GG + G G, G + GG + G +, (GG + ) GG +, and (G + G) G + G. It turned out that, for all algorithms and for all test matrices, none of the coefficients in the error matrices was greater than 2 x 10-10 in absolute value, thus we obtained a reliable approximation of the Moore-Penrose inverse in all cases. The computation times (in seconds) are reported in Table 1, where one can see that the geninv function is substantially faster than the other algorithms for large matrices. An exception is observed for the smallest matrices (n = 32), for which the full rank QR method is a bit faster than the geninv. This probably results from the use of square root functions in the full rank Cholesky factorization of the geninv, while GSO uses only basic arithmetic operations. However, this small disadvantage is rapidly compensated by other properties as the size of the test matrices increases. Table 1. Computation time (in seconds) of the Moore-Penrose inverse of random rectangular rank-deficient real matrices by four usual algorithms and the "geninv" function, as a function of the matrix size. The computation error is lower than 2 x 10-10 per coefficient in the error matrices, in all cases. n = 32 n = 64 n = 128 n = 256 n = 512 n = 1024 Greville's method 0.015 0.296 1.734 17.092 169.850 2160.100 SVD method (Matlab pinv) 0.025 0.044 0.363 5.433 49.801 373.895 full rank QR (by GSO) 0.006 0.035 0.453 3.795 29.197 240.220 Iterative (of order 512) 0.019 0.141 0.664 4.158 29.286 232.688 geninv 0.007 0.022 0.105 0.768 6.445 54.137 4. Application Fields As stated above, the proposed tool allows for fast computation of Moore-Penrose inverse matrices and fast solving of large least square systems, possibly with rank deficient matrices. In the case of rank deficiency, the obtained solution, in a learning process, has the advantage of being of minimum norm, and thus it contributes to the regularization of the input-output mapping. However, we must note that whenever there is no risk of rank deficiency, there are simpler ways of solving least square systems, and the Moore-Penrose inverse itself reduces to a simpler expression. Thus, the choice of using the proposed tool must dep on the possibility that the matrix G be rank deficient, which in fact deps on the choice of the set of basis functions [9]. One can say that the Moore-Penrose inverse exts modeler's freedom concerning this last point. Moreover, we can note that, without being actually rank deficient, the matrix G can have certain column vectors that are close to be linearly depent, in such a way that the least square system is ill-conditioned. In this case, the use of the Moore- Penrose inverse avoids catastrophic numerical results that can result from the ill-conditioning. Another characteristic of the proposed tool is that it is particularly fast, which is critical whenever there are time constraints, as in on-line learning, or whenever the computation must be updated many times, as in incremental 27

Fast Computation of Moore-Penrose Inverse Matrices P. Courrieu learning procedures. In this last case, there are known fast updating procedures [10], however these procedures do not support rank deficiency, and the basis functions cannot be modified after they have been included in the network, while, in fact, it is frequently desirable to rescale Radial Basis Functions, for example, as the size of the network increases. Thus, it is clear that the fast computation of Moore-Penrose inverse matrices can have valuable applications in neurocomputational learning procedures. Appix: Matlab code of the "geninv" function function Y = geninv(g) % Returns the Moore-Penrose inverse of the argument % Transpose if m < n [m,n]=size(g); transpose=false; if m<n transpose=true; A=G*G'; n=m; else A=G'*G; % Full rank Cholesky factorization of A da=diag(a); tol= min(da(da>0))*1e-9; L=zeros(size(A)); r=0; for k=1:n r=r+1; L(k:n,r)=A(k:n,k)-L(k:n,1:(r-1))*L(k,1:(r-1))'; % Note: for r=1, the substracted vector is zero if L(k,r)>tol L(k,r)=sqrt(L(k,r)); if k<n L((k+1):n,r)=L((k+1):n,r)/L(k,r); else r=r-1; L=L(:,1:r); % Computation of the generalized inverse of G M=inv(L'*L); if transpose Y=G'*L*M*M*L'; else Y=L*M*M*L'*G'; References [1] F. Girosi and T. Poggio, Networks and the best approximation property, Biological Cybernetics, 63, pp. 169-176, 1990. [2] T. Poggio andf. Girosi, Networks for approximation and learning, Proceedings of the IEEE, 78(9), pp. 1481-1497, 1990. [3] A. B. Israel and T.N.E. Greville, Generalized Inverses: Theory and Applications, (2 nd ed.), New-York, Springer, 2003. [4] M.A. Rakha, On the Moore-Penrose generalized inverse matrix, Applied Mathematics and Computation, 158, pp. 185-200, 2004. [5] Y. Wei and G. Wang, PCR algorithm for parallel computing minimum-norm (T) least-squares (S) solution of inconsistent linear equations, Applied Mathematics and Computation, 133, pp. 547-557, 2002. [6] P. Courrieu, Straight monotonic embedding of data sets in Euclidean spaces, Neural Networks, 15, pp. 1185-1196, 2002. 28

Neural Information Processing - Letters and Reviews Vol.8, No.2, August 2005 [7] P. Courrieu, Solving time of least square systems in Sigma-Pi unit networks, Neural Information Processing Letters and Reviews, 4(3), pp. 39-45, 2004. [8] T.N.E. Greville, Some applications of the pseudoinverse of a matrix, SIAM Review, 2, pp. 15-22, 1960. [9] C.A. Micchelli, Interpolation of scattered data: distance matrices and conditionally positive definite functions, Constructive Approximation, 2, pp. 11-22, 1986. [10] S.K. Sin and R.J.P. DeFigueiredo, Efficient learning procedures for optimal interpolative nets, Neural Networks, 6, pp. 99-113, 1993. Pierre Courrieu is a CNRS researcher currently working at the University of Provence (France) with psychologists and neuroscientists. He is a member of the European Neural Network Society, and has published works on visual shape recognition, neural computation, data encoding, multidimensional scaling, function approximation on metric and non-metric spaces, and global optimization methods. 29