MODELS OF LEARNING AND THE POLAR DECOMPOSITION OF BOUNDED LINEAR OPERATORS


Eighth Mississippi State - UAB Conference on Differential Equations and Computational Simulations. Electronic Journal of Differential Equations, Conference 19 (2010), pp. 31-36. ISSN: 1072-6691. URL: http://ejde.math.txstate.edu or http://ejde.math.unt.edu (ftp ejde.math.txstate.edu)

FERNANDA BOTELHO, ANNITA DAVIS

Abstract. We study systems of differential equations in B(H), the space of all bounded linear operators on a separable complex Hilbert space H equipped with the operator norm. These systems are infinite-dimensional generalizations of mathematical models of learning. We use the polar decomposition of operators to find an explicit form for solutions. We also discuss the standard questions of existence and uniqueness of local and global solutions, as well as their long-term behavior.

1. Introduction

Learning models are devices that reproduce features of the human ability to interact with the environment. Such models are typically implemented in a feedforward neural network consisting of a finite number of interconnected units, designated neurons. Each neuron is capable of receiving, combining, and processing quantifiable information. The interconnection among neurons is represented by a net of channels through which information flows; this mirrors the activity of a brain at the synaptic level. It is natural to expect that information changes as it flows through this system, and this change is represented by the action of multiplicative factors, designated connecting weights. A mathematical model of learning should encompass an algebraic interpretation of this phenomenon, in general given as a system of differential equations. The stability of a learning model is of crucial importance, since it provides information on how the device evolves after being exposed to a set of initial conditions.
If stability occurs, the result yields a set of weight values that represents learning after exposure to an initial stimulus. In this paper we address this aspect of learning theory, often referred to in the literature as unsupervised learning. Several researchers have proposed systems that perform unsupervised learning; see [3, 13, 16]. In [17] Oja introduced a learning model that behaves as an information filter with the capability to adapt from an internal assignment of initial weights. This approach uses the statistical method of principal component analysis to perform a selection of relevant information. More recently, Adams [2] proposed a generalization of Oja's model by incorporating a probabilistic parameter.

2000 Mathematics Subject Classification. 34G20, 47J25.
Key words and phrases. Nonlinear systems; learning models; polar decomposition of operators.
(c) 2010 Texas State University - San Marcos. Published September 25, 2010.

Such a parameter captures the possible creation of temporary synapses, or channels, in an active network. We refer the reader to [5] for a detailed interpretation of the Cox-Adams model for a network with n input neurons and a single output neuron. This model is given by the system of differential equations

    \frac{dW}{dt} = TCW - W(W^{T}CW),

with W a column of connecting weights and C a symmetric matrix. Each entry of C is equal to the expected input correlation between two neurons. The matrix T is a non-singular tridiagonal matrix [t_{ij}]_{i,j} given by

    t_{ij} = \begin{cases} 1 - \varepsilon & \text{if } i = j, \\ \varepsilon/2 & \text{if } |i - j| = 1, \\ 0 & \text{otherwise.} \end{cases}

This matrix encodes synaptic formation according to some probability \varepsilon. In this paper we consider a generalization of this system to an infinite-dimensional setting, which better reflects the complexity of real systems where continuous activity occurs. Our new setting is the Banach space B(H) of bounded operators on a separable Hilbert space. We consider the system

    \frac{dZ}{dt} = TMZ - ZZ^{*}MZ,   (1.1)

with T an invertible, positive, self-adjoint operator on H and M a self-adjoint operator on H. Particularly interesting examples are the tridiagonal self-adjoint operators; see [7, 8, 10, 11]. The operator-valued, time-dependent Z now represents the continuous change of connecting weights according to the rule described in equation (1.1).

We present a scheme that explicitly solves system (1.1). First, a natural change of variables reduces (1.1) to a static system in which no synaptic formation occurs; the probabilistic effect transfers to the input correlation operator M, and system (1.1) reduces to an Oja-type model. We then follow a strategy employed in [4]. The main tool is the polar decomposition of operators, which allows us to derive a scalar system and a polar system associated with the original system. Both systems are solved explicitly.
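For intuition, the finite-dimensional Cox-Adams system can be integrated directly. The sketch below is illustrative only: it builds the tridiagonal T for a hypothetical 4-neuron network, assumes a small symmetric positive definite correlation matrix C, and steps dW/dt = TCW - W(W^T C W) with a basic Runge-Kutta loop.

```python
import numpy as np

def cox_adams_T(n, eps):
    """Tridiagonal synaptic-formation matrix: 1 - eps on the
    diagonal and eps/2 on the sub- and super-diagonals."""
    T = np.diag(np.full(n, 1.0 - eps))
    off = np.full(n - 1, eps / 2.0)
    return T + np.diag(off, 1) + np.diag(off, -1)

def rhs(W, T, C):
    """Cox-Adams flow dW/dt = TCW - W (W^T C W)."""
    return T @ C @ W - W * (W @ C @ W)

# hypothetical 4-neuron example; C is an assumed symmetric correlation matrix
n, eps = 4, 0.1
T = cox_adams_T(n, eps)
C = np.array([[3.0, 0.5, 0.2, 0.0],
              [0.5, 2.0, 0.3, 0.1],
              [0.2, 0.3, 1.0, 0.2],
              [0.0, 0.1, 0.2, 0.5]])

W = np.full(n, 0.5)            # initial connecting weights
dt = 0.01
for _ in range(5000):          # classical RK4 up to t = 50
    k1 = rhs(W, T, C)
    k2 = rhs(W + dt / 2 * k1, T, C)
    k3 = rhs(W + dt / 2 * k2, T, C)
    k4 = rhs(W + dt * k3, T, C)
    W += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# near an equilibrium the residual TCW - W(W^T C W) is small
print(np.linalg.norm(rhs(W, T, C)))
```

At an equilibrium, W is an eigenvector of TC scaled so that W^T C W equals the corresponding eigenvalue, which matches the fixed-point condition of the flow.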
These two solutions combined define the local solution of the original system, under certain mild constraints on the initial conditions. The explicit form of local solutions is then used to derive the existence of global solutions and to carry out the stability analysis.

2. Background Results

In this section we summarize the techniques used in [4] to solve the generalization of the Oja-Karhunen model on a separable Hilbert space. We recall that the generalized Oja-Karhunen model is given by

    \dot{Z} = MZ - ZZ^{*}MZ, \quad Z(0) = Z_0.   (2.1)

The time-dependent variable Z takes values in B(H), the operator Z^{*} is the adjoint of Z, and M is a normal operator on H. Classical fixed point theorems assure the local existence and uniqueness of solutions of system (2.1); see [15, p. 405].
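In finite dimensions the polar decomposition that drives the solution scheme can be checked directly. The sketch below uses a hypothetical invertible 3x3 matrix standing in for Z and computes the polar factors from the singular value decomposition (an implementation choice of this illustration; the paper cites Ringrose [19] for the operator-theoretic construction).

```python
import numpy as np

# a hypothetical invertible operator standing in for Z in B(H)
Z = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [0.0, 1.0, 2.5]])

# From the SVD Z = U diag(s) Vh, the polar factors are
#   sqrt(ZZ*) = U diag(s) U*  (the unique positive square root of Z Z*)
#   P         = U Vh          (a partial isometry; unitary here since Z is invertible)
U, s, Vh = np.linalg.svd(Z)
sqrt_ZZstar = U @ np.diag(s) @ U.conj().T
P = U @ Vh

print(np.allclose(Z, sqrt_ZZstar @ P))     # Z = sqrt(ZZ*) P  -> True
print(np.allclose(P @ P.conj().T @ P, P))  # P P* P = P       -> True
```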

Theorem 2.1 ([4]). Let Z_0 be a bounded operator in B(H). If F : B(H) \to B(H) is a Lipschitz function, then there exist a positive number \varepsilon and a unique differentiable map Z : (-\varepsilon, \varepsilon) \to B(H) such that \dot{Z} = F(Z) and Z(0) = Z_0.

It is straightforward to check that Z \mapsto MZ - ZZ^{*}MZ is a Lipschitz function, and hence local existence and uniqueness of solutions of (2.1) follow from Theorem 2.1.

The technique employed in the solution scheme uses the well-known polar decomposition of bounded operators; see Ringrose [19]. A bounded operator Z can be written as the product of the hermitian operator \sqrt{ZZ^{*}} and a partial isometry P. The operator \sqrt{ZZ^{*}} is the unique positive square root of ZZ^{*}, and P satisfies the equation P P^{*} P = P. The bounded operator Z is decomposed as

    Z = \sqrt{ZZ^{*}}\, P.

This decomposition is unique and is called the polar decomposition of the operator Z.

In [4] we applied the polar decomposition to construct two new systems associated with (2.1); the solutions of these systems are the scalar and the polar components of the solution of the original system. For t \in (-\varepsilon, \varepsilon) we denote by Z(t) a local solution of (2.1) and set V(t) = Z(t)Z(t)^{*}. A straightforward calculation verifies that V(t) is a local solution of the system

    \dot{V} = MV + VM^{*} - VMV - VM^{*}V, \quad V(0) = Z_0 Z_0^{*}.   (2.2)

If M and Z_0 commute, the Fuglede-Putnam theorem (cf. Furuta [12, p. 67]) and Picard's iterative method (Hartman [15, pp. 62-66]) imply that the family \{V(t)\}_{t \in (-\varepsilon, \varepsilon)} defines a path of positive operators that commute with M and M^{*}. Furthermore, \{V(t)\}_{t \in (-\varepsilon, \varepsilon)} is a family of commuting operators. Thus system (2.2) can be written as

    \dot{V} = (M + M^{*})(V - V^{2}), \quad V(0) = V_0.

Since Z_0 is an invertible operator on H, V(t) is also invertible for t \in (-\varepsilon, \varepsilon), for some \varepsilon > 0; cf. [9]. We have

    \frac{d}{dt}(V^{-1}) = -V^{-2}\dot{V}.

Using the commutativity of M and V, we find that

    \frac{d}{dt}(V^{-1}) = (M + M^{*}) - (M + M^{*})V^{-1},

which is a first-order linear differential equation; see [1]. Standard integrating-factor techniques, appropriately generalized to infinite dimensions (see [4, p. 101]), imply that

    V(t) = [I + (V_0^{-1} - I)\exp(-(M + M^{*})t)]^{-1} \quad \text{for } t \in (-\varepsilon, \varepsilon).

We now derive the polar system associated with (2.1). This is a first-order, non-autonomous, linear differential equation. For simplicity of notation we set V^{1/2} = \sqrt{ZZ^{*}}. Using the commutativity of V and M, we obtain

    \dot{P} = -\tfrac{1}{2}(M - M^{*})(V(t) - I)P, \quad P(0) = P_0.   (2.3)
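As a sanity check on the closed form for V(t), one can compare it with direct numerical integration in the commuting case. The sketch below uses hypothetical data: a diagonal (hence normal) M and a diagonal positive V_0, so that (2.2) reduces to the scalar system \dot{V} = (M + M^{*})(V - V^{2}).

```python
import numpy as np

# hypothetical commuting data: diagonal normal M, diagonal positive V0
M = np.diag([1.0 + 2.0j, 2.0 - 1.0j, 0.5 + 0.5j])
V0 = np.diag([0.3, 0.6, 0.9]).astype(complex)
A = M + M.conj().T                        # M + M*; here real and diagonal

def closed_form(t):
    """V(t) = [I + (V0^{-1} - I) exp(-(M + M*) t)]^{-1}."""
    E = np.diag(np.exp(-np.diag(A) * t))  # exp(-At) for diagonal A
    I = np.eye(3)
    return np.linalg.inv(I + (np.linalg.inv(V0) - I) @ E)

def rhs(V):
    """Reduced scalar system Vdot = (M + M*)(V - V^2)."""
    return A @ (V - V @ V)

# classical RK4 integration up to t = 1
V, dt = V0.copy(), 1e-3
for _ in range(1000):
    k1 = rhs(V)
    k2 = rhs(V + dt / 2 * k1)
    k3 = rhs(V + dt / 2 * k2)
    k4 = rhs(V + dt * k3)
    V += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(np.max(np.abs(V - closed_form(1.0))))  # agreement up to integration error
```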

Properties of exponential operator-valued functions imply that

    P(t) = \exp\Big(-\frac{1}{2}\int_0^t (M - M^{*})(V(\xi) - I)\,d\xi\Big) P_0, \quad |t| < \varepsilon,

is a solution of the polar system (2.3). These considerations are summarized in the following theorem.

Theorem 2.2 ([4]). If Z_0 is invertible and commutes with the normal operator M, then there exist \varepsilon > 0 and a unique differentiable mapping Z : (-\varepsilon, \varepsilon) \to B(H) such that

    \dot{Z} = MZ - ZZ^{*}MZ, \quad Z(0) = Z_0,   (2.4)

if and only if Z(t) = V(t)^{1/2} P(t), with

    V(t) = [I + (V_0^{-1} - I)\exp(-(M + M^{*})t)]^{-1},
    P(t) = \exp\Big(-\frac{1}{2}\int_0^t (M - M^{*})(V(\xi) - I)\,d\xi\Big) P_0.

3. General Solution for the Cox-Adams Learning Model

We recall that the Cox-Adams learning model is

    \frac{dZ}{dt} = TMZ - ZZ^{*}MZ, \quad Z(0) = Z_0,   (3.1)

with T an invertible, positive, self-adjoint operator and M self-adjoint on H. Theorem 2.1 implies the local existence and uniqueness of solutions. Since T is positive and invertible, we may rewrite equation (3.1) as

    \frac{dZ}{dt} = (\sqrt{T}\sqrt{T})M\sqrt{T}(\sqrt{T})^{-1}Z - ZZ^{*}(\sqrt{T})^{-1}\sqrt{T}\,MZ,

equivalently,

    (\sqrt{T})^{-1}\frac{dZ}{dt} = \big(\sqrt{T} M \sqrt{T}\big)(\sqrt{T})^{-1}Z - (\sqrt{T})^{-1}ZZ^{*}(\sqrt{T})^{-1}\big(\sqrt{T} M \sqrt{T}\big)(\sqrt{T})^{-1}Z.

We set W = (\sqrt{T})^{-1}Z and S = \sqrt{T} M \sqrt{T}, and we observe that S is a hermitian operator. Then system (3.1) becomes

    \dot{W} = SW - WW^{*}SW, \quad W(0) = W_0,

where W_0 = (\sqrt{T})^{-1}Z_0.

Proposition 3.1. If W_0 is invertible and commutes with the hermitian operator S, then there exist \varepsilon > 0 and a unique differentiable mapping W : (-\varepsilon, \varepsilon) \to B(H) such that

    \dot{W} = SW - WW^{*}SW, \quad W(0) = W_0,   (3.2)

if and only if W(t) = V(t)^{1/2} P(t), with

    V(t) = [I + (V_0^{-1} - I)\exp(-2St)]^{-1},   (3.3)
    P(t) = P_0.   (3.4)
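The change of variables W = (\sqrt{T})^{-1}Z can be verified numerically at a single point. The sketch below uses hypothetical finite-dimensional choices of T, M, and Z and checks that the velocity (\sqrt{T})^{-1}\dot{Z} induced by (3.1) agrees exactly with SW - WW^{*}SW.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
T = B @ B.T + n * np.eye(n)          # invertible, positive, self-adjoint
Msym = rng.standard_normal((n, n))
M = (Msym + Msym.T) / 2              # self-adjoint
Z = rng.standard_normal((n, n))      # arbitrary state

# sqrt(T) from the spectral decomposition of the positive operator T
w, Q = np.linalg.eigh(T)
sqrtT = Q @ np.diag(np.sqrt(w)) @ Q.T
sqrtT_inv = np.linalg.inv(sqrtT)

S = sqrtT @ M @ sqrtT                # hermitian
W = sqrtT_inv @ Z                    # W = (sqrt T)^{-1} Z

Zdot = T @ M @ Z - Z @ Z.T @ M @ Z   # right-hand side of (3.1)
Wdot = sqrtT_inv @ Zdot              # induced velocity of W

# W obeys the Oja-type system (3.2): Wdot = SW - W W* S W
resid = np.max(np.abs(Wdot - (S @ W - W @ W.T @ S @ W)))
print(resid)                         # zero up to rounding
```

The agreement is exact in exact arithmetic, since (\sqrt{T})^{-1}(TMZ - ZZ^{*}MZ) expands term by term into SW - WW^{*}SW.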

Since the operator S is hermitian (S = S^{*}), the proof of Proposition 3.1 follows from Theorem 2.2.

Theorem 3.2. Let T be an invertible, positive, self-adjoint operator and M a hermitian operator. If Z_0 is invertible and commutes with M, then there exist \varepsilon > 0 and a unique differentiable mapping Z : (-\varepsilon, \varepsilon) \to B(H) such that

    \dot{Z} = TMZ - ZZ^{*}MZ, \quad Z(0) = Z_0,   (3.5)

if and only if Z(t) = \sqrt{T}\, V(t)^{1/2} P(t), with

    V(t) = \big[I + \big(\sqrt{T}(Z_0 Z_0^{*})^{-1}\sqrt{T} - I\big)\exp(-2\sqrt{T} M \sqrt{T}\, t)\big]^{-1}

and P(t) = P_0.

The proof of the above theorem follows from Proposition 3.1. The following lemma is used in the stability analysis of the Cox-Adams model.

Lemma 3.3 ([4]). If Z_0 is an invertible operator in B(H), M is a normal operator that commutes with Z_0, \|(Z_0 Z_0^{*})^{-1} - I\| < 1, and the spectrum of M is strictly positive, then there exists \varepsilon > 0 such that

    I + [(Z_0 Z_0^{*})^{-1} - I]\exp(-(M + M^{*})t)

is invertible on the interval (-\varepsilon, \infty) and

    \lim_{t \to \infty} \big[I + ((Z_0 Z_0^{*})^{-1} - I)\exp(-(M + M^{*})t)\big] = I.

As a result we have the following corollary.

Corollary 3.4. Let T be an invertible, positive, self-adjoint operator. If Z_0 is an invertible operator in B(H), M is a self-adjoint operator that commutes with Z_0, \|\sqrt{T}(Z_0 Z_0^{*})^{-1}\sqrt{T} - I\| < 1, and the spectrum of M is strictly positive, then there exists \varepsilon > 0 such that

    I + [\sqrt{T}(Z_0 Z_0^{*})^{-1}\sqrt{T} - I]\exp(-2\sqrt{T} M \sqrt{T}\, t)

is invertible on the interval (-\varepsilon, \infty) and

    \lim_{t \to \infty} \big[I + [\sqrt{T}(Z_0 Z_0^{*})^{-1}\sqrt{T} - I]\exp(-2\sqrt{T} M \sqrt{T}\, t)\big] = I.

Remark 3.5. We observe that, under the assumptions of Corollary 3.4, we have

    \lim_{t \to \infty} Z(t) = P_0.

This provides a filtering procedure that selects the polar component of the initial condition.

References

[1] M. Abell, J. Braselton; Modern Differential Equations, 2nd ed., Harcourt College Publishers, (2001), 36-421, 197-205.
[2] P. Adams; Hebb and Darwin, Journal of Theoretical Biology, 195 (1998), 419-438.
[3] S. Amari; Mathematical Theory of Neural Networks, Sangyo-Tosho, Tokyo, 1998.
[4] F. Botelho, A. Davis; Differential systems on spaces of bounded linear operators, Intl. J. of Pure and Applied Mathematics, 53 (2009), 95-107.
[5] F. Botelho, J. Jamison; Qualitative Behavior of Differential Equations Associated with Artificial Neural Networks, Journal of Dynamics and Differential Equations, 16 (2004), 179-204.

[6] J. Conway; A Course in Functional Analysis, Graduate Texts in Mathematics 96, Springer-Verlag, 1990.
[7] J. Dombrowski; Tridiagonal Matrix Representations of Cyclic Self-Adjoint Operators, Pacific Journal of Mathematics, 114, No. 2 (1984), 325-334.
[8] J. Dombrowski; Tridiagonal Matrix Representations of Cyclic Self-Adjoint Operators. II, Pacific Journal of Mathematics, 120, No. 1 (1985), 47-53.
[9] R. Douglas; Banach Algebra Techniques in Operator Theory, 2nd ed., Graduate Texts in Mathematics 179, Springer-Verlag, 1998.
[10] P. L. Duren; Extension of a Result of Beurling on Invariant Subspaces, Transactions of the American Mathematical Society, 99, No. 2 (1961), 320-324.
[11] P. L. Duren; Invariant Subspaces of Tridiagonal Operators, Duke Math. J., 30, No. 2 (1963), 239-248.
[12] T. Furuta; Invitation to Linear Operators, Taylor & Francis, 2001.
[13] S. Haykin; Neural Networks: A Comprehensive Foundation, Macmillan College Publ. Co., 1994.
[14] K. Cox, P. Adams; Formation of New Connections by a Generalisation of Hebbian Learning, preprint, 2001.
[15] P. Hartman; Ordinary Differential Equations, Wiley, 1964.
[16] J. Hertz, A. Krogh, R. Palmer; Introduction to the Theory of Neural Computation, A Lecture Notes Volume, Santa Fe Institute Studies in the Sciences of Complexity, 1991.
[17] E. Oja; A Simplified Neuron Model as a Principal Component Analyzer, J. of Math. Biology, 15 (1982), 267-273.
[18] E. Oja, J. Karhunen; A Stochastic Approximation of the Eigenvectors and Eigenvalues of the Expectation of a Random Matrix, J. of Math. Analysis and Appl., 106 (1985), 69-84.
[19] J. Ringrose; Compact Non-Self-Adjoint Operators, Van Nostrand Reinhold Mathematical Studies 35, 1994.
Fernanda Botelho
The University of Memphis, Mathematical Sciences Dept., Dunn Hall 373, 3721 Norriswood St., Memphis, TN 38152-3240, USA
E-mail address: mbotelho@memphis.edu

Annita Davis
The University of Memphis, Mathematical Sciences Dept., Dunn Hall 373, 3721 Norriswood St., Memphis, TN 38152-3240, USA
E-mail address: adavis2@memphis.edu