Mean-field equations for higher-order quantum statistical models: an information geometric approach

N. Yapage (nihal@maths.ruh.ac.lk)
Department of Mathematics, University of Ruhuna, Matara, Sri Lanka.

arXiv:1202.5726v1 [quant-ph] 26 Feb 2012

February 28, 2012

Abstract

This work is a simple extension of [1]. We apply the concepts of information geometry to study the mean-field approximation for a general class of quantum statistical models, namely the higher-order quantum Boltzmann machines (QBMs). The states we consider are assumed to have at most third-order interactions with deterministic coupling coefficients. Such states, taken together, form a quantum exponential family and can thus be viewed as a smooth manifold. We explicitly obtain the naive mean-field equations for third-order classical and quantum Boltzmann machines and demonstrate how the information geometrical concepts, in particular the exponential and mixture projections used to study the naive mean-field approximation in [1], extend to this more general case. Although our results do not differ much from those of [1], they underline the validity and usefulness of the information geometrical point of view for higher-dimensional classical and quantum statistical models.

Keywords: mean-field theory, quantum statistical model, information geometry, quantum relative entropy, quantum exponential family

1 Introduction

The mean-field approximation uses a simple, tractable family of density operators to calculate quantities related to a complex density operator that includes mutual interactions. Information geometry, on the other hand, studies the intrinsic geometrical structure of the manifold of density operators [2]. Many authors have applied the mean-field approximation to classical statistical models such as classical Boltzmann machines (CBMs) [3] and have discussed its properties from the information geometrical point of view [4, 5]. In this work, we apply mean-field theory to third-order CBMs and QBMs and derive the naive mean-field equations using information geometrical concepts.

2 Information geometry of mean-field approximation for third-order CBMs

Let us consider a network of $n$ elements numbered $1,2,\dots,n$. Let the value of each element $i \in \{1,2,\dots,n\}$ be $x_i \in \{-1,+1\}$. Then a state of the network can be represented

as $x = (x_1,x_2,\dots,x_n) \in \{-1,+1\}^n$. Each element $i \in \{1,2,\dots,n\}$ carries a threshold value $h_i \in \mathbb{R}$. The network also has a real-valued parameter $w_{ij}$ for each pair of elements $\{i,j\}$, called the coupling coefficient between $i$ and $j$; these parameters are assumed to satisfy $w_{ij} = w_{ji}$ and $w_{ii} = 0$. The remaining real-valued parameter $v_{ijk}$ is symmetric in all of its indices. The equilibrium (stationary) distribution is the probability distribution of the form

$$p(x; h,w,v) = \exp\Big\{\sum_i h_i x_i + \sum_{i<j} w_{ij} x_i x_j + \sum_{i<j<k} v_{ijk} x_i x_j x_k - \psi(h,w,v)\Big\} \qquad (1)$$

with

$$\psi(h,w,v) = \log \sum_x \exp\Big\{\sum_i h_i x_i + \sum_{i<j} w_{ij} x_i x_j + \sum_{i<j<k} v_{ijk} x_i x_j x_k\Big\}, \qquad (2)$$

where $x = (x_1,\dots,x_n) \in \{-1,+1\}^n$. Thus, noting that the correspondence $p_{h,w,v} \leftrightarrow (h,w,v)$ is one-to-one, we can, at least mathematically, identify each third-order CBM [6] with its equilibrium probability distribution. Many good properties of such networks are consequences of the fact that the equilibrium distributions form an exponential family. Here we briefly discuss this important aspect of the CBM [7].

Let $X$ be a finite set or, more generally, a measurable space with an underlying measure $d\mu$. We denote the set of positive probability distributions (probability mass functions for a finite $X$, probability density functions for a general $(X,d\mu)$) on $X$ by $P = P(X)$. When a family of distributions

$$M = \{p_\theta \mid \theta = (\theta^i);\ i = 1,\dots,n\} \subset P \qquad (3)$$

is represented in the form

$$p_\theta(x) = \exp\Big\{c(x) + \sum_{i=1}^n \theta^i f_i(x) - \psi(\theta)\Big\}, \quad x \in X, \qquad (4)$$

$M$ is called an exponential family. Here the $\theta^i$, $i = 1,\dots,n$, are real-valued parameters, $c$ and the $f_i$ are functions on $X$, and $\psi(\theta)$ is a real-valued convex function. Further, we assume that the correspondence $\theta \mapsto p_\theta$ is one-to-one. The $\theta = (\theta^i)$ are called the natural coordinates of $M$. Now, for the exponential family $M$, if we let $\eta_i(\theta) := E_\theta[f_i] = \sum_x p_\theta(x) f_i(x)$, then $\eta = (\eta_i)$ and $\theta = (\theta^i)$ are in one-to-one correspondence; that is, we can also use $\eta$ instead of $\theta$ to specify an element of $M$. The $(\eta_i)$ are called the expectation coordinates of $M$ and are given in general by

$$\eta_i = \partial_i \psi(\theta) \qquad \Big(\partial_i := \frac{\partial}{\partial\theta^i}\Big). \qquad (5)$$

The set of equilibrium probability distributions of the third-order CBM (1) is one example of an exponential family. The threshold values, coupling coefficients (weights) and third-order weights are the natural coordinates, while $E_\theta[x_i]$, $E_\theta[x_i x_j]$ and $E_\theta[x_i x_j x_k]$ are the expectation coordinates. The notion of exponential family is very important in statistics and information geometry, and is also useful in studying the properties of third-order CBMs and their mean-field approximations.
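To make the role of $\psi$ and the expectation coordinates concrete, here is a minimal numerical sketch (assuming NumPy; the array layout for h, w, v is our own convention, not the paper's) that evaluates (1), (2) and the expectation coordinates of a small third-order CBM by brute-force enumeration. The $2^n$-term sum in (2) is exactly the computation that becomes intractable for large $n$.

```python
import itertools
import numpy as np

n = 4
rng = np.random.default_rng(0)
h = rng.normal(size=n)                            # thresholds h_i
w = np.triu(rng.normal(size=(n, n)), 1)           # couplings w_ij, stored for i < j
v = rng.normal(size=(n, n, n))                    # third-order weights v_ijk, used for i < j < k

def exponent(x):
    """The exponent of (1) without the normalization psi(h, w, v)."""
    e = h @ x
    for i, j in itertools.combinations(range(n), 2):
        e += w[i, j] * x[i] * x[j]
    for i, j, k in itertools.combinations(range(n), 3):
        e += v[i, j, k] * x[i] * x[j] * x[k]
    return e

states = [np.array(s) for s in itertools.product([-1, +1], repeat=n)]
logits = np.array([exponent(x) for x in states])
psi = np.log(np.exp(logits).sum())                # log-partition function, eq. (2)
p = np.exp(logits - psi)                          # equilibrium distribution, eq. (1)

# expectation coordinates eta_i = E[x_i]; note the 2^n cost of the enumeration
m_exact = sum(pi * x for pi, x in zip(p, states))
print(psi, m_exact)
```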

We now consider a hierarchy of exponential families. Let $P_r$ be the set of $r$th-order CBMs. Then $P_r$ also turns out to be an exponential family, and we have the hierarchical structure $P_1 \subset P_2 \subset \cdots \subset P_r \subset \cdots$. In particular, $P_1$, $P_2$ and $P_3$ can be represented as

$$P_1 = \{p_1(x_1)\cdots p_n(x_n)\} = \{\text{product distributions}\},$$
$$P_2 = \{\text{equilibrium distributions of CBMs}\} \qquad (6)$$

and

$$P_3 = \{\text{equilibrium distributions of third-order CBMs}\}, \qquad (7)$$

respectively.

We now derive the naive mean-field equation for the third-order CBM. When the system size is large, the partition function $\exp(\psi(h,w,v))$ is very difficult to calculate, and the explicit calculation of the expectations $m_i$ is therefore intractable. This difficulty leads us to seek a good approximation of $m_i$ for a given probability distribution $p_{h,w,v} \in P_3$.

First, we consider the subspace $P_1$ of $P_3$. We parametrize each distribution in $P_1$ by $\bar h$ and write

$$p_{\bar h}(x) = \exp\Big\{\sum_i \bar h_i x_i - \bar\psi(\bar h)\Big\}, \qquad (8)$$

where $\bar\psi(\bar h) = \sum_i \log\{\exp(\bar h_i) + \exp(-\bar h_i)\}$. Then $P_1$ forms a submanifold of $P_3$ specified by $w_{ij} = 0 = v_{ijk}$, with the $\bar h_i$ as its coordinates. The expectations $\bar m_i := E_{\bar h}[x_i]$ form another coordinate system of $P_1$. For a given $p_{\bar h} \in P_1$, it is easy to obtain $\bar m_i = E_{\bar h}[x_i]$ from $\bar h_i$ because the $x_i$ are independent. We can calculate $\bar m_i$ to be

$$\bar m_i = \frac{\partial\bar\psi(\bar h)}{\partial\bar h_i} = \frac{\exp(\bar h_i) - \exp(-\bar h_i)}{\exp(\bar h_i) + \exp(-\bar h_i)} = \tanh(\bar h_i), \qquad (9)$$

from which we obtain

$$\bar h_i = \frac{1}{2}\log\Big(\frac{1+\bar m_i}{1-\bar m_i}\Big). \qquad (10)$$

The simple idea behind the mean-field approximation for a $p_{h,w,v} \in P_3$ is to use quantities obtained in the form of expectations with respect to some relevant $p_{\bar h} \in P_1$. We therefore need a suitable criterion to measure how well a distribution $p_\theta \in M$ approximates $q \in P$. For the present purpose, we adopt the Kullback–Leibler (KL) divergence (relative entropy)

$$D(q \,\|\, p_\theta) := \sum_x q(x) \log\frac{q(x)}{p_\theta(x)}. \qquad (11)$$

Given $p_{h,w,v} \in P_3$, its e-(exponential) and m-(mixture) projections (see [2]) onto $P_1$ are defined by

$$p^{(e)} := p_{\bar h^{(e)}} = \mathop{\mathrm{argmin}}_{p_{\bar h} \in P_1} D(p_{\bar h} \,\|\, p_\theta) \qquad (12)$$

and

$$p^{(m)} := p_{\bar h^{(m)}} = \mathop{\mathrm{argmin}}_{p_{\bar h} \in P_1} D(p_\theta \,\|\, p_{\bar h}), \qquad (13)$$

respectively, where

$$\bar h^{(e)} = \mathop{\mathrm{argmin}}_{\bar h = (\bar h_i)} D(p_{\bar h} \,\|\, p_\theta) \qquad (14)$$

and

$$\bar h^{(m)} = \mathop{\mathrm{argmin}}_{\bar h = (\bar h_i)} D(p_\theta \,\|\, p_{\bar h}). \qquad (15)$$

As necessary conditions, we have

$$\frac{\partial}{\partial\bar h_i} D(p_{\bar h} \,\|\, p_\theta) = 0 \qquad (16)$$

and

$$\frac{\partial}{\partial\bar h_i} D(p_\theta \,\|\, p_{\bar h}) = 0, \qquad (17)$$

which are weaker than (14) and (15). Sometimes, however, (16) and (17) are taken as the definitions of the e- and m-projections for convenience. It can be shown that the m-projection $p^{(m)}$ gives the true values of the expectations, that is, $\bar m_i = m_i$, or $E_\theta[x_i] = E_{\bar h}[x_i]$ for $p_{\bar h} = p^{(m)}$. The e-projection $p^{(e)}$ from $P_3$ onto $P_1$ gives the naive mean-field approximation for the third-order CBM.

Now we derive the naive mean-field equation for the third-order CBM following [4]. Recall that the equilibrium distribution of the third-order CBM is given by

$$p := p(x; h,w,v) = \exp\Big\{\sum_i h_i x_i + \sum_{i<j} w_{ij} x_i x_j + \sum_{i<j<k} v_{ijk} x_i x_j x_k - \psi(h,w,v)\Big\} \qquad (18)$$

with

$$\psi(h,w,v) = \log\sum_x \exp\Big\{\sum_i h_i x_i + \sum_{i<j} w_{ij} x_i x_j + \sum_{i<j<k} v_{ijk} x_i x_j x_k\Big\}, \qquad (19)$$

where $x = (x_1,\dots,x_n) \in \{-1,+1\}^n$. Now we define another function

$$\varphi(p) = \varphi(h,w,v) := \sum_{i<j<k} v_{ijk} E_p[x_i x_j x_k] + \sum_{i<j} w_{ij} E_p[x_i x_j] + \sum_i h_i E_p[x_i] - \psi(p), \qquad (20)$$

which coincides with the negative entropy

$$\varphi(p) = \sum_x p(x) \log p(x). \qquad (21)$$

In particular, for a product distribution $p_{\bar h} \in P_1$, using $E_{\bar h}[x_i] = \bar m_i$, we have

$$\varphi(p_{\bar h}) = \sum_i \Big[\Big(\frac{1+\bar m_i}{2}\Big)\log\Big(\frac{1+\bar m_i}{2}\Big) + \Big(\frac{1-\bar m_i}{2}\Big)\log\Big(\frac{1-\bar m_i}{2}\Big)\Big]. \qquad (22)$$

The KL divergence between $p \in P_3$ and $p_{\bar h} \in P_1$ can then be expressed in the following form, where we use the fact that expectations of products factorize under the product distribution $p_{\bar h}$:

$$D(p_{\bar h} \,\|\, p) = \psi(p) + \varphi(p_{\bar h}) - \sum_{i<j<k} v_{ijk} E_{\bar h}[x_i x_j x_k] - \sum_{i<j} w_{ij} E_{\bar h}[x_i x_j] - \sum_i h_i E_{\bar h}[x_i]$$
$$= \psi(p) + \varphi(p_{\bar h}) - \sum_{i<j<k} v_{ijk}\,\bar m_i \bar m_j \bar m_k - \sum_{i<j} w_{ij}\,\bar m_i \bar m_j - \sum_i h_i \bar m_i$$
$$= \psi(p) + \frac{1}{2}\sum_i \Big[(1+\bar m_i)\log\Big(\frac{1+\bar m_i}{2}\Big) + (1-\bar m_i)\log\Big(\frac{1-\bar m_i}{2}\Big)\Big] - \sum_{i<j<k} v_{ijk}\,\bar m_i \bar m_j \bar m_k - \sum_{i<j} w_{ij}\,\bar m_i \bar m_j - \sum_i h_i \bar m_i. \qquad (23)$$

Now consider the e-projection (16) of $p \in P_3$ onto $p_{\bar h} \in P_1$, i.e.

$$\frac{\partial}{\partial\bar h_i} D(p_{\bar h} \,\|\, p) = 0. \qquad (24)$$

Noting that $\bar h$ and $\bar m$ are in one-to-one correspondence, we may consider instead

$$\frac{\partial}{\partial\bar m_i} D(p_{\bar h} \,\|\, p) = 0. \qquad (25)$$

Since $\psi(p)$ does not depend on $\bar m_i$, we obtain from (23)

$$0 = \frac{1}{2}\log\Big(\frac{1+\bar m_i}{1-\bar m_i}\Big) - \sum_{\substack{j<k \\ j,k \neq i}} v_{ijk}\,\bar m_j \bar m_k - \sum_{j \neq i} w_{ij}\,\bar m_j - h_i = \bar h_i - \sum_{\substack{j<k \\ j,k \neq i}} v_{ijk}\,\bar m_j \bar m_k - \sum_{j \neq i} w_{ij}\,\bar m_j - h_i, \qquad (26)$$

where the second equality follows from (10). Thus the naive mean-field equation is obtained from (9) and (26) as

$$\tanh^{-1}(\bar m_i) = \sum_{\substack{j<k \\ j,k \neq i}} v_{ijk}\,\bar m_j \bar m_k + \sum_{j \neq i} w_{ij}\,\bar m_j + h_i, \qquad (27)$$

which is usually written in the form

$$\bar m_i = \tanh\Big(\sum_{\substack{j<k \\ j,k \neq i}} v_{ijk}\,\bar m_j \bar m_k + \sum_{j \neq i} w_{ij}\,\bar m_j + h_i\Big). \qquad (28)$$
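In practice, a fixed point of (28) is found by iterating the equation. Below is a minimal sketch under the same conventions as the enumeration snippet above (NumPy; the damping factor is our own stabilization heuristic, not part of the paper). Its output can be compared against m_exact from that snippet to gauge the quality of the naive approximation.

```python
import itertools
import numpy as np

def naive_mean_field(h, w, v, iters=500, damping=0.5):
    """Iterate the naive mean-field equation (28) to a fixed point."""
    n = len(h)
    m = np.zeros(n)
    for _ in range(iters):
        field = h.copy()
        # pairwise terms: sum over j != i of w_ij m_j (w stored for i < j)
        for i, j in itertools.combinations(range(n), 2):
            field[i] += w[i, j] * m[j]
            field[j] += w[i, j] * m[i]
        # third-order terms: sum over j < k, both != i, of v_ijk m_j m_k
        for i, j, k in itertools.combinations(range(n), 3):
            field[i] += v[i, j, k] * m[j] * m[k]
            field[j] += v[i, j, k] * m[i] * m[k]
            field[k] += v[i, j, k] * m[i] * m[j]
        m = (1 - damping) * m + damping * np.tanh(field)  # eq. (28), damped
    return m

# print(naive_mean_field(h, w, v), m_exact)   # compare with the exact enumeration
```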

3 Definition of third-order quantum Boltzmann machines

Let us consider an $n$-element system of quantum spin-half particles. Each element is represented as a quantum spin with local Hilbert space $\mathbb{C}^2$, and the $n$-element system corresponds to $\mathcal{H} = (\mathbb{C}^2)^{\otimes n} \cong \mathbb{C}^{2^n}$. Let $S$ be the set of strictly positive states on $\mathcal{H}$:

$$S = \{\rho \mid \rho = \rho^\dagger > 0 \text{ and } \mathrm{Tr}\,\rho = 1\}. \qquad (29)$$

Here each $\rho$ is a $2^n \times 2^n$ matrix; $\rho = \rho^\dagger > 0$ means that $\rho$ is Hermitian and positive definite, and $\mathrm{Tr}\,\rho = 1$ means that the trace of the density matrix $\rho$ is unity. An element of $S$ is said to have at most $r$th-order interactions if it can be written as

$$\rho_\theta = \exp\Big\{\sum_{i}\sum_{s} \theta^{(1)}_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} \theta^{(2)}_{ijst}\sigma_{is}\sigma_{jt} + \cdots + \sum_{i_1<\cdots<i_r}\sum_{s_1\ldots s_r} \theta^{(r)}_{i_1\ldots i_r s_1\ldots s_r}\sigma_{i_1 s_1}\cdots\sigma_{i_r s_r} - \psi(\theta)\Big\}$$
$$= \exp\Big\{\sum_{j=1}^r \sum_{i_1<\cdots<i_j}\sum_{s_1\ldots s_j} \theta^{(j)}_{i_1\ldots i_j s_1\ldots s_j}\sigma_{i_1 s_1}\cdots\sigma_{i_j s_j} - \psi(\theta)\Big\} \qquad (30)$$

with

$$\psi(\theta) = \log\mathrm{Tr}\exp\Big\{\sum_{j=1}^r \sum_{i_1<\cdots<i_j}\sum_{s_1\ldots s_j} \theta^{(j)}_{i_1\ldots i_j s_1\ldots s_j}\sigma_{i_1 s_1}\cdots\sigma_{i_j s_j}\Big\}, \qquad (31)$$

where $\sigma_{is} = I^{\otimes(i-1)} \otimes \sigma_s \otimes I^{\otimes(n-i)}$ and $\theta = (\theta^{(j)}_{i_1\ldots i_j s_1\ldots s_j})$. Here $I$ is the identity matrix on $\mathbb{C}^2$ and the $\sigma_s$ for $s \in \{1,2,3\}$ are the usual Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Letting $S_r$ be the totality of states $\rho_\theta$ of the above form, we have the hierarchy $S_1 \subset S_2 \subset S_3 \subset \cdots \subset S_n = S$. Note that $S_1$ is the set of product states $\rho_1 \otimes \rho_2 \otimes \cdots \otimes \rho_n$. Corresponding to the classical case, an element of $S_2$ is called a QBM (see [1]). The third-order quantum Boltzmann machines are the elements of $S_3$, and those states can be written explicitly as

$$\rho_{h,w,v} = \exp\Big\{\sum_i\sum_s h_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} w_{ijst}\sigma_{is}\sigma_{jt} + \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\sigma_{is}\sigma_{jt}\sigma_{ku} - \psi(h,w,v)\Big\} \qquad (32)$$

with

$$\psi(h,w,v) = \log\mathrm{Tr}\exp\Big\{\sum_i\sum_s h_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} w_{ijst}\sigma_{is}\sigma_{jt} + \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\sigma_{is}\sigma_{jt}\sigma_{ku}\Big\}, \qquad (33)$$

where $h = (h_{is})$, $w = (w_{ijst})$ and $v = (v_{ijkstu})$.

4 Some information geometrical concepts for quantum systems

In this section we discuss some information geometrical concepts for quantum systems [2]. Let us consider a manifold $S$ of density operators and a submanifold $M$ of $S$. We define a quantum divergence function from $\rho \in S$ to $\sigma \in S$, which in this case turns out to be the quantum relative entropy and its reverse:

$$D^{(-1)}(\rho \,\|\, \sigma) := \mathrm{Tr}[\rho(\log\rho - \log\sigma)]; \qquad D^{(+1)}(\rho \,\|\, \sigma) := \mathrm{Tr}[\sigma(\log\sigma - \log\rho)]. \qquad (34)$$

The quantum relative entropy satisfies $D^{(\pm 1)}(\rho \,\|\, \sigma) \geq 0$, with $D^{(\pm 1)}(\rho \,\|\, \sigma) = 0$ iff $\rho = \sigma$, but it is not symmetric. Given $\rho \in S$, the point $\tau^{(\pm 1)} \in M$ is called the e-, m-projection of $\rho$ onto $M$ when the function $\tau \mapsto D^{(\pm 1)}(\rho \,\|\, \tau)$, $\tau \in M$, takes a critical value at $\tau^{(\pm 1)}$, that is,

$$\frac{\partial}{\partial\xi} D^{(\pm 1)}(\rho \,\|\, \tau(\xi)) = 0 \quad \text{at } \tau = \tau^{(\pm 1)}, \qquad (35)$$

where $\xi$ is a coordinate system of $M$. In particular, the minimizer of $D^{(\pm 1)}(\rho \,\|\, \tau)$ over $\tau \in M$ is a $\pm 1$-projection of $\rho$ onto $M$.

Next we introduce a quantum version of the exponential family (4). Suppose that a parametric family

$$M = \{\rho_\theta \mid \theta = (\theta^i);\ i = 1,\dots,m\} \subset S \qquad (36)$$

is represented in the form

$$\rho_\theta = \exp\Big\{C + \sum_{i=1}^m \theta^i F_i - \psi(\theta)\Big\}, \qquad (37)$$

where $C$ and the $F_i$ ($i = 1,\dots,m$) are Hermitian operators and $\psi(\theta)$ is a real-valued function. We assume in addition that the operators $\{F_1,\dots,F_m,I\}$, where $I$ is the identity operator, are linearly independent, to ensure that the parametrization $\theta \mapsto \rho_\theta$ is one-to-one. Then $M$ forms an $m$-dimensional smooth manifold with coordinate system $\theta = (\theta^i)$. In this work, we call such an $M$ a quantum exponential family, or QEF for short, with natural coordinates $\theta = (\theta^i)$. Note also that for any $1 \leq k \leq n$ the set $S_k$ of states (30) forms a QEF, including $S_1$ of product states, $S_2$ of QBMs and $S_3$ of third-order QBMs.
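As a concrete illustration of the objects just defined, the following sketch (assuming NumPy and SciPy; the coefficients are toy values of our own choosing) builds the operators $\sigma_{is}$ as Kronecker products, forms a small state of the form (32), and evaluates the quantum relative entropy (34):

```python
import numpy as np
from scipy.linalg import expm, logm

# Pauli matrices sigma_1, sigma_2, sigma_3
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)

def sigma_is(n, i, s):
    """sigma_is = I x ... x sigma_s x ... x I, acting on site i (0-based) of n sites."""
    op = np.eye(1, dtype=complex)
    for site in range(n):
        op = np.kron(op, sigma[s] if site == i else I2)
    return op

def relative_entropy(rho, tau):
    """Quantum relative entropy D^(-1)(rho || tau) = Tr[rho(log rho - log tau)], eq. (34)."""
    return np.trace(rho @ (logm(rho) - logm(tau))).real

# a toy state of the form (32) with n = 3 and a few nonzero coefficients
n = 3
H = (0.3 * sigma_is(n, 0, 2)
     + 0.5 * sigma_is(n, 0, 0) @ sigma_is(n, 1, 0)
     + 0.2 * sigma_is(n, 0, 0) @ sigma_is(n, 1, 1) @ sigma_is(n, 2, 2))
rho = expm(H)
rho /= np.trace(rho).real          # dividing by Tr e^H implements the -psi(h,w,v) term

tau = np.eye(2**n, dtype=complex) / 2**n   # the maximally mixed product state
print(relative_entropy(rho, tau))
```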

If we let

$$\eta_i(\theta) := \mathrm{Tr}[\rho_\theta F_i], \qquad (38)$$

then $\eta = (\eta_i)$ and $\theta = (\theta^i)$ are in one-to-one correspondence; that is, we can also use $\eta$ instead of $\theta$ to specify an element of $M$. The $(\eta_i)$ are called the expectation coordinates of $M$. In particular, the natural coordinates of $S_3$ are $(h,w,v) = (h_{is}, w_{ijst}, v_{ijkstu})$ in (32), while the expectation coordinates are $(m,\mu,\iota) = (m_{is}, \mu_{ijst}, \iota_{ijkstu})$ defined by

$$m_{is} = \mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}], \quad \mu_{ijst} = \mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}\sigma_{jt}] \quad \text{and} \quad \iota_{ijkstu} = \mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}\sigma_{jt}\sigma_{ku}]. \qquad (39)$$

On the other hand, the natural coordinates of $S_1$ are $\bar h = (\bar h_{is})$ in (43) below, while the expectation coordinates are $\bar m = (\bar m_{is})$ defined by

$$\bar m_{is} = \mathrm{Tr}[\tau_{\bar h}\,\sigma_{is}]. \qquad (40)$$

In this case, the correspondence between the two coordinate systems can be represented explicitly as

$$\bar m_{is} = \frac{\partial\psi_i(\bar h_i)}{\partial\bar h_{is}} = \frac{\bar h_{is}}{\|\bar h_i\|}\tanh(\|\bar h_i\|) \qquad (41)$$

or as

$$\bar h_{is} = \frac{\bar m_{is}}{\|\bar m_i\|}\tanh^{-1}(\|\bar m_i\|), \qquad (42)$$

where $\|\bar m_i\| := \sqrt{\sum_s (\bar m_{is})^2}$.

5 Information geometry of mean-field approximation for third-order QBMs

5.1 The submanifold of product states and its geometry

In this section, we briefly discuss the set $S_1$. The elements of $S_1$ are obtained by setting $w = 0$ and $v = 0$ in (32). In the sequel, we write them as

$$\tau_{\bar h} = \exp\Big\{\sum_i\sum_s \bar h_{is}\sigma_{is} - \bar\psi(\bar h)\Big\} \qquad (43)$$

using the new symbols $\tau$ and $\bar h = (\bar h_{is})$ when we wish to make it clear that we are treating $S_1$ instead of $S_3$. We have

$$\tau_{\bar h} = \bigotimes_{i=1}^n \exp\Big\{\sum_s \bar h_{is}\sigma_s - \psi_i(\bar h_i)\Big\}, \qquad (44)$$

where $\bar h_i = (\bar h_{is})_s$ and

$$\psi_i(\bar h_i) = \log\mathrm{Tr}\exp\Big\{\sum_s \bar h_{is}\sigma_s\Big\} = \log\{\exp(\|\bar h_i\|) + \exp(-\|\bar h_i\|)\} \qquad (45)$$

with $\|\bar h_i\| := \sqrt{\sum_s (\bar h_{is})^2}$. Note that

$$\bar\psi(\bar h) = \sum_i \psi_i(\bar h_i). \qquad (46)$$
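Equation (41) is easy to check numerically for a single tensor factor of (44). A minimal verification sketch (NumPy/SciPy; the chosen value of $\bar h_i$ is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

hbar_i = np.array([0.3, -0.7, 0.5])          # the vector (hbar_is)_s of one factor
tau_i = expm(sum(hs * s for hs, s in zip(hbar_i, sigma)))
tau_i /= np.trace(tau_i).real                 # one factor of the product state (44)

norm = np.linalg.norm(hbar_i)                 # ||hbar_i||
for s in range(3):
    lhs = np.trace(tau_i @ sigma[s]).real     # mbar_is = Tr[tau sigma_s], eq. (40)
    rhs = hbar_i[s] / norm * np.tanh(norm)    # closed form, eq. (41)
    print(lhs, rhs)                           # the two columns agree
```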

5.2 The exponential and mixture projections and mean-field approximation

In this section, we derive the naive mean-field equation for third-order QBMs explicitly from the viewpoint of information geometry. Suppose that we are interested in calculating the expectations $m_{is} = \mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}]$ from given $(h,w,v) = (h_{is}, w_{ijst}, v_{ijkstu})$. Since the direct calculation is intractable in general when the system size is large, we need to employ a computationally efficient approximation method, and mean-field approximation is a well-known technique for this purpose. The simple idea behind the mean-field approximation for a $\rho_{h,w,v} \in S_3$ is to use quantities obtained in the form of expectations with respect to some relevant $\tau_{\bar h} \in S_1$. T. Tanaka [4] elucidated the essence of the naive mean-field approximation for classical spin models in terms of e- and m-projections. Our aim is to extend this idea to quantized spin models beyond the one considered in [1].

In the following arguments, we regard $S_3$ as a QEF with natural coordinates $(\theta^\alpha) = (h_{is}, w_{ijst}, v_{ijkstu})$ and expectation coordinates $(\eta_\alpha) = (m_{is}, \mu_{ijst}, \iota_{ijkstu})$, where $\alpha$ denotes a multi-index $\alpha = (i,s)$, $\alpha = (i,j,s,t)$ or $\alpha = (i,j,k,s,t,u)$. We follow a slightly different route from that of the classical setting to obtain the naive mean-field equation for third-order QBMs. Recall that the state of the third-order QBM (32) is given by

$$\rho_{h,w,v} = \exp\Big\{\sum_i\sum_s h_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} w_{ijst}\sigma_{is}\sigma_{jt} + \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\sigma_{is}\sigma_{jt}\sigma_{ku} - \psi(h,w,v)\Big\} \qquad (47)$$

with

$$\psi(h,w,v) = \log\mathrm{Tr}\exp\Big\{\sum_i\sum_s h_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} w_{ijst}\sigma_{is}\sigma_{jt} + \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\sigma_{is}\sigma_{jt}\sigma_{ku}\Big\}, \qquad (48)$$

where $h = (h_{is})$, $w = (w_{ijst})$ and $v = (v_{ijkstu})$. Given $\rho_{h,w,v} \in S_3$, its e-($+1$) and m-($-1$) projections (see [2]) onto $S_1$ are defined by

$$\tau^{(\pm 1)} = \tau_{\bar h^{(\pm 1)}} := \mathop{\mathrm{argmin}}_{\tau_{\bar h} \in S_1} D^{(\pm 1)}(\rho_{h,w,v} \,\|\, \tau_{\bar h}). \qquad (49)$$

We denote by $\bar m^{(\pm 1)}_{is}[\rho_{h,w,v}]$ the expectation of $\sigma_{is}$ with respect to $\tau^{(\pm 1)}$, that is, $\mathrm{Tr}[\tau^{(\pm 1)}\sigma_{is}]$. Then $\bar m^{(\pm 1)}_{is}[\rho_{h,w,v}]$ is determined by

$$\frac{\partial}{\partial\bar m_{is}} D^{(\pm 1)}(\rho_{h,w,v} \,\|\, \tau_{\bar h}) = 0. \qquad (50)$$

From the information geometrical point of view, $\tau^{(\pm 1)} \in S_1$ is the $\pm 1$-geodesic projection of $\rho_{h,w,v}$ onto $S_1$, in the sense that the $\pm 1$-geodesic connecting $\rho_{h,w,v}$ and $\tau$ is orthogonal to $S_1$ at $\tau = \tau^{(\pm 1)}$. We know that $\tau^{(-1)}$ is the m-projection of $\rho_{h,w,v}$ onto $S_1$, and it yields $\bar m^{(-1)}_{is} = m_{is}[\rho_{h,w,v}]$, which is the quantity we want to obtain. This relation can be calculated directly by solving

$$\frac{\partial}{\partial\bar m_{is}} D^{(-1)}(\rho_{h,w,v} \,\|\, \tau_{\bar h}) = 0, \qquad (51)$$

because this is equivalent to

$$\frac{\partial}{\partial\bar m_{is}} \mathrm{Tr}[\rho_{h,w,v}(\log\rho_{h,w,v} - \log\tau)] = -\frac{\partial}{\partial\bar m_{is}} \mathrm{Tr}[\rho_{h,w,v}\log\tau] = 0. \qquad (52)$$

Hence $\bar m^{(-1)}_{is} = \mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}]$, which is the quantity we have been searching for. But we cannot calculate $\mathrm{Tr}[\rho_{h,w,v}\,\sigma_{is}]$ explicitly, owing to the difficulty of computing $\psi(h,w,v)$ for $\rho_{h,w,v}$.
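For small $n$, the exact expectation coordinates can nevertheless be computed by brute force, which gives a baseline for judging the mean-field solution derived below. A minimal sketch (NumPy/SciPy, reusing sigma_is from the earlier snippet; the argument H is the exponent of (47) without the $-\psi$ term):

```python
import numpy as np
from scipy.linalg import expm

def exact_m(n, H):
    """Exact m_is = Tr[rho sigma_is] for rho = e^H / Tr e^H; cost grows as 2^n."""
    rho = expm(H)
    rho /= np.trace(rho).real
    return np.array([[np.trace(rho @ sigma_is(n, i, s)).real for s in range(3)]
                     for i in range(n)])
```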

If we use the e-projection of $\rho \in S_3$ onto $S_1$ instead of the m-projection, we obtain the naive mean-field approximation just as in the classical case. To show this, we calculate the e-projection ($+1$-projection) of $\rho$ onto $S_1$. We have

$$D^{(+1)}(\rho \,\|\, \tau) = D^{(-1)}(\tau \,\|\, \rho) = \mathrm{Tr}[\tau(\log\tau - \log\rho)] \qquad (53)$$
$$= \mathrm{Tr}\Big[\tau\Big\{\Big(\sum_i\sum_s \bar h_{is}\sigma_{is} - \bar\psi(\bar h)\Big) - \Big(\sum_i\sum_s h_{is}\sigma_{is} + \sum_{i<j}\sum_{s,t} w_{ijst}\sigma_{is}\sigma_{jt} + \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\sigma_{is}\sigma_{jt}\sigma_{ku} - \psi(h,w,v)\Big)\Big\}\Big] \qquad (54)$$
$$= \sum_i\sum_s \bar h_{is}\bar m_{is} - \bar\psi(\bar h) - \sum_i\sum_s h_{is}\bar m_{is} - \sum_{i<j}\sum_{s,t} w_{ijst}\,\bar m_{is}\bar m_{jt} - \sum_{i<j<k}\sum_{s,t,u} v_{ijkstu}\,\bar m_{is}\bar m_{jt}\bar m_{ku} + \psi(h,w,v), \qquad (55)$$

where we have used $\bar m_{is} = \mathrm{Tr}[\tau\sigma_{is}]$ together with the factorizations $\mathrm{Tr}[\tau\sigma_{is}\sigma_{jt}] = \bar m_{is}\bar m_{jt}$ and $\mathrm{Tr}[\tau\sigma_{is}\sigma_{jt}\sigma_{ku}] = \bar m_{is}\bar m_{jt}\bar m_{ku}$, which hold for the product state $\tau$. Hence

$$\frac{\partial}{\partial\bar m_{is}} D^{(+1)} = \frac{\partial}{\partial\bar m_{is}}\Big[\sum_j\sum_t \bar h_{jt}\bar m_{jt} - \bar\psi(\bar h)\Big] - h_{is} - \sum_{j \neq i}\sum_t w_{ijst}\,\bar m_{jt} - \sum_{\substack{j<k \\ j,k \neq i}}\sum_{t,u} v_{ijkstu}\,\bar m_{jt}\bar m_{ku} \qquad (56)$$
$$= \bar h_{is} - h_{is} - \sum_{j \neq i}\sum_t w_{ijst}\,\bar m_{jt} - \sum_{\substack{j<k \\ j,k \neq i}}\sum_{t,u} v_{ijkstu}\,\bar m_{jt}\bar m_{ku} = 0, \qquad (57)$$

where the first term of (57) follows from the Legendre duality between $\bar h$ and $\bar m$. This gives

$$\bar h_{is} = h_{is} + \sum_{j \neq i}\sum_t w_{ijst}\,\bar m_{jt} + \sum_{\substack{j<k \\ j,k \neq i}}\sum_{t,u} v_{ijkstu}\,\bar m_{jt}\bar m_{ku}, \qquad (58)$$

and $\bar m_{is}$ is then given by

$$\bar m_{is} = \frac{\partial\psi_i(\bar h_i)}{\partial\bar h_{is}} = \frac{\bar h_{is}}{\|\bar h_i\|}\tanh(\|\bar h_i\|). \qquad (59)$$

Equations (58) and (59) together give the naive mean-field equations for third-order QBMs.
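As in the classical case, (58) and (59) are solved in practice by fixed-point iteration. A minimal sketch (assuming NumPy; the array shapes, with w stored for $i<j$ and v for $i<j<k$, and the damping heuristic are our own conventions, not the paper's); its output can be compared with exact_m from the snippet above on small systems:

```python
import numpy as np

def quantum_naive_mean_field(h, w, v, iters=500, damping=0.5):
    """Iterate (58)-(59). Shapes: h is (n, 3), w is (n, n, 3, 3) used for i < j,
    v is (n, n, n, 3, 3, 3) used for i < j < k."""
    n = h.shape[0]
    m = np.zeros((n, 3))
    for _ in range(iters):
        hbar = h.copy()                            # effective fields, eq. (58)
        for i in range(n):
            for j in range(i + 1, n):
                hbar[i] += w[i, j] @ m[j]          # sum_t w_ijst mbar_jt
                hbar[j] += w[i, j].T @ m[i]
                for k in range(j + 1, n):
                    hbar[i] += np.einsum('stu,t,u->s', v[i, j, k], m[j], m[k])
                    hbar[j] += np.einsum('stu,s,u->t', v[i, j, k], m[i], m[k])
                    hbar[k] += np.einsum('stu,s,t->u', v[i, j, k], m[i], m[j])
        norms = np.maximum(np.linalg.norm(hbar, axis=1, keepdims=True), 1e-12)
        m = (1 - damping) * m + damping * (hbar / norms) * np.tanh(norms)  # eq. (59)
    return m
```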

6 Concluding remarks

We have applied information geometry to the mean-field approximation for a general class of quantum statistical models. Here we were able to derive only the naive mean-field equations. It is known, however, that the naive mean-field approximation does not in general give a good approximation to the true values. To improve the approximation one needs to consider higher-order approximations, and working these out from the information geometrical point of view is left as an open problem.

References

[1] N. Yapage and H. Nagaoka. An information geometrical approach to the mean-field approximation for quantum Ising spin models. J. Phys. A: Math. Theor. 41 (2008) 065005.

[2] S. Amari and H. Nagaoka. Methods of Information Geometry. American Mathematical Society and Oxford University Press, 2000.

[3] M. Opper and D. Saad (eds.). Advanced Mean Field Methods: Theory and Practice. MIT Press, Cambridge, MA, 2001.

[4] T. Tanaka. Information geometry of mean-field approximation. Neural Computation, Vol. 12, pp. 1951–1968, 2000.

[5] S. Amari, K. Kurata and H. Nagaoka. Information geometry of Boltzmann machines. IEEE Trans. on Neural Networks, Vol. 3, No. 2, pp. 260–271, 1992.

[6] T. J. Sejnowski. Higher-order Boltzmann machines. AIP Conference Proceedings 151: Neural Networks for Computing (Snowbird, Utah), 1986.

[7] H. Nagaoka and T. Kojima. Boltzmann machine as a statistical model. Bulletin of the Computational Statistics of Japan, Vol. 1, pp. 61–81, 1995 (in Japanese).