Topics in Information Geometry. Dodson, CTJ. MIMS EPrint: Manchester Institute for Mathematical Sciences School of Mathematics


1 Topics in Information Geometry. Dodson, C.T.J. 2005. MIMS EPrint: Manchester Institute for Mathematical Sciences, School of Mathematics, The University of Manchester. Reports available from: And by contacting: The MIMS Secretary, School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK. ISSN

2 Topics in Information Geometry. C.T.J. Dodson © 2004. School of Mathematics, Manchester University, Manchester M60 1QD, UK. Print date: July 8, 2005

Contents

Preface
1 Mathematical statistics and information theory
1.1 Probability functions
1.1.1 Example: Bernoulli distribution
1.1.2 Example: Binomial distribution
1.1.3 Example: The Poisson probability function
1.2 Probability density functions
1.2.1 Example: The exponential pdf
1.3 Joint probability density functions
1.3.1 Example: Bivariate Gaussian distributions
1.4 Information theory
1.4.1 Example: Maximum likelihood gamma pdf
2 Information geometry
2.1 Fisher information metric
2.2 Exponential family
2.3 α-connections
2.4 Affine immersions
2.5 Example: Gamma manifold
2.5.1 α-connection
2.5.2 α-curvatures
2.5.3 Gamma manifold geodesics
2.5.4 Mutually dual foliations
2.5.5 Affine immersions for gamma manifold
2.6 Example: Log-gamma distributions
2.7 Example: Gaussian distributions
2.7.1 Natural coordinate system and potential function
2.7.2 Fisher information metric
2.7.3 Mutually dual foliations
2.7.4 Affine immersions for Gaussian manifold
2.8 Example: Bivariate Gaussian 5-manifold
2.8.1 Fisher information metric
2.9 Example: Freund distributions
2.9.1 Fisher information metric
2.10 Example: McKay bivariate gamma 3-manifold
3 Universal connections and curvature
3.1 Systems of connections and universal objects
3.2 Exponential family of probability density functions on R
3.3 Universal connection and curvature for exponential families
3.3.1 Tangent bundle system
3.3.2 Frame bundle system
3.4 Universal connection and curvature on the gamma manifold
3.4.1 Tangent bundle system
3.4.2 Frame bundle system
4 Neighbourhoods of randomness, independence, and uniformity
4.1 Gamma manifold G and neighbourhoods of randomness
4.2 Log-gamma manifold L and neighbourhoods of uniformity
4.3 Freund manifold F and neighbourhoods of independence
4.3.1 Submanifold F₁ ⊂ F: α₁ = α₂, β₁ = β₂
4.4 Neighbourhoods of independence for Gaussian processes [8]
5 Applications
Appendix: List of Mathematica Notebooks
Bibliography

Preface

The original draft of these lecture notes was prepared for a short course at the Centro de Investigación en Matemáticas (CIMAT), Guanajuato, Mexico in May 2004. Many of the calculations arise from joint publications with Khadiga Arwini and from her PhD thesis [6]. The course was also given in the Departamento de Xeometría e Topoloxía, Facultade de Matemáticas, Universidade de Santiago de Compostela, Spain in February 2005.


Chapter 1

Mathematical statistics and information theory

There are many easily found good books on probability theory and mathematical statistics (e.g. [3, 3, 33, 4, 44, 45, 69]), stochastic processes (e.g. [, 60]) and information theory (e.g. [6]); here we just outline some topics to help make the sequel more self-contained. For those who have access to the computer algebra package Mathematica [73], the approach to mathematical statistics and accompanying software in Rose and Smith [63] will be particularly helpful.

The word stochastic comes from the Greek stochastikos, meaning skillful in aiming, and stochazesthai, to aim at or guess at; stochos means target or aim. In our context, stochastic means involving chance variations around some event, rather like the variation in positions of strikes aimed at a target. In its turn, the later word, statistics, comes through eighteenth-century German from the Latin root status, meaning state; originally it meant the study of political facts and figures. The noun random was used in the sixteenth century to mean a haphazard course, from the Germanic randir, to run, and as an adjective to mean without a definite aim, rule or method, the opposite of purposive. From the middle of the last century, the concept of a random variable has been used to describe a variable that is a function of the result of a well-defined statistical experiment in which each possible outcome has a definite probability of occurrence. The organization of probabilities of outcomes is achieved by means of a probability function for discrete random variables and by means of a probability density function for continuous random variables. The result of throwing two fair dice and summing what they show is a discrete random variable.
Mainly, we are concerned here with continuous random variables (measurable functions defined on some Ω ⊆ Rⁿ) with differentiable probability density measure functions, but we do need also to mention the Poisson distribution for the discrete case. However, since the Poisson is a limiting approximation to the binomial distribution, which arises from the Bernoulli distribution (which everyone encountered in school!), we mention also those examples.

1.1 Probability functions

Here we are concerned with discrete random variables, so we take the domain set to be N ∪ {0}. We may view a probability function as a subadditive measure function of unit weight on N ∪ {0}:

(1.1)  p : N ∪ {0} → [0, 1]   (nonnegativity)
(1.2)  Σ_{k=0}^∞ p(k) = 1   (unit weight)
(1.3)  p(A ∪ B) ≤ p(A) + p(B), for all A, B ⊆ N ∪ {0}   (subadditivity), with equality ⟺ A ∩ B = ∅.

Formally, we have a discrete measure space of total measure 1, with σ-algebra the power set and measure function induced by p,

sub(N ∪ {0}) → [0, 1] : A ↦ Σ_{k∈A} p(k),

and, as we have anticipated above, we usually abbreviate Σ_{k∈A} p(k) = p(A). We have the following expected values of the random variable and its square:

(1.4)  E(k) = k̄ = Σ_{k=0}^∞ k p(k)
(1.5)  E(k²) = Σ_{k=0}^∞ k² p(k).

With slight but common abuse of notation, we call k̄ = E(k) the mean, E(k²) − k̄² the variance, σ_k = +√(E(k²) − k̄²) the standard deviation, and σ_k/k̄ the coefficient of variation of the random variable k. The variance is the square of the standard deviation. The moment generating function Ψ(t) = E(e^{tX}), t ∈ R, of a distribution generates the r-th moment as the value of the r-th derivative of Ψ evaluated at t = 0. Hence, in particular, the mean and variance are given by

(1.6)  E(X) = Ψ′(0)
(1.7)  Var(X) = Ψ″(0) − (Ψ′(0))²,

which can provide an easier method for their computation in some cases.

1.1.1 Example: Bernoulli distribution

A random variable X is said to have a Bernoulli distribution with parameter p (0 ≤ p ≤ 1) if X can take only the values 0 and 1, with probabilities

(1.8)  Pr(X = 1) = p
(1.9)  Pr(X = 0) = 1 − p.

Then the probability function of X can be written as

(1.10)  f(x|p) = p^x (1 − p)^{1−x} if x = 0, 1, and 0 otherwise.

If X has a Bernoulli distribution with parameter p, then we can find its expectation or mean value E(X) and variance Var(X) as follows:

(1.11)  E(X) = 1·p + 0·(1 − p) = p
(1.12)  Var(X) = E(X²) − (E(X))² = p(1 − p).
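As an illustrative aside (our own standalone sketch, not part of the original notes, whose computations use Mathematica), the identities (1.6) and (1.7) can be checked numerically for the Bernoulli case by differentiating the moment generating function at t = 0 with central differences:

```python
import math

def mgf(t, p):
    # Moment generating function of a Bernoulli(p) variable: E(e^{tX}) = p e^t + (1 - p)
    return p * math.exp(t) + (1.0 - p)

def moments_from_mgf(p, h=1e-4):
    # Central-difference approximations to Psi'(0) and Psi''(0)
    m1 = (mgf(h, p) - mgf(-h, p)) / (2 * h)
    m2 = (mgf(h, p) - 2 * mgf(0.0, p) + mgf(-h, p)) / h**2
    return m1, m2 - m1**2  # mean, variance

mean, var = moments_from_mgf(0.3)
print(mean, var)  # ≈ 0.3 and 0.21 = p(1 - p)
```

The recovered values agree with (1.11) and (1.12): E(X) = p and Var(X) = p(1 − p).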

The moment generating function of X is the expectation of e^{tX},

(1.13)  Ψ(t) = E(e^{tX}) = p e^t + q,  where q = 1 − p,

which is finite for all real t.

1.1.2 Example: Binomial distribution

If n random variables X₁, X₂, ..., Xₙ are independently identically distributed, and each has a Bernoulli distribution with parameter p, then it is said that the variables X₁, X₂, ..., Xₙ form n Bernoulli trials with parameter p. If the random variables X₁, X₂, ..., Xₙ form n Bernoulli trials with parameter p and if X = X₁ + X₂ + ⋯ + Xₙ, then X has a binomial distribution with parameters n and p. The binomial distribution is of fundamental importance in probability and statistics because of the following result, for any experiment which can result in only either success or failure. The experiment is performed n times independently and the probability of success of any given performance is p. If X denotes the total number of successes in the n performances, then X has a binomial distribution with parameters n and p. The probability function of X is

(1.14)  P(X = r) = P( Σ_{i=1}^n X_i = r ) = (n choose r) p^r (1 − p)^{n−r},  where r = 0, 1, 2, ..., n.

We write

(1.15)  f(r|p) = (n choose r) p^r (1 − p)^{n−r} if r = 0, 1, 2, ..., n, and 0 otherwise.

In this distribution n must be a positive integer and p must lie in the interval 0 ≤ p ≤ 1. If X is represented by the sum of n Bernoulli trials, then it is easy to get its expectation, variance and moment generating function by using the properties of sums of random variables:

(1.16)  E(X) = Σ_{i=1}^n E(X_i) = np
(1.17)  Var(X) = Σ_{i=1}^n Var(X_i) = np(1 − p)
(1.18)  Ψ(t) = E(e^{tX}) = Π_{i=1}^n E(e^{tX_i}) = (p e^t + q)^n.

1.1.3 Example: The Poisson probability function

The Poisson distribution is widely discussed in the statistical literature; one monograph devoted to it and its applications is Haight [38]. Take t, τ ∈ (0, ∞):

(1.19)  p : N ∪ {0} → [0, 1] : k ↦ (t/τ)^k / k! · e^{−t/τ}
(1.20)  k̄ = t/τ
(1.21)  σ_k² = t/τ.
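The quality of the Poisson approximation to the binomial for large n and small p can be seen in a short numerical sketch (ours, standard library only):

```python
import math

def binom_pmf(r, n, p):
    # Binomial probability function (1.15)
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(k, lam):
    # Poisson probability function with mean lam
    return lam**k / math.factorial(k) * math.exp(-lam)

# Large n, small p, with lam = n p held fixed: the two pmfs nearly agree.
n, p = 1000, 0.004
lam = n * p
err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(20))
print(err)  # small: of order n p^2
```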

This probability function is used to model the number k of random events in a region of measure t when the mean number of events per unit region is 1/τ. Importantly, the Poisson distribution can give a good approximation to the binomial distribution when n is large and p is close to 0. This is easy to see by making the correspondences:

(1.22)  (1 − p)^{n−r} ≈ e^{−pn}
(1.23)  n!/(n − r)! ≈ n^r.

1.2 Probability density functions

We are concerned with the case of continuous random variables defined on some Ω ⊆ R^m. For our present purposes we may view a probability density function (pdf) on Ω ⊆ R^m as a subadditive measure function of unit weight, namely a nonnegative map on Ω:

(1.24)  f : Ω → [0, ∞)   (nonnegativity)
(1.25)  ∫_Ω f = f(Ω) = 1   (unit weight)
(1.26)  f(A ∪ B) ≤ f(A) + f(B), for all A, B ⊆ Ω   (subadditivity), with equality ⟺ A ∩ B = ∅.

Formally, we have a measure space of total measure 1, with σ-algebra typically the Borel sets or the power set, and the measure function induced by f,

sub(Ω) → [0, 1] : A ↦ ∫_A f,

and as we have anticipated above, we usually abbreviate ∫_A f = f(A). Given an integrable (i.e. measurable in the σ-algebra) function u : Ω → R, the expectation or mean value of u is defined to be E(u) = ū = ∫_Ω u f. We say that f is the joint pdf for the random variables x₁, x₂, ..., x_m, being the coordinates of points in Ω, or that these random variables have the joint probability distribution f. If x is one of these random variables, and in particular for the important case of a single random variable x, we have the following:

(1.27)  E(x) = x̄ = ∫_Ω x f
(1.28)  E(x²) = ∫_Ω x² f.

Again with slight abuse of notation, we call x̄ the mean, E(x²) − x̄² the variance, σ_x = +√(E(x²) − x̄²) the standard deviation and σ_x/x̄ the coefficient of variation of the random variable x. The variance is the square of the standard deviation. Usually, a probability density function depends on a set of parameters θ₁, θ₂, ..., θₙ, and we say that we have an n-dimensional family of pdfs.

1.2.1 Example: The exponential pdf

Take λ ∈ R⁺; this is called the parameter of the exponential pdf

(1.29)  f : [0, ∞) → [0, ∞) : x ↦ (1/λ) e^{−x/λ},  so f([a, b]) = ∫_a^b (1/λ) e^{−x/λ} dx,
(1.30)  x̄ = λ
(1.31)  σ_x = λ.
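A sampling sketch (ours) illustrating (1.30) and (1.31), namely that for the exponential pdf the mean and the standard deviation both equal λ:

```python
import random

random.seed(0)
lam = 2.0  # parameter of the exponential pdf f(x) = (1/lam) e^{-x/lam}
xs = [random.expovariate(1.0 / lam) for _ in range(200_000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var ** 0.5)  # both ≈ lam = 2.0
```

Note that random.expovariate takes the rate, i.e. 1/λ, not the mean.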

1.3 Joint probability density functions

Let f be a pdf defined on R² (or some subset thereof). This is an important case since here we have two variables, X, Y say, and we can extract certain features of how they interact. In particular, we define their respective mean values and their covariance σ_{xy}:

(1.32)  x̄ = ∫∫ x f(x, y) dx dy
(1.33)  ȳ = ∫∫ y f(x, y) dx dy
(1.34)  σ_{xy} = ∫∫ xy f(x, y) dx dy − x̄ȳ.

The marginal pdf of X is f_X, obtained by integrating f over all y,

(1.35)  f_X(x) = ∫_{v=−∞}^{∞} f_{X,Y}(x, v) dv,

and similarly the marginal pdf of Y is

(1.36)  f_Y(y) = ∫_{u=−∞}^{∞} f_{X,Y}(u, y) du.

The jointly distributed random variables X and Y are called independent if their marginal density functions satisfy

(1.37)  f_{X,Y}(x, y) = f_X(x) f_Y(y) for all x, y ∈ R.

It is easily shown that if the variables are independent then their covariance (1.34) is zero, but the converse is not true. Feller [3] gives a simple counterexample: let X take the values −2, −1, +1, +2, each with probability 1/4, and let Y = X²; then the covariance is zero but there is evidently a (nonlinear) dependence. The extent of dependence between two random variables can be measured in a normalised way by means of the correlation coefficient: the ratio of the covariance to the product of the marginal standard deviations,

(1.38)  ρ_{xy} = σ_{xy}/(σ_x σ_y).

Note that |ρ_{xy}| ≤ 1 whenever it exists, the limiting values ±1 corresponding to the case of linear dependence between the variables. Intuitively, ρ_{xy} < 0 if y tends to increase as x decreases, and ρ_{xy} > 0 if x and y tend to increase together.
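Feller's counterexample above is small enough to compute exactly; this sketch (ours) confirms that the covariance (1.34) vanishes even though Y is completely determined by X:

```python
# X takes the values -2, -1, +1, +2, each with probability 1/4, and Y = X^2.
xs = [-2, -1, 1, 2]
ys = [x * x for x in xs]

ex = sum(xs) / 4  # E(X) = 0
ey = sum(ys) / 4  # E(Y) = 2.5
cov = sum(x * y for x, y in zip(xs, ys)) / 4 - ex * ey
print(cov)  # 0.0: zero covariance despite full (nonlinear) dependence
```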

1.3.1 Example: Bivariate Gaussian distributions

The probability density function of the two-dimensional Gaussian distribution has the form

(1.39)  f(x, y) = 1/(2π √(σ₁σ₂ − σ₁₂²)) · exp( −( σ₂(x − µ₁)² − 2σ₁₂(x − µ₁)(y − µ₂) + σ₁(y − µ₂)² ) / (2(σ₁σ₂ − σ₁₂²)) ),

where −∞ < x, y < ∞, −∞ < µ₁, µ₂ < ∞, 0 < σ₁, σ₂ < ∞. This contains the five parameters (µ₁, µ₂, σ₁, σ₂, σ₁₂) = (ξ₁, ξ₂, ξ₃, ξ₄, ξ₅) ∈ Θ, so we have a five-dimensional parameter space Θ.

1.4 Information theory

Jaynes [4] provided the foundation for information theoretic methods in, among other things, Bayes hypothesis testing, as described by Tribus et al. [70, 7]; see also Shannon [64], Slepian [65] and Roman [6]. Given a set of observed values ⟨g_α⟩ for functions g_α of the random variable x, we seek a least prejudiced set of probability values p_i, i = 1, 2, ..., n, satisfying

(1.40)  ⟨g_α⟩ = Σ_{i=1}^n p_i g_α(x_i) for α = 1, 2, ..., N
(1.41)  Σ_{i=1}^n p_i = 1.

Jaynes showed that this occurs if we choose those p_i that maximize Shannon's entropy function

(1.42)  S = −Σ_{i=1}^n p_i log(p_i).

It turns out [7] that if we have no data on observed functions of x (so the set of equations (1.40) is empty), then the maximum entropy choice gives the exponential distribution. If we have estimates of the first two moments of the distribution of x, then we obtain the (truncated) Gaussian. If we have estimates of the mean and mean logarithm of x, then the maximum entropy choice is the gamma distribution. In the sequel we shall consider the particular case of the gamma distribution for several reasons:

- the exponential distributions form a subclass of gamma distributions, and exponential distributions represent intervals between events in a Poisson (i.e. 'random') process
- the sum of n independent identical exponential random variables follows a gamma distribution
- lognormal distributions may be well-approximated by gamma distributions
- products of gamma distributions are well-approximated by gamma distributions
- stochastic porous media have been modeled using gamma distributions [6].
Other parametric statistical models based on different distributions may be treated in a similar way. Let Θ be the parameter space of a parametric statistical model, that is, an n-dimensional smooth family of probability density functions defined on some fixed event space Ω of unit measure, ∫_Ω p_θ = 1 for all θ ∈ Θ.

For each sequence X = {X₁, X₂, ..., Xₙ} of independent identically distributed observed values, the likelihood function lik_X on Θ, which measures the likelihood of the sequence arising from the different p_θ ∈ S, is defined by

lik_X : Θ → [0, 1] : θ ↦ Π_{i=1}^n p_θ(X_i).

Statisticians use the likelihood function, or the log-likelihood (its logarithm l = log lik), in the evaluation of goodness of fit of statistical models. The so-called method of maximum likelihood is used to obtain optimal fitting of the parameters in a distribution to observed data.

1.4.1 Example: Maximum likelihood gamma pdf

The family of gamma distributions has event space Ω = R⁺ and probability density functions given by Θ ≡ {f(x; γ, µ) | γ, µ ∈ R⁺}, so here Θ = R⁺ × R⁺ and the random variable is x ∈ Ω = R⁺ with

(1.43)  f(x; γ, µ) = (µ/γ)^µ · x^{µ−1}/Γ(µ) · e^{−xµ/γ}.

Then x̄ = γ and Var(x) = γ²/µ, and we see that γ controls the mean of the distribution while µ controls its variance and hence the shape. Indeed, the property that the variance is proportional to the square of the mean actually characterizes gamma distributions, as shown recently by Hwang and Hu [39]. They proved, for n ≥ 3 independent positive random variables x₁, x₂, ..., xₙ with a common continuous probability density function h, that having independence of the sample mean and sample coefficient of variation is equivalent to h being a gamma distribution. The special case µ = 1 corresponds to the situation of the 'random' or Poisson process, with mean inter-event interval γ. In fact, for integer µ = 1, 2, ..., the density (1.43) models a process that is Poisson but with intermediate events removed, to leave only every µ-th. Formally, the gamma distribution is the µ-fold convolution of the exponential distribution, called also the Pearson Type III distribution. Figure 1.1 shows a family of gamma distributions, all of unit mean, with µ = 0.5, 1, 2, 5.
Shannon's information theoretic 'entropy' or 'uncertainty' is given, up to a factor, by the negative of the expectation of the logarithm of the probability density function (1.43), that is,

(1.44)  S_f(γ, µ) = −∫_0^∞ log(f(x; γ, µ)) f(x; γ, µ) dx = µ + (1 − µ) Γ′(µ)/Γ(µ) + log( γ Γ(µ)/µ ).

At unit mean, the maximum entropy (or maximum uncertainty) occurs at µ = 1, which is the 'random' case, and then S_f(γ, 1) = 1 + log γ. So a Poisson process of points on a line is such that the points are 'as disorderly as possible': among all homogeneous point processes with a given density, the Poisson process has maximum entropy. Figure 1.2 shows a plot of S_f(γ, µ) for the case of unit mean γ = 1.

We can see the role of the log-likelihood function in the case of a set X = {X₁, X₂, ..., Xₙ} of measurements, drawn from independent identically distributed random variables, to which we wish to fit the maximum likelihood gamma distribution. The procedure to optimize the choice of γ, µ is as follows. For independent events X_i with identical distribution f(x; γ, µ), their joint probability density is the product of the marginal densities, so a measure of the likelihood of finding such a set of events is

lik_X(γ, µ) = Π_{i=1}^n f(X_i; γ, µ).

We seek a choice of γ, µ to maximize this product, and since the log function is monotonic increasing it is simpler to maximize the logarithm

l_X(γ, µ) = log lik_X(γ, µ) = log[ Π_{i=1}^n f(X_i; γ, µ) ].
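The entropy expression above can be evaluated directly; the following sketch (ours, with the digamma function Γ′/Γ approximated by a central difference of log Γ) confirms that at unit mean the entropy is maximal at µ = 1:

```python
import math

def gamma_entropy(gam, mu, h=1e-6):
    # S_f(gamma, mu) = mu + (1 - mu) * digamma(mu) + log(gamma * Gamma(mu) / mu)
    digamma = (math.lgamma(mu + h) - math.lgamma(mu - h)) / (2 * h)
    return mu + (1 - mu) * digamma + math.lgamma(mu) + math.log(gam / mu)

print(gamma_entropy(1.0, 1.0))        # 1 + log(1) = 1: the Poisson case
print(gamma_entropy(1.0, 0.5) < 1.0)  # True: clustered processes have lower entropy
print(gamma_entropy(1.0, 5.0) < 1.0)  # True: smoothed processes have lower entropy
```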

Figure 1.1: Probability density functions f(x; γ, µ) for gamma distributions of inter-event intervals t with unit mean γ = 1, and µ = 0.5 (clustered), µ = 1 (random), µ = 2 (smoothed), µ = 5 (smoothed). The case µ = 1 corresponds to an exponential distribution from an underlying Poisson process; µ represents some organization: clustering (µ < 1) or smoothing (µ > 1).

Figure 1.2: Information theoretic entropy S_f(γ, µ) for gamma distributions of inter-event intervals t with unit mean γ = 1. The maximum at µ = 1 corresponds to an exponential distribution from an underlying Poisson process. The regime µ < 1 corresponds to clustering of events and µ > 1 corresponds to smoothing out of events, relative to a Poisson process. Note that, at constant mean, the variance of x decays like 1/µ.

Substitution gives us

l_X(γ, µ) = Σ_{i=1}^n [ µ(log µ − log γ) + (µ − 1) log X_i − (µ/γ) X_i − log Γ(µ) ]
          = nµ(log µ − log γ) + (µ − 1) Σ_{i=1}^n log X_i − (µ/γ) Σ_{i=1}^n X_i − n log Γ(µ).

Then, solving ∂l_X/∂γ = ∂l_X/∂µ = 0 in terms of properties of the X_i, we obtain the maximum likelihood estimates γ̂, µ̂ of γ, µ in terms of the mean and mean logarithm of the X_i:

γ̂ = X̄ = (1/n) Σ_{i=1}^n X_i
log µ̂ − Γ′(µ̂)/Γ(µ̂) = log X̄ − ⟨log X⟩,

where ⟨log X⟩ = (1/n) Σ_{i=1}^n log X_i.
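The pair of estimating equations above lends itself to a one-dimensional search, since log µ − Γ′(µ)/Γ(µ) is strictly decreasing in µ. The following solver is our own minimal sketch (not from the notes' Mathematica notebooks), using bisection and a finite-difference digamma, applied to synthetic gamma data:

```python
import math, random

def digamma(x, h=1e-6):
    # Central-difference approximation to (log Gamma)'(x)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def fit_gamma(sample):
    # Maximum likelihood for f(x; gamma, mu):
    #   gamma_hat = sample mean;  log(mu_hat) - digamma(mu_hat) = log(mean) - mean(log X)
    n = len(sample)
    mean = sum(sample) / n
    mean_log = sum(math.log(x) for x in sample) / n
    target = math.log(mean) - mean_log  # nonnegative, by Jensen's inequality
    lo, hi = 1e-3, 1e3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.log(mid) - digamma(mid) > target:
            lo = mid  # log(mu) - digamma(mu) decreases in mu: mid is too small
        else:
            hi = mid
    return mean, 0.5 * (lo + hi)  # (gamma_hat, mu_hat)

random.seed(1)
mu_true, gam_true = 2.0, 3.0
xs = [random.gammavariate(mu_true, gam_true / mu_true) for _ in range(100_000)]
print(fit_gamma(xs))  # close to (3.0, 2.0)
```

Here random.gammavariate(shape, scale) has mean shape·scale, so scale γ/µ gives mean γ.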


Chapter 2

Information geometry

Amari [3] and Amari and Nagaoka [4] provide modern accounts of the differential geometry that arises from the Fisher information metric.

2.1 Fisher information metric

Let Θ be the parameter space of a parametric statistical model, that is, an n-dimensional smooth family of probability density functions defined on some fixed event space Ω of unit measure, ∫_Ω p_θ = 1 for all θ ∈ Θ. Denote by E the expectation operator (measure) for functions defined on Ω; in particular, the mean E(x) = x̄ and the variance E(x²) − x̄² = Var(x), which will be functions of θ. Then, at each point θ ∈ Θ, the covariance of partial derivatives of the log-likelihood function l = log p_θ is a matrix with entries the expectations

(2.1)  g_ij = E( (∂l/∂θ_i)(∂l/∂θ_j) ) = −E( ∂²l/∂θ_i∂θ_j )

for coordinates (θ_i) about θ ∈ Θ. This gives rise to a positive definite matrix, so inducing a Riemannian metric g on Θ, using for coordinates the parameters (θ_i); this metric is called the expected information metric for the family of probability density functions. The original ideas are due to Fisher and Rao [34, 6]. Of course, the second equality in equation (2.1) depends on certain regularity conditions, but when it holds it can be particularly convenient to use.

2.2 Exponential family

An n-dimensional parametric statistical model Θ ≡ {p_θ | θ ∈ Θ} is said to be an exponential family, or of exponential type, when the density function can be expressed in terms of functions {C, F₁, ..., Fₙ} on Ω and a function ψ on Θ as

(2.2)  p(x; θ) = e^{ C(x) + Σ_i θ_i F_i(x) − ψ(θ) };

then we say that (θ_i) are its natural or canonical parameters, and ψ is the potential function. From the normalization condition ∫ p(x; θ) dx = 1 we obtain

(2.3)  ψ(θ) = log ∫ e^{ C(x) + Σ_i θ_i F_i(x) } dx.
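As a numerical illustration of the covariance definition of g_ij (our own sketch), consider the one-parameter exponential family f(x; λ) = (1/λ)e^{−x/λ}, whose Fisher information is 1/λ²; a Monte Carlo expectation of the squared score reproduces it:

```python
import random

random.seed(0)
lam = 2.0

def score(x, lam):
    # d/d(lam) of log f = d/d(lam) (-log(lam) - x/lam) = -1/lam + x/lam**2
    return -1.0 / lam + x / lam**2

xs = [random.expovariate(1.0 / lam) for _ in range(200_000)]
g = sum(score(x, lam) ** 2 for x in xs) / len(xs)
print(g, 1.0 / lam**2)  # both ≈ 0.25
```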

This potential function is therefore a distinguished function of the coordinates alone, and in the sequel we make use of it for the presentation of the manifold as an immersion in R^{n+1}.

2.3 α-connections

Let Θ ≡ {p_ξ} be an n-dimensional model, and consider the function Γ^(α)_{ij,k} which maps each point ξ to the following value:

(2.4)  Γ^(α)_{ij,k} = E_ξ[ ( ∂²l/∂ξ_i∂ξ_j + (1 − α)/2 · ∂l/∂ξ_i · ∂l/∂ξ_j ) ∂l/∂ξ_k ],

where α is some arbitrary real number. So we have an affine connection ∇^(α) on Θ defined by

(2.5)  g( ∇^(α)_{∂_i} ∂_j, ∂_k ) = Γ^(α)_{ij,k},

where g = ⟨ , ⟩ is the Fisher metric. We call this ∇^(α) the α-connection. The α-connection is clearly a symmetric connection. We also have

(2.6)  ∇^(α) = ∇^(0) + (α/2)( ∇^(1) − ∇^(−1) ) = (1 + α)/2 · ∇^(1) + (1 − α)/2 · ∇^(−1).

Proposition 2.3.1 The 0-connection is the Riemannian connection with respect to the Fisher metric. In general, when α ≠ 0, ∇^(α) is not metric.

The notion of exponential family has a close relation to ∇^(1). From the definition of an exponential family given in Equation (2.2), with ∂_i = ∂/∂θ_i, we obtain

(2.7)  ∂_i l(x; θ) = F_i(x) − ∂_i ψ(θ)  and
(2.8)  ∂_i ∂_j l(x; θ) = −∂_i ∂_j ψ(θ),

where l(x; θ) = log p(x; θ). Hence we have Γ^(1)_{ij,k} = −(∂_i∂_jψ) E_θ[∂_k l], which is 0. In other words, we see that (θ_i) is a 1-affine coordinate system, and Θ is 1-flat. In particular, the 1-connection is said to be the exponential connection, and the (−1)-connection is said to be the mixture connection. We say that the α-connection and the (−α)-connection are mutually dual with respect to the Fisher metric g, since the following formula holds:

X g(Y, Z) = g( ∇^(α)_X Y, Z ) + g( Y, ∇^(−α)_X Z ),

where X, Y and Z are arbitrary vector fields on M. Now, Θ is an exponential family, so a mixture coordinate system is given by the potential function, that is,

(2.9)  η_i = ∂ψ/∂θ_i.

Since (θ_i) is a 1-affine coordinate system, (η_i) is a (−1)-affine coordinate system, and they are mutually dual with respect to the Fisher metric. Therefore the statistical manifold has dually orthogonal foliations (Section 3.7 in [4]).
The coordinates in (η_i) admit a potential function given by:

(2.10)  λ = Σ_i θ_i η_i − ψ(θ).

2.4 Affine immersions

Let M be an m-dimensional manifold, f an immersion from M to R^{m+1}, and ξ a vector field along f. For each x ∈ R^{m+1} we can identify T_x R^{m+1} ≅ R^{m+1}. The pair {f, ξ} is said to be an affine immersion from M to R^{m+1} if, for each point p ∈ M, the following decomposition holds:

T_{f(p)} R^{m+1} = f_*(T_p M) ⊕ Span{ξ_p}.

We call ξ a transversal vector field. We denote by D the standard flat affine connection of R^{m+1}. Identifying the covariant derivative along f with D, we have the following decompositions:

D_X f_* Y = f_*( ∇_X Y ) + h(X, Y) ξ,
D_X ξ = −f_*( Sh(X) ) + µ(X) ξ.

The induced objects ∇, h, Sh and µ are the induced connection, the affine fundamental form, the affine shape operator and the transversal connection form, respectively. If the affine fundamental form h is positive definite everywhere on M, the immersion f is said to be strictly convex; and if µ = 0, the affine immersion {f, ξ} is said to be equiaffine. It is known that a strictly convex equiaffine immersion induces a statistical manifold. Conversely, the conditions under which a statistical manifold can be realized in an affine space have been studied. We say that an affine immersion {f, ξ} : Θ → R^{m+1} is a graph immersion if the hypersurface is the graph of ψ in R^{m+1}:

f : M → R^{m+1} : (θ₁, ..., θ_m) ↦ (θ₁, ..., θ_m, ψ(θ)),  with ξ = (0, ..., 0, 1).

Set ∂_i = ∂/∂θ_i and ψ_ij = ∂²ψ/∂θ_i∂θ_j. Then we have

D_{∂_i} f_* ∂_j = ψ_ij ξ.

This implies that the induced connection is flat and (θ_i) is a ∇-affine coordinate system.

Proposition 2.4.1 Let (M, h, ∇, ∇*) be a simply connected dually flat space with a global coordinate system, and (θ) an affine coordinate system of ∇. Suppose that ψ is a θ-potential function. Then (M, h, ∇) can be realized in R^{m+1} by a graph immersion whose potential is ψ.

2.5 Example: Gamma manifold

The family of gamma distributions has event space Ω = R⁺ and probability density functions given by S = {f(x; γ, µ) | γ, µ ∈ R⁺}, so here M ≡ R⁺ × R⁺ and the random variable is x ∈ Ω = R⁺ with

(2.11)  f(x; γ, µ) = (µ/γ)^µ · x^{µ−1}/Γ(µ) · e^{−xµ/γ}.

Proposition 2.5.1 Let G be the gamma manifold.
Set β = µ/γ. Then (β, µ) is a natural coordinate system of the 1-connection and

(2.12)  ψ(θ) = log Γ(µ) − µ log β

is the corresponding potential function.

Proof: Using β = µ/γ, the logarithm of the gamma density can be written as

(2.13)  log p(x; β, µ) = log( β^µ x^{µ−1} e^{−βx} / Γ(µ) ) = −log x + (µ log x − βx) − (log Γ(µ) − µ log β).

Hence the set of all gamma distributions is an exponential family. The coordinates (θ₁, θ₂) = (β, µ) form a natural coordinate system, and ψ(θ) = log Γ(µ) − µ log β is its potential function.

Corollary 2.5.2 Since ψ(θ) is a potential function, the Fisher metric is given by the Hessian of ψ, that is, with respect to natural coordinates:

(2.14)  [g_ij] = [ ∂²ψ(θ)/∂θ_i∂θ_j ] = [ µ/β²   −1/β ; −1/β   ψ′(µ) ],

where here ψ(µ) = Γ′(µ)/Γ(µ) denotes the digamma function.

2.5.1 α-connection

For each α ∈ R, the α- (or ∇^(α)-) connection is the torsion-free affine connection with components

Γ^(α)_{ij,k} = (1 − α)/2 · ∂_i ∂_j ∂_k ψ(θ),

where ψ(θ) is the potential function and ∂_i = ∂/∂θ_i. Since the set of gamma distributions is an exponential family, the connection ∇^(1) is flat. In this case, (β, µ) is a 1-affine coordinate system, so the 1- and (−1)-connections on the gamma manifold are flat.

Proposition 2.5.3 The functions Γ^(α)_{ij,k} are given by

(2.15)  Γ^(α)_{11,1} = (α − 1)µ/β³,   Γ^(α)_{12,1} = Γ^(α)_{11,2} = (1 − α)/(2β²),   Γ^(α)_{22,2} = (1 − α)ψ″(µ)/2,

while the other independent components are zero. We have an affine connection ∇^(α) defined by

g( ∇^(α)_{∂_i} ∂_j, ∂_k ) = Γ^(α)_{ij,k},

so by solving the equations Γ^(α)_{ij,k} = Σ_{h=1}^{2} g_{kh} Γ^{h(α)}_{ij}, k = 1, 2, we obtain the components of ∇^(α):
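The Hessian form of the Fisher metric above can be verified numerically; this sketch (ours) compares finite differences of the potential ψ(β, µ) = log Γ(µ) − µ log β with the analytic entries µ/β², −1/β and the trigamma ψ′(µ):

```python
import math

def pot(beta, mu):
    # Potential function of the gamma family in natural coordinates (beta, mu)
    return math.lgamma(mu) - mu * math.log(beta)

def hessian(f, b, m, h=1e-4):
    fbb = (f(b + h, m) - 2 * f(b, m) + f(b - h, m)) / h**2
    fmm = (f(b, m + h) - 2 * f(b, m) + f(b, m - h)) / h**2
    fbm = (f(b + h, m + h) - f(b + h, m - h) - f(b - h, m + h) + f(b - h, m - h)) / (4 * h**2)
    return fbb, fbm, fmm

def trigamma(mu, n=100_000):
    # psi'(mu) = sum_{k >= 0} 1/(mu + k)^2, with an integral tail estimate
    return sum(1.0 / (mu + k) ** 2 for k in range(n)) + 1.0 / (mu + n)

b, m = 2.0, 1.5
g11, g12, g22 = hessian(pot, b, m)
print(g11, m / b**2)     # both ≈ 0.375
print(g12, -1.0 / b)     # both ≈ -0.5
print(g22, trigamma(m))  # both ≈ psi'(1.5) = pi^2/2 - 4 ≈ 0.9348
```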

Proposition 2.5.4 The components Γ^{i(α)}_{jk} of the ∇^(α)-connection are given by

(2.16)
Γ^{1(α)}_{11} = (α − 1)(2µψ′(µ) − 1) / (2β(µψ′(µ) − 1)),
Γ^{1(α)}_{12} = Γ^{1(α)}_{21} = (1 − α)ψ′(µ) / (2(µψ′(µ) − 1)),
Γ^{1(α)}_{22} = (1 − α)βψ″(µ) / (2(µψ′(µ) − 1)),
Γ^{2(α)}_{11} = (α − 1)µ / (2β²(µψ′(µ) − 1)),
Γ^{2(α)}_{12} = Γ^{2(α)}_{21} = (1 − α) / (2β(µψ′(µ) − 1)),
Γ^{2(α)}_{22} = (1 − α)µψ″(µ) / (2(µψ′(µ) − 1)),

while the other independent components are zero.

2.5.2 α-curvatures

Proposition 2.5.5 Direct calculation gives the α-curvature tensor of G:

(2.17)  R^(α)_{1212} = (1 − α²)(ψ′(µ) + µψ″(µ)) / (4β²(µψ′(µ) − 1)),

while the other independent components are zero. By contraction we obtain the α-Ricci tensor:

(2.18)  [R^(α)_{ij}] = (1 − α²)(ψ′(µ) + µψ″(µ)) / (4(µψ′(µ) − 1)²) · [ µ/β²   −1/β ; −1/β   ψ′(µ) ].

In addition, the eigenvalues and the eigenvectors for the α-Ricci tensor are given by

(2.19)  (1 − α²)(ψ′(µ) + µψ″(µ)) ( µ + β²ψ′(µ) ± √Δ ) / (8β²(µψ′(µ) − 1)²),
(2.20)  ( (β²ψ′(µ) − µ ∓ √Δ)/(2β), 1 ),  where Δ = 4β² + µ² − 2β²µψ′(µ) + β⁴ψ′(µ)².

α-scalar curvature:

(2.21)  R^(α) = (1 − α²)(ψ′(µ) + µψ″(µ)) / (2(µψ′(µ) − 1)²).

We note that R^(α) → (α² − 1)/2 as µ → 0.

2.5.3 Gamma manifold geodesics

The Fisher information metric for the gamma manifold is given in (γ, µ) coordinates by the arc length function

ds² = (µ/γ²) dγ² + ( d²/dµ²(log Γ(µ)) − 1/µ ) dµ².

The Levi-Civita connection is that given by setting α = 0 in the α-connections of the previous section; for this case we give in Figure 2.1 some examples of geodesic sprays in the vicinities of the points (γ, µ) = (1, 0.5), (1, 1), (1, 2).

2.5.4 Mutually dual foliations

Now, G represents an exponential family of pdfs, so a mixture coordinate system is given by the potential function. Since (β, µ) is a 1-affine coordinate system, (η₁, η₂) given by

(2.22)  η₁ = ∂ψ/∂β = −µ/β,   η₂ = ∂ψ/∂µ = φ(µ) − log β,

where φ(µ) = Γ′(µ)/Γ(µ), is a (−1)-affine coordinate system, and they are mutually dual with respect to the Fisher metric. Therefore the gamma manifold has dually orthogonal foliations, with potential function

(2.23)  λ = µφ(µ) − µ − log Γ(µ).

2.5.5 Affine immersions for the gamma manifold

The gamma manifold has an affine immersion in R³.

Proposition 2.5.6 Let G be the gamma manifold with the Fisher metric g and the exponential connection ∇^(1). Denote by (β, µ) a natural coordinate system. Then G can be realized in R³ by the graph of the potential function,

f : G → R³ : (β, µ) ↦ (β, µ, log Γ(µ) − µ log β),  with ξ = (0, 0, 1).

The submanifold of exponential distributions (µ = 1) is represented by the curve

(0, ∞) → R³ : β ↦ (β, 1, −log β),

and a tubular neighbourhood of this curve will contain all immersions for small enough perturbations of exponential distributions.

2.6 Example: Log-gamma distributions

The family of probability density functions for the random variable N ∈ [0, 1] is given by

(2.24)  g(N; µ, β) = (β/µ)^β · N^{β/µ − 1} (log(1/N))^{β−1} / Γ(β),  for µ > 0 and β > 0.

Some of these pdfs with central mean are shown in Figure 2.2.

Figure 2.1: Geodesic sprays in the gamma manifold, radiating from the points with unit mean γ = 1 and µ = 0.5, 1, 2, increasing vertically. The case µ = 1 corresponds to an exponential distribution from an underlying Poisson process; µ < 1 corresponds to clustering, and µ increasing above 1 corresponds to the opposite process, dispersion leading to greater uniformity.

Figure 2.2: The log-gamma family of densities with central mean ⟨N⟩ = 1/2, as a surface and as a contour plot, for β ≥ 1.

Proposition 2.6.1 The log-gamma family (2.24) with its information metric determines a Riemannian 2-manifold L with the following properties:

- it contains the uniform distribution
- it contains approximations to truncated Gaussian distributions
- it is an isometric isomorph of the manifold G of gamma distributions.

Proof: By integration, it is easily checked that the family given by equation (2.24) consists of probability density functions for the random variable N ∈ [0, 1]. The limiting densities are given by

(2.25)  lim_{β→1} g(N; µ, β) = g(N; µ, 1) = (1/µ) N^{(1−µ)/µ}
(2.26)  lim_{µ→1} g(N; µ, 1) = g(N; 1, 1) = 1.

The mean ⟨N⟩, standard deviation σ_N, and coefficient of variation cv_N of N are given by

(2.27)  ⟨N⟩ = ( β/(β + µ) )^β
(2.28)  σ_N = √( ( β/(β + 2µ) )^β − ( β/(β + µ) )^{2β} )
(2.29)  cv_N = σ_N/⟨N⟩ = √( ( (β + µ)²/(β(β + 2µ)) )^β − 1 ).

We can obtain the family of densities having central mean ⟨N⟩ = 1/2 by solving ⟨N⟩ = 1/2, which corresponds to the locus µ = β(2^{1/β} − 1); some of these are shown in Figure 2.2 and Figure 2.3. Evidently, the distributions with central mean and large β provide approximations to Gaussian distributions truncated on [0, 1].
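The mean formula for ⟨N⟩ can be checked by simulation, since N = e^{−x} with x following the gamma density of shape β and mean µ; this sketch (ours) samples on the central-mean locus µ = β(2^{1/β} − 1):

```python
import math, random

random.seed(2)
beta = 3.0
mu = beta * (2 ** (1 / beta) - 1)  # central-mean locus: <N> = 1/2

# N = exp(-x) with x ~ gamma, shape beta and scale mu/beta (so mean mu)
ns = [math.exp(-random.gammavariate(beta, mu / beta)) for _ in range(200_000)]
mean_n = sum(ns) / len(ns)
print(mean_n, (beta / (beta + mu)) ** beta)  # both ≈ 0.5
```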

Figure 2.3: Examples from the log-gamma family of probability densities with central mean ⟨N⟩ = 1/2. Left: β = 1, 1.2, 1.4, 1.6, 1.8. Right: β = 2, 4, 6, 8, 10.

For the log-gamma densities [, 8], the Fisher information metric [3] on the parameter space Θ = {(µ, β) ∈ (0, ∞) × [1, ∞)} is given by its arc length function

(2.30)  ds_L² = Σ_{ij} g_ij dx^i dx^j = (β/µ²) dµ² + ( d²/dβ²(log Γ(β)) − 1/β ) dβ².

In fact, (2.24) arises from the gamma family

(2.31)  f(x; µ, β) = x^{β−1} (β/µ)^β e^{−xβ/µ} / Γ(β)

for the non-negative random variable x = log(1/N). It is known that the gamma family (2.31) also has the information metric (2.30), so the identity map on the space of coordinates (µ, β) is an isometry of Riemannian manifolds.

2.7 Example: Gaussian distributions

The family of univariate normal or Gaussian distributions has event space Ω = R and probability density functions given by N ≡ {N(µ, σ²)} = {p(x; µ, σ) | µ ∈ R, σ ∈ R⁺}, with mean µ and variance σ². So here N = R × R⁺ is the upper half 2-plane, and the random variable is x ∈ Ω = R with

(2.32)  p(x; µ, σ) = 1/(√(2π) σ) · e^{−(x−µ)²/(2σ²)}.

The mean µ and standard deviation σ are frequently used as a local coordinate system ξ = (ξ₁, ξ₂) = (µ, σ).

Shannon's information theoretic entropy is given by

(2.33) \( \displaystyle S_N(\mu,\sigma) = -\int_{\mathbb{R}} \log\bigl(p(t;\mu,\sigma)\bigr)\, p(t;\mu,\sigma)\, dt = \tfrac12\bigl(1+\log(2\pi)\bigr) + \log\sigma . \)

At unit variance the entropy is $S_N = \tfrac12(1+\log(2\pi))$.

2.7.1 Natural coordinate system and potential function

Proposition 2.7.1  Let $N$ be the normal manifold. Set $\theta^1 = \mu/\sigma^2$ and $\theta^2 = -1/(2\sigma^2)$. Then $(\theta^1,\theta^2)$ is a natural coordinate system and

(2.34) \( \displaystyle \psi = -\frac{(\theta^1)^2}{4\theta^2} + \frac12 \log\left(-\frac{\pi}{\theta^2}\right) = \frac{\mu^2}{2\sigma^2} + \log\bigl(\sqrt{2\pi}\,\sigma\bigr) \)

is the corresponding potential function.

Proof: With $\theta^1 = \mu/\sigma^2$ and $\theta^2 = -1/(2\sigma^2)$, the logarithm of the univariate normal density can be written

(2.35) \( \displaystyle \log p(x;\theta^1,\theta^2) = -\frac{(x-\mu)^2}{2\sigma^2} - \log\bigl(\sqrt{2\pi}\,\sigma\bigr) = \theta^1 x + \theta^2 x^2 - \left(-\frac{(\theta^1)^2}{4\theta^2} + \frac12\log\left(-\frac{\pi}{\theta^2}\right)\right). \)

Hence the set of all univariate normal distributions is an exponential family, $(\theta^1,\theta^2)$ is a natural coordinate system, and $\psi$ in (2.34) is its potential function.

2.7.2 Fisher information metric

Proposition 2.7.2  The Fisher metric with respect to the natural coordinates $(\theta^1,\theta^2)$ is given by

(2.36) \( \displaystyle [g_{ij}] = \begin{pmatrix} -\dfrac{1}{2\theta^2} & \dfrac{\theta^1}{2(\theta^2)^2} \\[6pt] \dfrac{\theta^1}{2(\theta^2)^2} & \dfrac{\theta^2-(\theta^1)^2}{2(\theta^2)^3} \end{pmatrix} = \begin{pmatrix} \sigma^2 & 2\mu\sigma^2 \\ 2\mu\sigma^2 & 2\sigma^2(2\mu^2+\sigma^2) \end{pmatrix}. \)

Proof: Since $\psi$ is a potential function, the Fisher metric is given by the Hessian of $\psi$, that is,

(2.37) \( \displaystyle g_{ij} = \frac{\partial^2 \psi}{\partial\theta^i\,\partial\theta^j}\, , \)

and a straightforward calculation gives (2.36).

2.7.3 Mutually dual foliations

Since $N$ represents an exponential family, a mixture coordinate system is given by the gradient of the potential function. We have

(2.38) \( \displaystyle \eta_1 = \frac{\partial\psi}{\partial\theta^1} = -\frac{\theta^1}{2\theta^2} = \mu, \qquad \eta_2 = \frac{\partial\psi}{\partial\theta^2} = \frac{(\theta^1)^2}{4(\theta^2)^2} - \frac{1}{2\theta^2} = \mu^2 + \sigma^2 . \)
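The Hessian characterization $g_{ij} = \partial_i\partial_j\psi$ can be tested directly: differentiate the potential numerically and compare with the closed-form metric. A sketch (Python, standard library; not from the source):

```python
import math

def psi(t1, t2):
    """Potential function (2.34): psi = -t1^2/(4 t2) + (1/2) log(-pi/t2), for t2 < 0."""
    return -t1 ** 2 / (4 * t2) + 0.5 * math.log(-math.pi / t2)

def hessian(f, t1, t2, h=1e-4):
    """Central-difference Hessian (f11, f12, f22) of f at (t1, t2)."""
    f11 = (f(t1 + h, t2) - 2 * f(t1, t2) + f(t1 - h, t2)) / h ** 2
    f22 = (f(t1, t2 + h) - 2 * f(t1, t2) + f(t1, t2 - h)) / h ** 2
    f12 = (f(t1 + h, t2 + h) - f(t1 + h, t2 - h)
           - f(t1 - h, t2 + h) + f(t1 - h, t2 - h)) / (4 * h ** 2)
    return f11, f12, f22

def closed_form(mu, sigma):
    """Fisher metric (2.36) written in (mu, sigma) terms."""
    return sigma ** 2, 2 * mu * sigma ** 2, 2 * sigma ** 2 * (2 * mu ** 2 + sigma ** 2)
```

With $\mu = 1$, $\sigma = 2$ (so $\theta^1 = \tfrac14$, $\theta^2 = -\tfrac18$) the numerical Hessian agrees with $(\sigma^2,\,2\mu\sigma^2,\,2\sigma^2(2\mu^2+\sigma^2)) = (4, 8, 48)$ to several decimal places.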

Since $(\theta^1,\theta^2)$ is a $1$-affine coordinate system, $(\eta_1,\eta_2)$ is a $(-1)$-affine coordinate system, and they are mutually dual with respect to the Fisher metric. Therefore the normal manifold has dually orthogonal foliations. The coordinates in (2.38) admit the potential function

(2.39) \( \displaystyle \lambda = -\frac12\left(1 + \log\left(-\frac{\pi}{\theta^2}\right)\right) = -\frac12\bigl((1+\log(2\pi)) + 2\log\sigma\bigr). \)

2.7.4 Affine immersions for the Gaussian manifold

We show that the normal manifold can be realized in Euclidean $\mathbb{R}^3$ by an affine immersion.

Proposition 2.7.3  Let $N$ be the normal manifold with the Fisher metric $g$ and the exponential connection $\nabla^{(1)}$. Denote by $(\theta^1,\theta^2)$ a natural coordinate system. Then $N$ can be realized in $\mathbb{R}^3$ by the graph of the potential function, namely by the affine immersion $\{f,\xi\}$:

\( \displaystyle f : N \to \mathbb{R}^3 : \begin{pmatrix}\theta^1\\ \theta^2\end{pmatrix} \mapsto \begin{pmatrix}\theta^1\\ \theta^2\\ \psi(\theta)\end{pmatrix}, \qquad \xi = \begin{pmatrix}0\\0\\1\end{pmatrix}, \)

where $\psi$ is the potential function $\psi = -\frac{(\theta^1)^2}{4\theta^2} + \frac12\log(-\pi/\theta^2)$.

The submanifold of univariate normal distributions with zero mean (i.e. $\theta^1 = 0$) is represented by the curve

\( \displaystyle (-\infty,0) \to \mathbb{R}^3 : \theta^2 \mapsto \Bigl(0,\ \theta^2,\ \tfrac12\log\bigl(-\tfrac{\pi}{\theta^2}\bigr)\Bigr), \)

and the submanifold of univariate normal distributions with unit variance (i.e. $\theta^2 = -\tfrac12$) is represented by the curve

\( \displaystyle \mathbb{R} \to \mathbb{R}^3 : \theta^1 \mapsto \Bigl(\theta^1,\ -\tfrac12,\ \tfrac12(\theta^1)^2 + \tfrac12\log(2\pi)\Bigr). \)

2.8 Example: Bivariate Gaussian 5-manifold

Here we collect the results of Sato, Sugawa and Kawaguchi [66], for comparison with bivariate gamma manifolds. The probability density function of the bivariate Gaussian distribution has the form

(2.40) \( \displaystyle f(x,y) = \frac{1}{2\pi\sqrt{\Delta}} \exp\left\{-\frac{1}{2\Delta}\Bigl(\sigma_{22}(x-\mu_1)^2 - 2\sigma_{12}(x-\mu_1)(y-\mu_2) + \sigma_{11}(y-\mu_2)^2\Bigr)\right\}, \)

where $-\infty < x, y < \infty$, $-\infty < \mu_1,\mu_2 < \infty$, $0 < \sigma_{11},\sigma_{22} < \infty$ and $\Delta = \sigma_{11}\sigma_{22} - \sigma_{12}^2 > 0$. This contains the five parameters $(\mu_1,\mu_2,\sigma_{11},\sigma_{12},\sigma_{22}) \in \mathbb{R}^5$. It is easy to show that the marginal distributions are Gaussian with parameters $(\mu_1,\sigma_{11})$ and $(\mu_2,\sigma_{22})$; the covariance is $\sigma_{12}$.

2.8.1 Fisher information metric

The information geometry of (2.40) has been studied by Sato et al. [66]. With coordinates $(\mu_1,\mu_2,\sigma_{11},\sigma_{12},\sigma_{22})$, and $\Delta = \sigma_{11}\sigma_{22}-\sigma_{12}^2$ the determinant of $(\sigma_{ij})$, the metric tensor takes the form

(2.41) \( \displaystyle G = [g_{ij}] = \begin{pmatrix} \frac{\sigma_{22}}{\Delta} & -\frac{\sigma_{12}}{\Delta} & 0 & 0 & 0 \\ -\frac{\sigma_{12}}{\Delta} & \frac{\sigma_{11}}{\Delta} & 0 & 0 & 0 \\ 0 & 0 & \frac{\sigma_{22}^2}{2\Delta^2} & -\frac{\sigma_{12}\sigma_{22}}{\Delta^2} & \frac{\sigma_{12}^2}{2\Delta^2} \\ 0 & 0 & -\frac{\sigma_{12}\sigma_{22}}{\Delta^2} & \frac{\sigma_{11}\sigma_{22}+\sigma_{12}^2}{\Delta^2} & -\frac{\sigma_{11}\sigma_{12}}{\Delta^2} \\ 0 & 0 & \frac{\sigma_{12}^2}{2\Delta^2} & -\frac{\sigma_{11}\sigma_{12}}{\Delta^2} & \frac{\sigma_{11}^2}{2\Delta^2} \end{pmatrix}. \)

The inverse $[g^{ij}]$ of the fundamental tensor $[g_{ij}]$, defined by the relation $g_{ij}\,g^{jk} = \delta_i^{\,k}$, is given by

(2.42) \( \displaystyle G^{-1} = [g^{ij}] = \begin{pmatrix} \sigma_{11} & \sigma_{12} & 0 & 0 & 0 \\ \sigma_{12} & \sigma_{22} & 0 & 0 & 0 \\ 0 & 0 & 2\sigma_{11}^2 & 2\sigma_{11}\sigma_{12} & 2\sigma_{12}^2 \\ 0 & 0 & 2\sigma_{11}\sigma_{12} & \sigma_{11}\sigma_{22}+\sigma_{12}^2 & 2\sigma_{12}\sigma_{22} \\ 0 & 0 & 2\sigma_{12}^2 & 2\sigma_{12}\sigma_{22} & 2\sigma_{22}^2 \end{pmatrix}. \)

2.9 Example: Freund distributions

Freund [35] introduced a bivariate exponential mixture distribution arising from the following reliability considerations. Suppose that an instrument has two components A and B with lifetimes $X$ and $Y$ respectively, having density functions (when both components are in operation)

\( f_X(x) = \alpha_1 e^{-\alpha_1 x}, \qquad f_Y(y) = \alpha_2 e^{-\alpha_2 y} \qquad (\alpha_1,\alpha_2 > 0;\ x,y > 0). \)

Then $X$ and $Y$ are dependent, in that a failure of either component changes the parameter of the life distribution of the other component: when A fails, the parameter for $Y$ becomes $\beta_2$; when B fails, the parameter for $X$ becomes $\beta_1$. There is no other dependence. Hence the joint density function of $X$ and $Y$ is

(2.43) \( \displaystyle f(x,y) = \begin{cases} \alpha_1\beta_2\, e^{-\beta_2 y - (\alpha_1+\alpha_2-\beta_2)x} & \text{for } 0 < x < y, \\ \alpha_2\beta_1\, e^{-\beta_1 x - (\alpha_1+\alpha_2-\beta_1)y} & \text{for } 0 < y < x, \end{cases} \)

where $\alpha_i,\beta_i > 0$ ($i = 1,2$). Provided that $\alpha_1+\alpha_2 \neq \beta_1$, the marginal density function of $X$ is

(2.44) \( \displaystyle f_X(x) = \left(\frac{\alpha_2}{\alpha_1+\alpha_2-\beta_1}\right)\beta_1 e^{-\beta_1 x} + \left(\frac{\alpha_1-\beta_1}{\alpha_1+\alpha_2-\beta_1}\right)(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)x}, \quad x \ge 0, \)

and provided that $\alpha_1+\alpha_2 \neq \beta_2$, the marginal density function of $Y$ is

(2.45) \( \displaystyle f_Y(y) = \left(\frac{\alpha_1}{\alpha_1+\alpha_2-\beta_2}\right)\beta_2 e^{-\beta_2 y} + \left(\frac{\alpha_2-\beta_2}{\alpha_1+\alpha_2-\beta_2}\right)(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)y}, \quad y \ge 0. \)

We can see that the marginal density functions are not exponential but rather mixtures of exponential distributions if $\alpha_i > \beta_i$; otherwise, they are weighted averages. For this reason, this system
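The metric and its inverse above can be sanity-checked by multiplying them; the $\mu$-block and the $\sigma$-block invert independently since the matrix is block diagonal. A sketch (plain Python, no external libraries; illustrative parameter values):

```python
def bivariate_gaussian_metric(s11, s12, s22):
    """Fisher metric (2.41) in coordinates (mu1, mu2, s11, s12, s22)."""
    d = s11 * s22 - s12 ** 2
    return [
        [s22 / d, -s12 / d, 0, 0, 0],
        [-s12 / d, s11 / d, 0, 0, 0],
        [0, 0, s22 ** 2 / (2 * d * d), -s12 * s22 / (d * d), s12 ** 2 / (2 * d * d)],
        [0, 0, -s12 * s22 / (d * d), (s11 * s22 + s12 ** 2) / (d * d), -s11 * s12 / (d * d)],
        [0, 0, s12 ** 2 / (2 * d * d), -s11 * s12 / (d * d), s11 ** 2 / (2 * d * d)],
    ]

def bivariate_gaussian_inverse(s11, s12, s22):
    """Inverse metric (2.42)."""
    return [
        [s11, s12, 0, 0, 0],
        [s12, s22, 0, 0, 0],
        [0, 0, 2 * s11 ** 2, 2 * s11 * s12, 2 * s12 ** 2],
        [0, 0, 2 * s11 * s12, s11 * s22 + s12 ** 2, 2 * s12 * s22],
        [0, 0, 2 * s12 ** 2, 2 * s12 * s22, 2 * s22 ** 2],
    ]

def matmul(a, b):
    """Plain matrix product, enough for a 5x5 identity check."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]
```

For any admissible $(\sigma_{11},\sigma_{12},\sigma_{22})$ with $\Delta > 0$, the product of the two matrices is the $5\times5$ identity up to rounding.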

of distributions should be termed bivariate mixture exponential distributions rather than simply bivariate exponential distributions. The marginal density functions $f_X(x)$ and $f_Y(y)$ are exponential distributions only in the special case $\alpha_i = \beta_i$ ($i = 1,2$).

Freund discussed the statistics of the special case $\alpha_1+\alpha_2 = \beta_1 = \beta_2$, and obtained the joint density function

(2.46) \( \displaystyle f(x,y) = \begin{cases} \alpha_1(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)y} & \text{for } 0 < x < y, \\ \alpha_2(\alpha_1+\alpha_2)\, e^{-(\alpha_1+\alpha_2)x} & \text{for } 0 < y < x, \end{cases} \)

with marginal density functions

(2.47) \( f_X(x) = \bigl(\alpha_1 + \alpha_2(\alpha_1+\alpha_2)\,x\bigr)\, e^{-(\alpha_1+\alpha_2)x}, \quad x \ge 0, \)

(2.48) \( f_Y(y) = \bigl(\alpha_2 + \alpha_1(\alpha_1+\alpha_2)\,y\bigr)\, e^{-(\alpha_1+\alpha_2)y}, \quad y \ge 0. \)

The covariance and correlation coefficient of $X$ and $Y$ are

(2.49) \( \displaystyle \mathrm{Cov}(X,Y) = \frac{\beta_1\beta_2 - \alpha_1\alpha_2}{\beta_1\beta_2\,(\alpha_1+\alpha_2)^2}, \)

(2.50) \( \displaystyle \rho(X,Y) = \frac{\beta_1\beta_2 - \alpha_1\alpha_2}{\sqrt{(\alpha_2^2 + 2\alpha_1\alpha_2 + \beta_1^2)(\alpha_1^2 + 2\alpha_1\alpha_2 + \beta_2^2)}}. \)

Note that $-\tfrac13 < \rho(X,Y) < 1$. The correlation coefficient $\rho(X,Y) \to 1$ when $\beta_1,\beta_2 \to \infty$, and $\rho(X,Y) \to -\tfrac13$ when $\alpha_1 = \alpha_2$ and $\beta_1,\beta_2 \to 0$. In many applications $\beta_i > \alpha_i$ ($i = 1,2$), i.e. lifetime tends to be shorter when the other component is out of action; in such cases the correlation is positive.

2.9.1 Fisher information metric

Proposition 2.9.1  Let $F$ be the set of Freund bivariate mixture exponential distributions (2.43). Then $F$ becomes a 4-manifold with local coordinates $(x^1,x^2,x^3,x^4) = (\alpha_1,\beta_1,\alpha_2,\beta_2)$ and Fisher information metric

\( \displaystyle g_{ij} = \int_0^\infty\!\!\int_0^\infty \left(-\frac{\partial^2 \log f(x,y)}{\partial x^i\,\partial x^j}\right) f(x,y)\, dx\, dy \)

given by

(2.51) \( \displaystyle [g_{ij}] = \begin{pmatrix} \frac{1}{\alpha_1(\alpha_1+\alpha_2)} & 0 & 0 & 0 \\ 0 & \frac{\alpha_2}{\beta_1^2(\alpha_1+\alpha_2)} & 0 & 0 \\ 0 & 0 & \frac{1}{\alpha_2(\alpha_1+\alpha_2)} & 0 \\ 0 & 0 & 0 & \frac{\alpha_1}{\beta_2^2(\alpha_1+\alpha_2)} \end{pmatrix}. \)

The inverse $[g^{ij}]$ of $[g_{ij}]$ is given by

(2.52) \( \displaystyle [g^{ij}] = \begin{pmatrix} \alpha_1(\alpha_1+\alpha_2) & 0 & 0 & 0 \\ 0 & \frac{\beta_1^2(\alpha_1+\alpha_2)}{\alpha_2} & 0 & 0 \\ 0 & 0 & \alpha_2(\alpha_1+\alpha_2) & 0 \\ 0 & 0 & 0 & \frac{\beta_2^2(\alpha_1+\alpha_2)}{\alpha_1} \end{pmatrix}. \)
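Integrating the marginal (2.44) term by term gives $E(X) = (\alpha_2+\beta_1)/(\beta_1(\alpha_1+\alpha_2))$ — a routine computation, stated here only for the check. The sketch below (Python, standard library; parameter values illustrative) verifies the total mass and this mean by Simpson quadrature:

```python
import math

def freund_marginal_x(x, a1, a2, b1):
    """Marginal density (2.44) of X, valid when a1 + a2 != b1."""
    g = a1 + a2
    return (a2 / (g - b1)) * b1 * math.exp(-b1 * x) \
        + ((a1 - b1) / (g - b1)) * g * math.exp(-g * x)

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi]."""
    h = (hi - lo) / n
    return h / 3 * sum((1 if i in (0, n) else 4 if i % 2 else 2) * f(lo + i * h)
                       for i in range(n + 1))

def mean_x(a1, a2, b1):
    """E(X) = (a2 + b1)/(b1 (a1 + a2)), obtained by integrating (2.44)."""
    return (a2 + b1) / (b1 * (a1 + a2))
```

With $\alpha_1 = 1$, $\alpha_2 = 2$, $\beta_1 = 1.5$ the quadratures give total mass $1$ and mean $7/9 \approx 0.778$, matching the formula.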

[Figure 2.4: Part of the family of McKay bivariate gamma pdfs. The McKay pdf is zero outside the octant $0 < x < y < \infty$; here the correlation coefficient has been set to $\rho_{xy} = 0.6$ and $\alpha_1 = 5$.]

2.10 Example: McKay bivariate gamma 3-manifold

The information geometry of the 3-manifold of McKay bivariate gamma distributions can provide a metrization of departures from randomness and departures from independence for bivariate processes. The curvature objects are derived, including those on three submanifolds. As in the case of bivariate normal manifolds, we have negative scalar curvature, but here it is not constant and we show how it depends on correlation. These results have potential applications, for example, in the characterization of stochastic materials.

Fisher information metric

The classical family of McKay bivariate gamma distributions $M$ is defined on $0 < x < y < \infty$ with parameters $\alpha_1, \sigma_{12}, \alpha_2 > 0$ and probability density functions

(2.53) \( \displaystyle f(x,y;\alpha_1,\sigma_{12},\alpha_2) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1+\alpha_2}{2}} x^{\alpha_1-1}\,(y-x)^{\alpha_2-1}\, e^{-\sqrt{\alpha_1/\sigma_{12}}\; y}}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)} . \)

Here $\sigma_{12}$ is the covariance of $X$ and $Y$. One way to view this is that $f(x,y)$ is the probability density for the two random variables $X$ and $Y = X + Z$, where $X$ and $Z$ both have gamma distributions. The correlation coefficient and marginal density functions of $X$ and $Y$ are given by

(2.54) \( \displaystyle \rho(X,Y) = \sqrt{\frac{\alpha_1}{\alpha_1+\alpha_2}} > 0, \)

(2.55) \( \displaystyle f_X(x) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1}{2}} x^{\alpha_1-1}\, e^{-\sqrt{\alpha_1/\sigma_{12}}\; x}}{\Gamma(\alpha_1)}, \quad x > 0, \)

(2.56) \( \displaystyle f_Y(y) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1+\alpha_2}{2}} y^{\alpha_1+\alpha_2-1}\, e^{-\sqrt{\alpha_1/\sigma_{12}}\; y}}{\Gamma(\alpha_1+\alpha_2)}, \quad y > 0. \)

The marginal distributions of $X$ and $Y$ are gamma with shape parameters $\alpha_1$ and $\alpha_1+\alpha_2$, respectively; note that it is not possible to choose parameters such that both marginal functions are exponential.
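That $Y$ is marginally gamma with shape $\alpha_1+\alpha_2$ rests on the Euler beta integral $\int_0^y x^{\alpha_1-1}(y-x)^{\alpha_2-1}\,dx = B(\alpha_1,\alpha_2)\,y^{\alpha_1+\alpha_2-1}$. The sketch below (Python, standard library; illustrative values) checks the joint density against the $Y$-marginal numerically:

```python
import math

def mckay_pdf(x, y, a1, s12, a2):
    """McKay density (2.53) on 0 < x < y, with rate c = sqrt(a1/s12)."""
    c = math.sqrt(a1 / s12)
    return (c ** (a1 + a2) * x ** (a1 - 1) * (y - x) ** (a2 - 1)
            * math.exp(-c * y)) / (math.gamma(a1) * math.gamma(a2))

def gamma_marginal_y(y, a1, s12, a2):
    """Marginal (2.56): gamma with shape a1 + a2 and rate c."""
    c = math.sqrt(a1 / s12)
    return c ** (a1 + a2) * y ** (a1 + a2 - 1) * math.exp(-c * y) / math.gamma(a1 + a2)

def marginal_y_numeric(y, a1, s12, a2, n=4000):
    """Integrate the joint density over x in (0, y) by the midpoint rule."""
    h = y / n
    return sum(mckay_pdf((i + 0.5) * h, y, a1, s12, a2) for i in range(n)) * h
```

The correlation $\rho = \sqrt{\alpha_1/(\alpha_1+\alpha_2)}$ depends only on the two shape parameters, which is why $\sigma_{12}$ can serve as an independent coordinate.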

Proposition 2.10.1  Let $M$ be the family of McKay bivariate gamma distributions. Then $(\alpha_1,\sigma_{12},\alpha_2)$ is a local coordinate system, and $M$ becomes a 3-manifold with Fisher information metric

(2.57) \( \displaystyle [g_{ij}] = \begin{pmatrix} \psi'(\alpha_1) + \dfrac{\alpha_2-3\alpha_1}{4\alpha_1^2} & \dfrac{\alpha_1-\alpha_2}{4\alpha_1\sigma_{12}} & -\dfrac{1}{2\alpha_1} \\[6pt] \dfrac{\alpha_1-\alpha_2}{4\alpha_1\sigma_{12}} & \dfrac{\alpha_1+\alpha_2}{4\sigma_{12}^2} & \dfrac{1}{2\sigma_{12}} \\[6pt] -\dfrac{1}{2\alpha_1} & \dfrac{1}{2\sigma_{12}} & \psi'(\alpha_2) \end{pmatrix}, \)

where we have abbreviated $\psi'(\alpha_i) = \left(\dfrac{\Gamma'(\alpha_i)}{\Gamma(\alpha_i)}\right)'$.

The inverse $[g^{ij}]$ of $[g_{ij}]$ is given, with $D = (\alpha_1+\alpha_2)\,\psi'(\alpha_1)\,\psi'(\alpha_2) - \psi'(\alpha_1) - \psi'(\alpha_2)$, by

(2.58)
\( \displaystyle g^{11} = \frac{(\alpha_1+\alpha_2)\,\psi'(\alpha_2) - 1}{D}, \)
\( \displaystyle g^{12} = g^{21} = -\,\frac{\sigma_{12}\bigl(1 + (\alpha_1-\alpha_2)\,\psi'(\alpha_2)\bigr)}{\alpha_1\, D}, \)
\( \displaystyle g^{13} = g^{31} = \frac{1}{D}, \)
\( \displaystyle g^{22} = \frac{\sigma_{12}^2\bigl(4\alpha_1^2\,\psi'(\alpha_1)\psi'(\alpha_2) + (\alpha_2-3\alpha_1)\,\psi'(\alpha_2) - 1\bigr)}{\alpha_1^2\, D}, \)
\( \displaystyle g^{23} = g^{32} = -\,\frac{\sigma_{12}\bigl(2\alpha_1\,\psi'(\alpha_1) - 1\bigr)}{\alpha_1\, D}, \)
\( \displaystyle g^{33} = \frac{(\alpha_1+\alpha_2)\,\psi'(\alpha_1) - 1}{D}. \)
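The metric and inverse components above can be checked by forming both matrices and verifying that their product is the identity. In the sketch below (Python, standard library; not from the source) the trigamma function is approximated by a second difference of $\log\Gamma$:

```python
import math

def trigamma(z, h=1e-4):
    """psi'(z) via a second central difference of lgamma."""
    return (math.lgamma(z + h) - 2 * math.lgamma(z) + math.lgamma(z - h)) / h ** 2

def mckay_metric(a1, s, a2):
    """Fisher metric (2.57) in coordinates (alpha1, sigma12, alpha2)."""
    t1, t2 = trigamma(a1), trigamma(a2)
    return [[t1 + (a2 - 3 * a1) / (4 * a1 ** 2), (a1 - a2) / (4 * a1 * s), -1 / (2 * a1)],
            [(a1 - a2) / (4 * a1 * s), (a1 + a2) / (4 * s ** 2), 1 / (2 * s)],
            [-1 / (2 * a1), 1 / (2 * s), t2]]

def mckay_inverse(a1, s, a2):
    """Closed-form inverse (2.58), with D = (a1+a2) t1 t2 - t1 - t2."""
    t1, t2 = trigamma(a1), trigamma(a2)
    D = (a1 + a2) * t1 * t2 - t1 - t2
    return [[((a1 + a2) * t2 - 1) / D,
             -s * (1 + (a1 - a2) * t2) / (a1 * D),
             1 / D],
            [-s * (1 + (a1 - a2) * t2) / (a1 * D),
             s ** 2 * (4 * a1 ** 2 * t1 * t2 + (a2 - 3 * a1) * t2 - 1) / (a1 ** 2 * D),
             -s * (2 * a1 * t1 - 1) / (a1 * D)],
            [1 / D,
             -s * (2 * a1 * t1 - 1) / (a1 * D),
             ((a1 + a2) * t1 - 1) / D]]
```

A useful by-product of the cofactor computation is the compact determinant $\det[g_{ij}] = D/(4\sigma_{12}^2)$.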


Chapter 3

Universal connections and curvature

It is common to have to consider a number of linear connections on a given statistical manifold, so it is important to know the corresponding universal connection and curvature; then all linear connections and their curvatures are pullbacks. An important class of statistical manifolds is that arising from the so-called exponential families [3, 4], and one particular family is that of gamma distributions, which we showed recently [7] to have important uniqueness properties in stochastic processes. Here we describe the system of all linear connections on the manifold of exponential families, using the tangent bundle or the frame bundle to give the system space. Moreover, we provide formulae for the universal connections and curvatures and give an explicit example for the manifold of gamma distributions.

3.1 Systems of connections and universal objects

The concept of a system (or structure) of connections was introduced by Mangiarotti and Modugno [50, 57], who were concerned with finite-dimensional bundle representations of the space of all connections on a fibred manifold. For each system of connections there exists a unique universal connection, of which every connection in the family is a pullback. A similar relation holds between the corresponding universal curvature and the curvatures of the connections of the system.

Definition 3.1  A system of connections on a fibred manifold $p : E \to M$ is a fibred manifold $p_c : C \to M$ together with a first-jet-valued fibred morphism $\xi : C\times_M E \to J^1E$ over $M$, such that each section $\bar\Gamma : M \to C$ determines a unique connection $\Gamma = \xi\circ(\bar\Gamma\circ p, I_E)$ on $E$. Then $C$ is called the space of connections of the system.

In the sequel we are interested in the system of linear connections on a Riemannian manifold. The system of all linear connections is the subject of studies in e.g. [4, 5, 7, 14].

Theorem 3.2 ([50, 57])  Let $(C,\xi)$ be a system of connections on a fibred manifold $p : E \to M$.
Then there is a unique connection form

\( \Lambda : C\times_M E \to J^1(C\times_M E) \)

on the fibred manifold $\pi : C\times_M E \to C$, which has the coordinate expression

\( \Lambda = dx^\lambda\otimes\partial_\lambda + dc^a\otimes\partial_a + \xi^i_\lambda\, dx^\lambda\otimes\partial_i . \)

This $\Lambda$ is called the universal connection because it describes all the connections of the system. Explicitly, each $\bar\Gamma \in \mathrm{Sec}(C/M)$ gives an injection $(\bar\Gamma\circ p, I_E)$ of $E$ into $C\times_M E$, which is a section of $\pi$, and $\Gamma$ coincides with the restriction of $\Lambda$ to this section:

\( \Lambda|_{(\bar\Gamma\circ p,\, I_E)E} = \Gamma. \)

A similar relation holds between its curvature $\Omega$, called the universal curvature, and the curvatures of the connections of the system:

\( \Omega = \tfrac12[\Lambda,\Lambda] = d_\Lambda\Lambda : C\times_M E \to \wedge^2 T^*(C\times_M E)\otimes V E. \)

So the universal curvature $\Omega$ has the coordinate expression

\( \Omega = \bigl(\xi^j_\lambda\,\partial_j\xi^i_\mu\; dx^\lambda\wedge dx^\mu + \partial_a\xi^i_\mu\; dc^a\wedge dx^\mu\bigr)\otimes\partial_i . \)

3.2 Exponential family of probability density functions on R

From the definition of an exponential family, and putting $\partial_i = \partial/\partial x^i$, we may obtain

(3.1) \( \partial_i\, l(y;x) = F_i(y) - \partial_i\psi(x) \)

and

(3.2) \( \partial_i\partial_j\, l(y;x) = -\partial_i\partial_j\psi(x), \)

where $l(y;x) = \log p(y;x)$. Hence the Fisher information metric [3, 4] on the $n$-dimensional space of parameters $\Theta \subset \mathbb{R}^n$ has coordinates

(3.3) \( g = [g_{ij}] = -E[\partial_i\partial_j\, l(y;x)] = \partial_i\partial_j\psi(x) = \psi_{ij}(x). \)

The Levi-Civita connection with respect to $g$ is given by

(3.4) \( \displaystyle \Gamma^k_{ij}(x) = \frac12\sum_{h=1}^n g^{kh}\bigl(\partial_i g_{jh} + \partial_j g_{ih} - \partial_h g_{ij}\bigr) = \frac12\sum_{h=1}^n g^{kh}\,\partial_i\partial_j\partial_h\psi(x) = \frac12\sum_{h=1}^n \psi^{kh}(x)\,\psi_{ijh}(x), \)

where $(\psi^{hk}(x))$ represents the dual metric to $(\psi_{hk}(x))$.

3.3 Universal connection and curvature for exponential families

In this section we provide explicit formulae for the universal connection and curvature of an arbitrary statistical $n$-dimensional exponential manifold. We do this in the two available ways: using the tangent bundle and using the frame bundle systems of all linear connections.

3.3.1 Tangent bundle system: $C_T\times_M TM \to J^1TM$

The system of all linear connections on a manifold $M$ has a representation on the tangent bundle $E = TM \to M$ with system space

\( C_T = \{\alpha\otimes j\Gamma \in T^*M\otimes_M J^1TM \mid j\Gamma : TM \to T^*M\otimes TTM \text{ projects onto } I_{TM}\}. \)

Here we view $I_{TM}$ as a section of $T^*M\otimes TM$, which is a subbundle of $T^*M\otimes TTM$, with local expression $dx^\lambda\otimes\partial_\lambda$. The fibred morphism for this system is given by

(3.5)–(3.6) \( \xi_T : C_T\times_M TM \to J^1TM \subset T^*M\otimes_{TM} TTM, \qquad (\alpha\otimes j\Gamma,\, \nu) \mapsto \alpha(\nu)\, j\Gamma. \)

In coordinates

(3.7) \( \displaystyle \xi_T = dx^\lambda\otimes\bigl(\partial_\lambda + \gamma^i_\lambda\,\partial_i\bigr) = dx^\lambda\otimes\bigl(\partial_\lambda + y^j\,\Gamma^i_{j\lambda}\,\partial_i\bigr) = dx^\lambda\otimes\Bigl(\partial_\lambda + y^j\,\tfrac12\sum_{h=1}^n \psi^{ih}\psi_{j\lambda h}\;\partial_i\Bigr). \)

Each section of $C_T \to M$, such as $\bar\Gamma : M \to C_T : (x^\lambda) \mapsto (x^\lambda, \gamma^\mu_{\nu\vartheta})$, determines the unique linear connection $\Gamma = \xi_T\circ(\bar\Gamma\circ\pi_T, I_{TM})$ with Christoffel symbols $\Gamma^\lambda_{\mu\vartheta}$. On the fibred manifold $\pi : C_T\times_M TM \to C_T$ the universal connection is given by

\( \Lambda_T : C_T\times_M TM \to J^1(C_T\times_M TM) \subset T^*C_T\otimes T(C_T\times_M TM), \)

briefly, in coordinates $(x^\lambda, v^\lambda_{\mu\nu}, y^\lambda)$,

\( (X^\lambda, V^\lambda_{\mu\nu}) \mapsto (X^\lambda,\, V^\lambda_{\mu\nu},\, Y^\mu V^\lambda_{\mu\nu}X^\nu), \)

with coordinate expression

(3.8) \( \displaystyle \Lambda_T = dx^\lambda\otimes\partial_\lambda + dv^a\otimes\partial_a + y^\mu v^i_{\mu\nu}\, dx^\nu\otimes\partial_i, \)

which on the system of Levi-Civita connections, $v^i_{\mu\nu} = \tfrac12\sum_h \psi^{ih}\psi_{\mu\nu h}$, reads

\( \displaystyle \Lambda_T = dx^\lambda\otimes\partial_\lambda + dv^a\otimes\partial_a + y^\mu\,\tfrac12\sum_{h=1}^n \psi^{ih}\psi_{\mu\nu h}\; dx^\nu\otimes\partial_i . \)

Explicitly, each $\bar\Gamma \in \mathrm{Sec}(C_T/M)$ gives an injection $(\bar\Gamma\circ\pi_T, I_{TM})$ of $TM$ into $C_T\times_M TM$, which is a section of $\pi$, and $\Gamma$ coincides with the restriction of $\Lambda_T$ to this section:

\( \Lambda_T|_{(\bar\Gamma\circ\pi_T,\, I_{TM})TM} = \Gamma, \)

and the universal curvature of the connection $\Lambda_T$ is

\( \Omega_T = d_{\Lambda_T}\Lambda_T : C_T\times_M TM \to \bigl(\wedge^2 T^*C_T\bigr)\otimes_{TM} V(TM). \)

So here the universal curvature $\Omega_T$ has the coordinate expression

\( \displaystyle \Omega_T = \Bigl(y^k v^j_{k\lambda}\,\partial_j\bigl(y^m v^i_{m\mu}\bigr)\, dx^\lambda\wedge dx^\mu + \partial_a\bigl(y^m v^i_{m\mu}\bigr)\, dv^a\wedge dx^\mu\Bigr)\otimes\partial_i, \)

which on the system of Levi-Civita connections is obtained by substituting $v^i_{m\mu} = \tfrac12\sum_{h=1}^n \psi^{ih}\psi_{m\mu h}$.

3.3.2 Frame bundle system: $C_F\times_M FM \to J^1FM$

A linear connection is also a principal (i.e. group-invariant) connection on the principal bundle of frames $FM$, with $E = FM \to M = FM/G$ consisting of linear frames (ordered bases for tangent spaces), with structure group the general linear group $G = GL(n)$. Here the system space is

\( C_F = J^1FM/G \subset T^*M\otimes_M TFM/G, \)

consisting of $G$-invariant jets. The system morphism is

\( \xi_F : C_F\times_M FM \to J^1FM \subset T^*M\otimes_M TFM, \qquad ([j^1s_x],\, b) \mapsto [T_xM \to T_bFM]. \)

In coordinates

(3.9) \( \displaystyle \xi_F = dx^\lambda\otimes\bigl(\partial_\lambda + X^\mu(\partial_\mu s^\lambda_\nu)\,\partial^\nu_\lambda\bigr) = dx^\lambda\otimes\bigl(\partial_\lambda + X^\mu\,\Gamma^\lambda_{\mu\nu}\,\partial^\nu_\lambda\bigr) = dx^\lambda\otimes\Bigl(\partial_\lambda + X^\mu\,\tfrac12\sum_{h=1}^n \psi^{\lambda h}\psi_{\mu\nu h}\;\partial^\nu_\lambda\Bigr), \)

where $\partial^\nu_\lambda = \partial/\partial b^\lambda_\nu$ is the natural base on the vertical fibre of $T_bFM$ induced by the coordinates $(b^\lambda_\nu)$ on $FM$. Each section of $C_F \to M$ that is projectable onto $I_{TM}$, such as $\hat\Gamma : M \to C_F : (x^\lambda) \mapsto (x^\lambda, [j\Gamma_x])$ with $\Gamma^\lambda_{\mu\nu} = \partial_\mu s^\lambda_\nu$, determines the unique linear connection $\Gamma = \xi_F\circ(\hat\Gamma\circ\pi_F, I_{FM})$ with Christoffel symbols $\Gamma^\lambda_{\mu\nu}$. On the principal $G$-bundle $\pi : C_F\times_M FM \to C_F$ the universal connection is given by

\( \Lambda_F : C_F\times_M FM \to J^1(C_F\times_M FM) \subset T^*C_F\otimes T(C_F\times_M FM), \)

briefly,

\( (X^\lambda, Y^\lambda_{\mu\nu}) \mapsto (X^\lambda,\, Y^\lambda_{\mu\nu},\, b^\mu_\nu\, v^\lambda_{\mu\theta}X^\theta), \)

with coordinate expression

(3.10) \( \displaystyle \Lambda_F = dx^\lambda\otimes\partial_\lambda + dv^a\otimes\partial_a + b^\mu_\nu v^\lambda_{\mu\theta}\, dx^\theta\otimes\partial^\nu_\lambda = dx^\lambda\otimes\partial_\lambda + dv^a\otimes\partial_a + b^\mu_\nu\,\tfrac12\sum_{h=1}^n \psi^{\lambda h}\psi_{\mu\theta h}\; dx^\theta\otimes\partial^\nu_\lambda . \)

Explicitly, each $\hat\Gamma \in \mathrm{Sec}(C_F/M)$ gives an injection $(\hat\Gamma\circ\pi_F, I_{FM})$ of $FM$ into $C_F\times_M FM$, which is a section of $\pi$, and $\Gamma$ coincides with the restriction of $\Lambda_F$ to this section:

\( \Lambda_F|_{(\hat\Gamma\circ\pi_F,\, I_{FM})FM} = \Gamma, \)

and the universal curvature of the connection $\Lambda_F$ is

\( \Omega_F = d_{\Lambda_F}\Lambda_F : C_F\times_M FM \to \bigl(\wedge^2 T^*C_F\bigr)\otimes_{FM} V(FM). \)

So here the universal curvature form $\Omega_F$ has the coordinate expression

\( \displaystyle \Omega_F = \Bigl(b^k_\nu v^\beta_{k\lambda}\,\partial^\nu_\beta\bigl(b^m_\omega v^\alpha_{m\mu}\bigr)\, dx^\lambda\wedge dx^\mu + \partial_a\bigl(b^m_\omega v^\alpha_{m\mu}\bigr)\, dv^a\wedge dx^\mu\Bigr)\otimes\partial^\omega_\alpha, \)

which on the system of Levi-Civita connections is obtained by substituting $v^\alpha_{m\mu} = \tfrac12\sum_{h=1}^n \psi^{\alpha h}\psi_{m\mu h}$.

3.4 Universal connection and curvature on the gamma manifold

Here we give explicit forms for the system space and its universal connection and curvature on the 2-dimensional statistical manifold of gamma distributions. Let $G$ be the family of gamma probability density functions given by

(3.11) \( \displaystyle \left\{\, p(x;\beta,\mu) = \frac{\beta^\mu}{\Gamma(\mu)}\, x^{\mu-1}\, e^{-\beta x} \;\Big|\; \beta,\mu \in \mathbb{R}^+ \right\}, \quad x \in \mathbb{R}^+. \)

So the space of parameters is topologically $\mathbb{R}^+\times\mathbb{R}^+$. It is an exponential family and it includes as a special case ($\mu = 1$) the exponential distribution itself, which complements the Poisson process on a line.

Since the set of all gamma distributions is an exponential family with natural coordinate system $(\beta,\mu)$ and potential function $\psi = \log\Gamma(\mu) - \mu\log\beta$ [4], the Fisher metric (3.3) is given by

(3.12) \( \displaystyle [g_{ij}] = \begin{pmatrix} \dfrac{\mu}{\beta^2} & -\dfrac{1}{\beta} \\[6pt] -\dfrac{1}{\beta} & \varphi'(\mu) \end{pmatrix}, \)

where $\varphi(\mu) = \Gamma'(\mu)/\Gamma(\mu)$ is the logarithmic derivative of the gamma function. The Levi-Civita connection components (3.4) are given by

(3.13)
\( \displaystyle \Gamma^1_{11} = \frac{1 - 2\mu\varphi'(\mu)}{2\beta\,(\mu\varphi'(\mu)-1)}, \qquad \Gamma^2_{11} = \frac{-\mu}{2\beta^2\,(\mu\varphi'(\mu)-1)}, \)
\( \displaystyle \Gamma^1_{12} = \Gamma^1_{21} = \frac{\varphi'(\mu)}{2\,(\mu\varphi'(\mu)-1)}, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{2\beta\,(\mu\varphi'(\mu)-1)}, \)
\( \displaystyle \Gamma^1_{22} = \frac{\beta\,\varphi''(\mu)}{2\,(\mu\varphi'(\mu)-1)}, \qquad \Gamma^2_{22} = \frac{\mu\,\varphi''(\mu)}{2\,(\mu\varphi'(\mu)-1)}, \)

while the remaining independent components are zero.


More information

Symplectic and Kähler Structures on Statistical Manifolds Induced from Divergence Functions

Symplectic and Kähler Structures on Statistical Manifolds Induced from Divergence Functions Symplectic and Kähler Structures on Statistical Manifolds Induced from Divergence Functions Jun Zhang 1, and Fubo Li 1 University of Michigan, Ann Arbor, Michigan, USA Sichuan University, Chengdu, China

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

Some illustrations of information geometry in biology and physics

Some illustrations of information geometry in biology and physics Some illustrations of information geometry in biology and physics C.T.J. Dodson School of Mathematics, University of Manchester UK Abstract Many real processes have stochastic features which seem to be

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical

More information

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique

More information

Metrics and Holonomy

Metrics and Holonomy Metrics and Holonomy Jonathan Herman The goal of this paper is to understand the following definitions of Kähler and Calabi-Yau manifolds: Definition. A Riemannian manifold is Kähler if and only if it

More information

On the existence of isoperimetric extremals of rotation and the fundamental equations of rotary diffeomorphisms

On the existence of isoperimetric extremals of rotation and the fundamental equations of rotary diffeomorphisms XV II th International Conference of Geometry, Integrability and Quantization Sveti Konstantin i Elena 2016 On the existence of isoperimetric extremals of rotation and the fundamental equations of rotary

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

A CONSTRUCTION OF TRANSVERSE SUBMANIFOLDS

A CONSTRUCTION OF TRANSVERSE SUBMANIFOLDS UNIVERSITATIS IAGELLONICAE ACTA MATHEMATICA, FASCICULUS XLI 2003 A CONSTRUCTION OF TRANSVERSE SUBMANIFOLDS by J. Szenthe Abstract. In case of Riemannian manifolds isometric actions admitting submanifolds

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Random Variables. P(x) = P[X(e)] = P(e). (1)

Random Variables. P(x) = P[X(e)] = P(e). (1) Random Variables Random variable (discrete or continuous) is used to derive the output statistical properties of a system whose input is a random variable or random in nature. Definition Consider an experiment

More information

THEODORE VORONOV DIFFERENTIAL GEOMETRY. Spring 2009

THEODORE VORONOV DIFFERENTIAL GEOMETRY. Spring 2009 [under construction] 8 Parallel transport 8.1 Equation of parallel transport Consider a vector bundle E B. We would like to compare vectors belonging to fibers over different points. Recall that this was

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

LECTURE 10: THE PARALLEL TRANSPORT

LECTURE 10: THE PARALLEL TRANSPORT LECTURE 10: THE PARALLEL TRANSPORT 1. The parallel transport We shall start with the geometric meaning of linear connections. Suppose M is a smooth manifold with a linear connection. Let γ : [a, b] M be

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

1.1 Review of Probability Theory

1.1 Review of Probability Theory 1.1 Review of Probability Theory Angela Peace Biomathemtics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,

More information

Measure-theoretic probability

Measure-theoretic probability Measure-theoretic probability Koltay L. VEGTMAM144B November 28, 2012 (VEGTMAM144B) Measure-theoretic probability November 28, 2012 1 / 27 The probability space De nition The (Ω, A, P) measure space is

More information

Probability on a Riemannian Manifold

Probability on a Riemannian Manifold Probability on a Riemannian Manifold Jennifer Pajda-De La O December 2, 2015 1 Introduction We discuss how we can construct probability theory on a Riemannian manifold. We make comparisons to this and

More information

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours MATH2750 This question paper consists of 8 printed pages, each of which is identified by the reference MATH275. All calculators must carry an approval sticker issued by the School of Mathematics. c UNIVERSITY

More information

Mean-field dual of cooperative reproduction

Mean-field dual of cooperative reproduction The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time

More information

Section 2. Basic formulas and identities in Riemannian geometry

Section 2. Basic formulas and identities in Riemannian geometry Section 2. Basic formulas and identities in Riemannian geometry Weimin Sheng and 1. Bianchi identities The first and second Bianchi identities are R ijkl + R iklj + R iljk = 0 R ijkl,m + R ijlm,k + R ijmk,l

More information

Let X and Y denote two random variables. The joint distribution of these random

Let X and Y denote two random variables. The joint distribution of these random EE385 Class Notes 9/7/0 John Stensby Chapter 3: Multiple Random Variables Let X and Y denote two random variables. The joint distribution of these random variables is defined as F XY(x,y) = [X x,y y] P.

More information

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 18.466 Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 1. MLEs in exponential families Let f(x,θ) for x X and θ Θ be a likelihood function, that is, for present purposes,

More information

Minimal timelike surfaces in a certain homogeneous Lorentzian 3-manifold II

Minimal timelike surfaces in a certain homogeneous Lorentzian 3-manifold II Minimal timelike surfaces in a certain homogeneous Lorentzian 3-manifold II Sungwook Lee Abstract The 2-parameter family of certain homogeneous Lorentzian 3-manifolds which includes Minkowski 3-space and

More information

MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, Elementary tensor calculus

MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, Elementary tensor calculus MATH 4030 Differential Geometry Lecture Notes Part 4 last revised on December 4, 205 Elementary tensor calculus We will study in this section some basic multilinear algebra and operations on tensors. Let

More information

Open Questions in Quantum Information Geometry

Open Questions in Quantum Information Geometry Open Questions in Quantum Information Geometry M. R. Grasselli Department of Mathematics and Statistics McMaster University Imperial College, June 15, 2005 1. Classical Parametric Information Geometry

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Strictly convex functions on complete Finsler manifolds

Strictly convex functions on complete Finsler manifolds Proc. Indian Acad. Sci. (Math. Sci.) Vol. 126, No. 4, November 2016, pp. 623 627. DOI 10.1007/s12044-016-0307-2 Strictly convex functions on complete Finsler manifolds YOE ITOKAWA 1, KATSUHIRO SHIOHAMA

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

EE4601 Communication Systems

EE4601 Communication Systems EE4601 Communication Systems Week 2 Review of Probability, Important Distributions 0 c 2011, Georgia Institute of Technology (lect2 1) Conditional Probability Consider a sample space that consists of two

More information

2 (Statistics) Random variables

2 (Statistics) Random variables 2 (Statistics) Random variables References: DeGroot and Schervish, chapters 3, 4 and 5; Stirzaker, chapters 4, 5 and 6 We will now study the main tools use for modeling experiments with unknown outcomes

More information

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems

More information

3. Probability and Statistics

3. Probability and Statistics FE661 - Statistical Methods for Financial Engineering 3. Probability and Statistics Jitkomut Songsiri definitions, probability measures conditional expectations correlation and covariance some important

More information

Weighted Exponential Distribution and Process

Weighted Exponential Distribution and Process Weighted Exponential Distribution and Process Jilesh V Some generalizations of exponential distribution and related time series models Thesis. Department of Statistics, University of Calicut, 200 Chapter

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Introduction to Probability and Stocastic Processes - Part I

Introduction to Probability and Stocastic Processes - Part I Introduction to Probability and Stocastic Processes - Part I Lecture 1 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark

More information

A Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes.

A Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes. A Probability Primer A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes. Are you holding all the cards?? Random Events A random event, E,

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Mean-field equations for higher-order quantum statistical models : an information geometric approach

Mean-field equations for higher-order quantum statistical models : an information geometric approach Mean-field equations for higher-order quantum statistical models : an information geometric approach N Yapage Department of Mathematics University of Ruhuna, Matara Sri Lanka. arxiv:1202.5726v1 [quant-ph]

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Geometry in a Fréchet Context: A Projective Limit Approach

Geometry in a Fréchet Context: A Projective Limit Approach Geometry in a Fréchet Context: A Projective Limit Approach Geometry in a Fréchet Context: A Projective Limit Approach by C.T.J. Dodson University of Manchester, Manchester, UK George Galanis Hellenic

More information

fy (X(g)) Y (f)x(g) gy (X(f)) Y (g)x(f)) = fx(y (g)) + gx(y (f)) fy (X(g)) gy (X(f))

fy (X(g)) Y (f)x(g) gy (X(f)) Y (g)x(f)) = fx(y (g)) + gx(y (f)) fy (X(g)) gy (X(f)) 1. Basic algebra of vector fields Let V be a finite dimensional vector space over R. Recall that V = {L : V R} is defined to be the set of all linear maps to R. V is isomorphic to V, but there is no canonical

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

LECTURE 9: MOVING FRAMES IN THE NONHOMOGENOUS CASE: FRAME BUNDLES. 1. Introduction

LECTURE 9: MOVING FRAMES IN THE NONHOMOGENOUS CASE: FRAME BUNDLES. 1. Introduction LECTURE 9: MOVING FRAMES IN THE NONHOMOGENOUS CASE: FRAME BUNDLES 1. Introduction Until now we have been considering homogenous spaces G/H where G is a Lie group and H is a closed subgroup. The natural

More information

Lecture 22: Variance and Covariance

Lecture 22: Variance and Covariance EE5110 : Probability Foundations for Electrical Engineers July-November 2015 Lecture 22: Variance and Covariance Lecturer: Dr. Krishna Jagannathan Scribes: R.Ravi Kiran In this lecture we will introduce

More information

Loos Symmetric Cones. Jimmie Lawson Louisiana State University. July, 2018

Loos Symmetric Cones. Jimmie Lawson Louisiana State University. July, 2018 Louisiana State University July, 2018 Dedication I would like to dedicate this talk to Joachim Hilgert, whose 60th birthday we celebrate at this conference and with whom I researched and wrote a big blue

More information