Information geometry for bivariate distribution control


1 Information geometry for bivariate distribution control

C.T.J. Dodson and Hong Wang, Mathematics and Control Systems Centre, University of Manchester Institute of Science and Technology

Optimal control of stochastic processes through sensor estimation of probability density functions has a geometric setting via information theory and the information metric. We consider gamma models with positive correlation, for which the information theoretic 3-manifold geometry has recently been formulated. For comparison we also summarize the case of bivariate Gaussian processes with arbitrary correlation.

2 Poisson Models

Model for random events: the probability of exactly $m$ events in time $t$ is
$$P_m = \frac{1}{m!}\left(\frac{t}{\tau}\right)^m e^{-t/\tau} \quad \text{for } m = 0, 1, 2, \ldots$$
The mean and variance of the number of events in such intervals is $t/\tau$. From $P_0 = e^{-t/\tau}$, the pdf for intervals $t$ between successive events is
$$f_{\mathrm{random}}(t;\tau) = \frac{1}{\tau}\, e^{-t/\tau} \quad \text{for } t \in [0,\infty)$$
with variance $\mathrm{Var}(t) = \tau^2$. Departures from randomness involve either clustering of events or evening out of event spacings.
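
As a quick numerical illustration (a sketch we add here, not from the original slides; all names are ours), one can simulate a Poisson process with mean interval $\tau$ and confirm both facts above: exponential inter-event spacings, and Poisson counts in windows of length $t$.

```python
# Hypothetical check: simulate a Poisson process with mean interval tau and
# verify exponential spacings and Poisson window counts with mean t/tau.
import numpy as np

rng = np.random.default_rng(0)
tau = 2.0
gaps = rng.exponential(tau, size=100_000)   # inter-event intervals
print(gaps.mean(), gaps.var())              # ~ tau and tau**2

t = 5.0                                     # window length
events = np.cumsum(gaps)
counts = np.histogram(events, bins=np.arange(0.0, events[-1], t))[0]
print(counts.mean(), counts.var())          # both ~ t/tau = 2.5
```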

3 Gamma pdfs

Parameters: $\tau, \nu \in \mathbb{R}^+$
$$f(t;\tau,\nu) = \left(\frac{\nu}{\tau}\right)^{\nu} \frac{t^{\nu-1}}{\Gamma(\nu)}\, e^{-t\nu/\tau}, \quad t \in \mathbb{R}^+$$
Mean $\bar{t} = \tau$ and variance $\mathrm{Var}(t) = \tau^2/\nu$.

$\nu = 1$: Poisson, with mean interval $\tau$. $\nu = 1, 2, \ldots \in \mathbb{Z}$: Poisson process with events removed to leave only every $\nu$th. $\nu < 1$: clustering of events.

[Figure: $f(t;1,\nu)$ against inter-event interval $t$, for $\nu = 0.5$ (clustered), $\nu = 1$ (random), $\nu = 2$ (smoothed) and $\nu = 5$ (smoothed).]
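
A small check of the stated moments, under the same parametrization (our sketch; the function name gamma_pdf is illustrative):

```python
# Evaluate f(t; tau, nu) and confirm numerically that the mean is tau and
# the variance is tau**2 / nu.
import numpy as np
from scipy.special import gammaln
from scipy.integrate import quad

def gamma_pdf(t, tau, nu):
    # f(t; tau, nu) = (nu/tau)^nu * t^(nu-1) / Gamma(nu) * exp(-t*nu/tau)
    return np.exp(nu * np.log(nu / tau) + (nu - 1) * np.log(t)
                  - gammaln(nu) - t * nu / tau)

tau, nu = 1.0, 0.5   # nu < 1: clustered; nu = 1: random; nu > 1: smoothed
mean = quad(lambda t: t * gamma_pdf(t, tau, nu), 0, np.inf)[0]
var = quad(lambda t: (t - mean)**2 * gamma_pdf(t, tau, nu), 0, np.inf)[0]
print(mean, var)     # ~ tau = 1.0 and tau**2/nu = 2.0
```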

4 Fisher Information Metric on a pdf family

Example of a univariate pdf family: $\{f(t;(\theta^i)) : (\theta^i) \in S\}$, with random variable $t \in \mathbb{R}^+$ and $n$ parameters. Log-likelihood function $l = \log f$. For all points $(\theta^i) \in S$, a subset of $\mathbb{R}^n$, the covariance of partial derivatives
$$g_{ij} = \int_{\mathbb{R}^+} f(t;(\theta^i))\, \frac{\partial l}{\partial\theta^i}\, \frac{\partial l}{\partial\theta^j}\, dt = -\int_{\mathbb{R}^+} f(t;(\theta^i))\, \frac{\partial^2 l}{\partial\theta^i\,\partial\theta^j}\, dt$$
is a positive definite $n \times n$ matrix and induces a Riemannian metric $g$ on the $n$-dimensional parameter space $S$.

General case: integrate over all random variables for multivariate pdfs.
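
The definition can be checked numerically. The sketch below (our construction) integrates the score covariance for the gamma family of the previous slide, in coordinates $(\tau,\nu)$, and compares with the classical closed form $g = \mathrm{diag}(\nu/\tau^2,\ \psi'(\nu) - 1/\nu)$.

```python
# Numerical Fisher metric for the gamma family in coordinates (tau, nu),
# recovered by integrating the score covariance against the density.
import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln, polygamma

def logf(t, tau, nu):
    return nu * np.log(nu / tau) + (nu - 1) * np.log(t) - gammaln(nu) - t * nu / tau

def score(t, theta, i, h=1e-5):
    # partial derivative of the log-likelihood wrt theta[i], central differences
    up, dn = list(theta), list(theta)
    up[i] += h; dn[i] -= h
    return (logf(t, *up) - logf(t, *dn)) / (2 * h)

theta = (2.0, 1.5)  # (tau, nu)
g = np.array([[quad(lambda t: np.exp(logf(t, *theta))
                    * score(t, theta, i) * score(t, theta, j),
                    0, np.inf)[0] for j in range(2)] for i in range(2)])
print(g)
print(np.diag([theta[1] / theta[0]**2,
               polygamma(1, theta[1]) - 1 / theta[1]]))  # closed form
```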

5 McKay bivariate gamma pdf

The McKay bivariate gamma distribution, defined on $0 < x < y < \infty$ with parameters $\alpha_1, \sigma_{12}, \alpha_2 > 0$, is given by:
$$f(x,y;\alpha_1,\sigma_{12},\alpha_2) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1+\alpha_2}{2}} x^{\alpha_1-1}\,(y-x)^{\alpha_2-1}\, e^{-\sqrt{\frac{\alpha_1}{\sigma_{12}}}\, y}}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}, \qquad (1)$$
where $\sigma_{12}$ is the covariance of $X, Y$. The correlation coefficient and the marginal gamma distributions of $X$ and $Y$ are given respectively, for positive $x, y$, by:
$$\rho(X,Y) = \sqrt{\frac{\alpha_1}{\alpha_1+\alpha_2}}$$
$$f_X(x) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1}{2}} x^{\alpha_1-1}\, e^{-\sqrt{\frac{\alpha_1}{\sigma_{12}}}\, x}}{\Gamma(\alpha_1)}, \qquad f_Y(y) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1+\alpha_2}{2}} y^{(\alpha_1+\alpha_2)-1}\, e^{-\sqrt{\frac{\alpha_1}{\sigma_{12}}}\, y}}{\Gamma(\alpha_1+\alpha_2)}$$
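
A hedged numerical check of (1) (our sketch; mckay is an illustrative name): the density should integrate to 1 over $0 < x < y$, $\mathrm{Cov}(X,Y)$ should equal $\sigma_{12}$, and $E[X]$ should equal $\sqrt{\alpha_1\sigma_{12}}$.

```python
# Evaluate the McKay density (1) and verify normalization and covariance.
import numpy as np
from scipy.special import gammaln
from scipy.integrate import dblquad

def mckay(x, y, a1, s12, a2):
    c = np.sqrt(a1 / s12)   # (a1/s12)^((a1+a2)/2) = c^(a1+a2)
    return np.exp((a1 + a2) * np.log(c) + (a1 - 1) * np.log(x)
                  + (a2 - 1) * np.log(y - x) - c * y
                  - gammaln(a1) - gammaln(a2))

a1, s12, a2 = 2.0, 1.0, 3.0
# dblquad(func(y, x), ...): x over (0, inf), y over (x, inf)
E = lambda w: dblquad(lambda y, x: w(x, y) * mckay(x, y, a1, s12, a2),
                      0, np.inf, lambda x: x, lambda x: np.inf)[0]
total, Ex, Ey, Exy = E(lambda x, y: 1), E(lambda x, y: x), \
                     E(lambda x, y: y), E(lambda x, y: x * y)
print(total)                    # ~ 1
print(Exy - Ex * Ey)            # ~ s12, the covariance of X, Y
print(Ex, np.sqrt(a1 * s12))    # E[X] ~ sqrt(a1 * s12)
```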

6 Applicability of McKay distributions

Explicitly, $f_X(x)$ is a gamma density with mean $\sqrt{\alpha_1\sigma_{12}}$ and dispersion parameter $\alpha_1$. Similarly, $f_Y(y)$ is a gamma density with mean $(\alpha_1+\alpha_2)\sqrt{\sigma_{12}/\alpha_1}$ and dispersion parameter $(\alpha_1+\alpha_2)$. So, for the McKay distribution to be applicable to a given joint distribution for $(x,y)$, we need to have:

$0 < x < y < \infty$
Covariance $> 0$
(Dispersion parameter for $y$) $>$ (Dispersion parameter for $x$)

And we expect, roughly, $(\text{Mean } y)/(\text{Mean } x) = \frac{\alpha_1+\alpha_2}{\alpha_1}$.
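
These moment relations can be inverted to propose McKay parameters from data when the feasibility conditions hold. The helper below is our illustration, not part of the original slides.

```python
# Hedged helper: solve the moment relations above for (alpha_1, sigma_12,
# alpha_2) given sample means and covariance, when they are feasible.
import numpy as np

def mckay_from_moments(mean_x, mean_y, cov_xy):
    # sigma_12 = Cov(X, Y); mean_x = sqrt(alpha_1 * sigma_12);
    # mean_y / mean_x = (alpha_1 + alpha_2) / alpha_1
    if not (0 < mean_x < mean_y and cov_xy > 0):
        raise ValueError("need 0 < mean_x < mean_y and positive covariance")
    s12 = cov_xy
    a1 = mean_x**2 / s12
    a2 = a1 * (mean_y / mean_x - 1.0)   # > 0 exactly when mean_y > mean_x
    return a1, s12, a2

# recovers (2.0, 1.0, 3.0) from the moments of that McKay distribution
print(mckay_from_moments(np.sqrt(2.0), 5.0 / np.sqrt(2.0), 1.0))
```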

7 McKay 3-manifold

Denote by $M$ the set of McKay bivariate gamma distributions, that is
$$M = \left\{ f : f(x,y;\alpha_1,\sigma_{12},\alpha_2) = \frac{\left(\frac{\alpha_1}{\sigma_{12}}\right)^{\frac{\alpha_1+\alpha_2}{2}} x^{\alpha_1-1}(y-x)^{\alpha_2-1}\, e^{-\sqrt{\frac{\alpha_1}{\sigma_{12}}}\, y}}{\Gamma(\alpha_1)\Gamma(\alpha_2)},\ \ y > x > 0,\ \alpha_1, \sigma_{12}, \alpha_2 > 0 \right\} \qquad (2)$$
Then we have from Arwini and Dodson: global coordinates $(\alpha_1, \sigma_{12}, \alpha_2)$ make $M$ a 3-manifold with Fisher information metric $[g_{ij}]$ given by:
$$[g_{ij}] = \begin{pmatrix} \psi'(\alpha_1) + \dfrac{\alpha_2 - 3\alpha_1}{4\alpha_1^2} & \dfrac{\alpha_1 - \alpha_2}{4\alpha_1\sigma_{12}} & -\dfrac{1}{2\alpha_1} \\[8pt] \dfrac{\alpha_1 - \alpha_2}{4\alpha_1\sigma_{12}} & \dfrac{\alpha_1 + \alpha_2}{4\sigma_{12}^2} & \dfrac{1}{2\sigma_{12}} \\[8pt] -\dfrac{1}{2\alpha_1} & \dfrac{1}{2\sigma_{12}} & \psi'(\alpha_2) \end{pmatrix} \qquad (3)$$
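
Assuming the reconstruction of (3) above, the metric is a few lines of code (our sketch); $\psi'$ is the trigamma function, polygamma(1, ·) in SciPy.

```python
# The McKay Fisher metric (3) in coordinates (alpha_1, sigma_12, alpha_2).
import numpy as np
from scipy.special import polygamma

def mckay_metric(a1, s12, a2):
    tri = lambda z: polygamma(1, z)   # trigamma psi'(z)
    return np.array([
        [tri(a1) + (a2 - 3*a1) / (4*a1**2), (a1 - a2) / (4*a1*s12), -1 / (2*a1)],
        [(a1 - a2) / (4*a1*s12),            (a1 + a2) / (4*s12**2),  1 / (2*s12)],
        [-1 / (2*a1),                        1 / (2*s12),            tri(a2)],
    ])

g = mckay_metric(2.0, 1.0, 3.0)
print(np.linalg.eigvalsh(g))   # all positive: g is a Riemannian metric
```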

8 Information arclength through McKay distributions

For small changes $d\alpha_1, d\sigma_{12}, d\alpha_2$ the element of arclength $ds$ is given by
$$ds^2 = \sum_{i,j=1}^{3} g_{ij}\, dx^i\, dx^j \qquad (4)$$
with $(x^1, x^2, x^3) = (\alpha_1, \sigma_{12}, \alpha_2)$. For larger separations between two bivariate gamma distributions, the arclength along a curve
$$c : [a,b] \to M : t \mapsto (c_1(t), c_2(t), c_3(t)) \qquad (5)$$
is obtained by integration of $ds$; geodesic curves can give minimizing trajectories. The tangent vector $\dot{c}(t) = (\dot{c}_1(t), \dot{c}_2(t), \dot{c}_3(t))$ has norm $\|\dot{c}\|$ given via (3) by
$$\|\dot{c}(t)\|^2 = \sum_{i,j=1}^{3} g_{ij}\, \dot{c}_i(t)\, \dot{c}_j(t). \qquad (6)$$
The information length of the curve is
$$L_c(a,b) = \int_a^b \|\dot{c}(t)\|\, dt \quad \text{for } a \leq b. \qquad (7)$$
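
A minimal sketch (ours) of (4)–(7): approximate the information length of a straight-line parameter path between an initial and a target McKay distribution, repeating the metric of (3). A straight coordinate path is generally not a geodesic, so this length is an upper bound on the geodesic distance.

```python
# Information length of a straight-line parameter path in the McKay manifold.
import numpy as np
from scipy.special import polygamma

def mckay_metric(a1, s12, a2):
    tri = lambda z: polygamma(1, z)
    return np.array([
        [tri(a1) + (a2 - 3*a1) / (4*a1**2), (a1 - a2) / (4*a1*s12), -1 / (2*a1)],
        [(a1 - a2) / (4*a1*s12),            (a1 + a2) / (4*s12**2),  1 / (2*s12)],
        [-1 / (2*a1),                        1 / (2*s12),            tri(a2)],
    ])

start = np.array([2.0, 1.0, 3.0])   # (alpha_1(0), sigma_12(0), alpha_2(0))
end = np.array([3.0, 1.5, 2.0])     # (alpha_1(f), sigma_12(f), alpha_2(f))
ts = np.linspace(0.0, 1.0, 1001)
cdot = end - start                  # constant tangent vector of c(t)
speeds = np.array([np.sqrt(cdot @ mckay_metric(*(start + t * cdot)) @ cdot)
                   for t in ts])
L = float(np.sum((speeds[1:] + speeds[:-1]) * np.diff(ts)) / 2)  # trapezoid rule
print(L)                            # L_c(0, 1), upper bound on geodesic distance
```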

9 Bivariate Gaussian

The probability density of the 2-dimensional normal distribution has the form:
$$f(x) = \frac{1}{2\pi\,|\Sigma|^{\frac{1}{2}}}\, e^{-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)}, \qquad (8)$$
where
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix},$$
$$-\infty < x_1, x_2 < \infty, \quad -\infty < \mu_1, \mu_2 < \infty, \quad 0 < \sigma_{11}, \sigma_{22} < \infty.$$
This contains the five parameters $\mu_1, \mu_2, \sigma_{11}, \sigma_{12}, \sigma_{22}$. So our global coordinate system consists of the 5-tuples $\theta = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5) = (\mu_1, \mu_2, \sigma_{11}, \sigma_{12}, \sigma_{22})$.

10 Bivariate Gaussian 5-manifold

$\theta_1 = \mu_1$, $\theta_2 = \mu_2$, $\theta_3 = \sigma_{11}$, $\theta_4 = \sigma_{12}$, $\theta_5 = \sigma_{22}$. The information metric tensor is known to be given by:
$$[g_{ij}] = \frac{1}{\Delta}\begin{pmatrix} \theta_5 & -\theta_4 & 0 & 0 & 0 \\ -\theta_4 & \theta_3 & 0 & 0 & 0 \\ 0 & 0 & \dfrac{(\theta_5)^2}{2\Delta} & -\dfrac{\theta_4\theta_5}{\Delta} & \dfrac{(\theta_4)^2}{2\Delta} \\ 0 & 0 & -\dfrac{\theta_4\theta_5}{\Delta} & \dfrac{\theta_3\theta_5 + (\theta_4)^2}{\Delta} & -\dfrac{\theta_3\theta_4}{\Delta} \\ 0 & 0 & \dfrac{(\theta_4)^2}{2\Delta} & -\dfrac{\theta_3\theta_4}{\Delta} & \dfrac{(\theta_3)^2}{2\Delta} \end{pmatrix} \qquad (9)$$
where $\Delta$ is the determinant $\Delta = |\Sigma| = \theta_3\theta_5 - (\theta_4)^2$.
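
A sketch implementation of (9) (our code, assuming the block form above); the Fisher metric should be positive definite whenever $\Delta > 0$.

```python
# Bivariate Gaussian information metric (9) in coordinates
# (mu_1, mu_2, sigma_11, sigma_12, sigma_22); the means do not enter.
import numpy as np

def gaussian5_metric(t3, t4, t5):
    # t3 = sigma_11, t4 = sigma_12, t5 = sigma_22
    D = t3 * t5 - t4**2                  # Delta = det(Sigma), must be > 0
    g = np.zeros((5, 5))
    g[:2, :2] = np.array([[t5, -t4], [-t4, t3]]) / D     # = Sigma^{-1}
    g[2:, 2:] = np.array([[t5**2 / 2, -t4 * t5,        t4**2 / 2],
                          [-t4 * t5,   t3 * t5 + t4**2, -t3 * t4],
                          [t4**2 / 2, -t3 * t4,        t3**2 / 2]]) / D**2
    return g

g = gaussian5_metric(2.0, 0.5, 1.0)
print(np.linalg.eigvalsh(g))   # all positive for a valid covariance matrix
```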

11 B-spline approximations to McKay

Since every pdf has unit measure,
$$\lim_{x,y \to +\infty} f(x,y;\alpha_1,\sigma_{12},\alpha_2) = 0. \qquad (10)$$
Hence, for all $\epsilon > 0$ there is a $b(\epsilon,\alpha_1,\sigma_{12},\alpha_2) > 0$ such that
$$x, y > b(\epsilon,\alpha_1,\sigma_{12},\alpha_2) \implies f(x,y;\alpha_1,\sigma_{12},\alpha_2) \leq \epsilon. \qquad (11)$$
So we can use B-spline functions to approximate the McKay pdf such that
$$\left| f(x,y;\alpha_1,\sigma_{12},\alpha_2) - \sum_{i=1}^{n} w_i B_i(x,y) \right| \leq \delta \qquad (12)$$
where the $B_i(x,y)$ are pre-specified bivariate basis functions defined on $\Omega = [0,b]\times[0,b]$, the $w_i$ are weights to be trained adaptively, and $\delta$ is a small number generally larger than $\epsilon$.
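
To make (12) concrete, the sketch below (our construction, with piecewise-linear hat functions standing in for the pre-specified B-splines $B_i$) fits the weights $w_i$ by least squares on a grid over $\Omega$ and reports the achieved error $\delta$.

```python
# Fit weights w_i so a tensor-product basis approximates the McKay pdf on [0,b]^2.
import numpy as np
from scipy.special import gammaln

def mckay(x, y, a1, s12, a2):
    c = np.sqrt(a1 / s12)
    out = np.zeros_like(x)
    m = y > x                      # density is zero outside 0 < x < y
    out[m] = np.exp((a1 + a2) * np.log(c) + (a1 - 1) * np.log(x[m])
                    + (a2 - 1) * np.log(y[m] - x[m]) - c * y[m]
                    - gammaln(a1) - gammaln(a2))
    return out

def hat(u, centers, width):
    # piecewise-linear "hat" functions, a stand-in for B-spline basis B_i
    return np.maximum(1 - np.abs(u[..., None] - centers) / width, 0.0)

b, n1d = 8.0, 12
centers = np.linspace(0, b, n1d)
xs, ys = np.meshgrid(np.linspace(0.01, b, 60), np.linspace(0.01, b, 60))
f = mckay(xs, ys, 2.0, 1.0, 3.0)
# design matrix: tensor products B_i(x) B_j(y), flattened to n = n1d**2 columns
A = (hat(xs.ravel(), centers, b / (n1d - 1))[:, :, None]
     * hat(ys.ravel(), centers, b / (n1d - 1))[:, None, :]).reshape(-1, n1d**2)
w = np.linalg.lstsq(A, f.ravel(), rcond=None)[0]
print(np.max(np.abs(A @ w - f.ravel())))   # achieved approximation error delta
```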

12 Square root approximation

This is numerically more robust than linear B-splines. It is used here to represent the coupled links between the three parameters and the McKay pdf $f$. Since the basis functions are pre-specified, different values of $\{\alpha_1, \sigma_{12}, \alpha_2\}$ will generate different sets of weights. Therefore, our approximation should be represented as
$$f(x,y;\alpha_1,\sigma_{12},\alpha_2) = \sum_{i=1}^{n} w_i B_i(x,y) + e \qquad (13)$$
where $|e| \leq \delta$.

13 Renormalization

Wang has used the following transformed representation
$$f(x,y;\alpha_1,\sigma_{12},\alpha_2) = C(x,y)V_k + h(V_k)B_n(x,y) \qquad (14)$$
to guarantee that
$$\int_0^b \int_0^b f(x,y;\alpha_1,\sigma_{12},\alpha_2)\, dx\, dy = 1,$$
where $V_k = (w_1, w_2, \ldots, w_{n-1})^T$ constitutes a vector of independent weights, and $h(\cdot)$ is a known nonlinear function of $V_k$. With this format, the relationship between $V_k$ and $\{\alpha_1, \sigma_{12}, \alpha_2\}$ can be formulated.
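
With the linear form shown in (14), the unit-measure constraint can be solved for $h$ in closed form; in Wang's square-root representation, $h$ becomes genuinely nonlinear. The sketch below (ours) illustrates only the linear case, with placeholder basis integrals.

```python
# Minimal sketch of the renormalization idea in (14): choose h(V_k) so the
# approximation integrates to 1. Basis integrals here are placeholders.
import numpy as np

n = 9
rng = np.random.default_rng(1)
I = rng.uniform(0.5, 1.5, size=n)        # I_i = integral of B_i over [0,b]^2
V = rng.uniform(0.0, 0.1, size=n - 1)    # independent weights w_1..w_{n-1}

def h(V):
    # solves  I[:n-1] @ V + h(V) * I[n-1] = 1   (affine in the linear case)
    return (1.0 - I[:-1] @ V) / I[-1]

w_full = np.append(V, h(V))
print(I @ w_full)                        # = 1 up to rounding
```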

14 Tuning procedure

The initial bivariate gamma distribution has parameters $\{\alpha_1(0), \sigma_{12}(0), \alpha_2(0)\}$ and the desired distribution has $\{\alpha_1(f), \sigma_{12}(f), \alpha_2(f)\}$. Let $g(x,y)$ represent the probability density of the bivariate gamma distribution with parameter set $\{\alpha_1(f), \sigma_{12}(f), \alpha_2(f)\}$. An effective trajectory for $V_k$ should be chosen to minimise the following performance function
$$J = \frac{1}{2}\int_0^b \int_0^b K(x,y)\log\frac{K(x,y)}{g(x,y)}\, dx\, dy \qquad (15)$$
with $K(x,y) = C(x,y)V_k + h(V_k)B_n(x,y)$.

15 Gradient Rule

For iteration,
$$V_k = V_{k-1} - \eta \left.\frac{\partial J}{\partial V}\right|_{V = V_{k-1}} \qquad (16)$$
where $k = 0, 1, 2, \ldots$ represents the sample number and $\eta > 0$ is a pre-specified learning rate. Using the relationship between the weight vector and the three parameters, the adaptive tuning of the actual parameters in the pdf of (14) can be readily formulated. The information arclength provides distance estimates between the current and target distributions.
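
A toy version of (16) (our sketch; the basis, target density and learning rate are all stand-ins): descend on a discretized performance function with finite-difference gradients.

```python
# Gradient rule (16) on a discretized version of the performance function (15).
import numpy as np

rng = np.random.default_rng(2)
N, n = 400, 6
A = np.abs(rng.normal(size=(N, n)))      # stand-in basis values B_i on a grid
A /= A.sum(axis=0, keepdims=True)        # each column sums to 1 on the grid
g = np.abs(rng.normal(size=N)) + 0.1
g /= g.sum()                             # target density g(x, y) on the grid

def K(V):
    w = np.append(V, 1.0 - V.sum())      # renormalization: weights sum to 1
    return np.maximum(A @ w, 1e-12)      # clamp to keep the log well-defined

def J(V):                                # discrete analogue of (15)
    return 0.5 * np.sum(K(V) * np.log(K(V) / g))

V, eta, h = np.full(n - 1, 1.0 / n), 0.5, 1e-6
for k in range(2000):
    grad = np.array([(J(V + h * e) - J(V - h * e)) / (2 * h)
                     for e in np.eye(n - 1)])
    V = V - eta * grad                   # V_k = V_{k-1} - eta * dJ/dV
print(J(V))                              # decreases toward the best fit in the span
```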

16 Kullback-Leibler Divergence

For two probability density functions $p$ and $p'$ on an event space $\Omega$, the function
$$KL(p, p') = \int_\Omega p(x)\log\frac{p(x)}{p'(x)}\, dx \qquad (17)$$
is called the Kullback-Leibler divergence or relative entropy. We could, instead of (15), consider (17) as the performance function; explicitly this would be:
$$W = \frac{1}{2}\int_0^b \int_0^b g(x,y)\log\frac{g(x,y)}{K(x,y)}\, dx\, dy \qquad (18)$$
with $K(x,y) = C(x,y)V_k + h(V_k)B_n(x,y)$.
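
A discrete version of (17) makes the asymmetry between the two candidate performance functions, (15) and (18), easy to see (our sketch):

```python
# Discrete Kullback-Leibler divergence between two grid densities.
import numpy as np

def kl(p, q):
    # KL(p, q) = sum_i p_i log(p_i / q_i) for cell masses p_i, q_i
    return np.sum(p * np.log(p / q))

rng = np.random.default_rng(3)
p = rng.random(100); p /= p.sum()
q = rng.random(100); q /= q.sum()
print(kl(p, q), kl(q, p))   # nonnegative, zero iff p == q, not symmetric
```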
