CODE LENGTHS FOR MODEL CLASSES WITH CONTINUOUS UNIFORM DISTRIBUTIONS. Panu Luosto
University of Helsinki, Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Finland

ABSTRACT

Continuous uniform distributions are an important means of modelling, for example, noise and unknown portions of heterogeneous data. Although they are simplistic models, deriving the corresponding code lengths is sometimes non-trivial. One obvious problem is that the set in which a uniform density takes positive values is often not known in advance in practical applications. This paper treats uniform distributions in origin-centred balls, arbitrary balls and axis-aligned boxes. We derive normalized maximum likelihood (NML) densities for the cases where the maximum likelihood parameters of the data are bounded. From the NML densities we derive code length functions that depend on the prior densities of the parameters. We generalize Rissanen's prior for positive reals for this purpose. We also suggest methods for dealing with the problems that arise from the singularities in the final code length functions.

1. INTRODUCTION

Despite their simplicity, continuous uniform distributions are important models in learning based on the minimum description length (MDL) principle [1, 2]. When the domain of the data is known, the uniform distribution gives the shortest worst-case code. In this paper, we assume that the domain is unknown, which makes the situation non-trivial. Our objective is to derive code length functions that are suitable for noise and entirely unknown data, or that can be used as baseline code length functions for determining the efficiency of more sophisticated models. At the same time, we avoid unnecessary assumptions about the domain. We have used our code length for the model class with uniform distributions in axis-aligned boxes in [3], where the objective is to find the best clustering with an unknown number of normally distributed clusters and one uniform cluster.
If the bounded set in which the uniform distribution takes positive values is not known in advance, even choosing its geometrical form can be a difficult design decision. We consider origin-centred balls in any dimensionality and arbitrary balls in one and two dimensions. Uniform distributions in axis-aligned boxes are simply product densities of uniform distributions in arbitrary one-dimensional balls. If the ranges of the parameters are bounded, calculating the normalized maximum likelihood (NML) [4] is straightforward for all the models mentioned above. If the parameters are unbounded, the NML density is not defined. We derive our code lengths with unbounded parameters in all the cases according to a similar idea.

To outline the method, we take as an example the simplest model, the uniform distribution in an origin-centred ball. The distribution has one parameter, the radius of the ball. Let $x^n \in (\mathbb{R}^d)^n$ be a data sequence. The maximum likelihood parameter $R(x^n)$ is equal to the distance of the farthest point in the sequence from the origin. If we restrict the data so that $R(x^n) \in [r_1, r_2]$, we can derive a normalized maximum likelihood density $f_{\mathrm{NML}}(x^n; r_1, r_2)$. But it can be difficult to give $r_1$ and $r_2$ reasonable values before seeing the data. If we let $r_1$ and $r_2$ approach $R(x^n)$, the density grows without bound, which means that renormalization is not possible. Also, if we fix $r_1$ and let $r_2 \to \infty$, or fix $r_2$ and let $r_1 \to 0$, the density approaches zero. Instead, we give the parameter $r_1$ a continuous prior density $p_r$ and let $t > 1$. We then get a mixture density by integrating $f_{\mathrm{NML}}(x^n; r, tr)$ over those values of $r$ for which $R(x^n) \in [r, tr]$. That gives the density

$$f(x^n; p_r, t) = \int_{R(x^n)/t}^{R(x^n)} f_{\mathrm{NML}}(x^n; r, tr)\, p_r(r)\, dr.$$

To dispose of the parameter $t$, we consider the limit $\lim_{t \to 1^+} f(x^n; p_r, t)$. The limiting function is a density function and a universal model []. We still have the problem of choosing a suitable $p_r$.
NML encoding minimizes the worst-case excess code length compared to the maximum likelihood code length (the latter being the optimal coding method, but only with hindsight). Similarly, we should choose a flat prior that diminishes asymptotically as slowly as possible in order to minimize the excess code length for all data. Section 2 introduces a generalization of Rissanen's prior for positive reals as a candidate for $p_r$.

With continuous distributions, it is common practice to use the term code length as a synonym for the negative logarithm of the density, which corresponds to encoding of real numbers with infinite precision. We consider only densities in this paper; taking the logarithm is left to the reader. In practical situations, data values have a finite precision, and minimizing the negative logarithm of the density is not quite the same as finding the most effective way to encode the data. This does not usually cause problems, but if the density can grow without bound in the neighbourhood of some point, the results may be surprising, especially when the data is represented with greater precision than we find trustworthy. It might therefore be reasonable to fine-tune the model so that the density is bounded.

In the following sections, there are examples of densities having singularities at which they are not defined and in the neighbourhoods of which they take arbitrarily large values. These densities are problematic to use with some data sets. We might, for example, have a two-point cluster whose density grows without bound when the distance between the points approaches zero. The small cluster could thus totally dominate the code length of a potentially complex clustering, which might lead to very unintuitive results. We therefore give some ways to bound the densities and to extend them to the singularities.

We use the notation $\log$ for the logarithm to base two and $\ln$ for the natural logarithm. The elements of a data sequence are assumed to be independent and identically distributed in all the models. Before deriving the densities corresponding to the code length functions, we introduce a very flat density function that we use as a prior for parameters.

2. A PRIOR DENSITY FOR THE REAL NUMBERS

The main criterion for priors of the parameters in our case is that they should be as uninformative as possible. In [5], Rissanen gives a density function for the reals in the interval $[1, \infty[$.
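We generalize it without changing its asymptotic properties by adding a parameter that defines how strongly the probability mass is concentrated in the vicinity of the origin. We write $x^y$ as $x \uparrow y$ for typographical reasons. Let $x \uparrow\uparrow 0 = 1$ and let

$$x \uparrow\uparrow y = \underbrace{x \uparrow x \uparrow \cdots \uparrow x}_{y \text{ copies of } x}$$

for $x > 0$, $y \in \mathbb{N}$. Now let

$$b = \underbrace{2 \uparrow 2 \uparrow \cdots \uparrow 2}_{k-1 \text{ copies of 2s}} \uparrow \delta,$$

where $k \in \mathbb{N}$ and $\delta \in [1, 2]$. For $x \in \mathbb{R}_+$, we define the density

$$f_{\mathbb{R}_+}(x; b) = \frac{1 - \ln 2}{(\ln 2)^k \left(1 - (1 - \ln 2)\log \delta\right)} \cdot \frac{1}{(x + b)\, h(x + b)}, \tag{1}$$

where

$$h(x) = \begin{cases} 1 & \text{if } \log x \le 1, \\ \log x \cdot h(\log x) & \text{otherwise.} \end{cases}$$

We verify next that $f_{\mathbb{R}_+}(\cdot\,; b)$ integrates to unity over the positive real line. Let $\log^{(k)} x = \log \log \cdots \log x$ ($k$ copies of $\log$). Notice that

$$D\, (\ln 2)^k \log^{(k)} x = \frac{1}{x\, h(x)}$$

if $0 \le \log^{(k)} x \le 1$, or equivalently $2 \uparrow\uparrow (k-1) \le x \le 2 \uparrow\uparrow k$. Thus

$$\int_{2 \uparrow\uparrow (k-1)}^{2 \uparrow\uparrow k} \frac{dx}{x\, h(x)} = (\ln 2)^k$$

and

$$\int_1^\infty \frac{dx}{x\, h(x)} = \sum_{i=1}^{\infty} (\ln 2)^i = \frac{\ln 2}{1 - \ln 2}.$$

Assuming that $b$ is defined as above,

$$\int_b^\infty \frac{dx}{x\, h(x)} = \int_1^\infty \frac{dx}{x\, h(x)} - \sum_{i=1}^{k-1} \int_{2 \uparrow\uparrow (i-1)}^{2 \uparrow\uparrow i} \frac{dx}{x\, h(x)} - \int_{2 \uparrow\uparrow (k-1)}^{b} \frac{dx}{x\, h(x)}$$
$$= \frac{\ln 2}{1 - \ln 2} - \frac{\ln 2 - (\ln 2)^k}{1 - \ln 2} - (\ln 2)^k \log \delta = (\ln 2)^k \left( \frac{1}{1 - \ln 2} - \log \delta \right).$$

The proof is easily completed after the variable change $x \mapsto y + b$. The function $f_{\mathbb{R}_+}(\cdot\,; b)$ diminishes asymptotically only slightly faster than $1/(x \log x)$. When a prior for all the real numbers is needed, we simply use $\tfrac{1}{2} f_{\mathbb{R}_+}(|x|; b)$.

3. A CODE LENGTH ACCORDING TO UNIFORM DISTRIBUTIONS IN ORIGIN-CENTRED BALLS

In this section, we consider a model class consisting of uniform distributions in a ball centred at the origin. Let

$$V_d(r) = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}\, r^d$$

denote the volume of a $d$-dimensional ball with radius $r$, and let $A_d(r) = V_d(r)\, d / r$ denote the surface area of that ball. For the distance of the farthest point in the sequence $x^n = (x_1, x_2, \ldots, x_n) \in (\mathbb{R}^d)^n$ from the origin, we use the notation $R(x^n) = \max \{ \|x_i\| : i \in \{1, 2, \ldots, n\} \}$.

We consider first a special case where the radius of the smallest enclosing origin-centred ball belongs to a certain interval. Let $r_1, r_2 > 0$ and assume that $r_1 < r_2$. Let $A = \{ x^n \in (\mathbb{R}^d)^n : R(x^n) \in [r_1, r_2] \}$. The maximum likelihood for $x^n \in A$ is $f_{\mathrm{ML}}(x^n; r_1, r_2) = V_d(R(x^n))^{-n}$.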
The normalization integral for the NML density is

$$\int_{x^n \in A} f_{\mathrm{ML}}(x^n; r_1, r_2)\, dx^n = n \int_{x_1 \in B(0, r_2) \setminus B(0, r_1)} \int_{x_2, \ldots, x_n \in B(0, \|x_1\|)} V_d(\|x_1\|)^{-n}\, dx_n \cdots dx_2\, dx_1$$
$$= n \int_{x_1 \in B(0, r_2) \setminus B(0, r_1)} \frac{dx_1}{V_d(\|x_1\|)} = n \int_{r_1}^{r_2} \int_{y \in \partial B(0, r)} \frac{1}{V_d(r)}\, dy\, dr = n \int_{r_1}^{r_2} \frac{A_d(r)}{V_d(r)}\, dr = n \int_{r_1}^{r_2} \frac{d}{r}\, dr = nd \ln \frac{r_2}{r_1},$$

which yields the NML density function

$$f_{\mathrm{NML}}(x^n; r_1, r_2) = \frac{1}{V_d(R(x^n))^n\; nd \ln(r_2 / r_1)} \tag{2}$$

if $x^n \in A$. The parameters $r_1$ and $r_2$ are something we would like to get rid of, because we can seldom give them reasonable values before looking at the data. Setting the parameters to their maximum likelihood values $r_1 = r_2 = R(x^n)$ results in an infinite density, which implies that a renormalization scheme is not possible. Instead, we give $r_1$ a continuous prior density $p_r$ and make $r_2$ a function of $r_1$, $r_2 = r_2(r_1) = t r_1$, where $t > 1$. Integrating the coefficient $1 / \ln(r_2 / r_1) = 1 / \ln t$ over the values of $r_1$ such that $R(x^n) \in [r_1, r_2] = [r_1, t r_1]$ yields

$$\int_{R(x^n)/t}^{R(x^n)} \frac{1}{\ln t}\, p_r(r)\, dr.$$

When $t$ approaches 1 from above, let the limiting function be

$$u(x^n, p_r) = \lim_{t \to 1^+} \int_{R(x^n)/t}^{R(x^n)} \frac{1}{\ln t}\, p_r(r)\, dr = \lim_{t \to 1^+} \frac{R(x^n) - R(x^n)/t}{\ln t}\; p_r(R(x^n)) = R(x^n)\, p_r(R(x^n)).$$

Replacing the coefficient $1 / \ln(r_2 / r_1)$ with $u(x^n, p_r)$ in (2) yields the function

$$f(x^n; p_r) = \frac{R(x^n)\, p_r(R(x^n))}{V_d(R(x^n))^n\; nd} \tag{3}$$

if $x^n \in (\mathbb{R}^d)^n$ and $R(x^n) > 0$. It is easy to check that (3) is a valid density by integrating over $\{ x^n \in (\mathbb{R}^d)^n : R(x^n) > 0 \}$.

We compare the result briefly with a more straightforward solution. Assume that $R(x^n) \ge a > 0$ and let $p(r) = \epsilon a^{\epsilon} r^{-1-\epsilon}$ for $r \ge a$, where $\epsilon > 0$. Consider the mixture density

$$f'(x^n) = \int_{R(x^n)}^{\infty} V_d(r)^{-n}\, p(r)\, dr = V_d(R(x^n))^{-n}\, R(x^n)^{-\epsilon}\, \frac{\epsilon a^{\epsilon}}{nd + \epsilon}.$$

Let $p_r = f_{\mathbb{R}_+}(\cdot\,; b)$ as defined in (1). Then the ratio

$$\frac{f'(x^n)}{f(x^n; p_r)} \propto \frac{1}{R(x^n)^{1+\epsilon}\, p_r(R(x^n))}$$

approaches zero when $R(x^n) \to \infty$.

Depending on the choice of $p_r$, the density $f(x^n; p_r)$ can grow without bound when $R(x^n)$ approaches zero. A simple way to get a bounded density is to make $p_r$ a function of $n$ and $d$, and to let $p_r(R)$ grow in proportion to $R^{nd-1}$ in an interval $[0, \epsilon]$, which keeps $f(x^n; p_r)$ constant when $R(x^n) \in [0, \epsilon]$.
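As an illustration only, the code length given by density (3) can be sketched in Python as below. The function names (`h`, `tower`, `prior`, `code_length_origin_ball`) are hypothetical and not from the paper. The sketch assumes that the prior of Section 2 is used with $b = 2 \uparrow \cdots \uparrow 2 \uparrow \delta$ ($k - 1$ copies of 2, $\delta \in [1, 2]$) and with the normalizing constant $(1 - \ln 2) / ((\ln 2)^k (1 - (1 - \ln 2) \log \delta))$; the unbounded variant of the density is used, so the data must contain at least one point away from the origin.

```python
import math

LN2 = math.log(2.0)

def h(x):
    """h(x) = 1 if log2(x) <= 1, otherwise log2(x) * h(log2(x)); needs x >= 1."""
    prod = 1.0
    lx = math.log2(x)
    while lx > 1.0:
        prod *= lx
        lx = math.log2(lx)
    return prod

def tower(k, delta):
    """b = 2^2^...^2^delta with k-1 copies of 2 (assumed parametrization)."""
    b = delta
    for _ in range(k - 1):
        b = 2.0 ** b
    return b

def prior(x, k=1, delta=1.0):
    """Sketch of the prior density f_{R+}(x; b) of equation (1), for x >= 0."""
    const = (1.0 - LN2) / (LN2 ** k * (1.0 - (1.0 - LN2) * math.log2(delta)))
    b = tower(k, delta)
    return const / ((x + b) * h(x + b))

def log_ball_volume(d, r):
    """Natural logarithm of the volume of a d-dimensional ball of radius r."""
    return (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0) + d * math.log(r)

def code_length_origin_ball(points, k=1, delta=1.0):
    """Negative base-2 logarithm of density (3), i.e. a code length in bits."""
    n = len(points)
    d = len(points[0])
    # Maximum likelihood radius: distance of the farthest point from the origin.
    R = max(math.sqrt(sum(c * c for c in p)) for p in points)
    if R <= 0.0:
        raise ValueError("density (3) is undefined when R(x^n) = 0")
    ln_f = (math.log(R) + math.log(prior(R, k, delta))
            - n * log_ball_volume(d, R) - math.log(n * d))
    return -ln_f / LN2
```

For example, `code_length_origin_ball([(3.0, 4.0), (1.0, 0.0)])` evaluates the code length of two planar points whose farthest point lies at distance 5 from the origin. Working with logarithms throughout avoids overflow of $V_d(R)^n$ for long sequences.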
As a concrete example, let $b = 2 \uparrow\uparrow (k - 1)$, where $k \in \mathbb{N}$ and we have used the notation explained in Section 2. Let also

$$\epsilon = \underbrace{2 \uparrow 2 \uparrow \cdots \uparrow 2}_{k-1 \text{ copies of 2s}} \uparrow \alpha \; - \; b,$$

where $\alpha \in\, ]1, 2]$. A continuous density fulfilling the previous requirements is

$$p_r(R) = \begin{cases} c\, f_{\mathbb{R}_+}(\epsilon; b)\, \epsilon^{1 - nd} R^{nd - 1} & \text{if } R \in [0, \epsilon[, \\ c\, f_{\mathbb{R}_+}(R; b) & \text{if } R \ge \epsilon, \end{cases}$$

where $f_{\mathbb{R}_+}$ is the density defined in (1) and $c$ is a normalization constant. Because

$$\int_{\epsilon}^{\infty} f_{\mathbb{R}_+}(R; b)\, dR = 1 - (1 - \ln 2) \log \alpha$$

and

$$\int_0^{\epsilon} f_{\mathbb{R}_+}(\epsilon; b)\, \epsilon^{1 - nd} R^{nd - 1}\, dR = \frac{f_{\mathbb{R}_+}(\epsilon; b)\, \epsilon}{nd},$$

we get

$$c = \left( 1 - (1 - \ln 2) \log \alpha + \frac{f_{\mathbb{R}_+}(\epsilon; b)\, \epsilon}{nd} \right)^{-1}.$$

4. A CODE LENGTH ACCORDING TO UNIFORM DISTRIBUTIONS IN ARBITRARY BALLS

We consider here modelling a data sequence according to a uniform distribution in an arbitrary ball, first in one and then in two dimensions. The one-dimensional case is important because a uniform distribution in an axis-aligned box is the product of the densities of the coordinates according to uniform distributions in one-dimensional balls. In the first subsection, we assume that the minimum and maximum values of the one-dimensional sequence are unequal. In the second subsection, we bound the density not by choosing a special prior but by altering the models slightly. The third and fourth subsections examine the two-dimensional case. Let $c(x^n)$ denote the centre of the smallest enclosing ball of $x^n \in (\mathbb{R}^d)^n$, and let $r(x^n)$ denote the radius of that ball.
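4.1. One dimension, $\min(x^n) \ne \max(x^n)$

We first restrict the data through the maximum likelihood parameters. Let $c_0 \in \mathbb{R}$ and let $\delta, r_1, r_2 > 0$. Assume that $r_1 < r_2$. Let the set of sequences to be considered be $A = \{ x^n \in \mathbb{R}^n : c(x^n) \in [c_0 - \delta, c_0 + \delta],\ r(x^n) \in [r_1, r_2] \}$. The maximum likelihood of $x^n \in A$ is $f_{\mathrm{ML}}(x^n) = (2 r(x^n))^{-n}$, and the corresponding normalizing integral is

$$C(c_0, \delta, r_1, r_2) = \int_{x^n \in A} f_{\mathrm{ML}}(x^n)\, dx^n \tag{4}$$
$$= n(n-1) \int_{\substack{x_1, x_2 \in \mathbb{R}:\ (x_1 + x_2)/2 \in [c_0 - \delta, c_0 + \delta], \\ (x_2 - x_1)/2 \in [r_1, r_2]}} \int_{x_3, \ldots, x_n \in [x_1, x_2]} (x_2 - x_1)^{-n}\, dx_n \cdots dx_3\, dx_2\, dx_1$$
$$= n(n-1) \int_{\substack{x_1, x_2 \in \mathbb{R}:\ (x_1 + x_2)/2 \in [c_0 - \delta, c_0 + \delta], \\ (x_2 - x_1)/2 \in [r_1, r_2]}} \frac{dx_2\, dx_1}{(x_2 - x_1)^2}$$
$$= n(n-1) \int_{c_0 - \delta}^{c_0 + \delta} \int_{r_1}^{r_2} \frac{2}{(2r)^2}\, dr\, dc \tag{5}$$
$$= n(n-1)\, \delta \left( \frac{1}{r_1} - \frac{1}{r_2} \right).$$

The coordinate change $(x_1, x_2) = (c - r, c + r)$ was made at (5). Dividing the maximum likelihood by the normalizing integral yields the NML density function

$$f_{\mathrm{NML}}(x^n; c_0, \delta, r_1, r_2) = \frac{1}{(2 r(x^n))^n\, n(n-1)} \cdot \frac{1}{\delta} \cdot \frac{r_1 r_2}{r_2 - r_1} \tag{6}$$

if $x^n \in A$. The next step is to replace $c_0$, $\delta$, $r_1$ and $r_2$ with more general parameters that allow us to define a non-zero density for all $x^n \in \mathbb{R}^n$ having $r(x^n) > 0$. Our solution is similar to the one in Section 3. We assume that $r_1$ is independent of $\delta$ and $c_0$.

Consider the parameters $r_1$ and $r_2$ first. Let again $t > 1$ and $r_2 = r_2(r_1) = t r_1$. Requiring that $r(x^n) \in [r_1, r_2] = [r_1, t r_1]$, we replace the coefficient $r_1 r_2 / (r_2 - r_1) = t r_1 / (t - 1)$ in (6) with the integral

$$\int_{r(x^n)/t}^{r(x^n)} \frac{t r}{t - 1}\, p_r(r)\, dr,$$

where $p_r$ is a continuous prior of the parameter $r_1$. Letting $t$ approach 1 from above, we get

$$\lim_{t \to 1^+} \int_{r(x^n)/t}^{r(x^n)} \frac{t r}{t - 1}\, p_r(r)\, dr = \lim_{t \to 1^+} \left( r(x^n) - \frac{r(x^n)}{t} \right) \frac{t\, r(x^n)}{t - 1}\, p_r(r(x^n)) = r(x^n)^2\, p_r(r(x^n)).$$

Next, we get rid of the coefficient $1/\delta$ and the dependence on $c_0$ in (6). Let $\delta > 0$ and let $p_c$ be a continuous prior density function of the parameter $c_0$. The integration goes over all values of $c_0$ such that $c(x^n) \in [c_0 - \delta, c_0 + \delta]$. In a similar fashion as above, we substitute $1/\delta$ with the limiting function

$$\lim_{\delta \to 0^+} \int_{c(x^n) - \delta}^{c(x^n) + \delta} \frac{1}{\delta}\, p_c(c)\, dc = 2\, p_c(c(x^n)). \tag{7}$$

The final density function is thus

$$f(x^n; p_c, p_r) = \frac{p_c(c(x^n))\, p_r(r(x^n))}{2^{n-1}\, r(x^n)^{n-2}\, n(n-1)} \tag{8}$$

if $x^n \in \mathbb{R}^n$ and $r(x^n) > 0$. This time the problematic singularities are the sequences consisting of equal points.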
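A minimal Python sketch of the code length implied by density (8) is given below, purely as an illustration. The names `prior_pos`, `prior_real` and `code_length_1d` are hypothetical. The default priors are an assumption: the prior of Section 2 with $b = 1$ (that is, $k = 1$, $\delta = 1$, normalizing constant $(1 - \ln 2)/\ln 2$) for the radius, and its symmetric extension $\tfrac{1}{2} f_{\mathbb{R}_+}(|x|; b)$ for the centre. Any other continuous priors can be passed in instead.

```python
import math

LN2 = math.log(2.0)

def h(x):
    """h(x) = 1 if log2(x) <= 1, otherwise log2(x) * h(log2(x)); needs x >= 1."""
    prod, lx = 1.0, math.log2(x)
    while lx > 1.0:
        prod *= lx
        lx = math.log2(lx)
    return prod

def prior_pos(x):
    """Assumed radius prior: f_{R+}(x; b) with b = 1 (k = 1, delta = 1)."""
    const = (1.0 - LN2) / LN2
    return const / ((x + 1.0) * h(x + 1.0))

def prior_real(x):
    """Assumed centre prior on the whole real line: (1/2) f_{R+}(|x|; b)."""
    return 0.5 * prior_pos(abs(x))

def code_length_1d(xs, p_c=prior_real, p_r=prior_pos):
    """Negative base-2 logarithm of density (8), i.e. a code length in bits."""
    n = len(xs)
    if n < 2:
        raise ValueError("density (8) requires at least two points")
    lo, hi = min(xs), max(xs)
    c = (lo + hi) / 2.0   # centre of the smallest enclosing interval
    r = (hi - lo) / 2.0   # its radius
    if r <= 0.0:
        raise ValueError("density (8) is undefined when all points coincide")
    log2_f = (math.log2(p_c(c)) + math.log2(p_r(r))
              - (n - 1) - (n - 2) * math.log2(r) - math.log2(n * (n - 1)))
    return -log2_f
```

With three or more points, shrinking the sequence towards a single value makes the returned code length arbitrarily small, which is the singularity discussed in the text; for instance, `code_length_1d([0.0, 1e-9, 2e-9])` is far smaller than `code_length_1d([0.0, 1.0, 2.0])`.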
We can bound the density and extend it to the singularities as at the end of Section 3, choosing a special prior $p_r$ that keeps $f(x^n; p_c, p_r)$ constant when $r(x^n) \in [0, \epsilon]$. For the case $n = 1$ we let $f((x_1); p_c) = p_c(x_1)$.

A naive solution for bounding the density is to add one extra point to the beginning of the sequence in order to ensure that the difference between the maximum and minimum values in the sequence is greater than some positive $\epsilon$. When decoding, this point is simply discarded. But then, if $r(x^n) > \epsilon$, the number of extra bits needed compared to (8) is no longer a constant but $\log r(x^n) + \log(n + 1) - \log(n - 1) + 1$. In the next subsection, we provide yet another way to bound the density.

4.2. One dimension, bounded maximum likelihood

We restrict the largest possible density by bounding the radius parameter of the models from below. Assume for the time being that $n \in \{2, 3, \ldots\}$. We shall see later that the result applies for $n = 1$ as well. Let $x^n \in \mathbb{R}^n$ and let the smallest radius that can be used for encoding be $\epsilon > 0$. The maximum likelihood is thus

$$f_{\mathrm{ML}}(x^n; \epsilon) = \begin{cases} (2 r(x^n))^{-n} & \text{if } r(x^n) \ge \epsilon, \\ (2 \epsilon)^{-n} & \text{otherwise.} \end{cases}$$

Let $\delta, r_2 > 0$ and let $c_0 \in \mathbb{R}$. We first calculate the normalizing integral in the bounded set $\{ x^n \in \mathbb{R}^n : c(x^n) \in [c_0 - \delta, c_0 + \delta],\ r(x^n) \le r_2 \}$. The integral consists of two parts:

$$C(c_0, \delta, r_2) = \int_{\substack{x^n \in \mathbb{R}^n:\ c(x^n) \in [c_0 - \delta, c_0 + \delta], \\ r(x^n) \in [0, \epsilon[}} (2 \epsilon)^{-n}\, dx^n + \int_{\substack{x^n \in \mathbb{R}^n:\ c(x^n) \in [c_0 - \delta, c_0 + \delta], \\ r(x^n) \in [\epsilon, r_2]}} (2 r(x^n))^{-n}\, dx^n.$$
The second term is $C(c_0, \delta, \epsilon, r_2)$ as in (4). The first term is equal to

$$n(n-1) \int_{\substack{x_1, x_2 \in \mathbb{R}:\ (x_1 + x_2)/2 \in [c_0 - \delta, c_0 + \delta], \\ (x_2 - x_1)/2 \in [0, \epsilon[}} \frac{(x_2 - x_1)^{n-2}}{(2 \epsilon)^n}\, dx_2\, dx_1 = n(n-1) \int_{c_0 - \delta}^{c_0 + \delta} \int_0^{\epsilon} \frac{2\, (2r)^{n-2}}{(2 \epsilon)^n}\, dr\, dc = \frac{n \delta}{\epsilon}.$$

Putting these together yields

$$C(c_0, \delta, r_2) = n(n-1)\, \delta \left( \frac{1}{\epsilon} - \frac{1}{r_2} \right) + \frac{n \delta}{\epsilon}.$$

When $r_2$ approaches infinity, $C(c_0, \delta, r_2) \to n^2 \delta / \epsilon$. We normalize the maximum likelihood by this limit and use a prior for the parameter $c_0$ in a similar way as in (7), getting the density

$$f_{\epsilon}(x^n; p_c) = \frac{\epsilon\, p_c(c(x^n))}{2^{n-1}\, r(x^n)^n\, n^2} \tag{9}$$

if $r(x^n) \ge \epsilon$, and

$$f_{\epsilon}(x^n; p_c) = \frac{p_c(c(x^n))}{(2 \epsilon)^{n-1}\, n^2} \tag{10}$$

if $r(x^n) \in [0, \epsilon[$. Letting $n = 1$ in (10) yields $f_{\epsilon}((x_1); p_c) = p_c(x_1)$, which is a valid density.

4.3. Two dimensions, $r(x^n) > 0$

Next, we consider the arbitrary ball model in two dimensions. Assume first that $n \in \{3, 4, \ldots\}$. The final result is also valid for $n = 2$, which we shall see after the calculation of the normalizing integral at (11). Let $(C_1, C_2) \in \mathbb{R}^2$ and let $\delta, r_1, r_2 > 0$, $r_1 < r_2$. We assume first that the centre of the smallest enclosing ball of the point sequence belongs to the set $D = [C_1 - \delta, C_1 + \delta] \times [C_2 - \delta, C_2 + \delta]$ and that the radius of that ball is in the interval $[r_1, r_2]$. Let the set of sequences fulfilling these conditions be $A = \{ x^n \in (\mathbb{R}^2)^n : c(x^n) \in D,\ r(x^n) \in [r_1, r_2] \}$. Denote the maximum likelihood in this set by $f_{\mathrm{ML}}(x^n) = (\pi r(x^n)^2)^{-n}$.

There must be at least two different points $x_i, x_j$ in the sequence $x^n$ such that $x_i, x_j \in \partial B(c(x^n), r(x^n))$. If $x_i$ and $x_j$ are the only points in $x^n$ belonging to the boundary of the smallest enclosing ball, then $c(x^n) = (x_i + x_j)/2$. If at least three points of $x^n$ belong to $\partial B(c(x^n), r(x^n))$, there are three different indices $i, j, k \in \{1, 2, \ldots, n\}$ such that $x_i, x_j, x_k \in \partial B(c(x^n), r(x^n))$ and $c(x^n) \in \operatorname{conv}\{x_i, x_j, x_k\}$, where $\operatorname{conv}\{x_i, x_j, x_k\}$ is the convex hull of the set $\{x_i, x_j, x_k\}$.

We then derive the integral $\int_{x^n \in A} f_{\mathrm{ML}}(x^n)\, dx^n$ by dividing the integration space into two parts whose intersection is a null set. First, consider a situation where the points $x_1$ and $x_2$ determine the minimal enclosing ball. Let $x_i(j)$ denote the $j$th coordinate of $x_i$.
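We change the coordinate system using the function

$$w(c_1, c_2, r, \theta) = (c_1 + r \cos \theta,\ c_2 + r \sin \theta,\ c_1 - r \cos \theta,\ c_2 - r \sin \theta) = (x_1(1), x_1(2), x_2(1), x_2(2)).$$

Hence, $|\det w'(c_1, c_2, r, \theta)| = 4r$. The integral with $x_1$ and $x_2$ as outermost points is

$$I_2(D, r_1, r_2) = \int_{\substack{x^n \in A:\ x_1, x_2 \in \partial B(c(x^n), r(x^n)), \\ c(x^n) = (x_1 + x_2)/2}} f_{\mathrm{ML}}(x^n)\, dx^n$$
$$= \int_{\substack{x_1, x_2 \in \mathbb{R}^2:\ (x_1 + x_2)/2 \in D, \\ r_1 < \|x_1 - x_2\|/2 < r_2}} \int_{x_3, \ldots, x_n \in B((x_1 + x_2)/2,\ \|x_1 - x_2\|/2)} \left( \pi \left( \frac{\|x_1 - x_2\|}{2} \right)^2 \right)^{-n} dx_n \cdots dx_3\, dx_2\, dx_1$$
$$= \int_{\substack{x_1, x_2 \in \mathbb{R}^2:\ (x_1 + x_2)/2 \in D, \\ r_1 < \|x_1 - x_2\|/2 < r_2}} \left( \pi \left( \frac{\|x_1 - x_2\|}{2} \right)^2 \right)^{-2} dx_2\, dx_1$$
$$= \int_{(c_1, c_2) \in D} \int_{r_1}^{r_2} \int_0^{2\pi} \frac{4r}{(\pi r^2)^2}\, d\theta\, dr\, dc_2\, dc_1 = (4 \delta^2)(2 \pi)\, \frac{4}{\pi^2} \int_{r_1}^{r_2} r^{-3}\, dr = \frac{16 \delta^2}{\pi} \left( \frac{1}{r_1^2} - \frac{1}{r_2^2} \right).$$

Figure 1. Three points and their smallest enclosing ball. If and only if the angular coordinate of $x_3$ is between $\alpha + \pi$ and $\beta + \pi$, the centre of the smallest enclosing ball is the point marked with a cross.

Next, consider the situation where the points $x_1$, $x_2$ and $x_3$ determine the smallest enclosing ball (Figure 1). The function for the change of coordinates is now

$$w(c_1, c_2, r, \alpha, \beta, \gamma) = (c_1 + r \cos \alpha,\ c_2 + r \sin \alpha,\ c_1 + r \cos \beta,\ c_2 + r \sin \beta,\ c_1 + r \cos \gamma,\ c_2 + r \sin \gamma),$$

and $|\det w'(c_1, c_2, r, \alpha, \beta, \gamma)| = |\sin(\alpha - \beta) + \sin(\gamma - \alpha) + \sin(\beta - \gamma)|\, r^3$.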
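The two Jacobian determinants stated above can be checked numerically. The following Python sketch is an illustration, not part of the derivation: it approximates the Jacobian of each coordinate change by central differences and evaluates its determinant by Gaussian elimination. The helper names `w2`, `w3`, `jacobian` and `det`, and the step size `eps`, are assumptions made for this example.

```python
import math

def w2(p):
    """Two-point coordinate change (c1, c2, r, theta) -> (x1, x2)."""
    c1, c2, r, th = p
    return [c1 + r * math.cos(th), c2 + r * math.sin(th),
            c1 - r * math.cos(th), c2 - r * math.sin(th)]

def w3(p):
    """Three-point coordinate change (c1, c2, r, alpha, beta, gamma) -> (x1, x2, x3)."""
    c1, c2, r, a, b, g = p
    return [c1 + r * math.cos(a), c2 + r * math.sin(a),
            c1 + r * math.cos(b), c2 + r * math.sin(b),
            c1 + r * math.cos(g), c2 + r * math.sin(g)]

def jacobian(f, p, eps=1e-6):
    """Central-difference Jacobian; entry [i][j] approximates d f_i / d p_j."""
    cols = []
    for j in range(len(p)):
        hi, lo = list(p), list(p)
        hi[j] += eps
        lo[j] -= eps
        cols.append([(u - v) / (2.0 * eps) for u, v in zip(f(hi), f(lo))])
    return [[cols[j][i] for j in range(len(p))] for i in range(len(cols[0]))]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for i in range(n):
        piv = max(range(i, n), key=lambda k: abs(a[k][i]))
        if a[piv][i] == 0.0:
            return 0.0
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            d = -d
        d *= a[i][i]
        for k in range(i + 1, n):
            factor = a[k][i] / a[i][i]
            for j in range(i, n):
                a[k][j] -= factor * a[i][j]
    return d

# Compare the numerical determinants with the closed forms at arbitrary points.
p2 = [0.3, -1.2, 1.7, 0.9]
print(abs(det(jacobian(w2, p2))), 4 * p2[2])

p3 = [0.3, -1.2, 1.7, 0.4, 2.0, 4.1]
al, be, ga = p3[3:]
closed_form = abs(math.sin(al - be) + math.sin(ga - al) + math.sin(be - ga)) * p3[2] ** 3
print(abs(det(jacobian(w3, p3))), closed_form)
```

Each printed pair should agree up to the finite-difference error. The test points are arbitrary; any parameters with $r > 0$ and three distinct angles can be substituted.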
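The integral with $x_1$, $x_2$ and $x_3$ as outermost points without any fixed ordering is

$$I_3(D, r_1, r_2) = \int_{\substack{x^n \in A:\ x_1, x_2, x_3 \in \partial B(c(x^n), r(x^n)), \\ c(x^n) \in \operatorname{conv}\{x_1, x_2, x_3\}}} f_{\mathrm{ML}}(x^n)\, dx^n$$
$$= \int_{\substack{x_1, x_2, x_3 \in \mathbb{R}^2:\ c((x_1, x_2, x_3)) \in D, \\ r_1 < r((x_1, x_2, x_3)) < r_2}} \int_{x_4, \ldots, x_n \in B(c((x_1, x_2, x_3)),\ r((x_1, x_2, x_3)))} \left( \pi\, r((x_1, x_2, x_3))^2 \right)^{-n} dx_n \cdots dx_4\, dx_3\, dx_2\, dx_1$$
$$= \int_{\substack{x_1, x_2, x_3 \in \mathbb{R}^2:\ c((x_1, x_2, x_3)) \in D, \\ r_1 < r((x_1, x_2, x_3)) < r_2}} \left( \pi\, r((x_1, x_2, x_3))^2 \right)^{-3} dx_3\, dx_2\, dx_1$$
$$= 2 \int_{(c_1, c_2) \in D} \int_{r_1}^{r_2} \int_0^{2\pi} \int_{\alpha}^{\alpha + \pi} \int_{\alpha + \pi}^{\beta + \pi} \frac{|\sin(\alpha - \beta) + \sin(\gamma - \alpha) + \sin(\beta - \gamma)|\, r^3}{(\pi r^2)^3}\, d\gamma\, d\beta\, d\alpha\, dr\, dc_2\, dc_1$$
$$= \frac{2}{\pi^3}\, (4 \delta^2) \cdot \frac{1}{2} \left( \frac{1}{r_1^2} - \frac{1}{r_2^2} \right) (2 \pi)(3 \pi) = \frac{24 \delta^2}{\pi} \left( \frac{1}{r_1^2} - \frac{1}{r_2^2} \right).$$

The factor 2 accounts for the two possible orientations of the three points on the boundary. Using symmetry among the points, we get the normalizing integral

$$\int_{x^n \in A} f_{\mathrm{ML}}(x^n)\, dx^n = \binom{n}{2} I_2(D, r_1, r_2) + \binom{n}{3} I_3(D, r_1, r_2) \tag{11}$$
$$= \left( \frac{n(n-1)}{2} \cdot \frac{16 \delta^2}{\pi} + \frac{n(n-1)(n-2)}{6} \cdot \frac{24 \delta^2}{\pi} \right) \left( \frac{1}{r_1^2} - \frac{1}{r_2^2} \right) = \frac{4 n^2 (n-1)\, \delta^2}{\pi} \left( \frac{1}{r_1^2} - \frac{1}{r_2^2} \right).$$

The calculation above is valid also for $n = 2$ if we define $\binom{2}{3} = 0$. The corresponding normalized density is

$$f_{\mathrm{NML}}(x^n; D, r_1, r_2) = \frac{\pi}{(\pi r(x^n)^2)^n\, 4 n^2 (n-1)\, \delta^2} \cdot \frac{r_1^2 r_2^2}{r_2^2 - r_1^2}$$

if $x^n \in A$ and $n \in \{2, 3, \ldots\}$. When we give the centre of the smallest enclosing ball a prior density $p_c$ and the radius $r_1$ a prior $p_r$, we can derive the final density as before. Let $c_i(x^n)$ denote the $i$th coordinate of $c(x^n)$. The limits are

$$\lim_{\delta \to 0^+} \int_{c_1(x^n) - \delta}^{c_1(x^n) + \delta} \int_{c_2(x^n) - \delta}^{c_2(x^n) + \delta} \frac{1}{\delta^2}\, p_c((c_1, c_2))\, dc_2\, dc_1 = 4\, p_c((c_1(x^n), c_2(x^n)))$$

and

$$\lim_{t \to 1^+} \int_{r(x^n)/t}^{r(x^n)} \frac{r^2 (t r)^2}{(t r)^2 - r^2}\, p_r(r)\, dr = \lim_{t \to 1^+} \left( r(x^n) - \frac{r(x^n)}{t} \right) \frac{t^2\, r(x^n)^2}{t^2 - 1}\, p_r(r(x^n)) = \frac{r(x^n)^3}{2}\, p_r(r(x^n)).$$

The final density is thus

$$f(x^n; p_c, p_r) = \frac{p_c(c(x^n))\, p_r(r(x^n))}{2\, \pi^{n-1}\, r(x^n)^{2n-3}\, n^2 (n-1)} \tag{12}$$

if $x^n \in \{ y^n \in (\mathbb{R}^2)^n : r(y^n) > 0 \}$, where $n \in \{2, 3, \ldots\}$.

4.4. Two dimensions, bounded maximum likelihood

We omit the calculations here, because they are essentially similar to those in the one-dimensional case in Subsection 4.2. Let $n \in \{1, 2, 3, \ldots\}$. The final density is

$$f_{\epsilon}(x^n; p_c) = \frac{\epsilon^2\, p_c(c(x^n))}{\pi^{n-1}\, r(x^n)^{2n}\, n^3}$$

if $r(x^n) \ge \epsilon$, and

$$f_{\epsilon}(x^n; p_c) = \frac{p_c(c(x^n))}{\pi^{n-1}\, \epsilon^{2n-2}\, n^3}$$

if $r(x^n) \in [0, \epsilon[$.

5. REFERENCES

[1] Jorma Rissanen, Information and Complexity in Statistical Modeling, Springer Verlag, New York, 2007.
[2] Peter D. Grünwald, The Minimum Description Length Principle, The MIT Press, 2007.
[3] Panu Luosto, Jyrki Kivinen, and Heikki Mannila, Gaussian clusters and noise: an approach based on the minimum description length principle, in Discovery Science, 2010, to appear.
[4] Jorma Rissanen, Fisher information and stochastic complexity, IEEE Transactions on Information Theory, vol. 42, no. 1, pp. 40-47, January 1996.
[5] Jorma Rissanen, A universal prior for integers and estimation by minimum description length, The Annals of Statistics, vol. 11, no. 2, pp. 416-431, June 1983.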
More informationMATH 425, FINAL EXAM SOLUTIONS
MATH 425, FINAL EXAM SOLUTIONS Each exercise is worth 50 points. Exercise. a The operator L is defined on smooth functions of (x, y by: Is the operator L linear? Prove your answer. L (u := arctan(xy u
More informationTypes of Real Integrals
Math B: Complex Variables Types of Real Integrals p(x) I. Integrals of the form P.V. dx where p(x) and q(x) are polynomials and q(x) q(x) has no eros (for < x < ) and evaluate its integral along the fol-
More informationIrredundant Families of Subcubes
Irredundant Families of Subcubes David Ellis January 2010 Abstract We consider the problem of finding the maximum possible size of a family of -dimensional subcubes of the n-cube {0, 1} n, none of which
More informationHigh School Math Contest
High School Math Contest University of South Carolina February th, 017 Problem 1. If (x y) = 11 and (x + y) = 169, what is xy? (a) 11 (b) 1 (c) 1 (d) (e) 8 Solution: Note that xy = (x + y) (x y) = 169
More informationUniversal probability distributions, two-part codes, and their optimal precision
Universal probability distributions, two-part codes, and their optimal precision Contents 0 An important reminder 1 1 Universal probability distributions in theory 2 2 Universal probability distributions
More informationFinal practice, Math 31A - Lec 1, Fall 2013 Name and student ID: Question Points Score Total: 90
Final practice, Math 31A - Lec 1, Fall 13 Name and student ID: Question Points Score 1 1 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 Total: 9 1. a) 4 points) Find all points x at which the function fx) x 4x + 3 + x