The EM Algorithm applied to determining new limit points of Mahler measures

Contro and Cybernetics vo. 39 (2010) No. 4 The EM Agorithm appied to determining new imit points of Maher measures by Souad E Otmani, Georges Rhin and Jean-Marc Sac-Épée Université Pau Veraine-Metz, LMAM, CNRS UMR 7122, Ie du Saucy, METZ, F-57045, France e-mai: eotmani@univ-metz.fr, rhin@univ-metz.fr,jmse@univ-metz.fr Abstract: In this work, we propose new candidates expected to be imit points of Maher measures of poynomias. The too we use for determining these candidates is the Expectation-Maximization agorithm, whose goa is to optimize the ikeihood for the given data points, i.e. the known Maher measures up to degree 44, to be generated by a specific mixture of Gaussians. We wi give the mean (which is a candidate to be a new imit point) and the reative ampitude of each component of the more ikey gaussian mixture. Keywords: Maher Measure, EM agorithm. 1. State of the art Reca that if P is a poynomia defined as P(x) = n a k x k, a k C and a n 0, k=0 then its Maher measure (see Boyd, 1980, 1989; Mossinghoff, 1998) is defined to be M(P) = a n n max(1, α k ), k=1 where the α k s are the roots of P. Initiated by Lehmer (1933) who, for poynomias P with integer coefficients, provided the smaest vaues of M(P) for deg(p) = 2, 3, and 4, and for reciproca P with deg(p) = 2, 4, 6 and 8, computations on Maher measure were continued by Boyd (1980, 1981, 1989) who found a reciproca P with M(P) 1.3 Submitted: Juy 2010; Accepted: October 2010.

1186 S. EL OTMANI, G. RHIN, J.-M. ÉPÉE and degree up to 20, as we as those with M(P) 1.3 and degree up to 32 with height 1. Boyd s ists were extended by Mossinghoff (1998) up to degree 24 (up to degree 40 for height 1 poynomias). Fammang, Grandcoas and Rhin (1999) proved the competeness of these ists, and Fammang, Rhin and Sac- Épée (2006) provided a poynomias P such that M(P) θ 0 (θ 0 1.3247... is the smaest Pisot number) with deg(p) 36, and a poynomias P such that M(P) 1.31 and deg(p) 40. On Mossinghoff s web site (see http://www.cecm.sfu.ca/ mjm/lehmer/ists), we can find a ist of a known noncycotomic and irreducibe poynomias with integer coefficients and degree at most 180 and Maher measure beow 1.3, incuding poynomias provided by P. Lisonek (2000), G. Rhin and J.-M. Sac-Epée (2003), and Mossinghoff, Rhin, and Wu (2008) who proved the competeness of the ist up to degree 44. For each poynomia, its Maher measure is avaiabe. One of the important questions concerning the Maher measure is the foowing: What are the sma imit points of the Maher measure of the set of agebraic integers? In a recent paper, Boyd and Mossinghoff (2005) gave a ist of 48 such imit points ess than 1.37. We note that there are ony two imit points ess than 1.3, namey 1.255... and 1.285... A are obtained by vaues of Maher measures of poynomias in severa variabes of different types. We may add the foowing question known as Lehmer s probem: Is 1 a imit point of the set of the Maher measures? The smaest vaue of the Maher measure that is known is 1.176... given by the poynomia X 10 +X 9 X 7 X 6 X 5 X 4 X 3 +X+1, found by D.H. Lehmer himsef (Lehmer, 1933). For a more compete survey, pease refer to Smyth (2008). Then, an interesting idea is to suggest a new statistica approach, which is conceivabe having regard to compete ists of vaues which are avaiabe up to degree 44, and very rich ists up to degree 180. So, we wi focus on the possibe vaues of imit points smaer than 1.3 using these tabes as incoming data, and the EM agorithm as a statistica anaysis too to suggest the existence of two possibe new vaues of imit points as 1.256533... and 1.286625... In the foowing section, we draw the histogram of the 8415 avaiabe points with the purpose of investigating what kind of distributions this ist of points coud arise from. 2. Graphica observations Let us examine very cosey the histogram of the frequency distribution (Fig. 1) of the given points, with zooms at interesting zones. By zooming around the first known imit point 1.255433866..., we obtain Fig. 2. The pecuiar appearance of the graphic induces us to specuate that the distribution of scaar vaues around the first known imit point does not foow a simpe gaussian mode, but rather arises from a mixture gaussian mode, as in the foowing exampe (see Fig. 3) corresponding to a bimoda mixture made of

New imit points of Maher measures through EM 1187 Figure 1. Histogram of the frequency distribution of the known imit points Figure 2. Zoom around the first known imit point

1188 S. EL OTMANI, G. RHIN, J.-M. ÉPÉE Figure 3. Exampe corresponding to a bimoda gaussian Figure 4. Zoom around the second known imit point 4000 points issued from the N(0, 1) Gaussian aw and 12000 points issued from the N(2.5, 1) Gaussian aw. The same observation can be made when zooming around the second known imit point 1.285734864... (see Fig. 4). For determining the parameters of the underying gaussian mixture, we use a systematic method consisting in appying the EM agorithm (see Dasgupta

New imit points of Maher measures through EM 1189 and Schuman, 2000; MacLachan and Krishnan, 2008; Tagare, 1998; Xu and Jordan, 1996; Wu, 1983) to the given ist of points. For the reader s convenience, we outine some EM theory in Section 3. 3. The EM agorithm Consider n independent scaar vaues a 1, a 2,, a n. Each a i is supposed to arise from a probabiity distribution whose density can be expressed as f(x θ) = p j g j (x µ j, σ j ). j=1 Scaar vaue p j stands for the mixing proportion of the jth component of the mixture, and we have p j = 1 and j = 1,, N, 0 < p j < 1. j=1 Function g j ( µ k, σ k ) is the gaussian density with mean µ j and standard deviation σ j, and is defined by g j (x µ j, σ j ) = 1 (x µ j ) 2 σ 2π e 2σ j 2. θ = (p 1,, p N 1, µ 1,, µ N, σ 1,, σ N ) is a vector whose components are the mixture parameters, which are estimated by maximizing the ogikeihood L(θ a 1,, a n ) = n ( N ) n p j g(a i µ j, σ j ). i=1 j=1 Given vector θ, the beonging h(k, ) of data point a k to custer number can be computed by using Bayes theorem as h(k, ) = p(custer s number = a k, θ) = p g(x k µ, σ ). p i g(x k µ i, σ i ) One of the most iked method used for determining the maximum ikeihood soution is the Expectation-Maximization agorithm. Roughy speaking, assuming that given data arise from a gaussian mixture mode with N components, the EM agorithm is devoted to estimate the parameters (means, standard deviations) of each component of the mixture for which the observed data are the most ikey. The Expectation-Maximization agorithm for gaussian mixtures is an iterative process defined as foows: Choose initia parameters settings. i=1

1190 S. EL OTMANI, G. RHIN, J.-M. ÉPÉE Repeat unti convergence: E-Step: Using the current parameter vaues, compute h(k, ) for 1 k n, 1 N: h(k, ) (i) = p(i) i=1 g(x k µ (i), σ (i) ) p (i) g(x k µ (i), σ (i) ), M-Step: Use data points a k and just computed vaues h(k, ) to give new parameters vaues: S (i+1) = h(k, ) (i) α (i+1) k=1 = 1 N S(i+1) µ (i+1) = 1 S (i+1) h(k, ) (i) a k k=1 (σ (i+1) ) 2 = 1 S (i+1) k=1 h(k, ) (i) (a k µ (i+1) ) 2. We stop the iterative process when the og-ikeyhood s vaue becomes amost unchanged from one iteration to the next. 4. Appication for search of candidates to be new imit points Many softwares impementing EM agorithm are avaiabe on the web. Among a, we choose to use Mixmod Software, which is an exporatory data anaysis too for soving custering and cassification probems. A carefu observation of histograms above induced us to surmise that the given ist of points arises from a gaussian mixture constituted by four components that we pan to make more precise. Our choice was to work with Mixmod in Sciab environment. Around the first known imit point 1.255433866..., we appied EM agorithm on interva (1.24, 1.27). Resuts obtained after cacuations are summarized in the foowing tabe: Tabe 1. Two custers on the first interva means proportions custer 1 1.256533 0.433147 custer 2 1.255336 0.566853

New imit points of Maher measures through EM 1191 Around the second known imit point 1.285734864..., we appied EM agorithm on interva (1.275, 1.295). Resuts obtained after cacuations are summarized in the foowing tabe: Tabe 2. Two custers on the second interva means proportions custer 1 1.286625 0.36126 custer 2 1.285674 0.63874 On each interva, cacuations provided a precise approximation of the aready known imit vaue, and a new vaue expected to be a new imit point for Maher measures of poynomias. So, our two new candidates are 1.256533 and 1.286625. 5. Concusion Whie these two new vaues seem to be promising, one shoud keep in mind that contrary to known imit vaues, these new vaues are not mathematicay proved to be imit points. Numerica investigations simpy ead us to consider these points as good candidates, worthy of some more detaied theoretica studies. References Boyd, D.W. (1981) Specuations concerning the range of Maher s measure. Canad. Math. Bu. 24, 453-469. Boyd, D.W. (1980) Reciproca poynomias having sma measure. Math. Comp. 35, 1361-1377. Boyd, D.W. (1989) Reciproca poynomias having sma measure II. Math. Comp. 53, 355-357, S1-S5. Boyd, D.W. (2005) M. J. Mossinghoff, Sma Limit Points of Maher s Measure. Experimenta Mathematics 14, Part 4, 403-414. Dasgupta, S. and Schuman, L.J. (2000) A Two-Round Variant of EM for Gaussian Mixtures. 16th Conference on Uncertainty in Artificia Inteigence, 152-159. Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum Likeihood from Incompete Data via the EM Agorithm. J. Roya Statist. Soc. Ser. B., 39, 1-38. Fammang, V., Grandcoas, M. and Rhin, G. (1999) Sma Saem numbers. Number Theory in Progress, 1 (Zakopane-Kościeisko, 1997), de Gruyter, Berin, 165-168. Fammang, V., Rhin, G. and Sac-Épée, J.M. (2006) Integer Transfinite Diameter and poynomias with sma Maher measure. Math. Comp. 75, 1527-1540.

1192 S. EL OTMANI, G. RHIN, J.-M. ÉPÉE Lehmer, D.H. (1933) Factorization of certain cycotomic functions. Ann. of Math. 2 (34), 461-479. Mcachan, G.J. and Krishnan, T. (2008) The EM Agorithm and Extensions. Wiey Series in Probabiity and Statistics (Second, ed.). Wiey- Interscience, Hoboken, NJ. Mossinghoff, M.J. (1998) Poynomias with sma Maher measure. Math. Comp. 67, 1697-1705. Mossinghoff, M.J., Rhin, G. and Wu, Q. (2008) Minima Maher Measures. Experiment. Math., 17 (4), 451-458. Rhin, G. and Sac-Épée, J.M. (2003) New methods providing high degree poynomias with sma Maher measure. Experiment. Math., 12 (4), 457-461. Tagare, H.D. (1998) A Gente Introduction to the EM Agorithm, Part I: Theory. Avaiabe on the web. Xu, L. and Jordan, M.I. (1996) On Convergence Properties of the EM Agorithm for Gaussian Mixtures. Neura Computation 8, 129-151. Wu, C.F.J. (1983) On the Convergence Properties of the EM Agorithm. The Annas of Statistics 11 (1), 95-103. Smyth, C. (2008) The Maher measure of agebraic numbers: A survey. Number Theory and Poynomias, London Math. Soc. Lecture Note Ser. 352, Cambridge Univ. Press, 322-349.