An Approximate Fisher Scoring Algorithm for Finite Mixtures of Multinomials

Andrew M. Raim, Minglei Liu, Nagaraj K. Neerchal and Jorge G. Morel

Abstract

Finite mixture distributions arise naturally in many applications including clustering and classification. Since they usually do not yield closed forms for maximum likelihood estimates (MLEs), numerical methods using the well known Fisher Scoring or Expectation-Maximization algorithms are considered. In this work, an approximation to the Fisher Information Matrix of an arbitrary mixture of multinomial distributions is introduced. This leads to an Approximate Fisher Scoring algorithm (AFSA), which turns out to be closely related to Expectation-Maximization, and is more robust to the choice of initial value than Fisher Scoring iterations. A combination of AFSA and the classical Fisher Scoring iterations provides the best of both computational efficiency and stable convergence properties.

Key Words: Multinomial; Finite mixture; Maximum likelihood; Fisher Information Matrix; Fisher Scoring.

1 Introduction

A finite mixture model arises when observations are believed to originate from one of several populations, but it is unknown to which population each observation belongs. Model identifiability of mixtures is an important issue with a considerable literature; see, for example, Robbins (1948) and Teicher. The book by McLachlan and Peel (2000) is a good entry point to the literature on mixtures. A majority of the mixture literature deals with mixtures of normal distributions; however, Blischke (1962; 1964), Krolikowska (1976), and Kabir (1968) are a few early works which address mixtures of discrete distributions. The present work focuses on mixtures of multinomials, which have wide applications in clustering and classification problems, as well as in modeling overdispersed data (Morel and Nagaraj). It is well known that computation of maximum likelihood estimates (MLEs) under mixture distributions is often analytically intractable, and therefore iterative numerical methods are needed.
Classical iterative techniques such as Newton-Raphson and Fisher Scoring are two widely used methods. The more recent Expectation-Maximization (EM) algorithm discussed in Dempster, Laird and Rubin (1977) has become another standard technique to compute MLEs. EM is a framework for performing estimation in missing data problems. The idea is to solve a difficult incomplete data problem by repeatedly solving tractable complete-data problems. If the unknown population labels are treated as missing information, then estimation under the mixture distribution can be considered a missing data problem, and an EM algorithm can be used. Unlike Fisher Scoring (FSA), EM does not require computation of an expected Hessian in each iteration, which is a great advantage if this matrix is difficult to compute. Slow speed of convergence has been cited as a disadvantage of EM. Variations and improved versions of the EM algorithm have been widely used for obtaining MLEs for mixtures (McLachlan and Peel, 2000, chapter 2).

Author affiliations: Andrew Raim is a graduate student at the Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA (Email: araim1@umbc.edu). Minglei Liu is Senior Principal Biostatistician at Medtronic, Santa Rosa, CA, USA. Nagaraj Neerchal is Professor at the Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA (Email: nagaraj@umbc.edu). Jorge Morel is Principal Statistician at Procter & Gamble, Cincinnati, OH, USA.

Fisher Scoring iterations require the inverse of the Fisher Information Matrix (FIM). In the mixture setting, computing the FIM involves a complicated expectation which does not have an analytically tractable form. The matrix can be approximated numerically, by Monte Carlo simulation for example, but this is computationally expensive, especially when repeated over many iterations. Morel and Nagaraj (1991; 1993) proposed a variant of Fisher Scoring using an approximate FIM in their study of a multinomial model with extra variation. This model, now referred to as the Random Clumped Multinomial (see Example 3.1 for details), is a special case of the finite mixture of multinomials. The approximate FIM was justified asymptotically, and was used to obtain MLEs for the model and to demonstrate their efficiency. In the present paper, we extend the approximate FIM idea to general finite mixtures of multinomials and hence formulate the Approximate Fisher Scoring Algorithm (AFSA) for this family of distributions. By using the approximate FIM in place of the true FIM, we obtain an algorithm which is closely related to EM. Both AFSA and EM have a slower convergence rate than Fisher Scoring once they are in the proximity of a maximum, but both are also much more robust than Fisher Scoring in finding such regions from an arbitrary initial value.

The rest of the paper is organized as follows.
In section 2, a large cluster approximation for the Fisher Information Matrix is derived and some of its properties are presented. This approximate information matrix is easily computed and has an immediate application in Fisher Scoring, which is presented in section 3. Simulation studies are presented in section 4, illustrating convergence properties of the approximate information matrix and approximate Fisher Scoring. Concluding remarks are given in section 5.

2 An Approximate Fisher Information Matrix

Consider the multinomial sample space with $m$ trials placed into $k$ categories at random,

$$\Omega = \Big\{ x = (x_1, \ldots, x_k) : x_j \in \{0, 1, \ldots, m\}, \ \sum_{j=1}^{k} x_j = m \Big\}.$$

The standard multinomial density is

$$f(x; p, m) = \frac{m!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k} \, I(x \in \Omega),$$

where $I(\cdot)$ is the indicator function, and the parameter space is

$$\Big\{ p = (p_1, \ldots, p_{k-1}) : 0 < p_j < 1, \ \sum_{j=1}^{k-1} p_j < 1 \Big\} \subseteq \mathbb{R}^{k-1}.$$

If a random variable $X$ has distribution $f(x; p, m)$, we will write $X \sim \text{Mult}_k(p, m)$. Since $x_k = m - \sum_{j=1}^{k-1} x_j$ and $p_k = 1 - \sum_{j=1}^{k-1} p_j$, the $k$th category can be considered as redundant information. Following the sampling and overdispersion literature, we will refer to the number of trials $m$ as the cluster size of a multinomial observation.

Now suppose there are $s$ multinomial populations

$$\text{Mult}_k(p_1, m), \ldots, \text{Mult}_k(p_s, m), \qquad p_\ell = (p_{\ell 1}, \ldots, p_{\ell, k-1}),$$

where the $\ell$th population occurs with proportion $\pi_\ell$ for $\ell = 1, \ldots, s$. If we draw $X$ from the mixed population, its probability density is a finite mixture of multinomials

$$f(x; \theta) = \sum_{\ell=1}^{s} \pi_\ell \, f(x; p_\ell, m), \qquad \theta = (p_1, \ldots, p_s, \pi), \qquad (2.1)$$

and we will write $X \sim \text{MultMix}_k(\theta, m)$. The dimension of $\theta$ is $q := s(k-1) + (s-1) = sk - 1$, disregarding the redundant parameters $p_{1k}, \ldots, p_{sk}, \pi_s$. We will also make use of the following slightly less cumbersome notation for densities:

$P(x) := f(x; \theta, m)$: the mixture,
$P_\ell(x) := f(x; p_\ell, m)$: the $\ell$th component of the mixture.

The setting of this paper will be an independent sample

$$X_i \sim \text{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n,$$

with cluster sizes not necessarily equal. The resulting likelihood is

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \Big\{ \sum_{\ell=1}^{s} \pi_\ell \Big[ \frac{m_i!}{x_{i1}! \cdots x_{ik}!} \, p_{\ell 1}^{x_{i1}} \cdots p_{\ell k}^{x_{ik}} \, I(x_i \in \Omega) \Big] \Big\}. \qquad (2.2)$$

The inner summation prevents closed-form likelihood maximization, hence our goal will be to compute the MLE $\hat\theta$ numerically. Some additional preliminaries are given in Appendix A.

In general, as mentioned earlier, the Fisher Information Matrix (FIM) for mixtures involves a complicated expectation which does not have a tractable form. Since the multinomial mixture has a finite sample space, it can be computed naively by using the definition of the expectation

$$I(\theta) = \sum_{x \in \Omega} \Big\{ \frac{\partial}{\partial\theta} \log f(x; \theta) \Big\} \Big\{ \frac{\partial}{\partial\theta} \log f(x; \theta) \Big\}^T f(x; \theta), \qquad (2.3)$$

given a particular value for $\theta$. Although the number of terms $\binom{k+m-1}{m}$ in the summation is finite, it grows quickly with $m$ and $k$, and this method becomes intractable as $m$ and $k$ increase. For example, when $m = 100$ and $k = 4$, the sample space $\Omega$ contains $\binom{103}{100} = 176{,}851$ elements. To avoid these potentially expensive computations, we extend the approximate FIM approach of Morel and Nagaraj (1991; 1993) to the general finite mixture of multinomials. The following theorem states our main result.

Theorem 2.1. Suppose $X \sim \text{MultMix}_k(\theta, m)$ is a single observation from the mixed population. Denote the exact FIM with respect to $X$ as $I(\theta)$. Then an approximation to the FIM with respect to $X$ is given by the $(sk-1) \times (sk-1)$ block-diagonal matrix

$$\tilde{I}(\theta) := \text{Blockdiag}(\pi_1 F_1, \ldots, \pi_s F_s, F_\pi),$$

where for $\ell = 1, \ldots, s$,

$$F_\ell = m \big[ D_\ell^{-1} + p_{\ell k}^{-1} \mathbf{1}\mathbf{1}^T \big] \quad \text{and} \quad D_\ell = \text{diag}(p_{\ell 1}, \ldots, p_{\ell, k-1})$$

are $(k-1) \times (k-1)$ matrices,

$$F_\pi = D_\pi^{-1} + \pi_s^{-1} \mathbf{1}\mathbf{1}^T \quad \text{and} \quad D_\pi = \text{diag}(\pi_1, \ldots, \pi_{s-1})$$

are $(s-1) \times (s-1)$ matrices, and $\mathbf{1}$ denotes a vector of ones of the appropriate dimension. To emphasize the dependence of the FIM and the approximation on $m$, we will also write $I_m(\theta)$ and $\tilde{I}_m(\theta)$. If the vectors $p_1, \ldots, p_s$ are distinct (i.e. $p_a \neq p_b$ for every pair of populations $a \neq b$), then $I_m(\theta) - \tilde{I}_m(\theta) \to 0$ as $m \to \infty$.

A proof is given in Appendix B. Notice that the matrix $F_\ell$ is exactly the FIM of $\text{Mult}_k(p_\ell, m)$ for the $\ell$th population, and $F_\pi$ is the FIM of $\text{Mult}_s(\pi, 1)$ corresponding to the mixing probabilities $\pi$; see Appendix A for details. The approximate FIM turns out to be equivalent to a complete data FIM, as shown in Proposition 2.2 below, which provides an interesting connection to EM. This matrix can be formulated for any finite mixture whose components have a well-defined FIM, and is not limited to the case of multinomials.

Proposition 2.2. The matrix $\tilde{I}(\theta)$ is equivalent to the FIM of $(X, Z)$, where

$$Z = \begin{cases} 1 & \text{with probability } \pi_1 \\ \ \vdots \\ s & \text{with probability } \pi_s, \end{cases} \qquad \text{and} \qquad (X \mid Z = \ell) \sim \text{Mult}_k(p_\ell, m).$$

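Proof of Proposition 2.2. Here $Z$ represents the population from which $X$ was drawn. The complete data likelihood is then

$$L(\theta \mid x, z) = \prod_{\ell=1}^{s} \big[ \pi_\ell \, f(x \mid p_\ell, m) \big]^{I(z = \ell)}.$$

This likelihood leads to the score vectors

$$\frac{\partial}{\partial p_a} \log L(\theta) = \Delta_a \Big[ D_a^{-1} x_{-k} - \frac{x_k}{p_{ak}} \mathbf{1} \Big], \qquad \frac{\partial}{\partial \pi} \log L(\theta) = D_\pi^{-1} \Delta_{-s} - \frac{\Delta_s}{\pi_s} \mathbf{1},$$

where $x_{-k} = (x_1, \ldots, x_{k-1})$, $\Delta_\ell = I(Z = \ell)$ so that $\Delta_\ell \sim \text{Bernoulli}(\pi_\ell)$, and $\Delta_{-s}$ denotes the vector $(\Delta_1, \ldots, \Delta_{s-1})$. Taking second derivatives yields

$$\frac{\partial^2}{\partial p_a \partial p_a^T} \log L(\theta) = -\Delta_a \Big[ D_a^{-2} \, \text{diag}(x_{-k}) + \frac{x_k}{p_{ak}^2} \mathbf{1}\mathbf{1}^T \Big],$$

$$\frac{\partial^2}{\partial p_a \partial p_b^T} \log L(\theta) = 0 \ \text{ for } a \neq b, \qquad \frac{\partial^2}{\partial p_a \partial \pi^T} \log L(\theta) = 0,$$

$$\frac{\partial^2}{\partial \pi \partial \pi^T} \log L(\theta) = -\Big[ D_\pi^{-2} \, \text{diag}(\Delta_{-s}) + \frac{\Delta_s}{\pi_s^2} \mathbf{1}\mathbf{1}^T \Big].$$

Now take the expected value of the negative of each of these terms, jointly with respect to $(X, Z)$, to obtain the blocks of $\tilde{I}(\theta)$.

Corollary 2.3. Suppose $X_i \sim \text{MultMix}_k(\theta, m_i)$, $i = 1, \ldots, n$, is an independent sample from the mixed population with varying cluster sizes, and $M = m_1 + \cdots + m_n$. Then the approximate FIM with respect to $(X_1, \ldots, X_n)$ is given by

$$\tilde{I}(\theta) = \text{Blockdiag}(\pi_1 F_1, \ldots, \pi_s F_s, F_\pi),$$

$$F_\ell = M \big[ D_\ell^{-1} + p_{\ell k}^{-1} \mathbf{1}\mathbf{1}^T \big], \quad \ell = 1, \ldots, s, \qquad F_\pi = n \big[ D_\pi^{-1} + \pi_s^{-1} \mathbf{1}\mathbf{1}^T \big].$$

Proof of Corollary 2.3. Let $\tilde{I}_i(\theta)$ represent the approximate FIM with respect to observation $X_i$. The result is obtained using

$$\tilde{I}(\theta) = \tilde{I}_1(\theta) + \cdots + \tilde{I}_n(\theta),$$

corresponding to the additive property of exact FIMs for independent samples. The additive property can be justified by noting that each $\tilde{I}_i(\theta)$ is a true complete data FIM, by Proposition 2.2.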
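The block structure in Corollary 2.3 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (`component_block`, `mixing_block`, `approx_fim_sample`) and the example parameter values are hypothetical and do not come from the paper, and each probability vector is passed as a full length-$k$ (or length-$s$) list including the redundant last entry.

```python
def component_block(p, M):
    """F_l = M [D_l^{-1} + p_lk^{-1} 1 1^T]; p is the full k-vector (p_1, ..., p_k)."""
    k = len(p)
    return [[M * ((1.0 / p[i] if i == j else 0.0) + 1.0 / p[k - 1])
             for j in range(k - 1)] for i in range(k - 1)]


def mixing_block(pi, n):
    """F_pi = n [D_pi^{-1} + pi_s^{-1} 1 1^T]; pi is the full s-vector."""
    s = len(pi)
    return [[n * ((1.0 / pi[a] if a == b else 0.0) + 1.0 / pi[s - 1])
             for b in range(s - 1)] for a in range(s - 1)]


def approx_fim_sample(ps, pi, cluster_sizes):
    """Blockdiag(pi_1 F_1, ..., pi_s F_s, F_pi) for a sample with the given cluster sizes."""
    n, M = len(cluster_sizes), sum(cluster_sizes)
    blocks = [[[pi[l] * v for v in row] for row in component_block(ps[l], M)]
              for l in range(len(ps))]
    blocks.append(mixing_block(pi, n))
    q = sum(len(b) for b in blocks)          # q = s(k-1) + (s-1)
    fim = [[0.0] * q for _ in range(q)]
    offset = 0
    for b in blocks:                         # copy each block onto the diagonal
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                fim[offset + i][offset + j] = v
        offset += len(b)
    return fim


# Hypothetical example: s = 2 components, k = 3 categories, n = 3 observations.
ps = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
pi = [0.4, 0.6]
fim = approx_fim_sample(ps, pi, cluster_sizes=[20, 25, 30])
```

Because the matrix depends on the data only through $n$ and $M$, it can be rebuilt cheaply at every iteration of a scoring algorithm.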

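Since $\tilde{I}(\theta)$ is a block diagonal matrix, some useful expressions can be obtained in closed form.

Corollary 2.4. Let $\tilde{I}(\theta)$ represent the approximate FIM with respect to an independent sample $X_i \sim \text{MultMix}_k(\theta, m_i)$, $i = 1, \ldots, n$. Then:

(a) The inverse of $\tilde{I}(\theta)$ is given by

$$\tilde{I}^{-1}(\theta) = \text{Blockdiag}\big( \pi_1^{-1} F_1^{-1}, \ldots, \pi_s^{-1} F_s^{-1}, F_\pi^{-1} \big), \qquad (2.5)$$

$$F_\ell^{-1} = M^{-1} \{ D_\ell - p_\ell p_\ell^T \}, \quad \ell = 1, \ldots, s, \qquad F_\pi^{-1} = n^{-1} \{ D_\pi - \pi \pi^T \}.$$

(b) The trace of $\tilde{I}(\theta)$ is given by

$$\text{tr}\, \tilde{I}(\theta) = \sum_{\ell=1}^{s} \sum_{j=1}^{k-1} M \pi_\ell \big\{ p_{\ell j}^{-1} + p_{\ell k}^{-1} \big\} + \sum_{\ell=1}^{s-1} n \big\{ \pi_\ell^{-1} + \pi_s^{-1} \big\}.$$

(c) The determinant of $\tilde{I}(\theta)$ is given by

$$\det \tilde{I}(\theta) = \prod_{\ell=1}^{s} \Big\{ (M \pi_\ell)^{k-1} \, p_{\ell k}^{-1} \prod_{j=1}^{k-1} p_{\ell j}^{-1} \Big\} \times \pi_s^{-1} \prod_{\ell=1}^{s-1} n \pi_\ell^{-1}.$$

Proof of Corollary 2.4 (a). Since $\tilde{I}(\theta)$ is block diagonal, its inverse can be obtained by inverting the blocks, which can immediately be seen to be (2.5). To find the expressions for the individual blocks, we can apply the Sherman-Morrison formula (see for example Rao 1965, chapter 1)

$$(C + uv^T)^{-1} = C^{-1} - \frac{C^{-1} u v^T C^{-1}}{1 + v^T C^{-1} u}.$$

For the case of $F_\pi^{-1}$, for example, take $C = D_\pi^{-1}$ and $u = v = \pi_s^{-1/2} \mathbf{1}$, and use the expressions in Corollary 2.3.

Proof of Corollary 2.4 (b). Since the trace of a block diagonal matrix is the sum of the traces of its blocks, we have

$$\text{tr}\, \tilde{I}(\theta) = \pi_1 \, \text{tr}\, F_1 + \cdots + \pi_s \, \text{tr}\, F_s + \text{tr}\, F_\pi. \qquad (2.6)$$

The individual traces can be obtained as

$$\text{tr}\, F_\ell = \text{tr} \big[ M \big( D_\ell^{-1} + p_{\ell k}^{-1} \mathbf{1}\mathbf{1}^T \big) \big] = \sum_{j=1}^{k-1} M \big\{ p_{\ell j}^{-1} + p_{\ell k}^{-1} \big\},$$

a summation over the diagonal elements. Similarly, for the block corresponding to $\pi$,

$$\text{tr}\, F_\pi = \text{tr} \big[ n \big( D_\pi^{-1} + \pi_s^{-1} \mathbf{1}\mathbf{1}^T \big) \big] = \sum_{\ell=1}^{s-1} n \big\{ \pi_\ell^{-1} + \pi_s^{-1} \big\}.$$

The result is obtained by substituting these expressions into (2.6).

Proof of Corollary 2.4 (c). Since $\tilde{I}(\theta)$ has a block diagonal structure,

$$\det \tilde{I}(\theta) = \det\{F_\pi\} \prod_{\ell=1}^{s} \det\{\pi_\ell F_\ell\} = n^{s-1} \det\big\{ D_\pi^{-1} + \pi_s^{-1} \mathbf{1}\mathbf{1}^T \big\} \prod_{\ell=1}^{s} \pi_\ell^{k-1} M^{k-1} \det\big\{ D_\ell^{-1} + p_{\ell k}^{-1} \mathbf{1}\mathbf{1}^T \big\}. \qquad (2.7)$$

Recall the property (see for example Rao 1965, chapter 1) that for a non-singular matrix $A$, we have

$$\det(A + uu^T) = \begin{vmatrix} A & u \\ -u^T & 1 \end{vmatrix} = \det(A) \, (1 + u^T A^{-1} u).$$

This yields, for instance,

$$\det\big\{ D_\pi^{-1} + \pi_s^{-1} \mathbf{1}\mathbf{1}^T \big\} = \det\{ D_\pi^{-1} \} \big( 1 + \pi_s^{-1} \mathbf{1}^T D_\pi \mathbf{1} \big) = \Big[ \prod_{\ell=1}^{s-1} \pi_\ell^{-1} \Big] \Big( 1 + \frac{1 - \pi_s}{\pi_s} \Big) = \pi_s^{-1} \prod_{\ell=1}^{s-1} \pi_\ell^{-1}.$$

The result can be obtained by substituting the simplified determinants into (2.7).

The determinant and trace of the FIM are not utilized in the computation of MLEs, but are used in the computation of many statistics in subsequent analysis. In such applications, it may be preferable to have a closed form for these expressions. As one example, consider the Consistent Akaike Information Criterion with Fisher Information (CAICF) formulated by Bozdogan. The CAICF is an information-theoretic criterion for model selection, and is a function of the log-determinant of the FIM.

It can also be shown that $I_m^{-1}(\theta) - \tilde{I}_m^{-1}(\theta) \to 0$ as $m \to \infty$, which we now state as a theorem. A proof is given in Appendix B. This result is perhaps more immediately relevant than Theorem 2.1 for our Fisher Scoring application presented in the following section.

Theorem 2.5. Let $I_m(\theta)$ and $\tilde{I}_m(\theta)$ be defined as in Theorem 2.1 (namely the FIM and approximate FIM with respect to a single observation with cluster size $m$). Then $I_m^{-1}(\theta) - \tilde{I}_m^{-1}(\theta) \to 0$ as $m \to \infty$.

In the next section, we use the approximate FIM obtained in Theorem 2.1 to define an approximate Fisher Scoring algorithm and investigate its properties.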
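The large cluster behavior in Theorem 2.1 can be explored numerically in the simplest case, a mixture of two binomials ($k = 2$, $s = 2$), where the naive summation (2.3) runs over only $m + 1$ points. The sketch below is not the authors' code: the helper names, the parameter values, and the use of a Frobenius distance scaled by the norm of the exact matrix are all illustrative assumptions. With $\theta = (p_1, p_2, \pi_1)$ the approximate FIM is a $3 \times 3$ diagonal matrix.

```python
import math


def binom_pmf(x, m, p):
    return math.comb(m, x) * p ** x * (1.0 - p) ** (m - x)


def exact_fim_binomial(p1, p2, pi1, m):
    """Exact FIM of theta = (p1, p2, pi1), summing score outer products over x = 0..m."""
    fim = [[0.0] * 3 for _ in range(3)]
    for x in range(m + 1):
        f1, f2 = binom_pmf(x, m, p1), binom_pmf(x, m, p2)
        mix = pi1 * f1 + (1.0 - pi1) * f2
        score = [
            pi1 * f1 * (x / p1 - (m - x) / (1.0 - p1)) / mix,          # d/dp1 log P(x)
            (1.0 - pi1) * f2 * (x / p2 - (m - x) / (1.0 - p2)) / mix,  # d/dp2 log P(x)
            (f1 - f2) / mix,                                           # d/dpi1 log P(x)
        ]
        for a in range(3):
            for b in range(3):
                fim[a][b] += score[a] * score[b] * mix
    return fim


def approx_fim_binomial(p1, p2, pi1, m):
    """Blockdiag(pi_1 F_1, pi_2 F_2, F_pi) from Theorem 2.1; every block is 1 x 1 here."""
    pi2 = 1.0 - pi1
    diag = [pi1 * m * (1 / p1 + 1 / (1 - p1)),
            pi2 * m * (1 / p2 + 1 / (1 - p2)),
            1 / pi1 + 1 / pi2]
    return [[diag[a] if a == b else 0.0 for b in range(3)] for a in range(3)]


def scaled_frobenius(a, b):
    """Frobenius norm of (a - b) divided by the Frobenius norm of b."""
    num = sum((a[i][j] - b[i][j]) ** 2 for i in range(3) for j in range(3))
    den = sum(b[i][j] ** 2 for i in range(3) for j in range(3))
    return math.sqrt(num / den)


def sample_space_size(m, k):
    """Number of terms in the naive sum (2.3) for k categories and m trials."""
    return math.comb(k + m - 1, m)


# Hypothetical parameter values, chosen only for illustration.
d_small = scaled_frobenius(approx_fim_binomial(0.2, 0.7, 0.4, 10),
                           exact_fim_binomial(0.2, 0.7, 0.4, 10))
d_large = scaled_frobenius(approx_fim_binomial(0.2, 0.7, 0.4, 200),
                           exact_fim_binomial(0.2, 0.7, 0.4, 200))
```

For well separated components such as these, `d_large` should be far smaller than `d_small`. The same enumeration becomes costly for larger $k$, which `sample_space_size` makes easy to check.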

3 Approximate Fisher Scoring Algorithm

Consider an independent sample with varying cluster sizes

$$X_i \sim \text{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n.$$

Let $\theta^{(0)}$ be an initial guess for $\theta$, and $S(\theta)$ be the score vector with respect to the sample (see Appendix A). Then by independence

$$S(\theta) = \sum_{i=1}^{n} S(\theta; x_i),$$

where $S(\theta; x_i)$ is the score vector with respect to the $i$th observation. The Fisher Scoring Algorithm is given by computing the iterations

$$\theta^{(g+1)} = \theta^{(g)} + I^{-1}(\theta^{(g)}) \, S(\theta^{(g)}), \qquad g = 1, 2, \ldots \qquad (3.1)$$

until the convergence criterion

$$\big| \log L(\theta^{(g+1)}) - \log L(\theta^{(g)}) \big| < \varepsilon$$

is met, for some given tolerance $\varepsilon > 0$. In practice, a line search may be used for every iteration after determining a search direction, but such modifications will not be considered here. Note that (3.1) uses the exact FIM, which may not be easily computable. We propose to substitute the approximation $\tilde{I}(\theta)$ for $I(\theta)$, and will refer to the resulting method as the Approximate Fisher Scoring Algorithm (AFSA). The expressions for $\tilde{I}(\theta)$ and its inverse are available in closed form, as seen in Corollaries 2.3 and 2.4.

AFSA can be applied to finite mixture of multinomial models which are not explicitly in the form of (2.2). We now give two examples which use AFSA to compute MLEs for such models. The first is the Random Clumped model for overdispersed multinomial data. The second is an arbitrary mixture of multinomials with links from parameters to covariates.

Example 3.1. In section 1 we mentioned the Random Clumped Multinomial (RCM), a distribution that addresses overdispersion due to clumped sampling in the multinomial framework. RCM represents an interesting model for exploring computational methods. Recently, Zhou and Lange (2010) have used it as an illustrative example for the minorization-maximization principle. Raim et al. (2012) have explored parallel computing in maximum likelihood estimation using large RCM models as a test problem. It turns out that RCM conforms to the finite mixture of multinomials representation (2.1), and can therefore be fitted by the AFSA algorithm.
Once the mixture representation is established, the score vector and approximate FIM can be formulated by the use of transformations; see for example section 2.6 of Lehmann and Casella. Hence, we can obtain the algorithm presented in Morel and Nagaraj (1993) and Neerchal and Morel (1998) as an AFSA-type algorithm.

Consider a cluster of $m$ trials, where each trial results in one of $k$ possible outcomes with probabilities $\pi_1, \ldots, \pi_k$. Suppose a default category is also selected at random, so that each trial either results in this default outcome with probability $\rho$, or an independent choice with probability $1 - \rho$. Intuitively, if $\rho \to 0$, RCM approaches a standard multinomial distribution. Using this idea, an RCM random variable can be obtained from the following procedure. Let

$$Y_0, Y_1, \ldots, Y_m \overset{\text{iid}}{\sim} \text{Mult}_k(\pi, 1) \quad \text{and} \quad U_1, \ldots, U_m \overset{\text{iid}}{\sim} U(0, 1)$$

be independent samples; then

$$X = Y_0 \sum_{i=1}^{m} I(U_i \le \rho) + \sum_{i=1}^{m} Y_i \, I(U_i > \rho) = Y_0 N + (Z \mid N) \qquad (3.2)$$

follows the distribution $\text{RCM}_k(\pi, \rho)$. The representation (3.2) emphasizes that $N \sim \text{Binomial}(m, \rho)$, $(Z \mid N) \sim \text{Mult}_k(\pi, m - N)$, and $Y_0 \sim \text{Mult}_k(\pi, 1)$, where $N$ and $Y_0$ are independent. RCM is also a special case of the finite mixture of multinomials, so that

$$X \sim f(x; \pi, \rho) = \sum_{\ell=1}^{k} \pi_\ell \, f(x; p_\ell, m),$$

$$p_\ell = (1 - \rho)\pi + \rho e_\ell, \ \text{ for } \ell = 1, \ldots, k-1, \qquad p_k = (1 - \rho)\pi,$$

where $f(x; p, m)$ is our usual notation for the density of $\text{Mult}_k(p, m)$. This mixture representation can be derived using moment generating functions, as shown in Morel and Nagaraj. Notice that in this mixture $s = k$, so that the number of mixture components matches the number of categories. There are also only $k$ distinct parameters rather than $sk - 1$ as in the general mixture.

The approximate FIM for the RCM model can be obtained by transformation, starting with the expression for the general mixture. Consider transforming the $k$ dimensional $\eta = (\pi, \rho)$ to the $q = sk - 1 = (k+1)(k-1)$ dimensional $\theta = (p_1, \ldots, p_s, \pi)$ so that

$$\theta(\eta) = \begin{pmatrix} (1 - \rho)\pi + \rho e_1 \\ \vdots \\ (1 - \rho)\pi + \rho e_{k-1} \\ (1 - \rho)\pi \\ \pi \end{pmatrix}.$$

The $q \times k$ Jacobian of this transformation is

$$\frac{\partial \theta}{\partial \eta} = \Big( \frac{\partial \theta_i}{\partial \eta_j} \Big) = \begin{pmatrix} (1 - \rho) I_{k-1} & -\pi + e_1 \\ \vdots & \vdots \\ (1 - \rho) I_{k-1} & -\pi + e_{k-1} \\ (1 - \rho) I_{k-1} & -\pi \\ I_{k-1} & 0 \end{pmatrix}.$$
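The sampling procedure (3.2) and the mixture components $p_\ell = (1 - \rho)\pi + \rho e_\ell$ can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and example values are hypothetical, and probabilities are stored as full length-$k$ vectors, so the $k$th component carries its $\rho$ in the last (redundant) coordinate.

```python
import random


def draw_category(probs, rng):
    """Draw one category index from a probability vector by inverting the CDF."""
    u, cumulative = rng.random(), 0.0
    for j, pj in enumerate(probs):
        cumulative += pj
        if u < cumulative:
            return j
    return len(probs) - 1  # guard against floating point round-off


def sample_rcm(pi, rho, m, rng):
    """One RCM_k(pi, rho) draw: each trial copies the default category Y_0 with
    probability rho, and otherwise makes an independent multinomial choice."""
    counts = [0] * len(pi)
    default = draw_category(pi, rng)           # Y_0
    for _ in range(m):
        if rng.random() <= rho:                # I(U_i <= rho)
            counts[default] += 1
        else:                                  # I(U_i > rho)
            counts[draw_category(pi, rng)] += 1
    return counts


def rcm_mixture_components(pi, rho):
    """Full-length component vectors p_l = (1 - rho) pi + rho e_l, l = 1..k."""
    k = len(pi)
    return [[(1.0 - rho) * pi[j] + (rho if j == l else 0.0) for j in range(k)]
            for l in range(k)]


# Hypothetical example values.
rng = random.Random(2024)
pi = [0.5, 0.3, 0.2]
x = sample_rcm(pi, rho=0.25, m=20, rng=rng)
components = rcm_mixture_components(pi, rho=0.25)
x_clumped = sample_rcm(pi, rho=1.0, m=20, rng=rng)  # every trial copies Y_0
```

Setting `rho=1.0` places all $m$ trials in the default category, while `rho=0.0` reduces the draw to an ordinary multinomial, which matches the intuition given above.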

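Using the relations

$$S(\eta) = \frac{\partial}{\partial \eta} \log f(x; \theta) = \Big( \frac{\partial \theta}{\partial \eta} \Big)^T \frac{\partial}{\partial \theta} \log f(x; \theta), \qquad I(\eta) = \text{Var}(S(\eta)) = \Big( \frac{\partial \theta}{\partial \eta} \Big)^T I(\theta) \Big( \frac{\partial \theta}{\partial \eta} \Big),$$

it is possible to obtain an explicit form of the approximate FIM as $\tilde{I}(\eta) = ((a_{ij}))$, where

$$a_{ij} = \begin{cases} m(1 - \rho)^2 (\beta_i + \beta_k) + \pi_i^{-1} + \pi_k^{-1}, & i = j, \ i, j \in \{1, \ldots, k-1\} \\ m(1 - \rho)^2 \beta_k + \pi_k^{-1}, & i \neq j, \ i, j \in \{1, \ldots, k-1\} \\ m(1 - \rho)(\gamma_i - \gamma_k), & j = k, \ i \in \{1, \ldots, k-1\} \\ \dfrac{m}{1 - \rho} \displaystyle\sum_{i=1}^{k} \pi_i (1 - \pi_i) \big[ (1 - \rho)\pi_i + \rho \big]^{-1}, & i = k, \ j = k \end{cases}$$

and

$$\beta_i = \frac{\pi_i}{(1 - \rho)\pi_i + \rho} + \frac{1 - \pi_i}{(1 - \rho)\pi_i}, \qquad \gamma_i = \frac{\pi_i (1 - \pi_i)}{(1 - \rho)\pi_i + \rho} + \frac{\pi_i}{1 - \rho}, \qquad i = 1, \ldots, k.$$

It can be shown rigorously that $\tilde{I}(\eta) - I(\eta) \to 0$ as $m \to \infty$, as stated in Morel and Nagaraj (1993), and proved in detail in Morel and Nagaraj. The proof is similar in spirit to the proof of Theorem 2.1. We then have AFSA iterations for RCM,

$$\eta^{(g+1)} = \eta^{(g)} + \tilde{I}^{-1}(\eta^{(g)}) \, S(\eta^{(g)}), \qquad g = 1, 2, \ldots$$

The following example involves a mixture of multinomials where the response probabilities are functions of covariates. The idea is analogous to the usual multinomial with logit link, but with links corresponding to each component of the mixture.

Example 3.2. In practice there are often covariates to be linked into the model. As an example of how AFSA can be applied, consider the following fixed effect model for response $Y \sim \text{MultMix}_k(\theta(x), m)$ with $d \times 1$ covariates $x$ and $z$. To each $p_\ell$ vector, a generalized logit link will be added,

$$\log \frac{p_{\ell j}(x)}{p_{\ell k}(x)} = \eta_{\ell j}, \qquad \eta_{\ell j} = x^T \beta_{\ell j}, \qquad \text{for } \ell = 1, \ldots, s \text{ and } j = 1, \ldots, k-1.$$

A proportional odds model will be assumed for $\pi$,

$$\log \frac{\pi_1(z) + \cdots + \pi_\ell(z)}{\pi_{\ell+1}(z) + \cdots + \pi_s(z)} = \eta_\ell^\pi, \qquad \eta_\ell^\pi = \nu_\ell + z^T \alpha, \qquad \text{for } \ell = 1, \ldots, s-1,$$

taking $\eta_0^\pi := -\infty$ and $\eta_s^\pi := \infty$. The unknown parameters are the $d \times 1$ vectors $\alpha$ and $\beta_{\ell j}$, and the scalars $\nu_\ell$. Denote these parameters collectively as

$$B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_s \\ \nu \\ \alpha \end{pmatrix}, \quad \text{where} \quad \beta_\ell = \begin{pmatrix} \beta_{\ell 1} \\ \vdots \\ \beta_{\ell, k-1} \end{pmatrix} \quad \text{and} \quad \nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_{s-1} \end{pmatrix}.$$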

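Expressions for the $\theta$ parameters can be obtained as

$$p_{\ell j}(x) = \frac{e^{\eta_{\ell j}}}{1 + \sum_{b=1}^{k-1} e^{\eta_{\ell b}}}, \qquad \text{for } \ell = 1, \ldots, s \text{ and } j = 1, \ldots, k-1,$$

$$\pi_\ell(z) = \frac{e^{\eta_\ell^\pi}}{1 + e^{\eta_\ell^\pi}} - \frac{e^{\eta_{\ell-1}^\pi}}{1 + e^{\eta_{\ell-1}^\pi}}, \qquad \text{for } \ell = 1, \ldots, s.$$

To implement AFSA, a score vector and approximate FIM are needed. For the score vector we have

$$S(B) = \frac{\partial}{\partial B} \log f(y; \theta) = \Big( \frac{\partial N}{\partial B} \Big)^T \Big( \frac{\partial \theta}{\partial N} \Big)^T \frac{\partial}{\partial \theta} \log f(y; \theta),$$

where $N = (\eta_1, \ldots, \eta_s, \eta^\pi)$, $\eta_\ell = (\eta_{\ell 1}, \ldots, \eta_{\ell, k-1})$, and $\eta^\pi = (\eta_1^\pi, \ldots, \eta_{s-1}^\pi)$. For the FIM we have

$$I(B) = \text{Var}(S(B)) = \Big( \frac{\partial N}{\partial B} \Big)^T \Big( \frac{\partial \theta}{\partial N} \Big)^T I(\theta) \Big( \frac{\partial \theta}{\partial N} \Big) \Big( \frac{\partial N}{\partial B} \Big).$$

Finding expressions for the two Jacobians is tedious but straightforward.

Propositions 3.3 and 3.4 and Theorem 3.5 state consequences of the main approximation result, which have significant implications for the computation of MLEs. We have already seen that the approximate FIM is equivalent to a complete data FIM from EM. There is also an interesting connection between AFSA and EM, in that the iterations are algebraically related. To see this connection, explicit forms for AFSA and EM iterations are first presented, with proofs given in Appendix B.

Proposition 3.3 (AFSA Iterations). The AFSA iterations

$$\theta^{(g+1)} = \theta^{(g)} + \tilde{I}^{-1}(\theta^{(g)}) \, S(\theta^{(g)}), \qquad g = 1, 2, \ldots$$

can be written explicitly as

$$\pi_\ell^{(g+1)} = \pi_\ell^{(g)} \, \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}, \qquad \ell = 1, \ldots, s,$$

$$p_{\ell j}^{(g+1)} = \frac{1}{M} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \, x_{ij} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{M} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \, m_i \Big], \qquad \ell = 1, \ldots, s, \ j = 1, \ldots, k,$$

where $M = m_1 + \cdots + m_n$.

Proposition 3.4 (EM Iterations). Consider the complete data

$$Z_i = \begin{cases} 1 & \text{with probability } \pi_1 \\ \ \vdots \\ s & \text{with probability } \pi_s, \end{cases} \qquad \text{and} \qquad (X_i \mid Z_i = \ell) \sim \text{Mult}_k(p_\ell, m_i),$$

where $(X_i, Z_i)$ are independent for $i = 1, \ldots, n$. Denote $\gamma_{i\ell}^{(g)} := P(Z_i = \ell \mid x_i, \theta^{(g)})$ as the posterior probability that the $i$th observation belongs to the $\ell$th group. Iterations for an EM algorithm are given by

$$\pi_\ell^{(g+1)} = \frac{1}{n} \sum_{i=1}^{n} \gamma_{i\ell}^{(g)} = \pi_\ell^{(g)} \, \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}, \qquad \ell = 1, \ldots, s,$$

$$p_{\ell j}^{(g+1)} = \frac{\sum_{i=1}^{n} x_{ij} \, \gamma_{i\ell}^{(g)}}{\sum_{i=1}^{n} m_i \, \gamma_{i\ell}^{(g)}} = \frac{\sum_{i=1}^{n} x_{ij} \, P_\ell(x_i)/P(x_i)}{\sum_{i=1}^{n} m_i \, P_\ell(x_i)/P(x_i)}, \qquad \ell = 1, \ldots, s, \ j = 1, \ldots, k.$$

The iterations for AFSA or EM are repeated for $g = 1, 2, \ldots$, with a given initial guess $\theta^{(0)}$, until

$$\big| \log L(\theta^{(g+1)}) - \log L(\theta^{(g)}) \big| < \varepsilon,$$

where $\varepsilon > 0$ is a given tolerance. This is taken to be the stopping criterion for the remainder of this paper.

Theorem 3.5. Denote the estimator from EM by $\hat\theta$, and the estimator from AFSA by $\tilde\theta$. Suppose cluster sizes are equal, so that $m_1 = \cdots = m_n = m$. If the two algorithms start at the $g$th iteration with $\theta^{(g)}$, then for the $(g+1)$th iteration,

$$\tilde\pi_\ell^{(g+1)} = \hat\pi_\ell^{(g+1)} \qquad \text{and} \qquad \tilde p_{\ell j}^{(g+1)} = \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \, \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} \qquad (3.4)$$

for $\ell = 1, \ldots, s$ and $j = 1, \ldots, k$.

Proof of Theorem 3.5. It is immediate from Propositions 3.3 and 3.4 that $\tilde\pi_\ell^{(g+1)} = \hat\pi_\ell^{(g+1)}$, and that

$$\frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} = \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}.$$

Now,

$$\frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \, \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} = \Big[ \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] \frac{\sum_{i=1}^{n} x_{ij} \, P_\ell(x_i)/P(x_i)}{m \sum_{i=1}^{n} P_\ell(x_i)/P(x_i)} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big]$$

$$= \frac{1}{mn} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \, x_{ij} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] = \tilde p_{\ell j}^{(g+1)}.$$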

The $(g+1)$th AFSA iterate can then be seen as a linear combination of the $g$th iterate and the $(g+1)$th step of EM. The coefficient $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$ is non-negative but may be larger than 1. Therefore $\tilde p_{\ell j}^{(g+1)}$ need not lie strictly between $\hat p_{\ell j}^{(g+1)}$ and $p_{\ell j}^{(g)}$. Figure 1 shows a plot of $\tilde p_{\ell j}^{(g+1)}$ as the ratio $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$ varies. However, suppose that at the $g$th step the EM algorithm is close to convergence. Then

$$\hat\pi_\ell^{(g+1)} \approx \hat\pi_\ell^{(g)}, \quad \text{that is,} \quad \frac{\hat\pi_\ell^{(g+1)}}{\hat\pi_\ell^{(g)}} \approx 1, \qquad \text{for } \ell = 1, \ldots, s.$$

From (3.4) we will also have

$$\tilde p_{\ell j}^{(g+1)} \approx \hat p_{\ell j}^{(g+1)}, \qquad \text{for } \ell = 1, \ldots, s, \text{ and } j = 1, \ldots, k.$$

From this point on, AFSA and EM iterations are approximately the same. Hence, in the vicinity of a solution, AFSA and EM will produce the same estimate. Note that this result holds for any $m$, and does not require a large cluster size justification. For the case of varying cluster sizes $m_1, \ldots, m_n$,

$$\frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \, \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} = \Big[ \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] \frac{\sum_{i=1}^{n} x_{ij} \, P_\ell(x_i)/P(x_i)}{\sum_{i=1}^{n} m_i \, P_\ell(x_i)/P(x_i)} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big], \qquad (3.5)$$

which does not simplify to $\tilde p_{\ell j}^{(g+1)}$ as in the proof of Theorem 3.5. However, this illustrates that EM and AFSA are still closely related. This also suggests an ad-hoc revision to AFSA, letting $\tilde p_{\ell j}^{(g+1)}$ equal (3.5) so that the algebraic relationship to EM would be maintained as in (3.4) for the balanced case.

A more general connection is known between EM and iterations of the form

$$\theta^{(g+1)} = \theta^{(g)} + I_c^{-1}(\theta^{(g)}) \, S(\theta^{(g)}), \qquad g = 1, 2, \ldots, \qquad (3.6)$$

where $I_c(\theta)$ is a complete data FIM. Titterington (1984) shows that the two iterations are approximately equivalent under appropriate regularity conditions. The equivalence is exact when the complete data likelihood is a regular exponential family

$$L(\mu) = \exp\big\{ b(x) + \eta^T t + a(\eta) \big\}, \qquad \eta = \eta(\mu), \quad t = t(x),$$

and $\mu := E(t(X))$ is the parameter of interest. The complete data likelihood for our multinomial mixture is indeed a regular exponential family, but the parameter of interest $\theta$ is a transformation of $\mu$ rather than $\mu$ itself. Therefore the equivalence is approximate, as we have seen in Theorem 3.5.
The justification for AFSA leading to this paper followed the historical approach of Blischke (1964), and not from the role of $\tilde{I}(\theta)$ as a complete data FIM. But the relationship between EM and the iterations (3.6) suggests that AFSA is a reasonable approach for finite mixtures beyond the multinomial setting.
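The explicit updates in Propositions 3.3 and 3.4 are short enough to sketch directly, which also gives a numerical check of the linear-combination relationship in Theorem 3.5. The code below is an illustration only: the function names and the toy data are hypothetical, probabilities are stored as full length-$k$ vectors, and the ratios $P_\ell(x_i)/P(x_i)$ are computed on the log scale because the multinomial coefficient cancels.

```python
import math


def density_ratios(x, ps, pi):
    """Return P_l(x) / P(x) for each component l (the multinomial coefficient cancels)."""
    logw = [sum(xj * math.log(pj) for xj, pj in zip(x, p)) for p in ps]
    top = max(logw)
    w = [math.exp(v - top) for v in logw]
    mix = sum(pl * wl for pl, wl in zip(pi, w))
    return [wl / mix for wl in w]


def em_step(data, ps, pi):
    """One EM update in the form of Proposition 3.4."""
    n, s, k = len(data), len(ps), len(ps[0])
    ratios = [density_ratios(x, ps, pi) for x in data]
    new_pi = [pi[l] * sum(r[l] for r in ratios) / n for l in range(s)]
    new_ps = []
    for l in range(s):
        denom = sum(r[l] * sum(x) for r, x in zip(ratios, data))
        new_ps.append([sum(r[l] * x[j] for r, x in zip(ratios, data)) / denom
                       for j in range(k)])
    return new_ps, new_pi


def afsa_step(data, ps, pi):
    """One AFSA update in the form of Proposition 3.3."""
    n, s, k = len(data), len(ps), len(ps[0])
    M = sum(sum(x) for x in data)            # total of the cluster sizes
    ratios = [density_ratios(x, ps, pi) for x in data]
    new_pi = [pi[l] * sum(r[l] for r in ratios) / n for l in range(s)]
    new_ps = []
    for l in range(s):
        weight = sum(r[l] * sum(x) for r, x in zip(ratios, data)) / M
        new_ps.append([sum(r[l] * x[j] for r, x in zip(ratios, data)) / M
                       + ps[l][j] * (1.0 - weight) for j in range(k)])
    return new_ps, new_pi


# Hypothetical toy data with equal cluster sizes m = 5 and k = 3 categories.
data = [[3, 1, 1], [0, 4, 1], [2, 2, 1], [1, 0, 4], [5, 0, 0], [0, 1, 4]]
ps0 = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
pi0 = [0.4, 0.6]
em_ps, em_pi = em_step(data, ps0, pi0)
afsa_ps, afsa_pi = afsa_step(data, ps0, pi0)
```

Starting both methods from the same `ps0` and `pi0`, the mixing proportions agree, and each AFSA probability equals `ratio * em_ps[l][j] + (1 - ratio) * ps0[l][j]` with `ratio = em_pi[l] / pi0[l]`. Note that nothing in `afsa_step` forces the updated probabilities to stay inside $(0, 1)$, although each updated vector still sums to one.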

Figure 1: AFSA step compared to the previous iterate and the EM step. The next AFSA iteration $\tilde p_{\ell j}^{(g+1)}$ is a linear combination of $\hat p_{\ell j}^{(g+1)}$ and $p_{\ell j}^{(g)}$, which depends on the ratio $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$ (horizontal axis).

4 Simulation Studies

The main result stated in Theorem 2.1 allows us to approximate the matrix $I(\theta)$ by $\tilde{I}(\theta)$, which is much more easily computed. Theorem 2.5 justifies $\tilde{I}^{-1}(\theta)$ as an approximation for the inverse FIM. In the present section, simulation studies investigate the quality of the two approximations as a function of $m$. We also present studies to demonstrate the convergence speed and solution quality of AFSA.

4.1 Distance between true and approximate FIM

Consider two concepts of distance to compare the closeness of the exact and approximate matrices. Based on the Frobenius norm $\|A\|_F = \sqrt{\sum_i \sum_j a_{ij}^2}$, a distance metric

$$d_F(A, B) = \|A - B\|_F$$

can be constructed using the sum of squared differences of corresponding elements. This distance will be larger in general when the magnitudes of the elements are larger, so we will also consider a scaled version

$$d_S(A, B) = \frac{d_F(A, B)}{\|B\|_F} = \sqrt{\frac{\sum_i \sum_j (a_{ij} - b_{ij})^2}{\sum_i \sum_j b_{ij}^2}},$$

noting that this is not a true distance metric since it is not symmetric. Using these two metrics, we compare the distance between true and approximate FIMs, and also the distance between their inverses. Consider a mixture $\text{MultMix}_2(\theta, m)$ of three binomials, with parameters

$$p = (1/7, \ 1/3, \ 2/3) \quad \text{and} \quad \pi = (1/6, \ 2/6, \ 3/6).$$

Figure 2 plots the two distance types for both the FIM and inverse FIM as $m$ varies. Note that distances are plotted on a log scale, so the vertical axis represents orders of magnitude. To see more concretely what is being compared, the original displays, for the moderate cluster size $m = 20$, the approximate and exact FIMs side by side, followed by the approximate and exact inverse FIMs; the numerical entries of these matrices did not survive in this transcription.

Since the approximations are block diagonal matrices, they have no way of capturing the off-diagonal blocks, which are present in the exact matrices but are eventually dominated by the block-diagonal elements as $m \to \infty$. This emphasizes one obvious disadvantage of the approximate FIM, which is that it cannot be used to estimate all the asymptotic covariances for the MLEs for a fixed cluster size. For this $m = 20$ case, the block-diagonal elements for both pairs of matrices are not very close, although they are at least the same order of magnitude with the same signs. The magnitudes of elements in the inverse FIMs are in general much smaller than those in the FIMs, so the unscaled distance will naturally be smaller between the inverses.

Now in Figure 2 consider the distance $d_F(\tilde{I}(\theta), I(\theta))$ as $m$ is varied. For the FIM, the distance appears to be moderate at first, then increases with $m$, and finally begins to vanish as $m$ becomes large. What is not reflected here is that the magnitudes of the elements themselves are increasing; this inflates the distance until the convergence of Theorem 2.1 begins to take effect. Considering the scaled distance $d_S(\tilde{I}(\theta), I(\theta))$ helps to suppress the effect of the element magnitudes and gives a clearer picture of the convergence.

Focusing next on the inverse FIM, consider the distance $d_F(\tilde{I}^{-1}(\theta), I^{-1}(\theta))$. For $m < 5$ the exact FIM is computationally singular, so its inverse cannot be computed. Note that in this case the conditions for identifiability are not satisfied (see Appendix A).
This is not just a coincidence; there is a known relationship between model non-identifiability and singularity of the FIM (Rothenberg). For $m$ between 5 and about 23, the distance is very large at first because of near-singularity of the FIM, but quickly returns to a reasonable magnitude. As $m$ increases further, the distance quickly vanishes toward zero. We also consider the scaled distance $d_S(\tilde{I}^{-1}(\theta), I^{-1}(\theta))$. Again, this helps to remove the effects of the element magnitudes, which are becoming very small as $m$ increases. Even after taking into account the scale of the elements, the distance between the inverse matrices appears to be converging more quickly than the distance between the FIM and approximate FIM.

Figure 2: Distance between exact and approximate FIM and its inverse, as $m$ is varied. (a) Using unscaled distance: log of Frobenius distance between exact and approximate matrices, plotted against $m$. (b) Using scaled distance: log of scaled Frobenius distance between exact and approximate matrices, plotted against $m$. Each panel shows one curve for the FIM and one for the inverse FIM.

4.2 Effectiveness of AFSA method

Convergence Speed

We first observe the convergence speed of AFSA and several of its competitors. Consider the mixture of two trinomials

$$Y_i \overset{\text{iid}}{\sim} \text{MultMix}_3(\theta, m = 20), \qquad i = 1, \ldots, n = 500,$$

with $p_1 = (1/3, 1/3, 1/3)$; the values of $p_2$ and $\pi$ given in the original display did not survive in this transcription. We fit the MLE using AFSA, FSA, and EM. After the $g$th iteration, the quantity

$$\delta^{(g)} = \log L(\theta^{(g)}) - \log L(\theta^{(g-1)})$$

is measured. The sequence $\log |\delta^{(g)}|$ is plotted for each algorithm in Figure 3. Note that $\delta^{(g)}$ may be negative, except for example in EM, which guarantees an improvement to the log-likelihood in every step. A negative $\delta^{(g)}$ can be interpreted as negative progress, at least from a local maximum. The absolute value is taken to make plotting possible on the log scale, but some steps with negative progress have been obscured. The resulting estimates and standard errors for all algorithms are shown in Table 1, and additional summary information is shown in Table 2. We see that AFSA and EM have almost exactly the same rate of convergence toward the same solution, as suggested by Theorem 3.5. FSA had severe problems, and was not able to converge within 100 iterations; i.e. $|\delta^{(g)}| < 10^{-8}$ was not attained. The situation for FSA is worse than it appears in the plot. Although $\log |\delta^{(g)}|$ is becoming small, FSA's steps result in both positive and negative $\delta^{(g)}$'s until the iteration limit is reached. This indicates a failure to approach any maximum of the log-likelihood.

We also considered an FSA hybrid with a warmup period, where for a given $\varepsilon_0 > 0$ the approximate FIM is used until the first time $|\delta^{(g)}| < \varepsilon_0$ is crossed. Notice that $\varepsilon_0 = \infty$ corresponds to no warmup period. A similar idea has been considered by Neerchal and Morel (2005), who proposed a two-stage procedure for AFSA in the RCM setting of Example 3.1. The first stage consisted of running AFSA iterations until convergence, and in the second stage one additional iteration of exact Fisher Scoring was performed. The purpose of the FSA iteration was to improve standard error estimates, which were previously found to be inaccurate when computed directly from the approximate FIM (Neerchal and Morel). Here we note that FSA also offers a faster convergence rate than AFSA, given an initial path to a solution. Therefore, AFSA can be used in early iterations to move to the vicinity of a solution, then a switch to FSA will give an accelerated convergence to the solution. This approach depends on the exact FIM being feasible to compute, so the sample space cannot be too large. For the present simulations, we make use of the naive summation (2.3). Hence, there is a trade-off in the choice of $\varepsilon_0$ between energy spent on computing the exact FIM and a larger number of iterations required for AFSA.
Figure 3 shows that the hybrid strategy is effective, addressing the erratic behavior of FSA from an arbitrary starting value and the slower convergence rates of EM and AFSA. Table 2 shows that even a very limited warmup period such as $\varepsilon_0 = 10$ can be sufficient. The Newton-Raphson algorithm, which has not been shown here, performed similarly to Fisher Scoring but had issues with singularity in some samples.

Standard errors for AFSA were obtained as $\sqrt{a_{11}}, \ldots, \sqrt{a_{qq}}$, denoting $\tilde{I}^{-1}(\hat\theta) = ((a_{ij}))$. For FSA and FSA-Hybrid, the inverse of the exact FIM was used instead. The basic EM algorithm does not yield standard error estimates. Several extensions have been proposed to address this, such as by Louis (1982) and Meng and Rubin. In light of Theorem 3.5, standard errors from $\tilde{I}^{-1}(\theta)$ evaluated at EM estimates could also be used to obtain similar results to AFSA.

Monte Carlo Study

We next consider a Monte Carlo study of the difference between AFSA and EM estimators. Observations were generated from

$$Y_i \overset{\text{ind}}{\sim} \text{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n = 500,$$

given varying cluster sizes $m_1, \ldots, m_n$ which themselves were generated as

$$Z_1, \ldots, Z_n \overset{\text{iid}}{\sim} \text{Gamma}(\alpha, \beta), \qquad m_i = \lceil Z_i \rceil.$$
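The cluster size mechanism can be sketched with the standard library alone. This is a hedged illustration rather than the authors' code: it assumes the Gamma draws are rounded up to integers, it treats the second Gamma parameter as a scale (so the mean is the product of the two parameters), and the function name, seed, shape value, and target mean are hypothetical choices.

```python
import math
import random


def draw_cluster_sizes(n, alpha, mean, rng):
    """Draw n integer cluster sizes m_i = ceil(Z_i), Z_i ~ Gamma(shape=alpha, scale=mean/alpha)."""
    beta = mean / alpha                      # scale chosen so that E(Z_i) = alpha * beta = mean
    return [math.ceil(rng.gammavariate(alpha, beta)) for _ in range(n)]


# Hypothetical settings for illustration only.
rng = random.Random(7)
sizes = draw_cluster_sizes(n=500, alpha=25.0, mean=20.0, rng=rng)
average_size = sum(sizes) / len(sizes)
```

A larger `alpha` with the mean held fixed concentrates the sizes more tightly around the mean, since the variance of the underlying Gamma draw is `mean ** 2 / alpha`. Rounding up shifts the average of the integer sizes slightly above the Gamma mean.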

Figure 3: Convergence of several competing algorithms for a small test problem. The plot shows $\log |\delta^{(g)}|$ against iteration for AFSA, FSA, EM, and FSA with warmup.

Table 1: Estimates and standard errors for the competing algorithms. FSA Hybrid produced the same results with $\varepsilon_0$ set to 0.001, 0.01, 0.1, 1, and 10. The columns are FSA, AFSA, EM, and FSA Hybrid; the rows give the estimates $\hat p$ and $\hat\pi$, each with its standard error (SE). The numerical entries did not survive in this transcription.

Table 2: Convergence of several competing algorithms. Hybrid FSA is shown with several choices of the warmup tolerance $\varepsilon_0$. [Columns: method, $\varepsilon_0$, logLik, tol, iter; rows: AFSA, EM, and six FSA variants. The numeric entries, including the value of $\varepsilon_0$ assigned to exact FSA, were lost in extraction.]

Several different settings of $\theta$ are considered, with $s = 2$ mixing components and proportion $\pi = 0.75$ for the first component. The parameters $\alpha$ and $\beta$ were chosen such that $E(Z_i) = \alpha\beta = 20$. This gives $\beta = 20/\alpha$, so only $\alpha$ is free, and $\mathrm{Var}(Z_i) = \alpha\beta^2 = 400/\alpha$ can be chosen as desired. The expectation and variance of $m_i$ are intuitively similar to those of $Z_i$, and their exact values may be computed numerically. Once the $n$ observations are generated, an AFSA estimator $\tilde{\theta}$ and an EM estimator $\hat{\theta}$ are fit. This process is repeated 1000 times, yielding $\tilde{\theta}^{(r)}$ and $\hat{\theta}^{(r)}$ for $r = 1, \ldots, 1000$. A default initial value was selected for each setting of $\theta$, and used for both algorithms in every repetition.

To measure the closeness of the two estimators, a maximum relative difference is taken over all components of $\theta$, then averaged over all repetitions:

$$D = \frac{1}{1000} \sum_{r=1}^{1000} D_r, \quad \text{where} \quad D_r = \bigvee_{j=1}^{q} \left| \frac{\tilde{\theta}_j^{(r)} - \hat{\theta}_j^{(r)}}{\tilde{\theta}_j^{(r)}} \right|.$$

Here $\bigvee$ represents the maximum operator. Notice that obtaining a good result for $D$ depends on the vectors $\hat{\theta}$ and $\tilde{\theta}$ being ordered in the same way. To help ensure this, we add the constraint $\pi_1 > \cdots > \pi_s$, which is enforced in both algorithms by reordering the estimates for $\pi_1, \ldots, \pi_s$ and $p_1, \ldots, p_s$ accordingly after every iteration.

Table 3 shows the results of the simulation. Nine different scenarios for $\theta$ are considered. The cluster sizes $m_1, \ldots, m_n$ are selected in three different ways: a balanced case where $m_i = 20$ for $i = 1, \ldots, n$, cluster sizes selected at random with small variability (using $\alpha = 100$), and cluster sizes selected at random with moderate variability (using $\alpha = 25$).

Table 3: Closeness between AFSA and EM estimates, over 1000 trials. [Rows: Scenarios A through I, each defined by a pair of category probability vectors $p_1$ and $p_2$; columns: the statistic $D$ under equal cluster sizes ($m_i = 20$), $\alpha = 100$, and $\alpha = 25$, the latter two labeled with the corresponding $\mathrm{Var}(m_i)$. Most numeric entries were lost in extraction.]

Both algorithms are susceptible to finding local maxima of the likelihood, but in this experiment AFSA encountered the problem much more frequently. These cases stood out because the local maxima occurred with one of the mixing proportions or category probabilities close to zero, i.e. a convergence to the boundary of the parameter space. This is an especially bad situation for our Monte Carlo statistic $D$, which can become very large if this occurs even once for a given scenario. The problem occurred most frequently for the case $p_1 = (0.1, 0.3)$ and $p_2 = (1/3, 1/3)$. To counter this, we restarted AFSA with a random starting value whenever a solution with any estimate less than 0.01 was obtained. For this experiment, no more than 15 out of 1000 trials required a restart, and no more than two restarts were needed for the same trial. In practice, we recommend starting AFSA with several initial values to ensure that any solutions on the boundary are not missteps taken by the algorithm.

The entries in Table 3 show that small to moderate variation of the cluster sizes does not have a significant impact on the equivalence of AFSA and EM. On the other hand, as $p_1$ and $p_2$ are moved closer together, the quantity $D$ tends to become larger. Theorem 2.1 depends on the distinctness of the category probability vectors, so the quality of the FIM approximation at moderate cluster size may begin to suffer in this case. The estimation problem itself also intuitively becomes more difficult as $p_1$ and $p_2$ become closer. Recall that the dimension of $p_i$ is $k - 1$; it can be seen from Table 3 that increasing $k$ from 2 to 4 does not necessarily have a negative effect on the results.

In Scenario E, $p_1$ and $p_2$ are not too close together, yet $D$ has a similar magnitude to Scenario D, where the two vectors are closer. Figure 4 shows a plot of the individual $D_r$ for Scenarios D and E. Notice that in Scenario E, one particular simulation in each case is responsible for the large magnitude of $D$. Upon removal of these simulations, the order of $D$ is reduced from $10^{-3}$ to a smaller order (the value was lost in extraction). However, many large $D_r$ were present in the Scenario D results.

5 Conclusions

A large cluster approximation was presented for the FIM of the finite mixture of multinomials model (Theorem 2.1).
This matrix has a convenient block diagonal form, where each non-zero block is the FIM of a standard multinomial observation. Furthermore, the approximation is equivalent to the complete data FIM, had population labels been recorded for each observation (Proposition 2.2). Using this approximation to the FIM, we formulated the Approximate Fisher Scoring Algorithm (AFSA), and showed that its iterations are closely related to the well known Expectation-Maximization (EM) algorithm for finite mixtures (Theorem 3.5).

[Figure 4: Boxplots for Scenarios D and E of the Monte Carlo study. At this scale, the boxes appear as thin horizontal lines. Plot title: "Relative Distances Between EM and AFSA for Scenarios D and E"; panels: Scenario D, Scenario E; groups within each panel: m = 20, α = 100, α = 25.]

Simulations show that a rather large cluster size is needed before the exact and approximate FIM are close; this is not surprising given that a block diagonal matrix is being used to approximate a dense matrix. A large cluster size is also needed for a close approximation of the inverse, although the inverses are seen to converge together more quickly. Therefore, the approximate FIM and its inverse are not well-suited to replace the exact matrices for general usage. This means, for example, that one should be cautious about computing standard errors for the MLE from the approximate inverse FIM. As another example of a general use for the approximate FIM, consider approximate $1 - \alpha$ level Wald-type and Score-type confidence regions,

$$\left\{ \theta_0 : (\hat{\theta} - \theta_0)^T \, \tilde{I}(\hat{\theta}) \, (\hat{\theta} - \theta_0) \le \chi^2_{q,\alpha} \right\} \quad \text{and} \quad \left\{ \theta_0 : S(\theta_0)^T \, \tilde{I}^{-1}(\theta_0) \, S(\theta_0) \le \chi^2_{q,\alpha} \right\}, \qquad (5.1)$$

respectively, using the approximate FIM in place of the exact FIM. Such regions are very practical to compute, but will likely not have the desired coverage for $\theta$. However, we might expect the Score region to perform better for moderate cluster sizes because it involves the inverse matrix.

On the other hand, the approximate FIM works well as a tool for estimation in the AFSA algorithm. This is interesting because the more standard Fisher Scoring and Newton-Raphson algorithms do not work well on their own. For Newton-Raphson, the invertibility of the Hessian depends on the sample as well as the current iterate $\theta^{(g)}$ and the model. Fisher Scoring can be computed when the cluster size is not too small, so that the FIM is non-singular, but it is often unable to make progress at all from an arbitrarily chosen starting point. In this case, AFSA or EM is useful for giving FSA some initial help. If FSA has a sufficiently good starting point, it can converge very quickly. Therefore we recommend a hybrid approach: use AFSA iterations for an initial warmup period, then switch to FSA once a path toward a solution has been established. This approach may also help to reduce the number of exact FIM computations needed, which may be expensive.

Although AFSA and EM are closely related and often tend toward the same solution, AFSA is not restricted to the parameter space. Additional precautions may therefore be needed to prevent AFSA iterations from drifting outside of the space. AFSA also tended to converge to the boundary of the space more often than EM; hence, we reiterate the usual advice of trying several initial values as a good practice.

AFSA may be preferable to EM in situations where it is more natural to formulate. Derivation of the E-step conditional log-likelihood may involve evaluating a complicated expectation, but is not required for AFSA. A tradeoff for AFSA is that the score vector for the observed data must be computed; this may involve a messy differentiation, but is arguably easier to address numerically than the E-step. AFSA iterations were obtained for the Random-Clumped Multinomial in Example 3.1, starting from a general multinomial mixture and using an appropriate transformation of the parameters.

It is interesting to note the relationship between FSA, AFSA, and EM as Newton-type algorithms. Fisher Scoring is a classic algorithm where the Hessian is replaced by its expectation. In AFSA the Hessian is replaced instead by a complete data FIM. EM can be considered a Newton-type algorithm also, where the entire likelihood is replaced by a complete data likelihood with missing data integrated out. In this light, EM and AFSA iterations are seen to be approximately equivalent.
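The hybrid recommendation above (AFSA warmup, then exact FSA) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the model is reduced to a two-component binomial mixture ($k = 2$, $s = 2$) with a common cluster size `m`; the approximate FIM is taken to be the block diagonal matrix with blocks $\pi_\ell \, m / \{p_\ell (1 - p_\ell)\}$ and $1 / \{\pi (1 - \pi)\}$, which is how the description of Theorem 2.1 specializes to this case; the stopping rule (change in log-likelihood below a tolerance) and the clipping step that keeps iterates inside the parameter space are hypothetical choices; and all function names are invented.

```python
import math
import numpy as np

def binom_pmf(x, m, p):
    """Binomial(m, p) probabilities at the integer points in x."""
    coef = np.array([math.comb(m, int(v)) for v in x], dtype=float)
    return coef * p ** x * (1.0 - p) ** (m - x)

def per_obs_score(x, m, theta):
    """Per-observation score rows and mixture density; theta = (p1, p2, pi)."""
    p1, p2, pi = theta
    f1, f2 = binom_pmf(x, m, p1), binom_pmf(x, m, p2)
    mix = pi * f1 + (1.0 - pi) * f2
    d1 = pi * f1 / mix * (x / p1 - (m - x) / (1.0 - p1))
    d2 = (1.0 - pi) * f2 / mix * (x / p2 - (m - x) / (1.0 - p2))
    d3 = (f1 - f2) / mix
    return np.column_stack([d1, d2, d3]), mix

def loglik(x, m, theta):
    return float(np.sum(np.log(per_obs_score(x, m, theta)[1])))

def exact_fim(n, m, theta):
    """Exact FIM: n * E[s s^T], by enumerating the support 0..m."""
    s, mix = per_obs_score(np.arange(m + 1), m, theta)
    return n * (s * mix[:, None]).T @ s

def approx_fim(n, m, theta):
    """Assumed block diagonal (complete data) approximation to the FIM."""
    p1, p2, pi = theta
    return n * np.diag([pi * m / (p1 * (1.0 - p1)),
                        (1.0 - pi) * m / (p2 * (1.0 - p2)),
                        1.0 / (pi * (1.0 - pi))])

def scoring(x, m, theta, fim, tol, max_iter=500):
    """Iterate theta <- theta + fim(theta)^{-1} S(theta) until loglik settles."""
    theta = np.asarray(theta, dtype=float)
    old = loglik(x, m, theta)
    for it in range(1, max_iter + 1):
        score = per_obs_score(x, m, theta)[0].sum(axis=0)
        theta = theta + np.linalg.solve(fim(len(x), m, theta), score)
        theta = np.clip(theta, 1e-6, 1.0 - 1e-6)  # precaution: stay in (0, 1)
        new = loglik(x, m, theta)
        if abs(new - old) < tol:
            return theta, it
        old = new
    return theta, max_iter

def hybrid(x, m, theta0, eps0=1e-2, eps=1e-8):
    """AFSA warmup to tolerance eps0, then exact Fisher Scoring to eps."""
    warm, warm_iters = scoring(x, m, theta0, approx_fim, eps0)
    final, fsa_iters = scoring(x, m, warm, exact_fim, eps)
    return final, warm_iters, fsa_iters

# Simulated data: pi = 0.75, p1 = 0.2, p2 = 0.7, cluster size m = 20.
rng = np.random.default_rng(0)
n, m = 500, 20
first = rng.random(n) < 0.75
x = rng.binomial(m, np.where(first, 0.2, 0.7))
theta_hat, warm_iters, fsa_iters = hybrid(x, m, (0.3, 0.6, 0.5))
print(theta_hat, warm_iters, fsa_iters)
```

Passing the FIM as a function keeps the two phases identical apart from the matrix used, which mirrors the view of AFSA and FSA as Newton-type iterations that differ only in what replaces the Hessian. The exact FIM is computed only in the second phase, where few iterations are expected.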
Several interesting questions can be raised at this point. There is a relationship between AFSA and EM which extends beyond the multinomial mixture; we wonder if the relationship between the exact and complete data information matrix generalizes as well. Also, for the present multinomial mixture, perhaps there is a small cluster bias correction that could be applied to improve the approximation. This might allow standard errors and confidence regions such as (5.1) to safely be derived from the approximate FIM.

6 Acknowledgements

The hardware used in the computational studies is part of the UMBC High Performance Computing Facility (HPCF). The facility is supported by the U.S. National Science Foundation through the MRI program (grant no. CNS ...) and the SCREMS program (grant no. DMS ...), with additional substantial support from the University of Maryland, Baltimore County (UMBC). See the HPCF website for more information on HPCF and the projects using its resources. The first author additionally acknowledges financial support as HPCF RA.

References

W. R. Blischke. Moment estimators for the parameters of a mixture of two binomial distributions. The Annals of Mathematical Statistics, 33(2), 1962.

W. R. Blischke. Estimating the parameters of mixtures of binomial distributions. Journal of the American Statistical Association, 59(306), 1964.

H. Bozdogan. Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3), 1987.

S. Chandra. On the mixtures of probability distributions. Scandinavian Journal of Statistics, 4, 1977.

A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.

A. B. Kabir. Estimation of parameters of a finite mixture of distributions. Journal of the Royal Statistical Society, Series B, 30, 1968.

K. Krolikowska. Estimation of the parameters of any finite mixture of geometric distributions. Demonstratio Mathematica, 9, 1976.

K. Lange. Numerical Analysis for Statisticians. Springer, 2nd edition.

E. L. Lehmann and G. Casella. Theory of Point Estimation. Springer, 2nd edition, 1998.

T. A. Louis. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 1982.

G. McLachlan and D. Peel. Finite Mixture Models. Wiley-Interscience, 2000.

X. L. Meng and D. B. Rubin. Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. Journal of the American Statistical Association, 86(416), 1991.

C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM.

J. G. Morel and N. K. Nagaraj. A finite mixture distribution for modeling multinomial extra variation. Technical Report, Research report 91-03, Department of Mathematics and Statistics, University of Maryland, Baltimore County, 1991.

J. G. Morel and N. K. Nagaraj. A finite mixture distribution for modeling multinomial extra variation. Biometrika, 80(2), 1993.

J. G. Morel and N. K. Nagaraj. Overdispersion Models in SAS. SAS Institute.

N. K. Neerchal and J. G. Morel. Large cluster results for two parametric multinomial extra variation models. Journal of the American Statistical Association, 93(443), 1998.

N. K. Neerchal and J. G. Morel. An improved method for the computation of maximum likelihood estimates for multinomial overdispersion models. Computational Statistics & Data Analysis, 49(1):33–43, 2005.

M. Okamoto. Some inequalities relating to the partial sum of binomial probabilities. Annals of the Institute of Statistical Mathematics, 10:29–35, 1959.

A. M. Raim, M. K. Gobbert, N. K. Neerchal, and J. G. Morel. Maximum likelihood estimation of the random-clumped multinomial model as prototype problem for large-scale statistical computing. Accepted.

C. R. Rao. Linear Statistical Inference and its Applications. John Wiley and Sons Inc.

H. Robbins. Mixture of distributions. The Annals of Mathematical Statistics, 19(3), 1948.

T. J. Rothenberg. Identification in parametric models. Econometrica, 39, 1971.

H. Teicher. On the mixture of distributions. The Annals of Mathematical Statistics, 31(1):55–73, 1960.

D. M. Titterington. Recursive parameter estimation using incomplete data. Journal of the Royal Statistical Society, Series B, 46, 1984.

H. Zhou and K. Lange. MM algorithms for some discrete multivariate distributions. Journal of Computational and Graphical Statistics, 19(3), 2010.

A Appendix: Preliminaries and Notation

Given an independent sample $X_1, \ldots, X_n$ with joint likelihood $L(\theta)$ and $\theta$ having dimension $q \times 1$, the score vector is

$$S(\theta) = \frac{\partial}{\partial \theta} \log L(\theta) = \sum_{i=1}^{n} \frac{\partial}{\partial \theta} \log f(x_i; \theta).$$

For $X \sim \mathrm{Mult}_k(p, m)$, the score vector for a single observation can be obtained from

$$\frac{\partial}{\partial p_a} \log f(x; p, m) = \frac{\partial}{\partial p_a} \left[ x_1 \log p_1 + \cdots + x_{k-1} \log p_{k-1} + x_k \log\Big(1 - \sum_{j=1}^{k-1} p_j\Big) \right] = x_a / p_a - x_k / p_k, \qquad (A.1)$$

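so that

$$\frac{\partial}{\partial p} \log f(x; p, m) = \begin{pmatrix} x_1 / p_1 \\ \vdots \\ x_{k-1} / p_{k-1} \end{pmatrix} - \begin{pmatrix} x_k / p_k \\ \vdots \\ x_k / p_k \end{pmatrix} = D^{-1} x_{-k} - \frac{x_k}{p_k} \mathbf{1},$$

denoting $D := \mathrm{diag}(p_1, \ldots, p_{k-1})$ and $x_{-k} := (x_1, \ldots, x_{k-1})$.

The score vector for a single observation $X \sim \mathrm{MultMix}_k(\theta, m)$ can also be obtained,

$$\frac{\partial}{\partial p_a} \log P(x) = \frac{\partial}{\partial p_a} \log\Big\{ \sum_{\ell=1}^{s} \pi_\ell P_\ell(x) \Big\} = \frac{1}{P(x)} \, \pi_a \frac{\partial P_a(x)}{\partial p_a} = \frac{\pi_a P_a(x)}{P(x)} \frac{\partial \log P_a(x)}{\partial p_a} = \frac{\pi_a P_a(x)}{P(x)} \left[ D_a^{-1} x_{-k} - \frac{x_k}{p_{ak}} \mathbf{1} \right], \quad a = 1, \ldots, s,$$

where $D_a := \mathrm{diag}(p_{a1}, \ldots, p_{a,k-1})$, and

$$\frac{\partial}{\partial \pi_a} \log P(x) = \frac{\partial}{\partial \pi_a} \log\Big\{ \sum_{\ell=1}^{s} \pi_\ell P_\ell(x) \Big\} = \frac{P_a(x) - P_s(x)}{P(x)}, \quad a = 1, \ldots, s - 1.$$

Next, consider the $q \times q$ FIM for the independent sample $X_1, \ldots, X_n$,

$$I(\theta) = \mathrm{Var}(S(\theta)) = E\left[ \Big\{ \frac{\partial}{\partial \theta} \log L(\theta) \Big\} \Big\{ \frac{\partial}{\partial \theta} \log L(\theta) \Big\}^T \right] = -E\left[ \frac{\partial^2}{\partial \theta \, \partial \theta^T} \log L(\theta) \right].$$

The last equality holds under appropriate regularity conditions. For the multinomial FIM, we may use (A.1) to obtain

$$\frac{\partial^2}{\partial p_a \, \partial p_b} \log f(x; p, m) = \begin{cases} -x_k / p_k^2 & \text{if } a \ne b, \\ -x_a / p_a^2 - x_k / p_k^2 & \text{otherwise,} \end{cases}$$

and so

$$\frac{\partial^2}{\partial p \, \partial p^T} \log f(x; p, m) = -\mathrm{diag}\left( \frac{x_1}{p_1^2}, \ldots, \frac{x_{k-1}}{p_{k-1}^2} \right) - \frac{x_k}{p_k^2} \mathbf{1}\mathbf{1}^T.$$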

Therefore, we have

$$I(p) = -E\left[ \frac{\partial^2}{\partial p \, \partial p^T} \log f(x; p, m) \right] = \mathrm{diag}\left( \frac{m p_1}{p_1^2}, \ldots, \frac{m p_{k-1}}{p_{k-1}^2} \right) + \frac{m p_k}{p_k^2} \mathbf{1}\mathbf{1}^T = m \left\{ D^{-1} + p_k^{-1} \mathbf{1}\mathbf{1}^T \right\}.$$

The score vector and Hessian of the log-likelihood can be used to implement the Newton-Raphson algorithm, where the $(g+1)$th iteration is given by

$$\theta^{(g+1)} = \theta^{(g)} - \left\{ \frac{\partial^2}{\partial \theta \, \partial \theta^T} \log L(\theta^{(g)}) \right\}^{-1} S(\theta^{(g)}).$$

The Hessian may be replaced with the FIM to implement Fisher Scoring,

$$\theta^{(g+1)} = \theta^{(g)} + I^{-1}(\theta^{(g)}) \, S(\theta^{(g)}).$$

In order for the estimation problem to be well-defined in the first place, the model must be identifiable. For finite mixtures, this is taken to mean that the equality

$$\sum_{\ell=1}^{s} \pi_\ell f(x; \theta_\ell) \overset{\text{a.s.}}{=} \sum_{\ell=1}^{v} \lambda_\ell f(x; \xi_\ell)$$

implies $s = v$ and terms within the sums are equal, except the indices may be permuted (McLachlan and Peel, 2000). Chandra (1977) provides some insight into the identifiability issue, and shows that a family of multivariate mixtures is identifiable if any of the corresponding marginal mixtures are identifiable. In the present case, the multivariate mixtures consist of multinomial densities, and the marginal densities are binomials. It is well known that a finite mixture of $s$ components from $\{ \mathrm{Binomial}(m, \theta) : \theta \in (0, 1) \}$ is identifiable if and only if $m \ge 2s - 1$; see, for example, Blischke. Then a sufficient condition for model (2.2) to be identifiable is that $m_i \ge 2s - 1$ for at least one observation. This can be seen by the following lemma.

Lemma A.1. Suppose $X_i \overset{\text{ind}}{\sim} f_i(x; \theta)$, $i = 1, \ldots, n$, where the $f_i$ are densities, and for at least one $r \in \{1, \ldots, n\}$ the family $\{ f_r(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable. Then the joint model is identifiable.

Proof of Lemma A.1. WLOG assume that $r = 1$, and suppose we have

$$\prod_{i=1}^{n} f_i(x_i; \theta) \overset{\text{a.s.}}{=} \prod_{i=1}^{n} f_i(x_i; \xi).$$

Integrating both sides with respect to $x_2, \ldots, x_n$, using the appropriate dominating measure, gives $f_1(x_1; \theta) \overset{\text{a.s.}}{=} f_1(x_1; \xi)$. Since the family $\{ f_1(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable, this implies $\theta = \xi$. Hence the joint family $\{ \prod_{i=1}^{n} f_i(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable.
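The closed form for the multinomial FIM obtained earlier in this appendix, $m \{ D^{-1} + p_k^{-1} \mathbf{1}\mathbf{1}^T \}$, can be checked numerically against the definition $I(p) = E[S S^T]$. The sketch below does this by brute-force enumeration of the multinomial sample space for a small example. The function names and the test values of $p$ and $m$ are illustrative choices, not part of the paper.

```python
import itertools
import math
import numpy as np

def multinomial_pmf(x, p):
    """Probability of the count vector x under Mult_k(p, m), m = sum(x)."""
    coef = math.factorial(sum(x))
    for xi in x:
        coef //= math.factorial(xi)
    return coef * math.prod(pj ** xj for pj, xj in zip(p, x))

def score(x, p):
    """Score with respect to (p_1, ..., p_{k-1}): x_a/p_a - x_k/p_k, as in (A.1)."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    return x[:-1] / p[:-1] - x[-1] / p[-1]

def fim_by_enumeration(p, m):
    """E[S S^T], summing over every outcome with x_1 + ... + x_k = m."""
    k = len(p)
    fim = np.zeros((k - 1, k - 1))
    for head in itertools.product(range(m + 1), repeat=k - 1):
        if sum(head) > m:
            continue
        x = head + (m - sum(head),)
        s = score(x, p)
        fim += multinomial_pmf(x, p) * np.outer(s, s)
    return fim

def fim_closed_form(p, m):
    """m * (D^{-1} + 11^T / p_k) with D = diag(p_1, ..., p_{k-1})."""
    p = np.asarray(p, dtype=float)
    k = len(p)
    return m * (np.diag(1.0 / p[:-1]) + np.ones((k - 1, k - 1)) / p[-1])

p_example, m_example = (0.2, 0.3, 0.5), 4
brute = fim_by_enumeration(p_example, m_example)
closed = fim_closed_form(p_example, m_example)
print(np.allclose(brute, closed))
```

Enumeration is only practical for small $m$ and $k$, but it exercises the score in (A.1) and the final expression for $I(p)$ independently of one another, so agreement is a useful sanity check on both.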

B Appendix: Additional Proofs

To prove Theorem 2.1, we will first establish a key inequality. A similar strategy was used by Morel and Nagaraj (1991), but they considered the special case $k = s$, so that the number of mixture components is equal to the number of categories within each component. Here we generalize their argument to the general case where $k = s$ need not hold. The original proof was inspired by the following inequality from Okamoto (1959) for the tail probability of the binomial distribution, which was also considered by Blischke.

Lemma B.1. Suppose $X \sim \mathrm{Binomial}(m, p)$ and let $f(x; m, p)$ be its density. Then for $c \ge 0$,

i. $P(X/m - p \ge c) \le e^{-2mc^2}$,

ii. $P(X/m - p \le -c) \le e^{-2mc^2}$.

Theorem B.2. For a given index $b \in \{1, \ldots, s\}$ we have

$$\sum_{x} \sum_{a \ne b}^{s} \pi_a \frac{P_a(x) P_b(x)}{P(x)} \le \frac{2}{\pi_b} \sum_{a \ne b}^{s} e^{-\frac{m}{2} \delta_{ab}^2},$$

where $\delta_{ab} = \bigvee_{j=1}^{k-1} |p_{aj} - p_{bj}|$.

Proof of Theorem B.2. For $a, b \in \{1, \ldots, s\}$, assume WLOG that

$$\delta_{ab} := \bigvee_{j=1}^{k-1} |p_{aj} - p_{bj}| = p_{aL} - p_{bL}, \quad \text{for some } L \in \{1, \ldots, k-1\},$$

is positive. Denote as $\Omega(x_j)$ the multinomial sample space when the $j$th element of $x$ is fixed at a number $x_j$. Then we have

$$\begin{aligned}
\sum_{x} \pi_a \frac{P_a(x) P_b(x)}{P(x)}
&= \sum_{x_L = 0}^{m} \sum_{x \in \Omega(x_L)} \pi_a \frac{P_a(x) P_b(x)}{P(x)} \\
&= \sum_{x_L \le \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} \pi_a P_a(x) \frac{P_b(x)}{P(x)} + \sum_{x_L > \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_b(x) \frac{\pi_a P_a(x)}{P(x)} \\
&\le \frac{\pi_a}{\pi_b} \sum_{x_L \le \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_a(x) + \sum_{x_L > \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_b(x). \qquad (B.1)
\end{aligned}$$

Notice that the last statement above consists of marginal probabilities for the $L$th coordinate of $k$-dimensional multinomials, which are binomial probabilities. Following Blischke (1962), suppose $A \sim \mathrm{Binomial}(m, p_{aL})$ and $B \sim \mathrm{Binomial}(m, p_{bL})$; then (B.1) is equal to

$$\frac{\pi_a}{\pi_b} P\left\{ A \le \frac{m}{2}(p_{aL} + p_{bL}) \right\} + P\left\{ B > \frac{m}{2}(p_{aL} + p_{bL}) \right\}. \qquad (B.2)$$


More information

Analysis of rounded data in mixture normal model

Analysis of rounded data in mixture normal model Stat Papers (2012) 53:895 914 DOI 10.1007/s00362-011-0395-0 REGULAR ARTICLE Anaysis of rounded data in mixture norma mode Ningning Zhao Zhidong Bai Received: 13 August 2010 / Revised: 9 June 2011 / Pubished

More information

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems Componentwise Determination of the Interva Hu Soution for Linear Interva Parameter Systems L. V. Koev Dept. of Theoretica Eectrotechnics, Facuty of Automatics, Technica University of Sofia, 1000 Sofia,

More information
