An Approximate Fisher Scoring Algorithm for Finite Mixtures of Multinomials
Andrew M. Raim, Minglei Liu, Nagaraj K. Neerchal and Jorge G. Morel

Abstract

Finite mixture distributions arise naturally in many applications including clustering and classification. Since they usually do not yield closed forms for maximum likelihood estimates (MLEs), numerical methods using the well known Fisher Scoring or Expectation-Maximization algorithms are considered. In this work, an approximation to the Fisher Information Matrix of an arbitrary mixture of multinomial distributions is introduced. This leads to an Approximate Fisher Scoring algorithm (AFSA), which turns out to be closely related to Expectation-Maximization, and is more robust to the choice of initial value than Fisher Scoring iterations. A combination of AFSA and the classical Fisher Scoring iterations provides the best of both computational efficiency and stable convergence properties.

Key Words: Multinomial; Finite mixture; Maximum likelihood; Fisher Information Matrix; Fisher Scoring.

1 Introduction

A finite mixture model arises when observations are believed to originate from one of several populations, but it is unknown to which population each observation belongs. Model identifiability of mixtures is an important issue with a considerable literature; see, for example, Robbins (1948) and Teicher. The book by McLachlan and Peel (2000) is a good entry point to the literature on mixtures. A majority of the mixture literature deals with mixtures of normal distributions; however, Blischke (1962; 1964), Krolikowska (1976), and Kabir (1968) are a few early works which address mixtures of discrete distributions. The present work focuses on mixtures of multinomials, which have wide applications in clustering and classification problems, as well as in modeling overdispersed data (Morel and Nagaraj). It is well known that computation of maximum likelihood estimates (MLEs) under mixture distributions is often analytically intractable, and therefore iterative numerical methods are needed.
Classical iterative techniques such as Newton-Raphson and Fisher Scoring are two widely used methods. The more recent Expectation-Maximization (EM) algorithm discussed in Dempster, Laird and Rubin (1977) has become another standard technique to compute MLEs. EM is a framework for performing estimation in missing data problems.

Author note: Andrew Raim is a graduate student at the Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA (Email: araim1@umbc.edu). Minglei Liu is Senior Principal Biostatistician at Medtronic, Santa Rosa, CA, USA. Nagaraj Neerchal is Professor at the Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA (Email: nagaraj@umbc.edu). Jorge Morel is Principal Statistician at Procter & Gamble, Cincinnati, OH, USA.

The idea
is to solve a difficult incomplete data problem by repeatedly solving tractable complete-data problems. If the unknown population labels are treated as missing information, then estimation under the mixture distribution can be considered a missing data problem, and an EM algorithm can be used. Unlike Fisher Scoring (FSA), EM does not require computation of an expected Hessian in each iteration, which is a great advantage if this matrix is difficult to compute. Slow speed of convergence has been cited as a disadvantage of EM. Variations and improved versions of the EM algorithm have been widely used for obtaining MLEs for mixtures (McLachlan and Peel, 2000, chapter 2).

Fisher Scoring iterations require the inverse of the Fisher Information Matrix (FIM). In the mixture setting, computing the FIM involves a complicated expectation which does not have an analytically tractable form. The matrix can be approximated numerically, for example by Monte Carlo simulation, but this is computationally expensive, especially when repeated over many iterations. Morel and Nagaraj (1991; 1993) proposed a variant of Fisher Scoring using an approximate FIM in their study of a multinomial model with extra variation. This model, now referred to as the Random Clumped Multinomial (see Example 3.1 for details), is a special case of the finite mixture of multinomials. The approximate FIM was justified asymptotically, and was used to obtain MLEs for the model and to demonstrate their efficiency.

In the present paper, we extend the approximate FIM idea to general finite mixtures of multinomials and hence formulate the Approximate Fisher Scoring Algorithm (AFSA) for this family of distributions. By using the approximate FIM in place of the true FIM, we obtain an algorithm which is closely related to EM. Both AFSA and EM have a slower convergence rate than Fisher Scoring once they are in the proximity of a maximum, but both are also much more robust than Fisher Scoring in finding such regions from an arbitrary initial value.

The rest of the paper is organized as follows.
In section 2, a large cluster approximation for the Fisher Information Matrix is derived and some of its properties are presented. This approximate information matrix is easily computed and has an immediate application in Fisher Scoring, which is presented in section 3. Simulation studies are presented in section 4, illustrating convergence properties of the approximate information matrix and approximate Fisher Scoring. Concluding remarks are given in section 5.

2 An Approximate Fisher Information Matrix

Consider the multinomial sample space with $m$ trials placed into $k$ categories at random,
$$\Omega = \Big\{ x = (x_1, \ldots, x_k) : x_j \in \{0, 1, \ldots, m\},\ \sum_{j=1}^{k} x_j = m \Big\}.$$
The standard multinomial density is
$$f(x; p, m) = \frac{m!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}\, I(x \in \Omega),$$
where $I(\cdot)$ is the indicator function, and the parameter space is
$$\Big\{ p = (p_1, \ldots, p_{k-1}) : 0 < p_j < 1,\ \sum_{j=1}^{k-1} p_j < 1 \Big\} \subseteq \mathbb{R}^{k-1}.$$
If a random variable $X$ has distribution $f(x; p, m)$, we will write $X \sim \mathrm{Mult}_k(p, m)$. Since $x_k = m - \sum_{j=1}^{k-1} x_j$ and $p_k = 1 - \sum_{j=1}^{k-1} p_j$, the $k$th category can be considered as redundant information. Following the sampling and overdispersion literature, we will refer to the number of trials $m$ as the cluster size of a multinomial observation.

Now suppose there are $s$ multinomial populations
$$\mathrm{Mult}_k(p_1, m), \ldots, \mathrm{Mult}_k(p_s, m), \qquad p_\ell = (p_{\ell 1}, \ldots, p_{\ell, k-1}),$$
where the $\ell$th population occurs with proportion $\pi_\ell$ for $\ell = 1, \ldots, s$. If we draw $X$ from the mixed population, its probability density is a finite mixture of multinomials
$$f(x; \theta) = \sum_{\ell=1}^{s} \pi_\ell\, f(x; p_\ell, m), \qquad \theta = (p_1, \ldots, p_s, \pi), \qquad (2.1)$$
and we will write $X \sim \mathrm{MultMix}_k(\theta, m)$. The dimension of $\theta$ is $q := s(k-1) + (s-1) = sk - 1$, disregarding the redundant parameters $p_{1k}, \ldots, p_{sk}, \pi_s$. We will also make use of the following slightly-less-cumbersome notation for densities,

$P(x) := f(x; \theta, m)$: the mixture,
$P_\ell(x) := f(x; p_\ell, m)$: the $\ell$th component of the mixture.

The setting of this paper will be an independent sample
$$X_i \sim \mathrm{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n,$$
with cluster sizes not necessarily equal. The resulting likelihood is
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \Big\{ \sum_{\ell=1}^{s} \pi_\ell \Big[ \frac{m_i!}{x_{i1}! \cdots x_{ik}!}\, p_{\ell 1}^{x_{i1}} \cdots p_{\ell k}^{x_{ik}}\, I(x_i \in \Omega) \Big] \Big\}. \qquad (2.2)$$
The inner summation prevents closed-form likelihood maximization, hence our goal will be to compute the MLE $\hat\theta$ numerically. Some additional preliminaries are given in Appendix A.

In general, as mentioned earlier, the Fisher Information Matrix (FIM) for mixtures involves a complicated expectation which does not have a tractable form. Since the multinomial
mixture has a finite sample space, it can be computed naively by using the definition of the expectation
$$I(\theta) = \sum_{x \in \Omega} \Big\{ \frac{\partial}{\partial\theta} \log f(x; \theta) \Big\} \Big\{ \frac{\partial}{\partial\theta} \log f(x; \theta) \Big\}^T f(x; \theta), \qquad (2.3)$$
given a particular value for $\theta$. Although the number of terms $\binom{k+m-1}{m}$ in the summation is finite, it grows quickly with $m$ and $k$, and this method becomes intractable as $m$ and $k$ increase. For example, when $m = 100$ and $k = 4$, the sample space $\Omega$ contains more than 176,000 elements. To avoid these potentially expensive computations, we extend the approximate FIM approach of Morel and Nagaraj (1991; 1993) to the general finite mixture of multinomials. The following theorem states our main result.

Theorem 2.1. Suppose $X \sim \mathrm{MultMix}_k(\theta, m)$ is a single observation from the mixed population. Denote the exact FIM with respect to $X$ as $I(\theta)$. Then an approximation to the FIM with respect to $X$ is given by the $(sk-1) \times (sk-1)$ block-diagonal matrix
$$\tilde I(\theta) := \mathrm{Blockdiag}(\pi_1 F_1, \ldots, \pi_s F_s, F_\pi),$$
where for $\ell = 1, \ldots, s$
$$F_\ell = m \big[ D_\ell^{-1} + p_{\ell k}^{-1} 1 1^T \big] \quad \text{and} \quad D_\ell = \mathrm{diag}(p_{\ell 1}, \ldots, p_{\ell, k-1})$$
are $(k-1) \times (k-1)$ matrices,
$$F_\pi = D_\pi^{-1} + \pi_s^{-1} 1 1^T \quad \text{and} \quad D_\pi = \mathrm{diag}(\pi_1, \ldots, \pi_{s-1})$$
are $(s-1) \times (s-1)$ matrices, and $1$ denotes a vector of ones of the appropriate dimension. To emphasize the dependence of the FIM and the approximation on $m$, we will also write $I_m(\theta)$ and $\tilde I_m(\theta)$. If the vectors $p_1, \ldots, p_s$ are distinct (i.e. $p_a \neq p_b$ for every pair of populations $a \neq b$), then $I_m(\theta) - \tilde I_m(\theta) \to 0$ as $m \to \infty$.

A proof is given in Appendix B. Notice that the matrix $F_\ell$ is exactly the FIM of $\mathrm{Mult}_k(p_\ell, m)$ for the $\ell$th population, and $F_\pi$ is the FIM of $\mathrm{Mult}_s(\pi, 1)$ corresponding to the mixing probabilities $\pi$; see Appendix A for details. The approximate FIM turns out to be equivalent to a complete data FIM, as shown in Proposition 2.2 below, which provides an interesting connection to EM. This matrix can be formulated for any finite mixture whose components have a well-defined FIM, and is not limited to the case of multinomials.

Proposition 2.2. The matrix $\tilde I(\theta)$ is equivalent to the FIM of $(X, Z)$, where
$$Z = \begin{cases} 1 & \text{with probability } \pi_1 \\ \ \vdots \\ s & \text{with probability } \pi_s, \end{cases} \qquad \text{and} \qquad (X \mid Z = \ell) \sim \mathrm{Mult}_k(p_\ell, m).$$
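To make the comparison between (2.3) and Theorem 2.1 concrete, the following minimal sketch evaluates both matrices for the simplest case, a mixture of $s = 2$ binomials ($k = 2$), so that $\theta = (p_1, p_2, \pi_1)$. The function names and the parameter values are assumptions chosen only for illustration, and the relative Frobenius difference is just one convenient way to summarize closeness.

```python
import math

def binom_pmf(x, m, p):
    """Binomial(m, p) probability of x successes."""
    return math.comb(m, x) * p**x * (1.0 - p) ** (m - x)

def exact_fim(p1, p2, pi1, m):
    """Exact FIM of theta = (p1, p2, pi1) by the naive summation over x = 0..m."""
    fim = [[0.0] * 3 for _ in range(3)]
    for x in range(m + 1):
        f1, f2 = binom_pmf(x, m, p1), binom_pmf(x, m, p2)
        f = pi1 * f1 + (1.0 - pi1) * f2
        # Score of log f(x; theta) with respect to p1, p2 and pi1.
        score = [
            pi1 * f1 * (x / p1 - (m - x) / (1.0 - p1)) / f,
            (1.0 - pi1) * f2 * (x / p2 - (m - x) / (1.0 - p2)) / f,
            (f1 - f2) / f,
        ]
        for a in range(3):
            for b in range(3):
                fim[a][b] += score[a] * score[b] * f
    return fim

def approx_fim(p1, p2, pi1, m):
    """Block-diagonal approximation; every block is 1 x 1 when k = 2 and s = 2."""
    pi2 = 1.0 - pi1
    f1 = m * (1.0 / p1 + 1.0 / (1.0 - p1))
    f2 = m * (1.0 / p2 + 1.0 / (1.0 - p2))
    f_pi = 1.0 / pi1 + 1.0 / pi2
    return [[pi1 * f1, 0.0, 0.0], [0.0, pi2 * f2, 0.0], [0.0, 0.0, f_pi]]

def rel_diff(a, b):
    """Frobenius norm of (a - b) divided by the Frobenius norm of b."""
    num = sum((a[i][j] - b[i][j]) ** 2 for i in range(3) for j in range(3))
    den = sum(b[i][j] ** 2 for i in range(3) for j in range(3))
    return math.sqrt(num / den)

if __name__ == "__main__":
    # Size of the sample space for m = 100 trials and k = 4 categories.
    print("terms in the summation:", math.comb(4 + 100 - 1, 100))
    for m in (5, 20, 80, 200):
        d = rel_diff(approx_fim(0.2, 0.7, 0.4, m), exact_fim(0.2, 0.7, 0.4, m))
        print(m, d)
```

For $k = 2$ the summation has only $m + 1$ terms, so the exact matrix is cheap here; the point of the sketch is the comparison, not the cost.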
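Proof of Proposition 2.2. Here $Z$ represents the population from which $X$ was drawn. The complete data likelihood is then
$$L(\theta \mid x, z) = \prod_{\ell=1}^{s} \big[ \pi_\ell\, f(x \mid p_\ell, m) \big]^{I(z = \ell)}.$$
This likelihood leads to the score vectors
$$\frac{\partial}{\partial p_a} \log L(\theta) = \Delta_a \Big[ D_a^{-1} x_{-k} - \frac{x_k}{p_{ak}} 1 \Big], \qquad \frac{\partial}{\partial \pi} \log L(\theta) = D_\pi^{-1} \Delta_{-s} - \frac{\Delta_s}{\pi_s} 1,$$
where $\Delta = (\Delta_1, \ldots, \Delta_s)$ so that $\Delta_\ell = I(Z = \ell) \sim \mathrm{Bernoulli}(\pi_\ell)$, $\Delta_{-s}$ denotes the vector $(\Delta_1, \ldots, \Delta_{s-1})$, and $x_{-k} = (x_1, \ldots, x_{k-1})$. Taking second derivatives yields
$$\frac{\partial^2}{\partial p_a \partial p_a^T} \log L(\theta) = -\Delta_a \Big[ D_a^{-2}\, \mathrm{diag}(x_{-k}) + \frac{x_k}{p_{ak}^2} 1 1^T \Big],$$
$$\frac{\partial^2}{\partial p_a \partial p_b^T} \log L(\theta) = 0 \ \text{ for } a \neq b, \qquad \frac{\partial^2}{\partial p_a \partial \pi^T} \log L(\theta) = 0,$$
$$\frac{\partial^2}{\partial \pi \partial \pi^T} \log L(\theta) = -\Big[ D_\pi^{-2}\, \mathrm{diag}(\Delta_{-s}) + \frac{\Delta_s}{\pi_s^2} 1 1^T \Big].$$
Now take the expected value of the negative of each of these terms, jointly with respect to $(X, Z)$, to obtain the blocks of $\tilde I(\theta)$.

Corollary 2.3. Suppose $X_i \sim \mathrm{MultMix}_k(\theta, m_i)$, $i = 1, \ldots, n$, is an independent sample from the mixed population with varying cluster sizes, and $M = m_1 + \cdots + m_n$. Then the approximate FIM with respect to $(X_1, \ldots, X_n)$ is given by
$$\tilde I(\theta) = \mathrm{Blockdiag}(\pi_1 F_1, \ldots, \pi_s F_s, F_\pi),$$
$$F_\ell = M \big[ D_\ell^{-1} + p_{\ell k}^{-1} 1 1^T \big], \quad \ell = 1, \ldots, s, \qquad F_\pi = n \big[ D_\pi^{-1} + \pi_s^{-1} 1 1^T \big].$$

Proof of Corollary 2.3. Let $\tilde I_i(\theta)$ represent the approximate FIM with respect to observation $X_i$. The result is obtained using
$$\tilde I(\theta) = \tilde I_1(\theta) + \cdots + \tilde I_n(\theta),$$
corresponding to the additive property of exact FIMs for independent samples. The additive property can be justified by noting that each $\tilde I_i(\theta)$ is a true complete data FIM, by Proposition 2.2.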
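Since $\tilde I(\theta)$ is a block diagonal matrix, some useful expressions can be obtained in closed form.

Corollary 2.4. Let $\tilde I(\theta)$ represent the approximate FIM with respect to an independent sample $X_i \sim \mathrm{MultMix}_k(\theta, m_i)$, $i = 1, \ldots, n$. Then:

(a) The inverse of $\tilde I(\theta)$ is given by
$$\tilde I^{-1}(\theta) = \mathrm{Blockdiag}\big( \pi_1^{-1} F_1^{-1}, \ldots, \pi_s^{-1} F_s^{-1}, F_\pi^{-1} \big), \qquad (2.5)$$
$$F_\ell^{-1} = M^{-1} \{ D_\ell - p_\ell p_\ell^T \}, \quad \ell = 1, \ldots, s, \qquad F_\pi^{-1} = n^{-1} \{ D_\pi - \pi \pi^T \}.$$

(b) The trace of $\tilde I(\theta)$ is given by
$$\mathrm{tr}\, \tilde I(\theta) = \sum_{\ell=1}^{s} \sum_{j=1}^{k-1} M \pi_\ell \big\{ p_{\ell j}^{-1} + p_{\ell k}^{-1} \big\} + \sum_{\ell=1}^{s-1} n \big\{ \pi_\ell^{-1} + \pi_s^{-1} \big\}.$$

(c) The determinant of $\tilde I(\theta)$ is given by
$$\det \tilde I(\theta) = \Big[ \prod_{\ell=1}^{s} (M \pi_\ell)^{k-1} \prod_{j=1}^{k} p_{\ell j}^{-1} \Big] \Big[ \pi_s^{-1} \prod_{\ell=1}^{s-1} n\, \pi_\ell^{-1} \Big].$$

Proof of Corollary 2.4 (a). Since $\tilde I(\theta)$ is block diagonal, its inverse can be obtained by inverting the blocks, which can immediately be seen to be (2.5). To find the expressions for the individual blocks, we can apply the Sherman-Morrison formula (see for example Rao 1965, chapter 1)
$$(C + u v^T)^{-1} = C^{-1} - \frac{C^{-1} u v^T C^{-1}}{1 + v^T C^{-1} u}.$$
For the case of $F_\pi^{-1}$, for example, take $C = D_\pi^{-1}$ and $u = v = \pi_s^{-1/2} 1$, and use the expressions in Corollary 2.3.

Proof of Corollary 2.4 (b). Since the trace of a block diagonal matrix is the sum of the traces of its blocks, we have
$$\mathrm{tr}\, \tilde I(\theta) = \pi_1\, \mathrm{tr}\, F_1 + \cdots + \pi_s\, \mathrm{tr}\, F_s + \mathrm{tr}\, F_\pi. \qquad (2.6)$$
The individual traces can be obtained as
$$\mathrm{tr}\, F_\ell = \mathrm{tr} \big[ M \big( D_\ell^{-1} + p_{\ell k}^{-1} 1 1^T \big) \big] = \sum_{j=1}^{k-1} M \big\{ p_{\ell j}^{-1} + p_{\ell k}^{-1} \big\},$$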
a summation over the diagonal elements. Similarly for the block corresponding to $\pi$,
$$\mathrm{tr}\, F_\pi = \mathrm{tr} \big[ n \big( D_\pi^{-1} + \pi_s^{-1} 1 1^T \big) \big] = \sum_{\ell=1}^{s-1} n \big\{ \pi_\ell^{-1} + \pi_s^{-1} \big\}.$$
The result is obtained by substituting these expressions into (2.6).

Proof of Corollary 2.4 (c). Since $\tilde I(\theta)$ has a block diagonal structure,
$$\det \tilde I(\theta) = \det\{F_\pi\} \prod_{\ell=1}^{s} \det\{\pi_\ell F_\ell\} = n^{s-1} \det\big\{ D_\pi^{-1} + \pi_s^{-1} 1 1^T \big\} \prod_{\ell=1}^{s} \pi_\ell^{k-1} M^{k-1} \det\big\{ D_\ell^{-1} + p_{\ell k}^{-1} 1 1^T \big\}. \qquad (2.7)$$
Recall the property (see for example Rao 1965, chapter 1) that for a non-singular matrix $A$, we have
$$\det(A + u u^T) = \det \begin{pmatrix} A & u \\ -u^T & 1 \end{pmatrix} = \det(A) \big( 1 + u^T A^{-1} u \big).$$
This yields, for instance,
$$\det\big\{ D_\pi^{-1} + \pi_s^{-1} 1 1^T \big\} = \det\big\{ D_\pi^{-1} \big\} \big( 1 + \pi_s^{-1} 1^T D_\pi 1 \big) = \Big[ \prod_{\ell=1}^{s-1} \pi_\ell^{-1} \Big] \frac{\pi_s + \sum_{\ell=1}^{s-1} \pi_\ell}{\pi_s} = \pi_s^{-1} \prod_{\ell=1}^{s-1} \pi_\ell^{-1}.$$
The result can be obtained by substituting the simplified determinants into (2.7).

The determinant and trace of the FIM are not utilized in the computation of MLEs, but are used in the computation of many statistics in subsequent analysis. In such applications, it may be preferable to have a closed form for these expressions. As one example, consider the Consistent Akaike Information Criterion with Fisher Information (CAICF) formulated in Bozdogan. The CAICF is an information-theoretic criterion for model selection, and is a function of the log-determinant of the FIM.

It can also be shown that $I_m^{-1}(\theta) - \tilde I_m^{-1}(\theta) \to 0$ as $m \to \infty$, which we now state as a theorem. A proof is given in Appendix B. This result is perhaps more immediately relevant than Theorem 2.1 for our Fisher Scoring application presented in the following section.

Theorem 2.5. Let $I_m(\theta)$ and $\tilde I_m(\theta)$ be defined as in Theorem 2.1 (namely the FIM and approximate FIM with respect to a single observation with cluster size $m$). Then $I_m^{-1}(\theta) - \tilde I_m^{-1}(\theta) \to 0$ as $m \to \infty$.

In the next section, we use the approximate FIM obtained in Theorem 2.1 to define an approximate Fisher Scoring algorithm and investigate its properties.
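The closed forms of Corollary 2.4 are easy to check numerically for a single block $F_\ell$. The sketch below is a plain-Python illustration with hypothetical helper names and an arbitrary probability vector; it compares the closed-form inverse, trace and determinant against direct computation.

```python
import math

def f_block(p, M):
    """F = M * (D^{-1} + (1/p_k) 1 1^T); p lists all k category probabilities."""
    k = len(p)
    return [[M * ((1.0 / p[i] if i == j else 0.0) + 1.0 / p[k - 1])
             for j in range(k - 1)] for i in range(k - 1)]

def f_block_inverse(p, M):
    """Closed form F^{-1} = (D - p p^T) / M on the first k - 1 probabilities."""
    k = len(p)
    return [[((p[i] if i == j else 0.0) - p[i] * p[j]) / M
             for j in range(k - 1)] for i in range(k - 1)]

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            fac = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= fac * a[c][j]
    return d

if __name__ == "__main__":
    p, M = [0.1, 0.2, 0.3, 0.4], 50
    F = f_block(p, M)
    print(matmul(F, f_block_inverse(p, M)))          # should be close to identity
    print(det(F), M ** (len(p) - 1) / math.prod(p))  # direct vs. closed form
```

Because the inverse needs no factorization, each AFSA step costs only a handful of vector operations per mixture component.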
3 Approximate Fisher Scoring Algorithm

Consider an independent sample with varying cluster sizes
$$X_i \sim \mathrm{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n.$$
Let $\theta^{(0)}$ be an initial guess for $\theta$, and $S(\theta)$ be the score vector with respect to the sample (see Appendix A). Then by independence
$$S(\theta) = \sum_{i=1}^{n} S(\theta; x_i),$$
where $S(\theta; x_i)$ is the score vector with respect to the $i$th observation. The Fisher Scoring Algorithm is given by computing the iterations
$$\theta^{(g+1)} = \theta^{(g)} + I^{-1}(\theta^{(g)})\, S(\theta^{(g)}), \qquad g = 1, 2, \ldots \qquad (3.1)$$
until the convergence criterion
$$\big| \log L(\theta^{(g+1)}) - \log L(\theta^{(g)}) \big| < \varepsilon$$
is met, for some given tolerance $\varepsilon > 0$. In practice, a line search may be used for every iteration after determining a search direction, but such modifications will not be considered here. Note that (3.1) uses the exact FIM, which may not be easily computable. We propose to substitute the approximation $\tilde I(\theta)$ for $I(\theta)$, and will refer to the resulting method as the Approximate Fisher Scoring Algorithm (AFSA). The expressions for $\tilde I(\theta)$ and its inverse are available in closed form, as seen in Corollaries 2.3 and 2.4.

AFSA can be applied to finite mixture of multinomial models which are not explicitly in the form of (2.2). We now give two examples which use AFSA to compute MLEs for such models. The first is the Random Clumped model for overdispersed multinomial data. The second is an arbitrary mixture of multinomials with links from parameters to covariates.

Example 3.1. In section 1 we have mentioned the Random Clumped Multinomial (RCM), a distribution that addresses overdispersion due to clumped sampling in the multinomial framework. RCM represents an interesting model for exploring computational methods. Recently, Zhou and Lange (2010) have used it as an illustrative example for the minorization-maximization principle. Raim et al. (2012) have explored parallel computing in maximum likelihood estimation using large RCM models as a test problem. It turns out that RCM conforms to the finite mixture of multinomials representation (2.1), and can therefore be fitted by the AFSA algorithm.
Once the mixture representation is established, the score vector and approximate FIM can be formulated by the use of transformations; see for example section 2.6 of Lehmann and Casella. Hence, we can obtain the algorithm presented in Morel and Nagaraj (1993) and Neerchal and Morel (1998) as an AFSA-type algorithm.

Consider a cluster of $m$ trials, where each trial results in one of $k$ possible outcomes with probabilities $(\pi_1, \ldots, \pi_k)$. Suppose a default category is also selected at random, so that each
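trial either results in this default outcome with probability $\rho$, or an independent choice with probability $1 - \rho$. Intuitively, if $\rho \to 0$, RCM approaches a standard multinomial distribution. Using this idea, an RCM random variable can be obtained from the following procedure. Let
$$Y_0, Y_1, \ldots, Y_m \overset{iid}{\sim} \mathrm{Mult}_k(\pi, 1) \qquad \text{and} \qquad U_1, \ldots, U_m \overset{iid}{\sim} U(0, 1)$$
be independent samples, then
$$X = Y_0 \sum_{i=1}^{m} I(U_i \le \rho) + \sum_{i=1}^{m} Y_i\, I(U_i > \rho) = Y_0 N + (Z \mid N) \qquad (3.2)$$
follows the distribution $\mathrm{RCM}_k(\pi, \rho)$. The representation (3.2) emphasizes that $N \sim \mathrm{Binomial}(m, \rho)$, $(Z \mid N) \sim \mathrm{Mult}_k(\pi, m - N)$, and $Y_0 \sim \mathrm{Mult}_k(\pi, 1)$, where $N$ and $Y_0$ are independent. RCM is also a special case of the finite mixture of multinomials, so that
$$X \sim f(x; \pi, \rho) = \sum_{\ell=1}^{k} \pi_\ell\, f(x; p_\ell, m),$$
$$p_\ell = (1 - \rho)\pi + \rho e_\ell, \ \text{ for } \ell = 1, \ldots, k-1, \qquad p_k = (1 - \rho)\pi,$$
where $f(x; p, m)$ is our usual notation for the density of $\mathrm{Mult}_k(p, m)$. This mixture representation can be derived using moment generating functions, as shown in Morel and Nagaraj. Notice that in this mixture $s = k$, so that the number of mixture components matches the number of categories. There are also only $k$ distinct parameters rather than $sk - 1$ as in the general mixture.

The approximate FIM for the RCM model can be obtained by transformation, starting with the expression for the general mixture. Consider transforming the $k$ dimensional $\eta = (\pi, \rho)$ to the $q = sk - 1 = (k+1)(k-1)$ dimensional $\theta = (p_1, \ldots, p_s, \pi)$ so that
$$\theta(\eta) = \begin{pmatrix} (1 - \rho)\pi + \rho e_1 \\ \vdots \\ (1 - \rho)\pi + \rho e_{k-1} \\ (1 - \rho)\pi \\ \pi \end{pmatrix}.$$
The $q \times k$ Jacobian of this transformation is
$$\frac{\partial \theta}{\partial \eta} = \Big( \frac{\partial \theta_i}{\partial \eta_j} \Big) = \begin{pmatrix} (1 - \rho) I_{k-1} & -\pi + e_1 \\ \vdots & \vdots \\ (1 - \rho) I_{k-1} & -\pi + e_{k-1} \\ (1 - \rho) I_{k-1} & -\pi \\ I_{k-1} & 0 \end{pmatrix}.$$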
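The sampling procedure behind representation (3.2) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, $\pi$ is passed as a full length-$k$ probability vector, and categories are indexed from zero.

```python
import random

def draw_category(pi, rng):
    """Index of the category hit by one Mult_k(pi, 1) draw."""
    return rng.choices(range(len(pi)), weights=pi)[0]

def sample_rcm(pi, rho, m, rng):
    """One RCM_k(pi, rho) count vector for a cluster of m trials."""
    counts = [0] * len(pi)
    default = draw_category(pi, rng)          # Y_0, drawn once per cluster
    for _ in range(m):
        if rng.random() <= rho:               # U_i <= rho: copy the default outcome
            counts[default] += 1
        else:                                 # U_i > rho: independent draw Y_i
            counts[draw_category(pi, rng)] += 1
    return counts

if __name__ == "__main__":
    rng = random.Random(2024)
    for _ in range(5):
        print(sample_rcm([0.2, 0.3, 0.5], rho=0.4, m=20, rng=rng))
```

Setting `rho` close to 1 makes every trial repeat the default outcome, which is the clumping that produces overdispersion relative to a standard multinomial.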
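Using the relations
$$S(\eta) = \frac{\partial}{\partial \eta} \log f(x; \theta) = \Big( \frac{\partial \theta}{\partial \eta} \Big)^T \frac{\partial}{\partial \theta} \log f(x; \theta), \qquad I(\eta) = \mathrm{Var}(S(\eta)) = \Big( \frac{\partial \theta}{\partial \eta} \Big)^T I(\theta) \Big( \frac{\partial \theta}{\partial \eta} \Big),$$
it is possible to obtain an explicit form of the approximate FIM as $\tilde I(\eta) = ((a_{ij}))$, where
$$a_{ij} = \begin{cases} m(1 - \rho)^2 (\beta_i + \beta_k) + \pi_i^{-1} + \pi_k^{-1}, & i = j,\ i, j \in \{1, \ldots, k-1\} \\ m(1 - \rho)^2 \beta_k + \pi_k^{-1}, & i \neq j,\ i, j \in \{1, \ldots, k-1\} \\ m(1 - \rho)(\gamma_i - \gamma_k), & j = k,\ i \in \{1, \ldots, k-1\} \\ \dfrac{m}{1 - \rho} \sum_{i=1}^{k} \pi_i (1 - \pi_i) \big[ (1 - \rho)\pi_i + \rho \big]^{-1}, & i = k,\ j = k \end{cases}$$
and
$$\beta_i = \frac{\pi_i}{(1 - \rho)\pi_i + \rho} + \frac{1 - \pi_i}{(1 - \rho)\pi_i}, \qquad \gamma_i = \frac{\pi_i (1 - \pi_i)}{(1 - \rho)\pi_i + \rho} + \frac{\pi_i}{1 - \rho}, \qquad i = 1, \ldots, k.$$
It can be shown rigorously that $\tilde I(\eta) - I(\eta) \to 0$ as $m \to \infty$, as stated in Morel and Nagaraj (1993), and proved in detail in Morel and Nagaraj. The proof is similar in spirit to the proof of Theorem 2.1. We then have AFSA iterations for RCM,
$$\eta^{(g+1)} = \eta^{(g)} + \tilde I^{-1}(\eta^{(g)})\, S(\eta^{(g)}), \qquad g = 1, 2, \ldots$$

The following example involves a mixture of multinomials where the response probabilities are functions of covariates. The idea is analogous to the usual multinomial with logit link, but with links corresponding to each component of the mixture.

Example 3.2. In practice there are often covariates to be linked into the model. As an example of how AFSA can be applied, consider the following fixed effect model for response $Y \sim \mathrm{MultMix}_k(\theta(x), m)$ with $d \times 1$ covariates $x$ and $z$. To each $p_\ell$ vector, a generalized logit link will be added,
$$\log \frac{p_{\ell j}(x)}{p_{\ell k}(x)} = \eta_{\ell j}, \qquad \eta_{\ell j} = x^T \beta_{\ell j}, \qquad \text{for } \ell = 1, \ldots, s \text{ and } j = 1, \ldots, k-1.$$
A proportional odds model will be assumed for $\pi$,
$$\log \frac{\pi_1(z) + \cdots + \pi_\ell(z)}{\pi_{\ell+1}(z) + \cdots + \pi_s(z)} = \eta_\ell^\pi, \qquad \eta_\ell^\pi = \nu_\ell + z^T \alpha, \qquad \text{for } \ell = 1, \ldots, s-1,$$
taking $\eta_0^\pi := -\infty$ and $\eta_s^\pi := \infty$. The unknown parameters are the $d \times 1$ vectors $\alpha$ and $\beta_{\ell j}$, and the scalars $\nu_\ell$. Denote these parameters collectively as
$$B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_s \\ \nu \\ \alpha \end{pmatrix}, \qquad \text{where } \beta_\ell = \begin{pmatrix} \beta_{\ell 1} \\ \vdots \\ \beta_{\ell, k-1} \end{pmatrix} \text{ and } \nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_{s-1} \end{pmatrix}.$$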
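Expressions for the $\theta$ parameters can be obtained as
$$p_{\ell j}(x) = \frac{e^{\eta_{\ell j}}}{1 + \sum_{b=1}^{k-1} e^{\eta_{\ell b}}}, \qquad \text{for } \ell = 1, \ldots, s \text{ and } j = 1, \ldots, k-1,$$
$$\pi_\ell(z) = \frac{e^{\eta_\ell^\pi}}{1 + e^{\eta_\ell^\pi}} - \frac{e^{\eta_{\ell-1}^\pi}}{1 + e^{\eta_{\ell-1}^\pi}}, \qquad \text{for } \ell = 1, \ldots, s.$$
To implement AFSA, a score vector and approximate FIM are needed. For the score vector we have
$$S(B) = \frac{\partial}{\partial B} \log f(y; \theta) = \Big( \frac{\partial N}{\partial B} \Big)^T \Big( \frac{\partial \theta}{\partial N} \Big)^T \frac{\partial}{\partial \theta} \log f(y; \theta),$$
where $N = (\eta_1, \ldots, \eta_s, \eta^\pi)$, $\eta_\ell = (\eta_{\ell 1}, \ldots, \eta_{\ell, k-1})$, and $\eta^\pi = (\eta_1^\pi, \ldots, \eta_{s-1}^\pi)$. For the FIM we have
$$I(B) = \mathrm{Var}(S(B)) = \Big( \frac{\partial N}{\partial B} \Big)^T \Big( \frac{\partial \theta}{\partial N} \Big)^T I(\theta) \Big( \frac{\partial \theta}{\partial N} \Big) \Big( \frac{\partial N}{\partial B} \Big).$$
Finding expressions for the two Jacobians is tedious but straightforward.

Propositions 3.3 and 3.4 and Theorem 3.5 state consequences of the main approximation result, which have significant implications for the computation of MLEs. We have already seen that the approximate FIM is equivalent to a complete data FIM from EM. There is also an interesting connection between AFSA and EM, in that the iterations are algebraically related. To see this connection, explicit forms for AFSA and EM iterations are first presented, with proofs given in Appendix B.

Proposition 3.3 (AFSA Iterations). The AFSA iterations
$$\theta^{(g+1)} = \theta^{(g)} + \tilde I^{-1}(\theta^{(g)})\, S(\theta^{(g)}), \qquad g = 1, 2, \ldots$$
can be written explicitly as
$$\pi_\ell^{(g+1)} = \pi_\ell^{(g)}\, \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}, \qquad \ell = 1, \ldots, s,$$
$$p_{\ell j}^{(g+1)} = \frac{1}{M} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}\, x_{ij} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{M} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}\, m_i \Big], \qquad \ell = 1, \ldots, s, \ j = 1, \ldots, k,$$
where $M = m_1 + \cdots + m_n$ and the densities $P_\ell$ and $P$ are evaluated at $\theta^{(g)}$.

Proposition 3.4 (EM Iterations). Consider the complete data
$$Z_i = \begin{cases} 1 & \text{with probability } \pi_1 \\ \ \vdots \\ s & \text{with probability } \pi_s, \end{cases} \qquad \text{and} \qquad (X_i \mid Z_i = \ell) \sim \mathrm{Mult}_k(p_\ell, m_i),$$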
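where $(X_i, Z_i)$ are independent for $i = 1, \ldots, n$. Denote $\gamma_{i\ell}^{(g)} := P(Z_i = \ell \mid x_i, \theta^{(g)})$ as the posterior probability that the $i$th observation belongs to the $\ell$th group. Iterations for an EM algorithm are given by
$$\pi_\ell^{(g+1)} = \frac{1}{n} \sum_{i=1}^{n} \gamma_{i\ell}^{(g)} = \pi_\ell^{(g)}\, \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}, \qquad \ell = 1, \ldots, s,$$
$$p_{\ell j}^{(g+1)} = \frac{\sum_{i=1}^{n} x_{ij}\, \gamma_{i\ell}^{(g)}}{\sum_{i=1}^{n} m_i\, \gamma_{i\ell}^{(g)}} = \frac{\sum_{i=1}^{n} x_{ij}\, P_\ell(x_i)/P(x_i)}{\sum_{i=1}^{n} m_i\, P_\ell(x_i)/P(x_i)}, \qquad \ell = 1, \ldots, s, \ j = 1, \ldots, k.$$

The iterations for AFSA or EM are repeated for $g = 1, 2, \ldots$, with a given initial guess $\theta^{(0)}$, until
$$\big| \log L(\theta^{(g+1)}) - \log L(\theta^{(g)}) \big| < \varepsilon,$$
where $\varepsilon > 0$ is a given tolerance. This is taken to be the stopping criterion for the remainder of this paper.

Theorem 3.5. Denote the estimator from EM by $\hat\theta$, and the estimator from AFSA by $\tilde\theta$. Suppose cluster sizes are equal, so that $m_1 = \cdots = m_n = m$. If the two algorithms start at the $g$th iteration with $\theta^{(g)}$, then for the $(g+1)$th iteration,
$$\tilde\pi_\ell^{(g+1)} = \hat\pi_\ell^{(g+1)} \qquad \text{and} \qquad \tilde p_{\ell j}^{(g+1)} = \Big( \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} \qquad (3.4)$$
for $\ell = 1, \ldots, s$ and $j = 1, \ldots, k$.

Proof of Theorem 3.5. It is immediate from Propositions 3.3 and 3.4 that $\tilde\pi_\ell^{(g+1)} = \hat\pi_\ell^{(g+1)}$, and that
$$\frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} = \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}.$$
Now,
$$\Big( \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} = \Big[ \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] \frac{\sum_{i=1}^{n} x_{ij}\, P_\ell(x_i)/P(x_i)}{m \sum_{i=1}^{n} P_\ell(x_i)/P(x_i)} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big]$$
$$= \frac{1}{mn} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)}\, x_{ij} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] = \tilde p_{\ell j}^{(g+1)}.$$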
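The relationship in Theorem 3.5 can be checked numerically by taking one EM step and one AFSA step from the same starting value. The sketch below uses a small made-up data set with equal cluster sizes ($m = 5$, $k = 3$, $s = 2$); the data, the starting value and the function names are assumptions for illustration only.

```python
import math

def mult_pmf(x, p):
    """Multinomial density of the count vector x with probabilities p."""
    coef = math.factorial(sum(x))
    for xj in x:
        coef //= math.factorial(xj)
    val = float(coef)
    for xj, pj in zip(x, p):
        val *= pj ** xj
    return val

def ratios(data, p, pi):
    """w[l][i] = P_l(x_i) / P(x_i), evaluated at the current (p, pi)."""
    w = [[0.0] * len(data) for _ in pi]
    for i, x in enumerate(data):
        comp = [mult_pmf(x, p_l) for p_l in p]
        mix = sum(pi_l * c for pi_l, c in zip(pi, comp))
        for l in range(len(pi)):
            w[l][i] = comp[l] / mix
    return w

def em_step(data, p, pi):
    """One EM update in the explicit form of Proposition 3.4."""
    n, k, w = len(data), len(data[0]), ratios(data, p, pi)
    new_pi = [pi[l] * sum(w[l]) / n for l in range(len(pi))]
    new_p = [[sum(w[l][i] * data[i][j] for i in range(n)) /
              sum(w[l][i] * sum(data[i]) for i in range(n))
              for j in range(k)] for l in range(len(pi))]
    return new_p, new_pi

def afsa_step(data, p, pi):
    """One AFSA update in the explicit form of Proposition 3.3."""
    n, k, w = len(data), len(data[0]), ratios(data, p, pi)
    M = sum(sum(x) for x in data)
    new_pi = [pi[l] * sum(w[l]) / n for l in range(len(pi))]
    new_p = [[sum(w[l][i] * data[i][j] for i in range(n)) / M +
              p[l][j] * (1.0 - sum(w[l][i] * sum(data[i]) for i in range(n)) / M)
              for j in range(k)] for l in range(len(pi))]
    return new_p, new_pi

data = [[5, 0, 0], [4, 1, 0], [0, 2, 3], [1, 1, 3], [3, 2, 0], [0, 1, 4]]
p0, pi0 = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]], [0.5, 0.5]

if __name__ == "__main__":
    p_em, pi_em = em_step(data, p0, pi0)
    p_af, pi_af = afsa_step(data, p0, pi0)
    for l in range(2):
        r = pi_em[l] / pi0[l]
        combo = [r * p_em[l][j] + (1.0 - r) * p0[l][j] for j in range(3)]
        print(combo, p_af[l])   # the two vectors should agree
```

With unequal cluster sizes the same two functions still run, but the linear-combination identity is no longer expected to hold exactly.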
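The $(g+1)$th AFSA iterate can then be seen as a linear combination of the $g$th iterate and the $(g+1)$th step of EM. The coefficient $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$ is non-negative but may be larger than 1. Therefore $\tilde p_{\ell j}^{(g+1)}$ need not lie strictly between $\hat p_{\ell j}^{(g+1)}$ and $p_{\ell j}^{(g)}$. Figure 1 shows a plot of $\tilde p_{\ell j}^{(g+1)}$ as the ratio $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$ varies. However, suppose that at the $g$th step the EM algorithm is close to convergence. Then
$$\hat\pi_\ell^{(g+1)} \approx \hat\pi_\ell^{(g)}, \quad \text{so that} \quad \frac{\hat\pi_\ell^{(g+1)}}{\hat\pi_\ell^{(g)}} \approx 1, \qquad \text{for } \ell = 1, \ldots, s.$$
From (3.4) we will also have
$$\tilde p_{\ell j}^{(g+1)} \approx \hat p_{\ell j}^{(g+1)}, \qquad \text{for } \ell = 1, \ldots, s \text{ and } j = 1, \ldots, k.$$
From this point on, AFSA and EM iterations are approximately the same. Hence, in the vicinity of a solution, AFSA and EM will produce the same estimate. Note that this result holds for any $m$, and does not require a large cluster size justification. For the case of varying cluster sizes $m_1, \ldots, m_n$,
$$\Big( \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) \hat p_{\ell j}^{(g+1)} + \Big( 1 - \frac{\hat\pi_\ell^{(g+1)}}{\pi_\ell^{(g)}} \Big) p_{\ell j}^{(g)} = \Big[ \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big] \frac{\sum_{i=1}^{n} x_{ij}\, P_\ell(x_i)/P(x_i)}{\sum_{i=1}^{n} m_i\, P_\ell(x_i)/P(x_i)} + p_{\ell j}^{(g)} \Big[ 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{P_\ell(x_i)}{P(x_i)} \Big], \qquad (3.5)$$
which does not simplify to $\tilde p_{\ell j}^{(g+1)}$ as in the proof of Theorem 3.5. However, this illustrates that EM and AFSA are still closely related. This also suggests an ad-hoc revision to AFSA, letting $\tilde p_{\ell j}^{(g+1)}$ equal (3.5) so that the algebraic relationship to EM would be maintained as in (3.4) for the balanced case.

A more general connection is known between EM and iterations of the form
$$\theta^{(g+1)} = \theta^{(g)} + I_c^{-1}(\theta^{(g)})\, S(\theta^{(g)}), \qquad g = 1, 2, \ldots, \qquad (3.6)$$
where $I_c(\theta)$ is a complete data FIM. Titterington (1984) shows that the two iterations are approximately equivalent under appropriate regularity conditions. The equivalence is exact when the complete data likelihood is a regular exponential family
$$L(\mu) = \exp\big\{ b(x) + \eta^T t + a(\eta) \big\}, \qquad \eta = \eta(\mu), \quad t = t(x),$$
and $\mu := E(t(X))$ is the parameter of interest. The complete data likelihood for our multinomial mixture is indeed a regular exponential family, but the parameter of interest $\theta$ is a transformation of $\mu$ rather than $\mu$ itself. Therefore the equivalence is approximate, as we have seen in Theorem 3.5.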
The justification for AFSA leading to this paper followed the historical approach of Blischke (1964), and not the role of $\tilde I(\theta)$ as a complete data FIM. But the relationship between EM and the iterations (3.6) suggests that AFSA is a reasonable approach for finite mixtures beyond the multinomial setting.
[Figure 1: AFSA step compared to previous iterate and EM step. The next AFSA iteration $\tilde p_{\ell j}^{(g+1)}$ is a linear combination of $\hat p_{\ell j}^{(g+1)}$ and $p_{\ell j}^{(g)}$, which depends on the ratio $\hat\pi_\ell^{(g+1)} / \pi_\ell^{(g)}$.]

4 Simulation Studies

The main result stated in Theorem 2.1 allows us to approximate the matrix $I(\theta)$ by $\tilde I(\theta)$, which is much more easily computed. Theorem 2.5 justifies $\tilde I^{-1}(\theta)$ as an approximation for the inverse FIM. In the present section, simulation studies investigate the quality of the two approximations as a function of $m$. We also present studies to demonstrate the convergence speed and solution quality of AFSA.

4.1 Distance between true and approximate FIM

Consider two concepts of distance to compare the closeness of the exact and approximate matrices. Based on the Frobenius norm $\|A\|_F = \sqrt{\sum_i \sum_j a_{ij}^2}$, a distance metric
$$d_F(A, B) = \|A - B\|_F$$
can be constructed using the sum of squared differences of corresponding elements. This distance will be larger in general when the magnitudes of the elements are larger, so we will also consider a scaled version
$$d_S(A, B) = \frac{d_F(A, B)}{\|B\|_F} = \sqrt{\frac{\sum_i \sum_j (a_{ij} - b_{ij})^2}{\sum_i \sum_j b_{ij}^2}},$$
noting that this is not a true distance metric since it is not symmetric. Using these two metrics, we compare the distance between true and approximate FIMs, and also the distance between their inverses. Consider a mixture $\mathrm{MultMix}_2(\theta, m)$ of three binomials, with parameters
$$p = (1/7,\ 1/3,\ 2/3) \qquad \text{and} \qquad \pi = (1/6,\ 2/6,\ 3/6).$$
Figure 2 plots the two distance types for both the FIM and inverse FIM as $m$ varies. Note that distances are plotted on a log scale, so the vertical axis represents orders of magnitude. To see more concretely what is being compared, consider the moderate cluster size $m = 20$ and the resulting pair of approximate and exact FIMs, together with the pair of approximate and exact inverse FIMs. Since the approximations are block diagonal matrices, they have no way of capturing the off-diagonal blocks, which are present in the exact matrices but are eventually dominated by the block-diagonal elements as $m \to \infty$. This emphasizes one obvious disadvantage of the approximate FIM, which is that it cannot be used to estimate all the asymptotic covariances for the MLEs for a fixed cluster size. For this $m = 20$ case, the block-diagonal elements for both pairs of matrices are not very close, although they are at least the same order of magnitude with the same signs. The magnitudes of elements in the inverse FIMs are in general much smaller than those in the FIMs, so the unscaled distance will naturally be smaller between the inverses.

Now in Figure 2 consider the distance $d_F(\tilde I(\theta), I(\theta))$ as $m$ is varied. For the FIM, the distance appears to be moderate at first, then increases with $m$, and finally begins to vanish as $m$ becomes large. What is not reflected here is that the magnitudes of the elements themselves are increasing; this inflates the distance until the convergence of Theorem 2.1 takes effect. Considering the scaled distance $d_S(\tilde I(\theta), I(\theta))$ helps to suppress the effect of the element magnitudes and gives a clearer picture of the convergence.

Focusing next on the inverse FIM, consider the distance $d_F(\tilde I^{-1}(\theta), I^{-1}(\theta))$. For $m < 5$ the exact FIM is computationally singular, so its inverse cannot be computed. Note that in this case the conditions for identifiability are not satisfied (see Appendix A).
This is not just a coincidence; there is a known relationship between model non-identifiability and singularity of the FIM (Rothenberg). For $m$ between 5 and about 23, the distance is very large at first because of near-singularity of the FIM, but quickly returns to a reasonable magnitude. As $m$ increases further, the distance quickly vanishes toward zero. We also consider the
scaled distance $d_S(\tilde I^{-1}(\theta), I^{-1}(\theta))$. Again, this helps to remove the effects of the element magnitudes, which are becoming very small as $m$ increases. Even after taking into account the scale of the elements, the distance between the inverse matrices appears to be converging more quickly than the distance between the FIM and approximate FIM.

[Figure 2: Distance between exact and approximate FIM and its inverse, as $m$ is varied. Panel (a), using unscaled distance: log of Frobenius distance between exact and approximate matrices. Panel (b), using scaled distance: log of scaled Frobenius distance between exact and approximate matrices. Each panel plots log(distance) against $m$ for the FIM and the inverse FIM.]

4.2 Effectiveness of AFSA method

Convergence Speed

We first observe the convergence speed of AFSA and several of its competitors. Consider the mixture of two trinomials
$$Y_i \overset{iid}{\sim} \mathrm{MultMix}_3(\theta, m = 20), \qquad i = 1, \ldots, n = 500,$$
$$p_1 = (1/3,\ 1/3,\ 1/3), \qquad p_2 = (\ldots), \qquad \pi = (\ldots).$$
We fit the MLE using AFSA, FSA, and EM. After the $g$th iteration, the quantity
$$\delta^{(g)} = \log L(\theta^{(g)}) - \log L(\theta^{(g-1)})$$
is measured. The sequence $\log |\delta^{(g)}|$ is plotted for each algorithm in Figure 3. Note that $\delta^{(g)}$ may be negative, except for example in EM, which guarantees an improvement to the log-likelihood in every step. A negative $\delta^{(g)}$ can be interpreted as negative progress, at least from a local maximum. The absolute value is taken to make plotting possible on the log scale, but some steps with negative progress have been obscured. The resulting estimates and
standard errors for all algorithms are shown in Table 1, and additional summary information is shown in Table 2. We see that AFSA and EM have almost exactly the same rate of convergence toward the same solution, as suggested by Theorem 3.5. FSA had severe problems, and was not able to converge within 100 iterations; i.e. $|\delta^{(g)}| < 10^{-8}$ was not attained. The situation for FSA is worse than it appears in the plot. Although $\log |\delta^{(g)}|$ is becoming small, FSA's steps result in both positive and negative $\delta^{(g)}$'s until the iteration limit is reached. This indicates a failure to approach any maximum of the log-likelihood.

We also considered an FSA hybrid with a warmup period, where for a given $\varepsilon_0 > 0$ the approximate FIM is used until the first time $\delta^{(g)} < \varepsilon_0$ is crossed. Notice that $\varepsilon_0 = \infty$ corresponds to no warmup period. A similar idea has been considered by Neerchal and Morel (2005), who proposed a two-stage procedure for AFSA in the RCM setting of Example 3.1. The first stage consisted of running AFSA iterations until convergence, and in the second stage one additional iteration of exact Fisher Scoring was performed. The purpose of the FSA iteration was to improve standard error estimates, which were previously found to be inaccurate when computed directly from the approximate FIM (Neerchal and Morel). Here we note that FSA also offers a faster convergence rate than AFSA, given an initial path to a solution. Therefore, AFSA can be used in early iterations to move to the vicinity of a solution, then a switch to FSA will give an accelerated convergence to the solution. This approach depends on the exact FIM being feasible to compute, so the sample space cannot be too large. For the present simulations, we make use of the naive summation (2.3). Hence, there is a trade-off in the choice of $\varepsilon_0$ between effort spent on computing the exact FIM and a larger number of iterations required for AFSA.
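The control flow of the warmup hybrid can be sketched independently of the mixture model. The example below is a deliberately simple stand-in, not the paper's implementation: it fits a single binomial proportion, where the exact information is known, and uses a hypothetical inflated information (twice the exact value) to play the role of a slower but safe approximate step. All names and the toy problem are assumptions for illustration.

```python
import math

def hybrid_scoring(theta, loglik, score, approx_info, exact_info,
                   eps0, eps=1e-8, max_iter=100):
    """Scalar scoring iterations: approximate information during warmup,
    exact information once the log-likelihood change first drops below eps0."""
    use_exact = math.isinf(eps0) and eps0 > 0      # eps0 = inf: no warmup at all
    prev = loglik(theta)
    for it in range(1, max_iter + 1):
        info = exact_info(theta) if use_exact else approx_info(theta)
        theta = theta + score(theta) / info
        cur = loglik(theta)
        delta, prev = cur - prev, cur
        if abs(delta) < eps:                       # stopping criterion
            return theta, it
        if delta < eps0:                           # warmup tolerance crossed
            use_exact = True
    return theta, max_iter

# Toy problem: X successes in N Bernoulli trials, parameter p.
X, N = 30, 100
loglik = lambda p: X * math.log(p) + (N - X) * math.log(1.0 - p)
score = lambda p: X / p - (N - X) / (1.0 - p)
exact_info = lambda p: N / (p * (1.0 - p))
approx_info = lambda p: 2.0 * N / (p * (1.0 - p))  # hypothetical stand-in

if __name__ == "__main__":
    print(hybrid_scoring(0.9, loglik, score, approx_info, exact_info, eps0=1.0))
    print(hybrid_scoring(0.9, loglik, score, approx_info, exact_info, eps0=0.0))
```

In this toy the second call never leaves the warmup phase, so the difference in iteration counts between the two calls mimics the trade-off in the choice of the warmup tolerance.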
Figure 3 shows that the hybrid strategy is effective, addressing the erratic behavior of FSA from an arbitrary starting value and the slower convergence rates of EM and AFSA. Table 2 shows that even a very limited warmup period such as $\varepsilon_0 = 10$ can be sufficient. The Newton-Raphson algorithm, which has not been shown here, performed similarly to Fisher Scoring but has issues with singularity in some samples.

Standard errors for AFSA were obtained as $\sqrt{a_{11}}, \ldots, \sqrt{a_{qq}}$, denoting $\tilde I^{-1}(\hat\theta) = ((a_{ij}))$. For FSA and FSA-Hybrid, the inverse of the exact FIM was used instead. The basic EM algorithm does not yield standard error estimates. Several extensions have been proposed to address this, such as by Louis (1982) and Meng and Rubin. In light of Theorem 3.5, standard errors from $\tilde I^{-1}(\theta)$ evaluated at EM estimates could also be used to obtain similar results to AFSA.

Monte Carlo Study

We next consider a Monte Carlo study of the difference between AFSA and EM estimators. Observations were generated from
$$Y_i \overset{ind}{\sim} \mathrm{MultMix}_k(\theta, m_i), \qquad i = 1, \ldots, n = 500,$$
given varying cluster sizes $m_1, \ldots, m_n$ which themselves were generated as
$$Z_1, \ldots, Z_n \overset{iid}{\sim} \mathrm{Gamma}(\alpha, \beta), \qquad m_i = \lceil Z_i \rceil.$$
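Generating the random cluster sizes can be sketched with the standard library alone. The sketch assumes that each $Z_i$ is rounded up to the next integer and that the Gamma scale is set so that the mean is 20; the function name and the seed are hypothetical.

```python
import math
import random

def draw_cluster_sizes(n, alpha, mean=20.0, seed=0):
    """Integer cluster sizes m_i = ceil(Z_i) with Z_i ~ Gamma(shape alpha, scale mean/alpha)."""
    rng = random.Random(seed)
    beta = mean / alpha                      # scale, so that E(Z_i) = alpha * beta = mean
    return [math.ceil(rng.gammavariate(alpha, beta)) for _ in range(n)]

def sample_var(values):
    """Unbiased sample variance."""
    avg = sum(values) / len(values)
    return sum((v - avg) ** 2 for v in values) / (len(values) - 1)

if __name__ == "__main__":
    for alpha in (100.0, 25.0):
        sizes = draw_cluster_sizes(5000, alpha)
        print(alpha, sum(sizes) / len(sizes), sample_var(sizes))
```

Smaller values of `alpha` give a larger variance $400/\alpha$ for $Z_i$, and therefore more variable cluster sizes.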
[Figure 3: Convergence of several competing algorithms for a small test problem. The plot shows $\log |\delta^{(g)}|$ against iteration for AFSA, FSA, EM, and FSA with warmup.]

Table 1: Estimates and standard errors for the competing algorithms. FSA Hybrid produced the same results with $\varepsilon_0$ set to 0.001, 0.01, 0.1, 1, and 10. Columns: FSA, AFSA, EM, and FSA Hybrid, each giving the estimates of $p$ and $\pi$ with their standard errors (SE).
Table 2: Convergence of several competing algorithms. Hybrid FSA is shown with several choices of the warmup tolerance $\varepsilon_0$. Exact FSA uses $\varepsilon_0 = \infty$. Columns: method, $\varepsilon_0$, logLik, tol, iter. Rows: AFSA, EM, and six FSA settings.

Several different settings of $\theta$ are considered, with $s = 2$ mixing components and proportion $\pi = 0.75$ for the first component. The parameters $\alpha$ and $\beta$ were chosen such that $E(Z_i) = \alpha\beta = 20$. This gives $\beta = 20/\alpha$ so only $\alpha$ is free, and $\mathrm{Var}(Z_i) = \alpha\beta^2 = 400/\alpha$ can be chosen as desired. The expectation and variance of $m_i$ are intuitively similar to those of $Z_i$, and their exact values may be computed numerically. Once the $n$ observations are generated, an AFSA estimator $\tilde\theta$ and an EM estimator $\hat\theta$ are fit. This process is repeated 1000 times, yielding $\tilde\theta^{(r)}$ and $\hat\theta^{(r)}$ for $r = 1, \ldots, 1000$. A default initial value was selected for each setting of $\theta$, and used for both algorithms in every repetition. To measure the closeness of the two estimators, a maximum relative difference is taken over all components of $\theta$, then averaged over all repetitions:
$$D = \frac{1}{1000} \sum_{r=1}^{1000} D_r, \qquad \text{where } D_r = \bigvee_{j=1}^{q} \left| \frac{\tilde\theta_j^{(r)} - \hat\theta_j^{(r)}}{\tilde\theta_j^{(r)}} \right|.$$
Here $\bigvee$ represents the maximum operator. Notice that obtaining a good result for $D$ depends on the vectors $\hat\theta$ and $\tilde\theta$ being ordered in the same way. To help ensure this, we add the constraint $\pi_1 > \cdots > \pi_s$, which is enforced in both algorithms by reordering the estimates for $\pi_1, \ldots, \pi_s$ and $p_1, \ldots, p_s$ accordingly after every iteration.

Table 3 shows the results of the simulation. Nine different scenarios for $\theta$ are considered. The cluster sizes $m_1, \ldots, m_n$ are selected in three different ways: a balanced case where $m_i = 20$ for $i = 1, \ldots, n$, cluster sizes selected at random with small variability using $\alpha = 100$, and cluster sizes selected at random with moderate variability using $\alpha = 25$. Both algorithms are susceptible to finding local maxima of the likelihood, but in this experiment AFSA encountered the problem much more frequently. These cases stood out because the local maxima occurred with one of the mixing proportions or category probabilities close to zero, i.e. a convergence to the boundary of the parameter space. This is an especially bad situation for our Monte Carlo statistic $D$, which can become very large if
this occurs even once for a given scenario. The problem occurred most frequently for the case $p_1 = (0.1, 0.3)$ and $p_2 = (1/3, 1/3)$. To counter this, we restarted AFSA with a random starting value whenever a solution with any estimate less than 0.01 was obtained. For this experiment, no more than 15 out of 1000 trials required a restart, and no more than two restarts were needed for the same trial. In practice, we recommend starting AFSA with several initial values to ensure that any solutions on the boundary are not missteps taken by the algorithm.

Table 3: Closeness between AFSA and EM estimates, over 1000 trials. Rows: Scenarios A through I, each defined by a pair $(p_1, p_2)$. Columns: $D$ for equal cluster sizes ($m_i = 20$), for $\alpha = 100$, and for $\alpha = 25$, the last two labelled with the corresponding $\mathrm{Var}(m_i)$.

The entries in Table 3 show that small to moderate variation of the cluster sizes does not have a significant impact on the equivalence of AFSA and EM. On the other hand, as $p_1$ and $p_2$ are moved closer together, the quantity $D$ tends to become larger. Theorem 2.1 depends on the distinctness of the category probability vectors, so the quality of the FIM approximation at moderate cluster size may begin to suffer in this case. The estimation problem itself also intuitively becomes more difficult as $p_1$ and $p_2$ become closer. Recall that the dimension of $p_i$ is $k - 1$; it can be seen from Table 3 that increasing $k$ from 2 to 4 does not necessarily have a negative effect on the results. In Scenario E, $p_1$ and $p_2$ are not too close together, yet $D$ has a similar magnitude to Scenario D where the two vectors are closer. Figure 4 shows a plot of the individual $D_r$ for Scenarios D and E. Notice that in Scenario E, one particular simulation in each case is responsible for the large magnitude of $D$. Upon removal of these simulations, the order of $D$ is reduced from $10^{-3}$ to [...]. However, many large $D_r$ were present in the Scenario D results.

5 Conclusions

A large cluster approximation was presented for the FIM of the finite mixture of multinomials model (Theorem 2.1).
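As a rough picture of the approximation summarized here, the following sketch assembles a block-diagonal matrix from multinomial FIM blocks. It assumes the complete-data form one obtains when component labels are observed: component $a$ contributes $\pi_a$ times a multinomial FIM on $M = m_1 + \cdots + m_n$ trials, and the labels contribute a multinomial FIM for $\pi$ on $n$ draws. The function names, the parameter ordering $(p_1, \ldots, p_s, \pi)$, and the example values are illustrative assumptions; Theorem 2.1 itself is not reproduced in this section and should be consulted for the exact statement.

```python
import numpy as np


def multinomial_fim(p, m):
    """FIM of Mult_k(p, m) in the first k-1 probabilities: m (D^{-1} + 11^T / p_k)."""
    p = np.asarray(p, dtype=float)
    d = p.size - 1
    return m * (np.diag(1.0 / p[:-1]) + np.ones((d, d)) / p[-1])


def complete_data_fim(pi, P, cluster_sizes):
    """Block-diagonal complete-data FIM, parameters ordered (p_1, ..., p_s, pi).

    pi: length-s mixing proportions; P: s x k category probabilities;
    cluster_sizes: m_1, ..., m_n. All names here are hypothetical.
    """
    pi = np.asarray(pi, dtype=float)
    P = np.asarray(P, dtype=float)
    n = len(cluster_sizes)
    M = float(np.sum(cluster_sizes))
    # Component a: pi_a times a multinomial FIM based on M total trials.
    blocks = [pi[a] * multinomial_fim(P[a], M) for a in range(pi.size)]
    # Labels: multinomial FIM for pi based on n single-trial draws.
    blocks.append(multinomial_fim(pi, n))
    q = sum(b.shape[0] for b in blocks)
    out = np.zeros((q, q))
    pos = 0
    for b in blocks:
        d = b.shape[0]
        out[pos:pos + d, pos:pos + d] = b
        pos += d
    return out


# Illustrative values: s = 2, k = 3, five clusters of size 20.
F = complete_data_fim([0.75, 0.25], [[0.1, 0.3, 0.6], [1/3, 1/3, 1/3]], [20] * 5)
print(F.shape)
```

Because each block is a scaled multinomial FIM, its inverse is available in closed form as $(D - pp^T)$ divided by the scale, so no dense matrix inversion is needed under these assumptions.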
This matrix has a convenient block diagonal form, where each non-zero block is the FIM of a standard multinomial observation. Furthermore, the approximation is equivalent to the complete data FIM, had population labels been recorded for each
observation (Proposition 2.2). Using this approximation to the FIM, we formulated the Approximate Fisher Scoring Algorithm (AFSA), and showed that its iterations are closely related to the well known Expectation-Maximization (EM) algorithm for finite mixtures (Theorem 3.5).

Figure 4: Boxplots of the relative distances between EM and AFSA for Scenarios D and E of the Monte Carlo study, under $m = 20$, $\alpha = 100$, and $\alpha = 25$. At this scale, the boxes appear as thin horizontal lines.

Simulations show that a rather large cluster size is needed before the exact and approximate FIM are close; this is not surprising given that a block diagonal matrix is being used to approximate a dense matrix. A large cluster size is also needed for a close approximation of the inverse, although the inverses are seen to converge together more quickly. Therefore, the approximate FIM and its inverse are not well-suited to replace the exact matrices for general usage. This means, for example, that one should be cautious about computing standard errors for the MLE from the approximate inverse FIM. As another example of a general use for the approximate FIM, consider approximate $(1 - \alpha)$ level Wald-type and Score-type confidence regions,

$$\left\{ \theta_0 : (\hat{\theta} - \theta_0)^T \, \tilde{I}(\hat{\theta}) \, (\hat{\theta} - \theta_0) \le \chi^2_{q,\alpha} \right\} \quad \text{and} \quad \left\{ \theta_0 : S(\theta_0)^T \, \tilde{I}^{-1}(\theta_0) \, S(\theta_0) \le \chi^2_{q,\alpha} \right\}, \qquad (5.1)$$

respectively, using the approximate FIM in place of the exact FIM. Such regions are very practical to compute, but will likely not have the desired coverage for $\theta$. However, we might expect the Score region to perform better for moderate cluster sizes because it involves the inverse matrix.

On the other hand, the approximate FIM works well as a tool for estimation in the AFSA algorithm. This is interesting because the more standard Fisher Scoring and Newton-Raphson algorithms do not work well on their own. For Newton-Raphson, the invertibility of the Hessian depends on the sample as well as the current iterate $\theta^{(g)}$ and the model. Fisher Scoring can be computed when the cluster size is not too small so that
the FIM is non-singular, but it is often unable to make progress at all from an arbitrarily chosen starting point. In this case, AFSA or EM is useful for giving FSA some initial help. If FSA has a sufficiently good starting point, it can converge very quickly. Therefore we recommend a hybrid approach: use AFSA iterations for an initial warmup period, then switch to FSA once a path toward a solution has been established. This approach may also help to reduce the number of exact FIM computations needed, which may be expensive.

Although AFSA and EM are closely related and often tend toward the same solution, AFSA is not restricted to the parameter space. Additional precautions may therefore be needed to prevent AFSA iterations from drifting outside of the space. AFSA also tended to converge to the boundary of the space more often than EM; hence, we reiterate the usual advice of trying several initial values as a good practice. AFSA may be preferable to EM in situations where it is more natural to formulate. Derivation of the E-step conditional log-likelihood may involve evaluating a complicated expectation, but is not required for AFSA. A tradeoff for AFSA is that the score vector for the observed data must be computed; this may involve a messy differentiation, but is arguably easier to address numerically than the E-step. AFSA iterations were obtained for the Random-Clumped Multinomial in Example 3.1, starting from a general multinomial mixture and using an appropriate transformation of the parameters.

It is interesting to note the relationship between FSA, AFSA, and EM as Newton-type algorithms. Fisher Scoring is a classic algorithm where the Hessian is replaced by its expectation. In AFSA the Hessian is replaced instead by a complete data FIM. EM can be considered a Newton-type algorithm also, where the entire likelihood is replaced by a complete data likelihood with missing data integrated out. In this light, EM and AFSA iterations are seen to be approximately equivalent.
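The Newton-type view and the hybrid recommendation can be illustrated on a deliberately small toy problem, which is not the model studied in the paper: a two-component binomial mixture whose success probabilities are known, so that the mixing proportion $\pi$ is the only parameter. In this toy, the complete-data information for $\pi$ is $n / \{\pi(1 - \pi)\}$, and a short calculation shows the AFSA step coincides with the EM update. The data, the function names, and the switching rule (AFSA until the log-likelihood gain falls below a warmup tolerance `eps0`, then FSA) are assumptions made for illustration only.

```python
import math


def binom_pmf(x, m, p):
    return math.comb(m, x) * p**x * (1 - p)**(m - x)


def terms(pi, xs, m, p1, p2):
    """Per-observation (P1, P2, P) under pi * Bin(m, p1) + (1 - pi) * Bin(m, p2)."""
    out = []
    for x in xs:
        a, b = binom_pmf(x, m, p1), binom_pmf(x, m, p2)
        out.append((a, b, pi * a + (1 - pi) * b))
    return out


def loglik(pi, xs, m, p1, p2):
    return sum(math.log(f) for _, _, f in terms(pi, xs, m, p1, p2))


def score(pi, xs, m, p1, p2):
    return sum((a - b) / f for a, b, f in terms(pi, xs, m, p1, p2))


def exact_info(pi, n, m, p1, p2):
    """Exact FIM for pi: n * sum over x of (P1(x) - P2(x))^2 / P(x)."""
    return n * sum((a - b) ** 2 / f for a, b, f in terms(pi, range(m + 1), m, p1, p2))


def complete_info(pi, n):
    """Complete-data FIM for pi, as if the component labels were observed."""
    return n / (pi * (1 - pi))


def em_update(pi, xs, m, p1, p2):
    return sum(pi * a / f for a, _, f in terms(pi, xs, m, p1, p2)) / len(xs)


def afsa_step(pi, xs, m, p1, p2):
    return pi + score(pi, xs, m, p1, p2) / complete_info(pi, len(xs))


def fsa_step(pi, xs, m, p1, p2):
    return pi + score(pi, xs, m, p1, p2) / exact_info(pi, len(xs), m, p1, p2)


def hybrid(pi, xs, m, p1, p2, eps0=1e-2, eps=1e-10, max_iter=200):
    """AFSA warmup until the log-likelihood gain is below eps0, then FSA."""
    step, old = afsa_step, loglik(pi, xs, m, p1, p2)
    for _ in range(max_iter):
        new_pi = step(pi, xs, m, p1, p2)
        new = loglik(new_pi, xs, m, p1, p2)
        if step is afsa_step and abs(new - old) < eps0:
            step = fsa_step  # warmup finished; later iterations use the exact FIM
        elif step is fsa_step and abs(new_pi - pi) < eps:
            return new_pi
        pi, old = new_pi, new
    return pi


xs = [3, 4, 5, 2, 12, 13, 4, 11, 3, 5]  # made-up counts out of m = 20
print(afsa_step(0.5, xs, 20, 0.2, 0.6), em_update(0.5, xs, 20, 0.2, 0.6))
print(hybrid(0.5, xs, 20, 0.2, 0.6))
```

The exact coincidence of the AFSA and EM updates is a feature of this one-parameter toy; for the category probabilities of the full multinomial mixture the text claims only approximate equivalence. The sketch also does no safeguarding against iterates leaving $(0, 1)$, which the discussion above notes may be needed in general.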
Several interesting questions can be raised at this point. There is a relationship between AFSA and EM which extends beyond the multinomial mixture; we wonder if the relationship between the exact and complete data information matrix generalizes as well. Also, for the present multinomial mixture, perhaps there is a small cluster bias correction that could be applied to improve the approximation. This might allow standard errors and confidence regions such as (5.1) to safely be derived from the approximate FIM.

6 Acknowledgements

The hardware used in the computational studies is part of the UMBC High Performance Computing Facility (HPCF). The facility is supported by the U.S. National Science Foundation through the MRI program (grant no. CNS [...]) and the SCREMS program (grant no. DMS [...]), with additional substantial support from the University of Maryland, Baltimore County (UMBC). See the HPCF website for more information on HPCF and the projects using its resources. The first author additionally acknowledges financial support as HPCF RA.
References

W. R. Blischke. Moment estimators for the parameters of a mixture of two binomial distributions. The Annals of Mathematical Statistics, 33(2), 1962.
W. R. Blischke. Estimating the parameters of mixtures of binomial distributions. Journal of the American Statistical Association, 59(306), 1964.
H. Bozdogan. Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3).
S. Chandra. On the mixtures of probability distributions. Scandinavian Journal of Statistics, 4, 1977.
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1-38.
A. B. Kabir. Estimation of parameters of a finite mixture of distributions. Journal of the Royal Statistical Society, Series B, 30, 1968.
K. Krolikowska. Estimation of the parameters of any finite mixture of geometric distributions. Demonstratio Mathematica, 9, 1976.
K. Lange. Numerical Analysis for Statisticians. Springer, 2nd edition.
E. L. Lehmann and G. Casella. Theory of Point Estimation. Springer, 2nd edition.
T. A. Louis. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44.
G. McLachlan and D. Peel. Finite Mixture Models. Wiley-Interscience, 2000.
X. L. Meng and D. B. Rubin. Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. Journal of the American Statistical Association, 86(416).
C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM.
J. G. Morel and N. K. Nagaraj. A finite mixture distribution for modeling multinomial extra variation. Technical Report, Research Report 91-03, Department of Mathematics and Statistics, University of Maryland, Baltimore County, 1991.
J. G. Morel and N. K. Nagaraj. A finite mixture distribution for modeling multinomial extra variation. Biometrika, 80(2).
J. G. Morel and N. K. Nagaraj. Overdispersion Models in SAS. SAS Institute.
N. K. Neerchal and J. G. Morel. Large cluster results for two parametric multinomial extra variation models. Journal of the American Statistical Association, 93(443).
N. K. Neerchal and J. G. Morel. An improved method for the computation of maximum likelihood estimates for multinomial overdispersion models. Computational Statistics & Data Analysis, 49(1):33-43.
M. Okamoto. Some inequalities relating to the partial sum of binomial probabilities. Annals of the Institute of Statistical Mathematics, 10:29-35, 1959.
A. M. Raim, M. K. Gobbert, N. K. Neerchal, and J. G. Morel. Maximum likelihood estimation of the random-clumped multinomial model as prototype problem for large-scale statistical computing. Accepted.
C. R. Rao. Linear Statistical Inference and its Applications. John Wiley and Sons Inc.
H. Robbins. Mixture of distributions. The Annals of Mathematical Statistics, 19(3), 1948.
T. J. Rothenberg. Identification in parametric models. Econometrica, 39.
H. Teicher. On the mixture of distributions. The Annals of Mathematical Statistics, 31(1):55-73.
D. M. Titterington. Recursive parameter estimation using incomplete data. Journal of the Royal Statistical Society, Series B, 46.
H. Zhou and K. Lange. MM algorithms for some discrete multivariate distributions. Journal of Computational and Graphical Statistics, 19(3).

A Appendix: Preliminaries and Notation

Given an independent sample $X_1, \ldots, X_n$ with joint likelihood $L(\theta)$ and $\theta$ having dimension $q \times 1$, the score vector is

$$S(\theta) = \frac{\partial}{\partial \theta} \log L(\theta) = \sum_{i=1}^{n} \frac{\partial}{\partial \theta} \log f(x_i; \theta).$$

For $X \sim \mathrm{Mult}_k(p, m)$ the score vector for a single observation can be obtained from

$$\frac{\partial}{\partial p_a} \log f(x; p, m) = \frac{\partial}{\partial p_a} \left[ x_1 \log p_1 + \cdots + x_{k-1} \log p_{k-1} + x_k \log\Big(1 - \sum_{j=1}^{k-1} p_j\Big) \right] = \frac{x_a}{p_a} - \frac{x_k}{p_k}, \qquad (A.1)$$
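so that

$$\frac{\partial}{\partial p} \log f(x; p, m) = \begin{pmatrix} x_1/p_1 \\ \vdots \\ x_{k-1}/p_{k-1} \end{pmatrix} - \begin{pmatrix} x_k/p_k \\ \vdots \\ x_k/p_k \end{pmatrix} = D^{-1} x_{-k} - \frac{x_k}{p_k} \mathbf{1},$$

denoting $D := \mathrm{diag}(p_1, \ldots, p_{k-1})$ and $x_{-k} := (x_1, \ldots, x_{k-1})$. The score vector for a single observation $X \sim \mathrm{MultMix}_k(\theta, m)$ can also be obtained,

$$\frac{\partial}{\partial p_a} \log P(x) = \frac{\partial}{\partial p_a} \log\Big\{ \sum_{\ell=1}^{s} \pi_\ell P_\ell(x) \Big\} = \frac{1}{P(x)} \, \pi_a \frac{\partial P_a(x)}{\partial p_a} = \frac{\pi_a P_a(x)}{P(x)} \frac{\partial \log P_a(x)}{\partial p_a} = \frac{\pi_a P_a(x)}{P(x)} \left[ D_a^{-1} x_{-k} - \frac{x_k}{p_{ak}} \mathbf{1} \right], \quad a = 1, \ldots, s,$$

where $D_a := \mathrm{diag}(p_{a1}, \ldots, p_{a,k-1})$, and

$$\frac{\partial}{\partial \pi_a} \log P(x) = \frac{\partial}{\partial \pi_a} \log\Big\{ \sum_{\ell=1}^{s} \pi_\ell P_\ell(x) \Big\} = \frac{P_a(x) - P_s(x)}{P(x)}, \quad a = 1, \ldots, s - 1.$$

Next, consider the $q \times q$ FIM for the independent sample $X_1, \ldots, X_n$,

$$I(\theta) = \mathrm{Var}(S(\theta)) = E\left[ \Big\{ \frac{\partial}{\partial \theta} \log L(\theta) \Big\} \Big\{ \frac{\partial}{\partial \theta} \log L(\theta) \Big\}^T \right] = -E\left[ \frac{\partial^2}{\partial \theta \, \partial \theta^T} \log L(\theta) \right].$$

The last equality holds under appropriate regularity conditions. For the multinomial FIM, we may use (A.1) to obtain

$$\frac{\partial^2}{\partial p_a \, \partial p_b} \log f(x; p, m) = \begin{cases} -x_k / p_k^2 & \text{if } a \neq b, \\ -x_a / p_a^2 - x_k / p_k^2 & \text{otherwise,} \end{cases}$$

and so

$$\frac{\partial^2}{\partial p \, \partial p^T} \log f(x; p, m) = -\mathrm{diag}\left( \frac{x_1}{p_1^2}, \ldots, \frac{x_{k-1}}{p_{k-1}^2} \right) - \frac{x_k}{p_k^2} \mathbf{1}\mathbf{1}^T.$$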
Therefore, we have

$$I(p) = -E\left[ \frac{\partial^2}{\partial p \, \partial p^T} \log f(x; p, m) \right] = \mathrm{diag}\left( \frac{m p_1}{p_1^2}, \ldots, \frac{m p_{k-1}}{p_{k-1}^2} \right) + \frac{m p_k}{p_k^2} \mathbf{1}\mathbf{1}^T = m \left\{ D^{-1} + p_k^{-1} \mathbf{1}\mathbf{1}^T \right\}.$$

The score vector and Hessian of the log-likelihood can be used to implement the Newton-Raphson algorithm, where the $(g + 1)$th iteration is given by

$$\theta^{(g+1)} = \theta^{(g)} - \left\{ \frac{\partial^2}{\partial \theta \, \partial \theta^T} \log L(\theta^{(g)}) \right\}^{-1} S(\theta^{(g)}).$$

The Hessian may be replaced with the FIM to implement Fisher Scoring

$$\theta^{(g+1)} = \theta^{(g)} + I^{-1}(\theta^{(g)}) \, S(\theta^{(g)}).$$

In order for the estimation problem to be well-defined in the first place, the model must be identifiable. For finite mixtures, this is taken to mean that the equality

$$\sum_{\ell=1}^{s} \pi_\ell f(x; \theta_\ell) \overset{\text{a.s.}}{=} \sum_{\ell=1}^{v} \lambda_\ell f(x; \xi_\ell)$$

implies $s = v$ and terms within the sums are equal, except the indices may be permuted (McLachlan and Peel, 2000, Section [...]). Chandra (1977) provides some insight into the identifiability issue, and shows that a family of multivariate mixtures is identifiable if any of the corresponding marginal mixtures are identifiable. In the present case, the multivariate mixtures consist of multinomial densities, and the marginal densities are binomials. It is well known that a finite mixture of $s$ components from $\{ \mathrm{Binomial}(m, \theta) : \theta \in (0, 1) \}$ is identifiable if and only if $m \ge 2s - 1$; see, for example, the work of Blischke. Then a sufficient condition for model (2.2) to be identifiable is that $m_i \ge 2s - 1$ for at least one observation. This can be seen by the following lemma.

Lemma A.1. Suppose $X_i \overset{\text{ind}}{\sim} f_i(x; \theta)$, $i = 1, \ldots, n$, where the $f_i$ are densities, and for at least one $r \in \{1, \ldots, n\}$ the family $\{ f_r(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable. Then the joint model is identifiable.

Proof of Lemma A.1. WLOG assume that $r = 1$, and suppose we have

$$\prod_{i=1}^{n} f_i(x_i; \theta) \overset{\text{a.s.}}{=} \prod_{i=1}^{n} f_i(x_i; \xi).$$

Integrating both sides with respect to $x_2, \ldots, x_n$, using the appropriate dominating measure, gives $f_1(x_1; \theta) \overset{\text{a.s.}}{=} f_1(x_1; \xi)$. Since the family $\{ f_1(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable, this implies $\theta = \xi$. Hence the joint family $\{ \prod_{i=1}^{n} f_i(\cdot\,; \theta) : \theta \in \Theta \}$ is identifiable.
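The multinomial score and FIM above can be checked numerically with a few lines of NumPy. The sketch below is only an illustration with made-up counts, and the function names are hypothetical. It also exercises an observation that follows from the formulas but is not stated in the text: since $\{D^{-1} + p_k^{-1}\mathbf{1}\mathbf{1}^T\}^{-1} = D - pp^T$, one gets $I^{-1}(p) S(p) = x_{-k}/m - p$, so a single Fisher Scoring step from any interior starting value lands on the MLE $x/m$.

```python
import numpy as np


def multinomial_score(x, p):
    """Score in (p_1, ..., p_{k-1}): D^{-1} x_{-k} - (x_k / p_k) 1."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    return x[:-1] / p[:-1] - x[-1] / p[-1]


def multinomial_fim(p, m):
    """FIM in (p_1, ..., p_{k-1}): m (D^{-1} + 11^T / p_k)."""
    p = np.asarray(p, dtype=float)
    d = p.size - 1
    return m * (np.diag(1.0 / p[:-1]) + np.ones((d, d)) / p[-1])


def fisher_scoring_step(x, p):
    """One iteration theta + I^{-1}(theta) S(theta) for a single multinomial count."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    m = x.sum()
    free = p[:-1] + np.linalg.solve(multinomial_fim(p, m), multinomial_score(x, p))
    # Recover the last category probability from the sum-to-one constraint.
    return np.append(free, 1.0 - free.sum())


x = [3, 5, 12]         # made-up counts, m = 20
p0 = [0.2, 0.3, 0.5]   # arbitrary interior starting value
p1 = fisher_scoring_step(x, p0)
print(p1, np.asarray(x) / 20)
```

No such one-step property should be expected for the mixture model, where the score involves the posterior weights $\pi_a P_a(x)/P(x)$.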
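B Appendix: Additional Proofs

To prove Theorem 2.1, we will first establish a key inequality. A similar strategy was used by Morel and Nagaraj (1991), but they considered the special case $k = s$, so that the number of mixture components is equal to the number of categories within each component. Here we generalize their argument to the general case where $k = s$ need not hold. The original proof was inspired by the following inequality from Okamoto (1959) for the tail probability of the binomial distribution, which was also considered by Blischke (1962).

Lemma B.1. Suppose $X \sim \mathrm{Binomial}(m, p)$ and let $f(x; m, p)$ be its density. Then for $c \ge 0$,

i. $P(X/m - p \ge c) \le e^{-2mc^2}$,
ii. $P(X/m - p \le -c) \le e^{-2mc^2}$.

Theorem B.2. For a given index $b \in \{1, \ldots, s\}$ we have

$$\sum_{x \in \Omega} \sum_{a \neq b}^{s} \frac{\pi_a P_a(x) P_b(x)}{P(x)} \le \frac{2}{\pi_b} \sum_{a \neq b}^{s} e^{-\frac{m}{2} \delta_{ab}^2},$$

where the outer sum runs over the multinomial sample space $\Omega$ and $\delta_{ab} = \bigvee_{j=1}^{k-1} |p_{aj} - p_{bj}|$.

Proof of Theorem B.2. For $a, b \in \{1, \ldots, s\}$, assume WLOG that

$$\delta_{ab} := \bigvee_{j=1}^{k-1} |p_{aj} - p_{bj}| = p_{aL} - p_{bL}, \quad \text{for some } L \in \{1, \ldots, k - 1\},$$

is positive. Denote as $\Omega(x_j)$ the multinomial sample space when the $j$th element of $x$ is fixed at a number $x_j$. Then we have

$$\begin{aligned}
\sum_{x \in \Omega} \frac{\pi_a P_a(x) P_b(x)}{P(x)}
&= \sum_{x_L = 0}^{m} \sum_{x \in \Omega(x_L)} \frac{\pi_a P_a(x) P_b(x)}{P(x)} \\
&= \sum_{x_L \le \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} \pi_a P_a(x) \frac{P_b(x)}{P(x)} + \sum_{x_L > \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} \frac{\pi_a P_a(x)}{P(x)} P_b(x) \\
&\le \sum_{x_L \le \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} \frac{\pi_a}{\pi_b} P_a(x) + \sum_{x_L > \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_b(x) \\
&= \frac{\pi_a}{\pi_b} \sum_{x_L \le \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_a(x) + \sum_{x_L > \frac{m}{2}(p_{aL} + p_{bL})} \sum_{x \in \Omega(x_L)} P_b(x). \qquad (B.1)
\end{aligned}$$

The inequality uses $\pi_b P_b(x) \le P(x)$ in the first sum and $\pi_a P_a(x) \le P(x)$ in the second. Notice that the last statement above consists of marginal probabilities for the $L$th coordinate of $k$-dimensional multinomials, which are binomial probabilities. Following Blischke (1962), suppose $A \sim \mathrm{Binomial}(m, p_{aL})$ and $B \sim \mathrm{Binomial}(m, p_{bL})$; then (B.1) is equal to

$$\frac{\pi_a}{\pi_b} P\left\{ A \le \frac{m}{2}(p_{aL} + p_{bL}) \right\} + P\left\{ B > \frac{m}{2}(p_{aL} + p_{bL}) \right\}. \qquad (B.2)$$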
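The tail bounds of Lemma B.1 are easy to check by exact summation of binomial probabilities. The sketch below does this for a small, arbitrarily chosen grid of $(m, p, c)$ values; the grid and the function names are illustrative assumptions, and a numerical check of this kind is of course not a proof. Taking $c = \delta_{ab}/2$ in the lemma gives the factor $e^{-\frac{m}{2}\delta_{ab}^2}$ that appears in Theorem B.2.

```python
import math


def binom_pmf(x, m, p):
    return math.comb(m, x) * p**x * (1 - p)**(m - x)


def upper_tail(m, p, c):
    """P(X/m - p >= c) for X ~ Binomial(m, p), by exact summation."""
    return sum(binom_pmf(x, m, p) for x in range(m + 1) if x / m - p >= c)


def lower_tail(m, p, c):
    """P(X/m - p <= -c) for X ~ Binomial(m, p), by exact summation."""
    return sum(binom_pmf(x, m, p) for x in range(m + 1) if x / m - p <= -c)


def tail_bound(m, c):
    """Right-hand side of Lemma B.1: exp(-2 m c^2)."""
    return math.exp(-2 * m * c * c)


# Arbitrary illustrative grid of cluster sizes, probabilities, and offsets.
for m in (5, 20, 100):
    for p in (0.1, 0.3, 0.5):
        for c in (0.05, 0.1, 0.2):
            b = tail_bound(m, c)
            print(m, p, c, upper_tail(m, p, c) <= b, lower_tail(m, p, c) <= b)
```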
CS229 Lecture notes. Andrew Ng
CS229 Lecture notes Andrew Ng Part IX The EM agorithm In the previous set of notes, we taked about the EM agorithm as appied to fitting a mixture of Gaussians. In this set of notes, we give a broader view
More informationA Brief Introduction to Markov Chains and Hidden Markov Models
A Brief Introduction to Markov Chains and Hidden Markov Modes Aen B MacKenzie Notes for December 1, 3, &8, 2015 Discrete-Time Markov Chains You may reca that when we first introduced random processes,
More information(This is a sample cover image for this issue. The actual cover is not yet available at this time.)
(This is a sampe cover image for this issue The actua cover is not yet avaiabe at this time) This artice appeared in a journa pubished by Esevier The attached copy is furnished to the author for interna
More informationAkaike Information Criterion for ANOVA Model with a Simple Order Restriction
Akaike Information Criterion for ANOVA Mode with a Simpe Order Restriction Yu Inatsu * Department of Mathematics, Graduate Schoo of Science, Hiroshima University ABSTRACT In this paper, we consider Akaike
More informationFRST Multivariate Statistics. Multivariate Discriminant Analysis (MDA)
1 FRST 531 -- Mutivariate Statistics Mutivariate Discriminant Anaysis (MDA) Purpose: 1. To predict which group (Y) an observation beongs to based on the characteristics of p predictor (X) variabes, using
More informationExplicit overall risk minimization transductive bound
1 Expicit overa risk minimization transductive bound Sergio Decherchi, Paoo Gastado, Sandro Ridea, Rodofo Zunino Dept. of Biophysica and Eectronic Engineering (DIBE), Genoa University Via Opera Pia 11a,
More informationSUPPLEMENTARY MATERIAL TO INNOVATED SCALABLE EFFICIENT ESTIMATION IN ULTRA-LARGE GAUSSIAN GRAPHICAL MODELS
ISEE 1 SUPPLEMENTARY MATERIAL TO INNOVATED SCALABLE EFFICIENT ESTIMATION IN ULTRA-LARGE GAUSSIAN GRAPHICAL MODELS By Yingying Fan and Jinchi Lv University of Southern Caifornia This Suppementary Materia
More informationThe EM Algorithm applied to determining new limit points of Mahler measures
Contro and Cybernetics vo. 39 (2010) No. 4 The EM Agorithm appied to determining new imit points of Maher measures by Souad E Otmani, Georges Rhin and Jean-Marc Sac-Épée Université Pau Veraine-Metz, LMAM,
More informationMARKOV CHAINS AND MARKOV DECISION THEORY. Contents
MARKOV CHAINS AND MARKOV DECISION THEORY ARINDRIMA DATTA Abstract. In this paper, we begin with a forma introduction to probabiity and expain the concept of random variabes and stochastic processes. After
More informationTwo-sample inference for normal mean vectors based on monotone missing data
Journa of Mutivariate Anaysis 97 (006 6 76 wwweseviercom/ocate/jmva Two-sampe inference for norma mean vectors based on monotone missing data Jianqi Yu a, K Krishnamoorthy a,, Maruthy K Pannaa b a Department
More informationA. Distribution of the test statistic
A. Distribution of the test statistic In the sequentia test, we first compute the test statistic from a mini-batch of size m. If a decision cannot be made with this statistic, we keep increasing the mini-batch
More informationA proposed nonparametric mixture density estimation using B-spline functions
A proposed nonparametric mixture density estimation using B-spine functions Atizez Hadrich a,b, Mourad Zribi a, Afif Masmoudi b a Laboratoire d Informatique Signa et Image de a Côte d Opae (LISIC-EA 4491),
More informationSTA 216 Project: Spline Approach to Discrete Survival Analysis
: Spine Approach to Discrete Surviva Anaysis November 4, 005 1 Introduction Athough continuous surviva anaysis differs much from the discrete surviva anaysis, there is certain ink between the two modeing
More informationAlberto Maydeu Olivares Instituto de Empresa Marketing Dept. C/Maria de Molina Madrid Spain
CORRECTIONS TO CLASSICAL PROCEDURES FOR ESTIMATING THURSTONE S CASE V MODEL FOR RANKING DATA Aberto Maydeu Oivares Instituto de Empresa Marketing Dept. C/Maria de Moina -5 28006 Madrid Spain Aberto.Maydeu@ie.edu
More informationLecture Note 3: Stationary Iterative Methods
MATH 5330: Computationa Methods of Linear Agebra Lecture Note 3: Stationary Iterative Methods Xianyi Zeng Department of Mathematica Sciences, UTEP Stationary Iterative Methods The Gaussian eimination (or
More information6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7
6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17 Soution 7 Probem 1: Generating Random Variabes Each part of this probem requires impementation in MATLAB. For the
More informationEfficiently Generating Random Bits from Finite State Markov Chains
1 Efficienty Generating Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown
More informationTHE REACHABILITY CONES OF ESSENTIALLY NONNEGATIVE MATRICES
THE REACHABILITY CONES OF ESSENTIALLY NONNEGATIVE MATRICES by Michae Neumann Department of Mathematics, University of Connecticut, Storrs, CT 06269 3009 and Ronad J. Stern Department of Mathematics, Concordia
More informationSome Measures for Asymmetry of Distributions
Some Measures for Asymmetry of Distributions Georgi N. Boshnakov First version: 31 January 2006 Research Report No. 5, 2006, Probabiity and Statistics Group Schoo of Mathematics, The University of Manchester
More informationFormulas for Angular-Momentum Barrier Factors Version II
BNL PREPRINT BNL-QGS-06-101 brfactor1.tex Formuas for Anguar-Momentum Barrier Factors Version II S. U. Chung Physics Department, Brookhaven Nationa Laboratory, Upton, NY 11973 March 19, 2015 abstract A
More informationXSAT of linear CNF formulas
XSAT of inear CN formuas Bernd R. Schuh Dr. Bernd Schuh, D-50968 Kön, Germany; bernd.schuh@netcoogne.de eywords: compexity, XSAT, exact inear formua, -reguarity, -uniformity, NPcompeteness Abstract. Open
More informationCONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION
CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION SAHAR KARIMI AND STEPHEN VAVASIS Abstract. In this paper we present a variant of the conjugate gradient (CG) agorithm in which we invoke a subspace minimization
More informationASummaryofGaussianProcesses Coryn A.L. Bailer-Jones
ASummaryofGaussianProcesses Coryn A.L. Baier-Jones Cavendish Laboratory University of Cambridge caj@mrao.cam.ac.uk Introduction A genera prediction probem can be posed as foows. We consider that the variabe
More informationAST 418/518 Instrumentation and Statistics
AST 418/518 Instrumentation and Statistics Cass Website: http://ircamera.as.arizona.edu/astr_518 Cass Texts: Practica Statistics for Astronomers, J.V. Wa, and C.R. Jenkins, Second Edition. Measuring the
More informationStochastic Variational Inference with Gradient Linearization
Stochastic Variationa Inference with Gradient Linearization Suppementa Materia Tobias Pötz * Anne S Wannenwetsch Stefan Roth Department of Computer Science, TU Darmstadt Preface In this suppementa materia,
More informationIntegrating Factor Methods as Exponential Integrators
Integrating Factor Methods as Exponentia Integrators Borisav V. Minchev Department of Mathematica Science, NTNU, 7491 Trondheim, Norway Borko.Minchev@ii.uib.no Abstract. Recenty a ot of effort has been
More informationAppendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model
Appendix of the Paper The Roe of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Mode Caio Ameida cameida@fgv.br José Vicente jose.vaentim@bcb.gov.br June 008 1 Introduction In this
More informationA Comparison Study of the Test for Right Censored and Grouped Data
Communications for Statistica Appications and Methods 2015, Vo. 22, No. 4, 313 320 DOI: http://dx.doi.org/10.5351/csam.2015.22.4.313 Print ISSN 2287-7843 / Onine ISSN 2383-4757 A Comparison Study of the
More informationAALBORG UNIVERSITY. The distribution of communication cost for a mobile service scenario. Jesper Møller and Man Lung Yiu. R June 2009
AALBORG UNIVERSITY The distribution of communication cost for a mobie service scenario by Jesper Møer and Man Lung Yiu R-29-11 June 29 Department of Mathematica Sciences Aaborg University Fredrik Bajers
More informationExpectation-Maximization for Estimating Parameters for a Mixture of Poissons
Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Brandon Maone Department of Computer Science University of Hesini February 18, 2014 Abstract This document derives, in excrutiating
More informationFirst-Order Corrections to Gutzwiller s Trace Formula for Systems with Discrete Symmetries
c 26 Noninear Phenomena in Compex Systems First-Order Corrections to Gutzwier s Trace Formua for Systems with Discrete Symmetries Hoger Cartarius, Jörg Main, and Günter Wunner Institut für Theoretische
More informationDo Schools Matter for High Math Achievement? Evidence from the American Mathematics Competitions Glenn Ellison and Ashley Swanson Online Appendix
VOL. NO. DO SCHOOLS MATTER FOR HIGH MATH ACHIEVEMENT? 43 Do Schoos Matter for High Math Achievement? Evidence from the American Mathematics Competitions Genn Eison and Ashey Swanson Onine Appendix Appendix
More informationUniprocessor Feasibility of Sporadic Tasks with Constrained Deadlines is Strongly conp-complete
Uniprocessor Feasibiity of Sporadic Tasks with Constrained Deadines is Strongy conp-compete Pontus Ekberg and Wang Yi Uppsaa University, Sweden Emai: {pontus.ekberg yi}@it.uu.se Abstract Deciding the feasibiity
More informationA CLUSTERING LAW FOR SOME DISCRETE ORDER STATISTICS
J App Prob 40, 226 241 (2003) Printed in Israe Appied Probabiity Trust 2003 A CLUSTERING LAW FOR SOME DISCRETE ORDER STATISTICS SUNDER SETHURAMAN, Iowa State University Abstract Let X 1,X 2,,X n be a sequence
More informationTHE THREE POINT STEINER PROBLEM ON THE FLAT TORUS: THE MINIMAL LUNE CASE
THE THREE POINT STEINER PROBLEM ON THE FLAT TORUS: THE MINIMAL LUNE CASE KATIE L. MAY AND MELISSA A. MITCHELL Abstract. We show how to identify the minima path network connecting three fixed points on
More informationAlgorithms to solve massively under-defined systems of multivariate quadratic equations
Agorithms to sove massivey under-defined systems of mutivariate quadratic equations Yasufumi Hashimoto Abstract It is we known that the probem to sove a set of randomy chosen mutivariate quadratic equations
More informationA NOTE ON QUASI-STATIONARY DISTRIBUTIONS OF BIRTH-DEATH PROCESSES AND THE SIS LOGISTIC EPIDEMIC
(January 8, 2003) A NOTE ON QUASI-STATIONARY DISTRIBUTIONS OF BIRTH-DEATH PROCESSES AND THE SIS LOGISTIC EPIDEMIC DAMIAN CLANCY, University of Liverpoo PHILIP K. POLLETT, University of Queensand Abstract
More informationMATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES
MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES Separation of variabes is a method to sove certain PDEs which have a warped product structure. First, on R n, a inear PDE of order m is
More informationarxiv: v1 [math.ca] 6 Mar 2017
Indefinite Integras of Spherica Besse Functions MIT-CTP/487 arxiv:703.0648v [math.ca] 6 Mar 07 Joyon K. Boomfied,, Stephen H. P. Face,, and Zander Moss, Center for Theoretica Physics, Laboratory for Nucear
More informationVALIDATED CONTINUATION FOR EQUILIBRIA OF PDES
VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES SARAH DAY, JEAN-PHILIPPE LESSARD, AND KONSTANTIN MISCHAIKOW Abstract. One of the most efficient methods for determining the equiibria of a continuous parameterized
More informationStochastic Complement Analysis of Multi-Server Threshold Queues. with Hysteresis. Abstract
Stochastic Compement Anaysis of Muti-Server Threshod Queues with Hysteresis John C.S. Lui The Dept. of Computer Science & Engineering The Chinese University of Hong Kong Leana Goubchik Dept. of Computer
More informationSmoothness equivalence properties of univariate subdivision schemes and their projection analogues
Numerische Mathematik manuscript No. (wi be inserted by the editor) Smoothness equivaence properties of univariate subdivision schemes and their projection anaogues Phiipp Grohs TU Graz Institute of Geometry
More informationThe Group Structure on a Smooth Tropical Cubic
The Group Structure on a Smooth Tropica Cubic Ethan Lake Apri 20, 2015 Abstract Just as in in cassica agebraic geometry, it is possibe to define a group aw on a smooth tropica cubic curve. In this note,
More informationStatistical Learning Theory: A Primer
Internationa Journa of Computer Vision 38(), 9 3, 2000 c 2000 uwer Academic Pubishers. Manufactured in The Netherands. Statistica Learning Theory: A Primer THEODOROS EVGENIOU, MASSIMILIANO PONTIL AND TOMASO
More informationSymbolic models for nonlinear control systems using approximate bisimulation
Symboic modes for noninear contro systems using approximate bisimuation Giordano Poa, Antoine Girard and Pauo Tabuada Abstract Contro systems are usuay modeed by differentia equations describing how physica
More informationStatistics for Applications. Chapter 7: Regression 1/43
Statistics for Appications Chapter 7: Regression 1/43 Heuristics of the inear regression (1) Consider a coud of i.i.d. random points (X i,y i ),i =1,...,n : 2/43 Heuristics of the inear regression (2)
More informationII. PROBLEM. A. Description. For the space of audio signals
CS229 - Fina Report Speech Recording based Language Recognition (Natura Language) Leopod Cambier - cambier; Matan Leibovich - matane; Cindy Orozco Bohorquez - orozcocc ABSTRACT We construct a rea time
More informationOn the evaluation of saving-consumption plans
On the evauation of saving-consumption pans Steven Vanduffe Jan Dhaene Marc Goovaerts Juy 13, 2004 Abstract Knowedge of the distribution function of the stochasticay compounded vaue of a series of future
More informationNEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION
NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION Hsiao-Chang Chen Dept. of Systems Engineering University of Pennsyvania Phiadephia, PA 904-635, U.S.A. Chun-Hung Chen
More informationC. Fourier Sine Series Overview
12 PHILIP D. LOEWEN C. Fourier Sine Series Overview Let some constant > be given. The symboic form of the FSS Eigenvaue probem combines an ordinary differentia equation (ODE) on the interva (, ) with a
More information4 Separation of Variables
4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE
More informationBALANCING REGULAR MATRIX PENCILS
BALANCING REGULAR MATRIX PENCILS DAMIEN LEMONNIER AND PAUL VAN DOOREN Abstract. In this paper we present a new diagona baancing technique for reguar matrix pencis λb A, which aims at reducing the sensitivity