UvA-DARE (Digital Academic Repository) Recursive unsupervised learning of finite mixture models Zivkovic, Z.; van der Heijden, F.


UvA-DARE (Digital Academic Repository)

Recursive unsupervised learning of finite mixture models
Zivkovic, Z.; van der Heijden, F.

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence

DOI: /TPAMI

Link to publication

Citation for published version (APA): Zivkovic, Z., & van der Heijden, F. (2004). Recursive unsupervised learning of finite mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5). DOI: /TPAMI

General rights: It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/Complaints regulations: If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library, or send a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.

UvA-DARE is a service provided by the library of the University of Amsterdam.

Download date: 19 Jun 2018

Recursive Unsupervised Learning of Finite Mixture Models

Zoran Zivkovic, Member, IEEE Computer Society, and Ferdinand van der Heijden, Member, IEEE Computer Society

Abstract: There are two open problems when finite mixture densities are used to model multivariate data: the selection of the number of components and the initialization. In this paper, we propose an online (recursive) algorithm that estimates the parameters of the mixture and that simultaneously selects the number of components. The new algorithm starts with a large number of randomly initialized components. A prior is used as a bias for maximally structured models. A stochastic approximation recursive learning algorithm is proposed to search for the maximum a posteriori (MAP) solution and to discard the irrelevant components.

Index Terms: Online (recursive) estimation, unsupervised learning, finite mixtures, model selection, EM algorithm.

1 INTRODUCTION

FINITE mixture probability density models have been analyzed many times and used extensively for modeling multivariate data [16], [8]. In [3] and [6], an efficient heuristic was used to simultaneously estimate the parameters of a mixture and select the appropriate number of its components. The idea is to start with a large number of components and introduce a prior to express our preference for compact models. During some iterative search procedure for the MAP solution, the prior drives the irrelevant components to extinction. The entropic prior from [3] leads to a MAP estimate that minimizes the entropy and, hence, leads to a compact model. The Dirichlet prior from [6] gives a solution that is related to model selection using the Minimum Message Length (MML) criterion [20].

This paper is inspired by the aforementioned papers [3], [6]. Our contribution is in developing an online version, which is potentially very useful in many situations since it is highly memory and time efficient. We use a stochastic approximation procedure to estimate the parameters of the mixture recursively. More on the behavior of approximate recursive equations can be found in [13], [5], [15]. We propose a way to include the suggested prior from [6] in the recursive equations. This enables the online selection of the number of components of the mixture. We show that the new algorithm can reach solutions similar to those obtained by batch algorithms.

In Sections 2 and 3 of the paper, we introduce the notation and discuss some standard problems associated with finite mixture fitting. In Section 4, we describe the mentioned heuristic that enables us to estimate the parameters of the mixture and to simultaneously select the number of its components. Further, in Section 5, we develop an online version. The final practical algorithm we used in our experiments is described in Section 6. In Section 7, we demonstrate how the new algorithm performs for a number of standard problems and compare it to some batch algorithms.

Z. Zivkovic is with the Informatics Institute, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands. E-mail: zivkovic@science.uva.nl.

F. van der Heijden is with the Laboratory for Measurement and Instrumentation, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands. E-mail: f.vanderheijden@utwente.nl.

Manuscript received 18 Nov. 2002; revised 24 June 2003; accepted 3 Nov. 2003. Recommended for acceptance by Y. Amit. For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference the IEEECS Log Number.
2 PARAMETER ESTIMATION

A mixture density with M components for a d-dimensional random variable $\vec{x}$ is given by:

$$p(\vec{x}; \theta) = \sum_{m=1}^{M} \pi_m \, p_m(\vec{x}; \theta_m), \quad \text{with} \quad \sum_{m=1}^{M} \pi_m = 1, \qquad (1)$$

where $\theta = \{\pi_1, \ldots, \pi_M, \theta_1, \ldots, \theta_M\}$ are the parameters. The number of parameters depends on the number of components M, and the notation $\theta(M)$ will be used to stress this when needed. The m-th component of the mixture is denoted by $p_m(\vec{x}; \theta_m)$ and $\theta_m$ are its parameters. The mixing weights, denoted by $\pi_m$, are nonnegative and add up to one. Given a set of t data samples $X = \{\vec{x}^{(1)}, \ldots, \vec{x}^{(t)}\}$, the maximum likelihood (ML) estimate of the parameter values is:

$$\hat{\theta} = \arg\max_{\theta} \left( \log p(X; \theta) \right). \qquad (2)$$

The Expectation Maximization (EM) algorithm [4] is commonly used to search for the solution. The EM algorithm is an iterative procedure that searches for a local maximum of the log-likelihood function. In order to apply the EM algorithm, we need to introduce for each $\vec{x}$ a discrete unobserved indicator vector $\vec{y} = [y_1 \ldots y_M]^T$. The indicator vector specifies (by means of position coding) the mixture component from which the observation $\vec{x}$ is drawn. The new joint density function can be written as a product:

$$p(\vec{x}, \vec{y}; \theta) = p(\vec{y}; \pi_1, \ldots, \pi_M) \, p(\vec{x} \mid \vec{y}; \theta_1, \ldots, \theta_M) = \prod_{m=1}^{M} \left( \pi_m \, p_m(\vec{x}; \theta_m) \right)^{y_m},$$

where exactly one of the $y_m$ from $\vec{y}$ is equal to 1 and the others are zero. The indicators $\vec{y}$ have a multinomial distribution defined by the mixing weights $\pi_1, \ldots, \pi_M$. The EM algorithm starts with some initial parameter estimate $\hat{\theta}^{(0)}$. If we denote the set of unobserved data by $Y = \{\vec{y}^{(1)}, \ldots, \vec{y}^{(t)}\}$, the estimate $\hat{\theta}^{(k)}$ from the k-th iteration of the EM algorithm is obtained using the previous estimate $\hat{\theta}^{(k-1)}$:

$$\text{E step:} \quad Q(\theta; \hat{\theta}^{(k-1)}) = E_Y\left( \log p(X, Y; \theta) \mid X, \hat{\theta}^{(k-1)} \right) = \sum_{\text{all possible } Y} p(Y \mid X; \hat{\theta}^{(k-1)}) \log p(X, Y; \theta),$$

$$\text{M step:} \quad \hat{\theta}^{(k)} = \arg\max_{\theta} \left( Q(\theta; \hat{\theta}^{(k-1)}) \right).$$

The attractiveness of the EM algorithm is that it is easy to implement and that it converges to a local maximum of the log-likelihood function. However, one of its serious limitations is that it can end up in a poor local maximum if not properly initialized. The selection of the initial parameter values is still an open question that has been studied many times. Some recent efforts were reported in [3], [6], [17], [18], [19].

3 MODEL SELECTION

Note that, in order to use the EM algorithm, we need to know the appropriate number of components M. Too many components lead to overfitting and too few to underfitting. Choosing an appropriate number of components is important. Sometimes, for example, the appropriate number of components can reveal some important underlying structure that characterizes the data.
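To make the E and M steps concrete, here is a minimal batch-EM sketch for a Gaussian mixture (an illustration in NumPy/SciPy, not code from the paper; the function name and array layout are our own choices):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, pi, mu, C):
    """One EM iteration for a Gaussian mixture.

    X  : (t, d) data matrix
    pi : (M,) mixing weights
    mu : (M, d) component means
    C  : (M, d, d) component covariance matrices
    """
    t, d = X.shape
    M = len(pi)

    # E step: posterior p(y_m = 1 | x) for every sample (the "ownerships").
    resp = np.empty((t, M))
    for m in range(M):
        resp[:, m] = pi[m] * multivariate_normal.pdf(X, mu[m], C[m])
    resp /= resp.sum(axis=1, keepdims=True)

    # M step: re-estimate the parameters from the expected statistics.
    Nm = resp.sum(axis=0)                  # expected samples per component
    pi_new = Nm / t
    mu_new = (resp.T @ X) / Nm[:, None]
    C_new = np.empty_like(C)
    for m in range(M):
        diff = X - mu_new[m]
        C_new[m] = (resp[:, m, None] * diff).T @ diff / Nm[m]
    return pi_new, mu_new, C_new
```

Iterating `em_step` until the log-likelihood stops improving yields a local ML solution; as noted above, the quality of that local maximum depends strongly on the initialization.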

Full Bayesian approaches sample from the full a posteriori distribution with the number of components M considered unknown. This is possible using Markov chain Monte Carlo methods, as reported in [11], [10]. However, these methods are still far too computationally demanding. Most of the practical model selection techniques are based on maximizing the following type of criterion:

$$J(M, \theta(M)) = \log p(X; \theta(M)) - P(M). \qquad (3)$$

Here, $\log p(X; \theta(M))$ is the log-likelihood for the available data. This part can be maximized using the EM algorithm. However, introducing more mixture components always increases the log-likelihood. The balance is achieved by introducing $P(M)$, which penalizes complex solutions. Some examples of such criteria are the Akaike Information Criterion [1], the Bayesian Inference Criterion [14], the Minimum Description Length [12], the Minimum Message Length (MML) [20], etc. For a detailed review see, for example, [8].

4 SOLUTION USING MAP ESTIMATION

The standard procedure for selecting M is the following: find the ML estimate for different values of M and choose the M that maximizes (3). Suppose that we introduce a prior $p(\theta(M))$ for the mixture parameters that penalizes complex solutions in a similar way as $P(M)$ from (3). Instead of (3), we could then use:

$$\log p(X; \theta(M)) + \log p(\theta(M)). \qquad (4)$$

As in [6] and [3], we use the simplest prior choice: a prior only on the mixing weights $\pi_m$. For example, the Dirichlet prior (see [7], chapter 16) for the mixing weights is given by:

$$p(\theta(M)) \propto \exp\left( \sum_{m=1}^{M} c_m \log \pi_m \right) = \prod_{m=1}^{M} \pi_m^{c_m}. \qquad (5)$$

The procedure is then as follows: we start with a large number M of randomly initialized components and search for the MAP solution using some iterative procedure, for example, the EM algorithm. The prior drives the irrelevant components to extinction. In this way, while searching for the MAP solution, the number of components M is reduced until the balance is achieved. It can be shown that the standard MML model selection criterion can be approximated by the Dirichlet prior with the coefficients $c_m$ equal to $-N/2$, where N is the number of parameters per component of the mixture. See [6] for details.

The parameters $c_m$ have a meaningful interpretation. For a multinomial distribution, $c_m$ presents the prior evidence (in the MAP sense) for the class m: the number of samples a priori belonging to that class. Negative prior evidence means that we accept that a class exists only if there is enough evidence from the data for its existence. If there are many parameters per component, we will need many data samples to estimate them. In this sense, the presented linear connection between $c_m$ and N seems very logical.

The procedure from [6] starts with all the $\pi_m$-s equal to 1/M. Although there is no proof of optimality, it seems reasonable to discard a component when its weight becomes negative. This also ensures that the mixing weights stay nonnegative.

The entropic prior from [3] has a similar form: $p(\theta(M)) \propto \exp(-\beta H(\pi_1, \ldots, \pi_M))$, where $H(\pi_1, \ldots, \pi_M) = -\sum_{m=1}^{M} \pi_m \log \pi_m$ is the entropy measure for the underlying multinomial distribution and $\beta$ is a parameter. We use the mentioned Dirichlet prior because it leads to a closed-form solution.

5 RECURSIVE (ONLINE)

For the ML estimate, the following holds: $\frac{\partial}{\partial \hat{\theta}} \log p(X; \hat{\theta}) = 0$. The mixing weights are constrained to sum up to 1. We take this into account by introducing the Lagrange multiplier $\lambda$:

$$\frac{\partial}{\partial \hat{\pi}_m} \left( \log p(X; \hat{\theta}) + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0.$$
From here, after getting rid of $\lambda$, it follows that the ML estimate for t data samples should satisfy:

$$\hat{\pi}_m = \frac{1}{t} \sum_{i=1}^{t} o_m(\vec{x}^{(i)}), \qquad (6)$$

with the ownerships defined as $o_m(\vec{x}) = \hat{\pi}_m \, p_m(\vec{x}; \hat{\theta}_m) / p(\vec{x}; \hat{\theta})$.

Similarly, for the MAP solution we solve $\frac{\partial}{\partial \hat{\pi}_m} \left( \log p(X; \hat{\theta}) + \log p(\hat{\theta}) + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0$, where $p(\hat{\theta})$ is the mentioned Dirichlet prior (5). For t data samples, we get:

$$\hat{\pi}_m = \frac{1}{K} \left( \sum_{i=1}^{t} o_m(\vec{x}^{(i)}) - c \right), \qquad (7)$$

where $K = \sum_{m=1}^{M} \left( \sum_{i=1}^{t} o_m(\vec{x}^{(i)}) - c \right) = t - Mc$ (since $\sum_{m=1}^{M} o_m = 1$). The parameters of the prior are $c_m = -c$ (and $c = N/2$ as mentioned before). We rewrite (7) as:

$$\hat{\pi}_m = \frac{\hat{\Pi}_m - c/t}{1 - Mc/t}, \qquad (8)$$

where $\hat{\Pi}_m = \frac{1}{t} \sum_{i=1}^{t} o_m(\vec{x}^{(i)})$ is the mentioned ML estimate and the bias from the prior is introduced through $c/t$. The bias decreases for larger data sets (larger t). However, if a small bias is acceptable, we can keep it constant by fixing $c/t$ to $c_T = c/T$ with some large T. This means that the bias will always be the same as it would have been for a data set with T samples.

If we assume that the parameter estimates do not change much when a new sample $\vec{x}^{(t+1)}$ is added and, therefore, $o_m^{(t+1)}(\vec{x}^{(i)})$ can be approximated by $o_m^{(t)}(\vec{x}^{(i)})$, which uses the previous parameter estimates, we get the following well-behaved and easy-to-use recursive update equation:

$$\hat{\pi}_m^{(t+1)} = \hat{\pi}_m^{(t)} + (1+t)^{-1} \left( \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{1 - Mc_T} - \hat{\pi}_m^{(t)} \right) - (1+t)^{-1} \frac{c_T}{1 - Mc_T}. \qquad (9)$$

Here, T should be sufficiently large to make sure that $Mc_T < 1$. We start with initial $\hat{\pi}_m^{(0)} = 1/M$ and discard the m-th component when $\hat{\pi}_m^{(t+1)} < 0$.

Note that the straightforward recursive version of (7), given by $\hat{\pi}_m^{(t+1)} = \hat{\pi}_m^{(t)} + (1 + t - Mc)^{-1} \left( o_m^{(t)}(\vec{x}^{(t+1)}) - \hat{\pi}_m^{(t)} \right)$, is not very useful. For small t (when $1 + t - Mc < 0$), the update is negative and the weights of the components with high $o_m(\vec{x}^{(t+1)})$ are decreased instead of increased. In order to avoid the negative update, we could start with a larger value for t, but then we cancel out the influence of the prior. This motivates the important choice we made to fix the influence of the prior.

The most commonly used mixture is the Gaussian mixture. A mixture component $p_m(\vec{x}; \theta_m) = N(\vec{x}; \vec{\mu}_m, C_m)$ has its mean $\vec{\mu}_m$ and its covariance matrix $C_m$ as parameters. The prior has influence only on the mixing weights, and for the rest of the parameters we can use the recursive equations from [15]:

$$\hat{\vec{\mu}}_m^{(t+1)} = \hat{\vec{\mu}}_m^{(t)} + (t+1)^{-1} \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{\hat{\pi}_m^{(t)}} \left( \vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)} \right), \qquad (10)$$

$$\hat{C}_m^{(t+1)} = \hat{C}_m^{(t)} + (t+1)^{-1} \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{\hat{\pi}_m^{(t)}} \left( (\vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)})(\vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)})^T - \hat{C}_m^{(t)} \right). \qquad (11)$$
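As an illustration of the weight update (9) together with the discarding rule, the following is a sketch under our own naming, not the authors' code; the ownerships are assumed to be computed from the current mixture as defined above:

```python
import numpy as np

def update_weights(pi, o, t, c_T):
    """One application of the MAP weight update (9).

    pi  : (M,) current weight estimates pi_m^(t), summing to one
    o   : (M,) ownerships o_m(x^(t+1)) of the new sample
    t   : number of samples processed so far
    c_T : fixed prior bias c/T; T must be large enough that M * c_T < 1
    """
    M = len(pi)
    gain = 1.0 / (1.0 + t)
    pi = pi + gain * (o / (1.0 - M * c_T) - pi) - gain * c_T / (1.0 - M * c_T)

    # Discard components driven to negative weight by the prior and
    # renormalize the survivors (the rule adopted in Section 6).
    keep = pi >= 0
    pi = pi[keep] / pi[keep].sum()
    return pi, keep
```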

6 A SIMPLE PRACTICAL ALGORITHM

For an online procedure, it is reasonable to fix the influence of the new samples by replacing the term $(1+t)^{-1}$ from the recursive update equations (9), (10), and (11) by $\alpha = 1/T$. There are also some practical reasons for using a fixed small constant. It reduces the problems with instability of the equations for small t. Furthermore, a fixed $\alpha$ helps in forgetting the out-of-date statistics (random initialization and component deletion) more rapidly. It is equivalent to introducing an exponentially decaying envelope: a weight $(1-\alpha)^{t-i}$ is applied to the influence of the sample $\vec{x}^{(i)}$.

For the sake of clarity, we present here the whole algorithm we used in our experiments. We start with a large number of components M and with a random initialization of the parameters (see the next section for an example). We have $c_T = \alpha N/2$. Furthermore, we use Gaussian mixture components with full covariance matrices. Therefore, if the data is d-dimensional, we have $N = d + d(d+1)/2$ (the number of parameters of a Gaussian with a full covariance matrix). The online algorithm is then given by:

- Input: new data sample $\vec{x}^{(t+1)}$, current parameter estimates $\hat{\theta}^{(t)}$.
- Calculate the ownerships: $o_m(\vec{x}^{(t+1)}) = \hat{\pi}_m \, p_m(\vec{x}^{(t+1)}; \hat{\theta}_m) / p(\vec{x}^{(t+1)}; \hat{\theta})$.
- Update the mixing weights: $\hat{\pi}_m^{(t+1)} = \hat{\pi}_m + \alpha \left( \frac{o_m(\vec{x}^{(t+1)})}{1 - Mc_T} - \hat{\pi}_m \right) - \alpha \frac{c_T}{1 - Mc_T}$.
- Check if there are irrelevant components: if $\hat{\pi}_m^{(t+1)} < 0$, discard the component m, set $M = M - 1$, and renormalize the remaining mixing weights.
- Update the rest of the parameters: $\hat{\vec{\mu}}_m^{(t+1)} = \hat{\vec{\mu}}_m + w \vec{\delta}$ and $\hat{C}_m^{(t+1)} = \hat{C}_m + w \left( \vec{\delta}\vec{\delta}^T - \hat{C}_m \right)$, where $w = \alpha \, o_m(\vec{x}^{(t+1)}) / \hat{\pi}_m^{(t+1)}$ and $\vec{\delta} = \vec{x}^{(t+1)} - \hat{\vec{\mu}}_m$ (tip: limit the update speed, $w = \min(20\alpha, w)$).
- Output: new parameter estimates $\hat{\theta}^{(t+1)}$.

This simple algorithm can be implemented in only a few lines of code (see the sketch after this section's initialization details). The recommended upper limit $20\alpha$ for w simply means that the updating speed is limited for the covariance matrices of the components representing less than 5 percent of the data. This was necessary since $\vec{\delta}\vec{\delta}^T$ is a singular matrix and the covariance matrices may become singular if updated too fast.

7 EXPERIMENTS

In this section, we demonstrate the algorithm performance on a few standard problems. We show summary results from 100 trials for each data set. For the real-world data sets, we randomly sample from the data to generate the longer sequences needed for our sequential algorithm. First, for each of the problems, we present in Fig. 1 how the selected number of components of the mixture changes as new samples are sequentially added. The number of components that was finally selected is presented in the form of a histogram over the 100 trials. In Fig. 2, we present a comparison with some batch algorithms and study the influence of the parameter $\alpha$.

The random initialization of the parameters is the same as in [6]. The means $\hat{\vec{\mu}}_m^{(0)}$ of the mixture components are initialized by some randomly chosen data points. The initial covariance matrices are a fraction (1/10 here) of the mean global diagonal covariance matrix:

$$\hat{C}_m^{(0)} = \frac{1}{10d} \operatorname{trace}\left( \frac{1}{n} \sum_{i=1}^{n} (\vec{x}^{(i)} - \hat{\vec{\mu}})(\vec{x}^{(i)} - \hat{\vec{\mu}})^T \right) I,$$

where $\hat{\vec{\mu}} = \frac{1}{n} \sum_{i=1}^{n} \vec{x}^{(i)}$ is the global mean of the data and I is the identity matrix of proper dimensions. We used the first n = 100 samples (it is also possible to estimate this initial covariance matrix recursively). Finally, we set the initial mixing weights to $\hat{\pi}_m^{(0)} = 1/M$. The initial number of components M should be large enough so that the initialization reasonably covers the data. We used here the same initial number of components as in [6].
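Putting the steps together, the following is a compact sketch of the whole online procedure as we read it (NumPy-based; the class name, the density-evaluation details, and the default parameter values are our own choices, not the paper's):

```python
import numpy as np

class RecursiveMixture:
    """Online MAP mixture estimation (a sketch of the Section 6 algorithm).

    A simplified reading of the paper, not the authors' implementation:
    full-covariance Gaussian components, fixed learning rate alpha = 1/T.
    """

    def __init__(self, X_init, M=30, alpha=1.0 / 150):
        n, d = X_init.shape
        self.alpha = alpha
        N = d + d * (d + 1) / 2            # parameters per component
        self.c_T = alpha * N / 2           # fixed prior bias
        # Random initialization as in Section 7: means at random data
        # points, covariances a fraction of the global covariance.
        idx = np.random.choice(n, M, replace=False)
        self.mu = X_init[idx].astype(float).copy()
        gmean = X_init.mean(axis=0)
        var = np.trace((X_init - gmean).T @ (X_init - gmean) / n) / (10 * d)
        self.C = np.tile(var * np.eye(d), (M, 1, 1))
        self.pi = np.full(M, 1.0 / M)

    def _ownerships(self, x):
        M, d = self.mu.shape
        p = np.empty(M)
        for m in range(M):
            diff = x - self.mu[m]
            inv = np.linalg.inv(self.C[m])
            det = np.linalg.det(self.C[m])
            p[m] = self.pi[m] * np.exp(-0.5 * diff @ inv @ diff) \
                / np.sqrt((2 * np.pi) ** d * det)
        return p / p.sum()

    def update(self, x):
        o = self._ownerships(x)
        M = len(self.pi)
        denom = 1.0 - M * self.c_T
        self.pi += self.alpha * (o / denom - self.pi) \
            - self.alpha * self.c_T / denom
        # Discard irrelevant components and renormalize the weights.
        keep = self.pi >= 0
        self.pi, self.mu, self.C, o = (a[keep] for a in
                                       (self.pi, self.mu, self.C, o))
        self.pi /= self.pi.sum()
        # Update means and covariances; cap the update speed at 20 * alpha.
        w = np.minimum(self.alpha * o / self.pi, 20 * self.alpha)
        for m in range(len(self.pi)):
            delta = x - self.mu[m]
            self.mu[m] += w[m] * delta
            self.C[m] += w[m] * (np.outer(delta, delta) - self.C[m])
```

Streaming the data through `update` one sample at a time then yields both the parameter estimates and, through the discarding rule, the selected number of components M.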
7.1 The Three Gaussians Data Set

First, we analyze a Gaussian mixture with mixing weights $\pi_1 = \pi_2 = \pi_3 = 1/3$, means $\vec{\mu}_1 = [0\ {-2}]^T$, $\vec{\mu}_2 = [0\ 0]^T$, $\vec{\mu}_3 = [0\ 2]^T$, and covariance matrices

$$C_1 = C_2 = C_3 = \begin{bmatrix} 2 & 0 \\ 0 & 0.2 \end{bmatrix}.$$

A modified version of the EM called DAEM from [17] was able to find the correct solution from a bad initialization. For a data set with 900 samples, it needed more than 200 iterations to get close to the solution. Here, we start with M = 30 mixture components. With random initialization, we performed 100 trials and the new algorithm was always able to find the correct solution while simultaneously estimating the parameters of the mixture and selecting the number of components. A similar batch algorithm from [6] needs about 200 iterations to identify the three components (on a data set with 900 samples). From the plot in Fig. 1, we see that after 9,000 samples the new algorithm is usually already able to identify the three components. The computation costs for 9,000 samples are approximately the same as for only 10 iterations of the EM algorithm on a data set with 900 samples. Consequently, for this data set, the new algorithm is about 20 times faster in finding a similar solution (a typical solution is presented in Fig. 1 by the $2\sigma$ contours of the Gaussian components). In [9], some approximate recursive versions of the EM algorithm were compared to the standard EM algorithm and it was shown that the recursive versions are usually faster. This is in correspondence with our results. Empirically, we decided that 50 samples per class are enough and used $\alpha = 1/150$.

7.2 The Iris Data Set

We disregard the class information from the well-known 3-class, 4-dimensional Iris data set [2]. In the 100 trials, the clusters were properly identified 81 times. This shows that the order in which the data is presented can influence the recursive solution. The data set has only 150 samples (50 per class), which were repeated many times. We expect that the algorithm would perform better with more data samples. We used $\alpha = 1/150$. The typical solution in Fig. 1 is presented by projecting the 4-dimensional data onto the first two principal components.

7.3 The Shrinking Spiral Data Set

This data set presents a 1-dimensional manifold (a "shrinking spiral") in three dimensions with added noise: $\vec{x} = [(13 - 0.5t)\cos t \quad (0.5t - 13)\sin t \quad t]^T + \vec{n}$, with $t \sim \text{Uniform}[0, 4\pi]$ and the noise $\vec{n} \sim N(0, I)$. The modified EM called SMEM from [18] was reported to be able to fit a 10-component mixture in about 350 iterations. The batch algorithm from [6] fits the mixture and selects 11, 12, or 13 components, using typically 300 to 400 iterations on a 900-sample data set. From the graph in Fig. 1, it is clear that we achieve similar results, but much faster. About 18,000 samples were enough to arrive at a similar solution. Consequently, again, the new algorithm is about 20 times faster. There are no clusters in this data set. The fixed $\alpha$ has the effect that the influence of the old data is downweighted by the exponentially decaying envelope $(1-\alpha)^{t-k}$ (for $k < t$). For comparison with the other algorithms that used 900 samples, we limited the influence of the older samples to 5 percent of the influence of the current sample by $\alpha = -\log(0.05)/900$.
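For reference, the two synthetic data sets can be generated as follows (our own sketch of the definitions above; the seed and function names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def three_gaussians(n):
    """Sample n points from the Three Gaussians mixture of Section 7.1."""
    means = np.array([[0.0, -2.0], [0.0, 0.0], [0.0, 2.0]])
    cov = np.diag([2.0, 0.2])
    labels = rng.integers(0, 3, size=n)          # equal mixing weights 1/3
    return rng.multivariate_normal(np.zeros(2), cov, size=n) + means[labels]

def shrinking_spiral(n):
    """Sample n noisy points on the shrinking spiral of Section 7.3."""
    t = rng.uniform(0.0, 4.0 * np.pi, size=n)
    clean = np.stack([(13 - 0.5 * t) * np.cos(t),
                      (0.5 * t - 13) * np.sin(t),
                      t], axis=1)
    return clean + rng.standard_normal((n, 3))   # additive N(0, I) noise
```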

In Fig. 1, we present a typical solution by showing for each component the eigenvector corresponding to the largest eigenvalue of the covariance matrix.

Fig. 1. Model selection results for a few standard problems (summary from 100 trials).

7.4 The Enzyme Data Set

The 1-dimensional Enzyme data set has 245 data samples. It was shown in [11], using MCMC, that the number of components supported by the data is most likely four, but two and three are also good choices. Our algorithm arrived at similar solutions. In a similar way as before, we used $\alpha = -\log(0.05)/245$.

7.5 Comparison with Some Batch Algorithms

The following standard batch methods were considered for comparison: the EM algorithm initialized using the result from k-means clustering; the SMEM method [18]; and the greedy EM method [19], which starts with a single component and adds new ones, reported to be faster than the more elaborate SMEM. We used 900 samples for the Three Gaussians and the Shrinking Spiral data sets. The batch algorithms assume a known number of components: three for the Three Gaussians and the Iris data, 13 for the Shrinking Spiral, and four for the Enzyme data set. Our new unsupervised recursive algorithm, RUEM, selected on average approximately the same number of components for the chosen $\alpha$. All the iterative batch algorithms in our experiments stop when the change in the log-likelihood drops below a fixed small threshold. The results are presented in Fig. 2a. The best likelihood and the lowest standard deviation are reported in bold. We also added the ideal ML result obtained using a carefully initialized EM. For the Iris data, the EM was initialized using the means and the covariances of the three classes. However, the solution where the two close clusters are modeled using one component was better in terms of likelihood. This wrong solution was found occasionally by some of the algorithms. The results from the RUEM are biased. Furthermore, the parameter $\alpha$ controls the speed of updating the parameters and, therefore, also the effective amount of data that is considered. Therefore, we also present results polished by additionally applying the EM algorithm, using the same sample size as for the batch algorithms. The RUEM results and the polished results are better than or similar to the batch results. We also observe that the greedy EM algorithm has problems with the Iris and the Shrinking Spiral data.

7.6 The Influence of the Parameter $\alpha$

In Figs. 2b and 2c, we show the influence of the parameter $\alpha$ on the selected number of components. We also plot the log-likelihood per sample for different values of $\alpha$.

Fig. 2. Comparison with some standard batch methods and some experiments to study the influence of the parameter $\alpha$ (summary from 100 trials). (a) The mean and the standard deviation (between the brackets) of the log-likelihood over the number of samples, calculated on new test data for the synthetic data sets. (b) The Three Gaussians data set: the influence of $\alpha$. (c) The Shrinking Spiral data set: the influence of $\alpha$.

For the Three Gaussians data set, there is a range of values of $\alpha$ for which the same number of components is finally selected. We can expect similar results for other data sets where the clusters are well described by the mixture components and the components are well separated. For the Shrinking Spiral data set, there are no clear clusters and the number of selected components slowly declines for larger $\alpha$. Similarly, the log-likelihood also decreases with $\alpha$. For comparison, we also plotted some log-likelihood values from some batch algorithms (see the previous section). The new unsupervised procedure simultaneously estimates the parameters and selects a compact model. We observe from the log-likelihood values that, for a wide range of values of $\alpha$, we get a good representation of the data with a compact model. The graphs for the real-world data sets Iris and Enzyme are not included since they look similar to the graphs for the Shrinking Spiral data.

8 DISCUSSION AND CONCLUSIONS

We have proposed an online method for fitting mixture models which relies on a description-length-reducing prior and a MAP estimation procedure for selecting a compact model. The experimental results indicated that the recursive algorithm is able to solve difficult problems and to obtain solutions similar to those of other elaborate batch algorithms. However, the theoretical support for the finally selected number of components is questionable. Some arguments in favor of the entropic prior and its connections to other model selection criteria are given in [3]. The Dirichlet prior we used is related to the well-founded MML principle, but it can perhaps be better viewed as an efficient heuristic.

Therefore, if selecting the correct model is critical, we suggest, as in the much slower batch version [6], performing an additional check with some standard model selection criterion (the full MML, for example). An additional problem when compared to the batch version [6] is the introduced parameter $\alpha$ that balances the influence of the data against the influence of the prior. This is similar to the parameter of the entropic prior in [3]. Some experiments were performed to show the influence of this parameter. The parameter $\alpha = 1/T$ is related to the number of data samples T that are considered, and some heuristic choices were used in the previous section. If selecting the correct number of components is not critical, the new recursive procedure is highly time and memory efficient and potentially very useful for giving a quick, up-to-date, compact description of the data.

ACKNOWLEDGMENTS

This work was done while Z. Zivkovic was with the Laboratory for Measurement and Instrumentation, University of Twente, Enschede, The Netherlands.

REFERENCES

[1] H. Akaike, "A New Look at the Statistical Model Identification," IEEE Trans. Automatic Control, vol. 19, no. 6, pp. 716-723, 1974.
[2] E. Anderson, "The Irises of the Gaspe Peninsula," Bull. of the Am. Iris Soc., vol. 59, pp. 2-5, 1935.
[3] M.E. Brand, "Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction," Neural Computation J., vol. 11, no. 5, pp. 1155-1182, 1999.
[4] A.P. Dempster, N. Laird, and D.B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," J. Royal Statistical Soc., Series B (Methodological), vol. 39, no. 1, pp. 1-38, 1977.
[5] V. Fabian, "On Asymptotically Efficient Recursive Estimation," Annals of Statistics, vol. 6, pp. 854-866, 1978.
[6] M. Figueiredo and A.K. Jain, "Unsupervised Learning of Finite Mixture Models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 381-396, Mar. 2002.
[7] A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin, Bayesian Data Analysis. Chapman and Hall, 1995.
[8] G. McLachlan and D. Peel, Finite Mixture Models. John Wiley and Sons, 2000.
[9] R.M. Neal and G.E. Hinton, "A New View of the EM Algorithm that Justifies Incremental, Sparse and Other Variants," Learning in Graphical Models, pp. 355-368, M.I. Jordan, ed., 1998.
[10] C. Rasmussen, "The Infinite Gaussian Mixture Model," Advances in Neural Information Processing Systems, vol. 12, pp. 554-560, 2000.
[11] S. Richardson and P. Green, "On Bayesian Analysis of Mixture Models with Unknown Number of Components," J. Royal Statistical Soc., Series B (Methodological), vol. 59, no. 4, pp. 731-792, 1997.
[12] J. Rissanen, "Stochastic Complexity," J. Royal Statistical Soc., Series B (Methodological), vol. 49, no. 3, pp. 223-239, 1987.
[13] J. Sacks, "Asymptotic Distribution of Stochastic Approximation Procedures," Annals of Math. Statistics, vol. 29, pp. 373-405, 1958.
[14] G. Schwarz, "Estimating the Dimension of a Model," Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.
[15] D.M. Titterington, "Recursive Parameter Estimation Using Incomplete Data," J. Royal Statistical Soc., Series B (Methodological), vol. 46, no. 2, pp. 257-267, 1984.
[16] D.M. Titterington, A.F.M. Smith, and U.E. Makov, Statistical Analysis of Finite Mixture Distributions. John Wiley and Sons, 1985.
[17] N. Ueda and R. Nakano, "Deterministic Annealing EM Algorithm," Neural Networks, vol. 11, pp. 271-282, 1998.
[18] N. Ueda, R. Nakano, Z. Ghahramani, and G.E. Hinton, "SMEM Algorithm for Mixture Models," Neural Computation, vol. 12, no. 9, pp. 2109-2128, 2000.
[19] J.J. Verbeek, N. Vlassis, and B. Krose, "Efficient Greedy Learning of Gaussian Mixture Models," Neural Computation, vol. 15, no. 1, 2003.
[20] C. Wallace and P. Freeman, "Estimation and Inference by Compact Coding," J. Royal Statistical Soc., Series B (Methodological), vol. 49, no. 3, pp. 240-265, 1987.


More information

Use of PSO in Parameter Estimation of Robot Dynamics; Part One: No Need for Parameterization

Use of PSO in Parameter Estimation of Robot Dynamics; Part One: No Need for Parameterization Use of PSO in Paraeter Estiation of Robot Dynaics; Part One: No Need for Paraeterization Hossein Jahandideh, Mehrzad Navar Abstract Offline procedures for estiating paraeters of robot dynaics are practically

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials Fast Montgoery-like Square Root Coputation over GF( ) for All Trinoials Yin Li a, Yu Zhang a, a Departent of Coputer Science and Technology, Xinyang Noral University, Henan, P.R.China Abstract This letter

More information

A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words)

A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words) 1 A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine (1900 words) Contact: Jerry Farlow Dept of Matheatics Univeristy of Maine Orono, ME 04469 Tel (07) 866-3540 Eail: farlow@ath.uaine.edu

More information

Supervised Baysian SAR image Classification Using The Full Polarimetric Data

Supervised Baysian SAR image Classification Using The Full Polarimetric Data Supervised Baysian SAR iage Classification Using The Full Polarietric Data (1) () Ziad BELHADJ (1) SUPCOM, Route de Raoued 3.5 083 El Ghazala - TUNSA () ENT, BP. 37, 100 Tunis Belvedere, TUNSA Abstract

More information

Training an RBM: Contrastive Divergence. Sargur N. Srihari

Training an RBM: Contrastive Divergence. Sargur N. Srihari Training an RBM: Contrastive Divergence Sargur N. srihari@cedar.buffalo.edu Topics in Partition Function Definition of Partition Function 1. The log-likelihood gradient 2. Stochastic axiu likelihood and

More information

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS ISSN 1440-771X AUSTRALIA DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS An Iproved Method for Bandwidth Selection When Estiating ROC Curves Peter G Hall and Rob J Hyndan Working Paper 11/00 An iproved

More information

An Improved Particle Filter with Applications in Ballistic Target Tracking

An Improved Particle Filter with Applications in Ballistic Target Tracking Sensors & ransducers Vol. 72 Issue 6 June 204 pp. 96-20 Sensors & ransducers 204 by IFSA Publishing S. L. http://www.sensorsportal.co An Iproved Particle Filter with Applications in Ballistic arget racing

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

Ştefan ŞTEFĂNESCU * is the minimum global value for the function h (x)

Ştefan ŞTEFĂNESCU * is the minimum global value for the function h (x) 7Applying Nelder Mead s Optiization Algorith APPLYING NELDER MEAD S OPTIMIZATION ALGORITHM FOR MULTIPLE GLOBAL MINIMA Abstract Ştefan ŞTEFĂNESCU * The iterative deterinistic optiization ethod could not

More information

Spine Fin Efficiency A Three Sided Pyramidal Fin of Equilateral Triangular Cross-Sectional Area

Spine Fin Efficiency A Three Sided Pyramidal Fin of Equilateral Triangular Cross-Sectional Area Proceedings of the 006 WSEAS/IASME International Conference on Heat and Mass Transfer, Miai, Florida, USA, January 18-0, 006 (pp13-18) Spine Fin Efficiency A Three Sided Pyraidal Fin of Equilateral Triangular

More information

arxiv: v1 [cs.ds] 29 Jan 2012

arxiv: v1 [cs.ds] 29 Jan 2012 A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore

More information