UvA-DARE (Digital Academic Repository) Recursive unsupervised learning of finite mixture models Zivkovic, Z.; van der Heijden, F.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence

Citation for published version (APA):
Zivkovic, Z., & van der Heijden, F. (2004). Recursive unsupervised learning of finite mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 651-656.

UvA-DARE is a service provided by the library of the University of Amsterdam. Download date: 19 Jun 2018.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 5, MAY 2004

Recursive Unsupervised Learning of Finite Mixture Models

Zoran Zivkovic, Member, IEEE Computer Society, and Ferdinand van der Heijden, Member, IEEE Computer Society

Abstract: There are two open problems when finite mixture densities are used to model multivariate data: the selection of the number of components and the initialization. In this paper, we propose an online (recursive) algorithm that estimates the parameters of the mixture and that simultaneously selects the number of components. The new algorithm starts with a large number of randomly initialized components. A prior is used as a bias for maximally structured models. A stochastic approximation recursive learning algorithm is proposed to search for the maximum a posteriori (MAP) solution and to discard the irrelevant components.

Index Terms: Online (recursive) estimation, unsupervised learning, finite mixtures, model selection, EM-algorithm.

1 INTRODUCTION

Finite mixture probability density models have been analyzed many times and used extensively for modeling multivariate data [16], [8]. In [3] and [6], an efficient heuristic was used to simultaneously estimate the parameters of a mixture and select the appropriate number of its components. The idea is to start with a large number of components and introduce a prior to express our preference for compact models. During some iterative search procedure for the MAP solution, the prior drives the irrelevant components to extinction. The entropic prior from [3] leads to a MAP estimate that minimizes the entropy and, hence, leads to a compact model. The Dirichlet prior from [6] gives a solution that is related to model selection using the Minimum Message Length (MML) criterion [20]. This paper is inspired by the aforementioned papers [3], [6]. Our contribution is in developing an online version, which is potentially very useful in many situations since it is highly memory and time efficient.
We use a stochastic approximation procedure to estimate the parameters of the mixture recursively. More on the behavior of approximate recursive equations can be found in [13], [5], [15]. We propose a way to include the suggested prior from [6] in the recursive equations. This enables the online selection of the number of components of the mixture. We show that the new algorithm can reach solutions similar to those obtained by batch algorithms.

In Sections 2 and 3 of the paper, we introduce the notation and discuss some standard problems associated with finite mixture fitting. In Section 4, we describe the mentioned heuristic that enables us to estimate the parameters of the mixture and to simultaneously select the number of its components. Further, in Section 5, we develop an online version. The final practical algorithm we used in our experiments is described in Section 6. In Section 7, we demonstrate how the new algorithm performs for a number of standard problems and compare it to some batch algorithms.

. Z. Zivkovic is with the Informatics Institute, University of Amsterdam, Kruislaan 403, 1098SJ Amsterdam, The Netherlands. E-mail: zivkovic@science.uva.nl.
. F. van der Heijden is with the Laboratory for Measurement and Instrumentation, University of Twente, PO Box 217, 7500AE Enschede, The Netherlands. E-mail: f.vanderheijden@utwente.nl.

Manuscript received 18 Nov. 2002; revised 24 June 2003; accepted 3 Nov. 2003. Recommended for acceptance by Y. Amit.

2 PARAMETER ESTIMATION

A mixture density with $M$ components for a $d$-dimensional random variable $\vec{x}$ is given by:

$p(\vec{x}; \vec{\theta}) = \sum_{m=1}^{M} \pi_m\, p_m(\vec{x}; \vec{\theta}_m), \quad \text{with} \quad \sum_{m=1}^{M} \pi_m = 1, \qquad (1)$

where $\vec{\theta} = \{\pi_1, .., \pi_M, \vec{\theta}_1, .., \vec{\theta}_M\}$ are the parameters. The number of parameters depends on the number of components $M$ and the notation $\vec{\theta}(M)$ will be used to stress this when needed. The $m$th component of the mixture is denoted by $p_m(\vec{x}; \vec{\theta}_m)$ and $\vec{\theta}_m$ are its parameters.
The mixing weights, denoted by $\pi_m$, are nonnegative and add up to one. Given a set of $t$ data samples $\mathcal{X} = \{\vec{x}^{(1)}, ..., \vec{x}^{(t)}\}$, the maximum likelihood (ML) estimate of the parameter values is:

$\hat{\vec{\theta}} = \arg\max_{\vec{\theta}} \left( \log p(\mathcal{X}; \vec{\theta}) \right). \qquad (2)$

The Expectation Maximization (EM) algorithm [4] is commonly used to search for the solution. The EM algorithm is an iterative procedure that searches for a local maximum of the log-likelihood function. In order to apply the EM algorithm, we need to introduce for each $\vec{x}$ a discrete unobserved indicator vector $\vec{y} = [y_1 \,...\, y_M]^T$. The indicator vector specifies (by means of position coding) the mixture component from which the observation $\vec{x}$ is drawn. The new joint density function can be written as a product:

$p(\vec{x}, \vec{y}; \vec{\theta}) = p(\vec{y}; \pi_1, .., \pi_M)\, p(\vec{x}\,|\,\vec{y}; \vec{\theta}_1, .., \vec{\theta}_M) = \prod_{m=1}^{M} \pi_m^{y_m}\, p_m(\vec{x}; \vec{\theta}_m)^{y_m},$

where exactly one of the $y_m$ from $\vec{y}$ can be equal to 1 and the others are zero. The indicators $\vec{y}$ have a multinomial distribution defined by the mixing weights $\pi_1, .., \pi_M$. The EM algorithm starts with some initial parameter estimate $\hat{\vec{\theta}}^{(0)}$. If we denote the set of unobserved data by $\mathcal{Y} = \{\vec{y}^{(1)}, ..., \vec{y}^{(t)}\}$, the estimate $\hat{\vec{\theta}}^{(k)}$ from the $k$th iteration of the EM algorithm is obtained using the previous estimate $\hat{\vec{\theta}}^{(k-1)}$:

E step: $\; Q(\vec{\theta}; \hat{\vec{\theta}}^{(k-1)}) = E_{\mathcal{Y}}\!\left( \log p(\mathcal{X}, \mathcal{Y}; \vec{\theta}) \,|\, \mathcal{X}, \hat{\vec{\theta}}^{(k-1)} \right) = \sum_{\text{all possible } \mathcal{Y}} p(\mathcal{Y}\,|\,\mathcal{X}; \hat{\vec{\theta}}^{(k-1)}) \log p(\mathcal{X}, \mathcal{Y}; \vec{\theta}),$

M step: $\; \hat{\vec{\theta}}^{(k)} = \arg\max_{\vec{\theta}} \left( Q(\vec{\theta}; \hat{\vec{\theta}}^{(k-1)}) \right).$

The attractiveness of the EM algorithm is that it is easy to implement and it converges to a local maximum of the log-likelihood function. However, one of the serious limitations of the EM algorithm is that it can end up in a poor local maximum if not properly initialized. The selection of the initial parameter values is still an open question that has been studied many times. Some recent efforts were reported in [3], [6], [17], [18], [19].

3 MODEL SELECTION

Note that, in order to use the EM algorithm, we need to know the appropriate number of components $M$. Too many components lead to overfitting and too few to underfitting.
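To make the E and M steps of Section 2 concrete, here is a minimal batch EM sketch for a Gaussian mixture. This is our own illustrative code, not the authors' implementation; the small regularization term that keeps the covariances invertible is our addition:

```python
import numpy as np

def em_gmm(X, M, n_iter=100, seed=0):
    """Minimal batch EM for a Gaussian mixture (illustration only)."""
    rng = np.random.default_rng(seed)
    t, d = X.shape
    pi = np.full(M, 1.0 / M)                  # mixing weights
    mu = X[rng.choice(t, M, replace=False)].astype(float)  # means at random data points
    C = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(M)])

    for _ in range(n_iter):
        # E step: ownerships o_m(x) = pi_m p_m(x) / p(x)
        resp = np.empty((t, M))
        for m in range(M):
            diff = X - mu[m]
            inv = np.linalg.inv(C[m])
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(C[m]))
            resp[:, m] = pi[m] * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1)) / norm
        resp = resp + 1e-300                  # guard against all-zero rows
        resp /= resp.sum(axis=1, keepdims=True)

        # M step: re-estimate weights, means, covariances from the ownerships
        Nm = resp.sum(axis=0)
        pi = Nm / t
        mu = (resp.T @ X) / Nm[:, None]
        for m in range(M):
            diff = X - mu[m]
            C[m] = (resp[:, m, None] * diff).T @ diff / Nm[m] + 1e-6 * np.eye(d)
    return pi, mu, C
```

With $M = 1$ the E step is trivial and a single M step already returns the sample mean and covariance, which is a convenient sanity check.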
Choosing an appropriate number of components is important. Sometimes, for example, the appropriate number of components can reveal some important underlying structure that characterizes the data.
Full Bayesian approaches sample from the full a posteriori distribution with the number of components $M$ considered unknown. This is possible using Markov chain Monte Carlo methods as reported in [11], [10]. However, these methods are still far too computationally demanding. Most of the practical model selection techniques are based on maximizing the following type of criterion:

$J(M, \vec{\theta}(M)) = \log p(\mathcal{X}; \vec{\theta}(M)) - P(M). \qquad (3)$

Here, $\log p(\mathcal{X}; \vec{\theta}(M))$ is the log-likelihood for the available data. This part can be maximized using the EM. However, introducing more mixture components always increases the log-likelihood. The balance is achieved by introducing $P(M)$, which penalizes complex solutions. Some examples of such criteria are the Akaike Information Criterion [1], the Bayesian Inference Criterion [14], the Minimum Description Length [12], the Minimum Message Length (MML) [20], etc. For a detailed review, see, for example, [8].

4 SOLUTION USING MAP ESTIMATION

The standard procedure for selecting $M$ is the following: Find the ML estimate for different $M$-s and choose the $M$ that maximizes (3). Suppose that we introduce a prior $p(\vec{\theta}(M))$ for the mixture parameters that penalizes complex solutions in a similar way as $P(M)$ from (3). Instead of (3), we could use:

$\log p(\mathcal{X}; \vec{\theta}(M)) + \log p(\vec{\theta}(M)). \qquad (4)$

As in [6] and [3], we use the simplest prior choice, the prior only on the mixing weights $\pi_m$-s. For example, the Dirichlet prior (see [7], chapter 16) for the mixing weights is given by:

$p(\vec{\theta}(M)) \propto \exp\left( -\sum_{m=1}^{M} c_m \log \pi_m \right) = \prod_{m=1}^{M} \pi_m^{-c_m}. \qquad (5)$

The procedure is then as follows: We start with a large number of randomly initialized components $M$ and search for the MAP solution using some iterative procedure, for example, the EM algorithm. The prior drives the irrelevant components to extinction. In this way, while searching for the MAP solution, the number of components $M$ is reduced until the balance is achieved.
It can be shown that the standard MML model selection criterion can be approximated by the Dirichlet prior with the coefficients $c_m$ equal to $N/2$, where $N$ presents the number of parameters per component of the mixture. See [6] for details. The parameters $c_m$ have a meaningful interpretation. For a multinomial distribution, $-c_m$ presents the prior evidence (in the MAP sense) for the class $m$ (the number of samples a priori belonging to that class). Negative prior evidence means that we will accept that the class exists only if there is enough evidence from the data for the existence of this class. If there are many parameters per component, we will need many data samples to estimate them. In this sense, the presented linear connection between the $c_m$ and $N$ seems very logical. The procedure from [6] starts with all the $\pi_m$-s equal to $1/M$. Although there is no proof of optimality, it seems reasonable to discard the component $m$ when its weight $\pi_m$ becomes negative. This also ensures that the mixing weights stay nonnegative.

The entropic prior from [3] has a similar form: $p(\vec{\theta}(M)) \propto \exp(-\gamma H(\pi_1, ..., \pi_M))$, where $H(\pi_1, ..., \pi_M) = -\sum_{m=1}^{M} \pi_m \log \pi_m$ is the entropy measure for the underlying multinomial distribution and $\gamma$ is a parameter. We use the mentioned Dirichlet prior because it leads to a closed form solution.

5 RECURSIVE (ONLINE) SOLUTION

For the ML estimate, the following holds: $\frac{\partial}{\partial \hat{\vec{\theta}}} \log p(\mathcal{X}; \hat{\vec{\theta}}) = 0$. The mixing weights are constrained to sum up to 1. We take this into account by introducing the Lagrange multiplier $\lambda$: $\frac{\partial}{\partial \hat{\pi}_m} \left( \log p(\mathcal{X}; \hat{\vec{\theta}}) + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0$. From here, after getting rid of $\lambda$, it follows that the ML estimate for $t$ data samples should satisfy $\hat{\pi}_m^{(t)} = \frac{1}{t} \sum_{i=1}^{t} o_m^{(t)}(\vec{x}^{(i)})$, with the ownerships defined as:

$o_m^{(t)}(\vec{x}) = \hat{\pi}_m^{(t)}\, p_m(\vec{x}; \hat{\vec{\theta}}_m^{(t)}) / p(\vec{x}; \hat{\vec{\theta}}^{(t)}). \qquad (6)$

Similarly, for the MAP solution, we have $\frac{\partial}{\partial \hat{\pi}_m} \left( \log p(\mathcal{X}; \hat{\vec{\theta}}) + \log p(\hat{\vec{\theta}}) + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0$, where $p(\hat{\vec{\theta}})$ is the mentioned Dirichlet prior (5). For $t$ data samples, we get:

$\hat{\pi}_m^{(t)} = \frac{1}{K} \left( \sum_{i=1}^{t} o_m^{(t)}(\vec{x}^{(i)}) - c \right), \qquad (7)$

where $K = \sum_{m=1}^{M} \left( \sum_{i=1}^{t} o_m^{(t)}(\vec{x}^{(i)}) - c \right) = t - Mc$ (since $\sum_{m=1}^{M} o_m^{(t)} = 1$).
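Equation (7) is straightforward to check numerically. A small sketch (the helper name and the numbers are ours, purely illustrative):

```python
def map_weights(ownership_sums, t, c):
    """MAP mixing-weight estimate of eq. (7): subtract the prior
    coefficient c from each component's accumulated ownership and
    normalize by K = t - M*c."""
    M = len(ownership_sums)
    K = t - M * c
    return [(s - c) / K for s in ownership_sums]
```

Because the ownerships of each sample sum to one, the resulting weights always sum to one, and a component whose accumulated ownership stays below $c$ is pushed toward a negative weight, which is the signal to discard it.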
The parameters of the prior are $c_m = c$ (and $c = N/2$ as mentioned before). We rewrite (7) as:

$\hat{\pi}_m^{(t)} = \frac{\hat{\Pi}_m^{(t)} - c/t}{1 - Mc/t}, \qquad (8)$

where $\hat{\Pi}_m^{(t)} = \frac{1}{t} \sum_{i=1}^{t} o_m^{(t)}(\vec{x}^{(i)})$ is the mentioned ML estimate and the bias from the prior is introduced through $c/t$. The bias decreases for larger data sets (larger $t$). However, if a small bias is acceptable, we can keep it constant by fixing $c/t$ to $c_T = c/T$ with some large $T$. This means that the bias will always be the same as it would have been for a data set with $T$ samples. If we assume that the parameter estimates do not change much when a new sample $\vec{x}^{(t+1)}$ is added and, therefore, $o_m^{(t+1)}(\vec{x}^{(i)})$ can be approximated by $o_m^{(t)}(\vec{x}^{(i)})$ that uses the previous parameter estimates, we get the following well behaved and easy to use recursive update equation:

$\hat{\pi}_m^{(t+1)} = \hat{\pi}_m^{(t)} + (1+t)^{-1} \left( \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{1 - Mc_T} - \hat{\pi}_m^{(t)} \right) - (1+t)^{-1} \frac{c_T}{1 - Mc_T}. \qquad (9)$

Here, $T$ should be sufficiently large to make sure that $Mc_T < 1$. We start with initial $\hat{\pi}_m^{(0)} = 1/M$ and discard the $m$th component when $\hat{\pi}_m^{(t+1)} < 0$. Note that the straightforward recursive version of (7), given by $\hat{\pi}_m^{(t+1)} = \hat{\pi}_m^{(t)} + (1+t-Mc)^{-1} \left( o_m^{(t)}(\vec{x}^{(t+1)}) - \hat{\pi}_m^{(t)} \right)$, is not very useful. For small $t$, the update is negative and the weights for the components with high $o_m^{(t)}(\vec{x}^{(t+1)})$ are decreased instead of increased. In order to avoid the negative update, we could start with a larger value for $t$, but then we cancel out the influence of the prior. This motivates the important choice we made to fix the influence of the prior.

The most commonly used mixture is the Gaussian mixture. A mixture component $p_m(\vec{x}; \vec{\theta}_m) = \mathcal{N}(\vec{x}; \vec{\mu}_m, C_m)$ has its mean $\vec{\mu}_m$ and its covariance matrix $C_m$ as the parameters. The prior has influence only on the mixing weights and we can use the recursive equations:

$\hat{\vec{\mu}}_m^{(t+1)} = \hat{\vec{\mu}}_m^{(t)} + (t+1)^{-1} \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{\hat{\pi}_m^{(t)}} \left( \vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)} \right), \qquad (10)$

$\hat{C}_m^{(t+1)} = \hat{C}_m^{(t)} + (t+1)^{-1} \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{\hat{\pi}_m^{(t)}} \left( (\vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)})(\vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)})^T - \hat{C}_m^{(t)} \right), \qquad (11)$

from [15] for the rest of the parameters.
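The recursive weight update (9) can be sketched as follows. This is an illustrative fragment with our own naming; the step size $(1+t)^{-1}$ is passed in as `alpha`, and the algebraically equivalent form `pi + alpha * ((o - c_T) / (1 - M*c_T) - pi)` is used:

```python
def update_weights(pi, ownership, alpha, c_T):
    """One step of the recursive MAP weight update, eq. (9), written in
    an algebraically equivalent single-bracket form."""
    M = len(pi)
    assert M * c_T < 1.0, "T must be large enough that M*c_T < 1"
    return [p + alpha * ((o - c_T) / (1.0 - M * c_T) - p)
            for p, o in zip(pi, ownership)]
```

Since the ownerships of a sample sum to one, the updated weights still sum to one; a weight driven below zero marks its component for removal.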
6 A SIMPLE PRACTICAL ALGORITHM

For an online procedure, it is reasonable to fix the influence of the new samples by replacing the term $(1+t)^{-1}$ from the recursive update equations (9), (10), and (11) by $\alpha = 1/T$. There are also some practical reasons for using a fixed small constant $\alpha$. It reduces the problems with instability of the equations for small $t$. Furthermore, a fixed $\alpha$ helps in forgetting the out-of-date statistics (random initialization and component deletion) more rapidly. It is equivalent to introducing an exponentially decaying envelope: $(1-\alpha)^{t-i}$ is applied to the influence of the sample $\vec{x}^{(i)}$.

For the sake of clarity, we present here the whole algorithm we used in our experiments. We start with a large number of components $M$ and with a random initialization of the parameters (see the next section for an example). We have $c_T = \alpha N/2$. Furthermore, we use Gaussian mixture components with full covariance matrices. Therefore, if the data is $d$-dimensional, we have $N = d + d(d+1)/2$ (the number of parameters for a Gaussian with a full covariance matrix). The online algorithm is then given by:

. Input: new data sample $\vec{x}^{(t+1)}$, current parameter estimates $\hat{\vec{\theta}}^{(t)}$.
. Calculate ownerships: $o_m^{(t)}(\vec{x}^{(t+1)}) = \hat{\pi}_m^{(t)}\, p_m(\vec{x}^{(t+1)}; \hat{\vec{\theta}}_m^{(t)}) / p(\vec{x}^{(t+1)}; \hat{\vec{\theta}}^{(t)})$.
. Update mixing weights: $\hat{\pi}_m^{(t+1)} = \hat{\pi}_m^{(t)} + \alpha \left( \frac{o_m^{(t)}(\vec{x}^{(t+1)})}{1 - Mc_T} - \hat{\pi}_m^{(t)} \right) - \alpha \frac{c_T}{1 - Mc_T}$.
. Check if there are irrelevant components: if $\hat{\pi}_m^{(t+1)} < 0$, discard the component $m$, set $M = M - 1$ and renormalize the remaining mixing weights.
. Update the rest of the parameters:
  - $\hat{\vec{\mu}}_m^{(t+1)} = \hat{\vec{\mu}}_m^{(t)} + w \vec{\delta}$ (where $w = \alpha\, o_m^{(t)}(\vec{x}^{(t+1)}) / \hat{\pi}_m^{(t+1)}$ and $\vec{\delta} = \vec{x}^{(t+1)} - \hat{\vec{\mu}}_m^{(t)}$).
  - $\hat{C}_m^{(t+1)} = \hat{C}_m^{(t)} + w \left( \vec{\delta} \vec{\delta}^T - \hat{C}_m^{(t)} \right)$ (tip: limit the update speed, $w = \min(20\alpha, w)$).
. Output: new parameter estimates $\hat{\vec{\theta}}^{(t+1)}$.

This simple algorithm can be implemented in only a few lines of code. The recommended upper limit $20\alpha$ for $w$ simply means that the updating speed is limited for the covariance matrices of the components representing less than 5 percent of the data.
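The steps above can be put together as follows. This is a sketch under our own naming, not the authors' code; the guard against ownership underflow and the fixed seed are our additions:

```python
import numpy as np

class RecursiveGMM:
    """Illustrative implementation of the online MAP mixture learning
    loop described above (sketch, not the authors' original code)."""

    def __init__(self, X_init, M, alpha):
        rng = np.random.default_rng(0)
        n, d = X_init.shape
        self.alpha = alpha
        N = d + d * (d + 1) / 2              # parameters per component
        self.c_T = alpha * N / 2.0           # c_T = alpha * N / 2
        self.pi = np.full(M, 1.0 / M)
        self.mu = X_init[rng.choice(n, M, replace=False)].astype(float)
        var = np.trace(np.cov(X_init.T)) / (10.0 * d)  # 1/10 of mean global variance
        self.C = np.array([np.eye(d) * var for _ in range(M)])

    def _ownerships(self, x):
        M, d = self.mu.shape
        p = np.empty(M)
        for m in range(M):
            diff = x - self.mu[m]
            inv = np.linalg.inv(self.C[m])
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(self.C[m]))
            p[m] = self.pi[m] * np.exp(-0.5 * diff @ inv @ diff) / norm
        p = p + 1e-300                       # guard against total underflow
        return p / p.sum()

    def update(self, x):
        o = self._ownerships(x)
        M = len(self.pi)
        # mixing-weight update with the fixed prior bias c_T
        self.pi = self.pi + self.alpha * ((o - self.c_T) / (1.0 - M * self.c_T) - self.pi)
        keep = self.pi > 0
        if not keep.all():                   # discard irrelevant components
            self.pi, self.mu, self.C = self.pi[keep], self.mu[keep], self.C[keep]
            o = o[keep]
            self.pi /= self.pi.sum()         # renormalize remaining weights
        # mean / covariance updates with the limited update speed
        w = np.minimum(20 * self.alpha, self.alpha * o / self.pi)
        for m in range(len(self.pi)):
            delta = x - self.mu[m]
            self.mu[m] = self.mu[m] + w[m] * delta
            self.C[m] = self.C[m] + w[m] * (np.outer(delta, delta) - self.C[m])
```

Feeding samples one at a time keeps the memory footprint constant; the number of components can only decrease from its initial value as irrelevant components are pruned.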
The limit was necessary since $\vec{\delta}\vec{\delta}^T$ is a singular matrix and the covariance matrices may become singular if updated too fast.

7 EXPERIMENTS

In this section, we demonstrate the algorithm performance on a few standard problems. We show summary results from 100 trials for each data set. For the real-world data sets, we randomly sample from the data to generate the longer sequences needed for our sequential algorithm. First, for each of the problems, we present in Fig. 1 how the selected number of components of the mixture was changing as new samples were sequentially added. The number of components that was finally selected is presented in the form of a histogram for the 100 trials. In Fig. 2, we present a comparison with some batch algorithms and study the influence of the parameter $\alpha$.

The random initialization of the parameters is the same as in [6]. The means $\hat{\vec{\mu}}_m^{(0)}$ of the mixture components are initialized by some randomly chosen data points. The initial covariance matrices are a fraction ($1/10$ here) of the mean global diagonal covariance matrix:

$\hat{C}_m^{(0)} = \frac{1}{10d}\,\mathrm{trace}\!\left( \frac{1}{n} \sum_{i=1}^{n} (\vec{x}^{(i)} - \hat{\vec{\mu}})(\vec{x}^{(i)} - \hat{\vec{\mu}})^T \right) I,$

where $\hat{\vec{\mu}} = \frac{1}{n} \sum_{i=1}^{n} \vec{x}^{(i)}$ is the global mean of the data and $I$ is the identity matrix with proper dimensions. We used the first $n = 100$ samples (it is also possible to estimate this initial covariance matrix recursively). Finally, we set the initial mixing weights to $\hat{\pi}_m^{(0)} = 1/M$. The initial number of components $M$ should be large enough so that the initialization reasonably covers the data. We used here the same initial number of components as in [6].

7.1 The Three Gaussians Data Set

First, we analyze a Gaussian mixture with mixing weights $\pi_1 = \pi_2 = \pi_3 = 1/3$, means $\vec{\mu}_1 = [0 \;\; {-2}]^T$, $\vec{\mu}_2 = [0 \;\; 0]^T$, $\vec{\mu}_3 = [0 \;\; 2]^T$, and covariance matrices

$C_1 = C_2 = C_3 = \begin{bmatrix} 2 & 0 \\ 0 & 0.2 \end{bmatrix}.$

A modified version of the EM called DAEM from [17] was able to find the correct solution using a bad initialization. For a data set with 900 samples, they needed more than 200 iterations to get close to the solution. Here, we start with $M = 30$ mixture components.
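The random initialization recipe above (means at random data points, isotropic covariances at $1/10$ of the mean global variance, equal weights) can be sketched as follows; the function name is ours:

```python
import numpy as np

def init_params(X_init, M, seed=0):
    """Random initialization used in the experiments: means at randomly
    chosen data points, isotropic covariances from the mean global
    variance, and equal mixing weights (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X_init.shape
    gmean = X_init.mean(axis=0)
    diff = X_init - gmean
    # 1/(10d) * trace of the global scatter matrix, times the identity
    C0 = np.trace(diff.T @ diff / n) / (10.0 * d) * np.eye(d)
    mu = X_init[rng.choice(n, M, replace=False)].astype(float)
    pi = np.full(M, 1.0 / M)
    C = np.array([C0.copy() for _ in range(M)])
    return pi, mu, C
```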
With random initialization, we performed 100 trials and the new algorithm was always able to find the correct solution while simultaneously estimating the parameters of the mixture and selecting the number of components. A similar batch algorithm from [6] needs about 200 iterations to identify the three components (on a data set with 900 samples). From the plot in Fig. 1, we see that already after 9,000 samples the new algorithm is usually able to identify the three components. The computation costs for 9,000 samples are approximately the same as for only 10 iterations of the EM algorithm on a data set with 900 samples. Consequently, the new algorithm for this data set is about 20 times faster in finding a similar solution (a typical solution is presented in Fig. 1 by the $\sigma = 2$ contours of the Gaussian components). In [9], some approximate recursive versions of the EM algorithm were compared to the standard EM algorithm and it was shown that the recursive versions are usually faster. This is in correspondence with our results. Empirically, we decided that 50 samples per class are enough and used $\alpha = 1/150$.

7.2 The Iris Data Set

We disregard the class information from the well-known 3-class, 4-dimensional Iris data set [2]. From the 100 trials, the clusters were properly identified 81 times. This shows that the order in which the data is presented can influence the recursive solution. The data set had only 150 samples (50 per class) that were repeated many times. We expect that the algorithm would perform better with more data samples. We used $\alpha = 1/150$. The typical solution in Fig. 1 is presented by projecting the 4-dimensional data onto the first two principal components.

7.3 The Shrinking Spiral Data Set

This data set presents a 1-dimensional manifold ("shrinking spiral") in three dimensions with added noise: $\vec{x} = [(13 - 0.5t) \cos t \;\;\; (0.5t - 13) \sin t \;\;\; t]^T + \vec{n}$, with $t \sim \mathrm{Uniform}[0, 4\pi]$ and the noise $\vec{n} \sim \mathcal{N}(0, I)$. The modified EM called SMEM from [18] was reported to be able to fit a 10 component mixture in about 350 iterations.
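The shrinking spiral samples above can be generated directly from the stated formula (the function name is ours):

```python
import numpy as np

def shrinking_spiral(n, seed=0):
    """Samples from the 'shrinking spiral' data set: a noisy 1-D
    manifold embedded in 3-D, t ~ Uniform[0, 4*pi], noise ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 4.0 * np.pi, n)
    X = np.stack([(13.0 - 0.5 * t) * np.cos(t),
                  (0.5 * t - 13.0) * np.sin(t),
                  t], axis=1)
    return X + rng.standard_normal((n, 3))   # additive N(0, I) noise
```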
The batch algorithm from [6] fits the mixture and selects 11, 12, or 13 components using typically 300 to 400 iterations for a 900 sample data set. From the graph in Fig. 1, it is clear that we achieve similar results, but much faster. About 18,000 samples was enough to arrive at a similar solution. Consequently, again, the new algorithm is about 20 times faster. There are no clusters in this data set. The fixed $\alpha$ has the effect that the influence of the old data is downweighted by the exponentially decaying envelope $(1-\alpha)^{t-k}$ (for $k < t$). For comparison with the other algorithms that used 900 samples, we limited the influence of the older samples to 5 percent of the influence of the current sample by $\alpha = -\log(0.05)/900$. In Fig. 1, we present a typical solution by
showing for each component the eigenvector corresponding to the largest eigenvalue of the covariance matrix.

Fig. 1. Model selection results for a few standard problems (summary from 100 trials).

7.4 The Enzyme Data Set

The 1-dimensional Enzyme data set has 245 data samples. It was shown in [11] using MCMC that the number of components supported by the data is most likely four, but two and three are also good choices. Our algorithm arrived at similar solutions. In a similar way as before, we used $\alpha = -\log(0.05)/245$.

7.5 Comparison with Some Batch Algorithms

The following standard batch methods were considered for comparison: the EM algorithm initialized using the result from k-means clustering; the SMEM method [18]; and the greedy EM method [19], which starts with a single component and adds new ones and was reported to be faster than the more elaborate SMEM. We used 900 samples for the Three Gaussians and the Shrinking Spiral data sets. The batch algorithms assume a known number of components: three for the Three Gaussians and the Iris data, 13 for the Shrinking Spiral, and four for the Enzyme data set. Our new unsupervised recursive algorithm (RUEM) selected on average approximately the same number of components for the chosen $\alpha$. All the iterative batch algorithms in our experiments stop when the change in the log-likelihood falls below a fixed threshold. The results are presented in Fig. 2a. The best likelihood and the lowest standard deviation are reported in bold. We also added the ideal ML result obtained using a carefully initialized EM. For the Iris data, the EM was initialized using the means and the covariances of the three classes. However, the solution where the two close clusters are modeled using one component was better in terms of likelihood. This wrong solution was found occasionally by some of the algorithms. The results from the RUEM are biased.
Furthermore, the parameter $\alpha$ controls the speed of updating the parameters and, therefore, also the effective amount of data that is considered. Therefore, we also present the results polished by additionally applying the EM algorithm, using the same sample size as for the batch algorithms. The RUEM results and the polished results are better than or similar to the batch results. We also observe that the greedy EM algorithm has problems with the Iris and the Shrinking Spiral data.

7.6 The Influence of the Parameter $\alpha$

In Figs. 2b and 2c, we show the influence of the parameter $\alpha$ on the selected number of components. We also plot the log-likelihood
per sample for different values of $\alpha$.

Fig. 2. Comparison with some standard batch methods and some experiments to study the influence of the parameter $\alpha$ (summary from 100 trials). (a) The mean and the standard deviation (between the brackets) of the log-likelihood over the number of samples calculated on new test data for the synthetic data sets. (b) The Three Gaussians data set: the influence of $\alpha$. (c) The Shrinking Spiral data set: the influence of $\alpha$.

For the Three Gaussians data set, there is a range of values for $\alpha$ where the same number of components is finally selected. We can expect similar results for other data sets where the clusters are well described by the mixture components and the components are well separated. For the Shrinking Spiral data set, there are no clear clusters and the number of selected components slowly declines with larger $\alpha$. Similarly, the log-likelihood also decreases with $\alpha$. For comparison, we also plotted some log-likelihood values from some batch algorithms (see the previous section). The new unsupervised procedure simultaneously estimates the parameters and selects a compact model. We observe from the log-likelihood values that, for a wide range of values for $\alpha$, we get a good representation of the data with a compact model. The graphs for the real-world data sets Iris and Enzyme are not included since they look similar to the graphs for the Shrinking Spiral data.

8 DISCUSSION AND CONCLUSIONS

We have proposed an online method for fitting mixture models which relies on a description-length reducing prior and a MAP estimation procedure for selecting a compact model. The experimental results indicated that the recursive algorithm was able to solve difficult problems and to obtain solutions similar to those of other elaborate batch algorithms. However, the theoretical support for the finally selected number of components is questionable.
Some arguments in favor of the entropic prior and its connections to other model selection criteria are given in [3]. The Dirichlet prior we used is related to the well founded MML principle, but it can be perhaps better viewed as an efficient heuristic. Therefore, if selecting the correct model is critical, we suggest, as in the much slower batch version [6], performing an additional check with some standard model selection criterion (full MML, for example). An additional problem when compared to the batch version [6] is the introduced parameter $\alpha$ that balances the influence of the data against the influence of the prior. This is similar to the parameter from the entropic prior ($\gamma$ in [3]). Some experiments were performed to show the influence of the parameter $\alpha$. The parameter $\alpha = 1/T$ is related to the number of data samples $T$ that are considered, and some heuristic choices were used in the previous section. If selecting the correct number of components is not critical, the new recursive procedure is highly time and memory efficient and potentially very useful for giving a quick, up-to-date, compact description of the data.

ACKNOWLEDGMENTS

This work was done while Z. Zivkovic was with the Laboratory for Measurement and Instrumentation, University of Twente, Enschede, The Netherlands.

REFERENCES

[1] H. Akaike, "A New Look at the Statistical Model Identification," IEEE Trans. Automatic Control, vol. 19, no. 6, pp. 716-723, 1974.
[2] E. Anderson, "The Irises of the Gaspe Peninsula," Bull. of the Am. Iris Soc., vol. 59, 1935.
[3] M.E. Brand, "Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction," Neural Computation, vol. 11, no. 5, pp. 1155-1182, 1999.
[4] A.P. Dempster, N. Laird, and D.B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," J. Royal Statistical Soc., Series B (Methodological), vol. 39, no. 1, pp. 1-38, 1977.
[5] V. Fabian, "On Asymptotically Efficient Recursive Estimation," Annals of Statistics, vol. 6, pp. 854-866, 1978.
[6] M. Figueiredo and A.K. Jain, "Unsupervised Learning of Finite Mixture Models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 381-396, Mar. 2002.
[7] A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin, Bayesian Data Analysis. Chapman and Hall, 1995.
[8] G. McLachlan and D. Peel, Finite Mixture Models. John Wiley and Sons, 2000.
[9] R.M. Neal and G.E. Hinton, "A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants," Learning in Graphical Models, M.I. Jordan, ed., pp. 355-368, 1998.
[10] C. Rasmussen, "The Infinite Gaussian Mixture Model," Advances in Neural Information Processing Systems, vol. 12, pp. 554-560, 2000.
[11] S. Richardson and P. Green, "On Bayesian Analysis of Mixtures with an Unknown Number of Components," J. Royal Statistical Soc., Series B (Methodological), vol. 59, no. 4, pp. 731-792, 1997.
[12] J. Rissanen, "Stochastic Complexity," J. Royal Statistical Soc., Series B (Methodological), vol. 49, no. 3, pp. 223-239, 1987.
[13] J. Sacks, "Asymptotic Distribution of Stochastic Approximation Procedures," Annals of Math. Statistics, vol. 29, pp. 373-405, 1958.
[14] G. Schwarz, "Estimating the Dimension of a Model," Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.
[15] D.M. Titterington, "Recursive Parameter Estimation Using Incomplete Data," J. Royal Statistical Soc., Series B (Methodological), vol. 46, no. 2, pp. 257-267, 1984.
[16] D.M. Titterington, A.F.M. Smith, and U.E. Makov, Statistical Analysis of Finite Mixture Distributions. John Wiley and Sons, 1985.
[17] N. Ueda and R. Nakano, "Deterministic Annealing EM Algorithm," Neural Networks, vol. 11, pp. 271-282, 1998.
[18] N. Ueda, R. Nakano, Z. Ghahramani, and G.E. Hinton, "SMEM Algorithm for Mixture Models," Neural Computation, vol. 12, no. 9, pp. 2109-2128, 2000.
[19] J.J. Verbeek, N. Vlassis, and B. Krose, "Efficient Greedy Learning of Gaussian Mixture Models," Neural Computation, vol. 15, no. 1, 2003.
[20] C. Wallace and P. Freeman, "Estimation and Inference by Compact Coding," J. Royal Statistical Soc., Series B (Methodological), vol. 49, no. 3, pp. 240-265, 1987.
More informationExperimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis
City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna
More informationIntroduction to Machine Learning. Recitation 11
Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,
More informationBootstrapping Dependent Data
Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly
More informationBayes Decision Rule and Naïve Bayes Classifier
Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.
More informationW-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS
W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS. Introduction When it coes to applying econoetric odels to analyze georeferenced data, researchers are well
More informationInteractive Markov Models of Evolutionary Algorithms
Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary
More informationNon-Parametric Non-Line-of-Sight Identification 1
Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,
More informationNonmonotonic Networks. a. IRST, I Povo (Trento) Italy, b. Univ. of Trento, Physics Dept., I Povo (Trento) Italy
Storage Capacity and Dynaics of Nononotonic Networks Bruno Crespi a and Ignazio Lazzizzera b a. IRST, I-38050 Povo (Trento) Italy, b. Univ. of Trento, Physics Dept., I-38050 Povo (Trento) Italy INFN Gruppo
More informationA note on the multiplication of sparse matrices
Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani
More informationProc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES
Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co
More informationModel Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon
Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential
More informationData-Driven Imaging in Anisotropic Media
18 th World Conference on Non destructive Testing, 16- April 1, Durban, South Africa Data-Driven Iaging in Anisotropic Media Arno VOLKER 1 and Alan HUNTER 1 TNO Stieltjesweg 1, 6 AD, Delft, The Netherlands
More informationSymbolic Analysis as Universal Tool for Deriving Properties of Non-linear Algorithms Case study of EM Algorithm
Acta Polytechnica Hungarica Vol., No., 04 Sybolic Analysis as Universal Tool for Deriving Properties of Non-linear Algoriths Case study of EM Algorith Vladiir Mladenović, Miroslav Lutovac, Dana Porrat
More informationIntelligent Systems: Reasoning and Recognition. Artificial Neural Networks
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial
More information3.3 Variational Characterization of Singular Values
3.3. Variational Characterization of Singular Values 61 3.3 Variational Characterization of Singular Values Since the singular values are square roots of the eigenvalues of the Heritian atrices A A and
More informationDetection and Estimation Theory
ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer
More informationDERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS
DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS N. van Erp and P. van Gelder Structural Hydraulic and Probabilistic Design, TU Delft Delft, The Netherlands Abstract. In probles of odel coparison
More informationASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical
IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul
More informationIdentical Maximum Likelihood State Estimation Based on Incremental Finite Mixture Model in PHD Filter
Identical Maxiu Lielihood State Estiation Based on Increental Finite Mixture Model in PHD Filter Gang Wu Eail: xjtuwugang@gail.co Jing Liu Eail: elelj20080730@ail.xjtu.edu.cn Chongzhao Han Eail: czhan@ail.xjtu.edu.cn
More informationPrincipal Components Analysis
Principal Coponents Analysis Cheng Li, Bingyu Wang Noveber 3, 204 What s PCA Principal coponent analysis (PCA) is a statistical procedure that uses an orthogonal transforation to convert a set of observations
More informationStochastic Subgradient Methods
Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods
More informationOptimal nonlinear Bayesian experimental design: an application to amplitude versus offset experiments
Geophys. J. Int. (23) 155, 411 421 Optial nonlinear Bayesian experiental design: an application to aplitude versus offset experients Jojanneke van den Berg, 1, Andrew Curtis 2,3 and Jeannot Trapert 1 1
More informationLeast Squares Fitting of Data
Least Squares Fitting of Data David Eberly, Geoetric Tools, Redond WA 98052 https://www.geoetrictools.co/ This work is licensed under the Creative Coons Attribution 4.0 International License. To view a
More informationGrafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space
Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing
More informationBayesian Approach for Fatigue Life Prediction from Field Inspection
Bayesian Approach for Fatigue Life Prediction fro Field Inspection Dawn An and Jooho Choi School of Aerospace & Mechanical Engineering, Korea Aerospace University, Goyang, Seoul, Korea Srira Pattabhiraan
More informationMSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE
Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL
More informationQualitative Modelling of Time Series Using Self-Organizing Maps: Application to Animal Science
Proceedings of the 6th WSEAS International Conference on Applied Coputer Science, Tenerife, Canary Islands, Spain, Deceber 16-18, 2006 183 Qualitative Modelling of Tie Series Using Self-Organizing Maps:
More informationTracking using CONDENSATION: Conditional Density Propagation
Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation
More informationPattern Recognition and Machine Learning. Artificial Neural networks
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial
More informationExtension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels
Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique
More informationInspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information
Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October
More informationMulti-view Discriminative Manifold Embedding for Pattern Classification
Multi-view Discriinative Manifold Ebedding for Pattern Classification X. Wang Departen of Inforation Zhenghzou 450053, China Y. Guo Departent of Digestive Zhengzhou 450053, China Z. Wang Henan University
More informationSPECTRUM sensing is a core concept of cognitive radio
World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile
More informationESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics
ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents
More informationIN modern society that various systems have become more
Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto
More informationA Theoretical Analysis of a Warm Start Technique
A Theoretical Analysis of a War Start Technique Martin A. Zinkevich Yahoo! Labs 701 First Avenue Sunnyvale, CA Abstract Batch gradient descent looks at every data point for every step, which is wasteful
More informationEstimation of the Mean of the Exponential Distribution Using Maximum Ranked Set Sampling with Unequal Samples
Open Journal of Statistics, 4, 4, 64-649 Published Online Septeber 4 in SciRes http//wwwscirporg/ournal/os http//ddoiorg/436/os4486 Estiation of the Mean of the Eponential Distribution Using Maiu Ranked
More informationA MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION
A eshsize boosting algorith in kernel density estiation A MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION C.C. Ishiekwene, S.M. Ogbonwan and J.E. Osewenkhae Departent of Matheatics, University
More informationStatistical clustering and Mineral Spectral Unmixing in Aviris Hyperspectral Image of Cuprite, NV
CS229 REPORT, DECEMBER 05 1 Statistical clustering and Mineral Spectral Unixing in Aviris Hyperspectral Iage of Cuprite, NV Mario Parente, Argyris Zynis I. INTRODUCTION Hyperspectral Iaging is a technique
More informationSupport Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization
Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering
More informationFast Structural Similarity Search of Noncoding RNAs Based on Matched Filtering of Stem Patterns
Fast Structural Siilarity Search of Noncoding RNs Based on Matched Filtering of Ste Patterns Byung-Jun Yoon Dept. of Electrical Engineering alifornia Institute of Technology Pasadena, 91125, S Eail: bjyoon@caltech.edu
More informationEfficient Filter Banks And Interpolators
Efficient Filter Banks And Interpolators A. G. DEMPSTER AND N. P. MURPHY Departent of Electronic Systes University of Westinster 115 New Cavendish St, London W1M 8JS United Kingdo Abstract: - Graphical
More informationAn improved self-adaptive harmony search algorithm for joint replenishment problems
An iproved self-adaptive harony search algorith for joint replenishent probles Lin Wang School of Manageent, Huazhong University of Science & Technology zhoulearner@gail.co Xiaojian Zhou School of Manageent,
More informationCompression and Predictive Distributions for Large Alphabet i.i.d and Markov models
2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511
More informationChapter 6 1-D Continuous Groups
Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:
More informationEstimating Parameters for a Gaussian pdf
Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3
More informationCombining Classifiers
Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/
More informationGraphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes
Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109
More informationA remark on a success rate model for DPA and CPA
A reark on a success rate odel for DPA and CPA A. Wieers, BSI Version 0.5 andreas.wieers@bsi.bund.de Septeber 5, 2018 Abstract The success rate is the ost coon evaluation etric for easuring the perforance
More informationFigure 1: Equivalent electric (RC) circuit of a neurons membrane
Exercise: Leaky integrate and fire odel of neural spike generation This exercise investigates a siplified odel of how neurons spike in response to current inputs, one of the ost fundaental properties of
More informationLower Bounds for Quantized Matrix Completion
Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &
More informationA Smoothed Boosting Algorithm Using Probabilistic Output Codes
A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu
More informationPattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition
More informationPULSE-TRAIN BASED TIME-DELAY ESTIMATION IMPROVES RESILIENCY TO NOISE
PULSE-TRAIN BASED TIME-DELAY ESTIMATION IMPROVES RESILIENCY TO NOISE 1 Nicola Neretti, 1 Nathan Intrator and 1,2 Leon N Cooper 1 Institute for Brain and Neural Systes, Brown University, Providence RI 02912.
More informationEffective joint probabilistic data association using maximum a posteriori estimates of target states
Effective joint probabilistic data association using axiu a posteriori estiates of target states 1 Viji Paul Panakkal, 2 Rajbabu Velurugan 1 Central Research Laboratory, Bharat Electronics Ltd., Bangalore,
More informationTEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES
TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,
More informationPolygonal Designs: Existence and Construction
Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G
More information1 Bounding the Margin
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost
More informationLost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies
OPERATIONS RESEARCH Vol. 52, No. 5, Septeber October 2004, pp. 795 803 issn 0030-364X eissn 1526-5463 04 5205 0795 infors doi 10.1287/opre.1040.0130 2004 INFORMS TECHNICAL NOTE Lost-Sales Probles with
More informationSharp Time Data Tradeoffs for Linear Inverse Problems
Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used
More informationSupport recovery in compressed sensing: An estimation theoretic approach
Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de
More informationInference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression
Advances in Pure Matheatics, 206, 6, 33-34 Published Online April 206 in SciRes. http://www.scirp.org/journal/ap http://dx.doi.org/0.4236/ap.206.65024 Inference in the Presence of Likelihood Monotonicity
More informationThe Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters
journal of ultivariate analysis 58, 96106 (1996) article no. 0041 The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Paraeters H. S. Steyn
More informationStatistical Logic Cell Delay Analysis Using a Current-based Model
Statistical Logic Cell Delay Analysis Using a Current-based Model Hanif Fatei Shahin Nazarian Massoud Pedra Dept. of EE-Systes, University of Southern California, Los Angeles, CA 90089 {fatei, shahin,
More informationOPTIMIZATION in multi-agent networks has attracted
Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract
More informationHybrid System Identification: An SDP Approach
49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The
More informationpaper prepared for the 1996 PTRC Conference, September 2-6, Brunel University, UK ON THE CALIBRATION OF THE GRAVITY MODEL
paper prepared for the 1996 PTRC Conference, Septeber 2-6, Brunel University, UK ON THE CALIBRATION OF THE GRAVITY MODEL Nanne J. van der Zijpp 1 Transportation and Traffic Engineering Section Delft University
More informationUse of PSO in Parameter Estimation of Robot Dynamics; Part One: No Need for Parameterization
Use of PSO in Paraeter Estiation of Robot Dynaics; Part One: No Need for Paraeterization Hossein Jahandideh, Mehrzad Navar Abstract Offline procedures for estiating paraeters of robot dynaics are practically
More informationIn this chapter, we consider several graph-theoretic and probabilistic models
THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions
More informationFast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials
Fast Montgoery-like Square Root Coputation over GF( ) for All Trinoials Yin Li a, Yu Zhang a, a Departent of Coputer Science and Technology, Xinyang Noral University, Henan, P.R.China Abstract This letter
More informationA Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words)
1 A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine (1900 words) Contact: Jerry Farlow Dept of Matheatics Univeristy of Maine Orono, ME 04469 Tel (07) 866-3540 Eail: farlow@ath.uaine.edu
More informationSupervised Baysian SAR image Classification Using The Full Polarimetric Data
Supervised Baysian SAR iage Classification Using The Full Polarietric Data (1) () Ziad BELHADJ (1) SUPCOM, Route de Raoued 3.5 083 El Ghazala - TUNSA () ENT, BP. 37, 100 Tunis Belvedere, TUNSA Abstract
More informationTraining an RBM: Contrastive Divergence. Sargur N. Srihari
Training an RBM: Contrastive Divergence Sargur N. srihari@cedar.buffalo.edu Topics in Partition Function Definition of Partition Function 1. The log-likelihood gradient 2. Stochastic axiu likelihood and
More informationDEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS
ISSN 1440-771X AUSTRALIA DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS An Iproved Method for Bandwidth Selection When Estiating ROC Curves Peter G Hall and Rob J Hyndan Working Paper 11/00 An iproved
More informationAn Improved Particle Filter with Applications in Ballistic Target Tracking
Sensors & ransducers Vol. 72 Issue 6 June 204 pp. 96-20 Sensors & ransducers 204 by IFSA Publishing S. L. http://www.sensorsportal.co An Iproved Particle Filter with Applications in Ballistic arget racing
More informationAlgorithms for parallel processor scheduling with distinct due windows and unit-time jobs
BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and
More informationŞtefan ŞTEFĂNESCU * is the minimum global value for the function h (x)
7Applying Nelder Mead s Optiization Algorith APPLYING NELDER MEAD S OPTIMIZATION ALGORITHM FOR MULTIPLE GLOBAL MINIMA Abstract Ştefan ŞTEFĂNESCU * The iterative deterinistic optiization ethod could not
More informationSpine Fin Efficiency A Three Sided Pyramidal Fin of Equilateral Triangular Cross-Sectional Area
Proceedings of the 006 WSEAS/IASME International Conference on Heat and Mass Transfer, Miai, Florida, USA, January 18-0, 006 (pp13-18) Spine Fin Efficiency A Three Sided Pyraidal Fin of Equilateral Triangular
More informationarxiv: v1 [cs.ds] 29 Jan 2012
A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore
More information