Adaptive imputation of missing values for incomplete pattern classification


Adaptive imputation of missing values for incomplete pattern classification

Zhun-ga Liu 1, Quan Pan 1, Jean Dezert 2, Arnaud Martin 3
1. School of Automation, Northwestern Polytechnical University, Xi'an, China. Email: liuzhunga@nwpu.edu.cn
2. ONERA - The French Aerospace Lab, Palaiseau, France. Email: jean.dezert@onera.fr
3. IRISA, University of Rennes 1, Rue E. Branly, Lannion, France. Email: Arnaud.Martin@univ-rennes1.fr

Abstract: In the classification of incomplete patterns, the missing values can either play a crucial role in the class determination, or have only little influence (or eventually none) on the classification results, according to the context. We propose a credal classification method for incomplete patterns with adaptive imputation of missing values based on belief function theory. At first, we try to classify the object (incomplete pattern) based only on the available attribute values. As underlying principle, we assume that the missing information is not crucial for the classification if a specific class can be found for the object using only the available information. In this case, the object is committed to this particular class. However, if the object cannot be classified without ambiguity, it means that the missing values play a main role in achieving an accurate classification. In this case, the missing values are imputed based on the K-nearest neighbor (K-NN) and self-organizing map (SOM) techniques, and the edited pattern with the imputation is then classified. The (original or edited) pattern is respectively classified according to each training class, and the classification results, represented by basic belief assignments, are fused with proper combination rules for making the credal classification. The object is allowed to belong, with different masses of belief, to the specific classes and to meta-classes (which are particular disjunctions of several single classes). The credal classification captures well the uncertainty and imprecision of the classification, and effectively reduces the rate of misclassifications thanks to the introduction of meta-classes. The effectiveness of the proposed method with respect to other classical methods is demonstrated on several experiments using artificial and real data sets.

Keywords: belief function, classification, missing values, SOM, K-NN.

I. INTRODUCTION

In many practical classification problems, the available information for making the object classification is partial (incomplete), because some attribute values can be missing for various reasons (e.g., the failure or dysfunction of the sensors providing the information, or the partial observation of the object of interest because of some occultation phenomenon, etc.). So it is crucial to develop efficient techniques to classify as well as possible the objects with missing attribute values (incomplete patterns), and the search for a solution to this problem remains an important research topic in the pattern classification field [1], [2]. More details about pattern classification can be found in [3], [4]. Many approaches have been developed for classifying incomplete patterns [1], and they can be broadly grouped into four types. The first (simplest) one is to directly remove the patterns with missing values, and to design the classifier only for the complete patterns. This method is acceptable when the incomplete data set is only a very small subset (e.g., less than 5%) of the whole data set, but it cannot effectively classify the patterns with missing values. The second type is the model-based techniques [5]. The probability density function (PDF) of the input data (complete and incomplete cases) is estimated at first by means of some procedure, and then the object is classified using Bayesian reasoning. For instance, the expectation-maximization (EM) algorithm has been applied to many problems involving missing data for training Gaussian mixture models [5]. Model-based methods must make assumptions about the joint distribution of all the variables of the model, but suitable distributions are sometimes hard to obtain. The third type of classifiers is designed to directly handle incomplete patterns without imputing the missing values, such as neural network ensemble methods [6], decision trees [7], fuzzy approaches [8] and support vector machine classifiers [9]. The last type is the often used imputation (estimation) approach: the missing values are first filled with proper estimations [10], and the edited patterns are then classified using a normal classifier (for complete patterns). The missing-value estimation and the pattern classification are treated separately in these methods. Many works have been devoted to the imputation of missing data. The imputation can be done either by statistical methods, e.g., mean imputation [11], regression imputation [2], etc., or by machine learning methods, e.g., K-nearest neighbors imputation (KNNI) [12], Fuzzy c-means (FCM) imputation (FCMI) [13], [14], Self-organizing map imputation (SOMI) [15], etc. In KNNI, the missing values are estimated using the K nearest neighbors of the object in the training data space. In FCMI, the missing values are imputed according to the clustering centers of FCM, taking into account the distances of the object to these centers [13], [14]. In SOMI [15], the best match node (unit) of the incomplete pattern is found ignoring the missing values, and the imputation of the missing values is computed based on the weights of the activation group of nodes, which includes the best match node and its close neighbors.

These existing methods usually attempt to classify the object into a particular class with maximal probability or likelihood measure. However, the estimation of missing values is in general quite uncertain, and different imputations of the missing values can yield very different classification results, which prevents us from correctly committing the object to a particular class. Belief function theory (BFT), also called Dempster-Shafer theory (DST) [16], and its extensions [17], [18] offer a mathematical framework for modeling uncertain and imprecise information [19]. BFT has already been applied successfully to object classification [20]-[28], clustering [29]-[33], multi-source information fusion [34]-[37], etc. Some classifiers for complete patterns based on DST have been developed by Denœux and his collaborators, leading to the evidential K-nearest neighbors (EK-NN) [21], the evidential neural network (ENN) [27], etc. An extra ignorance element, represented by the disjunction of all the elements of the whole frame of discernment, is introduced in these classifiers to capture the totally ignorant information. However, the partial imprecision, which is very important in classification, is not well characterized. We have proposed credal classifiers [23], [24] for complete patterns that consider all the possible meta-classes (i.e., the particular disjunctions of several singleton classes) to model the partially imprecise information. The credal classification allows the objects to belong (with different masses of belief) not only to the singleton classes, but also to any set of classes corresponding to the meta-classes. In [23], a belief-based K-nearest neighbor classifier (BK-NN) has been presented; the credal classification of the object is done according to the distances between the object and its K nearest neighbors, together with two given (acceptance and rejection) distance thresholds. K-NN classifiers generally carry a big computation burden, which is not convenient for real applications. Thus, a simple credal classification rule (CCR) [24] has been further developed: the belief of the object in the different classes (i.e., the singleton classes and selected meta-classes) is directly calculated from the distance to the center of the corresponding class and from the distinguishability degree (w.r.t. the object) of the singleton classes involved in the meta-class. The center of a meta-class in CCR is located at the same (similar) distance from the centers of all the involved singleton classes. Moreover, for the cases where no training data is available, we have also proposed several credal clustering methods [30]-[32]. Nevertheless, these previous credal classification methods mainly deal with complete patterns, without taking the missing values into account. In our recent work, a prototype-based credal classification (PCC) [25] method for incomplete patterns has been introduced to capture the imprecise information caused by the missing values. The objects hard to classify correctly are committed by PCC to a suitable meta-class, which well characterizes the imprecision of the classification due to the absence of part of the attributes, and also reduces the misclassification errors.

In PCC, the missing values of each incomplete pattern are imputed using the prototype of every class, and the edited pattern obtained with each imputation is respectively classified by a standard classifier (for complete patterns). With PCC, one obtains c pieces of classification results for each incomplete pattern in a c-class problem, and the global fusion of the c results yields the credal classification. Unfortunately, the PCC classifier is computationally greedy and time-consuming, and the imputation of the missing values based on class prototypes is not very precise. In order to overcome the limitations of PCC, we propose a new credal classification method for incomplete patterns with adaptive imputation of the missing values, called Credal Classification with Adaptive Imputation (CCAI) for short. The pattern to classify usually consists of multiple attributes. Sometimes, the class of the pattern can be precisely determined using only a part (a subset) of the available attributes, which implies that the other attributes are redundant and in fact unnecessary for the classification. This motivates the adaptive imputation strategy of CCAI. In CCAI, we attempt at first to classify the object using only the known attribute values. If a specific classification result is obtained, it very likely means that the missing values are not necessary for the classification, and we directly take the decision about the class of the object based on this result. However, if the object cannot be clearly classified with the available information, it indicates that the information carried by the missing attribute values is probably crucial for making the classification. In this case, we use a more sophisticated classification strategy based on the edition of the pattern with a proper imputation of the missing values. K-nearest neighbors-based imputation usually provides pretty good performance for the estimation of missing values, but its main drawback is its big computational burden. To reduce this burden, a Self-Organizing Map (SOM) [38] is applied in each class, and the optimized weighting vectors are used to represent the corresponding class. Then, the K nearest weighting vectors of the object in each class are respectively employed to estimate the missing values. For the classification of the original incomplete pattern (without imputation of the missing values) or of the edited pattern (with imputation of the missing values), we adopt an ensemble classifier approach. One respectively gets a simple classification result according to each training class, and each classification result is represented by a simple basic belief assignment (BBA) with only two focal elements (i.e., a singleton class and the ignorant class). The belief of the object belonging to each class is calculated based on the distance to the corresponding prototype, and the remaining belief is committed to the ignorant element.

The fusion (ensemble) of these multiple BBAs is then used to determine the class of the object. If the object is directly classified using only the known values, the Dempster-Shafer (DS) fusion rule [16] is applied, because of the simplicity of this rule and also because the BBAs to fuse are usually in low conflict in this case. (Although the rule was originally proposed by Arthur Dempster, we prefer to call it the Dempster-Shafer rule because it has been widely promoted by Shafer in [16].) In this case, a specific result is obtained with the DS rule. Otherwise, a new fusion rule inspired by the Dubois-Prade (DP) rule [39] is used to classify the edited pattern with the proper imputation of its missing values. Because the estimation of the missing values can be quite uncertain, it naturally induces an imprecise classification. So, in this new rule, the partial conflicting beliefs are kept and committed to the associated meta-classes, to reasonably reveal the potential imprecision of the classification result.

In this paper, we present a credal classification method with adaptive imputation of missing values based on belief function theory for dealing with incomplete patterns. The paper is organized as follows. The basics of belief function theory and of the Self-Organizing Map are briefly recalled in Section II. The new credal classification method for incomplete patterns is presented in Section III, and the proposed method is then tested and evaluated in Section IV, in comparison with several other classical methods. The paper is concluded in the final section.

II. BACKGROUND KNOWLEDGE

Belief function theory (BFT) can well characterize uncertain and imprecise information, and it is used in this work for the classification of the patterns. The SOM technique is employed to find the optimized weighting vectors used to represent each class, which reduces the computation burden of the K-NN-based estimation of the missing values. So the basic knowledge about BFT and SOM is briefly recalled here.

A. Basics of belief function theory

The Belief Function Theory (BFT) introduced by Glenn Shafer is also known as Dempster-Shafer Theory (DST), or the Mathematical Theory of Evidence [16]-[18]. Let us consider a frame of discernment consisting of c exclusive and exhaustive hypotheses (classes) denoted by Ω = {ω_i, i = 1, 2, ..., c}. The power-set of Ω, denoted 2^Ω, is the set of all the subsets of Ω, empty set included. For example, if Ω = {ω_1, ω_2, ω_3}, then 2^Ω = {∅, ω_1, ω_2, ω_3, ω_1 ∪ ω_2, ω_1 ∪ ω_3, ω_2 ∪ ω_3, Ω}. In the classification problem, a singleton element (e.g., ω_i) represents a specific class. In this work, the disjunction (union) of several singleton elements is called a meta-class, and it characterizes the partial ignorance of the classification.

Examples of meta-classes are ω_i ∪ ω_j, or ω_i ∪ ω_j ∪ ω_k. In BFT, an object can be associated with the different singleton elements as well as with sets of elements, according to a basic belief assignment (BBA), which is a function m(.) from 2^Ω to [0, 1] satisfying m(∅) = 0 and the normalization condition \sum_{A \in 2^\Omega} m(A) = 1. The subsets A of Ω such that m(A) > 0 are called the focal elements of the belief mass m(.). A credal classification (or partitioning) [29] is defined as an n-tuple M = (m_1, ..., m_n) of BBAs, where m_i is the basic belief assignment of the object x_i ∈ X, i = 1, ..., n, associated with the different elements of the power-set 2^Ω. The credal classification allows the objects to belong to the specific classes and to the sets of classes corresponding to meta-classes, with different masses of belief; it can thus well model imprecise and uncertain information thanks to the introduction of the meta-classes. For combining multiple sources of evidence represented by a set of BBAs, the well-known Dempster's rule [16] is still widely used, even if its justification is an open debate and is questioned in the community [40], [41]. The combination of two BBAs m_1(.) and m_2(.) over 2^Ω is done with the DS rule of combination, defined by m_{DS}(∅) = 0 and, for A ≠ ∅, A, B, C ∈ 2^Ω, by

m_{DS}(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)}    (1)

The DS rule is commutative and associative, and it makes a compromise between specificity and complexity for the combination of the BBAs. With this rule, all the conflicting beliefs m_1(B) m_2(C), B ∩ C = ∅, are proportionally redistributed back to the focal elements through a classical normalization step. However, this redistribution can yield unreasonable results in highly conflicting cases [40], as well as in some special low-conflict cases [41]. That is why different rules of combination have emerged to overcome these limitations. Among the possible alternatives to the DS rule, we find Smets' conjunctive rule (used in his transferable belief model (TBM) [18]), the Dubois-Prade (DP) rule [39], and, more recently, the more complex Proportional Conflict Redistribution (PCR) rules [42]. Unfortunately, the DP and PCR rules are less appealing from an implementation standpoint, since they are not associative and become complex to use when more than two BBAs have to be combined altogether.
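For concreteness, here is a minimal sketch of the DS combination of eq. (1), with BBAs represented as dictionaries mapping focal elements (frozensets of class labels) to masses; the representation and the function name ds_combine are our own illustrative choices, not notation from the paper.

```python
from itertools import product

def ds_combine(m1, m2):
    """Dempster-Shafer combination of two BBAs, eq. (1).

    m1, m2: dicts mapping focal elements (frozensets of class labels)
    to masses summing to 1. The mass of empty intersections (conflict)
    is redistributed through the normalization step.
    """
    fused, conflict = {}, 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            fused[A] = fused.get(A, 0.0) + mB * mC
        else:
            conflict += mB * mC
    if conflict >= 1.0:
        raise ValueError("total conflict: DS rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in fused.items()}

# two-focal-element BBAs over Omega = {w1, w2, w3}, as used later in the paper
Omega = frozenset({"w1", "w2", "w3"})
m1 = {frozenset({"w1"}): 0.7, Omega: 0.3}
m2 = {frozenset({"w2"}): 0.4, Omega: 0.6}
print(ds_combine(m1, m2))
```

Being associative, the rule can be folded over any number of BBAs, which is how the c class-conditional BBAs of Section III can be fused when they are in low conflict.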

B. Overview of the Self-Organizing Map

The Self-Organizing Map (SOM, also called Kohonen map) [38], introduced by Teuvo Kohonen, is a type of artificial neural network (ANN) trained by an unsupervised learning method. SOM defines a mapping from the input space onto a low-dimensional (typically two-dimensional) grid of M × N nodes. It thus approximates the feature space (e.g., of real input vectors x ∈ R^p) by a projected 2-D space, while preserving the topological properties of the input space by means of a neighborhood function. Hence, SOM is very useful for visualizing low-dimensional views of high-dimensional data through a nonlinear projection. The node at position (i, j), i = 1, ..., M, j = 1, ..., N, corresponds to a weighting vector denoted σ(i, j) ∈ R^p. An input vector x ∈ R^p is compared to each σ(i, j), and the neuron whose weighting vector is the closest (most similar) to x according to a given metric is called the best matching unit (BMU); it is defined as the output of the SOM with respect to x. In real applications, the Euclidean distance is usually used to compare x and σ(i, j). The input pattern x is mapped onto the SOM at the location (i, j) where σ(i, j) has the minimal distance to x. The SOM thereby achieves a non-uniform quantization that transforms x to σ_x by minimizing the given metric (e.g., a distance measure) [43]. SOM relies on competitive learning, and its training algorithm is iterative. The initial values of the weighting vectors σ may be set randomly, but they converge to stable values at the end of the training process. When an input vector is fed to the network, its Euclidean distance to all the weighting vectors is computed. Then the BMU, whose weighting vector is the most similar to the input vector, is found, and the weights of the BMU and of the neurons close to it in the SOM grid are adjusted towards the input vector. The magnitude of the change decreases with time and with the distance (within the grid) from the BMU. Detailed information about SOM can be found in [38]. In this work, SOM is applied in each training class to obtain the optimized weighting vectors used to represent the corresponding class. The number of weighting vectors is much smaller than the number of original samples in the associated training class. We utilize these weighting vectors, rather than the original samples, to estimate the missing values of the object (incomplete pattern), and this effectively reduces the computation burden.
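The following sketch shows a classical online SOM of the kind just described (BMU search plus neighborhood-weighted updates), which can be trained once per class to produce the M × N weighting vectors; the grid size, learning schedule and function name are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def train_som(X, M=3, N=4, iters=2000, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal online SOM: returns M*N weighting vectors fitted to X (n, p)."""
    rng = np.random.default_rng(seed)
    W = X[rng.integers(0, len(X), M * N)].astype(float)       # init from data
    grid = np.array([(i, j) for i in range(M) for j in range(N)], dtype=float)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))           # best matching unit
        frac = t / iters                                      # decaying schedules
        lr, sigma = lr0 * (1.0 - frac), sigma0 * (1.0 - frac) + 1e-3
        # Gaussian neighborhood on the 2-D grid, centered at the BMU
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)                        # pull towards x
    return W

# one small SOM per training class, e.g.: W_g = train_som(Y_g, M=3, N=4)
```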

III. CREDAL CLASSIFICATION OF INCOMPLETE PATTERNS

Our new method consists of two main steps. In the first step, the object (incomplete pattern) is directly classified according to the known attribute values only, and the missing values are ignored. If one obtains a specific classification result, the classification procedure is done, because the available attribute information is sufficient for making the classification. But if the class of the object cannot be clearly identified in this first step, it means that the unavailable information included in the missing values is likely crucial for the classification. In this case, one has to enter the second step of the method and classify the object with a proper imputation of the missing values. In the classification procedure, the original or edited pattern is respectively classified according to each class of training data. The global fusion of these classification results, which can be considered as multiple sources of evidence represented by BBAs, is then used for the credal classification of the object. Our new method for the credal classification of incomplete patterns with adaptive imputation of the missing values is referred to as Credal Classification with Adaptive Imputation, or just CCAI for conciseness. CCAI is based on belief function theory, which can well manage the uncertain and imprecise information caused by the missing values in the classification.

A. First step: direct classification of the incomplete pattern using the available data

Let us consider a set of test patterns (samples) X = {x_1, ..., x_n} to be classified based on a set of labeled training patterns Y = {y_1, ..., y_s} over the frame of discernment Ω = {ω_1, ..., ω_c}. In this work, we focus on the classification of incomplete patterns in which some attribute values are absent, so we consider that all the test patterns (e.g., x_i, i = 1, ..., n) have several missing values. The training data set Y may also contain incomplete patterns in some applications. However, if the incomplete patterns represent a very small amount, say less than 5%, of the training data set, they can be ignored in the classification. If the percentage of incomplete patterns is big, the missing values must usually be estimated at first, and the classifier is trained using the edited (complete) patterns. In real applications, one can also simply restrict the training data set to the complete labeled patterns when the training information is sufficient. So, for simplicity and convenience, we consider in the sequel that the labeled samples (e.g., y_j, j = 1, ..., s) of the training set Y are all complete patterns. In the first step of the classification, the incomplete pattern, say x, is respectively classified according to each training class by a normal classifier (one devoted to complete patterns), and all the missing values are ignored here. In this work, we adopt a very simple classification method for the convenience of computation, and x is directly classified based on its distance to the prototype of each class. (Many other normal classifiers, e.g., K-NN, could be selected here depending on the preference of the user; we propose this simple method because of its low computational complexity.) The prototype of each class, {o_1, ..., o_c} corresponding to {ω_1, ..., ω_c}, is given by the arithmetic average vector of the training patterns of the same class.

Mathematically, the prototype is computed for g = 1, ..., c by

o_g = \frac{1}{N_g} \sum_{y_j \in \omega_g} y_j    (2)

where N_g is the number of training samples in the class ω_g. In a c-class problem, one gets c pieces of simple classification results for x, one according to each class of training data, and each result is represented by a simple BBA with two focal elements, i.e., a singleton class and the ignorant class Ω characterizing the full ignorance. The belief of x belonging to the class ω_g is computed from the distance between x and the corresponding prototype o_g. The normalized Euclidean distance of eq. (4) is adopted here to deal with anisotropic classes, and the missing values are ignored in the calculation of this distance. The remaining mass of belief is assigned to the ignorant class Ω. Therefore, the BBA construction is done by

m_{o_g}(\omega_g) = e^{-\eta d_g}
m_{o_g}(\Omega) = 1 - e^{-\eta d_g}    (3)

with

d_g = \frac{1}{p} \sum_{j=1}^{p} \left( \frac{x_j - o_{gj}}{\delta_{gj}} \right)^2    (4)

and

\delta_{gj} = \frac{1}{N_g} \sum_{y_i \in \omega_g} (y_{ij} - o_{gj})^2    (5)

where x_j is the value of x in the j-th dimension, y_ij is the value of y_i in the j-th dimension, and p is the number of available attribute values in the object x. The coefficient 1/p is necessary to normalize the distance value, because each test sample can have a different number of missing values. δ_gj measures the average (squared) deviation of the training samples of class ω_g from the prototype o_g in the j-th dimension, and N_g is the number of training samples in ω_g. η is a tuning parameter; a bigger η generally yields a smaller mass of belief on the specific class ω_g. It is usually recommended to take η ∈ [0.5, 0.8] according to our various tests, and η = 0.7 can be considered as the default value. Obviously, the smaller the distance measure, the bigger the mass of belief on the singleton class. This particular structure of the BBA indicates that, from the training data of ω_g alone, we can only assess the degree to which the object x is associated with the specific class ω_g; the remaining mass of belief reflects the level of full ignorance, and it is committed to the ignorant class Ω.
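A minimal sketch of this BBA construction, assuming complete training arrays and NaN-encoded missing attributes; the function name class_bba and the NaN convention are our own, and eqs. (4)-(5) are implemented as reconstructed above.

```python
import numpy as np

def class_bba(x, Y_g, eta=0.7):
    """Simple two-focal-element BBA of pattern x w.r.t. one class, eqs. (2)-(5).

    x:   length-p vector with np.nan at the missing attributes.
    Y_g: (N_g, p) array of complete training samples of class omega_g.
    Returns (mass on the singleton class, mass on the ignorant class).
    """
    o_g = Y_g.mean(axis=0)                        # class prototype, eq. (2)
    delta = ((Y_g - o_g) ** 2).mean(axis=0)       # per-dimension spread, eq. (5)
    avail = ~np.isnan(x)                          # missing values are ignored
    d_g = np.mean(((x[avail] - o_g[avail]) / delta[avail]) ** 2)   # eq. (4)
    m_class = np.exp(-eta * d_g)                  # eq. (3)
    return m_class, 1.0 - m_class
```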

Similarly, one calculates the c independent BBAs m_{o_g}(.), g = 1, ..., c, based on the different training classes. Before combining these c BBAs, we examine whether a specific classification result can be derived from them. This is done as follows. Let ω_1st be the class receiving the biggest mass of belief among the c BBAs, i.e., m_{o_1st}(ω_1st) = max_g m_{o_g}(ω_g); the object is then considered as very likely belonging to the class ω_1st. Let ω_2nd be the class with the second biggest mass of belief. The distinguishability degree χ ∈ (0, 1] of an object x with respect to the different classes is defined by:

\chi = \frac{m_{o_{2nd}}(\omega_{2nd})}{m_{o_{1st}}(\omega_{1st})}    (6)

Let ε be a chosen small positive distinguishability threshold in (0, 1]. If the condition χ ≤ ε is satisfied, it means that the classes involved in the computation of χ can be clearly distinguished for x. In this case, it is very likely that a specific classification result can be obtained from the fusion of the c BBAs. The condition χ ≤ ε also indicates that the available attribute information is sufficient for making the classification of the object, and that the imputation of the missing values is not necessary. If the condition χ ≤ ε holds, the c BBAs are directly combined with the DS rule to obtain the final classification result of the object, because the DS rule usually produces specific combination results with an acceptable computation burden in low-conflict cases. In such a case, no meta-class is included in the fusion result, because the different classes are considered distinguishable according to the distinguishability condition. Moreover, the mass of belief of the full ignorance class Ω, which represents the noisy data (outliers), can be proportionally redistributed to the singleton classes for more specific results, if one knows a priori that no noisy data is involved. If the distinguishability condition χ ≤ ε is not satisfied, it means that the classes ω_1st and ω_2nd cannot be clearly distinguished for the object with respect to the chosen threshold ε, indicating that the missing attribute values almost surely play a crucial role in the classification. In this case, the missing values must be properly imputed to recover the unavailable attribute information before entering the classification procedure. This is the Step 2 of our method, explained in the next subsection.
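The first step can then be sketched as below, reusing the class_bba and ds_combine sketches given earlier; the function name, the dict-of-arrays interface and the default ε = 0.3 are illustrative assumptions (the paper tunes ε according to the accepted error/imprecision compromise).

```python
def step1_direct_classification(x, classes, eta=0.7, eps=0.3):
    """First step of CCAI (sketch): classify from the available values only.

    classes: dict mapping each class label to its training array Y_g.
    Returns the fused BBA if the two best classes are distinguishable
    (chi <= eps, eq. (6)); returns None to signal that the second step
    (imputation of the missing values) is required.
    """
    Omega = frozenset(classes)
    bbas = {}
    for g, Y_g in classes.items():
        m_g, m_ign = class_bba(x, Y_g, eta)
        bbas[g] = {frozenset({g}): m_g, Omega: m_ign}
    ranked = sorted(classes, key=lambda g: bbas[g][frozenset({g})], reverse=True)
    first, second = ranked[0], ranked[1]
    chi = bbas[second][frozenset({second})] / bbas[first][frozenset({first})]
    if chi > eps:                       # ambiguous: missing values are crucial
        return None
    fused = bbas[first]                 # low conflict: DS rule, eq. (1)
    for g in ranked[1:]:
        fused = ds_combine(fused, bbas[g])
    return fused
```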

B. Second step: classification of the incomplete pattern with imputation of the missing values

1) Multiple estimation of the missing values: Various methods exist for the estimation of the missing attribute values. In particular, the K-NN imputation method generally provides good performance, but its main drawback is its big computational burden, since one needs to calculate the distances between the object and all the training samples. Inspired by [43], we propose to use the Self-Organizing Map (SOM) technique [38] to reduce this computational complexity. SOM is applied in each class of training data, and M × N weighting vectors are obtained after the optimization procedure. These optimized weighting vectors characterize well the topological features of the whole class, and they are used to represent the corresponding data class. The number of weighting vectors is usually small (e.g., 5 × 6), so the K nearest neighbors of the test pattern among the weighting vectors of the SOM can be found with a low computational complexity. (The training of the SOM using the labeled patterns becomes time-consuming when the number of labeled patterns is big, but fortunately it can be done off-line. In our experiments, the reported running times do not include the computation time spent on the off-line procedures.) The selected weighting vector no. k of the class ω_g, g = 1, ..., c, is denoted σ_k^{ω_g}, for k = 1, ..., K. In each class, the K selected close weighting vectors provide different contributions (weights) to the estimation of the missing values, and the weight p_k^{ω_g} of each vector is defined based on the distance between the object x and the weighting vector σ_k^{ω_g}:

p_k^{\omega_g} = e^{-\lambda d_k^{\omega_g}}    (7)

with

\lambda = \frac{cMN(cMN - 1)}{2 \sum_{i,j} d(\sigma_i, \sigma_j)}    (8)

where d_k^{ω_g} is the Euclidean distance between x and the weighting vector σ_k^{ω_g}, ignoring the missing values, and 1/λ is the average distance between the pairs of weighting vectors produced by SOM over all the classes; c is the number of classes; M × N is the number of weighting vectors obtained by SOM in each class; and d(σ_i, σ_j) is the Euclidean distance between any two weighting vectors σ_i and σ_j. The weighted mean value ŷ^{ω_g} of the K selected weighting vectors of the training class ω_g is used for the imputation of the missing values. It is calculated by

\hat{y}^{\omega_g} = \frac{\sum_{k=1}^{K} p_k^{\omega_g} \sigma_k^{\omega_g}}{\sum_{k=1}^{K} p_k^{\omega_g}}    (9)

The missing values of x are filled with the values of ŷ^{ω_g} in the same dimensions. By doing this, we get the edited pattern x^{ω_g} according to the training class ω_g. Then x^{ω_g} is simply classified based only on the training data of ω_g, similarly to the direct classification of the incomplete pattern using eq. (3) in Step 1, for convenience. (Of course, other more sophisticated classifiers could also be applied here according to the choice of the user, but the choice of the classifier is not the main purpose of this work.)
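Here is a sketch of this class-conditional imputation, eqs. (7)-(9), with W_g standing for the SOM weighting vectors of one class (e.g., as returned by the train_som sketch above); the function name and the precomputed lam argument (the λ of eq. (8)) are our own conventions.

```python
import numpy as np

def impute_from_class(x, W_g, K=4, lam=0.1):
    """Estimate the missing values of x from one class's SOM vectors, eqs. (7)-(9).

    x:   length-p vector with np.nan at the missing attributes.
    W_g: (M*N, p) SOM weighting vectors of class omega_g.
    lam: lambda of eq. (8), precomputed from the average pairwise distance
         between all SOM vectors (the 0.1 default is purely illustrative).
    Returns (edited pattern, rho_g), rho_g being the sum of the K weights,
    reused later as the reliability factor of eq. (10).
    """
    avail = ~np.isnan(x)
    d = np.sqrt(((W_g[:, avail] - x[avail]) ** 2).sum(axis=1))   # ignore missing
    nearest = np.argsort(d)[:K]                                  # K nearest vectors
    p_k = np.exp(-lam * d[nearest])                              # weights, eq. (7)
    y_hat = (p_k[:, None] * W_g[nearest]).sum(axis=0) / p_k.sum()  # eq. (9)
    return np.where(avail, x, y_hat), p_k.sum()                  # fill gaps only
```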

The classification of x with the estimated missing values is also respectively done based on each of the other training classes according to this procedure. For a c-class problem, there are c training classes, and therefore one gets c pieces of classification results for one object.

2) Ensemble classifier for the credal classification: These c pieces of results, obtained from the different classes of training data, are considered with different weights, since the estimations of the missing values according to the different classes have different reliabilities. The weighting factor of the classification result associated with the class ω_g is defined by the sum of the weights of the K SOM weighting vectors selected for the imputation of the missing values in ω_g:

\rho^{\omega_g} = \sum_{k=1}^{K} p_k^{\omega_g}    (10)

The result with the biggest weighting factor ρ^{ω_max} is considered as the most reliable, because one assumes that the object must belong to one of the labeled classes (i.e., ω_g, g = 1, ..., c). So the biggest weighting factor is normalized to one, and the other relative weighting factors are defined by:

\hat{\alpha}^{\omega_g} = \frac{\rho^{\omega_g}}{\rho^{\omega_{max}}}    (11)

If the condition α̂^{ω_g} < ε is satisfied (the threshold ε is the same as in Section III-A, because it is also used here to measure a degree of distinguishability), the corresponding estimation of the missing values and the associated classification result are not very reliable: very likely, the object does not belong to this class (it is implicitly assumed that the object can belong to only one class in reality). If such a result, whose relative weighting factor is very small with respect to ε, were still considered useful, it would be (more or less) harmful for the final classification of the object. So, if the condition α̂^{ω_g} < ε holds, the relative weighting factor is set to zero. More precisely, we take

\alpha^{\omega_g} =
\begin{cases}
0, & \text{if } \hat{\alpha}^{\omega_g} < \varepsilon \\
\rho^{\omega_g} / \rho^{\omega_{max}}, & \text{otherwise}
\end{cases}    (12)

After the estimation of the weighting (discounting) factors α^{ω_g}, the c classification results (the BBAs m_{o_g}(.)) are classically discounted [16] by

\hat{m}_{o_g}(\omega_g) = \alpha^{\omega_g} m_{o_g}(\omega_g)
\hat{m}_{o_g}(\Omega) = 1 - \alpha^{\omega_g} + \alpha^{\omega_g} m_{o_g}(\Omega)    (13)

These discounted BBAs are then globally combined to get the credal classification result. If α^{ω_g} = 0, one gets m̂_{o_g}(Ω) = 1, and this fully ignorant (vacuous) BBA plays a neutral role in the global fusion process for the final classification of the object.
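A small sketch of this reliability-based discounting, eqs. (10)-(13); the dict interface and the function name discount_bbas are assumptions of ours, and rho_g is the weight sum returned by the impute_from_class sketch above.

```python
def discount_bbas(bbas, rhos, eps=0.3):
    """Discount the c class-conditional BBAs by their reliability, eqs. (10)-(13).

    bbas: dict g -> (m_singleton, m_Omega), computed on the edited patterns.
    rhos: dict g -> rho_g, the sum of the K imputation weights, eq. (10).
    Relative factors below eps are forced to zero (vacuous BBA), eq. (12).
    """
    rho_max = max(rhos.values())
    discounted = {}
    for g, (m_g, m_ign) in bbas.items():
        alpha = rhos[g] / rho_max                    # eq. (11)
        if alpha < eps:                              # unreliable estimation
            alpha = 0.0                              # eq. (12)
        discounted[g] = (alpha * m_g,                # eq. (13)
                         1.0 - alpha + alpha * m_ign)
    return discounted
```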

Although we do our best to estimate the missing values, the estimation can remain quite imprecise when the estimations obtained from different classes have similar weighting factors, and the different estimations may then lead to distinct classification results. In such a case, we prefer to cautiously keep (rather than ignore) the uncertainty, and to maintain it in the classification result. This uncertainty is well reflected by the conflict between the classification results represented by the BBAs. The DS rule is not suitable here, because it redistributes all the conflicting beliefs to the other focal elements. A particular combination rule inspired by the DP rule is therefore introduced to fuse these BBAs in the present context. In our new rule, the partial conflicting beliefs are prudently transferred to the proper meta-classes, to reveal the degree of imprecision of the classification caused by the missing values. This new rule of combination is defined by:

m(\omega_g) = \hat{m}_{o_g}(\omega_g) \prod_{j \neq g} \hat{m}_{o_j}(\Omega)
m(A) = \prod_{\omega_j \subseteq A} \hat{m}_{o_j}(\omega_j) \prod_{\omega_k \not\subseteq A} \hat{m}_{o_k}(\Omega), \quad A = \bigcup_j \omega_j    (14)

The test pattern is then classified according to the fusion result, and the object is considered as belonging to the class (singleton class or meta-class) with the maximum mass of belief. This is called hard credal classification. If an object is classified into a particular singleton class, it means that this object has been correctly handled with the proper imputation of its missing values. If an object is committed to a meta-class (e.g., A ∪ B), it means that we just know that this object belongs to one of the specific classes (e.g., A or B) included in the meta-class, but we cannot specify which one. This happens when the missing values are essential for the accurate classification of the object but cannot be estimated very well according to the context, so that different estimations drive the classification of the object into distinct classes (e.g., A or B).
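A sketch of this DP-inspired fusion for the c two-focal-element discounted BBAs: each source contributes either its singleton or Ω, and the union of the contributed singletons receives the product mass, so partial conflicts land in meta-classes instead of being renormalized away as in the DS rule. The exhaustive 2^c enumeration and the function name are our own illustrative choices (fine for small c).

```python
from itertools import product

def dp_like_fusion(discounted):
    """Fuse the discounted two-focal BBAs, keeping conflict in meta-classes (eq. (14)).

    discounted: dict g -> (m_singleton, m_Omega), one BBA per class.
    Returns a BBA over frozensets: singletons, meta-classes and Omega.
    """
    labels = list(discounted)
    Omega = frozenset(labels)
    fused = {}
    # enumerate every way the c sources contribute singleton (True) or Omega
    for choice in product([True, False], repeat=len(labels)):
        mass, singletons = 1.0, set()
        for g, takes_singleton in zip(labels, choice):
            m_g, m_ign = discounted[g]
            mass *= m_g if takes_singleton else m_ign
            if takes_singleton:
                singletons.add(g)
        A = frozenset(singletons) if singletons else Omega
        fused[A] = fused.get(A, 0.0) + mass
    return fused

# hard credal classification: pick the focal element with the maximal mass;
# a winning meta-class reveals that the imputation could not separate its classes:
# decision = max(fused, key=fused.get)
```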

For convenience, Fig. 1 shows the functional flowchart of this new CCAI method.

Figure 1. Flowchart of the proposed CCAI method.

Guideline for the tuning of the parameters ε and η: The tuning of the parameters η and ε is very important for the application of CCAI. η, in eq. (3), is associated with the calculation of the mass of belief on the specific class, and a bigger η value leads to a smaller mass of belief committed to the specific class. Based on our various tests, we advise to take η ∈ [0.5, 0.8], and the value η = 0.7 can be taken as the default value. The parameter ε is the threshold that triggers the change of classification strategy; it is also used in eq. (12) for the calculation of the discounting factors. A bigger ε makes fewer objects go through the sophisticated classification procedure with imputation of the missing values, and it also forces more discounting factors to zero according to eq. (12), which implies that fewer of the simple classification results obtained from each class remain useful in the global fusion step. So a bigger ε commits fewer objects to the meta-classes (corresponding to a lower imprecision of the classification), but it increases the risk of misclassification errors. ε should therefore be tuned according to the compromise one can accept between the misclassification error and the imprecision (non-specificity) of the classification decision. One can also apply cross validation [44] (e.g., the leave-one-out method) in the training data space to find a suitable threshold, with the missing values of the test samples randomly distributed over all the dimensions.

IV. EXPERIMENTS

Three experiments with artificial and real data sets are used to test the performance of the new CCAI method, compared with the K-NN imputation (KNNI) method [12], the FCM imputation (FCMI) method [13], [14], the SOM imputation (SOMI) method [15] and our previous credal classification method PCC [25]. The SOM technique is also employed in the second step of CCAI, but CCAI differs from the previous SOMI method. In SOMI, a SOM is fitted to the whole training data set, and the missing values are precisely estimated based on an activation group composed of the best match node (unit) of the input pattern and its close neighbors; the edited pattern with the imputed missing values is then classified by a standard classifier. By contrast, SOM is not involved in the first step of CCAI, where the object is directly classified ignoring the missing values; in the second step of CCAI, a SOM is respectively fitted to each training class, and multiple estimations of the missing values are obtained based on the input pattern's K nearest weighting vectors (corresponding to nodes of the SOM) in each class. Different classification results are then produced according to the different estimations, and these results are globally fused for the final classification. The conflicting information committed to the meta-classes is kept in the fusion to characterize the imprecision of the classification in CCAI, which cannot be done in SOMI. These different methods have been programmed and tested with MATLAB software. The evidential neural network classifier (ENN) [27] is adopted in the following experiments to classify the edited patterns with the estimated values in PCC, KNNI and FCMI, since ENN generally produces good classification results. (Other traditional classifiers for complete patterns could also be selected here, according to the actual application.) The evidential K-nearest neighbor (EK-NN) method [21] is also used to classify the edited patterns in Experiment 3, with real data, for comparison. The parameters of ENN and EK-NN can be automatically optimized as explained in [27] and [22]. In SOMI, we use M × N = 6 × 8 nodes for mapping the whole input data set, consisting of all the training classes, onto the 2-dimensional grid, which gives good performance. In the applications of PCC, the tuning parameter ε can be adjusted according to the imprecision rate one can accept. In CCAI, a small number of nodes is used for the 2-dimensional SOM grid of each class, M × N = 3 × 4, and we take K = N = 4 in the K-NN imputation of the missing values; this provides good results in the following experiments. In order to show the ability of CCAI and PCC to deal with the meta-classes, the hard credal classification is applied, and the class of each object is decided according to the criterion of the maximal mass of belief.

In our simulations, a misclassification is declared (counted) for an object truly originating from ω_i if it is classified into A with ω_i ∩ A = ∅. If ω_i ∩ A ≠ ∅ and A ≠ ω_i, it is considered as an imprecise classification. The error rate, denoted Re, is calculated by Re = Ne/T, where Ne is the number of misclassification errors and T is the number of objects under test. The imprecision rate, denoted Rj, is calculated by Rj = Nj/T, where Nj is the number of objects committed to meta-classes of cardinality j. In our experiments, the classification of an object is generally uncertain (imprecise) only among a very small number (e.g., 2) of classes, and we only report R2 here, since no object is committed to a meta-class including three or more specific classes.

A. Experiment 1 (artificial data set)

In this first experiment, we show the interest of the credal classification based on belief functions with respect to the traditional classification working within the probability framework. A 3-class data set over Ω = {ω_1, ω_2, ω_3}, drawn from three 2-D uniform distributions and shown in Fig. 2, is considered here. Each class has 200 training samples and 200 test samples, so there are 600 training samples and 600 test samples in total. The uniform distributions of the three classes are characterized by the following interval bounds:

       x interval    y interval
ω_1    (5, 65)       (5, 25)
ω_2    (95, 155)     (5, 25)
ω_3    (50, 110)     (50, 70)

The values of the second dimension (the y-coordinate) of the test samples are all missing, so the test samples are classified according to the single available value of the first dimension (the x-coordinate). Several different methods, namely FCMI, KNNI and SOMI, are applied here for comparison with CCAI, as shown in Fig. 3-(a) to 3-(f). In particular, the classification results obtained using a single (first or second) step of CCAI, denoted SCCAI, are also given, in Fig. 3-(d) and 3-(e). With only the first step of CCAI, the direct classification is done without imputation of the missing value, whereas with only the second step of CCAI the objects are classified after imputation of the missing values in all the incomplete patterns. A particular value K = 9 is selected in the K-NN imputation method (in fact, choices of K ranging from 7 to 15 do not seriously affect the results). For notation conciseness, we denote ω_i^te ≜ ω_i^test, ω_i^tr ≜ ω_i^training, and ω_{i,...,k} ≜ ω_i ∪ ... ∪ ω_k. The error rate (in %), the imprecision rate (in %) and the computation time (sec.) are specified in the caption of each subfigure.
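The error and imprecision scores defined at the beginning of this section translate directly into code; this small helper, with our own naming, assumes hard credal decisions represented as frozensets (singletons or pairs), as in the fusion sketches above.

```python
def credal_scores(decisions, truth):
    """Error rate Re and imprecision rate R2 for hard credal decisions.

    decisions: list of frozensets (a singleton class or a meta-class).
    truth:     list of the true class labels.
    """
    T = len(truth)
    Ne = sum(1 for A, w in zip(decisions, truth) if w not in A)           # errors
    N2 = sum(1 for A, w in zip(decisions, truth) if w in A and len(A) == 2)
    return Ne / T, N2 / T
```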

Figure 2. Training data and test data (training and test samples of ω_1, ω_2 and ω_3).

Because the y value of each test sample is missing, the class ω_3 appears partially overlapped with the classes ω_1 and ω_2 on their margins according to the value of the x-coordinate, as shown in Fig. 3-(a). The missing value of the samples lying in the overlapped parts can be filled with quite different estimations obtained from different classes having almost the same reliabilities. For example, the estimation of the missing value of the objects in the right margin of ω_1 and the left margin of ω_3 can be obtained according to the training class ω_1 or ω_3. The edited pattern with the estimation drawn from ω_1 will be classified into the class ω_1, whereas it will be committed to the class ω_3 if the estimation is drawn from ω_3. The situation is similar for the test samples in the left margin of ω_2 and the right margin of ω_3. This indicates that the missing value plays a crucial role in the classification of these objects, but that, unfortunately, the estimation of these missing values is quite uncertain according to the context.

Figure 3. Classification results of the 3-class artificial data set by different methods: (a) FCMI (Re = 14.67); (b) KNNI (Re = 14.17); (c) SOMI (Re = 14.33); (d) first step of SCCAI only (Re = 14.83); (e) second step of SCCAI only (Re = 4.83, R2 = 19.33); (f) CCAI (Re = 5.83, R2 = 16.83).

So these objects are prudently classified into the proper meta-classes (e.g., ω_1 ∪ ω_3 and ω_2 ∪ ω_3) by CCAI. The CCAI results indicate that these objects belong to one of the specific classes included in the meta-classes, but that these specific classes cannot be clearly distinguished based only on the available values.

If one wants more precise and accurate classification results, one needs to request additional sources for gathering more useful information. The other objects, in the left margin of ω_1, the right margin of ω_2 and the middle of ω_3, can be correctly classified based on the single known value of the x-coordinate, and it is not necessary to estimate the missing value for the classification of these objects in CCAI. However, all the test samples are classified into specific classes by the traditional methods KNNI and FCMI, and this causes many errors due to the limitation of the probability framework. If we just apply the first step of SCCAI, without imputation of the missing value, and directly classify all the objects using the only known value (i.e., the x-coordinate), a bigger error rate than with the other methods is produced, which shows that the imputation procedure is important for improving the accuracy of the classification. If only the second step of SCCAI is applied, with imputation of the missing values in all the incomplete patterns, a high imprecision rate is obtained, which is not an efficient solution, and the computation time is much longer than with CCAI. CCAI, with its adaptive imputation strategy, can well balance the error rate, the imprecision rate and the computation burden. CCAI, consisting of the two steps, generally produces a smaller error rate than KNNI, FCMI and SOMI, thanks to the use of the meta-classes. Meanwhile, the computation time of CCAI is similar to that of FCMI, and much shorter than that of KNNI, because of the introduction of the SOM technique in the estimation of the missing values; this shows that the computational complexity of CCAI is relatively low. This simple example illustrates the interest and the potential of the credal classification obtained with the CCAI method.

B. Experiment 2 (artificial data set)

In this second experiment, we evaluate the performance of the CCAI method using a 4-D data set with 3 classes ω_1, ω_2 and ω_3. The artificial data are generated from three 4-D Gaussian distributions characterized by the following mean vectors and covariance matrices (I denotes the 4 × 4 identity matrix):

µ_1 = (10, 50, 100, 100)^T, Σ_1 = 10 I
µ_2 = (30, 40, 50, 90)^T,  Σ_2 = 15 I
µ_3 = (20, 80, 90, 130)^T, Σ_3 = 12 I

We use g training samples and g test samples (for g = 500 and g = 1000) in each class, so there are in total N = 3g training samples and N = 3g test samples. Each test sample has n missing values (for n = 1, 2, 3), randomly distributed over the dimensions. The other methods, KNNI, FCMI, SOMI and PCC, are also applied here for the performance comparison.

For each pair (N, n), the reported error rates, imprecision rates and running times (sec.) are averages over 10 trials performed with 10 independent random generations of the data sets. For KNNI, values of K ranging from 5 to 20 neighbors have been tested, and the mean error rate over K ∈ [5, 20] is given in Table I. In the PCC method, the parameter ε has been optimized to obtain an acceptable compromise between the error rate and the imprecision degree. ENN is adopted to classify the edited patterns with imputed missing values in FCMI, KNNI, SOMI and PCC.

Table I. Classification results for the 3-class data set by different methods (in %).

(N, n)       FCMI Re   KNNI Re   SOMI Re   PCC {Re, R2}    CCAI {Re, R2}
(1500, 1)    6.73      7.42      7.22      {6.20, 2.33}    {4.64, 3.87}
(1500, 2)    14.38     15.68     15.43     {13.47, 5.93}   {9.76, 9.79}
(1500, 3)    36.84     40.11     40.10     {34.57, 7.97}   {29.71, 15.6}
(3000, 1)    6.75      7.54      7.14      {6.17, 1.63}    {4.73, 3.83}
(3000, 2)    14.73     15.80     15.20     {14.00, 1.60}   {9.90, 10.33}
(3000, 3)    36.43     40.48     40.05     {33.94, 8.13}   {29.52, 16.83}

The classification results of the applied methods (i.e., FCMI, KNNI, SOMI, PCC and CCAI) are shown in Table I. Our proposed CCAI method produces the lowest error rate, since the objects that are hard to classify correctly because of the missing values are committed to the proper meta-classes. Meanwhile, CCAI takes the shortest computation time among the compared methods. This is because some incomplete patterns are directly classified ignoring the missing values, considered unimportant for the classification, whereas the missing values of every pattern are imputed by the other methods, which requires more computations and thus increases the computation time. Moreover, one can see that KNNI takes the longest time, which is the main drawback of K-NN based methods. The K-NN strategy is also adopted in CCAI, but we use a few optimized weighting vectors acquired by the SOM technique to represent each whole training class. Thus, we just need to calculate the distances between the object and these weighting vectors, rather than to all the training samples, which greatly reduces the computation burden.

C. Experiment 3 (real data sets)

Nine well-known real data sets available from the UCI Machine Learning Repository [45] are used in this experiment to evaluate the performance of CCAI with respect to KNNI, FCMI, SOMI and PCC. (We select seven classes of the Yeast data set, because the last three classes, i.e., VAC, POX and ERL, contain quite few samples.) Both ENN and EK-NN are employed here as standard classifiers to classify the edited patterns. Moreover, the single (first or second) step procedure of CCAI (SCCAI) is also applied here for comparison.

In the first step of SCCAI, the object is directly classified using only the available attributes, without any imputation procedure, whereas all the missing values are imputed before the classification in the second step of SCCAI. The basic information about the real data sets used here is given in Table II. In the Hepatitis data set, many patterns already contain missing values; the patterns with missing values are taken as test samples, and the others are used as training samples. There are no missing values in the other seven original data sets, and it is assumed that n values are missing completely at random over all the dimensions of each test sample. Cross validation is performed for these seven data sets, and we use the simplest 2-fold cross validation here, since it has the advantage that the training and test sets are both large, and that each sample is used for both training and testing on each fold. (More precisely, the samples of each class are randomly assigned to two sets S_1 and S_2 of equal size; we then train on S_1 and test on S_2, and reciprocally.) The 2-fold cross validation has been repeated 10 times, and the average error rate Re and imprecision rate R2 (for PCC and CCAI) of the different methods are given in Table III. The reported classification result of KNNI is the average over K ranging from 5 to 15. For notation conciseness, the selected classifier (SC) is denoted by A = EK-NN and B = ENN in Table III; for the single-step versions of CCAI (SCCAI), A represents the first step and B the second step.

Table II. Basic information (number of classes, of attributes and of instances) of the data sets used: Breast, Hepatitis, Statlog (Heart), Iris, Seeds, Wine, Knowledge, Vehicle, Yeast.

One can see in Table III that the credal classifications of PCC and CCAI always produce a lower error rate than the traditional FCMI, KNNI and SOMI methods, since the objects that cannot be correctly classified using only the available attribute values are properly committed to meta-classes, which well reveals the imprecision of the classification. The classifiers selected for the classification of the edited patterns in FCMI, KNNI, SOMI and PCC (i.e., EK-NN and ENN) usually show similar performance in many cases (EK-NN sometimes outperforms ENN, and ENN can be better in some other cases), but it is known that K-NN based methods generally carry a big computation burden; the choice between EK-NN and ENN should be made according to the actual conditions of the real application.

In CCAI, some objects are still classified into a meta-class even after the imputation of their missing values. This indicates that these missing values play a crucial role in the classification, but that their estimation is not very good. In other words, the missing values can be filled, with similar reliabilities, by different estimates, which lead to distinct classification results. So we have to cautiously assign these objects to the meta-class to reduce the risk of misclassification. Compared with our previous method PCC, the new method CCAI generally provides better performance, with lower error and imprecision rates, mainly because a more accurate estimation method for the missing values (i.e., SOM+KNN) is adopted in CCAI. However, if only the first step of SCCAI is applied, more misclassification errors are produced than with the other methods, due to the absence of imputation of the missing data; conversely, the imprecision rate becomes quite high if only the second step of SCCAI is adopted, because all the conflicting beliefs arising in the combination procedure are transferred to the meta-classes. So CCAI, with its adaptive imputation of the missing values, provides a good compromise between errors and imprecision. This third experiment, using real data sets, shows the effectiveness and the interest of the new CCAI method with respect to the other methods.

V. CONCLUSION

A new credal classification method with adaptive imputation of missing values (called CCAI) for dealing with incomplete patterns has been presented, based on belief function theory. In the first step of the CCAI method, some objects (incomplete patterns) are directly classified ignoring the missing values, whenever a specific classification result can be obtained; this effectively reduces the computational complexity, because it avoids the imputation of the missing values. However, if the available information is not sufficient to achieve a specific classification of the object in the first step, we estimate (recover) the missing values before entering the classification procedure, in a second step. The SOM and K-NN techniques are applied to estimate the missing attributes with a good compromise between the estimation accuracy and the computation burden. The credal classification developed in this work allows the object to belong to the different singleton classes and meta-classes (i.e., disjunctions of several classes) with different masses of belief. Once an object is committed to a meta-class (e.g., A ∪ B), it means that its missing values cannot be accurately recovered according to the context, and that the estimation is not very good: different estimations lead the object to the distinct classes (e.g., A or B) involved in the meta-class. Some other sources of information will then be required if a more precise classification of the object is necessary. The credal classification is


More information

Statistical pattern recognition

Statistical pattern recognition Statstcal pattern recognton Bayes theorem Problem: decdng f a patent has a partcular condton based on a partcular test However, the test s mperfect Someone wth the condton may go undetected (false negatve

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Sampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING

Sampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING Samplng heory MODULE VII LECURE - 3 VARYIG PROBABILIY SAMPLIG DR. SHALABH DEPARME OF MAHEMAICS AD SAISICS IDIA ISIUE OF ECHOLOGY KAPUR he smple random samplng scheme provdes a random sample where every

More information

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018

INF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018 INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Regularized Discriminant Analysis for Face Recognition

Regularized Discriminant Analysis for Face Recognition 1 Regularzed Dscrmnant Analyss for Face Recognton Itz Pma, Mayer Aladem Department of Electrcal and Computer Engneerng, Ben-Guron Unversty of the Negev P.O.Box 653, Beer-Sheva, 845, Israel. Abstract Ths

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

Which Separator? Spring 1

Which Separator? Spring 1 Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal

More information

SDMML HT MSc Problem Sheet 4

SDMML HT MSc Problem Sheet 4 SDMML HT 06 - MSc Problem Sheet 4. The recever operatng characterstc ROC curve plots the senstvty aganst the specfcty of a bnary classfer as the threshold for dscrmnaton s vared. Let the data space be

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k)

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k) ISSN 1749-3889 (prnt), 1749-3897 (onlne) Internatonal Journal of Nonlnear Scence Vol.17(2014) No.2,pp.188-192 Modfed Block Jacob-Davdson Method for Solvng Large Sparse Egenproblems Hongy Mao, College of

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

Support Vector Machines CS434

Support Vector Machines CS434 Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? Intuton of Margn Consder ponts A, B, and C We

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

CHAPTER III Neural Networks as Associative Memory

CHAPTER III Neural Networks as Associative Memory CHAPTER III Neural Networs as Assocatve Memory Introducton One of the prmary functons of the bran s assocatve memory. We assocate the faces wth names, letters wth sounds, or we can recognze the people

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Learning from Data 1 Naive Bayes

Learning from Data 1 Naive Bayes Learnng from Data 1 Nave Bayes Davd Barber dbarber@anc.ed.ac.uk course page : http://anc.ed.ac.uk/ dbarber/lfd1/lfd1.html c Davd Barber 2001, 2002 1 Learnng from Data 1 : c Davd Barber 2001,2002 2 1 Why

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Evaluation for sets of classes

Evaluation for sets of classes Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton

More information

Uncertainty and auto-correlation in. Measurement

Uncertainty and auto-correlation in. Measurement Uncertanty and auto-correlaton n arxv:1707.03276v2 [physcs.data-an] 30 Dec 2017 Measurement Markus Schebl Federal Offce of Metrology and Surveyng (BEV), 1160 Venna, Austra E-mal: markus.schebl@bev.gv.at

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

3.1 ML and Empirical Distribution

3.1 ML and Empirical Distribution 67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Spatial Statistics and Analysis Methods (for GEOG 104 class).

Spatial Statistics and Analysis Methods (for GEOG 104 class). Spatal Statstcs and Analyss Methods (for GEOG 104 class). Provded by Dr. An L, San Dego State Unversty. 1 Ponts Types of spatal data Pont pattern analyss (PPA; such as nearest neghbor dstance, quadrat

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm Desgn and Optmzaton of Fuzzy Controller for Inverse Pendulum System Usng Genetc Algorthm H. Mehraban A. Ashoor Unversty of Tehran Unversty of Tehran h.mehraban@ece.ut.ac.r a.ashoor@ece.ut.ac.r Abstract:

More information

Primer on High-Order Moment Estimators

Primer on High-Order Moment Estimators Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc

More information

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013 ISSN: 2277-375 Constructon of Trend Free Run Orders for Orthogonal rrays Usng Codes bstract: Sometmes when the expermental runs are carred out n a tme order sequence, the response can depend on the run

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Hidden Markov Models

Hidden Markov Models CM229S: Machne Learnng for Bonformatcs Lecture 12-05/05/2016 Hdden Markov Models Lecturer: Srram Sankararaman Scrbe: Akshay Dattatray Shnde Edted by: TBD 1 Introducton For a drected graph G we can wrte

More information

Methods of Combining Multiple Classifiers with Different Features and Their Applications to Text-Independent Speaker Identification

Methods of Combining Multiple Classifiers with Different Features and Their Applications to Text-Independent Speaker Identification Internatonal Journal of Pattern Recognton and Artfcal Intellgence, Vol. 11, No. 3, pp. 417-445, 1997. World Scentfc Publsher Co. Methods of Combnng Multple Classfers wth Dfferent Features and Ther Applcatons

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one)

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one) Why Bayesan? 3. Bayes and Normal Models Alex M. Martnez alex@ece.osu.edu Handouts Handoutsfor forece ECE874 874Sp Sp007 If all our research (n PR was to dsappear and you could only save one theory, whch

More information

Clustering & Unsupervised Learning

Clustering & Unsupervised Learning Clusterng & Unsupervsed Learnng Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 2012 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y

More information

Lecture 7: Boltzmann distribution & Thermodynamics of mixing

Lecture 7: Boltzmann distribution & Thermodynamics of mixing Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters

More information

Outline. Multivariate Parametric Methods. Multivariate Data. Basic Multivariate Statistics. Steven J Zeil

Outline. Multivariate Parametric Methods. Multivariate Data. Basic Multivariate Statistics. Steven J Zeil Outlne Multvarate Parametrc Methods Steven J Zel Old Domnon Unv. Fall 2010 1 Multvarate Data 2 Multvarate ormal Dstrbuton 3 Multvarate Classfcaton Dscrmnants Tunng Complexty Dscrete Features 4 Multvarate

More information

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z ) C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

Multigradient for Neural Networks for Equalizers 1

Multigradient for Neural Networks for Equalizers 1 Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

I529: Machine Learning in Bioinformatics (Spring 2017) Markov Models

I529: Machine Learning in Bioinformatics (Spring 2017) Markov Models I529: Machne Learnng n Bonformatcs (Sprng 217) Markov Models Yuzhen Ye School of Informatcs and Computng Indana Unversty, Bloomngton Sprng 217 Outlne Smple model (frequency & profle) revew Markov chan

More information

Tracking with Kalman Filter

Tracking with Kalman Filter Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,

More information

Ph 219a/CS 219a. Exercises Due: Wednesday 23 October 2013

Ph 219a/CS 219a. Exercises Due: Wednesday 23 October 2013 1 Ph 219a/CS 219a Exercses Due: Wednesday 23 October 2013 1.1 How far apart are two quantum states? Consder two quantum states descrbed by densty operators ρ and ρ n an N-dmensonal Hlbert space, and consder

More information