Learning to Identify Unexpected Instances in the Test Set


Xiao-Li Li, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore, xlli@i2r.a-star.edu.sg
Bing Liu, Department of Computer Science, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL, liub@cs.uic.edu
See-Kiong Ng, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore, skng@i2r.a-star.edu.sg

Abstract

Traditional classification involves building a classifier using labeled training examples from a set of predefined classes and then applying the classifier to classify test instances into the same set of classes. In practice, this paradigm can be problematic because the test data may contain instances that do not belong to any of the previously defined classes. Detecting such unexpected instances in the test set is an important issue in practice. The problem can be formulated as learning from positive and unlabeled examples (PU learning). However, current PU learning algorithms require a large proportion of negative instances in the unlabeled set to be effective. This paper proposes a novel technique to solve this problem in the text classification domain. The technique first generates a single artificial negative document A_N. The sets P and {A_N} are then used to build a naïve Bayesian classifier. Our experiment results show that this method is significantly better than existing techniques.

1 Introduction

Classification is a well-studied problem in machine learning. Traditionally, to build a classifier, a user first collects a set of training examples that are labeled with predefined or known classes. A classification algorithm is then applied to the training data to build a classifier that is subsequently employed to assign the predefined classes to instances in a test set (for evaluation) or future instances (in practice). This paradigm can be problematic in practice because some of the test or future instances may not belong to any of the predefined classes of the original training set. The test set may contain additional unknown subclasses, or new subclasses may arise as the underlying domain evolves over time. For example, in cancer classification, the training set consists of data from currently known cancer subtypes.
However, since cancer is a complex and heterogeneous disease, and still a perplexing one to date, it is likely that the test data contain cancer subtypes that are not yet medically classified (they are therefore not covered in the training data). Even if the training data do contain all the current cancer subtypes, new subtypes may be formed at a later stage as the disease evolves due to mutations or other cancer-causing agents. This phenomenon is not uncommon even in seemingly simpler application domains. For example, in document classification, topics are often heterogeneous and new topics evolve over time. A document classifier built for classifying, say, computer science papers could face similar problems as the cancer classifier described above. This is because computer science is a heterogeneous and increasingly cross-disciplinary domain; it is also a rapidly evolving one, with new topics being created over time. Thus, a classifier created based on the notion of a fixed set of predefined classes is bound to be inadequate in the complex and dynamic real world in the long run, requiring the user to manually go through the classification results to remove the unexpected instances. In practice, a competent classifier should learn to identify unexpected instances in the test set so as to automatically set these unclassifiable instances apart. In some applications, this can be important in itself. For example, in the cancer example above, detection of the unexpected instances can alert the scientists that some new medical discovery (a new cancer subtype) may have occurred. In recent years, researchers have studied the problem of learning from positive and unlabeled examples (or PU learning). Given a positive set P and an unlabeled set U, a PU learning algorithm learns a classifier that can identify the hidden positive documents in the unlabeled set U. Our problem of identifying unexpected instances in the test set can be modeled as a PU learning problem by treating all the training data as the positive set P and the test set as the unlabeled set U.
A classifier can then be learned using PU learning algorithms to classify the test set to identify those unexpected (or negative) instances before applying a traditional classifier to classify the remaining instances into the original predefined classes. However, as the current PU techniques operate by trying to identify an adequate set of reliable negative data from the unlabeled set U to learn from, they require a large proportion of unexpected instances in the unlabeled set U to be effective. In practice, the number of unexpected instances in the test data can be very small, since they are most likely to be arising from an emerging class. This means that the classifiers built with existing PU learning techniques will

perform poorly due to the small number of unexpected (negative) instances in U. In this paper, we propose a novel technique called LGN (PU Learning by Generating Negative examples), and we study the problem using text classification. LGN uses an entropy-based method to generate a single artificial negative document A_N based on the information in P and U, in which the features' frequency distributions correspond to the degrees of negativeness in terms of their respective entropy values. A more accurate classifier (we use the naïve Bayesian method) can be built to identify unexpected instances with the help of the artificial negative document A_N. Experimental results on the benchmark 20 Newsgroup data show that LGN outperforms existing methods dramatically.

2 Related Work

PU learning was investigated by several researchers in recent years. A study of PAC learning from positive and unlabeled examples under the statistical query model was given in [Denis, 1998]. [Liu et al., 2002] reported sample complexity results and showed how the problem may be solved. Subsequently, a number of practical algorithms [Liu et al., 2002; Yu et al., 2002; Li and Liu, 2003] were proposed. They all conformed to the theoretical results in [Liu et al., 2002], following a two-step strategy: (1) identifying a set of reliable negative documents from the unlabeled set; and (2) building a classifier using EM or SVM iteratively. Their specific differences in the two steps are as follows. S-EM, proposed in [Liu et al., 2002], is based on naïve Bayesian classification and the EM algorithm [Dempster et al., 1977]. The main idea was to first use a spying technique to identify some reliable negative documents from the unlabeled set, and then to run EM to build the final classifier. PEBL [Yu et al., 2002] uses a different method (1-DNF) to identify reliable negative examples and then runs SVM iteratively to build a classifier. More recently, [Li and Liu, 2003] reported a technique called Roc-SVM. In this technique, reliable negative documents are extracted using the information retrieval technique Rocchio [Rocchio, 1971], and SVM is used in the second step. In [Fung et al., 2005], a method called PN-SVM is proposed to deal with the situation when the positive set is small.
All these existing methods require that the unlabeled set have a large number of hidden negative instances. In this paper, we deal with the opposite problem, i.e., the case in which the number of hidden negative instances is very small. Another line of related work is learning from only positive data. In [Scholkopf et al., 1999], a one-class SVM was proposed. It was also studied in [Manevitz and Yousef, 2001] and [Crammer and Chechik, 2004]. One-class SVM builds a classifier by treating the training data as the positive set P. Those instances in the test set that are classified as negative by the classifier can be regarded as unexpected instances. However, our experiments show that its results are poorer than those of PU learning, which indicates that unlabeled data helps classification.

3 The Proposed Algorithm

Given a training set with instances from multiple classes c_i (i = 1, 2, ..., n), our target is to automatically identify those unexpected instances in the test set T that do not belong to any of the training classes. In the next subsection (Section 3.1), we describe a baseline algorithm that directly applies PU learning techniques to identify unexpected instances. Then, in Section 3.2, we present our proposed LGN algorithm.

3.1 Baseline Algorithms: PU Learning

To recapitulate, our problem of identifying unexpected instances in the test set can be formulated as a PU learning problem as follows. The training instances of all classes are first combined to form the positive set P. The test set T then forms the unlabeled set U, which contains both positive instances (i.e., those belonging to training classes) and negative/unexpected instances in T (i.e., those not belonging to any training class). Then, PU learning techniques can be employed to build a classifier to classify the unlabeled set U (test set T) to identify the negative instances in U (the unexpected instances). Figure 1 gives the detailed framework for generating baseline algorithms based on PU learning techniques.

1. UE = ∅;
2. P = training examples from all classes (treated as positive);
3. U = T (test set; ignore the class labels in T if present);
4. Run an existing PU learning algorithm with P and U to build a classifier Q;
5. For each instance d in U (which is the same as T)
6.     Use the classifier Q to classify d;
7.     If d is classified as negative then
8.         UE = UE ∪ {d};
9. output UE

Figure 1. Directly applying existing PU learning techniques

In the baseline algorithm, we use a set UE to store the negative (unexpected) instances identified. Step 1 initializes UE to the empty set, while Steps 2-3 initialize the positive set P and the unlabeled set U as described above. In Step 4, we run an existing PU learning algorithm (various PU learning techniques can be applied to build different classifiers) to construct a classifier Q. We then employ the classifier Q to classify the test instances in U in Steps 5 to 8. Those instances that Q classifies as negative are added to UE as unexpected instances. After we have iterated through all the test instances, Step 9 outputs the unexpected set UE.

3.2 The Proposed Technique: LGN

In traditional classification, the training and test instances are drawn independently according to some fixed distribution D over X × Y, where X denotes the set of possible documents in our text classification application and Y = {c_1, c_2, ..., c_n} denotes the known classes. Theoretically, for each class c_i, if its training and test instances follow the same distribution, a classifier learned from the training instances can be used to classify the test instances into the n known classes. In our problem, the training set Tr, with instances from the classes c_1, c_2, ..., c_n, is still drawn from the distribution D. However, the test set T consists of two subsets, T.P (called the positive instances in T) and T.N (called the unexpected/negative instances in T). The instances in T.P are independently drawn

from D, but the instances in T.N are drawn from an unknown and different distribution D_u. Our objective is to identify all the instances drawn from this unknown distribution D_u, or in other words, to identify all the hidden instances in T.N. Let us now formally reformulate this problem as a two-class classification problem without labeled negative training examples. We first rename the training set Tr as the positive set P by changing every class label c_i ∈ Y to "+" (the positive class). We then rename the test set T as the unlabeled set U, which comprises both hidden positive instances and hidden unexpected instances. The unexpected instances in U (or T) are now called negative instances, with the class label "−" (bear in mind that there are many hidden positive instances in U). A learning algorithm will select a function f from a class of functions F: X → {+, −} to be used as a classifier that can identify the unexpected (negative) instances in U. The problem here is that there are no labeled negative examples for learning. Thus, it becomes a problem of learning from positive and unlabeled examples (PU learning). As discussed in the previous section, this problem has been studied by researchers in recent years, but existing PU techniques perform poorly when the number of negative (unexpected) instances in U is very small. To address this, we propose a technique to generate artificial negative documents based on the given data.

Let us analyze the problem from a probabilistic point of view. In our text classification problem, documents are commonly represented by the frequencies of the words w_1, w_2, ..., w_|V| that appear in the document collection, where V is called the vocabulary. Let w+ represent a positive word feature that characterizes the instances in P and let w− represent a negative word feature that characterizes the negative (unexpected) instances in U. If U contains a large proportion of positive instances, then the feature w+ will have similar distributions in both P and U. However, for the negative feature w−, its probability distributions in the sets P and U are very different.
Our strategy is to exploit this difference to generate an effective set of artificial negative documents N so that it can be used together with the positive set P for classifier training to identify negative (unexpected) documents in U accurately. Given that we use the naïve Bayesian framework in this work, before going further, we now introduce the naïve Bayesian classifier for text classification.

NAÏVE BAYESIAN CLASSIFICATION

Naïve Bayesian (NB) classification has been shown to be an effective technique for text classification [Lewis and Gale, 1994; McCallum and Nigam, 1998]. Given a set of training documents D, each document is considered an ordered list of words. We use w_{d_i,k} to denote the word in position k of document d_i, where each word is from the vocabulary V = {w_1, w_2, ..., w_|V|}. The vocabulary is the set of all words we consider for classification. We also have a set of predefined classes, C = {c_1, c_2, ..., c_|C|}. In order to perform classification, we need to compute the posterior probability Pr(c_j | d_i), where c_j is a class and d_i is a document. Based on the Bayesian probability and the multinomial model, we have

\Pr(c_j) = \frac{\sum_{i=1}^{|D|} \Pr(c_j \mid d_i)}{|D|}    (1)

and, with Laplacian smoothing,

\Pr(w_t \mid c_j) = \frac{1 + \sum_{i=1}^{|D|} N(w_t, d_i) \Pr(c_j \mid d_i)}{|V| + \sum_{s=1}^{|V|} \sum_{i=1}^{|D|} N(w_s, d_i) \Pr(c_j \mid d_i)}    (2)

where N(w_t, d_i) is the count of the number of times that the word w_t occurs in document d_i, and Pr(c_j | d_i) ∈ {0, 1} depending on the class label of the document. Finally, assuming that the probabilities of the words are independent given the class, we obtain the NB classifier:

\Pr(c_j \mid d_i) = \frac{\Pr(c_j) \prod_{k=1}^{|d_i|} \Pr(w_{d_i,k} \mid c_j)}{\sum_{r=1}^{|C|} \Pr(c_r) \prod_{k=1}^{|d_i|} \Pr(w_{d_i,k} \mid c_r)}    (3)

In the naïve Bayesian classifier, the class with the highest Pr(c_j | d_i) is assigned as the class of the document.

GENERATING NEGATIVE DATA

In this subsection, we present our algorithm to generate the negative data. Given that in a naïve Bayesian framework the conditional probabilities Pr(w_t | −) (Equation (2)) are computed based on the accumulated frequencies of all the documents in the negative class, a single artificial negative instance A_N could work equally well for Bayesian learning.
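As a concrete illustration of Equations (1)-(3) above, the following sketch trains a multinomial NB model with Laplacian smoothing on hard-labeled documents, so that Pr(c_j | d_i) ∈ {0, 1}. The function names are ours, not the paper's; this is a minimal sketch, not the authors' implementation:

```python
import math
from collections import Counter

# Multinomial naive Bayes per Equations (1)-(3), for hard-labeled training
# documents (Pr(c_j|d_i) is 1 for the labeled class and 0 otherwise).
def train_nb(docs, classes):
    """docs: list of (word_list, label); returns priors, conditionals, vocab."""
    vocab = {w for words, _ in docs for w in words}
    prior = {c: sum(1 for _, y in docs if y == c) / len(docs)
             for c in classes}                                  # Equation (1)
    cond = {}
    for c in classes:
        counts = Counter(w for words, y in docs if y == c for w in words)
        total = sum(counts.values())
        cond[c] = {w: (1 + counts[w]) / (len(vocab) + total)
                   for w in vocab}                              # Equation (2)
    return prior, cond, vocab

def classify(prior, cond, vocab, words):
    """Equation (3) in log space; words outside the vocabulary are skipped."""
    scores = {c: math.log(prior[c])
                 + sum(math.log(cond[c][w]) for w in words if w in vocab)
              for c in prior}
    return max(scores, key=scores.get)
```

Working in log space avoids underflow from the product in Equation (3); the argmax over log-scores is equivalent to the argmax over Pr(c_j | d_i), since the denominator of Equation (3) is the same for every class.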
In other words, we need to generate the negative document A_N in such a way as to ensure that Pr(w+ | +) − Pr(w+ | −) > 0 for a positive feature w+ and Pr(w− | +) − Pr(w− | −) < 0 for a negative feature w−. We use an entropy-based method to estimate whether a feature w_i in U has significantly different conditional probabilities in P and in U (i.e., Pr(w_i | +) and Pr(w_i | −)). The entropy equation is:

entropy(w_i) = -\sum_{c \in \{+,-\}} \Pr(w_i \mid c) \log \Pr(w_i \mid c)    (4)

The entropy values show the relative discriminatory power of the word features: the bigger a feature's entropy is, the more likely it has similar distributions in both P and U (i.e., the less discriminatory it is). This means that for a negative feature w−, its entropy entropy(w−) is small, as Pr(w− | −) (w− mainly occurring in U) is significantly larger than Pr(w− | +), while entropy(w+) is large, as Pr(w+ | +) and Pr(w+ | −) are similar. The entropy (and its conditional probabilities) can therefore indicate whether a feature belongs to the positive or the negative class. We generate the features for A_N based on the entropy information, weighted as follows:

q(w_i) = 1 - \frac{entropy(w_i)}{\max_{j=1,2,...,|V|} entropy(w_j)}    (5)

If q(w_i) = 0, it means that w_i occurs uniformly in both P and U, and we therefore do not generate w_i in A_N. If q(w_i) = 1, we can be almost certain that w_i is a negative feature, and we generate it for A_N based on its distribution in U. In this way, those features that are deemed more discriminatory will be generated more frequently in A_N. For those features with q(w_i) between the two extremes, their frequencies in A_N are generated proportionally. We generate the artificial negative document A_N as follows. Given the positive set P and the unlabeled set U, we compute each word feature's entropy value. The feature's frequency in the negative document A_N is then randomly generated following a Gaussian distribution according to q(w_i) = 1 − entropy(w_i)/max(entropy(w_j), w_j ∈ V). The detailed algorithm is shown in Figure 2.

1. A_N = ∅;
2. P = training documents from all classes (treated as positive);
3. U = T (test set; ignore the class labels in T if present);
4. For each feature w_i ∈ U
5.     Compute the frequency of w_i in each document d_k: freq(w_i, d_k), d_k ∈ U;
6.     Let \mu_i = \frac{\sum_{d_k \in D_{w_i}} freq(w_i, d_k)}{|D_{w_i}|}, where D_{w_i} is the set of documents in U containing w_i;
7.     Let the variance \sigma_i^2 = \frac{\sum_{d_k \in D_{w_i}} (freq(w_i, d_k) - \mu_i)^2}{|D_{w_i}| - 1};
8. For each feature w_i ∈ V
9.     Compute Pr(w_i | +) and Pr(w_i | −) using Equation (2), assuming that all the documents in U are negative;
10.    Let entropy(w_i) = -\sum_{c \in \{+,-\}} \Pr(w_i \mid c) \log \Pr(w_i \mid c);
11. Let m = max(entropy(w_i)), i = 1, ..., |V|;
12. For each feature w_i ∈ V
13.    q(w_i) = 1 − entropy(w_i)/m;
14.    For j = 1 to |D_{w_i}| · q(w_i)
15.        Generate a frequency fnew(w_i, j) using the Gaussian distribution \frac{1}{\sigma_i \sqrt{2\pi}} e^{-\frac{(x-\mu_i)^2}{2\sigma_i^2}};
16.        A_N = A_N ∪ {(w_i, fnew(w_i, j))};
17. Output A_N

Figure 2. Generating the negative document A_N

In the algorithm, Step 1 initializes the negative document A_N (which consists of a set of feature-frequency pairs) to the empty set, while Step 2 and Step 3 initialize the positive set P and the unlabeled set U. From Step 4 to Step 7, for each feature w_i that appears in U, we compute its frequency in each document, and then calculate the frequency mean \mu_i and variance \sigma_i^2 over those documents D_{w_i} that contain w_i. This information is used to generate A_N later. From Step 8 to Step 10, we compute the entropy of w_i using Pr(w_i | +) and Pr(w_i | −) (which are computed using Equation (2)) by assuming that all the documents in U are negative. After obtaining the maximal entropy value in Step 11, we generate the negative document A_N in Steps 12 to 16. In particular, Step 13 computes q(w_i), which shows how negative a feature w_i is in terms of how different w_i's distributions in U and in P are: the bigger the difference, the higher the frequency with which we generate the feature. Steps 14 to 16 form an inner loop, and |D_{w_i}| · q(w_i) decides the number of times we generate a frequency for the word w_i. Thus, if q(w_i) is small, it means that w_i has occurred in both P and U with similar probabilities, and we generate fewer occurrences of w_i. Otherwise, w_i is quite likely to be a negative feature, and we generate it with a distribution similar to the one in U. In each iteration, Step 15 uses a Gaussian distribution with the corresponding \mu_i and \sigma_i to generate a frequency fnew(w_i, j) for w_i. Step 16 places the pair (w_i, fnew(w_i, j)) into the negative document A_N.
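The generation procedure of Figure 2 can be sketched compactly in Python. All names (`generate_an`, `cond_prob`) are ours; with every document in U treated as negative, the conditional probabilities of Equation (2) reduce to the smoothed frequency ratios computed below:

```python
import math
import random
from collections import Counter

# Sketch of Figure 2: build A_N as a list of (word, frequency) pairs.
# P and U are lists of token lists.
def generate_an(P, U, rng=None):
    rng = rng or random.Random(0)            # fixed seed for reproducibility
    vocab = {w for d in P + U for w in d}
    def cond_prob(docs):                     # Equation (2), one class
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        return {w: (1 + counts[w]) / (len(vocab) + total) for w in vocab}
    pr_pos, pr_neg = cond_prob(P), cond_prob(U)           # Steps 8-9
    entropy = {w: -sum(p * math.log(p) for p in (pr_pos[w], pr_neg[w]))
               for w in vocab}                            # Step 10, Equation (4)
    m = max(entropy.values())                             # Step 11
    an = []
    for w in vocab:
        q = 1 - entropy[w] / m                            # Step 13, Equation (5)
        freqs = [d.count(w) for d in U if w in d]         # Step 5
        if not freqs:
            continue
        mu = sum(freqs) / len(freqs)                      # Step 6
        var = (sum((f - mu) ** 2 for f in freqs) / (len(freqs) - 1)
               if len(freqs) > 1 else 0.0)                # Step 7
        for _ in range(int(len(freqs) * q)):              # Step 14
            an.append((w, rng.gauss(mu, math.sqrt(var)))) # Steps 15-16
    return an
```

A word that is distributed evenly across P and U attains the maximal entropy, so its q is 0 and it never enters A_N; words concentrated in U get proportionally more Gaussian frequency draws.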
Finally, Step 17 outputs our generated negative document A_N. Note that the frequency for each feature in A_N may not be of an integer value, as it is generated by a Gaussian distribution. A_N is essentially a randomly generated aggregate document that summarizes the unlabeled data set, but with the features indicative of the positive class dramatically reduced.

BUILDING THE FINAL NB CLASSIFIER

Finally, we describe how to build an NB classifier with the positive set P and the generated single negative document A_N to identify unexpected document instances. The detailed algorithm is shown in Figure 3.

1. UE = ∅;
2. Build a naïve Bayesian classifier Q with P and {A_N} using Equations (1) and (2);
3. For each document d_i ∈ U
4.     Use Q to classify d_i using Equation (3);
5.     If Pr(− | d_i) > Pr(+ | d_i)
6.         UE = UE ∪ {d_i};
7. Output UE;

Figure 3. Building the final NB classifier

UE stores the set of unexpected documents identified in U (or test set T); it is initialized to the empty set in Step 1. In Step 2, we use Equations (1) and (2) to build an NB classifier Q by computing the prior probabilities Pr(+) and Pr(−) and the conditional probabilities Pr(w_i | +) and Pr(w_i | −). Clearly, Pr(w_i | +) and Pr(w_i | −) can be computed based on the positive set P and the single negative document A_N respectively (A_N can be regarded as the average document of a set of virtual negative documents). However, the problem is how to compute the prior probabilities Pr(+) and Pr(−). It turns out that this is not a major issue: we can simply assume that we have generated a negative document set that has the same number of documents as the positive set P. We will report experimental results that support this in the next section. After building the NB classifier Q, we use it to classify each test document in U (Steps 3-6). The final output is the set UE, which stores all the identified unexpected documents in U.

4 Empirical Evaluation

In this section, we evaluate our proposed technique LGN. We compare it with both one-class SVM (OSVM; we used LIBSVM) and the existing PU learning methods S-EM [Liu et al., 2002], PEBL [Yu et al., 2002] and Roc-SVM [Li and Liu, 2003]. S-EM and Roc-SVM are publicly available. We implemented PEBL, as it is not available from its authors.
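The classification loop of Figure 3 reduces to comparing two log-scores per document, assuming equal priors Pr(+) = Pr(−) = 0.5 (the paper's default) and conditional probabilities precomputed via Equation (2). The names below are illustrative, not the authors':

```python
import math

# Sketch of Figure 3: flag a document as unexpected when the negative
# log-score beats the positive one. pr_pos / pr_neg map each vocabulary
# word to its smoothed conditional probability (Equation (2)).
def unexpected_docs(U, pr_pos, pr_neg, prior_pos=0.5):
    UE = []
    for d in U:                                    # Steps 3-6
        log_pos = math.log(prior_pos)
        log_neg = math.log(1.0 - prior_pos)
        for w in d:
            if w in pr_pos:                        # words outside V are skipped
                log_pos += math.log(pr_pos[w])
                log_neg += math.log(pr_neg[w])
        if log_neg > log_pos:                      # Step 5: Pr(-|d) > Pr(+|d)
            UE.append(d)
    return UE
```

With equal priors, the prior terms cancel and the decision depends only on the conditional probabilities, which is why the choice of prior matters little (as the "Effect of priors" experiments in Section 4.2 confirm).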
4.1 Datasets

For evaluation, we used the benchmark 20 Newsgroup collection, which consists of documents from 20 different UseNet discussion groups. The 20 groups were also categorized into 4 main categories: computer, recreation, science, and talk. We first performed the following two sets of experiments:

2-classes: This set of experiments simulates the case in which the training data has two classes, i.e., our positive set P contains two classes. The two classes of data were chosen

from two main categories, computer and science, in which the computer group has five subgroups and the science group has four subgroups. Every subgroup consists of 1,000 documents. Each data set for training and testing is then constructed as follows. The positive documents for both training and testing consist of documents from one subgroup (or class) in computer and one subgroup (or class) in science. This gives us 20 data sets. For each class (or subgroup), we partitioned its documents into two standard subsets: 70% for training and 30% for testing. That is, each positive set P for training contains 1,400 documents of two classes, and each test set U contains 600 positive documents of the same two classes. We then add negative (unexpected) documents to U, which are randomly selected from the remaining 18 groups. In order to create different experimental settings, we vary the number of unexpected documents, which is controlled by a parameter α, a percentage of |U|; i.e., the number of unexpected documents added to U is α·|U|.

3-classes: This set of experiments simulates the case in which the training data has three different classes, i.e., our positive set P contains three classes of data. We used the same 20 data sets formed above and added another class to each, for both P and U. The added third class was randomly selected from the remaining 18 groups. For each data set, the unexpected documents in U were then randomly selected from the remaining 17 newsgroups. All other settings were the same as for the 2-classes case.

4.2 Experimental Results

2-classes: We performed experiments using all possible c_1 and c_2 combinations (i.e., 20 data sets). For each technique, namely OSVM, S-EM, Roc-SVM, PEBL and LGN, we performed 5 random runs to obtain the average results. In each run, the training and test document sets from c_1 and c_2, as well as the unexpected document instances from the other 18 classes, were selected randomly. We varied α from 5% to 100%. Table 1 shows the classification results of the various techniques in terms of F-score (for the negative class) when α = 5%. The first column of Table 1 lists the 20 different combinations of c_1 and c_2. Columns 2 to 5 show the results of the four techniques OSVM, S-EM, Roc-SVM and PEBL respectively.
Column 6 gives the corresponding results of our technique LGN. We observe from Table 1 that LGN produces the best results consistently for all data sets, achieving an F-score of 77.0% on average, which is 54.8%, 32.8%, 60.2% and 76.5% higher, in absolute terms, than the F-scores of the four existing techniques (OSVM, S-EM, Roc-SVM and PEBL respectively). We also see that LGN is highly consistent across different data sets. In fact, we checked the first step of the three existing PU learning techniques and found that most of the extracted negative documents were wrong. As a result, in their respective second steps, SVM and EM were unable to build accurate classifiers due to the very noisy negative data. Since the S-EM algorithm has a parameter, we tried different values, but the results were similar.

Table 1. Experimental results for α = 5%. Rows: the 20 data sets (graphics-crypt, graphics-electronics, graphics-med, graphics-space, os-crypt, os-electronics, os-med, os-space, mac.hardware-crypt, mac.hardware-electronics, mac.hardware-med, mac.hardware-space, ibm.hardware-crypt, ibm.hardware-electronics, ibm.hardware-med, ibm.hardware-space, windows-crypt, windows-electronics, windows-med, windows-space) and their average; columns: the F-scores of OSVM, S-EM, Roc-SVM, PEBL and LGN (individual values not reproduced here).

Figure 4 shows the macro-average results over all α values (from 5% to 100%) for all five techniques in the 2-classes experiments. Our method LGN outperformed all the others significantly for α ≤ 60%. When α was increased to 80% and 100%, Roc-SVM achieved slightly better results than LGN. We also observe that OSVM, S-EM and Roc-SVM outperformed PEBL, since they were able to extract more reliable negatives than the 1-DNF method used in PEBL. PEBL needed a higher α (200%) to achieve similarly good results.

Figure 4. The comparison results with different percentages of unexpected documents in U in the 2-classes experiments (F-score vs. α for LGN, S-EM, Roc-SVM, PEBL and OSVM).

3-classes: Figure 5 shows the 3-classes results, where LGN still performed much better than the other methods when the proportion of unexpected documents is small (α ≤ 60%) and comparably with S-EM and Roc-SVM when the proportion is larger.
OSVM's results are much worse than those of S-EM, Roc-SVM and LGN when α is larger, showing that PU learning is better than one-class SVM for this problem. Again, PEBL required a much larger proportion of unexpected documents to produce comparable results.

Figure 5. The comparison results with different percentages of unexpected documents in U in the 3-classes experiments (F-score vs. α for LGN, S-EM, Roc-SVM, PEBL and OSVM).

In summary, we conclude that LGN is significantly better (with high F-scores) than the other techniques when α is small (α ≤ 60%), which indicates that it can be used to effectively extract unexpected documents from the test set even in the challenging scenarios in which their presence in U is non-obvious. The other methods all failed badly when α is small. LGN also performed comparably in the event that the proportion of unexpected instances is large (α ≥ 80%).

Finally, we also conducted 10-classes experiments, in which ten different classes from both the 20 Newsgroups and Reuters collections (with the same experimental setting as for the 3-classes case) were used. The behaviors of the algorithms for 10 classes were the same as for 2 classes and 3 classes. Using the Reuters collection with 10 classes and α set to 5%, 10%, 15%, 20% and 40%, our algorithm LGN achieved 32.77%, 32.14%, 27.82%, 18.43% and 11.11% higher F-scores respectively than the best results of the existing methods (OSVM, S-EM, Roc-SVM and PEBL). Similarly, using 10 classes from the 20 Newsgroup collection, LGN achieved 10.56%, 4.80%, 5.46%, 6.20% and 4.00% higher F-scores for α = 5%, 10%, 15%, 20% and 40% of unexpected documents respectively than the best of the four other existing methods.

Effect of priors: Recall that in Section 3 we left the prior probabilities as a parameter, since we only generate a single artificial negative document. To check the effect of the priors, we also varied the prior in our experiments by changing the proportion of negative documents as a percentage of the number of positive documents in P. We tried 40%, 60%, 80% and 100%. The results were virtually the same, with average differences only within ±1%. Thus, we simply chose 100% as the default of our system, which gives us Pr(+) = Pr(−) = 0.5. All the experimental results reported here were obtained using this default setting.

5 Conclusion
In real-world classification applications, the test data may differ from the training data because unexpected instances that do not belong to any of the predefined classes may be present (or may emerge in the long run), and they cannot be identified by traditional classification techniques. We have shown here that the problem can be addressed by formulating it as a PU learning problem. However, directly applying existing PU learning algorithms performs poorly, as they require a large proportion of unexpected instances to be present in the unlabeled test data, which is often not the case in practice. We then proposed a novel technique, LGN, which identifies unexpected documents by generating a single artificial negative document to help train a classifier to better detect unexpected instances. Our experimental results in document classification demonstrate that LGN performs significantly better than existing techniques when the proportion of unexpected instances is low. The method is also robust irrespective of the proportion of unexpected instances present in the test set. Although our current experiments were performed on the text classification application using an NB classifier, we believe that the approach is also applicable to other domains. Using a single artificial negative document, however, will not be suitable for other learning algorithms. In our future work, we plan to generate a large set of artificial documents so that other learning methods may also be applied.

References

[Crammer and Chechik, 2004] K. Crammer and G. Chechik. A needle in a haystack: local one-class optimization. ICML, 2004.

[Dempster et al., 1977] A. Dempster, N. Laird and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 1977.

[Denis, 1998] F. Denis. PAC learning from positive statistical queries. ALT, 1998.

[Denis, 2002] F. Denis, R. Gilleron, and M. Tommasi. Text classification from positive and unlabeled examples. IPMU, 2002.

[Fung et al., 2005] G. Fung, J. Yu, H. Lu, and P. Yu. Text classification without labeled negative documents. ICDE, 2005.

[Lewis and Gale, 1994] D. Lewis and W. Gale. A sequential algorithm for training text classifiers. SIGIR, 1994.

[Li and Liu, 2003] X. Li and B. Liu. Learning to classify text using positive and unlabeled data.
IJCAI, 2003.

[Liu et al., 2002] B. Liu, W. Lee, P. Yu, and X. Li. Partially supervised classification of text documents. ICML, 2002.

[Manevitz and Yousef, 2001] L. Manevitz and M. Yousef. One-class SVMs for document classification. Journal of Machine Learning Research, 2:139-154, 2001.

[McCallum and Nigam, 1998] A. McCallum and K. Nigam. A comparison of event models for naïve Bayes text classification. AAAI, 1998.

[Muggleton, 2001] S. Muggleton. Learning from positive data. Machine Learning, 2001.

[Rocchio, 1971] J. Rocchio. Relevance feedback in information retrieval. In G. Salton (ed.), The SMART Retrieval System: Experiments in Automatic Document Processing, 1971.

[Scholkopf et al., 1999] B. Scholkopf, J. Platt, J. Shawe-Taylor, A. Smola and R. Williamson. Estimating the support of a high-dimensional distribution. Technical Report MSR-TR-99-87, Microsoft Research, 1999.

[Yu et al., 2002] H. Yu, J. Han, and K. Chang. PEBL: positive example based learning for Web page classification using SVM. KDD, 2002.


Using Maximum Entropy for Text Classification Usng Maxmum Entropy for Text Classfaton Kamal Ngam kngam@s.mu.edu John Lafferty lafferty@s.mu.edu Andrew MCallum mallum@justresearh.om Shool of Computer Sene Carnege Mellon Unversty Pttsburgh, PA 15213

More information

Global Exponential Stability of FAST TCP

Global Exponential Stability of FAST TCP Global Exponental Stablty of FAST TCP Joon-Young Cho Kyungmo Koo Dav X. We Jn S. Lee an Steven H. Low Abstrat We onser a sngle-lnk mult-soure network wth the FAST TCP soures. We propose a ontnuous-tme

More information

Maxent Models and Discriminative Estimation. Generative vs. Discriminative models

Maxent Models and Discriminative Estimation. Generative vs. Discriminative models + Maxent Moels an Dsrmnatve Estmaton Generatve vs. Dsrmnatve moels + Introuton n So far we ve looke at generatve moels n Language moels Nave Bayes 2 n But there s now muh use of ontonal or srmnatve probablst

More information

A Theorem of Mass Being Derived From Electrical Standing Waves (As Applied to Jean Louis Naudin's Test)

A Theorem of Mass Being Derived From Electrical Standing Waves (As Applied to Jean Louis Naudin's Test) A Theorem of Mass Beng Derved From Eletral Standng Waves (As Appled to Jean Lous Naudn's Test) - by - Jerry E Bayles Aprl 4, 000 Ths paper formalzes a onept presented n my book, "Eletrogravtaton As A Unfed

More information

Multigradient for Neural Networks for Equalizers 1

Multigradient for Neural Networks for Equalizers 1 Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

ENTROPIC QUESTIONING

ENTROPIC QUESTIONING ENTROPIC QUESTIONING NACHUM. Introucton Goal. Pck the queston that contrbutes most to fnng a sutable prouct. Iea. Use an nformaton-theoretc measure. Bascs. Entropy (a non-negatve real number) measures

More information

Machine Learning: and 15781, 2003 Assignment 4

Machine Learning: and 15781, 2003 Assignment 4 ahne Learnng: 070 and 578, 003 Assgnment 4. VC Dmenson 30 onts Consder the spae of nstane X orrespondng to all ponts n the D x, plane. Gve the VC dmenson of the followng hpothess spaes. No explanaton requred.

More information

Evaluation for sets of classes

Evaluation for sets of classes Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton

More information

ECE 422 Power System Operations & Planning 2 Synchronous Machine Modeling

ECE 422 Power System Operations & Planning 2 Synchronous Machine Modeling ECE 422 Power System Operatons & Plannng 2 Synhronous Mahne Moelng Sprng 219 Instrutor: Ka Sun 1 Outlne 2.1 Moelng of synhronous generators for Stablty Stues Synhronous Mahne Moelng Smplfe Moels for Stablty

More information

A New Thresholding Algorithm for Hierarchical Text Classification

A New Thresholding Algorithm for Hierarchical Text Classification A New Thresholdng Algorthm for Herarhal Text Classfaton Donato Malerba, Mhelangelo Ce, Mhele Lap, Gulo Altn Dpartmento d Informata, Unverstà degl Stud va Orabona, 4-716 Bar - Italy {malerba, e, lap}@d.unba.t

More information

Exact Inference: Introduction. Exact Inference: Introduction. Exact Inference: Introduction. Exact Inference: Introduction.

Exact Inference: Introduction. Exact Inference: Introduction. Exact Inference: Introduction. Exact Inference: Introduction. Exat nferene: ntroduton Exat nferene: ntroduton Usng a ayesan network to ompute probabltes s alled nferene n general nferene nvolves queres of the form: E=e E = The evdene varables = The query varables

More information

The corresponding link function is the complementary log-log link The logistic model is comparable with the probit model if

The corresponding link function is the complementary log-log link The logistic model is comparable with the probit model if SK300 and SK400 Lnk funtons for bnomal GLMs Autumn 08 We motvate the dsusson by the beetle eample GLMs for bnomal and multnomal data Covers the followng materal from hapters 5 and 6: Seton 5.6., 5.6.3,

More information

Mixture o f of Gaussian Gaussian clustering Nov

Mixture o f of Gaussian Gaussian clustering Nov Mture of Gaussan clusterng Nov 11 2009 Soft vs hard lusterng Kmeans performs Hard clusterng: Data pont s determnstcally assgned to one and only one cluster But n realty clusters may overlap Soft-clusterng:

More information

Hopfield Training Rules 1 N

Hopfield Training Rules 1 N Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the

More information

Efficient Sampling for Gaussian Process Inference using Control Variables

Efficient Sampling for Gaussian Process Inference using Control Variables Effent Samplng for Gaussan Proess Inferene usng Control Varables Mhals K. Ttsas, Nel D. Lawrene and Magnus Rattray Shool of Computer Sene, Unversty of Manhester Manhester M 9PL, UK Abstrat Samplng funtons

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Boostrapaggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information

Large-Scale Data-Dependent Kernel Approximation Appendix

Large-Scale Data-Dependent Kernel Approximation Appendix Large-Scale Data-Depenent Kernel Approxmaton Appenx Ths appenx presents the atonal etal an proofs assocate wth the man paper [1]. 1 Introucton Let k : R p R p R be a postve efnte translaton nvarant functon

More information

APLSSVM: Hybrid Entropy Models for Image Retrieval

APLSSVM: Hybrid Entropy Models for Image Retrieval Internatonal Journal of Intellgent Informaton Systems 205; 4(2-2): 9-4 Publshed onlne Aprl 29, 205 (http://www.senepublshnggroup.om/j/js) do: 0.648/j.js.s.205040202.3 ISSN: 2328-7675 (Prnt); ISSN: 2328-7683

More information

Approaches to Modeling Clinical PK of ADCs

Approaches to Modeling Clinical PK of ADCs Sesson 4b: PKPD Mong of ntboy-drug onjugate (Symposum) Otober 4, 24, Las egas, NE pproahes to Mong lnal PK of Ds Leon Gbansy QuantPharm LL ntboy-drug onjugates Ø ntboy (or antboy fragment) lne (through

More information

DOAEstimationforCoherentSourcesinBeamspace UsingSpatialSmoothing

DOAEstimationforCoherentSourcesinBeamspace UsingSpatialSmoothing DOAEstmatonorCoherentSouresneamspae UsngSpatalSmoothng YnYang,ChunruWan,ChaoSun,QngWang ShooloEletralandEletronEngneerng NanangehnologalUnverst,Sngapore,639798 InsttuteoAoustEngneerng NorthwesternPoltehnalUnverst,X

More information

Some remarks about the transformation of Charnes and Cooper by Ezio Marchi *)

Some remarks about the transformation of Charnes and Cooper by Ezio Marchi *) Some remars about the transformaton of Charnes an Cooper b Eo Marh * Abstrat In ths paper we eten n a smple wa the transformaton of Charnes an Cooper to the ase where the funtonal rato to be onsere are

More information

Image retrieval at low bit rates: BSP Trees vs. JPEG

Image retrieval at low bit rates: BSP Trees vs. JPEG mage retreval at low bt rates: Trees vs. Mhal Sth, an Geral Shaefer Shool of Computng an Tehnology The ottngham Trent Unversty, ottngham, U.K. Dept. of Computng, Eletrons an Automate Control Slesan Unversty

More information

Lecture Nov

Lecture Nov Lecture 18 Nov 07 2008 Revew Clusterng Groupng smlar obects nto clusters Herarchcal clusterng Agglomeratve approach (HAC: teratvely merge smlar clusters Dfferent lnkage algorthms for computng dstances

More information

I. INTRODUCTION. Keywords Web Mining, Web Usage Mining, Page Rank, Web Map

I. INTRODUCTION. Keywords Web Mining, Web Usage Mining, Page Rank, Web Map An Extended Algorm of Page Rankng Consderng Chronologal Dmenson of Searh Sandeep Gupta #, Mohd. Husan * # Computer Sene and Engneerng,NIMS Unversty, Japur, Rajasan, Inda * Dretor, AZAD IET, Luknow, UP,

More information

Fusion of Neural Classifiers for Financial Market Prediction

Fusion of Neural Classifiers for Financial Market Prediction Fuson of Neural Classfers for Fnanal Market Predton Trsh Keaton Dept. of Eletral Engneerng (136-93) Informaton Senes Laboratory (RL 69) Calforna Insttute of Tehnology HRL Laboratores, LLC Pasadena, CA

More information

A new mixed integer linear programming model for flexible job shop scheduling problem

A new mixed integer linear programming model for flexible job shop scheduling problem A new mxed nteger lnear programmng model for flexble job shop shedulng problem Mohsen Zaee Department of Industral Engneerng, Unversty of Bojnord, 94531-55111 Bojnord, Iran Abstrat. In ths paper, a mxed

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Support Vector Machines

Support Vector Machines CS 2750: Machne Learnng Support Vector Machnes Prof. Adrana Kovashka Unversty of Pttsburgh February 17, 2016 Announcement Homework 2 deadlne s now 2/29 We ll have covered everythng you need today or at

More information

2. High dimensional data

2. High dimensional data /8/00. Hgh mensons. Hgh mensonal ata Conser representng a ocument by a vector each component of whch correspons to the number of occurrences of a partcular wor n the ocument. The Englsh language has on

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Regularized Discriminant Analysis for Face Recognition

Regularized Discriminant Analysis for Face Recognition 1 Regularzed Dscrmnant Analyss for Face Recognton Itz Pma, Mayer Aladem Department of Electrcal and Computer Engneerng, Ben-Guron Unversty of the Negev P.O.Box 653, Beer-Sheva, 845, Israel. Abstract Ths

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs -755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood

More information

Using Artificial Neural Networks and Support Vector Regression to Model the Lyapunov Exponent

Using Artificial Neural Networks and Support Vector Regression to Model the Lyapunov Exponent Usng Artfal Neural Networks and Support Vetor Regresson to Model the Lyapunov Exponent Abstrat: Adam Maus* Aprl 3, 009 Fndng the salent patterns n haot data has been the holy gral of Chaos Theory. Examples

More information

FAULT DETECTION AND IDENTIFICATION BASED ON FULLY-DECOUPLED PARITY EQUATION

FAULT DETECTION AND IDENTIFICATION BASED ON FULLY-DECOUPLED PARITY EQUATION Control 4, Unversty of Bath, UK, September 4 FAUL DEECION AND IDENIFICAION BASED ON FULLY-DECOUPLED PARIY EQUAION C. W. Chan, Hua Song, and Hong-Yue Zhang he Unversty of Hong Kong, Hong Kong, Chna, Emal:

More information

An Internet Traffic Identification Approach Based on GA and PSO-SVM

An Internet Traffic Identification Approach Based on GA and PSO-SVM JOURAL OF COMPUERS, VOL. 7, O., JAUARY 202 9 An Internet raff Ientfaton Approah Base on GA an PSO-SVM Jun an Shool of Coputer Sene, Shuan Unversty, Chengu, Chna Eal: hnatanjun@gal.o Xngshu Chen an Mn Du

More information

A Theorem of Mass Being Derived From Electrical Standing Waves (As Applied to Jean Louis Naudin's Test)

A Theorem of Mass Being Derived From Electrical Standing Waves (As Applied to Jean Louis Naudin's Test) A Theorem of Mass Beng Derved From Eletral Standng Waves (As Appled to Jean Lous Naudn's Test) - by - Jerry E Bayles Aprl 5, 000 Ths Analyss Proposes The Neessary Changes Requred For A Workng Test Ths

More information

A Tutorial on Data Reduction. Linear Discriminant Analysis (LDA) Shireen Elhabian and Aly A. Farag. University of Louisville, CVIP Lab September 2009

A Tutorial on Data Reduction. Linear Discriminant Analysis (LDA) Shireen Elhabian and Aly A. Farag. University of Louisville, CVIP Lab September 2009 A utoral on Data Reducton Lnear Dscrmnant Analss (LDA) hreen Elhaban and Al A Farag Unverst of Lousvlle, CVIP Lab eptember 009 Outlne LDA objectve Recall PCA No LDA LDA o Classes Counter eample LDA C Classes

More information

A HYDROPHOBICITY BASED NEURAL NETWORK METHOD FOR PREDICTING TRANSMEMBRANE SEGMENTS IN PROTEIN SEQUENCES

A HYDROPHOBICITY BASED NEURAL NETWORK METHOD FOR PREDICTING TRANSMEMBRANE SEGMENTS IN PROTEIN SEQUENCES A HYDROPHOBICITY BASED NEURAL NETWORK METHOD FOR PREDICTING TRANSMEMBRANE SEGMENTS IN PROTEIN SEQUENCES Zhongqang Chen, Q Lu, Ysheng Zhu, Yxue, L*, Yuhong Xu Department of Bomeal Engneerng, Shangha Jaotong

More information

Pattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques

Pattern Classification: An Improvement Using Combination of VQ and PCA Based Techniques Ameran Journal of Appled Senes (0): 445-455, 005 ISSN 546-939 005 Sene Publatons Pattern Classfaton: An Improvement Usng Combnaton of and PCA Based Tehnques Alok Sharma, Kuldp K. Palwal and Godfrey C.

More information

Correspondence Rules for Motion Detection using Randomized Methods

Correspondence Rules for Motion Detection using Randomized Methods Egyptan Computer Sene Journal Corresponene Rules or Moton Deteton usng Ranomze Methos Amr Gone an Howaa Nagu Department o Computer Sene & Engneerng, the Ameran Unversty n Caro, Caro, Egypt Astrat Parametr

More information

A Particle Filter Algorithm based on Mixing of Prior probability density and UKF as Generate Importance Function

A Particle Filter Algorithm based on Mixing of Prior probability density and UKF as Generate Importance Function Advanced Scence and Technology Letters, pp.83-87 http://dx.do.org/10.14257/astl.2014.53.20 A Partcle Flter Algorthm based on Mxng of Pror probablty densty and UKF as Generate Importance Functon Lu Lu 1,1,

More information

Approximations for a Fork/Join Station with Inputs from Finite Populations

Approximations for a Fork/Join Station with Inputs from Finite Populations Approxmatons for a Fork/Jon Staton th Inputs from Fnte Populatons Ananth rshnamurthy epartment of ecson Scences ngneerng Systems Rensselaer Polytechnc Insttute 0 8 th Street Troy NY 80 USA Rajan Sur enter

More information

Outline. Classification Methods. Feature Selection/Reduction: Entropy Minimization Algorithm

Outline. Classification Methods. Feature Selection/Reduction: Entropy Minimization Algorithm Outlne lassfaton Methos Xaoun Q Feature Seleton: Entropy Mnzaton Algorth an Karhunen-Loève Epanson luster Seeng: K-Means algorth rnpal oponents Analyss A lassfaton: Lnear Dsrnant Analyss LDA Statstal lassfaton:

More information

Explicit bounds for the return probability of simple random walk

Explicit bounds for the return probability of simple random walk Explct bouns for the return probablty of smple ranom walk The runnng hea shoul be the same as the ttle.) Karen Ball Jacob Sterbenz Contact nformaton: Karen Ball IMA Unversty of Mnnesota 4 Ln Hall, 7 Church

More information

Retrieval Models: Language models

Retrieval Models: Language models CS-590I Informaton Retreval Retreval Models: Language models Luo S Department of Computer Scence Purdue Unversty Introducton to language model Ungram language model Document language model estmaton Maxmum

More information

Physics 2B Chapter 17 Notes - Calorimetry Spring 2018

Physics 2B Chapter 17 Notes - Calorimetry Spring 2018 Physs 2B Chapter 17 Notes - Calormetry Sprng 2018 hermal Energy and Heat Heat Capaty and Spe Heat Capaty Phase Change and Latent Heat Rules or Calormetry Problems hermal Energy and Heat Calormetry lterally

More information

Prediction suffix trees for supervised classification of sequences

Prediction suffix trees for supervised classification of sequences Predton suffx trees for supervsed lassfaton of sequenes Chrstne Largeron - Leténo EURISE - Unversté Jean Monnet Sant-Etenne 6, rue Basse des Rves 42023 Sant-Etenne edex 2 Tel : (33) 04 77 42 19 60 Fax

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

Accurate Online Support Vector Regression

Accurate Online Support Vector Regression Aurate Onlne Support Vetor Regresson Junshu Ma, James Theler, and Smon Perkns MS-D436, NIS-2, Los Alamos Natonal Laboratory, Los Alamos, NM 87545, USA {junshu, jt, s.perkns}@lanl.gov Abstrat Conventonal

More information

Distance-Based Approaches to Inferring Phylogenetic Trees

Distance-Based Approaches to Inferring Phylogenetic Trees Dstance-Base Approaches to Inferrng Phylogenetc Trees BMI/CS 576 www.bostat.wsc.eu/bm576.html Mark Craven craven@bostat.wsc.eu Fall 0 Representng stances n roote an unroote trees st(a,c) = 8 st(a,d) =

More information

Iterative Discovering of User s Preferences Using Web Mining

Iterative Discovering of User s Preferences Using Web Mining Internatonal Journal of Computer Sene & Applatons Vol. II, No. II, pp. 57-66 2005 Tehnomathemats Researh Foundaton Iteratve Dsoverng of User s Preferenes Usng Web Mnng Mae Kewra Futsu Serves, Span, Camno

More information

Voltammetry. Bulk electrolysis: relatively large electrodes (on the order of cm 2 ) Voltammetry:

Voltammetry. Bulk electrolysis: relatively large electrodes (on the order of cm 2 ) Voltammetry: Voltammetry varety of eletroanalytal methods rely on the applaton of a potental funton to an eletrode wth the measurement of the resultng urrent n the ell. In ontrast wth bul eletrolyss methods, the objetve

More information

Semantically Enhanced Uyghur Information Retrieval Model

Semantically Enhanced Uyghur Information Retrieval Model JOURNAL OF SOFTWARE, VOL. 7, NO. 6, JUNE 202 35 Semantally Enhane Uyghur Informaton Retreval Moel Bo Ma Researh Center for Multlngual Informaton Tehnology, Xnang Tehnal Insttute of Physs an Chemstry, Chnese

More information

Phase Transition in Collective Motion

Phase Transition in Collective Motion Phase Transton n Colletve Moton Hefe Hu May 4, 2008 Abstrat There has been a hgh nterest n studyng the olletve behavor of organsms n reent years. When the densty of lvng systems s nreased, a phase transton

More information

CSLDA and LDA fusion based face recognition

CSLDA and LDA fusion based face recognition uhamma Imran RAZZAK,, uhamma Khurram KHA, Khale ALGHAHBAR, Rubyah YUSOF 3 CoEIA, Kng Sau Unversty, Sau Araba (, Internatonal Islam Unversty, Pastan (, CAIRO, Unverst enolog alaysa (3. CSLDA an LDA fuson

More information

technische universiteit eindhoven Analysis of one product /one location inventory control models prof.dr. A.G. de Kok 1

technische universiteit eindhoven Analysis of one product /one location inventory control models prof.dr. A.G. de Kok 1 TU/e tehnshe unverstet endhoven Analyss of one produt /one loaton nventory ontrol models prof.dr. A.G. de Kok Aknowledgements: I would lke to thank Leonard Fortun for translatng ths ourse materal nto Englsh

More information

The Origin of Aromaticity

The Origin of Aromaticity The Orgn of Aromatty Ths problem s one of the hstores of organ hemstry. Many researhers have propose ther own unerstanng methos amng at soluton. owever, t s not hear that the problem of the orgn why aromat

More information

Some Results on the Counterfeit Coins Problem. Li An-Ping. Beijing , P.R.China Abstract

Some Results on the Counterfeit Coins Problem. Li An-Ping. Beijing , P.R.China Abstract Some Results on the Counterfet Cons Problem L An-Png Bejng 100085, P.R.Chna apl0001@sna.om Abstrat We wll present some results on the ounterfet ons problem n the ase of mult-sets. Keywords: ombnatoral

More information

Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen

Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen Hopfeld networks and Boltzmann machnes Geoffrey Hnton et al. Presented by Tambet Matsen 18.11.2014 Hopfeld network Bnary unts Symmetrcal connectons http://www.nnwj.de/hopfeld-net.html Energy functon The

More information

ENG 8801/ Special Topics in Computer Engineering: Pattern Recognition. Memorial University of Newfoundland Pattern Recognition

ENG 8801/ Special Topics in Computer Engineering: Pattern Recognition. Memorial University of Newfoundland Pattern Recognition EG 880/988 - Specal opcs n Computer Engneerng: Pattern Recognton Memoral Unversty of ewfoundland Pattern Recognton Lecture 7 May 3, 006 http://wwwengrmunca/~charlesr Offce Hours: uesdays hursdays 8:30-9:30

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Informaton Search and Management Probablstc Retreval Models Prof. Chrs Clfton 7 September 2018 Materal adapted from course created by Dr. Luo S, now leadng Albaba research group 14 Why probabltes

More information

Cooperative Self Encoded Spread Spectrum in Fading Channels

Cooperative Self Encoded Spread Spectrum in Fading Channels I. J. Communatons, etwork an Sstem Senes, 9,, 9-68 Publshe Onlne Ma 9 n SRes (http://www.srp.org/journal/jns/). Cooperatve Self Enoe Sprea Spetrum n Fang Channels Kun HUA, Won Mee JAG, Lm GUYE Unverst

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Optimal Resource Allocation in Satellite Networks: Certainty Equivalent Approach versus Sensitivity Estimation Algorithms

Optimal Resource Allocation in Satellite Networks: Certainty Equivalent Approach versus Sensitivity Estimation Algorithms Optmal Resoure Alloaton n Satellte Networks: Certanty Equvalent Approah versus Senstvty Estmaton Algorthms Frano Davol*, Maro Marhese, Maurzo Mongell* * DIST - Department of Communatons, Computer an Systems

More information

Introduction to Molecular Spectroscopy

Introduction to Molecular Spectroscopy Chem 5.6, Fall 004 Leture #36 Page Introduton to Moleular Spetrosopy QM s essental for understandng moleular spetra and spetrosopy. In ths leture we delneate some features of NMR as an ntrodutory example

More information

15-381: Artificial Intelligence. Regression and cross validation

15-381: Artificial Intelligence. Regression and cross validation 15-381: Artfcal Intellgence Regresson and cross valdaton Where e are Inputs Densty Estmator Probablty Inputs Classfer Predct category Inputs Regressor Predct real no. Today Lnear regresson Gven an nput

More information

CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING INTRODUCTION

CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING INTRODUCTION CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING N. Phanthuna 1,2, F. Cheevasuvt 2 and S. Chtwong 2 1 Department of Electrcal Engneerng, Faculty of Engneerng Rajamangala

More information

p(z) = 1 a e z/a 1(z 0) yi a i x (1/a) exp y i a i x a i=1 n i=1 (y i a i x) inf 1 (y Ax) inf Ax y (1 ν) y if A (1 ν) = 0 otherwise

p(z) = 1 a e z/a 1(z 0) yi a i x (1/a) exp y i a i x a i=1 n i=1 (y i a i x) inf 1 (y Ax) inf Ax y (1 ν) y if A (1 ν) = 0 otherwise Dustn Lennon Math 582 Convex Optmzaton Problems from Boy, Chapter 7 Problem 7.1 Solve the MLE problem when the nose s exponentally strbute wth ensty p(z = 1 a e z/a 1(z 0 The MLE s gven by the followng:

More information

Adaptive Microphone Arrays for Noise Suppression in the Frequency Domain

Adaptive Microphone Arrays for Noise Suppression in the Frequency Domain Seon Cost 9 Workshop on Aaptve Algorthms n Communatons, Boreaux, 3.9-..99 Aaptve Mrophone Arrays for ose Suppreo the Frequeny Doman K. U. Smmer, A. Wasleff Department of Physs an Eletral Engneerng Unversty

More information

Support Vector Machines CS434

Support Vector Machines CS434 Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? Intuton of Margn Consder ponts A, B, and C We

More information

TIME-VARYING LINEAR PREDICTION FOR SPEECH ANALYSIS

TIME-VARYING LINEAR PREDICTION FOR SPEECH ANALYSIS 5th European Sgnal roessng Conferene (EUSICO 7), oznan, olan, September 3-7, 7, opyrght by EURASI IME-VARYIG LIEAR REDICIO FOR SEECH AALYSIS Karl Shnell an Arl Laro Insttute of Apple hyss, Goethe-Unversty

More information

Handwriting Recognition Using Position Sensitive Letter N-Gram Matching

Handwriting Recognition Using Position Sensitive Letter N-Gram Matching Handwrtng Reognton Usng Poston Senstve Letter N-Gram Mathng Adnan El-Nasan, Srharsha Veeramahanen, George Nagy DoLab, Rensselaer Polytehn Insttute, Troy, NY 12180 elnasan@rp.edu Abstrat We propose further

More information

Support Vector Machines. Vibhav Gogate The University of Texas at dallas

Support Vector Machines. Vibhav Gogate The University of Texas at dallas Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Chapter 2 Transformations and Expectations. , and define f

Chapter 2 Transformations and Expectations. , and define f Revew for the prevous lecture Defnton: support set of a ranom varable, the monotone functon; Theorem: How to obtan a cf, pf (or pmf) of functons of a ranom varable; Eamples: several eamples Chapter Transformatons

More information

Ensemble Validation: Selectivity has a Price, but Variety is Free

Ensemble Validation: Selectivity has a Price, but Variety is Free Enseble Valdaton: Seletvty has a e, but Varety s Free Er Bax Verzon baxhoe@yahoo.o Farshad Koot Faebook arxv:60.0234v2 [stat.ml] 25 Apr 208 Abstrat Suppose soe lassfers are seleted fro a set of hypothess

More information

Discriminative classifier: Logistic Regression. CS534-Machine Learning

Discriminative classifier: Logistic Regression. CS534-Machine Learning Dscrmnatve classfer: Logstc Regresson CS534-Machne Learnng robablstc Classfer Gven an nstance, hat does a probablstc classfer do dfferentl compared to, sa, perceptron? It does not drectl predct Instead,

More information

Analysis of Heterocatalytic Reactor Bed Based on Catalytic Pellet Models

Analysis of Heterocatalytic Reactor Bed Based on Catalytic Pellet Models III Internatonal Intersplnary Tehnal Conferene of Young entsts 19-1 May 010, Pozna, Polan Analyss of Heteroatalyt Reator Be Base on Catalyt Pellet Moels yörgy Rá, Unversty of Pannona Tamás Varga, Unversty

More information

ESCI 341 Atmospheric Thermodynamics Lesson 10 The Physical Meaning of Entropy

ESCI 341 Atmospheric Thermodynamics Lesson 10 The Physical Meaning of Entropy ESCI 341 Atmospherc Thermodynamcs Lesson 10 The Physcal Meanng of Entropy References: An Introducton to Statstcal Thermodynamcs, T.L. Hll An Introducton to Thermodynamcs and Thermostatstcs, H.B. Callen

More information

Graphical representation of constitutive equations

Graphical representation of constitutive equations Graphal representaton of onsttutve equatons Ger Guehus, o. Prof. em. Dr.-Ing. Dr. h.. Professor Emertus Unversty of Karlsruhe Insttute of Sol Mehans an Rok Mehans Engler-unte-Rng 4 763 Karlsruhe, Germany

More information

Adaptive Multilayer Neural Network Control of Blood Pressure

Adaptive Multilayer Neural Network Control of Blood Pressure Proeedng of st Internatonal Symposum on Instrument Sene and Tenology. ISIST 99. P4-45. 999. (ord format fle: ISIST99.do) Adaptve Multlayer eural etwork ontrol of Blood Pressure Fe Juntao, Zang bo Department

More information

New Liu Estimators for the Poisson Regression Model: Method and Application

New Liu Estimators for the Poisson Regression Model: Method and Application New Lu Estmators for the Posson Regresson Moel: Metho an Applcaton By Krstofer Månsson B. M. Golam Kbra, Pär Sölaner an Ghaz Shukur,3 Department of Economcs, Fnance an Statstcs, Jönköpng Unversty Jönköpng,

More information