Similarity Model and Term Association for Document Categorization

Huaizhong KOU (HUAIZHONG.KOU@PRISM.UVSQ.FR)
PRiSM Laboratory, University of Versailles Saint-Quentin, 45 Etats-Unis Road, 78035 Versailles, France

Georges GARDARIN (GEORGES.GARDARIN@E-XMLMEDIA.FR)
e-xmlmedia, 31 Avenue du Général LECLERC, 92340 BOURG LA REINE, France

Abstract

This paper addresses the similarity model and term association for similarity-based document categorization. Both Euclidean distance-based and cosine-based similarity models are widely used to measure document similarity in the information retrieval and document categorization community. These two similarity models are based on the assumption that term vectors are orthogonal, so term associations are ignored in them. In fact, this assumption is not true. In the context of document categorization, we analyze the properties of term-document space, term-category space and category-document space. Then, without the assumption of term independence, we propose a new mathematical model to estimate the association between terms. Unlike other models of term relationship, we make as much use as possible of the existing category membership represented by the corpus, and the objective is to improve categorization performance. By introducing association between terms, we take term association into account when calculating document similarity and define an ε-similarity model of documents. Experiments have been done with a k-NN classifier over the Reuters-21578 corpus. The empirical results show that using term association can improve the effectiveness of a categorization system and that the ε-similarity model outperforms the one that does not consider term association.

1. Introduction

Document categorization is the procedure of assigning one or multiple predefined category labels to a free text document (a category is sometimes called a topic or a theme). It is a useful component of various language processing applications. A primary application of text categorization is to assign subject categories to documents to support information retrieval, or to aid human indexers in assigning such categories. Categorization can also be used to build a personalized net news filter which learns about the news-reading preferences of a user. Besides, it can be used to route text to category-specific processing mechanisms, and to guide a user's search on the World Wide Web. Document categorization can ease the organization of increasing textual information, in particular Web pages and other electronic documents.

Various mathematical models have been proposed to represent documents for document categorization systems, including for example a probabilistic model (Lewis & Ringuette, 1994) based on the computation of relevance probabilities for the documents of a collection, and the vector space model (Salton, 1989). The vector space model is the simplest to use and in some ways the most productive (Salton, 1989).
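Assume that an optimal subset of important features (words, word roots, compound words or phrases, etc.) is selected. Under the vector space model, documents are represented as a bag of these features with their weights in the documents. That is, each document is mapped to a vector in the high-dimensional space of features, where each feature corresponds to an axis. Based on this representation model, many algorithms have been intensively studied, including the support vector machine (SVM) algorithm (Joachims, 1998), the nearest neighbor (k-NN) algorithm (Tuba and Guvenir, 1998), the Rocchio feedback algorithm (Yang, Ault, Pierce, and Lattimer, 2000), etc.

All algorithms cited above are based on a similarity model by which two documents can be compared. There are two main similarity models: the Euclidean distance model and the cosine function model. In order to use Euclidean distance, axis vectors are mathematically assumed orthogonal, that is, features are orthogonal. This assumption is also necessary for the cosine function model. But this assumption does not hold in reality (Raghavan & Wong, 1986): there exist associations between terms, and these two similarity models do not take them into account.

In the context of document categorization, this paper first analyzes the document vector model, term-category space and category-document space, then proposes a mathematical model to estimate term association without the assumption of orthogonal terms. Such term associations are then used in the ε-similarity model of documents. The idea is as follows: each category is represented as a binary vector in category-document space whose elements indicate whether a sample document belongs to it, and the similarity between categories is estimated by the Jaccard coefficient; the weights of feature terms in categories can be calculated by using, for example, the χ² algorithm, and feature terms are mapped to vectors in term-category space whose axes are categories, so that the association between feature terms can be estimated; this estimate of term association is then introduced into the similarity model of document vectors in term-document space.

The remainder is organized as follows. Section 2 analyzes the existing similarity models of documents. In Section 3, after analyzing term-category space and category-document space, a mathematical model is proposed for the estimation of term association, and a similarity model of documents is defined in which term associations are taken into account. In Section 4, some experimental results are shown. Section 5 concludes this paper.

2. Analysis of Similarity Model

2.1 General Notations

Here we present some notation used in the later sections. $D$ is a set of example documents and $n$ the number of documents in $D$; $T$ is a set of feature terms optimally selected from documents in $D$ and $m$ the number of feature terms in $T$; $C$ is a set of domain-specific categories and $K$ the number of categories in $C$; $tf_{ik}$ is the number of times the $k$-th term occurs in the $i$-th document, and $df_k$ is the number of documents in which the $k$-th term occurs.

2.2 Document Vector Model

Under the vector space representation model, a document $d_i$ is represented as a term vector of the following form, also denoted by $d_i$:

\[ d_i = (w_{i1}, w_{i2}, \ldots, w_{im}) \tag{2.1} \]

where $w_{ik}$ ($i = 1, 2, \ldots, n$ and $k = 1, 2, \ldots, m$) is the weight of the $k$-th term in the $i$-th document $d_i \in D$, and measures the extent to which the term contributes to both document content representation and discrimination. There exist different approaches to the calculation of $w_{ik}$, including the tf-idf algorithm (Salton, 1983), the tfc algorithm (Salton & Buckley, 1988) and the entropy algorithm (Dumais, 1991).

In the term-document space, each feature term corresponds to an axis vector and each document to a point or vector. If we represent the axis vector corresponding to the $k$-th term by $t_k$, then a document vector can be represented as a linear combination of axis vectors:

\[ d_i = \sum_{k=1}^{m} w_{ik}\, t_k \tag{2.2} \]

The axis vectors $t_k$ ($k = 1, 2, \ldots, m$) are linearly independent; otherwise a maximal linearly independent group of axis vectors can be determined by linear algebra theory and used in (2.2).

2.3 Similarity Model of Documents

Given two documents $d_i$ and $d_j \in D$ represented in the form (2.2), the similarity between them can be measured by the inner product (Salton, 1989):

\[ sim(d_i, d_j) = d_i \cdot d_j = \sum_{k=1}^{m} w_{ik} w_{jk}\, (t_k \cdot t_k) + \sum_{k=1}^{m} \sum_{l=1,\, l \neq k}^{m} w_{ik} w_{jl}\, (t_k \cdot t_l) \tag{2.3} \]

From (2.3), we can see that computing the similarity between two document vectors depends not only on the documents but also on knowledge of the term correlations $t_k \cdot t_l$ for all term pairs. But term associations or correlations are usually not available a priori, and it is not simple to generate useful term associations.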
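To make the two parts of (2.3) concrete, here is a minimal sketch in plain Python. It assumes the term correlations $t_k \cdot t_l$ are supplied as a small matrix `G`; the function name `split_inner_product` and the toy numbers are hypothetical and not taken from the paper.

```python
def split_inner_product(d_i, d_j, G):
    """Return the two parts of the inner product (2.3).

    d_i, d_j : lists of term weights w_ik and w_jk (length m)
    G        : m x m list of lists, G[k][l] = t_k . t_l (assumed given)
    """
    m = len(d_i)
    # First part: same-term contributions (k == l).
    orthogonal = sum(d_i[k] * d_j[k] * G[k][k] for k in range(m))
    # Second part: cross-term contributions (k != l).
    non_orthogonal = sum(
        d_i[k] * d_j[l] * G[k][l]
        for k in range(m)
        for l in range(m)
        if k != l
    )
    return orthogonal, non_orthogonal


# Toy example: three terms, where terms 0 and 1 are correlated (0.5).
d_i = [1.0, 2.0, 0.0]
d_j = [0.0, 1.0, 3.0]
G = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
orthogonal, non_orthogonal = split_inner_product(d_i, d_j, G)
print(orthogonal, non_orthogonal)  # prints: 2.0 0.5
```

In this toy case the second part contributes 0.5 on top of the same-term part of 2.0; that contribution is lost whenever the terms are treated as uncorrelated.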
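In practice, term correlation is often circumvented by assuming that the terms are not correlated, in which case the term vectors are orthogonal: $t_k \cdot t_l = 0$ for $k \neq l$ and $t_k \cdot t_l = 1$ for $k = l$. So (2.3) can be rewritten as (2.4):

\[ sim(d_i, d_j) = \sum_{k=1}^{m} w_{ik} w_{jk} \tag{2.4} \]

We call the first and the second part of (2.3) the orthogonal and the non-orthogonal part respectively. The non-orthogonal part describes the role that correlation between terms plays in computing similarity. Obviously, the non-orthogonal part of (2.3) disappears from (2.4). That is, associations between terms are not taken into account at all in the commonly used inner product model.

Again on the basis of the orthogonality assumption about terms, the cosine similarity model is introduced as follows (Salton, 1989):

\[ \cos(d_i, d_j) = \frac{\sum_{k=1}^{m} w_{ik} w_{jk}}{|d_i|\,|d_j|} \tag{2.5} \]

where $|d_i|$ and $|d_j|$ are the lengths of documents $d_i$ and $d_j$. Often, document vectors are normalized to unit length, which implies that (2.4) and (2.5) are the same. Under this assumption, other similarity models have also been proposed in the information retrieval community; see Salton (1989). Ignoring term association certainly affects system performance, for example accuracy and effectiveness.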
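The relation between (2.4) and (2.5) can be checked with a short sketch. It assumes documents are plain lists of term weights; the helper names `dot`, `cosine` and `normalize` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product under the orthogonality assumption, as in (2.4)."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity, as in (2.5); returns 0.0 for a zero vector."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


def normalize(u):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm_u = math.sqrt(dot(u, u))
    return list(u) if norm_u == 0.0 else [a / norm_u for a in u]


d_i = [3.0, 4.0, 0.0]
d_j = [4.0, 3.0, 0.0]
cos_ij = cosine(d_i, d_j)
# After unit-length normalization the plain inner product gives the same value.
dot_unit = dot(normalize(d_i), normalize(d_j))
```

Both `cos_ij` and `dot_unit` come out at about 0.96 here, which is the sense in which (2.4) and (2.5) coincide for unit-length document vectors.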

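3. Mathematical Model for Term Association

On the basis of the term-document matrix, one approach to the estimation of term association has been proposed (Salton, 1989): each term vector is first represented as a linear combination of document vectors, then term association can be estimated under the assumption that the document vectors are themselves orthogonal. Such an assumption is clearly unreasonable for any kind of practical document collection (Raghavan & Wong, 1986). Another approach uses term co-occurrence in training documents to estimate term association (Salton, 1983).

Under the vector space model, terms and documents are represented in the same vector space, and only the occurrences of terms in documents are used. In the document categorization context, there exists other important information about the relationships between terms and categories and between documents and categories. We attempt to make use of this information and introduce term-category space and document-category space. It is in these two spaces that associations between terms are estimated. We believe that term association estimated in this way can improve categorization performance.

From the document categorization point of view, this section proposes a mathematical model to approximately estimate the term association in (2.3) via two spaces: term-category space and document-category space. This model does not need the orthogonality assumption discussed in (Raghavan & Wong, 1986).

3.1 Term-Category space

Given a corpus, we can estimate the power that each term in $T$ has for category prediction. To do this, there exist several algorithms (Yang and Pedersen, 1997):

Information gain: measures the number of bits of information obtained for category prediction by knowing the presence or absence of a term in a document. The information gain of term $t$ with respect to the categories $c_k \in C$ is defined to be:

\[ G(t) = -\sum_{k=1}^{K} P(c_k) \log P(c_k) + P(t) \sum_{k=1}^{K} P(c_k \mid t) \log P(c_k \mid t) + P(\bar{t}) \sum_{k=1}^{K} P(c_k \mid \bar{t}) \log P(c_k \mid \bar{t}) \tag{3.1} \]

where $P(c_k)$ and $P(t)$ are the probabilities of category $c_k$ and term $t$ respectively in the corpus, and $\bar{t}$ denotes the absence of $t$.

CHI-Square statistic: measures the lack of independence between a term $t$ and a category $c$, that is, the extent to which a term belongs to a category. It is defined to be:

\[ \chi^2(t, c) = \frac{N\,(AD - CB)^2}{(A + C)(B + D)(A + B)(C + D)} \tag{3.2} \]

where $N$ is the total number of training documents in the corpus, $A$ the number of documents belonging to category $c$ and containing term $t$, $B$ the number of documents belonging to $c$ but not containing $t$, $C$ the number of documents not belonging to $c$ but containing $t$, and $D$ the number of documents neither belonging to $c$ nor containing $t$.

Each term $t_k$ can then be mapped to a vector, represented as (3.3), in term-category space, in which categories correspond to axes:

\[ t_k = (v_{k1}, v_{k2}, \ldots, v_{kK}) \tag{3.3} \]

In (3.3), $v_{kp}$ ($p = 1, 2, \ldots, K$) is the measure of the category prediction power of the $k$-th term for category $c_p \in C$; in our presentation it is calculated by, for example, the CHI-square algorithm.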
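As an illustration of (3.2) and (3.3), the sketch below computes the χ² value from the four document counts and stacks the values over categories into a term-category vector. The function names and the counts are hypothetical; returning 0.0 when a marginal count is zero is an assumption made only to avoid division by zero.

```python
def chi_square(A, B, C, D):
    """Chi-square statistic (3.2) from document counts.

    A: documents in the category that contain the term
    B: documents in the category that do not contain the term
    C: documents outside the category that contain the term
    D: documents outside the category that do not contain the term
    """
    N = A + B + C + D
    denominator = (A + C) * (B + D) * (A + B) * (C + D)
    if denominator == 0:
        return 0.0  # assumption: undefined cases are scored as 0
    return N * (A * D - C * B) ** 2 / denominator


def term_category_vector(counts_per_category):
    """Build the vector (3.3): one chi-square value per category."""
    return [chi_square(A, B, C, D) for (A, B, C, D) in counts_per_category]


# Toy counts for one term over two categories (100 documents in total).
counts = [
    (30, 10, 20, 40),  # term is associated with the first category
    (10, 10, 40, 40),  # term is independent of the second category
]
t_vec = term_category_vector(counts)
```

With these counts the first component is 100/6 (about 16.67) and the second is exactly 0, because in the second category $AD = CB$.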
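Furthermore, without loss of generality, we can assume that the category vectors $c_p$ are linearly independent; otherwise we can select a maximal linearly independent group from them to replace them. In this case, term vectors can be represented as a linear combination of category vectors, as in (2.2):

\[ t_k = \sum_{p=1}^{K} v_{kp}\, c_p \tag{3.4} \]

With the term vector representation (3.4), the similarity between two terms $t_k$, $t_l$ can be calculated as an inner product:

\[ t_k \cdot t_l = \sum_{p=1}^{K} \sum_{q=1}^{K} v_{kp} v_{lq}\, (c_p \cdot c_q) \tag{3.5} \]

Unlike the approach to the estimation of term association proposed in (Salton, 1989), (3.5) estimates term association at the level of categories in term-category space. In the categorization context, we believe that such an estimate is reasonable. But it involves the similarity between two categories, $c_p \cdot c_q$. In order to calculate the similarity between categories, we go to category-document space, where category similarity can be estimated approximately.

3.2 Category-Document space

In the document categorization context, each category is characterized by the training documents that belong to it. We can represent each category as a Boolean vector of documents in category-document space, where each training document corresponds to an axis. Given a category $c_p \in C$, we have the Boolean vector representation:

\[ c_p = (c_{p1}, c_{p2}, \ldots, c_{pn}) \tag{3.6} \]

where $c_{pi} = 1$ if $d_i$ belongs to $c_p$, and $c_{pi} = 0$ otherwise. Hence category-document space is a Boolean vector space. In this case, we use the Jaccard coefficient (Salton, 1989) to calculate the similarity between two categories $c_p$ and $c_q$:

\[ c\_sim(c_p, c_q) = \frac{c_p \cdot c_q}{c_p \cdot c_p + c_q \cdot c_q - c_p \cdot c_q} \tag{3.7} \]

where $c_p \cdot c_p$ and $c_q \cdot c_q$ are the numbers of documents of categories $c_p$ and $c_q$ respectively, and $c_p \cdot c_q$ is the number of documents common to $c_p$ and $c_q$. Obviously, no assumption is needed for computing $c\_sim(c_p, c_q)$. From (3.7), we obtain $c\_sim(c_p, c_p) = 1$ for every category $c_p \in C$. Furthermore, if every training document belongs to at most one category, then $c\_sim(c_p, c_q) = 0$ for $p \neq q$.

3.3 Term Association

On the basis of (3.5) and (3.7), the association between two terms $t_k$, $t_l$, denoted by $a(t_k, t_l)$, is defined in (3.8):

\[ a(t_k, t_l) = \sum_{p=1}^{K} v_{kp} v_{lp} + \sum_{p=1}^{K} \sum_{q=1,\, q \neq p}^{K} v_{kp} v_{lq}\; c\_sim(c_p, c_q) \tag{3.8} \]

From the discussion of (3.7), we know that if every training document belongs to at most one category, the second part of (3.8) is zero. In this case, we have (3.9):

\[ a(t_k, t_l) = \sum_{p=1}^{K} v_{kp} v_{lp} \tag{3.9} \]

This result coincides with the case where category vectors are orthogonal in term-category space; see (3.5). Note that the association between terms given by (3.8) is unbounded. We normalize $a(t_k, t_l)$ so that association values fall in the interval [0, 1]:

\[ a(t_k, t_l) \leftarrow \frac{a(t_k, t_l)}{\max_{k', l'} a(t_{k'}, t_{l'})} \tag{3.10} \]

Now, given an $\varepsilon$ with $0 \le \varepsilon \le 1$, we define the ε-association between two terms by (3.11):

\[ \varepsilon\text{-}a(t_k, t_l) = \begin{cases} 0 & \text{if } a(t_k, t_l) \le \varepsilon \\ a(t_k, t_l) & \text{if } a(t_k, t_l) > \varepsilon \end{cases} \tag{3.11} \]

Obviously, if $\varepsilon = 0$, $\varepsilon\text{-}a(t_k, t_l)$ is equal to $a(t_k, t_l)$. The function of $\varepsilon\text{-}a(t_k, t_l)$ is to filter out those term pairs with relatively weak association.

3.4 ε-Similarity Model

Now, by using $\varepsilon\text{-}a(t_k, t_l)$ defined in (3.11) to approximate the term association in the non-orthogonal part of (2.3), and combining (2.4) with the non-orthogonal part of (2.3), we introduce the ε-similarity model of a document pair $(d_i, d_j)$ in (3.12):

\[ \varepsilon\text{-}sim(d_i, d_j) = \sum_{k=1}^{m} w_{ik} w_{jk} + \sum_{k=1}^{m} \sum_{l=1,\, l \neq k}^{m} w_{ik} w_{jl}\; \varepsilon\text{-}a(t_k, t_l) \tag{3.12} \]

The association between terms is introduced in (3.12), but (3.12) is different from (2.3). The ε-similarity model extends the simple models in (2.4) and (2.5). It is clear that if $\varepsilon = 1$, (3.12) is the same as (2.4) and (2.5). Compared with the simple similarity models (2.4) and (2.5), we expect the introduction of term association in our ε-similarity model to improve categorization performance. The purpose of $\varepsilon$ is to filter out term pairs with low association.

4. Experiment and Analysis

4.1 Experiment design

We use the Reuters-21578 collection as the experiment benchmark. The ApteMod split strategy is taken, but only the 9805 news stories with a news body are kept; news stories without a news body or without a category label are removed. As a result, 7063 training documents and 2742 test documents are included in our experiments (test/training = 38.8%). Furthermore, we kept only the large categories, those with at least 1% of the training documents. Lastly, we selected 9037 documents, among which 6533 are training documents and 2506 are test documents (38.4%). The average number of categories per document is 1.3, and the maximum number of categories to which one document belongs is 7. After removing 319 stop-words, the Porter stemming algorithm (Porter, 1980) is performed to convert words into their word stems, and 14,743 unique terms are obtained. Then the χ² approach (Yang and Pedersen, 1997) is used to measure the predictive power of terms for categories.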
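For readers who want to trace how the quantities of Section 3 fit together, the following is a minimal plain-Python sketch of (3.7), (3.8), (3.10), (3.11) and (3.12). All function names and toy values are hypothetical. Two choices are assumptions rather than statements from the paper: the maximum in the normalization is taken over all term pairs, and a normalized association is kept only when it is strictly greater than ε.

```python
def jaccard(c_p, c_q):
    """Category similarity (3.7) for Boolean document-membership vectors."""
    common = sum(a & b for a, b in zip(c_p, c_q))
    union = sum(c_p) + sum(c_q) - common
    return common / union if union else 0.0


def term_association(V, c_sim):
    """Unnormalized association a(t_k, t_l) of (3.8) for all term pairs.

    V[k][p]     : weight of term k for category p (for example chi-square)
    c_sim[p][q] : category similarity from (3.7)
    """
    m, K = len(V), len(c_sim)
    a = [[0.0] * m for _ in range(m)]
    for k in range(m):
        for l in range(m):
            same = sum(V[k][p] * V[l][p] for p in range(K))
            cross = sum(
                V[k][p] * V[l][q] * c_sim[p][q]
                for p in range(K) for q in range(K) if p != q
            )
            a[k][l] = same + cross
    return a


def eps_association(a, eps):
    """Normalize as in (3.10), then zero out weak pairs as in (3.11)."""
    top = max(max(row) for row in a)  # assumption: maximum over all pairs
    if top <= 0:
        return [[0.0] * len(row) for row in a]
    return [[x / top if x / top > eps else 0.0 for x in row] for row in a]


def eps_similarity(d_i, d_j, ea):
    """Document similarity (3.12): orthogonal part plus filtered cross terms."""
    m = len(d_i)
    same = sum(d_i[k] * d_j[k] for k in range(m))
    cross = sum(
        d_i[k] * d_j[l] * ea[k][l]
        for k in range(m) for l in range(m) if k != l
    )
    return same + cross


# Toy data: 3 categories over 4 training documents, 3 terms.
cats = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
c_sim = [[jaccard(p, q) for q in cats] for p in cats]
V = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
a = term_association(V, c_sim)

d_i, d_j = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]  # documents share no term
sim_loose = eps_similarity(d_i, d_j, eps_association(a, 0.2))
sim_strict = eps_similarity(d_i, d_j, eps_association(a, 0.5))
```

In this toy case the two documents have no term in common, so (2.4) gives 0. The first two categories share one document, which makes the first two terms associated (normalized value 2/9). With ε = 0.2 that association survives and the similarity is 2/9; with ε = 0.5 it is filtered out and the similarity falls back to 0.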
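In our experiments, the 1100 feature terms with the highest χ² values are selected as the optimal subset of feature terms, and the vector model is adopted to represent documents by using the tf-idf term weighting model (Salton, 1989). k-NN is conducted by taking the value of k as 10, 20, 30, 40, 50, 60 and 70 respectively, and the RCut thresholding strategy (Yang, 2001) is used with value 1. Two category score algorithms, SSA and IBW (Kou and Gardarin, 2002), are tested. We set ε to 0.6 for the ε-similarity model, and the association values of only 460 pairs of terms are used to calculate the similarity of documents.

During the learning phase, the weights of terms in different categories are calculated by (3.2) and term associations are estimated by (3.8). These two calculations are very heavy, of order $O(m^2)$. To ease this problem, one can calculate the two parts of (3.12) by using a parallel algorithm.

4.2 Evaluation of performance

In our experiments, standard approaches to the evaluation of categorization system performance are used, among which are micro- and macro-averaged F1 and the 11-point recall-precision curve.

4.2.1 EVALUATION OF BINARY CLASSIFICATION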

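Given a category, recall ($r$) is the proportion of correctly assigned documents among all documents belonging to the category, and precision ($p$) is the proportion of correctly assigned documents among all assigned documents. Usually there is a trade-off between recall and precision. By combining recall and precision, Van Rijsbergen proposed the F-measure, which some researchers use (Hayes and Weinstein, 1990; Lewis, Schapire et al., 1996), as follows:

\[ F_\beta(r, p) = \frac{(\beta^2 + 1)\, p\, r}{\beta^2 p + r} \]

where $\beta$ is the parameter allowing differential weighting of $p$ and $r$. When the value of $\beta$ is set to one (denoted as $F_1$), recall and precision are weighted equally:

\[ F_1(r, p) = \frac{2\, p\, r}{p + r} \]

A high $F_1(r, p)$ value means high performance of the system. When recall and precision are the same, $F_1(r, p)$ equals their common value.

4.2.2 MICRO- AND MACRO-AVERAGE

There are two ways to measure the average performance of a binary classifier over multiple categories, namely the macro-average and the micro-average (Yang, 1999; Joachims, 1998). The macro-average gives an equal weight to the performance on every category, regardless of how rare or how common a category is. The micro-average, however, gives an equal weight to the performance on every document (category instance), thus favoring the performance on common categories. The micro-average is used in (Yang, 1999; Joachims, 1998). Both micro-average and macro-average are used in (Yang and Liu, 1999). The micro-average F1 has been widely used in cross-method comparisons. The micro-average scores tend to be dominated by the classifier performance on common categories, while the macro-average scores are more influenced by the performance on rare categories (Yang and Liu, 1999).

4.2.3 11-POINT RECALL-PRECISION GRAPH

The classifier often produces a score for each category-document pair. In this case, given a category, we can rank documents in descending score order. Then, by going down the ranked documents, we calculate the recall and precision values each time a relevant document is encountered. We call a calculated recall-precision pair an observed point. Finally, we use these observed points to produce 11 new points with prefixed recall values (ranging from 0 to 1.0 with step 0.1) and join up the 11 new points into a recall-precision curve for the category. To obtain the recall-precision curve for the classifier, we can create 11 points by averaging precision over all categories at each recall level. Among the 11 prefixed recall values there exist some to which no observed precision value corresponds. We call such recall values blind points. A linear interpolation technique is adopted to estimate the most suitable precision for such blind points.

4.3 Results and analysis

Table 1 shows the results produced by k-NN with the conventional similarity model (2.4) and the ε-similarity model (3.12), reported side by side for each measure.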
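Table 1. Macro-average performance of k-NN with similarity models (2.4) and (3.12)

          Recall            Precision         F1
k         (2.4)    (3.12)   (2.4)    (3.12)   (2.4)    (3.12)
10        0.643    0.647    0.864    0.865    0.715    0.718
20        0.661    0.663    0.86     0.861    0.76     0.76
30        0.649    0.653    0.863    0.863    0.716    0.717
40        0.69     0.695    0.874    0.874    0.755    0.757
50        0.693    0.697    0.885    0.886    0.759    0.76
60        0.696    0.7      0.887    0.888    0.76     0.763
70        0.698    0.70     0.893    0.893    0.76     0.764
avg       0.676    0.679    0.875    0.876    0.74     0.744

We can see that the ε-similarity model always outperforms the conventional similarity model for the different k values, even if only slightly. This shows that using term association does improve the effectiveness of categorization. Only a small gain in performance is achieved by (3.12) over (2.4). The cause is that only a small part of the term pair associations (460 vs. 604450, the total number of term pairs) is used to calculate document similarity in our experiments. In addition, we only use 1100 feature terms to represent documents, and 1100 is not large enough compared with the number of example documents (9037).

In Figure 1 and Figure 2, we use recall-precision curves to illustrate the performance of the two similarity models. Only the experimental results for two small values and two large values of k are reported here, so that the curves remain clear. Figure 1 gives the results for k = 20 and 30. For k = 20, the ε-similarity model is superior to the cosine model at the low and middle levels of recall, i.e. [0, 0.7], while they are almost the same at the high level of recall (0.7, 1]. For k = 30, things change a little. The ε-similarity model is superior to the cosine model at the low and middle levels of recall, but slightly inferior to it at the level (0.7, 0.9), and they are the same at the level [0.9, 1]. Figure 2 gives the results for k = 60 and 70. Compared with the cases k = 20 and 30, the relation between the recall-precision curves is relatively stable for k = 60 and 70. The ε-similarity model outperforms the cosine model at almost all levels of recall, except that it is inferior to the latter at the lowest level [0, 0.15] of recall. As shown by the recall-precision curves, both the ε-similarity model and the cosine model reach their peak performance in our experiments when k = 60.

Figure 1. Recall-precision curves by k-NN with similarity models (2.4) and (3.12) respectively, k = 20, 30.

Figure 2. Recall-precision curves by k-NN with similarity models (2.4) and (3.12) respectively, k = 60, 70.

From an overall point of view, the ε-similarity model performs better than the cosine similarity model, in particular at the middle level of recall. These two similarity models have two common parameters: the number of feature terms and the number of nearest neighbors. Besides, the ε-similarity model has a third parameter: the term association threshold. The association threshold in the ε-similarity model can be tuned to further improve performance. Tuning performance through the association threshold is an advantage of the ε-similarity model.

5. Conclusion

In the context of document categorization, this paper addresses the similarity model, which is one fundamental problem of similarity-based document categorization. In practice, one assumes that term vectors are orthogonal in the term-document vector space, so the associations between terms are lost in commonly used similarity models, for example the inner product-based similarity model and the cosine-based similarity model. But such an assumption does not hold. By introducing two spaces at a higher conceptual level, term-category space and document-category space, this paper proposes a mathematical model for estimating term association; the term association given by this model is then integrated with the conventional cosine model of document similarity. Our experiments confirm that using the association between terms can improve the performance of a categorization system.

Acknowledgements

The authors are grateful to Elisabeth Métais for her suggestions and comments on a draft of the paper.

References

Dumais, S. T. (1991). Improving the retrieval of information from external sources. Behavior Research Methods, Instruments and Computers, Vol. 23, No. 2, 1991, pp. 229-236.

Hayes, P. and Weinstein, S. (1990). CONSTRUE/TIS: a system for content-based indexing of a database of news stories. In Second Annual Conference on Innovative Applications of Artificial Intelligence, 1990.

Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning, 1998, pp. 137-142.

Kou, H. & Gardarin, G. (2002). Study of Category Score Algorithms for k-NN Classifier. Submission to ICML, 2002.

Lewis, D. D. & Ringuette, M. (1994). Comparison of two learning algorithms for text categorization. In Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval (SDAIR 94), 1994, pp. 81-93.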

Lewis, D. D., Schapire, R. E., Callan, J. P., & Papka, R. (1996). Training algorithms for linear text classifiers. In Proceedings of the 19th International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY: ACM Press, 1996, pp. 298-303.

Raghavan, V. V. & Wong, S. K. M. (1986). A Critical Analysis of the Vector Space Model for Information Retrieval. Journal of the American Society for Information Science (JASIS), 37:5, September 1986.

Salton, G. (1983). Introduction to Modern Information Retrieval. McGraw-Hill Book Company, 1983.

Salton, G. & Buckley, C. (1988). Term weighting approaches in automatic text retrieval. Information Processing and Management, Vol. 24, No. 5, 1988, pp. 513-523.

Salton, G. (1989). Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading, Massachusetts, 1989.

Tuba, Y. & Guvenir, H. A. (1998). Application of k-Nearest Neighbor on Feature Projections Classifier to Text Categorization. Proceedings of ISCIS-98, 13th International Symposium on Computer and Information Sciences.

Yang, Y. & Pedersen, J. O. (1997). A Comparative Study on Feature Selection in Text Categorization. In the 14th Int. Conf. on Machine Learning, 1997, pp. 412-420.

Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1), 1999, pp. 69-90.

Yang, Y. & Liu, X. (1999). A re-examination of text categorization methods. Proceedings of SIGIR, 1999, pp. 42-49.

Yang, Y.; Ault, T.; Pierce, T. & Lattimer, C. W. (2000). Improving text categorization methods for event tracking. Proceedings of SIGIR 00, 23rd ACM International Conference on Research and Development in Information Retrieval, 2000, pp. 65-72.