On the Effectiveness of Relevance Profiling

Size: px

Start display at page:

Download "On the Effectiveness of Relevance Profiling"

Peter Black
5 years ago
Views:

1 On the Effectveness of Relevance Proflng Davd J Harper Smart Web Technologes Centre The Robert Gordon Unversty Aberdeen AB25 1HG UK d.harper@rgu.ac.uk Davd Lee Smart Web Technologes Centre The Robert Gordon Unversty Aberdeen AB25 1HG UK davd.lee@smartweb.rgu.ac.uk Abstract Relevance proflng s a general process for wthndocument retreval. Gven a query, a profle of retreval status values s computed by sldng a fxed szed wndow across a document. In ths paper, we report a seres of bench experments on relevance proflng, usng an exstng electronc book, and ts assocated book ndex. The book ndex s the source of queres and relevance judgements for the experments. Three weghtng functons based on a language modellng approach are nvestgated, and we demonstrate that the well-known query generaton model outperforms one based on the Kullback-Lebler dvergence, and one based on smple term frequency. The relevance proflng process proved hghly effectve n retrevng relevant pages wthn the electronc book, and exhbts stable performance over a range of sldng wndow szes. The expermental study provdes evdence for the effectveness of relevance proflng for wthn-document retreval, wth the caveat that the experment was conducted wth a partcular electronc book. Keywords relevance proflng; wthn-document retreval; language modellng; nformaton retreval expermentaton. 1 Background Proceedngs of the 9th Australasan Document Computng Symposum, Melbourne, Australa, December 13, Copyrght for ths artcle remans wth the authors. Increasngly, long electronc documents are beng publshed and delvered usng web and other technologes, and users need to fnd nformaton wthn these long documents. Varous approaches have been proposed for wthn-document retreval, ncludng passage retreval [1], and user nterfaces supportng content-based browsng of documents [2]. We have developed a tool, ProfleSkm, for wthn-document retreval based on the concept of relevance proflng [3], and we reported on a comprehensve user-centred evaluaton of ths tool [4] [5]. ProfleSkm ntegrates passage retreval and content-based document browsng. The key concept underpnnng the tool s relevance proflng, n whch a profle of retreval status values s computed across a document n response to a query. Wthn the user nterface, an nteractve bar graph provdes an overvew of ths profle, and through nteracton wth the graph the user can select and browse n stu potentally relevant passages wthn the document. The reader s referred to [3] for a descrpton of, and detaled ratonale for, relevance proflng and the ProfleSkm user nterface. In the user evaluaton study [4] [5], we compared the performance of ProfleSkm aganst a tool based on the sequental nd command delvered wth most browsers and word processng packages. We concluded that: Relevance proflng, as mplemented and presented by the ProfleSkm, was effectve n enablng users to dentfy relevant passages of a document; Usng ProfleSkm, users were able to select and browse relevant passages more effcently, because only the best matchng passages needed to be explored; and Users found the tool satsfyng to use for wthndocument retreval, because of the overvew provded by the relevance profle. That study was based on a smulated work task, namely recreatng part of the (back of book) ndex for an electronc book. On the bass of the results, we tentatvely concluded that relevance proflng n combnaton wth a set of pre-defned ndex entres, mght prove to be an effectve replacement for the typcal ntellectually derved book ndex. In ths paper, we wsh to nvestgate the underlyng relevance proflng mechansm, through controlled bench experments. Specfcally, we want to nvestgate dfferent forms of the weghtng functon used for relevance proflng, and dfferent settngs for parameters of the process. Addtonally, we wanted to nvestgate whether relevance proflng could provde an alternatve to the ntellectually generated book ndex. Consequently, we based our experments on an electronc book, wth an exstng book ndex, that formed the bass of the expermental corpus.

2 2 Relevance Proflng and Weghtng unctons Relevance proflng s a process whereby a profle of retreval status values s computed across a document n response to a query. Ths profle s presented to the user n the form of a bar graph, and by nteractng wth ths bar graph the user can dentfy, and navgate to, relevant sectons of a document. In ths paper each secton, and hence bar n the graph, corresponds to a page of the document (electronc book). The process s sketched n on object-orented pseudo-code n gure 1. Effectvely, a retreval status value (RSV) s computed for each word poston n the document. The relevance proflng procedure operates by sldng a wndow of fxed sze across the document (lnes 7-18, gure 1), and at each word poston a wndow weghtng functon s appled (lne 12). The weghtng functon s a functon of the set of words n the document, query and the wndow, startng at the gven poston and of the gven sze. The wndow weghts are aggregated for each page of the document, and the maxmum weght acheved by a wndow startng n the page s returned (lnes 13-16). Note, that a flter s appled so that only pages contanng at least one term from the query are retreved,.e. assgned a score (lne 10). 1 process relevanceproflng ( Document, Query, weghtnguncton, WndowSze, scoresarray ) 2 begn 3 for Page = 1 to Document.countPages() 4 begn 5 scoresarray[ Page ]= mnmumwndowscore 6 end 7 for wndowposton = 1 to Document.length() 8 begn 9 PageInDoc= Document.pageOf( wndowposton ) 10 f (Document.contansTermOf (Query, PageInDoc) 11 begn 12 wndowweght = weghtnguncton ( Document, Query, wndowposton, WndowSze ) 13 f (wndowweght > scoresarray [ PageInDoc ]) 14 begn 15 scoresarray [ PageInDoc ]= wndowweght 16 end 17 end 18 end 19 end gure 1: Pseudo-code descrbng the relevance proflng process The man purpose of the expermental study s to nvestgate the behavour of the relevance proflng process, and specfcally the effect on retreval effectveness of the choce of wndow sze and weghtng functon. Before descrbng the expermental study, we ntroduce the varous weghtng functons that are nvestgated. The weghtng functons are based on a language modellng approach n whch we model a document, as a probablty dstrbuton over the terms of the vocabulary, and smlarly for each (sldng) wndow. We defne the document model to be the probablty dstrbuton P(t D), where for every term t n the vocabulary (n our case, of the document), the model gves the probablty that we would observe term t f we randomly sampled from the document D. Smlarly, we defne the wndow model, P(t W) for wndow W. We wll defne three weghtng functons (lne 12, gure 1), whch we refer to as the query generaton model (GEN), the Kullback-Lebler model (KL), and a smple term frequency model (REQ). 2.1 Query Generaton Model (GEN) We adapt the language modellng approach proposed for document retreval n [6] [7] for relevance proflng. Language modellng s used to construct a statstcal model for a text wndow, and based on ths model; we compute the wndow RSV as the probablty of generatng a query (denoted Q). We model the dstrbuton of terms (actually stemmed words) over a text wndow, as a mxture of the text wndow and document term dstrbutons as follows: Ρ( Q W ) = p ( t W ) (1) mx t Q where: pmx( t W ) = λ * p( t W ) + ( 1- λ) * p( t D) The estmates are smoothed by the document word statstcs usng the mxng parameter, λ a. The ndvdual word probabltes are estmated n the obvous way usng maxmum lkelhood estmators: p = ( t W) nw nw p ( t D) = nd nd (2) where n W (n D ), and n W (n D ), are the number of term occurrences of term n the wndow (document), and total term occurrences n the wndow (document) respectvely. Equvalently, we can rank wndows by takng the log of both sdes of (1), yeldng: log Ρ( Q W ) = log pmx ( t wn) (3) t Q a We have nvestgated the best value for the mxng parameter emprcally, and smlar results were obtaned n the range 0.8 through In the experments reported, we use 0.8.

3 2.2 Kullback-Lebler Model (KL) The Kullback-Lebler dvergence has been used extensvely n applcatons of language modellng n nformaton retreval [8]. In ths approach, we compute the Kullback-Lebler dvergence between the wndow model and the document model, and we restrct the computaton to the query terms, as follows: ( W D) = p( W ) log( p( W ) p( D ) KL t t t ) t Q (4) We use the strength of the dvergence as a measure of wndow relevance wth respect to a query. Intutvely, the larger the dvergence of the wndow model from the (common) document model, the more lkely the wndow s relevant to the query. Unlke the GEN model, the KL term weghts have both a representaton component (outsde the log), and a dscrmnatve component (the log term). We are unable to use maxmum lkelhood estmates for the component probabltes, as we must avod estmatng zero for p(t W). We use the estmaton rule attrbuted to Jefferys [9, p. 127] to estmate the parameters as follows: ( n W + 0.5) ( n W 1.0) ( n + 0.5) ( n 1.0) p ( t W ) = + p ( t D) = D D Term requency Weghtng Model (REQ) (5) In ths approach, we smply weght the wndow by summng the total occurrences of all query terms wthn the wndow. Ths approach can be gven a language modellng nterpretaton, whch wll be useful n our later analyss. Usng the query generaton approach, we compute the wndow RSV as the probablty of generatng any term of the query as follows: Ρ( any Q term W ) = p( t W t Q ) (6) Ths s equvalent to summng all the term occurrences n a wndow over the query terms. 3 Expermental Study 3.1 Research Questons In ths study we nvestgate three man research questons, namely: Is there an optmal wndow sze for the sldng wndow, and s ths optmal sze dependent on the partcular weghtng functon used? What s the relatve effectveness of the three weghtng functons we have descrbed? Could relevance proflng be used as an ad n generatng a book ndex, through ntellectual effort by a human, or ndeed as a complete replacement for the book ndex? In respect of wndow sze, we conjecture that smaller wndow szes wll tend to be precsonenhancng as they are less lkely to dscover unrelated term occurrences, and that large wndow szes wll tend to be recall enhancng. We nvestgate a range of wndow szes from a few sentences n length (50 words) up to 2-3 paragraphs or half a page (250 words). or the weghtng functons, we conjecture that the smple term frequency weghtng (REQ) wll perform worst, gven that t explots less of the avalable nformaton than the other functons. Both the query generaton approach and the Kullback-Lebler approach have proved effectve n conventonal document retreval. However, n ths applcaton, we beleve that comparatvely small sze of sample text (namely the samples of wndow text) s lkely to be a major lmtng factor, and specfcally n estmatng probabltes. The smpler query generaton model may prove more effectve due to the smoothng provded by the mxture approach. 3.2 Corpus We wsh to nvestgate relevance proflng for wthndocument retreval, and especally for long documents. And, n order to measure relatve effectveness, we need to establsh a so-called ground truth for the experments. We used van Rjsbergen s classc textbook, Informaton Retreval [9], whch s arguably long (191 pages, words). It s avalable n an electronc format, and t possesses a comprehensve book ndex. We had used ths corpus successfully n prevous end-user experments [4] [5]. Here, we use the corpus n a smlar way to a conventonal test collecton: the pages are the unts of retreval, the queres are provded by the book ndex entres, and the relevance assessments are the relevant pages assocated wth each ndex entry. We note that the book ndex was created by the book s author, and t s possble that the ndexng s not exhaustve. Consequently, the absolute performance of the technques we nvestgate s lkely to be hgher than we report. However, the man purpose of the experments s to compare technques, and n ths respect all are evaluated aganst the same ground truth. Table 1 summarses the man characterstc of the corpus. In the man, we conducted the experments usng the set of mult-term queres as all the weghtng func-

4 tons are monotonc wth respect to sngle term queres. We note that a hgh proporton of the mult-word queres are n fact phrasal queres rather than sets of words. Number of pages: 191 Number of words: Index Entry Type No. of Entres b (queres) Total relevant pages sngle word mult-word Average #rels/entry Table 1: Detals of corpus: Informaton Retreval textbook by van Rjsbergen [9] 3.3 Experment Desgn The experments were conducted as follows. Each query s used to generate a relevance profle for the document. Then, we smply rank the pages accordng to the retreval status value, and evaluate the rankng based on the relevance judgements provded by the book ndex. In fact, ths s not an deal way of evaluatng relevance proflng, whch s desgned to dentfy relevant groups (or clumps) of pages, and not smply ndvdual pages. It would have been better to assess the effectveness n dentfyng the same groups of pages dentfed n the book ndex. Gven the avalablty of measures and tools for evaluatng ranked output, we opted to do page rankng. 3.4 Measures We used a range of sngle-valued measures averaged over the set of queres to assess effectveness, namely: non-nterpolated precson, R-precson (R-prec.), and three varatons of the -measure. R-precson s the precson acheved at rank R, where R s the number of relevant documents for a gven query. Ths measure combnes elements of both precson, proporton of retreved R documents that are relevant, and recall, proporton of R relevant documents retreved. The - measure s smply the complement to the E-measure [9], and s gven by: ( αr + ( α ) P) = R P 1 (7) where R c s recall, P s precson, and α enables one to bas the measure n favour of precson (α -> 1) or recall (α ->0). We use three varants wth α set to 0.8 b There were a number of entres that we not used n the experments, namely the see also entres. c Ths s a dfferent use of R and we trust the use s clear from the context. (precson-orented), 0.5 (balanced harmonc mean of R and P), and 0.2 (recall-orented). To compute -values for each query, we compute precson and recall at each dfferent score n the rankng (note, not at each rank poston), and then compute the optmal values -values for the dfferent settngs. Statstcal sgnfcance testng was performed usng the sgn test, and p-values less than 0.05 were consdered sgnfcant. 4 Experments We nvestgate each research queston n turn, and present and dscuss the results. Gven the possble nteracton between wndow sze and weghtng functon, we decded to smply explore the wndow sze ntally wth the query generaton model (GEN), and we later confrmed the valdty of the fndngs for the other two weghtng functons. 4.1 Wndow Sze Experments In the frst set of experments, we nvestgated effect of the sze of the sldng wndow on the performance achevable wth the query generaton model (GEN). The sze of the wndow was vared from 50 words through 200 words n steps of 25, and for 250 words. The experment was run wth two sets of queres, those for whch the queres contaned more than one word (MULTI), and those comprsng a sngle word (SINGLE). In the context of document skmmng, we thnk t lkely that mult-word queres wll be more usual that sngle word queres. However, we were also nterested n the performance achevable wth sngle word queres, as they form a sgnfcant proporton of entres n our book ndex. If ProfleSkm were used as a replacement for the book ndex then t would have to provde good performance for ths type of query. We present the wndow sze results n Tables 2 and 3, where the bold fgures are the best values acheved for each measure, and the talcsed fgures are values close to the best. The results for the mult-term queres (Table 2) show that for -measures, the results are broadly comparable over a wde range of wndow szes from 50 through 250. Interestngly, the recall-orented (α=0.2) results are better than the other results, ndcatng that relevance proflng may be a recallorented devce. or the (averaged) non-nterpolated precson and for R-Precson, the results are better for the smaller wndow szes (50, 75), and sgnfcantly so compared wth the results for wndow szes 100 and up.

5 Wndow sze Prec. R- prec. (α=0.8) (α=0.5) (α=0.2) Table 2: Averaged effectveness of query generaton weghtng (GEN) for 232 mult-term queres for dfferent wndow szes The performance for the sngle term queres (Table 3) s reduced by 5-10% compared wth the mult-term queres, except for the R-Precson measure. Interestngly, for all measures, the maxmum effectveness s acheved at a wndow sze of 200. Wndow sze Prec. R- prec. (α=0.8) (α=0.5) (α=0.2) Table 3: Averaged effectveness of query generaton weghtng (GEN) for 46 sngle term queres for dfferent wndow szes The results of the wndow sze experments demonstrate the robust nature of the relevance proflng process, n that hgh and broadly comparable levels of performance are acheved across a range of wndow szes. However, statstcally speakng, small wndow szes (50, 75) yeld sgnfcantly better results for nonnterpolated precson and R-precson, for mult-term queres. Nevertheless, t s clear that relevance proflng works well for wndows of a few sentences n length up to half a page. or sngle term queres, the best effectveness s acheved for larger wndow szes (200, 250). A possble explanaton for ths s that, n order to establsh relevance for a short query, we may requre a larger sample of text Contrary to expectatons, small wndow szes dd not seem to be precson-enhancng, nor large wndow szes recall-enhancng. If one looks n detal at the relevance proflng process (gure 1), one can observe both precson- and recall-enhancng devces. By nsstng that pages can only acheve a score f at least one query term s actually present n the page, we are effectvely boostng precson. On the other hand, the fact that any wndow begnnng n a page can contrbute an RSV to that page s a recall-enhancng devce. We would argue that these devces operate to mtgate the effects of wndow sze, and partcularly the possble deleterous effect of large wndow szes on precson. 4.2 Weghtng uncton Experments In ths second set of experments, we nvestgate the comparatve effectveness of the three weghtng functons descrbed n secton 2. In Table 4, we report the best levels of effectveness acheved for each weghtng functon, where the wndow sze s chosen optmally for each functon (and n some cases measure). Standard uncton GEN KL REQ Wndow Sze Average Precson Average R-precson (α=0.8) (α=0.5) (α=0.2) Table 4: Averaged effectveness of three weghtng functons for 232 mult-term queres (usng optmal wndow settng for each functon) Wth coordnaton flterng uncton GEN KL REQ Wndow Sze (unless stated otherwse) Average Precson Average R-precson (75) (α=0.8) (100) (α=0.5) (100) (α=0.2) (200) (200) Table 5: Averaged effectveness of three weghtng functons wth co-ordnaton level flterng for 232 mult-term queres (usng optmal wndow settng for each functon)

6 Let us consder the standard results frst. Clearly, the performance of the query generaton model (GEN) exceeds that of the other two weghtng functons by a large and statstcally sgnfcant margn. It s approxmately 10%-15% better than the Kullback-Lebler model, and above 20% better than the term frequency weghtng model. Although KL weghtng outperforms REQ, t s not sgnfcantly better. It would appear that the ostensbly smple query generaton model s better than the arguably more nformaton-rch KL model, and we surmsed earler the problem of estmatng parameters from small samples may be mportant n relevance proflng. If we look at the ndvdual term weghts n the KL model (eqn 4), t s clear that wthn the log term, p(t D)<<p(t W), and thus the contrbuton of p(t D) domnates the expresson. Gven the nature of a book ndex, we beleve that many terms n the ndex wll be of smlar frequency, and thus the P(t D) wll be both small and relatvely constant for query terms. Consequently, the term outsde the log, namely p(t W) wll provde the overall shape of the weghtng functon, and ths s the same bass for the term frequency weghtng (eqn 6). Hence, the performance of KL and REQ should be, and ndeed are, broadly smlar. How then do we explan why the performance of GEN s so much better than the other two weghtng functons? We thnk t s the treatment of the nonoccurrence of query terms n a wndow that s the dstngushng hallmark of GEN. In GEN, f a query term does not appear n the wndow, then the weghtng functon penalses ths harshly. Consder that n the probablty mxture (eqn 1), the contrbuton from the document model wll always be very small. Whereas, n the other two functons the weghtng s more or less lnear n the number of term occurrences. To test ths conjecture, we ran a thrd set of experments, n whch we added an addtonal flter, whch we dubbed the coordnaton level flter. In these experments we nsst that all query terms are present n a wndow (or coordnated as t used to be known), pror to a retreval status value beng calculated for that wndow. The results of ths thrd set of experments are presented n Table 5. In some cases, there was a dfferent optmal wndow sze for some measures, and ths s ncluded n brackets wth the measure value. We expected that co-ordnaton flterng wll reduce the number of pages retreved, and hence lkely to depress recall, and possbly boost precson. or GEN, performance was reduced by about 5-10% for all measures. However, the co-ordnaton flterng boosted the performance of both KL and REQ to comparable levels to the fltered GEN performance. Ths provdes strong evdence that t s the treatment of the nonoccurrence of query terms whch s the key to the performance of the unfltered (or standard) query generaton model. It may be that the performance of the KL model could be mproved by expermentng wth the parameter estmaton rules. Interestngly, ncorporatng co-ordnaton flterng, results n generally larger optmal wndow szes for GEN and to some extent KL. 4.3 Relevance Proflng for Book Indexng Here, we draw together the evdence that relevance proflng mght be used as an ad n generatng a book ndex, namely through ntellectual effort by a human, or ndeed as a complete replacement for the book ndex. rst, the absolute effectveness acheved when usng the query generaton model was very hgh, for both the mult-term and sngle term sets of queres. or the mult-term queres (Table 3), t acheved optmal -values of (, α=0.5), whch effectvely means that at ths optmum pont, on average both Precson and Recall acheved levels around 70%. Moreover, the R-precson value of 0.58 s qute hgh, and means f a user nspects down to the rank correspondng the exact number of relevant pages for each query, then on average 58% of these relevant pages wll be retreved. Hgh recall s clearly mportant n the context of book ndexng. We also computed the number of queres that acheved recall of 100% at a cutoff of 20 retreved pages. or GEN (wndow sze = 75), 175 of the 232 mult-term queres acheved maxmum recall, and only 9 of the 232 queres faled to retreve any relevant documents. urther, GEN (wndow sze = 75) retreves 587 of the possble 766 relevant pages for the mult-term queres, and 100 of the possble 144 relevant pages for the sngle term queres. It seems reasonable to conclude that a tool such as ProfleSkm, whch mplements relevance proflng, would prove a useful ad to a human ndexer. Indeed, gven a set of ndex entres, relevance proflng mght even replace the tradtonal book ndex. It s worth notng that careful nspecton of the exstng book ndex showed conclusvely that the ndex s not completely exhaustve. In partcular, there s nconsstent treatment of page ranges, where frequently not all relevant pages n a range are ncluded n the ndex. Relevance proflng was able to relably dentfy many of these undentfed relevant pages. 5 Conclusons and uture Work In ths paper we have reported a seres of bench experments of relevance proflng. The experments were conducted wth an electronc book and ts assocated book ndex. We nvestgated three alternatve

7 weghtng functons, and varous parameters of the relevance proflng process, most notably wndow sze. The major fndngs of the experments are: - relevance proflng s a robust process yeldng acceptably hgh levels of performance over a range of wndow szes from words n length; - the query generaton model (GEN) s sgnfcantly more effectve than the Kullback-Lebler (KL) and the term frequency (REQ) models; - a key factor n effectveness of query generaton model s ts treatment of non-occurrences of query terms n the sldng wndow, whch we attrbute to the smoothng provded by the probablty mxture model; and - the absolute performance of relevance proflng ndcates that t mght be usefully employed as an ad to human book ndexng, and ndeed may provde a vable alternatve to the book ndex (provdng that a good set of ndex entres can be generated). We would emphasse that these experments have been conducted wth a sngle book, and assocated ndex, and t would be hghly desrable to repeat the experments wth a number of such books. Nevertheless, we beleve that the general conclusons would lkely be confrmed by such experments, albet the absolute levels of effectveness may dffer, dependng on the exhaustvty of the (human) ndexng. The results reported here suggest a number of possble avenues for future work. rst, gven the evdent, and ndeed wdely recognsed, mportance of parameter estmaton and smoothng n language modellng applcatons, and gven the specalsed nature of relevance proflng, further experments should be conducted on both GEN and KL usng dfferent parameter estmaton rules or smoothng approaches. Second, all the weghtng functons we nvestgated assumed that query terms are dstrbuted ndependently. Gven that, n general, ths s unlkely to be the case, and especally for phrasal queres, t would be desrable to nvestgate models based on term dependence. However, we would suggest only frst-order models be nvestgated gven the paucty of sample data. nally, t would be nterestng to explore how to generate useful queres automatcally, where by useful we mean the query dentfes coherent chucks of relevant text. We note that such queres would not necessarly be just phrasal. Ths would pave the way for both automatng the process of book ndexng, and perhaps more tantalzngly, enablng query-less retreval wthn documents. Acknowledgements Ths work was done wthn the Smart Web Technologes Centre that was establshed through a Scottsh Hgher Educaton undng Councl (SHEC) Research Development Grant (No. HR01007). Davd Lee was supported by a Summer Studentshp provded by The Robert Gordon Unversty. We thank the followng colleagues for ther feedback on ths work: Ian Pre, Ivan Koychev, Bcheng Lu, Ralf Berg and Murat Yakc. I also thank Margery Murray for her nvaluable assstance n preparng the manuscrpt. nally, we would lke to thank the anonymous referees for ther nsghtful comments on the submtted paper. 6 References [1] Kaszkel, M. and Zobel, J.: Passage Retreval Revsted. In: Proceedngs of the Twenteth Internatonal ACM-SIGIR Conference on Research and Development n Informaton Retreval, Phladelpha, July ACM Press, , [2] Hearst, M. A.: TleBars: vsualzaton of term dstrbuton nformaton n full text nformaton access. Proc. CHI'95, 56-66, [3] Harper, D. J., Coulthard, S., and Sun, Y. A language modellng approach to relevance proflng for document browsng. In Proceedngs of the Jont Conference on Dgtal Lbrares, pages 76-83, Oregon, USA, July [4] Harper, D.J., Koychev, I. and Sun, Y. Query-based document skmmng: A user-centred evaluaton of relevance proflng. In: Proceedngs of 25 th European Conference on Informaton Retreval. Lecture Notes n Computer Scence, Sprnger-Verlag, Berln, pp , [5] Harper, D.J., Koychev, I., Sun, Y. and Pre, I. Wthn-Document Retreval: A User-Centred Evaluaton of Relevance Proflng. In: Informaton Retreval, Kluwer Academc Publshers, The Netherlands, 7, , [6] Ponte, J. and Croft, W. B.: A language modelng approach to nformaton retreval. In: Proceedngs of the 1998 ACM SIGIR Conference on Research and Development n Informaton Retreval, , [7] Song,. and Croft, W.B.: A general language model for nformaton retreval. In: Proceedngs of the 1999 ACM SIGIR Conference on Research and Development n Informaton Retreval, , 1999 [8] Croft, W.B. and Lafferty, J. (Eds.): Language modelng for nformaton retreval. Kluwer Academc Publshers, The Netherlands, [9] Van Rjsbergen, K. Informaton Retreval. Butterworths, London, 1979.

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Informaton Search and Management Probablstc Retreval Models Prof. Chrs Clfton 7 September 2018 Materal adapted from course created by Dr. Luo S, now leadng Albaba research group 14 Why probabltes