ENTROPIC QUESTIONING


NACHUM

Date: November 2005. These ideas were first conceived in Spring 2002.

1. Introduction

Goal. Pick the question that contributes most to finding a suitable product.

Idea. Use an information-theoretic measure.

Basics. Entropy (a non-negative real number) measures the amount of uncertainty inherent in a distribution: high entropy corresponds to large uncertainty; zero entropy corresponds to perfect knowledge. Let P = {p_1, ..., p_n} be a discrete probability distribution (set of probabilities), corresponding to events {c_1, ..., c_n}. The entropy of P is

    H_P = -\sum_i p_i \log p_i .

By definition, set 0 log 0 = 0. We will use binary logarithms, lg (= log_2), instead of the usual natural logarithms, ln, the difference being only a multiplicative constant.

Example. For n = 4, we have:

    P                       H
    (1/4, 1/4, 1/4, 1/4)    2      uniform
    (1/2, 1/4, 1/4, 0)      1.5    one eliminated
    (1/2, 1/2, 0, 0)        1      two eliminated
    (3/4, 1/4, 0, 0)        0.81   skewed
    (0, 1, 0, 0)            0      certainty
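A minimal sketch (in Python; not part of the original note) of the entropy computation, reproducing the table above:

    import math

    def entropy(p):
        """Binary entropy H_P = -sum p_i lg p_i, with 0 lg 0 = 0."""
        return -sum(x * math.log2(x) for x in p if x > 0)

    examples = [
        ((1/4, 1/4, 1/4, 1/4), "uniform"),
        ((1/2, 1/4, 1/4, 0),   "one eliminated"),
        ((1/2, 1/2, 0, 0),     "two eliminated"),
        ((3/4, 1/4, 0, 0),     "skewed"),
        ((0, 1, 0, 0),         "certainty"),
    ]
    for dist, label in examples:
        print(f"{label:15s} H = {entropy(dist):.2f}")  # 2.00, 1.50, 1.00, 0.81, 0.00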

2. Method

Theory. Let events c_i be "successes" (item i is the right one). The (binary) entropy of the system is

    H = -\sum_i \Pr(c_i) \lg \Pr(c_i) .

After asking a question (attribute) A and getting an answer d ∈ A, the entropy becomes

    H_d = -\sum_i \Pr(c_i | d) \lg \Pr(c_i | d) ,

where Pr(c_i | d) is the posterior probability of c_i, given answer d. Thus, the expected updated entropy after a question with answers d ∈ A is

    H̄_A = \sum_d \Pr(d) H_d ,

where Pr(d) is the prior probability that answer d will be given. Thus, H̄_A measures the expected uncertainty after receiving an answer from A. The lower H̄_A is, the more information obtained by asking the question.

Example. Consider questions with the following answer distributions, where each row gives, for an item, the probability of each answer:

    A   Y   N        B   Y    N        C   Y    N
    a   1   0        a   1/2  1/2      a   1    0
    b   1   0        b   1/2  1/2      c   1/2  1/2
    c   0   1        c   1/2  1/2      d   1/2  1/2
    d   0   1        d   1/2  1/2

Suppose that initially four items {a, b, c, d} are equiprobable, with entropy 2. Question A gives a system with entropy 1, for either answer. Question B is meaningless, and leaves the entropy as is, regardless of the answer. Question C reduces the entropy to 1.5 if the answer is Y, and to 1.38 for N, for a weighted average of 1.39. So, clearly, A is best. Having asked A and gotten an answer, both A and B are worthless.

Note. When there are many answers, each H_d will be relatively small, in which case H̄ will also be small, assuming that the question is meaningful.

Parameters.

    S                                           Set of (relevant) items
    n = |S|                                     Number of (relevant) items
    p_i = Pr(c_i)                               A priori likelihood (normalized rank) that item i ∈ S will be chosen
    q_{id} = Pr(d | c_i)                        Probability of choice (answer) d for desired item i
    r_d = Pr(d) = \sum_i p_i q_{id}             Probability of choice d
    p'_{id} = Pr(c_i | d) = p_i q_{id} / r_d    Posterior probability for item i, assuming choice d
    H_d = -\sum_i p'_{id} \lg p'_{id}           Entropy after choosing d
    H̄ = \sum_d r_d H_d                          Expected entropy (after answering given question)
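A short sketch (in Python; illustrative names, not from the original note) of the expected-entropy computation H̄_A = \sum_d Pr(d) H_d, reproducing the values for questions A and B above:

    import math

    def entropy(p):
        return -sum(x * math.log2(x) for x in p if x > 0)

    def expected_entropy(p, q):
        """p[i]: prior of item i; q[i][d]: Pr(answer d | item i).
        Returns H-bar = sum over answers d of r_d * H_d."""
        n_answers = len(next(iter(q.values())))
        h_bar = 0.0
        for d in range(n_answers):
            r_d = sum(p[i] * q[i][d] for i in q)               # Pr(answer d)
            if r_d > 0:
                posterior = [p[i] * q[i][d] / r_d for i in q]  # p'_{id}
                h_bar += r_d * entropy(posterior)
        return h_bar

    p = {"a": 1/4, "b": 1/4, "c": 1/4, "d": 1/4}   # equiprobable items
    A = {"a": (1, 0), "b": (1, 0), "c": (0, 1), "d": (0, 1)}
    B = {i: (1/2, 1/2) for i in p}
    print(expected_entropy(p, A))   # 1.0 -- question A
    print(expected_entropy(p, B))   # 2.0 -- question B is useless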

3. Application

Method. Given the set S and the probabilities p and q, choose the question with the smallest H̄ value. Update each p_i to p'_{id}, where d is the answer obtained. Additionally, one may want to ask no question when none makes a significant improvement in entropy.

Background. The use of entropy calculations to design decision trees was suggested by Ross Quinlan in his famous ID3 algorithm [3].

Assumptions. All products in S with non-zero probability appear under at least one choice ("other", if necessary). If not, then some large cost should be assigned them. Each leaf corresponds to a single product, or to any of a number of indistinguishable products (all of which will satisfy the customer equally).

Facts.

    \sum_i p_i = 1 ,    \sum_d q_{id} = 1 ,    \sum_d r_d = 1 .

Simplifications.

(1) The above entropy formula for a particular question is

    H̄ = \sum_d r_d H_d
       = -\sum_d r_d \sum_i (p_i q_{id} / r_d) \lg (p_i q_{id} / r_d)
       = -\sum_{i,d} p_i q_{id} \lg (p_i q_{id} / r_d)
       = -\sum_i p_i \lg p_i - \sum_{i,d} p_i q_{id} \lg q_{id} + \sum_d r_d \lg r_d
       = \sum_d (\sum_i p_i q_{id}) \lg (\sum_i p_i q_{id}) - \sum_i p_i \lg p_i - \sum_{i,d} p_i q_{id} \lg q_{id} ,

using \sum_d q_{id} = 1 and r_d = \sum_i p_i q_{id}.

(2) Ignoring the middle part, which is the pre-question entropy and is independent of the (parameters of the) question, we are left with

    \sum_d (\sum_i p_i q_{id}) \lg (\sum_i p_i q_{id}) - \sum_{i,d} p_i q_{id} \lg q_{id} ,

representing the question's expected improvement to entropy.

(3) We can use any base for the logarithm.

(4) Pre-compute the following for each item i and each question:

    t_i = \sum_d q_{id} \lg q_{id} .

(5) The formula is now equivalent to

    \sum_d (\sum_i p_i q_{id}) \lg (\sum_i p_i q_{id}) - \sum_i p_i t_i .

(6) One can save the p_i q_{id} to reuse as the updated ranks, after picking a question and getting answer d.

(7) The first half of this formula can be computed in one pass, by (selecting and) computing \sum_i p_i q_{id}, for each d. (A code sketch of this computation follows the example below.)

4. Example

[Worked table: for each item it lists the prior p_i, the precomputed t_i, the term p_i lg p_i, and, for each answer d ∈ {0, 1}, the quantities q_{id}, p'_{id}, and p'_{id} lg p'_{id}, together with the column sums Σ and the resulting entropies H.]

So the entropy before the question is 1.75 and the expected value after is 1.5.
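A minimal sketch (in Python; function names are illustrative, not from the original) of simplifications (4)-(7): t_i is precomputed once per item and question, and each question is then scored in a single pass. The score differs from H̄ only by the fixed pre-question entropy, so the question with the smallest score is also the one with the smallest H̄:

    import math

    def precompute_t(q):
        """t_i = sum_d q_{id} lg q_{id}, once per (item, question)."""
        return {i: sum(x * math.log2(x) for x in q[i] if x > 0) for i in q}

    def score(p, q, t):
        """sum_d (sum_i p_i q_{id}) lg (sum_i p_i q_{id}) - sum_i p_i t_i."""
        n_answers = len(next(iter(q.values())))
        first = 0.0
        for d in range(n_answers):
            r_d = sum(p[i] * q[i][d] for i in q)   # one pass per answer
            if r_d > 0:
                first += r_d * math.log2(r_d)
        return first - sum(p[i] * t[i] for i in q)

    # Question A from the example in Section 2:
    p = {"a": 1/4, "b": 1/4, "c": 1/4, "d": 1/4}
    A = {"a": (1, 0), "b": (1, 0), "c": (0, 1), "d": (0, 1)}
    print(score(p, A, precompute_t(A)))   # -1.0: entropy expected to drop by one bit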

5. Excursus

Path Length. Suppose the n leaves of a tree T are assigned probabilities p_i. The weighted path length w(T) of T is the weighted average of the lengths l_i of the paths from the root to the leaves of T:

    w(T) = \sum_i p_i l_i .

Note. It is known [1] that for all probability distributions P there is a T such that

    H(P) ≤ w(T) ≤ H(P) + 1 .

So for large trees the entropy is a good estimate of the average path in the best possible tree. (The lower bound follows from the nature of entropy.) The Huffman code algorithm [2] (see below) is optimal. Some other connections between path length and entropy appear in [4, 5].

Theorem. Assume that all probabilities in P are powers 2^{-k} of 2. Then, one can achieve w(T) = H(P).

Proof. Use Huffman's algorithm:

(1) Each value in P is a leaf node.
(2) Replace the two nodes with the smallest values with a node containing their sum, and with the original nodes as its children.
(3) Repeat the previous step until only one node is left.

For the probabilities to add up to 1, the two smallest must be equal; so adding them gives a new power of 2. (Just think of their binary representation.) By induction, every node 2^{-k} will end up on level k. The claim follows.

Theorem. One can recursively partition a discrete distribution into two sets of equal total probability, if and only if the probabilities are all powers 2^{-k}. Such a recursive partitioning corresponds to a perfectly balanced binary tree, in which the two branches at every node have equal weight. (The Shannon-Fano algorithm works top-down in this way, rather than bottom-up like Huffman.)

Proof. Suppose we are given a set P of n such powers. If n = 2, then P = {1/2, 1/2}, which can be split as desired. Suppose n > 2. Replace two equal ones with their sum, which is also of the desired form. By induction, the new, smaller set can be split evenly, after which the two combined probabilities can be decomposed.

Conversely, suppose P = {p_1, ..., p_n} can be split as described into P_1, P_2, each weighing 1/2. Each of the P_i (i = 1, 2) can be split recursively, so if we double each probability in P_i, the latter can also be split. Since |P_1|, |P_2| < |P|, by induction the doubled probabilities are of the desired form. But then so are the non-doubled ones.

Moral. Working under the assumption that there always is some question that splits the relevant set into two halves, the expected length of a path to the chosen (clicked) product for a given query is given by the entropy.

References

[1] Gilbert, E. N. and Moore, E. F. Variable-length binary encodings. Bell Sys. Tech. J. 38 (July 1959), pp. 933-967.
[2] Huffman, D. A. A method for the construction of minimum-redundancy codes. Proceedings of the I.R.E., Sept. 1952, pp. 1098-1101. URL= huff/huffman_1952_minimum-redundancy-codes.pdf.
[3] Quinlan, J. R. Discovering rules from large collections of examples: a case study. In D. Michie, editor, Expert Systems in the Micro-electronic Age. Edinburgh University Press, Edinburgh, 1979.
[4] Rissanen, J. Bounds for weight balanced trees. IBM J. of Research and Development 17(2) (Mar. 1973), pp. 101-105.
[5] Wong, C. K. and Nievergelt, J. Upper bounds for the total path length of binary trees. J. ACM 20(1) (Jan. 1973), pp. 1-6.
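To make the Huffman construction in the first theorem above concrete, here is a small sketch (in Python; added for illustration) that builds the tree bottom-up on a dyadic distribution and checks that w(T) = H(P):

    import heapq, math

    def huffman_depths(probs):
        """Huffman's algorithm: repeatedly merge the two smallest nodes.
        Returns the leaf depth l_i for each probability."""
        # Heap entries: (weight, unique id, [(leaf index, depth), ...])
        heap = [(p, i, [(i, 0)]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        next_id = len(probs)
        while len(heap) > 1:
            w1, _, l1 = heapq.heappop(heap)
            w2, _, l2 = heapq.heappop(heap)
            merged = [(i, d + 1) for i, d in l1 + l2]  # children sink one level
            heapq.heappush(heap, (w1 + w2, next_id, merged))
            next_id += 1
        depths = [0] * len(probs)
        for i, d in heap[0][2]:
            depths[i] = d
        return depths

    P = [1/2, 1/4, 1/8, 1/8]                 # dyadic: each p_i = 2^{-k}
    l = huffman_depths(P)
    w = sum(p * d for p, d in zip(P, l))     # weighted path length w(T)
    H = -sum(p * math.log2(p) for p in P)    # entropy H(P)
    print(l, w, H)                           # [1, 2, 3, 3] 1.75 1.75

As the theorem predicts, each probability 2^{-k} ends up at depth k, so w(T) = H(P) exactly.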
