DØNote 4492

Calculating CLs Limits

Harrison B. Prosper
Florida State University, Tallahassee, Florida 32306
(Dated: June 8, 2004)

Abstract

This note suggests how the calculation of limits based on the CLs method might be performed more efficiently.
I. INTRODUCTION

The CLs method [1], which has been integrated into the Single Top Analysis Framework by Brigitte Vachon, was developed by LEP physicists in the context of the Higgs search. The method combines frequentist and Bayesian elements in a procedure judged pragmatic and successful by its proponents but conceptually problematic by critics, of which I am one. However, my aim here is not to smash the china but to offer constructive suggestions about how the CLs limits calculation might be made more efficient. But let me first address a question whose answer, at first, seems obvious.

A. To bin or not to bin, that is the question

Consider data arising from two sources, signal and background, characterized by the densities $s(x)$ and $b(x)$, respectively, where $x$ is a quantity that discriminates between the two sources. For example, $x$ could be the output of a neural network. Binning data always entails a loss of information. Therefore, the obvious answer to the question is that, in principle, data should not be binned. The likelihood for unbinned data is proportional to a product of factors $s(x) + b(x)$, with one term per observed event.

But here is the rub: the modeling of $s(x)$ and $b(x)$ must be good enough that the uncertainty in the likelihood, due to modeling uncertainties in these densities, is negligible compared with other uncertainties in the problem. However, if there are too many terms in the product, the uncertainty in the form of the likelihood could become significant. The upshot is that only a careful analysis design study can provide a rational answer to the question whether or not to bin.

There is another reason why the answer to that question is not necessarily the obvious one: the distinction between unbinned and binned data is merely one of degree because each experimental datum is, of necessity, represented by a finite precision number. There is no such thing as continuous experimental data; such a thing exists only as a convenient mathematical abstraction. In the real world, unbinned data are nothing more than data
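that have been binned into a large number of bins of very small width. This observation provides a way to see the connection between binned and unbinned likelihoods, which I now sketch. Consider the binned likelihood

$$L \propto \prod_{i=1}^{K} \lambda_i^{k_i} \exp(-\lambda_i) / k_i!, \tag{I.1}$$

where $K$ is the number of bins and $k_i$ and $\lambda_i$ are the count and the mean count, respectively, in bin $i$. The mean count is given by

$$\lambda_i = \int_{\text{bin } i} [s(z) + b(z)] \, dz. \tag{I.2}$$

Consider now the limit of Eqs. (I.1) and (I.2) as the bin size $\Delta x$ goes to zero. In that limit the probability to get more than one count in a bin becomes negligible relative to that for $k_i = 0$ or 1 and, therefore, only those terms survive. The factors $\lambda_i^{k_i}/k_i!$ with $k_i = 0$ collapse to unity, and we are left with only the $k_i = 1$ factors, multiplied by the exponentials from all $K$ bins:

$$
\begin{aligned}
L &\propto \exp\Big(-\sum_{i=1}^{K} \lambda_i\Big) \prod_{j=1}^{N} \lambda_j, \\
  &= \exp\Big(-\sum_{i=1}^{K} [s(x_i) + b(x_i)] \, \Delta x_i\Big) \prod_{j=1}^{N} [s(x_j) + b(x_j)] \, \Delta x_j, \\
  &\to \exp\Big(-\int [s(x) + b(x)] \, dx\Big) \prod_{j=1}^{N} [s(x_j) + b(x_j)],
\end{aligned}
\tag{I.3}
$$

where $N$ is the number of bins with $k_i = 1$ and the index $j$ runs over those bins. In the last step, we have taken the limit $K \to \infty$ and $\Delta x \to 0$, and the factors $\Delta x_j$, which do not depend on the model, have been absorbed into the constant of proportionality.

This heuristic argument shows that an unbinned likelihood is merely a binned likelihood with sufficiently small bins, which suggests the following strategy: always use a binned likelihood, but choose the bin size and number of bins so as to maximize the desired optimality criterion.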
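The limit in Eq. (I.3) can be checked numerically. The following is a minimal sketch, not part of the original note: the linear densities on $[0, 1]$, the toy data set, and all function names are assumptions made purely for illustration. Because the constant factors $\Delta x_j$ and $k_i!$ drop out of likelihood ratios, the sketch compares the log-likelihood ratio between two signal strengths, computed with the binned form of Eq. (I.1) and with the unbinned form in the last line of Eq. (I.3).

```python
import math
import random

def density(x, sig, bkg):
    """Hypothetical s(x) + b(x) on [0, 1]: s = sig * 2x, b = bkg * 2(1 - x)."""
    return sig * 2.0 * x + bkg * 2.0 * (1.0 - x)

def mean_count(lo, hi, sig, bkg):
    """Eq. (I.2): exact integral of s(z) + b(z) over the bin [lo, hi]."""
    return sig * (hi**2 - lo**2) + bkg * ((2 * hi - hi**2) - (2 * lo - lo**2))

def binned_loglike(data, nbins, sig, bkg):
    """Log of Eq. (I.1), omitting the model-independent -ln(k_i!) terms."""
    counts = [0] * nbins
    for x in data:
        counts[min(int(x * nbins), nbins - 1)] += 1
    total = 0.0
    for i, k in enumerate(counts):
        lam = mean_count(i / nbins, (i + 1) / nbins, sig, bkg)
        total += k * math.log(lam) - lam
    return total

def unbinned_loglike(data, sig, bkg):
    """Log of the last line of Eq. (I.3); here the integral of s + b is sig + bkg."""
    return -(sig + bkg) + sum(math.log(density(x, sig, bkg)) for x in data)

def loglike_ratio_gap(data, nbins, sig1=10.0, sig2=5.0, bkg=20.0):
    """|binned - unbinned| difference in ln L(sig1) - ln L(sig2)."""
    binned = binned_loglike(data, nbins, sig1, bkg) - binned_loglike(data, nbins, sig2, bkg)
    unbinned = unbinned_loglike(data, sig1, bkg) - unbinned_loglike(data, sig2, bkg)
    return abs(binned - unbinned)

# Toy data: 20 arbitrary points in [0, 1); they need not follow the densities.
_rng = random.Random(1)
TOY_DATA = [_rng.random() for _ in range(20)]

if __name__ == "__main__":
    for nbins in (10, 100, 1000, 10000):
        print(nbins, loglike_ratio_gap(TOY_DATA, nbins))
```

Running the script prints the gap between the two log-likelihood ratios for increasingly fine binning; for these toy densities the gap is bounded by a quantity proportional to the bin width, so it shrinks toward zero as the number of bins grows.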
B. The CLs statistic

Limits calculated with the CLs method are based on the sampling distribution of the logarithm of the quantity

$$Q = \prod_{i=1}^{M} \frac{\exp[-(s_i + b_i)] \, (s_i + b_i)^{n_i}}{\exp(-b_i) \, b_i^{n_i}} = \prod_{i=1}^{M} \exp(-s_i) \left( \frac{s_i + b_i}{b_i} \right)^{n_i}, \tag{I.4}$$

under two different hypotheses: the signal (plus background) hypothesis $S$ and the background hypothesis $B$. The product is over $i = 1 \ldots M$ channels [6], where $s_i \equiv a_i \sigma$ is the mean signal count, with $a_i$ the acceptance times integrated luminosity, and $b_i = \sum_{j=1}^{N} y_{ij}$ is the mean background count, which in general is a sum over $j = 1 \ldots N$ mean yields $y_{ij}$, one for each background source $j$ in each channel. The quantity $n_i$ is the count in the $i$th channel.

None of the quantities $a_i$ or $y_{ij}$ is known. However, we assume that we have estimates $A_i$ and $Y_{ij}$ of them together with a covariance matrix $\mathrm{Cov}(v)$ that characterizes the uncertainty in our knowledge of the vector of parameters $v = (y_{11}, y_{12}, \ldots, y_{1N}, a_1, \ldots, y_{M1}, y_{M2}, \ldots, y_{MN}, a_M)$ associated with a known vector of estimates $V = (Y_{11}, Y_{12}, \ldots, Y_{1N}, A_1, \ldots, Y_{M1}, Y_{M2}, \ldots, Y_{MN}, A_M)$.

Given the sampling distribution of $q \equiv \ln Q$, $p(q \mid v, V, S)$, under the signal hypothesis, and the sampling distribution $p(q \mid v, V, B)$, under the background hypothesis, one calculates a CLs limit, at level $\beta$, by solving the following equation, whose right-hand side is a ratio of p-values (that is, tail probabilities),

$$\beta = \frac{\sum_{q \le q_0} p(q \mid v, V, S)}{\sum_{q \le q_0} p(q \mid v, V, B)}, \tag{I.5}$$

for the upper limit on the cross-section $\sigma$, where $q_0$ is the observed value of the statistic $q$. The principal task, therefore, is to compute the sums in Eq. (I.5) quickly and accurately.

II. NUMERICAL METHOD

A fast, elegant method is described in Ref. [2] for computing CLs limits when the statistic $q$ is continuous. That method can be construed as an application of a general method
described some years ago in Ref. [3]. However, the statistic $q$ in Eq. (I.5) is not continuous; it is a weighted sum of integers

$$q \equiv \ln Q = -\sum_{i=1}^{M} s_i + \sum_{i=1}^{M} c_i n_i, \tag{II.1}$$

where

$$c_i \equiv \ln\left(1 + \frac{s_i}{b_i}\right). \tag{II.2}$$

Therefore, the task is to compute sums like

$$C(\sigma, N, H) = \sum_{q \le q_0} p(q \mid v, V, H) = \sum_{n_1} \cdots \sum_{n_M} \prod_{i=1}^{M} \lambda_i^{n_i} \exp(-\lambda_i) / n_i!, \tag{II.3}$$

where $N = (N_1, \ldots, N_M)$ are the observed counts. The sums are over the lattice of points $\{n_i\}$ that satisfy the constraint $q \le q_0$, that is,

$$-\sum_{i=1}^{M} s_i + \sum_{i=1}^{M} c_i n_i \le -\sum_{i=1}^{M} s_i + \sum_{i=1}^{M} c_i N_i, \tag{II.4}$$

or equivalently, $\sum_{i=1}^{M} c_i n_i \le t$, where $t = c \cdot N \equiv \sum_{i=1}^{M} c_i N_i$. Basically, we must sum over all the lattice points that lie between the origin and the plane $t = \sum_{i=1}^{M} c_i x_i$. The Poisson parameter $\lambda_i$ is either $s_i + b_i$ or $b_i$, depending on which of the two hypotheses, $H = S$ or $H = B$, is being considered. For each $n_i$ the minimum value is zero, while the maximum value is given by $[t/c_i]$, that is, the integer part of $t/c_i$, which obtains when all other counts are zero. Such constrained sums can usually be done recursively.

For efficiency, $C(\sigma, N, H)$ should be computed at an appropriate set of points in the cross-section $\sigma$, and an interpolation of $C$ with respect to $\sigma$ should be constructed. Then the upper limit $\sigma_U$, at CLs level $\beta$, can be had by solving

$$\beta = \frac{C(\sigma_U, N, S)}{C(\sigma_U, N, B)}. \tag{II.5}$$
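The recursion and the interpolation described above might be sketched as follows. This is an illustration under stated assumptions, not the Single Top Analysis Framework code: the function names, the default grid in $\sigma$, the small floating-point tolerance, and the choice $\beta = 0.05$ as the target value of the ratio are all hypothetical, and the parameters $a_i$ and $b_i$ are held fixed at given values, so systematic uncertainty is ignored.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability lam^n exp(-lam) / n!, evaluated in log space."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def constrained_sum(lams, cs, t, tol=1e-12):
    """Sum of prod_i Poisson(n_i | lam_i) over lattice points with
    sum_i c_i n_i <= t, as in Eq. (II.3). Requires every c_i > 0."""
    if any(c <= 0.0 for c in cs):
        raise ValueError("all c_i must be positive (needs sigma > 0)")

    def recurse(i, budget):
        # budget is what remains of t after channels 0..i-1 are fixed.
        if i == len(lams):
            return 1.0
        total, n = 0.0, 0
        while cs[i] * n <= budget + tol:
            total += poisson_pmf(n, lams[i]) * recurse(i + 1, budget - cs[i] * n)
            n += 1
        return total

    return recurse(0, t)

def cls_sum(sigma, acc, bkg, observed, hypothesis):
    """C(sigma, N, H) for H = 'S' (signal plus background) or 'B'."""
    s = [a * sigma for a in acc]
    c = [math.log1p(si / bi) for si, bi in zip(s, bkg)]       # Eq. (II.2)
    t = sum(ci * ni for ci, ni in zip(c, observed))           # t = c . N
    lams = [si + bi for si, bi in zip(s, bkg)] if hypothesis == "S" else list(bkg)
    return constrained_sum(lams, c, t)

def cls_ratio(sigma, acc, bkg, observed):
    """Right-hand side of Eq. (II.5)."""
    return cls_sum(sigma, acc, bkg, observed, "S") / cls_sum(sigma, acc, bkg, observed, "B")

def cls_upper_limit(acc, bkg, observed, beta=0.05, grid=None):
    """Scan a grid in sigma and interpolate linearly where the ratio crosses beta."""
    if grid is None:
        grid = [0.05 * k for k in range(1, 201)]  # assumed range; adjust to the problem
    prev_sigma = prev_ratio = None
    for sigma in grid:
        ratio = cls_ratio(sigma, acc, bkg, observed)
        if prev_ratio is not None and prev_ratio > beta >= ratio:
            frac = (prev_ratio - beta) / (prev_ratio - ratio)
            return prev_sigma + frac * (sigma - prev_sigma)
        prev_sigma, prev_ratio = sigma, ratio
    raise ValueError("beta is not bracketed by the grid")

if __name__ == "__main__":
    # One channel, zero observed events: the ratio reduces to exp(-a * sigma).
    print(cls_upper_limit(acc=[1.0], bkg=[3.0], observed=[0]))
```

For a single channel with zero observed events the ratio is $\exp(-a\sigma)$ whatever the background, which gives a simple cross-check of the scan. The cost of the plain recursion grows roughly as the product of the ranges $[t/c_i]$, so for many channels or large counts one would want to evaluate the innermost sum as a cumulative Poisson probability or cache intermediate results.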
III. SYSTEMATIC UNCERTAINTY

In the CLs method, systematic uncertainty is accounted for using the Bayesian procedure of integrating the likelihood weighted by a prior density $\pi(v)$ for the vector of parameters $v$. This is equivalent to replacing $C(\sigma, N, H)$ with

$$\bar{C}(\sigma, N, H) = \int C(\sigma, N, H) \, \pi(v) \, dv. \tag{III.1}$$

In the Single Top Group, the prior density is defined in terms of the known covariance matrix $\mathrm{Cov}(v)$ and known vector of estimates $V$. It is assumed that the prior $\pi(v)$ can be adequately modeled using a multivariate Gaussian. In that case, it is straightforward to generate vectors $v$ from the prior and approximate the integral defining $\bar{C}(\sigma, N, H)$ by the sum

$$\bar{C}(\sigma, N, H) \approx \frac{1}{R} \sum_{i=1}^{R} C_i(\sigma, N, H), \tag{III.2}$$

in which the $i$th term is evaluated at the generated vector $v_i$, and $R$ denotes the number of generated vectors. Note that since $t = c \cdot N$ depends on $v$, through the numbers $c_i$, its value will vary over this sum.

IV. SUMMARY

Careful consideration of what needs to be calculated suggests that it ought to be possible to calculate CLs limits in a reasonable amount of time. The suggestions made here may be helpful in this regard.

So far I have tried to be constructive, but now I cannot resist the temptation to smash a few plates! So here goes. The CLs method melds frequentist and Bayesian elements in a procedure that is neither frequentist nor Bayesian. Therefore, we are warned [1], correctly, not to interpret CLs limits in a frequentist or Bayesian way. CLs limits have frequency properties that we are at liberty to study. But this is beside the point. Any ensemble of anything has frequency properties that can be studied, at least on a computer, including Bayesian limits! The point is this: the CLs confidence level $\beta$ is not defined by its frequency
properties as is the case, by definition, for a frequentist confidence level. I suspect, however, that CLs limits, as is true for all limits however computed, are almost certainly internalized in a Bayesian way because it is quite unnatural to do otherwise. I would hazard a guess that the overwhelming majority of consumers of the statement "the single top production cross section is less than 6 pb at 95% CL" take it to mean "there is a roughly 95% chance that the single top production cross section is less than 6 pb", where 95% simply means that while one is not certain of the truth of this statement, one is sure enough of its truth to be willing to proceed as if it were.

The proponents of CLs point to the method's pragmatically useful properties. Unfortunately, if pragmatic usefulness were the only criterion of acceptability, then one is left with no compelling reason why this method is to be favored over any other pragmatically useful method, of which there are several. All things being equal, I'm inclined to favor a method that is both pragmatically useful and well-founded.

APPENDIX

Direct numerical evaluation of the sums in Eq. (II.3) is perhaps the most straightforward way to proceed. However, it is possible to represent the sums analytically, which may perhaps be useful. Since the product in Eq. (II.3) can be factorized, we can write the sums in Eq. (II.3) as

$$C(\sigma, N, H) = \exp\Big(-\sum_{i=1}^{M} \lambda_i\Big) \sum_{n_1=0}^{\infty} \frac{\lambda_1^{n_1}}{n_1!} \cdots \sum_{n_M=0}^{\infty} \frac{\lambda_M^{n_M}}{n_M!} \, H\Big(t - \sum_{i=1}^{M} c_i n_i\Big), \tag{IV.1}$$

where the constraint is imposed using the Heaviside step function [4] defined by $H(x) = 1$ if $x \ge 0$ and $H(x) = 0$ if $x < 0$. Using the inverse Laplace transform [5] representation of the Heaviside step function

$$H(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} dp \, \exp(px) \, \frac{1}{p}, \tag{IV.2}$$
we can express Eq. (IV.1) as

$$C(\sigma, N, H) = \exp\Big(-\sum_{i=1}^{M} \lambda_i\Big) \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} dp \, \exp(pt) \, \frac{1}{p} \left[ \sum_{n_1=0}^{\infty} \frac{\lambda_1^{n_1} \exp(-c_1 n_1 p)}{n_1!} \right] \cdots \left[ \sum_{n_M=0}^{\infty} \frac{\lambda_M^{n_M} \exp(-c_M n_M p)}{n_M!} \right]. \tag{IV.3}$$

Since

$$\sum_{n_i=0}^{\infty} \frac{\lambda_i^{n_i} \exp(-c_i n_i p)}{n_i!} = \exp\big(\lambda_i \exp(-c_i p)\big), \tag{IV.4}$$

we can write $C$ as the inverse Laplace transform

$$C(\sigma, N, H) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} dp \, e^{pt} F(p), \tag{IV.5}$$

of the function

$$F(p) = \exp\Big(-\sum_{i=1}^{M} \lambda_i + \sum_{i=1}^{M} \lambda_i \exp(-c_i p)\Big) \Big/ p. \tag{IV.6}$$

REFERENCES

[1] Alex Read, Modified Frequentist Analysis of Search Results (The CLs Method), CERN Yellow Report 2000-005; http://preprints.cern.ch/cernrep/2000/2000-005/2000-005.html.
[2] H. Hu and J. Nielsen, Analytic Confidence Level Calculations Using The Likelihood Ratio and Fourier Transform, CERN Yellow Report 2000-005; http://preprints.cern.ch/cernrep/2000/2000-005/2000-005.html.
[3] Daniel Gillespie, A Theorem For Physicists In The Theory Of Random Variables, Am. J. Phys. 51, 520 (1983).
[4] Heaviside Step Function, http://mathworld.wolfram.com/HeavisideStepFunction.html.
[5] Laplace Transform, http://mathworld.wolfram.com/LaplaceTransform.html.
[6] This could also include products over bins.