Discrete Optimization 5 (2008) 700–704

A new construction of $\bar{3}$-separable matrices via an improved decoding of Macula's construction

Hung-Lin Fu*, F.K. Hwang
Department of Applied Mathematics, National Chiao Tung University, Hsin Chu, Taiwan, ROC
* Corresponding author. E-mail address: hlfu@math.nctu.edu.tw (H.-L. Fu).

Article history: Received 30 July 2004; Received in revised form 3 March 2006; Accepted 16 April 2008; Available online 4 June 2008
doi:10.1016/j.disopt.2008.04.002
Keywords: $\bar{3}$-separable matrices; Macula's construction

Abstract

Macula proposed a novel construction of pooling designs which can effectively identify positive clones, and also proposed a decoding method. However, the probability of an unresolved positive clone is hard to analyze. In this paper we propose an improved decoding method and show that for d = 3 an exact probability analysis is possible. Further, we derive necessary and sufficient conditions for a positive clone to be unresolved and give a modified construction which avoids this necessary condition, thus resulting in a $\bar{3}$-separable matrix.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

A pooling design has many biological applications. For convenience, we use the language of clone-library screening. We have a set of n clones and a probe X which is a short DNA sequence. Let $\bar{X}$ denote the dual sequence of X, i.e., $\bar{X}$ is obtained by first reversing the order of the letters and then interchanging A with T and C with G. A clone is called positive if it contains $\bar{X}$ as a subsequence and negative if not. Typically, there are a small number of positive clones, say, from 3 to 10, among the n clones. The goal of a pooling design is to identify all positive clones through a small set of tests (or pools) performed in parallel. A test can be applied to an arbitrary subset of clones with two possible outcomes: a negative outcome indicates that the subset contains no positive clone, and a positive outcome indicates otherwise.

Let M denote the incidence matrix of a design with clones labelling the columns and tests labelling the rows. We will treat a column as the subset of row labels, where i is in the column subset if and only if that column clone is in test i. M is called d-disjunct if no column is contained in the union of any other d columns. M is called d-separable if no two unions of distinct sets of d columns are identical, and $\bar{d}$-separable if "d" is changed to "at most d". It is well known [1] that d-separable matrices can identify all positive clones if their number is exactly d, while $\bar{d}$-separable or d-disjunct matrices can identify all positive clones if their number is at most d.

These matrices have become the major tools in constructing pooling designs. Many methods have been proposed to construct these matrices, but constructions that meet practical needs are still rare. Recently, Macula [4] opened a new door by proposing the containment construction method. More specifically, let $[m] = \{1, \ldots, m\}$. Then the columns of M(m, k, d), d < k, are labelled by n random but distinct k-subsets of [m], the rows by all d-subsets, and $M_{ij} = 1$ if and only if the row label is contained in the column label. Macula proved that M is d-disjunct. This approach extends to many other containment relations, as in partial orders [2] and geometrical structures [6].

One problem with this construction is that the number n of columns is bounded by $\binom{m}{k}$, hence m cannot be too small. On the other hand, the number of tests is $\binom{m}{d}$, so d must be small. Macula [5] proposed using M(m, k, 2) even though the actual number d of positive clones can be larger than 2. In such an application there may exist unresolved clones whose status of being positive or negative is unknown. Let $P^+$ denote the probability that a positive clone is unresolved. The problem of computing $P^+$ under a given decoding method turns out to be difficult. Macula [5] gave a simple decoding while Hwang and Liu [3] improved it, but with a more complicated analysis where $P^+$ can be computed only for d = 3.

In this paper we further improve the decoding method. Although probability analysis remains difficult, we are able to accomplish the following for d = 3: (1) obtain necessary and sufficient conditions for a positive clone to be unresolved; (2) give an exact probability analysis; and (3) derive a simple necessary condition and show that by choosing the column indices judiciously, this necessary condition can be avoided and hence the matrix obtained is $\bar{3}$-separable.

2. The unique-graph decoding

Consider M(m, k, 2) throughout this section. We shall use (u, v) to denote an edge joining u and v. The outcome graph G has [m] as its vertex set and an edge (u, v) if the pool labelled by (u, v) is positive, i.e., it has a positive outcome. Note that each positive clone induces a k-clique in G, but a k-clique in G can correspond to a negative clone, or to a k-subset of [m] not chosen as a column label, while all its edges actually come from some other positive clones.

The unique-graph decoding consists of the following rules:

(i) If a clone appears in a negative pool, it is negative.
(ii) If a k-clique in G contains an edge not appearing in any other k-clique in G, then it represents a positive clone.
(iii) Let p be the number of positive clones identified in (ii). Let $G'$ be obtained from G by removing all p k-cliques identified in (ii) as positive clones and also removing isolated vertices. Let S denote the set of edges in these k-cliques with both endpoints in $G'$. Define $\mathcal{G} = \{G' \cup S' : S' \subseteq S\}$. If $\mathcal{G}$ contains a unique graph $G' \cup S'$ which is the union of a unique set of $p'$ k-cliques for some $p'$ with $p + p' \le \bar{p}$, then each of these $p'$ k-cliques represents a positive clone.
(iv) All clones not identified in (i), (ii), (iii) are unresolved.
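The rules above can be sketched in a few lines of Python. This is only an illustrative reading, not the authors' implementation: the function names and the parameter `bound` (an assumed upper bound on the number of positive clones) are hypothetical, "removing a k-clique" is interpreted as removing its edges, and the uniqueness requirement of rule (iii) is interpreted as "exactly one pair (S', family of cliques) reproduces the graph". Rule (i) is implicit, since only positive pools contribute edges to G.

```python
from itertools import combinations

def edges_of(clique):
    """All pools (2-subsets) contained in a k-subset."""
    return {frozenset(e) for e in combinations(sorted(clique), 2)}

def outcome_graph(positives):
    """Edge set of G: a pool is positive iff some positive clone contains it."""
    graph = set()
    for clone in positives:
        graph |= edges_of(clone)
    return graph

def k_cliques(edges, k):
    """All k-cliques of the graph with the given edge set."""
    verts = sorted({v for e in edges for v in e})
    return [frozenset(c) for c in combinations(verts, k) if edges_of(c) <= edges]

def rule_ii(edges, k):
    """k-cliques that own an edge lying in no other k-clique."""
    cliques = k_cliques(edges, k)
    found = []
    for c in cliques:
        others = set()
        for o in cliques:
            if o != c:
                others |= edges_of(o)
        if edges_of(c) - others:
            found.append(c)
    return found

def rule_iii(edges, k, identified, bound):
    """Assumed reading of rule (iii); returns the newly identified cliques."""
    removed = set()
    for c in identified:
        removed |= edges_of(c)
    g_prime = edges - removed
    verts = {v for e in g_prime for v in e}        # isolated vertices drop out
    s = [e for e in removed if e <= verts]         # removed edges inside G'
    pool = g_prime | set(s)
    candidates = [frozenset(c) for c in combinations(sorted(verts), k)
                  if edges_of(c) <= pool]
    solutions = []
    for r in range(len(s) + 1):
        for s_sub in combinations(s, r):
            target = g_prime | set(s_sub)
            for size in range(1, bound - len(identified) + 1):
                for family in combinations(candidates, size):
                    union = set()
                    for c in family:
                        union |= edges_of(c)
                    if union == target:
                        solutions.append(set(family))
    return solutions[0] if len(solutions) == 1 else set()

# Hypothetical instance: k = 3, positive clones 123, 124 and 345.
g = outcome_graph([{1, 2, 3}, {1, 2, 4}, {3, 4, 5}])
found = rule_ii(g, 3)                 # only 345 has a private edge
rest = rule_iii(g, 3, found, 3)       # 123 and 124 are recovered
```

The enumeration is exponential and is meant only for toy instances; it makes explicit which graphs in the family are tried and when the decoder gives up.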
Note that rule (iii) differentiates the unique-graph decoding from the original Hwang–Liu decoding. The following example illustrates the difference.

Example 1. k = 3, d = 3; 123, 124, 345 are positive. Using (ii), only 345 is identified, since either (3, 5) or (4, 5) is an edge not in any other triangle. Using (iii), $\mathcal{G}$ consists of two graphs, $G'$ and $G' \cup \{(3, 4)\}$, while only the first one is the union of two triangles. Thus 123 and 124 are identified.

Even the unique-graph decoding can leave positive clones unresolved.

Example 2. k = 3, d = 3. Since G is the union of four different sets of three triangles, the unique-graph decoding fails to identify any positive clone.

Define $A \triangle B = (A \setminus B) \cup (B \setminus A)$. We have:

Theorem 2.1. Under the unique-graph decoding, a positive clone A is unresolved if and only if there exists a nonpositive k-clique B such that all edges of $A \triangle B$ are contained in the other positive clones.
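The criterion in the theorem above can be checked mechanically on small instances. The sketch below is a direct, brute-force transcription of the statement (the function name is hypothetical): it searches for a nonpositive k-clique B of G such that every edge in the symmetric difference of the edge sets of A and B lies in a positive clone other than A. It is tried on the clones 123, 124, 345 of Example 1 and on a hypothetical instance 123, 124, 234 whose outcome graph is the complete graph on four vertices.

```python
from itertools import combinations

def edges_of(clique):
    """All pools (2-subsets) contained in a k-subset."""
    return {frozenset(e) for e in combinations(sorted(clique), 2)}

def unresolved_by_criterion(a, positives, k):
    """True if a nonpositive k-clique B exists whose edge symmetric difference
    with A is covered by the positive clones other than A."""
    positives = [frozenset(p) for p in positives]
    a = frozenset(a)
    graph = set().union(*(edges_of(p) for p in positives))
    covered = set().union(*(edges_of(p) for p in positives if p != a))
    verts = sorted({v for e in graph for v in e})
    for b in combinations(verts, k):
        b = frozenset(b)
        if b in positives or not edges_of(b) <= graph:
            continue                      # B must be a nonpositive k-clique of G
        if edges_of(a) ^ edges_of(b) <= covered:
            return True
    return False

example_1 = [{1, 2, 3}, {1, 2, 4}, {3, 4, 5}]
complete_4 = [{1, 2, 3}, {1, 2, 4}, {2, 3, 4}]

status_1 = [unresolved_by_criterion(a, example_1, 3) for a in example_1]
status_k4 = [unresolved_by_criterion(a, complete_4, 3) for a in complete_4]
```

In the first instance no clone meets the criterion; in the second, the triangle 134 serves as B for each of the three positive clones.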
Proof. (If) We cannot tell whether A or B is positive since they induce the same outcome graph.

(Only if) Suppose there does not exist a nonpositive k-clique in G. Then every k-clique in G must be positive. On the other hand, every positive k-clique is clearly in G. Therefore the number of k-cliques in G equals the true number of positive k-cliques, and hence is at most $\bar{p}$. Consequently, these positive k-cliques can be identified through rule (iii) of the unique-graph decoding. The requirement that all edges in $A \setminus B$ are contained in the other positive clones follows from the fact that if a k-clique K contains an edge which is not in any other k-clique, then K represents a positive clone. Finally, all edges in $B \setminus A$ are necessarily in the other positive clones since otherwise B would not be in G.

Corollary 2.2. If d = 3, then a necessary and sufficient condition for A to be unresolved is that each of the other two positive clones contains $A \triangle B$.

Proof. Let C and D denote the other two positive clones. For $a \in A \setminus B$, $C \cup D$ must contain all edges from a to $A \setminus \{a\}$. But neither C nor D alone can contain all of them, since that would require $C \supseteq A$ or $D \supseteq A$, an absurdity. Therefore $a \in C$ and $a \in D$. Similarly, we can prove that every $b \in B \setminus A$ is in both C and D.

Corollary 2.3. For d = 3, a necessary condition for A to be unresolved is that G contains a $k'$-clique with $k' > k$.

Proof. $A \cup (B \setminus A) = B \cup (A \setminus B)$ is a $k'$-clique with $k' > k$.

3. Exact probability analysis for d = 3

The necessary and sufficient condition in Theorem 2.1 is not convenient for computing the probabilities of unresolved clones since it involves an unspecified negative clone. For d = 3, we transform the condition into conditions involving the three positive clones A, C, D only. Fig. 1 shows the intersections of A, C, D, where the seven parts are labelled by $A'$, $C'$, $D'$, AC, AD, CD and ACD (for instance, $A'$ consists of the vertices in A only, and AC of the vertices in A and C but not in D).

Fig. 1. Intersection of A, C, D.

A necessary condition for A to be unresolved is that $A' = \emptyset$, which implies $AC \ne \emptyset$ and $AD \ne \emptyset$ (otherwise C or D would contain A, an absurdity). Furthermore CD cannot be empty, since otherwise the edges from AC to AD force A to be positive. Finally, by Theorem 2.1 all edges from AC to AD must be in B, which implies $B \supseteq AC \cup AD$. So if B contains x vertices in CD, B must leave x vertices in ACD out to enforce $|A| = |B| = k$.

Let $P^+(n)$ denote the probability that a given positive clone is unresolved.

Theorem 3.1. Consider three positive clones A, C, D. Then A is unresolved if and only if $A'$ is empty while $|CD \cup ACD| \ge 2$.

Proof. (Only if) Shown in the preceding paragraph.

(If) Note that $|A| = k$ and $|CD| \ge 1$ imply $|AC \cup AD \cup CD \cup ACD| \ge k + 1$. B can be selected from $AC \cup AD \cup CD \cup ACD$ with the provision that if B takes x vertices from CD, then it must leave x vertices in ACD untouched so that $|A \setminus B| = |B \setminus A|$.

We are now ready to give the probability formulas. Define $m_{ij} = \min\{k - i,\ i - j\}$.

Theorem 3.2.

$$P^+(n) = \sum_{i=1}^{k-1} \binom{k}{i}\binom{b-k}{k-i} \sum_{j=0}^{i-1}\ \sum_{h=\max\{0,\,2-j\}}^{m_{ij}} \binom{i}{j}\binom{k-i}{h}\binom{b-2k+i}{i-j-h} \Big/ \binom{N-1}{2}.$$

Proof. $\binom{k}{i}\binom{b-k}{k-i}$ is the number of choices of C with $|A \cap C| = i$. The rest of the numerator gives the number of choices of D such that D takes all the vertices of $A \setminus C$ plus j vertices of $A \cap C$, h vertices of $C \setminus A$, and the rest outside of $A \cup C$. Note that $j + h \ge 2$. The denominator gives the unconstrained number of choices of C and D. (If we also consider the choice of the n − 3 negative clones, we merely include the factor $\binom{N-3}{n-3}$ in both the numerator and the denominator.)
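The counting step used in the proof above (choose C by its overlap with A, then choose D by how it splits over $A \cap C$, $C \setminus A$ and the outside of $A \cup C$) can be sanity-checked by brute force for small parameters. The sketch below assumes that b is the number of indices from which the k-subsets are drawn and that i, j, h have the meaning given in the proof; the function names are hypothetical, and the check covers only the product of binomials, not the summation limits or the normalization.

```python
from itertools import combinations
from math import comb

def count_pairs_by_profile(b, k):
    """For A = {0, ..., k-1} inside {0, ..., b-1}, tally ordered pairs (C, D)
    of k-subsets with D containing all of A minus C, keyed by (i, j, h):
    i = |A & C|, j = |A & C & D|, h = |(C - A) & D|."""
    a = frozenset(range(k))
    subsets = [frozenset(s) for s in combinations(range(b), k)]
    tally = {}
    for c in subsets:
        for d in subsets:
            if not (a - c) <= d:
                continue
            key = (len(a & c), len(a & c & d), len((c - a) & d))
            tally[key] = tally.get(key, 0) + 1
    return tally

def product_formula(b, k, i, j, h):
    """Number of (C, D) with profile (i, j, h) according to the counting argument."""
    choices_c = comb(k, i) * comb(b - k, k - i)
    choices_d = comb(i, j) * comb(k - i, h) * comb(b - 2 * k + i, i - j - h)
    return choices_c * choices_d

tally = count_pairs_by_profile(7, 3)
mismatches = [key for key, n in tally.items()
              if n != product_formula(7, 3, *key)]
```

With b = 7 and k = 3 the enumeration covers 35 × 35 ordered pairs, so it runs instantly; an empty `mismatches` list means every realized profile agrees with the product of binomials.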
Note that $P^+(n)$ is independent of n. Let $P^+_n(x)$ denote the probability that exactly x positive clones are unidentified.

Lemma 3.3. x = 3 if and only if G is a $k'$-clique with $k' > k$.

Proof. Clearly, a trivial necessary condition for x = 3 is that $A' = C' = D' = \emptyset$, or $G = A \cup C \cup D$ is a $k'$-clique with $k' > k$. We now show that this condition is also sufficient. Suppose $|ACD| = j$. Then $|AC| = |AD| = |CD| = (k - j)/2$. Therefore

$$|CD| + |ACD| = \frac{k-j}{2} + j = \frac{k+j}{2} \ge 2$$

(k = 3 forces j = 1), satisfying the condition of Theorem 3.1. Hence A is unresolved. Similarly, we can prove that C and D are unresolved.

Corollary 3.4.

$$P^+_n(3) = \sum_{j=0}^{k-1} \binom{k}{j}\binom{k-j}{(k-j)/2}\binom{b-k}{(k-j)/2} \Big/ \binom{N}{3},$$

where only those j with k − j even contribute.

Proof. Given A, there are $\binom{k}{j}$ choices of j vertices in ACD, $\binom{k-j}{(k-j)/2}$ choices of (k − j)/2 vertices in AC, and $\binom{b-k}{(k-j)/2}$ choices for the remaining vertices of C. Once A and C are chosen, D is fixed.

Lemma 3.5. x = 2 if and only if exactly two of $A'$, $C'$ and $D'$ are empty, say, $C'$ and $D'$, and $|CD| \le k - 2$.

Proof. $|AC| = k - |ACD| - |CD| = |AD| \ge 1$, since otherwise C = D, an absurdity. Further, $|AC| + |ACD| = |AD| + |ACD| = k - |CD| \ge 2$. Lemma 3.5 follows from Theorem 3.1 immediately.

Corollary 3.6.

$$P^+_n(2) = 3 \sum_{i=1}^{\lfloor (k-1)/2 \rfloor}\ \sum_{j=\max\{0,\,2-i\}}^{k-2i-1} \binom{k}{j}\binom{k-j}{i}\binom{k-i-j}{i}\binom{b-k}{k-i-j} \Big/ \binom{N-1}{2}.$$

Proof. Suppose A is the only identified positive clone. Assume $|AC| = i$ and $|ACD| = j$. Then $|AD| = i$, as shown in the proof of Lemma 3.5. Since A is identified,

$$k = |A| > |AC| + |AD| + |ACD| = 2i + j.$$

Hence $j \le k - 2i - 1$ and $i \le (k - 1)/2$. Note that we set $j \ge \max\{0, 2 - i\}$ to guarantee $|CD| = k - i - j \le k - 2$. The multiplication by 3 is because any of A, C, D can be the unique resolved one.

Let $\tilde{P}^+(n)$ be obtained from $P^+(n)$ by changing the upper bound of h from $m_{ij}$ to $m_{ij} - 1$ in Theorem 3.2, to guarantee that neither $C'$ nor $D'$ is empty. Then $\tilde{P}^+(n)$ is the probability that A is the unique unresolved positive clone. Hence:

Lemma 3.7. $P^+_n(1) = 3\tilde{P}^+(n)$.

Theorem 3.8. $P^+_n(0) = 1 - P^+_n(3) - P^+_n(2) - P^+_n(1)$.

Note that $P^+_n(x)$ is independent of n for any x.

4. A new construction of $\bar{3}$-separable matrices

Theorem 4.1. Suppose $d \le 3$ and G contains no (k + 1)-clique. Then the unique-graph decoding identifies all d positive clones.

Proof. Suppose d = 1 or 2. Since a single positive clone cannot cover the edges of another positive clone, rule (ii) of the unique-graph decoding always identifies all positive clones. Suppose d = 3. Then the proof follows from Corollary 2.3.

We now show how to choose the labels of the columns such that G does not contain a (k + 1)-clique. Partition the m indices of [m] evenly into k parts. Then K is a legitimate label of columns if K consists of one index from each part. Now, a (k + 1)-clique has k + 1 indices, two of which, say, u and v, must come from the same part. But no label containing both u and v is chosen since it is not legitimate. Therefore G does not contain the edge (u, v), and a fortiori does not contain the (k + 1)-clique containing u and v.
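The label construction just described is easy to make concrete. The following sketch (hypothetical function names; m = 20 and k = 6 are merely illustrative values) splits [m] into k consecutive parts whose sizes differ by at most one, enumerates every legitimate label, and confirms that no pool inside a single part can ever be positive, which is the property the pigeonhole argument relies on.

```python
from itertools import combinations, product
from math import comb

def even_partition(m, k):
    """Split indices 1..m into k consecutive parts with sizes differing by at most one."""
    q, r = divmod(m, k)
    parts, start = [], 1
    for t in range(k):
        size = q + (1 if t < r else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts

def legitimate_labels(parts):
    """All k-subsets that take exactly one index from each part."""
    return [frozenset(choice) for choice in product(*parts)]

parts = even_partition(20, 6)            # part sizes 4, 4, 3, 3, 3, 3
labels = legitimate_labels(parts)        # 4 * 4 * 3**4 = 1296 labels
num_tests = comb(20, 2)                  # one pool per pair of indices

# Pools that can be positive if every legitimate label were a positive clone.
possible_edges = {frozenset(e) for lab in labels
                  for e in combinations(sorted(lab), 2)}
# Pools whose two indices lie in the same part.
same_part_edges = {frozenset(e) for part in parts
                   for e in combinations(part, 2)}
no_intra_part_edge = possible_edges.isdisjoint(same_part_edges)
```

Since any k + 1 indices include two from the same part, the absence of intra-part edges rules out a (k + 1)-clique in G regardless of which legitimate labels are positive.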
Theorem 4.2. The number of legitimate labels is approximately $(m/k)^k$. Therefore, by Theorem 4.1, we have a construction of a $\bar{3}$-separable matrix with $\binom{m}{2}$ tests and $(m/k)^k$ clones.

We now explain why this construction does not lead to a 3-disjunct matrix. Note that rule (iii) of the unique-graph decoding identifies positive clones even when there exist unresolved nonpositive k-cliques (see Example 1, where triangles 234 and 134 are nonpositive and unidentified). On the other hand, a d-disjunct matrix has the property that all clones which are not identified as negative are identified as positive, a clear conflict with rule (iii).

The following comparison shows the advantage and disadvantage, respectively, of using this new construction. Given n clones, we want to choose m, k such that

$$(m/k)^k \ge n \qquad (1)$$

with the minimum m, so that the number of tests is minimized. Approximate (1) by equality. Then

$$m = k\, n^{1/k}. \qquad (2)$$

Minimizing the right-hand side of (2) with respect to k, we obtain $k_0 = \ln n$. Consequently, $m_0 = e \ln n$. So the number of tests is $\binom{e \ln n}{2}$.

To compare with M(m, k, 2), we first choose m, k such that $\binom{m}{k} \ge n$ and m is minimum. Since k = m/2 maximizes $\binom{m}{k}$, we replace this condition by

$$\binom{m}{m/2} \ge n. \qquad (3)$$

Denote by $m_n$ the minimum m satisfying (3). Then the number of tests is $\binom{m_n}{2}$.

Example. Let n = 1000. Then $m_0 = 20$ and $k_0 = 6$, since 4 + 4 + 3 + 3 + 3 + 3 = 20 and $4^2 \cdot 3^4 = 1296 > 1000$. Thus our $\bar{3}$-separable matrix requires $\binom{20}{2} = 190$ tests. On the other hand, $m_{1000} = 13$ with k = 6, since $\binom{13}{6} > 1000 > \binom{12}{6}$. Hence M(13, 6, 2) requires 78 tests. However, to identify three positive clones, we need to use M(13, 6, 3), which requires 286 tests.

Acknowledgments

The authors would like to thank the referees for their helpful comments and suggestions.

References

[1] D.Z. Du, F.K. Hwang, Combinatorial Group Testing and its Applications, 2nd ed., World Scientific, Singapore, 2000.
[2] T.Y. Huang, C.W. Weng, Pooling spaces and nonadaptive pooling designs, Discrete Math. 283 (2004) 163–169.
[3] F.K. Hwang, Y.C. Liu, Random pooling designs under various structures, J. Comb. Optim. 7 (2003) 339–352.
[4] A.J. Macula, A simple construction of d-disjunct matrices with certain constant weights, Discrete Appl. Math. 50 (1996) 217–222.
[5] A.J. Macula, Probabilistic nonadaptive and two-stage group testing with relatively small pools and DNA library screening, J. Comb. Optim. 2 (1999) 385–397.
[6] H. Ngo, D.Z. Du, New constructions of nonadaptive and error-tolerance pooling designs, Discrete Math. 243 (2002) 161–170.