Selecting an Optimal Rejection Region for Multiple Testing

Size: px
Start display at page:

Download "Selecting an Optimal Rejection Region for Multiple Testing"

Transcription

1 Selecting an Optial Rejection Region for Multiple Testing A decision theory alternative to FDR control, with an application to icroarrays David R. Bickel Office of Biostatistics and Bioinforatics Medical College of Georgia Augusta, GA dbickel@ail.cg.edu Noveber 26, 2002 Archives: arxiv.org, athpreprints.co Abstract. As a easure of error in testing ultiple hypotheses, the decisive false discovery rate (dfdr), the ratio of the expected nuber of false discoveries to the expected total nuber of discoveries, has advantages over the false discovery rate (FDR) and positive FDR (pfdr). The dfdr can be optiized and often controlled using decision theory, and soe previous estiators of the FDR can estiate the dfdr without assuing weak dependence or the randoness of hypothesis truth values. While it is suitable in frequentist analyses, the dfdr is also exactly equal to a posterior probability under rando truth values, even without independence. The test statistic space in which null hypotheses are rejected, called the rejection region, can be deterined by a nuber of ultiple testing procedures, including those controlling the faily-wise error rate (FWER) or the FDR at a specified level. An alternate approach, which akes use of the dfdr, is to reject null hypotheses to axiize a desirability function, given the cost of each false discovery and the benefit of each true discovery. The focus of this ethod is on discoveries, unlike related approaches based on false nondiscoveries. A ethod is provided for achieving the highest possible desirability under general conditions, without relying on density estiation. A Bayesian treatent of differing costs and benefits associated with different tests is related to the weighted FDR when there is a coon rejection region for all tests. An application to DNA icroarrays of patients with different types of leukeia illustrates the proposed use of the dfdr in decision theory. Coparisons between ore than two groups of patients do not call for changes in the results, as when the FWER is strictly controlled by adjusting p-values for ultiplicity.

2 2 Introduction In ultiple hypothesis testing, the ratio of the nuber of false discoveries (false positives) to the total nuber of discoveries is often of interest. Here, a discovery is the rejection of a null hypothesis and a false discovery is the rejection of a true null hypothesis. To quantify this ratio statistically, consider hypothesis tests with the ith test yielding H i = 0 if the null hypothesis is true or H i = 1 if it is false and the corresponding alternative hypothesis is true. Let R i = 1 if the ith null hypothesis is rejected and R i = 0 otherwise and define V i = H1 - H i L R i, so that R i is the total nuber of discoveries and V i is the total nuber of false discoveries. Storey (2002) pointed out that the intuitively appealing quantity EH V i ê R i L is not a viable definition of a false discovery rate since it usually cannot be assued that PH R i > 0L = 1. He listed three possible candidates for such a definition: V i ª E i j ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ k R i ƒ V i Q ª E i j ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ k R i ƒ R i > 0 y z Pi j { k R i > 0 y z ; { R i > 0 y z ; { (1) (2) D ª EH V i L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅ R i L. EH Benjaini and Hochberg (1995) called the false discovery rate (FDR) since null hypotheses can be rejected in such a way that a is guaranteed for any significance level a œ H0, 1L. This control of the FDR is accoplished by choosing a suitable rejection region G Õ S, where S is the test-statistic space, and by rejecting the ith null hypothesis if and only if the ith test statistic, t i œ S, is an eleent of G. The FDR was originally controlled under independence of the test statistics (Benjaini and Hochberg, 1995), but can also be controlled for ore general cases (Benjaini and Yekutieli, 2001). The other two candidates, Q and D, cannot be controlled in this way since, if all of the null hypotheses are true, then Q = 1 > a and D = 1 > a (Storey 2002). Nonetheless, Storey (2002) recoended the use of Q, which he called the positive false discovery rate (pfdr), since the ultiplicand PH R i > 0L of the FDR akes it harder to interpret than the pfdr. Although D of Eq. (3) is also easier to interpret than the FDR, Storey (2002) coplained that D fails to describe the siultaneous fluctuations in V i and R i, in spite of its attraction as a siple easure. However, it will be seen that D, unlike and Q, can be optiized using decision theory without dependence assuptions and that the optial value of D is autoatically controlled under general conditions. Because these advantages are based on decision theory, I call D the decisive false discovery rate (dfdr). It will also becoe clear that certain estiators of the FDR and pfdr are ore exact when considered as estiators of the dfdr. A possible objection to the use the dfdr is that it is undefined for EH R i L = 0. However, that does not occur as long as the rejection region is a non-null subset of the test statistic space, in which case PH R i > 0L 0. In other words, the dfdr is eaningful as (3)

3 3 long as a rejection region is not selected in such a trivial way that the nuber of rejected null hypotheses is alost always zero. For exaple, rejecting a null hypothesis only if a t-statistic were exactly equal to soe value t 0 would ake all hypothesis testing procedures useless, including those based on the FDR, pfdr, or dfdr. Denote by b i the benefit of rejecting the ith null hypothesis when it is false, where the benefit can be econoic, such as an aount of oney, or soe other positive desirability in the sense of Jeffrey (1983). Let c i be the cost of rejecting the ith null hypothesis when it is true, with cost here playing the role of a negative desirability, so that we have b i 0, c i 0, and the vectors b ª Hb i L and c ª Hc i L. Then the net desirability is dhb, cl = Hb i HR i - V i L - c i V i L = b i R i - Hb i + c i L V i. If the costs and benefits are independent of the hypothesis, then the net desirability is d = I R i - I1 + c 1 ÅÅÅÅÅÅ M V i M = I1 - I1 + c 1 ÅÅÅÅÅÅ M V i ê R i M R i and its expectation value is EHdL º J1 - J1 + c 1 ÅÅÅÅÅÅÅÅ N QN E i (5) j R y i z k { if PH R i > 0L º 1. (Genovese and Wasseran (2001) instead utilized a arginal risk function that is effectively equivalent to -EHdL ê if = c 1.) Thus, the rejection region can be chosen such that the pfdr (or FDR) and the expected nuber of rejections axiize the approxiate expected desirability, given a cost-to-benefit ratio, assuing the probability of no rejections is close to zero. However, these approxiations are not needed since there is an exact result in ters of the dfdr: EHdL = E i j R i - J1 + c 1 y ÅÅÅÅÅÅÅ N V i (6) k b 1 z = J1 - J1 + c 1 ÅÅÅÅÅÅÅ N DN E i { j y R i z. k { When all of the null hypotheses are true, D = 1 for all nontrivial rejection regions and it follows fro Eq. (6) that EHdL is optiized when EH R i Lö0, as expected. Likewise, when all of the null hypotheses are false, D = 0 for all rejection regions and EH R i L = is obviously optial; in this case, the rejection region would be the whole test statistic space. More generally, if the rejection region is chosen such that EHdL is axial and if that axiu value is positive, then D < I1 + c 1 ÅÅÅÅÅ (7) M -1, as plotted in Fig. 1. Thus, under conditions that ake the rejection of at least one null hypothesis desirable, optiizing the EHdL autoatically controls the dfdr. Consequently, desirability optiization can be seen as a generalization of dfdr control that applies to wider sets of probles. The dfdr inequality (7) can be used to select a cost-to-benefit ratio when none is suggested by the proble at hand. For exaple, analogous to the usual Type I significance level of 5%, optiizing the desirability with a cost-to-benefit ratio of 19 would control the (4)

4 4 dfdr at the 5% level, provided that the expected desirability is positive for soe rejection region D cost-to-benefit ratio Upper bound of dfdr for ax EHdL if ax EHdL > 0. Figure 1 Estiation of dfdr and optiization of desirability The dfdr can be written in ters of p 0, the proportion of null hypotheses that are true: D = EH H1 - H i L R i L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ R i L EH = p 0 P Ht i œ G» H i = 0L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ. P Ht i œ GL For ease of notation, let G L, denote by F the distribution of the test statistics, and denote by F H0L the distribution of the test statistics under each null hypothesis. (The sae ethod applies to ore general rejection regions, but a single-interval rejection region avoids the bias entioned by Storey (2002) of choosing a ultiple-interval region fro the data and then using that region to obtain estiates.) Then (8) DHtL = p 0 P Ht t» H = 0L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = p 0 H1 - F H0L HtLL ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅ, P Ht tl 1 - FHtL which is naturally estiated by (9) M p` 0 D` H0L IHTi tl í M HtL ª ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ (10) IHTi tl ê, where p` 0 is an estiate of p 0, IHxL = 1 if x is true and IHxL = 0 if x is false, T i is the observed test statistic of the ith test, and T H0L i is the ith test statistic out of M test statistics calculated by a resapling procedure under the null hypothesis. (Of course, such resapling is not needed when F H0L is known, as when -T i is the p-value of the ith test; Storey (2002) and Genovese and Wasseran (2002) exained this special case.) Making use of the fact that a large proportion of the test statistics that are less than soe threshold l would be associated with true null hypotheses, Efron et al. (2001) and Storey (2002) estiated p 0 by variants of

5 5 p` 0HlL = IHTi < ll ê ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ M ÅÅÅÅÅÅÅÅÅÅ (11) H0L IITi < lm ë M. Storey (2002) offers soe interesting ways to optiize l, but in the present application to gene expression data, I will follow the ore straightforward approach of Efron et al. (2001) for siplicity. They transfored the observed test statistics to follow F, the standard noral distribution, then counted the nubers of transfored statistics falling in the range H-1 ê 2, 1 ê 2L. The exaple analyses below use Eq. (11) to obtain siilar results, without transforation, by choosing l such that the proportion of observed test statistics that are less than l is as close as possible to FH1 ê 2L - FH-1 ê 2L º Given a value of l and a testindependent cost-to-benefit ratio, Eq. (6) suggests optiizing the dfdr by selecting the value of t œ 8T i < that axiizes D` HtL = I1 - I1 + c 1 ÅÅÅÅÅ M D` HtLM R i HtL. (12) Clearly, the best rejection region can be found without knowledge of, as long as the ratio c 1 ê is specified. Like estiation of the FDR and pfdr, estiation of the dfdr does not require restrictive assuptions about F. The dfdr has the additional advantage that the sae ethods of estiation, such as the one given here, can be used without assuptions about correlations between the test statistics. It will be seen in the next section that certain ethods that require independence or weak dependence for the estiation of the FDR or pfdr can estiate the dfdr without such restrictions. Bayesian foralis dfdr as a posterior probability While the above definition of dfdr and estiation procedure were given in ters agreeable with a frequentist perspective in which the truth or falsehood of each null hypothesis is fixed, they are equally valid in a Bayesian construction. Efron et al. (2001), Genovese and Wasseran (2002), and Storey (2002) considered the case in which each H i is a rando variable and p 0 is the prior probability that a null hypothesis is true, so that the arginal distribution F is a ixture of F H0L and F H1L, the distributions of the statistics conditioned on the truth and falsehood of the null hypothesis: F = p 0 F H0L + H1 - p 0 L F H1L. Storey (2002) pointed out that the dfdr is equal to the posterior probability that a null hypothesis is true: PHH = 0L PHt œ G» H = 0L PHH = 0, t œ GL D = ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = PHH = 0» t œ GL. (13) PHt œ GL PHt œ GL He also proved that this holds for the pfdr, assuing that the eleents of 8t i < and 8H i < are independent, with the corollary that D = Q for the sae rejection region G. Thus, under approxiate independence,

6 6 Q º PHH = 0» t œ GL, with equality only holding under exact independence. Storey (2002) used relation (14) to otivate the ethods of estiating the FDR and pfdr that inspired the variant ethod of Eqs. (10) and (11). (He adjusted his estiates for the conditionality of the pfdr on R i > 0, but the adjustent is not needed to estiate the FDR or dfdr.) Storey (2002) also proved that such estiation procedures are valid asyptotically (ö ) under weak dependence. However, these techniques ore directly estiate the dfdr since Eq. (13) holds exactly, without any assuption about the nature of dependence. Far fro detracting fro the ethods of estiation, this observation broadens their application, since one can use the to ake inferences about the dfdr whether or not they apply to the pfdr for a particular data set. Other ethods of optiization The axiization of D` HtL deterines a rejection region using the dfdr and the cost and benefit of a false or true discovery. Genovese and Wasseran (2001) and Storey (2002), on the other hand, presented cost functions that use the concept of a false nondiscovery, a failure to reject a false null hypothesis. The approach taken herein has uch ore in coon with a Bayes error approach of Storey (2002) than with the approach based on a loss function of Genovese and Wasseran (2001), also considered in Storey (2002). Each will be considered in turn. Storey (2002) introduced a ethod based on the cost of a false nondiscovery instead of on the benefit fro a true discovery. Table 1 clarifies the difference between his forulation and that described above: Change in desirability d i Hc, b, c L R i = 0 R i = 1 H i = 0 0 -c H i = 1 -c +b Contribution of each cost or benefit to the net desirability. Table 1 The approach of Eqs. (4), (5), and (6) requires that c > 0, b > 0, and c = 0, whereas that of Storey (2002) requires that c > 0, b = 0, and c > 0; he pointed out that a scientist can feasibly decide on c ê c for a icroarray experient. If the net desirability is defined as dhc, b, c L ª d i Hc, b, c L, then the Bayes error that Storey iniized, BEHc, 0, c L, is equal to -EHdHc, 0, c LL ê. Since axiizing EHdHc, b, 0LL and iniizing -EHdHc, 0, bll ê yield the sae optial rejection region, the ethods have the sae goal. Thus, stating the proble in ters of false nondiscoveries as opposed to true discoveries is largely a atter of taste, but the latter is sipler and is ore in line with the intuition that otivates reporting false discovery rates, which concerns false and true discoveries, not nondiscoveries. Unlike the technique used here, Storey's (2002) algorith of optiization relies on a knowledge or estiation of probability densities, which can be probleatic. Another advantage of axiizing D` HtL is that it does not require a Bayesian fraework, but is also consistent with fixed values of H i. (14)

7 7 Genovese and Wasseran (2001) considered a linear cobination of the FDR and the false nondiscovery rate (FNR), and Storey (2002) optiized the corresponding linear cobination the pfdr and the positive false nondiscovery rate (pfnr). The ethod of Storey (2002) would be ore exact using the dfdr (12) and the decisive false nondiscovery rate (dfnr), the ratio of the expected nuber of false nondiscoveries to the expected total nuber of nondiscoveries. However, the axiization of D` HtL or iniization of the estiated Bayes error is preferred since they are based on c ê b, the ratio of the cost of a false discovery to the benefit of a true discovery or, equivalently, on c ê c. The linear cobination ethods, by contrast, are based on the ratio of the discovery rates. The proble is that the two ratios are not the sae in general since the discovery rates have different denoinators. Since it is unrealistic in ost cases to ask an investigator to specify a ratio of discovery rates, the straightforward approach of optiizing a total desirability or Bayes error is ore widely applicable. Test-dependent costs and benefits The use of the dfdr and decision theory does not require all tests to have equal costs of false discoveries and benefits of true discoveries. If the costs and benefits associated with each test are equal to those of a sufficiently large nuber of other tests, as in the following application to icroarray data, then the estiators of Eqs. (10), (11), and (12) and the axiization of D` HtL can be applied separately to each subset of tests with the sae costs and benefits, yielding a different rejection region for each subset. In the notation used above, " " c i = c j, b i = b j, where L `, " i j i k = «, and kœ81,2,...,l< i,jœi k j,kœ81,2,...,l< L k=1 i k ª 81, 2,..., <. It is assued that each set i k has enough ebers that reliable estiates can be obtained by replacing 81, 2,..., < with i k in the sus of Eqs. (10) and (11) and ` with test statistics of the null distribution generated fro i k. Let D k Htk L be the resulting estiate of the net desirability associated with i k for threshold t k. Maxiizing D` k Ht k L independently for each i k axiizes the estiated expected desirability, D k Htk L, but if a co- L ` k=1 on rejection region is desired, then the estiate D` b+c HtL can be axiized instead by finding the optial coon threshold t, using p` 0 Hb + cl, a coon estiate of p 0, found as L described below. Since D` k Ht k L D` b+c HtL if all estiates of p 0 and all null distributions k=1 are equal, the forer approach tends to give a higher desirability, but the latter has an interesting connection with the weighted false discovery rate (WFDR) of Benjaini and Hochberg (1997). In analogy with the WFDR and the dfdr, I define the weighted decisive false discovery rate (WdFDR) as D w ª EH w i V i L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ w i R i L, EH where w is a vector of positive weights, Hw i L. Fro Eq. (4), the expected net desirability can be expressed as (15)

8 8 EHdL = E i j k J b i ÅÅÅÅÅÅÅ + c i y ÅÅÅÅÅÅÅ N R i i z j Ei j { k k b i ÅÅÅÅÅÅÅ y R i z ì Ei j { k J b i ÅÅÅÅÅÅÅ + c i y ÅÅÅÅÅÅÅ N R i z - D y b+c z, { { Finding the threshold t that axiizes D` b+c HtL, the estiate of EHdL, thus yields an optial estiate of D b+c : M p` 0 Hb + cl Hbi + c i L IHT D` H0L i tl í M b+c HtL ª ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ (17), Hbi + c i L IHT i tl ê with p` 0 Hb + cl and D` b+c HtL siilarly odified fro Eqs. (11) and (12). Here, the null statistics are generated using a pool of all of the data, unlike the ethod in which each D` k Ht k L is independently axiized. Eqs. (16) and (17) indicate that D` b+c HtL can be seen as an approxiate estiate of the WdFDR when each weight is set to the su of the cost and benefit for the corresponding test. In addition, there is a Bayesian interpretation of the WFDR. Given that the distribution of the test statistics is the ixture F = wi F j ê j=1 w j, the dfdr of test statistics distributed as F [Eq. (9)] is equal to the WdFDR of test statistics distributed as F j ê : (16) DHtL = p 0 H1 - F H0L HtLL p 0 I1 - wi F H0L j HtL ê ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = j=1 w j M ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = 1 - FHtL 1 - wi F j HtL ê j=1 w j p 0 wi H1 - F H0L j HtLL ê j=1 w j ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ = EH w i V i L ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ ÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅÅ wi H1 - F j HtLL ê j=1 w j EH w i R i L. (18) Thus, the WdFDR is a posterior probability associated with t ~F, according to Eq. (13). It follows that giving one test ore weight than another is equivalent to increasing its representation in the saple by a proportional aount. Through the ixture F, the FDR and pfdr have parallel relationships with their weighted counterparts. Benjaini and Hochberg (1997) proposed the WFDR to take into account differing values or econoic considerations aong hypotheses, just as traditional Type I error rate control can give different weights for p-values or different significance levels to different hypotheses. However, without a copelling reason to have a coon rejection region, optiizing D` k Ht k L independently for each i k sees to better accoplish the goal of Benjaini and Hochberg (1997), in light of the above inequality.

9 9 Application to gene expression in cancer patients The optiization of D` HtL was applied to the public icroarray data set of Golub et al. (1999). Levels of gene expression were easured for 7129 genes in 38 patients with B-cell acute lyphoblastic leukeia (ALL) and in 25 patients with acute yeloid leukeia (AML). These preprocessing steps were applied to the raw "average difference" (AD) levels of expression: each AD value was divided by the edian of all AD values of the subject to reduce subject-specific effects, then the noralized AD values were transfored by HAD ê» AD»L lnh1 +» AD»L, as per Bickel (2002). The ith null hypothesis is that there is no stochastic ordering in transfored expression between groups in the ith gene is zero; if it is rejected, then the gene is considered differentially expressed. The goal is to reject null hypotheses such that the expected desirability is axiized, given a cost-to-benefit ratio. That ratio is set to 19 to ensure that the dfdr is controlled at the 5% level (7), provided that the axiu expected desirability is positive. The coputations used the specific values c = 19 and b = 1, without loss of generality. (Alternately, a biologist could estiate the cost of investigating a false discovery to the benefit of finding a truly differentially expressed gene.) To achieve the goal, the absolute value of a two-saple t-statistic, 2 2 T i = ALL,i - AML,i ë Hs ALL,i ê 38 - s ALL,i ê 25L 1ê2, (19) was coputed for each gene for both the original preprocessed data and for M = 1000 rando perutations of the coluns of the 7129 µ H L data atrix, following the unbalanced perutation ethod used by Dudoit et al. (2000). (The whole coluns were peruted, instead of the expression values for each gene independently, to preserve the gene-gene correlations.) Finally, the threshold t was found such that D` HtL is axiized, using the estiates coputed fro Eqs. (10), (11), and (12). Fig. 2 displays D` HtL for this data. Table 2 copares the results to those obtained by choosing t such that the highest nuber of discoveries were ade with the constraint D` HtL 5 % as an approxiation to FDR control. Since the ethod of Benjaini and Hochberg (1995) effectively uses p 0 = 1 (Storey, 2002), as does the algorith of Bickel (2002), Table 2 also shows the effect of conservatively replacing p` 0 with 1.

10 10 Estiated desirability D` HtL for the ALL-AML coparison (12), using the estiate p`0 º 0.59, fro Eq. (11). ax D`HtL p `0 º 0.59 Figure 2 ax D`HtL p `0 ª 1 Table 2 ` D HtL 5 % p `0 º 0.59 D `HtL 5 % p `0 ª 1 Threshold, t # discoveries Desir., D` HtL with p`0 º 0.59 Desir., D` HtL with p`0 ª 1 dfdr, D` HtL with p`0 º 0.59 dfdr, D` HtL with p`0 ª Coparison of estiates of the proposed ethod (axiizing D` HtL with p` 0 fro Eq. (11)) to three alternate ethods. Each nuber in bold face is a axial value constrained according to the ethod of its colun. Coparisons between other groups can easily be added to these results. For exaple, one ight be interested not only in knowing which genes are differentially expressed between B-cell ALL and AML patients, but also which ones are differentially expressed between the B- cell ALL group and the 9 patients of Golub et al. (1999) with T-cell ALL. A faily-wise error rate control approach, such as that used by Dudoit et al. (2000), would require additional adjustents to the p-values, so that fewer genes would be found to be differentially expressed as the nuber of group-group coparisons increases, but the approach of independent rejection regions, given by t 1 and t 2, does not require any odification to the results of Table 2. Sup-

11 11 pose that the benefit of a true discovery of differential expression between the B-cell ALL and T-cell ALL groups is twice that between the B-cell ALL and AML groups, and that the cost of investigating a false discovery is the sae for all coparisons. Then the only difference in the ethod applied to the new set of coparisons is that the benefit of each true discovery is set to 2 instead of 1. In the above notation, i 1 specifies the t-statistics corresponding to the coparisons to the AML group and i 2 specifies those corresponding to the coparisons to the T-cell ALL group. In icroarray experients, this kind of ultiplicity of subject groups as well as a ultiplicity of genes is coon, and the dfdr and decision theory together provide a flexible, powerful fraework for such situations. The coputed threshold is t 2 º 3.64, yielding 177 genes considered differentially expressed and a dfdr estiate of ; these values correspond to t 1 º 3.14, 503 genes, and a dfdr estiate of fro the first colun of Table 2. The two thresholds are close enough that forcing a coon threshold, as in Eqs. (17) and (18), would not have a drastic effect on the conclusions drawn in this case. By contrast, the arked difference between the two dfdr estiates is a reinder that traditional FDR control does not axiize the desirability. Even so, each estiate is controlled in the sense of satisfying (7). In estiating the pfdr, Storey (2002) considered the expression patterns of genes to be weakly dependent, but that ay presently be ipossible to test statistically since the nuber is genes is typically uch greater than the nuber of cases. Storey (2002) hypothesized weak dependence since genes work in biocheical pathways of sall groups of interacting genes. However, weak dependence would only follow if there were no interaction between one pathway and another, an assuption that could only hold approxiately at best. Indeed, the expression of thousands of genes in eukaryotes is controlled by highly connected networks of transcriptional regulatory proteins (Lee et al., 2002). Thus, I hypothesize instead that while weak dependence is a good approxiation for prokaryotes, there is stronger dependence between eukaryotic genes. Fractal statistics can often conveniently describe biological long-range correlations (Bickel and West, 1998) and ay aid the study of genoe-wide interactions as ore data becoes available. At any rate, the investigator can safely report the dfdr, since that is what is directly estiated, noting that it would be approxiately equal to the FDR or pfdr under certain conditions. References Benjaini, Y. and Hochberg, Y. (1995) "Controlling the false discovery rate: A practical and powerful approach to ultiple testing," Journal of the Royal Statistical Society B 57, Benjaini, Y. and Hochberg, Y. (1997) "Multiple hypotheses testing with weights," Scandinavian Journal of Statistics, 24, Benjaini, Y. and Yekutieli, D. (2001) "The control of the false discovery rate in ultiple testing under dependency," Annals of Statistics 29, Bickel, D. R. and West, B. J. (1998) Molecular evolution odeled as a fractal renewal point process in agreeent with the dispersion of substitutions in aalian genes, Journal of Molecular Evolution 47,

12 Bickel, D. R. (2002) "Microarray gene expression analysis: Data transforation and ultiple coparison bootstrapping," Coputing Science and Statistics (to appear); available fro Dudoit, S., Yang, Y. H., Callow, M. J., Speed, T. P. (2000) "Statistical ethods for identifying differentially expressed genes in replicated cdna icroarray experients," Technical Report #578, University of California at Berkeley, Departent of Statistics, Efron, B., Tibshirani, R., Storey, J. D., and Tusher, V. (2001) "Epirical Bayes analysis of a icroarray experient," Journal of the Aerican Statistical Association 96, Genovese, C. and Wasseran, L. (2001) "Operating characteristics and extensions of the FDR procedure," Technical report, Departent of Statistics, Carnegie Mellon University Genovese, C. and Wasseran, L. (2002) "Bayesian and frequentist ultiple testing," unpublished draft of 4/12/02 Golub, T. R., Sloni, D. K., Taayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloofield, C. D., and Lander, E. S. (1999) Molecular classification of cancer: Class discovery and class prediction by gene expression odeling," Science 286, Jeffrey, R. C. (1983) The Logic of Decision, second edition, University of Chicago Press: Chicago Lee, T. I., Rinaldi, N. J., Robert, F., Odo, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thopson, C. M., Sion, I., Zeitlinger, J., Jennings, E. G., Murray, H. L., Gordon, D. B., Ren, B., Wyrick, J. J., Tagne, J.-B., Volkert, T. L., Fraenkel., E., Gifford, D. K., and Young, R. A. (2002) "Transcriptional Regulatory Networks in Saccharoyces cerevisiae," Science 298, Storey, J. D. (2002) False Discovery Rates: Theory and Applications to DNA Microarrays, PhD dissertation, Departent of Statistics, Stanford University 12

Selecting an optimal rejection region for multiple testing

Selecting an optimal rejection region for multiple testing Selecting an optial rejection region for ultiple testing A decision-theoretic alternative to FDR control, with an application to icroarrays David R. Bickel Office of Biostatistics and Bioinforatics Medical

More information

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are,

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are, Page of 8 Suppleentary Materials: A ultiple testing procedure for ulti-diensional pairwise coparisons with application to gene expression studies Anjana Grandhi, Wenge Guo, Shyaal D. Peddada S Notations

More information

Multiple Testing Issues & K-Means Clustering. Definitions related to the significance level (or type I error) of multiple tests

Multiple Testing Issues & K-Means Clustering. Definitions related to the significance level (or type I error) of multiple tests StatsM254 Statistical Methods in Coputational Biology Lecture 3-04/08/204 Multiple Testing Issues & K-Means Clustering Lecturer: Jingyi Jessica Li Scribe: Arturo Rairez Multiple Testing Issues When trying

More information

Generalized Augmentation for Control of the k-familywise Error Rate

Generalized Augmentation for Control of the k-familywise Error Rate International Journal of Statistics in Medical Research, 2012, 1, 113-119 113 Generalized Augentation for Control of the k-failywise Error Rate Alessio Farcoeni* Departent of Public Health and Infectious

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

Simultaneous critical values for t-tests in very high dimensions

Simultaneous critical values for t-tests in very high dimensions Bernoulli 17(1, 2011, 347 394 DOI: 10.3150/10-BEJ272 Siultaneous critical values for t-tests in very high diensions HONGYUAN CAO 1 and MICHAEL R. KOSOROK 2 1 Departent of Health Studies, 5841 South Maryland

More information

arxiv: v1 [stat.ot] 7 Jul 2010

arxiv: v1 [stat.ot] 7 Jul 2010 Hotelling s test for highly correlated data P. Bubeliny e-ail: bubeliny@karlin.ff.cuni.cz Charles University, Faculty of Matheatics and Physics, KPMS, Sokolovska 83, Prague, Czech Republic, 8675. arxiv:007.094v

More information

FDR- and FWE-controlling methods using data-driven weights

FDR- and FWE-controlling methods using data-driven weights FDR- and FWE-controlling ethods using data-driven weights LIVIO FINOS Center for Modelling Coputing and Statistics, University of Ferrara via N.Machiavelli 35, 44 FERRARA - Italy livio.finos@unife.it LUIGI

More information

The miss rate for the analysis of gene expression data

The miss rate for the analysis of gene expression data Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,

More information

Biostatistics Department Technical Report

Biostatistics Department Technical Report Biostatistics Departent Technical Report BST006-00 Estiation of Prevalence by Pool Screening With Equal Sized Pools and a egative Binoial Sapling Model Charles R. Katholi, Ph.D. Eeritus Professor Departent

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

Boosting with log-loss

Boosting with log-loss Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality

More information

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co

More information

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering

More information

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley osig 1 Winter Seester 2018 Lesson 6 27 February 2018 Outline Perceptrons and Support Vector achines Notation...2 Linear odels...3 Lines, Planes

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression

Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression Advances in Pure Matheatics, 206, 6, 33-34 Published Online April 206 in SciRes. http://www.scirp.org/journal/ap http://dx.doi.org/0.4236/ap.206.65024 Inference in the Presence of Likelihood Monotonicity

More information

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5,

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, 2015 31 11 Motif Finding Sources for this section: Rouchka, 1997, A Brief Overview of Gibbs Sapling. J. Buhler, M. Topa:

More information

A method to determine relative stroke detection efficiencies from multiplicity distributions

A method to determine relative stroke detection efficiencies from multiplicity distributions A ethod to deterine relative stroke detection eiciencies ro ultiplicity distributions Schulz W. and Cuins K. 2. Austrian Lightning Detection and Inoration Syste (ALDIS), Kahlenberger Str.2A, 90 Vienna,

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

Bootstrapping Dependent Data

Bootstrapping Dependent Data Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly

More information

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

Testing equality of variances for multiple univariate normal populations

Testing equality of variances for multiple univariate normal populations University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

Revealed Preference and Stochastic Demand Correspondence: A Unified Theory

Revealed Preference and Stochastic Demand Correspondence: A Unified Theory Revealed Preference and Stochastic Deand Correspondence: A Unified Theory Indraneel Dasgupta School of Econoics, University of Nottingha, Nottingha NG7 2RD, UK. E-ail: indraneel.dasgupta@nottingha.ac.uk

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

Data-driven hypothesis weighting increases detection power in genome-scale multiple testing

Data-driven hypothesis weighting increases detection power in genome-scale multiple testing CORRECTION NOTICE Nat. Methods; doi 10.1038/neth.3885 (published online 30 May 2016). Data-driven hypothesis weighting increases detection power in genoe-scale ultiple testing Nikolaos Ignatiadis, Bernd

More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Christopher R. Genovese Department of Statistics Carnegie Mellon University joint work with Larry Wasserman

More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction TWO-STGE SMPLE DESIGN WITH SMLL CLUSTERS Robert G. Clark and David G. Steel School of Matheatics and pplied Statistics, University of Wollongong, NSW 5 ustralia. (robert.clark@abs.gov.au) Key Words: saple

More information

Revealed Preference with Stochastic Demand Correspondence

Revealed Preference with Stochastic Demand Correspondence Revealed Preference with Stochastic Deand Correspondence Indraneel Dasgupta School of Econoics, University of Nottingha, Nottingha NG7 2RD, UK. E-ail: indraneel.dasgupta@nottingha.ac.uk Prasanta K. Pattanaik

More information

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

Introduction to Machine Learning. Recitation 11

Introduction to Machine Learning. Recitation 11 Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,

More information

Bayes Decision Rule and Naïve Bayes Classifier

Bayes Decision Rule and Naïve Bayes Classifier Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.

More information

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna

More information

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

Robustness and Regularization of Support Vector Machines

Robustness and Regularization of Support Vector Machines Robustness and Regularization of Support Vector Machines Huan Xu ECE, McGill University Montreal, QC, Canada xuhuan@ci.cgill.ca Constantine Caraanis ECE, The University of Texas at Austin Austin, TX, USA

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

What is Probability? (again)

What is Probability? (again) INRODUCTION TO ROBBILITY Basic Concepts and Definitions n experient is any process that generates well-defined outcoes. Experient: Record an age Experient: Toss a die Experient: Record an opinion yes,

More information

Bayesian Learning. Chapter 6: Bayesian Learning. Bayes Theorem. Roles for Bayesian Methods. CS 536: Machine Learning Littman (Wu, TA)

Bayesian Learning. Chapter 6: Bayesian Learning. Bayes Theorem. Roles for Bayesian Methods. CS 536: Machine Learning Littman (Wu, TA) Bayesian Learning Chapter 6: Bayesian Learning CS 536: Machine Learning Littan (Wu, TA) [Read Ch. 6, except 6.3] [Suggested exercises: 6.1, 6.2, 6.6] Bayes Theore MAP, ML hypotheses MAP learners Miniu

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE7C (Spring 018: Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee7c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee7c@berkeley.edu October 15,

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

Weighted Hypothesis Testing. April 7, 2006

Weighted Hypothesis Testing. April 7, 2006 Weighted Hypothesis Testing Larry Wasseran and Kathryn Roeder 1 Carnegie Mellon University April 7, 2006 The power of ultiple testing procedures can be increased by using weighted p-values Genovese, Roeder

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

An Introduction to Meta-Analysis

An Introduction to Meta-Analysis An Introduction to Meta-Analysis Douglas G. Bonett University of California, Santa Cruz How to cite this work: Bonett, D.G. (2016) An Introduction to Meta-analysis. Retrieved fro http://people.ucsc.edu/~dgbonett/eta.htl

More information

AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS

AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS Statistica Sinica 6 016, 1709-178 doi:http://dx.doi.org/10.5705/ss.0014.0034 AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS Nilabja Guha 1, Anindya Roy, Yaakov Malinovsky and Gauri

More information

Sampling How Big a Sample?

Sampling How Big a Sample? C. G. G. Aitken, 1 Ph.D. Sapling How Big a Saple? REFERENCE: Aitken CGG. Sapling how big a saple? J Forensic Sci 1999;44(4):750 760. ABSTRACT: It is thought that, in a consignent of discrete units, a certain

More information

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing

More information

OBJECTIVES INTRODUCTION

OBJECTIVES INTRODUCTION M7 Chapter 3 Section 1 OBJECTIVES Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance, and

More information

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax:

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax: A general forulation of the cross-nested logit odel Michel Bierlaire, EPFL Conference paper STRC 2001 Session: Choices A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics,

More information

Some Proofs: This section provides proofs of some theoretical results in section 3.

Some Proofs: This section provides proofs of some theoretical results in section 3. Testing Jups via False Discovery Rate Control Yu-Min Yen. Institute of Econoics, Acadeia Sinica, Taipei, Taiwan. E-ail: YMYEN@econ.sinica.edu.tw. SUPPLEMENTARY MATERIALS Suppleentary Materials contain

More information

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,

More information

Principal Components Analysis

Principal Components Analysis Principal Coponents Analysis Cheng Li, Bingyu Wang Noveber 3, 204 What s PCA Principal coponent analysis (PCA) is a statistical procedure that uses an orthogonal transforation to convert a set of observations

More information

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters journal of ultivariate analysis 58, 96106 (1996) article no. 0041 The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Paraeters H. S. Steyn

More information

Bayesian Approach for Fatigue Life Prediction from Field Inspection

Bayesian Approach for Fatigue Life Prediction from Field Inspection Bayesian Approach for Fatigue Life Prediction fro Field Inspection Dawn An and Jooho Choi School of Aerospace & Mechanical Engineering, Korea Aerospace University, Goyang, Seoul, Korea Srira Pattabhiraan

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models 2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511

More information

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine CSCI699: Topics in Learning and Gae Theory Lecture October 23 Lecturer: Ilias Scribes: Ruixin Qiang and Alana Shine Today s topic is auction with saples. 1 Introduction to auctions Definition 1. In a single

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

Tracking using CONDENSATION: Conditional Density Propagation

Tracking using CONDENSATION: Conditional Density Propagation Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation

More information

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 28 A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments Hongmei Jiang Rebecca

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

1 Proof of learning bounds

1 Proof of learning bounds COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #4 Scribe: Akshay Mittal February 13, 2013 1 Proof of learning bounds For intuition of the following theore, suppose there exists a

More information

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES

More information

Generalized Queries on Probabilistic Context-Free Grammars

Generalized Queries on Probabilistic Context-Free Grammars IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 20, NO. 1, JANUARY 1998 1 Generalized Queries on Probabilistic Context-Free Graars David V. Pynadath and Michael P. Wellan Abstract

More information

When Short Runs Beat Long Runs

When Short Runs Beat Long Runs When Short Runs Beat Long Runs Sean Luke George Mason University http://www.cs.gu.edu/ sean/ Abstract What will yield the best results: doing one run n generations long or doing runs n/ generations long

More information

Support recovery in compressed sensing: An estimation theoretic approach

Support recovery in compressed sensing: An estimation theoretic approach Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de

More information

Optimal nonlinear Bayesian experimental design: an application to amplitude versus offset experiments

Optimal nonlinear Bayesian experimental design: an application to amplitude versus offset experiments Geophys. J. Int. (23) 155, 411 421 Optial nonlinear Bayesian experiental design: an application to aplitude versus offset experients Jojanneke van den Berg, 1, Andrew Curtis 2,3 and Jeannot Trapert 1 1

More information

Birthday Paradox Calculations and Approximation

Birthday Paradox Calculations and Approximation Birthday Paradox Calculations and Approxiation Joshua E. Hill InfoGard Laboratories -March- v. Birthday Proble In the birthday proble, we have a group of n randoly selected people. If we assue that birthdays

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

A New Class of APEX-Like PCA Algorithms

A New Class of APEX-Like PCA Algorithms Reprinted fro Proceedings of ISCAS-98, IEEE Int. Syposiu on Circuit and Systes, Monterey (USA), June 1998 A New Class of APEX-Like PCA Algoriths Sione Fiori, Aurelio Uncini, Francesco Piazza Dipartiento

More information

Optimum Value of Poverty Measure Using Inverse Optimization Programming Problem

Optimum Value of Poverty Measure Using Inverse Optimization Programming Problem International Journal of Conteporary Matheatical Sciences Vol. 14, 2019, no. 1, 31-42 HIKARI Ltd, www.-hikari.co https://doi.org/10.12988/ijcs.2019.914 Optiu Value of Poverty Measure Using Inverse Optiization

More information

Support Vector Machines MIT Course Notes Cynthia Rudin

Support Vector Machines MIT Course Notes Cynthia Rudin Support Vector Machines MIT 5.097 Course Notes Cynthia Rudin Credit: Ng, Hastie, Tibshirani, Friedan Thanks: Şeyda Ertekin Let s start with soe intuition about argins. The argin of an exaple x i = distance

More information

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,

More information

Lost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies

Lost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies OPERATIONS RESEARCH Vol. 52, No. 5, Septeber October 2004, pp. 795 803 issn 0030-364X eissn 1526-5463 04 5205 0795 infors doi 10.1287/opre.1040.0130 2004 INFORMS TECHNICAL NOTE Lost-Sales Probles with

More information

Gene Selection for Colon Cancer Classification using Bayesian Model Averaging of Linear and Quadratic Discriminants

Gene Selection for Colon Cancer Classification using Bayesian Model Averaging of Linear and Quadratic Discriminants Gene Selection for Colon Cancer Classification using Bayesian Model Averaging of Linear and Quadratic Discriinants Oyebayo Ridwan Olaniran*, Mohd Asrul Affendi Abdullah Departent of Matheatics and Statistics,

More information