Fast Two-Sample Permutation Tests, Even When One Sample is Large, that Efficiently Maximize Power Under Crude Monte Carlo Sampling

J.D. Opdyke, Economic Consulting Group, Andersen, LLP, Boston, MA

ABSTRACT

I present a method for quickly performing multiple nonparametric two-sample permutation tests on continuous data in SAS, even when one sample is large. I maximize statistical power (within the context of a crude Monte Carlo approach) by oversampling: drawing more permutation samples than desired, deleting duplicates, and then selecting the desired number of samples from the remainder. I determine the optimal number of samples to oversample based on sampling probability and the runtime of a sampling procedure (PROC PLAN). Implementing oversampling with nearly optimal numbers of samples increases start-to-finish runtime typically by only 5%, and always by less than 10%. Using telecommunications performance measurement data from multiple sources with a wide range of sample size pairs, I benchmark start-to-finish runtime against a) another SAS procedure (PROC MULTTEST), b) another SAS program written for the same purpose, and c) Cytel's PROC TWOSAMPL, with very favorable results. The relative benchmark speeds would be identical if applied to data from randomized controlled studies.

INTRODUCTION

Permutation tests are as old as modern statistics, [1] and their statistical properties are well understood and thoroughly documented in the statistical literature. [2] Though not always as powerful as their parametric counterparts that rely on asymptotic theory, they sometimes have equal or even greater power. [3] Often they can be used when asymptotic theory falls short (e.g. small samples and the Central Limit Theorem), and when fully enumerated, they provide gratifyingly exact results (as opposed to approximations based on asymptotic theory). [4] Most important, however, is their reliance on very few distributional assumptions, [5] giving permutation tests a much broader range of application.

Until recently the major drawback of permutation tests has been their high computational demands. Fully enumerating a permutation test requires calculating the test statistic appropriate for the hypotheses being tested for every possible two-sample combination of the data points. [6] Then the value of the test statistic based on the original two samples must be compared to those based on all the permutation samples to obtain a p-value, [7] the result of the test. [8] Drawing only a sample of all possible samples, as is typically done, still has been associated with prohibitive computer runtimes. Recent advances in computing capacity and speed, however, increasingly have relaxed this constraint. But efficient statistical code still is needed to most effectively exploit these advances and to ensure that the choice of method is driven as much by statistical theory, and as little by technological constraint, as possible. The goal of the methods described below is to contribute to this effort.

[1] Permutation tests were advocated by one of the fathers of modern statistics, Sir R.A. Fisher, as early as the 1920s.
[2] Pesarin (2001) and Mielke and Berry (2001) contain extensive bibliographies.
[3] For just one example, see Anderson and Legendre (1999).
[4] In fact, it was Fisher who correctly characterized parametric tests relying on asymptotic theory as mere approximations to the exact results of fully enumerated permutation tests (Good (1994), p.4, citing Fisher).
[5] Exchangeability of the data under both the null and alternate hypotheses is the major requirement of permutation tests. Good (1994) states that a permutation test requires only "that the underlying distributions are symmetric, and/or the alternatives are simples [sic] shifts in value," though there are cases satisfying these criteria where the basic, nonparametric permutation test discussed here is not the most appropriate method and should not be relied upon (e.g. the Behrens-Fisher problem; see Pesarin (2001), Ch. 10).

IMPLEMENTING PERMUTATION TESTS IN SAS

Two procedures in SAS can be used to perform two-sample nonparametric permutation tests: PROC MULTTEST [9] and PROC PLAN.
The former directly samples the input dataset itself, while the latter generates a record-by-record list identifying those records on the input dataset to include in the samples. This list subsequently must be merged with the original data to obtain the corresponding data points. In addition, PROC MULTTEST actually conducts the permutation test and provides a p-value (assuming that, for continuous data, a pooled-variance t-test is the appropriate test statistic); the results of PROC PLAN, on the other hand, must be manipulated manually to calculate the value of the test statistic associated with the original sample pair, and then compare it to those associated with all the permutation samples to obtain a p-value. However, this entire process using PROC PLAN still is much faster than PROC MULTTEST under most conditions, as shown in the benchmark section below, and it also provides more flexibility in the definition of the test statistic.

PROC PLAN does have a sample size constraint: the product of the sum of the two sample sizes (n1 + n2) and the number of samples being drawn (T) cannot exceed 2^31 or the procedure terminates. Yet this can be circumvented by inserting calls to PROC PLAN in a loop which cycles roundup((n1 + n2) * T / 2^31) times, each loop drawing T * [roundup((n1 + n2) * T / 2^31)]^-1 samples until T samples have been drawn (see code in Appendix A). And this looping in and of itself does not slow the total runtime of the procedure. The relative speed of PROC PLAN when samples are large enough to require such looping is at least 20 times faster than PROC MULTTEST.

Another important issue regarding the implementation of these procedures is the sampling method: both PROC MULTTEST and PROC PLAN can perform crude Monte Carlo sampling without replacement within a sample, as required of a permutation test, but neither can avoid the possibility of drawing the same sample more than once. In other words, when drawing a sample of samples, both procedures can only sample the entire samples with replacement. This problem of drawing duplicate samples, its effect on the statistical power of the permutation test, and a proposed solution that maximizes power under crude Monte Carlo sampling [10] are discussed below.

DETERMINING THE NUMBER OF PERMUTATION SAMPLES

Full enumeration of permutation tests quickly becomes infeasible as sample sizes increase because the number of possible sample combinations becomes very large, even for relatively small sample sizes. [11] However, full enumeration is unnecessary as the permutation test can be based on only a sample of all possible samples. Recognizing that the resulting p-value is simply an estimated proportion distributed binomially, we can determine its coefficient of variation (cv) [12] and confidence interval as functions of the number of samples (T) drawn. For example, to achieve cv < 0.10 with a p-value equal to the critical value of the test (p-value = α = 0.05), T = 1,901. [13] This yields a 95% confidence interval of within 0.01 of the estimated p-value, [14] which for most practical purposes is a sufficiently precise estimate of the fully enumerated p-value. [15] Larger values of T will yield greater precision, but the marginal increases in precision decrease rapidly and nonlinearly (see Graph 1 below). [16]

[6] Of course, the sample sizes of all these possible combinations are the same as the original two samples.
[7] The p-value is simply a proportion: the percentage of permutation sample test statistic values at least as large as that based on the original data.
[8] This paper addresses only the two-sample permutation test, though its methods readily can be applied to permutation tests with more complex study designs.
[9] PROC MULTTEST is a versatile procedure with many functions; I address only its specific application to two-sample nonparametric permutation tests on continuous data.
[10] These are simple, uniform random draws as opposed to more complicated sampling algorithms, such as importance sampling (see Mehta et al. (1988)).
[11] The number of possible sample combinations is given by $\frac{(n_1+n_2)!}{n_1!\,n_2!}$, which for n1 = 3 and n2 = 3 is just 20, but for the slightly larger sample pair of n1 = 29 and n2 = 30, is a sizeable 59,132,290,782,430,712.
[12] The coefficient of variation is a unitless measure of relative spread. It is simply the standard error of a statistic divided by its mean (see Zar (1999), p. 40).
[13] Solving $cv = \frac{\sqrt{p(1-p)/T}}{p} = 0.10$ for T, with p = p-value = 0.05, yields T = 1,900, so to obtain cv < 0.10, we must increase T to 1,901.
This calculation also is performed in Efron and Tibshirani (1993).

[Graph 1: Permutation p-value: cv and 1.96*standard error by T (# of permutation samples). x-axis: T (# of permutation samples); plotted series: coefficient of variation of the p-value and 1.96*standard error of the p-value.]

THE PROBLEM OF DRAWING DUPLICATE SAMPLES

The above calculation assumes, however, that the T permutation samples are drawn without replacement, i.e. that no sample was drawn more than once. [17] More importantly, the permutation test will lose statistical power if there are any duplicate samples among these T samples. [18] Thus, maximizing power requires a draw of a unique set of samples. This can be accomplished in different ways, but the fastest when using PROC PLAN as described herein is to simply oversample by drawing more than T samples (say, r samples), deleting any duplicate samples, and then randomly selecting T samples from the remaining set. This approach does not violate any statistical assumptions required by a nonparametric permutation test, specifically, that the distribution of all possible samples is uniform with

(1) $P(Y = y) = \frac{1}{\;\frac{(n_1+n_2)!}{n_1!\,n_2!}\;}$

so it is a valid approach for obtaining T unique samples. However, it brings up two interrelated questions, one regarding sampling probability, the other regarding computational efficiency: how many samples more than T (r - T) are required so that we are left with at least T unique samples once duplicates are deleted? r = 2T? r = 3T? And what is the runtime tradeoff of drawing larger factors of T samples, which takes more time simply because the number of samples is larger, but also reduces the chance of coming up short with less than T unique samples and having to redraw r samples by calling PROC PLAN more than once? Solving the problem of minimizing expected runtime addresses both of these questions below.
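The oversample, de-duplicate, and select procedure described above can be sketched in a few lines. The following is a minimal illustration in Python rather than SAS, and it is not the author's program: the function name `draw_unique_samples`, its parameters, and the use of Python's `random` module in place of PROC PLAN are all assumptions made for the example.

```python
import random
from math import comb


def draw_unique_samples(n_total, n_small, T, r, seed=1):
    """Oversample r index sets, drop duplicates, keep T unique ones.

    n_total is n1 + n2, n_small is the size of the smaller sample,
    T is the number of unique permutation samples wanted, r >= T is
    the number drawn per call. Returns (samples, number_of_calls).
    """
    if r < T:
        raise ValueError("r must be at least T")
    if comb(n_total, n_small) < T:
        # Guard against looping forever when fewer than T samples exist.
        raise ValueError("fewer than T distinct samples exist")
    rng = random.Random(seed)
    calls = 0
    while True:
        calls += 1
        # Each sample is drawn without replacement within itself, but
        # the r samples are independent, so whole samples can repeat.
        drawn = [tuple(sorted(rng.sample(range(n_total), n_small)))
                 for _ in range(r)]
        unique = list(dict.fromkeys(drawn))  # delete duplicate samples
        if len(unique) >= T:
            # Randomly select T of the remaining unique samples.
            return rng.sample(unique, T), calls
        # Otherwise redraw all r samples, as the text describes.


samples, calls = draw_unique_samples(72, 4, T=1901, r=1908)
print(len(samples), "unique samples after", calls, "call(s)")
```

The returned call count corresponds to the number of times r samples had to be drawn, which is the quantity whose expectation the next section works out.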
[14] This calculation assumes that both the sample size (in this case, T) is sufficiently large and the value of the proportion (in this case, the p-value at p-value = α = 0.05) is sufficiently nonextreme (i.e. not close to 0 or 1) for the binomially distributed p-value to closely approximate the normal distribution. If these criteria are satisfied, the 95% confidence interval is defined by the range of p-value ± 1.96 standard errors. T = 1,901 and p-value = 0.05 satisfy a very strict standard in the literature (npq > 25, or T*p-value*(1 - p-value) > 25; see Evans, Hastings, and Peacock (1993), p. 39) for such an approximation to be considered valid.
[15] This approximate level of precision for an estimate of a proportion was deemed "excellent" (for T = 2,000) by Braun and Feng (2001) in their recent study of optimal permutation tests.
[16] An efficient alternative is to increase T only if the p-value is close to the critical value of the test.
[17] As mentioned above, permutation tests require that each sample be drawn without replacement, i.e. with no item drawn more than once within a sample. To maximize power, the sample of permutation samples also must be drawn without replacement, i.e. no entire sample among the T samples drawn may be drawn more than once.
[18] Imagine, in the extreme, that the same sample pair is drawn all T times, or close to all T times. It would be impossible, or nearly so, to detect with a specified level of statistical confidence a difference between the two samples even if such a difference did, in fact, exist. In other words, it would be impossible to reject the null hypothesis of no difference even if there was a difference, and the extent to which a test can correctly reject the null hypothesis is its level of statistical power.

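The sample-size arithmetic behind T = 1,901 (footnotes 11, 13, and 14) is easy to check numerically. The sketch below is a Python illustration under stated assumptions: the helper names `cv_of_pvalue`, `min_T_for_cv`, and `ci_half_width` are hypothetical, and exact fractions are used for the threshold so that the boundary case cv = 0.10 at T = 1,900 is not blurred by floating-point rounding.

```python
from fractions import Fraction
from math import comb, floor, sqrt


def cv_of_pvalue(p, T):
    """Coefficient of variation of a binomial proportion p estimated from T draws."""
    return sqrt(p * (1 - p) / T) / p


def min_T_for_cv(p, cv_max):
    """Smallest T giving cv strictly below cv_max.

    cv < cv_max  <=>  T > (1 - p) / (p * cv_max**2); computed exactly.
    """
    p, cv_max = Fraction(p), Fraction(cv_max)
    bound = (1 - p) / (p * cv_max ** 2)
    return floor(bound) + 1


def ci_half_width(p, T, z=1.96):
    """Half-width of the normal-approximation 95% confidence interval."""
    return z * sqrt(p * (1 - p) / T)


T = min_T_for_cv("0.05", "0.10")
print(T)                                 # number of permutation samples
print(cv_of_pvalue(0.05, T))             # just under 0.10
print(ci_half_width(0.05, T))            # just under 0.01
print(T * 0.05 * 0.95 > 25)              # npq > 25 criterion of footnote 14
print(comb(3 + 3, 3), comb(29 + 30, 29)) # sample-combination counts of footnote 11
```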
CHOOSING r TO MINIMIZE EXPECTED RUNTIME

Expected runtime is the product of the time it takes to draw r samples and the expected number of times r samples must be drawn to obtain at least T unique samples. [19] Addressing the latter factor first, the number of times r samples must be drawn before at least T unique samples are obtained follows the geometric distribution, which identifies the number of events occurring before the first success:

(2) $P(S = s) = p\,(1-p)^{s-1}$

where p indicates the probability of success (of obtaining at least T unique samples) for each event (each call to PROC PLAN). The expected value of the geometric distribution is simply 1/p, and p is simply derived from a general form of the familiar collector's problem. This problem asks the question, "How many samples are required, when sampling with replacement, to obtain ("collect") all samples from the sampling distribution?" or more generally, "How many samples are required, when sampling with replacement, to obtain T samples from the sampling distribution?" [20] The number of samples required obviously will follow a probability density function, which in fact, is the sum of geometric random variables, and is presented below, writing $C = \frac{(n_1+n_2)!}{n_1!\,n_2!}$ for the number of possible sample combinations:

(3) $P(\#\text{ unique samples} = j) = \frac{C!}{(C-j)!\,j!}\,\frac{1}{C^{r}}\sum_{i=0}^{j}(-1)^{i}\,\frac{j!}{i!\,(j-i)!}\,(j-i)^{r}$

where r = # of samples drawn and j ≤ r. The probability of obtaining at least T unique samples is simply the cumulative probability of obtaining T, T+1, T+2, ..., r-1, and r unique samples, as shown below:

(4) $p = P(j \ge T) = \sum_{j=T}^{r}\frac{C!}{(C-j)!\,j!}\,\frac{1}{C^{r}}\sum_{i=0}^{j}(-1)^{i}\,\frac{j!}{i!\,(j-i)!}\,(j-i)^{r}$

where r ≥ T. Thus, the expected number of times r samples must be drawn to obtain at least T unique samples is a function of the number of possible sample combinations and r, and is provided below:

(5) expected # of calls to PROC PLAN $= \frac{1}{p} = \left[\sum_{j=T}^{r}\frac{C!}{(C-j)!\,j!}\,\frac{1}{C^{r}}\sum_{i=0}^{j}(-1)^{i}\,\frac{j!}{i!\,(j-i)!}\,(j-i)^{r}\right]^{-1}$

Graph 2 below illustrates this functional relationship between p, 1/p, and r for n1 = 4, n2 = 68, and T = 1,901:

[Graph 2: Probability of at Least T Unique Samples (p) and Expected # of Calls to PROC PLAN (1/p) by r (for n1=4, n2=68, and T=1,901). x-axis: r from 1,901 to 1,921; y-axes: P(j>=1,901) = p and 1/p = expected # of calls.]

Now to return to the first factor determining expected sampling runtime: the time it takes PROC PLAN to draw a sample of r samples. This is simply the runtime of PROC PLAN as a function of, interestingly, not the number of possible two-sample combinations, but rather the sum of the two sample sizes (n1 + n2), as well as the number of samples drawn, r. This is shown in Graph 3 below. [21]

[Graph 3: PROC PLAN Runtime by n1+n2 by r. x-axis: n1+n2 (up to about 1,600,000); y-axis: Real Time (seconds); series: r = 1,901, 2,700, and 3,500.]

Obviously, r and (n1 + n2) are correlated, but runtime is very well predicted (adjusted R^2 = [value lost in transcription]) by the simple ordinary least squares multivariate regression equation below:

PROC PLAN Runtime = PPRT(n1, n2, r) = β0 + β1*(n1 + n2) + β2*r + β3*(n1 + n2)*r

Nonlinearity at about (n1 + n2) = 65,500 and (n1 + n2) = 73,500 prompted the inclusion of dummy and interaction terms, leading to the near perfect prediction (adjusted R^2 = [value lost in transcription]) for PPRT(n1, n2, r) presented in Appendix B (see Graph 4, which is simply a magnification of Graph 3 up to (n1 + n2) = 100,000).

[Graph 4: PROC PLAN Runtime by n1+n2 by r. x-axis: n1+n2 (up to 100,000); y-axis: Real Time (seconds); series: r = 1,901, 2,700, and 3,500.]

Thus, expected runtime g(n1, n2, r, T) is the product of PROC PLAN Runtime and the expected number of calls to PROC PLAN:

(6) expected runtime $= g(n_1, n_2, r, T) = PPRT(n_1, n_2, r) \times \left[\sum_{j=T}^{r}\frac{C!}{(C-j)!\,j!}\,\frac{1}{C^{r}}\sum_{i=0}^{j}(-1)^{i}\,\frac{j!}{i!\,(j-i)!}\,(j-i)^{r}\right]^{-1}$

To get an intuitive feel for r as a function of n1 and n2 (for a given T), note again that the second term of (6) above is a combinatorial function of the sample sizes while the first term is merely a linear function of the sample sizes (see Graph 5 below).

[Graph 5: Estimated PROC PLAN Runtime by r (for n1=4, n2=68, and T=1,901, based on PPRT in Appendix B). x-axis: r from 1,901 to 1,921; y-axis: Estimated Runtime (seconds).]

The combinatorial terms in the second term of (6) end up dominating as sample sizes increase, asymptotically converging to 1.0 (one call to PROC PLAN) faster than the first term (each PROC PLAN runtime) diverges. Hence, for all but the smallest sample sizes, an optimal r, in terms of expected runtime (where ∂g/∂r = 0), will be fairly close to T. Graphs 6 and 7 below present g(n1, n2, r, T) (the product of 1/p in Graph 2 and PPRT in Graph 5 above) and demonstrate an optimal r, r* = 1,908, for T = 1,901, n1 = 4, and n2 = 68 (and number of possible sample combinations = 1,028,790).

[Graph 6: Expected Runtime (1/p * each runtime) by r (for n1=4, n2=68, and T=1,901). x-axis: r from 1,901 to 1,921; y-axis: Expected Runtime (seconds).]

Graph 7 magnifies the relevant expected runtime range.

[Graph 7: Expected Runtime (1/p * each runtime) by r (for n1=4, n2=68, and T=1,901). x-axis: r from 1,901 to 1,921, with r* marked at 1,908; y-axis: Expected Runtime (seconds).]

Unfortunately, the high level of precision needed to calculate numeric solutions for r* for different sample sizes (and T) requires use of a symbolic programming language, and thus, cannot be implemented "on the fly" in SAS. [22] However, for all practical purposes, r* need not be calculated for every combination of values of n1 and n2: nearly optimal r can be calculated for ranges of the number of possible sample combinations because, as shown in Graph 7, the marginal runtime cost of drawing r slightly larger than r* is negligible (though the marginal runtime cost of drawing r smaller than r* is relatively large). Thus, if we create appropriate ranges of the number of sample combinations, and for the low end of each range identify the lowest corresponding (n1 + n2) (because this gives us the lowest marginal increase in PPRT as r increases, and thus, the largest r*), then the corresponding "low-end" r* will never be lower than any other r* for any of the sample pairs within that range. In other words, though not optimal for every combination of sample sizes within the range, the low-end r* will be nearly optimal because it will be slightly larger (never smaller) than all other r* for sample size pairs within that range, and the marginal cost of r being slightly larger than r* is small, if not negligible. Table 1 below shows the values of r used in the program: the low-end r*'s for different ranges of the number of possible sample combinations.

[19] Investigating the possibility of drawing fewer than r samples if PROC PLAN is called more than once increases the complexity of this problem beyond the scope of this paper. Given the proximity to T of the nearly optimal values for r provided in Table 1 below, and the very fast runtime of PROC PLAN when the probability of recalling it is anything but infinitesimal, I suspect small, if any, gains in runtime would result from such an inquiry, though I intend to explore the issue.
[20] Of course, the sampling distribution here is all possible two-sample combinations based on the two original samples of data.
[21] PROC PLAN was run in SAS v.8.2 on a desktop PC with 2GB RAM and a 2GHz Pentium processor. Sample sizes were generated by assigning values of 3, 16, 27 to the smaller of the two samples, and, beginning at 100, assigning values by 100 increments to the larger sample up to 100,000, after which point increments of 10,000 were used up to 1.5 million (though the program has been run on sample pairs as large as 29 and 5,000,029). Three values of r were used: 1,901, 2,700, and 3,500.
[22] I used Mathematica v.4.1 to calculate p in Table 1 from the cumulative distribution function (4) of the generalized collector's problem. This code is available upon request. Good approximations to (4), however, do exist; see Read (1998) and Kuonen (2000).

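Equations (3) through (5) can be evaluated exactly with arbitrary-precision integers, which sidesteps the precision problem mentioned above. The sketch below is an illustration in Python, not the Mathematica code the paper refers to; the function names are hypothetical, and C denotes the number of possible sample combinations, as in Table 1.

```python
from fractions import Fraction
from math import comb


def prob_unique(C, r, j):
    """Equation (3): P(# unique samples = j) when r samples are drawn
    with replacement from C equally likely sample combinations."""
    # Inclusion-exclusion count of the ways r draws cover exactly j
    # given combinations (each of the j appears at least once).
    onto = sum((-1) ** i * comb(j, i) * (j - i) ** r for i in range(j + 1))
    # comb(C, j) ways to choose which j combinations appear; C**r outcomes.
    return Fraction(comb(C, j) * onto, C ** r)


def prob_at_least(C, r, T):
    """Equation (4): p = P(at least T unique samples among r draws)."""
    return sum(prob_unique(C, r, j) for j in range(T, r + 1))


def expected_calls(C, r, T):
    """Equation (5): expected number of calls, 1/p."""
    p = prob_at_least(C, r, T)
    if p == 0:
        raise ValueError("T unique samples cannot be obtained from r draws")
    return 1 / p


# Small check by hand: 3 combinations, 2 draws, both must differ.
print(prob_at_least(3, 2, 2))    # 2/3
print(expected_calls(3, 2, 2))   # 3/2

# The paper's example: n1 = 4, n2 = 68, T = 1,901, r = 1,908.
C = comb(4 + 68, 4)
print(C, float(prob_at_least(C, 1908, 1901)))
```

Multiplying `expected_calls` by a runtime model such as PPRT and scanning r upward from T would locate r* in the sense of equation (6); the regression coefficients needed for that are in Appendix B of the paper and are not reproduced here.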
TABLE 1: Nearly Optimal r ("low-end" r*) Used in Program, Probability (p) of Obtaining T = 1,901 Unique Samples, and Expected # of Calls to PROC PLAN (1/p) by Ranges of # of Possible Sample Combinations, C = (n1 + n2)! / (n1! n2!)

(Several range bounds and all p and 1/p entries were lost in transcription. The r values shown are those coded in Appendix A.)

Range of C | low-end r* | p (lower bound) | 1/p
C < 10,626 | C | 1.0 (assuming C ≥ T) | [lost]
10,626 ≤ C < 52,360 | 2,155 | [lost] (approx.) | [lost]
52,360 ≤ C < [lost] | 1,960 | [lost] (approx.) | [lost]
[lost] ≤ C < [lost],855 | 1,934 | [lost] | [lost]
[lost],855 ≤ C < 1,028,790 | 1,912 | [lost] | [lost]
1,028,790 ≤ C < 10,009,125 | 1,908 | [lost] | [lost]
10,009,125 ≤ C < 25,637,001 | 1,904 | [lost] | [lost]
25,637,001 ≤ C < [lost],290,905 | 1,903 | [lost] | [lost]
[lost],290,905 ≤ C < 5,031,771,045 | 1,902 | [lost] | [lost]
5,031,771,045 ≤ C | 1,901 | [lost] | [lost]

POWER AT WHAT COST?

A good metric for the cost of employing oversampling via Table 1 above in order to maximize statistical power is its start-to-finish runtime compared to that associated with just drawing T samples and ignoring the duplicate sample problem. Employing Table 1 increased start-to-finish runtime typically by 5% but always by less than 10%. Maximizing power arguably is worth this relatively small increase in runtime.

HOW FAST IS IT? RELATIVE SPEED: SOME BENCHMARKS

The start-to-finish runtime of the program using the oversampling method with PROC PLAN as described above is fast relative to other programs and procedures, [23] as seen in Graph 8 below:

[Graph 8: Relative Start-to-Finish Runtime (T = 1,901), bars a through h, where
a = PROC PLAN with oversampling
b = PROC MULTTEST, (n1+n2) < 10,000, 1 Study per Control
c = PROC MULTTEST, (n1+n2) < 10,000, Many Study per Control
d = PROC MULTTEST, 10,000 < (n1+n2) < 100,000
e = PROC MULTTEST, 100,000 < (n1+n2) < 150,000
f = PROC MULTTEST, 1,000,000 < (n1+n2) < 1,500,000
g = Cytel's PROC TWOSAMPL [24]
h = looping in SAS [25]]

The only procedure or program faster than PROC PLAN with oversampling is PROC MULTTEST with small samples and one study group per control group. [26] On the one hand, smaller sample ranges are where one is most likely to need permutation tests. However, this is where the speed differential matters the least in absolute terms: even when comparing two hundred permutation tests with these smaller sample sizes and a study/control group ratio equal to one, PROC MULTTEST was never more than five minutes faster than PROC PLAN. So the tradeoff is several minutes per run with PROC MULTTEST, versus maximum power and flexibility in the definition of the test statistic with PROC PLAN with oversampling.

In addition to the speed of PROC PLAN itself, a number of factors contribute to the speed of the entire program using PROC PLAN with oversampling, including:

- Use of PROC APPEND to set two large datasets together (one on top of the other) whenever possible.
- Judicious use of multiple PROC TRANSPOSEs to evaluate the summarized results of the permutation sampling.
- Most test statistics can be constructed based on just one of the two samples in a pair and, if necessary, the pooled summary statistics of the pair. Thus, when conducting permutation sampling, sample only the smaller of the two samples, but keep track of which sample is used (study or control) when constructing the test statistics based on the permutation samples.
- Setting together the many output datasets from PROC PLAN by entering their names into a long string assigned to a global variable using CALL SYMPUT, and then placing the global variable in a set statement so that all the datasets are set together all at once. Alternatives, such as cumulatively setting or even PROC APPENDing the datasets together one at a time in a loop, can consume a fair amount of runtime.
- If the dataset is large and contains a large percentage of records with the same response variable value (say, zero), delete these records to avoid sorting and later merging them with the PROC PLAN output list. After merging the remaining data with the PROC PLAN output and retaining all PROC PLAN records in the merge, reassign this value to the response variable when it is missing (i.e. when that record did not merge with the PROC PLAN output because it had been deleted).
- Most importantly, if the data contains multiple study groups per control group, there is no need to output control group records multiple times for each corresponding study group when using PROC PLAN as outlined above. The original data simply can be divided into two datasets, one for control group(s) and one for study groups, and each merged separately to the PROC PLAN output (then (PROC) APPENDed together after the merges). Unless one constructs a separate dataset for each permutation test, both PROC MULTTEST and PROC TWOSAMPL require control group records duplicated in the input dataset for each study group against which they are being compared. If the dataset is large, this can take a tremendous amount of sort-time as both procedures require sorted input data.

ABSOLUTE SPEED

When run on data containing 220 sample pairs where the smaller sample was less than 30 observations but the larger sample was sometimes over 64,000 observations, the runtime of the program was 7 minutes, 45 seconds. [27] For data containing 6,682 sample pairs where the smaller sample was less than 30 observations but the larger was sometimes over 5,000,000 observations, the runtime was 8 hours, 36 minutes. The former example obviously is more typical of the contexts in which permutation tests are used, but the latter is instructive because it is important to know the limits of the methods and software being relied upon. This study shows that the runtime of PROC PLAN with oversampling is not prohibitive even when applied to sample sizes as large as (if not far larger than) would ever be used with permutation tests. [28] The same cannot be said for the three alternate methods.

CONCLUSION

I have presented a method for quickly performing multiple two-sample nonparametric permutation tests on continuous data in SAS, even when one sample is large. I also have presented a practical and efficient method for maximizing power (within the context of a crude Monte Carlo approach) with little increase in runtime (5-10%). The program utilizing these two methods is at least several times faster than several widely available alternatives in almost all circumstances, does not have the sample size constraints of all of these alternatives, and has more flexibility in the definition of the test statistic used compared with two of these alternatives (the two procedures). Only the two-sample test was explored in this study, but the methods presented are flexible and can be adapted to more complex study designs.

To quote a medical study that empirically and favorably compares permutation tests to their asymptotic theory counterparts, "with continual improvements in computing speed there seems no important argument remaining against preferred use of this elementary but exact [when fully enumerated] and flexible method" (Bullmore et al. (1999), p.42). Use of the methods presented in this paper hopefully will contribute to the effort of efficiently exploiting continual advances in computing speed and capacity so that the choice and implementation of statistical methods will be driven by applicable statistical theory and not technological constraint.

[23] Of course, these alternatives do not delete duplicate samples and thus, for the same number of samples T, have less power than PROC PLAN with oversampling to the extent they draw duplicate samples.
[24] Time did not permit a thorough investigation into the effect of lowering the study/control group ratio to one for PROC TWOSAMPL, though early indications are that its relative runtime decreases somewhat, though not nearly as much as that of PROC MULTTEST.
[25] See Jackson (1998) for a copy of the SAS code employing looping to sample the permutation samples. Beware, however, that the code enters an infinite loop if the number of possible sample combinations for a given sample pair is less than T. Also note that the code, unlike the appropriate definition of a permutation test which includes ties in the numerator of the p-value, splits ties at the boundary after assuming exactly one tie at the boundary (apparently in an attempt to make the test less statistically conservative).
[26] For (n1 + n2) beyond approximately 10,000, the study/control group ratio made less of a difference in PROC MULTTEST's runtime. The times reported for larger (n1 + n2) are for a study/control group ratio = 1, which would be the fastest time for PROC MULTTEST.
[27] All programs were run in SAS v.8.2 on a desktop PC with 2GB RAM and a 2GHz Pentium processor.
[28] One such context is the telecommunications regulatory arena. As a requirement of the Telecommunications Act of 1996, Regional Bell Operating Companies (RBOCs), in order to be permitted to enter the long distance phone service market, must open their local phone service monopolies to competition. To initially establish competitive economic markets, this requires RBOCs to provide service to their competitors' customers that is equivalent to the service they provide to their own customers. Proof of this service parity has been required in the form of thousands of statistical tests on hundreds of service performance metrics (e.g. how quickly is a phone line installed; how quickly is a phone line repaired, etc.). When one of the two samples being compared is small (typically that of the Competitive Local Exchange Carrier's customers) and the Central Limit Theorem cannot be relied upon, State Public Service Commissions often have required the use of permutation tests, even if the sample corresponding to the RBOC's own customers is millions of observations. Of the methods examined in this study, only PROC PLAN with oversampling realistically can be relied upon to handle such comparisons (for continuous data performance metrics) in a timely manner.

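The way Appendix A splits a draw across several PROC PLAN calls when (n1 + n2) times the number of samples exceeds 2^31 can be illustrated outside SAS. The sketch below is Python and only mirrors the arithmetic (a ceiling for the number of calls, equal draws per call, and the remainder added to the last call); the function name `split_draws` is hypothetical.

```python
LIMIT = 2 ** 31  # PROC PLAN's limit on (n1 + n2) * (number of samples drawn)


def split_draws(n1, n2, n_draws):
    """Return the number of samples to draw in each call so that the
    calls together draw n_draws samples."""
    product = (n1 + n2) * n_draws
    n_calls = max(1, -(-product // LIMIT))      # integer ceiling
    per_call, extra = divmod(n_draws, n_calls)
    # All but the last call draw per_call samples; the last call also
    # takes the remainder, so it can be slightly larger than the rest.
    return [per_call] * (n_calls - 1) + [per_call + extra]


print(split_draws(4, 68, 1908))            # one call is enough
print(split_draws(29, 5_000_000, 1901))    # several calls are needed
```

For the largest sample pair mentioned in the paper (29 and 5,000,000 observations) with 1,901 samples, the product is 9,505,055,129, so five calls are needed and each stays under the limit.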
APPENDIX A

*** The code below has been run, and its results verified, many times. Translating file formats, however, can always cause typos. Please notify the author if you spot one.;

*** set global variables;
%let npermsampt=1901;
%let seednum =1;
%let seedorig =1;

*** The PROC PLAN seed(s) can be set to -1 to key off the time of day (but they still must be incremented when PROC PLAN is looped below or the same draw of samples will result). The seed generated can be saved and reused if necessary for verification / process control purposes.;

*** summarized data contains # study group obs, # control group obs;
*** use # of obs on summarized input data to obtain # of permutation tests;
proc summary data=sumdinpt;
  var nobscntl;
  output out=npermtst(drop=_type_) n=nperms;
run;

data npermtst error;
  set npermtst(keep=nperms _FREQ_);
  if nperms~=_freq_ then output error;
  else do;
    output npermtst;
    call symput('npermtst',compress(nperms));
  end;
run;

title "Missing NOBS Control Group";
proc print data=error;
run;
title " ";

*** create permutation samples one sample pair at a time with the macro makesamp;
*** variables passed:
      exprment  = an identifier of the experiment (any number of such "by variables" can be inserted into the macro)
      bigcomb   = identifies which is larger: combins or r (the size of the permutation sample)
      nobsmal   = min(nobstudy, nobscntl)
      combins   = comb(sumofnobs, nobsmal)
      mincomb   = min(combins, r)
      sumofnobs = sum(nobstudy, nobscntl)
      ncalls2pp = number of times to loop on PROC PLAN if nobstudy+nobscntl * size of draw > 2^31
      topdraws  = # of samples to draw in all but last loop
      lastdraw  = # of samples to draw in last loop (takes care of modulus)
      smaller   = identifies which sample - study or control - is smaller and thus, corresponds with the permutation samples
      rcount    = counter for record # of summarized data fed into makesamp... should count up to global variable npermtst;

%macro makesamp(exprment =, bigcomb =, nobsmal =, combins =, mincomb =, sumofnobs =, ncalls2pp =, topdraws =, lastdraw =, smaller =, rcount = );

*** if combins <= r, choose all sample combinations, then select T samples from them;
%if &bigcomb=0 %then %do;
  proc plan seed=&seednum;
    factors drawnum = &combins dataobsid = &nobsmal of &sumofnobs comb / noprint;
    output out = psamp&rcount;
  run;
  %if &combins>&npermsampt %then %do;
    proc plan seed=&seednum;
      factors drawnum = 1 dataobsid = &npermsampt of &combins random / noprint;
      output out = choosmp;
    run;
    data choosmp(keep=drawnum);
      set choosmp(drop=drawnum);
      drawnum=dataobsid;
    run;
    proc sort data=choosmp;
      by drawnum;          /* BY statement assumed; it did not survive transcription */
    run;
    proc sort data=psamp&rcount;
      by drawnum;          /* BY statement assumed */
    run;
    data psamp&rcount;
      merge psamp&rcount choosmp(in=inchoos);
      by drawnum;          /* BY statement assumed */
      if inchoos then output psamp&rcount;
    run;
    *** renumber the selected samples 1 to T;
    data psamp&rcount(drop=drawnum2 temp);
      set psamp&rcount(drop=drawnum);
      retain drawnum2;
      temp=mod(_n_,&nobsmal);
      if _n_=1 then drawnum2=1;
      else if temp=1 then drawnum2=drawnum2+1;
      drawnum=drawnum2;
    run;
  %end;
%end;

8 *** if combins >, check whethe PROC PLAN needs to be looped multiple times -- if not, simply select samples, delete duplicates, and keep T samples. If so, loop it fist to select samples. In eithe case, edaw samples if fewe than T unique samples ae dawn the fist time aound.; %if &bigcomb=1 %then %do; %edaw3: %if &ncalls2pp=1 %then %do; poc plan seed=&seednum; factos dawnum = &mincomb dataobsid = &nobsmal of &sumofnobs andom / nopint; output out = psamp&ecount; poc sot data=psamp&ecount; poc tanspose data=psamp&ecount out=temp pefix=stdy; va dataobsid; poc sot data=temp out=temp nodupkey; by stdy1-stdy&nobsmal; poc sot data=temp; set temp; if last.dawnum; call symput("uniqsampn",compess(dawnum)); %if &uniqsampn<&npemsampt %then %do; set temp; dawsize="&mincomb"; combins = "&sumofnobs" " choose " "&nobsmal"; nuniqdw=&uniqsampn; expeiment ="&expment"; seednum2=&seednum+1; call symput('seednum',compess(seednum2)); title "Fewe than &npemsampt Unique Pemutation Samples Selected fo &expment.: Redaw Pefomed"; poc pint data=temp; title " "; %goto edaw3; %if &uniqsampn>&npemsampt %then %do; data psamp&ecount; set psamp&ecount; if dawnum<=&npemsampt; %edaw4: %if &ncalls2pp>1 %then %do q=1 %to &ncalls2pp; %if &q<&ncalls2pp %then %do; poc plan seed=&q; factos dawnum = &topdaws dataobsid = &nobsmal of &sumofnobs andom / nopint; output out = ptemp&q; %if &q=&ncalls2pp %then %do; poc plan seed=&q; factos dawnum = &lastdaw dataobsid = &nobsmal of &sumofnobs andom / nopint; output out = ptemp&q; data psamp&ecount; set ptemp1; %do k=2 %to &q; data psamp&ecount; set psamp&ecount ptemp&k(in=in&k); if in&k then dawnum=dawnum+(&k-1)*&topdaws; poc sot data=psamp&ecount; poc tanspose data=psamp&ecount out=temp pefix=stdyn; va dataobsid; poc sot data=temp out=temp nodupkey; by stdyn1-stdyn&nobsmal; poc sot data=temp; set temp; if last.dawnum; call symput("uniqsampn",compess(dawnum)); %if &uniqsampn<&npemsampt %then %do;

   set temp;
   dawsize="&mincomb";
   combins = "&sumofnobs" || " choose " || "&nobsmal";
   nuniqdw=&uniqsampn;
   expeiment ="&expment";
   seednum2=&seednum+1;
   call symput('seednum',compress(seednum2));
title "Fewer than &npemsampt Unique Permutation Samples Selected for &expment.: Redraw Performed";
proc print data=temp;
title " ";
%goto edaw4;
%if &uniqsampn>&npemsampt %then %do;
data psamp&ecount;
   set psamp&ecount;
   if dawnum<=&npemsampt;

*** set lengths of any "by variables" in PROC PLAN output equal to those in the original dataset for a smooth merge of the PROC PLAN output with the original dataset;
data psamp&ecount(drop=seedback);
   length expeiment $10. whichsml $4.;
   set psamp&ecount;
   seedback="&seedoig";
   call symput('seednum',compress(seedback));
   expeiment ="&expment";
   whichsml ="&smalle";
%mend;

*** use nested macro to call PROC PLAN to generate random samples for each record of the summarized input dataset;
%macro loopingmaco(numloops);
%do i=1 %to &numloops;
   set sumdinpt(keep=expeiment studynobs contlnobs );
   if &i=_n_ then do;
      n=sum(studynobs,contlnobs);
      k=min(studynobs,contlnobs);
      combin=comb(n,k);
      drop n k;
      *** for versions of SAS 6.12 and older, comb(,) will only go up to about 1*10E70 before terminating, so use the loop below;
      * combin=1;
      * do j=k to 1 by -1;
      *    combin=combin*(n-j+1)/j;
      * end;
      * drop n k j;
      *** The current numeric size limit of SAS is 1.8*10E308 -- if one sample of the pair being tested is less than 30 observations, the larger sample would have to approach 500,000,000,000 observations for the number of possible two-sample combinations to exceed this limit. Equal sample sizes of n1 = n2 = 514 approach this limit (# combinations = *10E307), but one should examine closely whether permutation tests are necessary and/or appropriate under these circumstances.;
      *** the 'table' below was calculated based on npemsampt=1901;
      if combin<10626 then nsamp=combin;
      else if combin<52360 then nsamp=2155;
      else if combin< then nsamp=1960;
      else if combin< then nsamp=1934;
      else if combin< then nsamp=1912;
      else if combin< then nsamp=1908;
      else if combin< then nsamp=1904;
      else if combin< then nsamp=1903;
      else if combin< then nsamp=1902;
      else if combin>= then nsamp=1901;
      call symput('ncomb',combin);
      x=min(combin,nsamp);
      call symput('mincomb',compress(x));
      if combin=x then combbig=0;
      else if combin>x then combbig=1;
      call symput('combbig',compress(combbig));
      nloops=ceil(x*sum(studynobs,contlnobs)/2**31);
      modplus=mod(&npemsampt,nloops);
      toploops=floor(&npemsampt/nloops);
      botmloop=toploops+modplus;
      call symput('ncalls',compress(nloops));
      call symput('topndaw',compress(toploops));
      call symput('lastndaw',compress(botmloop));
      *** assign names of by variables;
      call symput('expmnt',expeiment);
      if studynobs<=contlnobs then whichsml="stdy";
      else whichsml="cntl";
      *** keep track of which sample (size) corresponds to the permutation samples for calculating the test statistic;
      call symput('whichsml',compress(whichsml));
      call symput('nsmalle', compress(min(studynobs,contlnobs)));
      call symput('sumnobs', compress(sum(studynobs,contlnobs)));
      output;
   end;
%makesamp(expment =&expmnt, bigcomb =&combbig, nobsmal =&nsmalle,
          combins =&ncomb, mincomb =&mincomb, sumofnobs =&sumnobs,
          ncalls2pp =&ncalls, topdaws =&topndaw, lastdaw =&lastndaw,
          smalle =&whichsml, ecount =&i );
%mend;
%loopingmaco(&npemtst);

*** now using CALL SYMPUT, place all the PROC PLAN output datasets (psamp1, psamp2, etc.) into a long string in a global variable so they can all be set together at once (much faster than setting or appending them cumulatively in a loop). Then merge this entire PROC PLAN output dataset with the original dataset by the appropriate by variables (e.g. expeiment) and the record id variable (above, dataobsid).;

APPENDIX B

PROC PLAN RunTime, PPRT(n1, n2, r), regression results:
Left hand side variable = runtime seconds
adjusted R2 =

Key   Variable                                               Parameter Estimate   t value
A     Intercept
B     (n1 + n2)
C     r
D     (n1 + n2) * r
E     (dummy: (n1 + n2) < 65.5K)
F     (dummy: (n1 + n2) < 65.5K) * (n1 + n2)
G     (dummy: (n1 + n2) < 65.5K) * r
H     (dummy: (n1 + n2) < 65.5K) * (n1 + n2) * r
I     (dummy: 65.5K ≤ (n1 + n2) ≤ 73.5K)
J     (dummy: 65.5K ≤ (n1 + n2) ≤ 73.5K) * (n1 + n2)
K     (dummy: 65.5K ≤ (n1 + n2) ≤ 73.5K) * r
L     (dummy: 65.5K ≤ (n1 + n2) ≤ 73.5K) * (n1 + n2) * r

REFERENCES

Anderson, M.J. and P. Legendre, An empirical comparison of permutation methods for tests of partial regression coefficients in a linear model, Journal of Statistical Computation and Simulation, Vol. 62, No. 3.

Bullmore, Edward T., John Suckling, Stephan Overmeyer, Sophia Rabe-Hesketh, Eric Taylor, and Michael J. Brammer, Global, Voxel, and Cluster Tests, by Theory and Permutation, for a Difference Between Two Groups of Structural MR Images of the Brain, IEEE Transactions on Medical Imaging, Vol. 18, No. 1.

Braun, Thomas M. and Ziding Feng, Optimal Permutation Tests for the Analysis of Group Randomized Trials, Journal of the American Statistical Association, Vol. 96, No. 456, December.

Evans, Merran, Nicholas Hastings, and Brian Peacock, Statistical Distributions, 2nd ed., New York: John Wiley & Sons.

Good, Phillip, Permutation Tests, Springer-Verlag, New York.

Efron, Bradley and Robert Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, London & New York.

Affidavit of John Jackson, On Behalf of MCI-Worldcom, Before the Michigan Public Service Commission, Case No. U-11830, November 18, 1998, ATTACHMENT A, Using Permutation Tests to Evaluate the Significance of CLEC vs. ILEC Service Quality Differentials.

Kuonen, Diego, A Saddlepoint Approximation for the Collector's Problem, The American Statistician, Vol. 54, No. 3, August.

Mehta, Cyrus R., Nitin R. Patel, Pralay Senchaudhuri, Importance Sampling for Estimating Exact Probabilities in Permutational Inference, Journal of the American Statistical Association, Vol. 83, No. 404.

Mielke, Paul W. and Kenneth J. Berry, Permutation Methods: A Distance Function Approach, Springer-Verlag, New York.

Pesarin, Fortunato, Multivariate Permutation Tests with Applications in Biostatistics, John Wiley & Sons, Ltd., New York.

Read, K.L.Q., A Lognormal Approximation for the Collector's Problem, The American Statistician, Vol. 52, No. 2, May.

Zar, Jerrold H., Biostatistical Analysis, 4th ed., Upper Saddle River, NJ: Prentice Hall.

ACKNOWLEDGMENTS

I owe special thanks to Geri Costanza, M.S., who provided me with a number of valuable insights. Any errors are my own.

CONTACT INFORMATION

Please contact the author with your questions and comments at:

J.D. Opdyke
Economic Consulting Group - Andersen, LLP
225 Franklin Street, 13th Floor
Boston, MA
Work Phone: , Fax:
jdopdyke@usa.net
john.d.opdyke@andesen.com
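The oversampling step in the makesamp listing above (draw more samples than needed, delete duplicates, keep T, and redraw with a new seed if fewer than T unique samples remain) can be illustrated outside SAS. The following is a minimal Python sketch, not the paper's implementation: the function name `draw_unique_samples` and all of its parameters are hypothetical, and Python's `random.sample` stands in for PROC PLAN.

```python
import math
import random


def draw_unique_samples(n_total, n_small, n_wanted, n_draw, seed=0):
    """Oversample, delete duplicates, keep the first n_wanted unique samples.

    n_total  -- combined number of observations in both samples
    n_small  -- size of the smaller sample (size of each drawn subset)
    n_wanted -- number of unique permutation samples to keep (T)
    n_draw   -- number of subsets drawn per attempt (n_draw >= n_wanted)
    """
    if n_draw < n_wanted:
        raise ValueError("n_draw must be at least n_wanted")
    if math.comb(n_total, n_small) < n_wanted:
        raise ValueError("fewer possible combinations than samples wanted")

    while True:
        rng = random.Random(seed)
        seen = set()
        unique = []
        for _ in range(n_draw):
            # Sort the observation ids so the same subset drawn in a
            # different order is recognised as a duplicate.
            subset = tuple(sorted(rng.sample(range(1, n_total + 1), n_small)))
            if subset not in seen:
                seen.add(subset)
                unique.append(subset)
        if len(unique) >= n_wanted:
            # Keep only the first n_wanted unique samples.
            return unique[:n_wanted]
        # Too few unique samples: bump the seed and redraw, as the
        # listing does before its %goto.
        seed += 1


# Illustrative call: keep 1901 unique samples of size 5 out of 25
# observations, drawing 1960 per attempt.
samples = draw_unique_samples(25, 5, 1901, 1960, seed=12345)
```

The sketch sorts each subset before comparing, which plays the role of the transpose-and-NODUPKEY sort in the listing; the draw sizes in the example call are illustrative values, not a claim about where they fall in the paper's table.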

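Two pieces of arithmetic in the loopingmaco data step above may be easier to follow in isolation: the commented-out loop that computes the number of combinations without calling comb(), and the splitting of the draws across several PROC PLAN calls. The Python sketch below mirrors those statements under stated assumptions: the function names are hypothetical, and the divisor 2**31 is taken directly from the listing, presumably reflecting a per-call size limit.

```python
import math


def n_combinations(n, k):
    """Number of ways to choose k of n, built up as a running product.

    Mirrors the listing's loop: combin = combin * (n - j + 1) / j
    for j = k down to 1.
    """
    combin = 1.0
    for j in range(k, 0, -1):
        combin = combin * (n - j + 1) / j
    return combin


def split_draws(n_draws, n_total, n_wanted, limit=2**31):
    """Split the sampling across several calls, as in the listing.

    n_draws  -- samples to draw (x, the smaller of combin and nsamp)
    n_total  -- combined number of observations in both samples
    n_wanted -- number of permutation samples to keep (T)
    Returns (number of calls, draws per call, draws in the last call).
    """
    n_calls = math.ceil(n_draws * n_total / limit)
    per_call = n_wanted // n_calls
    last_call = per_call + n_wanted % n_calls
    return n_calls, per_call, last_call


# 25 choose 5, computed by the running product.
print(n_combinations(25, 5))

# A small case needs one call; a large combined sample needs two.
print(split_draws(1901, 1000, 1901))
print(split_draws(1901, 2_000_000, 1901))
```

In the two-call case the per-call counts add back up to T, since the last call absorbs the remainder of the integer division.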

MEASURES OF BLOCK DESIGN EFFICIENCY RECOVERING INTERBLOCK INFORMATION

MEASURES OF BLOCK DESIGN EFFICIENCY RECOVERING INTERBLOCK INFORMATION MEASURES OF BLOCK DESIGN EFFICIENCY RECOVERING INTERBLOCK INFORMATION Walte T. Fedee 337 Waen Hall, Biometics Unit Conell Univesity Ithaca, NY 4853 and Tey P. Speed Division of Mathematics & Statistics,

More information

Goodness-of-fit for composite hypotheses.

Goodness-of-fit for composite hypotheses. Section 11 Goodness-of-fit fo composite hypotheses. Example. Let us conside a Matlab example. Let us geneate 50 obsevations fom N(1, 2): X=nomnd(1,2,50,1); Then, unning a chi-squaed goodness-of-fit test

More information

Analytical time-optimal trajectories for an omni-directional vehicle

Analytical time-optimal trajectories for an omni-directional vehicle Analytical time-optimal tajectoies fo an omni-diectional vehicle Weifu Wang and Devin J. Balkcom Abstact We pesent the fist analytical solution method fo finding a time-optimal tajectoy between any given

More information

( ) [ ] [ ] [ ] δf φ = F φ+δφ F. xdx.

( ) [ ] [ ] [ ] δf φ = F φ+δφ F. xdx. 9. LAGRANGIAN OF THE ELECTROMAGNETIC FIELD In the pevious section the Lagangian and Hamiltonian of an ensemble of point paticles was developed. This appoach is based on a qt. This discete fomulation can

More information

A generalization of the Bernstein polynomials

A generalization of the Bernstein polynomials A genealization of the Benstein polynomials Halil Ouç and Geoge M Phillips Mathematical Institute, Univesity of St Andews, Noth Haugh, St Andews, Fife KY16 9SS, Scotland Dedicated to Philip J Davis This

More information

Bayesian Analysis of Topp-Leone Distribution under Different Loss Functions and Different Priors

Bayesian Analysis of Topp-Leone Distribution under Different Loss Functions and Different Priors J. tat. Appl. Po. Lett. 3, No. 3, 9-8 (6) 9 http://dx.doi.og/.8576/jsapl/33 Bayesian Analysis of Topp-Leone Distibution unde Diffeent Loss Functions and Diffeent Pios Hummaa ultan * and. P. Ahmad Depatment

More information

Likelihood vs. Information in Aligning Biopolymer Sequences. UCSD Technical Report CS Timothy L. Bailey

Likelihood vs. Information in Aligning Biopolymer Sequences. UCSD Technical Report CS Timothy L. Bailey Likelihood vs. Infomation in Aligning Biopolyme Sequences UCSD Technical Repot CS93-318 Timothy L. Bailey Depatment of Compute Science and Engineeing Univesity of Califonia, San Diego 1 Febuay, 1993 ABSTRACT:

More information

Lecture 8 - Gauss s Law

Lecture 8 - Gauss s Law Lectue 8 - Gauss s Law A Puzzle... Example Calculate the potential enegy, pe ion, fo an infinite 1D ionic cystal with sepaation a; that is, a ow of equally spaced chages of magnitude e and altenating sign.

More information

Topic 4a Introduction to Root Finding & Bracketing Methods

Topic 4a Introduction to Root Finding & Bracketing Methods /8/18 Couse Instucto D. Raymond C. Rumpf Office: A 337 Phone: (915) 747 6958 E Mail: cumpf@utep.edu Topic 4a Intoduction to Root Finding & Backeting Methods EE 4386/531 Computational Methods in EE Outline

More information

The Omega Function -Not for Circulation-

The Omega Function -Not for Circulation- The Omega Function -Not fo Ciculation- A. Cascon, C. Keating and W. F. Shadwick The Finance Development Cente London, England July 23 2002, Final Revision 11 Mach 2003 1 Intoduction In this pape we intoduce

More information

Control Chart Analysis of E k /M/1 Queueing Model

Control Chart Analysis of E k /M/1 Queueing Model Intenational OPEN ACCESS Jounal Of Moden Engineeing Reseach (IJMER Contol Chat Analysis of E /M/1 Queueing Model T.Poongodi 1, D. (Ms. S. Muthulashmi 1, (Assistant Pofesso, Faculty of Engineeing, Pofesso,

More information

Contact impedance of grounded and capacitive electrodes

Contact impedance of grounded and capacitive electrodes Abstact Contact impedance of gounded and capacitive electodes Andeas Hödt Institut fü Geophysik und extateestische Physik, TU Baunschweig The contact impedance of electodes detemines how much cuent can

More information

! E da = 4πkQ enc, has E under the integral sign, so it is not ordinarily an

! E da = 4πkQ enc, has E under the integral sign, so it is not ordinarily an Physics 142 Electostatics 2 Page 1 Electostatics 2 Electicity is just oganized lightning. Geoge Calin A tick that sometimes woks: calculating E fom Gauss s law Gauss s law,! E da = 4πkQ enc, has E unde

More information

Relating Branching Program Size and. Formula Size over the Full Binary Basis. FB Informatik, LS II, Univ. Dortmund, Dortmund, Germany

Relating Branching Program Size and. Formula Size over the Full Binary Basis. FB Informatik, LS II, Univ. Dortmund, Dortmund, Germany Relating Banching Pogam Size and omula Size ove the ull Binay Basis Matin Saueho y Ingo Wegene y Ralph Wechne z y B Infomatik, LS II, Univ. Dotmund, 44 Dotmund, Gemany z ankfut, Gemany sauehof/wegene@ls.cs.uni-dotmund.de

More information

Revision of Lecture Eight

Revision of Lecture Eight Revision of Lectue Eight Baseband equivalent system and equiements of optimal tansmit and eceive filteing: (1) achieve zeo ISI, and () maximise the eceive SNR Thee detection schemes: Theshold detection

More information

Directed Regression. Benjamin Van Roy Stanford University Stanford, CA Abstract

Directed Regression. Benjamin Van Roy Stanford University Stanford, CA Abstract Diected Regession Yi-hao Kao Stanfod Univesity Stanfod, CA 94305 yihaoao@stanfod.edu Benjamin Van Roy Stanfod Univesity Stanfod, CA 94305 bv@stanfod.edu Xiang Yan Stanfod Univesity Stanfod, CA 94305 xyan@stanfod.edu

More information

Bounds on the performance of back-to-front airplane boarding policies

Bounds on the performance of back-to-front airplane boarding policies Bounds on the pefomance of bac-to-font aiplane boading policies Eitan Bachmat Michael Elin Abstact We povide bounds on the pefomance of bac-to-font aiplane boading policies. In paticula, we show that no

More information

Regularization. Stephen Scott and Vinod Variyam. Introduction. Outline. Machine. Learning. Problems. Measuring. Performance.

Regularization. Stephen Scott and Vinod Variyam. Introduction. Outline. Machine. Learning. Problems. Measuring. Performance. leaning can geneally be distilled to an optimization poblem Choose a classifie (function, hypothesis) fom a set of functions that minimizes an objective function Clealy we want pat of this function to

More information

An Exact Solution of Navier Stokes Equation

An Exact Solution of Navier Stokes Equation An Exact Solution of Navie Stokes Equation A. Salih Depatment of Aeospace Engineeing Indian Institute of Space Science and Technology, Thiuvananthapuam, Keala, India. July 20 The pincipal difficulty in

More information

Fractional Zero Forcing via Three-color Forcing Games

Fractional Zero Forcing via Three-color Forcing Games Factional Zeo Focing via Thee-colo Focing Games Leslie Hogben Kevin F. Palmowski David E. Robeson Michael Young May 13, 2015 Abstact An -fold analogue of the positive semidefinite zeo focing pocess that

More information

A Comparison and Contrast of Some Methods for Sample Quartiles

A Comparison and Contrast of Some Methods for Sample Quartiles A Compaison and Contast of Some Methods fo Sample Quatiles Anwa H. Joade and aja M. Latif King Fahd Univesity of Petoleum & Mineals ABSTACT A emainde epesentation of the sample size n = 4m ( =, 1, 2, 3)

More information

Introduction to Mathematical Statistics Robert V. Hogg Joeseph McKean Allen T. Craig Seventh Edition

Introduction to Mathematical Statistics Robert V. Hogg Joeseph McKean Allen T. Craig Seventh Edition Intoduction to Mathematical Statistics Robet V. Hogg Joeseph McKean Allen T. Caig Seventh Edition Peason Education Limited Edinbugh Gate Halow Essex CM2 2JE England and Associated Companies thoughout the

More information