A STATISTICAL SIGNIFICANCE TEST FOR PERSON AUTHENTICATION. IDIAP CP 592, rue du Simplon Martigny, Switzerland

Samy Bengio, Johnny Mariéthoz
IDIAP, CP 592, rue du Simplon, Martigny, Switzerland
{bengio,marietho}@idiap.ch

ABSTRACT

Assessing whether two models are statistically significantly different from each other is a very important step in research, although it has unfortunately not received enough attention in the field of person authentication. Several performance measures are often used to compare models, such as half total error rates (HTERs) and equal error rates (EERs), but since most of them are aggregates of two measures, such as the false acceptance rate and the false rejection rate, simple statistical tests cannot be used as is. We show in this paper how to adapt one of these tests in order to compute a confidence interval around one HTER measure or to assess the statistical significance of the difference between two HTER measures. We also compare our technique with other solutions that are sometimes used in the literature and show why they often yield too optimistic results, resulting in false statements about statistical significance.

1. INTRODUCTION

The general field of biometric person authentication is concerned with the use of several biometric traits, such as the voice, the face, the signature, or the fingerprints of persons, in order to assess their identity [1]. In all these cases, researchers tend to use the same performance measures to estimate and compare their models. Most of them, such as the half total error rate (HTER) or the detection cost function (DCF), are in fact aggregates of other measures such as false acceptance rates (FARs) and false rejection rates (FRRs).
However, when it is time to compare a novel model to existing solutions on the same problem, a quick review of the current literature in person authentication shows that either no statistical test is used to assess the difference between models or, worse, statistical tests are wrongly used. This often ends up in over-optimistic results, tending to show, for instance, that the new model is indeed statistically significantly better than the state of the art while it might not be the case in fact.

In this paper, we present a proper method to compute a simple statistical test, known as the test of two proportions, or z-test, adapted to the problem of aggregate measures such as HTER and DCF. In section 2, we first review the main performance measures used in verification tasks; then, in section 3, we recall the purpose of the z-test, based on the Binomial distribution, and some of its variants. In section 4, we extend this test to the case of aggregate measures such as HTER, while in section 5 we present other possible solutions which, as explained, can lead to improper results. Section 6 then compares our solution to these other methods and shows why they yield over-optimistic results. Section 7 concludes this paper with some proposed future work.

2. PERSON AUTHENTICATION MEASURES

A verification system has to deal with two kinds of events: either the person claiming a given identity is the one he claims to be, in which case he is called a client, or he is not, in which case he is called an impostor. Moreover, the system may generally take two decisions: either accept the client, or reject him and decide he is an impostor. Thus, the system may make two types of errors: a false acceptance, when the system accepts an impostor, and a false rejection, when the system rejects a client. Let $FA$ be the total number of false acceptances made by the system, $FR$ be the total number of false rejections, $N_C$ be the number of client accesses, and $N_I$ be the number of impostor accesses.
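In order to be independent of the specific dataset distribution, the performance of the system is often measured in terms of rates of these two different errors, as follows:

$$\mathrm{FAR} = \frac{FA}{N_I}, \qquad \mathrm{FRR} = \frac{FR}{N_C}. \tag{1}$$

A unique measure often used combines these two ratios into the so-called detection cost function (DCF) [2] as follows:

$$\mathrm{DCF} = \mathrm{Cost}(FR) \cdot P(\text{client}) \cdot \mathrm{FRR} + \mathrm{Cost}(FA) \cdot P(\text{impostor}) \cdot \mathrm{FAR} \tag{2}$$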

where $P(\text{client})$ is the prior probability that a client will use the system, $P(\text{impostor})$ is the prior probability that an impostor will use the system, $\mathrm{Cost}(FR)$ is the cost of a false rejection, and $\mathrm{Cost}(FA)$ is the cost of a false acceptance. A particular case of the DCF is known as the half total error rate (HTER), where the costs are equal to 1 and the probabilities are 0.5 each:

$$\mathrm{HTER} = \frac{\mathrm{FAR} + \mathrm{FRR}}{2}. \tag{3}$$

Most authentication systems are measured and compared using HTERs or variations of it. The main question we address in this paper is thus: how can we compute a confidence interval around an HTER, or assess the difference between two systems yielding different HTERs?

Note that in most benchmark databases used in the authentication literature, there is a significant unbalance between the number of client accesses and the number of impostor accesses. This is probably due to the relatively higher cost of obtaining the former with respect to the latter. This unbalance is the main reason why people use HTER to compare models and not the usual classification error used in the machine learning literature.

3. THE Z-TEST ON PROPORTIONS

Several statistical tests are available in the literature. For standard classification tasks, a simple yet often used test is known as the z-test, or test between two proportions. The rationale of this test is the following: given a set of examples, each drawn independently and identically distributed (i.i.d.) from an unknown distribution, our system is going to take a decision for each example, and this decision will be correct or not. Let us now look at the distribution of the number of errors that will be made by our classification system. Since each decision is independent of the others and is binary, it is reasonable to assume that the random variable $\mathbf{X}$ (footnote 1) representing the number of errors should follow a Binomial distribution $B(n, p)$, where $n$ is the number of examples and $p$ is the proportion of errors. Moreover, it is known that a Binomial $B(n, p)$ can be approximated by a Normal distribution $N(\mu, \sigma^2)$ with $\mu = np$ and $\sigma^2 = np(1-p)$ when $n$ is large enough (footnote 2).
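The Normal approximation of the Binomial described above can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the test-set size (400 decisions, 5% errors) are hypothetical and chosen only for illustration.

```python
import math


def normal_approximation(n, p):
    """Mean and variance of the Normal approximating a Binomial B(n, p).

    Also reports whether the rule of thumb n*p*(1-p) > 10 holds.
    """
    mean = n * p
    variance = n * p * (1.0 - p)
    return mean, variance, variance > 10.0


def binomial_pmf(k, n, p):
    """Exact probability of observing k errors out of n i.i.d. decisions."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def normal_pdf(x, mean, variance):
    """Density of N(mean, variance) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


# Hypothetical test set (an assumption): 400 decisions with a 5% error rate.
n, p = 400, 0.05
mean, variance, ok = normal_approximation(n, p)
print(f"mean={mean:.1f} variance={variance:.1f} rule_of_thumb_ok={ok}")
for k in (15, 20, 25):
    # Exact Binomial probability next to the approximating Normal density.
    print(k, round(binomial_pmf(k, n, p), 4), round(normal_pdf(k, mean, variance), 4))
```

With a much smaller error rate on the same number of decisions (for example 1%), the rule of thumb is no longer satisfied and the approximation should be used with more care.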
Finally, if $\mathbf{X} \sim N(np, np(1-p))$, then the distribution of the proportion of errors is $\mathbf{Y} = \mathbf{X}/n \sim N\!\left(p, \frac{p(1-p)}{n}\right)$.

Footnote 1: In this paper we use the following notation: bold letters such as $\mathbf{FA}$ represent random variables, while normal letters such as $FA$ represent a particular value of the underlying random variable.
Footnote 2: A rule of thumb often used is to have $np(1-p)$ larger than 10.

[Figure 1: Normal density of $\mathbf{Y}$ centred on $p$; the area under the curve between $p - \beta$ and $p + \beta$ equals $P(p - \beta < \mathbf{Y} < p + \beta) = \delta$.]
Fig. 1. Confidence intervals are computed using the area under the Normal curve.

3.1. Confidence Intervals

In order to compute a confidence interval around $p$, we can search for bounds $\{p - \beta, p + \beta\}$ such that

$$P(p - \beta < \mathbf{Y} < p + \beta) = \delta \tag{4}$$

where $\delta$ represents our confidence. This is called a two-sided test since we are searching for two bounds around $p$. Fortunately, finding $\beta$ in (4) for a given $\delta$ can be done efficiently for the Normal distribution. Figure 1 illustrates the problem graphically and Figure 2 summarizes the procedure to obtain the confidence interval.

3.2. Difference Between Proportions

Alternatively, if one wants to verify whether a given proportion of errors $p_A$ is statistically significantly different from another proportion $p_B$, a similar test can be performed. In the case where we already know that $p_A$ cannot be lower than $p_B$, a one-sided test is used; otherwise we use a two-sided test. Denoting by $\mathbf{Y}_A$ and $\mathbf{Y}_B$ the random variables representing the distributions of $p_A$ and $p_B$ respectively, the one-sided test is based on

$$P(\mathbf{Y}_A - \mathbf{Y}_B < p_A - p_B) = \delta \tag{5}$$

while the two-sided test is based on

$$P(|\mathbf{Y}_A - \mathbf{Y}_B| < |p_A - p_B|) = \delta \tag{6}$$

which can be solved using the fact that the difference between two independent Normal distributions is a Normal distribution where the mean is the difference between the two Normal means and the variance is the sum of the two Normal variances. Hence, if $\mathbf{Y}_A$ is not statistically different from $\mathbf{Y}_B$, then

$$\mathbf{Y}_A - \mathbf{Y}_B \sim N\!\left(0, \frac{p_A(1-p_A) + p_B(1-p_B)}{n}\right) \tag{7}$$

and if $\delta$ is higher than a predefined value such as 95%, then one can state that $p_A$ is significantly different from $p_B$. Note that a better estimate of the variance in (7) can be obtained when assuming $p_A = p_B$, which should be the case if they are not significantly different. In that case, equation (7) becomes

$$\mathbf{Y}_A - \mathbf{Y}_B \sim N\!\left(0, \frac{2p(1-p)}{n}\right) \quad \text{with} \quad p = \frac{p_A + p_B}{2}. \tag{8}$$

Note however that using this test to verify whether two models give statistically significantly different results on the same test database relies on a wrong hypothesis, since $\mathbf{Y}_A$ and $\mathbf{Y}_B$ are not really independent: they correspond to decisions taken on the same test set.

3.3. Dependent Case

One possible solution proposed in [3] is to only take into account the examples for which the two models disagree. Let $p_{AB}$ be the proportion of examples correctly classified by model A and incorrectly classified by model B, and similarly let $p_{BA}$ be the proportion of examples correctly classified by model B and incorrectly classified by model A. In that case, the distribution $\mathbf{Y}_{AB}$ of the difference between the proportions of errors committed by each model is still Normal and, assuming the two models are not different from each other, should follow

$$\mathbf{Y}_{AB} \sim N\!\left(0, \frac{p_{AB} + p_{BA}}{n}\right) \tag{9}$$

with the corresponding two-sided test

$$P(|\mathbf{Y}_{AB}| < |p_{AB} - p_{BA}|) = \delta. \tag{10}$$

This test is in fact very similar to the well-known McNemar test, based on a $\chi^2$ distribution. In the literature, most people adopt equation (8) and some adopt equation (9); remember that in order to use equation (9), one needs to have access to all the scores of both models, and not just the numbers of errors. When possible, we will look at both solutions here, for the case of person authentication.

4. A STATISTICAL TEST FOR HTERS

HTERs are not proportions, but they are an average of two well-defined proportions, FAR and FRR.
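Before the extension to HTERs, the proportion tests of section 3 (equations 7 to 10) can be sketched as follows. This is an illustrative sketch under stated assumptions, not code from the paper: the function names and the example counts are hypothetical, and the two-sided confidence $\delta$ is obtained from the standard Normal cumulative distribution via Python's `statistics.NormalDist`. No guard is included for a zero variance (for example when the two models never disagree).

```python
import math
from statistics import NormalDist


def two_sided_confidence(z):
    """delta = P(|Z| < |z|) for a standard Normal variable Z."""
    return 2.0 * NormalDist().cdf(abs(z)) - 1.0


def z_independent(p_a, p_b, n, pooled=True):
    """z statistic for two error proportions measured on n examples each.

    pooled=True uses the variance of equation (8), otherwise equation (7).
    """
    if pooled:
        p = (p_a + p_b) / 2.0
        variance = 2.0 * p * (1.0 - p) / n
    else:
        variance = (p_a * (1.0 - p_a) + p_b * (1.0 - p_b)) / n
    return (p_a - p_b) / math.sqrt(variance)


def z_dependent(n_ab, n_ba, n):
    """z statistic of equations (9)-(10), using only the disagreements.

    n_ab: examples correct for model A and wrong for model B.
    n_ba: examples correct for model B and wrong for model A.
    """
    p_ab, p_ba = n_ab / n, n_ba / n
    variance = (p_ab + p_ba) / n
    return (p_ab - p_ba) / math.sqrt(variance)


# Hypothetical numbers (assumptions): 1000 examples, 10% vs 12% errors,
# with 10 and 30 examples on which only one of the two models is wrong.
z_ind = z_independent(0.10, 0.12, 1000)
z_dep = z_dependent(30, 10, 1000)
print(f"independent: z={z_ind:.3f} delta={two_sided_confidence(z_ind):.3f}")
print(f"dependent:   z={z_dep:.3f} delta={two_sided_confidence(z_dep):.3f}")
```

In this toy case the two models err on largely the same examples, so the dependent test reports a much higher confidence than the independent one for the same difference of error rates.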
Given this, and assuming some hypotheses regarding FAR and FRR (footnote 3), we propose here to extend the test between two proportions to the case of HTERs as follows.

Footnote 3: In particular, the distributions of FAR and FRR should be independent. This may look false since they are both linked by the same model and threshold but, given a model and its associated threshold, these two quantities are indeed independent since they are computed on separate data (the client accesses and the impostor accesses), assuming the model was estimated on a separate training set, as it should be.

4.1. Confidence Intervals

Let the random variable $\mathbf{FA}$ represent the number of false acceptances. We can model it by a Binomial, and hence by a Normal, as follows:

$$\mathbf{FA} \sim B\!\left(N_I, \frac{FA}{N_I}\right) \approx N\!\left(N_I \frac{FA}{N_I},\; N_I \frac{FA}{N_I}\left(1 - \frac{FA}{N_I}\right)\right) = N\big(FA,\; FA\,(1 - \mathrm{FAR})\big). \tag{11}$$

The random variable $\mathbf{FR}$ can be modeled accordingly. We can now write the distribution of the random variable $\mathbf{FAR}$ representing the ratio of false acceptances:

$$\mathbf{FAR} = \frac{\mathbf{FA}}{N_I} \sim N\!\left(\frac{FA}{N_I},\; \frac{FA\,(1 - \mathrm{FAR})}{N_I^2}\right) = N\!\left(\mathrm{FAR},\; \frac{\mathrm{FAR}(1 - \mathrm{FAR})}{N_I}\right) \tag{12}$$

and similarly for the random variable $\mathbf{FRR}$. Given the distributions of $\mathbf{FAR}$ and $\mathbf{FRR}$, we can estimate the distribution of the random variable $\mathbf{HTER}$ as follows:

$$\mathbf{HTER} = \frac{\mathbf{FAR} + \mathbf{FRR}}{2} \sim N\!\left(\frac{\mathrm{FAR} + \mathrm{FRR}}{2},\; \frac{1}{4}\left[\frac{\mathrm{FAR}(1 - \mathrm{FAR})}{N_I} + \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{N_C}\right]\right) = N\!\left(\mathrm{HTER},\; \frac{\mathrm{FAR}(1 - \mathrm{FAR})}{4 N_I} + \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{4 N_C}\right). \tag{13}$$

Using this last definition, we can now easily compute confidence intervals around HTERs using the methodology presented in section 3 and summarized in Figure 2 for the classical confidence values used in the scientific literature. Moreover, the test can easily be extended to variations of HTER, such as the DCF. For instance, in the case of the well-known NIST evaluations performed yearly to compare speaker verification systems, which use the DCF measure described by equation (2) with $\mathrm{Cost}(FR) = 10$, $P(\text{client}) = 0.01$, $\mathrm{Cost}(FA) = 1$ and $P(\text{impostor}) = 0.99$, the underlying Normal becomes:

$$\mathbf{DCF} \sim N\!\left(\mathrm{DCF},\; (0.99)^2\, \frac{\mathrm{FAR}(1 - \mathrm{FAR})}{N_I} + (0.1)^2\, \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{N_C}\right). \tag{14}$$

4.2. Difference Between HTERs

The distribution of the difference between two HTERs, assuming independence between the two underlying distributions, is

$$\mathbf{HTER}_A - \mathbf{HTER}_B \sim N(0, \sigma^2_{\mathrm{INDEP}}) \tag{15}$$

with

$$\sigma^2_{\mathrm{INDEP}} = \frac{\mathrm{FAR}_A(1 - \mathrm{FAR}_A) + \mathrm{FAR}_B(1 - \mathrm{FAR}_B)}{4 N_I} + \frac{\mathrm{FRR}_A(1 - \mathrm{FRR}_A) + \mathrm{FRR}_B(1 - \mathrm{FRR}_B)}{4 N_C}$$

while the distribution of the difference between two HTERs, assuming dependence between the two underlying distributions, becomes

$$\mathbf{HTER}_A - \mathbf{HTER}_B \sim N(0, \sigma^2_{\mathrm{DEP}}) \tag{16}$$

with

$$\sigma^2_{\mathrm{DEP}} = \frac{\mathrm{FAR}_{AB} + \mathrm{FAR}_{BA}}{4 N_I} + \frac{\mathrm{FRR}_{AB} + \mathrm{FRR}_{BA}}{4 N_C}$$

where $\mathrm{FAR}_{AB} = FA_{AB} / N_I$ and $FA_{AB}$ is the number of impostor accesses correctly rejected by model A and incorrectly accepted by model B, with similar definitions for $\mathrm{FAR}_{BA}$, $\mathrm{FRR}_{AB}$, and $\mathrm{FRR}_{BA}$. Hence, in summary, and using the standard confidence values used in the scientific literature, we obtain the simple methodology described in Figure 2 in order to compute statistical tests for person authentication tasks (footnote 4).

Footnote 4: While this summary concerns HTERs, it should now be obvious how to extend it to the general DCF function.

[Figure 2]
The confidence interval (CI) around an HTER is $\mathrm{HTER} \pm \sigma \cdot Z_{\alpha/2}$ with

$$\sigma = \sqrt{\frac{\mathrm{FAR}(1 - \mathrm{FAR})}{4 N_I} + \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{4 N_C}}, \qquad Z_{\alpha/2} = \begin{cases} 1.645 & \text{for a 90\% CI} \\ 1.960 & \text{for a 95\% CI} \\ 2.576 & \text{for a 99\% CI} \end{cases}$$

and similarly, $\mathrm{HTER}_A$ and $\mathrm{HTER}_B$ are statistically significantly different if $|z| > Z_{\alpha/2}$ with

$$z = \frac{\mathrm{HTER}_A - \mathrm{HTER}_B}{\sqrt{\dfrac{\mathrm{FAR}_A(1 - \mathrm{FAR}_A) + \mathrm{FAR}_B(1 - \mathrm{FAR}_B)}{4 N_I} + \dfrac{\mathrm{FRR}_A(1 - \mathrm{FRR}_A) + \mathrm{FRR}_B(1 - \mathrm{FRR}_B)}{4 N_C}}}$$

in the independent case, and

$$z = \frac{\mathrm{FAR}_{AB} - \mathrm{FAR}_{BA} + \mathrm{FRR}_{AB} - \mathrm{FRR}_{BA}}{\sqrt{\dfrac{\mathrm{FAR}_{AB} + \mathrm{FAR}_{BA}}{N_I} + \dfrac{\mathrm{FRR}_{AB} + \mathrm{FRR}_{BA}}{N_C}}}$$

in the dependent case.
Fig. 2. Methodology for statistical tests around HTERs.

5. OTHER STATISTICAL TESTS

While several researchers have pointed out the use of the z-test to compute statistical tests around values such as FAR or FRR (see for instance [4]), we are not aware of any similar attempt for aggregate measures such as HTER, EER, or DCF. However, most people publishing results in verification use HTERs or DCF to assess the quality of their methods.

One simple solution could be to consider the classification error instead of the HTER and compute statistical tests around it. Since the classification error is a well-defined proportion, we can apply the z-test as well. Let $\mathbf{CLASS}$ be defined as the following random variable:

$$\mathbf{CLASS} = \frac{\mathbf{FA} + \mathbf{FR}}{N_I + N_C}$$

then the corresponding underlying Normal becomes:

$$\mathbf{CLASS} \sim N\!\left(\frac{FA + FR}{N_I + N_C},\; \frac{FA + FR}{(N_I + N_C)^2}\left(1 - \frac{FA + FR}{N_I + N_C}\right)\right) \tag{17}$$

but remember that while this test is correct to assess models according to their respective classification error, it does not say anything about the confidence one has in the corresponding HTER, which is the measure of interest in person authentication. In fact, we will show in section 6.1 that, under reasonable assumptions, the variance of $\mathbf{CLASS}$ in equation (17) is always smaller than the variance of $\mathbf{HTER}$ in equation (13), hence confidence tests using (17) will always result in over-confident statistical significance or smaller confidence intervals. This will be explored further in the following section.

Another possible solution is to consider the HTER itself as a proportion (which it is not, directly) and compute the statistical test on it. Let $\mathbf{NAIVE}$ be the random variable of this value; the underlying Normal becomes:

$$\mathbf{NAIVE} \sim N\!\left(\mathrm{HTER},\; \frac{\mathrm{HTER}(1 - \mathrm{HTER})}{N_I + N_C}\right) \tag{18}$$

Again, we will show in section 6.1 that under reasonable assumptions, the variance of $\mathbf{NAIVE}$ in equation (18) is always smaller than the variance of $\mathbf{HTER}$ in equation (13), hence confidence tests using (18) should always result in over-confident statistical significance or smaller confidence intervals.

Yet another solution that has been proposed by some researchers (see for instance [5]) is to compute a statistical test for FAR and FRR separately and then combine the results (footnote 5). For instance, in order to compute a confidence interval for HTER, one would average both upper bounds and both lower bounds found separately by the FAR and FRR tests. On top of the fact that there is no theoretical ground to justify such an approach, there is an evident problem with all approaches that consider FARs and FRRs separately. Two models could yield very similar HTERs but, for some reason (in general linked to the choice of the threshold, which is in general done on a separate data set), one could be slightly biased toward FRRs and the other one slightly biased toward FARs. In such a case, these tests would consider them statistically significantly different, while they would not be when considering their respective HTERs globally instead. For this reason, we will not consider this solution further here.

6. ANALYSIS

We would like to compare in this section the use of the proposed statistical test for HTERs with the two other tests presented in section 5.
We will first show that under some reasonable conditions, increasing the ratio between $N_I$ and $N_C$ will increase the difference between the variance of the Normal of the proposed test and the variance of the Normal of the other tests. Afterward, we present two real case studies where the use of the proposed statistical test would have yielded a different conclusion with regard to the confidence intervals and the difference between the compared models.

Footnote 5: The well-known NIST evaluation campaigns have also apparently recently investigated the use of the McNemar test to assess speaker verification methods, but have considered FARs and FRRs separately [6].

6.1. Theoretical Analysis

Let us first look at the conditions under which $\sigma^2_{13}$, the variance of $\mathbf{HTER}$ as written in equation (13), is higher than $\sigma^2_{18}$, the variance of $\mathbf{NAIVE}$ as written in equation (18):

$$\sigma^2_{13} > \sigma^2_{18} \tag{19}$$

implies that

$$\frac{\mathrm{FAR}(1 - \mathrm{FAR})}{4 N_I} + \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{4 N_C} > \frac{\mathrm{HTER}(1 - \mathrm{HTER})}{N_I + N_C}$$

which can be simplified and yields

$$0 > (\mathrm{FAR} \cdot N_C - \mathrm{FRR} \cdot N_I)\,\big(N_I(1 - \mathrm{FRR}) - N_C(1 - \mathrm{FAR})\big)$$

which means that inequality (19) will be true when either $N_C$ is much less or much higher than $N_I$ (which is in general the case) and FAR is similar to FRR (again, when the threshold is chosen such that we have equal error rate (EER) on a separate validation set, as is often done, this is reasonable).

Let us now look at the conditions under which $\sigma^2_{13}$ is higher than $\sigma^2_{17}$, the variance of $\mathbf{CLASS}$, representing the classification error:

$$\sigma^2_{13} > \sigma^2_{17} \tag{20}$$

implies that

$$\frac{\mathrm{FAR}(1 - \mathrm{FAR})}{4 N_I} + \frac{\mathrm{FRR}(1 - \mathrm{FRR})}{4 N_C} > \frac{FA + FR}{(N_I + N_C)^2} - \frac{(FA + FR)^2}{(N_I + N_C)^3}$$

which can be rewritten as

$$(1 - \mathrm{FRR})(3 N_C + N_I)\, N_I > (1 - \mathrm{FAR})(N_C + 3 N_I)\, N_C$$

and, assuming FAR is similar to FRR, it can be simplified into

$$N_I^2 > N_C^2 \tag{21}$$

which is true as long as $N_I$ is higher than $N_C$, which is in general the case, again.

In order to verify these relations graphically, we have fixed some variables to reasonable values (FAR = 0.1, FRR = 0.2, $N_C = 100$) and have varied $N_I$, the number of impostor accesses. Figure 3 shows the relation between the standard deviation of the underlying Normal distributions and the ratio between $N_I$ and $N_C$.
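The quantities plotted in Figure 3 can be recomputed directly from equations (13), (17) and (18). The sketch below is illustrative only: the function names are hypothetical, and the set of ratios is an arbitrary choice, while FAR = 0.1, FRR = 0.2 and $N_C = 100$ are the values fixed in the text.

```python
import math


def std_hter(far, frr, n_i, n_c):
    """Standard deviation of the proposed HTER distribution, equation (13)."""
    return math.sqrt(far * (1 - far) / (4 * n_i) + frr * (1 - frr) / (4 * n_c))


def std_naive(far, frr, n_i, n_c):
    """Standard deviation when HTER is treated as a proportion, equation (18)."""
    hter = (far + frr) / 2
    return math.sqrt(hter * (1 - hter) / (n_i + n_c))


def std_class(far, frr, n_i, n_c):
    """Standard deviation of the classification error, equation (17)."""
    # (FA + FR) / (NI + NC), with FA = FAR * NI and FR = FRR * NC.
    err = (far * n_i + frr * n_c) / (n_i + n_c)
    return math.sqrt(err * (1 - err) / (n_i + n_c))


FAR, FRR, N_C = 0.1, 0.2, 100  # values fixed in the text
for ratio in (10, 100, 1000):  # arbitrary sample of NI / NC ratios
    n_i = ratio * N_C
    print(
        f"NI/NC={ratio:5d}  HTER={std_hter(FAR, FRR, n_i, N_C):.4f}  "
        f"NAIVE={std_naive(FAR, FRR, n_i, N_C):.4f}  "
        f"CLASS={std_class(FAR, FRR, n_i, N_C):.4f}"
    )
```

As $N_I$ grows, the HTER standard deviation cannot fall below $\sqrt{\mathrm{FRR}(1 - \mathrm{FRR}) / (4 N_C)} = 0.02$ for these values, whereas the two other standard deviations keep shrinking with $N_I + N_C$.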
As expected, the higher the ratio $N_I / N_C$, the bigger the difference between the standard deviations of the Normal distributions related to the three statistical tests. Moreover, we see that the standard deviation of the proposed HTER distribution stays close to that of the FRR distribution, which is mostly influenced by $N_C$, the number of client accesses, and does not decrease as $N_I$ increases, contrary to the two other solutions. Since the size of the confidence interval is directly related to the standard deviation, this figure essentially shows that the confidence interval computed using the proposed technique will always be larger than that of the two other techniques. Hence two verification methods yielding two different HTERs could easily be considered statistically significantly different using one of the methods described in section 5, while they would not be considered statistically significantly different using the proposed technique. In fact, the figure shows that the confidence interval is directly influenced by the minimum of $N_C$ and $N_I$ and not by their sum. In the next two subsections, we present two real case studies where the use of the proposed statistical test would have yielded a different conclusion.

[Figure 3: log-log plot of Standard Deviation (vertical axis) against Ratio $N_I / N_C$ (horizontal axis) for FAR = 0.1, FRR = 0.2, $N_C = 100$; curves: HTER, NAIVE, CLASS, FAR, FRR.]
Fig. 3. Standard deviation of the Normal distributions underlying the three different choices of distributions for a statistical test on HTERs. Also shown: standard deviations of both the FAR and FRR distributions. All curves are in log-log scale.

6.2. Empirical Analysis on XM2VTS

In the first case, the well-known text-independent audio-visual verification database XM2VTS [7] was used. In this database, the test set consists of up to [...] impostor accesses and only 400 client accesses, for a total of [...] accesses. In a recent competition [8], several models were compared (footnote 6) on a face verification task, and we will look here at the results of the best model, hereafter called model A, and the third best model, hereafter called model B, apparently significantly worse. Table 1 shows the difference of performance in terms of HTER between models A and B. Having up to [...] examples, one could indeed expect the difference between the two models to be statistically significant.

Footnote 6: While this is not the topic of this paper (since it should apply to any data/model), people interested in knowing more about the problem tackled in this case study are referred to [8]; we used the results of the models of IDIAP and UniS-NC on the automatic registration task, using Lausanne Protocol I.
Furthermore, note that the results of UniS-NC are slightly different from those published in [8], but correspond to the list of scores provided by one of the authors of the method.

    Method     FAR (%)   FRR (%)   HTER (%)
    Model A    [...]     [...]     [...]
    Model B    [...]     [...]     [...]

Table 1. HTER performance comparison on the test set between models A and B when the threshold was selected according to the Equal Error Rate (EER) criterion on a separate validation set.

    δ      HTER (eq 13)   NAIVE (eq 18)   CLASS (eq 17)
    90%    1.285%         0.131%          0.105%
    95%    1.531%         0.156%          0.125%
    99%    2.013%         0.206%          0.164%

Table 2. Confidence intervals around results of model A, computed using three different hypotheses and their respective equation.

Table 2 shows the size of the confidence intervals computed around the result (HTER or classification error) obtained by model A, for the three methods and for three different values of δ (90%, 95% and 99%). As we can see, for all values of δ, the size of the interval is about one order of magnitude larger for the proposed method than for the two other methods.

           HTER            HTER              NAIVE     CLASS
           DEP, eq (16)    INDEP, eq (15)    eq (18)   eq (17)
    δ      69.2%           64.7%             100.0%    100.0%
    σ      [...]           [...]             [...]     [...]

Table 3. Confidence value δ that model A is statistically significantly different from model B, according to their respective performance (HTER or classification error), computed using four different hypotheses and their respective equation. For each method, we also give σ, the standard deviation of the corresponding statistical test.

Table 3 verifies whether the HTER obtained by model A gives statistically significantly different results than the one obtained by model B, using the two-sided test of equation (6) for the independent cases and of equation (10) for the dependent case. According to both proposed HTER methods (independent and dependent cases), both models are equivalent (the confidence in their difference is much less than, say, 90%), while according to both other methods, the models would be different with 100% confidence! Remember that there were only 400 client accesses during the test, hence it is reasonable that only one error on these accesses makes a visible difference in HTER while it cannot seriously be considered statistically significant. This is well captured by our technique, but not by the other ones. Moreover, in this case, the dependence/independence assumption did not have any impact on the final decision.

6.3. Empirical Analysis on NIST 2000

In the second case, the well-known text-independent speaker verification benchmark database NIST 2000 was used. Here, the test set consists of [...] impostor accesses and 5825 client accesses, for a total of [...] accesses. We compared the performance of two models (footnote 7), hereafter called models C and D. Note that, while on XM2VTS the ratio between the number of impostor and client accesses was very high (280 times more), for the NIST database the ratio is more reasonable, but still high (around 10).

Footnote 7: Once again, while this is not the topic of this paper, people interested in knowing more about the problem tackled in this case study are referred to [9].

    Method     FAR (%)   FRR (%)   HTER (%)
    Model C    [...]     [...]     [...]
    Model D    [...]     [...]     [...]

Table 4. HTER performance comparison on the test set between models C and D when the threshold was selected according to the Equal Error Rate (EER) criterion on a separate validation set.

    δ      HTER (eq 13)   NAIVE (eq 18)   CLASS (eq 17)
    90%    0.676%         0.414%          0.436%
    95%    0.805%         0.493%          0.519%
    99%    1.058%         0.648%          0.682%

Table 5. Confidence intervals around results of model C, computed using three different hypotheses and their respective equation.

           HTER            HTER              NAIVE     CLASS
           DEP, eq (16)    INDEP, eq (15)    eq (18)   eq (17)
    δ      98.8%           89.1%             98.9%     100.0%
    σ      [...]           [...]             [...]     [...]

Table 6. Confidence value δ that model C is statistically significantly different from model D, according to their respective performance (HTER or classification error), computed using four different hypotheses and their respective equation. For each method, we also give σ, the standard deviation of the corresponding statistical test.

We now present the same kinds of results as for the XM2VTS case. Table 4 shows the difference of performance in terms of HTER between models C and D. Table 5 shows the size of the confidence intervals computed around the result obtained by model C; as we can see, given a ratio of impostor to client accesses of around 10 instead of 280, the difference between all the confidence intervals is less drastic but still exists. Table 6 verifies whether the HTER obtained by model C gives statistically significantly different results than the one obtained by model D. For each test, we show both the confidence value δ and the standard deviation σ of the corresponding statistical test.

As can be seen, in the DEP case, σ is very small, even smaller than for the NAIVE and CLASS solutions, hence yielding a very high confidence that the two models are different. In order to explain this unexpected result, note that none of the tests takes into account the possible dependence existing between the compared models. Indeed, if the two models are based on the same technique (which is often the case; for instance, in speaker verification, most systems are based on Gaussian Mixture Models, but trained with slightly different assumptions), then both systems will have a natural tendency to give very correlated scores on the same example. The two models trained on the XM2VTS database were very different (one was based on a Gaussian Mixture Model, while the other one was based on Linear Discriminant Analysis and Normalized Correlation), while the models trained on the NIST database were both in fact variations of Gaussian Mixture Models, and hence are probably very correlated. Unfortunately, there exists no test that takes this dependency into account. Hence, for instance, the variance $(p_{AB} + p_{BA})/n$ of equation (9) will quickly become very small simply because the models are correlated, and not just because the examples are the same. Using this equation will thus result in an underestimate of the true variance when models are very correlated, as empirically shown in Table 6.
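The effect described above, where strongly correlated models drive the dependent-case variance down, can be illustrated on toy data. The sketch below is an assumption-laden illustration, not the authors' code: the function names and the per-access error indicators are invented (1000 impostor and 100 client accesses, with two models that err on almost the same accesses), and the variances follow equations (15) and (16).

```python
import math


def disagreement_counts(errors_a, errors_b):
    """Return (n_AB, n_BA) from paired per-access error indicators.

    n_AB: accesses handled correctly by model A and wrongly by model B.
    n_BA: accesses handled correctly by model B and wrongly by model A.
    """
    n_ab = sum(1 for ea, eb in zip(errors_a, errors_b) if eb and not ea)
    n_ba = sum(1 for ea, eb in zip(errors_a, errors_b) if ea and not eb)
    return n_ab, n_ba


def sigma_dep(imp_a, imp_b, cli_a, cli_b):
    """Standard deviation of the dependent case, equation (16)."""
    n_i, n_c = len(imp_a), len(cli_a)
    fa_ab, fa_ba = disagreement_counts(imp_a, imp_b)
    fr_ab, fr_ba = disagreement_counts(cli_a, cli_b)
    # FAR_AB = FA_AB / NI, hence the squared access counts below.
    variance = (fa_ab + fa_ba) / (4 * n_i**2) + (fr_ab + fr_ba) / (4 * n_c**2)
    return math.sqrt(variance)


def sigma_indep(imp_a, imp_b, cli_a, cli_b):
    """Standard deviation of the independent case, equation (15)."""
    n_i, n_c = len(imp_a), len(cli_a)
    far_a, far_b = sum(imp_a) / n_i, sum(imp_b) / n_i
    frr_a, frr_b = sum(cli_a) / n_c, sum(cli_b) / n_c
    variance = (far_a * (1 - far_a) + far_b * (1 - far_b)) / (4 * n_i) + (
        frr_a * (1 - frr_a) + frr_b * (1 - frr_b)
    ) / (4 * n_c)
    return math.sqrt(variance)


# Toy data (assumption): True marks an error; the models mostly err together.
imp_a = [i < 50 for i in range(1000)]
imp_b = [i < 45 or 50 <= i < 52 for i in range(1000)]
cli_a = [i < 10 for i in range(100)]
cli_b = [i < 9 for i in range(100)]

print("sigma_DEP   =", round(sigma_dep(imp_a, imp_b, cli_a, cli_b), 5))
print("sigma_INDEP =", round(sigma_indep(imp_a, imp_b, cli_a, cli_b), 5))
```

Because the two toy models disagree on only a handful of accesses, the dependent-case standard deviation comes out several times smaller than the independent-case one, mirroring the behaviour reported for the DEP column.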
On the other hand, the INDEP case does not take into account the dependency between the data, but it is reasonable to expect that the effect of this error may be balanced by the fact that it does not take into account the dependency between the models either. The correct solution probably lies somewhere between these two solutions; hence, one should probably favor the most stringent test, so as to only assert statistical differences when both tests agree on this fact (hence, here, with only 89.1% confidence).
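The recommendation above, to assert a difference only when both the independent and the dependent tests agree, can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the 95% default level and the numeric rates in the usage example are assumptions, and no guard is included for a zero variance.

```python
import math
from statistics import NormalDist


def confidence(z):
    """Two-sided confidence delta = P(|Z| < |z|) for a standard Normal Z."""
    return 2.0 * NormalDist().cdf(abs(z)) - 1.0


def hter_difference_confidences(
    far_a, frr_a, far_b, frr_b, far_ab, far_ba, frr_ab, frr_ba, n_i, n_c
):
    """Return (delta_indep, delta_dep) for the difference of two HTERs.

    far_ab, far_ba, frr_ab, frr_ba are the disagreement rates of equation (16).
    """
    diff = (far_a + frr_a) / 2.0 - (far_b + frr_b) / 2.0
    var_indep = (far_a * (1 - far_a) + far_b * (1 - far_b)) / (4 * n_i) + (
        frr_a * (1 - frr_a) + frr_b * (1 - frr_b)
    ) / (4 * n_c)
    var_dep = (far_ab + far_ba) / (4 * n_i) + (frr_ab + frr_ba) / (4 * n_c)
    return (
        confidence(diff / math.sqrt(var_indep)),
        confidence(diff / math.sqrt(var_dep)),
    )


def significantly_different(delta_indep, delta_dep, level=0.95):
    """Conservative rule: both tests must exceed the required confidence."""
    return min(delta_indep, delta_dep) > level


# Hypothetical rates (assumptions) for two correlated models.
d_indep, d_dep = hter_difference_confidences(
    far_a=0.05, frr_a=0.10, far_b=0.047, frr_b=0.09,
    far_ab=0.002, far_ba=0.005, frr_ab=0.0, frr_ba=0.01,
    n_i=1000, n_c=100,
)
print(f"delta_INDEP={d_indep:.3f} delta_DEP={d_dep:.3f}")
print("significant at 95%:", significantly_different(d_indep, d_dep))

# With the confidences reported in Table 6 (89.1% and 98.8%):
print("Table 6 at 95%:", significantly_different(0.891, 0.988))
```

Taking the minimum of the two confidences means that the decision for models C and D rests on the 89.1% figure rather than on the 98.8% obtained under the dependence assumption alone.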

7. CONCLUSION

In this paper, we have proposed a proper method to compute statistical tests on aggregate measures such as HTER or DCF, often used in person authentication. We have also shown why using other approximations, such as tests on the classification error instead, would result in over-optimistic decisions. We have given some empirical evidence using two benchmark databases. It is important to note that the test of two proportions is not the ultimate statistical test, and there exist other tests that are known to be sometimes more appropriate for classification tasks, such as complex cross-validation techniques (see for instance [10]). However, none of these tests has so far addressed the problem of dependence between the tested models. Nevertheless, an important finding of this paper is that when people design new databases for person authentication, they should keep in mind that it is probably not worth having a huge unbalance between client and impostor access numbers, since the statistical significance of the results will mainly depend on the smallest of these two numbers (provided the costs of false acceptances and false rejections are equal).

8. ACKNOWLEDGMENTS

This research has been carried out in the framework of the Swiss NCCR project IM2. The authors would like to thank Ivan Magrin-Chagnolleau for suggesting the problem, and Mohammed Sadeghi for providing the scores of Model A.

9. REFERENCES

[1] P. Verlinde, G. Chollet, and M. Acheroy, "Multimodal identity verification using expert fusion," Information Fusion, vol. 1.
[2] A. Martin and M. Przybocki, "The NIST 1999 speaker recognition evaluation - an overview," Digital Signal Processing, vol. 10, pp. 1-18.
[3] G. W. Snedecor and W. G. Cochran, Statistical Methods, Iowa State University Press.
[4] J. L. Wayman, "Confidence interval and test size estimation for biometric data," in Proceedings of the IEEE AutoID Conference.
[5] J. Koolwaaij, Automatic Speaker Verification in Telephony: a probabilistic approach, PrintPartners Ipskamp B.V., Enschede.
[6] A. Martin, Personal communication, 2004.
[7] J. Lüttin, "Evaluation protocol for the XM2FDB database (Lausanne protocol)," Tech. Rep. COM-05, IDIAP.
[8] K. Messer, J. Kittler, M. Sadeghi, S. Marcel, C. Marcel, S. Bengio, F. Cardinaux, C. Sanderson, J. Czyz, L. Vandendorpe, S. Srisuk, M. Petrou, W. Kurutach, A. Kadyrov, R. Paredes, B. Kepenekci, F. B. Tek, G. B. Akar, F. Deravi, and N. Mavity, "Face verification competition on the XM2VTS database," in 4th International Conference on Audio- and Video-Based Biometric Person Authentication, AVBPA, 2003, Springer-Verlag.
[9] J. Mariéthoz and S. Bengio, "An alternative to silence removal for text-independent speaker verification," Technical Report IDIAP-RR 03-51, IDIAP, Martigny, Switzerland.
[10] T. G. Dietterich, "Approximate statistical tests for comparing supervised classification learning algorithms," Neural Computation, vol. 10, no. 7.


Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01 ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly

More information

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer. 6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio

More information

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

- E < p. ˆ p q ˆ E = q ˆ = 1 - p ˆ = sample proportion of x failures in a sample size of n. where. x n sample proportion. population proportion

- E < p. ˆ p q ˆ E = q ˆ = 1 - p ˆ = sample proportion of x failures in a sample size of n. where. x n sample proportion. population proportion 1 Chapter 7 ad 8 Review for Exam Chapter 7 Estimates ad Sample Sizes 2 Defiitio Cofidece Iterval (or Iterval Estimate) a rage (or a iterval) of values used to estimate the true value of the populatio parameter

More information

GG313 GEOLOGICAL DATA ANALYSIS

GG313 GEOLOGICAL DATA ANALYSIS GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9 Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I

More information

There is no straightforward approach for choosing the warmup period l.

There is no straightforward approach for choosing the warmup period l. B. Maddah INDE 504 Discrete-Evet Simulatio Output Aalysis () Statistical Aalysis for Steady-State Parameters I a otermiatig simulatio, the iterest is i estimatig the log ru steady state measures of performace.

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2 Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:

More information

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test.

Because it tests for differences between multiple pairs of means in one test, it is called an omnibus test. Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

1 Models for Matched Pairs

1 Models for Matched Pairs 1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters? CONFIDENCE INTERVALS How do we make ifereces about the populatio parameters? The samplig distributio allows us to quatify the variability i sample statistics icludig how they differ from the parameter

More information

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight) Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Patter Recogitio Classificatio: No-Parametric Modelig Hamid R. Rabiee Jafar Muhammadi Sprig 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Ageda Parametric Modelig No-Parametric Modelig

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Homework 5 Solutions

Homework 5 Solutions Homework 5 Solutios p329 # 12 No. To estimate the chace you eed the expected value ad stadard error. To do get the expected value you eed the average of the box ad to get the stadard error you eed the

More information

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals 7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses

More information

GUIDELINES ON REPRESENTATIVE SAMPLING

GUIDELINES ON REPRESENTATIVE SAMPLING DRUGS WORKING GROUP VALIDATION OF THE GUIDELINES ON REPRESENTATIVE SAMPLING DOCUMENT TYPE : REF. CODE: ISSUE NO: ISSUE DATE: VALIDATION REPORT DWG-SGL-001 002 08 DECEMBER 2012 Ref code: DWG-SGL-001 Issue

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency Math 152. Rumbos Fall 2009 1 Solutios to Review Problems for Exam #2 1. I the book Experimetatio ad Measuremet, by W. J. Youde ad published by the by the Natioal Sciece Teachers Associatio i 1962, the

More information

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times Sigificace level vs. cofidece level Agreemet of CI ad HT Lecture 13 - Tests of Proportios Sta102 / BME102 Coli Rudel October 15, 2014 Cofidece itervals ad hypothesis tests (almost) always agree, as log

More information

Information-based Feature Selection

Information-based Feature Selection Iformatio-based Feature Selectio Farza Faria, Abbas Kazeroui, Afshi Babveyh Email: {faria,abbask,afshib}@staford.edu 1 Itroductio Feature selectio is a topic of great iterest i applicatios dealig with

More information

Sampling Distributions, Z-Tests, Power

Sampling Distributions, Z-Tests, Power Samplig Distributios, Z-Tests, Power We draw ifereces about populatio parameters from sample statistics Sample proportio approximates populatio proportio Sample mea approximates populatio mea Sample variace

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9 BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous

More information

THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS

THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS R775 Philips Res. Repts 26,414-423, 1971' THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS by H. W. HANNEMAN Abstract Usig the law of propagatio of errors, approximated

More information

1 Approximating Integrals using Taylor Polynomials

1 Approximating Integrals using Taylor Polynomials Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................

More information

1036: Probability & Statistics

1036: Probability & Statistics 036: Probability & Statistics Lecture 0 Oe- ad Two-Sample Tests of Hypotheses 0- Statistical Hypotheses Decisio based o experimetal evidece whether Coffee drikig icreases the risk of cacer i humas. A perso

More information

CSE 527, Additional notes on MLE & EM

CSE 527, Additional notes on MLE & EM CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be

More information

Accuracy assessment methods and challenges

Accuracy assessment methods and challenges Accuracy assessmet methods ad challeges Giles M. Foody School of Geography Uiversity of Nottigham giles.foody@ottigham.ac.uk Backgroud Need for accuracy assessmet established. Cosiderable progress ow see

More information

Common Large/Small Sample Tests 1/55

Common Large/Small Sample Tests 1/55 Commo Large/Small Sample Tests 1/55 Test of Hypothesis for the Mea (σ Kow) Covert sample result ( x) to a z value Hypothesis Tests for µ Cosider the test H :μ = μ H 1 :μ > μ σ Kow (Assume the populatio

More information

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M. MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval

More information

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES Peter M. Maurer Why Hashig is θ(). As i biary search, hashig assumes that keys are stored i a array which is idexed by a iteger. However, hashig attempts to bypass

More information

Lecture 10: Performance Evaluation of ML Methods

Lecture 10: Performance Evaluation of ML Methods CSE57A Machie Learig Sprig 208 Lecture 0: Performace Evaluatio of ML Methods Istructor: Mario Neuma Readig: fcml: 5.4 (Performace); esl: 7.0 (Cross-Validatio); optioal book: Evaluatio Learig Algorithms

More information

CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics

CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS 8.1 Radom Samplig The basic idea of the statistical iferece is that we are allowed to draw ifereces or coclusios about a populatio based

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Chapter 11: Asking and Answering Questions About the Difference of Two Proportions

Chapter 11: Asking and Answering Questions About the Difference of Two Proportions Chapter 11: Askig ad Aswerig Questios About the Differece of Two Proportios These otes reflect material from our text, Statistics, Learig from Data, First Editio, by Roxy Peck, published by CENGAGE Learig,

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

Estimation of a population proportion March 23,

Estimation of a population proportion March 23, 1 Social Studies 201 Notes for March 23, 2005 Estimatio of a populatio proportio Sectio 8.5, p. 521. For the most part, we have dealt with meas ad stadard deviatios this semester. This sectio of the otes

More information

ANALYSIS OF EXPERIMENTAL ERRORS

ANALYSIS OF EXPERIMENTAL ERRORS ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder

More information

Economics Spring 2015

Economics Spring 2015 1 Ecoomics 400 -- Sprig 015 /17/015 pp. 30-38; Ch. 7.1.4-7. New Stata Assigmet ad ew MyStatlab assigmet, both due Feb 4th Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groeber text ad all relevat lectures

More information

µ and π p i.e. Point Estimation x And, more generally, the population proportion is approximately equal to a sample proportion

µ and π p i.e. Point Estimation x And, more generally, the population proportion is approximately equal to a sample proportion Poit Estimatio Poit estimatio is the rather simplistic (ad obvious) process of usig the kow value of a sample statistic as a approximatio to the ukow value of a populatio parameter. So we could for example

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

University of California, Los Angeles Department of Statistics. Hypothesis testing

University of California, Los Angeles Department of Statistics. Hypothesis testing Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Elemets of a hypothesis test: Hypothesis testig Istructor: Nicolas Christou 1. Null hypothesis, H 0 (claim about µ, p, σ 2, µ

More information

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution Iteratioal Mathematical Forum, Vol., 3, o. 3, 3-53 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.9/imf.3.335 Double Stage Shrikage Estimator of Two Parameters Geeralized Expoetial Distributio Alaa M.

More information

Power and Type II Error

Power and Type II Error Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error

More information

Module 1 Fundamentals in statistics

Module 1 Fundamentals in statistics Normal Distributio Repeated observatios that differ because of experimetal error ofte vary about some cetral value i a roughly symmetrical distributio i which small deviatios occur much more frequetly

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

4.3 Growth Rates of Solutions to Recurrences

4.3 Growth Rates of Solutions to Recurrences 4.3. GROWTH RATES OF SOLUTIONS TO RECURRENCES 81 4.3 Growth Rates of Solutios to Recurreces 4.3.1 Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer.

More information