
Chapter 8: Hypotheses Testing

In the previous chapters we used experimental data to estimate parameters. Here we will use data to test hypotheses. A typical example is to test whether the data are compatible with a theoretical prediction, or to choose among different hypotheses the one that best represents the data.

8.1 Hypotheses and test statistics

Let's begin by defining some terminology that we will need in the following. The goal of a statistical test is to make a statement about how well the observed data stand in agreement (accept) or not (reject) with given predicted probabilities, i.e. a hypothesis. The hypothesis under test is typically called the null hypothesis, H₀. The alternative hypothesis, if there is one, is usually called H₁; if there are several alternative hypotheses they are labeled H₁, H₂, ... A hypothesis is simple if the p.d.f. of the random variable under test is completely specified (e.g. the data are drawn from a Gaussian p.d.f. with specified mean and width) or composite if at least one of the parameters is not specified (e.g. the data are drawn from a Poisson distribution with mean greater than 3).

In order to state quantitatively what it means to test a hypothesis, we need to build a function t(x) of the measured variables x (in general a vector of measurements), called a test statistic. If we build it in a clever way, the test statistic will be distributed differently depending on the hypothesis under test: g(t(x)|H₀) or g(t(x)|H₁). This pedantic notation is used here to stress that the test statistic is a function of the data, and that it is the distribution of the test statistic values that differs under the different hypotheses (the lighter notation g(t|Hᵢ) will be used from now on). Comparing the value of the test statistic computed on the actual data with the values obtained computing it under the different hypotheses, we can quantitatively state the level of agreement. That's the general idea; the way this is implemented in practice will be explained in the next sections.

The test statistic can be any function of the data: it can be a multidimensional vector t(x) or a single real number t(x). Even the data themselves {x} can be used as a test statistic. Collapsing all the information about the data into a single meaningful variable is particularly helpful in visualizing the test statistic and the separation between the two hypotheses. There is no general rule for the choice of the test statistic; the specific choice will depend on the particular case at hand. Different test statistics will in general give different results, and it is up to the physicist to decide which is the most appropriate for the specific problem.

Example: To better understand the terminology we can use a specific example based on particle identification. The average specific ionization dE/dx of two charged particles with the same speed passing through matter will be different depending on their masses (see Fig. 8.1.1). Because of this dependence, dE/dx can be used as a particle identification tool to distinguish particle types. For example, the ionization of electrons with momenta in the range of a few GeV tends to be larger than that of pions in the same momentum range. If we want to distinguish an electron from a pion in a given momentum bin, we can use the specific ionization itself as the test statistic, t(x) = dE/dx. This is a typical case where the data themselves are used as the test statistic.

Figure 8.1.1: Left: the specific ionization for some particle types (in green pions and in red electrons; other particle species are shown with different colors). Right: the projections of the left plot on the y-axis, i.e. the measured specific ionization for pions and electrons.

The test statistic will then be distributed differently under the two following hypotheses (see Fig. 8.1.1, right):

    null hypothesis:        g(t|H₀) = P(dE/dx | e±)
    alternative hypothesis: g(t|H₁) = P(dE/dx | π±)

Example: When testing data for the presence of a signal, we define the null hypothesis as the background-only hypothesis and the alternative hypothesis as the signal+background hypothesis.

Example: Fig. 8.1.2 shows the cross section σ(e⁺e⁻ → W⁺W⁻(γ)) measured by the L3 collaboration at different centre-of-mass energies. In this case the test statistic is the cross section as a function of energy. The measured values are then compared with different theoretical models (different hypotheses). We haven't explained yet how to quantitatively accept or reject a hypothesis, but already at a naive level we can see that the data clearly prefer one of the models.

Figure 8.1.2: Analysis of the cross section of e⁺e⁻ → W⁺W⁻(γ) as a function of the centre-of-mass energy (L3 detector at LEP).

The p.d.f. describing the test statistic under a certain hypothesis, g(t|H), is usually built from a data set that has precisely the characteristics associated to that hypothesis. In the particle identification example discussed before, the data used to build the p.d.f.s for the two hypotheses were pure samples of electrons and pure samples of pions. For example, you can get a pure sample of electrons by selecting tracks from photon conversions γ → e⁺e⁻, and a pure sample of pions from the self-tagging decays of charmed mesons D*⁺ → π⁺D⁰; D⁰ → K⁻π⁺ (D*⁻ → π⁻D̄⁰; D̄⁰ → K⁺π⁻). Self-tagging means that by knowing the charge of the pion in the first decay you can unambiguously assign the kaon/pion hypothesis to the negative/positive tracks of the second decay. In other cases the p.d.f.s are built from a dedicated measurement (e.g. a test beam¹) or from Monte Carlo simulations.

8.2 Significance, power, consistency and bias

In order to accept or reject a null hypothesis we partition the space of test statistic values into a critical (rejection) region and its complement, the acceptance region (see Fig. 8.2.3), such that there is a small probability, assuming H₀ to be correct, to observe data with a test statistic in the critical region. The value of the test statistic chosen to define the two regions is called the decision boundary, t_cut. If the value of the test statistic computed on the data sample under test falls in the rejection region, the null hypothesis is discarded; otherwise it is accepted (or, more precisely, not rejected).

¹ In a test beam, a beam of particles is prepared in well defined conditions (particle type, energy, etc.) and is typically used to test a device under development. This configuration inverts the typical experimental conditions, where a device with known properties is used to characterize particles in a beam or from collisions.
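As a concrete illustration (not from the original notes), g(t|H) can be approximated by a normalized histogram template built from a pure control sample. A minimal sketch, in which the control samples, their shapes, and the binning are all hypothetical stand-ins:

    import numpy as np

    rng = np.random.default_rng(0)
    # hypothetical pure control samples of the test statistic t = dE/dx
    t_electrons = rng.normal(4.0, 0.3, size=10_000)  # stand-in for a photon-conversion sample
    t_pions     = rng.normal(3.0, 0.3, size=10_000)  # stand-in for a D* self-tagged sample

    bins = np.linspace(2.0, 5.0, 61)
    # normalized histograms approximate the p.d.f.s g(t|e) and g(t|pi)
    g_e,  _ = np.histogram(t_electrons, bins=bins, density=True)
    g_pi, _ = np.histogram(t_pions, bins=bins, density=True)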

Figure 8.2.3: A test statistic distribution, with an acceptance region x ≤ x_cut and a rejection region x > x_cut.

Given a test statistic, some parameters are usually defined when sizing a rejection region. The first one is the significance level α of the test (see Fig. 8.2.4). It is defined as the integral of the null hypothesis p.d.f. above the decision boundary:

    α = ∫_{t_cut}^{∞} g(t|H₀) dt    (8.2.1)

The probability α can be read as the probability to reject H₀ even if H₀ is in reality correct. This is called an error of the first kind. If we have an alternative hypothesis H₁, an error of the second kind occurs when H₀ is accepted but the correct hypothesis is in reality the alternative H₁. The integral of the alternative hypothesis p.d.f. below t_cut is called β:

    β = ∫_{−∞}^{t_cut} g(t|H₁) dt    (8.2.2)

and 1 − β is the power of the test to discriminate against the alternative hypothesis H₁ (see Fig. 8.2.4). A good test has both α and β small, which is equivalent to saying high significance and high power; this means that H₀ and H₁ are well separated.

Figure 8.2.4: Illustration of the acceptance and rejection regions both for the hypothesis H₀ (on the left hand side) and the alternative H₁ (on the right hand side) under the same choice of decision boundary.
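As an illustration (not from the original notes), α and β can be computed directly from eqs. (8.2.1)-(8.2.2) once the two distributions are known. A minimal sketch, assuming (hypothetically) Gaussian test-statistic distributions g(t|H₀) = N(0,1) and g(t|H₁) = N(3,1):

    from scipy.stats import norm

    t_cut = 1.64
    alpha = norm.sf(t_cut, loc=0.0, scale=1.0)   # integral of g(t|H0) above t_cut, eq. (8.2.1)
    beta  = norm.cdf(t_cut, loc=3.0, scale=1.0)  # integral of g(t|H1) below t_cut, eq. (8.2.2)
    print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {1 - beta:.3f}")
    # -> alpha = 0.051, beta = 0.087, power = 0.913

Moving t_cut trades α against β, which is exactly the trade-off discussed below.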

Figure 8.2.5 summarizes the different ways to mistakenly interpret the data in terms of errors of the first and second kind. While errors of the first kind can be controlled by choosing α sufficiently small, errors of the second kind, which depend on the separation between the two hypotheses, are not as easily controllable. In HEP searches we typically speak of evidence when p ≤ 1.3 × 10⁻³ and of discovery when p ≤ 2.9 × 10⁻⁷ (corresponding to the probability outside 3σ and 5σ respectively in a single-sided Gaussian tail); these numbers are purely conventional and they don't have any scientific ground. They are defined this way to set a high threshold for such important claims about the observation of new phenomena.

Figure 8.2.5: Example of errors of the first and second kind (Wikipedia).

Example: We consider a machine BM1 which is used for bonding wires of Si-detector modules. The produced detectors had a scrap rate of P₀ = 0.2. This machine BM1 should be replaced with a newer bonding machine, BM2, if (and only if) the new machine can produce detector modules with a lower scrap rate P. In a test run we produce n = 30 modules. To verify P < P₀ statistically, we use the hypothesis test discussed above. Define the two hypotheses H₀ and H₁ as:

    H₀: P ≥ 0.2;    H₁: P < 0.2    (8.2.3)

We choose α = 0.05, and our test statistic t is the number of malfunctioning detector modules. This quantity is distributed according to a binomial distribution, with total number of produced modules n = 30 and probability P. The rejection region for H₀ is constructed from

    Σ_{i=0}^{n_c} C(n, i) P₀ⁱ (1 − P₀)^{n−i} < α    (8.2.4)

Here the critical value is denoted by n_c: it is the maximal number of malfunctioning modules produced by BM2 which still implies a rejection of H₀ at the chosen confidence level. By going through the calculation we find that for n_c = 2 the value of the sum is still just below α = 0.05. Thus the rejection region for H₀ is K = {0, 1, 2}. This means that if we find two or fewer malfunctioning modules produced by BM2 we will replace BM1 by the new machine BM2. If there are 3 or more malfunctioning detector modules, the old bonding machine BM1 should be preferred.

Once the test statistic is defined there is a trade-off between α and β: the smaller you make α, the larger β will be. It's up to the experimenter to decide what is acceptable and what is not.
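A minimal sketch of the calculation in the example, scanning for the critical value n_c with the binomial c.d.f. (scipy assumed, not used in the original notes):

    from scipy.stats import binom

    n, P0, alpha = 30, 0.2, 0.05
    # largest n_c with P(n_fail <= n_c | P0) < alpha, eq. (8.2.4)
    for nc in range(n + 1):
        if binom.cdf(nc, n, P0) >= alpha:
            break
    print(nc - 1, binom.cdf(nc - 1, n, P0))  # -> 2, 0.044

The loop confirms n_c = 2: the cumulative probability 0.044 is still below α, while including three failures would give 0.12.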

Example: Suppose we want to distinguish K⁻p elastic scattering events from inelastic scattering events where a π⁰ is produced: H₀: K⁻p → K⁻p; H₁: K⁻p → K⁻p π⁰. The detector used for this experiment is a spectrometer capable of measuring the momenta of all the charged particles (K⁻, p), but it is blind to neutral particles (π⁰). The considered test statistic is the missing mass M, defined as the difference between the initial and the final visible mass. The true value of the missing mass is M = 0 under the null hypothesis H₀ (no π⁰ produced) and M = m_{π⁰} = 135 MeV/c² under the alternative hypothesis H₁ (a π⁰ is produced). The critical region can be defined as M > M_c. The value of M_c depends on the significance and power we want to obtain (see Fig. 8.2.6): a high value of M_c will correspond to a high significance at the expense of the power, while low values of M_c will result in a high power but low significance.

Figure 8.2.6: Top: the p.d.f. of the test statistic M under the null hypothesis of elastic scattering H₀, centred at M = 0. Bottom: the p.d.f. of the test statistic under the alternative hypothesis of inelastic scattering H₁, centred at M = m_{π⁰}. M_c defines the critical region.

Some caution is necessary when using α. Suppose you have 20 researchers looking for a new phenomenon which in reality does not exist; their H₀ hypothesis is that what they see is only background. One of them is liable to reject H₀ with α = 5%, while the other 19 will not. This is part of the game, and therefore, before rushing to publication, that researcher should balance the claim against what the others don't see. That's the main reason why, any time there is a discovery claim, we always need the result to be corroborated by independent measurements. We will come back to this point when we talk about the look-elsewhere effect.

Example: Let's use again the example of electron/pion separation. As already shown, the specific ionization dE/dx of a charged particle can be used as a test statistic to distinguish particle types, for example electrons (e) from pions (π) (see Fig. 8.1.1). The selection efficiency is defined as the probability for a particle to pass the selection cut t < t_cut:

    ε_e = ∫_{−∞}^{t_cut} g(t|e) dt = 1 − α
    ε_π = ∫_{−∞}^{t_cut} g(t|π) dt = β    (8.2.5)

By moving the value of t_cut you change the composition of your sample: enlarging the acceptance region increases the electron efficiency but also the contamination from pions, and vice versa. In general, one can set a value of t_cut, select a sample, and work out the fraction of electrons present in the initial sample (before the requirement t < t_cut). The number of accepted particles in the sample is composed of:

    N_acc = ε_e N_e + ε_π N_π = ε_e N_e + ε_π (N_tot − N_e)    (8.2.6)

which gives

    N_e = (N_acc − ε_π N_tot) / (ε_e − ε_π)    (8.2.7)

From this one immediately notices that N_e can only be calculated if ε_e ≠ ε_π, i.e. N_e can only be extracted if there is any separation power at all. If there are systematic uncertainties on ε_e or ε_π, these will translate into an uncertainty on N_e. One should try to select the critical region t_cut such that the total error on N_e is negligible.

Up to now we used only the p.d.f.s describing the probability that an electron/pion gives a certain amount of ionization; using Bayes' theorem we can invert the problem and ask for the probability that a particle releasing a given ionization signal t is a pion or an electron:

    h(e|t) = a_e g(t|e) / [a_e g(t|e) + a_π g(t|π)]    (8.2.8)
    h(π|t) = a_π g(t|π) / [a_e g(t|e) + a_π g(t|π)]    (8.2.9)

where a_e and a_π = 1 − a_e are the prior probabilities for the electron and pion hypotheses. So to give the probability that a particle is an electron (or better, the degree of belief that a given particle with a measured t is an electron) one needs to know the prior probabilities for all possible hypotheses as well as their p.d.f.s.

The other side of the problem is to estimate the purity p_e of the sample of candidates passing the requirement t < t_cut:

    p_e = (# electrons with t < t_cut) / (# particles with t < t_cut)    (8.2.10)
        = ∫_{−∞}^{t_cut} a_e g(t|e) dt / ∫_{−∞}^{t_cut} [a_e g(t|e) + (1 − a_e) g(t|π)] dt    (8.2.11)
        = a_e ε_e N_tot / N_acc    (8.2.12)
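A minimal sketch of the Bayesian inversion of eq. (8.2.8); the prior electron fraction and the two ionization p.d.f.s below are hypothetical stand-ins, not values from the original notes:

    from scipy.stats import norm

    a_e = 0.1                        # assumed prior electron fraction
    g_e  = norm(loc=4.0, scale=0.3)  # hypothetical g(t|e)
    g_pi = norm(loc=3.0, scale=0.3)  # hypothetical g(t|pi)

    def h_e(t):
        """Posterior probability h(e|t) that a particle with ionization t is an electron."""
        num = a_e * g_e.pdf(t)
        return num / (num + (1 - a_e) * g_pi.pdf(t))

    print(h_e(3.8))  # degree of belief for a track at t = 3.8

Note how the answer depends on the prior a_e as well as on the p.d.f.s, exactly as stated in the text.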

In high energy physics a parallel nomenclature has developed over time to express the same concepts we have encountered in this section. Typically we call:

    background efficiency = probability to accept background = probability of a type-I error (α)
    signal efficiency = power of the test = 1 − (probability of a type-II error) = 1 − β
    purity = probability for an event to be signal, once we have accepted it as signal

8.3 Is there a signal?

A typical application of hypothesis testing in high energy physics is to test for the presence of a signal in data. The easiest case is represented by counting experiments. In this type of experiment the detector is used to count the number of events satisfying some selection criteria (slang: "cut-and-count"). The number of expected events in case of the background-only hypothesis is compared with the measured number, and the signal would typically appear as an excess over the expected background.²

Let n be a number of events which is the sum of some signal and some background events, n = n_s + n_b. Each of the components can be treated as a Poisson variable, with means ν_s (signal) and ν_b (background), and so the total ν = ν_s + ν_b is also a Poisson variable. The probability to observe n events is:

    f(n; ν_s, ν_b) = (ν_s + ν_b)ⁿ e^{−(ν_s+ν_b)} / n!    (8.3.13)

Suppose you measure n_obs events. To quantify our degree of confidence in the discovery of a new phenomenon, i.e. ν_s ≠ 0, we can compute how likely it is to find n_obs events or more from background alone:

    P(n ≥ n_obs) = Σ_{n=n_obs}^{∞} f(n; ν_s=0, ν_b) = 1 − Σ_{n=0}^{n_obs−1} f(n; ν_s=0, ν_b) = 1 − Σ_{n=0}^{n_obs−1} ν_bⁿ e^{−ν_b} / n!    (8.3.14)

For example, if we expect ν_b = 0.5 background events and we observe n_obs = 5, then the p-value from (8.3.14) is 1.7 × 10⁻⁴. This is not the probability of the hypothesis ν_s = 0. It is rather the probability, under the assumption ν_s = 0, of obtaining as many events as observed or more.

Often the result of a measurement is given as the estimated value of a number of events plus or minus one standard deviation. Since the standard deviation of a Poisson variable is equal to the square root of its mean, from the previous example we have 5 ± √5 for an estimate of ν, i.e., after subtracting the expected background, 4.5 ± 2.2 for our estimate of ν_s. This is very misleading: it is only two standard deviations from zero, and it gives the impression that ν_s is not very incompatible with zero, but we have seen from the p-value that this is not the case. The subtlety is that we need to ask for the probability that a Poisson variable of mean ν_b will fluctuate up to n_obs or higher, not for the probability that a Gaussian variable with mean n_obs will fluctuate down to ν_b or lower.

² The signal doesn't always appear as an excess of events. In the case of neutrino disappearance experiments the signal is given by a deficit of events.
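A minimal sketch reproducing the p-values quoted in this section, using the Poisson survival function for eq. (8.3.14) (scipy assumed; the original notes do not prescribe a tool here):

    from scipy.stats import norm, poisson

    nu_b, n_obs = 0.5, 5
    p = poisson.sf(n_obs - 1, nu_b)    # P(n >= n_obs | nu_s = 0), eq. (8.3.14)
    print(p)                           # -> 1.7e-4
    print(norm.isf(p))                 # one-sided Gaussian significance, about 3.6 sigma
    print(poisson.sf(n_obs - 1, 0.8))  # -> 1.4e-3 for nu_b = 0.8 (see below)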

Another important point is that usually ν_b is known only within some uncertainty. If we set ν_b = 0.8 rather than 0.5, the p-value increases by almost an order of magnitude, to 1.4 × 10⁻³. It is therefore crucial to quantify the systematic uncertainty on the background when evaluating the significance of a new effect.

In other types of searches the signal would reveal itself as a resonance, i.e. an excess of data in a localized region of a mass spectrum (slang: "bump hunt"), or as an excess of events in the tail of a distribution. Two examples are shown in Fig. 8.3.7. In these cases the signal is extracted from the background using a fit (more on this will be developed in the next sections): on top of the number of expected events, we add the information about the shape.

Figure 8.3.7: Left: Higgs boson search in 2011; the data are well described by the background-only hypothesis. Right: search for an excess of events at high missing transverse energy.

8.4 Neyman Pearson Lemma

We haven't addressed so far the choice of t_cut. The only thing we know up to now is that it affects the efficiency and the purity of the sample under study. Ideally what we want is to set the desired efficiency and, for that value, get the best possible purity. Take the case of a simple hypothesis H₀ and allow for an alternative hypothesis H₁ (e.g. the typical situation of signal and background). The Neyman-Pearson lemma states that the acceptance region giving the highest power (i.e. the highest purity) for a given significance level α is the region of t-space such that

    g(t|H₀) / g(t|H₁) > c_α    (8.4.15)

where c_α is the knob we can tune to achieve the desired efficiency, and g(t|Hᵢ) is the distribution of t under the hypothesis Hᵢ.

Basically what the lemma says is that there is a function r, defined as

    r = g(t|H₀) / g(t|H₁),

that reduces the problem to a one-dimensional one and that gives the best purity for a fixed efficiency. The function r is called the likelihood ratio for the simple hypotheses H₀ and H₁ (in the likelihood the data are fixed; the hypothesis is the variable). The corresponding acceptance region is given by r > c. Any monotonic function of r will do just as well and will lead to the same test.

The main drawback of the Neyman-Pearson lemma is that it is valid if and only if both H₀ and H₁ are simple hypotheses (and that is pretty rare). Even in those cases, in order to determine c one needs to know g(t|H₀) and g(t|H₁). These must be determined by Monte Carlo simulations (or data-driven techniques) for both signal and background, and the resulting p.d.f. is represented by a multidimensional histogram. This can cause some trouble when the dimensionality of the problem is high: say we have M bins for each of the n components of t; then the total number of bins is Mⁿ, i.e. Mⁿ parameters must be determined from Monte Carlo or data. A way to address this problem is to use a multivariate technique, as we will see in Chapter ??.
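As a toy illustration (not from the original notes), here is a minimal sketch of the likelihood ratio as a one-dimensional test statistic for two hypothetical Gaussian simple hypotheses:

    import numpy as np
    from scipy.stats import norm

    # hypothetical simple hypotheses for a single measured variable x
    g0 = norm(loc=0.0, scale=1.0)  # g(x|H0)
    g1 = norm(loc=2.0, scale=1.0)  # g(x|H1)

    def r(x):
        """Likelihood ratio g(x|H0)/g(x|H1); any monotonic function of it
        (e.g. its logarithm) defines the same test."""
        return g0.pdf(x) / g1.pdf(x)

    x_data = np.array([0.3, 1.8, -0.4])  # hypothetical measurements
    print(r(x_data))                     # accept H0 where r > c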

8.5 Goodness of Fit

A typical application of hypothesis testing is the goodness of fit: quantifying how well the null hypothesis H₀ (a function f(x)) describes a sample of data, without any specific reference to an alternative hypothesis. The test statistic has to be constructed such that it reflects the level of agreement between the observed data and the predictions of H₀ (i.e. the values of f(x)). The p-value is the probability, under the assumption of H₀, to observe data with equal or lesser compatibility with H₀, relative to the data we got. N.B.: it is not the probability that H₀ is true! For a frequentist the probability of a hypothesis is not even defined: probability is defined on the data. For a Bayesian the probability of the hypothesis is a different thing, and it is defined through Bayes' theorem using a prior for the hypothesis.

8.5.1 The χ²-Test

We have already encountered the χ² as a goodness-of-fit test in Sec. 7.5. The χ²-test is by far the most commonly used goodness-of-fit test. Its first application is with a set of measurements xᵢ and yᵢ, where the xᵢ are supposed to be exact (or at least to have negligible uncertainty) and the yᵢ are known with an uncertainty σᵢ. We want to test the function f(x) which we believe gives (predicts) the correct value of yᵢ for each value of xᵢ; to do so we define the χ² as:

    χ² = Σ_{i=1}^{N} [yᵢ − f(xᵢ)]² / σᵢ²    (8.5.16)

If the uncertainties on the yᵢ measurements are correlated, the above formula becomes (in the lighter matrix notation, see Sec. 7.3):

    χ² = (y − f)ᵀ V⁻¹ (y − f)    (8.5.17)

where V is the covariance matrix. A function that correctly describes the data will give small differences between the values predicted by the function f and the measurements yᵢ. These differences reflect the statistical uncertainty on the measurements, so for N measurements the χ² should be roughly N. Recalling the p.d.f. of the χ² distribution,

    P(χ²; N) = (χ²)^{N/2−1} e^{−χ²/2} / (2^{N/2} Γ(N/2))    (8.5.18)

(where the expectation value of this distribution is N, and so ⟨χ²⟩/N = 1), we can base our decision boundary on the goodness of fit by defining the p-value:

    p = Prob(χ²; N) = ∫_{χ²}^{∞} P(χ²′; N) dχ²′    (8.5.19)

which is called the χ² probability. This expression gives the probability that the function describing the N measured data points gives a χ² as large as, or larger than, the one we obtained from our measurement.

Example: Suppose you compute a χ² of 20 for N = 5 points. The naive reaction is that the function is a very poor model of the data (χ²/N = 20/5 = 4). To quantify this we compute the χ² probability ∫_{20}^{∞} P(χ²; 5) dχ². In ROOT you can compute this as TMath::Prob(20,5) = 0.00125. The probability is indeed very small and the H₀ hypothesis should be discarded.

You have to be careful when using the χ² probability to take decisions. If the χ² is large, giving a very small probability, it could be either that the function f is a bad representation of the data or that the uncertainties are underestimated. On the other hand, if you obtain a very small value for the χ², the function cannot be blamed, so you might have overestimated the uncertainties. It's up to you to interpret correctly the meaning of the χ² probability. A very useful tool for this purpose is the pull distribution (see Chapter 6), where each entry is defined as (measured − predicted)/uncertainty = (yᵢ − f(xᵢ))/σᵢ. If everything is done correctly (i.e. the model is correct and the uncertainties are computed correctly) the pull will follow a normal distribution centred at 0 with width 1. If the pull is not centred at 0 (bias), the model is incorrect; if the pull has a width larger than 1, either the uncertainties are underestimated or the model is wrong; if the pull has a width smaller than 1, the uncertainties are overestimated.
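A one-line cross-check of the number quoted in the example, using scipy's χ² survival function in place of the ROOT call (an assumption on my part that scipy is available; the ROOT call itself is as quoted):

    from scipy.stats import chi2
    print(chi2.sf(20.0, 5))  # -> 0.00125, cf. TMath::Prob(20,5)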

8.5.2 Degrees of freedom for a χ²-test on fitted data

The concept of χ² developed above only works if you are given a set of data points and a function (model). If the function comes out of a fit to those same data then, by construction, you will get a χ² that is smaller than expected, because you fitted the parameters of the function precisely in order to minimize it. This problem turns out to be very easy to treat: you just need to change the number of degrees of freedom in the computation. For example, suppose you have N points and you fitted m parameters of your function to minimize the χ² sum; then all you have to do to compute the new χ² probability is to reduce the number of d.o.f. to n = N − m.

Example: You have a set of 10 points; you consider as function f(x) a straight line and you get χ² = 36.3. If you use a parabola you get χ² = 10.2. The straight line has 2 parameters (slope and intercept), so the number of d.o.f. of the problem is 10 − 2 = 8; the χ² probability is TMath::Prob(36.3,8) = 1.5 × 10⁻⁵, which makes the hypothesis that the data are described by a straight line improbable. If you now fit with a parabola you get TMath::Prob(10.2,7) = 0.18, which means that you can't reject the hypothesis that the data are distributed according to a parabolic shape.

Notes on the χ²-test:

- For large numbers of d.o.f., the distribution of √(2χ²) can be approximated by a Gaussian with mean √(2n − 1) and standard deviation 1. When in the past the integrals were extracted from tables this was a neat trick; it is still a useful simplification when the χ² is used in some explicit calculation.

- The χ²-test can also be used as a goodness-of-fit test for binned data. The number of events in bin i (i = 1, 2, ..., n) is yᵢ, with bin i having mean value xᵢ; the predicted number of events is thus f(xᵢ). The errors are given by Poisson statistics in each bin (√f(xᵢ)) and the χ² is

    χ² = Σ_{i=1}^{n} [yᵢ − f(xᵢ)]² / f(xᵢ)    (8.5.20)

  where the number of degrees of freedom n is given by the number of bins minus the number of fitted parameters (do not forget the overall normalization of the model when counting the number of fitted parameters).

- When binning data, you should try to have enough entries per bin such that the computation of the χ² is actually meaningful; as a rule of thumb you should have at least 5 entries per bin.

- Most of the results for binned data are only true asymptotically, e.g. the normal limit of the multinomial p.d.f. or the χ² distribution of −2 log λ.
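A minimal sketch of the binned χ² of eq. (8.5.20); the bin contents and model predictions below are hypothetical:

    import numpy as np
    from scipy.stats import chi2

    y = np.array([12, 18, 25, 31, 22, 17, 9, 6])                   # observed bin contents
    f = np.array([10.5, 19.0, 26.0, 28.5, 24.0, 15.5, 10.0, 5.5])  # model prediction per bin
    chisq = np.sum((y - f) ** 2 / f)                               # Poisson errors sqrt(f)
    ndof = len(y) - 2                                              # e.g. 2 fitted parameters
    print(chisq, chi2.sf(chisq, ndof))                             # chi2 and its probability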

8.5.3 Run test

The χ² collapses the level of agreement between a hypothesis and a set of data into one number. There are cases where behind a good χ² hides in reality a very poor agreement between the data and the model. Consider the situation illustrated in Fig. 8.5.8: the data points are fitted by a straight line, which clearly does not describe the data adequately. Nevertheless, in this example the χ² is 12.0 and thus χ²/n = 1. In cases such as this one the run test provides important extra information.

Figure 8.5.8: Example for the application of the run test. The dashed line is the hypothesized fit (a straight line), whereas the crosses are the actual data.

The run test works like this: every time the measured data point lies Above the function we write an A in a sequence, and every time the data point lies Below the function we write a B. If the data are distributed according to the hypothesized function, they should fluctuate up and down, creating very short sequences of A's and B's (runs). The sequence in the picture reads AAABBBBBBAAA, making only three runs and possibly pointing to a poor description of the data.

The probability of the A's and B's giving a particular number of runs can be calculated. Suppose there are N_A points above and N_B below, with N = N_A + N_B. The total number of possible combinations without repetitions is given by the binomial coefficient

    C(N, N_A) = N! / (N_A! N_B!)    (8.5.21)

and this will be our denominator. For the numerator, suppose that r is even and the sequence starts with an A. There are N_A A-points and r/2 divisions between them (occupied by B's). With N_A points you can place the first dividing line in N_A − 1 positions, the next in N_A − 2, and so on, giving C(N_A − 1, r/2 − 1) different A arrangements. The same argument can be made for the B's. So we find for the probability of r runs:

    P_r = 2 C(N_A − 1, r/2 − 1) C(N_B − 1, r/2 − 1) / C(N, N_A)    (8.5.22)

where the extra factor of 2 is there because we chose to start with an A and we could just as well have started with a B. When r is odd you get:

    P_r = [C(N_A − 1, (r−3)/2) C(N_B − 1, (r−1)/2) + C(N_A − 1, (r−1)/2) C(N_B − 1, (r−3)/2)] / C(N, N_A)    (8.5.23)

These are the probabilities to get r runs given a sequence of A's and B's. It can be shown that

    ⟨r⟩ = 1 + 2 N_A N_B / N    (8.5.24)
    V(r) = 2 N_A N_B (2 N_A N_B − N) / (N² (N − 1))    (8.5.25)
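A minimal sketch implementing eqs. (8.5.24)-(8.5.25) for the example sequence; the printed values match the worked numbers that follow:

    import numpy as np

    seq = "AAABBBBBBAAA"
    n_a, n_b = seq.count("A"), seq.count("B")
    n = n_a + n_b
    r_obs = 1 + sum(seq[i] != seq[i - 1] for i in range(1, len(seq)))  # observed runs
    r_exp = 1 + 2 * n_a * n_b / n                                      # eq. (8.5.24)
    sigma = np.sqrt(2 * n_a * n_b * (2 * n_a * n_b - n) / (n**2 * (n - 1)))  # eq. (8.5.25)
    print(r_obs, r_exp, sigma, (r_exp - r_obs) / sigma)  # -> 3, 7.0, 1.65, 2.4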

In the example above the number of expected runs is ⟨r⟩ = 1 + 2·6·6/12 = 7, with σ = 1.65. The deviation between the expected and the observed is 7 − 3 = 4, which amounts to 2.4 standard deviations and is significant at the 1% level for a one-sided test. Thus the straight-line fit could be rejected despite the (far too) good χ² value. The run test does not substitute the χ²-test; it is in a sense complementary: the χ²-test ignores the signs of the fluctuations, while the run test only looks at them.

8.5.4 Unbinned tests

Unbinned tests are used when the binning procedure would result in too large a loss of information (e.g. when the data set is small). They are all based on the comparison of the cumulative distribution function (c.d.f.) F(x) of the model f(x) under some hypothesis H₀ with the c.d.f. of the data. To define a c.d.f. on data we define an order statistic, i.e. a rule to order the data³, and then define from it the Empirical Cumulative Distribution Function (e.c.d.f.):

    S_n(x) = 0 for x < x₁;    r/n for x_r ≤ x < x_{r+1};    1 for x_n ≤ x    (8.5.26)

This is just the fraction of events not exceeding x (a staircase function going from 0 to 1); see Fig. 8.5.10.

The first unbinned test we describe is the Smirnov-Cramér-von Mises test. We define a measure of the distance between S_n(x) and F(x) as:

    W² = ∫ [S_n(x) − F(x)]² dF(x)    (8.5.27)

(dF(x) can in general be a non-decreasing weight). Inserting the explicit expression of S_n(x) into this definition we get:

    nW² = 1/(12n) + Σ_{i=1}^{n} [F(xᵢ) − (2i − 1)/(2n)]²    (8.5.28)

From the asymptotic distribution of nW² the critical regions can be computed; frequently used test sizes are given in Fig. 8.5.9. The asymptotic distribution is reached remarkably rapidly (the asymptotic limit is reached already for n ≈ 3).

The Kolmogorov-Smirnov test follows the same idea of comparing the model c.d.f. with the data e.c.d.f., but it defines a different metric for the distance between the two. The test statistic is d := D√N, where D is the maximal vertical difference between F_n(x) and F(x) (see Fig. 8.5.10):

    D := max_x |F_n(x) − F(x)|

The hypothesis H₀ corresponding to the function f(x) is rejected if d is larger than a given critical value. The probability for d to exceed a given value t₀ can be calculated in ROOT with TMath::KolmogorovProb(t0).

³ In 1-D the ordering is trivial (ascending/descending); in n-D it is arbitrary: you have to choose a convention and map it to a 1-D sequence.
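As a sketch (not part of the original notes), the one-sample Kolmogorov-Smirnov distance can be computed with scipy; the data sample and the reference c.d.f. below are hypothetical:

    import numpy as np
    from scipy.stats import kstest, norm

    rng = np.random.default_rng(1)
    x = rng.normal(size=100)        # hypothetical data sample
    D, p = kstest(x, norm.cdf)      # D = max_x |F_n(x) - F(x)| and its p-value
    d = D * np.sqrt(len(x))         # the test statistic d = D*sqrt(N) used in the text
    print(D, d, p)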

Figure 8.5.9: Rejection regions for the Smirnov-Cramér-von Mises test for some typical test sizes. Some values are reported in Table 8.5.1.

Figure 8.5.10: Example of c.d.f. and e.c.d.f. The arrow indicates the largest distance used by the Kolmogorov-Smirnov test.

Table 8.5.1: Critical values t₀ for various significances.

    α           99%    95%    50%    32%    5%     1%     0.1%
    P(d ≤ t₀)   1%     5%     50%    68%    95%    99%    99.9%
    t₀          0.44   0.52   0.83   0.96   1.36   1.63   1.95

The Kolmogorov-Smirnov test can also be used to test whether two data sets have been drawn from the same parent distribution. Take the two histograms corresponding to the data to be compared and normalize them (such that the cumulative distribution plateaus at 1). Then build the e.c.d.f.s of the two histograms and compute the maximum distance as before (in ROOT use h1->KolmogorovTest(h2)).
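A minimal sketch of the two-sample comparison, using scipy's ks_2samp as an analogue of ROOT's TH1::KolmogorovTest; the two samples below are hypothetical:

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    a = rng.normal(0.0, 1.0, size=500)
    b = rng.normal(0.2, 1.0, size=400)
    D, p = ks_2samp(a, b)  # maximal e.c.d.f. distance and its p-value
    print(D, p)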

Notes on the Kolmogorov-Smirnov test:

- The test is more sensitive to departures of the data from the median of H₀ than to departures from the width (more sensitive to the core than to the tails of the distributions).

- The test becomes meaningless if the H₀ p.d.f. is a fit to the data. This is due to the fact that there is no equivalent of the number of degrees of freedom as in the χ²-test, hence the fit cannot be corrected for.

8.6 Two-sample problem

In this section we will look at the problem of telling whether two samples are compatible with each other, i.e. whether both are drawn from the same parent distribution. Clearly the complication is that even if they are compatible they will exhibit differences coming from statistical fluctuations. In the following we examine some typical examples of two-sample problems.

8.6.1 Two gaussians, known σ

Suppose you have two random variables X and Y distributed as gaussians of known width. Typical situations are when you have two measurements taken with the same device of known resolution, or two samples taken under different conditions where the variances of the parent distributions are known (you have the two means ⟨x⟩, ⟨y⟩ and the uncertainties on the means σ_x/√N_x and σ_y/√N_y). This problem is equivalent to checking whether X − Y is compatible with 0. The variance of X − Y is V(X − Y) = σ_x² + σ_y², and so the problem boils down to how many σ the difference is from 0: (X − Y)/√(σ_x² + σ_y²).

More generally, what you are doing is defining a test statistic (⟨x⟩ − µ₀)/(σ/√N) (in the previous case µ₀ = 0) and a double-sided rejection region. This means that you choose the significance of your test (α) and set as rejection region the (symmetric) tails beyond u_{1−α/2} on the corresponding gaussian:

    ∫_{−∞}^{−u_{1−α/2}} G(x; µ₀, σ) dx = ∫_{u_{1−α/2}}^{∞} G(x; µ₀, σ) dx = α/2    (8.6.29)

If the measured difference ends up in the rejection region (either of the two tails), then the two samples are to be considered different. You can also decide to test whether X > Y (or Y > X). In this case the test statistic is again (⟨x⟩ − µ₀)/(σ/√N), and the rejection region becomes single-sided: (u_{1−α}, ∞) (or (−∞, −u_{1−α})).
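A minimal sketch of the comparison of two means with known resolutions, as just described; all the numbers below are hypothetical:

    import numpy as np
    from scipy.stats import norm

    x_mean, sx, nx = 10.3, 0.5, 25  # sample mean, parent sigma, sample size
    y_mean, sy, ny = 10.0, 0.5, 30
    z = (x_mean - y_mean) / np.sqrt(sx**2 / nx + sy**2 / ny)
    p_two_sided = 2 * norm.sf(abs(z))  # double-sided test, eq. (8.6.29)
    print(z, p_two_sided)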

8.6.2 Two gaussians, unknown σ

The problem is like the previous one, comparing two gaussian distributions with means ⟨x⟩ and ⟨y⟩, but this time you don't know the parent standard deviations. All you can do is estimate them from the samples at hand:

    s_x² = Σᵢ (xᵢ − ⟨x⟩)² / (N_x − 1);    s_y² = Σᵢ (yᵢ − ⟨y⟩)² / (N_y − 1)    (8.6.30)

Because we are using the estimated standard deviations, we have to use the Student's t to test the significance, and not the gaussian p.d.f. as we did in the previous case. So we build a variable which is the ratio of a gaussian to the square root of a χ² divided by its number of d.o.f. The expression

    (⟨x⟩ − ⟨y⟩) / √(σ_x²/N_x + σ_y²/N_y)    (8.6.31)

under the null hypothesis that the two distributions have the same mean, is a gaussian centred at zero with standard deviation one. The sum

    (N_x − 1) s_x² / σ_x² + (N_y − 1) s_y² / σ_y²    (8.6.32)

is a χ² with N_x + N_y − 2 d.o.f. If we take the ratio of the two (assuming that the unknown parent standard deviations are equal, σ_x = σ_y, such that they cancel out in the ratio) we get the definition of a t-distributed variable:

    t = (⟨x⟩ − ⟨y⟩) / (S √(1/N_x + 1/N_y))    (8.6.33)

where

    S² = [(N_x − 1) s_x² + (N_y − 1) s_y²] / (N_x + N_y − 2)    (8.6.34)

(S is called the pooled estimate of the standard deviation, as it is the combined estimate from the two samples, appropriately weighted. The term S√(1/N_x + 1/N_y) is analogous to the standard error on the mean σ/√N that is used when σ is known.) The variable t is distributed as a Student's t with N_x + N_y − 2 d.o.f. With this variable we can now use the same testing procedure (double- or single-sided rejection regions) as in Sec. 8.6.1, substituting the c.d.f. of the gaussian with the c.d.f. of the Student's t.
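A sketch of the pooled two-sample t-test of eqs. (8.6.33)-(8.6.34) on hypothetical samples; scipy's ttest_ind with equal_var=True implements the same pooled procedure and serves as a cross-check:

    import numpy as np
    from scipy.stats import t as tdist, ttest_ind

    rng = np.random.default_rng(2)
    x = rng.normal(10.3, 1.0, size=20)  # hypothetical samples
    y = rng.normal(10.0, 1.0, size=25)

    nx, ny = len(x), len(y)
    S2 = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    t = (x.mean() - y.mean()) / np.sqrt(S2 * (1 / nx + 1 / ny))
    p = 2 * tdist.sf(abs(t), nx + ny - 2)  # double-sided p-value
    print(t, p)
    print(ttest_ind(x, y, equal_var=True))  # same t and p-value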

8.6.3 F-test

The F-test is used to test whether the variances of two samples of size n₁ and n₂, respectively, are compatible. Because the true variances are not known, the sample variances V₁ and V₂ are used to build the ratio:

    F = V₁/V₂ = [Σ_{i=1}^{n₁} (xᵢ − ⟨x₁⟩)² / (n₁ − 1)] / [Σ_{i=1}^{n₂} (xᵢ − ⟨x₂⟩)² / (n₂ − 1)]    (8.6.35)

(by convention the bigger sample variance is in the numerator, such that F ≥ 1). Intuitively, the ratio will be close to 1 if the two variances are similar, while it will grow large if they are not. When you divide the sample variance by σ² you obtain a random variable which is distributed as a χ² with N − 1 d.o.f. Given that the random variable F is the ratio of two such variables, the σ² cancels and we are left with the ratio of two χ² distributions, with f₁ = N₁ − 1 d.o.f. for the numerator and f₂ = N₂ − 1 d.o.f. for the denominator. The variable F follows the F-distribution with f₁ and f₂ degrees of freedom, F(N₁, N₂):

    P(F) = [Γ((f₁+f₂)/2) / (Γ(f₁/2) Γ(f₂/2))] √(f₁^{f₁} f₂^{f₂}) F^{f₁/2−1} / (f₂ + f₁F)^{(f₁+f₂)/2}    (8.6.36)

For large numbers of degrees of freedom, the variable

    Z = ½ log F    (8.6.37)

converges to a Gaussian distribution with mean ½(1/f₂ − 1/f₁) and variance ½(1/f₁ + 1/f₂). In ROOT the F-distribution is available as ROOT::Math::fdistribution_pdf.
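A sketch of the F-test on two hypothetical samples, using scipy's F-distribution for the tail probability:

    import numpy as np
    from scipy.stats import f as fdist

    rng = np.random.default_rng(3)
    x = rng.normal(0.0, 1.0, size=15)  # hypothetical samples
    y = rng.normal(0.0, 1.5, size=12)

    v_big, n_big = max((x.var(ddof=1), len(x)), (y.var(ddof=1), len(y)))
    v_small, n_small = min((x.var(ddof=1), len(x)), (y.var(ddof=1), len(y)))
    F = v_big / v_small                      # bigger variance in the numerator, F >= 1
    p = fdist.sf(F, n_big - 1, n_small - 1)  # one-sided tail probability
    print(F, p)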

Example: Background model for the H → γγ search. The collected diphoton events are divided into several categories (based on resolution and S/B, to optimize the analysis sensitivity). Once a functional form for the background is chosen (e.g. a polynomial), the number of d.o.f. of that model (e.g. the order of the polynomial) can be chosen using an F-test. The main idea is to gradually increase the number of d.o.f. until you no longer see a significant decrease in the variance.

Table 8.6.2: Summary of the hypothesis tests for the two-sample problem.

Comparison of two normal distributions with means µ₁, µ₂ and known σ₁, σ₂ (x̄ᵢ is the arithmetic mean of sample i; σ_d := √(σ₁²/n₁ + σ₂²/n₂)):

    H₀: µ₁ ≤ µ₂    H₁: µ₁ > µ₂    statistic (x̄₁ − x̄₂)/σ_d    rejection region (u_{1−α}; ∞)
    H₀: µ₁ ≥ µ₂    H₁: µ₁ < µ₂    statistic (x̄₁ − x̄₂)/σ_d    rejection region (−∞; −u_{1−α})
    H₀: µ₁ = µ₂    H₁: µ₁ ≠ µ₂    statistic (x̄₁ − x̄₂)/σ_d    rejection region beyond ±u_{1−α/2}

Comparison of µ₁ and µ₂ with unknown (but supposed equal) σ, with S_d = √[((n₁−1)S₁² + (n₂−1)S₂²)/(n₁+n₂−2)] · √((n₁+n₂)/(n₁n₂)) and f = n₁ + n₂ − 2 (the power is calculated with the non-central t-distribution):

    H₀: µ₁ ≤ µ₂    H₁: µ₁ > µ₂    statistic (x̄₁ − x̄₂)/S_d    rejection region (t_{f;1−α}; ∞)
    H₀: µ₁ ≥ µ₂    H₁: µ₁ < µ₂    statistic (x̄₁ − x̄₂)/S_d    rejection region (−∞; −t_{f;1−α})
    H₀: µ₁ = µ₂    H₁: µ₁ ≠ µ₂    statistic (x̄₁ − x̄₂)/S_d    rejection region beyond ±t_{f;1−α/2}

F-test: hypotheses about σ₁ and σ₂ of two normal distributions (Nᵢ = nᵢ − 1):

    H₀: σ₁ ≤ σ₂    H₁: σ₁ > σ₂    statistic S₁²/S₂²    rejection region (F_{N₁;N₂;1−α}; ∞)
    H₀: σ₁ ≥ σ₂    H₁: σ₁ < σ₂    statistic S₁²/S₂²    rejection region (0; F_{N₁;N₂;α})
    H₀: σ₁ = σ₂    H₁: σ₁ ≠ σ₂    statistic S₁²/S₂²    two-sided rejection region
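Returning to the H → γγ example above: one common way to implement the idea is a nested-model F-test on the χ² decrease when the polynomial order is increased. A sketch under hypothetical data and orders (this is an illustration of the technique, not the actual analysis code):

    import numpy as np
    from scipy.stats import f as fdist

    rng = np.random.default_rng(4)
    x = np.linspace(100, 180, 40)                    # hypothetical mass points
    y = 50 - 0.2 * (x - 100) + rng.normal(0, 1, 40)  # hypothetical spectrum, unit errors
    sigma = np.ones_like(y)

    def chisq(order):
        coeffs = np.polyfit(x, y, order)
        return np.sum(((y - np.polyval(coeffs, x)) / sigma) ** 2)

    p1, p2 = 1, 2                          # compare polynomial orders 1 and 2
    c1, c2 = chisq(p1), chisq(p2)
    ndof2 = len(x) - (p2 + 1)
    F = ((c1 - c2) / (p2 - p1)) / (c2 / ndof2)
    print(F, fdist.sf(F, p2 - p1, ndof2))  # keep the extra order only if p is small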

8.6.4 Matched and correlated samples

In the previous sections we have seen how to compare two samples under different hypotheses. The tests are more discriminating the smaller their variances are. Correlations between the two samples can be used to our advantage to reduce the variance. Take as test statistic:

    Σᵢ (xᵢ − yᵢ)    (8.6.38)

where each data point of the first sample is paired with a corresponding one in the second sample. The variance of this distribution is:

    V(x − y) = σ_x² + σ_y² − 2ρσ_xσ_y    (8.6.39)

Now, if the two samples are correlated (ρ > 0) the variance is reduced, which makes the test more discriminating.

Example: A consumer magazine is testing a widget claimed to increase fuel economy. The data on seven cars are reported in Fig. 8.6.11. Is there evidence for any improvement? If you ignore the matching, the means are 38.6 ± 3.0 and 35.6 ± 2.3 m.p.g. for the samples with and without the widget: the improvement of 3 m.p.g. is within the statistical uncertainties. Now look at the differences, car by car. Their average is 3.0. The estimated standard deviation s is 3.6, so the error on the estimated average is 3.6/√7 = 1.36, and t = 3.0/1.36 = 2.2. This is significant at the 5% level using Student's t (one-tailed test, 6 degrees of freedom, t_critical = 1.943).

Figure 8.6.11: Data from seven cars.
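A sketch of the paired comparison; the per-car numbers below are hypothetical stand-ins (the actual table is in Fig. 8.6.11), chosen only to reproduce the quoted summary statistics approximately:

    import numpy as np
    from scipy.stats import ttest_rel

    with_widget    = np.array([44.0, 31.1, 46.2, 35.0, 43.5, 38.0, 32.4])  # hypothetical
    without_widget = np.array([36.0, 33.1, 40.2, 34.0, 38.5, 35.0, 32.4])  # hypothetical
    d = with_widget - without_widget
    t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))  # paired t on the differences
    print(t)                                          # -> about 2.2
    print(ttest_rel(with_widget, without_widget))     # same t; scipy's p is two-sided,
                                                      # halve it for the one-tailed test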

8.6.5 The most general test

As we already said, the more precisely a test can be formulated, the more discriminating it will be. The most general two-sample test makes no assumptions at all about the two distributions: it just asks whether the two are the same. You can apply an unbinned test like the Kolmogorov-Smirnov (as explained in Sec. 8.5.4), ordering the two samples and computing the maximal distance between the two e.c.d.f.s. Or you can order both samples together and then apply a run test to the combined sequence. If the two samples are drawn from the same parent distribution there will be many very short runs; if, on the other hand, the two samples come from different parent distributions, you will get long runs from both samples. This test should be tried only if the number of points in sample A is similar to that in sample B.

Example: Two samples A and B from the same parent distribution will give something like AABBABABAABBAABABAABBBA. Two samples from two narrow distributions with different means will give something like AAAAAAAAAABBBBBBBBBBBBB.

8.7 References

G. Cowan, Statistical Data Analysis, Ch. 4
R. Barlow, A Guide to the Use of Statistical Methods in the Physical Sciences, Ch. 8
W. Metzger, Statistical Methods in Data Analysis, Ch. 10


Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information

Search for top squark pair production and decay in four bodies, with two leptons in the final state, at the ATLAS Experiment with LHC Run2 data

Search for top squark pair production and decay in four bodies, with two leptons in the final state, at the ATLAS Experiment with LHC Run2 data Search for top squark pair production and decay in four bodies, with two leptons in the final state, at the ATLAS Experiment with LHC Run data Marilea Reale INFN Lecce and Università del Salento (IT) E-mail:

More information

How to find a Higgs boson. Jonathan Hays QMUL 12 th October 2012

How to find a Higgs boson. Jonathan Hays QMUL 12 th October 2012 How to find a Higgs boson Jonathan Hays QMUL 12 th October 2012 Outline Introducing the scalar boson Experimental overview Where and how to search Higgs properties Prospects and summary 12/10/2012 2 The

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits

Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits Statistical Methods for Discovery and Limits in HEP Experiments Day 3: Exclusion Limits www.pp.rhul.ac.uk/~cowan/stat_freiburg.html Vorlesungen des GK Physik an Hadron-Beschleunigern, Freiburg, 27-29 June,

More information

Part III: Unstructured Data

Part III: Unstructured Data Inf1-DA 2010 2011 III: 51 / 89 Part III Unstructured Data Data Retrieval: III.1 Unstructured data and data retrieval Statistical Analysis of Data: III.2 Data scales and summary statistics III.3 Hypothesis

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised September 2009 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in high-energy physics. In statistics, we are interested in using a

More information

Use of the likelihood principle in physics. Statistics II

Use of the likelihood principle in physics. Statistics II Use of the likelihood principle in physics Statistics II 1 2 3 + Bayesians vs Frequentists 4 Why ML does work? hypothesis observation 5 6 7 8 9 10 11 ) 12 13 14 15 16 Fit of Histograms corresponds This

More information

Physics 509: Bootstrap and Robust Parameter Estimation

Physics 509: Bootstrap and Robust Parameter Estimation Physics 509: Bootstrap and Robust Parameter Estimation Scott Oser Lecture #20 Physics 509 1 Nonparametric parameter estimation Question: what error estimate should you assign to the slope and intercept

More information

Statistics for Resonance Search

Statistics for Resonance Search Statistics for Resonance Search Georgios Choudalakis University of Chicago ATLAS exotic diphoton resonances meeting Nov. 5, 0 Intro Introduction I was asked to explain how we treated (statistically) the

More information

CMS Internal Note. The content of this note is intended for CMS internal use and distribution only

CMS Internal Note. The content of this note is intended for CMS internal use and distribution only Available on CMS information server CMS IN 2003/xxxx CMS Internal Note The content of this note is intended for CMS internal use and distribution only August 26, 2003 Expected signal observability at future

More information

YETI IPPP Durham

YETI IPPP Durham YETI 07 @ IPPP Durham Young Experimentalists and Theorists Institute Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan Course web page: www.pp.rhul.ac.uk/~cowan/stat_yeti.html

More information

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg Statistics for Data Analysis PSI Practical Course 2014 Niklaus Berger Physics Institute, University of Heidelberg Overview You are going to perform a data analysis: Compare measured distributions to theoretical

More information

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10 Physics 509: Error Propagation, and the Meaning of Error Bars Scott Oser Lecture #10 1 What is an error bar? Someone hands you a plot like this. What do the error bars indicate? Answer: you can never be

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy If your experiment needs statistics, you ought to have done a better experiment. -Ernest Rutherford Lecture 1 Lecture 2 Why do we need statistics? Definitions Statistical

More information

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics Statistical Tools in Collider Experiments Multivariate analysis in high energy physics Lecture 3 Pauli Lectures - 08/02/2012 Nicolas Chanon - ETH Zürich 1 Outline 1.Introduction 2.Multivariate methods

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Parameter Estimation and Fitting to Data

Parameter Estimation and Fitting to Data Parameter Estimation and Fitting to Data Parameter estimation Maximum likelihood Least squares Goodness-of-fit Examples Elton S. Smith, Jefferson Lab 1 Parameter estimation Properties of estimators 3 An

More information

Introductory Statistics Course Part II

Introductory Statistics Course Part II Introductory Statistics Course Part II https://indico.cern.ch/event/735431/ PHYSTAT ν CERN 22-25 January 2019 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Fourier and Stats / Astro Stats and Measurement : Stats Notes Fourier and Stats / Astro Stats and Measurement : Stats Notes Andy Lawrence, University of Edinburgh Autumn 2013 1 Probabilities, distributions, and errors Laplace once said Probability theory is nothing

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

RWTH Aachen Graduiertenkolleg

RWTH Aachen Graduiertenkolleg RWTH Aachen Graduiertenkolleg 9-13 February, 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan Course web page: www.pp.rhul.ac.uk/~cowan/stat_aachen.html

More information

32. STATISTICS. 32. Statistics 1

32. STATISTICS. 32. Statistics 1 32. STATISTICS 32. Statistics 1 Revised September 2007 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in High Energy Physics. In statistics, we are interested in using a

More information

Hà γγ in the VBF production mode and trigger photon studies using the ATLAS detector at the LHC

Hà γγ in the VBF production mode and trigger photon studies using the ATLAS detector at the LHC Hà γγ in the VBF production mode and trigger photon studies using the ATLAS detector at the LHC Olivier DAVIGNON (LPNHE Paris / CNRS-IN2P3 / UPMC Paris Diderot) Journées de Rencontre Jeunes Chercheurs

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

PHYSICS 2150 EXPERIMENTAL MODERN PHYSICS. Lecture 3 Rejection of Data; Weighted Averages

PHYSICS 2150 EXPERIMENTAL MODERN PHYSICS. Lecture 3 Rejection of Data; Weighted Averages PHYSICS 15 EXPERIMENTAL MODERN PHYSICS Lecture 3 Rejection of Data; Weighted Averages PREVIOUS LECTURE: GAUSS DISTRIBUTION 1.5 p(x µ, )= 1 e 1 ( x µ ) µ=, σ=.5 1. µ=3, σ=.5.5 µ=4, σ=1 4 6 8 WE CAN NOW

More information

North Carolina State University

North Carolina State University North Carolina State University MA 141 Course Text Calculus I by Brenda Burns-Williams and Elizabeth Dempster August 7, 2014 Section1 Functions Introduction In this section, we will define the mathematical

More information

Bump Hunt on 2016 Data

Bump Hunt on 2016 Data Bump Hunt on 216 Data Sebouh Paul College of William and Mary HPS Collaboration Meeting May 3, 217 1 / 24 Outline Trident selection Effects of cuts on dataset Comparison with 215 dataset Mass resolutions:

More information

MODIFIED FREQUENTIST ANALYSIS OF SEARCH RESULTS (THE CL s METHOD)

MODIFIED FREQUENTIST ANALYSIS OF SEARCH RESULTS (THE CL s METHOD) MODIFIED FREQUENTIST ANALYSIS OF SEARCH RESULTS (THE CL s METHOD) A. L. Read University of Oslo, Department of Physics, P.O. Box 148, Blindern, 316 Oslo 3, Norway Abstract The statistical analysis of direct

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Frank Porter February 26, 2013

Frank Porter February 26, 2013 116 Frank Porter February 26, 2013 Chapter 6 Hypothesis Tests Often, we want to address questions such as whether the possible observation of a new effect is really significant, or merely a chance fluctuation.

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 10 December 17, 01 Silvia Masciocchi, GSI Darmstadt Winter Semester 01 / 13 Method of least squares The method of least squares is a standard approach to

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Credible Intervals, Confidence Intervals, and Limits. Department of Physics and Astronomy University of Rochester Physics 403 Credible Intervals, Confidence Intervals, and Limits Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Summarizing Parameters with a Range Bayesian

More information

DELPHI Collaboration DELPHI PHYS 656. Measurement of mass dierence between? and + and mass of -neutrino from three-prong -decays

DELPHI Collaboration DELPHI PHYS 656. Measurement of mass dierence between? and + and mass of -neutrino from three-prong -decays DELPHI Collaboration DELPHI 96-167 PHYS 656 2 December, 1996 Measurement of mass dierence between? and + and mass of -neutrino from three-prong -decays M.Chapkin V.Obraztsov IHEP, Protvino Abstract The

More information

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This document was written and copyrighted by Paul Dawkins. Use of this document and its online version is governed by the Terms and Conditions of Use located at. The online version of this document is

More information

Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses

Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses Relationship between Least Squares Approximation and Maximum Likelihood Hypotheses Steven Bergner, Chris Demwell Lecture notes for Cmpt 882 Machine Learning February 19, 2004 Abstract In these notes, a

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Observation of a New Particle with a Mass of 125 GeV

Observation of a New Particle with a Mass of 125 GeV Observation of a New Particle with a Mass of 125 GeV CMS Experiment, CERN 4 July 2012 Summary In a joint seminar today at CERN and the ICHEP 2012 conference[1] in Melbourne, researchers of the Compact

More information

Recommendations for presentation of error bars

Recommendations for presentation of error bars Draft 0.00 ATLAS Statistics Forum 15 February, 2011 Recommendations for presentation of error bars 1 Introduction This note summarizes recommendations on how to present error bars on plots. It follows

More information