Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop Niels Landwehr/Tobias Scheffer



1 Universität Potsdam, Institut für Informatik, Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop, Niels Landwehr/Tobias Scheffer

2 Why do a statistical test?
[Figure: input, computer, model, output]

3 Outlook
Null hypothesis
Some more concepts involved in hypothesis testing: central limit theorem, confidence interval, critical value, p-values, significance level
One-sample t-test
Pearson's chi-squared test
Sign test
Likelihood ratio test

4 Definition of the null hypothesis
Given a model/hypothesis $H_0$: are the data plausible, given the model?
If not, choose an alternative model/hypothesis $H_a$.
[Figure: input, computer, output, with model $H_0$ or $H_a$]

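5 We will use the central limit theorem
Let $\{X_1, X_2, \dots, X_N\}$ be i.i.d. random variables drawn from a distribution with mean $\mu$ and variance $\sigma^2$.
Let $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$ be the sample mean of a sample of size $N$.
Then the central limit theorem (CLT) states: $\lim_{N \to \infty} p\big(\sqrt{N}\,(\bar{X} - \mu)\big) = \mathcal{N}(0, \sigma^2)$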
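The CLT statement above can be checked empirically. The following is a minimal sketch, assuming a Uniform(0, 1) source distribution (mean 1/2, variance 1/12) chosen purely for illustration; the function name, sample size, and seed are hypothetical and not part of the slides.

```python
import random
import statistics


def scaled_mean_deviations(n, trials, seed=0):
    """Return sqrt(n) * (sample mean - mu) for `trials` samples of size n.

    Assumption: the data come from Uniform(0, 1), so mu = 0.5.
    """
    rng = random.Random(seed)
    mu = 0.5
    result = []
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        result.append(n ** 0.5 * (statistics.fmean(sample) - mu))
    return result


values = scaled_mean_deviations(n=50, trials=5000)
empirical_mean = statistics.fmean(values)          # CLT predicts about 0
empirical_variance = statistics.pvariance(values)  # CLT predicts about 1/12
```

If the CLT applies, the scaled deviations should have mean close to 0 and variance close to the variance of the source distribution, even though the source distribution itself is not normal.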

6 Using the CLT in an example
A list with 5000 weights of people is stored in a database.
In an experiment, we only use a sample of $N = 16$ weights.
The sample mean is $\bar{X} = 73$ kg.
The sample variance is $\sigma^2_{\text{sample}} = 3\ \text{kg}^2$.
What is the chance that the population mean differs from the sample mean by no more than a given margin (in kg)?

7 Example: confidence interval
Let $P(\bar{X} - b \le \mu \le \bar{X} + b) = 0.9$ be the probability that the population mean $\mu$ differs by $b$ or less from the sample mean $\bar{X}$.
Then we call $[\bar{X} - b,\ \bar{X} + b]$ the 90% confidence interval.

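8 Example: confidence interval
Derivation of the confidence interval
Define $\hat{\sigma}_{\bar{X}} = \frac{1}{\sqrt{N}}\,\sigma_{\text{sample}}$.
Then $\frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}} \sim \mathcal{N}(0, 1)$ [approximately, for large $N$, because of the CLT].
Therefore $P(\bar{X} - \mu \le b) = P\left(\frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}} \le \frac{b}{\hat{\sigma}_{\bar{X}}}\right) \approx \Phi\left(\frac{b}{\hat{\sigma}_{\bar{X}}}\right)$, with $\Phi$ the cumulative distribution function of $\mathcal{N}(0, 1)$.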

9 Example: confidence interval
For $b = \hat{\sigma}_{\bar{X}}\,\Phi^{-1}(0.95)$, it will approximately hold that $P(\bar{X} - \mu \le b) = 0.95$.
Because of symmetry, $P(\bar{X} - b \le \mu \le \bar{X} + b) = 0.9$ then approximately holds.
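The interval half-width $b$ from the derivation can be computed directly. This is a sketch under the normal approximation, using the numbers of the weight example and assuming that the value 3 is the sample variance; the function name `confidence_interval` is hypothetical.

```python
from statistics import NormalDist


def confidence_interval(sample_mean, sample_variance, n, level=0.9):
    """Approximate two-sided confidence interval for the population mean.

    Assumption: n is large enough for the CLT-based normal approximation.
    """
    sigma_mean = (sample_variance / n) ** 0.5  # estimated std. dev. of the sample mean
    # For a 90% interval, each tail holds 5%, so the quantile is Phi^{-1}(0.95).
    b = sigma_mean * NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - b, sample_mean + b


# Values from the weight example (mean 73 kg, variance 3, N = 16).
low, high = confidence_interval(73.0, 3.0, 16)
```

With only 16 observations the normal approximation is rough; a t-based interval would be somewhat wider.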

10 When to reject the null hypothesis?
Check how likely the data are, given $H_0$:
Sample data.
Define a test statistic.
See if the sample is consistent with the null hypothesis.
If very unlikely, reject $H_0$.

11 The test statistic
Let $\{X_1, X_2, \dots, X_N\}$ be random variables.
Let $\{x_1, x_2, \dots, x_N\}$ be their sample values.
We want to test whether our model $H_0$ for the true distribution is likely to be correct.
Define a test statistic $t = t(X_1, X_2, \dots, X_N)$ with distribution $p(t)$; $t$ measures some attribute of the sample.
Let $p_{H_0}(t)$ be the distribution over $t$ under the assumption that the null hypothesis holds.
Calculate $t_0 = t(x_1, x_2, \dots, x_N)$.
One-sided test, e.g.: $P_{H_0}(t \ge t_0)$

12 Two-sided and one-sided tests
Tests can be either one-sided: $H_0: \theta \le \theta_0$ versus some $H_a: \theta > \theta_0$
Or two-sided: $H_0: \theta = \theta_0$ versus some $H_a: \theta \ne \theta_0$
Where $\theta$, $\theta_0$ are attributes of the random variables.

13 Testing procedure
Define the test statistic.
Calculate the value $t_0$ of the test statistic for the sample.
Calculate the p-value: $P_{H_0}(t \ge t_0)$
Reject $H_0$ with predefined significance level $\alpha$ (corresponding to the critical value $c$) if $P_{H_0}(t \ge t_0) \le \alpha$.

14 Critical value
Area under the distribution: $\int_{-\infty}^{c} p_{H_0}(t)\,dt = \mathrm{CDF}(c) = 1 - \alpha$
Critical value: $c = \mathrm{CDF}^{-1}(1 - \alpha)$
Reject $H_0$ if $t_0 \ge c$. This is the same as: $P_{H_0}(t \ge t_0) \le \alpha$
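The equivalence between the critical-value rule and the p-value rule can be illustrated numerically. The sketch below assumes a standard normal null distribution for the test statistic; this is an illustrative choice, and the function name `one_sided_test` is hypothetical.

```python
from statistics import NormalDist


def one_sided_test(t0, alpha=0.05, null_dist=NormalDist()):
    """Apply both rejection rules to an observed statistic t0.

    Assumption: under H0 the statistic follows `null_dist`
    (standard normal by default).
    """
    c = null_dist.inv_cdf(1 - alpha)   # critical value c = CDF^{-1}(1 - alpha)
    p_value = 1 - null_dist.cdf(t0)    # P_H0(t >= t0)
    return {
        "critical_value": c,
        "p_value": p_value,
        "reject_by_critical_value": t0 >= c,
        "reject_by_p_value": p_value <= alpha,
    }


strong = one_sided_test(2.0)  # statistic far in the upper tail
weak = one_sided_test(1.0)    # statistic well inside the bulk
```

For a continuous null distribution, both rules give the same decision for any observed value, because the CDF is monotone.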

15 Comparing hypotheses
[Figure: $H_0$, $H_a$, and $t_0$]

16 Examples of statistical tests
One-sample t-test
Pearson's chi-squared test
Sign test
Likelihood ratio test

17 Example: one-sample t-test
Sample $\{x_1, x_2, \dots, x_N\}$ of weights.
$H_0$: The population mean $\mu_0$ is 80 kg.

18 Assumptions of the (one-sample) t-test
The data-generating variables $X_1, \dots, X_N$ are independent of each other.
Means of the random variables are normally distributed (assumption is often satisfied due to the CLT).
[Figure: density $p(x)$]

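19 Test statistic of the t-test (single sample)
Corrected estimate of the standard deviation of a sample of size $N$: $\sigma_{\text{sample}} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2}$, with $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$
Test statistic: $t = \frac{\sqrt{N}\,(\bar{X} - \mu_0)}{\sigma_{\text{sample}}}$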

20 Example: one-sample t-test
Data set $\{x_1, x_2, \dots, x_N\}$ of weights.
Is the population mean 80 kg? $H_0$: $\mu_0 = 80$ kg
$t_0 = \frac{\sqrt{N}\,(\bar{X} - \mu_0)}{\sigma_{\text{sample}}}$
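The observed statistic $t_0$ can be computed in a few lines. This is a sketch with a made-up list of five weights (the slides give no raw data); the function name `one_sample_t_statistic` is hypothetical.

```python
import math
import statistics


def one_sample_t_statistic(sample, mu0):
    """Compute t0 = sqrt(N) * (mean - mu0) / sigma_sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # statistics.stdev uses the corrected 1/(N-1) estimate.
    sigma_sample = statistics.stdev(sample)
    return math.sqrt(n) * (mean - mu0) / sigma_sample


# Hypothetical weights in kg, for illustration only.
weights = [72.0, 75.0, 71.0, 74.0, 73.0]
t0 = one_sample_t_statistic(weights, mu0=80.0)
```

Turning $t_0$ into a p-value requires the CDF of the t-distribution with $N-1$ degrees of freedom, which the Python standard library does not provide; a statistics package would be needed for that step.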

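21 Derive the t-distribution from assumptions
Test statistic: $t = \frac{\sqrt{N}\,(\bar{X} - \mu_0)}{\sigma_{\text{sample}}} = \frac{\sqrt{N-1}\,u}{\sqrt{v}}$
With $v = \frac{(N-1)\,\sigma^2_{\text{sample}}}{\sigma^2}$ and $u = \frac{\sqrt{N}\,(\bar{X} - \mu_0)}{\sigma}$, and $\sigma^2$ the population variance.
If we assume (for this proof) that the $X_i$ are normally distributed, then:
$p_{H_0}(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}$ (normal distribution)
$p(v) = \frac{v^{(N-1)/2 - 1}\, e^{-v/2}}{2^{(N-1)/2}\,\Gamma\left(\frac{N-1}{2}\right)}$ (chi-squared distribution with $N-1$ degrees of freedom)
Then $p_{H_0}(t) = \iint \delta\left(t - \frac{\sqrt{N-1}\,u}{\sqrt{v}}\right) p_{H_0}(u)\, p(v)\, du\, dv$ is the t-distribution.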

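22 The t-distribution
$p_{H_0,N}(t) = \frac{\Gamma\left(\frac{N}{2}\right)}{\sqrt{(N-1)\pi}\;\Gamma\left(\frac{N-1}{2}\right)} \left(1 + \frac{t^2}{N-1}\right)^{-N/2}$
$\lim_{N \to \infty} p_{H_0,N}(t) = \mathcal{N}(0, 1)$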
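The convergence of the t-distribution to the standard normal can be checked by evaluating the density. The sketch below uses the standard density of Student's t-distribution with $N-1$ degrees of freedom; the function name `t_density` is hypothetical.

```python
import math
from statistics import NormalDist


def t_density(t, n):
    """Density of the t-distribution with n - 1 degrees of freedom.

    Uses log-gamma so that large n does not overflow.
    """
    nu = n - 1
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))


# Gap between the t-density and the normal density at t = 1 for growing n.
gaps = [abs(t_density(1.0, n) - NormalDist().pdf(1.0)) for n in (3, 30, 10001)]
```

The gaps shrink as the sample size grows, which is why a normal approximation is often acceptable for large samples, while small samples call for the heavier-tailed t-distribution.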

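23 Pearson's $\chi^2$-test
Let $\{X_1, \dots, X_N\}$ be i.i.d., from a multinomial distribution with parameters $\theta = (\theta_1, \dots, \theta_k)$ and expectation value $E[X_i] = \theta$, where $X_i = (X_{i1}, \dots, X_{ik})$ with $X_{ij} \in \{0, 1\}$ and $\sum_{j=1}^{k} X_{ij} = 1$.
We want to test $H_0: \theta = \theta_0$ vs. $H_a: \theta \ne \theta_0$.
Test statistic: $t = \sum_{j=1}^{k} \frac{(X_j - N\theta_{0j})^2}{N\theta_{0j}}$ with $X_j = \sum_{i=1}^{N} X_{ij}$
$t$ obeys a chi-squared distribution (approximately, for large $N$, with $k-1$ degrees of freedom).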
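Pearson's statistic compares observed category counts with the counts expected under $H_0$. The following sketch uses hypothetical counts for three categories; the p-value helper relies on the fact that a chi-squared distribution with 2 degrees of freedom has upper tail $e^{-t/2}$, so it is only valid for $k = 3$.

```python
import math


def pearson_chi_squared(counts, theta0):
    """Sum over categories of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, theta0))


def p_value_three_categories(t):
    """Upper tail of chi-squared with 2 degrees of freedom (k = 3 only)."""
    return math.exp(-t / 2)


# Hypothetical counts for N = 60 draws; H0: all three categories equally likely.
counts = [18, 22, 20]
t0 = pearson_chi_squared(counts, [1 / 3, 1 / 3, 1 / 3])
p_value = p_value_three_categories(t0)
```

For other numbers of categories the chi-squared tail has no such simple closed form, and a statistics library would be the practical choice.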

24 Sign test
Test whether the distributions of two random variables $X, Y$ have zero difference median: $H_0: P(X > Y) = \frac{1}{2}$
Collect $N$ pairs $\{(X_1, Y_1), (X_2, Y_2), \dots, (X_N, Y_N)\}$.
Discard pairs where $X_i = Y_i$, with $i \in \{1, 2, \dots, N\}$. We keep $M$ pairs.
Let $t$ be the number of pairs for which $X_i - Y_i > 0$.
Assuming $H_0$, it follows that: $p_{H_0}(t) = \mathrm{Binom}(M, \frac{1}{2})$
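Because the null distribution is binomial, the sign test needs nothing beyond counting. This is a sketch assuming a one-sided alternative (X tends to exceed Y), with made-up paired data; the function name `sign_test` is hypothetical.

```python
from math import comb


def sign_test(xs, ys):
    """One-sided sign test: returns (t, M, p-value).

    Ties are discarded; under H0, t ~ Binom(M, 1/2).
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    m = len(diffs)                  # number of pairs kept
    t = sum(d > 0 for d in diffs)   # pairs with X_i - Y_i > 0
    # P_H0(T >= t) = sum_{k=t}^{M} C(M, k) / 2^M
    p_value = sum(comb(m, k) for k in range(t, m + 1)) / 2 ** m
    return t, m, p_value


# Hypothetical paired measurements, for illustration only.
xs = [5, 7, 6, 9, 8, 7, 6, 8]
ys = [4, 6, 6, 7, 7, 8, 5, 6]
t, m, p_value = sign_test(xs, ys)
```

A two-sided version would double the smaller tail probability (capped at 1); which variant applies depends on the alternative hypothesis.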

25 Likelihood ratio test
Compare the likelihood $f$ of the data given model $H_0$ with the likelihood of the data given model $H_a$.
$H_0$ is a special case of $H_a$.
Test statistic: $t = 2 \log \frac{f(x_1, x_2, \dots, x_N \mid \theta_a)}{f(x_1, x_2, \dots, x_N \mid \theta_0)}$
Under certain conditions, the test statistic approaches a chi-squared distribution when the sample size approaches infinity.
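As one concrete instance, consider coin flips with success probability $\theta$. The sketch below assumes $H_0$ fixes $\theta = \theta_0$ while $H_a$ leaves $\theta$ free and uses its maximum likelihood estimate; with one free parameter the limiting chi-squared distribution has 1 degree of freedom, whose upper tail is $\operatorname{erfc}(\sqrt{t/2})$. The data and function names are hypothetical.

```python
import math


def bernoulli_log_likelihood(successes, n, theta):
    """Log-likelihood of `successes` out of n Bernoulli(theta) trials."""
    return successes * math.log(theta) + (n - successes) * math.log(1 - theta)


def likelihood_ratio_test(successes, n, theta0):
    """Return (t, asymptotic p-value). Assumes 0 < successes < n."""
    theta_hat = successes / n  # maximum likelihood estimate under H_a
    t = 2 * (
        bernoulli_log_likelihood(successes, n, theta_hat)
        - bernoulli_log_likelihood(successes, n, theta0)
    )
    t = max(t, 0.0)  # guard against tiny negative values from rounding
    # Upper tail of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(t / 2))
    return t, p_value


# Hypothetical data: 60 successes in 100 trials, H0: theta = 0.5.
t0, p_value = likelihood_ratio_test(60, 100, 0.5)
```

The p-value is asymptotic; for small samples the chi-squared approximation can be poor.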

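26 Wald test
Test $H_0: \theta = \theta_0$ vs. $H_a: \theta \ne \theta_0$
Test statistic: $t = \frac{\hat{\theta} - \theta_0}{\hat{\sigma}_{\hat{\theta}}}$
When the null hypothesis is true, then: $p_{H_0}(t) = \mathcal{N}(0, 1)$
With $\hat{\sigma}_{\hat{\theta}}$ the estimator of the standard deviation of $\hat{\theta}$.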
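A Wald test for a proportion gives a compact worked case. The sketch assumes Bernoulli data, the estimator $\hat{\theta}$ = successes / N, and the plug-in standard deviation $\sqrt{\hat{\theta}(1-\hat{\theta})/N}$; the data and the function name `wald_test_proportion` are hypothetical.

```python
from statistics import NormalDist


def wald_test_proportion(successes, n, theta0):
    """Two-sided Wald test for a Bernoulli parameter.

    Assumes 0 < successes < n so the estimated std. dev. is nonzero.
    """
    theta_hat = successes / n
    sigma_hat = (theta_hat * (1 - theta_hat) / n) ** 0.5
    t = (theta_hat - theta0) / sigma_hat
    # Two-sided p-value under the N(0, 1) null distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p_value


# Hypothetical data: 60 successes in 100 trials, H0: theta = 0.5.
t0, p_value = wald_test_proportion(60, 100, 0.5)
```

On the same hypothetical data the Wald and likelihood ratio tests give similar but not identical p-values; the two agree only asymptotically.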

27 Summary
Use a statistical test to decide whether the null hypothesis is plausible given a sample: define a significance level, calculate the p-value.
One-sample t-test: assume $\{X_1, X_2, \dots, X_N\}$ i.i.d., means normally distributed.
Chi-squared test: assume $\{X_1, X_2, \dots, X_N\}$ i.i.d., category counts approximately normally distributed.
Two-sample sign test: assume paired samples i.i.d.
Likelihood ratio test: assume the test statistic is chi-squared distributed.
Wald test: assume the likelihood estimator is normally distributed.
