Inferential Statistics
Transcription
1 Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova
2 Part G Distribution-free hypothesis tests
1. Classical and distribution-free tests
2. Distribution-free statistics and tests
3. Aside of Probability. Two distribution-free statistics
4. The sign test
5. The Wilcoxon-Mann-Whitney test
6. The goodness-of-fit tests
   a) Chi-square test
   b) Kolmogorov-Smirnov tests (one and two samples)
7. Final remarks
3 1. Classical and distribution-free tests
Differences between independent groups
- Classical: t-test (or Welch test) to compare the mean of two groups; ANOVA for more groups.
- Distribution-free: Mann-Whitney U test and Kolmogorov-Smirnov two-sample test; Kruskal-Wallis and Median test for more groups.
Differences between variables
- Classical: t-test for paired samples; repeated measures ANOVA for more than two variables.
- Distribution-free: Sign test and Wilcoxon's matched pairs test.
Relationships between variables
- Classical: correlation coefficient.
- Distribution-free: Spearman R, ...
For binary variables: Chi-square test, Phi coefficient, and Fisher exact test.
4 2. Distribution-free statistics and tests
Let $X_1, \dots, X_n \sim F$ be i.i.d. sample variables.
A statistic $T = T(X_1, \dots, X_n)$ is distribution-free if its distribution is the same for every distribution $F$ of the sample variables.
An example: the Wald test (using the CLT approximation for the distribution of $\bar{X}_n$). Under $H_0: \mu = \mu_0$, for large $n$,
$$\frac{\bar{X}_n - \mu_0}{S/\sqrt{n}} \overset{\text{approx}}{\sim} N(0, 1)$$
This is a particular case of a general fact: statistics whose asymptotic (limit) distribution does not depend on the sample distribution are distribution-free.
A test is distribution-free if its test statistic is distribution-free.
5 3. Aside of Probability. Two distribution-free statistics
Sign statistic
Consider any i.i.d. random sample $X_1, \dots, X_n$ with median equal to 0. Assume $P(X_i = 0) = 0$ for $i = 1, \dots, n$ (e.g. $X_i$ continuous). Define
$$Z_i = \begin{cases} 1 & \text{if } X_i > 0 \\ 0 & \text{if } X_i < 0 \end{cases}$$
and note that $Z_i \sim B(1, 1/2)$.
The statistic $B = \sum_{i=1}^{n} Z_i \sim B(n, 1/2)$ is distribution-free.
Furthermore, for large $n$,
$$\frac{B - n/2}{\sqrt{n}/2} \overset{\text{approx}}{\sim} N(0, 1)$$
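The definition of the sign statistic can be sketched in R as follows. The data vector is hypothetical and chosen only for illustration; the variable names (`z_ind`, `pmf_b`) are not from the slides.

```r
# Hypothetical data, chosen only for illustration; no value equals 0.
x <- c(-1.2, 0.4, 2.3, -0.7, 1.1, 0.9, -0.3, 1.8)
n <- length(x)

z_ind <- as.integer(x > 0)   # the indicators Z_i
b <- sum(z_ind)              # observed value of the sign statistic B

# Distribution of B when the median is 0: Binomial(n, 1/2),
# whatever the distribution of the X_i.
pmf_b <- dbinom(0:n, size = n, prob = 0.5)

# Standardised value used by the large-n normal approximation
# (n = 8 is far too small for it; shown only to illustrate the formula).
z <- (b - n / 2) / (sqrt(n) / 2)
print(c(b = b, z = z))
```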
6 Rank statistics
Consider the sample variables $X_1, \dots, X_n$ and the corresponding rank variables $R_1, \dots, R_n$, where $R_i$ represents the position of $X_i$ in the ordered sample.
Note. In lecture 2 we saw that an observed sample can be ordered by, e.g., the R command sort. Random variables can also be sorted, returning the ordered random vector $(X_{(1)}, \dots, X_{(n)})$. Thus $R_i = 1$ exactly when $X_i$ is the minimum of the random sample, and each $R_i$ is a random variable.
The joint distribution of $(R_1, \dots, R_n)$ does not depend on the distribution of the sample variables. We do not give the details here (the proof is based on a combinatorial computation).
If the data contain ties, assign to the tied values the average of the ranks they would have received had they not been tied. E.g. to the values [...] are assigned the ranks [...].
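As a small illustration of ranks with ties, here is a sketch using the base R function `rank`, whose default tie-handling is the averaging rule described above. The sample values are hypothetical.

```r
# Hypothetical sample with one tie (the value 5.0 appears twice).
x <- c(3.1, 5.0, 5.0, 8.2, 1.4)

# rank() uses ties.method = "average" by default:
# the two tied values would occupy positions 3 and 4, so both get 3.5.
r <- rank(x)
print(rbind(x = x, rank = r))

# The ordered sample (X_(1), ..., X_(n)).
print(sort(x))
```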
7 4. The simplest distribution-free test: the sign test
a) Test for the median of a random variable
b) Test for the equality of two medians - paired samples

a) Test for the median of a random variable.
Consider an i.i.d. random sample $X_1, \dots, X_n$ and the test with hypotheses
$$H_0: Q_2 = \lambda_0 \quad \text{against} \quad H_1: Q_2 \neq \lambda_0$$
($H_1$ could also be $Q_2 < \lambda_0$ or $Q_2 > \lambda_0$.)
Consider
$$Z_i = \begin{cases} 1 & \text{if } X_i > \lambda_0 \\ 0 & \text{if } X_i < \lambda_0 \end{cases}$$
Then, under $H_0$, $Z_i \sim B(1, 1/2)$ and the test statistic is
$$B = \sum_{i=1}^{n} Z_i, \qquad B \sim B(n, 1/2) \text{ under } H_0.$$
The test is carried out as usual.
8 b) Test for the equality of two medians - paired samples
Let X and Y be two continuous random variables modeling some characteristic of the same population, with medians $Q_2^X$ and $Q_2^Y$ respectively. Consider a test with hypotheses:
$$H_0: Q_2^X = Q_2^Y \quad \text{against} \quad H_1: Q_2^X > Q_2^Y$$
($H_1$ could also be $Q_2^X \neq Q_2^Y$ or $Q_2^X < Q_2^Y$.)
Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be the paired random sample of size n and define $(D_1, \dots, D_n)$ with $D_i = X_i - Y_i$. The test hypotheses become
$$H_0: Q_2^D = 0 \quad \text{against} \quad H_1: Q_2^D > 0$$
and we fall in the set-up of case a).
Remark. A more powerful alternative for both a) and b) is the Wilcoxon signed-rank test. We do not give the details here.
9 Example. Deer legs
Zar, Jerold H. (1999), Chapter 24: More on Dichotomous Variables, Biostatistical Analysis (Fourth ed.), Prentice-Hall
The null hypothesis is that there is no difference between the hind leg and foreleg length in deer. The alternative hypothesis is that the hind leg is longer than the foreleg. Thus:
$$H_0: Q_2^D = 0 \quad \text{against} \quad H_1: Q_2^D > 0$$
[Table: Deer, Hind leg, Foreleg, Diff. sign]
Under $H_0$ the test statistic is $B = \sum_{i=1}^{10} Z_i \sim B(10, 1/2)$. Its sample value is b = 8.
The test is one-sided right. The p-value of b is $P(B \geq 8) = 56/1024 \approx 0.0547$ (in R: 1-pbinom(7,10,0.5)).
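The p-value of the deer example can be checked in a few equivalent ways. The sketch below uses only n = 10 and b = 8 from the slide; the variable names are illustrative.

```r
n <- 10   # number of deer (pairs)
b <- 8    # observed number of positive differences

# Right tail of the Binomial(10, 1/2): P(B >= 8) = P(B = 8) + P(B = 9) + P(B = 10)
p_sum <- sum(dbinom(b:n, size = n, prob = 0.5))

# Same quantity via the distribution function, as on the slide.
p_tail <- 1 - pbinom(b - 1, size = n, prob = 0.5)

# Same quantity from the exact binomial test.
p_test <- binom.test(b, n, p = 0.5, alternative = "greater")$p.value

print(c(p_sum = p_sum, p_tail = p_tail, p_test = p_test))
```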
10 Direct computation in R
> binom.test(8,10,alternative="greater")
Exact binomial test
data: 8 and 10
number of successes = 8, number of trials = 10, p-value = 0.05469
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
[...]
sample estimates:
probability of success
0.8
There is weak evidence against $H_0$.
When the sample size is small, the tails of the distribution of the test statistic are heavy, which makes it hard to reject $H_0$. In practice, to overcome this, choose a higher α.
In our example, at α = 0.05, $H_0$ is retained.
11 5. The Mann-Whitney U test or Wilcoxon rank-sum test (equality of two distributions - unpaired samples)
The null hypothesis can be expressed as: the probability that an observation from population X exceeds an observation from population Y equals the probability that an observation from Y exceeds an observation from X:
$$H_0: P(X > Y) = P(X < Y) = 0.5$$
The alternative hypothesis can be stated as a one-sided (left or right) or a two-sided test.
Here X and Y are two continuous independent random variables and, to test $H_0$, we consider two independent random samples $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$, possibly of different sizes.
The variables could also be discrete or ordinal, provided $P(X = Y) = 0$.
12 Put the two samples together, so that there are $n = n_1 + n_2$ observations in total.
Let $R_1, \dots, R_{n_1}$ be the rank variables assigned to $X_1, \dots, X_{n_1}$ and $R_{n_1+1}, \dots, R_n$ the rank variables assigned to $Y_1, \dots, Y_{n_2}$.
The statistics
$$W_1 = \sum_{i=1}^{n_1} R_i \quad \text{and} \quad U_1 = W_1 - \frac{n_1(n_1 + 1)}{2}$$
are distribution-free and are used as test statistics. $U_1$ takes integer values between 0 and $n_1 n_2$.
The statistics $W_2$ and $U_2$ (based on the ranks of the Y's) are defined analogously. Moreover $W_1 + W_2 = n(n + 1)/2$.
Which of $W_1$ and $W_2$ (or of $U_1$ and $U_2$) should be used? Usually the statistic with the lower sample value.
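A minimal sketch of these definitions in R, using the two small samples that appear in the slides' own wilcox.test example (x = 12, 16, 13 and y = 17, 15, 18, 20). The variable names are illustrative.

```r
x <- c(12, 16, 13)        # sample from X
y <- c(17, 15, 18, 20)    # sample from Y
n1 <- length(x); n2 <- length(y); n <- n1 + n2

r <- rank(c(x, y))        # ranks in the pooled sample

w1 <- sum(r[1:n1])        # rank sum of the X's
w2 <- sum(r[(n1 + 1):n])  # rank sum of the Y's
u1 <- w1 - n1 * (n1 + 1) / 2
u2 <- w2 - n2 * (n2 + 1) / 2

print(c(W1 = w1, W2 = w2, U1 = u1, U2 = u2))
print(w1 + w2 == n * (n + 1) / 2)   # identity stated on the slide
print(u1 + u2 == n1 * n2)           # follows from the identity above
```

Note that the statistic R's wilcox.test prints as W is the $U_1$ form, not the rank sum $W_1$.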
13 A small example
Does treatment A produce lower values of a variable than treatment B?
Denote by X and Y the variables modeling the results of treatment A and B respectively.
$$H_0: P(X < Y) = P(X > Y) \qquad H_1: P(X < Y) > P(X > Y)$$
Seven elements are drawn from the population at random. Three, randomly chosen, are assigned to treatment A; the other four to treatment B: $n_1 = 3$ and $n_2 = 4$.
The sample values and the corresponding sample ranks are
x_i      12  16  13
r(x_i)    1   4   2
y_i      17  15  18  20
r(y_i)    5   3   6   7
The sample value of $W_1$ is w = 7.
14 Computation of the distribution of $W_1$ under $H_0$ ($n_1 = 3$, $n_2 = 4$)
$W_1$ is the sum of 3 different numbers chosen among {1, ..., 7}. It takes values between 6 and 18. It is symmetrical w.r.t. 12.
How many ways are there to form w?
- 6: one way, 1+2+3;
- 7: one way, 1+2+4;
- 8: two ways, 1+2+5 and 1+3+4;
...
Under $H_0$, the three ranks of X are randomly chosen among {1, ..., 7}: $\binom{7}{3} = 35$ cases.
Then the distribution of $W_1$ for $n_1 = 3$ and $n_2 = 4$ is
w    associated ranks                                 f_{W_1}(w)
6    (1,2,3)                                          1/35
7    (1,2,4)                                          1/35
8    (1,2,5); (1,3,4)                                 2/35
9    (1,2,6); (1,3,5); (2,3,4)                        3/35
10   (1,2,7); (1,3,6); (1,4,5); (2,3,5)               4/35
11   (1,3,7); (1,4,6); (2,4,5); (2,3,6)               4/35
12   (1,4,7); (1,5,6); (2,3,7); (2,4,6); (3,4,5)      5/35
The distribution of $W_1$ depends only on $n_1$ and $n_2$: $W_1$ is a distribution-free statistic.
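The enumeration above can be reproduced by listing all rank subsets. This is a sketch using base R's `combn`; the variable names are illustrative.

```r
n1 <- 3; n2 <- 4; n <- n1 + n2

# Each column of combn(n, n1) is one possible set of ranks for the X's;
# under H0 all choose(7, 3) = 35 sets are equally likely.
w <- colSums(combn(n, n1))

counts <- table(w)                 # number of ways to obtain each value of W1
pmf <- counts / choose(n, n1)      # probability function of W1 under H0
print(counts)

# Left-tail probability used later for the observed value w = 7.
p_left <- mean(w <= 7)
print(p_left)
```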
15 Some properties of $W_1$ and $U_1$ under $H_0$
Minimum value: all the ranks of the $X_i$'s are smaller than the ranks of the $Y_i$'s:
$$\min(W_1) = \sum_{i=1}^{n_1} i = \frac{n_1(n_1 + 1)}{2} \qquad \min(U_1) = 0$$
Maximum value: all the ranks of the $Y_i$'s are smaller than the ranks of the $X_i$'s:
$$\max(W_1) = \sum_{i=n_2+1}^{n} i = \frac{n_1(n + n_2 + 1)}{2} \qquad \max(U_1) = n_1 n_2$$
Mean value:
$$E(W_1) = \frac{n_1(n + 1)}{2} \qquad E(U_1) = \frac{n_1 n_2}{2}$$
Variance:
$$V(W_1) = V(U_1) = \frac{n_1 n_2 (n + 1)}{12}$$
$W_1$ and $U_1$ are symmetrical w.r.t. their mean values. Moreover, for $n_1$ and $n_2$ greater than 10:
$$\frac{U_1 - E(U_1)}{\text{std}(U_1)} \overset{\text{approx}}{\sim} N(0, 1)$$
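These formulas can be checked numerically on the small case $n_1 = 3$, $n_2 = 4$ by comparing them with the exactly enumerated null distribution. This is a sketch; the variable names are illustrative.

```r
n1 <- 3; n2 <- 4; n <- n1 + n2

# All 35 equally likely values of W1 under H0, and the corresponding U1.
w <- colSums(combn(n, n1))
u <- w - n1 * (n1 + 1) / 2

# Values from the enumeration ...
enumerated <- c(min_w = min(w), max_w = max(w),
                mean_w = mean(w), var_w = mean((w - mean(w))^2),
                min_u = min(u), max_u = max(u), mean_u = mean(u))

# ... and from the closed-form expressions on the slide.
formulas <- c(min_w = n1 * (n1 + 1) / 2, max_w = n1 * (n + n2 + 1) / 2,
              mean_w = n1 * (n + 1) / 2, var_w = n1 * n2 * (n + 1) / 12,
              min_u = 0, max_u = n1 * n2, mean_u = n1 * n2 / 2)

print(rbind(enumerated, formulas))
```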
16 Back to the test
The test is one-sided left; the sample value is 7 and its p-value is
$$P(W_1 \leq 7) = 2/35 \approx 0.057$$
With such a small sample size, we can regard this as evidence against $H_0$.
Direct computation in R
> x=c(12,16,13); y=c(17,15,18,20)
> wilcox.test(x,y,alternative="less")
Wilcoxon rank sum test
data: x and y
W = 1, p-value = 0.05714
alternative hypothesis: true location shift is less than 0
The approximation of $W_1$ with a normal distribution is not appropriate for small sample sizes. In this case, however, the exact computation and the normal approximation give similar results:
$$z = \frac{1 - 6}{\sqrt{8}} = -1.77 \qquad \text{p-value}(-1.77) \approx 0.038$$
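The comparison between the exact p-value and the normal approximation can be sketched as follows, using the slide's data. `pwilcox` is the base R distribution function of the $U$-form statistic; the variable names are illustrative.

```r
x <- c(12, 16, 13); y <- c(17, 15, 18, 20)
n1 <- length(x); n2 <- length(y); n <- n1 + n2

# Observed U1 (the value R's wilcox.test reports as W).
u1 <- sum(rank(c(x, y))[1:n1]) - n1 * (n1 + 1) / 2

# Exact left-tail p-value from the null distribution of U1.
p_exact <- pwilcox(u1, n1, n2)

# Normal approximation using E(U1) = n1*n2/2 and V(U1) = n1*n2*(n+1)/12,
# without continuity correction.
z <- (u1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n + 1) / 12)
p_normal <- pnorm(z)

print(c(u1 = u1, p_exact = p_exact, z = z, p_normal = p_normal))
```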
17 6. Goodness-of-fit tests
Measures of goodness-of-fit typically summarize the discrepancy between observed values and the values expected under a known probability model.
Such measures can also be used to test whether two samples are drawn from identical distributions.
We consider here two goodness-of-fit tests:
a) Chi-square test (discrete variables)
b) Kolmogorov-Smirnov
18 6. a) Chi-square goodness-of-fit tests
Let X be a discrete random variable with finite support $\{x_1, \dots, x_r\}$ and $P(X = x_i) = \pi_i$, $i = 1, \dots, r$.
The test hypotheses are:
$$H_0: \pi_i = \pi_{i0} \text{ for all } i \quad \text{and} \quad H_1: \pi_i \neq \pi_{i0} \text{ for at least one } i$$
Let
- $X_1, \dots, X_n$ be a random sample
- $F_1, \dots, F_r$ be the sample variables denoting the sample frequencies of the values $1, \dots, r$
- $N_1, \dots, N_r$ be the corresponding count variables, $N_i = n F_i$, $i = 1, \dots, r$.
Often the $N_i$ are called observed (counts) while the $n\pi_{i0}$ are called expected (counts); they are denoted by $O_i$ and $E_i$ respectively.
19 The test statistic is
$$Q = n \sum_{i=1}^{r} \frac{(F_i - \pi_{i0})^2}{\pi_{i0}} = \sum_{i=1}^{r} \frac{(N_i - n\pi_{i0})^2}{n\pi_{i0}} = \text{(simply)} \sum_{i=1}^{r} \frac{(O_i - E_i)^2}{E_i}$$
Its asymptotic distribution is a chi-square with $r - 1$ degrees of freedom:
$$Q \overset{\text{approx}}{\sim} \chi^2_{[r-1]}$$
The test is one-sided right because large sample values of Q indicate a large difference between observed and expected frequencies.
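A minimal sketch of the statistic Q in R, compared with the built-in `chisq.test`. The observed counts and null probabilities are hypothetical, chosen only to illustrate the formula.

```r
o <- c(18, 22, 20)        # hypothetical observed counts O_i
p0 <- c(1, 1, 1) / 3      # hypothetical null probabilities pi_i0
n <- sum(o)

e <- n * p0               # expected counts E_i = n * pi_i0
q <- sum((o - e)^2 / e)   # sample value of Q

# One-sided right p-value from the chi-square with r - 1 degrees of freedom.
p_value <- 1 - pchisq(q, df = length(o) - 1)

# The same test with the built-in function.
check <- chisq.test(o, p = p0)

print(c(q = q, p_value = p_value))
print(check)
```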
20 Dependence on a parameter
Often the $\pi_i$ depend on an unknown parameter θ.
Examples: $X \sim B(n, \theta)$ (binomial), $X \sim U\{0, \theta\}$ (discrete uniform between 0 and θ), $X \sim P(\theta)$ (truncated Poisson, treating the probability of large integers as null).
We can write $\pi_i = \pi_i(\theta)$ and the test hypotheses become:
$$H_0: \pi_i = \pi_{i0}(\theta) \quad \text{and} \quad H_1: \pi_i \neq \pi_{i0}(\theta)$$
If $\hat{\Theta}_n$ is a consistent estimator of θ with normal asymptotic distribution $N(\theta, V(\hat{\Theta}_n))$ (e.g. the maximum likelihood estimator), then the test can be conducted with the statistic:
$$Q = n \sum_{i=1}^{r} \frac{(F_i - \pi_{i0}(\hat{\Theta}_n))^2}{\pi_{i0}(\hat{\Theta}_n)}$$
21 Example. Sons among the first 7 children (Edwards and Fraccaro 1960)
Consider the number of males among the first seven children of 1334 Swedish ministers.
n. sons   0   1    2    3    4    5   6   7
counts    6  57  206  362  365  256  69  13
We want to test whether they are sample values of a random variable $X \sim B(7, \theta)$.
The point estimator of θ is $\bar{X}/7$, the maximum likelihood estimator. The estimate of θ is about 0.514:
> x=c(0,1,2,3,4,5,6,7); o=c(6,57,206,362,365,256,69,13)
> t=sum(x*o)/sum(o)/7; t
[1] 0.5140287
The expected counts under $H_0$ are:
> e=sum(o)*dbinom(0:7,7,t); round(e,1)
[1] [...]
The sample value of Q is 5.98; its p-value is obtained with
> q=sum((o-e)^2/e); 1-pchisq(q,7)
Then there is no evidence to reject $H_0$.
22 Effects of small sample size.
Recall that
$$Q = n \sum_{i=1}^{r} \frac{(F_i - \pi_{i0})^2}{\pi_{i0}} = \sum_{i=1}^{r} \frac{(N_i - n\pi_{i0})^2}{n\pi_{i0}} \overset{\text{approx}}{\sim} \chi^2_{[r-1]}$$
The chi-square approximation is valid when the sample size is large and the expected counts $n\pi_{i0}$ are not too small (at least 5 for all $i = 1, \dots, r$). In fact:
(1) small $n$ $\Rightarrow$ small $q$ $\Rightarrow$ risk of type II error
(2) small $n\pi_{i0}$ $\Rightarrow$ large $q$ $\Rightarrow$ risk of type I error.
23 Examples. Case (1): small $n$ $\Rightarrow$ small $q$ $\Rightarrow$ risk of type II error
Consider the expected and observed frequencies below, where the differences between them are greater than 40%.
value      1     2
expected  0.40  0.60
observed  0.15  0.85
In such a case:
$$\frac{(f_1 - \pi_{10})^2}{\pi_{10}} + \frac{(f_2 - \pi_{20})^2}{\pi_{20}} = 0.2604$$
If n = 10, then q = 10 × 0.2604 = 2.60 with p-value ≈ 0.11: retain $H_0$.
If n = 30, then q = 30 × 0.2604 = 7.81 with p-value ≈ 0.005: reject $H_0$.
> e=c(0.4,0.6); o=c(0.15,0.85); cf=sum((o-e)^2/e); cf
[1] 0.2604167
> n=10; cbind(cf*n,1-pchisq(cf*n,1))
[1,] [...]
> n=30; cbind(cf*n,1-pchisq(cf*n,1))
[1,] [...]
24 Case (2): small $n\pi_{i0}$ $\Rightarrow$ large $q$ $\Rightarrow$ risk of type I error
Consider the expected and observed counts below. In (A) two of the expected counts are small.
values (A)   1   2   3
expected    10   2   2
observed    12   3   6
values (B)   1   2   3
expected    10  12  12
observed    12  13  16
In (A): q = 8.9 with p-value ≈ 0.012: reject $H_0$.
In (B): q ≈ 1.82 with p-value ≈ 0.40: retain $H_0$.
> e=c(10,2,2); o=c(12,3,6); cf=sum((o-e)^2/e)
> cbind(cf,1-pchisq(cf,2))
[1,] [...]
> e=c(10,12,12); o=c(12,13,16); cf=sum((o-e)^2/e)
> cbind(cf,1-pchisq(cf,2))
[1,] [...]
25 6 b1) Kolmogorov-Smirnov goodness-of-fit tests
Let $X_1, \dots, X_n$ be i.i.d. sample variables from a continuous random variable X with cumulative distribution function F. Consider the test hypotheses:
$$H_0: F(x) = F_0(x) \text{ for all } x \in \mathbb{R} \qquad H_1: F(x) \neq F_0(x) \text{ for at least one } x \in \mathbb{R}$$
Let $\hat{F}$ be the empirical cumulative distribution function:
$$\hat{F}(x) = \sum_{i=1}^{n} \frac{i}{n} \, \mathbb{1}\left(X_{(i)} \leq x < X_{(i+1)}\right)$$
where $(X_{(1)}, \dots, X_{(n)})$ is the sorted random sample, $X_{(n+1)} = +\infty$, and $\mathbb{1}(\cdot)$ denotes the indicator function (equal to 1 if the condition is satisfied and equal to 0 otherwise). $\hat{F}$ is a step function.
The sample values of $\hat{F}(x)$ are discussed in the slides Exploratory Data Analysis.
26 The Kolmogorov test statistic is
$$D = \sup_{x \in \mathbb{R}} \left| \hat{F}(x) - F_0(x) \right| = \max_{1 \leq i \leq n} \max\left\{ \left| \frac{i}{n} - F_0(X_{(i)}) \right|, \left| \frac{i-1}{n} - F_0(X_{(i)}) \right| \right\}$$
D is a distribution-free statistic.
The test is one-sided right because a large sample value of D corresponds to a large difference between the empirical and the tested cumulative distribution functions.
27 Example. Goodness-of-fit of a uniform random variable $X \sim U(0, 2)$
We want to test whether a uniform random variable $X \sim U(0, 2)$ fits the following (sorted) data:
0.03 0.12 0.25 0.41 0.49 1.18 1.21 1.56 1.57 1.69
A random variable $X \sim U(0, 2)$ has cumulative distribution function
$$F_0(x) = \begin{cases} 0 & \text{if } x < 0 \\ x/2 & \text{if } 0 \leq x < 2 \\ 1 & \text{if } x \geq 2 \end{cases}$$
[Figure: the empirical cumulative distribution function (red) and $F_0$ (black).]
The maximum distance between the two plots is achieved at x = 0.49 (the fifth sorted value): $d = 5/10 - 0.49/2 = 0.255$.
28 Direct computation in R
> s=c(0.03,0.12,0.25,0.41,0.49,1.18,1.21,1.56,1.57,1.69)
> ks.test(s,"punif",0,2)
One-sample Kolmogorov-Smirnov test
data: s
D = 0.255, p-value = [...]
alternative hypothesis: two-sided
There is no evidence to reject $H_0$.
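The sample value D = 0.255 can also be obtained from the maximum formula for the Kolmogorov statistic, evaluated at the sorted data. This is a sketch; the variable names are illustrative.

```r
s <- c(0.03, 0.12, 0.25, 0.41, 0.49, 1.18, 1.21, 1.56, 1.57, 1.69)
n <- length(s)
xs <- sort(s)                         # sorted sample x_(1), ..., x_(n)

f0 <- punif(xs, min = 0, max = 2)     # F0 evaluated at the sorted values
i <- seq_len(n)

# Distance between F0 and the empirical CDF just after and just before each jump.
d_i <- pmax(abs(i / n - f0), abs((i - 1) / n - f0))

d <- max(d_i)                         # sample value of D
i_max <- which.max(d_i)               # index of the sorted value where it occurs
print(c(d = d, x_at_max = xs[i_max]))

# Same statistic from the built-in test.
d_builtin <- unname(ks.test(s, "punif", 0, 2)$statistic)
print(d_builtin)
```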
29 Example. Approximate distribution of $\bar{X}_n$ (see the slides on the Central limit theorem)
Consider a simulation of 1000 samples, each of size n, from an exponential random variable $X \sim E(\lambda)$ with λ = 2. The simulated distribution of the sample mean is compared with
- a Normal variable with the sample mean and standard deviation of the simulated values
- a Normal variable with the theoretical mean and standard deviation, which are known: $1/\lambda$ and $1/(\lambda\sqrt{n})$ respectively.
n = 10
> lambda=2; x=c(1:1000); n=10
> for (i in 1:1000) x[i]=mean(rexp(n,lambda))
> ######### empirical mean and standard deviation
> ks.test(x,"pnorm",mean(x),sd(x))
One-sample Kolmogorov-Smirnov test
data: x
D = [...], p-value = [...]
alternative hypothesis: two-sided
30 > ######### theoretical mean and standard deviation
> ks.test(x,"pnorm",(1/lambda),(1/lambda/sqrt(n)))
One-sample Kolmogorov-Smirnov test
data: x
D = [...], p-value = [...]
alternative hypothesis: two-sided
In the first case there is evidence against the hypothesis that the simulated distribution of $\bar{X}_{10}$ is Normal. In the second case the evidence is weak.
31 n = 30
> lambda=2; x=c(1:1000); n=30
> for (i in 1:1000) x[i]=mean(rexp(n,lambda))
> ######### empirical mean and standard deviation
> ks.test(x,"pnorm",mean(x),sd(x))
One-sample Kolmogorov-Smirnov test
data: x
D = [...], p-value = [...]
alternative hypothesis: two-sided
> ######### theoretical mean and standard deviation
> ks.test(x,"pnorm",(1/lambda),(1/lambda/sqrt(n)))
One-sample Kolmogorov-Smirnov test
data: x
D = [...], p-value = [...]
alternative hypothesis: two-sided
In both cases there is no evidence to reject the hypothesis that the simulated distribution of $\bar{X}_{30}$ is Normal.
32 6 b2) Two-sample Kolmogorov-Smirnov goodness-of-fit tests
Let X and Y be two continuous independent random variables with cumulative distribution functions $F_X$ and $F_Y$ respectively. The test hypotheses are:
$$H_0: F_X(t) = F_Y(t) \text{ for all } t \in \mathbb{R} \qquad H_1: F_X(t) \neq F_Y(t) \text{ for at least one } t \in \mathbb{R}$$
Let $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$ be two independent random samples with empirical cumulative distribution functions $\hat{F}_X$ and $\hat{F}_Y$ respectively.
The Kolmogorov-Smirnov test statistic is
$$D_{n_1,n_2} = \sup_{x \in \mathbb{R}} \left| \hat{F}_X(x) - \hat{F}_Y(x) \right|$$
$D_{n_1,n_2}$ is a distribution-free statistic.
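Since both empirical distribution functions are step functions that jump only at observed values, the supremum can be found by evaluating them at the pooled sample points. The sketch below does this with base R's `ecdf`, using the Juniper data from the slides' example; the variable names are illustrative.

```r
m <- c(71, 72, 74, 76, 77, 78)   # biomass, male trees
f <- c(73, 79, 80, 82, 83, 84)   # biomass, female trees

pooled <- sort(c(m, f))          # all points where either ECDF can jump

# Absolute difference of the two empirical CDFs at each pooled point.
diffs <- abs(ecdf(m)(pooled) - ecdf(f)(pooled))

d <- max(diffs)                          # sample value of D_{n1,n2}
x_at_max <- pooled[which.max(diffs)]     # where the maximum is reached
print(c(d = d, x_at_max = x_at_max))

# Same statistic from the built-in two-sample test.
d_builtin <- unname(ks.test(m, f)$statistic)
print(d_builtin)
```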
33 Example. Juniper trees
We want to test whether the biomass of male and female Juniper trees has the same distribution. The two samples have size 6 each.
> m=c(71,72,74,76,77,78); f=c(73,79,80,82,83,84)
> Fm_Ff=rbind(cumsum(table(factor(m, levels=71:84)))/6,
+ cumsum(table(factor(f, levels=71:84)))/6)
> round(Fm_Ff,2)
[...]
> plot(ecdf(m),col="blue", xlim=c(70,85),xlab="",ylab="",main="")
> plot(ecdf(f),add=TRUE,col="red", xlim=c(70,85),xlab="",ylab="",main="")
34 The absolute values of the differences between $\hat{F}_M$ and $\hat{F}_F$ are listed below; their maximum is reached at biomass 78.
> D=abs(Fm_Ff[1,]-Fm_Ff[2,])
> round(rbind(Fm_Ff,D),2)
[...]
> max(D)
[1] 0.8333333
Direct computation in R
> ks.test(m, f)
Two-sample Kolmogorov-Smirnov test
data: m and f
D = 0.83333, p-value = [...]
alternative hypothesis: two-sided
There is evidence to reject $H_0$.
35 7. Final remarks
From the book by T. Hill and P. Lewicki (2006), Statistics: Methods and Applications, StatSoft, p. 385
It is not easy to give simple advice concerning the use of nonparametric procedures. Each nonparametric procedure has its peculiar sensitivities and blind spots. For example, the Kolmogorov-Smirnov two-sample test is not only sensitive to differences in the location of distributions (for example, differences in means) but is also greatly affected by differences in their shapes. The Wilcoxon matched pairs test assumes that one can rank order the magnitude of differences in matched observations in a meaningful manner. If this is not the case, one should rather use the Sign test.
36 In general, if the result of a study is important (e.g., does a very expensive and painful drug therapy help people get better?), then it is always advisable to run different nonparametric tests; should discrepancies in the results occur contingent upon which test is used, one should try to understand why some tests give different results.
On the other hand, nonparametric statistics are less statistically powerful (sensitive) than their parametric counterparts, and if it is important to detect even small effects (e.g., is this food additive harmful to people?) one should be very careful in the choice of a test statistic.
Nonparametric methods are most appropriate when the sample sizes are small.
McGill University Faculty of Science Department of Mathematics and Statistics Part A Examination Statistics: Theory Paper Date: 10th May 2015 Instructions Time: 1pm-5pm Answer only two questions from Section
More informationNonparametric tests. Mark Muldoon School of Mathematics, University of Manchester. Mark Muldoon, November 8, 2005 Nonparametric tests - p.
Nonparametric s Mark Muldoon School of Mathematics, University of Manchester Mark Muldoon, November 8, 2005 Nonparametric s - p. 1/31 Overview The sign, motivation The Mann-Whitney Larger Larger, in pictures
More informationANOVA - analysis of variance - used to compare the means of several populations.
12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.
More informationChapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing
Chapter Fifteen Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Copyright 2010 Pearson Education, Inc. publishing as Prentice Hall 15-1 Internet Usage Data Table 15.1 Respondent Sex Familiarity
More informationOne-Sample Numerical Data
One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationAnalysis of variance (ANOVA) Comparing the means of more than two groups
Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments
More informationDr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines)
Dr. Maddah ENMG 617 EM Statistics 10/12/12 Nonparametric Statistics (Chapter 16, Hines) Introduction Most of the hypothesis testing presented so far assumes normally distributed data. These approaches
More informationNonparametric Methods
Nonparametric Methods Marc H. Mehlman marcmehlman@yahoo.com University of New Haven Nonparametric Methods, or Distribution Free Methods is for testing from a population without knowing anything about the
More informationLecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series
Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks
More informationNonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I
1 / 16 Nonparametric tests Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I Nonparametric one and two-sample tests 2 / 16 If data do not come from a normal
More informationLecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.
Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks
More informationAnalyzing Small Sample Experimental Data
Analyzing Small Sample Experimental Data Session 2: Non-parametric tests and estimators I Dominik Duell (University of Essex) July 15, 2017 Pick an appropriate (non-parametric) statistic 1. Intro to non-parametric
More informationThis is particularly true if you see long tails in your data. What are you testing? That the two distributions are the same!
Two sample tests (part II): What to do if your data are not distributed normally: Option 1: if your sample size is large enough, don't worry - go ahead and use a t-test (the CLT will take care of non-normal
More informationBIO 682 Nonparametric Statistics Spring 2010
BIO 682 Nonparametric Statistics Spring 2010 Steve Shuster http://www4.nau.edu/shustercourses/bio682/index.htm Lecture 8 Example: Sign Test 1. The number of warning cries delivered against intruders by
More informationWhat is a Hypothesis?
What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population mean Example: The mean monthly cell phone bill in this city is μ = $42 population proportion Example:
More informationStat 710: Mathematical Statistics Lecture 31
Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:
More informationNon-parametric Hypothesis Testing
Non-parametric Hypothesis Testing Procedures Hypothesis Testing General Procedure for Hypothesis Tests 1. Identify the parameter of interest.. Formulate the null hypothesis, H 0. 3. Specify an appropriate
More informationSelection should be based on the desired biological interpretation!
Statistical tools to compare levels of parasitism Jen_ Reiczigel,, Lajos Rózsa Hungary What to compare? The prevalence? The mean intensity? The median intensity? Or something else? And which statistical
More informationTMA4255 Applied Statistics V2016 (23)
TMA4255 Applied Statistics V2016 (23) Part 7: Nonparametric tests Signed-Rank test [16.2] Wilcoxon Rank-sum test [16.3] Anna Marie Holand April 19, 2016, wiki.math.ntnu.no/tma4255/2016v/start 2 Outline
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationChapter 18 Resampling and Nonparametric Approaches To Data
Chapter 18 Resampling and Nonparametric Approaches To Data 18.1 Inferences in children s story summaries (McConaughy, 1980): a. Analysis using Wilcoxon s rank-sum test: Younger Children Older Children
More informationRecall the Basics of Hypothesis Testing
Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE
More informationIntroduction to Statistical Analysis. Cancer Research UK 12 th of February 2018 D.-L. Couturier / M. Eldridge / M. Fernandes [Bioinformatics core]
Introduction to Statistical Analysis Cancer Research UK 12 th of February 2018 D.-L. Couturier / M. Eldridge / M. Fernandes [Bioinformatics core] 2 Timeline 9:30 Morning I I 45mn Lecture: data type, summary
More informationSTAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test.
STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. Rebecca Barter March 30, 2015 Mann-Whitney Test Mann-Whitney Test Recall that the Mann-Whitney
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationMy data doesn t look like that..
Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing
More informationNonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown
Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding
More information6 Single Sample Methods for a Location Parameter
6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually
More informationFrequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=
A frequency distribution is a kind of probability distribution. It gives the frequency or relative frequency at which given values have been observed among the data collected. For example, for age, Frequency
More informationNon-parametric Tests for Complete Data
Non-parametric Tests for Complete Data Non-parametric Tests for Complete Data Vilijandas Bagdonavičius Julius Kruopis Mikhail S. Nikulin First published 2011 in Great Britain and the United States by
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationKumaun University Nainital
Kumaun University Nainital Department of Statistics B. Sc. Semester system course structure: 1. The course work shall be divided into six semesters with three papers in each semester. 2. Each paper in
More informationAgonistic Display in Betta splendens: Data Analysis I. Betta splendens Research: Parametric or Non-parametric Data?
Agonistic Display in Betta splendens: Data Analysis By Joanna Weremjiwicz, Simeon Yurek, and Dana Krempels Once you have collected data with your ethogram, you are ready to analyze that data to see whether
More informationNominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers
Nominal Data Greg C Elvers 1 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics A parametric statistic is a statistic that makes certain
More informationSummary of Chapters 7-9
Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationHYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă
HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and
More informationBasics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.
Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample
More informationTABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1
TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8
More informationSolutions exercises of Chapter 7
Solutions exercises of Chapter 7 Exercise 1 a. These are paired samples: each pair of half plates will have about the same level of corrosion, so the result of polishing by the two brands of polish are
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationIntuitive Biostatistics: Choosing a statistical test
pagina 1 van 5 < BACK Intuitive Biostatistics: Choosing a statistical This is chapter 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc.
More informationIntroduction to Nonparametric Statistics
Introduction to Nonparametric Statistics by James Bernhard Spring 2012 Parameters Parametric method Nonparametric method µ[x 2 X 1 ] paired t-test Wilcoxon signed rank test µ[x 1 ], µ[x 2 ] 2-sample t-test
More informationStatistics: revision
NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers
More informationRama Nada. -Ensherah Mokheemer. 1 P a g e
- 9 - Rama Nada -Ensherah Mokheemer - 1 P a g e Quick revision: Remember from the last lecture that chi square is an example of nonparametric test, other examples include Kruskal Wallis, Mann Whitney and
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationNon-Parametric Statistics: When Normal Isn t Good Enough"
Non-Parametric Statistics: When Normal Isn t Good Enough" Professor Ron Fricker" Naval Postgraduate School" Monterey, California" 1/28/13 1 A Bit About Me" Academic credentials" Ph.D. and M.A. in Statistics,
More information