Kolmogorov-Smirnov Test for Goodness of Fit in an ordered sequence


Biostatistics 430: Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov Test is designed to test whether observed counts, frequencies, or continuous numerical values collected along an ordered sequence (Y) conform to those expected for a known probability distribution, such as equal probability or the Normal distribution. The test proceeds by treating each value in the ordered sequence Y as a bin (as in a histogram) for which Observed and Expected Cumulative probabilities of Y are calculated. (I use Y instead of X here only to conform to Table 7.4 in Sokal & Rohlf 1995.)

Model:
Let Expected Cumulative Probabilities of Y be specified according to some model over the ordered values in Y.

Assumptions:
Frequency or numerical data collected for each value of Y are independent of the others.

Hypotheses:
H0: P_j are distributed according to the model
H1: P_j differ from the model    < Two-sided test

Data: Sokal & Rohlf Table 7.4, based on data in Box 5.2
SR = READPRN("c:/data/biostatistics/sr5.2r.txt")
Y = sort(body weight column of SR)    < Ordered values which will be tested against some probability distribution.
n = number of values in Y
i = 1 .. n

Construct Cumulative Frequencies:
[Table: columns i, Y, stdy, ExpF, F0.5, g0.5, F0, g0, F1, g1]

stdy = (Y - mean(Y)) / sqrt(Var(Y))    < standardizing Y
ExpF_i = pnorm(stdy_i, 0, 1)    < Expected cumulative probability of Y. In this instance, we will be testing a Normal distribution ~N(0,1) for the standardized data stdy.
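The standardization and expected cumulative probabilities described above can be sketched in R as follows. This is only an illustration: the values in Y are made up for the example and are not the Sokal & Rohlf data.

```r
# Hypothetical ordered measurements (made-up values, not the textbook data)
Y <- sort(c(3.1, 4.7, 5.0, 5.8, 6.2, 7.4, 8.9, 9.3))
n <- length(Y)

# Standardize so the sample has mean 0 and standard deviation 1
stdy <- (Y - mean(Y)) / sqrt(var(Y))

# Expected cumulative probability of each standardized value under N(0, 1)
ExpF <- pnorm(stdy, mean = 0, sd = 1)

print(round(cbind(Y, stdy, ExpF), 3))
```

Because Y is sorted first, ExpF increases from one row to the next, which is what allows it to be compared against the observed cumulative frequencies.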

Construct Statistics g0.5, g0 & g1:
[Table: columns i, Y, stdy (standardized Y), ExpF (expected cumulative probability for ~N(0,1)), and:
 g0.5 = ExpF - ObsF0.5    HKL-corrected ObsF where δ = 0.5
 g0 = ExpF - ObsF0    Khamis-corrected ObsF where δ = 0
 g1 = ExpF - ObsF1    Khamis-corrected ObsF where δ = 1]

F0.5_i = (i - 0.5) / n    < Harter, Khamis & Lamb correction, SR p. 708
g0.5_i = ExpF_i - F0.5_i    < Exp-Obs for δ = 0.5

F0_i = (i - δ) / (n - 2δ + 1) = i / (n + 1)    < Khamis correction for δ = 0, SR p. 7
g0_i = ExpF_i - F0_i    < Exp-Obs for δ = 0

F1_i = (i - δ) / (n - 2δ + 1) = (i - 1) / (n - 1)    < Khamis correction for δ = 1, SR p. 7
g1_i = ExpF_i - F1_i    < Exp-Obs for δ = 1

Test Statistic d_max:
d_max = max(|g|), the largest absolute difference between Expected and Observed cumulative frequencies, evaluated for g0.5, g0 and g1.

Sampling Distribution:
If Assumptions hold and H0 is true, then D_max ~ D(α, n)

Critical Value of the Test:
α = 0.05    < Type I error must be set
C = critical value for α and n    < from Zar 2010 Appendix B.9
The Kolmogorov-Smirnov distribution for continuous data is given in specialized tables (e.g., Zar 2010 Appendix B.9). Sokal & Rohlf offer two different tables depending on whether the distribution ExpF is derived from internal estimates or external parameters.

Decision Rule:
IF d_max > C THEN REJECT H0
IF P < α THEN REJECT H0

Probability Value:
The Kolmogorov-Smirnov distribution is not available for calculating this, so until I find one, we'll have to depend on R's function ks.test() to calculate this for us.

Prototype in R:

    # KOLMOGOROV-SMIRNOV SINGLE SAMPLE TEST
    # FOR CONTINUOUS DATA
    # SOKAL & ROHLF TABLE 7.4 & BOX 5.2
    SR=read.table("c:/DATA/Biostatistics/SR5.2R.txt")
    SR                                # print the data (columns gillwt, bodywt)
    attach(SR)
    Y=sort(bodywt)                    # ordered body weights
    stdy=(Y-mean(Y))/sqrt(var(Y))     # standardize Y
    stdy
    ks.test(stdy,pnorm)               # compare stdy with N(0,1)

Output:

            One-sample Kolmogorov-Smirnov test
    data:  stdy
    D = 0.225, p-value =
    alternative hypothesis: two-sided

Note: in this test, I standardized Y (making stdy) in order to compare stdy with the Normal distribution ~N(0,1).
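The corrected observed cumulative frequencies and the Exp-Obs differences can be sketched in R as below. This is a rough illustration on made-up data, and it assumes the general δ-corrected form (i - δ)/(n - 2δ + 1), which reduces to (i - 0.5)/n for δ = 0.5. The helper name obs_cum_freq is hypothetical.

```r
# Hypothetical data (made-up values for illustration only)
Y <- sort(c(3.1, 4.7, 5.0, 5.8, 6.2, 7.4, 8.9, 9.3))
n <- length(Y)
i <- seq_len(n)
stdy <- (Y - mean(Y)) / sqrt(var(Y))
ExpF <- pnorm(stdy)

# Assumed delta-corrected observed cumulative frequency (hypothetical helper)
obs_cum_freq <- function(i, n, delta) (i - delta) / (n - 2 * delta + 1)

# Largest absolute Exp-Obs difference for each correction
deltas <- c(0.5, 0, 1)
max_abs_g <- sapply(deltas, function(delta) {
  g <- ExpF - obs_cum_freq(i, n, delta)
  max(abs(g))
})
names(max_abs_g) <- paste0("delta=", deltas)
print(max_abs_g)
```

Each entry of max_abs_g is a candidate value to compare against the tabled critical value C for the matching correction.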

Kolmogorov-Smirnov Test for Goodness of Fit for Two Samples:

A similar Kolmogorov-Smirnov approach may be used to compare the distribution of one sample with another. According to Sokal & Rohlf, this test is not as sensitive as the Mann-Whitney Test for assessing differences in median. However, it also looks for differences in the shapes of the two distributions.

Construct Chart:
Sokal & Rohlf Biometry 3rd Edition 1995 Example Box 3.9
Testing differences in distribution between Sample A & B:
[Table: columns length, sample, F1, F2, F1/n1, F2/n2, d; with totals n1, n2 and max d]

F1 & F2 are cumulative counts respectively for samples A & B.
n1 & n2 are total counts for each sample.
d is the absolute difference: |F1/n1 - F2/n2|

Test Statistic:
d_max = max(d)
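The chart construction above can be sketched in R as follows. The two samples are made up for the example (and chosen without ties), so the numbers are not those of Sokal & Rohlf. The column names F1.n1 and F2.n2 are just labels for F1/n1 and F2/n2.

```r
# Hypothetical samples (made-up lengths, no ties)
A <- c(12.1, 13.4, 15.0, 15.9, 17.2, 18.8)
B <- c(10.3, 11.7, 12.9, 14.2, 16.1)
n1 <- length(A)
n2 <- length(B)

# Pool and order all observations, as in the chart
pooled <- sort(c(A, B))

# Cumulative counts for each sample at every pooled value
F1 <- sapply(pooled, function(v) sum(A <= v))
F2 <- sapply(pooled, function(v) sum(B <= v))

# Absolute difference between the cumulative relative frequencies
d <- abs(F1 / n1 - F2 / n2)

chart <- data.frame(length = pooled, F1, F2,
                    F1.n1 = F1 / n1, F2.n2 = F2 / n2, d)
print(chart)

d_max <- max(d)
print(d_max)
```

With untied data such as these, d_max should agree with the D statistic reported by ks.test(A, B).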

Sampling Distribution:
If Assumptions hold and H0 is true, then D_max ~ D(α)

Critical Value of the Test:
α = 0.05    < Type I error must be set
Critical values are given in Sokal & Rohlf Table W.

Probability Value:
The Kolmogorov-Smirnov distribution is not available for calculating this, so until I find one, we'll have to depend on R's function ks.test() to calculate this for us.

Decision Rule:
IF P < α THEN REJECT H0

Prototype in R:

    # KOLMOGOROV-SMIRNOV TWO SAMPLE TEST
    # FOR CONTINUOUS DATA
    # SOKAL & ROHLF EXAMPLE 3.7
    SR=read.table("c:/DATA/Biostatistics/SREX3.7R.txt")
    SR                          # print the data (columns length, sample)
    attach(SR)
    X=length[sample=="A"]       # lengths in sample A
    Y=length[sample=="B"]       # lengths in sample B
    ks.test(X,Y)

Output:

            Two-sample Kolmogorov-Smirnov test
    data:  X and Y
    D = 0.475, p-value =
    alternative hypothesis: two-sided
    Warning message:
    In ks.test(X, Y) : cannot compute correct p-values with ties

Here the calculation of max(d) matches the hand calculation above and in Sokal & Rohlf (1995). This test is supposed to cover continuous distributions, so there should be no ties. Because these data contain ties, R gives a warning message.

Systat Output:
Categorical values encountered during processing are: SAMPLE$ (2 levels) A, B

Kolmogorov-Smirnov Two Sample Test results
[Table: Maximum differences for pairs of groups A, B]    < test statistic matches
[Table: Two-sided probabilities for pairs of groups A, B]    < probability doesn't exactly match!
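One way to see why programs can report slightly different probabilities for the same D is to compute a large-sample p-value by hand. The sketch below uses the limiting Kolmogorov series P = 2 Σ (-1)^(k-1) exp(-2 k² λ²), with λ = D sqrt(n1 n2 / (n1 + n2)). It is an approximation, not the exact small-sample distribution. The function name ks_asymptotic_p and the data are made up for the example.

```r
# Asymptotic two-sided p-value for the two-sample statistic D
# (hypothetical helper; large-sample approximation only)
ks_asymptotic_p <- function(D, n1, n2, terms = 100) {
  lambda <- D * sqrt(n1 * n2 / (n1 + n2))
  k <- seq_len(terms)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * lambda^2))
  min(max(p, 0), 1)  # keep the result inside [0, 1]
}

# Hypothetical samples (made-up values, no ties)
A <- c(12.1, 13.4, 15.0, 15.9, 17.2, 18.8)
B <- c(10.3, 11.7, 12.9, 14.2, 16.1)

D <- unname(ks.test(A, B)$statistic)
p_series <- ks_asymptotic_p(D, length(A), length(B))

# R's own asymptotic p-value, requested with exact = FALSE
p_r <- ks.test(A, B, exact = FALSE)$p.value

print(c(D = D, p_series = p_series, p_r = p_r))
```

For samples this small the exact p-value (R's default when there are no ties) can differ noticeably from the asymptotic one, which is one plausible source of disagreement between programs.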


More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Business Statistics MEDIAN: NON- PARAMETRIC TESTS

Business Statistics MEDIAN: NON- PARAMETRIC TESTS Business Statistics MEDIAN: NON- PARAMETRIC TESTS CONTENTS Hypotheses on the median The sign test The Wilcoxon signed ranks test Old exam question HYPOTHESES ON THE MEDIAN The median is a central value

More information

THE PAIR CHART I. Dana Quade. University of North Carolina. Institute of Statistics Mimeo Series No ~.:. July 1967

THE PAIR CHART I. Dana Quade. University of North Carolina. Institute of Statistics Mimeo Series No ~.:. July 1967 . _ e THE PAR CHART by Dana Quade University of North Carolina nstitute of Statistics Mimeo Series No. 537., ~.:. July 1967 Supported by U. S. Public Health Service Grant No. 3-Tl-ES-6l-0l. DEPARTMENT

More information

Physics 509: Non-Parametric Statistics and Correlation Testing

Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information

Deccan Education Society s FERGUSSON COLLEGE, PUNE (AUTONOMOUS) SYLLABUS UNDER AUTONOMY. FIRST YEAR B.Sc.(Computer Science) SEMESTER I

Deccan Education Society s FERGUSSON COLLEGE, PUNE (AUTONOMOUS) SYLLABUS UNDER AUTONOMY. FIRST YEAR B.Sc.(Computer Science) SEMESTER I Deccan Education Society s FERGUSSON COLLEGE, PUNE (AUTONOMOUS) SYLLABUS UNDER AUTONOMY FIRST YEAR B.Sc.(Computer Science) SEMESTER I SYLLABUS FOR F.Y.B.Sc.(Computer Science) STATISTICS Academic Year 2016-2017

More information

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing & z-test Lecture Set 11 We have a coin and are trying to determine if it is biased or unbiased What should we assume? Why? Flip coin n = 100 times E(Heads) = 50 Why? Assume we count 53 Heads... What could

More information

Chapter 8 Class Notes Comparison of Paired Samples

Chapter 8 Class Notes Comparison of Paired Samples Chapter 8 Class Notes Comparison of Paired Samples In this chapter, we consider the analysis of paired data. To illustrate, (in the spirit of p.332 ex.8.s.5) an agronomist randomly selected six wheat plants

More information

Statistics: revision

Statistics: revision NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers

More information

Wilcoxon Test and Calculating Sample Sizes

Wilcoxon Test and Calculating Sample Sizes Wilcoxon Test and Calculating Sample Sizes Dan Spencer UC Santa Cruz Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33 Differences in the Means of Two Independent Groups When

More information

Textbook Examples of. SPSS Procedure

Textbook Examples of. SPSS Procedure Textbook s of IBM SPSS Procedures Each SPSS procedure listed below has its own section in the textbook. These sections include a purpose statement that describes the statistical test, identification of

More information

Introduction to Biostatistics: Part 5, Statistical Inference Techniques for Hypothesis Testing With Nonparametric Data

Introduction to Biostatistics: Part 5, Statistical Inference Techniques for Hypothesis Testing With Nonparametric Data SPECIAL CONTRIBUTION biostatistics Introduction to Biostatistics: Part 5, Statistical Inference Techniques for Hypothesis Testing With Nonparametric Data Specific statistical tests are used when the null

More information

ANOVA Randomized Block Design

ANOVA Randomized Block Design Biostatistics 301 ANOVA Randomized Block Design 1 ORIGIN 1 Data Structure: Let index i,j indicate the ith column (treatment class) and jth row (block). For each i,j combination, there are n replicates.

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy Probability (Lecture 1) Statistics (Lecture 2) Why do we need statistics? Useful Statistics Definitions Error Analysis Probability distributions Error Propagation Binomial

More information

Logistic Regression Analysis

Logistic Regression Analysis Logistic Regression Analysis Predicting whether an event will or will not occur, as well as identifying the variables useful in making the prediction, is important in most academic disciplines as well

More information

Biostatistics Quantitative Data

Biostatistics Quantitative Data Biostatistics Quantitative Data Descriptive Statistics Statistical Models One-sample and Two-Sample Tests Introduction to SAS-ANALYST T- and Rank-Tests using ANALYST Thomas Scheike Quantitative Data This

More information

Assignment due Probability histogram, population and sample sd, etc. 1. For the following data:

Assignment due Probability histogram, population and sample sd, etc. 1. For the following data: Assignment due 7-23-10 Probability histogram, population and sample sd, etc. 1. For the following data: 2.6 6.3 6.5 2.9 6.7 6.5 5.8 2.6 8.8 4.6 5.5 4.3 incl incl incl incl a. The height of the probability

More information

5 Introduction to the Theory of Order Statistics and Rank Statistics

5 Introduction to the Theory of Order Statistics and Rank Statistics 5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order

More information

Nemours Biomedical Research Statistics Course. Li Xie Nemours Biostatistics Core October 14, 2014

Nemours Biomedical Research Statistics Course. Li Xie Nemours Biostatistics Core October 14, 2014 Nemours Biomedical Research Statistics Course Li Xie Nemours Biostatistics Core October 14, 2014 Outline Recap Introduction to Logistic Regression Recap Descriptive statistics Variable type Example of

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Marc H. Mehlman marcmehlman@yahoo.com University of New Haven Nonparametric Methods, or Distribution Free Methods is for testing from a population without knowing anything about the

More information