Lecture 8: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS

Doug Fowler, GS (dfowler@uw.edu)

Review
- What do we mean by nonparametric? Descriptive statistics or inference methods that don't depend (as much) on the distribution of the population being sampled
- What is a desirable location statistic for ordinal data? The median. Why?
- What are the NP equivalents of a one-sample t-test? The sign test and the Wilcoxon signed rank test. Summary?

Goals
- Comparing the medians of two samples using the Wilcoxon rank sum test
- Comparing the medians of many mutually independent samples using the Kruskal-Wallis test

Wilcoxon Rank Sum Test
- Used to test whether two samples are likely to be drawn from the same distribution or from different ones
- Equivalent to the Mann-Whitney U test (the two statistics differ only by a constant)
- Pool the N = n_x + n_y observations
- Arrange them into an ordered array, preserving labels
- Assign ranks to each element of the array, from 1 to N
- The test statistic T_x is the sum of the ranks of X
- Reject H_0 if T_x is very large or very small compared to the possible values of T_x for the given n_x and N

Wilcoxon Rank Sum Test - Example
Let's say we have measured a transcript level in 6 pre-operative patients (X) and 3 post-operative patients (Y). Does the surgery change transcript levels?

Observation  Sample  Rank
 2.5         X       1
 3           X       2
 6.2         X       3
 9.1         X       4
14.1         Y       5
14.3         X       6
14.7         X       7
15.6         Y       8
16.7         Y       9
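The pooling, ordering and ranking steps can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names `midranks` and `rank_sum` are made up for this example, and tied values are given the average of the ranks they span (an assumption, since the slide data contain no ties).

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def rank_sum(x, y):
    """T_x: sum of the pooled ranks that belong to sample x."""
    pooled_ranks = midranks(list(x) + list(y))
    return sum(pooled_ranks[: len(x)])


pre_op = [2.5, 3, 6.2, 9.1, 14.3, 14.7]  # sample X from the slide
post_op = [14.1, 15.6, 16.7]             # sample Y from the slide
t_x = rank_sum(pre_op, post_op)
print(t_x)  # 1 + 2 + 3 + 4 + 6 + 7 = 23.0
```

The labels are preserved implicitly: X occupies the first `len(x)` positions of the pooled list, so its ranks can be sliced back out after ranking.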

Distribution of T_x When N Is Small
- Consider a case where n_x = 2 and n_y = 3
- We know the ranks must be 1, 2, 3, 4, 5
- Again, the issue is how to assign these ranks among the samples X and Y
- There are C(5, 2) = 10 ways of assigning the five ranks to the two samples
- Each way is equally likely under the null hypothesis, so each has a probability of 10%

Distribution of T_x When N Is Small

X Ranks  Y Ranks  Value of T_x  Probability
1, 2     3, 4, 5  3             0.10
1, 3     2, 4, 5  4             0.10
1, 4     2, 3, 5  5             0.10
2, 3     1, 4, 5  5             0.10
2, 4     1, 3, 5  6             0.10
1, 5     2, 3, 4  6             0.10
2, 5     1, 3, 4  7             0.10
3, 4     1, 2, 5  7             0.10
3, 5     1, 2, 4  8             0.10
4, 5     1, 2, 3  9             0.10

[Figure: probability distribution of T_x; x-axis T_x (3 to 9), y-axis probability (0.00 to 0.20)]
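The table above can be reproduced by brute-force enumeration. The sketch below assumes no ties and uses illustrative names (`null_distribution`, `two_sided_p`) that are not from the lecture. The two-sided p-value is taken as the probability of a T_x at least as far from the centre of the null distribution as the observed value, which is one common convention.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def null_distribution(n_x, n_y):
    """Exact null distribution of T_x: every choice of n_x ranks is equally likely."""
    n_total = n_x + n_y
    counts = Counter(sum(ranks) for ranks in combinations(range(1, n_total + 1), n_x))
    n_ways = sum(counts.values())
    return {t: Fraction(c, n_ways) for t, c in sorted(counts.items())}


def two_sided_p(t_obs, n_x, n_y):
    """P(T_x at least as far from its null mean as t_obs)."""
    dist = null_distribution(n_x, n_y)
    center = Fraction(n_x * (n_x + n_y + 1), 2)  # null mean of T_x
    gap = abs(t_obs - center)
    return sum(p for t, p in dist.items() if abs(t - center) >= gap)


small = null_distribution(2, 3)
for t, p in small.items():
    print(t, float(p))  # T_x = 5, 6 and 7 each occur twice, so p = 0.2

# Transcript example from the lecture: n_x = 6, n_y = 3, observed T_x = 23
p_example = two_sided_p(23, 6, 3)
print(p_example)  # 2/21, roughly 0.095
```

Enumeration is only practical for small samples, since the number of rank assignments grows as C(N, n_x).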

Distribution of T_x When N Is Large
- T_x will be approximately normally distributed. How do we calculate a z value?
- Subtract the mean of the sampling distribution of T_x from the value of T_x we got, and divide by the standard deviation of the sampling distribution of T_x
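As a rough sketch of that calculation, the code below uses the standard no-ties null moments of the rank sum, mean n_x(N+1)/2 and variance n_x n_y (N+1)/12. These formulas are supplied here because the slide's own equation did not survive extraction. The function names are illustrative, and no continuity or tie correction is applied.

```python
from math import erfc, sqrt


def rank_sum_z(t_x, n_x, n_y):
    """z value for T_x under the normal approximation (no ties assumed)."""
    n_total = n_x + n_y
    mean = n_x * (n_total + 1) / 2
    sd = sqrt(n_x * n_y * (n_total + 1) / 12)
    return (t_x - mean) / sd


def two_sided_normal_p(z):
    """Two-sided tail probability of a standard normal at |z|."""
    return erfc(abs(z) / sqrt(2))


# Transcript example: T_x = 23 with n_x = 6 and n_y = 3
z = rank_sum_z(23, 6, 3)
p = two_sided_normal_p(z)
print(round(z, 3), round(p, 3))
```

For this example z is about -1.81. With N = 9 the sample is hardly "large", so the approximation should be read as illustrative only.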


Wilcoxon Rank Sum Test - Example
- In the transcript-level example (6 pre-operative patients, X; 3 post-operative patients, Y), the X observations hold ranks 1, 2, 3, 4, 6 and 7, so T_x = 23
- This value is not extreme enough to reject H_0, so we fail to reject H_0

Distribution of the Populations
- How would inequality of the shapes of the populations from which X and Y are drawn affect the test?
- If the two populations share a median but differ in shape, we could erroneously infer that the populations had different medians
- We must therefore make the assumption that our distributions have the same shape

[Figure: population distributions with the median marked]

How Do We Test Whether the Shapes Are Equal?
- The simplest way is to use boxplots/histograms to get a sense of whether the distributions appear to be similar
- You can use formal tests for dispersion/scale parameters (e.g. Ansari-Bradley), though you have to equalize the location of the two distributions first!
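One way to "equalize the location" is to subtract each sample's median before comparing spreads. The sketch below does only that, then reports the median absolute deviation as a crude spread summary. It is not the Ansari-Bradley test, and the helper names are invented for this illustration.

```python
from statistics import median


def center_at_median(sample):
    """Shift a sample so that its median is zero."""
    m = median(sample)
    return [v - m for v in sample]


def median_abs_deviation(sample):
    """Median distance from the sample median: a simple, robust spread summary."""
    return median([abs(v) for v in center_at_median(sample)])


pre_op = [2.5, 3, 6.2, 9.1, 14.3, 14.7]  # sample X from the lecture example
post_op = [14.1, 15.6, 16.7]             # sample Y from the lecture example

print(median_abs_deviation(pre_op))   # about 4.9
print(median_abs_deviation(post_op))  # about 1.1
```

With samples this small the comparison is only suggestive. The centred values are what a formal scale test would then be applied to.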

Wilcoxon Rank Sum Test
- Counterpart of the Wilcoxon signed rank test for two independent samples
- Used to test whether two samples are likely to be drawn from the same distribution or from different ones
- Equivalent to the Mann-Whitney U test
- Often called the Mann-Whitney-Wilcoxon test (notice the alphabetical order of the dead people)
- Assumptions:
  - Observations are independent
  - Observations are drawn from continuous distributions
  - The values drawn can be ordered
  - The shapes of the two distributions are identical

Frank Wilcoxon
Wilcoxon lived from 1892 to 1965. He was a polymath, working as an oilman and a tree surgeon before training as a physical chemist, working in plant research and then in process control in industry. In a single paper in 1945 he published both tests that bear his name.

Goals
- Comparing the medians of two samples using the Wilcoxon rank sum test
- Comparing the medians of many mutually independent samples using the Kruskal-Wallis test

Kruskal-Wallis Test
- Generalization of the Wilcoxon rank sum test to 3 or more independent random samples
- Used to test whether the medians of the samples are equal
- Nonparametric version of the one-way ANOVA
- Pool all observations
- Rank the pooled samples
- Sum the ranks for each sample to get individual sample rank sums

Kruskal-Wallis Test
- Under the null hypothesis, what should be true about the relationship between any two rank sums R_i, R_j?
- Any R_i comes from a random sample of the ranks
- Therefore, the means of any two rank sums should be equal

Kruskal-Wallis Test
- The sum of all the sample rank sums is
$$\sum_{i=1}^{k} R_i = \frac{N(N+1)}{2}$$
  where N is the total number of pooled observations
- Given that, what is the expected value of any one average rank sum under the null hypothesis (that they are all equal)?
$$E(\bar{R}_i) = \frac{N+1}{2}$$

Kruskal-Wallis Test Statistic
- Given this, how could we construct a test statistic to see whether each sample's average rank deviates from the expected value?
- Use the sum of squares (the sum of the squared differences between each score and the expected value)
- The same idea appears in:
  - Variance/standard deviation
  - Least squares regression
  - ANOVA

Kruskal-Wallis Test Statistic
- Q, the Kruskal-Wallis test statistic, is the weighted sum of squares of deviations of the actual average rank sums from the expected average rank sum
$$Q = \frac{12 \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2}{N(N+1)}$$
- $\bar{R}_i$ is the i-th average rank sum value
- $\frac{N+1}{2}$ is the expected average rank sum value
- $n_i$ is the number of observations in sample i, and k is the number of samples

Kruskal-Wallis Test Statistic Distribution
- What distribution should the KW test statistic follow (hint: it is not a normal distribution)?
- Another hint: the test statistic is a sum of squares of something that IS normally distributed

The Chi-Square Distribution
- The chi-square distribution is the distribution of the sum of squared independent standard normal RVs
$$\chi^2_{df} = \sum_{i=1}^{df} Z_i^2, \quad \text{where } Z_i \sim N(0, 1)$$
- The expected value and variance of the chi-square: E(x) = df, Var(x) = 2 * df

Kruskal-Wallis Test Example
Let's say we survey incoming Genome Sciences classes for the distance each student traveled to get here.

Year 1  Y1 Ranks  Year 2  Y2 Ranks  Year 3  Y3 Ranks  Year 4  Y4 Ranks
164     16        1204    34        131     12        353     23
119     11        1107    33        20      3         47      4
66      6.5       414     25        444     28.5      1333    35
52      5         342     22        444     28.5      83      9
367     24        422     26        426     27        305     21
163     15        195     18        0       1         181     17
138     13        115     10        706     32        247     19
77      8                           542     31        516     30
266     20                          15      2
144     14                          66      6.5
sum     132.5             168               171.5             158
mean    13.25             24                17.15             19.75

- Rank the observations collectively
- Calculate the rank sum, R, for each class
- Calculate the rank sum average, $\bar{R}$, for each class
- Calculate Q
$$Q = \frac{12 \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2}{N(N+1)}$$

p = 2 10 24
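The calculation of Q from the table can be sketched as follows. The code applies the Q formula from the slide as written (no tie correction), and obtains a p-value from the chi-square distribution with k - 1 degrees of freedom, the usual large-sample reference distribution. The helper names (`midranks`, `kruskal_wallis_q`, `chi2_sf`) are illustrative. The p-value printed on the slide is not legible in this transcript, so the number computed here should be checked against the original deck rather than taken as the lecture's result.

```python
from math import erfc, exp, gamma, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def kruskal_wallis_q(groups):
    """Q = 12 * sum(n_i * (mean_rank_i - (N+1)/2)^2) / (N * (N+1))."""
    pooled = [v for g in groups for v in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    expected = (n_total + 1) / 2
    weighted_ss, start = 0.0, 0
    for g in groups:
        mean_rank = sum(ranks[start:start + len(g)]) / len(g)
        weighted_ss += len(g) * (mean_rank - expected) ** 2
        start += len(g)
    return 12 * weighted_ss / (n_total * (n_total + 1))


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for a positive integer df."""
    half = x / 2
    if df % 2 == 0:
        a, upper = 1.0, exp(-half)         # df = 2 has a closed-form tail
    else:
        a, upper = 0.5, erfc(sqrt(half))   # df = 1 is a squared standard normal
    while a < df / 2:                       # step up two degrees of freedom at a time
        upper += half ** a * exp(-half) / gamma(a + 1)
        a += 1
    return upper


year1 = [164, 119, 66, 52, 367, 163, 138, 77, 266, 144]
year2 = [1204, 1107, 414, 342, 422, 195, 115]
year3 = [131, 20, 444, 444, 426, 0, 706, 542, 15, 66]
year4 = [353, 47, 1333, 83, 305, 181, 247, 516]

q = kruskal_wallis_q([year1, year2, year3, year4])
p = chi2_sf(q, df=3)
print(round(q, 3), round(p, 3))
```

With the rank means from the table (13.25, 24, 17.15, 19.75) and (N+1)/2 = 18, this sketch gives Q of about 4.85 on 3 degrees of freedom.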

Kruskal-Wallis Test Outcome
- Given the way the test statistic/hypotheses are constructed, what does a rejection of H_0 mean? That the medians are not all equal (i.e. it doesn't tell you which are unequal)
- Having rejected the null, you might naturally want to know which medians are different
- Pairwise Wilcoxon rank sum tests are a way to do this, but you'll have to correct for multiple tests
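A minimal sketch of the pairwise follow-up is shown below, using the class-distance data from the Kruskal-Wallis example. It makes several assumptions that are not from the lecture: the normal approximation to the rank sum (without tie or continuity correction) is used for each pair, and the Bonferroni correction is chosen as the multiple-testing adjustment. Function names are illustrative.

```python
from itertools import combinations
from math import erfc, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided p-value for the rank sum test via the normal approximation."""
    ranks = midranks(list(x) + list(y))
    n_x, n_y = len(x), len(y)
    n_total = n_x + n_y
    t_x = sum(ranks[:n_x])
    mean = n_x * (n_total + 1) / 2
    sd = sqrt(n_x * n_y * (n_total + 1) / 12)
    return erfc(abs(t_x - mean) / sd / sqrt(2))


def pairwise_bonferroni(groups):
    """Bonferroni-adjusted p-values for every pair of named groups."""
    pairs = list(combinations(sorted(groups), 2))
    return {
        (a, b): min(1.0, len(pairs) * rank_sum_p(groups[a], groups[b]))
        for a, b in pairs
    }


classes = {
    "Year 1": [164, 119, 66, 52, 367, 163, 138, 77, 266, 144],
    "Year 2": [1204, 1107, 414, 342, 422, 195, 115],
    "Year 3": [131, 20, 444, 444, 426, 0, 706, 542, 15, 66],
    "Year 4": [353, 47, 1333, 83, 305, 181, 247, 516],
}
adjusted = pairwise_bonferroni(classes)
for pair, p_adj in adjusted.items():
    print(pair, round(p_adj, 3))
```

Bonferroni multiplies each raw p-value by the number of comparisons (6 here), which is conservative. Other corrections could be substituted in `pairwise_bonferroni`.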

Kruskal-Wallis Test
- Generalization of the Wilcoxon rank sum test to 3 or more independent random samples
- Used to test whether the medians of the samples are equal
- Nonparametric version of the one-way ANOVA
- Assumptions:
  - k mutually independent random samples
  - measured on at least an ordinal scale
  - drawn from a continuous distribution
  - shapes of the distributions are identical

Nonparametric Location Tests
- Can be used to perform one- or two-sample tests with fewer assumptions about the distribution from which the sample(s) are drawn
- Using signs and ranks (rather than interval values, as with parametric tests) enables this and confers other benefits
  - More robust (resistant to outliers)
  - Can be used on ordinal data
- NP tests still have assumptions, and still must be used with care (e.g. zeroes for the sign test, ties, similarity of distributions for the rank sum test)

AND THERE IS NO FREE LUNCH
- Let's say we have measured a transcript level in 6 pre-operative patients (X) and 3 post-operative patients (Y). Does the surgery change transcript levels?
- If we assume normality and equality of variances, then a two-sample t-test can be applied to the same data
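The result shown on the slide did not survive extraction, so the sketch below recomputes the pooled-variance (equal variance) two-sample t statistic for the same nine observations. The function name `pooled_t` is illustrative, and the critical value 2.365 is the tabulated two-sided 5% point of Student's t with 7 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate, plus its df."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    se = sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (mean(x) - mean(y)) / se, df


T_CRIT_7DF = 2.365  # tabulated two-sided 5% critical value for 7 df (assumed)

pre_op = [2.5, 3, 6.2, 9.1, 14.3, 14.7]  # sample X from the slide
post_op = [14.1, 15.6, 16.7]             # sample Y from the slide

t_stat, df = pooled_t(pre_op, post_op)
print(round(t_stat, 3), df, abs(t_stat) > T_CRIT_7DF)
```

Here t is about -2.21 on 7 degrees of freedom, just short of the 5% critical value.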

AND THERE IS NO FREE LUNCH
- Generally speaking, nonparametric tests trade fewer assumptions for less power
- Different nonparametric tests perform better or worse in this regard (efficiency)
- They tend to do better than their parametric counterparts when the parametric assumptions are violated
- The Mann-Whitney-Wilcoxon test is particularly good, giving up little power even for normally distributed data (its asymptotic relative efficiency versus the t-test is 3/π ≈ 0.955)

R Goals
- Executing nonparametric tests in R
- Playing around with different distribution shapes and test assumptions
- Examining effect size vs. test outcome

Reading/Resources
- http://www.statsoft.com/textbook/nonparametric-Statistics/button/2
- http://sci2s.ugr.es/keel/pdf/algorithm/articulo/wilcoxon1945.pdf
- http://www.mayo.edu/mayo-edu-docs/center-for-translational-science-activities-documents/berd-5-6.pdf
- Nonparametric Statistics: An Introduction, Jean Gibbons (available online through UW libraries at http://goo.gl/nerixx)