CDA Chapter 3 part II


Two-way tables with ordered classifications
Let $u_1 \le u_2 \le \dots \le u_I$ denote scores for the row variable X, and let $v_1 \le v_2 \le \dots \le v_J$ denote scores for the column variable Y. Consider the hypotheses $H_0$: X and Y are independent vs. $H_1$: Y is a linear function of X (a linear trend). The Mantel-Haenszel (MH) statistic is $M^2 = (n - 1) r^2$, where n is the total sample size and r is the Pearson correlation coefficient between X and Y based on the above scores. For large samples, $M^2$ is approximately chi-squared with df = 1 under $H_0$.
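As a minimal sketch of the statistic defined above, the following computes r and $M^2$ directly from a table of counts. The function name `linear_trend_m2` and the 3 x 3 counts are hypothetical and serve only as an illustration.

```python
import math

def linear_trend_m2(table, row_scores, col_scores):
    """Return (r, M^2) for a two-way table of counts and the given scores."""
    # One (row score, column score, count) triple per cell.
    cells = [(u, v, table[i][j])
             for i, u in enumerate(row_scores)
             for j, v in enumerate(col_scores)]
    n = sum(c for _, _, c in cells)
    mean_u = sum(u * c for u, _, c in cells) / n
    mean_v = sum(v * c for _, v, c in cells) / n
    # Count-weighted sums of squares and cross-products.
    s_uv = sum((u - mean_u) * (v - mean_v) * c for u, v, c in cells)
    s_uu = sum((u - mean_u) ** 2 * c for u, _, c in cells)
    s_vv = sum((v - mean_v) ** 2 * c for _, v, c in cells)
    r = s_uv / math.sqrt(s_uu * s_vv)
    return r, (n - 1) * r ** 2

# Hypothetical counts, not data from the slides.
example_table = [[20, 10, 5], [10, 15, 10], [5, 10, 20]]
r, m2 = linear_trend_m2(example_table, (1, 2, 3), (1, 2, 3))
print(round(r, 3), round(m2, 2))
```

The value of $M^2$ would then be referred to a chi-squared distribution with df = 1.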

Example: Is happiness associated with political ideology?
With scores (1, 2, 3) for each variable, the correlation is r = 0.135. The linear trend test statistic is $M^2 = (321 - 1)(0.135)^2 = 5.85$. This shows strong evidence of association (P = 0.016).
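The arithmetic in this example can be checked with the standard library alone. This is a sketch, assuming only the quoted values n = 321 and r = 0.135; the helper name `chi2_df1_pvalue` is made up for illustration. With r rounded to three decimals the product comes out near 5.83, so the reported 5.85 presumably uses the unrounded correlation.

```python
import math

n, r = 321, 0.135                 # values quoted in the example
m2_rounded = (n - 1) * r ** 2     # M^2 from the rounded correlation

def chi2_df1_pvalue(x):
    """Upper-tail probability of a chi-squared variable with df = 1."""
    # If Z is standard normal, Z^2 is chi-squared(1), so
    # P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2))

p_reported = chi2_df1_pvalue(5.85)   # P-value for the reported statistic
print(round(m2_rounded, 2), round(p_reported, 3))
```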

Sensitivity to choice of scores
Cochran (1954) noted that if the set of scores is poor, in that it badly distorts a numerical scale that really does underlie the ordered classification, the test will not be sensitive. For most data sets, different choices of monotone scores give similar results. Scores that are linear transforms of each other, such as (1, 2, 3, 4) and (0, 2, 4, 6), have the same absolute correlation and hence the same $M^2$. Results may depend on the scores when the data are highly unbalanced, with some categories having many more observations than others. It is usually better to select scores that reflect perceived distances between categories.
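The invariance under linear transformations can be illustrated numerically. The sketch below uses a hypothetical, unbalanced 2 x 4 table (not data from the slides) and a hypothetical helper `m2_from_table`; it compares the scores (1, 2, 3, 4), their linear transform (0, 2, 4, 6), and a non-linear alternative.

```python
def m2_from_table(table, row_scores, col_scores):
    """M^2 = (n - 1) r^2, computed by expanding the table into observations."""
    xs, ys = [], []
    for u, row in zip(row_scores, table):
        for v, count in zip(col_scores, row):
            xs += [u] * count
            ys += [v] * count
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (n - 1) * sxy ** 2 / (sxx * syy)

# Hypothetical unbalanced counts: most observations fall in the first column.
table = [[30, 4, 1, 1], [25, 6, 2, 2]]
m2_a = m2_from_table(table, (0, 1), (1, 2, 3, 4))
m2_b = m2_from_table(table, (0, 1), (0, 2, 4, 6))    # linear transform 2v - 2
m2_c = m2_from_table(table, (0, 1), (1, 2, 3, 10))   # not a linear transform
print(round(m2_a, 3), round(m2_b, 3), round(m2_c, 3))
```

The first two values agree, while the third differs because the last category is pushed far from the others.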

Sensitivity to choice of scores (continued)
Equally spaced scores often provide a reasonable compromise when the category labels do not suggest obvious choices. When unsure, do a sensitivity analysis: try several reasonable sets of scores and check whether the results are similar. If you choose a set of scores and get a significant result, this suggests that the row and column variables are not independent. However, you shouldn't keep trying different scores until you get a significant result.

Example: Infant birth defects by maternal alcohol consumption
With equally spaced alcohol consumption scores {1, 2, 3, 4, 5}: $M^2 = 1.83$ and P = 0.18.
With midrank scores {8557.5, 24365.5, 32013, 32473, 32555.5}, where the first midrank is (1 + 17,114)/2: $M^2 = 0.35$ and P = 0.55.
With scores {0, 0.5, 1.5, 4.0, 7.0}: $M^2 = 6.57$ and P = 0.01.
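Midrank scores are the average rank of the observations in each ordered category. A short sketch of the calculation follows; the category totals are an assumption, inferred from the midranks quoted above rather than listed in the slides, and the function name `midranks` is illustrative.

```python
def midranks(counts):
    """Midrank of each ordered category: the mean rank of its observations."""
    ranks, start = [], 1
    for c in counts:
        end = start + c - 1              # ranks start..end fall in this category
        ranks.append((start + end) / 2)
        start = end + 1
    return ranks

# Assumed category totals, inferred from the quoted midranks.
alcohol_totals = [17114, 14502, 793, 127, 38]
scores = midranks(alcohol_totals)
print(scores)
```

Because almost all observations fall in the first two categories, the midranks for the last three categories are nearly equal, which helps explain why the midrank analysis gives such a different result from the other score choices.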

Monotone trend alternatives to independence
Consider the hypotheses $H_0$: X and Y are independent vs. $H_1$: Y is a monotone function of X. The gamma test statistic is $z = \hat{\gamma} / SE$, where SE is the standard error of $\hat{\gamma}$ derived by the delta method. Under $H_0$, z approximately follows a standard normal distribution. For the example on happiness and political ideology, $\hat{\gamma} = 0.185$, z = 0.185/0.078 = 2.37, and the two-sided P = 0.018. An approximate 95% CI for $\gamma$ is 0.185 ± 1.96(0.078) = (0.032, 0.338). The true association seems to be relatively weak.
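The sketch below shows how the gamma estimate is built from concordant and discordant pairs, using a hypothetical 3 x 3 table, and then reproduces the z statistic, P-value, and confidence interval from the quoted estimate 0.185 and standard error 0.078. The delta-method standard error itself is not computed here.

```python
import math

def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in a two-way table."""
    C = D = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            for lower in table[i + 1:]:          # rows with a higher X category
                C += count * sum(lower[j + 1:])  # higher on Y as well
                D += count * sum(lower[:j])      # lower on Y
    return C, D

# Hypothetical counts, not data from the slides.
C, D = concordant_discordant([[20, 10, 5], [10, 15, 10], [5, 10, 20]])
gamma_hat = (C - D) / (C + D)

# Wald test and 95% CI from the values quoted in the example.
est, se = 0.185, 0.078
z = est / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # 2 * P(Z > |z|)
ci = (est - 1.96 * se, est + 1.96 * se)
print(round(gamma_hat, 3), round(z, 2), round(p_two_sided, 3), ci)
```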

Extra power with ordinal tests
Consider the same example of happiness and political ideology.
Pearson chi-squared test of independence (ignoring the ordering of the categories): $X^2 = 7.07$ with df = 4, P = 0.13
MH test: $M^2 = 5.85$ with df = 1, P = 0.016
Gamma test: z = 2.37, P = 0.018
The latter two are ordinal tests, which have more power in this example because they are designed to detect linear or monotone patterns, whereas $X^2$ and the likelihood-ratio statistic $G^2$ refer to the most general alternative, in which cell probabilities exhibit any type of statistical dependence.
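One way to see where the extra power comes from is to compare tail probabilities at different degrees of freedom. The sketch below assumes only the quoted statistics and uses the closed-form chi-squared tail for even df; the helper name `chi2_sf_even_df` is made up for illustration.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability for even df (finite Poisson sum)."""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

p_pearson = chi2_sf_even_df(7.07, 4)            # X^2 = 7.07 on df = 4
p_trend = math.erfc(math.sqrt(5.85 / 2))        # M^2 = 5.85 on df = 1
p_trend_if_df4 = chi2_sf_even_df(5.85, 4)       # same value referred to df = 4
print(round(p_pearson, 2), round(p_trend, 3), round(p_trend_if_df4, 2))
```

A statistic of similar size is far more significant on 1 degree of freedom than on 4, which is the gain from concentrating the test on a single trend parameter.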

Trend tests for 2 x J tables
Using scores for the Y variable, construct the MH test, which detects differences between the two row means of the scores on Y. With midrank scores for Y, the MH test is also called the Wilcoxon or Mann-Whitney test; it is in effect a two-sample t-test applied to ranks. The MH test is also equivalent to the test based on $z = (C - D)/SE_0$, where C and D are the numbers of concordant and discordant pairs, respectively, and $SE_0$ is the standard error of C - D under $H_0$. One can also find a score CI for the measure $P(Y_1 > Y_2) - P(Y_2 > Y_1)$, where $Y_1$ and $Y_2$ are responses from rows 1 and 2.
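For a 2 x J table, the pair counts and the sample version of $P(Y_1 > Y_2) - P(Y_2 > Y_1)$ take only a few lines. This is a sketch with hypothetical counts; it gives the point estimate only and does not attempt the score confidence interval.

```python
def pair_counts_2xJ(row1, row2):
    """Count cross-row pairs with Y2 > Y1 and with Y1 > Y2 (ties excluded)."""
    y2_higher = sum(c * sum(row2[j + 1:]) for j, c in enumerate(row1))
    y1_higher = sum(c * sum(row2[:j]) for j, c in enumerate(row1))
    return y2_higher, y1_higher

# Hypothetical counts over three ordered response categories.
row1, row2 = [10, 6, 4], [4, 6, 10]
y2_higher, y1_higher = pair_counts_2xJ(row1, row2)

n_pairs = sum(row1) * sum(row2)               # all pairs, one observation per row
estimate = (y1_higher - y2_higher) / n_pairs  # P(Y1 > Y2) - P(Y2 > Y1)
print(y2_higher, y1_higher, estimate)
```

If row 2 is treated as the higher X category, `y2_higher` is the concordant count C and `y1_higher` is the discordant count D.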

Nominal x ordinal tables and I x 2 tables
For a nominal row variable with more than two categories, the Mann-Whitney test extends to the Kruskal-Wallis test, which is an ANOVA test on the ranks of the Y values. In I x 2 tables, Y is binary. The linear trend statistic then refers to a linear trend in the probability of either response category, such as the probability of malformation as a function of alcohol consumption. The test in this case is often called the Cochran-Armitage trend test, and it is also related to logistic regression.
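The link between the linear trend statistic and the Cochran-Armitage test can be checked numerically. The sketch below assumes the usual textbook form of the Cochran-Armitage statistic, $z^2 = b^2 \sum_i n_i (x_i - \bar{x})^2 / [p(1 - p)]$ with b the weighted least-squares slope of the sample proportions on the scores; the data and the function name `trend_tests` are hypothetical.

```python
def trend_tests(scores, successes, totals):
    """Return (M^2, Cochran-Armitage z^2) for an I x 2 table."""
    n = sum(totals)
    n_success = sum(successes)
    p = n_success / n                          # overall success proportion
    xbar = sum(x * t for x, t in zip(scores, totals)) / n
    sxx = sum(t * (x - xbar) ** 2 for x, t in zip(scores, totals))

    # M^2 from the Pearson correlation of the row score with the 0/1 response.
    sxy = sum((x - xbar) * (s * (1 - p) + (t - s) * (0 - p))
              for x, s, t in zip(scores, successes, totals))
    syy = n_success * (1 - p) ** 2 + (n - n_success) * p ** 2
    m2 = (n - 1) * sxy ** 2 / (sxx * syy)

    # Cochran-Armitage statistic from the slope of proportions on scores.
    b = sum(t * (s / t - p) * (x - xbar)
            for x, s, t in zip(scores, successes, totals)) / sxx
    z2 = b ** 2 * sxx / (p * (1 - p))
    return m2, z2

# Hypothetical dose-response counts, not the malformation data.
totals = (40, 40, 40, 40)
m2, z2 = trend_tests((0, 1, 2, 3), (2, 4, 7, 10), totals)
n_total = sum(totals)
print(round(m2, 3), round(z2, 3))
```

Under these assumptions the two statistics differ only by the factor (n - 1)/n, so they are practically identical for large samples.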

Small-sample inference for contingency tables

Conditional on the row and column margins, one can compute the probability of every possible table. Let $\theta$ be the odds ratio, and consider the one-sided hypotheses $H_0: \theta = 1$ vs. $H_1: \theta > 1$. The P-value equals the sum of the probabilities of the tables whose $n_{11}$ is at least as large as the observed value; therefore, P-value = 0.0238.
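The enumeration can be sketched as follows. The transcript does not show the table behind this example, so the margins below are hypothetical, chosen only because they give the same one-sided P-value of 0.0238; the function name `table_prob` is also illustrative.

```python
from math import comb

def table_prob(n11, row1, row2, col1):
    """Probability of a 2x2 table with cell n11, given fixed margins, under H0."""
    return comb(row1, n11) * comb(row2, col1 - n11) / comb(row1 + row2, col1)

# Hypothetical margins and observed cell count (not taken from the slides).
row1, row2, col1 = 5, 5, 6
observed_n11 = 5

# Range of n11 values compatible with the margins.
lowest, highest = max(0, col1 - row2), min(row1, col1)

# One-sided P-value: tables with n11 at least as large as observed.
p_one_sided = sum(table_prob(t, row1, row2, col1)
                  for t in range(observed_n11, highest + 1))
print(round(p_one_sided, 4))
```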

Fisher's exact test for 2x2 tables
Conditional on both sets of marginal totals, the test statistic for Fisher's exact test is the probability of the table, which follows the hypergeometric distribution under $H_0$.
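The formula for the table probability does not appear in the transcript. For reference, the standard hypergeometric form is given below; the notation ($n_{1+}$ and $n_{2+}$ for the row totals, $n_{+1}$ for the first column total, n for the grand total) is an assumption and may differ from the original slide.

```latex
P(n_{11} = t) =
  \frac{\binom{n_{1+}}{t}\,\binom{n_{2+}}{n_{+1} - t}}{\binom{n}{n_{+1}}},
\qquad
\max(0,\; n_{+1} - n_{2+}) \le t \le \min(n_{1+},\; n_{+1}).
```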

Fisher's exact test for 2x2 tables (continued)
For the one-sided alternative, the same P-value results whether tables are ordered by larger $n_{11}$, larger odds ratio, or larger difference of proportions. For a two-sided alternative, the most common approach sums the probabilities of tables that are no more likely than the observed table, that is, $P\text{-value} = P(p(n_{11}) \le p(t_o))$ for the observed value $t_o$. In the previous example, the two-sided P-value is 0.0238 + 0.0238 = 0.0476.
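The two-sided rule can be sketched in the same way. As before, the margins are hypothetical (the slides' table is not in the transcript) and were picked only because they reproduce 0.0238 + 0.0238 = 0.0476; the small tolerance guards against floating-point ties.

```python
from math import comb

def table_prob(n11, row1, row2, col1):
    """Hypergeometric probability of n11 given fixed margins, under H0."""
    return comb(row1, n11) * comb(row2, col1 - n11) / comb(row1 + row2, col1)

# Hypothetical margins and observed cell count (not taken from the slides).
row1, row2, col1 = 5, 5, 6
observed_n11 = 5

support = range(max(0, col1 - row2), min(row1, col1) + 1)
probs = {t: table_prob(t, row1, row2, col1) for t in support}
p_observed = probs[observed_n11]

# Two-sided P-value: sum over tables no more likely than the observed one.
p_two_sided = sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))
print(round(p_two_sided, 4))
```

With these margins the tables with $n_{11} = 1$ and $n_{11} = 5$ are equally likely, so both tails contribute the same amount.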