Categorical Data Analysis Chapter 3


The actual coverage probability is usually a bit higher than the nominal level.

Confidence intervals for association parameters
Consider the odds ratio in the 2×2 table, θ̂ = (n_11 n_22)/(n_12 n_21). Unless n is very large, its sampling distribution is highly skewed. The log transform converges more rapidly to normality. An estimated standard error for log θ̂ is
σ̂(log θ̂) = √(1/n_11 + 1/n_12 + 1/n_21 + 1/n_22)
By the large-sample normality of log θ̂, the Wald confidence interval for log θ is log θ̂ ± z_{α/2} σ̂(log θ̂). The Wald CI for θ is exp( log θ̂ ± z_{α/2} σ̂(log θ̂) ).
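A minimal sketch of this interval in Python (the helper name is my own, not from the notes):

```python
from math import exp, log, sqrt

def wald_ci_odds_ratio(n11, n12, n21, n22, z=1.96):
    """Wald CI for the odds ratio of a 2x2 table, built on the log scale."""
    theta_hat = (n11 * n22) / (n12 * n21)            # sample odds ratio
    se_log = sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)     # SE of log odds ratio
    lo = exp(log(theta_hat) - z * se_log)
    hi = exp(log(theta_hat) + z * se_log)
    return theta_hat, lo, hi
```

The interval is computed for log θ and then exponentiated, exactly as the slide describes.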

Wald CI for difference of proportions
The estimated difference of proportions π̂_1 − π̂_2 is unbiased for the true difference π_1 − π_2 and has standard error
σ(π̂_1 − π̂_2) = √( π_1(1 − π_1)/n_1 + π_2(1 − π_2)/n_2 )
The estimate σ̂(π̂_1 − π̂_2) replaces π_i by π̂_i. Then
π̂_1 − π̂_2 ± z_{α/2} σ̂(π̂_1 − π̂_2)
is a Wald CI for π_1 − π_2. Like the Wald interval for a single proportion, it usually has true coverage probability less than the nominal confidence level, especially when π_1 and π_2 are near 0 or 1.
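The same plug-in recipe in Python (a sketch; function name is mine):

```python
from math import sqrt

def wald_ci_diff_props(y1, n1, y2, n2, z=1.96):
    """Wald CI for pi_1 - pi_2, plugging sample proportions into the SE."""
    p1, p2 = y1 / n1, y2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d, d - z * se, d + z * se
```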

Wald CI for relative risk
Like the odds ratio, the sampling distribution of the relative risk r converges to normality faster on the log scale. An estimated standard error for log r is
σ̂(log r) = √( (1 − π̂_1)/y_1 + (1 − π̂_2)/y_2 )
The Wald interval for r is exp( log r ± z_{α/2} σ̂(log r) ). It tends to be somewhat conservative.
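And the corresponding sketch for the relative risk (again built on the log scale, then exponentiated):

```python
from math import exp, log, sqrt

def wald_ci_relative_risk(y1, n1, y2, n2, z=1.96):
    """Wald CI for the relative risk r = p1/p2, built on the log scale."""
    p1, p2 = y1 / n1, y2 / n2
    r = p1 / p2
    se_log = sqrt((1 - p1) / y1 + (1 - p2) / y2)   # SE of log r
    return r, exp(log(r) - z * se_log), exp(log(r) + z * se_log)
```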

Example: Aspirin and heart attacks revisited
The proportions having fatal heart attacks were 18/11,034 = 0.00163 for the placebo group and 5/11,037 = 0.00045 for the aspirin group.
The 95% Wald CI for the relative risk is exp( log(0.00163/0.00045) ± 1.96(0.505) ) = (1.34, 9.70). Despite the very large sample sizes, due to the very low rate of fatal heart attacks, the estimated effect is imprecise.
The Wald 95% CI for π_1 − π_2 is 0.0012 ± 1.96(0.00043) = (0.0003, 0.002).
The Wald 95% CI for the odds ratio is exp( log(3.62) ± 1.96(0.51) ) = (1.33, 9.84).

Deriving standard errors with the delta method
If √n (T_n − θ) →d N(0, σ²), then √n ( g(T_n) − g(θ) ) →d N(0, [g′(θ)]² σ²).
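As a worked instance, take g(t) = log t, the case behind the standard errors for log θ̂ and log r above:

```latex
g(t) = \log t, \qquad g'(\theta) = \tfrac{1}{\theta}.
\quad\text{If } \sqrt{n}\,(T_n - \theta) \xrightarrow{d} N(0,\sigma^2),\ \text{then}
\quad \sqrt{n}\,\bigl(\log T_n - \log\theta\bigr) \xrightarrow{d} N\!\bigl(0,\ \sigma^2/\theta^2\bigr),
```

so the standard error of log T_n is approximately σ̂(T_n)/θ̂, which is why the log-scale intervals only need the SE of the statistic itself.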

Score confidence interval for difference in proportions
Consider testing H_0: π_1 − π_2 = Δ_0. Let π̂_1(Δ_0) and π̂_2(Δ_0) denote the ML estimates of π_1 and π_2 subject to the constraint π_1 − π_2 = Δ_0. The score test statistic is
z(Δ_0) = [ (π̂_1 − π̂_2) − Δ_0 ] / √( π̂_1(Δ_0)[1 − π̂_1(Δ_0)]/n_1 + π̂_2(Δ_0)[1 − π̂_2(Δ_0)]/n_2 )
The score CI is the set of Δ_0 such that |z(Δ_0)| < z_{α/2}.
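Inverting this test can be sketched numerically: compute the constrained MLEs for each candidate Δ_0 by maximizing the joint binomial log-likelihood, and keep the Δ_0 values that are not rejected. This is an illustrative implementation assuming scipy is available; the grid window around the point estimate is my own choice, not part of the method.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def constrained_mles(y1, n1, y2, n2, delta):
    # ML estimates of (pi1, pi2) subject to pi1 - pi2 = delta, found by
    # numerically maximizing the joint binomial log-likelihood over pi2.
    def negll(p2):
        eps = 1e-12
        p1 = min(max(p2 + delta, eps), 1 - eps)
        p2c = min(max(p2, eps), 1 - eps)
        return -(y1 * np.log(p1) + (n1 - y1) * np.log(1 - p1)
                 + y2 * np.log(p2c) + (n2 - y2) * np.log(1 - p2c))
    lo, hi = max(1e-10, -delta), min(1 - 1e-10, 1 - delta)
    res = minimize_scalar(negll, bounds=(lo, hi), method="bounded")
    return res.x + delta, res.x

def score_ci_diff(y1, n1, y2, n2, z_crit=1.96):
    p1, p2 = y1 / n1, y2 / n2
    d_hat = p1 - p2
    # search window around the point estimate (an assumption of this sketch)
    keep = []
    for d in np.linspace(d_hat - 0.01, d_hat + 0.01, 2001):
        m1, m2 = constrained_mles(y1, n1, y2, n2, d)
        se = np.sqrt(m1 * (1 - m1) / n1 + m2 * (1 - m2) / n2)
        if abs((d_hat - d) / se) < z_crit:
            keep.append(d)
    return min(keep), max(keep)
```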

Score confidence interval for odds ratio
Consider testing H_0: θ = θ_0 for the odds ratio. Let μ̂_ij(θ_0) be the unique expected frequency estimates that have the same row and column margins as {n_ij} and satisfy
μ̂_11(θ_0) μ̂_22(θ_0) / [ μ̂_12(θ_0) μ̂_21(θ_0) ] = θ_0
The set of θ_0 satisfying
X²(θ_0) = Σ_ij [n_ij − μ̂_ij(θ_0)]² / μ̂_ij(θ_0) < χ²_1(α)
forms a 100(1 − α)% score-test-based confidence interval.

Profile likelihood CI
Again consider testing H_0: θ = θ_0 for the odds ratio. The set of θ_0 satisfying
G²(θ_0) = 2 Σ_i Σ_j n_ij log[ n_ij / μ̂_ij(θ_0) ] < χ²_1(α)
forms a 100(1 − α)% likelihood-ratio-test-based CI.
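In a 2×2 table, fixing the margins and the odds ratio makes μ̂_11(θ_0) the root of a quadratic, so the whole inversion can be sketched with a grid search. The quadratic below comes from substituting μ_12 = r_1 − μ_11, μ_21 = c_1 − μ_11, μ_22 = n − r_1 − c_1 + μ_11 into the odds-ratio constraint; the grid window around θ̂ is my own choice.

```python
import numpy as np

def fitted_counts(table, theta0):
    """Expected counts with the observed margins whose odds ratio is theta0."""
    (n11, n12), (n21, n22) = table
    r1, c1 = n11 + n12, n11 + n21
    n = n11 + n12 + n21 + n22
    if abs(theta0 - 1.0) < 1e-12:
        mu11 = r1 * c1 / n                     # independence case
    else:
        # (1-t) mu^2 + [n - r1 - c1 + t(r1+c1)] mu - t r1 c1 = 0
        a = 1.0 - theta0
        b = n - r1 - c1 + theta0 * (r1 + c1)
        c = -theta0 * r1 * c1
        lo, hi = max(0.0, r1 + c1 - n), min(r1, c1)
        mu11 = next(r.real for r in np.roots([a, b, c])
                    if abs(r.imag) < 1e-9 and lo < r.real < hi)
    return np.array([[mu11, r1 - mu11], [c1 - mu11, n - r1 - c1 + mu11]])

def profile_lr_ci_odds_ratio(table, crit=3.841):
    obs = np.array(table, dtype=float)
    theta_hat = obs[0, 0] * obs[1, 1] / (obs[0, 1] * obs[1, 0])
    # grid on the log scale around theta_hat (an assumption of this sketch)
    grid = np.exp(np.linspace(np.log(theta_hat) - 2, np.log(theta_hat) + 2, 4001))
    keep = [t0 for t0 in grid
            if 2 * np.sum(obs * np.log(obs / fitted_counts(table, t0))) < crit]
    return min(keep), max(keep)
```

The same μ̂_ij(θ_0) solver also serves the score interval of the previous slide: just replace the G² criterion with X²(θ_0).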

Example: Aspirin and heart attacks
Profile LRT CI for odds ratio: (1.44, 2.34)
Score CI for difference in proportions: (0.0047, 0.0108)
Score CI for relative risk: (1.43, 2.30)
Score CI for odds ratio: (1.44, 2.33)

Testing independence in two-way contingency tables
Consider the hypothesis of statistical independence H_0: π_ij = π_i+ π_+j for all i and j. The Pearson X² test statistic is
X² = Σ_ij (n_ij − μ̂_ij)² / μ̂_ij
where μ̂_ij = n π̂_i+ π̂_+j = n_i+ n_+j / n is the expected cell count under independence. Under H_0, X² follows an asymptotic chi-squared distribution with
df = (IJ − 1) − (I − 1) − (J − 1) = (I − 1)(J − 1)
The likelihood-ratio test produces a different statistic,
G² = 2 Σ_i Σ_j n_ij log(n_ij / μ̂_ij)
which follows the same asymptotic distribution as X².
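Both statistics are available in scipy; a short sketch with a hypothetical 2×3 table of counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[10, 20, 30],
                  [15, 12, 9]])   # hypothetical counts for illustration

# Pearson X^2 (no continuity correction, to match the formula above)
x2, p_x2, df, expected = chi2_contingency(table, correction=False)

# Likelihood-ratio statistic G^2
g2, p_g2, _, _ = chi2_contingency(table, correction=False,
                                  lambda_="log-likelihood")
# df is (I-1)(J-1) = (2-1)(3-1) = 2 for this table
```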

Adequacy of chi-squared approximations
When there are independent multinomial samples, independence between the row and column variables corresponds to homogeneity of the outcome probabilities across the rows or columns with fixed margins, and the limiting chi-squared results still hold. The convergence of the actual sampling distribution of X² or G² to the chi-squared distribution applies as n grows for a fixed number of cells. The adequacy of the approximation depends on both n and the number of cells: the size of n/(IJ) that produces adequate approximations for X² tends to decrease as IJ increases.

Adequacy of chi-squared approximations
Research has shown that X² performs adequately with smaller n and sparser tables than G². The distribution of G² is usually poorly approximated by the chi-squared when n/(IJ) < 5. Chi-squared approximations for both tend to be poor for tables containing both very small and moderately large μ_ij. Small-sample methods are available whenever the large-sample approximation is in doubt.

Example: Education and belief in God
X² = 76.1 and G² = 73.2 with df = (3 − 1)(6 − 1) = 10. The P-values are < 0.0001. These statistics provide extremely strong evidence of an association.

Following up chi-squared tests
When a test of independence has a small p-value, what does it say about the strength of the association? Not much: the smaller the p-value, the stronger the evidence that an association exists, but it does not tell you that the association is strong. To understand more about the association, 1) do a residual analysis, and 2) consider partitioning the chi-squared statistic into independent pieces to examine the association in subtables.

Residuals
Pearson residuals:
e_ij = (n_ij − μ̂_ij) / √μ̂_ij
Pearson residuals have asymptotic variances less than 1, averaging [(I − 1)(J − 1)]/IJ.
Standardized residuals:
r_ij = (n_ij − μ̂_ij) / √( μ̂_ij (1 − p_i+)(1 − p_+j) )
In 2×2 tables, df = 1, r_11 = −r_12 = −r_21 = r_22, and each r²_ij = X². A standardized residual that exceeds about 2 or 3 in absolute value indicates lack of fit of H_0.
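The standardized residuals can be computed for a whole table at once (a sketch; the function name is mine):

```python
import numpy as np

def standardized_residuals(table):
    """Standardized residuals for a two-way table under independence."""
    n = np.asarray(table, dtype=float)
    total = n.sum()
    p_row = n.sum(axis=1) / total            # p_{i+}
    p_col = n.sum(axis=0) / total            # p_{+j}
    mu = np.outer(p_row, p_col) * total      # fitted counts under independence
    denom = np.sqrt(mu * np.outer(1 - p_row, 1 - p_col))
    return (n - mu) / denom
```

For a 2×2 table this reproduces the facts stated above: all four residuals are equal up to sign, and each squared residual equals X².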

Example: Education and belief in God revisited
n_36 = 293, μ̂_36 = 358.8, p_3+ = 581/2000 = 0.2905, p_+6 = 1235/2000 = 0.6175.
r_36 = (293 − 358.8) / √( 358.8(1 − 0.2905)(1 − 0.6175) ) = −6.7
We can infer that in the population in 2008, fewer people at the highest level of education would have responded "know God exists" than if the variables were truly independent.

Example: Mosaic plot

Partitioning chi-squared
After rejecting independence, the next question could be: are some individual comparisons more significant than others? Partitioning may show that the association depends largely on certain categories or groupings of categories. For I × J tables, one way to partition G² is into the G² statistics of (I − 1)(J − 1) separate 2×2 subtables, and the G² values of those (I − 1)(J − 1) tables are independent.
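One standard construction uses cumulative 2×2 subtables: for each i ≥ 2, j ≥ 2, compare the counts accumulated over earlier rows and columns with row i and column j. A sketch (the hypothetical table has no zero cells, which the naive G² formula below requires):

```python
import numpy as np

def g2(table):
    """Likelihood-ratio statistic G^2 for independence in a two-way table."""
    n = np.asarray(table, dtype=float)
    mu = np.outer(n.sum(axis=1), n.sum(axis=0)) / n.sum()
    return 2 * np.sum(n * np.log(n / mu))

def partition_g2(table):
    """G^2 components from the (I-1)(J-1) cumulative 2x2 subtables."""
    n = np.asarray(table, dtype=float)
    I, J = n.shape
    comps = []
    for i in range(1, I):
        for j in range(1, J):
            sub = np.array([[n[:i, :j].sum(), n[:i, j].sum()],
                            [n[i, :j].sum(),  n[i, j]]])
            comps.append(g2(sub))
    return comps
```

With this scheme the component G² values sum exactly to the full-table G², illustrating the exact-partitioning property claimed above.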

Example: Origin of schizophrenia
Here G² = 23.04 with df = 4. To understand the association better, we partition G² into 4 independent components.

Example (continued)
In the order of the above tables, G² = 0.29, 1.36, 12.95, and 8.43 respectively. The psychoanalytic school seems more likely than the other schools to ascribe the origin of schizophrenia to a combination of biogenic and environmental factors. Among those who chose either the biogenic or the environmental origin, members of the psychoanalytic school were somewhat more likely than the other schools to choose the environmental origin.

Partitioning
For G², exact partitioning occurs. Pearson X² does not have this property, but since X² and G² are asymptotically equivalent, X² can also be used for subtables. The selection of subtables is not unique; to initiate the process, you can use your residual analysis to identify the most extreme cells and begin there. Association measures such as the odds ratio, relative risk, difference of proportions, and association factors, together with their CIs, can also be used to describe the strength of association in subtables.

Rules for partitioning
The df for the subtables must sum to the df for the full table.
Each cell count in the full table must be a cell count in one and only one subtable.
Each marginal total of the full table must be a marginal total for one and only one subtable.