Power and Sample Size for Log-linear Models


Power and Sample Size for Log-linear Models
STA 312: Fall 2012
See last slide for copyright information.

Overview

Power of a Statistical Test

The power of a test is the probability of rejecting H_0 when H_0 is false. More power is good, because we want to make correct decisions. Power is not just one number. It is a function of the parameters. Usually,
- For any n, the more incorrect H_0 is, the greater the power.
- For any parameter value satisfying the alternative hypothesis, the larger n is, the greater the power.

Statistical Power Analysis to Select Sample Size

- Pick an effect you'd like to be able to detect: a set of parameter values such that H_0 is false. It should be just over the boundary of interesting and meaningful.
- Pick a desired power, a probability with which you'd like to be able to detect the effect by rejecting the null hypothesis.
- Start with a fairly small n and calculate the power.
- Increase the sample size until the desired power is reached.

There may be shortcuts, but this is the idea; a minimal sketch of the search appears below.
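The following R sketch illustrates the search loop. The effect size and degrees of freedom here are made-up placeholder values; later slides show how to obtain the effect size (the non-centrality parameter divided by n) for a particular null hypothesis and alternative.

# Minimal sketch of the sample size search (placeholder values only)
effectsize = 0.011            # hypothetical value of lambda/n
df = 1                        # hypothetical degrees of freedom of the test
wantpow = 0.80                # desired power
crit = qchisq(0.95, df)       # critical value of the chi-squared test
n = 0; power = 0
while (power < wantpow) {
  n = n + 1
  lambda = n * effectsize               # non-centrality parameter at this n
  power = 1 - pchisq(crit, df, lambda)  # power from the non-central chi-squared
}
n; power                      # smallest n giving at least the desired power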

Power for the Multinomial Model: More Review

The main test statistics are

G^2 = 2 \sum_{j=1}^c n_j \log\left(\frac{n_j}{\hat{\mu}_j}\right) = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)

X^2 = \sum_{j=1}^c \frac{(n_j - \hat{\mu}_j)^2}{\hat{\mu}_j} = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}

When H_0 is true, their distributions are approximately (central) chi-squared. When H_0 is false, their distributions are approximately non-central chi-squared, with non-centrality parameter λ > 0.
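As a small numerical illustration (not from the slides), here is how the two statistics can be computed in R for made-up counts, under the simple null hypothesis that all three cell probabilities are equal, so that the fitted expected frequencies are n/3:

# Made-up counts for a 3-category multinomial, testing H0: all pi_j = 1/3
nj = c(60, 30, 10)
n = sum(nj)
mu = n * rep(1/3, 3)              # fitted expected frequencies under H0
X2 = sum((nj - mu)^2 / mu)        # Pearson chi-squared statistic
G2 = 2 * sum(nj * log(nj / mu))   # likelihood ratio statistic
X2; G2
1 - pchisq(c(X2, G2), df = 2)     # p-values with c - 1 = 2 degrees of freedom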

Large-sample Target of the Restricted MLE

As n → ∞, \hat{\pi}_n → π(M). The notation M means "model" — the model given by H_0. The non-centrality parameter λ depends on π(M) = {π_1(M), ..., π_c(M)}.

X^2 = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}
\lambda = n \sum_{j=1}^c \frac{[\pi_j - \pi_j(M)]^2}{\pi_j(M)}

G^2 = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)
\lambda = 2n \sum_{j=1}^c \pi_j \log\left(\frac{\pi_j}{\pi_j(M)}\right)

These formulas for λ are in Agresti's Categorical Data Analysis, p. 241.

Take a Closer Look

X^2 = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}
\lambda = n \sum_{j=1}^c \frac{[\pi_j - \pi_j(M)]^2}{\pi_j(M)}

G^2 = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)
\lambda = 2n \sum_{j=1}^c \pi_j \log\left(\frac{\pi_j}{\pi_j(M)}\right)

By the Law of Large Numbers, p_j → π_j, always. When H_0 is correct, \hat{\pi}_j → π_j for j = 1, ..., c. When H_0 is wrong, we may have \hat{\pi}_j → π_j for some j, but not all. So λ is n times a quantity saying how wrong H_0 is. Let's call this quantity the effect size.

Why We Need π_1(M), ..., π_c(M)

For any sample size n, power is an increasing function of λ. To calculate λ, we need not only the true parameter π, but also where the restricted MLE goes as n → ∞.

X^2 = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}
\lambda = n \sum_{j=1}^c \frac{[\pi_j - \pi_j(M)]^2}{\pi_j(M)}

G^2 = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)
\lambda = 2n \sum_{j=1}^c \pi_j \log\left(\frac{\pi_j}{\pi_j(M)}\right)

We will see that, given a null hypothesis model, π(M) is a specific function of the true parameter π.

The Law of Large Numbers

Let Y_1, ..., Y_n be independent with common expected value µ. Then \overline{Y}_n \rightarrow E(Y_i) = \mu.

For the multinomial, the data values Y_i are vectors of zeros and ones, with exactly one 1. Marginally they are Bernoulli, so E(Y_i) = µ = π. The vector of sample proportions is \overline{Y}_n = p_n. Technical details aside, p_n → π as n → ∞ like an ordinary limit, and the usual rules apply. In particular, if g(·) is a continuous function, g(p_n) → g(π).

Calculating the Large-sample Target π(M) = {π_1(M), ..., π_c(M)}

Write the restricted MLE as \hat{\pi}_n = g(p_n). Let n → ∞, and

\hat{\pi}_n = g(p_n) \rightarrow g(\pi) = \pi(M)

A One-dimensional Example from Earlier

Recall the Jobs example, with H_0: π_1 = 2π_2.

\hat{\pi} = \left( \frac{2(n_1 + n_2)}{3n}, \ \frac{n_1 + n_2}{3n}, \ \frac{n_3}{n} \right)
          = \left( \frac{2}{3}\left(\frac{n_1}{n} + \frac{n_2}{n}\right), \ \frac{1}{3}\left(\frac{n_1}{n} + \frac{n_2}{n}\right), \ \frac{n_3}{n} \right)
          = \left( \frac{2}{3}(p_1 + p_2), \ \frac{1}{3}(p_1 + p_2), \ p_3 \right)
          \rightarrow \left( \frac{2}{3}(\pi_1 + \pi_2), \ \frac{1}{3}(\pi_1 + \pi_2), \ \pi_3 \right) = \pi(M)
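For a concrete, made-up illustration of the effect size calculation, suppose the true probabilities are π = (0.50, 0.30, 0.20), which violates H_0: π_1 = 2π_2. A minimal R sketch, assuming one degree of freedom for this one-constraint null hypothesis:

# Hypothetical true probabilities violating H0: pi1 = 2*pi2
piTrue = c(0.50, 0.30, 0.20)
# Large-sample target of the restricted MLE, from the formula above
piM = c(2/3 * (piTrue[1] + piTrue[2]),
        1/3 * (piTrue[1] + piTrue[2]),
        piTrue[3])
# Effect sizes (lambda / n) based on X^2 and on G^2
es.X2 = sum((piTrue - piM)^2 / piM)
es.G2 = 2 * sum(piTrue * log(piTrue / piM))
# Find the sample size giving power 0.80 for the Pearson test (df = 1)
crit = qchisq(0.95, 1); wantpow = 0.80; power = 0; n = 0
while (power < wantpow) {
  n = n + 1
  power = 1 - pchisq(crit, 1, n * es.X2)
}
n; power   # required n is somewhere above 1,200 for these made-up values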

Again, We Need π(M) to Calculate λ

X^2 = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}
\lambda = n \sum_{j=1}^c \frac{[\pi_j - \pi_j(M)]^2}{\pi_j(M)}

G^2 = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)
\lambda = 2n \sum_{j=1}^c \pi_j \log\left(\frac{\pi_j}{\pi_j(M)}\right)

Divide the test statistic by n and then let n → ∞ to get the effect size. Then multiply by a chosen value of n to get λ. Increase n until the power is as high as you wish. Of course you can (should) do this before seeing any data.

Chi-squared Test of Independence

H_0: π_{ij} = π_{i+} π_{+j}

                   Passed the Course
Course           Did not pass   Passed    Total
Catch-up         π_11           π_12      π_1+
Mainstream       π_21           π_22      π_2+
Elite            π_31           π_32      π_3+
Total            π_+1           π_+2      1

MLEs of the marginal probabilities are \hat{\pi}_{i+} = p_{i+} and \hat{\pi}_{+j} = p_{+j}, so

\hat{\pi}_{ij} = p_{i+} p_{+j} \rightarrow \pi_{i+} \pi_{+j} = \pi_{ij}(M)

Non-centrality Parameters for Testing Independence

X^2 = n \sum_{i=1}^I \sum_{j=1}^J \frac{(p_{ij} - p_{i+} p_{+j})^2}{p_{i+} p_{+j}}
\lambda = n \sum_{i=1}^I \sum_{j=1}^J \frac{(\pi_{ij} - \pi_{i+} \pi_{+j})^2}{\pi_{i+} \pi_{+j}}

G^2 = 2n \sum_{i=1}^I \sum_{j=1}^J p_{ij} \log\left(\frac{p_{ij}}{p_{i+} p_{+j}}\right)
\lambda = 2n \sum_{i=1}^I \sum_{j=1}^J \pi_{ij} \log\left(\frac{\pi_{ij}}{\pi_{i+} \pi_{+j}}\right)

with degrees of freedom df = (I − 1)(J − 1).

A Cheap Way to Calculate the Non-centrality Parameter for Any Alternative Hypothesis
(Just for testing independence, so far)

X^2 = n \sum_{i=1}^I \sum_{j=1}^J \frac{(p_{ij} - p_{i+} p_{+j})^2}{p_{i+} p_{+j}}
\lambda = n \sum_{i=1}^I \sum_{j=1}^J \frac{(\pi_{ij} - \pi_{i+} \pi_{+j})^2}{\pi_{i+} \pi_{+j}}

G^2 = 2n \sum_{i=1}^I \sum_{j=1}^J p_{ij} \log\left(\frac{p_{ij}}{p_{i+} p_{+j}}\right)
\lambda = 2n \sum_{i=1}^I \sum_{j=1}^J \pi_{ij} \log\left(\frac{\pi_{ij}}{\pi_{i+} \pi_{+j}}\right)

- Make up some data that represent the alternative hypothesis of interest. Sample size does not matter; n = 100 would make the cell frequencies percentages.
- Calculate the test statistic on your made-up data.
- Divide by the n you used. Now you have an effect size.
- Multiply your effect size by any n to get a λ.

Or you can follow the formulas involving π directly, but it's the same thing.

Example

As part of their rehabilitation, equal numbers of convicted criminals are randomly assigned to one of two treatment programmes just prior to their release on parole. How many will be re-arrested within 12 months? Suppose the programmes differ somewhat in their effectiveness, but not much: say 60% in Programme A will be re-arrested, compared to 70% in Programme B. What total sample size is required so that this difference will be detected by a Pearson chi-squared test of independence, with power at least 0.80? (Note that this test is identical to the one implied by the more appropriate product-multinomial model.)

Sample Size Required for Power of 0.80
Conditional probability of re-arrest 0.60 for Programme A, and 0.70 for B

> # R calculation for the recidivism example
> crit = qchisq(0.95,df=1); crit
[1] 3.841459
> dummy = rbind(c(40,60),
+               c(30,70))
> X2 = loglin(dummy,margin=list(1,2))$pearson
2 iterations: deviation 0
> effectsize = X2/200; effectsize
[1] 0.01098901
> wantpow = 0.80; power = 0 ; n = 50; crit = qchisq(0.95,1)
> while(power<wantpow)
+ {
+   n = n+2 # Keeping equal sample sizes
+   lambda = n*effectsize
+   power = 1-pchisq(crit,1,lambda)
+ } # End while power < wantpow
> n; power
[1] 716
[1] 0.8009609
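As a cross-check (not part of the original slides), the same effect size can be obtained directly from the λ formula for testing independence, using the true cell probabilities implied by equal assignment to programmes and re-arrest probabilities of 0.60 and 0.70:

# True joint probabilities: rows are programmes (each 0.5), columns are re-arrested / not
piTrue = rbind(c(0.5*0.60, 0.5*0.40),
               c(0.5*0.70, 0.5*0.30))
# Large-sample target under independence: products of the marginal probabilities
piM = outer(rowSums(piTrue), colSums(piTrue))
effectsize = sum((piTrue - piM)^2 / piM)
effectsize   # about 0.011, matching X2/200 above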

What if the probability is 0.50 for Programme A, compared to 0.70 for B?

> # What if Programme A reduced re-arrests to 50%?
> dummy = rbind(c(50,50),
+               c(30,70))
> X2 = loglin(dummy,margin=list(1,2))$pearson
2 iterations: deviation 0
> effectsize = X2/200; effectsize
[1] 0.04166667
> wantpow = 0.80; power = 0 ; n = 50; crit = qchisq(0.95,1)
> while(power<wantpow)
+ {
+   n = n+2 # Keeping equal sample sizes
+   lambda = n*effectsize
+   power = 1-pchisq(crit,1,lambda)
+ } # End while power < wantpow
> n; power
[1] 190
[1] 0.8033634

Can there be too much power? What if the sample size is HUGE?

> # Power of a trivial effect for n = 100,000
> dummy = rbind(c(50,50),
+               c(49,51))
> X2 = loglin(dummy,margin=list(1,2))$pearson
2 iterations: deviation 0
> effectsize = X2/200; effectsize
[1] 0.00010001
> lambda = 100000 * effectsize
> power = 1-pchisq(crit,1,lambda); power
[1] 0.8854098

General Log-linear Models are Tougher (Until You Think About It)

Explicit formulas for the MLEs under the model are not available in general. So we can't just re-write them in terms of p_{ij...} and let n → ∞ to get π_{ij...}(M). However, ...

We Know How the Iterative Proportional Fitting Algorithm Produces \hat{\pi}

It's based on certain observed marginal totals. These are based on n_{ij...} = n p_{ij...}, so \hat{\pi}_n is a function of p_n: \hat{\pi}_n = g(p_n).

What kind of function? It's a sequence of multiplications, all continuous. A continuous function of a continuous function is continuous. So the entire composite function, though complicated, is continuous, and

\hat{\pi}_n = g(p_n) \rightarrow g(\pi) = \pi(M)

\hat{\pi}_n = g(p_n) \rightarrow g(\pi) = \pi(M)

The large-sample target of \hat{\pi}_n is g(π). And we know what the function g(·) is, or anyway we know how to compute it. It's the iterative proportional fitting algorithm, applied to sets of marginal probabilities instead of totals. Or you could apply it to quantities like n π_{i+k}, and then divide by n at the end.

Note that \hat{\pi}_j = g_j(p) and π_j(M) = g_j(π)
It's the same function: iterative proportional fitting.

X^2 = n \sum_{j=1}^c \frac{(p_j - \hat{\pi}_j)^2}{\hat{\pi}_j}
\lambda = n \sum_{j=1}^c \frac{[\pi_j - \pi_j(M)]^2}{\pi_j(M)}

G^2 = 2n \sum_{j=1}^c p_j \log\left(\frac{p_j}{\hat{\pi}_j}\right)
\lambda = 2n \sum_{j=1}^c \pi_j \log\left(\frac{\pi_j}{\pi_j(M)}\right)

So we can do what we did before with tests of independence (a sketch follows below).
- Make up a data table that represents the alternative hypothesis of interest. Cell frequencies are n π_j, for some convenient n.
- Calculate the test statistic on your made-up data.
- Divide by the n you used; now you have an effect size.
- Multiply your effect size by any n, to get a λ.
- Use that λ to calculate a power value.
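Here is a minimal R sketch of this recipe for a log-linear model beyond independence. The table is made up to represent an alternative for a 2 x 2 x 2 table in which the conditional odds ratio between the first two variables is 2 at one level of the third variable and 1 at the other; the null hypothesis is the no-three-way-interaction model [12][13][23].

# Made-up 2 x 2 x 2 table of cell frequencies (n = 100, so cells are percentages)
dummy = array(c(20, 10, 10, 10,    # level 1 of variable 3: odds ratio 2
                12, 12, 13, 13),   # level 2 of variable 3: odds ratio 1
              dim = c(2, 2, 2))
# Fit the null model [12][13][23] by iterative proportional fitting
fit = loglin(dummy, margin = list(c(1,2), c(1,3), c(2,3)))
effectsize = fit$pearson / sum(dummy)   # divide X^2 by the n used
# Search for the sample size giving power 0.80
crit = qchisq(0.95, fit$df); wantpow = 0.80; power = 0; n = 0
while (power < wantpow) {
  n = n + 1
  power = 1 - pchisq(crit, fit$df, n * effectsize)
}
n; power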

Copyright Information

This slide show was prepared by Jerry Brunner, Department of Statistics, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. The LaTeX source code is available from the course website: http://www.utstat.toronto.edu/~brunner/oldclass/312f12