Count Predicted Variable & Contingency Tables


1 Count Predicted Variable & Contingency Tables. Tim Frasier. Copyright Tim Frasier. This work is licensed under the Creative Commons Attribution 4.0 International license.

2 Goals and General Idea

3 Goals Contingency tables. When we have count data distributed across a range of different categories: Are counts in one category higher than in another? Is the count in one category contingent upon the level in another category (or categories)?

                 Eye Colour
    Hair Colour   Blue  Brown  Green  Hazel
    Black           20     68      5     15
    Blond           94      7     16     10
    Brunette        84    119     29     54
    Red             17     26     14     14

Data from Snee (1974) The American Statistician 28: 9-12, as presented in Kruschke (2015).

4 Goals Contingency tables. These questions are often addressed with chi-square or exact-test analyses.
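As a rough illustration of the chi-square and exact-test route mentioned above, the sketch below uses R's built-in `HairEyeColor` table, which holds the Snee (1974) counts split by sex. Two assumptions: the built-in table is used as a stand-in for the course's haireye.csv file, and it labels the hair colour "Brown" where the slides say "Brunette".

```r
# Collapse the built-in 3-way table (Hair x Eye x Sex) over Sex
hair_eye <- margin.table(HairEyeColor, margin = c(1, 2))

# Pearson chi-square test of independence between hair and eye colour
chi_out <- chisq.test(hair_eye)
print(chi_out)

# Fisher's exact test; the p-value is simulated because the table is 4 x 4
set.seed(1)
fisher_out <- fisher.test(hair_eye, simulate.p.value = TRUE, B = 2000)
print(fisher_out$p.value)
```

Both tests only say whether the two variables are associated; they do not say which cells drive the association, which is what the model-based approach in the rest of the deck addresses.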

5 Goals Count predicted variable Any time our predicted variable is a count Abundance Counts of individuals (or things) with different traits/characteristics etc.

6 Distribution and Links When modelling count data, it is appropriate to use the Poisson distribution. Defined on the non-negative integers (0, 1, 2, ...). One parameter: lambda (λ), which is both the mean and the variance. [Figure: four panels of Poisson distributions for λ = 1, 3, 5 and 10; x-axis: X values, y-axis: density]
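The four panels on this slide can be approximated numerically with base R's `dpois()`. This is a small sketch, not the code used for the original figure; the grid `x` and the object names are chosen here for illustration.

```r
# Poisson probabilities for the four lambda values shown on the slide
x <- 0:60
lambdas <- c(1, 3, 5, 10)
dens <- sapply(lambdas, function(l) dpois(x, lambda = l))
colnames(dens) <- paste0("lambda=", lambdas)

# The mean of each distribution equals its lambda
means <- colSums(dens * x)
print(round(means, 3))

# Optional: redraw the four panels
par(mfrow = c(2, 2))
for (k in seq_along(lambdas)) {
  plot(x[1:26], dens[1:26, k], type = "h", xlab = "X Values", ylab = "Density",
       main = bquote(lambda == .(lambdas[k])))
}
```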

7 Distribution and Links Contingency table example Cell frequencies are representative of underlying cell probabilities Are nominal variables independent of each other?

8 Distribution and Links Contingency table example. Cell frequencies are representative of underlying cell probabilities. Are the nominal variables independent of each other? If independent, Pr(BlackHair & BrownEyes) = Pr(BlackHair) x Pr(BrownEyes), so the observed count of 68 should be close to N x Pr(BlackHair) x Pr(BrownEyes). True for all cells.

9 Distribution and Links Contingency table example. If there are interaction effects, this will not be the case.

10-14 Distribution and Links Contingency table example

                 Eye Colour
    Hair Colour   Blue  Brown  Green  Hazel  Total
    Black           20     68      5     15    108
    Blond           94      7     16     10    127
    Brunette        84    119     29     54    286
    Red             17     26     14     14     71
    f(c)           215    220     64     93    N = 592

Bottom row, f(c): marginal frequencies of each eye colour. Right-hand column: marginal frequencies of each hair colour.

Pr(BlueEyes) = 215 / 592 = 0.363
Pr(BlackHair) = 108 / 592 = 0.182
Expected count of BlueEyes & BlackHair if independent = (0.363 x 0.182) x 592 = 39

The observed count is 20. Hmm... must be an interaction effect.

15 Distribution and Links Contingency table example. Expected count of BlueEyes & BlackHair = (0.363 x 0.182) x 592 = 39. The joint probability is the product of the relevant marginal probabilities. We're used to dealing with additive combinations. We can convert to the log scale, where the terms become additive.
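The arithmetic above can be repeated for every cell at once. The sketch below assumes the built-in `HairEyeColor` table as a stand-in for the course data file (it labels brunette hair "Brown"); the object names are illustrative.

```r
# Observed 4 x 4 table of hair colour by eye colour
hair_eye <- margin.table(HairEyeColor, margin = c(1, 2))
N <- sum(hair_eye)

# Marginal probabilities as plain named vectors
p_hair <- rowSums(hair_eye) / N
p_eye  <- colSums(hair_eye) / N

# Expected counts under independence: N * Pr(hair) * Pr(eye) for every cell
expected <- outer(p_hair, p_eye) * N

# Blue eyes & black hair: expected versus observed
print(round(expected["Black", "Blue"], 1))
print(hair_eye["Black", "Blue"])
```

Cells where the observed count is far from the expected count are the ones that suggest an interaction between the two variables.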

16 Segue on Logarithms Adding logarithms is the same as multiplying original values 10 x 5 = 50 log(10) + log(5) = log(50)

17 Segue on Logarithms Adding logarithms is the same as multiplying the original values: 10 x 5 = 50, log(10) + log(5) = log(50). Can undo the logarithms (bring the result back to the original scale) by exponentiating: exp(log(10) + log(5)) = 50
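A quick check of these identities in R, followed by the same idea applied to the expected count from the contingency table (the marginal totals 108, 215 and 592 are taken from the earlier slides; the variable names are illustrative):

```r
# Adding logs is the same as multiplying the original values
lhs <- log(10) + log(5)
rhs <- log(50)
print(all.equal(lhs, rhs))

# Exponentiating brings the sum back to the original scale
back <- exp(log(10) + log(5))
print(back)

# Expected count for blue eyes & black hair, built additively on the log scale
log_expected <- log(592) + log(215 / 592) + log(108 / 592)
print(exp(log_expected))
```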

18 Distribution and Links λ = exp(β0 + β1[x1] + β2[x2] + β1x2[x1, x2]), with y ~ Poisson(λ)

19 Distribution and Links The black box into which we can put any of our previous equations, or others

20 Distribution and Links β0: depends on the rest of the model; here, the average across all categories of all variables

21 Distribution and Links β1: the deflection away from baseline due to being in each category of our first nominal predictor variable.

22 Distribution and Links β2: the deflection away from baseline due to being in each category of our second nominal predictor variable.

23 Distribution and Links β1x2: interaction effects. Must also be constrained so that they sum to zero

24 Distribution and Links Note that coefficient estimates will now be on the log scale (even though we haven't explicitly specified them as such)

25 Distribution and Links The link between the predictor and predicted variables is based on logarithms. Previous models (other than logistic) have been based on the identity link. These types of models are called log-linear models
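Written on the log scale, the model described on these slides is linear in its coefficients, which is where the name log-linear comes from. The block below is a sketch reconstructed from the JAGS code later in the deck; the subscript notation is an assumption made here, with j indexing eye colour and k indexing hair colour.

```latex
\begin{align*}
y_i &\sim \mathrm{Poisson}(\lambda_i) \\
\log(\lambda_i) &= \beta_0 + \beta_{1[x_1(i)]} + \beta_{2[x_2(i)]}
                   + \beta_{1 \times 2[x_1(i),\, x_2(i)]} \\
\sum_j \beta_{1[j]} &= 0, \qquad \sum_k \beta_{2[k]} = 0, \\
\sum_j \beta_{1 \times 2[j,k]} &= 0 \;\; \text{for all } k, \qquad
\sum_k \beta_{1 \times 2[j,k]} = 0 \;\; \text{for all } j
\end{align*}
```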

26 Frequentist Approach

27 Frequentist Approach Read data into R

    # stringsAsFactors = TRUE keeps eye and hair as factors (needed later for levels())
    haireye <- read.table("haireye.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)

28 Frequentist Approach First, organize the data

    y <- haireye$freq
    eye <- haireye$eye
    hair <- haireye$hair

29 Frequentist Approach Use the glm function (generalized linear model). Set the family argument to poisson(link = "log")

    glm.out <- glm(y ~ eye + hair + (eye * hair), family = poisson(link = "log"))

30-31 Frequentist Approach

    summary(glm.out)

[Coefficient table: estimates, standard errors and z values not preserved; significance codes only]

    (Intercept)             ***  (p < 2e-16)
    eyebrown                ***
    eyegreen                **
    eyehazel
    hairblond               ***
    hairbrunette            ***
    hairred
    eyebrown:hairblond      ***
    eyegreen:hairblond
    eyehazel:hairblond      ***
    eyebrown:hairbrunette   **
    eyegreen:hairbrunette
    eyehazel:hairbrunette
    eyebrown:hairred        *
    eyegreen:hairred
    eyehazel:hairred

(Intercept): combined effect of blue eyes & black hair (independently, and in all interactions)
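For readers without the haireye.csv file, the same kind of fit can be sketched with R's built-in `HairEyeColor` table. This is an assumption-laden stand-in, not the author's script: the column names (`Hair`, `Eye`, `Freq`) and factor level order come from the built-in data, so the reference cell here is brown eyes & black hair rather than blue eyes & black hair.

```r
# Long-format counts: one row per hair/eye combination
counts <- as.data.frame(margin.table(HairEyeColor, margin = c(1, 2)))

# Saturated log-linear model (main effects plus interaction)
fit <- glm(Freq ~ Eye * Hair, family = poisson(link = "log"), data = counts)

# Additive model (no interaction), for comparison
fit_add <- glm(Freq ~ Eye + Hair, family = poisson(link = "log"), data = counts)

# The intercept is on the log scale: exponentiate to get the reference-cell count
ref_count <- exp(coef(fit)[["(Intercept)"]])
print(ref_count)

# Likelihood-ratio test for the interaction terms
print(anova(fit_add, fit, test = "Chisq"))
```

Because the saturated model has one coefficient per cell, its fitted values reproduce the observed counts; the comparison with the additive model is what tests for an interaction.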

32 Bayesian Approach

33 Load Libraries & Functions

    library(runjags)
    library(coda)
    source("plotpost.r")

34-36 Organize the Data

    # Y-Data
    y <- haireye$freq
    N <- length(y)
    # Constants for the prior on the baseline (we will see why we need these in a minute)
    ylogmean <- log(mean(y))
    # Largest possible sd would occur if the sum of all values was in one cell,
    # and all others were zero
    ylogsd <- log(sd(c(rep(0, N - 1), sum(y))))

    # Eye data (as.factor() guards against the column having been read as character)
    x1 <- as.numeric(as.factor(haireye$eye))
    eyecolours <- levels(as.factor(haireye$eye))
    nEyeColours <- length(unique(x1))

    # Hair data
    x2 <- as.numeric(as.factor(haireye$hair))
    haircolours <- levels(as.factor(haireye$hair))
    nHairColours <- length(unique(x2))

37 Make Data List For JAGS

    # Names must match those used inside the JAGS model
    datalist = list(
      y = y,
      N = N,
      ylogmean = ylogmean,
      ylogsd = ylogsd,
      x1 = x1,
      x2 = x2,
      nEyeColours = nEyeColours,
      nHairColours = nHairColours
    )

38 Define the Model The model is the same as what we've encountered, except that the y values are based on a Poisson distribution. The link between the predictors and the predicted variable is the exponential function.

39-41 Define the Model [Diagram, built up over three slides: y_i ~ Poisson(λ_i); the baseline has a normal prior with mean ylogmean and standard deviation 2 * ylogsd; each deflection has a normal prior with mean 0 and precision τ = 1/σ², where σ has a gamma(α, β) prior]

42-43 Define the Model

    modelstring = "
    model {
      # The likelihood
      # (a instead of b because we still have to modify them so that they
      #  sum to zero within each variable; they are nominal)
      for (i in 1:N) {
        y[i] ~ dpois(lambda[i])
        lambda[i] <- exp(a0 + a1[x1[i]] + a2[x2[i]] + a1a2[x1[i], x2[i]])
      }
      ...

44 Define the Model

      ...
      # The Priors
      # a0
      a0 ~ dnorm(ylogmean, 1 / ((2 * ylogsd)^2))

      # a1
      for (j in 1:nEyeColours) {
        a1[j] ~ dnorm(0, 1 / sigma1[j]^2)
        sigma1[j] ~ dgamma(1.1, 0.11)
      }

      # a2
      for (j in 1:nHairColours) {
        a2[j] ~ dnorm(0, 1 / sigma2[j]^2)
        sigma2[j] ~ dgamma(1.1, 0.11)
      }
      ...
      # a1a2 (first index is eye colour, second is hair colour, as in the likelihood)
      for (j in 1:nEyeColours) {
        for (k in 1:nHairColours) {
          a1a2[j, k] ~ dnorm(0, 1 / sigma12[j, k]^2)
          sigma12[j, k] ~ dgamma(1.1, 0.11)
        }
      }

45 Define the Model

      ...
      # Convert variables to sum-to-zero
      for (j in 1:nEyeColours) {
        for (k in 1:nHairColours) {
          m[j, k] <- a0 + a1[j] + a2[k] + a1a2[j, k]   # Cell means
        }
      }
      b0 <- mean(m[1:nEyeColours, 1:nHairColours])
      for (j in 1:nEyeColours) {
        b1[j] <- mean(m[j, 1:nHairColours]) - b0
      }
      for (j in 1:nHairColours) {
        b2[j] <- mean(m[1:nEyeColours, j]) - b0
      }
      for (j in 1:nEyeColours) {
        for (k in 1:nHairColours) {
          b1b2[j, k] <- m[j, k] - (b0 + b1[j] + b2[k])
        }
      }
    }
    "
    writeLines(modelstring, con = "model.txt")
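The sum-to-zero conversion at the end of the model can be tried out in plain R on made-up numbers. This is a sketch only: the `a` values below are random placeholders rather than posterior samples, and the object names are chosen here for illustration.

```r
set.seed(42)
n_eye <- 4
n_hair <- 4

# Unconstrained coefficients (placeholder values, not estimates)
a0 <- 3
a1 <- rnorm(n_eye)
a2 <- rnorm(n_hair)
a1a2 <- matrix(rnorm(n_eye * n_hair), nrow = n_eye, ncol = n_hair)

# Cell means on the log scale: m[j, k] = a0 + a1[j] + a2[k] + a1a2[j, k]
m <- a0 + outer(a1, a2, "+") + a1a2

# Sum-to-zero versions, mirroring the JAGS code
b0 <- mean(m)
b1 <- rowMeans(m) - b0
b2 <- colMeans(m) - b0
b1b2 <- m - (b0 + outer(b1, b2, "+"))

# Deflections sum to zero, and the cell means are unchanged
print(round(c(sum(b1), sum(b2)), 10))
print(round(rowSums(b1b2), 10))
print(round(colSums(b1b2), 10))
```

The conversion changes only how the cell means are split into baseline, main effects and interaction; the cell means themselves, and therefore the predicted counts, stay the same.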

46-47 Specify Initial Values

    initslist <- function() {
      list(
        a0 = rnorm(n = 1, mean = ylogmean, sd = ylogsd),
        a1 = rnorm(n = nEyeColours, mean = 0, sd = 10),
        a2 = rnorm(n = nHairColours, mean = 0, sd = 10),
        sigma1 = rgamma(n = nEyeColours, shape = 1.1, rate = 0.11),
        sigma2 = rgamma(n = nHairColours, shape = 1.1, rate = 0.11)
      )
    }

Note the absence of initial values for a1a2 and sigma12; JAGS generates initial values for any nodes that are not supplied.

48 Specify MCMC Parameters and Run

    runjagsout <- run.jags(
      method = "simple",
      model = "model.txt",
      monitor = c("b0", "b1", "b2", "b1b2", "sigma1", "sigma2", "sigma12"),
      data = datalist,
      inits = initslist,
      n.chains = 3,
      adapt = 500,
      burnin = 1000,
      sample = 20000,
      thin = 1,
      summarise = TRUE,
      plots = FALSE
    )

49 Next Steps (On Your Own) Retrieve the data and take a peek at the structure Test model performance Extract & parse results

50 View Posteriors

51 Plotting Posterior Distributions β0

    par(mfrow = c(1, 1))
    histinfo = plotpost(b0, xlab = bquote(beta[0]))

[Figure: posterior histogram of β0 with mean and 95% HDI]

52 Plotting Posterior Distributions β1

    par(mfrow = c(2, 2))
    for (i in 1:nEyeColours) {
      histinfo = plotpost(b1[i, ], xlab = bquote(beta * 1[.(i)]),
                          main = paste("b1:", eyecolours[i]))
    }

53-54 Plotting Posterior Distributions β1 [Figure: posterior histograms with mean and 95% HDI; panels b1: Blue, b1: Brown, b1: Green, b1: Hazel] Blue & brown eyes are more frequent, green & hazel less so.

55 Plotting Posterior Distributions β2

    par(mfrow = c(2, 2))
    for (i in 1:nHairColours) {
      histinfo = plotpost(b2[i, ], xlab = bquote(beta * 2[.(i)]),
                          main = paste("b2:", haircolours[i]))
    }

56 Plotting Posterior Distributions β2 [Figure: posterior histograms with mean and 95% HDI; panels b2: Black, b2: Blond, b2: Brunette, b2: Red] Brunettes are more frequent, all others less so.

57 Plotting Posterior Distributions β1β2

    par(mfrow = c(2, 2))
    counter = 1   # row of b1b2 for the current eye/hair combination
    for (i in 1:nEyeColours) {
      for (j in 1:nHairColours) {
        histinfo = plotpost(b1b2[counter, ],
                            xlab = bquote(beta * 1[.(i)] * beta * 2[.(j)]),
                            main = paste("b1:", eyecolours[i], ", b2:", haircolours[j]))
        counter = counter + 1
      }
    }

58-60 Plotting Posterior Distributions β1β2 [Figure: posterior histograms with mean and 95% HDI; panels b1: Blue with b2: Black, Blond, Brunette, Red] Blue eyes & blond hair are found together at a higher frequency than expected if there were no interaction. Blue eyes & the other hair colours are found together less often than expected.

61 Plotting Posterior Distributions β1β2 [Figure: posterior histograms with mean and 95% HDI; panels b1: Green with b2: Black, Blond, Brunette, Red]

62 Plotting Posterior Distributions β1β2 [Figure: posterior histograms with mean and 95% HDI; panels b1: Brown with b2: Black, Blond, Brunette, Red]

63 Plotting Posterior Distributions β1β2 [Figure: posterior histograms with mean and 95% HDI; panels b1: Hazel with b2: Black, Blond, Brunette, Red]

64 Calculating Expected Values

65-68 Calculating Expected Values Same as before with multiple nominal predictors. Expected frequency of those with blue eyes & black hair:

    par(mfrow = c(1, 1))
    # b1[1, ]:   row in b1 representing blue eyes
    # b2[1, ]:   row in b2 representing black hair
    # b1b2[1, ]: row in b1b2 representing the blue eyes & black hair interaction
    expblueblack <- exp(b0 + b1[1, ] + b2[1, ] + b1b2[1, ])
    histinfo = plotpost(expblueblack, xlab = "Frequency", main = "Exp. Blue Eyes Black Hair")

69 Calculating Expected Values [Figure: posterior histogram titled "Exp. Blue Eyes Black Hair" with mean and 95% HDI; x-axis: Frequency]

70 Calculating Expected Values Expected frequency of those with green eyes & blond hair:

    expgreenblond <- exp(b0 + b1[3, ] + b2[2, ] + b1b2[10, ])
    histinfo = plotpost(expgreenblond, xlab = "Frequency", main = "Exp. Green Eyes Blond Hair")

71 Calculating Expected Values [Figure: posterior histogram titled "Exp. Green Eyes Blond Hair" with mean and 95% HDI; x-axis: Frequency]

72 Posterior Predictive Check

73-76 Posterior Predictive Check Same as before with multiple nominal predictors

    for (i in 1:nPred) {
      for (j in 1:postSampSize) {
        # Draw from the Poisson distribution, with the sum of the predictors
        # exponentiated to give lambda. The row of b1b2 holding the proper
        # interaction coefficient has to be worked out from the values of
        # both predictors: ((x1[i] * 4) - 3) + (x2[i] - 1)
        ypostpred[i, j] <- rpois(n = 1,
          lambda = exp(b0[j] + b1[x1[i], j] + b2[x2[i], j] +
                       b1b2[((x1[i] * 4) - 3) + (x2[i] - 1), j]))
      }
    }
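The row lookup used for the interaction coefficient can be checked in isolation. The sketch below wraps the slide's formula in a helper, `interaction_row()`, which is a hypothetical name introduced here. It assumes the rows of b1b2 are ordered with hair colour varying fastest within each eye colour, which is what the slide's formula implies. If the rows instead follow the column order of the MCMC output, where the first index typically varies fastest, the alternative `col_major_row()` would apply, so it is worth comparing against the column names of the samples before relying on either.

```r
# Row of b1b2 for a given eye/hair pair, with hair varying fastest
interaction_row <- function(eye, hair, n_hair = 4) {
  (eye - 1) * n_hair + hair
}

# Alternative if the first index (eye) varies fastest instead
col_major_row <- function(eye, hair, n_eye = 4) {
  eye + (hair - 1) * n_eye
}

# The slide's formula agrees with interaction_row() for all 16 combinations
grid <- expand.grid(hair = 1:4, eye = 1:4)
slide_formula <- ((grid$eye * 4) - 3) + (grid$hair - 1)
print(all(slide_formula == interaction_row(grid$eye, grid$hair)))

# Green eyes (3) & blond hair (2): row 10, as used for expgreenblond
print(interaction_row(3, 2))
print(col_major_row(3, 2))
```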

77 Posterior Predictive Check Not bad! [Figure: posterior predictive check; axis: Frequency]

78 Other Things

79 Other Things Can do all the things we did with other nominal data sets Contrasts between individual groups Contrasts between one group and all others Contrasts between any combination

80 Questions?

81 Creative Commons License Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information.
