Metric Predicted Variable on One Group

1 Metric Predicted Variable on One Group
Tim Frasier
Copyright Tim Frasier. This work is licensed under the Creative Commons Attribution 4.0 International license.

2 Prior Homework

3 Prior Homework
As a preface to today, have students choose two papers and evaluate them (this will come in handy later):
1. Choose two top journals in their field
2. Identify how many research papers are included in the most recent issues
3. Use R's sample function to randomly pick which of those papers to read/analyze
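
The random pick in step 3 might look like the following sketch. The number of papers (`n_papers`) and the seed are made-up values for illustration; the same call would be repeated for the second journal.

```r
# Hypothetical: the most recent issue contains 12 research papers.
n_papers <- 12

# Fix the seed so the selection can be reproduced later.
set.seed(2024)

# Draw one paper index uniformly at random from 1..n_papers.
picked <- sample(seq_len(n_papers), size = 1)
picked
```

Setting a seed is optional, but it lets someone else confirm that the paper really was chosen at random.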

4 Prior Homework
Once papers are chosen, read them and:
1. Count how many p-values are reported
2. In what percentage of those is the confidence interval also reported?
3. In what percentage is the effect size, regression coefficient, or actual size of the difference between observed and expected values reported?
4. In what percentage is the biological significance discussed?
5. What percentage are naked p-values, with no other information (including graphics that would provide such information)?
Create a brief presentation to summarize your findings.

5 First, Note on Tests vs Generalized Linear Models

6 Tests vs Generalized Linear Model
In stats courses (or elsewhere), you likely learned different tests
Each applicable in different situations:
t-test, ANOVA, ANCOVA, Binomial Test, Linear Regression, Wilcoxon Rank-Sum Test, Chi-Square Test, Fisher's Exact Test, etc.

7 Tests vs Generalized Linear Model
In stats courses (or elsewhere), you likely learned different tests
Each applicable in different situations
Often, these are taught and/or learned as independent things:
Underlying principles not clear
Memorization rather than understanding

9 A Secret
They're all variations on a generalized linear model
One set of underlying principles, different types of parameters
Once you understand the basic principles, there is no need to memorize anything; just build an appropriate model!
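
As one illustration of this point, a one-sample t-test gives the same answer as a linear model containing only an intercept. The sketch below uses simulated data (not the course data), and the variable names are chosen here for illustration.

```r
set.seed(1)
# Simulated data (not the course data): 25 values centred near 105.
y <- rnorm(25, mean = 105, sd = 12)

# Classical one-sample t-test against a reference value of 100.
tt <- t.test(y, mu = 100)

# The same question as a linear model with only an intercept:
# is the mean of (y - 100) different from zero?
fit <- lm(I(y - 100) ~ 1)
coefs <- summary(fit)$coefficients

t_lm <- coefs["(Intercept)", "t value"]
p_lm <- coefs["(Intercept)", "Pr(>|t|)"]

# The test statistics and p-values agree.
c(t_test = unname(tt$statistic), lm = t_lm)
c(t_test = tt$p.value, lm = p_lm)
```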

10 Back to Metric Predicted Variable on One Group

11 Goals
When would we use this type of analysis?
To obtain parameter estimates and credible intervals for a given set of data
To see if observed data fit some expected value:
  Is the sex ratio of the population 50:50?
  Is the blood pressure of a certain group higher/lower than the recommended value?
  Is the IQ of a certain group higher/lower than the general population average of 100?
  etc.

12 Data

13 Data
IQ data from 25 NSERC recipients*, **
Do they have higher IQs than the population average of 100?
* Not real data.
** I do not currently have an NSERC grant.

14 Data
First, let's get a feel for the data
A very important thing to always do first
    iq <- read.table("IQdata.csv", header = TRUE, sep = ",")

16 Data
First, let's plot a histogram of the data
    hist(iq$iq)
[Figure: "Histogram of iq$iq"; x-axis: iq$iq, y-axis: Frequency]

21 Data
Let's make it look a little nicer
    hist(iq$iq, xlab = "IQ", main = "", col = "steelblue")
xlab: state explicitly what the x-axis label should be (could do the same thing for the y-axis using ylab)
main: state explicitly what the main title should be. Here, we're specifying blank (no main label)
col: specify what fill colour you want
[Figure: histogram; x-axis: IQ, y-axis: Frequency]

22 Data
There is almost no end to the customization you can do with graphics (histograms in this case) in R
See the help file for more information:
    ?hist

24 Data
Get data characteristics
    summary(iq$iq)
    sd(iq$iq)
[Output: Min., 1st Qu., Median, Mean, 3rd Qu., Max., and the standard deviation; the values were not preserved]

25 Frequentist Approach

29 Frequentist Approach
How would you normally analyze these data?
Could use a one-sample t-test (t.test in R)
Two-tailed test = Are the observed data different from the population average of 100?
    t.test(iq$iq, mu = 100)
[Output: One Sample t-test; data: iq$iq; t = 2.134, df = 24; alternative hypothesis: true mean is not equal to 100. The p-value, 95 percent confidence interval, and sample mean were not preserved.]

31 Frequentist Approach
One-tailed test = Are the observed data greater than the population average of 100? (our original question)
    t.test(iq$iq, mu = 100, alternative = c("greater"))
[Output: One Sample t-test; data: iq$iq; t = 2.134, df = 24; alternative hypothesis: true mean is greater than 100; upper confidence limit: Inf. The p-value, lower confidence limit, and sample mean were not preserved.]

34 Frequentist Approach
Let's think about how this may be reported in a paper, maybe something like:
"The IQ of NSERC recipients is significantly higher than the population average of 100 (p < 0.05)."
Based on your analysis of the papers:
Would the histogram be included?
Would the actual average, confidence interval, or difference between observed and expected be included?
Is the difference between the observed mean and the population average biologically important?
Hopefully you're starting to see how little information is provided by p-values, and how strange it is to base our understanding of the world on them.
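
To make the contrast concrete, the sketch below gathers the quantities a reader would need alongside a p-value. It uses simulated data (not the course data), and the choice of Cohen's d as the standardized effect size is an assumption made here, not something the slides specify.

```r
set.seed(2)
# Simulated stand-in for the IQ data (not the course data).
iq_sim <- rnorm(25, mean = 106, sd = 14)

tt <- t.test(iq_sim, mu = 100)

report <- c(
  mean       = mean(iq_sim),
  difference = mean(iq_sim) - 100,                  # raw effect size
  ci_lower   = tt$conf.int[1],                      # 95% CI for the mean
  ci_upper   = tt$conf.int[2],
  cohens_d   = (mean(iq_sim) - 100) / sd(iq_sim),   # standardized effect size
  p_value    = tt$p.value
)
round(report, 3)
```

Reporting the difference and its interval lets the reader judge biological importance, which the p-value alone cannot convey.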

35 Bayesian Approach

37 Standardize the Data
The Markov chain will perform much better if we standardize the data first
Also makes for a clearer choice of priors
Just need to convert it back to the original scale prior to interpreting results!
$z_i = \frac{x_i - \bar{x}}{s_x}$
where $x_i$ is each original x value, $z_i$ is each new standardized x value, $\bar{x}$ is the mean of x, and $s_x$ is the sd of x
Mean will be 0, and sd will be ~1

38 Standardize the Data
    ym <- mean(iq$iq)
    ysd <- sd(iq$iq)
    zy <- (iq$iq - ym) / ysd
    mean(zy)   # effectively 0 (on the order of e-15)
    sd(zy)     # [1] 1
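
A quick way to convince yourself that nothing is lost by standardizing is to run the transformation in both directions. This sketch uses simulated values in place of `iq$iq`; the variable names are illustrative.

```r
set.seed(3)
# Simulated stand-in for iq$iq.
x <- rnorm(25, mean = 105, sd = 12)

xm  <- mean(x)
xsd <- sd(x)
z   <- (x - xm) / xsd          # standardize

# Base R's scale() performs the same centring and scaling.
z_scale <- as.numeric(scale(x))

# Back-transform: multiply by the sd, then add the mean.
x_back <- z * xsd + xm

all.equal(z, z_scale)
all.equal(x, x_back)
```

The same back-transformation is what converts posterior draws of the mean to the original scale; a standard deviation only needs the multiplication step.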

39 Standardize the Data
    hist(zy, main = "", col = "steelblue")
[Figure: histogram of the standardized data; x-axis: zy, y-axis: Frequency]

40 Specify the model
Yay, your first model!
What distribution should we use?
[Figure: histogram of the standardized data; x-axis: zy, y-axis: Frequency]

43 Specify the model
[Equation not preserved]
Normal distribution defined by two parameters (mean and sd)
Want to understand the characteristics of this distribution
Estimate the probabilities associated with these parameters taking different reasonable values

51 Specify the model
Any parameter we are trying to estimate also needs a prior. What makes sense?
[Diagram, built up over several slides:]
  y is normally distributed with mean µ and precision τ = 1/σ²
  µ has a normal prior with mean 0 and sd 10
  σ has a gamma prior with parameters α and β: ranges from 0 to ∞, mode = 1, sd = 10 (see Kruschke (2015))
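
The gamma prior is described by its mode and sd, but gamma distributions are specified by shape and rate. The sketch below shows one way to convert between the two, using the standard relationships mode = (shape - 1) / rate and sd = sqrt(shape) / rate. The function name is made up for this example. The exact values come out near 1.105 and 0.105; the rounded values dgamma(1.1, 0.11) used in the model correspond to a mode of about 0.91 and an sd of about 9.5.

```r
# Shape and rate of a gamma distribution with a given mode and sd.
# Assumes mode > 0 and sd > 0.
gamma_from_mode_sd <- function(mode, sd) {
  rate  <- (mode + sqrt(mode^2 + 4 * sd^2)) / (2 * sd^2)
  shape <- 1 + mode * rate
  c(shape = shape, rate = rate)
}

pars <- gamma_from_mode_sd(mode = 1, sd = 10)
pars

# Check: recover the mode and sd from the shape and rate.
mode_check <- unname((pars["shape"] - 1) / pars["rate"])
sd_check   <- unname(sqrt(pars["shape"]) / pars["rate"])
c(mode = mode_check, sd = sd_check)
```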

52 Specify the model
    y ~ dnorm(mu, sigma)
    mu ~ dnorm(0, 10)
    sigma ~ dgamma(1.1, 0.11)

54 Specify the model
For JAGS, we actually need this as text in its own file
    modelstring = "
      model {
        # Likelihood
        for (i in 1:N) {
          y[i] ~ dnorm(mu, tau)
        }

        # Priors
        mu ~ dnorm(0, (1 / 10^2))
        sigma ~ dgamma(1.1, 0.11)
        tau <- 1 / sigma^2
      }
    "
    writeLines(modelstring, con = "model.txt")
Note that tau is not pulled from a distribution, but rather is calculated from existing values
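
The model text writes `dnorm(mu, tau)` and `dnorm(0, 1 / 10^2)` because JAGS parameterizes the normal distribution by precision (tau = 1 / sd^2) rather than by sd. A minimal sketch of the conversion, with helper names invented for this example:

```r
# Convert between standard deviation and precision (tau = 1 / sd^2).
sd_to_tau <- function(sd) 1 / sd^2
tau_to_sd <- function(tau) 1 / sqrt(tau)

sd_to_tau(10)     # precision corresponding to the prior on mu (sd = 10)
tau_to_sd(0.01)   # and back to the sd
```

Forgetting this conversion is an easy mistake: an sd of 10 passed where JAGS expects a precision would describe a much narrower prior (sd of about 0.32).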

57 Prepare Data for JAGS
What information does JAGS need to run the model?
Look at the likelihood:
    for (i in 1:N) {
      y[i] ~ dnorm(mu, tau)
    }
N: the number of records (rows) in the data set (can be any label we want)
y: a vector of all of the data values

58 Prepare Data for JAGS
Specify as a list for JAGS
    datalist = list(
      y = zy,
      N = length(zy)
    )

59 Specify Initial Values
JAGS often performs better if you give it a starting point
This can just be a draw from the prior
Write a function that will save these as a list
Note that these are R functions, not JAGS (so the normal is specified by its sd, not its precision)
    initslist <- function() {
      list(
        mu = rnorm(n = 1, mean = 0, sd = 10),
        sigma = rgamma(n = 1, shape = 1.1, rate = 0.11)
      )
    }

60 Specify MCMC Parameters and Run
    library(runjags)
    runjagsout <- run.jags(method = "simple",
                           model = "model.txt",
                           monitor = c("mu", "sigma"),
                           data = datalist,
                           inits = initslist,
                           n.chains = 3,
                           adapt = 500,
                           burnin = 1000,
                           sample = 20000,
                           thin = 1,
                           summarise = TRUE,
                           plots = FALSE)

61 Evaluate Performance of the Model

62 Testing Model Performance
There are many ways of doing this; we'll use four, all from the coda package (see its manual for more options and details):
1. Trace plots
2. Autocorrelation plots
3. Gelman and Rubin diagnostic
4. Effective chain length

63 Testing Model Performance
First, load the coda package, and format data as an MCMC list
    library(coda)
    codasamples = as.mcmc.list(runjagsout)

64 Testing Model Performance
codasamples is a list, with each chain as a separate mcmc object
Let's use the head function to take a look at the first chain
    head(codasamples[[1]])
[Output: Markov Chain Monte Carlo (MCMC) output; Start = 1501, End = 1507, Thinning interval = 1; columns mu and sigma. The sampled values were not preserved.]

65 Testing Model Performance
Trace plots
Plots iteration # against the value drawn for that iteration
If the chain is mixing well, it should look like a wide mess that is consistent noise around a mean value: a "spiky caterpillar"

67 Testing Model Performance
Trace plots
Can plot one chain at a time
    par(mfrow = c(1,2))
    traceplot(codasamples[[1]])
par(mfrow = c(1,2)) changes the plot parameters to include both plots in one figure: it specifies a plot area with 1 row (first term) and 2 columns (second term)

68 Testing Model Performance
Trace plots
...or all together (each as a different colour)
    traceplot(codasamples)

69 Testing Model Performance
Autocorrelation plots
If the chain is mixing well, values one step apart should be the only ones showing real autocorrelation, whereas values further apart should not
[Figure: "Good" and "Bad" example autocorrelation plots; x-axis: Lag, y-axis: Autocorrelation]

70 Testing Model Performance
Autocorrelation plots
Can only assess one chain at a time (should all be similar though)
    autocorr.plot(codasamples[[1]])
[Figure: autocorrelation plots for mu and sigma; x-axis: Lag, y-axis: Autocorrelation]

71 Testing Model Performance
Gelman & Rubin diagnostic
Compares variance within vs between chains
If chains are mixing well, these should be the same (ratio = 1.0)
If not, between-chain variance should be greater than within-chain variance (ratio > 1.0)

72 Testing Model Performance
Gelman & Rubin diagnostic
    gelman.diag(codasamples)
    Potential scale reduction factors:
          Point est. Upper C.I.
    mu             1          1
    sigma          1          1
    Multivariate psrf
    1
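
To see where a ratio like this comes from, the sketch below computes a basic potential scale reduction factor by hand on simulated chains. It is a simplified illustration: `gelman.diag` in coda applies additional corrections, so its numbers will differ slightly. The function name and the simulated chains are invented for this example.

```r
# Basic potential scale reduction factor for one parameter.
# `chains` is a matrix: one column per chain, one row per iteration.
psrf_basic <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))     # mean within-chain variance
  B <- n * var(colMeans(chains))       # between-chain variance
  var_hat <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_hat / W)
}

set.seed(4)
# Three simulated chains drawn from the same distribution (well mixed).
good <- matrix(rnorm(3 * 2000), ncol = 3)
# The same chains shifted to sit around different values (poorly mixed).
bad <- sweep(good, 2, c(-2, 0, 2), "+")

psrf_basic(good)   # close to 1
psrf_basic(bad)    # well above 1
```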

73 Testing Model Performance
Effective chain length
Estimate of the equivalent number of independent steps that the chain represents
If steps show autocorrelation, this number will be low
If not, the estimate should be close to the full number of steps times the number of chains

74 Testing Model Performance
Effective chain length
    effectiveSize(codasamples)
[Output: effective sizes for mu and sigma; the values were not preserved]
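
The link between autocorrelation and effective chain length can be sketched with base R alone. The estimator below divides the chain length by 1 plus twice the sum of the positive-lag autocorrelations, stopping at the first non-positive one. This is a rough illustration under that truncation rule, not the method coda uses internally; the function name and simulated chains are assumptions.

```r
# Rough effective sample size for a single chain.
ess_basic <- function(x, max_lag = 200) {
  rho <- acf(x, lag.max = max_lag, plot = FALSE)$acf[-1]   # drop lag 0
  first_nonpos <- which(rho <= 0)[1]
  if (!is.na(first_nonpos)) rho <- rho[seq_len(first_nonpos - 1)]
  length(x) / (1 + 2 * sum(rho))
}

set.seed(5)
n <- 5000
# Independent draws: almost no autocorrelation.
independent <- rnorm(n)
# AR(1) series with strong autocorrelation (coefficient 0.9).
sticky <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))

ess_basic(independent)   # close to n
ess_basic(sticky)        # far smaller than n
```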

75 Viewing Results (already!)

77 Parsing Data
Convert codasamples to a matrix
This will concatenate the chains into one long one
    mcmcchain = as.matrix(codasamples)
Separate out data for each parameter
    zmu <- mcmcchain[, "mu"]
    zsigma <- mcmcchain[, "sigma"]

78 Convert Back to Original Scale
    mu <- (zmu * ysd) + ym
    sigma <- zsigma * ysd

79 Plotting Posterior Distributions
Will use 2 functions from Kruschke: plotPost.R and HDIofMCMC.R
These need to be in R's working directory, then loaded
    source("plotPost.R")

81 Plotting Posterior Distributions
Mean (mu)
    par(mfrow = c(1,1))
    histinfo = plotPost(mu, xlab = bquote(mu))
The HDI width can be changed using the credMass argument. Nothing special about 95% anymore!
[Figure: posterior distribution of µ, annotated with its mean and 95% HDI]

82 Plotting Posterior Distributions
Mean (mu)
    par(mfrow = c(1,1))
    histinfo = plotPost(mu, credMass = 0.89, xlab = bquote(mu))
[Figure: posterior distribution of µ, annotated with its mean and 89% HDI]
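
For readers curious about what an HDI calculation involves, here is a sketch of the general idea: sort the draws, then find the narrowest window that contains the requested share of them. This is an illustration written for this example, not the contents of HDIofMCMC.R, and the function name and simulated draws are assumptions.

```r
# Narrowest interval containing `cred_mass` of the sampled values.
# Assumes cred_mass < 1 and a reasonably large sample.
hdi_of_sample <- function(samples, cred_mass = 0.95) {
  sorted <- sort(samples)
  n <- length(sorted)
  n_in <- ceiling(cred_mass * n)   # index offset spanned by each interval
  n_candidates <- n - n_in         # number of candidate intervals
  widths <- sorted[(n_in + 1):n] - sorted[1:n_candidates]
  best <- which.min(widths)        # start of the narrowest interval
  c(lower = sorted[best], upper = sorted[best + n_in])
}

set.seed(6)
# Simulated stand-in for posterior draws (standard normal).
draws <- rnorm(100000)

h95 <- hdi_of_sample(draws, 0.95)
h89 <- hdi_of_sample(draws, 0.89)
h95
h89
```

For a symmetric distribution the HDI is close to the usual central interval; for a skewed one (such as sigma here) the two can differ noticeably.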

83 Plotting Posterior Distributions
Mean (mu)
    abline(v = 100, lty = 2, lwd = 2, col = "red")
[Figure: posterior distribution of µ with its mean and HDI, plus a dashed red vertical line at 100]

85 Plotting Posterior Distributions
Standard deviation (sigma)
    histinfo = plotPost(sigma, xlab = bquote(sigma), showMode = TRUE)
Show the mode instead of the mean because the distribution is skewed
[Figure: posterior distribution of σ, annotated with its mode and 95% HDI]

86 Interpretation
Distribution is clearly centred at a value > 100
Difference isn't too big though
Useful information on which to base understanding
[Figure: posterior distributions of µ (mean and 95% HDI) and σ (mode and 95% HDI)]

87 How Well Does Our Model Fit the Data? Posterior Predictive Check

88 Posterior Predictive Check
Plot the data
Choose some values from the posterior and plot them over the data

89 Posterior Predictive Check
    histinfo = hist(iq$iq, xlab = "IQ", main = "", col = "skyblue", prob = TRUE)
[Figure: density histogram; x-axis: IQ, y-axis: Density]

91 Posterior Predictive Check
Get the range of values from the observed distribution plot
    xlims = range(histinfo$breaks)
    xlims
[Output: the lower and upper limits; the values were not preserved]
Create a sequence of 500 values within this range
    xsample = seq(from = xlims[1], to = xlims[2], length.out = 500)

94 Posterior Predictive Check
Get the length of the posterior
    chainlength = length(mu)
Get 20 evenly spaced positions along the chain (we'll draw 20 lines)
    xnew = floor(seq(from = 1, to = chainlength, length.out = 20))
floor() rounds each value down to a whole number, so none can be larger than the upper limit

95 Posterior Predictive Check
Loop through the list and plot the associated lines
    for (i in xnew) {
      lines(xsample, dnorm(xsample, mean = mu[i], sd = sigma[i]), col = "gray47")
    }
[Figure: density histogram of IQ with the 20 posterior normal curves overlaid]

96 Were Priors Appropriate?

97 Assessing Priors
Plot the posterior distribution on top of the priors
Ensure the priors cover the range appropriately
See how heavily the posteriors are influenced by the priors (and how much by the data)

98 Assessing Priors Will use transformed data, because that is what the model was based on

99 Assessing Priors
Mean (mu)
Make a sequence containing the range of values over which to evaluate performance
Mean should be 0, with sd = 1, so a range from -2 to 2 should work
    mupriorlist <- seq(from = -2, to = 2, length.out = 500)

100 Assessing Priors
Mean (mu)
Then, generate the prior using the model parameters
    muprior <- dnorm(mupriorlist, mean = 0, sd = 10)

101 Assessing Priors
Mean (mu)
Get the distribution of the posterior using the density function
    mupost <- density(zmu)

102 Assessing Priors
Mean (mu)
Plot the prior
    muhigh <- ceiling(max(mupost$y))
    plot(mupriorlist, muprior, ylim = c(0, muhigh), type = "l", lty = 2,
         xlab = "Possible Values", ylab = "Probability", main = "mu")
[Figure: prior for mu as a dashed line; x-axis: Possible Values, y-axis: Probability]

103 Assessing Priors
Mean (mu)
Add the posterior and a legend
    lines(mupost)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")
[Figure: prior (dashed) and posterior (solid) for mu; x-axis: Possible Values, y-axis: Probability]

104 Assessing Priors
Standard deviation (sigma)
Do the same thing with sigma
    sigmapriorlist <- seq(from = 0, to = 5, length.out = 500)
    sigmaprior <- dgamma(sigmapriorlist, shape = 1.1, rate = 0.11)
    sigmapost <- density(zsigma)

105 Assessing Priors
Standard deviation (sigma)
    sigmahigh <- ceiling(max(sigmapost$y))
    plot(sigmapriorlist, sigmaprior, ylim = c(0, sigmahigh), type = "l", lty = 2,
         xlab = "Possible Values", ylab = "Probability", main = "sigma")
    lines(sigmapost)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")

106 Assessing Priors
Standard deviation (sigma)
[Figure: prior (dashed) and posterior (solid) for sigma; x-axis: Possible Values, y-axis: Probability]

107 Questions?

108 Homework!!

109 Modify Model Based on a t-distribution
The t distribution is less affected by outliers than the normal distribution (i.e., it is robust to outliers)
Kruschke (2015) p. 460

110 Modify Model Based on a t-distribution
Robust estimation
Include all model testing, prior justification, and validation steps!
It is a bit tricky to get the scale correct for plotting lines on top of the histogram

111 Modify Model Based on a t-distribution
dt is centred around 0 in R (not in JAGS), so we need to re-scale
Remember:
  In JAGS: dt(mu, tau, df)
  In R: dt(x, df)
mu and tau in JAGS don't have an equivalent in R, so we need to re-scale:
  dt(mu, tau, df) in JAGS = sqrt(tau) * dt((x - mu) * sqrt(tau), df) in R
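
The re-scaling formula can be wrapped in a small helper and checked numerically. The sketch below is one way to do that; the function name `dt_scaled` and the parameter values (chosen on the standardized scale) are assumptions made for illustration.

```r
# Density of a t distribution with location mu, precision tau, and df
# degrees of freedom, built from R's standard dt() (location 0, scale 1).
dt_scaled <- function(x, mu, tau, df) {
  sqrt(tau) * dt((x - mu) * sqrt(tau), df)
}

# Hypothetical parameter values on the standardized scale.
mu <- 0.2
tau <- 1 / 0.9^2
df <- 4

# A valid density integrates to 1.
area <- integrate(dt_scaled, -Inf, Inf, mu = mu, tau = tau, df = df)$value
area

# With very large df, the t approaches a normal with sd = 1 / sqrt(tau).
x <- seq(-3, 3, by = 0.5)
max_gap <- max(abs(dt_scaled(x, mu, tau, df = 1e6) -
                   dnorm(x, mean = mu, sd = 1 / sqrt(tau))))
max_gap
```

In the posterior predictive check, a helper like this would take the place of `dnorm` when drawing the curves, with mu and the scale first converted back to the original units.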

112 Modify Model Based on a t-distribution
Relevant at this step
Changing this as appropriate for a t-distribution

113 Creative Commons License
Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information.
