Metric Predicted Variable on Two Groups

1 Metric Predicted Variable on Two Groups. Tim Frasier. Copyright Tim Frasier. This work is licensed under the Creative Commons Attribution 4.0 International license.

2 Goals

3 Goals When would we use this type of analysis? When comparing data between two groups: are the means different? Is the variability different? And so on. This covers the t-test and its equivalents, but is more flexible.

4 Data

5 Data Metric data from two groups, stored in twodata.csv

6 Data Read data into R

    twodata <- read.table("twodata.csv", header = TRUE, sep = ",")

8 Data Let's get a feel for the data

    summary(twodata)

(Output: Min., 1st Qu., Median, Mean, 3rd Qu., and Max. for y1 and y2; the numeric values are not preserved.) y2 seems to have higher values than y1.

10 Data

    sd(twodata$y1)
    sd(twodata$y2)

(Output values not preserved.) y2 seems to have a larger standard deviation than y1.

11 Data Let's look at the data. There are many potential ways to plot this. We'll look at three.

12-17 Data Scatter plot of y2 against y1

    plot(twodata$y1, twodata$y2,
         ylim = c(-6, 12), xlim = c(-6, 12),  # same range on both axes, so it is easy to compare
         xlab = "y1", ylab = "y2",            # labels for the x- and y-axes
         pch = 16,                            # filled circles as plotting symbol (see ?pch)
         col = rgb(0, 0, 1, 0.5))             # semi-transparent blue (see ?rgb)
    abline(0, 1, lwd = 2)                     # line with intercept 0, slope 1, thickness 2

The rgb colour specification sets the fill colour and allows for transparency of symbols. The first number indicates the degree of red, the second the degree of green, and the third the degree of blue (on a scale from 0 to 1). The 4th number indicates how opaque the colour is (1 = solid, 0 = totally transparent).

18 Data (Figure: scatter plot of y2 against y1.) y2 is mostly larger than y1, and y2 is more spread out than y1.

19-27 Data Overlaid histograms of y1 and y2

    h1 <- hist(twodata$y1)   # create variables with histogram data for each data set
    h2 <- hist(twodata$y2)
    plot(h1, col = rgb(0, 0, 1, 0.25), xlim = c(-4, 15), main = "", xlab = "")
    plot(h2, col = rgb(1, 0, 0, 0.25), xlim = c(-4, 15), add = TRUE)
    legend("topright",                     # place the legend in the upper-right corner
           legend = c("y1", "y2"),         # the text to be included in the legend
           pch = 15,                       # shape of legend symbols (15 is square; see ?pch)
           col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)),  # colours for each symbol (in order!)
           bty = "n")                      # don't draw a box around the legend

The two plot() calls specify the rgb parameters and the scale of the x-axis. Note the add = TRUE argument, which indicates that the second histogram should be plotted in the same frame as the first one. (Figure: overlaid histograms of y1 and y2; y-axis: Frequency.)

28-35 Data Box plot of the two groups

    data <- c(twodata$y1, twodata$y2)        # combine the two data sets into one long vector
    ny1 <- rep(1, length(twodata$y1))        # label 1, repeated for each value in the y1 group
    ny2 <- rep(2, length(twodata$y2))        # label 2, repeated for each value in the y2 group
    groups <- c(ny1, ny2)                    # combine the labels into one long vector
    boxplot(data ~ groups,                   # box plot of the data values, grouped by the groups values
            names = c("y1", "y2"),           # how to label the groups in the plot
            col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)))  # colours to use for each group

The groups vector is as long as the data vector, and contains a label indicating which group each value is from.
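The same long-format layout can also be produced in one step. The sketch below is only an alternative illustration, not the approach used in the slides: it assumes a data frame with columns y1 and y2, and twodata_demo is a small made-up stand-in for twodata.

```r
# Hypothetical stand-in for twodata (made-up values, for illustration only)
twodata_demo <- data.frame(y1 = c(1.2, 0.4, -0.3), y2 = c(3.1, 5.6, 2.2))

# stack() returns a data frame with two columns:
#   values = the measurements, ind = a factor naming the source column
long <- stack(twodata_demo)

boxplot(values ~ ind, data = long,
        col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)))
```

Because ind is a factor with levels y1 and y2, the group names appear on the axis without a separate names argument.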

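36-40 Data Reading the box plot (y1 and y2 side by side): The line inside each box is the median. The box spans the middle 50% of values (from the 1st to the 3rd quartile). The whiskers cover the remaining values that lie within 1.5X the inter-quartile range of the box (the inter-quartile range is the difference between the 1st and 3rd quartile; 1.5X of it is roughly 2 standard deviations for normally distributed data). Points beyond the whiskers are outliers: values falling more than 1.5X the inter-quartile range outside the box.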
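The numbers behind these box plot elements can be inspected directly. This is a minimal sketch using base R's boxplot.stats() on made-up values (not the course data); note that boxplot.stats() uses hinges, which can differ slightly from the quartiles reported by summary().

```r
# Hypothetical values, for illustration only
y <- c(2.1, 2.5, 2.8, 3.0, 3.2, 3.3, 3.6, 3.9, 4.1, 9.5)

stats <- boxplot.stats(y, coef = 1.5)   # coef = 1.5 is the default whisker multiplier

# stats$stats: lower whisker, lower hinge, median, upper hinge, upper whisker
five <- stats$stats
box_width <- five[4] - five[2]          # hinge-based inter-quartile range

# Values further than 1.5 x box_width from the box are flagged as outliers
upper_fence <- five[4] + 1.5 * box_width
outliers <- stats$out
```

Here the value 9.5 lies above upper_fence, so it is returned in stats$out and would be drawn as a separate point.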

41 Data Which plotting method (if any) is most informative for data like this? (Figure: the scatter plot, the overlaid histograms, and the box plot side by side.)

42 Frequentist Approach

43 Frequentist Approach t-test

    t.test(twodata$y1, twodata$y2)

Output (numeric values not preserved): Welch Two Sample t-test; data: twodata$y1 and twodata$y2; t, df, and p-value; alternative hypothesis: true difference in means is not equal to 0; 95 percent confidence interval; sample estimates: mean of x, mean of y.

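44 Frequentist Approach Wilcoxon rank sum test. The classic (Student's) t-test assumes that the data are normally distributed and that the variances are equal. The Welch version that t.test() runs by default relaxes the equal-variance assumption. The Wilcoxon rank sum test is a non-parametric alternative.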

45 Frequentist Approach Wilcoxon rank sum test

    wilcox.test(twodata$y1, twodata$y2)

Output: Wilcoxon rank sum test; data: twodata$y1 and twodata$y2; W = 87, p-value = (value not preserved); alternative hypothesis: true location shift is not equal to 0.
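To see how the three frequentist tests relate, here is a small sketch on simulated data. The groups g1 and g2 and their parameters are invented for illustration and are not the course data.

```r
set.seed(1)

# Simulated groups with different means AND different spreads (assumed values)
g1 <- rnorm(30, mean = 0, sd = 1)
g2 <- rnorm(30, mean = 1, sd = 3)

welch   <- t.test(g1, g2)                    # default: var.equal = FALSE (Welch)
student <- t.test(g1, g2, var.equal = TRUE)  # classic t-test, pools the variances
ranksum <- wilcox.test(g1, g2)               # non-parametric alternative

# Welch adjusts the degrees of freedom downward when variances differ;
# the classic test always uses n1 + n2 - 2
df_welch   <- unname(welch$parameter)
df_student <- unname(student$parameter)
```

Each test still returns a single statistic and p-value, which is the limitation the Bayesian approach in the following slides is meant to address.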

46 Bayesian Approach

47 Standardize the Data

    y1 <- twodata$y1
    y1mean <- mean(y1)
    y1sd <- sd(y1)
    zy1 <- (y1 - y1mean) / y1sd
    N1 <- length(zy1)

    y2 <- twodata$y2
    y2mean <- mean(y2)
    y2sd <- sd(y2)
    zy2 <- (y2 - y2mean) / y2sd
    N2 <- length(zy2)

48 Specify the model Can just do two of our original model (simultaneously). (Diagram: for each group, a normal prior on µ and a gamma prior on σ, with τ = 1/σ², feeding a normal likelihood for the observations y1i and y2i.)

49 Specify the model

    modelstring = "
      model {
        # Likelihood
        for (i in 1:N1) {
          zy1[i] ~ dnorm(mu1, tau1)
        }
        for (j in 1:N2) {
          zy2[j] ~ dnorm(mu2, tau2)
        }

        # Priors
        mu1 ~ dnorm(0, (1 / 10^2))
        mu2 ~ dnorm(0, (1 / 10^2))
        sigma1 ~ dgamma(1.1, 0.11)
        sigma2 ~ dgamma(1.1, 0.11)
        tau1 <- 1 / sigma1^2
        tau2 <- 1 / sigma2^2
      }
    "
    writeLines(modelstring, con = "model.txt")
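Written out in standard notation, the JAGS model above corresponds to the following sketch. The group subscript g is introduced here only for compactness and does not appear in the slides.

```latex
\begin{aligned}
zy_{g,i} &\sim \mathrm{Normal}(\mu_g,\ \sigma_g^2), \qquad g = 1, 2 \\
\mu_g    &\sim \mathrm{Normal}(0,\ 10^2) \\
\sigma_g &\sim \mathrm{Gamma}(\text{shape} = 1.1,\ \text{rate} = 0.11) \\
\tau_g   &= 1 / \sigma_g^2
\end{aligned}
```

JAGS parameterizes dnorm by the precision tau rather than the standard deviation, which is why the prior dnorm(0, 1 / 10^2) has a standard deviation of 10.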

50 Prepare Data for JAGS Specify as a list for JAGS

    datalist = list(
      zy1 = zy1,
      zy2 = zy2,
      N1 = N1,
      N2 = N2
    )

51 Specify Initial Values

    initslist <- function() {
      list(
        mu1 = rnorm(n = 1, mean = 0, sd = 10),
        mu2 = rnorm(n = 1, mean = 0, sd = 10),
        sigma1 = rgamma(n = 1, shape = 1.1, rate = 0.11),
        sigma2 = rgamma(n = 1, shape = 1.1, rate = 0.11)
      )
    }

52 Specify MCMC Parameters and Run

    library(runjags)

    runjagsout <- run.jags(method = "simple",
                           model = "model.txt",
                           monitor = c("mu1", "mu2", "sigma1", "sigma2"),
                           data = datalist,
                           inits = initslist,
                           n.chains = 3,
                           adapt = 500,
                           burnin = 1000,
                           sample = 20000,
                           thin = 1,
                           summarise = TRUE,
                           plots = FALSE)

53 Evaluate Performance of the Model

54 Testing Model Performance Retrieve the data and take a peek at the structure

    codasamples = as.mcmc.list(runjagsout)
    head(codasamples[[1]])

Output: Markov Chain Monte Carlo (MCMC) output: Start = 1501, End = 1507, Thinning interval = 1, with columns mu1, mu2, sigma1, and sigma2 (sampled values not preserved).

55 Testing Model Performance Trace plots

    par(mfrow = c(2, 2))
    traceplot(codasamples)

56 Testing Model Performance Autocorrelation plots

    autocorr.plot(codasamples[[1]])

(Figure: autocorrelation against lag for mu1, mu2, sigma1, and sigma2.)

57 Testing Model Performance Gelman & Rubin diagnostic

    gelman.diag(codasamples)

    Potential scale reduction factors:

           Point est. Upper C.I.
    mu1             1          1
    mu2             1          1
    sigma1          1          1
    sigma2          1          1

    Multivariate psrf

    1
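To make the potential scale reduction factor less of a black box, the sketch below computes a simplified version by hand on simulated chains. It omits the degrees-of-freedom correction that gelman.diag() applies, and the function name psrf_simple and the simulated chains are assumptions made for illustration.

```r
set.seed(123)
n <- 2000

# Three simulated, well-mixed chains for one parameter (one column per chain)
chains <- cbind(rnorm(n), rnorm(n), rnorm(n))

# Simplified potential scale reduction factor (no df correction)
psrf_simple <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))     # mean within-chain variance
  B <- n * var(colMeans(chains))       # between-chain variance
  var_hat <- (n - 1) / n * W + B / n   # pooled estimate of the posterior variance
  sqrt(var_hat / W)
}

psrf_mixed <- psrf_simple(chains)

# Shift one chain so that it explores a different region than the others
shifted <- chains
shifted[, 3] <- shifted[, 3] + 2
psrf_stuck <- psrf_simple(shifted)
```

Values close to 1, as in the output above, indicate that the chains agree with each other; the shifted example gives a value well above 1.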

58 Testing Model Performance Effective size

    effectiveSize(codasamples)

(Output: one effective sample size each for mu1, mu2, sigma1, and sigma2; values not preserved.)
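The idea behind effective size is that autocorrelated draws carry less information than independent ones. The sketch below uses a deliberately crude estimator (sum the positive autocorrelations up to the first non-positive lag); coda's effectiveSize() uses a spectral-density estimate instead, so the numbers will not match it exactly. The function ess_simple and both simulated chains are assumptions made for illustration.

```r
set.seed(7)
n <- 5000

iid_chain <- rnorm(n)                                          # independent draws
ar_chain  <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))  # strongly autocorrelated

# Crude effective sample size: n / (1 + 2 * sum of leading positive autocorrelations)
ess_simple <- function(x, max_lag = 200) {
  rho <- acf(x, lag.max = max_lag, plot = FALSE)$acf[-1]   # drop lag 0
  first_nonpos <- which(rho <= 0)[1]
  if (!is.na(first_nonpos)) rho <- rho[seq_len(first_nonpos - 1)]
  length(x) / (1 + 2 * sum(rho))
}

ess_iid <- ess_simple(iid_chain)
ess_ar  <- ess_simple(ar_chain)
```

The autocorrelated chain yields a much smaller effective size than the independent one, even though both contain the same number of draws.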

59 Viewing Results

60 Parsing Data Convert codasamples to a matrix. This will concatenate the chains into one long one.

    mcmcchain = as.matrix(codasamples)

61 Parsing Data Separate out data for each parameter

    zmu1 <- mcmcchain[, "mu1"]
    zmu2 <- mcmcchain[, "mu2"]
    zsigma1 <- mcmcchain[, "sigma1"]
    zsigma2 <- mcmcchain[, "sigma2"]

62 Convert Back to Original Scale (using the means and standard deviations stored when the data were standardized)

    mu1 <- (zmu1 * y1sd) + y1mean
    mu2 <- (zmu2 * y2sd) + y2mean
    sigma1 <- zsigma1 * y1sd
    sigma2 <- zsigma2 * y2sd
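The standardization and the back-transformation are inverses of each other, which the following sketch checks on made-up numbers. The data y and the stand-in draws zmu and zsigma are simulated here and are not JAGS output.

```r
set.seed(11)

# Hypothetical raw data and its standardized version
y <- rnorm(40, mean = 5, sd = 2)
ymean <- mean(y)
ysd <- sd(y)
zy <- (y - ymean) / ysd          # standardized data: mean 0, sd 1

# Stand-ins for posterior draws on the standardized scale (simulated)
zmu    <- rnorm(1000, mean = 0, sd = 0.15)
zsigma <- abs(rnorm(1000, mean = 1, sd = 0.1))

# Back to the original scale: means are shifted and scaled, sds are only scaled
mu    <- zmu * ysd + ymean
sigma <- zsigma * ysd
```

A standard deviation has no location, so only the scale factor ysd is applied to zsigma, whereas the mean needs both the scale and the shift.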

63 Plot Posterior Distributions Means

    par(mfrow = c(1, 2))
    histinfo = plotPost(mu1, xlab = bquote(mu[1]))
    histinfo = plotPost(mu2, xlab = bquote(mu[2]))

(Figure: posterior distributions of µ1 and µ2, each annotated with its mean and 95% HDI; values not preserved.)

64 Plot Posterior Distributions Means Can work directly with posterior distributions!!!

    diffmu <- mu1 - mu2
    par(mfrow = c(1, 1))
    histinfo = plotPost(diffmu, xlab = bquote(mu[1] - mu[2]))

(Figure: posterior distribution of µ1 - µ2, annotated with its mean and 95% HDI; values not preserved.)
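Because the posterior is just a set of draws, summaries of the difference can be computed directly. The sketch below is one simple way to get a highest density interval (the shortest interval containing 95% of the draws) and the posterior probability that the difference is negative. The function hdi and the simulated draws are assumptions for illustration; they are not the plotting helper used in the slides, nor real model output.

```r
# Shortest interval containing cred_mass of the draws
hdi <- function(draws, cred_mass = 0.95) {
  sorted <- sort(draws)
  n_in <- ceiling(cred_mass * length(sorted))     # number of draws inside the interval
  n_candidates <- length(sorted) - n_in + 1       # number of possible starting points
  widths <- sorted[n_in:length(sorted)] - sorted[1:n_candidates]
  i <- which.min(widths)                          # start of the narrowest interval
  c(lower = sorted[i], upper = sorted[i + n_in - 1])
}

set.seed(99)
mu1_draws <- rnorm(10000, mean = 0, sd = 0.3)     # stand-ins for posterior draws
mu2_draws <- rnorm(10000, mean = 4, sd = 0.5)

diff_draws <- mu1_draws - mu2_draws
diff_hdi <- hdi(diff_draws)
prob_negative <- mean(diff_draws < 0)             # Pr(mu1 < mu2 | data), by counting draws
```

If the whole interval lies below 0, as it does for these simulated draws, a difference of 0 is not among the most credible values.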

65 Plot Posterior Distributions Standard deviation

    par(mfrow = c(1, 2))
    histinfo = plotPost(sigma1, xlab = bquote(sigma[1]), showMode = TRUE)
    histinfo = plotPost(sigma2, xlab = bquote(sigma[2]), showMode = TRUE)

(Figure: posterior distributions of σ1 and σ2, each annotated with its mode and 95% HDI; values not preserved.)

66 Plot Posterior Distributions Standard deviation

    diffsigma <- sigma1 - sigma2
    par(mfrow = c(1, 1))
    histinfo = plotPost(diffsigma, xlab = bquote(sigma[1] - sigma[2]), showMode = TRUE)

(Figure: posterior distribution of σ1 - σ2, annotated with its mode and 95% HDI; values not preserved.)

67 Plot Posterior Distributions Effect size: the difference in means, standardized by the pooled standard deviation. It provides information on how big an effect there is, considering the amount of variation. Values often fall between about -1 and 1, although larger magnitudes are possible.

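68 Plot Posterior Distributions Effect size

    esize <- (mu1 - mu2) / (sqrt((sigma1^2 + sigma2^2) / 2))
    histinfo = plotPost(esize, xlab = bquote((mu[1] - mu[2]) / sqrt((sigma[1]^2 + sigma[2]^2)/2)), cex.lab = 0.9)

(Figure: posterior distribution of the effect size $(\mu_1 - \mu_2) / \sqrt{(\sigma_1^2 + \sigma_2^2)/2}$, annotated with its mean and 95% HDI; values not preserved.)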

69 Recap Think of the wealth of information we've obtained. (Figure: posterior distributions with 95% HDIs for µ1 and σ1 of y1, µ2 and σ2 of y2, the differences µ1 - µ2 and σ1 - σ2, and the effect size; values not preserved.)

70 Recap Think of the wealth of information we've obtained. The goal of analyses should not be one value and a yes/no decision; it should be to obtain information about the data so that you can evaluate the credibility of different hypotheses.

71 Revision of the Goals of Bayesian Analysis

72 Bayesian Analysis Taken almost verbatim from Gelman et al. (2014)*: A practical method for making inferences from data using probability models for quantities we observe and for quantities about which we wish to learn. Explicit use of probability for quantifying uncertainty in inferences based on statistical data analysis. * Gelman et al. (2014) Bayesian Data Analysis. CRC Press.

73 Bayesian Analysis Three main steps 1. Setting up a full probability model: a joint probability distribution for all observable and unobservable quantities in a problem. The model should be consistent with knowledge about the underlying scientific problem and the data collection process.

74 Bayesian Analysis Three main steps 2. Conditioning on observed data: calculating and interpreting the appropriate posterior distribution, that is, the conditional probability distribution of the unobserved quantities of ultimate interest, given the observed data.

75 Bayesian Analysis Three main steps 3. Evaluating the fit of the model and the implications: How well does the model fit the data? Are the conclusions reasonable? How sensitive are the results to the modelling assumptions in step 1?

77 Bayesian Analysis Emphasis on: Do the inferences make sense? Are the model's predictions consistent with the data? Not on: Is the model true? What is Pr(model is true)? Can we reject the model?

78 Bayesian Analysis Emphasis on: describing the data, and the factors influencing the data, in an explicit and probabilistic manner; and making interpretations of these factors based on the analyses.

79 How Well Does Our Model Fit The Data? Posterior Predictive Check

80 Assessing Model Fit y1 Plot the data, then choose some values from the posterior and plot them over the data.

81 Assessing Model Fit y1

    histinfo = hist(y1, xlab = "y1", main = "", col = "skyblue", prob = TRUE)

82-83 Assessing Model Fit y1 Get the range of values from the observed distribution plot

    y1lims = range(histinfo$breaks)
    y1lims
    [1] -3 3

Create a sequence of 500 values within this range

    y1sample = seq(from = y1lims[1], to = y1lims[2], length = 500)

84-85 Assessing Model Fit y1 Get the length of the posterior

    chainlength1 = length(mu1)

Get 20 evenly spaced positions along the chain (we'll draw 20 lines)

    y1new = floor(seq(from = 1, to = chainlength1, length = 20))

86 Assessing Model Fit y1 Loop through the list and plot the associated lines

    for (i in y1new) {
      lines(y1sample, dnorm(y1sample, mean = mu1[i], sd = sigma1[i]), col = "gray47")
    }

(Figure: density histogram of y1 with the 20 posterior normal curves overlaid.)

87 Assessing Model Fit y2

    histinfo = hist(y2, xlab = "y2", main = "", col = "skyblue", prob = TRUE)

88-89 Assessing Model Fit y2 Get the range of values from the observed distribution plot

    y2lims = range(histinfo$breaks)
    y2lims
    [1] -2 12

Create a sequence of 500 values within this range

    y2sample <- seq(from = y2lims[1], to = y2lims[2], length = 500)

90-91 Assessing Model Fit y2 Get the length of the posterior

    chainlength2 = length(mu2)

Get 20 evenly spaced positions along the chain (we'll draw 20 lines)

    y2new = floor(seq(from = 1, to = chainlength2, length = 20))

92 Assessing Model Fit y2 Loop through the list and plot the associated lines

    for (i in y2new) {
      lines(y2sample, dnorm(y2sample, mean = mu2[i], sd = sigma2[i]), col = "gray47")
    }

(Figure: density histogram of y2 with the 20 posterior normal curves overlaid.)
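The posterior predictive check above can be tried end to end without running JAGS. In the sketch below everything is simulated: y is made-up data, and mu_draws and sigma_draws are rough stand-ins for posterior draws (generated from the usual sampling distributions of a mean and a standard deviation), so treat it only as an illustration of the plotting steps.

```r
set.seed(2024)

# Hypothetical observed data
y <- rnorm(50, mean = 1, sd = 1.5)

# Stand-ins for posterior draws of the mean and standard deviation (NOT JAGS output)
mu_draws    <- rnorm(3000, mean = mean(y), sd = sd(y) / sqrt(50))
sigma_draws <- sd(y) * sqrt(49 / rchisq(3000, df = 49))

# Density histogram of the data
histinfo <- hist(y, prob = TRUE, col = "skyblue", main = "", xlab = "y")

# Grid of x values spanning the histogram, and 20 evenly spaced draws from the chain
grid  <- seq(min(histinfo$breaks), max(histinfo$breaks), length.out = 500)
picks <- floor(seq(from = 1, to = length(mu_draws), length.out = 20))

# One normal density curve per selected draw (500 rows x 20 columns)
curves <- sapply(picks, function(i) {
  dnorm(grid, mean = mu_draws[i], sd = sigma_draws[i])
})

# Overlay all curves on the histogram
matlines(grid, curves, col = "gray47", lty = 1)
```

If the curves track the shape of the histogram, the normal model is a plausible description of the data; systematic mismatches (for example heavy tails) suggest a different likelihood.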

93 Were Priors Appropriate?

94 Assessing Priors Mean (mu) Make a list containing the range of values over which to evaluate performance. For the standardized data the mean should be 0, with sd = 1, so a range from -2 to 2 should work.

    par(mfrow = c(1, 2))  # To plot data for both mu1 and mu2 together
    mupriorlist <- seq(from = -2, to = 2, length = 500)

95 Assessing Priors Mean (mu) Then, generate priors using model parameters

    mu1prior <- dnorm(mupriorlist, mean = 0, sd = 10)
    mu2prior <- dnorm(mupriorlist, mean = 0, sd = 10)

96 Assessing Priors Mean (mu) Get the distribution of the posterior using the density function

    mu1post <- density(zmu1)
    mu2post <- density(zmu2)

97 Assessing Priors Mean (mu) Get ranges for data (used as the upper limit of the y-axis)

    mu1high <- ceiling(max(mu1post$y))
    mu2high <- ceiling(max(mu2post$y))

98 Assessing Priors Mean (mu) Plot data for mu1

    plot(mupriorlist, mu1prior, ylim = c(0, mu1high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "zmu1")
    lines(mu1post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")

99 Assessing Priors Mean (mu) Plot data for mu2

    plot(mupriorlist, mu2prior, ylim = c(0, mu2high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "zmu2")
    lines(mu2post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")

100 Assessing Priors Mean (mu) (Figure: prior (dashed) and posterior (solid) for zmu1 and zmu2; x-axis: Possible Values, y-axis: Probability.)
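One quick way to judge how informative the prior on the mean is over the plotted range is to compare its lowest and highest density there. This is a small sketch using only the prior settings shown above (mean 0, sd 10, range -2 to 2); the names grid, prior, and flatness are introduced here for illustration.

```r
# Range over which the prior was evaluated in the slides
grid <- seq(from = -2, to = 2, length.out = 500)

# Prior density for the standardized mean: normal with mean 0 and sd 10
prior <- dnorm(grid, mean = 0, sd = 10)

# Ratio of the lowest to the highest prior density across the range;
# a value near 1 means the prior is almost flat where the posterior lives
flatness <- min(prior) / max(prior)
```

The ratio is about 0.98, so the prior changes by only about 2% across the range and contributes very little compared with the data.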

101 Assessing Priors Standard deviation (sigma) Make a list containing the range of values over which to evaluate performance. The mode of the prior should be about 1, with sd of about 10; the code below evaluates a range from 0 to 3.

    par(mfrow = c(1, 2))  # To plot data for both sigma1 and sigma2 together
    sigmapriorlist <- seq(from = 0, to = 3, length = 500)

102 Assessing Priors Standard deviation (sigma) Then, generate priors using model parameters

    sigma1prior <- dgamma(sigmapriorlist, shape = 1.1, rate = 0.11)
    sigma2prior <- dgamma(sigmapriorlist, shape = 1.1, rate = 0.11)

103 Assessing Priors Standard deviation (sigma) Get the distribution of the posterior using the density function

    sigma1post <- density(zsigma1)
    sigma2post <- density(zsigma2)

104 Assessing Priors Standard deviation (sigma) Get ranges for data (used as the upper limit of the y-axis)

    sigma1high <- ceiling(max(sigma1post$y))
    sigma2high <- ceiling(max(sigma2post$y))

105 Assessing Priors Standard deviation (sigma) Plot data for sigma1

    plot(sigmapriorlist, sigma1prior, ylim = c(0, sigma1high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "sigma1")
    lines(sigma1post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), lwd = c(2, 2), bty = "n")

106 Assessing Priors Standard deviation (sigma) Plot data for sigma2

    plot(sigmapriorlist, sigma2prior, ylim = c(0, sigma2high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "sigma2")
    lines(sigma2post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), lwd = c(2, 2), bty = "n")

107 Assessing Priors Standard deviation (sigma) (Figure: prior (dashed) and posterior (solid) for sigma1 and sigma2; x-axis: Possible Values, y-axis: Probability.)
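The mode and standard deviation implied by the gamma prior follow from its shape and rate. The sketch below works them out with the standard formulas and cross-checks the mode numerically; the variable names are introduced here for illustration.

```r
shape <- 1.1
rate  <- 0.11

prior_mean <- shape / rate          # mean of a gamma distribution
prior_mode <- (shape - 1) / rate    # mode, valid because shape > 1
prior_sd   <- sqrt(shape) / rate    # standard deviation

# Numerical cross-check: locate the maximum of the density directly
mode_numeric <- optimize(function(x) dgamma(x, shape = shape, rate = rate),
                         interval = c(0.01, 5), maximum = TRUE)$maximum
```

These settings give a mode of about 0.91 and a standard deviation of about 9.5, so the prior peaks near 1 on the standardized scale while remaining very broad.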

108 Re-evaluating Some Old Examples

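109 Re-evaluating Old Examples Remember these? First example: N = 10,000 each, means differ by 0.1. Second example: N = 10 each, means differ by 4. (Figure: density plots for both examples, each annotated with an effect size and a p-value; of these values only the second p-value, p = 0.36, is preserved.)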

110 Re-evaluating Old Examples Remember these? What would you expect from Bayesian analyses?
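The contrast between the two examples can be re-created with simulated data. In the sketch below the sample sizes and mean differences come from the slide, but the standard deviations (1 for the large samples, 10 for the small ones) and the seed are assumptions chosen for illustration, so the resulting numbers will not match the original figures.

```r
set.seed(5)

# Example 1: huge samples, tiny difference in means (sd = 1 is an assumption)
big1 <- rnorm(10000, mean = 0,   sd = 1)
big2 <- rnorm(10000, mean = 0.1, sd = 1)

# Example 2: tiny samples, large difference in means (sd = 10 is an assumption)
small1 <- rnorm(10, mean = 0, sd = 10)
small2 <- rnorm(10, mean = 4, sd = 10)

# Effect size: difference in means over the pooled standard deviation
effect_size <- function(a, b) {
  (mean(a) - mean(b)) / sqrt((var(a) + var(b)) / 2)
}

p_big   <- t.test(big1, big2)$p.value
p_small <- t.test(small1, small2)$p.value
d_big   <- effect_size(big1, big2)
d_small <- effect_size(small1, small2)
```

With these settings the large-sample comparison gives a very small p-value despite a small effect size, which is the kind of mismatch the p-value alone does not reveal.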

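111 Re-evaluating Old Examples (Figure: posterior distributions of µ1 and µ2, each with its mean and 95% HDI; values not preserved.)

112 Re-evaluating Old Examples (Figure: posterior distribution of µ1 - µ2, with its mean and 95% HDI; values not preserved.)

113 Re-evaluating Old Examples (Figure: posterior distribution of the effect size $(\mu_1 - \mu_2) / \sqrt{(\sigma_1^2 + \sigma_2^2)/2}$, with its mean and 95% HDI; values not preserved.)

114 Re-evaluating Old Examples (Figure: posterior distributions of µ1 and µ2, each with its mean and 95% HDI; values not preserved.)

115 Re-evaluating Old Examples (Figure: posterior distribution of µ1 - µ2, with its mean and 95% HDI; values not preserved.)

116 Re-evaluating Old Examples (Figure: posterior distribution of the effect size $(\mu_1 - \mu_2) / \sqrt{(\sigma_1^2 + \sigma_2^2)/2}$, with its mean and 95% HDI; values not preserved.)

117 Re-evaluating Old Examples (Figure: frequency histogram of µ1 - µ2.)

118 Re-evaluating Old Examples (Figure: frequency histogram of µ1 - µ2.)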

119 Questions?

120 Homework!!

121 You guessed it: modify the model to use the t distribution instead of the normal distribution

122 Creative Commons License Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information.
