Metric Predicted Variable on Two Groups

1 Metric Predicted Variable on Two Groups
Tim Frasier
Copyright Tim Frasier. This work is licensed under the Creative Commons Attribution 4.0 International license.

2 Goals

3 Goals
When would we use this type of analysis?
- Comparing data between two groups
- Are means different?
- Is variability different?
- etc.
- t-test and equivalents (but more flexible)

4 Data

5 Data
Metric data from two groups: twodata.csv

6 Data
Read data into R:
    twodata <- read.table("twodata.csv", header = TRUE, sep = ",")
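Because twodata.csv itself is not reproduced here, the following is a minimal sketch that writes a small simulated stand-in file and reads it back the same way. The simulated values, the seed, and the temporary file are assumptions for illustration only, not the course data.

```r
# Hypothetical stand-in for twodata.csv: two columns of simulated metric data.
set.seed(1)
sim <- data.frame(y1 = rnorm(30, mean = 0, sd = 1),
                  y2 = rnorm(30, mean = 4, sd = 3))

# Write it to a temporary comma-separated file with a header row.
tmpfile <- tempfile(fileext = ".csv")
write.table(sim, tmpfile, sep = ",", row.names = FALSE, quote = FALSE)

# Read it back as on the slide: header row present, comma as separator.
twodata <- read.table(tmpfile, header = TRUE, sep = ",")
str(twodata)
```

If the columns come back as character rather than numeric, the separator or header setting probably does not match the file.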

7 Data
Let's get a feel for the data:
    summary(twodata)
The output lists the minimum, 1st quartile, median, mean, 3rd quartile, and maximum of y1 and y2.

8 Data
y2 seems to have higher values than y1.

9 Data
    sd(twodata$y1)
    sd(twodata$y2)

10 Data
y2 seems to have a larger standard deviation than y1.

11 Data
Let's look at the data. There are many potential ways to plot this. We'll look at three.

12 Data
    plot(twodata$y1, twodata$y2,
         ylim = c(-6, 12), xlim = c(-6, 12),
         xlab = "y1", ylab = "y2",
         pch = 16, col = rgb(0, 0, 1, 0.5))
    abline(0, 1, lwd = 2)
[Figure: scatter plot of y2 against y1 with the line y2 = y1.]

13 Data
ylim, xlim: define the ranges of the x- and y-axes. Here we use the same range for both so that the two variables are easy to compare.

14 Data
xlab, ylab: define labels for the x- and y-axes.

15 Data
pch = 16: use filled circles as the plotting symbol (see ?pch for more details).

16 Data
col = rgb(0, 0, 1, 0.5): use rgb colour specifications to set the fill colour (this allows for transparent symbols). The first number is the amount of red, the second the amount of green, and the third the amount of blue (each on a scale from 0 to 1). The fourth number sets how opaque the colour is (1 = solid, 0 = fully transparent). See ?rgb for more details.

17 Data
abline(0, 1, lwd = 2): add a line to the plot with an intercept of 0, a slope of 1, and a thickness of 2.
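To make the role of the fourth rgb() argument concrete, here is a small sketch that builds the same blue at three opacity levels and inspects the result. The names clear, half, and solid are made up for this illustration.

```r
# The 4th argument of rgb() is alpha: 0 = fully transparent, 1 = fully opaque.
cols <- c(clear = rgb(0, 0, 1, 0),
          half  = rgb(0, 0, 1, 0.5),
          solid = rgb(0, 0, 1, 1))

# col2rgb() reports each channel on a 0-255 scale;
# alpha = TRUE adds the alpha channel as a fourth row.
channels <- col2rgb(cols, alpha = TRUE)
print(channels)
```

Overlapping points drawn with a partly transparent colour look darker where the data are dense, which is why it is used in the scatter plot and histograms.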

18 Data
- y2 is mostly larger than y1
- y2 is more spread out than y1
[Figure: the same scatter plot of y2 against y1.]

19 Data
    h1 <- hist(twodata$y1)
    h2 <- hist(twodata$y2)
    plot(h1, col = rgb(0, 0, 1, 0.25), xlim = c(-4, 15), main = "", xlab = "")
    plot(h2, col = rgb(1, 0, 0, 0.25), xlim = c(-4, 15), add = TRUE)
    legend("topright", legend = c("y1", "y2"), pch = 15,
           col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)), bty = "n")
[Figure: overlaid histograms of y1 and y2; y-axis: Frequency.]

20 Data
h1, h2: create variables holding the histogram data for each data set.

21 Data
plot(h1, ...) and plot(h2, ..., add = TRUE): plot them, specifying the rgb colours and the scale of the x-axis. Note the add = TRUE argument, which indicates that the second histogram should be drawn in the same frame as the first one.

22 Data
legend("topright", ...): add a legend to the plot and place it in the upper-right corner.

23 Data
legend = c("y1", "y2"): the text to be included in the legend.

24 Data
pch = 15: what shape to use for the legend symbols (15 is a square). See ?pch for more details.

25 Data
col = c(...): what colours to use for each symbol (in order!).

26 Data
bty = "n": don't draw a box around the legend.

27 Data
[Figure: the finished overlaid histograms of y1 and y2.]

28 Data
    data <- c(twodata$y1, twodata$y2)
    ny1 <- rep(1, length(twodata$y1))
    ny2 <- rep(2, length(twodata$y2))
    groups <- c(ny1, ny2)
    boxplot(data ~ groups, names = c("y1", "y2"),
            col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)))
[Figure: box plots of y1 and y2.]

29 Data
data: combine the two data sets into one long vector.

30 Data
ny1: create a vector of labels for the values in the first group (y1). The first group is labelled 1, so this vector has 1 repeated once for each value in the y1 group.

31 Data
ny2: create a vector of labels for the values in the second group (y2). The second group is labelled 2, so this vector has 2 repeated once for each value in the y2 group.

32 Data
groups: combine these into one long vector. This vector is as long as our data vector, but contains a label indicating which group each value is from.

33 Data
boxplot(data ~ groups, ...): draw a box plot of the data values, grouped by the groups values.

34 Data
names = c("y1", "y2"): specify how to label the groups in the plot.

35 Data
col = c(...): specify what colours to use for each group.
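The same long format can also be built in one step. The sketch below uses base R's stack() on simulated stand-in data (an assumption, not the course data) and groups the box plot by the resulting factor rather than by hand-made 1/2 labels.

```r
# Simulated stand-in for twodata (illustrative values only).
set.seed(2)
twodata <- data.frame(y1 = rnorm(30, mean = 0, sd = 1),
                      y2 = rnorm(30, mean = 4, sd = 3))

# stack() returns a 'values' column plus an 'ind' factor
# recording which column each value came from.
long <- stack(twodata[, c("y1", "y2")])

# Box plot grouped by the factor; the group names come from its levels.
boxplot(values ~ ind, data = long,
        col = c(rgb(0, 0, 1, 0.25), rgb(1, 0, 0, 0.25)))
```

Building the labels by hand, as on the slides, shows what the grouping vector is; stack() is simply a shortcut to the same structure.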

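36 Data
[Figure: box plots of y1 and y2, annotated over the next four slides.]

37 Data
Line inside the box: the median.

38 Data
Box: the middle 50% of values (between the 1st and 3rd quartiles).

39 Data
Whiskers: the remaining values that lie within 1.5 times the inter-quartile range of the box. The inter-quartile range is the difference between the 1st and 3rd quartiles; for normally distributed data, 1.5 times this range is roughly 2 standard deviations.

40 Data
Points: outliers, that is, values falling more than 1.5 times the inter-quartile range beyond the box.

41 Data
Which plotting method (if any) is most informative for data like this?
[Figure: the scatter plot, the overlaid histograms, and the box plots side by side.]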
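The box plot rules described above can be checked by hand. This sketch, on simulated values with one deliberately extreme point (both are assumptions for illustration), computes the fences, whiskers, and outliers and compares them with what boxplot.stats() reports.

```r
set.seed(3)
y <- c(rnorm(50, mean = 4, sd = 3), 20)  # simulated values plus one extreme point

h   <- fivenum(y)        # min, lower hinge, median, upper hinge, max
iqr <- h[4] - h[2]       # spread of the box
fence <- c(h[2] - 1.5 * iqr, h[4] + 1.5 * iqr)

# Whiskers stop at the most extreme data points inside the fences;
# anything beyond the fences is drawn as an individual outlier.
whiskers <- range(y[y >= fence[1] & y <= fence[2]])
outliers <- y[y < fence[1] | y > fence[2]]

bs <- boxplot.stats(y)   # what boxplot() uses internally

# For a normal distribution, 1.5 times the inter-quartile range
# corresponds to about 2 standard deviations.
ratio <- 1.5 * (qnorm(0.75) - qnorm(0.25))
```

Note that the whisker ends are always actual data points, so they usually fall short of the fences themselves.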

42 Frequentist Approach

43 Frequentist Approach
t-test
    t.test(twodata$y1, twodata$y2)
The output is headed "Welch Two Sample t-test" and reports t, df, and the p-value for the alternative hypothesis that the true difference in means is not equal to 0, followed by the 95 percent confidence interval and the sample means of x and y.

44 Frequentist Approach
Wilcoxon rank sum test
The classic t-test assumes:
- Data normally distributed
- Variances equal
(The Welch version that t.test() runs by default relaxes the equal-variance assumption.)
The Wilcoxon rank sum test is a non-parametric alternative.

45 Frequentist Approach
Wilcoxon rank sum test
    wilcox.test(twodata$y1, twodata$y2)
The output is headed "Wilcoxon rank sum test" and reports W = 87 and the p-value for the alternative hypothesis that the true location shift is not equal to 0.
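To see how the three frequentist options differ, here is a sketch on simulated groups with unequal spread. The data, seed, and object names are assumptions; only t.test() and wilcox.test() come from the slides.

```r
set.seed(4)
y1 <- rnorm(30, mean = 0, sd = 1)
y2 <- rnorm(30, mean = 4, sd = 3)

welch   <- t.test(y1, y2)                    # default: variances may differ
student <- t.test(y1, y2, var.equal = TRUE)  # classic test: pooled variance
ranksum <- wilcox.test(y1, y2)               # rank-based, no normality assumption

# Welch lowers the degrees of freedom when the variances differ;
# the classic test always uses n1 + n2 - 2.
c(welch = unname(welch$parameter), student = unname(student$parameter))
```

Each test still returns a single p-value, which is the limitation the Bayesian analysis in the following slides addresses.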

46 Bayesian Approach

47 Standardize the Data
    y1 <- twodata$y1
    y1mean <- mean(y1)
    y1sd <- sd(y1)
    zy1 <- (y1 - y1mean) / y1sd
    N1 <- length(zy1)

    y2 <- twodata$y2
    y2mean <- mean(y2)
    y2sd <- sd(y2)
    zy2 <- (y2 - y2mean) / y2sd
    N2 <- length(zy2)

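48 Specify the model
Can just do two of our original model (simultaneously).
[Diagram, repeated for each group: the values y1i and y2i each follow a normal distribution with mean µ and precision τ = 1/σ^2. Each µ has a normal prior, itself specified by a mean µ and a precision τ = 1/σ^2, and each σ has a gamma prior with parameters α and β.]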

49 Specify the model
    modelstring = "
    model {
      # Likelihood
      for (i in 1:N1) {
        zy1[i] ~ dnorm(mu1, tau1)
      }
      for (j in 1:N2) {
        zy2[j] ~ dnorm(mu2, tau2)
      }

      # Priors
      mu1 ~ dnorm(0, (1 / 10^2))
      mu2 ~ dnorm(0, (1 / 10^2))
      sigma1 ~ dgamma(1.1, 0.11)
      sigma2 ~ dgamma(1.1, 0.11)
      tau1 <- 1 / sigma1^2
      tau2 <- 1 / sigma2^2
    }
    "
    writeLines(modelstring, con = "model.txt")

50 Prepare Data for JAGS
Specify as a list for JAGS:
    datalist = list(
      zy1 = zy1,
      zy2 = zy2,
      N1 = N1,
      N2 = N2
    )

51 Specify Initial Values
    initslist <- function() {
      list(
        mu1 = rnorm(n = 1, mean = 0, sd = 10),
        mu2 = rnorm(n = 1, mean = 0, sd = 10),
        sigma1 = rgamma(n = 1, shape = 1.1, rate = 0.11),
        sigma2 = rgamma(n = 1, shape = 1.1, rate = 0.11)
      )
    }

52 Specify MCMC Parameters and Run
    library(runjags)
    runjagsout <- run.jags(method = "simple",
                           model = "model.txt",
                           monitor = c("mu1", "mu2", "sigma1", "sigma2"),
                           data = datalist,
                           inits = initslist,
                           n.chains = 3,
                           adapt = 500,
                           burnin = 1000,
                           sample = 20000,
                           thin = 1,
                           summarise = TRUE,
                           plots = FALSE)

53 Evaluate Performance of the Model

54 Testing Model Performance
Retrieve the data and take a peek at the structure:
    codasamples = as.mcmc.list(runjagsout)
    head(codasamples[[1]])
The output is headed "Markov Chain Monte Carlo (MCMC) output: Start = 1501, End = 1507, Thinning interval = 1" and lists the first sampled values of mu1, mu2, sigma1, and sigma2.

55 Testing Model Performance
Trace plots
    par(mfrow = c(2, 2))
    traceplot(codasamples)

56 Testing Model Performance
Autocorrelation plots
    autocorr.plot(codasamples[[1]])
[Figure: autocorrelation against lag for mu1, mu2, sigma1, and sigma2.]

57 Testing Model Performance
Gelman & Rubin diagnostic
    gelman.diag(codasamples)
    Potential scale reduction factors:
           Point est. Upper C.I.
    mu1             1          1
    mu2             1          1
    sigma1          1          1
    sigma2          1          1
    Multivariate psrf
    1

58 Testing Model Performance
Effective size
    effectiveSize(codasamples)
The output gives the effective sample size for mu1, mu2, sigma1, and sigma2.
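What the Gelman & Rubin factor measures can be sketched in base R on simulated chains. This is the basic form of the statistic, comparing within-chain and between-chain variance; it omits the additional corrections that coda's gelman.diag() applies, so its values can differ slightly. The function name psrf and the simulated chains are assumptions for illustration.

```r
set.seed(5)
n <- 2000                                      # draws per chain
chains <- cbind(rnorm(n), rnorm(n), rnorm(n))  # three well-mixed chains (columns)

psrf <- function(x) {
  n <- nrow(x)
  W <- mean(apply(x, 2, var))          # mean within-chain variance
  B <- n * var(colMeans(x))            # between-chain variance
  var_plus <- (n - 1) / n * W + B / n  # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

good <- psrf(chains)                   # close to 1 when the chains agree

# Shift one chain so that the chains disagree: the factor rises well above 1.
stuck <- chains
stuck[, 3] <- stuck[, 3] + 3
bad <- psrf(stuck)

c(good = good, bad = bad)
```

Values near 1, as in the slide output, indicate that the chains are sampling the same distribution.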

59 Viewing Results

60 Parsing Data
Convert codasamples to a matrix. This will concatenate the chains into one long one.
    mcmcchain = as.matrix(codasamples)

61 Parsing Data
Separate out the data for each parameter:
    zmu1 <- mcmcchain[, "mu1"]
    zmu2 <- mcmcchain[, "mu2"]
    zsigma1 <- mcmcchain[, "sigma1"]
    zsigma2 <- mcmcchain[, "sigma2"]

62 Convert Back to Original Scale
    mu1 <- (zmu1 * y1sd) + y1mean
    mu2 <- (zmu2 * y2sd) + y2mean
    sigma1 <- zsigma1 * y1sd
    sigma2 <- zsigma2 * y2sd
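The back-transformation can be checked on its own. In this sketch the data and the "posterior draws" are simulated stand-ins (assumptions, not JAGS output); the point is only that locations are rescaled and shifted while scales are only rescaled.

```r
set.seed(6)
y <- rnorm(40, mean = 4, sd = 3)   # simulated data on the original scale
ymean <- mean(y)
ysd   <- sd(y)
zy <- (y - ymean) / ysd            # standardized data: mean 0, sd 1

# Stand-ins for posterior draws on the standardized scale.
zmu    <- rnorm(1000, mean = 0, sd = 0.15)
zsigma <- abs(rnorm(1000, mean = 1, sd = 0.1))

# Means are multiplied by the sd and shifted by the mean;
# standard deviations are only multiplied by the sd.
mu    <- zmu * ysd + ymean
sigma <- zsigma * ysd

c(mean_mu = mean(mu), mean_sigma = mean(sigma))
```

The back-transformed draws centre near the sample mean and sd of y, as expected when the standardized posterior centres near 0 and 1.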

63 Plot Posterior Distributions
Means
    par(mfrow = c(1, 2))
    histinfo = plotPost(mu1, xlab = bquote(mu[1]))
    histinfo = plotPost(mu2, xlab = bquote(mu[2]))
[Figure: posterior histograms of µ1 and µ2, each annotated with its mean and 95% HDI.]

64 Plot Posterior Distributions
Means
Can work directly with posterior distributions!!!
    diffmu <- mu1 - mu2
    par(mfrow = c(1, 1))
    histinfo = plotPost(diffmu, xlab = bquote(mu[1] - mu[2]))
[Figure: posterior histogram of µ1 - µ2, annotated with its mean and 95% HDI.]

65 Plot Posterior Distributions
Standard deviation
    par(mfrow = c(1, 2))
    histinfo = plotPost(sigma1, xlab = bquote(sigma[1]), showMode = TRUE)
    histinfo = plotPost(sigma2, xlab = bquote(sigma[2]), showMode = TRUE)
[Figure: posterior histograms of σ1 and σ2, each annotated with its mode and 95% HDI.]

66 Plot Posterior Distributions
Standard deviation
    diffsigma <- sigma1 - sigma2
    par(mfrow = c(1, 1))
    histinfo = plotPost(diffsigma, xlab = bquote(sigma[1] - sigma[2]), showMode = TRUE)
[Figure: posterior histogram of σ1 - σ2, annotated with its mode and 95% HDI.]

67 Plot Posterior Distributions
Effect size
- The difference in means, standardized by the variability (the square root of the average of the two variances)
- Provides information on how big of an effect there is, considering the amount of variation
- Values often fall between about -1 and 1, although the measure itself is unbounded

68 Plot Posterior Distributions
Effect size
    esize <- (mu1 - mu2) / (sqrt((sigma1^2 + sigma2^2) / 2))
    histinfo = plotPost(esize,
                        xlab = bquote((mu[1] - mu[2]) / sqrt((sigma[1]^2 + sigma[2]^2) / 2)),
                        cex.lab = 0.9)
[Figure: posterior histogram of (µ1 - µ2) / sqrt((σ1^2 + σ2^2) / 2), annotated with its mean and 95% HDI.]
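The summaries shown on these plots can also be computed directly from the draws. The sketch below uses simulated stand-in draws (assumptions, not the JAGS output) and a simple hand-written hdi() helper, a hypothetical name, that returns the shortest interval containing 95% of the draws.

```r
# Shortest interval containing `mass` of the draws (a simple HDI estimate).
hdi <- function(draws, mass = 0.95) {
  sorted  <- sort(draws)
  n_in    <- ceiling(mass * length(sorted))   # number of points inside the interval
  n_start <- length(sorted) - n_in + 1        # number of candidate intervals
  widths  <- sorted[n_in:length(sorted)] - sorted[1:n_start]
  best    <- which.min(widths)
  c(lower = sorted[best], upper = sorted[best + n_in - 1])
}

# Simulated stand-ins for the posterior draws.
set.seed(7)
mu1 <- rnorm(5000, mean = 0, sd = 0.2)
mu2 <- rnorm(5000, mean = 4, sd = 0.6)
sigma1 <- abs(rnorm(5000, mean = 1, sd = 0.15))
sigma2 <- abs(rnorm(5000, mean = 3, sd = 0.40))

# Derived quantities are computed draw by draw.
diffmu <- mu1 - mu2
esize  <- diffmu / sqrt((sigma1^2 + sigma2^2) / 2)

hdi_diff  <- hdi(diffmu)       # 95% HDI of the difference in means
p_below   <- mean(diffmu < 0)  # posterior probability that mu1 is below mu2
hdi_esize <- hdi(esize)        # 95% HDI of the effect size
```

Because every derived quantity is computed per draw, its uncertainty comes along automatically, which is what "working directly with posterior distributions" buys us.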

69 Recap
Think of the wealth of information we've obtained.
[Figure: the data for y1 and y2 alongside the posterior histograms of µ1, σ1, µ2, σ2, µ1 - µ2, σ1 - σ2, and the effect size, each annotated with its mean or mode and 95% HDI.]

70 Recap
Think of the wealth of information we've obtained.
The goal of analyses should not be one value and a yes/no decision. It should be to obtain information about the data so that you can evaluate the credibility of different hypotheses.

71 Revision of the Goals of Bayesian Analysis

72 Bayesian Analysis
Taken almost verbatim from Gelman et al. (2014)*
- A practical method for making inferences from data using probability models for quantities we observe and for quantities about which we wish to learn
- Explicit use of probability for quantifying uncertainty in inferences based on statistical data analysis
* Gelman et al. (2014) Bayesian Data Analysis. CRC Press.

73 Bayesian Analysis
Three main steps
1. Setting up a full probability model: a joint probability distribution for all observable and unobservable quantities in a problem. The model should be consistent with knowledge about the underlying scientific problem and the data collection process.

74 Bayesian Analysis
Three main steps
2. Conditioning on observed data: calculating and interpreting the appropriate posterior distribution, that is, the conditional probability distribution of the unobserved quantities of ultimate interest, given the observed data.

75 Bayesian Analysis
Three main steps
3. Evaluating the fit of the model and the implications: How well does the model fit the data? Are the conclusions reasonable? How sensitive are the results to the modelling assumptions in step 1?

76 Bayesian Analysis
Emphasis on:
- Do the inferences make sense?
- Are the model's predictions consistent with the data?

77 Bayesian Analysis
Emphasis not on:
- Is the model true?
- What is Pr(model is true)?
- Can we reject the model?

78 Bayesian Analysis
Emphasis on:
- Describing the data, and the factors influencing the data, in an explicit and probabilistic manner
- Making interpretations of these factors based on the analyses

79 How Well Does Our Model Fit The Data? Posterior Predictive Check

80 Assessing Model Fit
y1
- Plot data
- Choose some values from the posterior and plot over data

81 Assessing Model Fit
y1
    histinfo = hist(y1, xlab = "y1", main = "", col = "skyblue", prob = TRUE)
[Figure: density histogram of y1.]

82 Assessing Model Fit
y1
Get the range of values from the observed distribution plot:
    y1lims = range(histinfo$breaks)
    y1lims
    [1] -3 3

83 Assessing Model Fit
y1
Create a sequence of 500 values within this range:
    y1sample = seq(from = y1lims[1], to = y1lims[2], length = 500)

84 Assessing Model Fit
y1
Get the length of the posterior:
    chainlength1 = length(mu1)

85 Assessing Model Fit
y1
Get 20 evenly spaced positions along the chain (we'll draw 20 lines):
    y1new = floor(seq(from = 1, to = chainlength1, length = 20))

86 Assessing Model Fit
y1
Loop through the list and plot the associated lines:
    for (i in y1new) {
      lines(y1sample, dnorm(y1sample, mean = mu1[i], sd = sigma1[i]), col = "gray47")
    }
[Figure: density histogram of y1 with 20 posterior normal curves overlaid.]

87 Assessing Model Fit
y2
    histinfo = hist(y2, xlab = "y2", main = "", col = "skyblue", prob = TRUE)
[Figure: density histogram of y2.]

88 Assessing Model Fit
y2
Get the range of values from the observed distribution plot:
    y2lims = range(histinfo$breaks)
    y2lims
    [1] -2 12

89 Assessing Model Fit
y2
Create a sequence of 500 values within this range:
    y2sample <- seq(from = y2lims[1], to = y2lims[2], length = 500)

90 Assessing Model Fit
y2
Get the length of the posterior:
    chainlength2 = length(mu2)

91 Assessing Model Fit
y2
Get 20 evenly spaced positions along the chain (we'll draw 20 lines):
    y2new = floor(seq(from = 1, to = chainlength2, length = 20))

92 Assessing Model Fit
y2
Loop through the list and plot the associated lines:
    for (i in y2new) {
      lines(y2sample, dnorm(y2sample, mean = mu2[i], sd = sigma2[i]), col = "gray47")
    }
[Figure: density histogram of y2 with 20 posterior normal curves overlaid.]
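A numerical companion to the visual check is to simulate replicated data sets from the posterior and compare a summary statistic with the observed one. This is only a sketch: the data and posterior draws are simulated stand-ins, and the choice of the maximum as the statistic is an assumption made for illustration.

```r
set.seed(8)
y <- rnorm(40, mean = 4, sd = 3)                        # simulated "observed" data
mu    <- rnorm(2000, mean(y), sd(y) / sqrt(length(y)))  # stand-in posterior draws
sigma <- abs(rnorm(2000, sd(y), 0.35))

# For each posterior draw, simulate a replicated data set of the same size
# and record its maximum.
rep_max <- vapply(seq_along(mu), function(i) {
  max(rnorm(length(y), mean = mu[i], sd = sigma[i]))
}, numeric(1))

# Share of replicated maxima at least as large as the observed maximum.
# Values very close to 0 or 1 would suggest the normal model misses the tails.
ppp <- mean(rep_max >= max(y))
ppp
```

The same loop works for any statistic of interest (minimum, sd, skewness), which makes it easy to probe specific ways in which the model might fail.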

93 Were Priors Appropriate?

94 Assessing Priors
Mean (mu)
Make a list containing the range of values over which to evaluate performance. For standardized data the mean should be 0, with sd = 1, so a range from -2 to 2 should work.
    par(mfrow = c(1, 2))  # To plot data for both mu1 and mu2 together
    mupriorlist <- seq(from = -2, to = 2, length = 500)

95 Assessing Priors
Mean (mu)
Then, generate priors using the model parameters:
    mu1prior <- dnorm(mupriorlist, mean = 0, sd = 10)
    mu2prior <- dnorm(mupriorlist, mean = 0, sd = 10)

96 Assessing Priors
Mean (mu)
Get the distribution of the posterior using the density function:
    mu1post <- density(zmu1)
    mu2post <- density(zmu2)

97 Assessing Priors
Mean (mu)
Get ranges for data:
    mu1high <- ceiling(max(mu1post$y))
    mu2high <- ceiling(max(mu2post$y))

98 Assessing Priors
Mean (mu)
Plot data for mu1:
    plot(mupriorlist, mu1prior, ylim = c(0, mu1high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "zmu1")
    lines(mu1post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")

99 Assessing Priors
Mean (mu)
Plot data for mu2:
    plot(mupriorlist, mu2prior, ylim = c(0, mu2high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "zmu2")
    lines(mu2post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), bty = "n")

100 Assessing Priors
Mean (mu)
[Figure: prior (dashed) and posterior (solid) densities for zmu1 and zmu2; x-axis: Possible Values, y-axis: Probability.]

101 Assessing Priors
Standard deviation (sigma)
Make a list containing the range of values over which to evaluate performance. The mode of the prior should be about 1, with sd of about 10, so a range from 0 to 3 should work.
    par(mfrow = c(1, 2))  # To plot data for both sigma1 and sigma2 together
    sigmapriorlist <- seq(from = 0, to = 3, length = 500)

102 Assessing Priors
Standard deviation (sigma)
Then, generate priors using the model parameters:
    sigma1prior <- dgamma(sigmapriorlist, shape = 1.1, rate = 0.11)
    sigma2prior <- dgamma(sigmapriorlist, shape = 1.1, rate = 0.11)

103 Assessing Priors
Standard deviation (sigma)
Get the distribution of the posterior using the density function:
    sigma1post <- density(zsigma1)
    sigma2post <- density(zsigma2)

104 Assessing Priors
Standard deviation (sigma)
Get ranges for data:
    sigma1high <- ceiling(max(sigma1post$y))
    sigma2high <- ceiling(max(sigma2post$y))

105 Assessing Priors
Standard deviation (sigma)
Plot data for sigma1:
    plot(sigmapriorlist, sigma1prior, ylim = c(0, sigma1high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "sigma1")
    lines(sigma1post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), lwd = c(2, 2), bty = "n")

106 Assessing Priors
Standard deviation (sigma)
Plot data for sigma2:
    plot(sigmapriorlist, sigma2prior, ylim = c(0, sigma2high), type = "l", lty = 2, lwd = 2,
         xlab = "Possible Values", ylab = "Probability", main = "sigma2")
    lines(sigma2post, lwd = 2)
    legend("topleft", legend = c("Prior", "Posterior"), lty = c(2, 1), lwd = c(2, 2), bty = "n")

107 Assessing Priors
Standard deviation (sigma)
[Figure: prior (dashed) and posterior (solid) densities for sigma1 and sigma2; x-axis: Possible Values, y-axis: Probability.]
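The mode and standard deviation of the gamma prior follow from its shape and rate, which is a quick way to confirm what a prior such as dgamma(1.1, 0.11) implies. A short sketch using the standard gamma formulas (the variable names are made up for this illustration):

```r
shape <- 1.1
rate  <- 0.11

prior_mode <- (shape - 1) / rate   # mode of a gamma distribution with shape > 1
prior_mean <- shape / rate
prior_sd   <- sqrt(shape) / rate

round(c(mode = prior_mode, mean = prior_mean, sd = prior_sd), 2)

# Numerical check of the mode: find where the density is highest.
numeric_mode <- optimize(function(s) dgamma(s, shape = shape, rate = rate),
                         interval = c(0, 10), maximum = TRUE)$maximum
```

A mode just below 1 with a standard deviation near 10 is very flat over the plotted range, so the posterior is driven almost entirely by the data.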

108 Re-evaluating Some Old Examples

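109 Re-evaluating Old Examples
Remember these?
[Figure: two density plots. Left: N = 10,000 each, means differ by 0.1. Right: N = 10 each, means differ by 4. Each panel is annotated with an effect size and a p-value, one of which is p = 0.36.]

110 Re-evaluating Old Examples
Remember these? What would you expect from Bayesian analyses?

111 Re-evaluating Old Examples
[Figure: data densities with posterior histograms of µ1 and µ2, each annotated with its mean and 95% HDI.]

112 Re-evaluating Old Examples
[Figure: data densities with the posterior histogram of µ1 - µ2, annotated with its mean and 95% HDI.]

113 Re-evaluating Old Examples
[Figure: data densities with the posterior histogram of the effect size (µ1 - µ2) / sqrt((σ1^2 + σ2^2) / 2), annotated with its mean and 95% HDI.]

114 Re-evaluating Old Examples
[Figure: data densities with posterior histograms of µ1 and µ2, each annotated with its mean and 95% HDI.]

115 Re-evaluating Old Examples
[Figure: data densities with the posterior histogram of µ1 - µ2, annotated with its mean and 95% HDI.]

116 Re-evaluating Old Examples
[Figure: data densities with the posterior histogram of the effect size (µ1 - µ2) / sqrt((σ1^2 + σ2^2) / 2), annotated with its mean and 95% HDI.]

117 Re-evaluating Old Examples
[Figure: frequency histogram of µ1 - µ2.]

118 Re-evaluating Old Examples
[Figure: frequency histogram of µ1 - µ2.]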

119 Questions?

120 Homework!!

121 You guessed it: modify model using the t distribution instead of normal
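One possible starting point, offered as a sketch rather than the intended solution: JAGS provides a t distribution, dt(mu, tau, k), where k is the degrees of freedom. The shared nu parameter, its exponential prior, and the file name model_t.txt are assumptions chosen for illustration and do not come from the slides.

```r
modelstring_t <- "
model {
  # Likelihood: t distributions accommodate outliers better than normals
  for (i in 1:N1) {
    zy1[i] ~ dt(mu1, tau1, nu)
  }
  for (j in 1:N2) {
    zy2[j] ~ dt(mu2, tau2, nu)
  }

  # Priors (as in the normal model, plus one for nu)
  mu1 ~ dnorm(0, (1 / 10^2))
  mu2 ~ dnorm(0, (1 / 10^2))
  sigma1 ~ dgamma(1.1, 0.11)
  sigma2 ~ dgamma(1.1, 0.11)
  tau1 <- 1 / sigma1^2
  tau2 <- 1 / sigma2^2
  nuMinusOne ~ dexp(1 / 29)   # assumed prior: keeps nu >= 1, prior mean of nu is 30
  nu <- nuMinusOne + 1
}
"
model_file <- file.path(tempdir(), "model_t.txt")
writeLines(modelstring_t, con = model_file)
```

With this likelihood, sigma1 and sigma2 are scale parameters of the t distribution rather than standard deviations, and nu would need to be added to the monitored parameters and the initial values.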

122 Creative Commons License
Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information.


More information

Chapter 5 Exercises 1

Chapter 5 Exercises 1 Chapter 5 Exercises 1 Data Analysis & Graphics Using R, 2 nd edn Solutions to Exercises (December 13, 2006) Preliminaries > library(daag) Exercise 2 For each of the data sets elastic1 and elastic2, determine

More information

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017 Markov Chain Monte Carlo (MCMC) and Model Evaluation August 15, 2017 Frequentist Linking Frequentist and Bayesian Statistics How can we estimate model parameters and what does it imply? Want to find the

More information

Explore the data. Anja Bråthen Kristoffersen

Explore the data. Anja Bråthen Kristoffersen Explore the data Anja Bråthen Kristoffersen density 0.2 0.4 0.6 0.8 Probability distributions Can be either discrete or continuous (uniform, bernoulli, normal, etc) Defined by a density function, p(x)

More information
