Multiple Regression: Mixed Predictor Types. Tim Frasier


1 Multiple Regression: Mixed Predictor Types Tim Frasier Copyright Tim Frasier This work is licensed under the Creative Commons Attribution 4.0 International license. Click here for more information.

2 The Data

3 Data Fuel economy data from 1999 and 2008 for 38 popular models of car* I know, I know. It's neither biological nor that interesting, but it is hard to find good example data sets for this. * As distributed with the ggplot2 package, with the original data from the EPA

4 Data

5 Data Predicted variable: hwy

6 Data Two categorical predictors: manufacturer (man), class

7 Data Two metric predictors*: displ, cyl. * I realize that cylinders is not really a metric variable, but we will treat it like one here for demonstration purposes

8 Data Read the data into R and parse out just the fields in which we are interested

carData <- read.table("mpg.csv", header = TRUE, sep = ",",
                      stringsAsFactors = TRUE)  # needed in R >= 4.0 so the text columns become factors
carSub <- carData[, c(2, 4, 6, 10, 12)]
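If the mpg.csv file isn't at hand, the same five columns can be pulled by name from the copy of the data shipped with ggplot2. This is a hedged alternative, not part of the original slides; the column numbers above assume the csv has a leading row-name column.

# Equivalent subset, selecting columns by name from ggplot2's built-in mpg data
library(ggplot2)
carData <- as.data.frame(mpg)
carData$manufacturer <- factor(carData$manufacturer)  # make the text columns factors,
carData$class <- factor(carData$class)                # as the later code expects
carSub <- carData[, c("manufacturer", "displ", "cyl", "hwy", "class")]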

9 Data Use the summary function to get a feel for it

summary(carSub)

 manufacturer      displ            cyl             hwy              class   
 dodge     :37   Min.   :1.600   Min.   :4.000   Min.   :12.00   2seater   : 5  
 toyota    :34   1st Qu.:2.400   1st Qu.:4.000   1st Qu.:18.00   compact   :47  
 volkswagen:27   Median :3.300   Median :6.000   Median :24.00   midsize   :41  
 ford      :25   Mean   :3.472   Mean   :5.889   Mean   :23.44   minivan   :11  
 chevrolet :19   3rd Qu.:4.600   3rd Qu.:8.000   3rd Qu.:27.00   pickup    :33  
 audi      :18   Max.   :7.000   Max.   :8.000   Max.   :44.00   subcompact:35  
 (Other)   :74                                                   suv       :62  

10 Data Plot the data to get a feel for it (but keep in mind these can be misleading!!!)

pairs(carSub, pch = 16, col = rgb(0, 0, 1, 0.5))

11 Data [pairs plot of manufacturer, displ, cyl, hwy, and class]

12 Data Positive relationship between engine displacement and the number of cylinders (makes sense)

13 Data Negative relationship between engine displacement & highway mpg

14 Data Negative relationship between number of cylinders & highway mpg

15 Data Mostly positive relationship between engine displacement & vehicle class

16 Data Some interesting patterns of relationships between class and highway mpg

17 Data Some interesting patterns of relationships between manufacturer and highway mpg

18 Frequentist Approach

19 Frequentist Approach Mixed predictors can be analyzed with the lm function

carTest <- lm(hwy ~ manufacturer + displ + cyl + class, data = carSub)

20 Frequentist Approach

summary(carTest)

The coefficient table lists an estimate, standard error, t value, and p value for the intercept, for each non-reference level of manufacturer (manufacturerchevrolet through manufacturervolkswagen) and class (classcompact through classsuv), and for displ and cyl. In this fit the intercept, manufacturerhonda, manufacturervolkswagen, cyl, classminivan, classpickup, and classsuv are flagged as significant; the residual standard error is reported on 211 degrees of freedom, and the overall F-statistic on 22 and 211 DF has p < 2.2e-16.

21 Frequentist Approach The (Intercept) row is the intercept plus the effect of being an audi (the reference manufacturer). All other manufacturer effects are differences from this reference.

22 Frequentist Approach Manufacturer does not have too big an impact, but a little (only honda and volkswagen stand out from the reference).

23 Frequentist Approach Engine displacement has a negative, but not significant, effect.

24 Frequentist Approach Cylinder number has a significant negative effect.

25 Frequentist Approach The class category seems important (the minivan, pickup, and suv coefficients are strongly significant).
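Because manufacturer and class are multi-level factors, the rows above only compare each level with its reference level. A whole-factor test for each term is not shown on the original slides; a minimal sketch using base R would be:

# Does dropping each whole term (factor or metric predictor) significantly worsen the fit?
drop1(carTest, test = "F")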

26 Bayesian Approach

27 Load Libraries & Functions

library(runjags)
library(coda)
source("plotPost.R")

28 Organize the Data

#--- The y data ---#
y = carSub$hwy
N = length(y)
yMean = mean(y)
ySD = sd(y)
zy = (y - yMean) / ySD

29 Organize the Data

#--- The metric x data ---#

# displ
displ <- carSub$displ
displMean <- mean(displ)
displSD <- sd(displ)
zDispl <- (displ - displMean) / displSD

# cyl
cyl <- carSub$cyl
cylMean <- mean(cyl)
cylSD <- sd(cyl)
zCyl <- (cyl - cylMean) / cylSD

30 Organize the Data

#--- The nominal x data ---#
man <- as.numeric(carSub$manufacturer)
class <- as.numeric(carSub$class)

manLevels <- levels(carSub$manufacturer)
classLevels <- levels(carSub$class)

nMans <- length(unique(man))
nClass <- length(unique(class))
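As a quick sanity check (not on the slides), you can confirm that the integer codes produced by as.numeric() line up with the factor levels:

manLevels   # level names, in the order the integer codes refer to
head(data.frame(code = man, label = manLevels[man], original = carSub$manufacturer))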

31 Organize the Data

dataList = list(
  y = zy,
  N = N,
  displ = zDispl,
  displMean = displMean,
  cyl = zCyl,
  cylMean = cylMean,
  man = man,
  class = class,
  nMans = nMans,
  nClass = nClass
)

32 Organize the Data Note that we need the means of the metric predictor variables in the data list this time (we haven't in the past)

33 Define the Model [model diagram: normal likelihood for each yi, with mean µ and precision τ = 1/σ²]

34 Define the Model Effect of being in each manufacturer category on mpg [model diagram]

35 Define the Model Effect of engine displacement on mpg [model diagram]

36 Define the Model Effect of being in each class category on mpg [model diagram]

37 Define the Model Effect of # of cylinders on mpg [model diagram]

38 Define the Model Note the multiple personalities of β0 now. With metric predictors it is the y value when all predictors are zero; with nominal predictors it is the mean y value across all categories of all variables. [model diagram]

39 Define the Model What should β0 be now, with both kinds of predictor in the model? [model diagram]

40 Define the Model Makes sense to set it as the mean predicted value if the metric predictors are re-centred at their mean [model diagram]

41 Define the Model All coefficients are α because they will need to be standardized [model diagram]

42 Define the Model Now the metric effects are centred around the mean [model diagram]
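Putting those pieces together, the mean for each car combines the baseline, the two sets of category effects, and the two mean-centred metric effects. This is the same expression that appears in the JAGS model below:

mu[i] = a0 + a1[man[i]] + a2 * (displ[i] - mean(displ)) + a3[class[i]] + a4 * (cyl[i] - mean(cyl))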

43 Define the Model [model diagram: Normal(0, 10) priors on the baseline and on the metric-predictor coefficients]

44 Define the Model [model diagram: normal priors on each set of category effects]

45 Define the Model We'll also make each nominal variable hierarchical... [model diagram]

46 Define the Model [model diagram: normal hyperpriors on the category means and gamma hyperpriors on the category standard deviations]

47

modelString = "
model {

  #--- Likelihood ---#
  for (i in 1:N) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- a0 + a1[man[i]] + (a2 * (displ[i] - displMean)) + a3[class[i]] + (a4 * (cyl[i] - cylMean))
  }

  #--- Priors ---#
  sigma ~ dgamma(1.1, 0.11)
  tau <- 1 / sigma^2

  a0 ~ dnorm(0, 1/10^2)
  a2 ~ dnorm(0, 1/10^2)
  a4 ~ dnorm(0, 1/10^2)

  # a1
  for (j in 1:nMans) {
    a1[j] ~ dnorm(manMeans, 1/manSD^2)
  }

  # a3
  for (j in 1:nClass) {
    a3[j] ~ dnorm(classMeans, 1/classSD^2)
  }

48

  #--- Hyperpriors ---#
  manMeans ~ dnorm(0, 1/10^2)
  manSD ~ dgamma(1.1, 0.11)
  classMeans ~ dnorm(0, 1/10^2)
  classSD ~ dgamma(1.1, 0.11)

49

  #---------------------------------------------#
  # Convert a0, a[] to sum-to-zero b0, b[]       #
  #---------------------------------------------#
  m1 <- mean(a1[1:nMans])    # Mean across a1 categories
  m3 <- mean(a3[1:nClass])   # Mean across a3 categories

  #- b0 is a0 plus the mean of each nominal predictor, minus the mean effect -#
  #- of the metric predictors. See Kruschke (2015) p. 570 for the algebra    -#
  b0 <- a0 + m1 + m3 - (a2 * displMean) - (a4 * cylMean)

  #- b1 is the uncorrected a1 minus the mean across categories for that nominal variable -#
  for (j in 1:nMans) {
    b1[j] <- a1[j] - m1
  }

  #- b3 is the uncorrected a3 minus the mean across categories for that nominal variable -#
  for (j in 1:nClass) {
    b3[j] <- a3[j] - m3
  }

  #- Coefficients for metric variables stay the same -#
  b2 <- a2
  b4 <- a4
}
" # close quote for modelString

writeLines(modelString, con = "model.txt")

50 Specify Initial Values

initsList <- function() {
  list(
    sigma = rgamma(n = 1, shape = 1.1, rate = 0.11),
    a0 = rnorm(n = 1, mean = 0, sd = 10),
    a2 = rnorm(n = 1, mean = 0, sd = 10),   # inits go to the stochastic a2 and a4 nodes
    a4 = rnorm(n = 1, mean = 0, sd = 10),   # (b2 and b4 are derived inside the model)
    manMeans = rnorm(n = 1, mean = 0, sd = 10),
    manSD = rgamma(n = 1, shape = 1.1, rate = 0.11),
    classMeans = rnorm(n = 1, mean = 0, sd = 10),
    classSD = rgamma(n = 1, shape = 1.1, rate = 0.11)
  )
}

51 Specify MCMC Parameters and Run

runJagsOut <- run.jags(
  method = "simple",
  model = "model.txt",
  monitor = c("b0", "b1", "b2", "b3", "b4", "sigma"),
  data = dataList,
  inits = initsList,
  n.chains = 3,
  adapt = 500,
  burnin = 1000,
  sample = 20000,
  thin = 1,
  summarise = TRUE,
  plots = FALSE)

52 Evaluate Performance of the Model

53 Testing Model Performance Retrieve the data and take a peek at the structure

codaSamples = as.mcmc.list(runJagsOut)
head(codaSamples[[1]])

Markov Chain Monte Carlo (MCMC) output:
Start = 1501
End = 1507
Thinning interval = 1
[first rows of the sampled values for b0, b1[1] ... b1[13], and the remaining monitored parameters]

54 Testing Model Performance Can do this on your own
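For example, the standard coda checks on the monitored parameters might look like the following (a minimal sketch; any of the usual diagnostics will do):

gelman.diag(codaSamples, multivariate = FALSE)   # potential scale reduction factors
effectiveSize(codaSamples)                       # effective sample sizes
par(mfrow = c(2, 2))
traceplot(codaSamples[, c("b0", "b2", "b4", "sigma")])   # visual check of mixing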

55 Extract & Parse Results

mcmcChain = as.matrix(codaSamples)

# b0
zb0 = mcmcChain[, "b0"]

# b1
chainLength = length(zb0)
zb1 = matrix(0, ncol = chainLength, nrow = nMans)
for (i in 1:nMans) {
  zb1[i, ] = mcmcChain[, paste("b1[", i, "]", sep = "")]
}

# b2
zb2 = mcmcChain[, "b2"]

# b3
zb3 = matrix(0, ncol = chainLength, nrow = nClass)
for (i in 1:nClass) {
  zb3[i, ] = mcmcChain[, paste("b3[", i, "]", sep = "")]
}

# b4
zb4 = mcmcChain[, "b4"]

# sigma
zsigma <- mcmcChain[, "sigma"]

56 Convert to Original Scale

b0 <- (zb0 * ySD) + yMean
b2 <- (zb2 * ySD) / displSD
b4 <- (zb4 * ySD) / cylSD
b1 <- zb1 * ySD
b3 <- zb3 * ySD
sigma <- zsigma * ySD

57 View Posteriors

58 Plotting Posterior Distributions β0

par(mfrow = c(1, 1))
histinfo = plotPost(b0, xlab = "b0", main = "b0")

[posterior histogram of b0, showing the mean and 95% HDI]

59 Plotting Posterior Distributions β1

par(mfrow = c(3, 3))
for (i in 1:nMans) {
  histinfo = plotPost(b1[i, ], xlab = bquote(b1[.(i)]), main = paste("b1:", manLevels[i]))
}

60 Plotting Posterior Distributions β1 [posterior histograms of b1, with means and 95% HDIs, for audi, chevrolet, dodge, ford, honda, hyundai, jeep, land rover, and lincoln]

61 Plotting Posterior Distributions β1 [posterior histograms of b1, with means and 95% HDIs, for mercury, nissan, pontiac, subaru, toyota, and volkswagen]

62 Plotting Posterior Distributions β2

par(mfrow = c(1, 1))
histinfo = plotPost(b2, xlab = "b2", main = "Engine Displacement")

[posterior histogram of b2 (engine displacement), showing the mean and 95% HDI]

63 Plotting Posterior Distributions β3

par(mfrow = c(2, 2))
for (i in 1:nClass) {
  histinfo = plotPost(b3[i, ], xlab = bquote(b3[.(i)]), main = paste("b3:", classLevels[i]))
}

64 Plotting Posterior Distributions β3 [posterior histograms of b3, with means and 95% HDIs, for 2seater, compact, midsize, and minivan]

65 Plotting Posterior Distributions β3 [posterior histograms of b3, with means and 95% HDIs, for pickup, subcompact, and suv]

66 Plotting Posterior Distributions β4

par(mfrow = c(1, 1))
histinfo = plotPost(b4, xlab = "b4", main = "# of Cylinders")

[posterior histogram of b4 (# of cylinders), showing the mean and 95% HDI]

67 Posterior Predictive Check

68 Posterior Predictive Check Select a subset of the data on which to make predictions (let's pick 20)

nPred = 20
newRows <- round(seq(from = 1, to = NROW(carSub), length.out = nPred))
newData <- carSub[newRows, ]

69 Posterior Predictive Check Separate out just the x data, on which we will make predictions

x1 <- as.numeric(newData$manufacturer)
x2 <- newData$displ
x3 <- as.numeric(newData$class)
x4 <- newData$cyl

70 Posterior Predictive Check Next, define a matrix that will hold all of the predicted y values. The number of rows is the number of x values for prediction; the number of columns is the number of y values generated from the MCMC process. We'll start with the matrix filled with zeros, and fill it in later.

postSampSize = length(b0)
yNew = matrix(0, nrow = nPred, ncol = postSampSize)

71 Posterior Predictive Check Define a matrix for holding the HDI limits of the predicted y values. Same number of rows as above, and only two columns (one for each end of the HDI).

yHDIlim = matrix(0, nrow = nPred, ncol = 2)

72 Posterior Predictive Check Now, populate the yNew matrix by generating one predicted y value for each step in the chain. Note that our coefficients for the metric predictors are centred around the mean, so we have to treat them this way here.

for (i in 1:nPred) {
  for (j in 1:postSampSize) {
    yNew[i, j] <- rnorm(1, mean = b0[j] + b1[x1[i], j] + (b2[j] * (x2[i] - displMean)) +
                             b3[x3[i], j] + (b4[j] * (x4[i] - cylMean)),
                        sd = sigma[j])
  }
}
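The double loop is easy to follow but slow in R; an equivalent vectorized version (a sketch under the same assumptions, not from the slides; muPred is just a temporary name) draws a whole row of predictions at once:

for (i in 1:nPred) {
  # vector of posterior means for observation i, one entry per MCMC step
  muPred <- b0 + b1[x1[i], ] + (b2 * (x2[i] - displMean)) + b3[x3[i], ] + (b4 * (x4[i] - cylMean))
  yNew[i, ] <- rnorm(postSampSize, mean = muPred, sd = sigma)
}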

73 Posterior Predictive Check Calculate means for each prediction, and the associated low and high 95% HDI estimates

means <- rowMeans(yNew)

source("HDIofMCMC.R")
for (i in 1:nPred) {
  yHDIlim[i, ] <- HDIofMCMC(yNew[i, ])
}
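If the HDIofMCMC.R helper isn't available, coda's HPDinterval() gives an essentially equivalent 95% interval (a hedged substitute for the function used on the slides):

for (i in 1:nPred) {
  yHDIlim[i, ] <- HPDinterval(as.mcmc(yNew[i, ]), prob = 0.95)
}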

74 Posterior Predictive Check Combine into one table

predTable <- cbind(means, yHDIlim)

75 Posterior Predictive Check Plot predicted values

dotchart(means, labels = 1:nPred, xlim = c(min(yHDIlim), max(yHDIlim)),
         xlab = "hwy mpg", pch = 16)
segments(yHDIlim[, 1], 1:nPred, yHDIlim[, 2], 1:nPred, lwd = 2)

Add the truth

points(x = newData$hwy, y = 1:nPred, pch = 16, col = rgb(1, 0, 0, 0.5))

76 Posterior Predictive Check [dot chart of the 20 predicted hwy mpg values with their 95% HDIs, and the observed values overlaid in red]

77 Homework (last one!)

78 Homework Get the DIC for the full model. Re-configure and run the model 4 more times, leaving a different predictor variable out each time, and get the DIC for each. Compare the DIC values to decide which predictors are most important for your model. You should explain your results and interpretation, but can do so as commented lines in your code (i.e., enclosed in # so that your code will still run, but also so that you have written explanations in there for me to read).
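The slides don't show how to get the DIC; one hedged way is to refit the same model file with the rjags package (which runjags uses under the hood) and ask JAGS for the penalized deviance:

library(rjags)

# Refit the full model and compute DIC; n.iter values here are illustrative
jagsModel <- jags.model("model.txt", data = dataList, inits = initsList, n.chains = 3)
update(jagsModel, n.iter = 1500)        # adaptation + burn-in
dic.samples(jagsModel, n.iter = 10000)  # deviance + penalty = DIC

# Repeat after removing one predictor at a time from mu[i] in the model string,
# then compare the resulting DIC values across the five models.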

79 Creative Commons License Anyone is allowed to distribute, remix, tweak, and build upon this work, even commercially, as long as they credit me for the original creation. See the Creative Commons website for more information. Click here to go back to beginning
