1 Forecasting House Starts


1396, Time Series, Week 5, Fall

In this handout we work through the application example for Chapter 5. We use the same example as the textbook and fit several models of interest to the data.

Data: Housing starts are usually considered to be seasonal. Using monthly data on U.S. housing starts, we estimate the regressions on one period and keep a later period for out-of-sample forecasting. (The dates of the two periods were lost in transcription; the code below uses the 576 months from January 1946 to December 1993 for estimation.)

Library in R: the only add-on package needed is car, loaded with library(car), for the Durbin-Watson test.

1 Forecasting House Starts

1.1 Read data into R

> library(car) ## for the Durbin-Watson test and Cook's distance
> hs <- read.table("hstarts.dat", header=TRUE)
> hstarts <- hs$hstarts[1:576]

1.2 Time Series Plot

Time series plot using the time index, Time = 1, 2, ..., T. The plot is generated by the R command:

> plot(hstarts, xlab="time")

[Figure: hstarts plotted against Time]

Time series plot using the original time scale. The plot is generated by the R commands:

> ## Create a time series with months attached to the data
> hstarts.ts <- ts(hstarts, frequency=12, start=c(1946,1), end=c(1993,12))
> plot(hstarts.ts)
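A monthly ts object also carries a month index, which can be read back with cycle(). The sketch below is an illustration only: it uses a simulated series in place of hstarts.dat (the names y, month, d6.cycle and d6.rep are hypothetical) and shows that a June indicator taken from the month index matches one built by repeating a 12-month pattern.

```r
## Simulated stand-in for the housing-starts series (an assumption; not hstarts.dat)
set.seed(1)
y <- ts(rnorm(576, mean = 120, sd = 30), frequency = 12, start = c(1946, 1))

## Month index 1, 2, ..., 12 for every observation
month <- cycle(y)
month[1:14]

## June indicator taken from the month index
d6.cycle <- as.integer(month == 6)

## The same indicator built by repeating a 12-month pattern over 48 years
d6.rep <- rep(c(rep(0, 5), 1, rep(0, 6)), 48)

## Check that the two constructions agree
all(d6.cycle == d6.rep)
```

Reading the month from the series keeps the indicator aligned with the calendar if the sample is later shortened or extended.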

[Figure: time series plot of hstarts.ts against Time, produced by plot(hstarts.ts)]

1.3 Monthly Effects: Seasonal variations in which months?

To assess the effects of all 12 months, we first create 12 dummy variables as follows.

> n <- length(hstarts) ## n=576 ##
> Time <- c(1:n)
> d1 <- rep(c(1, rep(0,11)),48)
> d2 <- rep(c(0,1,rep(0,10)), 48)
> d3 <- rep(c(0,0,1,rep(0,9)), 48)
> d4 <- rep(c(0,0,0,1, rep(0,8)),48)
> d5 <- rep(c(rep(0,4),1,rep(0,7)), 48)
> d6 <- rep(c(rep(0,5), 1, rep(0,6)), 48)
> d7 <- rep(c(rep(0,6),1,rep(0,5)),48)
> d8 <- rep(c(rep(0,7), 1, rep(0,4)), 48)
> d9 <- rep(c(rep(0,8), 1,0,0,0),48)
> d10 <- rep(c(rep(0,9),1,0,0), 48)
> d11 <- rep(c(rep(0,10),1, 0), 48)
> d12 <- rep(c(rep(0,11),1), 48)

Print one dummy variable to see the idea.

> d1
[Output: a vector of 576 zeros and ones; the printed values were lost in transcription]

The model assessing the effects of all months is

> hs.fit <- lm(hstarts~-1+d1+d2+d3+d4+d5+d6+d7+d8+d9+d10+d11+d12)
> summary(hs.fit)

Call:
lm(formula = hstarts ~ -1 + d1 + d2 + d3 + d4 + d5 + d6 + d7 + d8 + d9 + d10 + d11 + d12)

[Output: the residual quantiles and the Estimate, Std. Error and t value columns were lost in transcription. What survives is listed below.]

Coefficients: each of d1 to d12 has Pr(>|t|) <2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: [value lost] on 564 degrees of freedom
F-statistic: [value lost] on 12 and 564 DF, p-value: < 2.2e-16

Check the residuals for
(1) Normality
(2) Constant variance
(3) Independence
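The three residual checks listed above can be drawn with base R graphics. The following is a minimal sketch on simulated data, not the handout's own code: the objects n.sim, month, y.sim, fit.sim and e are hypothetical, and a factor is used in place of the twelve hand-made dummies.

```r
## Simulated seasonal data (an assumption; stands in for hstarts)
set.seed(2)
n.sim <- 240
month <- factor(rep(1:12, length.out = n.sim))
y.sim <- 100 + 10 * sin(2 * pi * as.integer(month) / 12) + rnorm(n.sim, sd = 5)

## No-intercept fit with one coefficient per month
fit.sim <- lm(y.sim ~ -1 + month)
e <- residuals(fit.sim)

op <- par(mfrow = c(2, 2))
## (1) Normality: points should follow the reference line
qqnorm(e); qqline(e)
## (2) Constant variance: spread should not change with the fitted values
plot(fitted(fit.sim), e, xlab = "fitted values", ylab = "residuals")
abline(h = 0, lty = 2)
## (3) Independence: residuals in time order
plot(e, type = "l", xlab = "Time", ylab = "residuals")
## (3) Independence: e(t) against e(t-1); a visible slope suggests serial correlation
plot(e[-n.sim], e[-1], xlab = "e(t-1)", ylab = "e(t)")
par(op)
```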

[Figure: residual plots for hs.fit. Panels: normal Q-Q plot (sample quantiles against theoretical quantiles), residuals against fitted values, residuals against Time, and e(t) against e(t-1)]

Tests for serial correlation

> ## durbin.watson() is the older name in car; recent versions provide durbinWatsonTest()
> durbin.watson(hs.fit, method="normal", alternative="two.sided")
[Output columns: lag, Autocorrelation, D-W Statistic, p-value; values lost in transcription]
Alternative hypothesis: rho != 0

> durbin.watson(hs.fit, method="normal", alternative="positive")
[Output columns: lag, Autocorrelation, D-W Statistic, p-value; values lost in transcription]
Alternative hypothesis: rho > 0

Next we compute AIC and SIC (we use the log version shown on page 101 of the textbook).

> hs.aic <- log(sum(hs.fit$residuals^2)/n)+2*12/n
> hs.sic <- log(sum(hs.fit$residuals^2)/n)+12*log(n)/n
> hs.aic
[1] [value lost in transcription]
> hs.sic
[1] [value lost in transcription]
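The log-form criteria used above, log(SSR/n) + 2k/n for AIC and log(SSR/n) + k*log(n)/n for SIC, can be wrapped in a small helper so that k is taken from the fitted model rather than typed by hand. This is a sketch only: ic.log is a hypothetical name, and the data are simulated.

```r
## Log-form AIC and SIC for a fitted lm object (hypothetical helper)
ic.log <- function(fit) {
  e <- residuals(fit)
  n <- length(e)            # number of observations
  k <- length(coef(fit))    # number of estimated regression coefficients
  c(aic = log(sum(e^2) / n) + 2 * k / n,
    sic = log(sum(e^2) / n) + k * log(n) / n)
}

## Simulated example (an assumption; not the housing data)
set.seed(3)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)
fit1 <- lm(y ~ x)
out <- ic.log(fit1)
out
```

Because log(n) exceeds 2 once n is 8 or more, SIC penalizes each extra coefficient more heavily than AIC for any sample of realistic size.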

Notice that k in AIC and SIC is now 12, because there are 12 dummy variables in the model.

The following shows the forecasts for the next 12 months, with prediction intervals and confidence intervals:

> hs.new <- data.frame(d1=c(1, rep(0,11)), d2=c(0,1,rep(0,10)),
+ d3=c(0,0,1,rep(0,9)), d4=c(0,0,0,1,rep(0,8)),
+ d5=c(0,0,0,0,1,rep(0,7)), d6=c(rep(0,5),1,rep(0,6)),
+ d7=c(rep(0,6),1,rep(0,5)), d8=c(rep(0,7),1,rep(0,4)),
+ d9=c(rep(0,8),1,0,0,0), d10=c(rep(0,9),1,0,0),
+ d11=c(rep(0,10),1,0), d12=c(rep(0,11),1))
>
> hs.pred.plim <- predict(hs.fit, hs.new, interval="prediction")
> hs.pred.clim <- predict(hs.fit, hs.new, interval="confidence")
> hs.pred.plim
[Output: 12 rows with columns fit, lwr, upr; values lost in transcription]
> hs.pred.clim
[Output: 12 rows with columns fit, lwr, upr; values lost in transcription]

The plot with the forecasts and prediction interval:

> plot(Time,hstarts,type="l",xlim=c(1,590),ylim=c(0,250))
> lines((n+1):(n+12), hs.pred.plim[,1],lty=2)

> lines((n+1):(n+12), hs.pred.plim[,2],lty=2)
> lines((n+1):(n+12), hs.pred.plim[,3],lty=2)

[Figure: hstarts against Time, extended with the 12 forecasts and their prediction limits as dashed lines]

1.4 Single Month Effect

Suppose we are interested in the effect of June only. We will need two dummy variables. First, we create a dummy variable for June:

D1 = 1 if the month is June, D1 = 0 if the month is not June.

The other dummy variable, D2, captures the non-June effect. That is, D2 = 1 - D1. The fitted model is

> d1.june <- rep(c(rep(0,5),1,rep(0,6)),48)
> d1.notjune <- rep(1,576)-d1.june ## 1 minus the June dummy (not the January dummy d1)
> hs.june.fit <- lm(hstarts~-1+d1.june+d1.notjune)
> summary(hs.june.fit)

Call:
lm(formula = hstarts ~ -1 + d1.june + d1.notjune)

[Output: the residual quantiles and the Estimate, Std. Error and t value columns were lost in transcription. What survives is listed below.]

Coefficients:
d1.june: Pr(>|t|) <2e-16 ***

d1.notjune: Pr(>|t|) <2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: [value lost] on 574 degrees of freedom
F-statistic: 3678 on 2 and 574 DF, p-value: < 2.2e-16

Check the residual plots

[Figure: residual plots for hs.june.fit. Panels: normal Q-Q plot (sample quantiles against theoretical quantiles), residuals against fitted values, residuals against Time, and e(t) against e(t-1)]

Tests for serial correlation: that is, is e(t) correlated with e(t-1)?

> durbin.watson(hs.june.fit, method="normal", alternative="two.sided")
[Output columns: lag, Autocorrelation, D-W Statistic, p-value; values lost in transcription]
Alternative hypothesis: rho != 0

> durbin.watson(hs.june.fit, method="normal", alternative="positive")
[Output columns: lag, Autocorrelation, D-W Statistic, p-value; values lost in transcription]
Alternative hypothesis: rho > 0
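To see what the Durbin-Watson statistic measures, it can be computed directly from the residuals as the sum of squared successive differences divided by the sum of squares, which is approximately 2(1 - r1), where r1 is the lag-1 residual autocorrelation. The sketch below uses simulated data with AR(1) errors; all object names are hypothetical, and it computes only the statistic, not the p-value reported by the car package.

```r
## Simulated regression with positively autocorrelated errors (an assumption)
set.seed(4)
n.sim <- 200
e.ar <- as.numeric(arima.sim(model = list(ar = 0.6), n = n.sim))
x <- rnorm(n.sim)
y <- 1 + 0.5 * x + e.ar

fit <- lm(y ~ x)
e <- residuals(fit)

## Durbin-Watson statistic: sum of squared successive differences over sum of squares
dw <- sum(diff(e)^2) / sum(e^2)

## Lag-1 autocorrelation of the residuals and the approximation 2 * (1 - r1)
r1 <- sum(e[-1] * e[-n.sim]) / sum(e^2)
c(DW = dw, approx = 2 * (1 - r1))
```

Values of the statistic well below 2 point to positive serial correlation, values near 2 to none, and values well above 2 to negative serial correlation.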

AIC and SIC values:

> hs.june.aic <- log(sum(hs.june.fit$residuals^2)/n)+2*2/n
> hs.june.sic <- log(sum(hs.june.fit$residuals^2)/n)+2*log(n)/n
> hs.june.aic
[1] [value lost in transcription]
> hs.june.sic
[1] [value lost in transcription]

Notice that the AIC and SIC are higher than those from the model including all 12 dummy variables.

1.5 Adding Trend to the Model

Now, in addition to the 12 dummy variables assessing the monthly effects, we ask whether housing starts are related to Time. So we add a linear Time trend to the model. The fitted regression is

> hs2.fit <- lm(hstarts~-1+d1+d2+d3+d4+d5+d6+d7+d8+d9+d10+d11+d12+Time)
> summary(hs2.fit)

Call:
lm(formula = hstarts ~ -1 + d1 + d2 + d3 + d4 + d5 + d6 + d7 + d8 + d9 + d10 + d11 + d12 + Time)

[Output: the residual quantiles and most of the Estimate, Std. Error and t value columns were lost in transcription. What survives is listed below.]

Coefficients:
each of d1 to d12 has Pr(>|t|) <2e-16 ***
Time: estimate printed as 1.160e with its exponent lost; remaining values lost

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: [value lost] on 563 degrees of freedom
F-statistic: [value lost] on 13 and 563 DF, p-value: < 2.2e-16

Check the residual plots

[Figure: residual plots for hs2.fit. Panels: normal Q-Q plot (sample quantiles against theoretical quantiles), residuals against fitted values, residuals against Index, and e(t) against e(t-1)]

Normality test:

> shapiro.test(hs2.fit$residuals)

Shapiro-Wilk normality test
data: hs2.fit$residuals
W = [value lost], p-value = 4.047e-09

AIC and SIC:

> hs2.aic <- log(sum(hs2.fit$residuals^2)/n)+2*13/n
> hs2.sic <- log(sum(hs2.fit$residuals^2)/n)+13*log(n)/n
> hs2.aic
[1] [value lost in transcription]
> hs2.sic
[1] [value lost in transcription]

Conclusions

Housing starts appear to be higher from April to June. The Time trend is not shown to be significant. The AIC and SIC values are slightly lower when the Time trend is not included. Adding the linear Time trend to the model does little to improve the normal plot.

Exercise

1. Housing starts appear to be higher from April to June. Build a model to assess the effects of this period by doing the following.

(a) How would you create dummy variables for the effects of April, May and June? To assess the effects, four dummy variables should go into the model. Write out these four dummy variables.

(b) Using the data from the period [dates lost in transcription], fit the regression model on these four dummy variables. Present your regression and give the estimates of those effects from the regression.

(c) What are the p-values for the t-tests shown in the output? Write down the hypotheses for each t-test and draw conclusions about significance.

(d) Do the residuals support your model assumptions? Check the residual plots, including the qqnorm plot, the residuals-time plot and a plot for detecting serial correlation. Also carry out the Durbin-Watson test on the residuals.

(e) Find AIC and SIC and compare them with those of the models shown in this handout.

(f) Provide the prediction intervals and confidence intervals for the forecasts for April, May and June 1994.


More information

Modelling using ARMA processes

Modelling using ARMA processes Modelling using ARMA processes Step 1. ARMA model identification; Step 2. ARMA parameter estimation Step 3. ARMA model selection ; Step 4. ARMA model checking; Step 5. forecasting from ARMA models. 33

More information

GLS and related issues

GLS and related issues GLS and related issues Bernt Arne Ødegaard 27 April 208 Contents Problems in multivariate regressions 2. Problems with assumed i.i.d. errors...................................... 2 2 NON-iid errors 2 2.

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information

Time-Series Regression and Generalized Least Squares in R*

Time-Series Regression and Generalized Least Squares in R* Time-Series Regression and Generalized Least Squares in R* An Appendix to An R Companion to Applied Regression, third edition John Fox & Sanford Weisberg last revision: 2018-09-26 Abstract Generalized

More information

Cherry.R. > cherry d h v <portion omitted> > # Step 1.

Cherry.R. > cherry d h v <portion omitted> > # Step 1. Cherry.R ####################################################################### library(mass) library(car) cherry < read.table(file="n:\\courses\\stat8620\\fall 08\\trees.dat",header=T) cherry d h v 1

More information

CHAPTER 2 SIMPLE LINEAR REGRESSION

CHAPTER 2 SIMPLE LINEAR REGRESSION CHAPTER 2 SIMPLE LINEAR REGRESSION 1 Examples: 1. Amherst, MA, annual mean temperatures, 1836 1997 2. Summer mean temperatures in Mount Airy (NC) and Charleston (SC), 1948 1996 Scatterplots outliers? influential

More information

Workshop 9.3a: Randomized block designs

Workshop 9.3a: Randomized block designs -1- Workshop 93a: Randomized block designs Murray Logan November 23, 16 Table of contents 1 Randomized Block (RCB) designs 1 2 Worked Examples 12 1 Randomized Block (RCB) designs 11 RCB design Simple Randomized

More information

Econometric Forecasting Overview

Econometric Forecasting Overview Econometric Forecasting Overview April 30, 2014 Econometric Forecasting Econometric models attempt to quantify the relationship between the parameter of interest (dependent variable) and a number of factors

More information

Chapter 8: Correlation & Regression

Chapter 8: Correlation & Regression Chapter 8: Correlation & Regression We can think of ANOVA and the two-sample t-test as applicable to situations where there is a response variable which is quantitative, and another variable that indicates

More information

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker Statistical Methods and Data Analysis, 5 th edition. [Handouts based on notes prepared

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 17th Class 7/1/10

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 17th Class 7/1/10 Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis 17th Class 7/1/10 The only function of economic forecasting is to make astrology look respectable. --John Kenneth Galbraith show

More information

Univariate linear models

Univariate linear models Univariate linear models The specification process of an univariate ARIMA model is based on the theoretical properties of the different processes and it is also important the observation and interpretation

More information

Ch3. TRENDS. Time Series Analysis

Ch3. TRENDS. Time Series Analysis 3.1 Deterministic Versus Stochastic Trends The simulated random walk in Exhibit 2.1 shows a upward trend. However, it is caused by a strong correlation between the series at nearby time points. The true

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

R Output for Linear Models using functions lm(), gls() & glm()

R Output for Linear Models using functions lm(), gls() & glm() LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base

More information

Seasonal Adjustment using X-13ARIMA-SEATS

Seasonal Adjustment using X-13ARIMA-SEATS Seasonal Adjustment using X-13ARIMA-SEATS Revised: 10/9/2017 Summary... 1 Data Input... 3 Limitations... 4 Analysis Options... 5 Tables and Graphs... 6 Analysis Summary... 7 Data Table... 9 Trend-Cycle

More information

Tests of Linear Restrictions

Tests of Linear Restrictions Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some

More information

1 Multiple Regression

1 Multiple Regression 1 Multiple Regression In this section, we extend the linear model to the case of several quantitative explanatory variables. There are many issues involved in this problem and this section serves only

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

Outline. 1 Preliminaries. 2 Introduction. 3 Multivariate Linear Regression. 4 Online Resources for R. 5 References. 6 Upcoming Mini-Courses

Outline. 1 Preliminaries. 2 Introduction. 3 Multivariate Linear Regression. 4 Online Resources for R. 5 References. 6 Upcoming Mini-Courses UCLA Department of Statistics Statistical Consulting Center Introduction to Regression in R Part II: Multivariate Linear Regression Denise Ferrari denise@stat.ucla.edu Outline 1 Preliminaries 2 Introduction

More information

Solution to Series 3

Solution to Series 3 Prof. Nicolai Meinshausen Regression FS 2016 Solution to Series 3 1. a) The general least-squares regression estimator is given as Using the model equation, we get in this case ( ) X T x X (1)T x (1) x

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

Introduction and Single Predictor Regression. Correlation

Introduction and Single Predictor Regression. Correlation Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator Lab: Box-Jenkins Methodology - US Wholesale Price Indicator In this lab we explore the Box-Jenkins methodology by applying it to a time-series data set comprising quarterly observations of the US Wholesale

More information

Pumpkin Example: Flaws in Diagnostics: Correcting Models

Pumpkin Example: Flaws in Diagnostics: Correcting Models Math 3080. Treibergs Pumpkin Example: Flaws in Diagnostics: Correcting Models Name: Example March, 204 From Levine Ramsey & Smidt, Applied Statistics for Engineers and Scientists, Prentice Hall, Upper

More information

Introduction to Linear Regression

Introduction to Linear Regression Introduction to Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Introduction to Linear Regression 1 / 46

More information