Homework 2. For the homework, be sure to give full explanations where required and to turn in any relevant plots.


1 Data analysis problems

1. The file berkeley.dat contains average yearly temperatures for the cities of Berkeley and Santa Barbara. Import the data into R using the following commands:

    berk=scan("berkeley.dat", what=list(double(0),double(0),double(0)))
    time=berk[[1]]
    berkeley=berk[[2]]
    stbarb=berk[[3]]

(a) Plot the variables berkeley and stbarb versus time. Also, plot berkeley versus stbarb.

Figure 1: 1-(a) berkeley vs time, stbarb vs time and berkeley vs stbarb.

(b) Perform a regression of berkeley on time. What do you think about this fit? Be sure to make diagnostic plots (including an acf plot) of the residuals. If there are any violations of the assumptions for a linear regression model, make sure to comment on them.

Figure 2: 1-(b) residual diagnostics for a regression of berkeley on time. Panels: berkeley vs time, bfit1$residuals vs bfit1$fitted, bfit1$residuals vs index, acf of bfit1$residuals, and a normal Q-Q plot (sample vs theoretical quantiles).

    Call: lm(formula = berkeley ~ time)
    Coefficients: (Intercept) and time are both significant at the 0.001 level (***).
    Residual standard error on 102 degrees of freedom.
    F-statistic on 1 and 102 DF, p-value: 5.228e-13

This seems to be a fairly reasonable fit for the data. The F test indicates a strong relationship. The residual plots do not indicate anything out of the ordinary. The only troubling feature is the rather low adjusted R-squared.

(c) Perform a regression of berkeley on stbarb. Comment on the fit and the residuals.

Figure 3: 1-(c) residual diagnostics for a regression of berkeley on stbarb. Panels: berkeley vs stbarb, residuals vs bfit2$fitted, bfit2$residuals vs index, acf of bfit2$residuals, and a normal Q-Q plot (sample vs theoretical quantiles).

    Call: lm(formula = berkeley ~ stbarb)
    Coefficients: (Intercept) and stbarb are both significant at the 0.001 level (***).
    Residual standard error on 102 degrees of freedom.
    F-statistic on 1 and 102 DF, p-value: 1.153e-06

This also seems like a reasonable fit, with residuals that don't strongly violate the regression assumptions. Again the adjusted R-squared is rather low. The acf plot for the residuals shows a correlation at lag 11 that is larger than expected if the residuals were independent. However, it is not extremely large and is most likely due to ordinary sampling variation.

(d) Make a plot of the variable berkeley and an acf plot of the data. Does the time series appear to be stationary? Explain. Interpret the acf plot in this situation.

Figure 4: 1-(d) acf of the variable berkeley.

The time series has an increasing trend, so it cannot be stationary. The acf plot is difficult to interpret since the data are not stationary; it cannot be read as an approximation to the autocorrelation function.

(e) Difference the data. Plot this differenced data, and make an acf plot. What is your opinion of whether the series is stationary after differencing?

Figure 5: 1-(e) differenced berkeley (diffber) vs time and acf of differenced berkeley.

The data seem to be fairly stationary after differencing, with a fairly constant variance and no discernible trend.

(f) Now, we have detrended this series both by linear regression and by differencing. The result of detrending via regression was a model that fit rather well and residuals that had no apparent dependency. Let us assume then that the true model for this data is

$x_t = \beta_0 + \beta_1 t + w_t$

where $w_t$, $t = 1, \ldots, T$, is normal white noise with variance $\sigma^2$. (This is the same as assuming that this data follows the standard regression assumptions.) Assuming this model, write out a formula for the differenced time series, $\nabla x_t$. Use this to explain the apparent dependency in the differenced data from 1(e) above.

The model after differencing would be

$\nabla x_t = x_t - x_{t-1} = \beta_1 + w_t - w_{t-1}$

The differenced data is an MA(1) series (with a constant mean $\beta_1$) with a negative $\theta_1$; here $\theta_1 = -1$, so the theoretical lag-one autocorrelation is $\theta_1 / (1 + \theta_1^2) = -1/2$. This corresponds very well to the acf plot in the previous question. The acf at lag one is negative and significantly outside the confidence intervals. The other lags show no or weak dependency.
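The argument in 1(f) can be checked numerically. The following is a minimal sketch on simulated data, since berkeley.dat is not available here; the intercept, slope, noise level, series length and seed are all hypothetical choices, not values from the homework. It differences a linear trend plus white noise and compares the sample acf with the theoretical MA(1) autocorrelations for $\theta_1 = -1$.

```r
# Sketch only: simulated data stand in for the berkeley series.
set.seed(1)                       # arbitrary seed, for reproducibility
n     <- 2000                     # hypothetical length, long enough to keep sampling noise small
beta0 <- 55                       # hypothetical intercept
beta1 <- 0.02                     # hypothetical slope
tt    <- 1:n
x     <- beta0 + beta1 * tt + rnorm(n, sd = 0.7)   # x_t = beta0 + beta1 * t + w_t

dx <- diff(x)                     # nabla x_t = beta1 + w_t - w_{t-1}

# Sample autocorrelations of the differenced series at lags 0..5.
r <- acf(dx, lag.max = 5, plot = FALSE)$acf[, 1, 1]
print(round(r, 2))

# Theoretical MA(1) autocorrelations with theta_1 = -1:
# rho(1) = -1/2 and rho(h) = 0 for h > 1.
rho <- ARMAacf(ma = -1, lag.max = 5)
print(rho)

# The mean of the differenced series estimates the slope beta1.
print(mean(dx))
```

With the real data, the same comparison would use diff(berkeley) in place of dx; the sample acf there is based on far fewer observations, so it will be noisier than in this sketch.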

2. Load the data in dailyibm.dat using the command ibm=scan("dailyibm.dat", skip=1). This series is the daily closing price of IBM stock from Jan 1, 1980 to Oct 8, [year lost].

(a) Make a plot of the data and an acf plot of the data. Does the time series appear to be stationary? Explain. Interpret the acf plot in this situation.

Figure 6: 2-(a) a time series plot for ibm and its acf.

Judging from the time series plot and its acf, the series does not appear stationary. The series wanders in a fashion similar to a random walk. The acf no longer approximates an autocorrelation function.

(b) Difference the data. Plot this differenced data, and make an acf plot. What is your opinion of whether the series is stationary after differencing?

Figure 7: 2-(b) a time series plot for diffibm and its acf.

The plot no longer wanders; however, the variance seems to be increasing, which contradicts stationarity. Again, the acf plot does not have a clear interpretation.

(c) Another option for attempting to obtain stationary data when there is something similar to an exponential trend is to take the logarithm. Use the R command log() to take the logarithm of the data. Plot this transformed data. Does the transformed data appear stationary? Explain.

Figure 8: 2-(c) a time series plot for logibm and its acf.

The series still seems to wander after the transformation. The acf no longer approximates an autocorrelation function.

(d) Perhaps some combination of differencing and the logarithmic transform will give us stationary data. Why would log(diff(ibm)) not be a very good idea? Try the opposite: difference the log transformed data, difflogibm=diff(log(ibm)). Except for a few extreme outliers, does this transformation succeed in creating stationary data?

Figure 9: 2-(d) a time series plot for difflogibm and its acf.

Taking the log of the difference would attempt to take the logarithm of many negative values, for which the logarithm is undefined. Taking the difference of the log yields data that are reasonably stationary.

(e) Delete the extreme outliers using the following command:

    difflogibm=difflogibm[difflogibm> -0.1]

Plot this data and the acf for this data. Sometimes with very long time series like this one, portions of the series exhibit different behavior than other portions. Break the series into two parts using the following commands:

    difflogibm1= difflogibm[1:500]
    difflogibm2= difflogibm[501:length(difflogibm)]

Plot both of these and create acf plots of each. Do you notice a difference between these two sections of the larger time series?

Figure 10: 2-(e) a time series plot for difflogibm (without the extreme outlier) and its acf.

Figure 11: 2-(e) time series plots for difflogibm1 and difflogibm2, and their acfs.

The acf plots seem to indicate that difflogibm1 is slightly dependent whereas difflogibm2 is essentially white noise.
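The point of 2(d) can be illustrated on simulated prices. This is only a sketch: the price path below is a hypothetical geometric random walk (drift, volatility, starting price and seed are assumptions), not the IBM series. It shows that log(diff(price)) is undefined on every day the price falls, while diff(log(price)) recovers the underlying log returns.

```r
# Sketch only: a simulated, always-positive price path stands in for ibm.
set.seed(2)                                   # arbitrary seed
n     <- 3000                                 # hypothetical number of trading days
ret   <- rnorm(n, mean = 0.0005, sd = 0.015)  # hypothetical daily log returns
price <- 60 * exp(cumsum(ret))                # hypothetical starting level of 60

d  <- diff(price)        # raw price changes: positive and negative
dl <- diff(log(price))   # differences of the logs

# log() of a negative number is NaN, so log(diff(price)) loses every down day.
frac_negative <- mean(d < 0)
n_nan <- sum(is.nan(suppressWarnings(log(d))))
print(c(frac_negative = frac_negative, n_nan = n_nan))

# diff(log(price)) is exactly the sequence of simulated log returns (after the first).
print(all.equal(dl, ret[-1]))

# Spread of the raw differences vs the log differences in each half of the series.
half <- seq_along(d) <= length(d) / 2
print(c(sd_diff_first = sd(d[half]),  sd_diff_second = sd(d[!half]),
        sd_dlog_first = sd(dl[half]), sd_dlog_second = sd(dl[!half])))
```

In this simulation the log differences have the same spread throughout by construction, whereas the spread of the raw differences scales with the price level.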

(f) Assume the model for the data that we have called difflogibm2 is of the following form:

$d_t = \delta + w_t$

where $w_t$, $t = 1, \ldots, T$, is normal white noise with variance $\sigma^2$. Is this reasonable from what you now know of this time series? How would you estimate $\delta$ and $\sigma$? Give the estimates.

As mentioned above, the second series appears to have no dependency and could, therefore, be shifted white noise. We can estimate $\delta$ and $\sigma$ by using the sample mean and the sample standard deviation to obtain the estimates [value lost] and [value lost], respectively.
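The estimation step in 2(f) amounts to two function calls. A minimal sketch, assuming a simulated stand-in for difflogibm2 (the true values of $\delta$ and $\sigma$, the length and the seed below are hypothetical, chosen only so the estimates can be compared with something known):

```r
# Sketch only: simulated shifted white noise stands in for difflogibm2.
set.seed(3)                          # arbitrary seed
delta_true <- 0.0005                 # hypothetical shift
sigma_true <- 0.015                  # hypothetical noise standard deviation
d2 <- delta_true + rnorm(2500, sd = sigma_true)   # d_t = delta + w_t

delta_hat <- mean(d2)                # sample mean estimates delta
sigma_hat <- sd(d2)                  # sample standard deviation estimates sigma
se_delta  <- sigma_hat / sqrt(length(d2))   # standard error of the mean under independence

print(c(delta_hat = delta_hat, sigma_hat = sigma_hat, se_delta = se_delta))
```

On the real data one would replace d2 with difflogibm2. The standard error formula relies on the independence that the acf plot suggested for this part of the series.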

3. Load the monthly temperature data for England from 1723 to 1970 using the following command:

    engtemp=scan("tpmon.dat",skip=1)

(a) Plot the data and create an acf. Try doing this only on the first 300 observations.

Figure 12: 3-(a) a time series plot for engtemp[1:300] and its acf.

There is a clear seasonal pattern over the course of the year. The acf plot does not decay as the lag increases and cannot be correctly interpreted because of the periodic trend.

(b) Fit the following model using lm():

$x_t = \beta_0 + \beta_1 \cos\left(2\pi \tfrac{1}{12} t\right) + \beta_2 \sin\left(2\pi \tfrac{1}{12} t\right) + w_t$

(You will need to create the variables for the covariates. It may be useful to know that there are R functions sin() and cos().)

    Call: lm(formula = engtemp ~ tcosine + tsine)
    Coefficients: (Intercept), tcosine and tsine all have Pr(>|t|) < 2e-16 (***).
    Residual standard error on 2973 degrees of freedom.
    F-statistic: 1.441e+04 on 2 and 2973 DF, p-value: < 2.2e-16

(c) Plot the residuals of the above fit. Comment on these residuals. You may want to look at only a few hundred at a time. Are the residuals dependent? Are they stationary?

Figure 13: 3-(c) diagnostics for the residuals. Panels: engtemp[1:300] vs index, tfit$residuals[1:300] vs tfit$fitted[1:300], tfit$residuals[1:300] vs index, acf of tfit$residuals[1:300], and a normal Q-Q plot (sample vs theoretical quantiles).

The periodic trend is largely removed. However, the acf plot is not falling off, indicating there may be some trend left in the series. The residuals appear fairly stationary, but there are some indications that this may not be the case. They do appear more stationary than the original time series, however.

(d) Compare the periodograms of the original data and the residuals from the fit model.

Figure 14: 3-(d) scaled periodograms (vs frequency) of the original data (top) and the residuals from the fit model (bottom).

In the original time series, the periodogram is dominated by the yearly periodic trend (i.e., a period of 12 months, frequency 1/12). Only one clear spike is visible. When this is removed, other patterns emerge. Specifically, there is still a spike at around frequency 1/6 (a period of 6 months) and the low frequencies are somewhat strong beyond that.
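Parts (b) and (d) can be sketched together on simulated data. Everything below is an assumption for illustration: the series is a made-up monthly signal with a strong annual cycle and a weaker 6-month cycle, not tpmon.dat, and the raw periodogram from spec.pgram() is used in place of whatever scaled periodogram the course code produced. The variable names tcosine, tsine and tfit follow the lm() output above.

```r
# Sketch only: a simulated monthly series stands in for engtemp.
set.seed(4)                                   # arbitrary seed
n  <- 2976                                    # 2973 residual df + 3 coefficients, as in the fit above
tt <- 1:n
x  <- 9 + 6 * cos(2 * pi * tt / 12) +         # hypothetical annual cycle
      1 * sin(2 * pi * tt / 6) +              # hypothetical weaker 6-month cycle
      rnorm(n, sd = 1.5)                      # hypothetical noise

# Covariates for the harmonic regression at frequency 1/12.
tcosine <- cos(2 * pi * tt / 12)
tsine   <- sin(2 * pi * tt / 12)
tfit    <- lm(x ~ tcosine + tsine)
print(coef(tfit))

# Raw periodograms of the data and of the residuals (no taper, no padding).
p_orig <- spec.pgram(x, taper = 0, fast = FALSE, demean = TRUE,
                     detrend = FALSE, plot = FALSE)
p_res  <- spec.pgram(residuals(tfit), taper = 0, fast = FALSE, demean = TRUE,
                     detrend = FALSE, plot = FALSE)

# Frequencies (cycles per month) at which each periodogram peaks.
peak_orig <- p_orig$freq[which.max(p_orig$spec)]
peak_res  <- p_res$freq[which.max(p_res$spec)]
print(c(peak_orig = peak_orig, peak_res = peak_res))
```

In this simulation the original series peaks at frequency 1/12 and the residuals at 1/6, which mirrors the pattern described in 3(d). The low-frequency strength seen in the real residuals is not reproduced here because the simulated noise is white.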

4. Use the smoothing techniques introduced in class and above to estimate the trend in the global temperature data. Find a suitable window size or bandwidth and explain why you chose it. (Example 2.1 from the textbook. The data can be found in globtemp.dat.)

In this problem, the students need to fit the global temperature data with moving average and kernel smoothing. Ideally, the students will show a few plots and discuss why a certain plot is the best. Below I have some example plots of MA smoothing (undersmoothed, about right, and oversmoothed). For the good level of smoothing, I have also shown the residuals and the acf. I then repeated the process for kernel smoothing.

Figure 15: 4-(a) MA smoothing (1st row: undersmoothed, 2nd row: about right, 3rd row: oversmoothed). Panel labels: globtemp, the residuals, and the acf of the residual series resunderma5, resgoodma15 and resoverma40.

Figure 16: 4-(b) kernel smoothing (1st row: undersmoothed, 2nd row: about right, 3rd row: oversmoothed). Panel labels: globtemp, the residuals, and the acf of the residual series resunderksm, resgoodksm and resoverksm.
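One way to produce smooths like those in Figures 15 and 16 is with stats::filter() for the moving average and ksmooth() for the kernel smoother. This is a sketch under stated assumptions: the data are simulated rather than read from globtemp.dat, the window sizes 5, 15 and 41 and the bandwidth 15 are illustrative choices (odd windows keep the moving average centered), and the helper names ma and rough are hypothetical.

```r
# Sketch only: a simulated yearly series stands in for globtemp.
set.seed(5)                                    # arbitrary seed
n     <- 140                                   # hypothetical number of years
yr    <- 1:n
trend <- 0.00004 * (yr - 40)^2                 # hypothetical smooth trend
x     <- trend + rnorm(n, sd = 0.12)           # hypothetical noise level

# Centered moving average with equal weights over a window of k points.
ma <- function(x, k) stats::filter(x, rep(1 / k, k), sides = 2)
ma5  <- ma(x, 5)     # small window: follows the noise (undersmoothed)
ma15 <- ma(x, 15)    # intermediate window
ma41 <- ma(x, 41)    # large window: flattens real structure (oversmoothed)

# Kernel smoother with a normal kernel, evaluated at the observed years.
ks <- ksmooth(yr, x, kernel = "normal", bandwidth = 15, x.points = yr)$y

# A crude roughness measure: mean squared second difference of the smooth.
rough <- function(s) mean(diff(as.numeric(s), differences = 2)^2, na.rm = TRUE)
print(c(ma5 = rough(ma5), ma15 = rough(ma15), ma41 = rough(ma41), ks = rough(ks)))

# Residuals from the intermediate smooth and their acf (end points are NA).
res15 <- x - as.numeric(ma15)
res15 <- res15[!is.na(res15)]
print(round(acf(res15, lag.max = 10, plot = FALSE)$acf[, 1, 1], 2))
```

Roughness falls as the window grows, so it cannot choose the window by itself; the residual acf is the complementary check, since a good smooth should leave residuals with little remaining dependence.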

2 Theoretical problems

1. (No R required.) Show that the MA(3) model is (weakly) stationary. You need to show that the mean is zero and the covariance function depends only on distance. This will be very similar to what was done in class for the MA(2) model.

$E[x_t] = E[w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3}] = 0.$

$E[x_t x_t] = E[(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})]$
$= E[w_t^2] + E[\theta_1^2 w_{t-1}^2] + E[\theta_2^2 w_{t-2}^2] + E[\theta_3^2 w_{t-3}^2]$
$= \sigma^2 + \theta_1^2 \sigma^2 + \theta_2^2 \sigma^2 + \theta_3^2 \sigma^2$

Now, let's do the calculation where s and t are only one time unit away from each other. Remember that all MA models have mean zero, which will simplify the calculations.

$E[x_t x_{t-1}] = E[(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})(w_{t-1} + \theta_1 w_{t-2} + \theta_2 w_{t-3} + \theta_3 w_{t-4})]$
$= E[\theta_1 w_{t-1}^2] + E[\theta_1 \theta_2 w_{t-2}^2] + E[\theta_2 \theta_3 w_{t-3}^2]$
$= \theta_1 \sigma^2 + \theta_1 \theta_2 \sigma^2 + \theta_2 \theta_3 \sigma^2$

Now, let's try a lag of two.

$E[x_t x_{t-2}] = E[(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})(w_{t-2} + \theta_1 w_{t-3} + \theta_2 w_{t-4} + \theta_3 w_{t-5})]$
$= E[\theta_2 w_{t-2}^2] + E[\theta_1 \theta_3 w_{t-3}^2]$
$= \theta_2 \sigma^2 + \theta_1 \theta_3 \sigma^2$

And for three,

$E[x_t x_{t-3}] = E[(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})(w_{t-3} + \theta_1 w_{t-4} + \theta_2 w_{t-5} + \theta_3 w_{t-6})]$
$= E[\theta_3 w_{t-3}^2] = \theta_3 \sigma^2$

Now we see the pattern: if we move to a lag of four or greater, the windows do not overlap. Therefore, the covariance between the time series at lags larger than 3 will be zero. We illustrate with a lag of 4.

$E[x_t x_{t-4}] = E[(w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \theta_3 w_{t-3})(w_{t-4} + \theta_1 w_{t-5} + \theta_2 w_{t-6} + \theta_3 w_{t-7})] = 0$
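Although no R is required for this problem, the closed forms above are easy to check numerically. The sketch below uses hypothetical values for $\theta_1, \theta_2, \theta_3$ and $\sigma^2$; the function name gamma_ma is also just an illustration. It computes the autocovariance by summing products of overlapping coefficients, compares it with the formulas derived above, and cross-checks the implied autocorrelations against ARMAacf().

```r
# Sketch only: hypothetical MA(3) parameters.
theta  <- c(0.6, -0.4, 0.3)      # hypothetical theta_1, theta_2, theta_3
sigma2 <- 2                      # hypothetical white-noise variance
psi    <- c(1, theta)            # coefficients on w_t, w_{t-1}, w_{t-2}, w_{t-3}

# gamma(h) = sigma^2 * sum_j psi_j * psi_{j+h}; zero once the windows stop overlapping.
gamma_ma <- function(h) {
  if (h > 3) return(0)
  sigma2 * sum(psi[1:(4 - h)] * psi[(1 + h):4])
}
gam <- sapply(0:5, gamma_ma)

# Closed forms from the derivation, lags 0 through 5.
closed <- sigma2 * c(1 + sum(theta^2),
                     theta[1] + theta[1] * theta[2] + theta[2] * theta[3],
                     theta[2] + theta[1] * theta[3],
                     theta[3],
                     0,
                     0)

# Autocorrelations from base R for the same MA coefficients.
rho <- ARMAacf(ma = theta, lag.max = 5)

print(rbind(gam = gam, closed = closed))
print(rbind(gam_over_gam0 = gam / gam[1], ARMAacf = as.numeric(rho)))
```

None of these quantities involves t, which is the content of weak stationarity: the mean is constant and the covariance depends only on the lag.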

2. (No R required.) Verify that the following model is non-stationary:

$x_t = \beta_0 + \beta_1 t + \beta_2 t^2 + w_t$

where $w_t$ is white noise. Now, verify that $\nabla^2 x_t$ is stationary.

To see that the model is not stationary, one needs only to note that the mean, $\beta_0 + \beta_1 t + \beta_2 t^2$, is not constant in time. To show that $\nabla^2 x_t$ is stationary:

$\nabla^2 x_t = \nabla(x_t - x_{t-1})$
$= \nabla(\beta_1 + \beta_2 t^2 - \beta_2 (t-1)^2 + w_t - w_{t-1})$
$= \nabla(\beta_1 + 2\beta_2 t - \beta_2 + w_t - w_{t-1})$
$= 2\beta_2 + w_t - 2w_{t-1} + w_{t-2}$

Since this is an MA(2) with a constant mean, it is stationary.
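As with the previous problem, a short simulation can confirm the algebra. This is a sketch with hypothetical coefficients, series length and seed. It applies two differences to a quadratic trend plus white noise, then compares the sample mean with $2\beta_2$ and the sample acf with the theoretical autocorrelations of an MA(2) with coefficients $-2$ and $1$ (which are $-2/3$ at lag one and $1/6$ at lag two).

```r
# Sketch only: hypothetical quadratic trend plus white noise.
set.seed(6)                          # arbitrary seed
n    <- 5000                         # hypothetical length
tt   <- 1:n
beta <- c(1, 0.5, 0.01)              # hypothetical beta_0, beta_1, beta_2
x    <- beta[1] + beta[2] * tt + beta[3] * tt^2 + rnorm(n)

d2 <- diff(x, differences = 2)       # nabla^2 x_t = 2*beta_2 + w_t - 2*w_{t-1} + w_{t-2}

# The mean of the twice-differenced series should be close to 2 * beta_2.
print(c(mean_d2 = mean(d2), two_beta2 = 2 * beta[3]))

# Sample vs theoretical autocorrelations at lags 0..4.
r   <- acf(d2, lag.max = 4, plot = FALSE)$acf[, 1, 1]
rho <- ARMAacf(ma = c(-2, 1), lag.max = 4)
print(round(rbind(sample = r, theory = as.numeric(rho)), 3))
```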


Technical note on seasonal adjustment for Capital goods imports Technical note on seasonal adjustment for Capital goods imports July 1, 2013 Contents 1 Capital goods imports 2 1.1 Additive versus multiplicative seasonality..................... 2 2 Steps in the seasonal

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information

Analytics 512: Homework # 2 Tim Ahn February 9, 2016

Analytics 512: Homework # 2 Tim Ahn February 9, 2016 Analytics 512: Homework # 2 Tim Ahn February 9, 2016 Chapter 3 Problem 1 (# 3) Suppose we have a data set with five predictors, X 1 = GP A, X 2 = IQ, X 3 = Gender (1 for Female and 0 for Male), X 4 = Interaction

More information

Non-Stationary Time Series and Unit Root Testing

Non-Stationary Time Series and Unit Root Testing Econometrics II Non-Stationary Time Series and Unit Root Testing Morten Nyboe Tabor Course Outline: Non-Stationary Time Series and Unit Root Testing 1 Stationarity and Deviation from Stationarity Trend-Stationarity

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

ibm: daily closing IBM stock prices (dates not given) internet: number of users logged on to an Internet server each minute (dates/times not given)

ibm: daily closing IBM stock prices (dates not given) internet: number of users logged on to an Internet server each minute (dates/times not given) Remark: Problem 1 is the most important problem on this assignment (it will prepare you for your project). Problem 2 was taken largely from last year s final exam. Problem 3 consists of a bunch of rambling

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression OI CHAPTER 7 Important Concepts Correlation (r or R) and Coefficient of determination (R 2 ) Interpreting y-intercept and slope coefficients Inference (hypothesis testing and confidence

More information

Introduction to Linear Regression Rebecca C. Steorts September 15, 2015

Introduction to Linear Regression Rebecca C. Steorts September 15, 2015 Introduction to Linear Regression Rebecca C. Steorts September 15, 2015 Today (Re-)Introduction to linear models and the model space What is linear regression Basic properties of linear regression Using

More information

Circle the single best answer for each multiple choice question. Your choice should be made clearly.

Circle the single best answer for each multiple choice question. Your choice should be made clearly. TEST #1 STA 4853 March 6, 2017 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. There are 32 multiple choice

More information

FinQuiz Notes

FinQuiz Notes Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression

More information

22s:152 Applied Linear Regression. Returning to a continuous response variable Y...

22s:152 Applied Linear Regression. Returning to a continuous response variable Y... 22s:152 Applied Linear Regression Generalized Least Squares Returning to a continuous response variable Y... Ordinary Least Squares Estimation The classical models we have fit so far with a continuous

More information

CHAPTER 21: TIME SERIES ECONOMETRICS: SOME BASIC CONCEPTS

CHAPTER 21: TIME SERIES ECONOMETRICS: SOME BASIC CONCEPTS CHAPTER 21: TIME SERIES ECONOMETRICS: SOME BASIC CONCEPTS 21.1 A stochastic process is said to be weakly stationary if its mean and variance are constant over time and if the value of the covariance between

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS SHOOL OF MATHEMATIS AND STATISTIS Linear Models Autumn Semester 2015 16 2 hours Marks will be awarded for your best three answers. RESTRITED OPEN BOOK EXAMINATION andidates may bring to the examination

More information

Decision 411: Class 9. HW#3 issues

Decision 411: Class 9. HW#3 issues Decision 411: Class 9 Presentation/discussion of HW#3 Introduction to ARIMA models Rules for fitting nonseasonal models Differencing and stationarity Reading the tea leaves : : ACF and PACF plots Unit

More information

22s:152 Applied Linear Regression. In matrix notation, we can write this model: Generalized Least Squares. Y = Xβ + ɛ with ɛ N n (0, Σ)

22s:152 Applied Linear Regression. In matrix notation, we can write this model: Generalized Least Squares. Y = Xβ + ɛ with ɛ N n (0, Σ) 22s:152 Applied Linear Regression Generalized Least Squares Returning to a continuous response variable Y Ordinary Least Squares Estimation The classical models we have fit so far with a continuous response

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Dynamic Time Series Regression: A Panacea for Spurious Correlations

Dynamic Time Series Regression: A Panacea for Spurious Correlations International Journal of Scientific and Research Publications, Volume 6, Issue 10, October 2016 337 Dynamic Time Series Regression: A Panacea for Spurious Correlations Emmanuel Alphonsus Akpan *, Imoh

More information

Module 3. Descriptive Time Series Statistics and Introduction to Time Series Models

Module 3. Descriptive Time Series Statistics and Introduction to Time Series Models Module 3 Descriptive Time Series Statistics and Introduction to Time Series Models Class notes for Statistics 451: Applied Time Series Iowa State University Copyright 2015 W Q Meeker November 11, 2015

More information

Inferences on Linear Combinations of Coefficients

Inferences on Linear Combinations of Coefficients Inferences on Linear Combinations of Coefficients Note on required packages: The following code required the package multcomp to test hypotheses on linear combinations of regression coefficients. If you

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

APPLIED ECONOMETRIC TIME SERIES 4TH EDITION

APPLIED ECONOMETRIC TIME SERIES 4TH EDITION APPLIED ECONOMETRIC TIME SERIES 4TH EDITION Chapter 2: STATIONARY TIME-SERIES MODELS WALTER ENDERS, UNIVERSITY OF ALABAMA Copyright 2015 John Wiley & Sons, Inc. Section 1 STOCHASTIC DIFFERENCE EQUATION

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

5 Transfer function modelling

5 Transfer function modelling MSc Further Time Series Analysis 5 Transfer function modelling 5.1 The model Consider the construction of a model for a time series (Y t ) whose values are influenced by the earlier values of a series

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

Economics 618B: Time Series Analysis Department of Economics State University of New York at Binghamton

Economics 618B: Time Series Analysis Department of Economics State University of New York at Binghamton Problem Set #1 1. Generate n =500random numbers from both the uniform 1 (U [0, 1], uniformbetween zero and one) and exponential λ exp ( λx) (set λ =2and let x U [0, 1]) b a distributions. Plot the histograms

More information

Non-Stationary Time Series and Unit Root Testing

Non-Stationary Time Series and Unit Root Testing Econometrics II Non-Stationary Time Series and Unit Root Testing Morten Nyboe Tabor Course Outline: Non-Stationary Time Series and Unit Root Testing 1 Stationarity and Deviation from Stationarity Trend-Stationarity

More information

FIN822 project 2 Project 2 contains part I and part II. (Due on November 10, 2008)

FIN822 project 2 Project 2 contains part I and part II. (Due on November 10, 2008) FIN822 project 2 Project 2 contains part I and part II. (Due on November 10, 2008) Part I Logit Model in Bankruptcy Prediction You do not believe in Altman and you decide to estimate the bankruptcy prediction

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Stat 101 L: Laboratory 5

Stat 101 L: Laboratory 5 Stat 101 L: Laboratory 5 The first activity revisits the labeling of Fun Size bags of M&Ms by looking distributions of Total Weight of Fun Size bags and regular size bags (which have a label weight) of

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Extensions of One-Way ANOVA.

Extensions of One-Way ANOVA. Extensions of One-Way ANOVA http://www.pelagicos.net/classes_biometry_fa18.htm What do I want You to Know What are two main limitations of ANOVA? What two approaches can follow a significant ANOVA? How

More information

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example The Big Picture Remedies after Model Diagnostics The Big Picture Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Residual plots

More information

Lesson 8: Testing for IID Hypothesis with the correlogram

Lesson 8: Testing for IID Hypothesis with the correlogram Lesson 8: Testing for IID Hypothesis with the correlogram Dipartimento di Ingegneria e Scienze dell Informazione e Matematica Università dell Aquila, umberto.triacca@ec.univaq.it Testing for i.i.d. Hypothesis

More information

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity Outline: Further Issues in Using OLS with Time Series Data 13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process I. Stationary and Weakly Dependent Time Series III. Highly Persistent

More information

Chapter 3 - Linear Regression

Chapter 3 - Linear Regression Chapter 3 - Linear Regression Lab Solution 1 Problem 9 First we will read the Auto" data. Note that most datasets referred to in the text are in the R package the authors developed. So we just need to

More information

Non-independence due to Time Correlation (Chapter 14)

Non-independence due to Time Correlation (Chapter 14) Non-independence due to Time Correlation (Chapter 14) When we model the mean structure with ordinary least squares, the mean structure explains the general trends in the data with respect to our dependent

More information

L21: Chapter 12: Linear regression

L21: Chapter 12: Linear regression L21: Chapter 12: Linear regression Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 37 So far... 12.1 Introduction One sample

More information

Chapter 5 Exercises 1. Data Analysis & Graphics Using R Solutions to Exercises (April 24, 2004)

Chapter 5 Exercises 1. Data Analysis & Graphics Using R Solutions to Exercises (April 24, 2004) Chapter 5 Exercises 1 Data Analysis & Graphics Using R Solutions to Exercises (April 24, 2004) Preliminaries > library(daag) Exercise 2 The final three sentences have been reworded For each of the data

More information

Principal components

Principal components Principal components Principal components is a general analysis technique that has some application within regression, but has a much wider use as well. Technical Stuff We have yet to define the term covariance,

More information