Testing for IID Noise/White Noise: I

Similar documents
Testing for IID Noise/White Noise: I

DATA IN SERIES AND TIME I. Several different techniques depending on data and what one wants to do

Lesson 8: Testing for IID Hypothesis with the correlogram

Ch 8. MODEL DIAGNOSTICS. Time Series Analysis

CHAPTER 8 MODEL DIAGNOSTICS. 8.1 Residual Analysis

TMA 4275 Lifetime Analysis June 2004 Solution

Part II. Time Series

Analysis. Components of a Time Series

Chapter 8: Model Diagnostics

Stochastic Processes: I. consider bowl of worms model for oscilloscope experiment:

Classical Decomposition Model Revisited: I

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Confidence Intervals, Testing and ANOVA Summary

LECTURE 5 HYPOTHESIS TESTING

Comment about AR spectral estimation Usually an estimate is produced by computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Problem set 1 - Solutions

Econometrics. 4) Statistical inference

Simple Linear Regression

Time Series Analysis -- An Introduction -- AMS 586

Institute of Actuaries of India

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

Scatter plot of data from the study. Linear Regression

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA

Ch. 1: Data and Distributions

2008 Winton. Statistical Testing of RNGs

Chapter 10: Inferences based on two samples

B.N.Bandodkar College of Science, Thane. Random-Number Generation. Mrs M.J.Gholba

Scatter plot of data from the study. Linear Regression

2. An Introduction to Moving Average Models and ARMA Models

Can you tell the relationship between students SAT scores and their college grades?

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

Financial Econometrics and Quantitative Risk Managenent Return Properties

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

INTRODUCTION TO TIME SERIES ANALYSIS. The Simple Moving Average Model

Analysis of Violent Crime in Los Angeles County

Econometrics Summary Algebraic and Statistical Preliminaries

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

STAT Financial Time Series

Chapter 3: Regression Methods for Trends

Ch 2: Simple Linear Regression

Probability and Statistics Notes

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Inference for the Regression Coefficient

Econ 424 Time Series Concepts

Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8]

LECTURES 2-3 : Stochastic Processes, Autocorrelation function. Stationarity.

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator

Review of Statistics 101

Systems Simulation Chapter 7: Random-Number Generation

Swarthmore Honors Exam 2012: Statistics

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Simple and Multiple Linear Regression

at least 50 and preferably 100 observations should be available to build a proper model

Lecture 7: Hypothesis Testing and ANOVA

What is a Hypothesis?

Minitab Project Report - Assignment 6

5 Autoregressive-Moving-Average Modeling

Lectures 5 & 6: Hypothesis Testing

Multivariate Time Series: VAR(p) Processes and Models

4/22/2010. Test 3 Review ANOVA

distributed approximately according to white noise. Likewise, for general ARMA(p,q), the residuals can be expressed as

White Noise Processes (Section 6.2)

Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments

Circle a single answer for each multiple choice question. Your choice should be made clearly.

Wavelet Methods for Time Series Analysis. Part IX: Wavelet-Based Bootstrapping

Multivariate Time Series: Part 4

Wavelet Methods for Time Series Analysis. Motivating Question

Differencing Revisited: I ARIMA(p,d,q) processes predicated on notion of dth order differencing of a time series {X t }: for d = 1 and 2, have X t

UNIT 5:Random number generation And Variation Generation

Leftovers. Morris. University Farm. University Farm. Morris. yield

Lecture 5: Estimation of time series

STAT 520 FORECASTING AND TIME SERIES 2013 FALL Homework 05

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41202, Spring Quarter 2003, Mr. Ruey S. Tsay

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

STT 843 Key to Homework 1 Spring 2018

STAT Chapter 11: Regression

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

Practical Meta-Analysis -- Lipsey & Wilson

Inferences for Regression

Modeling and Performance Analysis with Discrete-Event Simulation

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

Simple linear regression

1 Introduction to One-way ANOVA

Exam details. Final Review Session. Things to Review

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Ch 6. Model Specification. Time Series Analysis

Correlation and Simple Linear Regression

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

Chapter 8 Heteroskedasticity

28. SIMPLE LINEAR REGRESSION III

AR, MA and ARMA models

Week 5 Quantitative Analysis of Financial Markets Characterizing Cycles

Mathematics for Economics MA course

Collection of Formulae and Statistical Tables for the B2-Econometrics and B3-Time Series Analysis courses and exams

Transcription:

Testing for IID Noise/White Noise: I want to be able to test null hypothesis time series {x t } or set of residuals {r t } is IID(0, 2 ) or WN(0, 2 ) there are many such tests, including informal test based on sample ACF (see overhead II 64) portmanteau test turning point test di erence-sign test rank test runs test will illustrate tests using all-star game time series {x t } & residuals {z t } from AR(1) model for detrended Lake Huron levels BD 35, CC 46, SS-149 IV 1

outcome of game NL win tie AL win First All-Star Baseball Game in Each Year (I 22) 1940 1960 1980 2000 year IV 2

ACF 1.0 0.5 0.0 0.5 1.0 Sample ACF for All-Star Game x t 5 10 15 20 h (lag) IV 3

AR(1) Residuals z t = r t ˆrt 1 (III 15) z t 2 1 0 1 2 1880 1900 1920 1940 1960 year IV 4

Unit Lag Scatter Plot of AR(1) Residuals z t (III 16) z t+1 2 1 0 1 2 2 1 0 1 2 IV 5 z t

ACF 1.0 0.5 0.0 0.5 1.0 Sample ACF for AR(1) Residuals z t (III 17) 0 10 20 30 40 h (lag) IV 6

Portmanteau Test: I can regard as formal version of informal test based on sample ACF, which makes use of fact that, when {X t } IID(µ, 2 ), then corresponding ˆ X (1),..., ˆ X (h) are approximately IID N (0, 1/n) for large sample sizes n (but: h/n must be small!) hence ˆ X (j) p 1/n = n 1/2ˆ X (j), j = 1,..., h, are approximately IID standard normal RVs (i.e., N (0, 1)) recall that, if Z 1, Z 2,..., Z are IID N (0, 1), then Z 2 1 + Z2 2 + + Z2 2 obeys a chi-square distribution with degrees of freedom BD 36; CC 183; SS 150 IV 7

Portmanteau Test: II hence, under null hypothesis {X t } IID(µ, 2 ), hx Q(h) n 1/2ˆ 2 hx X (j) = n ˆ 2 X (j) should obey a j=1 2 h distribution j=1 Q(h) will be large under alternative hypotheses, so can reject null hypothesis at level of significance if Q(h) > 2 1 (h), where 2 1 (h) is (1 )th quantile of 2 h distribution refinement is Ljung Box version of portmanteau test: hx ˆ 2 Q LB (h) = n(n + 2) X (j) n j (obeys j=1 2 h distribution to better approximation than Q(h) does) BD 36; CC 183; SS 150 IV 8

Portmanteau Test: III if we want to test residuals {r t } rather than a time series {x t }, then hx ˆ 2 Q LB (h) = n(n + 2) R (j) n j j=1 has the same form as before, but now obeys a 2 h p distribution, where p is the number of parameters estimated in forming {r t } as examples, let s look at Q LB (h) for all-star game time series {x t } and Lake Huron residuals {z t }, for which p = 3 arguably (have estimated intercept, slope and the AR(1) parameter ) will focus on tests with level = 0.05 red horizontal lines show 2 0.95 (h p) blue dots show Q LB (h) CC 183, SS 150 IV 9

Q LB (h) and 95% quantile 0 10 20 30 40 Portmanteau Tests of All-Star Game {x t } 0 2 4 6 8 10 h (lag) IV 10

p values 0.00 0.01 0.02 0.03 0.04 0.05 0.06 p-values for Portmanteau Tests of All-Star Game {x t } 2 4 6 8 10 h (lag) IV 11

Q LB (h) and 95% quantile 0 5 10 15 20 25 30 Portmanteau Tests of Lake Huron {z t } 0 5 10 15 20 h (lag) IV 12

p values 0.0 0.2 0.4 0.6 0.8 1.0 p-values for Portmanteau Tests of Lake Huron {z t } 0 5 10 15 20 h (lag) IV 13

Turning Point Test: I time series {x t } is said to have a turning point at t if one of the following two patterns exist: 1. x t > x t 1 and x t > x t+1 2. x t < x t 1 and x t < x t+1 if X t 1, X t and X t+1 are IID RVs with a continuous distribution, the following six events are equally likely: X t 1 < X t < X t+1 (no turning point) X t 1 < X t+1 < X t (turning point) X t < X t 1 < X t+1 (turning point) X t < X t+1 < X t 1 (turning point) X t+1 < X t 1 < X t (turning point) X t+1 < X t < X t 1 (no turning point) BD 36, 37 IV 14

Turning Point Test: II let T be number of turning points in IID sequence of length n since probability of a turning point is 2/3 under IID hypothesis, µ T E{T } = 2(n 2)/3 can show that, under IID hypothesis, 2 16n 29 T var {T } = 90 large T µ T indicates time series is either fluctuating more rapidly than what would be expected from an IID sequence or there are fewer fluctuations than expected under IID hypothesis, T is approximately N (µ T, 2 T ), so can reject null at level if T µ T / T > 1 /2, where 1 /2 is 1 /2 quantile for standard normal distribution BD 36, 37 IV 15

Turning Point Test: III note: cannot apply to all-star {x t } (has a discrete distribution) as example, T = 60 for Lake Huron {z t } (see next overhead) since n = 97, get µ T. = 63.3 and 2 T. = 16.9 ( T. = 4.1), so T µ T T. = 0.81 p-value is 0.42, so fail to reject null hypothesis BD 36, 37 IV 16

AR(1) Residuals z t = r t ˆrt 1 z t 2 1 0 1 2 1880 1900 1920 1940 1960 year IV 17

Di erence-sign Test: I let S be the number of times X t > X t 1, t = 2,..., n (equivalent to number of times X t X t 1 is positive) if {X t } is IID, then µ S E{S} = (n 1)/2 can shown that 2 S var {S} = (n + 1)/12 S is approximately equal in distribution to a N (µ S, for large n 2 S ) RV large positive (negative) value of S µ S indicates presence of increasing (decreasing) trend (doesn t have power against alternative of cyclic variations) reject null hypothesis at level if S µ S / S > 1 /2 BD 37 IV 18

Di erence-sign Test: II note: cannot apply to all-star {x t } (has a discrete distribution) as example, S = 48 for Lake Huron {z t } (see next overhead).. since n = 97, get µ S = 48 and = 8.2 ( S = 2.9), so 2 S S µ S S = 0 p-value is 1, so totally fail to reject null hypothesis!!! BD 37 IV 19

AR(1) Residuals z t = r t ˆrt 1 z t 2 1 0 1 2 1880 1900 1920 1940 1960 year IV 20

Rank Test: I given time series of length n, let P be the pairs (r, s) such that X r > X s and r > s total number of pairs such that r > s is n n(n 1) = 2 2 if {X t } is IID, each event of form X r > X s has probability of 1/2, so µ P E{P } = n(n 1)/4 can shown that 2 P var {P } = n(n 1)(2n + 5)/72 P is approximately equal in distribution to a N (µ P, for large n large positive (negative) value of P increasing (decreasing) trend 2 P ) RV µ P indicates presence of BD 37, 38 IV 21

Rank Test: II reject null hypothesis at level if P µ P / P > 1 /2 note: cannot apply to all-star {x t } (has a discrete distribution) as example, P = 2351 for Lake Huron {z t } since n = 97, get µ P = 2328 and P 2.. = 25737.3 ( P = 160.4), so P µ P. = 0.14 p-value is 0.89, so again fail to reject null hypothesis P BD 37 IV 22

test designed for Runs Test: I binary-valued {x t } (will assume x t is either 1 or 1) continuous-valued {x t }, with centering allowed; i.e., {x t where, e.g., c = x is sample mean of x t s residuals {r t } (sample mean r should be close to 0) run is defined to be set of consecutive values in series, all of which are above or below zero as an example, here is pattern of American League and National League victories over 78 years in all-star {x t }: c}, AAANANANAAANAAAANNNNANNAANNNNNNNNNNNNANNNNNNNNNNNANNANAAAAAANNNAAAAAAAAAAAANNN above has a total of 24 runs (12 each for both leagues) CC 46; SS 149 IV 23

Runs Test: II runs test is based on number of runs in series, conditional on number of observed values above and below zero (test does not require underlying probability of value being greater than zero to be 50%) let n + and n denote number of, respectively, positive and negative values in series (thus n + + n = n) let n r be number of runs under null hypothesis of IID, n r is approximately normally distributed with mean µ and variance 2 given by µ = 2n +n n + 1 and 2 = (µ 1)(µ 2) n 1 CC 46; SS 149 IV 24

Runs Test: III large positive (negative) value of n r µ indicates excessive choppiness (too much clumpiness) compared to what is reasonable under null hypothesis of IID reject null hypothesis at level if n r µ / > 1 /2 using all-star {x t } as example, n r = 24, n + = 36 & n = 42 get µ. = 39.8 and 2. = 19.0 (. = 4.4), so n r µ. = 3.6 p-value is 0.0003, so null hypothesis is not tenable CC 46; SS 149 IV 25

Runs Test: IV for Lake Huron {z t }, have n r = 37, n + = 51 & n = 46 get µ =. 49.4 and 2.. = 23.9 ( = 4.9), so n r µ. = 2.5 p-value is 0.011, so would reject null hypothesis at = 0.05 level of significance, but would fail to reject (just barely!) at more stringent = 0.01 level number of runs smaller than to be expected, suggesting clumpiness in data (agrees with positive correlation at unit lag) CC 46; SS 149 IV 26

Testing for IID Noise/White Noise: III summary of IID tests for all-star {x t } informal test based on sample ACF: IID hypothesis untenable both ˆ X (1) & ˆ X (2) well outside bounds reasonable for IID noise portmanteau test: reject resoundingly for h = 1,..., 10 turning point test: inapplicable di erence-sign test: inapplicable rank test: inapplicable runs test: reject (p-value is 0.0003) conclusion: IID hypothesis for all-star {z t } untenable action item: bet on National League in 2013! IV 27

Testing for IID Noise/White Noise: IV summary of IID tests for Lake Huron {z t } informal test based on sample ACF: IID hypothesis seems tenable, but ˆ (1) outside of 95% CI portmanteau test: fail to reject at level = 0.5 for h 4 turning point test: fail to reject (p-value is 0.42) di erence-sign test: fail to reject (p-value is 1(!)) rank test: fail to reject (p-value is 0.89) runs test: reject (p-value is 0.011) IID hypothesis for Lake Huron z t s thus tenable, but some evidence for small but significant autocorrelation at least at lag h = 1 IV 28

Testing for IID Noise/White Noise: V not clear how to combine results of tests together (multiple comparison problem) need to keep in mind various tests have di ering strengths and weaknesses against particular alternative hypotheses two other tests: informal test based on fitting AR models (will discuss later) cumulative periodogram test (will discuss in Stat/EE 520) IV 29