A Forty (?) Year Assessment of Forecasting The Boat Race


1 A Forty (?) Year Assessment of Forecasting The Boat Race
Geert Mesters & Siem Jan Koopman
Netherlands Institute for the Study of Crime and Law Enforcement (NSCR), VU University Amsterdam & Tinbergen Institute
Andrew Harvey 65th Year Conference, Oxford-Man Institute, Oxford University, UK, June 29-30

2 Andrew Harvey 65th year
Andrew Harvey is also well known for his fun illustrations:
- Seat belts
- Purse snatching in the Hyde Park area of Chicago
- Ice volume
- Rainfall in NE Brazil (Fortaleza)
- Goals between England and Scotland
- Mink and muskrats
- More?
In praise of one illustration: The Boat Race

3 The Boat Race
A bit of history:
- The boat race between teams of the universities of Oxford and Cambridge was first organized in 1829.
- The idea came from two friends who were both named Charles: Cambridge student Merivale and Oxford student Wordsworth.
- On March 12, 1829, Cambridge challenged Oxford and history started.
- The first race was held at Henley-on-Thames: Oxford won easily.
- In 1836, Oxford took the Dark Blue color and Cambridge the Duck Egg Blue color.
- From 1839, the race is an annual fixture; it is relocated to London, from Westminster to Putney.
- The race became more and more popular, crowds increased quickly, and the location had to change again.
- In 1845, the race was held at its current location for the first time: Cambridge won in 23 min 30 sec.

4 The Boat Race
[Figure: the founders, Charles Merivale (C) and Charles Wordsworth (O)]

5 The Boat Race

6 The Boat Race
Some important years:
- Until 1861, the outcomes were about even.
- 1877: the first and only dead heat, but many say that "Honest John" Phelps fell asleep under a bush when the crews reached the finishing line.
- Both boats sank for the first time; Oxford won the next day.
- During World War I, no races were held.
- Longest winning streak by Cambridge.
- During World War II, no races were held.
- An Oxford win, in the midst of a blizzard.
- Another Oxford win, in the 100th Boat Race.
- Sue Brown is the first female to enter the race, as cox for Oxford.
- Hugh and Rob Clay of Oxford are the first twins to win.

7 The Boat Race
[Figure: The 1877 Dead Heat, by Charles Robinson]

8 The Boat Race
More recent developments:
- The Dark Blues dominate in the 1980s.
- 1984: Cambridge writes off its boat before the race starts.
- Oxford Mutiny: a crew protest over team selection policy, but they still won! The Topolski and Robinson book True Blue appeared in 1989, and the movie followed later.
- Cambridge regains its pride and ends the Oxford domination.
- 1998: Cambridge sets the record time of 16 min 19 sec.
- The DK book correctly predicts a win for Cambridge.
- Oxford wins by one foot (the smallest margin since 1877).
- 2010: the last year in our analysis, a Cambridge win.
Cambridge leads the series by 80 against 74 (2011 O, 2012 C).

9 The Boat Race
[Figure: binary time series of Cambridge and Oxford wins; axis labels: Cambridge Win, Oxford Win]

10 Forecasting The Boat Race: Motivation
Why would one want to forecast The Boat Race?
- Bookmakers and individual gamblers may want to increase their expected profits.
- It highlights the importance of previous outcomes (history) of the Boat Race.
- It provides insights to the Cambridge and Oxford teams for their winning strategies.
- It illustrates how econometrics and time series analysis can be useful in forecasting.
- Forecasting binary time series can be relevant in many other fields, including criminology, finance and computer science.

11 Forecasting The Boat Race: Explanatory variables
What information may be relevant?
- Past outcomes (moderate change of teams in subsequent years)
- Toss outcomes (which side of the river; betting odds change severely after the toss)
- Average weight of oarsmen (more muscle against water resistance)
- Average age of oarsmen (experience)
- Weather conditions
- More?
What information do we use? We use past outcomes (time series), toss outcomes and the difference in average weight of oarsmen (between C and O).

12 Forecasting The Boat Race: Explanatory variables
[Figure: our explanatory variables for winning the Boat Race. Weight panel labels: Excessive Cambridge Weight, Excessive Oxford Weight; annotation: 2007, Thorsten Engelmann, 17 stone 6 lbs (110.8 kilos). Toss panel labels: Cambridge Wins Toss, Oxford Wins Toss.]

13 Forecasting The Boat Race: Explanatory variables
Regression output from PcGive ($y_t = 1$ is a win for Cambridge):
[Table: columns Coefficient, Std.Error, t-value, t-prob, Part.R^2; rows Constant, Winner Toss, DiffWgt; summary lines sigma, R^2, Adj.R^2, no. of observations (146), mean(winner), se(winner); numerical entries not recoverable]

14 Binary time series model: Dynamic model specification
Density function for binary observations:
$$p(y_t; \pi) = \pi^{y_t} (1 - \pi)^{1 - y_t}, \qquad t = 1, \ldots, n,$$
where the probability $0 \le \pi \le 1$ is usually subject to transformation by the link function $\theta = \log(\pi / (1 - \pi))$; see Cox and Snell (1989).
If $y_t$ is iid, the likelihood function is easily obtained and the MLE of $\pi$ or $\theta$ is straightforward: the Logit model.
In a time series, we let $\pi$ be time-varying, that is $\pi_t$, and have the conditional density function
$$p(y_t \mid \pi_t) = \pi_t^{y_t} (1 - \pi_t)^{1 - y_t}, \qquad t = 1, \ldots, n,$$
or, in terms of the signal $\theta_t = \log(\pi_t / (1 - \pi_t))$,
$$p(y_t \mid \theta_t) = \exp[\, y_t \theta_t - \log(1 + \exp \theta_t) \,], \qquad t = 1, \ldots, n,$$
which shows that the binary density is part of the exponential family.
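The equivalence between the probability form and the signal form of the binary density can be checked numerically. The following Python sketch does this for a few values; the function names are illustrative and do not come from the slides.

```python
import math

def bernoulli_density(y, pi):
    """p(y; pi) = pi^y * (1 - pi)^(1 - y) for y in {0, 1}."""
    return pi ** y * (1.0 - pi) ** (1 - y)

def logit(pi):
    """Link function theta = log(pi / (1 - pi)), defined for 0 < pi < 1."""
    return math.log(pi / (1.0 - pi))

def density_from_signal(y, theta):
    """Exponential-family form: exp[y * theta - log(1 + exp(theta))]."""
    return math.exp(y * theta - math.log1p(math.exp(theta)))

# Both forms should agree for every probability and outcome.
for pi in (0.1, 0.5, 0.8):
    theta = logit(pi)
    for y in (0, 1):
        gap = abs(bernoulli_density(y, pi) - density_from_signal(y, theta))
        assert gap < 1e-12
```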

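15 Binary time series model: Dynamic model specification
Density function for binary observations with time-varying signal:
$$p(y_t \mid \theta_t) = \exp[\, y_t \theta_t - \log(1 + \exp \theta_t) \,], \qquad t = 1, \ldots, n,$$
with time-varying signal
$$\theta_t = \mu + x_t' \beta + u_t,$$
where we consider different dynamic processes for $u_t$:
- deterministic signal: $u_t = 0$ (LogitJD)
- random walk: $u_t = u_{t-1} + \eta_t$ with $\eta_t \sim NID(0, \sigma^2)$
- AR(p): $u_t = \phi_1 u_{t-1} + \ldots + \phi_p u_{t-p} + \eta_t$
- fractionally integrated: $u_t = (1 - L)^{-d} \tilde{u}_t$ with $\tilde{u}_t \sim$ AR(p)
- cycle:
$$\begin{pmatrix} u_t \\ u_t^+ \end{pmatrix} = \phi \begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix} \begin{pmatrix} u_{t-1} \\ u_{t-1}^+ \end{pmatrix} + \begin{pmatrix} \eta_t \\ \eta_t^+ \end{pmatrix}$$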
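To get a feel for these signal processes, the sketch below simulates the random walk, AR(1) and cycle specifications and draws binary outcomes from the implied probabilities. It is only an illustration: the zero initial state, the parameter values and the function names are assumptions, not settings from the slides, and no regressors are included.

```python
import math
import random

def simulate_random_walk(n, sigma, rng):
    """u_t = u_{t-1} + eta_t, started at u_0 = 0 (an assumption)."""
    u, path = 0.0, []
    for _ in range(n):
        u = u + rng.gauss(0.0, sigma)
        path.append(u)
    return path

def simulate_ar1(n, phi, sigma, rng):
    """u_t = phi * u_{t-1} + eta_t, started at u_0 = 0 (an assumption)."""
    u, path = 0.0, []
    for _ in range(n):
        u = phi * u + rng.gauss(0.0, sigma)
        path.append(u)
    return path

def simulate_cycle(n, phi, lam, sigma, rng):
    """Damped stochastic cycle with frequency lam; returns the first component."""
    u, u_plus, path = 0.0, 0.0, []
    c, s = math.cos(lam), math.sin(lam)
    for _ in range(n):
        eta, eta_plus = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        u, u_plus = (phi * (c * u + s * u_plus) + eta,
                     phi * (-s * u + c * u_plus) + eta_plus)
        path.append(u)
    return path

def to_probability(theta):
    """pi = exp(theta) / (1 + exp(theta))."""
    return 1.0 / (1.0 + math.exp(-theta))

rng = random.Random(1829)  # arbitrary seed for reproducibility
u = simulate_cycle(n=150, phi=0.9, lam=0.33, sigma=0.5, rng=rng)
pi = [to_probability(0.1 + ut) for ut in u]      # mu = 0.1 is illustrative
y = [1 if rng.random() < p else 0 for p in pi]   # 1 = Cambridge win
```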

16 Binary time series model: Parameter estimation
The likelihood function is based on
$$p(y; \psi) = \int_u p(y, u; x, \psi)\, du = \int_u p(y \mid u; x, \psi)\, p(u; \psi)\, du,$$
where the parameter vector $\psi$ includes $\mu$, $\beta$, $\sigma^2$, $d$ and the $\phi$'s, with the density
$$p(y \mid u; x, \psi) = \prod_{t=1}^{n} \exp[\, y_t \theta_t - \log(1 + \exp \theta_t) \,],$$
with $\theta_t = \mu + x_t' \beta + u_t$, and with $p(u; \psi)$ the density for the time series process $u_t$.

17 Binary time series model: Parameter estimation
Likelihood evaluation:
$$\int_u p(y \mid u; x, \psi)\, p(u; \psi)\, du \;\approx\; \hat{p} = g(y)\, \frac{1}{M} \sum_{i=1}^{M} \frac{p(y \mid u^i)}{g(y \mid u^i)}, \qquad u^i \sim g(u \mid y),$$
which is the method of importance sampling:
- simulation smoothing: Carter and Kohn (1994), Frühwirth-Schnatter (1994), de Jong and Shephard (1995), Durbin and Koopman (2002), and more;
- importance sampling for time series: Shephard and Pitt (1997) and Durbin and Koopman (1997), but also...
- efficient importance sampling: Danielsson and Richard (1995), Liesenfeld and Richard (2003), Richard and Zhang (2007);
- numerically accelerated importance sampling: Koopman and Nguyen (2011), Koopman, Lucas and Scharth (2011, 2012);
- importance sampling for long memory processes: Mesters, Koopman and Ooms (2010).
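The idea of evaluating the likelihood integral by importance sampling can be illustrated on a deliberately small problem. The sketch below assumes a single binary observation and a scalar $u \sim N(0, \sigma^2)$, and it uses a hand-picked Gaussian importance density rather than the approximating linear Gaussian model of the slides. It relies on the generic identity $p(y) = E_g[\, p(y \mid u)\, p(u) / g(u) \,]$, and a trapezoidal rule provides a reference value. All names and parameter values are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def obs_density(y, theta):
    """p(y | theta) = exp[y * theta - log(1 + exp(theta))]."""
    return math.exp(y * theta - math.log1p(math.exp(theta)))

def likelihood_is(y, mu, sigma, g_mean, g_sd, M, rng):
    """Importance sampling estimate of p(y) = int p(y | u) p(u) du."""
    total = 0.0
    for _ in range(M):
        u = rng.gauss(g_mean, g_sd)  # draw u^i from the importance density g
        weight = normal_pdf(u, 0.0, sigma) / normal_pdf(u, g_mean, g_sd)
        total += obs_density(y, mu + u) * weight
    return total / M

def likelihood_quadrature(y, mu, sigma, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal rule for the same integral, used as a reference value."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        f = obs_density(y, mu + u) * normal_pdf(u, 0.0, sigma)
        total += f if 0 < k < steps else 0.5 * f
    return total * h

rng = random.Random(0)
estimate = likelihood_is(y=1, mu=0.3, sigma=1.0,
                         g_mean=0.2, g_sd=1.1, M=20000, rng=rng)
reference = likelihood_quadrature(y=1, mu=0.3, sigma=1.0)
```

In the time series setting the integral is n-dimensional, so quadrature is no longer feasible and the quality of the importance density matters far more than in this toy case.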

18 Binary time series model: Parameter estimation
Estimation by importance sampling:
- it is based on an approximating linear Gaussian model $g(y, u; x, \psi)$ that we obtain by an iterative algorithm based on a second-order Taylor expansion (or Laplace approximation);
- evaluation is numerical and requires some computing attention;
- direct maximisation of the likelihood function: common random numbers are used for each evaluation to obtain a smooth likelihood surface;
- NAIS: the role of simulation is reduced;
- all methods can treat missing values;
- methods that rely on the Kalman filter (SsfPack) are fast, and estimation is a routine matter.
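The iterative second-order Taylor step can be sketched in a scalar setting. For the binary density, the first and second derivatives of $\log p(y \mid \theta)$ are $y - \pi$ and $-\pi(1 - \pi)$, so the Gaussian approximation around a trial value $\tilde\theta$ uses the pseudo-variance $H = 1/(\tilde\pi(1 - \tilde\pi))$ and the pseudo-observation $z = \tilde\theta + H(y - \tilde\pi)$. The code below assumes a single observation and a Gaussian prior on $\theta$ as a stand-in for the time series model, where the update step would typically be carried out by the Kalman filter and smoother. The names and values are hypothetical.

```python
import math

def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))

def gaussian_approximation(y, prior_mean, prior_var, tol=1e-10, max_iter=50):
    """Iterate the linearised Gaussian model until the mode settles.

    Returns (theta_hat, z, H): mode, pseudo-observation and pseudo-variance.
    """
    theta = prior_mean
    for _ in range(max_iter):
        pi = sigmoid(theta)
        H = 1.0 / (pi * (1.0 - pi))   # inverse of minus the second derivative
        z = theta + H * (y - pi)      # pseudo-observation
        # Posterior mean of theta in the model z = theta + eps, eps ~ N(0, H)
        new_theta = (prior_mean / prior_var + z / H) / (1.0 / prior_var + 1.0 / H)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    pi = sigmoid(theta)
    H = 1.0 / (pi * (1.0 - pi))
    return theta, theta + H * (y - pi), H

theta_hat, z, H = gaussian_approximation(y=1, prior_mean=0.0, prior_var=1.0)

# At the mode, the derivative of log p(y | theta) + log p(theta) is zero.
score = (1 - sigmoid(theta_hat)) - (theta_hat - 0.0) / 1.0
```

Each pass is a Newton-Raphson step on the log posterior, which is why a handful of iterations is usually enough in this concave problem.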

19 Binary time series model: In-sample results
Selection of estimation results:
[Table: columns φ, σ, λ/d, X-wgt, X-toss, loglik, % OK; rows Constant, RW, AR(1), ARFI(0,d,0) (d), ARFI(1,d,0) (d), Cycle (λ); numerical entries not recoverable]
Marked entries indicate some level of significance; λ = 0.33 implies a cycle period of 2π/λ ≈ 19 years. % OK is the percentage of races for which the winner is estimated correctly.

20 Binary time series model: Signal extraction
[Figure: extracted signals for the constant, AR(1), ARFI(1), RW, FI and cycle models; axis labels: Cambridge Win, Oxford Win]

21 A forty year forecasting assessment: Design of study
We perform an out-of-sample forecasting exercise:
- The first forecast is for 1971, using the binary observations from 1829 to 1970.
- We forecast the probability $\pi_{t+1|t}$, where $t$ refers to 1970.
- When $\pi_{t+1|t} > 0.5$, we predict a Cambridge win, otherwise an Oxford win.
- The second forecast is for 1972, using the binary observations from 1830 to 1971, and so on. Hence we adopt a rolling forecast window.
- We compute the forecasts until 2010: a total of 40 forecasts.
- This procedure is repeated for each model specification and for each ad-hoc method.
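The rolling-window design can be sketched as a small loop. The version below is a simplified reading of the slide: the window is indexed by race rather than by calendar year, the toy data are made up, and the only model shown is the constant-probability one, whose maximum likelihood estimate is the sample proportion of Cambridge wins. The function names are hypothetical.

```python
def rolling_forecasts(y, n_forecasts, predict_prob):
    """One-step-ahead forecasts with a fixed-length rolling window.

    y            : list of 0/1 outcomes (1 = Cambridge win), oldest first
    n_forecasts  : number of final observations to forecast
    predict_prob : function mapping a window of past outcomes to P(next = 1)
    """
    window = len(y) - n_forecasts
    results = []
    for k in range(n_forecasts):
        history = y[k:k + window]            # window slides forward by one race
        prob = predict_prob(history)
        forecast = 1 if prob > 0.5 else 0    # Cambridge win if prob exceeds 0.5
        results.append((prob, forecast, y[k + window]))
    return results

def constant_model(history):
    """Deterministic signal without regressors: MLE of pi is the sample mean."""
    return sum(history) / len(history)

toy = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]         # made-up outcomes
out = rolling_forecasts(toy, n_forecasts=3, predict_prob=constant_model)
hits = sum(forecast == actual for _, forecast, actual in out)
```

Any of the dynamic specifications could be plugged in through `predict_prob`, provided it returns a one-step-ahead probability from the window it is given.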

22 A forty year forecasting assessment: Design of study
Our framework for model-based forecasting is the observation density with time-varying signal
$$p(y_t \mid \theta_t) = \exp[\, y_t \theta_t - \log(1 + \exp \theta_t) \,], \qquad t = 1, \ldots, n,$$
where
$$\theta_t = \mu + x_t' \beta + u_t,$$
for different dynamic processes $u_t$. The probability of a Cambridge win is given by
$$\pi_t = \frac{\exp \theta_t}{1 + \exp \theta_t}.$$
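Mapping the signal to a win probability is a one-line computation, though a little care avoids numerical overflow for extreme signals. The sketch below uses made-up coefficient values, with a toss indicator and a weight difference as regressors; none of the numbers are estimates from the slides.

```python
import math

def win_probability(mu, x, beta, u):
    """pi_t = exp(theta_t) / (1 + exp(theta_t)), theta_t = mu + x_t'beta + u_t."""
    theta = mu + sum(xi * bi for xi, bi in zip(x, beta)) + u
    if theta >= 0:
        return 1.0 / (1.0 + math.exp(-theta))
    e = math.exp(theta)   # this branch avoids overflow for very negative theta
    return e / (1.0 + e)

# Hypothetical inputs: x = (toss indicator, weight difference).
p = win_probability(mu=0.1, x=(1.0, 2.5), beta=(0.3, 0.2), u=-0.4)
prediction = "Cambridge" if p > 0.5 else "Oxford"
```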

23 A forty year forecasting assessment: Design of study
Model-based forecasts:
- Deterministic: $u_t = 0$
- Random Walk: $u_t = u_{t-1} + \eta_t$ with $\eta_t \sim NID(0, \sigma^2)$
- AR(1): $u_t = \phi_1 u_{t-1} + \eta_t$
- ARFIMA(0,d,0): $u_t = (1 - L)^{-d} \tilde{u}_t$ with $\tilde{u}_t = \eta_t$
- ARFIMA(1,d,0): $u_t = (1 - L)^{-d} \tilde{u}_t$ with $\tilde{u}_t \sim$ AR(1)
- Cycle
Ad-hoc forecasts:
- Last Year Winner
- Last Year Loser
- Always Cambridge Win
- Always Oxford Win
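The four ad-hoc rules need no estimation at all, which makes them convenient benchmarks. A minimal sketch, with made-up outcomes and hypothetical function names:

```python
CAMBRIDGE, OXFORD = 1, 0

def last_year_winner(history):
    return history[-1]

def last_year_loser(history):
    return 1 - history[-1]

def always_cambridge(history):
    return CAMBRIDGE

def always_oxford(history):
    return OXFORD

def count_correct(y, rule, first_target):
    """Correct forecasts for y[first_target:], each using only earlier outcomes."""
    return sum(rule(y[:t]) == y[t] for t in range(first_target, len(y)))

toy = [1, 1, 0, 0, 1, 1, 1, 0]   # made-up outcomes, 1 = Cambridge win
rules = (last_year_winner, last_year_loser, always_cambridge, always_oxford)
scores = {rule.__name__: count_correct(toy, rule, first_target=1)
          for rule in rules}
```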

24 A forty year forecasting assessment: Forecasting results
[Figure: outcomes and forecasts; panel labels: Outcome, RW, ARFI0, ARFI1, Constant, AR1, CYCLE]

25 A forty year forecasting assessment: Forecasting results
[Table: columns Forecasts, Correct, % Correct; rows Constant, RW, AR(1), ARFI(0,d,0), ARFI(1,d,0), Cycle, Last Year Winner, Last Year Loser, Always Cambridge Win, Always Oxford Win; numerical entries not recoverable]
- Overall, model-based forecasts outperform ad-hoc forecasts.
- Cycle forecasts are best!
- Is this it?

26 A forty year forecasting assessment: Forecasting comparisons
How significant are the differences in forecast accuracy?
Our loss function value is $L_t$, which can take the values:
- 1: if the forecast of the Boat Race is WRONG
- 0: if the forecast of the Boat Race is CORRECT
For each forecast method, we can construct (yet another) binary time series $L_t$, say $L_t^{(i)}$ for method $i$. The sums $L_1^{(i)} + \ldots + L_{40}^{(i)}$ are reported in the previous table.
The relative performance of each method is then measured as
$$d_t^{(ij)} = L_t^{(i)} - L_t^{(j)}, \qquad i \ne j,$$
which can take the values:
- 1: model i wrong, model j correct: GOOD for model j
- 0: both models are wrong or both are correct: no distinction
- -1: model i correct, model j wrong: GOOD for model i
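The loss series and the loss differential follow directly from these definitions. The short sketch below uses made-up forecasts for two hypothetical methods i and j.

```python
def loss_series(forecasts, outcomes):
    """L_t = 1 if the forecast is wrong, 0 if it is correct."""
    return [int(f != y) for f, y in zip(forecasts, outcomes)]

def loss_differential(loss_i, loss_j):
    """d_t = L_t(i) - L_t(j): +1 favours j, -1 favours i, 0 is a tie."""
    return [a - b for a, b in zip(loss_i, loss_j)]

outcomes    = [1, 0, 1, 1, 0, 1]   # made-up results, 1 = Cambridge win
forecasts_i = [1, 0, 0, 1, 0, 1]
forecasts_j = [1, 1, 0, 1, 1, 0]

L_i = loss_series(forecasts_i, outcomes)
L_j = loss_series(forecasts_j, outcomes)
d_ij = loss_differential(L_i, L_j)
```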

27 A forty year forecasting assessment: Equal Predictive Ability
How significant are the differences in forecast accuracy?
We follow Diebold & Mariano (1995) with their sign test.
To carry out the test, only consider the $m_{ij}$ non-zero values and compute
$$S_{ij} = \sum_t 1(d_t^{(ij)} = 1), \qquad S_{ij} \sim \text{Binomial}(m_{ij}, 0.5).$$
A small value for $S_{ij}$ says model i is doing better than model j.
Exact test: the cumulative binomial distribution function assesses whether $S_{ij}$ is small enough.
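The exact sign test needs only the binomial distribution function. The sketch below computes a one-sided lower-tail p-value, $P(X \le S_{ij})$ with $X \sim \text{Binomial}(m_{ij}, 0.5)$, which is one reading of "small enough" on the slide; the input differentials are made up.

```python
from math import comb

def sign_test(d):
    """Return (S, m, p_value) for loss differentials d with values in {-1, 0, 1}.

    S is the number of +1 values among the m non-zero entries, and p_value is
    P(X <= S) for X ~ Binomial(m, 0.5).
    """
    nonzero = [v for v in d if v != 0]
    m = len(nonzero)
    S = sum(1 for v in nonzero if v == 1)
    if m == 0:
        return S, m, 1.0   # no informative observations
    p_value = sum(comb(m, k) for k in range(S + 1)) / 2 ** m
    return S, m, p_value

d_ij = [0, -1, 0, 0, -1, -1, 1, -1, -1, 0]   # made-up differentials
S, m, p = sign_test(d_ij)
```

With six informative forecasts, of which only one favours model j, the p-value is 7/64, so even a lopsided record is weak evidence when few forecasts differ.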

28 A forty year forecasting assessment: Equal Predictive Ability
[Table: benchmark (rows) against alternative (columns); rows Constant, Winner, Loser, Cambridge, Oxford, RW, AR, ARFI, Cycle; columns Co, Wi, Lo, Ca, Ox, RW, AR, ARFI, Cy; numerical entries not recoverable]
EPA test: the p-values are reported. In bold: the model in the column outperforms the model in the row.
Enough evidence, let's stop here?

29 A forty year forecasting assessment: Superior Predictive Ability
How significant are the differences... over ALL models?
- Separate EPA tests may be less powerful against a single model: see the previous table.
- The matrix of EPA p-values may be difficult to interpret: conflicting evidence.
We therefore also consider the Superior Predictive Ability (SPA) test of White (2000) and Hansen (2005). We focus on the quantity
$$D_{ij} = E(d_t^{(ij)}),$$
and model i is said to be superior if and only if
$$D_{ij} \le 0, \qquad \forall j,\ j \ne i.$$
Applied in Hansen & Lunde (2005), Hsu & Kuan (2005) and Jungbacker, Koopman & Hol (2007).
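The quantity $D_{ij}$ has an obvious sample analogue: the average loss differential. The sketch below computes these averages for a few made-up loss series and lists the models whose averages are non-positive against every competitor. It only produces point estimates and is not the bootstrap-based SPA test of White (2000) and Hansen (2005); the names and data are hypothetical.

```python
def mean_loss_differentials(losses):
    """losses: dict name -> list of 0/1 losses. Returns D_bar[i][j] = mean(L_i - L_j)."""
    names = list(losses)
    n = len(next(iter(losses.values())))
    return {i: {j: sum(a - b for a, b in zip(losses[i], losses[j])) / n
                for j in names if j != i}
            for i in names}

def not_outperformed(D_bar):
    """Names i with D_bar[i][j] <= 0 for every j != i (sample analogue only)."""
    return [i for i, row in D_bar.items() if all(v <= 0 for v in row.values())]

losses = {                       # made-up loss series, 1 = wrong forecast
    "cycle":    [0, 0, 1, 0, 0, 0],
    "constant": [0, 1, 1, 0, 1, 0],
    "winner":   [1, 1, 0, 0, 1, 1],
}
D_bar = mean_loss_differentials(losses)
best = not_outperformed(D_bar)
```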

30 A forty year forecasting assessment: Superior Predictive Ability
SPA test values with confidence intervals (bootstrapped):

Benchmark    Confidence interval
Constant     [0.114, 0.191]
Winner       [0.077, 0.135]
Loser        [0.004, 0.004]
Cambridge    [0.003, 0.003]
Oxford       [0.133, 0.169]
RW           [0.055, 0.078]
AR           [0.046, 0.267]
ARFI         [0.020, 0.057]
Cycle        [0.864, 1.000]

SPA test statistics are reported. In bold: the model is significantly outperformed by the others. The strongest evidence that a model is not outperformed is for Cycle.

31 A forty year (?) forecasting assessment: Sample Split
Why forty years? We don't know. The original plan was 50, but in the end we did 40. Let's have a go...
Inoue & Rossi (Biometrika, 2011) and Hansen & Timmermann (wp, 2012) see dangers in an ad-hoc choice of forecast window size:
- not being able to detect significant predictive ability (even when it is available for other window sizes);
- significant results by chance: data snooping over the window size (leads to size distortions).

32 A forty year forecasting assessment
[Figure: percentage of correct predictions per sample split; series: Constant, Loser, Oxford, AR(1), Cycle, Winner, Cambridge, RW, ARFI(0,d)]

33 A forty year forecasting assessment
[Figure: Cycle model outperforms other model per sample split, EPA; panels: Constant, Winner, Loser, Cambridge, Oxford, RW, AR, ARFI]

34 A forty year forecasting assessment
[Figure: Cycle model outperforms all other models per sample split, SPA]

35 Forecasting the Boat Race: Review
What have we learned?
- Our aim is to seriously assess the role of statistical models for the forecasting of binary events.
- It is a challenging exercise and we need to study harder.
- However... it appears that a statistical model can outperform ad-hoc methods.
- We should also take a look at the returns on betting.
- But it is also fun!

36 Andrew Harvey 65th year
- I have learned a lot from you!
- Also, it has been a lot of fun.
- After this meeting, I am sure it will be business as usual.
- But if you decide differently...

37 Andrew Harvey 65th year
Please do not end up like this...

38 Andrew Harvey 65th year: Unobserved Components
- Let us hope that UC models remain at the top of time series research.
- But if UC developments slow down in the coming years...
- Or it will be all gone...

39 Andrew Harvey 65th year
... cheer up, we all go together...


More information

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53 State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State

More information

Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data

Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data MPRA Munich Personal RePEc Archive Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data Silvia Cagnone and Francesco Bartolucci Department of Statistical

More information

Lecture 4: Dynamic models

Lecture 4: Dynamic models linear s Lecture 4: s Hedibert Freitas Lopes The University of Chicago Booth School of Business 5807 South Woodlawn Avenue, Chicago, IL 60637 http://faculty.chicagobooth.edu/hedibert.lopes hlopes@chicagobooth.edu

More information

Forecasting. BUS 735: Business Decision Making and Research. exercises. Assess what we have learned

Forecasting. BUS 735: Business Decision Making and Research. exercises. Assess what we have learned Forecasting BUS 735: Business Decision Making and Research 1 1.1 Goals and Agenda Goals and Agenda Learning Objective Learn how to identify regularities in time series data Learn popular univariate time

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

MATH/STAT 3360, Probability

MATH/STAT 3360, Probability MATH/STAT 3360, Probability Sample Final Examination This Sample examination has more questions than the actual final, in order to cover a wider range of questions. Estimated times are provided after each

More information

Stochastic Analogues to Deterministic Optimizers

Stochastic Analogues to Deterministic Optimizers Stochastic Analogues to Deterministic Optimizers ISMP 2018 Bordeaux, France Vivak Patel Presented by: Mihai Anitescu July 6, 2018 1 Apology I apologize for not being here to give this talk myself. I injured

More information

Dynamic Generalized Linear Models

Dynamic Generalized Linear Models Dynamic Generalized Linear Models Jesse Windle Oct. 24, 2012 Contents 1 Introduction 1 2 Binary Data (Static Case) 2 3 Data Augmentation (de-marginalization) by 4 examples 3 3.1 Example 1: CDF method.............................

More information

Quantiles, Expectiles and Splines

Quantiles, Expectiles and Splines Quantiles, Expectiles and Splines Andrew Harvey University of Cambridge December 2007 Harvey (University of Cambridge) QES December 2007 1 / 40 Introduction The movements in a time series may be described

More information

Discussion Score-driven models for forecasting by Siem Jan Koopman. Domenico Giannone LUISS University of Rome, ECARES, EIEF and CEPR

Discussion Score-driven models for forecasting by Siem Jan Koopman. Domenico Giannone LUISS University of Rome, ECARES, EIEF and CEPR Discussion Score-driven models for forecasting by Siem Jan Koopman Domenico Giannone LUISS University of Rome, ECARES, EIEF and CEPR 8th ECB Forecasting Workshop European Central Bank, June 2014 1 / 10

More information

Testing for Regime Switching: A Comment

Testing for Regime Switching: A Comment Testing for Regime Switching: A Comment Andrew V. Carter Department of Statistics University of California, Santa Barbara Douglas G. Steigerwald Department of Economics University of California Santa Barbara

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

Stat 135 Fall 2013 FINAL EXAM December 18, 2013

Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Name: Person on right SID: Person on left There will be one, double sided, handwritten, 8.5in x 11in page of notes allowed during the exam. The exam is closed

More information

Problem Set 2: Box-Jenkins methodology

Problem Set 2: Box-Jenkins methodology Problem Set : Box-Jenkins methodology 1) For an AR1) process we have: γ0) = σ ε 1 φ σ ε γ0) = 1 φ Hence, For a MA1) process, p lim R = φ γ0) = 1 + θ )σ ε σ ε 1 = γ0) 1 + θ Therefore, p lim R = 1 1 1 +

More information

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49 State-space Model Eduardo Rossi University of Pavia November 2013 Rossi State-space Model Financial Econometrics - 2013 1 / 49 Outline 1 Introduction 2 The Kalman filter 3 Forecast errors 4 State smoothing

More information

Logistic Regression. Seungjin Choi

Logistic Regression. Seungjin Choi Logistic Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost

QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost ANSWER QUESTION ONE Let 7C = Total Cost MC = Marginal Cost AC = Average Cost Q = Number of units AC = 7C MC = Q d7c d7c 7C Q Derivation of average cost with respect to quantity is different from marginal

More information

ConcepTest PowerPoints

ConcepTest PowerPoints ConcepTest PowerPoints Chapter 4 Physics: Principles with Applications, 6 th edition Giancoli 2005 Pearson Prentice Hall This work is protected by United States copyright laws and is provided solely for

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

HWA CHONG INSTITUTION 2018 JC2 PRELIMINARY EXAMINATION. Monday 17 September hours

HWA CHONG INSTITUTION 2018 JC2 PRELIMINARY EXAMINATION. Monday 17 September hours HWA CHONG INSTITUTION 08 JC PRELIMINARY EXAMINATION MATHEMATICS Higher 9758/0 Paper Monday 7 September 08 hours Additional materials: Answer paper List of Formula (MF6) Cover Page READ THESE INSTRUCTIONS

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Gaussian Process Regression Model in Spatial Logistic Regression

Gaussian Process Regression Model in Spatial Logistic Regression Journal of Physics: Conference Series PAPER OPEN ACCESS Gaussian Process Regression Model in Spatial Logistic Regression To cite this article: A Sofro and A Oktaviarina 018 J. Phys.: Conf. Ser. 947 01005

More information

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong Modeling, Estimation and Control, for Telecommunication Networks Notes for the MGR-815 course 12 June 2010 School of Superior Technology Professor Zbigniew Dziong 1 Table of Contents Preface 5 1. Example

More information

Forecasting the term structure interest rate of government bond yields

Forecasting the term structure interest rate of government bond yields Forecasting the term structure interest rate of government bond yields Bachelor Thesis Econometrics & Operational Research Joost van Esch (419617) Erasmus School of Economics, Erasmus University Rotterdam

More information

Forecasting 1 to h steps ahead using partial least squares

Forecasting 1 to h steps ahead using partial least squares Forecasting 1 to h steps ahead using partial least squares Philip Hans Franses Econometric Institute, Erasmus University Rotterdam November 10, 2006 Econometric Institute Report 2006-47 I thank Dick van

More information

The Normal Table: Read it, Use it

The Normal Table: Read it, Use it The Normal Table: Read it, Use it ECO22Y1Y: 218/19; Written by Jennifer Murdock A massive flood in Grand Forks, North Dakota in 1997 cost billions to clean up. The levee could protect the town even if

More information

How Fukushima-Daiichi core meltdown changed the probability of nuclear acci

How Fukushima-Daiichi core meltdown changed the probability of nuclear acci How Fukushima-Daiichi core meltdown changed the probability of nuclear accidents? CERNA Mines ParisTech Paris, March 22, 2013 Motivation Motivations 1 Two weeks after Fukushima Daiichi an article in a

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Ph.D. Qualifying Exam Monday Tuesday, January 4 5, 2016

Ph.D. Qualifying Exam Monday Tuesday, January 4 5, 2016 Ph.D. Qualifying Exam Monday Tuesday, January 4 5, 2016 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Find the maximum likelihood estimate of θ where θ is a parameter

More information

Statistical Estimation

Statistical Estimation Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from

More information

This paper is not to be removed from the Examination Halls

This paper is not to be removed from the Examination Halls ~~ST104B ZA d0 This paper is not to be removed from the Examination Halls UNIVERSITY OF LONDON ST104B ZB BSc degrees and Diplomas for Graduates in Economics, Management, Finance and the Social Sciences,

More information

Forecasting economic time series using unobserved components time series models

Forecasting economic time series using unobserved components time series models Forecasting economic time series using unobserved components time series models Siem Jan Koopman and Marius Ooms VU University Amsterdam, Department of Econometrics FEWEB, De Boelelaan 1105, 1081 HV Amsterdam

More information

Forecasting Based on Common Trends in Mixed Frequency Samples

Forecasting Based on Common Trends in Mixed Frequency Samples Forecasting Based on Common Trends in Mixed Frequency Samples Peter Fuleky and Carl S. Bonham June 2, 2011 Abstract We extend the existing literature on small mixed frequency single factor models by allowing

More information

Time Series I Time Domain Methods

Time Series I Time Domain Methods Astrostatistics Summer School Penn State University University Park, PA 16802 May 21, 2007 Overview Filtering and the Likelihood Function Time series is the study of data consisting of a sequence of DEPENDENT

More information

Decision Trees. CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore

Decision Trees. CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Decision Trees Claude Monet, The Mulberry Tree Slides from Pedro Domingos, CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Michael Guerzhoy

More information

Logit Regression and Quantities of Interest

Logit Regression and Quantities of Interest Logit Regression and Quantities of Interest Stephen Pettigrew March 4, 2015 Stephen Pettigrew Logit Regression and Quantities of Interest March 4, 2015 1 / 57 Outline 1 Logistics 2 Generalized Linear Models

More information

A Gaussian IV estimator of cointegrating relations Gunnar Bårdsen and Niels Haldrup. 13 February 2006.

A Gaussian IV estimator of cointegrating relations Gunnar Bårdsen and Niels Haldrup. 13 February 2006. A Gaussian IV estimator of cointegrating relations Gunnar Bårdsen and Niels Haldrup 3 February 26. Abstract. In static single equation cointegration regression models the OLS estimator will have a non-standard

More information

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University FEATURE EXPANSIONS FEATURE EXPANSIONS

More information

1 Basic continuous random variable problems

1 Basic continuous random variable problems Name M362K Final Here are problems concerning material from Chapters 5 and 6. To review the other chapters, look over previous practice sheets for the two exams, previous quizzes, previous homeworks and

More information

Probabilistic modeling. The slides are closely adapted from Subhransu Maji s slides

Probabilistic modeling. The slides are closely adapted from Subhransu Maji s slides Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework

More information

The Stochastic Volatility in Mean Model with Time- Varying Parameters: An Application to Inflation Modeling

The Stochastic Volatility in Mean Model with Time- Varying Parameters: An Application to Inflation Modeling Crawford School of Public Policy CAMA Centre for Applied Macroeconomic Analysis The Stochastic Volatility in Mean Model with Time- Varying Parameters: An Application to Inflation Modeling CAMA Working

More information

Forecasting Based on Common Trends in Mixed Frequency Samples

Forecasting Based on Common Trends in Mixed Frequency Samples Forecasting Based on Common Trends in Mixed Frequency Samples PETER FULEKY and CARL S. BONHAM June 13, 2011 Abstract We extend the existing literature on small mixed frequency single factor models by allowing

More information

Fundamental Probability and Statistics

Fundamental Probability and Statistics Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

More information