A Forty (?) Year Assessment of Forecasting The Boat Race

1 A Forty (?) Year Assessment of Forecasting The Boat Race
Geert Mesters & Siem Jan Koopman
Netherlands Institute for the Study of Crime and Law Enforcement (NSCR), VU University Amsterdam & Tinbergen Institute
Andrew Harvey 65th Year Conference, Oxford-Man Institute, Oxford University, UK, June 29-30

2 Andrew Harvey 65th year
Andrew Harvey is also well known for his fun illustrations: Seat Belts; Purse Snatching in the Hyde Park area of Chicago; Ice volume; Rainfall in NE Brazil (Fortaleza); Goals between England and Scotland; Mink and Muskrats; more? To praise one such illustration: The Boat Race.

3 The Boat Race
A bit of history: The boat race between teams of the universities of Oxford and Cambridge was first organized in 1829. The idea came from two friends who were both named Charles: Cambridge student Merivale and Oxford student Wordsworth. On March 12, 1829, Cambridge challenged Oxford and history started. The first race was held at Henley-on-Thames: Oxford won easily. In 1836, Oxford took the Dark Blue color and Cambridge the Duck Egg Blue color. From 1839, the race has been an annual fixture; it was relocated to London, from Westminster to Putney. The race became more and more popular, crowds increased quickly, and the location had to change again. In 1845, the race was held at the current location for the first time: Cambridge won in 23 min 30 sec.

4 The Boat Race
Figure: The founders, Charles Merivale (Cambridge) and Charles Wordsworth (Oxford).

6 The Boat Race
Some important years: Until 1861, the outcomes were about even. 1877: the first and only Dead Heat, but... many say that Honest John Phelps fell asleep under a bush when the crews reached the finishing line. Further milestones: both boats sank for the first time, and Oxford won the next day; no races were held during World War I; the longest winning streak by Cambridge; no races were held during World War II; an Oxford win in the midst of a blizzard; another Oxford win, in the 100th Boat Race; Sue Brown became the first female to enter the race, as cox for Oxford; Hugh and Rob Clay of Oxford became the first twins to win.

7 The Boat Race
Figure: The 1877 Dead Heat, by Charles Robinson.

8 The Boat Race
More recent developments: the Dark Blues dominate in the 1980s; 1984: Cambridge writes off their boat before the race starts; the Oxford Mutiny: a crew protest over team selection policy but... they still won! The book True Blue by Topolski and Robinson appeared in 1989 and the movie followed later; Cambridge regains its pride and ends the Oxford domination; 1998: Cambridge sets the record time of 16 min 19 sec; a DK book correctly predicts a win for Cambridge; Oxford wins by one foot (the smallest margin since 1877); 2010: the last year in our analysis, a Cambridge win. Cambridge leads the series by 80 against 74 (2011 O, 2012 C).

9 The Boat Race
Figure: Binary time series of Cambridge and Oxford wins.

10 Forecasting The Boat Race: Motivation
Why would one want to forecast The Boat Race? Bookmakers and individual gamblers may want to increase their expected profits; it highlights the importance of previous outcomes (history) of the Boat Race; it can provide insights to the Cambridge and Oxford teams for their winning strategies; it illustrates how Econometrics and Time Series Analysis can be useful in forecasting; forecasting binary time series can be relevant in many other fields, including criminology, finance and computer science.

11 Forecasting The Boat Race: Explanatory variables
What information may be relevant? Past outcomes (teams change only moderately from one year to the next); Toss outcomes (which side of the river; betting odds change severely after the toss); Average weight of oarsmen (more muscle against water resistance); Average age of oarsmen (experience); Weather conditions; More? What information do we use? We use Past outcomes (time series), Toss outcomes and the difference in Average weight of oarsmen (between Cambridge and Oxford).

12 Forecasting The Boat Race: Explanatory variables
Figure: Our explanatory variables for winning the Boat Race. Panels: weight difference (Excessive Cambridge Weight versus Excessive Oxford Weight) and toss outcome (Cambridge Wins Toss versus Oxford Wins Toss). Annotation: 2007, Thorsten Engelmann, 17 stone 6 lbs (110.8 kilos).

13 Forecasting The Boat Race: Explanatory variables
Regression output from PcGive ($y_t = 1$ is a win for Cambridge). Table columns: Coefficient, Std.Error, t-value, t-prob, Part.R^2. Table rows: Constant, Winner Toss, DiffWgt. Summary statistics: sigma, R^2, Adj.R^2, no. of observations (146), mean(winner), se(winner).

14 Binary time series model: Dynamic model specification
Density function for binary observations:
$$p(y_t; \pi) = \pi^{y_t} (1-\pi)^{1-y_t}, \qquad t = 1,\ldots,n,$$
where the probability $0 \le \pi \le 1$ is usually subject to transformation by the link function $\theta = \log(\pi/(1-\pi))$; see Cox and Snell (1989). If $y_t$ is iid, the likelihood function is easily obtained and the MLE of $\pi$ or $\theta$ is straightforward: the Logit model. In a time series, we let $\pi$ be time-varying, that is $\pi_t$, and have the conditional density function
$$p(y_t \mid \pi_t) = \pi_t^{y_t} (1-\pi_t)^{1-y_t}, \qquad t = 1,\ldots,n,$$
or, in terms of the signal $\theta_t = \log(\pi_t/(1-\pi_t))$,
$$p(y_t \mid \theta_t) = \exp[\,y_t \theta_t - \log(1+\exp\theta_t)\,], \qquad t = 1,\ldots,n,$$
which shows that the binary density is part of the exponential family.
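The link function and the exponential-family form of the density can be checked numerically. The following is a minimal sketch, not the authors' code; the function names `logit`, `inv_logit` and `log_density` are hypothetical and chosen only for illustration.

```python
import math

def logit(pi):
    """Link function theta = log(pi / (1 - pi)), defined for 0 < pi < 1."""
    return math.log(pi / (1.0 - pi))

def inv_logit(theta):
    """Inverse link pi = exp(theta) / (1 + exp(theta)), written to avoid overflow."""
    if theta >= 0:
        return 1.0 / (1.0 + math.exp(-theta))
    e = math.exp(theta)
    return e / (1.0 + e)

def log_density(y, theta):
    """log p(y | theta) = y * theta - log(1 + exp(theta)) for y in {0, 1}."""
    # log(1 + exp(theta)) computed as max(theta, 0) + log1p(exp(-|theta|)) for stability
    log1pexp = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return y * theta - log1pexp
```

For example, with `theta = logit(0.8)`, `log_density(1, theta)` agrees with `math.log(0.8)` and `log_density(0, theta)` with `math.log(0.2)` up to rounding, which is the Bernoulli density written in terms of the signal.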

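15 Binary time series model: Dynamic model specification
Density function for binary observations with time-varying signal:
$$p(y_t \mid \theta_t) = \exp[\,y_t \theta_t - \log(1+\exp\theta_t)\,], \qquad t = 1,\ldots,n,$$
with time-varying signal $\theta_t = \mu + x_t \beta + u_t$, where we consider different dynamic processes for $u_t$:
deterministic signal: $u_t = 0$ (LogitJD);
random walk: $u_t = u_{t-1} + \eta_t$ with $\eta_t \sim NID(0,\sigma^2)$;
AR(p): $u_t = \phi_1 u_{t-1} + \cdots + \phi_p u_{t-p} + \eta_t$;
fractionally integrated: $u^*_t = (1-L)^d u_t$ with $u^*_t \sim$ AR(p);
cycle:
$$\begin{pmatrix} u_t \\ u^+_t \end{pmatrix} = \phi \begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix} \begin{pmatrix} u_{t-1} \\ u^+_{t-1} \end{pmatrix} + \begin{pmatrix} \eta_t \\ \eta^+_t \end{pmatrix}.$$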
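To get a feel for these signal processes, one can simulate them and draw binary outcomes from the implied probabilities. This is a sketch under simplifying assumptions: the regressors $x_t \beta$ are left out, both cycle disturbances share one standard deviation, and all function names and parameter values are hypothetical.

```python
import math
import random

def simulate_ar1(n, phi, sigma, rng, u0=0.0):
    """u_t = phi * u_{t-1} + eta_t with eta_t ~ N(0, sigma^2); phi = 1 gives the random walk."""
    u, path = u0, []
    for _ in range(n):
        u = phi * u + rng.gauss(0.0, sigma)
        path.append(u)
    return path

def simulate_cycle(n, phi, lam, sigma, rng, state=(0.0, 0.0)):
    """Stochastic cycle: damped rotation of (u_t, u_t^+) by angle lam plus noise."""
    u, u_plus = state
    c, s = math.cos(lam), math.sin(lam)
    path = []
    for _ in range(n):
        u, u_plus = (phi * (c * u + s * u_plus) + rng.gauss(0.0, sigma),
                     phi * (-s * u + c * u_plus) + rng.gauss(0.0, sigma))
        path.append(u)
    return path

def draw_outcomes(mu, signal, rng):
    """Draw y_t = 1 with probability pi_t = 1 / (1 + exp(-(mu + u_t)))."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-(mu + u))) else 0
            for u in signal]

rng = random.Random(0)
random_walk = simulate_ar1(150, 1.0, 0.3, rng)
wins = draw_outcomes(0.0, random_walk, rng)
```

With `phi = 1`, `sigma = 0` and initial state `(1, 0)`, the cycle reduces to $u_t = \cos(\lambda t)$, so it returns to its starting value after $2\pi/\lambda$ periods.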

16 Binary time series model: Parameter estimation
The likelihood function is based on
$$p(y;\psi) = \int_u p(y,u;x,\psi)\,du = \int_u p(y \mid u;x,\psi)\,p(u;\psi)\,du,$$
where the parameter vector $\psi$ includes $\mu$, $\beta$, $\sigma^2$, $d$ and the $\phi$'s, the density $p(y \mid u;x,\psi)$ is
$$p(y \mid u;x,\psi) = \prod_{t=1}^{n} \exp[\,y_t \theta_t - \log(1+\exp\theta_t)\,],$$
with $\theta_t = \mu + x_t \beta + u_t$, and $p(u;\psi)$ is the density of the time series process $u_t$.

17 Binary time series model: Parameter estimation
Likelihood evaluation:
$$\int_u p(y \mid u;x,\psi)\,p(u;\psi)\,du \approx \hat{p} = \frac{g(y)}{M} \sum_{i=1}^{M} p(y \mid u^{i}) / g(y \mid u^{i}),$$
where $u^{i} \sim g(u \mid y)$, is the method of importance sampling:
simulation smoothing: Carter and Kohn (1994), Frühwirth-Schnatter (1994), de Jong and Shephard (1995), Durbin and Koopman (2002), and more;
importance sampling for time series: Shephard and Pitt (1997) and Durbin and Koopman (1997), but also...
efficient importance sampling: Danielsson and Richard (1995), Liesenfeld and Richard (2003), Richard and Zhang (2007);
numerically accelerated importance sampling: Koopman and Nguyen (2011), Koopman, Lucas and Scharth (2011, 2012);
importance sampling for long memory processes: Mesters, Koopman and Ooms (2010).
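The idea of importance sampling can be illustrated on a deliberately tiny problem: a single binary observation with one Gaussian state $u \sim N(0,\sigma^2)$. The sketch below uses a generic importance sampler with a hand-picked Gaussian proposal $N(m, s^2)$, not the approximating linear Gaussian model and simulation smoother of the references above; the function names and all numerical settings are assumptions for illustration only.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bernoulli_density(y, theta):
    """p(y | theta) with pi = 1 / (1 + exp(-theta))."""
    pi = 1.0 / (1.0 + math.exp(-theta))
    return pi if y == 1 else 1.0 - pi

def is_likelihood(y, mu, sigma, m, s, M, rng):
    """Estimate p(y) = integral of p(y | mu + u) p(u) du by averaging
    p(y | mu + u_i) * p(u_i) / g(u_i) over draws u_i from the proposal g = N(m, s^2)."""
    total = 0.0
    for _ in range(M):
        u = rng.gauss(m, s)
        weight = normal_pdf(u, 0.0, sigma) / normal_pdf(u, m, s)
        total += bernoulli_density(y, mu + u) * weight
    return total / M

estimate = is_likelihood(1, 0.0, 1.0, 0.5, 1.2, 20000, random.Random(1))
```

With $\mu = 0$ the true value is 0.5 by symmetry, so `estimate` should land close to that. Reusing the same seed for every likelihood evaluation is the "common random numbers" device that keeps the simulated likelihood surface smooth in the parameters.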

18 Binary time series model: Parameter estimation
Estimation by importance sampling: it is based on an approximating linear Gaussian model $g(y,u;x,\psi)$ that we obtain by an iterative algorithm based on a 2nd order Taylor expansion (or Laplace approximation); evaluation is numerical and requires some computing attention; direct maximisation of the likelihood function: common random numbers are used for each evaluation to obtain a smooth likelihood surface; NAIS: the role of simulation becomes smaller; all methods can treat missing values; methods that rely on the Kalman filter (ssfpack) are fast and estimation is a routine matter.

19 Binary time series model: In-sample results
Selection of estimation results. Table columns: $\phi$, $\sigma$, $\lambda/d$, X-wgt, X-toss, loglik, % OK. Table rows: Constant, RW, AR(1), ARFI(0,d,0) (d), ARFI(1,d,0) (d), Cycle ($\lambda$).
Marked entries indicate some level of significance; $\lambda = 0.33$ implies a cycle period of $2\pi/\lambda$, about 18 years. % OK is the percentage of races for which the winner is estimated correctly.

20 Binary time series model: Signal extraction
Figure: Extracted signals for the constant, RW, AR(1), FI, ARFI(1) and cycle specifications, on a scale running from Oxford Win to Cambridge Win.

21 A forty year forecasting assessment: Design of study
We perform an out-of-sample forecasting exercise. The first forecast is for 1971, using the binary observations from 1829 to 1970. We forecast the probability $\pi_{t+1|t}$, where $t$ refers to 1970. When $\pi_{t+1|t} > 0.5$, we predict a Cambridge win, otherwise an Oxford win. The second forecast is for 1972, using the binary observations from 1830 to 1971, etc. Hence we adopt a rolling forecast window. We compute the forecasts until 2010: a total of 40 forecasts. This procedure is repeated for each model specification and for each ad-hoc method.
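The rolling-window design can be sketched as a short loop. The code below is an illustration only: the data layout (a dictionary from year to outcome, with no entry for years without a race), the function names and the `sample_frequency` stand-in forecaster are all assumptions, and the usage example runs on synthetic outcomes rather than the actual race results.

```python
def rolling_forecasts(outcomes, first_year, last_year, span, forecaster):
    """One-step-ahead forecasts with a rolling window of `span` calendar years.

    outcomes   : dict mapping year -> 1 (Cambridge win) or 0 (Oxford win)
    forecaster : function taking the list of outcomes in the window and
                 returning a forecast probability of a Cambridge win
    Returns a dict mapping target year -> (probability, predicted outcome).
    """
    results = {}
    for target in range(first_year, last_year + 1):
        # e.g. target 1971 with span 142 uses the years 1829, ..., 1970
        window = [outcomes[y] for y in range(target - span, target) if y in outcomes]
        prob = forecaster(window)
        results[target] = (prob, 1 if prob > 0.5 else 0)
    return results

def sample_frequency(window):
    """Stand-in forecaster: fraction of Cambridge wins in the window."""
    return sum(window) / len(window)

# Synthetic outcomes, for illustration only
toy = {year: (1 if year % 3 else 0) for year in range(2000, 2020)}
toy_forecasts = rolling_forecasts(toy, 2010, 2019, 10, sample_frequency)
```

Any of the model-based methods could be plugged in as `forecaster`, as long as it maps the window of past outcomes to a probability for the next race.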

22 A forty year forecasting assessment: Design of study
Our framework for model-based forecasting is the observation density with time-varying signal
$$p(y_t \mid \theta_t) = \exp[\,y_t \theta_t - \log(1+\exp\theta_t)\,], \qquad t = 1,\ldots,n,$$
where $\theta_t = \mu + x_t \beta + u_t$, for different dynamic processes $u_t$. The probability of a Cambridge win is given by
$$\pi_t = \frac{\exp\theta_t}{1+\exp\theta_t}.$$

23 A forty year forecasting assessment: Design of study
Model-based forecasts:
Deterministic: $u_t = 0$;
Random Walk: $u_t = u_{t-1} + \eta_t$ with $\eta_t \sim NID(0,\sigma^2)$;
AR(1): $u_t = \phi_1 u_{t-1} + \eta_t$;
ARFIMA(0,d,0): $u^*_t = (1-L)^d u_t$ with $u^*_t = \eta_t$;
ARFIMA(1,d,0): $u^*_t = (1-L)^d u_t$ with $u^*_t \sim$ AR(1);
Cycle.
Ad-hoc forecasts: Last Year Winner; Last Year Loser; Always Cambridge Win; Always Oxford Win.
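The four ad-hoc rules need no estimation and fit in a few lines. This is a minimal sketch with hypothetical function names, coding a Cambridge win as 1 and an Oxford win as 0; the example series is synthetic.

```python
def last_year_winner(history):
    """Predict that last year's winner wins again."""
    return history[-1]

def last_year_loser(history):
    """Predict that last year's loser wins this year."""
    return 1 - history[-1]

def always_cambridge(history):
    return 1

def always_oxford(history):
    return 0

def count_correct(series, rule, n_forecasts):
    """Apply `rule` to each of the last n_forecasts races, using only earlier outcomes."""
    start = len(series) - n_forecasts
    return sum(1 for t in range(start, len(series))
               if rule(series[:t]) == series[t])

# Synthetic outcomes, for illustration only
toy_series = [1, 1, 0, 0, 1, 0, 1, 1]
hits = {rule.__name__: count_correct(toy_series, rule, 4)
        for rule in (last_year_winner, last_year_loser,
                     always_cambridge, always_oxford)}
```

By construction the hit counts of Last Year Winner and Last Year Loser add up to the number of forecasts, as do those of Always Cambridge Win and Always Oxford Win.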

24 A forty year forecasting assessment: Forecasting results
Figure: Outcomes and forecasts for the Constant, RW, AR1, ARFI0, ARFI1 and CYCLE models.

25 A forty year forecasting assessment: Forecasting results
Table of forecasts, with columns Correct and % Correct, and rows Constant, RW, AR(1), ARFI(0,d,0), ARFI(1,d,0), Cycle, Last Year Winner, Last Year Loser, Always Cambridge Win and Always Oxford Win.
Overall, model-based forecasts outperform ad-hoc forecasts. Cycle forecasts are best! Is this it?

26 A forty year forecasting assessment: Forecasting comparisons
How significant are the differences in forecast accuracy? Our loss function value is $L_t$, which can take the values: 1 if the forecast of the Boat Race is WRONG; 0 if the forecast of the Boat Race is CORRECT. For each forecast method, we can construct (yet another) binary time series $L_t$: say $L^{(i)}_t$ for method $i$. The sums $L^{(i)}_1 + \cdots + L^{(i)}_{40}$ are reported in the previous table. The relative performance for each method is then measured as
$$d^{(ij)}_t = L^{(i)}_t - L^{(j)}_t, \qquad i \neq j,$$
which can take the values: 1 (model $i$ wrong, model $j$ correct: GOOD for model $j$); 0 (both models are wrong or both are correct: no distinction); -1 (model $i$ correct, model $j$ wrong: GOOD for model $i$).
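The loss series and the loss differential are simple element-wise operations. A minimal sketch follows, with hypothetical function names and synthetic forecasts.

```python
def loss_series(forecasts, outcomes):
    """L_t = 1 if the forecast is wrong, 0 if it is correct."""
    return [0 if f == y else 1 for f, y in zip(forecasts, outcomes)]

def loss_differential(loss_i, loss_j):
    """d_t = L_t(i) - L_t(j): 1 is good for model j, -1 is good for model i."""
    return [a - b for a, b in zip(loss_i, loss_j)]

# Synthetic example, for illustration only
actual = [1, 1, 1, 0]
loss_i = loss_series([1, 0, 1, 1], actual)
loss_j = loss_series([1, 1, 0, 1], actual)
d_ij = loss_differential(loss_i, loss_j)
```

Here `d_ij` is `[0, 1, -1, 0]`: the two models agree in the first and last race and each beats the other once in between.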

27 A forty year forecasting assessment: Equal Predictive Ability
How significant are the differences in forecast accuracy? We follow Diebold & Mariano (1995) with their sign test. To carry out the test, only consider the $m_{ij}$ non-zero values of $d^{(ij)}_t$ and compute
$$S_{ij} = \sum_t 1(d^{(ij)}_t = 1), \qquad S_{ij} \sim \text{Binomial}(m_{ij}, 0.5).$$
A small value for $S_{ij}$ says model $i$ is doing better than model $j$. Exact test: the cumulative binomial distribution function assesses whether $S_{ij}$ is small enough.
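The exact sign test only needs the cumulative Binomial(m, 0.5) distribution, which can be computed with integer arithmetic. The sketch below reads the p-value as the one-sided probability $P(S \le s)$; the function name and the example series are hypothetical.

```python
from math import comb

def sign_test(d):
    """Exact one-sided sign test on a loss differential series d.

    Returns (s, m, p_value), where m is the number of non-zero entries,
    s the number of entries equal to 1 (model i wrong, model j correct),
    and p_value = P(S <= s) for S ~ Binomial(m, 0.5).
    """
    nonzero = [v for v in d if v != 0]
    m = len(nonzero)
    s = sum(1 for v in nonzero if v == 1)
    p_value = sum(comb(m, k) for k in range(s + 1)) / 2 ** m
    return s, m, p_value

# Synthetic loss differentials, for illustration only
s, m, p = sign_test([1, -1, -1, 0, -1, -1, 0, -1])
```

In this example model $i$ loses only one of the six races in which the two forecasts differ, giving $p = 7/64 \approx 0.109$.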

28 A forty year forecasting assessment: Equal Predictive Ability
EPA test table of benchmark (rows) against alternative (columns). Rows: Constant, Winner, Loser, Cambridge, Oxford, RW, AR, ARFI, Cycle. Columns: Co, Wi, Lo, Ca, Ox, RW, AR, ARFI, Cy.
EPA test: the p-values are reported. In bold: the model in the column outperforms the model in the row. Enough evidence, let's stop here?

29 A forty year forecasting assessment: Superior Predictive Ability
How significant are the differences... over ALL models? Separate EPA tests may be less powerful against a single model: see the previous table. The matrix of EPA p-values may be difficult to interpret: conflicting evidence. We therefore also consider the Superior Predictive Ability (SPA) test of White (2000) and Hansen (2005). We focus on the quantity $D_{ij} = E(d^{(ij)}_t)$, and model $i$ is said to be superior if and only if $D_{ij} \le 0$ for all $j \neq i$. Applied in Hansen & Lunde (2005), Hsu & Kuan (2005) and Jungbacker, Koopman & Hol (2007).

30 A forty year forecasting assessment: Superior Predictive Ability
SPA test values with confidence intervals (bootstrapped), by benchmark:
Constant [0.114, 0.191]
Winner [0.077, 0.135]
Loser [0.004, 0.004]
Cambridge [0.003, 0.003]
Oxford [0.133, 0.169]
RW [0.055, 0.078]
AR [0.046, 0.267]
ARFI [0.020, 0.057]
Cycle [0.864, 1.000]
SPA test statistics are reported. In bold: the model is significantly outperformed by the others. The strongest evidence that a model is not outperformed is for Cycle.

31 A forty year (?) forecasting assessment: Sample Split
Why forty years? We don't know. The original plan was 50, but in the end we did 40. Let's have a go... Inoue & Rossi (Biometrika, 2011) and Hansen & Timmermann (wp, 2012) see dangers in an ad-hoc choice of forecast window size: one may not be able to detect significant predictive ability (even when it is available for other window sizes); one may find significant results by chance, through data snooping over the window size (which leads to size distortions).

32 A forty year forecasting assessment
Figure: Percentage of correct predictions per sample split, for Constant, Winner, Loser, Cambridge, Oxford, RW, AR(1), ARFI(0,d) and Cycle.

33 A forty year forecasting assessment
Figure: Cycle model outperforms other model per sample split: EPA. Panels: Constant, Winner, Loser, Cambridge, Oxford, RW, AR, ARFI.

34 A forty year forecasting assessment
Figure: Cycle model outperforms all other models per sample split: SPA.

35 Forecasting the Boat Race: Review
What have we learned? Our aim is to seriously assess the role of statistical models for the forecasting of binary events. It is a challenging exercise and we need to study harder. However... it appears that a statistical model can outperform ad-hoc methods. We should also take a look at the returns on betting. But it is also fun!

36 Andrew Harvey 65th year
I have learned a lot from you! Also, it has been a lot of fun. After this meeting, I am sure it will be business as usual. But if you decide differently...

37 Andrew Harvey 65th year
Please do not end up like this...

38 Andrew Harvey 65th year: Unobserved Components
Let us hope that UC models remain at the top of time series research. But if UC developments slow down in the coming years... Or it will be all gone...

39 Andrew Harvey 65th year
... cheer up, we all go together...

Forecasting The Boat Race G. Mesters (a,b,c) and S.J. Koopman (b,c,d) (a) Netherlands Institute for the Study of Crime and Law Enforcement, (b) Department of Econometrics, VU University Amsterdam, (c)