Bayesian modelling of football outcomes (using Skellam's Distribution)


Karlis & Ntzoufras: Bayesian modelling of football outcomes

Bayesian modelling of football outcomes (using Skellam's Distribution)
Dimitris Karlis & Ioannis Ntzoufras
e-mails: {karlis, ntzoufras}@aueb.gr
Department of Statistics, Athens University of Economics and Business, Athens, Greece
2007, 24-26th June, Manchester

Synopsis
1. Introduction.
2. Skellam's Distribution.
3. Bayesian inference for the Skellam's distribution.
4. The General Model.
5. Illustrative example and results.
6. Discussion and further work.

1 Introduction
The Poisson distribution has been widely used as a simple modelling approach for predicting the number of goals in football (see, for example, Lee, 1997; Karlis and Ntzoufras, 2000).
Empirical evidence has shown a (relatively low) correlation between the goals in a football game. This correlation must be incorporated in the model.
It is reasonable to assume that the two outcome variables in football are correlated, since the two teams interact during the game.
We can assume a bivariate Poisson distribution (see Karlis and Ntzoufras, 2003). The marginal distributions are simple Poisson distributions, while the random variables are now dependent.
An important issue is modelling the extra variation of draws (mainly 0-0, 1-1, 2-2). This problem has been effectively handled by Dixon and Coles (1997) and Karlis and Ntzoufras (2003).

The Bivariate Poisson Model (Karlis and Ntzoufras, 2003)
$(X_i, Y_i) \sim BP(\lambda_{1i}, \lambda_{2i}, \lambda_{3i})$, where $BP(\lambda_1, \lambda_2, \lambda_3)$ is the bivariate Poisson distribution with parameters $\lambda_i$, $i = 1, 2, 3$, and probability function
\[
P(X = x, Y = y) = e^{-(\lambda_1 + \lambda_2 + \lambda_3)} \frac{\lambda_1^{x}}{x!} \frac{\lambda_2^{y}}{y!} \sum_{k=0}^{\min(x,y)} \binom{x}{k} \binom{y}{k} k! \left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^{k}. \tag{1}
\]
Marginals: $X \sim Poisson(\lambda_1 + \lambda_3)$ and $Y \sim Poisson(\lambda_2 + \lambda_3)$.
Means and variances: $E(X) = V(X) = \lambda_1 + \lambda_3$, $E(Y) = V(Y) = \lambda_2 + \lambda_3$.
Covariance: $Cov(X, Y) = \lambda_3 > 0$.
Can be derived using latent variables $W_i \sim Poisson(\lambda_i)$, for $i = 1, 2, 3$, with $X = W_1 + W_3$ and $Y = W_2 + W_3$.
Covariates can be linked directly on the means of the latent variables, as in Karlis and Ntzoufras (2003), or directly on the marginal means.

A Simple Model (Karlis and Ntzoufras, 2003)
Response variables $(X, Y)$ are the home and away goals in each game.
Consider the structure of Lee (1997) for $\lambda_1$ and $\lambda_2$:
\[
\log(\lambda_{1i}) = \mu + H + A_{HT_i} + D_{AT_i} \tag{2}
\]
\[
\log(\lambda_{2i}) = \mu + A_{AT_i} + D_{HT_i} \tag{3}
\]
$\mu$: constant; $H$: home effect; $A_k$, $D_k$: attacking and defensive parameters of team $k$; $HT_i$, $AT_i$: home and away team in game $i$.
Constant covariance $\lambda_3$.

Advantages
- Accounts for the covariance, which can be modelled.
- Has the latent variables decomposition, which can be used for data augmentation.
- Poisson marginals.
- Easy interpretation.
- Easy extension to similar models.

Disadvantages
- How to model $\lambda_3$? Is a constant $\lambda_3$ sufficient? Does $\lambda_3$ vary across teams or time?
- Underestimates the number of draws in certain cases (although it does better on this than the double Poisson model).
- Poisson marginals, while empirical evidence has shown slight over-dispersion: need to adjust.
- The model allows only for positive correlation; if the data show negative correlation (even slight), the model cannot handle it.
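Equation (1) and the win/draw/loss probabilities it implies can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `bp_pmf` and `outcome_probs` and the truncation of the score grid at `max_goals` are assumptions made here, and the sketch requires $\lambda_1, \lambda_2 > 0$.

```python
import math


def bp_pmf(x, y, lam1, lam2, lam3):
    """Bivariate Poisson probability P(X = x, Y = y), following equation (1)."""
    base = (math.exp(-(lam1 + lam2 + lam3))
            * lam1 ** x / math.factorial(x)
            * lam2 ** y / math.factorial(y))
    # Finite sum over k = 0, ..., min(x, y)
    series = sum(
        math.comb(x, k) * math.comb(y, k) * math.factorial(k)
        * (lam3 / (lam1 * lam2)) ** k
        for k in range(min(x, y) + 1)
    )
    return base * series


def outcome_probs(lam1, lam2, lam3, max_goals=25):
    """Home win, draw and home loss probabilities.

    The score grid is truncated at max_goals (an assumption of this sketch);
    for football-sized means the neglected tail mass is negligible.
    """
    win = draw = loss = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = bp_pmf(x, y, lam1, lam2, lam3)
            if x > y:
                win += p
            elif x == y:
                draw += p
            else:
                loss += p
    return win, draw, loss


if __name__ == "__main__":
    # Marginal means of 1.5 with lam3 = 0.2, i.e. lam1 = lam2 = 1.3
    print(outcome_probs(1.3, 1.3, 0.2))
```

With equal marginal means of 1.5, setting $\lambda_3 = 0.2$ (so $\lambda_1 = \lambda_2 = 1.3$) should give a draw probability close to 0.264, against roughly 0.243 when $\lambda_3 = 0$.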

Table 1: The effect on the predicted probabilities when ignoring the covariance; the BP increases the probability of a draw.

marginal means   $\lambda_3$   win     draw    loss    ratio
1.5              0.00          0.378   0.243   0.378
                 0.02          0.378   0.245   0.378   1.008
                 0.05          0.376   0.248   0.376   1.012
                 0.10          0.374   0.253   0.374   1.020
                 0.20          0.368   0.264   0.368   1.044
2.0              0.00          0.396   0.207   0.396
                 0.02          0.396   0.208   0.396   1.006
                 0.05          0.395   0.210   0.395   1.008
                 0.10          0.394   0.213   0.394   1.014
                 0.20          0.390   0.219   0.390   1.030

Diagonal Inflated Bivariate Poisson Model
Under this approach a diagonal inflated model is specified by
\[
P_D(x, y) =
\begin{cases}
(1 - p)\, BP(x, y \mid \lambda_1, \lambda_2, \lambda_3), & x \neq y, \\
(1 - p)\, BP(x, y \mid \lambda_1, \lambda_2, \lambda_3) + p\, D(x, \theta), & x = y,
\end{cases} \tag{4}
\]
where $D(x, \theta)$ is a discrete distribution with parameter vector $\theta$.
Such models can be fitted using the EM algorithm.
Important: diagonal inflation improves the model in several aspects: better draw prediction, overdispersed marginals, and introduced correlation.

Skellam's distribution (1)
For any pair of variables $(X, Y)$ that can be written as $X = W_1 + W_3$, $Y = W_2 + W_3$, with $W_1 \sim Poisson(\lambda_1)$, $W_2 \sim Poisson(\lambda_2)$ and $W_3$ following any distribution with parameters $\theta_3$, then
\[
Z = X - Y \sim PD(\lambda_1, \lambda_2)
\]
(Poisson difference or Skellam's distribution with parameters $\lambda_1$ and $\lambda_2$).

Skellam's distribution (2)
$Z = X - Y \sim PD(\lambda_1, \lambda_2)$: Poisson difference or Skellam's distribution with
Mean: $E(Z) = \lambda_1 - \lambda_2$
Variance: $Var(Z) = \lambda_1 + \lambda_2$
and density function
\[
f_{PD}(z \mid \lambda_1, \lambda_2) = P(Z = z \mid \lambda_1, \lambda_2) = e^{-(\lambda_1 + \lambda_2)} \left( \frac{\lambda_1}{\lambda_2} \right)^{z/2} I_z\!\left( 2 \sqrt{\lambda_1 \lambda_2} \right) \tag{5}
\]
for all $z \in \mathbb{Z}$, where $I_r(x)$ is the modified Bessel function of order $r$,
\[
I_r(x) = \left( \frac{x}{2} \right)^{r} \sum_{k=0}^{\infty} \frac{\left( x^2 / 4 \right)^{k}}{k!\, \Gamma(r + k + 1)}
\]
(see Abramowitz and Stegun, 1974, p. 375).
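The Skellam probability function and the Bessel series above can be evaluated directly. The following is a rough sketch using only the Python standard library; the names `bessel_i` and `skellam_pmf`, the number of series terms, and the use of $I_{|z|}$ in place of $I_z$ (the two coincide for integer orders) are choices made for this illustration.

```python
import math


def bessel_i(r, x, terms=50):
    """Modified Bessel function I_r(x) for integer r >= 0, via the power series.

    The series is truncated after `terms` terms (an assumption of this sketch),
    which is ample for the small arguments that arise with goal-scoring rates.
    """
    return sum(
        (x / 2.0) ** (2 * k + r) / (math.factorial(k) * math.gamma(r + k + 1))
        for k in range(terms)
    )


def skellam_pmf(z, lam1, lam2):
    """Poisson difference (Skellam) probability P(Z = z), following equation (5)."""
    # For integer order, I_{-z} = I_z, so abs(z) can be used.
    return (math.exp(-(lam1 + lam2))
            * (lam1 / lam2) ** (z / 2.0)
            * bessel_i(abs(z), 2.0 * math.sqrt(lam1 * lam2)))


if __name__ == "__main__":
    lam1, lam2 = 1.0, 2.0
    support = range(-30, 31)
    mean = sum(z * skellam_pmf(z, lam1, lam2) for z in support)
    var = sum((z - mean) ** 2 * skellam_pmf(z, lam1, lam2) for z in support)
    # Mean should be close to lam1 - lam2 and variance close to lam1 + lam2.
    print(mean, var)
```

A numerical check of this kind (mean near $\lambda_1 - \lambda_2$, variance near $\lambda_1 + \lambda_2$, probabilities summing to one) is a quick way to confirm that the formula has been transcribed correctly.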

[Figure: probability functions of Skellam's distribution in four panels, (lambda1, lambda2) = (1, 1), (1, 1.5), (1, 2) and (2, 2); horizontal axis x from -6 to 6, vertical axis Prob.]

Advantages
- Removes additive covariance (no need to model it).
- Has a Poisson latent variable interpretation.
- Does not assume Poisson marginals (i.e. one may assume an overdispersed marginal distribution).
- Easy interpretation.
- Simpler model set-up than the corresponding bivariate Poisson model.

Disadvantages
- Discards part of the information.
- Cannot model covariance and final full score (only differences).

Bayesian approach (see Karlis and Ntzoufras, 2006, SIM)
Used discrete differences of dental data to
- test for differences of means between two repeated measurements,
- test for zero inflated components,
using the Bayes factor approach (and RJMCMC to estimate it).
The same approach can be used to quantify (using the Bayes factor)
- the importance of the home effect (by testing for the equality of expected home and away goals), and
- the excess of draws (by testing the importance of the zero inflated component).

Skellam's Distribution for Football Scores
Response variable: $Z = X - Y$, the goal difference in each game.
Same structure for parameters $\lambda_1$ and $\lambda_2$ as in the bivariate Poisson model:
\[
\log(\lambda_{1i}) = \mu + H + A_{HT_i} + D_{AT_i} \tag{6}
\]
\[
\log(\lambda_{2i}) = \mu + A_{AT_i} + D_{HT_i} \tag{7}
\]
$\mu$: constant; $H$: home effect; $A_k$, $D_k$: attacking and defensive parameters of team $k$; $HT_i$, $AT_i$: home and away team in game $i$.
Use the zero inflated variation of Skellam's distribution to model the excess of draws.
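Hence we define the zero inflated Poisson Difference (ZPD) distribution as
\[
f_{ZPD}(0 \mid p, \lambda_1, \lambda_2) = p + (1 - p)\, f_{PD}(0 \mid \lambda_1, \lambda_2)
\]
and
\[
f_{ZPD}(z \mid p, \lambda_1, \lambda_2) = (1 - p)\, f_{PD}(z \mid \lambda_1, \lambda_2), \tag{8}
\]
for $z \in \mathbb{Z} \setminus \{0\}$, where $p \in (0, 1)$ and $f_{PD}(z \mid \lambda_1, \lambda_2)$ is given by (5).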
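The ZPD definition can be illustrated with a short sketch. To stay self-contained it does not use the Bessel form (5); instead it computes the Poisson difference probability as a truncated convolution of two independent Poisson variables, which is an equivalent but swapped-in route. The names `poisson_pmf`, `pd_pmf`, `zpd_pmf` and the truncation length `max_k` are assumptions of this example.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability P(W = n)."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def pd_pmf(z, lam1, lam2, max_k=60):
    """Poisson difference probability P(W1 - W2 = z) by direct convolution.

    Sums P(W1 = k + z) * P(W2 = k) over k, truncated after max_k terms
    (an assumption of this sketch; the neglected tail is negligible here).
    """
    start = max(0, -z)  # ensures k + z >= 0
    return sum(
        poisson_pmf(k + z, lam1) * poisson_pmf(k, lam2)
        for k in range(start, start + max_k)
    )


def zpd_pmf(z, p, lam1, lam2):
    """Zero inflated Poisson difference probability, following equation (8)."""
    inflation = p if z == 0 else 0.0
    return inflation + (1.0 - p) * pd_pmf(z, lam1, lam2)


if __name__ == "__main__":
    # Extra mass p is moved onto the draw (z = 0); all other values shrink by (1 - p).
    for z in (-1, 0, 1):
        print(z, pd_pmf(z, 1.4, 1.0), zpd_pmf(z, 0.1, 1.4, 1.0))
```

Setting `p = 0` recovers the plain PD model, which is why the two models can be compared through the posterior of $p$ alone.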

Posterior Distributions
The key element for building an MCMC algorithm for the above models is to use augmented data.
Sample latent data $w_{1i}$ and $w_{2i}$ from
\[
f(w_{1i}, w_{2i} \mid z_i = w_{1i} - w_{2i}, \lambda_{1i}, \lambda_{2i}) \propto \frac{\lambda_{1i}^{w_{1i}}}{w_{1i}!} \frac{\lambda_{2i}^{w_{2i}}}{w_{2i}!}\, I(z_i = w_{1i} - w_{2i}),
\]
where $I(A) = 1$ if $A$ is true and zero otherwise.
Sample $[\delta_i \mid z_i, \lambda_{1i}, \lambda_{2i}] \sim Bernoulli(\tilde{p}_i)$ with
\[
\tilde{p}_i = \frac{p}{p + (1 - p)\, f_{PD}(z_i \mid \lambda_{1i}, \lambda_{2i})}
\]
[$\delta_i$ indicates the mixing (zero inflated or PD) component]. By definition (8) this applies to games with $z_i = 0$; a game with $z_i \neq 0$ can only come from the PD component, so $\delta_i = 0$.
Then the MCMC algorithm for the model parameters is similar to the one used in Poisson regression models.
We can additionally model $p$ as in logistic regression models.

Example: Premiership season 2006-7 data
Data from the Premiership for the 2006-7 season are used to fit the PD and ZPD models.

Results using PD
Posterior summaries for $\mu$ and home effect

        Min.     1st Qu.  Median   Mean     3rd Qu.  Max.     s.d.
mu      -0.793   -0.457   -0.390   -0.386   -0.316   -0.024   0.109
home     0.165    0.384    0.432    0.436    0.486    0.720   0.074

[Figure: 95% posterior intervals for attacking coefficients ($A_i$), one interval per team. Team labels as printed: Fulham (38), Watford (29), Chelsea (64), Arsenal (63), Tottenham (57), Everton (52), Bolton (47), Reading (52), Portsmouth (45), Blackburn (52), Aston Villa (43), Middlesbrough (44), Newcastle (38), Man City (29), West Ham (35), Wigan (37), Sheff Utd (32), Charlton (34), Man Utd (83), Liverpool (57).]

[Figure: 95% posterior intervals for defensive coefficients [$(-1) D_i$], one interval per team. Team labels as printed: Man Utd (27), Chelsea (24), Liverpool (27), Arsenal (35), Tottenham (54), Everton (36), Bolton (52), Reading (47), Portsmouth (42), Blackburn (54), Aston Villa (41), Middlesbrough (49), Newcastle (47), Man City (44), West Ham (59), Fulham (60), Wigan (59), Sheff Utd (55), Charlton (60), Watford (59).]
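The data augmentation step described under Posterior Distributions can be sketched as follows. This is only an illustration of the two sampling steps for a single game, not the authors' sampler: the function names, the truncation of the latent support at `max_k` terms, and the use of `random.Random.choices` for the discrete draw are all assumptions made here.

```python
import math
import random


def latent_support(z, lam1, lam2, max_k=60):
    """Candidate values of w2 and unnormalised weights for (w1, w2) given z.

    Writes w2 = k and w1 = k + z, so the indicator I(z = w1 - w2) holds by
    construction. The support is truncated at max_k terms (an assumption).
    """
    start = max(0, -z)  # keeps w1 = k + z non-negative
    ks = list(range(start, start + max_k))
    weights = [
        lam1 ** (k + z) / math.factorial(k + z) * lam2 ** k / math.factorial(k)
        for k in ks
    ]
    return ks, weights


def sample_latent(z, lam1, lam2, rng):
    """Draw (w1, w2) from their conditional distribution given z = w1 - w2."""
    ks, weights = latent_support(z, lam1, lam2)
    k = rng.choices(ks, weights=weights)[0]
    return k + z, k


def delta_prob(z, p, lam1, lam2):
    """Probability that a game belongs to the zero inflated component."""
    if z != 0:
        return 0.0  # a non-draw can only come from the PD component
    _, weights = latent_support(0, lam1, lam2)
    f_pd_zero = math.exp(-(lam1 + lam2)) * sum(weights)
    return p / (p + (1.0 - p) * f_pd_zero)


def sample_delta(z, p, lam1, lam2, rng):
    """Draw the component indicator delta ~ Bernoulli(delta_prob)."""
    return 1 if rng.random() < delta_prob(z, p, lam1, lam2) else 0


if __name__ == "__main__":
    rng = random.Random(2007)
    print(sample_latent(2, 1.4, 1.0, rng), sample_delta(0, 0.05, 1.4, 1.0, rng))
```

Once $w_{1i}$ and $w_{2i}$ are available they behave like ordinary Poisson counts, which is what lets the remaining parameter updates follow a Poisson regression scheme.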

[Figure: number of games (0 to 120) against goal difference; 95% posterior intervals for predicted differences (in red = observed differences; - indicates the posterior median).]

Posterior Predictive Table (posterior expectations)

Pred. (Obs.) rank   Team            Pts     G.Dif.
 1 (1)              Man Utd         86.7     56.0
 2 (2)              Chelsea         81.0     40.0
 3 (3)              Arsenal         70.5     28.0
 4 (3)              Liverpool       69.4     30.2
 5 (6)              Everton         62.5     16.0
 6 (8)              Reading         55.5      5.4
 7 (5)              Tottenham       54.0      3.2
 8 (9)              Portsmouth      53.3      2.8
 9 (10)             Blackburn       51.8     -2.1
10 (11)             Aston Villa     51.5      1.6
11 (12)             Middlesbrough   49.0     -4.7
12 (7)              Bolton          49.0     -5.7
13 (13)             Newcastle       43.8     -8.8
14 (14)             Man City        41.8    -14.8
15 (15)             West Ham        38.6    -24.3
16 (17)             Wigan           38.1    -21.6
17 (18)             Sheff Utd       37.0    -23.0
18 (19)             Charlton        35.7    -25.9
19 (16)             Fulham          33.1    -22.0
20 (20)             Watford         29.7    -30.3

Observed Final Table (observed values)

Obs. rank   Team            Pts   G.Dif.
 1          Man Utd         89     56
 2          Chelsea         83     40
 3          Liverpool       68     30
 4          Arsenal         68     28
 5          Tottenham       60      3
 6          Everton         58     16
 7          Bolton          56     -5
 8          Reading         55      5
 9          Portsmouth      54      3
10          Blackburn       52     -2
11          Aston Villa     50      2
12          Middlesbrough   46     -5
13          Newcastle       43     -9
14          Man City        42    -15
15          West Ham        41    -24
16          Fulham          39    -22
17          Wigan           38    -22
18          Sheff Utd       38    -23
19          Charlton        34    -26
20          Watford         28    -30

Deviations between observed and predictive measures

Comparison                                   Deviation
1. Relative Frequencies (counts/games)       1.06%
2. Frequencies (counts)                      4.04
3. Relative Frequencies of win/draw/lose     2.20%
4. Frequencies of win/draw/lose              8.30
5. Expected points                           3.02
6. Expected goal difference                  0.28

\[
\text{Deviation} = \sqrt{ \frac{1}{K} \sum_{i=1}^{K} \left( E(Q_i^{Pred} \mid y) - Q_i^{obs} \right)^2 }
\]
where $K$ is the length of vector $Q$ ($K = 13$ for comparisons 1-4, $K = 20$ for comparisons 5-6).
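As a worked check of the deviation measure, the sketch below applies it to the points columns of the two league tables above, matching teams by name. It assumes the measure is a root mean squared difference, which is how the garbled formula on the slide is read here; the function name `rms_deviation` and the dictionary layout are choices made for this illustration.

```python
import math

# Posterior expected points (posterior predictive table) and observed points
# (observed final table), keyed by team.
predicted_pts = {
    "Man Utd": 86.7, "Chelsea": 81.0, "Arsenal": 70.5, "Liverpool": 69.4,
    "Everton": 62.5, "Reading": 55.5, "Tottenham": 54.0, "Portsmouth": 53.3,
    "Blackburn": 51.8, "Aston Villa": 51.5, "Middlesbrough": 49.0,
    "Bolton": 49.0, "Newcastle": 43.8, "Man City": 41.8, "West Ham": 38.6,
    "Wigan": 38.1, "Sheff Utd": 37.0, "Charlton": 35.7, "Fulham": 33.1,
    "Watford": 29.7,
}
observed_pts = {
    "Man Utd": 89, "Chelsea": 83, "Liverpool": 68, "Arsenal": 68,
    "Tottenham": 60, "Everton": 58, "Bolton": 56, "Reading": 55,
    "Portsmouth": 54, "Blackburn": 52, "Aston Villa": 50,
    "Middlesbrough": 46, "Newcastle": 43, "Man City": 42, "West Ham": 41,
    "Fulham": 39, "Wigan": 38, "Sheff Utd": 38, "Charlton": 34,
    "Watford": 28,
}


def rms_deviation(predicted, observed):
    """Root mean squared difference between predicted and observed values."""
    teams = sorted(predicted)
    squared = [(predicted[t] - observed[t]) ** 2 for t in teams]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    print(round(rms_deviation(predicted_pts, observed_pts), 2))
```

Under this reading the result is close to the 3.02 reported for comparison 5 (expected points), which supports interpreting the measure as a root mean squared deviation with $K = 20$.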

Results using ZPD
Posterior summaries for $\mu$, home effect and $p$

        Min.     1st Qu.  Median   Mean     3rd Qu.  Max.     s.d.
mu      -0.785   -0.432   -0.355   -0.360   -0.284   -0.019   0.115
home     0.184    0.378    0.432    0.434    0.486    0.705   0.082
p        0.000    0.006    0.014    0.018    0.025    0.114   0.015

Results from PD (for comparison)

        Min.     1st Qu.  Median   Mean     3rd Qu.  Max.     s.d.
mu      -0.793   -0.457   -0.390   -0.386   -0.316   -0.024   0.109
home     0.165    0.384    0.432    0.436    0.486    0.720   0.074

- The posterior distribution of $p$ is close to zero.
- Small deviations between $\mu$ for PD and ZPD.
- Home effect is similar in both models.

[Figure: number of games (0 to 120) against goal difference; 95% posterior intervals for predicted differences under ZPD and PD (in red = observed differences; - indicates the posterior median).]
Comment: No major differences between the two models are observed.

Deviations between observed and predictive measures

Comparison                                   PD       ZPD
1. Relative Frequencies (counts/games)       1.06%    1.32%
2. Frequencies (counts)                      4.04     5.04
3. Relative Frequencies of win/draw/lose     2.20%    2.80%
4. Frequencies of win/draw/lose              8.30     10.65
5. Expected points                           3.02     3.07
6. Expected goal difference                  0.28     0.40

Comments
- The zero inflation component does not improve the performance (as expected).
- Predictions of goal differences for ZPD are worse than the corresponding ones for PD.
- Nevertheless, differences in the final rankings are minimal.

Discussion and further research
- Although the simple and bivariate Poisson models underestimate draws, here, using PD, we have overestimated draws.
- No zero inflation is needed. We might need to add a component which will reduce the predicted draws.
- Can covariates on $p$ improve the model?
- Apply Bayesian variable selection and Bayesian model averaging techniques.
- Apply other distributions defined on the same range.

Our Related Publications
1. Karlis, D. and Ntzoufras, I. (2006). Bayesian analysis of the differences of count data. Statistics in Medicine, 25, 1885-1905.
2. Karlis, D. and Ntzoufras, I. (2005). Bivariate Poisson and Diagonal Inflated Bivariate Poisson Regression Models in R. Journal of Statistical Software, Volume 10, Issue 10.
3. Karlis, D. and Ntzoufras, I. (2003). Analysis of Sports Data Using Bivariate Poisson Models. Journal of the Royal Statistical Society, D (Statistician), 52, 381-393.
4. Karlis, D. and Ntzoufras, I. (2000). On Modelling Soccer Data. Student, 3, 229-244.

Additional References
Lee, A.J. (1997). Modeling scores in the Premier League: Is Manchester United really the best? Chance, 10, 15-19.
Dixon, M.J. and Coles, S.G. (1997). Modelling association football scores and inefficiencies in football betting market. Applied Statistics, 46, 265-280.