Forecasting the FIFA World Cup
Combining goal- and result-based team ability parameters
Pieter Robberechts, Jesse Davis
http://people.cs.kuleuven.be/pieter.robberechts


Post on 27-Jan-2020



Page 2

Introduction

A popular research topic since the 1960s

Two popular approaches:

1. Goal-based models

Model the number of goals scored by both teams

2. Result-based models

Model win-draw-loss outcomes directly

Page 3

Match outcome prediction

Typical approach:

1. Estimate team abilities based on historical match data

2. Use them to predict future match outcomes

Data → Team ratings → Predictions

Page 4

Match outcome prediction

Data scraped from:

- Post-World War II international games from http://eloratings.net

- Betting odds from http://betexplorer.com/

Page 5

Match outcome prediction

Two rating systems were explored:

- Elo ratings (result-based)

- ODM ratings (goal-based)

[Table: example team strengths, e.g. 2320, 2237, 2220, 2207, ...; the team names are not recoverable]

Page 6

The Elo rating system
A result-based rating system

Given:

- $R_H$, $R_A$: current home and away team ratings

- $S_H$: actual score of the home team

$$S_H = \begin{cases} 1 & \text{if the home team won} \\ 0.5 & \text{in case of a draw} \\ 0 & \text{if the home team lost} \end{cases}$$

Then:

- Expected score for the home team:

$$E_H = \frac{1}{1 + 10^{-(R_H - R_A)/400}}$$

- Updated rating of the home team:

$$R'_H = R_H + k(S_H - E_H)$$
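A minimal Python sketch of the Elo update on this slide. It assumes the usual sign convention, in which a higher-rated home team gets an expected score above 0.5. The function names, the example ratings and the value k = 20 are made up for illustration and do not come from the slides.

```python
def expected_home_score(r_home: float, r_away: float) -> float:
    """Expected score E_H of the home team, given both current ratings."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / 400.0))


def update_home_rating(r_home: float, r_away: float, s_home: float, k: float) -> float:
    """Updated home rating R'_H = R_H + k * (S_H - E_H).

    s_home is 1 for a home win, 0.5 for a draw and 0 for a home loss.
    """
    return r_home + k * (s_home - expected_home_score(r_home, r_away))


# Made-up example: a 2000-rated home team beats a 1900-rated away team.
e_home = expected_home_score(2000.0, 1900.0)
new_rating = update_home_rating(2000.0, 1900.0, s_home=1.0, k=20.0)
print(round(e_home, 3), round(new_rating, 1))
```

A win against a weaker opponent moves the rating only a little, because the expected score was already well above 0.5.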

Page 7

The Elo rating system
A result-based rating system

Problem:

- Not all games are handled with the same seriousness
  ‣ Competitiveness factor

- Most games are played against weak opponents
  ‣ Margin of victory

$$R'_H = R_H + k(S_H - E_H), \qquad k = k_0 \, w_i \, (1 + \delta)^{\gamma}$$

where

- $k_0$: recentness factor

- $w_i$: competitiveness factor

- $\delta$: margin of victory

- $\gamma$: margin of victory weight
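As a rough illustration of how such a game-dependent k could be computed, the sketch below evaluates k = k0 * w * (1 + delta) ** gamma. It assumes that delta is the absolute goal difference and that w is a weight chosen per type of match. Both readings, as well as the example weights, are assumptions and not values from the slides.

```python
def k_factor(k0: float, w: float, goal_diff: int, gamma: float) -> float:
    """k = k0 * w * (1 + delta) ** gamma, with delta = |goal difference| (assumed)."""
    delta = abs(goal_diff)
    return k0 * w * (1.0 + delta) ** gamma


# Hypothetical competitiveness weights per match type (not from the slides).
MATCH_WEIGHTS = {"friendly": 1.0, "qualifier": 2.0, "world_cup": 3.0}

# A 3-0 win in a qualifier weighs more than a 1-0 win in a friendly.
print(k_factor(10.0, MATCH_WEIGHTS["qualifier"], 3, 0.5))
print(k_factor(10.0, MATCH_WEIGHTS["friendly"], 1, 0.5))
```

With gamma = 0 the margin of victory is ignored and k reduces to k0 * w.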

Page 8

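Offense-Defense ratings
A goal-based rating system

Given:

$$A_{ij} = \begin{cases} \text{score team } j \text{ generated against team } i & \text{if teams } i \text{ and } j \text{ played each other} \\ 0 & \text{otherwise} \end{cases}$$

Then:

- Offensive rating of team $j$:

$$o_j = \sum_{i=1}^{n} \frac{A_{ij}}{d_i}$$

- Defensive rating of team $i$:

$$d_i = \sum_{j=1}^{n} \frac{A_{ij}}{o_j}$$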
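The offensive and defensive ratings depend on each other, so they are typically found by alternating the two updates. The sketch below does this with plain Python lists. The starting values of 1, the fixed number of iterations and the small `eps` guard against division by zero are assumptions made for illustration.

```python
def odm_ratings(A, iterations=100, eps=1e-9):
    """Alternate o_j = sum_i A[i][j] / d_i and d_i = sum_j A[i][j] / o_j.

    A[i][j] is the score team j generated against team i (0 if they did not play).
    Returns (offensive ratings, defensive ratings).
    """
    n = len(A)
    o = [1.0] * n
    d = [1.0] * n
    for _ in range(iterations):
        # Offense of team j: goals scored, discounted by the opponents' defense.
        o = [sum(A[i][j] / max(d[i], eps) for i in range(n)) for j in range(n)]
        # Defense of team i: goals conceded, discounted by the opponents' offense.
        d = [sum(A[i][j] / max(o[j], eps) for j in range(n)) for i in range(n)]
    return o, d


# Made-up scores for three teams: row i holds what each team scored against team i.
scores = [
    [0, 1, 2],
    [3, 0, 1],
    [1, 0, 0],
]
offense, defense = odm_ratings(scores)
print([round(x, 3) for x in offense], [round(x, 3) for x in defense])
```

In this formulation a large offensive rating and a small defensive rating both indicate a strong team.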

Page 9

Offense-Defense ratings
A goal-based rating system

Problem:

- Large disparities between the number of games played and the strength of the opponents

- Teams in different confederations rarely play each other

Solution: update ratings sequentially

For each team:

- Pre-game ratings = weighted sum of a team's post-game ratings

- Post-game ratings = ODM procedure with pre-game ratings as initial ratings
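The slide does not say which weights are used for the pre-game ratings. As one possible reading, the sketch below takes an exponentially decaying, normalised weighted sum of a team's earlier post-game ratings, so that recent games count more. The decay scheme and the parameter name `decay` are assumptions.

```python
def pre_game_rating(post_game_ratings, decay=0.5):
    """Weighted sum of a team's earlier post-game ratings, oldest first.

    The exponential weights (normalised to sum to 1) are an assumption
    made for illustration only.
    """
    if not post_game_ratings:
        raise ValueError("need at least one earlier post-game rating")
    n = len(post_game_ratings)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, post_game_ratings)) / total


# Made-up offensive ratings after a team's last three games.
print(pre_game_rating([1.2, 1.5, 1.8], decay=0.5))
```

The resulting value would then serve as the initial rating when the ODM procedure is run for the next game.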

Page 10

Match outcome prediction
Via team rating systems

Two prediction models were explored:

- Ordered logit regression (result-based)

- Bivariate Poisson regression (goal-based)
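To make the result-based model concrete, here is a minimal sketch of how an ordered logit turns a rating difference into win, draw and loss probabilities. The slope `beta` and the two cutpoints are hypothetical values chosen for illustration. In practice they would be fitted to historical matches.

```python
import math


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(rating_diff, beta, cut_low, cut_high):
    """Return (P(win), P(draw), P(loss)) for the first team.

    rating_diff is the first team's rating minus the second team's rating.
    The ordered outcomes loss < draw < win are separated by two cutpoints.
    """
    if cut_low >= cut_high:
        raise ValueError("cut_low must be smaller than cut_high")
    eta = beta * rating_diff
    p_loss = _sigmoid(cut_low - eta)
    p_loss_or_draw = _sigmoid(cut_high - eta)
    return (1.0 - p_loss_or_draw, p_loss_or_draw - p_loss, p_loss)


# Hypothetical parameters; a 100-point rating advantage for the first team.
print(ordered_logit_probs(100.0, beta=0.005, cut_low=-0.6, cut_high=0.6))
```

Because the outcomes are ordered, a single linear predictor shifts probability mass from "loss" through "draw" to "win" as the rating difference grows.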

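[Diagram: a predictor takes the ratings of both teams as input (Elo, att, def for one team and Elo, def, att for the other; home advantage?) and outputs outcome probabilities, e.g. [0.43, 0.33, 0.24] for "Belgium wins", "It's a tie" and "England wins"]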
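For the goal-based model, the sketch below shows how a bivariate Poisson distribution over the two teams' goals yields win, draw and loss probabilities. It uses the standard construction X = W1 + W3, Y = W2 + W3 with independent Poisson variables. The rates `lam1`, `lam2`, `lam3` are hypothetical inputs: how they are derived from the team ratings is not specified on the slide and is left out here.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) with X = W1 + W3 and Y = W2 + W3, W_i ~ Poisson(lam_i)."""
    return sum(
        poisson_pmf(x - k, lam1) * poisson_pmf(y - k, lam2) * poisson_pmf(k, lam3)
        for k in range(min(x, y) + 1)
    )


def outcome_probs(lam1, lam2, lam3, max_goals=15):
    """Return (P(win), P(draw), P(loss)) for the first team.

    Scorelines above max_goals are truncated, so the result is renormalised.
    """
    win = draw = loss = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = bivariate_poisson_pmf(x, y, lam1, lam2, lam3)
            if x > y:
                win += p
            elif x == y:
                draw += p
            else:
                loss += p
    total = win + draw + loss
    return win / total, draw / total, loss / total


# Hypothetical scoring rates for the two teams and their shared component.
print(outcome_probs(1.4, 1.0, 0.1))
```

The shared component `lam3` makes the two goal counts positively correlated, which raises the probability of draws compared with two independent Poisson variables.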

Page 11

Tuning the predictive power

How accurate are our predictions?

3 possible interpretations:

1. How many games are predicted correctly?
→ Accuracy

2. How certain was the model about the true outcome?
→ Logarithmic loss

3. How certain was the model about the true ordered outcome?
→ Ranked Probability Score (RPS)

$$\mathrm{RPS} = \frac{1}{r - 1} \sum_{k=1}^{r-1} \left( \sum_{l=1}^{k} (\hat{p}_l - y_l) \right)^2$$
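The three metrics can be sketched for a single game as follows. The code assumes the outcomes are ordered (for example win, draw, loss), that `probs` holds the predicted probability of each outcome, and that `outcome_index` marks the outcome that actually occurred. The function names are illustrative.

```python
import math


def is_correct(probs, outcome_index):
    """Accuracy for one game: was the most likely outcome the true one?"""
    predicted = max(range(len(probs)), key=lambda i: probs[i])
    return predicted == outcome_index


def log_loss(probs, outcome_index, eps=1e-15):
    """Negative log of the probability assigned to the true outcome."""
    return -math.log(max(probs[outcome_index], eps))


def rps(probs, outcome_index):
    """Ranked Probability Score: squared differences of cumulative distributions."""
    r = len(probs)
    total, cum_p, cum_y = 0.0, 0.0, 0.0
    for k in range(r - 1):
        cum_p += probs[k]
        cum_y += 1.0 if k == outcome_index else 0.0
        total += (cum_p - cum_y) ** 2
    return total / (r - 1)


# Probabilities for (first team wins, draw, second team wins).
probs = [0.43, 0.33, 0.24]
for outcome in range(3):
    print(outcome, is_correct(probs, outcome), log_loss(probs, outcome), rps(probs, outcome))
```

Unlike the logarithmic loss, the RPS penalises this forecast less when the game ends in a draw than when the second team wins, because a draw is closer to the favoured outcome in the ordering.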

Page 12

Tuning the predictive power

[Diagram: the dataset is split into a training set, a validation set and a test set; the best model is applied to the test set]

Minimise RPS with the L-BFGS algorithm:

Until convergence:
    For each game ∈ Training set:
        update_rating(game)
        If game ∈ Validation set:
            make_prediction(game)
        End if
    End for
    Compute average RPS
    Update rating and prediction model parameters
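The sketch below shows the shape of such a tuning loop: an objective that replays the games in order, predicts the validation games and returns the average RPS for a given parameter vector. To stay dependency-free it swaps in a plain grid search for L-BFGS; the same objective could instead be passed to an optimiser such as `scipy.optimize.minimize` with `method="L-BFGS-B"`. The Elo update with a symmetric away-team update, the ordered logit with symmetric cutpoints, the starting rating of 1500, the games and the grid are all assumptions. Each validation game is predicted from the pre-game ratings so that its result does not leak into the prediction.

```python
import itertools
import math

HOME_WIN, DRAW, AWAY_WIN = 0, 1, 2
ACTUAL_SCORE = {HOME_WIN: 1.0, DRAW: 0.5, AWAY_WIN: 0.0}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(rating_diff, beta, cut):
    """Ordered logit with symmetric cutpoints -cut and +cut (an assumption)."""
    eta = beta * rating_diff
    p_away = sigmoid(-cut - eta)
    p_away_or_draw = sigmoid(cut - eta)
    return (1.0 - p_away_or_draw, p_away_or_draw - p_away, p_away)


def rps(probs, outcome):
    """Ranked Probability Score for one game with ordered outcomes."""
    total, cum_p, cum_y = 0.0, 0.0, 0.0
    for k in range(len(probs) - 1):
        cum_p += probs[k]
        cum_y += 1.0 if k == outcome else 0.0
        total += (cum_p - cum_y) ** 2
    return total / (len(probs) - 1)


def average_rps(params, games, validation_start):
    """Replay the games in order and score predictions on the validation part."""
    k, beta, cut = params
    ratings = {}
    scores = []
    for index, (home, away, outcome) in enumerate(games):
        r_home = ratings.get(home, 1500.0)
        r_away = ratings.get(away, 1500.0)
        if index >= validation_start:
            scores.append(rps(predict(r_home - r_away, beta, cut), outcome))
        expected = 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / 400.0))
        delta = k * (ACTUAL_SCORE[outcome] - expected)
        ratings[home] = r_home + delta
        ratings[away] = r_away - delta
    return sum(scores) / len(scores)


def grid_search(games, validation_start, grid):
    """Stand-in optimiser: try every parameter combination and keep the best."""
    candidates = itertools.product(*grid)
    return min(candidates, key=lambda p: average_rps(p, games, validation_start))


# Made-up games: (home team, away team, outcome).
games = [
    ("A", "B", HOME_WIN), ("B", "C", DRAW), ("A", "C", HOME_WIN),
    ("C", "A", AWAY_WIN), ("B", "A", AWAY_WIN), ("C", "B", DRAW),
]
grid = ([10.0, 20.0, 40.0], [0.002, 0.005, 0.01], [0.3, 0.6, 0.9])
best = grid_search(games, validation_start=3, grid=grid)
print(best, round(average_rps(best, games, 3), 4))
```

The parameters that minimise the validation RPS would then be applied once to the held-out test set.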

Page 13

Challenge I: Match outcome prediction

The models were validated on the 2002, 2006, 2010 and 2014 World Cups.

[Bar charts: Accuracy, LogLoss and RPS for each of the 2002, 2006, 2010 and 2014 World Cups and for all four combined. Models compared: Elo ordered logit, Elo bivariate Poisson, Random forest, Bookmakers, Elo+ODM ordered logit, Elo+ODM bivariate Poisson, ODM ordered logit, ODM bivariate Poisson. The plotted values are not recoverable.]

Page 14

Challenge I: Match outcome prediction

And compared with the 2017 Soccer Prediction Challenge submissions.

[Bar charts: Accuracy and RPS for Bookmakers, Elo ordered logit, Elo+ODM ordered logit, Berrar et al., Hubáček et al., Constantinou and Tsokos et al. Surviving axis labels: 0.5 and 0.54 (Accuracy); 0.201 and 0.209 (RPS).]

Page 15

Challenge II: Tournament elimination

How accurately can we predict the round of elimination of each team in previous World Cups?

[Bar charts: Accuracy, LogLoss and RPS per World Cup. 2014: Elo, Elo+ODM, FiveThirtyEight. 2010, 2006 and 2002: Elo, Elo+ODM. The plotted values are not recoverable.]

Page 16

Our predictions

Page 17

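Others' predictions
Tournament elimination

[Chart: Accuracy, LogLoss and RPS for FiveThirtyEight, Zeileis et al., Groll et al., our model and UBS. Values are listed in extraction order; which value belongs to which forecaster is not recoverable.
Accuracy: 0.5, 0.563, 0.594, 0.563, 0.531
LogLoss: 0.201, 0.224, 0.186, 0.185, 0.182
RPS: 0.192, 0.132, 0.126, 0.127, 0.124]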

Page 18

Online interactive: https://dtai.cs.kuleuven.be/sports/worldcup18/

Page 19

Thanks! Any questions?

Interactive at: https://dtai.cs.kuleuven.be/sports/worldcup18/