Bayesian Bivariate Model
[1]:
import sys
sys.path.append("../../")
import penaltyblog as pb
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Get data from football-data.co.uk
[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()
df.head()
[2]:
id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |
1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |
1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |
1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |
1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |
5 rows × 111 columns
Train the Model
[3]:
clf = pb.models.BayesianBivariateGoalModel(
df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [tau_att, atts_star, tau_def, def_star, tau_rho, rho, mu, eta]
100.00% [9000/9000 01:53<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 2_000 tune and 2_500 draw iterations (4_000 + 5_000 draws total) took 114 seconds.
The model’s parameters
[5]:
clf
[5]:
Module: Penaltyblog
Model: Bayesian Random Intercept
Number of parameters: 62
Team Attack Defence rho
--------------------------------------------------------------------------------
Arsenal 0.365 -0.093 -0.149
Aston Villa -0.174 0.539 -0.22
Bournemouth -0.835 0.172 0.05
Brighton -0.357 0.243 -0.261
Burnley 0.009 0.274 -0.39
Chelsea 0.368 -0.081 0.112
Crystal Palace -0.61 0.219 -0.417
Everton -0.469 -0.027 -0.007
Leicester 0.738 -0.221 -0.225
Liverpool 1.098 -0.511 -0.228
Man City 1.459 -0.384 -0.319
Man United 0.67 -0.691 -0.162
Newcastle -0.639 0.336 -0.188
Norwich -1.016 0.639 -0.244
Sheffield United -0.276 -0.268 -0.334
Southampton -0.254 0.012 0.086
Tottenham 0.427 -0.305 -0.054
Watford -0.513 0.459 -0.231
West Ham -0.029 0.243 -0.052
Wolves 0.038 -0.554 -0.125
--------------------------------------------------------------------------------
mu: -1.188
eta: 0.439
Predict Match Outcomes
[6]:
probs = clf.predict("Liverpool", "Wolves")
probs
[6]:
Module: Penaltyblog
Class: FootballProbabilityGrid
Home Goal Expectation: 1.5174424821924677
Away Goal Expectation: 0.8928718395765348
Home Win: 0.5189164181292896
Draw: 0.25940159825895515
Away Win: 0.2216819835152744
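The two goal expectations above can be reproduced, to rounding, from the posterior means printed in the parameter table. The sketch below assumes a bivariate Poisson structure in which each side's expectation is a team-specific rate plus a shared rate driven by rho. That decomposition is inferred from the printed numbers and is not confirmed by the library itself, so treat it as an illustration rather than a description of the internals.

```python
import math

# Posterior means copied from the parameter table above (rounded to 3 d.p.).
mu, eta = -1.188, 0.439
attack = {"Liverpool": 1.098, "Wolves": 0.038}
defence = {"Liverpool": -0.511, "Wolves": -0.554}
rho = {"Liverpool": -0.228, "Wolves": -0.125}

home, away = "Liverpool", "Wolves"

# Assumed structure (an inference, not the documented model):
#   home rate   = exp(mu + eta + attack[home] + defence[away])
#   away rate   = exp(mu + attack[away] + defence[home])
#   shared rate = exp(rho[home] + rho[away]), added to both sides
lambda_home = math.exp(mu + eta + attack[home] + defence[away])
lambda_away = math.exp(mu + attack[away] + defence[home])
lambda_shared = math.exp(rho[home] + rho[away])

home_expectation = lambda_home + lambda_shared
away_expectation = lambda_away + lambda_shared

print(f"Home goal expectation: {home_expectation:.3f}")
print(f"Away goal expectation: {away_expectation:.3f}")
```

Under this assumption the results agree with the printed expectations to about three decimal places; the small gap is expected because the table shows rounded values.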
1x2 Probabilities
[7]:
probs.home_draw_away
[7]:
[0.5189164181292896, 0.25940159825895515, 0.2216819835152744]
[8]:
probs.home_win
[8]:
0.5189164181292896
[9]:
probs.draw
[9]:
0.25940159825895515
[10]:
probs.away_win
[10]:
0.2216819835152744
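As a sanity check, the 1x2 probabilities can be approximated from the two goal expectations alone. The sketch below assumes home and away goals are independent Poisson variables with those expectations and sums a truncated score grid. Both the independence assumption and the `max_goals` cut-off are choices made for this illustration, not details taken from the library.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Goal expectations copied from the prediction output above.
home_exp = 1.5174424821924677
away_exp = 0.8928718395765348
max_goals = 15  # illustrative truncation; the remaining tail mass is negligible

# grid[h][a] = P(home scores h, away scores a), assuming independence.
grid = [
    [poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp) for a in range(max_goals + 1)]
    for h in range(max_goals + 1)
]

scores = [(h, a) for h in range(max_goals + 1) for a in range(max_goals + 1)]
home_win = sum(grid[h][a] for h, a in scores if h > a)
draw = sum(grid[h][a] for h, a in scores if h == a)
away_win = sum(grid[h][a] for h, a in scores if h < a)

# Fair (margin-free) decimal odds are the reciprocals of the probabilities.
fair_odds = [1 / p for p in (home_win, draw, away_win)]
print([round(p, 4) for p in (home_win, draw, away_win)])
print([round(o, 2) for o in fair_odds])
```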
Probability of Total Goals >1.5
[11]:
probs.total_goals("over", 1.5)
[11]:
0.6937978756309453
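The over 1.5 figure has a simple closed form if the two scores are treated as independent Poisson variables, because their sum is then Poisson with the combined expectation. The sketch below rests on that assumption, which is consistent with the printed value but is not stated by the library.

```python
import math

# Goal expectations copied from the prediction output above.
home_exp = 1.5174424821924677
away_exp = 0.8928718395765348

# Under independence, total goals ~ Poisson(home_exp + away_exp).
total_exp = home_exp + away_exp

# P(total > 1.5) = 1 - P(total = 0) - P(total = 1)
p_zero_or_one = math.exp(-total_exp) * (1 + total_exp)
p_over_1_5 = 1 - p_zero_or_one

print(round(p_over_1_5, 4))
```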
Probability of Asian Handicap 1.5
[12]:
probs.asian_handicap("home", 1.5)
[12]:
0.2670081276116055
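The value above is consistent with reading `asian_handicap("home", 1.5)` as the probability that the home side wins by more than 1.5 goals, i.e. by two or more. That reading is an inference from the numbers, not a documented definition. The sketch below checks it with a truncated score grid, again assuming independent Poisson scores.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Goal expectations copied from the prediction output above.
home_exp = 1.5174424821924677
away_exp = 0.8928718395765348
max_goals = 15   # illustrative truncation of the score grid
handicap = 1.5   # home must win by more than this margin

# Sum the probability of every scoreline where the home margin beats the handicap.
p_home_covers = sum(
    poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
    for h in range(max_goals + 1)
    for a in range(max_goals + 1)
    if h - a > handicap
)

print(round(p_home_covers, 4))
```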
Probability of both teams scoring
[13]:
probs.both_teams_to_score
[13]:
0.46103699853202423
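Both teams score when neither side is held to zero. If the two scores are assumed independent Poisson variables (an assumption made for this check, not a documented property of the grid), the probability is the product of each side's chance of scoring at least once.

```python
import math

# Goal expectations copied from the prediction output above.
home_exp = 1.5174424821924677
away_exp = 0.8928718395765348

# P(team scores at least once) = 1 - P(team scores zero) = 1 - exp(-expectation)
p_home_scores = 1 - math.exp(-home_exp)
p_away_scores = 1 - math.exp(-away_exp)

p_btts = p_home_scores * p_away_scores
print(round(p_btts, 4))
```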
Train the model with recent matches weighted more heavily
[14]:
weights = pb.models.dixon_coles_weights(df["date"], 0.001)
clf = pb.models.BayesianBivariateGoalModel(
df["goals_home"], df["goals_away"], df["team_home"], df["team_away"], weights
)
clf.fit()
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [tau_att, atts_star, tau_def, def_star, tau_rho, rho, mu, eta]
100.00% [9000/9000 01:42<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 2_000 tune and 2_500 draw iterations (4_000 + 5_000 draws total) took 103 seconds.
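Dixon and Coles style weighting down-weights older fixtures exponentially, so a match that is t days old gets weight exp(-xi * t). The sketch below illustrates that idea in plain Python. The helper name `exponential_time_weights` is hypothetical, and measuring age from the most recent fixture in the data is an assumption; `pb.models.dixon_coles_weights` may choose its reference date differently.

```python
import math
from datetime import date

def exponential_time_weights(dates, xi):
    """Hypothetical helper: weight each match by exp(-xi * age in days).

    Age is measured from the most recent date supplied (an assumption).
    """
    latest = max(dates)
    return [math.exp(-xi * (latest - d).days) for d in dates]

# The first two dates appear in the fixtures table; the third is an
# illustrative later date, not taken from the data shown.
fixture_dates = [date(2019, 8, 9), date(2019, 8, 10), date(2020, 7, 26)]
weights = exponential_time_weights(fixture_dates, xi=0.001)

for d, w in zip(fixture_dates, weights):
    print(d, round(w, 3))
```

With xi = 0.001 a match roughly a year old still keeps about 70% of the weight of the latest one, which fits the small shifts between the two parameter tables. Larger values of xi discount old results more aggressively.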
[15]:
clf
[15]:
Module: Penaltyblog
Model: Bayesian Random Intercept
Number of parameters: 62
Team Attack Defence rho
--------------------------------------------------------------------------------
Arsenal 0.379 -0.115 -0.147
Aston Villa -0.198 0.509 -0.218
Bournemouth -0.81 0.148 0.061
Brighton -0.401 0.242 -0.251
Burnley -0.067 0.209 -0.345
Chelsea 0.355 -0.063 0.111
Crystal Palace -0.623 0.234 -0.396
Everton -0.435 -0.022 -0.028
Leicester 0.617 -0.22 -0.161
Liverpool 1.079 -0.446 -0.205
Man City 1.48 -0.37 -0.327
Man United 0.722 -0.642 -0.17
Newcastle -0.603 0.335 -0.176
Norwich -0.99 0.659 -0.28
Sheffield United -0.302 -0.234 -0.313
Southampton -0.205 -0.061 0.089
Tottenham 0.404 -0.302 -0.058
Watford -0.451 0.449 -0.207
West Ham -0.037 0.171 -0.021
Wolves 0.085 -0.482 -0.166
--------------------------------------------------------------------------------
mu: -1.243
eta: 0.471