Bayesian Random Intercept Model
[1]:
import sys
sys.path.append("../../")
import penaltyblog as pb
import pandas as pd
import arviz as az
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Get data from football-data.co.uk
[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()
df.head()
[2]:
id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |
1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |
1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |
1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |
1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |
5 rows × 111 columns
Train the Model
[3]:
clf = pb.models.BayesianRandomInterceptGoalModel(
df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [home, tau_int, intercept, tau_att, atts_star, tau_def, def_star]
100.00% [8000/8000 00:40<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_500 tune and 2_500 draw iterations (3_000 + 5_000 draws total) took 41 seconds.
The model’s parameters
[4]:
clf
[4]:
Module: Penaltyblog
Model: Bayesian Random Intercept
Number of parameters: 61
Team Intercept Attack Defence
--------------------------------------------------------------------------------
Arsenal 0.125 0.06 -0.042
Aston Villa -0.009 -0.074 0.205
Bournemouth -0.015 -0.092 0.179
Brighton -0.035 -0.103 0.035
Burnley 0.004 -0.06 -0.021
Chelsea 0.238 0.148 0.056
Crystal Palace -0.135 -0.199 -0.027
Everton 0.02 -0.052 0.063
Leicester 0.214 0.135 -0.142
Liverpool 0.333 0.242 -0.266
Man City 0.427 0.329 -0.218
Man United 0.203 0.133 -0.225
Newcastle -0.047 -0.108 0.088
Norwich -0.199 -0.251 0.289
Sheffield United -0.045 -0.103 -0.19
Southampton 0.095 0.014 0.122
Tottenham 0.171 0.094 -0.05
Watford -0.069 -0.128 0.164
West Ham 0.07 0.004 0.148
Wolves 0.08 0.011 -0.168
Home Advantage: 0.269
Predict Match Outcomes
[5]:
probs = clf.predict("Liverpool", "Wolves")
probs
[5]:
Module: Penaltyblog
Class: FootballProbabilityGrid
Home Goal Expectation: 1.9655112637906613
Away Goal Expectation: 0.839616515083537
Home Win: 0.637440649792086
Draw: 0.20991587925141958
Away Win: 0.15264346787710398
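The goal expectations above can be reproduced approximately from the parameter table. The sketch below assumes a log-linear rate, `exp(intercept + attack + opponent defence [+ home advantage])`. That form is inferred from the printed numbers, not taken from the penaltyblog source, and the helper `expected_goals` is hypothetical.

```python
import math

# Posterior means copied from the parameter table printed earlier
home_advantage = 0.269
liverpool = {"intercept": 0.333, "attack": 0.242, "defence": -0.266}
wolves = {"intercept": 0.080, "attack": 0.011, "defence": -0.168}


def expected_goals(team, opponent, home_adv=0.0):
    """Assumed log-linear scoring rate for `team` against `opponent`."""
    log_rate = team["intercept"] + team["attack"] + opponent["defence"] + home_adv
    return math.exp(log_rate)


home_xg = expected_goals(liverpool, wolves, home_advantage)
away_xg = expected_goals(wolves, liverpool)
print(home_xg, away_xg)
```

The results land within rounding distance of the reported 1.966 and 0.840. A small gap is expected because the table shows parameters rounded to three decimals.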
1x2 Probabilities
[6]:
probs.home_draw_away
[6]:
[0.637440649792086, 0.20991587925141958, 0.15264346787710398]
[7]:
probs.home_win
[7]:
0.637440649792086
[8]:
probs.draw
[8]:
0.20991587925141958
[9]:
probs.away_win
[9]:
0.15264346787710398
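To see where numbers like these can come from, here is a minimal sketch that builds a score grid from two independent Poisson distributions, using the goal expectations reported above. Independence and the truncation at `max_goals` are assumptions made for illustration; the notebook does not say how `FootballProbabilityGrid` is built internally.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals for a Poisson scoring rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


home_xg, away_xg = 1.9655, 0.8396  # goal expectations from the output above
max_goals = 15  # assumed truncation point; the mass beyond it is negligible here

# grid[h][a] is the probability of the exact score h-a
grid = [
    [poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg) for a in range(max_goals + 1)]
    for h in range(max_goals + 1)
]

scores = [(h, a) for h in range(max_goals + 1) for a in range(max_goals + 1)]
home_win = sum(grid[h][a] for h, a in scores if h > a)
draw = sum(grid[h][a] for h, a in scores if h == a)
away_win = sum(grid[h][a] for h, a in scores if h < a)
print(home_win, draw, away_win)
```

Under these assumptions the three sums agree with `probs.home_draw_away` to about three decimal places.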
Probability of Total Goals >1.5
[10]:
probs.total_goals("over", 1.5)
[10]:
0.7697934185386799
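As a sanity check on this figure: if home and away goals are treated as independent Poisson variables (an assumption, not something the notebook states), total goals are Poisson with the summed rate, so "over 1.5" is one minus the probability of zero or one goal.

```python
import math

home_xg, away_xg = 1.9655, 0.8396  # goal expectations from the prediction output
total_rate = home_xg + away_xg

# P(total <= 1) = P(0 goals) + P(1 goal) for a Poisson total
p_under = math.exp(-total_rate) * (1 + total_rate)
p_over = 1 - p_under
print(p_over)
```

The closed form gives a value that matches `probs.total_goals("over", 1.5)` to about three decimal places.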
Probability of Asian Handicap 1.5
[11]:
probs.asian_handicap("home", 1.5)
[11]:
0.38938850629173366
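The value returned here appears to be the probability that the home side wins by more than 1.5 goals, that is, by two or more. This reading is inferred from the numbers, not from the penaltyblog documentation. A minimal sketch under the same independent-Poisson assumption:

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals for a Poisson scoring rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


home_xg, away_xg = 1.9655, 0.8396  # goal expectations from the prediction output
max_goals = 15  # assumed truncation point
line = 1.5

# Sum the probability of every scoreline where the home margin beats the line
p_margin_over_line = sum(
    poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
    for h in range(max_goals + 1)
    for a in range(max_goals + 1)
    if h - a > line
)
print(p_margin_over_line)
```

The sum comes out close to the 0.389 shown above, which supports that interpretation for a half-goal line. Whole-number and quarter lines involve pushes and split stakes, and are not covered by this sketch.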
Probability of both teams scoring
[12]:
probs.both_teams_to_score
[12]:
0.488538680478423
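Both teams score exactly when neither side finishes on zero. If the two goal counts are independent Poisson variables (assumed here for illustration), that probability has a short closed form:

```python
import math

home_xg, away_xg = 1.9655, 0.8396  # goal expectations from the prediction output

p_home_scores = 1 - math.exp(-home_xg)  # P(home goals >= 1)
p_away_scores = 1 - math.exp(-away_xg)  # P(away goals >= 1)
btts = p_home_scores * p_away_scores
print(btts)
```

The product agrees with `probs.both_teams_to_score` to about three decimal places.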