Bayesian Random Intercept Model

[1]:
import sys

sys.path.append("../../")

import penaltyblog as pb
import pandas as pd
import arviz as az
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Get data from football-data.co.uk

[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()

df.head()
[2]:
id                                           date        datetime             season     competition         div  time   team_home       team_away         fthg  ftag  ...  b365_cahh  b365_caha  pcahh  pcaha  max_cahh  max_caha  avg_cahh  avg_caha  goals_home  goals_away
1565308800---liverpool---norwich             2019-08-09  2019-08-09 20:00:00  2019-2020  ENG Premier League  E0   20:00  Liverpool       Norwich           4     1     ...  1.91       1.99       1.94   1.98   1.99      2.07      1.90      1.99      4           1
1565395200---bournemouth---sheffield_united  2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Bournemouth     Sheffield United  1     1     ...  1.95       1.95       1.98   1.95   2.00      1.96      1.96      1.92      1           1
1565395200---burnley---southampton           2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Burnley         Southampton       3     0     ...  1.87       2.03       1.89   2.03   1.90      2.07      1.86      2.02      3           0
1565395200---crystal_palace---everton        2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Crystal Palace  Everton           0     0     ...  1.82       2.08       1.97   1.96   2.03      2.08      1.96      1.93      0           0
1565395200---tottenham---aston_villa         2019-08-10  2019-08-10 17:30:00  2019-2020  ENG Premier League  E0   17:30  Tottenham       Aston Villa       3     1     ...  2.10       1.70       2.18   1.77   2.21      1.87      2.08      1.80      3           1

5 rows × 111 columns

Train the Model

[3]:
clf = pb.models.BayesianRandomInterceptGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [home, tau_int, intercept, tau_att, atts_star, tau_def, def_star]
100.00% [8000/8000 00:40<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_500 tune and 2_500 draw iterations (3_000 + 5_000 draws total) took 41 seconds.

The model’s parameters

[4]:
clf
[4]:
Module: Penaltyblog

Model: Bayesian Random Intercept

Number of parameters: 61
Team                 Intercept            Attack               Defence
--------------------------------------------------------------------------------
Arsenal              0.125                0.06                 -0.042
Aston Villa          -0.009               -0.074               0.205
Bournemouth          -0.015               -0.092               0.179
Brighton             -0.035               -0.103               0.035
Burnley              0.004                -0.06                -0.021
Chelsea              0.238                0.148                0.056
Crystal Palace       -0.135               -0.199               -0.027
Everton              0.02                 -0.052               0.063
Leicester            0.214                0.135                -0.142
Liverpool            0.333                0.242                -0.266
Man City             0.427                0.329                -0.218
Man United           0.203                0.133                -0.225
Newcastle            -0.047               -0.108               0.088
Norwich              -0.199               -0.251               0.289
Sheffield United     -0.045               -0.103               -0.19
Southampton          0.095                0.014                0.122
Tottenham            0.171                0.094                -0.05
Watford              -0.069               -0.128               0.164
West Ham             0.07                 0.004                0.148
Wolves               0.08                 0.011                -0.168
Home Advantage: 0.269

Predict Match Outcomes

[5]:
probs = clf.predict("Liverpool", "Wolves")
probs
[5]:
Module: Penaltyblog

Class: FootballProbabilityGrid

Home Goal Expectation: 1.9655112637906613
Away Goal Expectation: 0.839616515083537

Home Win: 0.637440649792086
Draw: 0.20991587925141958
Away Win: 0.15264346787710398
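
The goal expectations above can be reproduced by hand from the parameter table. The sketch below assumes a log-linear form in which the home side's log rate is its intercept plus the home advantage, its attack and the opponent's defence, and the away side's log rate is the same without the home advantage. That form is inferred from the printed numbers, not taken from the penaltyblog source, and the helper name `goal_expectations` is hypothetical.

```python
import math

# Posterior-mean parameters copied from the table above (rounded to 3 d.p.)
HOME_ADVANTAGE = 0.269
liverpool = {"intercept": 0.333, "attack": 0.242, "defence": -0.266}
wolves = {"intercept": 0.080, "attack": 0.011, "defence": -0.168}


def goal_expectations(home, away, home_advantage):
    """Return (home, away) expected goals under the assumed log-linear form."""
    # Assumption: a negative defence value lowers the opponent's scoring rate.
    log_home = home["intercept"] + home_advantage + home["attack"] + away["defence"]
    log_away = away["intercept"] + away["attack"] + home["defence"]
    return math.exp(log_home), math.exp(log_away)


home_xg, away_xg = goal_expectations(liverpool, wolves, HOME_ADVANTAGE)
print(home_xg, away_xg)
```

The results land within rounding error of the 1.9655 and 0.8396 reported for Liverpool against Wolves. The small gap is expected because the table shows parameters to three decimal places only.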

1x2 Probabilities

[6]:
probs.home_draw_away
[6]:
[0.637440649792086, 0.20991587925141958, 0.15264346787710398]
[7]:
probs.home_win
[7]:
0.637440649792086
[8]:
probs.draw
[8]:
0.20991587925141958
[9]:
probs.away_win
[9]:
0.15264346787710398
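
The 1x2 probabilities are consistent with a grid of scoreline probabilities built from two independent Poisson distributions, one per team. The following is a minimal sketch under that assumption, using only the standard library. The function names and the `max_goals=15` truncation are hypothetical choices for illustration and are not penaltyblog's API.

```python
import math

# Goal expectations copied from the prediction output above
HOME_XG = 1.9655112637906613
AWAY_XG = 0.839616515083537


def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def score_grid(home_xg, away_xg, max_goals=15):
    """grid[h][a] = P(home scores h and away scores a), assuming independence."""
    return [
        [poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg) for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def home_draw_away(grid):
    """Sum the grid below, on and above the diagonal."""
    home = draw = away = 0.0
    for h, row in enumerate(grid):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return [home, draw, away]


hda = home_draw_away(score_grid(HOME_XG, AWAY_XG))
print(hda)
```

Under these assumptions the three values agree with `probs.home_draw_away` to about three decimal places.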

Probability of Total Goals >1.5

[10]:
probs.total_goals("over", 1.5)
[10]:
0.7697934185386799
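
If the two scores are treated as independent Poisson variables, their sum is also Poisson with rate equal to the sum of the two goal expectations, so the over/under figure has a closed form. The sketch below rests on that assumption, and `prob_total_over` is a hypothetical helper rather than a library function.

```python
import math

# Goal expectations copied from the prediction output above
HOME_XG = 1.9655112637906613
AWAY_XG = 0.839616515083537


def prob_total_over(line, home_xg, away_xg):
    """P(total goals > line), assuming independent Poisson scores."""
    lam = home_xg + away_xg  # the sum of independent Poissons is Poisson
    # Totals up to floor(line) do not clear the line.
    p_not_over = sum(
        math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(math.floor(line) + 1)
    )
    return 1.0 - p_not_over


over_1_5 = prob_total_over(1.5, HOME_XG, AWAY_XG)
print(over_1_5)
```

For the 1.5 line this is one minus the probability of zero or one total goals, and it comes out close to the 0.7698 reported above.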

Probability of Asian Handicap 1.5

[11]:
probs.asian_handicap("home", 1.5)
[11]:
0.38938850629173366
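
The value above is consistent with reading `asian_handicap("home", 1.5)` as the probability that the home side wins by more than 1.5 goals, that is, by two or more. This reading is inferred from the printed numbers and is not confirmed against the library source. A minimal sketch under the independent-Poisson assumption, with a hypothetical helper name and truncation point:

```python
import math

# Goal expectations copied from the prediction output above
HOME_XG = 1.9655112637906613
AWAY_XG = 0.839616515083537


def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def prob_home_margin_over(line, home_xg, away_xg, max_goals=15):
    """P(home goals - away goals > line), summed over a truncated score grid."""
    total = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            if h - a > line:
                total += poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
    return total


home_minus_1_5 = prob_home_margin_over(1.5, HOME_XG, AWAY_XG)
print(home_minus_1_5)
```

With a line of 0.5 the same function gives the plain home-win probability, which is a quick sanity check on the interpretation.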

Probability of both teams scoring

[12]:
probs.both_teams_to_score
[12]:
0.488538680478423
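
Under the same independent-Poisson assumption, both teams score when neither is held to zero, so the probability is the product of the two "at least one goal" probabilities. The helper below is a hypothetical illustration, not part of penaltyblog.

```python
import math

# Goal expectations copied from the prediction output above
HOME_XG = 1.9655112637906613
AWAY_XG = 0.839616515083537


def prob_both_teams_score(home_xg, away_xg):
    """P(home >= 1 and away >= 1), assuming independent Poisson scores."""
    p_home_scores = 1.0 - math.exp(-home_xg)  # 1 - P(home scores 0)
    p_away_scores = 1.0 - math.exp(-away_xg)  # 1 - P(away scores 0)
    return p_home_scores * p_away_scores


btts = prob_both_teams_score(HOME_XG, AWAY_XG)
print(btts)
```

The result is close to the 0.4885 returned by `probs.both_teams_to_score`.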