Inferring Goal Expectancies from Bookmaker Odds#
penaltyblog also includes a utility that works in the opposite direction to the goals models: given bookmaker 1X2 probabilities (home/draw/away), it estimates the implied goal expectancies (μ_home, μ_away).
What it does#
Finds μ_home and μ_away such that a Poisson (optionally Dixon–Coles-adjusted) model best matches the given 1X2 probabilities.
Uses numerical optimisation (scipy.optimize.minimize) with stable parameterisation (log μ bounded).
Supports:
Dixon–Coles adjustment for low-score events.
Flexible scoring rules: Brier/MSE or cross-entropy.
Configurable grid size (max_goals) and normalisation after Dixon–Coles adjustment.
Parameters#
home, draw, away - 1X2 probabilities (must be in [0,1]).
dc_adj - whether to apply Dixon–Coles adjustment.
rho - correlation parameter for Dixon–Coles.
minimizer_options - dict of options to pass to SciPy’s optimiser.
max_goals - maximum goals per team in the grid (default 15 ⇒ 0–15 inclusive).
remove_overround - if True, probabilities are renormalised to sum to 1 before fitting.
method - optimiser method (default “L-BFGS-B”).
bounds - bounds on (log μ_home, log μ_away) for stability.
x0 - optional starting guess for (log μ_home, log μ_away).
renormalize_after_dc - if True, re-normalises the probability grid after DC adjustments and clips small negatives.
objective - ‘brier’ for mean-squared error or ‘cross_entropy’ for KL-style loss.
return_details - if True, includes extra audit information in the result.
Returns#
A dict with:
home_exp - implied home goal expectancy (μ_home)
away_exp - implied away goal expectancy (μ_away)
error - final mean squared error between predicted and target 1X2
success - whether the optimiser reported success
If return_details=True, also includes:
predicted - model’s predicted [P(home win), P(draw), P(away win)]
mass - total probability in the truncated grid (≤ 1.0 if max_goals is small or renormalize_after_dc=False)
Quick Example#
from penaltyblog.models import goal_expectancy
from pprint import pprint
# Suppose your market is:
p_home, p_draw, p_away = 0.45, 0.28, 0.29
est = goal_expectancy(
    home=p_home,
    draw=p_draw,
    away=p_away,
    dc_adj=True,  # use Dixon–Coles low-score correction
    rho=0.001,  # typical small value
    minimizer_options={"maxiter": 5000},
    remove_overround=True,
)
pprint(est)
{'away_exp': 1.018968393752446,
'error': 1.4642670964617868e-12,
'home_exp': 1.3415375327000219,
'mass': 0.9999999999999999,
'predicted': array([0.44117672, 0.27450821, 0.28431507]),
'success': True}
You can then use these expectancies directly in your own Poisson simulator or as a prior/anchor when comparing to model-based expectancies.
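As one illustration of the simulator use case, the sketch below draws independent Poisson scores from the two expectancies and tallies 1X2 frequencies. It uses only the standard library and is not part of penaltyblog: the helper names (poisson_draw, simulate_1x2) are hypothetical, the μ values are rounded from the output above, and the Dixon–Coles correction is ignored for simplicity.

```python
import math
import random


def poisson_draw(mu, rng):
    """Draw one Poisson(mu) variate using Knuth's multiplication method."""
    threshold = math.exp(-mu)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_1x2(mu_home, mu_away, n_sims=20000, seed=42):
    """Estimate home/draw/away frequencies from independent Poisson scores."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        home_goals = poisson_draw(mu_home, rng)
        away_goals = poisson_draw(mu_away, rng)
        if home_goals > away_goals:
            counts["home"] += 1
        elif home_goals == away_goals:
            counts["draw"] += 1
        else:
            counts["away"] += 1
    return {outcome: n / n_sims for outcome, n in counts.items()}


# Expectancies rounded from the example output above (illustrative only)
sim = simulate_1x2(1.34, 1.02)
print(sim)
```

The simulated frequencies should land close to the de-margined market probabilities, up to Monte Carlo noise.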
Notes & Best Practices#
Probabilities vs odds: If you start from odds, convert to probabilities and (optionally) remove overround before passing to this function.
Truncation: Only scores up to max_goals are considered; very small tail mass may be lost if renormalize_after_dc=False.
DC adjustment: Can help fit when draw prices are high; rho is typically small (0.001–0.01).
Stability: The optimiser works on bounded log-μ space, preventing non-physical negative goal expectancies.
Diagnostics: Use return_details=True to check mass and the predicted 1X2 probabilities when investigating residual errors.
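For the odds-to-probabilities step mentioned above, a minimal sketch might look like the following. It assumes decimal odds and removes the overround by simple proportional normalisation, which is only one of several possible methods; the function name odds_to_probabilities and the odds themselves are made up for illustration.

```python
def odds_to_probabilities(decimal_odds, remove_overround=True):
    """Convert decimal odds to probabilities, optionally rescaled to sum to 1."""
    raw = [1.0 / o for o in decimal_odds]
    if not remove_overround:
        return raw
    total = sum(raw)
    return [p / total for p in raw]


odds = [2.20, 3.50, 3.40]  # illustrative home/draw/away decimal odds

raw = odds_to_probabilities(odds, remove_overround=False)
fair = odds_to_probabilities(odds)

print("overround:", sum(raw) - 1.0)   # bookmaker margin baked into the prices
print("fair probabilities:", fair)    # rescaled so they sum to 1
```

Either list can be passed as home, draw and away; if you pass the raw probabilities, remove_overround=True asks the function to renormalise them for you.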
Behind the Scenes: How the Optimiser Works#
This function reverse-engineers μ_home and μ_away via a small non-linear optimisation problem.
1. Parameterisation#
The optimiser works in log μ space (log_mu_home, log_mu_away), ensuring μ > 0 at all times.
Bounds on log μ (default [-3, 3]) correspond to approximately μ ∈ [0.05, 20] – covering realistic football scoring ranges but preventing runaway values that could destabilise the fit.
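The mapping between the bounded log μ values and the goal expectancies themselves is just the exponential, which the short check below makes explicit. The bounds are the defaults quoted above; the variable names are illustrative.

```python
import math

log_bounds = (-3.0, 3.0)  # default bounds on log μ quoted above
mu_bounds = tuple(math.exp(b) for b in log_bounds)
print(mu_bounds)  # lower and upper limits on μ implied by the log bounds

# Any real-valued log μ maps to a strictly positive μ
for log_mu in (-3.0, 0.0, math.log(1.3), 3.0):
    print(f"log mu = {log_mu:+.3f}  ->  mu = {math.exp(log_mu):.3f}")
```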
2. Objective Function#
Default: Brier score (mean squared error) between the model’s predicted 1X2 probabilities and the input values.
Alternative: Cross-entropy loss (KL divergence direction) if objective='cross_entropy'.
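To make the two scoring rules concrete, here is a minimal sketch of both losses over a 1X2 triple. The exact scaling and clipping used inside penaltyblog are not shown in this document, so the division by three and the eps guard below are assumptions.

```python
import math


def brier(pred, target):
    """Mean squared error across the three 1X2 outcomes."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def cross_entropy(pred, target, eps=1e-12):
    """Cross-entropy of the prediction measured against the target distribution."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(pred, target))


target = [0.44, 0.27, 0.29]
good = [0.44, 0.27, 0.29]   # perfect match
bad = [0.60, 0.20, 0.20]    # clearly off

print(brier(good, target), brier(bad, target))
print(cross_entropy(good, target), cross_entropy(bad, target))
```

Both losses are smallest when the prediction equals the target; cross-entropy penalises underestimating a likely outcome more heavily than the Brier score does.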
3. Probability Grid#
A Poisson model generates a probability matrix over scores (0..max_goals) × (0..max_goals).
The Dixon–Coles adjustment optionally tweaks four low-score cells (0–0, 1–0, 0–1 and 1–1) to better match real-world correlation in low-scoring games.
If renormalize_after_dc=True, the grid is re-scaled to sum exactly to 1.0 after the adjustment (and any small negatives are clipped).
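The grid construction can be sketched in plain Python as below. The four multipliers follow the standard Dixon–Coles (1997) formulation; whether penaltyblog uses exactly this sign convention for rho is an assumption, and the function name score_grid is hypothetical.

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def score_grid(mu_home, mu_away, rho=0.0, max_goals=15, renormalize=True):
    """grid[i][j] = P(home scores i, away scores j); requires max_goals >= 1."""
    n = max_goals + 1
    grid = [
        [poisson_pmf(i, mu_home) * poisson_pmf(j, mu_away) for j in range(n)]
        for i in range(n)
    ]
    # Dixon–Coles multipliers on the four low-score cells (assumed convention)
    grid[0][0] *= 1.0 - mu_home * mu_away * rho
    grid[0][1] *= 1.0 + mu_home * rho
    grid[1][0] *= 1.0 + mu_away * rho
    grid[1][1] *= 1.0 - rho
    if renormalize:
        # Clip any small negatives, then rescale so the grid sums to 1
        grid = [[max(cell, 0.0) for cell in row] for row in grid]
        total = sum(map(sum, grid))
        grid = [[cell / total for cell in row] for row in grid]
    return grid


grid = score_grid(1.34, 1.02, rho=0.001)
raw_grid = score_grid(1.34, 1.02, rho=0.001, renormalize=False)
print(sum(map(sum, grid)), sum(map(sum, raw_grid)))
```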
4. From Grid to 1X2#
P(home win) = sum of all cells below the main diagonal.
P(draw) = sum of diagonal cells.
P(away win) = sum of cells above the diagonal.
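The collapse from grid to 1X2 is a matter of summing three regions. The sketch below assumes rows index home goals and columns index away goals (which is what "home win below the diagonal" implies) and uses an independent Poisson grid with illustrative expectancies; grid_to_1x2 is a hypothetical helper name.

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def grid_to_1x2(grid):
    """Sum the lower triangle, diagonal and upper triangle of a square grid."""
    n = len(grid)
    home = sum(grid[i][j] for i in range(n) for j in range(n) if i > j)
    draw = sum(grid[i][i] for i in range(n))
    away = sum(grid[i][j] for i in range(n) for j in range(n) if i < j)
    return home, draw, away


mu_home, mu_away = 1.34, 1.02  # illustrative values
grid = [
    [poisson_pmf(i, mu_home) * poisson_pmf(j, mu_away) for j in range(16)]
    for i in range(16)
]
home, draw, away = grid_to_1x2(grid)
print(home, draw, away)
```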
5. Optimisation#
The optimiser (scipy.optimize.minimize) searches log μ space to minimise the chosen loss.
By default, the L-BFGS-B method is used, as it handles bounds well and converges quickly for small parameter spaces.
The starting guess defaults to a mild home advantage (log(1.3), log(1.1)), but you can override it with x0.
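Putting the pieces together, the fit can be approximated in a few lines with NumPy and SciPy. This is a simplified sketch rather than penaltyblog's implementation: it omits the Dixon–Coles adjustment, always uses the Brier loss, and the helper names predict_1x2 and brier_loss are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson


def predict_1x2(log_mu, max_goals=15):
    """Independent-Poisson 1X2 probabilities for (log mu_home, log mu_away)."""
    mu_home, mu_away = np.exp(log_mu)
    goals = np.arange(max_goals + 1)
    # Rows index home goals, columns index away goals
    grid = np.outer(poisson.pmf(goals, mu_home), poisson.pmf(goals, mu_away))
    grid /= grid.sum()
    return np.array(
        [np.tril(grid, -1).sum(), np.trace(grid), np.triu(grid, 1).sum()]
    )


def brier_loss(log_mu, target):
    """Mean squared error between predicted and target 1X2 probabilities."""
    return np.mean((predict_1x2(log_mu) - target) ** 2)


target = np.array([0.45, 0.28, 0.29])
target = target / target.sum()  # remove the overround

result = minimize(
    brier_loss,
    x0=np.log([1.3, 1.1]),               # mild home advantage as starting guess
    args=(target,),
    method="L-BFGS-B",
    bounds=[(-3.0, 3.0), (-3.0, 3.0)],   # bounds on log mu
    options={"maxiter": 5000},
)
mu_home, mu_away = np.exp(result.x)
print(mu_home, mu_away, result.fun, result.success)
```

With two free parameters and two independent targets (the three probabilities sum to 1), the loss can usually be driven very close to zero.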
6. Diagnostics & Tail Mass#
The returned mass is the sum of all probabilities in the truncated grid. If max_goals is too low, mass < 1.0 indicates that a non-negligible tail has been cut off; increasing max_goals shrinks the lost tail.
With renormalize_after_dc=False, residuals may include both truncation error and DC-induced mass shifts.
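The effect of max_goals on mass is easy to check numerically. For an independent Poisson grid the truncated mass is the product of the two truncated marginal sums, as in the sketch below (illustrative expectancies, hypothetical function name truncated_mass, no Dixon–Coles adjustment).

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def truncated_mass(mu_home, mu_away, max_goals):
    """Total probability inside the (0..max_goals) x (0..max_goals) grid."""
    home = sum(poisson_pmf(k, mu_home) for k in range(max_goals + 1))
    away = sum(poisson_pmf(k, mu_away) for k in range(max_goals + 1))
    return home * away


masses = {mg: truncated_mass(1.34, 1.02, mg) for mg in (3, 5, 10, 15)}
for mg, mass in masses.items():
    print(f"max_goals={mg:>2}  mass={mass:.10f}")
```

A small grid visibly loses mass, while the default of 15 leaves a negligible tail for typical football expectancies.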
Extended Goal Expectancy Inference#
If you have both the 1X2 market probabilities and the Over/Under 2.5 probabilities available, you can use goal_expectancy_extended to simultaneously reverse-engineer the implied expected goals (μ_home, μ_away) and a custom Dixon-Coles adjustment parameter (rho) that matches all constraints.
from penaltyblog.models import goal_expectancy_extended
# Assuming you derived the following market probabilities:
p_home, p_draw, p_away = 0.45, 0.28, 0.29
p_over25, p_under25 = 0.48, 0.52
est_ext = goal_expectancy_extended(
    home=p_home,
    draw=p_draw,
    away=p_away,
    over25=p_over25,
    under25=p_under25,
    remove_overround=True,
)
print(est_ext["home_exp"]) # Implied home lambda
print(est_ext["away_exp"]) # Implied away lambda
print(est_ext["implied_rho"]) # Implied Dixon-Coles rho
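The extra Over/Under constraint is a function of the same score grid: Under 2.5 is the mass of cells with at most two total goals, and Over 2.5 is the remainder. The sketch below computes both for an independent Poisson grid (rho = 0) with illustrative expectancies; how goal_expectancy_extended weights this constraint against the 1X2 targets is not described here, and over_under_25 is a hypothetical helper.

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def over_under_25(mu_home, mu_away, max_goals=15):
    """Return (P(over 2.5), P(under 2.5)) from a renormalised score grid."""
    n = max_goals + 1
    cells = {
        (i, j): poisson_pmf(i, mu_home) * poisson_pmf(j, mu_away)
        for i in range(n)
        for j in range(n)
    }
    total = sum(cells.values())
    under = sum(p for (i, j), p in cells.items() if i + j <= 2) / total
    return 1.0 - under, under


over, under = over_under_25(1.34, 1.02)
print(over, under)
```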
Generating Full Match Probabilities#
Once you have inferred the implied goal expectancies (and optionally the Dixon-Coles rho parameter) using either goal_expectancy or goal_expectancy_extended, you can feed these parameters directly into create_dixon_coles_grid. This allows you to generate a complete probability grid and extract odds for any other market (e.g., Asian Handicaps, Both Teams to Score, exact correct scores) from just basic 1X2 and Over/Under prices.
from penaltyblog.models import create_dixon_coles_grid
# Using the results from goal_expectancy_extended above
home_lambda = est_ext["home_exp"]
away_lambda = est_ext["away_exp"]
rho = est_ext["implied_rho"]
# Create the full probability grid
pred = create_dixon_coles_grid(home_lambda, away_lambda, rho)
# Now you can query any market
print(pred.btts_yes) # Both Teams to Score (Yes)
print(pred.asian_handicap_probs("home", -0.5)) # Asian Handicap
print(pred.exact_score(2, 1)) # Probability of a 2-1 exact score