Inferring Goal Expectancies from Bookmaker Odds#

penaltyblog also includes a utility that works in the opposite direction to the goals models: given bookmaker 1X2 probabilities (home/draw/away), it estimates the implied goal expectancies (μ_home, μ_away).

What it does#

  • Finds μ_home and μ_away such that a Poisson (optionally Dixon–Coles-adjusted) model best matches the given 1X2 probabilities.

  • Uses numerical optimisation (scipy.optimize.minimize) with stable parameterisation (log μ bounded).

  • Supports:
    • Time-decay adjustment for low-score events (Dixon–Coles).

    • Flexible scoring rules: Brier/MSE or cross-entropy.

    • Configurable grid size (max_goals) and normalisation after Dixon–Coles adjustment.

Parameters#

  • home, draw, away - 1X2 probabilities (must be in [0,1]).

  • dc_adj - whether to apply Dixon–Coles adjustment.

  • rho - correlation parameter for Dixon–Coles.

  • minimizer_options - dict of options to pass to SciPy’s optimiser.

  • max_goals - maximum goals per team in the grid (default 15 ⇒ 0–15 inclusive).

  • remove_overround - if True, probabilities are renormalised to sum to 1 before fitting.

  • method - optimiser method (default “L-BFGS-B”).

  • bounds - bounds on (log μ_home, log μ_away) for stability.

  • x0 - optional starting guess for (log μ_home, log μ_away).

  • renormalize_after_dc - if True, re-normalises the probability grid after DC adjustments and clips small negatives.

  • objective - ‘brier’ for mean-squared error or ‘cross_entropy’ for KL-style loss.

  • return_details - if True, includes extra audit information in the result.

Returns#

A dict with:

  • home_exp - implied home goal expectancy (μ_home)

  • away_exp - implied away goal expectancy (μ_away)

  • error - final mean squared error between predicted and target 1X2

  • success - whether the optimiser reported success

If return_details=True, also includes:

  • predicted - model’s predicted [P(home win), P(draw), P(away win)]

  • mass - total probability in the truncated grid (≤ 1.0 if max_goals small or normalize_after_dc=False)

Quick Example#

from penaltyblog.models import goal_expectancy
from pprint import pprint

# Suppose your market is:
p_home, p_draw, p_away = 0.45, 0.28, 0.29

est = goal_expectancy(
    home=p_home,
    draw=p_draw,
    away=p_away,
    dc_adj=True,              # use Dixon–Coles low-score correction
    rho=0.001,                # typical small value
    minimizer_options={"maxiter": 5000},
    remove_overround=True
)

pprint(est)
{'away_exp': 1.018968393752446,
 'error': 1.4642670964617868e-12,
 'home_exp': 1.3415375327000219,
 'mass': 0.9999999999999999,
 'predicted': array([0.44117672, 0.27450821, 0.28431507]),
 'success': True}

You can then use these expectancies directly in your own Poisson simulator or as a prior/anchor when comparing to model-based expectancies.

Notes & Best Practices#

  • Probabilities vs odds: If you start from odds, convert to probabilities and (optionally) remove overround before passing to this function.

  • Truncation: Only scores up to max_goals are considered; very small tail mass may be lost if normalize_after_dc=False.

  • DC adjustment: Can help fit when draw prices are high; rho is typically small (0.001–0.01).

  • Stability: The optimiser works on bounded log-μ space, preventing non-physical negative goal expectancies.

  • Diagnostics: Use return_details=True to check mass and predicted 1X2 to understand residual errors.

Behind the Scenes: How the Optimiser Works#

This function reverse-engineers μ_home and μ_away via a small non-linear optimisation problem.

1. Parameterisation#

  • The optimiser works in log μ space (log_mu_home, log_mu_away), ensuring μ > 0 at all times.

  • Bounds on log μ (default [-3, 3]) correspond to μ ∈ [0.05, 20] – covering realistic football scoring ranges but preventing runaway values that could destabilise the fit.

2. Objective Function#

  • Default: Brier score (mean squared error) between the model’s predicted 1X2 probabilities and the input values.

  • Alternative: Cross-entropy loss (KL divergence direction) if objective='cross_entropy'.

3. Probability Grid#

  • A Poisson model generates a probability matrix over scores (0..max_goals) × (0..max_goals).

  • The Dixon–Coles adjustment optionally tweaks four low-score cells to better match real-world correlation in low-scoring games.

  • If renormalize_after_dc=True, the grid is re-scaled to sum exactly to 1.0 after the adjustment (and any small negatives are clipped).

4. From Grid to 1X2#

  • P(home win) = sum of all cells below the main diagonal.

  • P(draw) = sum of diagonal cells.

  • P(away win) = sum of cells above the diagonal.

5. Optimisation#

  • The optimiser (scipy.optimize.minimize) searches log μ space to minimise the chosen loss.

  • By default, the L-BFGS-B method is used, as it handles bounds well and converges quickly for small parameter spaces.

  • The starting guess defaults to a mild home advantage (log(1.3), log(1.1)), but you can override with x0.

6. Diagnostics & Tail Mass#

  • The returned mass is the sum of all probabilities in the truncated grid. If max_goals is too low, mass < 1.0 means you’ve cut off a non-negligible tail - increasing max_goals will reduce this.

  • With normalize_after_dc=False, residuals may include both truncation error and DC-induced mass shifts.

Extended Goal Expectancy Inference#

If you have both the 1X2 market probabilities and the Over/Under 2.5 probabilities available, you can use goal_expectancy_extended to simultaneously reverse-engineer the implied expected goals (μ_home, μ_away) and a custom Dixon-Coles adjustment parameter (rho) that matches all constraints.

from penaltyblog.models import goal_expectancy_extended

# Assuming you derived the following market probabilities:
p_home, p_draw, p_away = 0.45, 0.28, 0.29
p_over25, p_under25 = 0.48, 0.52

est_ext = goal_expectancy_extended(
    home=p_home,
    draw=p_draw,
    away=p_away,
    over25=p_over25,
    under25=p_under25,
    remove_overround=True
)

print(est_ext["home_exp"])     # Implied home lambda
print(est_ext["away_exp"])     # Implied away lambda
print(est_ext["implied_rho"])  # Implied Dixon-Coles rho

Generating Full Match Probabilities#

Once you have inferred the implied goal expectancies (and optionally the Dixon-Coles rho parameter) using either goal_expectancy or goal_expectancy_extended, you can feed these parameters directly into create_dixon_coles_grid. This allows you to generate a complete probability grid and extract odds for any other market (e.g., Asian Handicaps, Both Teams to Score, exact correct scores) from just basic 1X2 and Over/Under prices.

from penaltyblog.models import create_dixon_coles_grid

# Using the results from goal_expectancy_extended above
home_lambda = est_ext["home_exp"]
away_lambda = est_ext["away_exp"]
rho = est_ext["implied_rho"]

# Create the full probability grid
pred = create_dixon_coles_grid(home_lambda, away_lambda, rho)

# Now you can query any market
print(pred.btts_yes)                           # Both Teams to Score (Yes)
print(pred.asian_handicap_probs("home", -0.5)) # Asian Handicap
print(pred.exact_score(2, 1))                  # Probability of a 2-1 exact score