Changelog#

Version Numbering#

penaltyblog follows the Semantic Versioning (SemVer) guidelines. For more information, see semver.org.

v1.9.0 (2026-02-28)#

  • New Features

    • Added create_dixon_coles_grid() function to create a FootballProbabilityGrid directly from expected goals (lambdas) and optional Dixon-Coles rho parameter. Useful when expected goals come from external ML models rather than fitted goal models.

    • Added goal_expectancy_extended() function to infer implied goal expectancies AND the Dixon-Coles correlation parameter (rho) simultaneously from 1X2 and Over/Under 2.5 probabilities.
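As a rough illustration of what building a score grid from expected goals and rho involves, the sketch below applies the standard Dixon-Coles low-score adjustment to independent Poisson probabilities. It does not call penaltyblog; the names poisson_pmf, dixon_coles_tau, build_score_grid and match_odds are hypothetical, the lambdas and rho are made-up values, and the renormalisation of the truncated grid is an assumption.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals for a Poisson rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def dixon_coles_tau(x, y, lam_home, lam_away, rho):
    """Dixon-Coles adjustment factor; 1.0 outside the four low-score cells."""
    if x == 0 and y == 0:
        return 1.0 - lam_home * lam_away * rho
    if x == 0 and y == 1:
        return 1.0 + lam_home * rho
    if x == 1 and y == 0:
        return 1.0 + lam_away * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def build_score_grid(lam_home, lam_away, rho=0.0, max_goals=10):
    """Hypothetical helper: grid[x][y] = P(home scores x, away scores y)."""
    size = max_goals + 1
    grid = [
        [
            dixon_coles_tau(x, y, lam_home, lam_away, rho)
            * poisson_pmf(x, lam_home)
            * poisson_pmf(y, lam_away)
            for y in range(size)
        ]
        for x in range(size)
    ]
    # Renormalise because the grid is truncated at max_goals (an assumption).
    total = sum(sum(row) for row in grid)
    return [[p / total for p in row] for row in grid]


def match_odds(grid):
    """Collapse the score grid into (home win, draw, away win) probabilities."""
    home = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x > y)
    draw = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x == y)
    away = sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x < y)
    return home, draw, away


# Expected goals as they might come from an external model (made-up values).
grid = build_score_grid(1.6, 1.1, rho=-0.05)
home_win, draw, away_win = match_odds(grid)
print(f"home={home_win:.3f} draw={draw:.3f} away={away_win:.3f}")
```

A negative rho shifts probability towards 0-0 and 1-1, so the draw probability rises relative to the independent Poisson case.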

  • Bug Fixes

    • Fixed FootballProbabilityGrid.totals() to correctly handle quarter lines (e.g., 2.25, 2.75) with split-stake logic. Previously, push probability was incorrectly 0 for quarter lines.

    • Fixed FootballProbabilityGrid.asian_handicap_probs() where handicap signs were inverted. Negative lines (e.g., -0.5) and positive lines (e.g., +0.5) could produce swapped win/lose probabilities.
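To make the split-stake idea for quarter lines concrete, here is a minimal sketch that treats a quarter line as half the stake on each adjacent line and averages the two results. This is one common convention and is an assumption here, not necessarily how totals() is implemented; over_line, over_quarter_line and the total-goals distribution are hypothetical.

```python
def over_line(total_probs, line):
    """(win, push, lose) for backing 'over' on a whole or half line.

    total_probs[n] is the probability of exactly n total goals.
    """
    win = sum(p for n, p in enumerate(total_probs) if n > line)
    push = sum(p for n, p in enumerate(total_probs) if n == line)
    lose = sum(p for n, p in enumerate(total_probs) if n < line)
    return win, push, lose


def over_quarter_line(total_probs, line):
    """Split the stake across the two adjacent lines and average the legs."""
    if (line * 4) % 2 != 1:
        raise ValueError("line must be a quarter line such as 2.25 or 2.75")
    lower_leg = over_line(total_probs, line - 0.25)
    upper_leg = over_line(total_probs, line + 0.25)
    return tuple(0.5 * (a + b) for a, b in zip(lower_leg, upper_leg))


# Made-up distribution over 0..4 total goals, for illustration only.
total_probs = [0.10, 0.20, 0.30, 0.25, 0.15]
win, push, lose = over_quarter_line(total_probs, 2.25)
print(f"over 2.25: win={win:.3f} push={push:.3f} lose={lose:.3f}")
```

Under this convention the push probability for over 2.25 is half the probability of exactly two goals, because only the 2.0 leg can be refunded.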

  • Performance

    • Precomputed total goals grid in FootballProbabilityGrid.__post_init__ to avoid redundant calculation in repeated totals() calls.

    • Removed redundant negative fraction handling code in totals() and asian_handicap_probs() methods.

  • Documentation

    • Updated documentation to reflect quarter line support in the totals() method.

v1.8.0 (2026-01-08)#

  • Goal Models

    • Added new BayesianGoalModel and HierarchicalBayesianGoalModel models

    • Added Cythonized MCMC sampler for Bayesian Modelling

  • Scraping

    • Fixed Understat scraper to use new API endpoints (getLeagueData, getMatchData, getPlayerData) instead of parsing embedded JavaScript from HTML pages

    • Fixed FBRef scraper by using wrapper-tls-requests to bypass Cloudflare TLS fingerprinting protection

v1.7.1 (2025-12-24)#

  • Goal Models

    • Added public params_array and param_indices functions to all goal models. This makes it easier to work with a model’s parameters without having to rely on its internal implementation details. Thank you to Sebastian Velandia for this contribution.

v1.7.0 (2025-11-30)#

  • Opta API Integration (penaltyblog.matchflow.contrib.opta):

    • Added built-in integration with Stats Perform (Opta) API, allowing for lazy loading of data streams (data is fetched only on .collect() or .to_pandas()).

    • Added support for major endpoints including events, matches, season_stats, referees, standings, and pass_matrix.

    • Added opta_helpers module for human-readable filtering (e.g., where_opta_event("Shot")) to avoid manual ID lookups.

    • Added get_opta_mappings() to explore available event types and qualifiers.

    • Added support for authenticated access via environment variables or credential dictionaries, including proxy configuration support.

    • Added documentation and examples for using the Opta API integration.

v1.6.2 (2025-10-22)#

Package Updates#

  • Fixed bug in PoissonGoalsModel where weights parameter was not being handled correctly in the gradient function.

v1.6.1 (2025-10-17)#

Package Updates#

  • Updated goals models loss functions to work with scipy 1.16+

  • Improved numerical stability of the loss function for the Negative Binomial model to improve convergence

  • Added colab notebook for implied probabilities examples

  • Python 3.14 support

v1.6.0 (2025-09-23)#

Package Updates#

  • MatchFlow

    • Can now read/write data from cloud storage (e.g. S3, GCS, Azure Blob Storage) using fsspec

    • Now supports multiple join strategies:

      • left, right, outer, inner and anti joins

      • Automatic type inference and conversion for join keys

      • Customizable type coercion functions for complex join key scenarios

  • Fixed bug where the executor did not recognise the .concat() function

  • Updated implied submodule to add logarithmic overround removal method and return structured results

  • Betting

    • Renamed kelly submodule to betting

    • Added multiple_criterion function for calculating Kelly Criterion for multiple outcomes

    • Added arbitrage_hedge function to calculate hedge bet sizes

    • Added arbitrage_opportunities function to identify arbitrage opportunities across bookmakers

    • Added value_bets function to identify value bets based on model probabilities

    • Added odds_conversion function to convert between different odds formats (decimal, fractional, American)

    • Updated all betting utility functions to return structured output
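The arithmetic behind odds conversion and arbitrage detection can be sketched in a few lines. The functions below are hypothetical stand-ins written for illustration; they are not the betting submodule's API, and their names, arguments and return types are assumptions.

```python
from fractions import Fraction


def decimal_to_american(decimal_odds):
    """Convert decimal odds to American (moneyline) odds."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if decimal_odds >= 2.0:
        return (decimal_odds - 1.0) * 100.0
    return -100.0 / (decimal_odds - 1.0)


def decimal_to_fractional(decimal_odds, max_denominator=100):
    """Convert decimal odds to fractional odds, e.g. 2.5 -> 3/2."""
    return Fraction(decimal_odds - 1.0).limit_denominator(max_denominator)


def arbitrage_stakes(best_odds, bankroll=100.0):
    """Stakes that equalise the payout across outcomes, or None if no arbitrage.

    best_odds holds the best available decimal odds for each outcome.
    """
    inverse = [1.0 / odds for odds in best_odds]
    margin = sum(inverse)
    if margin >= 1.0:
        return None  # the book is not beatable at these prices
    return [bankroll * inv / margin for inv in inverse]


print(decimal_to_american(2.5), decimal_to_american(1.5))
print(decimal_to_fractional(2.5))
print(arbitrage_stakes([2.1, 2.1]))
```

An arbitrage exists only when the inverse odds sum to less than 1; staking in proportion to each inverse then returns the same payout whichever outcome occurs.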

Documentation Improvements#

  • Updated Matchflow documentation

  • Updated implied documentation

  • Updated betting documentation

  • Started adding Colab notebooks for interactive examples, more to come!

v1.5.1 (2025-08-20)#

Package Updates#

  • Restricted scipy to version <=1.15.3 due to breaking changes in the minimize function introduced in 1.16+, which affect model compatibility.

v1.5.0 (2025-08-15)#

Package Updates#

  • Pitch

    • Initial release of interactive Pitch plotting library

  • MatchFlow

    • Flow now has its own query language, with support for boolean expressions and field comparisons via .query

  • Goals Models

    • The .fit functions of all goals models now take an optional dictionary of arguments to pass to scipy’s optimiser

    • All goals models now fit using an optional gradient (enabled by default), which reduces fit time by approximately 5-10x

  • FootballProbabilityGrid

    • Updated class to include more betting markets

    • Now supports fractional Asian handicaps and totals

    • Optionally normalizes probabilities to sum to 1 (default: True)

    • Calculations now use vectorized numpy operations for improved performance

    • Caching of results for repeated queries to improve efficiency
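For the Asian handicap market, the sign convention is easy to get wrong, so a small sketch may help. It assumes the line is quoted from the home side's perspective and covers whole and half lines only; home_handicap_probs and the 3x3 grid are hypothetical and made up for illustration.

```python
def home_handicap_probs(grid, line):
    """(win, push, lose) for backing the home side at a whole or half line.

    grid[x][y] is P(home scores x, away scores y). A line of -0.5 means the
    home side starts half a goal behind; +0.5 means half a goal ahead.
    """
    win = push = lose = 0.0
    for x, row in enumerate(grid):
        for y, p in enumerate(row):
            adjusted_margin = x - y + line
            if adjusted_margin > 0:
                win += p
            elif adjusted_margin == 0:
                push += p
            else:
                lose += p
    return win, push, lose


# Made-up 3x3 score grid (probabilities sum to 1).
grid = [
    [0.10, 0.10, 0.05],
    [0.15, 0.15, 0.05],
    [0.20, 0.15, 0.05],
]
print(home_handicap_probs(grid, -0.5))  # home must win outright
print(home_handicap_probs(grid, +0.5))  # home wins the bet on a win or draw
```

With this grid the home side wins outright half the time, so -0.5 is a coin flip, while +0.5 also collects on the draws.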

  • Goal Expectancy

    • Added support for removing overrounding from input probabilities

    • Improved handling of edge cases in probability distributions

    • Now takes probabilities rather than odds as input

    • Added more diagnostic output for debugging

    • Optionally normalizes probabilities to sum to 1
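The idea of inferring goal expectancies from 1X2 probabilities can be sketched as a search for the pair of Poisson rates whose implied match odds best fit the inputs. The version below normalises the inputs to remove the overround and uses a coarse grid search as a stand-in for a proper optimiser; it assumes independent Poisson scores, and match_odds and infer_expectancies are hypothetical names, not the library's functions.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals for a Poisson rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def match_odds(lam_home, lam_away, max_goals=10):
    """1X2 probabilities under independent Poisson scoring."""
    home_pmf = [poisson_pmf(k, lam_home) for k in range(max_goals + 1)]
    away_pmf = [poisson_pmf(k, lam_away) for k in range(max_goals + 1)]
    home = draw = away = 0.0
    for x, px in enumerate(home_pmf):
        for y, py in enumerate(away_pmf):
            if x > y:
                home += px * py
            elif x == y:
                draw += px * py
            else:
                away += px * py
    total = home + draw + away
    return home / total, draw / total, away / total


def infer_expectancies(p_home, p_draw, p_away, step=0.05, max_lambda=3.0):
    """Coarse grid search for the rates that best reproduce the 1X2 inputs."""
    total = p_home + p_draw + p_away
    target = (p_home / total, p_draw / total, p_away / total)  # drop overround
    best = None
    n_steps = int(round(max_lambda / step))
    for i in range(1, n_steps + 1):
        for j in range(1, n_steps + 1):
            lam_home, lam_away = i * step, j * step
            fitted = match_odds(lam_home, lam_away)
            error = sum((a - b) ** 2 for a, b in zip(fitted, target))
            if best is None or error < best[0]:
                best = (error, lam_home, lam_away)
    return best[1], best[2]


# Made-up bookmaker probabilities that sum to more than 1 (overround).
print(infer_expectancies(0.52, 0.27, 0.26))
```

A gradient-based or simplex optimiser would be faster and more precise; the grid search is used here only to keep the sketch dependency-free.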

Documentation Improvements#

  • Added Pitch documentation

  • Updated Flow documentation with .query examples

  • Completely rewritten documentation for Goals Models and goal expectancy

  • Removed obsolete examples

v1.4.1 (2025-06-24)#

Package Updates#

  • Fixed bug in Flow.cache executor logic

v1.4.0 (2025-06-19)#

Package Updates#

  • Introduced optional FlowOptimizer for smart plan rewrites

    • New optimize=True flag on all flows (off by default)

    • Safe, conservative rewrites: pushdown, fusion, and simplification

    • Enhanced .explain(compare=True) for before/after plan introspection

    • Optimizer is backwards-compatible and fully opt-in

  • Added .plot_plan() on Flow and FlowGroup to visualize pipeline structure

  • .explain() now works on FlowGroup, and supports compare=True

  • New .with_schema({...}) method to cast and validate fields

    • Example: Flow.with_schema({"x": int, "ts": parse_datetime})

  • Added .rolling_summary() to FlowGroup for windowed group summaries (e.g. rolling 5-minute aggregates per player or team)

  • Added .time_bucket() to FlowGroup for time-based binning summaries

  • Added .show() method to pretty-print results using tabulate

  • Flow.collect() now supports optional progress bars during execution

Documentation Improvements#

  • Refreshed documentation to include:

    • FlowOptimizer and the optimize=True flag

    • .with_schema, .rolling_summary, .show()

    • Plan introspection via .explain(compare=True) and .plot_plan()

  • Enhanced type hints throughout the package for improved compatibility with mypy.

v1.3.0 (2025-05-20)#

Package Updates#

  • Initial release of MatchFlow

Documentation Improvements#

  • Added MatchFlow documentation

  • Added MatchFlow recipes documentation

  • Added API references for all of penaltyblog

  • Added stub file for metric Cython code

  • Added stub file for model Cython code

v1.2.0 (2025-04-10)#

Package Updates#

  • Updated Elo Ratings model to be more football-specific so that it now includes home field advantage and can predict draw probabilities

  • Added new Cythonised Ignorance Score metric

  • Added new Cythonised Multiclass Brier Score metric

  • RPS functions now raise a ValueError exception if outcome is out of bounds
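For reference, the three metrics mentioned above can be written out in plain Python. This is a sketch of the textbook definitions, not the Cython implementations: the function names are hypothetical, outcomes are assumed to be zero-based indices into the probability list, and the Brier score is shown as an unnormalised sum, which is only one of several conventions.

```python
import math


def _check_outcome(probs, outcome):
    """Raise ValueError when the outcome index does not match the forecast."""
    if not 0 <= outcome < len(probs):
        raise ValueError("outcome is out of bounds")


def ranked_probability_score(probs, outcome):
    """RPS for ordered outcomes, e.g. [home, draw, away]; lower is better."""
    _check_outcome(probs, outcome)
    cumulative_forecast = cumulative_observed = score = 0.0
    for i, p in enumerate(probs[:-1]):
        cumulative_forecast += p
        cumulative_observed += 1.0 if i == outcome else 0.0
        score += (cumulative_forecast - cumulative_observed) ** 2
    return score / (len(probs) - 1)


def ignorance_score(probs, outcome):
    """Negative log2 of the probability given to what actually happened."""
    _check_outcome(probs, outcome)
    return -math.log2(probs[outcome])


def multiclass_brier_score(probs, outcome):
    """Sum of squared differences between forecast and one-hot outcome."""
    _check_outcome(probs, outcome)
    return sum(
        (p - (1.0 if i == outcome else 0.0)) ** 2 for i, p in enumerate(probs)
    )


forecast = [0.5, 0.3, 0.2]  # made-up home/draw/away probabilities
print(ranked_probability_score(forecast, 0))
print(ignorance_score(forecast, 0))
print(multiclass_brier_score(forecast, 0))
```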

Documentation Improvements#

  • Updated Elo documentation

  • Added Pi Ratings documentation

  • Added examples for ignorance score

  • Added examples for multiclass Brier score

  • Updated examples for RPS

v1.1.0 (2025-03-15)#

Performance Enhancements#

  • Rewrote Dixon-Coles model using Cython, achieving approximately 250x speed improvement.

  • Rewrote Poisson model using Cython, achieving approximately 250x speed improvement.

  • Implemented Negative Binomial Goals Model in Cython for enhanced performance.

  • Added high-performance Cython implementation of the Bivariate Poisson Goals Model based on Karlis & Ntzoufras.

  • Introduced Cython implementation of the Bivariate Weibull Count Copula Goals Model (Boshnakov et al. paper).

  • Added Pi Ratings System (Constantinou paper).

  • Migrated ranked probability score functions to Cython for improved speed.

Package Updates#

  • Temporarily removed Stan-based models due to dependency management challenges. Investigating improved packaging strategies for future reintegration.

  • Temporarily removed Rue and Salvesen model pending revision to accurately reflect its intended methodology (previously implemented as a hybrid Dixon-Coles variant).

Documentation Improvements#

  • Updated and expanded model examples in the documentation.

  • Enhanced type hints throughout the package for improved compatibility with mypy.

  • Updated documentation to pydata Sphinx theme.

CI/CD and Testing#

  • Expanded GitHub Actions workflows to perform unit tests across all supported Python versions.

  • Extended GitHub Actions workflows to perform unit tests on Windows, macOS, and Linux.

  • Configured GitHub Actions to automatically build wheels for all supported Python versions across Windows, macOS, and Linux.

v1.0.4 (2025-01-10)#

Package Updates#

  • Moved Stan code to separate files to prevent access denied issues on Windows.

v1.0.3 (2024-12-19)#

Bug Fixes#

  • Fixed bug in how the Bayesian models indexed teams in the predict function.

  • Goals models now only predict individual team names rather than iterables of team names, fixing compatibility issues between different sequence objects.

v1.0.2 (2024-12-18)#

Bug Fixes#

  • Updated how the Bayesian models handle the Stan files to prevent access denied issues on Windows.

v1.0.1 (2024-12-13)#

Improvements#

  • Updated install_stan to install the C++ toolchain on Windows if required.

v1.0.0 (2024-12-12)#

Performance Enhancements#

  • Removed pymc as a dependency.

  • Optimized RPS calculation.

  • Optimized Elo code.

  • Optimized Kelly Criterion code.

  • Updated FootballProbabilityGrid to store its internal matrix as a NumPy array.

Model Updates#

  • Rewrote BayesianHierarchicalGoalModel in Stan instead of pymc, updating the prediction method to integrate over the posterior rather than sampling the mid-point.

  • Rewrote BayesianRandomInterceptGoalModel in Stan, improved the random intercept, and updated the prediction method.

  • Rewrote BayesianBivariateGoalModel in Stan for better convergence and updated the prediction method.

  • Added BayesianSkellamGoalModel for predicting football match outcomes using the Skellam distribution.

Package Updates#

  • Added support for Python 3.13.

  • Removed obsolete SoFifa and ESPN scrapers.

  • Updated all example notebooks.

  • Increased unit test coverage.

  • Added CI/CD workflows.

  • Removed Poetry from the build step.

  • Updated documentation.

  • Added type hinting to Colley and Massey classes.

v0.8.1 (2023-09-31)#

Bug Fixes#

  • Changed FBRef born column to Int64 dtype to allow NULL values.

v0.8.0 (2023-08-31)#

New Features#

  • Added initial Backtest framework for backtesting betting strategies.

  • Added function to calculate the Kelly Criterion.

  • Added class for calculating Elo ratings.
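As a reminder of what the Kelly Criterion computes, here is a minimal single-bet sketch. The function name and its fraction argument for fractional Kelly are hypothetical and chosen for illustration; they are not taken from the package.

```python
def kelly_fraction(decimal_odds, probability, fraction=1.0):
    """Share of bankroll to stake on a single bet; 0.0 when there is no edge.

    decimal_odds: bookmaker decimal odds (must exceed 1.0)
    probability:  your estimated chance of the bet winning
    fraction:     scale factor, e.g. 0.5 for half Kelly
    """
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    net_odds = decimal_odds - 1.0  # profit per unit staked
    full_kelly = (net_odds * probability - (1.0 - probability)) / net_odds
    return max(0.0, full_kelly) * fraction


print(kelly_fraction(2.0, 0.6))
print(kelly_fraction(2.0, 0.6, fraction=0.5))
print(kelly_fraction(2.0, 0.4))
```

Clamping at zero reflects the usual reading that a negative Kelly stake means the bet should be skipped.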

Bug Fixes#

  • Fixed bug in FBRef scraper for player age and year of birth.

  • All goal models can now accept iterables as team inputs.

  • Fixed mapping of Belgium leagues in the FootballData scraper.

v0.7.0 (2023-03-13)#

New Features#

  • Added FBRef scraper.

Package Updates#

  • Minimum Python version supported is now Python 3.8.

v0.6.1 (2023-01-06)#

Bug Fixes#

  • Tweaked Understat scraper to avoid bot detection.

v0.6.0 (2022-12-02)#

New Features#

  • Added goal_expectancy function.

  • Added Bayesian Random Intercept Model.

Performance Enhancements#

  • Tweaked pymc settings for Bayesian goal models to improve speed.

Bug Fixes#

  • Fixed bug in Bayesian Bivariate Goals Model.

  • Fixed bug in FootballData scraper where a null value was breaking the index column.

v0.5.1 (2022-11-03)#

Bug Fixes#

  • Fixed bug in goal models when printing an instance before fitting it.

  • Fixed bug in Bayesian goal models’ weighted decay.

  • Fixed default value of xi in dixon_coles_weights to 0.0018.
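To show what a decay rate of 0.0018 means in practice, the sketch below computes exponential time-decay weights in the style of Dixon and Coles, assuming xi is applied per day of match age. time_decay_weights is a hypothetical helper written for illustration and its signature is an assumption.

```python
import math
from datetime import date


def time_decay_weights(match_dates, xi=0.0018, reference=None):
    """Weight each match by exp(-xi * age in days); the newest match gets 1.0."""
    if reference is None:
        reference = max(match_dates)
    return [math.exp(-xi * (reference - d).days) for d in match_dates]


# Made-up fixture dates.
match_dates = [date(2022, 1, 1), date(2022, 6, 1), date(2022, 11, 1)]
weights = time_decay_weights(match_dates)
for d, w in zip(match_dates, weights):
    print(d.isoformat(), round(w, 3))
```

Setting xi to 0 gives every match equal weight, and larger values discount older matches more aggressively.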

v0.5.0 (2022-10-11)#

New Features#

  • Added get_player_season and get_player_shots to Understat scraper.

  • Added Bayesian Hierarchical Goal Model.

  • Added Bayesian Bivariate Poisson Goal Model.

  • Added Bayesian Random Intercept Poisson Goal Model.

Bug Fixes#

  • get_fixtures in Understat scraper now only returns completed fixtures (consistent with FootballData scraper).

  • Fixed bug in FootballData scraper for older seasons missing the Time column.

Package Updates#

  • Added SoFifa scraper.

  • Added compatibility for Python 3.7.

v0.4.0 (2022-08-08)#

General Improvements#

  • General bug fixes.

  • Reorganized internal package structure.

  • Added unit tests.

  • Added documentation and uploaded to ReadTheDocs.

New Features#

  • Added FPL scraper.

  • Added FPL optimizer.

  • Added ESPN scraper.

  • Added Understat scraper.

  • Added pre-commit checks to repository.

  • Added both-teams-to-score probability to football goals models.

  • Refactored FootballData scraper for consistency with other scrapers.

  • Refactored Club Elo scraper for consistency with other scrapers.
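The both-teams-to-score probability mentioned above is the total probability of every scoreline where each side scores at least once. The sketch below computes it from a score grid and checks it against the closed form for independent Poisson scoring; the helper names and the expected-goals values are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals for a Poisson rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def btts_from_grid(grid):
    """Sum P(home scores x, away scores y) over all x >= 1 and y >= 1."""
    return sum(
        p
        for x, row in enumerate(grid)
        for y, p in enumerate(row)
        if x >= 1 and y >= 1
    )


lam_home, lam_away = 1.5, 1.2  # made-up expected goals
grid = [
    [poisson_pmf(x, lam_home) * poisson_pmf(y, lam_away) for y in range(13)]
    for x in range(13)
]
btts = btts_from_grid(grid)

# Closed form when scores are independent: P(home >= 1) * P(away >= 1).
closed_form = (1 - math.exp(-lam_home)) * (1 - math.exp(-lam_away))
print(round(btts, 4), round(closed_form, 4))
```

The grid version generalises to models with correlated scores, where the closed form no longer applies.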

Performance Enhancements#

  • Refactored Colley ratings and Massey ratings for consistency.

  • Updated example notebooks and included them in documentation.