Roadmap
This roadmap outlines planned features, ideas under exploration, and long-term goals for penaltyblog.
It's not a guarantee, but a guide - contributions, feedback, and suggestions are welcome!
Planned
MatchFlow
Usability + Helper Expansion
- General speed optimisations, including Cythonization
- More `where_` and `get_` helpers
- `Flow.describe()` improvements
- Docs: Writing custom helpers tutorial
- Docs: More `Flow` recipes
- Add a plugin interface to make it easy to add other data providers
- Progress bars
- Custom query DSL for natural querying, e.g. `flow.query("player.name == 'Kevin de Bruyne'")`
- Optimization of internal DAG plan
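To make the query DSL idea concrete, here is a minimal sketch of how a string such as the one above could be turned into a filter over nested records. It is an illustration only, not penaltyblog's implementation: `compile_query` and `get_path` are hypothetical names, and the sketch assumes a single `field <op> literal` comparison with no `and`/`or` support.

```python
import ast
import operator
import re

# Hypothetical sketch: none of these names are part of penaltyblog.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
# Two-character operators are listed first so ">=" is not read as ">".
_PATTERN = re.compile(r"^\s*([\w.]+)\s*(==|!=|>=|<=|>|<)\s*(.+?)\s*$")


def get_path(record, dotted):
    """Follow a dotted path such as 'player.name' through nested dicts."""
    current = record
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def compile_query(expr):
    """Turn "field.path <op> literal" into a predicate over one record."""
    match = _PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unsupported query: {expr!r}")
    path, op, literal = match.groups()
    value = ast.literal_eval(literal)  # parses literals only, never runs code
    compare = _OPS[op]

    def predicate(record):
        found = get_path(record, path)
        # Records that lack the field never match.
        return found is not None and compare(found, value)

    return predicate


events = [
    {"player": {"name": "Kevin de Bruyne"}, "shot": {"xg": 0.4}},
    {"player": {"name": "Erling Haaland"}, "shot": {"xg": 0.1}},
]
is_kdb = compile_query("player.name == 'Kevin de Bruyne'")
matches = [event for event in events if is_kdb(event)]
```

Parsing the right-hand side with `ast.literal_eval` keeps the sketch safe for untrusted query strings, at the cost of only supporting Python literals.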
Joins & I/O Enhancements
- Join-on-multiple-fields support
- Benchmarks page in docs
- Parallel loading of files
Rolling & Windowed Aggregates
- `.rolling(...)` and `.expanding(...)` on grouped flows
- Support for rolling summary fields like moving average xG
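As a rough picture of what a rolling summary field means here, the sketch below computes a trailing moving average of xG per group in plain Python. It is not the planned `.rolling(...)` API: the function name `rolling_mean_by_group`, the output field name, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def rolling_mean_by_group(records, group_key, value_key, window):
    """Yield a copy of each record with a trailing per-group moving average.

    Hypothetical helper, not part of penaltyblog. Records are assumed to
    arrive already sorted in the order the window should follow.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # One bounded buffer per group; deque drops the oldest value itself.
    recent = defaultdict(lambda: deque(maxlen=window))
    for record in records:
        values = recent[record[group_key]]
        values.append(record[value_key])
        yield {
            **record,
            f"{value_key}_rolling_mean": sum(values) / len(values),
        }


shots = [
    {"team": "A", "xg": 0.2},
    {"team": "B", "xg": 1.0},
    {"team": "A", "xg": 0.4},
    {"team": "A", "xg": 0.6},
]
smoothed = list(rolling_mean_by_group(shots, "team", "xg", window=2))
```

Until a group has seen `window` values, the average is taken over whatever values are available so far; an expanding aggregate would be the same loop with an unbounded buffer.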
Plotting
- Publish penaltyblog plotting library
- Native support for plotting Flow pipelines
Models
- Bring the Bayesian models back to the party
- Add new models based on time-series approaches
- Pre-trained models, e.g. xT
- Updated player ratings model
Scrapers
- Give scraper module an overhaul to make it more efficient and easier to use
- Add support for new data sources such as Sofa Score
- Add automatic throttling to avoid overloading servers
- Hook up scrapers to MatchFlow
- Caching of scraped data sources
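One common way to implement automatic throttling is to enforce a minimum interval between requests. The sketch below shows that idea in isolation; the `Throttle` class and its parameters are hypothetical and say nothing about how the scraper module will actually do it. The clock and sleep functions are injectable so the behaviour can be checked without real delays.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls to wait().

    Hypothetical sketch, not part of penaltyblog.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demonstration with a fake clock, so nothing really sleeps.
fake_time = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    fake_time[0] += seconds


throttle = Throttle(2.0, clock=lambda: fake_time[0], sleep=fake_sleep)
throttle.wait()        # first call goes straight through
fake_time[0] += 0.5    # pretend the request took half a second
throttle.wait()        # sleeps for the remaining part of the interval
```

A scraper would call `throttle.wait()` immediately before each HTTP request; `time.monotonic` is the default clock because it is unaffected by system clock adjustments.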
General
- Refresh / expand rest of documentation
🧪 Under Exploration
These are bigger ideas I'm researching - feedback welcome!
MatchFlow
- FlowZ: A custom binary format for fast I/O on nested JSON
- Partitioning of large datasets for faster processing
- Built-in indexing or predicate pushdown
- Streaming joins for large datasets
- A lightweight visual data explorer (maybe based on my upcoming plotting library)
- Declarative YAML/JSON pipeline definitions
- Pluggable transforms (e.g. xT, formation_detection, pressing_zones)
Models
- Custom Bayesian library focussed on building sports models without dependency hassles
Contributing
If you're interested in helping with anything here, feel free to open an issue, submit a PR, or just reach out.