Benchmarks¶
This page documents typical performance for solver and export operations and how to reproduce locally.
Methodology¶
- Hardware: note CPU model and RAM. Example: Apple M2 Pro, 16GB RAM.
- OS: macOS 14.x or Ubuntu 22.04.
- Python: 3.11 or 3.12.
- NumPy: tested on 1.26.x and 2.x.
- Environment: `pip install -r requirements-lock.txt --require-hashes && pip install -e .` for archival runs, or `pip install .[dev]` for local dev. Optionally add `.[parquet]` for Parquet runs.
We use `pytest-benchmark` to time hot paths. Running with `--benchmark-min-time=0.1` keeps measurements stable by avoiding tiny-timer noise.
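For illustration, a benchmarked hot path looks roughly like the sketch below, mirroring the test name shown in the sample results further down. The `myproject.solver.solve_all` import is an assumption and stands in for the project's actual solver entry point.

```python
# Minimal pytest-benchmark sketch (hypothetical module and function names).
from myproject.solver import solve_all  # assumption: enumerates all reachable states


def test_solver_full_enumeration(benchmark):
    # pytest-benchmark calls the function repeatedly and reports min/mean/stddev.
    result = benchmark(solve_all)
    assert result is not None  # sanity check, not part of the timing
```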
Deterministic protocol¶
To keep CPU threading and hashing deterministic across runs, set:
```bash
export PYTHONHASHSEED=0
export MKL_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
```
Then run the benchmarks with a warmup:
```bash
pytest -q -k benchmark --benchmark-min-time=0.1 --benchmark-warmup=on --benchmark-warmup-iterations=10
```
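If you want the single-threaded settings enforced rather than just documented, a small guard in `conftest.py` can skip benchmarks when they are missing. This is a sketch of one possible guard, not something that exists in the repository:

```python
# conftest.py sketch (assumption: not part of the existing suite).
# Skips benchmark tests when the single-threaded BLAS/OpenMP settings are not in place.
import os

import pytest

_REQUIRED = {
    "MKL_NUM_THREADS": "1",
    "OPENBLAS_NUM_THREADS": "1",
    "OMP_NUM_THREADS": "1",
}


def pytest_collection_modifyitems(config, items):
    missing = [k for k, v in _REQUIRED.items() if os.environ.get(k) != v]
    if not missing:
        return
    skip = pytest.mark.skip(reason=f"non-deterministic threading: set {', '.join(missing)}=1")
    for item in items:
        if "benchmark" in item.keywords or "benchmark" in item.name:
            item.add_marker(skip)
```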
Reproduce locally¶
```bash
pip install -r requirements-lock.txt --require-hashes
pip install -e .[dev]
# optional, for Parquet runs
pip install .[parquet]
pytest -q -k benchmark --benchmark-min-time=0.1
```
Sample results (indicative)¶
- Perfect-play solver (enumerate all reachable states, memoized): typically <100 ms.
- Dataset export (CSV, canonical-only, no augmentation): <500 ms.
- Dataset export (CSV + Parquet): slightly slower than CSV-only due to the extra serialization overhead.
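The CSV-only versus CSV + Parquet comparison could be reproduced with a parametrized benchmark along these lines; `export_dataset`, its `formats` parameter, and the module path are assumptions rather than the project's actual API:

```python
# Sketch of a format-comparison benchmark (hypothetical export_dataset API).
import pytest

from myproject.export import export_dataset  # assumption: the real entry point may differ


@pytest.mark.parametrize("formats", [("csv",), ("csv", "parquet")])
def test_export_formats(benchmark, tmp_path, formats):
    # Repeated calls overwrite the same temporary output; only the timing matters here.
    benchmark(export_dataset, out_dir=tmp_path, formats=formats)
```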
Example `pytest-benchmark` output:
```
------------------------------------------------------- benchmark: 2 tests -------------------------------------------------------
Name (time in ms)                 Min       Max      Mean  StdDev    Median      IQR  Outliers   OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------
test_solver_full_enumeration   45.000    60.000    50.000    3.00    49.500    4.000       2;0  20.0      20           1
test_export_small_csv         180.000   260.000   200.000   15.00   198.000   20.000       1;1   5.0      10           1
-----------------------------------------------------------------------------------------------------------------------------------
```
Numbers vary slightly by machine and Python/NumPy version but are consistently within the same order of magnitude.
Notes¶
- All pipelines are pure Python with vectorized components where applicable.
- CSV row order is deterministic; the sort adds minimal cost but is essential for reproducibility (see the sketch below).
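As a sketch of what that determinism can look like, rows can be sorted on a canonical key before writing. The row shape and the `canonical_state` column name are assumptions about the export code, not its actual implementation:

```python
# Sketch: deterministic CSV output by sorting on a canonical key (hypothetical row shape).
import csv


def write_rows_deterministically(rows, path):
    # Sorting on the canonical encoding makes identical inputs produce identical files.
    ordered = sorted(rows, key=lambda row: row["canonical_state"])  # assumed column name
    if not ordered:
        return
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(ordered[0].keys()))
        writer.writeheader()
        writer.writerows(ordered)
```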
CI artifacts¶
Automated runs publish raw `pytest-benchmark` JSON and a small summary tied to:
- Git commit
- OS/arch and Python version
- NumPy version
- Environment lock identifiers (hash of requirements-lock.txt and/or conda-lock)
See the CI job artifacts for the current branch. The JSON can be downloaded and compared with `pytest-benchmark compare`.
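For a quick local comparison without the CLI, two saved JSON artifacts can also be diffed directly. The sketch below assumes the standard layout written by `--benchmark-json` (a top-level `benchmarks` list with `name` and `stats.mean`); the file names are placeholders.

```python
# Sketch: compare mean times between two pytest-benchmark JSON artifacts.
import json


def mean_times(path):
    # Returns {test_name: mean_seconds} from a pytest-benchmark JSON file.
    with open(path) as fh:
        data = json.load(fh)
    return {b["name"]: b["stats"]["mean"] for b in data["benchmarks"]}


baseline = mean_times("baseline.json")   # placeholder file names
current = mean_times("current.json")
for name in sorted(baseline.keys() & current.keys()):
    ratio = current[name] / baseline[name]
    print(f"{name}: {ratio:.2f}x baseline")
```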