Benchmarks

This page documents typical performance for the solver and export operations, and explains how to reproduce the measurements locally.

Methodology

  • Hardware: record the CPU model and RAM (e.g. Apple M2 Pro, 16 GB RAM).
  • OS: macOS 14.x or Ubuntu 22.04.
  • Python: 3.11 or 3.12.
  • NumPy: tested on 1.26.x and 2.x.
  • Environment: pip install -r requirements-lock.txt --require-hashes followed by pip install -e . for archival runs, or pip install -e .[dev] for local development. Optionally add .[parquet] for Parquet runs.

We use pytest-benchmark to time hot paths. Passing --benchmark-min-time=0.1 keeps each measurement well above the timer's resolution, which stabilizes the results.
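
A minimal sketch of such a benchmark, assuming a hypothetical solve_all() entry point (the real module path and function name will differ):

from mypackage.solver import solve_all  # hypothetical import path

def test_solver_full_enumeration(benchmark):
    # the benchmark fixture calls solve_all repeatedly and records timing stats
    result = benchmark(solve_all)
    assert result is not None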

Deterministic protocol

To keep CPU threading and hashing deterministic across runs, export the following before starting Python (PYTHONHASHSEED and the BLAS/OpenMP thread counts are read at startup):

export PYTHONHASHSEED=0
export MKL_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1

Then run the benchmarks with a warmup:

pytest -q -k benchmark --benchmark-min-time=0.1 --benchmark-warmup=on --benchmark-warmup-iterations=10
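
Optionally, a small guard can fail fast when the protocol variables are missing. A sketch, assuming it lives in the test suite's conftest.py:

import os

def pytest_sessionstart(session):
    # fail fast if the deterministic-protocol variables are not set
    required = ("PYTHONHASHSEED", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS")
    missing = [v for v in required if v not in os.environ]
    if missing:
        raise RuntimeError(f"set {', '.join(missing)} for reproducible benchmarks")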

Reproduce locally

pip install -r requirements-lock.txt --require-hashes
pip install -e .[dev]
# optional for parquet
pip install .[parquet]

pytest -q -k benchmark --benchmark-min-time=0.1

Sample results (indicative)

  • Perfect-play solver (enumerates all reachable states, with memoization): typically <100 ms.
  • Dataset export (CSV, canonical-only, no augmentation): <500 ms.
  • Dataset export (CSV and Parquet together): slightly higher, due to serialization overhead.

pytest-benchmark output example:

---------------------------------------------------------- benchmark: 2 tests ----------------------------------------------------------
Name (time in ms)                    Min        Max       Mean   StdDev     Median      IQR  Outliers   OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------
test_solver_full_enumeration      45.000     60.000     50.000    3.000     49.500    4.000       2;0  20.0      20           1
test_export_small_csv            180.000    260.000    200.000   15.000    198.000   20.000       1;1   5.0      10           1
-----------------------------------------------------------------------------------------------------------------------------------------

Numbers vary by machine and Python/NumPy version, but stay within the same order of magnitude.

Notes

  • All pipelines are pure Python, with NumPy-vectorized components where applicable.
  • CSV row order is deterministic; the sort adds minimal cost but is essential for reproducibility (see the sketch below).
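
A minimal sketch of the deterministic-ordering idea; the field names (state_id, value) are hypothetical:

import csv

def write_rows_deterministic(rows, path):
    # sort on a canonical key so output order does not depend on
    # dict/set iteration order; state_id is a hypothetical field
    ordered = sorted(rows, key=lambda r: r["state_id"])
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["state_id", "value"])
        writer.writeheader()
        writer.writerows(ordered)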

CI artifacts

Automated runs publish raw pytest-benchmark JSON and a small summary tied to:

  • Git commit
  • OS/arch and Python version
  • NumPy version
  • Environment lock identifiers (hash of requirements-lock.txt and/or conda-lock)

See the CI job artifacts for the current branch. The JSON can be downloaded and compared with the pytest-benchmark compare command, as shown below.
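
For example, two runs saved with --benchmark-autosave (run IDs such as 0001 are the auto-generated names under .benchmarks/) can be compared like this:

pytest -q -k benchmark --benchmark-min-time=0.1 --benchmark-autosave
pytest-benchmark compare 0001 0002 --sort=name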