Skip to content

Theory

Minimax with strict tie-breaks

We solve states from the perspective of the side-to-move using exact minimax with memoization. The value v ∈ {+1, 0, −1} denotes win/draw/loss under perfect play. Among moves with equal v we break ties by distance-to-terminal (plies):

  • Win ≻ Draw ≻ Loss
  • Among wins/draws: prefer the shortest plies-to-end
  • Among losses: prefer the longest plies-to-end (delay the loss)

This induces a total preorder over legal actions. Our solver returns per-action q_values ∈ {+1,0,−1} and dtt_action (distance-to-terminal after taking the action), and the set of optimal_moves obeying the policy above.

Symmetry group (D4) and canonicalization

Tic-tac-toe on a 3×3 board admits the dihedral-4 (D4) symmetry group with 8 elements: {id, rot90, rot180, rot270, hflip, vflip, d1, d2}. We precompute index mappings for each transformation and use them to:

  • Canonicalize states by selecting the lexicographically smallest image across all 8 transforms.
  • Remap action indices under symmetry for augmentation and invariance checks.

For a board b and transform g ∈ D4, let g·b be the transformed board and g·a the transformed action index. The solver’s value is invariant: V(g·b) = V(b); optimal moves map equivariantly: g·ArgMax(b) = ArgMax(g·b).

Complexity and state graph

From the empty board, 5478 states are reachable (4520 nonterminal; terminal split: X=626, O=316, draw=16). This is a tiny graph and the exact backward induction + memoization completes in milliseconds on commodity hardware. Our features and datasets are O(#states) and fit comfortably in memory.