Batch::iteration: 21.36 µs (T1 was 22.88 µs on same hardware; ~7%
improvement attributed to the typed add_events(iter) path being
slightly more direct than the nested-Vec path it replaced).
Gaussian operations unchanged vs T1.
Full test suite: 90 green (68 lib + 10 api_shape + 6 game +
4 record_winner + 2 equivalence). No golden value changed across
the entire T2 tier.
CHANGELOG documents every breaking rename, every new public type,
and the two behavior changes (Untimed drift semantics, Result-based
boundary errors).
Closes T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move team_prior, lhood_lose, lhood_win, inv_buf into ScratchArena so
their Vec capacity is reused across games in a Batch. Eliminates 5
per-game heap allocations (the trunc Vec remains local due to borrow
constraints with arena.vars).
Batch::iteration: 23.0 µs (down from 27.0 µs with naive local Vecs;
8% above T0 21.253 µs baseline due to TruncFactor propagate overhead).
Batch::iteration: 29.840 µs → 21.253 µs (1.40×)
Gaussian::mul: 1.568 ns → 218.69 ps (7.17×)
Gaussian::div: 1.572 ns → 218.64 ps (7.19×)
Gaussian arithmetic hit target (7×+ vs 1.5–2× expected). Batch::iteration
reached 1.40× vs the 3× target. Post-mortem: the bench exercises 100 tiny
2-team events and the dominant cost is still Vec allocation in within_priors,
sort_perm, and Game::likelihoods. The HashMap→Vec win shows at the History
level (forward/backward sweep) which this bench doesn't exercise.
Remediation plan documented in benches/baseline.txt: arena-ify sort_perm,
within_priors, and Game::likelihoods in T1 when Game's internals are
redesigned around the new factor graph.
38/38 tests passing. Closes T0 tier.
- Promotes Gaussian::pi and Gaussian::tau to public so benches/gaussian.rs
compiles, then captures baseline numbers for the T0 acceptance gate.
- Fixes the divide bench: g1/g2 panicked (g1 has lower precision than g2;
cavity requires pi_num >= pi_den). Swapped to g2/g1 (well-defined).
Baseline on Apple M5 Pro:
Batch::iteration 29.840 µs
Gaussian::mul 1.568 ns (vs ~220 ps for add/sub — hot path)
Gaussian::div 1.572 ns