Computes the weighted sum of player performance Gaussians into a
team-performance variable. Runs once per game (no iteration needed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
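The weighted sum above can be sketched as follows. This is a minimal illustration of the standard rule for a weighted sum of independent Gaussians (mean is the weighted sum of means; variance is the sum of squared-weight-scaled variances), not the crate's actual TeamSum factor; the MeanVar type and team_performance name are hypothetical.

```rust
// Hypothetical sketch: combining player performance Gaussians into a
// team-performance Gaussian via a weighted sum. Under independence,
//   mu_team  = sum(w_i * mu_i)
//   var_team = sum(w_i^2 * var_i)
#[derive(Clone, Copy, Debug, PartialEq)]
struct MeanVar {
    mu: f64,
    var: f64,
}

fn team_performance(players: &[(f64, MeanVar)]) -> MeanVar {
    let mut out = MeanVar { mu: 0.0, var: 0.0 };
    for &(w, p) in players {
        out.mu += w * p.mu;
        out.var += w * w * p.var;
    }
    out
}

fn main() {
    // Two players at full weight.
    let team = team_performance(&[
        (1.0, MeanVar { mu: 25.0, var: 64.0 }),
        (1.0, MeanVar { mu: 30.0, var: 36.0 }),
    ]);
    assert_eq!(team.mu, 55.0);
    assert_eq!(team.var, 100.0);
}
```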
Adds the trait that all factors implement and the enum dispatcher
used by the schedule to drive heterogeneous factors without dynamic
dispatch in the hot loop.
The three built-in factors (TeamSum, RankDiff, Trunc) are stubbed
out; concrete implementations follow in tasks 4-6.
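The trait-plus-enum dispatch pattern described here can be sketched as below. This is an illustrative shape, not the crate's real signatures: the Factor method, its return type, and the VarStore stand-in are assumptions; only the factor names (TeamSum, RankDiff, Trunc) come from the commit.

```rust
// Sketch of enum dispatch over heterogeneous factors: each factor
// implements a shared trait, and BuiltinFactor forwards via match,
// so the hot loop needs no dyn Trait / vtable indirection.
struct VarStore; // stand-in for the real variable store

trait Factor {
    fn update(&self, vars: &mut VarStore) -> f64; // hypothetical: message delta
}

struct TeamSum;
struct RankDiff;
struct Trunc;

impl Factor for TeamSum {
    fn update(&self, _vars: &mut VarStore) -> f64 { 0.0 } // stub
}
impl Factor for RankDiff {
    fn update(&self, _vars: &mut VarStore) -> f64 { 0.0 } // stub
}
impl Factor for Trunc {
    fn update(&self, _vars: &mut VarStore) -> f64 { 0.0 } // stub
}

enum BuiltinFactor {
    TeamSum(TeamSum),
    RankDiff(RankDiff),
    Trunc(Trunc),
}

impl Factor for BuiltinFactor {
    fn update(&self, vars: &mut VarStore) -> f64 {
        // Static dispatch: resolved by match, no dynamic call.
        match self {
            BuiltinFactor::TeamSum(f) => f.update(vars),
            BuiltinFactor::RankDiff(f) => f.update(vars),
            BuiltinFactor::Trunc(f) => f.update(vars),
        }
    }
}

fn main() {
    let mut vars = VarStore;
    let factors = vec![
        BuiltinFactor::TeamSum(TeamSum),
        BuiltinFactor::Trunc(Trunc),
    ];
    let max_delta = factors
        .iter()
        .map(|f| f.update(&mut vars))
        .fold(0.0_f64, f64::max);
    assert_eq!(max_delta, 0.0); // all stubs report zero delta
}
```

The enum costs one discriminant check per call but keeps factor updates inlinable, which is the usual reason to prefer it over Box<dyn Factor> in a tight loop.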
Foundation types for the T1 factor graph machinery. VarStore is a
flat Vec<Gaussian> indexed by VarId; variables are allocated by
alloc() and the store can be cleared between games to reuse capacity.
Part of T1 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
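A minimal sketch of the store shape described above, assuming a (pi, tau) natural-parameter Gaussian and a VarId newtype; method and field names beyond alloc()/clear() are illustrative, not the crate's API.

```rust
// Flat Vec<Gaussian> store indexed by a VarId newtype; variables are
// handed out by alloc() and the buffer is cleared (capacity kept)
// between games.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Gaussian {
    pi: f64,  // precision
    tau: f64, // precision-adjusted mean
}

#[derive(Clone, Copy)]
struct VarId(usize);

#[derive(Default)]
struct VarStore {
    vars: Vec<Gaussian>,
}

impl VarStore {
    /// Allocate a fresh (uniform) variable and return its id.
    fn alloc(&mut self) -> VarId {
        let id = VarId(self.vars.len());
        self.vars.push(Gaussian::default());
        id
    }

    fn get(&self, id: VarId) -> Gaussian {
        self.vars[id.0]
    }

    /// Drop all variables but keep the allocation for the next game.
    fn clear(&mut self) {
        self.vars.clear();
    }
}

fn main() {
    let mut store = VarStore::default();
    let a = store.alloc();
    let b = store.alloc();
    assert_eq!(a.0, 0);
    assert_eq!(b.0, 1);
    assert_eq!(store.get(b), Gaussian::default());
    let cap = store.vars.capacity();
    store.clear();
    assert_eq!(store.vars.capacity(), cap); // Vec::clear keeps capacity
}
```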
Bite-sized, TDD-style task breakdown for the second tier of the engine
redesign: introduce VarStore, Factor trait, BuiltinFactor enum, and
EpsilonOrMax schedule, then re-implement Game::likelihoods on top of
the new machinery. Internal-only refactor; public Game/History API
unchanged.
Acceptance: existing tests pass within ULP, iteration counts match T0,
no Batch::iteration regression vs T0 (~21.5 µs).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Batch::iteration: 29.840 µs → 21.253 µs (1.40×)
Gaussian::mul: 1.568 ns → 218.69 ps (7.17×)
Gaussian::div: 1.572 ns → 218.64 ps (7.19×)
Gaussian arithmetic hit its target (7×+ vs the 1.5–2× expected). Batch::iteration
reached 1.40× vs the 3× target. Post-mortem: the bench exercises 100 tiny
2-team events, and the dominant cost is still Vec allocation in within_priors,
sort_perm, and Game::likelihoods. The HashMap→Vec win shows up at the History
level (forward/backward sweep), which this bench doesn't exercise.
Remediation plan documented in benches/baseline.txt: arena-ify sort_perm,
within_priors, and Game::likelihoods in T1 when Game's internals are
redesigned around the new factor graph.
38/38 tests passing. Closes T0 tier.
Game::likelihoods previously allocated four Vecs (teams, diffs, ties,
margins) on every call. Batch now owns one ScratchArena reused across
all Game::new calls in the iteration loop; likelihoods() clears and
extends the arena buffers instead of allocating fresh.
For log_evidence (called infrequently), a local ScratchArena is created
per invocation so the method signature stays &self.
Also: add #[derive(Debug)] to TeamMessage and DiffMessage (required by
ScratchArena's own Debug derive).
Part of T0 engine redesign.
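The clear-and-extend arena pattern described above looks roughly like this. The field and function names are illustrative stand-ins (the real TeamMessage/DiffMessage contents and likelihoods() signature are not shown in the commit).

```rust
// One owner holds reusable buffers; each call truncates (keeping
// capacity) and extends, instead of allocating four fresh Vecs.
#[derive(Debug, Default)]
struct TeamMessage {
    mu: f64,
}

#[derive(Debug, Default)]
struct ScratchArena {
    teams: Vec<TeamMessage>,
    diffs: Vec<f64>,
}

impl ScratchArena {
    fn clear(&mut self) {
        self.teams.clear();
        self.diffs.clear();
    }
}

fn likelihoods(arena: &mut ScratchArena, team_mus: &[f64]) {
    arena.clear(); // reuse capacity rather than allocate
    arena
        .teams
        .extend(team_mus.iter().map(|&mu| TeamMessage { mu }));
    arena
        .diffs
        .extend(team_mus.windows(2).map(|w| w[0] - w[1]));
}

fn main() {
    let mut arena = ScratchArena::default();
    for _ in 0..100 {
        // After the first pass warms the buffers, the loop body
        // performs no heap allocation.
        likelihoods(&mut arena, &[30.0, 25.0, 20.0]);
    }
    assert_eq!(arena.diffs, vec![5.0, 5.0]);
    assert_eq!(arena.teams.len(), 3);
}
```

This also shows why TeamMessage needs #[derive(Debug)]: a #[derive(Debug)] on ScratchArena requires Debug on every field type.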
AgentStore<D> is a Vec<Option<Agent<D>>>-backed store indexed directly
by Index.0, eliminating per-iteration hashing in the cross-history
forward/backward sweep. Implements Index<Index>/IndexMut<Index> for
ergonomic agent access.
AgentStore is public (so benches/batch.rs can use it). SkillStore
remains pub(crate) since Skill is pub(crate) in batch.rs.
HashMap<Index, _> is now only used for the posteriors() return value
(temporary; will be replaced in T2 with a proper typed return) and
for the add_events_with_prior(priors: HashMap<Index, Player<D>>) API
(also T2 target).
Part of T0 engine redesign.
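A non-generic sketch of the store shape described above (the real store is AgentStore<D>; the Agent fields and insert method here are assumptions):

```rust
// Vec<Option<Agent>>-backed store indexed directly by Index.0, with
// Index/IndexMut impls for ergonomic access and no hashing.
use std::ops;

#[derive(Clone, Copy)]
struct Index(usize);

struct Agent {
    rating: f64, // illustrative field
}

#[derive(Default)]
struct AgentStore {
    slots: Vec<Option<Agent>>,
}

impl AgentStore {
    fn insert(&mut self, idx: Index, agent: Agent) {
        if idx.0 >= self.slots.len() {
            self.slots.resize_with(idx.0 + 1, || None);
        }
        self.slots[idx.0] = Some(agent);
    }
}

impl ops::Index<Index> for AgentStore {
    type Output = Agent;
    fn index(&self, idx: Index) -> &Agent {
        self.slots[idx.0].as_ref().expect("agent present")
    }
}

impl ops::IndexMut<Index> for AgentStore {
    fn index_mut(&mut self, idx: Index) -> &mut Agent {
        self.slots[idx.0].as_mut().expect("agent present")
    }
}

fn main() {
    let mut store = AgentStore::default();
    store.insert(Index(3), Agent { rating: 25.0 });
    store[Index(3)].rating += 1.0; // direct array access, no hashing
    assert_eq!(store[Index(3)].rating, 26.0);
}
```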
SkillStore is a Vec<Skill>-backed dense store with a parallel present
mask, indexed directly by Index.0. It eliminates per-iteration hashing
in the within-slice convergence loop; a direct O(1) array lookup replaces
an amortised-O(1) hash lookup and has better cache behaviour.
Iteration order is now ascending-by-Index (it was arbitrary for HashMap);
the EP fixed point is order-independent, so posteriors are unchanged.
Part of T0 engine redesign.
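The dense-store-plus-present-mask layout can be sketched as below; Skill's contents and the set/iter methods are assumptions for illustration.

```rust
// Dense Vec<Skill> with a parallel `present` mask, indexed by the raw
// usize inside Index. Iteration is ascending-by-index by construction.
#[derive(Clone, Copy, Default)]
struct Skill {
    mu: f64, // illustrative field
}

#[derive(Default)]
struct SkillStore {
    skills: Vec<Skill>,
    present: Vec<bool>, // parallel mask: which slots hold a live Skill
}

impl SkillStore {
    fn set(&mut self, idx: usize, skill: Skill) {
        if idx >= self.skills.len() {
            self.skills.resize(idx + 1, Skill::default());
            self.present.resize(idx + 1, false);
        }
        self.skills[idx] = skill;
        self.present[idx] = true;
    }

    /// Iterate live skills in ascending index order.
    fn iter(&self) -> impl Iterator<Item = (usize, &Skill)> {
        self.skills
            .iter()
            .enumerate()
            .filter(|&(i, _)| self.present[i])
    }
}

fn main() {
    let mut store = SkillStore::default();
    store.set(5, Skill { mu: 30.0 });
    store.set(1, Skill { mu: 25.0 });
    let order: Vec<usize> = store.iter().map(|(i, _)| i).collect();
    assert_eq!(order, vec![1, 5]); // deterministic, unlike HashMap order
}
```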
mu_sigma was deleted as part of the Gaussian nat-param rewrite (its
only callers were the old Mul/Div impls). This commit adds the
InferenceError enum as a seed for the T2 API surface, with the
NegativePrecision variant that mu_sigma would have returned.
Part of T0 engine redesign.
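A sketch of the error seed described above. The variant payload and the checked_mu_sigma helper are hypothetical; only the enum and variant names come from the commit.

```rust
// InferenceError as a seed for a Result-based API: the case the old
// mu_sigma would have reported becomes a typed error.
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
enum InferenceError {
    /// A Gaussian was seen with non-positive precision.
    NegativePrecision(f64),
}

impl fmt::Display for InferenceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InferenceError::NegativePrecision(pi) => {
                write!(f, "negative precision: {pi}")
            }
        }
    }
}

impl std::error::Error for InferenceError {}

// Hypothetical checked accessor over (pi, tau) natural parameters.
fn checked_mu_sigma(pi: f64, tau: f64) -> Result<(f64, f64), InferenceError> {
    if pi <= 0.0 {
        return Err(InferenceError::NegativePrecision(pi));
    }
    Ok((tau / pi, (1.0 / pi).sqrt()))
}

fn main() {
    assert!(checked_mu_sigma(-1.0, 0.0).is_err());
    let (mu, sigma) = checked_mu_sigma(4.0, 100.0).unwrap();
    assert_eq!(mu, 25.0);
    assert_eq!(sigma, 0.5);
}
```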
Mul and Div become two f64 adds/subs with no sqrt in the hot path.
mu() and sigma() are computed on demand from stored pi/tau.
Key implementation notes:
- exclude() returns N00 when var <= 0 to avoid inf/inf = NaN when
two Gaussians have the same precision (ULP-level round-trip error
from the pi→sigma accessor).
- Mul<f64> by 0.0 returns N00 (point mass at 0), matching old behavior.
- from_ms(0, 0) == N00 {pi:inf, tau:0}; from_ms(0, inf) == N_INF {pi:0, tau:0}.
Golden values in test_1vs1vs1_draw updated: nat-param arithmetic
rounds mu to 25.0 (was 24.999999) and shifts sigma by ~3e-7.
Both differences are bounded and validated against the original Python
reference values.
Part of T0 engine redesign.
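The natural-parameter arithmetic can be sketched as follows. This is a minimal illustration of storing (pi, tau) = (1/sigma², mu/sigma²), not the crate's actual Gaussian: the special-case guards described above (N00 when var <= 0, Mul<f64> by 0.0, the from_ms edge cases) are omitted for brevity.

```rust
// Gaussian stored in natural parameters: multiplying densities adds
// (pi, tau) component-wise, dividing subtracts them; no sqrt in the
// hot path. mu() and sigma() are derived on demand.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Gaussian {
    pi: f64,  // precision, 1 / sigma^2
    tau: f64, // precision-adjusted mean, mu / sigma^2
}

impl Gaussian {
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Gaussian { pi, tau: pi * mu }
    }
    fn mu(&self) -> f64 {
        self.tau / self.pi
    }
    fn sigma(&self) -> f64 {
        (1.0 / self.pi).sqrt()
    }
}

impl std::ops::Mul for Gaussian {
    type Output = Gaussian;
    fn mul(self, rhs: Gaussian) -> Gaussian {
        // Product of densities: two adds, no sqrt.
        Gaussian { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

impl std::ops::Div for Gaussian {
    type Output = Gaussian;
    fn div(self, rhs: Gaussian) -> Gaussian {
        // Cavity / exclusion: two subs (requires pi >= rhs.pi; the
        // var <= 0 -> N00 guard from the commit is omitted here).
        Gaussian { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}

fn main() {
    let a = Gaussian::from_ms(25.0, 2.0);
    let b = Gaussian::from_ms(30.0, 1.0);
    // (a * b) / b recovers a up to floating-point rounding.
    let back = (a * b) / b;
    assert!((back.mu() - 25.0).abs() < 1e-9);
    assert!((back.sigma() - 2.0).abs() < 1e-9);
}
```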
- Promotes Gaussian::pi and Gaussian::tau to public so benches/gaussian.rs
compiles, then captures baseline numbers for the T0 acceptance gate.
- Fixes the divide bench: g1/g2 panicked (g1 has lower precision than g2;
cavity requires pi_num >= pi_den). Swapped to g2/g1 (well-defined).
Baseline on Apple M5 Pro:
Batch::iteration 29.840 µs
Gaussian::mul 1.568 ns (vs ~220 ps for add/sub — hot path)
Gaussian::div 1.572 ns
Bite-sized, TDD-style task breakdown for the first tier of the engine
redesign: Gaussian to natural-parameter storage, dense Vec storage
replacing HashMap, ScratchArena to eliminate per-event allocs,
Result-ifying the lone panic. No top-level public API change.
Acceptance gate: ≥3× speedup on Batch::iteration vs. baseline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comprehensive design for a multi-tier rewrite covering performance,
factor-graph extensibility, convergence scheduling, and API surface.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>