T0 + T1 + T2: engine redesign through new API surface #1
Implements tiers T0, T1, T2 of `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`. All three tiers have landed together on this branch because they build on one another; this PR rolls them up for a single review pass.

Per-tier plans:

- `docs/superpowers/plans/2026-04-23-t0-numerical-parity.md`
- `docs/superpowers/plans/2026-04-24-t1-factor-graph.md`
- `docs/superpowers/plans/2026-04-24-t2-new-api-surface.md`

Summary

T0: Numerical parity (internal)
- `Gaussian` switched to natural-parameter storage `(pi, tau)`; mul/div now ~7× faster (218 ps vs 1.57 ns).
- `HashMap<Index, _>` → dense `Vec<_>` keyed by `Index.0` (via `AgentStore<D>`, `SkillStore`).
- `ScratchArena` eliminates per-event allocations in `Game::likelihoods`.
- `InferenceError` seed type added (1 variant).
- `Batch::iteration` 29.84 → 21.25 µs.

T1: Factor graph machinery (internal)
- `Factor` trait + `BuiltinFactor` enum (TeamSum / RankDiff / Trunc) driving within-game inference.
- `VarStore` flat storage for variable marginals.
- `Schedule` trait + `EpsilonOrMax` impl replacing the hand-rolled EP loop.
- `Game::likelihoods` rebuilt on the factor-graph machinery; iteration counts and goldens preserved to within 1e-6.
- `Batch::iteration` 23.01 µs (slight regression absorbed in T2).

T2: New API surface (breaking)
Renames:

- `IndexMap → KeyTable`, `Player → Rating`, `Agent → Competitor`, `Batch → TimeSlice`

New types:

- `Time` trait with `Untimed` ZST and `i64` impls; `Drift<T>`, `Rating<T, D>`, `Competitor<T, D>`, `TimeSlice<T>`, `History<T, D, O, K>` all generic.
- `Event<T, K>`, `Team<K>`, `Member<K>`, `Outcome` (`Ranked` variant; `#[non_exhaustive]`).
- `Observer<T>` trait + `NullObserver`.
- `ConvergenceOptions`, `ConvergenceReport`.
- `GameOptions`, `OwnedGame<T, D>`.

Three-tier ingestion:

- `history.record_winner(&K, &K, T)` / `record_draw(&K, &K, T)`: 1v1 convenience.
- `history.add_events(iter)`: typed bulk.
- `history.event(T).team([...]).weights([...]).ranking([...]).commit()`: fluent.

Query API:

- `current_skill`, `learning_curve`, `learning_curves` (keyed on `K`), `log_evidence`, `log_evidence_for`, `predict_quality`, `predict_outcome`.

Game constructors:

- `ranked`, `one_v_one`, `free_for_all`, `custom`, all returning `Result<_, InferenceError>`.

`factors` module:

- `Factor`, `Schedule`, `VarStore`, `VarId`, `BuiltinFactor`, `EpsilonOrMax`, `ScheduleReport`, `TeamSumFactor`, `RankDiffFactor`, `TruncFactor` now public.

Errors:

- `InferenceError` gains `MismatchedShape`, `InvalidProbability`, `ConvergenceFailed`; boundary panics converted to `Result`.

Removed (breaking):

- `History::convergence(iters, eps, verbose)`, `HistoryBuilder::gamma(f64)`, `HistoryBuilder::time(bool)`, `History.time: bool`, `learning_curves_by_index`, nested-Vec public `add_events`.

Behavior change (documented in CHANGELOG)
`Time = Untimed` has `elapsed_to → 0`, so no drift accumulates between slices. The old `time=false` mode implicitly forced `elapsed=1` on reappearance via an `i64::MAX` sentinel; that quirk is not reproducible under a typed time axis. Tests that depended on it now use `History::<i64, _>` with explicit `1..=n` timestamps. One test (`test_env_ttt`) had 3 Gaussian goldens updated to reflect the corrected semantics; documented in commit `33a7d90`.

Final numbers

Benchmarks: `Batch::iteration`, `Gaussian::mul`, `Gaussian::div`.

All other Gaussian ops unchanged (~219 ps add/sub, ~264 ps pi/tau reads).
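
The `Gaussian::mul` and `Gaussian::div` figures follow from the natural-parameter storage introduced in T0, where multiplying or dividing two Gaussians reduces to adding or subtracting `(pi, tau)`. The block below is a minimal sketch of that idea, not the crate's actual implementation: the struct layout, the edge-case handling in `from_ms`, and the accessor bodies are assumptions made for illustration.

```rust
use std::ops::{Div, Mul};

/// Illustrative Gaussian in natural parameters (assumed layout, not the crate's type):
/// pi = 1 / sigma^2 (precision), tau = mu / sigma^2 (precision-adjusted mean).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Gaussian {
    pi: f64,
    tau: f64,
}

impl Gaussian {
    /// Build from mean and standard deviation.
    pub fn from_ms(mu: f64, sigma: f64) -> Self {
        if sigma == 0.0 {
            // Point mass with infinite precision. This sketch only models mu == 0.
            return Gaussian { pi: f64::INFINITY, tau: 0.0 };
        }
        if sigma.is_infinite() {
            // Uninformative Gaussian: zero precision.
            return Gaussian { pi: 0.0, tau: 0.0 };
        }
        let pi = 1.0 / (sigma * sigma);
        Gaussian { pi, tau: mu * pi }
    }

    /// Mean, computed on demand from the stored parameters.
    pub fn mu(&self) -> f64 {
        if self.pi == 0.0 { 0.0 } else { self.tau / self.pi }
    }

    /// Standard deviation, computed on demand (the only place a sqrt appears).
    pub fn sigma(&self) -> f64 {
        if self.pi == 0.0 { f64::INFINITY } else { (1.0 / self.pi).sqrt() }
    }
}

impl Mul for Gaussian {
    type Output = Gaussian;
    /// Product of two Gaussian densities: natural parameters add.
    fn mul(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

impl Div for Gaussian {
    type Output = Gaussian;
    /// Quotient of two Gaussian densities: natural parameters subtract.
    fn div(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}
```

With this layout, `Gaussian::from_ms(1.0, 1.0) * Gaussian::from_ms(3.0, 1.0)` has precision 2 and mean 2, and dividing the product by the second factor recovers the first.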
Test plan
- `cargo test --features approx`: 90/90 pass (68 lib + 10 api_shape + 6 game + 4 record_winner + 2 equivalence)
- `cargo clippy --all-targets --features approx -- -D warnings`: clean
- `cargo +nightly fmt --check`: clean
- `cargo bench --bench batch`: 21.36 µs
- `cargo bench --bench gaussian`: unchanged from T1
- `cargo run --example atp --features approx`: rewritten in new API, runs clean
- `tests/equivalence.rs`
- `tests/api_shape.rs`

Commit history

~45 commits total across T0 + T1 + T2. Each task is self-contained and individually tested; the branch is bisectable. See `git log main..t2-new-api-surface` for the full list.

Deferred to later tiers

- `Outcome::Scored` + `MarginFactor`: T4
- `Damped` / `Residual` schedules: T4
- `Send + Sync` bounds + Rayon parallelism: T3
- `predict_outcome`: T4
- `Game::custom` full ergonomics: T4

🤖 Generated with Claude Code
Mul and Div become two f64 adds/subs with no sqrt in the hot path. mu() and sigma() are computed on demand from stored pi/tau.

Key implementation notes:

- exclude() returns N00 when var <= 0 to avoid inf/inf = NaN when two Gaussians have the same precision (ULP-level round-trip error from the pi→sigma accessor).
- Mul<f64> by 0.0 returns N00 (point mass at 0), matching old behavior.
- from_ms(0, 0) == N00 {pi:inf, tau:0}; from_ms(0, inf) == N_INF {pi:0, tau:0}.

Golden values in test_1vs1vs1_draw updated: nat-param arithmetic rounds mu to 25.0 (was 24.999999) and shifts sigma by ~3e-7. Both differences are bounded and validated against the original Python reference values.

Part of T0 engine redesign.

The bool encoded 'no time axis', which is now expressed at the type level (T = Untimed). The old !self.time branch generated sequential i64 timestamps internally (1..=n) and bumped all agents' last_time at every tick; tests that relied on this now pass those timestamps explicitly and reflect the correct time=true elapsed semantics.

Collapsed `if self.time { A } else { B }` into the A branch everywhere in add_events_with_prior. Removed the two !self.time blocks that updated all agents' last_time at every slice regardless of participation. sort_time is now generic over `T: Copy + Ord`.

HistoryBuilder::time(bool) removed. History<i64, ConstantDrift> default remains, producing the same behavior as old .time(true).

The test_env_ttt Gaussian goldens are updated to reflect the correct time=true semantics (b.elapsed=2 instead of 1 due to b skipping t=2); this is a correction: the old !self.time last_time bump was an implementation quirk that diverged from the Python reference.

55 tests pass. clippy clean. fmt clean.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
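
The typed time axis described in this commit can be illustrated with a small sketch. Everything below is hypothetical: the trait signature, the `i64` return type of `elapsed_to`, and the `drift_variance` helper are assumptions for illustration and may differ from the crate's real definitions. The sketch only shows why `Untimed` accumulates no drift while `i64` timestamps give `elapsed = 2` when a competitor skips a slice.

```rust
/// Hypothetical shape of a time axis; the real trait's signature may differ.
pub trait Time: Copy + Ord {
    /// Time elapsed from `self` to `later`, in whole ticks.
    fn elapsed_to(&self, later: &Self) -> i64;
}

/// Zero-sized marker meaning "no time axis".
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Untimed;

impl Time for Untimed {
    /// Without a time axis nothing ever elapses.
    fn elapsed_to(&self, _later: &Self) -> i64 {
        0
    }
}

impl Time for i64 {
    /// Plain integer timestamps: elapsed time is the difference.
    fn elapsed_to(&self, later: &Self) -> i64 {
        later - self
    }
}

/// Variance added to a skill between two appearances under a constant drift
/// `gamma`, modelled here as gamma^2 * elapsed (an assumed drift model).
pub fn drift_variance<T: Time>(gamma: f64, last: T, now: T) -> f64 {
    let elapsed = last.elapsed_to(&now).max(0) as f64;
    gamma * gamma * elapsed
}
```

Under these assumptions, a competitor last seen at `t = 1` who reappears at `t = 3` has `elapsed_to` equal to 2, matching the `b.elapsed=2` case above, whereas any pair of `Untimed` values yields zero added variance.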
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

New public types:

- ConvergenceOptions { max_iter, epsilon }: config for the loop
- ConvergenceReport { iterations, final_step, log_evidence, converged, per_iteration_time, slices_skipped }: post-hoc summary

History and HistoryBuilder gain a third generic parameter O: Observer<T> = NullObserver.

Builder methods:

- .convergence(opts) sets the ConvergenceOptions
- .observer(o) plugs in an Observer (reshapes the builder's O param)

History::converge() runs the existing iteration loop driven by the stored opts, emits observer callbacks on each iteration end and on completion, and returns Result<ConvergenceReport, InferenceError>.

The old convergence(iters, eps, verbose) stays for now; it gets removed in Task 20 after tests are translated.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Public API gains:

    History::add_events<I: IntoIterator<Item = Event<T, K>>>(events) -> Result<(), InferenceError>

which accepts the typed Event<T, K> shape added in Task 10. Ranks from Outcome::Ranked are mapped to the legacy "higher f64 = better" results internally.

add_events_with_prior now takes Vec<T> for times (was Vec<i64>), generifying the whole internal path over T in a single fully-generic impl<T: Time, D: Drift<T>, O: Observer<T>, K> block. The i64-specific block is gone; record_winner/record_draw are now generic over T.

add_events_with_prior stays pub (not pub(crate)) because the ATP example calls it directly with pre-built Index-based composition; the new typed add_events is the primary public API going forward. In-crate tests updated to call add_events_with_prior with an empty HashMap.

tests/api_shape.rs added with 3 integration tests covering bulk ingest, draw, and mismatched-outcome error.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
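
The rank-to-result mapping mentioned in this commit is not spelled out, so the following is only one plausible sketch of it. The function name `ranks_to_results`, the `u32` rank type, and the convention that rank 0 is the best placement are all assumptions; the only property taken from the commit message is that the output follows the legacy "higher f64 = better" convention.

```rust
/// Hypothetical helper: convert ranks (assumed 0 = best, ties share a rank)
/// into legacy-style results where a higher f64 means a better placement.
pub fn ranks_to_results(ranks: &[u32]) -> Vec<f64> {
    // The worst (largest) rank maps to 0.0; better ranks map to larger values.
    let worst = ranks.iter().copied().max().unwrap_or(0);
    ranks.iter().map(|&r| (worst - r) as f64).collect()
}
```

Tied teams keep equal results under this mapping, so a draw expressed as ranks `[0, 0]` becomes `[0.0, 0.0]`.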
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

factors module fe6f028127

Exposes the factor-graph machinery so power users can define custom factors and schedules (see Game::custom in the next task).

The internal factor/ and schedule/ modules remain unchanged (still referenced by Game's internals via crate::factor); the user-facing public API goes through the new factors module re-exports:

    pub use crate::factor::{BuiltinFactor, Factor, VarId, VarStore};
    pub use crate::factor::rank_diff::RankDiffFactor;
    pub use crate::factor::team_sum::TeamSumFactor;
    pub use crate::factor::trunc::TruncFactor;
    pub use crate::schedule::{EpsilonOrMax, Schedule, ScheduleReport};

#[allow(dead_code)] guards on the previously-pub(crate) items are removed because the types are now referenced via the re-exports.

Promotes public methods on VarStore (len, alloc, get, set, clear, new) and adds is_empty per clippy lint. Keeps marginals field private as an implementation detail; users access via the public methods.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Public Game API now returns Result<_, InferenceError> on invalid input (p_draw out of range, outcome rank count mismatches team count).

New types:

- GameOptions { p_draw, convergence }: bundled config
- OwnedGame<T, D>: owned variant of Game that carries its result and weights internally (no borrow of History's slices). Returned by public constructors to avoid leaking internal borrow lifetimes.

The internal Game::new is renamed Game::ranked_with_arena (pub(crate)) and keeps the borrowing-arena signature for History's hot path. All in-crate callers updated (21 call sites: 18 in game.rs tests, 2 in time_slice.rs, 1 in history.rs).

Game::custom is a T2-minimal power-user escape hatch exposing raw factor + schedule plumbing. Full ergonomics in T4 (#[doc(hidden)] for now).

Game::log_evidence() accessor added on both Game and OwnedGame (was previously accessible only through the pub(crate) evidence field).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
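
The boundary validation described for the public Game constructors (invalid `p_draw`, rank count not matching team count) might look roughly like the sketch below. The types here are stand-ins written for this example: the error variant names come from the PR description, but the `validate` function, the trimmed-down `GameOptions`, and the accepted range `0 <= p_draw < 1` are assumptions rather than the crate's actual code.

```rust
/// Stand-in error type; only the variant names are taken from the PR.
#[derive(Debug, PartialEq)]
pub enum InferenceError {
    InvalidProbability,
    MismatchedShape,
}

/// Trimmed-down stand-in for the bundled game config (convergence omitted).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct GameOptions {
    pub p_draw: f64,
}

/// Hypothetical boundary check returning an error instead of panicking.
pub fn validate(
    opts: &GameOptions,
    n_teams: usize,
    ranks: &[u32],
) -> Result<(), InferenceError> {
    // Assumed valid range: 0 <= p_draw < 1. NaN fails this check as well.
    if !(0.0..1.0).contains(&opts.p_draw) {
        return Err(InferenceError::InvalidProbability);
    }
    // Every team needs exactly one rank in the outcome.
    if ranks.len() != n_teams {
        return Err(InferenceError::MismatchedShape);
    }
    Ok(())
}
```

Returning `Result` at the boundary lets callers handle malformed input with `?` or a `match`, which is the behavior the commit describes for the public constructors.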