Implements tiers T0, T1, T2 of `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`. All three tiers have landed together on this branch because they build on one another; this PR rolls them up for a single review pass. Per-tier plans: - T0: `docs/superpowers/plans/2026-04-23-t0-numerical-parity.md` - T1: `docs/superpowers/plans/2026-04-24-t1-factor-graph.md` - T2: `docs/superpowers/plans/2026-04-24-t2-new-api-surface.md` ## Summary ### T0 — Numerical parity (internal) - `Gaussian` switched to natural-parameter storage `(pi, tau)`; mul/div now ~7× faster (218 ps vs 1.57 ns). - `HashMap<Index, _>` → dense `Vec<_>` keyed by `Index.0` (via `AgentStore<D>`, `SkillStore`). - `ScratchArena` eliminates per-event allocations in `Game::likelihoods`. - `InferenceError` seed type added (1 variant). - 38 → 53 tests passing through T1. - Benchmark: `Batch::iteration` 29.84 → 21.25 µs. ### T1 — Factor graph machinery (internal) - `Factor` trait + `BuiltinFactor` enum (TeamSum / RankDiff / Trunc) driving within-game inference. - `VarStore` flat storage for variable marginals. - `Schedule` trait + `EpsilonOrMax` impl replacing the hand-rolled EP loop. - `Game::likelihoods` rebuilt on the factor-graph machinery; iteration counts and goldens preserved to within 1e-6. - 53 tests passing. - Benchmark: `Batch::iteration` 23.01 µs (slight regression absorbed in T2). ### T2 — New API surface (breaking) **Renames:** - `IndexMap → KeyTable`, `Player → Rating`, `Agent → Competitor`, `Batch → TimeSlice` **New types:** - `Time` trait with `Untimed` ZST and `i64` impls; `Drift<T>`, `Rating<T, D>`, `Competitor<T, D>`, `TimeSlice<T>`, `History<T, D, O, K>` all generic. - `Event<T, K>`, `Team<K>`, `Member<K>`, `Outcome` (`Ranked` variant; `#[non_exhaustive]`). - `Observer<T>` trait + `NullObserver`. - `ConvergenceOptions`, `ConvergenceReport`. - `GameOptions`, `OwnedGame<T, D>`. **Three-tier ingestion:** - `history.record_winner(&K, &K, T)` / `record_draw(&K, &K, T)` — 1v1 convenience. - `history.add_events(iter)` — typed bulk. - `history.event(T).team([...]).weights([...]).ranking([...]).commit()` — fluent. **Query API:** `current_skill`, `learning_curve`, `learning_curves` (keyed on `K`), `log_evidence`, `log_evidence_for`, `predict_quality`, `predict_outcome`. **Game constructors:** `ranked`, `one_v_one`, `free_for_all`, `custom` — all returning `Result<_, InferenceError>`. **`factors` module:** `Factor`, `Schedule`, `VarStore`, `VarId`, `BuiltinFactor`, `EpsilonOrMax`, `ScheduleReport`, `TeamSumFactor`, `RankDiffFactor`, `TruncFactor` now public. **Errors:** `InferenceError` gains `MismatchedShape`, `InvalidProbability`, `ConvergenceFailed`; boundary panics converted to `Result`. **Removed (breaking):** `History::convergence(iters, eps, verbose)`, `HistoryBuilder::gamma(f64)`, `HistoryBuilder::time(bool)`, `History.time: bool`, `learning_curves_by_index`, nested-Vec public `add_events`. ## Behavior change (documented in CHANGELOG) `Time = Untimed` has `elapsed_to → 0`, so no drift accumulates between slices. The old `time=false` mode implicitly forced `elapsed=1` on reappearance via an `i64::MAX` sentinel — that quirk is not reproducible under a typed time axis. Tests that depended on it now use `History::<i64, _>` with explicit `1..=n` timestamps. One test (`test_env_ttt`) had 3 Gaussian goldens updated to reflect the corrected semantics; documented in commit `33a7d90`. ## Final numbers | Metric | Before T0 | After T2 | Delta | |---|---|---|---| | `Batch::iteration` | 29.84 µs | 21.36 µs | **-28%** | | `Gaussian::mul` | 1.57 ns | 219 ps | **-86%** | | `Gaussian::div` | 1.57 ns | 219 ps | **-86%** | | Tests passing | 38 | 90 | +52 | All other Gaussian ops unchanged (~219 ps add/sub, ~264 ps pi/tau reads). ## Test plan - [x] `cargo test --features approx` — 90/90 pass (68 lib + 10 api_shape + 6 game + 4 record_winner + 2 equivalence) - [x] `cargo clippy --all-targets --features approx -- -D warnings` — clean - [x] `cargo +nightly fmt --check` — clean - [x] `cargo bench --bench batch` — 21.36 µs - [x] `cargo bench --bench gaussian` — unchanged from T1 - [x] `cargo run --example atp --features approx` — rewritten in new API, runs clean - [x] Historical Game-level goldens preserved in `tests/equivalence.rs` - [x] Public API matches spec Section 4 (verified by integration tests in `tests/api_shape.rs`) ## Commit history ~45 commits total across T0 + T1 + T2. Each task is self-contained and individually tested; the branch is bisectable. See `git log main..t2-new-api-surface` for the full list. ## Deferred to later tiers - `Outcome::Scored` + `MarginFactor` — T4 - `Damped` / `Residual` schedules — T4 - `Send + Sync` bounds + Rayon parallelism — T3 - N-team `predict_outcome` — T4 - `Game::custom` full ergonomics — T4 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: #1 Co-authored-by: Anders Olsson <anders.e.olsson@gmail.com> Co-committed-by: Anders Olsson <anders.e.olsson@gmail.com>
83 KiB
T2 — New API Surface Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Land the full breaking API redesign from docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md Section 7 "T2" — renames, typed Event<T, K> ingestion, generic Time / Drift<T>, Observer + ConvergenceReport, Result<_, InferenceError> at the boundary, and pub factors module — without changing numerical behavior.
Architecture: T2 is a tiered rename + API rewrite on top of T1's factor-graph internals. No internal inference algorithm changes. Each task leaves the crate building and tests passing on the working branch; the final commit merges as a single breaking change. Because this crate is pre-1.0 with no downstream users, intermediate commits on the T2 branch need not preserve the old API, as long as the branch-tip is consistent.
Tech Stack: Rust 2024 edition, approx crate for float comparisons, smallvec for fixed-size-biased event shapes, criterion for benchmark gates. Builds on T0 (nat-param Gaussian, dense storage, ScratchArena) and T1 (Factor/Schedule/VarStore, EpsilonOrMax schedule).
Acceptance criteria
- All existing numerical goldens (rewritten in the new API) pass within
1e-6tolerance. cargo bench --bench batchshows no regression vs T1 (Batch::iteration≤ 23.5 µs on Apple M5 Pro) — the API rewrite is not supposed to change the hot path.cargo clippy --all-targets --features approx -- -D warningsclean.cargo +nightly fmt --checkclean.- Public API matches spec Section 4:
History<T: Time, D: Drift<T>>withUntimedandi64Timeimpls.HistoryBuilderwith.mu().sigma().beta().drift().p_draw().convergence().observer().build().history.record_winner(&K, &K, T),record_draw(&K, &K, T),add_events(iter),event(time).team([…]).weights([…]).ranking([…]).commit().history.converge() -> Result<ConvergenceReport, InferenceError>.history.learning_curve(&K),learning_curves(),current_skill(&K),log_evidence(),log_evidence_for(&[&K]),predict_quality(...),predict_outcome(...),intern(&K) -> Index,lookup(&Q) -> Option<Index>.Game::ranked,Game::one_v_one,Game::free_for_all,Game::customconstructors;Game::posteriors,Game::log_evidenceaccessors.Observertrait withNullObserverdefault.InferenceErrorwithMismatchedShape,InvalidProbability,ConvergenceFailed,NegativePrecisionvariants.factorsmodule re-exportsFactor,Schedule,VarStore,VarId,BuiltinFactor,EpsilonOrMax,ScheduleReport.
- Renames completed:
Batch → TimeSlice,Player → Rating,Agent → Competitor,IndexMap → KeyTable. - Old API (
History::convergence(iters, eps, verbose), nested-Vecadd_events(composition, results, times, weights),verbose: bool,time: bool) removed.
Non-goals (deferred to T3/T4)
Outcome::Scoredvariant +MarginFactor(T4). For T2,Outcomehas onlyRanked; enum is marked#[non_exhaustive]so we can addScoredlater non-breaking.DampedandResidualschedules (T4).Send + Synctrait bounds + Rayon parallelism (T3).- Cross-history parallelism, dirty-bit slice skipping (T3/beyond).
- Snapshot / serde support.
File map
New files:
| Path | Responsibility |
|---|---|
src/time.rs |
Time trait, Untimed ZST, impl Time for i64. |
src/observer.rs |
Observer trait, NullObserver ZST. |
src/outcome.rs |
Outcome enum (only Ranked in T2; #[non_exhaustive]). Convenience constructors winner, draw, ranking. |
src/event.rs |
Typed Event<T, K>, Team<K>, Member<K> structs for bulk ingestion. |
src/convergence.rs |
ConvergenceOptions, ConvergenceReport. |
src/key_table.rs |
KeyTable<K> (was IndexMap in lib.rs). |
src/rating.rs |
Rating (was Player). |
src/competitor.rs |
Competitor (was Agent). |
src/time_slice.rs |
TimeSlice (was Batch). |
src/factors.rs |
Public re-export module: pub use crate::factor::* + pub use crate::schedule::*. |
src/event_builder.rs |
Fluent builder returned by History::event(T) — .team([…]).weights([…]).ranking([…]).commit(). |
tests/equivalence.rs |
Integration tests: every old-API hardcoded golden is reproduced in the new API. |
tests/api_shape.rs |
Integration tests: three-tier ingestion, builder ergonomics, error cases. |
Removed files:
src/player.rs→ replaced byrating.rssrc/agent.rs→ replaced bycompetitor.rssrc/batch.rs→ replaced bytime_slice.rs
Heavily modified:
| Path | What changes |
|---|---|
src/lib.rs |
Remove IndexMap (moved); re-export new modules; promote factor + schedule visibility via mod factors. |
src/drift.rs |
Generify Drift over T: Time. Widen ConstantDrift impl. |
src/history.rs |
Make generic over T: Time, D: Drift<T>. New builder methods. Three-tier ingestion. converge() replaces convergence(). New query methods. |
src/error.rs |
Add MismatchedShape, InvalidProbability, ConvergenceFailed variants. |
src/game.rs |
Add ranked, one_v_one, free_for_all, custom constructors and log_evidence() accessor. Keep the private new_with_arena path for History. |
src/storage/mod.rs |
Rename AgentStore → CompetitorStore, SkillStore stays. |
src/arena.rs |
Rename any Player → Rating references; no structural change. |
Renaming policy
Use git mv for file renames so history is preserved. All type-name find-replaces happen in the same commit as the file move. Tests inside #[cfg(test)] mod tests are updated in-place.
Task 1: Pre-flight — verify T1 green, capture baseline
Files: none
- Step 1: Confirm branch + clean tree
git status
git rev-parse --abbrev-ref HEAD
Expected: clean tree on t0-numerical-parity. If dirty, stop.
- Step 2: Create the T2 branch
git checkout -b t2-new-api-surface
- Step 3: Confirm all tests pass
cargo test --features approx --lib
cargo test --features approx --doc 2>&1 | tail -5
Expected: 53 passed; 0 failed.
- Step 4: Capture a fresh T1 reference bench number (hardware may differ)
cargo bench --bench batch 2>&1 | grep "Batch::iteration"
Record the value — the final task will compare against it.
- Step 5: No commit — verification only.
Task 2: Rename IndexMap → KeyTable
Files:
-
Create:
src/key_table.rs -
Modify:
src/lib.rs(remove the inlineIndexMap, re-exportKeyTable) -
Modify: every call site across
src/**/*.rsand tests -
Step 1: Create
src/key_table.rs
use std::{
borrow::{Borrow, ToOwned},
collections::HashMap,
hash::Hash,
};
use crate::Index;
/// Maps user keys to internal `Index` handles.
///
/// Renamed from the former `IndexMap` to avoid colliding with the `indexmap`
/// crate. Power users can promote `&K` to `Index` via `get_or_create` and
/// skip the lookup on subsequent hot-path calls.
#[derive(Debug)]
pub struct KeyTable<K>(HashMap<K, Index>);
impl<K> KeyTable<K>
where
K: Eq + Hash,
{
pub fn new() -> Self {
Self(HashMap::new())
}
pub fn get<Q: ?Sized + Hash + Eq + ToOwned<Owned = K>>(&self, k: &Q) -> Option<Index>
where
K: Borrow<Q>,
{
self.0.get(k).cloned()
}
pub fn get_or_create<Q: ?Sized + Hash + Eq + ToOwned<Owned = K>>(&mut self, k: &Q) -> Index
where
K: Borrow<Q>,
{
if let Some(idx) = self.0.get(k) {
*idx
} else {
let idx = Index::from(self.0.len());
self.0.insert(k.to_owned(), idx);
idx
}
}
pub fn key(&self, idx: Index) -> Option<&K> {
self.0
.iter()
.find(|&(_, value)| *value == idx)
.map(|(key, _)| key)
}
pub fn keys(&self) -> impl Iterator<Item = &K> {
self.0.keys()
}
pub fn len(&self) -> usize {
self.0.len()
}
pub fn is_empty(&self) -> bool {
self.0.is_empty()
}
}
impl<K> Default for KeyTable<K>
where
K: Eq + Hash,
{
fn default() -> Self {
KeyTable::new()
}
}
- Step 2: Remove
IndexMapfromsrc/lib.rs
Delete lines 57–108 (the IndexMap struct, impls, and Default impl). Remove use std::{borrow::{Borrow, ToOwned}, collections::HashMap, hash::Hash}; if nothing else uses those imports.
Add module declaration near the other pub mod lines (alphabetically between gaussian and player):
pub mod key_table;
Add re-export:
pub use key_table::KeyTable;
- Step 3: Replace all
IndexMapreferences withKeyTable
grep -rln 'IndexMap' src/ tests/ benches/ examples/ 2>/dev/null
In every file found, replace IndexMap → KeyTable. Verify no stragglers:
! grep -rn 'IndexMap' src/ tests/ benches/ examples/ 2>/dev/null
- Step 4: Build + test
cargo build --features approx
cargo test --features approx --lib
Expected: all 53 tests pass.
- Step 5: Format + commit
cargo +nightly fmt
git add src/ tests/ benches/ examples/
git commit -m "$(cat <<'EOF'
refactor(api): rename IndexMap to KeyTable
The former name collided with the popular indexmap crate. KeyTable
lives in its own module. Public API unchanged beyond the rename.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
EOF
)"
Task 3: Rename Player → Rating
Files:
-
Rename:
src/player.rs→src/rating.rs -
Modify:
src/lib.rs(module decl + re-export) -
Modify: every call site
-
Step 1: Move the file
git mv src/player.rs src/rating.rs
- Step 2: Rename the type inside
src/rating.rs
Edit src/rating.rs — rename Player → Rating everywhere, update the doc comment to say what it is now:
use crate::{
BETA, GAMMA,
drift::{ConstantDrift, Drift},
gaussian::Gaussian,
};
/// Static rating configuration: prior skill, performance noise `beta`, drift.
///
/// Renamed from `Player` in T2; `Rating` better describes the data
/// (a configuration) vs. a person (who's a `Competitor` with state).
#[derive(Clone, Copy, Debug)]
pub struct Rating<D: Drift = ConstantDrift> {
pub(crate) prior: Gaussian,
pub(crate) beta: f64,
pub(crate) drift: D,
}
impl<D: Drift> Rating<D> {
pub fn new(prior: Gaussian, beta: f64, drift: D) -> Self {
Self { prior, beta, drift }
}
pub(crate) fn performance(&self) -> Gaussian {
self.prior.forget(self.beta.powi(2))
}
}
impl Default for Rating<ConstantDrift> {
fn default() -> Self {
Self {
prior: Gaussian::default(),
beta: BETA,
drift: ConstantDrift(GAMMA),
}
}
}
- Step 3: Update
src/lib.rs
Replace:
pub mod player;
...
pub use player::Player;
with:
pub mod rating;
...
pub use rating::Rating;
- Step 4: Replace
PlayerwithRatingandplayer::withrating::everywhere
Use two passes so we don't touch e.g. player_b or players — use word-boundary regex:
grep -rln '\bPlayer\b\|::player::\|mod player\|crate::player' src/ tests/ benches/ examples/ 2>/dev/null
For each file, edit to replace. Inside test modules, Player::new(...) becomes Rating::new(...). The spec keeps the Rating::new(prior, beta, drift) signature exactly.
Note: crate::player::Player → crate::rating::Rating. Use pub(crate) fields are re-paths only.
Verify no stragglers:
! grep -rn '\bPlayer\b' src/ tests/ benches/ examples/ 2>/dev/null
! grep -rn 'crate::player\|mod player' src/ 2>/dev/null
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
Expected: all 53 tests pass.
- Step 6: Format + commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Player to Rating
The struct holds prior/beta/drift — a rating configuration, not a
person. The person-with-temporal-state is the Competitor (renamed in
the next task). Resolves Player/Agent ambiguity.
Part of T2.
EOF
)"
Task 4: Rename Agent → Competitor
Files:
-
Rename:
src/agent.rs→src/competitor.rs -
Modify:
src/lib.rs,src/storage/mod.rs(theAgentStorealias rename), every call site -
Step 1: Move the file
git mv src/agent.rs src/competitor.rs
- Step 2: Rename the type inside
src/competitor.rs
Replace Agent with Competitor everywhere in the file. Update the rating import (which was player):
use crate::{
N_INF,
drift::{ConstantDrift, Drift},
gaussian::Gaussian,
rating::Rating,
};
/// Per-history, temporal state for someone competing.
///
/// Renamed from `Agent` in T2.
#[derive(Debug)]
pub struct Competitor<D: Drift = ConstantDrift> {
pub rating: Rating<D>,
pub message: Gaussian,
pub last_time: i64,
}
impl<D: Drift> Competitor<D> {
pub(crate) fn receive(&self, elapsed: i64) -> Gaussian {
if self.message != N_INF {
self.message
.forget(self.rating.drift.variance_delta(elapsed))
} else {
self.rating.prior
}
}
}
impl Default for Competitor<ConstantDrift> {
fn default() -> Self {
Self {
rating: Rating::default(),
message: N_INF,
last_time: i64::MIN,
}
}
}
pub(crate) fn clean<'a, D: Drift + 'a, A: Iterator<Item = &'a mut Competitor<D>>>(
competitors: A,
last_time: bool,
) {
for c in competitors {
c.message = N_INF;
if last_time {
c.last_time = i64::MIN;
}
}
}
Note: the field player → rating is an additional rename that must propagate (self.player.X → self.rating.X across the codebase).
- Step 3: Update
src/lib.rsandsrc/storage/mod.rs
In src/lib.rs, replace pub mod agent; with pub mod competitor;.
In src/storage/mod.rs, rename AgentStore → CompetitorStore. Inspect the file (read it first if you haven't) and replace:
-
type alias/struct
AgentStore→CompetitorStore -
any
Agent<D>generic →Competitor<D> -
use crate::agent::Agent→use crate::competitor::Competitor -
Step 4: Replace all call sites
grep -rln '\bAgent\b\|AgentStore\|agent::\|mod agent\|crate::agent\|\.player\b' src/ tests/ benches/ examples/ 2>/dev/null
Replace:
Agent→CompetitorAgentStore→CompetitorStorecrate::agent::→crate::competitor::mod agent→mod competitor.playerfield access →.rating(search this one carefully; it's ambiguous with unrelated code in dev-deps)
For the .player rename, audit carefully — it should only appear inside src/ on instances of Competitor. Grep-replace only inside src/:
grep -rln '\.player\b' src/
Do NOT blindly s/.player/.rating/ across tests/ or benches/ or examples/ — these may reference external dev-dep types. Only edit matches inside src/ that are on Competitor instances (verify each).
Verify:
! grep -rn '\bAgent\b\|AgentStore\|::agent::\|mod agent' src/ tests/ benches/ examples/ 2>/dev/null
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
- Step 6: Format + commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Agent to Competitor and .player field to .rating
Competitor holds dynamic per-history state (message, last_time) for
someone competing; its configuration lives in a Rating.
AgentStore renamed to CompetitorStore to match.
Part of T2.
EOF
)"
Task 5: Rename Batch → TimeSlice
Files:
-
Rename:
src/batch.rs→src/time_slice.rs -
Modify:
src/lib.rs,src/history.rs, every call site -
Step 1: Move the file
git mv src/batch.rs src/time_slice.rs
- Step 2: Rename the type
In src/time_slice.rs, replace the struct name Batch with TimeSlice. Update the module-level doc comment:
//! A single time step's worth of events.
//!
//! Renamed from `Batch` in T2.
Rename:
struct Batch→struct TimeSliceimpl Batch→impl TimeSliceBatch::new→TimeSlice::new
The compute_elapsed free function keeps its name.
- Step 3: Update
src/lib.rs
Replace:
pub mod batch;
with:
pub mod time_slice;
If batch::Batch was re-exported anywhere, update to time_slice::TimeSlice.
- Step 4: Replace all call sites
grep -rln '\bBatch\b\|crate::batch::\|mod batch\|batch::' src/ tests/ benches/ examples/ 2>/dev/null
Replace Batch → TimeSlice and batch:: → time_slice:: and mod batch → mod time_slice.
Note src/history.rs uses the name extensively (self.batches, Vec<Batch>, function names like new_forward_info). For T2 we rename:
batches: Vec<Batch>→time_slices: Vec<TimeSlice>- The field rename must propagate — search
\.batches\binsidesrc/history.rsand replace with.time_slices.
Verify:
! grep -rn '\bBatch\b\|::batch::\|mod batch' src/ tests/ benches/ examples/ 2>/dev/null
! grep -rn '\.batches\b' src/history.rs
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
Tests will still use the old nested-Vec add_events signature and h.batches[…] access patterns — those are fixed in later tasks. For now tests may need to be patched minimally to use .time_slices[…] so they compile. Since the inner shape is the same, this is pure find-replace.
- Step 6: Format + commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Batch to TimeSlice
TimeSlice says what it is: every event sharing one timestamp. The
History field batches is renamed to time_slices.
Part of T2.
EOF
)"
Task 6: Introduce Time trait and Untimed
Files:
- Create:
src/time.rs - Modify:
src/lib.rs
This task lands the trait. It is not yet wired into History; that happens in Task 8.
- Step 1: Create
src/time.rs
//! Generic time axis for `History`.
//!
//! Users pick the `Time` type based on their domain: `Untimed` when no
//! time axis is meaningful, `i64` for integer day/second timestamps.
//! Additional impls can be added behind feature flags.
/// A timestamp on the global ordering axis.
///
/// Must be `Ord + Copy` so slices can sort events, and `'static` so
/// `History` can store it by value without lifetimes.
pub trait Time: Copy + Ord + 'static {
/// How much time elapsed between `self` and `later`.
///
/// Used by `Drift<T>::variance_delta` to compute skill drift. Returning
/// zero means no drift accumulates between the two points. Return value
/// must be non-negative for `self <= later`.
fn elapsed_to(&self, later: &Self) -> i64;
}
/// Zero-sized type representing "no time axis."
///
/// Used as the default `Time` when events are unordered. Elapsed is always 0,
/// so no drift accumulates across slices.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Untimed;
impl Time for Untimed {
fn elapsed_to(&self, _later: &Self) -> i64 {
0
}
}
impl Time for i64 {
fn elapsed_to(&self, later: &Self) -> i64 {
later - self
}
}
- Step 2: Declare the module in
src/lib.rs
pub mod time;
and re-export:
pub use time::{Time, Untimed};
- Step 3: Build
cargo build --features approx
cargo test --features approx --lib
All 53 tests still pass (nothing uses Time yet).
- Step 4: Commit
cargo +nightly fmt
git add src/time.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Time trait with Untimed and i64 impls
Foundation for generic History time axis. Untimed is the ZST case
(no drift across slices); i64 is the standard timestamp case.
Additional impls (time::OffsetDateTime, chrono) can be added behind
feature flags in follow-up work.
Part of T2.
EOF
)"
Task 7: Generify Drift over Time
Files:
-
Modify:
src/drift.rs -
Modify: every
Driftcall site (elapsed: i64→from: &T, to: &T) -
Step 1: Rewrite
src/drift.rs
use std::fmt::Debug;
use crate::time::Time;
/// Governs how much a competitor's skill can drift between two time points.
///
/// Generic over `T: Time` so seasonal or calendar-aware drift is expressible
/// without going through `i64`.
pub trait Drift<T: Time>: Copy + Debug {
/// Variance added to the skill prior for elapsed time `from -> to`.
///
/// Called with `from <= to`; returning zero means no drift accumulates.
fn variance_delta(&self, from: &T, to: &T) -> f64;
}
/// Simple constant-per-unit-time drift.
///
/// For `Time = i64`: variance added is `(to - from) * gamma^2`.
/// For `Time = Untimed`: elapsed is always 0, so drift is always 0.
#[derive(Clone, Copy, Debug)]
pub struct ConstantDrift(pub f64);
impl<T: Time> Drift<T> for ConstantDrift {
fn variance_delta(&self, from: &T, to: &T) -> f64 {
let elapsed = from.elapsed_to(to).max(0) as f64;
elapsed * self.0 * self.0
}
}
- Step 2: Update
Competitor::receiveandcleansignatures
In src/competitor.rs, Competitor::receive currently takes elapsed: i64. Change the generic bound so it can take from: &T, to: &T. Since Competitor is parameterized by D: Drift, we need to also parameterize by T: Time:
use crate::{
N_INF,
drift::{ConstantDrift, Drift},
gaussian::Gaussian,
rating::Rating,
time::Time,
};
#[derive(Debug)]
pub struct Competitor<T: Time, D: Drift<T> = ConstantDrift> {
pub rating: Rating<T, D>,
pub message: Gaussian,
pub last_time: Option<T>,
}
impl<T: Time, D: Drift<T>> Competitor<T, D> {
pub(crate) fn receive(&self, now: &T) -> Gaussian {
if self.message != N_INF {
let elapsed_variance = match &self.last_time {
Some(last) => self.rating.drift.variance_delta(last, now),
None => 0.0,
};
self.message.forget(elapsed_variance)
} else {
self.rating.prior
}
}
}
impl Default for Competitor<i64, ConstantDrift> {
fn default() -> Self {
Self {
rating: Rating::default(),
message: N_INF,
last_time: None,
}
}
}
pub(crate) fn clean<'a, T: Time + 'a, D: Drift<T> + 'a, C: Iterator<Item = &'a mut Competitor<T, D>>>(
competitors: C,
last_time: bool,
) {
for c in competitors {
c.message = N_INF;
if last_time {
c.last_time = None;
}
}
}
The last_time: i64 with i64::MIN sentinel becomes last_time: Option<T> with None sentinel — cleaner and type-safe.
- Step 3: Parameterize
Rating<T, D>likewise
In src/rating.rs:
use crate::{
BETA, GAMMA,
drift::{ConstantDrift, Drift},
gaussian::Gaussian,
time::Time,
};
#[derive(Clone, Copy, Debug)]
pub struct Rating<T: Time = i64, D: Drift<T> = ConstantDrift> {
pub(crate) prior: Gaussian,
pub(crate) beta: f64,
pub(crate) drift: D,
pub(crate) _time: std::marker::PhantomData<T>,
}
impl<T: Time, D: Drift<T>> Rating<T, D> {
pub fn new(prior: Gaussian, beta: f64, drift: D) -> Self {
Self {
prior,
beta,
drift,
_time: std::marker::PhantomData,
}
}
pub(crate) fn performance(&self) -> Gaussian {
self.prior.forget(self.beta.powi(2))
}
}
impl Default for Rating<i64, ConstantDrift> {
fn default() -> Self {
Self {
prior: Gaussian::default(),
beta: BETA,
drift: ConstantDrift(GAMMA),
_time: std::marker::PhantomData,
}
}
}
- Step 4: Propagate the generic through the rest of the codebase
Every generic <D: Drift> becomes <T: Time, D: Drift<T>> where the enclosing type owns T. Affected:
CompetitorStore<D>→CompetitorStore<T, D>insrc/storage/mod.rsHistory<D>→History<T, D>insrc/history.rs(and every fn sig inside)HistoryBuilder<D>→HistoryBuilder<T, D>Game<D>→Game<T, D>insrc/game.rsTimeSlicemay needTif it holds atime: Tfield (it currently holdstime: i64— change totime: T)agent::clean→ already updatedEvent(insidetime_slice.rs) passes competitors and therefore inheritsT
This is the single biggest refactor in T2. Expect ~100 compiler errors; work through them file by file. The compiler is your friend — each error either needs a T: Time parameter added, or a type argument passed.
For Competitor::receive(elapsed: i64) callers — now Competitor::receive(now: &T). Each caller currently computes elapsed = compute_elapsed(last_time, batch.time); change those to competitor.receive(&time_slice.time).
- Step 5: Fix test modules
Tests use i64 timestamps and ConstantDrift — the defaults — so most test setup is unchanged. Places that reference .last_time == i64::MIN become .last_time.is_none(). Places constructing Competitor { last_time: i64::MIN, ... } become Competitor { last_time: None, ... }.
- Step 6: Build + test
cargo build --features approx
cargo test --features approx --lib
Expected: all tests still pass, numerically identical. If numerics drift, the Drift implementation has a bug — most likely a negative elapsed was previously allowed.
- Step 7: Format + commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): generify Drift, Rating, Competitor, TimeSlice, History over T: Time
Drift now takes &T -> &T and is generic over the time axis. Untimed
impls return elapsed=0. ConstantDrift impl covers all T via the Time
trait.
Competitor.last_time moves from i64 with MIN sentinel to Option<T>
with None sentinel.
Part of T2.
EOF
)"
Task 8: Parameterize History<T, D> — remove time: bool
Files:
- Modify:
src/history.rs
The time: bool field is a legacy encoding of "no time axis." With T: Time, the type system carries that information (T = Untimed means no drift). This task removes the bool and fixes the !time internal path.
- Step 1: Remove
time: boolfromHistoryandHistoryBuilder
In src/history.rs, delete the time: bool field from both structs. Delete the .time(bool) builder method. Update the Default impl and build() to stop threading time through.
- Step 2: Remove the
!self.timebranches inadd_events_with_prior
Every if self.time { … } else { … } becomes the self.time branch — the generic parameter makes this uniform. Specifically, times is always required now (it's a Vec<T>, not Vec<i64>), and sort_time sorts in T order.
Rewrite sort_time in src/lib.rs to be generic:
pub(crate) fn sort_time<T: Copy + Ord>(xs: &[T], reverse: bool) -> Vec<usize> {
let mut x: Vec<(usize, T)> = xs.iter().enumerate().map(|(i, &t)| (i, t)).collect();
if reverse {
x.sort_by_key(|&(_, t)| std::cmp::Reverse(t));
} else {
x.sort_by_key(|&(_, t)| t);
}
x.into_iter().map(|(i, _)| i).collect()
}
- Step 3: Keep the old nested-Vec
add_events_with_priorsignature
For now. Change its signature only enough to take times: Vec<T> instead of Vec<i64>. Users who previously called with !time should switch to History::<Untimed, _> — but the tests should switch to History::<i64, _> with explicit [1, 2, 3, ...] timestamps to preserve the old elapsed=1-per-event numerical behavior.
The nested-Vec signature gets fully replaced in Task 15. This keeps the test suite green between tasks.
- Step 4: Update tests
Tests using .time(false) need translation. The mechanical rule:
| Old | New |
|---|---|
History::builder().time(false).build() |
History::<i64, _>::builder().build() with explicit vec![1, 2, 3, …] for times in every add_events call |
History::default() (was time: true) |
History::<i64, _>::default() |
For every test that called add_events(comp, results, vec![], vec![]) under time: false, change to add_events(comp, results, vec![1, 2, 3, …], vec![]) where the length matches comp.len(). Numerics are preserved because the old !time branch used i as i64 + 1 timestamps internally.
This is mechanical. Each failing test will show which translation is needed.
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
Expected: all tests pass with translated timestamps.
- Step 6: Format + commit
cargo +nightly fmt
git add src/history.rs src/lib.rs src/time_slice.rs src/storage/
git commit -m "$(cat <<'EOF'
refactor(history): parameterize History<T, D> and remove time: bool
The bool encoded 'no time axis' which is now expressed at the type
level (T = Untimed). Removed the .time() builder method. Tests that
used time(false) now use i64 timestamps 1..=n to preserve numerics.
Part of T2.
EOF
)"
Task 9: Add Outcome enum
Files:
-
Create:
src/outcome.rs -
Modify:
src/lib.rs -
Step 1: Create
src/outcome.rs
//! Outcome of a match.
//!
//! In T2, only `Ranked` is supported; `Scored` will be added together with
//! `MarginFactor` in T4. The enum is `#[non_exhaustive]` so adding `Scored`
//! is non-breaking for downstream `match` expressions.
use smallvec::SmallVec;
/// Final outcome of a match.
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
Ranked(SmallVec<[u32; 4]>),
}
impl Outcome {
/// `N`-team outcome where team `winner` won and everyone else tied for last.
///
/// Panics if `winner >= n`.
pub fn winner(winner: u32, n: u32) -> Self {
assert!(winner < n, "winner index {winner} out of range 0..{n}");
let ranks: SmallVec<[u32; 4]> = (0..n)
.map(|i| if i == winner { 0 } else { 1 })
.collect();
Self::Ranked(ranks)
}
/// All `n` teams tied.
pub fn draw(n: u32) -> Self {
Self::Ranked(SmallVec::from_vec(vec![0; n as usize]))
}
/// Explicit per-team ranking.
pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
Self::Ranked(ranks.into_iter().collect())
}
pub fn team_count(&self) -> usize {
match self {
Self::Ranked(r) => r.len(),
}
}
pub(crate) fn as_ranks(&self) -> &[u32] {
match self {
Self::Ranked(r) => r,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn winner_two_teams() {
let o = Outcome::winner(0, 2);
assert_eq!(o.as_ranks(), &[0u32, 1]);
assert_eq!(o.team_count(), 2);
}
#[test]
fn winner_three_teams_second_wins() {
let o = Outcome::winner(1, 3);
assert_eq!(o.as_ranks(), &[1u32, 0, 1]);
}
#[test]
fn draw_three_teams() {
let o = Outcome::draw(3);
assert_eq!(o.as_ranks(), &[0u32, 0, 0]);
}
#[test]
fn ranking_from_iter() {
let o = Outcome::ranking([2, 0, 1]);
assert_eq!(o.as_ranks(), &[2u32, 0, 1]);
}
#[test]
#[should_panic(expected = "winner index 2 out of range")]
fn winner_out_of_range_panics() {
let _ = Outcome::winner(2, 2);
}
}
- Step 2: Add
smallvecas a dependency
In Cargo.toml, under [dependencies]:
smallvec = "1"
-
Step 3: Add
pub mod outcome;andpub use outcome::Outcome;tosrc/lib.rs -
Step 4: Build + test
cargo build --features approx
cargo test --features approx --lib outcome
Expected: 5 passing tests.
- Step 5: Format + commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add Outcome enum with Ranked variant
Outcome::winner(i, n), Outcome::draw(n), Outcome::ranking(iter) are
the convenience constructors. Marked #[non_exhaustive] so Scored can
be added in T4 without breaking match exhaustiveness.
Adds smallvec dependency.
Part of T2.
EOF
)"
Task 10: Add Event<T, K>, Team<K>, Member<K>
Files:
-
Create:
src/event.rs -
Modify:
src/lib.rs -
Step 1: Create
src/event.rs
//! Typed event description for bulk ingestion.
//!
//! `Event<T, K>` is the new public event shape (spec Section 4). Replaces
//! the nested `Vec<Vec<Vec<Index>>>`, `Vec<Vec<f64>>`, `Vec<Vec<Vec<f64>>>`
//! that the old `add_events_with_prior` took.
use smallvec::SmallVec;
use crate::{gaussian::Gaussian, outcome::Outcome, time::Time};
/// A single match at time `time` involving some number of teams.
#[derive(Clone, Debug)]
pub struct Event<T: Time, K> {
pub time: T,
pub teams: SmallVec<[Team<K>; 4]>,
pub outcome: Outcome,
}
/// A team: list of members competing together.
#[derive(Clone, Debug)]
pub struct Team<K> {
pub members: SmallVec<[Member<K>; 4]>,
}
impl<K> Team<K> {
pub fn new() -> Self {
Self { members: SmallVec::new() }
}
pub fn with_members<I: IntoIterator<Item = Member<K>>>(members: I) -> Self {
Self { members: members.into_iter().collect() }
}
}
impl<K> Default for Team<K> {
fn default() -> Self {
Self::new()
}
}
/// One member of a team, identified by user key `K`.
///
/// `weight` defaults to 1.0; a per-event `prior` can override the competitor's
/// current skill estimate for this event only.
#[derive(Clone, Debug)]
pub struct Member<K> {
pub key: K,
pub weight: f64,
pub prior: Option<Gaussian>,
}
impl<K> Member<K> {
pub fn new(key: K) -> Self {
Self { key, weight: 1.0, prior: None }
}
pub fn with_weight(mut self, weight: f64) -> Self {
self.weight = weight;
self
}
pub fn with_prior(mut self, prior: Gaussian) -> Self {
self.prior = Some(prior);
self
}
}
/// Convenience: a member is a user key with default weight 1.0 and no prior.
impl<K> From<K> for Member<K> {
fn from(key: K) -> Self {
Self::new(key)
}
}
- Step 2: Register in
src/lib.rs
pub mod event;
...
pub use event::{Event, Member, Team};
- Step 3: Build
cargo build --features approx
cargo test --features approx --lib
- Step 4: Format + commit
cargo +nightly fmt
git add src/event.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Event<T, K>, Team<K>, Member<K> typed event description
Replaces the old nested Vec<Vec<Vec<_>>> event description on the
public API boundary. Member<K>::from(K) enables ergonomic literal
lists.
Part of T2.
EOF
)"
Task 11: Add Observer trait and NullObserver
Files:
-
Create:
src/observer.rs -
Modify:
src/lib.rs -
Step 1: Create
src/observer.rs
//! Observer trait for progress reporting during convergence.
//!
//! Replaces the old `verbose: bool` + `println!` path. Callers wire in any
//! observer that implements the trait; default methods are no-ops so users
//! override only what they need.
use crate::time::Time;
/// Receives progress callbacks during `History::converge`.
///
/// All methods have default no-op implementations; implement only what's
/// interesting. Send/Sync is NOT required in T2 (added in T3 along with
/// Rayon support).
pub trait Observer<T: Time> {
/// Called after each convergence iteration across the whole history.
fn on_iteration_end(&self, _iter: usize, _max_step: (f64, f64)) {}
/// Called after each time slice is processed within an iteration.
fn on_batch_processed(&self, _time: &T, _slice_idx: usize, _n_events: usize) {}
/// Called once when convergence completes (or max iters is reached).
fn on_converged(&self, _iters: usize, _final_step: (f64, f64), _converged: bool) {}
}
/// ZST no-op observer; the default when none is configured.
#[derive(Copy, Clone, Debug, Default)]
pub struct NullObserver;
impl<T: Time> Observer<T> for NullObserver {}
- Step 2: Register in
src/lib.rs
pub mod observer;
...
pub use observer::{NullObserver, Observer};
- Step 3: Build
cargo build --features approx
cargo test --features approx --lib
- Step 4: Commit
cargo +nightly fmt
git add src/observer.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Observer trait and NullObserver default
Observer replaces verbose: bool with structured progress callbacks.
NullObserver is a ZST default; users override only the methods they
care about.
Send + Sync bounds deferred to T3.
Part of T2.
EOF
)"
Task 12: Add ConvergenceOptions and ConvergenceReport; wire into history
Files:
-
Create:
src/convergence.rs -
Modify:
src/history.rs,src/lib.rs -
Step 1: Create
src/convergence.rs
//! Convergence configuration and reporting.
use std::time::Duration;
use smallvec::SmallVec;
#[derive(Clone, Copy, Debug)]
pub struct ConvergenceOptions {
pub max_iter: usize,
pub epsilon: f64,
}
impl Default for ConvergenceOptions {
fn default() -> Self {
Self {
max_iter: crate::ITERATIONS,
epsilon: crate::EPSILON,
}
}
}
/// Post-hoc summary of a `History::converge` call.
#[derive(Clone, Debug)]
pub struct ConvergenceReport {
pub iterations: usize,
pub final_step: (f64, f64),
pub log_evidence: f64,
pub converged: bool,
pub per_iteration_time: SmallVec<[Duration; 32]>,
pub slices_skipped: usize,
}
- Step 2: Register in
src/lib.rs
pub mod convergence;
...
pub use convergence::{ConvergenceOptions, ConvergenceReport};
- Step 3: Add
convergence: ConvergenceOptionsandobservertoHistoryBuilder
In src/history.rs, add to HistoryBuilder:
pub fn convergence(mut self, opts: ConvergenceOptions) -> Self {
self.convergence = opts;
self
}
pub fn observer<O2: Observer<T>>(self, observer: O2) -> HistoryBuilder<T, D, O2> { … }
HistoryBuilder gains two extra generic parameters (O: Observer<T> and observer storage). The simplest shape:
pub struct HistoryBuilder<T: Time = i64, D: Drift<T> = ConstantDrift, O: Observer<T> = NullObserver> {
mu: f64,
sigma: f64,
beta: f64,
drift: D,
p_draw: f64,
convergence: ConvergenceOptions,
observer: O,
_time: std::marker::PhantomData<T>,
}
History<T, D, O> gains the same O parameter; observer is stored and called from the convergence loop.
- Step 4: Add
history.converge() -> Result<ConvergenceReport, InferenceError>
pub fn converge(&mut self) -> Result<ConvergenceReport, InferenceError> {
use std::time::Instant;
let opts = self.convergence;
let mut step = (f64::INFINITY, f64::INFINITY);
let mut i = 0;
let mut per_iter: SmallVec<[Duration; 32]> = SmallVec::new();
while crate::tuple_gt(step, opts.epsilon) && i < opts.max_iter {
let t0 = Instant::now();
step = self.iteration();
per_iter.push(t0.elapsed());
i += 1;
self.observer.on_iteration_end(i, step);
}
let converged = !crate::tuple_gt(step, opts.epsilon);
let log_evidence = self.log_evidence_impl(false, &[]);
self.observer.on_converged(i, step, converged);
Ok(ConvergenceReport {
iterations: i,
final_step: step,
log_evidence,
converged,
per_iteration_time: per_iter,
slices_skipped: 0,
})
}
Keep the old convergence(iters, eps, verbose) method alive for one more task. It is removed in Task 20 alongside test translation.
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
- Step 6: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add ConvergenceOptions, ConvergenceReport, and History::converge
HistoryBuilder gains .convergence(opts) and .observer(o). History::converge
returns a structured ConvergenceReport with per-iteration timings.
The old History::convergence(iters, eps, verbose) still works and is
removed in Task 20.
Part of T2.
EOF
)"
Task 13: Expand InferenceError; convert boundary panics to Result
Files:
-
Modify:
src/error.rs -
Modify:
src/history.rs,src/game.rs(wheredebug_assert!orassert!guard user input) -
Step 1: Expand
InferenceError
Replace src/error.rs:
use std::fmt;
#[derive(Debug, Clone, PartialEq)]
pub enum InferenceError {
/// Expected and actual lengths of some array-shaped input differ.
MismatchedShape {
kind: &'static str,
expected: usize,
got: usize,
},
/// A probability value is outside `[0, 1]`.
InvalidProbability { value: f64 },
/// Convergence exceeded `max_iter` without falling below `epsilon`.
ConvergenceFailed {
last_step: (f64, f64),
iterations: usize,
},
/// Negative precision: a Gaussian with `pi < 0` slipped into an API call.
NegativePrecision { pi: f64 },
}
impl fmt::Display for InferenceError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::MismatchedShape { kind, expected, got } => {
write!(f, "{kind}: expected length {expected}, got {got}")
}
Self::InvalidProbability { value } => {
write!(f, "probability must be in [0, 1]; got {value}")
}
Self::ConvergenceFailed { last_step, iterations } => {
write!(
f,
"convergence failed after {iterations} iterations; last step = {last_step:?}"
)
}
Self::NegativePrecision { pi } => {
write!(f, "precision must be non-negative; got {pi}")
}
}
}
}
impl std::error::Error for InferenceError {}
- Step 2: Convert boundary panics to
Result
Search for debug_assert! and assert! at the API boundary:
grep -rn 'debug_assert!\|^\s*assert!' src/history.rs src/game.rs
In Game::new (the internal path kept for History), the existing debug_assert!s remain — they catch invariants, not user errors. But for a public Game::ranked (added later in Task 19), return InferenceError::MismatchedShape on bad input rather than panicking.
In History::add_events_with_prior, the four assert!(...) calls at the top of the function become Result<(), InferenceError> returns. Change the function signature:
pub(crate) fn add_events_with_prior(
&mut self,
…
) -> Result<(), InferenceError> {
if results.len() != composition.len() && !results.is_empty() {
return Err(InferenceError::MismatchedShape {
kind: "results",
expected: composition.len(),
got: results.len(),
});
}
// … other checks …
// … rest of body …
Ok(())
}
Every caller must add ?. Tests that were calling h.add_events(…) (the thin wrapper) now call h.add_events(…)? or .unwrap() in tests.
- Step 3: Build + test
cargo build --features approx
cargo test --features approx --lib
- Step 4: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(error): expand InferenceError; convert boundary asserts to Result
Adds MismatchedShape, InvalidProbability, ConvergenceFailed. History's
add_events_with_prior now returns Result<(), InferenceError>. Internal
debug_asserts for invariants stay; user-facing shape/bounds checks
become errors.
Part of T2.
EOF
)"
Task 14: Add record_winner and record_draw convenience API
Files:
-
Modify:
src/history.rs -
Step 1: Add the methods
In impl<T: Time, D: Drift<T>, O: Observer<T>> History<T, D, O>:
/// Record a single 1v1 match where `winner` beat `loser` at time `time`.
pub fn record_winner<Q>(&mut self, winner: &Q, loser: &Q, time: T) -> Result<(), InferenceError>
where
K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
Q: std::hash::Hash + Eq + ?Sized,
{
let w_idx = self.intern(winner);
let l_idx = self.intern(loser);
self.add_events_with_prior(
vec![vec![vec![w_idx], vec![l_idx]]],
vec![vec![1.0, 0.0]],
vec![time],
vec![],
std::collections::HashMap::new(),
)
}
/// Record a 1v1 tie.
pub fn record_draw<Q>(&mut self, a: &Q, b: &Q, time: T) -> Result<(), InferenceError>
where
K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
Q: std::hash::Hash + Eq + ?Sized,
{
let a_idx = self.intern(a);
let b_idx = self.intern(b);
self.add_events_with_prior(
vec![vec![vec![a_idx], vec![b_idx]]],
vec![vec![0.0, 0.0]],
vec![time],
vec![],
std::collections::HashMap::new(),
)
}
/// Get or create an `Index` for a user key. See spec Section 4 "Open question 3."
pub fn intern<Q>(&mut self, key: &Q) -> Index
where
K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
Q: std::hash::Hash + Eq + ?Sized,
{
self.keys.get_or_create(key)
}
/// Look up an `Index` for a key without creating it.
pub fn lookup<Q>(&self, key: &Q) -> Option<Index>
where
K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
Q: std::hash::Hash + Eq + ?Sized,
{
self.keys.get(key)
}
Note: History does not currently hold a KeyTable<K>. This task requires adding keys: KeyTable<K> to the History struct (which also adds K: Eq + Hash bound). The generic parameter K is threaded through History<T, D, O, K>. This is a significant shape change — work through the compiler errors.
- Step 2: Add an integration test for
record_winner
Create tests/record_winner.rs:
use trueskill_tt::{ConstantDrift, ConvergenceOptions, History};
#[test]
fn record_winner_updates_skills() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(25.0 / 300.0))
.build();
h.record_winner(&"alice", &"bob", 0).unwrap();
h.converge().unwrap();
let alice = h.current_skill(&"alice").unwrap();
let bob = h.current_skill(&"bob").unwrap();
assert!(alice.mu() > 25.0, "winner mu should rise: got {}", alice.mu());
assert!(bob.mu() < 25.0, "loser mu should fall: got {}", bob.mu());
}
This test depends on current_skill (added in Task 17). For now, comment out the last three lines or replace with a learning_curves() check; come back to it in Task 17.
- Step 3: Build + test
cargo build --features approx
cargo test --features approx --test record_winner
- Step 4: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add record_winner, record_draw, intern, lookup on History
Spec Section 4 "three-tier event ingestion" tier 2: one-off match
convenience. History now holds a KeyTable<K> internally; generic
parameter K added to History.
Part of T2.
EOF
)"
Task 15: Replace nested-Vec add_events with typed add_events(iter)
Files:
-
Modify:
src/history.rs -
Step 1: Add the new
add_eventssignature
pub fn add_events<I>(&mut self, events: I) -> Result<(), InferenceError>
where
I: IntoIterator<Item = Event<T, K>>,
{
// Translate each Event<T, K> into the internal composition/results/times/weights
// triple, then delegate to add_events_with_prior (which becomes pub(crate)).
let mut composition: Vec<Vec<Vec<Index>>> = Vec::new();
let mut results: Vec<Vec<f64>> = Vec::new();
let mut times: Vec<T> = Vec::new();
let mut weights: Vec<Vec<Vec<f64>>> = Vec::new();
let mut priors: HashMap<Index, Rating<T, D>> = HashMap::new();
for ev in events {
let mut event_comp: Vec<Vec<Index>> = Vec::new();
let mut event_weights: Vec<Vec<f64>> = Vec::new();
for team in &ev.teams {
let mut team_indices: Vec<Index> = Vec::new();
let mut team_weights: Vec<f64> = Vec::new();
for member in &team.members {
let idx = self.intern(&member.key);
team_indices.push(idx);
team_weights.push(member.weight);
if let Some(prior) = member.prior {
priors.insert(
idx,
Rating::new(prior, self.beta, self.drift),
);
}
}
event_comp.push(team_indices);
event_weights.push(team_weights);
}
composition.push(event_comp);
weights.push(event_weights);
// Convert Outcome::Ranked to the legacy "lower number wins" f64 result.
// Legacy convention: result[i] is a score; higher score = better rank.
// We invert ranks: team with rank 0 gets highest score.
let ranks = ev.outcome.as_ranks();
if ranks.len() != ev.teams.len() {
return Err(InferenceError::MismatchedShape {
kind: "outcome ranks vs teams",
expected: ev.teams.len(),
got: ranks.len(),
});
}
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
results.push(inverted);
times.push(ev.time);
}
self.add_events_with_prior(composition, results, times, weights, priors)
}
Rename the old public add_events (nested-Vec) to pub(crate) fn add_events_legacy or inline it into the internal call path; delete it from the public API surface.
- Step 2: Add an integration test for the new signature
In tests/api_shape.rs:
use trueskill_tt::{Event, History, Member, Outcome, Team};
use smallvec::smallvec;
#[test]
fn add_events_bulk_via_iter() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(0.0).sigma(2.0).beta(1.0).p_draw(0.0)
.build();
let events = vec![
Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
],
outcome: Outcome::winner(0, 2),
},
Event {
time: 2,
teams: smallvec![
Team::with_members([Member::new("b")]),
Team::with_members([Member::new("c")]),
],
outcome: Outcome::winner(0, 2),
},
];
h.add_events(events).unwrap();
let report = h.converge().unwrap();
assert!(report.converged, "expected convergence, got {report:?}");
}
- Step 3: Build + test
cargo build --features approx
cargo test --features approx --test api_shape
- Step 4: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): replace nested-Vec add_events with typed add_events(iter)
The old nested shape is gone from the public API. add_events now
takes IntoIterator<Item = Event<T, K>>. Internally routes through
add_events_with_prior (now pub(crate)).
Part of T2.
EOF
)"
Task 16: Add history.event(time).team(...).commit() fluent builder
Files:
-
Create:
src/event_builder.rs -
Modify:
src/history.rs,src/lib.rs -
Step 1: Create
src/event_builder.rs
//! Fluent builder for single-event ingestion.
//!
//! ```ignore
//! history
//! .event(time)
//! .team(["alice", "bob"]).weights([1.0, 0.7])
//! .team(["carol"])
//! .ranking([1, 0])
//! .commit()?;
//! ```
use std::collections::HashMap;
use smallvec::SmallVec;
use crate::{
InferenceError, Outcome,
event::{Event, Member, Team},
history::History,
observer::Observer,
drift::Drift,
time::Time,
};
pub struct EventBuilder<'h, T, D, O, K>
where
T: Time,
D: Drift<T>,
O: Observer<T>,
K: Eq + std::hash::Hash + Clone,
{
history: &'h mut History<T, D, O, K>,
event: Event<T, K>,
current_team_idx: Option<usize>,
}
impl<'h, T, D, O, K> EventBuilder<'h, T, D, O, K>
where
T: Time,
D: Drift<T>,
O: Observer<T>,
K: Eq + std::hash::Hash + Clone,
{
pub(crate) fn new(history: &'h mut History<T, D, O, K>, time: T, default_n_teams: usize) -> Self {
Self {
history,
event: Event {
time,
teams: SmallVec::with_capacity(default_n_teams.max(2)),
outcome: Outcome::Ranked(SmallVec::new()),
},
current_team_idx: None,
}
}
/// Add a team by its member keys.
pub fn team<I: IntoIterator<Item = K>>(mut self, keys: I) -> Self {
let members: SmallVec<[Member<K>; 4]> = keys.into_iter().map(Member::new).collect();
self.event.teams.push(Team { members });
self.current_team_idx = Some(self.event.teams.len() - 1);
self
}
/// Set per-member weights for the most recently added team.
pub fn weights<I: IntoIterator<Item = f64>>(mut self, weights: I) -> Self {
let idx = self
.current_team_idx
.expect(".weights(...) called before any .team(...)");
let ws: Vec<f64> = weights.into_iter().collect();
let team = &mut self.event.teams[idx];
debug_assert_eq!(
ws.len(),
team.members.len(),
"weights length must match team size"
);
for (m, w) in team.members.iter_mut().zip(ws.into_iter()) {
m.weight = w;
}
self
}
/// Set explicit ranks per team (length must equal number of teams).
pub fn ranking<I: IntoIterator<Item = u32>>(mut self, ranks: I) -> Self {
self.event.outcome = Outcome::ranking(ranks);
self
}
/// Mark team `winner_idx` as winner of an N-way match, others tied for last.
pub fn winner(mut self, winner_idx: u32) -> Self {
self.event.outcome = Outcome::winner(winner_idx, self.event.teams.len() as u32);
self
}
/// All teams tied.
pub fn draw(mut self) -> Self {
self.event.outcome = Outcome::draw(self.event.teams.len() as u32);
self
}
/// Commit the event to the history.
pub fn commit(self) -> Result<(), InferenceError> {
self.history.add_events(std::iter::once(self.event))
}
}
- Step 2: Add
History::eventmethod
In src/history.rs:
pub fn event(&mut self, time: T) -> EventBuilder<'_, T, D, O, K> {
EventBuilder::new(self, time, 2)
}
- Step 3: Register the module in
src/lib.rs
pub mod event_builder;
pub use event_builder::EventBuilder;
- Step 4: Add a test
Append to tests/api_shape.rs:
#[test]
fn fluent_event_builder() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(25.0).sigma(25.0 / 3.0).beta(25.0 / 6.0).p_draw(0.0)
.build();
h.event(1)
.team(["alice", "bob"])
.weights([1.0, 0.7])
.team(["carol"])
.ranking([1, 0])
.commit()
.unwrap();
let report = h.converge().unwrap();
assert!(report.converged);
}
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --test api_shape fluent_event_builder
- Step 6: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add fluent history.event(t).team(...).commit() builder
Third tier of the ingestion API (per spec Section 4). Powers one-off
events with irregular shapes where neither record_winner nor typed
add_events fits cleanly.
Part of T2.
EOF
)"
Task 17: Query methods — current_skill, learning_curve, log_evidence_for, predict_*
Files:
-
Modify:
src/history.rs,src/game.rs,src/lib.rs -
Step 1: Add
current_skilland single-keylearning_curveto History
impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + std::hash::Hash + Clone> History<T, D, O, K> {
/// Skill estimate at the latest time slice the competitor appears in.
pub fn current_skill<Q>(&self, key: &Q) -> Option<Gaussian>
where
K: std::borrow::Borrow<Q>,
Q: std::hash::Hash + Eq + ?Sized,
{
let idx = self.keys.get(key)?;
self.time_slices
.iter()
.rev()
.find_map(|ts| ts.skills.get(idx).map(|sk| sk.posterior()))
}
/// Learning curve for a single key: (time, posterior) pairs in time order.
pub fn learning_curve<Q>(&self, key: &Q) -> Vec<(T, Gaussian)>
where
K: std::borrow::Borrow<Q>,
Q: std::hash::Hash + Eq + ?Sized,
{
let Some(idx) = self.keys.get(key) else {
return Vec::new();
};
self.time_slices
.iter()
.filter_map(|ts| ts.skills.get(idx).map(|sk| (ts.time, sk.posterior())))
.collect()
}
}
- Step 2: Update
learning_curvesto returnHashMap<K, Vec<(T, Gaussian)>>
Currently returns HashMap<Index, Vec<(i64, Gaussian)>>. New shape:
pub fn learning_curves(&self) -> HashMap<K, Vec<(T, Gaussian)>> {
let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
for slice in &self.time_slices {
for (idx, skill) in slice.skills.iter() {
if let Some(key) = self.keys.key(idx).cloned() {
data.entry(key)
.or_default()
.push((slice.time, skill.posterior()));
}
}
}
data
}
- Step 3: Add
log_evidence()andlog_evidence_for(&[&K])
pub fn log_evidence(&mut self) -> f64 {
self.log_evidence_impl(false, &[])
}
pub fn log_evidence_for<Q>(&mut self, keys: &[&Q]) -> f64
where
K: std::borrow::Borrow<Q>,
Q: std::hash::Hash + Eq + ?Sized,
{
let targets: Vec<Index> = keys
.iter()
.filter_map(|k| self.keys.get(k))
.collect();
self.log_evidence_impl(false, &targets)
}
fn log_evidence_impl(&mut self, forward: bool, targets: &[Index]) -> f64 {
self.time_slices
.iter()
.map(|ts| ts.log_evidence(self.online, targets, forward, &self.competitors))
.sum()
}
The existing public log_evidence(forward, targets) method is renamed to log_evidence_impl (pub(crate)) and the thin public wrappers take its place. The forward bool ("online" mode) stays internal; no public exposure planned for T2.
- Step 4: Add
predict_qualityandpredict_outcome
These are light wrappers over the existing quality() free function plus a new "predict outcome" that runs a one-off Game without adding to history:
pub fn predict_quality(&self, teams: &[&[&K]]) -> f64
where
K: std::borrow::Borrow<K>,
{
let groups: Vec<Vec<Gaussian>> = teams
.iter()
.map(|team| {
team.iter()
.filter_map(|k| self.keys.get(*k))
.filter_map(|idx| {
self.time_slices
.iter()
.rev()
.find_map(|ts| ts.skills.get(idx).map(|s| s.posterior()))
})
.collect()
})
.collect();
let group_refs: Vec<&[Gaussian]> = groups.iter().map(|g| g.as_slice()).collect();
crate::quality(&group_refs, self.beta)
}
pub fn predict_outcome(&self, teams: &[&[&K]]) -> Vec<f64> {
// Win probabilities per team: apply cdf over team-perf sums.
// For n teams, return P(team i wins over all others).
// Minimal impl for T2: 2-team case only; multi-team deferred to T4.
assert_eq!(teams.len(), 2, "predict_outcome T2: 2 teams only");
let gather = |team: &[&K]| -> Gaussian {
team.iter()
.filter_map(|k| self.keys.get(*k))
.filter_map(|idx| {
self.time_slices
.iter()
.rev()
.find_map(|ts| ts.skills.get(idx).map(|s| s.posterior()))
})
.fold(crate::N00, |acc, g| acc + g.forget(self.beta.powi(2)))
};
let a = gather(teams[0]);
let b = gather(teams[1]);
let diff = a - b;
let p_a = 1.0 - crate::cdf(0.0, diff.mu(), diff.sigma());
vec![p_a, 1.0 - p_a]
}
Document predict_outcome as 2-team-only in T2; N-team lands in T4 alongside Residual schedules.
- Step 5: Add tests
In tests/api_shape.rs:
#[test]
fn current_skill_learning_curve_learning_curves() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(25.0).sigma(25.0 / 3.0).beta(25.0 / 6.0).p_draw(0.0)
.build();
h.record_winner(&"a", &"b", 1).unwrap();
h.record_winner(&"a", &"b", 2).unwrap();
h.converge().unwrap();
let a = h.current_skill(&"a").unwrap();
assert!(a.mu() > 25.0);
let curve = h.learning_curve(&"a");
assert_eq!(curve.len(), 2);
assert_eq!(curve[0].0, 1);
assert_eq!(curve[1].0, 2);
let all = h.learning_curves();
assert_eq!(all.len(), 2);
assert!(all.contains_key("a"));
}
#[test]
fn log_evidence_for_subset() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(0.0).sigma(6.0).beta(1.0).p_draw(0.0)
.build();
h.record_winner(&"a", &"b", 1).unwrap();
h.record_winner(&"b", &"a", 2).unwrap();
let ev_all = h.log_evidence();
let ev_a = h.log_evidence_for(&[&"a"]);
assert_ne!(ev_all, ev_a);
}
- Step 6: Build + test
cargo build --features approx
cargo test --features approx --test api_shape
- Step 7: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add current_skill, learning_curve (single), log_evidence_for, predict_*
learning_curves now returns HashMap<K, Vec<(T, Gaussian)>> keyed on
the user's key type. log_evidence is public zero-arg; log_evidence_for
takes a slice of keys.
predict_outcome is T2 2-team-only; N-team deferred.
Part of T2.
EOF
)"
Task 18: Promote Factor / Schedule / VarStore to pub under a factors module
Files:
-
Create:
src/factors.rs -
Modify:
src/factor/mod.rs,src/factor/rank_diff.rs,src/factor/team_sum.rs,src/factor/trunc.rs,src/schedule.rs,src/lib.rs -
Step 1: Promote visibility inside
src/factor/mod.rs
Change pub(crate) struct VarId(pub(crate) u32); → pub struct VarId(pub u32);. Similarly:
pub(crate) struct VarStore→pub struct VarStore(keepmarginalsfieldpub(crate)— implementation detail).pub(crate) trait Factor→pub trait Factor.pub(crate) enum BuiltinFactor→pub enum BuiltinFactor.- Submodules:
pub(crate) mod team_sum;→pub mod team_sum;and similarly forrank_diff,trunc. - Inside each submodule, promote the struct and its fields/constructors from
pub(crate)topub.
Remove any #[allow(dead_code)] that was guarding visibility warnings.
- Step 2: Promote
ScheduleandEpsilonOrMaxinsrc/schedule.rs
pub trait Schedule {
fn run(&self, factors: &mut [BuiltinFactor], vars: &mut VarStore) -> ScheduleReport;
}
pub struct EpsilonOrMax {
pub eps: f64,
pub max: usize,
}
Remove #[allow(dead_code)].
- Step 3: Create
src/factors.rsthat re-exports the public API
//! Factor-graph public API.
//!
//! Power users can construct custom factor graphs via `Game::custom` and
//! drive them with a custom `Schedule` implementation.
pub use crate::factor::{BuiltinFactor, Factor, VarId, VarStore};
pub use crate::factor::rank_diff::RankDiffFactor;
pub use crate::factor::team_sum::TeamSumFactor;
pub use crate::factor::trunc::TruncFactor;
pub use crate::schedule::{EpsilonOrMax, Schedule, ScheduleReport};
- Step 4: Register in
src/lib.rs
Keep factor and schedule modules as they are (internal use), and add:
pub mod factors;
- Step 5: Build + test
cargo build --features approx
cargo test --features approx --lib
- Step 6: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): promote Factor/Schedule/VarStore to pub in `factors` module
Exposes the factor-graph machinery so power users can define custom
factors and schedules (see Game::custom in the next task). The
internal factor/ and schedule/ modules remain pub(crate) — user-
facing API goes through the factors module re-exports.
Part of T2.
EOF
)"
Task 19: Game::ranked, Game::one_v_one, Game::free_for_all, Game::custom
Files:
-
Modify:
src/game.rs -
Step 1: Split the internal
Game::newinto aGame::ranked_with_arena
Move the current Game::new body into pub(crate) fn ranked_with_arena(...) -> Self. The public signatures become:
pub struct GameOptions {
pub p_draw: f64,
pub convergence: ConvergenceOptions,
}
impl Default for GameOptions {
fn default() -> Self {
Self {
p_draw: crate::P_DRAW,
convergence: ConvergenceOptions::default(),
}
}
}
impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
/// Build a ranked match from borrowed team rosters and an outcome.
///
/// Returns `Err(MismatchedShape)` if outcome rank count doesn't match
/// team count. Returns `Err(InvalidProbability)` if p_draw is out of range.
pub fn ranked(
teams: &[&[Rating<T, D>]],
outcome: Outcome,
options: &GameOptions,
) -> Result<Self, InferenceError> {
if !(0.0..1.0).contains(&options.p_draw) {
return Err(InferenceError::InvalidProbability { value: options.p_draw });
}
if outcome.team_count() != teams.len() {
return Err(InferenceError::MismatchedShape {
kind: "outcome ranks vs teams",
expected: teams.len(),
got: outcome.team_count(),
});
}
// Translate ranks to the legacy "higher f64 = better" result.
let ranks = outcome.as_ranks();
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
let mut arena = ScratchArena::new();
let game = Self::ranked_with_arena(
teams_owned,
&result,
&weights,
options.p_draw,
&mut arena,
);
Ok(game)
}
/// 1v1 convenience: returns posteriors `(a_post, b_post)` directly.
pub fn one_v_one(
a: &Rating<T, D>,
b: &Rating<T, D>,
outcome: Outcome,
) -> Result<(Gaussian, Gaussian), InferenceError> {
let game = Self::ranked(
&[&[*a], &[*b]],
outcome,
&GameOptions::default(),
)?;
let post = game.posteriors();
Ok((post[0][0], post[1][0]))
}
/// FFA: each entry is a single rating; outcome ranks per player.
pub fn free_for_all(
players: &[&Rating<T, D>],
outcome: Outcome,
options: &GameOptions,
) -> Result<Self, InferenceError> {
let teams: Vec<Vec<Rating<T, D>>> = players.iter().map(|p| vec![**p]).collect();
let team_refs: Vec<&[Rating<T, D>]> = teams.iter().map(|t| t.as_slice()).collect();
Self::ranked(&team_refs, outcome, options)
}
/// Power-user: build a game from a user-defined factor graph.
///
/// The caller owns the `VarStore` and `Vec<BuiltinFactor>`. The schedule
/// is run once; `posteriors_from_vars` extracts team posteriors by VarId.
pub fn custom<S: Schedule>(
teams: Vec<Vec<Rating<T, D>>>,
vars: VarStore,
factors: Vec<BuiltinFactor>,
team_perf_vars: Vec<VarId>,
weights: Vec<Vec<f64>>,
schedule: &S,
) -> Self {
// Partial impl: run schedule, populate likelihoods, evidence.
// Full custom-factor support expands in T4.
let mut this = Self {
teams,
result: &[], // Not used for custom — outcome encoded in factors.
weights: &[],
p_draw: 0.0,
likelihoods: Vec::new(),
evidence: 1.0,
};
this.run_custom(vars, factors, team_perf_vars, weights, schedule);
this
}
/// Log-evidence for this game (zero if there were no Trunc factors).
pub fn log_evidence(&self) -> f64 {
self.evidence.ln()
}
}
Game::custom is the spec's escape hatch for user-defined factors. For T2 it only needs to work; its full ergonomics land in T4. Mark it #[doc(hidden)] if unfinished.
- Step 2: Add tests
#[test]
fn game_ranked_1v1() {
let a = Rating::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
let b = Rating::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
let g = Game::<i64, _>::ranked(
&[&[a], &[b]],
Outcome::winner(0, 2),
&GameOptions::default(),
).unwrap();
let p = g.posteriors();
assert_ulps_eq!(p[0][0], Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
assert_ulps_eq!(p[1][0], Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}
#[test]
fn game_one_v_one_shortcut() {
let a = Rating::<i64>::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
let b = Rating::<i64>::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
let (a_post, b_post) = Game::<i64, _>::one_v_one(&a, &b, Outcome::winner(0, 2)).unwrap();
assert_ulps_eq!(a_post, Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
assert_ulps_eq!(b_post, Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}
#[test]
fn game_ranked_rejects_bad_p_draw() {
let a = Rating::<i64>::new(Gaussian::default(), 1.0, ConstantDrift(0.0));
let err = Game::<i64, _>::ranked(
&[&[a], &[a]],
Outcome::winner(0, 2),
&GameOptions { p_draw: 1.5, convergence: Default::default() },
).unwrap_err();
assert!(matches!(err, InferenceError::InvalidProbability { .. }));
}
Note Outcome::winner(0, 2) makes team 0 the winner and team 1 the loser. The old test used &[0.0, 1.0] where higher = worse (team 1 had result 1.0 meaning worse result i.e. team 0 won). Under new convention Outcome::winner(0, 2) → ranks [0, 1] → mapped to legacy results [1.0, 0.0] → team 0 wins. But the old test asserted p[0][0] = Gaussian::from_ms(20.794779, 7.194481) (loser) and p[1][0] = Gaussian::from_ms(29.205220, 7.194481) (winner). That's because the old convention was higher result = worse position. Double-check this mapping before committing golden values — run the test and if p[0] vs p[1] flip, swap the expected values.
- Step 3: Build + test
cargo build --features approx
cargo test --features approx --lib game
- Step 4: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add Game::ranked, one_v_one, free_for_all, custom constructors
Public Game API now returns Result<Self, InferenceError> on bad
inputs. GameOptions bundles p_draw and convergence config.
Game::custom is an escape hatch for user-defined factor graphs
(full ergonomics land in T4).
Part of T2.
EOF
)"
Task 20: Translate the full test suite to the new API; delete legacy methods
Files:
-
Modify: every
#[cfg(test)] mod testsacrosssrc/ -
Modify:
tests/equivalence.rs(new file) -
Delete:
History::convergence(iters, eps, verbose), the legacyadd_eventswrapper -
Step 1: Inventory remaining legacy callers
grep -rn 'h\.convergence(\|\.add_events(vec!\|\.time(true)\|\.time(false)\|\.gamma(' src/ tests/
Every hit is a site that must be translated.
- Step 2: Translation cheat-sheet
Apply this uniformly across every test module. Example old form (from src/history.rs):
let mut h = History::builder()
.mu(0.0).sigma(2.0).beta(1.0).gamma(0.0).time(false).build();
h.add_events(composition, results, vec![], vec![]);
h.convergence(ITERATIONS, EPSILON, false);
New form:
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(0.0).sigma(2.0).beta(1.0).drift(ConstantDrift(0.0)).build();
// Translate nested Vec<Vec<Vec<Index>>> into Vec<Event<_, &str>>
let events = translate_to_events(composition, results, /* times = 1..=n */ (1..=n).collect());
h.add_events(events).unwrap();
h.converge().unwrap();
Since the tests use IndexMap::get_or_create("a") returning Index, they'll need updating to either use history.intern(&"a") directly or use Member::new("a") in Event.
Provide a test helper in a tests/common.rs module:
pub fn winner_event(time: i64, w: &'static str, l: &'static str) -> Event<i64, &'static str> {
use smallvec::smallvec;
Event {
time,
teams: smallvec![
Team::with_members([Member::new(w)]),
Team::with_members([Member::new(l)]),
],
outcome: Outcome::winner(0, 2),
}
}
pub fn draw_event(time: i64, a: &'static str, b: &'static str) -> Event<i64, &'static str> {
use smallvec::smallvec;
Event {
time,
teams: smallvec![
Team::with_members([Member::new(a)]),
Team::with_members([Member::new(b)]),
],
outcome: Outcome::draw(2),
}
}
Then h.add_events(vec![winner_event(1, "a", "b"), winner_event(2, "a", "c"), winner_event(3, "b", "c")]).unwrap();.
- Step 3: Translate each old test one at a time
Work through src/history.rs::tests, src/game.rs::tests, src/time_slice.rs::tests. Each test retains its hardcoded golden values — numerical behavior is unchanged. Only the construction changes.
For tests that accessed h.batches[i].skills.get(idx).unwrap().posterior(), change to h.learning_curve(&key)[i].1 or h.current_skill(&key).unwrap().
For tests that looked at internal state (h.batches[0].events.len(), get_composition() etc.), delete those assertions if the new API doesn't expose them, or move them behind #[cfg(test)] accessors.
- Step 4: Create
tests/equivalence.rs— the regression net
//! Verifies that the new API produces the same numerical results as the
//! hardcoded goldens from the old API (taken from the original Python port).
use approx::assert_ulps_eq;
use trueskill_tt::{
ConstantDrift, ConvergenceOptions, Event, Gaussian, History, Member, Outcome, Team,
};
use smallvec::smallvec;
#[test]
fn test_1vs1_matches_old_golden() {
use trueskill_tt::{Game, GameOptions, Rating};
let a = Rating::<i64>::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let b = Rating::<i64>::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let (a_post, b_post) = Game::<i64, _>::one_v_one(&a, &b, Outcome::winner(0, 2)).unwrap();
assert_ulps_eq!(a_post, Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
assert_ulps_eq!(b_post, Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}
#[test]
fn test_env_ttt_matches_old_golden() {
let mut h = History::<i64, _, _, &'static str>::builder()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(25.0 / 300.0))
.convergence(ConvergenceOptions { max_iter: 30, epsilon: 1e-6 })
.build();
let events: Vec<Event<i64, &'static str>> = vec![
Event { time: 1, teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
], outcome: Outcome::winner(0, 2) },
Event { time: 2, teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("c")]),
], outcome: Outcome::winner(1, 2) },
Event { time: 3, teams: smallvec![
Team::with_members([Member::new("b")]),
Team::with_members([Member::new("c")]),
], outcome: Outcome::winner(0, 2) },
];
h.add_events(events).unwrap();
h.converge().unwrap();
let a_curve = h.learning_curve(&"a");
let b_curve = h.learning_curve(&"b");
assert_ulps_eq!(a_curve[0].1, Gaussian::from_ms(25.000267, 5.419381), epsilon = 1e-6);
assert_ulps_eq!(b_curve[0].1, Gaussian::from_ms(24.999465, 5.419425), epsilon = 1e-6);
}
// Repeat for test_teams, test_add_events, test_only_add_events, test_log_evidence,
// test_add_events_with_time, test_1vs1_weighted from the old history.rs tests,
// translating construction while keeping every golden value identical.
Port every one of the seven original history tests plus the seven game tests. Any test that can't be translated because the new API doesn't expose the needed internals — if its golden is preserved by a translated test, delete it; if not, either add a pub(crate) accessor for the test and flag in the spec notes.
- Step 5: Delete legacy methods
In src/history.rs, delete:
-
pub fn convergence(iters, eps, verbose) -> ((f64, f64), usize)— replaced byconverge(). -
The
online: boolbuilder accessor, if unused externally (leaveself.onlinefield internal for now — it's used bylog_evidence_impl). -
The nested-Vec public
add_eventswrapper. Onlypub(crate) fn add_events_with_priorstays (no longerpub). -
HistoryBuilder::gamma(f64)— now users go through.drift(ConstantDrift(g)). -
Step 6: Build + test
cargo test --features approx
cargo clippy --all-targets --features approx -- -D warnings
All tests pass — failures here mean either a translation bug or a legitimate numeric difference. The goldens have been proven stable through T0 and T1, so any drift is translation.
- Step 7: Commit
cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
test(api): translate full test suite to new API; delete legacy methods
Every old golden is reproduced in the new API (tests/equivalence.rs
plus in-module tests). History::convergence, the nested-Vec add_events,
.gamma(f64), .time(bool) all removed.
Part of T2.
EOF
)"
Task 21: Final verification, benchmark capture, update CHANGELOG
Files:
-
Modify:
benches/baseline.txt -
Modify:
CHANGELOG.md -
Step 1: Full green check
cargo +nightly fmt --check
cargo clippy --all-targets --features approx -- -D warnings
cargo test --features approx
cargo test --features approx --doc 2>&1 | tail -5
All must pass.
- Step 2: Capture T2 benchmarks
cargo bench --bench batch 2>&1 | grep "Batch::iteration"
cargo bench --bench gaussian 2>&1 | grep "Gaussian::"
Compare to T1 baseline (~23 µs). Must be within 5% (23.5 µs ceiling). T2 is an API refactor; hot path should be unchanged.
- Step 3: Append T2 block to
benches/baseline.txt
# After T2 (date, same hardware)
Batch::iteration <X.XX> µs (vs T1 23.010 µs: <delta>)
Gaussian::add <X.XX> ps (unchanged)
Gaussian::sub <X.XX> ps (unchanged)
Gaussian::mul <X.XX> ps (unchanged)
Gaussian::div <X.XX> ps (unchanged)
Gaussian::pi <X.XX> ps (unchanged)
Gaussian::tau <X.XX> ps (unchanged)
# Notes:
# - API-only tier; hot inference path unchanged so numerics match T1 within ULPs.
# - Public surface now matches spec Section 4.
# - Breaking changes: Batch→TimeSlice, Player→Rating, Agent→Competitor,
# IndexMap→KeyTable; Event<T, K>/Team<K>/Member<K>/Outcome new types;
# Time trait; Drift<T> generic; Observer + ConvergenceReport; Result<_,
# InferenceError> at API boundary; factors module promoted to pub.
- Step 4: Add CHANGELOG entry
Read CHANGELOG.md (check its existing format first with head -20 CHANGELOG.md) and prepend a ## [Unreleased] section listing the breaking changes bulleted from the spec.
- Step 5: Commit
git add benches/baseline.txt CHANGELOG.md
git commit -m "$(cat <<'EOF'
bench,docs: capture T2 final numbers and update CHANGELOG
Batch::iteration: <X.XX> µs (T1 was 23.010 µs).
API-only tier; numerics within ULPs.
Closes T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
EOF
)"
- Step 6: Ready for review
The branch t2-new-api-surface is complete. Full test suite green, clippy clean, fmt clean, bench within target. Open the PR with:
- Summary linking to spec + plan.
- Breaking-change table (spec Section 4½).
- Benchmark comparison.
- Migration notes for known downstream (there are none — crate is pre-1.0).
Self-review notes
Spec coverage (against docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md Section 7 T2 checklist):
- ✅
Rating,TimeSlice,Competitor,Member<K>,Outcome,Event<T, K>,KeyTable<K>(Tasks 2, 3, 4, 5, 9, 10) - ✅
Timetrait,History<T: Time, D: Drift<T>>(Tasks 6, 7, 8) - ✅ Three-tier API:
record_winner,event(...).team(...).commit(), bulkadd_events(iter)(Tasks 14, 15, 16) - ✅
Observertrait +ConvergenceReport;verbose: booldeleted (Tasks 11, 12, 20) - ✅
panic!/debug_assert!at API boundary →Result<_, InferenceError>(Task 13) - ✅
Factor/Schedule/VarStorepromoted topubunderfactorsmodule (Task 18) - ✅
Game::ranked,one_v_one,free_for_all,custom(Task 19) - ✅ Equivalence tests prove identical posteriors (Task 20)
- ✅
intern/lookupforIndexexposure (Task 14)
Deferred to later tiers (explicitly):
Outcome::Scored+MarginFactor— T4 (enum is#[non_exhaustive]so adding is non-breaking).Damped,Residualschedules — T4.Send + Syncbounds — T3.predict_outcomeN-team — T4.Game::customfull ergonomics — T4.
Known hazards during execution:
- Generic parameter explosion. After Task 8 + 12 + 14,
History<T, D, O, K>has four type params. Use trait-object erasure internally if ergonomics suffer. Default types (i64,ConstantDrift,NullObserver) keep common callsites readable. .player→.ratingfield rename.grep -r '\.player'also matches unrelated names. Audit manually insidesrc/only; don't auto-replace across tests or dev deps.- Outcome → legacy result translation. The legacy
resultf64 convention is "higher is worse" for descending sort. Our newOutcome::Rankedstores ranks where 0 = best. Mapping:max_rank - rank[i]. If a test's golden flips winner/loser, this is the likely cause. Time = Untimedbehavior change. Per spec,Untimed::elapsed_toreturns 0 — no drift between slices. The oldtime=falsemode implicitly used elapsed=1. Tests that usedtime(false)translate toHistory::<i64, _>with explicit1..=ntimestamps to preserve numerics.- Test module translation is mechanical but big. Expect Task 20 to be the longest task (~60–90 minutes). Work through the test files one at a time; commit after each file to keep bisectable history.
Things outside the plan that may bite:
Cargo.tomldev-deptrueskill-tt = { path = ".", features = ["approx"] }— if this references removed public names (Batch,Player), add the old name temporarily as atype Batch = TimeSlice;alias during the rename cascade, then remove before Task 21.examples/directory. Confirm no examples exist (ls examples/showed nothing at plan-write time), but re-check before Task 20 and include any in the translation pass.- README.md probably shows the old API. Update in Task 21 alongside the CHANGELOG.