Files

Anders Olsson d2aab82c1e T0 + T1 + T2: engine redesign through new API surface (#1 )

Implements tiers T0, T1, T2 of `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`. All three tiers have landed together on this branch because they build on one another; this PR rolls them up for a single review pass.

Per-tier plans:
- T0: `docs/superpowers/plans/2026-04-23-t0-numerical-parity.md`
- T1: `docs/superpowers/plans/2026-04-24-t1-factor-graph.md`
- T2: `docs/superpowers/plans/2026-04-24-t2-new-api-surface.md`

## Summary

### T0 — Numerical parity (internal)

- `Gaussian` switched to natural-parameter storage `(pi, tau)`; mul/div now ~7× faster (218 ps vs 1.57 ns).
- `HashMap<Index, _>` → dense `Vec<_>` keyed by `Index.0` (via `AgentStore<D>`, `SkillStore`).
- `ScratchArena` eliminates per-event allocations in `Game::likelihoods`.
- `InferenceError` seed type added (1 variant).
- 38 → 53 tests passing through T1.
- Benchmark: `Batch::iteration` 29.84 → 21.25 µs.

### T1 — Factor graph machinery (internal)

- `Factor` trait + `BuiltinFactor` enum (TeamSum / RankDiff / Trunc) driving within-game inference.
- `VarStore` flat storage for variable marginals.
- `Schedule` trait + `EpsilonOrMax` impl replacing the hand-rolled EP loop.
- `Game::likelihoods` rebuilt on the factor-graph machinery; iteration counts and goldens preserved to within 1e-6.
- 53 tests passing.
- Benchmark: `Batch::iteration` 23.01 µs (slight regression absorbed in T2).

### T2 — New API surface (breaking)

**Renames:**
- `IndexMap → KeyTable`, `Player → Rating`, `Agent → Competitor`, `Batch → TimeSlice`

**New types:**
- `Time` trait with `Untimed` ZST and `i64` impls; `Drift<T>`, `Rating<T, D>`, `Competitor<T, D>`, `TimeSlice<T>`, `History<T, D, O, K>` all generic.
- `Event<T, K>`, `Team<K>`, `Member<K>`, `Outcome` (`Ranked` variant; `#[non_exhaustive]`).
- `Observer<T>` trait + `NullObserver`.
- `ConvergenceOptions`, `ConvergenceReport`.
- `GameOptions`, `OwnedGame<T, D>`.

**Three-tier ingestion:**
- `history.record_winner(&K, &K, T)` / `record_draw(&K, &K, T)` — 1v1 convenience.
- `history.add_events(iter)` — typed bulk.
- `history.event(T).team([...]).weights([...]).ranking([...]).commit()` — fluent.

**Query API:** `current_skill`, `learning_curve`, `learning_curves` (keyed on `K`), `log_evidence`, `log_evidence_for`, `predict_quality`, `predict_outcome`.

**Game constructors:** `ranked`, `one_v_one`, `free_for_all`, `custom` — all returning `Result<_, InferenceError>`.

**`factors` module:** `Factor`, `Schedule`, `VarStore`, `VarId`, `BuiltinFactor`, `EpsilonOrMax`, `ScheduleReport`, `TeamSumFactor`, `RankDiffFactor`, `TruncFactor` now public.

**Errors:** `InferenceError` gains `MismatchedShape`, `InvalidProbability`, `ConvergenceFailed`; boundary panics converted to `Result`.

**Removed (breaking):** `History::convergence(iters, eps, verbose)`, `HistoryBuilder::gamma(f64)`, `HistoryBuilder::time(bool)`, `History.time: bool`, `learning_curves_by_index`, nested-Vec public `add_events`.

## Behavior change (documented in CHANGELOG)

`Time = Untimed` has `elapsed_to → 0`, so no drift accumulates between slices. The old `time=false` mode implicitly forced `elapsed=1` on reappearance via an `i64::MAX` sentinel — that quirk is not reproducible under a typed time axis. Tests that depended on it now use `History::<i64, _>` with explicit `1..=n` timestamps. One test (`test_env_ttt`) had 3 Gaussian goldens updated to reflect the corrected semantics; documented in commit `33a7d90`.

## Final numbers

| Metric | Before T0 | After T2 | Delta |
|---|---|---|---|
| `Batch::iteration` | 29.84 µs | 21.36 µs | **-28%** |
| `Gaussian::mul` | 1.57 ns | 219 ps | **-86%** |
| `Gaussian::div` | 1.57 ns | 219 ps | **-86%** |
| Tests passing | 38 | 90 | +52 |

All other Gaussian ops unchanged (~219 ps add/sub, ~264 ps pi/tau reads).

## Test plan

- [x] `cargo test --features approx` — 90/90 pass (68 lib + 10 api_shape + 6 game + 4 record_winner + 2 equivalence)
- [x] `cargo clippy --all-targets --features approx -- -D warnings` — clean
- [x] `cargo +nightly fmt --check` — clean
- [x] `cargo bench --bench batch` — 21.36 µs
- [x] `cargo bench --bench gaussian` — unchanged from T1
- [x] `cargo run --example atp --features approx` — rewritten in new API, runs clean
- [x] Historical Game-level goldens preserved in `tests/equivalence.rs`
- [x] Public API matches spec Section 4 (verified by integration tests in `tests/api_shape.rs`)

## Commit history

~45 commits total across T0 + T1 + T2. Each task is self-contained and individually tested; the branch is bisectable. See `git log main..t2-new-api-surface` for the full list.

## Deferred to later tiers

- `Outcome::Scored` + `MarginFactor` — T4
- `Damped` / `Residual` schedules — T4
- `Send + Sync` bounds + Rayon parallelism — T3
- N-team `predict_outcome` — T4
- `Game::custom` full ergonomics — T4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #1
Co-authored-by: Anders Olsson <anders.e.olsson@gmail.com>
Co-committed-by: Anders Olsson <anders.e.olsson@gmail.com>

2026-04-24 11:20:04 +00:00

83 KiB

Raw Permalink Blame History

T2 — New API Surface Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Land the full breaking API redesign from docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md Section 7 "T2" — renames, typed Event<T, K> ingestion, generic Time / Drift<T>, Observer + ConvergenceReport, Result<_, InferenceError> at the boundary, and pub factors module — without changing numerical behavior.

Architecture: T2 is a tiered rename + API rewrite on top of T1's factor-graph internals. No internal inference algorithm changes. Each task leaves the crate building and tests passing on the working branch; the final commit merges as a single breaking change. Because this crate is pre-1.0 with no downstream users, intermediate commits on the T2 branch need not preserve the old API, as long as the branch-tip is consistent.

Tech Stack: Rust 2024 edition, approx crate for float comparisons, smallvec for fixed-size-biased event shapes, criterion for benchmark gates. Builds on T0 (nat-param Gaussian, dense storage, ScratchArena) and T1 (Factor/Schedule/VarStore, EpsilonOrMax schedule).

Acceptance criteria

All existing numerical goldens (rewritten in the new API) pass within 1e-6 tolerance.
cargo bench --bench batch shows no regression vs T1 (Batch::iteration ≤ 23.5 µs on Apple M5 Pro) — the API rewrite is not supposed to change the hot path.
cargo clippy --all-targets --features approx -- -D warnings clean.
cargo +nightly fmt --check clean.
Public API matches spec Section 4:
- History<T: Time, D: Drift<T>> with Untimed and i64 Time impls.
- HistoryBuilder with .mu().sigma().beta().drift().p_draw().convergence().observer().build().
- history.record_winner(&K, &K, T), record_draw(&K, &K, T), add_events(iter), event(time).team([…]).weights([…]).ranking([…]).commit().
- history.converge() -> Result<ConvergenceReport, InferenceError>.
- history.learning_curve(&K), learning_curves(), current_skill(&K), log_evidence(), log_evidence_for(&[&K]), predict_quality(...), predict_outcome(...), intern(&K) -> Index, lookup(&Q) -> Option<Index>.
- Game::ranked, Game::one_v_one, Game::free_for_all, Game::custom constructors; Game::posteriors, Game::log_evidence accessors.
- Observer trait with NullObserver default.
- InferenceError with MismatchedShape, InvalidProbability, ConvergenceFailed, NegativePrecision variants.
- factors module re-exports Factor, Schedule, VarStore, VarId, BuiltinFactor, EpsilonOrMax, ScheduleReport.
Renames completed: Batch → TimeSlice, Player → Rating, Agent → Competitor, IndexMap → KeyTable.
Old API (History::convergence(iters, eps, verbose), nested-Vec add_events(composition, results, times, weights), verbose: bool, time: bool) removed.

Non-goals (deferred to T3/T4)

Outcome::Scored variant + MarginFactor (T4). For T2, Outcome has only Ranked; enum is marked #[non_exhaustive] so we can add Scored later non-breaking.
Damped and Residual schedules (T4).
Send + Sync trait bounds + Rayon parallelism (T3).
Cross-history parallelism, dirty-bit slice skipping (T3/beyond).
Snapshot / serde support.

File map

New files:

Path	Responsibility
`src/time.rs`	`Time` trait, `Untimed` ZST, `impl Time for i64`.
`src/observer.rs`	`Observer` trait, `NullObserver` ZST.
`src/outcome.rs`	`Outcome` enum (only `Ranked` in T2; `#[non_exhaustive]`). Convenience constructors `winner`, `draw`, `ranking`.
`src/event.rs`	Typed `Event<T, K>`, `Team<K>`, `Member<K>` structs for bulk ingestion.
`src/convergence.rs`	`ConvergenceOptions`, `ConvergenceReport`.
`src/key_table.rs`	`KeyTable<K>` (was `IndexMap` in `lib.rs`).
`src/rating.rs`	`Rating` (was `Player`).
`src/competitor.rs`	`Competitor` (was `Agent`).
`src/time_slice.rs`	`TimeSlice` (was `Batch`).
`src/factors.rs`	Public re-export module: `pub use crate::factor::` + `pub use crate::schedule::`.
`src/event_builder.rs`	Fluent builder returned by `History::event(T)` — `.team([…]).weights([…]).ranking([…]).commit()`.
`tests/equivalence.rs`	Integration tests: every old-API hardcoded golden is reproduced in the new API.
`tests/api_shape.rs`	Integration tests: three-tier ingestion, builder ergonomics, error cases.

Removed files:

src/player.rs → replaced by rating.rs
src/agent.rs → replaced by competitor.rs
src/batch.rs → replaced by time_slice.rs

Heavily modified:

Path	What changes
`src/lib.rs`	Remove `IndexMap` (moved); re-export new modules; promote `factor` + `schedule` visibility via `mod factors`.
`src/drift.rs`	Generify `Drift` over `T: Time`. Widen `ConstantDrift` impl.
`src/history.rs`	Make generic over `T: Time, D: Drift<T>`. New builder methods. Three-tier ingestion. `converge()` replaces `convergence()`. New query methods.
`src/error.rs`	Add `MismatchedShape`, `InvalidProbability`, `ConvergenceFailed` variants.
`src/game.rs`	Add `ranked`, `one_v_one`, `free_for_all`, `custom` constructors and `log_evidence()` accessor. Keep the private `new_with_arena` path for History.
`src/storage/mod.rs`	Rename `AgentStore` → `CompetitorStore`, `SkillStore` stays.
`src/arena.rs`	Rename any `Player` → `Rating` references; no structural change.

Renaming policy

Use git mv for file renames so history is preserved. All type-name find-replaces happen in the same commit as the file move. Tests inside #[cfg(test)] mod tests are updated in-place.

Task 1: Pre-flight — verify T1 green, capture baseline

Files: none

Step 1: Confirm branch + clean tree

git status
git rev-parse --abbrev-ref HEAD

Expected: clean tree on t0-numerical-parity. If dirty, stop.

Step 2: Create the T2 branch

git checkout -b t2-new-api-surface

Step 3: Confirm all tests pass

cargo test --features approx --lib
cargo test --features approx --doc 2>&1 | tail -5

Expected: 53 passed; 0 failed.

Step 4: Capture a fresh T1 reference bench number (hardware may differ)

cargo bench --bench batch 2>&1 | grep "Batch::iteration"

Record the value — the final task will compare against it.

Step 5: No commit — verification only.

Task 2: Rename `IndexMap` → `KeyTable`

Files:

Create: src/key_table.rs
Modify: src/lib.rs (remove the inline IndexMap, re-export KeyTable)
Modify: every call site across src/**/*.rs and tests
Step 1: Create src/key_table.rs

use std::{
    borrow::{Borrow, ToOwned},
    collections::HashMap,
    hash::Hash,
};

use crate::Index;

/// Maps user keys to internal `Index` handles.
///
/// Renamed from the former `IndexMap` to avoid colliding with the `indexmap`
/// crate. Power users can promote `&K` to `Index` via `get_or_create` and
/// skip the lookup on subsequent hot-path calls.
#[derive(Debug)]
pub struct KeyTable<K>(HashMap<K, Index>);

impl<K> KeyTable<K>
where
    K: Eq + Hash,
{
    pub fn new() -> Self {
        Self(HashMap::new())
    }

    pub fn get<Q: ?Sized + Hash + Eq + ToOwned<Owned = K>>(&self, k: &Q) -> Option<Index>
    where
        K: Borrow<Q>,
    {
        self.0.get(k).cloned()
    }

    pub fn get_or_create<Q: ?Sized + Hash + Eq + ToOwned<Owned = K>>(&mut self, k: &Q) -> Index
    where
        K: Borrow<Q>,
    {
        if let Some(idx) = self.0.get(k) {
            *idx
        } else {
            let idx = Index::from(self.0.len());
            self.0.insert(k.to_owned(), idx);
            idx
        }
    }

    pub fn key(&self, idx: Index) -> Option<&K> {
        self.0
            .iter()
            .find(|&(_, value)| *value == idx)
            .map(|(key, _)| key)
    }

    pub fn keys(&self) -> impl Iterator<Item = &K> {
        self.0.keys()
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }

    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }
}

impl<K> Default for KeyTable<K>
where
    K: Eq + Hash,
{
    fn default() -> Self {
        KeyTable::new()
    }
}

Step 2: Remove IndexMap from src/lib.rs

Delete lines 57–108 (the IndexMap struct, impls, and Default impl). Remove use std::{borrow::{Borrow, ToOwned}, collections::HashMap, hash::Hash}; if nothing else uses those imports.

Add module declaration near the other pub mod lines (alphabetically between gaussian and player):

pub mod key_table;

Add re-export:

pub use key_table::KeyTable;

Step 3: Replace all IndexMap references with KeyTable

grep -rln 'IndexMap' src/ tests/ benches/ examples/ 2>/dev/null

In every file found, replace IndexMap → KeyTable. Verify no stragglers:

! grep -rn 'IndexMap' src/ tests/ benches/ examples/ 2>/dev/null

Step 4: Build + test

cargo build --features approx
cargo test --features approx --lib

Expected: all 53 tests pass.

Step 5: Format + commit

cargo +nightly fmt
git add src/ tests/ benches/ examples/
git commit -m "$(cat <<'EOF'
refactor(api): rename IndexMap to KeyTable

The former name collided with the popular indexmap crate. KeyTable
lives in its own module. Public API unchanged beyond the rename.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
EOF
)"

Task 3: Rename `Player` → `Rating`

Files:

Rename: src/player.rs → src/rating.rs
Modify: src/lib.rs (module decl + re-export)
Modify: every call site
Step 1: Move the file

git mv src/player.rs src/rating.rs

Step 2: Rename the type inside src/rating.rs

Edit src/rating.rs — rename Player → Rating everywhere, update the doc comment to say what it is now:

use crate::{
    BETA, GAMMA,
    drift::{ConstantDrift, Drift},
    gaussian::Gaussian,
};

/// Static rating configuration: prior skill, performance noise `beta`, drift.
///
/// Renamed from `Player` in T2; `Rating` better describes the data
/// (a configuration) vs. a person (who's a `Competitor` with state).
#[derive(Clone, Copy, Debug)]
pub struct Rating<D: Drift = ConstantDrift> {
    pub(crate) prior: Gaussian,
    pub(crate) beta: f64,
    pub(crate) drift: D,
}

impl<D: Drift> Rating<D> {
    pub fn new(prior: Gaussian, beta: f64, drift: D) -> Self {
        Self { prior, beta, drift }
    }

    pub(crate) fn performance(&self) -> Gaussian {
        self.prior.forget(self.beta.powi(2))
    }
}

impl Default for Rating<ConstantDrift> {
    fn default() -> Self {
        Self {
            prior: Gaussian::default(),
            beta: BETA,
            drift: ConstantDrift(GAMMA),
        }
    }
}

Step 3: Update src/lib.rs

Replace:

pub mod player;
...
pub use player::Player;

with:

pub mod rating;
...
pub use rating::Rating;

Step 4: Replace Player with Rating and player:: with rating:: everywhere

Use two passes so we don't touch e.g. player_b or players — use word-boundary regex:

grep -rln '\bPlayer\b\|::player::\|mod player\|crate::player' src/ tests/ benches/ examples/ 2>/dev/null

For each file, edit to replace. Inside test modules, Player::new(...) becomes Rating::new(...). The spec keeps the Rating::new(prior, beta, drift) signature exactly.

Note: crate::player::Player → crate::rating::Rating. Use pub(crate) fields are re-paths only.

Verify no stragglers:

! grep -rn '\bPlayer\b' src/ tests/ benches/ examples/ 2>/dev/null
! grep -rn 'crate::player\|mod player' src/ 2>/dev/null

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Expected: all 53 tests pass.

Step 6: Format + commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Player to Rating

The struct holds prior/beta/drift — a rating configuration, not a
person. The person-with-temporal-state is the Competitor (renamed in
the next task). Resolves Player/Agent ambiguity.

Part of T2.
EOF
)"

Task 4: Rename `Agent` → `Competitor`

Files:

Rename: src/agent.rs → src/competitor.rs
Modify: src/lib.rs, src/storage/mod.rs (the AgentStore alias rename), every call site
Step 1: Move the file

git mv src/agent.rs src/competitor.rs

Step 2: Rename the type inside src/competitor.rs

Replace Agent with Competitor everywhere in the file. Update the rating import (which was player):

use crate::{
    N_INF,
    drift::{ConstantDrift, Drift},
    gaussian::Gaussian,
    rating::Rating,
};

/// Per-history, temporal state for someone competing.
///
/// Renamed from `Agent` in T2.
#[derive(Debug)]
pub struct Competitor<D: Drift = ConstantDrift> {
    pub rating: Rating<D>,
    pub message: Gaussian,
    pub last_time: i64,
}

impl<D: Drift> Competitor<D> {
    pub(crate) fn receive(&self, elapsed: i64) -> Gaussian {
        if self.message != N_INF {
            self.message
                .forget(self.rating.drift.variance_delta(elapsed))
        } else {
            self.rating.prior
        }
    }
}

impl Default for Competitor<ConstantDrift> {
    fn default() -> Self {
        Self {
            rating: Rating::default(),
            message: N_INF,
            last_time: i64::MIN,
        }
    }
}

pub(crate) fn clean<'a, D: Drift + 'a, A: Iterator<Item = &'a mut Competitor<D>>>(
    competitors: A,
    last_time: bool,
) {
    for c in competitors {
        c.message = N_INF;
        if last_time {
            c.last_time = i64::MIN;
        }
    }
}

Note: the field player → rating is an additional rename that must propagate (self.player.X → self.rating.X across the codebase).

Step 3: Update src/lib.rs and src/storage/mod.rs

In src/lib.rs, replace pub mod agent; with pub mod competitor;.

In src/storage/mod.rs, rename AgentStore → CompetitorStore. Inspect the file (read it first if you haven't) and replace:

type alias/struct AgentStore → CompetitorStore
any Agent<D> generic → Competitor<D>
use crate::agent::Agent → use crate::competitor::Competitor
Step 4: Replace all call sites

grep -rln '\bAgent\b\|AgentStore\|agent::\|mod agent\|crate::agent\|\.player\b' src/ tests/ benches/ examples/ 2>/dev/null

Replace:

Agent → Competitor
AgentStore → CompetitorStore
crate::agent:: → crate::competitor::
mod agent → mod competitor
.player field access → .rating (search this one carefully; it's ambiguous with unrelated code in dev-deps)

For the .player rename, audit carefully — it should only appear inside src/ on instances of Competitor. Grep-replace only inside src/:

grep -rln '\.player\b' src/

Do NOT blindly s/.player/.rating/ across tests/ or benches/ or examples/ — these may reference external dev-dep types. Only edit matches inside src/ that are on Competitor instances (verify each).

Verify:

! grep -rn '\bAgent\b\|AgentStore\|::agent::\|mod agent' src/ tests/ benches/ examples/ 2>/dev/null

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Step 6: Format + commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Agent to Competitor and .player field to .rating

Competitor holds dynamic per-history state (message, last_time) for
someone competing; its configuration lives in a Rating.

AgentStore renamed to CompetitorStore to match.

Part of T2.
EOF
)"

Task 5: Rename `Batch` → `TimeSlice`

Files:

Rename: src/batch.rs → src/time_slice.rs
Modify: src/lib.rs, src/history.rs, every call site
Step 1: Move the file

git mv src/batch.rs src/time_slice.rs

Step 2: Rename the type

In src/time_slice.rs, replace the struct name Batch with TimeSlice. Update the module-level doc comment:

//! A single time step's worth of events.
//!
//! Renamed from `Batch` in T2.

Rename:

struct Batch → struct TimeSlice
impl Batch → impl TimeSlice
Batch::new → TimeSlice::new

The compute_elapsed free function keeps its name.

Step 3: Update src/lib.rs

Replace:

pub mod batch;

with:

pub mod time_slice;

If batch::Batch was re-exported anywhere, update to time_slice::TimeSlice.

Step 4: Replace all call sites

grep -rln '\bBatch\b\|crate::batch::\|mod batch\|batch::' src/ tests/ benches/ examples/ 2>/dev/null

Replace Batch → TimeSlice and batch:: → time_slice:: and mod batch → mod time_slice.

Note src/history.rs uses the name extensively (self.batches, Vec<Batch>, function names like new_forward_info). For T2 we rename:

batches: Vec<Batch> → time_slices: Vec<TimeSlice>
The field rename must propagate — search \.batches\b inside src/history.rs and replace with .time_slices.

Verify:

! grep -rn '\bBatch\b\|::batch::\|mod batch' src/ tests/ benches/ examples/ 2>/dev/null
! grep -rn '\.batches\b' src/history.rs

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Tests will still use the old nested-Vec add_events signature and h.batches[…] access patterns — those are fixed in later tasks. For now tests may need to be patched minimally to use .time_slices[…] so they compile. Since the inner shape is the same, this is pure find-replace.

Step 6: Format + commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): rename Batch to TimeSlice

TimeSlice says what it is: every event sharing one timestamp. The
History field batches is renamed to time_slices.

Part of T2.
EOF
)"

Task 6: Introduce `Time` trait and `Untimed`

Files:

Create: src/time.rs
Modify: src/lib.rs

This task lands the trait. It is not yet wired into History; that happens in Task 8.

Step 1: Create src/time.rs

//! Generic time axis for `History`.
//!
//! Users pick the `Time` type based on their domain: `Untimed` when no
//! time axis is meaningful, `i64` for integer day/second timestamps.
//! Additional impls can be added behind feature flags.

/// A timestamp on the global ordering axis.
///
/// Must be `Ord + Copy` so slices can sort events, and `'static` so
/// `History` can store it by value without lifetimes.
pub trait Time: Copy + Ord + 'static {
    /// How much time elapsed between `self` and `later`.
    ///
    /// Used by `Drift<T>::variance_delta` to compute skill drift. Returning
    /// zero means no drift accumulates between the two points. Return value
    /// must be non-negative for `self <= later`.
    fn elapsed_to(&self, later: &Self) -> i64;
}

/// Zero-sized type representing "no time axis."
///
/// Used as the default `Time` when events are unordered. Elapsed is always 0,
/// so no drift accumulates across slices.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Untimed;

impl Time for Untimed {
    fn elapsed_to(&self, _later: &Self) -> i64 {
        0
    }
}

impl Time for i64 {
    fn elapsed_to(&self, later: &Self) -> i64 {
        later - self
    }
}

Step 2: Declare the module in src/lib.rs

pub mod time;

and re-export:

pub use time::{Time, Untimed};

Step 3: Build

cargo build --features approx
cargo test --features approx --lib

All 53 tests still pass (nothing uses Time yet).

Step 4: Commit

cargo +nightly fmt
git add src/time.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Time trait with Untimed and i64 impls

Foundation for generic History time axis. Untimed is the ZST case
(no drift across slices); i64 is the standard timestamp case.
Additional impls (time::OffsetDateTime, chrono) can be added behind
feature flags in follow-up work.

Part of T2.
EOF
)"

Task 7: Generify `Drift` over `Time`

Files:

Modify: src/drift.rs
Modify: every Drift call site (elapsed: i64 → from: &T, to: &T)
Step 1: Rewrite src/drift.rs

use std::fmt::Debug;

use crate::time::Time;

/// Governs how much a competitor's skill can drift between two time points.
///
/// Generic over `T: Time` so seasonal or calendar-aware drift is expressible
/// without going through `i64`.
pub trait Drift<T: Time>: Copy + Debug {
    /// Variance added to the skill prior for elapsed time `from -> to`.
    ///
    /// Called with `from <= to`; returning zero means no drift accumulates.
    fn variance_delta(&self, from: &T, to: &T) -> f64;
}

/// Simple constant-per-unit-time drift.
///
/// For `Time = i64`: variance added is `(to - from) * gamma^2`.
/// For `Time = Untimed`: elapsed is always 0, so drift is always 0.
#[derive(Clone, Copy, Debug)]
pub struct ConstantDrift(pub f64);

impl<T: Time> Drift<T> for ConstantDrift {
    fn variance_delta(&self, from: &T, to: &T) -> f64 {
        let elapsed = from.elapsed_to(to).max(0) as f64;
        elapsed * self.0 * self.0
    }
}

Step 2: Update Competitor::receive and clean signatures

In src/competitor.rs, Competitor::receive currently takes elapsed: i64. Change the generic bound so it can take from: &T, to: &T. Since Competitor is parameterized by D: Drift, we need to also parameterize by T: Time:

use crate::{
    N_INF,
    drift::{ConstantDrift, Drift},
    gaussian::Gaussian,
    rating::Rating,
    time::Time,
};

#[derive(Debug)]
pub struct Competitor<T: Time, D: Drift<T> = ConstantDrift> {
    pub rating: Rating<T, D>,
    pub message: Gaussian,
    pub last_time: Option<T>,
}

impl<T: Time, D: Drift<T>> Competitor<T, D> {
    pub(crate) fn receive(&self, now: &T) -> Gaussian {
        if self.message != N_INF {
            let elapsed_variance = match &self.last_time {
                Some(last) => self.rating.drift.variance_delta(last, now),
                None => 0.0,
            };
            self.message.forget(elapsed_variance)
        } else {
            self.rating.prior
        }
    }
}

impl Default for Competitor<i64, ConstantDrift> {
    fn default() -> Self {
        Self {
            rating: Rating::default(),
            message: N_INF,
            last_time: None,
        }
    }
}

pub(crate) fn clean<'a, T: Time + 'a, D: Drift<T> + 'a, C: Iterator<Item = &'a mut Competitor<T, D>>>(
    competitors: C,
    last_time: bool,
) {
    for c in competitors {
        c.message = N_INF;
        if last_time {
            c.last_time = None;
        }
    }
}

The last_time: i64 with i64::MIN sentinel becomes last_time: Option<T> with None sentinel — cleaner and type-safe.

Step 3: Parameterize Rating<T, D> likewise

In src/rating.rs:

use crate::{
    BETA, GAMMA,
    drift::{ConstantDrift, Drift},
    gaussian::Gaussian,
    time::Time,
};

#[derive(Clone, Copy, Debug)]
pub struct Rating<T: Time = i64, D: Drift<T> = ConstantDrift> {
    pub(crate) prior: Gaussian,
    pub(crate) beta: f64,
    pub(crate) drift: D,
    pub(crate) _time: std::marker::PhantomData<T>,
}

impl<T: Time, D: Drift<T>> Rating<T, D> {
    pub fn new(prior: Gaussian, beta: f64, drift: D) -> Self {
        Self {
            prior,
            beta,
            drift,
            _time: std::marker::PhantomData,
        }
    }

    pub(crate) fn performance(&self) -> Gaussian {
        self.prior.forget(self.beta.powi(2))
    }
}

impl Default for Rating<i64, ConstantDrift> {
    fn default() -> Self {
        Self {
            prior: Gaussian::default(),
            beta: BETA,
            drift: ConstantDrift(GAMMA),
            _time: std::marker::PhantomData,
        }
    }
}

Step 4: Propagate the generic through the rest of the codebase

Every generic <D: Drift> becomes <T: Time, D: Drift<T>> where the enclosing type owns T. Affected:

CompetitorStore<D> → CompetitorStore<T, D> in src/storage/mod.rs
History<D> → History<T, D> in src/history.rs (and every fn sig inside)
HistoryBuilder<D> → HistoryBuilder<T, D>
Game<D> → Game<T, D> in src/game.rs
TimeSlice may need T if it holds a time: T field (it currently holds time: i64 — change to time: T)
agent::clean → already updated
Event (inside time_slice.rs) passes competitors and therefore inherits T

This is the single biggest refactor in T2. Expect ~100 compiler errors; work through them file by file. The compiler is your friend — each error either needs a T: Time parameter added, or a type argument passed.

For Competitor::receive(elapsed: i64) callers — now Competitor::receive(now: &T). Each caller currently computes elapsed = compute_elapsed(last_time, batch.time); change those to competitor.receive(&time_slice.time).

Step 5: Fix test modules

Tests use i64 timestamps and ConstantDrift — the defaults — so most test setup is unchanged. Places that reference .last_time == i64::MIN become .last_time.is_none(). Places constructing Competitor { last_time: i64::MIN, ... } become Competitor { last_time: None, ... }.

Step 6: Build + test

cargo build --features approx
cargo test --features approx --lib

Expected: all tests still pass, numerically identical. If numerics drift, the Drift implementation has a bug — most likely a negative elapsed was previously allowed.

Step 7: Format + commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
refactor(api): generify Drift, Rating, Competitor, TimeSlice, History over T: Time

Drift now takes &T -> &T and is generic over the time axis. Untimed
impls return elapsed=0. ConstantDrift impl covers all T via the Time
trait.

Competitor.last_time moves from i64 with MIN sentinel to Option<T>
with None sentinel.

Part of T2.
EOF
)"

Task 8: Parameterize `History<T, D>` — remove `time: bool`

Files:

Modify: src/history.rs

The time: bool field is a legacy encoding of "no time axis." With T: Time, the type system carries that information (T = Untimed means no drift). This task removes the bool and fixes the !time internal path.

Step 1: Remove time: bool from History and HistoryBuilder

In src/history.rs, delete the time: bool field from both structs. Delete the .time(bool) builder method. Update the Default impl and build() to stop threading time through.

Step 2: Remove the !self.time branches in add_events_with_prior

Every if self.time { … } else { … } becomes the self.time branch — the generic parameter makes this uniform. Specifically, times is always required now (it's a Vec<T>, not Vec<i64>), and sort_time sorts in T order.

Rewrite sort_time in src/lib.rs to be generic:

pub(crate) fn sort_time<T: Copy + Ord>(xs: &[T], reverse: bool) -> Vec<usize> {
    let mut x: Vec<(usize, T)> = xs.iter().enumerate().map(|(i, &t)| (i, t)).collect();
    if reverse {
        x.sort_by_key(|&(_, t)| std::cmp::Reverse(t));
    } else {
        x.sort_by_key(|&(_, t)| t);
    }
    x.into_iter().map(|(i, _)| i).collect()
}

Step 3: Keep the old nested-Vec add_events_with_prior signature

For now. Change its signature only enough to take times: Vec<T> instead of Vec<i64>. Users who previously called with !time should switch to History::<Untimed, _> — but the tests should switch to History::<i64, _> with explicit [1, 2, 3, ...] timestamps to preserve the old elapsed=1-per-event numerical behavior.

The nested-Vec signature gets fully replaced in Task 15. This keeps the test suite green between tasks.

Step 4: Update tests

Tests using .time(false) need translation. The mechanical rule:

Old	New
`History::builder().time(false).build()`	`History::<i64, _>::builder().build()` with explicit `vec![1, 2, 3, …]` for `times` in every `add_events` call
`History::default()` (was `time: true`)	`History::<i64, _>::default()`

For every test that called add_events(comp, results, vec![], vec![]) under time: false, change to add_events(comp, results, vec![1, 2, 3, …], vec![]) where the length matches comp.len(). Numerics are preserved because the old !time branch used i as i64 + 1 timestamps internally.

This is mechanical. Each failing test will show which translation is needed.

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Expected: all tests pass with translated timestamps.

Step 6: Format + commit

cargo +nightly fmt
git add src/history.rs src/lib.rs src/time_slice.rs src/storage/
git commit -m "$(cat <<'EOF'
refactor(history): parameterize History<T, D> and remove time: bool

The bool encoded 'no time axis' which is now expressed at the type
level (T = Untimed). Removed the .time() builder method. Tests that
used time(false) now use i64 timestamps 1..=n to preserve numerics.

Part of T2.
EOF
)"

Task 9: Add `Outcome` enum

Files:

Create: src/outcome.rs
Modify: src/lib.rs
Step 1: Create src/outcome.rs

//! Outcome of a match.
//!
//! In T2, only `Ranked` is supported; `Scored` will be added together with
//! `MarginFactor` in T4. The enum is `#[non_exhaustive]` so adding `Scored`
//! is non-breaking for downstream `match` expressions.

use smallvec::SmallVec;

/// Final outcome of a match.
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
}

impl Outcome {
    /// `N`-team outcome where team `winner` won and everyone else tied for last.
    ///
    /// Panics if `winner >= n`.
    pub fn winner(winner: u32, n: u32) -> Self {
        assert!(winner < n, "winner index {winner} out of range 0..{n}");
        let ranks: SmallVec<[u32; 4]> = (0..n)
            .map(|i| if i == winner { 0 } else { 1 })
            .collect();
        Self::Ranked(ranks)
    }

    /// All `n` teams tied.
    pub fn draw(n: u32) -> Self {
        Self::Ranked(SmallVec::from_vec(vec![0; n as usize]))
    }

    /// Explicit per-team ranking.
    pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
        Self::Ranked(ranks.into_iter().collect())
    }

    pub fn team_count(&self) -> usize {
        match self {
            Self::Ranked(r) => r.len(),
        }
    }

    pub(crate) fn as_ranks(&self) -> &[u32] {
        match self {
            Self::Ranked(r) => r,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn winner_two_teams() {
        let o = Outcome::winner(0, 2);
        assert_eq!(o.as_ranks(), &[0u32, 1]);
        assert_eq!(o.team_count(), 2);
    }

    #[test]
    fn winner_three_teams_second_wins() {
        let o = Outcome::winner(1, 3);
        assert_eq!(o.as_ranks(), &[1u32, 0, 1]);
    }

    #[test]
    fn draw_three_teams() {
        let o = Outcome::draw(3);
        assert_eq!(o.as_ranks(), &[0u32, 0, 0]);
    }

    #[test]
    fn ranking_from_iter() {
        let o = Outcome::ranking([2, 0, 1]);
        assert_eq!(o.as_ranks(), &[2u32, 0, 1]);
    }

    #[test]
    #[should_panic(expected = "winner index 2 out of range")]
    fn winner_out_of_range_panics() {
        let _ = Outcome::winner(2, 2);
    }
}

Step 2: Add smallvec as a dependency

In Cargo.toml, under [dependencies]:

smallvec = "1"

Step 3: Add pub mod outcome; and pub use outcome::Outcome; to src/lib.rs
Step 4: Build + test

cargo build --features approx
cargo test --features approx --lib outcome

Expected: 5 passing tests.

Step 5: Format + commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add Outcome enum with Ranked variant

Outcome::winner(i, n), Outcome::draw(n), Outcome::ranking(iter) are
the convenience constructors. Marked #[non_exhaustive] so Scored can
be added in T4 without breaking match exhaustiveness.

Adds smallvec dependency.

Part of T2.
EOF
)"

Task 10: Add `Event<T, K>`, `Team<K>`, `Member<K>`

Files:

Create: src/event.rs
Modify: src/lib.rs
Step 1: Create src/event.rs

//! Typed event description for bulk ingestion.
//!
//! `Event<T, K>` is the new public event shape (spec Section 4). Replaces
//! the nested `Vec<Vec<Vec<Index>>>`, `Vec<Vec<f64>>`, `Vec<Vec<Vec<f64>>>`
//! that the old `add_events_with_prior` took.

use smallvec::SmallVec;

use crate::{gaussian::Gaussian, outcome::Outcome, time::Time};

/// A single match at time `time` involving some number of teams.
#[derive(Clone, Debug)]
pub struct Event<T: Time, K> {
    pub time: T,
    pub teams: SmallVec<[Team<K>; 4]>,
    pub outcome: Outcome,
}

/// A team: list of members competing together.
#[derive(Clone, Debug)]
pub struct Team<K> {
    pub members: SmallVec<[Member<K>; 4]>,
}

impl<K> Team<K> {
    pub fn new() -> Self {
        Self { members: SmallVec::new() }
    }

    pub fn with_members<I: IntoIterator<Item = Member<K>>>(members: I) -> Self {
        Self { members: members.into_iter().collect() }
    }
}

impl<K> Default for Team<K> {
    fn default() -> Self {
        Self::new()
    }
}

/// One member of a team, identified by user key `K`.
///
/// `weight` defaults to 1.0; a per-event `prior` can override the competitor's
/// current skill estimate for this event only.
#[derive(Clone, Debug)]
pub struct Member<K> {
    pub key: K,
    pub weight: f64,
    pub prior: Option<Gaussian>,
}

impl<K> Member<K> {
    pub fn new(key: K) -> Self {
        Self { key, weight: 1.0, prior: None }
    }

    pub fn with_weight(mut self, weight: f64) -> Self {
        self.weight = weight;
        self
    }

    pub fn with_prior(mut self, prior: Gaussian) -> Self {
        self.prior = Some(prior);
        self
    }
}

/// Convenience: a member is a user key with default weight 1.0 and no prior.
impl<K> From<K> for Member<K> {
    fn from(key: K) -> Self {
        Self::new(key)
    }
}

Step 2: Register in src/lib.rs

pub mod event;
...
pub use event::{Event, Member, Team};

Step 3: Build

cargo build --features approx
cargo test --features approx --lib

Step 4: Format + commit

cargo +nightly fmt
git add src/event.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Event<T, K>, Team<K>, Member<K> typed event description

Replaces the old nested Vec<Vec<Vec<_>>> event description on the
public API boundary. Member<K>::from(K) enables ergonomic literal
lists.

Part of T2.
EOF
)"

Task 11: Add `Observer` trait and `NullObserver`

Files:

Create: src/observer.rs
Modify: src/lib.rs
Step 1: Create src/observer.rs

//! Observer trait for progress reporting during convergence.
//!
//! Replaces the old `verbose: bool` + `println!` path. Callers wire in any
//! observer that implements the trait; default methods are no-ops so users
//! override only what they need.

use crate::time::Time;

/// Receives progress callbacks during `History::converge`.
///
/// All methods have default no-op implementations; implement only what's
/// interesting. Send/Sync is NOT required in T2 (added in T3 along with
/// Rayon support).
pub trait Observer<T: Time> {
    /// Called after each convergence iteration across the whole history.
    fn on_iteration_end(&self, _iter: usize, _max_step: (f64, f64)) {}

    /// Called after each time slice is processed within an iteration.
    fn on_batch_processed(&self, _time: &T, _slice_idx: usize, _n_events: usize) {}

    /// Called once when convergence completes (or max iters is reached).
    fn on_converged(&self, _iters: usize, _final_step: (f64, f64), _converged: bool) {}
}

/// ZST no-op observer; the default when none is configured.
#[derive(Copy, Clone, Debug, Default)]
pub struct NullObserver;

impl<T: Time> Observer<T> for NullObserver {}

Step 2: Register in src/lib.rs

pub mod observer;
...
pub use observer::{NullObserver, Observer};

Step 3: Build

cargo build --features approx
cargo test --features approx --lib

Step 4: Commit

cargo +nightly fmt
git add src/observer.rs src/lib.rs
git commit -m "$(cat <<'EOF'
feat(api): add Observer trait and NullObserver default

Observer replaces verbose: bool with structured progress callbacks.
NullObserver is a ZST default; users override only the methods they
care about.

Send + Sync bounds deferred to T3.

Part of T2.
EOF
)"

Task 12: Add `ConvergenceOptions` and `ConvergenceReport`; wire into history

Files:

Create: src/convergence.rs
Modify: src/history.rs, src/lib.rs
Step 1: Create src/convergence.rs

//! Convergence configuration and reporting.

use std::time::Duration;

use smallvec::SmallVec;

#[derive(Clone, Copy, Debug)]
pub struct ConvergenceOptions {
    pub max_iter: usize,
    pub epsilon: f64,
}

impl Default for ConvergenceOptions {
    fn default() -> Self {
        Self {
            max_iter: crate::ITERATIONS,
            epsilon: crate::EPSILON,
        }
    }
}

/// Post-hoc summary of a `History::converge` call.
#[derive(Clone, Debug)]
pub struct ConvergenceReport {
    pub iterations: usize,
    pub final_step: (f64, f64),
    pub log_evidence: f64,
    pub converged: bool,
    pub per_iteration_time: SmallVec<[Duration; 32]>,
    pub slices_skipped: usize,
}

Step 2: Register in src/lib.rs

pub mod convergence;
...
pub use convergence::{ConvergenceOptions, ConvergenceReport};

Step 3: Add convergence: ConvergenceOptions and observer to HistoryBuilder

In src/history.rs, add to HistoryBuilder:

pub fn convergence(mut self, opts: ConvergenceOptions) -> Self {
    self.convergence = opts;
    self
}

pub fn observer<O2: Observer<T>>(self, observer: O2) -> HistoryBuilder<T, D, O2> { … }

HistoryBuilder gains two extra generic parameters (O: Observer<T> and observer storage). The simplest shape:

pub struct HistoryBuilder<T: Time = i64, D: Drift<T> = ConstantDrift, O: Observer<T> = NullObserver> {
    mu: f64,
    sigma: f64,
    beta: f64,
    drift: D,
    p_draw: f64,
    convergence: ConvergenceOptions,
    observer: O,
    _time: std::marker::PhantomData<T>,
}

History<T, D, O> gains the same O parameter; observer is stored and called from the convergence loop.

Step 4: Add history.converge() -> Result<ConvergenceReport, InferenceError>

pub fn converge(&mut self) -> Result<ConvergenceReport, InferenceError> {
    use std::time::Instant;
    let opts = self.convergence;
    let mut step = (f64::INFINITY, f64::INFINITY);
    let mut i = 0;
    let mut per_iter: SmallVec<[Duration; 32]> = SmallVec::new();
    while crate::tuple_gt(step, opts.epsilon) && i < opts.max_iter {
        let t0 = Instant::now();
        step = self.iteration();
        per_iter.push(t0.elapsed());
        i += 1;
        self.observer.on_iteration_end(i, step);
    }
    let converged = !crate::tuple_gt(step, opts.epsilon);
    let log_evidence = self.log_evidence_impl(false, &[]);
    self.observer.on_converged(i, step, converged);
    Ok(ConvergenceReport {
        iterations: i,
        final_step: step,
        log_evidence,
        converged,
        per_iteration_time: per_iter,
        slices_skipped: 0,
    })
}

Keep the old convergence(iters, eps, verbose) method alive for one more task. It is removed in Task 20 alongside test translation.

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Step 6: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add ConvergenceOptions, ConvergenceReport, and History::converge

HistoryBuilder gains .convergence(opts) and .observer(o). History::converge
returns a structured ConvergenceReport with per-iteration timings.

The old History::convergence(iters, eps, verbose) still works and is
removed in Task 20.

Part of T2.
EOF
)"

Task 13: Expand `InferenceError`; convert boundary panics to `Result`

Files:

Modify: src/error.rs
Modify: src/history.rs, src/game.rs (where debug_assert! or assert! guard user input)
Step 1: Expand InferenceError

Replace src/error.rs:

use std::fmt;

#[derive(Debug, Clone, PartialEq)]
pub enum InferenceError {
    /// Expected and actual lengths of some array-shaped input differ.
    MismatchedShape {
        kind: &'static str,
        expected: usize,
        got: usize,
    },
    /// A probability value is outside `[0, 1]`.
    InvalidProbability { value: f64 },
    /// Convergence exceeded `max_iter` without falling below `epsilon`.
    ConvergenceFailed {
        last_step: (f64, f64),
        iterations: usize,
    },
    /// Negative precision: a Gaussian with `pi < 0` slipped into an API call.
    NegativePrecision { pi: f64 },
}

impl fmt::Display for InferenceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MismatchedShape { kind, expected, got } => {
                write!(f, "{kind}: expected length {expected}, got {got}")
            }
            Self::InvalidProbability { value } => {
                write!(f, "probability must be in [0, 1]; got {value}")
            }
            Self::ConvergenceFailed { last_step, iterations } => {
                write!(
                    f,
                    "convergence failed after {iterations} iterations; last step = {last_step:?}"
                )
            }
            Self::NegativePrecision { pi } => {
                write!(f, "precision must be non-negative; got {pi}")
            }
        }
    }
}

impl std::error::Error for InferenceError {}

Step 2: Convert boundary panics to Result

Search for debug_assert! and assert! at the API boundary:

grep -rn 'debug_assert!\|^\s*assert!' src/history.rs src/game.rs

In Game::new (the internal path kept for History), the existing debug_assert!s remain — they catch invariants, not user errors. But for a public Game::ranked (added later in Task 19), return InferenceError::MismatchedShape on bad input rather than panicking.

In History::add_events_with_prior, the four assert!(...) calls at the top of the function become Result<(), InferenceError> returns. Change the function signature:

pub(crate) fn add_events_with_prior(
    &mut self,
    …
) -> Result<(), InferenceError> {
    if results.len() != composition.len() && !results.is_empty() {
        return Err(InferenceError::MismatchedShape {
            kind: "results",
            expected: composition.len(),
            got: results.len(),
        });
    }
    // … other checks …
    // … rest of body …
    Ok(())
}

Every caller must add ?. Tests that were calling h.add_events(…) (the thin wrapper) now call h.add_events(…)? or .unwrap() in tests.

Step 3: Build + test

cargo build --features approx
cargo test --features approx --lib

Step 4: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(error): expand InferenceError; convert boundary asserts to Result

Adds MismatchedShape, InvalidProbability, ConvergenceFailed. History's
add_events_with_prior now returns Result<(), InferenceError>. Internal
debug_asserts for invariants stay; user-facing shape/bounds checks
become errors.

Part of T2.
EOF
)"

Task 14: Add `record_winner` and `record_draw` convenience API

Files:

Modify: src/history.rs
Step 1: Add the methods

In impl<T: Time, D: Drift<T>, O: Observer<T>> History<T, D, O>:

/// Record a single 1v1 match where `winner` beat `loser` at time `time`.
pub fn record_winner<Q>(&mut self, winner: &Q, loser: &Q, time: T) -> Result<(), InferenceError>
where
    K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
    Q: std::hash::Hash + Eq + ?Sized,
{
    let w_idx = self.intern(winner);
    let l_idx = self.intern(loser);
    self.add_events_with_prior(
        vec![vec![vec![w_idx], vec![l_idx]]],
        vec![vec![1.0, 0.0]],
        vec![time],
        vec![],
        std::collections::HashMap::new(),
    )
}

/// Record a 1v1 tie.
pub fn record_draw<Q>(&mut self, a: &Q, b: &Q, time: T) -> Result<(), InferenceError>
where
    K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
    Q: std::hash::Hash + Eq + ?Sized,
{
    let a_idx = self.intern(a);
    let b_idx = self.intern(b);
    self.add_events_with_prior(
        vec![vec![vec![a_idx], vec![b_idx]]],
        vec![vec![0.0, 0.0]],
        vec![time],
        vec![],
        std::collections::HashMap::new(),
    )
}

/// Get or create an `Index` for a user key. See spec Section 4 "Open question 3."
pub fn intern<Q>(&mut self, key: &Q) -> Index
where
    K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
    Q: std::hash::Hash + Eq + ?Sized,
{
    self.keys.get_or_create(key)
}

/// Look up an `Index` for a key without creating it.
pub fn lookup<Q>(&self, key: &Q) -> Option<Index>
where
    K: std::borrow::Borrow<Q> + std::hash::Hash + Eq + ToOwned<Owned = K>,
    Q: std::hash::Hash + Eq + ?Sized,
{
    self.keys.get(key)
}

Note: History does not currently hold a KeyTable<K>. This task requires adding keys: KeyTable<K> to the History struct (which also adds K: Eq + Hash bound). The generic parameter K is threaded through History<T, D, O, K>. This is a significant shape change — work through the compiler errors.

Step 2: Add an integration test for record_winner

Create tests/record_winner.rs:

use trueskill_tt::{ConstantDrift, ConvergenceOptions, History};

#[test]
fn record_winner_updates_skills() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(25.0)
        .sigma(25.0 / 3.0)
        .beta(25.0 / 6.0)
        .drift(ConstantDrift(25.0 / 300.0))
        .build();

    h.record_winner(&"alice", &"bob", 0).unwrap();
    h.converge().unwrap();

    let alice = h.current_skill(&"alice").unwrap();
    let bob = h.current_skill(&"bob").unwrap();

    assert!(alice.mu() > 25.0, "winner mu should rise: got {}", alice.mu());
    assert!(bob.mu() < 25.0, "loser mu should fall: got {}", bob.mu());
}

This test depends on current_skill (added in Task 17). For now, comment out the last three lines or replace with a learning_curves() check; come back to it in Task 17.

Step 3: Build + test

cargo build --features approx
cargo test --features approx --test record_winner

Step 4: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add record_winner, record_draw, intern, lookup on History

Spec Section 4 "three-tier event ingestion" tier 2: one-off match
convenience. History now holds a KeyTable<K> internally; generic
parameter K added to History.

Part of T2.
EOF
)"

Task 15: Replace nested-Vec `add_events` with typed `add_events(iter)`

Files:

Modify: src/history.rs
Step 1: Add the new add_events signature

pub fn add_events<I>(&mut self, events: I) -> Result<(), InferenceError>
where
    I: IntoIterator<Item = Event<T, K>>,
{
    // Translate each Event<T, K> into the internal composition/results/times/weights
    // triple, then delegate to add_events_with_prior (which becomes pub(crate)).
    let mut composition: Vec<Vec<Vec<Index>>> = Vec::new();
    let mut results: Vec<Vec<f64>> = Vec::new();
    let mut times: Vec<T> = Vec::new();
    let mut weights: Vec<Vec<Vec<f64>>> = Vec::new();
    let mut priors: HashMap<Index, Rating<T, D>> = HashMap::new();

    for ev in events {
        let mut event_comp: Vec<Vec<Index>> = Vec::new();
        let mut event_weights: Vec<Vec<f64>> = Vec::new();
        for team in &ev.teams {
            let mut team_indices: Vec<Index> = Vec::new();
            let mut team_weights: Vec<f64> = Vec::new();
            for member in &team.members {
                let idx = self.intern(&member.key);
                team_indices.push(idx);
                team_weights.push(member.weight);
                if let Some(prior) = member.prior {
                    priors.insert(
                        idx,
                        Rating::new(prior, self.beta, self.drift),
                    );
                }
            }
            event_comp.push(team_indices);
            event_weights.push(team_weights);
        }
        composition.push(event_comp);
        weights.push(event_weights);
        // Convert Outcome::Ranked to the legacy "lower number wins" f64 result.
        // Legacy convention: result[i] is a score; higher score = better rank.
        // We invert ranks: team with rank 0 gets highest score.
        let ranks = ev.outcome.as_ranks();
        if ranks.len() != ev.teams.len() {
            return Err(InferenceError::MismatchedShape {
                kind: "outcome ranks vs teams",
                expected: ev.teams.len(),
                got: ranks.len(),
            });
        }
        let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
        let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
        results.push(inverted);
        times.push(ev.time);
    }

    self.add_events_with_prior(composition, results, times, weights, priors)
}

Rename the old public add_events (nested-Vec) to pub(crate) fn add_events_legacy or inline it into the internal call path; delete it from the public API surface.

Step 2: Add an integration test for the new signature

In tests/api_shape.rs:

use trueskill_tt::{Event, History, Member, Outcome, Team};
use smallvec::smallvec;

#[test]
fn add_events_bulk_via_iter() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(0.0).sigma(2.0).beta(1.0).p_draw(0.0)
        .build();

    let events = vec![
        Event {
            time: 1,
            teams: smallvec![
                Team::with_members([Member::new("a")]),
                Team::with_members([Member::new("b")]),
            ],
            outcome: Outcome::winner(0, 2),
        },
        Event {
            time: 2,
            teams: smallvec![
                Team::with_members([Member::new("b")]),
                Team::with_members([Member::new("c")]),
            ],
            outcome: Outcome::winner(0, 2),
        },
    ];

    h.add_events(events).unwrap();
    let report = h.converge().unwrap();
    assert!(report.converged, "expected convergence, got {report:?}");
}

Step 3: Build + test

cargo build --features approx
cargo test --features approx --test api_shape

Step 4: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): replace nested-Vec add_events with typed add_events(iter)

The old nested shape is gone from the public API. add_events now
takes IntoIterator<Item = Event<T, K>>. Internally routes through
add_events_with_prior (now pub(crate)).

Part of T2.
EOF
)"

Task 16: Add `history.event(time).team(...).commit()` fluent builder

Files:

Create: src/event_builder.rs
Modify: src/history.rs, src/lib.rs
Step 1: Create src/event_builder.rs

//! Fluent builder for single-event ingestion.
//!
//! ```ignore
//! history
//!     .event(time)
//!     .team(["alice", "bob"]).weights([1.0, 0.7])
//!     .team(["carol"])
//!     .ranking([1, 0])
//!     .commit()?;
//! ```

use std::collections::HashMap;

use smallvec::SmallVec;

use crate::{
    InferenceError, Outcome,
    event::{Event, Member, Team},
    history::History,
    observer::Observer,
    drift::Drift,
    time::Time,
};

pub struct EventBuilder<'h, T, D, O, K>
where
    T: Time,
    D: Drift<T>,
    O: Observer<T>,
    K: Eq + std::hash::Hash + Clone,
{
    history: &'h mut History<T, D, O, K>,
    event: Event<T, K>,
    current_team_idx: Option<usize>,
}

impl<'h, T, D, O, K> EventBuilder<'h, T, D, O, K>
where
    T: Time,
    D: Drift<T>,
    O: Observer<T>,
    K: Eq + std::hash::Hash + Clone,
{
    pub(crate) fn new(history: &'h mut History<T, D, O, K>, time: T, default_n_teams: usize) -> Self {
        Self {
            history,
            event: Event {
                time,
                teams: SmallVec::with_capacity(default_n_teams.max(2)),
                outcome: Outcome::Ranked(SmallVec::new()),
            },
            current_team_idx: None,
        }
    }

    /// Add a team by its member keys.
    pub fn team<I: IntoIterator<Item = K>>(mut self, keys: I) -> Self {
        let members: SmallVec<[Member<K>; 4]> = keys.into_iter().map(Member::new).collect();
        self.event.teams.push(Team { members });
        self.current_team_idx = Some(self.event.teams.len() - 1);
        self
    }

    /// Set per-member weights for the most recently added team.
    pub fn weights<I: IntoIterator<Item = f64>>(mut self, weights: I) -> Self {
        let idx = self
            .current_team_idx
            .expect(".weights(...) called before any .team(...)");
        let ws: Vec<f64> = weights.into_iter().collect();
        let team = &mut self.event.teams[idx];
        debug_assert_eq!(
            ws.len(),
            team.members.len(),
            "weights length must match team size"
        );
        for (m, w) in team.members.iter_mut().zip(ws.into_iter()) {
            m.weight = w;
        }
        self
    }

    /// Set explicit ranks per team (length must equal number of teams).
    pub fn ranking<I: IntoIterator<Item = u32>>(mut self, ranks: I) -> Self {
        self.event.outcome = Outcome::ranking(ranks);
        self
    }

    /// Mark team `winner_idx` as winner of an N-way match, others tied for last.
    pub fn winner(mut self, winner_idx: u32) -> Self {
        self.event.outcome = Outcome::winner(winner_idx, self.event.teams.len() as u32);
        self
    }

    /// All teams tied.
    pub fn draw(mut self) -> Self {
        self.event.outcome = Outcome::draw(self.event.teams.len() as u32);
        self
    }

    /// Commit the event to the history.
    pub fn commit(self) -> Result<(), InferenceError> {
        self.history.add_events(std::iter::once(self.event))
    }
}

Step 2: Add History::event method

In src/history.rs:

pub fn event(&mut self, time: T) -> EventBuilder<'_, T, D, O, K> {
    EventBuilder::new(self, time, 2)
}

Step 3: Register the module in src/lib.rs

pub mod event_builder;
pub use event_builder::EventBuilder;

Step 4: Add a test

Append to tests/api_shape.rs:

#[test]
fn fluent_event_builder() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(25.0).sigma(25.0 / 3.0).beta(25.0 / 6.0).p_draw(0.0)
        .build();

    h.event(1)
        .team(["alice", "bob"])
        .weights([1.0, 0.7])
        .team(["carol"])
        .ranking([1, 0])
        .commit()
        .unwrap();

    let report = h.converge().unwrap();
    assert!(report.converged);
}

Step 5: Build + test

cargo build --features approx
cargo test --features approx --test api_shape fluent_event_builder

Step 6: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add fluent history.event(t).team(...).commit() builder

Third tier of the ingestion API (per spec Section 4). Powers one-off
events with irregular shapes where neither record_winner nor typed
add_events fits cleanly.

Part of T2.
EOF
)"

Task 17: Query methods — `current_skill`, `learning_curve`, `log_evidence_for`, `predict_*`

Files:

Modify: src/history.rs, src/game.rs, src/lib.rs
Step 1: Add current_skill and single-key learning_curve to History

impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + std::hash::Hash + Clone> History<T, D, O, K> {
    /// Skill estimate at the latest time slice the competitor appears in.
    pub fn current_skill<Q>(&self, key: &Q) -> Option<Gaussian>
    where
        K: std::borrow::Borrow<Q>,
        Q: std::hash::Hash + Eq + ?Sized,
    {
        let idx = self.keys.get(key)?;
        self.time_slices
            .iter()
            .rev()
            .find_map(|ts| ts.skills.get(idx).map(|sk| sk.posterior()))
    }

    /// Learning curve for a single key: (time, posterior) pairs in time order.
    pub fn learning_curve<Q>(&self, key: &Q) -> Vec<(T, Gaussian)>
    where
        K: std::borrow::Borrow<Q>,
        Q: std::hash::Hash + Eq + ?Sized,
    {
        let Some(idx) = self.keys.get(key) else {
            return Vec::new();
        };
        self.time_slices
            .iter()
            .filter_map(|ts| ts.skills.get(idx).map(|sk| (ts.time, sk.posterior())))
            .collect()
    }
}

Step 2: Update learning_curves to return HashMap<K, Vec<(T, Gaussian)>>

Currently returns HashMap<Index, Vec<(i64, Gaussian)>>. New shape:

pub fn learning_curves(&self) -> HashMap<K, Vec<(T, Gaussian)>> {
    let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
    for slice in &self.time_slices {
        for (idx, skill) in slice.skills.iter() {
            if let Some(key) = self.keys.key(idx).cloned() {
                data.entry(key)
                    .or_default()
                    .push((slice.time, skill.posterior()));
            }
        }
    }
    data
}

Step 3: Add log_evidence() and log_evidence_for(&[&K])

pub fn log_evidence(&mut self) -> f64 {
    self.log_evidence_impl(false, &[])
}

pub fn log_evidence_for<Q>(&mut self, keys: &[&Q]) -> f64
where
    K: std::borrow::Borrow<Q>,
    Q: std::hash::Hash + Eq + ?Sized,
{
    let targets: Vec<Index> = keys
        .iter()
        .filter_map(|k| self.keys.get(k))
        .collect();
    self.log_evidence_impl(false, &targets)
}

fn log_evidence_impl(&mut self, forward: bool, targets: &[Index]) -> f64 {
    self.time_slices
        .iter()
        .map(|ts| ts.log_evidence(self.online, targets, forward, &self.competitors))
        .sum()
}

The existing public log_evidence(forward, targets) method is renamed to log_evidence_impl (pub(crate)) and the thin public wrappers take its place. The forward bool ("online" mode) stays internal; no public exposure planned for T2.

Step 4: Add predict_quality and predict_outcome

These are light wrappers over the existing quality() free function plus a new "predict outcome" that runs a one-off Game without adding to history:

pub fn predict_quality(&self, teams: &[&[&K]]) -> f64
where
    K: std::borrow::Borrow<K>,
{
    let groups: Vec<Vec<Gaussian>> = teams
        .iter()
        .map(|team| {
            team.iter()
                .filter_map(|k| self.keys.get(*k))
                .filter_map(|idx| {
                    self.time_slices
                        .iter()
                        .rev()
                        .find_map(|ts| ts.skills.get(idx).map(|s| s.posterior()))
                })
                .collect()
        })
        .collect();
    let group_refs: Vec<&[Gaussian]> = groups.iter().map(|g| g.as_slice()).collect();
    crate::quality(&group_refs, self.beta)
}

pub fn predict_outcome(&self, teams: &[&[&K]]) -> Vec<f64> {
    // Win probabilities per team: apply cdf over team-perf sums.
    // For n teams, return P(team i wins over all others).
    // Minimal impl for T2: 2-team case only; multi-team deferred to T4.
    assert_eq!(teams.len(), 2, "predict_outcome T2: 2 teams only");
    let gather = |team: &[&K]| -> Gaussian {
        team.iter()
            .filter_map(|k| self.keys.get(*k))
            .filter_map(|idx| {
                self.time_slices
                    .iter()
                    .rev()
                    .find_map(|ts| ts.skills.get(idx).map(|s| s.posterior()))
            })
            .fold(crate::N00, |acc, g| acc + g.forget(self.beta.powi(2)))
    };
    let a = gather(teams[0]);
    let b = gather(teams[1]);
    let diff = a - b;
    let p_a = 1.0 - crate::cdf(0.0, diff.mu(), diff.sigma());
    vec![p_a, 1.0 - p_a]
}

Document predict_outcome as 2-team-only in T2; N-team lands in T4 alongside Residual schedules.

Step 5: Add tests

In tests/api_shape.rs:

#[test]
fn current_skill_learning_curve_learning_curves() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(25.0).sigma(25.0 / 3.0).beta(25.0 / 6.0).p_draw(0.0)
        .build();
    h.record_winner(&"a", &"b", 1).unwrap();
    h.record_winner(&"a", &"b", 2).unwrap();
    h.converge().unwrap();

    let a = h.current_skill(&"a").unwrap();
    assert!(a.mu() > 25.0);

    let curve = h.learning_curve(&"a");
    assert_eq!(curve.len(), 2);
    assert_eq!(curve[0].0, 1);
    assert_eq!(curve[1].0, 2);

    let all = h.learning_curves();
    assert_eq!(all.len(), 2);
    assert!(all.contains_key("a"));
}

#[test]
fn log_evidence_for_subset() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(0.0).sigma(6.0).beta(1.0).p_draw(0.0)
        .build();
    h.record_winner(&"a", &"b", 1).unwrap();
    h.record_winner(&"b", &"a", 2).unwrap();
    let ev_all = h.log_evidence();
    let ev_a = h.log_evidence_for(&[&"a"]);
    assert_ne!(ev_all, ev_a);
}

Step 6: Build + test

cargo build --features approx
cargo test --features approx --test api_shape

Step 7: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add current_skill, learning_curve (single), log_evidence_for, predict_*

learning_curves now returns HashMap<K, Vec<(T, Gaussian)>> keyed on
the user's key type. log_evidence is public zero-arg; log_evidence_for
takes a slice of keys.

predict_outcome is T2 2-team-only; N-team deferred.

Part of T2.
EOF
)"

Task 18: Promote `Factor` / `Schedule` / `VarStore` to `pub` under a `factors` module

Files:

Create: src/factors.rs
Modify: src/factor/mod.rs, src/factor/rank_diff.rs, src/factor/team_sum.rs, src/factor/trunc.rs, src/schedule.rs, src/lib.rs
Step 1: Promote visibility inside src/factor/mod.rs

Change pub(crate) struct VarId(pub(crate) u32); → pub struct VarId(pub u32);. Similarly:

pub(crate) struct VarStore → pub struct VarStore (keep marginals field pub(crate) — implementation detail).
pub(crate) trait Factor → pub trait Factor.
pub(crate) enum BuiltinFactor → pub enum BuiltinFactor.
Submodules: pub(crate) mod team_sum; → pub mod team_sum; and similarly for rank_diff, trunc.
Inside each submodule, promote the struct and its fields/constructors from pub(crate) to pub.

Remove any #[allow(dead_code)] that was guarding visibility warnings.

Step 2: Promote Schedule and EpsilonOrMax in src/schedule.rs

pub trait Schedule {
    fn run(&self, factors: &mut [BuiltinFactor], vars: &mut VarStore) -> ScheduleReport;
}

pub struct EpsilonOrMax {
    pub eps: f64,
    pub max: usize,
}

Remove #[allow(dead_code)].

Step 3: Create src/factors.rs that re-exports the public API

//! Factor-graph public API.
//!
//! Power users can construct custom factor graphs via `Game::custom` and
//! drive them with a custom `Schedule` implementation.

pub use crate::factor::{BuiltinFactor, Factor, VarId, VarStore};
pub use crate::factor::rank_diff::RankDiffFactor;
pub use crate::factor::team_sum::TeamSumFactor;
pub use crate::factor::trunc::TruncFactor;
pub use crate::schedule::{EpsilonOrMax, Schedule, ScheduleReport};

Step 4: Register in src/lib.rs

Keep factor and schedule modules as they are (internal use), and add:

pub mod factors;

Step 5: Build + test

cargo build --features approx
cargo test --features approx --lib

Step 6: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): promote Factor/Schedule/VarStore to pub in `factors` module

Exposes the factor-graph machinery so power users can define custom
factors and schedules (see Game::custom in the next task). The
internal factor/ and schedule/ modules remain pub(crate) — user-
facing API goes through the factors module re-exports.

Part of T2.
EOF
)"

Task 19: `Game::ranked`, `Game::one_v_one`, `Game::free_for_all`, `Game::custom`

Files:

Modify: src/game.rs
Step 1: Split the internal Game::new into a Game::ranked_with_arena

Move the current Game::new body into pub(crate) fn ranked_with_arena(...) -> Self. The public signatures become:

pub struct GameOptions {
    pub p_draw: f64,
    pub convergence: ConvergenceOptions,
}

impl Default for GameOptions {
    fn default() -> Self {
        Self {
            p_draw: crate::P_DRAW,
            convergence: ConvergenceOptions::default(),
        }
    }
}

impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
    /// Build a ranked match from borrowed team rosters and an outcome.
    ///
    /// Returns `Err(MismatchedShape)` if outcome rank count doesn't match
    /// team count. Returns `Err(InvalidProbability)` if p_draw is out of range.
    pub fn ranked(
        teams: &[&[Rating<T, D>]],
        outcome: Outcome,
        options: &GameOptions,
    ) -> Result<Self, InferenceError> {
        if !(0.0..1.0).contains(&options.p_draw) {
            return Err(InferenceError::InvalidProbability { value: options.p_draw });
        }
        if outcome.team_count() != teams.len() {
            return Err(InferenceError::MismatchedShape {
                kind: "outcome ranks vs teams",
                expected: teams.len(),
                got: outcome.team_count(),
            });
        }

        // Translate ranks to the legacy "higher f64 = better" result.
        let ranks = outcome.as_ranks();
        let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
        let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
        let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
        let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
        let mut arena = ScratchArena::new();
        let game = Self::ranked_with_arena(
            teams_owned,
            &result,
            &weights,
            options.p_draw,
            &mut arena,
        );
        Ok(game)
    }

    /// 1v1 convenience: returns posteriors `(a_post, b_post)` directly.
    pub fn one_v_one(
        a: &Rating<T, D>,
        b: &Rating<T, D>,
        outcome: Outcome,
    ) -> Result<(Gaussian, Gaussian), InferenceError> {
        let game = Self::ranked(
            &[&[*a], &[*b]],
            outcome,
            &GameOptions::default(),
        )?;
        let post = game.posteriors();
        Ok((post[0][0], post[1][0]))
    }

    /// FFA: each entry is a single rating; outcome ranks per player.
    pub fn free_for_all(
        players: &[&Rating<T, D>],
        outcome: Outcome,
        options: &GameOptions,
    ) -> Result<Self, InferenceError> {
        let teams: Vec<Vec<Rating<T, D>>> = players.iter().map(|p| vec![**p]).collect();
        let team_refs: Vec<&[Rating<T, D>]> = teams.iter().map(|t| t.as_slice()).collect();
        Self::ranked(&team_refs, outcome, options)
    }

    /// Power-user: build a game from a user-defined factor graph.
    ///
    /// The caller owns the `VarStore` and `Vec<BuiltinFactor>`. The schedule
    /// is run once; `posteriors_from_vars` extracts team posteriors by VarId.
    pub fn custom<S: Schedule>(
        teams: Vec<Vec<Rating<T, D>>>,
        vars: VarStore,
        factors: Vec<BuiltinFactor>,
        team_perf_vars: Vec<VarId>,
        weights: Vec<Vec<f64>>,
        schedule: &S,
    ) -> Self {
        // Partial impl: run schedule, populate likelihoods, evidence.
        // Full custom-factor support expands in T4.
        let mut this = Self {
            teams,
            result: &[],        // Not used for custom — outcome encoded in factors.
            weights: &[],
            p_draw: 0.0,
            likelihoods: Vec::new(),
            evidence: 1.0,
        };
        this.run_custom(vars, factors, team_perf_vars, weights, schedule);
        this
    }

    /// Log-evidence for this game (zero if there were no Trunc factors).
    pub fn log_evidence(&self) -> f64 {
        self.evidence.ln()
    }
}

Game::custom is the spec's escape hatch for user-defined factors. For T2 it only needs to work; its full ergonomics land in T4. Mark it #[doc(hidden)] if unfinished.

Step 2: Add tests

#[test]
fn game_ranked_1v1() {
    let a = Rating::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
    let b = Rating::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
    let g = Game::<i64, _>::ranked(
        &[&[a], &[b]],
        Outcome::winner(0, 2),
        &GameOptions::default(),
    ).unwrap();
    let p = g.posteriors();
    assert_ulps_eq!(p[0][0], Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
    assert_ulps_eq!(p[1][0], Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}

#[test]
fn game_one_v_one_shortcut() {
    let a = Rating::<i64>::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
    let b = Rating::<i64>::new(Gaussian::from_ms(25.0, 25.0/3.0), 25.0/6.0, ConstantDrift(25.0/300.0));
    let (a_post, b_post) = Game::<i64, _>::one_v_one(&a, &b, Outcome::winner(0, 2)).unwrap();
    assert_ulps_eq!(a_post, Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
    assert_ulps_eq!(b_post, Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}

#[test]
fn game_ranked_rejects_bad_p_draw() {
    let a = Rating::<i64>::new(Gaussian::default(), 1.0, ConstantDrift(0.0));
    let err = Game::<i64, _>::ranked(
        &[&[a], &[a]],
        Outcome::winner(0, 2),
        &GameOptions { p_draw: 1.5, convergence: Default::default() },
    ).unwrap_err();
    assert!(matches!(err, InferenceError::InvalidProbability { .. }));
}

Note Outcome::winner(0, 2) makes team 0 the winner and team 1 the loser. The old test used &[0.0, 1.0] where higher = worse (team 1 had result 1.0 meaning worse result i.e. team 0 won). Under new convention Outcome::winner(0, 2) → ranks [0, 1] → mapped to legacy results [1.0, 0.0] → team 0 wins. But the old test asserted p[0][0] = Gaussian::from_ms(20.794779, 7.194481) (loser) and p[1][0] = Gaussian::from_ms(29.205220, 7.194481) (winner). That's because the old convention was higher result = worse position. Double-check this mapping before committing golden values — run the test and if p[0] vs p[1] flip, swap the expected values.

Step 3: Build + test

cargo build --features approx
cargo test --features approx --lib game

Step 4: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
feat(api): add Game::ranked, one_v_one, free_for_all, custom constructors

Public Game API now returns Result<Self, InferenceError> on bad
inputs. GameOptions bundles p_draw and convergence config.
Game::custom is an escape hatch for user-defined factor graphs
(full ergonomics land in T4).

Part of T2.
EOF
)"

Task 20: Translate the full test suite to the new API; delete legacy methods

Files:

Modify: every #[cfg(test)] mod tests across src/
Modify: tests/equivalence.rs (new file)
Delete: History::convergence(iters, eps, verbose), the legacy add_events wrapper
Step 1: Inventory remaining legacy callers

grep -rn 'h\.convergence(\|\.add_events(vec!\|\.time(true)\|\.time(false)\|\.gamma(' src/ tests/

Every hit is a site that must be translated.

Step 2: Translation cheat-sheet

Apply this uniformly across every test module. Example old form (from src/history.rs):

let mut h = History::builder()
    .mu(0.0).sigma(2.0).beta(1.0).gamma(0.0).time(false).build();
h.add_events(composition, results, vec![], vec![]);
h.convergence(ITERATIONS, EPSILON, false);

New form:

let mut h = History::<i64, _, _, &'static str>::builder()
    .mu(0.0).sigma(2.0).beta(1.0).drift(ConstantDrift(0.0)).build();
// Translate nested Vec<Vec<Vec<Index>>> into Vec<Event<_, &str>>
let events = translate_to_events(composition, results, /* times = 1..=n */ (1..=n).collect());
h.add_events(events).unwrap();
h.converge().unwrap();

Since the tests use IndexMap::get_or_create("a") returning Index, they'll need updating to either use history.intern(&"a") directly or use Member::new("a") in Event.

Provide a test helper in a tests/common.rs module:

pub fn winner_event(time: i64, w: &'static str, l: &'static str) -> Event<i64, &'static str> {
    use smallvec::smallvec;
    Event {
        time,
        teams: smallvec![
            Team::with_members([Member::new(w)]),
            Team::with_members([Member::new(l)]),
        ],
        outcome: Outcome::winner(0, 2),
    }
}

pub fn draw_event(time: i64, a: &'static str, b: &'static str) -> Event<i64, &'static str> {
    use smallvec::smallvec;
    Event {
        time,
        teams: smallvec![
            Team::with_members([Member::new(a)]),
            Team::with_members([Member::new(b)]),
        ],
        outcome: Outcome::draw(2),
    }
}

Then h.add_events(vec![winner_event(1, "a", "b"), winner_event(2, "a", "c"), winner_event(3, "b", "c")]).unwrap();.

Step 3: Translate each old test one at a time

Work through src/history.rs::tests, src/game.rs::tests, src/time_slice.rs::tests. Each test retains its hardcoded golden values — numerical behavior is unchanged. Only the construction changes.

For tests that accessed h.batches[i].skills.get(idx).unwrap().posterior(), change to h.learning_curve(&key)[i].1 or h.current_skill(&key).unwrap().

For tests that looked at internal state (h.batches[0].events.len(), get_composition() etc.), delete those assertions if the new API doesn't expose them, or move them behind #[cfg(test)] accessors.

Step 4: Create tests/equivalence.rs — the regression net

//! Verifies that the new API produces the same numerical results as the
//! hardcoded goldens from the old API (taken from the original Python port).

use approx::assert_ulps_eq;
use trueskill_tt::{
    ConstantDrift, ConvergenceOptions, Event, Gaussian, History, Member, Outcome, Team,
};
use smallvec::smallvec;

#[test]
fn test_1vs1_matches_old_golden() {
    use trueskill_tt::{Game, GameOptions, Rating};
    let a = Rating::<i64>::new(
        Gaussian::from_ms(25.0, 25.0 / 3.0),
        25.0 / 6.0,
        ConstantDrift(25.0 / 300.0),
    );
    let b = Rating::<i64>::new(
        Gaussian::from_ms(25.0, 25.0 / 3.0),
        25.0 / 6.0,
        ConstantDrift(25.0 / 300.0),
    );
    let (a_post, b_post) = Game::<i64, _>::one_v_one(&a, &b, Outcome::winner(0, 2)).unwrap();
    assert_ulps_eq!(a_post, Gaussian::from_ms(29.205220, 7.194481), epsilon = 1e-6);
    assert_ulps_eq!(b_post, Gaussian::from_ms(20.794779, 7.194481), epsilon = 1e-6);
}

#[test]
fn test_env_ttt_matches_old_golden() {
    let mut h = History::<i64, _, _, &'static str>::builder()
        .mu(25.0)
        .sigma(25.0 / 3.0)
        .beta(25.0 / 6.0)
        .drift(ConstantDrift(25.0 / 300.0))
        .convergence(ConvergenceOptions { max_iter: 30, epsilon: 1e-6 })
        .build();

    let events: Vec<Event<i64, &'static str>> = vec![
        Event { time: 1, teams: smallvec![
            Team::with_members([Member::new("a")]),
            Team::with_members([Member::new("b")]),
        ], outcome: Outcome::winner(0, 2) },
        Event { time: 2, teams: smallvec![
            Team::with_members([Member::new("a")]),
            Team::with_members([Member::new("c")]),
        ], outcome: Outcome::winner(1, 2) },
        Event { time: 3, teams: smallvec![
            Team::with_members([Member::new("b")]),
            Team::with_members([Member::new("c")]),
        ], outcome: Outcome::winner(0, 2) },
    ];
    h.add_events(events).unwrap();
    h.converge().unwrap();

    let a_curve = h.learning_curve(&"a");
    let b_curve = h.learning_curve(&"b");
    assert_ulps_eq!(a_curve[0].1, Gaussian::from_ms(25.000267, 5.419381), epsilon = 1e-6);
    assert_ulps_eq!(b_curve[0].1, Gaussian::from_ms(24.999465, 5.419425), epsilon = 1e-6);
}

// Repeat for test_teams, test_add_events, test_only_add_events, test_log_evidence,
// test_add_events_with_time, test_1vs1_weighted from the old history.rs tests,
// translating construction while keeping every golden value identical.

Port every one of the seven original history tests plus the seven game tests. Any test that can't be translated because the new API doesn't expose the needed internals — if its golden is preserved by a translated test, delete it; if not, either add a pub(crate) accessor for the test and flag in the spec notes.

Step 5: Delete legacy methods

In src/history.rs, delete:

pub fn convergence(iters, eps, verbose) -> ((f64, f64), usize) — replaced by converge().
The online: bool builder accessor, if unused externally (leave self.online field internal for now — it's used by log_evidence_impl).
The nested-Vec public add_events wrapper. Only pub(crate) fn add_events_with_prior stays (no longer pub).
HistoryBuilder::gamma(f64) — now users go through .drift(ConstantDrift(g)).
Step 6: Build + test

cargo test --features approx
cargo clippy --all-targets --features approx -- -D warnings

All tests pass — failures here mean either a translation bug or a legitimate numeric difference. The goldens have been proven stable through T0 and T1, so any drift is translation.

Step 7: Commit

cargo +nightly fmt
git add -A
git commit -m "$(cat <<'EOF'
test(api): translate full test suite to new API; delete legacy methods

Every old golden is reproduced in the new API (tests/equivalence.rs
plus in-module tests). History::convergence, the nested-Vec add_events,
.gamma(f64), .time(bool) all removed.

Part of T2.
EOF
)"

Task 21: Final verification, benchmark capture, update CHANGELOG

Files:

Modify: benches/baseline.txt
Modify: CHANGELOG.md
Step 1: Full green check

cargo +nightly fmt --check
cargo clippy --all-targets --features approx -- -D warnings
cargo test --features approx
cargo test --features approx --doc 2>&1 | tail -5

All must pass.

Step 2: Capture T2 benchmarks

cargo bench --bench batch 2>&1 | grep "Batch::iteration"
cargo bench --bench gaussian 2>&1 | grep "Gaussian::"

Compare to T1 baseline (~23 µs). Must be within 5% (23.5 µs ceiling). T2 is an API refactor; hot path should be unchanged.

Step 3: Append T2 block to benches/baseline.txt

# After T2 (date, same hardware)

Batch::iteration           <X.XX> µs   (vs T1 23.010 µs: <delta>)
Gaussian::add             <X.XX> ps    (unchanged)
Gaussian::sub             <X.XX> ps    (unchanged)
Gaussian::mul             <X.XX> ps    (unchanged)
Gaussian::div             <X.XX> ps    (unchanged)
Gaussian::pi              <X.XX> ps    (unchanged)
Gaussian::tau             <X.XX> ps    (unchanged)

# Notes:
# - API-only tier; hot inference path unchanged so numerics match T1 within ULPs.
# - Public surface now matches spec Section 4.
# - Breaking changes: Batch→TimeSlice, Player→Rating, Agent→Competitor,
#   IndexMap→KeyTable; Event<T, K>/Team<K>/Member<K>/Outcome new types;
#   Time trait; Drift<T> generic; Observer + ConvergenceReport; Result<_,
#   InferenceError> at API boundary; factors module promoted to pub.

Step 4: Add CHANGELOG entry

Read CHANGELOG.md (check its existing format first with head -20 CHANGELOG.md) and prepend a ## [Unreleased] section listing the breaking changes bulleted from the spec.

Step 5: Commit

git add benches/baseline.txt CHANGELOG.md
git commit -m "$(cat <<'EOF'
bench,docs: capture T2 final numbers and update CHANGELOG

Batch::iteration: <X.XX> µs (T1 was 23.010 µs).
API-only tier; numerics within ULPs.

Closes T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
EOF
)"

Step 6: Ready for review

The branch t2-new-api-surface is complete. Full test suite green, clippy clean, fmt clean, bench within target. Open the PR with:

Summary linking to spec + plan.
Breaking-change table (spec Section 4½).
Benchmark comparison.
Migration notes for known downstream (there are none — crate is pre-1.0).

Self-review notes

Spec coverage (against docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md Section 7 T2 checklist):

✅ Rating, TimeSlice, Competitor, Member<K>, Outcome, Event<T, K>, KeyTable<K> (Tasks 2, 3, 4, 5, 9, 10)
✅ Time trait, History<T: Time, D: Drift<T>> (Tasks 6, 7, 8)
✅ Three-tier API: record_winner, event(...).team(...).commit(), bulk add_events(iter) (Tasks 14, 15, 16)
✅ Observer trait + ConvergenceReport; verbose: bool deleted (Tasks 11, 12, 20)
✅ panic!/debug_assert! at API boundary → Result<_, InferenceError> (Task 13)
✅ Factor/Schedule/VarStore promoted to pub under factors module (Task 18)
✅ Game::ranked, one_v_one, free_for_all, custom (Task 19)
✅ Equivalence tests prove identical posteriors (Task 20)
✅ intern / lookup for Index exposure (Task 14)

Deferred to later tiers (explicitly):

Outcome::Scored + MarginFactor — T4 (enum is #[non_exhaustive] so adding is non-breaking).
Damped, Residual schedules — T4.
Send + Sync bounds — T3.
predict_outcome N-team — T4.
Game::custom full ergonomics — T4.

Known hazards during execution:

Generic parameter explosion. After Task 8 + 12 + 14, History<T, D, O, K> has four type params. Use trait-object erasure internally if ergonomics suffer. Default types (i64, ConstantDrift, NullObserver) keep common callsites readable.
.player → .rating field rename. grep -r '\.player' also matches unrelated names. Audit manually inside src/ only; don't auto-replace across tests or dev deps.
Outcome → legacy result translation. The legacy result f64 convention is "higher is worse" for descending sort. Our new Outcome::Ranked stores ranks where 0 = best. Mapping: max_rank - rank[i]. If a test's golden flips winner/loser, this is the likely cause.
Time = Untimed behavior change. Per spec, Untimed::elapsed_to returns 0 — no drift between slices. The old time=false mode implicitly used elapsed=1. Tests that used time(false) translate to History::<i64, _> with explicit 1..=n timestamps to preserve numerics.
Test module translation is mechanical but big. Expect Task 20 to be the longest task (~60–90 minutes). Work through the test files one at a time; commit after each file to keep bisectable history.

Things outside the plan that may bite:

Cargo.toml dev-dep trueskill-tt = { path = ".", features = ["approx"] } — if this references removed public names (Batch, Player), add the old name temporarily as a type Batch = TimeSlice; alias during the rename cascade, then remove before Task 21.
examples/ directory. Confirm no examples exist (ls examples/ showed nothing at plan-write time), but re-check before Task 20 and include any in the translation pass.
README.md probably shows the old API. Update in Task 21 alongside the CHANGELOG.

83 KiB Raw Permalink Blame History Unescape Escape

T2 — New API Surface Implementation Plan

Acceptance criteria

Non-goals (deferred to T3/T4)

File map

Renaming policy

Task 1: Pre-flight — verify T1 green, capture baseline

Task 2: Rename IndexMap → KeyTable

Task 3: Rename Player → Rating

Task 4: Rename Agent → Competitor

Task 5: Rename Batch → TimeSlice

Task 6: Introduce Time trait and Untimed

Task 7: Generify Drift over Time

Task 8: Parameterize History<T, D> — remove time: bool

Task 9: Add Outcome enum

Task 10: Add Event<T, K>, Team<K>, Member<K>

Task 11: Add Observer trait and NullObserver

Task 12: Add ConvergenceOptions and ConvergenceReport; wire into history

Task 13: Expand InferenceError; convert boundary panics to Result

Task 14: Add record_winner and record_draw convenience API

Task 15: Replace nested-Vec add_events with typed add_events(iter)

Task 16: Add history.event(time).team(...).commit() fluent builder

Task 17: Query methods — current_skill, learning_curve, log_evidence_for, predict_*

Task 18: Promote Factor / Schedule / VarStore to pub under a factors module

Task 19: Game::ranked, Game::one_v_one, Game::free_for_all, Game::custom

Task 20: Translate the full test suite to the new API; delete legacy methods

Task 21: Final verification, benchmark capture, update CHANGELOG

Self-review notes

83 KiB

Raw Permalink Blame History

Task 2: Rename `IndexMap` → `KeyTable`

Task 3: Rename `Player` → `Rating`

Task 4: Rename `Agent` → `Competitor`

Task 5: Rename `Batch` → `TimeSlice`

Task 6: Introduce `Time` trait and `Untimed`

Task 7: Generify `Drift` over `Time`

Task 8: Parameterize `History<T, D>` — remove `time: bool`

Task 9: Add `Outcome` enum

Task 10: Add `Event<T, K>`, `Team<K>`, `Member<K>`

Task 11: Add `Observer` trait and `NullObserver`

Task 12: Add `ConvergenceOptions` and `ConvergenceReport`; wire into history

Task 13: Expand `InferenceError`; convert boundary panics to `Result`

Task 14: Add `record_winner` and `record_draw` convenience API

Task 15: Replace nested-Vec `add_events` with typed `add_events(iter)`

Task 16: Add `history.event(time).team(...).commit()` fluent builder

Task 17: Query methods — `current_skill`, `learning_curve`, `log_evidence_for`, `predict_*`

Task 18: Promote `Factor` / `Schedule` / `VarStore` to `pub` under a `factors` module

Task 19: `Game::ranked`, `Game::one_v_one`, `Game::free_for_all`, `Game::custom`