T0 + T1 + T2: engine redesign through new API surface (#1)

Implements tiers T0, T1, T2 of `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`. All three tiers have landed together on this branch because they build on one another; this PR rolls them up for a single review pass.

Per-tier plans:
- T0: `docs/superpowers/plans/2026-04-23-t0-numerical-parity.md`
- T1: `docs/superpowers/plans/2026-04-24-t1-factor-graph.md`
- T2: `docs/superpowers/plans/2026-04-24-t2-new-api-surface.md`

## Summary

### T0 — Numerical parity (internal)

- `Gaussian` switched to natural-parameter storage `(pi, tau)`; mul/div now ~7× faster (218 ps vs 1.57 ns).
- `HashMap<Index, _>` → dense `Vec<_>` keyed by `Index.0` (via `AgentStore<D>`, `SkillStore`).
- `ScratchArena` eliminates per-event allocations in `Game::likelihoods`.
- `InferenceError` seed type added (1 variant).
- Tests passing: 38 → 53 (count as of the end of T1).
- Benchmark: `Batch::iteration` 29.84 → 21.25 µs.
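
The mul/div speedup in the first bullet follows from the representation: with `pi = 1 / sigma^2` and `tau = mu / sigma^2`, multiplying two Gaussian densities adds their parameters and dividing subtracts them, so neither operation needs a division or square root. The sketch below illustrates the idea only; `NatGaussian` and its helper methods are assumptions for this example, not the crate's actual `Gaussian`.

```rust
use std::ops::{Div, Mul};

/// Hypothetical stand-in for a Gaussian stored in natural parameters.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NatGaussian {
    pi: f64,  // precision: 1 / sigma^2
    tau: f64, // precision-adjusted mean: mu / sigma^2
}

impl NatGaussian {
    /// Build from mean and standard deviation (sigma must be non-zero).
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Self { pi, tau: mu * pi }
    }

    fn mu(&self) -> f64 {
        self.tau / self.pi
    }

    fn sigma(&self) -> f64 {
        (1.0 / self.pi).sqrt()
    }
}

impl Mul for NatGaussian {
    type Output = Self;
    // Product of densities: natural parameters add.
    fn mul(self, rhs: Self) -> Self {
        Self { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

impl Div for NatGaussian {
    type Output = Self;
    // Quotient of densities: natural parameters subtract.
    fn div(self, rhs: Self) -> Self {
        Self { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}

fn main() {
    let a = NatGaussian::from_ms(25.0, 25.0 / 3.0);
    let b = NatGaussian::from_ms(30.0, 5.0);
    let product = a * b;
    let back = product / b; // dividing out `b` recovers `a` up to rounding
    println!("product: mu={:.4} sigma={:.4}", product.mu(), product.sigma());
    println!("recovered: mu={:.4} sigma={:.4}", back.mu(), back.sigma());
}
```

The trade-off is that reading `mu` or `sigma` now costs a division, which is acceptable when messages are multiplied and divided far more often than they are inspected.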

### T1 — Factor graph machinery (internal)

- `Factor` trait + `BuiltinFactor` enum (TeamSum / RankDiff / Trunc) driving within-game inference.
- `VarStore` flat storage for variable marginals.
- `Schedule` trait + `EpsilonOrMax` impl replacing the hand-rolled EP loop.
- `Game::likelihoods` rebuilt on the factor-graph machinery; iteration counts and goldens preserved to within 1e-6.
- 53 tests passing.
- Benchmark: `Batch::iteration` 23.01 µs (slight regression absorbed in T2).
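
For readers unfamiliar with the "epsilon or max" stopping rule, here is a minimal sketch of the idea: repeat a sweep until the largest change falls to `epsilon` or an iteration cap is hit. The trait shape, the `Report` fields, and every name ending in `Sketch` are assumptions made for illustration; the crate's real `Schedule`, `EpsilonOrMax`, and `ScheduleReport` may look different.

```rust
/// Hypothetical report type; the real `ScheduleReport` may carry other fields.
#[derive(Debug, PartialEq)]
struct Report {
    iterations: usize,
    final_delta: f64,
    converged: bool,
}

/// Hypothetical trait shape: a schedule decides how often a sweep is repeated.
trait ScheduleSketch {
    /// `sweep` performs one pass and returns the largest change it caused.
    fn run(&self, sweep: impl FnMut() -> f64) -> Report;
}

struct EpsilonOrMaxSketch {
    epsilon: f64,
    max_iterations: usize,
}

impl ScheduleSketch for EpsilonOrMaxSketch {
    fn run(&self, mut sweep: impl FnMut() -> f64) -> Report {
        let mut delta = f64::INFINITY;
        let mut iterations = 0;
        // Stop as soon as the change is small enough or the cap is reached.
        while delta > self.epsilon && iterations < self.max_iterations {
            delta = sweep();
            iterations += 1;
        }
        Report { iterations, final_delta: delta, converged: delta <= self.epsilon }
    }
}

fn main() {
    // Toy fixed-point iteration x <- x / 2 + 1, which converges to 2.
    let schedule = EpsilonOrMaxSketch { epsilon: 1e-6, max_iterations: 100 };
    let mut x = 0.0_f64;
    let report = schedule.run(|| {
        let next = 0.5 * x + 1.0;
        let delta = (next - x).abs();
        x = next;
        delta
    });
    println!("x = {x}, report = {report:?}");
}
```

Separating the stopping rule from the sweep body is what would let other schedules (such as the `Damped` and `Residual` variants deferred to T4) reuse the same factors.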

### T2 — New API surface (breaking)

**Renames:**
- `IndexMap → KeyTable`, `Player → Rating`, `Agent → Competitor`, `Batch → TimeSlice`

**New types:**
- `Time` trait with `Untimed` ZST and `i64` impls; `Drift<T>`, `Rating<T, D>`, `Competitor<T, D>`, `TimeSlice<T>`, `History<T, D, O, K>` all generic.
- `Event<T, K>`, `Team<K>`, `Member<K>`, `Outcome` (`Ranked` variant; `#[non_exhaustive]`).
- `Observer<T>` trait + `NullObserver`.
- `ConvergenceOptions`, `ConvergenceReport`.
- `GameOptions`, `OwnedGame<T, D>`.

**Three-tier ingestion:**
- `history.record_winner(&K, &K, T)` / `record_draw(&K, &K, T)` — 1v1 convenience.
- `history.add_events(iter)` — typed bulk.
- `history.event(T).team([...]).weights([...]).ranking([...]).commit()` — fluent.

**Query API:** `current_skill`, `learning_curve`, `learning_curves` (keyed on `K`), `log_evidence`, `log_evidence_for`, `predict_quality`, `predict_outcome`.

**Game constructors:** `ranked`, `one_v_one`, `free_for_all`, `custom` — all returning `Result<_, InferenceError>`.

**`factors` module:** `Factor`, `Schedule`, `VarStore`, `VarId`, `BuiltinFactor`, `EpsilonOrMax`, `ScheduleReport`, `TeamSumFactor`, `RankDiffFactor`, `TruncFactor` now public.

**Errors:** `InferenceError` gains `MismatchedShape`, `InvalidProbability`, `ConvergenceFailed`; boundary panics converted to `Result`.
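
As a rough illustration of what converting boundary panics to `Result` means for callers, the sketch below validates inputs up front and returns a typed error. The variant names are taken from the list above, but the payload fields, the `validate_ranked` helper, and the assumed valid range `[0, 1)` for the draw probability are all assumptions for this example, not the crate's definitions.

```rust
use std::fmt;

/// Illustrative error enum. Variant names come from the PR text;
/// the payload fields are assumptions made for this sketch.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum InferenceErrorSketch {
    MismatchedShape { expected: usize, got: usize },
    InvalidProbability(f64),
    ConvergenceFailed { iterations: usize },
}

impl fmt::Display for InferenceErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MismatchedShape { expected, got } => {
                write!(f, "expected {expected} entries, got {got}")
            }
            Self::InvalidProbability(p) => write!(f, "probability {p} is outside [0, 1)"),
            Self::ConvergenceFailed { iterations } => {
                write!(f, "no convergence after {iterations} iterations")
            }
        }
    }
}

impl std::error::Error for InferenceErrorSketch {}

/// Hypothetical boundary check: reject bad inputs instead of panicking later.
fn validate_ranked(
    team_count: usize,
    results: &[f64],
    p_draw: f64,
) -> Result<(), InferenceErrorSketch> {
    // `contains` is false for NaN, so NaN is rejected as well.
    if !(0.0..1.0).contains(&p_draw) {
        return Err(InferenceErrorSketch::InvalidProbability(p_draw));
    }
    if results.len() != team_count {
        return Err(InferenceErrorSketch::MismatchedShape {
            expected: team_count,
            got: results.len(),
        });
    }
    Ok(())
}

fn main() {
    match validate_ranked(2, &[1.0, 0.0, 0.5], 0.1) {
        Ok(()) => println!("inputs accepted"),
        Err(e) => println!("rejected: {e}"),
    }
}
```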

**Removed (breaking):** `History::convergence(iters, eps, verbose)`, `HistoryBuilder::gamma(f64)`, `HistoryBuilder::time(bool)`, `History.time: bool`, `learning_curves_by_index`, nested-Vec public `add_events`.

## Behavior change (documented in CHANGELOG)

With `Untimed` as the time type, `elapsed_to` always returns 0, so no drift accumulates between slices. The old `time=false` mode implicitly forced `elapsed=1` on reappearance via an `i64::MAX` sentinel; that quirk cannot be reproduced under a typed time axis. Tests that depended on it now use `History::<i64, _>` with explicit `1..=n` timestamps. One test (`test_env_ttt`) had three Gaussian goldens updated to reflect the corrected semantics, as documented in commit `33a7d90`.
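
The difference between the two time types can be sketched as follows. `compute_elapsed` is copied from `src/time_slice.rs` in this PR; the `Time` trait signature and both impls are assumptions inferred from how `compute_elapsed` calls `elapsed_to`, so treat them as illustrative rather than as the crate's definitions.

```rust
/// Assumed shape of the `Time` trait, inferred from its use in `compute_elapsed`.
trait Time {
    fn elapsed_to(&self, later: &Self) -> i64;
}

/// Zero-sized marker for histories without a time axis.
struct Untimed;

impl Time for Untimed {
    // No time axis: nothing ever elapses.
    fn elapsed_to(&self, _later: &Self) -> i64 {
        0
    }
}

impl Time for i64 {
    // Assumed implementation: plain difference between timestamps.
    fn elapsed_to(&self, later: &Self) -> i64 {
        later - self
    }
}

/// Same logic as `compute_elapsed` in `src/time_slice.rs`.
fn compute_elapsed<T: Time>(last: Option<&T>, current: &T) -> i64 {
    last.map(|l| l.elapsed_to(current).max(0)).unwrap_or(0)
}

fn main() {
    // Untimed: a reappearing competitor sees zero elapsed time, hence no drift.
    println!("untimed: {}", compute_elapsed(Some(&Untimed), &Untimed));
    // i64 with explicit 1..=n timestamps: consecutive slices are one step apart,
    // which matches the old implicit `elapsed = 1` behavior.
    println!("i64 1 -> 2: {}", compute_elapsed(Some(&1_i64), &2_i64));
    // First appearance (no previous time) and out-of-order times both yield 0.
    println!("first: {}", compute_elapsed::<i64>(None, &5));
    println!("backwards: {}", compute_elapsed(Some(&5_i64), &3_i64));
}
```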

## Final numbers

| Metric | Before T0 | After T2 | Delta |
|---|---|---|---|
| `Batch::iteration` | 29.84 µs | 21.36 µs | **-28%** |
| `Gaussian::mul` | 1.57 ns | 219 ps | **-86%** |
| `Gaussian::div` | 1.57 ns | 219 ps | **-86%** |
| Tests passing | 38 | 90 | +52 |

All other Gaussian ops unchanged (~219 ps add/sub, ~264 ps pi/tau reads).

## Test plan

- [x] `cargo test --features approx` — 90/90 pass (68 lib + 10 api_shape + 6 game + 4 record_winner + 2 equivalence)
- [x] `cargo clippy --all-targets --features approx -- -D warnings` — clean
- [x] `cargo +nightly fmt --check` — clean
- [x] `cargo bench --bench batch` — 21.36 µs
- [x] `cargo bench --bench gaussian` — unchanged from T1
- [x] `cargo run --example atp --features approx` — rewritten in new API, runs clean
- [x] Historical Game-level goldens preserved in `tests/equivalence.rs`
- [x] Public API matches spec Section 4 (verified by integration tests in `tests/api_shape.rs`)

## Commit history

~45 commits total across T0 + T1 + T2. Each task is self-contained and individually tested; the branch is bisectable. See `git log main..t2-new-api-surface` for the full list.

## Deferred to later tiers

- `Outcome::Scored` + `MarginFactor` — T4
- `Damped` / `Residual` schedules — T4
- `Send + Sync` bounds + Rayon parallelism — T3
- N-team `predict_outcome` — T4
- `Game::custom` full ergonomics — T4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #1
Co-authored-by: Anders Olsson <anders.e.olsson@gmail.com>
Co-committed-by: Anders Olsson <anders.e.olsson@gmail.com>
This commit was merged in pull request #1.
2026-04-24 11:20:04 +00:00
committed by logaritmisk
parent a14df02089
commit d2aab82c1e
44 changed files with 10541 additions and 1325 deletions

src/time_slice.rs (new file, 665 lines)

@@ -0,0 +1,665 @@
//! A single time step's worth of events.
//!
//! Renamed from `Batch` in T2.

use std::collections::HashMap;

use crate::{
    Index, N_INF,
    arena::ScratchArena,
    drift::Drift,
    game::Game,
    gaussian::Gaussian,
    rating::Rating,
    storage::{CompetitorStore, SkillStore},
    time::Time,
    tuple_gt, tuple_max,
};

#[derive(Debug)]
pub(crate) struct Skill {
    pub(crate) forward: Gaussian,
    backward: Gaussian,
    likelihood: Gaussian,
    pub(crate) elapsed: i64,
    pub(crate) online: Gaussian,
}

impl Skill {
    pub(crate) fn posterior(&self) -> Gaussian {
        self.likelihood * self.backward * self.forward
    }
}

impl Default for Skill {
    fn default() -> Self {
        Self {
            forward: N_INF,
            backward: N_INF,
            likelihood: N_INF,
            elapsed: 0,
            online: N_INF,
        }
    }
}

#[derive(Debug)]
struct Item {
    agent: Index,
    likelihood: Gaussian,
}

impl Item {
    fn within_prior<T: Time, D: Drift<T>>(
        &self,
        online: bool,
        forward: bool,
        skills: &SkillStore,
        agents: &CompetitorStore<T, D>,
    ) -> Rating<T, D> {
        let r = &agents[self.agent].rating;
        let skill = skills.get(self.agent).unwrap();
        if online {
            Rating::new(skill.online, r.beta, r.drift)
        } else if forward {
            Rating::new(skill.forward, r.beta, r.drift)
        } else {
            Rating::new(skill.posterior() / self.likelihood, r.beta, r.drift)
        }
    }
}

#[derive(Debug)]
struct Team {
    items: Vec<Item>,
    output: f64,
}

#[derive(Debug)]
pub(crate) struct Event {
    teams: Vec<Team>,
    evidence: f64,
    weights: Vec<Vec<f64>>,
}

impl Event {
    fn outputs(&self) -> Vec<f64> {
        self.teams
            .iter()
            .map(|team| team.output)
            .collect::<Vec<_>>()
    }

    pub(crate) fn within_priors<T: Time, D: Drift<T>>(
        &self,
        online: bool,
        forward: bool,
        skills: &SkillStore,
        agents: &CompetitorStore<T, D>,
    ) -> Vec<Vec<Rating<T, D>>> {
        self.teams
            .iter()
            .map(|team| {
                team.items
                    .iter()
                    .map(|item| item.within_prior(online, forward, skills, agents))
                    .collect::<Vec<_>>()
            })
            .collect::<Vec<_>>()
    }
}

#[derive(Debug)]
pub struct TimeSlice<T: Time = i64> {
    pub(crate) events: Vec<Event>,
    pub(crate) skills: SkillStore,
    pub(crate) time: T,
    p_draw: f64,
    arena: ScratchArena,
}

impl<T: Time> TimeSlice<T> {
    pub fn new(time: T, p_draw: f64) -> Self {
        Self {
            events: Vec::new(),
            skills: SkillStore::new(),
            time,
            p_draw,
            arena: ScratchArena::new(),
        }
    }

    pub fn add_events<D: Drift<T>>(
        &mut self,
        composition: Vec<Vec<Vec<Index>>>,
        results: Vec<Vec<f64>>,
        weights: Vec<Vec<Vec<f64>>>,
        agents: &CompetitorStore<T, D>,
    ) {
        let mut unique = Vec::with_capacity(10);
        let this_agent = composition.iter().flatten().flatten().filter(|idx| {
            if !unique.contains(idx) {
                unique.push(*idx);
                return true;
            }
            false
        });
        for idx in this_agent {
            let elapsed = compute_elapsed(agents[*idx].last_time.as_ref(), &self.time);
            if let Some(skill) = self.skills.get_mut(*idx) {
                skill.elapsed = elapsed;
                skill.forward = agents[*idx].receive(&self.time);
            } else {
                self.skills.insert(
                    *idx,
                    Skill {
                        forward: agents[*idx].receive(&self.time),
                        elapsed,
                        ..Default::default()
                    },
                );
            }
        }
        let events = composition.iter().enumerate().map(|(e, event)| {
            let teams = event
                .iter()
                .enumerate()
                .map(|(t, team)| {
                    let items = team
                        .iter()
                        .map(|&agent| Item {
                            agent,
                            likelihood: N_INF,
                        })
                        .collect::<Vec<_>>();
                    Team {
                        items,
                        output: if results.is_empty() {
                            (event.len() - (t + 1)) as f64
                        } else {
                            results[e][t]
                        },
                    }
                })
                .collect::<Vec<_>>();
            let weights = if weights.is_empty() {
                teams
                    .iter()
                    .map(|team| vec![1.0; team.items.len()])
                    .collect::<Vec<_>>()
            } else {
                weights[e].clone()
            };
            Event {
                teams,
                evidence: 0.0,
                weights,
            }
        });
        let from = self.events.len();
        self.events.extend(events);
        self.iteration(from, agents);
    }

    pub(crate) fn posteriors(&self) -> HashMap<Index, Gaussian> {
        self.skills
            .iter()
            .map(|(idx, skill)| (idx, skill.posterior()))
            .collect::<HashMap<_, _>>()
    }

    pub fn iteration<D: Drift<T>>(&mut self, from: usize, agents: &CompetitorStore<T, D>) {
        for event in self.events.iter_mut().skip(from) {
            let teams = event.within_priors(false, false, &self.skills, agents);
            let result = event.outputs();
            let g = Game::ranked_with_arena(
                teams,
                &result,
                &event.weights,
                self.p_draw,
                &mut self.arena,
            );
            for (t, team) in event.teams.iter_mut().enumerate() {
                for (i, item) in team.items.iter_mut().enumerate() {
                    let old_likelihood = self.skills.get(item.agent).unwrap().likelihood;
                    let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
                    self.skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
                    item.likelihood = g.likelihoods[t][i];
                }
            }
            event.evidence = g.evidence;
        }
    }

    #[allow(dead_code)]
    pub(crate) fn convergence<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) -> usize {
        let epsilon = 1e-6;
        let iterations = 20;
        let mut step = (f64::INFINITY, f64::INFINITY);
        let mut i = 0;
        while tuple_gt(step, epsilon) && i < iterations {
            let old = self.posteriors();
            self.iteration(0, agents);
            let new = self.posteriors();
            step = old.iter().fold((0.0, 0.0), |step, (a, old)| {
                tuple_max(step, old.delta(new[a]))
            });
            i += 1;
        }
        i
    }

    pub(crate) fn forward_prior_out(&self, agent: &Index) -> Gaussian {
        let skill = self.skills.get(*agent).unwrap();
        skill.forward * skill.likelihood
    }

    pub(crate) fn backward_prior_out<D: Drift<T>>(
        &self,
        agent: &Index,
        agents: &CompetitorStore<T, D>,
    ) -> Gaussian {
        let skill = self.skills.get(*agent).unwrap();
        let n = skill.likelihood * skill.backward;
        n.forget(
            agents[*agent]
                .rating
                .drift
                .variance_for_elapsed(skill.elapsed),
        )
    }

    pub(crate) fn new_backward_info<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
        for (agent, skill) in self.skills.iter_mut() {
            skill.backward = agents[agent].message;
        }
        self.iteration(0, agents);
    }

    pub(crate) fn new_forward_info<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
        for (agent, skill) in self.skills.iter_mut() {
            skill.forward = agents[agent].receive_for_elapsed(skill.elapsed);
        }
        self.iteration(0, agents);
    }

    pub(crate) fn log_evidence<D: Drift<T>>(
        &self,
        online: bool,
        targets: &[Index],
        forward: bool,
        agents: &CompetitorStore<T, D>,
    ) -> f64 {
        // log_evidence is infrequent; a local arena avoids needing &mut self.
        let mut arena = ScratchArena::new();
        if targets.is_empty() {
            if online || forward {
                self.events
                    .iter()
                    .map(|event| {
                        Game::ranked_with_arena(
                            event.within_priors(online, forward, &self.skills, agents),
                            &event.outputs(),
                            &event.weights,
                            self.p_draw,
                            &mut arena,
                        )
                        .evidence
                        .ln()
                    })
                    .sum()
            } else {
                self.events.iter().map(|event| event.evidence.ln()).sum()
            }
        } else if online || forward {
            self.events
                .iter()
                .enumerate()
                .filter(|(_, event)| {
                    event
                        .teams
                        .iter()
                        .flat_map(|team| &team.items)
                        .any(|item| targets.contains(&item.agent))
                })
                .map(|(_, event)| {
                    Game::ranked_with_arena(
                        event.within_priors(online, forward, &self.skills, agents),
                        &event.outputs(),
                        &event.weights,
                        self.p_draw,
                        &mut arena,
                    )
                    .evidence
                    .ln()
                })
                .sum()
        } else {
            self.events
                .iter()
                .filter(|event| {
                    event
                        .teams
                        .iter()
                        .flat_map(|team| &team.items)
                        .any(|item| targets.contains(&item.agent))
                })
                .map(|event| event.evidence.ln())
                .sum()
        }
    }

    pub fn get_composition(&self) -> Vec<Vec<Vec<Index>>> {
        self.events
            .iter()
            .map(|event| {
                event
                    .teams
                    .iter()
                    .map(|team| team.items.iter().map(|item| item.agent).collect::<Vec<_>>())
                    .collect::<Vec<_>>()
            })
            .collect::<Vec<_>>()
    }

    pub fn get_results(&self) -> Vec<Vec<f64>> {
        self.events
            .iter()
            .map(|event| {
                event
                    .teams
                    .iter()
                    .map(|team| team.output)
                    .collect::<Vec<_>>()
            })
            .collect::<Vec<_>>()
    }
}

pub(crate) fn compute_elapsed<T: Time>(last: Option<&T>, current: &T) -> i64 {
    last.map(|l| l.elapsed_to(current).max(0)).unwrap_or(0)
}

#[cfg(test)]
mod tests {
    use approx::assert_ulps_eq;

    use super::*;
    use crate::{
        KeyTable, competitor::Competitor, drift::ConstantDrift, rating::Rating,
        storage::CompetitorStore,
    };

    #[test]
    fn test_one_event_each() {
        let mut index_map = KeyTable::new();
        let a = index_map.get_or_create("a");
        let b = index_map.get_or_create("b");
        let c = index_map.get_or_create("c");
        let d = index_map.get_or_create("d");
        let e = index_map.get_or_create("e");
        let f = index_map.get_or_create("f");
        let mut agents: CompetitorStore<i64, ConstantDrift> = CompetitorStore::new();
        for agent in [a, b, c, d, e, f] {
            agents.insert(
                agent,
                Competitor {
                    rating: Rating::new(
                        Gaussian::from_ms(25.0, 25.0 / 3.0),
                        25.0 / 6.0,
                        ConstantDrift(25.0 / 300.0),
                    ),
                    ..Default::default()
                },
            );
        }
        let mut time_slice = TimeSlice::new(0i64, 0.0);
        time_slice.add_events(
            vec![
                vec![vec![a], vec![b]],
                vec![vec![c], vec![d]],
                vec![vec![e], vec![f]],
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
            &agents,
        );
        let post = time_slice.posteriors();
        assert_ulps_eq!(
            post[&a],
            Gaussian::from_ms(29.205220, 7.194481),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&b],
            Gaussian::from_ms(20.794779, 7.194481),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&c],
            Gaussian::from_ms(20.794779, 7.194481),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&d],
            Gaussian::from_ms(29.205220, 7.194481),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&e],
            Gaussian::from_ms(29.205220, 7.194481),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&f],
            Gaussian::from_ms(20.794779, 7.194481),
            epsilon = 1e-6
        );
        assert_eq!(time_slice.convergence(&agents), 1);
    }

    #[test]
    fn test_same_strength() {
        let mut index_map = KeyTable::new();
        let a = index_map.get_or_create("a");
        let b = index_map.get_or_create("b");
        let c = index_map.get_or_create("c");
        let d = index_map.get_or_create("d");
        let e = index_map.get_or_create("e");
        let f = index_map.get_or_create("f");
        let mut agents: CompetitorStore<i64, ConstantDrift> = CompetitorStore::new();
        for agent in [a, b, c, d, e, f] {
            agents.insert(
                agent,
                Competitor {
                    rating: Rating::new(
                        Gaussian::from_ms(25.0, 25.0 / 3.0),
                        25.0 / 6.0,
                        ConstantDrift(25.0 / 300.0),
                    ),
                    ..Default::default()
                },
            );
        }
        let mut time_slice = TimeSlice::new(0i64, 0.0);
        time_slice.add_events(
            vec![
                vec![vec![a], vec![b]],
                vec![vec![a], vec![c]],
                vec![vec![b], vec![c]],
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
            &agents,
        );
        let post = time_slice.posteriors();
        assert_ulps_eq!(
            post[&a],
            Gaussian::from_ms(24.960978, 6.298544),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&b],
            Gaussian::from_ms(27.095590, 6.010330),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&c],
            Gaussian::from_ms(24.889681, 5.866311),
            epsilon = 1e-6
        );
        assert!(time_slice.convergence(&agents) > 1);
        let post = time_slice.posteriors();
        assert_ulps_eq!(
            post[&a],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&b],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&c],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
    }

    #[test]
    fn test_add_events() {
        let mut index_map = KeyTable::new();
        let a = index_map.get_or_create("a");
        let b = index_map.get_or_create("b");
        let c = index_map.get_or_create("c");
        let d = index_map.get_or_create("d");
        let e = index_map.get_or_create("e");
        let f = index_map.get_or_create("f");
        let mut agents: CompetitorStore<i64, ConstantDrift> = CompetitorStore::new();
        for agent in [a, b, c, d, e, f] {
            agents.insert(
                agent,
                Competitor {
                    rating: Rating::new(
                        Gaussian::from_ms(25.0, 25.0 / 3.0),
                        25.0 / 6.0,
                        ConstantDrift(25.0 / 300.0),
                    ),
                    ..Default::default()
                },
            );
        }
        let mut time_slice = TimeSlice::new(0i64, 0.0);
        time_slice.add_events(
            vec![
                vec![vec![a], vec![b]],
                vec![vec![a], vec![c]],
                vec![vec![b], vec![c]],
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
            &agents,
        );
        time_slice.convergence(&agents);
        let post = time_slice.posteriors();
        assert_ulps_eq!(
            post[&a],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&b],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&c],
            Gaussian::from_ms(25.000000, 5.419212),
            epsilon = 1e-6
        );
        time_slice.add_events(
            vec![
                vec![vec![a], vec![b]],
                vec![vec![a], vec![c]],
                vec![vec![b], vec![c]],
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
            &agents,
        );
        assert_eq!(time_slice.events.len(), 6);
        time_slice.convergence(&agents);
        let post = time_slice.posteriors();
        assert_ulps_eq!(
            post[&a],
            Gaussian::from_ms(25.000003, 3.880150),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&b],
            Gaussian::from_ms(25.000003, 3.880150),
            epsilon = 1e-6
        );
        assert_ulps_eq!(
            post[&c],
            Gaussian::from_ms(25.000003, 3.880150),
            epsilon = 1e-6
        );
    }
}