Every #[cfg(test)] mod tests in src/history.rs now uses the new public
API: add_events(iter) / converge() / learning_curve() / current_skill()
/ log_evidence(). No golden value changed.
Legacy methods removed:
- History::convergence(iters, eps, verbose) → use converge()
- History::learning_curves_by_index() → use learning_curve() / learning_curves()
- HistoryBuilder::gamma(f64) → use .drift(ConstantDrift(g))
- add_events_with_prior downgraded from pub to pub(crate)
Added:
- History::builder_with_key() for custom key types (used by atp example)
- tests/equivalence.rs: Game-level golden integration tests
examples/atp.rs rewritten in new API (Event<i64, String>, converge(),
learning_curve(), drift(ConstantDrift(...))).
Bench Batch::iteration: 21.4 µs (T1 reference: 22.88 µs).
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Public Game API now returns Result<_, InferenceError> on invalid input
(p_draw out of range, outcome rank count mismatches team count).
New types:
- GameOptions { p_draw, convergence } — bundled config
- OwnedGame<T, D> — owned variant of Game that carries its result
and weights internally (no borrow of History's slices). Returned
by public constructors to avoid leaking internal borrow lifetimes.
The internal Game::new is renamed Game::ranked_with_arena (pub(crate))
and keeps the borrowing-arena signature for History's hot path. All
in-crate callers updated (21 call sites: 18 in game.rs tests, 2 in
time_slice.rs, 1 in history.rs).
Game::custom is a minimal T2 escape hatch for power users that exposes
the raw factor and schedule plumbing. It is #[doc(hidden)] for now;
full ergonomics follow in T4.
Game::log_evidence() accessor added on both Game and OwnedGame (was
previously accessible only through the pub(crate) evidence field).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes the factor-graph machinery so power users can define custom
factors and schedules (see Game::custom in the next task). The
internal factor/ and schedule/ modules remain unchanged (still
referenced by Game's internals via crate::factor); the user-facing
public API goes through the new factors module re-exports:
pub use crate::factor::{BuiltinFactor, Factor, VarId, VarStore};
pub use crate::factor::rank_diff::RankDiffFactor;
pub use crate::factor::team_sum::TeamSumFactor;
pub use crate::factor::trunc::TruncFactor;
pub use crate::schedule::{EpsilonOrMax, Schedule, ScheduleReport};
#[allow(dead_code)] guards on the previously-pub(crate) items are
removed because the types are now referenced via the re-exports.
Promotes the VarStore methods (len, alloc, get, set, clear, new) to pub
and adds is_empty to satisfy clippy. The marginals field stays private
as an implementation detail; users go through the public methods.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
New public query methods on History:
- current_skill(&K) -> Option<Gaussian>: latest posterior for a key
- learning_curve(&K) -> Vec<(T, Gaussian)>: single-key history
- learning_curves() -> HashMap<K, Vec<(T, Gaussian)>>: all-keys history
- log_evidence() -> f64: total log-evidence (was log_evidence(false,&[]))
- log_evidence_for(&[&K]) -> f64: subset log-evidence
- predict_quality(&[&[&K]]) -> f64: draw-probability match quality
- predict_outcome(&[&[&K]]) -> Vec<f64>: 2-team win probabilities
learning_curves() changed from returning HashMap<Index, Vec<(i64, Gaussian)>>
to HashMap<K, Vec<(T, Gaussian)>>. A new learning_curves_by_index()
helper preserves the old Index-keyed shape for callers that ingest via
the pub(crate) Index path.
log_evidence(false, &[]) was renamed to log_evidence_internal and made
pub(crate); the new zero-arg log_evidence() wraps it.
predict_outcome supports only two teams in T2; N-team support is deferred to T4.
KeyTable::get no longer requires ToOwned<Owned = K> (only needed for
get_or_create), allowing query methods to use simpler bounds.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Third tier of the ingestion API (spec Section 4). Powers one-off
events with irregular shapes where neither record_winner (too
simple) nor typed add_events (too verbose) fits cleanly.
EventBuilder accumulates teams, weights, and outcome. Supports:
- .team([keys]) — add a team
- .weights([w..]) — per-member weights on the most-recently-added team
- .ranking([ranks]) — explicit per-team ranks
- .winner(i) — convenience: team i wins, others tied
- .draw() — all teams tied
- .commit() — finalize into an Event<T, K> and delegate to add_events
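A minimal, self-contained sketch of the builder flow listed above. The
field layout, the Pending enum, and build() (which stands in for
.commit() and returns the event instead of delegating to add_events)
are assumptions for illustration, not the crate's actual definitions.

```rust
/// Hypothetical event shape: (key, weight) per member, one rank per team.
#[derive(Debug, Clone, PartialEq)]
pub struct Event<T, K> {
    pub time: T,
    pub teams: Vec<Vec<(K, f64)>>,
    pub ranks: Vec<u32>, // assumed convention: lower rank = better
}

/// Outcome recorded by the builder, resolved once the team count is known.
enum Pending {
    Unset,
    Winner(usize),
    Draw,
    Ranking(Vec<u32>),
}

pub struct EventBuilder<T, K> {
    time: T,
    teams: Vec<Vec<(K, f64)>>,
    outcome: Pending,
}

impl<T, K> EventBuilder<T, K> {
    pub fn new(time: T) -> Self {
        EventBuilder { time, teams: Vec::new(), outcome: Pending::Unset }
    }

    /// Add a team; every member starts with weight 1.0.
    pub fn team(mut self, keys: impl IntoIterator<Item = K>) -> Self {
        self.teams.push(keys.into_iter().map(|k| (k, 1.0)).collect());
        self
    }

    /// Override member weights on the most recently added team.
    pub fn weights(mut self, ws: impl IntoIterator<Item = f64>) -> Self {
        if let Some(team) = self.teams.last_mut() {
            for (member, w) in team.iter_mut().zip(ws) {
                member.1 = w;
            }
        }
        self
    }

    pub fn ranking(mut self, ranks: impl IntoIterator<Item = u32>) -> Self {
        self.outcome = Pending::Ranking(ranks.into_iter().collect());
        self
    }

    pub fn winner(mut self, i: usize) -> Self {
        self.outcome = Pending::Winner(i);
        self
    }

    pub fn draw(mut self) -> Self {
        self.outcome = Pending::Draw;
        self
    }

    /// Resolve the outcome against the team count and emit the event.
    pub fn build(self) -> Result<Event<T, K>, String> {
        let n = self.teams.len();
        let ranks: Vec<u32> = match self.outcome {
            Pending::Unset => return Err("no outcome set".to_string()),
            Pending::Draw => vec![0; n],
            Pending::Winner(i) if i < n => {
                (0..n).map(|t| if t == i { 0 } else { 1 }).collect()
            }
            Pending::Winner(_) => return Err("winner index out of range".to_string()),
            Pending::Ranking(r) if r.len() == n => r,
            Pending::Ranking(_) => return Err("rank count != team count".to_string()),
        };
        Ok(Event { time: self.time, teams: self.teams, ranks })
    }
}
```

Deferring outcome resolution to build() lets .winner(i) be called
before all teams have been added.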
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Public API gains:
History::add_events<I: IntoIterator<Item = Event<T, K>>>(events)
-> Result<(), InferenceError>
which accepts the typed Event<T, K> shape added in Task 10. Ranks
from Outcome::Ranked are mapped to the legacy "higher f64 = better"
results internally.
add_events_with_prior now takes Vec<T> for times (was Vec<i64>),
generifying the whole internal path over T in a single fully-generic
impl<T: Time, D: Drift<T>, O: Observer<T>, K> block. The i64-specific
block is gone; record_winner/record_draw are now generic over T.
add_events_with_prior stays pub (not pub(crate)) because the ATP
example calls it directly with pre-built Index-based composition;
the new typed add_events is the primary public API going forward.
In-crate tests updated to call add_events_with_prior with an empty
HashMap. tests/api_shape.rs added with 3 integration tests covering
bulk ingest, draw, and mismatched-outcome error.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements tier 2 of spec Section 4 "three-tier event ingestion":
one-off match convenience. Also addresses spec open question 3 by
exposing Index plus intern/lookup for power users.
History and HistoryBuilder gain a 4th generic parameter
K: Eq + Hash + Clone = &'static str. The default ensures existing
tests using Index-based add_events compile unchanged.
History internally owns a KeyTable<K>. intern(&Q) creates or returns
an Index for the given key; lookup(&Q) returns Option<Index> without
creating. record_winner and record_draw are thin 1v1 wrappers around
the internal add_events_with_prior.
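The intern/lookup split can be sketched with a small stand-alone table.
This is an illustration under assumptions: the HashMap backing, the
sequential Index allocation, and the method bodies are hypothetical;
only the get / get_or_create names and the ToOwned bound come from the
commit messages in this series.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Index(pub usize);

/// Hypothetical key table: maps user keys to dense indices.
pub struct KeyTable<K> {
    map: HashMap<K, Index>,
}

impl<K: Eq + Hash> KeyTable<K> {
    pub fn new() -> Self {
        KeyTable { map: HashMap::new() }
    }

    /// Lookup only: never allocates a new Index, so no ToOwned bound.
    pub fn get<Q>(&self, key: &Q) -> Option<Index>
    where
        K: Borrow<Q>,
        Q: Hash + Eq + ?Sized,
    {
        self.map.get(key).copied()
    }

    /// Intern: return the existing Index or assign the next free one.
    pub fn get_or_create<Q>(&mut self, key: &Q) -> Index
    where
        K: Borrow<Q>,
        Q: Hash + Eq + ?Sized + ToOwned<Owned = K>,
    {
        if let Some(&idx) = self.map.get(key) {
            return idx;
        }
        let idx = Index(self.map.len());
        self.map.insert(key.to_owned(), idx);
        idx
    }
}
```

With K = String, both methods accept a plain &str, and only
get_or_create pays for the owned copy.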
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
InferenceError gains MismatchedShape (user-input length mismatches),
InvalidProbability (p_draw outside [0, 1]), and ConvergenceFailed
(max_iter exceeded without reaching epsilon). NegativePrecision stays.
History::add_events_with_prior and History::add_events now return
Result<(), InferenceError>. The previous assert! macros checking
composition/results/times/weights shape are replaced by matched
error returns.
Internal debug_assert! macros for arithmetic invariants stay; this
change only affects boundary validation of user input.
Tests updated to call .unwrap() on the Result. The old signatures
will be fully replaced in Task 15 (typed add_events(iter)) and the
nested-Vec wrapper removed in Task 20.
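A sketch of the assert!-to-Result change at the input boundary. The
variant names are the ones listed above; their payloads, the Display
text, and the validate_input helper are assumptions made up for this
example.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
pub enum InferenceError {
    NegativePrecision,
    // Payload fields below are hypothetical.
    MismatchedShape { what: &'static str, expected: usize, got: usize },
    InvalidProbability(f64),
    ConvergenceFailed { iterations: usize },
}

impl fmt::Display for InferenceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InferenceError::NegativePrecision => write!(f, "negative precision"),
            InferenceError::MismatchedShape { what, expected, got } => {
                write!(f, "{what}: expected {expected} entries, got {got}")
            }
            InferenceError::InvalidProbability(p) => {
                write!(f, "probability {p} is outside [0, 1]")
            }
            InferenceError::ConvergenceFailed { iterations } => {
                write!(f, "no convergence after {iterations} iterations")
            }
        }
    }
}

impl std::error::Error for InferenceError {}

/// Hypothetical boundary check: returns an error where an assert! used to panic.
pub fn validate_input(
    composition: &[Vec<usize>],
    results: &[f64],
    p_draw: f64,
) -> Result<(), InferenceError> {
    if results.len() != composition.len() {
        return Err(InferenceError::MismatchedShape {
            what: "results",
            expected: composition.len(),
            got: results.len(),
        });
    }
    // `contains` is false for NaN, so NaN is rejected as well.
    if !(0.0..=1.0).contains(&p_draw) {
        return Err(InferenceError::InvalidProbability(p_draw));
    }
    Ok(())
}
```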
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
New public types:
- ConvergenceOptions { max_iter, epsilon } — config for the loop
- ConvergenceReport { iterations, final_step, log_evidence, converged,
per_iteration_time, slices_skipped } — post-hoc summary
History and HistoryBuilder gain a third generic parameter
O: Observer<T> = NullObserver. Builder methods:
- .convergence(opts) sets the ConvergenceOptions
- .observer(o) plugs in an Observer (reshapes the builder's O param)
History::converge() runs the existing iteration loop driven by the
stored opts, emits observer callbacks on each iteration end and on
completion, and returns Result<ConvergenceReport, InferenceError>.
The old convergence(iters, eps, verbose) stays for now; it is removed
in Task 20 once the tests are translated.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Observer replaces verbose: bool with structured progress callbacks:
on_iteration_end, on_batch_processed, on_converged — all no-op
default impls so users override only what they need. NullObserver
is a ZST default.
Send + Sync bounds deferred to T3 (Rayon support).
Fully additive — wired into History::converge in Task 12.
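A sketch of the callback pattern described above: default no-op
methods, a zero-sized NullObserver, and an observer that overrides a
single hook. The method signatures, IterationCounter, and run_loop are
assumptions for illustration only.

```rust
/// Progress callbacks; every method has a no-op default.
/// Argument lists are hypothetical.
pub trait Observer<T> {
    fn on_iteration_end(&mut self, _iteration: usize, _step: f64) {}
    fn on_batch_processed(&mut self, _time: &T) {}
    fn on_converged(&mut self, _iterations: usize) {}
}

/// Zero-sized default: all callbacks are the inherited no-ops.
pub struct NullObserver;
impl<T> Observer<T> for NullObserver {}

/// Example observer that overrides only the hook it needs.
pub struct IterationCounter {
    pub count: usize,
}

impl<T> Observer<T> for IterationCounter {
    fn on_iteration_end(&mut self, _iteration: usize, _step: f64) {
        self.count += 1;
    }
}

/// Stand-in for a convergence loop that reports progress to an observer.
pub fn run_loop<T, O: Observer<T>>(observer: &mut O, slice_times: &[T], max_iter: usize) {
    for i in 0..max_iter {
        for t in slice_times {
            observer.on_batch_processed(t);
        }
        observer.on_iteration_end(i, 0.0);
    }
    observer.on_converged(max_iter);
}
```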
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Replaces the old nested Vec<Vec<Vec<_>>> event description on the
public API boundary. Member<K>::from(K) enables ergonomic literal
lists. Member::with_weight / with_prior are builder methods for the
optional per-event overrides.
Fully additive — no existing call sites updated. Consumed by
History::add_events(iter) in Task 15.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Outcome::winner(i, n), Outcome::draw(n), Outcome::ranking(iter) are
the convenience constructors. Marked #[non_exhaustive] so Scored can
be added in T4 without breaking match exhaustiveness.
Adds smallvec = "1" as a direct dependency.
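A sketch of the three convenience constructors. Assumptions: ranks are
stored as u32 with lower meaning better, a plain Vec stands in for the
smallvec-backed storage, and team_count is a helper added only for
this example.

```rust
/// Hypothetical outcome type; a plain Vec replaces SmallVec here.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// One rank per team; lower is better, equal ranks are ties.
    Ranked(Vec<u32>),
}

impl Outcome {
    /// Team `i` of `n` wins; every other team shares second place.
    pub fn winner(i: usize, n: usize) -> Self {
        Outcome::Ranked((0..n).map(|t| if t == i { 0 } else { 1 }).collect())
    }

    /// All `n` teams tie.
    pub fn draw(n: usize) -> Self {
        Outcome::Ranked(vec![0; n])
    }

    /// Explicit per-team ranks.
    pub fn ranking(ranks: impl IntoIterator<Item = u32>) -> Self {
        Outcome::Ranked(ranks.into_iter().collect())
    }

    /// Number of teams the outcome describes.
    pub fn team_count(&self) -> usize {
        match self {
            Outcome::Ranked(r) => r.len(),
        }
    }
}
```

Because of #[non_exhaustive], code outside the defining crate must
include a wildcard arm when matching, which is what lets a later
variant be added without a breaking change.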
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
The time: bool flag encoded 'no time axis', which is now expressed at the type
level (T = Untimed). The old !self.time branch generated sequential
i64 timestamps internally (1..=n) and bumped all agents' last_time at
every tick; tests that relied on this now pass those timestamps
explicitly and reflect the correct time=true elapsed semantics.
Collapsed `if self.time { A } else { B }` into the A branch everywhere
in add_events_with_prior. Removed the two !self.time blocks that
updated all agents' last_time at every slice regardless of participation.
sort_time is now generic over `T: Copy + Ord`.
HistoryBuilder::time(bool) removed. History<i64, ConstantDrift>
default remains, producing the same behavior as old .time(true).
The test_env_ttt Gaussian goldens are updated to reflect the correct
time=true semantics (b.elapsed=2 instead of 1 due to b skipping t=2);
this is a correction: the old !self.time last_time bump was an
implementation quirk that diverged from the Python reference.
55 tests pass. clippy clean. fmt clean.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drift now takes &T -> &T and is generic over the time axis. Untimed
impls return elapsed=0. ConstantDrift impl covers all T via the Time
trait. An additional variance_for_elapsed(i64) method on the trait
serves callers that work with the pre-cached i64 elapsed count.
Competitor.last_time moves from i64 with MIN sentinel to Option<T>
with None sentinel. receive(&T) computes variance from last_time
dynamically; receive_for_elapsed(i64) uses a pre-cached elapsed count
(needed in convergence sweeps where last_time has already advanced).
TimeSlice.time changes from i64 to T. compute_elapsed is now generic
over T and takes Option<&T> for the last-seen time. new_forward_info
uses receive_for_elapsed to preserve the cached elapsed during sweeps.
History<D> becomes History<T, D>; HistoryBuilder<D> becomes
HistoryBuilder<T, D>; Game<D> becomes Game<T, D>. Defaults keep
existing call sites compiling with zero changes: T = i64,
D = ConstantDrift.
add_events / add_events_with_prior stay on impl History<i64, D> since
times: Vec<i64> is i64-specific (Task 8 will generalise this).
In !self.time mode the old i64::MAX sentinel guaranteed elapsed=1 for
every slice transition regardless of time gaps. Replaced by advancing
all previously-seen agents' last_time to Some(current_slice_time) at
the end of each slice; this preserves elapsed=1 between adjacent
slices in sequential-integer untimed mode.
The time: bool field on History and .time(bool) on HistoryBuilder are
NOT removed by this task — deferred to Task 8 so this commit is
purely a type-level generification.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Foundation for generic History time axis. Untimed is the ZST case
(no drift across slices); i64 is the standard timestamp case.
Additional impls (time::OffsetDateTime, chrono) can be added behind
feature flags in follow-up work.
The trait is not yet wired into History — that happens in Task 7
along with generifying Drift over T.
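A sketch of what such a time-axis trait could look like. The method
name elapsed_since, the supertrait bounds, and the compute_elapsed
helper shown here are assumptions; the commit only states that Untimed
is a ZST with no drift and that i64 is the standard timestamp case.

```rust
/// Hypothetical time-axis trait.
pub trait Time: Copy + Ord {
    /// Elapsed ticks between `earlier` and `self`.
    fn elapsed_since(&self, earlier: &Self) -> i64;
}

/// Zero-sized "no time axis" marker: nothing ever elapses.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub struct Untimed;

impl Time for Untimed {
    fn elapsed_since(&self, _earlier: &Self) -> i64 {
        0
    }
}

/// Integer timestamps: elapsed is the plain difference.
impl Time for i64 {
    fn elapsed_since(&self, earlier: &Self) -> i64 {
        self - earlier
    }
}

/// Elapsed time since the last sighting; a first sighting counts as 0.
pub fn compute_elapsed<T: Time>(last: Option<&T>, now: &T) -> i64 {
    last.map_or(0, |l| now.elapsed_since(l))
}
```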
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
TimeSlice says what it is: every event sharing one timestamp. The
History field .batches is renamed to .time_slices. Local variables
named `batch` referring to TimeSlice instances are renamed to
`time_slice`.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Competitor holds dynamic per-history state (message, last_time) for
someone competing; its configuration lives in a Rating.
AgentStore renamed to CompetitorStore to match. The internal
`clean()` free function's parameter name changed from `agents` to
`competitors` for consistency.
Local variable names (agent_idx, this_agent) inside history.rs are
left unchanged — they represent abstract identifiers, not Competitor
instances.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
The struct holds prior/beta/drift — a rating configuration, not a
person. The person-with-temporal-state is the Competitor (renamed in
the next task). Resolves Player/Agent ambiguity.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address code review feedback from Task 2:
- key_table module doesn't need pub visibility; the KeyTable re-export
at lib.rs root already exposes the only public type. Matches the
error/history private-module pattern.
- Revert an incidental bench variable rename (index_map → index) that
wasn't part of the task scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The former name collided with the popular indexmap crate. KeyTable
lives in its own module. Public API unchanged beyond the rename.
Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
21-task plan covering all renames and new public API landing per
Section 7 "T2" of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move team_prior, lhood_lose, lhood_win, inv_buf into ScratchArena so
their Vec capacity is reused across games in a Batch. Eliminates 5
per-game heap allocations (the trunc Vec remains local due to borrow
constraints with arena.vars).
Batch::iteration: 23.0 µs (down from 27.0 µs with naive local Vecs;
8% above T0 21.253 µs baseline due to TruncFactor propagate overhead).
Game::likelihoods now uses VarStore (for diff vars) and TruncFactor
(for EP truncation + evidence caching) instead of TeamMessage and
DiffMessage. The EP loop structure is preserved exactly; VarId-keyed
diff vars live in the arena's VarStore (capacity reused per batch).
ScratchArena loses teams/diffs/ties/margins; gains VarStore and
sort_buf (sort_perm allocation eliminated). message.rs deleted.
Public API of Game (new, posteriors, likelihoods, evidence) unchanged.
EpsilonOrMax mirrors today's Game::likelihoods loop: sweep forward
then backward over iterating factors, capped at 10 iterations or
step <= 1e-6. Setup factors (TeamSum) run exactly once before the
loop begins.
ScheduleReport is the only public surface from this module.
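The schedule described above can be sketched over plain function
pointers. This is an illustration under assumptions: the field names,
the run signature, and the init / pull_toward_one toy factors are
hypothetical; only the forward-then-backward sweep, the run-once setup
phase, and the epsilon-or-max stopping rule come from the description.

```rust
/// Hypothetical report shape.
#[derive(Debug, Clone, PartialEq)]
pub struct ScheduleReport {
    pub iterations: usize,
    pub final_step: f64,
    pub converged: bool,
}

/// Stop when the largest update is <= epsilon or after max_iter sweeps.
pub struct EpsilonOrMax {
    pub epsilon: f64,
    pub max_iter: usize,
}

impl EpsilonOrMax {
    pub fn run<S>(
        &self,
        state: &mut S,
        setup: &[fn(&mut S)],
        iterating: &[fn(&mut S) -> f64],
    ) -> ScheduleReport {
        // Setup factors run exactly once, before the loop.
        for f in setup {
            (*f)(&mut *state);
        }
        let mut step = f64::INFINITY;
        let mut iterations = 0;
        while iterations < self.max_iter && step > self.epsilon {
            step = 0.0;
            // Forward sweep, then backward sweep, tracking the largest update.
            for f in iterating.iter() {
                step = step.max((*f)(&mut *state));
            }
            for f in iterating.iter().rev() {
                step = step.max((*f)(&mut *state));
            }
            iterations += 1;
        }
        ScheduleReport { iterations, final_step: step, converged: step <= self.epsilon }
    }
}

/// Toy setup factor: reset the state.
pub fn init(x: &mut f64) {
    *x = 0.0;
}

/// Toy iterating factor: move halfway toward 1.0, return the update size.
pub fn pull_toward_one(x: &mut f64) -> f64 {
    let d = (1.0 - *x) * 0.5;
    *x += d;
    d.abs()
}
```

With the toy factor, each sweep halves the remaining distance twice,
so a 1e-3 tolerance is met on the sixth iteration, while 1e-6 is not
reached within the 10-iteration cap.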
EP truncation factor that operates on a diff variable. Stores its
outgoing message so the cavity computation produces the correct EP
message on each propagation. The first propagation caches the
evidence contribution (cdf-bounded probability) for log_evidence().
Promotes lib::cdf to pub(crate) so the factor can use it.
Maintains diff = team_a - team_b across three variables. On each
propagation, reads the team-perf marginals (which may have been
updated by neighboring factors) and computes the new diff via
Gaussian Sub (variance addition).
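The "variance addition" rule for the diff variable, as a stand-alone
sketch. The mean/sigma struct here is a simplified stand-in for the
crate's Gaussian, chosen only to keep the arithmetic visible.

```rust
use std::ops::Sub;

/// Simplified stand-in: mean and standard deviation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Gaussian {
    pub mu: f64,
    pub sigma: f64,
}

impl Sub for Gaussian {
    type Output = Gaussian;

    /// Difference of independent Gaussians: means subtract, variances add.
    fn sub(self, rhs: Gaussian) -> Gaussian {
        Gaussian {
            mu: self.mu - rhs.mu,
            sigma: (self.sigma * self.sigma + rhs.sigma * rhs.sigma).sqrt(),
        }
    }
}
```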
Computes the weighted sum of player performance Gaussians into a
team-performance variable. Runs once per game (no iteration needed).
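A sketch of the weighted sum, assuming independent player performances
so that the team mean is the weighted sum of means and the team
variance is the sum of squared-weight variances. The mean/sigma struct
and the team_sum function name are hypothetical.

```rust
/// Simplified stand-in: mean and standard deviation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Gaussian {
    pub mu: f64,
    pub sigma: f64,
}

/// Team performance = sum_i w_i * perf_i for independent perf_i.
pub fn team_sum(perfs: &[Gaussian], weights: &[f64]) -> Gaussian {
    assert_eq!(perfs.len(), weights.len(), "one weight per player");
    let mut mu = 0.0_f64;
    let mut var = 0.0_f64;
    for (g, w) in perfs.iter().zip(weights) {
        mu += w * g.mu;
        var += w * w * g.sigma * g.sigma;
    }
    Gaussian { mu, sigma: var.sqrt() }
}
```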
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the trait that all factors implement and the enum dispatcher
used by the schedule to drive heterogeneous factors without dynamic
dispatch in the hot loop.
The three built-in factors (TeamSum, RankDiff, Trunc) are stubbed
out; concrete implementations follow in tasks 4-6.
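A sketch of trait-plus-enum dispatch with two toy factors. Everything
here is hypothetical (the propagate signature, the f64 variable slice,
PullToward, CopyVar, ToyFactor); the point is only that a match on an
enum lets one loop drive different factor types without Box<dyn Factor>.

```rust
/// Hypothetical factor interface: update variables, return the update size.
pub trait Factor {
    fn propagate(&mut self, vars: &mut [f64]) -> f64;
}

/// Toy factor: move one variable halfway toward a target.
pub struct PullToward {
    pub var: usize,
    pub target: f64,
}

impl Factor for PullToward {
    fn propagate(&mut self, vars: &mut [f64]) -> f64 {
        let delta = (self.target - vars[self.var]) * 0.5;
        vars[self.var] += delta;
        delta.abs()
    }
}

/// Toy factor: copy one variable into another.
pub struct CopyVar {
    pub src: usize,
    pub dst: usize,
}

impl Factor for CopyVar {
    fn propagate(&mut self, vars: &mut [f64]) -> f64 {
        let delta = vars[self.src] - vars[self.dst];
        vars[self.dst] = vars[self.src];
        delta.abs()
    }
}

/// Enum dispatcher: static dispatch via match, no vtable in the loop.
pub enum ToyFactor {
    PullToward(PullToward),
    CopyVar(CopyVar),
}

impl Factor for ToyFactor {
    fn propagate(&mut self, vars: &mut [f64]) -> f64 {
        match self {
            ToyFactor::PullToward(f) => f.propagate(vars),
            ToyFactor::CopyVar(f) => f.propagate(vars),
        }
    }
}

/// One pass over a heterogeneous factor list; returns the largest update.
pub fn sweep(factors: &mut [ToyFactor], vars: &mut [f64]) -> f64 {
    let mut step = 0.0_f64;
    for f in factors.iter_mut() {
        step = step.max(f.propagate(vars));
    }
    step
}
```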
Foundation types for the T1 factor graph machinery. VarStore is a
flat Vec<Gaussian> indexed by VarId; variables are allocated by
alloc() and the store can be cleared between games to reuse capacity.
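A minimal sketch of such a store. The u32 payload of VarId, the
natural-parameter Gaussian fields, and the method bodies are
assumptions; the method names follow the ones mentioned in this series
(new, alloc, get, set, len, is_empty, clear).

```rust
/// Stand-in Gaussian; the field layout is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Gaussian {
    pub pi: f64,
    pub tau: f64,
}

/// Handle into a VarStore.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VarId(pub u32);

/// Flat store of variable marginals, indexed by VarId.
#[derive(Debug, Default)]
pub struct VarStore {
    marginals: Vec<Gaussian>,
}

impl VarStore {
    pub fn new() -> Self {
        VarStore { marginals: Vec::new() }
    }

    /// Allocate a new variable and return its handle.
    pub fn alloc(&mut self, init: Gaussian) -> VarId {
        let id = VarId(self.marginals.len() as u32);
        self.marginals.push(init);
        id
    }

    pub fn get(&self, id: VarId) -> Gaussian {
        self.marginals[id.0 as usize]
    }

    pub fn set(&mut self, id: VarId, g: Gaussian) {
        self.marginals[id.0 as usize] = g;
    }

    pub fn len(&self) -> usize {
        self.marginals.len()
    }

    pub fn is_empty(&self) -> bool {
        self.marginals.is_empty()
    }

    /// Drop all variables but keep the allocation for the next game.
    pub fn clear(&mut self) {
        self.marginals.clear();
    }
}
```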
Part of T1 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
Bite-sized, TDD-style task breakdown for the second tier of the engine
redesign: introduce VarStore, Factor trait, BuiltinFactor enum, and
EpsilonOrMax schedule, then re-implement Game::likelihoods on top of
the new machinery. Internal-only refactor; public Game/History API
unchanged.
Acceptance: existing tests pass to within ULP-level tolerance, iteration
counts match T0, and Batch::iteration does not regress vs T0 (~21.5 µs).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Batch::iteration: 29.840 µs → 21.253 µs (1.40×)
Gaussian::mul: 1.568 ns → 218.69 ps (7.17×)
Gaussian::div: 1.572 ns → 218.64 ps (7.19×)
Gaussian arithmetic beat its target (7×+ vs the 1.5–2× expected). Batch::iteration
reached only 1.40× against the 3× target. Post-mortem: the bench exercises 100 tiny
2-team events and the dominant cost is still Vec allocation in within_priors,
sort_perm, and Game::likelihoods. The HashMap→Vec win shows at the History
level (forward/backward sweep) which this bench doesn't exercise.
Remediation plan documented in benches/baseline.txt: arena-ify sort_perm,
within_priors, and Game::likelihoods in T1 when Game's internals are
redesigned around the new factor graph.
38/38 tests passing. Closes T0 tier.
Game::likelihoods previously allocated four Vecs (teams, diffs, ties,
margins) on every call. Batch now owns one ScratchArena reused across
all Game::new calls in the iteration loop; likelihoods() clears and
extends the arena buffers instead of allocating fresh.
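The clear-and-extend pattern in a stand-alone sketch. The two f64
buffers and process_game are simplified stand-ins made up for this
example; the real arena holds the message types named above.

```rust
/// Hypothetical scratch arena with two reusable buffers.
#[derive(Debug, Default)]
pub struct ScratchArena {
    pub teams: Vec<f64>,
    pub diffs: Vec<f64>,
}

impl ScratchArena {
    /// Empty the buffers; Vec::clear keeps the allocated capacity.
    pub fn reset(&mut self) {
        self.teams.clear();
        self.diffs.clear();
    }
}

/// Stand-in for per-game work: fill the buffers, return the sum of
/// adjacent team differences.
pub fn process_game(arena: &mut ScratchArena, team_perfs: &[f64]) -> f64 {
    arena.reset();
    arena.teams.extend_from_slice(team_perfs);
    arena.diffs.extend(arena.teams.windows(2).map(|w| w[0] - w[1]));
    arena.diffs.iter().sum()
}
```

After the first few games the buffers have grown to the largest shape
seen, and later calls run without touching the allocator.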
For log_evidence (called infrequently), a local ScratchArena is created
per invocation so the method signature stays &self.
Also: add #[derive(Debug)] to TeamMessage and DiffMessage (required by
ScratchArena's own Debug derive).
Part of T0 engine redesign.
AgentStore<D> is a Vec<Option<Agent<D>>>-backed store indexed directly
by Index.0, eliminating per-iteration hashing in the cross-history
forward/backward sweep. Implements Index<Index>/IndexMut<Index> for
ergonomic agent access.
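A sketch of an Option-slot store with Index/IndexMut access. DenseStore,
its insert/get methods, and the panic message are hypothetical; the
std::ops traits are imported under aliases because the handle type is
itself called Index.

```rust
use std::ops::{Index as IndexOp, IndexMut};

/// Dense handle, used directly as a Vec position.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Index(pub usize);

/// Hypothetical Vec<Option<A>>-backed store.
pub struct DenseStore<A> {
    slots: Vec<Option<A>>,
}

impl<A> DenseStore<A> {
    pub fn new() -> Self {
        DenseStore { slots: Vec::new() }
    }

    /// Insert at `idx`, growing the slot vector with empty slots if needed.
    pub fn insert(&mut self, idx: Index, agent: A) {
        if idx.0 >= self.slots.len() {
            self.slots.resize_with(idx.0 + 1, || None);
        }
        self.slots[idx.0] = Some(agent);
    }

    /// Non-panicking lookup: array access, no hashing.
    pub fn get(&self, idx: Index) -> Option<&A> {
        self.slots.get(idx.0).and_then(|s| s.as_ref())
    }
}

impl<A> IndexOp<Index> for DenseStore<A> {
    type Output = A;

    fn index(&self, idx: Index) -> &A {
        self.slots[idx.0].as_ref().expect("no entry at index")
    }
}

impl<A> IndexMut<Index> for DenseStore<A> {
    fn index_mut(&mut self, idx: Index) -> &mut A {
        self.slots[idx.0].as_mut().expect("no entry at index")
    }
}
```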
AgentStore is public (so benches/batch.rs can use it). SkillStore
remains pub(crate) since Skill is pub(crate) in batch.rs.
HashMap<Index, _> is now only used for the posteriors() return value
(temporary; will be replaced in T2 with a proper typed return) and
for the add_events_with_prior(priors: HashMap<Index, Player<D>>) API
(also T2 target).
Part of T0 engine redesign.
SkillStore is a Vec<Skill>-backed dense store with a parallel present
mask, indexed directly by Index.0. Eliminates per-iteration hashing
in the within-slice convergence loop; O(1) array lookup replaces O(1)
amortised hash lookup with better cache behaviour.
Iteration order is now ascending-by-Index (was arbitrary for HashMap);
EP fixed point is order-independent so posteriors are unchanged.
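A sketch of the dense-vector-plus-mask layout and its ascending
iteration order. MaskedStore, the Skill fields, and the method names
are assumptions for illustration.

```rust
/// Stand-in skill record; the real Skill carries more state.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Skill {
    pub mu: f64,
    pub sigma: f64,
}

/// Hypothetical dense store: values plus a parallel presence mask.
#[derive(Debug, Default)]
pub struct MaskedStore {
    skills: Vec<Skill>,
    present: Vec<bool>,
}

impl MaskedStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert at `idx`, padding with absent default slots if needed.
    pub fn insert(&mut self, idx: usize, skill: Skill) {
        if idx >= self.skills.len() {
            self.skills.resize(idx + 1, Skill::default());
            self.present.resize(idx + 1, false);
        }
        self.skills[idx] = skill;
        self.present[idx] = true;
    }

    pub fn get(&self, idx: usize) -> Option<&Skill> {
        match self.present.get(idx) {
            Some(true) => Some(&self.skills[idx]),
            _ => None,
        }
    }

    /// Present entries only, always in ascending index order.
    pub fn iter(&self) -> impl Iterator<Item = (usize, &Skill)> + '_ {
        self.skills
            .iter()
            .enumerate()
            .zip(self.present.iter())
            .filter(|(_, p)| **p)
            .map(|((i, s), _)| (i, s))
    }
}
```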
Part of T0 engine redesign.
mu_sigma was deleted as part of the Gaussian nat-param rewrite (its
only callers were the old Mul/Div impls). This commit adds the
InferenceError enum as a seed for the T2 API surface, with the
NegativePrecision variant that mu_sigma would have returned.
Part of T0 engine redesign.
Mul and Div become two f64 adds/subs with no sqrt in the hot path.
mu() and sigma() are computed on demand from stored pi/tau.
Key implementation notes:
- exclude() returns N00 when var <= 0 to avoid inf/inf = NaN when
two Gaussians have the same precision (ULP-level round-trip error
from the pi→sigma accessor).
- Mul<f64> by 0.0 returns N00 (point mass at 0), matching old behavior.
- from_ms(0, 0) == N00 {pi:inf, tau:0}; from_ms(0, inf) == N_INF {pi:0, tau:0}.
Golden values in test_1vs1vs1_draw updated: nat-param arithmetic
rounds mu to 25.0 (was 24.999999) and shifts sigma by ~3e-7.
Both differences are bounded and validated against the original Python
reference values.
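A stand-alone sketch of the natural-parameter arithmetic, with
pi = 1/sigma^2 and tau = mu/sigma^2. It is a simplification, not the
crate's implementation: the degenerate cases are handled minimally and
the exclude() and Mul<f64> behaviour noted above is not reproduced.

```rust
use std::ops::{Div, Mul};

/// Gaussian in natural parameters: pi = 1/sigma^2, tau = mu/sigma^2.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Gaussian {
    pub pi: f64,
    pub tau: f64,
}

impl Gaussian {
    /// Build from mean and standard deviation.
    /// Simplification: sigma == 0 always maps to {pi: inf, tau: 0}.
    pub fn from_ms(mu: f64, sigma: f64) -> Self {
        if sigma == 0.0 {
            return Gaussian { pi: f64::INFINITY, tau: 0.0 };
        }
        let pi = 1.0 / (sigma * sigma);
        Gaussian { pi, tau: mu * pi }
    }

    /// Mean, computed on demand; 0 for the degenerate pi == 0 or inf cases.
    pub fn mu(&self) -> f64 {
        if self.pi > 0.0 && self.pi.is_finite() {
            self.tau / self.pi
        } else {
            0.0
        }
    }

    /// Standard deviation, computed on demand.
    pub fn sigma(&self) -> f64 {
        (1.0 / self.pi).sqrt()
    }
}

impl Mul for Gaussian {
    type Output = Gaussian;

    /// Product of densities: natural parameters add, no sqrt needed.
    fn mul(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

impl Div for Gaussian {
    type Output = Gaussian;

    /// Quotient of densities (cavity): natural parameters subtract.
    fn div(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}
```

Multiplying and then dividing by the same message recovers the
original up to rounding, which is the round-trip that the cavity
computation relies on.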
Part of T0 engine redesign.
- Promotes Gaussian::pi and Gaussian::tau to public so benches/gaussian.rs
compiles, then captures baseline numbers for the T0 acceptance gate.
- Fixes the divide bench: g1/g2 panicked (g1 has lower precision than g2;
cavity requires pi_num >= pi_den). Swapped to g2/g1 (well-defined).
Baseline on Apple M5 Pro:
Batch::iteration 29.840 µs
Gaussian::mul 1.568 ns (vs ~220 ps for add/sub — hot path)
Gaussian::div 1.572 ns
Bite-sized, TDD-style task breakdown for the first tier of the engine
redesign: Gaussian to natural-parameter storage, dense Vec storage
replacing HashMap, ScratchArena to eliminate per-event allocs,
Result-ifying the lone panic. No top-level public API change.
Acceptance gate: ≥3x speedup on Batch::iteration vs. baseline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comprehensive design for a multi-tier rewrite covering performance,
factor-graph extensibility, convergence scheduling, and API surface.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>