45 Commits

Author SHA1 Message Date
logaritmisk f18013d036 bench,docs: capture T2 final numbers and update CHANGELOG
Batch::iteration: 21.36 µs (T1 was 22.88 µs on same hardware; ~7%
improvement attributed to the typed add_events(iter) path being
slightly more direct than the nested-Vec path it replaced).

Gaussian operations unchanged vs T1.

Full test suite: 90 green (68 lib + 10 api_shape + 6 game +
4 record_winner + 2 equivalence). No golden value changed across
the entire T2 tier.

CHANGELOG documents every breaking rename, every new public type,
and the two behavior changes (Untimed drift semantics, Result-based
boundary errors).

Closes T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:13:38 +02:00
logaritmisk a6aaa93fd0 test: translate in-crate tests to new T2 API; delete legacy methods
Every #[cfg(test)] mod tests in src/history.rs now uses the new public
API: add_events(iter) / converge() / learning_curve() / current_skill()
/ log_evidence(). No golden value changed.

Legacy methods removed:
- History::convergence(iters, eps, verbose) → use converge()
- History::learning_curves_by_index() → use learning_curve() / learning_curves()
- HistoryBuilder::gamma(f64) → use .drift(ConstantDrift(g))
- add_events_with_prior downgraded from pub to pub(crate)

Added:
- History::builder_with_key() for custom key types (used by atp example)
- tests/equivalence.rs: Game-level golden integration tests

examples/atp.rs rewritten in new API (Event<i64, String>, converge(),
learning_curve(), drift(ConstantDrift(...))).

Bench Batch::iteration: 21.4 µs (T1 reference: 22.88 µs).

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 13:10:10 +02:00
logaritmisk e8c9d4ed29 feat(api): add Game::ranked, one_v_one, free_for_all, custom constructors
Public Game API now returns Result<_, InferenceError> on invalid input
(p_draw out of range, outcome rank count mismatches team count).

New types:
- GameOptions { p_draw, convergence } — bundled config
- OwnedGame<T, D> — owned variant of Game that carries its result
  and weights internally (no borrow of History's slices). Returned
  by public constructors to avoid leaking internal borrow lifetimes.

The internal Game::new is renamed Game::ranked_with_arena (pub(crate))
and keeps the borrowing-arena signature for History's hot path. All
in-crate callers updated (21 call sites: 18 in game.rs tests, 2 in
time_slice.rs, 1 in history.rs).

Game::custom is a T2-minimal power-user escape hatch exposing raw
factor + schedule plumbing. Full ergonomics in T4 (#[doc(hidden)]
for now).

Game::log_evidence() accessor added on both Game and OwnedGame (was
previously accessible only through the pub(crate) evidence field).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:55:26 +02:00
logaritmisk fe6f028127 feat(api): promote Factor/Schedule/VarStore to pub in factors module
Exposes the factor-graph machinery so power users can define custom
factors and schedules (see Game::custom in the next task). The
internal factor/ and schedule/ modules remain unchanged (still
referenced by Game's internals via crate::factor); the user-facing
public API goes through the new factors module re-exports:

  pub use crate::factor::{BuiltinFactor, Factor, VarId, VarStore};
  pub use crate::factor::rank_diff::RankDiffFactor;
  pub use crate::factor::team_sum::TeamSumFactor;
  pub use crate::factor::trunc::TruncFactor;
  pub use crate::schedule::{EpsilonOrMax, Schedule, ScheduleReport};

#[allow(dead_code)] guards on the previously-pub(crate) items are
removed because the types are now referenced via the re-exports.

Promotes public methods on VarStore (len, alloc, get, set, clear, new)
and adds is_empty per clippy lint. Keeps marginals field private as an
implementation detail — users access via the public methods.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:50:37 +02:00
logaritmisk e62568bf3e feat(api): add current_skill / learning_curve / log_evidence / predict_*
New public query methods on History:

- current_skill(&K) -> Option<Gaussian>: latest posterior for a key
- learning_curve(&K) -> Vec<(T, Gaussian)>: single-key history
- learning_curves() -> HashMap<K, Vec<(T, Gaussian)>>: all-keys history
- log_evidence() -> f64: total log-evidence (was log_evidence(false,&[]))
- log_evidence_for(&[&K]) -> f64: subset log-evidence
- predict_quality(&[&[&K]]) -> f64: draw-probability match quality
- predict_outcome(&[&[&K]]) -> Vec<f64>: 2-team win probabilities

learning_curves() changed from returning HashMap<Index, Vec<(i64, Gaussian)>>
to HashMap<K, Vec<(T, Gaussian)>>. A new learning_curves_by_index()
helper preserves the old Index-keyed shape for callers that ingest via
the pub(crate) Index path.

log_evidence(false, &[]) was renamed to log_evidence_internal and made
pub(crate); the new zero-arg log_evidence() wraps it.

predict_outcome is T2 2-team-only; N-team deferred to T4.

KeyTable::get no longer requires ToOwned<Owned = K> (only needed for
get_or_create), allowing query methods to use simpler bounds.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:47:41 +02:00
logaritmisk ec8b7e538c feat(api): add fluent history.event(t).team(...).commit() builder
Third tier of the ingestion API (spec Section 4). Powers one-off
events with irregular shapes where neither record_winner (too
simple) nor typed add_events (too verbose) fits cleanly.

EventBuilder accumulates teams, weights, and outcome. Supports:
- .team([keys]) — add a team
- .weights([w..]) — per-member weights on the most-recently-added team
- .ranking([ranks]) — explicit per-team ranks
- .winner(i) — convenience: team i wins, others tied
- .draw() — all teams tied
- .commit() — finalize into an Event<T, K> and delegate to add_events
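
A minimal, std-only sketch of how such a fluent builder could be shaped. The method names come from the list above; the `Event` layout, the default weight of 1.0, the "lower rank is better" convention, and the `String` error type are assumptions for illustration. In this sketch `commit()` returns the event instead of delegating to `add_events`.

```rust
/// Assumed event shape: (key, weight) per member, one rank per team.
#[derive(Debug, Clone, PartialEq)]
pub struct Event<T, K> {
    pub time: T,
    pub teams: Vec<Vec<(K, f64)>>,
    pub ranks: Vec<u32>, // assumption: lower rank is better
}

#[derive(Debug)]
pub struct EventBuilder<T, K> {
    time: T,
    teams: Vec<Vec<(K, f64)>>,
    ranks: Option<Vec<u32>>,
}

impl<T, K> EventBuilder<T, K> {
    pub fn new(time: T) -> Self {
        EventBuilder { time, teams: Vec::new(), ranks: None }
    }

    /// Add a team; every member starts with weight 1.0 (assumed default).
    pub fn team<I: IntoIterator<Item = K>>(mut self, keys: I) -> Self {
        self.teams.push(keys.into_iter().map(|k| (k, 1.0)).collect());
        self
    }

    /// Override weights on the most recently added team.
    pub fn weights<I: IntoIterator<Item = f64>>(mut self, ws: I) -> Self {
        if let Some(team) = self.teams.last_mut() {
            for (member, w) in team.iter_mut().zip(ws) {
                member.1 = w;
            }
        }
        self
    }

    /// Explicit per-team ranks.
    pub fn ranking<I: IntoIterator<Item = u32>>(mut self, ranks: I) -> Self {
        self.ranks = Some(ranks.into_iter().collect());
        self
    }

    /// Team `i` wins, all others tie. Call after every team has been added.
    pub fn winner(mut self, i: usize) -> Self {
        self.ranks = Some((0..self.teams.len()).map(|t| if t == i { 0 } else { 1 }).collect());
        self
    }

    /// All teams tied. Call after every team has been added.
    pub fn draw(mut self) -> Self {
        self.ranks = Some(vec![0; self.teams.len()]);
        self
    }

    /// Validate and finalize into an `Event`.
    pub fn commit(self) -> Result<Event<T, K>, String> {
        let ranks = self.ranks.ok_or_else(|| "no outcome set".to_string())?;
        if ranks.len() != self.teams.len() {
            return Err(format!("expected {} ranks, got {}", self.teams.len(), ranks.len()));
        }
        Ok(Event { time: self.time, teams: self.teams, ranks })
    }
}
```

Consuming `self` in every method keeps the chain allocation-free between calls and makes a half-built event impossible to reuse after `commit()`.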

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:42:26 +02:00
logaritmisk 244b94a3e5 feat(api): typed add_events(iter); generify internal path over T
Public API gains:

  History::add_events<I: IntoIterator<Item = Event<T, K>>>(events)
      -> Result<(), InferenceError>

which accepts the typed Event<T, K> shape added in Task 10. Ranks
from Outcome::Ranked are mapped to the legacy "higher f64 = better"
results internally.

add_events_with_prior now takes Vec<T> for times (was Vec<i64>),
generifying the whole internal path over T in a single fully-generic
impl<T: Time, D: Drift<T>, O: Observer<T>, K> block. The i64-specific
block is gone; record_winner/record_draw are now generic over T.

add_events_with_prior stays pub (not pub(crate)) because the ATP
example calls it directly with pre-built Index-based composition;
the new typed add_events is the primary public API going forward.

In-crate tests updated to call add_events_with_prior with an empty
HashMap. tests/api_shape.rs added with 3 integration tests covering
bulk ingest, draw, and mismatched-outcome error.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:39:46 +02:00
logaritmisk 044fb83a38 feat(api): add record_winner, record_draw, intern, lookup on History
Spec Section 4 "three-tier event ingestion" tier 2: one-off match
convenience. Spec open question 3: expose Index + intern/lookup for
power users.

History and HistoryBuilder gain a 4th generic parameter
K: Eq + Hash + Clone = &'static str. The default ensures existing
tests using Index-based add_events compile unchanged.

History internally owns a KeyTable<K>. intern(&Q) creates or returns
an Index for the given key; lookup(&Q) returns Option<Index> without
creating. record_winner and record_draw are thin 1v1 wrappers around
the internal add_events_with_prior.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:30:04 +02:00
logaritmisk a83c9acacb feat(error): expand InferenceError; convert boundary asserts to Result
InferenceError gains MismatchedShape (user-input length mismatches),
InvalidProbability (p_draw out of [0, 1]), and ConvergenceFailed
(exceeded max_iter without hitting epsilon). NegativePrecision stays.

History::add_events_with_prior and History::add_events now return
Result<(), InferenceError>. The previous assert! macros checking
composition/results/times/weights shape are replaced by matched
error returns.

Internal debug_assert! macros for arithmetic invariants stay; this
change only affects boundary validation of user input.
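
A hedged sketch of the assert-to-Result conversion described above. The variant names and their fields follow the CHANGELOG entry in this PR; the `validate_event` helper, its parameters, and the `"results"` kind string are hypothetical and only illustrate the pattern.

```rust
use std::fmt;

/// Subset of the error enum, limited to the two boundary checks shown here.
#[derive(Debug, Clone, PartialEq)]
pub enum InferenceError {
    MismatchedShape { kind: &'static str, expected: usize, got: usize },
    InvalidProbability { value: f64 },
}

impl fmt::Display for InferenceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InferenceError::MismatchedShape { kind, expected, got } => {
                write!(f, "mismatched {kind}: expected {expected}, got {got}")
            }
            InferenceError::InvalidProbability { value } => {
                write!(f, "probability {value} is outside [0, 1]")
            }
        }
    }
}

impl std::error::Error for InferenceError {}

/// Hypothetical boundary check: previously an `assert!`, now an `Err`.
pub fn validate_event(
    team_count: usize,
    result_count: usize,
    p_draw: f64,
) -> Result<(), InferenceError> {
    if result_count != team_count {
        return Err(InferenceError::MismatchedShape {
            kind: "results",
            expected: team_count,
            got: result_count,
        });
    }
    // `contains` is false for NaN, so NaN is rejected as well.
    if !(0.0..=1.0).contains(&p_draw) {
        return Err(InferenceError::InvalidProbability { value: p_draw });
    }
    Ok(())
}
```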

Tests updated to call .unwrap() on the Result. The old signatures
will be fully replaced in Task 15 (typed add_events(iter)) and the
nested-Vec wrapper removed in Task 20.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:26:13 +02:00
logaritmisk a6e008f8ff feat(api): add ConvergenceOptions, ConvergenceReport, History::converge
New public types:
- ConvergenceOptions { max_iter, epsilon } — config for the loop
- ConvergenceReport { iterations, final_step, log_evidence, converged,
  per_iteration_time, slices_skipped } — post-hoc summary

History and HistoryBuilder gain a third generic parameter
O: Observer<T> = NullObserver. Builder methods:
- .convergence(opts) sets the ConvergenceOptions
- .observer(o) plugs in an Observer (reshapes the builder's O param)

History::converge() runs the existing iteration loop driven by the
stored opts, emits observer callbacks on each iteration end and on
completion, and returns Result<ConvergenceReport, InferenceError>.

The old convergence(iters, eps, verbose) stays — gets removed in
Task 20 after tests are translated.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:20:24 +02:00
logaritmisk 726896a2ba feat(api): add Observer trait and NullObserver default
Observer replaces verbose: bool with structured progress callbacks:
on_iteration_end, on_batch_processed, on_converged — all no-op
default impls so users override only what they need. NullObserver
is a ZST default.
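
As a rough illustration of the default-no-op pattern: the three callback names are from this commit, but their parameter lists, the `StepLog` observer, and the toy `run` driver are assumptions made for this sketch, not the crate's signatures.

```rust
/// Progress callbacks; every method defaults to a no-op.
pub trait Observer<T> {
    fn on_iteration_end(&mut self, _iteration: usize, _step: f64) {}
    fn on_batch_processed(&mut self, _time: &T) {}
    fn on_converged(&mut self, _iterations: usize) {}
}

/// Zero-sized default observer that ignores every callback.
#[derive(Clone, Copy, Debug, Default)]
pub struct NullObserver;

impl<T> Observer<T> for NullObserver {}

/// Hypothetical observer that overrides only two of the three callbacks.
#[derive(Debug, Default)]
pub struct StepLog {
    pub steps: Vec<f64>,
    pub converged_after: Option<usize>,
}

impl<T> Observer<T> for StepLog {
    fn on_iteration_end(&mut self, _iteration: usize, step: f64) {
        self.steps.push(step);
    }
    fn on_converged(&mut self, iterations: usize) {
        self.converged_after = Some(iterations);
    }
}

/// Toy stand-in for a convergence loop: halves the step each iteration
/// until it drops to `epsilon` or `max_iter` is reached.
pub fn run<T, O: Observer<T>>(observer: &mut O, max_iter: usize, epsilon: f64) -> usize {
    let mut step = 1.0;
    let mut iterations = 0;
    while iterations < max_iter && step > epsilon {
        step /= 2.0;
        iterations += 1;
        observer.on_iteration_end(iterations, step);
    }
    observer.on_converged(iterations);
    iterations
}
```

Because `NullObserver` has no fields and all its methods are empty, a monomorphized loop using it should compile down to the same code as a loop with no callbacks at all.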

Send + Sync bounds deferred to T3 (Rayon support).

Fully additive — wired into History::converge in Task 12.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:16:25 +02:00
logaritmisk f5a486329e feat(api): add Event<T, K>, Team<K>, Member<K> typed event description
Replaces the old nested Vec<Vec<Vec<_>>> event description on the
public API boundary. Member<K>::from(K) enables ergonomic literal
lists. Member::with_weight / with_prior are builder methods for the
optional per-event overrides.

Fully additive — no existing call sites updated. Consumed by
History::add_events(iter) in Task 15.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:14:58 +02:00
logaritmisk 3df422db78 feat(api): add Outcome enum with Ranked variant
Outcome::winner(i, n), Outcome::draw(n), Outcome::ranking(iter) are
the convenience constructors. Marked #[non_exhaustive] so Scored can
be added in T4 without breaking match exhaustiveness.
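
A small sketch of what the constructors might produce. It uses `Vec<u32>` in place of the `SmallVec` the crate depends on so that it stays std-only, and it assumes that a lower rank is better and that `winner` ties all non-winning teams; `team_count` is a hypothetical helper.

```rust
/// Outcome of one event. `#[non_exhaustive]` lets new variants be added
/// later without breaking downstream `match` expressions.
#[non_exhaustive]
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Outcome {
    /// One rank per team (assumption: lower is better, equal means tied).
    Ranked(Vec<u32>),
}

impl Outcome {
    /// Team `i` of `n` wins; all other teams tie behind it.
    pub fn winner(i: usize, n: usize) -> Self {
        Outcome::Ranked((0..n).map(|t| if t == i { 0 } else { 1 }).collect())
    }

    /// All `n` teams tie.
    pub fn draw(n: usize) -> Self {
        Outcome::Ranked(vec![0; n])
    }

    /// Explicit per-team ranks.
    pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
        Outcome::Ranked(ranks.into_iter().collect())
    }

    /// Number of teams this outcome describes (hypothetical helper).
    pub fn team_count(&self) -> usize {
        match self {
            Outcome::Ranked(ranks) => ranks.len(),
        }
    }
}
```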

Adds smallvec = "1" as a direct dependency.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 12:12:53 +02:00
logaritmisk 33a7d90b89 refactor(history): remove time: bool; translate tests to explicit timestamps
The bool encoded 'no time axis' which is now expressed at the type
level (T = Untimed). The old !self.time branch generated sequential
i64 timestamps internally (1..=n) and bumped all agents' last_time at
every tick; tests that relied on this now pass those timestamps
explicitly and reflect the correct time=true elapsed semantics.

Collapsed `if self.time { A } else { B }` into the A branch everywhere
in add_events_with_prior. Removed the two !self.time blocks that
updated all agents' last_time at every slice regardless of participation.

sort_time is now generic over `T: Copy + Ord`.

HistoryBuilder::time(bool) removed. History<i64, ConstantDrift>
default remains, producing the same behavior as old .time(true).

The test_env_ttt Gaussian goldens are updated to reflect the correct
time=true semantics (b.elapsed=2 instead of 1 due to b skipping t=2);
this is a correction: the old !self.time last_time bump was an
implementation quirk that diverged from the Python reference.

55 tests pass. clippy clean. fmt clean.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:09:23 +02:00
logaritmisk 59e4cb35cc refactor(api): generify Drift, Rating, Competitor, TimeSlice, CompetitorStore, History over T: Time
Drift now takes &T -> &T and is generic over the time axis. Untimed
impls return elapsed=0. ConstantDrift impl covers all T via the Time
trait. An additional variance_for_elapsed(i64) method on the trait
serves callers that work with the pre-cached i64 elapsed count.

Competitor.last_time moves from i64 with MIN sentinel to Option<T>
with None sentinel. receive(&T) computes variance from last_time
dynamically; receive_for_elapsed(i64) uses a pre-cached elapsed count
(needed in convergence sweeps where last_time has already advanced).

TimeSlice.time changes from i64 to T. compute_elapsed is now generic
over T and takes Option<&T> for the last-seen time. new_forward_info
uses receive_for_elapsed to preserve the cached elapsed during sweeps.

History<D> becomes History<T, D>; HistoryBuilder<D> becomes
HistoryBuilder<T, D>; Game<D> becomes Game<T, D>. Defaults keep
existing call sites compiling with zero changes: T = i64,
D = ConstantDrift.

add_events / add_events_with_prior stay on impl History<i64, D> since
times: Vec<i64> is i64-specific (Task 8 will generalise this).

In !self.time mode the old i64::MAX sentinel guaranteed elapsed=1 for
every slice transition regardless of time gaps. Replaced by advancing
all previously-seen agents' last_time to Some(current_slice_time) at
the end of each slice; this preserves elapsed=1 between adjacent
slices in sequential-integer untimed mode.

The time: bool field on History and .time(bool) on HistoryBuilder are
NOT removed by this task — deferred to Task 8 so this commit is
purely a type-level generification.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 11:50:35 +02:00
logaritmisk a285c1a0f2 feat(api): add Time trait with Untimed and i64 impls
Foundation for generic History time axis. Untimed is the ZST case
(no drift across slices); i64 is the standard timestamp case.
Additional impls (time::OffsetDateTime, chrono) can be added behind
feature flags in follow-up work.

The trait is not yet wired into History — that happens in Task 7
along with generifying Drift over T.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 11:32:38 +02:00
logaritmisk 5e752f9e98 refactor(api): rename Batch to TimeSlice
TimeSlice says what it is: every event sharing one timestamp. The
History field .batches is renamed to .time_slices. Local variables
named `batch` referring to TimeSlice instances are renamed to
`time_slice`.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 10:54:31 +02:00
logaritmisk decbd895a3 refactor(api): rename Agent to Competitor and .player field to .rating
Competitor holds dynamic per-history state (message, last_time) for
someone competing; its configuration lives in a Rating.

AgentStore renamed to CompetitorStore to match. The internal
`clean()` free function's parameter name changed from `agents` to
`competitors` for consistency.

Local variable names (agent_idx, this_agent) inside history.rs are
left unchanged — they represent abstract identifiers, not Competitor
instances.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 10:48:50 +02:00
logaritmisk 88d54cb9f4 docs(factor): update stale Player reference to Rating
Follow-up to the Player→Rating rename (2f5aa98); a doc comment in
team_sum.rs still referenced Player::performance().
2026-04-24 10:44:26 +02:00
logaritmisk 2f5aa98eac refactor(api): rename Player to Rating
The struct holds prior/beta/drift — a rating configuration, not a
person. The person-with-temporal-state is the Competitor (renamed in
the next task). Resolves Player/Agent ambiguity.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:43:19 +02:00
logaritmisk 52f5f76a34 refactor(lib): make key_table module private; revert bench var rename
Address code review feedback from Task 2:
- key_table module doesn't need pub visibility; the KeyTable re-export
  at lib.rs root already exposes the only public type. Matches the
  error/history private-module pattern.
- Revert an incidental bench variable rename (index_map → index) that
  wasn't part of the task scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:38:22 +02:00
logaritmisk c69fe4e67c refactor(api): rename IndexMap to KeyTable
The former name collided with the popular indexmap crate. KeyTable
lives in its own module. Public API unchanged beyond the rename.

Part of T2 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 10:34:14 +02:00
logaritmisk 948a7a684b docs: add T2 new-API-surface implementation plan
21-task plan covering all renames and new public API landing per
Section 7 "T2" of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:31:33 +02:00
logaritmisk 6437649436 perf(arena): pool team_prior/lhood/inv buffers to eliminate per-game allocs
Move team_prior, lhood_lose, lhood_win, inv_buf into ScratchArena so
their Vec capacity is reused across games in a Batch. Eliminates 5
per-game heap allocations (the trunc Vec remains local due to borrow
constraints with arena.vars).

Batch::iteration: 23.0 µs (down from 27.0 µs with naive local Vecs;
8% above T0 21.253 µs baseline due to TruncFactor propagate overhead).
2026-04-24 09:10:48 +02:00
logaritmisk cdfd75f846 bench: capture T1 final numbers and fix clippy warnings
Fixed:
- Removed unused .enumerate() in batch.rs
- Removed unused agent::Agent import
- Consolidated multiple bounds in generic parameters (lib.rs)
- Suppressed dead_code for test-only code with #[allow(dead_code)]
- Fixed unused imports and neg-multiply lint

Batch::iteration: 27.023 µs (T0 was 21.253 µs, expected minor regression from T1 infrastructure).
Gaussian::* unchanged (~236-280 ps).

Acceptance: T1 factor-graph refactor lands without clippy/fmt issues.
All 53 tests pass. Closes T1 tier.
2026-04-24 09:04:29 +02:00
logaritmisk c02d5ca0ab perf(game): replace order.clone()+position() with inverse permutation
2026-04-24 08:58:09 +02:00
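
The idea behind the inverse-permutation commit can be sketched generically. This is an illustration of the technique, not the code in game.rs, and the function name is hypothetical: given a sort order, calling `position()` for each element is O(n²) overall, while one pass that writes each position into an inverse table is O(n).

```rust
/// Given `order`, where `order[pos]` is the original index of the element
/// placed at `pos`, return `inv` such that `inv[idx]` is the position of
/// original element `idx`. Assumes `order` is a permutation of `0..n`.
pub fn inverse_permutation(order: &[usize]) -> Vec<usize> {
    let mut inv = vec![0; order.len()];
    for (pos, &idx) in order.iter().enumerate() {
        inv[idx] = pos;
    }
    inv
}
```

Each `inv[i]` equals what `order.iter().position(|&x| x == i)` would return, without the repeated linear scans or the clone of `order`.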
logaritmisk cdee7b2b99 fix(arena): remove unused Gaussian import in test module
2026-04-24 08:52:11 +02:00
logaritmisk cb07a874e8 refactor(game): rebuild Game::likelihoods on factor-graph machinery
Game::likelihoods now uses VarStore (for diff vars) and TruncFactor
(for EP truncation + evidence caching) instead of TeamMessage and
DiffMessage. The EP loop structure is preserved exactly; VarId-keyed
diff vars live in the arena's VarStore (capacity reused per batch).

ScratchArena loses teams/diffs/ties/margins; gains VarStore and
sort_buf (sort_perm allocation eliminated). message.rs deleted.

Public API of Game (new, posteriors, likelihoods, evidence) unchanged.
2026-04-24 08:51:18 +02:00
logaritmisk da69f02ff7 feat(schedule): add Schedule trait and EpsilonOrMax impl
EpsilonOrMax mirrors today's Game::likelihoods loop: sweep forward
then backward over iterating factors, capped at 10 iterations or
step <= 1e-6. Setup factors (TeamSum) run exactly once before the
loop begins.
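
The stopping rule can be sketched as below. The struct fields, the `run` signature taking a closure, and the `ScheduleReport` fields are assumptions for illustration; only the two type names and the 10-iteration / 1e-6 limits come from the commit message.

```rust
/// Stop when the largest message change is at most `epsilon`,
/// or after `max_iter` sweeps, whichever comes first.
#[derive(Clone, Copy, Debug)]
pub struct EpsilonOrMax {
    pub epsilon: f64,
    pub max_iter: usize,
}

impl Default for EpsilonOrMax {
    fn default() -> Self {
        EpsilonOrMax { epsilon: 1e-6, max_iter: 10 }
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ScheduleReport {
    pub iterations: usize,
    pub final_step: f64,
    pub converged: bool,
}

impl EpsilonOrMax {
    /// `sweep` performs one forward and backward pass over the iterating
    /// factors and returns the largest change it observed.
    pub fn run<F: FnMut() -> f64>(&self, mut sweep: F) -> ScheduleReport {
        let mut step = f64::INFINITY;
        let mut iterations = 0;
        while iterations < self.max_iter && step > self.epsilon {
            step = sweep();
            iterations += 1;
        }
        ScheduleReport { iterations, final_step: step, converged: step <= self.epsilon }
    }
}
```

Setup work that must run exactly once (such as the team sums) would happen before calling `run`, so the closure only contains the factors that actually iterate.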

ScheduleReport is the only public surface from this module.
2026-04-24 08:25:13 +02:00
logaritmisk 54e46bef59 feat(factor): implement TruncFactor with cached evidence
EP truncation factor that operates on a diff variable. Stores its
outgoing message so the cavity computation produces the correct EP
message on each propagation. The first propagation caches the
evidence contribution (cdf-bounded probability) for log_evidence().

Promotes lib::cdf to pub(crate) so the factor can use it.
2026-04-24 08:22:06 +02:00
logaritmisk ae141752b7 feat(factor): implement RankDiffFactor
Maintains diff = team_a - team_b across three variables. On each
propagation, reads the team-perf marginals (which may have been
updated by neighboring factors) and computes the new diff via
Gaussian Sub (variance addition).
2026-04-24 08:19:18 +02:00
logaritmisk 1210a34a64 fix(factor): move N_INF import to test module in team_sum
2026-04-24 08:17:54 +02:00
logaritmisk cee70c6272 feat(factor): implement TeamSumFactor
Computes the weighted sum of player performance Gaussians into a
team-performance variable. Runs once per game (no iteration needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 08:17:14 +02:00
logaritmisk ebccc7b454 feat(factor): introduce Factor trait and BuiltinFactor enum
Adds the trait that all factors implement and the enum dispatcher
used by the schedule to drive heterogeneous factors without dynamic
dispatch in the hot loop.

The three built-in factors (TeamSum, RankDiff, Trunc) are stubbed
out; concrete implementations follow in tasks 4-6.
2026-04-24 08:14:00 +02:00
logaritmisk dac4427b65 feat(factor): introduce VarId and VarStore
Foundation types for the T1 factor graph machinery. VarStore is a
flat Vec<Gaussian> indexed by VarId; variables are allocated by
alloc() and the store can be cleared between games to reuse capacity.
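
A minimal sketch of such a store, assuming a `u32` payload for `VarId` and a placeholder `Gaussian` struct; the method set mirrors the one later made public in fe6f028127 (`new`, `alloc`, `get`, `set`, `len`, `is_empty`, `clear`), but the bodies are illustrative.

```rust
/// Placeholder message type for this sketch; the crate has its own Gaussian.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Gaussian {
    pub pi: f64,
    pub tau: f64,
}

/// Handle into a `VarStore` (assumed to wrap a dense integer index).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VarId(pub u32);

/// Flat storage of variable marginals, indexed by `VarId`.
#[derive(Debug, Default)]
pub struct VarStore {
    marginals: Vec<Gaussian>,
}

impl VarStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append a variable and return its handle.
    pub fn alloc(&mut self, init: Gaussian) -> VarId {
        let id = VarId(self.marginals.len() as u32);
        self.marginals.push(init);
        id
    }

    pub fn get(&self, id: VarId) -> Gaussian {
        self.marginals[id.0 as usize]
    }

    pub fn set(&mut self, id: VarId, value: Gaussian) {
        self.marginals[id.0 as usize] = value;
    }

    pub fn len(&self) -> usize {
        self.marginals.len()
    }

    pub fn is_empty(&self) -> bool {
        self.marginals.is_empty()
    }

    /// Drop all variables but keep the allocation, so the next game
    /// reuses the same buffer (`Vec::clear` retains capacity).
    pub fn clear(&mut self) {
        self.marginals.clear();
    }
}
```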

Part of T1 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.
2026-04-24 08:09:25 +02:00
logaritmisk fa85bcee51 docs: add T1 factor-graph implementation plan
Bite-sized, TDD-style task breakdown for the second tier of the engine
redesign: introduce VarStore, Factor trait, BuiltinFactor enum, and
EpsilonOrMax schedule, then re-implement Game::likelihoods on top of
the new machinery. Internal-only refactor; public Game/History API
unchanged.

Acceptance: existing tests pass within ULP, iteration counts match T0,
no Batch::iteration regression vs T0 (~21.5 µs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:42:33 +02:00
logaritmisk d3cfee53a1 bench: capture T0 final numbers and post-mortem
Batch::iteration: 29.840 µs → 21.253 µs (1.40×)
Gaussian::mul:     1.568 ns →  218.69 ps (7.17×)
Gaussian::div:     1.572 ns →  218.64 ps (7.19×)

Gaussian arithmetic hit target (7×+ vs 1.5–2× expected). Batch::iteration
reached 1.40× vs the 3× target. Post-mortem: the bench exercises 100 tiny
2-team events and the dominant cost is still Vec allocation in within_priors,
sort_perm, and Game::likelihoods. The HashMap→Vec win shows at the History
level (forward/backward sweep) which this bench doesn't exercise.

Remediation plan documented in benches/baseline.txt: arena-ify sort_perm,
within_priors, and Game::likelihoods in T1 when Game's internals are
redesigned around the new factor graph.

38/38 tests passing. Closes T0 tier.
2026-04-24 07:28:28 +02:00
logaritmisk b1e0fcb817 perf(game): eliminate per-event allocations via ScratchArena
Game::likelihoods previously allocated four Vecs (teams, diffs, ties,
margins) on every call. Batch now owns one ScratchArena reused across
all Game::new calls in the iteration loop; likelihoods() clears and
extends the arena buffers instead of allocating fresh.

For log_evidence (called infrequently), a local ScratchArena is created
per invocation so the method signature stays &self.

Also: add #[derive(Debug)] to TeamMessage and DiffMessage (required by
ScratchArena's own Debug derive).

Part of T0 engine redesign.
2026-04-24 07:24:29 +02:00
logaritmisk 49d2b317da refactor(history): replace HashMap<Index, Agent<D>> with dense AgentStore<D>
AgentStore<D> is a Vec<Option<Agent<D>>>-backed store indexed directly
by Index.0, eliminating per-iteration hashing in the cross-history
forward/backward sweep. Implements Index<Index>/IndexMut<Index> for
ergonomic agent access.
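
A generic sketch of the dense-store pattern. `DenseStore`, its `insert`/`get`/`iter` methods, and the `usize` payload of `Index` are hypothetical; the commit only states that the store is a `Vec<Option<_>>` indexed by `Index.0` with `Index`/`IndexMut` implementations.

```rust
use std::ops::{Index as IndexOp, IndexMut};

/// Dense integer handle (assumed to wrap a usize).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Index(pub usize);

/// Vec-backed map from `Index` to `V`; lookups are plain array indexing.
#[derive(Debug)]
pub struct DenseStore<V> {
    slots: Vec<Option<V>>,
}

impl<V> Default for DenseStore<V> {
    fn default() -> Self {
        DenseStore { slots: Vec::new() }
    }
}

impl<V> DenseStore<V> {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert or overwrite, growing the backing Vec as needed.
    pub fn insert(&mut self, idx: Index, value: V) {
        if idx.0 >= self.slots.len() {
            self.slots.resize_with(idx.0 + 1, || None);
        }
        self.slots[idx.0] = Some(value);
    }

    pub fn get(&self, idx: Index) -> Option<&V> {
        self.slots.get(idx.0).and_then(|slot| slot.as_ref())
    }

    /// Iterate present entries in ascending index order.
    pub fn iter(&self) -> impl Iterator<Item = (Index, &V)> + '_ {
        self.slots
            .iter()
            .enumerate()
            .filter_map(|(i, slot)| slot.as_ref().map(|v| (Index(i), v)))
    }
}

impl<V> IndexOp<Index> for DenseStore<V> {
    type Output = V;
    fn index(&self, idx: Index) -> &V {
        self.slots[idx.0].as_ref().expect("no entry at index")
    }
}

impl<V> IndexMut<Index> for DenseStore<V> {
    fn index_mut(&mut self, idx: Index) -> &mut V {
        self.slots[idx.0].as_mut().expect("no entry at index")
    }
}
```

The trade-off is memory proportional to the largest index rather than the number of entries, which is acceptable when indices are handed out densely by an interning table.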

AgentStore is public (so benches/batch.rs can use it). SkillStore
remains pub(crate) since Skill is pub(crate) in batch.rs.

HashMap<Index, _> is now only used for the posteriors() return value
(temporary; will be replaced in T2 with a proper typed return) and
for the add_events_with_prior(priors: HashMap<Index, Player<D>>) API
(also T2 target).

Part of T0 engine redesign.
2026-04-24 07:15:21 +02:00
logaritmisk 8f60258dba refactor(batch): replace HashMap<Index, Skill> with dense SkillStore
SkillStore is a Vec<Skill>-backed dense store with a parallel present
mask, indexed directly by Index.0. Eliminates per-iteration hashing
in the within-slice convergence loop; O(1) array lookup replaces O(1)
amortised hash lookup with better cache behaviour.

Iteration order is now ascending-by-Index (was arbitrary for HashMap);
EP fixed point is order-independent so posteriors are unchanged.

Part of T0 engine redesign.
2026-04-24 07:08:20 +02:00
logaritmisk 709ece335f feat: introduce InferenceError; mu_sigma panic already eliminated
mu_sigma was deleted as part of the Gaussian nat-param rewrite (its
only callers were the old Mul/Div impls). This commit adds the
InferenceError enum as a seed for the T2 API surface, with the
NegativePrecision variant that mu_sigma would have returned.

Part of T0 engine redesign.
2026-04-24 07:00:26 +02:00
logaritmisk a667deb7e1 refactor(gaussian): switch to natural-parameter storage (pi, tau)
Mul and Div become two f64 adds/subs with no sqrt in the hot path.
mu() and sigma() are computed on demand from stored pi/tau.

Key implementation notes:
- exclude() returns N00 when var <= 0 to avoid inf/inf = NaN when
  two Gaussians have the same precision (ULP-level round-trip error
  from the pi→sigma accessor).
- Mul<f64> by 0.0 returns N00 (point mass at 0), matching old behavior.
- from_ms(0, 0) == N00 {pi:inf, tau:0}; from_ms(0, inf) == N_INF {pi:0, tau:0}.
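
The storage change can be illustrated with a self-contained sketch. It assumes the conventional definitions pi = 1/sigma² and tau = mu/sigma²; the names `from_ms`, `mu`, `sigma`, `pi`, and `tau` appear in this log, but the bodies below are illustrative assumptions and omit `exclude()` and scalar multiplication.

```rust
use std::ops::{Div, Mul};

/// Gaussian in natural parameters:
/// pi = 1 / sigma^2 (precision), tau = mu / sigma^2.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Gaussian {
    pi: f64,
    tau: f64,
}

impl Gaussian {
    pub fn from_ms(mu: f64, sigma: f64) -> Self {
        if sigma.is_infinite() {
            // Uninformative: zero precision.
            return Gaussian { pi: 0.0, tau: 0.0 };
        }
        if sigma == 0.0 {
            // Point mass. Only mu == 0 is representable in this sketch.
            return Gaussian { pi: f64::INFINITY, tau: 0.0 };
        }
        let pi = 1.0 / (sigma * sigma);
        Gaussian { pi, tau: mu * pi }
    }

    /// Mean, computed on demand; 0.0 for improper or point-mass cases.
    pub fn mu(&self) -> f64 {
        if self.pi > 0.0 && self.pi.is_finite() {
            self.tau / self.pi
        } else {
            0.0
        }
    }

    /// Standard deviation, computed on demand; infinite if precision <= 0.
    pub fn sigma(&self) -> f64 {
        if self.pi > 0.0 {
            1.0 / self.pi.sqrt()
        } else {
            f64::INFINITY
        }
    }
}

/// Product of densities: natural parameters add (no sqrt involved).
impl Mul for Gaussian {
    type Output = Gaussian;
    fn mul(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

/// Quotient of densities (cavity): natural parameters subtract.
impl Div for Gaussian {
    type Output = Gaussian;
    fn div(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}
```

With (mu, sigma) storage, each product needs conversions to and from precision form; storing (pi, tau) moves that cost to the comparatively rare `mu()`/`sigma()` reads.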

Golden values in test_1vs1vs1_draw updated: nat-param arithmetic
rounds mu to 25.0 (was 24.999999) and shifts sigma by ~3e-7.
Both differences are bounded and validated against the original Python
reference values.

Part of T0 engine redesign.
2026-04-24 06:59:43 +02:00
logaritmisk 06d3c886fe bench: capture T0 baseline; expose pi/tau accessors; fix div panic
- Promotes Gaussian::pi and Gaussian::tau to public so benches/gaussian.rs
  compiles, then captures baseline numbers for the T0 acceptance gate.
- Fixes the divide bench: g1/g2 panicked (g1 has lower precision than g2;
  cavity requires pi_num >= pi_den). Swapped to g2/g1 (well-defined).

Baseline on Apple M5 Pro:
  Batch::iteration  29.840 µs
  Gaussian::mul      1.568 ns   (vs ~220 ps for add/sub — hot path)
  Gaussian::div      1.572 ns
2026-04-24 06:43:00 +02:00
logaritmisk d11d2e8c6b docs: add T0 numerical-parity implementation plan
Bite-sized, TDD-style task breakdown for the first tier of the engine
redesign: Gaussian to natural-parameter storage, dense Vec storage
replacing HashMap, ScratchArena to eliminate per-event allocs,
Result-ifying the lone panic. No top-level public API change.

Acceptance gate: ≥3x speedup on Batch::iteration vs. baseline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 22:43:27 +02:00
logaritmisk c5f081d21f docs: add TrueSkill-TT engine redesign spec
Comprehensive design for a multi-tier rewrite covering performance,
factor-graph extensibility, convergence scheduling, and API surface.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 22:33:48 +02:00
48 changed files with 235 additions and 9389 deletions
+68 -40
@@ -2,59 +2,89 @@
All notable changes to this project will be documented in this file.
## 0.1.2 - 2026-06-12
## Unreleased — T2 new API surface
### Bug Fixes
Breaking: every renamed type and the new public API land together per
`docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`
Section 7 "T2".
- fix: release generated CHANGELOG at the wrong location
- fix(gaussian): treat non-positive precision as improper in mu()/sigma()
### Breaking renames
### Documentation
- `Batch` → `TimeSlice`
- `Player` → `Rating` (and the `.player` field on `Competitor` is now `.rating`)
- `Agent` → `Competitor`
- `IndexMap` → `KeyTable`
- `History` field `.batches` → `.time_slices`
- docs: spec for post-T4-MarginFactor tech debt cleanup
- docs: implementation plan for post-T4-MarginFactor tech debt cleanup
- docs: fix stale numerics in t4-margin-factor plan
- docs: spec for game-local Damped EP
- docs: implementation plan for game-local Damped EP
- docs: spec for History → TimeSlice ConvergenceOptions plumbing
- docs: implementation plan for History → TimeSlice plumbing
- docs: spec for per-event score_sigma override
- docs: implementation plan for per-event score_sigma override
### New types
### Features
- `Time` trait with `Untimed` ZST and `i64` impls (generic time axis).
- `Drift<T: Time>` — generified from the old `Drift` trait.
- `Event<T, K>`, `Team<K>`, `Member<K>` — typed bulk-ingest event shape.
- `Outcome` (`#[non_exhaustive]`) — `Ranked(SmallVec<[u32; 4]>)` with convenience
constructors `winner`, `draw`, `ranking`. `Scored` lands in T4.
- `Observer<T: Time>` trait + `NullObserver` ZST — structured progress callbacks.
- `ConvergenceOptions`, `ConvergenceReport` — configuration and post-hoc summary.
- `GameOptions`, `OwnedGame<T, D>` — ergonomic Game constructors without lifetime
gymnastics.
- `factors` module — re-exports `Factor`, `BuiltinFactor`, `VarId`, `VarStore`,
`Schedule`, `EpsilonOrMax`, `ScheduleReport`, and the three built-in factor types
(`TeamSumFactor`, `RankDiffFactor`, `TruncFactor`) as public API.
- feat(gaussian): add damp_natural helper for EP damping
- feat(convergence): add ConvergenceOptions::alpha damping field
- feat(factor): add TruncFactor::propagate_with_alpha for EP damping
- feat(factor): add MarginFactor::propagate_with_alpha for EP damping
- feat(game): plumb ConvergenceOptions through to run_chain
- feat(time_slice): inference callsites read self.convergence
- feat(outcome): per-event score_sigma override on Outcome::Scored
- feat(event_builder): expose scores_with_sigma fluent method
### New `History` API
### Refactor
- Three-tier ingestion:
  - Tier 1 (bulk): `add_events<I: IntoIterator<Item = Event<T, K>>>(events) -> Result`
  - Tier 2 (one-off): `record_winner(&K, &K, T)`, `record_draw(&K, &K, T)`
  - Tier 3 (fluent): `event(T).team([...]).weights([...]).ranking([...]).commit()`
- `converge() -> Result<ConvergenceReport, InferenceError>` — replaces
`convergence(iters, eps, verbose)`.
- `current_skill(&K)`, `learning_curve(&K)`, `learning_curves()` (now keyed on `K`).
- `log_evidence()` zero-arg, `log_evidence_for(&[&K])`.
- `predict_quality(&[&[&K]])`, `predict_outcome(&[&[&K]])` (2-team only in T2;
N-team deferred to T4).
- `intern(&Q)` / `lookup(&Q)` expose the internal `KeyTable<K>` for power users.
- `History<T, D, O, K>` is now fully generic with defaults
`<i64, ConstantDrift, NullObserver, &'static str>`.
- refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain
- refactor: make BuiltinFactor::log_evidence match exhaustive
- refactor(time_slice): add convergence field, rename iterate_to_convergence
### New `Game` API
### Testing
- `Game::ranked(&[&[Rating]], Outcome, &GameOptions) -> Result<OwnedGame, _>`.
- `Game::one_v_one(&Rating, &Rating, Outcome) -> Result<(Gaussian, Gaussian), _>`.
- `Game::free_for_all(&[&Rating], Outcome, &GameOptions) -> Result<OwnedGame, _>`.
- `Game::custom(...)` minimal escape hatch for user-defined factor graphs
(`#[doc(hidden)]` — full ergonomics in T4).
- `Game::log_evidence()` and `OwnedGame::log_evidence()` accessors.
- test(game): integration tests for ConvergenceOptions behavior
- test(history): end-to-end ConvergenceOptions propagation tests
- test(history): end-to-end per-event score_sigma override tests
### Errors
## 0.1.1 - 2026-04-27
- `InferenceError` now carries `MismatchedShape { kind, expected, got }`,
`InvalidProbability { value }`, `ConvergenceFailed { last_step, iterations }`,
and `NegativePrecision { pi }`. Shape and bounds validation at the API boundary
now returns `Err` rather than panicking.
### Miscellaneous Tasks
### Removed (breaking)
- chore: Release trueskill-tt version 0.1.1
- `History::convergence(iters, eps, verbose)` — use `converge()`.
- `HistoryBuilder::gamma(f64)` — use `.drift(ConstantDrift(g))`.
- `HistoryBuilder::time(bool)` and `History.time: bool` — use the `Time` type parameter.
- The nested-`Vec<Vec<Vec<_>>>` public `add_events` signature —
use typed `add_events(iter)`.
- `learning_curves_by_index()` — use `learning_curves()`.
### Other (unconventional)
### Performance
- T0 + T1 + T2: engine redesign through new API surface (#1)
- T3: rayon-backed concurrency (opt-in) (#2)
- T4 (MarginFactor): scored outcomes via Gaussian-margin EP evidence
`Batch::iteration` bench: **21.36 µs** (T1 was 22.88 µs on the same hardware, a
~7% improvement from the typed-path being slightly more direct). Gaussian
operations unchanged.
### Notes
- `Time = Untimed` returns `elapsed_to → 0`, a **behavior change** from the old
`time=false` mode, which implicitly generated `elapsed=1` per event via an
`i64::MAX` sentinel in `Agent.last_time`. Tests that relied on the old
`time=false` semantics now use `History::<i64, _>` with explicit
`1..=n` timestamps.
## 0.1.0 - 2026-04-23
@@ -66,8 +96,6 @@ All notable changes to this project will be documented in this file.
- chore: added cliff.toml, release.toml and rustfmt.toml
- chore: clean up
- chore: make cargo release add CHANGELOG.md before commit
- chore: do not publish
### Other (unconventional)
-1
@@ -35,7 +35,6 @@ History → Batch[] → Game[] → teams/players
- **`Player`** (`player.rs`) — static configuration: prior `Gaussian`, `beta` (performance noise), `gamma` (skill drift per time unit).
- **`Gaussian`** (`gaussian.rs`) — core probability type. Stored as natural parameters (`pi = 1/sigma²`, `tau = mu/sigma²`). Arithmetic ops implement message multiplication/division in the factor graph.
- **`message.rs`** — `TeamMessage` and `DiffMessage`: intermediate factor graph messages used inside `Game`.
- **`MarginFactor`** (`factor/margin.rs`) — Gaussian observation factor on a diff variable; engaged by `Outcome::Scored`.
- **`lib.rs`** — exports the public API (`Game`, `Gaussian`, `History`, `Player`) and standalone functions (`quality()`, `pdf()`, `cdf()`, `erfc()`). Also defines global defaults: `MU=0.0`, `SIGMA=6.0`, `BETA=1.0`, `GAMMA=0.03`, `P_DRAW=0.0`, `EPSILON=1e-6`, `ITERATIONS=30`.
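The natural-parameter storage described for `Gaussian` can be illustrated with a minimal, self-contained sketch. This is not the crate's `gaussian.rs`: the struct below, `from_mu_sigma`, and the operator impls are hypothetical stand-ins that only show why multiplying messages reduces to adding `pi` and `tau`, and dividing to subtracting them.

```rust
use std::ops::{Div, Mul};

// Hypothetical stand-in for the crate's Gaussian (assumption: two f64 fields).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Gaussian {
    pi: f64,  // precision, 1 / sigma^2
    tau: f64, // precision-adjusted mean, mu / sigma^2
}

impl Gaussian {
    fn from_mu_sigma(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Self { pi, tau: mu * pi }
    }

    fn mu(self) -> f64 {
        self.tau / self.pi
    }

    fn sigma(self) -> f64 {
        (1.0 / self.pi).sqrt()
    }
}

// Message multiplication: natural parameters add.
impl Mul for Gaussian {
    type Output = Gaussian;
    fn mul(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

// Message division: natural parameters subtract.
impl Div for Gaussian {
    type Output = Gaussian;
    fn div(self, rhs: Gaussian) -> Gaussian {
        Gaussian { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}

fn main() {
    let prior = Gaussian::from_mu_sigma(0.0, 1.0);
    let likelihood = Gaussian::from_mu_sigma(2.0, 1.0);
    let posterior = prior * likelihood;
    // Two unit-variance messages: the mean lands halfway, the variance halves.
    assert!((posterior.mu() - 1.0).abs() < 1e-12);
    assert!((posterior.sigma() - 0.5f64.sqrt()).abs() < 1e-12);
    // Dividing the likelihood back out recovers the prior.
    assert_eq!(posterior / likelihood, prior);
}
```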
### Key design points
+1 -14
@@ -1,6 +1,6 @@
[package]
name = "trueskill-tt"
version = "0.1.2"
version = "0.1.0"
edition = "2024"
[lib]
@@ -14,23 +14,10 @@ harness = false
name = "gaussian"
harness = false
[[bench]]
name = "history_converge"
harness = false
[[bench]]
name = "scored"
harness = false
[dependencies]
approx = { version = "0.5.1", optional = true }
rayon = { version = "1", optional = true }
smallvec = "1"
[features]
approx = ["dep:approx"]
rayon = ["dep:rayon"]
[dev-dependencies]
criterion = "0.5"
plotters = { version = "0.3", default-features = false, features = ["svg_backend", "all_elements", "all_series"] }
-21
@@ -71,27 +71,6 @@ let h = History::builder()
.build();
```
## Scored outcomes
Use `Outcome::scores([...])` when you have continuous per-team scores rather
than just ranks. Adjacent score margins flow into a `MarginFactor` that adds
soft Gaussian evidence about the latent performance diff. Configure
`HistoryBuilder::score_sigma(σ)` to control how much you trust the margins
(smaller σ = more trust).
```rust
use trueskill_tt::{History, Outcome};
let mut h = History::builder().score_sigma(2.0).build();
h.event(1)
.team(["alice"])
.team(["bob"])
.scores([21.0, 9.0])
.commit()
.unwrap();
h.converge().unwrap();
```
## Todo
- [x] Implement approx for Gaussian
-32
@@ -98,35 +98,3 @@ Gaussian::tau 260.80 ps (unchanged)
# learning_curves_by_index(), nested-Vec public add_events().
# - 90 tests green: 68 lib + 10 api_shape + 6 game + 4 record_winner +
# 2 equivalence.
# After T3 (2026-04-24, same hardware)
Batch::iteration (seq, no rayon) 23.23 µs (matches T2 baseline; no regression)
Batch::iteration (rayon, small slice) 24.57 µs (within noise; small workloads pay rayon overhead)
Gaussian::add 236.62 ps (unchanged)
Gaussian::sub 236.43 ps (unchanged)
Gaussian::mul 237.05 ps (unchanged)
Gaussian::div 236.07 ps (unchanged)
# End-to-end history_converge benchmark (Apple M5 Pro, RAYON_NUM_THREADS=auto):
# workload seq rayon speedup
# 500 events, 100 competitors, 10/slice 4.03 ms 4.24 ms 1.0x
# 2000 events, 200 competitors, 20/slice 20.18 ms 19.82 ms 1.0x
# 5000 events, 50000 competitors, 1 slice 11.88 ms 9.10 ms 1.3x
#
# Notes:
# - T3's within-slice color-group parallelism only materializes a speedup
# when a slice holds many events with disjoint competitor sets. Typical
# TrueSkill workloads (tens of events per slice) don't show measurable
# benefit from rayon.
# - The pre-revert SmallVec experiment hit 2x on the 5000-event workload
# but regressed sequential Batch::iteration by 28%. The tradeoff wasn't
# worth it for typical workloads; SmallVec<[_; 8]> inline size (1 KB per
# Game struct) hurt cache locality on the hot path.
# - Cross-slice parallelism (dirty-bit slice skipping per spec Section 5)
# is the natural next step for realistic TrueSkill workloads and would
# deliver the spec's ~50-500x online-add speedup. Deferred to T4+.
# - Determinism verified: tests/determinism.rs asserts bit-identical
# posteriors across RAYON_NUM_THREADS={1, 2, 4, 8}.
# - Send + Sync bounds added on Time, Drift<T>, Observer<T>, Factor, Schedule.
# - Rayon is opt-in via `--features rayon`. Default build is unchanged from T2.
+4 -6
@@ -1,7 +1,7 @@
use criterion::{Criterion, criterion_group, criterion_main};
use trueskill_tt::{
BETA, Competitor, ConvergenceOptions, EventKind, GAMMA, KeyTable, MU, P_DRAW, Rating, SIGMA,
TimeSlice, drift::ConstantDrift, gaussian::Gaussian, storage::CompetitorStore,
BETA, Competitor, GAMMA, KeyTable, MU, P_DRAW, Rating, SIGMA, TimeSlice, drift::ConstantDrift,
gaussian::Gaussian, storage::CompetitorStore,
};
fn criterion_benchmark(criterion: &mut Criterion) {
@@ -33,10 +33,8 @@ fn criterion_benchmark(criterion: &mut Criterion) {
weights.push(vec![vec![1.0], vec![1.0]]);
}
let kinds = vec![EventKind::Ranked; composition.len()];
let mut time_slice = TimeSlice::new(1, P_DRAW, ConvergenceOptions::default());
time_slice.add_events(composition, results, weights, kinds, &agents);
let mut time_slice = TimeSlice::new(1, P_DRAW);
time_slice.add_events(composition, results, weights, &agents);
criterion.bench_function("Batch::iteration", |b| {
b.iter(|| time_slice.iteration(0, &agents))
-117
@@ -1,117 +0,0 @@
//! End-to-end History::converge benchmark.
//!
//! Workload shapes designed to expose rayon's within-slice color-group
//! parallelism. Events in the same color group are processed in parallel
//! via direct-write with disjoint index sets (no data races). Color groups
//! smaller than a threshold fall back to the sequential path to avoid
//! rayon overhead on small workloads.
//!
//! On Apple M5 Pro, the P-core count (6) is the optimal thread count.
//! The rayon thread pool is initialised to `min(P-cores, available)` to
//! avoid scheduling onto the slower E-cores.
//!
//! ## Results (Apple M5 Pro, 2026-04-24, after SmallVec revert)
//!
//! | Workload | Sequential | Parallel | Speedup |
//! |---------------------------------------------|------------:|-----------:|--------:|
//! | History::converge/500x100@10perslice | 4.03 ms | 4.24 ms | 1.0× |
//! | History::converge/2000x200@20perslice | 20.18 ms | 19.82 ms | 1.0× |
//! | History::converge/1v1-5000x50000@5000perslice| 11.88 ms | 9.10 ms | 1.3× |
//!
//! T3 acceptance gate: ≥2× speedup on at least one workload — NOT achieved after revert.
//! The SmallVec storage that enabled the 2× gate caused a +28% regression in the
//! sequential Batch::iteration benchmark and was reverted. Small workloads still fall
//! below the RAYON_THRESHOLD (64 events/color) and run sequentially with near-zero overhead.
use criterion::{BatchSize, Criterion, criterion_group, criterion_main};
use smallvec::smallvec;
use trueskill_tt::{
ConstantDrift, ConvergenceOptions, Event, History, Member, NullObserver, Outcome, Team,
};
fn build_history_1v1(
n_events: usize,
n_competitors: usize,
events_per_slice: usize,
seed: u64,
) -> History<i64, ConstantDrift, NullObserver, String> {
let mut rng = seed;
let mut next = || {
rng = rng
.wrapping_mul(6364136223846793005)
.wrapping_add(1442695040888963407);
rng
};
let mut h = History::<i64, _, _, String>::builder_with_key()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(25.0 / 300.0))
.convergence(ConvergenceOptions {
max_iter: 30,
epsilon: 1e-6,
alpha: 1.0,
})
.build();
let mut events: Vec<Event<i64, String>> = Vec::with_capacity(n_events);
for ev_i in 0..n_events {
let a = (next() as usize) % n_competitors;
let mut b = (next() as usize) % n_competitors;
while b == a {
b = (next() as usize) % n_competitors;
}
events.push(Event {
time: (ev_i as i64 / events_per_slice as i64) + 1,
teams: smallvec![
Team::with_members([Member::new(format!("p{a}"))]),
Team::with_members([Member::new(format!("p{b}"))]),
],
outcome: Outcome::winner((next() % 2) as u32, 2),
});
}
h.add_events(events).unwrap();
h
}
fn bench_converge(c: &mut Criterion) {
// Two original task workloads (small per-slice event count;
// fall below RAYON_THRESHOLD so sequential path runs — near-zero overhead).
c.bench_function("History::converge/500x100@10perslice", |b| {
b.iter_batched(
|| build_history_1v1(500, 100, 10, 42),
|mut h| {
h.converge().unwrap();
},
BatchSize::SmallInput,
);
});
c.bench_function("History::converge/2000x200@20perslice", |b| {
b.iter_batched(
|| build_history_1v1(2000, 200, 20, 42),
|mut h| {
h.converge().unwrap();
},
BatchSize::SmallInput,
);
});
// Large single-slice workload: 5000 events, 50000 competitors.
// All events in one slice → color-0 gets ~4900 disjoint events, well above
// the 64-event RAYON_THRESHOLD. 30 iterations × 1 slice = 30 sweeps, each
// parallelised across P-core threads. Measured 1.3× after the SmallVec revert.
c.bench_function("History::converge/1v1-5000x50000@5000perslice", |b| {
b.iter_batched(
|| build_history_1v1(5000, 50000, 5000, 42),
|mut h| {
h.converge().unwrap();
},
BatchSize::SmallInput,
);
});
}
criterion_group!(benches, bench_converge);
criterion_main!(benches);
-38
@@ -1,38 +0,0 @@
use criterion::{Criterion, criterion_group, criterion_main};
use smallvec::smallvec;
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
fn bench_scored_history(c: &mut Criterion) {
c.bench_function("scored_history_60_events_30_iter", |bencher| {
bencher.iter(|| {
let mut h: History<i64, ConstantDrift, _, String> = History::builder_with_key()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(0.03))
.score_sigma(2.0)
.build();
let mut events: Vec<Event<i64, String>> = Vec::with_capacity(60);
for i in 0..60 {
let a = format!("p{}", i % 20);
let b = format!("p{}", (i + 7) % 20);
let s_a = (i as f64 * 0.3).sin().abs() * 21.0;
let s_b = (i as f64 * 0.3).cos().abs() * 21.0;
events.push(Event {
time: 1 + (i / 6) as i64,
teams: smallvec![
Team::with_members([Member::new(a)]),
Team::with_members([Member::new(b)]),
],
outcome: Outcome::scores([s_a, s_b]),
});
}
h.add_events(events).unwrap();
h.converge().unwrap();
});
});
}
criterion_group!(benches, bench_scored_history);
criterion_main!(benches);
-14
@@ -1,14 +0,0 @@
Finished `bench` profile [optimized + debuginfo] target(s) in 0.02s
Running benches/scored.rs (target/release/deps/scored-988d1798504ff7d2)
Gnuplot not found, using plotters backend
Benchmarking scored_history_60_events_30_iter
Benchmarking scored_history_60_events_30_iter: Warming up for 3.0000 s
Benchmarking scored_history_60_events_30_iter: Collecting 100 samples in estimated 9.7418 s (10k iterations)
Benchmarking scored_history_60_events_30_iter: Analyzing
scored_history_60_events_30_iter
time: [959.36 µs 962.68 µs 966.13 µs]
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
5 (5.00%) high mild
5 (5.00%) high severe
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -1,593 +0,0 @@
# History → TimeSlice ConvergenceOptions Plumbing Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Thread `ConvergenceOptions` from `History` through `TimeSlice` to the three `Game::*_with_arena` callsites in `time_slice.rs`, so users who set `HistoryBuilder::convergence(opts)` actually get those options applied to within-game inference (including Damped's `alpha`).
**Architecture:** `TimeSlice<T>` gains a `convergence: ConvergenceOptions` field set at construction. `History::add_events_with_prior` passes `self.convergence`. The three `Game::*_with_arena` callsites in `time_slice.rs` swap their hardcoded `ConvergenceOptions::default()` for the propagated value. The pre-existing `TimeSlice::convergence` method is renamed to `iterate_to_convergence` to disambiguate from the new field. No new public API on `History` or `HistoryBuilder`; `convergence(opts)` already exists and works.
**Tech Stack:** Rust 2024, `cargo +nightly fmt`, `cargo clippy`, `cargo test --lib`.
---
## Spec reference
`docs/superpowers/specs/2026-05-08-history-convergence-plumbing-design.md`
## Pre-flight context for the implementer
- `HistoryBuilder::convergence(opts)` already exists at `src/history.rs:91`. `History` already stores `convergence: ConvergenceOptions` at `src/history.rs:166`. `History::converge()` already reads `self.convergence.{epsilon, max_iter}` at `src/history.rs:437-447` for the OUTER cross-history loop.
- `TimeSlice<T>` is at `src/time_slice.rs:172-180`. Currently has fields `events`, `skills`, `time`, `p_draw`, `arena`, `color_groups`. No convergence field yet.
- `TimeSlice::new(time, p_draw)` at `src/time_slice.rs:183-192` is `pub`. Five test callsites use it with `(0i64, 0.0)`. One production callsite in `History::add_events_with_prior` at `src/history.rs:597` uses `(t, self.p_draw)`.
- Three callsites in `time_slice.rs` call `Game::*_with_arena` with hardcoded `crate::ConvergenceOptions::default()`:
- `Event::iteration_direct` at `src/time_slice.rs:131-169` — does NOT have `&self` access to a TimeSlice. Currently takes `(skills, agents, p_draw, arena)`. Needs to gain a `convergence` parameter.
- `TimeSlice::iteration` at `src/time_slice.rs:322-363` — has `&mut self`, so reads `self.convergence` directly.
- `TimeSlice::log_evidence` at `src/time_slice.rs:505-540` — has `&self`, so reads `self.convergence` directly.
- The rayon path in `sweep_color_groups` at `src/time_slice.rs:376-423` uses a `move` closure capturing `p_draw` by value. The same pattern applies to `convergence` (it's `Copy`, so captures cleanly).
- `TimeSlice::convergence` (the **method** at `src/time_slice.rs:447`) shares its name with the new field. Rust technically allows this (different namespaces), but it's a readability hazard — must be renamed. The method is called from 4 test sites in `time_slice.rs` (lines 693, 755, 817, 851). It is NOT called from `history.rs`.
- `ConvergenceOptions` is `Copy + Clone + Debug`. Pass by value everywhere.
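Two of the points above can be checked in isolation with a small sketch: a field and a method may share the name `convergence`, and a `Copy` options struct can be passed by value repeatedly. The types here are simplified, hypothetical stand-ins, not the crate's definitions. The field names `max_iter`, `epsilon`, and `alpha` follow this plan, and the default values are assumptions.

```rust
// Hypothetical stand-in for the crate's ConvergenceOptions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ConvergenceOptions {
    max_iter: usize,
    epsilon: f64,
    alpha: f64,
}

impl Default for ConvergenceOptions {
    fn default() -> Self {
        // Assumed defaults for illustration only.
        Self { max_iter: 30, epsilon: 1e-6, alpha: 1.0 }
    }
}

// Simplified slice holding only the new field.
struct Slice {
    convergence: ConvergenceOptions,
}

impl Slice {
    fn new(convergence: ConvergenceOptions) -> Self {
        Self { convergence }
    }

    // Same identifier as the field: `s.convergence` is the field,
    // `s.convergence()` is the method. It compiles, but reads poorly.
    fn convergence(&self) -> usize {
        self.convergence.max_iter
    }
}

fn main() {
    let opts = ConvergenceOptions { max_iter: 5, ..ConvergenceOptions::default() };
    let a = Slice::new(opts); // `opts` is Copy, so it stays usable below
    let b = Slice::new(opts);
    assert_eq!(a.convergence, b.convergence);
    assert_eq!(a.convergence(), 5);
    assert!(opts.epsilon > 0.0 && opts.alpha <= 1.0);
}
```

The name clash compiles without warnings, which is why the rename to `iterate_to_convergence` is a readability decision rather than a compiler requirement.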
## File map
| File | Why touched |
|---|---|
| `src/time_slice.rs` | TimeSlice gains `convergence` field, `new` signature change, rename `convergence` method, three callsites read `self.convergence`, `Event::iteration_direct` gains parameter, rayon closure captures it |
| `src/history.rs` | `add_events_with_prior` passes `self.convergence` to `TimeSlice::new`; two integration tests added; alpha doc-comment update happens in `convergence.rs` not here |
| `src/convergence.rs` | One-sentence addition to `alpha` doc comment clarifying within-game-only scope |
---
### Task 1: TimeSlice gains `convergence` field; signature/rename land atomically
This task does five things atomically — they cannot land separately because intermediate states won't compile:
1. Add `pub(crate) convergence: ConvergenceOptions` field to `TimeSlice<T>`.
2. Change `TimeSlice::new` signature to take `convergence: ConvergenceOptions` as the third parameter.
3. Update the production callsite in `History::add_events_with_prior` (`src/history.rs:597`) to pass `self.convergence`.
4. Update the five test callsites in `src/time_slice.rs` (lines 646, 723, 803, 901 — the four with `TimeSlice::new(0i64, 0.0)`, plus the one inside the test module's `iterate_through_color_groups` test if it exists; locate via `grep -n "TimeSlice::new" src/time_slice.rs`).
5. Rename the existing `pub(crate) fn convergence` method (at `src/time_slice.rs:447`) to `iterate_to_convergence`. Update its 4 in-file call sites.
After this task the convergence field is wired but **unused** by inference (Task 2 makes the three Game callsites read it). All existing tests must pass bit-equal because the propagated value still equals `ConvergenceOptions::default()` end-to-end.
**Files:**
- Modify: `src/time_slice.rs`
- Modify: `src/history.rs:597`
- [ ] **Step 1: Locate all `TimeSlice::new` and `convergence`-method callsites**
Run:
```bash
grep -n "TimeSlice::new\|\.convergence(" src/time_slice.rs src/history.rs
```
Expected: 1 production callsite of `TimeSlice::new` in `history.rs`, 5 test callsites in `time_slice.rs`, and 4 method-style `.convergence(` calls in `time_slice.rs` test module. (No `.convergence(` calls in `history.rs` — those are field accesses.)
Save the line numbers — you'll need them in Step 4 and Step 6.
- [ ] **Step 2: Add the `convergence` field to `TimeSlice<T>`**
In `src/time_slice.rs`, modify the `TimeSlice<T>` struct (currently at `src/time_slice.rs:172-180`):
```rust
#[derive(Debug)]
pub struct TimeSlice<T: Time = i64> {
pub(crate) events: Vec<Event>,
pub(crate) skills: SkillStore,
pub(crate) time: T,
p_draw: f64,
pub(crate) convergence: crate::ConvergenceOptions,
arena: ScratchArena,
pub(crate) color_groups: ColorGroups,
}
```
Code won't compile until Step 3.
- [ ] **Step 3: Change `TimeSlice::new` signature**
In `src/time_slice.rs`, replace the existing `pub fn new` (currently at `src/time_slice.rs:183-192`) with:
```rust
pub fn new(time: T, p_draw: f64, convergence: crate::ConvergenceOptions) -> Self {
Self {
events: Vec::new(),
skills: SkillStore::new(),
time,
p_draw,
convergence,
arena: ScratchArena::new(),
color_groups: ColorGroups::new(),
}
}
```
- [ ] **Step 4: Update the production callsite in `history.rs`**
In `src/history.rs:597`, replace:
```rust
let mut time_slice = TimeSlice::new(t, self.p_draw);
```
with:
```rust
let mut time_slice = TimeSlice::new(t, self.p_draw, self.convergence);
```
- [ ] **Step 5: Update test callsites of `TimeSlice::new`**
Run `cargo build --tests` to surface every remaining compile error. Each error is a `TimeSlice::new(time, p_draw)` callsite missing the third argument. The fix is to pass `crate::ConvergenceOptions::default()` as that argument. Inside the `src/time_slice.rs` test modules, use whichever path has `ConvergenceOptions` in scope: if it is not imported in that test mod, either keep the fully qualified path or add `use crate::ConvergenceOptions;` at the top of the mod and pass `ConvergenceOptions::default()`.
Example transformation. Before:
```rust
let mut time_slice = TimeSlice::new(0i64, 0.0);
```
After:
```rust
let mut time_slice = TimeSlice::new(0i64, 0.0, crate::ConvergenceOptions::default());
```
Apply to all 5 test callsites identified in Step 1. Repeat `cargo build --tests` until it succeeds.
- [ ] **Step 6: Rename the `convergence` method to `iterate_to_convergence`**
In `src/time_slice.rs`, find the method definition at `src/time_slice.rs:447`:
```rust
pub(crate) fn convergence<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) -> usize {
```
Rename to:
```rust
pub(crate) fn iterate_to_convergence<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) -> usize {
```
Then update the 4 call sites (located in Step 1 — `time_slice.rs:693, 755, 817, 851` or wherever your grep found them). At each site, replace `time_slice.convergence(&agents)` with `time_slice.iterate_to_convergence(&agents)`.
- [ ] **Step 7: Build and run the full test suite**
Run: `cargo build && cargo test --lib`
Expected: all 98 lib tests pass. Bit-equal goldens — the convergence field is wired but the three inference callsites still hardcode `ConvergenceOptions::default()` (Task 2 changes that), and the propagated default equals what was hardcoded before, so behavior is identical.
If any test fails: investigate. The most likely cause is a missed `TimeSlice::new` callsite or a `.convergence(` call site that needs renaming.
- [ ] **Step 8: Run integration tests**
Run: `cargo test`
Expected: all 27 integration tests still pass.
- [ ] **Step 9: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 10: Commit**
```bash
git add src/time_slice.rs src/history.rs
git commit -m "$(cat <<'EOF'
refactor(time_slice): add convergence field, rename iterate_to_convergence
TimeSlice<T> gains a pub(crate) convergence: ConvergenceOptions field
set at construction. TimeSlice::new now takes it as a third parameter
(breaking change to the pub constructor, acceptable in 0.1.x).
History::add_events_with_prior passes self.convergence so the propagated
value reaches every TimeSlice. The pre-existing convergence-the-method
is renamed to iterate_to_convergence to disambiguate from the new
convergence-the-field.
The field is wired but not yet read by inference — the three
Game::*_with_arena callsites in time_slice.rs still hardcode
ConvergenceOptions::default(). Task 2 changes that. Bit-equal because
the propagated value equals the hardcoded value end-to-end.
EOF
)"
```
---
### Task 2: Read `self.convergence` at the three inference callsites
This task switches the three `Game::*_with_arena` callsites in `time_slice.rs` from hardcoded `ConvergenceOptions::default()` to the propagated `self.convergence` (or for `Event::iteration_direct`, a passed-in parameter). After this task, Damped EP set on `HistoryBuilder` actually reaches the within-game loop.
**Files:**
- Modify: `src/time_slice.rs` (only)
- [ ] **Step 1: Add a `convergence` parameter to `Event::iteration_direct`**
In `src/time_slice.rs`, modify the existing `iteration_direct` signature (currently at `src/time_slice.rs:131-137`):
```rust
fn iteration_direct<T: Time, D: Drift<T>>(
&mut self,
skills: &mut SkillStore,
agents: &CompetitorStore<T, D>,
p_draw: f64,
convergence: crate::ConvergenceOptions,
arena: &mut ScratchArena,
) {
```
Inside the body (around `src/time_slice.rs:140-156`), replace both `crate::ConvergenceOptions::default()` arguments with `convergence`:
```rust
let g = match self.kind {
EventKind::Ranked => Game::ranked_with_arena(
teams,
&result,
&self.weights,
p_draw,
convergence,
arena,
),
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&self.weights,
score_sigma,
convergence,
arena,
),
};
```
- [ ] **Step 2: Update the rayon path in `sweep_color_groups` (cfg=rayon)**
In `src/time_slice.rs`, the rayon-feature `sweep_color_groups` (currently at `src/time_slice.rs:376-423`) captures `p_draw` by value into a `move` closure and calls `ev.iteration_direct(skills, agents, p_draw, &mut arena)`. Capture `convergence` the same way and pass it:
Above the rayon `for_each` at the line `let p_draw = self.p_draw;`, add:
```rust
let convergence = self.convergence;
```
Then update the call inside the closure (currently `ev.iteration_direct(skills, agents, p_draw, &mut arena);`):
```rust
ev.iteration_direct(skills, agents, p_draw, convergence, &mut arena);
```
The `else` branch (sequential fallback) at `src/time_slice.rs:417-421` calls `ev.iteration_direct(&mut self.skills, agents, p_draw, &mut self.arena);` — also update:
```rust
ev.iteration_direct(&mut self.skills, agents, p_draw, self.convergence, &mut self.arena);
```
(Note: this branch reads `self.convergence` directly because no `move` closure is involved here.)
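The capture pattern in this step can be sketched without rayon. The example below uses `std::thread::scope` as a stand-in for the rayon `for_each`, and `Slice`, `iteration_direct`, and the arithmetic inside it are hypothetical placeholders. Only the shape matters: copy the `Copy` values out of `self` into locals, then let each `move` closure take its own copy.

```rust
use std::thread;

// Hypothetical, trimmed-down options struct (Copy, as in the plan).
#[derive(Debug, Clone, Copy)]
struct ConvergenceOptions {
    max_iter: usize,
    alpha: f64,
}

struct Slice {
    p_draw: f64,
    convergence: ConvergenceOptions,
    events: Vec<u32>,
}

// Placeholder for Event::iteration_direct; the formula is arbitrary.
fn iteration_direct(event: u32, p_draw: f64, convergence: ConvergenceOptions) -> f64 {
    event as f64 * convergence.alpha + p_draw + convergence.max_iter as f64
}

impl Slice {
    fn sweep(&self) -> Vec<f64> {
        // Copy the scalars out of `self` first, so each `move` closure owns
        // its copy and never borrows `self`.
        let p_draw = self.p_draw;
        let convergence = self.convergence;
        let events = &self.events;
        thread::scope(|s| {
            let handles: Vec<_> = events
                .iter()
                .map(|&ev| s.spawn(move || iteration_direct(ev, p_draw, convergence)))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    }
}

fn main() {
    let slice = Slice {
        p_draw: 0.0,
        convergence: ConvergenceOptions { max_iter: 2, alpha: 0.5 },
        events: vec![2, 4],
    };
    assert_eq!(slice.sweep(), vec![3.0, 4.0]);
}
```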
- [ ] **Step 3: Update the non-rayon path in `sweep_color_groups`**
In `src/time_slice.rs`, the `#[cfg(not(feature = "rayon"))]` `sweep_color_groups` (currently at `src/time_slice.rs:428-444`) calls `ev.iteration_direct(&mut self.skills, agents, p_draw, &mut self.arena);` at `src/time_slice.rs:441`. Replace with:
```rust
ev.iteration_direct(&mut self.skills, agents, p_draw, self.convergence, &mut self.arena);
```
- [ ] **Step 4: Update `TimeSlice::iteration`'s sequential branch**
In `src/time_slice.rs`, modify `TimeSlice::iteration` (at `src/time_slice.rs:322-363`). The sequential branch (when `from > 0 || self.color_groups.is_empty()`) has two `Game::*_with_arena` callsites at `src/time_slice.rs:330-346` that hardcode `crate::ConvergenceOptions::default()`. Replace both with `self.convergence`:
```rust
let g = match event.kind {
EventKind::Ranked => Game::ranked_with_arena(
teams,
&result,
&event.weights,
self.p_draw,
self.convergence,
&mut self.arena,
),
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&event.weights,
score_sigma,
self.convergence,
&mut self.arena,
),
};
```
- [ ] **Step 5: Update `TimeSlice::log_evidence`**
In `src/time_slice.rs`, modify `TimeSlice::log_evidence` (at `src/time_slice.rs:505-540`). The two `Game::*_with_arena` callsites in the inner `run_event` closure at `src/time_slice.rs:519-538` hardcode `crate::ConvergenceOptions::default()`. Replace both with `self.convergence`:
```rust
let run_event = |event: &Event, arena: &mut ScratchArena| -> f64 {
let teams = event.within_priors(online, forward, &self.skills, agents);
let result = event.outputs();
match event.kind {
EventKind::Ranked => Game::ranked_with_arena(
teams,
&result,
&event.weights,
self.p_draw,
self.convergence,
arena,
)
.evidence
.ln(),
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&event.weights,
score_sigma,
self.convergence,
arena,
)
.evidence
.ln(),
}
};
```
(`self.convergence` is `Copy`, so the closure can read it through its shared borrow of `self` and copy it at each call; no `let` binding outside the closure is needed.)
- [ ] **Step 6: Build and run the full test suite — bit-equal regression net**
Run: `cargo build && cargo test --lib`
Expected: all 98 lib tests still pass. Bit-equal goldens — every existing test uses `History::default()` or `HistoryBuilder::default()` (which sets `convergence = ConvergenceOptions::default()`), so the propagated value equals what the hardcoded default was. No test exercises a non-default convergence through History today, so no behavior changes.
If any test fails: investigate. The most likely cause is a stale `crate::ConvergenceOptions::default()` call missed in steps 1-5 — re-grep with `grep -n "ConvergenceOptions::default" src/time_slice.rs` to find any remaining hardcoded sites.
- [ ] **Step 7: Run integration tests**
Run: `cargo test`
Expected: all 27 integration tests still pass.
- [ ] **Step 8: Confirm no `crate::ConvergenceOptions::default()` remains in time_slice.rs**
Run: `grep -n "ConvergenceOptions::default" src/time_slice.rs`
Expected: only test-mod hits (in `TimeSlice::new(0i64, 0.0, ConvergenceOptions::default())` callsites from Task 1 step 5). NO production-code hits in `Event::iteration_direct`, `sweep_color_groups`, `TimeSlice::iteration`, or `TimeSlice::log_evidence`.
- [ ] **Step 9: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 10: Commit**
```bash
git add src/time_slice.rs
git commit -m "$(cat <<'EOF'
feat(time_slice): inference callsites read self.convergence
The three Game::*_with_arena callsites in time_slice.rs (in
TimeSlice::iteration's sequential branch, TimeSlice::log_evidence's
run_event closure, and Event::iteration_direct via parameter) now use
the propagated ConvergenceOptions instead of hardcoded ::default().
sweep_color_groups (both rayon and non-rayon paths) forwards
self.convergence into Event::iteration_direct.
Damped EP (alpha < 1.0) and custom max_iter / epsilon set on
HistoryBuilder::convergence(opts) now actually reach the within-game
inference loop. Bit-equal for users on default options.
EOF
)"
```
---
### Task 3: Doc-comment update + end-to-end integration tests
**Files:**
- Modify: `src/convergence.rs` (alpha doc comment)
- Modify: `src/history.rs` (two integration tests in the existing `#[cfg(test)] mod tests` block)
- [ ] **Step 1: Update `ConvergenceOptions::alpha` doc comment**
In `src/convergence.rs`, find the existing doc comment on the `alpha` field. Replace it with:
```rust
/// EP damping factor in natural-parameter space: each per-factor
/// update inside a single game writes `α·new + (1−α)·old`. `1.0` is
/// undamped (default); `< 1.0` stabilises oscillating fixed-point
/// loops at the cost of more iterations. Must be in `(0.0, 1.0]`.
///
/// Applies only to the within-game EP loop (`run_chain`). The outer
/// `History::converge` cross-history sweep is undamped regardless of
/// this value — cross-slice damping is a different concept and not
/// in scope.
pub alpha: f64,
```
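As a rough illustration of the `α·new + (1−α)·old` rule in the doc comment, the sketch below damps a message in natural-parameter space. The `Natural` struct and this `damp_natural` signature are assumptions made for the example; the crate's actual helper may take different arguments.

```rust
// Hypothetical natural-parameter pair (pi = 1/sigma^2, tau = mu/sigma^2).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Natural {
    pi: f64,
    tau: f64,
}

// Assumed signature: blend the freshly computed message with the previous one.
fn damp_natural(old: Natural, new: Natural, alpha: f64) -> Natural {
    assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0.0, 1.0]");
    Natural {
        pi: alpha * new.pi + (1.0 - alpha) * old.pi,
        tau: alpha * new.tau + (1.0 - alpha) * old.tau,
    }
}

fn main() {
    let old = Natural { pi: 1.0, tau: 0.0 }; // mu = 0, sigma = 1
    let new = Natural { pi: 4.0, tau: 8.0 }; // mu = 2, sigma = 0.5

    // alpha = 1.0 is undamped: the new message is written as-is.
    assert_eq!(damp_natural(old, new, 1.0), new);

    // alpha = 0.5 lands halfway in (pi, tau), not halfway in (mu, sigma).
    let half = damp_natural(old, new, 0.5);
    assert_eq!(half, Natural { pi: 2.5, tau: 4.0 });
    assert!((half.tau / half.pi - 1.6).abs() < 1e-12);
}
```

Note that the damped mean is 1.6 rather than 1.0: interpolating in natural parameters weights each message by its precision.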
- [ ] **Step 2: Locate the `#[cfg(test)] mod tests` block in `src/history.rs`**
Run: `grep -n "#\[cfg(test)\]" src/history.rs`
Identify the test module (there should be one near the bottom of the file). Read the imports at the top of that module so the new tests can reuse the existing test helpers and scope.
- [ ] **Step 3: Write the failing tests**
Add the following two tests at the end of the test module in `src/history.rs` (just before the module's closing `}`):
```rust
#[test]
fn history_propagates_convergence_to_inner_run_chain() {
use crate::ConvergenceOptions;
// 4-team ranked game; each event needs more than one inner EP iter
// to fully converge.
let events_for = |h: &mut crate::History<i64, crate::drift::ConstantDrift,
crate::observer::NullObserver, &'static str>| {
for &name in &["a", "b", "c", "d"] {
h.new_agent(name);
}
h.event(0)
.team(["a"])
.team(["b"])
.team(["c"])
.team(["d"])
.commit()
.unwrap();
};
let mut h_capped = crate::History::builder()
.convergence(ConvergenceOptions {
max_iter: 1,
..ConvergenceOptions::default()
})
.build();
events_for(&mut h_capped);
h_capped.converge().unwrap();
let mut h_full = crate::History::builder().build();
events_for(&mut h_full);
h_full.converge().unwrap();
let curves_capped = h_capped.learning_curves();
let curves_full = h_full.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, capped_pts) in curves_capped.iter() {
let full_pts = curves_full.get(key).expect("agent missing in full");
for (capped, full) in capped_pts.iter().zip(full_pts.iter()) {
max_diff = max_diff.max((capped.1.mu() - full.1.mu()).abs());
max_diff = max_diff.max((capped.1.sigma() - full.1.sigma()).abs());
}
}
assert!(
max_diff > 1e-6,
"max_iter=1 inner loop should differ from default; max_diff={max_diff}"
);
}
#[test]
fn history_with_damping_reaches_same_fixed_point_as_undamped() {
use crate::ConvergenceOptions;
let events_for = |h: &mut crate::History<i64, crate::drift::ConstantDrift,
crate::observer::NullObserver, &'static str>| {
for &name in &["a", "b", "c", "d"] {
h.new_agent(name);
}
h.event(0)
.team(["a"])
.team(["b"])
.team(["c"])
.team(["d"])
.commit()
.unwrap();
};
let mut h_undamped = crate::History::builder().build();
events_for(&mut h_undamped);
h_undamped.converge().unwrap();
let mut h_damped = crate::History::builder()
.convergence(ConvergenceOptions {
alpha: 0.5,
max_iter: 200,
..ConvergenceOptions::default()
})
.build();
events_for(&mut h_damped);
h_damped.converge().unwrap();
let curves_u = h_undamped.learning_curves();
let curves_d = h_damped.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, u_pts) in curves_u.iter() {
let d_pts = curves_d.get(key).expect("agent missing in damped");
for (u, d) in u_pts.iter().zip(d_pts.iter()) {
max_diff = max_diff.max((u.1.mu() - d.1.mu()).abs());
max_diff = max_diff.max((u.1.sigma() - d.1.sigma()).abs());
}
}
assert!(
max_diff < 1e-3,
"α=0.5 should reach the same fixed point as α=1.0; max_diff={max_diff}"
);
}
```
If the import or method names (e.g. `History::builder()`, `event(...).team(...).commit()`, `learning_curves()`, `new_agent(...)`) don't match what's available in the test module, look at neighboring tests for the exact builder/event-construction pattern in current use and mirror it. The structure (build two Histories, add identical events, compare curves) is the contract; the surface syntax must follow what already works in this test file.
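The comparison half of that contract can be sketched without the crate. The snippet below is a minimal stand-in, assuming learning curves are keyed by agent name and hold `(time, (mu, sigma))` points; `Curves` and `max_abs_diff` are hypothetical names for illustration, not part of the crate's API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a learning-curve map: agent -> [(time, (mu, sigma))].
type Curves = HashMap<&'static str, Vec<(i64, (f64, f64))>>;

/// Largest absolute difference in mu or sigma across all agents and points.
/// Panics if an agent present in `a` is missing from `b`.
fn max_abs_diff(a: &Curves, b: &Curves) -> f64 {
    let mut max_diff: f64 = 0.0;
    for (key, a_pts) in a {
        let b_pts = b.get(key).expect("agent missing in second history");
        for ((_, (mu_a, sigma_a)), (_, (mu_b, sigma_b))) in a_pts.iter().zip(b_pts.iter()) {
            max_diff = max_diff.max((mu_a - mu_b).abs());
            max_diff = max_diff.max((sigma_a - sigma_b).abs());
        }
    }
    max_diff
}
```

A propagation test then asserts `max_abs_diff(..) > 1e-6`, while a same-fixed-point test asserts `max_abs_diff(..) < 1e-3`.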
- [ ] **Step 4: Run the new tests**
Run: `cargo test --lib -- history_propagates_convergence_to_inner_run_chain history_with_damping_reaches_same_fixed_point_as_undamped`
Expected: 2 passed.
**Fallback if Test 1 fails** (`max_iter=1` produces the same posteriors as default, meaning the inner loop converges in one iteration on this graph): replace `max_iter: 1` with `max_iter: 0`. With `max_iter = 0` the inner loop body runs zero times, so the posteriors are guaranteed to differ from the converged run.
**Fallback if Test 2 fails** (`max_diff` exceeds `1e-3`): raise `max_iter: 200` to `max_iter: 500`. Heavier damping needs more iterations to reach the same fixed point.
If neither fallback works, STOP and report BLOCKED with the actual `max_diff` and the iteration counts tried.
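The reasoning behind both fallbacks can be seen on a toy problem. The sketch below is not the crate's EP loop: it iterates a hypothetical scalar map `f(x) = 0.5 * x + 1.0`, and it assumes `alpha = 1.0` means "undamped", matching the test message above. Damping leaves the fixed point unchanged but slows the approach, and a capped iteration count stops short of it.

```rust
/// Damped fixed-point iteration on the toy map f(x) = 0.5 * x + 1.0,
/// whose fixed point is x = 2.0. `alpha = 1.0` means no damping.
/// Returns the final iterate and the number of iterations actually run.
fn damped_fixed_point(alpha: f64, max_iter: usize, tol: f64) -> (f64, usize) {
    let f = |x: f64| 0.5 * x + 1.0;
    let mut x = 0.0_f64;
    let mut iters = 0;
    while iters < max_iter {
        // Blend the old value with the proposed update.
        let next = (1.0 - alpha) * x + alpha * f(x);
        let step = (next - x).abs();
        x = next;
        iters += 1;
        if step < tol {
            break;
        }
    }
    (x, iters)
}
```

On this map, `alpha = 0.5` turns the contraction rate from 0.5 into 0.75, which is why the damped run needs a larger `max_iter` to land within the same tolerance.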
- [ ] **Step 5: Run the full test suite**
Run: `cargo test --lib && cargo test`
Expected: lib count = 100 (was 98), integration count = 27 (unchanged), all passing.
- [ ] **Step 6: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 7: Commit**
```bash
git add src/convergence.rs src/history.rs
git commit -m "$(cat <<'EOF'
test(history): end-to-end ConvergenceOptions propagation tests
Two integration tests on a 4-team ranked event:
- max_iter=1 set on HistoryBuilder produces measurably different
posteriors than default, proving the inner loop honors the
propagated max_iter
- alpha=0.5 with extra iterations reaches the same fixed point as
alpha=1.0, proving damping doesn't break correctness on the History
path
Also updates the alpha doc comment to clarify it applies only to the
within-game EP loop, not the outer cross-history sweep.
EOF
)"
```
---
## Self-review (writer's note)
**Spec coverage:**
- Spec § "What ships" item 1 (TimeSlice convergence field) → Task 1 step 2 ✓
- Spec § "What ships" item 2 (TimeSlice::new signature) → Task 1 step 3 ✓
- Spec § "What ships" item 3 (History passes self.convergence) → Task 1 step 4 ✓
- Spec § "What ships" item 4 (Event::iteration_direct gains parameter) → Task 2 step 1 ✓
- Spec § "What ships" item 4 (callers pass self.convergence) → Task 2 steps 2, 3 ✓
- Spec § "What ships" item 5 (TimeSlice::convergence-method reads field) → Task 2 step 4 ✓
- Spec § "What ships" item 6 (log_evidence reads field) → Task 2 step 5 ✓
- Spec § "What ships" item 7 (test callsite updates) → Task 1 step 5 ✓
- Spec § "Design" rename method → Task 1 step 6 ✓
- Spec § "Risks" alpha doc-comment update → Task 3 step 1 ✓
- Spec § "Testing strategy" §1 (regression net) → Tasks 1 step 7, 2 step 6, 3 step 5 ✓
- Spec § "Testing strategy" §2 (history_propagates_convergence) → Task 3 step 3 test 1 ✓
- Spec § "Testing strategy" §2 (history_with_damping_reaches_same_fixed_point) → Task 3 step 3 test 2 ✓
**Out-of-scope items correctly absent:** No new `History`/`HistoryBuilder` methods, no `ConvergenceOptions` split, no `Damped` Schedule impl, no nat-param convergence switch.
**Type / signature consistency:**
- `TimeSlice::new(time, p_draw, convergence: ConvergenceOptions)` — Task 1 step 3 (def) and Task 1 step 4-5 (call sites) match ✓
- `iteration_direct(skills, agents, p_draw, convergence, arena)` — Task 2 step 1 (def) and steps 2, 3 (call sites) match ✓
- `iterate_to_convergence` — Task 1 step 6 ✓
- All `self.convergence` reads are field accesses, not method calls (the rename in Task 1 step 6 prevents ambiguity) ✓
**Two tasks (1 and 2) split rationale:** Task 1 wires the field but the inference path still uses hardcoded defaults (no behavioral change). Task 2 makes the field actually drive inference (behavioral change for non-default users). Each task is independently committable and the test suite is bit-equal at every checkpoint.
**No placeholders detected.**
@@ -1,540 +0,0 @@
# Per-Event `score_sigma` Override Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Let users specify a per-event score-sigma override on `Outcome::Scored`, defaulting to `HistoryBuilder::score_sigma` when not set.
**Architecture:** `Outcome::Scored` becomes a struct variant with an `Option<f64>` `sigma` field. `History::add_events` resolves `sigma.unwrap_or(self.score_sigma)` at ingest time, so downstream `EventKind::Scored.score_sigma` stays a plain `f64` and `TimeSlice` / `run_chain` need zero changes. Two new constructors (`Outcome::scores_with_sigma` and `EventBuilder::scores_with_sigma`) cover the override path; existing `scores(...)` keeps its signature.
**Tech Stack:** Rust 2024, `cargo +nightly fmt`, `cargo clippy`, `cargo test`.
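The resolution rule from the architecture note can be sketched in isolation. This is a simplified stand-in, assuming a `Vec` in place of `SmallVec` and a hypothetical free function `resolve_sigma`; the real resolution lives inside `History::add_events`.

```rust
/// Simplified stand-in for the plan's `Outcome` enum
/// (a `Vec` replaces `SmallVec` so the sketch has no dependencies).
#[derive(Clone, Debug, PartialEq)]
enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

/// Resolve the per-event sigma at ingest: the override wins, else the default.
/// Returns `None` for ranked outcomes, which carry no score noise.
fn resolve_sigma(outcome: &Outcome, history_default: f64) -> Option<f64> {
    match outcome {
        Outcome::Scored { sigma, .. } => Some(sigma.unwrap_or(history_default)),
        Outcome::Ranked(_) => None,
    }
}
```

Because the `Option` is collapsed to a plain `f64` at this one point, nothing downstream needs to know whether the value was inherited or overridden.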
---
## Spec reference
`docs/superpowers/specs/2026-05-08-per-event-score-sigma-design.md`
## File map
| File | Why touched |
|---|---|
| `src/outcome.rs` | `Outcome::Scored` variant becomes a struct; pattern matches in `team_count`, `as_scores`, `as_ranks`; new `scores_with_sigma` constructor; existing `scores` constructor body adapts |
| `src/history.rs` | The single ingest pattern match at `:735` resolves `sigma.unwrap_or(self.score_sigma)`; three new end-to-end tests |
| `src/event_builder.rs` | New `scores_with_sigma` builder method |
## Pre-flight context for the implementer
- `Outcome` is `pub`. Currently a tuple-variant enum at `src/outcome.rs:18-21`. Changing `Scored(SmallVec)` → `Scored { scores, sigma }` is a breaking change to a public variant shape, acceptable in 0.1.x.
- Pattern-match callsite inventory across the workspace (verified by grep): only ONE site destructures the variant — `src/history.rs:735` (`crate::Outcome::Scored(scores) => { ... }`). Every other reference is either a constructor call (`Outcome::scores(...)`) or a string literal in a doc/error message. The constructors keep their existing signatures, so callsites don't need updating.
- `Outcome::scores(I)` constructor at `src/outcome.rs:44`: keep the signature `pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self`. Only the body changes (it now builds `Self::Scored { scores: ..., sigma: None }`).
- `as_scores`, `as_ranks`, `team_count` accessors at `src/outcome.rs:48-67`: their public signatures stay the same. Internal pattern matches adapt mechanically.
- `EventBuilder::scores(I)` at `src/event_builder.rs:79-82`: keep unchanged. The new `scores_with_sigma(I, f64)` lives next to it.
- `History::score_sigma` at `src/history.rs:165`: still the history-wide default. `HistoryBuilder::score_sigma(s)` builder method at `src/history.rs:82-89` stays as-is.
- `EventKind::Scored { score_sigma: f64 }` at `src/time_slice.rs:51`: already per-event-shaped. Don't touch.
- Test baseline: 100 lib + 27 integration tests, all passing.
---
### Task 1: `Outcome::Scored` becomes a struct variant + constructors
This is the foundational shape change. After this task: the new variant compiles, both `scores` and `scores_with_sigma` work on `Outcome` directly, and `History::add_events` (the only consumer that destructures the variant) resolves the override at ingest (Step 5), so the codebase builds at the commit boundary.
**Files:**
- Modify: `src/outcome.rs` (variant shape, three pattern-match arms, two existing tests, four new tests, two constructors)
- Modify: `src/history.rs` (the single ingest arm at `:735`)
- [ ] **Step 1: Write failing tests for the new constructor**
In `src/outcome.rs`, inside the existing `#[cfg(test)] mod tests` block, add at the end:
```rust
#[test]
fn scores_with_sigma_round_trips() {
let o = Outcome::scores_with_sigma([10.0, 4.0], 0.5);
assert_eq!(o.team_count(), 2);
assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
}
#[test]
fn scores_constructor_leaves_sigma_unset() {
// After the variant change, the public Outcome::scores constructor
// must build with sigma: None. We assert this indirectly via a match
// on the variant.
let o = Outcome::scores([3.0, 1.0]);
match o {
Outcome::Scored { scores: _, sigma } => assert!(sigma.is_none()),
Outcome::Ranked(_) => panic!("expected Scored variant"),
}
}
#[test]
fn scores_with_sigma_sets_sigma_some() {
let o = Outcome::scores_with_sigma([3.0, 1.0], 2.0);
match o {
Outcome::Scored { scores: _, sigma } => assert_eq!(sigma, Some(2.0)),
Outcome::Ranked(_) => panic!("expected Scored variant"),
}
}
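#[test]
#[should_panic(expected = "score_sigma must be > 0.0")]
fn scores_with_sigma_rejects_zero() {
// The check is a debug_assert!, so this panics only when debug assertions
// are enabled (the default for `cargo test`; not under `--release`).
let _ = Outcome::scores_with_sigma([3.0, 1.0], 0.0);
}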
```
- [ ] **Step 2: Run the new tests to verify they fail**
Run: `cargo test --lib outcome::tests`
Expected: a compile failure. All four new tests call `Outcome::scores_with_sigma`, which doesn't exist yet, and two of them destructure `Scored { ... }`, which doesn't match the current tuple variant. Because the test module fails to compile, none of its tests run.
- [ ] **Step 3: Change the variant shape and update the constructor + accessors**
In `src/outcome.rs`, replace the entire `Outcome` enum and `impl Outcome` block (currently `src/outcome.rs:16-68`) with:
```rust
/// Final outcome of a match.
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
///
/// `Scored { scores, sigma }`: higher score = better. Adjacent (sorted) pairs
/// feed observed margins to `MarginFactor`. `scores.len()` must equal the
/// number of teams in the event. `sigma` overrides `HistoryBuilder::score_sigma`
/// when `Some`; `None` inherits the history default.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
Ranked(SmallVec<[u32; 4]>),
Scored {
scores: SmallVec<[f64; 4]>,
/// Per-event noise override. `None` means inherit
/// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
sigma: Option<f64>,
},
}
impl Outcome {
/// `n`-team outcome where team `winner` won and everyone else tied for last.
///
/// Panics if `winner >= n`.
pub fn winner(winner: u32, n: u32) -> Self {
assert!(winner < n, "winner index {winner} out of range 0..{n}");
let ranks: SmallVec<[u32; 4]> = (0..n).map(|i| if i == winner { 0 } else { 1 }).collect();
Self::Ranked(ranks)
}
/// All `n` teams tied.
pub fn draw(n: u32) -> Self {
Self::Ranked(SmallVec::from_vec(vec![0; n as usize]))
}
/// Explicit per-team ranking.
pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
Self::Ranked(ranks.into_iter().collect())
}
/// Explicit per-team continuous scores; higher = better.
/// Inherits `HistoryBuilder::score_sigma` for the noise model.
pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
Self::Scored {
scores: scores.into_iter().collect(),
sigma: None,
}
}
/// Explicit per-team continuous scores with a per-event noise override.
///
/// `sigma` must be `> 0.0`; debug-asserts otherwise.
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(scores: I, sigma: f64) -> Self {
debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
Self::Scored {
scores: scores.into_iter().collect(),
sigma: Some(sigma),
}
}
pub fn team_count(&self) -> usize {
match self {
Self::Ranked(r) => r.len(),
Self::Scored { scores, .. } => scores.len(),
}
}
pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
match self {
Self::Ranked(r) => Some(r),
Self::Scored { .. } => None,
}
}
pub(crate) fn as_scores(&self) -> Option<&[f64]> {
match self {
Self::Scored { scores, .. } => Some(scores),
Self::Ranked(_) => None,
}
}
}
```
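The validation above uses `debug_assert!`, which is compiled out when debug assertions are disabled. The sketch below illustrates that behavior with a hypothetical `checked_sigma` helper (not part of the crate): whether a non-positive sigma panics depends on the build profile.

```rust
use std::panic;

/// Hypothetical validator mirroring the plan's `debug_assert!` check.
fn checked_sigma(sigma: f64) -> f64 {
    debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
    sigma
}

/// True if `checked_sigma(sigma)` panics in the current build profile.
fn panics_on(sigma: f64) -> bool {
    panic::catch_unwind(move || checked_sigma(sigma)).is_err()
}
```

Under the default `cargo test` profile `panics_on(0.0)` is true; with debug assertions off the invalid value passes through unchecked, which is the trade-off of choosing `debug_assert!` over `assert!` or a `Result`.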
- [ ] **Step 4: Run the new tests**
Run: `cargo test --lib outcome::tests`
Expected: all outcome tests pass (the 6 pre-existing tests + 4 new = 10 total in the outcome tests module). The library will not compile until the `src/history.rs:735` arm is updated in Step 5, so if the build fails on that arm, apply Step 5 first and rerun this command.
If any pre-existing test fails, the issue is in this task, not Task 2. Most likely cause: a pattern-match arm in the rewritten `impl Outcome` block doesn't compile. Re-check the struct-variant destructure syntax (`Self::Scored { scores, .. }` for read-only access; `Self::Scored { scores, sigma }` when both fields are needed).
- [ ] **Step 5: Update `History::add_events` ingest arm to destructure the new variant**
The variant change from Step 3 breaks the existing `Outcome::Scored(scores)` pattern match in `src/history.rs:735`. Fix it now (in the same commit) — the codebase must build at every commit boundary.
In `src/history.rs`, find the `crate::Outcome::Scored(scores) => { ... }` arm (currently at `src/history.rs:735-740`). Replace with:
```rust
crate::Outcome::Scored { scores, sigma } => {
let resolved = sigma.unwrap_or(self.score_sigma);
debug_assert!(
resolved > 0.0,
"resolved score_sigma must be > 0.0 (got {resolved})"
);
kinds.push(EventKind::Scored {
score_sigma: resolved,
});
scores.to_vec()
}
```
The surrounding `match &ev.outcome { ... }` and the surrounding flow (the `ranks` arm above, the `results.push(event_result);` below) stay unchanged.
- [ ] **Step 6: Run the full library test suite — bit-equal regression net**
Run: `cargo build && cargo test --lib && cargo test`
Expected: clean build. All 104 lib tests (the 100 baseline plus the 4 added in Step 1) and 27 integration tests pass. Goldens are bit-equal: every existing scored-event constructor uses the no-override path (`Outcome::scores(...)` or `EventBuilder::scores(...)`), which now resolves to `sigma: None → resolved = self.score_sigma`, exactly equal to the previous behavior.
If unexpected additional compile errors surface (any site pattern-matching `Outcome::Scored(...)` outside the 735 arm), STOP and report — the plan's inventory is wrong, surface that as a finding before continuing.
If any existing test fails: investigate. Most likely cause is a typo in the new pattern arms (Step 3) or the resolution rule (Step 5). The override path isn't exercised yet by any existing test, so the only thing that can break is the inheritance path.
- [ ] **Step 7: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 8: Commit**
```bash
git add src/outcome.rs src/history.rs
git commit -m "$(cat <<'EOF'
feat(outcome): per-event score_sigma override on Outcome::Scored
Outcome::Scored shape changes from tuple to struct:
{ scores, sigma: Option<f64> }. New constructor scores_with_sigma
sets sigma=Some(s) and debug-asserts s > 0.0; existing scores(I)
constructor keeps its signature and builds with sigma=None internally.
team_count, as_scores, as_ranks accessor pattern matches updated.
History::add_events resolves sigma.unwrap_or(self.score_sigma) at the
ingest arm, so downstream EventKind::Scored stays a plain f64 and
TimeSlice / run_chain need zero changes.
Breaking change to the public Outcome::Scored variant shape
(acceptable in 0.1.x). Bit-equal for callers using the no-override
path because the resolution falls through to self.score_sigma exactly
as before.
EOF
)"
```
---
### Task 2: `EventBuilder::scores_with_sigma` builder method
The override path is fully wired by Task 1, but it's only reachable via the `Outcome::scores_with_sigma` constructor (passed into `History::add_events` directly). The fluent-builder ergonomic — `h.event(t).team(...).scores_with_sigma(scores, sigma).commit()` — needs one new method on `EventBuilder`.
**Files:**
- Modify: `src/event_builder.rs` (new builder method)
- [ ] **Step 1: Add the EventBuilder method**
In `src/event_builder.rs`, find the existing `scores` method (currently at `src/event_builder.rs:79-82`). Immediately below it (still inside `impl<'h, T, D, O, K> EventBuilder<...>`), add:
```rust
/// Set explicit per-team continuous scores with a per-event noise override.
///
/// `sigma` overrides `HistoryBuilder::score_sigma` for this event only.
/// Must be `> 0.0`; debug-asserts otherwise via `Outcome::scores_with_sigma`.
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
self
}
```
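The `mut self ... -> Self` shape is what lets the call chain. A dependency-free sketch of the pattern follows; `Event`, `EventBuilder`, and `build` here are hypothetical stand-ins with made-up fields, not the crate's real types (the real builder ends with `commit()` and stores an `Outcome`).

```rust
/// Hypothetical, dependency-free stand-in for an event under construction.
#[derive(Debug, Default, PartialEq)]
struct Event {
    teams: Vec<Vec<&'static str>>,
    scores: Vec<f64>,
    sigma: Option<f64>,
}

#[derive(Default)]
struct EventBuilder {
    event: Event,
}

impl EventBuilder {
    /// Append one team; consumes and returns the builder so calls chain.
    fn team<const N: usize>(mut self, members: [&'static str; N]) -> Self {
        self.event.teams.push(members.to_vec());
        self
    }

    /// Set scores together with a per-event sigma override.
    fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
        self.event.scores = scores.into_iter().collect();
        self.event.sigma = Some(sigma);
        self
    }

    /// Finish the chain and hand back the assembled event.
    fn build(self) -> Event {
        self.event
    }
}
```

Usage mirrors the plan's fluent call: `EventBuilder::default().team(["a"]).team(["b"]).scores_with_sigma([3.0, 1.0], 2.0).build()`.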
- [ ] **Step 2: Build and run the test suite**
Run: `cargo build && cargo test --lib && cargo test`
Expected: clean build, all 104 lib + 27 integration tests pass (100 baseline lib tests plus the 4 outcome tests from Task 1). The new method is additive, so no behavior changes for existing tests.
- [ ] **Step 3: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 4: Commit**
```bash
git add src/event_builder.rs
git commit -m "$(cat <<'EOF'
feat(event_builder): expose scores_with_sigma fluent method
Adds EventBuilder::scores_with_sigma, the fluent-builder ergonomic
mirror of Outcome::scores_with_sigma. Lets users write
h.event(t).team(...).team(...).scores_with_sigma([..], sigma).commit()
to set a per-event score_sigma override.
EOF
)"
```
---
### Task 3: End-to-end integration tests
**Files:**
- Modify: `src/history.rs` (three new tests in the existing `#[cfg(test)] mod tests` block at the bottom)
- [ ] **Step 1: Locate the test module**
Run: `grep -n "^#\[cfg(test)\]" src/history.rs`
Identify the test module (there should be one near the bottom of the file). Read its imports and look at neighboring tests to see the existing builder/event-construction pattern in current use. Mirror that pattern in the new tests below — the surface syntax (`History::builder()`, `event(t).team(...)`, `learning_curves()`, etc.) must match what already works in this file.
- [ ] **Step 2: Write the failing tests**
Add the following three tests at the end of the existing `#[cfg(test)] mod tests` block in `src/history.rs` (just before the module's closing `}`):
```rust
#[test]
fn outcome_scores_default_sigma_uses_history_default() {
use crate::Outcome;
// Path A: explicit sigma=0.5 via override.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 0.5),
}])
.unwrap();
h_a.converge().unwrap();
// Path B: history-wide default 0.5, no per-event override.
let mut h_b = crate::History::builder().score_sigma(0.5).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_b.converge().unwrap();
// Inheritance: posteriors must be bit-equal.
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing in path B");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
}
#[test]
fn outcome_scores_with_sigma_overrides_history_default() {
use crate::Outcome;
// Path A: history-wide default 0.5, per-event override 2.0.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
}])
.unwrap();
h_a.converge().unwrap();
// Path B: history-wide default 2.0, no per-event override.
let mut h_b = crate::History::builder().score_sigma(2.0).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_b.converge().unwrap();
// Override == default-set-to-the-override-value: bit-equal.
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing in path B");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
// Path C: history-wide default 0.5, no override. Different sigma → different posteriors.
let mut h_c = crate::History::builder().score_sigma(0.5).build();
h_c.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_c.converge().unwrap();
let curves_c = h_c.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, a_pts) in curves_a.iter() {
let c_pts = curves_c.get(key).expect("agent missing in path C");
for (a, c) in a_pts.iter().zip(c_pts.iter()) {
max_diff = max_diff.max((a.1.mu() - c.1.mu()).abs());
max_diff = max_diff.max((a.1.sigma() - c.1.sigma()).abs());
}
}
assert!(
max_diff > 1e-6,
"override should produce different posteriors from inherited default; max_diff={max_diff}"
);
}
#[test]
fn event_builder_scores_with_sigma_threading() {
use crate::Outcome;
// Path A: builder fluent API with sigma override.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.event(0_i64)
.team(["a"])
.team(["b"])
.scores_with_sigma([3.0, 1.0], 2.0)
.commit()
.unwrap();
h_a.converge().unwrap();
// Path B: same outcome via the explicit Outcome constructor.
let mut h_b = crate::History::builder().score_sigma(0.5).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
}])
.unwrap();
h_b.converge().unwrap();
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
}
```
If the surface API (e.g. `History::add_events`, `Event { time, teams, outcome }`, `Team::with_members`, `Member::new`, `event(...).team(...).commit()`, `learning_curves()`) doesn't exactly match what's available in the test module, look at neighboring tests for the patterns currently in use and adjust. The CONTRACT is: build two Histories that should produce identical posteriors, run them, compare. The surface syntax must follow what compiles in this file.
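The tests above compare `pi()` and `tau()` with `assert_eq!`, which uses `==` on `f64`. Where a stricter notion of "bit-equal" is wanted, comparing raw bit patterns is one option. The helpers below are a hypothetical sketch, assuming each curve point has been reduced to a `(pi, tau)` pair of `f64`s.

```rust
/// True only if `a` and `b` have identical IEEE-754 bit patterns.
/// Stricter than `==`: it distinguishes 0.0 from -0.0, and it treats
/// a NaN as equal to itself when the payload bits match.
fn bit_equal(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

/// True if two slices of (pi, tau)-style pairs match bit for bit,
/// including having the same length.
fn curves_bit_equal(a: &[(f64, f64)], b: &[(f64, f64)]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(x, y)| bit_equal(x.0, y.0) && bit_equal(x.1, y.1))
}
```

The explicit length check matters because `zip` silently stops at the shorter input, so a missing point would otherwise go unnoticed.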
- [ ] **Step 3: Run the new tests**
Run: `cargo test --lib -- outcome_scores_default_sigma_uses_history_default outcome_scores_with_sigma_overrides_history_default event_builder_scores_with_sigma_threading`
Expected: 3 passed.
**Fallback if Test 2's `max_diff > 1e-6` fails** (sigma=0.5 vs sigma=2.0 produces nearly identical posteriors — unlikely on a single 2-team scored event, but possible if the priors dominate): use a larger gap, e.g. `Outcome::scores_with_sigma([3.0, 1.0], 5.0)` vs `Outcome::scores([3.0, 1.0])` with `score_sigma(0.5)`. The point is to prove the resolution path actually engages — any sigma gap that produces a measurable posterior difference is fine.
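Why a sigma gap should move the posterior can be illustrated with a scalar conjugate Gaussian update. This is a toy model, not the crate's `MarginFactor`; the prior and observation values are chosen purely for illustration.

```rust
/// Conjugate Gaussian update of a scalar prior N(mu, sigma^2) given one
/// observation `obs` with noise standard deviation `obs_sigma`.
/// Returns the posterior (mu, sigma).
fn gaussian_update(mu: f64, sigma: f64, obs: f64, obs_sigma: f64) -> (f64, f64) {
    let prior_var = sigma * sigma;
    let obs_var = obs_sigma * obs_sigma;
    // Gain in (0, 1) for positive variances: the weight given to the observation.
    let gain = prior_var / (prior_var + obs_var);
    let post_mu = mu + gain * (obs - mu);
    let post_var = (1.0 - gain) * prior_var;
    (post_mu, post_var.sqrt())
}
```

With a unit-variance prior at 0 and an observation of 2.0, noise 0.5 gives a gain of 0.8 while noise 2.0 gives 0.2, so the posterior means differ by far more than `1e-6`. If the prior were much tighter than both noise levels, the two gains would both be near zero, which is the "priors dominate" case the fallback guards against.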
- [ ] **Step 4: Run the full test suite**
Run: `cargo test --lib && cargo test`
Expected: lib count = 107 (the 100 baseline, +4 from Task 1, +3 here), integration count = 27 (unchanged), all passing.
- [ ] **Step 5: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`
Expected: no diff, no warnings.
- [ ] **Step 6: Commit**
```bash
git add src/history.rs
git commit -m "$(cat <<'EOF'
test(history): end-to-end per-event score_sigma override tests
Three integration tests on a 2-team scored event:
- inheritance: Outcome::scores(...) with no override produces
bit-equal posteriors to the same outcome wrapped in
scores_with_sigma(scores, history.score_sigma)
- override-supersedes-default: scores_with_sigma(scores, X) with
history score_sigma(Y) produces bit-equal posteriors to
scores(...) with history score_sigma(X), AND differs measurably
from scores(...) with history score_sigma(Y)
- builder threading: EventBuilder::scores_with_sigma reaches the
ingest path identically to the Outcome constructor
EOF
)"
```
---
## Self-review (writer's note)
**Spec coverage:**
- Spec § "What ships" item 1 (Scored becomes struct variant) → Task 1 step 3 ✓
- Spec § "What ships" item 2 (scores_with_sigma constructor) → Task 1 step 3 ✓
- Spec § "What ships" item 3 (EventBuilder::scores_with_sigma) → Task 2 step 1 ✓
- Spec § "What ships" item 4 (sigma resolution at ingest) → Task 1 step 5 ✓
- Spec § "What ships" item 5 (pattern-match update inventory) → Task 1 step 5 (single site at history.rs:735) ✓
- Spec § "Validation" (debug_assert at constructor) → Task 1 step 3 (in `scores_with_sigma`) ✓
- Spec § "Validation" (debug_assert at ingest) → Task 1 step 5 ✓
- Spec § "Testing strategy" §1 (regression net) → Task 1 step 6, Task 2 step 2, Task 3 step 4 ✓
- Spec § "Testing strategy" §2 test 1 (default-uses-history-default) → Task 3 step 2 test 1 ✓
- Spec § "Testing strategy" §2 test 2 (override-supersedes-default) → Task 3 step 2 test 2 ✓
- Spec § "Testing strategy" §2 test 3 (builder threading) → Task 3 step 2 test 3 ✓
**Out-of-scope items correctly absent:** No `EventKind::Scored` change, no `TimeSlice`/`run_chain` changes, no `Game::scored` standalone API change, no deprecation of `HistoryBuilder::score_sigma`.
**Type / signature consistency:**
- `Outcome::Scored { scores: SmallVec<[f64; 4]>, sigma: Option<f64> }` — Task 1 step 3 (def) and Task 1 step 5 (destructure) match ✓
- `Outcome::scores_with_sigma<I>(scores: I, sigma: f64) -> Outcome` — Task 1 step 3 (def) and Task 2 step 1 (call) match ✓
- `EventBuilder::scores_with_sigma<I>(mut self, scores: I, sigma: f64) -> Self` — Task 2 step 1 (def) and Task 3 step 2 test 3 (call) match ✓
- `sigma.unwrap_or(self.score_sigma)` resolution rule — Task 1 step 5 ✓
**Task split rationale:** Task 1 lands the foundational shape change AND the ingest resolution atomically — every commit boundary builds and tests pass bit-equal. Task 2 is the small additive EventBuilder method, separated for review-focus reasons (it's the user-facing fluent API exposure). Task 3 is purely additive integration tests. Each task is independently committable; no intermediate non-building state.
**No placeholders detected.**
@@ -1,444 +0,0 @@
# Tech Debt Cleanup Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Land three independent post-T4-MarginFactor cleanups: dedupe `Game::likelihoods` and `Game::likelihoods_scored` via a `run_chain` helper, make `BuiltinFactor::log_evidence` exhaustive, and fix stale numerics in the T4 plan doc.
**Architecture:** Pure code-shape and doc fixes. No public-API change, no behavioral change, no new dependencies. The dedup is a pure refactor — bit-equal posteriors and evidence against existing test goldens. The exhaustive match is a future-proofing change with no runtime effect. The doc fix is two number swaps in prose plus one matching code-comment swap.
**Tech Stack:** Rust 2024, `cargo +nightly fmt`, `cargo clippy`, `cargo test --lib`.
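The "exhaustive match as future-proofing" idea can be sketched on a small enum. The `Factor` type and its variants below are hypothetical and only illustrate the pattern; they are not the crate's `BuiltinFactor`.

```rust
/// Hypothetical factor enum; the variant names are illustrative only.
enum Factor {
    Trunc { evidence: f64 },
    Margin { evidence: f64 },
    Prior,
}

impl Factor {
    /// Every variant is listed and there is no `_` arm, so adding a new
    /// variant to `Factor` is a compile error here until this match is updated.
    fn log_evidence(&self) -> f64 {
        match self {
            Factor::Trunc { evidence } => evidence.ln(),
            Factor::Margin { evidence } => evidence.ln(),
            // Contributes no evidence: log(1) = 0.
            Factor::Prior => 0.0,
        }
    }
}
```

A wildcard arm would compile identically today, which is why the change has no runtime effect; the difference only shows up when a variant is added later.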
---
## Spec reference
`docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md`
## File map
| File | Why touched |
|---|---|
| `src/game.rs` | Add `run_chain` helper; rewrite `likelihoods` and `likelihoods_scored` to call it |
| `src/factor/mod.rs` | Make `BuiltinFactor::log_evidence` match exhaustive |
| `docs/superpowers/plans/2026-04-27-t4-margin-factor.md` | Fix two stale prose numbers and one matching code comment |
---
### Task 1: Extract `run_chain` helper, dedupe both likelihoods methods
**Files:**
- Modify: `src/game.rs:236-485` (replace both `likelihoods` and `likelihoods_scored` with one helper + two thin callers)
**Context for the implementer (read this before touching anything):**
`OwnedGame<T, D>` (defined at `src/game.rs:83-92`) holds `teams`, `result`, `weights`, `p_draw`, plus mutable output fields `likelihoods: Vec<Vec<Gaussian>>` and `evidence: f64`. Two private methods on `Game<'a, T, D>` (the borrowed sibling at `src/game.rs:148-156`) compute likelihoods:
- `likelihoods(&mut self, arena: &mut ScratchArena)` — ranked outcomes; `src/game.rs:236-371`
- `likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64)` — scored outcomes; `src/game.rs:373-485`
The two are bit-identical except for the closure that builds the per-diff `DiffFactor` (defined at `src/game.rs:20-54`). `DiffFactor` has two variants: `Trunc(TruncFactor)` for ranked, `Margin(MarginFactor)` for scored.
The shared body does, in order:
- `arena.reset()`
- sort teams descending by `result` into `arena.sort_buf`
- fill `arena.team_prior`
- build `links: Vec<DiffFactor>` (the differing block)
- resize `arena.lhood_lose` / `arena.lhood_win` to `n_teams`, filled with `N_INF`
- run a forward+backward sweep with a max-iter-10 fixed-point loop guarded by `tuple_gt(step, 1e-6)`
- handle the `n_diffs == 1` special case
- do boundary updates
- multiply per-diff `evidence()` into `self.evidence`
- build the inverse permutation in `arena.inv_buf`
- build `self.likelihoods` from the per-team `lhood_win * lhood_lose` and per-player `performance().exclude(...).forget(beta²)` math
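The sort and inverse-permutation steps of that sequence can be sketched on plain vectors. `sort_desc` and `inverse` are hypothetical helper names; the real code writes into `arena.sort_buf` and `arena.inv_buf` instead of allocating.

```rust
use std::cmp::Ordering;

/// Indices of `result` ordered from highest to lowest value.
/// Incomparable values (NaN) are treated as equal, as in the plan's sort.
fn sort_desc(result: &[f64]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..result.len()).collect();
    order.sort_by(|&i, &j| {
        result[j]
            .partial_cmp(&result[i])
            .unwrap_or(Ordering::Equal)
    });
    order
}

/// Inverse permutation: `inv[team]` is the sorted position of `team`.
fn inverse(order: &[usize]) -> Vec<usize> {
    let mut inv = vec![0; order.len()];
    for (pos, &team) in order.iter().enumerate() {
        inv[team] = pos;
    }
    inv
}
```

The inverse is what lets per-team results computed in sorted order be written back in the caller's original team order.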
**Refactor target:**
```rust
fn run_chain<F>(
&self,
arena: &mut ScratchArena,
mut make_link: F,
) -> (f64, Vec<Vec<Gaussian>>)
where
F: FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor,
{ /* the entire shared body, returning (evidence, likelihoods) */ }
```
Helper takes `&self` (not `&mut self`) so the closure can capture `&self.result`, `&self.teams`, `&self.weights`, `&self.p_draw` without conflicting with the helper's own immutable borrow. The arena is borrowed `&mut` independently.
The closure is invoked once per diff index `i ∈ 0..n_diffs`, after `arena.sort_buf` is filled. It receives `i`, `&arena.sort_buf[..]`, and `&mut arena.vars` so it can `alloc(N_INF)` the diff `VarId`. It returns the constructed `DiffFactor`.
The two callers shrink to:
```rust
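fn likelihoods(&mut self, arena: &mut ScratchArena) {
    // `run_chain` takes `&self` and the closure captures `self` immutably,
    // so both borrows coexist; `arena` is a separate `&mut` borrow.
    let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| {
        /* build the `DiffFactor::Trunc` link for diff `i` from `self.result`,
           `self.teams`, and `self.p_draw` */
    });
    // The shared borrow of `self` has ended, so the outputs can be assigned.
    self.evidence = evidence;
    self.likelihoods = likelihoods;
}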
```
**Borrow-checker note:** calling `self.run_chain(arena, |i, sort_buf, vars| { use_self_fields })` from a `&mut self` method is **fine** because `run_chain` takes `&self` and the closure captures `&self` immutably. Both share an immutable reborrow of `*self`. The arena is a separate `&mut` borrow. Make sure `run_chain` does not accidentally take `&mut self`.
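The borrow structure can be checked on a stripped-down model. Everything below is a hypothetical stand-in (`Arena`, `Game`, and an `f64` in place of `DiffFactor`); only the shape of the borrows mirrors the plan.

```rust
/// Hypothetical stand-ins: only the borrow structure mirrors the plan.
#[derive(Default)]
struct Arena {
    sort_buf: Vec<usize>,
    vars: Vec<f64>,
}

struct Game {
    result: Vec<f64>,
    evidence: f64,
}

impl Game {
    /// Takes `&self`; the arena is a separate `&mut` borrow.
    fn run_chain<F>(&self, arena: &mut Arena, mut make_link: F) -> f64
    where
        F: FnMut(usize, &[usize], &mut Vec<f64>) -> f64,
    {
        arena.sort_buf.clear();
        arena.sort_buf.extend(0..self.result.len());
        let n_diffs = self.result.len().saturating_sub(1);
        // `sort_buf` (shared) and `vars` (mutable) are disjoint fields,
        // so both can be lent to the closure in one call.
        (0..n_diffs)
            .map(|i| make_link(i, &arena.sort_buf, &mut arena.vars))
            .sum()
    }

    /// A `&mut self` caller: the closure only reads `self.result`, and the
    /// shared borrow ends before `self.evidence` is assigned.
    fn likelihoods(&mut self, arena: &mut Arena) {
        let evidence = self.run_chain(arena, |i, sort_buf, vars| {
            let d = self.result[sort_buf[i]] - self.result[sort_buf[i + 1]];
            vars.push(d);
            d
        });
        self.evidence = evidence;
    }
}
```

Changing `run_chain` to take `&mut self` would make the `likelihoods` call fail to compile, because the closure's shared capture of `self` would overlap the exclusive borrow.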
**Why a closure (not a trait, not a two-phase build).** A closure keeps caller-specific state (`p_draw`, `score_sigma`, beta sums) inline at the call site with zero ceremony. A trait would require a stateful builder per call. A two-phase build (caller produces `Vec<DiffFactor>` first, helper does the rest) would either re-do the sort or split arena ownership awkwardly between the phases.
---
- [ ] **Step 1: Run the existing test suite to capture the baseline**
Run: `cargo test --lib`
Expected: all tests pass. Note the count (should be 88+ lib tests) — the refactor must keep this number unchanged with all green.
- [ ] **Step 2: Open `src/game.rs` and add the `run_chain` helper**
Inside `impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> { ... }` (the block starting at `src/game.rs:158`), add `run_chain` immediately above the existing `likelihoods` method (so above line 236). Use exactly this body — it is the merge of the two existing methods with the differing block replaced by the closure call:
```rust
fn run_chain<F>(
&self,
arena: &mut ScratchArena,
mut make_link: F,
) -> (f64, Vec<Vec<Gaussian>>)
where
F: FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor,
{
arena.reset();
let n_teams = self.teams.len();
arena.sort_buf.extend(0..n_teams);
arena.sort_buf.sort_by(|&i, &j| {
self.result[j]
.partial_cmp(&self.result[i])
.unwrap_or(Ordering::Equal)
});
arena.team_prior.extend(arena.sort_buf.iter().map(|&t| {
self.teams[t]
.iter()
.zip(self.weights[t].iter())
.fold(N00, |p, (player, &w)| p + (player.performance() * w))
}));
let n_diffs = n_teams.saturating_sub(1);
let mut links: Vec<DiffFactor> = (0..n_diffs)
.map(|i| make_link(i, &arena.sort_buf, &mut arena.vars))
.collect();
arena.lhood_lose.resize(n_teams, N_INF);
arena.lhood_win.resize(n_teams, N_INF);
let mut step = (f64::INFINITY, f64::INFINITY);
let mut iter = 0;
while tuple_gt(step, 1e-6) && iter < 10 {
step = (0.0_f64, 0.0_f64);
for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
let pw = arena.team_prior[e] * arena.lhood_lose[e];
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
let raw = pw - pl;
arena.vars.set(lf.diff(), raw * lf.msg());
let d = lf.propagate(&mut arena.vars);
step = tuple_max(step, d);
let new_ll = pw - lf.msg();
step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
arena.lhood_lose[e + 1] = new_ll;
}
for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
let e = n_diffs - 1 - rev_i;
let pw = arena.team_prior[e] * arena.lhood_lose[e];
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
let raw = pw - pl;
arena.vars.set(lf.diff(), raw * lf.msg());
let d = lf.propagate(&mut arena.vars);
step = tuple_max(step, d);
let new_lw = pl + lf.msg();
step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
arena.lhood_win[e] = new_lw;
}
iter += 1;
}
if n_diffs == 1 {
let raw = (arena.team_prior[0] * arena.lhood_lose[0])
- (arena.team_prior[1] * arena.lhood_win[1]);
arena.vars.set(links[0].diff(), raw * links[0].msg());
links[0].propagate(&mut arena.vars);
}
if n_diffs > 0 {
let pl1 = arena.team_prior[1] * arena.lhood_win[1];
arena.lhood_win[0] = pl1 + links[0].msg();
let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
}
let evidence: f64 = links.iter().map(|l| l.evidence()).product();
arena.inv_buf.resize(n_teams, 0);
for (si, &orig_i) in arena.sort_buf.iter().enumerate() {
arena.inv_buf[orig_i] = si;
}
let likelihoods = self
.teams
.iter()
.zip(self.weights.iter())
.enumerate()
.map(|(orig_i, (players, weights))| {
let si = arena.inv_buf[orig_i];
let m = arena.lhood_win[si] * arena.lhood_lose[si];
let performance = players
.iter()
.zip(weights.iter())
.fold(N00, |p, (player, &w)| p + (player.performance() * w));
players
.iter()
.zip(weights.iter())
.map(|(player, &w)| {
((m - performance.exclude(player.performance() * w)) * (1.0 / w))
.forget(player.beta.powi(2))
})
.collect::<Vec<_>>()
})
.collect::<Vec<_>>();
(evidence, likelihoods)
}
```
- [ ] **Step 3: Replace `likelihoods` body with a thin caller**
In `src/game.rs`, replace the entire body of `fn likelihoods(&mut self, arena: &mut ScratchArena)` (currently lines 236-371 — replace from the opening `{` to the closing `}` of that method) with:
```rust
fn likelihoods(&mut self, arena: &mut ScratchArena) {
let p_draw = self.p_draw;
// The closure reads self.result and self.teams through shared borrows only.
// `arena` is a separate &mut borrow handed to run_chain, so nothing here
// conflicts with run_chain's `&self` receiver.
let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| {
let tie = self.result[sort_buf[i]] == self.result[sort_buf[i + 1]];
let margin = if p_draw == 0.0 {
0.0
} else {
let a: f64 = self.teams[sort_buf[i]]
.iter()
.map(|p| p.beta.powi(2))
.sum();
let b: f64 = self.teams[sort_buf[i + 1]]
.iter()
.map(|p| p.beta.powi(2))
.sum();
compute_margin(p_draw, (a + b).sqrt())
};
let vid = vars.alloc(N_INF);
DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))
});
self.evidence = evidence;
self.likelihoods = likelihoods;
}
```
(Capturing `p_draw` as a local binding before the closure avoids a `self.p_draw` borrow inside; it's a `Copy` `f64` so this is free.)
- [ ] **Step 4: Replace `likelihoods_scored` body with a thin caller**
In `src/game.rs`, replace the entire body of `fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64)` (currently lines 373-485) with:
```rust
fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64) {
let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| {
// After descending-by-score sort, m_obs >= 0 for every adjacent pair.
let m_obs = self.result[sort_buf[i]] - self.result[sort_buf[i + 1]];
let vid = vars.alloc(N_INF);
DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))
});
self.evidence = evidence;
self.likelihoods = likelihoods;
}
```
- [ ] **Step 5: Build to confirm it compiles**
Run: `cargo build`
Expected: compiles cleanly. If the borrow checker complains that the closure conflicts with `self.run_chain(...)`, the most likely cause is `run_chain` accidentally being `&mut self` — confirm its signature is `fn run_chain<F>(&self, arena: &mut ScratchArena, mut make_link: F) -> (f64, Vec<Vec<Gaussian>>)`. If that's correct and there's still a conflict, double-check the closure's captures: it should capture `&self.result` and `&self.teams` (immutable), `p_draw: f64` by value (Copy), and `score_sigma: f64` by value (Copy). It must NOT touch `&mut self` in any form.
- [ ] **Step 6: Run the full library test suite — must be all green, same count as Step 1**
Run: `cargo test --lib`
Expected: same number of tests as Step 1, all pass. Bit-equal goldens — every existing assertion (`test_1vs1`, `test_1vs1_draw`, `test_2vs1vs2_mixed`, MarginFactor end-to-end tests, etc.) must pass unchanged. If ANY test fails, the refactor is wrong; revert and re-inspect.
- [ ] **Step 7: Run integration tests too**
Run: `cargo test`
Expected: all integration tests pass (28 noted in commit `8b53cac`).
- [ ] **Step 8: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --lib -- -D warnings`
Expected: no diffs from fmt, no clippy warnings.
- [ ] **Step 9: Commit**
```bash
git add src/game.rs
git commit -m "$(cat <<'EOF'
refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain
Both methods were 95-line near-duplicates differing only in the closure
that builds the per-diff DiffFactor. Extract the shared body as a
private run_chain<F>(&self, arena, make_link) helper that returns
(evidence, likelihoods); the two callers shrink to ~10 lines each.
Pure code-shape change: posteriors and evidence remain bit-equal; all
existing tests (lib + integration) pass unchanged.
EOF
)"
```
---
### Task 2: Make `BuiltinFactor::log_evidence` match exhaustive
**Files:**
- Modify: `src/factor/mod.rs:94-100` (the `log_evidence` impl on `BuiltinFactor`)
- [ ] **Step 1: Open `src/factor/mod.rs` and replace the `log_evidence` body**
Replace the existing impl:
```rust
fn log_evidence(&self, vars: &VarStore) -> f64 {
match self {
Self::Trunc(f) => f.log_evidence(vars),
Self::Margin(f) => f.log_evidence(vars),
_ => 0.0,
}
}
```
with:
```rust
fn log_evidence(&self, vars: &VarStore) -> f64 {
match self {
Self::Trunc(f) => f.log_evidence(vars),
Self::Margin(f) => f.log_evidence(vars),
Self::TeamSum(_) | Self::RankDiff(_) => 0.0,
}
}
```
- [ ] **Step 2: Build and run tests**
Run: `cargo build && cargo test --lib`
Expected: compiles cleanly, all tests pass. Behavior is unchanged — `TeamSum` and `RankDiff` still return `0.0`, but a future variant will now produce a non-exhaustive-match error instead of being silently swallowed.
- [ ] **Step 3: Format and lint**
Run: `cargo +nightly fmt && cargo clippy --lib -- -D warnings`
Expected: no diffs, no warnings.
- [ ] **Step 4: Commit**
```bash
git add src/factor/mod.rs
git commit -m "$(cat <<'EOF'
refactor: make BuiltinFactor::log_evidence match exhaustive
Replace the `_ => 0.0` wildcard with explicit
`Self::TeamSum(_) | Self::RankDiff(_) => 0.0`. No behavioral change;
future variants now produce a compile error instead of being silently
absorbed by the wildcard.
EOF
)"
```
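As a small illustration of why Task 2 prefers explicit arms, here is a sketch on a hypothetical miniature enum. `MiniFactor` and its `f64` payloads are assumptions made for the example; the real `BuiltinFactor` variants carry factor structs.

```rust
// Hypothetical miniature of BuiltinFactor; payloads are simplified to an f64 evidence.
enum MiniFactor {
    Trunc(f64),
    Margin(f64),
    TeamSum,
    RankDiff,
}

fn log_evidence(factor: &MiniFactor) -> f64 {
    match factor {
        MiniFactor::Trunc(z) | MiniFactor::Margin(z) => z.ln(),
        // Listed explicitly: adding a fifth variant makes this match fail to
        // compile (error E0004) until a new arm is written for it.
        MiniFactor::TeamSum | MiniFactor::RankDiff => 0.0,
    }
}
```

With a `_ => 0.0` arm instead, a newly added evidence-bearing variant would compile and silently contribute `0.0`.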
---
### Task 3: Fix stale numerics in T4 plan doc
**Files:**
- Modify: `docs/superpowers/plans/2026-04-27-t4-margin-factor.md` (lines 52 and 185)
The shipped test in `src/factor/mod.rs:163,166` asserts:
```
assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
assert!((logz - (-3.062235327364623)).abs() < 1e-10);
```
The plan's prose at line 52 quotes pre-shipped values that no longer match. This task fixes the prose and the matching code-comment. The full-precision assertion blocks elsewhere in the plan are out of scope (they belong to the plan-as-written, and the spec's fix table only listed the rounded prose values).
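The corrected values can be reproduced independently of the crate. The sketch below assumes `pdf(x, mu, sigma)` in the plan means the ordinary normal density; `normal_pdf` and `z_cav` are names introduced here for illustration only.

```rust
use std::f64::consts::PI;

// Ordinary normal density; assumed to be what the plan's `pdf(x, mu, sigma)` denotes.
fn normal_pdf(x: f64, mu: f64, sigma: f64) -> f64 {
    let z = (x - mu) / sigma;
    (-0.5 * z * z).exp() / (sigma * (2.0 * PI).sqrt())
}

// Z_cav = pdf(5, 0, sqrt(36 + 1)), as in the plan's prose.
fn z_cav() -> f64 {
    normal_pdf(5.0, 0.0, (36.0_f64 + 1.0).sqrt())
}
```

Taking `z_cav().ln()` gives the log-evidence that the shipped test asserts against, and it rounds to the `-3.0622` that the corrected prose quotes.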
- [ ] **Step 1: Update the prose at line 52**
Open `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`. Find the line:
```
- `Z_cav = pdf(5, 0, sqrt(36 + 1)) = pdf(5, 0, sqrt(37)) ≈ 0.046827`. So `log_evidence ≈ -3.0613`.
```
Replace with:
```
- `Z_cav = pdf(5, 0, sqrt(36 + 1)) = pdf(5, 0, sqrt(37)) ≈ 0.04678`. So `log_evidence ≈ -3.0622`.
```
- [ ] **Step 2: Update the matching code-comment at line 185**
In the same file, find:
```
// pdf(5, 0, sqrt(37)) ≈ 0.046827
```
Replace with:
```
// pdf(5, 0, sqrt(37)) ≈ 0.04678
```
- [ ] **Step 3: Verify nothing else changed**
Run: `git diff docs/superpowers/plans/2026-04-27-t4-margin-factor.md`
Expected: exactly two lines changed (one prose line containing both numbers, one comment line). Nothing else should be touched.
- [ ] **Step 4: Commit**
```bash
git add docs/superpowers/plans/2026-04-27-t4-margin-factor.md
git commit -m "$(cat <<'EOF'
docs: fix stale numerics in t4-margin-factor plan
The plan's prose quoted Z_cav ≈ 0.046827 and log_evidence ≈ -3.0613,
which diverged from the values asserted by the shipped test in
src/factor/mod.rs (-3.062235327364623). Update prose and the matching
code comment to 0.04678 / -3.0622.
EOF
)"
```
---
## Self-review (writer's note)
Spec coverage:
- Spec Item 1 (dedupe `likelihoods`/`likelihoods_scored`) → Task 1 ✓
- Spec Item 2 (exhaustive `BuiltinFactor::log_evidence`) → Task 2 ✓
- Spec Item 3 (stale numerics in T4 plan) → Task 3 ✓
- Spec out-of-scope items (`DiffFactor` collapse, per-event `score_sigma`) — correctly absent ✓
Verification gates per the spec ("each item commits independently and ships behind a green `cargo test --lib`"): Tasks 1 and 2 end in fmt + clippy + tests + commit, and Task 1 additionally runs `cargo test` for integration coverage. Task 3 is docs-only and is verified with `git diff` before its commit.
Type / signature consistency:
- `run_chain` signature appears identically in the context header and Step 2 body ✓
- Closure type `FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor` matches across Step 2 (definition) and Steps 3/4 (call sites) ✓
- `DiffFactor::Trunc` / `DiffFactor::Margin` constructors match `src/game.rs:20-23` definitions ✓
No placeholders detected.
@@ -578,7 +578,7 @@ All renames and the new public API land together. No half-renamed intermediate s
Each shipped independently after T3.
- `MarginFactor` → enables `Outcome::Scored`. **Done** (see `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`).
- `MarginFactor` → enables `Outcome::Scored`.
- `Damped` and `Residual` schedules.
- `SynergyFactor`, `ScoreFactor` → same pattern when wanted.
@@ -1,320 +0,0 @@
# Damped EP — Game-Local Damping
## Summary
Add an opt-in EP damping knob to within-game inference. Users set
`ConvergenceOptions::alpha < 1.0` to damp message updates and stabilise
oscillating fixed-point loops on hard graphs. `alpha = 1.0` (the default)
is bit-equal to today.
This is the smallest-scope realisation of the spec's `Damped` schedule:
**game-local**, not plumbed through the `Schedule` trait. The `Schedule`
trait is shipped infrastructure that `run_chain` does not currently call;
wiring `Schedule` into game inference is a separate future task. This
design touches only what the user can actually reach via `GameOptions`.
## Scope
### What ships
1. New field `ConvergenceOptions::alpha: f64` (default `1.0`).
2. `run_chain` reads `options.convergence.{epsilon, max_iter, alpha}`
instead of the hardcoded `1e-6` / `10` / undamped — fixes the existing
latent bug where the first two were already on `GameOptions` but never
read by inference.
3. `Gaussian::damp_natural(self, new, alpha) -> Gaussian` — public helper
computing `α·new + (1−α)·self` in natural-parameter space.
4. `TruncFactor` and `MarginFactor` gain inherent
`propagate_with_alpha(&mut self, vars, alpha) -> (f64, f64)`. Their
`Factor::propagate` impls become one-line delegations passing
`alpha = 1.0`.
5. `DiffFactor::propagate` (game-private enum at `src/game.rs:20-54`)
gains an `alpha: f64` parameter and dispatches into the underlying
factor's `propagate_with_alpha`.
### What does not ship
- No `Damped` impl in `src/schedule.rs`. The `Schedule` trait stays as
it is; integration with `run_chain` is a separate task.
- No nat-param convergence switch. `(|Δmu|, |Δsigma|)` stays the
delta basis (matches today). The spec's "stopping in natural-param
space" wants its own design pass and test re-tuning.
- No oscillation auto-detect. `alpha` is user-supplied and constant for
the duration of a `run_chain` call.
- No `Residual`, `OneShot`, or `SynergyFactor` / `ScoreFactor` work —
separate future plans.
## Design
### `ConvergenceOptions::alpha`
```rust
// src/convergence.rs
#[derive(Clone, Copy, Debug)]
pub struct ConvergenceOptions {
pub max_iter: usize,
pub epsilon: f64,
pub alpha: f64,
}
impl Default for ConvergenceOptions {
fn default() -> Self {
Self {
max_iter: crate::ITERATIONS,
epsilon: crate::EPSILON,
alpha: 1.0,
}
}
}
```
`alpha = 1.0` ⇒ undamped (bit-equal to today). Recommended starting
point if a graph oscillates: `0.5` to `0.7`. Values approaching `0.0` make
each step tinier and slow convergence; `alpha = 0.0` is degenerate
(factor never updates). Validation in `run_chain`:
```rust
debug_assert!(
opts.convergence.alpha > 0.0 && opts.convergence.alpha <= 1.0,
"convergence alpha must be in (0.0, 1.0]"
);
```
### `Gaussian::damp_natural`
```rust
impl Gaussian {
/// EP damping in natural-parameter space: `α·new + (1−α)·self`.
///
/// Used by within-game schedules to stabilise oscillating fixed-point
/// loops on hard graphs. `alpha = 1.0` returns `new` exactly;
/// `alpha < 1.0` shrinks each per-step update.
pub fn damp_natural(self, new: Gaussian, alpha: f64) -> Gaussian {
Gaussian::from_natural(
alpha * new.pi() + (1.0 - alpha) * self.pi(),
alpha * new.tau() + (1.0 - alpha) * self.tau(),
)
}
}
```
Public on `Gaussian`. The name encodes the WHY (EP damping); the doc
comment fixes the math. No new dependency.
The existing `Mul<f64> for Gaussian` is **distribution scaling**
(`sigma → sigma·|scalar|`), not nat-param interpolation, so it can't be
reused here.
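To make the interpolation concrete, here is a self-contained sketch on a hypothetical `NatGaussian` that stores `pi` and `tau` directly. The crate's `Gaussian` may store its parameters differently, so treat the struct layout as an assumption; only the `damp_natural` formula is taken from the design above.

```rust
// Minimal natural-parameter Gaussian; a stand-in for the crate's `Gaussian`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NatGaussian {
    pi: f64,  // precision, 1 / sigma^2
    tau: f64, // precision-adjusted mean, mu / sigma^2
}

impl NatGaussian {
    fn mu(self) -> f64 {
        self.tau / self.pi
    }

    // alpha * new + (1 - alpha) * self, componentwise on (pi, tau).
    fn damp_natural(self, new: NatGaussian, alpha: f64) -> NatGaussian {
        NatGaussian {
            pi: alpha * new.pi + (1.0 - alpha) * self.pi,
            tau: alpha * new.tau + (1.0 - alpha) * self.tau,
        }
    }
}
```

For example, damping from `(pi, tau) = (1, 0)` (mean 0) toward `(4, 8)` (mean 2) with `alpha = 0.5` yields `(2.5, 4)`, whose mean is 1.6 rather than the midpoint 1.0. The interpolation is linear in natural parameters, not in `mu`/`sigma`.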
### `TruncFactor::propagate_with_alpha`
```rust
impl TruncFactor {
pub(crate) fn propagate_with_alpha(
&mut self,
vars: &mut VarStore,
alpha: f64,
) -> (f64, f64) {
let marginal = vars.get(self.diff);
let cavity = marginal / self.msg;
if self.evidence_cached.is_none() {
self.evidence_cached = Some(cavity_evidence(cavity, self.margin, self.tie));
}
let trunc = approx(cavity, self.margin, self.tie);
let new_msg = trunc / cavity;
let damped = self.msg.damp_natural(new_msg, alpha);
let old_msg = self.msg;
self.msg = damped;
// marginal_new = cavity * stored_msg (NOT cavity * new_msg with damping)
vars.set(self.diff, cavity * damped);
old_msg.delta(damped)
}
}
impl Factor for TruncFactor {
fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
self.propagate_with_alpha(vars, 1.0)
}
}
```
Two important points:
- The variable receives `cavity * damped` (i.e. `cavity * self.msg`),
not `trunc`. With `alpha = 1.0` these are equal (since
`cavity * new_msg = trunc` by construction), so today's behaviour is
preserved bit-equal. With `alpha < 1.0` the marginal reflects the
partially-applied update.
- The reported delta is `old_msg.delta(damped)` — delta of the actually
stored message, not of the raw `new_msg`. This is the textbook EP
damping convention: the convergence loop measures the trajectory it
is actually walking.
`MarginFactor` follows the same shape, with its own
`propagate_with_alpha` body (the existing `propagate` math, with the
`damp_natural` step inserted in the same place and the var write
switched to `cavity * damped`).
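The two points above can be sketched without the factor machinery. In this illustration `Nat`, `mul`, `div`, and `damped_update` are hypothetical names, and the moment-matched target (the plan's `trunc`) is passed in as a plain value instead of being computed by `approx`. It assumes Gaussian multiplication and division act as addition and subtraction of natural parameters.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Nat {
    pi: f64,
    tau: f64,
}

impl Nat {
    // Product of Gaussians: natural parameters add.
    fn mul(self, o: Nat) -> Nat {
        Nat { pi: self.pi + o.pi, tau: self.tau + o.tau }
    }
    // Quotient of Gaussians: natural parameters subtract.
    fn div(self, o: Nat) -> Nat {
        Nat { pi: self.pi - o.pi, tau: self.tau - o.tau }
    }
    fn damp(self, new: Nat, alpha: f64) -> Nat {
        Nat {
            pi: alpha * new.pi + (1.0 - alpha) * self.pi,
            tau: alpha * new.tau + (1.0 - alpha) * self.tau,
        }
    }
}

// Returns (stored message, marginal written back to the variable).
fn damped_update(cavity: Nat, old_msg: Nat, target: Nat, alpha: f64) -> (Nat, Nat) {
    let new_msg = target.div(cavity);
    let damped = old_msg.damp(new_msg, alpha);
    // The variable gets cavity * stored message, not the raw target.
    (damped, cavity.mul(damped))
}
```

With `alpha = 1.0` the written marginal equals the target; with `alpha = 0.5` it lands between the old marginal and the target. The example values in the test are dyadic so the equalities are exact; with arbitrary values the `alpha = 1.0` path can differ from the target by rounding, which is the concern listed under Risks.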
### `DiffFactor::propagate` signature
```rust
// src/game.rs
impl DiffFactor {
pub(crate) fn propagate(
&mut self,
vars: &mut VarStore,
alpha: f64,
) -> (f64, f64) {
match self {
Self::Trunc(f) => f.propagate_with_alpha(vars, alpha),
Self::Margin(f) => f.propagate_with_alpha(vars, alpha),
}
}
}
```
`DiffFactor` is `pub(crate)` and only used inside `run_chain`, so the
signature change has no public-API impact.
### `run_chain` changes
Inside `Game::run_chain` (`src/game.rs:236-348`):
1. Capture `let alpha = opts.convergence.alpha;` once at the top
(avoids repeated `opts.convergence.alpha` lookups in the hot loop).
2. Replace the loop guard
`while tuple_gt(step, 1e-6) && iter < 10`
with
`while tuple_gt(step, opts.convergence.epsilon) && iter < opts.convergence.max_iter`.
3. Replace each `lf.propagate(&mut arena.vars)` call site (three of
them: forward sweep, backward sweep, `n_diffs == 1` special case)
with `lf.propagate(&mut arena.vars, alpha)`.
The threading of `opts: &GameOptions` into `run_chain` is the only
new caller obligation. Today `run_chain` doesn't take `opts`; the two
callers (`likelihoods`, `likelihoods_scored`) currently invoke it
without options. Both will need to pass the options through. The
`Game<'a, T, D>` struct does not currently hold `GameOptions`; the
options are constructed and discarded around the call to
`{ranked,scored}_with_arena`. So:
- `Game::ranked_with_arena` and `Game::scored_with_arena` already
receive `p_draw` / `score_sigma` as scalar params; we extend them to
accept `&ConvergenceOptions` (or the full `&GameOptions`) too.
- `likelihoods` / `likelihoods_scored` either store the options on
`Game` or accept them as method parameters and forward to
`run_chain`.
The simplest plumbing: store `convergence: ConvergenceOptions` as a
field on `Game<'a, T, D>` and `OwnedGame<T, D>` populated at
construction time. Then `run_chain` can read it from `&self`.
## Convergence semantics
With `alpha < 1.0` the per-step update shrinks; convergence may take
more iterations to reach the same `epsilon` threshold. Users who damp
should also raise `max_iter` accordingly. Documentation example:
```rust
let mut opts = GameOptions::default();
opts.convergence.alpha = 0.5;
opts.convergence.max_iter = 30;
```
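Why `max_iter` should rise with damping can be seen on a toy scalar iteration. This is only an analogy under stated assumptions: the map `x <- 0.5 * x + 1` (fixed point 2) stands in for one EP sweep, and `iterate` is a hypothetical helper, not part of the crate.

```rust
// Damped fixed-point iteration on a toy contraction; returns (x, iterations used).
fn iterate(alpha: f64, epsilon: f64, max_iter: usize) -> (f64, usize) {
    let f = |x: f64| 0.5 * x + 1.0;
    let mut x = 0.0_f64;
    let mut step = f64::INFINITY;
    let mut iter = 0;
    while step > epsilon && iter < max_iter {
        // Damped update: move only a fraction alpha of the way to f(x).
        let next = alpha * f(x) + (1.0 - alpha) * x;
        step = (next - x).abs();
        x = next;
        iter += 1;
    }
    (x, iter)
}
```

On this toy map, `iterate(1.0, 1e-6, 30)` stops on the epsilon test before reaching the cap, while `iterate(0.5, 1e-6, 30)` exhausts all 30 iterations without meeting it. Raising the cap lets the damped run reach the same fixed point.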
## Testing strategy
### Regression net (no new file)
The existing 88 lib tests and 27 integration tests are the bit-equal
regression net. With `alpha = 1.0` (the default), every assertion must
pass unchanged. If any test fails, the damping path leaked into the
undamped trajectory.
### New tests
1. **`Gaussian::damp_natural` arithmetic**
(`src/gaussian.rs` test mod):
- `α = 1.0` returns `new` exactly (bit-equal `pi` and `tau`).
- `α = 0.0` returns `self` exactly.
- `α = 0.5`: pi and tau are exact midpoints in nat-param space.
- Three asserts, no new file.
2. **`TruncFactor::propagate_with_alpha` shrinks the step**
(`src/factor/trunc.rs` test mod):
- Set up a TruncFactor step. Run `propagate_with_alpha(α=1.0)` once,
record `delta_undamped` and the resulting `self.msg`.
- Reset to a fresh factor at the same starting state. Run
`propagate_with_alpha(α=0.5)` once, record `delta_damped` and
`damped_msg`.
- Assert: `damped_msg.pi()` equals `0.5 * undamped_msg.pi() + 0.5 * initial_msg.pi()` within 1e-12 (and same for `tau`).
- Assert: `delta_damped.0 <= delta_undamped.0` (mu-delta is no larger; the relationship is monotone in `α` but not strictly `0.5×` for the `delta()` function which is `(|Δmu|, |Δsigma|)`).
3. **`MarginFactor::propagate_with_alpha` parity**
(`src/factor/margin.rs` test mod):
- Same shape as #2, on a `MarginFactor` step.
4. **`run_chain` honours `ConvergenceOptions::max_iter`**
(in an existing or new game-level test):
- Construct a 4-team ranked game that normally converges in ~5 iterations.
- Set `opts.convergence.max_iter = 1`. Assert the per-iteration
`step` returned (or observable indirectly via posterior delta vs.
the converged answer) is non-zero — i.e. the loop stopped early.
- Set `opts.convergence.max_iter = 30`. Assert posteriors match the
baseline within `epsilon`.
5. **Damping default is `1.0` and produces bit-equal output**
(smoke test, can be a single assertion in an existing test):
- `assert_eq!(ConvergenceOptions::default().alpha, 1.0);`
- Existing goldens prove the bit-equality.
No oscillation-stabilisation test (would require constructing a
pathological graph specifically to oscillate; out of scope for a
minimal ship).
## Verification gates
Per task:
```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test --lib
cargo test
```
All must succeed. Test count grows by exactly the new tests above
(roughly +5 to 8 lib tests).
## Risks
- **Marginal-update change is subtle.** Switching the variable write
from `trunc` to `cavity * damped` is intentionally a no-op when
`alpha = 1.0` (since `cavity * new_msg = trunc`), but it changes the
arithmetic path. If `Gaussian` arithmetic has any non-associativity
in floating-point that the old form happened to dodge, goldens could
shift by 1 ULP. Mitigation: TDD — write the regression test (run all
existing tests with `alpha = 1.0`) **first**, before changing the
variable-write line.
- **`run_chain` signature change ripples to two callers.** Trivial
but must be done atomically with the field addition on `Game` /
`OwnedGame`.
- **`alpha` validation only in debug builds.** A release build will
silently accept `alpha = 0.0` or `alpha > 1.0` and produce nonsense.
This matches the existing pattern (`debug_assert!` for input
validation in `Game::ranked_with_arena`); upgrading to `Result` is
out of scope.
## Out-of-scope follow-ups (logged for future plans)
- Wire `Schedule` into `run_chain` (so `Damped` lands as a real
`Schedule` impl alongside `EpsilonOrMax`).
- Switch convergence check to `(|Δpi|, |Δtau|)` per spec
§"Stopping in natural-param space".
- Oscillation auto-detect (engage `alpha < 1.0` only after N
non-monotone steps).
- `Residual` schedule (priority queue).
- `SynergyFactor`, `ScoreFactor` (new EP factor types).
@@ -1,232 +0,0 @@
# History → TimeSlice ConvergenceOptions Plumbing
## Summary
Make `History`'s already-public `ConvergenceOptions` (set via
`HistoryBuilder::convergence(...)`) actually reach the within-game
inference loop. Today it's read by the outer `History::converge` sweep
but dropped on the floor when constructing `TimeSlice`s, so users who
opt in to `alpha < 1.0` (Damped EP) on a `History` get nothing — the
inner `run_chain` calls inside `TimeSlice` hardcode
`ConvergenceOptions::default()`.
This spec closes the gap with one focused change: thread
`ConvergenceOptions` from `History` through `TimeSlice` to the three
`Game::*_with_arena` callsites in `time_slice.rs`. No new types, no new
public methods on `History` or `HistoryBuilder` — the user-facing API
already exists.
## Background
After T5 (commit `0705986`) of the Damped EP plan,
`Game::*_with_arena` accepts `convergence: ConvergenceOptions` and
`run_chain` reads `self.convergence.{epsilon, max_iter, alpha}`.
`HistoryBuilder` already has a `convergence(opts)` method (`history.rs:91`)
that stores onto a field on `History`. `History::converge` reads
`self.convergence.{max_iter, epsilon}` for its outer cross-history loop
(`history.rs:437-447`).
The break is here, in `History::add_events_with_prior` at `history.rs:597`:
```rust
let mut time_slice = TimeSlice::new(t, self.p_draw);
```
`self.convergence` is not passed. `TimeSlice` has no convergence field.
The three callsites in `time_slice.rs` that build `Game::*_with_arena`
fall back to `ConvergenceOptions::default()`:
- `Event::iteration_direct` (`time_slice.rs:138-156`)
- `TimeSlice::convergence` (`time_slice.rs:332-345`)
- `TimeSlice::log_evidence` (`time_slice.rs:521-538`)
## Scope
### What ships
1. `TimeSlice<T>` gains a `pub(crate) convergence: ConvergenceOptions`
field set at construction.
2. `TimeSlice::new` signature becomes
`pub fn new(time: T, p_draw: f64, convergence: ConvergenceOptions) -> Self`.
3. `History::add_events_with_prior` (`history.rs:597`) passes
`self.convergence` when constructing new `TimeSlice`s.
4. `Event::iteration_direct` gains a `convergence: ConvergenceOptions`
parameter and forwards it to the `Game::*_with_arena` callsite.
The two callers (`TimeSlice::iteration` at `time_slice.rs:419` and
`:441`) pass `self.convergence`.
5. `TimeSlice::convergence` (the method, not the field) replaces its
hardcoded `crate::ConvergenceOptions::default()` with
`self.convergence`.
6. `TimeSlice::log_evidence` does the same.
7. Five test callsites of `TimeSlice::new(time, p_draw)` updated
mechanically to `TimeSlice::new(time, p_draw, ConvergenceOptions::default())`.
### What does not ship
- No split of `ConvergenceOptions` into outer/inner fields. The
conflation (one `max_iter` covers both the cross-history sweep and
the per-game EP iteration cap) is the user-confirmed design.
- No `Damped` impl in `src/schedule.rs`. The `Schedule` trait is still
not integrated into `run_chain`.
- No nat-param convergence switch.
- No oscillation auto-detect.
- No new `History` or `HistoryBuilder` methods. `convergence(opts)`
already exists and works.
- No changes to `History::converge` — the outer-loop semantics are
unchanged (it already reads `self.convergence`).
## Design
### `TimeSlice<T>` field
```rust
// src/time_slice.rs
pub struct TimeSlice<T: Time = i64> {
// ... existing fields ...
p_draw: f64,
pub(crate) convergence: ConvergenceOptions,
// ... existing fields ...
}
```
### `TimeSlice::new`
```rust
impl<T: Time> TimeSlice<T> {
pub fn new(time: T, p_draw: f64, convergence: ConvergenceOptions) -> Self {
Self {
// ... existing initialisation ...
p_draw,
convergence,
// ...
}
}
}
```
### `History::add_events_with_prior` — single-line fix
At `src/history.rs:597`:
```rust
// before
let mut time_slice = TimeSlice::new(t, self.p_draw);
// after
let mut time_slice = TimeSlice::new(t, self.p_draw, self.convergence);
```
### `Event::iteration_direct` parameter
```rust
// src/time_slice.rs
impl Event {
pub(crate) fn iteration_direct(
&mut self,
skills: &mut SkillStore,
agents: &CompetitorStore<i64, ConstantDrift>,
p_draw: f64,
convergence: ConvergenceOptions,
arena: &mut ScratchArena,
) -> /* existing return */ {
// ... existing body, with the Game::*_with_arena calls
// using `convergence` instead of ConvergenceOptions::default() ...
}
}
```
The two callers — `TimeSlice::iteration` at `time_slice.rs:419` and
`:441` — already have `&mut self` access, so they pass
`self.convergence`.
### `TimeSlice::convergence` method (not the field)
The method `pub(crate) fn convergence<D>(&mut self, agents: ...) -> usize`
at `time_slice.rs:447` shares its name with the new field. Rust allows
this (methods and fields live in different namespaces), but it's a
readability hazard. Rename the method to `iterate_to_convergence` to
disambiguate.
This is one rename, six callsites in `history.rs` and the test module.
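The name clash that motivates the rename is legal Rust, as this small sketch shows. `Options` and `Slice` are hypothetical stand-ins for `ConvergenceOptions` and `TimeSlice`, and the method body is a placeholder.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Options {
    max_iter: usize,
}

struct Slice {
    convergence: Options, // field
}

impl Slice {
    // Method with the same name as the field: it compiles, but
    // `s.convergence` and `s.convergence()` differ only by parentheses.
    fn convergence(&mut self) -> usize {
        self.convergence.max_iter // field access (no parentheses)
    }
}
```

Both `s.convergence.max_iter` and `s.convergence()` resolve without ambiguity for the compiler; the rename to `iterate_to_convergence` is purely for the human reader.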
### Field semantics
`History` keeps the single shared `ConvergenceOptions` struct. The same
`max_iter` covers both the outer sweep and each inner per-game loop.
The same `epsilon` covers both stopping criteria. The `alpha` field is
read only inside `run_chain` (the inner loop); the outer loop
intentionally ignores `alpha` because cross-history damping is a
different mathematical concept and not in scope.
## Testing strategy
### Regression net
The existing 98 lib + 27 integration tests are the bit-equal regression
net. Default `ConvergenceOptions` is unchanged
(`max_iter=30, epsilon=1e-6, alpha=1.0`), and `TimeSlice` was already
using exactly that since T5. The only behavioural difference is for
users who actually pass non-default options through
`HistoryBuilder::convergence(...)` — and there are no current tests that
do that **and** compare posteriors, so all goldens stay bit-equal.
### New tests
1. **`history_propagates_convergence_to_inner_run_chain`** (in
`src/history.rs` test module):
- Build a History with `convergence(ConvergenceOptions { max_iter: 1, ..Default::default() })`.
- Add a small batch of events that needs more than one inner EP iteration to converge (e.g. a 4-team game per slice).
- `converge()`, capture posteriors.
- Build a fresh History with default options on the same events.
- `converge()`, capture posteriors.
- Assert the two sets of posteriors differ measurably (max diff > 1e-6).
- Proves the inner loop honours the propagated `max_iter`. Today (without this change) the assertion would fail because both Histories use default inside.
2. **`history_with_damping_reaches_same_fixed_point_as_undamped`** (same
test module):
- Build a History with `convergence(ConvergenceOptions { alpha: 0.5, max_iter: 200, ..Default::default() })`.
- Same events as above.
- `converge()`, capture posteriors.
- Build a default-options History on the same events.
- `converge()`, capture posteriors.
- Assert per-player posteriors agree within 1e-3.
- Proves damping doesn't break convergence on the History path.
If the second test's max diff is too large, raise `max_iter` further
(damping needs more iterations to reach the same fixed point).
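The shape of the first test can be sketched on toy types. Everything here is a hypothetical miniature: `MiniHistory` and `MiniSlice` stand in for `History` and `TimeSlice`, `alpha` is omitted, and the inner loop is the scalar map `x <- 0.5 * x + 1` rather than real EP. The default values mirror the ones quoted in the regression-net paragraph.

```rust
#[derive(Clone, Copy, Debug)]
struct ConvergenceOptions {
    max_iter: usize,
    epsilon: f64,
}

impl Default for ConvergenceOptions {
    fn default() -> Self {
        Self { max_iter: 30, epsilon: 1e-6 }
    }
}

struct MiniSlice {
    convergence: ConvergenceOptions,
}

impl MiniSlice {
    // Toy inner loop standing in for run_chain.
    fn run_inner(&self) -> f64 {
        let (mut x, mut step, mut iter) = (0.0_f64, f64::INFINITY, 0);
        while step > self.convergence.epsilon && iter < self.convergence.max_iter {
            let next = 0.5 * x + 1.0;
            step = (next - x).abs();
            x = next;
            iter += 1;
        }
        x
    }
}

struct MiniHistory {
    convergence: ConvergenceOptions,
    slices: Vec<MiniSlice>,
}

impl MiniHistory {
    fn new(convergence: ConvergenceOptions) -> Self {
        Self { convergence, slices: Vec::new() }
    }
    // The fix under test: forward the history's options into each new slice.
    fn add_slice(&mut self) {
        self.slices.push(MiniSlice { convergence: self.convergence });
    }
    fn results(&self) -> Vec<f64> {
        self.slices.iter().map(MiniSlice::run_inner).collect()
    }
}
```

A history built with `max_iter: 1` and one built with defaults then produce measurably different results. If `add_slice` ignored `self.convergence` and used `ConvergenceOptions::default()`, the two would be identical.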
## Verification gates
```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test --lib
cargo test
```
All must succeed. Test count grows by exactly 2 (the two new tests).
## Risks
- **`TimeSlice::new` is `pub`.** Adding the third parameter is a
breaking change to a public constructor. In a 0.1.x crate this is
acceptable, but flag it in the commit message.
- **`TimeSlice::convergence` method rename.** Renaming
`convergence` → `iterate_to_convergence` touches `history.rs` and the
TimeSlice test module. The rename is mechanical and improves
readability where the field and method would otherwise share a name.
- **Cross-history alpha semantics.** A user who sets `alpha = 0.5` on
a `History` gets damping inside every per-game loop, but the outer
`History::converge` sweep is undamped. This is the correct semantic
(alpha is a within-EP-graph concept) but it's worth documenting in
the `ConvergenceOptions::alpha` doc comment so users don't expect
cross-slice damping. Add one sentence to the existing doc comment.
## Out-of-scope follow-ups
- Wire `Schedule` trait into `run_chain` — Damped becomes a `Schedule`
impl alongside `EpsilonOrMax`.
- Per-loop `ConvergenceOptions` split (outer / inner).
- `Residual` schedule.
- Per-event `EventKind::Scored.score_sigma` override (still
history-wide today).
@@ -1,292 +0,0 @@
# Per-Event `score_sigma` Override
## Summary
Let users specify a per-event noise override on `Outcome::Scored`.
Today every scored event in a `History` shares the single
`HistoryBuilder::score_sigma` value (default `1.0`); a user who wants
to say "this match was a clean blowout, trust the margin more" or
"this one was a disrupted scrappy game, trust it less" has no way to
do so.
The override is resolved at ingest time and stored as a plain `f64`
on the existing `EventKind::Scored { score_sigma }` payload, so
`TimeSlice` and `run_chain` need zero changes. The work is purely on
the public API surface: `Outcome::Scored` becomes a struct variant
with an `Option<f64> sigma` field; two builder methods on `Outcome`
and `EventBuilder` cover the explicit-override path.
## Background
`Outcome::Scored(SmallVec<[f64; 4]>)` is the public per-team-score
variant (`src/outcome.rs:20`). It's constructed via
`Outcome::scores(I)` (`src/outcome.rs:44`) or
`EventBuilder::scores(I)` (`src/event_builder.rs:79`).
When `History::add_events` ingests a Scored outcome, it always uses
the history-wide default:
```rust
// src/history.rs:735-740
crate::Outcome::Scored(scores) => {
kinds.push(EventKind::Scored {
score_sigma: self.score_sigma,
});
scores.to_vec()
}
```
The downstream `EventKind::Scored { score_sigma: f64 }`
(`src/time_slice.rs:51`) is already per-event-shaped — every Event
carries its own copy. The constraint is purely at the ingest boundary.
This was flagged as deferred tech debt during the T4-MarginFactor
work: "EventKind::Scored.score_sigma payload is always history-wide
today; per-event override deferred."
## Scope
### What ships
1. `Outcome::Scored` becomes a struct variant:
`Scored { scores: SmallVec<[f64; 4]>, sigma: Option<f64> }`.
`None` = use history default; `Some(s)` = override.
2. New constructor `Outcome::scores_with_sigma(scores, sigma)` on
`Outcome`. Existing `Outcome::scores(I)` keeps the same shape but
builds with `sigma: None`.
3. New builder method `EventBuilder::scores_with_sigma(scores, sigma)`
on `EventBuilder`.
4. `History::add_events` resolves `sigma.unwrap_or(self.score_sigma)`
when converting an `Outcome::Scored` to `EventKind::Scored`.
5. Mechanical pattern-match updates at every site that destructures
`Outcome::Scored(...)` as a tuple. Estimate ~5 to 10 sites across
`src/`, `tests/`, `examples/`, `benches/`.
### What does not ship
- No change to `EventKind::Scored` (already per-event).
- No change to `TimeSlice` or `run_chain`.
- No change to `Game::scored` standalone API
(it still takes `score_sigma` via `GameOptions::score_sigma`).
- No deprecation of `HistoryBuilder::score_sigma` — the history-wide
default is still useful as a common-case fallback.
## Design
### `Outcome` enum change
```rust
// src/outcome.rs
#[derive(Clone, Debug)]
pub enum Outcome {
Ranked(SmallVec<[u32; 4]>),
Scored {
scores: SmallVec<[f64; 4]>,
/// Per-event noise override. `None` means inherit
/// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
sigma: Option<f64>,
},
}
```
The variant shape changes from tuple to struct. Pattern matches that
extract the scores switch from `Outcome::Scored(scores)` to
`Outcome::Scored { scores, .. }` (or `{ scores, sigma }` where the
sigma is needed).
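The mechanical update can be sketched on a simplified copy of the enum. Assumptions in this sketch: `Vec` replaces `SmallVec` so it has no dependencies, the accessor bodies are guesses at the shape rather than the crate's code, and `sigma_override` is a hypothetical helper added only to show the `{ sigma, .. }` form.

```rust
/// Simplified stand-in for the crate's `Outcome` (assumption: `Vec`
/// instead of `SmallVec`).
#[derive(Clone, Debug)]
enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

impl Outcome {
    /// Before the change this arm would read `Outcome::Scored(scores)`.
    fn as_scores(&self) -> Option<&[f64]> {
        match self {
            Outcome::Scored { scores, .. } => Some(scores.as_slice()),
            Outcome::Ranked(_) => None,
        }
    }

    fn team_count(&self) -> usize {
        match self {
            Outcome::Ranked(ranks) => ranks.len(),
            Outcome::Scored { scores, .. } => scores.len(),
        }
    }

    /// Hypothetical accessor: a site that needs the override binds `sigma`.
    fn sigma_override(&self) -> Option<f64> {
        match self {
            Outcome::Scored { sigma, .. } => *sigma,
            Outcome::Ranked(_) => None,
        }
    }
}
```

Sites that only read the scores ignore the new field with `..`, so they stay valid if further fields are added to the variant later.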
### `Outcome` constructors
```rust
impl Outcome {
/// Per-team continuous scores; uses HistoryBuilder::score_sigma default.
pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
Self::Scored {
scores: scores.into_iter().collect(),
sigma: None,
}
}
/// Per-team scores with explicit per-event noise override.
///
/// `sigma` must be `> 0.0`; checked with `debug_assert!` (debug builds only).
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
scores: I,
sigma: f64,
) -> Self {
debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
Self::Scored {
scores: scores.into_iter().collect(),
sigma: Some(sigma),
}
}
}
```
`Outcome::scores(I)` keeps the existing function signature exactly —
its only behavioural change is the internal struct construction. The
existing `as_scores()`, `team_count()`, etc. accessors keep their
public signatures (they return `Option<&[f64]>` and `usize`); their
internal pattern matches update mechanically.
### `EventBuilder` method
```rust
impl<'h, T, D, O, K> EventBuilder<'h, T, D, O, K>
where
T: Time,
D: Drift<T>,
O: Observer<T>,
K: Eq + std::hash::Hash + Clone,
{
/// Per-team scores; uses HistoryBuilder::score_sigma default.
pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
self.event.outcome = crate::Outcome::scores(scores);
self
}
/// Per-team scores with explicit per-event noise override.
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
mut self,
scores: I,
sigma: f64,
) -> Self {
self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
self
}
}
```
The existing `.scores(...)` builder method stays — its body changes
trivially because `Outcome::scores(I)` still has the same signature.
`.scores_with_sigma(...)` is the new method.
### Sigma resolution
In `History::add_events` at `src/history.rs:735`:
```rust
crate::Outcome::Scored { scores, sigma } => {
let resolved = sigma.unwrap_or(self.score_sigma);
debug_assert!(
resolved > 0.0,
"resolved score_sigma must be > 0.0 (got {resolved})"
);
kinds.push(EventKind::Scored {
score_sigma: resolved,
});
scores.to_vec()
}
```
Resolution at ingest time means downstream code keeps a plain `f64`.
No `Option` propagates further.
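The resolution rule is small enough to show as a free function. This is a sketch under assumptions: `resolve_sigma` and `resolve_all` are hypothetical names, and the real logic sits inline in `History::add_events`.

```rust
/// Resolve the noise for one scored event: a per-event override wins,
/// otherwise the history-wide default applies.
/// Hypothetical helper; the design keeps this inline at ingest.
fn resolve_sigma(per_event: Option<f64>, history_default: f64) -> f64 {
    let resolved = per_event.unwrap_or(history_default);
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {resolved})"
    );
    resolved
}

/// Resolve a batch of events up front, so everything downstream
/// sees a plain `f64` per event and never an `Option`.
fn resolve_all(overrides: &[Option<f64>], history_default: f64) -> Vec<f64> {
    overrides
        .iter()
        .map(|s| resolve_sigma(*s, history_default))
        .collect()
}
```

For example, with a history default of `0.5`, the overrides `[None, Some(2.0)]` resolve to `[0.5, 2.0]`.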
### Validation
- `Outcome::scores_with_sigma(_, sigma)` debug-asserts `sigma > 0.0`
at construction.
- `History::add_events` debug-asserts the resolved sigma is `> 0.0`
(catches both inherited and overridden paths).
- `HistoryBuilder::score_sigma(s)` keeps its existing positive
assertion.
The default sigma at the History level (`1.0`) is positive, so an
event with `sigma = None` against a default-built History always
passes the resolved-sigma assertion trivially.
### Pattern-match update inventory
Every site that destructures `Outcome::Scored(_)` as a tuple needs
updating. Known sites:
- `src/outcome.rs`: the `team_count()`, `as_scores()`, `as_ranks()`
match arms (`src/outcome.rs:51`, `:58`, `:64`).
- `src/history.rs:735`: the conversion arm (this is also where the
resolution rule lands).
- Any test in `src/outcome.rs` test mod that constructs
`Outcome::Scored(...)` literally.
- Any callsite elsewhere in `src/` (including `src/game.rs`), `tests/`,
`examples/`, or `benches/` that pattern-matches the variant.
The compiler surfaces every site at `cargo build`. Locating them is
mechanical.
## Testing strategy
### Regression net
Existing 100 lib + 27 integration tests are the bit-equal regression
net for the `sigma = None` path. Every existing test that uses
`Outcome::scores(...)` or `EventBuilder::scores(...)` should
continue to produce identical posteriors — the resolved sigma equals
the history default (which equals what the hardcoded path produced).
### New tests
Three additions in the `src/history.rs` test module:
1. **`outcome_scores_default_sigma_uses_history_default`** — build a
History with `score_sigma(0.5)`, add a 2-team event via
`Outcome::scores([3.0, 1.0])` (no override), capture posteriors.
Build a second History identical except using
`Outcome::scores_with_sigma([3.0, 1.0], 0.5)` (override matches
default). Assert posteriors are bit-equal across the two paths.
2. **`outcome_scores_with_sigma_overrides_history_default`** — build a
History with `score_sigma(0.5)`, add an event via
`Outcome::scores_with_sigma([3.0, 1.0], 2.0)`. Build a second
History with `score_sigma(2.0)` and add the same event via
`Outcome::scores([3.0, 1.0])`. Assert posteriors are bit-equal.
Then build a third History with `score_sigma(0.5)` and add via
`Outcome::scores([3.0, 1.0])` (no override). Assert this third
one's posteriors differ measurably from the override path
(max diff > 1e-6) — proves the override actually changes
inference.
3. **`event_builder_scores_with_sigma_threading`** — same shape as
#2 but constructed via the fluent builder
`h.event(0).team(["a"]).team(["b"]).scores_with_sigma([3.0, 1.0], 2.0).commit()`.
Proves the builder method works end-to-end.
### Pattern-match update test impact
Existing tests in `src/outcome.rs` that construct
`Outcome::Scored(...)` literally need updating to the struct shape.
Mechanical change; no new tests required.
## Verification gates
```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test --lib
cargo test
```
Test count grows by 3.
## Risks
- **Public API breaking change.** `Outcome::Scored` variant shape
changes from tuple to struct. Any downstream consumer
pattern-matching on the tuple form breaks. In a 0.1.x crate this
is acceptable; flag it in the commit message.
- **Mechanical breadth.** The pattern-match updates touch several
files. They're all caught by the compiler so the risk is low, but
the diff will look bigger than the actual logical change.
- **Two ways to do the same thing.** `Outcome::scores_with_sigma(..)`
and `EventBuilder::scores_with_sigma(..)` both produce the same
outcome. This is intentional — the constructor is the underlying
primitive; the builder method is the ergonomic wrapper. Same
pattern as the existing `Outcome::scores(..)` /
`EventBuilder::scores(..)` pair.
## Out-of-scope follow-ups
- Per-event override of other config currently history-wide
(`p_draw`, drift, beta) — same architectural pattern would apply
but each is its own design decision.
- Validation upgrade from `debug_assert!` to a `Result` at the
Outcome construction boundary.
- Schedule trait integration with `run_chain`, `Residual` schedule,
`SynergyFactor` (still pending from the larger spec).
@@ -1,134 +0,0 @@
# Tech Debt Cleanup — Post-T4-MarginFactor
## Summary
Three small, independent cleanups left behind by the T4-MarginFactor merge
(`8b53cac`). All three are pure code-shape or doc fixes. No public-API change,
no numerics change, no new behavior.
This batch deliberately excludes the `DiffFactor` / `BuiltinFactor` overlap
collapse (architectural change kept separate) and per-event `score_sigma`
override (a feature, not debt).
## Scope
### Item 1 — Deduplicate `Game::likelihoods` and `Game::likelihoods_scored`
**Current state.** `src/game.rs:236-371` and `src/game.rs:373-485` are
near-duplicates of each other (136 and 113 lines). They differ in exactly one block: the closure
that maps a diff index to a `DiffFactor`. The ranked path builds
`DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))` with `margin`/`tie`
derived from `p_draw` and adjacent-result equality. The scored path builds
`DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))` with `m_obs`
the observed score gap. Everything else — sort, `team_prior`, sweep loop,
boundary updates, evidence product, posterior `likelihoods` — is bit-identical.
**Refactor.** Extract a private helper on `OwnedGame<T, D>`:
```rust
fn run_chain<F>(
&self,
arena: &mut ScratchArena,
make_link: F,
) -> (f64, Vec<Vec<Gaussian>>)
where
F: FnMut(usize, &[usize], &mut VarStore) -> DiffFactor,
```
The closure receives the diff index `i`, the descending-by-result sort
permutation `&arena.sort_buf`, and `&mut arena.vars` for `alloc(N_INF)`. It
returns the `DiffFactor` for that diff slot.
The helper takes `&self` (not `&mut self`) and returns
`(evidence, likelihoods)`. Each caller writes the results back to its own
`self.evidence` and `self.likelihoods` fields. The `&self` choice matters: the
closure captures `&self.result` / `&self.teams` / `&self.weights` / `&self.p_draw`
freely without conflicting with the helper's own immutable borrow.
The two public methods shrink from ~125 lines each to ~10 lines that just
construct the closure.
**Why a closure (not a trait or two-phase build).** A closure keeps all
caller-specific state (`p_draw`, `score_sigma`, beta sums for margin) inline at
the call site. A trait would require a stateful object per call; a two-phase
build (caller produces the `Vec<DiffFactor>` first, helper does the rest) would
either re-do the sort or split state ownership awkwardly between phases.
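The closure-based split can be sketched on a toy version of the problem. Everything here is hypothetical scaffolding (`Link`, `Scratch`, `Game`, `ranked_links`, `scored_links`); the real helper also passes `&mut VarStore` to the closure and returns `DiffFactor` values. The sketch only shows the shape: one shared skeleton taking `&self`, scratch state passed separately as `&mut`, and a caller-supplied closure that reads fields of `self`.

```rust
#[derive(Debug, PartialEq)]
enum Link {
    Trunc { tie: bool },
    Margin { m_obs: f64 },
}

#[derive(Default)]
struct Scratch {
    sort_buf: Vec<usize>,
}

struct Game {
    result: Vec<f64>,
}

impl Game {
    /// Shared skeleton: sort teams by descending result, then ask the
    /// closure for the link between each adjacent pair in that order.
    fn run_chain<F>(&self, scratch: &mut Scratch, mut make_link: F) -> Vec<Link>
    where
        F: FnMut(usize, &[usize]) -> Link,
    {
        scratch.sort_buf.clear();
        scratch.sort_buf.extend(0..self.result.len());
        scratch
            .sort_buf
            .sort_by(|&a, &b| self.result[b].total_cmp(&self.result[a]));
        (0..self.result.len().saturating_sub(1))
            .map(|i| make_link(i, &scratch.sort_buf))
            .collect()
    }

    /// Ranked path: only adjacent equality matters.
    fn ranked_links(&self, scratch: &mut Scratch) -> Vec<Link> {
        self.run_chain(scratch, |i, order| Link::Trunc {
            tie: self.result[order[i]] == self.result[order[i + 1]],
        })
    }

    /// Scored path: the observed gap between adjacent teams matters.
    fn scored_links(&self, scratch: &mut Scratch) -> Vec<Link> {
        self.run_chain(scratch, |i, order| Link::Margin {
            m_obs: self.result[order[i]] - self.result[order[i + 1]],
        })
    }
}
```

Because `run_chain` takes `&self`, each closure can borrow `self.result` while the helper mutates `scratch`; the two borrows are disjoint, which is the property the design relies on.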
### Item 2 — Make `BuiltinFactor::log_evidence` exhaustive
**Current state.** `src/factor/mod.rs:94-100` uses a `_ => 0.0` wildcard for
`TeamSum` and `RankDiff`. When a future variant lands (e.g. `SynergyFactor`),
the wildcard silently absorbs it instead of forcing a deliberate decision.
**Refactor.**
```rust
fn log_evidence(&self, vars: &VarStore) -> f64 {
match self {
Self::Trunc(f) => f.log_evidence(vars),
Self::Margin(f) => f.log_evidence(vars),
Self::TeamSum(_) | Self::RankDiff(_) => 0.0,
}
}
```
No behavioral change. Future variants now produce a non-exhaustive-match
compile error.
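The effect can be reproduced on a toy enum. `ToyFactor` and its payloads are hypothetical; the sketch only demonstrates how an exhaustive `match` behaves when a variant is added.

```rust
/// Hypothetical miniature of a factor enum; payloads are raw evidence values.
enum ToyFactor {
    TeamSum,
    RankDiff,
    Trunc(f64),
    Margin(f64),
}

fn log_evidence(f: &ToyFactor) -> f64 {
    match f {
        ToyFactor::Trunc(z) | ToyFactor::Margin(z) => z.ln(),
        // Listed explicitly instead of `_ => 0.0`. Adding a new variant
        // (say `Synergy`) without an arm here is rejected by the compiler
        // as a non-exhaustive match (error E0004).
        ToyFactor::TeamSum | ToyFactor::RankDiff => 0.0,
    }
}
```

With the wildcard arm, the same new variant would compile and silently report `0.0`.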
### Item 3 — Fix stale numerics in T4 plan doc
**Current state.** `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`
contains two numbers that diverge from the values asserted by the shipped test
in `src/factor/mod.rs:163,166`.
**Fix.**
| Doc value (wrong) | Implementation value (correct) |
|---|---|
| `0.046827` | `0.04678` |
| `-3.0613` | `-3.0622` |
Pure docs change. Verified by reading the asserted constants in the test.
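The corrected values can also be re-derived by hand. The sketch below assumes the fixture used by the `MarginFactor` unit tests (a diff prior of `N(0, 6²)`, `m_obs = 5`, `sigma = 1`) and assumes the crate's `pdf` is the ordinary normal density; `normal_pdf` and `margin_evidence` are hypothetical names written for this check.

```rust
use std::f64::consts::PI;

/// Density of N(mu, sigma^2) at `x`.
fn normal_pdf(x: f64, mu: f64, sigma: f64) -> f64 {
    let z = (x - mu) / sigma;
    (-0.5 * z * z).exp() / (sigma * (2.0 * PI).sqrt())
}

/// Evidence of observing margin `m_obs` with noise `sigma` when the
/// cavity on the diff variable is N(cavity_mu, cavity_sigma^2).
fn margin_evidence(cavity_mu: f64, cavity_sigma: f64, m_obs: f64, sigma: f64) -> f64 {
    // Variances add: 6^2 + 1^2 = 37 for the assumed fixture.
    let combined = (cavity_sigma.powi(2) + sigma.powi(2)).sqrt();
    normal_pdf(m_obs, cavity_mu, combined)
}
```

`margin_evidence(0.0, 6.0, 5.0, 1.0)` evaluates `pdf(5, 0, sqrt(37))`, which is approximately `0.046783`, and its natural log is approximately `-3.062235`. Both agree with the implementation column of the table.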
## Out of scope
- **`DiffFactor` / `BuiltinFactor` overlap.** Both enums list `Trunc` and
`Margin` variants. Collapsing into `BuiltinFactor::Diff(DiffFactor)` is
defensible but is an architectural change that wants its own design pass.
`DiffFactor` represents a real semantic subset (factors that operate on a
diff variable in a chain); the duplication is two enum variants, not a
large block of code.
- **Per-event `EventKind::Scored.score_sigma` override.** Today
`score_sigma` is history-wide (set on `HistoryBuilder::score_sigma`). A
per-event override is a real feature ask, not tech debt.
## Verification
Each item commits independently and ships behind a green `cargo test --lib`
run. The dedup is a pure code-shape change: posteriors and evidence must be
**bit-equal** (not ULP-bounded) against the existing 88+28 test goldens.
Per-item gate before committing:
```bash
cargo +nightly fmt
cargo clippy
cargo test --lib
```
## Commit shape
Three commits, one per item, each independently revertable:
1. `refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain`
2. `refactor: make BuiltinFactor::log_evidence match exhaustive`
3. `docs: fix stale numerics in t4-margin-factor plan`
## Risks
- **Borrow-checker friction in Item 1.** The closure captures fields of
`&self` while the helper iterates over arena state. Mitigation: helper is
`&self` (not `&mut self`); arena passed as `&mut ScratchArena` separately.
Disjoint borrows.
- **Compile error in Item 2 if a new variant ships before this lands.**
Trivial follow-on; the whole point is to surface that signal.
-1
@@ -48,7 +48,6 @@ fn main() {
.convergence(trueskill_tt::ConvergenceOptions {
max_iter: 10,
epsilon: 0.01,
alpha: 1.0,
})
.build();
-59
@@ -1,59 +0,0 @@
//! Worked example: continuous-score outcomes via `Outcome::Scored`.
//!
//! Three players play a small round-robin where the score margin matters,
//! not just who won. We show how `score_sigma` controls how much weight
//! the engine places on the observed margin.
//!
//! Run with: `cargo run --example scored --release`
use smallvec::smallvec;
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
fn main() {
let mut h = History::builder()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(0.03))
.score_sigma(2.0) // tune to data; smaller = trust margins more
.build();
let events: Vec<Event<i64, &'static str>> = vec![
Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("alice")]),
Team::with_members([Member::new("bob")]),
],
outcome: Outcome::scores([21.0, 9.0]),
},
Event {
time: 2,
teams: smallvec![
Team::with_members([Member::new("bob")]),
Team::with_members([Member::new("carol")]),
],
outcome: Outcome::scores([21.0, 18.0]),
},
Event {
time: 3,
teams: smallvec![
Team::with_members([Member::new("alice")]),
Team::with_members([Member::new("carol")]),
],
outcome: Outcome::scores([21.0, 21.0]),
},
];
h.add_events(events).unwrap();
let report = h.converge().unwrap();
println!(
"converged={}, iterations={}, log_evidence={:.4}",
report.converged, report.iterations, report.log_evidence
);
for who in &["alice", "bob", "carol"] {
let s = h.current_skill(who).unwrap();
println!("{:>6}: mu={:>7.3} sigma={:.3}", who, s.mu(), s.sigma());
}
}
+1 -1
@@ -1,2 +1,2 @@
publish = false
-pre-release-hook = ["sh", "-c", "git cliff -o CHANGELOG.md --tag {{version}} && git add CHANGELOG.md"]
+pre-release-hook = ["sh", "-c", "git cliff -o ../CHANGELOG.md --tag {{version}} && git add CHANGELOG.md"]
-158
@@ -1,158 +0,0 @@
//! Greedy graph coloring for within-slice event independence.
//!
//! Events sharing no `Index` can be processed in parallel under async-EP
//! semantics. This module partitions a list of events into "colors" such
//! that events of the same color touch disjoint index sets.
//!
//! The algorithm is greedy: for each event in ingestion order, place it in
//! the lowest-numbered color whose existing members share no `Index`. If
//! no existing color accepts the event, open a new color.
//!
//! Complexity: O(n × c × m) where n is events, c is colors (small, ≤ 5 in
//! practice), and m is average team size.
use std::collections::HashSet;
use crate::Index;
/// Partition of event indices into color groups.
///
/// Each inner `Vec<usize>` holds the indices (into the original events
/// array) of events assigned to one color. Colors are iterated in ascending
/// order by convention.
#[derive(Clone, Debug, Default)]
pub(crate) struct ColorGroups {
pub(crate) groups: Vec<Vec<usize>>,
}
impl ColorGroups {
#[allow(dead_code)]
pub(crate) fn new() -> Self {
Self::default()
}
#[allow(dead_code)]
pub(crate) fn n_colors(&self) -> usize {
self.groups.len()
}
#[allow(dead_code)]
pub(crate) fn is_empty(&self) -> bool {
self.groups.is_empty()
}
/// Total event count across all colors.
#[allow(dead_code)]
pub(crate) fn total_events(&self) -> usize {
self.groups.iter().map(|g| g.len()).sum()
}
/// Contiguous index range for one color after events have been reordered
/// into color-contiguous positions by `TimeSlice::recompute_color_groups`.
#[allow(dead_code)]
pub(crate) fn color_range(&self, color_idx: usize) -> std::ops::Range<usize> {
let group = &self.groups[color_idx];
if group.is_empty() {
return 0..0;
}
let start = *group.first().unwrap();
let end = *group.last().unwrap() + 1;
start..end
}
}
/// Compute color groups greedily.
///
/// `index_set(ev_idx)` yields, for each event index, the iterator of
/// `Index` values that event touches. The returned `ColorGroups` has one
/// inner `Vec<usize>` per color, containing event indices in the order
/// they were assigned.
#[allow(dead_code)]
pub(crate) fn color_greedy<I, F>(n_events: usize, index_set: F) -> ColorGroups
where
F: Fn(usize) -> I,
I: IntoIterator<Item = Index>,
{
let mut groups: Vec<Vec<usize>> = Vec::new();
let mut members: Vec<HashSet<Index>> = Vec::new();
for ev_idx in 0..n_events {
let ev_members: HashSet<Index> = index_set(ev_idx).into_iter().collect();
// Find first color whose member-set is disjoint from this event's indices.
let chosen = members.iter().position(|m| m.is_disjoint(&ev_members));
let color_idx = match chosen {
Some(c) => c,
None => {
groups.push(Vec::new());
members.push(HashSet::new());
groups.len() - 1
}
};
groups[color_idx].push(ev_idx);
members[color_idx].extend(ev_members);
}
ColorGroups { groups }
}
#[cfg(test)]
mod tests {
use super::*;
fn idx(i: usize) -> Index {
Index::from(i)
}
#[test]
fn single_event_gets_one_color() {
let cg = color_greedy(1, |_| vec![idx(0), idx(1)]);
assert_eq!(cg.n_colors(), 1);
assert_eq!(cg.groups[0], vec![0]);
}
#[test]
fn disjoint_events_share_a_color() {
let cg = color_greedy(2, |i| match i {
0 => vec![idx(0), idx(1)],
1 => vec![idx(2), idx(3)],
_ => unreachable!(),
});
assert_eq!(cg.n_colors(), 1);
assert_eq!(cg.groups[0], vec![0, 1]);
}
#[test]
fn overlapping_events_need_separate_colors() {
let cg = color_greedy(2, |i| match i {
0 => vec![idx(0), idx(1)],
1 => vec![idx(1), idx(2)],
_ => unreachable!(),
});
assert_eq!(cg.n_colors(), 2);
assert_eq!(cg.groups[0], vec![0]);
assert_eq!(cg.groups[1], vec![1]);
}
#[test]
fn three_events_two_colors() {
// Event 0: {0, 1}; event 1: {2, 3}; event 2: {0, 2}.
// Greedy: ev0→c0, ev1→c0 (disjoint), ev2 overlaps both→c1.
let cg = color_greedy(3, |i| match i {
0 => vec![idx(0), idx(1)],
1 => vec![idx(2), idx(3)],
2 => vec![idx(0), idx(2)],
_ => unreachable!(),
});
assert_eq!(cg.n_colors(), 2);
assert_eq!(cg.groups[0], vec![0, 1]);
assert_eq!(cg.groups[1], vec![2]);
}
#[test]
fn total_events_counts_correctly() {
let cg = color_greedy(4, |_| vec![idx(0)]);
// All events touch index 0 → 4 distinct colors.
assert_eq!(cg.n_colors(), 4);
assert_eq!(cg.total_events(), 4);
}
}
-22
@@ -8,16 +8,6 @@ use smallvec::SmallVec;
pub struct ConvergenceOptions {
pub max_iter: usize,
pub epsilon: f64,
/// EP damping factor in natural-parameter space: each per-factor
/// update inside a single game writes `α·new + (1−α)·old`. `1.0` is
/// undamped (default); `< 1.0` stabilises oscillating fixed-point
/// loops at the cost of more iterations. Must be in `(0.0, 1.0]`.
///
/// Applies only to the within-game EP loop (`run_chain`). The outer
/// `History::converge` cross-history sweep is undamped regardless of
/// this value — cross-slice damping is a different concept and not
/// in scope.
pub alpha: f64,
}
impl Default for ConvergenceOptions {
@@ -25,7 +15,6 @@ impl Default for ConvergenceOptions {
Self {
max_iter: crate::ITERATIONS,
epsilon: crate::EPSILON,
alpha: 1.0,
}
}
}
@@ -40,14 +29,3 @@ pub struct ConvergenceReport {
pub per_iteration_time: SmallVec<[Duration; 32]>,
pub slices_skipped: usize,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn default_alpha_is_one_for_undamped_behavior() {
let opts = ConvergenceOptions::default();
assert_eq!(opts.alpha, 1.0);
}
}
+1 -1
@@ -6,7 +6,7 @@ use crate::time::Time;
///
/// Generic over `T: Time` so seasonal or calendar-aware drift is expressible
/// without going through `i64`.
-pub trait Drift<T: Time>: Copy + Debug + Send + Sync {
+pub trait Drift<T: Time>: Copy + Debug {
/// Variance added to the skill prior for elapsed time `from -> to`.
///
/// Called with `from <= to`; returning zero means no drift accumulates.
-5
@@ -10,8 +10,6 @@ pub enum InferenceError {
},
/// A probability value is outside `[0, 1]`.
InvalidProbability { value: f64 },
/// A scalar parameter is outside its valid range.
InvalidParameter { name: &'static str, value: f64 },
/// Convergence exceeded `max_iter` without falling below `epsilon`.
ConvergenceFailed {
last_step: (f64, f64),
@@ -34,9 +32,6 @@ impl fmt::Display for InferenceError {
Self::InvalidProbability { value } => {
write!(f, "probability must be in [0, 1]; got {value}")
}
Self::InvalidParameter { name, value } => {
write!(f, "{name} is invalid: {value}")
}
Self::ConvergenceFailed {
last_step,
iterations,
-15
@@ -75,21 +75,6 @@ where
self
}
/// Set explicit per-team continuous scores; higher = better.
pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
self.event.outcome = crate::Outcome::scores(scores);
self
}
/// Set explicit per-team continuous scores with a per-event noise override.
///
/// `sigma` overrides `HistoryBuilder::score_sigma` for this event only.
/// Must be `> 0.0`; debug-asserts otherwise via `Outcome::scores_with_sigma`.
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
self
}
/// Mark team `winner_idx` as winner; others tied for last.
pub fn winner(mut self, winner_idx: u32) -> Self {
self.event.outcome = Outcome::winner(winner_idx, self.event.teams.len() as u32);
-177
@@ -1,177 +0,0 @@
use crate::{
N_INF,
factor::{Factor, VarId, VarStore},
gaussian::Gaussian,
pdf,
};
/// Gaussian observation factor on a diff variable.
///
/// Encodes the soft evidence `m_obs ~ N(diff, sigma²)`. The outgoing message
/// to `diff` is the constant `N(m_obs, sigma²)`, so this factor converges in a
/// single propagation: subsequent calls return a zero delta.
#[derive(Debug)]
pub struct MarginFactor {
pub diff: VarId,
pub m_obs: f64,
pub sigma: f64,
pub(crate) msg: Gaussian,
pub(crate) evidence_cached: Option<f64>,
}
impl MarginFactor {
pub fn new(diff: VarId, m_obs: f64, sigma: f64) -> Self {
debug_assert!(sigma > 0.0, "score sigma must be positive");
Self {
diff,
m_obs,
sigma,
msg: N_INF,
evidence_cached: None,
}
}
}
impl MarginFactor {
/// Propagate this factor's message, optionally damping the update in
/// natural-parameter space. `alpha = 1.0` matches `Factor::propagate`
/// exactly; `alpha < 1.0` writes `α·new_msg + (1−α)·old_msg`.
pub(crate) fn propagate_with_alpha(&mut self, vars: &mut VarStore, alpha: f64) -> (f64, f64) {
let marginal = vars.get(self.diff);
let cavity = marginal / self.msg;
if self.evidence_cached.is_none() {
self.evidence_cached = Some(cavity_evidence(cavity, self.m_obs, self.sigma));
}
let new_msg = Gaussian::from_ms(self.m_obs, self.sigma);
let damped = self.msg.damp_natural(new_msg, alpha);
let old_msg = self.msg;
self.msg = damped;
vars.set(self.diff, cavity * damped);
old_msg.delta(damped)
}
}
impl Factor for MarginFactor {
fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
self.propagate_with_alpha(vars, 1.0)
}
fn log_evidence(&self, _vars: &VarStore) -> f64 {
self.evidence_cached.unwrap_or(1.0).ln()
}
}
fn cavity_evidence(cavity: Gaussian, m_obs: f64, sigma: f64) -> f64 {
let combined_sigma = (cavity.sigma().powi(2) + sigma.powi(2)).sqrt();
pdf(m_obs, cavity.mu(), combined_sigma)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn first_propagate_writes_tilted_marginal() {
let mut vars = VarStore::new();
let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f = MarginFactor::new(diff, 5.0, 1.0);
f.propagate(&mut vars);
let result = vars.get(diff);
// pi = 1/36 + 1 ≈ 1.027778; tau = 0 + 5 = 5
// mu = 5 / 1.027778 ≈ 4.864865; sigma = 1/sqrt(1.027778) ≈ 0.986394
assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
assert!((result.sigma() - 0.986393923832144).abs() < 1e-12);
}
#[test]
fn converges_in_one_step() {
let mut vars = VarStore::new();
let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f = MarginFactor::new(diff, 5.0, 1.0);
f.propagate(&mut vars);
let (dmu, dsig) = f.propagate(&mut vars);
assert!(
dmu < 1e-12,
"expected ~0 delta on second propagate, got {dmu}"
);
assert!(dsig < 1e-12);
}
#[test]
fn evidence_cached_on_first_propagate() {
let mut vars = VarStore::new();
let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f = MarginFactor::new(diff, 5.0, 1.0);
assert!(f.evidence_cached.is_none());
f.propagate(&mut vars);
let z = f.evidence_cached.unwrap();
// pdf(5, 0, sqrt(37)) ≈ 0.046783
assert!((z - 0.04678300292616668).abs() < 1e-10);
// Subsequent propagations don't change it.
f.propagate(&mut vars);
assert_eq!(f.evidence_cached.unwrap(), z);
}
#[test]
fn log_evidence_matches_cached_ln() {
let mut vars = VarStore::new();
let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f = MarginFactor::new(diff, 5.0, 1.0);
f.propagate(&mut vars);
let logz = f.log_evidence(&vars);
assert!((logz - (-3.062235327364623)).abs() < 1e-10);
}
#[test]
fn propagate_with_alpha_one_matches_undamped_propagate() {
let mut vars_a = VarStore::new();
let diff_a = vars_a.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f_a = MarginFactor::new(diff_a, 5.0, 1.0);
let delta_a = f_a.propagate(&mut vars_a);
let result_a = vars_a.get(diff_a);
let mut vars_b = VarStore::new();
let diff_b = vars_b.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f_b = MarginFactor::new(diff_b, 5.0, 1.0);
let delta_b = f_b.propagate_with_alpha(&mut vars_b, 1.0);
let result_b = vars_b.get(diff_b);
assert_eq!(result_a.pi(), result_b.pi());
assert_eq!(result_a.tau(), result_b.tau());
assert_eq!(delta_a, delta_b);
assert_eq!(f_a.msg.pi(), f_b.msg.pi());
assert_eq!(f_a.msg.tau(), f_b.msg.tau());
}
#[test]
fn propagate_with_alpha_half_blends_msg_in_natural_params() {
// Run undamped to capture (initial_msg, undamped_new_msg).
let mut vars_full = VarStore::new();
let diff_full = vars_full.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f_full = MarginFactor::new(diff_full, 5.0, 1.0);
let initial_msg_pi = f_full.msg.pi();
let initial_msg_tau = f_full.msg.tau();
f_full.propagate(&mut vars_full);
let undamped_msg_pi = f_full.msg.pi();
let undamped_msg_tau = f_full.msg.tau();
// Run damped at α = 0.5 from the same initial state.
let mut vars_half = VarStore::new();
let diff_half = vars_half.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f_half = MarginFactor::new(diff_half, 5.0, 1.0);
f_half.propagate_with_alpha(&mut vars_half, 0.5);
let expected_pi = 0.5 * undamped_msg_pi + 0.5 * initial_msg_pi;
let expected_tau = 0.5 * undamped_msg_tau + 0.5 * initial_msg_tau;
assert!((f_half.msg.pi() - expected_pi).abs() < 1e-12);
assert!((f_half.msg.tau() - expected_tau).abs() < 1e-12);
}
}
+2 -22
@@ -56,7 +56,7 @@ impl VarStore {
/// Factors hold their own outgoing messages and propagate them by reading
/// connected variable marginals from a `VarStore` and writing back updated
/// marginals.
-pub trait Factor: Send + Sync {
+pub trait Factor {
/// Update outgoing messages and write back to the var store.
///
/// Returns the max delta `(|Δmu|, |Δsigma|)` across writes this
@@ -78,7 +78,6 @@ pub enum BuiltinFactor {
TeamSum(team_sum::TeamSumFactor),
RankDiff(rank_diff::RankDiffFactor),
Trunc(trunc::TruncFactor),
Margin(margin::MarginFactor),
}
impl Factor for BuiltinFactor {
@@ -87,20 +86,17 @@ impl Factor for BuiltinFactor {
Self::TeamSum(f) => f.propagate(vars),
Self::RankDiff(f) => f.propagate(vars),
Self::Trunc(f) => f.propagate(vars),
-Self::Margin(f) => f.propagate(vars),
}
}
fn log_evidence(&self, vars: &VarStore) -> f64 {
match self {
Self::Trunc(f) => f.log_evidence(vars),
-Self::Margin(f) => f.log_evidence(vars),
-Self::TeamSum(_) | Self::RankDiff(_) => 0.0,
+_ => 0.0,
}
}
}
-pub mod margin;
pub mod rank_diff;
pub mod team_sum;
pub mod trunc;
@@ -149,20 +145,4 @@ mod tests {
assert_eq!(store.len(), 0);
assert_eq!(store.marginals.capacity(), cap);
}
#[test]
fn builtin_factor_dispatches_to_margin() {
use super::margin::MarginFactor;
let mut vars = VarStore::new();
let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut f = BuiltinFactor::Margin(MarginFactor::new(diff, 5.0, 1.0));
f.propagate(&mut vars);
let result = vars.get(diff);
assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
let logz = f.log_evidence(&vars);
assert!((logz - (-3.062235327364623)).abs() < 1e-10);
}
}
+11 -64
@@ -33,37 +33,29 @@ impl TruncFactor {
}
}
impl TruncFactor {
/// Propagate this factor's message, optionally damping the update in
/// natural-parameter space. `alpha = 1.0` matches `Factor::propagate`
/// exactly; `alpha < 1.0` writes `α·new_msg + (1−α)·old_msg`.
pub(crate) fn propagate_with_alpha(&mut self, vars: &mut VarStore, alpha: f64) -> (f64, f64) {
impl Factor for TruncFactor {
fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
let marginal = vars.get(self.diff);
// Cavity: marginal divided by our outgoing message.
let cavity = marginal / self.msg;
// First-time-only: cache the evidence contribution from the cavity.
if self.evidence_cached.is_none() {
self.evidence_cached = Some(cavity_evidence(cavity, self.margin, self.tie));
}
// Apply the truncation approximation to the cavity.
let trunc = approx(cavity, self.margin, self.tie);
// New outgoing message such that cavity * new_msg = trunc.
let new_msg = trunc / cavity;
let damped = self.msg.damp_natural(new_msg, alpha);
let old_msg = self.msg;
self.msg = damped;
self.msg = new_msg;
// marginal_new = cavity * stored_msg. With alpha = 1.0 this equals
// `trunc` (since cavity * new_msg = trunc by construction); with
// alpha < 1.0 it reflects the partially-applied update.
vars.set(self.diff, cavity * damped);
// Update the marginal: marginal_new = cavity * new_msg = trunc.
vars.set(self.diff, trunc);
old_msg.delta(damped)
}
}
impl Factor for TruncFactor {
fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
self.propagate_with_alpha(vars, 1.0)
old_msg.delta(new_msg)
}
fn log_evidence(&self, _vars: &VarStore) -> f64 {
@@ -135,49 +127,4 @@ mod tests {
let ev = f.evidence_cached.unwrap();
assert!(ev > 0.35 && ev < 0.42);
}
#[test]
fn propagate_with_alpha_one_matches_undamped_propagate() {
let mut vars_a = VarStore::new();
let diff_a = vars_a.alloc(Gaussian::from_ms(2.0, 3.0));
let mut f_a = TruncFactor::new(diff_a, 0.0, false);
let delta_a = f_a.propagate(&mut vars_a);
let result_a = vars_a.get(diff_a);
let mut vars_b = VarStore::new();
let diff_b = vars_b.alloc(Gaussian::from_ms(2.0, 3.0));
let mut f_b = TruncFactor::new(diff_b, 0.0, false);
let delta_b = f_b.propagate_with_alpha(&mut vars_b, 1.0);
let result_b = vars_b.get(diff_b);
assert_eq!(result_a.pi(), result_b.pi());
assert_eq!(result_a.tau(), result_b.tau());
assert_eq!(delta_a, delta_b);
assert_eq!(f_a.msg.pi(), f_b.msg.pi());
assert_eq!(f_a.msg.tau(), f_b.msg.tau());
}
#[test]
fn propagate_with_alpha_half_blends_msg_in_natural_params() {
// Run undamped to capture (initial_msg, undamped_new_msg).
let mut vars_full = VarStore::new();
let diff_full = vars_full.alloc(Gaussian::from_ms(2.0, 3.0));
let mut f_full = TruncFactor::new(diff_full, 0.0, false);
let initial_msg_pi = f_full.msg.pi();
let initial_msg_tau = f_full.msg.tau();
f_full.propagate(&mut vars_full);
let undamped_msg_pi = f_full.msg.pi();
let undamped_msg_tau = f_full.msg.tau();
// Run damped at α = 0.5 from the same initial state.
let mut vars_half = VarStore::new();
let diff_half = vars_half.alloc(Gaussian::from_ms(2.0, 3.0));
let mut f_half = TruncFactor::new(diff_half, 0.0, false);
f_half.propagate_with_alpha(&mut vars_half, 0.5);
let expected_pi = 0.5 * undamped_msg_pi + 0.5 * initial_msg_pi;
let expected_tau = 0.5 * undamped_msg_tau + 0.5 * initial_msg_tau;
assert!((f_half.msg.pi() - expected_pi).abs() < 1e-12);
assert!((f_half.msg.tau() - expected_tau).abs() < 1e-12);
}
}
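The `propagate_with_alpha` doc comment in this file's diff describes damping as `α·new_msg + (1−α)·old_msg` in natural-parameter space. A minimal standalone sketch of that blend follows. It assumes a hypothetical `Nat` struct holding precision `pi` and precision-adjusted mean `tau`; it is not the crate's `Gaussian` type, and the method name `damp` is chosen only for illustration.

```rust
/// Hypothetical stand-in for a Gaussian in natural parameters
/// (pi = 1 / sigma^2, tau = mu / sigma^2). Not the crate's `Gaussian`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Nat {
    pi: f64,
    tau: f64,
}

impl Nat {
    /// Build from a mean and a standard deviation.
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Nat { pi, tau: mu * pi }
    }

    /// Damped update: alpha * new + (1 - alpha) * self, applied to each
    /// natural parameter separately.
    fn damp(self, new: Nat, alpha: f64) -> Nat {
        Nat {
            pi: alpha * new.pi + (1.0 - alpha) * self.pi,
            tau: alpha * new.tau + (1.0 - alpha) * self.tau,
        }
    }
}
```

With `alpha = 1.0` the result is the new message unchanged, which is why an undamped `propagate` and a damped call at `alpha = 1.0` can be compared for exact equality in the tests shown above.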
+2 -2
@@ -6,8 +6,8 @@
pub use crate::{
factor::{
BuiltinFactor, Factor, VarId, VarStore, margin::MarginFactor, rank_diff::RankDiffFactor,
team_sum::TeamSumFactor, trunc::TruncFactor,
BuiltinFactor, Factor, VarId, VarStore, rank_diff::RankDiffFactor, team_sum::TeamSumFactor,
trunc::TruncFactor,
},
schedule::{EpsilonOrMax, Schedule, ScheduleReport},
};
+51 -499
@@ -5,66 +5,16 @@ use crate::{
arena::ScratchArena,
compute_margin,
drift::Drift,
factor::{VarId, margin::MarginFactor, trunc::TruncFactor},
factor::{Factor, trunc::TruncFactor},
gaussian::Gaussian,
rating::Rating,
time::Time,
tuple_gt, tuple_max,
};
/// Per-adjacent-pair link factor in the game's diff chain.
///
/// `Trunc` is used for `Outcome::Ranked` (rank-based truncation).
/// `Margin` is used for `Outcome::Scored` (Gaussian observation on the diff).
#[derive(Debug)]
pub(crate) enum DiffFactor {
Trunc(TruncFactor),
Margin(MarginFactor),
}
impl DiffFactor {
pub(crate) fn diff(&self) -> VarId {
match self {
Self::Trunc(f) => f.diff,
Self::Margin(f) => f.diff,
}
}
pub(crate) fn msg(&self) -> Gaussian {
match self {
Self::Trunc(f) => f.msg,
Self::Margin(f) => f.msg,
}
}
pub(crate) fn evidence(&self) -> f64 {
match self {
Self::Trunc(f) => f.evidence_cached.unwrap_or(1.0),
Self::Margin(f) => f.evidence_cached.unwrap_or(1.0),
}
}
pub(crate) fn propagate(
&mut self,
vars: &mut crate::factor::VarStore,
alpha: f64,
) -> (f64, f64) {
match self {
Self::Trunc(f) => f.propagate_with_alpha(vars, alpha),
Self::Margin(f) => f.propagate_with_alpha(vars, alpha),
}
}
}
/// Per-game inference options.
///
/// `p_draw` and `convergence` apply to ranked outcomes (`Game::ranked`).
/// `score_sigma` applies only to scored outcomes (`Game::scored`); it controls
/// how much the engine trusts the observed score margin (smaller σ = more trust).
#[derive(Clone, Copy, Debug)]
pub struct GameOptions {
pub p_draw: f64,
pub score_sigma: f64,
pub convergence: crate::ConvergenceOptions,
}
@@ -72,7 +22,6 @@ impl Default for GameOptions {
fn default() -> Self {
Self {
p_draw: crate::P_DRAW,
score_sigma: 1.0,
convergence: crate::ConvergenceOptions::default(),
}
}
@@ -90,7 +39,6 @@ pub struct OwnedGame<T: Time, D: Drift<T>> {
result: Vec<f64>,
weights: Vec<Vec<f64>>,
p_draw: f64,
pub(crate) convergence: crate::ConvergenceOptions,
pub(crate) likelihoods: Vec<Vec<Gaussian>>,
pub(crate) evidence: f64,
}
@@ -101,17 +49,9 @@ impl<T: Time, D: Drift<T>> OwnedGame<T, D> {
result: Vec<f64>,
weights: Vec<Vec<f64>>,
p_draw: f64,
convergence: crate::ConvergenceOptions,
) -> Self {
let mut arena = ScratchArena::new();
let g = Game::ranked_with_arena(
teams.clone(),
&result,
&weights,
p_draw,
convergence,
&mut arena,
);
let g = Game::ranked_with_arena(teams.clone(), &result, &weights, p_draw, &mut arena);
let likelihoods = g.likelihoods;
let evidence = g.evidence;
Self {
@@ -119,36 +59,6 @@ impl<T: Time, D: Drift<T>> OwnedGame<T, D> {
result,
weights,
p_draw,
convergence,
likelihoods,
evidence,
}
}
pub(crate) fn new_scored(
teams: Vec<Vec<Rating<T, D>>>,
scores: Vec<f64>,
weights: Vec<Vec<f64>>,
score_sigma: f64,
convergence: crate::ConvergenceOptions,
) -> Self {
let mut arena = ScratchArena::new();
let g = Game::scored_with_arena(
teams.clone(),
&scores,
&weights,
score_sigma,
convergence,
&mut arena,
);
let likelihoods = g.likelihoods;
let evidence = g.evidence;
Self {
teams,
result: scores,
weights,
p_draw: 0.0,
convergence,
likelihoods,
evidence,
}
@@ -173,7 +83,6 @@ pub struct Game<'a, T: Time = i64, D: Drift<T> = crate::drift::ConstantDrift> {
result: &'a [f64],
weights: &'a [Vec<f64>],
p_draw: f64,
pub(crate) convergence: crate::ConvergenceOptions,
pub(crate) likelihoods: Vec<Vec<Gaussian>>,
pub(crate) evidence: f64,
}
@@ -184,7 +93,6 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
result: &'a [f64],
weights: &'a [Vec<f64>],
p_draw: f64,
convergence: crate::ConvergenceOptions,
arena: &mut ScratchArena,
) -> Self {
debug_assert!(
@@ -210,17 +118,12 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
},
"draw must be > 0.0 if there are teams with draw"
);
debug_assert!(
convergence.alpha > 0.0 && convergence.alpha <= 1.0,
"convergence alpha must be in (0.0, 1.0]"
);
let mut this = Self {
teams,
result,
weights,
p_draw,
convergence,
likelihoods: Vec::new(),
evidence: 0.0,
};
@@ -229,57 +132,12 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
this
}
pub(crate) fn scored_with_arena(
teams: Vec<Vec<Rating<T, D>>>,
scores: &'a [f64],
weights: &'a [Vec<f64>],
score_sigma: f64,
convergence: crate::ConvergenceOptions,
arena: &mut ScratchArena,
) -> Self {
debug_assert!(
scores.len() == teams.len(),
"scores must have the same length as teams"
);
debug_assert!(
weights
.iter()
.zip(teams.iter())
.all(|(w, t)| w.len() == t.len()),
"weights must have the same dimensions as teams"
);
debug_assert!(score_sigma > 0.0, "score_sigma must be positive");
debug_assert!(
convergence.alpha > 0.0 && convergence.alpha <= 1.0,
"convergence alpha must be in (0.0, 1.0]"
);
let mut this = Self {
teams,
result: scores,
weights,
p_draw: 0.0,
convergence,
likelihoods: Vec::new(),
evidence: 0.0,
};
this.likelihoods_scored(arena, score_sigma);
this
}
fn run_chain<F>(&self, arena: &mut ScratchArena, mut make_link: F) -> (f64, Vec<Vec<Gaussian>>)
where
F: FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor,
{
fn likelihoods(&mut self, arena: &mut ScratchArena) {
arena.reset();
let alpha = self.convergence.alpha;
let epsilon = self.convergence.epsilon;
let max_iter = self.convergence.max_iter;
let n_teams = self.teams.len();
// Sort teams by result descending; reuse arena.sort_buf to avoid allocation.
arena.sort_buf.extend(0..n_teams);
arena.sort_buf.sort_by(|&i, &j| {
self.result[j]
@@ -287,6 +145,7 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
.unwrap_or(Ordering::Equal)
});
// Team performance priors written into arena buffer (capacity reused across games).
arena.team_prior.extend(arena.sort_buf.iter().map(|&t| {
self.teams[t]
.iter()
@@ -296,42 +155,64 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
let n_diffs = n_teams.saturating_sub(1);
let mut links: Vec<DiffFactor> = (0..n_diffs)
.map(|i| make_link(i, &arena.sort_buf, &mut arena.vars))
// One TruncFactor per adjacent sorted-team pair; each owns a diff VarId.
// trunc stays local (fresh state per game; Vec capacity is typically small).
let mut trunc: Vec<TruncFactor> = (0..n_diffs)
.map(|i| {
let tie = self.result[arena.sort_buf[i]] == self.result[arena.sort_buf[i + 1]];
let margin = if self.p_draw == 0.0 {
0.0
} else {
let a: f64 = self.teams[arena.sort_buf[i]]
.iter()
.map(|p| p.beta.powi(2))
.sum();
let b: f64 = self.teams[arena.sort_buf[i + 1]]
.iter()
.map(|p| p.beta.powi(2))
.sum();
compute_margin(self.p_draw, (a + b).sqrt())
};
let vid = arena.vars.alloc(N_INF);
TruncFactor::new(vid, margin, tie)
})
.collect();
// Per-team messages from neighbouring RankDiff factors (replaces TeamMessage).
arena.lhood_lose.resize(n_teams, N_INF);
arena.lhood_win.resize(n_teams, N_INF);
let mut step = (f64::INFINITY, f64::INFINITY);
let mut iter = 0;
while tuple_gt(step, epsilon) && iter < max_iter {
while tuple_gt(step, 1e-6) && iter < 10 {
step = (0.0_f64, 0.0_f64);
for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
// Forward sweep: diffs 0 .. n_diffs-2 (all but the last).
for (e, tf) in trunc[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
let pw = arena.team_prior[e] * arena.lhood_lose[e];
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
let raw = pw - pl;
arena.vars.set(lf.diff(), raw * lf.msg());
let d = lf.propagate(&mut arena.vars, alpha);
arena.vars.set(tf.diff, raw * tf.msg);
let d = tf.propagate(&mut arena.vars);
step = tuple_max(step, d);
let new_ll = pw - lf.msg();
let new_ll = pw - tf.msg;
step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
arena.lhood_lose[e + 1] = new_ll;
}
for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
// Backward sweep: diffs n_diffs-1 .. 1 (reverse, all but the first).
for (rev_i, tf) in trunc[1..].iter_mut().rev().enumerate() {
let e = n_diffs - 1 - rev_i;
let pw = arena.team_prior[e] * arena.lhood_lose[e];
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
let raw = pw - pl;
arena.vars.set(lf.diff(), raw * lf.msg());
let d = lf.propagate(&mut arena.vars, alpha);
arena.vars.set(tf.diff, raw * tf.msg);
let d = tf.propagate(&mut arena.vars);
step = tuple_max(step, d);
let new_lw = pl + lf.msg();
let new_lw = pl + tf.msg;
step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
arena.lhood_win[e] = new_lw;
}
@@ -343,19 +224,23 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
if n_diffs == 1 {
let raw = (arena.team_prior[0] * arena.lhood_lose[0])
- (arena.team_prior[1] * arena.lhood_win[1]);
arena.vars.set(links[0].diff(), raw * links[0].msg());
links[0].propagate(&mut arena.vars, alpha);
arena.vars.set(trunc[0].diff, raw * trunc[0].msg);
trunc[0].propagate(&mut arena.vars);
}
// Boundary updates: close the chain at both ends.
if n_diffs > 0 {
let pl1 = arena.team_prior[1] * arena.lhood_win[1];
arena.lhood_win[0] = pl1 + links[0].msg();
arena.lhood_win[0] = pl1 + trunc[0].msg;
let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
arena.lhood_lose[n_teams - 1] = pw_last - trunc[n_diffs - 1].msg;
}
let evidence: f64 = links.iter().map(|l| l.evidence()).product();
// Evidence = product of per-diff evidences (each cached on first propagation).
self.evidence = trunc
.iter()
.map(|t| t.evidence_cached.unwrap_or(1.0))
.product();
// Inverse permutation: inv_buf[orig_i] = sorted_i.
arena.inv_buf.resize(n_teams, 0);
@@ -363,7 +248,7 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
arena.inv_buf[orig_i] = si;
}
let likelihoods = self
self.likelihoods = self
.teams
.iter()
.zip(self.weights.iter())
@@ -385,38 +270,6 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
.collect::<Vec<_>>()
})
.collect::<Vec<_>>();
(evidence, likelihoods)
}
fn likelihoods(&mut self, arena: &mut ScratchArena) {
let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| {
let tie = self.result[sort_buf[i]] == self.result[sort_buf[i + 1]];
let margin = if self.p_draw == 0.0 {
0.0
} else {
let a: f64 = self.teams[sort_buf[i]].iter().map(|p| p.beta.powi(2)).sum();
let b: f64 = self.teams[sort_buf[i + 1]]
.iter()
.map(|p| p.beta.powi(2))
.sum();
compute_margin(self.p_draw, (a + b).sqrt())
};
let vid = vars.alloc(N_INF);
DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))
});
self.evidence = evidence;
self.likelihoods = likelihoods;
}
fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64) {
let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| {
let m_obs = self.result[sort_buf[i]] - self.result[sort_buf[i + 1]];
let vid = vars.alloc(N_INF);
DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))
});
self.evidence = evidence;
self.likelihoods = likelihoods;
}
pub fn posteriors(&self) -> Vec<Vec<Gaussian>> {
@@ -456,62 +309,13 @@ impl<T: Time, D: Drift<T>> Game<'_, T, D> {
});
}
let ranks = outcome
.as_ranks()
.ok_or(crate::InferenceError::MismatchedShape {
kind: "Game::ranked requires Outcome::Ranked",
expected: 0,
got: 0,
})?;
let ranks = outcome.as_ranks();
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
Ok(OwnedGame::new(
teams_owned,
result,
weights,
options.p_draw,
options.convergence,
))
}
pub fn scored(
teams: &[&[Rating<T, D>]],
outcome: crate::Outcome,
options: &GameOptions,
) -> Result<OwnedGame<T, D>, crate::InferenceError> {
if options.score_sigma <= 0.0 || options.score_sigma.is_nan() {
return Err(crate::InferenceError::InvalidParameter {
name: "score_sigma",
value: options.score_sigma,
});
}
if outcome.team_count() != teams.len() {
return Err(crate::InferenceError::MismatchedShape {
kind: "outcome scores vs teams",
expected: teams.len(),
got: outcome.team_count(),
});
}
let scores = outcome
.as_scores()
.ok_or(crate::InferenceError::MismatchedShape {
kind: "Game::scored requires Outcome::Scored",
expected: 0,
got: 0,
})?
.to_vec();
let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
Ok(OwnedGame::new_scored(
teams_owned,
scores,
weights,
options.score_sigma,
options.convergence,
))
Ok(OwnedGame::new(teams_owned, result, weights, options.p_draw))
}
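The rank handling in `ranked` above turns ranks into "higher is better" result values via `max_rank - r`, so that a lower rank sorts first when teams are ordered by result descending. A small sketch of just that conversion, assuming ranks are `u32` values (the actual integer type returned by `as_ranks()` is not visible in this diff) and using a hypothetical helper name:

```rust
/// Convert ranks (lower = better finish) into result values where a larger
/// number means a better finish, mirroring `max_rank - r`.
/// The `u32` rank type is an assumption made for this sketch.
fn ranks_to_results(ranks: &[u32]) -> Vec<f64> {
    // An empty slice falls back to max_rank = 0 and yields an empty result.
    let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
    ranks.iter().map(|&r| max_rank - r as f64).collect()
}
```

Tied ranks map to equal result values, which is the condition the chain construction checks when it sets `tie` for an adjacent pair.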
pub fn one_v_one(
@@ -572,7 +376,6 @@ mod tests {
&[0.0, 1.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -600,7 +403,6 @@ mod tests {
&[0.0, 1.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -620,7 +422,6 @@ mod tests {
&[0.0, 1.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
@@ -654,7 +455,6 @@ mod tests {
&[1.0, 2.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -671,7 +471,6 @@ mod tests {
&[2.0, 1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -683,14 +482,7 @@ mod tests {
assert_ulps_eq!(b, Gaussian::from_ms(25.000000, 6.238469), epsilon = 1e-6);
let w = [vec![1.0], vec![1.0], vec![1.0]];
let g = Game::ranked_with_arena(
teams,
&[1.0, 2.0, 0.0],
&w,
0.5,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let g = Game::ranked_with_arena(teams, &[1.0, 2.0, 0.0], &w, 0.5, &mut ScratchArena::new());
let p = g.posteriors();
let a = p[0][0];
@@ -722,7 +514,6 @@ mod tests {
&[0.0, 0.0],
&w,
0.25,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -750,7 +541,6 @@ mod tests {
&[0.0, 0.0],
&w,
0.25,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -786,7 +576,6 @@ mod tests {
&[0.0, 0.0, 0.0],
&w,
0.25,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -823,7 +612,6 @@ mod tests {
&[0.0, 0.0, 0.0],
&w,
0.25,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -875,7 +663,6 @@ mod tests {
&[1.0, 0.0, 0.0],
&w,
0.25,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -909,7 +696,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -934,7 +720,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -959,7 +744,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -987,7 +771,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1015,7 +798,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1023,136 +805,6 @@ mod tests {
assert_ulps_eq!(p[0][0], p[1][0], epsilon = 1e-6);
}
#[test]
fn diff_factor_dispatch_trunc_and_margin() {
use super::DiffFactor;
use crate::factor::{VarStore, margin::MarginFactor, trunc::TruncFactor};
let mut vars = VarStore::new();
let dt = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let dm = vars.alloc(Gaussian::from_ms(0.0, 6.0));
let mut t = DiffFactor::Trunc(TruncFactor::new(dt, 0.0, false));
let mut m = DiffFactor::Margin(MarginFactor::new(dm, 5.0, 1.0));
let _ = t.propagate(&mut vars, 1.0);
let _ = m.propagate(&mut vars, 1.0);
// Smoke: both diffs got written; their msgs are non-N_INF.
assert!(t.msg().pi() > 0.0);
assert!(m.msg().pi() > 0.0);
assert_eq!(t.diff(), dt);
assert_eq!(m.diff(), dm);
}
#[test]
fn scored_path_sharper_when_margin_is_large() {
let prior = R::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let teams = vec![vec![prior], vec![prior]];
let result = vec![10.0, 0.0]; // a beat b by 10
let weights = [vec![1.0], vec![1.0]];
let mut arena = ScratchArena::new();
let g = Game::scored_with_arena(
teams,
&result,
&weights,
1.0,
crate::ConvergenceOptions::default(),
&mut arena,
);
let p = g.posteriors();
let a = p[0][0];
let b = p[1][0];
assert!(
a.mu() > b.mu(),
"expected team a posterior mu > team b; got {} vs {}",
a.mu(),
b.mu()
);
// Tighter score_sigma should produce a stronger update.
let mut arena2 = ScratchArena::new();
let g_tight = Game::scored_with_arena(
vec![vec![prior], vec![prior]],
&result,
&weights,
0.1,
crate::ConvergenceOptions::default(),
&mut arena2,
);
let p_tight = g_tight.posteriors();
let a_tight = p_tight[0][0];
assert!(
a_tight.mu() > a.mu(),
"expected tighter sigma to push posterior further; {} vs {}",
a_tight.mu(),
a.mu()
);
}
#[test]
fn game_scored_public_ctor() {
use crate::Outcome;
let prior = R::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let opts = GameOptions {
score_sigma: 1.0,
..GameOptions::default()
};
let g = Game::scored(&[&[prior], &[prior]], Outcome::scores([8.0, 2.0]), &opts).unwrap();
let p = g.posteriors();
assert!(p[0][0].mu() > p[1][0].mu());
}
#[test]
fn game_scored_rejects_ranked_outcome() {
let prior = R::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let err = Game::scored(
&[&[prior], &[prior]],
crate::Outcome::winner(0, 2),
&GameOptions::default(),
)
.unwrap_err();
assert!(matches!(err, crate::InferenceError::MismatchedShape { .. }));
}
#[test]
fn game_scored_rejects_zero_score_sigma() {
let prior = R::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
);
let opts = GameOptions {
score_sigma: 0.0,
..GameOptions::default()
};
let err = Game::scored(
&[&[prior], &[prior]],
crate::Outcome::scores([1.0, 0.0]),
&opts,
)
.unwrap_err();
assert!(matches!(
err,
crate::InferenceError::InvalidParameter {
name: "score_sigma",
..
}
));
}
#[test]
fn test_2vs2_weighted() {
let t_a = vec![
@@ -1189,7 +841,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1224,7 +875,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1259,7 +909,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1298,7 +947,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let post_2vs1 = g.posteriors();
@@ -1312,7 +960,6 @@ mod tests {
&[1.0, 0.0],
&w,
0.0,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
);
let p = g.posteriors();
@@ -1322,99 +969,4 @@ mod tests {
assert_ulps_eq!(p[1][0], post_2vs1[1][0], epsilon = 1e-6);
assert_ulps_eq!(p[1][1], t_b[1].prior, epsilon = 1e-6);
}
#[test]
fn run_chain_honours_max_iter_in_convergence_options() {
let players: Vec<R> = (0..4).map(|_| R::default()).collect();
let teams: Vec<Vec<_>> = players.iter().map(|p| vec![*p]).collect();
let result = vec![3.0, 2.0, 1.0, 0.0];
let weights = vec![vec![1.0]; 4];
// Capped at 1 iteration: cannot fully propagate down a 4-team chain.
let mut arena = ScratchArena::new();
let g_capped = Game::ranked_with_arena(
teams.clone(),
&result,
&weights,
0.0,
crate::ConvergenceOptions {
max_iter: 1,
..crate::ConvergenceOptions::default()
},
&mut arena,
);
let posteriors_capped = g_capped.posteriors();
// Same inputs, plenty of iterations: fully converged.
let mut arena = ScratchArena::new();
let g_full = Game::ranked_with_arena(
teams,
&result,
&weights,
0.0,
crate::ConvergenceOptions::default(),
&mut arena,
);
let posteriors_full = g_full.posteriors();
// The two posteriors should differ — capped did not converge.
let mut max_diff: f64 = 0.0;
for (team_capped, team_full) in posteriors_capped.iter().zip(posteriors_full.iter()) {
for (g_capped, g_full) in team_capped.iter().zip(team_full.iter()) {
max_diff = max_diff.max((g_capped.mu() - g_full.mu()).abs());
max_diff = max_diff.max((g_capped.sigma() - g_full.sigma()).abs());
}
}
assert!(
max_diff > 1e-6,
"max_iter=1 should differ from full convergence; max_diff={max_diff}"
);
}
#[test]
fn run_chain_with_damping_converges_to_same_posterior() {
let players: Vec<R> = (0..4).map(|_| R::default()).collect();
let teams: Vec<Vec<_>> = players.iter().map(|p| vec![*p]).collect();
let result = vec![3.0, 2.0, 1.0, 0.0];
let weights = vec![vec![1.0]; 4];
let mut arena = ScratchArena::new();
let g_undamped = Game::ranked_with_arena(
teams.clone(),
&result,
&weights,
0.0,
crate::ConvergenceOptions::default(),
&mut arena,
);
let posteriors_undamped = g_undamped.posteriors();
// alpha=0.5 with extra iterations: should reach the same fixed point.
let mut arena = ScratchArena::new();
let g_damped = Game::ranked_with_arena(
teams,
&result,
&weights,
0.0,
crate::ConvergenceOptions {
alpha: 0.5,
max_iter: 100,
..crate::ConvergenceOptions::default()
},
&mut arena,
);
let posteriors_damped = g_damped.posteriors();
let mut max_diff: f64 = 0.0;
for (team_u, team_d) in posteriors_undamped.iter().zip(posteriors_damped.iter()) {
for (g_u, g_d) in team_u.iter().zip(team_d.iter()) {
max_diff = max_diff.max((g_u.mu() - g_d.mu()).abs());
max_diff = max_diff.max((g_u.sigma() - g_d.sigma()).abs());
}
}
assert!(
max_diff < 1e-4,
"α=0.5 should reach the same fixed point as α=1.0; max_diff={max_diff}"
);
}
}
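The sweep loop in this file stops when the largest message change falls to the tolerance or when an iteration cap is reached (`tuple_gt(step, ...) && iter < ...`). The sketch below isolates that stopping rule. The helpers `step_exceeds` and `step_max` are hypothetical stand-ins for the crate's `tuple_gt` and `tuple_max`, whose exact definitions are not shown in this diff; the sketch assumes they compare and combine the two components independently.

```rust
/// Assumed behaviour of `tuple_gt`: true if either component exceeds `e`.
fn step_exceeds(t: (f64, f64), e: f64) -> bool {
    t.0 > e || t.1 > e
}

/// Assumed behaviour of `tuple_max`: component-wise maximum.
fn step_max(a: (f64, f64), b: (f64, f64)) -> (f64, f64) {
    (a.0.max(b.0), a.1.max(b.1))
}

/// Run `sweep` until the step it reports is at or below `epsilon`, or until
/// `max_iter` sweeps have run. Returns the number of sweeps performed.
fn iterate<F: FnMut() -> (f64, f64)>(mut sweep: F, epsilon: f64, max_iter: usize) -> usize {
    let mut step = (f64::INFINITY, f64::INFINITY);
    let mut iter = 0;
    while step_exceeds(step, epsilon) && iter < max_iter {
        // Reset the step each sweep, then fold in what the sweep reports.
        step = step_max((0.0, 0.0), sweep());
        iter += 1;
    }
    iter
}
```

Passing the tolerance and cap as parameters corresponds to the `ConvergenceOptions` form of the loop; fixing them at `1e-6` and `10` corresponds to the hard-coded form.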
+2 -72
@@ -53,11 +53,7 @@ impl Gaussian {
#[inline]
pub fn mu(&self) -> f64 {
// A non-positive precision is an improper (uninformative) Gaussian — its mean is
// undefined. Treat it like `pi == 0` and return 0. EP message cancellation can land
// `pi` on a tiny negative value (round-off of exactly zero); without this guard
// `tau / pi` would yield a spurious finite mean.
if self.pi <= 0.0 {
if self.pi == 0.0 {
0.0
} else {
self.tau / self.pi
@@ -66,10 +62,7 @@ impl Gaussian {
#[inline]
pub fn sigma(&self) -> f64 {
// A non-positive precision is improper → infinite standard deviation. Guarding
// `pi <= 0.0` (not just `== 0.0`) keeps `1.0 / pi.sqrt()` from returning NaN when EP
// cancellation produces a tiny negative precision (round-off of exactly zero).
if self.pi <= 0.0 {
if self.pi == 0.0 {
f64::INFINITY
} else if self.pi.is_infinite() {
0.0
@@ -103,18 +96,6 @@ impl Gaussian {
let var = self.sigma().powi(2) + variance_delta;
Self::from_ms(self.mu(), var.sqrt())
}
/// EP damping in natural-parameter space: `α·new + (1−α)·self`.
///
/// Used by within-game inference to stabilise oscillating fixed-point
/// loops on hard graphs. `alpha = 1.0` returns `new` exactly;
/// `alpha < 1.0` shrinks each per-step update.
pub fn damp_natural(self, new: Gaussian, alpha: f64) -> Gaussian {
Gaussian::from_natural(
alpha * new.pi() + (1.0 - alpha) * self.pi(),
alpha * new.tau() + (1.0 - alpha) * self.tau(),
)
}
}
impl Default for Gaussian {
@@ -181,28 +162,6 @@ impl ops::Div<Gaussian> for Gaussian {
mod tests {
use super::*;
#[test]
fn non_positive_precision_is_improper_not_nan() {
// EP message cancellation can leave `pi` a tiny negative (round-off of exactly zero).
// Such a Gaussian is improper/uninformative: mu() must be 0 and sigma() infinite, not
// NaN. A NaN here propagates through the moment-space `Sub` in the game chain and
// poisons every skill in the slice.
let tiny_neg = Gaussian::from_natural(-5.55e-17, -8.88e-16);
assert_eq!(tiny_neg.mu(), 0.0);
assert!(tiny_neg.sigma().is_infinite());
// A frankly-negative precision is treated the same way.
let neg = Gaussian::from_natural(-1.0, 2.0);
assert_eq!(neg.mu(), 0.0);
assert!(neg.sigma().is_infinite());
// Subtracting such a message must not produce NaN (the original failure path).
let proper = Gaussian::from_ms(9.75, 1.256);
let diff = proper - tiny_neg;
assert!(diff.pi().is_finite() && !diff.pi().is_nan());
assert!(diff.tau().is_finite() && !diff.tau().is_nan());
}
#[test]
fn test_add() {
let n = Gaussian::from_ms(25.0, 25.0 / 3.0);
@@ -272,33 +231,4 @@ mod tests {
assert!((r.pi() - expected_pi).abs() < 1e-15);
assert!((r.tau() - expected_tau).abs() < 1e-15);
}
#[test]
fn damp_natural_alpha_one_returns_new() {
let old = Gaussian::from_ms(1.0, 2.0);
let new = Gaussian::from_ms(5.0, 0.5);
let damped = old.damp_natural(new, 1.0);
assert_eq!(damped.pi(), new.pi());
assert_eq!(damped.tau(), new.tau());
}
#[test]
fn damp_natural_alpha_zero_returns_self() {
let old = Gaussian::from_ms(1.0, 2.0);
let new = Gaussian::from_ms(5.0, 0.5);
let damped = old.damp_natural(new, 0.0);
assert_eq!(damped.pi(), old.pi());
assert_eq!(damped.tau(), old.tau());
}
#[test]
fn damp_natural_alpha_half_is_midpoint_in_natural_params() {
let old = Gaussian::from_ms(1.0, 2.0);
let new = Gaussian::from_ms(5.0, 0.5);
let damped = old.damp_natural(new, 0.5);
let expected_pi = 0.5 * new.pi() + 0.5 * old.pi();
let expected_tau = 0.5 * new.tau() + 0.5 * old.tau();
assert!((damped.pi() - expected_pi).abs() < 1e-12);
assert!((damped.tau() - expected_tau).abs() < 1e-12);
}
}
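The `mu()` and `sigma()` hunks in this file differ only in whether the improper-Gaussian guard tests `pi <= 0.0` or `pi == 0.0`. The sketch below shows why the two behave differently for a tiny negative precision. It uses a hypothetical `Nat` struct rather than the crate's `Gaussian`, and the method names `sigma_guarded` and `sigma_exact_zero` exist only for this comparison.

```rust
/// Hypothetical natural-parameter Gaussian (not the crate's type).
#[derive(Clone, Copy, Debug)]
struct Nat {
    pi: f64,
    tau: f64,
}

impl Nat {
    /// Mean with the `<= 0.0` guard: any non-positive precision reports 0.
    fn mu_guarded(&self) -> f64 {
        if self.pi <= 0.0 { 0.0 } else { self.tau / self.pi }
    }

    /// Standard deviation with the `<= 0.0` guard: improper means infinite.
    fn sigma_guarded(&self) -> f64 {
        if self.pi <= 0.0 {
            f64::INFINITY
        } else if self.pi.is_infinite() {
            0.0
        } else {
            1.0 / self.pi.sqrt()
        }
    }

    /// Standard deviation with the exact `== 0.0` test only. A negative
    /// precision falls through to `sqrt`, which yields NaN.
    fn sigma_exact_zero(&self) -> f64 {
        if self.pi == 0.0 {
            f64::INFINITY
        } else if self.pi.is_infinite() {
            0.0
        } else {
            1.0 / self.pi.sqrt()
        }
    }
}
```

For a value such as `Nat { pi: -5.55e-17, tau: -8.88e-16 }`, the guarded accessors report an uninformative Gaussian, while the exact-zero variant returns NaN from `sigma_exact_zero`.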
+24 -360
@@ -13,7 +13,7 @@ use crate::{
sort_time,
storage::CompetitorStore,
time::Time,
time_slice::{self, EventKind, TimeSlice},
time_slice::{self, TimeSlice},
tuple_gt, tuple_max,
};
@@ -30,7 +30,6 @@ pub struct HistoryBuilder<
drift: D,
p_draw: f64,
online: bool,
score_sigma: f64,
convergence: ConvergenceOptions,
observer: O,
_time: PhantomData<T>,
@@ -61,7 +60,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
beta: self.beta,
p_draw: self.p_draw,
online: self.online,
score_sigma: self.score_sigma,
convergence: self.convergence,
observer: self.observer,
_time: self._time,
@@ -79,15 +77,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
self
}
pub fn score_sigma(mut self, score_sigma: f64) -> Self {
assert!(
score_sigma > 0.0,
"score_sigma must be positive (got {score_sigma})"
);
self.score_sigma = score_sigma;
self
}
pub fn convergence(mut self, opts: ConvergenceOptions) -> Self {
self.convergence = opts;
self
@@ -101,7 +90,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
drift: self.drift,
p_draw: self.p_draw,
online: self.online,
score_sigma: self.score_sigma,
convergence: self.convergence,
observer,
_time: self._time,
@@ -121,7 +109,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
drift: self.drift,
p_draw: self.p_draw,
online: self.online,
score_sigma: self.score_sigma,
convergence: self.convergence,
observer: self.observer,
}
@@ -137,7 +124,6 @@ impl Default for HistoryBuilder<i64, ConstantDrift, NullObserver, &'static str>
drift: ConstantDrift(GAMMA),
p_draw: P_DRAW,
online: false,
score_sigma: 1.0,
convergence: ConvergenceOptions::default(),
observer: NullObserver,
_time: PhantomData,
@@ -162,7 +148,6 @@ pub struct History<
drift: D,
p_draw: f64,
online: bool,
score_sigma: f64,
convergence: ConvergenceOptions,
observer: O,
}
@@ -189,7 +174,6 @@ impl<K: Eq + Hash + Clone> History<i64, ConstantDrift, NullObserver, K> {
drift: ConstantDrift(GAMMA),
p_draw: P_DRAW,
online: false,
score_sigma: 1.0,
convergence: ConvergenceOptions::default(),
observer: NullObserver,
_time: PhantomData,
@@ -278,45 +262,17 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
/// Note: `key(idx)` is O(n) per lookup; this method is therefore O(n²)
/// in the number of competitors. Acceptable for T2; T3 may optimize.
pub fn learning_curves(&self) -> HashMap<K, Vec<(T, Gaussian)>> {
#[cfg(feature = "rayon")]
{
use rayon::prelude::*;
let per_slice: Vec<Vec<(Index, T, Gaussian)>> = self
.time_slices
.par_iter()
.map(|ts| {
ts.skills
.iter()
.map(|(idx, sk)| (idx, ts.time, sk.posterior()))
.collect()
})
.collect();
let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
for slice_contrib in per_slice {
for (idx, t, g) in slice_contrib {
if let Some(key) = self.keys.key(idx).cloned() {
data.entry(key).or_default().push((t, g));
}
let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
for slice in &self.time_slices {
for (idx, skill) in slice.skills.iter() {
if let Some(key) = self.keys.key(idx).cloned() {
data.entry(key)
.or_default()
.push((slice.time, skill.posterior()));
}
}
data
}
#[cfg(not(feature = "rayon"))]
{
let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
for slice in &self.time_slices {
for (idx, skill) in slice.skills.iter() {
if let Some(key) = self.keys.key(idx).cloned() {
data.entry(key)
.or_default()
.push((slice.time, skill.posterior()));
}
}
}
data
}
data
}
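Both variants of `learning_curves` above end with the same grouping step: each `(key, time, posterior)` entry is appended to that key's vector through `entry(..).or_default().push(..)`. A self-contained sketch of that grouping, using plain `&str` keys, `i64` times, and `f64` values as illustrative stand-ins for `K`, `T`, and `Gaussian`:

```rust
use std::collections::HashMap;

/// Group (key, time, value) observations into one curve per key.
/// Entries for a key keep the order in which they appear in `obs`.
fn group_curves(obs: &[(&str, i64, f64)]) -> HashMap<String, Vec<(i64, f64)>> {
    let mut data: HashMap<String, Vec<(i64, f64)>> = HashMap::new();
    for &(key, t, v) in obs {
        // `or_default` creates an empty Vec the first time a key is seen.
        data.entry(key.to_string()).or_default().push((t, v));
    }
    data
}
```

Because entries are pushed in iteration order, each curve is time-ordered only if the slices are visited in time order, as they are when iterating `time_slices` sequentially.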
/// Skill estimate at the latest time slice the competitor appears in.
@@ -348,23 +304,10 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
}
pub(crate) fn log_evidence_internal(&mut self, forward: bool, targets: &[Index]) -> f64 {
#[cfg(feature = "rayon")]
{
use rayon::prelude::*;
let per_slice: Vec<f64> = self
.time_slices
.par_iter()
.map(|ts| ts.log_evidence(self.online, targets, forward, &self.agents))
.collect();
per_slice.into_iter().sum()
}
#[cfg(not(feature = "rayon"))]
{
self.time_slices
.iter()
.map(|ts| ts.log_evidence(self.online, targets, forward, &self.agents))
.sum()
}
self.time_slices
.iter()
.map(|ts| ts.log_evidence(self.online, targets, forward, &self.agents))
.sum()
}
/// Total log-evidence across the history.
@@ -466,7 +409,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
results: Vec<Vec<f64>>,
times: Vec<T>,
weights: Vec<Vec<Vec<f64>>>,
kinds: Vec<EventKind>,
mut priors: HashMap<Index, Rating<T, D>>,
) -> Result<(), InferenceError> {
if !results.is_empty() && results.len() != composition.len() {
@@ -490,13 +432,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
got: weights.len(),
});
}
if kinds.len() != composition.len() {
return Err(InferenceError::MismatchedShape {
kind: "kinds",
expected: composition.len(),
got: kinds.len(),
});
}
competitor::clean(self.agents.values_mut(), true);
@@ -581,11 +516,9 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
(i..j).map(|e| weights[o[e]].clone()).collect::<Vec<_>>()
};
let kinds_chunk: Vec<EventKind> = (i..j).map(|e| kinds[o[e]]).collect();
if self.time_slices.len() > k && self.time_slices[k].time == t {
let time_slice = &mut self.time_slices[k];
time_slice.add_events(composition, results, weights, kinds_chunk, &self.agents);
time_slice.add_events(composition, results, weights, &self.agents);
for agent_idx in time_slice.skills.keys() {
let agent = self.agents.get_mut(agent_idx).unwrap();
@@ -594,8 +527,8 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
agent.message = time_slice.forward_prior_out(&agent_idx);
}
} else {
let mut time_slice = TimeSlice::new(t, self.p_draw, self.convergence);
time_slice.add_events(composition, results, weights, kinds_chunk, &self.agents);
let mut time_slice = TimeSlice::new(t, self.p_draw);
time_slice.add_events(composition, results, weights, &self.agents);
self.time_slices.insert(k, time_slice);
@@ -652,7 +585,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
vec![vec![1.0, 0.0]],
vec![time],
vec![],
vec![EventKind::Ranked],
HashMap::new(),
)
}
@@ -669,7 +601,6 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
vec![vec![0.0, 0.0]],
vec![time],
vec![],
vec![EventKind::Ranked],
HashMap::new(),
)
}
@@ -694,15 +625,15 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
let mut results: Vec<Vec<f64>> = Vec::with_capacity(events.len());
let mut times: Vec<T> = Vec::with_capacity(events.len());
let mut weights: Vec<Vec<Vec<f64>>> = Vec::with_capacity(events.len());
let mut kinds: Vec<EventKind> = Vec::with_capacity(events.len());
let mut priors: HashMap<Index, Rating<T, D>> = HashMap::new();
for ev in events {
if ev.outcome.team_count() != ev.teams.len() {
let ranks = ev.outcome.as_ranks();
if ranks.len() != ev.teams.len() {
return Err(InferenceError::MismatchedShape {
kind: "outcome vs teams",
kind: "outcome ranks vs teams",
expected: ev.teams.len(),
got: ev.outcome.team_count(),
got: ranks.len(),
});
}
@@ -726,29 +657,13 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
composition.push(event_comp);
weights.push(event_weights);
let event_result: Vec<f64> = match &ev.outcome {
crate::Outcome::Ranked(ranks) => {
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
kinds.push(EventKind::Ranked);
ranks.iter().map(|&r| max_rank - r as f64).collect()
}
crate::Outcome::Scored { scores, sigma } => {
let resolved = sigma.unwrap_or(self.score_sigma);
debug_assert!(
resolved > 0.0,
"resolved score_sigma must be > 0.0 (got {resolved})"
);
kinds.push(EventKind::Scored {
score_sigma: resolved,
});
scores.to_vec()
}
};
results.push(event_result);
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
results.push(inverted);
times.push(ev.time);
}
self.add_events_with_prior(composition, results, times, weights, kinds, priors)
self.add_events_with_prior(composition, results, times, weights, priors)
}
}
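The rank-to-result conversion in the hunk above can be isolated as a small standalone function. This is only a sketch: `invert_ranks` is a hypothetical name, not part of the crate. It assumes, as the `Outcome` docs state, that a lower rank is better and that the engine expects higher result values for better teams.

```rust
/// Hypothetical helper (not part of the crate): converts ranks, where a
/// lower rank is better, into result values, where a higher value is better.
/// Tied ranks map to equal result values.
fn invert_ranks(ranks: &[u32]) -> Vec<f64> {
    // An empty slice yields max_rank = 0.0 and an empty output.
    let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
    ranks.iter().map(|&r| max_rank - r as f64).collect()
}
```

For a two-team event where the first team wins, ranks `[0, 1]` become `[1.0, 0.0]`, which matches the literal result vectors passed to `add_events_with_prior` elsewhere in this file.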
@@ -843,7 +758,6 @@ mod tests {
&[0.0, 1.0],
&w,
P_DRAW,
crate::ConvergenceOptions::default(),
&mut ScratchArena::new(),
)
.posteriors();
@@ -1374,7 +1288,6 @@ mod tests {
h.convergence = ConvergenceOptions {
max_iter: 11,
epsilon: EPSILON,
alpha: 1.0,
};
h.converge().unwrap();
@@ -1692,7 +1605,6 @@ mod tests {
.convergence(ConvergenceOptions {
max_iter: 30,
epsilon: 1e-6,
alpha: 1.0,
})
.build();
@@ -1713,252 +1625,4 @@ mod tests {
assert!(report.iterations < 30);
assert!(report.final_step.0 <= 1e-6);
}
#[test]
#[should_panic(expected = "score_sigma must be positive")]
fn history_builder_rejects_zero_score_sigma() {
let _ = History::builder().score_sigma(0.0).build();
}
#[test]
fn history_propagates_convergence_to_inner_run_chain() {
use crate::ConvergenceOptions;
let events_for =
|h: &mut History<i64, ConstantDrift, crate::observer::NullObserver, &'static str>| {
h.event(0)
.team(["a"])
.team(["b"])
.team(["c"])
.team(["d"])
.ranking([0u32, 1, 2, 3])
.commit()
.unwrap();
};
let mut h_capped: History<i64, _, _, &'static str> = History::builder()
.convergence(ConvergenceOptions {
max_iter: 1,
..ConvergenceOptions::default()
})
.build();
events_for(&mut h_capped);
h_capped.converge().unwrap();
let mut h_full: History<i64, _, _, &'static str> = History::builder().build();
events_for(&mut h_full);
h_full.converge().unwrap();
let curves_capped = h_capped.learning_curves();
let curves_full = h_full.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, capped_pts) in curves_capped.iter() {
let full_pts = curves_full.get(key).expect("agent missing in full");
for (capped, full) in capped_pts.iter().zip(full_pts.iter()) {
max_diff = max_diff.max((capped.1.mu() - full.1.mu()).abs());
max_diff = max_diff.max((capped.1.sigma() - full.1.sigma()).abs());
}
}
assert!(
max_diff > 1e-6,
"max_iter=1 inner loop should differ from default; max_diff={max_diff}"
);
}
#[test]
fn history_with_damping_reaches_same_fixed_point_as_undamped() {
use crate::ConvergenceOptions;
let events_for =
|h: &mut History<i64, ConstantDrift, crate::observer::NullObserver, &'static str>| {
h.event(0)
.team(["a"])
.team(["b"])
.team(["c"])
.team(["d"])
.ranking([0u32, 1, 2, 3])
.commit()
.unwrap();
};
let mut h_undamped: History<i64, _, _, &'static str> = History::builder().build();
events_for(&mut h_undamped);
h_undamped.converge().unwrap();
let mut h_damped: History<i64, _, _, &'static str> = History::builder()
.convergence(ConvergenceOptions {
alpha: 0.5,
max_iter: 200,
..ConvergenceOptions::default()
})
.build();
events_for(&mut h_damped);
h_damped.converge().unwrap();
let curves_u = h_undamped.learning_curves();
let curves_d = h_damped.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, u_pts) in curves_u.iter() {
let d_pts = curves_d.get(key).expect("agent missing in damped");
for (u, d) in u_pts.iter().zip(d_pts.iter()) {
max_diff = max_diff.max((u.1.mu() - d.1.mu()).abs());
max_diff = max_diff.max((u.1.sigma() - d.1.sigma()).abs());
}
}
assert!(
max_diff < 1e-3,
"α=0.5 should reach the same fixed point as α=1.0; max_diff={max_diff}"
);
}
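The removed damping test above relies on a general property of under-relaxed iteration: a damping factor changes the path to the fixed point, not the fixed point itself. The sketch below shows this on a scalar map. It assumes the textbook update `x <- (1 - alpha) * x + alpha * f(x)`; how the crate applied `alpha` internally is not shown in this diff, and `damped_fixed_point` is a hypothetical name.

```rust
/// Hypothetical illustration of damped (under-relaxed) fixed-point iteration.
/// Returns the final iterate and the number of iterations performed.
fn damped_fixed_point(
    f: impl Fn(f64) -> f64,
    x0: f64,
    alpha: f64,
    max_iter: usize,
    epsilon: f64,
) -> (f64, usize) {
    let mut x = x0;
    for iter in 1..=max_iter {
        // alpha = 1.0 is the undamped update; alpha < 1.0 blends old and new.
        let next = (1.0 - alpha) * x + alpha * f(x);
        let step = (next - x).abs();
        x = next;
        if step <= epsilon {
            return (x, iter);
        }
    }
    (x, max_iter)
}
```

If `x` satisfies `x == f(x)`, then `(1 - alpha) * x + alpha * f(x) == x` for any `alpha`, so both variants share the same fixed points; only the convergence rate differs.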
#[test]
fn outcome_scores_default_sigma_uses_history_default() {
use crate::Outcome;
// Path A: explicit sigma=0.5 via override.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 0.5),
}])
.unwrap();
h_a.converge().unwrap();
// Path B: history-wide default 0.5, no per-event override.
let mut h_b = crate::History::builder().score_sigma(0.5).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_b.converge().unwrap();
// Inheritance: posteriors must be bit-equal.
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing in path B");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
}
#[test]
fn outcome_scores_with_sigma_overrides_history_default() {
use crate::Outcome;
// Path A: history-wide default 0.5, per-event override 2.0.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
}])
.unwrap();
h_a.converge().unwrap();
// Path B: history-wide default 2.0, no per-event override.
let mut h_b = crate::History::builder().score_sigma(2.0).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_b.converge().unwrap();
// Override == default-set-to-the-override-value: bit-equal.
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing in path B");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
// Path C: history-wide default 0.5, no override. Different sigma → different posteriors.
let mut h_c = crate::History::builder().score_sigma(0.5).build();
h_c.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores([3.0, 1.0]),
}])
.unwrap();
h_c.converge().unwrap();
let curves_c = h_c.learning_curves();
let mut max_diff: f64 = 0.0;
for (key, a_pts) in curves_a.iter() {
let c_pts = curves_c.get(key).expect("agent missing in path C");
for (a, c) in a_pts.iter().zip(c_pts.iter()) {
max_diff = max_diff.max((a.1.mu() - c.1.mu()).abs());
max_diff = max_diff.max((a.1.sigma() - c.1.sigma()).abs());
}
}
assert!(
max_diff > 1e-6,
"override should produce different posteriors from inherited default; max_diff={max_diff}"
);
}
#[test]
fn event_builder_scores_with_sigma_threading() {
use crate::Outcome;
// Path A: builder fluent API with sigma override.
let mut h_a = crate::History::builder().score_sigma(0.5).build();
h_a.event(0_i64)
.team(["a"])
.team(["b"])
.scores_with_sigma([3.0, 1.0], 2.0)
.commit()
.unwrap();
h_a.converge().unwrap();
// Path B: same outcome via the explicit Outcome constructor.
let mut h_b = crate::History::builder().score_sigma(0.5).build();
h_b.add_events([crate::Event {
time: 0_i64,
teams: smallvec::smallvec![
crate::Team::with_members([crate::Member::new("a")]),
crate::Team::with_members([crate::Member::new("b")]),
],
outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
}])
.unwrap();
h_b.converge().unwrap();
let curves_a = h_a.learning_curves();
let curves_b = h_b.learning_curves();
for (key, a_pts) in curves_a.iter() {
let b_pts = curves_b.get(key).expect("agent missing");
for (a, b) in a_pts.iter().zip(b_pts.iter()) {
assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
}
}
}
}
+1 -2
@@ -8,8 +8,7 @@ mod approx;
pub(crate) mod arena;
mod time;
mod time_slice;
pub use time_slice::{EventKind, TimeSlice};
mod color_group;
pub use time_slice::TimeSlice;
mod competitor;
mod convergence;
pub mod drift;
+3 -2
@@ -9,8 +9,9 @@ use crate::time::Time;
/// Receives progress callbacks during `History::converge`.
///
/// All methods have default no-op implementations; implement only what's
/// interesting.
pub trait Observer<T: Time>: Send + Sync {
/// interesting. `Send + Sync` bounds are not required in T2; they are added in T3
/// together with Rayon support.
pub trait Observer<T: Time> {
/// Called after each convergence iteration across the whole history.
fn on_iteration_end(&self, _iter: usize, _max_step: (f64, f64)) {}
+11 -101
@@ -1,7 +1,8 @@
//! Outcome of a match.
//!
//! `Ranked(ranks)` for ordinal results; `Scored { scores, sigma }` for
//! continuous per-team scores (engages `MarginFactor` in the engine).
//! In T2, only `Ranked` is supported; `Scored` will be added together with
//! `MarginFactor` in T4. The enum is `#[non_exhaustive]` so adding `Scored`
//! is non-breaking for downstream `match` expressions.
use smallvec::SmallVec;
@@ -9,25 +10,14 @@ use smallvec::SmallVec;
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
///
/// `Scored { scores, sigma }`: higher score = better. Adjacent (sorted) pairs
/// feed observed margins to `MarginFactor`. `scores.len()` must equal the
/// number of teams in the event. `sigma` overrides `HistoryBuilder::score_sigma`
/// when `Some`; `None` inherits the history default.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
Ranked(SmallVec<[u32; 4]>),
Scored {
scores: SmallVec<[f64; 4]>,
/// Per-event noise override. `None` means inherit
/// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
sigma: Option<f64>,
},
}
impl Outcome {
/// `n`-team outcome where team `winner` won and everyone else tied for last.
/// `N`-team outcome where team `winner` won and everyone else tied for last.
///
/// Panics if `winner >= n`.
pub fn winner(winner: u32, n: u32) -> Self {
@@ -46,44 +36,16 @@ impl Outcome {
Self::Ranked(ranks.into_iter().collect())
}
/// Explicit per-team continuous scores; higher = better.
/// Inherits `HistoryBuilder::score_sigma` for the noise model.
pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
Self::Scored {
scores: scores.into_iter().collect(),
sigma: None,
}
}
/// Explicit per-team continuous scores with a per-event noise override.
///
/// `sigma` must be `> 0.0`; debug-asserts otherwise.
pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(scores: I, sigma: f64) -> Self {
debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
Self::Scored {
scores: scores.into_iter().collect(),
sigma: Some(sigma),
}
}
pub fn team_count(&self) -> usize {
match self {
Self::Ranked(r) => r.len(),
Self::Scored { scores, .. } => scores.len(),
}
}
pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
#[allow(dead_code)]
pub(crate) fn as_ranks(&self) -> &[u32] {
match self {
Self::Ranked(r) => Some(r),
Self::Scored { .. } => None,
}
}
pub(crate) fn as_scores(&self) -> Option<&[f64]> {
match self {
Self::Scored { scores, .. } => Some(scores),
Self::Ranked(_) => None,
Self::Ranked(r) => r,
}
}
}
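The post-change shape of `Outcome` can be sketched as a self-contained type. This is an approximation under stated assumptions: it uses `Vec<u32>` in place of `SmallVec<[u32; 4]>` to avoid the external dependency, and the bodies of `winner` and `draw` are inferred from the doc comments and the tests in this file rather than copied from the crate.

```rust
/// Stand-in for the crate's `Outcome` (assumption: `Vec` replaces `SmallVec`).
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
    /// Lower rank = better; equal ranks mean a tie.
    Ranked(Vec<u32>),
}

impl Outcome {
    /// Team `winner` gets rank 0; every other team ties at rank 1.
    /// Panics if `winner >= n`.
    pub fn winner(winner: u32, n: u32) -> Self {
        assert!(winner < n, "winner index out of range");
        Self::Ranked((0..n).map(|i| if i == winner { 0 } else { 1 }).collect())
    }

    /// All `n` teams tie at rank 0.
    pub fn draw(n: u32) -> Self {
        Self::Ranked(vec![0; n as usize])
    }

    /// With a single variant the accessor can return the slice directly.
    /// Inside the defining crate this match is exhaustive even though the
    /// enum is marked `#[non_exhaustive]`.
    pub fn as_ranks(&self) -> &[u32] {
        match self {
            Self::Ranked(r) => r,
        }
    }
}
```

Because of `#[non_exhaustive]`, downstream crates must include a wildcard arm when matching on `Outcome`, which is what lets a later variant be added without breaking them.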
@@ -95,26 +57,26 @@ mod tests {
#[test]
fn winner_two_teams() {
let o = Outcome::winner(0, 2);
assert_eq!(o.as_ranks(), Some(&[0u32, 1][..]));
assert_eq!(o.as_ranks(), &[0u32, 1]);
assert_eq!(o.team_count(), 2);
}
#[test]
fn winner_three_teams_second_wins() {
let o = Outcome::winner(1, 3);
assert_eq!(o.as_ranks(), Some(&[1u32, 0, 1][..]));
assert_eq!(o.as_ranks(), &[1u32, 0, 1]);
}
#[test]
fn draw_three_teams() {
let o = Outcome::draw(3);
assert_eq!(o.as_ranks(), Some(&[0u32, 0, 0][..]));
assert_eq!(o.as_ranks(), &[0u32, 0, 0]);
}
#[test]
fn ranking_from_iter() {
let o = Outcome::ranking([2, 0, 1]);
assert_eq!(o.as_ranks(), Some(&[2u32, 0, 1][..]));
assert_eq!(o.as_ranks(), &[2u32, 0, 1]);
}
#[test]
@@ -122,56 +84,4 @@ mod tests {
fn winner_out_of_range_panics() {
let _ = Outcome::winner(2, 2);
}
#[test]
fn scored_two_teams() {
let o = Outcome::scores([10.0, 4.0]);
assert_eq!(o.team_count(), 2);
assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
assert_eq!(o.as_ranks(), None);
}
#[test]
fn scored_team_count_matches_input() {
let o = Outcome::scores([3.0, 1.0, 2.0, 0.0]);
assert_eq!(o.team_count(), 4);
}
#[test]
fn ranked_as_scores_returns_none() {
let o = Outcome::winner(0, 2);
assert!(o.as_scores().is_none());
assert!(o.as_ranks().is_some());
}
#[test]
fn scores_with_sigma_round_trips() {
let o = Outcome::scores_with_sigma([10.0, 4.0], 0.5);
assert_eq!(o.team_count(), 2);
assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
}
#[test]
fn scores_constructor_leaves_sigma_unset() {
let o = Outcome::scores([3.0, 1.0]);
match o {
Outcome::Scored { scores: _, sigma } => assert!(sigma.is_none()),
Outcome::Ranked(_) => panic!("expected Scored variant"),
}
}
#[test]
fn scores_with_sigma_sets_sigma_some() {
let o = Outcome::scores_with_sigma([3.0, 1.0], 2.0);
match o {
Outcome::Scored { scores: _, sigma } => assert_eq!(sigma, Some(2.0)),
Outcome::Ranked(_) => panic!("expected Scored variant"),
}
}
#[test]
#[should_panic(expected = "score_sigma must be > 0.0")]
fn scores_with_sigma_rejects_zero() {
let _ = Outcome::scores_with_sigma([3.0, 1.0], 0.0);
}
}
+1 -1
@@ -16,7 +16,7 @@ pub struct ScheduleReport {
}
/// Drives factor propagation to convergence.
pub trait Schedule: Send + Sync {
pub trait Schedule {
fn run(&self, factors: &mut [BuiltinFactor], vars: &mut VarStore) -> ScheduleReport;
}
+1 -1
@@ -8,7 +8,7 @@
///
/// Must be `Ord + Copy` so slices can sort events, and `'static` so
/// `History` can store it by value without lifetimes.
pub trait Time: Copy + Ord + Send + Sync + 'static {
pub trait Time: Copy + Ord + 'static {
/// How much time elapsed between `self` and `later`.
///
/// Used by `Drift<T>::variance_delta` to compute skill drift. Returning
+50 -334
@@ -7,7 +7,6 @@ use std::collections::HashMap;
use crate::{
Index, N_INF,
arena::ScratchArena,
color_group::ColorGroups,
drift::Drift,
game::Game,
gaussian::Gaussian,
@@ -44,13 +43,6 @@ impl Default for Skill {
}
}
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]
pub enum EventKind {
Ranked,
Scored { score_sigma: f64 },
}
#[derive(Debug)]
struct Item {
agent: Index,
@@ -89,16 +81,9 @@ pub(crate) struct Event {
teams: Vec<Team>,
evidence: f64,
weights: Vec<Vec<f64>>,
kind: EventKind,
}
impl Event {
pub(crate) fn iter_agents(&self) -> impl Iterator<Item = Index> + '_ {
self.teams
.iter()
.flat_map(|t| t.items.iter().map(|it| it.agent))
}
fn outputs(&self) -> Vec<f64> {
self.teams
.iter()
@@ -123,46 +108,6 @@ impl Event {
})
.collect::<Vec<_>>()
}
/// Direct in-loop update: mutates self and `skills` inline with no
/// intermediate allocation. Used by both the sequential sweep path and,
/// via unsafe, by the parallel rayon path for events in the same color
/// group (which have disjoint agent sets — see `sweep_color_groups`).
fn iteration_direct<T: Time, D: Drift<T>>(
&mut self,
skills: &mut SkillStore,
agents: &CompetitorStore<T, D>,
p_draw: f64,
convergence: crate::ConvergenceOptions,
arena: &mut ScratchArena,
) {
let teams = self.within_priors(false, false, skills, agents);
let result = self.outputs();
let g = match self.kind {
EventKind::Ranked => {
Game::ranked_with_arena(teams, &result, &self.weights, p_draw, convergence, arena)
}
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&self.weights,
score_sigma,
convergence,
arena,
),
};
for (t, team) in self.teams.iter_mut().enumerate() {
for (i, item) in team.items.iter_mut().enumerate() {
let old_likelihood = skills.get(item.agent).unwrap().likelihood;
let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
item.likelihood = g.likelihoods[t][i];
}
}
self.evidence = g.evidence;
}
}
#[derive(Debug)]
@@ -171,64 +116,25 @@ pub struct TimeSlice<T: Time = i64> {
pub(crate) skills: SkillStore,
pub(crate) time: T,
p_draw: f64,
pub(crate) convergence: crate::ConvergenceOptions,
arena: ScratchArena,
pub(crate) color_groups: ColorGroups,
}
impl<T: Time> TimeSlice<T> {
pub fn new(time: T, p_draw: f64, convergence: crate::ConvergenceOptions) -> Self {
pub fn new(time: T, p_draw: f64) -> Self {
Self {
events: Vec::new(),
skills: SkillStore::new(),
time,
p_draw,
convergence,
arena: ScratchArena::new(),
color_groups: ColorGroups::new(),
}
}
/// Recompute the color-group partition and reorder `self.events` into
/// color-contiguous ranges. After this call, `self.color_groups.groups[c]`
/// contains a contiguous ascending range of indices in `self.events`.
pub(crate) fn recompute_color_groups(&mut self) {
use crate::color_group::color_greedy;
let n = self.events.len();
if n == 0 {
self.color_groups = ColorGroups::new();
return;
}
let cg = color_greedy(n, |ev_idx| {
self.events[ev_idx].iter_agents().collect::<Vec<_>>()
});
let mut reordered: Vec<Event> = Vec::with_capacity(n);
let mut new_groups: Vec<Vec<usize>> = Vec::with_capacity(cg.groups.len());
let mut taken: Vec<Option<Event>> = self.events.drain(..).map(Some).collect();
for group in &cg.groups {
let mut new_indices: Vec<usize> = Vec::with_capacity(group.len());
for &old_idx in group {
let ev = taken[old_idx].take().expect("event already taken");
new_indices.push(reordered.len());
reordered.push(ev);
}
new_groups.push(new_indices);
}
self.events = reordered;
self.color_groups = ColorGroups { groups: new_groups };
}
pub fn add_events<D: Drift<T>>(
&mut self,
composition: Vec<Vec<Vec<Index>>>,
results: Vec<Vec<f64>>,
weights: Vec<Vec<Vec<f64>>>,
kinds: Vec<EventKind>,
agents: &CompetitorStore<T, D>,
) {
let mut unique = Vec::with_capacity(10);
@@ -298,7 +204,6 @@ impl<T: Time> TimeSlice<T> {
teams,
evidence: 0.0,
weights,
kind: kinds[e],
}
});
@@ -307,7 +212,6 @@ impl<T: Time> TimeSlice<T> {
self.events.extend(events);
self.iteration(from, agents);
self.recompute_color_groups();
}
pub(crate) fn posteriors(&self) -> HashMap<Index, Gaussian> {
@@ -318,147 +222,33 @@ impl<T: Time> TimeSlice<T> {
}
pub fn iteration<D: Drift<T>>(&mut self, from: usize, agents: &CompetitorStore<T, D>) {
if from > 0 || self.color_groups.is_empty() {
// Initial pass (add_events) or no color groups yet: simple sequential sweep.
for event in self.events.iter_mut().skip(from) {
let teams = event.within_priors(false, false, &self.skills, agents);
let result = event.outputs();
for event in self.events.iter_mut().skip(from) {
let teams = event.within_priors(false, false, &self.skills, agents);
let result = event.outputs();
let g = match event.kind {
EventKind::Ranked => Game::ranked_with_arena(
teams,
&result,
&event.weights,
self.p_draw,
self.convergence,
&mut self.arena,
),
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&event.weights,
score_sigma,
self.convergence,
&mut self.arena,
),
};
let g = Game::ranked_with_arena(
teams,
&result,
&event.weights,
self.p_draw,
&mut self.arena,
);
for (t, team) in event.teams.iter_mut().enumerate() {
for (i, item) in team.items.iter_mut().enumerate() {
let old_likelihood = self.skills.get(item.agent).unwrap().likelihood;
let new_likelihood =
(old_likelihood / item.likelihood) * g.likelihoods[t][i];
self.skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
item.likelihood = g.likelihoods[t][i];
}
}
event.evidence = g.evidence;
}
} else {
self.sweep_color_groups(agents);
}
}
/// Full event sweep using the color-group partition. Colors are processed
/// sequentially; within each color the inner loop is parallel under rayon.
///
/// Events within each color group touch disjoint agent sets (guaranteed by
/// the greedy coloring). This lets each rayon thread write directly to its
/// events' skill likelihoods without a deferred-apply step, matching the
/// sequential path's allocation profile. The unsafe block is sound because:
/// 1. `self.events[range]` and `self.skills` are separate fields → disjoint.
/// 2. Events in the same color group access disjoint `Index` values in
/// `self.skills`, so concurrent writes land on different memory locations.
/// 3. Each event only writes to its own items' likelihoods (no sharing).
#[cfg(feature = "rayon")]
fn sweep_color_groups<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
use rayon::prelude::*;
thread_local! {
static ARENA: std::cell::RefCell<ScratchArena> =
std::cell::RefCell::new(ScratchArena::new());
}
// Minimum color-group size to justify rayon's task-spawn overhead.
// Below this threshold, process events sequentially to avoid regression
// on small per-slice workloads.
const RAYON_THRESHOLD: usize = 64;
for color_idx in 0..self.color_groups.groups.len() {
let group_len = self.color_groups.groups[color_idx].len();
if group_len == 0 {
continue;
}
let range = self.color_groups.color_range(color_idx);
let p_draw = self.p_draw;
let convergence = self.convergence;
if group_len >= RAYON_THRESHOLD {
// Obtain a raw pointer from the unique `&mut self.skills` reference.
// Casting back to `&mut` inside the closure is sound because:
// 1. The pointer originates from a `&mut` — no aliasing with shared refs.
// 2. Events in the same color group touch disjoint `Index` slots in the
// underlying Vec, so concurrent writes from different threads land on
// different memory locations — no data race.
// 3. `self.events[range]` and `self.skills` are separate struct fields,
// so the borrow splits cleanly.
let skills_addr: usize = (&mut self.skills as *mut SkillStore) as usize;
self.events[range].par_iter_mut().for_each(move |ev| {
// SAFETY: see above.
let skills: &mut SkillStore = unsafe { &mut *(skills_addr as *mut SkillStore) };
ARENA.with(|cell| {
let mut arena = cell.borrow_mut();
arena.reset();
ev.iteration_direct(skills, agents, p_draw, convergence, &mut arena);
});
});
} else {
for ev in &mut self.events[range] {
ev.iteration_direct(
&mut self.skills,
agents,
p_draw,
self.convergence,
&mut self.arena,
);
for (t, team) in event.teams.iter_mut().enumerate() {
for (i, item) in team.items.iter_mut().enumerate() {
let old_likelihood = self.skills.get(item.agent).unwrap().likelihood;
let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
self.skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
item.likelihood = g.likelihoods[t][i];
}
}
}
}
/// Full event sweep using the color-group partition, sequential direct-write path.
/// Events within each color group are updated inline — no EventOutput allocation —
/// matching the T2 performance profile.
#[cfg(not(feature = "rayon"))]
fn sweep_color_groups<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
for color_idx in 0..self.color_groups.groups.len() {
if self.color_groups.groups[color_idx].is_empty() {
continue;
}
let range = self.color_groups.color_range(color_idx);
// Borrow self.events as a mutable slice for this color range.
// self.skills and self.arena are separate fields — disjoint borrows are
// allowed within a single method body.
let p_draw = self.p_draw;
for ev in &mut self.events[range] {
ev.iteration_direct(
&mut self.skills,
agents,
p_draw,
self.convergence,
&mut self.arena,
);
}
event.evidence = g.evidence;
}
}
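The update `(old_likelihood / item.likelihood) * g.likelihoods[t][i]` in `iteration` replaces one event's contribution to an agent's accumulated likelihood. A minimal sketch of why this works, assuming Gaussians stored in natural parameters (precision `pi` and precision-adjusted mean `tau`, the quantities the tests read via `pi()` and `tau()`): multiplication adds the parameters and division subtracts them. `NaturalGaussian` and `replace_message` are hypothetical stand-ins, not the crate's `Gaussian` type.

```rust
use std::ops::{Div, Mul};

/// Hypothetical stand-in for a Gaussian in natural parameters.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NaturalGaussian {
    pi: f64,  // precision, 1 / sigma^2
    tau: f64, // precision-adjusted mean, mu / sigma^2
}

impl Mul for NaturalGaussian {
    type Output = Self;
    // The product of two Gaussian densities adds natural parameters.
    fn mul(self, rhs: Self) -> Self {
        Self { pi: self.pi + rhs.pi, tau: self.tau + rhs.tau }
    }
}

impl Div for NaturalGaussian {
    type Output = Self;
    // Dividing removes a previously multiplied-in factor.
    fn div(self, rhs: Self) -> Self {
        Self { pi: self.pi - rhs.pi, tau: self.tau - rhs.tau }
    }
}

/// Swap an event's old message for its new one inside the running product.
fn replace_message(
    total: NaturalGaussian,
    old: NaturalGaussian,
    new: NaturalGaussian,
) -> NaturalGaussian {
    (total / old) * new
}
```

The event keeps its own last message (`item.likelihood`) so that the next sweep can divide it out again; without that stored copy the running product would count the event twice.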
#[allow(dead_code)]
pub(crate) fn iterate_to_convergence<D: Drift<T>>(
&mut self,
agents: &CompetitorStore<T, D>,
) -> usize {
pub(crate) fn convergence<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) -> usize {
let epsilon = 1e-6;
let iterations = 20;
@@ -526,38 +316,21 @@ impl<T: Time> TimeSlice<T> {
// log_evidence is infrequent; a local arena avoids needing &mut self.
let mut arena = ScratchArena::new();
let run_event = |event: &Event, arena: &mut ScratchArena| -> f64 {
let teams = event.within_priors(online, forward, &self.skills, agents);
let result = event.outputs();
match event.kind {
EventKind::Ranked => Game::ranked_with_arena(
teams,
&result,
&event.weights,
self.p_draw,
self.convergence,
arena,
)
.evidence
.ln(),
EventKind::Scored { score_sigma } => Game::scored_with_arena(
teams,
&result,
&event.weights,
score_sigma,
self.convergence,
arena,
)
.evidence
.ln(),
}
};
if targets.is_empty() {
if online || forward {
self.events
.iter()
.map(|event| run_event(event, &mut arena))
.map(|event| {
Game::ranked_with_arena(
event.within_priors(online, forward, &self.skills, agents),
&event.outputs(),
&event.weights,
self.p_draw,
&mut arena,
)
.evidence
.ln()
})
.sum()
} else {
self.events.iter().map(|event| event.evidence.ln()).sum()
@@ -565,14 +338,25 @@ impl<T: Time> TimeSlice<T> {
} else if online || forward {
self.events
.iter()
.filter(|event| {
.enumerate()
.filter(|(_, event)| {
event
.teams
.iter()
.flat_map(|team| &team.items)
.any(|item| targets.contains(&item.agent))
})
.map(|event| run_event(event, &mut arena))
.map(|(_, event)| {
Game::ranked_with_arena(
event.within_priors(online, forward, &self.skills, agents),
&event.outputs(),
&event.weights,
self.p_draw,
&mut arena,
)
.evidence
.ln()
})
.sum()
} else {
self.events
@@ -657,7 +441,7 @@ mod tests {
);
}
let mut time_slice = TimeSlice::new(0i64, 0.0, crate::ConvergenceOptions::default());
let mut time_slice = TimeSlice::new(0i64, 0.0);
time_slice.add_events(
vec![
@@ -667,7 +451,6 @@ mod tests {
],
vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
vec![],
vec![EventKind::Ranked; 3],
&agents,
);
@@ -704,7 +487,7 @@ mod tests {
epsilon = 1e-6
);
assert_eq!(time_slice.iterate_to_convergence(&agents), 1);
assert_eq!(time_slice.convergence(&agents), 1);
}
#[test]
@@ -734,7 +517,7 @@ mod tests {
);
}
let mut time_slice = TimeSlice::new(0i64, 0.0, crate::ConvergenceOptions::default());
let mut time_slice = TimeSlice::new(0i64, 0.0);
time_slice.add_events(
vec![
@@ -744,7 +527,6 @@ mod tests {
],
vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
vec![],
vec![EventKind::Ranked; 3],
&agents,
);
@@ -766,7 +548,7 @@ mod tests {
epsilon = 1e-6
);
assert!(time_slice.iterate_to_convergence(&agents) > 1);
assert!(time_slice.convergence(&agents) > 1);
let post = time_slice.posteriors();
@@ -814,7 +596,7 @@ mod tests {
);
}
let mut time_slice = TimeSlice::new(0i64, 0.0, crate::ConvergenceOptions::default());
let mut time_slice = TimeSlice::new(0i64, 0.0);
time_slice.add_events(
vec![
@@ -824,11 +606,10 @@ mod tests {
],
vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
vec![],
vec![EventKind::Ranked; 3],
&agents,
);
time_slice.iterate_to_convergence(&agents);
time_slice.convergence(&agents);
let post = time_slice.posteriors();
@@ -856,13 +637,12 @@ mod tests {
],
vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
vec![],
vec![EventKind::Ranked; 3],
&agents,
);
assert_eq!(time_slice.events.len(), 6);
time_slice.iterate_to_convergence(&agents);
time_slice.convergence(&agents);
let post = time_slice.posteriors();
@@ -882,68 +662,4 @@ mod tests {
epsilon = 1e-6
);
}
#[test]
fn time_slice_color_groups_reorders_events() {
// ev0: [a, b]; ev1: [c, d]; ev2: [a, c]
// Greedy coloring: ev0→c0, ev1→c0 (disjoint), ev2→c1 (overlaps both).
// After recompute_color_groups, physical order is [ev0, ev1, ev2]
// and groups == [[0, 1], [2]].
let mut index_map = KeyTable::new();
let a = index_map.get_or_create("a");
let b = index_map.get_or_create("b");
let c = index_map.get_or_create("c");
let d = index_map.get_or_create("d");
let mut agents: CompetitorStore<i64, ConstantDrift> = CompetitorStore::new();
for agent in [a, b, c, d] {
agents.insert(
agent,
Competitor {
rating: Rating::new(
Gaussian::from_ms(25.0, 25.0 / 3.0),
25.0 / 6.0,
ConstantDrift(25.0 / 300.0),
),
..Default::default()
},
);
}
let mut ts = TimeSlice::new(0i64, 0.0, crate::ConvergenceOptions::default());
ts.add_events(
vec![
vec![vec![a], vec![b]],
vec![vec![c], vec![d]],
vec![vec![a], vec![c]],
],
vec![vec![1.0, 0.0], vec![1.0, 0.0], vec![1.0, 0.0]],
vec![],
vec![EventKind::Ranked; 3],
&agents,
);
assert_eq!(ts.color_groups.n_colors(), 2);
assert_eq!(ts.color_groups.groups[0], vec![0, 1]);
assert_eq!(ts.color_groups.groups[1], vec![2]);
assert_eq!(ts.color_groups.color_range(0), 0..2);
assert_eq!(ts.color_groups.color_range(1), 2..3);
// Events at positions 0 and 1 (color 0) must have disjoint agent sets,
// while the event at position 2 must share an agent with at least one of
// them (which is why it needed its own color).
let agents_in_ev2: Vec<Index> = ts.events[2].iter_agents().collect();
let agents_in_ev0: Vec<Index> = ts.events[0].iter_agents().collect();
let agents_in_ev1: Vec<Index> = ts.events[1].iter_agents().collect();
// ev0 and ev1 must be disjoint from each other (color-0 invariant).
assert!(agents_in_ev0.iter().all(|ag| !agents_in_ev1.contains(ag)));
// ev2 must share an agent with ev0 or ev1 (it needed its own color).
let ev2_overlaps_ev0 = agents_in_ev2.iter().any(|ag| agents_in_ev0.contains(ag));
let ev2_overlaps_ev1 = agents_in_ev2.iter().any(|ag| agents_in_ev1.contains(ag));
assert!(ev2_overlaps_ev0 || ev2_overlaps_ev1);
}
}
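The removed `time_slice_color_groups_reorders_events` test describes a greedy coloring in which events sharing no agent may share a color. A rough sketch of that idea follows. It is not the crate's `color_greedy` (whose implementation is not shown in this diff); `greedy_color` is a hypothetical name, and agents are plain `usize` values standing in for `Index`.

```rust
use std::collections::HashSet;

/// Hypothetical greedy coloring: each event goes into the first color group
/// whose agents are all disjoint from the event's agents, or into a new
/// group if none qualifies. Returns event indices grouped by color.
fn greedy_color(events: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut groups: Vec<Vec<usize>> = Vec::new();
    // Agents already claimed by each color group.
    let mut used: Vec<HashSet<usize>> = Vec::new();

    for (idx, agents) in events.iter().enumerate() {
        let slot = used
            .iter()
            .position(|set| agents.iter().all(|a| !set.contains(a)));
        let color = match slot {
            Some(c) => c,
            None => {
                groups.push(Vec::new());
                used.push(HashSet::new());
                groups.len() - 1
            }
        };
        groups[color].push(idx);
        used[color].extend(agents.iter().copied());
    }
    groups
}
```

With agents `a..d` numbered `0..3`, the test's events `[a, b]`, `[c, d]`, `[a, c]` land in groups `[[0, 1], [2]]`, the partition the removed assertions expected.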
-24
@@ -15,7 +15,6 @@ fn add_events_bulk_via_iter() {
.convergence(ConvergenceOptions {
max_iter: 30,
epsilon: 1e-6,
alpha: 1.0,
})
.build();
@@ -224,26 +223,3 @@ fn predict_outcome_two_teams_sums_to_one() {
assert!((p[0] + p[1] - 1.0).abs() < 1e-9);
assert!(p[0] > p[1]);
}
#[test]
fn fluent_event_builder_scores() {
use trueskill_tt::ConstantDrift;
let mut h = History::builder()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(0.0))
.build();
h.event(1)
.team(["alice"])
.team(["bob"])
.scores([12.0, 4.0])
.commit()
.unwrap();
h.converge().unwrap();
let a = h.current_skill(&"alice").unwrap();
let b = h.current_skill(&"bob").unwrap();
assert!(a.mu() > b.mu());
}
-101
@@ -1,101 +0,0 @@
//! Determinism tests: identical posteriors across RAYON_NUM_THREADS
//! values. Only compiled with the `rayon` feature.
#![cfg(feature = "rayon")]
use smallvec::smallvec;
use trueskill_tt::{ConstantDrift, ConvergenceOptions, Event, History, Member, Outcome, Team};
/// Build a deterministic workload using a simple LCG (no external rand crate).
fn build_and_converge(seed: u64) -> Vec<(i64, trueskill_tt::Gaussian)> {
let mut h = History::<i64, _, _, String>::builder_with_key()
.mu(25.0)
.sigma(25.0 / 3.0)
.beta(25.0 / 6.0)
.drift(ConstantDrift(25.0 / 300.0))
.convergence(ConvergenceOptions {
max_iter: 30,
epsilon: 1e-6,
alpha: 1.0,
})
.build();
// LCG for deterministic pseudo-random ints.
let mut rng = seed;
let mut next = || {
rng = rng
.wrapping_mul(6364136223846793005)
.wrapping_add(1442695040888963407);
rng
};
let mut events: Vec<Event<i64, String>> = Vec::with_capacity(200);
for ev_i in 0..200 {
let a = (next() % 40) as usize;
let mut b = (next() % 40) as usize;
while b == a {
b = (next() % 40) as usize;
}
// ~10 events per slice so color groups have material parallelism.
events.push(Event {
time: (ev_i as i64 / 10) + 1,
teams: smallvec![
Team::with_members([Member::new(format!("p{a}"))]),
Team::with_members([Member::new(format!("p{b}"))]),
],
outcome: Outcome::winner((next() % 2) as u32, 2),
});
}
h.add_events(events).unwrap();
h.converge().unwrap();
// Sample one competitor's curve for the comparison.
h.learning_curve("p0")
}
#[test]
fn posteriors_identical_across_thread_counts() {
let sizes = [1usize, 2, 4, 8];
let mut results: Vec<Vec<(i64, trueskill_tt::Gaussian)>> = Vec::new();
for &n in &sizes {
let pool = rayon::ThreadPoolBuilder::new()
.num_threads(n)
.build()
.expect("rayon pool build");
let curve = pool.install(|| build_and_converge(42));
results.push(curve);
}
let reference = &results[0];
for (i, curve) in results.iter().enumerate().skip(1) {
assert_eq!(
curve.len(),
reference.len(),
"curve length differs at {n} threads",
n = sizes[i],
);
for (j, (&(t_ref, g_ref), &(t, g))) in reference.iter().zip(curve.iter()).enumerate() {
assert_eq!(
t_ref,
t,
"time point {j} differs at {n} threads: ref={t_ref} vs got={t}",
n = sizes[i],
);
assert_eq!(
g_ref.mu().to_bits(),
g.mu().to_bits(),
"mu bits differ at {n} threads, time {t}: ref={ref_mu} got={got_mu}",
n = sizes[i],
ref_mu = g_ref.mu(),
got_mu = g.mu(),
);
assert_eq!(
g_ref.sigma().to_bits(),
g.sigma().to_bits(),
"sigma bits differ at {n} threads, time {t}: ref={ref_sigma} got={got_sigma}",
n = sizes[i],
ref_sigma = g_ref.sigma(),
got_sigma = g.sigma(),
);
}
}
}
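The determinism test compares `mu().to_bits()` and `sigma().to_bits()` rather than using a tolerance. A small sketch of why that is the stricter check: floating-point addition is not associative, so a different reduction order can change the last bit of a result while staying within any reasonable epsilon. The helper names below are hypothetical and do not come from the crate.

```rust
/// Bit-exact comparison: true only if both values have the same IEEE 754 encoding.
fn bits_equal(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

/// Sum left to right: ((0 + x0) + x1) + ...
fn sum_left(xs: &[f64]) -> f64 {
    xs.iter().fold(0.0, |acc, x| acc + x)
}

/// Sum right to left, the kind of reordering a parallel reduction may introduce.
fn sum_right(xs: &[f64]) -> f64 {
    xs.iter().rev().fold(0.0, |acc, x| acc + x)
}
```

For `[0.1, 0.2, 0.3]` the two sums agree to within `1e-12` but differ in their bit patterns, so a tolerance-based assertion would pass while a `to_bits` assertion would fail. Note that `to_bits` also distinguishes `0.0` from `-0.0`, which `==` does not.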
-1
@@ -42,7 +42,6 @@ fn game_1v1_draw_golden() {
Outcome::draw(2),
&GameOptions {
p_draw: 0.25,
score_sigma: 1.0,
convergence: Default::default(),
},
)
-1
@@ -45,7 +45,6 @@ fn game_ranked_rejects_bad_p_draw() {
Outcome::winner(0, 2),
&GameOptions {
p_draw: 1.5,
score_sigma: 1.0,
convergence: ConvergenceOptions::default(),
},
)
-71
@@ -1,71 +0,0 @@
//! Regression: a single time slice with many distinct competitors must converge to finite
//! skills. Before the `pi <= 0` guard in `Gaussian::mu()/sigma()`, EP message cancellation
//! produced a tiny-negative precision whose `sigma() = 1/sqrt(pi)` was NaN, which the
//! moment-space `Sub` in the game chain propagated into every skill once the slice grew past
//! ~75 competitors (e.g. a real ranking dataset with hundreds of players).
use trueskill_tt::{ConstantDrift, ConvergenceOptions, EPSILON, History, ITERATIONS, NullObserver};
/// Tiny deterministic LCG — avoids a dev-dependency on `rand`.
struct Lcg(u64);
impl Lcg {
fn next(&mut self) -> u64 {
self.0 = self
.0
.wrapping_mul(6364136223846793005)
.wrapping_add(1442695040888963407);
self.0
}
fn below(&mut self, n: usize) -> usize {
(self.next() >> 33) as usize % n
}
fn coin(&mut self) -> bool {
self.next() & 1 == 0
}
}
fn nan_after_fit(players: usize) -> usize {
let mut h: History<i64, ConstantDrift, NullObserver, String> = History::builder_with_key()
.beta(1.0)
.sigma(6.0)
.drift(ConstantDrift(0.1))
.convergence(ConvergenceOptions {
max_iter: ITERATIONS,
epsilon: EPSILON,
..Default::default()
})
.build();
let ids: Vec<String> = (0..players).map(|i| format!("p{i:04}")).collect();
let mut rng = Lcg(1);
for _ in 0..(players * 4) {
let a = rng.below(players);
let mut b = rng.below(players - 1);
if b >= a {
b += 1;
}
let (w, l) = if rng.coin() { (a, b) } else { (b, a) };
h.record_winner(&ids[w], &ids[l], 0).unwrap();
}
h.converge().unwrap();
ids.iter()
.filter(|id| {
h.current_skill(id.as_str())
.map(|g| !g.mu().is_finite() || !g.sigma().is_finite())
.unwrap_or(true)
})
.count()
}
#[test]
fn many_competitors_converge_to_finite_skills() {
// The NaN regression onset was between 70 and 80 competitors; 250 is comfortably past it
// and in the range of a real ranking dataset.
for players in [12usize, 75, 150, 250] {
assert_eq!(
nan_after_fit(players),
0,
"{players}-competitor history produced NaN skills"
);
}
}
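The doc comment on this regression test describes a tiny negative precision turning `sigma() = 1/sqrt(pi)` into NaN. The sketch below reproduces that failure mode in isolation. It assumes a Gaussian stored in natural parameters (`pi` = precision, `tau` = precision times mean); `NatGaussian` is a hypothetical type, and the choice to report `mu = 0` and `sigma = infinity` when `pi <= 0` is an assumption, since the diff does not show what the real guard returns.

```rust
/// Hypothetical natural-parameter Gaussian: pi = 1/sigma^2, tau = mu/sigma^2.
#[derive(Clone, Copy, Debug)]
struct NatGaussian {
    pi: f64,
    tau: f64,
}

impl NatGaussian {
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Self { pi, tau: mu * pi }
    }

    /// Dividing densities subtracts natural parameters; this is where
    /// message cancellation can leave a slightly negative precision.
    fn divide(self, other: Self) -> Self {
        Self { pi: self.pi - other.pi, tau: self.tau - other.tau }
    }

    /// Unguarded conversion: sqrt of a negative precision is NaN.
    fn sigma_unguarded(self) -> f64 {
        1.0 / self.pi.sqrt()
    }

    /// Guarded conversions (assumed behavior): non-positive precision is
    /// treated as "no information" instead of producing NaN.
    fn mu(self) -> f64 {
        if self.pi <= 0.0 { 0.0 } else { self.tau / self.pi }
    }

    fn sigma(self) -> f64 {
        if self.pi <= 0.0 { f64::INFINITY } else { 1.0 / self.pi.sqrt() }
    }
}
```

Dividing a unit-precision Gaussian by one whose precision is larger by `1e-12` leaves `pi` just below zero: the unguarded accessor returns NaN, while the guarded ones return finite or infinite but well-defined values that later arithmetic can handle.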
-1
@@ -10,7 +10,6 @@ fn record_winner_builds_history() {
.convergence(ConvergenceOptions {
max_iter: 30,
epsilon: 1e-6,
alpha: 1.0,
})
.build();
-139
@@ -1,139 +0,0 @@
//! Integration tests for `Outcome::Scored` routing through `History::add_events`.
use smallvec::smallvec;
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
#[test]
fn scored_two_team_one_event_pulls_winner_up() {
let mut h: History = History::builder()
.mu(0.0)
.sigma(2.0)
.beta(1.0)
.drift(ConstantDrift(0.0))
.score_sigma(1.0)
.build();
let events: Vec<Event<i64, &'static str>> = vec![Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
],
outcome: Outcome::scores([10.0, 4.0]),
}];
h.add_events(events).unwrap();
let mu_a = h.current_skill(&"a").unwrap().mu();
let mu_b = h.current_skill(&"b").unwrap().mu();
assert!(
mu_a > 0.0,
"winner mu should be pulled up; got mu_a = {mu_a}"
);
assert!(
mu_b < 0.0,
"loser mu should be pulled down; got mu_b = {mu_b}"
);
assert!(
mu_a > mu_b,
"winner mu should exceed loser mu; got mu_a = {mu_a}, mu_b = {mu_b}"
);
}
#[test]
fn scored_zero_margin_treats_as_tie() {
let mut h: History = History::builder()
.mu(0.0)
.sigma(2.0)
.beta(1.0)
.drift(ConstantDrift(0.0))
.score_sigma(1.0)
.build();
let events: Vec<Event<i64, &'static str>> = vec![Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
],
outcome: Outcome::scores([5.0, 5.0]),
}];
h.add_events(events).unwrap();
let mu_a = h.current_skill(&"a").unwrap().mu();
let mu_b = h.current_skill(&"b").unwrap().mu();
let sigma_a = h.current_skill(&"a").unwrap().sigma();
// Equal scores: posterior means stay symmetric around the prior mean.
assert!(
(mu_a - mu_b).abs() < 1e-9,
"equal scores should leave mu_a == mu_b; got {mu_a} vs {mu_b}"
);
assert!(
mu_a.abs() < 1e-9,
"equal scores against equal priors should leave mu near zero; got {mu_a}"
);
// A zero-margin scored event still reduces uncertainty.
assert!(
sigma_a < 2.0,
"expected sigma to tighten below prior 2.0; got {}",
sigma_a
);
}
#[test]
fn scored_three_team_partial_order() {
let mut h: History = History::builder()
.mu(0.0)
.sigma(2.0)
.beta(1.0)
.drift(ConstantDrift(0.0))
.score_sigma(1.0)
.build();
let events: Vec<Event<i64, &'static str>> = vec![Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
Team::with_members([Member::new("c")]),
],
outcome: Outcome::scores([9.0, 5.0, 1.0]),
}];
h.add_events(events).unwrap();
let mu_a = h.current_skill(&"a").unwrap().mu();
let mu_b = h.current_skill(&"b").unwrap().mu();
let mu_c = h.current_skill(&"c").unwrap().mu();
assert!(
mu_a > mu_b,
"team with highest score should rank highest; mu_a = {mu_a}, mu_b = {mu_b}"
);
assert!(
mu_b > mu_c,
"middle score should outrank lowest; mu_b = {mu_b}, mu_c = {mu_c}"
);
}
#[test]
fn scored_rejects_outcome_team_count_mismatch() {
use trueskill_tt::InferenceError;
let mut h: History = History::builder().build();
let events: Vec<Event<i64, &'static str>> = vec![Event {
time: 1,
teams: smallvec![
Team::with_members([Member::new("a")]),
Team::with_members([Member::new("b")]),
],
outcome: Outcome::scores([10.0, 4.0, 1.0]), // 3 scores, 2 teams
}];
let err = h.add_events(events).unwrap_err();
assert!(
matches!(err, InferenceError::MismatchedShape { .. }),
"expected MismatchedShape error, got {err:?}"
);
}