Files
trueskill-tt/docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md
T
logaritmisk a69a3004b2 docs: spec for post-T4-MarginFactor tech debt cleanup
Three independent cleanups: dedupe Game::likelihoods and likelihoods_scored
via a run_chain helper taking a make_link closure, make BuiltinFactor's
log_evidence match exhaustive, and fix stale numerics in the T4 plan doc.
2026-05-08 14:24:48 +02:00

5.0 KiB

Tech Debt Cleanup — Post-T4-MarginFactor

Summary

Three small, independent cleanups left behind by the T4-MarginFactor merge (8b53cac). All three are pure code-shape or doc fixes. No public-API change, no numerics change, no new behavior.

This batch deliberately excludes the DiffFactorBuiltinFactor overlap collapse (architectural change kept separate) and per-event score_sigma override (a feature, not debt).

Scope

Item 1 — Deduplicate Game::likelihoods and Game::likelihoods_scored

Current state. src/game.rs:236-371 and src/game.rs:373-485 are 95-line near-duplicates of each other. They differ in exactly one block: the closure that maps a diff index to a DiffFactor. The ranked path builds DiffFactor::Trunc(TruncFactor::new(vid, margin, tie)) with margin/tie derived from p_draw and adjacent-result equality. The scored path builds DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma)) with m_obs the observed score gap. Everything else — sort, team_prior, sweep loop, boundary updates, evidence product, posterior likelihoods — is bit-identical.

Refactor. Extract a private helper on OwnedGame<T, D>:

fn run_chain<F>(
    &self,
    arena: &mut ScratchArena,
    make_link: F,
) -> (f64, Vec<Vec<Gaussian>>)
where
    F: FnMut(usize, &[usize], &mut VarStore) -> DiffFactor,

The closure receives the diff index i, the descending-by-result sort permutation &arena.sort_buf, and &mut arena.vars for alloc(N_INF). It returns the DiffFactor for that diff slot.

The helper takes &self (not &mut self) and returns (evidence, likelihoods). Each caller writes the results back to its own self.evidence and self.likelihoods fields. The &self choice matters: the closure captures &self.result / &self.teams / &self.weights / &self.p_draw freely without conflicting with the helper's own immutable borrow.

The two public methods shrink from ~125 lines each to ~10 lines that just construct the closure.

Why a closure (not a trait or two-phase build). A closure keeps all caller-specific state (p_draw, score_sigma, beta sums for margin) inline at the call site. A trait would require a stateful object per call; a two-phase build (caller produces the Vec<DiffFactor> first, helper does the rest) would either re-do the sort or split state ownership awkwardly between phases.

Item 2 — Make BuiltinFactor::log_evidence exhaustive

Current state. src/factor/mod.rs:94-100 uses a _ => 0.0 wildcard for TeamSum and RankDiff. When a future variant lands (e.g. SynergyFactor), the wildcard silently absorbs it instead of forcing a deliberate decision.

Refactor.

fn log_evidence(&self, vars: &VarStore) -> f64 {
    match self {
        Self::Trunc(f) => f.log_evidence(vars),
        Self::Margin(f) => f.log_evidence(vars),
        Self::TeamSum(_) | Self::RankDiff(_) => 0.0,
    }
}

No behavioral change. Future variants now produce a non-exhaustive-match compile error.

Item 3 — Fix stale numerics in T4 plan doc

Current state. docs/superpowers/plans/2026-04-27-t4-margin-factor.md contains two numbers that diverge from the values asserted by the shipped test in src/factor/mod.rs:163,166.

Fix.

Doc value (wrong) Implementation value (correct)
0.046827 0.04678
-3.0613 -3.0622

Pure docs change. Verified by reading the asserted constants in the test.

Out of scope

  • DiffFactorBuiltinFactor overlap. Both enums list Trunc and Margin variants. Collapsing into BuiltinFactor::Diff(DiffFactor) is defensible but is an architectural change that wants its own design pass. DiffFactor represents a real semantic subset (factors that operate on a diff variable in a chain); the duplication is two enum variants, not a large block of code.
  • Per-event EventKind::Scored.score_sigma override. Today score_sigma is history-wide (set on HistoryBuilder::score_sigma). A per-event override is a real feature ask, not tech debt.

Verification

Each item commits independently and ships behind a green cargo test --lib run. The dedup is a pure code-shape change: posteriors and evidence must be bit-equal (not ULP-bounded) against the existing 88+28 test goldens.

Per-item gate before committing:

cargo +nightly fmt
cargo clippy
cargo test --lib

Commit shape

Three commits, one per item, each independently revertable:

  1. refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain
  2. refactor: make BuiltinFactor::log_evidence match exhaustive
  3. docs: fix stale numerics in t4-margin-factor plan

Risks

  • Borrow-checker friction in Item 1. The closure captures fields of &self while the helper iterates over arena state. Mitigation: helper is &self (not &mut self); arena passed as &mut ScratchArena separately. Disjoint borrows.
  • Compile error in Item 2 if a new variant ships before this lands. Trivial follow-on; the whole point is to surface that signal.