From a69a3004b28f0027aab9b95a05fe8d5aedcb6e96 Mon Sep 17 00:00:00 2001 From: Anders Olsson Date: Fri, 8 May 2026 14:24:48 +0200 Subject: [PATCH] docs: spec for post-T4-MarginFactor tech debt cleanup Three independent cleanups: dedupe Game::likelihoods and likelihoods_scored via a run_chain helper taking a make_link closure, make BuiltinFactor's log_evidence match exhaustive, and fix stale numerics in the T4 plan doc. --- .../2026-05-08-tech-debt-cleanup-design.md | 134 ++++++++++++++++++ 1 file changed, 134 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md diff --git a/docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md b/docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md new file mode 100644 index 0000000..160726a --- /dev/null +++ b/docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md @@ -0,0 +1,134 @@ +# Tech Debt Cleanup — Post-T4-MarginFactor + +## Summary + +Three small, independent cleanups left behind by the T4-MarginFactor merge +(`8b53cac`). All three are pure code-shape or doc fixes. No public-API change, +no numerics change, no new behavior. + +This batch deliberately excludes the `DiffFactor` ↔ `BuiltinFactor` overlap +collapse (architectural change kept separate) and per-event `score_sigma` +override (a feature, not debt). + +## Scope + +### Item 1 — Deduplicate `Game::likelihoods` and `Game::likelihoods_scored` + +**Current state.** `src/game.rs:236-371` and `src/game.rs:373-485` are 95-line +near-duplicates of each other. They differ in exactly one block: the closure +that maps a diff index to a `DiffFactor`. The ranked path builds +`DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))` with `margin`/`tie` +derived from `p_draw` and adjacent-result equality. The scored path builds +`DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))` with `m_obs` +the observed score gap. Everything else — sort, `team_prior`, sweep loop, +boundary updates, evidence product, posterior `likelihoods` — is bit-identical. + +**Refactor.** Extract a private helper on `OwnedGame`: + +```rust +fn run_chain( + &self, + arena: &mut ScratchArena, + make_link: F, +) -> (f64, Vec>) +where + F: FnMut(usize, &[usize], &mut VarStore) -> DiffFactor, +``` + +The closure receives the diff index `i`, the descending-by-result sort +permutation `&arena.sort_buf`, and `&mut arena.vars` for `alloc(N_INF)`. It +returns the `DiffFactor` for that diff slot. + +The helper takes `&self` (not `&mut self`) and returns +`(evidence, likelihoods)`. Each caller writes the results back to its own +`self.evidence` and `self.likelihoods` fields. The `&self` choice matters: the +closure captures `&self.result` / `&self.teams` / `&self.weights` / `&self.p_draw` +freely without conflicting with the helper's own immutable borrow. + +The two public methods shrink from ~125 lines each to ~10 lines that just +construct the closure. + +**Why a closure (not a trait or two-phase build).** A closure keeps all +caller-specific state (`p_draw`, `score_sigma`, beta sums for margin) inline at +the call site. A trait would require a stateful object per call; a two-phase +build (caller produces the `Vec` first, helper does the rest) would +either re-do the sort or split state ownership awkwardly between phases. + +### Item 2 — Make `BuiltinFactor::log_evidence` exhaustive + +**Current state.** `src/factor/mod.rs:94-100` uses a `_ => 0.0` wildcard for +`TeamSum` and `RankDiff`. When a future variant lands (e.g. `SynergyFactor`), +the wildcard silently absorbs it instead of forcing a deliberate decision. + +**Refactor.** + +```rust +fn log_evidence(&self, vars: &VarStore) -> f64 { + match self { + Self::Trunc(f) => f.log_evidence(vars), + Self::Margin(f) => f.log_evidence(vars), + Self::TeamSum(_) | Self::RankDiff(_) => 0.0, + } +} +``` + +No behavioral change. Future variants now produce a non-exhaustive-match +compile error. + +### Item 3 — Fix stale numerics in T4 plan doc + +**Current state.** `docs/superpowers/plans/2026-04-27-t4-margin-factor.md` +contains two numbers that diverge from the values asserted by the shipped test +in `src/factor/mod.rs:163,166`. + +**Fix.** + +| Doc value (wrong) | Implementation value (correct) | +|---|---| +| `0.046827` | `0.04678` | +| `-3.0613` | `-3.0622` | + +Pure docs change. Verified by reading the asserted constants in the test. + +## Out of scope + +- **`DiffFactor` ↔ `BuiltinFactor` overlap.** Both enums list `Trunc` and + `Margin` variants. Collapsing into `BuiltinFactor::Diff(DiffFactor)` is + defensible but is an architectural change that wants its own design pass. + `DiffFactor` represents a real semantic subset (factors that operate on a + diff variable in a chain); the duplication is two enum variants, not a + large block of code. +- **Per-event `EventKind::Scored.score_sigma` override.** Today + `score_sigma` is history-wide (set on `HistoryBuilder::score_sigma`). A + per-event override is a real feature ask, not tech debt. + +## Verification + +Each item commits independently and ships behind a green `cargo test --lib` +run. The dedup is a pure code-shape change: posteriors and evidence must be +**bit-equal** (not ULP-bounded) against the existing 88+28 test goldens. + +Per-item gate before committing: + +```bash +cargo +nightly fmt +cargo clippy +cargo test --lib +``` + +## Commit shape + +Three commits, one per item, each independently revertable: + +1. `refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain` +2. `refactor: make BuiltinFactor::log_evidence match exhaustive` +3. `docs: fix stale numerics in t4-margin-factor plan` + +## Risks + +- **Borrow-checker friction in Item 1.** The closure captures fields of + `&self` while the helper iterates over arena state. Mitigation: helper is + `&self` (not `&mut self`); arena passed as `&mut ScratchArena` separately. + Disjoint borrows. +- **Compile error in Item 2 if a new variant ships before this lands.** + Trivial follow-on; the whole point is to surface that signal.