docs: spec for post-T4-MarginFactor tech debt cleanup

Three independent cleanups: dedupe Game::likelihoods and likelihoods_scored via a run_chain helper taking a make_link closure, make BuiltinFactor's log_evidence match exhaustive, and fix stale numerics in the T4 plan doc.
2026-05-08 14:24:48 +02:00
parent dbaad0e7d2
commit a69a3004b2
1 changed files with 134 additions and 0 deletions
@@ -0,0 +1,134 @@
+# Tech Debt Cleanup — Post-T4-MarginFactor
+
+## Summary
+
+Three small, independent cleanups left behind by the T4-MarginFactor merge
+(`8b53cac`). All three are pure code-shape or doc fixes. No public-API change,
+no numerics change, no new behavior.
+
+This batch deliberately excludes the `DiffFactor` ↔ `BuiltinFactor` overlap
+collapse (architectural change kept separate) and per-event `score_sigma`
+override (a feature, not debt).
+
+## Scope
+
+### Item 1 — Deduplicate `Game::likelihoods` and `Game::likelihoods_scored`
+
+**Current state.** `src/game.rs:236-371` and `src/game.rs:373-485` are 95-line
+near-duplicates of each other. They differ in exactly one block: the closure
+that maps a diff index to a `DiffFactor`. The ranked path builds
+`DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))` with `margin`/`tie`
+derived from `p_draw` and adjacent-result equality. The scored path builds
+`DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))` with `m_obs`
+the observed score gap. Everything else — sort, `team_prior`, sweep loop,
+boundary updates, evidence product, posterior `likelihoods` — is bit-identical.
+
+**Refactor.** Extract a private helper on `OwnedGame<T, D>`:
+
+```rust
+fn run_chain<F>(
+    &self,
+    arena: &mut ScratchArena,
+    make_link: F,
+) -> (f64, Vec<Vec<Gaussian>>)
+where
+    F: FnMut(usize, &[usize], &mut VarStore) -> DiffFactor,
+```
+
+The closure receives the diff index `i`, the descending-by-result sort
+permutation `&arena.sort_buf`, and `&mut arena.vars` for `alloc(N_INF)`. It
+returns the `DiffFactor` for that diff slot.
+
+The helper takes `&self` (not `&mut self`) and returns
+`(evidence, likelihoods)`. Each caller writes the results back to its own
+`self.evidence` and `self.likelihoods` fields. The `&self` choice matters: the
+closure captures `&self.result` / `&self.teams` / `&self.weights` / `&self.p_draw`
+freely without conflicting with the helper's own immutable borrow.
+
+The two public methods shrink from ~125 lines each to ~10 lines that just
+construct the closure.
+
+**Why a closure (not a trait or two-phase build).** A closure keeps all
+caller-specific state (`p_draw`, `score_sigma`, beta sums for margin) inline at
+the call site. A trait would require a stateful object per call; a two-phase
+build (caller produces the `Vec<DiffFactor>` first, helper does the rest) would
+either re-do the sort or split state ownership awkwardly between phases.
+
+### Item 2 — Make `BuiltinFactor::log_evidence` exhaustive
+
+**Current state.** `src/factor/mod.rs:94-100` uses a `_ => 0.0` wildcard for
+`TeamSum` and `RankDiff`. When a future variant lands (e.g. `SynergyFactor`),
+the wildcard silently absorbs it instead of forcing a deliberate decision.
+
+**Refactor.**
+
+```rust
+fn log_evidence(&self, vars: &VarStore) -> f64 {
+    match self {
+        Self::Trunc(f) => f.log_evidence(vars),
+        Self::Margin(f) => f.log_evidence(vars),
+        Self::TeamSum(_) | Self::RankDiff(_) => 0.0,
+    }
+}
+```
+
+No behavioral change. Future variants now produce a non-exhaustive-match
+compile error.
+
+### Item 3 — Fix stale numerics in T4 plan doc
+
+**Current state.** `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`
+contains two numbers that diverge from the values asserted by the shipped test
+in `src/factor/mod.rs:163,166`.
+
+**Fix.**
+
+| Doc value (wrong) | Implementation value (correct) |
+|---|---|
+| `0.046827` | `0.04678` |
+| `-3.0613` | `-3.0622` |
+
+Pure docs change. Verified by reading the asserted constants in the test.
+
+## Out of scope
+
+- **`DiffFactor` ↔ `BuiltinFactor` overlap.** Both enums list `Trunc` and
+  `Margin` variants. Collapsing into `BuiltinFactor::Diff(DiffFactor)` is
+  defensible but is an architectural change that wants its own design pass.
+  `DiffFactor` represents a real semantic subset (factors that operate on a
+  diff variable in a chain); the duplication is two enum variants, not a
+  large block of code.
+- **Per-event `EventKind::Scored.score_sigma` override.** Today
+  `score_sigma` is history-wide (set on `HistoryBuilder::score_sigma`). A
+  per-event override is a real feature ask, not tech debt.
+
+## Verification
+
+Each item commits independently and ships behind a green `cargo test --lib`
+run. The dedup is a pure code-shape change: posteriors and evidence must be
+**bit-equal** (not ULP-bounded) against the existing 88+28 test goldens.
+
+Per-item gate before committing:
+
+```bash
+cargo +nightly fmt
+cargo clippy
+cargo test --lib
+```
+
+## Commit shape
+
+Three commits, one per item, each independently revertable:
+
+1. `refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain`
+2. `refactor: make BuiltinFactor::log_evidence match exhaustive`
+3. `docs: fix stale numerics in t4-margin-factor plan`
+
+## Risks
+
+- **Borrow-checker friction in Item 1.** The closure captures fields of
+  `&self` while the helper iterates over arena state. Mitigation: helper is
+  `&self` (not `&mut self`); arena passed as `&mut ScratchArena` separately.
+  Disjoint borrows.
+- **Compile error in Item 2 if a new variant ships before this lands.**
+  Trivial follow-on; the whole point is to surface that signal.