# Tech Debt Cleanup Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Land three independent post-T4-MarginFactor cleanups: dedupe `Game::likelihoods` and `Game::likelihoods_scored` via a `run_chain` helper, make `BuiltinFactor::log_evidence` exhaustive, and fix stale numerics in the T4 plan doc. **Architecture:** Pure code-shape and doc fixes. No public-API change, no behavioral change, no new dependencies. The dedup is a pure refactor — bit-equal posteriors and evidence against existing test goldens. The exhaustive match is a future-proofing change with no runtime effect. The doc fix is two number swaps in prose plus one matching code-comment swap. **Tech Stack:** Rust 2024, `cargo +nightly fmt`, `cargo clippy`, `cargo test --lib`. --- ## Spec reference `docs/superpowers/specs/2026-05-08-tech-debt-cleanup-design.md` ## File map | File | Why touched | |---|---| | `src/game.rs` | Add `run_chain` helper; rewrite `likelihoods` and `likelihoods_scored` to call it | | `src/factor/mod.rs` | Make `BuiltinFactor::log_evidence` match exhaustive | | `docs/superpowers/plans/2026-04-27-t4-margin-factor.md` | Fix two stale prose numbers and one matching code comment | --- ### Task 1: Extract `run_chain` helper, dedupe both likelihoods methods **Files:** - Modify: `src/game.rs:236-485` (replace both `likelihoods` and `likelihoods_scored` with one helper + two thin callers) **Context for the implementer (read this before touching anything):** `OwnedGame` (defined at `src/game.rs:83-92`) holds `teams`, `result`, `weights`, `p_draw`, plus mutable output fields `likelihoods: Vec>` and `evidence: f64`. Two private methods on `Game<'a, T, D>` (the borrowed sibling at `src/game.rs:148-156`) compute likelihoods: - `likelihoods(&mut self, arena: &mut ScratchArena)` — ranked outcomes; `src/game.rs:236-371` - `likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64)` — scored outcomes; `src/game.rs:373-485` The two are bit-identical except for the closure that builds the per-diff `DiffFactor` (defined at `src/game.rs:20-54`). `DiffFactor` has two variants: `Trunc(TruncFactor)` for ranked, `Margin(MarginFactor)` for scored. The shared body does, in order: `arena.reset()`, sort teams descending by `result` into `arena.sort_buf`, fill `arena.team_prior`, build `links: Vec` (the differing block), resize `arena.lhood_lose` / `arena.lhood_win` to `N_INF`, run a forward+backward sweep with a max-iter-10 fixed-point loop guarded by `tuple_gt(step, 1e-6)`, handle the `n_diffs == 1` special case, do boundary updates, multiply per-diff `evidence()` into `self.evidence`, build the inverse permutation in `arena.inv_buf`, then build `self.likelihoods` from the per-team `lhood_win * lhood_lose` and per-player `performance().exclude(...).forget(beta²)` math. **Refactor target:** ```rust fn run_chain( &self, arena: &mut ScratchArena, mut make_link: F, ) -> (f64, Vec>) where F: FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor, { /* the entire shared body, returning (evidence, likelihoods) */ } ``` Helper takes `&self` (not `&mut self`) so the closure can capture `&self.result`, `&self.teams`, `&self.weights`, `&self.p_draw` without conflicting with the helper's own immutable borrow. The arena is borrowed `&mut` independently. The closure is invoked once per diff index `i ∈ 0..n_diffs`, after `arena.sort_buf` is filled. It receives `i`, `&arena.sort_buf[..]`, and `&mut arena.vars` so it can `alloc(N_INF)` the diff `VarId`. It returns the constructed `DiffFactor`. The two callers shrink to: ```rust fn likelihoods(&mut self, arena: &mut ScratchArena) { let p_draw = self.p_draw; let result = &self.result; let teams = &self.teams; let (evidence, likelihoods) = Self::dummy_to_satisfy_borrowck(/* see below */); // ... assigns self.evidence and self.likelihoods } ``` Wait — actually borrow-checker note: calling `self.run_chain(arena, |i, sort_buf, vars| { use_self_fields })` from a `&mut self` method is **fine** because `run_chain` takes `&self` and the closure captures `&self` immutably. Both share an immutable reborrow of `*self`. The arena is a separate `&mut` borrow. Verify the implementer doesn't accidentally make `run_chain` take `&mut self`. **Why a closure (not a trait, not a two-phase build).** A closure keeps caller-specific state (`p_draw`, `score_sigma`, beta sums) inline at the call site with zero ceremony. A trait would require a stateful builder per call. A two-phase build (caller produces `Vec` first, helper does the rest) would either re-do the sort or split arena ownership awkwardly between the phases. --- - [ ] **Step 1: Run the existing test suite to capture the baseline** Run: `cargo test --lib` Expected: all tests pass. Note the count (should be 88+ lib tests) — the refactor must keep this number unchanged with all green. - [ ] **Step 2: Open `src/game.rs` and add the `run_chain` helper** Inside `impl<'a, T: Time, D: Drift> Game<'a, T, D> { ... }` (the block starting at `src/game.rs:158`), add `run_chain` immediately above the existing `likelihoods` method (so above line 236). Use exactly this body — it is the merge of the two existing methods with the differing block replaced by the closure call: ```rust fn run_chain( &self, arena: &mut ScratchArena, mut make_link: F, ) -> (f64, Vec>) where F: FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor, { arena.reset(); let n_teams = self.teams.len(); arena.sort_buf.extend(0..n_teams); arena.sort_buf.sort_by(|&i, &j| { self.result[j] .partial_cmp(&self.result[i]) .unwrap_or(Ordering::Equal) }); arena.team_prior.extend(arena.sort_buf.iter().map(|&t| { self.teams[t] .iter() .zip(self.weights[t].iter()) .fold(N00, |p, (player, &w)| p + (player.performance() * w)) })); let n_diffs = n_teams.saturating_sub(1); let mut links: Vec = (0..n_diffs) .map(|i| make_link(i, &arena.sort_buf, &mut arena.vars)) .collect(); arena.lhood_lose.resize(n_teams, N_INF); arena.lhood_win.resize(n_teams, N_INF); let mut step = (f64::INFINITY, f64::INFINITY); let mut iter = 0; while tuple_gt(step, 1e-6) && iter < 10 { step = (0.0_f64, 0.0_f64); for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() { let pw = arena.team_prior[e] * arena.lhood_lose[e]; let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1]; let raw = pw - pl; arena.vars.set(lf.diff(), raw * lf.msg()); let d = lf.propagate(&mut arena.vars); step = tuple_max(step, d); let new_ll = pw - lf.msg(); step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll)); arena.lhood_lose[e + 1] = new_ll; } for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() { let e = n_diffs - 1 - rev_i; let pw = arena.team_prior[e] * arena.lhood_lose[e]; let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1]; let raw = pw - pl; arena.vars.set(lf.diff(), raw * lf.msg()); let d = lf.propagate(&mut arena.vars); step = tuple_max(step, d); let new_lw = pl + lf.msg(); step = tuple_max(step, arena.lhood_win[e].delta(new_lw)); arena.lhood_win[e] = new_lw; } iter += 1; } if n_diffs == 1 { let raw = (arena.team_prior[0] * arena.lhood_lose[0]) - (arena.team_prior[1] * arena.lhood_win[1]); arena.vars.set(links[0].diff(), raw * links[0].msg()); links[0].propagate(&mut arena.vars); } if n_diffs > 0 { let pl1 = arena.team_prior[1] * arena.lhood_win[1]; arena.lhood_win[0] = pl1 + links[0].msg(); let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2]; arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg(); } let evidence: f64 = links.iter().map(|l| l.evidence()).product(); arena.inv_buf.resize(n_teams, 0); for (si, &orig_i) in arena.sort_buf.iter().enumerate() { arena.inv_buf[orig_i] = si; } let likelihoods = self .teams .iter() .zip(self.weights.iter()) .enumerate() .map(|(orig_i, (players, weights))| { let si = arena.inv_buf[orig_i]; let m = arena.lhood_win[si] * arena.lhood_lose[si]; let performance = players .iter() .zip(weights.iter()) .fold(N00, |p, (player, &w)| p + (player.performance() * w)); players .iter() .zip(weights.iter()) .map(|(player, &w)| { ((m - performance.exclude(player.performance() * w)) * (1.0 / w)) .forget(player.beta.powi(2)) }) .collect::>() }) .collect::>(); (evidence, likelihoods) } ``` - [ ] **Step 3: Replace `likelihoods` body with a thin caller** In `src/game.rs`, replace the entire body of `fn likelihoods(&mut self, arena: &mut ScratchArena)` (currently lines 236-371 — replace from the opening `{` to the closing `}` of that method) with: ```rust fn likelihoods(&mut self, arena: &mut ScratchArena) { let p_draw = self.p_draw; // Capture pointers to fields the closure reads, to keep borrow scopes tight. // Closure captures &self.result and &self.teams (both immutable) and the // &mut arena passed in via run_chain — disjoint from `&self`. let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| { let tie = self.result[sort_buf[i]] == self.result[sort_buf[i + 1]]; let margin = if p_draw == 0.0 { 0.0 } else { let a: f64 = self.teams[sort_buf[i]] .iter() .map(|p| p.beta.powi(2)) .sum(); let b: f64 = self.teams[sort_buf[i + 1]] .iter() .map(|p| p.beta.powi(2)) .sum(); compute_margin(p_draw, (a + b).sqrt()) }; let vid = vars.alloc(N_INF); DiffFactor::Trunc(TruncFactor::new(vid, margin, tie)) }); self.evidence = evidence; self.likelihoods = likelihoods; } ``` (Capturing `p_draw` as a local binding before the closure avoids a `self.p_draw` borrow inside; it's a `Copy` `f64` so this is free.) - [ ] **Step 4: Replace `likelihoods_scored` body with a thin caller** In `src/game.rs`, replace the entire body of `fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64)` (currently lines 373-485) with: ```rust fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64) { let (evidence, likelihoods) = self.run_chain(arena, |i, sort_buf, vars| { // After descending-by-score sort, m_obs >= 0 for every adjacent pair. let m_obs = self.result[sort_buf[i]] - self.result[sort_buf[i + 1]]; let vid = vars.alloc(N_INF); DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma)) }); self.evidence = evidence; self.likelihoods = likelihoods; } ``` - [ ] **Step 5: Build to confirm it compiles** Run: `cargo build` Expected: compiles cleanly. If the borrow checker complains that the closure conflicts with `self.run_chain(...)`, the most likely cause is `run_chain` accidentally being `&mut self` — confirm its signature is `fn run_chain(&self, arena: &mut ScratchArena, mut make_link: F) -> (f64, Vec>)`. If that's correct and there's still a conflict, double-check the closure's captures: it should capture `&self.result` and `&self.teams` (immutable), `p_draw: f64` by value (Copy), and `score_sigma: f64` by value (Copy). It must NOT touch `&mut self` in any form. - [ ] **Step 6: Run the full library test suite — must be all green, same count as Step 1** Run: `cargo test --lib` Expected: same number of tests as Step 1, all pass. Bit-equal goldens — every existing assertion (`test_1vs1`, `test_1vs1_draw`, `test_2vs1vs2_mixed`, MarginFactor end-to-end tests, etc.) must pass unchanged. If ANY test fails, the refactor is wrong; revert and re-inspect. - [ ] **Step 7: Run integration tests too** Run: `cargo test` Expected: all integration tests pass (28 noted in commit `8b53cac`). - [ ] **Step 8: Format and lint** Run: `cargo +nightly fmt && cargo clippy --lib -- -D warnings` Expected: no diffs from fmt, no clippy warnings. - [ ] **Step 9: Commit** ```bash git add src/game.rs git commit -m "$(cat <<'EOF' refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain Both methods were 95-line near-duplicates differing only in the closure that builds the per-diff DiffFactor. Extract the shared body as a private run_chain(&self, arena, make_link) helper that returns (evidence, likelihoods); the two callers shrink to ~10 lines each. Pure code-shape change: posteriors and evidence remain bit-equal; all existing tests (lib + integration) pass unchanged. EOF )" ``` --- ### Task 2: Make `BuiltinFactor::log_evidence` match exhaustive **Files:** - Modify: `src/factor/mod.rs:94-100` (the `log_evidence` impl on `BuiltinFactor`) - [ ] **Step 1: Open `src/factor/mod.rs` and replace the `log_evidence` body** Replace the existing impl: ```rust fn log_evidence(&self, vars: &VarStore) -> f64 { match self { Self::Trunc(f) => f.log_evidence(vars), Self::Margin(f) => f.log_evidence(vars), _ => 0.0, } } ``` with: ```rust fn log_evidence(&self, vars: &VarStore) -> f64 { match self { Self::Trunc(f) => f.log_evidence(vars), Self::Margin(f) => f.log_evidence(vars), Self::TeamSum(_) | Self::RankDiff(_) => 0.0, } } ``` - [ ] **Step 2: Build and run tests** Run: `cargo build && cargo test --lib` Expected: compiles cleanly, all tests pass. Behavior is unchanged — `TeamSum` and `RankDiff` still return `0.0`, but a future variant will now produce a non-exhaustive-match error instead of being silently swallowed. - [ ] **Step 3: Format and lint** Run: `cargo +nightly fmt && cargo clippy --lib -- -D warnings` Expected: no diffs, no warnings. - [ ] **Step 4: Commit** ```bash git add src/factor/mod.rs git commit -m "$(cat <<'EOF' refactor: make BuiltinFactor::log_evidence match exhaustive Replace the `_ => 0.0` wildcard with explicit `Self::TeamSum(_) | Self::RankDiff(_) => 0.0`. No behavioral change; future variants now produce a compile error instead of being silently absorbed by the wildcard. EOF )" ``` --- ### Task 3: Fix stale numerics in T4 plan doc **Files:** - Modify: `docs/superpowers/plans/2026-04-27-t4-margin-factor.md` (lines 52 and 185) The shipped test in `src/factor/mod.rs:163,166` asserts: ``` assert!((result.mu() - 4.864864864864865).abs() < 1e-12); assert!((logz - (-3.062235327364623)).abs() < 1e-10); ``` The plan's prose at line 52 quotes pre-shipped values that no longer match. This task fixes the prose and the matching code-comment. The full-precision assertion blocks elsewhere in the plan are out of scope (they belong to the plan-as-written, and the spec's fix table only listed the rounded prose values). - [ ] **Step 1: Update the prose at line 52** Open `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`. Find the line: ``` - `Z_cav = pdf(5, 0, sqrt(36 + 1)) = pdf(5, 0, sqrt(37)) ≈ 0.046827`. So `log_evidence ≈ -3.0613`. ``` Replace with: ``` - `Z_cav = pdf(5, 0, sqrt(36 + 1)) = pdf(5, 0, sqrt(37)) ≈ 0.04678`. So `log_evidence ≈ -3.0622`. ``` - [ ] **Step 2: Update the matching code-comment at line 185** In the same file, find: ``` // pdf(5, 0, sqrt(37)) ≈ 0.046827 ``` Replace with: ``` // pdf(5, 0, sqrt(37)) ≈ 0.04678 ``` - [ ] **Step 3: Verify nothing else changed** Run: `git diff docs/superpowers/plans/2026-04-27-t4-margin-factor.md` Expected: exactly three lines changed (one prose line containing both numbers, one comment line). Nothing else should be touched. - [ ] **Step 4: Commit** ```bash git add docs/superpowers/plans/2026-04-27-t4-margin-factor.md git commit -m "$(cat <<'EOF' docs: fix stale numerics in t4-margin-factor plan The plan's prose quoted Z_cav ≈ 0.046827 and log_evidence ≈ -3.0613, which diverged from the values asserted by the shipped test in src/factor/mod.rs (-3.062235327364623). Update prose and the matching code comment to 0.04678 / -3.0622. EOF )" ``` --- ## Self-review (writer's note) Spec coverage: - Spec Item 1 (dedupe `likelihoods`/`likelihoods_scored`) → Task 1 ✓ - Spec Item 2 (exhaustive `BuiltinFactor::log_evidence`) → Task 2 ✓ - Spec Item 3 (stale numerics in T4 plan) → Task 3 ✓ - Spec out-of-scope items (`DiffFactor` collapse, per-event `score_sigma`) — correctly absent ✓ Verification gates per the spec ("each item commits independently and ships behind a green `cargo test --lib`"): every task ends in fmt + clippy + tests + commit. Task 1 additionally runs `cargo test` for integration coverage. Type / signature consistency: - `run_chain` signature appears identically in the context header and Step 2 body ✓ - Closure type `FnMut(usize, &[usize], &mut crate::factor::VarStore) -> DiffFactor` matches across Step 2 (definition) and Steps 3/4 (call sites) ✓ - `DiffFactor::Trunc` / `DiffFactor::Margin` constructors match `src/game.rs:20-23` definitions ✓ No placeholders detected.