Adds soft Gaussian-observation evidence on the per-pair diff variable,
enabling continuous score margins as a richer alternative to ranks.
Public API:
- `Outcome::Scored([scores])` (non-breaking enum extension under
`#[non_exhaustive]`).
- `Game::scored(teams, outcome, options)` constructor parallel to
`Game::ranked`.
- `EventBuilder::scores([...])` fluent helper.
- `HistoryBuilder::score_sigma(σ)` knob (default 1.0, validated > 0).
- `GameOptions::score_sigma`.
- `EventKind` re-exported from `lib.rs` (annotated `#[non_exhaustive]`).
- New `InferenceError::InvalidParameter { name, value }` variant.
Internals:
- `MarginFactor` (`factor/margin.rs`): Gaussian observation factor that
closes in one EP step; cavity-cached log-evidence mirrors `TruncFactor`.
- `BuiltinFactor::Margin` dispatch arm.
- `DiffFactor` enum in `game.rs` lets `Game::likelihoods` and the new
`likelihoods_scored` share the per-pair link abstraction.
- Per-event `EventKind { Ranked, Scored { score_sigma } }` routed through
`TimeSlice::add_events`, `iteration_direct`, and `log_evidence`.
Tests: 88 lib + 27 integration (4 new in `tests/scored.rs`); existing
goldens byte-identical. Bench: `benches/scored.rs` baseline ~960µs for
60 events × 20-player pool with default convergence.
Plan: docs/superpowers/plans/2026-04-27-t4-margin-factor.md
Spec item marked Done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# T4 — MarginFactor + Outcome::Scored Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a `MarginFactor` (Gaussian observation factor on a diff variable) and an `Outcome::Scored(scores)` variant, so users can supply continuous per-team scores instead of just ranks. Per-pair score margins become soft EP evidence about the latent performance diff.

**Architecture:**

- Sort scored teams by score descending; for each adjacent pair compute `m_obs = score_higher − score_lower ≥ 0`. Per pair: `RankDiffFactor` writes `diff = team_a − team_b`, then a `MarginFactor` multiplies in the Gaussian observation `N(m_obs, score_sigma²)`. This replaces the `TruncFactor` for scored outcomes; ranked outcomes are unchanged.
- A new internal enum `DiffFactor { Trunc(TruncFactor), Margin(MarginFactor) }` lets `Game::likelihoods` keep its single hand-rolled forward/backward sweep loop while dispatching the per-diff factor by enum.
- `score_sigma` is configurable on `GameOptions` and `HistoryBuilder` (default `1.0`).
- `Outcome` is already `#[non_exhaustive]`, so adding `Scored` is non-breaking for downstream `match` arms.

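As a rough illustration of the first bullet, the standalone sketch below sorts team indices by score (best first) and derives the non-negative margin for each adjacent pair. The helper name `adjacent_margins` is hypothetical and is not part of the crate; the planned code works on `ScratchArena` buffers rather than fresh `Vec`s.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not part of the crate): returns the team order
/// (best score first) and the observed margin for each adjacent pair.
fn adjacent_margins(scores: &[f64]) -> (Vec<usize>, Vec<f64>) {
    let mut order: Vec<usize> = (0..scores.len()).collect();
    // Descending by score; incomparable values (NaN) are treated as equal.
    order.sort_by(|&i, &j| scores[j].partial_cmp(&scores[i]).unwrap_or(Ordering::Equal));
    // Each margin is score_higher - score_lower, so it is never negative.
    let margins = order.windows(2).map(|w| scores[w[0]] - scores[w[1]]).collect();
    (order, margins)
}

fn main() {
    let (order, margins) = adjacent_margins(&[3.0, 1.0, 2.0, 0.0]);
    println!("order = {order:?}, margins = {margins:?}");
}
```

Each margin would then become the `m_obs` of one `MarginFactor`, so an `n`-team scored event yields `n - 1` observation factors.
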
**Tech Stack:** Rust 2024, smallvec, rayon (already in tree). No new crate dependencies.

---

## File Structure

| Path | Status | Responsibility |
|---|---|---|
| `src/factor/margin.rs` | **create** | `MarginFactor` struct + `Factor` impl + cavity-cached evidence + unit tests |
| `src/factor/mod.rs` | modify | `pub mod margin;`, `BuiltinFactor::Margin(...)` variant + dispatch arms |
| `src/factors.rs` | modify | re-export `MarginFactor` |
| `src/outcome.rs` | modify | `Outcome::Scored(SmallVec<[f64; 4]>)` variant, `scores()` ctor, `as_scores()` accessor, `team_count` arm |
| `src/game.rs` | modify | `pub(crate) enum DiffFactor`, scored path in `likelihoods`, `Game::scored()` ctor, `GameOptions::score_sigma` |
| `src/event_builder.rs` | modify | `.scores([...])` builder method |
| `src/history.rs` | modify | match `Outcome::Scored` in `add_events`; `HistoryBuilder::score_sigma`; new internal `add_events_scored_with_prior` (or extra arg) |
| `tests/scored.rs` | **create** | end-to-end Scored integration tests |
| `examples/scored.rs` | **create** | worked example using `Outcome::Scored` |
| `benches/scored.rs` | **create** | criterion benchmark mirroring `batch.rs` with scored events |
| `CLAUDE.md` | modify | mark T4-MarginFactor complete in the architecture notes |

---

## Background — math the implementer needs

For a diff variable `D` with current marginal `D_marg`, the MarginFactor models an observation `m_obs ~ N(D, σ²)` where `σ = score_sigma`. Standard EP for a Gaussian-likelihood factor:

1. **Cavity:** `D_cav = D_marg / msg`, where `msg` is this factor's stored outgoing message. It is initialised to `N_INF`, so the first cavity equals the current marginal.
2. **Tilted distribution:** `D_cav · N(m_obs, σ²)`. This is a product of two Gaussians, so it is closed-form and needs no approximation (which is why the factor converges in one propagation).
3. **New marginal:** the tilted distribution.
4. **New outgoing message:** `new_msg = new_marginal / D_cav`. Because the tilted distribution is exact, `new_msg = N(m_obs, σ²)`, which depends only on `m_obs` and `σ` and not on the cavity.
5. **Cavity evidence:** `Z_cav = pdf(m_obs; D_cav.mu(), sqrt(D_cav.sigma()² + σ²))`, the marginal likelihood of `m_obs` under the cavity. Cache it on the first propagate, following `TruncFactor`'s pattern. `log_evidence = Z_cav.ln()`.

Practical consequence: `MarginFactor::propagate` returns a non-zero delta on its first call (because `msg` jumps from `N_INF` to `N(m_obs, σ²)`) and exactly zero afterwards, since `new_msg` is a constant.

A Gaussian `N(m, σ)` is constructed via `Gaussian::from_ms(m, σ)`. Multiplication adds the natural parameters (`pi += other.pi; tau += other.tau`); division subtracts them. The `pdf(x, mu, sigma)` helper already exists in `lib.rs` (private, but importable as `crate::pdf`).

**Concrete numerical check for tests:** With cavity `N(0, 6)` and observation `m_obs=5, σ=1`:

- `D_cav.pi = 1/36 ≈ 0.027778`, `D_cav.tau = 0`.
- New marginal: `pi = 0.027778 + 1 = 1.027778`, `tau = 0 + 5 = 5`. So `mu = 5 / 1.027778 ≈ 4.864865`, `sigma = 1/sqrt(1.027778) ≈ 0.986394`.
- `Z_cav = pdf(5, 0, sqrt(36 + 1)) = pdf(5, 0, sqrt(37)) = exp(-25/74) / sqrt(2π · 37) ≈ 0.046783`. So `log_evidence ≈ -3.0622`.

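The update steps and the numerical check above can be reproduced with a small standalone sketch. The `Nat` struct and `normal_pdf` function are illustrative stand-ins written for this example, not the crate's `Gaussian` or `pdf`; the only assumption is the natural-parameter arithmetic described in this section.

```rust
/// Minimal Gaussian in natural parameters (pi = 1/σ², tau = μ/σ²).
/// Illustrative stand-in only; this is not the crate's `Gaussian` type.
#[derive(Clone, Copy, Debug)]
struct Nat {
    pi: f64,
    tau: f64,
}

impl Nat {
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Self { pi, tau: mu * pi }
    }

    fn mu(self) -> f64 {
        self.tau / self.pi
    }

    fn sigma(self) -> f64 {
        (1.0 / self.pi).sqrt()
    }

    /// Product of two Gaussian densities: natural parameters add.
    fn mul(self, other: Self) -> Self {
        Self { pi: self.pi + other.pi, tau: self.tau + other.tau }
    }

    /// Quotient of two Gaussian densities: natural parameters subtract.
    fn div(self, other: Self) -> Self {
        Self { pi: self.pi - other.pi, tau: self.tau - other.tau }
    }
}

/// Normal density at `x` for mean `mu` and standard deviation `sigma`.
fn normal_pdf(x: f64, mu: f64, sigma: f64) -> f64 {
    let z = (x - mu) / sigma;
    (-0.5 * z * z).exp() / (sigma * std::f64::consts::TAU.sqrt())
}

fn main() {
    let (m_obs, score_sigma): (f64, f64) = (5.0, 1.0);
    let cavity = Nat::from_ms(0.0, 6.0);
    let obs = Nat::from_ms(m_obs, score_sigma);

    // First propagation: tilted marginal, outgoing message, cavity evidence.
    let marginal = cavity.mul(obs);
    let msg = marginal.div(cavity);
    let combined_sigma = (cavity.sigma().powi(2) + score_sigma.powi(2)).sqrt();
    let z_cav = normal_pdf(m_obs, cavity.mu(), combined_sigma);

    // Second propagation: dividing the stored message back out recovers the
    // original cavity, so the recomputed message is unchanged.
    let cavity2 = marginal.div(msg);
    let msg2 = cavity2.mul(obs).div(cavity2);

    println!("marginal: mu = {:.6}, sigma = {:.6}", marginal.mu(), marginal.sigma());
    println!("message:  mu = {:.6}, sigma = {:.6}", msg.mu(), msg.sigma());
    println!("Z_cav = {:.6}, ln Z_cav = {:.4}", z_cav, z_cav.ln());
    println!("message change on second pass: {:e}", (msg2.mu() - msg.mu()).abs());
}
```

The second pass differs from the first only by floating-point rounding, which is the "zero delta after the first call" behaviour the unit tests in Task 1 rely on.
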
---

### Task 1: `MarginFactor` core (file + struct + Factor impl + unit tests)

**Files:**

- Create: `src/factor/margin.rs`
- Modify: `src/factor/mod.rs:100-102` (add `pub mod margin;` next to the existing `pub mod` lines)

- [ ] **Step 1: Add the module declaration so the new file compiles**

In `src/factor/mod.rs`, find the existing block:

```rust
pub mod rank_diff;
pub mod team_sum;
pub mod trunc;
```

Replace with:

```rust
pub mod margin;
pub mod rank_diff;
pub mod team_sum;
pub mod trunc;
```

- [ ] **Step 2: Create `src/factor/margin.rs` with the implementation and its unit tests**

```rust
use crate::{
    N_INF, pdf,
    factor::{Factor, VarId, VarStore},
    gaussian::Gaussian,
};

/// Gaussian observation factor on a diff variable.
///
/// Encodes the soft evidence `m_obs ~ N(diff, sigma²)`. The outgoing message
/// to `diff` is the constant `N(m_obs, sigma²)`, so this factor converges in a
/// single propagation: subsequent calls return a zero delta.
#[derive(Debug)]
pub struct MarginFactor {
    pub diff: VarId,
    pub m_obs: f64,
    pub sigma: f64,
    pub(crate) msg: Gaussian,
    pub(crate) evidence_cached: Option<f64>,
}

impl MarginFactor {
    pub fn new(diff: VarId, m_obs: f64, sigma: f64) -> Self {
        debug_assert!(sigma > 0.0, "score sigma must be positive");
        Self {
            diff,
            m_obs,
            sigma,
            msg: N_INF,
            evidence_cached: None,
        }
    }
}

impl Factor for MarginFactor {
    fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
        let marginal = vars.get(self.diff);
        let cavity = marginal / self.msg;

        if self.evidence_cached.is_none() {
            self.evidence_cached = Some(cavity_evidence(cavity, self.m_obs, self.sigma));
        }

        let new_msg = Gaussian::from_ms(self.m_obs, self.sigma);
        let new_marginal = cavity * new_msg;
        let old_msg = self.msg;
        self.msg = new_msg;
        vars.set(self.diff, new_marginal);

        old_msg.delta(new_msg)
    }

    fn log_evidence(&self, _vars: &VarStore) -> f64 {
        self.evidence_cached.unwrap_or(1.0).ln()
    }
}

fn cavity_evidence(cavity: Gaussian, m_obs: f64, sigma: f64) -> f64 {
    let combined_sigma = (cavity.sigma().powi(2) + sigma.powi(2)).sqrt();
    pdf(m_obs, cavity.mu(), combined_sigma)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn first_propagate_writes_tilted_marginal() {
        let mut vars = VarStore::new();
        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let mut f = MarginFactor::new(diff, 5.0, 1.0);

        f.propagate(&mut vars);

        let result = vars.get(diff);
        // pi = 1/36 + 1 ≈ 1.027778; tau = 0 + 5 = 5
        // mu = 5 / 1.027778 ≈ 4.864865; sigma = 1/sqrt(1.027778) ≈ 0.986394
        assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
        assert!((result.sigma() - 0.986393923832144).abs() < 1e-12);
    }

    #[test]
    fn converges_in_one_step() {
        let mut vars = VarStore::new();
        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let mut f = MarginFactor::new(diff, 5.0, 1.0);

        f.propagate(&mut vars);
        let (dmu, dsig) = f.propagate(&mut vars);
        assert!(dmu < 1e-12, "expected ~0 delta on second propagate, got {dmu}");
        assert!(dsig < 1e-12);
    }

    #[test]
    fn evidence_cached_on_first_propagate() {
        let mut vars = VarStore::new();
        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let mut f = MarginFactor::new(diff, 5.0, 1.0);
        assert!(f.evidence_cached.is_none());

        f.propagate(&mut vars);
        let z = f.evidence_cached.unwrap();
        // pdf(5, 0, sqrt(37)) = exp(-25/74) / sqrt(2π · 37) ≈ 0.046783
        let expected = (-25.0_f64 / 74.0).exp() / (std::f64::consts::TAU * 37.0).sqrt();
        assert!((z - expected).abs() < 1e-10);

        // Subsequent propagations don't change it.
        f.propagate(&mut vars);
        assert_eq!(f.evidence_cached.unwrap(), z);
    }

    #[test]
    fn log_evidence_matches_cached_ln() {
        let mut vars = VarStore::new();
        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let mut f = MarginFactor::new(diff, 5.0, 1.0);
        f.propagate(&mut vars);
        let logz = f.log_evidence(&vars);
        // ln pdf(5, 0, sqrt(37)) = -25/74 - ln(2π · 37) / 2 ≈ -3.0622
        let expected = -25.0_f64 / 74.0 - 0.5 * (std::f64::consts::TAU * 37.0).ln();
        assert!((logz - expected).abs() < 1e-10);
    }
}
```

> Note: unlike `trunc.rs`, this module does not import `cdf`. A use that exists only under `#[cfg(test)]` would leave the import unused in non-test builds and trip the `-D warnings` check in Step 4. Add the import back if a tie-band MarginFactor variant is introduced later.

- [ ] **Step 3: Run the new tests to verify they pass**

Run: `cargo test --lib factor::margin`

Expected: 4 passed.

- [ ] **Step 4: Verify the module compiles cleanly with no warnings**

Run: `cargo build` and `cargo clippy --lib -- -D warnings`

Expected: no warnings, no errors.

- [ ] **Step 5: Format and commit**

```bash
cargo +nightly fmt
git add src/factor/margin.rs src/factor/mod.rs
git commit -m "feat(factor): add MarginFactor for scored-margin EP evidence"
```

---

### Task 2: Wire `MarginFactor` into `BuiltinFactor` enum dispatch

**Files:**

- Modify: `src/factor/mod.rs:76-98` (the `BuiltinFactor` enum and its `Factor` impl)
- Modify: `src/factors.rs:7-13` (the public re-export list)

- [ ] **Step 1: Write a failing dispatch test in `src/factor/mod.rs`**

Open `src/factor/mod.rs`. Inside the existing `#[cfg(test)] mod tests { ... }` block (around line 105), add:

```rust
    #[test]
    fn builtin_factor_dispatches_to_margin() {
        use super::margin::MarginFactor;
        let mut vars = VarStore::new();
        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let mut f = BuiltinFactor::Margin(MarginFactor::new(diff, 5.0, 1.0));

        f.propagate(&mut vars);

        let result = vars.get(diff);
        assert!((result.mu() - 4.864864864864865).abs() < 1e-12);

        let logz = f.log_evidence(&vars);
        // ln pdf(5, 0, sqrt(37)) = -25/74 - ln(2π · 37) / 2 ≈ -3.0622
        let expected = -25.0_f64 / 74.0 - 0.5 * (std::f64::consts::TAU * 37.0).ln();
        assert!((logz - expected).abs() < 1e-10);
    }
```

- [ ] **Step 2: Run the test to verify it fails**

Run: `cargo test --lib factor::tests::builtin_factor_dispatches_to_margin`

Expected: FAIL with `no variant named Margin found for enum BuiltinFactor`.

- [ ] **Step 3: Add the enum variant + Factor impl arms**

Replace the current `BuiltinFactor` definition and its `Factor` impl (currently `src/factor/mod.rs:76-98`):

```rust
/// Enum dispatcher for the built-in factor types.
///
/// Using an enum instead of `Box<dyn Factor>` keeps factor data inline and
/// avoids virtual-call overhead in the hot inference loop.
#[derive(Debug)]
pub enum BuiltinFactor {
    TeamSum(team_sum::TeamSumFactor),
    RankDiff(rank_diff::RankDiffFactor),
    Trunc(trunc::TruncFactor),
    Margin(margin::MarginFactor),
}

impl Factor for BuiltinFactor {
    fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
        match self {
            Self::TeamSum(f) => f.propagate(vars),
            Self::RankDiff(f) => f.propagate(vars),
            Self::Trunc(f) => f.propagate(vars),
            Self::Margin(f) => f.propagate(vars),
        }
    }

    fn log_evidence(&self, vars: &VarStore) -> f64 {
        match self {
            Self::Trunc(f) => f.log_evidence(vars),
            Self::Margin(f) => f.log_evidence(vars),
            _ => 0.0,
        }
    }
}
```

- [ ] **Step 4: Re-export `MarginFactor` from `src/factors.rs`**

Replace the body of `src/factors.rs` (lines 7-13) with:

```rust
pub use crate::{
    factor::{
        BuiltinFactor, Factor, VarId, VarStore, margin::MarginFactor,
        rank_diff::RankDiffFactor, team_sum::TeamSumFactor, trunc::TruncFactor,
    },
    schedule::{EpsilonOrMax, Schedule, ScheduleReport},
};
```

- [ ] **Step 5: Run the test to verify it passes**

Run: `cargo test --lib factor::tests::builtin_factor_dispatches_to_margin`

Expected: PASS.

- [ ] **Step 6: Run the full lib test suite to confirm no regressions**

Run: `cargo test --lib`

Expected: all tests pass (current count + 5 new from Tasks 1–2).

- [ ] **Step 7: Format and commit**

```bash
cargo +nightly fmt
git add src/factor/mod.rs src/factors.rs
git commit -m "feat(factor): dispatch MarginFactor through BuiltinFactor enum"
```

---

### Task 3: Add `Outcome::Scored` variant and accessors

**Files:**

- Modify: `src/outcome.rs`

- [ ] **Step 1: Write failing tests in `src/outcome.rs`**

Add to the existing `#[cfg(test)] mod tests { ... }` block (after `winner_out_of_range_panics`, around line 86):

```rust
    #[test]
    fn scored_two_teams() {
        let o = Outcome::scores([10.0, 4.0]);
        assert_eq!(o.team_count(), 2);
        assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
        assert_eq!(o.as_ranks(), None);
    }

    #[test]
    fn scored_team_count_matches_input() {
        let o = Outcome::scores([3.0, 1.0, 2.0, 0.0]);
        assert_eq!(o.team_count(), 4);
    }

    #[test]
    fn ranked_as_scores_returns_none() {
        let o = Outcome::winner(0, 2);
        assert!(o.as_scores().is_none());
        assert!(o.as_ranks().is_some());
    }
```

- [ ] **Step 2: Run the tests to verify they fail**

Run: `cargo test --lib outcome::tests`

Expected: FAIL — `no function or associated item named scores found`, etc.

- [ ] **Step 3: Implement the `Scored` variant and helpers**

Replace the body of `src/outcome.rs` with:

```rust
//! Outcome of a match.
//!
//! `Ranked(ranks)` for ordinal results; `Scored(scores)` for continuous
//! per-team scores (engages `MarginFactor` in the engine).

use smallvec::SmallVec;

/// Final outcome of a match.
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
///
/// `Scored(scores)`: higher score = better. Adjacent (sorted) pairs feed
/// observed margins to `MarginFactor`. `scores.len()` must equal the number
/// of teams in the event.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
    Scored(SmallVec<[f64; 4]>),
}

impl Outcome {
    /// `n`-team outcome where team `winner` won and everyone else tied for last.
    ///
    /// Panics if `winner >= n`.
    pub fn winner(winner: u32, n: u32) -> Self {
        assert!(winner < n, "winner index {winner} out of range 0..{n}");
        let ranks: SmallVec<[u32; 4]> = (0..n).map(|i| if i == winner { 0 } else { 1 }).collect();
        Self::Ranked(ranks)
    }

    /// All `n` teams tied.
    pub fn draw(n: u32) -> Self {
        Self::Ranked(SmallVec::from_vec(vec![0; n as usize]))
    }

    /// Explicit per-team ranking.
    pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
        Self::Ranked(ranks.into_iter().collect())
    }

    /// Explicit per-team continuous scores; higher = better.
    pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
        Self::Scored(scores.into_iter().collect())
    }

    pub fn team_count(&self) -> usize {
        match self {
            Self::Ranked(r) => r.len(),
            Self::Scored(s) => s.len(),
        }
    }

    pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
        match self {
            Self::Ranked(r) => Some(r),
            Self::Scored(_) => None,
        }
    }

    pub(crate) fn as_scores(&self) -> Option<&[f64]> {
        match self {
            Self::Scored(s) => Some(s),
            Self::Ranked(_) => None,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn winner_two_teams() {
        let o = Outcome::winner(0, 2);
        assert_eq!(o.as_ranks(), Some(&[0u32, 1][..]));
        assert_eq!(o.team_count(), 2);
    }

    #[test]
    fn winner_three_teams_second_wins() {
        let o = Outcome::winner(1, 3);
        assert_eq!(o.as_ranks(), Some(&[1u32, 0, 1][..]));
    }

    #[test]
    fn draw_three_teams() {
        let o = Outcome::draw(3);
        assert_eq!(o.as_ranks(), Some(&[0u32, 0, 0][..]));
    }

    #[test]
    fn ranking_from_iter() {
        let o = Outcome::ranking([2, 0, 1]);
        assert_eq!(o.as_ranks(), Some(&[2u32, 0, 1][..]));
    }

    #[test]
    #[should_panic(expected = "winner index 2 out of range")]
    fn winner_out_of_range_panics() {
        let _ = Outcome::winner(2, 2);
    }

    #[test]
    fn scored_two_teams() {
        let o = Outcome::scores([10.0, 4.0]);
        assert_eq!(o.team_count(), 2);
        assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
        assert_eq!(o.as_ranks(), None);
    }

    #[test]
    fn scored_team_count_matches_input() {
        let o = Outcome::scores([3.0, 1.0, 2.0, 0.0]);
        assert_eq!(o.team_count(), 4);
    }

    #[test]
    fn ranked_as_scores_returns_none() {
        let o = Outcome::winner(0, 2);
        assert!(o.as_scores().is_none());
        assert!(o.as_ranks().is_some());
    }
}
```

> Note: the existing `as_ranks` returned `&[u32]` and was `#[allow(dead_code)]`. The new signature returns `Option<&[u32]>` because `Ranked` is no longer the only variant. All in-tree call sites that use `as_ranks()` must now handle the `Option`; Step 5 below updates them.

- [ ] **Step 4: Run the outcome tests to verify they pass**

Run: `cargo test --lib outcome`

Expected: 8 passed.

- [ ] **Step 5: Update existing call sites to handle the new `Option<&[u32]>` return**

Two call sites use `as_ranks()` today. Update each to expect `Option`:

In `src/history.rs:672`, change:

```rust
let ranks = ev.outcome.as_ranks();
if ranks.len() != ev.teams.len() {
```

to:

```rust
let ranks = match ev.outcome.as_ranks() {
    Some(r) => r,
    None => {
        // Scored path will be wired in Task 7; for now it's an error.
        return Err(InferenceError::MismatchedShape {
            kind: "outcome variant",
            expected: 0,
            got: 0,
        });
    }
};
if ranks.len() != ev.teams.len() {
```

In `src/history.rs:701`, the following lines need no change, because `ranks` is already a `&[u32]` at this point:

```rust
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
```

In `src/game.rs:312`, change:

```rust
let ranks = outcome.as_ranks();
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
```

to:

```rust
let ranks = outcome.as_ranks().ok_or(crate::InferenceError::MismatchedShape {
    kind: "Game::ranked requires Outcome::Ranked",
    expected: 0,
    got: 0,
})?;
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
```

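The inversion in the snippets above appears to turn "lower rank is better" into a "higher value is better" result vector. A minimal standalone sketch of that mapping follows; `invert_ranks` is a hypothetical name used only for this illustration and is not a function in the crate.

```rust
/// Hypothetical standalone version of the rank inversion shown above:
/// the best (lowest) rank maps to the largest value.
fn invert_ranks(ranks: &[u32]) -> Vec<f64> {
    let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
    ranks.iter().map(|&r| max_rank - r as f64).collect()
}

fn main() {
    // Team 1 has rank 0 (winner), team 2 rank 1, team 0 rank 2 (last).
    println!("{:?}", invert_ranks(&[2, 0, 1]));
    // A full draw inverts to all zeros.
    println!("{:?}", invert_ranks(&[0, 0, 0]));
}
```

Tied ranks stay tied after inversion, which is presumably what lets the ranked path detect ties by comparing adjacent result values.
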
- [ ] **Step 6: Verify the full lib still compiles and tests pass**

Run: `cargo test --lib`

Expected: all tests pass (call sites updated cleanly).

- [ ] **Step 7: Format and commit**

```bash
cargo +nightly fmt
git add src/outcome.rs src/history.rs src/game.rs
git commit -m "feat(outcome): add Scored variant; switch as_ranks/as_scores to Option"
```

---

### Task 4: Internal `DiffFactor` enum to dispatch Trunc vs Margin per-pair

**Files:**

- Modify: `src/game.rs` (top of file, before `Game` impl)

- [ ] **Step 1: Write a failing test in `src/game.rs`'s test module**

In the `#[cfg(test)] mod tests { ... }` block at the bottom of `src/game.rs`, add (after `test_2vs2_weighted`):

```rust
    #[test]
    fn diff_factor_dispatch_trunc_and_margin() {
        use crate::factor::{margin::MarginFactor, trunc::TruncFactor, VarStore};
        use super::DiffFactor;

        let mut vars = VarStore::new();
        let dt = vars.alloc(Gaussian::from_ms(0.0, 6.0));
        let dm = vars.alloc(Gaussian::from_ms(0.0, 6.0));

        let mut t = DiffFactor::Trunc(TruncFactor::new(dt, 0.0, false));
        let mut m = DiffFactor::Margin(MarginFactor::new(dm, 5.0, 1.0));

        let _ = t.propagate(&mut vars);
        let _ = m.propagate(&mut vars);

        // Smoke: both diffs got written; their msgs are non-N_INF.
        assert!(t.msg().pi() > 0.0);
        assert!(m.msg().pi() > 0.0);
        assert_eq!(t.diff(), dt);
        assert_eq!(m.diff(), dm);
    }
```

- [ ] **Step 2: Run the test to verify it fails**

Run: `cargo test --lib game::tests::diff_factor_dispatch_trunc_and_margin`

Expected: FAIL — `cannot find type DiffFactor in this scope`.

- [ ] **Step 3: Add the `DiffFactor` enum at the top of `src/game.rs`**

Insert after the existing `use` block (around line 14, before `pub struct GameOptions`):

```rust
use crate::factor::margin::MarginFactor;

/// Per-adjacent-pair link factor in the game's diff chain.
///
/// `Trunc` is used for `Outcome::Ranked` (rank-based truncation).
/// `Margin` is used for `Outcome::Scored` (Gaussian observation on the diff).
#[derive(Debug)]
pub(crate) enum DiffFactor {
    Trunc(TruncFactor),
    Margin(MarginFactor),
}

impl DiffFactor {
    pub(crate) fn diff(&self) -> crate::factor::VarId {
        match self {
            Self::Trunc(f) => f.diff,
            Self::Margin(f) => f.diff,
        }
    }

    pub(crate) fn msg(&self) -> Gaussian {
        match self {
            Self::Trunc(f) => f.msg,
            Self::Margin(f) => f.msg,
        }
    }

    pub(crate) fn evidence(&self) -> f64 {
        match self {
            Self::Trunc(f) => f.evidence_cached.unwrap_or(1.0),
            Self::Margin(f) => f.evidence_cached.unwrap_or(1.0),
        }
    }

    pub(crate) fn propagate(&mut self, vars: &mut crate::factor::VarStore) -> (f64, f64) {
        use crate::factor::Factor;
        match self {
            Self::Trunc(f) => f.propagate(vars),
            Self::Margin(f) => f.propagate(vars),
        }
    }
}
```

- [ ] **Step 4: Refactor `Game::likelihoods` to drive `Vec<DiffFactor>` instead of `Vec<TruncFactor>`**

This is a mechanical rename inside `Game::likelihoods` (currently `src/game.rs:135-273`). The loop logic is unchanged; we just move the per-pair object behind the enum. Replace the body of `Game::likelihoods` from where `let mut trunc: Vec<TruncFactor> = ...` is constructed (around line 160) to its last use (around line 243):

```rust
        // One DiffFactor per adjacent sorted-team pair; each owns a diff VarId.
        let mut links: Vec<DiffFactor> = (0..n_diffs)
            .map(|i| {
                let tie = self.result[arena.sort_buf[i]] == self.result[arena.sort_buf[i + 1]];
                let margin = if self.p_draw == 0.0 {
                    0.0
                } else {
                    let a: f64 = self.teams[arena.sort_buf[i]]
                        .iter()
                        .map(|p| p.beta.powi(2))
                        .sum();
                    let b: f64 = self.teams[arena.sort_buf[i + 1]]
                        .iter()
                        .map(|p| p.beta.powi(2))
                        .sum();
                    compute_margin(self.p_draw, (a + b).sqrt())
                };
                let vid = arena.vars.alloc(N_INF);
                DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))
            })
            .collect();

        // Per-team messages from neighbouring RankDiff factors (replaces TeamMessage).
        arena.lhood_lose.resize(n_teams, N_INF);
        arena.lhood_win.resize(n_teams, N_INF);

        let mut step = (f64::INFINITY, f64::INFINITY);
        let mut iter = 0;

        while tuple_gt(step, 1e-6) && iter < 10 {
            step = (0.0_f64, 0.0_f64);

            for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
                let pw = arena.team_prior[e] * arena.lhood_lose[e];
                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
                let raw = pw - pl;
                arena.vars.set(lf.diff(), raw * lf.msg());
                let d = lf.propagate(&mut arena.vars);
                step = tuple_max(step, d);

                let new_ll = pw - lf.msg();
                step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
                arena.lhood_lose[e + 1] = new_ll;
            }

            for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
                let e = n_diffs - 1 - rev_i;
                let pw = arena.team_prior[e] * arena.lhood_lose[e];
                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
                let raw = pw - pl;
                arena.vars.set(lf.diff(), raw * lf.msg());
                let d = lf.propagate(&mut arena.vars);
                step = tuple_max(step, d);

                let new_lw = pl + lf.msg();
                step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
                arena.lhood_win[e] = new_lw;
            }

            iter += 1;
        }

        if n_diffs == 1 {
            let raw = (arena.team_prior[0] * arena.lhood_lose[0])
                - (arena.team_prior[1] * arena.lhood_win[1]);
            arena.vars.set(links[0].diff(), raw * links[0].msg());
            links[0].propagate(&mut arena.vars);
        }

        if n_diffs > 0 {
            let pl1 = arena.team_prior[1] * arena.lhood_win[1];
            arena.lhood_win[0] = pl1 + links[0].msg();
            let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
            arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
        }

        self.evidence = links.iter().map(|l| l.evidence()).product();
```

(Everything below the evidence line is unchanged.) Keep the `use crate::factor::trunc::TruncFactor;` import at the top of the file: the code above still constructs `TruncFactor` directly, so it remains in use.

- [ ] **Step 5: Run the full lib test suite to verify the refactor preserves all golden values**

Run: `cargo test --lib`

Expected: all tests pass with **identical** assertions — this is a pure refactor.

- [ ] **Step 6: Run the integration tests**

Run: `cargo test`

Expected: all pass.

- [ ] **Step 7: Format and commit**

```bash
cargo +nightly fmt
git add src/game.rs
git commit -m "refactor(game): dispatch per-diff link factors via DiffFactor enum"
```

---

### Task 5: Add `score_sigma` to `GameOptions` and the scored path in `Game::likelihoods`

**Files:**

- Modify: `src/game.rs`

- [ ] **Step 1: Write a failing test for the scored path**

In `src/game.rs`'s test module, after the new dispatch test from Task 4, add:

```rust
    #[test]
    fn scored_path_sharper_when_margin_is_large() {
        // Same prior on both sides; large positive observed margin should pull
        // team A above team B.
        let prior = R::new(
            Gaussian::from_ms(25.0, 25.0 / 3.0),
            25.0 / 6.0,
            ConstantDrift(25.0 / 300.0),
        );
        let teams = vec![vec![prior], vec![prior]];
        let result = vec![10.0, 0.0]; // a beat b by 10
        let weights = [vec![1.0], vec![1.0]];
        let mut arena = ScratchArena::new();
        let g = Game::scored_with_arena(
            teams,
            &result,
            &weights,
            1.0, // score_sigma
            &mut arena,
        );
        let p = g.posteriors();
        let a = p[0][0];
        let b = p[1][0];
        assert!(a.mu() > b.mu(), "expected team a posterior mu > team b; got {} vs {}", a.mu(), b.mu());

        // Tighter score_sigma should produce a stronger update.
        let mut arena2 = ScratchArena::new();
        let g_tight = Game::scored_with_arena(
            vec![vec![prior], vec![prior]],
            &result,
            &weights,
            0.1, // tighter score_sigma
            &mut arena2,
        );
        let p_tight = g_tight.posteriors();
        let a_tight = p_tight[0][0];
        assert!(a_tight.mu() > a.mu(), "expected tighter sigma to push posterior further; {} vs {}", a_tight.mu(), a.mu());
    }
```

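The direction of both assertions in this test can be sanity-checked by hand, because a two-team, one-player-per-team scored game is fully Gaussian. The sketch below is a closed-form calculation under stated assumptions: weights of 1.0, no drift applied inside a single game, and a performance variance of `sigma² + beta²` per player. The function `posterior_mu_a` is hypothetical and does not call the crate, so it illustrates the expected ordering rather than the exact values `Game::scored_with_arena` will produce.

```rust
/// Exact posterior mean of player A's skill after a 1v1 scored game in which
/// the margin is modelled as m_obs ~ N(perf_a - perf_b, score_sigma²) and
/// both players share the prior N(mu, sigma²) with performance noise beta.
/// Hypothetical helper for illustration; not part of the crate.
fn posterior_mu_a(mu: f64, sigma: f64, beta: f64, m_obs: f64, score_sigma: f64) -> f64 {
    let prior_var = sigma * sigma;
    // Variance of m_obs given A's skill: B's skill uncertainty, the two
    // performance noises, and the observation noise.
    let obs_var = prior_var + 2.0 * beta * beta + score_sigma * score_sigma;
    // Conjugate Gaussian update. The priors share a mean, so the margin is
    // centred on (skill_a - mu) and the shift is proportional to m_obs.
    mu + m_obs * prior_var / (prior_var + obs_var)
}

fn main() {
    let (mu, sigma, beta) = (25.0, 25.0 / 3.0, 25.0 / 6.0);
    let loose = posterior_mu_a(mu, sigma, beta, 10.0, 1.0);
    let tight = posterior_mu_a(mu, sigma, beta, 10.0, 0.1);
    println!("score_sigma = 1.0 -> mu_a = {loose:.4}");
    println!("score_sigma = 0.1 -> mu_a = {tight:.4}");
    // Same ordering as the test asserts: a tighter score_sigma moves A further.
    assert!(tight > loose && loose > mu);
}
```

With these priors the two posterior means differ only slightly, because the skill and performance variances dominate `score_sigma²`; that is why the test uses a strict comparison rather than a fixed gap.
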
- [ ] **Step 2: Run the test to verify it fails**

Run: `cargo test --lib game::tests::scored_path_sharper_when_margin_is_large`

Expected: FAIL — `no function or associated item named scored_with_arena`.

- [ ] **Step 3: Add `score_sigma` to `GameOptions`**

Replace the `GameOptions` definition (around `src/game.rs:15-28`):

```rust
#[derive(Clone, Copy, Debug)]
pub struct GameOptions {
    pub p_draw: f64,
    pub score_sigma: f64,
    pub convergence: crate::ConvergenceOptions,
}

impl Default for GameOptions {
    fn default() -> Self {
        Self {
            p_draw: crate::P_DRAW,
            score_sigma: 1.0,
            convergence: crate::ConvergenceOptions::default(),
        }
    }
}
```

- [ ] **Step 4: Add `Game::scored_with_arena` and friends**

In `Game<'a, T, D>`'s `impl` block (the one with `ranked_with_arena`, around `src/game.rs:90-133`), add a new method right after `ranked_with_arena`:

```rust
    pub(crate) fn scored_with_arena(
        teams: Vec<Vec<Rating<T, D>>>,
        scores: &'a [f64],
        weights: &'a [Vec<f64>],
        score_sigma: f64,
        arena: &mut ScratchArena,
    ) -> Self {
        debug_assert!(
            scores.len() == teams.len(),
            "scores must have the same length as teams"
        );
        debug_assert!(
            weights
                .iter()
                .zip(teams.iter())
                .all(|(w, t)| w.len() == t.len()),
            "weights must have the same dimensions as teams"
        );
        debug_assert!(score_sigma > 0.0, "score_sigma must be positive");

        let mut this = Self {
            teams,
            result: scores,
            weights,
            p_draw: 0.0,
            likelihoods: Vec::new(),
            evidence: 0.0,
        };

        this.likelihoods_scored(arena, score_sigma);
        this
    }
```

- [ ] **Step 5: Add `likelihoods_scored` (parallel to `likelihoods`)**
|
||
|
||
Right after `fn likelihoods` (around line 273), add:
|
||
|
||
```rust
|
||
fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64) {
|
||
arena.reset();
|
||
|
||
let n_teams = self.teams.len();
|
||
|
||
arena.sort_buf.extend(0..n_teams);
|
||
arena.sort_buf.sort_by(|&i, &j| {
|
||
self.result[j]
|
||
.partial_cmp(&self.result[i])
|
||
.unwrap_or(Ordering::Equal)
|
||
});
|
||
|
||
arena.team_prior.extend(arena.sort_buf.iter().map(|&t| {
|
||
self.teams[t]
|
||
.iter()
|
||
.zip(self.weights[t].iter())
|
||
.fold(N00, |p, (player, &w)| p + (player.performance() * w))
|
||
}));
|
||
|
||
let n_diffs = n_teams.saturating_sub(1);
|
||
|
||
// One MarginFactor per adjacent sorted-team pair, observed m_obs ≥ 0.
|
||
let mut links: Vec<DiffFactor> = (0..n_diffs)
|
||
.map(|i| {
|
||
let m_obs = self.result[arena.sort_buf[i]] - self.result[arena.sort_buf[i + 1]];
|
||
let vid = arena.vars.alloc(N_INF);
|
||
DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))
|
||
})
|
||
.collect();
|
||
|
||
arena.lhood_lose.resize(n_teams, N_INF);
|
||
arena.lhood_win.resize(n_teams, N_INF);
|
||
|
||
let mut step = (f64::INFINITY, f64::INFINITY);
|
||
let mut iter = 0;
|
||
|
||
while tuple_gt(step, 1e-6) && iter < 10 {
|
||
step = (0.0_f64, 0.0_f64);
|
||
|
||
for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
|
||
let pw = arena.team_prior[e] * arena.lhood_lose[e];
|
||
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
|
||
let raw = pw - pl;
|
||
arena.vars.set(lf.diff(), raw * lf.msg());
|
||
let d = lf.propagate(&mut arena.vars);
|
||
step = tuple_max(step, d);
|
||
|
||
let new_ll = pw - lf.msg();
|
||
step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
|
||
arena.lhood_lose[e + 1] = new_ll;
|
||
}
|
||
|
||
for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
|
||
let e = n_diffs - 1 - rev_i;
|
||
let pw = arena.team_prior[e] * arena.lhood_lose[e];
|
||
let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
|
||
let raw = pw - pl;
|
||
arena.vars.set(lf.diff(), raw * lf.msg());
|
||
let d = lf.propagate(&mut arena.vars);
|
||
step = tuple_max(step, d);
|
||
|
||
let new_lw = pl + lf.msg();
|
||
step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
|
||
arena.lhood_win[e] = new_lw;
|
||
}
|
||
|
||
iter += 1;
|
||
}
|
||
|
||
if n_diffs == 1 {
|
||
let raw = (arena.team_prior[0] * arena.lhood_lose[0])
|
||
- (arena.team_prior[1] * arena.lhood_win[1]);
|
||
arena.vars.set(links[0].diff(), raw * links[0].msg());
|
||
links[0].propagate(&mut arena.vars);
|
||
}
|
||
|
||
if n_diffs > 0 {
|
||
let pl1 = arena.team_prior[1] * arena.lhood_win[1];
|
||
arena.lhood_win[0] = pl1 + links[0].msg();
|
||
let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
|
||
arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
|
||
}
|
||
|
||
self.evidence = links.iter().map(|l| l.evidence()).product();
|
||
|
||
arena.inv_buf.resize(n_teams, 0);
|
||
for (si, &orig_i) in arena.sort_buf.iter().enumerate() {
|
||
arena.inv_buf[orig_i] = si;
|
||
}
|
||
|
||
self.likelihoods = self
|
||
.teams
|
||
.iter()
|
||
.zip(self.weights.iter())
|
||
.enumerate()
|
||
.map(|(orig_i, (players, weights))| {
|
||
let si = arena.inv_buf[orig_i];
|
||
let m = arena.lhood_win[si] * arena.lhood_lose[si];
|
||
let performance = players
|
||
.iter()
|
||
.zip(weights.iter())
|
||
.fold(N00, |p, (player, &w)| p + (player.performance() * w));
|
||
players
|
||
.iter()
|
||
.zip(weights.iter())
|
||
.map(|(player, &w)| {
|
||
((m - performance.exclude(player.performance() * w)) * (1.0 / w))
|
||
.forget(player.beta.powi(2))
|
||
})
|
||
.collect::<Vec<_>>()
|
||
})
|
||
.collect::<Vec<_>>();
|
||
}
|
||
```
|
||
|
||
> The body is identical to `likelihoods` except for the per-pair factor construction (no draw-margin computation, `MarginFactor` instead of `TruncFactor`). DRY would let us extract the loop, but the duplication is small (~50 lines) and the divergence may grow as more factor kinds are added; we accept it for clarity. Revisit in T4-Synergy if it gets unwieldy.
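
As a rough illustration of the per-pair construction in `likelihoods_scored`, the standalone sketch below sorts teams by descending score and derives the non-negative margin for each adjacent pair. The name `adjacent_margins` is hypothetical and not part of the crate; it only mirrors the `sort_buf` and `m_obs` logic on plain slices.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not part of the crate): returns the team indices
/// sorted by descending score, plus the margin between each adjacent pair.
fn adjacent_margins(scores: &[f64]) -> (Vec<usize>, Vec<f64>) {
    let mut order: Vec<usize> = (0..scores.len()).collect();
    // Descending sort; incomparable values (NaN) are treated as equal,
    // matching the `unwrap_or(Ordering::Equal)` used in the plan.
    order.sort_by(|&i, &j| scores[j].partial_cmp(&scores[i]).unwrap_or(Ordering::Equal));
    // One margin per adjacent sorted pair, so n teams yield n - 1 margins.
    let margins = order
        .windows(2)
        .map(|w| scores[w[0]] - scores[w[1]])
        .collect();
    (order, margins)
}
```

Because the sort is descending, every margin is at least zero, and a tie shows up as a margin of exactly `0.0` rather than as a separate draw case.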
|
||
|
||
- [ ] **Step 6: Run the test to verify it passes**
|
||
|
||
Run: `cargo test --lib game::tests::scored_path_sharper_when_margin_is_large`
|
||
|
||
Expected: PASS.
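
The second assertion in that test (a tighter `score_sigma` gives a stronger update) follows from the textbook product of two Gaussians. The sketch below shows that update on plain `f64` values; it assumes `MarginFactor` behaves like this conjugate observation, and the function name `observe_margin` and the margin of `6.0` used to exercise it are illustrative only.

```rust
/// Textbook product of a Gaussian belief N(mu, sigma^2) over the diff with a
/// Gaussian observation `m_obs` whose noise is `score_sigma`.
/// Returns the posterior (mean, standard deviation).
/// Illustrative only; this is not the crate's `MarginFactor` implementation.
fn observe_margin(mu: f64, sigma: f64, m_obs: f64, score_sigma: f64) -> (f64, f64) {
    let prior_precision = 1.0 / (sigma * sigma);
    let obs_precision = 1.0 / (score_sigma * score_sigma);
    // Precisions add; the mean is the precision-weighted average.
    let precision = prior_precision + obs_precision;
    let mean = (mu * prior_precision + m_obs * obs_precision) / precision;
    (mean, (1.0 / precision).sqrt())
}
```

With two identical priors the diff starts at mean zero, so the posterior mean lands between zero and the observed margin. Shrinking `score_sigma` raises the observation's precision and moves the mean closer to the margin, which is the direction the test checks.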
|
||
|
||
- [ ] **Step 7: Run the full test suite**
|
||
|
||
Run: `cargo test`
|
||
|
||
Expected: all pass.
|
||
|
||
- [ ] **Step 8: Format and commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add src/game.rs
|
||
git commit -m "feat(game): add scored_with_arena driving MarginFactor links"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 6: Public `Game::scored` constructor and `OwnedGame` support
|
||
|
||
**Files:**
|
||
- Modify: `src/game.rs`
|
||
|
||
- [ ] **Step 1: Write a failing test in `src/game.rs`'s test module**
|
||
|
||
```rust
|
||
#[test]
|
||
fn game_scored_public_ctor() {
|
||
use crate::Outcome;
|
||
let prior = R::new(
|
||
Gaussian::from_ms(25.0, 25.0 / 3.0),
|
||
25.0 / 6.0,
|
||
ConstantDrift(25.0 / 300.0),
|
||
);
|
||
let opts = GameOptions {
|
||
score_sigma: 1.0,
|
||
..GameOptions::default()
|
||
};
|
||
let g = Game::scored(&[&[prior], &[prior]], Outcome::scores([8.0, 2.0]), &opts).unwrap();
|
||
let p = g.posteriors();
|
||
assert!(p[0][0].mu() > p[1][0].mu());
|
||
}
|
||
|
||
#[test]
|
||
fn game_scored_rejects_ranked_outcome() {
|
||
let prior = R::new(
|
||
Gaussian::from_ms(25.0, 25.0 / 3.0),
|
||
25.0 / 6.0,
|
||
ConstantDrift(25.0 / 300.0),
|
||
);
|
||
let err = Game::scored(
|
||
&[&[prior], &[prior]],
|
||
crate::Outcome::winner(0, 2),
|
||
&GameOptions::default(),
|
||
)
|
||
.unwrap_err();
|
||
assert!(matches!(err, crate::InferenceError::MismatchedShape { .. }));
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Run the tests to verify they fail**
|
||
|
||
Run: `cargo test --lib game::tests::game_scored` (`cargo test` accepts a single name filter; this prefix matches both new tests)
|
||
|
||
Expected: FAIL — `no function or associated item named scored`.
|
||
|
||
- [ ] **Step 3: Add `OwnedGame::new_scored` constructor**
|
||
|
||
In `OwnedGame<T, D>`'s impl (around `src/game.rs:46-78`), add right after `new`:
|
||
|
||
```rust
|
||
pub(crate) fn new_scored(
|
||
teams: Vec<Vec<Rating<T, D>>>,
|
||
scores: Vec<f64>,
|
||
weights: Vec<Vec<f64>>,
|
||
score_sigma: f64,
|
||
) -> Self {
|
||
let mut arena = ScratchArena::new();
|
||
let g = Game::scored_with_arena(teams.clone(), &scores, &weights, score_sigma, &mut arena);
|
||
let likelihoods = g.likelihoods;
|
||
let evidence = g.evidence;
|
||
Self {
|
||
teams,
|
||
result: scores,
|
||
weights,
|
||
p_draw: 0.0,
|
||
likelihoods,
|
||
evidence,
|
||
}
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 4: Add `Game::scored` public method**
|
||
|
||
In the `impl<T: Time, D: Drift<T>> Game<'_, T, D>` block (around `src/game.rs:293-349`), add right after `ranked`:
|
||
|
||
```rust
|
||
pub fn scored(
|
||
teams: &[&[Rating<T, D>]],
|
||
outcome: crate::Outcome,
|
||
options: &GameOptions,
|
||
) -> Result<OwnedGame<T, D>, crate::InferenceError> {
|
||
if options.score_sigma <= 0.0 {
|
||
return Err(crate::InferenceError::InvalidParameter {
name: "score_sigma",
value: options.score_sigma,
|
||
});
|
||
}
|
||
if outcome.team_count() != teams.len() {
|
||
return Err(crate::InferenceError::MismatchedShape {
|
||
kind: "outcome scores vs teams",
|
||
expected: teams.len(),
|
||
got: outcome.team_count(),
|
||
});
|
||
}
|
||
let scores = outcome
|
||
.as_scores()
|
||
.ok_or(crate::InferenceError::MismatchedShape {
|
||
kind: "Game::scored requires Outcome::Scored",
|
||
expected: 0,
|
||
got: 0,
|
||
})?
|
||
.to_vec();
|
||
let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
|
||
let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
|
||
Ok(OwnedGame::new_scored(teams_owned, scores, weights, options.score_sigma))
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 5: Run the new tests to verify they pass**
|
||
|
||
Run: `cargo test --lib game::tests::game_scored` (the same prefix filter as in Step 2)
|
||
|
||
Expected: both PASS.
|
||
|
||
- [ ] **Step 6: Run the full test suite**
|
||
|
||
Run: `cargo test`
|
||
|
||
Expected: all pass.
|
||
|
||
- [ ] **Step 7: Format and commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add src/game.rs
|
||
git commit -m "feat(game): add public Game::scored constructor"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 7: Plumb `Outcome::Scored` through `TimeSlice` and `History::add_events`
|
||
|
||
**Files:**
|
||
- Modify: `src/time_slice.rs`
|
||
- Modify: `src/history.rs`
|
||
|
||
The per-event `Event` struct in `src/time_slice.rs:80-85` is `{ teams, evidence, weights }`. We add a `kind: EventKind` field that selects which `Game::*_with_arena` to call. Score noise (`score_sigma`) lives inside the `Scored` variant so events can in principle have per-event sigma, though the public API only exposes one history-wide knob today.
|
||
|
||
- [ ] **Step 1: Add `EventKind` to `src/time_slice.rs` and a `kind` field on `Event`**
|
||
|
||
In `src/time_slice.rs`, immediately above the `struct Event` definition (currently around line 80), add:
|
||
|
||
```rust
|
||
#[derive(Debug, Clone, Copy)]
|
||
pub(crate) enum EventKind {
|
||
Ranked,
|
||
Scored { score_sigma: f64 },
|
||
}
|
||
```
|
||
|
||
Modify `struct Event` (currently lines 81-85) to:
|
||
|
||
```rust
|
||
#[derive(Debug)]
|
||
pub(crate) struct Event {
|
||
teams: Vec<Team>,
|
||
evidence: f64,
|
||
weights: Vec<Vec<f64>>,
|
||
kind: EventKind,
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Dispatch on `kind` in `Event::iteration_direct`**
|
||
|
||
Replace the body of `Event::iteration_direct` (currently `src/time_slice.rs:123-144`):
|
||
|
||
```rust
|
||
fn iteration_direct<T: Time, D: Drift<T>>(
|
||
&mut self,
|
||
skills: &mut SkillStore,
|
||
agents: &CompetitorStore<T, D>,
|
||
p_draw: f64,
|
||
arena: &mut ScratchArena,
|
||
) {
|
||
let teams = self.within_priors(false, false, skills, agents);
|
||
let result = self.outputs();
|
||
let g = match self.kind {
|
||
EventKind::Ranked => {
|
||
Game::ranked_with_arena(teams, &result, &self.weights, p_draw, arena)
|
||
}
|
||
EventKind::Scored { score_sigma } => {
|
||
Game::scored_with_arena(teams, &result, &self.weights, score_sigma, arena)
|
||
}
|
||
};
|
||
|
||
for (t, team) in self.teams.iter_mut().enumerate() {
|
||
for (i, item) in team.items.iter_mut().enumerate() {
|
||
let old_likelihood = skills.get(item.agent).unwrap().likelihood;
|
||
let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
|
||
skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
|
||
item.likelihood = g.likelihoods[t][i];
|
||
}
|
||
}
|
||
|
||
self.evidence = g.evidence;
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 3: Dispatch on `kind` in `TimeSlice::iteration` (sequential branch)**
|
||
|
||
Inside `TimeSlice::iteration` (currently `src/time_slice.rs:295-325`), replace the body of the `if from > 0 || self.color_groups.is_empty()` branch's inner `for event in ...` loop. The `Game::ranked_with_arena(...)` call (lines 302-308) becomes:
|
||
|
||
```rust
|
||
let g = match event.kind {
|
||
EventKind::Ranked => Game::ranked_with_arena(
|
||
teams,
|
||
&result,
|
||
&event.weights,
|
||
self.p_draw,
|
||
&mut self.arena,
|
||
),
|
||
EventKind::Scored { score_sigma } => Game::scored_with_arena(
|
||
teams,
|
||
&result,
|
||
&event.weights,
|
||
score_sigma,
|
||
&mut self.arena,
|
||
),
|
||
};
|
||
```
|
||
|
||
(The rest of that loop body — likelihood update + `event.evidence = g.evidence` — is unchanged.)
|
||
|
||
- [ ] **Step 4: Dispatch on `kind` in `TimeSlice::log_evidence`**
|
||
|
||
`TimeSlice::log_evidence` (currently `src/time_slice.rs:467-532`) calls `Game::ranked_with_arena` at several call sites (including lines 482-490 and 506-514). At each call site, switch to a match on `event.kind`, mirroring Step 2.
|
||
|
||
Add a helper inside the impl to keep the call sites tidy:
|
||
|
||
```rust
|
||
fn run_event<D: Drift<T>>(
|
||
&self,
|
||
event: &Event,
|
||
online: bool,
|
||
forward: bool,
|
||
agents: &CompetitorStore<T, D>,
|
||
arena: &mut ScratchArena,
|
||
) -> f64 {
|
||
let teams = event.within_priors(online, forward, &self.skills, agents);
|
||
let result = event.outputs();
|
||
match event.kind {
|
||
EventKind::Ranked => {
|
||
Game::ranked_with_arena(teams, &result, &event.weights, self.p_draw, arena).evidence
|
||
}
|
||
EventKind::Scored { score_sigma } => {
|
||
Game::scored_with_arena(teams, &result, &event.weights, score_sigma, arena)
|
||
.evidence
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
Then replace the inline `Game::ranked_with_arena(...).evidence.ln()` calls with `self.run_event(event, online, forward, agents, &mut arena).ln()`.
|
||
|
||
- [ ] **Step 5: Extend `TimeSlice::add_events` signature with per-event `kinds`**
|
||
|
||
Change the `add_events` signature (currently `src/time_slice.rs:203-209`) to:
|
||
|
||
```rust
|
||
pub fn add_events<D: Drift<T>>(
|
||
&mut self,
|
||
composition: Vec<Vec<Vec<Index>>>,
|
||
results: Vec<Vec<f64>>,
|
||
weights: Vec<Vec<Vec<f64>>>,
|
||
kinds: Vec<EventKind>,
|
||
agents: &CompetitorStore<T, D>,
|
||
) {
|
||
```
|
||
|
||
Inside the same method, update the event-construction map (around line 240). Each constructed `Event` gets its kind from `kinds[e]`:
|
||
|
||
```rust
|
||
Event {
|
||
teams,
|
||
evidence: 0.0,
|
||
weights,
|
||
kind: kinds[e],
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 6: Update `TimeSlice::add_events`'s tests to pass the new argument**
|
||
|
||
Five call sites, at `src/time_slice.rs:604`, `:680`, `:759`, `:790`, and `:855` (in the unit tests `test_one_event_each`, `test_same_strength`, `test_add_events`, `time_slice_color_groups_reorders_events`), call `time_slice.add_events(...)`. For each call, add a fourth argument `vec![EventKind::Ranked; n_events]` between `weights` and `&agents`, where `n_events` is the number of events passed in that call. Example:
|
||
|
||
```rust
|
||
time_slice.add_events(
|
||
vec![
|
||
vec![vec![a], vec![b]],
|
||
vec![vec![c], vec![d]],
|
||
vec![vec![e], vec![f]],
|
||
],
|
||
vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
|
||
vec![],
|
||
vec![EventKind::Ranked; 3],
|
||
&agents,
|
||
);
|
||
```
|
||
|
||
- [ ] **Step 7: Update the `History` callers of `TimeSlice::add_events`**
|
||
|
||
In `src/history.rs:562` and `:572`, the calls pass `composition, results, weights, &self.agents`. Add the kinds vector. We'll thread the per-event `EventKind` through `add_events_with_prior` in Step 8 and pass it in here as `kinds_chunk`.
|
||
|
||
- [ ] **Step 8: Extend `History::add_events_with_prior` to accept and route per-event kinds**
|
||
|
||
In `src/history.rs:447-454`, change the signature to:
|
||
|
||
```rust
|
||
pub(crate) fn add_events_with_prior(
|
||
&mut self,
|
||
composition: Vec<Vec<Vec<Index>>>,
|
||
results: Vec<Vec<f64>>,
|
||
times: Vec<T>,
|
||
weights: Vec<Vec<Vec<f64>>>,
|
||
kinds: Vec<crate::time_slice::EventKind>,
|
||
mut priors: HashMap<Index, Rating<T, D>>,
|
||
) -> Result<(), InferenceError> {
|
||
```
|
||
|
||
Around line 543, alongside the existing per-batch slicing of `composition`, `results`, and `weights`, add:
|
||
|
||
```rust
|
||
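// The length check lets an empty `kinds` through; treat that as "all ranked"
// instead of indexing out of bounds.
let kinds_chunk: Vec<crate::time_slice::EventKind> = (i..j)
.map(|e| kinds.get(o[e]).copied().unwrap_or(crate::time_slice::EventKind::Ranked))
.collect();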
|
||
```
|
||
|
||
Update the two `time_slice.add_events(composition, results, weights, &self.agents)` call sites (lines 562 and 572) to:
|
||
|
||
```rust
|
||
time_slice.add_events(composition, results, weights, kinds_chunk, &self.agents);
|
||
```
|
||
|
||
(For both branches — existing-slice and new-slice. Use `kinds_chunk.clone()` if the borrow checker complains; the vec is small.)
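
The per-batch selection used for `kinds_chunk` can be sketched in isolation as below. This assumes `o` is the permutation that orders events by time and `i..j` is one time batch, which is how the surrounding slicing reads; the helper name `chunk_by_order` is hypothetical.

```rust
/// Hypothetical stand-in for the per-batch slicing: `order` is assumed to be
/// the permutation that sorts events by time, and `i..j` one time batch.
/// Each position in the batch is mapped back to its original event index.
fn chunk_by_order<K: Copy>(items: &[K], order: &[usize], i: usize, j: usize) -> Vec<K> {
    (i..j).map(|e| items[order[e]]).collect()
}
```

Indexing through `order` keeps each kind attached to its own event after reordering, which is what `composition`, `results`, and `weights` rely on as well.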
|
||
|
||
Validation: also add a length check at the top of the function alongside the existing ones:
|
||
|
||
```rust
|
||
if !kinds.is_empty() && kinds.len() != composition.len() {
|
||
return Err(InferenceError::MismatchedShape {
|
||
kind: "kinds",
|
||
expected: composition.len(),
|
||
got: kinds.len(),
|
||
});
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 9: Update `record_winner` and `record_draw` to pass `kinds`**
|
||
|
||
In `src/history.rs:617-647`, update both calls:
|
||
|
||
```rust
|
||
self.add_events_with_prior(
|
||
vec![vec![vec![w], vec![l]]],
|
||
vec![vec![1.0, 0.0]],
|
||
vec![time],
|
||
vec![],
|
||
vec![crate::time_slice::EventKind::Ranked],
|
||
HashMap::new(),
|
||
)
|
||
```
|
||
|
||
Same shape for `record_draw`.
|
||
|
||
- [ ] **Step 10: Update `History::add_events` to compute kinds per event and pass through**
|
||
|
||
Replace the placeholder match arm added in Task 3 Step 5 (around `src/history.rs:672-680`). The full updated event-loop body of `History::add_events` (around lines 671-705) becomes:
|
||
|
||
```rust
|
||
let mut kinds: Vec<crate::time_slice::EventKind> = Vec::with_capacity(events.len());
|
||
|
||
for ev in events {
|
||
let team_count = ev.teams.len();
|
||
|
||
let (results_for_event, kind): (Vec<f64>, crate::time_slice::EventKind) = match &ev.outcome {
|
||
Outcome::Ranked(ranks) => {
|
||
if ranks.len() != team_count {
|
||
return Err(InferenceError::MismatchedShape {
|
||
kind: "outcome ranks vs teams",
|
||
expected: team_count,
|
||
got: ranks.len(),
|
||
});
|
||
}
|
||
let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
|
||
let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
|
||
(inverted, crate::time_slice::EventKind::Ranked)
|
||
}
|
||
Outcome::Scored(scores) => {
|
||
if scores.len() != team_count {
|
||
return Err(InferenceError::MismatchedShape {
|
||
kind: "outcome scores vs teams",
|
||
expected: team_count,
|
||
got: scores.len(),
|
||
});
|
||
}
|
||
(
|
||
scores.to_vec(),
|
||
crate::time_slice::EventKind::Scored {
|
||
score_sigma: self.score_sigma,
|
||
},
|
||
)
|
||
}
|
||
};
|
||
|
||
let mut event_comp: Vec<Vec<Index>> = Vec::with_capacity(team_count);
|
||
let mut event_weights: Vec<Vec<f64>> = Vec::with_capacity(team_count);
|
||
|
||
for team in ev.teams {
|
||
let mut team_indices: Vec<Index> = Vec::with_capacity(team.members.len());
|
||
let mut team_weights: Vec<f64> = Vec::with_capacity(team.members.len());
|
||
for member in team.members {
|
||
let idx = self.keys.get_or_create(&member.key);
|
||
team_indices.push(idx);
|
||
team_weights.push(member.weight);
|
||
if let Some(prior) = member.prior {
|
||
priors.insert(idx, Rating::new(prior, self.beta, self.drift));
|
||
}
|
||
}
|
||
event_comp.push(team_indices);
|
||
event_weights.push(team_weights);
|
||
}
|
||
composition.push(event_comp);
|
||
weights.push(event_weights);
|
||
results.push(results_for_event);
|
||
times.push(ev.time);
|
||
kinds.push(kind);
|
||
}
|
||
|
||
self.add_events_with_prior(composition, results, times, weights, kinds, priors)
|
||
```
|
||
|
||
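(Note: `history.rs` refers to the enum as `crate::time_slice::EventKind`, so confirm that the `pub(crate) enum EventKind` in `time_slice.rs` is reachable through that path. If `TimeSlice::add_events` is reachable from outside the crate, the enum likely has to be `pub` as well; otherwise rustc's `private_interfaces` lint may fire under `-D warnings`.)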
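
The `Outcome::Ranked` arm above turns ranks into "higher is better" values so that ranked and scored events can share one `results` vector. A minimal sketch of that inversion follows, assuming lower rank numbers are better and using `u32` as a stand-in for the crate's rank type; `invert_ranks` is a hypothetical name.

```rust
/// Hypothetical helper mirroring the `Outcome::Ranked` arm: converts ranks
/// (assumed: lower number = better) into "higher is better" values.
fn invert_ranks(ranks: &[u32]) -> Vec<f64> {
    // An empty slice falls back to 0, matching `unwrap_or(0)` in the plan.
    let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
    ranks.iter().map(|&r| max_rank - r as f64).collect()
}
```

Scored events skip this step because their scores are already higher-is-better, so they are copied into `results` unchanged.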
|
||
|
||
- [ ] **Step 11: Add `score_sigma: f64` field to `History` and `HistoryBuilder`**
|
||
|
||
In `src/history.rs:21-37` (`HistoryBuilder` struct), add field `score_sigma: f64,`.
|
||
|
||
In the `Default` impl (around line 121), set `score_sigma: 1.0`.
|
||
|
||
In `History::builder_with_key` (around line 170), set `score_sigma: 1.0`.
|
||
|
||
In each builder transition method that constructs a new `HistoryBuilder` (`drift` at line 55, `observer` at line 85), copy the `score_sigma` field through.
|
||
|
||
Add a builder method (insert near `p_draw`, around line 70):
|
||
|
||
```rust
|
||
pub fn score_sigma(mut self, score_sigma: f64) -> Self {
|
||
self.score_sigma = score_sigma;
|
||
self
|
||
}
|
||
```
|
||
|
||
In `HistoryBuilder::build` (around line 100), set `score_sigma: self.score_sigma,` on the constructed `History`.
|
||
|
||
In the `History` struct (around line 135), add `score_sigma: f64,`.
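
The field plumbing described in this step is easy to get wrong in the type-changing builder methods, because they construct a fresh builder. The miniature below shows the pattern under stated assumptions: `MiniBuilder`, `MiniHistory`, and their fields are hypothetical stand-ins, not the crate's real types.

```rust
/// Hypothetical miniature of the builder plumbing (names are illustrative).
struct MiniBuilder<D> {
    drift: D,
    score_sigma: f64,
}

struct MiniHistory<D> {
    drift: D,
    score_sigma: f64,
}

impl MiniBuilder<()> {
    /// Starts with the default `score_sigma` of 1.0.
    fn new() -> Self {
        Self { drift: (), score_sigma: 1.0 }
    }
}

impl<D> MiniBuilder<D> {
    fn score_sigma(mut self, score_sigma: f64) -> Self {
        self.score_sigma = score_sigma;
        self
    }

    /// Type-changing transition: this builds a new builder, so every field
    /// (including `score_sigma`) has to be copied across by hand.
    fn drift<D2>(self, drift: D2) -> MiniBuilder<D2> {
        MiniBuilder { drift, score_sigma: self.score_sigma }
    }

    fn build(self) -> MiniHistory<D> {
        MiniHistory { drift: self.drift, score_sigma: self.score_sigma }
    }
}
```

If `drift` forgot to copy `score_sigma`, a call such as `.score_sigma(2.0).drift(...)` would silently revert to the default, so the real `drift` and `observer` transitions deserve the same care.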
|
||
|
||
- [ ] **Step 12: Add integration tests in `tests/scored.rs` (new file)**
|
||
|
||
Create `tests/scored.rs`:
|
||
|
||
```rust
|
||
use smallvec::smallvec;
|
||
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
|
||
|
||
#[test]
|
||
fn scored_two_team_one_event_pulls_winner_up() {
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.0))
|
||
.build();
|
||
|
||
let events: Vec<Event<i64, &'static str>> = vec![Event {
|
||
time: 1,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("alice")]),
|
||
Team::with_members([Member::new("bob")]),
|
||
],
|
||
outcome: Outcome::scores([10.0, 0.0]),
|
||
}];
|
||
h.add_events(events).unwrap();
|
||
h.converge().unwrap();
|
||
|
||
let alice = h.current_skill(&"alice").unwrap();
|
||
let bob = h.current_skill(&"bob").unwrap();
|
||
assert!(alice.mu() > 25.0, "alice mu should exceed prior; got {}", alice.mu());
|
||
assert!(bob.mu() < 25.0, "bob mu should be below prior; got {}", bob.mu());
|
||
}
|
||
|
||
#[test]
|
||
fn scored_zero_margin_treats_as_tie() {
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.0))
|
||
.build();
|
||
|
||
let events: Vec<Event<i64, &'static str>> = vec![Event {
|
||
time: 1,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("alice")]),
|
||
Team::with_members([Member::new("bob")]),
|
||
],
|
||
outcome: Outcome::scores([3.0, 3.0]),
|
||
}];
|
||
h.add_events(events).unwrap();
|
||
h.converge().unwrap();
|
||
|
||
let alice = h.current_skill(&"alice").unwrap();
|
||
let bob = h.current_skill(&"bob").unwrap();
|
||
assert!((alice.mu() - bob.mu()).abs() < 1e-6, "tied scores -> equal mu; got {} vs {}", alice.mu(), bob.mu());
|
||
// Sigma should still tighten (we have evidence diff ≈ 0).
|
||
assert!(alice.sigma() < 25.0 / 3.0);
|
||
}
|
||
|
||
#[test]
|
||
fn scored_three_team_partial_order() {
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.0))
|
||
.build();
|
||
|
||
let events: Vec<Event<i64, &'static str>> = vec![Event {
|
||
time: 1,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("a")]),
|
||
Team::with_members([Member::new("b")]),
|
||
Team::with_members([Member::new("c")]),
|
||
],
|
||
outcome: Outcome::scores([20.0, 10.0, 5.0]),
|
||
}];
|
||
h.add_events(events).unwrap();
|
||
h.converge().unwrap();
|
||
|
||
let a = h.current_skill(&"a").unwrap();
|
||
let b = h.current_skill(&"b").unwrap();
|
||
let c = h.current_skill(&"c").unwrap();
|
||
assert!(a.mu() > b.mu());
|
||
assert!(b.mu() > c.mu());
|
||
}
|
||
|
||
#[test]
|
||
fn scored_rejects_outcome_team_count_mismatch() {
|
||
use trueskill_tt::InferenceError;
|
||
let mut h: History = History::builder().build();
|
||
let events: Vec<Event<i64, &'static str>> = vec![Event {
|
||
time: 1,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("a")]),
|
||
Team::with_members([Member::new("b")]),
|
||
],
|
||
outcome: Outcome::scores([1.0, 2.0, 3.0]),
|
||
}];
|
||
let err = h.add_events(events).unwrap_err();
|
||
assert!(matches!(err, InferenceError::MismatchedShape { .. }));
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 13: Run the integration tests**
|
||
|
||
Run: `cargo test --test scored`
|
||
|
||
Expected: all four tests PASS (the wiring from Steps 1–11 is now complete).
|
||
|
||
- [ ] **Step 14: Run the full test suite + clippy**
|
||
|
||
Run: `cargo test && cargo clippy --all-targets -- -D warnings`
|
||
|
||
Expected: all pass, no clippy warnings. Pay particular attention to the existing `time_slice` unit tests — they were updated in Step 6 and need to use `EventKind::Ranked`.
|
||
|
||
- [ ] **Step 15: Format and commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add src/history.rs src/time_slice.rs tests/scored.rs
|
||
git commit -m "feat(history): route Outcome::Scored events through MarginFactor path"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 8: `EventBuilder::scores` convenience
|
||
|
||
**Files:**
|
||
- Modify: `src/event_builder.rs`
|
||
- Modify: `tests/api_shape.rs` (add a fluent-builder scored test)
|
||
|
||
- [ ] **Step 1: Write a failing test in `tests/api_shape.rs`**
|
||
|
||
Append to the existing test list:
|
||
|
||
```rust
|
||
#[test]
|
||
fn fluent_event_builder_scores() {
|
||
use trueskill_tt::ConstantDrift;
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.0))
|
||
.build();
|
||
|
||
h.event(1)
|
||
.team(["alice"])
|
||
.team(["bob"])
|
||
.scores([12.0, 4.0])
|
||
.commit()
|
||
.unwrap();
|
||
h.converge().unwrap();
|
||
|
||
let a = h.current_skill(&"alice").unwrap();
|
||
let b = h.current_skill(&"bob").unwrap();
|
||
assert!(a.mu() > b.mu());
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Run the test to verify it fails**
|
||
|
||
Run: `cargo test --test api_shape fluent_event_builder_scores`
|
||
|
||
Expected: FAIL — `no method named scores`.
|
||
|
||
- [ ] **Step 3: Add `.scores` to `EventBuilder`**
|
||
|
||
In `src/event_builder.rs`, alongside `.ranking`/`.winner`/`.draw` (around line 73), add:
|
||
|
||
```rust
|
||
/// Set explicit per-team continuous scores; higher = better.
|
||
pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
|
||
self.event.outcome = crate::Outcome::scores(scores);
|
||
self
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 4: Run the test to verify it passes**
|
||
|
||
Run: `cargo test --test api_shape fluent_event_builder_scores`
|
||
|
||
Expected: PASS.
|
||
|
||
- [ ] **Step 5: Run the full test suite**
|
||
|
||
Run: `cargo test`
|
||
|
||
Expected: all pass.
|
||
|
||
- [ ] **Step 6: Format and commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add src/event_builder.rs tests/api_shape.rs
|
||
git commit -m "feat(event-builder): add .scores convenience for Outcome::Scored"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 9: Worked example — scored matches end-to-end
|
||
|
||
**Files:**
|
||
- Create: `examples/scored.rs`
|
||
|
||
- [ ] **Step 1: Create the example**
|
||
|
||
```rust
|
||
//! Worked example: continuous-score outcomes via `Outcome::Scored`.
|
||
//!
|
||
//! Three players play a small round-robin where the score margin matters,
|
||
//! not just who won. We show how `score_sigma` controls how much weight
|
||
//! the engine places on the observed margin.
|
||
//!
|
||
//! Run with: `cargo run --example scored --release`
|
||
|
||
use smallvec::smallvec;
|
||
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
|
||
|
||
fn main() {
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.03))
|
||
.score_sigma(2.0) // tune to data; smaller = trust margins more
|
||
.build();
|
||
|
||
let events: Vec<Event<i64, &'static str>> = vec![
|
||
Event {
|
||
time: 1,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("alice")]),
|
||
Team::with_members([Member::new("bob")]),
|
||
],
|
||
outcome: Outcome::scores([21.0, 9.0]),
|
||
},
|
||
Event {
|
||
time: 2,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("bob")]),
|
||
Team::with_members([Member::new("carol")]),
|
||
],
|
||
outcome: Outcome::scores([21.0, 18.0]),
|
||
},
|
||
Event {
|
||
time: 3,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new("alice")]),
|
||
Team::with_members([Member::new("carol")]),
|
||
],
|
||
outcome: Outcome::scores([21.0, 21.0]),
|
||
},
|
||
];
|
||
h.add_events(events).unwrap();
|
||
|
||
let report = h.converge().unwrap();
|
||
println!(
|
||
"converged={}, iterations={}, log_evidence={:.4}",
|
||
report.converged, report.iterations, report.log_evidence
|
||
);
|
||
|
||
for who in &["alice", "bob", "carol"] {
|
||
let s = h.current_skill(who).unwrap();
|
||
println!("{:>6}: mu={:>7.3} sigma={:.3}", who, s.mu(), s.sigma());
|
||
}
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 2: Confirm the example compiles and runs**
|
||
|
||
Run: `cargo run --example scored --release`
|
||
|
||
Expected: prints converged=true with three player skills; alice highest, bob middle, carol lowest (or close to bob — depends on `score_sigma`).
|
||
|
||
- [ ] **Step 3: Commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add examples/scored.rs
|
||
git commit -m "docs(examples): worked Outcome::Scored example"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 10: Benchmark — scored ingestion + convergence
|
||
|
||
**Files:**
|
||
- Create: `benches/scored.rs`
|
||
- Modify: `Cargo.toml` (add `[[bench]]` entry if needed)
|
||
|
||
- [ ] **Step 1: Check `Cargo.toml` for the existing bench wiring**
|
||
|
||
Run: `grep -A 3 'bench' Cargo.toml`
|
||
|
||
If `autobenches = false` is set or each bench is registered explicitly, add a new entry:
|
||
|
||
```toml
|
||
[[bench]]
|
||
name = "scored"
|
||
harness = false
|
||
```
|
||
|
||
- [ ] **Step 2: Create `benches/scored.rs` modeled on `benches/batch.rs`**
|
||
|
||
```rust
|
||
use criterion::{Criterion, criterion_group, criterion_main};
|
||
use smallvec::smallvec;
|
||
use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
|
||
|
||
fn bench_scored_history(c: &mut Criterion) {
|
||
c.bench_function("scored_history_60_events_30_iter", |bencher| {
|
||
bencher.iter(|| {
|
||
let mut h = History::builder()
|
||
.mu(25.0)
|
||
.sigma(25.0 / 3.0)
|
||
.beta(25.0 / 6.0)
|
||
.drift(ConstantDrift(0.03))
|
||
.score_sigma(2.0)
|
||
.build();
|
||
|
||
let mut events: Vec<Event<i64, String>> = Vec::with_capacity(60);
|
||
for i in 0..60 {
|
||
let a = format!("p{}", i % 20);
|
||
let b = format!("p{}", (i + 7) % 20);
|
||
let s_a = (i as f64 * 0.3).sin().abs() * 21.0;
|
||
let s_b = (i as f64 * 0.3).cos().abs() * 21.0;
|
||
events.push(Event {
|
||
time: 1 + (i / 6) as i64,
|
||
teams: smallvec![
|
||
Team::with_members([Member::new(a)]),
|
||
Team::with_members([Member::new(b)]),
|
||
],
|
||
outcome: Outcome::scores([s_a, s_b]),
|
||
});
|
||
}
|
||
h.add_events(events).unwrap();
|
||
h.converge().unwrap();
|
||
});
|
||
});
|
||
}
|
||
|
||
criterion_group!(benches, bench_scored_history);
|
||
criterion_main!(benches);
|
||
```
|
||
|
||
> The `History` here uses `String` keys to match the typical real-world bench shape; if `History<i64, _, _, String>` requires `builder_with_key`, adapt accordingly.
|
||
|
||
- [ ] **Step 3: Verify the benchmark compiles**
|
||
|
||
Run: `cargo bench --no-run --bench scored`
|
||
|
||
Expected: builds without error.
|
||
|
||
- [ ] **Step 4: Run the benchmark and capture a baseline number**
|
||
|
||
Run: `cargo bench --bench scored 2>&1 | tee benches/scored_baseline.txt`
|
||
|
||
(Save the result alongside the existing `benches/baseline.txt` so future tiers can compare.)
|
||
|
||
- [ ] **Step 5: Commit**
|
||
|
||
```bash
|
||
cargo +nightly fmt
|
||
git add benches/scored.rs benches/scored_baseline.txt Cargo.toml
|
||
git commit -m "bench(scored): add criterion bench mirroring batch bench"
|
||
```
|
||
|
||
---
|
||
|
||
### Task 11: Documentation — README + CLAUDE.md status update
|
||
|
||
**Files:**
|
||
- Modify: `README.md`
|
||
- Modify: `CLAUDE.md`
|
||
- Modify: `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md` (mark MarginFactor done)
|
||
|
||
- [ ] **Step 1: Add a "Scored outcomes" subsection to `README.md`**
|
||
|
||
Find the existing `## Usage` section (or equivalent) and add:
|
||

````markdown
### Scored outcomes

Use `Outcome::scores([...])` when you have continuous per-team scores rather
than just ranks. Adjacent score margins flow into a `MarginFactor` that adds
soft Gaussian evidence about the latent performance diff. Configure
`HistoryBuilder::score_sigma(σ)` to control how much you trust the margins
(smaller σ = more trust).

```rust
use trueskill_tt::{History, Outcome};

let mut h = History::builder().score_sigma(2.0).build();
h.event(1)
    .team(["alice"])
    .team(["bob"])
    .scores([21.0, 9.0])
    .commit()
    .unwrap();
h.converge().unwrap();
```
````

(The outer fence uses four backticks only so that the inner `rust` block nests cleanly inside *this* plan file; copy the content between the outer fences into `README.md` as-is, keeping the inner triple-backtick fence.)
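
To make "smaller σ = more trust" concrete, the sketch below fuses one Gaussian belief about a pair's performance diff with one observed margin under noise `score_sigma`. It is standalone arithmetic, not the crate's `MarginFactor`: the `Belief` struct, the `observe_margin` function, and the `N(0, 2²)` starting belief are assumptions made for illustration, and the real factor runs inside EP alongside the rest of the graph.

```rust
/// A Gaussian belief described by its mean and standard deviation.
/// (Hypothetical type for this sketch, not part of the crate.)
#[derive(Debug, Clone, Copy)]
struct Belief {
    mu: f64,
    sigma: f64,
}

/// Combine a belief about the performance diff with one observed score
/// margin, modelled as `m_obs ~ N(diff, score_sigma^2)`.
/// Returns the updated belief and the log-evidence of the observation.
fn observe_margin(cavity: Belief, m_obs: f64, score_sigma: f64) -> (Belief, f64) {
    let prior_var = cavity.sigma * cavity.sigma;
    let obs_var = score_sigma * score_sigma;

    // Product of two Gaussians: precisions add, and the mean is the
    // precision-weighted average of the prior mean and the observation.
    let post_prec = 1.0 / prior_var + 1.0 / obs_var;
    let post_mu = (cavity.mu / prior_var + m_obs / obs_var) / post_prec;
    let posterior = Belief {
        mu: post_mu,
        sigma: (1.0 / post_prec).sqrt(),
    };

    // Marginal likelihood of the observation: m_obs ~ N(mu, prior_var + obs_var).
    let total_var = prior_var + obs_var;
    let resid = m_obs - cavity.mu;
    let log_evidence =
        -0.5 * ((2.0 * std::f64::consts::PI * total_var).ln() + resid * resid / total_var);

    (posterior, log_evidence)
}
```

With a starting belief of `N(0, 2²)` and the margin of 12 from the 21 to 9 example above, `score_sigma = 2.0` moves the posterior mean to 6.0, while `score_sigma = 1.0` moves it to 9.6 and leaves a narrower posterior.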

- [ ] **Step 2: Update `CLAUDE.md` architecture notes**

In `CLAUDE.md`, add to the existing factor list (or near the architecture section):

```
- `MarginFactor` (factor/margin.rs) — Gaussian observation factor on a diff variable; engaged by `Outcome::Scored`.
```

- [ ] **Step 3: Mark the T4-Margin item complete in the spec**

In `docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`, find the T4 section (line 577 onward):

```markdown
- `MarginFactor` → enables `Outcome::Scored`.
```

Change to:

```markdown
- `MarginFactor` → enables `Outcome::Scored`. **Done** (see `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`).
```

- [ ] **Step 4: Final full test + clippy + fmt run**

Run:

```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test
cargo bench --no-run
```

Expected: all green, no warnings, all bench targets compile.

- [ ] **Step 5: Commit**

```bash
git add README.md CLAUDE.md docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md
git commit -m "docs(t4-margin): document Outcome::Scored and mark spec item done"
```

---

## Acceptance criteria

- All existing lib + integration tests still pass with their existing golden values (Trunc path is bit-for-bit unchanged after the `DiffFactor` refactor in Task 4).
- `cargo test --test scored` passes all four tests added in Task 7.
- `cargo run --example scored --release` runs and prints sensible posteriors.
- `cargo bench --bench scored` produces a baseline result saved under `benches/`.
- `cargo clippy --all-targets -- -D warnings` is clean.
- `Outcome::Scored` is accepted by the public API: `History::add_events`, `History::event(...).scores(...)`, and `Game::scored`.
- `score_sigma` is configurable via `HistoryBuilder::score_sigma` and `GameOptions::score_sigma`, default `1.0`.
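
The last criterion fixes the default at `1.0`, and a standard deviation is only meaningful when strictly positive. A hypothetical standalone sketch of such a check follows; `SketchError`, `validate_score_sigma`, the field types, and the extra finiteness test are assumptions for illustration, not the crate's actual code.

```rust
/// Stand-in for an `InvalidParameter { name, value }` style error.
/// The field types are assumptions made for this sketch.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidParameter { name: &'static str, value: f64 },
}

/// Default observation noise, matching the acceptance criterion.
const DEFAULT_SCORE_SIGMA: f64 = 1.0;

/// Accept only finite, strictly positive values; reject everything else
/// (zero, negatives, NaN, infinities) with a descriptive error.
fn validate_score_sigma(value: f64) -> Result<f64, SketchError> {
    if value.is_finite() && value > 0.0 {
        Ok(value)
    } else {
        Err(SketchError::InvalidParameter {
            name: "score_sigma",
            value,
        })
    }
}
```

A test along these lines would pass the default through unchanged and reject `0.0`, negative values, and `NaN`.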

## Out of scope (deferred to later T4 plans)

- Damped / Residual schedules
- SynergyFactor
- ScoreFactor (continuous outcome variable distinct from observed margin)
- Per-event `score_sigma` overrides (currently history-wide)
- Tie-band MarginFactor variant (`m_obs` band rather than point observation)