Compare commits: `68be7ab5b7...main` (7 commits)

| SHA1 |
|---|
| 2b5d3b1687 |
| e4ff46f45c |
| 7742b2b891 |
| 52482eea5f |
| b46e7f068d |
| d1d6b5136c |
| 46625d247a |
@@ -2,8 +2,54 @@

All notable changes to this project will be documented in this file.

## 0.1.2 - 2026-06-12

### Bug Fixes

- fix: release generated CHANGELOG at the wrong location
- fix(gaussian): treat non-positive precision as improper in mu()/sigma()

### Documentation

- docs: spec for post-T4-MarginFactor tech debt cleanup
- docs: implementation plan for post-T4-MarginFactor tech debt cleanup
- docs: fix stale numerics in t4-margin-factor plan
- docs: spec for game-local Damped EP
- docs: implementation plan for game-local Damped EP
- docs: spec for History → TimeSlice ConvergenceOptions plumbing
- docs: implementation plan for History → TimeSlice plumbing
- docs: spec for per-event score_sigma override
- docs: implementation plan for per-event score_sigma override

### Features

- feat(gaussian): add damp_natural helper for EP damping
- feat(convergence): add ConvergenceOptions::alpha damping field
- feat(factor): add TruncFactor::propagate_with_alpha for EP damping
- feat(factor): add MarginFactor::propagate_with_alpha for EP damping
- feat(game): plumb ConvergenceOptions through to run_chain
- feat(time_slice): inference callsites read self.convergence
- feat(outcome): per-event score_sigma override on Outcome::Scored
- feat(event_builder): expose scores_with_sigma fluent method

### Refactor

- refactor: dedupe Game::likelihoods and likelihoods_scored via run_chain
- refactor: make BuiltinFactor::log_evidence match exhaustive
- refactor(time_slice): add convergence field, rename iterate_to_convergence

### Testing

- test(game): integration tests for ConvergenceOptions behavior
- test(history): end-to-end ConvergenceOptions propagation tests
- test(history): end-to-end per-event score_sigma override tests

## 0.1.1 - 2026-04-27

### Miscellaneous Tasks

- chore: Release trueskill-tt version 0.1.1

### Other (unconventional)

- T0 + T1 + T2: engine redesign through new API surface (#1)

+1
-1
@@ -1,6 +1,6 @@
[package]
name = "trueskill-tt"
-version = "0.1.1"
+version = "0.1.2"
edition = "2024"

[lib]

@@ -0,0 +1,540 @@
# Per-Event `score_sigma` Override Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Let users specify a per-event score-sigma override on `Outcome::Scored`, defaulting to `HistoryBuilder::score_sigma` when not set.

**Architecture:** `Outcome::Scored` becomes a struct variant with an `Option<f64>` `sigma` field. `History::add_events` resolves `sigma.unwrap_or(self.score_sigma)` at ingest time, so downstream `EventKind::Scored.score_sigma` stays a plain `f64` and `TimeSlice` / `run_chain` need zero changes. Two new constructors (`Outcome::scores_with_sigma` and `EventBuilder::scores_with_sigma`) cover the override path; existing `scores(...)` keeps its signature.

**Tech Stack:** Rust 2024, `cargo +nightly fmt`, `cargo clippy`, `cargo test`.

---

## Spec reference

`docs/superpowers/specs/2026-05-08-per-event-score-sigma-design.md`

## File map

| File | Why touched |
|---|---|
| `src/outcome.rs` | `Outcome::Scored` variant becomes a struct; pattern matches in `team_count`, `as_scores`, `as_ranks`; new `scores_with_sigma` constructor; existing `scores` constructor body adapts |
| `src/history.rs` | The single ingest pattern match at `:735` resolves `sigma.unwrap_or(self.score_sigma)`; three new end-to-end tests |
| `src/event_builder.rs` | New `scores_with_sigma` builder method |

## Pre-flight context for the implementer

- `Outcome` is `pub`. Currently a tuple-variant enum at `src/outcome.rs:18-21`. Changing `Scored(SmallVec)` → `Scored { scores, sigma }` is a breaking change to a public variant shape, acceptable in 0.1.x.
- Pattern-match callsite inventory across the workspace (verified by grep): only ONE site destructures the variant, `src/history.rs:735` (`crate::Outcome::Scored(scores) => { ... }`). Every other reference is either a constructor call (`Outcome::scores(...)`) or a string literal in a doc/error message. The constructors keep their existing signatures, so callsites don't need updating.
- `Outcome::scores(I)` constructor at `src/outcome.rs:44`: keep the signature `pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self`. Only the body changes (it now builds `Self::Scored { scores: ..., sigma: None }`).
- `as_scores`, `as_ranks`, `team_count` accessors at `src/outcome.rs:48-67`: their public signatures stay the same. Internal pattern matches adapt mechanically.
- `EventBuilder::scores(I)` at `src/event_builder.rs:79-82`: keep unchanged. The new `scores_with_sigma(I, f64)` lives next to it.
- `History::score_sigma` at `src/history.rs:165`: still the history-wide default. `HistoryBuilder::score_sigma(s)` builder method at `src/history.rs:82-89` stays as-is.
- `EventKind::Scored { score_sigma: f64 }` at `src/time_slice.rs:51`: already per-event-shaped. Don't touch.
- Test baseline: 100 lib + 27 integration tests, all passing.

---

### Task 1: `Outcome::Scored` becomes a struct variant + constructors

This is the foundational shape change. After Step 3 the new variant and both constructors (`scores`, `scores_with_sigma`) exist on `Outcome`, but `History::add_events` (the only consumer that destructures the variant) no longer compiles; Step 5 of this task updates it in the same commit.

**Files:**
- Modify: `src/outcome.rs` (variant shape, three pattern-match arms, two existing tests, four new tests, two constructors)
- Modify: `src/history.rs` (the ingest arm, in Step 5)

- [ ] **Step 1: Write failing tests for the new constructor**

In `src/outcome.rs`, inside the existing `#[cfg(test)] mod tests` block, add at the end:

```rust
#[test]
fn scores_with_sigma_round_trips() {
    let o = Outcome::scores_with_sigma([10.0, 4.0], 0.5);
    assert_eq!(o.team_count(), 2);
    assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
}

#[test]
fn scores_constructor_leaves_sigma_unset() {
    // After the variant change, the public Outcome::scores constructor
    // must build with sigma: None. We assert this indirectly via a match
    // on the variant.
    let o = Outcome::scores([3.0, 1.0]);
    match o {
        Outcome::Scored { scores: _, sigma } => assert!(sigma.is_none()),
        Outcome::Ranked(_) => panic!("expected Scored variant"),
    }
}

#[test]
fn scores_with_sigma_sets_sigma_some() {
    let o = Outcome::scores_with_sigma([3.0, 1.0], 2.0);
    match o {
        Outcome::Scored { scores: _, sigma } => assert_eq!(sigma, Some(2.0)),
        Outcome::Ranked(_) => panic!("expected Scored variant"),
    }
}

// debug_assert! only fires when debug assertions are enabled (the default
// for `cargo test`), so skip this test in builds that disable them.
#[cfg(debug_assertions)]
#[test]
#[should_panic(expected = "score_sigma must be > 0.0")]
fn scores_with_sigma_rejects_zero() {
    let _ = Outcome::scores_with_sigma([3.0, 1.0], 0.0);
}
```
|
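The zero-sigma test above depends on `debug_assert!`, which is compiled out when debug assertions are disabled (for example under `cargo test --release` with default profile settings). The following is a minimal sketch of that behaviour, not the crate's code: `checked_sigma` and `validation_is_active` are hypothetical helpers introduced only for illustration.

```rust
/// Hypothetical stand-in for the validation inside `scores_with_sigma`.
fn checked_sigma(sigma: f64) -> f64 {
    // Compiled out entirely when debug assertions are disabled.
    debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
    sigma
}

/// True when this build would reject a non-positive sigma.
fn validation_is_active() -> bool {
    cfg!(debug_assertions)
}

// Only compiled for test builds that keep debug assertions on, so the
// expected panic cannot go missing in a release-profile test run.
#[cfg(debug_assertions)]
#[test]
#[should_panic(expected = "score_sigma must be > 0.0")]
fn checked_sigma_rejects_zero() {
    let _ = checked_sigma(0.0);
}
```

If rejecting a bad sigma in release builds ever matters, a plain `assert!` would be the usual alternative; the plan as written chooses `debug_assert!`.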

- [ ] **Step 2: Run the new tests to verify they fail**

Run: `cargo test --lib outcome::tests`

Expected: the build fails, so no test runs. Three of the new tests call `scores_with_sigma`, which doesn't exist yet, and two destructure `Scored { ... }`, which doesn't match the current tuple variant.

- [ ] **Step 3: Change the variant shape and update the constructor + accessors**

In `src/outcome.rs`, replace the entire `Outcome` enum and `impl Outcome` block (currently `src/outcome.rs:16-68`) with:

```rust
/// Final outcome of a match.
///
/// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
/// teams. `ranks.len()` must equal the number of teams in the event.
///
/// `Scored { scores, sigma }`: higher score = better. Adjacent (sorted) pairs
/// feed observed margins to `MarginFactor`. `scores.len()` must equal the
/// number of teams in the event. `sigma` overrides `HistoryBuilder::score_sigma`
/// when `Some`; `None` inherits the history default.
#[derive(Clone, Debug, PartialEq)]
#[non_exhaustive]
pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
    Scored {
        scores: SmallVec<[f64; 4]>,
        /// Per-event noise override. `None` means inherit
        /// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
        sigma: Option<f64>,
    },
}

impl Outcome {
    /// `n`-team outcome where team `winner` won and everyone else tied for last.
    ///
    /// Panics if `winner >= n`.
    pub fn winner(winner: u32, n: u32) -> Self {
        assert!(winner < n, "winner index {winner} out of range 0..{n}");
        let ranks: SmallVec<[u32; 4]> = (0..n).map(|i| if i == winner { 0 } else { 1 }).collect();
        Self::Ranked(ranks)
    }

    /// All `n` teams tied.
    pub fn draw(n: u32) -> Self {
        Self::Ranked(SmallVec::from_vec(vec![0; n as usize]))
    }

    /// Explicit per-team ranking.
    pub fn ranking<I: IntoIterator<Item = u32>>(ranks: I) -> Self {
        Self::Ranked(ranks.into_iter().collect())
    }

    /// Explicit per-team continuous scores; higher = better.
    /// Inherits `HistoryBuilder::score_sigma` for the noise model.
    pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: None,
        }
    }

    /// Explicit per-team continuous scores with a per-event noise override.
    ///
    /// `sigma` must be `> 0.0`; debug-asserts otherwise.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(scores: I, sigma: f64) -> Self {
        debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: Some(sigma),
        }
    }

    pub fn team_count(&self) -> usize {
        match self {
            Self::Ranked(r) => r.len(),
            Self::Scored { scores, .. } => scores.len(),
        }
    }

    pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
        match self {
            Self::Ranked(r) => Some(r),
            Self::Scored { .. } => None,
        }
    }

    pub(crate) fn as_scores(&self) -> Option<&[f64]> {
        match self {
            Self::Scored { scores, .. } => Some(scores),
            Self::Ranked(_) => None,
        }
    }
}
```

- [ ] **Step 4: Run the new tests**

Run: `cargo test --lib outcome::tests`

Expected: all outcome tests pass (the 6 pre-existing tests + 4 new = 10 total in the outcome tests module). Note that the lib crate will not compile until the `src/history.rs` ingest arm is updated, because the tuple pattern `Outcome::Scored(scores)` no longer matches the struct variant; if the build stops there, apply Step 5 first and re-run.

If any pre-existing test fails, the issue is in this task, not Task 2. Most likely cause: a pattern-match arm in the rewritten `impl Outcome` block doesn't compile. Re-check the struct-variant destructure syntax (`Self::Scored { scores, .. }` for read-only access; `Self::Scored { scores, sigma }` when both fields are needed).
|
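The two destructure forms named above can be seen side by side in a small sketch. It uses a simplified stand-in for the enum (an assumption for illustration: `Vec` replaces `SmallVec` so the block has no dependencies, and `sigma_override` is a hypothetical helper that is not part of the plan).

```rust
/// Simplified stand-in for the plan's enum; `Vec` replaces `SmallVec`
/// so the sketch has no external dependencies.
#[derive(Debug)]
enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

/// Read-only access to one field: `..` ignores the rest.
fn team_count(outcome: &Outcome) -> usize {
    match outcome {
        Outcome::Ranked(ranks) => ranks.len(),
        Outcome::Scored { scores, .. } => scores.len(),
    }
}

/// Hypothetical helper that names the `sigma` field explicitly.
fn sigma_override(outcome: &Outcome) -> Option<f64> {
    match outcome {
        Outcome::Scored { scores: _, sigma } => *sigma,
        Outcome::Ranked(_) => None,
    }
}
```

The old tuple pattern `Outcome::Scored(scores)` is rejected by the compiler once the variant has named fields, which is why every destructuring site has to change in the same commit.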

- [ ] **Step 5: Update `History::add_events` ingest arm to destructure the new variant**

The variant change from Step 3 breaks the existing `Outcome::Scored(scores)` pattern match in `src/history.rs:735`. Fix it now (in the same commit): the codebase must build at every commit boundary.

In `src/history.rs`, find the `crate::Outcome::Scored(scores) => { ... }` arm (currently at `src/history.rs:735-740`). Replace with:

```rust
crate::Outcome::Scored { scores, sigma } => {
    let resolved = sigma.unwrap_or(self.score_sigma);
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {resolved})"
    );
    kinds.push(EventKind::Scored {
        score_sigma: resolved,
    });
    scores.to_vec()
}
```

The surrounding `match &ev.outcome { ... }` and the surrounding flow (the `ranks` arm above, the `results.push(event_result);` below) stay unchanged.
|
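The resolution rule in this arm can be exercised in isolation. The sketch below is a reduced model under stated assumptions: both enums are simplified stand-ins (the `EventKind::Ranked` variant and the `resolve_kind` function are hypothetical), and only the `unwrap_or` fallback mirrors the plan.

```rust
/// Hypothetical, reduced version of the public outcome type.
enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

/// Hypothetical, reduced version of the internal per-event payload.
#[derive(Debug, PartialEq)]
enum EventKind {
    Ranked,
    Scored { score_sigma: f64 },
}

/// Resolves the per-event override against the history-wide default,
/// so the stored payload is always a plain `f64`.
fn resolve_kind(outcome: &Outcome, history_default: f64) -> EventKind {
    match outcome {
        Outcome::Ranked(_) => EventKind::Ranked,
        Outcome::Scored { sigma, .. } => {
            // `Some(s)` wins; `None` falls back to the default.
            let resolved = sigma.unwrap_or(history_default);
            debug_assert!(
                resolved > 0.0,
                "resolved score_sigma must be > 0.0 (got {resolved})"
            );
            EventKind::Scored { score_sigma: resolved }
        }
    }
}
```

Because the `Option` is collapsed at this boundary, nothing downstream needs to know whether a value was inherited or overridden.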

- [ ] **Step 6: Run the full library test suite (bit-equal regression net)**

Run: `cargo build && cargo test --lib && cargo test`

Expected: clean build. All 100 pre-existing lib tests, the 4 new outcome tests from Step 1 (104 lib total), and the 27 integration tests pass. The goldens stay bit-equal: every existing scored-event constructor uses the no-override path (`Outcome::scores(...)` or `EventBuilder::scores(...)`), which now resolves to `sigma: None → resolved = self.score_sigma`, exactly the previous behavior.

If unexpected additional compile errors surface (any site pattern-matching `Outcome::Scored(...)` outside the 735 arm), STOP and report: the plan's inventory is wrong, and that should be surfaced as a finding before continuing.

If any existing test fails: investigate. Most likely cause is a typo in the new pattern arms (Step 3) or the resolution rule (Step 5). The override path isn't exercised yet by any existing test, so the only thing that can break is the inheritance path.

- [ ] **Step 7: Format and lint**

Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`

Expected: no diff, no warnings.

- [ ] **Step 8: Commit**

```bash
git add src/outcome.rs src/history.rs
git commit -m "$(cat <<'EOF'
feat(outcome): per-event score_sigma override on Outcome::Scored

Outcome::Scored shape changes from tuple to struct:
{ scores, sigma: Option<f64> }. New constructor scores_with_sigma
sets sigma=Some(s) and debug-asserts s > 0.0; existing scores(I)
constructor keeps its signature and builds with sigma=None internally.
team_count, as_scores, as_ranks accessor pattern matches updated.

History::add_events resolves sigma.unwrap_or(self.score_sigma) at the
ingest arm, so downstream EventKind::Scored stays a plain f64 and
TimeSlice / run_chain need zero changes.

Breaking change to the public Outcome::Scored variant shape
(acceptable in 0.1.x). Bit-equal for callers using the no-override
path because the resolution falls through to self.score_sigma exactly
as before.
EOF
)"
```

---

### Task 2: `EventBuilder::scores_with_sigma` builder method

The override path is fully wired by Task 1, but it's only reachable via the `Outcome::scores_with_sigma` constructor (passed into `History::add_events` directly). The fluent-builder form, `h.event(t).team(...).scores_with_sigma(scores, sigma).commit()`, needs one new method on `EventBuilder`.

**Files:**
- Modify: `src/event_builder.rs` (new builder method)

- [ ] **Step 1: Add the EventBuilder method**

In `src/event_builder.rs`, find the existing `scores` method (currently at `src/event_builder.rs:79-82`). Immediately below it (still inside `impl<'h, T, D, O, K> EventBuilder<...>`), add:

```rust
    /// Set explicit per-team continuous scores with a per-event noise override.
    ///
    /// `sigma` overrides `HistoryBuilder::score_sigma` for this event only.
    /// Must be `> 0.0`; debug-asserts otherwise via `Outcome::scores_with_sigma`.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
        self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
        self
    }
```
|
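The method above follows the consuming-builder pattern: it takes `mut self`, overwrites the pending outcome, and returns `Self` so calls can be chained. A toy sketch of that pattern follows; `ToyOutcome` and `ToyEventBuilder` are hypothetical types invented for illustration and say nothing about the real `EventBuilder`'s fields or generics.

```rust
/// Toy outcome type; only the shape matters for this sketch.
#[derive(Debug, Default, PartialEq)]
struct ToyOutcome {
    scores: Vec<f64>,
    sigma: Option<f64>,
}

/// Toy builder that owns the event under construction.
#[derive(Debug, Default)]
struct ToyEventBuilder {
    teams: Vec<String>,
    outcome: ToyOutcome,
}

impl ToyEventBuilder {
    /// Appends a team and hands the builder back for chaining.
    fn team(mut self, name: &str) -> Self {
        self.teams.push(name.to_string());
        self
    }

    /// No-override path: `sigma` stays `None`.
    fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
        self.outcome = ToyOutcome { scores: scores.into_iter().collect(), sigma: None };
        self
    }

    /// Override path: same scores, plus an explicit per-event sigma.
    fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
        self.outcome = ToyOutcome { scores: scores.into_iter().collect(), sigma: Some(sigma) };
        self
    }
}
```

Keeping `scores` and `scores_with_sigma` as sibling methods means existing call chains compile unchanged, which matches the plan's goal of an additive API.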

- [ ] **Step 2: Build and run the test suite**

Run: `cargo build && cargo test --lib && cargo test`

Expected: clean build, all 104 lib tests (100 baseline + 4 from Task 1) and 27 integration tests pass. The new method is additive: no behavior changes for existing tests.

- [ ] **Step 3: Format and lint**

Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`

Expected: no diff, no warnings.

- [ ] **Step 4: Commit**

```bash
git add src/event_builder.rs
git commit -m "$(cat <<'EOF'
feat(event_builder): expose scores_with_sigma fluent method

Adds EventBuilder::scores_with_sigma, the fluent-builder ergonomic
mirror of Outcome::scores_with_sigma. Lets users write
h.event(t).team(...).team(...).scores_with_sigma([..], sigma).commit()
to set a per-event score_sigma override.
EOF
)"
```

---

### Task 3: End-to-end integration tests

**Files:**
- Modify: `src/history.rs` (three new tests in the existing `#[cfg(test)] mod tests` block at the bottom)

- [ ] **Step 1: Locate the test module**

Run: `grep -n "^#\[cfg(test)\]" src/history.rs`

Identify the test module (there should be one near the bottom of the file). Read its imports and look at neighboring tests to see the existing builder/event-construction pattern in current use. Mirror that pattern in the new tests below; the surface syntax (`History::builder()`, `event(t).team(...)`, `learning_curves()`, etc.) must match what already works in this file.

- [ ] **Step 2: Write the failing tests**

Add the following three tests at the end of the existing `#[cfg(test)] mod tests` block in `src/history.rs` (just before the module's closing `}`):

```rust
#[test]
fn outcome_scores_default_sigma_uses_history_default() {
    use crate::Outcome;

    // Path A: explicit sigma=0.5 via override.
    let mut h_a = crate::History::builder().score_sigma(0.5).build();
    h_a.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores_with_sigma([3.0, 1.0], 0.5),
    }])
    .unwrap();
    h_a.converge().unwrap();

    // Path B: history-wide default 0.5, no per-event override.
    let mut h_b = crate::History::builder().score_sigma(0.5).build();
    h_b.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores([3.0, 1.0]),
    }])
    .unwrap();
    h_b.converge().unwrap();

    // Inheritance: posteriors must be bit-equal.
    let curves_a = h_a.learning_curves();
    let curves_b = h_b.learning_curves();
    for (key, a_pts) in curves_a.iter() {
        let b_pts = curves_b.get(key).expect("agent missing in path B");
        for (a, b) in a_pts.iter().zip(b_pts.iter()) {
            assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
            assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
        }
    }
}

#[test]
fn outcome_scores_with_sigma_overrides_history_default() {
    use crate::Outcome;

    // Path A: history-wide default 0.5, per-event override 2.0.
    let mut h_a = crate::History::builder().score_sigma(0.5).build();
    h_a.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
    }])
    .unwrap();
    h_a.converge().unwrap();

    // Path B: history-wide default 2.0, no per-event override.
    let mut h_b = crate::History::builder().score_sigma(2.0).build();
    h_b.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores([3.0, 1.0]),
    }])
    .unwrap();
    h_b.converge().unwrap();

    // Override == default-set-to-the-override-value: bit-equal.
    let curves_a = h_a.learning_curves();
    let curves_b = h_b.learning_curves();
    for (key, a_pts) in curves_a.iter() {
        let b_pts = curves_b.get(key).expect("agent missing in path B");
        for (a, b) in a_pts.iter().zip(b_pts.iter()) {
            assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
            assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
        }
    }

    // Path C: history-wide default 0.5, no override. Different sigma → different posteriors.
    let mut h_c = crate::History::builder().score_sigma(0.5).build();
    h_c.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores([3.0, 1.0]),
    }])
    .unwrap();
    h_c.converge().unwrap();

    let curves_c = h_c.learning_curves();
    let mut max_diff: f64 = 0.0;
    for (key, a_pts) in curves_a.iter() {
        let c_pts = curves_c.get(key).expect("agent missing in path C");
        for (a, c) in a_pts.iter().zip(c_pts.iter()) {
            max_diff = max_diff.max((a.1.mu() - c.1.mu()).abs());
            max_diff = max_diff.max((a.1.sigma() - c.1.sigma()).abs());
        }
    }
    assert!(
        max_diff > 1e-6,
        "override should produce different posteriors from inherited default; max_diff={max_diff}"
    );
}

#[test]
fn event_builder_scores_with_sigma_threading() {
    use crate::Outcome;

    // Path A: builder fluent API with sigma override.
    let mut h_a = crate::History::builder().score_sigma(0.5).build();
    h_a.event(0_i64)
        .team(["a"])
        .team(["b"])
        .scores_with_sigma([3.0, 1.0], 2.0)
        .commit()
        .unwrap();
    h_a.converge().unwrap();

    // Path B: same outcome via the explicit Outcome constructor.
    let mut h_b = crate::History::builder().score_sigma(0.5).build();
    h_b.add_events([crate::Event {
        time: 0_i64,
        teams: smallvec::smallvec![
            crate::Team::with_members([crate::Member::new("a")]),
            crate::Team::with_members([crate::Member::new("b")]),
        ],
        outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
    }])
    .unwrap();
    h_b.converge().unwrap();

    let curves_a = h_a.learning_curves();
    let curves_b = h_b.learning_curves();
    for (key, a_pts) in curves_a.iter() {
        let b_pts = curves_b.get(key).expect("agent missing");
        for (a, b) in a_pts.iter().zip(b_pts.iter()) {
            assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
            assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
        }
    }
}
```

If the surface API (e.g. `History::add_events`, `Event { time, teams, outcome }`, `Team::with_members`, `Member::new`, `event(...).team(...).commit()`, `learning_curves()`) doesn't exactly match what's available in the test module, look at neighboring tests for the patterns currently in use and adjust. The CONTRACT is: build two Histories that should produce identical posteriors, run them, compare. The surface syntax must follow what compiles in this file.
|
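The comparison half of that contract can be factored into a helper. The sketch below is one possible shape, under loud assumptions: `Point` (a `(pi, tau)` pair), the `HashMap<String, Vec<Point>>` curve layout, and `bit_equal_curves` are all hypothetical and are not the return type of `learning_curves()`. It also checks curve lengths, since `zip` on its own stops at the shorter side, and it compares bit patterns via `f64::to_bits`, which is stricter than `==`.

```rust
use std::collections::HashMap;

/// Hypothetical posterior summary: (pi, tau) natural parameters.
type Point = (f64, f64);

/// Returns the first mismatch as a message, or `Ok(())` when every
/// agent has the same number of points with identical bit patterns.
fn bit_equal_curves(
    a: &HashMap<String, Vec<Point>>,
    b: &HashMap<String, Vec<Point>>,
) -> Result<(), String> {
    if a.len() != b.len() {
        return Err(format!("agent count differs: {} vs {}", a.len(), b.len()));
    }
    for (key, a_pts) in a {
        let b_pts = b.get(key).ok_or_else(|| format!("agent {key:?} missing"))?;
        // `zip` alone would silently stop at the shorter curve.
        if a_pts.len() != b_pts.len() {
            return Err(format!("curve length differs for agent {key:?}"));
        }
        for (i, (pa, pb)) in a_pts.iter().zip(b_pts).enumerate() {
            let same = pa.0.to_bits() == pb.0.to_bits() && pa.1.to_bits() == pb.1.to_bits();
            if !same {
                return Err(format!("mismatch at agent {key:?}, point {i}"));
            }
        }
    }
    Ok(())
}
```

A helper like this would also shorten the three tests, which currently repeat the same nested loop.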

- [ ] **Step 3: Run the new tests**

Run: `cargo test --lib -- outcome_scores_default_sigma_uses_history_default outcome_scores_with_sigma_overrides_history_default event_builder_scores_with_sigma_threading`

(`cargo test` accepts only one test-name filter before `--`; additional filters must be passed to the test binary after it.)

Expected: 3 passed.

**Fallback if Test 2's `max_diff > 1e-6` fails** (sigma=0.5 vs sigma=2.0 produces nearly identical posteriors, which is unlikely on a single 2-team scored event but possible if the priors dominate): use a larger gap, e.g. `Outcome::scores_with_sigma([3.0, 1.0], 5.0)` vs `Outcome::scores([3.0, 1.0])` with `score_sigma(0.5)`. The point is to prove the resolution path actually engages: any sigma gap that produces a measurable posterior difference is fine.
|
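Why a sigma gap should move the posterior can be seen in a toy conjugate-Gaussian update, where an observation with noise `sigma` contributes precision `1 / sigma^2`. This is only an intuition aid under simplifying assumptions: it is a scalar model, not the crate's `MarginFactor` or its message passing, and the prior values used in the example are made up.

```rust
/// Posterior (mean, std dev) of a scalar with prior N(prior_mu, prior_sigma^2)
/// after one observation `obs` with Gaussian noise of std dev `obs_sigma`.
fn gaussian_update(prior_mu: f64, prior_sigma: f64, obs: f64, obs_sigma: f64) -> (f64, f64) {
    // Work in natural parameters: precision pi and precision-weighted mean tau.
    let prior_pi = 1.0 / (prior_sigma * prior_sigma);
    let obs_pi = 1.0 / (obs_sigma * obs_sigma);
    let post_pi = prior_pi + obs_pi;
    let post_tau = prior_mu * prior_pi + obs * obs_pi;
    (post_tau / post_pi, (1.0 / post_pi).sqrt())
}
```

With a wide prior, a smaller `obs_sigma` pulls the mean closer to the observation and shrinks the posterior spread; as the prior precision grows relative to `obs_pi`, the two sigmas give increasingly similar results, which is the "priors dominate" case the fallback guards against.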

- [ ] **Step 4: Run the full test suite**

Run: `cargo test --lib && cargo test`

Expected: lib count = 107 (100 baseline + 4 from Task 1 + 3 here), integration count = 27 (unchanged), all passing.

- [ ] **Step 5: Format and lint**

Run: `cargo +nightly fmt && cargo clippy --all-targets -- -D warnings`

Expected: no diff, no warnings.

- [ ] **Step 6: Commit**

```bash
git add src/history.rs
git commit -m "$(cat <<'EOF'
test(history): end-to-end per-event score_sigma override tests

Three integration tests on a 2-team scored event:
- inheritance: Outcome::scores(...) with no override produces
  bit-equal posteriors to the same outcome wrapped in
  scores_with_sigma(scores, history.score_sigma)
- override-supersedes-default: scores_with_sigma(scores, X) with
  history score_sigma(Y) produces bit-equal posteriors to
  scores(...) with history score_sigma(X), AND differs measurably
  from scores(...) with history score_sigma(Y)
- builder threading: EventBuilder::scores_with_sigma reaches the
  ingest path identically to the Outcome constructor
EOF
)"
```

---

## Self-review (writer's note)

**Spec coverage:**
- Spec § "What ships" item 1 (Scored becomes struct variant) → Task 1 step 3 ✓
- Spec § "What ships" item 2 (scores_with_sigma constructor) → Task 1 step 3 ✓
- Spec § "What ships" item 3 (EventBuilder::scores_with_sigma) → Task 2 step 1 ✓
- Spec § "What ships" item 4 (sigma resolution at ingest) → Task 1 step 5 ✓
- Spec § "What ships" item 5 (pattern-match update inventory) → Task 1 step 5 (single site at history.rs:735) ✓
- Spec § "Validation" (debug_assert at constructor) → Task 1 step 3 (in `scores_with_sigma`) ✓
- Spec § "Validation" (debug_assert at ingest) → Task 1 step 5 ✓
- Spec § "Testing strategy" §1 (regression net) → Task 1 step 6, Task 2 step 2, Task 3 step 4 ✓
- Spec § "Testing strategy" §2 test 1 (default-uses-history-default) → Task 3 step 2 test 1 ✓
- Spec § "Testing strategy" §2 test 2 (override-supersedes-default) → Task 3 step 2 test 2 ✓
- Spec § "Testing strategy" §2 test 3 (builder threading) → Task 3 step 2 test 3 ✓

**Out-of-scope items correctly absent:** No `EventKind::Scored` change, no `TimeSlice`/`run_chain` changes, no `Game::scored` standalone API change, no deprecation of `HistoryBuilder::score_sigma`.

**Type / signature consistency:**
- `Outcome::Scored { scores: SmallVec<[f64; 4]>, sigma: Option<f64> }`: Task 1 step 3 (def) and Task 1 step 5 (destructure) match ✓
- `Outcome::scores_with_sigma<I>(scores: I, sigma: f64) -> Outcome`: Task 1 step 3 (def) and Task 2 step 1 (call) match ✓
- `EventBuilder::scores_with_sigma<I>(mut self, scores: I, sigma: f64) -> Self`: Task 2 step 1 (def) and Task 3 step 2 test 3 (call) match ✓
- `sigma.unwrap_or(self.score_sigma)` resolution rule: Task 1 step 5 ✓

**Task split rationale:** Task 1 lands the foundational shape change AND the ingest resolution atomically, so every commit boundary builds and tests pass bit-equal. Task 2 is the small additive EventBuilder method, separated for review-focus reasons (it's the user-facing fluent API exposure). Task 3 is purely additive integration tests. Each task is independently committable; no intermediate non-building state.

**No placeholders detected.**

@@ -0,0 +1,292 @@
|
||||
# Per-Event `score_sigma` Override
|
||||
|
||||
## Summary
|
||||
|
||||
Let users specify a per-event noise override on `Outcome::Scored`.
|
||||
Today every scored event in a `History` shares the single
|
||||
`HistoryBuilder::score_sigma` value (default `1.0`); a user who wants
|
||||
to say "this match was a clean blowout, trust the margin more" or
|
||||
"this one was a disrupted scrappy game, trust it less" has no way to
|
||||
do so.
|
||||
|
||||
The override is resolved at ingest time and stored as a plain `f64`
|
||||
on the existing `EventKind::Scored { score_sigma }` payload, so
|
||||
`TimeSlice` and `run_chain` need zero changes. The work is purely on
|
||||
the public API surface: `Outcome::Scored` becomes a struct variant
|
||||
with an `Option<f64> sigma` field; two builder methods on `Outcome`
|
||||
and `EventBuilder` cover the explicit-override path.
|
||||
|
||||
## Background
|
||||
|
||||
`Outcome::Scored(SmallVec<[f64; 4]>)` is the public per-team-score
|
||||
variant (`src/outcome.rs:20`). It's constructed via
|
||||
`Outcome::scores(I)` (`src/outcome.rs:44`) or
|
||||
`EventBuilder::scores(I)` (`src/event_builder.rs:79`).
|
||||
|
||||
When `History::add_events` ingests a Scored outcome, it always uses
|
||||
the history-wide default:
|
||||
|
||||
```rust
|
||||
// src/history.rs:735-740
|
||||
crate::Outcome::Scored(scores) => {
|
||||
kinds.push(EventKind::Scored {
|
||||
score_sigma: self.score_sigma,
|
||||
});
|
||||
scores.to_vec()
|
||||
}
|
||||
```
|
||||
|
||||
The downstream `EventKind::Scored { score_sigma: f64 }`
|
||||
(`src/time_slice.rs:51`) is already per-event-shaped — every Event
|
||||
carries its own copy. The constraint is purely at the ingest boundary.
|
||||
|
||||
This was flagged as deferred tech debt during the T4-MarginFactor
|
||||
work: "EventKind::Scored.score_sigma payload is always history-wide
|
||||
today; per-event override deferred."
|
||||
|
||||
## Scope

### What ships

1. `Outcome::Scored` becomes a struct variant:
   `Scored { scores: SmallVec<[f64; 4]>, sigma: Option<f64> }`.
   `None` = use history default; `Some(s)` = override.
2. New constructor `Outcome::scores_with_sigma(scores, sigma)`.
   The existing `Outcome::scores(I)` keeps the same signature but
   builds with `sigma: None`.
3. New builder method `EventBuilder::scores_with_sigma(scores, sigma)`.
4. `History::add_events` resolves `sigma.unwrap_or(self.score_sigma)`
   when converting an `Outcome::Scored` to `EventKind::Scored`.
5. Mechanical pattern-match updates at every site that destructures
   `Outcome::Scored(...)` as a tuple. Estimate ~5–10 sites across
   `src/`, `tests/`, `examples/`, `benches/`.

### What does not ship

- No change to `EventKind::Scored` (already per-event).
- No change to `TimeSlice` or `run_chain`.
- No change to the standalone `Game::scored` API
  (it still takes `score_sigma` via `GameOptions::score_sigma`).
- No deprecation of `HistoryBuilder::score_sigma`; the history-wide
  default is still useful as a common-case fallback.

## Design

### `Outcome` enum change

```rust
// src/outcome.rs
#[derive(Clone, Debug)]
pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
    Scored {
        scores: SmallVec<[f64; 4]>,
        /// Per-event noise override. `None` means inherit
        /// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
        sigma: Option<f64>,
    },
}
```

The variant shape changes from tuple to struct. Pattern matches that
extract the scores switch from `Outcome::Scored(scores)` to
`Outcome::Scored { scores, .. }` (or `{ scores, sigma }` where the
sigma is needed).

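
To make the tuple-to-struct migration concrete, here is a minimal sketch of the two match shapes. It assumes a stand-in `Outcome` that uses `Vec` instead of `SmallVec` so it compiles without dependencies; the free functions `team_count` and `override_sigma` are illustrative names, not the crate's API.

```rust
/// Stand-in for the crate's `Outcome`; `Vec` replaces `SmallVec`
/// (an assumption) so the sketch has no dependencies.
#[derive(Clone, Debug, PartialEq)]
pub enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

/// Sites that only need the scores ignore `sigma` with `..`.
/// With the tuple variant this arm read `Outcome::Scored(s) => s.len()`.
pub fn team_count(outcome: &Outcome) -> usize {
    match outcome {
        Outcome::Ranked(ranks) => ranks.len(),
        Outcome::Scored { scores, .. } => scores.len(),
    }
}

/// Sites that need the override bind `sigma` by name (hypothetical helper).
pub fn override_sigma(outcome: &Outcome) -> Option<f64> {
    match outcome {
        Outcome::Scored { sigma, .. } => *sigma,
        Outcome::Ranked(_) => None,
    }
}
```

Naming the fields means a later addition to the variant only touches the sites that bind every field explicitly; arms that use `..` keep compiling.
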
### `Outcome` constructors

```rust
impl Outcome {
    /// Per-team continuous scores; uses HistoryBuilder::score_sigma default.
    pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: None,
        }
    }

    /// Per-team scores with explicit per-event noise override.
    ///
    /// `sigma` must be `> 0.0`; debug-asserts otherwise.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
        scores: I,
        sigma: f64,
    ) -> Self {
        debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: Some(sigma),
        }
    }
}
```

`Outcome::scores(I)` keeps its existing function signature exactly;
the only change is internal, in how it constructs the variant. The
existing `as_scores()`, `team_count()`, etc. accessors keep their
signatures (they return `Option<&[f64]>` and `usize`); their
internal pattern matches update mechanically.

### `EventBuilder` method

```rust
impl<'h, T, D, O, K> EventBuilder<'h, T, D, O, K>
where
    T: Time,
    D: Drift<T>,
    O: Observer<T>,
    K: Eq + std::hash::Hash + Clone,
{
    /// Per-team scores; uses HistoryBuilder::score_sigma default.
    pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
        self.event.outcome = crate::Outcome::scores(scores);
        self
    }

    /// Per-team scores with explicit per-event noise override.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
        mut self,
        scores: I,
        sigma: f64,
    ) -> Self {
        self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
        self
    }
}
```

The existing `.scores(...)` builder method stays; its body needs no
change because `Outcome::scores(I)` keeps the same signature.
`.scores_with_sigma(...)` is the new method.

### Sigma resolution

In `History::add_events` at `src/history.rs:735`:

```rust
crate::Outcome::Scored { scores, sigma } => {
    let resolved = sigma.unwrap_or(self.score_sigma);
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {resolved})"
    );
    kinds.push(EventKind::Scored {
        score_sigma: resolved,
    });
    scores.to_vec()
}
```

Resolution at ingest time means downstream code keeps a plain `f64`.
No `Option` propagates further.

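
The resolution rule can be exercised in isolation. The sketch below pulls it into a free function, `resolve_score_sigma`, which is a hypothetical name used only for illustration; in the spec the logic stays inline in `History::add_events`.

```rust
/// Hypothetical free function mirroring the resolution rule; the spec
/// keeps this logic inline in `History::add_events`.
pub fn resolve_score_sigma(event_sigma: Option<f64>, history_default: f64) -> f64 {
    // `None` inherits the history-wide default; `Some(s)` overrides it.
    let resolved = event_sigma.unwrap_or(history_default);
    // Checked only when debug assertions are enabled, like the spec's check.
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {})",
        resolved
    );
    resolved
}
```

Because the function returns a plain `f64`, callers downstream of the ingest boundary never see the `Option`.
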
### Validation

- `Outcome::scores_with_sigma(_, sigma)` debug-asserts `sigma > 0.0`
  at construction.
- `History::add_events` debug-asserts the resolved sigma is `> 0.0`
  (catches both inherited and overridden paths).
- `HistoryBuilder::score_sigma(s)` keeps its existing positive
  assertion.

Because the first two are `debug_assert!` checks, they are compiled
out of release builds by default.

The default sigma at the History level (`1.0`) is positive, so an
event with `sigma = None` against a default-built History always
passes the resolved-sigma assertion.

### Pattern-match update inventory

Every site that destructures `Outcome::Scored(_)` as a tuple needs
updating. Known sites:

- `src/outcome.rs`: the `team_count()`, `as_ranks()`, `as_scores()`
  match arms (`src/outcome.rs:51`, `:58`, `:64`).
- `src/history.rs:735`: the conversion arm (this is also where the
  resolution rule lands).
- Any test in the `src/outcome.rs` test mod that constructs
  `Outcome::Scored(...)` literally.
- Any callsite in `src/` (including `src/game.rs`), `tests/`,
  `examples/`, or `benches/` that pattern-matches the variant.

The compiler surfaces every site at `cargo build --all-targets`
(plain `cargo build` does not compile tests, examples, or benches).
Locating them is mechanical.

## Testing strategy

### Regression net

The existing 100 lib + 27 integration tests are the bit-equal regression
net for the `sigma = None` path. Every existing test that uses
`Outcome::scores(...)` or `EventBuilder::scores(...)` should
continue to produce identical posteriors: the resolved sigma equals
the history default, which is the value the hardcoded path used.

### New tests

Three additions in the `src/history.rs` test module:

1. **`outcome_scores_default_sigma_uses_history_default`**: build a
   History with `score_sigma(0.5)`, add a 2-team event via
   `Outcome::scores([3.0, 1.0])` (no override), capture posteriors.
   Build a second History identical except using
   `Outcome::scores_with_sigma([3.0, 1.0], 0.5)` (override matches
   default). Assert posteriors are bit-equal across the two paths.

2. **`outcome_scores_with_sigma_overrides_history_default`**: build a
   History with `score_sigma(0.5)`, add an event via
   `Outcome::scores_with_sigma([3.0, 1.0], 2.0)`. Build a second
   History with `score_sigma(2.0)` and add the same event via
   `Outcome::scores([3.0, 1.0])`. Assert posteriors are bit-equal.
   Then build a third History with `score_sigma(0.5)` and add via
   `Outcome::scores([3.0, 1.0])` (no override). Assert that the third
   History's posteriors differ measurably from the override path
   (max diff > 1e-6), which proves the override actually changes
   inference.

3. **`event_builder_scores_with_sigma_threading`**: same shape as
   test 2 but constructed via the fluent builder
   `h.event(0).team(["a"]).team(["b"]).scores_with_sigma([3.0, 1.0], 2.0).commit()`.
   Proves the builder method works end-to-end.

### Pattern-match update test impact

Existing tests in `src/outcome.rs` that construct
`Outcome::Scored(...)` literally need updating to the struct shape.
Mechanical change; no new tests required.

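
The two kinds of assertion used by the new tests, bit-equality and a measurable difference, can be sketched as small helpers. Both `bit_equal` and `max_abs_diff` are hypothetical names for illustration; the tests in this change compare `pi()`/`tau()` with `assert_eq!` and accumulate the max difference inline.

```rust
/// Strict bit-for-bit equality. Unlike `==`, this distinguishes
/// `0.0` from `-0.0` and treats identical NaN bit patterns as equal.
pub fn bit_equal(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}

/// Largest absolute element-wise difference between two slices,
/// compared pairwise up to the shorter length.
pub fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).abs())
        .fold(0.0_f64, f64::max)
}
```

A bit-equality check suits the paths that must run identical arithmetic (inherited vs. matching override), while the `max_abs_diff(..) > 1e-6` style suits the path that must diverge.
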
## Verification gates

```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test --lib
cargo test
```

Test count grows by 3.

## Risks

- **Public API breaking change.** `Outcome::Scored` variant shape
  changes from tuple to struct. Any downstream consumer
  pattern-matching on the tuple form breaks. In a 0.1.x crate this
  is acceptable; flag it in the commit message.
- **Mechanical breadth.** The pattern-match updates touch several
  files. They're all caught by the compiler, so the risk is low, but
  the diff will look bigger than the actual logical change.
- **Two ways to do the same thing.** `Outcome::scores_with_sigma(..)`
  and `EventBuilder::scores_with_sigma(..)` both produce the same
  outcome. This is intentional: the constructor is the underlying
  primitive; the builder method is the ergonomic wrapper. Same
  pattern as the existing `Outcome::scores(..)` /
  `EventBuilder::scores(..)` pair.

## Out-of-scope follow-ups

- Per-event override of other config currently history-wide
  (`p_draw`, drift, beta): the same architectural pattern would apply,
  but each is its own design decision.
- Validation upgrade from `debug_assert!` to a `Result` at the
  Outcome construction boundary.
- Schedule trait integration with `run_chain`, `Residual` schedule,
  `SynergyFactor` (still pending from the larger spec).

@@ -81,6 +81,15 @@ where
         self
     }
 
+    /// Set explicit per-team continuous scores with a per-event noise override.
+    ///
+    /// `sigma` overrides `HistoryBuilder::score_sigma` for this event only.
+    /// Must be `> 0.0`; debug-asserts otherwise via `Outcome::scores_with_sigma`.
+    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(mut self, scores: I, sigma: f64) -> Self {
+        self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
+        self
+    }
+
     /// Mark team `winner_idx` as winner; others tied for last.
     pub fn winner(mut self, winner_idx: u32) -> Self {
         self.event.outcome = Outcome::winner(winner_idx, self.event.teams.len() as u32);

+31
-2
@@ -53,7 +53,11 @@ impl Gaussian {
 
     #[inline]
     pub fn mu(&self) -> f64 {
-        if self.pi == 0.0 {
+        // A non-positive precision is an improper (uninformative) Gaussian — its mean is
+        // undefined. Treat it like `pi == 0` and return 0. EP message cancellation can land
+        // `pi` on a tiny negative value (round-off of exactly zero); without this guard
+        // `tau / pi` would yield a spurious finite mean.
+        if self.pi <= 0.0 {
             0.0
         } else {
             self.tau / self.pi
@@ -62,7 +66,10 @@ impl Gaussian {
 
     #[inline]
     pub fn sigma(&self) -> f64 {
-        if self.pi == 0.0 {
+        // A non-positive precision is improper → infinite standard deviation. Guarding
+        // `pi <= 0.0` (not just `== 0.0`) keeps `1.0 / pi.sqrt()` from returning NaN when EP
+        // cancellation produces a tiny negative precision (round-off of exactly zero).
+        if self.pi <= 0.0 {
             f64::INFINITY
         } else if self.pi.is_infinite() {
             0.0
@@ -174,6 +181,28 @@ impl ops::Div<Gaussian> for Gaussian {
 mod tests {
     use super::*;
 
+    #[test]
+    fn non_positive_precision_is_improper_not_nan() {
+        // EP message cancellation can leave `pi` a tiny negative (round-off of exactly zero).
+        // Such a Gaussian is improper/uninformative: mu() must be 0 and sigma() infinite, not
+        // NaN. A NaN here propagates through the moment-space `Sub` in the game chain and
+        // poisons every skill in the slice.
+        let tiny_neg = Gaussian::from_natural(-5.55e-17, -8.88e-16);
+        assert_eq!(tiny_neg.mu(), 0.0);
+        assert!(tiny_neg.sigma().is_infinite());
+
+        // A frankly-negative precision is treated the same way.
+        let neg = Gaussian::from_natural(-1.0, 2.0);
+        assert_eq!(neg.mu(), 0.0);
+        assert!(neg.sigma().is_infinite());
+
+        // Subtracting such a message must not produce NaN (the original failure path).
+        let proper = Gaussian::from_ms(9.75, 1.256);
+        let diff = proper - tiny_neg;
+        assert!(diff.pi().is_finite() && !diff.pi().is_nan());
+        assert!(diff.tau().is_finite() && !diff.tau().is_nan());
+    }
+
     #[test]
     fn test_add() {
         let n = Gaussian::from_ms(25.0, 25.0 / 3.0);

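
Why `== 0.0` was not enough can be shown with a stand-alone sketch. `NaturalGaussian` below is a hypothetical stand-in (its field layout is an assumption, not the crate's `Gaussian`), and the final `else` branch of `sigma()` is inferred from the `1.0 / pi.sqrt()` mentioned in the comments above.

```rust
/// Hypothetical natural-parameter Gaussian: `pi` is the precision
/// (1 / sigma^2) and `tau` the precision-adjusted mean (mu / sigma^2).
#[derive(Clone, Copy, Debug)]
pub struct NaturalGaussian {
    pub pi: f64,
    pub tau: f64,
}

impl NaturalGaussian {
    /// Unguarded standard deviation: NaN for any negative precision,
    /// because the square root of a negative `f64` is NaN.
    pub fn sigma_unguarded(&self) -> f64 {
        1.0 / self.pi.sqrt()
    }

    /// Guarded version following the diff: a non-positive precision is
    /// improper, so the standard deviation is reported as infinite.
    pub fn sigma(&self) -> f64 {
        if self.pi <= 0.0 {
            f64::INFINITY
        } else if self.pi.is_infinite() {
            0.0
        } else {
            1.0 / self.pi.sqrt()
        }
    }

    /// Guarded mean: improper Gaussians report 0 instead of `tau / pi`.
    pub fn mu(&self) -> f64 {
        if self.pi <= 0.0 {
            0.0
        } else {
            self.tau / self.pi
        }
    }
}
```

With the tiny negative precision from the test above, the unguarded path returns NaN, while the guarded accessors return an infinite sigma and a zero mean.
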
+156
-2
@@ -732,9 +732,14 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
                     kinds.push(EventKind::Ranked);
                     ranks.iter().map(|&r| max_rank - r as f64).collect()
                 }
-                crate::Outcome::Scored(scores) => {
+                crate::Outcome::Scored { scores, sigma } => {
+                    let resolved = sigma.unwrap_or(self.score_sigma);
+                    debug_assert!(
+                        resolved > 0.0,
+                        "resolved score_sigma must be > 0.0 (got {resolved})"
+                    );
                     kinds.push(EventKind::Scored {
-                        score_sigma: self.score_sigma,
+                        score_sigma: resolved,
                     });
                     scores.to_vec()
                 }
@@ -1807,4 +1812,153 @@ mod tests {
             "α=0.5 should reach the same fixed point as α=1.0; max_diff={max_diff}"
         );
     }
+
+    #[test]
+    fn outcome_scores_default_sigma_uses_history_default() {
+        use crate::Outcome;
+
+        // Path A: explicit sigma=0.5 via override.
+        let mut h_a = crate::History::builder().score_sigma(0.5).build();
+        h_a.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores_with_sigma([3.0, 1.0], 0.5),
+        }])
+        .unwrap();
+        h_a.converge().unwrap();
+
+        // Path B: history-wide default 0.5, no per-event override.
+        let mut h_b = crate::History::builder().score_sigma(0.5).build();
+        h_b.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores([3.0, 1.0]),
+        }])
+        .unwrap();
+        h_b.converge().unwrap();
+
+        // Inheritance: posteriors must be bit-equal.
+        let curves_a = h_a.learning_curves();
+        let curves_b = h_b.learning_curves();
+        for (key, a_pts) in curves_a.iter() {
+            let b_pts = curves_b.get(key).expect("agent missing in path B");
+            for (a, b) in a_pts.iter().zip(b_pts.iter()) {
+                assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
+                assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
+            }
+        }
+    }
+
+    #[test]
+    fn outcome_scores_with_sigma_overrides_history_default() {
+        use crate::Outcome;
+
+        // Path A: history-wide default 0.5, per-event override 2.0.
+        let mut h_a = crate::History::builder().score_sigma(0.5).build();
+        h_a.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
+        }])
+        .unwrap();
+        h_a.converge().unwrap();
+
+        // Path B: history-wide default 2.0, no per-event override.
+        let mut h_b = crate::History::builder().score_sigma(2.0).build();
+        h_b.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores([3.0, 1.0]),
+        }])
+        .unwrap();
+        h_b.converge().unwrap();
+
+        // Override == default-set-to-the-override-value: bit-equal.
+        let curves_a = h_a.learning_curves();
+        let curves_b = h_b.learning_curves();
+        for (key, a_pts) in curves_a.iter() {
+            let b_pts = curves_b.get(key).expect("agent missing in path B");
+            for (a, b) in a_pts.iter().zip(b_pts.iter()) {
+                assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
+                assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
+            }
+        }
+
+        // Path C: history-wide default 0.5, no override. Different sigma → different posteriors.
+        let mut h_c = crate::History::builder().score_sigma(0.5).build();
+        h_c.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores([3.0, 1.0]),
+        }])
+        .unwrap();
+        h_c.converge().unwrap();
+
+        let curves_c = h_c.learning_curves();
+        let mut max_diff: f64 = 0.0;
+        for (key, a_pts) in curves_a.iter() {
+            let c_pts = curves_c.get(key).expect("agent missing in path C");
+            for (a, c) in a_pts.iter().zip(c_pts.iter()) {
+                max_diff = max_diff.max((a.1.mu() - c.1.mu()).abs());
+                max_diff = max_diff.max((a.1.sigma() - c.1.sigma()).abs());
+            }
+        }
+        assert!(
+            max_diff > 1e-6,
+            "override should produce different posteriors from inherited default; max_diff={max_diff}"
+        );
+    }
+
+    #[test]
+    fn event_builder_scores_with_sigma_threading() {
+        use crate::Outcome;
+
+        // Path A: builder fluent API with sigma override.
+        let mut h_a = crate::History::builder().score_sigma(0.5).build();
+        h_a.event(0_i64)
+            .team(["a"])
+            .team(["b"])
+            .scores_with_sigma([3.0, 1.0], 2.0)
+            .commit()
+            .unwrap();
+        h_a.converge().unwrap();
+
+        // Path B: same outcome via the explicit Outcome constructor.
+        let mut h_b = crate::History::builder().score_sigma(0.5).build();
+        h_b.add_events([crate::Event {
+            time: 0_i64,
+            teams: smallvec::smallvec![
+                crate::Team::with_members([crate::Member::new("a")]),
+                crate::Team::with_members([crate::Member::new("b")]),
+            ],
+            outcome: Outcome::scores_with_sigma([3.0, 1.0], 2.0),
+        }])
+        .unwrap();
+        h_b.converge().unwrap();
+
+        let curves_a = h_a.learning_curves();
+        let curves_b = h_b.learning_curves();
+        for (key, a_pts) in curves_a.iter() {
+            let b_pts = curves_b.get(key).expect("agent missing");
+            for (a, b) in a_pts.iter().zip(b_pts.iter()) {
+                assert_eq!(a.1.pi(), b.1.pi(), "mismatch at agent {key:?}");
+                assert_eq!(a.1.tau(), b.1.tau(), "mismatch at agent {key:?}");
+            }
+        }
+    }
 }

+62
-10
@@ -1,7 +1,7 @@
 //! Outcome of a match.
 //!
-//! `Ranked(ranks)` for ordinal results; `Scored(scores)` for continuous
-//! per-team scores (engages `MarginFactor` in the engine).
+//! `Ranked(ranks)` for ordinal results; `Scored { scores, sigma }` for
+//! continuous per-team scores (engages `MarginFactor` in the engine).
 
 use smallvec::SmallVec;
 
@@ -10,14 +10,20 @@ use smallvec::SmallVec;
 /// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
 /// teams. `ranks.len()` must equal the number of teams in the event.
 ///
-/// `Scored(scores)`: higher score = better. Adjacent (sorted) pairs feed
-/// observed margins to `MarginFactor`. `scores.len()` must equal the number
-/// of teams in the event.
+/// `Scored { scores, sigma }`: higher score = better. Adjacent (sorted) pairs
+/// feed observed margins to `MarginFactor`. `scores.len()` must equal the
+/// number of teams in the event. `sigma` overrides `HistoryBuilder::score_sigma`
+/// when `Some`; `None` inherits the history default.
 #[derive(Clone, Debug, PartialEq)]
 #[non_exhaustive]
 pub enum Outcome {
     Ranked(SmallVec<[u32; 4]>),
-    Scored(SmallVec<[f64; 4]>),
+    Scored {
+        scores: SmallVec<[f64; 4]>,
+        /// Per-event noise override. `None` means inherit
+        /// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
+        sigma: Option<f64>,
+    },
 }
 
 impl Outcome {
@@ -41,27 +47,42 @@ impl Outcome {
     }
 
     /// Explicit per-team continuous scores; higher = better.
+    /// Inherits `HistoryBuilder::score_sigma` for the noise model.
     pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
-        Self::Scored(scores.into_iter().collect())
+        Self::Scored {
+            scores: scores.into_iter().collect(),
+            sigma: None,
+        }
     }
 
+    /// Explicit per-team continuous scores with a per-event noise override.
+    ///
+    /// `sigma` must be `> 0.0`; debug-asserts otherwise.
+    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(scores: I, sigma: f64) -> Self {
+        debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
+        Self::Scored {
+            scores: scores.into_iter().collect(),
+            sigma: Some(sigma),
+        }
+    }
+
     pub fn team_count(&self) -> usize {
         match self {
             Self::Ranked(r) => r.len(),
-            Self::Scored(s) => s.len(),
+            Self::Scored { scores, .. } => scores.len(),
         }
     }
 
     pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
         match self {
             Self::Ranked(r) => Some(r),
-            Self::Scored(_) => None,
+            Self::Scored { .. } => None,
         }
     }
 
     pub(crate) fn as_scores(&self) -> Option<&[f64]> {
         match self {
-            Self::Scored(s) => Some(s),
+            Self::Scored { scores, .. } => Some(scores),
             Self::Ranked(_) => None,
         }
     }
@@ -122,4 +143,35 @@ mod tests {
         assert!(o.as_scores().is_none());
         assert!(o.as_ranks().is_some());
     }
+
+    #[test]
+    fn scores_with_sigma_round_trips() {
+        let o = Outcome::scores_with_sigma([10.0, 4.0], 0.5);
+        assert_eq!(o.team_count(), 2);
+        assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
+    }
+
+    #[test]
+    fn scores_constructor_leaves_sigma_unset() {
+        let o = Outcome::scores([3.0, 1.0]);
+        match o {
+            Outcome::Scored { scores: _, sigma } => assert!(sigma.is_none()),
+            Outcome::Ranked(_) => panic!("expected Scored variant"),
+        }
+    }
+
+    #[test]
+    fn scores_with_sigma_sets_sigma_some() {
+        let o = Outcome::scores_with_sigma([3.0, 1.0], 2.0);
+        match o {
+            Outcome::Scored { scores: _, sigma } => assert_eq!(sigma, Some(2.0)),
+            Outcome::Ranked(_) => panic!("expected Scored variant"),
+        }
+    }
+
+    #[test]
+    #[should_panic(expected = "score_sigma must be > 0.0")]
+    fn scores_with_sigma_rejects_zero() {
+        let _ = Outcome::scores_with_sigma([3.0, 1.0], 0.0);
+    }
 }

@@ -0,0 +1,71 @@
+//! Regression: a single time slice with many distinct competitors must converge to finite
+//! skills. Before the `pi <= 0` guard in `Gaussian::mu()/sigma()`, EP message cancellation
+//! produced a tiny-negative precision whose `sigma() = 1/sqrt(pi)` was NaN, which the
+//! moment-space `Sub` in the game chain propagated into every skill once the slice grew past
+//! ~75 competitors (e.g. a real ranking dataset with hundreds of players).
+use trueskill_tt::{ConstantDrift, ConvergenceOptions, EPSILON, History, ITERATIONS, NullObserver};
+
+/// Tiny deterministic LCG — avoids a dev-dependency on `rand`.
+struct Lcg(u64);
+impl Lcg {
+    fn next(&mut self) -> u64 {
+        self.0 = self
+            .0
+            .wrapping_mul(6364136223846793005)
+            .wrapping_add(1442695040888963407);
+        self.0
+    }
+    fn below(&mut self, n: usize) -> usize {
+        (self.next() >> 33) as usize % n
+    }
+    fn coin(&mut self) -> bool {
+        self.next() & 1 == 0
+    }
+}
+
+fn nan_after_fit(players: usize) -> usize {
+    let mut h: History<i64, ConstantDrift, NullObserver, String> = History::builder_with_key()
+        .beta(1.0)
+        .sigma(6.0)
+        .drift(ConstantDrift(0.1))
+        .convergence(ConvergenceOptions {
+            max_iter: ITERATIONS,
+            epsilon: EPSILON,
+            ..Default::default()
+        })
+        .build();
+
+    let ids: Vec<String> = (0..players).map(|i| format!("p{i:04}")).collect();
+    let mut rng = Lcg(1);
+    for _ in 0..(players * 4) {
+        let a = rng.below(players);
+        let mut b = rng.below(players - 1);
+        if b >= a {
+            b += 1;
+        }
+        let (w, l) = if rng.coin() { (a, b) } else { (b, a) };
+        h.record_winner(&ids[w], &ids[l], 0).unwrap();
+    }
+    h.converge().unwrap();
+
+    ids.iter()
+        .filter(|id| {
+            h.current_skill(id.as_str())
+                .map(|g| !g.mu().is_finite() || !g.sigma().is_finite())
+                .unwrap_or(true)
+        })
+        .count()
+}
+
+#[test]
+fn many_competitors_converge_to_finite_skills() {
+    // The NaN regression onset was between 70 and 80 competitors; 250 is comfortably past it
+    // and in the range of a real ranking dataset.
+    for players in [12usize, 75, 150, 250] {
+        assert_eq!(
+            nan_after_fit(players),
+            0,
+            "{players}-competitor history produced NaN skills"
+        );
+    }
+}

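
The pair-drawing step in `nan_after_fit` picks two distinct players without rejection sampling. A stand-alone sketch of that step follows; `distinct_pair` is a hypothetical helper name (the test inlines the logic), and the LCG reuses the constants from the test.

```rust
/// Tiny deterministic LCG with the same constants as the regression test.
pub struct Lcg(pub u64);

impl Lcg {
    pub fn next(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }

    /// Index in `0..n`, taken from the high bits of the state; the low
    /// bits of a power-of-two-modulus LCG have short periods.
    pub fn below(&mut self, n: usize) -> usize {
        (self.next() >> 33) as usize % n
    }
}

/// Draw two distinct indices in `0..n` with exactly two RNG calls
/// (hypothetical helper; assumes `n >= 2`).
pub fn distinct_pair(rng: &mut Lcg, n: usize) -> (usize, usize) {
    assert!(n >= 2, "need at least two players");
    let a = rng.below(n);
    // Draw from the n - 1 remaining slots, then step over `a`.
    let mut b = rng.below(n - 1);
    if b >= a {
        b += 1;
    }
    (a, b)
}
```

Shifting `b` past `a` maps the `n - 1` draws one-to-one onto the indices other than `a`, so no match ever pairs a player with themselves and the number of RNG calls stays fixed.
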