docs: spec for per-event score_sigma override

Outcome::Scored becomes a struct variant with an Option<f64> sigma
field. None inherits HistoryBuilder::score_sigma; Some(s) overrides
per event. Resolved at ingest time so EventKind::Scored stays a plain
f64 and TimeSlice/run_chain need zero changes. New constructors
Outcome::scores_with_sigma and EventBuilder::scores_with_sigma cover
the override path; existing scores(..) keeps its signature with
sigma=None internally.

Breaking change to Outcome::Scored variant shape (tuple → struct);
acceptable in 0.1.x. Closes the last item from the T4-MarginFactor
deferred wishlist.
2026-05-08 16:05:27 +02:00
parent 68be7ab5b7
commit 46625d247a
@@ -0,0 +1,292 @@
# Per-Event `score_sigma` Override
## Summary
Let users specify a per-event noise override on `Outcome::Scored`.
Today every scored event in a `History` shares the single
`HistoryBuilder::score_sigma` value (default `1.0`); a user who wants
to say "this match was a clean blowout, trust the margin more" or
"this one was a disrupted scrappy game, trust it less" has no way to
do so.
The override is resolved at ingest time and stored as a plain `f64`
on the existing `EventKind::Scored { score_sigma }` payload, so
`TimeSlice` and `run_chain` need zero changes. The work is purely on
the public API surface: `Outcome::Scored` becomes a struct variant
with a `sigma: Option<f64>` field, and one new method each on
`Outcome` and `EventBuilder` covers the explicit-override path.
## Background
`Outcome::Scored(SmallVec<[f64; 4]>)` is the public per-team-score
variant (`src/outcome.rs:20`). It's constructed via
`Outcome::scores(I)` (`src/outcome.rs:44`) or
`EventBuilder::scores(I)` (`src/event_builder.rs:79`).
When `History::add_events` ingests a Scored outcome, it always uses
the history-wide default:
```rust
// src/history.rs:735-740
crate::Outcome::Scored(scores) => {
    kinds.push(EventKind::Scored {
        score_sigma: self.score_sigma,
    });
    scores.to_vec()
}
```
The downstream `EventKind::Scored { score_sigma: f64 }`
(`src/time_slice.rs:51`) is already per-event-shaped — every Event
carries its own copy. The constraint is purely at the ingest boundary.
This was flagged as deferred tech debt during the T4-MarginFactor
work: "EventKind::Scored.score_sigma payload is always history-wide
today; per-event override deferred."
## Scope
### What ships
1. `Outcome::Scored` becomes a struct variant:
`Scored { scores: SmallVec<[f64; 4]>, sigma: Option<f64> }`.
`None` = use history default; `Some(s)` = override.
2. New constructor `Outcome::scores_with_sigma(scores, sigma)`.
Existing `Outcome::scores(I)` keeps its signature but builds with
`sigma: None`.
3. New builder method `EventBuilder::scores_with_sigma(scores, sigma)`.
4. `History::add_events` resolves `sigma.unwrap_or(self.score_sigma)`
when converting an `Outcome::Scored` to `EventKind::Scored`.
5. Mechanical pattern-match updates at every site that destructures
`Outcome::Scored(...)` as a tuple. Estimate: roughly 5 to 10 sites
across `src/`, `tests/`, `examples/`, `benches/`.
### What does not ship
- No change to `EventKind::Scored` (already per-event).
- No change to `TimeSlice` or `run_chain`.
- No change to `Game::scored` standalone API
(it still takes `score_sigma` via `GameOptions::score_sigma`).
- No deprecation of `HistoryBuilder::score_sigma` — the history-wide
default is still useful as a common-case fallback.
## Design
### `Outcome` enum change
```rust
// src/outcome.rs
#[derive(Clone, Debug)]
pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
    Scored {
        scores: SmallVec<[f64; 4]>,
        /// Per-event noise override. `None` means inherit
        /// `HistoryBuilder::score_sigma`. Must be `> 0.0` if `Some`.
        sigma: Option<f64>,
    },
}
```
The variant shape changes from tuple to struct. Pattern matches that
extract the scores switch from `Outcome::Scored(scores)` to
`Outcome::Scored { scores, .. }` (or `{ scores, sigma }` where the
sigma is needed).
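As a rough illustration of the migration at a consumer call site, the sketch below shows both pattern forms. It is not code from the crate: `Vec` stands in for `SmallVec` so the block builds with std only, and `total_score` and `describe` are hypothetical helpers invented for this example.

```rust
// Reduced stand-in for the spec's `Outcome`; `Vec` replaces `SmallVec`
// (an assumption made only to keep the sketch std-only).
#[derive(Clone, Debug)]
pub enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

/// Hypothetical helper that only needs the scores, so it skips `sigma`
/// with `..`. The old tuple form was `Outcome::Scored(scores)`.
pub fn total_score(outcome: &Outcome) -> Option<f64> {
    match outcome {
        Outcome::Scored { scores, .. } => Some(scores.iter().sum()),
        Outcome::Ranked(_) => None,
    }
}

/// Hypothetical helper that binds both fields.
pub fn describe(outcome: &Outcome) -> String {
    match outcome {
        Outcome::Scored { scores, sigma: Some(s) } => {
            format!("{} scores, sigma override {s}", scores.len())
        }
        Outcome::Scored { scores, sigma: None } => {
            format!("{} scores, history default sigma", scores.len())
        }
        Outcome::Ranked(ranks) => format!("{} ranks", ranks.len()),
    }
}
```

Call sites that never look at the override can use the `..` form and stay unaffected if further fields are added to the variant later.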
### `Outcome` constructors
```rust
impl Outcome {
    /// Per-team continuous scores; uses HistoryBuilder::score_sigma default.
    pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: None,
        }
    }

    /// Per-team scores with explicit per-event noise override.
    ///
    /// `sigma` must be > 0.0; checked with `debug_assert!`.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
        scores: I,
        sigma: f64,
    ) -> Self {
        debug_assert!(sigma > 0.0, "score_sigma must be > 0.0 (got {sigma})");
        Self::Scored {
            scores: scores.into_iter().collect(),
            sigma: Some(sigma),
        }
    }
}
```
`Outcome::scores(I)` keeps its existing function signature exactly;
the only change is internal, where it now builds the struct variant
with `sigma: None`. The existing `as_scores()`, `team_count()`, etc.
accessors keep their public signatures (they return `Option<&[f64]>`
and `usize`); their internal pattern matches update mechanically.
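To make "update mechanically" concrete, here is a minimal sketch of what the two accessors named above might look like after the change. The method bodies are assumptions (the spec only states the return types), and `Vec` again stands in for `SmallVec`.

```rust
// Reduced stand-in for the spec's `Outcome`; `Vec` replaces `SmallVec`.
#[derive(Clone, Debug)]
pub enum Outcome {
    Ranked(Vec<u32>),
    Scored { scores: Vec<f64>, sigma: Option<f64> },
}

impl Outcome {
    /// Same public signature as before; only the pattern changes from
    /// `Self::Scored(scores)` to `Self::Scored { scores, .. }`.
    pub fn as_scores(&self) -> Option<&[f64]> {
        match self {
            Self::Scored { scores, .. } => Some(scores.as_slice()),
            Self::Ranked(_) => None,
        }
    }

    /// Number of teams, taken from whichever payload the variant holds
    /// (assumed semantics for this sketch).
    pub fn team_count(&self) -> usize {
        match self {
            Self::Ranked(ranks) => ranks.len(),
            Self::Scored { scores, .. } => scores.len(),
        }
    }
}
```

Callers of these accessors would see no difference, which is why the struct-variant change stays confined to sites that match on the variant directly.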
### `EventBuilder` method
```rust
impl<'h, T, D, O, K> EventBuilder<'h, T, D, O, K>
where
    T: Time,
    D: Drift<T>,
    O: Observer<T>,
    K: Eq + std::hash::Hash + Clone,
{
    /// Per-team scores; uses HistoryBuilder::score_sigma default.
    pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
        self.event.outcome = crate::Outcome::scores(scores);
        self
    }

    /// Per-team scores with explicit per-event noise override.
    pub fn scores_with_sigma<I: IntoIterator<Item = f64>>(
        mut self,
        scores: I,
        sigma: f64,
    ) -> Self {
        self.event.outcome = crate::Outcome::scores_with_sigma(scores, sigma);
        self
    }
}
```
The existing `.scores(...)` builder method stays; its body needs no
change because `Outcome::scores(I)` keeps the same signature.
`.scores_with_sigma(...)` is the new method.
### Sigma resolution
In `History::add_events` at `src/history.rs:735`:
```rust
crate::Outcome::Scored { scores, sigma } => {
    let resolved = sigma.unwrap_or(self.score_sigma);
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {resolved})"
    );
    kinds.push(EventKind::Scored {
        score_sigma: resolved,
    });
    scores.to_vec()
}
```
Resolution at ingest time means downstream code keeps a plain `f64`.
No `Option` propagates further.
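The resolution rule can be exercised in isolation. The sketch below is a standalone approximation, not the crate's code: `resolve_sigma` and `to_event_kind` are hypothetical free functions, and `EventKind` is reduced to the single variant the spec mentions.

```rust
// Reduced stand-in: only the `Scored` payload described in the spec.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EventKind {
    Scored { score_sigma: f64 },
}

/// Hypothetical helper: the per-event override wins, otherwise fall
/// back to the history-wide default.
pub fn resolve_sigma(event_sigma: Option<f64>, history_sigma: f64) -> f64 {
    let resolved = event_sigma.unwrap_or(history_sigma);
    debug_assert!(
        resolved > 0.0,
        "resolved score_sigma must be > 0.0 (got {resolved})"
    );
    resolved
}

/// Hypothetical helper: downstream code only ever sees a plain `f64`.
pub fn to_event_kind(event_sigma: Option<f64>, history_sigma: f64) -> EventKind {
    EventKind::Scored {
        score_sigma: resolve_sigma(event_sigma, history_sigma),
    }
}
```

With a history default of `0.5`, an event carrying `None` resolves to `0.5` and an event carrying `Some(2.0)` resolves to `2.0`; in both cases the `Option` is gone by the time the `EventKind` exists.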
### Validation
- `Outcome::scores_with_sigma(_, sigma)` debug-asserts `sigma > 0.0`
at construction.
- `History::add_events` debug-asserts the resolved sigma is `> 0.0`
(catches both inherited and overridden paths).
- `HistoryBuilder::score_sigma(s)` keeps its existing positive
assertion.
The default sigma at the History level (`1.0`) is positive, so an
event with `sigma = None` against a default-built History always
passes the resolved-sigma assertion trivially.
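For reference, the predicate behind these assertions behaves as follows on edge values. `is_valid_sigma` is a hypothetical name used only for this sketch; the spec itself inlines `sigma > 0.0` inside `debug_assert!`.

```rust
/// Hypothetical helper mirroring the `sigma > 0.0` check used in the
/// assertions. Comparisons with NaN are always false, so NaN is
/// rejected; positive infinity passes this particular check.
pub fn is_valid_sigma(sigma: f64) -> bool {
    sigma > 0.0
}
```

Because `debug_assert!` is compiled out when debug assertions are disabled (the default for release profiles), these checks guard development and test builds only.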
### Pattern-match update inventory
Every site that destructures `Outcome::Scored(_)` as a tuple needs
updating. Known sites:
- `src/outcome.rs`: the `team_count()`, `as_scores()`, `as_ranks()`
match arms (`src/outcome.rs:51`, `:58`, `:64`).
- `src/history.rs:735`: the conversion arm (this is also where the
resolution rule lands).
- Any test in `src/outcome.rs` test mod that constructs
`Outcome::Scored(...)` literally.
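- Any other call site in `src/` (including `src/game.rs`), `tests/`,
`examples/`, or `benches/` that pattern-matches the variant.
The compiler surfaces every site once all targets are built
(`cargo build --all-targets`; plain `cargo build` skips tests,
examples, and benches). Locating them is mechanical.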
## Testing strategy
### Regression net
The existing 100 lib + 27 integration tests are the bit-equal
regression net for the `sigma = None` path. Every existing test that
uses `Outcome::scores(...)` or `EventBuilder::scores(...)` should
continue to produce identical posteriors, because the resolved sigma
equals the history default, which is exactly the value the current
hardcoded path uses.
### New tests
Three additions in the `src/history.rs` test module:
1. **`outcome_scores_default_sigma_uses_history_default`** — build a
History with `score_sigma(0.5)`, add a 2-team event via
`Outcome::scores([3.0, 1.0])` (no override), capture posteriors.
Build a second History identical except using
`Outcome::scores_with_sigma([3.0, 1.0], 0.5)` (override matches
default). Assert posteriors are bit-equal across the two paths.
2. **`outcome_scores_with_sigma_overrides_history_default`** — build a
History with `score_sigma(0.5)`, add an event via
`Outcome::scores_with_sigma([3.0, 1.0], 2.0)`. Build a second
History with `score_sigma(2.0)` and add the same event via
`Outcome::scores([3.0, 1.0])`. Assert posteriors are bit-equal.
Then build a third History with `score_sigma(0.5)` and add via
`Outcome::scores([3.0, 1.0])` (no override). Assert this third
one's posteriors differ measurably from the override path
(max diff > 1e-6) — proves the override actually changes
inference.
3. **`event_builder_scores_with_sigma_threading`** — same shape as
#2 but constructed via the fluent builder
`h.event(0).team(["a"]).team(["b"]).scores_with_sigma([3.0, 1.0], 2.0).commit()`.
Proves the builder method works end-to-end.
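The two comparisons these tests rely on ("bit-equal" and "max diff > 1e-6") could be expressed with helpers along these lines. This is a sketch under assumptions: `bit_equal` and `max_abs_diff` are hypothetical names, and posteriors are assumed to be flattened into `f64` slices before comparison.

```rust
/// Hypothetical helper: true only if both slices have the same length
/// and every pair of values has an identical bit pattern. Stricter than
/// `==`, which treats `0.0` and `-0.0` as equal.
pub fn bit_equal(a: &[f64], b: &[f64]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

/// Hypothetical helper: largest absolute element-wise difference over
/// the common prefix of the two slices (0.0 if either is empty).
pub fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).abs())
        .fold(0.0, f64::max)
}
```

Tests 1 and 2 would assert `bit_equal(..)` between the paired histories, and test 2 would additionally assert `max_abs_diff(..) > 1e-6` between the override path and the no-override path.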
### Pattern-match update test impact
Existing tests in `src/outcome.rs` that construct
`Outcome::Scored(...)` literally need updating to the struct shape.
Mechanical change; no new tests required.
## Verification gates
```bash
cargo +nightly fmt
cargo clippy --all-targets -- -D warnings
cargo test --lib
cargo test
```
Test count grows by 3.
## Risks
- **Public API breaking change.** `Outcome::Scored` variant shape
changes from tuple to struct. Any downstream consumer
pattern-matching on the tuple form breaks. In a 0.1.x crate this
is acceptable; flag it in the commit message.
- **Mechanical breadth.** The pattern-match updates touch several
files. They're all caught by the compiler so the risk is low, but
the diff will look bigger than the actual logical change.
- **Two ways to do the same thing.** `Outcome::scores_with_sigma(..)`
and `EventBuilder::scores_with_sigma(..)` both produce the same
outcome. This is intentional — the constructor is the underlying
primitive; the builder method is the ergonomic wrapper. Same
pattern as the existing `Outcome::scores(..)` /
`EventBuilder::scores(..)` pair.
## Out-of-scope follow-ups
- Per-event override of other config currently history-wide
(`p_draw`, drift, beta) — same architectural pattern would apply
but each is its own design decision.
- Validation upgrade from `debug_assert!` to a `Result` at the
Outcome construction boundary.
- Schedule trait integration with `run_chain`, `Residual` schedule,
`SynergyFactor` (still pending from the larger spec).