Anders Olsson 8b53cacd64 T4 (MarginFactor): scored outcomes via Gaussian-margin EP evidence
Adds soft Gaussian-observation evidence on the per-pair diff variable,
enabling continuous score margins as a richer alternative to ranks.

Public API:
- `Outcome::Scored([scores])` (non-breaking enum extension under
  `#[non_exhaustive]`).
- `Game::scored(teams, outcome, options)` constructor parallel to
  `Game::ranked`.
- `EventBuilder::scores([...])` fluent helper.
- `HistoryBuilder::score_sigma(σ)` knob (default 1.0, validated > 0).
- `GameOptions::score_sigma`.
- `EventKind` re-exported from `lib.rs` (annotated `#[non_exhaustive]`).
- New `InferenceError::InvalidParameter { name, value }` variant.

Internals:
- `MarginFactor` (`factor/margin.rs`): Gaussian observation factor that
  closes in one EP step; cavity-cached log-evidence mirrors `TruncFactor`.
- `BuiltinFactor::Margin` dispatch arm.
- `DiffFactor` enum in `game.rs` lets `Game::likelihoods` and the new
  `likelihoods_scored` share the per-pair link abstraction.
- Per-event `EventKind { Ranked, Scored { score_sigma } }` routed through
  `TimeSlice::add_events`, `iteration_direct`, and `log_evidence`.

Tests: 88 lib + 27 integration (4 new in `tests/scored.rs`); existing
goldens byte-identical.  Bench: `benches/scored.rs` baseline ~960µs for
60 events × 20-player pool with default convergence.

Plan: docs/superpowers/plans/2026-04-27-t4-margin-factor.md
Spec item marked Done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 08:47:36 +02:00

TrueSkill - Through Time

Rust port of TrueSkillThroughTime.py.

Other implementations

Drift

Skill drift models how a player's true skill can change between appearances. Each time a player reappears after a gap, their skill uncertainty is widened by the drift model before the new evidence is incorporated.

Drift is represented by the Drift trait:

pub trait Drift: Copy + Debug {
    fn variance_delta(&self, elapsed: i64) -> f64;
}

variance_delta returns the amount to add to σ² given the elapsed time since the player last played. Internally, Gaussian::forget uses this to compute the new sigma: σ_new = sqrt(σ² + variance_delta).
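The widening step described above can be sketched in isolation. This is a minimal illustration, not the crate's code: the trait is copied from the definition above so the block compiles on its own, while `forget_sigma` and `FixedPerStep` are hypothetical names invented here to stand in for `Gaussian::forget` and a concrete drift model.

```rust
/// Stand-in for the crate's `Drift` trait, copied from the definition above
/// so that this sketch compiles without the crate.
pub trait Drift: Copy + std::fmt::Debug {
    fn variance_delta(&self, elapsed: i64) -> f64;
}

/// Hypothetical helper (not the crate's `Gaussian::forget`): widen `sigma`
/// by the variance the drift model adds over `elapsed` time units.
pub fn forget_sigma<D: Drift>(sigma: f64, drift: D, elapsed: i64) -> f64 {
    (sigma * sigma + drift.variance_delta(elapsed)).sqrt()
}

/// Toy drift for illustration only: a fixed variance per elapsed time unit.
#[derive(Clone, Copy, Debug)]
pub struct FixedPerStep(pub f64);

impl Drift for FixedPerStep {
    fn variance_delta(&self, elapsed: i64) -> f64 {
        elapsed as f64 * self.0
    }
}

fn main() {
    // sigma = 3 and added variance = 4 * 4 = 16, so sigma_new = sqrt(9 + 16).
    let widened = forget_sigma(3.0, FixedPerStep(4.0), 4);
    println!("sigma_new = {widened}");
}
```

Because variances add rather than standard deviations, an elapsed time of zero leaves σ unchanged, and a long gap widens σ by less than the raw `variance_delta` might suggest.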

ConstantDrift

The built-in ConstantDrift implements a linear random walk: skill variance grows in proportion to elapsed time, so σ itself grows with the square root of time:

variance_delta = elapsed * γ²

This is the standard TrueSkill Through Time model. Use it by passing a ConstantDrift(gamma) when constructing a Player:

use trueskill_tt::{Player, Gaussian, drift::ConstantDrift};

// gamma = 0.1 means skill can shift ~0.1 per time unit
let player = Player::new(Gaussian::from_ms(0.0, 6.0), 1.0, ConstantDrift(0.1));

Custom drift

Implement Drift to express any other model. For example, a drift whose growth slows over a long absence (the added variance grows with the square root of elapsed time instead of linearly):

use trueskill_tt::{Gaussian, Player, drift::Drift};

#[derive(Clone, Copy, Debug)]
struct SqrtDrift {
    gamma: f64,
}

impl Drift for SqrtDrift {
    fn variance_delta(&self, elapsed: i64) -> f64 {
        (elapsed as f64).sqrt() * self.gamma * self.gamma
    }
}

let player = Player::new(Gaussian::from_ms(0.0, 6.0), 1.0, SqrtDrift { gamma: 0.5 });

To use a custom drift type with History, use the .drift() builder method instead of .gamma():

use trueskill_tt::History;

// SqrtDrift is the custom drift type defined above.
let h = History::builder()
    .drift(SqrtDrift { gamma: 0.5 })
    .build();
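To get a feel for how the two drift shapes in this section differ, the sketch below tabulates the variance each one adds for a few gaps. It re-implements only the two formulas shown above as free functions (`linear_variance_delta` and `sqrt_variance_delta` are names made up for this illustration) and does not depend on the crate.

```rust
/// Variance added by a linear random walk (the ConstantDrift formula above).
fn linear_variance_delta(gamma: f64, elapsed: i64) -> f64 {
    elapsed as f64 * gamma * gamma
}

/// Variance added by the SqrtDrift example above.
fn sqrt_variance_delta(gamma: f64, elapsed: i64) -> f64 {
    (elapsed as f64).sqrt() * gamma * gamma
}

fn main() {
    let gamma = 0.5;
    for elapsed in [1_i64, 4, 16, 100] {
        println!(
            "elapsed={elapsed:>3}  linear={:.2}  sqrt={:.2}",
            linear_variance_delta(gamma, elapsed),
            sqrt_variance_delta(gamma, elapsed),
        );
    }
}
```

Both models agree at an elapsed time of 1. After that the square-root model adds progressively less variance than the linear one, which may suit data where long absences should not reset a player's rating to near-prior uncertainty.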

Scored outcomes

Use Outcome::scores([...]), or the .scores([...]) helper on the event builder as in the example below, when you have continuous per-team scores rather than just ranks. Adjacent score margins flow into a MarginFactor that adds soft Gaussian evidence about the latent performance difference. Configure HistoryBuilder::score_sigma(σ) to control how much you trust the margins (smaller σ = more trust; the default is 1.0 and σ must be positive).

use trueskill_tt::{History, Outcome};

let mut h = History::builder().score_sigma(2.0).build();
h.event(1)
    .team(["alice"])
    .team(["bob"])
    .scores([21.0, 9.0])
    .commit()
    .unwrap();
h.converge().unwrap();
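The effect of score_sigma can be illustrated with the textbook update for a Gaussian belief after one noisy Gaussian observation. This is only a sketch of the idea of "soft Gaussian evidence": `observe_margin` is a hypothetical function, the prior values are made up, and the crate's MarginFactor operates on EP messages and may scale or combine margins differently.

```rust
/// Posterior (mean, sigma) of a belief N(prior_mu, prior_sigma^2) after one
/// observation `observed` whose noise has standard deviation `score_sigma`.
/// Hypothetical helper for illustration; not part of the crate's API.
fn observe_margin(
    prior_mu: f64,
    prior_sigma: f64,
    observed: f64,
    score_sigma: f64,
) -> (f64, f64) {
    let prior_precision = 1.0 / (prior_sigma * prior_sigma);
    let obs_precision = 1.0 / (score_sigma * score_sigma);
    let post_precision = prior_precision + obs_precision;
    // Precision-weighted average of the prior mean and the observed margin.
    let post_mu = (prior_mu * prior_precision + observed * obs_precision) / post_precision;
    (post_mu, (1.0 / post_precision).sqrt())
}

fn main() {
    // Assumed prior on the performance difference: mean 0, sigma 6.
    // Observed margin from the example above: 21 - 9 = 12.
    for score_sigma in [0.5, 2.0, 8.0] {
        let (mu, sigma) = observe_margin(0.0, 6.0, 12.0, score_sigma);
        println!("score_sigma={score_sigma}: mu={mu:.2}, sigma={sigma:.2}");
    }
}
```

A small score_sigma pulls the posterior mean almost all the way to the observed margin and shrinks the uncertainty sharply. A large score_sigma leaves the belief close to the prior.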

Todo

  • Implement approx for Gaussian
  • Add more tests from TrueSkillThroughTime.jl
  • Add tests for quality() (Use sublee/trueskill as reference)
  • Benchmark Batch::iteration()
  • Time needs to be an enum so we can have multiple states (see batch::compute_elapsed())
  • Add examples (use same TrueSkillThroughTime.(py|jl))
  • Add Observer (see argmin for inspiration)