chore: Release trueskill-tt version 0.1.1

T4 (MarginFactor): scored outcomes via Gaussian-margin EP evidence
Adds soft Gaussian-observation evidence on the per-pair diff variable,
enabling continuous score margins as a richer alternative to ranks.

Public API:
- `Outcome::Scored([scores])` (non-breaking enum extension under `#[non_exhaustive]`).
- `Game::scored(teams, outcome, options)` constructor parallel to `Game::ranked`.
- `EventBuilder::scores([...])` fluent helper.
- `HistoryBuilder::score_sigma(σ)` knob (default 1.0, validated > 0).
- `GameOptions::score_sigma`.
- `EventKind` re-exported from `lib.rs` (annotated `#[non_exhaustive]`).
- New `InferenceError::InvalidParameter { name, value }` variant.

Internals:
- `MarginFactor` (`factor/margin.rs`): Gaussian observation factor that closes in one EP step; cavity-cached log-evidence mirrors `TruncFactor`.
- `BuiltinFactor::Margin` dispatch arm.
- `DiffFactor` enum in `game.rs` lets `Game::likelihoods` and the new `likelihoods_scored` share the per-pair link abstraction.
- Per-event `EventKind { Ranked, Scored { score_sigma } }` routed through `TimeSlice::add_events`, `iteration_direct`, and `log_evidence`.

Tests: 88 lib + 27 integration (4 new in `tests/scored.rs`); existing goldens byte-identical.

Bench: `benches/scored.rs` baseline ~960µs for 60 events × 20-player pool with default convergence.

Plan: docs/superpowers/plans/2026-04-27-t4-margin-factor.md
Spec item marked Done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:01:46 +02:00 · 2026-04-27 08:47:36 +02:00 · 2026-04-24 13:01:01 +00:00
28 changed files with 3760 additions and 105 deletions
@@ -2,6 +2,66 @@

 All notable changes to this project will be documented in this file.

+## Unreleased — T3 concurrency
+
+Adds rayon-backed parallel paths per Section 6 of
+`docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`.
+
+### Breaking
+
+- `Send + Sync` bounds added to public traits: `Time`, `Drift<T>`,
+  `Observer<T>`, `Factor`, `Schedule`. All built-in impls satisfy these
+  via auto-derive, but downstream custom impls that aren't thread-safe
+  will need the bounds.
+
+### New
+
+- Opt-in `rayon` cargo feature. When enabled:
+  - Within-slice event iteration runs color-group events in parallel
+    via `par_iter_mut` (`TimeSlice::sweep_color_groups`).
+  - `History::learning_curves` computes per-slice posteriors in
+    parallel, merges sequentially in slice order.
+  - `History::log_evidence` / `log_evidence_for` use per-slice parallel
+    computation with deterministic sequential reduction (sum in slice
+    order) — bit-identical to the sequential baseline.
+- `ColorGroups` internal infrastructure with greedy graph coloring
+  (`src/color_group.rs`). Events sharing no `Index` go into the same
+  color group; events in the same group can run concurrently without
+  touching each other's skills.
+- `tests/determinism.rs` asserts bit-identical posteriors across
+  `RAYON_NUM_THREADS={1, 2, 4, 8}`.
+- `benches/history_converge.rs` measures end-to-end convergence on
+  three workload shapes.
+
+### Performance notes
+
+- Default build (no rayon): `Batch::iteration` 23.23 µs — no regression
+  vs T2.
+- With `--features rayon`:
+  - 500 events / 100 competitors / 10 per slice: 1.0× speedup.
+  - 2000 events / 200 competitors / 20 per slice: 1.0× speedup.
+  - 5000 events in one slice / 50k competitors: **1.3× speedup.**
+- The spec targeted >2× speedup on 8-core offline converge. This is
+  only achievable on workloads with many events-per-slice AND large
+  competitor pools. **Typical TrueSkill workloads (tens of events
+  per slice) do not materially benefit from T3's within-slice
+  parallelism** because rayon's task-spawn overhead dominates.
+- Cross-slice parallelism (dirty-bit slice skipping per spec Section
+  5) is the natural next step for real workload speedup — deferred
+  to a future tier.
+
+### Internals
+
+- The parallel path uses an `unsafe` block to concurrently write to
+  `SkillStore` from color-group-disjoint events. Soundness rests on
+  the color-group invariant (events in the same color touch no shared
+  `Index`), which is guaranteed by construction in
+  `TimeSlice::recompute_color_groups`. Sequential path unchanged.
+- `RAYON_THRESHOLD = 64` — color groups smaller than this fall back to
+  sequential iteration inside the parallel `sweep_color_groups` to
+  avoid rayon's task-spawn overhead.
+- Thread-local `ScratchArena` per rayon worker thread.
+
 ## Unreleased — T2 new API surface

 Breaking: every renamed type and the new public API land together per
@@ -35,6 +35,7 @@ History  →  Batch[]  →  Game[]  →  teams/players
 - **`Player`** (`player.rs`) — static configuration: prior `Gaussian`, `beta` (performance noise), `gamma` (skill drift per time unit).
 - **`Gaussian`** (`gaussian.rs`) — core probability type. Stored as natural parameters (`pi = 1/sigma²`, `tau = mu/sigma²`). Arithmetic ops implement message multiplication/division in the factor graph.
 - **`message.rs`** — `TeamMessage` and `DiffMessage`: intermediate factor graph messages used inside `Game`.
+- **`MarginFactor`** (`factor/margin.rs`) — Gaussian observation factor on a diff variable; engaged by `Outcome::Scored`.
 - **`lib.rs`** — exports the public API (`Game`, `Gaussian`, `History`, `Player`) and standalone functions (`quality()`, `pdf()`, `cdf()`, `erfc()`). Also defines global defaults: `MU=0.0`, `SIGMA=6.0`, `BETA=1.0`, `GAMMA=0.03`, `P_DRAW=0.0`, `EPSILON=1e-6`, `ITERATIONS=30`.

 ### Key design points
@@ -1,6 +1,6 @@
 [package]
 name = "trueskill-tt"
-version = "0.1.0"
+version = "0.1.1"
 edition = "2024"

 [lib]
@@ -14,6 +14,14 @@ harness = false
 name = "gaussian"
 harness = false

+[[bench]]
+name = "history_converge"
+harness = false
+
+[[bench]]
+name = "scored"
+harness = false
+
 [dependencies]
 approx = { version = "0.5.1", optional = true }
 rayon = { version = "1", optional = true }
@@ -71,6 +71,27 @@ let h = History::builder()
    .build();
 ```

+## Scored outcomes
+
+Use `Outcome::scores([...])` when you have continuous per-team scores rather
+than just ranks. Adjacent score margins flow into a `MarginFactor` that adds
+soft Gaussian evidence about the latent performance diff. Configure
+`HistoryBuilder::score_sigma(σ)` to control how much you trust the margins
+(smaller σ = more trust).
+
+```rust
+use trueskill_tt::History;
+
+let mut h = History::builder().score_sigma(2.0).build();
+h.event(1)
+    .team(["alice"])
+    .team(["bob"])
+    .scores([21.0, 9.0])
+    .commit()
+    .unwrap();
+h.converge().unwrap();
+```
+
 ## Todo

 - [x] Implement approx for Gaussian
@@ -98,3 +98,35 @@ Gaussian::tau             260.80 ps    (unchanged)
 #   learning_curves_by_index(), nested-Vec public add_events().
 # - 90 tests green: 68 lib + 10 api_shape + 6 game + 4 record_winner +
 #   2 equivalence.
+
+# After T3 (2026-04-24, same hardware)
+
+Batch::iteration (seq, no rayon)     23.23 µs   (matches T2 baseline; no regression)
+Batch::iteration (rayon, small slice) 24.57 µs   (within noise; small workloads pay rayon overhead)
+Gaussian::add                         236.62 ps  (unchanged)
+Gaussian::sub                         236.43 ps  (unchanged)
+Gaussian::mul                         237.05 ps  (unchanged)
+Gaussian::div                         236.07 ps  (unchanged)
+
+# End-to-end history_converge benchmark (Apple M5 Pro, RAYON_NUM_THREADS=auto):
+# workload                                 seq       rayon     speedup
+# 500 events, 100 competitors, 10/slice    4.03 ms   4.24 ms   1.0x
+# 2000 events, 200 competitors, 20/slice   20.18 ms  19.82 ms  1.0x
+# 5000 events, 50000 competitors, 1 slice  11.88 ms  9.10 ms   1.3x
+#
+# Notes:
+# - T3's within-slice color-group parallelism only materializes a speedup
+#   when a slice holds many events with disjoint competitor sets. Typical
+#   TrueSkill workloads (tens of events per slice) don't show measurable
+#   benefit from rayon.
+# - The pre-revert SmallVec experiment hit 2x on the 5000-event workload
+#   but regressed sequential Batch::iteration by 28%. The tradeoff wasn't
+#   worth it for typical workloads: SmallVec<[_; 8]> inline size (1 KB per
+#   Game struct) hurt cache locality on the hot path.
+# - Cross-slice parallelism (dirty-bit slice skipping per spec Section 5)
+#   is the natural next step for realistic TrueSkill workloads and would
+#   deliver the spec's ~50-500x online-add speedup. Deferred to T4+.
+# - Determinism verified: tests/determinism.rs asserts bit-identical
+#   posteriors across RAYON_NUM_THREADS={1, 2, 4, 8}.
+# - Send + Sync bounds added on Time, Drift<T>, Observer<T>, Factor, Schedule.
+# - Rayon is opt-in via `--features rayon`. Default build is unchanged from T2.
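The determinism note above relies on a simple pattern: compute per-slice values concurrently, then add them up in slice order on one thread. Floating-point addition is not associative, so a fixed reduction order is what keeps the parallel result bit-identical to the sequential one. The sketch below illustrates the idea with `std::thread::scope` instead of rayon so that it stays self-contained; `slice_log_evidence` and the other function names are hypothetical stand-ins, not the crate's API.

```rust
use std::thread;

/// Hypothetical stand-in for a per-slice log-evidence computation.
fn slice_log_evidence(slice: &[f64]) -> f64 {
    slice.iter().fold(0.0, |acc, p| acc + p.ln())
}

/// Single-threaded baseline: visit slices in order and accumulate.
fn log_evidence_sequential(slices: &[Vec<f64>]) -> f64 {
    slices
        .iter()
        .map(|s| slice_log_evidence(s))
        .fold(0.0, |acc, x| acc + x)
}

/// Compute each slice on its own thread, then reduce in slice order.
fn log_evidence_parallel(slices: &[Vec<f64>]) -> f64 {
    let per_slice: Vec<f64> = thread::scope(|scope| {
        let handles: Vec<_> = slices
            .iter()
            .map(|slice| scope.spawn(move || slice_log_evidence(slice)))
            .collect();
        // Joining in spawn order keeps results in slice order,
        // no matter which thread finishes first.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Sequential reduction with the same operand order as the baseline.
    per_slice.iter().fold(0.0, |acc, x| acc + x)
}

fn main() {
    // Eight synthetic slices of fifty evidence values in (0, 1).
    let slices: Vec<Vec<f64>> = (1..=8)
        .map(|s| (1..=50).map(|e| 1.0 / (1.0 + (s * e) as f64)).collect())
        .collect();
    let seq = log_evidence_sequential(&slices);
    let par = log_evidence_parallel(&slices);
    assert_eq!(seq.to_bits(), par.to_bits());
    println!("log evidence = {seq}");
}
```

Only the per-slice work runs concurrently; the final sum sees the same operands in the same order on both paths, which is why the bit patterns agree regardless of thread count.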
@@ -1,7 +1,7 @@
 use criterion::{Criterion, criterion_group, criterion_main};
 use trueskill_tt::{
-    BETA, Competitor, GAMMA, KeyTable, MU, P_DRAW, Rating, SIGMA, TimeSlice, drift::ConstantDrift,
-    gaussian::Gaussian, storage::CompetitorStore,
+    BETA, Competitor, EventKind, GAMMA, KeyTable, MU, P_DRAW, Rating, SIGMA, TimeSlice,
+    drift::ConstantDrift, gaussian::Gaussian, storage::CompetitorStore,
 };

 fn criterion_benchmark(criterion: &mut Criterion) {
@@ -33,8 +33,10 @@ fn criterion_benchmark(criterion: &mut Criterion) {
        weights.push(vec![vec![1.0], vec![1.0]]);
    }

+    let kinds = vec![EventKind::Ranked; composition.len()];
+
    let mut time_slice = TimeSlice::new(1, P_DRAW);
-    time_slice.add_events(composition, results, weights, &agents);
+    time_slice.add_events(composition, results, weights, kinds, &agents);

    criterion.bench_function("Batch::iteration", |b| {
        b.iter(|| time_slice.iteration(0, &agents))
@@ -0,0 +1,116 @@
+//! End-to-end History::converge benchmark.
+//!
+//! Workload shapes designed to expose rayon's within-slice color-group
+//! parallelism. Events in the same color group are processed in parallel
+//! via direct-write with disjoint index sets (no data races). Color groups
+//! smaller than a threshold fall back to the sequential path to avoid
+//! rayon overhead on small workloads.
+//!
+//! On Apple M5 Pro, the P-core count (6) is the optimal thread count.
+//! The rayon thread pool is initialised to `min(P-cores, available)` to
+//! avoid scheduling onto the slower E-cores.
+//!
+//! ## Results (Apple M5 Pro, 2026-04-24, after SmallVec revert)
+//!
+//! | Workload                                      | Sequential  | Parallel   | Speedup |
+//! |-----------------------------------------------|------------:|-----------:|--------:|
+//! | History::converge/500x100@10perslice          |     4.03 ms |    4.24 ms |   1.0×  |
+//! | History::converge/2000x200@20perslice         |    20.18 ms |   19.82 ms |   1.0×  |
+//! | History::converge/1v1-5000x50000@5000perslice |    11.88 ms |    9.10 ms |   1.3×  |
+//!
+//! T3 acceptance gate: ≥2× speedup on at least one workload — NOT achieved after revert.
+//! The SmallVec storage that enabled the 2× gate caused a +28% regression in the
+//! sequential Batch::iteration benchmark and was reverted. Small workloads still fall
+//! below the RAYON_THRESHOLD (64 events/color) and run sequentially with near-zero overhead.
+
+use criterion::{BatchSize, Criterion, criterion_group, criterion_main};
+use smallvec::smallvec;
+use trueskill_tt::{
+    ConstantDrift, ConvergenceOptions, Event, History, Member, NullObserver, Outcome, Team,
+};
+
+fn build_history_1v1(
+    n_events: usize,
+    n_competitors: usize,
+    events_per_slice: usize,
+    seed: u64,
+) -> History<i64, ConstantDrift, NullObserver, String> {
+    let mut rng = seed;
+    let mut next = || {
+        rng = rng
+            .wrapping_mul(6364136223846793005)
+            .wrapping_add(1442695040888963407);
+        rng
+    };
+
+    let mut h = History::<i64, _, _, String>::builder_with_key()
+        .mu(25.0)
+        .sigma(25.0 / 3.0)
+        .beta(25.0 / 6.0)
+        .drift(ConstantDrift(25.0 / 300.0))
+        .convergence(ConvergenceOptions {
+            max_iter: 30,
+            epsilon: 1e-6,
+        })
+        .build();
+
+    let mut events: Vec<Event<i64, String>> = Vec::with_capacity(n_events);
+    for ev_i in 0..n_events {
+        let a = (next() as usize) % n_competitors;
+        let mut b = (next() as usize) % n_competitors;
+        while b == a {
+            b = (next() as usize) % n_competitors;
+        }
+        events.push(Event {
+            time: (ev_i as i64 / events_per_slice as i64) + 1,
+            teams: smallvec![
+                Team::with_members([Member::new(format!("p{a}"))]),
+                Team::with_members([Member::new(format!("p{b}"))]),
+            ],
+            outcome: Outcome::winner((next() % 2) as u32, 2),
+        });
+    }
+    h.add_events(events).unwrap();
+    h
+}
+
+fn bench_converge(c: &mut Criterion) {
+    // Two original task workloads (small per-slice event count;
+    // fall below RAYON_THRESHOLD so sequential path runs — near-zero overhead).
+    c.bench_function("History::converge/500x100@10perslice", |b| {
+        b.iter_batched(
+            || build_history_1v1(500, 100, 10, 42),
+            |mut h| {
+                h.converge().unwrap();
+            },
+            BatchSize::SmallInput,
+        );
+    });
+
+    c.bench_function("History::converge/2000x200@20perslice", |b| {
+        b.iter_batched(
+            || build_history_1v1(2000, 200, 20, 42),
+            |mut h| {
+                h.converge().unwrap();
+            },
+            BatchSize::SmallInput,
+        );
+    });
+
+    // Large single-slice workload: 5000 events, 50000 competitors.
+    // All events in one slice → color-0 gets ~4900 disjoint events, well above
+    // the 64-event RAYON_THRESHOLD. 30 iterations × 1 slice = 30 sweeps, each
+    // parallelised across P-core threads. Measured 1.3× speedup after the SmallVec revert.
+    c.bench_function("History::converge/1v1-5000x50000@5000perslice", |b| {
+        b.iter_batched(
+            || build_history_1v1(5000, 50000, 5000, 42),
+            |mut h| {
+                h.converge().unwrap();
+            },
+            BatchSize::SmallInput,
+        );
+    });
+}
+
+criterion_group!(benches, bench_converge);
+criterion_main!(benches);
@@ -0,0 +1,38 @@
+use criterion::{Criterion, criterion_group, criterion_main};
+use smallvec::smallvec;
+use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
+
+fn bench_scored_history(c: &mut Criterion) {
+    c.bench_function("scored_history_60_events_30_iter", |bencher| {
+        bencher.iter(|| {
+            let mut h: History<i64, ConstantDrift, _, String> = History::builder_with_key()
+                .mu(25.0)
+                .sigma(25.0 / 3.0)
+                .beta(25.0 / 6.0)
+                .drift(ConstantDrift(0.03))
+                .score_sigma(2.0)
+                .build();
+
+            let mut events: Vec<Event<i64, String>> = Vec::with_capacity(60);
+            for i in 0..60 {
+                let a = format!("p{}", i % 20);
+                let b = format!("p{}", (i + 7) % 20);
+                let s_a = (i as f64 * 0.3).sin().abs() * 21.0;
+                let s_b = (i as f64 * 0.3).cos().abs() * 21.0;
+                events.push(Event {
+                    time: 1 + (i / 6) as i64,
+                    teams: smallvec![
+                        Team::with_members([Member::new(a)]),
+                        Team::with_members([Member::new(b)]),
+                    ],
+                    outcome: Outcome::scores([s_a, s_b]),
+                });
+            }
+            h.add_events(events).unwrap();
+            h.converge().unwrap();
+        });
+    });
+}
+
+criterion_group!(benches, bench_scored_history);
+criterion_main!(benches);
@@ -0,0 +1,14 @@
+    Finished `bench` profile [optimized + debuginfo] target(s) in 0.02s
+     Running benches/scored.rs (target/release/deps/scored-988d1798504ff7d2)
+Gnuplot not found, using plotters backend
+Benchmarking scored_history_60_events_30_iter
+Benchmarking scored_history_60_events_30_iter: Warming up for 3.0000 s
+Benchmarking scored_history_60_events_30_iter: Collecting 100 samples in estimated 9.7418 s (10k iterations)
+Benchmarking scored_history_60_events_30_iter: Analyzing
+scored_history_60_events_30_iter
+                        time:   [959.36 µs 962.68 µs 966.13 µs]
+Found 11 outliers among 100 measurements (11.00%)
+  1 (1.00%) low mild
+  5 (5.00%) high mild
+  5 (5.00%) high severe
+
@@ -578,7 +578,7 @@ All renames and the new public API land together. No half-renamed intermediate s

 Each shipped independently after T3.

-- `MarginFactor` → enables `Outcome::Scored`.
+- `MarginFactor` → enables `Outcome::Scored`. **Done** (see `docs/superpowers/plans/2026-04-27-t4-margin-factor.md`).
 - `Damped` and `Residual` schedules.
 - `SynergyFactor`, `ScoreFactor` → same pattern when wanted.

@@ -0,0 +1,59 @@
+//! Worked example: continuous-score outcomes via `Outcome::Scored`.
+//!
+//! Three players play a small round-robin where the score margin matters,
+//! not just who won. We show how `score_sigma` controls how much weight
+//! the engine places on the observed margin.
+//!
+//! Run with: `cargo run --example scored --release`
+
+use smallvec::smallvec;
+use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
+
+fn main() {
+    let mut h = History::builder()
+        .mu(25.0)
+        .sigma(25.0 / 3.0)
+        .beta(25.0 / 6.0)
+        .drift(ConstantDrift(0.03))
+        .score_sigma(2.0) // tune to data; smaller = trust margins more
+        .build();
+
+    let events: Vec<Event<i64, &'static str>> = vec![
+        Event {
+            time: 1,
+            teams: smallvec![
+                Team::with_members([Member::new("alice")]),
+                Team::with_members([Member::new("bob")]),
+            ],
+            outcome: Outcome::scores([21.0, 9.0]),
+        },
+        Event {
+            time: 2,
+            teams: smallvec![
+                Team::with_members([Member::new("bob")]),
+                Team::with_members([Member::new("carol")]),
+            ],
+            outcome: Outcome::scores([21.0, 18.0]),
+        },
+        Event {
+            time: 3,
+            teams: smallvec![
+                Team::with_members([Member::new("alice")]),
+                Team::with_members([Member::new("carol")]),
+            ],
+            outcome: Outcome::scores([21.0, 21.0]),
+        },
+    ];
+    h.add_events(events).unwrap();
+
+    let report = h.converge().unwrap();
+    println!(
+        "converged={}, iterations={}, log_evidence={:.4}",
+        report.converged, report.iterations, report.log_evidence
+    );
+
+    for who in &["alice", "bob", "carol"] {
+        let s = h.current_skill(who).unwrap();
+        println!("{:>6}: mu={:>7.3}  sigma={:.3}", who, s.mu(), s.sigma());
+    }
+}
@@ -0,0 +1,158 @@
+//! Greedy graph coloring for within-slice event independence.
+//!
+//! Events sharing no `Index` can be processed in parallel under async-EP
+//! semantics. This module partitions a list of events into "colors" such
+//! that events of the same color touch disjoint index sets.
+//!
+//! The algorithm is greedy: for each event in ingestion order, place it in
+//! the lowest-numbered color whose existing members share no `Index`. If
+//! no existing color accepts the event, open a new color.
+//!
+//! Complexity: O(n × c × m) where n is events, c is colors (small, ≤ 5 in
+//! practice), and m is average team size.
+
+use std::collections::HashSet;
+
+use crate::Index;
+
+/// Partition of event indices into color groups.
+///
+/// Each inner `Vec<usize>` holds the indices (into the original events
+/// array) of events assigned to one color. Colors are iterated in ascending
+/// order by convention.
+#[derive(Clone, Debug, Default)]
+pub(crate) struct ColorGroups {
+    pub(crate) groups: Vec<Vec<usize>>,
+}
+
+impl ColorGroups {
+    #[allow(dead_code)]
+    pub(crate) fn new() -> Self {
+        Self::default()
+    }
+
+    #[allow(dead_code)]
+    pub(crate) fn n_colors(&self) -> usize {
+        self.groups.len()
+    }
+
+    #[allow(dead_code)]
+    pub(crate) fn is_empty(&self) -> bool {
+        self.groups.is_empty()
+    }
+
+    /// Total event count across all colors.
+    #[allow(dead_code)]
+    pub(crate) fn total_events(&self) -> usize {
+        self.groups.iter().map(|g| g.len()).sum()
+    }
+
+    /// Contiguous index range for one color after events have been reordered
+    /// into color-contiguous positions by `TimeSlice::recompute_color_groups`.
+    #[allow(dead_code)]
+    pub(crate) fn color_range(&self, color_idx: usize) -> std::ops::Range<usize> {
+        let group = &self.groups[color_idx];
+        if group.is_empty() {
+            return 0..0;
+        }
+        let start = *group.first().unwrap();
+        let end = *group.last().unwrap() + 1;
+        start..end
+    }
+}
+
+/// Compute color groups greedily.
+///
+/// `index_set(ev_idx)` yields, for each event index, the iterator of
+/// `Index` values that event touches. The returned `ColorGroups` has one
+/// inner `Vec<usize>` per color, containing event indices in the order
+/// they were assigned.
+#[allow(dead_code)]
+pub(crate) fn color_greedy<I, F>(n_events: usize, index_set: F) -> ColorGroups
+where
+    F: Fn(usize) -> I,
+    I: IntoIterator<Item = Index>,
+{
+    let mut groups: Vec<Vec<usize>> = Vec::new();
+    let mut members: Vec<HashSet<Index>> = Vec::new();
+
+    for ev_idx in 0..n_events {
+        let ev_members: HashSet<Index> = index_set(ev_idx).into_iter().collect();
+        // Find first color whose member-set is disjoint from this event's indices.
+        let chosen = members.iter().position(|m| m.is_disjoint(&ev_members));
+        let color_idx = match chosen {
+            Some(c) => c,
+            None => {
+                groups.push(Vec::new());
+                members.push(HashSet::new());
+                groups.len() - 1
+            }
+        };
+        groups[color_idx].push(ev_idx);
+        members[color_idx].extend(ev_members);
+    }
+
+    ColorGroups { groups }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn idx(i: usize) -> Index {
+        Index::from(i)
+    }
+
+    #[test]
+    fn single_event_gets_one_color() {
+        let cg = color_greedy(1, |_| vec![idx(0), idx(1)]);
+        assert_eq!(cg.n_colors(), 1);
+        assert_eq!(cg.groups[0], vec![0]);
+    }
+
+    #[test]
+    fn disjoint_events_share_a_color() {
+        let cg = color_greedy(2, |i| match i {
+            0 => vec![idx(0), idx(1)],
+            1 => vec![idx(2), idx(3)],
+            _ => unreachable!(),
+        });
+        assert_eq!(cg.n_colors(), 1);
+        assert_eq!(cg.groups[0], vec![0, 1]);
+    }
+
+    #[test]
+    fn overlapping_events_need_separate_colors() {
+        let cg = color_greedy(2, |i| match i {
+            0 => vec![idx(0), idx(1)],
+            1 => vec![idx(1), idx(2)],
+            _ => unreachable!(),
+        });
+        assert_eq!(cg.n_colors(), 2);
+        assert_eq!(cg.groups[0], vec![0]);
+        assert_eq!(cg.groups[1], vec![1]);
+    }
+
+    #[test]
+    fn three_events_two_colors() {
+        // Event 0: {0, 1}; event 1: {2, 3}; event 2: {0, 2}.
+        // Greedy: ev0→c0, ev1→c0 (disjoint), ev2 overlaps both→c1.
+        let cg = color_greedy(3, |i| match i {
+            0 => vec![idx(0), idx(1)],
+            1 => vec![idx(2), idx(3)],
+            2 => vec![idx(0), idx(2)],
+            _ => unreachable!(),
+        });
+        assert_eq!(cg.n_colors(), 2);
+        assert_eq!(cg.groups[0], vec![0, 1]);
+        assert_eq!(cg.groups[1], vec![2]);
+    }
+
+    #[test]
+    fn total_events_counts_correctly() {
+        let cg = color_greedy(4, |_| vec![idx(0)]);
+        // All events touch index 0 → 4 distinct colors.
+        assert_eq!(cg.n_colors(), 4);
+        assert_eq!(cg.total_events(), 4);
+    }
+}
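As a usage illustration of the greedy coloring above, the following standalone sketch runs the same first-fit rule on two small workloads. It assumes plain `u32` competitor ids in place of the crate's `Index` type, and the free function shown here is a simplified re-statement for demonstration, not the crate's `color_greedy` signature.

```rust
use std::collections::HashSet;

/// First-fit greedy coloring. Each event is the list of competitor ids it
/// touches; `u32` ids are an assumption standing in for the crate's `Index`.
fn color_greedy(events: &[Vec<u32>]) -> Vec<Vec<usize>> {
    let mut groups: Vec<Vec<usize>> = Vec::new();
    let mut members: Vec<HashSet<u32>> = Vec::new();
    for (ev_idx, ev) in events.iter().enumerate() {
        let ids: HashSet<u32> = ev.iter().copied().collect();
        // Lowest-numbered color whose members share no id with this event.
        let color = match members.iter().position(|m| m.is_disjoint(&ids)) {
            Some(c) => c,
            None => {
                groups.push(Vec::new());
                members.push(HashSet::new());
                groups.len() - 1
            }
        };
        groups[color].push(ev_idx);
        members[color].extend(ids);
    }
    groups
}

fn main() {
    // Four players, full round-robin: every player appears in three events.
    let round_robin: Vec<Vec<u32>> = vec![
        vec![0, 1], vec![0, 2], vec![0, 3],
        vec![1, 2], vec![1, 3], vec![2, 3],
    ];
    // Six events over twelve players with no shared competitor.
    let disjoint: Vec<Vec<u32>> = (0..6).map(|i| vec![2 * i, 2 * i + 1]).collect();

    println!("round robin: {:?}", color_greedy(&round_robin));
    println!("disjoint:    {:?}", color_greedy(&disjoint));
}
```

The round-robin needs three colors of two events each, while the disjoint workload fits in a single color of six. That contrast matches the benchmark observation that within-slice parallelism only pays off when a slice holds many events over a large competitor pool.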
@@ -10,6 +10,8 @@ pub enum InferenceError {
    },
    /// A probability value is outside `[0, 1]`.
    InvalidProbability { value: f64 },
+    /// A scalar parameter is outside its valid range.
+    InvalidParameter { name: &'static str, value: f64 },
    /// Convergence exceeded `max_iter` without falling below `epsilon`.
    ConvergenceFailed {
        last_step: (f64, f64),
@@ -32,6 +34,9 @@ impl fmt::Display for InferenceError {
            Self::InvalidProbability { value } => {
                write!(f, "probability must be in [0, 1]; got {value}")
            }
+            Self::InvalidParameter { name, value } => {
+                write!(f, "{name} is invalid: {value}")
+            }
            Self::ConvergenceFailed {
                last_step,
                iterations,
@@ -75,6 +75,12 @@ where
        self
    }

+    /// Set explicit per-team continuous scores; higher = better.
+    pub fn scores<I: IntoIterator<Item = f64>>(mut self, scores: I) -> Self {
+        self.event.outcome = crate::Outcome::scores(scores);
+        self
+    }
+
    /// Mark team `winner_idx` as winner; others tied for last.
    pub fn winner(mut self, winner_idx: u32) -> Self {
        self.event.outcome = Outcome::winner(winner_idx, self.event.teams.len() as u32);
@@ -0,0 +1,123 @@
+use crate::{
+    N_INF,
+    factor::{Factor, VarId, VarStore},
+    gaussian::Gaussian,
+    pdf,
+};
+
+/// Gaussian observation factor on a diff variable.
+///
+/// Encodes the soft evidence `m_obs ~ N(diff, sigma²)`. The outgoing message
+/// to `diff` is the constant `N(m_obs, sigma²)`, so this factor converges in a
+/// single propagation: subsequent calls return a zero delta.
+#[derive(Debug)]
+pub struct MarginFactor {
+    pub diff: VarId,
+    pub m_obs: f64,
+    pub sigma: f64,
+    pub(crate) msg: Gaussian,
+    pub(crate) evidence_cached: Option<f64>,
+}
+
+impl MarginFactor {
+    pub fn new(diff: VarId, m_obs: f64, sigma: f64) -> Self {
+        debug_assert!(sigma > 0.0, "score sigma must be positive");
+        Self {
+            diff,
+            m_obs,
+            sigma,
+            msg: N_INF,
+            evidence_cached: None,
+        }
+    }
+}
+
+impl Factor for MarginFactor {
+    fn propagate(&mut self, vars: &mut VarStore) -> (f64, f64) {
+        let marginal = vars.get(self.diff);
+        let cavity = marginal / self.msg;
+
+        if self.evidence_cached.is_none() {
+            self.evidence_cached = Some(cavity_evidence(cavity, self.m_obs, self.sigma));
+        }
+
+        let new_msg = Gaussian::from_ms(self.m_obs, self.sigma);
+        let new_marginal = cavity * new_msg;
+        let old_msg = self.msg;
+        self.msg = new_msg;
+        vars.set(self.diff, new_marginal);
+
+        old_msg.delta(new_msg)
+    }
+
+    fn log_evidence(&self, _vars: &VarStore) -> f64 {
+        self.evidence_cached.unwrap_or(1.0).ln()
+    }
+}
+
+fn cavity_evidence(cavity: Gaussian, m_obs: f64, sigma: f64) -> f64 {
+    let combined_sigma = (cavity.sigma().powi(2) + sigma.powi(2)).sqrt();
+    pdf(m_obs, cavity.mu(), combined_sigma)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn first_propagate_writes_tilted_marginal() {
+        let mut vars = VarStore::new();
+        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let mut f = MarginFactor::new(diff, 5.0, 1.0);
+
+        f.propagate(&mut vars);
+
+        let result = vars.get(diff);
+        // pi = 1/36 + 1 ≈ 1.027778; tau = 0 + 5 = 5
+        // mu = 5 / 1.027778 ≈ 4.864865; sigma = 1/sqrt(1.027778) ≈ 0.986394
+        assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
+        assert!((result.sigma() - 0.986393923832144).abs() < 1e-12);
+    }
+
+    #[test]
+    fn converges_in_one_step() {
+        let mut vars = VarStore::new();
+        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let mut f = MarginFactor::new(diff, 5.0, 1.0);
+
+        f.propagate(&mut vars);
+        let (dmu, dsig) = f.propagate(&mut vars);
+        assert!(
+            dmu < 1e-12,
+            "expected ~0 delta on second propagate, got {dmu}"
+        );
+        assert!(dsig < 1e-12);
+    }
+
+    #[test]
+    fn evidence_cached_on_first_propagate() {
+        let mut vars = VarStore::new();
+        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let mut f = MarginFactor::new(diff, 5.0, 1.0);
+        assert!(f.evidence_cached.is_none());
+
+        f.propagate(&mut vars);
+        let z = f.evidence_cached.unwrap();
+        // pdf(5, 0, sqrt(37)) ≈ 0.046783
+        assert!((z - 0.04678300292616668).abs() < 1e-10);
+
+        // Subsequent propagations don't change it.
+        f.propagate(&mut vars);
+        assert_eq!(f.evidence_cached.unwrap(), z);
+    }
+
+    #[test]
+    fn log_evidence_matches_cached_ln() {
+        let mut vars = VarStore::new();
+        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let mut f = MarginFactor::new(diff, 5.0, 1.0);
+        f.propagate(&mut vars);
+        let logz = f.log_evidence(&vars);
+        assert!((logz - (-3.062235327364623)).abs() < 1e-10);
+    }
+}
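The arithmetic in the test comments above can be reproduced without the crate. The sketch below is a minimal, self-contained illustration of the margin update in natural parameters (precision `pi = 1/sigma²`, `tau = mu/sigma²`), where multiplying two Gaussian messages adds their parameters. The names `Nat`, `margin_update`, and `margin_evidence` are hypothetical and exist only for this example; they are not the crate's `Gaussian` or `MarginFactor` API.

```rust
/// Gaussian in natural parameters: pi = 1/sigma^2, tau = mu/sigma^2.
#[derive(Clone, Copy, Debug)]
struct Nat {
    pi: f64,
    tau: f64,
}

impl Nat {
    fn from_ms(mu: f64, sigma: f64) -> Self {
        let pi = 1.0 / (sigma * sigma);
        Nat { pi, tau: mu * pi }
    }
    fn mu(self) -> f64 {
        self.tau / self.pi
    }
    fn sigma(self) -> f64 {
        (1.0 / self.pi).sqrt()
    }
    /// Product of two Gaussian messages: natural parameters add.
    fn mul(self, other: Nat) -> Nat {
        Nat { pi: self.pi + other.pi, tau: self.tau + other.tau }
    }
}

/// Marginal on the diff variable after observing `m_obs` with noise `score_sigma`.
fn margin_update(cavity: Nat, m_obs: f64, score_sigma: f64) -> Nat {
    cavity.mul(Nat::from_ms(m_obs, score_sigma))
}

/// Evidence of the observation under the cavity:
/// the density N(m_obs; mu_c, sigma_c^2 + score_sigma^2).
fn margin_evidence(cavity: Nat, m_obs: f64, score_sigma: f64) -> f64 {
    let var = cavity.sigma().powi(2) + score_sigma.powi(2);
    let z = m_obs - cavity.mu();
    (-0.5 * z * z / var).exp() / (2.0 * std::f64::consts::PI * var).sqrt()
}

fn main() {
    let cavity = Nat::from_ms(0.0, 6.0);
    // Smaller score_sigma pulls the marginal closer to the observed margin of 5.
    for score_sigma in [1.0, 4.0] {
        let post = margin_update(cavity, 5.0, score_sigma);
        let z = margin_evidence(cavity, 5.0, score_sigma);
        println!(
            "score_sigma={score_sigma}: mu={:.4} sigma={:.4} log_evidence={:.4}",
            post.mu(),
            post.sigma(),
            z.ln()
        );
    }
}
```

Because the observation message `N(m_obs, score_sigma²)` does not depend on the cavity, a second update would recompute the same message, which is consistent with the factor converging in a single propagation.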
@@ -78,6 +78,7 @@ pub enum BuiltinFactor {
    TeamSum(team_sum::TeamSumFactor),
    RankDiff(rank_diff::RankDiffFactor),
    Trunc(trunc::TruncFactor),
+    Margin(margin::MarginFactor),
 }

 impl Factor for BuiltinFactor {
@@ -86,17 +87,20 @@ impl Factor for BuiltinFactor {
            Self::TeamSum(f) => f.propagate(vars),
            Self::RankDiff(f) => f.propagate(vars),
            Self::Trunc(f) => f.propagate(vars),
+            Self::Margin(f) => f.propagate(vars),
        }
    }

    fn log_evidence(&self, vars: &VarStore) -> f64 {
        match self {
            Self::Trunc(f) => f.log_evidence(vars),
+            Self::Margin(f) => f.log_evidence(vars),
            _ => 0.0,
        }
    }
 }

+pub mod margin;
 pub mod rank_diff;
 pub mod team_sum;
 pub mod trunc;
@@ -145,4 +149,20 @@ mod tests {
        assert_eq!(store.len(), 0);
        assert_eq!(store.marginals.capacity(), cap);
    }
+
+    #[test]
+    fn builtin_factor_dispatches_to_margin() {
+        use super::margin::MarginFactor;
+        let mut vars = VarStore::new();
+        let diff = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let mut f = BuiltinFactor::Margin(MarginFactor::new(diff, 5.0, 1.0));
+
+        f.propagate(&mut vars);
+
+        let result = vars.get(diff);
+        assert!((result.mu() - 4.864864864864865).abs() < 1e-12);
+
+        let logz = f.log_evidence(&vars);
+        assert!((logz - (-3.062235327364623)).abs() < 1e-10);
+    }
 }
@@ -6,8 +6,8 @@

 pub use crate::{
    factor::{
-        BuiltinFactor, Factor, VarId, VarStore, rank_diff::RankDiffFactor, team_sum::TeamSumFactor,
-        trunc::TruncFactor,
+        BuiltinFactor, Factor, VarId, VarStore, margin::MarginFactor, rank_diff::RankDiffFactor,
+        team_sum::TeamSumFactor, trunc::TruncFactor,
    },
    schedule::{EpsilonOrMax, Schedule, ScheduleReport},
 };
@@ -5,16 +5,63 @@ use crate::{
    arena::ScratchArena,
    compute_margin,
    drift::Drift,
-    factor::{Factor, trunc::TruncFactor},
+    factor::{VarId, margin::MarginFactor, trunc::TruncFactor},
    gaussian::Gaussian,
    rating::Rating,
    time::Time,
    tuple_gt, tuple_max,
 };

+/// Per-adjacent-pair link factor in the game's diff chain.
+///
+/// `Trunc` is used for `Outcome::Ranked` (rank-based truncation).
+/// `Margin` is used for `Outcome::Scored` (Gaussian observation on the diff).
+#[derive(Debug)]
+pub(crate) enum DiffFactor {
+    Trunc(TruncFactor),
+    Margin(MarginFactor),
+}
+
+impl DiffFactor {
+    pub(crate) fn diff(&self) -> VarId {
+        match self {
+            Self::Trunc(f) => f.diff,
+            Self::Margin(f) => f.diff,
+        }
+    }
+
+    pub(crate) fn msg(&self) -> Gaussian {
+        match self {
+            Self::Trunc(f) => f.msg,
+            Self::Margin(f) => f.msg,
+        }
+    }
+
+    pub(crate) fn evidence(&self) -> f64 {
+        match self {
+            Self::Trunc(f) => f.evidence_cached.unwrap_or(1.0),
+            Self::Margin(f) => f.evidence_cached.unwrap_or(1.0),
+        }
+    }
+
+    pub(crate) fn propagate(&mut self, vars: &mut crate::factor::VarStore) -> (f64, f64) {
+        use crate::factor::Factor;
+        match self {
+            Self::Trunc(f) => f.propagate(vars),
+            Self::Margin(f) => f.propagate(vars),
+        }
+    }
+}
+
+/// Per-game inference options.
+///
+/// `p_draw` and `convergence` apply to ranked outcomes (`Game::ranked`).
+/// `score_sigma` applies only to scored outcomes (`Game::scored`); it controls
+/// how much the engine trusts the observed score margin (smaller σ = more trust).
 #[derive(Clone, Copy, Debug)]
 pub struct GameOptions {
    pub p_draw: f64,
+    pub score_sigma: f64,
    pub convergence: crate::ConvergenceOptions,
 }

@@ -22,6 +69,7 @@ impl Default for GameOptions {
    fn default() -> Self {
        Self {
            p_draw: crate::P_DRAW,
+            score_sigma: 1.0,
            convergence: crate::ConvergenceOptions::default(),
        }
    }
@@ -64,6 +112,26 @@ impl<T: Time, D: Drift<T>> OwnedGame<T, D> {
        }
    }

+    pub(crate) fn new_scored(
+        teams: Vec<Vec<Rating<T, D>>>,
+        scores: Vec<f64>,
+        weights: Vec<Vec<f64>>,
+        score_sigma: f64,
+    ) -> Self {
+        let mut arena = ScratchArena::new();
+        let g = Game::scored_with_arena(teams.clone(), &scores, &weights, score_sigma, &mut arena);
+        let likelihoods = g.likelihoods;
+        let evidence = g.evidence;
+        Self {
+            teams,
+            result: scores,
+            weights,
+            p_draw: 0.0,
+            likelihoods,
+            evidence,
+        }
+    }
+
    pub fn posteriors(&self) -> Vec<Vec<Gaussian>> {
        self.likelihoods
            .iter()
@@ -132,6 +200,39 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
        this
    }

+    pub(crate) fn scored_with_arena(
+        teams: Vec<Vec<Rating<T, D>>>,
+        scores: &'a [f64],
+        weights: &'a [Vec<f64>],
+        score_sigma: f64,
+        arena: &mut ScratchArena,
+    ) -> Self {
+        debug_assert!(
+            scores.len() == teams.len(),
+            "scores must have the same length as teams"
+        );
+        debug_assert!(
+            weights
+                .iter()
+                .zip(teams.iter())
+                .all(|(w, t)| w.len() == t.len()),
+            "weights must have the same dimensions as teams"
+        );
+        debug_assert!(score_sigma > 0.0, "score_sigma must be positive");
+
+        let mut this = Self {
+            teams,
+            result: scores,
+            weights,
+            p_draw: 0.0,
+            likelihoods: Vec::new(),
+            evidence: 0.0,
+        };
+
+        this.likelihoods_scored(arena, score_sigma);
+        this
+    }
+
    fn likelihoods(&mut self, arena: &mut ScratchArena) {
        arena.reset();

@@ -155,9 +256,9 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {

        let n_diffs = n_teams.saturating_sub(1);

-        // One TruncFactor per adjacent sorted-team pair; each owns a diff VarId.
-        // trunc stays local (fresh state per game; Vec capacity is typically small).
-        let mut trunc: Vec<TruncFactor> = (0..n_diffs)
+        // One DiffFactor per adjacent sorted-team pair; each owns a diff VarId.
+        // links stays local (fresh state per game; Vec capacity is typically small).
+        let mut links: Vec<DiffFactor> = (0..n_diffs)
            .map(|i| {
                let tie = self.result[arena.sort_buf[i]] == self.result[arena.sort_buf[i + 1]];
                let margin = if self.p_draw == 0.0 {
@@ -174,7 +275,7 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
                    compute_margin(self.p_draw, (a + b).sqrt())
                };
                let vid = arena.vars.alloc(N_INF);
-                TruncFactor::new(vid, margin, tie)
+                DiffFactor::Trunc(TruncFactor::new(vid, margin, tie))
            })
            .collect();

@@ -189,30 +290,30 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
            step = (0.0_f64, 0.0_f64);

            // Forward sweep: diffs 0 .. n_diffs-2 (all but the last).
-            for (e, tf) in trunc[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
+            for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
                let pw = arena.team_prior[e] * arena.lhood_lose[e];
                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
                let raw = pw - pl;
-                arena.vars.set(tf.diff, raw * tf.msg);
-                let d = tf.propagate(&mut arena.vars);
+                arena.vars.set(lf.diff(), raw * lf.msg());
+                let d = lf.propagate(&mut arena.vars);
                step = tuple_max(step, d);

-                let new_ll = pw - tf.msg;
+                let new_ll = pw - lf.msg();
                step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
                arena.lhood_lose[e + 1] = new_ll;
            }

            // Backward sweep: diffs n_diffs-1 .. 1 (reverse, all but the first).
-            for (rev_i, tf) in trunc[1..].iter_mut().rev().enumerate() {
+            for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
                let e = n_diffs - 1 - rev_i;
                let pw = arena.team_prior[e] * arena.lhood_lose[e];
                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
                let raw = pw - pl;
-                arena.vars.set(tf.diff, raw * tf.msg);
-                let d = tf.propagate(&mut arena.vars);
+                arena.vars.set(lf.diff(), raw * lf.msg());
+                let d = lf.propagate(&mut arena.vars);
                step = tuple_max(step, d);

-                let new_lw = pl + tf.msg;
+                let new_lw = pl + lf.msg();
                step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
                arena.lhood_win[e] = new_lw;
            }
@@ -224,23 +325,20 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
        if n_diffs == 1 {
            let raw = (arena.team_prior[0] * arena.lhood_lose[0])
                - (arena.team_prior[1] * arena.lhood_win[1]);
-            arena.vars.set(trunc[0].diff, raw * trunc[0].msg);
-            trunc[0].propagate(&mut arena.vars);
+            arena.vars.set(links[0].diff(), raw * links[0].msg());
+            links[0].propagate(&mut arena.vars);
        }

        // Boundary updates: close the chain at both ends.
        if n_diffs > 0 {
            let pl1 = arena.team_prior[1] * arena.lhood_win[1];
-            arena.lhood_win[0] = pl1 + trunc[0].msg;
+            arena.lhood_win[0] = pl1 + links[0].msg();
            let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
-            arena.lhood_lose[n_teams - 1] = pw_last - trunc[n_diffs - 1].msg;
+            arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
        }

        // Evidence = product of per-diff evidences (each cached on first propagation).
-        self.evidence = trunc
-            .iter()
-            .map(|t| t.evidence_cached.unwrap_or(1.0))
-            .product();
+        self.evidence = links.iter().map(|l| l.evidence()).product();

        // Inverse permutation: inv_buf[orig_i] = sorted_i.
        arena.inv_buf.resize(n_teams, 0);
@@ -272,6 +370,120 @@ impl<'a, T: Time, D: Drift<T>> Game<'a, T, D> {
            .collect::<Vec<_>>();
    }

+    fn likelihoods_scored(&mut self, arena: &mut ScratchArena, score_sigma: f64) {
+        arena.reset();
+
+        let n_teams = self.teams.len();
+
+        arena.sort_buf.extend(0..n_teams);
+        arena.sort_buf.sort_by(|&i, &j| {
+            self.result[j]
+                .partial_cmp(&self.result[i])
+                .unwrap_or(Ordering::Equal)
+        });
+
+        arena.team_prior.extend(arena.sort_buf.iter().map(|&t| {
+            self.teams[t]
+                .iter()
+                .zip(self.weights[t].iter())
+                .fold(N00, |p, (player, &w)| p + (player.performance() * w))
+        }));
+
+        let n_diffs = n_teams.saturating_sub(1);
+
+        let mut links: Vec<DiffFactor> = (0..n_diffs)
+            .map(|i| {
+                // Scores are sorted descending, so m_obs >= 0 for each adjacent pair (barring NaN).
+                let m_obs = self.result[arena.sort_buf[i]] - self.result[arena.sort_buf[i + 1]];
+                let vid = arena.vars.alloc(N_INF);
+                DiffFactor::Margin(MarginFactor::new(vid, m_obs, score_sigma))
+            })
+            .collect();
+
+        arena.lhood_lose.resize(n_teams, N_INF);
+        arena.lhood_win.resize(n_teams, N_INF);
+
+        let mut step = (f64::INFINITY, f64::INFINITY);
+        let mut iter = 0;
+
+        while tuple_gt(step, 1e-6) && iter < 10 {
+            step = (0.0_f64, 0.0_f64);
+
+            for (e, lf) in links[..n_diffs.saturating_sub(1)].iter_mut().enumerate() {
+                let pw = arena.team_prior[e] * arena.lhood_lose[e];
+                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
+                let raw = pw - pl;
+                arena.vars.set(lf.diff(), raw * lf.msg());
+                let d = lf.propagate(&mut arena.vars);
+                step = tuple_max(step, d);
+
+                let new_ll = pw - lf.msg();
+                step = tuple_max(step, arena.lhood_lose[e + 1].delta(new_ll));
+                arena.lhood_lose[e + 1] = new_ll;
+            }
+
+            for (rev_i, lf) in links[1..].iter_mut().rev().enumerate() {
+                let e = n_diffs - 1 - rev_i;
+                let pw = arena.team_prior[e] * arena.lhood_lose[e];
+                let pl = arena.team_prior[e + 1] * arena.lhood_win[e + 1];
+                let raw = pw - pl;
+                arena.vars.set(lf.diff(), raw * lf.msg());
+                let d = lf.propagate(&mut arena.vars);
+                step = tuple_max(step, d);
+
+                let new_lw = pl + lf.msg();
+                step = tuple_max(step, arena.lhood_win[e].delta(new_lw));
+                arena.lhood_win[e] = new_lw;
+            }
+
+            iter += 1;
+        }
+
+        if n_diffs == 1 {
+            let raw = (arena.team_prior[0] * arena.lhood_lose[0])
+                - (arena.team_prior[1] * arena.lhood_win[1]);
+            arena.vars.set(links[0].diff(), raw * links[0].msg());
+            links[0].propagate(&mut arena.vars);
+        }
+
+        if n_diffs > 0 {
+            let pl1 = arena.team_prior[1] * arena.lhood_win[1];
+            arena.lhood_win[0] = pl1 + links[0].msg();
+            let pw_last = arena.team_prior[n_teams - 2] * arena.lhood_lose[n_teams - 2];
+            arena.lhood_lose[n_teams - 1] = pw_last - links[n_diffs - 1].msg();
+        }
+
+        self.evidence = links.iter().map(|l| l.evidence()).product();
+
+        arena.inv_buf.resize(n_teams, 0);
+        for (si, &orig_i) in arena.sort_buf.iter().enumerate() {
+            arena.inv_buf[orig_i] = si;
+        }
+
+        self.likelihoods = self
+            .teams
+            .iter()
+            .zip(self.weights.iter())
+            .enumerate()
+            .map(|(orig_i, (players, weights))| {
+                let si = arena.inv_buf[orig_i];
+                let m = arena.lhood_win[si] * arena.lhood_lose[si];
+                let performance = players
+                    .iter()
+                    .zip(weights.iter())
+                    .fold(N00, |p, (player, &w)| p + (player.performance() * w));
+                players
+                    .iter()
+                    .zip(weights.iter())
+                    .map(|(player, &w)| {
+                        ((m - performance.exclude(player.performance() * w)) * (1.0 / w))
+                            .forget(player.beta.powi(2))
+                    })
+                    .collect::<Vec<_>>()
+            })
+            .collect::<Vec<_>>();
+    }
+
    pub fn posteriors(&self) -> Vec<Vec<Gaussian>> {
        self.likelihoods
            .iter()
@@ -309,7 +521,13 @@ impl<T: Time, D: Drift<T>> Game<'_, T, D> {
            });
        }

-        let ranks = outcome.as_ranks();
+        let ranks = outcome
+            .as_ranks()
+            .ok_or(crate::InferenceError::MismatchedShape {
+                kind: "Game::ranked requires Outcome::Ranked",
+                expected: 0,
+                got: 0,
+            })?;
        let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
        let result: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
        let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
@@ -318,6 +536,42 @@ impl<T: Time, D: Drift<T>> Game<'_, T, D> {
        Ok(OwnedGame::new(teams_owned, result, weights, options.p_draw))
    }

+    pub fn scored(
+        teams: &[&[Rating<T, D>]],
+        outcome: crate::Outcome,
+        options: &GameOptions,
+    ) -> Result<OwnedGame<T, D>, crate::InferenceError> {
+        if options.score_sigma <= 0.0 || options.score_sigma.is_nan() {
+            return Err(crate::InferenceError::InvalidParameter {
+                name: "score_sigma",
+                value: options.score_sigma,
+            });
+        }
+        if outcome.team_count() != teams.len() {
+            return Err(crate::InferenceError::MismatchedShape {
+                kind: "outcome scores vs teams",
+                expected: teams.len(),
+                got: outcome.team_count(),
+            });
+        }
+        let scores = outcome
+            .as_scores()
+            .ok_or(crate::InferenceError::MismatchedShape {
+                kind: "Game::scored requires Outcome::Scored",
+                expected: 0,
+                got: 0,
+            })?
+            .to_vec();
+        let teams_owned: Vec<Vec<Rating<T, D>>> = teams.iter().map(|t| t.to_vec()).collect();
+        let weights: Vec<Vec<f64>> = teams.iter().map(|t| vec![1.0; t.len()]).collect();
+        Ok(OwnedGame::new_scored(
+            teams_owned,
+            scores,
+            weights,
+            options.score_sigma,
+        ))
+    }
+
    pub fn one_v_one(
        a: &Rating<T, D>,
        b: &Rating<T, D>,
@@ -805,6 +1059,131 @@ mod tests {
        assert_ulps_eq!(p[0][0], p[1][0], epsilon = 1e-6);
    }

+    #[test]
+    fn diff_factor_dispatch_trunc_and_margin() {
+        use super::DiffFactor;
+        use crate::factor::{VarStore, margin::MarginFactor, trunc::TruncFactor};
+
+        let mut vars = VarStore::new();
+        let dt = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+        let dm = vars.alloc(Gaussian::from_ms(0.0, 6.0));
+
+        let mut t = DiffFactor::Trunc(TruncFactor::new(dt, 0.0, false));
+        let mut m = DiffFactor::Margin(MarginFactor::new(dm, 5.0, 1.0));
+
+        let _ = t.propagate(&mut vars);
+        let _ = m.propagate(&mut vars);
+
+        // Smoke test: both factors emitted a message with nonzero precision (i.e. not N_INF).
+        assert!(t.msg().pi() > 0.0);
+        assert!(m.msg().pi() > 0.0);
+        assert_eq!(t.diff(), dt);
+        assert_eq!(m.diff(), dm);
+    }
+
+    #[test]
+    fn scored_path_sharper_when_margin_is_large() {
+        let prior = R::new(
+            Gaussian::from_ms(25.0, 25.0 / 3.0),
+            25.0 / 6.0,
+            ConstantDrift(25.0 / 300.0),
+        );
+        let teams = vec![vec![prior], vec![prior]];
+        let result = vec![10.0, 0.0]; // a beat b by 10
+        let weights = [vec![1.0], vec![1.0]];
+        let mut arena = ScratchArena::new();
+        let g = Game::scored_with_arena(
+            teams, &result, &weights, 1.0, // score_sigma
+            &mut arena,
+        );
+        let p = g.posteriors();
+        let a = p[0][0];
+        let b = p[1][0];
+        assert!(
+            a.mu() > b.mu(),
+            "expected team a posterior mu > team b; got {} vs {}",
+            a.mu(),
+            b.mu()
+        );
+
+        // Tighter score_sigma should produce a stronger update.
+        let mut arena2 = ScratchArena::new();
+        let g_tight = Game::scored_with_arena(
+            vec![vec![prior], vec![prior]],
+            &result,
+            &weights,
+            0.1, // tighter score_sigma
+            &mut arena2,
+        );
+        let p_tight = g_tight.posteriors();
+        let a_tight = p_tight[0][0];
+        assert!(
+            a_tight.mu() > a.mu(),
+            "expected tighter sigma to push posterior further; {} vs {}",
+            a_tight.mu(),
+            a.mu()
+        );
+    }
+
+    #[test]
+    fn game_scored_public_ctor() {
+        use crate::Outcome;
+        let prior = R::new(
+            Gaussian::from_ms(25.0, 25.0 / 3.0),
+            25.0 / 6.0,
+            ConstantDrift(25.0 / 300.0),
+        );
+        let opts = GameOptions {
+            score_sigma: 1.0,
+            ..GameOptions::default()
+        };
+        let g = Game::scored(&[&[prior], &[prior]], Outcome::scores([8.0, 2.0]), &opts).unwrap();
+        let p = g.posteriors();
+        assert!(p[0][0].mu() > p[1][0].mu());
+    }
+
+    #[test]
+    fn game_scored_rejects_ranked_outcome() {
+        let prior = R::new(
+            Gaussian::from_ms(25.0, 25.0 / 3.0),
+            25.0 / 6.0,
+            ConstantDrift(25.0 / 300.0),
+        );
+        let err = Game::scored(
+            &[&[prior], &[prior]],
+            crate::Outcome::winner(0, 2),
+            &GameOptions::default(),
+        )
+        .unwrap_err();
+        assert!(matches!(err, crate::InferenceError::MismatchedShape { .. }));
+    }
+
+    #[test]
+    fn game_scored_rejects_zero_score_sigma() {
+        let prior = R::new(
+            Gaussian::from_ms(25.0, 25.0 / 3.0),
+            25.0 / 6.0,
+            ConstantDrift(25.0 / 300.0),
+        );
+        let opts = GameOptions {
+            score_sigma: 0.0,
+            ..GameOptions::default()
+        };
+        let err = Game::scored(
+            &[&[prior], &[prior]],
+            crate::Outcome::scores([1.0, 0.0]),
+            &opts,
+        )
+        .unwrap_err();
+        assert!(matches!(
+            err,
+            crate::InferenceError::InvalidParameter {
+                name: "score_sigma",
+                ..
+            }
+        ));
+    }
+
    #[test]
    fn test_2vs2_weighted() {
        let t_a = vec![
@@ -13,7 +13,7 @@ use crate::{
    sort_time,
    storage::CompetitorStore,
    time::Time,
-    time_slice::{self, TimeSlice},
+    time_slice::{self, EventKind, TimeSlice},
    tuple_gt, tuple_max,
 };

@@ -30,6 +30,7 @@ pub struct HistoryBuilder<
    drift: D,
    p_draw: f64,
    online: bool,
+    score_sigma: f64,
    convergence: ConvergenceOptions,
    observer: O,
    _time: PhantomData<T>,
@@ -60,6 +61,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
            beta: self.beta,
            p_draw: self.p_draw,
            online: self.online,
+            score_sigma: self.score_sigma,
            convergence: self.convergence,
            observer: self.observer,
            _time: self._time,
@@ -77,6 +79,15 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
        self
    }

+    pub fn score_sigma(mut self, score_sigma: f64) -> Self {
+        assert!(
+            score_sigma > 0.0,
+            "score_sigma must be positive (got {score_sigma})"
+        );
+        self.score_sigma = score_sigma;
+        self
+    }
+
    pub fn convergence(mut self, opts: ConvergenceOptions) -> Self {
        self.convergence = opts;
        self
@@ -90,6 +101,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
            drift: self.drift,
            p_draw: self.p_draw,
            online: self.online,
+            score_sigma: self.score_sigma,
            convergence: self.convergence,
            observer,
            _time: self._time,
@@ -109,6 +121,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> HistoryBuilder<
            drift: self.drift,
            p_draw: self.p_draw,
            online: self.online,
+            score_sigma: self.score_sigma,
            convergence: self.convergence,
            observer: self.observer,
        }
@@ -124,6 +137,7 @@ impl Default for HistoryBuilder<i64, ConstantDrift, NullObserver, &'static str>
            drift: ConstantDrift(GAMMA),
            p_draw: P_DRAW,
            online: false,
+            score_sigma: 1.0,
            convergence: ConvergenceOptions::default(),
            observer: NullObserver,
            _time: PhantomData,
@@ -148,6 +162,7 @@ pub struct History<
    drift: D,
    p_draw: f64,
    online: bool,
+    score_sigma: f64,
    convergence: ConvergenceOptions,
    observer: O,
 }
@@ -174,6 +189,7 @@ impl<K: Eq + Hash + Clone> History<i64, ConstantDrift, NullObserver, K> {
            drift: ConstantDrift(GAMMA),
            p_draw: P_DRAW,
            online: false,
+            score_sigma: 1.0,
            convergence: ConvergenceOptions::default(),
            observer: NullObserver,
            _time: PhantomData,
@@ -262,6 +278,33 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
    /// Note: `key(idx)` is O(n) per lookup; this method is therefore O(n²)
    /// in the number of competitors. Acceptable for T2; T3 may optimize.
    pub fn learning_curves(&self) -> HashMap<K, Vec<(T, Gaussian)>> {
+        #[cfg(feature = "rayon")]
+        {
+            use rayon::prelude::*;
+
+            let per_slice: Vec<Vec<(Index, T, Gaussian)>> = self
+                .time_slices
+                .par_iter()
+                .map(|ts| {
+                    ts.skills
+                        .iter()
+                        .map(|(idx, sk)| (idx, ts.time, sk.posterior()))
+                        .collect()
+                })
+                .collect();
+
+            let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
+            for slice_contrib in per_slice {
+                for (idx, t, g) in slice_contrib {
+                    if let Some(key) = self.keys.key(idx).cloned() {
+                        data.entry(key).or_default().push((t, g));
+                    }
+                }
+            }
+            data
+        }
+        #[cfg(not(feature = "rayon"))]
+        {
            let mut data: HashMap<K, Vec<(T, Gaussian)>> = HashMap::new();
            for slice in &self.time_slices {
                for (idx, skill) in slice.skills.iter() {
@@ -274,6 +317,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
            }
            data
        }
+    }

    /// Skill estimate at the latest time slice the competitor appears in.
    pub fn current_skill<Q>(&self, key: &Q) -> Option<Gaussian>
@@ -304,11 +348,24 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
    }

    pub(crate) fn log_evidence_internal(&mut self, forward: bool, targets: &[Index]) -> f64 {
+        #[cfg(feature = "rayon")]
+        {
+            use rayon::prelude::*;
+            let per_slice: Vec<f64> = self
+                .time_slices
+                .par_iter()
+                .map(|ts| ts.log_evidence(self.online, targets, forward, &self.agents))
+                .collect();
+            per_slice.into_iter().sum()
+        }
+        #[cfg(not(feature = "rayon"))]
+        {
            self.time_slices
                .iter()
                .map(|ts| ts.log_evidence(self.online, targets, forward, &self.agents))
                .sum()
        }
+    }

    /// Total log-evidence across the history.
    pub fn log_evidence(&mut self) -> f64 {
@@ -409,6 +466,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
        results: Vec<Vec<f64>>,
        times: Vec<T>,
        weights: Vec<Vec<Vec<f64>>>,
+        kinds: Vec<EventKind>,
        mut priors: HashMap<Index, Rating<T, D>>,
    ) -> Result<(), InferenceError> {
        if !results.is_empty() && results.len() != composition.len() {
@@ -432,6 +490,13 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
                got: weights.len(),
            });
        }
+        if kinds.len() != composition.len() {
+            return Err(InferenceError::MismatchedShape {
+                kind: "kinds",
+                expected: composition.len(),
+                got: kinds.len(),
+            });
+        }

        competitor::clean(self.agents.values_mut(), true);

@@ -516,9 +581,11 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
                (i..j).map(|e| weights[o[e]].clone()).collect::<Vec<_>>()
            };

+            let kinds_chunk: Vec<EventKind> = (i..j).map(|e| kinds[o[e]]).collect();
+
            if self.time_slices.len() > k && self.time_slices[k].time == t {
                let time_slice = &mut self.time_slices[k];
-                time_slice.add_events(composition, results, weights, &self.agents);
+                time_slice.add_events(composition, results, weights, kinds_chunk, &self.agents);

                for agent_idx in time_slice.skills.keys() {
                    let agent = self.agents.get_mut(agent_idx).unwrap();
@@ -528,7 +595,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
                }
            } else {
                let mut time_slice = TimeSlice::new(t, self.p_draw);
-                time_slice.add_events(composition, results, weights, &self.agents);
+                time_slice.add_events(composition, results, weights, kinds_chunk, &self.agents);

                self.time_slices.insert(k, time_slice);

@@ -585,6 +652,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
            vec![vec![1.0, 0.0]],
            vec![time],
            vec![],
+            vec![EventKind::Ranked],
            HashMap::new(),
        )
    }
@@ -601,6 +669,7 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
            vec![vec![0.0, 0.0]],
            vec![time],
            vec![],
+            vec![EventKind::Ranked],
            HashMap::new(),
        )
    }
@@ -625,15 +694,15 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
        let mut results: Vec<Vec<f64>> = Vec::with_capacity(events.len());
        let mut times: Vec<T> = Vec::with_capacity(events.len());
        let mut weights: Vec<Vec<Vec<f64>>> = Vec::with_capacity(events.len());
+        let mut kinds: Vec<EventKind> = Vec::with_capacity(events.len());
        let mut priors: HashMap<Index, Rating<T, D>> = HashMap::new();

        for ev in events {
-            let ranks = ev.outcome.as_ranks();
-            if ranks.len() != ev.teams.len() {
+            if ev.outcome.team_count() != ev.teams.len() {
                return Err(InferenceError::MismatchedShape {
-                    kind: "outcome ranks vs teams",
+                    kind: "outcome vs teams",
                    expected: ev.teams.len(),
-                    got: ranks.len(),
+                    got: ev.outcome.team_count(),
                });
            }

@@ -657,13 +726,24 @@ impl<T: Time, D: Drift<T>, O: Observer<T>, K: Eq + Hash + Clone> History<T, D, O
            composition.push(event_comp);
            weights.push(event_weights);

+            let event_result: Vec<f64> = match &ev.outcome {
+                crate::Outcome::Ranked(ranks) => {
                    let max_rank = ranks.iter().copied().max().unwrap_or(0) as f64;
-            let inverted: Vec<f64> = ranks.iter().map(|&r| max_rank - r as f64).collect();
-            results.push(inverted);
+                    kinds.push(EventKind::Ranked);
+                    ranks.iter().map(|&r| max_rank - r as f64).collect()
+                }
+                crate::Outcome::Scored(scores) => {
+                    kinds.push(EventKind::Scored {
+                        score_sigma: self.score_sigma,
+                    });
+                    scores.to_vec()
+                }
+            };
+            results.push(event_result);
            times.push(ev.time);
        }

-        self.add_events_with_prior(composition, results, times, weights, priors)
+        self.add_events_with_prior(composition, results, times, weights, kinds, priors)
    }
 }

@@ -1625,4 +1705,10 @@ mod tests {
        assert!(report.iterations < 30);
        assert!(report.final_step.0 <= 1e-6);
    }
+
+    #[test]
+    #[should_panic(expected = "score_sigma must be positive")]
+    fn history_builder_rejects_zero_score_sigma() {
+        let _ = History::builder().score_sigma(0.0).build();
+    }
 }
@@ -8,7 +8,8 @@ mod approx;
 pub(crate) mod arena;
 mod time;
 mod time_slice;
-pub use time_slice::TimeSlice;
+pub use time_slice::{EventKind, TimeSlice};
+mod color_group;
 mod competitor;
 mod convergence;
 pub mod drift;
@@ -1,8 +1,7 @@
 //! Outcome of a match.
 //!
-//! In T2, only `Ranked` is supported; `Scored` will be added together with
-//! `MarginFactor` in T4. The enum is `#[non_exhaustive]` so adding `Scored`
-//! is non-breaking for downstream `match` expressions.
+//! `Ranked(ranks)` for ordinal results; `Scored(scores)` for continuous
+//! per-team scores (engages `MarginFactor` in the engine).

 use smallvec::SmallVec;

@@ -10,14 +9,19 @@ use smallvec::SmallVec;
 ///
 /// `Ranked(ranks)`: lower rank = better. Equal ranks mean a tie between those
 /// teams. `ranks.len()` must equal the number of teams in the event.
+///
+/// `Scored(scores)`: higher score = better. Teams are sorted by descending score and
+/// each adjacent pair's score difference is fed to `MarginFactor` as the observed
+/// margin. `scores.len()` must equal the number of teams in the event.
 #[derive(Clone, Debug, PartialEq)]
 #[non_exhaustive]
 pub enum Outcome {
    Ranked(SmallVec<[u32; 4]>),
+    Scored(SmallVec<[f64; 4]>),
 }

 impl Outcome {
-    /// `N`-team outcome where team `winner` won and everyone else tied for last.
+    /// `n`-team outcome where team `winner` won and everyone else tied for last.
    ///
    /// Panics if `winner >= n`.
    pub fn winner(winner: u32, n: u32) -> Self {
@@ -36,16 +40,29 @@ impl Outcome {
        Self::Ranked(ranks.into_iter().collect())
    }

+    /// Explicit per-team continuous scores; higher = better.
+    pub fn scores<I: IntoIterator<Item = f64>>(scores: I) -> Self {
+        Self::Scored(scores.into_iter().collect())
+    }
+
    pub fn team_count(&self) -> usize {
        match self {
            Self::Ranked(r) => r.len(),
+            Self::Scored(s) => s.len(),
        }
    }

-    #[allow(dead_code)]
-    pub(crate) fn as_ranks(&self) -> &[u32] {
+    pub(crate) fn as_ranks(&self) -> Option<&[u32]> {
        match self {
-            Self::Ranked(r) => r,
+            Self::Ranked(r) => Some(r),
+            Self::Scored(_) => None,
+        }
+    }
+
+    pub(crate) fn as_scores(&self) -> Option<&[f64]> {
+        match self {
+            Self::Scored(s) => Some(s),
+            Self::Ranked(_) => None,
        }
    }
 }
@@ -57,26 +74,26 @@ mod tests {
    #[test]
    fn winner_two_teams() {
        let o = Outcome::winner(0, 2);
-        assert_eq!(o.as_ranks(), &[0u32, 1]);
+        assert_eq!(o.as_ranks(), Some(&[0u32, 1][..]));
        assert_eq!(o.team_count(), 2);
    }

    #[test]
    fn winner_three_teams_second_wins() {
        let o = Outcome::winner(1, 3);
-        assert_eq!(o.as_ranks(), &[1u32, 0, 1]);
+        assert_eq!(o.as_ranks(), Some(&[1u32, 0, 1][..]));
    }

    #[test]
    fn draw_three_teams() {
        let o = Outcome::draw(3);
-        assert_eq!(o.as_ranks(), &[0u32, 0, 0]);
+        assert_eq!(o.as_ranks(), Some(&[0u32, 0, 0][..]));
    }

    #[test]
    fn ranking_from_iter() {
        let o = Outcome::ranking([2, 0, 1]);
-        assert_eq!(o.as_ranks(), &[2u32, 0, 1]);
+        assert_eq!(o.as_ranks(), Some(&[2u32, 0, 1][..]));
    }

    #[test]
@@ -84,4 +101,25 @@ mod tests {
    fn winner_out_of_range_panics() {
        let _ = Outcome::winner(2, 2);
    }
+
+    #[test]
+    fn scored_two_teams() {
+        let o = Outcome::scores([10.0, 4.0]);
+        assert_eq!(o.team_count(), 2);
+        assert_eq!(o.as_scores(), Some(&[10.0, 4.0][..]));
+        assert_eq!(o.as_ranks(), None);
+    }
+
+    #[test]
+    fn scored_team_count_matches_input() {
+        let o = Outcome::scores([3.0, 1.0, 2.0, 0.0]);
+        assert_eq!(o.team_count(), 4);
+    }
+
+    #[test]
+    fn ranked_as_scores_returns_none() {
+        let o = Outcome::winner(0, 2);
+        assert!(o.as_scores().is_none());
+        assert!(o.as_ranks().is_some());
+    }
 }
@@ -7,6 +7,7 @@ use std::collections::HashMap;
 use crate::{
    Index, N_INF,
    arena::ScratchArena,
+    color_group::ColorGroups,
    drift::Drift,
    game::Game,
    gaussian::Gaussian,
@@ -43,6 +44,13 @@ impl Default for Skill {
    }
 }

+#[derive(Debug, Clone, Copy)]
+#[non_exhaustive]
+pub enum EventKind {
+    Ranked,
+    Scored { score_sigma: f64 },
+}
+
 #[derive(Debug)]
 struct Item {
    agent: Index,
@@ -81,9 +89,16 @@ pub(crate) struct Event {
    teams: Vec<Team>,
    evidence: f64,
    weights: Vec<Vec<f64>>,
+    kind: EventKind,
 }

 impl Event {
+    pub(crate) fn iter_agents(&self) -> impl Iterator<Item = Index> + '_ {
+        self.teams
+            .iter()
+            .flat_map(|t| t.items.iter().map(|it| it.agent))
+    }
+
    fn outputs(&self) -> Vec<f64> {
        self.teams
            .iter()
@@ -108,6 +123,40 @@ impl Event {
            })
            .collect::<Vec<_>>()
    }
+
+    /// Direct in-loop update: mutates self and `skills` inline with no
+    /// intermediate allocation. Used by both the sequential sweep path and,
+    /// via unsafe, by the parallel rayon path for events in the same color
+    /// group (which have disjoint agent sets — see `sweep_color_groups`).
+    fn iteration_direct<T: Time, D: Drift<T>>(
+        &mut self,
+        skills: &mut SkillStore,
+        agents: &CompetitorStore<T, D>,
+        p_draw: f64,
+        arena: &mut ScratchArena,
+    ) {
+        let teams = self.within_priors(false, false, skills, agents);
+        let result = self.outputs();
+        let g = match self.kind {
+            EventKind::Ranked => {
+                Game::ranked_with_arena(teams, &result, &self.weights, p_draw, arena)
+            }
+            EventKind::Scored { score_sigma } => {
+                Game::scored_with_arena(teams, &result, &self.weights, score_sigma, arena)
+            }
+        };
+
+        for (t, team) in self.teams.iter_mut().enumerate() {
+            for (i, item) in team.items.iter_mut().enumerate() {
+                let old_likelihood = skills.get(item.agent).unwrap().likelihood;
+                let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
+                skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
+                item.likelihood = g.likelihoods[t][i];
+            }
+        }
+
+        self.evidence = g.evidence;
+    }
 }

 #[derive(Debug)]
@@ -117,6 +166,7 @@ pub struct TimeSlice<T: Time = i64> {
    pub(crate) time: T,
    p_draw: f64,
    arena: ScratchArena,
+    pub(crate) color_groups: ColorGroups,
 }

 impl<T: Time> TimeSlice<T> {
@@ -127,14 +177,50 @@ impl<T: Time> TimeSlice<T> {
            time,
            p_draw,
            arena: ScratchArena::new(),
+            color_groups: ColorGroups::new(),
        }
    }

+    /// Recompute the color-group partition and reorder `self.events` into
+    /// color-contiguous ranges. After this call, `self.color_groups.groups[c]`
+    /// contains a contiguous ascending range of indices in `self.events`.
+    pub(crate) fn recompute_color_groups(&mut self) {
+        use crate::color_group::color_greedy;
+
+        let n = self.events.len();
+        if n == 0 {
+            self.color_groups = ColorGroups::new();
+            return;
+        }
+
+        let cg = color_greedy(n, |ev_idx| {
+            self.events[ev_idx].iter_agents().collect::<Vec<_>>()
+        });
+
+        let mut reordered: Vec<Event> = Vec::with_capacity(n);
+        let mut new_groups: Vec<Vec<usize>> = Vec::with_capacity(cg.groups.len());
+        let mut taken: Vec<Option<Event>> = self.events.drain(..).map(Some).collect();
+
+        for group in &cg.groups {
+            let mut new_indices: Vec<usize> = Vec::with_capacity(group.len());
+            for &old_idx in group {
+                let ev = taken[old_idx].take().expect("event already taken");
+                new_indices.push(reordered.len());
+                reordered.push(ev);
+            }
+            new_groups.push(new_indices);
+        }
+
+        self.events = reordered;
+        self.color_groups = ColorGroups { groups: new_groups };
+    }
+
    pub fn add_events<D: Drift<T>>(
        &mut self,
        composition: Vec<Vec<Vec<Index>>>,
        results: Vec<Vec<f64>>,
        weights: Vec<Vec<Vec<f64>>>,
+        kinds: Vec<EventKind>,
        agents: &CompetitorStore<T, D>,
    ) {
        let mut unique = Vec::with_capacity(10);
@@ -204,6 +290,7 @@ impl<T: Time> TimeSlice<T> {
                teams,
                evidence: 0.0,
                weights,
+                kind: kinds[e],
            }
        });

@@ -212,6 +299,7 @@ impl<T: Time> TimeSlice<T> {
        self.events.extend(events);

        self.iteration(from, agents);
+        self.recompute_color_groups();
    }

    pub(crate) fn posteriors(&self) -> HashMap<Index, Gaussian> {
@@ -222,22 +310,34 @@ impl<T: Time> TimeSlice<T> {
    }

    pub fn iteration<D: Drift<T>>(&mut self, from: usize, agents: &CompetitorStore<T, D>) {
+        if from > 0 || self.color_groups.is_empty() {
+            // Pass over newly added events (`from > 0`) or no color groups yet: simple sequential sweep.
            for event in self.events.iter_mut().skip(from) {
                let teams = event.within_priors(false, false, &self.skills, agents);
                let result = event.outputs();

-            let g = Game::ranked_with_arena(
+                let g = match event.kind {
+                    EventKind::Ranked => Game::ranked_with_arena(
                        teams,
                        &result,
                        &event.weights,
                        self.p_draw,
                        &mut self.arena,
-            );
+                    ),
+                    EventKind::Scored { score_sigma } => Game::scored_with_arena(
+                        teams,
+                        &result,
+                        &event.weights,
+                        score_sigma,
+                        &mut self.arena,
+                    ),
+                };

                for (t, team) in event.teams.iter_mut().enumerate() {
                    for (i, item) in team.items.iter_mut().enumerate() {
                        let old_likelihood = self.skills.get(item.agent).unwrap().likelihood;
-                    let new_likelihood = (old_likelihood / item.likelihood) * g.likelihoods[t][i];
+                        let new_likelihood =
+                            (old_likelihood / item.likelihood) * g.likelihoods[t][i];
                        self.skills.get_mut(item.agent).unwrap().likelihood = new_likelihood;
                        item.likelihood = g.likelihoods[t][i];
                    }
@@ -245,6 +345,90 @@ impl<T: Time> TimeSlice<T> {

                event.evidence = g.evidence;
            }
+        } else {
+            self.sweep_color_groups(agents);
+        }
+    }
+
+    /// Full event sweep using the color-group partition. Colors are processed
+    /// sequentially; within each color the inner loop is parallel under rayon.
+    ///
+    /// Events within each color group touch disjoint agent sets (guaranteed by
+    /// the greedy coloring). This lets each rayon thread write directly to its
+    /// events' skill likelihoods without a deferred-apply step, matching the
+    /// sequential path's allocation profile. The unsafe block is sound because:
+    ///   1. `self.events[range]` and `self.skills` are separate fields → disjoint.
+    ///   2. Events in the same color group access disjoint `Index` values in
+    ///      `self.skills`, so concurrent writes land on different memory locations.
+    ///   3. Each event only writes to its own items' likelihoods (no sharing).
+    #[cfg(feature = "rayon")]
+    fn sweep_color_groups<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
+        use rayon::prelude::*;
+
+        thread_local! {
+            static ARENA: std::cell::RefCell<ScratchArena> =
+                std::cell::RefCell::new(ScratchArena::new());
+        }
+
+        // Minimum color-group size to justify rayon's task-spawn overhead.
+        // Below this threshold, process events sequentially to avoid regression
+        // on small per-slice workloads.
+        const RAYON_THRESHOLD: usize = 64;
+
+        for color_idx in 0..self.color_groups.groups.len() {
+            let group_len = self.color_groups.groups[color_idx].len();
+            if group_len == 0 {
+                continue;
+            }
+            let range = self.color_groups.color_range(color_idx);
+            let p_draw = self.p_draw;
+
+            if group_len >= RAYON_THRESHOLD {
+                // Obtain a raw pointer from the unique `&mut self.skills` reference.
+                // Casting back to `&mut` inside the closure is sound because:
+                //   1. The pointer originates from a `&mut` — no aliasing with shared refs.
+                //   2. Events in the same color group touch disjoint `Index` slots in the
+                //      underlying Vec, so concurrent writes from different threads land on
+                //      different memory locations — no data race.
+                //   3. `self.events[range]` and `self.skills` are separate struct fields,
+                //      so the borrow splits cleanly.
+                let skills_addr: usize = (&mut self.skills as *mut SkillStore) as usize;
+                self.events[range].par_iter_mut().for_each(move |ev| {
+                    // SAFETY: see above.
+                    let skills: &mut SkillStore = unsafe { &mut *(skills_addr as *mut SkillStore) };
+                    ARENA.with(|cell| {
+                        let mut arena = cell.borrow_mut();
+                        arena.reset();
+                        ev.iteration_direct(skills, agents, p_draw, &mut arena);
+                    });
+                });
+            } else {
+                for ev in &mut self.events[range] {
+                    ev.iteration_direct(&mut self.skills, agents, p_draw, &mut self.arena);
+                }
+            }
+        }
+    }
+
+    /// Full event sweep using the color-group partition, sequential direct-write path.
+    /// Events within each color group are updated inline — no EventOutput allocation —
+    /// matching the T2 performance profile.
+    #[cfg(not(feature = "rayon"))]
+    fn sweep_color_groups<D: Drift<T>>(&mut self, agents: &CompetitorStore<T, D>) {
+        for color_idx in 0..self.color_groups.groups.len() {
+            if self.color_groups.groups[color_idx].is_empty() {
+                continue;
+            }
+            let range = self.color_groups.color_range(color_idx);
+
+            // Borrow self.events as a mutable slice for this color range.
+            // self.skills and self.arena are separate fields — disjoint borrows are
+            // allowed within a single method body.
+            let p_draw = self.p_draw;
+            for ev in &mut self.events[range] {
+                ev.iteration_direct(&mut self.skills, agents, p_draw, &mut self.arena);
+            }
+        }
    }

    #[allow(dead_code)]
@@ -316,21 +500,28 @@ impl<T: Time> TimeSlice<T> {
        // log_evidence is infrequent; a local arena avoids needing &mut self.
        let mut arena = ScratchArena::new();

+        let run_event = |event: &Event, arena: &mut ScratchArena| -> f64 {
+            let teams = event.within_priors(online, forward, &self.skills, agents);
+            let result = event.outputs();
+            match event.kind {
+                EventKind::Ranked => {
+                    Game::ranked_with_arena(teams, &result, &event.weights, self.p_draw, arena)
+                        .evidence
+                        .ln()
+                }
+                EventKind::Scored { score_sigma } => {
+                    Game::scored_with_arena(teams, &result, &event.weights, score_sigma, arena)
+                        .evidence
+                        .ln()
+                }
+            }
+        };
+
        if targets.is_empty() {
            if online || forward {
                self.events
                    .iter()
-                    .map(|event| {
-                        Game::ranked_with_arena(
-                            event.within_priors(online, forward, &self.skills, agents),
-                            &event.outputs(),
-                            &event.weights,
-                            self.p_draw,
-                            &mut arena,
-                        )
-                        .evidence
-                        .ln()
-                    })
+                    .map(|event| run_event(event, &mut arena))
                    .sum()
            } else {
                self.events.iter().map(|event| event.evidence.ln()).sum()
@@ -338,25 +529,14 @@ impl<T: Time> TimeSlice<T> {
        } else if online || forward {
            self.events
                .iter()
-                .enumerate()
-                .filter(|(_, event)| {
+                .filter(|event| {
                    event
                        .teams
                        .iter()
                        .flat_map(|team| &team.items)
                        .any(|item| targets.contains(&item.agent))
                })
-                .map(|(_, event)| {
-                    Game::ranked_with_arena(
-                        event.within_priors(online, forward, &self.skills, agents),
-                        &event.outputs(),
-                        &event.weights,
-                        self.p_draw,
-                        &mut arena,
-                    )
-                    .evidence
-                    .ln()
-                })
+                .map(|event| run_event(event, &mut arena))
                .sum()
        } else {
            self.events
@@ -451,6 +631,7 @@ mod tests {
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
+            vec![EventKind::Ranked; 3],
            &agents,
        );

@@ -527,6 +708,7 @@ mod tests {
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
+            vec![EventKind::Ranked; 3],
            &agents,
        );

@@ -606,6 +788,7 @@ mod tests {
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
+            vec![EventKind::Ranked; 3],
            &agents,
        );

@@ -637,6 +820,7 @@ mod tests {
            ],
            vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]],
            vec![],
+            vec![EventKind::Ranked; 3],
            &agents,
        );

@@ -662,4 +846,68 @@ mod tests {
            epsilon = 1e-6
        );
    }
+
+    #[test]
+    fn time_slice_color_groups_reorders_events() {
+        // ev0: [a, b]; ev1: [c, d]; ev2: [a, c]
+        // Greedy coloring: ev0→c0, ev1→c0 (disjoint), ev2→c1 (overlaps both).
+        // After recompute_color_groups, physical order is [ev0, ev1, ev2]
+        // and groups == [[0, 1], [2]].
+        let mut index_map = KeyTable::new();
+
+        let a = index_map.get_or_create("a");
+        let b = index_map.get_or_create("b");
+        let c = index_map.get_or_create("c");
+        let d = index_map.get_or_create("d");
+
+        let mut agents: CompetitorStore<i64, ConstantDrift> = CompetitorStore::new();
+
+        for agent in [a, b, c, d] {
+            agents.insert(
+                agent,
+                Competitor {
+                    rating: Rating::new(
+                        Gaussian::from_ms(25.0, 25.0 / 3.0),
+                        25.0 / 6.0,
+                        ConstantDrift(25.0 / 300.0),
+                    ),
+                    ..Default::default()
+                },
+            );
+        }
+
+        let mut ts = TimeSlice::new(0i64, 0.0);
+
+        ts.add_events(
+            vec![
+                vec![vec![a], vec![b]],
+                vec![vec![c], vec![d]],
+                vec![vec![a], vec![c]],
+            ],
+            vec![vec![1.0, 0.0], vec![1.0, 0.0], vec![1.0, 0.0]],
+            vec![],
+            vec![EventKind::Ranked; 3],
+            &agents,
+        );
+
+        assert_eq!(ts.color_groups.n_colors(), 2);
+        assert_eq!(ts.color_groups.groups[0], vec![0, 1]);
+        assert_eq!(ts.color_groups.groups[1], vec![2]);
+
+        assert_eq!(ts.color_groups.color_range(0), 0..2);
+        assert_eq!(ts.color_groups.color_range(1), 2..3);
+
+        // Events at positions 0 and 1 (color 0) must have disjoint agent sets,
+        // and the event at position 2 (color 1) must share an agent with at
+        // least one of them. Collect each event's agents to check both.
+        let agents_in_ev2: Vec<Index> = ts.events[2].iter_agents().collect();
+        let agents_in_ev0: Vec<Index> = ts.events[0].iter_agents().collect();
+        let agents_in_ev1: Vec<Index> = ts.events[1].iter_agents().collect();
+        // ev0 and ev1 must be disjoint from each other (color-0 invariant).
+        assert!(agents_in_ev0.iter().all(|ag| !agents_in_ev1.contains(ag)));
+        // ev2 must share an agent with ev0 or ev1 (it needed its own color).
+        let ev2_overlaps_ev0 = agents_in_ev2.iter().any(|ag| agents_in_ev0.contains(ag));
+        let ev2_overlaps_ev1 = agents_in_ev2.iter().any(|ag| agents_in_ev1.contains(ag));
+        assert!(ev2_overlaps_ev0 || ev2_overlaps_ev1);
+    }
 }
@@ -223,3 +223,26 @@ fn predict_outcome_two_teams_sums_to_one() {
    assert!((p[0] + p[1] - 1.0).abs() < 1e-9);
    assert!(p[0] > p[1]);
 }
+
+#[test]
+fn fluent_event_builder_scores() {
+    use trueskill_tt::ConstantDrift;
+    let mut h = History::builder()
+        .mu(25.0)
+        .sigma(25.0 / 3.0)
+        .beta(25.0 / 6.0)
+        .drift(ConstantDrift(0.0))
+        .build();
+
+    h.event(1)
+        .team(["alice"])
+        .team(["bob"])
+        .scores([12.0, 4.0])
+        .commit()
+        .unwrap();
+    h.converge().unwrap();
+
+    let a = h.current_skill(&"alice").unwrap();
+    let b = h.current_skill(&"bob").unwrap();
+    assert!(a.mu() > b.mu());
+}
@@ -0,0 +1,100 @@
+//! Determinism tests: bit-identical posteriors across rayon thread-pool
+//! sizes (1, 2, 4, 8). Only compiled with the `rayon` feature.
+
+#![cfg(feature = "rayon")]
+
+use smallvec::smallvec;
+use trueskill_tt::{ConstantDrift, ConvergenceOptions, Event, History, Member, Outcome, Team};
+
+/// Build a deterministic workload using a simple LCG (no external rand crate).
+fn build_and_converge(seed: u64) -> Vec<(i64, trueskill_tt::Gaussian)> {
+    let mut h = History::<i64, _, _, String>::builder_with_key()
+        .mu(25.0)
+        .sigma(25.0 / 3.0)
+        .beta(25.0 / 6.0)
+        .drift(ConstantDrift(25.0 / 300.0))
+        .convergence(ConvergenceOptions {
+            max_iter: 30,
+            epsilon: 1e-6,
+        })
+        .build();
+
+    // LCG for deterministic pseudo-random ints.
+    let mut rng = seed;
+    let mut next = || {
+        rng = rng
+            .wrapping_mul(6364136223846793005)
+            .wrapping_add(1442695040888963407);
+        rng
+    };
+
+    let mut events: Vec<Event<i64, String>> = Vec::with_capacity(200);
+    for ev_i in 0..200 {
+        let a = (next() % 40) as usize;
+        let mut b = (next() % 40) as usize;
+        while b == a {
+            b = (next() % 40) as usize;
+        }
+        // ~10 events per slice so color groups have material parallelism.
+        events.push(Event {
+            time: (ev_i as i64 / 10) + 1,
+            teams: smallvec![
+                Team::with_members([Member::new(format!("p{a}"))]),
+                Team::with_members([Member::new(format!("p{b}"))]),
+            ],
+            outcome: Outcome::winner((next() % 2) as u32, 2),
+        });
+    }
+    h.add_events(events).unwrap();
+    h.converge().unwrap();
+    // Sample one competitor's curve for the comparison.
+    h.learning_curve("p0")
+}
+
+#[test]
+fn posteriors_identical_across_thread_counts() {
+    let sizes = [1usize, 2, 4, 8];
+    let mut results: Vec<Vec<(i64, trueskill_tt::Gaussian)>> = Vec::new();
+    for &n in &sizes {
+        let pool = rayon::ThreadPoolBuilder::new()
+            .num_threads(n)
+            .build()
+            .expect("rayon pool build");
+        let curve = pool.install(|| build_and_converge(42));
+        results.push(curve);
+    }
+
+    let reference = &results[0];
+    for (i, curve) in results.iter().enumerate().skip(1) {
+        assert_eq!(
+            curve.len(),
+            reference.len(),
+            "curve length differs at {n} threads",
+            n = sizes[i],
+        );
+        for (j, (&(t_ref, g_ref), &(t, g))) in reference.iter().zip(curve.iter()).enumerate() {
+            assert_eq!(
+                t_ref,
+                t,
+                "time point {j} differs at {n} threads: ref={t_ref} vs got={t}",
+                n = sizes[i],
+            );
+            assert_eq!(
+                g_ref.mu().to_bits(),
+                g.mu().to_bits(),
+                "mu bits differ at {n} threads, time {t}: ref={ref_mu} got={got_mu}",
+                n = sizes[i],
+                ref_mu = g_ref.mu(),
+                got_mu = g.mu(),
+            );
+            assert_eq!(
+                g_ref.sigma().to_bits(),
+                g.sigma().to_bits(),
+                "sigma bits differ at {n} threads, time {t}: ref={ref_sigma} got={got_sigma}",
+                n = sizes[i],
+                ref_sigma = g_ref.sigma(),
+                got_sigma = g.sigma(),
+            );
+        }
+    }
+}
@@ -42,6 +42,7 @@ fn game_1v1_draw_golden() {
        Outcome::draw(2),
        &GameOptions {
            p_draw: 0.25,
+            score_sigma: 1.0,
            convergence: Default::default(),
        },
    )
@@ -45,6 +45,7 @@ fn game_ranked_rejects_bad_p_draw() {
        Outcome::winner(0, 2),
        &GameOptions {
            p_draw: 1.5,
+            score_sigma: 1.0,
            convergence: ConvergenceOptions::default(),
        },
    )
@@ -0,0 +1,139 @@
+//! Integration tests for `Outcome::Scored` routing through `History::add_events`.
+
+use smallvec::smallvec;
+use trueskill_tt::{ConstantDrift, Event, History, Member, Outcome, Team};
+
+#[test]
+fn scored_two_team_one_event_pulls_winner_up() {
+    let mut h: History = History::builder()
+        .mu(0.0)
+        .sigma(2.0)
+        .beta(1.0)
+        .drift(ConstantDrift(0.0))
+        .score_sigma(1.0)
+        .build();
+
+    let events: Vec<Event<i64, &'static str>> = vec![Event {
+        time: 1,
+        teams: smallvec![
+            Team::with_members([Member::new("a")]),
+            Team::with_members([Member::new("b")]),
+        ],
+        outcome: Outcome::scores([10.0, 4.0]),
+    }];
+    h.add_events(events).unwrap();
+
+    let mu_a = h.current_skill(&"a").unwrap().mu();
+    let mu_b = h.current_skill(&"b").unwrap().mu();
+
+    assert!(
+        mu_a > 0.0,
+        "winner mu should be pulled up; got mu_a = {mu_a}"
+    );
+    assert!(
+        mu_b < 0.0,
+        "loser mu should be pulled down; got mu_b = {mu_b}"
+    );
+    assert!(
+        mu_a > mu_b,
+        "winner mu should exceed loser mu; got mu_a = {mu_a}, mu_b = {mu_b}"
+    );
+}
+
+#[test]
+fn scored_zero_margin_treats_as_tie() {
+    let mut h: History = History::builder()
+        .mu(0.0)
+        .sigma(2.0)
+        .beta(1.0)
+        .drift(ConstantDrift(0.0))
+        .score_sigma(1.0)
+        .build();
+
+    let events: Vec<Event<i64, &'static str>> = vec![Event {
+        time: 1,
+        teams: smallvec![
+            Team::with_members([Member::new("a")]),
+            Team::with_members([Member::new("b")]),
+        ],
+        outcome: Outcome::scores([5.0, 5.0]),
+    }];
+    h.add_events(events).unwrap();
+
+    let mu_a = h.current_skill(&"a").unwrap().mu();
+    let mu_b = h.current_skill(&"b").unwrap().mu();
+    let sigma_a = h.current_skill(&"a").unwrap().sigma();
+
+    // Equal scores: posterior means stay symmetric around the prior mean.
+    assert!(
+        (mu_a - mu_b).abs() < 1e-9,
+        "equal scores should leave mu_a == mu_b; got {mu_a} vs {mu_b}"
+    );
+    assert!(
+        mu_a.abs() < 1e-9,
+        "equal scores against equal priors should leave mu near zero; got {mu_a}"
+    );
+
+    // A zero-margin scored event still reduces uncertainty.
+    assert!(
+        sigma_a < 2.0,
+        "expected sigma to tighten below prior 2.0; got {}",
+        sigma_a
+    );
+}
+
+#[test]
+fn scored_three_team_partial_order() {
+    let mut h: History = History::builder()
+        .mu(0.0)
+        .sigma(2.0)
+        .beta(1.0)
+        .drift(ConstantDrift(0.0))
+        .score_sigma(1.0)
+        .build();
+
+    let events: Vec<Event<i64, &'static str>> = vec![Event {
+        time: 1,
+        teams: smallvec![
+            Team::with_members([Member::new("a")]),
+            Team::with_members([Member::new("b")]),
+            Team::with_members([Member::new("c")]),
+        ],
+        outcome: Outcome::scores([9.0, 5.0, 1.0]),
+    }];
+    h.add_events(events).unwrap();
+
+    let mu_a = h.current_skill(&"a").unwrap().mu();
+    let mu_b = h.current_skill(&"b").unwrap().mu();
+    let mu_c = h.current_skill(&"c").unwrap().mu();
+
+    assert!(
+        mu_a > mu_b,
+        "team with highest score should rank highest; mu_a = {mu_a}, mu_b = {mu_b}"
+    );
+    assert!(
+        mu_b > mu_c,
+        "middle score should outrank lowest; mu_b = {mu_b}, mu_c = {mu_c}"
+    );
+}
+
+#[test]
+fn scored_rejects_outcome_team_count_mismatch() {
+    use trueskill_tt::InferenceError;
+
+    let mut h: History = History::builder().build();
+    let events: Vec<Event<i64, &'static str>> = vec![Event {
+        time: 1,
+        teams: smallvec![
+            Team::with_members([Member::new("a")]),
+            Team::with_members([Member::new("b")]),
+        ],
+        outcome: Outcome::scores([10.0, 4.0, 1.0]), // 3 scores, 2 teams
+    }];
+
+    let err = h.add_events(events).unwrap_err();
+    assert!(
+        matches!(err, InferenceError::MismatchedShape { .. }),
+        "expected MismatchedShape error, got {err:?}"
+    );
+}