Files
trueskill-tt/CHANGELOG.md
Anders Olsson db633bdafe bench,docs: capture T3 final numbers and update CHANGELOG
Batch::iteration sequential: 23.23 µs (no regression vs T2 baseline).
Gaussian ops unchanged.

End-to-end history_converge benchmark on Apple M5 Pro:
  Workload                                        seq       rayon    speedup
  500 events / 100 competitors / 10 per slice     4.03 ms   4.24 ms  1.0x
  2000 events / 200 competitors / 20 per slice   20.18 ms  19.82 ms  1.0x
  5000 events / 50000 competitors / 1 slice      11.88 ms   9.10 ms  1.3x

The spec's >=2x target is not achieved on realistic workloads. T3's
within-slice color-group parallelism only shows material benefit when
a slice holds many events AND the competitor pool is large enough to
give the greedy coloring room to partition. Typical TrueSkill
workloads don't fit that profile. Cross-slice parallelism (dirty-bit
slice skipping, spec Section 5) is the natural next step for
real-workload speedup.

Determinism verified: bit-identical posteriors across
RAYON_NUM_THREADS={1, 2, 4, 8}.

Closes T3 of docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:58:24 +02:00

235 lines
8.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Changelog
All notable changes to this project will be documented in this file.
## Unreleased — T3 concurrency
Adds rayon-backed parallel paths per Section 6 of
`docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`.
### Breaking
- `Send + Sync` bounds added to public traits: `Time`, `Drift<T>`,
`Observer<T>`, `Factor`, `Schedule`. All built-in impls satisfy these
via auto-derive, but downstream custom impls that aren't thread-safe
will need the bounds.
### New
- Opt-in `rayon` cargo feature. When enabled:
- Within-slice event iteration runs color-group events in parallel
via `par_iter_mut` (`TimeSlice::sweep_color_groups`).
- `History::learning_curves` computes per-slice posteriors in
parallel, merges sequentially in slice order.
- `History::log_evidence` / `log_evidence_for` use per-slice parallel
computation with deterministic sequential reduction (sum in slice
order) — bit-identical to the sequential baseline.
- `ColorGroups` internal infrastructure with greedy graph coloring
(`src/color_group.rs`). Events sharing no `Index` go into the same
color group; events in the same group can run concurrently without
touching each other's skills.
- `tests/determinism.rs` asserts bit-identical posteriors across
`RAYON_NUM_THREADS={1, 2, 4, 8}`.
- `benches/history_converge.rs` measures end-to-end convergence on
three workload shapes.
### Performance notes
- Default build (no rayon): `Batch::iteration` 23.23 µs — no regression
vs T2.
- With `--features rayon`:
- 500 events / 100 competitors / 10 per slice: 1.0× speedup.
- 2000 events / 200 competitors / 20 per slice: 1.0× speedup.
- 5000 events in one slice / 50k competitors: **1.3× speedup.**
- The spec targeted >2× speedup on 8-core offline converge. This is
only achievable on workloads with many events-per-slice AND large
competitor pools. **Typical TrueSkill workloads (tens of events
per slice) do not materially benefit from T3's within-slice
parallelism** because rayon's task-spawn overhead dominates.
- Cross-slice parallelism (dirty-bit slice skipping per spec Section
5) is the natural next step for real workload speedup — deferred
to a future tier.
### Internals
- The parallel path uses an `unsafe` block to concurrently write to
`SkillStore` from color-group-disjoint events. Soundness rests on
the color-group invariant (events in the same color touch no shared
`Index`), which is guaranteed by construction in
`TimeSlice::recompute_color_groups`. Sequential path unchanged.
- `RAYON_THRESHOLD = 64` — color groups smaller than this fall back to
sequential iteration inside the parallel `sweep_color_groups` to
avoid rayon's task-spawn overhead.
- Thread-local `ScratchArena` per rayon worker thread.
## Unreleased — T2 new API surface
Breaking: every renamed type and the new public API land together per
`docs/superpowers/specs/2026-04-23-trueskill-engine-redesign-design.md`
Section 7 "T2".
### Breaking renames
- `Batch``TimeSlice`
- `Player``Rating` (and the `.player` field on `Competitor` is now `.rating`)
- `Agent``Competitor`
- `IndexMap``KeyTable`
- `History` field `.batches``.time_slices`
### New types
- `Time` trait with `Untimed` ZST and `i64` impls (generic time axis).
- `Drift<T: Time>` — generified from the old `Drift` trait.
- `Event<T, K>`, `Team<K>`, `Member<K>` — typed bulk-ingest event shape.
- `Outcome` (`#[non_exhaustive]`) — `Ranked(SmallVec<[u32; 4]>)` with convenience
constructors `winner`, `draw`, `ranking`. `Scored` lands in T4.
- `Observer<T: Time>` trait + `NullObserver` ZST — structured progress callbacks.
- `ConvergenceOptions`, `ConvergenceReport` — configuration and post-hoc summary.
- `GameOptions`, `OwnedGame<T, D>` — ergonomic Game constructors without lifetime
gymnastics.
- `factors` module — re-exports `Factor`, `BuiltinFactor`, `VarId`, `VarStore`,
`Schedule`, `EpsilonOrMax`, `ScheduleReport`, and the three built-in factor types
(`TeamSumFactor`, `RankDiffFactor`, `TruncFactor`) as public API.
### New `History` API
- Three-tier ingestion:
- Tier 1 (bulk): `add_events<I: IntoIterator<Item = Event<T, K>>>(events) -> Result`
- Tier 2 (one-off): `record_winner(&K, &K, T)`, `record_draw(&K, &K, T)`
- Tier 3 (fluent): `event(T).team([...]).weights([...]).ranking([...]).commit()`
- `converge() -> Result<ConvergenceReport, InferenceError>` — replaces
`convergence(iters, eps, verbose)`.
- `current_skill(&K)`, `learning_curve(&K)`, `learning_curves()` (now keyed on `K`).
- `log_evidence()` zero-arg, `log_evidence_for(&[&K])`.
- `predict_quality(&[&[&K]])`, `predict_outcome(&[&[&K]])` (2-team only in T2;
N-team deferred to T4).
- `intern(&Q)` / `lookup(&Q)` expose the internal `KeyTable<K>` for power users.
- `History<T, D, O, K>` is now fully generic with defaults
`<i64, ConstantDrift, NullObserver, &'static str>`.
### New `Game` API
- `Game::ranked(&[&[Rating]], Outcome, &GameOptions) -> Result<OwnedGame, _>`.
- `Game::one_v_one(&Rating, &Rating, Outcome) -> Result<(Gaussian, Gaussian), _>`.
- `Game::free_for_all(&[&Rating], Outcome, &GameOptions) -> Result<OwnedGame, _>`.
- `Game::custom(...)` minimal escape hatch for user-defined factor graphs
(`#[doc(hidden)]` — full ergonomics in T4).
- `Game::log_evidence()` and `OwnedGame::log_evidence()` accessors.
### Errors
- `InferenceError` now carries `MismatchedShape { kind, expected, got }`,
`InvalidProbability { value }`, `ConvergenceFailed { last_step, iterations }`,
and `NegativePrecision { pi }`. Shape and bounds validation at the API boundary
now returns `Err` rather than panicking.
### Removed (breaking)
- `History::convergence(iters, eps, verbose)` — use `converge()`.
- `HistoryBuilder::gamma(f64)` — use `.drift(ConstantDrift(g))`.
- `HistoryBuilder::time(bool)` and `History.time: bool` — use the `Time` type parameter.
- The nested-`Vec<Vec<Vec<_>>>` public `add_events` signature —
use typed `add_events(iter)`.
- `learning_curves_by_index()` — use `learning_curves()`.
### Performance
`Batch::iteration` bench: **21.36 µs** (T1 was 22.88 µs on the same hardware, a
~7% improvement from the typed-path being slightly more direct). Gaussian
operations unchanged.
### Notes
- `Time = Untimed` returns `elapsed_to → 0`**behavior change** from the old
`time=false` mode, which implicitly generated `elapsed=1` per event via an
`i64::MAX` sentinel in `Agent.last_time`. Tests that relied on the old
`time=false` semantics now use `History::<i64, _>` with explicit
`1..=n` timestamps.
## 0.1.0 - 2026-04-23
### Features
- feat: added a Drift trait and a "default" ConstantDrift implementation
### Miscellaneous Tasks
- chore: added cliff.toml, release.toml and rustfmt.toml
- chore: clean up
### Other (unconventional)
- Initial commit.
- Begin working on batch.
- Passing tests for Batch
- Working on History struct. First test is passing.
- More test passing for History
- Added more functions to History
- Remove Display impl, better to use Debug
- Use flatten instead of flat_map
- Handle case where there is no time
- It works, or so it seems
- Use PlayerIndex instead of String
- Inline a lot of functions
- Refactor some code
- Refactor some stuff
- Port from julia version instead
- More things, better things, awesome
- More tests, more code
- More things, more tests
- Fix tests
- More tests
- More tests
- Added builder for History, and start migrating test to use builder instead.
- Update test to use builder
- Remove unused code
- Use and Index struct instead of str and String for player id
- Update example so now it works, and thats, well, good
- Update test to use assert_ulps_eq
- Fixed test
- Change time to use i64 instead of u64
- Small change
- Clean up example
- Update crates and added methods to get a key or all keys in an IndexMap
- Added a get function to IndexMap
- Agents doens't have to be behind a mutable reference in within_prior
- Agents doens't have to be behind a mutable reference in within_priors
- Refactor so we can see if there is any way to improve the performance
- Fix clippy warning
- More refactoring
- Remove warnings and refactor some code
- Added benchmark for Batch
- Added default implementation for TeamMessage
- Remove unused mut reference
- Make it more rusty
- More rustifying
- Small refactor
- Rename d to diff, and t to team
- Added more links to readme
- Fix broken link in README
- Update crates
- Clean up
- Dry my eyes
- Remove unnecessary allocations
- Fix clippy warning
- Refactor history
- Rename variables
- Move stuff around
- Added quality function
- Make quality a free standing function instead
- Improve performance
- Change assert to debug_assert
- Added todo to readme, and documentation for quality function
- Basic test for quality
- Ignore temp folder
- Update edition
- Small changes for new 2024 edition
- remove notepad
- added benchmark
### Styling
- style: cargo fmt
<!-- generated by git-cliff -->