AgentStore<D> is a Vec<Option<Agent<D>>>-backed store indexed directly
by Index.0, eliminating per-iteration hashing in the cross-history
forward/backward sweep. Implements Index<Index>/IndexMut<Index> for
ergonomic agent access.
AgentStore is public (so benches/batch.rs can use it). SkillStore
remains pub(crate) since Skill is pub(crate) in batch.rs.
HashMap<Index, _> is now only used for the posteriors() return value
(temporary; will be replaced in T2 with a proper typed return) and
for the add_events_with_prior(priors: HashMap<Index, Player<D>>) API
(also T2 target).
Part of T0 engine redesign.
SkillStore is a Vec<Skill>-backed dense store with a parallel present
mask, indexed directly by Index.0. Eliminates per-iteration hashing
in the within-slice convergence loop; O(1) array lookup replaces O(1)
amortised hash lookup with better cache behaviour.
Iteration order is now ascending-by-Index (was arbitrary for HashMap);
EP fixed point is order-independent so posteriors are unchanged.
Part of T0 engine redesign.
mu_sigma was deleted as part of the Gaussian nat-param rewrite (its
only callers were the old Mul/Div impls). This commit adds the
InferenceError enum as a seed for the T2 API surface, with the
NegativePrecision variant that mu_sigma would have returned.
Part of T0 engine redesign.
Mul and Div become two f64 adds/subs with no sqrt in the hot path.
mu() and sigma() are computed on demand from stored pi/tau.
Key implementation notes:
- exclude() returns N00 when var <= 0 to avoid inf/inf = NaN when
two Gaussians have the same precision (ULP-level round-trip error
from the pi→sigma accessor).
- Mul<f64> by 0.0 returns N00 (point mass at 0), matching old behavior.
- from_ms(0, 0) == N00 {pi:inf, tau:0}; from_ms(0, inf) == N_INF {pi:0, tau:0}.
Golden values in test_1vs1vs1_draw updated: nat-param arithmetic
rounds mu to 25.0 (was 24.999999) and shifts sigma by ~3e-7.
Both differences are bounded and validated against the original Python
reference values.
Part of T0 engine redesign.
- Promotes Gaussian::pi and Gaussian::tau to public so benches/gaussian.rs
compiles, then captures baseline numbers for the T0 acceptance gate.
- Fixes the divide bench: g1/g2 panicked (g1 has lower precision than g2;
cavity requires pi_num >= pi_den). Swapped to g2/g1 (well-defined).
Baseline on Apple M5 Pro:
Batch::iteration 29.840 µs
Gaussian::mul 1.568 ns (vs ~220 ps for add/sub — hot path)
Gaussian::div 1.572 ns
Bite-sized, TDD-style task breakdown for the first tier of the engine
redesign: Gaussian to natural-parameter storage, dense Vec storage
replacing HashMap, ScratchArena to eliminate per-event allocs,
Result-ifying the lone panic. No top-level public API change.
Acceptance gate: ≥3x speedup on Batch::iteration vs. baseline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comprehensive design for a multi-tier rewrite covering performance,
factor-graph extensibility, convergence scheduling, and API surface.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>