# MVP Architecture & Design

**Status:** approved design, pre-implementation
**Date:** 2026-06-02
**Scope:** the MVP, i.e. the smallest useful build that exercises every architectural
pillar. Companion to [`../VISION.md`](../VISION.md) (full feature catalogue) and
grounded in [`../../reference/`](../../reference/) (Spectrum 5.0 +
Riksantikvarieämbetet source material).

> Neutral naming throughout. No product/brand name appears in code or these docs
> (see §5 and decision D16 in §20).

---

## 1. Goals & non-goals

**Goals**
- A small, strongly-typed, well-tested **core** that is easy to extend.
- An organization can **catalogue** its collection (Spectrum Cataloguing), attach
  **media**, **search** it, control **visibility**, and expose **public** records.
- **Airtight org isolation** and a full **audit trail** from day one.
- **Easy self-hosting**: one binary, one database, minimal dependencies.

**Non-goals (MVP)**
- Other Spectrum procedures as workflows (entry, accession, loans,
  location/movement control, …): roadmap.
- Reporting/label templates, aggregator/LIDO/OAI-PMH/IIIF, translation workflow
  UI, fleet provisioning/control plane, migrations machinery (none until 1.0).

---

## 2. Guiding principles

- **Make illegal states unrepresentable** (§9). Parse, don't validate.
- **Isolation by construction** (§4): credentials + topology, not `org_id`
  filtering in code.
- **Module separation; no SQL spread.** SQL lives only in repository modules (§5,
  §8).
- **Minimal custom code, reversible dependency bets** (§18).
- **Self-host is first-class** (§3).
- **Well-tested, not overboard** (§19): strong types shrink the test surface; the
  isolation/security and the core get thorough tests; the dynamic field layer is
  validated at runtime.

---

## 3. Deployment topology & tenancy

**The application binary is always single-tenant.** One running instance serves
exactly one organization and contains no concept of "other orgs". There is **no
multi-tenant code path**. Multi-tenancy is achieved entirely at the deployment
layer:

| | Self-host | Hosted fleet |
|---|---|---|
| App instances | one (1+ pods) | one deployment **per org** (1+ pods each) |
| Postgres | its own database | **one shared server**, one **database per org** |
| Meilisearch | its own index | **one shared server**, one **index per org** |
| Files | local disk or S3 | S3 (or RWX volume per org) |
| Domain | the org's domain | each org its own domain |
| Rollout | upgrade the instance | **per-org** image bump |

Consequences (recorded):
- **Per-org rollout & schema version.** Bumping one org's image rolls out that org
  only; the instance runs its own migrations against its own database. Orgs may sit
  on different versions. (Pre-1.0: recreate rather than migrate.)
- **Files:** with more than one pod per org, files must be on shared storage (S3 or
  RWX volume); local disk is single-pod/self-host only. `BlobStore` (§11) abstracts
  this.
- **Cross-org features** (a future aggregator searching across museums; fleet
  admin) are a **separate service**, never a single org-app. Out of MVP.

## 4. Isolation model

Because each org-app holds **credentials scoped to its own database and its own
search index**, cross-org access is not merely "prevented"; it is **impossible,
because the access path does not exist**:

- **Postgres:** database-per-org + a role granted access to *only* that database.
  An instance's credentials cannot open a connection to another org's database.
- **Meilisearch:** index-per-org + an API key scoped to that org's index only.
- **No Row-Level Security needed:** there is no shared multi-org data in any
  single database to protect, and the app has no cross-org code.
- **Files:** per-org bucket/prefix (S3) or per-org volume, with scoped credentials.

Defense-in-depth / verification:
- A **single configuration chokepoint** establishes "which org am I" at startup
  from config; nothing reconstructs it ad hoc.
- **Negative tests** assert the app cannot be pointed outside its configured
  database/index and that scoped credentials reject foreign access.

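The configuration chokepoint described above could look roughly like the following. This is a minimal sketch, not the project's actual config type: the struct name `InstanceConfig`, its fields, and the variable names `DATABASE_URL`, `SEARCH_INDEX`, and `SEARCH_API_KEY` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Everything that answers "which org am I", resolved once at startup.
/// All names in this sketch are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct InstanceConfig {
    pub database_url: String,
    pub search_index: String,
    pub search_api_key: String,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    /// A required variable is absent or blank.
    Missing(&'static str),
}

impl InstanceConfig {
    /// Build the config from a key/value source (for example the process
    /// environment). Fails fast instead of falling back to defaults.
    pub fn from_vars(vars: &HashMap<String, String>) -> Result<Self, ConfigError> {
        let get = |key: &'static str| {
            vars.get(key)
                .filter(|v| !v.trim().is_empty())
                .cloned()
                .ok_or(ConfigError::Missing(key))
        };
        Ok(Self {
            database_url: get("DATABASE_URL")?,
            search_index: get("SEARCH_INDEX")?,
            search_api_key: get("SEARCH_API_KEY")?,
        })
    }
}
```

The rest of the application would receive an `InstanceConfig` value by reference and never read the environment itself, which keeps the "which org" decision in one place.
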
## 5. Crate / module layout

A Cargo **workspace** with **role-named** member crates (no brand name anywhere):

```
/                   virtual workspace
  crates/
    domain/         core types, value objects, invariants (no I/O)
    db/             sqlx repositories; ALL SQL lives here
    storage/        BlobStore trait + OpenDAL adapter (S3 / local)
    search/         search abstraction + Meilisearch adapter
    auth/           password + OIDC, session/token, extractors
    api/            axum router, handlers, OpenAPI (utoipa), public + admin
    server/         binary: config, wiring, startup, migrations runner
  web/              React SPA (separate build), consumes the OpenAPI
  migrations/       SQL migrations (post-1.0; pre-1.0 = recreate)
```

Dependency direction points inward toward `domain`. `domain` has no I/O deps.
Each crate has one clear purpose, a defined interface, and is testable in
isolation. Experimental/volatile dependencies sit behind a crate-owned trait
(`BlobStore`, the search trait, …) so they are swappable (§18).

## 6. Data model: hybrid (Approach C)

Three layers:

### 6.1 Typed relational core
The accountability backbone and the most queried/integrity-critical fields, as
real columns/tables with strong types:
- object number (configurable format), object name, number of objects,
  brief description, current location, current owner, recorder, recording date,
  **visibility**, media links, audit linkage.

### 6.2 Flexible field layer
- A **field-definition registry**: each definition has a key, data type, optional
  **vocabulary/authority binding**, validation rules, grouping, and locale
  behavior.
- Field **values** stored as **JSONB** on the record, validated at write time
  against the registry.
- The **Spectrum 5.0 Cataloguing field set** ships as **seed field definitions**
  (see [`reference/spectrum-5.0-cataloguing-units-of-information.md`](../../reference/spectrum-5.0-cataloguing-units-of-information.md)).
  Orgs enable a subset or the full set; custom fields are *data*, not migrations.
- **Trade (explicit):** this layer is **runtime-typed by design**: validated
  against definitions at runtime, not by the compiler. Hard types where structure
  is fixed (core, IDs, refs), runtime validation where it is dynamic.

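Write-time validation against the registry might be sketched as below. This is an illustration under stated assumptions, not the planned implementation: `FieldType`, `FieldValue`, `Registry`, and `FieldError` are hypothetical names, the type set is a small subset, and `FieldValue` stands in for whatever JSON value type the real code uses.

```rust
use std::collections::HashMap;

/// Data types a field definition can declare (illustrative subset).
#[derive(Debug, Clone, PartialEq)]
pub enum FieldType {
    Text,
    Integer,
    /// The value must be a term from the named term source.
    Term { source: String },
}

/// A runtime value as it might arrive from the API (stand-in for a JSON value).
#[derive(Debug, Clone, PartialEq)]
pub enum FieldValue {
    Text(String),
    Integer(i64),
    Term { source: String, term_id: u64 },
}

#[derive(Debug, PartialEq)]
pub enum FieldError {
    UnknownField(String),
    TypeMismatch { key: String },
}

/// The registry maps field keys to their definitions.
pub struct Registry {
    defs: HashMap<String, FieldType>,
}

impl Registry {
    pub fn new(defs: Vec<(String, FieldType)>) -> Self {
        Self { defs: defs.into_iter().collect() }
    }

    /// Check every value against its definition before it is written.
    pub fn validate(&self, values: &HashMap<String, FieldValue>) -> Result<(), FieldError> {
        for (key, value) in values {
            let def = self
                .defs
                .get(key)
                .ok_or_else(|| FieldError::UnknownField(key.clone()))?;
            let ok = match (def, value) {
                (FieldType::Text, FieldValue::Text(_)) => true,
                (FieldType::Integer, FieldValue::Integer(_)) => true,
                (FieldType::Term { source: want }, FieldValue::Term { source: got, .. }) => {
                    want == got
                }
                _ => false,
            };
            if !ok {
                return Err(FieldError::TypeMismatch { key: key.clone() });
            }
        }
        Ok(())
    }
}
```

Because definitions are plain data, adding a custom field means inserting a registry entry rather than changing the schema, which is the trade this layer makes explicit.
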
### 6.3 Controlled vocabularies & authority records
- First-class relational tables for **person / organization / place** authorities
  and **term sources** (vocabularies): *store once, link many*.
- Referenced from both the typed core and the flexible fields. A field bound to a
  term source accepts only a **resolved reference** (§9), never a free string.
- **Multilingual labels** (sv/en …) on terms and authorities.

### 6.4 Content i18n (capability now, workflow later)
- Localizable text values are **language-tagged in the data model from day one**
  (so no painful migration later).
- The **translation workflow/UI is post-MVP**; MVP authors enter content in one
  language while the model already supports more.

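One way to read "language-tagged from day one" is a value type that always carries a language key, even when only one language is filled in. The sketch below assumes that shape; `Lang` and `LocalizedText` are hypothetical names, and no language-tag validation is attempted.

```rust
use std::collections::BTreeMap;

/// A language tag such as "sv" or "en" (hypothetical newtype, unvalidated here).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Lang(pub String);

/// A localizable text value: one string per language.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct LocalizedText {
    values: BTreeMap<Lang, String>,
}

impl LocalizedText {
    /// MVP authoring path: a value entered in a single language.
    pub fn single(lang: &str, text: &str) -> Self {
        let mut values = BTreeMap::new();
        values.insert(Lang(lang.to_string()), text.to_string());
        Self { values }
    }

    /// A later translation workflow can add languages without a schema change.
    pub fn set(&mut self, lang: &str, text: &str) {
        self.values.insert(Lang(lang.to_string()), text.to_string());
    }

    /// Return the text in the preferred language, else in the fallback language.
    pub fn get(&self, preferred: &str, fallback: &str) -> Option<&str> {
        self.values
            .get(&Lang(preferred.to_string()))
            .or_else(|| self.values.get(&Lang(fallback.to_string())))
            .map(String::as_str)
    }
}
```

With this shape, MVP code only ever calls `single`, while stored data already has room for additional languages.
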
## 7. Surfaces & API

Two cleanly separated surfaces; this is a **load-bearing** rule:
- **Public surface** (`/api/public/**`): unauthenticated, **read-only**, serves
  only **public** records as a typed **`PublicView`** (public-safe fields only).
- **Admin/privileged surface** (everything else): authenticated, read/write.

This separation enables independent **IP/VPN lockdown** (admin behind an ingress
allowlist while public stays open), caching, and rate-limiting, all at the
ingress layer, not in app code. An optional in-app IP-allowlist middleware is a
post-MVP portable fallback.

**OpenAPI:** code-first with **utoipa**: the spec is generated from Rust
types/handlers (so it cannot drift from the code) and is the contract the React
client consumes.

## 8. Persistence & data access

- **PostgreSQL** via **sqlx** (async, compile-time-checked queries). **All SQL is
  confined to the `db` crate**, one repository per aggregate, which satisfies "no
  SQL spread everywhere" without an ORM's abstraction.
- JSONB for the flexible field values (GIN-indexable for search/filter needs).
- **No migrations until 1.0:** pre-1.0 we reshape freely (drop & recreate).
  Post-1.0, each instance runs its own migrations on startup (per-org schema
  version).

## 9. Type-driven design (cross-cutting)

- **Newtype IDs**: `ObjectId`, `OrgId`, `MediaId`, `TermId`, `AuthorityId`; never
  bare UUIDs.
- **Validated value objects**: `ObjectNumber`, `Email`, and `TermRef` /
  `AuthorityRef` that are **constructable only by resolving** against the
  vocabulary/authority. An unvalidated term cannot exist as that type. (Direct
  mapping of Spectrum's "use a standard term source / form of name".)
- **`PublicView` projection**: a distinct type carrying only public-safe fields;
  leaking an internal field on the public surface is impossible because the type
  lacks it. (Preferred over a literal `Record<Public>` generic, since visibility is
  runtime data from the DB.)
- **Visibility**: an enum with explicit transition methods (`publish`,
  `unpublish`, `archive`): a type-driven state machine, not a stringly-typed flag.
- **Auth via extractors**: public handlers take no auth extractor; privileged
  handlers require an `AuthUser` / `Authorized<Cap>` extractor, so a privileged
  handler cannot compile without proof of authorization.

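Two of the patterns above, a reference type that can only be obtained by resolving and a visibility enum with explicit transitions, can be sketched together. Everything below is an assumption for illustration: the in-memory `Vocabulary`, the `u64` standing in for a UUID, and in particular the transition rules (only `Internal` may be published), which the design does not specify. The `archive` transition is omitted because its target state is not defined here.

```rust
use std::collections::HashMap;

/// Newtype ID: a term ID cannot be mixed up with any other ID type.
/// (u64 stands in for a UUID to keep the sketch dependency-free.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TermId(pub u64);

/// A reference that exists only if the term was found in a vocabulary.
/// The inner field is private, so code outside this module cannot build one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TermRef(TermId);

impl TermRef {
    pub fn id(self) -> TermId {
        self.0
    }
}

/// Illustrative in-memory term source.
pub struct Vocabulary {
    terms: HashMap<String, TermId>,
}

impl Vocabulary {
    pub fn new(terms: Vec<(&str, u64)>) -> Self {
        let terms = terms
            .into_iter()
            .map(|(label, id)| (label.to_string(), TermId(id)))
            .collect();
        Self { terms }
    }

    /// The only way to obtain a `TermRef`: parse, don't validate.
    pub fn resolve(&self, label: &str) -> Option<TermRef> {
        self.terms.get(label).copied().map(TermRef)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Draft,
    Internal,
    Public,
}

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTransition;

impl Visibility {
    /// Assumed rule: only internal records can be published.
    pub fn publish(self) -> Result<Self, InvalidTransition> {
        match self {
            Visibility::Internal => Ok(Visibility::Public),
            _ => Err(InvalidTransition),
        }
    }

    /// Assumed rule: unpublishing returns a public record to internal.
    pub fn unpublish(self) -> Result<Self, InvalidTransition> {
        match self {
            Visibility::Public => Ok(Visibility::Internal),
            _ => Err(InvalidTransition),
        }
    }
}
```

A field bound to a term source would then store a `TermRef` rather than a `String`, so an unresolved label never reaches the repository layer.
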
## 10. Authentication & authorization

- **Email/password** + **external OIDC** (the org-app is an OIDC relying party),
  scoped to the single org the instance serves.
- **No separate IdP and no cross-org switching** in MVP (deferred; rare case).
- Sessions: stateless tokens or a sessions table in the org DB (no Redis required).
- Authorization enforced through typed extractors (§9); role/permission model kept
  simple in MVP.

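The "proof of authorization" idea behind typed extractors can be shown without a web framework. The sketch below is an assumption-laden illustration: `CanEdit`, `CanPublish`, the `Capability` trait, the string-based grants, and `publish_record` are all hypothetical, and the role/permission model itself is still an open item. In the real `api` crate the check would presumably live in an axum extractor (for example an implementation of `FromRequestParts`) rather than in a plain function.

```rust
use std::marker::PhantomData;

/// Capability markers (hypothetical names).
pub struct CanEdit;
pub struct CanPublish;

pub trait Capability {
    const NAME: &'static str;
}
impl Capability for CanEdit {
    const NAME: &'static str = "edit";
}
impl Capability for CanPublish {
    const NAME: &'static str = "publish";
}

pub struct User {
    pub email: String,
    pub grants: Vec<String>,
}

/// Proof that a user holds capability `C`. Fields are private, so the only
/// way to obtain a value is through `check`.
pub struct Authorized<C: Capability> {
    email: String,
    _cap: PhantomData<C>,
}

impl<C: Capability> Authorized<C> {
    /// Returns the proof only if the user has been granted `C`.
    pub fn check(user: &User) -> Option<Self> {
        if user.grants.iter().any(|g| g == C::NAME) {
            Some(Authorized { email: user.email.clone(), _cap: PhantomData })
        } else {
            None
        }
    }

    pub fn email(&self) -> &str {
        &self.email
    }
}

/// A privileged operation: it cannot be called without the proof value,
/// so forgetting the authorization check is a compile error, not a bug.
pub fn publish_record(auth: &Authorized<CanPublish>, object_number: &str) -> String {
    format!("{} published {}", auth.email(), object_number)
}
```

Passing an `Authorized<CanEdit>` to `publish_record` does not type-check, which is the property the extractor-based design relies on.
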
## 11. File storage

- **`BlobStore` trait** in the `storage` crate; **OpenDAL** adapter for **S3 and
  local disk**. Chosen on fit (high-level, multi-backend; our bottleneck is
  network/S3, not syscall I/O). `fusio` is watch-listed and swappable behind the
  trait (§18).
- Media files are linked to records; derivatives/thumbnails/IIIF are post-MVP.

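To illustrate why a crate-owned trait keeps the storage backend swappable, here is a minimal sketch with an in-memory implementation. The method set (`put`, `get`), the synchronous signatures, `MemoryStore`, and the `media/<object number>/<file name>` key layout are all assumptions; the real trait would most likely be async and backed by the OpenDAL adapter.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical shape of the crate-owned storage trait.
pub trait BlobStore {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()>;
    fn get(&self, key: &str) -> io::Result<Vec<u8>>;
}

/// In-memory implementation, useful for tests; not a real backend.
pub struct MemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl MemoryStore {
    pub fn new() -> Self {
        Self { blobs: HashMap::new() }
    }
}

impl BlobStore for MemoryStore {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()> {
        self.blobs.insert(key.to_string(), bytes.to_vec());
        Ok(())
    }

    fn get(&self, key: &str) -> io::Result<Vec<u8>> {
        self.blobs
            .get(key)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such blob"))
    }
}

/// Callers depend on the trait only, so the backend can change underneath.
pub fn attach_media(
    store: &mut dyn BlobStore,
    object_number: &str,
    file_name: &str,
    bytes: &[u8],
) -> io::Result<String> {
    let key = format!("media/{}/{}", object_number, file_name);
    store.put(&key, bytes)?;
    Ok(key)
}
```

Swapping OpenDAL for another library would then mean writing one new `impl BlobStore`, with no change to callers such as `attach_media`.
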
## 12. Search

- **Meilisearch**, one index per org, scoped API key. A search abstraction in the
  `search` crate; Meili adapter behind it.
- MVP: index catalogue records on write; basic full-text + facet search in admin.
- Public-facing search is post-MVP.

## 13. Audit & amendment history

- **One append-only, immutable log** in the org database: who / when / what, with
  **field-level before→after diffs**, covering domain create/update/delete and
  auth/security events.
- Doubles as Spectrum **amendment history** surfaced on catalogue records
  (Spectrum requires a transparent record of changes: never silently erase prior
  terminology).
- MVP audits **writes + auth events**; auditing reads is deferred.

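A field-level before→after diff can be computed by comparing two snapshots of a record's fields. The sketch below assumes fields are flattened to string key/value pairs; `FieldChange` and `diff_fields` are hypothetical names, and the real log entry would also carry who and when.

```rust
use std::collections::BTreeMap;

/// One changed field: `before` is None for additions, `after` is None for removals.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldChange {
    pub field: String,
    pub before: Option<String>,
    pub after: Option<String>,
}

/// Compare two snapshots and list every field whose value differs.
pub fn diff_fields(
    before: &BTreeMap<String, String>,
    after: &BTreeMap<String, String>,
) -> Vec<FieldChange> {
    let mut changes = Vec::new();
    // Fields that were changed or removed.
    for (field, old) in before {
        let new = after.get(field);
        if new != Some(old) {
            changes.push(FieldChange {
                field: field.clone(),
                before: Some(old.clone()),
                after: new.cloned(),
            });
        }
    }
    // Fields that were added.
    for (field, new) in after {
        if !before.contains_key(field) {
            changes.push(FieldChange {
                field: field.clone(),
                before: None,
                after: Some(new.clone()),
            });
        }
    }
    changes
}
```

Because each entry keeps the prior value, appending these changes to the log preserves earlier terminology instead of overwriting it, which is what the amendment-history requirement asks for.
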
## 14. Visibility & publishing

- **Record-level visibility**: `draft` / `internal` / `public`.
- A fixed **never-public** field set (location, valuation, insurance, personal
  data). Per-field publishability is post-MVP.
- Public API serves only `public` records via `PublicView`.

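The rule that the public API serves only `public` records through `PublicView` can be expressed as a projection that returns nothing for other records and has no slot for never-public fields. The field selection below is illustrative, and `CatalogueRecord` and `project` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Draft,
    Internal,
    Public,
}

/// Internal record (illustrative subset of the typed core).
pub struct CatalogueRecord {
    pub object_number: String,
    pub object_name: String,
    pub brief_description: String,
    /// Member of the never-public field set.
    pub current_location: String,
    pub visibility: Visibility,
}

/// Public projection: the type has no location field, so a handler that
/// returns `PublicView` cannot leak it.
#[derive(Debug, PartialEq)]
pub struct PublicView {
    pub object_number: String,
    pub object_name: String,
    pub brief_description: String,
}

impl PublicView {
    /// Project a record for the public surface; non-public records yield None.
    pub fn project(record: &CatalogueRecord) -> Option<Self> {
        match record.visibility {
            Visibility::Public => Some(PublicView {
                object_number: record.object_number.clone(),
                object_name: record.object_name.clone(),
                brief_description: record.brief_description.clone(),
            }),
            Visibility::Draft | Visibility::Internal => None,
        }
    }
}
```

The exhaustive `match` means that adding a new visibility state forces a decision about whether it is publicly served.
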
## 15. Export & backup (distinct)

- **Backup** (operational): `pg_dump` / PITR of the org database. Ops concern.
- **Export** (portable handover): a single **SQLite** file (metadata incl.
  flattened flexible fields + vocab/authority tables) + plain **media files** + a
  **manifest**: a whole-org archive, openable anywhere, stable long-term.

## 16. Internationalization

- **UI:** Swedish + English via a React i18n library + locale files; localized API
  validation/error messages.
- **Data:** multilingual labels on vocab/authority terms; language-tagged content
  values in the model (workflow post-MVP, §6.4).

## 17. Frontend

- **Lean React SPA**, evergreen browsers, consuming the OpenAPI. Separate build in
  `web/`.
- **"Potato hardware" = an explicit bundle-discipline budget**: small dependency
  set, code-splitting, measured bundle size as a tracked target, *not* a framework
  compromise.
- Suits the data-entry-heavy cataloguing UI (vocabulary autocomplete, dynamic field
  groups from the registry, inline validation).

## 18. Dependencies & tech stack

| Concern | Choice | Notes |
|---|---|---|
| Language | **Rust 2024** | |
| HTTP | **axum** | |
| API spec | **utoipa** (code-first OpenAPI) | drives the React client |
| DB | **PostgreSQL** + **sqlx** | SQL confined to `db` crate |
| Storage | **OpenDAL** behind `BlobStore` | S3 + local; `fusio` watch-listed |
| Search | **Meilisearch** behind a search trait | index-per-org |
| Cache | **Redis** (*deferred*) | add only when needed; key-prefixed |
| Frontend | **React** (lean SPA) | bundle budget enforced |
| i18n (FE) | React i18n lib | sv/en |

**Dependency philosophy:** pre-1.0, choose on **capability/fit, not maturity**;
isolate volatile deps behind owned traits (reversible bets); **re-evaluate each
bet before 1.0**, when the API surface and data formats lock.

## 19. Testing strategy

- **Core & domain:** thorough unit tests; strong types remove whole categories from
  the test surface.
- **Isolation/security:** dedicated **negative tests** (scoped credentials reject
  foreign access; the public surface never emits internal fields/non-public
  records).
- **Repositories:** integration tests against Postgres.
- **Flexible fields:** validation tested against field definitions.
- Deliberately **not overboard** elsewhere.

## 20. Decision log

| # | Decision | Why | Alternatives rejected |
|---|---|---|---|
| D1 | Per-org single-tenant binary; tenancy is deployment-only | Simplest core (no tenant plumbing); self-host = same artifact; isolation by construction | Shared multi-tenant app w/ `org_id`+RLS (bleed risk, complex core) |
| D2 | Database-per-org + scoped role; index-per-org + scoped key | Hard isolation; clean per-org export; no RLS | Schema-per-org (softer); shared DB + RLS (shared data path) |
| D3 | Hybrid data model (typed core + JSONB flexible + relational vocab/authority) | Small tested core + extensible tail; matches "link don't duplicate" | Fixed Spectrum schema (rigid); pure EAV/JSONB (weak integrity) |
| D4 | Type-driven design; `PublicView` projection; refs as validated types | Removes bug classes incl. public-data leaks; shrinks tests | Runtime checks only |
| D5 | sqlx + repository layer | Compile-time-checked SQL, no ORM, SQL in one place | SeaORM (more abstraction); Diesel (sync) |
| D6 | Clean public/admin surface split | Enables IP-lock/caching/publishing cleanly | Single mixed surface |
| D7 | Ingress-layer IP/VPN lockdown, admin-only-lockable | Not the app's job; public stays open | App-level firewall (fallback only) |
| D8 | Lean React SPA, evergreen + bundle budget | Growth path; ecosystem for data-entry UI; fits weak HW if disciplined | htmx/SSR (only needed for ancient browsers; none required) |
| D9 | Append-only audit w/ field diffs = amendment history | One mechanism satisfies ops audit + Spectrum requirement | Separate audit & history systems |
| D10 | Export = SQLite + files; backup = pg_dump | Portable, openable anywhere; distinct from ops backup | pg_dump as the only "export" (not portable) |
| D11 | OpenDAL behind `BlobStore` | Right altitude, multi-backend; bottleneck is network not syscalls | fusio now (lower-level, DB-engine focus); watch-listed |
| D12 | utoipa code-first OpenAPI | Spec can't drift; drives client | spec-first |
| D13 | i18n: UI+vocab labels MVP; content workflow later, model ready now | Avoids painful migration; keeps MVP small | Full content translation in MVP (too big) |
| D14 | No IdP / no cross-org switching now | Rare case; keeps auth simple | Build shared IdP now |
| D15 | No migrations until 1.0 | Freedom to reshape pre-1.0 | Migrations from day one |
| D16 | No product name in code; role-named workspace; name from config | Placeholder must never leak; trivial rename later | Hardcode a working name |

## 21. Open items for the implementation plan

- First scaffolding task: **dissolve the current `biggus-dickus` package** into the
  role-named workspace (the placeholder name must not survive into real code).
- Decide the role/permission model's MVP shape (kept minimal).
- Decide the object-number format configuration mechanism.
- Define the SQLite export schema mapping for the hybrid model.
- Choose specific crates for OIDC, JSONB validation, and React i18n during planning.