Files
biggus-dickus/docs/VISION.md
T
logaritmisk 8f67503f45 docs: add project vision, MVP architecture spec, and reference material
- docs/VISION.md: product vision + feature catalogue (MVP / post-MVP / later)
- docs/specs/2026-06-02-mvp-architecture.md: MVP architecture + 16-entry decision log
- reference/: Spectrum 5.0 cataloguing + Riksantikvarieämbetet source material (build-time reference)
- CLAUDE.md: project guidance for Claude Code

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 00:24:53 +02:00

208 lines
9.9 KiB
Markdown

# Vision — Collection Management System (working name TBD)
> **Codename note:** the repository folder is a throwaway working name. The real
> product name is undecided and **must never appear in code** (see the
> architecture spec, "Naming"). This document uses neutral terms — "the
> platform", "the system".
## What this is
A modern **collection management system** (Swedish: *samlingsförvaltningssystem*)
for museums and other heritage organizations: software for documenting, managing,
and selectively publishing the objects in a collection. It is built around the
**Spectrum 5.0** standard (Collections Trust) and the guidance from
Riksantikvarieämbetet — see [`reference/`](../reference/) for the source material
this design is grounded in.
It is **not** primarily a web-publishing tool or a digital-asset manager. Its job
is to support the internal processes of collection management — cataloguing,
location/movement control, loans, condition, and so on — with selective public
access layered on top.
## Who it is for
- **Primary (now):** small and mid-sized **non-profit** heritage organizations —
limited budget, limited or volunteer IT, who need something *easy* but correct.
- **Roadmap:** larger institutions with professional staff and IT, who need the
full Spectrum process coverage, custom fields, and the option to run it in their
own environment.
The design tension we hold throughout: **easy by default, flexible when needed.**
A small org runs with a tiny subset and sensible defaults; an advanced org enables
the full standard, adds custom fields, and self-hosts.
## Guiding principles
1. **Small, well-tested, extensible core.** The accountability backbone is small
and strongly typed; extensibility lives in well-bounded modules around it.
2. **Make illegal states unrepresentable.** Lean on Rust's type system to remove
bug classes (newtype IDs, validated value objects, projection types, auth via
extractors). Strong types also shrink the test surface.
3. **Isolation by construction.** An organization's data must *never* bleed into
another's. We achieve this at the deployment/credential layer, not by app
discipline (see architecture spec).
4. **Easy to self-host.** The single-tenant binary *is* the self-host artifact:
one binary, minimal external dependencies, sensible defaults, local-disk
storage option, standalone auth.
5. **Standards-aligned.** Spectrum 5.0 for process/data; CIDOC-CRM / LIDO and
controlled vocabularies (Getty, KulturNav, Wikidata) on the roadmap for
interchange.
6. **Minimal custom code, reversible bets.** Prefer existing crates. Pre-1.0 we
choose dependencies on *fit*, not maturity, and isolate experimental ones
behind our own traits so swapping stays cheap.
7. **Clean public/private separation.** The public, unauthenticated surface is a
distinct, narrow boundary — which makes publishing, caching, rate-limiting, and
network lockdown all clean.
## Architecture in one paragraph
The application binary is **always single-tenant** — one running instance serves
exactly one organization and knows nothing of any other. "Multi-tenancy" is purely
a *deployment* concern: a hosted fleet runs many copies of the same binary, each
with its own Postgres database and Meilisearch index (scoped credentials) against
shared database/search servers, each on its own domain, independently rolled out.
**Self-hosting is the same binary with one database.** Data isolation is therefore
guaranteed by credentials and topology, not by `org_id` filtering in code. See
[`specs/2026-06-02-mvp-architecture.md`](specs/2026-06-02-mvp-architecture.md).
---
## Feature catalogue
Each feature is tagged **[MVP]**, **[Post-MVP]**, or **[Later]**. The MVP cut is
the smallest build that is genuinely useful *and* exercises every architectural
pillar, so nothing structural is discovered late.
### Catalogue core
- **[MVP]** Catalogue records for objects and groups of objects, with a typed
**inventory minimum** (object number, name, count, brief description, current
location, current owner, recorder, recording date).
- **[MVP]** **Hybrid flexible fields** — a field-definition registry + JSONB value
layer, seeded with the **Spectrum 5.0 Cataloguing** field set; orgs enable a
subset or the whole set without schema changes.
- **[MVP]** Object numbering with a configurable standard format; multiple
historical numbers per object.
- **[Post-MVP]** Org-defined **custom fields** beyond Spectrum (the registry
already supports it; this is the management UI + validation polish).
- **[Post-MVP]** Object groups / hierarchical relationships, related-object links.
- **[Later]** Subject-specialist templates / external cataloguing standards.
### Controlled vocabularies & authority records
- **[MVP]** **Authority records** for person, organization, place — *store once,
link many* — referenced from core and flexible fields.
- **[MVP]** **Controlled vocabularies** (term sources) for fields like material,
object name, technique; fields bound to a vocabulary accept only resolved terms.
- **[MVP]** **Multilingual labels** on terms and authorities (sv/en) in the data
model.
- **[Post-MVP]** Import/sync from external vocabularies — Getty AAT/TGN/ULAN,
KulturNav, Wikidata; storing external URIs on local terms.
- **[Later]** Linked-open-data publishing of authorities.
### Media & files
- **[MVP]** Upload and attach images/documents to records via **OpenDAL**
(S3 or local disk), behind a `BlobStore` trait.
- **[Post-MVP]** Thumbnails / derivative generation; per-reproduction licensing;
multiple reproductions per object.
- **[Later]** **IIIF** image serving; bulk/mass ingest pipelines; dedicated
image-management (DAM-style) workflows.
### Search
- **[MVP]** **Meilisearch** indexing of records; basic faceted/full-text search in
the admin UI.
- **[Post-MVP]** Saved searches, advanced filters, sort presets, search across
all fields incl. flexible fields.
- **[Later]** Public-facing search on the published catalogue.
### Audit & history
- **[MVP]** **Append-only, immutable audit log** — who/when/what with field-level
before→after diffs — covering domain writes and auth/security events; surfaced
as Spectrum **amendment history** on records.
- **[Post-MVP]** Auditing of sensitive *reads*; audit export/reporting; retention
policy controls.
### Publishing & public access
- **[MVP]** **Record-level visibility** (draft / internal / public) with a fixed
set of never-public fields (location, valuation, insurance, personal data).
- **[MVP]** **Public read API** (OpenAPI) serving only public records, only
public-safe fields (a typed `PublicView` projection).
- **[Post-MVP]** **Per-field publishability** flags; public collection landing
pages / embeddable widgets.
- **[Later]** Aggregator interoperability — **LIDO** export, **OAI-PMH** harvest,
feeds to **K-samsök/Kringla**, **Europeana**, Sveriges dataportal; Wikidata/
Wikimedia publishing.
### Authentication & access control
- **[MVP]** **Email/password** and **external OIDC** login, scoped to the single
org the instance serves; role/permission model enforced via typed extractors.
- **[Post-MVP]** Granular per-field / per-process permissions; API tokens for
integrations.
- **[Later]** **Shared identity provider** + **cross-org membership and fast
switching** (deferred by decision; revisit if multi-org usage grows).
### Import / export / portability
- **[MVP]** **Portable export**: a single **SQLite** file (metadata incl.
flattened flexible fields + vocab/authority tables) + plain media files + a
manifest — a whole-org archive, openable anywhere.
- **[Post-MVP]** Import from Excel/CSV (the common "we have a spreadsheet" path)
and from another instance's export.
- **[Post-MVP]** Migration tooling from legacy systems.
### Reporting & output
- **[Post-MVP]** Templated outputs: exhibition labels, loan letters, condition
reports, inventory lists; user-defined templates.
- **[Later]** Statistics/dashboards (records per year, % with images, etc.).
### Spectrum procedure coverage
The MVP implements **Cataloguing**. The other Spectrum 5.0 procedures are the
functional roadmap:
- **[Post-MVP] Primary procedures:** Object entry, Acquisition & accessioning,
**Location & movement control**, Inventory, Loans in, Loans out, Object exit,
Documentation planning.
- **[Later] Secondary procedures:** Rights management, Reproduction, Condition
checking & technical assessment, Conservation & collections care, Valuation,
Insurance & indemnity, Use of collections (incl. exhibitions), Emergency/disaster
planning, Damage & loss, Deaccession & disposal, Collections review, Audit.
### Internationalization
- **[MVP]** UI localization (Swedish + English); localized API validation/error
messages; multilingual vocab/authority labels; data model carries language-tagged
content values.
- **[Post-MVP]** Translation **workflow/UI** for per-field record content;
additional UI locales.
### Hosting, fleet & operations
- **[MVP]** Runs as a single instance (self-host or one hosted cell); local-disk or
S3 storage; per-instance migrations on startup.
- **[Post-MVP]** Per-org **provisioning control plane** (create DB + role + Meili
key + deployment + domain); batched/canary rollouts; A/B routing.
- **[Post-MVP]** Optional **Redis** (cache/sessions/rate-limit) with per-org key
prefixing, added only when a real bottleneck appears.
- **[Post-MVP]** In-app IP-allowlist middleware as a portable fallback for
self-hosters without ingress-level controls.
- **[Later]** Multi-Postgres sharding for large fleets; per-org Redis instances.
---
## Explicitly deferred decisions (recorded so they aren't relitigated)
- **Multi-org user switching / shared IdP** — rare case; deferred until it
demonstrably hurts.
- **Database migrations machinery** — not until 1.0. Pre-1.0 the data model is
reshaped freely (recreate, don't migrate).
- **Final product name** — TBD; never hardcoded.
- **Hosting/ops documentation** — later, but the design keeps self-host easy
throughout.