Files
biggus-dickus/docs/plans/2026-06-02-field-definition-registry.md
T
logaritmisk da2db11a30 docs: add Plan 4 (Field-definition registry) implementation plan
Type-driven FieldType enum carrying binding; field_definition + labels tables with
a CHECK enforcing term<->vocabulary; create/read/list repository.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:09:46 +02:00

488 lines
19 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Field-Definition Registry Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** The "schema of schemas" for Approach C's flexible layer — a registry of field definitions (key, type, vocabulary/authority binding, required, group, multilingual labels). This is half of the flexible-fields subsystem; the object JSONB values + validation + audit are the next plan (Plan 5).
**Architecture:** `domain` holds `FieldDefinitionId`, a type-driven `FieldType` enum that *carries its binding* (a `Term` always has a `VocabularyId`; a non-term never does — illegal states unrepresentable), and `FieldDefinition`/`NewFieldDefinition`. `db::fields` owns the `field_definition` + `field_definition_label` tables (migration 0004) and a create/read/list repository. The DB enforces the type↔binding invariant with a CHECK constraint mirroring the enum. No values, no validation engine, no HTTP yet.
**Tech Stack:** Rust 2024, sqlx 0.8 (`time`+`json`), `serde_json` (labels json_agg). Tests use `#[sqlx::test]`.
## Design decisions (approved)
- Split flexible fields into **this plan (registry)** and **Plan 5 (values + validation + audit)**.
- `data_type` set: `text`, `localized_text`, `integer`, `date`, `boolean`, `term`, `authority`.
- `FieldType` is a type-driven enum carrying the binding; the DB stores `(data_type, vocabulary_id, authority_kind)` with a CHECK enforcing `term ⇔ vocabulary_id present`.
- Field definitions carry multilingual display labels (reusing `LocalizedLabel`) and a `group_key`.
- Scope: create/read/list of definitions. Update/delete of definitions deferred (admin UI, Plan 10).
## Prerequisites
- Postgres for tests with CREATE DATABASE rights; pass `DATABASE_URL` inline (e.g. `postgres://postgres:postgres@localhost:5433/cms_dev`). Shell env does not persist between commands.
## File Structure
```
crates/domain/
src/id.rs + FieldDefinitionId
src/field_definition.rs FieldType, FieldDefinition, NewFieldDefinition
src/lib.rs re-exports
crates/db/
migrations/0004_field_definition.sql
src/fields.rs create/read/list field definitions
src/lib.rs pub mod fields;
tests/fields.rs
```
---
## Task 1: `domain` — field definition types
**Files:** modify `crates/domain/src/id.rs`, `crates/domain/src/lib.rs`; create `crates/domain/src/field_definition.rs`.
- [ ] **Step 1: Add `FieldDefinitionId`** to `crates/domain/src/id.rs` (another `id_newtype!` invocation):
```rust
id_newtype!(
/// Identifier for a flexible-field definition.
FieldDefinitionId
);
```
- [ ] **Step 2: Create `crates/domain/src/field_definition.rs`:**
```rust
use crate::{AuthorityKind, FieldDefinitionId, LocalizedLabel, VocabularyId};
/// The type of a flexible field, carrying its binding where applicable.
///
/// Type-driven: a `Term` always names its vocabulary; a non-term never carries one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
Text,
LocalizedText,
Integer,
Date,
Boolean,
Term { vocabulary_id: VocabularyId },
Authority { kind: Option<AuthorityKind> },
}
impl FieldType {
/// The stored discriminant string.
pub fn kind_str(&self) -> &'static str {
match self {
FieldType::Text => "text",
FieldType::LocalizedText => "localized_text",
FieldType::Integer => "integer",
FieldType::Date => "date",
FieldType::Boolean => "boolean",
FieldType::Term { .. } => "term",
FieldType::Authority { .. } => "authority",
}
}
/// Decompose into the three stored columns: `(data_type, vocabulary_id, authority_kind)`.
pub fn to_parts(&self) -> (&'static str, Option<VocabularyId>, Option<AuthorityKind>) {
match self {
FieldType::Term { vocabulary_id } => ("term", Some(*vocabulary_id), None),
FieldType::Authority { kind } => ("authority", None, *kind),
other => (other.kind_str(), None, None),
}
}
/// Reconstruct from the stored columns. `None` for an unknown or inconsistent combo
/// (e.g. `term` without a vocabulary).
pub fn from_parts(
data_type: &str,
vocabulary_id: Option<VocabularyId>,
authority_kind: Option<AuthorityKind>,
) -> Option<Self> {
match data_type {
"text" => Some(FieldType::Text),
"localized_text" => Some(FieldType::LocalizedText),
"integer" => Some(FieldType::Integer),
"date" => Some(FieldType::Date),
"boolean" => Some(FieldType::Boolean),
"term" => vocabulary_id.map(|vocabulary_id| FieldType::Term { vocabulary_id }),
"authority" => Some(FieldType::Authority { kind: authority_kind }),
_ => None,
}
}
}
/// A registered flexible field, with its multilingual display labels.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FieldDefinition {
pub id: FieldDefinitionId,
pub key: String,
pub field_type: FieldType,
pub required: bool,
pub group_key: Option<String>,
pub labels: Vec<LocalizedLabel>,
}
/// A field definition to be created.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NewFieldDefinition {
pub key: String,
pub field_type: FieldType,
pub required: bool,
pub group_key: Option<String>,
pub labels: Vec<LocalizedLabel>,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn field_type_round_trips_through_parts() {
let v = VocabularyId::new();
let cases = [
FieldType::Text,
FieldType::LocalizedText,
FieldType::Integer,
FieldType::Date,
FieldType::Boolean,
FieldType::Term { vocabulary_id: v },
FieldType::Authority { kind: Some(AuthorityKind::Person) },
FieldType::Authority { kind: None },
];
for ft in cases {
let (data_type, vocabulary_id, authority_kind) = ft.to_parts();
assert_eq!(FieldType::from_parts(data_type, vocabulary_id, authority_kind), Some(ft));
}
}
#[test]
fn term_without_vocabulary_is_invalid() {
assert_eq!(FieldType::from_parts("term", None, None), None);
}
#[test]
fn unknown_type_is_none() {
assert_eq!(FieldType::from_parts("blob", None, None), None);
}
}
```
- [ ] **Step 3: Update `crates/domain/src/lib.rs`:** add `mod field_definition;` (sorted: audit, authority, field_definition, id, label, object, vocabulary), add `FieldDefinitionId` to the id re-export, and add:
```rust
pub use field_definition::{FieldDefinition, FieldType, NewFieldDefinition};
```
- [ ] **Step 4: Test + lint.** `cargo test -p domain` → all pass. `cargo +nightly fmt`; `cargo clippy -p domain --all-targets -- -D warnings` → clean.
- [ ] **Step 5: Commit.**
```bash
git add crates/domain
git commit -m "feat(domain): add field definition types (FieldType, FieldDefinition)"
```
---
## Task 2: `db` migration — field_definition tables
**Files:** create `crates/db/migrations/0004_field_definition.sql`; modify `crates/db/tests/migrate.rs`.
- [ ] **Step 1: Create `crates/db/migrations/0004_field_definition.sql`:**
```sql
-- Registry of flexible field definitions (the "schema of schemas").
CREATE TABLE field_definition (
id UUID PRIMARY KEY,
key TEXT NOT NULL UNIQUE CHECK (key <> ''),
data_type TEXT NOT NULL CHECK (data_type IN
('text', 'localized_text', 'integer', 'date', 'boolean', 'term', 'authority')),
vocabulary_id UUID REFERENCES vocabulary (id) ON DELETE RESTRICT,
authority_kind TEXT CHECK (authority_kind IN ('person', 'organisation', 'place')),
required BOOLEAN NOT NULL DEFAULT false,
group_key TEXT CHECK (group_key <> ''),
-- A term field must name a vocabulary; any other type must not.
CONSTRAINT term_has_vocabulary CHECK ((data_type = 'term') = (vocabulary_id IS NOT NULL)),
-- authority_kind is only meaningful for authority fields.
CONSTRAINT authority_kind_only_for_authority
CHECK (authority_kind IS NULL OR data_type = 'authority')
);
CREATE TABLE field_definition_label (
field_definition_id UUID NOT NULL REFERENCES field_definition (id) ON DELETE CASCADE,
lang TEXT NOT NULL CHECK (lang <> ''),
label TEXT NOT NULL CHECK (label <> ''),
PRIMARY KEY (field_definition_id, lang)
);
```
- [ ] **Step 2: Append to `crates/db/tests/migrate.rs`:**
```rust
#[sqlx::test]
async fn migrate_creates_field_definition_tables(pool: PgPool) {
let db = Db::from_pool(pool);
for table in ["field_definition", "field_definition_label"] {
let regclass: Option<String> =
sqlx::query_scalar(&format!("SELECT to_regclass('public.{table}')::text"))
.fetch_one(db.pool())
.await
.unwrap();
assert_eq!(regclass.as_deref(), Some(table), "table {table} should exist");
}
}
```
- [ ] **Step 3: Run + lint.** `DATABASE_URL=<url> cargo test -p db --test migrate` → 4 tests pass. `cargo +nightly fmt`; clippy clean.
- [ ] **Step 4: Commit.**
```bash
git add crates/db/migrations crates/db/tests/migrate.rs
git commit -m "feat(db): add field_definition tables"
```
---
## Task 3: `db::fields` repository
**Files:** create `crates/db/src/fields.rs`; modify `crates/db/src/lib.rs`; create `crates/db/tests/fields.rs`.
- [ ] **Step 1: Write the failing test** `crates/db/tests/fields.rs`:
```rust
use db::{Db, fields, vocab};
use domain::{AuthorityKind, FieldType, LocalizedLabel, NewFieldDefinition};
use sqlx::PgPool;
fn labels() -> Vec<LocalizedLabel> {
vec![
LocalizedLabel { lang: "sv".into(), label: "material".into() },
LocalizedLabel { lang: "en".into(), label: "material".into() },
]
}
#[sqlx::test]
async fn text_field_round_trips(pool: PgPool) {
let db = Db::from_pool(pool);
let mut tx = db.pool().begin().await.unwrap();
let id = fields::create_field_definition(
&mut tx,
&NewFieldDefinition {
key: "comments".into(),
field_type: FieldType::Text,
required: false,
group_key: Some("identification".into()),
labels: labels(),
},
)
.await
.unwrap();
tx.commit().await.unwrap();
let def = fields::field_definition_by_key(db.pool(), "comments").await.unwrap().unwrap();
assert_eq!(def.id, id);
assert_eq!(def.field_type, FieldType::Text);
assert_eq!(def.group_key.as_deref(), Some("identification"));
assert_eq!(def.labels.len(), 2);
assert!(fields::field_definition_by_key(db.pool(), "nope").await.unwrap().is_none());
}
#[sqlx::test]
async fn term_and_authority_fields_round_trip_their_binding(pool: PgPool) {
let db = Db::from_pool(pool);
let material = vocab::create_vocabulary(db.pool(), "material").await.unwrap();
let mut tx = db.pool().begin().await.unwrap();
fields::create_field_definition(
&mut tx,
&NewFieldDefinition {
key: "material".into(),
field_type: FieldType::Term { vocabulary_id: material.id },
required: true,
group_key: None,
labels: labels(),
},
)
.await
.unwrap();
fields::create_field_definition(
&mut tx,
&NewFieldDefinition {
key: "maker".into(),
field_type: FieldType::Authority { kind: Some(AuthorityKind::Person) },
required: false,
group_key: None,
labels: vec![LocalizedLabel { lang: "en".into(), label: "maker".into() }],
},
)
.await
.unwrap();
tx.commit().await.unwrap();
let material_def = fields::field_definition_by_key(db.pool(), "material").await.unwrap().unwrap();
assert_eq!(material_def.field_type, FieldType::Term { vocabulary_id: material.id });
assert!(material_def.required);
let maker_def = fields::field_definition_by_key(db.pool(), "maker").await.unwrap().unwrap();
assert_eq!(maker_def.field_type, FieldType::Authority { kind: Some(AuthorityKind::Person) });
let all = fields::list_field_definitions(db.pool()).await.unwrap();
assert_eq!(all.len(), 2);
}
```
- [ ] **Step 2: Run to verify it fails.** `DATABASE_URL=<url> cargo test -p db --test fields` → FAIL.
- [ ] **Step 3: Implement** `crates/db/src/fields.rs`:
```rust
//! Registry of flexible field definitions.
use domain::{
AuthorityKind, FieldDefinition, FieldDefinitionId, FieldType, LocalizedLabel,
NewFieldDefinition, VocabularyId,
};
use sqlx::Row;
/// Labels aggregated per row as JSON, to read a definition and its labels in one query.
const LABELS_JSON: &str = "COALESCE(json_agg(json_build_object('lang', fdl.lang, 'label', fdl.label) \
ORDER BY fdl.lang) FILTER (WHERE fdl.field_definition_id IS NOT NULL), '[]'::json)";
const SELECT_COLUMNS: &str =
"fd.id, fd.key, fd.data_type, fd.vocabulary_id, fd.authority_kind, fd.required, fd.group_key";
/// Create a field definition and its labels. Multiple statements — pass a
/// transaction connection (`&mut *tx`) for atomicity.
pub async fn create_field_definition(
conn: &mut sqlx::PgConnection,
new: &NewFieldDefinition,
) -> Result<FieldDefinitionId, sqlx::Error> {
let id = FieldDefinitionId::new();
let (data_type, vocabulary_id, authority_kind) = new.field_type.to_parts();
sqlx::query(
"INSERT INTO field_definition \
(id, key, data_type, vocabulary_id, authority_kind, required, group_key) \
VALUES ($1, $2, $3, $4, $5, $6, $7)",
)
.bind(id.to_uuid())
.bind(&new.key)
.bind(data_type)
.bind(vocabulary_id.map(|v| v.to_uuid()))
.bind(authority_kind.map(|k| k.as_str()))
.bind(new.required)
.bind(new.group_key.as_deref())
.execute(&mut *conn)
.await?;
for label in &new.labels {
sqlx::query(
"INSERT INTO field_definition_label (field_definition_id, lang, label) \
VALUES ($1, $2, $3)",
)
.bind(id.to_uuid())
.bind(&label.lang)
.bind(&label.label)
.execute(&mut *conn)
.await?;
}
Ok(id)
}
/// Look up a field definition by its key (with labels).
pub async fn field_definition_by_key<'e, E>(
executor: E,
key: &str,
) -> Result<Option<FieldDefinition>, sqlx::Error>
where
E: sqlx::PgExecutor<'e>,
{
let sql = format!(
"SELECT {SELECT_COLUMNS}, {LABELS_JSON} AS labels \
FROM field_definition fd \
LEFT JOIN field_definition_label fdl ON fdl.field_definition_id = fd.id \
WHERE fd.key = $1 GROUP BY fd.id"
);
let row = sqlx::query(&sql).bind(key).fetch_optional(executor).await?;
row.map(map_field_definition).transpose()
}
/// List all field definitions (with labels), ordered by key.
pub async fn list_field_definitions<'e, E>(
executor: E,
) -> Result<Vec<FieldDefinition>, sqlx::Error>
where
E: sqlx::PgExecutor<'e>,
{
let sql = format!(
"SELECT {SELECT_COLUMNS}, {LABELS_JSON} AS labels \
FROM field_definition fd \
LEFT JOIN field_definition_label fdl ON fdl.field_definition_id = fd.id \
GROUP BY fd.id ORDER BY fd.key"
);
let rows = sqlx::query(&sql).fetch_all(executor).await?;
rows.into_iter().map(map_field_definition).collect()
}
fn map_field_definition(row: sqlx::postgres::PgRow) -> Result<FieldDefinition, sqlx::Error> {
let data_type: String = row.try_get("data_type")?;
let vocabulary_id: Option<uuid::Uuid> = row.try_get("vocabulary_id")?;
let authority_kind: Option<String> = row.try_get("authority_kind")?;
let authority_kind = authority_kind
.map(|k| {
AuthorityKind::from_db(&k)
.ok_or_else(|| sqlx::Error::Decode(format!("unknown authority kind: {k}").into()))
})
.transpose()?;
let field_type = FieldType::from_parts(
&data_type,
vocabulary_id.map(VocabularyId::from_uuid),
authority_kind,
)
.ok_or_else(|| {
sqlx::Error::Decode(format!("inconsistent field type stored: {data_type}").into())
})?;
let labels: sqlx::types::Json<Vec<LocalizedLabel>> = row.try_get("labels")?;
Ok(FieldDefinition {
id: FieldDefinitionId::from_uuid(row.try_get("id")?),
key: row.try_get("key")?,
field_type,
required: row.try_get("required")?,
group_key: row.try_get("group_key")?,
labels: labels.0,
})
}
```
Add to `crates/db/src/lib.rs` (top-level): `pub mod fields;`
- [ ] **Step 4: Run to verify it passes.** `DATABASE_URL=<url> cargo test -p db --test fields` → PASS (2 tests).
- [ ] **Step 5: Full workspace check.**
```bash
cargo +nightly fmt --check
DATABASE_URL=<url> cargo clippy --workspace --all-targets -- -D warnings
DATABASE_URL=<url> cargo test --workspace
```
Expected: all green.
- [ ] **Step 6: Commit.**
```bash
git add crates/db
git commit -m "feat(db): add field-definition registry repository"
```
---
## Self-Review (completed)
**Spec coverage (§6.2 registry portion):**
- Field-definition registry: key, type, binding, required, group, multilingual labels → Tasks 13. ✓
- Type-driven `FieldType` carrying binding; DB CHECK enforces `term ⇔ vocabulary` → Task 1 (enum) + Task 2 (CHECK). ✓
- Approved `data_type` set incl. `localized_text` → Task 1. ✓
- Create/read/list; update/delete deferred to admin UI. ✓ (intentional)
- SQL confined to `db`; `domain` I/O-free. ✓
- No values/validation/HTTP (Plan 5+). ✓ (intentional)
**Placeholder scan:** none. `<url>` is the documented `DATABASE_URL`.
**Type consistency:** `FieldType`/`FieldDefinition`/`NewFieldDefinition`/`FieldDefinitionId` names+fields identical across `domain` (Task 1), the repo (Task 3), tests. `to_parts`/`from_parts` are inverse (tested in Task 1) and used symmetrically in `create_field_definition` (to_parts → binds) and `map_field_definition` (from_parts ← columns). The DB CHECK `(data_type='term') = (vocabulary_id IS NOT NULL)` mirrors `from_parts` returning None for term-without-vocabulary.
## Notes for follow-on plans
- **Plan 5 (values + validation + audit):** add `object.fields jsonb`, set/get with validation against this registry (type match; `term`/`authority` values resolved via `vocab::resolve_term`/`authority::resolve_authority`), and audit flexible-field changes on the object. Seed the Spectrum Cataloguing field set there (or a small follow-on) using `reference/spectrum-5.0-cataloguing-units-of-information.md`.
- Update/delete of field definitions (and the impact on existing values) is an admin-UI concern (Plan 10).
- `list_field_definitions` unbounded — covered by the pagination follow-up (#10) before API exposure.