Files
biggus-dickus/docs/plans/2026-06-02-field-definition-registry.md
T
logaritmisk da2db11a30 docs: add Plan 4 (Field-definition registry) implementation plan
Type-driven FieldType enum carrying binding; field_definition + labels tables with
a CHECK enforcing term<->vocabulary; create/read/list repository.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:09:46 +02:00

19 KiB
Raw Blame History

Field-Definition Registry Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: The "schema of schemas" for Approach C's flexible layer — a registry of field definitions (key, type, vocabulary/authority binding, required, group, multilingual labels). This is half of the flexible-fields subsystem; the object JSONB values + validation + audit are the next plan (Plan 5).

Architecture: domain holds FieldDefinitionId, a type-driven FieldType enum that carries its binding (a Term always has a VocabularyId; a non-term never does — illegal states unrepresentable), and FieldDefinition/NewFieldDefinition. db::fields owns the field_definition + field_definition_label tables (migration 0004) and a create/read/list repository. The DB enforces the type↔binding invariant with a CHECK constraint mirroring the enum. No values, no validation engine, no HTTP yet.

Tech Stack: Rust 2024, sqlx 0.8 (time+json), serde_json (labels json_agg). Tests use #[sqlx::test].

Design decisions (approved)

  • Split flexible fields into this plan (registry) and Plan 5 (values + validation + audit).
  • data_type set: text, localized_text, integer, date, boolean, term, authority.
  • FieldType is a type-driven enum carrying the binding; the DB stores (data_type, vocabulary_id, authority_kind) with a CHECK enforcing term ⇔ vocabulary_id present.
  • Field definitions carry multilingual display labels (reusing LocalizedLabel) and a group_key.
  • Scope: create/read/list of definitions. Update/delete of definitions deferred (admin UI, Plan 10).

Prerequisites

  • Postgres for tests with CREATE DATABASE rights; pass DATABASE_URL inline (e.g. postgres://postgres:postgres@localhost:5433/cms_dev). Shell env does not persist between commands.

File Structure

crates/domain/
  src/id.rs                + FieldDefinitionId
  src/field_definition.rs  FieldType, FieldDefinition, NewFieldDefinition
  src/lib.rs               re-exports
crates/db/
  migrations/0004_field_definition.sql
  src/fields.rs            create/read/list field definitions
  src/lib.rs               pub mod fields;
  tests/fields.rs

Task 1: domain — field definition types

Files: modify crates/domain/src/id.rs, crates/domain/src/lib.rs; create crates/domain/src/field_definition.rs.

  • Step 1: Add FieldDefinitionId to crates/domain/src/id.rs (another id_newtype! invocation):
id_newtype!(
    /// Identifier for a flexible-field definition.
    FieldDefinitionId
);
  • Step 2: Create crates/domain/src/field_definition.rs:
use crate::{AuthorityKind, FieldDefinitionId, LocalizedLabel, VocabularyId};

/// The type of a flexible field, carrying its binding where applicable.
///
/// Type-driven: a `Term` always names its vocabulary; a non-term never carries one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Text,
    LocalizedText,
    Integer,
    Date,
    Boolean,
    Term { vocabulary_id: VocabularyId },
    Authority { kind: Option<AuthorityKind> },
}

impl FieldType {
    /// The stored discriminant string.
    pub fn kind_str(&self) -> &'static str {
        match self {
            FieldType::Text => "text",
            FieldType::LocalizedText => "localized_text",
            FieldType::Integer => "integer",
            FieldType::Date => "date",
            FieldType::Boolean => "boolean",
            FieldType::Term { .. } => "term",
            FieldType::Authority { .. } => "authority",
        }
    }

    /// Decompose into the three stored columns: `(data_type, vocabulary_id, authority_kind)`.
    pub fn to_parts(&self) -> (&'static str, Option<VocabularyId>, Option<AuthorityKind>) {
        match self {
            FieldType::Term { vocabulary_id } => ("term", Some(*vocabulary_id), None),
            FieldType::Authority { kind } => ("authority", None, *kind),
            other => (other.kind_str(), None, None),
        }
    }

    /// Reconstruct from the stored columns. `None` for an unknown or inconsistent combo
    /// (e.g. `term` without a vocabulary).
    pub fn from_parts(
        data_type: &str,
        vocabulary_id: Option<VocabularyId>,
        authority_kind: Option<AuthorityKind>,
    ) -> Option<Self> {
        match data_type {
            "text" => Some(FieldType::Text),
            "localized_text" => Some(FieldType::LocalizedText),
            "integer" => Some(FieldType::Integer),
            "date" => Some(FieldType::Date),
            "boolean" => Some(FieldType::Boolean),
            "term" => vocabulary_id.map(|vocabulary_id| FieldType::Term { vocabulary_id }),
            "authority" => Some(FieldType::Authority { kind: authority_kind }),
            _ => None,
        }
    }
}

/// A registered flexible field, with its multilingual display labels.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FieldDefinition {
    pub id: FieldDefinitionId,
    pub key: String,
    pub field_type: FieldType,
    pub required: bool,
    pub group_key: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

/// A field definition to be created.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NewFieldDefinition {
    pub key: String,
    pub field_type: FieldType,
    pub required: bool,
    pub group_key: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn field_type_round_trips_through_parts() {
        let v = VocabularyId::new();
        let cases = [
            FieldType::Text,
            FieldType::LocalizedText,
            FieldType::Integer,
            FieldType::Date,
            FieldType::Boolean,
            FieldType::Term { vocabulary_id: v },
            FieldType::Authority { kind: Some(AuthorityKind::Person) },
            FieldType::Authority { kind: None },
        ];
        for ft in cases {
            let (data_type, vocabulary_id, authority_kind) = ft.to_parts();
            assert_eq!(FieldType::from_parts(data_type, vocabulary_id, authority_kind), Some(ft));
        }
    }

    #[test]
    fn term_without_vocabulary_is_invalid() {
        assert_eq!(FieldType::from_parts("term", None, None), None);
    }

    #[test]
    fn unknown_type_is_none() {
        assert_eq!(FieldType::from_parts("blob", None, None), None);
    }
}
  • Step 3: Update crates/domain/src/lib.rs: add mod field_definition; (sorted: audit, authority, field_definition, id, label, object, vocabulary), add FieldDefinitionId to the id re-export, and add:
pub use field_definition::{FieldDefinition, FieldType, NewFieldDefinition};
  • Step 4: Test + lint. cargo test -p domain → all pass. cargo +nightly fmt; cargo clippy -p domain --all-targets -- -D warnings → clean.

  • Step 5: Commit.

git add crates/domain
git commit -m "feat(domain): add field definition types (FieldType, FieldDefinition)"

Task 2: db migration — field_definition tables

Files: create crates/db/migrations/0004_field_definition.sql; modify crates/db/tests/migrate.rs.

  • Step 1: Create crates/db/migrations/0004_field_definition.sql:
-- Registry of flexible field definitions (the "schema of schemas").
CREATE TABLE field_definition (
    id             UUID PRIMARY KEY,
    key            TEXT NOT NULL UNIQUE CHECK (key <> ''),
    data_type      TEXT NOT NULL CHECK (data_type IN
                       ('text', 'localized_text', 'integer', 'date', 'boolean', 'term', 'authority')),
    vocabulary_id  UUID REFERENCES vocabulary (id) ON DELETE RESTRICT,
    authority_kind TEXT CHECK (authority_kind IN ('person', 'organisation', 'place')),
    required       BOOLEAN NOT NULL DEFAULT false,
    group_key      TEXT CHECK (group_key <> ''),
    -- A term field must name a vocabulary; any other type must not.
    CONSTRAINT term_has_vocabulary CHECK ((data_type = 'term') = (vocabulary_id IS NOT NULL)),
    -- authority_kind is only meaningful for authority fields.
    CONSTRAINT authority_kind_only_for_authority
        CHECK (authority_kind IS NULL OR data_type = 'authority')
);

CREATE TABLE field_definition_label (
    field_definition_id UUID NOT NULL REFERENCES field_definition (id) ON DELETE CASCADE,
    lang                TEXT NOT NULL CHECK (lang <> ''),
    label               TEXT NOT NULL CHECK (label <> ''),
    PRIMARY KEY (field_definition_id, lang)
);
  • Step 2: Append to crates/db/tests/migrate.rs:
#[sqlx::test]
async fn migrate_creates_field_definition_tables(pool: PgPool) {
    let db = Db::from_pool(pool);
    for table in ["field_definition", "field_definition_label"] {
        let regclass: Option<String> =
            sqlx::query_scalar(&format!("SELECT to_regclass('public.{table}')::text"))
                .fetch_one(db.pool())
                .await
                .unwrap();
        assert_eq!(regclass.as_deref(), Some(table), "table {table} should exist");
    }
}
  • Step 3: Run + lint. DATABASE_URL=<url> cargo test -p db --test migrate → 4 tests pass. cargo +nightly fmt; clippy clean.

  • Step 4: Commit.

git add crates/db/migrations crates/db/tests/migrate.rs
git commit -m "feat(db): add field_definition tables"

Task 3: db::fields repository

Files: create crates/db/src/fields.rs; modify crates/db/src/lib.rs; create crates/db/tests/fields.rs.

  • Step 1: Write the failing test crates/db/tests/fields.rs:
use db::{Db, fields, vocab};
use domain::{AuthorityKind, FieldType, LocalizedLabel, NewFieldDefinition};
use sqlx::PgPool;

fn labels() -> Vec<LocalizedLabel> {
    vec![
        LocalizedLabel { lang: "sv".into(), label: "material".into() },
        LocalizedLabel { lang: "en".into(), label: "material".into() },
    ]
}

#[sqlx::test]
async fn text_field_round_trips(pool: PgPool) {
    let db = Db::from_pool(pool);
    let mut tx = db.pool().begin().await.unwrap();
    let id = fields::create_field_definition(
        &mut tx,
        &NewFieldDefinition {
            key: "comments".into(),
            field_type: FieldType::Text,
            required: false,
            group_key: Some("identification".into()),
            labels: labels(),
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    let def = fields::field_definition_by_key(db.pool(), "comments").await.unwrap().unwrap();
    assert_eq!(def.id, id);
    assert_eq!(def.field_type, FieldType::Text);
    assert_eq!(def.group_key.as_deref(), Some("identification"));
    assert_eq!(def.labels.len(), 2);
    assert!(fields::field_definition_by_key(db.pool(), "nope").await.unwrap().is_none());
}

#[sqlx::test]
async fn term_and_authority_fields_round_trip_their_binding(pool: PgPool) {
    let db = Db::from_pool(pool);
    let material = vocab::create_vocabulary(db.pool(), "material").await.unwrap();

    let mut tx = db.pool().begin().await.unwrap();
    fields::create_field_definition(
        &mut tx,
        &NewFieldDefinition {
            key: "material".into(),
            field_type: FieldType::Term { vocabulary_id: material.id },
            required: true,
            group_key: None,
            labels: labels(),
        },
    )
    .await
    .unwrap();
    fields::create_field_definition(
        &mut tx,
        &NewFieldDefinition {
            key: "maker".into(),
            field_type: FieldType::Authority { kind: Some(AuthorityKind::Person) },
            required: false,
            group_key: None,
            labels: vec![LocalizedLabel { lang: "en".into(), label: "maker".into() }],
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    let material_def = fields::field_definition_by_key(db.pool(), "material").await.unwrap().unwrap();
    assert_eq!(material_def.field_type, FieldType::Term { vocabulary_id: material.id });
    assert!(material_def.required);

    let maker_def = fields::field_definition_by_key(db.pool(), "maker").await.unwrap().unwrap();
    assert_eq!(maker_def.field_type, FieldType::Authority { kind: Some(AuthorityKind::Person) });

    let all = fields::list_field_definitions(db.pool()).await.unwrap();
    assert_eq!(all.len(), 2);
}
  • Step 2: Run to verify it fails. DATABASE_URL=<url> cargo test -p db --test fields → FAIL.

  • Step 3: Implement crates/db/src/fields.rs:

//! Registry of flexible field definitions.

use domain::{
    AuthorityKind, FieldDefinition, FieldDefinitionId, FieldType, LocalizedLabel,
    NewFieldDefinition, VocabularyId,
};
use sqlx::Row;

/// Labels aggregated per row as JSON, to read a definition and its labels in one query.
const LABELS_JSON: &str = "COALESCE(json_agg(json_build_object('lang', fdl.lang, 'label', fdl.label) \
     ORDER BY fdl.lang) FILTER (WHERE fdl.field_definition_id IS NOT NULL), '[]'::json)";

const SELECT_COLUMNS: &str =
    "fd.id, fd.key, fd.data_type, fd.vocabulary_id, fd.authority_kind, fd.required, fd.group_key";

/// Create a field definition and its labels. Multiple statements — pass a
/// transaction connection (`&mut *tx`) for atomicity.
pub async fn create_field_definition(
    conn: &mut sqlx::PgConnection,
    new: &NewFieldDefinition,
) -> Result<FieldDefinitionId, sqlx::Error> {
    let id = FieldDefinitionId::new();
    let (data_type, vocabulary_id, authority_kind) = new.field_type.to_parts();

    sqlx::query(
        "INSERT INTO field_definition \
         (id, key, data_type, vocabulary_id, authority_kind, required, group_key) \
         VALUES ($1, $2, $3, $4, $5, $6, $7)",
    )
    .bind(id.to_uuid())
    .bind(&new.key)
    .bind(data_type)
    .bind(vocabulary_id.map(|v| v.to_uuid()))
    .bind(authority_kind.map(|k| k.as_str()))
    .bind(new.required)
    .bind(new.group_key.as_deref())
    .execute(&mut *conn)
    .await?;

    for label in &new.labels {
        sqlx::query(
            "INSERT INTO field_definition_label (field_definition_id, lang, label) \
             VALUES ($1, $2, $3)",
        )
        .bind(id.to_uuid())
        .bind(&label.lang)
        .bind(&label.label)
        .execute(&mut *conn)
        .await?;
    }

    Ok(id)
}

/// Look up a field definition by its key (with labels).
pub async fn field_definition_by_key<'e, E>(
    executor: E,
    key: &str,
) -> Result<Option<FieldDefinition>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT {SELECT_COLUMNS}, {LABELS_JSON} AS labels \
         FROM field_definition fd \
         LEFT JOIN field_definition_label fdl ON fdl.field_definition_id = fd.id \
         WHERE fd.key = $1 GROUP BY fd.id"
    );
    let row = sqlx::query(&sql).bind(key).fetch_optional(executor).await?;
    row.map(map_field_definition).transpose()
}

/// List all field definitions (with labels), ordered by key.
pub async fn list_field_definitions<'e, E>(
    executor: E,
) -> Result<Vec<FieldDefinition>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT {SELECT_COLUMNS}, {LABELS_JSON} AS labels \
         FROM field_definition fd \
         LEFT JOIN field_definition_label fdl ON fdl.field_definition_id = fd.id \
         GROUP BY fd.id ORDER BY fd.key"
    );
    let rows = sqlx::query(&sql).fetch_all(executor).await?;
    rows.into_iter().map(map_field_definition).collect()
}

fn map_field_definition(row: sqlx::postgres::PgRow) -> Result<FieldDefinition, sqlx::Error> {
    let data_type: String = row.try_get("data_type")?;
    let vocabulary_id: Option<uuid::Uuid> = row.try_get("vocabulary_id")?;
    let authority_kind: Option<String> = row.try_get("authority_kind")?;

    let authority_kind = authority_kind
        .map(|k| {
            AuthorityKind::from_db(&k)
                .ok_or_else(|| sqlx::Error::Decode(format!("unknown authority kind: {k}").into()))
        })
        .transpose()?;

    let field_type = FieldType::from_parts(
        &data_type,
        vocabulary_id.map(VocabularyId::from_uuid),
        authority_kind,
    )
    .ok_or_else(|| {
        sqlx::Error::Decode(format!("inconsistent field type stored: {data_type}").into())
    })?;

    let labels: sqlx::types::Json<Vec<LocalizedLabel>> = row.try_get("labels")?;

    Ok(FieldDefinition {
        id: FieldDefinitionId::from_uuid(row.try_get("id")?),
        key: row.try_get("key")?,
        field_type,
        required: row.try_get("required")?,
        group_key: row.try_get("group_key")?,
        labels: labels.0,
    })
}

Add to crates/db/src/lib.rs (top-level): pub mod fields;

  • Step 4: Run to verify it passes. DATABASE_URL=<url> cargo test -p db --test fields → PASS (2 tests).

  • Step 5: Full workspace check.

cargo +nightly fmt --check
DATABASE_URL=<url> cargo clippy --workspace --all-targets -- -D warnings
DATABASE_URL=<url> cargo test --workspace

Expected: all green.

  • Step 6: Commit.
git add crates/db
git commit -m "feat(db): add field-definition registry repository"

Self-Review (completed)

Spec coverage (§6.2 registry portion):

  • Field-definition registry: key, type, binding, required, group, multilingual labels → Tasks 13. ✓
  • Type-driven FieldType carrying binding; DB CHECK enforces term ⇔ vocabulary → Task 1 (enum) + Task 2 (CHECK). ✓
  • Approved data_type set incl. localized_text → Task 1. ✓
  • Create/read/list; update/delete deferred to admin UI. ✓ (intentional)
  • SQL confined to db; domain I/O-free. ✓
  • No values/validation/HTTP (Plan 5+). ✓ (intentional)

Placeholder scan: none. <url> is the documented DATABASE_URL.

Type consistency: FieldType/FieldDefinition/NewFieldDefinition/FieldDefinitionId names+fields identical across domain (Task 1), the repo (Task 3), tests. to_parts/from_parts are inverse (tested in Task 1) and used symmetrically in create_field_definition (to_parts → binds) and map_field_definition (from_parts ← columns). The DB CHECK (data_type='term') = (vocabulary_id IS NOT NULL) mirrors from_parts returning None for term-without-vocabulary.

Notes for follow-on plans

  • Plan 5 (values + validation + audit): add object.fields jsonb, set/get with validation against this registry (type match; term/authority values resolved via vocab::resolve_term/authority::resolve_authority), and audit flexible-field changes on the object. Seed the Spectrum Cataloguing field set there (or a small follow-on) using reference/spectrum-5.0-cataloguing-units-of-information.md.
  • Update/delete of field definitions (and the impact on existing values) is an admin-UI concern (Plan 10).
  • list_field_definitions unbounded — covered by the pagination follow-up (#10) before API exposure.