biggus-dickus/docs/plans/2026-06-02-vocabularies-authorities.md
logaritmisk 42e0a5f5f1 docs: add Plan 2 (Vocabularies & authorities) implementation plan
Unified authority table + normalized multilingual label tables; vocab/authority
repositories; TermRef/AuthorityRef validated refs; id_newtype! macro.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 08:36:03 +02:00

# Vocabularies & Authorities Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build the "store once, link many" subsystem — controlled vocabularies (term sources) and person/organisation/place authority records, both with multilingual labels — that the catalogue core will reference (`docs/specs/2026-06-02-mvp-architecture.md` §6.3).
**Architecture:** Value types and validated reference types in `domain` (pure). The `db` crate owns the tables (migration 0002) and two repositories (`db::vocab`, `db::authority`). Multilingual labels are normalized into per-entity label tables, read back via a single `json_agg` query. Reference types `TermRef`/`AuthorityRef` are produced by `db` resolve functions; hard referential integrity arrives when the catalogue FK-references terms/authorities (Plan 4). No HTTP surface yet.
**Tech Stack:** Rust 2024, sqlx 0.8 (Postgres, `time`+`json` features already enabled), `serde_json` for the aggregated-label payload. Tests use `#[sqlx::test]`.
## Design decisions (approved)
- **Unified `authority` table** with `kind ∈ {person, organisation, place}` (one FK target; kind-specific fields later).
- **Normalized per-entity label tables** (`term_label`, `authority_label`) keyed `(id, lang)`; display resolved as requested-lang → fallback → first.
- **`TermRef`/`AuthorityRef`** validated newtypes produced by `db` resolve functions; FK integrity comes in Plan 4.
- App-generated UUID ids (matches `OrgId`). An `id_newtype!` macro removes the per-id boilerplate (DRYs `OrgId` + the three new ids).
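The "validated reference" decision above can be sketched without a database. This is a minimal, std-only illustration and not the plan's implementation: `InMemoryTerms` is a hypothetical stand-in for the `term` table, the ids are plain integers instead of UUIDs, and `TermRef` has no public constructor here (the plan's `TermRef::new` is public) so that the only way to obtain one is through the membership check.

```rust
use std::collections::HashMap;

// Stand-ins for the plan's UUID-backed ids; integers keep the sketch std-only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VocabularyId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TermId(pub u32);

/// A reference that is only handed out after a membership check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TermRef {
    term_id: TermId,
    vocabulary_id: VocabularyId,
}

impl TermRef {
    pub fn term_id(self) -> TermId {
        self.term_id
    }

    pub fn vocabulary_id(self) -> VocabularyId {
        self.vocabulary_id
    }
}

/// Hypothetical in-memory stand-in for the `term` table: term id -> owning vocabulary.
#[derive(Default)]
pub struct InMemoryTerms {
    owner: HashMap<TermId, VocabularyId>,
}

impl InMemoryTerms {
    pub fn insert(&mut self, term_id: TermId, vocabulary_id: VocabularyId) {
        self.owner.insert(term_id, vocabulary_id);
    }

    /// Mirrors the intent of `resolve_term`: `Some` only if the term exists
    /// and belongs to `vocabulary_id`.
    pub fn resolve_term(&self, vocabulary_id: VocabularyId, term_id: TermId) -> Option<TermRef> {
        match self.owner.get(&term_id) {
            Some(owner) if *owner == vocabulary_id => Some(TermRef { term_id, vocabulary_id }),
            _ => None,
        }
    }
}
```

A term registered under one vocabulary resolves only against that vocabulary. Asking for it under another vocabulary, or asking for an unknown term, yields `None`, so a caller never holds a `TermRef` for an unchecked pair.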
## Prerequisites
- Postgres for tests with CREATE DATABASE rights; pass `DATABASE_URL` inline on every test/clippy command (e.g. `postgres://postgres:postgres@localhost:5433/cms_dev`). Shell env does not persist between commands.
## File Structure
```
crates/domain/
  src/id.rs          id_newtype! macro + OrgId, VocabularyId, TermId, AuthorityId
  src/label.rs       LocalizedLabel + pick_label
  src/vocabulary.rs  Vocabulary, Term, NewTerm, TermRef
  src/authority.rs   AuthorityKind, Authority, NewAuthority, AuthorityRef
  src/lib.rs         re-exports
crates/db/
  migrations/0002_vocabularies_authorities.sql
  src/vocab.rs       create_vocabulary, vocabulary_by_key, add_term, term_by_id, list_terms, resolve_term
  src/authority.rs   create_authority, authority_by_id, list_by_kind, resolve_authority
  src/lib.rs         pub mod vocab; pub mod authority;
  tests/vocab.rs
  tests/authority.rs
```
---
## Task 1: `domain` — id macro, labels, vocabulary & authority types
**Files:** modify `crates/domain/src/id.rs`, `crates/domain/src/lib.rs`; create `crates/domain/src/label.rs`, `crates/domain/src/vocabulary.rs`, `crates/domain/src/authority.rs`.
- [ ] **Step 1: Replace `crates/domain/src/id.rs`** with a macro + the four ids (keeps the existing OrgId behavior/tests):
```rust
//! Strongly-typed identifiers.

/// Define a UUID newtype identifier with the standard constructors and conversions.
macro_rules! id_newtype {
    ($(#[$meta:meta])* $name:ident) => {
        $(#[$meta])*
        #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
        #[serde(transparent)]
        pub struct $name(uuid::Uuid);

        impl $name {
            /// Generate a fresh random id.
            #[must_use]
            pub fn new() -> Self {
                Self(uuid::Uuid::new_v4())
            }

            /// Wrap an existing [`uuid::Uuid`].
            pub fn from_uuid(uuid: uuid::Uuid) -> Self {
                Self(uuid)
            }

            /// The underlying [`uuid::Uuid`].
            pub fn to_uuid(&self) -> uuid::Uuid {
                self.0
            }
        }

        impl Default for $name {
            fn default() -> Self {
                Self::new()
            }
        }

        impl std::fmt::Display for $name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                std::fmt::Display::fmt(&self.0, f)
            }
        }

        impl std::str::FromStr for $name {
            type Err = uuid::Error;

            fn from_str(s: &str) -> Result<Self, Self::Err> {
                Ok(Self(uuid::Uuid::parse_str(s)?))
            }
        }
    };
}

id_newtype!(
    /// Identifier for an organization (tenant).
    OrgId
);
id_newtype!(
    /// Identifier for a controlled vocabulary (term source).
    VocabularyId
);
id_newtype!(
    /// Identifier for a term within a vocabulary.
    TermId
);
id_newtype!(
    /// Identifier for an authority record (person, organisation, or place).
    AuthorityId
);

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_and_displays_round_trip() {
        let text = "550e8400-e29b-41d4-a716-446655440000";
        let id: OrgId = text.parse().expect("valid uuid should parse");
        assert_eq!(id.to_string(), text);
    }

    #[test]
    fn rejects_invalid_uuid() {
        assert!("not-a-uuid".parse::<OrgId>().is_err());
    }

    #[test]
    fn distinct_id_types_parse_independently() {
        let text = "550e8400-e29b-41d4-a716-446655440000";
        assert_eq!(text.parse::<VocabularyId>().unwrap().to_string(), text);
        assert_eq!(text.parse::<TermId>().unwrap().to_string(), text);
        assert_eq!(text.parse::<AuthorityId>().unwrap().to_string(), text);
    }
}
```
- [ ] **Step 2: Create `crates/domain/src/label.rs`:**
```rust
use serde::{Deserialize, Serialize};

/// A label in a specific language (BCP-47 tag, e.g. "sv", "en").
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct LocalizedLabel {
    pub lang: String,
    pub label: String,
}

/// Pick the best label for `lang`, falling back to `fallback`, then the first.
pub fn pick_label<'a>(labels: &'a [LocalizedLabel], lang: &str, fallback: &str) -> Option<&'a str> {
    labels
        .iter()
        .find(|l| l.lang == lang)
        .or_else(|| labels.iter().find(|l| l.lang == fallback))
        .or_else(|| labels.first())
        .map(|l| l.label.as_str())
}

#[cfg(test)]
mod tests {
    use super::*;

    fn sample() -> Vec<LocalizedLabel> {
        vec![
            LocalizedLabel { lang: "sv".into(), label: "trä".into() },
            LocalizedLabel { lang: "en".into(), label: "wood".into() },
        ]
    }

    #[test]
    fn prefers_requested_language() {
        assert_eq!(pick_label(&sample(), "sv", "en"), Some("trä"));
    }

    #[test]
    fn falls_back_then_first() {
        assert_eq!(pick_label(&sample(), "de", "en"), Some("wood"));
        assert_eq!(pick_label(&sample(), "de", "fr"), Some("trä"));
        assert_eq!(pick_label(&[], "sv", "en"), None);
    }
}
```
- [ ] **Step 3: Create `crates/domain/src/vocabulary.rs`:**
```rust
use serde::{Deserialize, Serialize};

use crate::{LocalizedLabel, TermId, VocabularyId};

/// A controlled vocabulary (term source), e.g. "material" or "object_name".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Vocabulary {
    pub id: VocabularyId,
    pub key: String,
}

/// A term within a vocabulary, with its multilingual labels.
#[derive(Debug, Clone, PartialEq)]
pub struct Term {
    pub id: TermId,
    pub vocabulary_id: VocabularyId,
    pub external_uri: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

/// A term to be created.
#[derive(Debug, Clone, PartialEq)]
pub struct NewTerm {
    pub vocabulary_id: VocabularyId,
    pub external_uri: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

/// A reference to a term confirmed to exist in a given vocabulary.
///
/// Obtain via `db::vocab::resolve_term`; do not construct ad hoc for
/// values that haven't been resolved.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub struct TermRef {
    term_id: TermId,
    vocabulary_id: VocabularyId,
}

impl TermRef {
    pub fn new(term_id: TermId, vocabulary_id: VocabularyId) -> Self {
        Self { term_id, vocabulary_id }
    }

    pub fn term_id(&self) -> TermId {
        self.term_id
    }

    pub fn vocabulary_id(&self) -> VocabularyId {
        self.vocabulary_id
    }
}
```
- [ ] **Step 4: Create `crates/domain/src/authority.rs`:**
```rust
use serde::{Deserialize, Serialize};

use crate::{AuthorityId, LocalizedLabel};

/// The kind of authority record.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum AuthorityKind {
    Person,
    Organisation,
    Place,
}

impl AuthorityKind {
    pub fn as_str(&self) -> &'static str {
        match self {
            AuthorityKind::Person => "person",
            AuthorityKind::Organisation => "organisation",
            AuthorityKind::Place => "place",
        }
    }

    pub fn from_db(s: &str) -> Option<Self> {
        match s {
            "person" => Some(AuthorityKind::Person),
            "organisation" => Some(AuthorityKind::Organisation),
            "place" => Some(AuthorityKind::Place),
            _ => None,
        }
    }
}

/// An authority record (person / organisation / place), with multilingual labels.
#[derive(Debug, Clone, PartialEq)]
pub struct Authority {
    pub id: AuthorityId,
    pub kind: AuthorityKind,
    pub external_uri: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

/// An authority to be created.
#[derive(Debug, Clone, PartialEq)]
pub struct NewAuthority {
    pub kind: AuthorityKind,
    pub external_uri: Option<String>,
    pub labels: Vec<LocalizedLabel>,
}

/// A reference to an authority confirmed to exist (carries its kind).
///
/// Obtain via `db::authority::resolve_authority`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub struct AuthorityRef {
    authority_id: AuthorityId,
    kind: AuthorityKind,
}

impl AuthorityRef {
    pub fn new(authority_id: AuthorityId, kind: AuthorityKind) -> Self {
        Self { authority_id, kind }
    }

    pub fn authority_id(&self) -> AuthorityId {
        self.authority_id
    }

    pub fn kind(&self) -> AuthorityKind {
        self.kind
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn kind_round_trips_via_db_string() {
        for k in [AuthorityKind::Person, AuthorityKind::Organisation, AuthorityKind::Place] {
            assert_eq!(AuthorityKind::from_db(k.as_str()), Some(k));
        }
        assert_eq!(AuthorityKind::from_db("ufo"), None);
    }
}
```
- [ ] **Step 5: Update `crates/domain/src/lib.rs`** — keep existing `mod audit;`/`mod id;` lines and their re-exports; add the new modules and re-exports. The full module/re-export block should be:
```rust
mod audit;
mod authority;
mod id;
mod label;
mod vocabulary;
pub use audit::{AuditAction, AuditActor, AuditEntry, FieldChange, NewAuditEvent};
pub use authority::{Authority, AuthorityKind, AuthorityRef, NewAuthority};
pub use id::{AuthorityId, OrgId, TermId, VocabularyId};
pub use label::{LocalizedLabel, pick_label};
pub use vocabulary::{NewTerm, Term, TermRef, Vocabulary};
```
(Keep the crate-level `//!` doc comment at the top.)
- [ ] **Step 6: Test + lint.** `cargo test -p domain` → all pass (existing audit/id tests + the new label/authority/id tests). `cargo +nightly fmt`; `cargo clippy -p domain --all-targets -- -D warnings` → clean.
- [ ] **Step 7: Commit.**
```bash
git add crates/domain
git commit -m "feat(domain): id macro + vocabulary/authority/label value types"
```
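As a usage sketch for `pick_label` from Step 2: the function returns `None` only when a record has no labels at all, so a caller that must always show something needs its own last resort. The block below is a std-only copy of `LocalizedLabel` and `pick_label` (serde derives omitted) plus a hypothetical `display_name` helper that is not part of the plan. Falling back to the external URI and then to a fixed placeholder is an assumption made for illustration.

```rust
/// Std-only copy of the plan's `LocalizedLabel` (serde derives omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LocalizedLabel {
    pub lang: String,
    pub label: String,
}

/// Same resolution order as the plan's `pick_label`: requested, fallback, first.
pub fn pick_label<'a>(labels: &'a [LocalizedLabel], lang: &str, fallback: &str) -> Option<&'a str> {
    labels
        .iter()
        .find(|l| l.lang == lang)
        .or_else(|| labels.iter().find(|l| l.lang == fallback))
        .or_else(|| labels.first())
        .map(|l| l.label.as_str())
}

/// Hypothetical helper: always produce a display string. With no labels it
/// uses the external URI if present, otherwise a fixed placeholder.
pub fn display_name(
    labels: &[LocalizedLabel],
    external_uri: Option<&str>,
    lang: &str,
    fallback: &str,
) -> String {
    pick_label(labels, lang, fallback)
        .or(external_uri)
        .unwrap_or("(unlabelled)")
        .to_owned()
}
```

Language tags are compared by exact string equality, so under this sketch (and under `pick_label` as written) a request for `"en-GB"` does not match a label stored as `"en"`; it falls through to the fallback language or the first label.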
---
## Task 2: `db` migration — vocabularies, terms, authorities, labels
**Files:** create `crates/db/migrations/0002_vocabularies_authorities.sql`; test `crates/db/tests/migrate.rs` (extend).
- [ ] **Step 1: Create `crates/db/migrations/0002_vocabularies_authorities.sql`:**
```sql
-- Controlled vocabularies (term sources) and their terms.
CREATE TABLE vocabulary (
    id  UUID PRIMARY KEY,
    key TEXT NOT NULL UNIQUE -- e.g. 'material', 'object_name'
);

CREATE TABLE term (
    id            UUID PRIMARY KEY,
    vocabulary_id UUID NOT NULL REFERENCES vocabulary (id) ON DELETE CASCADE,
    external_uri  TEXT -- e.g. Getty AAT / KulturNav / Wikidata URI
);

CREATE INDEX term_vocabulary_idx ON term (vocabulary_id);

CREATE TABLE term_label (
    term_id UUID NOT NULL REFERENCES term (id) ON DELETE CASCADE,
    lang    TEXT NOT NULL, -- BCP-47, e.g. 'sv', 'en'
    label   TEXT NOT NULL,
    PRIMARY KEY (term_id, lang)
);

-- Authority records: person / organisation / place. Store once, link many.
CREATE TABLE authority (
    id           UUID PRIMARY KEY,
    kind         TEXT NOT NULL CHECK (kind IN ('person', 'organisation', 'place')),
    external_uri TEXT
);

CREATE INDEX authority_kind_idx ON authority (kind);

CREATE TABLE authority_label (
    authority_id UUID NOT NULL REFERENCES authority (id) ON DELETE CASCADE,
    lang         TEXT NOT NULL,
    label        TEXT NOT NULL,
    PRIMARY KEY (authority_id, lang)
);
```
- [ ] **Step 2: Extend the migrate test** — add to `crates/db/tests/migrate.rs` a check that the new tables exist (append this test):
```rust
#[sqlx::test]
async fn migrate_creates_vocabulary_and_authority_tables(pool: PgPool) {
    let db = Db::from_pool(pool);
    for table in ["vocabulary", "term", "term_label", "authority", "authority_label"] {
        let regclass: Option<String> =
            sqlx::query_scalar(&format!("SELECT to_regclass('public.{table}')::text"))
                .fetch_one(db.pool())
                .await
                .unwrap();
        assert_eq!(regclass.as_deref(), Some(table), "table {table} should exist");
    }
}
```
- [ ] **Step 3: Run + lint.** `DATABASE_URL=<url> cargo test -p db --test migrate` → 2 tests pass. `cargo +nightly fmt`; `DATABASE_URL=<url> cargo clippy -p db --all-targets -- -D warnings` → clean.
- [ ] **Step 4: Commit.**
```bash
git add crates/db/migrations crates/db/tests/migrate.rs
git commit -m "feat(db): add vocabulary, term, and authority tables"
```
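One consequence of the `PRIMARY KEY (term_id, lang)` and `PRIMARY KEY (authority_id, lang)` constraints in the migration: a record can hold at most one label per language tag, so inserting two labels with the same `lang` for one record is rejected by Postgres. A caller could catch this before touching the database. The sketch below is std-only; `first_duplicate_lang` is a hypothetical helper that is not part of the plan, and `LocalizedLabel` is copied here without its serde derives.

```rust
use std::collections::HashSet;

/// Std-only copy of the plan's `LocalizedLabel` (serde derives omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LocalizedLabel {
    pub lang: String,
    pub label: String,
}

/// Hypothetical pre-insert check: returns the first language tag that occurs
/// more than once, or `None` when every tag is unique.
///
/// Tags are compared as exact strings, the same way a TEXT primary-key
/// column compares them, so "en" and "EN" count as different tags.
pub fn first_duplicate_lang(labels: &[LocalizedLabel]) -> Option<&str> {
    let mut seen = HashSet::new();
    labels
        .iter()
        .map(|l| l.lang.as_str())
        // `insert` returns false when the tag was already in the set.
        .find(|lang| !seen.insert(*lang))
}
```

Running such a check before `add_term` or `create_authority` would turn a constraint violation in the middle of a transaction into an early, descriptive error. Whether that belongs in `domain` or in the caller is left open here.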
---
## Task 3: `db::vocab` repository
**Files:** create `crates/db/src/vocab.rs`; modify `crates/db/src/lib.rs`; test `crates/db/tests/vocab.rs`.
- [ ] **Step 1: Write the failing test** `crates/db/tests/vocab.rs`:
```rust
use db::{Db, vocab};
use domain::{LocalizedLabel, NewTerm};
use sqlx::PgPool;

#[sqlx::test]
async fn vocabulary_create_and_lookup(pool: PgPool) {
    let db = Db::from_pool(pool);
    let v = vocab::create_vocabulary(db.pool(), "material").await.unwrap();
    let found = vocab::vocabulary_by_key(db.pool(), "material")
        .await
        .unwrap()
        .unwrap();
    assert_eq!(found.id, v.id);
    assert_eq!(found.key, "material");
    assert!(vocab::vocabulary_by_key(db.pool(), "nope").await.unwrap().is_none());
}

#[sqlx::test]
async fn term_with_multilingual_labels_round_trips(pool: PgPool) {
    let db = Db::from_pool(pool);
    let v = vocab::create_vocabulary(db.pool(), "material").await.unwrap();

    let mut tx = db.pool().begin().await.unwrap();
    let term_id = vocab::add_term(
        &mut *tx,
        &NewTerm {
            vocabulary_id: v.id,
            external_uri: Some("http://vocab.getty.edu/aat/300011914".into()),
            labels: vec![
                LocalizedLabel { lang: "sv".into(), label: "trä".into() },
                LocalizedLabel { lang: "en".into(), label: "wood".into() },
            ],
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    let term = vocab::term_by_id(db.pool(), term_id).await.unwrap().unwrap();
    assert_eq!(term.vocabulary_id, v.id);
    assert_eq!(
        term.external_uri.as_deref(),
        Some("http://vocab.getty.edu/aat/300011914")
    );
    assert_eq!(term.labels.len(), 2);
    assert_eq!(domain::pick_label(&term.labels, "sv", "en"), Some("trä"));
    assert_eq!(domain::pick_label(&term.labels, "de", "en"), Some("wood"));

    let listed = vocab::list_terms(db.pool(), v.id).await.unwrap();
    assert_eq!(listed.len(), 1);
    assert_eq!(listed[0].id, term_id);
}

#[sqlx::test]
async fn resolve_term_checks_vocabulary_membership(pool: PgPool) {
    let db = Db::from_pool(pool);
    let material = vocab::create_vocabulary(db.pool(), "material").await.unwrap();
    let technique = vocab::create_vocabulary(db.pool(), "technique").await.unwrap();

    let mut tx = db.pool().begin().await.unwrap();
    let term_id = vocab::add_term(
        &mut *tx,
        &NewTerm {
            vocabulary_id: material.id,
            external_uri: None,
            labels: vec![LocalizedLabel { lang: "en".into(), label: "wood".into() }],
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    assert!(vocab::resolve_term(db.pool(), material.id, term_id).await.unwrap().is_some());
    assert!(vocab::resolve_term(db.pool(), technique.id, term_id).await.unwrap().is_none());
}
```
- [ ] **Step 2: Run to verify it fails.** `DATABASE_URL=<url> cargo test -p db --test vocab` → FAIL (`db::vocab` missing).
- [ ] **Step 3: Implement** `crates/db/src/vocab.rs`:
```rust
//! Controlled vocabularies and terms.

use domain::{LocalizedLabel, NewTerm, Term, TermId, TermRef, Vocabulary, VocabularyId};
use sqlx::Row;

/// Labels aggregated per row as JSON, to read a term/its labels in one query.
const LABELS_JSON: &str = "COALESCE(json_agg(json_build_object('lang', tl.lang, 'label', tl.label) \
                           ORDER BY tl.lang) FILTER (WHERE tl.term_id IS NOT NULL), '[]'::json)";

/// Create a vocabulary with the given key.
pub async fn create_vocabulary<'e, E>(executor: E, key: &str) -> Result<Vocabulary, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let id = VocabularyId::new();
    sqlx::query("INSERT INTO vocabulary (id, key) VALUES ($1, $2)")
        .bind(id.to_uuid())
        .bind(key)
        .execute(executor)
        .await?;
    Ok(Vocabulary { id, key: key.to_owned() })
}

/// Look up a vocabulary by its key.
pub async fn vocabulary_by_key<'e, E>(
    executor: E,
    key: &str,
) -> Result<Option<Vocabulary>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let row = sqlx::query("SELECT id, key FROM vocabulary WHERE key = $1")
        .bind(key)
        .fetch_optional(executor)
        .await?;
    Ok(row.map(|r| Vocabulary {
        id: VocabularyId::from_uuid(r.get("id")),
        key: r.get("key"),
    }))
}

/// Insert a term and its labels. Multiple statements: pass a transaction
/// connection (`&mut *tx`) so the term and its labels commit atomically.
pub async fn add_term(conn: &mut sqlx::PgConnection, new: &NewTerm) -> Result<TermId, sqlx::Error> {
    let id = TermId::new();
    sqlx::query("INSERT INTO term (id, vocabulary_id, external_uri) VALUES ($1, $2, $3)")
        .bind(id.to_uuid())
        .bind(new.vocabulary_id.to_uuid())
        .bind(new.external_uri.as_deref())
        .execute(&mut *conn)
        .await?;
    for label in &new.labels {
        sqlx::query("INSERT INTO term_label (term_id, lang, label) VALUES ($1, $2, $3)")
            .bind(id.to_uuid())
            .bind(&label.lang)
            .bind(&label.label)
            .execute(&mut *conn)
            .await?;
    }
    Ok(id)
}

/// Fetch one term (with its labels).
pub async fn term_by_id<'e, E>(executor: E, id: TermId) -> Result<Option<Term>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT t.id, t.vocabulary_id, t.external_uri, {LABELS_JSON} AS labels \
         FROM term t LEFT JOIN term_label tl ON tl.term_id = t.id \
         WHERE t.id = $1 GROUP BY t.id"
    );
    let row = sqlx::query(&sql).bind(id.to_uuid()).fetch_optional(executor).await?;
    row.map(map_term).transpose()
}

/// List all terms in a vocabulary (with labels), ordered by id.
pub async fn list_terms<'e, E>(
    executor: E,
    vocabulary_id: VocabularyId,
) -> Result<Vec<Term>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT t.id, t.vocabulary_id, t.external_uri, {LABELS_JSON} AS labels \
         FROM term t LEFT JOIN term_label tl ON tl.term_id = t.id \
         WHERE t.vocabulary_id = $1 GROUP BY t.id ORDER BY t.id"
    );
    let rows = sqlx::query(&sql).bind(vocabulary_id.to_uuid()).fetch_all(executor).await?;
    rows.into_iter().map(map_term).collect()
}

/// Resolve a term to a [`TermRef`], confirming it belongs to `vocabulary_id`.
pub async fn resolve_term<'e, E>(
    executor: E,
    vocabulary_id: VocabularyId,
    term_id: TermId,
) -> Result<Option<TermRef>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let found = sqlx::query_scalar::<_, i32>(
        "SELECT 1 FROM term WHERE id = $1 AND vocabulary_id = $2",
    )
    .bind(term_id.to_uuid())
    .bind(vocabulary_id.to_uuid())
    .fetch_optional(executor)
    .await?;
    Ok(found.map(|_| TermRef::new(term_id, vocabulary_id)))
}

/// Map one aggregated row (term columns plus a JSON `labels` array) to a [`Term`].
fn map_term(row: sqlx::postgres::PgRow) -> Result<Term, sqlx::Error> {
    let labels: sqlx::types::Json<Vec<LocalizedLabel>> = row.try_get("labels")?;
    Ok(Term {
        id: TermId::from_uuid(row.try_get("id")?),
        vocabulary_id: VocabularyId::from_uuid(row.try_get("vocabulary_id")?),
        external_uri: row.try_get("external_uri")?,
        labels: labels.0,
    })
}
```
Add to `crates/db/src/lib.rs` (top-level): `pub mod vocab;`
- [ ] **Step 4: Run to verify it passes.** `DATABASE_URL=<url> cargo test -p db --test vocab` → PASS (3 tests).
- [ ] **Step 5: Lint.** `cargo +nightly fmt`; `DATABASE_URL=<url> cargo clippy -p db --all-targets -- -D warnings` → clean.
- [ ] **Step 6: Commit.**
```bash
git add crates/db
git commit -m "feat(db): add vocabulary/term repository with multilingual labels"
```
---
## Task 4: `db::authority` repository
**Files:** create `crates/db/src/authority.rs`; modify `crates/db/src/lib.rs`; test `crates/db/tests/authority.rs`.
- [ ] **Step 1: Write the failing test** `crates/db/tests/authority.rs`:
```rust
use db::{Db, authority};
use domain::{AuthorityKind, LocalizedLabel, NewAuthority};
use sqlx::PgPool;

fn new_person(name_sv: &str, name_en: &str) -> NewAuthority {
    NewAuthority {
        kind: AuthorityKind::Person,
        external_uri: None,
        labels: vec![
            LocalizedLabel { lang: "sv".into(), label: name_sv.into() },
            LocalizedLabel { lang: "en".into(), label: name_en.into() },
        ],
    }
}

#[sqlx::test]
async fn authority_round_trips_with_labels(pool: PgPool) {
    let db = Db::from_pool(pool);
    let mut tx = db.pool().begin().await.unwrap();
    let id = authority::create_authority(&mut *tx, &new_person("Carl Larsson", "Carl Larsson"))
        .await
        .unwrap();
    tx.commit().await.unwrap();

    let got = authority::authority_by_id(db.pool(), id).await.unwrap().unwrap();
    assert_eq!(got.id, id);
    assert_eq!(got.kind, AuthorityKind::Person);
    assert_eq!(got.labels.len(), 2);
    assert_eq!(domain::pick_label(&got.labels, "sv", "en"), Some("Carl Larsson"));
}

#[sqlx::test]
async fn list_by_kind_filters(pool: PgPool) {
    let db = Db::from_pool(pool);
    let mut tx = db.pool().begin().await.unwrap();
    authority::create_authority(&mut *tx, &new_person("A", "A")).await.unwrap();
    authority::create_authority(
        &mut *tx,
        &NewAuthority {
            kind: AuthorityKind::Place,
            external_uri: None,
            labels: vec![LocalizedLabel { lang: "en".into(), label: "Stockholm".into() }],
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    let people = authority::list_by_kind(db.pool(), AuthorityKind::Person).await.unwrap();
    assert_eq!(people.len(), 1);
    assert_eq!(people[0].kind, AuthorityKind::Person);

    let places = authority::list_by_kind(db.pool(), AuthorityKind::Place).await.unwrap();
    assert_eq!(places.len(), 1);
    assert_eq!(places[0].kind, AuthorityKind::Place);
}

#[sqlx::test]
async fn resolve_authority_returns_kind(pool: PgPool) {
    let db = Db::from_pool(pool);
    let mut tx = db.pool().begin().await.unwrap();
    let id = authority::create_authority(&mut *tx, &new_person("X", "X")).await.unwrap();
    tx.commit().await.unwrap();

    let r = authority::resolve_authority(db.pool(), id).await.unwrap().unwrap();
    assert_eq!(r.authority_id(), id);
    assert_eq!(r.kind(), AuthorityKind::Person);

    let missing = authority::resolve_authority(db.pool(), domain::AuthorityId::new())
        .await
        .unwrap();
    assert!(missing.is_none());
}
```
- [ ] **Step 2: Run to verify it fails.** `DATABASE_URL=<url> cargo test -p db --test authority` → FAIL.
- [ ] **Step 3: Implement** `crates/db/src/authority.rs`:
```rust
//! Authority records (person / organisation / place).

use domain::{Authority, AuthorityId, AuthorityKind, AuthorityRef, LocalizedLabel, NewAuthority};
use sqlx::Row;

/// Labels aggregated per row as JSON, to read an authority and its labels in one query.
const LABELS_JSON: &str = "COALESCE(json_agg(json_build_object('lang', al.lang, 'label', al.label) \
                           ORDER BY al.lang) FILTER (WHERE al.authority_id IS NOT NULL), '[]'::json)";

/// Insert an authority and its labels. Multiple statements: pass a transaction
/// connection (`&mut *tx`) for atomicity.
pub async fn create_authority(
    conn: &mut sqlx::PgConnection,
    new: &NewAuthority,
) -> Result<AuthorityId, sqlx::Error> {
    let id = AuthorityId::new();
    sqlx::query("INSERT INTO authority (id, kind, external_uri) VALUES ($1, $2, $3)")
        .bind(id.to_uuid())
        .bind(new.kind.as_str())
        .bind(new.external_uri.as_deref())
        .execute(&mut *conn)
        .await?;
    for label in &new.labels {
        sqlx::query("INSERT INTO authority_label (authority_id, lang, label) VALUES ($1, $2, $3)")
            .bind(id.to_uuid())
            .bind(&label.lang)
            .bind(&label.label)
            .execute(&mut *conn)
            .await?;
    }
    Ok(id)
}

/// Fetch one authority (with its labels).
pub async fn authority_by_id<'e, E>(
    executor: E,
    id: AuthorityId,
) -> Result<Option<Authority>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT a.id, a.kind, a.external_uri, {LABELS_JSON} AS labels \
         FROM authority a LEFT JOIN authority_label al ON al.authority_id = a.id \
         WHERE a.id = $1 GROUP BY a.id"
    );
    let row = sqlx::query(&sql).bind(id.to_uuid()).fetch_optional(executor).await?;
    row.map(map_authority).transpose()
}

/// List authorities of a given kind (with labels), ordered by id.
pub async fn list_by_kind<'e, E>(
    executor: E,
    kind: AuthorityKind,
) -> Result<Vec<Authority>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let sql = format!(
        "SELECT a.id, a.kind, a.external_uri, {LABELS_JSON} AS labels \
         FROM authority a LEFT JOIN authority_label al ON al.authority_id = a.id \
         WHERE a.kind = $1 GROUP BY a.id ORDER BY a.id"
    );
    let rows = sqlx::query(&sql).bind(kind.as_str()).fetch_all(executor).await?;
    rows.into_iter().map(map_authority).collect()
}

/// Resolve an authority to an [`AuthorityRef`] (carrying its kind).
pub async fn resolve_authority<'e, E>(
    executor: E,
    id: AuthorityId,
) -> Result<Option<AuthorityRef>, sqlx::Error>
where
    E: sqlx::PgExecutor<'e>,
{
    let kind: Option<String> = sqlx::query_scalar("SELECT kind FROM authority WHERE id = $1")
        .bind(id.to_uuid())
        .fetch_optional(executor)
        .await?;
    match kind {
        Some(k) => {
            let kind = AuthorityKind::from_db(&k)
                .ok_or_else(|| sqlx::Error::Decode(format!("unknown authority kind: {k}").into()))?;
            Ok(Some(AuthorityRef::new(id, kind)))
        }
        None => Ok(None),
    }
}

/// Map one aggregated row (authority columns plus a JSON `labels` array) to an [`Authority`].
fn map_authority(row: sqlx::postgres::PgRow) -> Result<Authority, sqlx::Error> {
    let kind_str: String = row.try_get("kind")?;
    let kind = AuthorityKind::from_db(&kind_str)
        .ok_or_else(|| sqlx::Error::Decode(format!("unknown authority kind: {kind_str}").into()))?;
    let labels: sqlx::types::Json<Vec<LocalizedLabel>> = row.try_get("labels")?;
    Ok(Authority {
        id: AuthorityId::from_uuid(row.try_get("id")?),
        kind,
        external_uri: row.try_get("external_uri")?,
        labels: labels.0,
    })
}
```
Add to `crates/db/src/lib.rs` (top-level): `pub mod authority;`
- [ ] **Step 4: Run to verify it passes.** `DATABASE_URL=<url> cargo test -p db --test authority` → PASS (3 tests).
- [ ] **Step 5: Full workspace check.**
```bash
cargo +nightly fmt --check
DATABASE_URL=<url> cargo clippy --workspace --all-targets -- -D warnings
DATABASE_URL=<url> cargo test --workspace
```
Expected: all green.
- [ ] **Step 6: Commit.**
```bash
git add crates/db
git commit -m "feat(db): add authority repository with multilingual labels"
```
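Because `AuthorityRef` carries its kind and all three kinds share one table, a caller that needs a specific kind (for example, a field that should only point at a person) has to check it itself. The following std-only sketch shows one possible shape for that check. `require_kind` and `KindMismatch` are hypothetical names that are not part of the plan, and the id is a plain integer standing in for `AuthorityId`.

```rust
/// Std-only copy of the plan's `AuthorityKind` (serde derives omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthorityKind {
    Person,
    Organisation,
    Place,
}

/// Stand-in for the plan's `AuthorityRef`; the id is a plain integer here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AuthorityRef {
    pub authority_id: u32,
    pub kind: AuthorityKind,
}

/// Hypothetical error for a reference of the wrong kind.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KindMismatch {
    pub expected: AuthorityKind,
    pub found: AuthorityKind,
}

impl std::fmt::Display for KindMismatch {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "expected {:?} authority, found {:?}", self.expected, self.found)
    }
}

impl std::error::Error for KindMismatch {}

/// Hypothetical guard: pass the reference through only if it has the expected kind.
pub fn require_kind(r: AuthorityRef, expected: AuthorityKind) -> Result<AuthorityRef, KindMismatch> {
    if r.kind == expected {
        Ok(r)
    } else {
        Err(KindMismatch { expected, found: r.kind })
    }
}
```

Since the kind travels with the reference, this check needs no further query after `resolve_authority` has returned.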
---
## Self-Review (completed)
**Spec coverage (§6.3 vocab/authority):**
- Controlled vocabularies + terms, person/org/place authorities, store-once-link-many → Tasks 2-4. ✓
- Multilingual labels (sv/en…) → label tables + `LocalizedLabel`/`pick_label` (Tasks 1-4). ✓
- Validated reference types `TermRef`/`AuthorityRef` produced by resolve functions → Tasks 1, 3, 4. ✓
- SQL confined to `db`; `domain` I/O-free; uses `domain` ids → all tasks. ✓
- Unified authority table + normalized labels (approved decisions) → Task 2. ✓
- No HTTP/admin UI (deferred to Plan 10). ✓ (intentional)
**Placeholder scan:** none. `<url>` is the documented `DATABASE_URL`.
**Type consistency:** `VocabularyId`/`TermId`/`AuthorityId`/`AuthorityKind`/`LocalizedLabel`/`Vocabulary`/`Term`/`NewTerm`/`TermRef`/`Authority`/`NewAuthority`/`AuthorityRef` names + fields are identical across `domain` (Task 1), the repositories (Tasks 3-4), and tests. Repo signatures: single-statement functions (all reads, plus `create_vocabulary`) take a generic `E: PgExecutor<'e>`; multi-statement writes (`add_term`, `create_authority`) take `&mut PgConnection` and are called with `&mut *tx` in tests. `LABELS_JSON` aliases differ per module (`tl`/`term_id` vs `al`/`authority_id`) matching their joins.
## Notes for follow-on plans
- `TermRef`/`AuthorityRef` become FK-backed when the catalogue references them (Plan 4); consider whether `resolve_*` should run inside the catalogue write transaction.
- Authority/term **search by label** (fuzzy/substring) is deferred to Meilisearch (Plan 6) and the admin UI (Plan 10); the relational repos here cover by-id/by-key/by-kind/list.
- Seeding the Spectrum-recommended vocabularies (and Getty/KulturNav import) is a later concern (VISION post-MVP).