Search (Meilisearch) Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: A search crate that indexes catalogue objects (core + flexible fields, with term/authority values resolved to their labels) into Meilisearch and runs full-text search, plus a reindex_all rebuild. On-write sync orchestration is deferred to the API/service layer (Plan 7+); this plan builds the capability and reindex_all.
Architecture: A new role-named crate search depending on db + domain (cycle-free: search → db → domain). It exposes a SearchClient (Meilisearch adapter behind our own type, so the engine stays swappable), a SearchDocument (the indexed shape), build_document (reads db to resolve a CatalogueObject's flexible fields to searchable text), and reindex_all. Search returns object ids; callers load full objects from db. visibility is a filterable attribute (for the future public API).
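The "search returns ids, callers load objects" split can be sketched as follows. This is a minimal illustration under stated assumptions: ObjectId and CatalogueObject here are simplified stand-ins for the domain types, the HashMap stands in for the db layer, and hydrate_hits is a hypothetical helper that is not part of this plan.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for domain::ObjectId and domain::CatalogueObject.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ObjectId(pub u32);

#[derive(Debug, Clone, PartialEq)]
pub struct CatalogueObject {
    pub id: ObjectId,
    pub object_name: String,
}

/// Load full objects for ranked search hits. The engine's ranking order is
/// kept, and ids that are no longer in the store are skipped, since the index
/// is only eventually consistent with the database.
pub fn hydrate_hits(
    hits: &[ObjectId],
    store: &HashMap<ObjectId, CatalogueObject>,
) -> Vec<CatalogueObject> {
    hits.iter().filter_map(|id| store.get(id).cloned()).collect()
}
```

Skipping unknown ids rather than failing is one possible choice; a caller could equally log them as a hint that a reindex is due.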
Tech Stack: Rust 2024, meilisearch-sdk (async client), serde (document), thiserror (SearchError), tokio. Tests run against a real Meilisearch (Docker) + Postgres.
Design decisions (approved)
- search crate: SearchClient wrapping meilisearch-sdk, swappable behind our type.
- Index doc = core text + flexible values flattened to searchable text; term/authority resolved to labels; localized_text → all language strings; visibility filterable. Search returns object ids.
- Build the capability + reindex_all now; on-write sync is wired at the API/service layer (Plan 7+). Eventual consistency (Meili not transactional with Postgres).
- Integration tests use a real Meilisearch in Docker, each test on a unique index for isolation.
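The eventual-consistency decision implies that an index failure after a committed write must not fail the write itself. A minimal synchronous sketch of that rule, assuming a hypothetical sync_best_effort helper and SyncOutcome type (neither is defined by this plan; the real wiring belongs to the API/service layer and would be async):

```rust
/// Outcome of a best-effort index update that runs after the database commit.
#[derive(Debug, PartialEq)]
pub enum SyncOutcome {
    /// The index reflects the committed write.
    Indexed,
    /// The write is committed, but the index is stale until the next reindex.
    Stale { reason: String },
}

/// Run `index_update` after a successful commit. A failure is reported to the
/// caller (for logging) but never propagated as an error, because the database
/// write has already succeeded.
pub fn sync_best_effort<E: std::fmt::Display>(
    index_update: impl FnOnce() -> Result<(), E>,
) -> SyncOutcome {
    match index_update() {
        Ok(()) => SyncOutcome::Indexed,
        Err(err) => SyncOutcome::Stale { reason: err.to_string() },
    }
}
```

A caller would log the Stale reason and rely on reindex_all as the recovery path.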
⚠️ Implementer note on the Meilisearch SDK
The meilisearch-sdk API (method names, async task handling) varies by version. The code blocks below are the intended shape; adapt the exact SDK calls to the installed version while preserving behavior. The tests are the contract — make them pass. Key behaviors: indexing operations must wait_for_completion (Meilisearch indexes asynchronously) so a subsequent search sees the document. Verify the current meilisearch-sdk version via the cratesio tooling and pin it.
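Why waiting matters can be shown with a toy model. The sketch below does not use the Meilisearch SDK at all: ToyEngine is a hypothetical in-memory stand-in whose writes are queued and only become searchable once their task has been processed, which mirrors the behavior the tests depend on.

```rust
use std::collections::VecDeque;

/// Toy model of an engine that applies writes asynchronously: `enqueue_add`
/// only queues the document; it becomes searchable once its task is processed.
#[derive(Default)]
pub struct ToyEngine {
    pending: VecDeque<(u64, String)>, // (task id, document text)
    indexed: Vec<String>,
    next_task: u64,
}

impl ToyEngine {
    /// Queue a document and return its task id.
    pub fn enqueue_add(&mut self, text: &str) -> u64 {
        let task = self.next_task;
        self.next_task += 1;
        self.pending.push_back((task, text.to_owned()));
        task
    }

    /// Apply queued tasks in order until `task` (inclusive) has been applied.
    /// Tasks queued after `task` stay pending.
    pub fn wait_for_completion(&mut self, task: u64) {
        while self.pending.front().is_some_and(|(id, _)| *id <= task) {
            let (_, text) = self.pending.pop_front().expect("front was Some");
            self.indexed.push(text);
        }
    }

    /// Substring search over documents whose task has been applied.
    pub fn search(&self, query: &str) -> Vec<&str> {
        self.indexed
            .iter()
            .filter(|text| text.contains(query))
            .map(String::as_str)
            .collect()
    }
}
```

In this model a search issued between enqueue_add and wait_for_completion returns nothing for the new document, which is the flakiness the integration tests would show if the wait were dropped.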
Prerequisites
- Postgres (as before) AND a Meilisearch instance. The controller will start Meilisearch in Docker (e.g. getmeili/meilisearch) with a master key. Tests read MEILI_URL (e.g. http://localhost:7700) and MEILI_MASTER_KEY; pass them inline alongside DATABASE_URL. Pass transaction connections as &mut tx.
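The tests in this plan read the two variables with expect, which panics on a missing value. If a reusable helper is preferred, one possible shape is sketched below; MeiliEnv, from_lookup and from_env are hypothetical names, not part of the plan. Taking the lookup as a parameter keeps the logic testable without touching the process environment.

```rust
/// Connection settings read by the integration tests.
#[derive(Debug, PartialEq)]
pub struct MeiliEnv {
    pub url: String,
    pub master_key: String,
}

impl MeiliEnv {
    /// Build the settings from any key lookup. On failure, the error is the
    /// name of the first missing variable.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, &'static str> {
        let url = lookup("MEILI_URL").ok_or("MEILI_URL")?;
        let master_key = lookup("MEILI_MASTER_KEY").ok_or("MEILI_MASTER_KEY")?;
        Ok(Self { url, master_key })
    }

    /// Read the settings from the process environment.
    pub fn from_env() -> Result<Self, &'static str> {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

A test would call MeiliEnv::from_env() and turn the returned variable name into a readable panic message.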
File Structure
Cargo.toml + search member; meilisearch-sdk in workspace deps
crates/search/
Cargo.toml
src/lib.rs SearchError, SearchDocument, SearchClient, build_document, reindex_all
tests/search.rs (Meili only) index/search/remove
tests/reindex.rs (Meili + Postgres) build_document + reindex_all
Task 1: search crate — client, document, index/search/remove
Files: modify root Cargo.toml; create crates/search/Cargo.toml, crates/search/src/lib.rs, crates/search/tests/search.rs.
- [ ] Step 1: Workspace + crate setup.
  - In root Cargo.toml, add "crates/search" to members, and add to [workspace.dependencies] (verify the latest version via cratesio):
    meilisearch-sdk = "0.28"
  - Create crates/search/Cargo.toml (sqlx is a regular dependency because SearchError::Db in src/lib.rs wraps sqlx::Error; regular dependencies are also available to the tests):
    [package]
    name = "search"
    version = "0.0.0"
    edition.workspace = true
    rust-version.workspace = true

    [dependencies]
    meilisearch-sdk.workspace = true
    serde = { workspace = true }
    thiserror.workspace = true
    sqlx.workspace = true
    domain = { path = "../domain" }
    db = { path = "../db" }

    [dev-dependencies]
    tokio.workspace = true
    uuid.workspace = true
    serde_json.workspace = true
- [ ] Step 2: Write the failing test crates/search/tests/search.rs (Meilisearch only: hand-built documents, no Postgres):
use search::{SearchClient, SearchDocument};
fn meili() -> (String, String) {
(
std::env::var("MEILI_URL").expect("MEILI_URL must be set"),
std::env::var("MEILI_MASTER_KEY").expect("MEILI_MASTER_KEY must be set"),
)
}
fn unique_index() -> String {
format!("objects_test_{}", uuid::Uuid::new_v4().simple())
}
fn doc(id: &str, object_name: &str, fields_text: &[&str]) -> SearchDocument {
SearchDocument {
id: id.to_string(),
object_number: format!("N-{id}"),
object_name: object_name.to_string(),
brief_description: None,
current_owner: None,
recorder: None,
visibility: "draft".to_string(),
fields_text: fields_text.iter().map(|s| s.to_string()).collect(),
}
}
#[tokio::test]
async fn index_search_and_remove() {
let (url, key) = meili();
let client = SearchClient::connect(&url, &key, &unique_index()).unwrap();
client.ensure_index().await.unwrap();
let vase = domain::ObjectId::new();
let chair = domain::ObjectId::new();
client.index_object(&doc(&vase.to_string(), "vase", &["wood", "trä"])).await.unwrap();
client.index_object(&doc(&chair.to_string(), "chair", &["oak"])).await.unwrap();
// full-text on a flexible value
let hits = client.search("wood").await.unwrap();
assert_eq!(hits, vec![vase]);
// full-text on the object name
let hits = client.search("chair").await.unwrap();
assert_eq!(hits, vec![chair]);
// remove
client.remove_object(vase).await.unwrap();
assert!(client.search("wood").await.unwrap().is_empty());
}
- [ ] Step 3: Run to verify it fails. MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test -p search --test search → FAIL (crate/types missing).
- [ ] Step 4: Implement crates/search/src/lib.rs (adapt the SDK calls to the installed version; keep behavior + signatures):
//! Full-text search over catalogue objects, backed by Meilisearch.
use db::Db;
use domain::{CatalogueObject, ObjectId};
use serde::{Deserialize, Serialize};
/// Errors from the search subsystem.
#[derive(Debug, thiserror::Error)]
pub enum SearchError {
#[error(transparent)]
Meili(#[from] meilisearch_sdk::errors::Error),
#[error(transparent)]
Db(#[from] sqlx::Error),
#[error("invalid object id in index: {0}")]
BadId(String),
}
/// The indexed shape of a catalogue object.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SearchDocument {
pub id: String,
pub object_number: String,
pub object_name: String,
pub brief_description: Option<String>,
pub current_owner: Option<String>,
pub recorder: Option<String>,
/// Filterable: "draft" | "internal" | "public".
pub visibility: String,
/// Flexible field values flattened to searchable text (term/authority labels,
/// localized strings, and scalar values).
pub fields_text: Vec<String>,
}
/// A Meilisearch-backed search client scoped to one index.
pub struct SearchClient {
client: meilisearch_sdk::client::Client,
index_uid: String,
}
impl SearchClient {
/// Connect to Meilisearch at `url` with `api_key`, scoped to `index_uid`.
pub fn connect(url: &str, api_key: &str, index_uid: &str) -> Result<Self, SearchError> {
let client = meilisearch_sdk::client::Client::new(url, Some(api_key))?;
Ok(Self { client, index_uid: index_uid.to_owned() })
}
/// Create the index (primary key "id") if absent and set filterable attributes.
pub async fn ensure_index(&self) -> Result<(), SearchError> {
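// Create the index if it doesn't exist. An "index already exists" failure is
// expected to surface in the completed task rather than as an Err, so it is
// ignored here; verify this against the installed SDK version.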
let task = self.client.create_index(&self.index_uid, Some("id")).await?;
task.wait_for_completion(&self.client, None, None).await?;
let index = self.client.index(&self.index_uid);
index
.set_filterable_attributes(["visibility"])
.await?
.wait_for_completion(&self.client, None, None)
.await?;
Ok(())
}
/// Upsert one object document (waits for indexing to complete).
pub async fn index_object(&self, doc: &SearchDocument) -> Result<(), SearchError> {
self.client
.index(&self.index_uid)
.add_or_replace_documents(std::slice::from_ref(doc), Some("id"))
.await?
.wait_for_completion(&self.client, None, None)
.await?;
Ok(())
}
/// Remove one object from the index by id (waits for completion).
pub async fn remove_object(&self, id: ObjectId) -> Result<(), SearchError> {
self.client
.index(&self.index_uid)
.delete_document(id.to_string())
.await?
.wait_for_completion(&self.client, None, None)
.await?;
Ok(())
}
/// Full-text search; returns matching object ids (in Meilisearch ranking order).
pub async fn search(&self, query: &str) -> Result<Vec<ObjectId>, SearchError> {
let results = self
.client
.index(&self.index_uid)
.search()
.with_query(query)
.execute::<SearchDocument>()
.await?;
results
.hits
.into_iter()
.map(|hit| hit.result.id.parse::<ObjectId>().map_err(|_| SearchError::BadId(hit.result.id)))
.collect()
}
/// Rebuild the whole index from the database (clears then re-adds all objects).
pub async fn reindex_all(&self, db: &Db) -> Result<(), SearchError> {
let index = self.client.index(&self.index_uid);
index.delete_all_documents().await?.wait_for_completion(&self.client, None, None).await?;
let objects = db::catalog::list_objects(db.pool()).await?;
let mut docs = Vec::with_capacity(objects.len());
for object in &objects {
docs.push(build_document(db, object).await?);
}
if !docs.is_empty() {
index
.add_or_replace_documents(&docs, Some("id"))
.await?
.wait_for_completion(&self.client, None, None)
.await?;
}
Ok(())
}
}
/// Build a [`SearchDocument`] from an object, resolving its flexible fields to
/// searchable text (term/authority → labels, localized text → all values).
/// Implemented in Task 2; declared here so the crate compiles.
pub async fn build_document(
_db: &Db,
_object: &CatalogueObject,
) -> Result<SearchDocument, SearchError> {
unimplemented!("implemented in Task 2")
}
NOTE: ObjectId: FromStr (Err = uuid::Error) exists from the id macro. reindex_all/build_document are needed for compilation now (Task 1 test doesn't call them) — build_document is a stub unimplemented!() filled in Task 2. If clippy flags the stub's unused params, the leading underscores suppress that; if it flags unimplemented! in a non-test fn, add #[allow(clippy::unimplemented)] to build_document with a // Task 2 note, OR move reindex_all+build_document entirely into Task 2 (preferred if it keeps Task 1 clippy-clean — in that case omit them here and add pub mod-level items in Task 2).
- [ ] Step 5: Run to verify it passes. MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test -p search --test search → PASS. (You may need to adapt SDK calls; iterate until the test passes.)
- [ ] Step 6: Lint. cargo +nightly fmt; cargo clippy -p search --all-targets -- -D warnings → clean.
- [ ] Step 7: Commit.
git add Cargo.toml crates/search
git commit -m "feat(search): add Meilisearch-backed SearchClient (index, search, remove)"
Task 2: build_document + reindex_all (db integration)
Files: modify crates/search/src/lib.rs; create crates/search/tests/reindex.rs.
- [ ] Step 1: Write the failing test crates/search/tests/reindex.rs (Meilisearch + Postgres):
use db::{Db, catalog, fields, vocab};
use domain::{
AuditActor, FieldType, LocalizedLabel, NewFieldDefinition, NewTerm, ObjectInput, Visibility,
};
use search::SearchClient;
use sqlx::PgPool;
fn meili() -> (String, String) {
(
std::env::var("MEILI_URL").expect("MEILI_URL must be set"),
std::env::var("MEILI_MASTER_KEY").expect("MEILI_MASTER_KEY must be set"),
)
}
fn unique_index() -> String {
format!("reindex_test_{}", uuid::Uuid::new_v4().simple())
}
#[sqlx::test]
async fn reindex_resolves_term_labels_and_finds_by_label(pool: PgPool) {
let db = Db::from_pool(pool);
// a material vocabulary with a "wood" term
let material = vocab::create_vocabulary(db.pool(), "material").await.unwrap();
let mut tx = db.pool().begin().await.unwrap();
let wood = vocab::add_term(
&mut tx,
&NewTerm {
vocabulary_id: material.id,
external_uri: None,
labels: vec![LocalizedLabel { lang: "en".into(), label: "wood".into() }],
},
)
.await
.unwrap();
fields::create_field_definition(
&mut tx,
&NewFieldDefinition {
key: "material".into(),
field_type: FieldType::Term { vocabulary_id: material.id },
required: false,
group_key: None,
labels: vec![LocalizedLabel { lang: "en".into(), label: "material".into() }],
},
)
.await
.unwrap();
let object_id = catalog::create_object(
&mut tx,
AuditActor::System,
&ObjectInput {
object_number: "LM-1".into(),
object_name: "vase".into(),
number_of_objects: 1,
brief_description: None,
current_location: None,
current_owner: None,
recorder: None,
recording_date: None,
visibility: Visibility::Public,
},
)
.await
.unwrap();
tx.commit().await.unwrap();
// set the material field to the wood term
let mut tx = db.pool().begin().await.unwrap();
catalog::set_object_fields(
&mut tx,
AuditActor::System,
object_id,
serde_json::json!({ "material": wood.to_string() }).as_object().unwrap(),
)
.await
.unwrap();
tx.commit().await.unwrap();
let (url, key) = meili();
let client = SearchClient::connect(&url, &key, &unique_index()).unwrap();
client.ensure_index().await.unwrap();
client.reindex_all(&db).await.unwrap();
// found by the object name
assert_eq!(client.search("vase").await.unwrap(), vec![object_id]);
// found by the resolved TERM LABEL (not the uuid)
assert_eq!(client.search("wood").await.unwrap(), vec![object_id]);
}
- [ ] Step 2: Run to verify it fails. With both env vars + DATABASE_URL: ... cargo test -p search --test reindex → FAIL (build_document is unimplemented!).
- [ ] Step 3: Implement build_document in crates/search/src/lib.rs. Replace the stub body with a real implementation that flattens the object's flexible fields to searchable text, resolving term/authority values to labels:
pub async fn build_document(
db: &Db,
object: &CatalogueObject,
) -> Result<SearchDocument, SearchError> {
let mut fields_text = Vec::new();
if let Some(map) = object.fields.as_object() {
for (key, value) in map {
let Some(def) = db::fields::field_definition_by_key(db.pool(), key).await? else {
continue; // a field with no definition (stale) — skip
};
match def.field_type {
domain::FieldType::Text | domain::FieldType::Date => {
if let Some(s) = value.as_str() {
fields_text.push(s.to_owned());
}
}
domain::FieldType::Integer | domain::FieldType::Boolean => {
fields_text.push(value.to_string());
}
domain::FieldType::LocalizedText => {
if let Some(obj) = value.as_object() {
for v in obj.values() {
if let Some(s) = v.as_str() {
fields_text.push(s.to_owned());
}
}
}
}
domain::FieldType::Term { .. } => {
if let Some(term_id) = value.as_str().and_then(|s| s.parse().ok()) {
if let Some(term) = db::vocab::term_by_id(db.pool(), term_id).await? {
fields_text.extend(term.labels.into_iter().map(|l| l.label));
}
}
}
domain::FieldType::Authority { .. } => {
if let Some(authority_id) = value.as_str().and_then(|s| s.parse().ok()) {
if let Some(authority) =
db::authority::authority_by_id(db.pool(), authority_id).await?
{
fields_text.extend(authority.labels.into_iter().map(|l| l.label));
}
}
}
}
}
}
Ok(SearchDocument {
id: object.id.to_string(),
object_number: object.object_number.clone(),
object_name: object.object_name.clone(),
brief_description: object.brief_description.clone(),
current_owner: object.current_owner.clone(),
recorder: object.recorder.clone(),
visibility: object.visibility.as_str().to_owned(),
fields_text,
})
}
(db::vocab::term_by_id takes a TermId; db::authority::authority_by_id takes an AuthorityId — value.as_str().and_then(|s| s.parse().ok()) parses into the inferred id type. If type inference needs help, annotate: let term_id: domain::TermId = ....)
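The inference described in the note can be checked in isolation. In this sketch TermId and AuthorityId are simplified stand-ins wrapping u32 (the real ids are UUID-based), and term_label is a hypothetical function standing in for a repository call whose signature fixes the id type.

```rust
use std::str::FromStr;

// Hypothetical stand-ins for domain::TermId and domain::AuthorityId.
#[derive(Debug, PartialEq)]
pub struct TermId(pub u32);

#[derive(Debug, PartialEq)]
pub struct AuthorityId(pub u32);

impl FromStr for TermId {
    type Err = std::num::ParseIntError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.parse().map(TermId)
    }
}

impl FromStr for AuthorityId {
    type Err = std::num::ParseIntError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.parse().map(AuthorityId)
    }
}

/// Stand-in for a repository function; its parameter type pins the id type.
pub fn term_label(id: TermId) -> String {
    format!("term #{}", id.0)
}

/// `parse()` carries no turbofish: the compiler infers `TermId` because the
/// parsed value is passed to `term_label`. Unparseable input yields `None`.
pub fn label_for(value: Option<&str>) -> Option<String> {
    let term_id = value.and_then(|s| s.parse().ok())?;
    Some(term_label(term_id))
}
```

When no later call pins the type, an annotation such as `let id: Option<AuthorityId> = value.and_then(|s| s.parse().ok());` does the same job.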
- [ ] Step 4: Run to verify it passes. MEILI_URL=<url> MEILI_MASTER_KEY=<key> DATABASE_URL=<url> cargo test -p search --test reindex → PASS.
- [ ] Step 5: Full workspace check.
cargo +nightly fmt --check
DATABASE_URL=<url> MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo clippy --workspace --all-targets -- -D warnings
DATABASE_URL=<url> MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test --workspace
Expected: all green. (The search tests need the MEILI env vars; the reindex test needs DATABASE_URL as well, and the remaining workspace tests need only DATABASE_URL.)
- [ ] Step 6: Commit.
git add crates/search
git commit -m "feat(search): build documents resolving term/authority labels; reindex_all"
Self-Review (completed)
Spec coverage (Plan 6 / VISION search MVP):
- search crate, Meilisearch adapter behind SearchClient, swappable → Task 1. ✓
- Index core + flexible text; term/authority resolved to labels; localized → all values; visibility filterable; search returns object ids → Tasks 1–2. ✓
- Build capability + reindex_all now; on-write sync deferred to API/service → this plan + notes. ✓
- search → db → domain (no cycle); SQL stays in db (search calls db repos) → Cargo deps. ✓
- Real-Meili integration tests, unique index per test → Tasks 1–2. ✓
Placeholder scan: the only unimplemented! is the Task 1 build_document stub, explicitly filled in Task 2 (with a fallback instruction). <url>/<key> are documented env values. No other placeholders.
Type consistency: SearchDocument fields used identically in tests + build_document; SearchClient::{connect, ensure_index, index_object, remove_object, search, reindex_all} signatures consistent across tasks/tests; search returns Vec<ObjectId> parsed via ObjectId: FromStr; build_document matches on domain::FieldType (Plan 4) and calls db::vocab::term_by_id/db::authority::authority_by_id/db::fields::field_definition_by_key/db::catalog::list_objects as defined.
Notes for follow-on plans
- On-write sync (API/service, Plan 7+): after a catalogue create/update/delete/set_fields commits, call index_object/remove_object best-effort (log failures; reindex_all is the recovery path). Meili is not transactional with Postgres, so consistency is eventual.
- Public API (Plan 7): search already stores visibility as filterable; add a with_filter("visibility = public") search variant for the public surface.
- Per-deployment index/credentials: production uses a fixed index uid (e.g. objects) with a scoped Meili key per the single-tenant deployment; only tests use unique index names.
- Reindex cost: reindex_all is N+1 over objects×fields (resolves labels per field). Fine for now; batch when collections grow (relates to #12).
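One low-effort way to soften the reindex cost before real batching is to memoize label lookups for the duration of a reindex run. The sketch below is an illustration only: LabelCache is a hypothetical type, ids are plain u32 stand-ins for TermId/AuthorityId, and the resolver closure stands in for a db call such as term_by_id.

```rust
use std::collections::HashMap;

/// Memoizes label lookups so that each distinct id is resolved at most once
/// while the cache is alive (for example, for one reindex run).
pub struct LabelCache<F: FnMut(u32) -> Vec<String>> {
    resolve: F,
    cache: HashMap<u32, Vec<String>>,
}

impl<F: FnMut(u32) -> Vec<String>> LabelCache<F> {
    /// Wrap a resolver (a stand-in for the database lookup).
    pub fn new(resolve: F) -> Self {
        Self { resolve, cache: HashMap::new() }
    }

    /// Return the labels for `id`, calling the resolver only on a cache miss.
    pub fn labels(&mut self, id: u32) -> &[String] {
        if !self.cache.contains_key(&id) {
            let labels = (self.resolve)(id);
            self.cache.insert(id, labels);
        }
        &self.cache[&id]
    }
}
```

This reduces repeated lookups for terms shared by many objects, but it still issues one query per distinct id; a batched query per field type would be the fuller fix.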