# Search (Meilisearch) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** A `search` crate that indexes catalogue objects (core + flexible fields, with term/authority values resolved to their labels) into Meilisearch and runs full-text search, plus a `reindex_all` rebuild. On-write sync orchestration is deferred to the API/service layer (Plan 7+); this plan builds the capability and `reindex_all`.

**Architecture:** A new role-named crate `search` depending on `db` + `domain` (cycle-free: `search → db → domain`). It exposes a `SearchClient` (Meilisearch adapter behind our own type, so the engine stays swappable), a `SearchDocument` (the indexed shape), `build_document` (reads `db` to resolve a `CatalogueObject`'s flexible fields to searchable text), and `reindex_all`. Search returns object ids; callers load full objects from `db`. `visibility` is a filterable attribute (for the future public API).

**Tech Stack:** Rust 2024, `meilisearch-sdk` (async client), `serde` (document), `thiserror` (SearchError), tokio. Tests run against a real Meilisearch (Docker) + Postgres.

## Design decisions (approved)

- `search` crate: `SearchClient` wrapping `meilisearch-sdk`, swappable behind our type.
- Index doc = core text + flexible values flattened to searchable text; **term/authority resolved to labels**; `localized_text` → all language strings; `visibility` filterable. Search returns object ids.
- Build the capability + `reindex_all` now; **on-write sync is wired at the API/service layer (Plan 7+)**. Eventual consistency (Meili not transactional with Postgres).
- Integration tests use a real Meilisearch in Docker, each test on a **unique index** for isolation.
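To make the "swappable behind our type" and "search returns object ids" decisions concrete, here is a minimal std-only sketch. The `SearchEngine` trait and `InMemoryEngine` are hypothetical names invented for illustration; the plan itself only specifies a concrete `SearchClient`, and the toy substring matching below is a stand-in, not Meilisearch ranking.

```rust
use std::collections::BTreeMap;

/// Hypothetical engine-neutral interface (an assumption for illustration;
/// the plan defines only the concrete `SearchClient`).
pub trait SearchEngine {
    /// Upsert the searchable text for one object id.
    fn index(&mut self, id: &str, text: &[&str]);
    /// Remove one object id from the index.
    fn remove(&mut self, id: &str);
    /// Return matching ids only; callers load full objects elsewhere.
    fn search(&self, query: &str) -> Vec<String>;
}

/// Toy in-memory engine: case-insensitive substring match,
/// ids returned in sorted order.
#[derive(Default)]
pub struct InMemoryEngine {
    docs: BTreeMap<String, Vec<String>>,
}

impl SearchEngine for InMemoryEngine {
    fn index(&mut self, id: &str, text: &[&str]) {
        let lowered: Vec<String> = text.iter().map(|t| t.to_lowercase()).collect();
        self.docs.insert(id.to_owned(), lowered);
    }

    fn remove(&mut self, id: &str) {
        self.docs.remove(id);
    }

    fn search(&self, query: &str) -> Vec<String> {
        let needle = query.to_lowercase();
        self.docs
            .iter()
            .filter(|(_, texts)| texts.iter().any(|t| t.contains(&needle)))
            .map(|(id, _)| id.clone())
            .collect()
    }
}
```

Because callers only ever see ids, swapping the engine would not change how full objects are loaded from `db`.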
## ⚠️ Implementer note on the Meilisearch SDK

The `meilisearch-sdk` API (method names, async task handling) varies by version. The **code blocks below are the intended shape**; adapt the exact SDK calls to the installed version while preserving behavior. **The tests are the contract**: make them pass. Key behaviors: indexing operations must `wait_for_completion` (Meilisearch indexes asynchronously) so a subsequent search sees the document. Verify the current `meilisearch-sdk` version via the cratesio tooling and pin it.

## Prerequisites

- Postgres (as before) AND a Meilisearch instance. The controller will start Meilisearch in Docker (e.g. `getmeili/meilisearch`) with a master key. Tests read `MEILI_URL` (e.g. `http://localhost:7700`) and `MEILI_MASTER_KEY`; pass them inline alongside `DATABASE_URL`. Pass transaction connections as `&mut tx`.

## File Structure

```
Cargo.toml                 + search member; meilisearch-sdk in workspace deps
crates/search/
  Cargo.toml
  src/lib.rs               SearchError, SearchDocument, SearchClient, build_document, reindex_all
  tests/search.rs          (Meili only) index/search/remove
  tests/reindex.rs         (Meili + Postgres) build_document + reindex_all
```

---

## Task 1: `search` crate: client, document, index/search/remove

**Files:** modify root `Cargo.toml`; create `crates/search/Cargo.toml`, `crates/search/src/lib.rs`, `crates/search/tests/search.rs`.
- [ ] **Step 1: Workspace + crate setup.**
  - In root `Cargo.toml`, add `"crates/search"` to `members`, and add to `[workspace.dependencies]` (verify the latest version via cratesio):

    ```toml
    meilisearch-sdk = "0.28"
    ```

  - Create `crates/search/Cargo.toml`:

    ```toml
    [package]
    name = "search"
    version = "0.0.0"
    edition.workspace = true
    rust-version.workspace = true

    [dependencies]
    meilisearch-sdk.workspace = true
    serde = { workspace = true }
    thiserror.workspace = true
    # A regular dependency: lib.rs names sqlx::Error in SearchError::Db,
    # and the tests use #[sqlx::test].
    sqlx.workspace = true
    domain = { path = "../domain" }
    db = { path = "../db" }

    [dev-dependencies]
    tokio.workspace = true
    uuid.workspace = true
    serde_json.workspace = true
    ```

- [ ] **Step 2: Write the failing test** `crates/search/tests/search.rs` (Meilisearch only: hand-built documents, no Postgres):

```rust
use search::{SearchClient, SearchDocument};

fn meili() -> (String, String) {
    (
        std::env::var("MEILI_URL").expect("MEILI_URL must be set"),
        std::env::var("MEILI_MASTER_KEY").expect("MEILI_MASTER_KEY must be set"),
    )
}

fn unique_index() -> String {
    format!("objects_test_{}", uuid::Uuid::new_v4().simple())
}

fn doc(id: &str, object_name: &str, fields_text: &[&str]) -> SearchDocument {
    SearchDocument {
        id: id.to_string(),
        object_number: format!("N-{id}"),
        object_name: object_name.to_string(),
        brief_description: None,
        current_owner: None,
        recorder: None,
        visibility: "draft".to_string(),
        fields_text: fields_text.iter().map(|s| s.to_string()).collect(),
    }
}

#[tokio::test]
async fn index_search_and_remove() {
    let (url, key) = meili();
    let client = SearchClient::connect(&url, &key, &unique_index()).unwrap();
    client.ensure_index().await.unwrap();

    let vase = domain::ObjectId::new();
    let chair = domain::ObjectId::new();
    client.index_object(&doc(&vase.to_string(), "vase", &["wood", "trä"])).await.unwrap();
    client.index_object(&doc(&chair.to_string(), "chair", &["oak"])).await.unwrap();

    // full-text on a flexible value
    let hits = client.search("wood").await.unwrap();
    assert_eq!(hits, vec![vase]);

    // full-text on the object name
    let hits = client.search("chair").await.unwrap();
    assert_eq!(hits, vec![chair]);

    // remove
    client.remove_object(vase).await.unwrap();
    assert!(client.search("wood").await.unwrap().is_empty());
}
```

- [ ] **Step 3: Run to verify it fails.** `MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test -p search --test search` → FAIL (crate/types missing).

- [ ] **Step 4: Implement** `crates/search/src/lib.rs` (adapt the SDK calls to the installed version; keep behavior + signatures):

```rust
//! Full-text search over catalogue objects, backed by Meilisearch.

use db::Db;
use domain::{CatalogueObject, ObjectId};
use serde::{Deserialize, Serialize};

/// Errors from the search subsystem.
#[derive(Debug, thiserror::Error)]
pub enum SearchError {
    #[error(transparent)]
    Meili(#[from] meilisearch_sdk::errors::Error),
    #[error(transparent)]
    Db(#[from] sqlx::Error),
    #[error("invalid object id in index: {0}")]
    BadId(String),
}

/// The indexed shape of a catalogue object.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SearchDocument {
    pub id: String,
    pub object_number: String,
    pub object_name: String,
    pub brief_description: Option<String>,
    pub current_owner: Option<String>,
    pub recorder: Option<String>,
    /// Filterable: "draft" | "internal" | "public".
    pub visibility: String,
    /// Flexible field values flattened to searchable text (term/authority labels,
    /// localized strings, and scalar values).
    pub fields_text: Vec<String>,
}

/// A Meilisearch-backed search client scoped to one index.
pub struct SearchClient {
    client: meilisearch_sdk::client::Client,
    index_uid: String,
}

impl SearchClient {
    /// Connect to Meilisearch at `url` with `api_key`, scoped to `index_uid`.
    pub fn connect(url: &str, api_key: &str, index_uid: &str) -> Result<Self, SearchError> {
        let client = meilisearch_sdk::client::Client::new(url, Some(api_key))?;
        Ok(Self { client, index_uid: index_uid.to_owned() })
    }

    /// Create the index (primary key "id") if absent and set filterable attributes.
    pub async fn ensure_index(&self) -> Result<(), SearchError> {
        // Create the index if it doesn't exist (ignore "index already exists").
        let task = self.client.create_index(&self.index_uid, Some("id")).await?;
        task.wait_for_completion(&self.client, None, None).await?;
        let index = self.client.index(&self.index_uid);
        index
            .set_filterable_attributes(["visibility"])
            .await?
            .wait_for_completion(&self.client, None, None)
            .await?;
        Ok(())
    }

    /// Upsert one object document (waits for indexing to complete).
    pub async fn index_object(&self, doc: &SearchDocument) -> Result<(), SearchError> {
        self.client
            .index(&self.index_uid)
            .add_or_replace_documents(std::slice::from_ref(doc), Some("id"))
            .await?
            .wait_for_completion(&self.client, None, None)
            .await?;
        Ok(())
    }

    /// Remove one object from the index by id (waits for completion).
    pub async fn remove_object(&self, id: ObjectId) -> Result<(), SearchError> {
        self.client
            .index(&self.index_uid)
            .delete_document(id.to_string())
            .await?
            .wait_for_completion(&self.client, None, None)
            .await?;
        Ok(())
    }

    /// Full-text search; returns matching object ids (in Meilisearch ranking order).
    pub async fn search(&self, query: &str) -> Result<Vec<ObjectId>, SearchError> {
        let results = self
            .client
            .index(&self.index_uid)
            .search()
            .with_query(query)
            .execute::<SearchDocument>()
            .await?;
        results
            .hits
            .into_iter()
            .map(|hit| {
                hit.result
                    .id
                    .parse::<ObjectId>()
                    .map_err(|_| SearchError::BadId(hit.result.id))
            })
            .collect()
    }

    /// Rebuild the whole index from the database (clears then re-adds all objects).
    pub async fn reindex_all(&self, db: &Db) -> Result<(), SearchError> {
        let index = self.client.index(&self.index_uid);
        index.delete_all_documents().await?.wait_for_completion(&self.client, None, None).await?;
        let objects = db::catalog::list_objects(db.pool()).await?;
        let mut docs = Vec::with_capacity(objects.len());
        for object in &objects {
            docs.push(build_document(db, object).await?);
        }
        if !docs.is_empty() {
            index
                .add_or_replace_documents(&docs, Some("id"))
                .await?
                .wait_for_completion(&self.client, None, None)
                .await?;
        }
        Ok(())
    }
}

/// Build a [`SearchDocument`] from an object, resolving its flexible fields to
/// searchable text (term/authority → labels, localized text → all values).
/// Implemented in Task 2; declared here so the crate compiles.
pub async fn build_document(
    _db: &Db,
    _object: &CatalogueObject,
) -> Result<SearchDocument, SearchError> {
    unimplemented!("implemented in Task 2")
}
```

NOTE: `ObjectId: FromStr` (Err = `uuid::Error`) exists from the id macro. `reindex_all`/`build_document` are needed for compilation now (the Task 1 test doesn't call them); `build_document` is an `unimplemented!()` stub filled in Task 2. If clippy flags the stub's unused params, the leading underscores suppress that. If it flags `unimplemented!` in a non-test fn, either add `#[allow(clippy::unimplemented)]` to `build_document` with a `// Task 2` note, OR move `reindex_all` + `build_document` entirely into Task 2 (preferred if it keeps Task 1 clippy-clean; in that case omit them here and add `pub mod`-level items in Task 2).

- [ ] **Step 5: Run to verify it passes.** `MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test -p search --test search` → PASS. (You may need to adapt SDK calls; iterate until the test passes.)

- [ ] **Step 6: Lint.** `cargo +nightly fmt`; `cargo clippy -p search --all-targets -- -D warnings` → clean.

- [ ] **Step 7: Commit.**

```bash
git add Cargo.toml crates/search
git commit -m "feat(search): add Meilisearch-backed SearchClient (index, search, remove)"
```

---

## Task 2: `build_document` + `reindex_all` (db integration)

**Files:** modify `crates/search/src/lib.rs`; create `crates/search/tests/reindex.rs`.
- [ ] **Step 1: Write the failing test** `crates/search/tests/reindex.rs` (Meilisearch + Postgres):

```rust
use db::{Db, catalog, fields, vocab};
use domain::{
    AuditActor, FieldType, LocalizedLabel, NewFieldDefinition, NewTerm, ObjectInput, Visibility,
};
use search::SearchClient;
use sqlx::PgPool;

fn meili() -> (String, String) {
    (
        std::env::var("MEILI_URL").expect("MEILI_URL must be set"),
        std::env::var("MEILI_MASTER_KEY").expect("MEILI_MASTER_KEY must be set"),
    )
}

fn unique_index() -> String {
    format!("reindex_test_{}", uuid::Uuid::new_v4().simple())
}

#[sqlx::test]
async fn reindex_resolves_term_labels_and_finds_by_label(pool: PgPool) {
    let db = Db::from_pool(pool);

    // a material vocabulary with a "wood" term
    let material = vocab::create_vocabulary(db.pool(), "material").await.unwrap();
    let mut tx = db.pool().begin().await.unwrap();
    let wood = vocab::add_term(
        &mut tx,
        &NewTerm {
            vocabulary_id: material.id,
            external_uri: None,
            labels: vec![LocalizedLabel { lang: "en".into(), label: "wood".into() }],
        },
    )
    .await
    .unwrap();
    fields::create_field_definition(
        &mut tx,
        &NewFieldDefinition {
            key: "material".into(),
            field_type: FieldType::Term { vocabulary_id: material.id },
            required: false,
            group_key: None,
            labels: vec![LocalizedLabel { lang: "en".into(), label: "material".into() }],
        },
    )
    .await
    .unwrap();
    let object_id = catalog::create_object(
        &mut tx,
        AuditActor::System,
        &ObjectInput {
            object_number: "LM-1".into(),
            object_name: "vase".into(),
            number_of_objects: 1,
            brief_description: None,
            current_location: None,
            current_owner: None,
            recorder: None,
            recording_date: None,
            visibility: Visibility::Public,
        },
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    // set the material field to the wood term
    let mut tx = db.pool().begin().await.unwrap();
    catalog::set_object_fields(
        &mut tx,
        AuditActor::System,
        object_id,
        serde_json::json!({ "material": wood.to_string() }).as_object().unwrap(),
    )
    .await
    .unwrap();
    tx.commit().await.unwrap();

    let (url, key) = meili();
    let client = SearchClient::connect(&url, &key, &unique_index()).unwrap();
    client.ensure_index().await.unwrap();
    client.reindex_all(&db).await.unwrap();

    // found by the object name
    assert_eq!(client.search("vase").await.unwrap(), vec![object_id]);
    // found by the resolved TERM LABEL (not the uuid)
    assert_eq!(client.search("wood").await.unwrap(), vec![object_id]);
}
```

- [ ] **Step 2: Run to verify it fails.** With both env vars + `DATABASE_URL`: `... cargo test -p search --test reindex` → FAIL (`build_document` is `unimplemented!`).

- [ ] **Step 3: Implement `build_document`** in `crates/search/src/lib.rs`: replace the stub body with a real implementation that flattens the object's flexible fields to searchable text, resolving term/authority values to labels:

```rust
pub async fn build_document(
    db: &Db,
    object: &CatalogueObject,
) -> Result<SearchDocument, SearchError> {
    let mut fields_text = Vec::new();
    if let Some(map) = object.fields.as_object() {
        for (key, value) in map {
            let Some(def) = db::fields::field_definition_by_key(db.pool(), key).await? else {
                continue; // a field with no definition (stale): skip
            };
            match def.field_type {
                // plain strings are indexed as-is
                domain::FieldType::Text | domain::FieldType::Date => {
                    if let Some(s) = value.as_str() {
                        fields_text.push(s.to_owned());
                    }
                }
                // scalars are indexed via their JSON text form
                domain::FieldType::Integer | domain::FieldType::Boolean => {
                    fields_text.push(value.to_string());
                }
                // localized text: index every language's string
                domain::FieldType::LocalizedText => {
                    if let Some(obj) = value.as_object() {
                        for v in obj.values() {
                            if let Some(s) = v.as_str() {
                                fields_text.push(s.to_owned());
                            }
                        }
                    }
                }
                // term id → all of the term's labels
                domain::FieldType::Term { .. } => {
                    if let Some(term_id) = value.as_str().and_then(|s| s.parse().ok()) {
                        if let Some(term) = db::vocab::term_by_id(db.pool(), term_id).await? {
                            fields_text.extend(term.labels.into_iter().map(|l| l.label));
                        }
                    }
                }
                // authority id → all of the authority's labels
                domain::FieldType::Authority { .. } => {
                    if let Some(authority_id) = value.as_str().and_then(|s| s.parse().ok()) {
                        if let Some(authority) =
                            db::authority::authority_by_id(db.pool(), authority_id).await?
                        {
                            fields_text.extend(authority.labels.into_iter().map(|l| l.label));
                        }
                    }
                }
            }
        }
    }
    Ok(SearchDocument {
        id: object.id.to_string(),
        object_number: object.object_number.clone(),
        object_name: object.object_name.clone(),
        brief_description: object.brief_description.clone(),
        current_owner: object.current_owner.clone(),
        recorder: object.recorder.clone(),
        visibility: object.visibility.as_str().to_owned(),
        fields_text,
    })
}
```

(`db::vocab::term_by_id` takes a `TermId`; `db::authority::authority_by_id` takes an `AuthorityId`. `value.as_str().and_then(|s| s.parse().ok())` parses into the inferred id type. If type inference needs help, annotate: `let term_id: domain::TermId = ...`.)

- [ ] **Step 4: Run to verify it passes.** `MEILI_URL=<url> MEILI_MASTER_KEY=<key> DATABASE_URL=<url> cargo test -p search --test reindex` → PASS.

- [ ] **Step 5: Full workspace check.**

```bash
cargo +nightly fmt --check
DATABASE_URL=<url> MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo clippy --workspace --all-targets -- -D warnings
DATABASE_URL=<url> MEILI_URL=<url> MEILI_MASTER_KEY=<key> cargo test --workspace
```

Expected: all green. (The `search` tests need the MEILI env vars; the rest need `DATABASE_URL`.)

- [ ] **Step 6: Commit.**

```bash
git add crates/search
git commit -m "feat(search): build documents resolving term/authority labels; reindex_all"
```

---

## Self-Review (completed)

**Spec coverage (Plan 6 / VISION search MVP):**

- `search` crate, Meilisearch adapter behind `SearchClient`, swappable → Task 1. ✓
- Index core + flexible text; term/authority resolved to labels; localized → all values; visibility filterable; search returns object ids → Tasks 1–2. ✓
- Build capability + `reindex_all` now; on-write sync deferred to API/service → this plan + notes. ✓
- `search → db → domain` (no cycle); SQL stays in `db` (search calls db repos) → Cargo deps. ✓
- Real-Meili integration tests, unique index per test → Tasks 1–2. ✓

**Placeholder scan:** the only `unimplemented!` is the Task 1 `build_document` stub, explicitly filled in Task 2 (with a fallback instruction). `<url>`/`<key>` are documented env values. No other placeholders.

**Type consistency:** `SearchDocument` fields used identically in tests + `build_document`; `SearchClient::{connect, ensure_index, index_object, remove_object, search, reindex_all}` signatures consistent across tasks/tests; `search` returns `Vec<ObjectId>` parsed via `ObjectId: FromStr`; `build_document` matches on `domain::FieldType` (Plan 4) and calls `db::vocab::term_by_id`/`db::authority::authority_by_id`/`db::fields::field_definition_by_key`/`db::catalog::list_objects` as defined.

## Notes for follow-on plans

- **On-write sync (API/service, Plan 7+):** after a catalogue create/update/delete/set_fields commits, call `index_object`/`remove_object` best-effort (log failures; `reindex_all` is the recovery path). Meili is not transactional with Postgres, so consistency is eventual.
- **Public API (Plan 7):** `search` already stores `visibility` as filterable; add a `with_filter("visibility = public")` search variant for the public surface.
- **Per-deployment index/credentials:** production uses a fixed index uid (e.g. `objects`) with a scoped Meili key per the single-tenant deployment; only tests use unique index names.
- **Reindex cost:** `reindex_all` is N+1 over objects×fields (resolves labels per field): fine for now; batch when collections grow (relates to #12).
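
One possible shape for the batching mentioned under "Reindex cost", as a minimal std-only sketch: gather each distinct term id once, fetch all labels in one lookup, then resolve every object from the preloaded map. `ObjectTerms`, `distinct_term_ids`, and `resolve_labels` are hypothetical names, the `u32` ids stand in for the real id types, and the single bulk fetch between the two functions is assumed (no such `db` function is defined in this plan).

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for an object: its id plus the term ids its
/// flexible fields reference (u32 ids are an assumption for brevity).
pub struct ObjectTerms {
    pub object_id: u32,
    pub term_ids: Vec<u32>,
}

/// Collect each distinct term id once, in first-seen order, so the labels
/// can be fetched with one bulk query instead of one query per field.
pub fn distinct_term_ids(objects: &[ObjectTerms]) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut ids = Vec::new();
    for object in objects {
        for &term_id in &object.term_ids {
            if seen.insert(term_id) {
                ids.push(term_id);
            }
        }
    }
    ids
}

/// Resolve every object's searchable labels from a preloaded map.
/// Term ids missing from the map are skipped, like stale fields.
pub fn resolve_labels(
    objects: &[ObjectTerms],
    labels: &HashMap<u32, Vec<String>>,
) -> HashMap<u32, Vec<String>> {
    objects
        .iter()
        .map(|object| {
            let text: Vec<String> = object
                .term_ids
                .iter()
                .filter_map(|id| labels.get(id))
                .flatten()
                .cloned()
                .collect();
            (object.object_id, text)
        })
        .collect()
}
```

With this shape the number of label lookups depends on the number of distinct terms rather than on objects×fields.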