WASM Provider Service Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Rebuild whoareyou as an async HTTP service that looks up Swedish phone numbers via WASM-component providers (hitta.se in v1), retiring the CLI.
Architecture: Cargo workspace with an axum server hosting wasmtime; providers are pure WASM components (WIT contract: metadata/requests/parse) — the host fetches all URLs and caches parsed results in moka. Provider parse logic is plain Rust, unit-tested natively against HTML fixtures; WIT glue is a thin cfg(wasm32) layer.
Tech Stack: Rust edition 2024 · tokio · axum 0.8 · reqwest 0.13 · moka 0.12 · wasmtime + wasmtime-wasi 45 · wit-bindgen 0.57 · thiserror 2 · tracing · insta 1.47
Spec: docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md
File structure
whoareyou/
├── Cargo.toml # workspace (NEW)
├── justfile # build orchestration (NEW)
├── wit/provider.wit # provider contract (NEW)
├── crates/
│ ├── server/ # package whoareyou-server (lib + bin)
│ │ ├── Cargo.toml
│ │ ├── src/lib.rs # module exports
│ │ ├── src/main.rs # wiring only
│ │ ├── src/config.rs # env config
│ │ ├── src/error.rs # HostError, FetchError, ConfigError
│ │ ├── src/model.rs # Entry, Comment, ProviderResult, API types
│ │ ├── src/service.rs # ProviderHandle + Fetch traits, LookupService
│ │ ├── src/fetch.rs # ReqwestFetcher
│ │ ├── src/http.rs # axum router, normalize()
│ │ ├── src/wasm.rs # wasmtime host, WasmProvider
│ │ └── tests/component.rs # loads the real .wasm
│ └── providers/hitta/ # package whoareyou-provider-hitta (cdylib+rlib)
│ ├── Cargo.toml
│ ├── src/lib.rs
│ ├── src/parser.rs # pure parse logic + native tests
│ └── src/component.rs # wit-bindgen glue (wasm32 only)
├── fixtures/hitta/*.html # KEPT (+ one fresh fixture)
├── fetch-fixture # KEPT, trimmed to hitta
└── DELETED: src/, definitions/, _build.rs, NOTEPAD.md, old Cargo.toml contents
whoareyou-server is a lib + thin bin so tests/component.rs can use its modules.
Task 1: Workspace scaffold & demolition
Files:
- Delete: src/, definitions/, _build.rs, NOTEPAD.md
- Create: Cargo.toml (workspace), wit/provider.wit, crates/server/{Cargo.toml,src/lib.rs,src/main.rs}, crates/providers/hitta/{Cargo.toml,src/lib.rs}
- Modify: .gitignore
- Step 1: Install the wasm target
Run: rustup target add wasm32-wasip2
Expected: installs or "is up to date".
- Step 2: Delete the old code
git rm -r src definitions _build.rs NOTEPAD.md
(The old hitta parser is reproduced in Task 3 — nothing needed from the deleted tree.)
- Step 3: Write the workspace Cargo.toml (replaces the old package manifest)
[workspace]
resolver = "3"
members = ["crates/server", "crates/providers/hitta"]
[workspace.package]
version = "0.1.0"
edition = "2024"
authors = ["Anders Olsson <anders.e.olsson@gmail.com>"]
- Step 4: Write wit/provider.wit
package whoareyou:provider@0.1.0;
interface lookup {
record provider-info {
name: string,
version: string,
}
record request {
url: string,
}
record response {
status: u16,
body: string,
}
record comment {
timestamp: option<s64>,
title: option<string>,
message: string,
}
record entry {
messages: list<string>,
history: list<string>,
comments: list<comment>,
}
variant lookup-error {
no-data,
parse-failed(string),
}
metadata: func() -> provider-info;
requests: func(number: string) -> list<request>;
parse: func(number: string, responses: list<response>) -> result<entry, lookup-error>;
}
world provider {
export lookup;
}
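To make the division of labor in this contract concrete, the sketch below mirrors the three WIT functions as a plain Rust trait and shows a host driving them. It is an illustration only: `Provider`, `Outcome`, `host_lookup`, `EchoProvider`, and the closure-based fetch are hypothetical names, not the bindings that wit-bindgen or wasmtime generate, and the entry is reduced to a list of strings.

```rust
/// Hypothetical native mirror of the WIT `response` record.
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// Hypothetical mirror of `result<entry, lookup-error>`, reduced to history lines.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Entry(Vec<String>),
    NoData,
    ParseFailed(String),
}

/// Mirrors the three exported functions: the provider never touches the network.
pub trait Provider {
    fn metadata(&self) -> (String, String); // (name, version)
    fn requests(&self, number: &str) -> Vec<String>;
    fn parse(&self, number: &str, responses: &[Response]) -> Outcome;
}

/// The host asks for URLs, fetches every one itself, then hands the bodies back.
pub fn host_lookup<P, F>(provider: &P, number: &str, mut fetch: F) -> Outcome
where
    P: Provider,
    F: FnMut(&str) -> Response,
{
    let responses: Vec<Response> = provider
        .requests(number)
        .iter()
        .map(|url| fetch(url.as_str()))
        .collect();
    provider.parse(number, &responses)
}

/// Toy provider used only to exercise the flow.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn metadata(&self) -> (String, String) {
        ("echo.test".to_string(), "0.1.0".to_string())
    }

    fn requests(&self, number: &str) -> Vec<String> {
        vec![format!("https://example.test/{number}")]
    }

    fn parse(&self, _number: &str, responses: &[Response]) -> Outcome {
        match responses.first() {
            None => Outcome::ParseFailed("no responses provided".to_string()),
            Some(r) if r.body.is_empty() => Outcome::NoData,
            Some(r) => Outcome::Entry(vec![r.body.clone()]),
        }
    }
}
```

Passing the fetch as a closure keeps the sketch free of any HTTP client; in the plan itself that role is played by the host's fetcher.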
- Step 5: Create the server crate stub
crates/server/Cargo.toml:
[package]
name = "whoareyou-server"
version.workspace = true
edition.workspace = true
authors.workspace = true
[dependencies]
[dev-dependencies]
crates/server/src/lib.rs:
// modules added as they are implemented
crates/server/src/main.rs:
fn main() {}
- Step 6: Create the hitta provider crate stub
crates/providers/hitta/Cargo.toml:
[package]
name = "whoareyou-provider-hitta"
version.workspace = true
edition.workspace = true
authors.workspace = true
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
[dev-dependencies]
crates/providers/hitta/src/lib.rs:
// modules added as they are implemented
- Step 7: Ignore the components dir
Append to .gitignore (create if missing):
components/
- Step 8: Verify the workspace builds
Run: cargo check --workspace
Expected: success (two empty crates). Cargo.lock regenerates — that's fine.
- Step 9: Commit
git add -A
git commit -m "refactor!: replace CLI with workspace scaffold for WASM provider service"
Task 2: Refresh hitta fixture & audit page structure
The 2019 fixtures predate any hitta.se redesign. Before porting the parser, capture what the site serves today so Task 3 is written against reality.
Files:
- Create: fixtures/hitta/fresh-0104754350.html
- Step 1: Fetch a fresh copy of a known number's page
Run:
curl -sL -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
"https://www.hitta.se/vem-ringde/0104754350" \
-o fixtures/hitta/fresh-0104754350.html
wc -c fixtures/hitta/fresh-0104754350.html
Expected: a non-trivial file (> 10 KB). If the response is a bot-block page (check with head -c 2000), retry with the http --follow (httpie) variant from fetch-fixture, or fetch the page in a real browser (View Source → save). The fixture MUST contain real page markup before continuing.
- Step 2: Audit the page structure
Run:
grep -c "__NEXT_DATA__" fixtures/hitta/fresh-0104754350.html
grep -o '__NEXT_DATA__[^>]\{0,80\}' fixtures/hitta/fresh-0104754350.html | head -3
Two outcomes are possible. Record which one applies, because it determines how Step 5 of Task 3 adapts the parser:
- (a) __NEXT_DATA__ still present. Check whether it's still <script>__NEXT_DATA__ = {...};__NEXT_LOADED_PAGES__ (2019 inline style) or the modern <script id="__NEXT_DATA__" type="application/json">{...}</script> form. Note which.
- (b) Gone entirely. Inspect the page (python3 -m json.tool on any embedded JSON, or read the HTML) and locate where phone data + comments live now. Write down the JSON path to: comments list, comment text, comment timestamp, and the statistics/"X others searched" text. Task 3's serde structs must be adapted to those paths (the shape of the parser, regex/JSON extraction → typed structs → ParsedEntry, stays identical).
- Step 3: Commit the fixture
git add fixtures/hitta/fresh-0104754350.html
git commit -m "test: add fresh hitta.se fixture for parser port"
Task 3: hitta parser (pure logic, native TDD)
Port the old src/probe/hitta.rs parse logic (reproduced below) into the provider crate as plain functions. All tests run natively — no WASM involved.
Files:
- Create: crates/providers/hitta/src/parser.rs
- Modify: crates/providers/hitta/src/lib.rs, crates/providers/hitta/Cargo.toml
- Step 1: Add dependencies
In crates/providers/hitta/Cargo.toml set:
[dependencies]
regex = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
[dev-dependencies]
insta = { version = "1.47", features = ["yaml"] }
- Step 2: Declare the module
crates/providers/hitta/src/lib.rs:
pub mod parser;
- Step 3: Write the failing tests
Create crates/providers/hitta/src/parser.rs containing ONLY this test module. The types and functions it references don't exist yet, and that's the point:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn requests_single_hitta_url() {
assert_eq!(
request_urls("0700000000"),
vec!["https://www.hitta.se/vem-ringde/0700000000".to_string()]
);
}
#[test]
fn parses_number_with_comments() {
let body = include_str!("../../../../fixtures/hitta/0104754350.html");
let entry = parse(body).unwrap();
assert_eq!(entry.messages, Vec::<String>::new());
assert_eq!(entry.history, vec!["42 andra har rapporterat detta nummer"]);
assert_eq!(entry.comments.len(), 29);
// newest first
let first = &entry.comments[0];
assert_eq!(first.timestamp, Some(1547746162)); // 2019-01-17T17:29:22Z
assert_eq!(first.title, None);
assert_eq!(first.message, "Varmsälj från Folksam");
}
#[test]
fn parses_number_with_history_only() {
let body = include_str!("../../../../fixtures/hitta/0702269893.html");
let entry = parse(body).unwrap();
assert_eq!(entry.history, vec!["Tre andra har också sökt på detta nummer"]);
assert!(entry.comments.is_empty());
}
#[test]
fn no_phone_data_is_no_data() {
let body = include_str!("../../../../fixtures/hitta/0313908905.html");
assert_eq!(parse(body), Err(ParseError::NoData));
}
#[test]
fn unparseable_page_is_failed() {
let body = include_str!("../../../../fixtures/hitta/0701807618.html");
assert!(matches!(parse(body), Err(ParseError::Failed(_))));
}
#[test]
fn garbage_is_failed() {
assert!(matches!(parse("<html></html>"), Err(ParseError::Failed(_))));
}
#[test]
fn parses_fresh_fixture() {
let body = include_str!("../../../../fixtures/hitta/fresh-0104754350.html");
insta::assert_yaml_snapshot!(parse(body));
}
}
Semantics note (differs from the old CLI): the old code returned Ok with an
all-empty entry when JSON parsed but phoneData was absent. That is now
Err(ParseError::NoData). Old fixtures 0313908905, 0751793426/83/99 fall
in that bucket; 0701807618, 0546780862 fail the regex → Failed.
- Step 4: Run tests to verify they fail
Run: cargo test -p whoareyou-provider-hitta
Expected: COMPILE ERROR — request_urls, parse, ParseError not found.
- Step 5: Implement the parser
Prepend to crates/providers/hitta/src/parser.rs (above the test module). This is the 2019 logic ported; if Task 2 found outcome (b) or the modern <script id="__NEXT_DATA__"> form, adapt NEXT_DATA_RE / the serde structs to the JSON paths recorded in Task 2 — keep the public surface (request_urls, parse, the three types) exactly as below:
use std::sync::LazyLock;
use regex::Regex;
use serde::{Deserialize, Serialize};
#[derive(Debug, PartialEq, Serialize)]
pub struct ParsedEntry {
pub messages: Vec<String>,
pub history: Vec<String>,
pub comments: Vec<ParsedComment>,
}
#[derive(Debug, PartialEq, Serialize)]
pub struct ParsedComment {
/// Unix epoch seconds, UTC.
pub timestamp: Option<i64>,
pub title: Option<String>,
pub message: String,
}
#[derive(Debug, PartialEq, Serialize)]
pub enum ParseError {
/// Page fetched and understood, but it contains no data for the number.
NoData,
/// Page structure did not match expectations — scraper rot signal.
Failed(String),
}
static NEXT_DATA_RE: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(r"<script>__NEXT_DATA__ = (.*?);__NEXT_LOADED_PAGES__").unwrap()
});
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct Data {
props: Props,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct Props {
page_props: PageProps,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct PageProps {
phone_data: Option<PhoneData>,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct PhoneData {
#[serde(default)]
comments: Vec<RawComment>,
statistics_text: String,
}
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct RawComment {
comment: String,
/// Milliseconds since epoch.
timestamp: u64,
}
pub fn request_urls(number: &str) -> Vec<String> {
vec![format!("https://www.hitta.se/vem-ringde/{number}")]
}
pub fn parse(body: &str) -> Result<ParsedEntry, ParseError> {
let captures = NEXT_DATA_RE
.captures(body)
.ok_or_else(|| ParseError::Failed("__NEXT_DATA__ not found".to_string()))?;
let json = captures.get(1).unwrap().as_str();
let data: Data = serde_json::from_str(json)
.map_err(|e| ParseError::Failed(format!("deserialize __NEXT_DATA__: {e}")))?;
let Some(phone_data) = data.props.page_props.phone_data else {
return Err(ParseError::NoData);
};
let mut comments: Vec<ParsedComment> = phone_data
.comments
.into_iter()
.map(|c| ParsedComment {
timestamp: Some((c.timestamp / 1000) as i64),
title: None,
message: c.comment,
})
.collect();
comments.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
Ok(ParsedEntry {
messages: Vec::new(),
history: vec![phone_data.statistics_text],
comments,
})
}
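The conversion and ordering at the end of parse can be checked in isolation. A minimal std-only sketch follows; `to_epoch_seconds` and `sort_newest_first` are hypothetical helper names, not functions this plan defines. Because `Option` orders `None` before `Some`, a descending sort places entries without a timestamp last.

```rust
/// Convert a millisecond epoch timestamp to whole seconds (truncating).
pub fn to_epoch_seconds(millis: u64) -> i64 {
    (millis / 1000) as i64
}

/// Sort timestamps newest first; `None` ends up last because `None < Some(_)`.
pub fn sort_newest_first(timestamps: &mut [Option<i64>]) {
    timestamps.sort_by(|a, b| b.cmp(a));
}
```

The parser above always produces `Some(..)` for hitta comments, so the `None` case only matters for providers that lack timestamps.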
- Step 6: Run tests to verify they pass
Run: cargo test -p whoareyou-provider-hitta
Expected: all pass except possibly parses_fresh_fixture (pending snapshot).
If the fresh-fixture test FAILS to parse (Failed/NoData against a real
page that visibly has data), the site changed — adapt the regex/structs per
Task 2's notes until the fresh fixture parses, while keeping the 2019-fixture
tests passing (if the old format is truly gone from the new code path, update
those tests' expectations to Failed and note it in the commit message).
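If the fresh fixture turns out to use the modern script form, one way to adapt is to replace the regex with plain string searching that accepts both forms. The sketch below is std-only and rests on assumptions: `extract_next_data` is a hypothetical helper, and the modern markup is assumed to be exactly the `<script id="__NEXT_DATA__" ...>{...}</script>` shape described in Task 2, which has not been verified against the live site.

```rust
/// Return the JSON payload embedded in the page, trying the 2019 inline form
/// first and then the assumed `<script id="__NEXT_DATA__" ...>` form.
pub fn extract_next_data(body: &str) -> Option<&str> {
    const LEGACY_START: &str = "<script>__NEXT_DATA__ = ";
    const LEGACY_END: &str = ";__NEXT_LOADED_PAGES__";
    if let Some(start) = body.find(LEGACY_START) {
        let rest = &body[start + LEGACY_START.len()..];
        if let Some(end) = rest.find(LEGACY_END) {
            return Some(&rest[..end]);
        }
    }

    // Assumed modern markup: locate the tag by its id attribute, skip to the
    // end of the opening tag, and read up to the closing tag.
    const MODERN_ID: &str = "id=\"__NEXT_DATA__\"";
    let id_at = body.find(MODERN_ID)?;
    let rest = &body[id_at..];
    let open_end = rest.find('>')?;
    let rest = &rest[open_end + 1..];
    let close = rest.find("</script>")?;
    Some(&rest[..close])
}
```

Unlike the regex, whose `.` does not match newlines, this search also accepts a payload that spans several lines; whether that matters depends on what the fixture actually contains. A `None` here would map to ParseError::Failed in parse.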
- Step 7: Accept the fresh-fixture snapshot after eyeballing it
Run: cargo insta review (or cargo insta accept after inspecting the .snap.new file manually)
Expected: snapshot under crates/providers/hitta/src/snapshots/ showing a plausible entry (or an honest NoData/Failed for a dead number — verify it matches what the fixture actually contains).
- Step 8: Run the full test suite
Run: cargo test --workspace
Expected: PASS.
- Step 9: Commit
git add crates/providers/hitta .gitignore Cargo.lock
git commit -m "feat: port hitta.se parser as pure native-testable functions"
Task 4: hitta component glue (WIT export)
Files:
- Create: crates/providers/hitta/src/component.rs
- Modify: crates/providers/hitta/src/lib.rs, crates/providers/hitta/Cargo.toml
- Step 1: Add wit-bindgen for wasm32 only
Append to crates/providers/hitta/Cargo.toml:
[target.'cfg(target_arch = "wasm32")'.dependencies]
wit-bindgen = "0.57"
- Step 2: Write the glue
crates/providers/hitta/src/component.rs:
use crate::parser;
wit_bindgen::generate!({
world: "provider",
path: "../../../wit",
});
use exports::whoareyou::provider::lookup::{
Comment, Entry, Guest, LookupError, ProviderInfo, Request, Response,
};
struct Component;
impl Guest for Component {
fn metadata() -> ProviderInfo {
ProviderInfo {
name: "hitta.se".to_string(),
version: env!("CARGO_PKG_VERSION").to_string(),
}
}
fn requests(number: String) -> Vec<Request> {
parser::request_urls(&number)
.into_iter()
.map(|url| Request { url })
.collect()
}
fn parse(_number: String, responses: Vec<Response>) -> Result<Entry, LookupError> {
let Some(first) = responses.first() else {
return Err(LookupError::ParseFailed("no responses provided".to_string()));
};
match parser::parse(&first.body) {
Ok(entry) => Ok(Entry {
messages: entry.messages,
history: entry.history,
comments: entry
.comments
.into_iter()
.map(|c| Comment {
timestamp: c.timestamp,
title: c.title,
message: c.message,
})
.collect(),
}),
Err(parser::ParseError::NoData) => Err(LookupError::NoData),
Err(parser::ParseError::Failed(msg)) => Err(LookupError::ParseFailed(msg)),
}
}
}
export!(Component);
- Step 3: Gate it into the crate
crates/providers/hitta/src/lib.rs:
pub mod parser;
#[cfg(target_arch = "wasm32")]
mod component;
- Step 4: Build the component
Run: cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta
Expected: success; target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm exists.
- Step 5: Verify native tests still pass
Run: cargo test -p whoareyou-provider-hitta
Expected: PASS (glue is cfg'd out natively).
- Step 6: Commit
git add crates/providers/hitta
git commit -m "feat: export hitta parser as a WASM component via wit-bindgen"
Task 5: Server model types
Files:
- Create: crates/server/src/model.rs
- Modify: crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add first server dependencies
In crates/server/Cargo.toml:
[dependencies]
serde = { version = "1", features = ["derive"] }
[dev-dependencies]
serde_json = "1"
- Step 2: Write the failing test
crates/server/src/model.rs (test module only, types come next step):
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn provider_result_serializes_to_api_shape() {
let ok = ProviderResult::Ok {
entry: Entry {
messages: vec![],
history: vec!["42 andra".to_string()],
comments: vec![Comment {
timestamp: Some(1547746162),
title: None,
message: "Varmsälj".to_string(),
}],
},
};
let json = serde_json::to_value(&ok).unwrap();
assert_eq!(json["status"], "ok");
assert_eq!(json["entry"]["history"][0], "42 andra");
assert_eq!(json["entry"]["comments"][0]["timestamp"], 1547746162);
assert_eq!(
serde_json::to_value(&ProviderResult::NoData).unwrap()["status"],
"no_data"
);
assert_eq!(
serde_json::to_value(&ProviderResult::FetchFailed).unwrap()["status"],
"fetch_failed"
);
assert_eq!(
serde_json::to_value(&ProviderResult::ParseFailed).unwrap()["status"],
"parse_failed"
);
}
}
crates/server/src/lib.rs:
pub mod model;
- Step 3: Run test to verify it fails
Run: cargo test -p whoareyou-server
Expected: COMPILE ERROR — types not defined.
- Step 4: Implement the types
Prepend to crates/server/src/model.rs:
use std::collections::BTreeMap;
use serde::Serialize;
#[derive(Debug, Clone, PartialEq, Serialize)]
pub struct Entry {
pub messages: Vec<String>,
pub history: Vec<String>,
pub comments: Vec<Comment>,
}
#[derive(Debug, Clone, PartialEq, Serialize)]
pub struct Comment {
/// Unix epoch seconds, UTC.
pub timestamp: Option<i64>,
pub title: Option<String>,
pub message: String,
}
/// Per-provider outcome as exposed in the API (and cached).
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(tag = "status", rename_all = "snake_case")]
pub enum ProviderResult {
Ok { entry: Entry },
NoData,
FetchFailed,
ParseFailed,
}
/// A fetched HTTP response handed to a provider's `parse`.
#[derive(Debug, Clone)]
pub struct FetchedResponse {
pub status: u16,
pub body: String,
}
/// Outcome of a provider's `parse` call, before API mapping.
#[derive(Debug)]
pub enum ParseOutcome {
Ok(Entry),
NoData,
Failed(String),
}
#[derive(Debug, Serialize)]
pub struct LookupResponse {
pub number: String,
pub results: BTreeMap<String, ProviderResult>,
}
- Step 5: Run test to verify it passes
Run: cargo test -p whoareyou-server
Expected: PASS.
- Step 6: Commit
git add crates/server Cargo.lock
git commit -m "feat: add server model types and API serialization shape"
Task 6: Errors and fetcher
Files:
- Create: crates/server/src/error.rs, crates/server/src/fetch.rs
- Modify: crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add dependencies
Extend crates/server/Cargo.toml [dependencies]:
async-trait = "0.1"
reqwest = "0.13"
thiserror = "2"
tokio = { version = "1", features = ["full"] }
wasmtime = { version = "45", features = ["component-model"] }
(wasmtime is needed now because HostError wraps wasmtime::Error.)
- Step 2: Write crates/server/src/error.rs
use thiserror::Error;
/// Errors from hosting/calling a WASM component.
#[derive(Debug, Error)]
pub enum HostError {
#[error("wasm error: {0}")]
Wasm(#[from] wasmtime::Error),
#[error("io error: {0}")]
Io(#[from] std::io::Error),
}
#[derive(Debug, Error)]
pub enum FetchError {
#[error("request failed: {0}")]
Request(#[from] reqwest::Error),
}
#[derive(Debug, Error)]
pub enum ConfigError {
#[error("invalid value for {key}: {message}")]
Invalid { key: String, message: String },
}
- Step 3: Write crates/server/src/fetch.rs
The Fetch trait lives in service.rs, which Task 7 creates, so fetch.rs cannot
compile as a module yet. Write fetch.rs now but leave it out of lib.rs; Task 7
defines the trait and wires the module in.
crates/server/src/fetch.rs:
use std::time::Duration;
use async_trait::async_trait;
use crate::error::FetchError;
use crate::model::FetchedResponse;
use crate::service::Fetch;
pub struct ReqwestFetcher {
client: reqwest::Client,
}
impl ReqwestFetcher {
pub fn new(timeout: Duration) -> Result<Self, FetchError> {
let client = reqwest::Client::builder()
.timeout(timeout)
.user_agent(concat!("whoareyou/", env!("CARGO_PKG_VERSION")))
.build()?;
Ok(Self { client })
}
}
#[async_trait]
impl Fetch for ReqwestFetcher {
async fn fetch(&self, url: &str) -> Result<FetchedResponse, FetchError> {
let response = self.client.get(url).send().await?;
let status = response.status().as_u16();
let body = response.text().await?;
Ok(FetchedResponse { status, body })
}
}
- Step 4: Wire only error into lib.rs
crates/server/src/lib.rs:
pub mod error;
pub mod model;
- Step 5: Verify it compiles
Run: cargo check -p whoareyou-server
Expected: success (fetch.rs is not yet a module, so its crate::service import is not compiled).
- Step 6: Commit
git add crates/server Cargo.lock
git commit -m "feat: add server error types and reqwest fetcher"
Task 7: LookupService (orchestration + cache, TDD)
Files:
- Create: crates/server/src/service.rs
- Modify: crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add dependencies
Extend crates/server/Cargo.toml [dependencies]:
futures = "0.3"
moka = { version = "0.12", features = ["future"] }
tracing = "0.1"
- Step 2: Write the failing tests
crates/server/src/service.rs, test module first:
#[cfg(test)]
mod tests {
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Duration;
use async_trait::async_trait;
use super::*;
use crate::error::{FetchError, HostError};
use crate::model::{Comment, Entry, FetchedResponse, ParseOutcome, ProviderResult};
fn entry() -> Entry {
Entry {
messages: vec![],
history: vec!["history".to_string()],
comments: vec![Comment {
timestamp: Some(1547746162),
title: None,
message: "spam".to_string(),
}],
}
}
/// Provider whose parse outcome is scripted per call.
struct FakeProvider {
name: &'static str,
outcome: fn() -> ParseOutcome,
}
impl ProviderHandle for FakeProvider {
fn name(&self) -> &str {
self.name
}
fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
Ok(vec![format!("https://example.test/{number}")])
}
fn parse(
&self,
_number: &str,
_responses: &[FetchedResponse],
) -> ParseOutcome {
(self.outcome)()
}
}
/// Fetcher that counts calls and can be told to fail.
struct FakeFetcher {
calls: AtomicUsize,
fail: bool,
}
impl FakeFetcher {
fn new(fail: bool) -> Self {
Self { calls: AtomicUsize::new(0), fail }
}
}
#[async_trait]
impl Fetch for FakeFetcher {
async fn fetch(&self, _url: &str) -> Result<FetchedResponse, FetchError> {
self.calls.fetch_add(1, Ordering::SeqCst);
if self.fail {
// reqwest::Error has no public constructor, so this branch is a
// placeholder. Step 4 swaps it for a real failing request (see the
// note below the test module).
unreachable!("replaced in Step 4");
}
Ok(FetchedResponse { status: 200, body: "body".to_string() })
}
}
fn service(
providers: Vec<Arc<dyn ProviderHandle>>,
fetcher: Arc<dyn Fetch>,
) -> LookupService {
LookupService::new(providers, fetcher, Duration::from_secs(60))
}
#[tokio::test]
async fn ok_result_is_returned_and_cached() {
let provider = Arc::new(FakeProvider {
name: "fake.se",
outcome: || ParseOutcome::Ok(entry()),
});
let fetcher = Arc::new(FakeFetcher::new(false));
let svc = service(vec![provider], fetcher.clone());
let results = svc.lookup("0700000000").await;
assert_eq!(results["fake.se"], ProviderResult::Ok { entry: entry() });
// second lookup served from cache — fetcher not called again
let results = svc.lookup("0700000000").await;
assert_eq!(results["fake.se"], ProviderResult::Ok { entry: entry() });
assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
}
#[tokio::test]
async fn no_data_is_cached() {
let provider = Arc::new(FakeProvider { name: "fake.se", outcome: || ParseOutcome::NoData });
let fetcher = Arc::new(FakeFetcher::new(false));
let svc = service(vec![provider], fetcher.clone());
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::NoData);
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::NoData);
assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
}
#[tokio::test]
async fn parse_failure_maps_and_is_cached() {
let provider = Arc::new(FakeProvider {
name: "fake.se",
outcome: || ParseOutcome::Failed("rot".to_string()),
});
let fetcher = Arc::new(FakeFetcher::new(false));
let svc = service(vec![provider], fetcher.clone());
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::ParseFailed);
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::ParseFailed);
assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
}
#[tokio::test]
async fn fetch_failure_is_not_cached() {
let provider = Arc::new(FakeProvider {
name: "fake.se",
outcome: || ParseOutcome::NoData,
});
let fetcher = Arc::new(FakeFetcher::new(true));
let svc = service(vec![provider], fetcher.clone());
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::FetchFailed);
assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::FetchFailed);
// NOT cached: fetcher tried twice
assert_eq!(fetcher.calls.load(Ordering::SeqCst), 2);
}
#[tokio::test]
async fn multiple_providers_keyed_by_name() {
let a = Arc::new(FakeProvider { name: "a.se", outcome: || ParseOutcome::NoData });
let b = Arc::new(FakeProvider {
name: "b.se",
outcome: || ParseOutcome::Ok(entry()),
});
let fetcher = Arc::new(FakeFetcher::new(false));
let svc = service(vec![a, b], fetcher);
let results = svc.lookup("0700000000").await;
assert_eq!(results.len(), 2);
assert_eq!(results["a.se"], ProviderResult::NoData);
assert!(matches!(results["b.se"], ProviderResult::Ok { .. }));
}
}
Fabricating a FetchError in tests: reqwest::Error cannot be constructed
directly. Make the fail path real instead of fabricated — in Step 4's
implementation of FakeFetcher::fetch, replace the unreachable! with an
actual failing request against a closed local port:
if self.fail {
let err = reqwest::Client::new()
.get("http://127.0.0.1:1/unreachable")
.send()
.await
.unwrap_err();
return Err(FetchError::Request(err));
}
(Port 1 is almost never listening, so the connection is normally refused immediately and no external network is involved.)
- Step 3: Run tests to verify they fail
Run: cargo test -p whoareyou-server
Expected: COMPILE ERROR — ProviderHandle, Fetch, LookupService not defined.
- Step 4: Implement the service
Prepend to crates/server/src/service.rs (and fix the FakeFetcher fail path as noted above):
use std::collections::BTreeMap;
use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use moka::future::Cache;
use tracing::warn;
use crate::error::{FetchError, HostError};
use crate::model::{FetchedResponse, ParseOutcome, ProviderResult};
/// A loaded provider. Implemented by `wasm::WasmProvider`; faked in tests.
/// Methods are sync — WASM calls are CPU-bound; the service wraps them in
/// `spawn_blocking`.
pub trait ProviderHandle: Send + Sync {
fn name(&self) -> &str;
fn requests(&self, number: &str) -> Result<Vec<String>, HostError>;
fn parse(&self, number: &str, responses: &[FetchedResponse]) -> ParseOutcome;
}
#[async_trait]
pub trait Fetch: Send + Sync {
async fn fetch(&self, url: &str) -> Result<FetchedResponse, FetchError>;
}
pub struct LookupService {
providers: Vec<Arc<dyn ProviderHandle>>,
fetcher: Arc<dyn Fetch>,
cache: Cache<String, ProviderResult>,
}
impl LookupService {
pub fn new(
providers: Vec<Arc<dyn ProviderHandle>>,
fetcher: Arc<dyn Fetch>,
cache_ttl: Duration,
) -> Self {
Self {
providers,
fetcher,
cache: Cache::builder().time_to_live(cache_ttl).build(),
}
}
pub fn provider_names(&self) -> Vec<&str> {
self.providers.iter().map(|p| p.name()).collect()
}
/// Run all providers concurrently; one result per provider name.
pub async fn lookup(&self, number: &str) -> BTreeMap<String, ProviderResult> {
let tasks = self.providers.iter().map(|provider| {
let provider = provider.clone();
let fetcher = self.fetcher.clone();
let cache = self.cache.clone();
let number = number.to_string();
async move {
let name = provider.name().to_string();
let key = format!("{name}:{number}");
if let Some(hit) = cache.get(&key).await {
return (name, hit);
}
let result = run_provider(provider, &number, fetcher).await;
// Transient failures must not poison the cache.
if result != ProviderResult::FetchFailed {
cache.insert(key, result.clone()).await;
}
(name, result)
}
});
futures::future::join_all(tasks).await.into_iter().collect()
}
}
async fn run_provider(
provider: Arc<dyn ProviderHandle>,
number: &str,
fetcher: Arc<dyn Fetch>,
) -> ProviderResult {
let name = provider.name().to_string();
let urls = {
let provider = provider.clone();
let number = number.to_string();
match tokio::task::spawn_blocking(move || provider.requests(&number)).await {
Ok(Ok(urls)) => urls,
Ok(Err(error)) => {
warn!(provider = %name, %error, "requests() failed");
return ProviderResult::ParseFailed;
}
Err(error) => {
warn!(provider = %name, %error, "requests() panicked");
return ProviderResult::ParseFailed;
}
}
};
let fetched = futures::future::join_all(urls.iter().map(|url| fetcher.fetch(url))).await;
let mut responses = Vec::with_capacity(fetched.len());
for result in fetched {
match result {
Ok(response) => responses.push(response),
Err(error) => {
warn!(provider = %name, %error, "fetch failed");
return ProviderResult::FetchFailed;
}
}
}
let outcome = {
let provider = provider.clone();
let number = number.to_string();
tokio::task::spawn_blocking(move || provider.parse(&number, &responses)).await
};
match outcome {
Ok(ParseOutcome::Ok(entry)) => ProviderResult::Ok { entry },
Ok(ParseOutcome::NoData) => ProviderResult::NoData,
Ok(ParseOutcome::Failed(message)) => {
warn!(provider = %name, %message, "parse failed — scraper rot?");
ProviderResult::ParseFailed
}
Err(error) => {
warn!(provider = %name, %error, "parse() panicked");
ProviderResult::ParseFailed
}
}
}
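The caching rule in lookup (cache every outcome except FetchFailed, keyed by provider name and number) can be sketched without moka or tokio. The version below is a single-threaded, std-only illustration; `TtlCache`, `Outcome`, and `get_or_compute` are hypothetical names and deliberately not the moka API the service uses.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `ProviderResult`, reduced to its status.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Outcome {
    Ok,
    NoData,
    FetchFailed,
    ParseFailed,
}

/// Minimal single-threaded TTL cache; the plan itself uses `moka::future::Cache`.
pub struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Outcome)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return a fresh cached outcome, or run `compute` and cache anything
    /// except `FetchFailed`, which is treated as transient.
    pub fn get_or_compute<F>(&mut self, provider: &str, number: &str, compute: F) -> Outcome
    where
        F: FnOnce() -> Outcome,
    {
        // Same key shape as the service: "<provider>:<number>".
        let key = format!("{provider}:{number}");
        if let Some((stored_at, outcome)) = self.entries.get(&key) {
            if stored_at.elapsed() < self.ttl {
                return *outcome;
            }
        }
        let outcome = compute();
        if outcome != Outcome::FetchFailed {
            self.entries.insert(key, (Instant::now(), outcome));
        }
        outcome
    }
}
```

Caching NoData and ParseFailed trades freshness for fewer upstream requests, while leaving FetchFailed uncached lets the next lookup retry right away.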
- Step 5: Wire modules into lib.rs
crates/server/src/lib.rs:
pub mod error;
pub mod fetch;
pub mod model;
pub mod service;
- Step 6: Run tests to verify they pass
Run: cargo test -p whoareyou-server
Expected: PASS (all five service tests + model test).
- Step 7: Commit
git add crates/server Cargo.lock
git commit -m "feat: add LookupService with moka cache and provider orchestration"
Task 8: HTTP layer (axum, TDD)
Files:
- Create: crates/server/src/http.rs
- Modify: crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add dependencies
Extend crates/server/Cargo.toml:
[dependencies]
# add:
axum = "0.8"
serde_json = "1"
[dev-dependencies]
# add:
http-body-util = "0.1"
tower = { version = "0.5", features = ["util"] }
(serde_json moves from dev-dependencies to dependencies — remove the dev entry.)
- Step 2: Write the failing tests
crates/server/src/http.rs, test module first:
#[cfg(test)]
mod tests {
use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use axum::body::Body;
use axum::http::{Request, StatusCode};
use http_body_util::BodyExt;
use tower::ServiceExt;
use super::*;
use crate::error::{FetchError, HostError};
use crate::model::{FetchedResponse, ParseOutcome};
use crate::service::{Fetch, LookupService, ProviderHandle};
struct NoDataProvider;
impl ProviderHandle for NoDataProvider {
fn name(&self) -> &str {
"fake.se"
}
fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
Ok(vec![format!("https://example.test/{number}")])
}
fn parse(&self, _: &str, _: &[FetchedResponse]) -> ParseOutcome {
ParseOutcome::NoData
}
}
struct StaticFetcher;
#[async_trait]
impl Fetch for StaticFetcher {
async fn fetch(&self, _: &str) -> Result<FetchedResponse, FetchError> {
Ok(FetchedResponse { status: 200, body: String::new() })
}
}
fn app() -> axum::Router {
let service = LookupService::new(
vec![Arc::new(NoDataProvider)],
Arc::new(StaticFetcher),
Duration::from_secs(60),
);
router(Arc::new(service))
}
#[test]
fn normalize_strips_separators() {
assert_eq!(normalize("0700 00-00.00"), Some("0700000000".to_string()));
assert_eq!(normalize("+46701234567"), Some("+46701234567".to_string()));
}
#[test]
fn normalize_rejects_garbage() {
assert_eq!(normalize("not-a-number"), None);
assert_eq!(normalize(""), None);
assert_eq!(normalize("0"), None);
assert_eq!(normalize("07001231231231231231"), None); // > 15 digits
assert_eq!(normalize("070+123"), None); // '+' not at start
}
#[tokio::test]
async fn lookup_returns_results_keyed_by_provider() {
let response = app()
.oneshot(
Request::builder()
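// Spaces are not valid in a URI, so they are percent-encoded here;
// axum's Path extractor decodes them before normalize() runs.
.uri("/api/v1/number/0700%2000-00%2000")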
.body(Body::empty())
.unwrap(),
)
.await
.unwrap();
assert_eq!(response.status(), StatusCode::OK);
let bytes = response.into_body().collect().await.unwrap().to_bytes();
let json: serde_json::Value = serde_json::from_slice(&bytes).unwrap();
assert_eq!(json["number"], "0700000000");
assert_eq!(json["results"]["fake.se"]["status"], "no_data");
}
#[tokio::test]
async fn invalid_number_is_400() {
let response = app()
.oneshot(
Request::builder()
.uri("/api/v1/number/banana")
.body(Body::empty())
.unwrap(),
)
.await
.unwrap();
assert_eq!(response.status(), StatusCode::BAD_REQUEST);
}
#[tokio::test]
async fn healthz_is_ok() {
let response = app()
.oneshot(Request::builder().uri("/healthz").body(Body::empty()).unwrap())
.await
.unwrap();
assert_eq!(response.status(), StatusCode::OK);
}
}
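Request paths cannot carry raw spaces, so a number written with separators has to be percent-encoded before it goes into a test URI. The sketch below shows one way to do that with the standard library only. Both `encode_path_segment` and `lookup_uri` are hypothetical helpers for illustration, not part of this plan or of axum.

```rust
/// Hypothetical test helper (not part of the plan): percent-encode one URI
/// path segment, leaving only RFC 3986 "unreserved" characters untouched.
fn encode_path_segment(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            // Everything else, including ' ' and '+', becomes %XX.
            other => out.push_str(&format!("%{other:02X}")),
        }
    }
    out
}

/// Build the lookup path used by the tests for an arbitrary raw number.
fn lookup_uri(raw_number: &str) -> String {
    format!("/api/v1/number/{}", encode_path_segment(raw_number))
}
```

A test could then pass `lookup_uri("0700 00-00 00")` to `Request::builder().uri(...)`; the handler should still see the decoded segment, since axum's `Path` extractor percent-decodes it.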
- Step 3: Run tests to verify they fail
Run: cargo test -p whoareyou-server
Expected: COMPILE ERROR — router, normalize not defined.
- Step 4: Implement the HTTP layer
Prepend to crates/server/src/http.rs:
use std::sync::Arc;
use axum::Json;
use axum::Router;
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use axum::routing::get;
use serde_json::json;
use crate::model::LookupResponse;
use crate::service::LookupService;
pub fn router(service: Arc<LookupService>) -> Router {
Router::new()
.route("/api/v1/number/{number}", get(lookup_number))
.route("/healthz", get(|| async { "ok" }))
.with_state(service)
}
async fn lookup_number(
State(service): State<Arc<LookupService>>,
Path(raw): Path<String>,
) -> Response {
let Some(number) = normalize(&raw) else {
return (
StatusCode::BAD_REQUEST,
Json(json!({ "error": "invalid phone number" })),
)
.into_response();
};
let results = service.lookup(&number).await;
Json(LookupResponse { number, results }).into_response()
}
/// Strip separators and validate: optional leading '+', then 2–15 digits.
pub fn normalize(raw: &str) -> Option<String> {
let cleaned: String = raw
.chars()
.filter(|c| !matches!(c, ' ' | '-' | '.'))
.collect();
let digits = cleaned.strip_prefix('+').unwrap_or(&cleaned);
let valid = (2..=15).contains(&digits.len())
&& digits.chars().all(|c| c.is_ascii_digit());
valid.then_some(cleaned)
}
- Step 5: Wire the module
crates/server/src/lib.rs:
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
- Step 6: Run tests to verify they pass
Run: cargo test -p whoareyou-server
Expected: PASS.
- Step 7: Commit
git add crates/server Cargo.lock
git commit -m "feat: add axum HTTP layer with lookup endpoint and healthz"
Task 9: wasmtime host (WasmProvider)
Files:
- Create: crates/server/src/wasm.rs
- Modify: crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add wasmtime-wasi
Extend crates/server/Cargo.toml [dependencies]:
wasmtime-wasi = "45"
- Step 2: Write crates/server/src/wasm.rs
API-drift note: the WasiView/WasiCtxView shape below matches recent wasmtime-wasi releases as of this plan's writing. If cargo check disagrees, consult https://docs.rs/wasmtime-wasi/45. The intent is fixed: a store data struct holding WasiCtx + ResourceTable, WASI added to the linker through the sync bindings, no preopens / no env / no inherited stdio. Adapt mechanically; do not change the public surface of this module.
use std::path::Path;
use wasmtime::component::{Component, Linker};
use wasmtime::{Config, Engine, Store};
use wasmtime_wasi::ResourceTable;
use wasmtime_wasi::p2::{WasiCtx, WasiCtxBuilder, WasiCtxView, WasiView};
use crate::error::HostError;
use crate::model::{Comment, Entry, FetchedResponse, ParseOutcome};
use crate::service::ProviderHandle;
wasmtime::component::bindgen!({
world: "provider",
path: "../../wit",
});
use exports::whoareyou::provider::lookup::{LookupError as WitLookupError, Response as WitResponse};
/// How many epoch ticks a guest call may run. The epoch thread ticks every
/// 100 ms → 50 ticks ≈ 5 s budget per call.
const EPOCH_DEADLINE_TICKS: u64 = 50;
pub const EPOCH_TICK: std::time::Duration = std::time::Duration::from_millis(100);
pub struct HostState {
ctx: WasiCtx,
table: ResourceTable,
}
impl WasiView for HostState {
fn ctx(&mut self) -> WasiCtxView<'_> {
WasiCtxView { ctx: &mut self.ctx, table: &mut self.table }
}
}
pub fn engine() -> Result<Engine, HostError> {
let mut config = Config::new();
config.epoch_interruption(true);
Ok(Engine::new(&config)?)
}
pub fn linker(engine: &Engine) -> Result<Linker<HostState>, HostError> {
let mut linker = Linker::new(engine);
wasmtime_wasi::p2::add_to_linker_sync(&mut linker)?;
Ok(linker)
}
/// Spawn the thread that advances the engine epoch so runaway guest calls
/// trap instead of hanging the service. Call once at startup.
pub fn spawn_epoch_thread(engine: &Engine) {
let engine = engine.clone();
std::thread::spawn(move || {
loop {
std::thread::sleep(EPOCH_TICK);
engine.increment_epoch();
}
});
}
pub struct WasmProvider {
name: String,
version: String,
engine: Engine,
pre: ProviderPre<HostState>,
}
impl WasmProvider {
/// Compile a component from disk and read its metadata once.
/// Fails fast if the component does not satisfy the provider world.
pub fn load(
engine: &Engine,
linker: &Linker<HostState>,
path: &Path,
) -> Result<Self, HostError> {
let component = Component::from_file(engine, path)?;
let pre = ProviderPre::new(linker.instantiate_pre(&component)?)?;
let mut provider = Self {
name: String::new(),
version: String::new(),
engine: engine.clone(),
pre,
};
let mut store = provider.new_store();
let instance = provider.pre.instantiate(&mut store)?;
let info = instance.whoareyou_provider_lookup().call_metadata(&mut store)?;
provider.name = info.name;
provider.version = info.version;
Ok(provider)
}
pub fn version(&self) -> &str {
&self.version
}
fn new_store(&self) -> Store<HostState> {
// No preopens, no env, no inherited stdio — fully sandboxed guest.
let ctx = WasiCtxBuilder::new().build();
let mut store = Store::new(
&self.engine,
HostState { ctx, table: ResourceTable::new() },
);
store.set_epoch_deadline(EPOCH_DEADLINE_TICKS);
store
}
}
impl ProviderHandle for WasmProvider {
fn name(&self) -> &str {
&self.name
}
fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
let mut store = self.new_store();
let instance = self.pre.instantiate(&mut store)?;
let requests = instance
.whoareyou_provider_lookup()
.call_requests(&mut store, number)?;
Ok(requests.into_iter().map(|r| r.url).collect())
}
fn parse(&self, number: &str, responses: &[FetchedResponse]) -> ParseOutcome {
let wit_responses: Vec<WitResponse> = responses
.iter()
.map(|r| WitResponse { status: r.status, body: r.body.clone() })
.collect();
let mut store = self.new_store();
let result = (|| {
let instance = self.pre.instantiate(&mut store)?;
instance
.whoareyou_provider_lookup()
.call_parse(&mut store, number, &wit_responses)
})();
match result {
Ok(Ok(entry)) => ParseOutcome::Ok(Entry {
messages: entry.messages,
history: entry.history,
comments: entry
.comments
.into_iter()
.map(|c| Comment {
timestamp: c.timestamp,
title: c.title,
message: c.message,
})
.collect(),
}),
Ok(Err(WitLookupError::NoData)) => ParseOutcome::NoData,
Ok(Err(WitLookupError::ParseFailed(message))) => ParseOutcome::Failed(message),
// Trap (incl. epoch deadline exceeded) or instantiation failure.
Err(error) => ParseOutcome::Failed(format!("component error: {error}")),
}
}
}
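The deadline arithmetic and the trap-on-deadline behaviour above can be illustrated with a toy model. This is a standard-library sketch of the idea only, not how wasmtime implements epochs: `Epoch`, `run_with_deadline`, and `call_budget` are hypothetical names, and a real guest is interrupted by the engine rather than by a cooperative loop.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Wall-clock budget implied by a deadline expressed in epoch ticks.
fn call_budget(ticks: u64, tick: Duration) -> Duration {
    let ticks = u32::try_from(ticks).unwrap_or(u32::MAX);
    tick.saturating_mul(ticks)
}

/// Toy stand-in for the engine-wide epoch counter (assumption for illustration).
struct Epoch(AtomicU64);

impl Epoch {
    fn new() -> Self {
        Epoch(AtomicU64::new(0))
    }
    /// What the ticker thread would do once per tick.
    fn increment(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn current(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Run `step` until it reports completion, or fail once the epoch reaches
/// `current + ticks`, mirroring a trap on an exceeded deadline.
fn run_with_deadline(
    epoch: &Epoch,
    ticks: u64,
    mut step: impl FnMut() -> bool,
) -> Result<(), &'static str> {
    let deadline = epoch.current() + ticks;
    loop {
        if step() {
            return Ok(());
        }
        if epoch.current() >= deadline {
            return Err("epoch deadline exceeded");
        }
    }
}
```

With the constants from this task, `call_budget(50, Duration::from_millis(100))` gives the 5 s budget; because the deadline is relative to the epoch at store creation, a fresh store per call gets a fresh budget.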
- Step 3: Wire the module
crates/server/src/lib.rs:
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
pub mod wasm;
- Step 4: Verify it compiles (adapt API drift here if needed)
Run: cargo check -p whoareyou-server
Expected: success. If WasiView/WasiCtxView/add_to_linker_sync signatures
drifted in wasmtime-wasi 45, fix per the docs.rs note above and re-check.
- Step 5: Run all tests
Run: cargo test -p whoareyou-server
Expected: PASS (no new tests — real coverage lands in Task 10's integration test).
- Step 6: Commit
git add crates/server Cargo.lock
git commit -m "feat: add wasmtime host with epoch-bounded WasmProvider"
Task 10: Component integration test
Proves the WIT boundary end-to-end: the real .wasm built from Task 4, loaded by the real host from Task 9.
Files:
- Create: crates/server/tests/component.rs
- Step 1: Build the component
Run: cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta
Expected: target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm exists.
- Step 2: Write the integration test
crates/server/tests/component.rs:
use std::path::Path;
use whoareyou_server::model::{FetchedResponse, ParseOutcome};
use whoareyou_server::service::ProviderHandle;
use whoareyou_server::wasm;
const COMPONENT_PATH: &str = concat!(
env!("CARGO_MANIFEST_DIR"),
"/../../target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm"
);
fn load_provider() -> wasm::WasmProvider {
let path = Path::new(COMPONENT_PATH);
assert!(
path.exists(),
"hitta component not built — run `just build-components` first"
);
let engine = wasm::engine().unwrap();
let linker = wasm::linker(&engine).unwrap();
wasm::spawn_epoch_thread(&engine);
wasm::WasmProvider::load(&engine, &linker, path).unwrap()
}
#[test]
fn metadata_identifies_hitta() {
let provider = load_provider();
assert_eq!(provider.name(), "hitta.se");
assert!(!provider.version().is_empty());
}
#[test]
fn requests_contain_the_number() {
let provider = load_provider();
let urls = provider.requests("0104754350").unwrap();
assert_eq!(urls, vec!["https://www.hitta.se/vem-ringde/0104754350"]);
}
#[test]
fn parse_roundtrips_a_fixture_through_wasm() {
let provider = load_provider();
let body = include_str!("../../../fixtures/hitta/0104754350.html").to_string();
let outcome = provider.parse(
"0104754350",
&[FetchedResponse { status: 200, body }],
);
let ParseOutcome::Ok(entry) = outcome else {
panic!("expected Ok entry, got {outcome:?}");
};
assert_eq!(entry.history, vec!["42 andra har rapporterat detta nummer"]);
assert_eq!(entry.comments.len(), 29);
assert_eq!(entry.comments[0].timestamp, Some(1547746162));
}
#[test]
fn parse_maps_no_data() {
let provider = load_provider();
let body = include_str!("../../../fixtures/hitta/0313908905.html").to_string();
let outcome = provider.parse(
"0313908905",
&[FetchedResponse { status: 200, body }],
);
assert!(matches!(outcome, ParseOutcome::NoData), "got {outcome:?}");
}
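Each test above builds its own engine and spawns its own never-ending epoch thread. That is harmless for four tests, but if the suite grows, the setup could be shared. The sketch below shows the once-only initialisation pattern with `std::sync::OnceLock`; `SharedHost` is a hypothetical stand-in for the engine/linker pair, and it assumes the real types satisfy the `Send + Sync` bounds a `static` requires.

```rust
use std::sync::OnceLock;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how often the expensive setup ran (stands in for engine creation
/// plus spawning the epoch thread).
static SETUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the shared (engine, linker) pair.
struct SharedHost {
    id: usize,
}

/// Return the process-wide host, creating it on first use only.
fn shared_host() -> &'static SharedHost {
    static HOST: OnceLock<SharedHost> = OnceLock::new();
    HOST.get_or_init(|| {
        let id = SETUPS.fetch_add(1, Ordering::SeqCst) + 1;
        SharedHost { id }
    })
}
```

A `load_provider` built on this would call `shared_host()` and then load the component, so only one epoch thread exists per test binary.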
- Step 3: Run the integration test
Run: cargo test -p whoareyou-server --test component
Expected: 4 tests PASS. (If 0104754350.html parse expectations changed in
Task 3 Step 6's contingency branch, mirror the same expectations here.)
- Step 4: Commit
git add crates/server/tests
git commit -m "test: prove WIT boundary with real component integration test"
Task 11: Config + main wiring
Files:
- Create: crates/server/src/config.rs
- Modify: crates/server/src/main.rs, crates/server/src/lib.rs, crates/server/Cargo.toml
- Step 1: Add binary dependencies
Extend crates/server/Cargo.toml [dependencies]:
anyhow = "1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
- Step 2: Write the failing config tests
crates/server/src/config.rs, test module first:
#[cfg(test)]
mod tests {
use std::collections::HashMap;
use super::*;
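// No `+ '_` bound: the closure owns `map`, and with several elided input
// lifetimes an anonymous lifetime in the return type cannot be inferred.
fn env(pairs: &[(&str, &str)]) -> impl Fn(&str) -> Option<String> {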
let map: HashMap<String, String> = pairs
.iter()
.map(|(k, v)| (k.to_string(), v.to_string()))
.collect();
move |key: &str| map.get(key).cloned()
}
#[test]
fn defaults_apply_when_unset() {
let config = AppConfig::from_lookup(env(&[])).unwrap();
assert_eq!(config.listen.to_string(), "127.0.0.1:8080");
assert_eq!(config.components_dir, std::path::PathBuf::from("components"));
assert_eq!(config.cache_ttl, std::time::Duration::from_secs(24 * 3600));
assert_eq!(config.fetch_timeout, std::time::Duration::from_secs(10));
}
#[test]
fn env_overrides_apply() {
let config = AppConfig::from_lookup(env(&[
("WHOAREYOU_LISTEN", "0.0.0.0:9000"),
("WHOAREYOU_COMPONENTS_DIR", "/opt/providers"),
("WHOAREYOU_CACHE_TTL_HOURS", "1"),
("WHOAREYOU_FETCH_TIMEOUT_SECS", "30"),
]))
.unwrap();
assert_eq!(config.listen.to_string(), "0.0.0.0:9000");
assert_eq!(config.components_dir, std::path::PathBuf::from("/opt/providers"));
assert_eq!(config.cache_ttl, std::time::Duration::from_secs(3600));
assert_eq!(config.fetch_timeout, std::time::Duration::from_secs(30));
}
#[test]
fn invalid_values_error() {
assert!(AppConfig::from_lookup(env(&[("WHOAREYOU_LISTEN", "not-an-addr")])).is_err());
assert!(AppConfig::from_lookup(env(&[("WHOAREYOU_CACHE_TTL_HOURS", "soon")])).is_err());
}
}
- Step 3: Run tests to verify they fail
Run: cargo test -p whoareyou-server config
Expected: COMPILE ERROR — AppConfig not defined. (First wire pub mod config; into lib.rs.)
- Step 4: Implement config
Prepend to crates/server/src/config.rs:
use std::net::SocketAddr;
use std::path::PathBuf;
use std::time::Duration;
use crate::error::ConfigError;
#[derive(Debug)]
pub struct AppConfig {
pub listen: SocketAddr,
pub components_dir: PathBuf,
pub cache_ttl: Duration,
pub fetch_timeout: Duration,
}
impl AppConfig {
pub fn from_env() -> Result<Self, ConfigError> {
Self::from_lookup(|key| std::env::var(key).ok())
}
pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Self, ConfigError> {
let listen = match get("WHOAREYOU_LISTEN") {
Some(value) => value.parse().map_err(|e| ConfigError::Invalid {
key: "WHOAREYOU_LISTEN".to_string(),
message: format!("{e}"),
})?,
None => SocketAddr::from(([127, 0, 0, 1], 8080)),
};
let components_dir = get("WHOAREYOU_COMPONENTS_DIR")
.map(PathBuf::from)
.unwrap_or_else(|| PathBuf::from("components"));
let cache_ttl_hours: u64 = parse_or("WHOAREYOU_CACHE_TTL_HOURS", &get, 24)?;
let fetch_timeout_secs: u64 = parse_or("WHOAREYOU_FETCH_TIMEOUT_SECS", &get, 10)?;
Ok(Self {
listen,
components_dir,
cache_ttl: Duration::from_secs(cache_ttl_hours * 3600),
fetch_timeout: Duration::from_secs(fetch_timeout_secs),
})
}
}
fn parse_or(
key: &str,
get: &impl Fn(&str) -> Option<String>,
default: u64,
) -> Result<u64, ConfigError> {
match get(key) {
Some(value) => value.parse().map_err(|e| ConfigError::Invalid {
key: key.to_string(),
message: format!("{e}"),
}),
None => Ok(default),
}
}
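One edge case in the conversion above: `cache_ttl_hours * 3600` can overflow `u64` for absurdly large inputs (a panic in debug builds, wrap-around in release). A minimal sketch of an overflow-aware conversion follows; `ttl_from_hours` is a hypothetical helper, and mapping `None` to `ConfigError::Invalid` would be the caller's choice.

```rust
use std::time::Duration;

/// Convert a TTL given in hours to a Duration.
/// Returns None when `hours * 3600` does not fit in a u64.
fn ttl_from_hours(hours: u64) -> Option<Duration> {
    hours.checked_mul(3600).map(Duration::from_secs)
}
```

For example, `ttl_from_hours(24)` yields the 24-hour default, while `ttl_from_hours(u64::MAX)` yields `None` instead of a wrapped value.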
crates/server/src/lib.rs final state:
pub mod config;
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
pub mod wasm;
- Step 5: Run tests to verify they pass
Run: cargo test -p whoareyou-server config
Expected: PASS.
- Step 6: Write main.rs
crates/server/src/main.rs:
use std::sync::Arc;
use anyhow::Context;
use tracing::info;
use tracing_subscriber::EnvFilter;
use whoareyou_server::config::AppConfig;
use whoareyou_server::fetch::ReqwestFetcher;
use whoareyou_server::service::{LookupService, ProviderHandle};
use whoareyou_server::{http, wasm};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
tracing_subscriber::fmt()
.with_env_filter(
EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info")),
)
.init();
let config = AppConfig::from_env()?;
let engine = wasm::engine()?;
let linker = wasm::linker(&engine)?;
wasm::spawn_epoch_thread(&engine);
let mut providers: Vec<Arc<dyn ProviderHandle>> = Vec::new();
let dir = std::fs::read_dir(&config.components_dir).with_context(|| {
format!("reading components dir {:?}", config.components_dir)
})?;
for entry in dir {
let path = entry?.path();
if path.extension().is_some_and(|ext| ext == "wasm") {
let provider = wasm::WasmProvider::load(&engine, &linker, &path)
.with_context(|| format!("loading component {path:?}"))?;
info!(
name = provider.name(),
version = provider.version(),
?path,
"loaded provider"
);
providers.push(Arc::new(provider));
}
}
anyhow::ensure!(
!providers.is_empty(),
"no .wasm components found in {:?}",
config.components_dir
);
let fetcher = Arc::new(ReqwestFetcher::new(config.fetch_timeout)?);
let service = Arc::new(LookupService::new(providers, fetcher, config.cache_ttl));
let app = http::router(service);
let listener = tokio::net::TcpListener::bind(config.listen).await?;
info!("listening on http://{}", config.listen);
axum::serve(listener, app).await?;
Ok(())
}
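`std::fs::read_dir` makes no ordering guarantee, so the provider load order above may differ between platforms. If a stable order matters (for logs or for the order of keys in responses), the discovery step could be factored out and sorted. A minimal sketch, where `wasm_files` is a hypothetical helper and not part of the plan:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect the `*.wasm` files directly inside `dir`, sorted by path so the
/// load order is the same on every platform.
fn wasm_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Skip directories and anything without a `.wasm` extension.
        if path.is_file() && path.extension().is_some_and(|ext| ext == "wasm") {
            paths.push(path);
        }
    }
    paths.sort();
    Ok(paths)
}
```

`main` could then iterate over `wasm_files(&config.components_dir)?` and keep the existing `WasmProvider::load` call and error context unchanged.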
- Step 7: Full workspace check + tests
Run: cargo test --workspace
Expected: PASS.
- Step 8: Smoke-test the real service (network)
mkdir -p components
cp target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm components/hitta.wasm
cargo run -p whoareyou-server &
sleep 3
curl -s http://127.0.0.1:8080/healthz
curl -s "http://127.0.0.1:8080/api/v1/number/0104754350" | python3 -m json.tool
kill %1
Expected: ok from healthz; lookup returns JSON with a results["hitta.se"]
object whose status is one of ok/no_data/parse_failed (live site —
parse_failed here while the fixture tests pass means hitta.se serves
different markup to the server's User-Agent; if so, record it as a follow-up
issue, it does not block this task).
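The fixed `sleep 3` in the smoke test is a guess at startup time and can be too short on a cold build. If the smoke test is ever automated, polling the listen address is more robust. A minimal standard-library sketch, where `wait_for_port` is a hypothetical helper and the 200 ms / 100 ms intervals are arbitrary choices:

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Poll until a TCP connection to `addr` succeeds or `timeout` elapses.
/// Returns true as soon as the port accepts a connection.
fn wait_for_port(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(100));
    }
}
```

This only proves the socket is accepting connections; a request to /healthz is still the real readiness check.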
- Step 9: Commit
git add crates/server Cargo.lock
git commit -m "feat: wire config, component loading, and axum serve in main"
Task 12: justfile, docs, cleanup
Files:
- Create: justfile
- Modify: fetch-fixture, README.md, CLAUDE.md
- Step 1: Write the justfile
# Build provider components and copy them where the server looks.
build-components:
    cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta
    mkdir -p components
    cp target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm components/hitta.wasm

# Full build: components first, then the server.
build: build-components
    cargo build --release

# All tests (the integration test needs the built component).
test: build-components
    cargo test --workspace

# Run the service locally.
run: build-components
    cargo run -p whoareyou-server

fmt:
    cargo +nightly fmt

lint:
    cargo clippy --workspace
- Step 2: Verify just test works end to end
Run: just test
Expected: builds the component, all tests PASS.
- Step 3: Trim fetch-fixture to live providers
Replace fetch-fixture contents:
#!/bin/bash
# Refresh HTML fixtures for provider parser tests.
# Usage: ./fetch-fixture <number>
set -euo pipefail
curl -sL -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
"https://www.hitta.se/vem-ringde/$1" \
-o "fixtures/hitta/$1.html"
echo "fixtures/hitta/$1.html: $(wc -c < "fixtures/hitta/$1.html") bytes"
- Step 4: Rewrite README.md
# whoareyou
Who is calling me? A self-hosted HTTP service that looks up Swedish phone
numbers across reverse-lookup sites. Providers are sandboxed WASM components.
## Usage
```shell
$ just run
$ curl "http://127.0.0.1:8080/api/v1/number/0700000000"
```
## Configuration (env)
| Variable | Default |
|---|---|
| `WHOAREYOU_LISTEN` | `127.0.0.1:8080` |
| `WHOAREYOU_COMPONENTS_DIR` | `components` |
| `WHOAREYOU_CACHE_TTL_HOURS` | `24` |
| `WHOAREYOU_FETCH_TIMEOUT_SECS` | `10` |
## Development
```shell
$ rustup target add wasm32-wasip2
$ just test
```
Provider contract lives in `wit/provider.wit`. See
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.
- Step 5: Rewrite CLAUDE.md
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What this is
A self-hosted HTTP service that looks up Swedish phone numbers ("who is
calling me?") by scraping reverse-lookup sites. Providers are WASM components
(Component Model / WASI p2) loaded from a directory at startup; the host does
all fetching and caching. Design spec:
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.
## Commands
```bash
just test # build components + run all tests (preferred)
just run # build components + run the service
just build # release build of everything
cargo test -p whoareyou-provider-hitta # provider parser tests (native, no WASM)
cargo test -p whoareyou-server --test component # WIT-boundary integration test
cargo +nightly fmt # always nightly, not stable
cargo clippy --workspace
./fetch-fixture <number> # refresh an HTML fixture from hitta.se
```
The integration test needs the component built first: run via `just test`,
or `cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta`
before bare `cargo test`.
## Architecture
- `wit/provider.wit`: the provider contract (`metadata`/`requests`/`parse`). Components are pure: no network, no filesystem. The HOST fetches URLs.
- `crates/providers/hitta`: parse logic in `parser.rs` is plain Rust, unit-tested natively against `fixtures/hitta/*.html`; `component.rs` is thin WIT glue, compiled only for `wasm32` (`cargo test` never touches WASM here).
- `crates/server`: lib + thin bin. `service.rs` holds the `ProviderHandle` + `Fetch` traits and `LookupService` (moka cache, TTL 24h, key `provider:number`; fetch failures are NOT cached). `wasm.rs` implements `ProviderHandle` over wasmtime (fresh Store per call, epoch deadline ≈5s). `http.rs` is axum: `GET /api/v1/number/{number}`, `GET /healthz`.
## Gotchas
- Components build with plain `cargo build --target wasm32-wasip2`, no cargo-component. Output name uses underscores: `whoareyou_provider_hitta.wasm`; the justfile copies it to `components/hitta.wasm` (gitignored).
- One provider failing maps to a per-provider `status` in the JSON response, never a non-200 for the whole lookup. `parse_failed` in logs (WARN) means a site changed its markup: refresh a fixture with `./fetch-fixture` and fix the parser.
- `ParseError::NoData` vs `Failed`: a fetched page with no phone data is NoData (normal); a page that doesn't match the expected structure is Failed (scraper rot). Don't conflate them.
- Step 6: Final verification
Run: just test && cargo clippy --workspace && cargo +nightly fmt -- --check
Expected: tests pass, no clippy errors (warnings OK to fix or note), fmt clean.
- Step 7: Commit
git add justfile fetch-fixture README.md CLAUDE.md
git commit -m "docs: add justfile and rewrite README/CLAUDE.md for service architecture"
Out of scope (per spec)
Container image · k8s/Pithos/CI · provider upload/enable-disable · more providers · host-fetch import for multi-step providers · lookup history / persistent cache · metrics.