# WASM Provider Service Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Rebuild whoareyou as an async HTTP service that looks up Swedish phone numbers via WASM-component providers (hitta.se in v1), retiring the CLI.

**Architecture:** Cargo workspace with an axum server hosting wasmtime. Providers are pure WASM components (WIT contract: `metadata`/`requests`/`parse`); the host fetches all URLs and caches parsed results in moka. Provider parse logic is plain Rust, unit-tested natively against HTML fixtures; the WIT glue is a thin layer compiled only under `cfg(target_arch = "wasm32")`.

**Tech Stack:** Rust edition 2024 · tokio · axum 0.8 · reqwest 0.13 · moka 0.12 · wasmtime + wasmtime-wasi 45 · wit-bindgen 0.57 · thiserror 2 · tracing · insta 1.47

**Spec:** `docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`

---

## File structure

```
whoareyou/
├── Cargo.toml                      # workspace (NEW)
├── justfile                        # build orchestration (NEW)
├── wit/provider.wit                # provider contract (NEW)
├── crates/
│   ├── server/                     # package whoareyou-server (lib + bin)
│   │   ├── Cargo.toml
│   │   ├── src/lib.rs              # module exports
│   │   ├── src/main.rs             # wiring only
│   │   ├── src/config.rs           # env config
│   │   ├── src/error.rs            # HostError, FetchError, ConfigError
│   │   ├── src/model.rs            # Entry, Comment, ProviderResult, API types
│   │   ├── src/service.rs          # ProviderHandle + Fetch traits, LookupService
│   │   ├── src/fetch.rs            # ReqwestFetcher
│   │   ├── src/http.rs             # axum router, normalize()
│   │   ├── src/wasm.rs             # wasmtime host, WasmProvider
│   │   └── tests/component.rs      # loads the real .wasm
│   └── providers/hitta/            # package whoareyou-provider-hitta (cdylib+rlib)
│       ├── Cargo.toml
│       ├── src/lib.rs
│       ├── src/parser.rs           # pure parse logic + native tests
│       └── src/component.rs        # wit-bindgen glue (wasm32 only)
├── fixtures/hitta/*.html           # KEPT (+ one fresh fixture)
├── fetch-fixture                   # KEPT, trimmed to hitta
└── DELETED: src/, definitions/, _build.rs, NOTEPAD.md, old Cargo.toml contents
```

`whoareyou-server` is a lib + thin bin so `tests/component.rs` can use its modules.

---

### Task 1: Workspace scaffold & demolition

**Files:**
- Delete: `src/`, `definitions/`, `_build.rs`, `NOTEPAD.md`
- Create: `Cargo.toml` (workspace), `wit/provider.wit`, `crates/server/{Cargo.toml,src/lib.rs,src/main.rs}`, `crates/providers/hitta/{Cargo.toml,src/lib.rs}`
- Modify: `.gitignore`

- [ ] **Step 1: Install the wasm target**

Run: `rustup target add wasm32-wasip2`
Expected: installs or "is up to date".

- [ ] **Step 2: Delete the old code**

```bash
git rm -r src definitions _build.rs NOTEPAD.md
```

(The old hitta parser is reproduced in Task 3, so nothing is needed from the deleted tree.)

- [ ] **Step 3: Write the workspace `Cargo.toml`** (replaces the old package manifest)

```toml
[workspace]
resolver = "3"
members = ["crates/server", "crates/providers/hitta"]

[workspace.package]
version = "0.1.0"
edition = "2024"
authors = ["Anders Olsson <anders.e.olsson@gmail.com>"]
```

- [ ] **Step 4: Write `wit/provider.wit`**

```wit
package whoareyou:provider@0.1.0;

interface lookup {
    record provider-info {
        name: string,
        version: string,
    }

    record request {
        url: string,
    }

    record response {
        status: u16,
        body: string,
    }

    record comment {
        timestamp: option<s64>,
        title: option<string>,
        message: string,
    }

    record entry {
        messages: list<string>,
        history: list<string>,
        comments: list<comment>,
    }

    variant lookup-error {
        no-data,
        parse-failed(string),
    }

    metadata: func() -> provider-info;
    requests: func(number: string) -> list<request>;
    parse: func(number: string, responses: list<response>) -> result<entry, lookup-error>;
}

world provider {
    export lookup;
}
```
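
To make the division of labour in this contract concrete, here is a minimal native sketch of the round trip the host performs: ask the provider for URLs, fetch each one, then hand the responses back for parsing. The `Provider` trait, `run_lookup`, and `EchoProvider` are hypothetical stand-ins that mirror the WIT shapes in plain Rust; they are not the generated bindings and not the server's actual code, and `comments` is left out of `Entry` for brevity.

```rust
// Hypothetical native mirror of the WIT `lookup` interface (illustration only).
#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

#[derive(Debug, PartialEq)]
struct Entry {
    messages: Vec<String>,
    history: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum LookupError {
    NoData,
    ParseFailed(String),
}

trait Provider {
    fn requests(&self, number: &str) -> Vec<String>;
    fn parse(&self, number: &str, responses: &[Response]) -> Result<Entry, LookupError>;
}

/// Host-side round trip: the provider never touches the network itself.
fn run_lookup<P: Provider>(
    provider: &P,
    number: &str,
    fetch: impl Fn(&str) -> Response,
) -> Result<Entry, LookupError> {
    let responses: Vec<Response> = provider
        .requests(number)
        .iter()
        .map(|url| fetch(url.as_str()))
        .collect();
    provider.parse(number, &responses)
}

/// Stand-in provider: echoes the fetched body into `history`.
struct EchoProvider;

impl Provider for EchoProvider {
    fn requests(&self, number: &str) -> Vec<String> {
        vec![format!("https://example.test/{number}")]
    }

    fn parse(&self, _number: &str, responses: &[Response]) -> Result<Entry, LookupError> {
        match responses.first() {
            None => Err(LookupError::ParseFailed("no responses provided".to_string())),
            Some(r) if r.body.is_empty() => Err(LookupError::NoData),
            Some(r) => Ok(Entry { messages: vec![], history: vec![r.body.clone()] }),
        }
    }
}
```

Passing the fetcher in as a closure is only a convenience for the sketch; the plan itself uses a `Fetch` trait for the same purpose (Task 7).
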

- [ ] **Step 5: Create the server crate stub**

`crates/server/Cargo.toml`:

```toml
[package]
name = "whoareyou-server"
version.workspace = true
edition.workspace = true
authors.workspace = true

[dependencies]

[dev-dependencies]
```

`crates/server/src/lib.rs`:

```rust
// modules added as they are implemented
```

`crates/server/src/main.rs`:

```rust
fn main() {}
```

- [ ] **Step 6: Create the hitta provider crate stub**

`crates/providers/hitta/Cargo.toml`:

```toml
[package]
name = "whoareyou-provider-hitta"
version.workspace = true
edition.workspace = true
authors.workspace = true

[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]

[dev-dependencies]
```

`crates/providers/hitta/src/lib.rs`:

```rust
// modules added as they are implemented
```

- [ ] **Step 7: Ignore the components dir**

Append to `.gitignore` (create if missing):

```
components/
```

- [ ] **Step 8: Verify the workspace builds**

Run: `cargo check --workspace`
Expected: success (two empty crates). `Cargo.lock` regenerates, which is fine.

- [ ] **Step 9: Commit**

```bash
git add -A
git commit -m "refactor!: replace CLI with workspace scaffold for WASM provider service"
```

---

### Task 2: Refresh hitta fixture & audit page structure

The 2019 fixtures predate any hitta.se redesign. Before porting the parser, capture what the site serves **today** so Task 3 is written against reality.

**Files:**
- Create: `fixtures/hitta/fresh-0104754350.html`

- [ ] **Step 1: Fetch a fresh copy of a known number's page**

Run:

```bash
curl -sL -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
  "https://www.hitta.se/vem-ringde/0104754350" \
  -o fixtures/hitta/fresh-0104754350.html
wc -c fixtures/hitta/fresh-0104754350.html
```

Expected: a non-trivial file (> 10 KB). If the response is a bot-block page (check with `head -c 2000`), retry with the `http --follow` (httpie) variant from `fetch-fixture`, or fetch the page in a real browser (View Source → save). The fixture MUST contain real page markup before continuing.

- [ ] **Step 2: Audit the page structure**

Run:

```bash
grep -c "__NEXT_DATA__" fixtures/hitta/fresh-0104754350.html
grep -o '__NEXT_DATA__[^>]\{0,80\}' fixtures/hitta/fresh-0104754350.html | head -3
```

Two outcomes are possible. Record which one applies, because it determines how the parser in Task 3 (Step 5) is written:

- **(a) `__NEXT_DATA__` still present.** Check whether it's still `<script>__NEXT_DATA__ = {...};__NEXT_LOADED_PAGES__` (2019 inline style) or the modern `<script id="__NEXT_DATA__" type="application/json">{...}</script>` form. Note which.
- **(b) Gone entirely.** Inspect the page (`python3 -m json.tool` on any embedded JSON, or read the HTML) and locate where phone data + comments live now. Write down the JSON path to: comments list, comment text, comment timestamp, and the statistics/"X others searched" text. Task 3's serde structs must be adapted to those paths (the *shape* of the parser stays identical: regex/JSON extraction → typed structs → `ParsedEntry`).
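
The audit above can also be expressed as a small classifier, which may be handy as a sanity check on future fixtures. This is a sketch under assumptions: `NextDataForm` and `detect_next_data_form` are hypothetical names, and the substring checks assume the markers appear exactly as quoted in outcomes (a) and (b).

```rust
/// Which `__NEXT_DATA__` embedding a page uses (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum NextDataForm {
    /// `<script>__NEXT_DATA__ = {...};__NEXT_LOADED_PAGES__` (2019 inline style).
    Inline2019,
    /// `<script id="__NEXT_DATA__" type="application/json">{...}</script>`.
    ScriptTag,
    /// No marker found: outcome (b).
    Absent,
}

/// Classify a page by plain substring checks; no HTML parsing is attempted.
fn detect_next_data_form(html: &str) -> NextDataForm {
    if html.contains("<script>__NEXT_DATA__ = ") {
        NextDataForm::Inline2019
    } else if html.contains(r#"id="__NEXT_DATA__""#) {
        NextDataForm::ScriptTag
    } else {
        NextDataForm::Absent
    }
}
```
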

- [ ] **Step 3: Commit the fixture**

```bash
git add fixtures/hitta/fresh-0104754350.html
git commit -m "test: add fresh hitta.se fixture for parser port"
```

---

### Task 3: hitta parser (pure logic, native TDD)

Port the old `src/probe/hitta.rs` parse logic (reproduced below) into the provider crate as plain functions. All tests run natively, with no WASM involved.

**Files:**
- Create: `crates/providers/hitta/src/parser.rs`
- Modify: `crates/providers/hitta/src/lib.rs`, `crates/providers/hitta/Cargo.toml`

- [ ] **Step 1: Add dependencies**

In `crates/providers/hitta/Cargo.toml` set:

```toml
[dependencies]
regex = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

[dev-dependencies]
insta = { version = "1.47", features = ["yaml"] }
```

- [ ] **Step 2: Declare the module**

`crates/providers/hitta/src/lib.rs`:

```rust
pub mod parser;
```

- [ ] **Step 3: Write the failing tests**

Create `crates/providers/hitta/src/parser.rs` containing ONLY this test module for now. The types and functions it references don't exist yet; that's the point:

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn requests_single_hitta_url() {
        assert_eq!(
            request_urls("0700000000"),
            vec!["https://www.hitta.se/vem-ringde/0700000000".to_string()]
        );
    }

    #[test]
    fn parses_number_with_comments() {
        let body = include_str!("../../../../fixtures/hitta/0104754350.html");
        let entry = parse(body).unwrap();

        assert_eq!(entry.messages, Vec::<String>::new());
        assert_eq!(entry.history, vec!["42 andra har rapporterat detta nummer"]);
        assert_eq!(entry.comments.len(), 29);

        // newest first
        let first = &entry.comments[0];
        assert_eq!(first.timestamp, Some(1547746162)); // 2019-01-17T17:29:22Z
        assert_eq!(first.title, None);
        assert_eq!(first.message, "Varmsälj från Folksam");
    }

    #[test]
    fn parses_number_with_history_only() {
        let body = include_str!("../../../../fixtures/hitta/0702269893.html");
        let entry = parse(body).unwrap();

        assert_eq!(entry.history, vec!["Tre andra har också sökt på detta nummer"]);
        assert!(entry.comments.is_empty());
    }

    #[test]
    fn no_phone_data_is_no_data() {
        let body = include_str!("../../../../fixtures/hitta/0313908905.html");
        assert_eq!(parse(body), Err(ParseError::NoData));
    }

    #[test]
    fn unparseable_page_is_failed() {
        let body = include_str!("../../../../fixtures/hitta/0701807618.html");
        assert!(matches!(parse(body), Err(ParseError::Failed(_))));
    }

    #[test]
    fn garbage_is_failed() {
        assert!(matches!(parse("<html></html>"), Err(ParseError::Failed(_))));
    }

    #[test]
    fn parses_fresh_fixture() {
        let body = include_str!("../../../../fixtures/hitta/fresh-0104754350.html");
        insta::assert_yaml_snapshot!(parse(body));
    }
}
```

Semantics note (differs from the old CLI): the old code returned `Ok` with an
all-empty entry when JSON parsed but `phoneData` was absent. That is now
`Err(ParseError::NoData)`. Old fixtures `0313908905`, `0751793426/83/99` fall
in that bucket; `0701807618`, `0546780862` fail the regex → `Failed`.

- [ ] **Step 4: Run tests to verify they fail**

Run: `cargo test -p whoareyou-provider-hitta`
Expected: COMPILE ERROR: `request_urls`, `parse`, `ParseError` not found.

- [ ] **Step 5: Implement the parser**

Prepend to `crates/providers/hitta/src/parser.rs` (above the test module). This is the 2019 logic ported; **if Task 2 found outcome (b) or the modern `<script id="__NEXT_DATA__">` form, adapt `NEXT_DATA_RE` / the serde structs to the JSON paths recorded in Task 2**. Keep the public surface (`request_urls`, `parse`, the three types) exactly as below:

```rust
use std::sync::LazyLock;

use regex::Regex;
use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Serialize)]
pub struct ParsedEntry {
    pub messages: Vec<String>,
    pub history: Vec<String>,
    pub comments: Vec<ParsedComment>,
}

#[derive(Debug, PartialEq, Serialize)]
pub struct ParsedComment {
    /// Unix epoch seconds, UTC.
    pub timestamp: Option<i64>,
    pub title: Option<String>,
    pub message: String,
}

#[derive(Debug, PartialEq, Serialize)]
pub enum ParseError {
    /// Page fetched and understood, but it contains no data for the number.
    NoData,
    /// Page structure did not match expectations: a scraper-rot signal.
    Failed(String),
}

/// Captures the JSON assigned to `__NEXT_DATA__` in the 2019 inline-script form.
static NEXT_DATA_RE: LazyLock<Regex> = LazyLock::new(|| {
    Regex::new(r"<script>__NEXT_DATA__ = (.*?);__NEXT_LOADED_PAGES__").unwrap()
});

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct Data {
    props: Props,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct Props {
    page_props: PageProps,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct PageProps {
    phone_data: Option<PhoneData>,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct PhoneData {
    #[serde(default)]
    comments: Vec<RawComment>,
    statistics_text: String,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct RawComment {
    comment: String,
    /// Milliseconds since epoch.
    timestamp: u64,
}

/// URLs the host must fetch for `number`; hitta needs exactly one page.
pub fn request_urls(number: &str) -> Vec<String> {
    vec![format!("https://www.hitta.se/vem-ringde/{number}")]
}

/// Extracts the embedded JSON from `body` and maps it to a `ParsedEntry`.
pub fn parse(body: &str) -> Result<ParsedEntry, ParseError> {
    let captures = NEXT_DATA_RE
        .captures(body)
        .ok_or_else(|| ParseError::Failed("__NEXT_DATA__ not found".to_string()))?;

    let json = captures.get(1).unwrap().as_str();

    let data: Data = serde_json::from_str(json)
        .map_err(|e| ParseError::Failed(format!("deserialize __NEXT_DATA__: {e}")))?;

    let Some(phone_data) = data.props.page_props.phone_data else {
        return Err(ParseError::NoData);
    };

    let mut comments: Vec<ParsedComment> = phone_data
        .comments
        .into_iter()
        .map(|c| ParsedComment {
            // Source timestamps are milliseconds; the API uses seconds.
            timestamp: Some((c.timestamp / 1000) as i64),
            title: None,
            message: c.comment,
        })
        .collect();

    // Newest first.
    comments.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));

    Ok(ParsedEntry {
        messages: Vec::new(),
        history: vec![phone_data.statistics_text],
        comments,
    })
}
```
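
If Task 2 found the modern `<script id="__NEXT_DATA__" type="application/json">` form, the regex step would need a different extraction. The following is one possible sketch using only `std` string searches; `extract_script_tag_json` is a hypothetical helper, not part of the ported 2019 code, and it assumes the opening tag contains no `>` inside an attribute value and that the JSON itself does not contain the literal text `</script>`.

```rust
/// Returns the text between `<script id="__NEXT_DATA__" ...>` and `</script>`,
/// or `None` if the marker or either delimiter is missing.
fn extract_script_tag_json(body: &str) -> Option<&str> {
    let marker = r#"id="__NEXT_DATA__""#;
    let marker_at = body.find(marker)?;

    // End of the opening tag: first '>' after the marker, exclusive of '>'.
    let open_end = marker_at + body[marker_at..].find('>')? + 1;

    // Start of the closing tag, searched from the end of the opening tag.
    let close_at = open_end + body[open_end..].find("</script>")?;

    Some(body[open_end..close_at].trim())
}
```

The returned slice would then feed `serde_json::from_str` exactly where `captures.get(1)` does today, so `parse` keeps its signature and error mapping.
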

- [ ] **Step 6: Run tests to verify they pass**

Run: `cargo test -p whoareyou-provider-hitta`
Expected: all pass except possibly `parses_fresh_fixture` (pending snapshot).
If the fresh-fixture test FAILS to parse (`Failed`/`NoData` against a real
page that visibly has data), the site changed. Adapt the regex/structs per
Task 2's notes until the fresh fixture parses, while keeping the 2019-fixture
tests passing (if the old format is truly gone from the new code path, update
those tests' expectations to `Failed` and note it in the commit message).

- [ ] **Step 7: Accept the fresh-fixture snapshot after eyeballing it**

Run: `cargo insta review` (or `cargo insta accept` after inspecting the `.snap.new` file manually)
Expected: snapshot under `crates/providers/hitta/src/snapshots/` showing a plausible entry (or an honest `NoData`/`Failed` for a dead number; verify that it matches what the fixture actually contains).

- [ ] **Step 8: Run the full test suite**

Run: `cargo test --workspace`
Expected: PASS.

- [ ] **Step 9: Commit**

```bash
git add crates/providers/hitta .gitignore Cargo.lock
git commit -m "feat: port hitta.se parser as pure native-testable functions"
```

---

### Task 4: hitta component glue (WIT export)

**Files:**
- Create: `crates/providers/hitta/src/component.rs`
- Modify: `crates/providers/hitta/src/lib.rs`, `crates/providers/hitta/Cargo.toml`

- [ ] **Step 1: Add wit-bindgen for wasm32 only**

Append to `crates/providers/hitta/Cargo.toml`:

```toml
[target.'cfg(target_arch = "wasm32")'.dependencies]
wit-bindgen = "0.57"
```

- [ ] **Step 2: Write the glue**

`crates/providers/hitta/src/component.rs`:

```rust
use crate::parser;

wit_bindgen::generate!({
    world: "provider",
    path: "../../../wit",
});

use exports::whoareyou::provider::lookup::{
    Comment, Entry, Guest, LookupError, ProviderInfo, Request, Response,
};

struct Component;

impl Guest for Component {
    fn metadata() -> ProviderInfo {
        ProviderInfo {
            name: "hitta.se".to_string(),
            version: env!("CARGO_PKG_VERSION").to_string(),
        }
    }

    fn requests(number: String) -> Vec<Request> {
        parser::request_urls(&number)
            .into_iter()
            .map(|url| Request { url })
            .collect()
    }

    fn parse(_number: String, responses: Vec<Response>) -> Result<Entry, LookupError> {
        let Some(first) = responses.first() else {
            return Err(LookupError::ParseFailed("no responses provided".to_string()));
        };

        match parser::parse(&first.body) {
            Ok(entry) => Ok(Entry {
                messages: entry.messages,
                history: entry.history,
                comments: entry
                    .comments
                    .into_iter()
                    .map(|c| Comment {
                        timestamp: c.timestamp,
                        title: c.title,
                        message: c.message,
                    })
                    .collect(),
            }),
            Err(parser::ParseError::NoData) => Err(LookupError::NoData),
            Err(parser::ParseError::Failed(msg)) => Err(LookupError::ParseFailed(msg)),
        }
    }
}

export!(Component);
```

- [ ] **Step 3: Gate it into the crate**

`crates/providers/hitta/src/lib.rs`:

```rust
pub mod parser;

#[cfg(target_arch = "wasm32")]
mod component;
```

- [ ] **Step 4: Build the component**

Run: `cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta`
Expected: success; `target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm` exists.

- [ ] **Step 5: Verify native tests still pass**

Run: `cargo test -p whoareyou-provider-hitta`
Expected: PASS (glue is cfg'd out natively).

- [ ] **Step 6: Commit**

```bash
git add crates/providers/hitta
git commit -m "feat: export hitta parser as a WASM component via wit-bindgen"
```

---

### Task 5: Server model types

**Files:**
- Create: `crates/server/src/model.rs`
- Modify: `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add first server dependencies**

In `crates/server/Cargo.toml`:

```toml
[dependencies]
serde = { version = "1", features = ["derive"] }

[dev-dependencies]
serde_json = "1"
```

- [ ] **Step 2: Write the failing test**

`crates/server/src/model.rs` (test module only, types come next step):

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn provider_result_serializes_to_api_shape() {
        let ok = ProviderResult::Ok {
            entry: Entry {
                messages: vec![],
                history: vec!["42 andra".to_string()],
                comments: vec![Comment {
                    timestamp: Some(1547746162),
                    title: None,
                    message: "Varmsälj".to_string(),
                }],
            },
        };

        let json = serde_json::to_value(&ok).unwrap();
        assert_eq!(json["status"], "ok");
        assert_eq!(json["entry"]["history"][0], "42 andra");
        assert_eq!(json["entry"]["comments"][0]["timestamp"], 1547746162);

        assert_eq!(
            serde_json::to_value(&ProviderResult::NoData).unwrap()["status"],
            "no_data"
        );
        assert_eq!(
            serde_json::to_value(&ProviderResult::FetchFailed).unwrap()["status"],
            "fetch_failed"
        );
        assert_eq!(
            serde_json::to_value(&ProviderResult::ParseFailed).unwrap()["status"],
            "parse_failed"
        );
    }
}
```

`crates/server/src/lib.rs`:

```rust
pub mod model;
```

- [ ] **Step 3: Run test to verify it fails**

Run: `cargo test -p whoareyou-server`
Expected: COMPILE ERROR: types not defined.

- [ ] **Step 4: Implement the types**

Prepend to `crates/server/src/model.rs`:

```rust
use std::collections::BTreeMap;

use serde::Serialize;

#[derive(Debug, Clone, PartialEq, Serialize)]
pub struct Entry {
    pub messages: Vec<String>,
    pub history: Vec<String>,
    pub comments: Vec<Comment>,
}

#[derive(Debug, Clone, PartialEq, Serialize)]
pub struct Comment {
    /// Unix epoch seconds, UTC.
    pub timestamp: Option<i64>,
    pub title: Option<String>,
    pub message: String,
}

/// Per-provider outcome as exposed in the API (and cached).
#[derive(Debug, Clone, PartialEq, Serialize)]
#[serde(tag = "status", rename_all = "snake_case")]
pub enum ProviderResult {
    Ok { entry: Entry },
    NoData,
    FetchFailed,
    ParseFailed,
}

/// A fetched HTTP response handed to a provider's `parse`.
#[derive(Debug, Clone)]
pub struct FetchedResponse {
    pub status: u16,
    pub body: String,
}

/// Outcome of a provider's `parse` call, before API mapping.
#[derive(Debug)]
pub enum ParseOutcome {
    Ok(Entry),
    NoData,
    Failed(String),
}

#[derive(Debug, Serialize)]
pub struct LookupResponse {
    pub number: String,
    pub results: BTreeMap<String, ProviderResult>,
}
```
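
The "before API mapping" note on `ParseOutcome` implies a small conversion step somewhere in the service. The sketch below shows what that mapping might look like, inferred from the statuses the Task 7 tests expect; `to_provider_result` is a hypothetical name, and the types are trimmed, serde-free copies of the ones above so the block stands alone.

```rust
// Trimmed stand-ins for the model types above (no serde, fewer fields).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    history: Vec<String>,
}

#[derive(Debug)]
enum ParseOutcome {
    Ok(Entry),
    NoData,
    Failed(String),
}

#[derive(Debug, Clone, PartialEq)]
enum ProviderResult {
    Ok { entry: Entry },
    NoData,
    FetchFailed,
    ParseFailed,
}

/// Map a provider's parse outcome to the API-facing result.
/// `FetchFailed` is never produced here: it is decided before parsing runs.
fn to_provider_result(outcome: ParseOutcome) -> ProviderResult {
    match outcome {
        ParseOutcome::Ok(entry) => ProviderResult::Ok { entry },
        ParseOutcome::NoData => ProviderResult::NoData,
        // The detail string is dropped; the API exposes only the status.
        ParseOutcome::Failed(_detail) => ProviderResult::ParseFailed,
    }
}
```

Keeping the failure detail out of `ProviderResult` means the cached value and the JSON body carry no scraper internals; the detail would presumably be logged on the host side instead.
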

- [ ] **Step 5: Run test to verify it passes**

Run: `cargo test -p whoareyou-server`
Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: add server model types and API serialization shape"
```

---

### Task 6: Errors and fetcher

**Files:**
- Create: `crates/server/src/error.rs`, `crates/server/src/fetch.rs`
- Modify: `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add dependencies**

Extend `crates/server/Cargo.toml` `[dependencies]`:

```toml
async-trait = "0.1"
reqwest = "0.13"
thiserror = "2"
tokio = { version = "1", features = ["full"] }
wasmtime = { version = "45", features = ["component-model"] }
```

(wasmtime is needed now because `HostError` wraps `wasmtime::Error`.)

- [ ] **Step 2: Write `crates/server/src/error.rs`**

```rust
use thiserror::Error;

/// Errors from hosting/calling a WASM component.
#[derive(Debug, Error)]
pub enum HostError {
    #[error("wasm error: {0}")]
    Wasm(#[from] wasmtime::Error),
    #[error("io error: {0}")]
    Io(#[from] std::io::Error),
}

#[derive(Debug, Error)]
pub enum FetchError {
    #[error("request failed: {0}")]
    Request(#[from] reqwest::Error),
}

#[derive(Debug, Error)]
pub enum ConfigError {
    #[error("invalid value for {key}: {message}")]
    Invalid { key: String, message: String },
}
```

- [ ] **Step 3: Write `crates/server/src/fetch.rs`**

The `Fetch` trait lives in `service.rs`, which is not written until Task 7, so
`fetch.rs` cannot compile as a module yet. Simplest ordering: write `fetch.rs`
now but leave it out of `lib.rs`; Task 7 wires it in once the trait exists.

`crates/server/src/fetch.rs`:

```rust
use std::time::Duration;

use async_trait::async_trait;

use crate::error::FetchError;
use crate::model::FetchedResponse;
use crate::service::Fetch;

pub struct ReqwestFetcher {
    client: reqwest::Client,
}

impl ReqwestFetcher {
    pub fn new(timeout: Duration) -> Result<Self, FetchError> {
        let client = reqwest::Client::builder()
            .timeout(timeout)
            .user_agent(concat!("whoareyou/", env!("CARGO_PKG_VERSION")))
            .build()?;

        Ok(Self { client })
    }
}

#[async_trait]
impl Fetch for ReqwestFetcher {
    async fn fetch(&self, url: &str) -> Result<FetchedResponse, FetchError> {
        let response = self.client.get(url).send().await?;
        let status = response.status().as_u16();
        let body = response.text().await?;

        Ok(FetchedResponse { status, body })
    }
}
```

- [ ] **Step 4: Wire only `error` into `lib.rs`**

`crates/server/src/lib.rs`:

```rust
pub mod error;
pub mod model;
```

- [ ] **Step 5: Verify it compiles**

Run: `cargo check -p whoareyou-server`
Expected: success (`fetch.rs` is not yet a module, so its `crate::service` import is not compiled).

- [ ] **Step 6: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: add server error types and reqwest fetcher"
```

---

### Task 7: LookupService (orchestration + cache, TDD)

**Files:**
- Create: `crates/server/src/service.rs`
- Modify: `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add dependencies**

Extend `crates/server/Cargo.toml` `[dependencies]`:

```toml
futures = "0.3"
moka = { version = "0.12", features = ["future"] }
tracing = "0.1"
```

- [ ] **Step 2: Write the failing tests**

`crates/server/src/service.rs`, test module first:

```rust
#[cfg(test)]
mod tests {
    use std::sync::Arc;
    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::time::Duration;

    use async_trait::async_trait;

    use super::*;
    use crate::error::{FetchError, HostError};
    use crate::model::{Comment, Entry, FetchedResponse, ParseOutcome, ProviderResult};

    fn entry() -> Entry {
        Entry {
            messages: vec![],
            history: vec!["history".to_string()],
            comments: vec![Comment {
                timestamp: Some(1547746162),
                title: None,
                message: "spam".to_string(),
            }],
        }
    }

    /// Provider whose parse outcome is scripted per call.
    struct FakeProvider {
        name: &'static str,
        outcome: fn() -> ParseOutcome,
    }

    impl ProviderHandle for FakeProvider {
        fn name(&self) -> &str {
            self.name
        }

        fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
            Ok(vec![format!("https://example.test/{number}")])
        }

        fn parse(
            &self,
            _number: &str,
            _responses: &[FetchedResponse],
        ) -> ParseOutcome {
            (self.outcome)()
        }
    }

    /// Fetcher that counts calls and can be told to fail.
    struct FakeFetcher {
        calls: AtomicUsize,
        fail: bool,
    }

    impl FakeFetcher {
        fn new(fail: bool) -> Self {
            Self { calls: AtomicUsize::new(0), fail }
        }
    }

    #[async_trait]
    impl Fetch for FakeFetcher {
        async fn fetch(&self, _url: &str) -> Result<FetchedResponse, FetchError> {
            self.calls.fetch_add(1, Ordering::SeqCst);

            if self.fail {
                // `reqwest::Error` cannot be constructed directly, so this
                // placeholder is replaced in Step 4 with a real failing
                // request. See the note after this test module.
                unreachable!("replaced in Step 4");
            }

            Ok(FetchedResponse { status: 200, body: "body".to_string() })
        }
    }

    fn service(
        providers: Vec<Arc<dyn ProviderHandle>>,
        fetcher: Arc<dyn Fetch>,
    ) -> LookupService {
        LookupService::new(providers, fetcher, Duration::from_secs(60))
    }

    #[tokio::test]
    async fn ok_result_is_returned_and_cached() {
        let provider = Arc::new(FakeProvider {
            name: "fake.se",
            outcome: || ParseOutcome::Ok(entry()),
        });
        let fetcher = Arc::new(FakeFetcher::new(false));
        let svc = service(vec![provider], fetcher.clone());

        let results = svc.lookup("0700000000").await;
        assert_eq!(results["fake.se"], ProviderResult::Ok { entry: entry() });

        // second lookup served from cache, so the fetcher is not called again
        let results = svc.lookup("0700000000").await;
        assert_eq!(results["fake.se"], ProviderResult::Ok { entry: entry() });
        assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
    }

    #[tokio::test]
    async fn no_data_is_cached() {
        let provider = Arc::new(FakeProvider { name: "fake.se", outcome: || ParseOutcome::NoData });
        let fetcher = Arc::new(FakeFetcher::new(false));
        let svc = service(vec![provider], fetcher.clone());

        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::NoData);
        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::NoData);
        assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
    }

    #[tokio::test]
    async fn parse_failure_maps_and_is_cached() {
        let provider = Arc::new(FakeProvider {
            name: "fake.se",
            outcome: || ParseOutcome::Failed("rot".to_string()),
        });
        let fetcher = Arc::new(FakeFetcher::new(false));
        let svc = service(vec![provider], fetcher.clone());

        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::ParseFailed);
        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::ParseFailed);
        assert_eq!(fetcher.calls.load(Ordering::SeqCst), 1);
    }

    #[tokio::test]
    async fn fetch_failure_is_not_cached() {
        let provider = Arc::new(FakeProvider {
            name: "fake.se",
            outcome: || ParseOutcome::NoData,
        });
        let fetcher = Arc::new(FakeFetcher::new(true));
        let svc = service(vec![provider], fetcher.clone());

        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::FetchFailed);
        assert_eq!(svc.lookup("0700000000").await["fake.se"], ProviderResult::FetchFailed);
        // NOT cached: fetcher tried twice
        assert_eq!(fetcher.calls.load(Ordering::SeqCst), 2);
    }

    #[tokio::test]
    async fn multiple_providers_keyed_by_name() {
        let a = Arc::new(FakeProvider { name: "a.se", outcome: || ParseOutcome::NoData });
        let b = Arc::new(FakeProvider {
            name: "b.se",
            outcome: || ParseOutcome::Ok(entry()),
        });
        let fetcher = Arc::new(FakeFetcher::new(false));
        let svc = service(vec![a, b], fetcher);

        let results = svc.lookup("0700000000").await;
        assert_eq!(results.len(), 2);
        assert_eq!(results["a.se"], ProviderResult::NoData);
        assert!(matches!(results["b.se"], ProviderResult::Ok { .. }));
    }
}
```

**Fabricating a `FetchError` in tests:** `reqwest::Error` cannot be constructed
|
||
directly. Make the fail path real instead of fabricated — in Step 4's
|
||
implementation of `FakeFetcher::fetch`, replace the `unreachable!` with an
|
||
actual failing request against a closed local port:
|
||
|
||
```rust
|
||
if self.fail {
|
||
let err = reqwest::Client::new()
|
||
.get("http://127.0.0.1:1/unreachable")
|
||
.send()
|
||
.await
|
||
.unwrap_err();
|
||
return Err(FetchError::Request(err));
|
||
}
|
||
```
|
||
|
||
(Port 1 is never listening; connection is refused immediately — no external
|
||
network involved.)
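
If hard-coding port 1 feels fragile, the same "closed local port" idea can be sketched with the standard library alone. This is a minimal sketch, not part of the plan's code: `closed_loopback_addr` and `connect_fails` are hypothetical helper names, and it assumes the port released by the listener is not re-bound by another process before the connect attempt.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::Duration;

/// Ask the OS for a free loopback port, then release it again so that
/// nothing is listening on the returned address.
fn closed_loopback_addr() -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    drop(listener); // closing the listener frees the port
    Ok(addr)
}

/// Returns true when a connection attempt to `addr` fails.
/// The timeout only bounds the attempt; a refused connection returns sooner.
fn connect_fails(addr: SocketAddr) -> bool {
    TcpStream::connect_timeout(&addr, Duration::from_secs(1)).is_err()
}
```

A test fetcher could format the returned address into its URL instead of using `127.0.0.1:1`; both approaches stay on the loopback interface.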

- [ ] **Step 3: Run tests to verify they fail**

Run: `cargo test -p whoareyou-server`
Expected: COMPILE ERROR — `ProviderHandle`, `Fetch`, `LookupService` not defined.

- [ ] **Step 4: Implement the service**

Prepend to `crates/server/src/service.rs` (and fix the `FakeFetcher` fail path as noted above):

```rust
use std::collections::BTreeMap;
use std::sync::Arc;
use std::time::Duration;

use async_trait::async_trait;
use moka::future::Cache;
use tracing::warn;

use crate::error::{FetchError, HostError};
use crate::model::{FetchedResponse, ParseOutcome, ProviderResult};

/// A loaded provider. Implemented by `wasm::WasmProvider`; faked in tests.
/// Methods are sync — WASM calls are CPU-bound; the service wraps them in
/// `spawn_blocking`.
pub trait ProviderHandle: Send + Sync {
    fn name(&self) -> &str;
    fn requests(&self, number: &str) -> Result<Vec<String>, HostError>;
    fn parse(&self, number: &str, responses: &[FetchedResponse]) -> ParseOutcome;
}

#[async_trait]
pub trait Fetch: Send + Sync {
    async fn fetch(&self, url: &str) -> Result<FetchedResponse, FetchError>;
}

pub struct LookupService {
    providers: Vec<Arc<dyn ProviderHandle>>,
    fetcher: Arc<dyn Fetch>,
    cache: Cache<String, ProviderResult>,
}

impl LookupService {
    pub fn new(
        providers: Vec<Arc<dyn ProviderHandle>>,
        fetcher: Arc<dyn Fetch>,
        cache_ttl: Duration,
    ) -> Self {
        Self {
            providers,
            fetcher,
            cache: Cache::builder().time_to_live(cache_ttl).build(),
        }
    }

    pub fn provider_names(&self) -> Vec<&str> {
        self.providers.iter().map(|p| p.name()).collect()
    }

    /// Run all providers concurrently; one result per provider name.
    pub async fn lookup(&self, number: &str) -> BTreeMap<String, ProviderResult> {
        let tasks = self.providers.iter().map(|provider| {
            let provider = provider.clone();
            let fetcher = self.fetcher.clone();
            let cache = self.cache.clone();
            let number = number.to_string();

            async move {
                let name = provider.name().to_string();
                let key = format!("{name}:{number}");

                if let Some(hit) = cache.get(&key).await {
                    return (name, hit);
                }

                let result = run_provider(provider, &number, fetcher).await;

                // Transient failures must not poison the cache.
                if result != ProviderResult::FetchFailed {
                    cache.insert(key, result.clone()).await;
                }

                (name, result)
            }
        });

        futures::future::join_all(tasks).await.into_iter().collect()
    }
}

async fn run_provider(
    provider: Arc<dyn ProviderHandle>,
    number: &str,
    fetcher: Arc<dyn Fetch>,
) -> ProviderResult {
    let name = provider.name().to_string();

    let urls = {
        let provider = provider.clone();
        let number = number.to_string();

        match tokio::task::spawn_blocking(move || provider.requests(&number)).await {
            Ok(Ok(urls)) => urls,
            Ok(Err(error)) => {
                warn!(provider = %name, %error, "requests() failed");
                return ProviderResult::ParseFailed;
            }
            Err(error) => {
                warn!(provider = %name, %error, "requests() panicked");
                return ProviderResult::ParseFailed;
            }
        }
    };

    let fetched = futures::future::join_all(urls.iter().map(|url| fetcher.fetch(url))).await;

    let mut responses = Vec::with_capacity(fetched.len());

    for result in fetched {
        match result {
            Ok(response) => responses.push(response),
            Err(error) => {
                warn!(provider = %name, %error, "fetch failed");
                return ProviderResult::FetchFailed;
            }
        }
    }

    let outcome = {
        let provider = provider.clone();
        let number = number.to_string();

        tokio::task::spawn_blocking(move || provider.parse(&number, &responses)).await
    };

    match outcome {
        Ok(ParseOutcome::Ok(entry)) => ProviderResult::Ok { entry },
        Ok(ParseOutcome::NoData) => ProviderResult::NoData,
        Ok(ParseOutcome::Failed(message)) => {
            warn!(provider = %name, %message, "parse failed — scraper rot?");
            ProviderResult::ParseFailed
        }
        Err(error) => {
            warn!(provider = %name, %error, "parse() panicked");
            ProviderResult::ParseFailed
        }
    }
}
```
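
The caching rule in `lookup` (key `provider:number`, every outcome cached except a fetch failure) can be seen in isolation in the std-only sketch below. It is an illustration under stated assumptions, not the service code: `Outcome` and `ResultCache` are hypothetical stand-ins for `ProviderResult` and the moka cache, and the sketch has no TTL and no async.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the service's per-provider result.
#[derive(Clone, Debug, PartialEq)]
enum Outcome {
    Ok(String),
    NoData,
    ParseFailed,
    FetchFailed,
}

/// Hypothetical stand-in for the moka cache (no expiry in this sketch).
struct ResultCache {
    entries: HashMap<String, Outcome>,
}

impl ResultCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached outcome for `provider:number`, or call `run`.
    /// A transient `FetchFailed` is returned to the caller but never stored,
    /// so the next lookup tries the provider again.
    fn get_or_run(
        &mut self,
        provider: &str,
        number: &str,
        run: impl FnOnce() -> Outcome,
    ) -> Outcome {
        let key = format!("{provider}:{number}");

        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }

        let result = run();

        if result != Outcome::FetchFailed {
            self.entries.insert(key, result.clone());
        }

        result
    }
}
```

Caching `ParseFailed` is deliberate in this model: a parser that fails on a page will keep failing until the provider is fixed, so retrying on every request would only add load.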

- [ ] **Step 5: Wire modules into `lib.rs`**

`crates/server/src/lib.rs`:

```rust
pub mod error;
pub mod fetch;
pub mod model;
pub mod service;
```

- [ ] **Step 6: Run tests to verify they pass**

Run: `cargo test -p whoareyou-server`
Expected: PASS (all five service tests + model test).

- [ ] **Step 7: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: add LookupService with moka cache and provider orchestration"
```

---

### Task 8: HTTP layer (axum, TDD)

**Files:**
- Create: `crates/server/src/http.rs`
- Modify: `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add dependencies**

Extend `crates/server/Cargo.toml`:

```toml
[dependencies]
# add:
axum = "0.8"
serde_json = "1"

[dev-dependencies]
# add:
http-body-util = "0.1"
tower = { version = "0.5", features = ["util"] }
```

(`serde_json` moves from dev-dependencies to dependencies — remove the dev entry.)

- [ ] **Step 2: Write the failing tests**

`crates/server/src/http.rs`, test module first:

```rust
#[cfg(test)]
mod tests {
    use std::sync::Arc;
    use std::time::Duration;

    use async_trait::async_trait;
    use axum::body::Body;
    use axum::http::{Request, StatusCode};
    use http_body_util::BodyExt;
    use tower::ServiceExt;

    use super::*;
    use crate::error::{FetchError, HostError};
    use crate::model::{FetchedResponse, ParseOutcome};
    use crate::service::{Fetch, LookupService, ProviderHandle};

    struct NoDataProvider;

    impl ProviderHandle for NoDataProvider {
        fn name(&self) -> &str {
            "fake.se"
        }

        fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
            Ok(vec![format!("https://example.test/{number}")])
        }

        fn parse(&self, _: &str, _: &[FetchedResponse]) -> ParseOutcome {
            ParseOutcome::NoData
        }
    }

    struct StaticFetcher;

    #[async_trait]
    impl Fetch for StaticFetcher {
        async fn fetch(&self, _: &str) -> Result<FetchedResponse, FetchError> {
            Ok(FetchedResponse { status: 200, body: String::new() })
        }
    }

    fn app() -> axum::Router {
        let service = LookupService::new(
            vec![Arc::new(NoDataProvider)],
            Arc::new(StaticFetcher),
            Duration::from_secs(60),
        );

        router(Arc::new(service))
    }

    #[test]
    fn normalize_strips_separators() {
        assert_eq!(normalize("0700 00-00.00"), Some("0700000000".to_string()));
        assert_eq!(normalize("+46701234567"), Some("+46701234567".to_string()));
    }

    #[test]
    fn normalize_rejects_garbage() {
        assert_eq!(normalize("not-a-number"), None);
        assert_eq!(normalize(""), None);
        assert_eq!(normalize("0"), None);
        assert_eq!(normalize("07001231231231231231"), None); // > 15 digits
        assert_eq!(normalize("070+123"), None); // '+' not at start
    }

    #[tokio::test]
    async fn lookup_returns_results_keyed_by_provider() {
        let response = app()
            .oneshot(
                Request::builder()
                    // Spaces are percent-encoded: a raw space is not a valid
                    // URI character and would make `body()` return an error.
                    .uri("/api/v1/number/0700%2000-00%2000")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();

        assert_eq!(response.status(), StatusCode::OK);

        let bytes = response.into_body().collect().await.unwrap().to_bytes();
        let json: serde_json::Value = serde_json::from_slice(&bytes).unwrap();

        assert_eq!(json["number"], "0700000000");
        assert_eq!(json["results"]["fake.se"]["status"], "no_data");
    }

    #[tokio::test]
    async fn invalid_number_is_400() {
        let response = app()
            .oneshot(
                Request::builder()
                    .uri("/api/v1/number/banana")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();

        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
    }

    #[tokio::test]
    async fn healthz_is_ok() {
        let response = app()
            .oneshot(Request::builder().uri("/healthz").body(Body::empty()).unwrap())
            .await
            .unwrap();

        assert_eq!(response.status(), StatusCode::OK);
    }
}
```
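
The lookup test sends a number that contains separators in the URL path. A raw space is not a valid URI character, so a client or test building such a path has to percent-encode it, and the server's path extractor is expected to decode it again before `normalize` runs. A minimal std-only sketch of that encoding follows; `encode_path_segment` is a hypothetical helper, not part of the plan's code, and it conservatively encodes everything outside the RFC 3986 unreserved set.

```rust
/// Percent-encode every byte outside the RFC 3986 "unreserved" set
/// (ASCII letters, digits, '-', '.', '_', '~').
fn encode_path_segment(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());

    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            // Everything else becomes %XX with uppercase hex digits.
            other => out.push_str(&format!("%{other:02X}")),
        }
    }

    out
}
```

A caller would build the request path as `format!("/api/v1/number/{}", encode_path_segment(raw))`. Encoding `+` as `%2B` is stricter than a path requires, but it is harmless once the segment is decoded.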

- [ ] **Step 3: Run tests to verify they fail**

Run: `cargo test -p whoareyou-server`
Expected: COMPILE ERROR — `router`, `normalize` not defined.

- [ ] **Step 4: Implement the HTTP layer**

Prepend to `crates/server/src/http.rs`:

```rust
use std::sync::Arc;

use axum::Json;
use axum::Router;
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use axum::routing::get;
use serde_json::json;

use crate::model::LookupResponse;
use crate::service::LookupService;

pub fn router(service: Arc<LookupService>) -> Router {
    Router::new()
        .route("/api/v1/number/{number}", get(lookup_number))
        .route("/healthz", get(|| async { "ok" }))
        .with_state(service)
}

async fn lookup_number(
    State(service): State<Arc<LookupService>>,
    Path(raw): Path<String>,
) -> Response {
    let Some(number) = normalize(&raw) else {
        return (
            StatusCode::BAD_REQUEST,
            Json(json!({ "error": "invalid phone number" })),
        )
            .into_response();
    };

    let results = service.lookup(&number).await;

    Json(LookupResponse { number, results }).into_response()
}

/// Strip separators and validate: optional leading '+', then 2–15 digits.
pub fn normalize(raw: &str) -> Option<String> {
    let cleaned: String = raw
        .chars()
        .filter(|c| !matches!(c, ' ' | '-' | '.'))
        .collect();

    let digits = cleaned.strip_prefix('+').unwrap_or(&cleaned);

    let valid = (2..=15).contains(&digits.len())
        && digits.chars().all(|c| c.is_ascii_digit());

    valid.then_some(cleaned)
}
```

- [ ] **Step 5: Wire the module**

`crates/server/src/lib.rs`:

```rust
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
```

- [ ] **Step 6: Run tests to verify they pass**

Run: `cargo test -p whoareyou-server`
Expected: PASS.

- [ ] **Step 7: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: add axum HTTP layer with lookup endpoint and healthz"
```

---

### Task 9: wasmtime host (WasmProvider)

**Files:**
- Create: `crates/server/src/wasm.rs`
- Modify: `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add wasmtime-wasi**

Extend `crates/server/Cargo.toml` `[dependencies]`:

```toml
wasmtime-wasi = "45"
```

- [ ] **Step 2: Write `crates/server/src/wasm.rs`**

> **API-drift note:** the `WasiView`/`WasiCtxView` shape below matches recent
> wasmtime-wasi releases as of this plan's writing. If `cargo check` disagrees,
> consult https://docs.rs/wasmtime-wasi/45 — the intent is fixed: a store data
> struct holding `WasiCtx` + `ResourceTable`, WASI added to the linker sync,
> no preopens / no env / no inherited stdio. Adapt mechanically; do not change
> the public surface of this module.

```rust
use std::path::Path;

use wasmtime::component::{Component, Linker};
use wasmtime::{Config, Engine, Store};
use wasmtime_wasi::ResourceTable;
use wasmtime_wasi::p2::{WasiCtx, WasiCtxBuilder, WasiCtxView, WasiView};

use crate::error::HostError;
use crate::model::{Comment, Entry, FetchedResponse, ParseOutcome};
use crate::service::ProviderHandle;

wasmtime::component::bindgen!({
    world: "provider",
    path: "../../wit",
});

use exports::whoareyou::provider::lookup::{LookupError as WitLookupError, Response as WitResponse};

/// How many epoch ticks a guest call may run. The epoch thread ticks every
/// 100 ms → 50 ticks ≈ 5 s budget per call.
const EPOCH_DEADLINE_TICKS: u64 = 50;
pub const EPOCH_TICK: std::time::Duration = std::time::Duration::from_millis(100);

pub struct HostState {
    ctx: WasiCtx,
    table: ResourceTable,
}

impl WasiView for HostState {
    fn ctx(&mut self) -> WasiCtxView<'_> {
        WasiCtxView { ctx: &mut self.ctx, table: &mut self.table }
    }
}

pub fn engine() -> Result<Engine, HostError> {
    let mut config = Config::new();
    config.epoch_interruption(true);

    Ok(Engine::new(&config)?)
}

pub fn linker(engine: &Engine) -> Result<Linker<HostState>, HostError> {
    let mut linker = Linker::new(engine);
    wasmtime_wasi::p2::add_to_linker_sync(&mut linker)?;

    Ok(linker)
}

/// Spawn the thread that advances the engine epoch so runaway guest calls
/// trap instead of hanging the service. Call once at startup.
pub fn spawn_epoch_thread(engine: &Engine) {
    let engine = engine.clone();

    std::thread::spawn(move || {
        loop {
            std::thread::sleep(EPOCH_TICK);
            engine.increment_epoch();
        }
    });
}

pub struct WasmProvider {
    name: String,
    version: String,
    engine: Engine,
    pre: ProviderPre<HostState>,
}

impl WasmProvider {
    /// Compile a component from disk and read its metadata once.
    /// Fails fast if the component does not satisfy the provider world.
    pub fn load(
        engine: &Engine,
        linker: &Linker<HostState>,
        path: &Path,
    ) -> Result<Self, HostError> {
        let component = Component::from_file(engine, path)?;
        let pre = ProviderPre::new(linker.instantiate_pre(&component)?)?;

        let mut provider = Self {
            name: String::new(),
            version: String::new(),
            engine: engine.clone(),
            pre,
        };

        let mut store = provider.new_store();
        let instance = provider.pre.instantiate(&mut store)?;
        let info = instance.whoareyou_provider_lookup().call_metadata(&mut store)?;

        provider.name = info.name;
        provider.version = info.version;

        Ok(provider)
    }

    pub fn version(&self) -> &str {
        &self.version
    }

    fn new_store(&self) -> Store<HostState> {
        // No preopens, no env, no inherited stdio — fully sandboxed guest.
        let ctx = WasiCtxBuilder::new().build();
        let mut store = Store::new(
            &self.engine,
            HostState { ctx, table: ResourceTable::new() },
        );

        store.set_epoch_deadline(EPOCH_DEADLINE_TICKS);

        store
    }
}

impl ProviderHandle for WasmProvider {
    fn name(&self) -> &str {
        &self.name
    }

    fn requests(&self, number: &str) -> Result<Vec<String>, HostError> {
        let mut store = self.new_store();
        let instance = self.pre.instantiate(&mut store)?;

        let requests = instance
            .whoareyou_provider_lookup()
            .call_requests(&mut store, number)?;

        Ok(requests.into_iter().map(|r| r.url).collect())
    }

    fn parse(&self, number: &str, responses: &[FetchedResponse]) -> ParseOutcome {
        let wit_responses: Vec<WitResponse> = responses
            .iter()
            .map(|r| WitResponse { status: r.status, body: r.body.clone() })
            .collect();

        let mut store = self.new_store();

        let result = (|| {
            let instance = self.pre.instantiate(&mut store)?;

            instance
                .whoareyou_provider_lookup()
                .call_parse(&mut store, number, &wit_responses)
        })();

        match result {
            Ok(Ok(entry)) => ParseOutcome::Ok(Entry {
                messages: entry.messages,
                history: entry.history,
                comments: entry
                    .comments
                    .into_iter()
                    .map(|c| Comment {
                        timestamp: c.timestamp,
                        title: c.title,
                        message: c.message,
                    })
                    .collect(),
            }),
            Ok(Err(WitLookupError::NoData)) => ParseOutcome::NoData,
            Ok(Err(WitLookupError::ParseFailed(message))) => ParseOutcome::Failed(message),
            // Trap (incl. epoch deadline exceeded) or instantiation failure.
            Err(error) => ParseOutcome::Failed(format!("component error: {error}")),
        }
    }
}
```
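
The per-call budget follows from two constants: a 100 ms tick and a deadline 50 ticks ahead, which gives roughly 5 s. The std-only sketch below models that arithmetic and the deadline check. It is a simplified model for illustration, not wasmtime's implementation; `Epoch`, `deadline_after`, `expired` and `budget` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical model of a global epoch counter advanced by a ticker thread.
struct Epoch {
    now: AtomicU64,
}

impl Epoch {
    fn new() -> Self {
        Self { now: AtomicU64::new(0) }
    }

    /// What the ticker thread does once per tick interval.
    fn tick(&self) {
        self.now.fetch_add(1, Ordering::SeqCst);
    }

    /// Deadline for a call that starts now and may run for `ticks` ticks.
    fn deadline_after(&self, ticks: u64) -> u64 {
        self.now.load(Ordering::SeqCst) + ticks
    }

    /// A call is over budget once the counter has reached its deadline.
    fn expired(&self, deadline: u64) -> bool {
        self.now.load(Ordering::SeqCst) >= deadline
    }
}

/// Upper bound on wall-clock time for one call: tick length times tick count.
fn budget(tick: Duration, ticks: u64) -> Duration {
    tick.saturating_mul(u32::try_from(ticks).unwrap_or(u32::MAX))
}
```

Because a call can start anywhere inside a tick interval, the effective budget in this model lies between 49 and 50 ticks, which is why the doc comment says "≈ 5 s" rather than exactly 5 s.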

- [ ] **Step 3: Wire the module**

`crates/server/src/lib.rs`:

```rust
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
pub mod wasm;
```

- [ ] **Step 4: Verify it compiles (adapt API drift here if needed)**

Run: `cargo check -p whoareyou-server`
Expected: success. If `WasiView`/`WasiCtxView`/`add_to_linker_sync` signatures
drifted in wasmtime-wasi 45, fix per the docs.rs note above and re-check.

- [ ] **Step 5: Run all tests**

Run: `cargo test -p whoareyou-server`
Expected: PASS (no new tests — real coverage lands in Task 10's integration test).

- [ ] **Step 6: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: add wasmtime host with epoch-bounded WasmProvider"
```

---

### Task 10: Component integration test

Proves the WIT boundary end-to-end: the real `.wasm` built from Task 4, loaded by the real host from Task 9.

**Files:**
- Create: `crates/server/tests/component.rs`

- [ ] **Step 1: Build the component**

Run: `cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta`
Expected: `target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm` exists.
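
Note the mismatch between the package name (`whoareyou-provider-hitta`, hyphens) and the artifact name (`whoareyou_provider_hitta.wasm`, underscores): Cargo replaces hyphens with underscores in library artifact file names. A small sketch of that mapping, with hypothetical helper names `component_artifact` and `component_path` and assuming the release profile and the `wasm32-wasip2` target used above:

```rust
use std::path::{Path, PathBuf};

/// File name Cargo produces for a library package built as a component:
/// hyphens in the package name become underscores.
fn component_artifact(package: &str) -> String {
    format!("{}.wasm", package.replace('-', "_"))
}

/// Full path of the release artifact under the given target directory.
fn component_path(target_dir: &str, package: &str) -> PathBuf {
    Path::new(target_dir)
        .join("wasm32-wasip2")
        .join("release")
        .join(component_artifact(package))
}
```

The same mapping explains the `cp` source path used wherever the component is copied into the components directory.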

- [ ] **Step 2: Write the integration test**

`crates/server/tests/component.rs`:

```rust
use std::path::Path;

use whoareyou_server::model::{FetchedResponse, ParseOutcome};
use whoareyou_server::service::ProviderHandle;
use whoareyou_server::wasm;

const COMPONENT_PATH: &str = concat!(
    env!("CARGO_MANIFEST_DIR"),
    "/../../target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm"
);

fn load_provider() -> wasm::WasmProvider {
    let path = Path::new(COMPONENT_PATH);

    assert!(
        path.exists(),
        "hitta component not built — run `just build-components` first"
    );

    let engine = wasm::engine().unwrap();
    let linker = wasm::linker(&engine).unwrap();
    wasm::spawn_epoch_thread(&engine);

    wasm::WasmProvider::load(&engine, &linker, path).unwrap()
}

#[test]
fn metadata_identifies_hitta() {
    let provider = load_provider();

    assert_eq!(provider.name(), "hitta.se");
    assert!(!provider.version().is_empty());
}

#[test]
fn requests_contain_the_number() {
    let provider = load_provider();
    let urls = provider.requests("0104754350").unwrap();

    assert_eq!(urls, vec!["https://www.hitta.se/vem-ringde/0104754350"]);
}

#[test]
fn parse_roundtrips_a_fixture_through_wasm() {
    let provider = load_provider();
    let body = include_str!("../../../fixtures/hitta/0104754350.html").to_string();

    let outcome = provider.parse(
        "0104754350",
        &[FetchedResponse { status: 200, body }],
    );

    let ParseOutcome::Ok(entry) = outcome else {
        panic!("expected Ok entry, got {outcome:?}");
    };

    assert_eq!(entry.history, vec!["42 andra har rapporterat detta nummer"]);
    assert_eq!(entry.comments.len(), 29);
    assert_eq!(entry.comments[0].timestamp, Some(1547746162));
}

#[test]
fn parse_maps_no_data() {
    let provider = load_provider();
    let body = include_str!("../../../fixtures/hitta/0313908905.html").to_string();

    let outcome = provider.parse(
        "0313908905",
        &[FetchedResponse { status: 200, body }],
    );

    assert!(matches!(outcome, ParseOutcome::NoData), "got {outcome:?}");
}
```

- [ ] **Step 3: Run the integration test**

Run: `cargo test -p whoareyou-server --test component`
Expected: 4 tests PASS. (If `0104754350.html` parse expectations changed in
Task 3 Step 6's contingency branch, mirror the same expectations here.)

- [ ] **Step 4: Commit**

```bash
git add crates/server/tests
git commit -m "test: prove WIT boundary with real component integration test"
```

---

### Task 11: Config + main wiring

**Files:**
- Create: `crates/server/src/config.rs`
- Modify: `crates/server/src/main.rs`, `crates/server/src/lib.rs`, `crates/server/Cargo.toml`

- [ ] **Step 1: Add binary dependencies**

Extend `crates/server/Cargo.toml` `[dependencies]`:

```toml
anyhow = "1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
```

- [ ] **Step 2: Write the failing config tests**

`crates/server/src/config.rs`, test module first:

```rust
#[cfg(test)]
mod tests {
    use std::collections::HashMap;

    use super::*;

    // The closure owns `map`, so the return type needs no explicit lifetime
    // bound (an elided `'_` would be ambiguous across the input lifetimes).
    fn env(pairs: &[(&str, &str)]) -> impl Fn(&str) -> Option<String> {
        let map: HashMap<String, String> = pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();

        move |key: &str| map.get(key).cloned()
    }

    #[test]
    fn defaults_apply_when_unset() {
        let config = AppConfig::from_lookup(env(&[])).unwrap();

        assert_eq!(config.listen.to_string(), "127.0.0.1:8080");
        assert_eq!(config.components_dir, std::path::PathBuf::from("components"));
        assert_eq!(config.cache_ttl, std::time::Duration::from_secs(24 * 3600));
        assert_eq!(config.fetch_timeout, std::time::Duration::from_secs(10));
    }

    #[test]
    fn env_overrides_apply() {
        let config = AppConfig::from_lookup(env(&[
            ("WHOAREYOU_LISTEN", "0.0.0.0:9000"),
            ("WHOAREYOU_COMPONENTS_DIR", "/opt/providers"),
            ("WHOAREYOU_CACHE_TTL_HOURS", "1"),
            ("WHOAREYOU_FETCH_TIMEOUT_SECS", "30"),
        ]))
        .unwrap();

        assert_eq!(config.listen.to_string(), "0.0.0.0:9000");
        assert_eq!(config.components_dir, std::path::PathBuf::from("/opt/providers"));
        assert_eq!(config.cache_ttl, std::time::Duration::from_secs(3600));
        assert_eq!(config.fetch_timeout, std::time::Duration::from_secs(30));
    }

    #[test]
    fn invalid_values_error() {
        assert!(AppConfig::from_lookup(env(&[("WHOAREYOU_LISTEN", "not-an-addr")])).is_err());
        assert!(AppConfig::from_lookup(env(&[("WHOAREYOU_CACHE_TTL_HOURS", "soon")])).is_err());
    }
}
```

- [ ] **Step 3: Run tests to verify they fail**

Run: `cargo test -p whoareyou-server config`
Expected: COMPILE ERROR — `AppConfig` not defined. (First wire `pub mod config;` into `lib.rs`.)

- [ ] **Step 4: Implement config**

Prepend to `crates/server/src/config.rs`:

```rust
use std::net::SocketAddr;
use std::path::PathBuf;
use std::time::Duration;

use crate::error::ConfigError;

#[derive(Debug)]
pub struct AppConfig {
    pub listen: SocketAddr,
    pub components_dir: PathBuf,
    pub cache_ttl: Duration,
    pub fetch_timeout: Duration,
}

impl AppConfig {
    pub fn from_env() -> Result<Self, ConfigError> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }

    pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Self, ConfigError> {
        let listen = match get("WHOAREYOU_LISTEN") {
            Some(value) => value.parse().map_err(|e| ConfigError::Invalid {
                key: "WHOAREYOU_LISTEN".to_string(),
                message: format!("{e}"),
            })?,
            None => SocketAddr::from(([127, 0, 0, 1], 8080)),
        };

        let components_dir = get("WHOAREYOU_COMPONENTS_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(|| PathBuf::from("components"));

        let cache_ttl_hours: u64 = parse_or("WHOAREYOU_CACHE_TTL_HOURS", &get, 24)?;
        let fetch_timeout_secs: u64 = parse_or("WHOAREYOU_FETCH_TIMEOUT_SECS", &get, 10)?;

        Ok(Self {
            listen,
            components_dir,
            cache_ttl: Duration::from_secs(cache_ttl_hours * 3600),
            fetch_timeout: Duration::from_secs(fetch_timeout_secs),
        })
    }
}

fn parse_or(
    key: &str,
    get: &impl Fn(&str) -> Option<String>,
    default: u64,
) -> Result<u64, ConfigError> {
    match get(key) {
        Some(value) => value.parse().map_err(|e| ConfigError::Invalid {
            key: key.to_string(),
            message: format!("{e}"),
        }),
        None => Ok(default),
    }
}
```

`crates/server/src/lib.rs` final state:

```rust
pub mod config;
pub mod error;
pub mod fetch;
pub mod http;
pub mod model;
pub mod service;
pub mod wasm;
```

- [ ] **Step 5: Run tests to verify they pass**

Run: `cargo test -p whoareyou-server config`
Expected: PASS.

- [ ] **Step 6: Write `main.rs`**

`crates/server/src/main.rs`:

```rust
use std::sync::Arc;

use anyhow::Context;
use tracing::info;
use tracing_subscriber::EnvFilter;

use whoareyou_server::config::AppConfig;
use whoareyou_server::fetch::ReqwestFetcher;
use whoareyou_server::service::{LookupService, ProviderHandle};
use whoareyou_server::{http, wasm};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt()
        .with_env_filter(
            EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info")),
        )
        .init();

    let config = AppConfig::from_env()?;

    let engine = wasm::engine()?;
    let linker = wasm::linker(&engine)?;
    wasm::spawn_epoch_thread(&engine);

    let mut providers: Vec<Arc<dyn ProviderHandle>> = Vec::new();

    let dir = std::fs::read_dir(&config.components_dir).with_context(|| {
        format!("reading components dir {:?}", config.components_dir)
    })?;

    for entry in dir {
        let path = entry?.path();

        if path.extension().is_some_and(|ext| ext == "wasm") {
            let provider = wasm::WasmProvider::load(&engine, &linker, &path)
                .with_context(|| format!("loading component {path:?}"))?;

            info!(
                name = provider.name(),
                version = provider.version(),
                ?path,
                "loaded provider"
            );

            providers.push(Arc::new(provider));
        }
    }

    anyhow::ensure!(
        !providers.is_empty(),
        "no .wasm components found in {:?}",
        config.components_dir
    );

    let fetcher = Arc::new(ReqwestFetcher::new(config.fetch_timeout)?);
    let service = Arc::new(LookupService::new(providers, fetcher, config.cache_ttl));
    let app = http::router(service);

    let listener = tokio::net::TcpListener::bind(config.listen).await?;
    info!("listening on http://{}", config.listen);

    axum::serve(listener, app).await?;

    Ok(())
}
```
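
The discovery loop in `main` keeps every directory entry whose extension is `wasm`, in whatever order the directory listing returns them. The std-only sketch below isolates that filter so it can be reasoned about without a filesystem. `is_component` and `select_components` are hypothetical helpers, and the sorting step is an assumption added here for a deterministic load order; the plan's `main.rs` does not sort.

```rust
use std::path::{Path, PathBuf};

/// True for paths whose extension is exactly `wasm`.
fn is_component(path: &Path) -> bool {
    path.extension().is_some_and(|ext| ext == "wasm")
}

/// Keep only component files and sort them so load order (and therefore
/// log order) does not depend on the platform's directory listing.
fn select_components(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.retain(|path| is_component(path));
    paths.sort();
    paths
}
```

A file named `wasm` with no extension, or a `README.md` next to the components, is skipped by this check, which matches the behaviour of the loop above.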

- [ ] **Step 7: Full workspace check + tests**

Run: `cargo test --workspace`
Expected: PASS.

- [ ] **Step 8: Smoke-test the real service (network)**

```bash
mkdir -p components
cp target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm components/hitta.wasm
cargo run -p whoareyou-server &
sleep 3
curl -s http://127.0.0.1:8080/healthz
curl -s "http://127.0.0.1:8080/api/v1/number/0104754350" | python3 -m json.tool
kill %1
```

Expected: `ok` from healthz; the lookup returns JSON with a `results["hitta.se"]`
object whose `status` is one of `ok`/`no_data`/`parse_failed`. This step hits
the live site: `parse_failed` here while the fixture tests pass most likely
means hitta.se serves different markup to the server's User-Agent. If so,
record it as a follow-up issue; it does not block this task.

- [ ] **Step 9: Commit**

```bash
git add crates/server Cargo.lock
git commit -m "feat: wire config, component loading, and axum serve in main"
```

---

### Task 12: justfile, docs, cleanup

**Files:**
- Create: `justfile`
- Modify: `fetch-fixture`, `README.md`, `CLAUDE.md`

- [ ] **Step 1: Write the `justfile`**

```just
# Build provider components and copy them where the server looks.
build-components:
    cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta
    mkdir -p components
    cp target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm components/hitta.wasm

# Full build: components first, then the server.
build: build-components
    cargo build --release

# All tests (the integration test needs the built component).
test: build-components
    cargo test --workspace

# Run the service locally.
run: build-components
    cargo run -p whoareyou-server

fmt:
    cargo +nightly fmt

lint:
    cargo clippy --workspace
```

- [ ] **Step 2: Verify `just test` works end to end**

Run: `just test`
Expected: builds the component, all tests PASS.

- [ ] **Step 3: Trim `fetch-fixture` to live providers**

Replace `fetch-fixture` contents:

```bash
#!/bin/bash
# Refresh HTML fixtures for provider parser tests.
# Usage: ./fetch-fixture <number>

set -euo pipefail

curl -sL -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
    "https://www.hitta.se/vem-ringde/$1" \
    -o "fixtures/hitta/$1.html"

echo "fixtures/hitta/$1.html: $(wc -c < "fixtures/hitta/$1.html") bytes"
```

- [ ] **Step 4: Rewrite `README.md`**

```markdown
# whoareyou

Who is calling me? A self-hosted HTTP service that looks up Swedish phone
numbers across reverse-lookup sites. Providers are sandboxed WASM components.

## Usage

```shell
$ just run
$ curl "http://127.0.0.1:8080/api/v1/number/0700000000"
```

## Configuration (env)

| Variable | Default |
|---|---|
| `WHOAREYOU_LISTEN` | `127.0.0.1:8080` |
| `WHOAREYOU_COMPONENTS_DIR` | `components` |
| `WHOAREYOU_CACHE_TTL_HOURS` | `24` |
| `WHOAREYOU_FETCH_TIMEOUT_SECS` | `10` |

## Development

```shell
$ rustup target add wasm32-wasip2
$ just test
```

Provider contract lives in `wit/provider.wit`. See
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.
```

- [ ] **Step 5: Rewrite `CLAUDE.md`**

````markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this is

A self-hosted HTTP service that looks up Swedish phone numbers ("who is
calling me?") by scraping reverse-lookup sites. Providers are WASM components
(Component Model / WASI p2) loaded from a directory at startup; the host does
all fetching and caching. Design spec:
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.

## Commands

```bash
just test                                        # build components + run all tests (preferred)
just run                                         # build components + run the service
just build                                       # release build of everything
cargo test -p whoareyou-provider-hitta           # provider parser tests (native, no WASM)
cargo test -p whoareyou-server --test component  # WIT-boundary integration test
cargo +nightly fmt                               # always nightly, not stable
cargo clippy --workspace
./fetch-fixture <number>                         # refresh an HTML fixture from hitta.se
```

The integration test needs the component built first: run via `just test`,
or `cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta`
before bare `cargo test`.

## Architecture

- `wit/provider.wit`: the provider contract (`metadata`/`requests`/`parse`).
  Components are pure: no network, no filesystem. The HOST fetches URLs.
- `crates/providers/hitta`: parse logic in `parser.rs` is plain Rust,
  unit-tested natively against `fixtures/hitta/*.html`; `component.rs` is
  thin WIT glue, compiled only for `wasm32` (`cargo test` never touches WASM
  here).
- `crates/server`: lib + thin bin. `service.rs` holds the `ProviderHandle` +
  `Fetch` traits and `LookupService` (moka cache, TTL 24h, key
  `provider:number`; fetch failures are NOT cached). `wasm.rs` implements
  `ProviderHandle` over wasmtime (fresh Store per call, epoch deadline ≈5s).
  `http.rs` is axum: `GET /api/v1/number/{number}`, `GET /healthz`.

## Gotchas

- Components build with plain `cargo build --target wasm32-wasip2` (no
  cargo-component). Output name uses underscores:
  `whoareyou_provider_hitta.wasm`; the justfile copies it to
  `components/hitta.wasm` (gitignored).
- One provider failing maps to a per-provider `status` in the JSON response,
  never a non-200 for the whole lookup. `parse_failed` in logs (WARN) means a
  site changed its markup: refresh a fixture with `./fetch-fixture` and fix
  the parser.
- `ParseError::NoData` vs `Failed`: a fetched page with no phone data is
  NoData (normal); a page that doesn't match the expected structure is Failed
  (scraper rot). Don't conflate them.
````
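
The `NoData`/`Failed` distinction and the `provider:number` cache key described in the CLAUDE.md text can be sketched as below. This is illustrative only, under stated assumptions: the `String` payload on `Failed`, the helper names `cache_key`, `status_for`, and `should_warn`, and the status strings `"ok"` and `"no_data"` are hypothetical. Only `parse_failed`, the two variant names, and the key format appear in the plan.

```rust
/// Mirrors the two parse outcomes named in CLAUDE.md; the payload on
/// `Failed` is an assumption for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum ParseError {
    NoData,
    Failed(String),
}

/// Cache key in the documented `provider:number` form.
pub fn cache_key(provider: &str, number: &str) -> String {
    format!("{provider}:{number}")
}

/// Per-provider status string for the JSON response. Only `parse_failed`
/// appears in the plan; "ok" and "no_data" are placeholder names.
pub fn status_for<T>(outcome: &Result<T, ParseError>) -> &'static str {
    match outcome {
        Ok(_) => "ok",
        Err(ParseError::NoData) => "no_data",
        Err(ParseError::Failed(_)) => "parse_failed",
    }
}

/// Only scraper rot deserves a WARN log; an empty page is a normal result.
pub fn should_warn<T>(outcome: &Result<T, ParseError>) -> bool {
    matches!(outcome, Err(ParseError::Failed(_)))
}
```

Keeping the two error variants separate at the type level means the HTTP layer can report a quiet "nothing found" for one provider while the logs still flag markup changes that need a refreshed fixture.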

- [ ] **Step 6: Final verification**

Run: `just test && cargo clippy --workspace && cargo +nightly fmt -- --check`
Expected: all tests pass, clippy reports no errors (fix any warnings or note them), and the format check reports no diffs.

- [ ] **Step 7: Commit**

```bash
git add justfile fetch-fixture README.md CLAUDE.md
git commit -m "docs: add justfile and rewrite README/CLAUDE.md for service architecture"
```

---

## Out of scope (per spec)

Container image · k8s/Pithos/CI · provider upload/enable-disable · more providers · host-fetch import for multi-step providers · lookup history / persistent cache · metrics.