docs: add justfile and rewrite README/CLAUDE.md for service architecture
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -4,64 +4,57 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
|||||||
|
|
||||||
## What this is
|
## What this is
|
||||||
|
|
||||||
A CLI that looks up Swedish phone numbers ("who is calling me?") by scraping
|
A self-hosted HTTP service that looks up Swedish phone numbers ("who is
|
||||||
reverse-lookup sites. Old codebase: Rust edition 2018, reqwest 0.9 (synchronous
|
calling me?") by scraping reverse-lookup sites. Providers are WASM components
|
||||||
API), insta 0.11.
|
(Component Model / WASI p2) loaded from a directory at startup; the host does
|
||||||
|
all fetching and caching. Design spec:
|
||||||
|
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.
|
||||||
|
|
||||||
## Commands
|
## Commands
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
cargo build
|
just test # build components + run all tests (preferred)
|
||||||
cargo run -- 0700000000 # query a number (hitta.se built in)
|
just run # build components + run the service
|
||||||
cargo run -- -d definitions/vem_ringde.toml 0700000000 # add TOML-defined probes
|
just build # release build of everything
|
||||||
cargo run -- -o 0700000000 # open probe URLs in browser (macOS `open`)
|
cargo test -p whoareyou-provider-hitta # provider parser tests (native, no WASM)
|
||||||
|
cargo test -p whoareyou-server --test component # WIT-boundary integration test
|
||||||
cargo test # all tests (insta snapshot tests)
|
cargo +nightly fmt # always nightly, not stable
|
||||||
cargo test probe::hitta # one module
|
cargo clippy --workspace
|
||||||
cargo test test_0104754350 # one test
|
./fetch-fixture <number> # refresh an HTML fixture from hitta.se
|
||||||
|
|
||||||
cargo +nightly fmt # always nightly, not stable
|
|
||||||
cargo clippy
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Tests are inline-snapshot tests (`assert_yaml_snapshot!(..., @r###"..."###)`)
|
The integration test needs the component built first — run via `just test`,
|
||||||
against checked-in HTML fixtures in `fixtures/<provider>/<number>.html` — no
|
or `cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta`
|
||||||
network needed. Refresh/add fixtures with `./fetch-fixture <number>` (requires
|
before bare `cargo test`.
|
||||||
`http`/httpie); it fetches the number from all five sites into `fixtures/`.
|
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
Everything revolves around the `Probe` trait (`src/probe.rs`): `provider()`,
|
- `wit/provider.wit` — the provider contract (`metadata`/`requests`/`parse`).
|
||||||
`uri(number)`, `fetch(number)`, `parse(html) -> Result<Entry, ()>`.
|
Components are pure: no network, no filesystem. The HOST fetches URLs.
|
||||||
|
- `crates/providers/hitta` — parse logic in `parser.rs` is plain Rust,
|
||||||
Two kinds of probes:
|
unit-tested natively against `fixtures/hitta/*.html`; `component.rs` is
|
||||||
|
thin WIT glue, compiled only for `wasm32` (`cargo test` never touches WASM
|
||||||
1. **Hard-coded**: `Hitta` (`src/probe/hitta.rs`) — extracts the
|
here). hitta.se serves Next.js App Router pages — data lives in RSC flight
|
||||||
`__NEXT_DATA__` JSON blob via regex and deserializes it with serde. Always
|
payloads (`self.__next_f.push`), NOT `__NEXT_DATA__` (that's the dead 2019
|
||||||
registered in `main.rs`.
|
format kept in old fixtures as a Failed-path regression case).
|
||||||
2. **Declarative**: `Definition` (`src/definition.rs`) — generic scraper
|
- `crates/server` — lib + thin bin. `service.rs` holds the `ProviderHandle` +
|
||||||
configured by a TOML file (`definitions/*.toml`) with CSS selectors for
|
`Fetch` traits and `LookupService` (moka cache, TTL 24h, key
|
||||||
`messages`, `history`, and `comments` (each comment has optional
|
`provider:number`; fetch failures are NOT cached). `wasm.rs` implements
|
||||||
`date_time`/`title`/`message` sub-selectors). The URL `path` is a
|
`ProviderHandle` over wasmtime (fresh Store per call, epoch deadline ≈5s —
|
||||||
tinytemplate string with `{ number }`. Loaded at runtime via `-d`.
|
`spawn_epoch_thread` must run once at startup or runaway guests hang
|
||||||
|
instead of trapping). `http.rs` is axum: `GET /api/v1/number/{number}`,
|
||||||
Flow in `main.rs`: build probe list → for each probe, check the cache
|
`GET /healthz`.
|
||||||
(`Context` in `src/context.rs`, bincode files under the platform cache dir
|
|
||||||
with a 1-day TTL) → otherwise `fetch()` and cache → `parse()` into an `Entry`
|
|
||||||
(`src/entry.rs`) → `Display` it.
|
|
||||||
|
|
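The caching rule in the new Architecture notes (key `provider:number`, 24h TTL, fetch failures not cached) can be sketched roughly as follows. This is a std-only stand-in, not the moka-backed `LookupService` from `service.rs`: the `PageCache` type, its method names, and the choice to cache a `String` body are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical std-only stand-in for the cache described in the notes.
/// The real service uses moka; the cached value type here is an assumption.
pub struct PageCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl PageCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Builds the cache key in the `provider:number` shape.
    pub fn key(provider: &str, number: &str) -> String {
        format!("{provider}:{number}")
    }

    /// Returns a fresh cached value if one exists; otherwise calls `fetch`.
    /// Only successful fetches are stored, so a failure is retried next time.
    pub fn get_or_fetch<F>(
        &mut self,
        provider: &str,
        number: &str,
        fetch: F,
    ) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        let key = Self::key(provider, number);
        if let Some((stored_at, body)) = self.entries.get(&key) {
            if stored_at.elapsed() < self.ttl {
                return Ok(body.clone());
            }
        }
        // An Err returns early here, so nothing is inserted for a failed fetch.
        let body = fetch()?;
        self.entries.insert(key, (Instant::now(), body.clone()));
        Ok(body)
    }
}
```

Passing the fetch as a closure keeps the sketch free of any HTTP client, which mirrors the stated split where the host owns all fetching.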
||||||
## Gotchas
|
## Gotchas
|
||||||
|
|
||||||
- `src/probe/{eniro,konsument_info,telefonforsaljare,vem_ringde}.rs` are
|
- Components build with plain `cargo build --target wasm32-wasip2` — no
|
||||||
**orphaned** — `probe.rs` only declares `mod hitta;`. Those providers were
|
cargo-component. Output name uses underscores:
|
||||||
superseded by the TOML definitions in `definitions/`. Don't "fix" them or
|
`whoareyou_provider_hitta.wasm`; the justfile copies it to
|
||||||
expect them to compile; they're kept as reference.
|
`components/hitta.wasm` (gitignored).
|
||||||
- `_build.rs` is intentionally disabled (underscore prefix, not referenced in
|
- One provider failing maps to a per-provider `status` in the JSON response —
|
||||||
Cargo.toml) — an abandoned attempt at generating fixture tests.
|
never a non-200 for the whole lookup. `parse_failed` in logs (WARN) means a
|
||||||
- `definitions/vem_ringde.yml` is an experimental YAML variant of the TOML
|
site changed its markup: refresh a fixture with `./fetch-fixture` and fix
|
||||||
definition, but `main.rs` only parses TOML (`toml::from_slice`).
|
the parser.
|
||||||
- The `Filter` enum in `src/definition.rs` has no variants yet — `filters` is
|
- `ParseError::NoData` vs `Failed`: a fetched page with no phone data is
|
||||||
parsed from definitions but unimplemented (commented-out loops in `parse`).
|
NoData (normal); a page that doesn't match the expected structure is Failed
|
||||||
- insta 0.11 is old: the macro is `assert_yaml_snapshot!` and inline-snapshot
|
(scraper rot). Don't conflate them.
|
||||||
updates need a matching old `cargo-insta`; it's usually easier to update the
|
|
||||||
inline `@r###"..."###` literals by hand.
|
|
||||||
|
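The `NoData` versus `Failed` distinction in the new Gotchas could be mapped to a per-provider status along these lines. Only the names `ParseError::NoData`, `Failed`, and the `parse_failed` log label come from the notes above; the variant shapes, the `"ok"` and `"no_data"` strings, and the `provider_status` helper are hypothetical.

```rust
/// Sketch of the two parse outcomes named in the notes.
/// Whether the real variants carry payloads is not stated; none are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum ParseError {
    /// The page was fetched and understood, but holds no data for the number.
    NoData,
    /// The page did not match the expected structure (scraper rot).
    Failed,
}

/// Maps one provider's parse result to a status label and a flag saying
/// whether it deserves a WARN log line. The status strings other than
/// `parse_failed` are illustrative assumptions.
pub fn provider_status(result: &Result<String, ParseError>) -> (&'static str, bool) {
    match result {
        Ok(_) => ("ok", false),
        // Normal outcome: nothing to fix, nothing to warn about.
        Err(ParseError::NoData) => ("no_data", false),
        // The site changed its markup: warn so the parser gets fixed.
        Err(ParseError::Failed) => ("parse_failed", true),
    }
}
```

Keeping the mapping per provider is what lets the whole lookup still answer 200 when a single scraper breaks.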
|||||||
@@ -1,21 +1,30 @@
|
|||||||
# whoareyou
|
# whoareyou
|
||||||
|
|
||||||
Who is calling me?
|
Who is calling me? A self-hosted HTTP service that looks up Swedish phone
|
||||||
|
numbers across reverse-lookup sites. Providers are sandboxed WASM components.
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
$ whoareyou 0700000000
|
$ just run
|
||||||
|
$ curl "http://127.0.0.1:8080/api/v1/number/0700000000"
|
||||||
```
|
```
|
||||||
|
|
||||||
## Todo
|
## Configuration (env)
|
||||||
|
|
||||||
Almost everything. I will add stuff when I need stuff. But hey, if you found this project and want to use it. Fork it, change it, create a PR, and I will add it :)
|
| Variable | Default |
|
||||||
|
|---|---|
|
||||||
|
| `WHOAREYOU_LISTEN` | `127.0.0.1:8080` |
|
||||||
|
| `WHOAREYOU_COMPONENTS_DIR` | `components` |
|
||||||
|
| `WHOAREYOU_CACHE_TTL_HOURS` | `24` |
|
||||||
|
| `WHOAREYOU_FETCH_TIMEOUT_SECS` | `10` |
|
||||||
|
|
||||||
- [x] Add flag to open url for probes in browser (easier for debugging)
|
## Development
|
||||||
- [x] Probe should return and Result, so we don't print a new line for empty result
|
|
||||||
- [x] Add logging
|
```shell
|
||||||
- [ ] List cache entries
|
$ rustup target add wasm32-wasip2
|
||||||
- [ ] Clear cache entries
|
$ just test
|
||||||
- [ ] Add some nice colors, so it's easier to read the output.
|
```
|
||||||
- [x] Add tests for probes.
|
|
||||||
|
Provider contract lives in `wit/provider.wit`. See
|
||||||
|
`docs/superpowers/specs/2026-06-05-wasm-provider-service-design.md`.
|
||||||
|
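One way the environment variables in the new README table might be read with their defaults is sketched below. The variable names and default values are taken from the table; the `Config` struct, its field names, and the `from_lookup` helper are assumptions, not the server's actual code.

```rust
use std::time::Duration;

/// Hypothetical settings struct; field names are illustrative.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub listen: String,
    pub components_dir: String,
    pub cache_ttl: Duration,
    pub fetch_timeout: Duration,
}

impl Config {
    /// `lookup` stands in for `std::env::var`, so the defaulting logic can be
    /// exercised without touching the process environment.
    pub fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let text = |name: &str, default: &str| {
            lookup(name).unwrap_or_else(|| default.to_string())
        };
        let number = |name: &str, default: u64| -> Result<u64, String> {
            match lookup(name) {
                Some(raw) => raw
                    .trim()
                    .parse::<u64>()
                    .map_err(|e| format!("{name}={raw:?}: {e}")),
                None => Ok(default),
            }
        };
        let ttl_hours = number("WHOAREYOU_CACHE_TTL_HOURS", 24)?;
        let timeout_secs = number("WHOAREYOU_FETCH_TIMEOUT_SECS", 10)?;
        Ok(Self {
            listen: text("WHOAREYOU_LISTEN", "127.0.0.1:8080"),
            components_dir: text("WHOAREYOU_COMPONENTS_DIR", "components"),
            // Hours are converted to seconds; saturate instead of overflowing.
            cache_ttl: Duration::from_secs(ttl_hours.saturating_mul(3600)),
            fetch_timeout: Duration::from_secs(timeout_secs),
        })
    }

    /// Reads the real process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

A malformed numeric value is reported as an error rather than silently replaced by the default, which is a design choice of this sketch and not something the README specifies.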
|||||||
+8
-5
@@ -1,8 +1,11 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
# Refresh HTML fixtures for provider parser tests.
|
||||||
|
# Usage: ./fetch-fixture <number>
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
http --follow GET "https://gulasidorna.eniro.se/hitta:$1" > "fixtures/eniro/$1.html"
|
curl -sL -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)" \
|
||||||
http --follow GET "https://www.hitta.se/vem-ringde/$1" > "fixtures/hitta/$1.html"
|
"https://www.hitta.se/vem-ringde/$1" \
|
||||||
http --follow GET "http://konsumentinfo.se/telefonnummer/sverige/$1" > "fixtures/konsumentinfo/$1.html"
|
-o "fixtures/hitta/$1.html"
|
||||||
http --follow GET "http://telefonforsaljare.nu/telefonnummer/$1/" > "fixtures/telefonforsaljare/$1.html"
|
|
||||||
http --follow GET "http://vemringde.se/?q=$1" > "fixtures/vemringde/$1.html"
|
echo "fixtures/hitta/$1.html: $(wc -c < "fixtures/hitta/$1.html") bytes"
|
||||||
|
|||||||
@@ -0,0 +1,23 @@
|
|||||||
|
# Build provider components and copy them where the server looks.
|
||||||
|
build-components:
|
||||||
|
cargo build --release --target wasm32-wasip2 -p whoareyou-provider-hitta
|
||||||
|
mkdir -p components
|
||||||
|
cp target/wasm32-wasip2/release/whoareyou_provider_hitta.wasm components/hitta.wasm
|
||||||
|
|
||||||
|
# Full build: components first, then the server.
|
||||||
|
build: build-components
|
||||||
|
cargo build --release
|
||||||
|
|
||||||
|
# All tests (the integration test needs the built component).
|
||||||
|
test: build-components
|
||||||
|
cargo test --workspace
|
||||||
|
|
||||||
|
# Run the service locally.
|
||||||
|
run: build-components
|
||||||
|
cargo run -p whoareyou-server
|
||||||
|
|
||||||
|
fmt:
|
||||||
|
cargo +nightly fmt
|
||||||
|
|
||||||
|
lint:
|
||||||
|
cargo clippy --workspace
|
||||||