docs: add CLAUDE.md with build commands and architecture notes

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:50:40 +02:00
parent af5b9d2e22
commit 831d5c4184
@@ -0,0 +1,67 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What this is
A CLI that looks up Swedish phone numbers ("who is calling me?") by scraping
reverse-lookup sites. Old codebase: Rust edition 2018, reqwest 0.9 (synchronous
API), insta 0.11.
## Commands
```bash
cargo build
cargo run -- 0700000000 # query a number (hitta.se built in)
cargo run -- -d definitions/vem_ringde.toml 0700000000 # add TOML-defined probes
cargo run -- -o 0700000000 # open probe URLs in browser (macOS `open`)
cargo test # all tests (insta snapshot tests)
cargo test probe::hitta # one module
cargo test test_0104754350 # one test
cargo +nightly fmt # always nightly, not stable
cargo clippy
```
Tests are inline-snapshot tests (`assert_yaml_snapshot!(..., @r###"..."###)`)
against checked-in HTML fixtures in `fixtures/<provider>/<number>.html` — no
network needed. Refresh/add fixtures with `./fetch-fixture <number>` (requires
`http`/httpie); it fetches the number from all five sites into `fixtures/`.
## Architecture
Everything revolves around the `Probe` trait (`src/probe.rs`): `provider()`,
`uri(number)`, `fetch(number)`, `parse(html) -> Result<Entry, ()>`.
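As a rough orientation, the trait shape described above might look like the following sketch. Only the four method names and the `parse` return type come from this file; the `&self` receivers, the `fetch` return type, the `Entry` field, and the `Example` probe with its URL are assumptions for illustration, not the actual code in `src/probe.rs`.

```rust
// Hypothetical stand-in for the real `Entry` in src/entry.rs.
#[derive(Debug, PartialEq)]
struct Entry {
    // Assumed field; the real struct defines its own.
    messages: Vec<String>,
}

// Sketch of the trait: method names from the notes above, signatures assumed.
trait Probe {
    fn provider(&self) -> &str;
    fn uri(&self, number: &str) -> String;
    fn fetch(&self, number: &str) -> Result<String, ()>;
    fn parse(&self, html: &str) -> Result<Entry, ()>;
}

// Hypothetical probe used only to show how the pieces fit together.
struct Example;

impl Probe for Example {
    fn provider(&self) -> &str {
        "example"
    }

    fn uri(&self, number: &str) -> String {
        // Placeholder URL, not one of the real lookup sites.
        format!("https://example.invalid/lookup/{}", number)
    }

    fn fetch(&self, _number: &str) -> Result<String, ()> {
        // A real probe would issue an HTTP request here; this returns canned HTML.
        Ok(String::from("<p>Unknown caller</p>"))
    }

    fn parse(&self, html: &str) -> Result<Entry, ()> {
        // Extract the text of the first <p> element, or fail with Err(()).
        let start = html.find("<p>").ok_or(())? + "<p>".len();
        let len = html[start..].find("</p>").ok_or(())?;
        Ok(Entry {
            messages: vec![html[start..start + len].to_string()],
        })
    }
}
```

Keeping `fetch` and `parse` separate is what lets the tests run `parse` against checked-in fixtures without touching the network.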
Two kinds of probes:
1. **Hard-coded**: `Hitta` (`src/probe/hitta.rs`) — extracts the
`__NEXT_DATA__` JSON blob via regex and deserializes it with serde. Always
registered in `main.rs`.
2. **Declarative**: `Definition` (`src/definition.rs`) — generic scraper
configured by a TOML file (`definitions/*.toml`) with CSS selectors for
`messages`, `history`, and `comments` (each comment has optional
`date_time`/`title`/`message` sub-selectors). The URL `path` is a
tinytemplate string with `{ number }`. Loaded at runtime via `-d`.
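To make the declarative variant concrete, here is a minimal sketch of how such a definition could be modelled. The field names `path`, `messages`, `history`, `comments`, `date_time`, `title`, and `message` come from the description above; the field types and the `render_path` helper are assumptions. The real code renders `path` with tinytemplate, whereas this sketch substitutes a plain string replacement so that it stays dependency-free.

```rust
// Assumed shape of the optional per-comment sub-selectors.
struct CommentSelectors {
    date_time: Option<String>,
    title: Option<String>,
    message: Option<String>,
}

// Assumed shape of a definition; the real struct lives in src/definition.rs.
struct Definition {
    // URL path template containing `{ number }`.
    path: String,
    // CSS selectors (types assumed to be optional strings).
    messages: Option<String>,
    history: Option<String>,
    comments: Option<CommentSelectors>,
}

impl Definition {
    // Hypothetical helper: replaces the literal `{ number }` placeholder.
    // Unlike tinytemplate, it only recognises this exact spelling.
    fn render_path(&self, number: &str) -> String {
        self.path.replace("{ number }", number)
    }
}
```

With a `path` of `/nummer/{ number }` (an invented value), `render_path("0700000000")` yields `/nummer/0700000000`.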
Flow in `main.rs`: build probe list → for each probe, check the cache
(`Context` in `src/context.rs`, bincode files under the platform cache dir
with a 1-day TTL) → otherwise `fetch()` and cache → `parse()` into an `Entry`
(`src/entry.rs`) → `Display` it.
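The cache-then-fetch step of this flow can be sketched as below. This is an in-memory approximation under stated assumptions: the real `Context` persists bincode files under the platform cache dir, while the `Cache` type, the `html_for` function, and the key format here are hypothetical. Only the 1-day TTL is taken from the description above.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

// 1-day TTL, as described above.
const TTL: Duration = Duration::from_secs(24 * 60 * 60);

// Hypothetical in-memory cache: key -> (time stored, raw HTML).
struct Cache {
    entries: HashMap<String, (SystemTime, String)>,
}

impl Cache {
    fn new() -> Self {
        Cache { entries: HashMap::new() }
    }

    // Returns the cached HTML only if it is younger than the TTL.
    fn get(&self, key: &str, now: SystemTime) -> Option<&str> {
        let (stored_at, html) = self.entries.get(key)?;
        // A timestamp in the future makes duration_since fail; treat that as a miss.
        let age = now.duration_since(*stored_at).ok()?;
        if age < TTL {
            Some(html.as_str())
        } else {
            None
        }
    }

    fn put(&mut self, key: &str, html: String, now: SystemTime) {
        self.entries.insert(key.to_string(), (now, html));
    }
}

// Use the cached HTML if it is fresh; otherwise call `fetch` and cache its result.
fn html_for<F>(cache: &mut Cache, key: &str, now: SystemTime, fetch: F) -> Result<String, ()>
where
    F: FnOnce() -> Result<String, ()>,
{
    if let Some(html) = cache.get(key, now) {
        return Ok(html.to_string());
    }
    let html = fetch()?;
    cache.put(key, html.clone(), now);
    Ok(html)
}
```

Passing `now` in explicitly, rather than calling `SystemTime::now()` inside, keeps the expiry logic testable without waiting or touching the filesystem.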
## Gotchas
- `src/probe/{eniro,konsument_info,telefonforsaljare,vem_ringde}.rs` are
**orphaned**: `probe.rs` only declares `mod hitta;`. Those providers were
superseded by the TOML definitions in `definitions/`. Don't "fix" them or
expect them to compile; they're kept as reference.
- `_build.rs` is intentionally disabled (underscore prefix, not referenced in
Cargo.toml) — an abandoned attempt at generating fixture tests.
- `definitions/vem_ringde.yml` is an experimental YAML variant of the TOML
definition, but `main.rs` only parses TOML (`toml::from_slice`).
- The `Filter` enum in `src/definition.rs` has no variants yet — `filters` is
parsed from definitions but unimplemented (commented-out loops in `parse`).
- insta 0.11 is old: the macro is `assert_yaml_snapshot!` and inline-snapshot
updates need a matching old `cargo-insta`; it's usually easier to update the
inline `@r###"..."###` literals by hand.