Commit Graph

47 Commits

Author SHA1 Message Date
logaritmisk 00c7e7e812 feat(paths): auto-create config dir on daemon startup
ensure_dirs() now creates config_dir alongside state_dir and log_dir,
so first daemon run materializes $XDG_CONFIG_HOME/xy/servers/ — making
it obvious where to drop server .kdl files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 12:44:47 +02:00
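
Note on the commit above: a minimal sketch of what an ensure_dirs() along these lines could look like, using only std. The Paths struct, the under() constructor, and the locations of state_dir and log_dir are assumptions made for illustration; only the config_dir suffix xy/servers comes from the commit message.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical holder for the three directories named in the commit message.
struct Paths {
    config_dir: PathBuf,
    state_dir: PathBuf,
    log_dir: PathBuf,
}

impl Paths {
    /// Assumed layout: server configs live under `<config_root>/xy/servers`;
    /// the state and log locations are placeholders, not the project's real ones.
    fn under(config_root: &Path, state_root: &Path) -> Self {
        Paths {
            config_dir: config_root.join("xy").join("servers"),
            state_dir: state_root.join("xy"),
            log_dir: state_root.join("xy").join("logs"),
        }
    }

    /// Creates every directory (and missing parents). Safe to call on each
    /// daemon start because create_dir_all succeeds if the path already exists.
    fn ensure_dirs(&self) -> io::Result<()> {
        for dir in [&self.config_dir, &self.state_dir, &self.log_dir] {
            fs::create_dir_all(dir)?;
        }
        Ok(())
    }
}
```

Calling ensure_dirs() before loading configs means a first run leaves an empty servers directory behind rather than failing on a missing path.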
logaritmisk 740b8b4c84 Merge feat/mcp-supervisor: HTTP MCP server supervisor MVP
37 planned tasks plus 3 follow-up fixes from final code review.

Architecture:
- Cargo workspace: xy-protocol, xy-supervisor, xy-ipc, xy (single binary)
- Unix socket + newline-delimited JSON-RPC 2.0
- Per-server KDL configs at XDG paths (XDG layout on macOS too, via etcetera)
- One supervisor task per managed server, owning all state
- Per-server log capture: rotating disk + ring buffer + broadcast stream

Features:
- Daemon auto-launches all configured servers on boot
- start/stop/restart (single or --all), reload (diff added/removed/changed),
  list/status, logs (--tail / --follow)
- Per-server restart policy (always/on-failure/never) with exponential
  backoff, sliding 60s retry window, and Failed state on cap
- Graceful shutdown via SIGTERM/SIGINT, SIGKILL escalation after grace
- 51 tests: unit (state machine via MockChild, KDL parser, framing) +
  integration (real daemon + helper bins exercising lifecycle/reload/
  restart-cap/logs)

Bugs found and fixed during execution:
- Connection deadlock from single shared read/write mutex (split into
  separate reader/writer halves)
- LOGS response vs notification ordering race (oneshot gate)
- StartAck::Started returned even on spawn failure (added SpawnFailed)
- Backoff sleep blocked the supervisor's command channel (interruptible
  select)
- list/status returned zeroed fields (now publish full Status via watch)
2026-05-25 13:22:28 +02:00
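
Note on the commit above: the restart policy combines exponential backoff with a sliding retry window and a cap. The sketch below shows one way those two pieces could fit together. The names backoff_delay, RetryWindow, and try_record, and all numeric parameters other than the 60s window, are assumptions for illustration and are not taken from the xy-supervisor source.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Exponential backoff: base * 2^attempt, clamped to `max`.
/// Overflow in either the power or the multiplication also yields `max`.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Sliding window over recent restart attempts (hypothetical type).
struct RetryWindow {
    window: Duration,
    max_retries: usize,
    attempts: VecDeque<Instant>,
}

impl RetryWindow {
    fn new(window: Duration, max_retries: usize) -> Self {
        RetryWindow { window, max_retries, attempts: VecDeque::new() }
    }

    /// Drops attempts older than the window, then records `now` if the cap
    /// has not been reached. Returns false when the cap is hit, which is the
    /// point where a supervisor would move the server to a Failed state.
    fn try_record(&mut self, now: Instant) -> bool {
        while let Some(&oldest) = self.attempts.front() {
            if now.duration_since(oldest) >= self.window {
                self.attempts.pop_front();
            } else {
                break;
            }
        }
        if self.attempts.len() >= self.max_retries {
            return false;
        }
        self.attempts.push_back(now);
        true
    }
}
```

Passing the current Instant in as a parameter, rather than reading the clock inside try_record, keeps the tracker testable without sleeping.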
logaritmisk 4a0b32d90e fix(supervisor): StartAck::SpawnFailed surfaces real failures
Add StartAck::SpawnFailed(String) so callers can distinguish a successful
start from a failed spawn. The Start command arm now sends SpawnFailed on
io::Error rather than the misleading Started. handlers.rs maps the new
variant to an RpcErrorCode::SpawnFailed JSON-RPC error response.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 12:32:07 +02:00
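
Note on the commit above: a small sketch of the shape this fix describes, where a spawn error becomes SpawnFailed instead of Started and is then mapped to a JSON-RPC error. Only StartAck::Started and StartAck::SpawnFailed(String) are named in the commit; the RpcError struct, the helper functions, and the numeric code are hypothetical (the code is a placeholder picked from the JSON-RPC 2.0 range reserved for implementation-defined server errors), and the real enum may have further variants.

```rust
use std::io;

/// The two variants named in the commit message.
#[derive(Debug, PartialEq)]
enum StartAck {
    Started,
    SpawnFailed(String),
}

/// Hypothetical JSON-RPC error payload.
#[derive(Debug, PartialEq)]
struct RpcError {
    code: i32,
    message: String,
}

/// Placeholder value, not the project's actual RpcErrorCode::SpawnFailed.
const SPAWN_FAILED_CODE: i32 = -32001;

/// Turns the outcome of a spawn attempt into an ack, keeping the error text.
fn ack_from_spawn<T>(result: io::Result<T>) -> StartAck {
    match result {
        Ok(_) => StartAck::Started,
        Err(e) => StartAck::SpawnFailed(e.to_string()),
    }
}

/// Handler-side mapping: a failed spawn becomes an error response.
fn ack_to_response(ack: StartAck) -> Result<(), RpcError> {
    match ack {
        StartAck::Started => Ok(()),
        StartAck::SpawnFailed(message) => Err(RpcError { code: SPAWN_FAILED_CODE, message }),
    }
}
```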
logaritmisk b366df0482 fix(supervisor): make backoff sleep interruptible by Stop/Shutdown
Replace the bare sleep(delay).await in the Restart backoff arm with a
tokio::select! over the timer and cmd_rx. Stop/Shutdown are now handled
immediately during backoff (Stop → Stopped, Shutdown → clean exit);
Start/Restart/Reconfigure skip the remaining delay and retry at once.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 12:31:32 +02:00
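
Note on the commit above: the fix itself uses tokio::select! over a timer and cmd_rx. The sketch below is a synchronous, std-only analogue of the same idea, swapping in std::sync::mpsc::Receiver::recv_timeout so it runs without an async runtime. The Command and BackoffOutcome enums and the wait_backoff function are hypothetical, and treating a closed channel as a clean exit is an assumption.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Commands listed in the commit message.
#[derive(Debug, PartialEq)]
enum Command {
    Start,
    Stop,
    Restart,
    Reconfigure,
    Shutdown,
}

/// What the supervisor loop should do once the wait ends (hypothetical).
#[derive(Debug, PartialEq)]
enum BackoffOutcome {
    RetryNow,
    Stopped,
    Exit,
}

/// Waits up to `delay`, but returns early as soon as a command arrives.
fn wait_backoff(cmd_rx: &Receiver<Command>, delay: Duration) -> BackoffOutcome {
    match cmd_rx.recv_timeout(delay) {
        // The timer elapsed with no command: proceed with the scheduled retry.
        Err(RecvTimeoutError::Timeout) => BackoffOutcome::RetryNow,
        // Stop cancels the pending restart.
        Ok(Command::Stop) => BackoffOutcome::Stopped,
        // Shutdown ends the task; a closed channel is treated the same way here.
        Ok(Command::Shutdown) | Err(RecvTimeoutError::Disconnected) => BackoffOutcome::Exit,
        // Start/Restart/Reconfigure skip the remaining delay and retry at once.
        Ok(Command::Start | Command::Restart | Command::Reconfigure) => BackoffOutcome::RetryNow,
    }
}
```

The point in both versions is the same: the wait and the command channel are polled together, so a Stop sent mid-backoff is not queued behind the sleep.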
logaritmisk 3e4ad79137 fix(supervisor): publish full status (pid, port, uptime, restart_count, last_exit) via watch channel
Replace watch::Receiver<ServerState> on SupervisorHandle with watch::Receiver<Status>,
a richer snapshot type that carries pid, port, uptime_secs, restart_count and last_exit.
SupervisorTask maintains current_pid and publishes a fresh Status on every state
transition; handlers.rs reads the full Status so list/status no longer return
zeroed/None fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 12:30:56 +02:00
logaritmisk ae6ed1cf0a chore: remove stray libnull.rlib and gitignore *.rlib
Accidentally committed in 9d5d8f0 during the polish task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 12:25:13 +02:00
logaritmisk 0261d58d5d docs: README and example KDL config 2026-05-25 12:19:34 +02:00
logaritmisk 9d5d8f04a2 chore: clippy fixes - allow should_implement_trait and collapse nested if 2026-05-25 12:19:24 +02:00
logaritmisk b1e7dea739 test(xy): logs --tail and --follow
Fix a deadlock in the log-stream handler that caused all logs
requests to hang: Connection used a single Mutex<JsonFramed> for
both reads and writes, so the serve loop holding the read lock
blocked the spawned notification task from writing.  Split
Connection into separate reader and writer mutexes.

Also fix a response/notification ordering race: the log task now
waits for an explicit ready signal sent by serve after writing the
LOGS response, ensuring notifications never arrive at the client
before their initiating response.
2026-05-25 12:17:32 +02:00
logaritmisk 15791c628b test(xy): reload diff 2026-05-25 12:05:58 +02:00
logaritmisk 284b6e7402 test(xy): restart cap escalates to failed 2026-05-25 12:05:45 +02:00
logaritmisk 434828c14e test(xy): auto-start + stop/start lifecycle 2026-05-25 12:05:28 +02:00
logaritmisk 48d63a0549 test(xy): integration test harness 2026-05-25 12:03:38 +02:00
logaritmisk 7107977637 test(xy): helper binaries for integration tests 2026-05-25 12:03:13 +02:00
logaritmisk c1f6225e26 feat(xy): CLI client commands
Replace bail!("not implemented") stubs with real RPC calls over the Unix
socket; add format::list_table for fixed-width list output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 12:02:09 +02:00
logaritmisk b434c636a6 feat(xy): logs streaming via subscription notifications
Implement per-connection ConnState tracking active subscriptions, and the
logs/logs_cancel RPC handlers. Snapshot-only streams terminate with a
log_end notification; follow streams forward broadcast lines until
cancelled or connection close.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:59:46 +02:00
logaritmisk c679465f12 feat(xy): reload handler with diff
Implements the `reload` JSON-RPC method: diffs the on-disk config dir
against the in-memory registry and reconciles — stops removed servers,
restarts changed servers (shutdown-then-respawn), and starts new ones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:56:45 +02:00
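
Note on the commit above: the reload diff can be pictured as a comparison of two name-to-hash maps, one for the running registry and one for the config dir on disk. This is a sketch under assumptions: ReloadPlan, diff_configs, and the u64 hash type are hypothetical, and the only grounding is that the registry stores config-hash entries and that reload reports added, removed, and changed servers.

```rust
use std::collections::BTreeMap;

/// Result of comparing the registry with the on-disk configs (hypothetical).
#[derive(Debug, Default, PartialEq)]
struct ReloadPlan {
    added: Vec<String>,
    removed: Vec<String>,
    changed: Vec<String>,
}

/// Both maps go from server name to a hash of that server's config.
fn diff_configs(
    running: &BTreeMap<String, u64>,
    on_disk: &BTreeMap<String, u64>,
) -> ReloadPlan {
    let mut plan = ReloadPlan::default();
    for (name, hash) in on_disk {
        match running.get(name) {
            // On disk but not in the registry: start it.
            None => plan.added.push(name.clone()),
            // Same name, different hash: shut down, then respawn.
            Some(old) if old != hash => plan.changed.push(name.clone()),
            // Unchanged: leave the server alone.
            Some(_) => {}
        }
    }
    for name in running.keys() {
        // In the registry but no longer on disk: stop it.
        if !on_disk.contains_key(name) {
            plan.removed.push(name.clone());
        }
    }
    plan
}
```

Using BTreeMap keeps the three lists in name order, which makes the reload output stable between runs.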
logaritmisk 736e6d1854 feat(xy): RPC handlers for list/status/start/stop/restart
Per-connection JSON-RPC dispatch in daemon/handlers.rs — list, status,
start, stop, and restart are fully implemented; reload, logs, and
logs_cancel are stubbed with -32601 for later tasks.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:54:52 +02:00
logaritmisk 3ab982aea1 feat(xy): daemon boot + accept loop + graceful shutdown
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:52:41 +02:00
logaritmisk d7aa543ac0 feat(xy): daemon Registry with config-hash entries 2026-05-25 11:49:58 +02:00
logaritmisk 71808783c4 feat(xy): clap CLI scaffold 2026-05-25 11:49:47 +02:00
logaritmisk 49c006df10 feat(xy): exclusive pidfile guard 2026-05-25 11:48:38 +02:00
logaritmisk 58c44e0b48 feat(xy): XDG path resolution 2026-05-25 11:48:36 +02:00
logaritmisk b137f85a0c feat(ipc): server bind + Connection wrapper 2026-05-25 11:47:24 +02:00
logaritmisk fbfb1db427 feat(ipc): client with call + notification reader 2026-05-25 11:47:11 +02:00
logaritmisk e58b6866ef feat(ipc): newline-delimited JSON framing 2026-05-25 11:45:50 +02:00
logaritmisk 53f6b82f2b feat(ipc): JSON-RPC envelope types 2026-05-25 11:45:37 +02:00
logaritmisk a3c979511e feat(supervisor): supervisor task with state machine
One async task per managed server owns all state transitions via a
tokio::select! loop over cmd_rx and wait_child. Includes RealSpawner
and a smoke test covering the Start → Running → exit → Stopped →
Shutdown happy path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:44:12 +02:00
logaritmisk f1b2306156 feat(supervisor): RealChild + spawn_with_logs
Append RealChild (real tokio::process::Child wrapper) and spawn_with_logs
to child.rs. Uses nix::unistd::setpgid via tokio's re-exported pre_exec
to put the child in its own process group, and starts per-stream log pump tasks that
drain stdout/stderr into the provided LogSink. terminate/kill signal the
whole process group via kill(-pgid, SIG*).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 11:40:19 +02:00
logaritmisk e121fe28bb feat(supervisor): LogSink fans out to file, ring buffer, broadcast 2026-05-25 11:37:19 +02:00
logaritmisk 7995a53e82 feat(supervisor): ring buffer for recent log lines 2026-05-25 11:36:52 +02:00
logaritmisk d51f25350c feat(supervisor): rotating log writer 2026-05-25 11:36:23 +02:00
logaritmisk d237e980e9 feat(supervisor): sliding retry-window tracker 2026-05-25 11:34:55 +02:00
logaritmisk 54045da2df feat(supervisor): exponential backoff calculator 2026-05-25 11:34:53 +02:00
logaritmisk 4837a73167 feat(supervisor): restart-policy decision logic 2026-05-25 11:34:50 +02:00
logaritmisk 1d2848f03a feat(supervisor): ChildHandle trait + MockChild 2026-05-25 11:33:23 +02:00
logaritmisk bd926061bf feat(protocol): JSON-RPC method param/result types 2026-05-25 11:31:56 +02:00
logaritmisk e8f5846cec feat(protocol): load_all_configs from dir with duplicate port detection 2026-05-25 11:30:38 +02:00
logaritmisk 7e59d7d050 feat(protocol): KDL parser for ServerConfig
Adds kdl_parse module with parse_server_config() that deserialises a
KDL document into ServerConfig, with full validation of name, types,
durations, and restart/stop blocks. Also derives Default on
RestartPolicy to satisfy clippy.
2026-05-25 11:29:05 +02:00
logaritmisk 355d0debda feat(protocol): ServerConfig + ConfigError + RpcErrorCode 2026-05-25 11:23:57 +02:00
logaritmisk 5a0963665d feat(protocol): RestartPolicy/RestartConfig/StopConfig with defaults 2026-05-25 11:22:52 +02:00
logaritmisk 0e49834c93 feat(protocol): ServerState enum 2026-05-25 11:21:43 +02:00
logaritmisk 1b76378b37 chore: bump workspace resolver to "3"
cargo 1.95 supports resolver 3; align with plan spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 11:20:54 +02:00
logaritmisk 5b1314b0af chore: convert to cargo workspace with four crates 2026-05-25 11:17:24 +02:00
logaritmisk c252bd7716 docs: add xy MCP supervisor implementation plan
37-task TDD-style plan across 7 phases: workspace skeleton,
xy-protocol (config/state/rpc types), xy-supervisor (state machine
with mock-driven unit tests), xy-ipc (JSON-RPC over Unix socket),
xy binary (daemon + CLI), integration tests with test-helper bins,
and polish (fmt/clippy/README).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 11:13:56 +02:00
logaritmisk 8f7200aa25 docs: add xy MCP supervisor design spec
Approved design for the MVP: single xy binary with a Cargo workspace
(xy-protocol, xy-supervisor, xy-ipc, xy), Unix socket + newline-delimited
JSON-RPC, per-server KDL configs at XDG paths (XDG on macOS too via
etcetera), supervisor-per-server task model with per-server restart policy,
log capture to disk + ring buffer + broadcast for follow.

MVP commands: daemon, list, status, start/stop/restart (name|--all),
reload, logs. Process-alive supervision only; HTTP/MCP-aware probes,
container isolation, launchd integration, and TUI deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 10:51:57 +02:00
logaritmisk cd2746cc3d chore: initial cargo skeleton
cargo new output as the baseline for the xy MCP supervisor project.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 10:51:49 +02:00