Observability
How the system witnesses its own operation: latencies, decisions, spend, failures. Not content (that’s streams), not structural change (that’s mutations) — see primitives for where signals sit in the formal layer.
Four layers, picked by the question you’re asking:
| Layer | Answers | Storage | Retention |
|---|---|---|---|
| Run state | what ran, what’s running, what happened last | Postgres + thread streams, via HTTP API | durable |
| Usage | who spent what, on whose key, at what cost | usage_events + PostHog | durable / plan limits |
| Lifecycle | aggregate trends — signups, reply rates, activity | signals on threads + PostHog mirror | durable / plan limits |
| Cloudflare logs | live debugging | CF dashboard, wrangler tail | ~72h (free tier) |
Run state is remote-only: the CLI has no local database; Postgres and the thread’s durable stream are the sole sources of truth.
Usage: every seam that spends money calls recordUsage() after the spend; one event lands in the usage_events ledger and in PostHog, joined by trace_id. Dispatch mints one traceId per turn and stamps it on the turn’s signal.dispatch.* payloads and every recordUsage event the turn produces — so “activation started → spend → result” joins across the thread stream, the ledger, and PostHog. Call shape and columns: analytics → usage; whose money it is: llm-keys.
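The join described above can be sketched as follows. This is a minimal illustration, not the real implementation: `makeTurn`, the `Sinks` arrays, and the event shapes are hypothetical stand-ins for dispatch, the `usage_events` ledger, PostHog, and the thread stream. The actual call shape and columns are documented under analytics → usage.

```typescript
// Hypothetical shapes for illustration only.
type UsageEvent = { traceId: string; seam: string; costUsd: number };
type DispatchSignal = { type: string; traceId: string };

interface Sinks {
  ledger: UsageEvent[];     // stands in for the usage_events table
  posthog: UsageEvent[];    // stands in for PostHog capture
  thread: DispatchSignal[]; // stands in for the thread's durable stream
}

// mintId is injected so the sketch stays runtime-agnostic; in a Worker,
// () => crypto.randomUUID() would be one natural choice (an assumption).
function makeTurn(sinks: Sinks, mintId: () => string) {
  const traceId = mintId(); // minted once per turn, reused everywhere below
  return {
    traceId,
    // Stamp the turn's traceId on a signal.dispatch.* entry.
    signal(verb: string): void {
      sinks.thread.push({ type: `signal.dispatch.${verb}`, traceId });
    },
    // Called after the spend: one event, written to both sinks.
    recordUsage(seam: string, costUsd: number): void {
      const event: UsageEvent = { traceId, seam, costUsd };
      sinks.ledger.push(event);
      sinks.posthog.push(event);
    },
  };
}

// Usage: every record produced by the turn carries the same id.
const demoSinks: Sinks = { ledger: [], posthog: [], thread: [] };
const demoTurn = makeTurn(demoSinks, () => "trace-demo");
demoTurn.signal("started");
demoTurn.recordUsage("llm", 0.002);
demoTurn.signal("completed");
```

Because the id is minted once and closed over, no seam can accidentally record spend under a different trace than the turn that caused it.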
Lifecycle: typed signal.<entity>.<verb> entries on threads, mirrored to PostHog by track() when the agent opted in. Call shape and vocabulary: analytics.
Cloudflare logs: console output streams to the CF dashboard and wrangler tail (observability.logs.enabled in wrangler config). Console statements use bracket prefixes: [dispatch.gate], [dispatch.turn], [usage].
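One way to keep the bracket prefixes consistent is a small helper that every console statement goes through. The sketch below assumes hypothetical `formatLog` and `log` helpers; only the three prefixes come from the convention above.

```typescript
// The three prefixes named in this document; extend the union as seams are added.
type LogPrefix = "dispatch.gate" | "dispatch.turn" | "usage";

// Builds "[prefix] message {json fields}"; the fields suffix is omitted when empty.
function formatLog(
  prefix: LogPrefix,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  const suffix = Object.keys(fields).length > 0 ? " " + JSON.stringify(fields) : "";
  return `[${prefix}] ${message}${suffix}`;
}

// Thin wrapper so call sites cannot forget the prefix.
function log(prefix: LogPrefix, message: string, fields: Record<string, unknown> = {}): void {
  console.log(formatLog(prefix, message, fields));
}

log("usage", "recorded", { traceId: "trace-demo" });
```

A fixed prefix at the start of each line makes the short-lived log stream easy to filter by seam, for example by piping `wrangler tail` output through a text search.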
Dispatch activations are visible twice: durably as signal.dispatch.* on the thread, and in PostHog as dispatch.started/completed/skipped/failed (mirrored by publishDispatchSignal, always-on). Both carry the turn’s trace_id.
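The two mirroring rules (lifecycle signals only when the agent opted in, dispatch signals always) can be sketched as a single gate. Everything here is hypothetical: `shouldMirror`, `mirror`, and the `signal.user.signup` example are illustrative names, and the real logic lives in `track()` and `publishDispatchSignal`. Dropping the `signal.` prefix for the PostHog event name matches the dispatch names above; for lifecycle events it is an assumption.

```typescript
type Signal = { type: string; traceId?: string };
type Capture = (event: string, props: Record<string, unknown>) => void;

// Dispatch signals are always mirrored; everything else needs the agent's opt-in.
function shouldMirror(signal: Signal, agentOptedIn: boolean): boolean {
  const isDispatch = signal.type.startsWith("signal.dispatch.");
  return isDispatch || agentOptedIn;
}

// Sends the signal to PostHog (via the injected capture function) if allowed.
// Returns true when the event was mirrored.
function mirror(signal: Signal, agentOptedIn: boolean, capture: Capture): boolean {
  if (!shouldMirror(signal, agentOptedIn)) return false;
  // "signal.dispatch.started" becomes "dispatch.started" on the PostHog side.
  const eventName = signal.type.replace(/^signal\./, "");
  capture(eventName, { trace_id: signal.traceId });
  return true;
}

const captured: string[] = [];
const demoCapture: Capture = (event) => { captured.push(event); };
mirror({ type: "signal.dispatch.started", traceId: "trace-demo" }, false, demoCapture);
mirror({ type: "signal.user.signup" }, false, demoCapture); // dropped: no opt-in
```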
PostHog receives operational data only: latencies, model usage, failure rates, token/cost counts, and app ids (UUIDs, sent raw; they’re meaningless without DB access, and keeping them raw makes cross-referencing easy). Server capture disables GeoIP enrichment and drops content/name/path fields. Never send content, identity (emails, names), or session data.
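A scrub step of this kind might look like the sketch below. The `scrubProperties` helper and the exact denylist are assumptions for illustration (in particular, `email` is added here as an example of an identity field). `$geoip_disable` is the event property PostHog documents for turning off GeoIP enrichment, but whether the real server capture uses it or a client-level option is not stated here.

```typescript
type Props = Record<string, unknown>;

// Hypothetical denylist: content/name/path per the rule above, plus email as identity.
const DROPPED_KEYS = new Set(["content", "name", "path", "email"]);

// Returns a copy of props that is safe to send: denied keys removed,
// GeoIP enrichment disabled. The input object is not mutated.
function scrubProperties(props: Props): Props {
  const clean: Props = {};
  for (const [key, value] of Object.entries(props)) {
    if (!DROPPED_KEYS.has(key)) clean[key] = value;
  }
  // Set last so a caller-supplied value cannot re-enable enrichment.
  clean["$geoip_disable"] = true;
  return clean;
}

const safe = scrubProperties({
  latency_ms: 120,
  model: "example-model",
  content: "user text that must never leave the system",
  path: "/private/file.txt",
});
```

A denylist only catches known field names; an allowlist of operational keys would be the stricter design if new properties are added often.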
See analytics, debugging, system/dispatch.