Observability

How the system witnesses its own operation: latencies, decisions, spend, failures. Not content (that’s streams), not structural change (that’s mutations) — see primitives for where signals sit in the formal layer.

Four layers, picked by the question you’re asking:

| Layer | Answers | Storage | Retention |
| --- | --- | --- | --- |
| Run state | what ran, what’s running, what happened last | Postgres + thread streams, via HTTP API | durable |
| Usage | who spent what, on whose key, at what cost | usage_events + PostHog | durable / plan limits |
| Lifecycle | aggregate trends: signups, reply rates, activity | signals on threads + PostHog mirror | durable / plan limits |
| Cloudflare logs | live debugging | CF dashboard, wrangler tail | ~72h (free tier) |

Run state is remote-only: the CLI has no local database; Postgres and the thread’s durable stream are the sole sources of truth.

Usage: every seam that spends money calls recordUsage() after the spend; one event lands in the usage_events ledger and in PostHog, joined by trace_id. Dispatch mints one traceId per turn and stamps it on the turn’s signal.dispatch.* payloads and every recordUsage event the turn produces — so “activation started → spend → result” joins across the thread stream, the ledger, and PostHog. Call shape and columns: analytics → usage; whose money it is: llm-keys.
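The trace join described above can be pictured with a small in-memory sketch. Everything here is illustrative: `recordUsageSketch`, `runTurn`, `joinByTrace`, the field names other than `trace_id`, and the `started`/`completed` verbs on the thread are assumptions, not the real call shape (that lives in analytics → usage). The only point is that one id minted per turn is enough to line up the thread stream with the ledger.

```typescript
// Hypothetical shapes for illustration; only trace_id is taken from the text above.
type DispatchSignal = { type: string; trace_id: string };
type UsageEvent = { trace_id: string; seam: string; cost_micros: number };

// In-memory stand-ins for the thread's durable stream and the usage_events ledger.
const threadStream: DispatchSignal[] = [];
const usageLedger: UsageEvent[] = [];

let traceCounter = 0;

// Stand-in for trace minting; a real system would more likely use a UUID.
function mintTraceId(): string {
  traceCounter += 1;
  return `trace-${traceCounter}`;
}

// Stand-in for recordUsage(): called after the spend, stamped with the turn's trace id.
function recordUsageSketch(traceId: string, seam: string, costMicros: number): void {
  usageLedger.push({ trace_id: traceId, seam, cost_micros: costMicros });
}

// One turn: mint one id, stamp it on the dispatch signals and on every usage event.
function runTurn(spends: Array<[seam: string, costMicros: number]>): string {
  const traceId = mintTraceId();
  threadStream.push({ type: "signal.dispatch.started", trace_id: traceId });
  for (const [seam, costMicros] of spends) {
    recordUsageSketch(traceId, seam, costMicros);
  }
  threadStream.push({ type: "signal.dispatch.completed", trace_id: traceId });
  return traceId;
}

// Join both stores on trace_id to reconstruct "activation started -> spend -> result".
function joinByTrace(traceId: string): { signals: string[]; totalCostMicros: number } {
  const signals = threadStream
    .filter((s) => s.trace_id === traceId)
    .map((s) => s.type);
  const totalCostMicros = usageLedger
    .filter((u) => u.trace_id === traceId)
    .reduce((sum, u) => sum + u.cost_micros, 0);
  return { signals, totalCostMicros };
}
```

Calling `runTurn` with two spends and then `joinByTrace` on the returned id yields that turn's two dispatch signals and the sum of only its own spends; events from other turns carry a different id and stay out of the join.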

Lifecycle: typed signal.<entity>.<verb> entries on threads, mirrored to PostHog by track() when the agent opted in. Call shape and vocabulary: analytics.

Cloudflare logs: console output streams to the CF dashboard and wrangler tail (observability.logs.enabled in wrangler config). Console statements use bracket prefixes: [dispatch.gate], [dispatch.turn], [usage].
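A consistent prefix is what makes `wrangler tail` output easy to filter by eye or with a text search. A minimal sketch of a helper that enforces the bracket convention follows; `formatLog` and `log` are hypothetical names, and the `key=value` field rendering is an assumption rather than the project's actual format.

```typescript
// The three prefixes named above; the union keeps call sites from inventing new ones.
type LogPrefix = "dispatch.gate" | "dispatch.turn" | "usage";

// Build one log line: "[prefix] message key=value key=value".
function formatLog(
  prefix: LogPrefix,
  message: string,
  fields: Record<string, string | number> = {},
): string {
  const kv = Object.entries(fields)
    .map(([key, value]) => `${key}=${value}`)
    .join(" ");
  return kv.length > 0 ? `[${prefix}] ${message} ${kv}` : `[${prefix}] ${message}`;
}

// Thin wrapper so every console statement goes through the same formatter.
function log(
  prefix: LogPrefix,
  message: string,
  fields: Record<string, string | number> = {},
): void {
  console.log(formatLog(prefix, message, fields));
}
```

For example, `log("usage", "recorded", { trace_id: "t1" })` writes `[usage] recorded trace_id=t1`. Keep content and identity out of `fields`, for the same reasons they are kept out of PostHog.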

Dispatch activations are visible twice: durably as signal.dispatch.* on the thread, and in PostHog as dispatch.started/completed/skipped/failed (mirrored by publishDispatchSignal, always-on). Both carry the turn’s trace_id.
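The difference between the two mirrors (lifecycle is opt-in, dispatch is always on) can be sketched as a single naming function. This is a sketch under assumptions: `parseSignalName` and `mirrorEventName` are hypothetical helpers, `signal.thread.replied` is a made-up example name, and stripping the `signal.` prefix for lifecycle events is inferred from the dispatch names above rather than confirmed.

```typescript
// Split "signal.<entity>.<verb>"; return null for anything that is not that shape.
function parseSignalName(name: string): { entity: string; verb: string } | null {
  const parts = name.split(".");
  const [prefix, entity, verb] = parts;
  if (parts.length !== 3 || prefix !== "signal" || !entity || !verb) {
    return null;
  }
  return { entity, verb };
}

// Decide whether a thread signal is mirrored, and under which event name.
// Dispatch signals are always mirrored; everything else needs the agent's opt-in.
function mirrorEventName(name: string, agentOptedIn: boolean): string | null {
  const parsed = parseSignalName(name);
  if (parsed === null) return null;
  const alwaysOn = parsed.entity === "dispatch";
  if (!alwaysOn && !agentOptedIn) return null;
  // Assumed naming rule: drop the "signal." prefix, keep "<entity>.<verb>".
  return `${parsed.entity}.${parsed.verb}`;
}
```

Under these assumptions, `signal.dispatch.started` maps to `dispatch.started` regardless of opt-in, while a lifecycle signal maps to nothing until the agent has opted in.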

PostHog receives operational data only: latencies, model usage, failure rates, token/cost counts, and app ids (UUIDs, sent raw: they’re meaningless without DB access, and raw ids keep cross-referencing easy). Server capture disables GeoIP enrichment and drops content/name/path fields. Never send content, identity (emails, names), or session data.
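A scrub step of this kind might look like the sketch below. It is not the project's capture code: `scrubProperties` is a hypothetical helper, exact-key matching on `content`, `name`, and `path` is an assumption about how those fields are identified, and setting PostHog's `$geoip_disable` event property is one way to turn off GeoIP enrichment (the real capture path may use a client option instead).

```typescript
// Keys assumed to carry content or identity; matched exactly in this sketch.
const DROPPED_KEYS: ReadonlySet<string> = new Set(["content", "name", "path"]);

// Return a copy of the event properties that is safe to send:
// sensitive keys removed, GeoIP enrichment switched off.
function scrubProperties(props: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(props)) {
    if (DROPPED_KEYS.has(key)) continue; // drop, never forward
    safe[key] = value;
  }
  safe["$geoip_disable"] = true; // ask PostHog not to enrich the event with location
  return safe;
}
```

Given `{ content: "hi", latency_ms: 42, app_id: "a-1" }`, the result keeps `latency_ms` and `app_id`, omits `content`, and adds `$geoip_disable: true`. The input object is not mutated, so callers can still log or store the original elsewhere if that is appropriate.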

See analytics, debugging, system/dispatch.