Observability

Observability data is not content (that’s streams) and not structural change (that’s mutations). It’s the sixth primitive: signals — the system witnessing its own operation. Latencies, decisions, retries, activations, failures. Every signal has an agent_id like any other write, but its audience is the operator, not the participant. See primitives for how signals fit the formal layer.

Four layers, each answering a different question.

CLI control plane — `@arbe/observability`

Local SQLite at `.arbe/arbe.db`, not committed. Four tables: runs (the source of truth for run state) plus sessions, messages, and parts (verbatim copies of opencode session data, snapshotted on run completion). WAL mode and a busy timeout allow concurrent CLI access.
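
A minimal sketch of the WAL and busy-timeout setup described above. It assumes a database handle that exposes `exec(sql)`, as the `Database` class in `bun:sqlite` does. The `SqlExecutor` interface, the function name, and the 5000 ms default are illustrative assumptions, not values taken from `@arbe/observability`.

```typescript
// Minimal handle shape (assumed). bun:sqlite's Database satisfies it.
interface SqlExecutor {
  exec(sql: string): unknown;
}

// Apply the two pragmas that let several CLI processes share one database file.
// busyTimeoutMs is a hypothetical default, not the project's actual setting.
function configureForConcurrentAccess(db: SqlExecutor, busyTimeoutMs = 5000): string[] {
  if (!Number.isInteger(busyTimeoutMs) || busyTimeoutMs < 0) {
    throw new RangeError("busyTimeoutMs must be a non-negative integer");
  }
  const pragmas = [
    // WAL: readers do not block the writer, and the writer does not block readers.
    "PRAGMA journal_mode = WAL;",
    // Wait up to busyTimeoutMs for a lock instead of failing immediately.
    `PRAGMA busy_timeout = ${busyTimeoutMs};`,
  ];
  for (const sql of pragmas) db.exec(sql);
  return pragmas;
}
```

Under Bun this could be called as `configureForConcurrentAccess(new Database(".arbe/arbe.db"))`. Returning the statements makes the setup easy to log or assert on in tests.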

When logged in, runs and sessions sync to Supabase on completion (see storage.md). Otherwise they stay local. PostHog is a separate concern: anonymized operational signals on the server side, never content or identity.

Answers: what ran, what’s running now, what happened last, what to inspect next.

Agent activations — DO SQLite

Each agent DO in packages/worker/src/agent.ts logs activations to its own Durable Object SQLite. One row per activation: trigger type, outcome (replied/skipped/error), skip reason, LLM metrics, duration. 30-day retention, pruned alongside message processing.
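
To make the row shape and the 30-day window concrete, here is a hedged sketch. The field names, the `llm` sub-object, and `pruneActivations` are assumptions inferred from the description above. They are not the actual schema in packages/worker/src/agent.ts.

```typescript
// Hypothetical row shape; field names are assumptions, not the real schema.
type ActivationOutcome = "replied" | "skipped" | "error";

interface ActivationRow {
  triggerType: string;
  outcome: ActivationOutcome;
  skipReason: string | null; // only meaningful when outcome is "skipped"
  llm: { model: string; inputTokens: number; outputTokens: number } | null;
  durationMs: number;
  createdAt: number; // epoch milliseconds
}

const RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Keep rows at or after the cutoff; anything older than 30 days is dropped.
function pruneActivations(rows: ActivationRow[], now: number = Date.now()): ActivationRow[] {
  const cutoff = now - RETENTION_MS;
  return rows.filter((row) => row.createdAt >= cutoff);
}
```

Inside a Durable Object the same cutoff would more likely be applied with a single `DELETE` statement against the activations table. The in-memory filter is only there to show the retention rule.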

Queryable via getActivations() RPC and GET /api/agents/[id]/activations.
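
A small sketch of calling the HTTP route. The path comes from the line above. `baseUrl`, the helper names, and the lack of authentication headers are assumptions. The response shape is not documented here, so the sketch leaves it as `unknown`.

```typescript
// Build the activations URL for one agent. The agent id is percent-encoded
// so that unusual characters cannot alter the path.
function activationsUrl(baseUrl: string, agentId: string): string {
  return new URL(`/api/agents/${encodeURIComponent(agentId)}/activations`, baseUrl).toString();
}

// Fetch activations as untyped JSON. Any auth the real route needs is omitted here.
async function fetchActivations(baseUrl: string, agentId: string): Promise<unknown> {
  const res = await fetch(activationsUrl(baseUrl, agentId));
  if (!res.ok) throw new Error(`activations request failed: ${res.status}`);
  return res.json();
}
```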

Answers: did the agent activate on this message, and what did it decide?

Implemented. See packages/worker/src/agent.ts for the activation logging.

PostHog — product analytics

Four server-side events (mutation, stream_message, agent_dispatch, agent_activation) plus client-side auto-capture. See analytics.md for event schemas, coverage, tradeoffs, and configuration.

Answers: aggregate trends — cost by model, reply rates over time, room activity.

Cloudflare native logs

Both packages/www and packages/worker have observability.logs.enabled: true in their wrangler configs. Console output from both workers streams to the CF dashboard and wrangler tail. These logs are ephemeral: good for live debugging, but gone once they age out of Cloudflare's retention window.

Console statements use a bracket-prefix convention: [agent-dispatch], [agentName], [streams/scopeId].
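
One way to keep the prefix consistent is a tiny helper, sketched below. `tagLine` and `taggedLogger` are hypothetical names, not functions from the codebase.

```typescript
// Format one line in the bracket-prefix convention: "[tag] message".
function tagLine(tag: string, message: string): string {
  return `[${tag}] ${message}`;
}

// Create a logger bound to one tag. The sink defaults to console.log
// and can be swapped out in tests.
function taggedLogger(tag: string, sink: (line: string) => void = console.log) {
  return (message: string): void => sink(tagLine(tag, message));
}
```

For example, `taggedLogger("agent-dispatch")("queued")` writes `[agent-dispatch] queued`, and a per-scope logger could be built with a tag such as `streams/${scopeId}`.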

What lives where

| Layer | Storage | Runtime | Retention |
| --- | --- | --- | --- |
| CLI control plane | local SQLite `.arbe/arbe.db` | bun | until pruned/deleted |
| Agent activations | DO SQLite (per agent) | CF Workers | 30 days |
| PostHog | PostHog cloud | CF Workers (server), browser (client) | PostHog plan limits |
| CF logs | Cloudflare platform | CF Workers | ~72h (free tier) |

The two SQLite stores cannot be merged because they live in different runtimes (bun vs. CF Durable Objects). PostHog receives anonymized operational signals only (latencies, model usage, failure rates), never content, identity, or session data.
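
The "operational signals only" rule can be enforced with an allowlist applied before any event leaves the worker. The sketch below is an assumption about how that might look. The key names are hypothetical, and the real event schemas are in analytics.md.

```typescript
// Hypothetical allowlist of operational property names (not the real schema).
const ALLOWED_KEYS = new Set(["latency_ms", "model", "outcome", "failed"]);

type SignalValue = string | number | boolean;

// Keep only allowlisted, primitive-valued properties so that content,
// identity, and session fields are dropped before the event is sent.
function toOperationalSignal(props: Record<string, unknown>): Record<string, SignalValue> {
  const out: Record<string, SignalValue> = {};
  for (const [key, value] of Object.entries(props)) {
    if (!ALLOWED_KEYS.has(key)) continue;
    if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      out[key] = value;
    }
  }
  return out;
}
```

An allowlist fails closed: a property added later is dropped unless someone deliberately adds it to the allowlist. A denylist would let it through by default.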