# Analytics

Three surfaces, picked by what the event *is* — never send the same fact through two of them by hand:

- **Signals** — durable in-app lifecycle record (`signal.<entity>.<verb>` on a thread).
- **`track()`** — PostHog mirror of lifecycle events, opt-in per agent.
- **`recordUsage()`** — money. Not opt-in — billing attribution. See [usage](#usage--money).

## Lifecycle — signals + track()

Signals are the durable record: schema-validated `signal.<entity>.<verb>` payloads written atomically with the mutation. `signal.thread.*` lands on the thread itself; `signal.house.*` and `signal.dispatch.*` on the scope's primary thread; `signal.permission.*` on the thread where dispatch runs. `track()` is the PostHog mirror — called at the route boundary, fire-and-forget via `waitUntil`, only when the agent opted in.

```
mutation ──┬─► postThreadEntry  signal.house.created   (durable, queryable, in-app)
           └─► track(...)       house.created          (PostHog, opt-in)
```

Adding a lifecycle event is two independent decisions: post a typed `signal.*` from the core mutation if it should appear on a thread; call `track()` at the route if it should appear in PostHog. Both, either, or neither. The one rule: anything worth recording durably belongs in the core schema as a typed `signal.*` — never through `track()` alone.

The signal vocabulary lives in `packages/core/schemas/stream-events/thread.ts` (`signal.house.*`, `signal.thread.*`, `signal.permission.*`, `signal.dispatch.*`). To add a kind: extend the schema, post from the mutation.
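The two decisions can be sketched for the `house.created` event from the diagram. This is a minimal illustration, not the real code: `postThreadEntry` and `track` are named in this document, but their signatures here are assumptions, and `createHouse` / `handleCreateHouseRoute` plus the in-memory arrays are hypothetical stand-ins.

```typescript
// Stand-ins for the real helpers; the signatures are assumptions.
type ThreadEntry = { threadId: string; kind: string; payload: Record<string, unknown> };
type TrackedEvent = { event: string; props: Record<string, unknown> };

const threadEntries: ThreadEntry[] = [];
const tracked: TrackedEvent[] = [];

function postThreadEntry(entry: ThreadEntry): void {
  // The real helper writes atomically with the mutation; this one only records.
  threadEntries.push(entry);
}

function track(event: string, props: Record<string, unknown>, optedIn: boolean | null): void {
  if (optedIn !== true) return; // null counts as opt-out
  tracked.push({ event, props });
}

// Decision 1: the core mutation posts the durable, typed signal.
function createHouse(houseId: string, primaryThreadId: string): { houseId: string } {
  postThreadEntry({
    threadId: primaryThreadId,
    kind: 'signal.house.created',
    payload: { house_id: houseId },
  });
  return { houseId };
}

// Decision 2: the route mirrors to PostHog, and only for opted-in agents.
function handleCreateHouseRoute(
  houseId: string,
  primaryThreadId: string,
  optedIn: boolean | null,
): void {
  const house = createHouse(houseId, primaryThreadId);
  track('house.created', { house_id: house.houseId }, optedIn);
}
```

Keeping the signal inside the mutation means the durable record exists whether or not the route ever calls `track()`.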

`track()` (`apps/www/src/lib/server/track.ts`) auto-injects `route`, `method`, `request_id`, `caller_kind`, `release_sha`, `ts`; caller properties win on collisions before the PostHog sanitiser runs. Opt-in is per-agent (`agents.telemetry_opt_in`, surfaced at `/account/telemetry`; `null` counts as opt-out). Privacy line: no content or identity — `track()` drops content/name/path fields before capture; app ids (UUIDs) are sent raw; the shared transport disables PostHog GeoIP enrichment. Errors flow through `trackError(...)` from `handleError`. The browser side (`apps/www/src/lib/analytics.ts`) is posthog-js auto-capture only — no manual `capture()` in page code, no signal concept.
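The merge order and the privacy filter described above might look roughly like this. It is a sketch under assumptions: the real drop-list in `track.ts` is not shown in this document, so the substring match on `content`/`name`/`path` and the helper names `buildTrackProps` and `shouldTrack` are hypothetical.

```typescript
type Props = Record<string, unknown>;

// Assumption: keys containing these substrings are treated as content or identity.
const DROPPED_KEY_PARTS = ['content', 'name', 'path'];

function isDropped(key: string): boolean {
  const lower = key.toLowerCase();
  return DROPPED_KEY_PARTS.some((part) => lower.includes(part));
}

function buildTrackProps(auto: Props, caller: Props): Props {
  // Caller properties are spread last, so they win on collisions.
  const merged: Props = { ...auto, ...caller };
  // The filter runs after the merge, so caller-supplied fields are filtered too.
  const kept: Props = {};
  for (const key of Object.keys(merged)) {
    if (!isDropped(key)) kept[key] = merged[key];
  }
  return kept;
}

function shouldTrack(telemetryOptIn: boolean | null): boolean {
  // Only an explicit true opts in; null counts as opt-out.
  return telemetryOptIn === true;
}
```

For example, `buildTrackProps({ route: '/houses' }, { route: '/custom', house_name: 'x' })` keeps the caller's `route` and drops `house_name`.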

Current PostHog lifecycle events: `user.signed_in`, `house.created`, `house.deleted`, `agent.created`, `thread.created`, `member.joined`, `environment.created`, `invite.created`, `invite.claimed`, `account.deleted`, `agent.self_deleted`, `house.file.uploaded`, `house.file.deleted`, `server.error`.

**Dispatch mirror** — the exception to "track() at the route": dispatch terminals fire after the response under `waitUntil`, so `publishDispatchSignal` (`packages/core/dispatch/signals.ts`) mirrors every `signal.dispatch.*` it posts straight to PostHog as `dispatch.started` / `dispatch.completed` / `dispatch.skipped` / `dispatch.failed`. Like usage, this is ops data: always-on, not opt-in. Props: `thread_id`, `agent_id`, `trace_id`, `duration_ms`, `http_status`, `reason`.
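The name mapping in the mirror is mechanical: strip the `signal.` prefix and forward the props. A sketch, assuming a hypothetical `toPosthogEvent` helper; which props are optional on which kind is a guess, not taken from `signals.ts`.

```typescript
type DispatchSignalKind =
  | 'signal.dispatch.started'
  | 'signal.dispatch.completed'
  | 'signal.dispatch.skipped'
  | 'signal.dispatch.failed';

interface DispatchProps {
  thread_id: string;
  agent_id: string;
  trace_id: string;
  duration_ms?: number; // assumption: absent on `started`
  http_status?: number; // assumption: only set when a response exists
  reason?: string;      // assumption: only set on `skipped` / `failed`
}

function toPosthogEvent(
  kind: DispatchSignalKind,
  props: DispatchProps,
): { event: string; properties: DispatchProps } {
  // 'signal.dispatch.failed' -> 'dispatch.failed'
  return { event: kind.slice('signal.'.length), properties: { ...props } };
}
```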

**Correlation** — `dispatch()` mints one `traceId` per turn and stamps it on every `signal.dispatch.*` payload and every `recordUsage` event the turn produces (gate, reply, tool seams via `ToolContext.traceId`). One dispatch turn = one trace: the thread stream, the `usage_events` ledger, and PostHog (`$ai_trace_id` on `$ai_generation`, `trace_id` elsewhere) all join on it.
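As an illustration of "one turn = one trace", the sketch below mints the id once and stamps it on everything the turn emits. `runTurn` and the injected `mintTraceId` are hypothetical; the real `dispatch()` is not shown in this document.

```typescript
interface DispatchSignal { kind: string; trace_id: string }
interface UsageRecord { seam: string; trace_id: string }

function runTurn(mintTraceId: () => string): {
  traceId: string;
  signals: DispatchSignal[];
  usage: UsageRecord[];
} {
  const traceId = mintTraceId(); // minted exactly once per turn
  const signals: DispatchSignal[] = [{ kind: 'signal.dispatch.started', trace_id: traceId }];
  const usage: UsageRecord[] = [
    { seam: 'gate', trace_id: traceId },
    { seam: 'reply', trace_id: traceId },
  ];
  signals.push({ kind: 'signal.dispatch.completed', trace_id: traceId });
  return { traceId, signals, usage };
}

// Joining the stream and the ledger is then a filter on the shared id.
function usageForTrace(usage: UsageRecord[], traceId: string): UsageRecord[] {
  return usage.filter((u) => u.trace_id === traceId);
}
```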

Verify a signal landed: `arbe thread entries list <thread-id>`. Verifying the PostHog half needs dashboard access or a PAT.

```
PUBLIC_POSTHOG_KEY      empty disables PostHog
PUBLIC_POSTHOG_HOST     unset falls back to PostHog's default host
RELEASE_SHA             stamped on every event; falls back to 'dev'
```
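Read as code, the three variables resolve roughly as follows. `resolvePosthogConfig` is a hypothetical helper; the default host value is not stated here, so the sketch leaves it `undefined` rather than guessing.

```typescript
interface PosthogEnv {
  PUBLIC_POSTHOG_KEY?: string;
  PUBLIC_POSTHOG_HOST?: string;
  RELEASE_SHA?: string;
}

interface PosthogConfig {
  enabled: boolean;
  key: string;
  host: string | undefined;
  releaseSha: string;
}

function resolvePosthogConfig(env: PosthogEnv): PosthogConfig {
  const key = (env.PUBLIC_POSTHOG_KEY ?? '').trim();
  return {
    enabled: key.length > 0, // an empty key disables PostHog entirely
    key,
    host: env.PUBLIC_POSTHOG_HOST || undefined, // undefined: the transport's default applies
    releaseSha: env.RELEASE_SHA || 'dev',
  };
}
```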

## Usage — money

`recordUsage(event)` (`packages/core/usage.ts`) is the single call for paid usage. Every spending seam calls it after the spend. Fire-and-forget, never throws, two sinks per event sharing one `trace_id`:

1. **`usage_events`** (Postgres, append-only) — the ledger and future enforcement source. Columns: `house_id`, `agent_id`, `thread_id`, `capability` (`llm`/`sandbox`/`file_index`, open set), `seam` (`gate`/`reply`/`sandbox_reply`/`run_command`/`delegate_task`/`sandbox_provision`/`sandbox_exec`/`file_upload`/`file_search`), `key_source` (`worker` = arbe pays, `house`/`env` = the house pays), `model_ref`, `input_tokens`, `output_tokens`, `cost_usd` (real OpenRouter-reported cost via `usage: {include: true}`), `meta`.
2. **PostHog** (`packages/core/usage-posthog.ts`) — `$ai_generation` for LLM spend (feeds the AI Observability dashboards; no prompt/output content), `arbe_usage` for the rest. House attached as group, joined to the ledger row by `$ai_trace_id` (`trace_id` on `arbe_usage`).
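The event shape and the PostHog routing rule from the two sinks can be sketched as below. Field names follow the column list; which fields are optional, and the helpers `posthogEventName`, `traceProperty`, and `arbePays`, are assumptions for illustration rather than the contents of `usage.ts`.

```typescript
type KeySource = 'worker' | 'house' | 'env';

interface UsageEvent {
  house_id: string;
  agent_id: string;
  thread_id: string;
  capability: string; // open set: 'llm', 'sandbox', 'file_index', ...
  seam: string;       // 'gate', 'reply', 'sandbox_exec', ...
  key_source: KeySource;
  trace_id: string;
  model_ref?: string;
  input_tokens?: number;
  output_tokens?: number;
  cost_usd?: number;
  meta?: Record<string, unknown>;
}

// LLM spend goes to the AI Observability event; everything else to arbe_usage.
function posthogEventName(e: UsageEvent): '$ai_generation' | 'arbe_usage' {
  return e.capability === 'llm' ? '$ai_generation' : 'arbe_usage';
}

// The join key is named differently on the two PostHog events.
function traceProperty(e: UsageEvent): Record<string, string> {
  return e.capability === 'llm' ? { $ai_trace_id: e.trace_id } : { trace_id: e.trace_id };
}

// 'worker' means arbe pays; 'house' and 'env' mean the house pays.
function arbePays(e: UsageEvent): boolean {
  return e.key_source === 'worker';
}
```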

Wiring: `hooks.server.ts` calls `configureUsage({supabase, posthogKey, posthogHost})` once at boot — the ledger needs a service-role client (`usage_events` RLS rejects request-scoped ones) and core can't read `$env`. Scripts call it themselves; unconfigured, `recordUsage` warns and drops the row. The hooks also hand `flushUsage()` to `waitUntil` after every request so workerd doesn't cancel in-flight writes; dispatch drains its own in a `finally`.
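The configure / record / flush contract can be modelled in a few lines. This is a simplified sketch, not the real module: the real `configureUsage` takes `{supabase, posthogKey, posthogHost}`, whereas the stand-in below takes a single hypothetical `Sink` function, and warnings are collected in an array instead of being logged.

```typescript
type Row = Record<string, unknown>;
type Sink = (row: Row) => Promise<void>;

let sink: Sink | null = null;
const pending = new Set<Promise<void>>();
const warnings: string[] = [];

function configureUsage(s: Sink): void {
  sink = s;
}

// Fire-and-forget: returns immediately and never throws, even if the sink fails.
function recordUsage(row: Row): void {
  const current = sink;
  if (current === null) {
    warnings.push('usage not configured; dropping row');
    return;
  }
  const write: Promise<void> = Promise.resolve()
    .then(() => current(row))
    .catch((err: unknown) => {
      warnings.push('usage write failed: ' + String(err));
    })
    .then(() => {
      pending.delete(write); // settled writes leave the in-flight set
    });
  pending.add(write);
}

// Resolves once every write that is currently in flight has settled.
async function flushUsage(): Promise<void> {
  await Promise.all(Array.from(pending));
}
```

In a request hook the last step would be handing `flushUsage()` to `waitUntil`, so the runtime keeps the request alive until the in-flight writes settle.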

Key-source semantics: [llm-keys](./llm-keys.md). Smoke probe: `bun --env-file=apps/www/.env --env-file=apps/www/.env.local run packages/core/scripts/test-llm-tracking.ts` — one real LLM call, asserts the ledger row, prints the `posthog-cli` query for the PostHog half.

## Code

One server-side PostHog transport: `capturePosthogEvent` in `@arbe/core/posthog` (direct POST to `/capture/` — workerd-safe; don't add `posthog-node`). Both `apps/www/src/lib/server/posthog.ts` (route capture) and `packages/core/usage-posthog.ts` (usage sink) delegate to it.
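A direct POST to `/capture/` needs only a JSON body, which is why no SDK is required. The sketch below builds such a request without sending it; `buildCaptureRequest` and its input shape are hypothetical, and the real `capturePosthogEvent` signature is not shown in this document. The body fields (`api_key`, `event`, `distinct_id`, `properties`, `timestamp`) and the `$geoip_disable` property follow PostHog's capture API.

```typescript
interface CaptureInput {
  host: string;
  apiKey: string;
  event: string;
  distinctId: string;
  properties: Record<string, unknown>;
  timestamp: string; // ISO 8601
}

interface CaptureRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

function buildCaptureRequest(input: CaptureInput): CaptureRequest {
  const base = input.host.replace(/\/+$/, ''); // tolerate trailing slashes
  return {
    url: base + '/capture/',
    init: {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        api_key: input.apiKey,
        event: input.event,
        distinct_id: input.distinctId,
        // Set last so callers cannot re-enable GeoIP enrichment.
        properties: { ...input.properties, $geoip_disable: true },
        timestamp: input.timestamp,
      }),
    },
  };
}
```

A caller would pass the result to `fetch(req.url, req.init)` and hand the returned promise to `waitUntil`.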

See [debugging](./debugging.md), [observability](./observability.md), [llm-keys](./llm-keys.md).

_Gaps: `signed_up` vs `signed_in` not distinguished (`ensureHumanAgent` is idempotent); `signal.agent.created` and `signal.member.joined` aren't in core's vocabulary yet — currently PostHog-only; read-path activity not instrumented._

_Rejected: per-entry `thread.entry.created` PostHog events — too high-volume, weak dashboard value, messy consent; if activity trends are ever needed in PostHog, emit periodic aggregates instead._
