Layers

The point isn’t “automations” as a feature. It’s agents that perceive events, decide, act with permission, use real runtimes, and leave durable state that humans and other agents can inspect. “Say hello every hour in #general” should fall out of the stack, not require a bespoke feature. Two things must not be confused: the shared world (identity, scope, permissions, history, durable outputs) and execution (runs, sessions, sandboxes, tools, memory, retries, approvals, results). The system gets messy when execution state pretends to be world state, or when local operator state pretends to be the shared substrate.

1. substrate          records, streams, agents, permissions, mutations, signals
2. execution kernel   activation policy, run, session, capability, sandbox lease, memory, artifact
3. orchestration      tasks, loops, stuck detection, control plane (where arbe is strongest)
4. adapters           Discord, Slack, GitHub, webhook, cron, MCP — ports, not the product

Substrate primitives are the only durable concepts that survive a UI rewrite, a runtime rewrite, or a product metaphor change — see primitives. A room, a session, a Discord target, a tool, a run artifact, a bot, a house, a webhook endpoint, and a code review thread can all be represented as records in different relational positions with different content schemas.
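As a minimal sketch of "records in different relational positions with different content schemas", assume a record carries a kind, a parent pointer, and free-form content. The field names (`kind`, `parent_id`, `content`) and the sample data are illustrative assumptions, not the substrate's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical record shape; every field name here is an assumption."""
    id: str
    kind: str                  # names the content schema, e.g. "room" or "tool"
    parent_id: Optional[str]   # relational position: where the record sits
    content: dict[str, Any] = field(default_factory=dict)


def children(records: list[Record], parent_id: str) -> list[Record]:
    """Return the records positioned directly under parent_id, in input order."""
    return [r for r in records if r.parent_id == parent_id]


# A house, a room, a tool, and a session are all the same type of object.
records = [
    Record("house-1", "house", None, {"name": "demo"}),
    Record("room-1", "room", "house-1", {"title": "general"}),
    Record("tool-1", "tool", "house-1", {"name": "http"}),
    Record("sess-1", "session", "room-1", {"agent_id": "bot-1"}),
]

print([r.kind for r in children(records, "house-1")])
```

Only `kind` and `content` differ between a room and a tool in this sketch; the position in the tree is carried by `parent_id` alone.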

Execution-kernel concepts are stable enough to name but not irreducible. A sandbox is not a substrate primitive; it’s an embodiment. Some runs need no sandbox at all: a scheduled Discord message can run inside a Durable Object (DO) with an HTTP capability. A coding agent needs a real filesystem, process tree, network policy, and git checkout, and that’s where Fly sprites or Cloudflare runtimes enter. The sandbox lease answers where side effects happen, in a form that is inspectable and replaceable. The Durable Object is the durable brain. The sandbox is the temporary body.
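The idea that a sandbox is optional can be sketched as a run whose lease field may be empty. The capability names (`"http"`, `"fs"`, `"process"`, `"git"`), the `SandboxLease` fields, and the rule in `needs_sandbox` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SandboxLease:
    """Hypothetical lease: records where side effects happen."""
    provider: str    # placeholder label, not a real provider identifier
    workspace: str


@dataclass
class Run:
    id: str
    agent_id: str
    capabilities: list[str]
    lease: Optional[SandboxLease] = None   # None means the run has no body


def needs_sandbox(capabilities: list[str]) -> bool:
    # Assumption: only filesystem, process, or git access requires a sandbox.
    return any(c in {"fs", "process", "git"} for c in capabilities)


def start_run(run_id: str, agent_id: str, capabilities: list[str]) -> Run:
    """Attach a lease only when the requested capabilities call for one."""
    lease = None
    if needs_sandbox(capabilities):
        lease = SandboxLease("example-provider", f"/work/{run_id}")
    return Run(run_id, agent_id, capabilities, lease)


hello = start_run("run-1", "bot-1", ["http"])                  # scheduled message
coder = start_run("run-2", "bot-2", ["git", "fs", "process"])  # coding agent
print(hello.lease, coder.lease)
```

Because the lease is a plain value on the run, it can be inspected after the fact or swapped for a different provider without changing the run itself.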

Orchestration is where arbe is strongest. Tasks, runs, sessions, sandboxes, and event streams are the execution control plane for code work in repositories. Two things follow: arbe task is not a universal primitive — it’s a specialised, opinionated, local-first planning surface for repo work; and run/session/sandbox are more general — they project into shared records and streams when work is collaborative or remote. Repo-local task state stays in-repo because git is the right durability boundary for code tasks. Local SQLite observability is a projection because it’s operator-local and checkout-scoped. Shared runs/sessions/resources/results live in the substrate when they matter beyond one checkout.

Adapters feed the same kernel objects as any other surface. A Discord event creates an activation, then a run, then a session, then results — not a special Discord-only path. Adapters are ports, not the product.
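One way to picture "ports, not the product" is a set of small functions that translate external events into one shared activation type, with a single handler behind them. Everything named here (`Activation`, `from_discord`, `from_cron`, `handle`, the event dictionary keys, and the `"system"` author for cron) is a hypothetical sketch rather than the project's API.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Activation:
    source: str      # which adapter produced it
    scope: str       # where it happened
    author_id: str   # every action has an author
    payload: dict


@dataclass
class Run:
    id: str
    activation: Activation
    session: list = field(default_factory=list)   # append-only narrative
    results: list = field(default_factory=list)


def from_discord(event: dict) -> Activation:
    """Adapter: translate an assumed Discord-style event."""
    return Activation("discord", event["channel"], event["user"],
                      {"text": event["content"]})


def from_cron(tick: dict) -> Activation:
    """Adapter: translate an assumed scheduler tick."""
    return Activation("cron", tick["scope"], "system", {"text": tick["prompt"]})


_run_ids = count(1)


def handle(activation: Activation) -> Run:
    """The single path every adapter feeds: activation, run, session, results."""
    run = Run(f"run-{next(_run_ids)}", activation)
    run.session.append(("activation", activation.payload["text"]))
    run.results.append(f"handled {activation.source} event in {activation.scope}")
    return run


a = handle(from_discord({"channel": "#general", "user": "u1", "content": "hi"}))
b = handle(from_cron({"scope": "#general", "prompt": "say hello"}))
print(a.results, b.results)
```

The handler never branches on `source`; adding a new adapter in this sketch means adding one translation function and nothing else.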

Storage boundaries. Postgres = shared structural truth (agents, scopes, tools, resources, activation policies, artifacts, shared run metadata). Durable streams = append-only narratives inside scopes (sessions, rooms, approvals, runs). Durable Object SQLite = private working state and caches — fast, runtime-local, never the only copy of anything that matters across restarts. Sandbox FS = embodiment-local side effects, useful but not authoritative. Repo-local SQLite + .arbe/tasks/ = local projections and code-work planning state, intentionally narrow until there’s a real need to share globally.

A useful operator view needs four observation layers, joined by agent_id and time:

1. mutations     what changed structurally: Postgres row writes plus narrating signal.* entries; no separate log
2. activations   per-agent decision log (wake, respond, why not, latency, model); also not its own table yet
3. streams       what was said, replayable from any offset
4. signals       system telemetry (latencies, retries, errors, resource usage); the sixth primitive

Every signal has an agent_id because every action has an author. See observability for what’s implemented today.
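Joining the layers by agent_id and time can be sketched as a filtered merge of time-sorted lists. The `Entry` shape and the sample values are assumptions for illustration; the real layers live in Postgres rows and streams, not in-memory lists.

```python
import heapq
from typing import NamedTuple


class Entry(NamedTuple):
    ts: float        # timestamp; each layer is assumed sorted by it
    agent_id: str
    layer: str
    detail: str


def timeline(agent_id: str, *layers: list[Entry]) -> list[Entry]:
    """Merge time-sorted layers into one chronological view for one agent."""
    mine = ([e for e in layer if e.agent_id == agent_id] for layer in layers)
    return list(heapq.merge(*mine, key=lambda e: e.ts))


mutations = [Entry(1.0, "bot-1", "mutation", "record created")]
activations = [
    Entry(2.0, "bot-1", "activation", "wake"),
    Entry(2.5, "bot-2", "activation", "skip: cooldown"),
]
streams = [Entry(3.0, "bot-1", "stream", "posted reply")]
signals = [Entry(2.2, "bot-1", "signal", "llm latency recorded")]

view = timeline("bot-1", mutations, activations, streams, signals)
for entry in view:
    print(entry.ts, entry.layer, entry.detail)
```

The merge only works because every entry carries both keys, which is the practical reason for requiring an agent_id on every signal.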

What already works: agents (human and bot) are records with permissions; a bot wakes when the app dispatches an activation; the dispatcher decides whether to respond (cooldown, ambient gate); the agent then reads the stream, calls an LLM, and posts a reply directly. No polling, no self-triggering loops. Self-triggering is prevented structurally: dispatch excludes the message author, and agent replies bypass the proxy. An agent today is identity + system prompt + model + trigger mode + permissions; on activation it gets the last 50 messages plus house context, responds or doesn’t, then forgets. For agents to truly roam, three things must connect: memory, tools, and scheduled activation. None requires a new primitive (see memory, capabilities, activation).
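The dispatch decision described above can be sketched as a short chain of checks. The 50-message window comes from the text; the `Agent` and `Message` fields, the reason strings, and the treatment of the ambient gate as a precomputed boolean are assumptions, since the gate's actual logic is not described here.

```python
from dataclasses import dataclass

CONTEXT_WINDOW = 50  # "the last 50 messages"


@dataclass
class Agent:
    id: str
    cooldown_s: float
    last_reply_at: float = float("-inf")   # never replied yet


@dataclass
class Message:
    author_id: str
    text: str
    ts: float


def should_respond(agent: Agent, msg: Message,
                   ambient_gate_open: bool) -> tuple[bool, str]:
    """Return (decision, reason) so the 'why not' can be logged."""
    if msg.author_id == agent.id:
        return False, "author excluded"    # structural self-trigger guard
    if msg.ts - agent.last_reply_at < agent.cooldown_s:
        return False, "cooldown"
    if not ambient_gate_open:
        return False, "ambient gate"
    return True, "respond"


def context_for(stream: list[Message]) -> list[Message]:
    """The agent sees only the tail of the stream, then forgets."""
    return stream[-CONTEXT_WINDOW:]


bot = Agent("bot-1", cooldown_s=30.0)
print(should_respond(bot, Message("u1", "hello", 100.0), True))
print(should_respond(bot, Message("bot-1", "hello", 100.0), True))
```

Returning the reason alongside the decision keeps the "why not" available for an activation log without a second code path.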

The system isn’t trying to be a chatbot with plugins, a code agent with a nice shell, or an automation builder with cron forms. It’s a permissioned multi-agent substrate with an inspectable execution kernel. Chat, coding, cron jobs, webhooks, incident response, Discord bots, and review agents are different surfaces over the same stack.

See: thinking/primitives, thinking/activation, thinking/capabilities, thinking/memory, debugging.