Layers

The point is not “automations” as a product feature. The point is agents that can perceive events, decide, act with permission, use real runtimes, and leave behind durable state that humans and other agents can inspect.

“Say hello every hour in our Discord general chat” should not require a bespoke feature. It should fall out of the stack.

Core claim

There are two different things here and they should not be confused.

The first is the shared world: identity, scope, permissions, history, and durable outputs.

The second is execution: runs, sessions, sandboxes, tools, memory, retries, approvals, and results.

The system gets messy when execution state pretends to be world state, or when local operator state pretends to be the shared substrate.

The stack should be explicit:

  1. substrate
  2. execution kernel
  3. orchestration
  4. adapters

“Automation” is not a primitive. It is a composition across these layers.
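
A sketch of what that composition might look like for the hourly Discord hello, assuming an activation policy and a capability grant are both stored as record content. Every type, field, and capability name here is hypothetical. Note that nothing is called an automation.

```typescript
// Hypothetical record contents. No "Automation" type exists anywhere.
interface ActivationPolicy {
  agentId: string;
  kind: "schedule";
  everyMs: number;
}

interface CapabilityGrant {
  agentId: string;
  capability: string; // e.g. "discord.post" (illustrative name)
  target: string;     // e.g. a channel identifier
}

const HOUR_MS = 60 * 60 * 1000;

// Substrate: two records attached to an agent.
const helloPolicy: ActivationPolicy = { agentId: "greeter", kind: "schedule", everyMs: HOUR_MS };
const helloGrant: CapabilityGrant = { agentId: "greeter", capability: "discord.post", target: "general" };

// Kernel: the activation policy decides whether the agent should wake.
function isDue(policy: ActivationPolicy, lastFiredMs: number | null, nowMs: number): boolean {
  return lastFiredMs === null || nowMs - lastFiredMs >= policy.everyMs;
}

// Kernel: the capability grant decides whether the side effect is allowed.
function mayPost(grants: CapabilityGrant[], agentId: string, target: string): boolean {
  return grants.some(
    (g) => g.agentId === agentId && g.capability === "discord.post" && g.target === target,
  );
}
```

An adapter would perform the actual post once `isDue` and `mayPost` both hold. The feature is the sum of a policy record, a grant record, and a port.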

Substrate

The substrate is the shared, permissioned world. Its primitives: records, streams, agents, permissions, mutations, signals. See primitives.

These are the only durable concepts that should survive a UI rewrite, a runtime rewrite, or a product metaphor change.

A room, a session, a Discord target, a tool, a run artifact, a bot, a house, a webhook endpoint, and a code review thread can all be represented as records in different relational positions with different content schemas.

A stream is the append-only history of what happened in a scope. Mutations are the append-only history of structural change: every record or permission change. The two must stay distinct.

The substrate answers: who exists, what scopes exist, who can do what, what happened in a scope, what changed structurally.

It does not answer how a runtime cached context, which process ID ran a command, or which local CLI instance last observed a session. Those belong higher up.
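
A minimal sketch of the stream/mutation split, assuming both are simple append-only logs. The type names, field names, and the in-memory log are illustrative and do not reflect the actual schema.

```typescript
// Hypothetical shapes; none of these names come from the actual schema.

// A record is any durable thing: room, session, tool, bot, webhook endpoint.
interface SubstrateRecord {
  id: string;
  kind: string;            // e.g. "room", "agent", "tool"
  parentId: string | null; // relational position
  content: unknown;        // schema depends on kind
}

// Stream entry: what happened in a scope (content history).
interface StreamEntry {
  scopeId: string;
  agentId: string;
  body: string;
}

// Mutation entry: what changed structurally (audit history).
interface MutationEntry {
  seq: number;
  agentId: string;
  recordId: string;
  change: "create" | "update" | "delete";
}

// Append-only log: entries can be added and read, never edited or removed.
class AppendOnlyLog<T> {
  private readonly entries: T[] = [];

  append(entry: T): number {
    this.entries.push(entry);
    return this.entries.length - 1; // offset of the new entry
  }

  readFrom(offset: number): readonly T[] {
    return this.entries.slice(offset);
  }
}

const room: SubstrateRecord = { id: "room-1", kind: "room", parentId: null, content: {} };
const roomStream = new AppendOnlyLog<StreamEntry>();
const mutationLog = new AppendOnlyLog<MutationEntry>();

// Creating the room is a structural change, so it goes to the mutation log.
mutationLog.append({ seq: 1, agentId: "alice", recordId: room.id, change: "create" });
// A message in the room is content, so it goes to the room's stream.
roomStream.append({ scopeId: room.id, agentId: "alice", body: "hello" });
```

Neither log knows anything about process IDs, caches, or CLI instances. That absence is the point.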

What already works

Agents (human and bot) are just records with permissions. A bot wakes up when the app dispatches an activation event to its Durable Object. The DO decides whether to respond (cooldown, relevance gate for ambient mode). It reads the stream, builds context from cached messages, calls an LLM, and posts a reply directly to the stream. No polling, no self-triggering loops. This pipeline is solid.
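
The decision step might look roughly like this. It is a sketch under assumptions: the "mention" mode, the numeric relevance score, and all field names are invented for illustration. Only the cooldown and the ambient-only relevance gate come from the description above.

```typescript
type TriggerMode = "mention" | "ambient"; // "mention" is an assumed second mode

interface GateInput {
  mode: TriggerMode;
  nowMs: number;
  lastReplyMs: number | null; // null if the agent has never replied
  cooldownMs: number;
  relevance: number;          // 0..1 score from some upstream scorer (assumed)
  relevanceThreshold: number;
}

type GateDecision =
  | { respond: true }
  | { respond: false; reason: "cooldown" | "not_relevant" };

function decideActivation(input: GateInput): GateDecision {
  const inCooldown =
    input.lastReplyMs !== null && input.nowMs - input.lastReplyMs < input.cooldownMs;
  if (inCooldown) return { respond: false, reason: "cooldown" };

  // The relevance gate only applies in ambient mode.
  if (input.mode === "ambient" && input.relevance < input.relevanceThreshold) {
    return { respond: false, reason: "not_relevant" };
  }
  return { respond: true };
}
```

Returning a reason alongside the refusal keeps the "why not" available for the activation log.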

Self-triggering is prevented structurally: dispatch excludes the message author, and agent replies bypass the proxy so dispatch never fires for them.
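
A sketch of why this is structural rather than a runtime check. It assumes two write paths into a scope: a proxy path that dispatches, and a direct path that does not. The class and method names are hypothetical.

```typescript
interface PostedMessage {
  authorId: string;
  body: string;
}

interface Dispatch {
  agentId: string;
  message: PostedMessage;
}

class ScopeStream {
  readonly messages: PostedMessage[] = [];
  readonly dispatched: Dispatch[] = [];
  private readonly agentIds: string[];

  constructor(agentIds: string[]) {
    this.agentIds = agentIds;
  }

  // Direct path, used by agent replies. There is no dispatch code here at all.
  appendDirect(message: PostedMessage): void {
    this.messages.push(message);
  }

  // Proxy path, used by everything else. Appends, then wakes every agent
  // in the scope except the author of the message.
  postViaProxy(message: PostedMessage): void {
    this.messages.push(message);
    for (const agentId of this.agentIds) {
      if (agentId !== message.authorId) {
        this.dispatched.push({ agentId, message });
      }
    }
  }
}
```

An agent reply written through `appendDirect` cannot wake anyone, including itself, because that path never reaches dispatch. The author exclusion covers the remaining case where an agent's message does go through the proxy.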

An agent today is: identity + system prompt + model + trigger mode + permissions. When activated, it gets the last 50 messages plus house context. It responds or doesn’t. Then it forgets.
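
As a sketch, assuming messages are plain strings and context is a flat list. The field names are illustrative; the limit of 50 is the only number taken from the description above.

```typescript
interface RoamingAgent {
  id: string;
  systemPrompt: string;
  model: string;
  triggerMode: "mention" | "ambient"; // mode names assumed
  permissions: string[];
}

const CONTEXT_MESSAGE_LIMIT = 50;

// Pure function: nothing is written anywhere, so nothing is remembered.
function buildContext(agent: RoamingAgent, houseContext: string, messages: string[]): string[] {
  const recent = messages.slice(-CONTEXT_MESSAGE_LIMIT);
  return [agent.systemPrompt, houseContext, ...recent];
}
```

The forgetting is visible in the signature. Context goes in, a reply comes out, and no state carries over to the next activation.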

For agents to truly roam, three things need to connect: memory, tools, and scheduled activation. That is the minimum for autonomy, and none of the three requires a new primitive. See memory, capabilities, activation.

Execution kernel

The execution kernel is the standard pattern by which an agent turns permissioned context into side effects.

These are kernel concepts, not substrate primitives:

  • activation policy — what wakes an agent. See activation.
  • run — one top-level execution attempt. See runs and sessions.
  • session — the conversational and operational thread inside a run.
  • capability — the right to cause a class of side effects. See capabilities.
  • sandbox lease — the runtime binding for a run or agent. Where side effects happen.
  • memory — four kinds, not one. See memory.
  • artifact/result — the durable outcome of work.

They are stable enough to name, but they are not irreducible in the same way records and streams are.

A sandbox is not a substrate primitive. It is an embodiment. Some runs need no sandbox at all. A scheduled Discord message can run inside a Durable Object plus an HTTP capability. A coding agent needs a real filesystem, process tree, network policy, and git checkout. That is where Fly sprites or Cloudflare runtimes enter.

A sandbox lease answers where side effects happen. It should be inspectable and replaceable. The Durable Object is the durable brain. The sandbox is the temporary body.
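
One way to keep the lease optional is to model it as a tagged union and acquire it only when the requested capabilities demand a body. This is a sketch; the capability names, the lease fields, and `planRun` are all hypothetical.

```typescript
type SandboxLease =
  | { kind: "none" }
  | { kind: "sandbox"; provider: string; leaseId: string; expiresAtMs: number };

// Capabilities that need a real filesystem or process tree (illustrative names).
const NEEDS_SANDBOX = new Set(["fs.write", "process.spawn", "git.checkout"]);

function requiresSandbox(capabilities: string[]): boolean {
  return capabilities.some((c) => NEEDS_SANDBOX.has(c));
}

interface KernelRun {
  id: string;
  agentId: string;
  capabilities: string[];
  lease: SandboxLease; // inspectable: the run always says where its effects happen
}

// `acquire` is only called when a sandbox is actually needed.
function planRun(
  id: string,
  agentId: string,
  capabilities: string[],
  acquire: () => SandboxLease,
): KernelRun {
  const lease: SandboxLease = requiresSandbox(capabilities) ? acquire() : { kind: "none" };
  return { id, agentId, capabilities, lease };
}
```

Because the lease is a field on the run rather than the run itself, swapping the sandbox provider changes `acquire` and nothing else.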

Orchestration

Orchestration is the layer that selects work, correlates runs, and gives operators a legible control plane.

This is where arbe is strongest.

Tasks, runs, sessions, sandboxes, and event streams are not competing with the substrate. They are the execution control plane for a particular domain, especially code work in repositories.

That means two important things.

First, arbe task is not a universal primitive. It is a specialized planning surface for repository work. That is fine. It should stay opinionated and local-first.

Second, run, session, and sandbox are more general. They are execution-kernel concepts that should be able to project into shared records and streams when the work is collaborative or remote.

The clean relationship is:

  • repo-local task state remains in-repo because git is the right durability boundary for code tasks
  • local SQLite observability remains a projection because it is operator-local and checkout-scoped
  • shared runs, sessions, resources, and results live in the shared substrate when they matter beyond one checkout

Do not force all repo tasks into global Postgres. That would be premature and annoying. Do not pretend local SQLite is the global truth either. It is a projection.

Adapters

Adapters connect the stack to the outside world: Discord, Slack, GitHub, Linear, PagerDuty, email, webhook ingress, cron scheduling, MCP servers, and local CLI surfaces.

An adapter turns an external event into an activation input, or an internal decision into an external side effect.

Adapters are not the product. They are ports.

The important design rule is that adapters should feed the same kernel objects as any other surface. A Discord event should not create a special Discord-only execution path. It should create an activation, then a run, then a session, then results.
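
A sketch of the rule, assuming adapters normalize events into one input type before the kernel sees them. The event shapes are simplified stand-ins, not the real Discord or webhook payload formats, and every name here is hypothetical.

```typescript
interface ActivationInput {
  source: string;  // which adapter produced this
  scopeId: string;
  authorId: string;
  text: string;
}

// Simplified stand-ins, not real payload formats.
interface DiscordLikeEvent { channelId: string; userId: string; content: string }
interface WebhookLikeEvent { endpointId: string; sender: string; body: string }

// Adapters: the only code that knows about external shapes.
function fromDiscord(e: DiscordLikeEvent): ActivationInput {
  return { source: "discord", scopeId: e.channelId, authorId: e.userId, text: e.content };
}

function fromWebhook(e: WebhookLikeEvent): ActivationInput {
  return { source: "webhook", scopeId: e.endpointId, authorId: e.sender, text: e.body };
}

interface KernelObjects {
  activation: { id: string; input: ActivationInput };
  run: { id: string; activationId: string };
  session: { id: string; runId: string };
  results: string[];
}

let nextId = 0;
function newId(prefix: string): string {
  nextId += 1;
  return `${prefix}-${nextId}`;
}

// The single kernel entry point. It has no branch on input.source.
function handleActivation(input: ActivationInput): KernelObjects {
  const activation = { id: newId("act"), input };
  const run = { id: newId("run"), activationId: activation.id };
  const session = { id: newId("ses"), runId: run.id };
  const results = [`handled: ${input.text}`];
  return { activation, run, session, results };
}
```

Adding a Slack or email adapter means writing one more `from…` function. `handleActivation` does not change.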

Storage boundaries

The current storage story is not wrong. It is undernamed.

Postgres is the shared structural truth. It should hold records for agents, scopes, tools, resources, activation policies, artifacts, and any shared run metadata we want visible across users and surfaces.

Durable streams hold append-only narratives and event histories inside scopes. Session streams, room streams, approval streams, and run streams belong here.

Durable Object SQLite holds private working state and caches. It is fast and close to execution, but it is not the long-term source of truth for anything another runtime must reliably observe.

Sandbox filesystems and process state hold embodiment-local side effects. They are useful and real, but not authoritative by themselves.

Repo-local SQLite and repo task files are local projections and planning state for code work. They are intentionally narrow and should remain so until there is a real need to share them globally.

If a piece of state matters across agents, surfaces, or runtime restarts, it should not exist only in a local SQLite file or a Durable Object cache.
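
That rule is mechanical enough to check. A sketch, assuming each piece of state can be described by where it is written and who needs it. The tier names and descriptor fields are invented here, and the check treats sandbox filesystems as local too, in line with "not authoritative by themselves".

```typescript
type StorageTier = "postgres" | "durable_stream" | "do_sqlite" | "sandbox_fs" | "repo_local";

interface StateDescriptor {
  label: string;
  tiers: StorageTier[]; // every place this state is currently written
  sharedAcrossAgents: boolean;
  sharedAcrossSurfaces: boolean;
  mustSurviveRestart: boolean;
}

// Tiers that other agents, surfaces, and runtimes can reliably observe.
const SHARED_TIERS: ReadonlySet<StorageTier> = new Set<StorageTier>(["postgres", "durable_stream"]);

function violatesStorageRule(s: StateDescriptor): boolean {
  const matters = s.sharedAcrossAgents || s.sharedAcrossSurfaces || s.mustSurviveRestart;
  const hasSharedHome = s.tiers.some((t) => SHARED_TIERS.has(t));
  return matters && !hasSharedHome;
}
```

State may still be cached locally. The violation is having no shared home at all.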

Four observation layers

  1. Mutations — structural audit trail. Every record/permission change logged with agent_id and monotonic seq. This is the “what changed” layer.
  2. Activations — per-agent decision log in DO SQLite. Did it wake, did it respond, why not, how long, what model. This is the “what agents did” layer.
  3. Streams — the content itself. Replayable from any offset. This is the “what was said” layer.
  4. Signals — the sixth primitive. System telemetry: latencies, retries, errors, resource usage. Not structure (mutations) and not content (streams). Every signal has an agent_id because every action has an author.

For a useful operator view, you need: mutations for structure changes, activations for agent behavior, stream replay for conversation context, and signals for system health. A dashboard that joins these four by agent_id and time gives you full observability of agents roaming across rooms.
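
The join can be sketched as a filter and a sort, assuming each layer can be projected into a common event shape. The `ObservedEvent` shape and the function name are hypothetical, and a real dashboard would query each store rather than hold arrays in memory.

```typescript
type ObservationLayer = "mutation" | "activation" | "stream" | "signal";

interface ObservedEvent {
  layer: ObservationLayer;
  agentId: string;
  atMs: number;
  summary: string;
}

// Merge any number of layers into one time-ordered view of a single agent.
function agentTimeline(
  agentId: string,
  fromMs: number,
  toMs: number,
  ...layers: ObservedEvent[][]
): ObservedEvent[] {
  const all = ([] as ObservedEvent[]).concat(...layers);
  return all
    .filter((e) => e.agentId === agentId && e.atMs >= fromMs && e.atMs <= toMs)
    .sort((a, b) => a.atMs - b.atMs);
}
```

Called with the four layers for one agent and a time window, this gives the sequence an operator actually wants to read: what was changed, what it cost, what the agent decided, and what it said.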

See observability for what’s implemented today.

The direction

The system is not trying to be a chatbot with plugins, a code agent with a nice shell, or an automation builder with cron forms.

It is trying to be a permissioned multi-agent substrate with an inspectable execution kernel.

If that is right, then chat, coding, cron jobs, webhooks, incident response, Discord bots, and review agents are all just different surfaces over the same stack.