Arbe threads — schema & model

This is an attempt to evolve the design of the “arbe” platform, free of legacy, no constraints. How would we redo this project if we started from scratch?

THIS IS NOT BASED ON EXISTING CODE. THIS IS AN OPPORTUNITY TO RETHINK ASSUMPTIONS

What happens

The whole platform is one verb — someone says something, and work happens — and everything below is the machinery that keeps that durable, shared, and resumable.

someone (human or bot) says something in a thread
 └─ append to the thread ──▶ durable stream (everyone tailing sees it live)
     └─ is it work?
         ├─ no ──▶ bot replies in-process ──▶ another stream event
         └─ yes ──▶ resolve a sandbox (live→use · dead→resume · none→create)
             ├─ run_command (tool_call) ──▶ output back inline, same thread, same stream
             └─ delegate_task (tool_call) ──▶ spawn subagent thread + its own stream
                 └─ child gets its own fresh sandbox (isolated by default; or reuse an existing sandbox)
                     └─ agent output streams into child; parent gets an `agent_spawn` event

Requirements

What the model must support — technology-free. The schema below is derived from these, and any change we propose has to satisfy them. This is the list we check before flipping a decision.

  1. A thread is a durable, observable log of entries, shared by many agents — human and bot.
  2. Agents in a thread can run computer work in a sandbox: quick shell commands and long-running coding-agent tasks (for now, pi).
  3. A single thread can have several tasks running at once (“make a PDF”, “pull this”, “fix that”).
  4. Tasks can reuse one sandbox and its working tree when useful, or run isolated when not — chosen per task, not fixed ahead of time. The safe default is isolation: a delegated task gets its own fresh sandbox, so concurrent tasks never write one tree by accident. Reuse is the deliberate case — the caller points the child at an existing sandbox — while the everyday same-tree path is inline commands, which always run on the thread’s own sandbox.
  5. Work continues across agents and time: someone picks up a task hours later; it’s tied to the work (the thread), not to who started it.
  6. A sandbox is disposable: it can die and resume without losing the thread or its log.
  7. It just works: nobody hand-picks a sandbox; the right one is resolved automatically.
  8. A sandbox can exist with no thread (CLI, a warm pool) — permitted, not built. Threads point at sandboxes, not the reverse, so a thread-less sandbox breaks nothing; we just don’t invest until there’s a real use.

The model

An environment is a recipe, a sandbox is the dish, a thread is the log. A house is the tenant and owns everything; an agent is any actor, human or bot. Each noun lives in one of three layers:

Postgres (durable control-plane) — house; thread (metadata + a pointer to its stream); environment (the sandbox recipe: base image, repo, setup, limits); secret (encrypted house value, injected via secret_bindings); agent (global, not house-scoped); member (agent ↔ house + role); sandbox (handle for a cloud machine).

Durable stream (HTTP-observable, not Postgres) — the ordered events of each thread: messages, agent output, control events. One stream per thread; tail it to observe live, read from 0 to replay.

Ephemeral (cloud, no table) — the live sandbox machine (only its handle row persists) and the agent process inside it (its session persists as a thread).

Pinned threads and tags — A pinned thread is a place (one you land in, usually titled; work branches off via parent_thread_id); a tag is a set (a predicate over the free-form threads.tags). Collapsing the two is the mistake room made: it was both a place that owned a thread and a set that held siblings. Together they form an optional discovery layer, not a critical one.

Decision: one subagent status enum

Keep one row enum for subagent thread status. Do not split the row into lifecycle/result/reason columns yet. The row answers one queryable question: what bucket is this subagent thread in now? It is an index/cache over the stream, not the full runtime report.

Subagent row values:

idle | running | completed | failed | cancelled

Meanings:

  • idle — no run has started yet, or the thread is parked before work begins.
  • running — the live attempt: a coding-agent process being driven, emitting progress, or waiting on a human/tool/external condition (the stream says which).
  • completed — the latest run reached a successful rest point.
  • failed — the latest run ended unsuccessfully or was declared orphaned/stuck.
  • cancelled — the latest run was intentionally stopped.

This is deliberately not two enum sets. completed, failed, and cancelled are states in the subagent thread’s state machine; detailed causes are stream facts. A completed or failed subagent thread can be resumed into running by a later turn, so these are terminal for a run, not immutable for the thread row.
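A minimal sketch of that row-level state machine, in TypeScript. The transition table is inferred from this section rather than taken from any code; `SubagentStatus`, `ALLOWED_FLIPS`, and `flipStatus` are hypothetical names, and treating `cancelled` as resumable is an assumption (the text names only completed and failed explicitly).

```typescript
// Hypothetical names; only the five status values come from the text above.
type SubagentStatus = "idle" | "running" | "completed" | "failed" | "cancelled";

// Allowed row flips. The three rest states end a run, not the thread:
// a later turn may resume them into "running".
const ALLOWED_FLIPS: Record<SubagentStatus, readonly SubagentStatus[]> = {
  idle: ["running"],
  running: ["completed", "failed", "cancelled"],
  completed: ["running"],
  failed: ["running"],
  cancelled: ["running"], // assumption: the text names only completed/failed as resumable
};

function flipStatus(current: SubagentStatus, target: SubagentStatus): SubagentStatus {
  if (!ALLOWED_FLIPS[current].includes(target)) {
    throw new Error(`illegal status flip: ${current} -> ${target}`);
  }
  return target;
}

// One thread, two runs: the row passes through "running" twice.
const walk: SubagentStatus[] = ["running", "completed", "running", "failed"];
let rowStatus: SubagentStatus = "idle";
for (const target of walk) {
  rowStatus = flipStatus(rowStatus, target);
}
console.log(rowStatus); // the bucket the row rests in after the second run
```

The row only ever holds the current bucket; why a run failed or was cancelled would still be read from the stream.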

Actual Pi/arbe facts that shaped this:

  • Pi message_end carries message.stopReason on each assistant message. The real tool fixture in packages/core/pi/events.fixtures/tool-turn.jsonl shows stopReason:"toolUse" followed later by stopReason:"stop", so stop reason is per assistant turn/attempt, not a thread result.
  • packages/core/pi/events.ts lifts assistant messages into pi.assistant and returns attemptStopReason; it intentionally drops many provider/tool lifecycle details. Those remain stream entries or decoder signals, not columns.
  • packages/sandbox/src/daytona/decide-pi-outcome.ts maps Pi/process facts into canonical stream narration: stop|length → status_changed:completed, error → pi_failed + status_changed:failed, no Pi output → pi_session_orphaned + status_changed:failed. Exit code, stderr summaries, provider/model, and orphan silence stay on signal.thread.*, not the row.
  • arbe relay/dispatch already has separate stream signals for machinery: signal.dispatch.session_started, signal.dispatch.pi_waiting, signal.dispatch.*, signal.thread.pi_failed, signal.thread.pi_session_orphaned, signal.thread.status_changed, and signal.thread.child_finished. The row status should not duplicate those details.
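The mapping in the third bullet can be sketched as a small pure function. This is an illustration only, not the contents of decide-pi-outcome.ts: the `Outcome` shape and the `decideOutcome` name are hypothetical, and leaving the row untouched on `toolUse` is an assumption based on the fixture showing it as a mid-run stop reason.

```typescript
// Illustrative only: this is NOT the contents of decide-pi-outcome.ts.
type StopReason = "stop" | "length" | "error" | "toolUse";

interface Outcome {
  signals: string[]; // stream entries to append, in order
  rowStatus: "completed" | "failed"; // the bucket threads.status flips to
}

// lastStopReason is undefined when the process exited without any Pi output.
function decideOutcome(lastStopReason: StopReason | undefined): Outcome | null {
  if (lastStopReason === undefined) {
    return {
      signals: ["signal.thread.pi_session_orphaned", "signal.thread.status_changed"],
      rowStatus: "failed",
    };
  }
  if (lastStopReason === "error") {
    return {
      signals: ["signal.thread.pi_failed", "signal.thread.status_changed"],
      rowStatus: "failed",
    };
  }
  if (lastStopReason === "stop" || lastStopReason === "length") {
    return { signals: ["signal.thread.status_changed"], rowStatus: "completed" };
  }
  // Assumption: "toolUse" is a mid-run stop, so the row is left alone.
  return null;
}

console.log(decideOutcome("length"));
console.log(decideOutcome(undefined));
```

Exit codes, stderr summaries, and provider details would travel as payload on those signals, which is why the function returns signal names and a bucket and nothing else.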

Current-to-v2 mapping:

current value | v2 row status | where the detail goes
idle | idle | no process facts yet
running | running | dispatch/session/heartbeat signals in the stream
completed | completed | pi.assistant.stopReason of stop or length stays on Pi entries
failed | failed | signal.thread.pi_failed / process-exit detail in the stream
stuck | failed | detection path, e.g. pi_session_orphaned or pi_failed.errorClass = 'pi_exit'
blocked | running | waiting is still the live attempt; the wait condition lives in signal.dispatch.pi_waiting
cancelled | cancelled | abort/SIGTERM/provider-abort details in the terminal stream event
stuck is one value to delete. It names how failure was detected, not the state users act on. If a watchdog declares a run stuck, the row is failed; the stream says whether that was orphan silence, timeout, non-zero exit, or another detected path.

blocked is the other, deleted for the opposite reason: it’s a state with no reliable producer. Nothing owns the running→blocked detection or the flip back, so a row would sit in running while actually waiting — the worst case. So a waiting agent is just running; the reason it waits (signal.dispatch.pi_waiting) is a stream fact, not a row bucket. Add it back when a real producer and a consumer that needs to query “what’s waiting” both exist — we have enough to chew on first.
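If existing rows ever need translating, the mapping above reduces to a lookup. A minimal sketch, with `LegacyStatus`, `V2Status`, `LEGACY_TO_V2`, and `migrateStatus` as hypothetical names; typing the table as a `Record` makes the compiler reject a legacy value that has no v2 bucket.

```typescript
type LegacyStatus =
  | "idle" | "running" | "completed" | "failed"
  | "stuck" | "blocked" | "cancelled";
type V2Status = "idle" | "running" | "completed" | "failed" | "cancelled";

// Seven legacy values fold into five row buckets.
const LEGACY_TO_V2: Record<LegacyStatus, V2Status> = {
  idle: "idle",
  running: "running",
  completed: "completed",
  failed: "failed",
  stuck: "failed", // how the failure was detected stays in the stream
  blocked: "running", // the wait condition stays in signal.dispatch.pi_waiting
  cancelled: "cancelled",
};

function migrateStatus(legacy: LegacyStatus): V2Status {
  return LEGACY_TO_V2[legacy];
}

console.log(migrateStatus("stuck"), migrateStatus("blocked"));
```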


Schema (Postgres)

Naming & ids. Tables are plural (houses, threads, …), matching the live schema. Primary keys and the FKs that point at them are text short-ids (generate_short_id()): houses.id, environments.id, secrets.id, sandboxes.id, threads.id, and every house_id / environment_id / sandbox_id / parent_thread_id. The lone exception is agents: an agent’s id IS its Supabase auth.users id (the signup trigger inserts agents.id = auth.users.id), and auth.uid() — the value every RLS predicate and is_house_member compares against — is that uuid. So agents.id and every reference to it (agent_id, author_id, parent_agent_id, created_by) stay uuid; short-id-ing them would decouple identity from auth and force an auth.uid()→short-id translation on the hottest authz path, for no gain. vault_secret_id is also uuid — an external Supabase Vault id, not a row of ours.

-- ============================================================
-- DURABLE METADATA / CONTROL-PLANE (Postgres)
-- ============================================================
create table houses (
id text primary key default generate_short_id(),
name text not null,
default_environment_id text, -- recipe a sandbox-building hand falls back to when the thread has no environment_id; circular w/ environments.house_id, so a soft pointer. null = no default, so a null-env thread fails loudly on first command
created_at timestamptz not null default now()
);
-- actors: humans and bots. GLOBAL, not house-scoped — one agent per human (their OAuth
-- identity) can belong to many houses; a bot typically to one. Membership is the join.
-- ids are uuid here (the lone exception to the short-id convention): a human agent's id IS its
-- Supabase auth.users id (the signup trigger inserts agents.id = auth.users.id), and auth.uid() —
-- what every RLS predicate and is_house_member compares against — is that uuid. Bots share the
-- table, so they keep uuid too.
create table agents (
id uuid primary key default gen_random_uuid(),
kind text not null check (kind in ('human','bot')),
runtime text, -- 'codex' | 'claude-code' | 'pi' (bots only)
name text, -- live uses `name`, not `display_name`
created_at timestamptz not null default now(),
check (kind = 'bot' or runtime is null), -- humans have no runtime
check (kind <> 'bot' or runtime is not null) -- bots must name a runtime
);
-- membership + role: which agents are in a house and what they may do. The role
-- inherits down to every thread / environment / secret under the house — so a thread's
-- agents are its house's agents (any scope id resolves UP to its house; the house's
-- members are the access edge). No per-thread member list, and no intra-house privacy: the house IS
-- the privacy boundary (want a private space? make a house). Bots are always plain members
-- (app convention; agents.kind already distinguishes human from bot, so role isn't constrained in
-- the schema). The deferred frontier is the opposite — sharing a thread ACROSS houses; see Deferred.
create table members (
house_id text not null references houses(id) on delete cascade,
agent_id uuid not null references agents(id) on delete cascade, -- uuid: the auth.users id (see agents)
role text not null check (role in ('owner','member')),
joined_at timestamptz not null default now(),
primary key (house_id, agent_id)
);
create table environments (
id text primary key default generate_short_id(),
house_id text not null references houses(id) on delete cascade,
name text not null,
config jsonb not null default '{}', -- base image, repo, setup script, limits
secret_bindings jsonb not null default '[]', -- [{ name, required }] — house secrets to inject, by name
created_at timestamptz not null default now(),
unique (house_id, id) -- backs same-house composite FKs (sandboxes, threads)
);
-- secret lives in the HOUSE, reusable across environments. One value per name per
-- house — no per-author scoping (the house is the privacy boundary; see members). arbe: the
-- value sits in Supabase Vault (vault_secret_id), not inline.
create table secrets (
id text primary key default generate_short_id(),
house_id text not null references houses(id) on delete cascade,
name text not null,
vault_secret_id uuid not null, -- uuid: the external Supabase Vault id, not a row of ours
created_at timestamptz not null default now(),
unique (house_id, name)
);
-- The env→secret link is the secret_bindings array above: an environment names the
-- house secrets it injects, by name, resolved at dispatch — so a binding may reference a
-- secret that doesn't exist yet, and a missing REQUIRED one fails the dispatch loudly.
-- No join table; only if one name must hold a different value per env (DATABASE_URL staging vs
-- prod) does secret_bindings get promoted to an environment_secret row carrying it — not before.
-- a pure machine + working tree, born from an environment and provisioned lazily on first need. Carries NO thread link — threads point at it (threads.sandbox_id), so many threads can share one sandbox and its working tree. House-scoped only.
create table sandboxes (
id text primary key default generate_short_id(),
house_id text not null references houses(id) on delete cascade, -- tenant + cascade (clears a house's sandboxes on delete); denormalized for RLS, like every table
environment_id text not null, -- the recipe it was built from (same-house composite FK below)
provider text not null, -- 'sprite' | 'daytona' | ... (open set — no check)
provider_ref text, -- external id of the live machine
status text not null default 'pending' check (status in ('pending','live','dead')),
created_at timestamptz not null default now(),
destroyed_at timestamptz,
unique (house_id, id), -- backs threads.sandbox_id composite FK
foreign key (house_id, environment_id) references environments (house_id, id) -- recipe must be same-house
);
-- A thread is a general primitive: a durable log some set of agents share. Two axes describe it,
-- owned by different parties, so they use different mechanisms:
-- 1. system truth — is a bot DRIVING this thread? That fact is already carried by agent_id (null =
-- a chat; set = a subagent thread running a coding-agent session, whose progress is the thread's
-- status), so it's read off the driving bot rather than duplicated into a `kind` enum that could
-- drift from it. The distinction is binary and structural; it needs no label.
-- 2. human meaning — what someone calls this thread: `tags`, an open free-form set. Meaning is
-- unbounded and owned by users, so it can't be a closed enum — "activity-log", "support",
-- "release" are just labels.
-- The invariant that holds them apart: a tag NEVER changes how a thread behaves, and the system axis
-- NEVER becomes a user folder. System truth is derived and finite; human meaning is open and tagged.
-- (name + pinned_at are a third, separate thing — a thread's identity/prominence, i.e. the pinned-thread
-- place — not a classification axis.)
create table threads (
id text primary key default generate_short_id(),
house_id text not null references houses(id) on delete cascade, -- tenant
stream_id text not null unique, -- address of this thread's durable stream (often = id; both are text)
-- the thread's title (its label in the sidebar). null = untitled (a thread addressed to an
-- agent, or a subagent thread).
name text,
-- pinned = surfaced in the sidebar as a place you land in (null = unpinned). Independent of
-- name: the pin makes it a place, the name is just its label.
pinned_at timestamptz,
-- PARENT EDGE: at most one, each a REAL typed FK (exclusive arc). The non-null column is the
-- discriminator, and cascade + integrity come free, unlike a polymorphic parent_id. None set =
-- a root thread (a top-level house thread; pinned, it's a place you land in) — usually a chat, but a
-- driven thread may also stand alone here (agent_id set, no parent).
parent_thread_id text, -- a spawned child (subagent thread); same-house composite FK below, on delete cascade
parent_agent_id uuid references agents(id) on delete cascade, -- addressed TO an agent (house-visible, not private); uuid: an auth.users id
-- recipe this thread builds sandboxes from (same-house composite FK below). null = a pure-chat
-- thread (bot replies in-process), a valid mode, not a misconfig. Only the sandbox-building hands
-- (run_command, delegate_task) need it: at dispatch they resolve an explicit call env, else
-- threads.environment_id, else houses.default_environment_id, else fail loudly — action-conditional,
-- not a latent error like a missing required secret. (sandboxes.environment_id stays not-null, so a
-- sandbox always names a real recipe.)
environment_id text,
-- the sandbox this thread's session runs on (same-house composite FK below, on delete set null).
-- null = none yet, resolved lazily on first command. A pointer, not ownership: a child that
-- deliberately reuses an existing sandbox copies its sandbox_id (typically the parent's) and works the
-- same tree; isolation, the default, leaves this null → a fresh sandbox. on delete set null so the thread + its log outlive the machine.
sandbox_id text,
-- the bot DRIVING this thread (distinct from parent_agent_id). null = a chat thread (no driving
-- bot); set = a coding-agent session (usually a spawned child — a "subagent thread"). Its presence
-- is axis 1's discriminator (see the note above the table) — no `kind` column. We deliberately do NOT
-- add `check (agent_id is null or parent_thread_id is not null)`: a driven thread MAY stand alone as a
-- root thread (e.g. a cron/autonomous agent owning its own top-level thread), so "driven" does not
-- imply "has a parent". The standalone case isn't built yet — but it's permitted on purpose, not forbidden.
agent_id uuid,
-- human-meaning labels (axis 2 in the note above the table): open, free-form, ZERO engine meaning.
-- GIN-indexed below.
tags text[] not null default '{}',
-- chat liveness OR a subagent thread's current state, keyed on the driving bot (agent_id),
-- not a `kind` enum — ONE column carries the cheap query bucket (not a separate table; see
-- "No runs table" in Deferred). Subagent values are idle/running/completed/failed/cancelled
-- (their meanings, and why `stuck` and `blocked` aren't among them, live in "Decision: one subagent
-- status enum"). Subagent threads insert as 'idle'; the default 'open' is for chat threads.
status text not null default 'open' check (
(agent_id is null and status in ('open','closed')) or
(agent_id is not null and status in ('idle','running','completed','failed','cancelled'))
),
created_at timestamptz not null default now(),
updated_at timestamptz not null default now(),
unique (house_id, id), -- backs the same-house composite FKs (incl. the self-ref parent_thread_id)
check (num_nonnulls(parent_thread_id, parent_agent_id) <= 1), -- at most one parent edge
-- TENANT COHERENCE: every cross-table edge carries house_id so a careless dispatch can't attach
-- another house's row. Postgres validates FKs as a privileged system check that BYPASSES RLS, so a
-- plain environment_id/sandbox_id FK could point cross-house — and environments carry secret
-- bindings, i.e. another house's secret material injected here. The composite FKs below close it:
-- agent_id routes through members (agents are GLOBAL, so this is also what forces the driving bot to
-- be a member of the house); environment_id and sandbox_id are already house-local; parent_thread_id
-- keeps a child same-house as its parent. parent_agent_id stays a plain FK to the global agents table
-- on purpose: it's the addressee, not a driver — nothing executes by virtue of it, so it can't leak.
foreign key (house_id, agent_id) references members (house_id, agent_id),
foreign key (house_id, environment_id) references environments (house_id, id),
foreign key (house_id, sandbox_id) references sandboxes (house_id, id) on delete set null (sandbox_id), -- PG15+: nulls only sandbox_id
foreign key (house_id, parent_thread_id) references threads (house_id, id) on delete cascade -- child is same-house as parent
);
-- access paths
create index on threads (parent_thread_id); -- a thread's children (fan-out); keeps the composite cascade cheap
create index on threads (sandbox_id); -- threads sharing a sandbox (resume repoints them all when the machine dies)
create index on threads (house_id) where pinned_at is not null; -- sidebar: a house's pinned threads
create index on members (agent_id); -- which houses an agent is in (PK covers house → members)
create index on environments using gin (secret_bindings); -- blast-radius: which envs inject secret X (rotation/revocation audit)
create index on threads using gin (tags); -- "threads tagged X" — a column read, no join table
-- keep updated_at honest: Postgres bumps it on every thread row change (status flips, sandbox
-- repoints, renames) so no caller has to remember — same "the database owns the invariant" stance as
-- the composite FKs and CHECKs above. INSERT keeps the column default (created_at = updated_at).
create function set_updated_at() returns trigger as $$
begin
new.updated_at := now();
return new;
end;
$$ language plpgsql;
create trigger threads_set_updated_at
before update on threads
for each row execute function set_updated_at();
-- tenant scoping (back every house_id with an index; basis for RLS predicates)
create index on environments (house_id);
create index on secrets (house_id);
create index on sandboxes (house_id);
create index on threads (house_id);

Implementation note: the on delete set null (sandbox_id) column-list form needs Postgres 15+ (it nulls only sandbox_id, keeping house_id non-null). On older versions, null sandbox_id via a trigger instead; otherwise the composite FK’s plain set-null would try to null house_id (NOT NULL) and fail.
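The secret_bindings comment in the schema says bindings resolve by name at dispatch, and that a missing required secret fails the dispatch loudly. A minimal sketch of that resolution step, under stated assumptions: `resolveBindings` is a hypothetical name, the `Map` stands in for a lookup of already-decrypted house secrets, the secret names are made up, and silently skipping an absent optional secret is an assumption.

```typescript
// Shape of one entry in environments.secret_bindings.
interface SecretBinding {
  name: string;
  required: boolean;
}

// houseSecrets: secret name -> value for ONE house (hypothetical stand-in for the
// secrets table plus its Vault lookup).
function resolveBindings(
  bindings: SecretBinding[],
  houseSecrets: Map<string, string>,
): Record<string, string> {
  const injected: Record<string, string> = {};
  const missing: string[] = [];
  for (const binding of bindings) {
    const value = houseSecrets.get(binding.name);
    if (value !== undefined) {
      injected[binding.name] = value;
    } else if (binding.required) {
      missing.push(binding.name);
    }
    // Assumption: an optional binding with no secret yet is skipped silently.
  }
  if (missing.length > 0) {
    throw new Error(`dispatch failed, missing required secrets: ${missing.join(", ")}`);
  }
  return injected;
}

const demoSecrets = new Map<string, string>([["GITHUB_TOKEN", "example-value"]]);
const demoInjected = resolveBindings(
  [
    { name: "GITHUB_TOKEN", required: true },
    { name: "SENTRY_DSN", required: false },
  ],
  demoSecrets,
);
console.log(Object.keys(demoInjected));
```

Because the lookup is by name at dispatch time, an environment can list a binding before the secret row exists, which matches the "no join table" stance.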


Durable stream event (NOT Postgres — the HTTP log protocol)

Entries are events appended to a thread’s stream. The envelope, roughly:

StreamEvent {
  seq              monotonic offset within the stream (ordering + replay cursor)
  stream_id        which thread's stream
  author_agent_id  who emitted it — a human or a bot
  type             'message' | 'agent_output' | 'agent_spawn' | 'sandbox_resumed' | ...
  payload          anything
  ts
}

One ordered log per thread carries chat, agent output, and control events interleaved. That single log is what powers live tailing, replay, and multiplayer fan-out. payload stays polymorphic and type says which kind it is. author_agent_id is how human and bot authorship unify.
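To make the envelope concrete, here is a minimal in-memory sketch of one thread's log. It is not the HTTP log protocol: `InMemoryStream`, `append`, and `readFrom` are hypothetical names, the `type` union lists only the four kinds named above, and representing `ts` as an ISO string is an assumption.

```typescript
type StreamEventType = "message" | "agent_output" | "agent_spawn" | "sandbox_resumed";

interface StreamEvent {
  seq: number; // monotonic offset, doubles as the replay cursor
  stream_id: string;
  author_agent_id: string; // a human or a bot
  type: StreamEventType;
  payload: unknown; // polymorphic; `type` says how to read it
  ts: string; // assumption: ISO-8601 timestamp
}

class InMemoryStream {
  private readonly entries: StreamEvent[] = [];
  private readonly streamId: string;

  constructor(streamId: string) {
    this.streamId = streamId;
  }

  append(authorAgentId: string, type: StreamEventType, payload: unknown): StreamEvent {
    const entry: StreamEvent = {
      seq: this.entries.length,
      stream_id: this.streamId,
      author_agent_id: authorAgentId,
      type,
      payload,
      ts: new Date().toISOString(),
    };
    this.entries.push(entry);
    return entry;
  }

  // Read from 0 to replay everything; read from a later seq to catch up.
  readFrom(seq: number): StreamEvent[] {
    return this.entries.slice(seq);
  }
}

const demoStream = new InMemoryStream("thr_demo");
demoStream.append("agent_human", "message", { text: "make a PDF" });
demoStream.append("agent_bot", "agent_spawn", { child_thread_id: "thr_child" });
demoStream.append("agent_bot", "agent_output", { text: "done" });
console.log(demoStream.readFrom(0).map((e) => `${e.seq}:${e.type}`));
```

A tailing client would remember the last `seq` it saw and call the equivalent of `readFrom(lastSeq + 1)` after a reconnect.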


Two ways to touch a sandbox

Both run commands on a sandbox; they differ in ceremony, not in whether a sandbox is involved. Think bash versus claude -p:

  • A shell command (run_command): one command now, output back inline on the current thread. No child thread. It runs on the thread’s sandbox (threads.sandbox_id) — resolved automatically (live → use, dead → resume, none → create) from the thread’s environment (else the house default). The result is just another stream event.
  • A coding agent (delegate_task): a long-running agent that issues its own commands over minutes and outlives the request. It runs the spawn flow below — its own child thread and stream. By default it isolates: its own fresh sandbox, recipe resolved by the same chain as threads.environment_id above (explicit call env, else thread, else house default). To work an existing tree instead, the caller has it reuse a named sandbox (typically the parent’s, threads.sandbox_id): the child copies that sandbox_id and runs on the same live machine, ignoring env (you’re reusing a running sandbox, not building one). Reuse is the deliberate, contention-bearing case — isolation is what you get for naming nothing. The caller gets a child thread id to follow either way.

So the split is a child thread running an agent versus one inline command; both run on a thread’s sandbox (threads.sandbox_id). The names run_command and delegate_task reflect that fork (command vs job), not the sync/async difference that falls out of it.
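The two resolution rules both tools rely on can be sketched as pure functions: the environment chain (explicit call env, else thread, else house default, else fail loudly) and the sandbox decision (live → use, dead → resume, none → create). All names here are hypothetical, and treating a `pending` sandbox like a live one is an assumption the text does not cover.

```typescript
interface EnvSources {
  callEnvironmentId?: string; // explicit env named on the tool call
  threadEnvironmentId: string | null; // threads.environment_id
  houseDefaultEnvironmentId: string | null; // houses.default_environment_id
}

// First non-empty source wins; nothing set is a loud failure, not a silent default.
function resolveEnvironmentId(src: EnvSources): string {
  const resolved =
    src.callEnvironmentId ?? src.threadEnvironmentId ?? src.houseDefaultEnvironmentId;
  if (resolved === null) {
    throw new Error("no environment: set one on the call, the thread, or the house");
  }
  return resolved;
}

type SandboxStatus = "pending" | "live" | "dead";
type SandboxAction = "use" | "resume" | "create";

// Input is the row behind threads.sandbox_id, or null when the thread has none yet.
function sandboxAction(current: { status: SandboxStatus } | null): SandboxAction {
  if (current === null) return "create";
  if (current.status === "dead") return "resume";
  return "use"; // live; assumption: pending is also used (caller waits for provisioning)
}

console.log(
  resolveEnvironmentId({ threadEnvironmentId: null, houseDefaultEnvironmentId: "env_default" }),
  sandboxAction(null),
  sandboxAction({ status: "dead" }),
);
```

A pure-chat thread never calls either function, which is why a null environment_id is valid until a sandbox-building tool actually runs.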


Spawn flow

1. tool call fires inside a thread (authored by a bot or human agent)
2. create the subagent thread row FIRST: house_id (= parent.house_id), status='idle',
   parent_thread_id (= parent.id), agent_id (the bot) — setting agent_id is
   what makes it a subagent thread — stream_id,
   environment_id (resolved by the same chain: explicit call env, else thread, else house default),
   sandbox_id = null by default (isolated; lazily materialized on first
   command). A deliberate reuse sets it to an existing sandbox (e.g.
   parent.sandbox_id), so the child runs the same tree
3. append to parent stream: { type: 'agent_spawn', author_agent_id, payload: { child_thread_id } }
4. on the child's first command, resolve its sandbox: if sandbox_id is null, materialize one lazily
   from environment (inject env's selected secrets) and set child.sandbox_id
5. agent output streams as entries into the subagent thread's stream; status walks idle→running on
   first work, then rests at →completed|failed|cancelled for that run (waiting on a human/tool stays
   `running` — the stream says so). Each row flip is mirrored by `signal.thread.status_changed`; the
   surrounding stream entries carry the result detail (exit code, error, abort path, summary).

One parent → many subagent threads. By default each gets its own fresh sandbox (isolated), so fanning out three tasks at once never lands three agents on one tree by accident. A task that deliberately reuses an existing sandbox (same sandbox_id) runs the same working tree — the reuse-or-isolate choice from requirement 4, made per task, with isolation as the safe default. Fan-out is just the thread tree getting wide; no extra tables. Sandboxes are provisioned lazily — nothing at thread creation, none until a hand needs one — and the environment stays a pure recipe, many sandboxes built from one. Row-first ordering (step 2 before 3) is deliberate: a crash leaves a harmless orphan idle row (GC-able), never an agent_spawn event pointing at a thread that doesn’t exist.
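The row-first ordering can be checked with a small in-memory sketch. Everything here is hypothetical scaffolding: a `Map` stands in for the threads table, an array for the parent's stream, and `crashAfterRow` simulates a crash between steps 2 and 3.

```typescript
interface ThreadRow {
  id: string;
  house_id: string;
  parent_thread_id: string | null;
  agent_id: string | null; // set = subagent thread
  sandbox_id: string | null; // null = isolated, materialized lazily
  status: string;
}

interface SpawnEvent {
  type: "agent_spawn";
  author_agent_id: string;
  payload: { child_thread_id: string };
}

const threadRows = new Map<string, ThreadRow>(); // stands in for the threads table
const parentStream: SpawnEvent[] = []; // stands in for the parent's durable stream

function spawnSubagent(
  parentThread: ThreadRow,
  callerAgentId: string,
  botAgentId: string,
  childId: string,
  opts: { reuseSandboxId?: string; crashAfterRow?: boolean } = {},
): ThreadRow {
  // Step 2: the row goes in first.
  const child: ThreadRow = {
    id: childId,
    house_id: parentThread.house_id,
    parent_thread_id: parentThread.id,
    agent_id: botAgentId,
    sandbox_id: opts.reuseSandboxId ?? null,
    status: "idle",
  };
  threadRows.set(child.id, child);
  if (opts.crashAfterRow) throw new Error("simulated crash between row and event");
  // Step 3: only now announce the child on the parent stream.
  parentStream.push({
    type: "agent_spawn",
    author_agent_id: callerAgentId,
    payload: { child_thread_id: child.id },
  });
  return child;
}

// Invariant: every agent_spawn event points at a row that exists.
function danglingSpawns(): string[] {
  return parentStream
    .map((e) => e.payload.child_thread_id)
    .filter((id) => !threadRows.has(id));
}

const rootThread: ThreadRow = {
  id: "thr_root", house_id: "h1", parent_thread_id: null,
  agent_id: null, sandbox_id: "sb_1", status: "open",
};
threadRows.set(rootThread.id, rootThread);
spawnSubagent(rootThread, "agent_human", "agent_bot", "thr_child_a");
try {
  spawnSubagent(rootThread, "agent_human", "agent_bot", "thr_child_b", { crashAfterRow: true });
} catch {
  // the crash leaves an idle row behind and no event
}
console.log(danglingSpawns().length, threadRows.has("thr_child_b"));
```

Reversing steps 2 and 3 in this sketch would let `danglingSpawns()` return a non-empty list after the simulated crash, which is the failure the ordering is meant to rule out.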

Resume flow (sandbox dies)

sandboxes.status = 'dead'
→ threads + their streams are untouched (the log survives)
→ create new sandbox: same environment_id (working tree restored from snapshot/git)
→ repoint every thread on the dead sandbox: set threads.sandbox_id = new sandbox (find them via the sandbox_id index)
→ append to each affected stream: { type: 'sandbox_resumed', payload: { sandbox_id } }

A thread points only at its current sandbox (threads.sandbox_id), so resume is a repoint. When several threads shared the dead machine, all are repointed together — they keep sharing the new one; repointing only the thread that noticed would silently fork the others onto a separate resume. (Same row-first ordering as spawn: a crash leaves threads correctly repointed with at most a missing sandbox_resumed event — never the reverse.)
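A minimal sketch of the repoint step, assuming the affected threads have already been loaded (in the schema they would be found through the sandbox_id index). `ThreadPointer`, `ResumedEvent`, and `repointAll` are hypothetical names.

```typescript
interface ThreadPointer {
  id: string;
  stream_id: string;
  sandbox_id: string | null;
}

interface ResumedEvent {
  stream_id: string;
  type: "sandbox_resumed";
  payload: { sandbox_id: string };
}

// Repoints EVERY thread on the dead sandbox, then builds one event per affected stream.
function repointAll(
  threads: ThreadPointer[],
  deadSandboxId: string,
  newSandboxId: string,
): ResumedEvent[] {
  const affected = threads.filter((t) => t.sandbox_id === deadSandboxId);
  // Rows first: after this loop all sharers point at the same new sandbox.
  for (const t of affected) {
    t.sandbox_id = newSandboxId;
  }
  // Events second: a crash here loses at most the sandbox_resumed entries.
  return affected.map((t) => ({
    stream_id: t.stream_id,
    type: "sandbox_resumed" as const,
    payload: { sandbox_id: newSandboxId },
  }));
}

const demoThreads: ThreadPointer[] = [
  { id: "thr_a", stream_id: "thr_a", sandbox_id: "sb_dead" },
  { id: "thr_b", stream_id: "thr_b", sandbox_id: "sb_dead" },
  { id: "thr_c", stream_id: "thr_c", sandbox_id: "sb_other" },
];
const resumedEvents = repointAll(demoThreads, "sb_dead", "sb_new");
console.log(demoThreads.map((t) => t.sandbox_id), resumedEvents.length);
```

Filtering by the dead sandbox id rather than by the thread that noticed the failure is what keeps the sharers together on one new machine.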

History: a thread holds only its current sandbox; the timeline of which sandboxes it ran on lives in the stream’s sandbox_resumed entries (no runs table — see Deferred).


Authorization across layers

house_id on every Postgres table gives tenant isolation via RLS:

alter table threads enable row level security;
create policy threads_tenant on threads
using (house_id = current_setting('app.house_id')); -- current_setting() returns text, matching the text short-id (no uuid cast)
-- repeat per scoped table

The stream lives outside Postgres, so RLS can’t reach it. No problem: the API is the only door to the stream, so it checks house membership when you subscribe or append. One check, one place — the house is the only privacy boundary, so it’s the same “same house?” predicate everywhere (RLS, the stream door, and the composite FKs).
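The stream-side check can be sketched as the same predicate applied at the API door. This is an assumption-laden illustration: `isHouseMember` here is a hypothetical TypeScript mirror of the membership test, not the SQL function, and `authorizeStreamAccess` is an invented name for whatever guards subscribe and append.

```typescript
interface Membership {
  house_id: string;
  agent_id: string;
}

// The house a stream belongs to, read off its thread row.
interface StreamOwner {
  stream_id: string;
  house_id: string;
}

function isHouseMember(members: Membership[], houseId: string, agentId: string): boolean {
  return members.some((m) => m.house_id === houseId && m.agent_id === agentId);
}

// Called before both subscribe and append: one predicate, one place.
function authorizeStreamAccess(
  members: Membership[],
  owner: StreamOwner,
  agentId: string,
): void {
  if (!isHouseMember(members, owner.house_id, agentId)) {
    throw new Error(`agent ${agentId} is not a member of house ${owner.house_id}`);
  }
}

const demoMembers: Membership[] = [{ house_id: "h1", agent_id: "agent_human" }];
const demoOwner: StreamOwner = { stream_id: "thr_demo", house_id: "h1" };
authorizeStreamAccess(demoMembers, demoOwner, "agent_human"); // passes silently
console.log(isHouseMember(demoMembers, "h1", "agent_outsider"));
```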


Deliberately deferred (don’t solve yet)

  • No runs table. The subagent thread is the queryable row; the sandboxes it used are sandbox_resumed events in its stream. Status rides one threads.status enum and the detailed outcome (exit code, error, abort path, summary) is a terminal/control stream event — no extra column or table. Chat and subagent status vocabularies are disjoint, so where status='failed' finds last night’s failed subagent threads, the payoff the table was meant to buy. One thread isn’t one run: it can cycle idle→running→completed→running again, many runs under one driving bot. A human picking it up kicks a new run, and switching runtime spawns a fresh child rather than rebinding agent_id. The run/session split was already collapsed into the thread (arbe-5bca); Pi stop reasons, process exit, provider errors, orphan detection, and session health are narrated as pi.* / signal.thread.* stream entries.
  • Tag registry. threads.tags is a free-form text[] — open, no metadata. Promote to a normalized tag table (+ thread_tag join) only when a tag needs its own attributes (color, description, a rename that propagates) or a validated allowed set. Until then the array + GIN index is the whole feature. Tags are the set half of discovery; pinned threads are the place half and already live on thread.
  • Cross-house thread sharing. The house is the only privacy boundary — a thread is visible to all its house’s members, no per-thread roster, which retires thread_member outright. The open frontier is the opposite: sharing a thread across houses. It’s survivable because a thread is a log plus an execution context — the log is shareable (provenance is already global via author_agent_id), execution is not (compute + secrets are house-owned). So the future feature grants an outside agent read + message while keeping run/delegate to house members, leaving the same-house composite FKs intact. Open question: can a guest execute, or only watch + message? Lean watch + message — once a guest runs commands, your secrets are in their hands.
  • Secret audience / capability secrets. A secret is just a named house value — no audience enum. What binds it tells what it’s for: an environment names it in secret_bindings for env-var injection. Capabilities will bind secrets through their own target, and that binding is the audience. Add it when capabilities are designed, not before.
  • Working-tree restoration. How environments.config makes a resumed sandbox usable — rebuild from git + setup script, or restore a snapshot. The schema works either way. Resume is sandbox-level, not thread-level: threads sharing a sandbox restore together, and any unsaved work since the last save point is lost for all of them — sharing working as intended, not a bug.
  • Multiplayer write-contention. Many actors on one shared sandbox — steering a live subagent thread, or several people running bash at once: who holds the input lock, how commands serialize. A concurrency question, not a schema one.
  • Entry read model. To search or filter entries relationally, build a Postgres projection off the stream. It’s a read model; the stream stays the source of truth.
  • Crash-time dual-write. Spawn and resume each write two systems — Postgres rows and the durable stream — with no shared transaction. The flows order writes rows-first so a crash leaves a harmless leftover (see the Spawn/Resume notes); the residual outbox/saga gap is documented, not built — a reconciliation sweep is cheap later, premature now.