Website marketing

Source material for the arbe website. Structure follows the page flow a visitor would see. Written for a technical audience — people who build software, are already using agents in some capacity, and are skeptical of hype.

Audience

Developers and teams who’ve started working with AI agents and discovered that the real problem isn’t the model; it’s everything around it. Models get smarter, you give them harder problems, and they continue to fail in unexpected ways. The bottleneck is the infrastructure around the model: task scoping, sandboxing, verification, observability, context management. These teams are stitching that infrastructure together by hand (AGENTS.md files, ad hoc scripts, manual triage), and the result is fragile. They want the infrastructure to be a real system, not a pile of workarounds.

Positioning

arbe is infrastructure for humans and AI agents working in the same space. Not a chatbot builder. Not a workflow DSL. Not a wrapper around an LLM. The system that makes agent work reliable — task queues with dependency graphs, isolated sandboxes, verification loops, observability across sessions, and a permission model that treats humans and bots as the same kind of participant.

Two entry points depending on where the reader is:

Practical: arbe gives humans and AI agents the same workflow — claim a task, do the work, close it, commit. Task state lives in the repo. No external services. No dashboards. Just the work.

Architectural: every collaboration tool independently approximates the same substrate — entities, content logs, participants, permissions, an audit trail, and observability. arbe makes those six things the explicit foundation. Build on the substrate instead of reinventing it.

Hero

Headline options

The substrate for human-AI collaboration.
Agents and humans, same workflow.
Infrastructure for the work, not the hype.

Subhead

Six primitives. One permission model. Task graph, sandboxes, observability — for humans and agents alike.

Elevator pitch (2-3 sentences, above the fold)

Every collaboration tool — Slack, a filesystem, an agent framework — independently reinvents the same substrate: entities, content logs, participants, permissions, an audit trail, and observability. arbe makes those six things the explicit foundation. Build on the substrate instead of reinventing it.

Value propositions

It’s not a model problem

Models will keep getting smarter. You’ll keep giving them harder problems. They’ll keep failing in unexpected ways — that’s fundamental to non-deterministic systems. The leverage isn’t in waiting for a better model. It’s in the infrastructure around the model: how tasks are scoped, how work is verified, how context is managed, how you know what happened.

arbe is that infrastructure.

Same workflow for humans and agents

arbe task ready → claim → implement → close → commit. Tasks live in .arbe/tasks/, committed alongside code. No external services, no credentials. The queue doesn’t care who’s working — a person at a terminal and an agent in a sandbox use the same commands.

Task scoping prevents the mega-session

Agents fail when tasks are vague or sprawling. arbe’s task graph forces decomposition: dependencies, blocking/ready computation, discovered-from provenance. Each task is a bounded unit of work. An agent claims one task, does it, closes it. If new work surfaces during implementation, it becomes a new task — not scope creep in the current session.

Verification is built into the loop

Agents that can verify their own work succeed more often. arbe’s workflow bakes in lint and test before every commit. Success is silent; only failures surface. The agent fixes its mistakes before the work lands. Back-pressure, not babysitting.
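One way to picture "success is silent; only failures surface" is a gate that returns nothing when every check passes. The sketch below is an illustration under assumptions: the Check and CheckResult shapes and the verify function are hypothetical names, and the checks are injected as plain functions rather than real lint or test commands.

```typescript
// Hypothetical sketch of a pre-commit verification gate.
// Checks are injected as functions so the sketch stays self-contained;
// in practice each one would shell out to a linter or test runner.

interface CheckResult {
  ok: boolean;
  output: string;
}

type Check = { label: string; run: () => CheckResult };

// Returns only the failures. An empty array means the work may land.
function verify(checks: Check[]): string[] {
  const failures: string[] = [];
  for (const check of checks) {
    const result = check.run();
    if (!result.ok) failures.push(`${check.label}: ${result.output}`);
  }
  return failures;
}

const failures = verify([
  { label: "lint", run: () => ({ ok: true, output: "142 files checked" }) },
  { label: "test", run: () => ({ ok: false, output: "expected 3, got 2" }) },
]);

// Passing checks produce no output at all; failing ones go back to the agent.
for (const f of failures) console.log(f);
```

The passing lint run never reaches the agent's context. Only the failing test does, which is the back-pressure the copy describes.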

Skills for progressive disclosure

Not everything belongs in the system prompt. arbe loads skills — domain-specific instructions and workflows — only when the task needs them. The agent’s context stays focused on the work at hand rather than drowning in instructions it doesn’t need yet.
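As a rough sketch of the idea, the snippet below builds an agent's context from a small base prompt plus only the skills a task calls for. The Skill shape, the keyword triggers, and buildContext are all hypothetical; how arbe actually selects skills is not described here.

```typescript
// Hypothetical sketch of on-demand skill loading.
// Keyword triggers are an assumed selection mechanism, not arbe's.

interface Skill {
  id: string;
  triggers: string[];   // words in the task text that make this skill relevant
  instructions: string; // the text added to context when the skill loads
}

// Build the agent context from a small base prompt plus only the matching skills.
function buildContext(basePrompt: string, taskText: string, skills: Skill[]): string {
  const text = taskText.toLowerCase();
  const relevant = skills.filter((s) =>
    s.triggers.some((t) => text.includes(t.toLowerCase())),
  );
  return [basePrompt, ...relevant.map((s) => s.instructions)].join("\n\n");
}

const skills: Skill[] = [
  {
    id: "db-migrations",
    triggers: ["migration", "schema"],
    instructions: "Migrations: write a reversible migration and run it locally first.",
  },
  {
    id: "release",
    triggers: ["release", "changelog"],
    instructions: "Releases: update the changelog before tagging.",
  },
];

const context = buildContext(
  "You are working on one task. Close it when done.",
  "Add a schema migration for the runs table",
  skills,
);

console.log(context); // base prompt plus the migrations skill; the release skill stays out
```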

Agents are first-class participants, not integrations

A bot and a human are both agents with identity, permissions, and authorship. Same rwx model, same audit trail, same row-level security (RLS) enforcement. An LLM reply and a human message are the same operation at the stream layer: a participant with write permission appended to a room’s log. No special “bot API.”

One permission model for everything

Three bits — read, write, execute — inherited down a scope tree. Works for houses, rooms, tools, humans, bots. x on a scope manages structure. x on a tool gates invocation. Same symbol, different target. Unix got this right in 1973.
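To make the inheritance rule concrete, here is a minimal sketch that resolves rwx bits by walking up a scope tree. The Scope type, the grants map, and the "nearest explicit grant wins" rule are assumptions made for illustration, not arbe’s schema or API.

```typescript
// Hypothetical sketch: rwx bits inherited down a scope tree.
// Names and semantics here are illustrative assumptions, not arbe's API.

const R = 0b100; // read
const W = 0b010; // write
const X = 0b001; // execute

interface Scope {
  id: string;
  parent?: Scope;
  // Explicit grants on this scope, keyed by agent id (human or bot).
  grants: Map<string, number>;
}

// Walk from the scope toward the root; the nearest explicit grant wins.
// (Assumed rule: a grant on a child overrides whatever the parent says.)
function resolve(scope: Scope, agentId: string): number {
  for (let s: Scope | undefined = scope; s; s = s.parent) {
    const bits = s.grants.get(agentId);
    if (bits !== undefined) return bits;
  }
  return 0; // no grant anywhere on the path: no access
}

function can(scope: Scope, agentId: string, bit: number): boolean {
  return (resolve(scope, agentId) & bit) !== 0;
}

// A house containing a room and a tool.
const house: Scope = {
  id: "house",
  grants: new Map([["alice", R | W | X], ["bot-1", R | W]]),
};
const room: Scope = { id: "room", parent: house, grants: new Map() };
const tool: Scope = { id: "tool", parent: house, grants: new Map([["bot-1", R | X]]) };

console.log(can(room, "bot-1", W)); // write in the room is inherited from the house
console.log(can(room, "bot-1", X)); // the bot cannot manage the room's structure
console.log(can(tool, "bot-1", X)); // an explicit grant lets the bot invoke the tool
```

The same can() call covers both meanings of x: on the room it would gate structural changes, on the tool it gates invocation.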

No workflow DSL

Activation policies on agent records define when. Permissions define what. Tools define how. Streams record what happened. The substrate is the automation engine. No DAG builder, no YAML, no visual programming.

Feature areas

The claim-work loop

arbe task ready          # what's unblocked?
arbe task claim <id>     # take it
# implement (or arbe do <id> to dispatch)
arbe task close <id>     # done
jj commit                # task state + code in one change

The rhythm. Same for you, same for the agent.

Task graph

Repo-native task tracker. Dependencies, priorities, blocking/ready computation, discovered-from provenance. Committed alongside code so the state that motivated a commit is always recoverable from that commit. arbe task ready is the starting point for every session — human or machine.
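The blocking/ready computation can be sketched in a few lines. The Task shape below, including the field names and the "lower number is more urgent" priority convention, is a hypothetical stand-in rather than arbe’s on-disk format.

```typescript
// Hypothetical sketch of blocking/ready computation over a task graph.
// The Task shape is an assumption, not arbe's on-disk format.

type Status = "open" | "claimed" | "closed";

interface Task {
  id: string;
  status: Status;
  priority: number;        // lower number = more urgent (assumed convention)
  dependsOn: string[];     // ids that must be closed first
  discoveredFrom?: string; // provenance: the task during which this one surfaced
}

// A task is ready when it is open and every dependency is closed.
// A dependency that cannot be found is treated as still blocking.
function readyTasks(tasks: Task[]): Task[] {
  const byId = new Map(tasks.map((t): [string, Task] => [t.id, t]));
  return tasks
    .filter((t) => t.status === "open")
    .filter((t) => t.dependsOn.every((dep) => byId.get(dep)?.status === "closed"))
    .sort((a, b) => a.priority - b.priority);
}

const tasks: Task[] = [
  { id: "t1", status: "closed", priority: 1, dependsOn: [] },
  { id: "t2", status: "open", priority: 2, dependsOn: ["t1"] },
  { id: "t3", status: "open", priority: 1, dependsOn: ["t2"] },
  { id: "t4", status: "open", priority: 3, dependsOn: [], discoveredFrom: "t1" },
];

console.log(readyTasks(tasks).map((t) => t.id)); // t2 and t4 are unblocked; t3 waits on t2
```

Note that t3 has the highest priority but stays out of the ready list until t2 closes: priority orders the ready set, it never bypasses a dependency.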

Sandboxes

Isolated remote environments where agent sessions execute. Disposable, separate from your machine. arbe do dispatches work to a sandbox. arbe wait watches it finish. arbe resume drops you into a running session. Code execution without local risk.

Sessions and runs

A session is a conversation. A run is the unit of intended work — one per chat, loop, or do invocation. arbe loop runs N iterations with stuck detection. Sub-agent sessions run inside a parent’s run — isolated context windows that keep the orchestrating session clean. Runs correlate sessions to tasks, track lifecycle, and sync when authenticated. Local-first, cloud-optional.

Houses and rooms

Beyond the CLI, arbe is also a collaborative space. Houses are organizational boundaries. Rooms are conversational scopes backed by durable streams. Humans and bots participate in the same rooms with the same permission model. Mention a bot and it responds. Set it to ambient mode and it watches for relevant conversation. One Durable Object class serves all agents — behavior comes from config, not code.
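"Behavior comes from config, not code" can be illustrated with a single activation check shared by every bot. This is a sketch under assumptions: the AgentConfig shape, the handle field, and the topics list used as an ambient relevance filter are invented for the example and are not arbe’s configuration format.

```typescript
// Hypothetical sketch: one handler for all agents, behavior chosen by config.
// The config shape and topic matching are assumptions for illustration.

type Activation =
  | { mode: "mention" }
  | { mode: "ambient"; topics: string[] }; // topics: assumed relevance filter

interface AgentConfig {
  handle: string; // mentioned in a room as @handle
  activation: Activation;
}

// Decide whether a message in a room should wake this agent.
function shouldActivate(config: AgentConfig, message: string): boolean {
  const text = message.toLowerCase();
  // A direct mention always activates, whatever the mode.
  if (text.includes(`@${config.handle.toLowerCase()}`)) return true;
  const activation = config.activation;
  if (activation.mode === "ambient") {
    return activation.topics.some((t) => text.includes(t.toLowerCase()));
  }
  return false;
}

const reviewer: AgentConfig = { handle: "reviewer", activation: { mode: "mention" } };
const dbWatcher: AgentConfig = {
  handle: "dbwatch",
  activation: { mode: "ambient", topics: ["migration", "schema"] },
};

console.log(shouldActivate(reviewer, "@reviewer can you look at this?")); // mentioned
console.log(shouldActivate(reviewer, "the migration failed again"));     // not mentioned
console.log(shouldActivate(dbWatcher, "the migration failed again"));    // ambient topic match
```

Both bots run through the same function. Only their config records differ.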

Observability

Four layers, each answering a different question. Mutations: what changed structurally? Activations: what did the agent decide? Streams: what was said? Signals: what did the system experience? Every write has an agent_id. Join by agent and time for the full picture.
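"Join by agent and time" might look like the following: take events from all four layers, keep one agent’s events inside a time window, and order them. Apart from agent_id, every field and function name here is a hypothetical chosen for the sketch.

```typescript
// Hypothetical sketch: merge the four layers into one timeline for an agent.
// Field names other than agent_id are assumptions.

type Layer = "mutation" | "activation" | "stream" | "signal";

interface ObsEvent {
  layer: Layer;
  agent_id: string;
  at: number; // epoch milliseconds
  summary: string;
}

// Keep one agent's events inside [from, to], ordered by time across layers.
function timeline(agentId: string, from: number, to: number, ...layers: ObsEvent[][]): ObsEvent[] {
  return ([] as ObsEvent[])
    .concat(...layers)
    .filter((e) => e.agent_id === agentId && e.at >= from && e.at <= to)
    .sort((a, b) => a.at - b.at);
}

const mutations: ObsEvent[] = [
  { layer: "mutation", agent_id: "bot-1", at: 1000, summary: "task t2 claimed" },
];
const activations: ObsEvent[] = [
  { layer: "activation", agent_id: "bot-1", at: 900, summary: "mentioned in room, chose to respond" },
];
const streams: ObsEvent[] = [
  { layer: "stream", agent_id: "alice", at: 850, summary: "asked bot-1 for help" },
  { layer: "stream", agent_id: "bot-1", at: 1100, summary: "posted reply" },
];
const signals: ObsEvent[] = [
  { layer: "signal", agent_id: "bot-1", at: 1050, summary: "slow tool call" },
];

for (const e of timeline("bot-1", 0, 2000, mutations, activations, streams, signals)) {
  console.log(e.at, e.layer, e.summary);
}
```

Read top to bottom, the merged timeline answers the four questions in order for a single agent: what it decided, what it changed, what the system experienced, and what it said.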

Technical credibility

Stack

SvelteKit (Svelte 5) on Cloudflare Workers. Postgres (Supabase) + Electric SQL for structural sync. Durable Streams (Electric SQL) for message content. Cloudflare Durable Objects for agent compute. Bun. TypeScript.

The thesis

Every collaborative system independently approximates the same six primitives: records (entities), streams (content logs), agents (participants), permissions, mutations (the audit trail), and signals (observability). The divergence between Slack, a filesystem, and an agent framework is accidental, not essential. arbe names the primitives, builds them once, and treats everything above them (rooms, channels, sessions, whatever metaphor fits) as a swappable surface.

This isn’t a sales pitch. It’s a design constraint. When evaluating any addition to arbe, the question is: does this keep the primitives general, or does it bake in assumptions?

Dogfooding

arbe builds arbe. The CLI is both the product and the development tool. Tasks, sessions, runs, and observability are all dogfooded daily. The system is shaped by the friction of using it — if something feels awkward, that’s signal.

CTA

Early access / waitlist / GitHub — TBD based on launch strategy.

Tone notes

Technical, not breathless. Assume the reader builds software and is skeptical of agent frameworks that overpromise. They’ve been through the cycle: chatbot disappointment, agent promise, configuration frustration. They don’t need to be sold on agents — they need to be sold on infrastructure that makes agents reliable.

Lead with the problem (it’s not the model, it’s the infrastructure around it) and what’s different about arbe’s approach (the primitive model, the permission symmetry, the shared workflow, verification built into the loop). Don’t lead with what’s table stakes (it has chat, it has agents).

Dry, precise, occasionally wry. Avoid “revolutionary,” “game-changing,” “seamless,” “intelligent.” Let the architecture speak.

The practical entry point (claim-work loop, sandboxes, observability) should come first in the visual hierarchy. The thesis and primitive model are for people who want to understand why — a deeper layer, not the headline.

Reference images

Useful diagrams from the field that could inspire visual design for the website (not for direct use — create our own versions):

Harness components diagram (system prompt + tools/MCPs + context + sub-agents as the four levers around a model)
Context firewall: sub-agents as isolated context windows that return condensed results to a parent session
The “smart zone” vs “dumb zone” — agent performance degrades as context fills with irrelevant intermediate results
Progressive disclosure: skills loaded on demand vs everything in the system prompt

arbe’s version of these: the claim-work loop diagram, the run/session/sub-agent hierarchy, the four observability layers, the permission scope tree.