A reusable control plane for a fleet of relay agents. One repo, one CLI, one unified GitHub watcher — spawn as many always-on Claude Code agents as you want, each its own persistent session, all sharing the plumbing.
It's the multi-agent layer on top of the relay pattern: instead of N agents that each copy the plugin code and each poll GitHub independently, relayfleet gives you config-only agents, shared plugins loaded from the root, and one poller that routes each event to exactly the agents that should see it.

                     ┌──────────────────────────────┐
        GitHub ───>  │  github router (ONE process) │  routes.json = the gate
                     │  • polls each watch on its   │
                     │    own cadence               │
                     │  • ONE durable dedup cursor  │
                     │  • fans out by route         │
                     └───────────────┬──────────────┘
                                     │ appends {source:"github"} lines
          ┌──────────────────────────┼───────────────────────────┐
          ▼                          ▼                           ▼
    agents/wms-coder/   agents/wms-reviewer/          agents/manager/
    state/inbox.jsonl   state/inbox.jsonl             state/inbox.jsonl
          │                          │                           │
    inbox plugin (shared) surfaces each as a [FROM github | ...] turn
          │                          │                           │
    persistent Claude   persistent Claude             persistent Claude
    Code session        Code session                  Code session
    (its own context,   (own context)                 (own context)
     own MCP, own
     remote-control)

Key idea: centralise reads, distribute writes. The router is the only thing that watches GitHub. Agents still act on their own — they comment, push, open PRs directly. Polling is the thing that multiplies with agent count, so that's the thing we made singular.
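
The fan-out step in the diagram amounts to appending one JSON line to each target agent's inbox. The following is a minimal sketch of that idea, assuming the `agents/<name>/state/inbox.jsonl` layout shown above. The `deliver` helper and every event field other than `source` are hypothetical and are not relayfleet's actual schema.

```python
import json
from pathlib import Path


def deliver(root: Path, agents: list[str], event: dict) -> None:
    """Append one JSON line per target agent (hypothetical helper)."""
    # "source" is set last so an event cannot spoof a different origin.
    line = json.dumps({**event, "source": "github"}, sort_keys=True)
    for name in agents:
        inbox = root / "agents" / name / "state" / "inbox.jsonl"
        inbox.parent.mkdir(parents=True, exist_ok=True)
        # Append-only: each agent's inbox plugin can read its own file
        # without coordinating with the router or with other agents.
        with inbox.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")
```

Because every agent has its own file, a slow or stopped agent never blocks delivery to the others.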
If you run a handful of relay agents the naïve way, two problems show up as you scale:
- Copied code. Each agent has its own copy of the plugin code, so a fix doesn't land until you copy it into every agent and restart.
- N independent pollers. Every agent hits `gh` on its own schedule, with overlapping queries and separate dedup state: redundant work, and a burst pattern that trips GitHub's secondary rate limits well before the 5000/hr primary limit matters.
relayfleet fixes both: plugins live once at the repo root and are loaded for every agent; a single watcher polls, dedups, and routes.
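
To make the multiplication concrete, here is a rough back-of-the-envelope calculation. It assumes one `gh` query per watch per poll, and the agent and watch counts are illustrative only; the `poll_sec` of 240 is the value used in this README's sample watches.

```python
def polls_per_hour(poll_sec: int, watches: int, pollers: int = 1) -> float:
    """Total gh queries per hour when every poller runs every watch itself."""
    return pollers * watches * 3600 / poll_sec


# Ten agents that each poll two watches independently (illustrative numbers).
independent = polls_per_hour(poll_sec=240, watches=2, pollers=10)  # 300.0

# One central router polling the same two watches.
central = polls_per_hour(poll_sec=240, watches=2)  # 30.0
```

The central figure stays flat as agents are added, which is the point of having a single watcher.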

    git clone <your-fork> relayfleet && cd relayfleet
    python -m pip install -e ".[runner]" # runner extras: fastmcp, mcp, pywinpty (win)
    # or, control-plane only (no session engine): python -m pip install -e .

Requirements: Python 3.12+, the `claude` CLI
authenticated with a Claude subscription, and the `gh`
CLI authenticated. The runner's PTY backend (pywinpty) is Windows-only today;
the control plane (CLI, router, scaffolding) is cross-platform.
Run `fleet doctor` any time to check your environment + config.

    fleet init --name my-fleet
    # stand up a coder + reviewer for a repo (auto-allocates ports + starter routes)
    fleet new-agent wms-coder --role coder --repo acme/WMS
    fleet new-agent wms-reviewer --role reviewer --repo acme/WMS
    fleet new-agent manager --role manager --repo acme/CustomerSupport
    fleet route # inspect the gate: watches + routes
    fleet start --all # launch every agent's persistent session
    fleet router start # launch the unified watcher
    fleet status # who's running, router state, cursor age

That's it: assigned issues now flow to wms-coder, open PRs to
wms-reviewer, and mentions to manager, each as [FROM github | ...] turns
in its own Claude session.
| Command | What it does |
|---|---|
| `fleet init [--name N]` | Create fleet.json + routes.json + agents/. |
| `fleet new-agent <name> --role <role> [--repo R] [--model M] [--no-route] [--no-remote] [--force]` | Scaffold a config-only agent, allocate the next free port, fill the role persona, and (for coder/reviewer/manager with a repo) add a starter watch+route. |
| `fleet list` | Agents with role, port, model, repo. |
| `fleet status` | Running state of the router + every agent; cursor age. |
| `fleet start\|stop\|restart [agent\|--all]` | Manage agent session processes (detached, PID-tracked). |
| `fleet router start\|stop\|restart\|status` | Manage the unified watcher process. |
| `fleet route` | Print every watch + route (the gate); validate; flag routes to unknown agents. |
| `fleet tell <agent> <msg...>` | Drop a message into an agent's inbox (operator or hand-off). |
| `fleet logs <agent>` | Print the tail of an agent's session log. |
| `fleet doctor` | Check claude/gh on PATH, routes validity, port collisions, dangling routes. |

Roles: coder, reviewer, manager, orchestrator, base. Each is a
persona template in relayfleet/templates/roles/ — edit those to reshape what
every new agent of that role becomes, or sharpen an individual agent's
agents/<name>/CLAUDE.md after scaffolding.
No install? Use the wrappers: bin/fleet (POSIX) / bin/fleet.ps1 (PowerShell)
set PYTHONPATH/RELAYFLEET_ROOT and run python -m relayfleet.
A watch is one `gh` query the poller runs. A route maps matched items
to one or more agents. An item is delivered to the union of `to` agents
across all matching routes, each agent at most once per dedup key.

    {
      "watches": [
        { "name": "wms-assigned", "kind": "issues", "repo": "acme/WMS",
          "filter": "--search assignee:@me --state open", "poll_sec": 240 },
        { "name": "wms-prs", "kind": "pulls", "repo": "acme/WMS",
          "filter": "--state open", "poll_sec": 240 }
      ],
      "routes": [
        { "match": { "watch": "wms-assigned" }, "to": ["wms-coder"] },
        { "match": { "watch": "wms-prs" }, "to": ["wms-reviewer"] },
        { "match": { "repo": "acme/WMS", "kind": "pulls", "label": "needs-human" }, "to": ["manager"] }
      ]
    }

A route matches when every key in its `match` matches; absent keys are
wildcards. Supported match keys: watch, repo, kind
(issues/pulls/runs), label, author, state, title_contains. An
empty match is a catch-all and is flagged by validation so it's never a
silent firehose.
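
The matching rule above can be sketched in a few lines. This is an illustration under stated assumptions and not the router's actual code: the item shape (a `labels` list, a `title` string) and the case-insensitive `title_contains` comparison are guesses, and `route_matches` and `targets` are hypothetical names.

```python
def route_matches(match: dict, item: dict) -> bool:
    """True when every key in `match` agrees with `item`; absent keys are wildcards."""
    for key, want in match.items():
        if key == "label":
            # Assumed item shape: labels arrive as a list of names.
            if want not in item.get("labels", []):
                return False
        elif key == "title_contains":
            # Assumed to be a case-insensitive substring test.
            if want.lower() not in item.get("title", "").lower():
                return False
        elif item.get(key) != want:
            return False
    # An empty match falls through to here: that is the catch-all case.
    return True


def targets(routes: list[dict], item: dict) -> list[str]:
    """Union of `to` agents across all matching routes, each agent once."""
    seen: dict[str, None] = {}  # dict keeps first-seen order
    for route in routes:
        if route_matches(route.get("match", {}), item):
            for agent in route["to"]:
                seen.setdefault(agent, None)
    return list(seen)
```

With the sample routes.json above, an open PR in acme/WMS labelled `needs-human` and picked up by the `wms-prs` watch would resolve to `["wms-reviewer", "manager"]`: two routes match and their `to` lists are merged.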
Dedup & first run. The router keeps one durable cursor
(state/router_cursor.json). A watch's first poll is a snapshot — current
items are marked seen but not delivered, so startup never replays the backlog.
Issues/PRs re-fire when their updatedAt changes (new comment/commit), because
that's baked into the dedup key. The cursor survives restarts: a relaunch
neither replays nor floods.
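
The snapshot-then-diff behaviour can be illustrated with a small durable seen-set. This is a simplified sketch: the JSON layout of the cursor file, the `number@updatedAt` key format, and the `Cursor` class itself are assumptions made for illustration, and it tracks keys per watch where the real router also dedups per agent.

```python
import json
from pathlib import Path


class Cursor:
    """Durable seen-set keyed by watch name (file format is assumed)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.seen: dict[str, list[str]] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    def new_items(self, watch: str, items: list[dict]) -> list[dict]:
        """Return the items to deliver for this poll and persist the cursor."""
        first_poll = watch not in self.seen
        known = set(self.seen.get(watch, []))
        fresh = []
        for item in items:
            # updatedAt is part of the key, so a new comment or commit re-fires.
            key = f"{item['number']}@{item['updatedAt']}"
            if key not in known:
                known.add(key)
                fresh.append(item)
        self.seen[watch] = sorted(known)
        self.path.write_text(json.dumps(self.seen), encoding="utf-8")
        # First poll is a snapshot: mark everything seen, deliver nothing.
        return [] if first_poll else fresh
```

Reloading a `Cursor` from the same path after a restart yields no deliveries for unchanged items, which mirrors the "neither replays nor floods" guarantee.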
Create an orchestrator agent and it can run the fleet CLI itself —
designing and standing up new agents on request:

    fleet new-agent fleet-orchestrator --role orchestrator

Its persona (templates/roles/orchestrator.md) teaches it to scaffold agents,
sharpen their briefs, and edit routes — the CLI does the mechanical work, the
agent decides the topology.
An agent's runtime.backend decides where its runner process lives. Default is
local; docker and ssh (Linux and Windows VMs) run it elsewhere.
Multiple agents can share a host via the hosts registry in fleet.json.

    fleet host add test-vm --backend ssh --host you@10.0.0.5 --workdir /srv/fleet --os linux
    fleet new-agent shop-coder --role coder --repo acme/Shop --backend docker
    fleet new-agent vm-coder --role coder --repo acme/X --backend ssh --host test-vm --inbox-transport ssh
    fleet start --all # dispatches per backend (detached local / docker run / remote tmux)
    fleet status # shows [local] / [docker] / [ssh@host] per agent

The one router still polls GitHub centrally; it delivers to a remote agent via
its runtime.inbox_transport (file | shared | ssh | http). The channel
into each agent's session is unchanged — it's local to that agent's runner.
Per-agent claude.endpoint.mode controls the LLM endpoint, without ever
touching your global ~/.claude. Every agent runs with an isolated
CLAUDE_CONFIG_DIR by default (isolate_config: true), so your normal Claude
Code and other agents are completely unaffected.
| mode | effect |
|---|---|
| `default` | talk to Anthropic directly; isolated config reuses the machine login (`inherit_auth`) or `fleet login <agent>` |
| `custom` | point at any Anthropic-API-compatible URL, such as another proxy or a local model server (`--endpoint custom --endpoint-url http://127.0.0.1:11434`) |
| `gateway` | route through the fleet Claude gateway (below) |

Set via_gateway: true on a custom endpoint to send a local model through
the gateway too (e.g. for logging/PII/compression in front of it).
Remote/container agents shouldn't carry your credentials. With the gateway, they carry only a deterministic sentinel key; the gateway (on the main PC) swaps it for the machine's real Claude credentials — read live per request, so OAuth refresh is picked up — and forwards to each agent's upstream.

    # in fleet.json: { "gateway": { "enabled": true, "url": "http://127.0.0.1:8799",
    # "secret_env": "RELAYFLEET_GATEWAY_SECRET" } }
    fleet gateway start # boots the proxy on the main PC
    fleet gateway routes # sentinel -> upstream, per agent
    fleet tunnel open test-vm # reverse SSH tunnel so the VM reaches the gateway at localhost

Compression (rolling-context) and PII redaction are opt-in upstreams you point
the gateway at — compression stays default-off because it breaks prompt
caching (see docs/RUNTIMES.md).
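
The sentinel swap can be pictured as follows. This is a sketch under loud assumptions: the README says only that the sentinel is deterministic, so the HMAC derivation, the `sentinel-` prefix, the use of the `x-api-key` header, and both function names are illustrative choices and not the gateway's real scheme.

```python
import hashlib
import hmac


def sentinel_for(agent: str, secret: bytes) -> str:
    """Deterministic placeholder key for an agent (HMAC derivation is assumed)."""
    digest = hmac.new(secret, agent.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"sentinel-{digest[:32]}"


def swap_credentials(
    headers: dict[str, str], routes: dict[str, str], real_key: str
) -> tuple[str, dict[str, str]]:
    """Map a presented sentinel to its upstream and substitute the real credential."""
    presented = headers.get("x-api-key", "")
    for sentinel, upstream in routes.items():
        # Constant-time comparison avoids leaking how much of the key matched.
        if hmac.compare_digest(presented, sentinel):
            return upstream, {**headers, "x-api-key": real_key}
    raise PermissionError("unknown sentinel key")
```

In this picture `real_key` would be read fresh on every request, which is how a refreshed OAuth token gets picked up without restarting remote agents.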
Agents declare exactly which credentials they get (least privilege). Sourced
from host env, a file, or a command; delivered as env or a file in the agent's
box (local env, container -e/docker cp, or pushed over SSH).
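
A declaration-driven resolver might look like the sketch below. The spec shape (`from`, `name`, `path`, `argv`, `as`) is hypothetical, since this README does not show the real credential schema, and only the env-delivery side is illustrated.

```python
import os
import subprocess
from pathlib import Path


def resolve_secret(spec: dict) -> str:
    """Read one credential from host env, a file, or a command (spec shape assumed)."""
    source = spec["from"]
    if source == "env":
        return os.environ[spec["name"]]
    if source == "file":
        return Path(spec["path"]).read_text(encoding="utf-8").strip()
    if source == "command":
        # argv list with no shell: avoids quoting and injection surprises.
        result = subprocess.run(
            spec["argv"], check=True, capture_output=True, text=True
        )
        return result.stdout.strip()
    raise ValueError(f"unknown credential source: {source!r}")


def agent_secrets(declared: list[dict]) -> dict[str, str]:
    """Resolve only what the agent declares; nothing else is passed through."""
    return {spec["as"]: resolve_secret(spec) for spec in declared}
```

The resulting mapping could be merged into a local child environment, turned into `-e` flags for a container, or written to a file and pushed over SSH, depending on the backend.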
A plugin is a folder under the shared plugins/ dir with a plugin.json and a
module exposing async def setup(api). It's loaded once and serves every agent
that enables it in config.json. See plugins/_template/. Inside setup you
get api.config, api.agent_dir/api.state_dir,
await api.emit(body, metadata=, source=), the @api.tool(...) decorator, and
api.spawn(coro). The bundled inbox plugin is the universal delivery channel
(operator messages, inter-agent hand-offs, and the router's GitHub events).

    relayfleet/        package: cli, fleetconfig, scaffold, paths, proc
      runner/          per-agent engine (PTY session + MCP/channel)
      router/          the unified watcher (github, routes, service)
      templates/       base config/CLAUDE.md + role personas
    plugins/           SHARED plugins (inbox, _template), loaded for every agent
    agents/            config-only agent folders, created by `fleet new-agent`
    routes.json        the gate
    fleet.json         fleet-wide defaults
    tests/             pytest suite (routes, router, scaffold, cli, proc, github)

relayfleet is a from-scratch, self-contained reimagining of the single-agent
relay runner, generalised to a fleet.
The per-agent engine in relayfleet/runner/ is faithful to relay's proven
ConPTY + MCP-channel + remote-control approach — agents are still ordinary
persistent interactive Claude Code sessions. What's new is everything around
them: shared plugin loading, the unified router, config-only agents, and the
fleet CLI.
Beta. The control plane (CLI, router, scaffolding, config) is covered by the
test suite and runs cross-platform. The session engine currently requires
Windows (pywinpty) and the claude CLI to exercise end-to-end. License: MIT.