relayfleet

A reusable control plane for a fleet of relay agents. One repo, one CLI, one unified GitHub watcher — spawn as many always-on Claude Code agents as you want, each its own persistent session, all sharing the plumbing.

It's the multi-agent layer on top of the relay pattern: instead of N agents that each copy the plugin code and each poll GitHub independently, relayfleet gives you config-only agents, shared plugins loaded from the root, and one poller that routes each event to exactly the agents that should see it.

                         ┌──────────────────────────────┐
            GitHub  ───>  │  github router (ONE process) │   routes.json = the gate
                         │  • polls each watch on its    │
                         │    own cadence                │
                         │  • ONE durable dedup cursor   │
                         │  • fans out by route          │
                         └───────────────┬──────────────┘
                                         │  appends {source:"github"} lines
              ┌──────────────────────────┼───────────────────────────┐
              ▼                          ▼                            ▼
   agents/wms-coder/            agents/wms-reviewer/         agents/manager/
     state/inbox.jsonl            state/inbox.jsonl            state/inbox.jsonl
              │                          │                            │
        inbox plugin (shared)  surfaces each as a  [FROM github | ...]  turn
              │                          │                            │
        persistent Claude          persistent Claude            persistent Claude
        Code session               Code session                Code session
        (its own context,          (own context)               (own context)
         own MCP, own
         remote-control)

Key idea: centralise reads, distribute writes. The router is the only thing that watches GitHub. Agents still act on their own — they comment, push, open PRs directly. Polling is the thing that multiplies with agent count, so that's the thing we made singular.

Why

If you run a handful of relay agents the naïve way, two problems show up as you scale:

Copied code. Each agent has its own copy of the plugin code, so a fix doesn't land until you copy it into every agent and restart.
N independent pollers. Every agent calls gh on its own schedule, with overlapping queries and separate dedup state. That is redundant work, and the resulting burst pattern trips GitHub's secondary rate limits well before the 5,000 requests/hour primary limit matters.

relayfleet fixes both: plugins live once at the repo root and are loaded for every agent; a single watcher polls, dedups, and routes.

Install

git clone <your-fork> relayfleet && cd relayfleet
python -m pip install -e ".[runner]"     # runner extras: fastmcp, mcp, pywinpty (win)
# or, control-plane only (no session engine): python -m pip install -e .

Requirements: Python 3.12+, the claude CLI authenticated with a Claude subscription, and the gh CLI authenticated. The runner's PTY backend (pywinpty) is Windows-only today; the control plane (CLI, router, scaffolding) is cross-platform.

Run fleet doctor at any time to check your environment and config.

Quickstart

fleet init --name my-fleet

# stand up a coder + reviewer for a repo (auto-allocates ports + starter routes)
fleet new-agent wms-coder    --role coder    --repo acme/WMS
fleet new-agent wms-reviewer --role reviewer --repo acme/WMS
fleet new-agent manager      --role manager  --repo acme/CustomerSupport

fleet route          # inspect the gate: watches + routes
fleet start --all    # launch every agent's persistent session
fleet router start   # launch the unified watcher
fleet status         # who's running, router state, cursor age

That's it — assigned issues now flow to wms-coder, open PRs to wms-reviewer, and mentions to manager, each as [FROM github | ...] turns in its own Claude session.

The CLI

| Command | What it does |
| --- | --- |
| `fleet init [--name N]` | Create `fleet.json` + `routes.json` + `agents/`. |
| `fleet new-agent <name> --role <role> [--repo R] [--model M] [--no-route] [--no-remote] [--force]` | Scaffold a config-only agent, allocate the next free port, fill the role persona, and (for coder/reviewer/manager with a repo) add a starter watch+route. |
| `fleet list` | Agents with role, port, model, repo. |
| `fleet status` | Running state of the router + every agent; cursor age. |
| `fleet start\|stop\|restart [agent\|--all]` | Manage agent session processes (detached, PID-tracked). |
| `fleet router start\|stop\|restart\|status` | Manage the unified watcher process. |
| `fleet route` | Print every watch + route (the gate); validate; flag routes to unknown agents. |
| `fleet tell <agent> <msg...>` | Drop a message into an agent's inbox (operator or hand-off). |
| `fleet logs <agent>` | Print the tail of an agent's session log. |
| `fleet doctor` | Check `claude`/`gh` on PATH, routes validity, port collisions, dangling routes. |

Roles: coder, reviewer, manager, orchestrator, base. Each is a persona template in relayfleet/templates/roles/ — edit those to reshape what every new agent of that role becomes, or sharpen an individual agent's agents/<name>/CLAUDE.md after scaffolding.

No install? Use the wrappers: bin/fleet (POSIX) / bin/fleet.ps1 (PowerShell) set PYTHONPATH/RELAYFLEET_ROOT and run python -m relayfleet.

routes.json — the gate

A watch is one gh query the poller runs. A route maps matched items to one or more agents. An item is delivered to the union of the `to` lists of all matching routes, and each agent receives it at most once per dedup key.

{
  "watches": [
    { "name": "wms-assigned", "kind": "issues", "repo": "acme/WMS",
      "filter": "--search assignee:@me --state open", "poll_sec": 240 },
    { "name": "wms-prs", "kind": "pulls", "repo": "acme/WMS",
      "filter": "--state open", "poll_sec": 240 }
  ],
  "routes": [
    { "match": { "watch": "wms-assigned" },                                 "to": ["wms-coder"] },
    { "match": { "watch": "wms-prs" },                                      "to": ["wms-reviewer"] },
    { "match": { "repo": "acme/WMS", "kind": "pulls", "label": "needs-human" }, "to": ["manager"] }
  ]
}

A route matches when every key in its match matches; absent keys are wildcards. Supported match keys: watch, repo, kind (issues/pulls/runs), label, author, state, title_contains. An empty match is a catch-all and is flagged by validation so it's never a silent firehose.
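The matching rule above can be sketched in a few lines of Python. This is a minimal illustration, not relayfleet's actual router code: the function names (`key_matches`, `route_matches`, `recipients`), the item field names (`labels`, `title`), and the case-insensitive handling of `title_contains` are all assumptions made for the example.

```python
from typing import Any


def key_matches(key: str, want: Any, item: dict[str, Any]) -> bool:
    """Compare one match key against an item (item field names are assumed)."""
    if key == "label":
        return want in item.get("labels", [])
    if key == "title_contains":
        # Case-insensitive comparison is an assumption of this sketch.
        return str(want).lower() in item.get("title", "").lower()
    return item.get(key) == want


def route_matches(match: dict[str, Any], item: dict[str, Any]) -> bool:
    """Every key present must match; absent keys act as wildcards."""
    return all(key_matches(k, v, item) for k, v in match.items())


def recipients(routes: list[dict[str, Any]], item: dict[str, Any]) -> list[str]:
    """Union of `to` across matching routes, each agent once, first-seen order."""
    seen: dict[str, None] = {}
    for route in routes:
        if route_matches(route.get("match", {}), item):
            for agent in route["to"]:
                seen.setdefault(agent, None)
    return list(seen)


routes = [
    {"match": {"watch": "wms-prs"}, "to": ["wms-reviewer"]},
    {"match": {"repo": "acme/WMS", "kind": "pulls", "label": "needs-human"},
     "to": ["manager"]},
]
item = {"watch": "wms-prs", "repo": "acme/WMS", "kind": "pulls",
        "labels": ["needs-human"], "title": "Fix picking bug"}
print(recipients(routes, item))  # ['wms-reviewer', 'manager']
```

Because `all()` over an empty `match` is true, an empty match behaves as a catch-all here too, which is the case validation flags.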

Dedup & first run. The router keeps one durable cursor (state/router_cursor.json). A watch's first poll is a snapshot — current items are marked seen but not delivered, so startup never replays the backlog. Issues/PRs re-fire when their updatedAt changes (new comment/commit), because that's baked into the dedup key. The cursor survives restarts: a relaunch neither replays nor floods.
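The snapshot and re-fire behaviour can be illustrated with a small cursor class. This is a sketch under stated assumptions, not the router's real implementation: the `Cursor` class, the key format in `dedup_key`, the per-watch bookkeeping, and the JSON layout of the cursor file are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def dedup_key(item: dict) -> str:
    # updatedAt is part of the key, so a new comment or commit yields a new key.
    return f'{item["repo"]}#{item["number"]}@{item["updatedAt"]}'


class Cursor:
    """Durable set of seen keys per watch (file layout is an assumption)."""

    def __init__(self, path: Path):
        self.path = path
        self.seen: dict[str, list[str]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def filter_new(self, watch: str, items: list[dict]) -> list[dict]:
        first_poll = watch not in self.seen
        known = set(self.seen.get(watch, []))
        fresh = [i for i in items if dedup_key(i) not in known]
        known.update(dedup_key(i) for i in items)
        self.seen[watch] = sorted(known)
        self.path.write_text(json.dumps(self.seen))
        # First poll is a snapshot: mark everything seen, deliver nothing.
        return [] if first_poll else fresh


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "router_cursor.json"
    issue = {"repo": "acme/WMS", "number": 7, "updatedAt": "2024-01-01T00:00:00Z"}
    first = Cursor(path).filter_new("wms-assigned", [issue])   # snapshot
    again = Cursor(path).filter_new("wms-assigned", [issue])   # simulated restart
    bumped = dict(issue, updatedAt="2024-01-02T00:00:00Z")
    refire = Cursor(path).filter_new("wms-assigned", [bumped])  # item was updated
    print(len(first), len(again), len(refire))  # 0 0 1
```

Each call constructs a fresh `Cursor` from the file to mimic a relaunch. A production cursor would probably also prune old keys and write the file atomically; both are left out to keep the sketch short.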

Spawning agents from inside the fleet

Create an orchestrator agent and it can run the fleet CLI itself — designing and standing up new agents on request:

fleet new-agent fleet-orchestrator --role orchestrator

Its persona (templates/roles/orchestrator.md) teaches it to scaffold agents, sharpen their briefs, and edit routes — the CLI does the mechanical work, the agent decides the topology.

Running agents anywhere (backends + hosts)

An agent's runtime.backend decides where its runner process lives. Default is local; docker and ssh (Linux and Windows VMs) run it elsewhere. Multiple agents can share a host via the hosts registry in fleet.json.

fleet host add test-vm --backend ssh --host you@10.0.0.5 --workdir /srv/fleet --os linux
fleet new-agent shop-coder --role coder --repo acme/Shop --backend docker
fleet new-agent vm-coder --role coder --repo acme/X --backend ssh --host test-vm --inbox-transport ssh
fleet start --all          # dispatches per backend (detached local / docker run / remote tmux)
fleet status               # shows [local] / [docker] / [ssh@host] per agent

The one router still polls GitHub centrally; it delivers to a remote agent via its runtime.inbox_transport (file | shared | ssh | http). The channel into each agent's session is unchanged — it's local to that agent's runner.
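For the `file` transport, delivery amounts to appending one JSON line to the agent's inbox. The sketch below shows that idea only; the `deliver_file` helper and the `body` and `metadata` field names are assumptions, and only the `source: "github"` field and the `agents/<name>/state/inbox.jsonl` path come from the diagram at the top of this README.

```python
import json
import tempfile
from pathlib import Path


def deliver_file(root: Path, agent: str, body: str, metadata: dict) -> Path:
    """Append one event as a JSON line to agents/<agent>/state/inbox.jsonl."""
    inbox = root / "agents" / agent / "state" / "inbox.jsonl"
    inbox.parent.mkdir(parents=True, exist_ok=True)
    # "body" and "metadata" are hypothetical field names for this sketch.
    line = json.dumps({"source": "github", "body": body, "metadata": metadata})
    with inbox.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return inbox


with tempfile.TemporaryDirectory() as tmp:
    inbox = deliver_file(Path(tmp), "wms-coder", "Issue #7 assigned",
                         {"repo": "acme/WMS"})
    deliver_file(Path(tmp), "wms-coder", "Issue #8 assigned", {"repo": "acme/WMS"})
    records = [json.loads(line)
               for line in inbox.read_text(encoding="utf-8").splitlines()]

print(len(records), records[0]["source"])  # 2 github
```

The other transports would replace only the append step (a shared mount, an SSH push, or an HTTP POST); the record the inbox plugin reads stays the same.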

Endpoints — where each agent's model traffic goes

Per-agent claude.endpoint.mode controls the LLM endpoint, without ever touching your global ~/.claude. Every agent runs with an isolated CLAUDE_CONFIG_DIR by default (isolate_config: true), so your normal Claude Code and other agents are completely unaffected.

| mode | effect |
| --- | --- |
| `default` | talk to Anthropic directly; isolated config reuses the machine login (`inherit_auth`) or `fleet login <agent>` |
| `custom` | point at any Anthropic-API-compatible URL — another proxy, or a local model server (`--endpoint custom --endpoint-url http://127.0.0.1:11434`) |
| `gateway` | route through the fleet Claude gateway (below) |

Set via_gateway: true on a custom endpoint to send a local model through the gateway too (e.g. for logging/PII/compression in front of it).

The Claude gateway — one auth point for the fleet

Remote/container agents shouldn't carry your credentials. With the gateway, they carry only a deterministic sentinel key; the gateway (on the main PC) swaps it for the machine's real Claude credentials — read live per request, so OAuth refresh is picked up — and forwards to each agent's upstream.

# in fleet.json: { "gateway": { "enabled": true, "url": "http://127.0.0.1:8799",
#                               "secret_env": "RELAYFLEET_GATEWAY_SECRET" } }
fleet gateway start         # boots the proxy on the main PC
fleet gateway routes        # sentinel -> upstream, per agent
fleet tunnel open test-vm   # reverse SSH tunnel so the VM reaches the gateway at localhost

Compression (rolling-context) and PII redaction are opt-in upstreams you point the gateway at — compression stays default-off because it breaks prompt caching (see docs/RUNTIMES.md).
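To make the sentinel swap concrete, here is one way it could work. This README does not say how sentinels are derived or which header carries them, so everything in this sketch is an assumption: the HMAC derivation, the `fleet-` prefix, the use of the `x-api-key` header, and the `sentinel_for` and `swap_auth` names are all hypothetical.

```python
import hashlib
import hmac
from typing import Callable


def sentinel_for(agent: str, secret: bytes) -> str:
    """Deterministic per-agent key: same secret + agent name gives same sentinel."""
    digest = hmac.new(secret, agent.encode(), hashlib.sha256).hexdigest()
    return f"fleet-{agent}-{digest[:32]}"


def swap_auth(
    headers: dict[str, str],
    agents: list[str],
    secret: bytes,
    read_real_key: Callable[[], str],
) -> tuple[str, dict[str, str]]:
    """Identify the agent by its sentinel and substitute the real credential."""
    presented = headers.get("x-api-key", "").encode()
    for agent in agents:
        expected = sentinel_for(agent, secret).encode()
        if hmac.compare_digest(presented, expected):
            out = dict(headers)
            # Read on every request so a refreshed credential is picked up.
            out["x-api-key"] = read_real_key()
            return agent, out
    raise PermissionError("unknown sentinel")


secret = b"demo-secret"
agents = ["vm-coder"]
headers = {"x-api-key": sentinel_for("vm-coder", secret)}
agent, forwarded = swap_auth(headers, agents, secret, lambda: "real-credential")
print(agent, forwarded["x-api-key"])  # vm-coder real-credential
```

The point of the design is that a leaked sentinel is useless without network access to the gateway, and the real credential never leaves the main PC.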

Credentials — gated, per agent

Agents declare exactly which credentials they get (least privilege). Each credential is sourced from a host environment variable, a file, or a command, and is delivered as an environment variable or a file inside the agent's runtime (local env, container -e/docker cp, or pushed over SSH).

// agents/<name>/config.json
"credentials": {
  "gh":     { "from_command": ["gh", "auth", "token"], "as_env": "GH_TOKEN" },
  "openai": { "from_env": "OPENAI_API_KEY" },
  "sshkey": { "from_file": "~/.ssh/id_ed25519", "as_file": "~/.ssh/id_ed25519", "mode": "0600" }
}
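A resolver for the three sources in that config could look roughly like this. It is a sketch, not relayfleet's code: `resolve_credential` and `env_for_agent` are hypothetical names, file delivery (`as_file`) is skipped, and defaulting the target variable to the `from_env` name when `as_env` is absent is an assumption.

```python
import os
import subprocess
import sys
from pathlib import Path


def resolve_credential(spec: dict) -> str:
    """Read one secret from the host according to its from_* key."""
    if "from_env" in spec:
        return os.environ[spec["from_env"]]
    if "from_file" in spec:
        return Path(spec["from_file"]).expanduser().read_text()
    if "from_command" in spec:
        done = subprocess.run(spec["from_command"], capture_output=True,
                              text=True, check=True)
        return done.stdout.strip()
    raise ValueError("credential needs from_env, from_file, or from_command")


def env_for_agent(credentials: dict[str, dict]) -> dict[str, str]:
    """Build only the env vars this agent declared, and nothing else."""
    env: dict[str, str] = {}
    for name, spec in credentials.items():
        if "as_file" in spec:
            continue  # file delivery is outside this sketch
        # Assumption: without as_env, reuse the source variable's name.
        target = spec.get("as_env") or spec.get("from_env")
        if target is None:
            raise ValueError(f"{name}: no env target declared")
        env[target] = resolve_credential(spec)
    return env


os.environ["DEMO_API_KEY"] = "sk-demo"
creds = {
    "demo": {"from_env": "DEMO_API_KEY"},
    # Stand-in for ["gh", "auth", "token"] so the example runs without gh.
    "gh": {"from_command": [sys.executable, "-c", "print('tok')"],
           "as_env": "GH_TOKEN"},
}
agent_env = env_for_agent(creds)
print(sorted(agent_env))  # ['DEMO_API_KEY', 'GH_TOKEN']
```

Starting from an empty dict rather than a copy of `os.environ` is what makes the grant least-privilege: an agent sees a variable only if its config names it.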

Writing a plugin

A plugin is a folder under the shared plugins/ dir with a plugin.json and a module exposing async def setup(api). It's loaded once and serves every agent that enables it in config.json. See plugins/_template/. Inside setup you get api.config, api.agent_dir/api.state_dir, await api.emit(body, metadata=, source=), the @api.tool(...) decorator, and api.spawn(coro). The bundled inbox plugin is the universal delivery channel (operator messages, inter-agent hand-offs, and the router's GitHub events).

Layout

relayfleet/                package: cli, fleetconfig, scaffold, paths, proc
  runner/                  per-agent engine (PTY session + MCP/channel)
  router/                  the unified watcher (github, routes, service)
  templates/               base config/CLAUDE.md + role personas
plugins/                   SHARED plugins (inbox, _template) — loaded for every agent
agents/                    config-only agent folders, created by `fleet new-agent`
routes.json                the gate
fleet.json                 fleet-wide defaults
tests/                     pytest suite (routes, router, scaffold, cli, proc, github)

Relationship to relay

relayfleet is a from-scratch, self-contained reimagining of the single-agent relay runner, generalised to a fleet. The per-agent engine in relayfleet/runner/ is faithful to relay's proven ConPTY + MCP-channel + remote-control approach — agents are still ordinary persistent interactive Claude Code sessions. What's new is everything around them: shared plugin loading, the unified router, config-only agents, and the fleet CLI.

Status

Beta. The control plane (CLI, router, scaffolding, config) is covered by the test suite and runs cross-platform. The session engine currently requires Windows (pywinpty) and the claude CLI to exercise end-to-end. License: MIT.
