Skip to content

saggda/agent-forge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

agent-forge

Run one task across Codex, DeepSeek, Claude Code (or any CLI agent) in parallel — each in an isolated git worktree — then pick the best diff.

forge run "add retry logic to send_message()" --engine codex,deepseek

        your task
       /          \
  Codex          DeepSeek
  worktree       worktree
  (branch)       (branch)
       \          /
      forge compare
      → LLM picks winner

Stop being the manual copy-paste buffer between AI coding tools.

Zero external dependencies. Pure Python stdlib (3.11+). Works today.

The Problem

If you maintain real projects (especially multiple ones), you already use several coding agents:

  • OpenAI Codex / GPT-4o / o1 for one class of tasks
  • Claude Code / Sonnet for architecture and large refactors
  • DeepSeek, Qwen, local models for speed and cost

Every time you switch, you lose context. You manually copy prompts, diffs, and decisions. You become the bottleneck.

agent-forge removes that friction.

How It Works

Task (or task.md) 
    → forge run task.md --engine codex,claude,deepseek
    → each engine gets its own git worktree (full isolation)
    → parallel execution
    → forge compare <id>   (LLM-assisted diff review)
    → forge review <run>   (deeper analysis of one result)

Web tools (Cursor, Claude.ai, Grok) stay as your upstream for strategy. agent-forge only orchestrates anything with a CLI or API.

Key Features

  • True parallel execution with git worktree isolation (no branch pollution)
  • Pluggable engines via simple TOML config — add any CLI or OpenRouter-compatible API in 3 lines
  • Built-in review & compare using your preferred model (OpenRouter)
  • Per-repo SQLite tracking (.forge/forge.db)
  • digest command — perfect for daily/launchd summaries
  • Clean, auditable diffs after every run

Installation

git clone https://github.com/molty/agent-forge.git
cd agent-forge
pip install -e .
cp forge.toml.example forge.toml

Then configure your engines in forge.toml.

Usage

forge run "add rate limiting to the auth middleware" --engine codex,deepseek
forge run --file task.md --engine codex,sonnet
forge list
forge show 42
forge compare 42
forge review 87
forge digest --hours 24
forge clean 42

Configuration Example

[engines.codex]
type = "cli"
cmd = "codex exec --sandbox workspace-write {prompt}"
timeout = 1800

[engines.deepseek]
type = "cli"
cmd = "commandcode run --prompt-file {prompt_file} --model deepseek/deepseek-v4-pro"

[engines.sonnet]
type = "cli"
cmd = "claude -p {prompt} --model sonnet-3.7"

[review]
model = "anthropic/claude-3.7-sonnet"
api_key_env = "OPENROUTER_API_KEY"

Adding a new engine is one [engines.*] section.

Real Usage

This tool was built because the author maintains multiple active production-grade systems (prediction market research platform, on-chain Solana automation, high-volume Telegram infrastructure). It is used daily to accelerate implementation across different agent strengths without losing context.

For OSS Maintainers

If you are an open-source maintainer juggling several coding agents while trying to keep velocity on issues and PRs — this is for you.

  • Fast triage of complex issues across multiple models
  • Consistent review quality via compare
  • Easy to extend with your own preferred agents and review models

We actively welcome contributions from other maintainers.

Roadmap (visible in issues)

  • Resume previous task runs
  • Auto-apply clean reviews
  • Better multi-repo support
  • Native Telegram / Slack digest

Contributing

Good first issues are labeled good-first-issue.

Before opening a PR:

  1. Run python -m pytest (when tests appear)
  2. Keep the zero-dependency promise for the core
  3. Update this README if behavior changes

License

MIT

Status

MVP, but daily driver for its author since early 2026.

Production-ready for power users and maintainers who live in the terminal and multiple agents.

About

Run coding tasks across Codex, DeepSeek, Claude Code in parallel. Zero deps, pure Python.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages