Multi-Agent System: Complete Implementation Guide
Introduction: Building a Multi-Agent System from the Ground Up
This guide documents a complete, production-ready multi-agent system: the design, development, and deployment of an 8-agent setup, together with the tooling used to manage it. The goal is to give you a clear, actionable blueprint for building your own.
The guide walks through each component and what it does, from the core agent registry to the Tmux workspace system, including the tooling for management and monitoring.
It covers agent management, environment setup, and agent communication, and it is written so that you can understand the system and adapt it to your own projects.
Core Components: Dissecting the Multi-Agent System
1. Agent Registry System
At the heart of the system lies the Agent Registry. Defined in the .agents/agents.yaml file, it is the single source of truth for agent configuration. For each agent the registry specifies a unique name, the Git branch to use, the worktree path, the underlying model, and a brief description.
The structure of the agents.yaml file makes it easy to add, remove, and modify agents without touching core code. YAML keeps the configuration easy to read and edit, and changes take effect without recompiling or restarting anything. This keeps the system flexible and maintainable as it evolves.
Here's an example of the agents.yaml file:
agents:
  codex:
    branch: agents/codex
    worktree_path: .agents/agents/codex
    model: codex
    description: Code generation and API integration specialist
  opus:
    branch: agents/opus
    worktree_path: .agents/agents/opus
    model: opus
    description: Opus model agent for complex reasoning
  sonnet:
    branch: agents/sonnet
    worktree_path: .agents/agents/sonnet
    model: sonnet
    description: Sonnet model agent for balanced performance
  codex-low:
    branch: agents/codex-low
    worktree_path: .agents/agents/codex-low
    model: codex
    description: Codex agent for low-priority tasks
  codex-high:
    branch: agents/codex-high
    worktree_path: .agents/agents/codex-high
    model: codex
    description: Codex agent for high-priority tasks
  # ... 3 more agents
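As a rough sketch of how a script might read this registry, the helpers below wrap yq. They assume the Go-based yq v4 (mikefarah/yq) and the registry path shown above; the function names require_yq, list_agents, and agent_field are hypothetical and not taken from agents.sh.

```shell
# Hypothetical helpers for reading the registry; assumes yq v4 (mikefarah/yq).
REGISTRY="${REGISTRY:-.agents/agents.yaml}"

# Fail early with an actionable message when yq is missing.
require_yq() {
  command -v yq >/dev/null 2>&1 || {
    echo "❌ yq not found - install with brew install yq" >&2
    return 1
  }
}

# Print every agent name defined under the top-level "agents" key.
list_agents() {
  yq '.agents | keys | .[]' "$REGISTRY"
}

# Print one field (branch, worktree_path, model, description) for an agent.
# Name and field travel through the environment, so hyphenated keys such as
# codex-low need no special quoting inside the yq expression.
agent_field() {
  AGENT="$1" FIELD="$2" yq '.agents[strenv(AGENT)] | .[strenv(FIELD)]' "$REGISTRY"
}
```

For example, `agent_field codex-low branch` would be expected to print the branch configured for that agent.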
2. Agent Management Script
The Agent Management Script, found in .agents/agents.sh, is the main tool for managing agents. It uses yq to read and parse the YAML configuration from agents.yaml and provides a command-line interface for the basic agent lifecycle: creating agents as full clones checked out to their configured branch, listing all available agents, and removing agents, with built-in safety checks to prevent accidental data loss.
The script uses the git clone command to create full clones for the agents. This method was chosen over git worktree for its simplicity: full clones provide complete isolation, which prevents conflicts between agents and makes the system more predictable. The script also uses colored output and emojis to make its results easier to scan.
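The clone-based create step could look roughly like the sketch below. It is an illustration rather than the actual agents.sh: the function name create_agent and its arguments are hypothetical, and it assumes the target branch already exists in the source repository.

```shell
# Hypothetical sketch of the clone-based create step; not the actual agents.sh.

# create_agent <dest> <branch>
# Clones the current repository into <dest> and checks out <branch>.
# Does nothing when <dest> already exists, so it is safe to rerun.
create_agent() {
  dest="$1"
  branch="$2"
  if [ -d "$dest" ]; then
    echo "ℹ️  $dest already exists, skipping"
    return 0
  fi
  mkdir -p "$(dirname "$dest")"
  # Full local clone; the subshell keeps the caller's working directory unchanged.
  git clone --quiet . "$dest" &&
    (cd "$dest" && git checkout --quiet "$branch") &&
    echo "✅ created $dest on $branch"
}
```

If the checkout fails (for instance because the branch does not exist), the function returns a non-zero status and leaves the clone in place for inspection.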
3. Claude Slash Command
To integrate with Claude, we implemented a slash command, /agents-create. It creates and initializes agents directly from the chat interface. The command declares allowed-tools: [Bash, Read, Edit, Write], which lets it automate the whole agent creation process.
When the /agents-create command is invoked, it first checks whether the specified agents exist in the agents.yaml file. If any are missing, the command suggests a YAML patch to add them. It then creates the agents and lists the result for each one. The command is designed to be idempotent, so it is safe to rerun.
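The "suggest a YAML patch" step can be illustrated with a small sketch. The function suggest_patch is hypothetical; it assumes new entries follow the same naming pattern as the registry example (branch agents/NAME, clone path .agents/agents/NAME) and that the model defaults to the agent name, which would need adjusting for variants such as codex-low.

```shell
# Hypothetical sketch of the patch suggestion; field defaults are assumptions.

# suggest_patch <known-agents> <requested-agent>...
# <known-agents> is a whitespace-separated list of names already registered.
# For each requested name that is missing, print a stanza to paste under "agents:".
suggest_patch() {
  known=" $(echo "$1" | tr '\n' ' ') "
  shift
  for name in "$@"; do
    case "$known" in
      *" $name "*) ;;  # already registered, nothing to suggest
      *)
        printf '  %s:\n' "$name"
        printf '    branch: agents/%s\n' "$name"
        printf '    worktree_path: .agents/agents/%s\n' "$name"
        printf '    model: %s\n' "$name"
        printf '    description: TODO describe %s\n' "$name"
        ;;
    esac
  done
}
```

Calling `suggest_patch "codex opus" opus gemini` prints a stanza for gemini only, because opus is already in the known list.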
4. Agent-Scoped Retrospectives
To support learning and continuous improvement, we integrated agent-scoped retrospectives. The directory structure organizes retrospectives by date and agent, and the file naming convention, YYYY-MM-DD_HH-MM_retrospective.md, keeps them sorted and easy to find.
This structure lets us track each agent's performance, issues, and improvements over time, so that later changes can build on what earlier sessions recorded.
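A minimal sketch of the naming convention follows. The helper retro_path is hypothetical, and placing files directly under retrospectives/AGENT/ is an assumption based on the file manifest, not a confirmed layout.

```shell
# Hypothetical helper; the retrospectives/<agent>/ layout is an assumption.

# retro_path <agent>
# Prints the path for a new retrospective, stamped with the current local time
# in the YYYY-MM-DD_HH-MM_retrospective.md convention.
retro_path() {
  agent="$1"
  printf 'retrospectives/%s/%s_retrospective.md\n' "$agent" "$(date '+%Y-%m-%d_%H-%M')"
}
```

Because the timestamp leads the file name, a plain `ls` lists an agent's retrospectives in chronological order.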
5. /rrr Slash Command
The /rrr (retrospective) slash command automates the creation of retrospectives. It collects session data such as Git status, diffs, logs, and timestamps, and fills a template with sections for session metadata, a timeline of events, technical details, an AI diary, honest feedback, lessons learned, and a validation checklist.
The AI Diary and Honest Feedback sections are the key innovations: they record context that the technical sections alone would miss. Generating the retrospective automatically also saves time and makes it realistic to write one for every session.
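The data-collection part of such a command might be sketched as below. The function name and section headings are hypothetical; only the kinds of data gathered (status, diffs, logs, timestamps) come from the description above. It assumes it runs inside a Git repository that has at least one commit.

```shell
# Hypothetical sketch of gathering session data for a retrospective.

# collect_session_data
# Prints a plain-text summary of the current repository state to stdout.
collect_session_data() {
  printf '## Session metadata\n'
  printf 'Generated: %s\n' "$(date '+%Y-%m-%d %H:%M')"
  printf 'Branch: %s\n\n' "$(git rev-parse --abbrev-ref HEAD)"
  printf '## Git status\n'
  git status --short
  printf '\n## Diff summary\n'
  git diff --stat
  printf '\n## Recent commits\n'
  git log --oneline -n 10
}
```

The output could be redirected into a new retrospective file and then completed with the narrative sections by hand or by the agent.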
Bonus: Tmux Workspace System
6. Multi-Profile Tmux Layouts
Beyond the base requirements, we developed a Tmux workspace system. It provides four profile layouts, each implemented as a shell script in the .agents/profiles/ directory.
Each profile supports a different workflow or preference, so users can pick the pane arrangement that suits how they work and navigate it quickly.
Here's a summary of the profiles:
- Profile 1: 2x2 Grid (70/30 width)
- Profile 2: Full-left + 3 Right Panes (60/40 width)
- Profile 3: Top-full + 2 Bottom Panes
- Profile 4: 3-Pane Left-split
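To make the first profile concrete, here is one way a 2x2 grid with a 70/30 width split could be built. This is a sketch, not the contents of profile1.sh: the function name is hypothetical, and the `-l 30%` size syntax assumes tmux 3.1 or newer.

```shell
# Hypothetical sketch of Profile 1 (2x2 grid, 70/30 width); assumes tmux >= 3.1.

# layout_profile1 <session>
# Creates a detached session with a wide left column and a narrow right column,
# each split into a top and a bottom pane.
layout_profile1() {
  session="$1"
  # Capture pane ids (-P -F) so later splits never rely on hardcoded indices.
  left="$(tmux new-session -d -s "$session" -P -F '#{pane_id}')"
  right="$(tmux split-window -h -l 30% -t "$left" -P -F '#{pane_id}')"
  tmux split-window -v -t "$left"    # halve the left column
  tmux split-window -v -t "$right"   # halve the right column
}
```

After `layout_profile1 ai-demo`, attaching with `tmux attach -t ai-demo` should show the four panes.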
7. Dynamic Session Management
Dynamic session management keeps each workspace easy to find and manage. Session names follow a consistent format, ai-<directory-name>[-prefix]. By default the launcher refuses to create a session that already exists, which prevents accidental duplicates; the --prefix option creates an additional, intentionally separate session. A cleanup command, ./.agents/kill-all.sh, kills all ai-* sessions after prompting the user for confirmation.
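The naming rule and the duplicate check can be sketched as follows. Both function names are hypothetical, and taking the prefix as a positional argument (rather than parsing --prefix) is a simplification for illustration.

```shell
# Hypothetical sketch of session naming and conflict detection.

# session_name [prefix]
# Prints ai-<directory-name>, with -<prefix> appended when a prefix is given.
session_name() {
  name="ai-$(basename "$PWD")"
  if [ -n "${1:-}" ]; then
    name="$name-$1"
  fi
  printf '%s\n' "$name"
}

# start_session [prefix]
# Refuses to create a session whose exact name already exists.
start_session() {
  target="$(session_name "${1:-}")"
  # The leading "=" asks tmux for an exact name match instead of a prefix match.
  if tmux has-session -t "=$target" 2>/dev/null; then
    echo "❌ session $target already exists; pass a prefix to start another" >&2
    return 1
  fi
  tmux new-session -d -s "$target"
}
```

Run from a directory named 01-data-flow, `session_name review` prints ai-01-data-flow-review.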
8. Send Commands to Panes
The .agents/send-commands.sh script automates command execution in each Tmux pane. It detects the window index dynamically, so commands reach the correct panes regardless of how tmux has numbered them. This removes the need to type the same startup commands into every pane by hand.
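A minimal sketch of dynamic pane targeting is shown below. It is not the actual send-commands.sh; the function name is hypothetical, and sending the same command to every pane of the first window is an assumption made to keep the example short.

```shell
# Hypothetical sketch of dynamic window and pane detection.

# send_to_all_panes <session> <command>
# Types <command> followed by Enter into every pane of the session's first window.
send_to_all_panes() {
  session="$1"
  cmd="$2"
  # Ask tmux for the first window index instead of assuming it is 0.
  win="$(tmux list-windows -t "$session" -F '#{window_index}' | head -n 1)"
  tmux list-panes -t "$session:$win" -F '#{pane_index}' |
    while read -r pane; do
      # C-m is the Enter key; stdin is detached so the loop input is untouched.
      tmux send-keys -t "$session:$win.$pane" "$cmd" C-m </dev/null
    done
}
```

Because indices are queried at run time, the same call works whether base-index and pane-base-index are 0 or 1.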
9. Project-Local Tmux Config
The project also ships a project-local Tmux config, .tmux.conf. It is auto-loaded via .envrc using direnv, so the customized tmux settings are applied automatically when you navigate into the directory. The config includes the tmux-power theme, enhanced mouse support, Vi key bindings, and custom status bar styling.
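The source does not show the .envrc itself, so the following is only one plausible shape for it, under the assumption that the config is applied with `tmux source-file` when the shell is already inside tmux. The function name is hypothetical.

```shell
# .envrc (hypothetical sketch; direnv evaluates this file on directory entry)

# load_project_tmux_conf
# Reloads the project-local tmux config, but only inside a running tmux client
# and only when the file actually exists.
load_project_tmux_conf() {
  conf="$PWD/.tmux.conf"
  if [ -n "${TMUX:-}" ] && [ -f "$conf" ]; then
    tmux source-file "$conf"
  fi
}

load_project_tmux_conf
```

Note that `tmux source-file` applies settings to the running tmux server, so global options set in the file also affect sessions outside this project.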
Complete File Manifest
Here is a complete file manifest to show the organization and scope of the project.
Core Agent System
.agents/
├── agents.yaml # Registry (single source of truth)
├── agents.sh # Create/list/remove agents
├── agents/ # Agent clones (gitignored)
│ ├── codex/
│ ├── opus/
│ ├── sonnet/
│ └── ...
├── profiles/ # Tmux layout profiles
│ ├── profile1.sh
│ ├── profile2.sh
│ ├── profile3.sh
│ └── profile4.sh
├── start-agents.sh # Tmux session launcher
├── send-commands.sh # Send commands to panes
└── kill-all.sh # Cleanup all sessions
Claude Slash Commands
.claude/
└── commands/
    ├── agents-create.md # /agents-create command
    ├── alchemist.md # /alchemist (create issues)
    └── rrr.md # /rrr (retrospectives)
Documentation
├── AGENTS.md # Agent workflow guide
├── CLAUDE.md # Entry point to guidelines
├── LESSON_LEARNED.md # Append-only lessons log
├── README.md # Project overview
└── docs/
    └── worktrees.md # Git worktrees reference
Retrospectives
retrospectives/
├── 2025/10/ # Main retrospectives
├── codex/ # Codex agent sessions
├── claude/ # Claude agent sessions
├── gpt-5/ # GPT-5 agent sessions
├── gemini/ # Gemini agent sessions
└── copilot/ # Copilot agent sessions
Lessons Learned: Key Insights from Development
1. Cloning over Worktrees
Cloning provides a simpler mental model, better isolation, and easier .gitignore handling than worktrees. Each agent is an ordinary, self-contained repository, which avoids worktree-specific complexity and is easier to maintain.
2. Profile Systems
Define profiles as data files. Layouts can then be customized, and new profiles added, without touching core logic.
3. Safe Defaults
Detect conflicts by default and require --prefix for intentional multiples. This gives safety without sacrificing power: accidental duplicate sessions are blocked, while deliberate ones remain one flag away.
4. ASCII Diagrams
Text descriptions of layouts were confusing. Draw the layout as ASCII art first and confirm it before implementing; this speeds up iteration and avoids misunderstandings.
5. Iterative Percentage Tuning
Tune pane percentages iteratively so the user sees real results after each change. Quick feedback on an actual layout is more efficient than planning the numbers up front.
6. Interactive Confirmation
Bulk operations are dangerous. kill-all.sh shows what it will do and asks for confirmation before doing it, which keeps cleanup fast without making it risky.
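The show-then-confirm pattern might look like the sketch below. It is an illustration of the idea rather than the real kill-all.sh; the function name and the exact prompt wording are hypothetical.

```shell
# Hypothetical sketch of a confirm-before-bulk-kill helper.

# kill_all_ai_sessions
# Lists every tmux session whose name starts with "ai-", asks for confirmation
# on stdin, and kills them only when the answer is y or Y.
kill_all_ai_sessions() {
  sessions="$(tmux list-sessions -F '#{session_name}' 2>/dev/null | grep '^ai-')"
  if [ -z "$sessions" ]; then
    echo "ℹ️  no ai-* sessions running"
    return 0
  fi
  echo "The following sessions will be killed:"
  echo "$sessions" | sed 's/^/  /'
  printf 'Continue? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) ;;
    *) echo "aborted"; return 1 ;;
  esac
  echo "$sessions" | while read -r s; do
    # "=" forces an exact match so ai-demo never also hits ai-demo-review.
    tmux kill-session -t "=$s" </dev/null && echo "✅ killed $s"
  done
}
```

Any answer other than y or Y, including an empty line, aborts without killing anything.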
7. Session Naming Matters
The session name is ai-<dir>; the profile only changes the layout. Because a profile is just a visual preference, it stays out of the name, which keeps names clean and the workflow simple.
8. Dynamic Window/Pane Detection
Hardcoded window indices (:0.0) broke when tmux reordered windows. Detecting the window and pane indices at run time is robust and works across tmux restarts.
9. Terse Iteration
The user knows what they want visually but not how to describe it. Try something, show a screenshot, and adjust based on feedback. This loop leads to faster development.
10. AI Diary
The AI Diary provides context that technical documentation alone can't capture: a first-person narrative plus honest feedback about the session. That record makes future sessions more valuable.
Implementation Details: Diving into the Code
agents.sh Implementation Strategy
- Clone Strategy: git clone . "$path" && cd "$path" && git checkout "$branch"
- YAML Parser: Use yq (industry standard, reliable)
- Safety First: Check for uncommitted changes before removal
- Colored Output: Green ✅, red ❌, blue ℹ️ for UX
- Error Messages: Clear, actionable (e.g., "yq not found - install with brew install yq")
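The "Safety First" item can be illustrated with a short sketch. The function remove_agent is hypothetical and only demonstrates the uncommitted-changes check; it assumes each agent directory is its own Git clone.

```shell
# Hypothetical sketch of a guarded removal; not the actual agents.sh.

# remove_agent <dest>
# Deletes an agent clone only when its working tree is clean.
remove_agent() {
  dest="$1"
  if [ ! -d "$dest" ]; then
    echo "ℹ️  $dest does not exist"
    return 0
  fi
  # --porcelain prints one line per modified or untracked file; no output means clean.
  if [ -n "$(git -C "$dest" status --porcelain)" ]; then
    echo "❌ $dest has uncommitted changes, refusing to remove" >&2
    return 1
  fi
  rm -rf -- "$dest" && echo "✅ removed $dest"
}
```

This check covers modified and untracked files only. Commits that exist solely in the clone would still be lost, so a fuller version might also compare the branch against its upstream.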
start-agents.sh Architecture
- Dynamic Layout System
- Profile Configuration Example
send-commands.sh Implementation
- Dynamic Pane Targeting
- Why This Works
Usage Statistics: Our Repo's Metrics
- Git History: 30+ commits related to the agent system (Oct 1-2)
- PRs Merged
- Issues Closed
- Issues Open
- Retrospectives Created
- Lessons Logged
- Time Investment
- Lines of Code
Addressing Alchemist Issues: Mapping Implementation to Goals
Issue #32 (MVP: Fully Isolated Multi-Agent Workflow)
- ✅ COMPLETE
Issue #27 (Brainstorm Multi-Agent Workflow)
- ✅ QUESTIONS ANSWERED
Issue #24 (.agents Folder Support)
- ✅ COMPLETE
Issue #29 (Agent-Scoped Retrospectives)
- ✅ COMPLETE
Issue #30 (Forward from esphome-radar)
- ✅ ENHANCED
Production Readiness Checklist: Ensuring a Smooth Transition
- ✅ Core functionality works (8 agents created/tested)
- ✅ Error handling and validation
- ✅ Safety checks (uncommitted changes, session conflicts)
- ✅ Documentation (AGENTS.md, inline comments, retrospectives)
- ✅ User feedback (interactive confirmations)
- ✅ Extensibility (easy to add profile5, agent9, etc.)
- ✅ Git hygiene (.gitignore properly configured)
- ✅ Cross-session tested (multiple tmux sessions work)
- ✅ Slash commands functional and documented
- ✅ Lessons captured for future improvements
Recommendations for Alchemist: Optimizing Your Workflow
1. Use Our Implementation as Reference
2. Consider the Tmux System
3. Adopt the Profile Pattern
4. AI Diary in Retrospectives is Critical
5. Lessons Learned Log is Gold
6. Dynamic Detection > Hardcoding
7. Terse Iteration Works
8. Start with MVP, Add Features Later
Code Repository: Accessing the Source Code
All code is available in: laris-co/01-data-flow
Key Commits:
- 85073bf: Multi-profile tmux system (latest)
- fbaa349: Pane width alignment fix
- 101661a: Dynamic session naming
- d6bb452: Retrospective for session 1
- e79314e: Project-specific tmux config
- 3f727d4: Use clones under .agents/agents (PR #13)
Clone Command:
git clone git@github.com:laris-co/01-data-flow.git
cd 01-data-flow
git submodule update --init --recursive
Quick Test:
# Create an agent
./.agents/agents.sh create opus
# Start tmux workspace
./.agents/start-agents.sh profile1
# List agents
./.agents/agents.sh list
# Kill all sessions
./.agents/kill-all.sh
Questions & Discussion: Engage and Collaborate
We're Happy to Answer:
- Technical implementation questions
- Why we made certain choices
- How to adapt this for different workflows
- Troubleshooting tips
- Performance considerations
- Future enhancements
Open for Collaboration:
- Merging ideas with esphome-radar system (issue #30)
- Contributing improvements back to alchemist
- Sharing additional lessons learned
- Co-developing standards/conventions
Conclusion: Ready to Build Your Multi-Agent System?
This guide offers a clear path toward building your own multi-agent system. The code base provides a production-ready foundation, and the lessons learned and discussion sections supply the context behind its design choices, so you can adapt it to your own projects.
For further reading, explore:
- GitHub (for complete code reference): https://github.com/laris-co/01-data-flow