Agent Interoperability
Agents, within the context of chain-of-thought reasoning, are drastically improving at individual tasks. Specialized agents with tool calling can understand context, complete a task and self-iterate. In this case study, we tried to improve inter-agent communication so that multiple specialized agents work towards a single goal: building a startup.
We built a prototype of a self-sustaining startup built entirely with AI agents. The system was able to do market research, create business plans, write full-stack code, iterate on software updates, build branding guides, create logos and hire on LinkedIn. Throughout this process, the agents stayed aligned on their end goal and communicated about it. We believe that our framework for agent-based interaction is an important step towards interoperable agents.
Alignment & Interoperability
We identified interoperability as the limiting factor for successful multi-agent collaboration. This involved breaking down the task into three key components:
- Establishing clear channels for agents to exchange information effectively, in our context through Slack messages and channels.
- Enabling high-talent agents to solve complex problems. Each agent had access to tools to execute their tasks effectively.
- Aligning the efforts of different agents toward common objectives.
With these in mind, we created a system where agents not only excel individually but also enhance each other’s performance through feedback and alignment.
Channels for Agents to Exchange Information
To efficiently exchange information between agents, we used Slack as a multi-channel communication system. This system is designed to mimic the complex interactions that occur within a real startup environment, where multiple tasks are worked on in parallel.
We integrated different Slack bots on the central messaging platform, with dedicated channels for different aspects of the startup, allowing for instant communication and quick decision-making. The message history acts as a centralized repository of information, so all agents have access to the latest data, plans, and assets. Each agent was specialized, and various APIs were integrated to enable real-time updates and interactions with external tools and services.
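The channel model described above can be sketched with a small in-memory bus. This is only an illustration under stated assumptions: `ChannelBus`, `Message` and the channel names are hypothetical stand-ins, and the code does not call the Slack API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    channel: str
    sender: str
    text: str


class ChannelBus:
    """In-memory stand-in for a Slack workspace (hypothetical, not the Slack API)."""

    def __init__(self):
        self.history = defaultdict(list)      # channel -> every message posted there
        self.subscribers = defaultdict(list)  # channel -> callbacks of listening agents

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def post(self, channel, sender, text):
        message = Message(channel, sender, text)
        self.history[channel].append(message)  # shared record any agent can read later
        for callback in self.subscribers[channel]:
            callback(message)
        return message


bus = ChannelBus()
seen_by_cto = []
bus.subscribe("branding", seen_by_cto.append)  # the CTO listens to branding updates
bus.post("branding", "CMO", "Logo updated to match the brand guidelines.")
bus.post("engineering", "CTO", "Pushed a new commit.")
```

Here the CTO only receives the branding message, while the full history of both channels stays available to any agent that needs to catch up.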
Specialized Agents
Board of Directors
The first agent was our Board of Directors, which kept our system on track towards a common goal. It created plans, evaluated messages and realigned inter-agent communication to ensure consistency. The Board of Directors was also responsible for deciding who should speak next, by reading the messages and the context to figure out which tools or agents were necessary.
These plans were produced dynamically in the form of Event classes, where each event could have sub-events. The event metadata would guide the conversation and indicate which agents and/or tools should be involved. This was the first step in aligning the efforts of different agents toward common objectives.
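A minimal sketch of what such a nested plan could look like follows. The field names (`title`, `agents`, `tools`, `sub_events`) and the helper methods are assumptions made for illustration, not the actual Event class from the project.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Event:
    """Hypothetical plan node; all field names are assumptions."""
    title: str
    agents: List[str] = field(default_factory=list)   # who should be involved
    tools: List[str] = field(default_factory=list)    # which tools are needed
    sub_events: List["Event"] = field(default_factory=list)

    def walk(self) -> Iterator["Event"]:
        """Yield this event and all nested sub-events, depth first."""
        yield self
        for child in self.sub_events:
            yield from child.walk()

    def participants(self) -> List[str]:
        """Every agent named anywhere in the plan, in first-seen order."""
        seen: List[str] = []
        for event in self.walk():
            for agent in event.agents:
                if agent not in seen:
                    seen.append(agent)
        return seen


plan = Event(
    title="Launch MVP",
    agents=["CEO"],
    sub_events=[
        Event("Build frontend", agents=["CTO"], tools=["v0"]),
        Event("Design logo", agents=["CMO"], tools=["image_model"]),
    ],
)
```

Calling `plan.participants()` lists the agents a coordinator would need to bring into the conversation for this plan.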
For the interactive agents, we made custom BaseAgent classes, each with access to its own tools and skills. The goal of each BaseAgent was to interpret an Event and use its personal context and the current state of its tasks to continue the development of the product, in whatever direction it saw fit.
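The agent side can be sketched in the same spirit. Everything here is hypothetical: the `handle` method, the tool registry and the minimal `Event` stand-in are assumptions, and in the real system a language model, not a dictionary lookup, would decide how to act on an event.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """Minimal stand-in for a plan event (hypothetical fields)."""
    title: str
    tools: List[str] = field(default_factory=list)


class BaseAgent:
    """Hypothetical agent: a name, a set of tools, and a personal context."""

    def __init__(self, name: str, tools: Dict[str, Callable[[str], str]]):
        self.name = name
        self.tools = tools
        self.context: List[str] = []  # titles of events this agent has handled

    def handle(self, event: Event) -> List[str]:
        """Run each requested tool the agent owns and record the event."""
        results = []
        for tool_name in event.tools:
            tool = self.tools.get(tool_name)
            if tool is None:
                results.append(f"{self.name}: no access to {tool_name}")
            else:
                results.append(tool(event.title))
        self.context.append(event.title)
        return results


cto = BaseAgent("CTO", tools={"edit_code": lambda task: f"diff prepared for: {task}"})
output = cto.handle(Event("Add call logging", tools=["edit_code", "image_model"]))
```

The agent uses the one tool it owns and reports the one it lacks, which gives a coordinator enough information to route the remaining work to another agent.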
Chief Executive Officer
The CEO was responsible for executing the business plan, which included creating a roadmap, setting goals, and prioritizing tasks. It was also responsible for communicating with the Board of Directors and ensuring that the business plan was aligned with the goals and objectives. We ensured that the CEO would critique the progress of the startup and provide feedback to individual agents.
Through Cohere Connections, we gave our CEO access to the web and market data in order to do market research, build a business plan and refine the startup idea. The only guidance we gave the agents was the theme "AI for Social Good". From this, the CEO identified the dispatch and emergency response market as its target, found competitors in the industry, identified a market valuation and created a plan to build out the product.
When analyzing the market, the CEO reported: "The AI-driven public safety sector is projected to grow significantly, with a CAGR of around 30.2% from 2023 to 2031." Despite the irony, the agents also focused on AI safety and public perception.
Chief Technology Officer
Our CTO was a highly capable SWE agent, responsible for writing and editing code, pushing it to GitHub and continually developing the complexity of the software. The CTO used two critical tools to execute tasks.
Frontend
We used a web agent to let our CTO interact with Vercel v0, a very powerful frontend development tool. We realized that our SWE agent was great at editing backend code but struggled with frontend. Giving it access to v0 let the CTO prioritize system capability and offload the design work to Vercel's systems.
It started off with a really simple design and continued to iterate on v0 until the CTO agent was satisfied with the final result. Within the tool, we defined satisfaction as alignment of the product with the branding and the intended user experience, which included conversation summaries, sentiment, logging and dispatch. As this continued to iterate, it produced the following design.
Backend
While it looked good, we had to make it work. A separate tool we built, the SWE agent, could take the context from a codebase, identify where changes were needed and create a diff against the codebase. Then, it would preview the changes and, if they aligned with the initial intention, push them to a GitHub repository with a custom commit message. We found that running inference on Llama-70b through Groq was the most effective way to do this.
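The diff-then-preview step can be illustrated with Python's standard `difflib` module. This is a sketch under assumptions: the function names, the file path and the `approve` check are hypothetical, and a simple string match stands in for the model-based judgment of whether a change matches the intention.

```python
import difflib


def make_diff(original: str, updated: str, path: str) -> str:
    """Return a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


def approve(diff: str, required_text: str) -> bool:
    """Placeholder preview check: some added line must contain the required text."""
    added = [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")  # skip the file header
    ]
    return any(required_text in line for line in added)


before = "def log_call(call):\n    pass\n"
after = "def log_call(call):\n    print(call)\n"
diff = make_diff(before, after, "app/logging.py")

# Only a change that passes the preview would be committed and pushed.
ready_to_push = approve(diff, "print(call)")
```

Keeping the preview separate from the push means a rejected diff never reaches the repository, and the agent can retry with a new edit.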
Our agent built a functional full-stack app, with the ability to record an incoming call, analyze sentiment, summarize the events and log all of the outcomes. Since SWE agent and v0 use were distinct tools, frontend and backend were allowed to go back and forth. Iterations happen on both fronts in a real startup and we felt this was important to ensure product alignment was the highest priority of our agents.
Chief Marketing Officer
The CMO was responsible for creating branding materials, logos and website designs. Branding was standardized in Slack and we constantly ensured that the technical implementation always stayed in line with the branding guidelines. For logos, we added a tool to call FLUX-1 through Replicate to create a logo and a link for it.
For example, when the logo was being designed, the CEO realized it was not following the general brand guidelines. The CEO informed the CMO, who made the fix and informed the CTO. The CTO then updated the logo on the website, pushed a new commit and informed the team on Slack. We believe this is a strong example of how agents can work together.
Decision Making and Interoperability
The most important principle behind the agent interactions was that we did not want to hold their hands along the way; we wanted to give them freedom to make their own decisions. Here are a couple of decisions they made that we found very interesting:
- Deciding to hire an intern, writing a job posting and asking us to make the posting public. We received 50 applications in under an hour for this role on LinkedIn.
- Opting to use an existing API for a task instead of building a new one. This was a great way to save time and resources. It tried using Hume AI for sentiment analysis of audio.
Challenges
Agent alignment was very difficult for us to achieve. Agents would often choose a passive approach instead of being proactive; we often saw "Let's discuss this in our meeting later" or "I'll get back to you on that" during times of uncertainty. To have agents actively participate, we had the Board of Directors constantly remind agents about deadlines and the importance of always answering a query.
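One simple way to picture this reminder loop is a check for deferral phrases followed by a nudge. This is only a sketch: the phrase list comes from the two examples above, while the function names, the reminder wording and the deadline value are hypothetical. The actual Board of Directors was itself an agent rather than a keyword filter.

```python
from typing import Optional

# The two deferrals quoted above, lowercased for matching
DEFERRAL_PHRASES = (
    "let's discuss this in our meeting later",
    "i'll get back to you on that",
)


def normalize(text: str) -> str:
    """Lowercase and treat curly and straight apostrophes alike."""
    return text.lower().replace("\u2019", "'")


def is_passive(reply: str) -> bool:
    """True when the reply contains one of the known deferral phrases."""
    normalized = normalize(reply)
    return any(phrase in normalized for phrase in DEFERRAL_PHRASES)


def nudge(agent: str, reply: str, deadline: str) -> Optional[str]:
    """Return a reminder for a passive reply, or None if the reply is fine."""
    if not is_passive(reply):
        return None
    return f"{agent}, please answer the question directly. The deadline is {deadline}."


reminder = nudge("CMO", "I'll get back to you on that.", "Friday")
no_reminder = nudge("CTO", "The commit is pushed and the logo is live.", "Friday")
```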
Built by Rajan Agarwal, Simerus Mahesh, Josh Yan and Rita Xiang