May 24, 2024
in thoughts, interview notes

Agents in AutoGen

TL;DR

AutoGen agents unify different agent definitions.
When talking about multi vs. single agents, it is beneficial to clarify whether we refer to the interface or the architecture.

I often get asked two common questions:

1. What's an agent?
2. What are the pros and cons of multi vs. single agent?

This post collects my thoughts from several interviews and recent learnings.

There are many different definitions of agents. When building AutoGen, I was looking for the most generic notion that could incorporate all of them. To do that, we really need to think about the minimal set of concepts that are needed.

In AutoGen, we think of an agent as an entity that can act on behalf of human intent. An agent can send messages, receive messages, take actions, and respond to other agents. We think this is the minimal set of capabilities an agent needs to have underneath. Agents can have different types of backends that let them perform actions and generate replies. Some agents use AI models to generate replies. Others use functions to generate tool-based replies, and others use human input to reply to other agents. You can also have agents that mix these backends, or more complex agents that hold internal conversations among multiple agents. On the surface, though, other agents still perceive such an agent as a single entity to communicate with.
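To make the idea of one interface over pluggable backends concrete, here is a minimal sketch in plain Python. It is not the AutoGen API: the `Agent` class, the `Backend` alias, and the three backend functions are hypothetical names invented for illustration, and the model and human backends are stubs rather than real model calls or real user input.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical alias: a backend maps an incoming message to a reply.
Backend = Callable[[str], str]


@dataclass
class Agent:
    """Toy agent that can send, receive, and reply (illustrative only)."""
    name: str
    backend: Backend
    inbox: List[str] = field(default_factory=list)

    def send(self, message: str, recipient: "Agent") -> str:
        # Deliver the message and return the recipient's reply.
        return recipient.receive(message, sender=self)

    def receive(self, message: str, sender: "Agent") -> str:
        # Record who said what, then produce a reply.
        self.inbox.append(f"{sender.name}: {message}")
        return self.generate_reply(message)

    def generate_reply(self, message: str) -> str:
        # The backend decides how the reply is produced.
        return self.backend(message)


def tool_backend(message: str) -> str:
    # Function-based backend: runs a tool (here, a word counter).
    return f"{len(message.split())} words"


def scripted_human_backend(message: str) -> str:
    # Stand-in for human input; a real version might call input().
    return "approved"


def stub_model_backend(message: str) -> str:
    # Stand-in for an AI model call.
    return f"summary of: {message}"


user = Agent("user", scripted_human_backend)
counter = Agent("counter", tool_backend)
writer = Agent("writer", stub_model_backend)

tool_reply = user.send("one two three", counter)
model_reply = user.send("draft a plan", writer)
```

From the sender's point of view, `counter` and `writer` look the same: both accept a message and return a reply. Only the backend behind each one differs.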

With this definition, we can cover both very simple agents that solve simple tasks with a single backend and agents that are composed of multiple simpler agents. One can recursively build up more powerful agents. The agent concept in AutoGen covers all of these levels of complexity.
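The recursive composition described here can be sketched in a few lines of plain Python. Again, this is an illustration under assumptions, not AutoGen code: `SimpleAgent`, `CompositeAgent`, and the sequential hand-off between members are hypothetical choices made to keep the example short. A real inner conversation could use any turn-taking policy.

```python
from typing import Callable, List, Protocol


class Agent(Protocol):
    """Hypothetical minimal interface: a name and a way to reply."""
    name: str

    def generate_reply(self, message: str) -> str: ...


class SimpleAgent:
    """Agent with a single backend function."""

    def __init__(self, name: str, backend: Callable[[str], str]) -> None:
        self.name = name
        self.backend = backend

    def generate_reply(self, message: str) -> str:
        return self.backend(message)


class CompositeAgent:
    """Runs an inner conversation but exposes the same interface."""

    def __init__(self, name: str, members: List[Agent]) -> None:
        self.name = name
        self.members = members
        self.transcript: List[str] = []

    def generate_reply(self, message: str) -> str:
        # Pass the message through each inner agent in turn.
        for member in self.members:
            message = member.generate_reply(message)
            self.transcript.append(f"{member.name}: {message}")
        return message


drafter = SimpleAgent("drafter", lambda m: f"draft({m})")
reviewer = SimpleAgent("reviewer", lambda m: f"reviewed({m})")
team = CompositeAgent("team", [drafter, reviewer])

# A composite is itself an agent, so composites can nest.
publisher = SimpleAgent("publisher", lambda m: f"published({m})")
org = CompositeAgent("org", [team, publisher])

result = org.generate_reply("spec")
```

A caller of `org.generate_reply` never sees the inner conversation. The nested `team` looks like any other member, which is what allows more capable agents to be built up recursively.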