Use Cases
- Use cases
- Reference Agents
- Notebooks
- All Notebooks
- Group Chat with Customized Speaker Selection Method
- RAG OpenAI Assistants in AutoGen
- Using RetrieveChat with Qdrant for Retrieve Augmented Code Generation and Question Answering
- Auto Generated Agent Chat: Function Inception
- Task Solving with Provided Tools as Functions (Asynchronous Function Calls)
- Using Guidance with AutoGen
- Solving Complex Tasks with A Sequence of Nested Chats
- Group Chat
- Solving Multiple Tasks in a Sequence of Async Chats
- Auto Generated Agent Chat: Task Solving with Provided Tools as Functions
- Conversational Chess using non-OpenAI clients
- RealtimeAgent with local websocket connection
- Web Scraping using Apify Tools
- DeepSeek: Adding Browsing Capabilities to AG2
- Interactive LLM Agent Dealing with Data Stream
- Generate Dalle Images With Conversable Agents
- Supercharging Web Crawling with Crawl4AI
- RealtimeAgent in a Swarm Orchestration
- Perform Research with Multi-Agent Group Chat
- Agent Tracking with AgentOps
- Translating Video audio using Whisper and GPT-3.5-turbo
- Automatically Build Multi-agent System from Agent Library
- Auto Generated Agent Chat: Collaborative Task Solving with Multiple Agents and Human Users
- Structured output
- CaptainAgent
- Group Chat with Coder and Visualization Critic
- Cross-Framework LLM Tool Integration with AG2
- Using FalkorGraphRagCapability with agents for GraphRAG Question & Answering
- Demonstrating the `AgentEval` framework using the task of solving math problems as an example
- RealtimeAgent in a Swarm Orchestration using WebRTC
- A Uniform interface to call different LLMs
- From Dad Jokes To Sad Jokes: Function Calling with GPTAssistantAgent
- Solving Complex Tasks with Nested Chats
- Usage tracking with AutoGen
- Agent with memory using Mem0
- Using RetrieveChat Powered by PGVector for Retrieve Augmented Code Generation and Question Answering
- Tools with Dependency Injection
- Solving Multiple Tasks in a Sequence of Chats with Different Conversable Agent Pairs
- WebSurferAgent
- Using RetrieveChat Powered by MongoDB Atlas for Retrieve Augmented Code Generation and Question Answering
- Assistants with Azure Cognitive Search and Azure Identity
- ReasoningAgent - Advanced LLM Reasoning with Multiple Search Strategies
- Agentic RAG workflow on tabular data from a PDF file
- Making OpenAI Assistants Teachable
- Run a standalone AssistantAgent
- AutoBuild
- Solving Multiple Tasks in a Sequence of Chats
- Currency Calculator: Task Solving with Provided Tools as Functions
- Swarm Orchestration with AG2
- Use AutoGen in Databricks with DBRX
- Using a local Telemetry server to monitor a GraphRAG agent
- Auto Generated Agent Chat: Solving Tasks Requiring Web Info
- StateFlow: Build Workflows through State-Oriented Actions
- Groupchat with Llamaindex agents
- Using Neo4j's native GraphRAG SDK with AG2 agents for Question & Answering
- Agent Chat with Multimodal Models: LLaVA
- Group Chat with Retrieval Augmented Generation
- Runtime Logging with AutoGen
- SocietyOfMindAgent
- Agent Chat with Multimodal Models: DALLE and GPT-4V
- Agent Observability with OpenLIT
- Mitigating Prompt hacking with JSON Mode in Autogen
- Trip planning with a FalkorDB GraphRAG agent using a Swarm
- Language Agent Tree Search
- Auto Generated Agent Chat: Collaborative Task Solving with Coding and Planning Agent
- OptiGuide with Nested Chats in AutoGen
- Auto Generated Agent Chat: Task Solving with Langchain Provided Tools as Functions
- Writing a software application using function calls
- Auto Generated Agent Chat: GPTAssistant with Code Interpreter
- Adding Browsing Capabilities to AG2
- Agentchat MathChat
- Chatting with a teachable agent
- RealtimeAgent with gemini client
- Preprocessing Chat History with `TransformMessages`
- Chat with OpenAI Assistant using function call in AutoGen: OSS Insights for Advanced GitHub Data Analysis
- Websockets: Streaming input and output using websockets
- Task Solving with Code Generation, Execution and Debugging
- Agent Chat with Async Human Inputs
- Agent Chat with custom model loading
- Chat Context Dependency Injection
- Nested Chats for Tool Use in Conversational Chess
- Auto Generated Agent Chat: Group Chat with GPTAssistantAgent
- Cross-Framework LLM Tool for CaptainAgent
- Auto Generated Agent Chat: Teaching AI New Skills via Natural Language Interaction
- SQL Agent for Spider text-to-SQL benchmark
- Auto Generated Agent Chat: Task Solving with Code Generation, Execution, Debugging & Human Feedback
- OpenAI Assistants in AutoGen
- (Legacy) Implement Swarm-style orchestration with GroupChat
- Enhanced Swarm Orchestration with AG2
- Using RetrieveChat for Retrieve Augmented Code Generation and Question Answering
- Using Neo4j's graph database with AG2 agents for Question & Answering
- AgentOptimizer: An Agentic Way to Train Your LLM Agent
- Engaging with Multimodal Models: GPT-4V in AutoGen
- RealtimeAgent with WebRTC connection
- FSM - User can input speaker transition constraints
- Config loader utility functions
- Community Gallery
Generate Dalle Images With Conversable Agents
Generate images with conversable agents.
This notebook illustrates how to add the image generation capability to a conversable agent.
Some extra dependencies are needed for this notebook, which can be installed via pip:
pip install pyautogen[lmm]
For more information, please refer to the installation guide.
First, let’s import all the required modules to run this example.
import os
from IPython.display import display
from PIL.Image import Image
import autogen
from autogen.agentchat.contrib import img_utils
from autogen.agentchat.contrib.capabilities import generate_images
Let’s define our LLM configs.
gpt_config = {
"config_list": [{"model": "gpt-4-turbo-preview", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
gpt_vision_config = {
"config_list": [{"model": "gpt-4-vision-preview", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
dalle_config = {
"config_list": [{"model": "dall-e-3", "api_key": os.environ["OPENAI_API_KEY"]}],
"timeout": 120,
"temperature": 0.7,
}
Learn more about configuring LLMs for agents here.
Our system will consist of 2 main agents: 1. Image generator agent. 2. Critic agent.
The image generator agent will carry a conversation with the critic, and generate images based on the critic’s requests.
CRITIC_SYSTEM_MESSAGE = """You need to improve the prompt of the figures you saw.
How to create an image that is better in terms of color, shape, text (clarity), and other things.
Reply with the following format:
CRITICS: the image needs to improve...
PROMPT: here is the updated prompt!
If you have no critique or a prompt, just say TERMINATE
"""
def _is_termination_message(msg) -> bool:
# Detects if we should terminate the conversation
if isinstance(msg.get("content"), str):
return msg["content"].rstrip().endswith("TERMINATE")
elif isinstance(msg.get("content"), list):
for content in msg["content"]:
if isinstance(content, dict) and "text" in content:
return content["text"].rstrip().endswith("TERMINATE")
return False
def critic_agent() -> autogen.ConversableAgent:
return autogen.ConversableAgent(
name="critic",
llm_config=gpt_vision_config,
system_message=CRITIC_SYSTEM_MESSAGE,
max_consecutive_auto_reply=3,
human_input_mode="NEVER",
is_termination_msg=lambda msg: _is_termination_message(msg),
)
def image_generator_agent() -> autogen.ConversableAgent:
# Create the agent
agent = autogen.ConversableAgent(
name="dalle",
llm_config=gpt_vision_config,
max_consecutive_auto_reply=3,
human_input_mode="NEVER",
is_termination_msg=lambda msg: _is_termination_message(msg),
)
# Add image generation ability to the agent
dalle_gen = generate_images.DalleImageGenerator(llm_config=dalle_config)
image_gen_capability = generate_images.ImageGeneration(
image_generator=dalle_gen, text_analyzer_llm_config=gpt_config
)
image_gen_capability.add_to_agent(agent)
return agent
We’ll define extract_img
to help us extract the image generated by the
image generator agent.
def extract_images(sender: autogen.ConversableAgent, recipient: autogen.ConversableAgent) -> Image:
images = []
all_messages = sender.chat_messages[recipient]
for message in reversed(all_messages):
# The GPT-4V format, where the content is an array of data
contents = message.get("content", [])
for content in contents:
if isinstance(content, str):
continue
if content.get("type", "") == "image_url":
img_data = content["image_url"]["url"]
images.append(img_utils.get_pil_image(img_data))
if not images:
raise ValueError("No image data found in messages.")
return images
Start the converstion
dalle = image_generator_agent()
critic = critic_agent()
img_prompt = "A happy dog wearing a shirt saying 'I Love AutoGen'. Make sure the text is clear."
# img_prompt = "Ask me how I'm doing"
result = dalle.initiate_chat(critic, message=img_prompt)
dalle (to critic):
A happy dog wearing a shirt saying 'I Love AutoGen'. Make sure the text is clear.
--------------------------------------------------------------------------------
critic (to dalle):
CRITICS: the image needs to improve the contrast and size of the text to enhance its clarity, and the shirt's color should not clash with the dog's fur color to maintain a harmonious color scheme.
PROMPT: here is the updated prompt!
Create an image of a joyful dog with a coat of a contrasting color to its fur, wearing a shirt with bold, large text saying 'I Love AutoGen' for clear readability.
--------------------------------------------------------------------------------
dalle (to critic):
I generated an image with the prompt: Joyful dog, contrasting coat color to its fur, shirt with bold, large text "I Love AutoGen" for clear readability.<image>
--------------------------------------------------------------------------------
critic (to dalle):
CRITICS: the image effectively showcases a joyful dog with a contrasting shirt color, and the text 'I Love AutoGen' is large and bold, ensuring clear readability.
PROMPT: TERMINATE
--------------------------------------------------------------------------------
Let’s display all the images that was generated by Dalle
images = extract_images(dalle, critic)
for image in reversed(images):
display(image.resize((300, 300)))