# OpenAI Responses
OpenAI's Responses API is a significant evolution from the Chat Completions and Assistants APIs, offering support for both stateless interactions and structured, stateful conversations.
Additionally, it includes built-in tools for web search, image generation, apply_patch, shell command execution, and computer use. The robust multimodal capabilities of the API also set it apart. See OpenAI's documentation for further details.
AG2 provides two client implementations:

- **V1 Client** (`api_type: "responses"`): returns ChatCompletion-like responses
- **V2 Client** (`api_type: "responses_v2"`): returns rich `UnifiedResponse` objects with typed content blocks (recommended)
## Installation
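A typical install pulls in AG2 with its OpenAI extra (package and extra names here assume a standard AG2 setup; check your project's pinned versions):

```shell
pip install "ag2[openai]"
```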
### Environment Setup

Set your OpenAI API key as an environment variable. On Linux/macOS, use `export`; on Windows, use `setx` (or `set` for the current session only).
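Typical commands, assuming a bash/zsh shell on Linux/macOS (the Windows equivalent is shown as a comment so the block stays POSIX-runnable):

```shell
# Linux/macOS (bash/zsh): set the key for the current session
export OPENAI_API_KEY="your-api-key-here"

# Windows (Command Prompt), run instead of the export above:
#   setx OPENAI_API_KEY "your-api-key-here"
```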
## V2 Client (Recommended)

The V2 client (`api_type: "responses_v2"`) provides rich `UnifiedResponse` objects with typed content blocks, making it easier to work with multimodal content, reasoning tokens, and structured outputs.
### Key Features

| Feature | Description |
|---|---|
| Stateful Conversations | Automatic context tracking via `previous_response_id` |
| Rich Content Blocks | `TextContent`, `ReasoningContent`, `CitationContent`, `ImageContent`, `ToolCallContent` |
| Built-in Tools | `web_search`, `image_generation`, `apply_patch`, `shell` |
| Multimodal Support | Send and receive images |
| Structured Output | Pydantic models and JSON schemas |
| Cost Tracking | Token and image generation cost tracking |
| V1 Compatibility | `create_v1_compatible()` for ChatCompletion format |
### LLM Configuration

```python
config_list = [
    {
        "api_type": "responses_v2",  # Use V2 client
        "model": "gpt-4.1",
        "api_key": "your OpenAI Key goes here",  # Or use OPENAI_API_KEY env var
        # Optional: Enable built-in tools
        "built_in_tools": ["web_search", "image_generation", "apply_patch", "shell"],
    }
]
```
### Basic Usage

```python
from autogen.llm_clients.openai_responses_v2 import OpenAIResponsesV2Client

# Create the V2 client
client = OpenAIResponsesV2Client()

# Make a request
response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
})

# Access the response
print(f"Response ID: {response.id}")
print(f"Model: {response.model}")
print(f"Content: {response.messages[0].get_text()}")
print(f"Cost: ${response.cost:.6f}")
```
### Stateful Conversations

The V2 client automatically maintains conversation context:

```python
client = OpenAIResponsesV2Client()

# First message
response1 = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "My name is Alice. Remember this."}],
})

# Second message - context is automatically maintained
response2 = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is my name?"}],
})
# The model will remember "Alice"

# Reset conversation to start fresh
client.reset_conversation()
```
### Built-in Tools

#### Web Search

```python
response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What are the latest news about AI?"}],
    "built_in_tools": ["web_search"],
})

# Extract citations
citations = OpenAIResponsesV2Client.get_citations(response)
for citation in citations:
    print(f"- {citation.title}: {citation.url}")
```
#### Image Generation

```python
# Configure image output
client.set_image_output_params(quality="high", size="1024x1024", output_format="png")

response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Generate an image of a sunset"}],
    "built_in_tools": ["image_generation"],
})

# Extract generated images
images = OpenAIResponsesV2Client.get_generated_images(response)
for img in images:
    print(f"Image data URI: {img.data_uri[:50]}...")
```
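Decoding a returned data URI to raw bytes is standard base64 handling; a small sketch, assuming the common `data:image/png;base64,...` form:

```python
import base64


def data_uri_to_bytes(data_uri: str) -> bytes:
    """Split off the 'data:<mime>;base64,' header and decode the payload."""
    header, _, payload = data_uri.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URI")
    return base64.b64decode(payload)


# Round-trip example with a tiny payload
uri = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode()
assert data_uri_to_bytes(uri) == b"\x89PNG"
```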
#### Shell Commands

```python
# Configure shell security
client.set_shell_params(
    allowed_commands=["ls", "cat", "grep"],
    denied_commands=["rm", "sudo"],
    enable_command_filtering=True,
)

response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "List files in the current directory"}],
    "built_in_tools": ["shell"],
})

# Extract shell calls
shell_calls = OpenAIResponsesV2Client.get_shell_calls(response)
```
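Conceptually, command filtering checks the program name against the deny list first, then the allow list; an illustrative standalone sketch (not AG2's actual implementation):

```python
import shlex


def is_command_allowed(command: str, allowed: set[str], denied: set[str]) -> bool:
    """Deny wins over allow; an empty allow list permits anything not denied."""
    program = shlex.split(command)[0]
    if program in denied:
        return False
    return not allowed or program in allowed


allowed = {"ls", "cat", "grep"}
denied = {"rm", "sudo"}
assert is_command_allowed("ls -la", allowed, denied)
assert not is_command_allowed("rm -rf /", allowed, denied)
assert not is_command_allowed("curl https://example.com", allowed, denied)
```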
#### Apply Patch (File Operations)

```python
response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Create a file called hello.py with a hello world function"}],
    "built_in_tools": ["apply_patch"],
    "workspace_dir": "/path/to/workspace",
    "allowed_paths": ["**/*.py"],
})
```
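The `allowed_paths` patterns are glob-style; Python's `fnmatch` gives a rough feel for how such matching behaves (AG2's exact matching rules may differ):

```python
from fnmatch import fnmatch

# fnmatch's '*' crosses path separators, so '**/*.py' matches any .py file
# at least one directory deep.
assert fnmatch("src/hello.py", "**/*.py")
assert fnmatch("a/b/c/util.py", "**/*.py")
assert not fnmatch("notes.txt", "**/*.py")
```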
### Multimodal Support

```python
# Create a multimodal message with images
message = OpenAIResponsesV2Client.create_multimodal_message(
    text="What do you see in this image?",
    images=["https://example.com/image.jpg"],
    role="user",
)

response = client.create({
    "model": "gpt-4.1",
    "messages": [message],
})
```
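The helper above builds a message whose content is a list of typed parts; constructing one by hand looks roughly like this (the `input_text`/`input_image` part names follow OpenAI's Responses input format and are an assumption about what the helper emits):

```python
# Hand-built multimodal message in Responses-style input format
message = {
    "role": "user",
    "content": [
        {"type": "input_text", "text": "What do you see in this image?"},
        {"type": "input_image", "image_url": "https://example.com/image.jpg"},
    ],
}

assert message["content"][0]["type"] == "input_text"
```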
### Structured Output

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    occupation: str

response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Generate a fictional person's profile"}],
    "response_format": Person,
})

# Get the parsed object
parsed = OpenAIResponsesV2Client.get_parsed_object(response)
print(f"Name: {parsed.name}, Age: {parsed.age}, Occupation: {parsed.occupation}")
```
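Under the hood, structured output amounts to validating the model's JSON against the schema; with Pydantic, that step is just `model_validate_json`:

```python
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int
    occupation: str


# What parsing a structured response boils down to:
raw_json = '{"name": "Ada", "age": 36, "occupation": "mathematician"}'
person = Person.model_validate_json(raw_json)
assert person.name == "Ada" and person.age == 36
```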
### Cost Tracking

```python
# Per-request usage
usage = OpenAIResponsesV2Client.get_usage(response)
print(f"Tokens: {usage['total_tokens']}, Cost: ${usage['cost']:.6f}")

# Cumulative usage across all requests
cumulative = client.get_cumulative_usage()
print(f"Total cost: ${cumulative['total_cost']:.6f}")

# Reset cost tracking
client.reset_all_costs()
```
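Cumulative tracking is conceptually just summing per-request usage; a standalone sketch of the idea (not AG2's implementation):

```python
class UsageTracker:
    """Accumulate token counts and dollar cost across requests."""

    def __init__(self) -> None:
        self.total_tokens = 0
        self.total_cost = 0.0

    def record(self, tokens: int, cost: float) -> None:
        self.total_tokens += tokens
        self.total_cost += cost

    def reset(self) -> None:
        self.total_tokens = 0
        self.total_cost = 0.0


tracker = UsageTracker()
tracker.record(tokens=1200, cost=0.0031)
tracker.record(tokens=800, cost=0.0019)
assert tracker.total_tokens == 2000
assert abs(tracker.total_cost - 0.005) < 1e-9
```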
### V1 Backward Compatibility

For code expecting ChatCompletion format:

```python
response = client.create_v1_compatible({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello!"}],
})

# Access like standard ChatCompletion
print(response.choices[0].message.content)
print(response.usage.total_tokens)
```
### Agent Integration

```python
from autogen import ConversableAgent

config_list = [{"api_type": "responses_v2", "model": "gpt-4.1"}]
llm_config = {"config_list": config_list}

assistant = ConversableAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="You are a helpful AI assistant.",
)

# A second, LLM-free agent drives the conversation
user = ConversableAgent(
    name="user",
    llm_config=False,
    human_input_mode="NEVER",
)

result = user.initiate_chat(
    assistant,
    message="What is the capital of France?",
    max_turns=2,
)
```
## V1 Client

The V1 client (`api_type: "responses"`) returns ChatCompletion-like responses for backward compatibility.
### LLM Configuration

```python
config_list = [
    {
        "api_type": "responses",  # Use V1 client
        "model": "gpt-4.1",
        "api_key": "your OpenAI Key goes here",
        "built_in_tools": ["web_search", "image_generation", "apply_patch", "shell"],
    }
]
```
Warning
The V1 client is maintained for backward compatibility. For new projects, we recommend using the V2 client (api_type: "responses_v2") for access to rich content blocks and better type safety.
## Built-in Tools Reference

| Tool | Description |
|---|---|
| `web_search` | Search the web for real-time information with citations |
| `image_generation` | Generate images using DALL-E or GPT-Image models |
| `apply_patch` | Create, update, and delete files using structured diffs |
| `apply_patch_async` | Async version of `apply_patch` for better performance |
| `shell` | Execute shell commands with configurable sandboxing |
Note
The apply_patch tool enables agents to create, update, and delete files using structured diffs, making it ideal for code editing tasks.
The shell tool allows secure, filtered command-line execution. When enabling it, consult the Shell Tool documentation for configuration options and security guidelines.