OpenAI Responses API V2 Client - Complete Guide#
This notebook demonstrates the OpenAIResponsesV2Client which implements the new OpenAI Responses API with rich UnifiedResponse objects.
Key Features#
- Stateful Conversations: Maintain conversation context via previous_response_id
- Built-in Tools: Web search, image generation, apply_patch
- Rich Content Blocks: TextContent, ReasoningContent, CitationContent, ImageContent, ToolCallContent
- Multimodal Support: Send and receive images
- Structured Output: Pydantic models and JSON schema support
- Cost Tracking: Token and image generation cost tracking
- Agent Integration: Works with AG2 agents for single, two-agent, and group chat
Requirements#
AG2 requires Python>=3.10. Install the required packages (e.g., `pip install ag2[openai]`) before running this notebook.
Setup#
Set your OpenAI API key as an environment variable or pass it directly to the client.
import os

# Set your API key (or use environment variable OPENAI_API_KEY)
# os.environ["OPENAI_API_KEY"] = "sk-..."
1. Basic Usage#
The OpenAIResponsesV2Client returns rich UnifiedResponse objects with typed content blocks.
from autogen.llm_clients.openai_responses_v2 import OpenAIResponsesV2Client
# Create the V2 client
client = OpenAIResponsesV2Client()
# Make a simple request
response = client.create({
"model": "gpt-5-nano",
"messages": [
{"role": "user", "content": "how are you? tell me about yourself? and what is a machine? in one line"}
],
})
# Access the response
print(f"Response ID: {response.id}")
print(f"Model: {response.model}")
print(f"Content: {response.messages[0].get_text()}")
Understanding UnifiedResponse Structure#
The UnifiedResponse contains rich, typed content blocks:
from autogen.llm_clients.models.content_blocks import (
ReasoningContent,
TextContent,
)
# Inspect the response structure
print(f"Number of messages: {len(response.messages)}")
print(f"Usage: {response.usage}")
print(f"Cost: ${response.cost:.6f}")
# Iterate through content blocks
for msg in response.messages:
print(f"\nRole: {msg.role}")
for block in msg.content:
if isinstance(block, TextContent):
print(f" Text: {block.text[:100]}..." if len(block.text) > 100 else f" Text: {block.text}")
elif isinstance(block, ReasoningContent):
# Note that OpenAI may not return reasoning content with its API
print(f" Reasoning: {block.text[:100]}...")
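The isinstance-based dispatch above is the core pattern for working with typed content blocks. Here is a minimal, self-contained sketch of the same pattern using hypothetical stand-in dataclasses (StubText and StubReasoning are illustrative stand-ins, not the real TextContent/ReasoningContent classes):

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the real content-block classes, used only to
# illustrate the isinstance-dispatch pattern shown above.
@dataclass
class StubText:
    text: str

@dataclass
class StubReasoning:
    text: str

def flatten_text(blocks: list) -> str:
    """Concatenate only plain-text blocks, skipping reasoning blocks."""
    return " ".join(b.text for b in blocks if isinstance(b, StubText))

blocks = [StubReasoning(text="thinking..."), StubText(text="Hello"), StubText(text="world")]
print(flatten_text(blocks))  # Hello world
```

The same dispatch extends naturally to CitationContent, ImageContent, and ToolCallContent: add one `elif isinstance(...)` branch per block type you care about.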
2. Stateful Conversations#
The Responses API is stateful: it maintains conversation context server-side using previous_response_id.
# Create a new client for stateful conversation
stateful_client = OpenAIResponsesV2Client()
# First message
response1 = stateful_client.create({
"model": "gpt-4.1",
"messages": [{"role": "user", "content": "My name is Alice. Remember this."}],
})
print(f"Response 1: {response1.messages[0].get_text()}")
print(f"Response ID: {response1.id}")
# Second message - the client automatically tracks state
response2 = stateful_client.create({"model": "gpt-4.1", "messages": [{"role": "user", "content": "What is my name?"}]})
print(f"Response 2: {response2.messages[0].get_text()}")
print("\nThe model remembered the context from the previous turn!")
# Reset conversation state to start fresh
stateful_client.reset_conversation()
response3 = stateful_client.create({"model": "gpt-4.1", "messages": [{"role": "user", "content": "What is my name?"}]})
print(f"After reset: {response3.messages[0].get_text()}")
print("\nThe model no longer has context from previous conversation.")
Manual State Control#
You can also manually control the conversation state:
# Get current state
current_state = stateful_client._get_previous_response_id()
print(f"Current state: {current_state}")
# Start two independent conversation branches and record their response IDs
response_a = client.create({"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hello, my name is Alice"}]})
response_a_id = response_a.id
response_b = client.create({"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hello, my name is Hatter"}]})
response_b_id = response_b.id
# Continue branch A by pointing the client at response_a
client._set_previous_response_id(response_a_id)
response_a1 = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What's my name?"}],
})
print(f"Branch A: {response_a1.messages[0].get_text()}")
response_a1_id = client._get_previous_response_id()
# Continue branch B by pointing the client at response_b
client._set_previous_response_id(response_b_id)
response_b1 = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What's my name?"}],
})
print(f"Branch B: {response_b1.messages[0].get_text()}")
response_b1_id = client._get_previous_response_id()
print("response_a1_id", response_a1_id)
print("response_b1_id", response_b1_id)
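Conceptually, the client's state is just the ID of the most recent response. A minimal stand-in class (illustrative only, not the real implementation) sketches that bookkeeping:

```python
class ConversationState:
    """Minimal stand-in for the client's previous_response_id bookkeeping
    (illustrative sketch, not the real implementation)."""

    def __init__(self) -> None:
        self.previous_response_id = None

    def record(self, response_id: str) -> None:
        # After each create(), the client stores the newest response ID
        self.previous_response_id = response_id

    def branch_from(self, response_id: str) -> None:
        # Analogous to _set_previous_response_id(): continue from an earlier turn
        self.previous_response_id = response_id

    def reset(self) -> None:
        # Analogous to reset_conversation(): start fresh
        self.previous_response_id = None

state = ConversationState()
state.record("resp_a")
state.record("resp_b")
state.branch_from("resp_a")  # the next request would continue from resp_a
print(state.previous_response_id)  # resp_a
```

Because the state is a single ID, branching a conversation is just a matter of pointing the client back at an earlier response, as the code above demonstrates with the real client.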
3. Multimodal Support#
Send images in your messages using various formats.
# Create a multimodal message with an image URL
multimodal_message = OpenAIResponsesV2Client.create_multimodal_message(
text="What do you see in this image?",
images=["https://images.unsplash.com/photo-1587300003388-59208cc962cb?w=400"],
role="user",
)
print("Multimodal message structure:")
print(multimodal_message)
# Send multimodal request
mm_client = OpenAIResponsesV2Client()
response = mm_client.create({
"model": "gpt-4.1", # Use a vision-capable model
"messages": [multimodal_message],
})
print(f"Image description: {response.messages[0].get_text()}")
4. Built-in Tools#
The Responses API provides built-in tools that don't require function definitions. The client supports an array of built-in tools: ["web_search", "image_generation", "apply_patch", "apply_patch_async", "shell_tool"]
- web_search: Enables the model to search the web for real-time information and returns results with citations.
- image_generation: Allows the model to generate images from text descriptions using DALL-E or GPT-Image models.
- apply_patch: Enables file operations (create, update, delete files) in a workspace directory with path restrictions.
- apply_patch_async: Same as apply_patch but executes file operations asynchronously for better performance.
- shell: Executes shell commands with configurable sandboxing, command filtering, and security restrictions.
4.1 Web Search#
# Enable web search
search_client = OpenAIResponsesV2Client()
response = search_client.create({
"model": "gpt-4.1",
"messages": [{"role": "user", "content": "What is the latest news about AI?"}],
"built_in_tools": ["web_search"],
})
print(f"Response: {response.messages[0].get_text()[:500]}...")
# Extract citations from the response
citations = OpenAIResponsesV2Client.get_citations(response)
print(f"\nFound {len(citations)} citations:")
for citation in citations[:5]: # Show first 5
print(f" - {citation.title}: {citation.url}")
4.2 Image Generation#
import base64
from IPython.display import Image, display
# Enable image generation
image_client = OpenAIResponsesV2Client()
# Configure image output parameters
image_client.set_image_output_params(quality="high", size="1024x1024", output_format="png")
response = image_client.create({
"model": "gpt-4.1",
"messages": [{"role": "user", "content": "Generate an image of a tree with fruit"}],
"built_in_tools": ["image_generation"],
})
# Extract generated images
images = OpenAIResponsesV2Client.get_generated_images(response)
print(f"Generated {len(images)} image(s)")
if images:
# Get the data URI
data_uri = images[0].data_uri
# Extract base64 data (remove the "data:image/png;base64," prefix)
if data_uri and data_uri.startswith("data:"):
# Split on comma to get base64 data
base64_data = data_uri.split(",", 1)[1]
# Display the image
display(Image(data=base64.b64decode(base64_data)))
# Check image generation costs
print(f"Image costs: ${image_client.get_image_costs():.4f}")
print(f"Total costs: ${image_client.get_total_costs():.4f}")
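The data-URI handling above can be factored into a small standalone helper. This is a sketch independent of the client, using only the standard library:

```python
import base64

def decode_data_uri(data_uri: str) -> tuple[str, bytes]:
    """Split a data URI like 'data:image/png;base64,...' into (mime_type, raw bytes)."""
    if not data_uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, b64_data = data_uri.split(",", 1)
    # header looks like 'data:image/png;base64'
    mime_type = header[len("data:"):].split(";", 1)[0]
    return mime_type, base64.b64decode(b64_data)

sample = "data:image/png;base64," + base64.b64encode(b"PNG...").decode()
mime, raw = decode_data_uri(sample)
print(mime, raw)  # image/png b'PNG...'
```

The bytes returned can be passed directly to `IPython.display.Image(data=...)` as in the cell above, or written to a file.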
4.3 Structured Output#
from pydantic import BaseModel
from autogen.llm_clients.openai_responses_v2 import OpenAIResponsesV2Client
# Define a Pydantic model for structured output
class Person(BaseModel):
name: str
age: int
occupation: str
# Request structured output
struct_client = OpenAIResponsesV2Client()
response = struct_client.create({
"model": "gpt-4.1",
"messages": [{"role": "user", "content": "Generate a fictional person's profile"}],
"response_format": Person,
})
# Get the parsed object
parsed = OpenAIResponsesV2Client.get_parsed_object(response)
if parsed:
print(parsed)
print("---------------------------------")
print(f"Name: {parsed.name}")
print(f"Age: {parsed.age}")
print(f"Occupation: {parsed.occupation}")
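Under the hood, structured output amounts to validating the model's JSON text against your schema. A standard-library-only sketch of that step, using a dataclass stand-in rather than the real Pydantic path (PersonStub and parse_person are illustrative, not part of the client):

```python
import json
from dataclasses import dataclass

# Dataclass stand-in for the Pydantic Person model above.
@dataclass
class PersonStub:
    name: str
    age: int
    occupation: str

def parse_person(raw_json: str) -> PersonStub:
    """Validate the model's JSON reply into a typed object (sketch)."""
    data = json.loads(raw_json)
    return PersonStub(
        name=str(data["name"]),
        age=int(data["age"]),
        occupation=str(data["occupation"]),
    )

person = parse_person('{"name": "Ada", "age": 36, "occupation": "Engineer"}')
print(person.name, person.age)  # Ada 36
```

Pydantic does this validation (plus coercion and error reporting) automatically, which is why passing the model class as `response_format` is the recommended path.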
5. Cost Tracking#
The V2 client tracks both token costs and image generation costs.
cost_client = OpenAIResponsesV2Client()
# Make several requests
for i in range(3):
response = cost_client.create({"model": "gpt-4.1", "messages": [{"role": "user", "content": f"Count to {i + 1}"}]})
# Per-request cost
usage = OpenAIResponsesV2Client.get_usage(response)
print(f"Request {i + 1}: {usage['total_tokens']} tokens, ${usage['cost']:.6f}")
# Get cumulative usage
cumulative = cost_client.get_cumulative_usage()
print("\nCumulative Usage:")
print(f" Total prompt tokens: {cumulative['prompt_tokens']}")
print(f" Total completion tokens: {cumulative['completion_tokens']}")
print(f" Total tokens: {cumulative['total_tokens']}")
print(f" Token cost: ${cumulative['token_cost']:.6f}")
print(f" Image cost: ${cumulative['image_cost']:.6f}")
print(f" Total cost: ${cumulative['total_cost']:.6f}")
# Reset cost tracking
cost_client.reset_all_costs()
print(f"After reset: ${cost_client.get_total_costs():.6f}")
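The cumulative totals printed above can be reproduced by folding the per-request usage dicts. A sketch, where the key names follow the printouts above and the per-request shape is assumed to match what `get_usage()` returns:

```python
def accumulate_usage(per_request: list[dict]) -> dict:
    """Fold per-request usage dicts into cumulative totals (sketch; key names
    follow the notebook printouts above, per-request keys are assumed)."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "token_cost": 0.0}
    for usage in per_request:
        totals["prompt_tokens"] += usage["prompt_tokens"]
        totals["completion_tokens"] += usage["completion_tokens"]
        totals["total_tokens"] += usage["total_tokens"]
        totals["token_cost"] += usage["cost"]
    return totals

print(accumulate_usage([
    {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15, "cost": 0.0001},
    {"prompt_tokens": 8, "completion_tokens": 4, "total_tokens": 12, "cost": 0.0002},
]))
```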
6. V1 Backward Compatibility#
For code that expects ChatCompletion format, use create_v1_compatible().
v2_client = OpenAIResponsesV2Client()
# Get ChatCompletion-like response
response = v2_client.create_v1_compatible({"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hello!"}]})
# Access like standard ChatCompletion
print(f"Type: {type(response).__name__}")
print(f"Content: {response.choices[0].message.content}")
print(f"Tokens: {response.usage.total_tokens}")
print(f"Cost: ${response.cost:.6f}")
7. Agent Integration#
The V2 client integrates with AG2 agents for conversational AI workflows.
7.1 Single Agent#
from autogen import ConversableAgent
# Configure LLM with Responses API
config_list = [
{
"model": "gpt-5-nano",
"api_type": "responses_v2", # Use Responses API
}
]
llm_config = {"config_list": config_list}
def math_tool(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # demo only: eval is unsafe on untrusted input
        return str(result)
    except Exception as e:
        return f"Error: {e}"
# Create a single assistant agent
assistant = ConversableAgent(
name="assistant",
llm_config=llm_config,
    system_message="You are a helpful AI assistant who can do math. Use the math_tool to do math.",
functions=[math_tool],
)
assistant.register_for_execution()(math_tool)
# Start a conversation
result = assistant.run(
    message="use a tool to perform 2+2",
    max_turns=2,
)
result.process()
7.2 Two-Agent Chat#
# Create two specialized agents
researcher = ConversableAgent(
name="researcher",
llm_config=llm_config,
system_message="""You are a research assistant. Your job is to:
1. Analyze questions thoroughly
2. Provide detailed, factual information
3. Cite sources when possible""",
)
critic = ConversableAgent(
name="critic",
llm_config=llm_config,
system_message="""You are a critical reviewer. Your job is to:
1. Review the researcher's findings
2. Point out any gaps or inaccuracies
3. Suggest improvements
Say 'TERMINATE' when the research is satisfactory.""",
)
# Two-agent collaboration
response = researcher.run(
critic, message="Research the benefits and drawbacks of renewable energy sources.", max_turns=2
)
response.process()
7.3 Group Chat#
# Create multiple specialized agents for group chat
planner = ConversableAgent(
name="planner",
llm_config=llm_config,
system_message="""You are a project planner. Break down tasks into actionable steps.
Focus on creating clear, organized plans. Do not do any coding.""",
is_termination_msg=lambda x: "TERMINATE" in x.get("content", ""),
)
developer = ConversableAgent(
name="developer",
llm_config=llm_config,
system_message="""You are a software developer. Implement solutions based on the plan.
Write clean, well-documented code. Do not create plans.""",
is_termination_msg=lambda x: "TERMINATE" in x.get("content", ""),
)
reviewer = ConversableAgent(
name="reviewer",
llm_config=llm_config,
system_message="""You are a code reviewer. Review implementations for:
1. Correctness
2. Best practices
3. Potential improvements
Say 'TERMINATE' when the solution is complete and reviewed.""",
is_termination_msg=lambda x: "TERMINATE" in x.get("content", ""),
)
# Create group chat
from autogen.agentchat import run_group_chat
from autogen.agentchat.group.patterns import AutoPattern
pattern = AutoPattern(
initial_agent=planner,
agents=[planner, developer, reviewer],
group_manager_args={"llm_config": llm_config},
)
response = run_group_chat(
pattern=pattern,
messages="Create a Python function that calculates the Fibonacci sequence up to n terms.",
max_rounds=4,
)
response.process()
8. Advanced: Custom Function Tools#
Combine built-in tools with custom function tools.
# Define custom tools
def get_weather(city: str) -> str:
"""Get the current weather for a city."""
# Mock implementation
return f"The weather in {city} is sunny, 72°F"
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # demo only: eval is unsafe on untrusted input
        return str(result)
    except Exception as e:
        return f"Error: {e}"
# Define tool schemas
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city",
"parameters": {
"type": "object",
                "properties": {"city": {"type": "string", "description": "City name"}},
"required": ["city"],
},
},
},
{
"type": "function",
"function": {
"name": "calculate",
"description": "Evaluate a math expression",
"parameters": {
"type": "object",
                "properties": {"expression": {"type": "string", "description": "Math expression"}},
"required": ["expression"],
},
},
},
]
print("Tools defined successfully!")
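Both schema dicts above share the same shape, so a small helper can build them. This is a sketch (the `function_tool` helper is hypothetical, not part of the client); it only covers string-typed parameters, which is all the two tools above need:

```python
def function_tool(name: str, description: str, params: dict, required: list) -> dict:
    """Build an OpenAI-style function-tool schema from simple inputs (sketch;
    supports string-typed parameters only)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {p: {"type": "string", "description": d} for p, d in params.items()},
                "required": required,
            },
        },
    }

weather_tool = function_tool("get_weather", "Get current weather for a city", {"city": "City name"}, ["city"])
print(weather_tool["function"]["name"])  # get_weather
```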
from autogen.llm_clients.openai_responses_v2 import OpenAIResponsesV2Client, TextContent, ToolCallContent
# Use custom tools with the V2 client
tools_client = OpenAIResponsesV2Client()
response = tools_client.create({
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "What's 25 * 4 + 10?"}],
"tools": tools,
})
# Check for tool calls
for msg in response.messages:
for block in msg.content:
if isinstance(block, ToolCallContent):
print(f"Tool call: {block.name}({block.arguments})")
elif isinstance(block, TextContent):
print(f"Text: {block.text}")
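When the model returns a ToolCallContent block, its arguments typically arrive as a JSON string. A sketch of dispatching such a call to the local functions defined earlier (the registry and `dispatch` helper are illustrative, not part of the client):

```python
import json

def get_weather(city: str) -> str:
    """Mock implementation, matching the tool defined above."""
    return f"The weather in {city} is sunny, 72°F"

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        return str(eval(expression))  # demo only: eval is unsafe on untrusted input
    except Exception as e:
        return f"Error: {e}"

# Map tool names (as they appear in the schemas) to local callables
REGISTRY = {"get_weather": get_weather, "calculate": calculate}

def dispatch(name: str, arguments: str) -> str:
    """Look up the tool by name and call it with the JSON-decoded arguments."""
    return REGISTRY[name](**json.loads(arguments))

print(dispatch("calculate", '{"expression": "25 * 4 + 10"}'))  # 110
```

The result string would then be sent back to the model as a tool-result message so it can compose its final answer.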
Summary#
The OpenAIResponsesV2Client provides:
| Feature | Description |
|---|---|
| Stateful Conversations | Automatic context tracking via previous_response_id |
| Rich Content Blocks | TextContent, ReasoningContent, CitationContent, ImageContent, ToolCallContent |
| Built-in Tools | Web search, image generation, apply_patch |
| Multimodal Support | Send and receive images |
| Structured Output | Pydantic models and JSON schemas |
| Cost Tracking | Token and image generation cost tracking |
| V1 Compatibility | create_v1_compatible() for ChatCompletion format |
| Agent Integration | Works with AG2 single, two-agent, and group chat |
For more information, see the AG2 documentation.