
Anthropic Structured Outputs with AG2#


Author: Yixuan Zhai

This notebook demonstrates how to use Anthropic’s structured outputs feature with AG2 agents. Structured outputs guarantee schema-compliant responses through constrained decoding, eliminating parsing errors and ensuring type safety.

Overview#

Anthropic’s structured outputs feature provides two powerful modes:

  1. JSON Outputs (response_format): Get validated JSON responses matching a specific schema
  2. Strict Tool Use (strict: true): Guaranteed schema validation for tool inputs

Key Benefits#

  • Always Valid: No more JSON.parse() errors
  • Type Safe: Guaranteed field types and required fields
  • Reliable: No retries needed for schema violations
  • Dual Modes: JSON for data extraction, strict tools for agentic workflows

Requirements#

  • Claude Sonnet 4.5 (claude-sonnet-4-5) or Claude Opus 4.1 (claude-opus-4-1)
  • Anthropic SDK >= 0.74.1
  • Beta header: structured-outputs-2025-11-13 (automatically applied by AG2)

Setup#

First, let’s install the required dependencies and set up our environment.

import os

from pydantic import BaseModel

import autogen

# Ensure you have your Anthropic API key set
# os.environ["ANTHROPIC_API_KEY"] = "your-api-key-here"

# Verify the API key is set
assert os.getenv("ANTHROPIC_API_KEY"), "Please set ANTHROPIC_API_KEY environment variable"
print("✅ Environment configured")

Example 1: JSON Structured Outputs with Pydantic Models#

The most common use case is extracting structured data from unstructured text. We’ll use Pydantic models to define our schema and get validated JSON responses.

Use Case: Mathematical Reasoning#

Let’s create an agent that solves math problems and returns structured step-by-step reasoning.

# Define the structured output schema using Pydantic
class Step(BaseModel):
    """A single step in mathematical reasoning."""

    explanation: str
    output: str

class MathReasoning(BaseModel):
    """Structured output for mathematical problem solving."""

    steps: list[Step]
    final_answer: str

    def format(self) -> str:
        """Format the response for display."""
        steps_output = "\n".join(
            f"Step {i + 1}: {step.explanation}\n  Output: {step.output}" for i, step in enumerate(self.steps)
        )
        return f"{steps_output}\n\nFinal Answer: {self.final_answer}"
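The schema and its format() method can be sanity-checked locally before any API call. The sketch below uses stdlib dataclasses as stand-ins for the Pydantic models above (format() behaves identically), and the step values are hypothetical examples:

```python
from dataclasses import dataclass


@dataclass
class Step:
    explanation: str
    output: str


@dataclass
class MathReasoning:
    steps: list
    final_answer: str

    def format(self) -> str:
        # Same formatting logic as the Pydantic model above
        steps_output = "\n".join(
            f"Step {i + 1}: {step.explanation}\n  Output: {step.output}"
            for i, step in enumerate(self.steps)
        )
        return f"{steps_output}\n\nFinal Answer: {self.final_answer}"


# Hand-built instance, shaped like what beta.messages.parse() would return
reasoning = MathReasoning(
    steps=[
        Step("Subtract 7 from both sides", "3x = 15"),
        Step("Divide both sides by 3", "x = 5"),
    ],
    final_answer="x = 5",
)
print(reasoning.format())
```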

# Configure LLM with structured output
llm_config = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
            "response_format": MathReasoning,  # Enable structured outputs
        }
    ],
}

# Create agents
user_proxy = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)

math_assistant = autogen.AssistantAgent(
    name="MathAssistant",
    system_message="You are a math tutor. Solve problems step by step.",
    llm_config=llm_config,
)

print("✅ Example 1 configured: Math reasoning with structured outputs")
# Ask the assistant to solve a math problem
chat_result = user_proxy.initiate_chat(
    math_assistant,
    message="Solve the equation: 3x + 7 = 22",
    max_turns=1,
)

# The response is automatically formatted using the format() method
print("\n" + "=" * 60)
print("STRUCTURED OUTPUT RESULT:")
print("=" * 60)
print(chat_result.chat_history[-1]["content"])

How It Works#

  1. Schema Definition: Pydantic models define the expected structure
  2. Beta API: AG2 automatically uses beta.messages.parse() for Pydantic models
  3. Constrained Decoding: Claude generates output that strictly follows the schema
  4. FormatterProtocol: If your model has a format() method, it’s automatically called

Benefits: - ✅ No JSON parsing errors - ✅ Guaranteed schema compliance - ✅ Type-safe field access - ✅ Custom formatting support

Example 2: Strict Tool Use for Type-Safe Function Calls#

Strict tool use ensures that Claude’s tool inputs exactly match your schema. This is critical for production agentic systems where invalid parameters can break workflows.

Use Case: Weather API with Validated Inputs#

Without strict mode, Claude might return "celsius" as a string when you expect an enum, or "2" instead of 2. Strict mode guarantees correct types.

# Define a tool function
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the weather for a location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: Temperature unit (celsius or fahrenheit)
    """
    # In a real application, this would call a weather API
    return f"Weather in {location}: 22°{unit.upper()[0]}, partly cloudy"
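Because get_weather is a plain Python function, it can be smoke-tested directly before registering it with any agent:

```python
def get_weather(location: str, unit: str = "celsius") -> str:
    """Stubbed weather lookup (same logic as the tool above)."""
    return f"Weather in {location}: 22°{unit.upper()[0]}, partly cloudy"


print(get_weather("Boston, MA"))                # Weather in Boston, MA: 22°C, partly cloudy
print(get_weather("Austin, TX", "fahrenheit"))  # Weather in Austin, TX: 22°F, partly cloudy
```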

# Configure LLM with strict tool
llm_config_strict = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
        }
    ],
    "functions": [
        {
            "name": "get_weather",
            "description": "Get the weather for a location",
            "strict": True,  # Enable strict schema validation ✨
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"},
                },
                "required": ["location"],
            },
        }
    ],
}

# Create agents
weather_assistant = autogen.AssistantAgent(
    name="WeatherAssistant",
    system_message="You help users get weather information. Use the get_weather function.",
    llm_config=llm_config_strict,
)

user_proxy_2 = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,
)

# Register function on both agents
# Assistant needs it for LLM awareness, UserProxy executes it
weather_assistant.register_function({"get_weather": get_weather})
user_proxy_2.register_function({"get_weather": get_weather})

print("✅ Example 2 configured: Strict tool use for weather queries")
# Query the weather
chat_result = user_proxy_2.initiate_chat(
    weather_assistant,
    message="What's the weather in Boston, MA?",
    max_turns=2,
)

# Verify tool call had strict typing
print("\n" + "=" * 60)
print("TOOL CALL VERIFICATION:")
print("=" * 60)

import json

for message in chat_result.chat_history:
    if message.get("tool_calls"):
        tool_call = message["tool_calls"][0]
        args = json.loads(tool_call["function"]["arguments"])
        print(f"Function: {tool_call['function']['name']}")
        print(f"Arguments: {args}")
        print(f"✅ location type: {type(args['location']).__name__}")
        if "unit" in args:
            print(f"✅ unit value: {args['unit']} (valid enum)")
        break

Why Strict Tool Use Matters#

Without strict: true:

  • Claude might return {"location": "Boston", "unit": "Celsius"} (wrong case)
  • Or {"passengers": "2"} instead of {"passengers": 2} (string vs int)
  • Missing required fields could cause runtime errors

With strict: true:

  • ✅ Types are guaranteed correct (int, not "2")
  • ✅ Enums match exactly ("celsius", not "Celsius")
  • ✅ Required fields are always present
  • ✅ No need for validation code in your functions
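To make the contrast concrete, here is a hedged sketch of the manual coercion you would otherwise write by hand. Both raw_args (a hypothetical loosely-typed tool call) and validate_args (an illustrative helper, not an AG2 API) are assumptions for demonstration:

```python
# Hypothetical loosely-typed tool-call arguments (what you might get WITHOUT strict mode)
raw_args = {"location": "Boston, MA", "unit": "Celsius", "passengers": "2"}

VALID_UNITS = {"celsius", "fahrenheit"}


def validate_args(args: dict) -> dict:
    """Illustrative manual validation that strict mode makes unnecessary."""
    cleaned = dict(args)
    if "location" not in cleaned:  # required-field check
        raise KeyError("location is required")
    if "unit" in cleaned:  # normalize enum casing: "Celsius" -> "celsius"
        unit = cleaned["unit"].lower()
        if unit not in VALID_UNITS:
            raise ValueError(f"invalid unit: {cleaned['unit']}")
        cleaned["unit"] = unit
    if "passengers" in cleaned:  # coerce the string "2" -> int 2
        cleaned["passengers"] = int(cleaned["passengers"])
    return cleaned


cleaned = validate_args(raw_args)
print(cleaned)  # {'location': 'Boston, MA', 'unit': 'celsius', 'passengers': 2}
```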

Example 3: Combined JSON Outputs + Strict Tools#

The most powerful pattern is combining both features: use strict tools for calculations/actions, then return structured JSON for the final result.

Use Case: Math Calculator Agent#

The agent uses strict tools to perform calculations (guaranteed correct types), then provides a structured summary of the work.

# Define calculator tool
def calculate(operation: str, a: float, b: float) -> float:
    """Perform a calculation.

    Args:
        operation: The operation to perform (add, subtract, multiply, divide)
        a: First number
        b: Second number
    """
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b if b != 0 else 0
    return 0
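The calculator mirrors the chat request later in this example ((15 + 7) * 3), so each step can be checked directly:

```python
def calculate(operation: str, a: float, b: float) -> float:
    """Same logic as the calculator tool above."""
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b if b != 0 else 0
    return 0


print(calculate("add", 15, 7))       # 22
print(calculate("multiply", 22, 3))  # 66
print(calculate("divide", 5, 0))     # 0 (guarded against division by zero)
```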

# Result model for structured output
class CalculationResult(BaseModel):
    """Structured output for calculation results."""

    problem: str
    steps: list[str]
    result: float
    verification: str

# Configure with BOTH features
llm_config_combined = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
            "response_format": CalculationResult,  # 1. Structured JSON output
        }
    ],
    "functions": [
        {  # 2. Strict tool validation
            "name": "calculate",
            "description": "Perform arithmetic calculation",
            "strict": True,  # Enable strict mode
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string", "enum": ["add", "subtract", "multiply", "divide"]},
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["operation", "a", "b"],
            },
        }
    ],
}

# Create agents
calc_assistant = autogen.AssistantAgent(
    name="MathAssistant",
    system_message="You solve math problems using tools and provide structured results.",
    llm_config=llm_config_combined,
)

user_proxy_3 = autogen.UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config=False,
)

# Register function on both agents
calc_assistant.register_function({"calculate": calculate})
user_proxy_3.register_function({"calculate": calculate})

print("✅ Example 3 configured: Combined strict tools + structured output")
# Ask for a calculation with explanation
chat_result = user_proxy_3.initiate_chat(
    calc_assistant,
    message="Calculate (15 + 7) * 3 and explain your steps",
    max_turns=6,
)

print("\n" + "=" * 60)
print("COMBINED FEATURES RESULT:")
print("=" * 60)

# Check for tool calls (strict validation)
found_tool_call = False
found_structured_output = False

for message in chat_result.chat_history:
    if message.get("tool_calls"):
        found_tool_call = True
        tool_call = message["tool_calls"][0]
        args = json.loads(tool_call["function"]["arguments"])
        print(f"\n✅ Tool Call: {tool_call['function']['name']}")
        print(f"   Arguments: {args}")
        print(f"   Types verified: a={type(args['a']).__name__}, b={type(args['b']).__name__}")

    # Check for structured output (intermediate assistant text may not be JSON)
    if message.get("role") == "assistant" and message.get("content"):
        try:
            result = CalculationResult.model_validate_json(message["content"])
        except ValueError:
            continue  # not the structured report; keep scanning
        found_structured_output = True
        print("\n✅ Structured Output:")
        print(f"   Problem: {result.problem}")
        print(f"   Steps: {len(result.steps)} steps")
        print(f"   Result: {result.result}")
        print(f"   Verification: {result.verification}")

print(f"\n{'=' * 60}")
print(
    f"Features used: Tool Calls={'✅' if found_tool_call else '❌'} | Structured Output={'✅' if found_structured_output else '❌'}"
)

How Combined Mode Works#

When both response_format and strict: true tools are configured:

  1. AG2 uses beta.messages.create() (not parse()) to support tools
  2. Claude chooses the approach based on the task:
    • Makes tool calls for calculations/actions
    • Returns structured output for final summaries
  3. Both features use the same beta API with structured-outputs-2025-11-13 header

Benefits:

  • ✅ Type-safe tool calls (no "2" vs 2 issues)
  • ✅ Structured final output (guaranteed schema)
  • ✅ Production-ready reliability
  • ✅ No manual validation needed

Example 4: GroupChat with AutoPattern and Structured Outputs#

Multi-agent collaboration becomes even more powerful with structured outputs. Let’s build a research team where agents automatically coordinate using AutoPattern and produce a structured research report.

Use Case: Collaborative Research Analysis#

Three specialized agents collaborate to analyze a topic, with automatic speaker selection and a guaranteed structured output format.

# Define structured output for research report
class ResearchFinding(BaseModel):
    """A single research finding."""

    category: str
    finding: str
    confidence: str  # high, medium, low

class ResearchReport(BaseModel):
    """Structured output for collaborative research."""

    topic: str
    summary: str
    findings: list[ResearchFinding]
    recommendations: list[str]
    contributors: list[str]

    def format(self) -> str:
        """Format the research report for display."""
        output = f"# Research Report: {self.topic}\n\n"
        output += f"## Summary\n{self.summary}\n\n"
        output += f"## Findings ({len(self.findings)} total)\n"
        for i, finding in enumerate(self.findings):
            output += f"{i + 1}. [{finding.category}] {finding.finding} (Confidence: {finding.confidence})\n"
        output += "\n## Recommendations\n"
        for i, rec in enumerate(self.recommendations):
            output += f"{i + 1}. {rec}\n"
        output += f"\n## Contributors: {', '.join(self.contributors)}"
        return output

# Define a research tool
def search_literature(query: str, field: str) -> str:
    """Search academic literature for a query in a specific field.

    Args:
        query: The search query
        field: The field to search (computer_science, biology, physics)
    """
    # Simulated literature search results
    results = {
        "computer_science": "Recent advances in LLMs show 40% improvement in reasoning tasks.",
        "biology": "Studies indicate protein folding accuracy increased by 35% with AI models.",
        "physics": "Quantum computing simulations demonstrate 50x speedup on specific problems.",
    }
    return results.get(field, "No results found for this field.")
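The stub can be exercised directly; note that a field outside the enum falls through to the default string, which is exactly the case strict tool use prevents Claude from triggering:

```python
def search_literature(query: str, field: str) -> str:
    """Simulated literature search (same stub as above)."""
    results = {
        "computer_science": "Recent advances in LLMs show 40% improvement in reasoning tasks.",
        "biology": "Studies indicate protein folding accuracy increased by 35% with AI models.",
        "physics": "Quantum computing simulations demonstrate 50x speedup on specific problems.",
    }
    return results.get(field, "No results found for this field.")


print(search_literature("AI protein models", "biology"))
print(search_literature("AI ethics", "philosophy"))  # falls back to the default message
```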

# Configure LLM for group agents with strict tools
llm_config_group = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
        }
    ],
    "functions": [
        {
            "name": "search_literature",
            "description": "Search academic literature in a specific field",
            "strict": True,
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"},
                    "field": {
                        "type": "string",
                        "enum": ["computer_science", "biology", "physics"],
                        "description": "The academic field",
                    },
                },
                "required": ["query", "field"],
            },
        }
    ],
}

# Configure LLM for report writer with structured output
llm_config_report = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
            "response_format": ResearchReport,  # Structured output for final report
        }
    ],
}

# Create specialized research agents
cs_researcher = autogen.AssistantAgent(
    name="CS_Researcher",
    system_message="You are a computer science researcher. Analyze AI and ML topics. Use search_literature when needed.",
    llm_config=llm_config_group,
)

bio_researcher = autogen.AssistantAgent(
    name="Bio_Researcher",
    system_message="You are a biology researcher. Analyze biological and medical topics. Use search_literature when needed.",
    llm_config=llm_config_group,
)

report_writer = autogen.AssistantAgent(
    name="Report_Writer",
    system_message="You synthesize research from other agents into comprehensive structured reports. Wait for all researchers to contribute before writing the final report.",
    llm_config=llm_config_report,
)

# Register search function on all researchers
for agent in [cs_researcher, bio_researcher]:
    agent.register_function({"search_literature": search_literature})

print("✅ Example 4 configured: GroupChat with AutoPattern and structured outputs")
# Import AutoPattern for intelligent speaker selection
from autogen.agentchat.group.multi_agent_chat import initiate_group_chat
from autogen.agentchat.group.patterns import AutoPattern

llm_config_manager = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "api_type": "anthropic",
        }
    ],
}

# Initialize pattern with agents - AutoPattern uses agents' existing llm_config
pattern = AutoPattern(
    initial_agent=cs_researcher,
    agents=[cs_researcher, bio_researcher, report_writer],
    group_manager_args={"llm_config": llm_config_manager},
)

# Create initial research task
research_task = """
Analyze the impact of AI on scientific research across different fields.
Each researcher should contribute findings from their domain, then the report writer
should create a comprehensive structured report with all findings and recommendations.
"""

# Initiate group chat
chat_result, context_variables, last_agent = initiate_group_chat(
    pattern=pattern,
    messages=research_task,
    max_rounds=8,
)

print("\n" + "=" * 60)
print("GROUPCHAT WITH STRUCTURED OUTPUT:")
print("=" * 60)
print(f"\nTotal messages: {len(chat_result.chat_history)}")
print(f"Last agent: {last_agent.name}")

# Display the structured research report
for message in chat_result.chat_history:
    if message.get("name") == "Report_Writer" and message.get("content"):
        # Parse as ResearchReport; skip any non-JSON messages from the writer
        try:
            report = ResearchReport.model_validate_json(message["content"])
        except ValueError:
            continue
        print(f"\n{report.format()}")
        print(f"\n✅ Structured report generated with {len(report.findings)} findings")
        break

GroupChat Features Demonstrated#

AutoPattern Benefits:

  • Automatic Speaker Selection: Claude intelligently chooses which researcher speaks next based on conversation context
  • No Manual Orchestration: No need to specify speaker order or transitions
  • Natural Collaboration: Agents coordinate organically based on the conversation flow
  • Flexible Configuration: Simple setup with model and API key

Structured Outputs in GroupChat:

  • ✅ Individual agents use strict tools (search_literature) with type validation
  • ✅ The report writer produces guaranteed structured output (ResearchReport)
  • ✅ Multi-agent contributions are synthesized into a single validated schema
  • ✅ FormatterProtocol provides clean, readable final output

Key Implementation Details:

  • Each agent can have a different llm_config and response_format
  • Tools are registered per agent (only the researchers get search_literature)
  • AutoPattern manages speaker selection using Claude’s intelligent routing
  • Structured output typically comes from a dedicated “synthesis” agent at the end

Production Considerations:

  • Set an appropriate max_rounds to allow sufficient collaboration (8-15 rounds is typical)
  • Use descriptive system messages to guide agent behavior
  • Consider adding termination conditions for cost control
  • Context is automatically managed across all agents in the group

Important Considerations#

Performance#

  • First request latency: Grammar compilation adds latency on first use
  • Automatic caching: Compiled grammars cached for 24 hours
  • Cache invalidation: Changing schema structure invalidates cache

JSON Schema Limitations#

Supported:

  • All basic types: object, array, string, integer, number, boolean, null
  • enum (strings, numbers, and booleans only)
  • required and additionalProperties: false
  • String formats: date-time, email, uri, uuid, etc.

Not supported:

  • Recursive schemas
  • Numerical constraints (minimum, maximum)
  • String constraints (minLength, maxLength)
  • Complex regex patterns

Model Requirements#

  • Required: Claude Sonnet 4.5 or Claude Opus 4.1
  • Older models: Will error if strict: true is used
  • Fallback: Use JSON Mode for older models (automatic in AG2)

Feature Compatibility#

Works with: ✅ Batch processing, ✅ Streaming, ✅ Token counting, ✅ Group chats

Incompatible: ❌ Citations, ❌ Message prefilling with JSON outputs

Summary#

Quick Reference#

Feature      | When to Use                                    | Configuration
JSON Outputs | Data extraction, classification, API responses | response_format: PydanticModel
Strict Tools | Agentic workflows, type-safe function calls    | "strict": True in tool definition
Combined     | Complex agents with tools + structured results | Both configurations

Key Takeaways#

  1. Always valid: Structured outputs eliminate JSON parsing errors
  2. Type safe: Guaranteed correct types for tool inputs and JSON fields
  3. Production ready: No retries or manual validation needed
  4. Two modes: Choose based on your use case (extraction vs tools)
  5. Automatic: AG2 handles beta API, headers, and schema transformation

Next Steps#

  • Explore GroupChat with structured outputs
  • Implement custom FormatterProtocol methods
  • Build multi-tool agentic workflows with strict validation
  • Combine with streaming for real-time structured responses

Resources#