Structured Output#

Structured output constrains the model’s final message so you can parse it into a typed Python value—a number, a dataclass, a Pydantic model, or the result of your own validator—instead of treating the reply as an opaque string.

What you get on each turn#

Every turn returns an AgentReply. Two surfaces matter for structured output:

| Surface | What it is |
| --- | --- |
| reply.body | Raw text from the model for that turn (a str or None). |
| await reply.content() | Parsed value according to the response schema in effect for that turn. |

If the model’s output cannot be parsed or fails validation, content() raises an error from the underlying parser (for example Pydantic’s validation errors). You can pass retries to automatically re-ask the model on failure.

With the default OpenAI client, when the schema exposes a JSON Schema to the API, the client sends a structured response_format so the model is guided to emit JSON matching that schema. PromptedSchema is the escape hatch when the provider does not support that mechanism: the schema is injected into the system prompt instead, and content() still runs the same way afterward.
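For orientation, this is roughly the shape of a structured response_format payload on the OpenAI Chat Completions API. The exact payload the default client builds is an internal detail, and the TicketTriage name and fields here are just illustrative, so treat this as a sketch of the wire format rather than a guarantee:

```python
# Illustrative only: an OpenAI-style structured response_format payload.
# Key names follow the public Chat Completions API; the client may add
# fields (e.g. strict mode) or differ in detail.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "TicketTriage",  # schema name shown to the model
        "schema": {              # the derived JSON Schema
            "type": "object",
            "properties": {
                "category": {"type": "string"},
                "urgency": {"type": "string"},
            },
            "required": ["category", "urgency"],
        },
    },
}
print(response_format["type"])  # json_schema
```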

When to use which tool#

  • Pass a plain type (int, YourModel, …) when the default schema name and description are enough.
  • Use ResponseSchema when you want a clear name and description in the API payload so the model knows the role of the structured payload.
  • Use @response_schema when you need custom parsing, normalization, or extra steps after JSON is read.
  • Use PromptedSchema when your model or endpoint does not support native structured output.

Quick start#

from autogen.beta import Agent
from autogen.beta.config import OpenAIConfig

agent = Agent(
    "assistant",
    prompt="You are a helpful assistant. Answer concisely.",
    config=OpenAIConfig("gpt-4o-mini"),
    response_schema=int,
)

reply = await agent.ask("How many bits are in a byte?")
print(reply.body)    # e.g. '8' — raw model text
result = await reply.content()
print(result)        # 8 — Python int

Real-world examples#

The following patterns mirror how structured output is used in applications: triage, extraction, and safe normalization.

Classify a support ticket (Pydantic)#

Route incoming text into fields your helpdesk or CRM already understands:

from typing import Annotated
from pydantic import BaseModel, Field

from autogen.beta import Agent
from autogen.beta.config import OpenAIConfig

class TicketTriage(BaseModel):
    """Structured triage for a single support message."""

    category: Annotated[str, Field(description="e.g. billing, bug, account_access")]
    urgency: Annotated[str, Field(description="low, medium, or high")]
    summary_one_line: Annotated[str, Field(description="Max 120 characters", max_length=120)]

agent = Agent(
    "triage",
    prompt="You triage customer support messages. Be conservative with urgency.",
    config=OpenAIConfig("gpt-4o-mini"),
    response_schema=TicketTriage,
)

body = (
    "I was charged twice for Pro last week and I still can't export my reports. "
    "This is blocking our quarter close."
)
reply = await agent.ask(f"Classify this ticket:\n\n{body}")
triage = await reply.content()
# triage.category, triage.urgency, triage.summary_one_line → use in routing rules

Extract a delivery ETA window (dataclass)#

Turn natural language into something your scheduling layer can consume:

from dataclasses import dataclass

from autogen.beta import Agent
from autogen.beta.config import OpenAIConfig

@dataclass
class DeliveryWindow:
    day_label: str
    start_hour_local: int
    end_hour_local: int
    timezone: str

agent = Agent(
    "scheduler",
    prompt="Extract delivery windows as structured data only; use 24h integers for hours.",
    config=OpenAIConfig("gpt-4o-mini"),
    response_schema=DeliveryWindow,
)

reply = await agent.ask(
    "Customer said: drop off Tuesday between 2 and 5pm Pacific, before dinner."
)
window = await reply.content()

Score a review on a fixed scale (primitive + clear prompt)#

Use a primitive schema when the payload is a single JSON value and your prompt defines the scale:

from autogen.beta import Agent
from autogen.beta.config import OpenAIConfig

agent = Agent(
    "reviews",
    prompt="You output a single integer 1–5 for overall satisfaction. No prose.",
    config=OpenAIConfig("gpt-4o-mini"),
    response_schema=int,
)

reply = await agent.ask(
    "Rate this review: 'Shipped fast, packaging was torn, product works great.'"
)
stars = await reply.content()

Supported schema types#

You can pass any type the stack can turn into a JSON Schema and parse back: primitives, dataclasses, Pydantic models, unions, and more. Plain types are wrapped in an internal ResponseSchema instance for validation and API schema generation.

Primitives#

agent = Agent("assistant", config=config, response_schema=int)

reply = await agent.ask("What is 2 + 2?")
result = await reply.content()
# 4 — int

Dataclasses#

from dataclasses import dataclass

@dataclass
class City:
    name: str
    population: int

agent = Agent("assistant", config=config, response_schema=City)

reply = await agent.ask("Give the city name and approximate population for Kyoto.")
result = await reply.content()

Pydantic models#

from pydantic import BaseModel

class Sentiment(BaseModel):
    label: str
    score: float

agent = Agent("assistant", config=config, response_schema=Sentiment)

reply = await agent.ask("Analyze: 'I love this product!'")
result = await reply.content()

Unions#

Use a union (int | str) or a tuple of types ((int, str)) when the model must return one of several JSON shapes.

from autogen.beta import Agent
from autogen.beta.config import OpenAIConfig

config = OpenAIConfig("gpt-4o-mini")

# int | str — e.g. a count, or "unknown" when the text does not say
agent = Agent(
    "extractor",
    prompt='Reply with JSON only: either an integer count or the string "unknown".',
    config=config,
    response_schema=int | str,
)

reply = await agent.ask("How many seats does the venue mention? (no number in text)")
result = await reply.content()
# result is int or str, depending on the model output
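Downstream code typically branches on the runtime type of the parsed value. A library-free sketch of that dispatch, with json.loads standing in for what content() returns (the handle_count helper is hypothetical, not part of the API):

```python
import json

def handle_count(raw: str) -> str:
    """Branch on a value that may be an integer count or the string 'unknown'."""
    value = json.loads(raw)  # stand-in for `await reply.content()`
    if isinstance(value, bool):  # bool is a subclass of int; reject it explicitly
        raise ValueError("expected an integer or 'unknown'")
    if isinstance(value, int):
        return f"venue seats {value}"
    if value == "unknown":
        return "seat count not stated"
    raise ValueError(f"unexpected payload: {value!r}")

print(handle_count("250"))        # venue seats 250
print(handle_count('"unknown"'))  # seat count not stated
```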

ResponseSchema (named payloads)#

For clearer API metadata, construct a ResponseSchema with an explicit name and description:

from autogen.beta import Agent, ResponseSchema

schema = ResponseSchema(
    int | str,
    name="ByteWidth",
    description="The number of bits in one byte.",
)

agent = Agent("assistant", config=config, response_schema=schema)

Those fields are attached to the structured-output payload where the provider supports it, which helps the model treat the JSON as a named contract rather than a generic blob.


Custom validation with @response_schema#

Use the decorator when you need logic beyond “parse this JSON into a type”: clamping, regex cleanup, decoding wrapped JSON, or combining fields.

Sync validator: clamp a numeric rating#

from autogen.beta import Agent, response_schema

@response_schema
def parse_rating(content: str) -> int:
    """Parse a rating and clamp it to 1–5."""
    return max(1, min(5, int(content)))

agent = Agent("assistant", config=config, response_schema=parse_rating)

reply = await agent.ask("Rate this movie from 1 to 5.")
result = await reply.content()

Async validator: enrich after JSON parse#

import json

@response_schema
async def fetch_and_validate(content: str) -> dict:
    """Validate and enrich the model's JSON response."""
    data = json.loads(content)
    data["validated"] = True
    return data

Validation rules for @response_schema#

The framework introspects your function with fast_depends (the same dependency-injection path as @tool callables). Parameters satisfied by injection (Variables, Depends, Inject, Context, and similar) are not part of the JSON the model must produce. Every other parameter controls how the completion text is decoded and whether a JSON Schema is attached for native structured output.

One non-injected parameter#

| Annotated type | What the model's message must look like | JSON Schema sent to the API? |
| --- | --- | --- |
| str | Any text. The raw completion string is passed in; nothing is parsed as JSON for you. | No. There is no derived schema, so clients such as OpenAI do not get a response_format schema from this callable alone. |
| Primitive or union (int, float, bool, int \| str, …) | By default (embed=True), a JSON object {"data": <value>}; the framework unwraps it before calling your function. With embed=False, a bare JSON value. | Yes, when the client supports structured output and emits json_schema from the derived schema. |
| Structured type (dataclass, Pydantic model, dict, …) | A JSON object matching the type's schema. Never embedded, regardless of the embed flag. | Yes. |

Illustrative shapes (each function would be decorated with @response_schema and used as response_schema=... on an Agent):

# Raw text — parse inside the function (e.g. json.loads).
def only_str(content: str) -> dict:
    pass

# Single JSON value at the top level, e.g. 42
def only_int(content: int) -> dict:
    pass

from dataclasses import dataclass

@dataclass
class Data:
    content: int

# Single JSON object at the top level, e.g. {"content": 1}
def only_dataclass(content: Data) -> dict:
    pass

Two or more non-injected parameters#

The framework builds one synthetic JSON object schema: Python parameter names are JSON keys. The completion must be a single object with those keys; values are validated against the annotations and passed into your function as keyword arguments (alongside any injected parameters).

For example:

@response_schema
def create_user(name: str, age: int, email: str) -> dict:
    """Create a validated user record."""
    return {"name": name, "age": age, "email": email, "active": True}

# expected JSON: {"name": "John Doe", "age": 30, "email": "john.doe@example.com"}
agent = Agent("assistant", config=config, response_schema=create_user)

reply = await agent.ask("Create a user for Alice, age 30, alice@example.com")
result = await reply.content()
# {"name": "Alice", "age": 30, "email": "alice@example.com", "active": True}

pydantic.Field on each parameter

Multi-parameter validators are backed by a synthetic Pydantic model, so you can document and constrain each JSON property with Field, just like on a BaseModel:

  • Use typing.Annotated when the parameter has no default: Annotated[str, Field(description="...")].
  • Combine a default and metadata with Field as the default value, e.g. score: float = Field(1.0, description="Test score").

description is surfaced on each property in the generated JSON Schema (and thus in native structured output when the client sends that schema). Other Field arguments—ge, le, pattern, and so on—are reflected as the usual JSON Schema keywords.

from typing import Annotated

from pydantic import Field

from autogen.beta import response_schema

@response_schema
def extract_listing(
    title: Annotated[str, Field(description="Product name from the text")],
    price_usd: Annotated[float, Field(description="Price in US dollars", ge=0)],
    in_stock: Annotated[bool, Field(description="True if the listing says it ships now")],
) -> dict:
    return {"title": title, "price_usd": price_usd, "in_stock": in_stock}

Parameters with a Python default (plain value or Field(default, ...)) are usually not listed as required in the schema; callers can omit those keys in the JSON object.
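To see which keys end up required, you can inspect the schema of an equivalent hand-written model. A sketch assuming Pydantic v2 (the framework builds its synthetic model for you, so this hand-written GradeArgs is only an analogy):

```python
from typing import Annotated

from pydantic import BaseModel, Field

# Hand-written analogue of a two-parameter validator: `name` has no
# default, `score` does, so only `name` is required in the JSON Schema.
class GradeArgs(BaseModel):
    name: Annotated[str, Field(description="Student name")]
    score: float = Field(1.0, description="Test score")

schema = GradeArgs.model_json_schema()
print(schema["required"])                        # ['name']
print(schema["properties"]["score"]["default"])  # 1.0
```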

Note

Renaming a parameter changes the key the model is instructed to use. Treat those names as part of your contract with the model.

Accessing Context#

Validators participate in the same dependency injection model as tools. Inject Context to read variables, tie validation to session state, or perform lookups:

from autogen.beta import Context, response_schema

@response_schema
def validate_with_context(content: str, context: Context) -> str:
    """Use context variables during validation."""
    language = context.variables.get("language", "en")
    return f"[{language}] {content}"

PromptedSchema (models without native structured output)#

Some models or providers do not support API-level structured output (no response_format JSON schema). PromptedSchema injects the JSON Schema into the system prompt and sets json_schema to None on the wire so the client does not request native structured mode. Validation still goes through the inner schema’s validate method.

from autogen.beta import Agent, PromptedSchema

agent = Agent(
    "assistant",
    config=config,
    response_schema=PromptedSchema(int),
)

reply = await agent.ask("How many oceans are there on Earth?")
result = await reply.content()

You can keep a single schema definition (type, ResponseSchema, or @response_schema callable) and only wrap it when you need prompt-based delivery. The inner validate logic and JSON shape stay the same; PromptedSchema swaps how the schema reaches the model (system-prompt text instead of API response_format).

from autogen.beta import Agent, PromptedSchema, ResponseSchema, response_schema
from autogen.beta.config import OpenAIConfig

config = OpenAIConfig("gpt-4o-mini")

# Plain type you already pass as response_schema=int elsewhere
agent_a = Agent("a", config=config, response_schema=PromptedSchema(int))

# Named ResponseSchema reused from a “native structured” setup — wrap for a weaker API
ocean_count = ResponseSchema(
    int,
    name="OceanCount",
    description="Number of oceans on Earth.",
)
agent_b = Agent("b", config=config, response_schema=PromptedSchema(ocean_count))

# Same callable validator as without PromptedSchema — wrap it when the wire format must be prompt-only
@response_schema
def parse_int(content: str) -> int:
    return int(content.strip())

strict_int = PromptedSchema(parse_int)
agent_c = Agent("c", config=config, response_schema=strict_int)

Custom prompt template#

The default template asks for raw JSON only. Override it with a string that contains the {schema} placeholder:

PromptedSchema(
    int,
    prompt_template="Reply with JSON matching this schema:\n{schema}",
)
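What the model ultimately sees is the template with the schema text substituted in. A stdlib-only sketch of that substitution (the real rendering is done inside PromptedSchema; {"type": "integer"} is the standard JSON Schema for a bare int, and with the default embed=True the actual schema would be the {"data": …} envelope):

```python
import json

prompt_template = "Reply with JSON matching this schema:\n{schema}"
int_schema = {"type": "integer"}  # JSON Schema for a bare integer

# Render the system-prompt text the model would receive.
system_suffix = prompt_template.format(schema=json.dumps(int_schema))
print(system_suffix)
# Reply with JSON matching this schema:
# {"type": "integer"}
```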

Override schema per request#

Pass response_schema to ask() (or AgentReply.ask()) to change the contract for one turn only. The agent’s default schema applies again on the next turn unless you override again.

agent = Agent("assistant", config=config)

turn = await agent.ask("How many seconds in a minute?", response_schema=int)
result = await turn.content()
#> 60 - int

turn2 = await turn.ask("Say hello.")
result2 = await turn2.content()
#> "Hello!" - str

Pass response_schema=None to drop a schema that was set on the agent for a single request:

agent = Agent("assistant", config=config, response_schema=int)

reply = await agent.ask("Just say hello in plain text.", response_schema=None)
result = await reply.content()

Note

The per-request override applies only to that turn. The conversation history is unchanged; only the schema used for the next completion differs.

Validation retries#

When the model's response fails schema validation, you can automatically re-ask the model instead of raising immediately. Pass the retries keyword to content():

agent = Agent("assistant", config=config, response_schema=int)

reply = await agent.ask("How many planets in the solar system?")
result = await reply.content(retries=3)

The retries parameter controls how many re-asks are allowed after the initial attempt. With retries=3, the initial response is validated; if it fails, the model is re-asked up to 3 more times before the error is raised.

| Value | Behavior |
| --- | --- |
| retries=0 (default) | No retries; raise on the first validation failure. |
| retries=3 | Up to 3 re-asks after the initial attempt (4 model calls in total). |
| retries=math.inf | Re-ask indefinitely until the model produces a valid response. |

Each retry sends the validation error back to the model as a follow-up message in the same conversation, so the model can see what went wrong and correct its output.
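The control flow is a plain validate-and-re-ask loop. A library-free sketch of the pattern (the hypothetical parse_with_retries and the fake ask callable stand in for what content() does internally):

```python
def parse_with_retries(ask, validate, retries: int):
    """Validate a reply; on failure, feed the error back and re-ask."""
    text = ask("initial request")
    for attempt in range(retries + 1):
        try:
            return validate(text)
        except ValueError as err:
            if attempt == retries:
                raise  # retries exhausted: surface the last validation error
            # Send the error back so the model can correct itself.
            text = ask(f"Validation failed: {err}. Please try again.")

# Fake model: answers wrong once, then correctly.
answers = iter(["eight", "8"])
result = parse_with_retries(lambda _msg: next(answers), int, retries=3)
print(result)  # 8
```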

Warning

retries=math.inf will loop forever if the model consistently produces invalid output. Use a finite count in production, and reserve math.inf for interactive or experimental use.


Primitive embedding (embed)#

When a schema type is a primitive (int, float, bool, list[…]) or a union (int | str), the framework wraps it in a one-field JSON object by default. This is called embedding.

Instead of asking the model to produce a bare value like 42, the API schema asks for {"data": 42}. The content() method transparently unwraps the envelope so your code still receives a plain Python value.

Why?#

Most structured-output APIs (OpenAI, etc.) are designed around JSON objects. A bare value (42, true, "hello") is technically valid JSON but some providers handle it less reliably. Wrapping the value in {"data": …} gives the model a proper object to fill in, which improves reliability without changing your application code.
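The envelope and its unwrapping can be pictured with plain json. This is a sketch of the mechanism, not the framework's actual code:

```python
import json

# With embed=True the model is asked for {"data": <value>} ...
raw = '{"data": 42}'
value = json.loads(raw)["data"]  # content() unwraps the envelope for you
print(value)  # 42

# ... with embed=False it must emit the bare value instead.
bare = json.loads("42")
print(bare)  # 42
```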

Which types are embedded?#

| Type | Embedded by default? | Reason |
| --- | --- | --- |
| str | No schema generated | Raw text is passed through as-is. |
| int, float, bool | Yes | Bare primitives benefit from the object wrapper. |
| list[T], tuple[T, ...] | Yes | Array values also benefit from the wrapper. |
| int \| str, Union[T1, T2], (T1, T2) | Yes | Unions of primitives. |
| BaseModel subclass | No | Already a JSON object. |
| @dataclass | No | Already a JSON object. |
| TypedDict | No | Already a JSON object. |
| dict[K, V] | No | Already a JSON object. |

Opting out#

Pass embed=False to ResponseSchema or @response_schema to disable wrapping. The model must then produce the bare JSON value directly (e.g. 42 instead of {"data": 42}).

from autogen.beta import ResponseSchema

schema = ResponseSchema(int, name="RawInt", embed=False)
# Model must produce: 42
# With embed=True (default): model produces {"data": 42}, content() returns 42 either way

With the @response_schema decorator:

@response_schema(embed=False)
def parse_rating(value: int) -> int:
    return max(1, min(5, value))

Note

Embedding is transparent to your code. Whether embed is True or False, content() always returns the unwrapped Python value. The only difference is the JSON shape the model is asked to produce.