Structured Output#
Structured output constrains the model’s final message so you can parse it into a typed Python value—a number, a dataclass, a Pydantic model, or the result of your own validator—instead of treating the reply as an opaque string.
What you get on each turn#
Every turn returns an AgentReply. Two surfaces matter for structured output:
| Surface | What it is |
|---|---|
| reply.body | Raw text from the model for that turn (a str or None). |
| await reply.content() | Parsed value according to the response schema in effect for that turn. |
If the model’s output cannot be parsed or fails validation, content() raises an error from the underlying parser (for example Pydantic’s validation errors). You can pass retries to automatically re-ask the model on failure.
With the default OpenAI client, when the schema exposes a JSON Schema to the API, the client sends a structured response_format so the model is guided to emit JSON matching that schema. PromptedSchema is the escape hatch when the provider does not support that mechanism: the schema is injected into the system prompt instead, and content() still runs the same way afterward.
When to use which tool#
- Pass a plain type (int, YourModel, …) when the default schema name and description are enough.
- Use ResponseSchema when you want a clear name and description in the API payload so the model knows the role of the structured payload.
- Use @response_schema when you need custom parsing, normalization, or extra steps after JSON is read.
- Use PromptedSchema when your model or endpoint does not support native structured output.
Quick start#
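A minimal sketch of the round trip, assuming the Agent/ask()/content() names used in this guide (the constructor arguments in the comments are illustrative). The runnable part mimics what content() does with the model's JSON body, here using a Pydantic model as the schema:

```python
from pydantic import BaseModel

class CityInfo(BaseModel):
    city: str
    population: int

# Hypothetical wiring (names as used in this guide; arguments illustrative):
#   agent = Agent(model="...", response_schema=CityInfo)
#   reply = await agent.ask("Tell me about Paris.")
#   info = await reply.content()   # -> CityInfo instance

# Roughly what content() does: parse the JSON body against the schema
# and return the typed value.
raw_body = '{"city": "Paris", "population": 2102650}'
info = CityInfo.model_validate_json(raw_body)
print(info.city, info.population)
```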
Real-world examples#
The following patterns mirror how structured output is used in applications: triage, extraction, and safe normalization.
Classify a support ticket (Pydantic)#
Route incoming text into fields your helpdesk or CRM already understands:
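A sketch of such a schema (field names and categories are illustrative; use whatever your CRM expects). The Agent wiring in the comments is hypothetical; the final lines mimic how content() would parse the model's JSON:

```python
from typing import Literal
from pydantic import BaseModel, Field

class Ticket(BaseModel):
    category: Literal["billing", "bug", "account", "other"]
    priority: Literal["low", "normal", "urgent"]
    summary: str = Field(description="One-sentence restatement of the issue")

# Hypothetical wiring:
#   agent = Agent(model="...", response_schema=Ticket)
#   reply = await agent.ask("I was charged twice this month, please fix ASAP!")
#   ticket = await reply.content()   # -> Ticket

# The model is guided to emit JSON like this:
ticket = Ticket.model_validate_json(
    '{"category": "billing", "priority": "urgent", '
    '"summary": "Customer reports a duplicate charge this month."}'
)
print(ticket.category, ticket.priority)
```

Literal fields are a cheap way to force the model onto the exact enum values your downstream system understands.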
Extract a delivery ETA window (dataclass)#
Turn natural language into something your scheduling layer can consume:
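An illustrative dataclass schema (field names are assumptions, not a fixed API). The validation step uses pydantic's TypeAdapter to mimic how the framework parses JSON into a dataclass:

```python
from dataclasses import dataclass
from pydantic import TypeAdapter

@dataclass
class EtaWindow:
    # ISO 8601 strings keep the schema simple; use datetime if your stack prefers.
    earliest: str
    latest: str
    confidence: float

# Hypothetical: Agent(..., response_schema=EtaWindow); content() -> EtaWindow

# Roughly what parsing does: validate the JSON object against the dataclass.
window = TypeAdapter(EtaWindow).validate_json(
    '{"earliest": "2024-06-01T09:00", "latest": "2024-06-01T12:00", "confidence": 0.8}'
)
print(window.earliest, window.confidence)
```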
Score a review on a fixed scale (primitive + clear prompt)#
Use a primitive schema when the payload is a single JSON value and your prompt defines the scale:
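A sketch of this pattern; the Agent arguments in the comments are hypothetical, and the runnable part shows the default {"data": …} envelope that primitives travel in (see "Primitive embedding" below):

```python
import json

# Hypothetical wiring: the prompt defines the scale, the schema is just int.
#   agent = Agent(
#       model="...",
#       instructions="Rate the review from 1 (awful) to 5 (great).",
#       response_schema=int,
#   )
#   score = await (await agent.ask(review_text)).content()   # -> int

# With default embedding, the model is asked for {"data": <int>}
# and content() unwraps it:
payload = json.loads('{"data": 4}')
score = int(payload["data"])
print(score)
```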
Supported schema types#
You can pass any type the stack can turn into a JSON Schema and parse back: primitives, dataclass, Pydantic models, unions, and more. Plain types are wrapped in an internal ResponseSchema instance for validation and API schema generation.
Primitives#
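Illustrative primitive schemas (the Agent calls are hypothetical); by default the value rides in a {"data": …} envelope, which content() unwraps:

```python
import json

# Hypothetical: any of these work as response_schema on an Agent.
#   Agent(..., response_schema=int)
#   Agent(..., response_schema=float)
#   Agent(..., response_schema=list[str])
# content() then returns int, float, or list[str] respectively.
tags = json.loads('{"data": ["fast", "cheap"]}')["data"]
print(tags)
```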
Dataclasses#
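A minimal dataclass example; the parse step uses pydantic's TypeAdapter to stand in for the framework's own parsing. Dataclasses map to a plain JSON object, so no embedding is involved:

```python
from dataclasses import dataclass
from pydantic import TypeAdapter

@dataclass
class Invoice:
    number: str
    total: float

# Hypothetical: Agent(..., response_schema=Invoice); content() -> Invoice.
invoice = TypeAdapter(Invoice).validate_json('{"number": "INV-7", "total": 99.5}')
print(invoice)
```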
Pydantic models#
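A minimal Pydantic example (field names are illustrative). Field constraints such as ge/le are enforced at parse time and also surfaced in the JSON Schema sent to the API:

```python
from pydantic import BaseModel, Field

class Sentiment(BaseModel):
    label: str = Field(description="positive, negative, or neutral")
    score: float = Field(ge=0.0, le=1.0)

# Hypothetical: Agent(..., response_schema=Sentiment); content() -> Sentiment.
s = Sentiment.model_validate_json('{"label": "positive", "score": 0.93}')
print(s.label, s.score)
```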
Unions#
Use a union (int | str) or a tuple of types ((int, str)) when the model must return one of several JSON shapes.
ResponseSchema (named payloads)#
For clearer API metadata, construct a ResponseSchema with an explicit name and description:
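A sketch, assuming ResponseSchema accepts the type plus name/description keyword arguments (check your version's exact signature). The runnable part shows the underlying JSON Schema that the metadata rides alongside:

```python
from pydantic import BaseModel

class Verdict(BaseModel):
    approved: bool
    reason: str

# Hypothetical construction (argument names illustrative):
#   schema = ResponseSchema(Verdict, name="review_verdict",
#                           description="Outcome of the automated content review")
#   agent = Agent(..., response_schema=schema)

# The name/description are attached alongside the model's own JSON Schema:
props = Verdict.model_json_schema()["properties"]
print(sorted(props))
```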
Those fields are attached to the structured-output payload where the provider supports it, which helps the model treat the JSON as a named contract rather than a generic blob.
Custom validation with @response_schema#
Use the decorator when you need logic beyond “parse this JSON into a type”: clamping, regex cleanup, decoding wrapped JSON, or combining fields.
Sync validator: clamp a numeric rating#
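A sketch of a clamping validator; the decorator usage in the comments is hypothetical, and the clamping logic itself is plain Python:

```python
# Hypothetical decorator usage:
#   @response_schema
#   def rating(value: float) -> float: ...
#   agent = Agent(..., response_schema=rating)

def rating(value: float) -> float:
    """Clamp the model's number onto the 1-5 scale."""
    return max(1.0, min(5.0, value))

print(rating(7.2))   # clamped down to 5.0
print(rating(0.3))   # clamped up to 1.0
```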
Async validator: enrich after JSON parse#
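A sketch of an async validator; async validators can await I/O (a lookup, a database call) after the JSON is parsed. The in-memory dict below is a stand-in for a real data source, and the decorator usage is hypothetical:

```python
import asyncio

# Hypothetical decorator usage:
#   @response_schema
#   async def user_ref(user_id: int) -> dict: ...

FAKE_DB = {7: "Ada"}  # stand-in for a real lookup

async def user_ref(user_id: int) -> dict:
    await asyncio.sleep(0)  # stands in for real async I/O
    return {"id": user_id, "name": FAKE_DB.get(user_id, "unknown")}

result = asyncio.run(user_ref(7))
print(result)
```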
Validation rules for @response_schema#
The framework introspects your function with fast_depends (the same dependency-injection path as @tool callables). Parameters satisfied by injection (Variables, Depends, Inject, Context, and similar) are not part of the JSON the model must produce. Every other parameter controls how the completion text is decoded and whether a JSON Schema is attached for native structured output.
One non-injected parameter#
| Annotated type | What the model’s message must look like | JSON Schema sent to the API? |
|---|---|---|
| str | Any text. The raw completion string is passed in; nothing is parsed as JSON for you. | No — there is no derived schema, so clients such as OpenAI do not get a response_format schema from this callable alone. |
| Primitive or union (int, float, bool, int \| str, …) | By default (embed=True), a JSON object {"data": <value>}. The framework unwraps it before calling your function. With embed=False, a bare JSON value. | Yes, when the client supports structured output and emits json_schema from the derived schema. |
| Structured type (dataclass, Pydantic model, dict, …) | A JSON object matching the type’s schema. These are never embedded regardless of the embed flag. | Yes. |
Illustrative shapes (each function would be decorated with @response_schema and used as response_schema=... on an Agent):
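A sketch of the three shapes; the @response_schema decorators are omitted so this runs standalone, and the function names are hypothetical:

```python
from dataclasses import dataclass

def raw_text(message: str) -> str:
    # str parameter: the raw completion is passed through, no JSON parsing.
    return message.strip()

def answer(value: int) -> int:
    # Primitive parameter: the model emits {"data": <int>} by default and
    # the framework unwraps it before calling this function.
    return value

@dataclass
class Point:
    x: float
    y: float

def point(p: Point) -> Point:
    # Structured parameter: the model emits an object matching Point's schema.
    return p

print(raw_text("  ok  "), answer(3), point(Point(1.0, 2.0)))
```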
Two or more non-injected parameters#
The framework builds one synthetic JSON object schema: Python parameter names are JSON keys. The completion must be a single object with those keys; values are validated against the annotations and passed into your function as keyword arguments (alongside any injected parameters).
For example:
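A sketch of a multi-parameter validator (names are illustrative; the decorator is shown only in comments). The final lines mimic the framework's step of parsing the completion into an object and calling the function with keyword arguments:

```python
import json

# Hypothetical decorator usage; parameter names become the JSON keys:
#   @response_schema
#   def contact(name: str, email: str, age: int = 0) -> dict: ...
# The model must then produce {"name": ..., "email": ..., "age": ...}.

def contact(name: str, email: str, age: int = 0) -> dict:
    return {"name": name, "email": email, "age": age}

payload = json.loads('{"name": "Ada", "email": "ada@example.com", "age": 36}')
result = contact(**payload)
print(result)
```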
pydantic.Field on each parameter#
Multi-parameter validators are backed by a synthetic Pydantic model, so you can document and constrain each JSON property with Field, just like on a BaseModel:
- Use typing.Annotated when the parameter has no default: Annotated[str, Field(description="...")].
- Combine a default and metadata with Field as the default value, e.g. score: float = Field(1.0, description="Test score").
description is surfaced on each property in the generated JSON Schema (and thus in native structured output when the client sends that schema). Other Field arguments—ge, le, pattern, and so on—are reflected as the usual JSON Schema keywords.
Parameters with a Python default (plain value or Field(default, ...)) are usually not listed as required in the schema; callers can omit those keys in the JSON object.
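A sketch of such a validator and, using pydantic's create_model, roughly the synthetic model the framework would derive from its parameters (the validator itself is hypothetical and shown only in comments):

```python
from typing import Annotated
from pydantic import Field, create_model

# Hypothetical validator; Field documents and constrains each JSON property:
#   @response_schema
#   def grade(
#       subject: Annotated[str, Field(description="Course name")],
#       score: float = Field(1.0, ge=0.0, le=5.0, description="Test score"),
#   ) -> dict: ...

# Roughly the synthetic model derived from those parameters:
Synthetic = create_model(
    "grade",
    subject=(Annotated[str, Field(description="Course name")], ...),
    score=(float, Field(1.0, ge=0.0, le=5.0, description="Test score")),
)
schema = Synthetic.model_json_schema()
print(schema["required"])                        # score has a default, so only subject
print(schema["properties"]["score"]["maximum"])  # le=5.0 -> "maximum": 5.0
```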
Note
Renaming a parameter changes the key the model is instructed to use. Treat those names as part of your contract with the model.
Accessing Context#
Validators participate in the same dependency injection model as tools. Inject Context to read variables, tie validation to session state, or perform lookups:
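An illustrative shape for such a validator. The real Context comes from the framework and is injected automatically (and would be annotated for injection accordingly); the stand-in class below exists only so the sketch runs standalone:

```python
class Context:  # stand-in for the framework's Context
    def __init__(self, variables: dict):
        self.variables = variables

def approved_amount(amount: float, ctx: Context) -> float:
    # ctx is injected, so it is NOT part of the JSON the model must produce;
    # only `amount` is. Here a per-session limit caps the value.
    limit = ctx.variables.get("max_amount", 100.0)
    return min(amount, limit)

ctx = Context({"max_amount": 50.0})
print(approved_amount(80.0, ctx))
```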
PromptedSchema (models without native structured output)#
Some models or providers do not support API-level structured output (no response_format JSON schema). PromptedSchema injects the JSON Schema into the system prompt and sets json_schema to None on the wire so the client does not request native structured mode. Validation still goes through the inner schema’s validate method.
You can keep a single schema definition (type, ResponseSchema, or @response_schema callable) and only wrap it when you need prompt-based delivery. The inner validate logic and JSON shape stay the same; PromptedSchema swaps how the schema reaches the model (system-prompt text instead of API response_format).
Custom prompt template#
The default template asks for raw JSON only. Override it with a string that contains the {schema} placeholder:
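A sketch of a custom template; the PromptedSchema keyword in the comment is illustrative, and the runnable part shows how the {schema} placeholder would be filled with the schema's JSON before injection into the system prompt:

```python
import json
from pydantic import BaseModel

class Answer(BaseModel):
    text: str
    confidence: float

TEMPLATE = (
    "Reply with a single JSON document matching this JSON Schema, "
    "and nothing else:\n{schema}"
)

# Hypothetical wiring (keyword name illustrative):
#   schema = PromptedSchema(Answer, template=TEMPLATE)

# Roughly what ends up in the system prompt:
rendered = TEMPLATE.format(schema=json.dumps(Answer.model_json_schema()))
print(rendered[:70])
```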
Override schema per request#
Pass response_schema to ask() (or AgentReply.ask()) to change the contract for one turn only. The agent’s default schema applies again on the next turn unless you override again.
Pass response_schema=None to drop a schema that was set on the agent for a single request:
Note
The per-request override applies only to that turn. The conversation history is unchanged; only the schema used for the next completion differs.
Validation retries#
When the model's response fails schema validation, you can automatically re-ask the model instead of raising immediately. Pass the retries keyword to content():
The retries parameter controls how many re-asks are allowed after the initial attempt. With retries=3, the initial response is validated; if it fails, the model is re-asked up to 3 more times before the error is raised.
| Value | Behavior |
|---|---|
| retries=0 (default) | No retries — raise on the first validation failure. |
| retries=3 | Up to 3 re-asks after the initial attempt (4 total). |
| retries=math.inf | Re-ask indefinitely until the model produces a valid response. |
Each retry sends the validation error back to the model as a follow-up message in the same conversation, so the model can see what went wrong and correct its output.
Warning
retries=math.inf will loop forever if the model consistently produces invalid output. Use a finite count in production, and reserve math.inf for interactive or experimental use.
Primitive embedding (embed)#
When a schema type is a primitive (int, float, bool, list[…]) or a union (int | str), the framework wraps it in a one-field JSON object by default. This is called embedding.
Instead of asking the model to produce a bare value like 42, the API schema asks for {"data": 42}. The content() method transparently unwraps the envelope so your code still receives a plain Python value.
Why?#
Most structured-output APIs (OpenAI, etc.) are designed around JSON objects. A bare value (42, true, "hello") is technically valid JSON but some providers handle it less reliably. Wrapping the value in {"data": …} gives the model a proper object to fill in, which improves reliability without changing your application code.
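The envelope for response_schema=int can be sketched with a one-field Pydantic model (this mirrors the mechanism, not the framework's exact internals):

```python
from pydantic import BaseModel

# Roughly the envelope derived for response_schema=int:
class Envelope(BaseModel):
    data: int

# What travels on the wire vs. what your code receives:
wire_json = '{"data": 42}'
value = Envelope.model_validate_json(wire_json).data  # content() unwraps this
print(value)
```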
Which types are embedded?#
| Type | Embedded by default? | Reason |
|---|---|---|
| str | No schema generated | Raw text is passed through as-is. |
| int, float, bool | Yes | Bare primitives benefit from the object wrapper. |
| list[T], tuple[T, ...] | Yes | Array values also benefit from the wrapper. |
| int \| str, Union[T1, T2], (T1, T2) | Yes | Union of primitives. |
| BaseModel subclass | No | Already a JSON object. |
| @dataclass | No | Already a JSON object. |
| TypedDict | No | Already a JSON object. |
| dict[K, V] | No | Already a JSON object. |
Opting out#
Pass embed=False to ResponseSchema or @response_schema to disable wrapping. The model must then produce the bare JSON value directly (e.g. 42 instead of {"data": 42}).
With the @response_schema decorator:
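A sketch of the embed=False shape; the decorator usage is shown only in comments, and the runnable part demonstrates the bare JSON value the model would emit:

```python
import json

# Hypothetical decorator usage:
#   @response_schema(embed=False)
#   def count(value: int) -> int:
#       return value
# With embed=False the completion must be the bare JSON value:

def count(value: int) -> int:
    return value

bare = json.loads("42")  # the model emitted `42`, not {"data": 42}
print(count(bare))
```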
Note
Embedding is transparent to your code. Whether embed is True or False, content() always returns the unwrapped Python value. The only difference is the JSON shape the model is asked to produce.